WO2021003935A1 - Data cluster storage method and apparatus, and computer device - Google Patents

Data cluster storage method and apparatus, and computer device

Info

Publication number
WO2021003935A1
Authority
WO
WIPO (PCT)
Prior art keywords
physical
physical cluster
cluster
stored
file
Prior art date
Application number
PCT/CN2019/118232
Other languages
French (fr)
Chinese (zh)
Inventor
兰东平
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021003935A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0626 Reducing size or complexity of storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/0643 Management of files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Definitions

  • cluster storage aggregates the storage space of multiple storage devices into a single storage pool that offers application servers a unified access interface and management interface.
  • applications can transparently access and use the disks on all storage devices through the access interface, which makes full use of the performance and disk utilization of the storage devices.
  • at present, data stored in clusters is generally placed at random in idle physical clusters, and for cross-cluster storage an additional index library must be established to record the correspondence between data and clusters; the cluster that holds a historical file is found by querying the index library.
  • however, this storage method cannot guarantee the regular (rule-based) storage of data in the cluster and cannot quickly locate where the data is stored, resulting in poor storage performance.
  • this application discloses a method, device and computer equipment for data cluster storage.
  • the main purpose is to solve the problem that, when data cluster storage is performed, the regular storage of data in the cluster cannot be guaranteed and the location of stored data cannot be quickly determined, resulting in poor storage performance.
  • a data cluster storage method including:
  • uniformly mapping the physical clusters to the physical nodes of the consistent hash ring specifically includes: obtaining the storage space of each physical cluster; dividing each first physical cluster, whose storage space is greater than or equal to a preset threshold, into multiple sub-physical clusters with equal space according to a preset ratio; configuring, according to the naming rule, an identification code for each second physical cluster, whose storage space is less than the preset threshold, and for each of the sub-physical clusters; determining the hash value of the second physical cluster and of each of the sub-physical clusters according to the identification code; and using the hash value to calculate the physical node positions of the second physical cluster and the sub-physical clusters on the consistent hash ring;
  • a data cluster storage device which includes:
  • the acquisition module is used to acquire all physical clusters used to store data
  • a mapping module which is used to map the physical cluster to a physical node of a consistent hash ring
  • the mapping unit is specifically configured to: obtain the storage space of the physical cluster; divide the first physical cluster, whose storage space is greater than or equal to a preset threshold, into a plurality of sub-physical clusters with equal space according to a preset ratio;
  • configure, according to the naming rule, an identification code for the second physical cluster, whose storage space is less than the preset threshold, and for each of the sub-physical clusters; determine the hash value of the second physical cluster and of each of the sub-physical clusters according to the identification code; and use the hash value to calculate the physical node positions of the second physical cluster and the sub-physical clusters on the consistent hash ring;
  • the determining module is used to determine the optimal storage target physical cluster according to the hash value of the file to be stored;
  • the storage module is used to store the file to be stored in the target physical cluster.
  • a non-volatile readable storage medium having computer readable instructions stored thereon, and the computer readable instructions are executed by a processor to implement the above-mentioned data cluster storage method.
  • a computer device including a non-volatile readable storage medium, a processor, and computer-readable instructions that are stored on the non-volatile readable storage medium and can run on the processor; when the processor executes the computer-readable instructions, the data cluster storage method is implemented.
  • this application can evenly map the physical clusters to the physical nodes of the consistent hash ring.
  • the logical node position of the file to be stored on the consistent hash ring is determined according to the hash value of the file to be stored, the optimal target physical cluster for storage is selected based on the logical node position, and the file to be stored is then stored in the target physical cluster.
  • This application can quickly locate the cluster where the data file should be stored through calculation. Because the hash value of the data file is fixed, it can ensure the regular storage of the data in the cluster.
  • each physical cluster is evenly mapped to the physical nodes of the consistent hash ring, so that every physical cluster can store data; this avoids concentrating data in a single physical cluster, which would increase storage pressure and cause a data avalanche.
  • integrating the consistent hash ring into the data cluster storage of the present application can effectively reduce the complexity of data storage, thereby reducing costs, and can achieve efficient positioning of the physical cluster to meet the needs of massive storage expansion.
  • FIG. 1 shows a schematic flowchart of a data cluster storage method provided by an embodiment of the present application
  • FIG. 2 shows a schematic flowchart of another data cluster storage method provided by an embodiment of the present application
  • FIG. 3 shows an example schematic diagram of a data cluster storage method provided by an embodiment of the present application
  • FIG. 4 shows an example schematic diagram of another data cluster storage method provided by an embodiment of the present application
  • FIG. 5 shows an example schematic diagram of yet another data cluster storage method provided by an embodiment of the present application
  • FIG. 6 shows a schematic structural diagram of a data cluster storage device provided by an embodiment of the present application.
  • FIG. 7 shows a schematic structural diagram of another data cluster storage device provided by an embodiment of the present application.
  • the embodiment of the present application provides a method for data cluster storage. As shown in FIG. 1, the method includes:
  • the purpose of obtaining all physical clusters is to configure all physical clusters equally in a consistent hash ring, so as to realize uniform configuration distribution of physical clusters.
  • a unified namespace can be used to name each physical cluster, thereby mapping each physical cluster to a consistent hash ring.
  • the identification code or host name of the physical cluster can be selected as the key for calculating the hash value, so that each machine can determine its position on the hash ring and data files can be stored in a targeted manner based on the consistent hash algorithm.
  • the consistent hash ring can be imagined as a ring composed of 2^32 points: the point directly above the ring represents 0, the first point to the right of 0 represents 1, and so on through 2, 3, 4, 5, 6, ... up to 2^32-1, so that the first point to the left of 0 represents 2^32-1.
  • the consistent hash ring has two layers of nodes: the first layer consists of logical nodes, of which there are 2^32; the second layer consists of physical nodes, which are the actual storage clusters.
  • the target physical cluster is the physical cluster that is most suitable for the storage of the file to be stored determined according to the consistent hash algorithm.
  • the consistent hash algorithm determines the target physical cluster as follows: starting from the logical node location of the file to be stored, the first physical cluster with a normal storage status encountered in the clockwise direction is determined as the target physical cluster.
  • the files to be stored can be stored in the target physical cluster, and queries and data acquisitions of the data to be stored can be received.
  • the physical clusters can be evenly mapped to the physical nodes of the consistent hash ring, the logical node location of the file to be stored on the consistent hash ring is determined according to the hash value of the file to be stored, the optimal target physical cluster for storage is selected based on the logical node location, and the file to be stored is then stored in the target physical cluster.
  • This application can quickly locate the cluster where the data file should be stored through calculation. Because the hash value of the data file is fixed, it can ensure the regular storage of the data in the cluster.
  • each physical cluster is evenly mapped to the physical nodes of the consistent hash ring, so that every physical cluster can store data; this avoids concentrating data in a single physical cluster, which would increase storage pressure and cause a data avalanche.
  • integrating the consistent hash ring into the data cluster storage of the present application can effectively reduce the complexity of data storage, thereby reducing costs, and can achieve efficient positioning of the physical cluster to meet the needs of massive storage expansion.
  • the method includes:
  • all physical clusters used to store data can be obtained from the data storage system. For example, if the data storage system contains four physical clusters A, B, C, and D, the basic information of the four clusters A, B, C, and D needs to be extracted.
  • the preset threshold is the minimum storage space at which a physical cluster is divided into multiple sub-physical clusters.
  • the preset ratio is the number of sub-physical clusters into which a physical cluster is divided, and its value can be preset according to actual needs.
  • for example, suppose the preset threshold is 30TB and the preset ratio between the capacity of a physical cluster and that of each of its sub-physical clusters is 10:1. If the storage space of physical cluster A is 200TB, that of physical cluster B is 100TB, and that of physical cluster C is 20TB, then the storage space of physical cluster A and physical cluster B is greater than the preset threshold, so both are defined as first physical clusters to be divided: according to the preset ratio, physical cluster A is divided into 10 sub-physical clusters of 20TB each, and physical cluster B into 10 sub-physical clusters of 10TB each. Since the storage space of physical cluster C is less than the preset threshold, it is small enough that it does not need to be divided into multiple sub-physical clusters, and it is defined as a second physical cluster.
  • according to the naming rule, configure an identification code for the second physical cluster, whose storage space is less than the preset threshold, and for each sub-physical cluster.
  • a unified namespace needs to be used to configure identification codes that comply with the naming rule for each sub-physical cluster and the second physical cluster, to facilitate unified management of the physical storage space.
  • the naming rule can be uniformly set as cluster[cluster number]-[physical node number]. Before a physical cluster is named, its cluster number must be obtained, and it must be determined whether the physical cluster corresponding to that cluster number is a first physical cluster or a second physical cluster. If it is a first physical cluster, the sequence number of each sub-physical cluster within the first physical cluster must also be obtained; this is the physical node number in the naming rule. If it is a second physical cluster, the physical node number can be set directly to 1.
  • the two sub-physical nodes can be named cluster1-1 and cluster1-2 in sequence; if it is determined that storage cluster 2 has four sub-physical nodes, the four sub-physical nodes can be named cluster2-1, cluster2-2, cluster2-3 and cluster2-4 in sequence; if it is determined that storage cluster 3 is a second physical cluster, it can be named cluster3-1.
  • the mapping of the second physical cluster and the sub-physical clusters to the consistent hash ring may be determined by using the MD5 message digest algorithm to generate a 128-bit (16-byte) hash value, which is used to ensure complete and consistent information transmission.
  • the specific implementation method is: MD5 processes the input identification code in 512-bit groups, and each group is divided into sixteen 32-bit sub-groups. After a series of processing steps, the output of the algorithm consists of four 32-bit groups; concatenating these four 32-bit groups yields a 128-bit hash value.
  • the result calculated by the above formula must be an integer between 0 and 2^32-1, and this integer represents the physical cluster. Since the integer lies between 0 and 2^32-1, the location of the physical node can always be determined on the consistent hash ring, that is, each physical cluster is mapped to the consistent hash ring.
  • the calculation method is the same as in step 204: after the identification code of the file to be stored is obtained, the identification code is converted into a hash value.
  • the result calculated by the above formula must be an integer between 0 and 2^32-1, and this integer represents the file to be stored. Since the integer lies between 0 and 2^32-1, the location of the logical node corresponding to the file to be stored can always be determined on the consistent hash ring.
  • a consistent hash algorithm can be used to determine the target physical cluster corresponding to the file to be stored.
  • the principle of the consistent hash algorithm is: after the physical clusters and the file to be stored are mapped onto the hash ring, the first physical cluster encountered in the clockwise direction, starting from the location of the file to be stored, is the physical cluster on which the current object will be cached. Since the hash values of the file to be stored and of the physical clusters are fixed, the file to be stored is always cached on the same physical cluster as long as the physical clusters remain unchanged. When the file is accessed next time, applying the same algorithm again yields the location where the file is cached, and the file can be found directly on the corresponding physical cluster.
  • the physical cluster is searched clockwise using key1 as the starting point.
  • the physical node key2 on the ring is the first second physical cluster or the first sub-physical cluster retrieved, and the physical cluster corresponding to the key2 point can be determined as the target physical cluster of the file to be stored at the key1 point.
  • the physical cluster has fault tolerance and scalability.
  • the failed physical cluster needs to be removed.
  • physical clusters can be added to the consistent hash ring according to the actual situation.
  • the physical node where the physical cluster is located can be cleared; after it is determined that a new physical cluster needs to be added, an identification code needs to be configured for the physical cluster according to the naming rule, and the physical node position of the physical cluster on the consistent hash ring is determined based on the identification code.
  • step 211 of the embodiment specifically includes: acquiring a newly added second physical cluster or each sub-physical cluster;
  • according to the naming rule, configuring an identification code for the newly added second physical cluster or each sub-physical cluster; based on the identification code, determining the position of the newly added physical node of the newly added second physical cluster or each sub-physical cluster on the consistent hash ring; extracting the data to be migrated that lies between the location of the newly added physical node and the location of the previous cluster physical node in the ring space, where the location of the previous cluster physical node is the physical node location of the first second physical cluster or first sub-physical cluster found counterclockwise from the location of the newly added physical node; and migrating the data to be migrated to the sub-physical cluster or second physical cluster corresponding to the location of the newly added physical node.
  • step 211 of the embodiment specifically includes: determining the second physical cluster or sub-physical cluster to be deleted; migrating all stored files in the second physical cluster or sub-physical cluster to be deleted to the first second physical cluster or first sub-physical cluster in the clockwise direction; and, after the stored files have been migrated, deleting the second physical cluster or sub-physical cluster to be deleted.
  • the affected data is only the data stored in the physical cluster to be deleted, and other data will not be affected.
  • the affected data is only the data between this physical cluster and the previous physical cluster in its ring space (that is, the first physical cluster encountered when walking in the counterclockwise direction); other data is not affected.
  • querying the data stored in the physical cluster may specifically include: receiving a query request for the file to be stored; determining the target physical cluster according to the hash value of the file to be stored; and retrieving the data file from the target physical cluster.
  • as long as the physical clusters remain unchanged, the file to be stored is always cached on the same physical cluster. When the file is accessed next time, applying the same algorithm again yields the location where the file is cached, and the file can be found directly on the corresponding physical cluster.
  • a physical cluster with a larger storage space can be divided into multiple sub-physical clusters with equal space to achieve uniform distribution of physical clusters, so that different data can be evenly stored in corresponding locations, thereby balancing the storage pressure of each physical cluster and avoiding the avalanche of stored data caused by centralized data storage.
  • the second physical cluster, whose storage space is less than the preset threshold, and each of the sub-physical clusters are configured with an identification code, the hash value is calculated according to the identification code, and the second physical cluster and each of the sub-physical clusters are then evenly mapped onto the consistent hash ring; the consistent hash algorithm is then used to determine the optimal target storage cluster for the file to be stored, and the file to be stored is stored in the target storage cluster.
  • it can also receive the addition and deletion instructions of the physical cluster to meet the expansion needs of mass storage.
  • the queryability of data storage can be guaranteed: when the file to be stored is queried, the original storage location of the data does not need to be considered, because the physical cluster that currently stores the file can be located accurately from the hash value of the file to be stored, which achieves efficient positioning of stored data.
  • an embodiment of the present application provides a data cluster storage device.
  • the device includes: an acquisition module 31, a mapping module 32, a determining module 33, and a storage module 34.
  • the obtaining module 31 can be used to obtain all physical clusters used to store data
  • the mapping module 32 can be used to map the physical cluster to the physical nodes of the consistent hash ring;
  • the determining module 33 can be used to determine the optimal storage target physical cluster according to the hash value of the file to be stored;
  • the storage module 34 can be used to store the files to be stored in the target physical cluster.
  • the mapping module 32 can be specifically used to obtain the storage space of the physical cluster; divide the first physical cluster, whose storage space is greater than or equal to the preset threshold, into multiple sub-physical clusters with equal space according to the preset ratio; configure, according to the naming rule, an identification code for the second physical cluster, whose storage space is less than the preset threshold, and for each sub-physical cluster; determine the hash value of the second physical cluster and of each sub-physical cluster according to the identification code; and use the hash value to calculate the physical node positions of the second physical cluster and the sub-physical clusters on the consistent hash ring.
  • the determining module 33 can be specifically used to calculate the hash value of the file to be stored according to the identification code of the file to be stored; use the hash value to determine the location of the logical node of the file to be stored on the consistent hash ring; and determine the first second physical cluster or first sub-physical cluster found clockwise from the logical node location on the consistent hash ring as the target physical cluster.
  • the device further includes: a receiving module 35, an update module 36, and an adjustment module 37.
  • the receiving module 35 can be used to receive addition and deletion instructions to the physical cluster
  • the update module 36 can be used to update the second physical cluster and/or sub-physical cluster on the physical node according to the addition and deletion instructions;
  • the adjustment module 37 can be used to adjust the storage location of the files to be stored that meet the preset conditions.
  • the adjustment module 37 can be specifically used to obtain the newly added second physical cluster or each sub-physical cluster; configure, according to the naming rule, an identification code for the newly added second physical cluster or each sub-physical cluster; determine, based on the identification code, the position of the newly added physical node of the newly added second physical cluster or each sub-physical cluster on the consistent hash ring; extract the data to be migrated that lies between the location of the newly added physical node and the location of the previous cluster physical node in the ring space, where the location of the previous cluster physical node is the physical node location of the first second physical cluster or first sub-physical cluster found counterclockwise from the location of the newly added physical node; and migrate the data to be migrated to the sub-physical cluster or second physical cluster corresponding to the location of the newly added physical node.
  • the adjustment module 37 can be specifically used to determine the second physical cluster or sub-physical cluster to be deleted; migrate all stored files in the second physical cluster or sub-physical cluster to be deleted to the first second physical cluster or first sub-physical cluster in the clockwise direction; and, after the stored files have been migrated, delete the second physical cluster or sub-physical cluster to be deleted.
  • the device in order to provide a query on the storage location of the file to be stored, as shown in FIG. 7, the device further includes: a query module 38.
  • the receiving module 35 can also be used to receive query requests for files to be stored
  • the determining module 33 may also be used to determine the target physical cluster according to the hash value of the file to be stored;
  • the retrieval module 38 can be used to retrieve data files in the target physical cluster.
  • embodiments of the present application also provide a storage medium on which computer-readable instructions are stored; when the instructions are executed by a processor, the data cluster storage method shown in FIG. 1 and FIG. 2 is implemented.
  • the technical solution of this application can be embodied in the form of a software product.
  • the software product can be stored in a non-volatile storage medium (which can be a CD-ROM, a USB flash drive, a removable hard disk, etc.) and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute the methods in each implementation scenario of the present application.
  • an embodiment of the present application also provides a computer device, which may be a personal computer, a server, a network device, etc.
  • the physical device includes a storage medium and a processor; the storage medium is used to store a computer program; the processor is used to execute the computer program to implement the data cluster storage method shown in FIG. 1 and FIG. 2.
  • the computer device may also include a user interface, a network interface, a camera, a radio frequency (RF) circuit, a sensor, an audio circuit, a WI-FI module, and so on.
  • the user interface may include a display screen (Display), an input unit such as a keyboard (Keyboard), etc., and the optional user interface may also include a USB interface, a card reader interface, and the like.
  • the network interface can optionally include a standard wired interface, a wireless interface (such as a Bluetooth interface, a WI-FI interface), etc.
  • the computer device structure provided in this embodiment does not constitute a limitation on the physical device, and may include more or fewer components, or combine certain components, or arrange different components.
  • the non-volatile readable storage medium may also include an operating system and a network communication module.
  • the operating system is a set of computer-readable instructions that manages the hardware and software resources of the physical device for data cluster storage, and supports the running of the information processing computer-readable instructions and of other software and/or computer-readable instructions.
  • the network communication module is used to implement communication between various components in the non-volatile readable storage medium and communication with other hardware and software in the physical device.
  • the present application can be implemented by means of software plus a necessary general hardware platform, or by hardware.
  • the present application can divide a physical cluster with a larger storage space into multiple sub-physical clusters with equal space, which distributes the physical clusters uniformly and allows different data to be stored evenly in the corresponding locations, thereby balancing the storage pressure of each physical cluster and avoiding the avalanche of stored data caused by centralized data storage.
  • the second physical cluster, whose storage space is less than the preset threshold, and each of the sub-physical clusters are configured with an identification code, the hash value is calculated according to the identification code, and the second physical cluster and each of the sub-physical clusters are then evenly mapped onto the consistent hash ring; the consistent hash algorithm is then used to determine the optimal target storage cluster for the file to be stored, and the file to be stored is stored in the target storage cluster.
  • it can also receive the addition and deletion instructions of the physical cluster to meet the expansion needs of mass storage.
  • the queryability of data storage can be guaranteed: when the file to be stored is queried, the original storage location of the data does not need to be considered, because the physical cluster that currently stores the file can be located accurately from the hash value of the file to be stored, which achieves efficient positioning of stored data.
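
The node-addition behaviour described in the definitions above (only the data between a newly added physical node and its counterclockwise predecessor has to move) can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the applicant's implementation: the function names `ring_position`, `owner` and `keys_to_migrate` are hypothetical, and reducing the 128-bit MD5 digest modulo 2^32 is an assumed stand-in for the formula that this text refers to but does not reproduce.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 32  # the ring has 2^32 logical node positions


def ring_position(identifier):
    """Map an identification code to a ring position in [0, 2^32 - 1].

    Assumption: the 128-bit MD5 digest is reduced modulo 2^32; the source
    text does not show its own reduction formula.
    """
    digest = hashlib.md5(identifier.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


def owner(node_positions, key_pos):
    """Return the first node position at or clockwise after key_pos.

    node_positions must be sorted in ascending order.
    """
    idx = bisect.bisect_left(node_positions, key_pos)
    return node_positions[idx % len(node_positions)]  # wrap past 2^32 - 1


def keys_to_migrate(node_ids, new_node_id, file_ids):
    """List the files whose target becomes new_node_id once it joins the ring."""
    old_ring = sorted(ring_position(n) for n in node_ids)
    new_pos = ring_position(new_node_id)
    new_ring = sorted(old_ring + [new_pos])
    moved = []
    for file_id in file_ids:
        pos = ring_position(file_id)
        # A file moves only if the new node is now its first clockwise node.
        if owner(new_ring, pos) == new_pos and owner(old_ring, pos) != new_pos:
            moved.append(file_id)
    return moved


if __name__ == "__main__":
    nodes = ["cluster1-1", "cluster1-2", "cluster2-1", "cluster3-1"]
    files = ["file-%d" % i for i in range(20)]
    print(keys_to_migrate(nodes, "cluster4-1", files))
```

Every file that is not returned by `keys_to_migrate` keeps the same target before and after the addition, which is the property the text relies on when it says other data is not affected.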

Abstract

A data cluster storage method and apparatus, and a computer device, which relate to the field of data processing, and can solve the problem of poor storage performance caused by incapability of guaranteeing the regular storage of data in a cluster and rapidly locating the position where the data is stored when the cluster is used for data storage. The method comprises: obtaining all physical clusters for storing data (101); uniformly mapping the physical clusters onto a physical node of a consistent hash ring (102); determining, according to a hash value of a file to be stored, the optimal target physical cluster for storage (103); and storing said file into the target physical cluster (104). The method is applicable to data cluster storage.
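
The four steps of the abstract (101 to 104) can be sketched roughly as follows. This is a simplified illustration under stated assumptions rather than the claimed implementation: `build_ring`, `target_cluster` and `store` are hypothetical names, a plain dictionary stands in for the real physical clusters, the MD5 digest is reduced modulo 2^32 as an assumed way of obtaining a ring position, and the capacities, threshold and number of parts in the usage lines are illustrative values only.

```python
import bisect
import hashlib


def ring_position(identifier):
    """Hash an identifier onto [0, 2^32 - 1] (MD5 modulo 2^32 is an assumption)."""
    digest = hashlib.md5(identifier.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % 2 ** 32


def build_ring(capacities_tb, threshold_tb, parts):
    """Steps 101-102: name each (sub-)cluster and place it on the ring.

    Clusters at or above threshold_tb are split into `parts` equal
    sub-clusters named cluster<N>-<k>; smaller ones become cluster<N>-1.
    Returns a dict mapping ring position -> identification code.
    """
    ring = {}
    for number, capacity in capacities_tb.items():
        count = parts if capacity >= threshold_tb else 1
        for k in range(1, count + 1):
            name = "cluster%d-%d" % (number, k)
            ring[ring_position(name)] = name
    return ring


def target_cluster(ring, file_id):
    """Step 103: first physical node clockwise from the file's logical node."""
    positions = sorted(ring)
    idx = bisect.bisect_left(positions, ring_position(file_id))
    return ring[positions[idx % len(positions)]]  # wrap around the ring


def store(ring, storage, file_id, content):
    """Step 104: put the file into its target cluster (a dict stands in for storage)."""
    name = target_cluster(ring, file_id)
    storage.setdefault(name, {})[file_id] = content
    return name


if __name__ == "__main__":
    # Illustrative values: clusters 1 and 2 are split, cluster 3 is not.
    ring = build_ring({1: 200, 2: 100, 3: 20}, threshold_tb=30, parts=10)
    storage = {}
    where = store(ring, storage, "report.pdf", b"example bytes")
    # A later query recomputes the same target from the file identifier alone.
    print(where, target_cluster(ring, "report.pdf") == where)
```

Because the lookup depends only on the file identifier and the current set of nodes, no separate index library is needed to find a stored file, which is the point the abstract makes about locating data quickly.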

Description

Method, apparatus, and computer device for data cluster storage
Technical field
This application claims priority to Chinese patent application No. 201910625543.5, filed with the Chinese Patent Office on July 11, 2019 and entitled "Method, apparatus, and computer device for data cluster storage", the entire content of which is incorporated herein by reference.
Background
As data storage services consume ever more physical resources such as servers and cabinets, expanding storage capacity through clusters has become widely popular. Cluster storage aggregates the storage space of multiple storage devices into a single storage pool that offers application servers a unified access interface and management interface. Through this access interface, applications can transparently access and use the disks on all storage devices, making full use of the performance and disk utilization of the storage devices.
At present, cluster storage commonly places data at random in whichever physical cluster is idle. Storing data across clusters then requires an additional index database that records the correspondence between data and clusters, and the cluster holding a historical file is found by querying that index database.
However, this storage method cannot guarantee that data is stored in the cluster according to a rule, and it cannot quickly locate where data is stored, which results in poor storage performance.
Summary of the invention
In view of this, the present application discloses a method, apparatus, and computer device for data cluster storage. The main purpose is to solve the problem that, in data cluster storage, data cannot be guaranteed to be stored in the cluster according to a rule and the storage location of data cannot be located quickly, which results in poor storage performance.
According to one aspect of the present application, a method for data cluster storage is provided, the method comprising:
obtaining all physical clusters used to store data;
mapping the physical clusters onto physical nodes of a consistent hash ring;
wherein uniformly mapping the physical clusters onto the physical nodes of the consistent hash ring specifically comprises: obtaining the storage space of each physical cluster; dividing each first physical cluster, whose storage space is greater than or equal to a preset threshold, into multiple sub-physical clusters of equal size according to a preset ratio; configuring, according to a naming rule, an identification code for each second physical cluster, whose storage space is less than the preset threshold, and for each sub-physical cluster; determining the hash value of each second physical cluster and each sub-physical cluster from its identification code; and using the hash values to calculate the physical node positions of the second physical clusters and the sub-physical clusters on the consistent hash ring;
determining the optimal target physical cluster for storage according to the hash value of the file to be stored; and
storing the file to be stored in the target physical cluster.
According to another aspect of the present application, an apparatus for data cluster storage is provided, the apparatus comprising:
an acquisition module, configured to obtain all physical clusters used to store data;
a mapping module, configured to map the physical clusters onto physical nodes of a consistent hash ring;
wherein the mapping module is specifically configured to: obtain the storage space of each physical cluster; divide each first physical cluster, whose storage space is greater than or equal to a preset threshold, into multiple sub-physical clusters of equal size according to a preset ratio; configure, according to a naming rule, an identification code for each second physical cluster, whose storage space is less than the preset threshold, and for each sub-physical cluster; determine the hash value of each second physical cluster and each sub-physical cluster from its identification code; and use the hash values to calculate the physical node positions of the second physical clusters and the sub-physical clusters on the consistent hash ring;
a determining module, configured to determine the optimal target physical cluster for storage according to the hash value of the file to be stored; and
a storage module, configured to store the file to be stored in the target physical cluster.
According to yet another aspect of the present application, a non-volatile readable storage medium is provided, on which computer-readable instructions are stored; when executed by a processor, the computer-readable instructions implement the above method for data cluster storage.
According to a further aspect of the present application, a computer device is provided, comprising a non-volatile readable storage medium, a processor, and computer-readable instructions stored on the non-volatile readable storage medium and executable on the processor; when the processor executes the computer-readable instructions, the above method for data cluster storage is implemented.
With the above technical solution, the method, apparatus, and computer device for data cluster storage provided by the present application differ from the current approach of storing data at random in idle physical clusters. The present application maps the physical clusters uniformly onto the physical nodes of a consistent hash ring, determines the logical node position of the file to be stored on the ring from the file's hash value, selects the optimal target physical cluster based on that logical node position, and then stores the file in the target physical cluster. The cluster in which a data file should be stored can therefore be located quickly by calculation, and because the hash value of a data file is fixed, data is stored in the clusters according to a rule. Mapping the physical clusters uniformly onto the physical nodes of the ring also lets every physical cluster receive data, which avoids concentrating data in one physical cluster, where the increased storage pressure could lead to a data avalanche. In addition, integrating the consistent hash ring into data cluster storage effectively reduces the complexity of data storage and thus its cost, enables efficient location of physical clusters, and meets the expansion needs of mass storage.
Description of the drawings
The drawings described here provide a further understanding of the present application and constitute a part of it. The exemplary embodiments of the present application and their descriptions are used to explain the present application and do not constitute an improper limitation of it. In the drawings:
FIG. 1 shows a schematic flowchart of a method for data cluster storage provided by an embodiment of the present application;
FIG. 2 shows a schematic flowchart of another method for data cluster storage provided by an embodiment of the present application;
FIG. 3 shows a schematic diagram of an example of a method for data cluster storage provided by an embodiment of the present application;
FIG. 4 shows a schematic diagram of an example of another method for data cluster storage provided by an embodiment of the present application;
FIG. 5 shows a schematic diagram of an example of yet another method for data cluster storage provided by an embodiment of the present application;
FIG. 6 shows a schematic structural diagram of an apparatus for data cluster storage provided by an embodiment of the present application;
FIG. 7 shows a schematic structural diagram of another apparatus for data cluster storage provided by an embodiment of the present application.
Detailed description
The present application is described in detail below with reference to the drawings and in conjunction with the embodiments. It should be noted that, where no conflict arises, the embodiments of the present application and the features in the embodiments can be combined with each other.
The embodiments address the problem that, in current data cluster storage, data cannot be guaranteed to be stored in the cluster according to a rule and the storage location of data cannot be located quickly, which results in poor storage performance. An embodiment of the present application provides a method for data cluster storage. As shown in FIG. 1, the method includes:
101. Obtain all physical clusters used to store data.
In this embodiment, the purpose of obtaining all physical clusters is to place them evenly on the consistent hash ring, so that the physical clusters are uniformly distributed.
102. Map the physical clusters uniformly onto the physical nodes of the consistent hash ring.
In this embodiment, a unified namespace can be used to name each physical cluster so that it can be mapped onto the consistent hash ring. Specifically, the identification code or host name of a physical cluster can be used as the key from which the hash value is calculated. The position of each machine on the hash ring can then be determined, so that data files can be stored in a targeted way using the consistent hash algorithm.
The consistent hash ring can be pictured as a ring made up of 2^32 points. The point at the very top of the ring represents 0, the first point to the right of 0 represents 1, and so on through 2, 3, 4, 5, 6, ... up to 2^32-1; that is, the first point to the left of 0 represents 2^32-1. The consistent hash ring has two layers of nodes: the first layer consists of logical nodes, of which there are 2^32; the second layer consists of physical nodes, which are the actual storage clusters.
103. Determine the optimal target physical cluster for storage according to the hash value of the file to be stored.
The target physical cluster is the physical cluster that the consistent hash algorithm determines to be most suitable for the file to be stored. The algorithm determines it as follows: starting from the logical node position of the file to be stored, the first physical cluster in a normal storage state encountered in the clockwise direction is the target physical cluster.
104. Store the file to be stored in the target physical cluster.
In a specific application scenario, once the optimal target physical cluster has been determined, the file to be stored can be stored in it, and subsequent queries for and retrievals of the stored data can be served.
With the method for data cluster storage in this embodiment, the physical clusters are mapped uniformly onto the physical nodes of a consistent hash ring, the logical node position of the file to be stored on the ring is determined from the file's hash value, the optimal target physical cluster is selected based on that logical node position, and the file is then stored in the target physical cluster. The cluster in which a data file should be stored can therefore be located quickly by calculation, and because the hash value of a data file is fixed, data is stored in the clusters according to a rule. Mapping the physical clusters uniformly onto the physical nodes of the ring also lets every physical cluster receive data, which avoids concentrating data in one physical cluster, where the increased storage pressure could lead to a data avalanche. In addition, integrating the consistent hash ring into data cluster storage effectively reduces the complexity of data storage and thus its cost, enables efficient location of physical clusters, and meets the expansion needs of mass storage.
Further, as a refinement and extension of the above embodiment, and to describe the implementation process completely, another method for data cluster storage is provided. As shown in FIG. 2, the method includes:
201. Obtain all physical clusters used to store data.
In a specific application scenario, all physical clusters used to store data can be obtained from the data storage system. For example, if the data storage system contains four physical clusters A, B, C, and D, the basic information of these four clusters needs to be extracted.
202. Obtain the storage space of each physical cluster, and divide each first physical cluster, whose storage space is greater than or equal to a preset threshold, into multiple sub-physical clusters of equal size according to a preset ratio.
In a specific application scenario, when only a few physical clusters are present on the consistent hash ring, uneven node distribution easily skews data storage. Therefore, in the present application, a physical cluster with a large storage space is split into multiple sub-physical clusters that are placed at different physical nodes. This keeps data evenly distributed among the physical clusters and avoids storage skew. Moreover, when a single sub-physical cluster fails, the other, healthy sub-physical clusters are not affected, which safeguards the stored data.
The preset threshold is the minimum storage space at which a physical cluster is divided into multiple sub-physical clusters. The preset ratio is the number of sub-physical clusters into which a physical cluster is divided; its value can be set in advance according to actual needs.
For example, suppose the preset ratio between the capacity of a physical cluster and that of each of its sub-physical clusters is 10:1 and the preset threshold is 30 TB, and suppose the storage space obtained for physical cluster A is 200 TB, for physical cluster B is 100 TB, and for physical cluster C is 20 TB. Because the storage space of physical clusters A and B is greater than the preset threshold, A and B are defined as first physical clusters to be divided: according to the preset ratio, physical cluster A is divided into 10 sub-physical clusters of 20 TB each, and physical cluster B into 10 sub-physical clusters of 10 TB each. Because the storage space of physical cluster C is less than the preset threshold, it is judged to be small, is not divided into sub-physical clusters, and is defined as a second physical cluster.
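The division in step 202 can be sketched as follows. This is only an illustrative sketch under stated assumptions: the function name `split_clusters`, the tuple layout, and the use of Python are not part of the application; only the 30 TB threshold, the 10:1 ratio, and the capacities of clusters A, B, and C come from the example above.

```python
def split_clusters(capacities_tb, threshold_tb=30, ratio=10):
    """Return (cluster_name, sub_index, capacity_tb) entries for the ring.

    Hypothetical helper: clusters at or above the threshold ("first"
    physical clusters) are cut into `ratio` equal parts; smaller
    ("second") physical clusters are kept whole with sub_index 1.
    """
    nodes = []
    for name, capacity in capacities_tb.items():
        if capacity >= threshold_tb:
            share = capacity / ratio
            nodes.extend((name, i, share) for i in range(1, ratio + 1))
        else:
            nodes.append((name, 1, capacity))
    return nodes


# Capacities taken from the example: A = 200 TB, B = 100 TB, C = 20 TB.
nodes = split_clusters({"A": 200, "B": 100, "C": 20})
```

With these inputs, A and B each contribute ten entries and C contributes one, so 21 candidates are placed on the ring instead of three.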
203. According to the naming rule, configure an identification code for each second physical cluster, whose storage space is less than the preset threshold, and for each sub-physical cluster.
After the first physical clusters with large storage space have been divided into sub-physical clusters in step 202, a unified namespace is used to give each sub-physical cluster and each second physical cluster an identifier that follows the naming rule, which simplifies unified management of the physical storage space.
The naming rule can be uniformly set to cluster[cluster number]-[physical node number]. Before a physical cluster is named, its cluster number is obtained and it is determined whether the cluster is a first physical cluster or a second physical cluster. For a first physical cluster, the sequence number of each sub-physical cluster within it is also obtained; this is the physical node number in the naming rule. For a second physical cluster, the physical node number is set directly to 1. For example, if storage cluster 1 has two sub-physical nodes, they are named cluster1-1 and cluster1-2 in sequence; if storage cluster 2 has four sub-physical nodes, they are named cluster2-1, cluster2-2, cluster2-3, and cluster2-4 in sequence; if storage cluster 3 is a second physical cluster, it is named cluster3-1.
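The naming rule can be expressed in a few lines. The sketch below is illustrative only; the helper name `node_ids` and its parameters are assumptions, while the format cluster[cluster number]-[physical node number] is the one given above.

```python
def node_ids(cluster_number, sub_cluster_count=1):
    """Build identification codes of the form cluster<number>-<node number>.

    Hypothetical helper: a second physical cluster is not divided, so it
    keeps the default count of 1 and receives the single suffix "-1".
    """
    return [f"cluster{cluster_number}-{i}" for i in range(1, sub_cluster_count + 1)]


print(node_ids(1, 2))  # ['cluster1-1', 'cluster1-2']
print(node_ids(3))     # ['cluster3-1']
```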
204. Determine the hash value of each second physical cluster and each sub-physical cluster from its identification code.
In this embodiment, in a specific application scenario, the hash value used to map the second physical clusters and sub-physical clusters onto the consistent hash ring can be obtained as follows: the MD5 message digest algorithm is used to produce a 128-bit (16-byte) hash value, a digest of the kind used to verify that transmitted information is complete and consistent. Specifically, MD5 processes the input identification code in 512-bit blocks, each of which is divided into sixteen 32-bit sub-blocks. After a series of processing steps, the output of the algorithm consists of four 32-bit words, which are concatenated to form a 128-bit hash value.
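As an illustration of the 128-bit output described in step 204, the sketch below computes an MD5 digest with Python's standard `hashlib` module and splits it into its four 32-bit words. The function name `md5_words` and the UTF-8 encoding of the identification code are assumptions; the application does not specify how the identifier is encoded.

```python
import hashlib
import struct


def md5_words(identifier):
    """Return the 16-byte MD5 digest and its four 32-bit words.

    Assumption: the identification code is encoded as UTF-8 before hashing.
    """
    digest = hashlib.md5(identifier.encode("utf-8")).digest()
    # MD5 emits its four 32-bit state words in little-endian byte order.
    words = struct.unpack("<4I", digest)
    return digest, words


digest, words = md5_words("cluster1-1")
print(len(digest) * 8)  # 128
print(len(words))       # 4
```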
205. Use the hash values to calculate the physical node positions of the second physical clusters and the sub-physical clusters on the consistent hash ring.
In this embodiment, in a specific application scenario, the physical node position of a second physical cluster or sub-physical cluster on the consistent hash ring can be determined from its hash value as follows: the hash value obtained for the cluster is taken modulo 2^32, and the result is the corresponding physical node on the ring, that is, Hash(cluster1) = hash(cluster1) % 2^32, where hash(cluster1) is the hash value obtained from the identification code and Hash(cluster1) is the physical node of the corresponding second physical cluster or sub-physical cluster on the consistent hash ring. The result of this formula is always an integer between 0 and 2^32-1, and this integer represents the physical cluster. Because the integer always lies between 0 and 2^32-1, a physical node position can always be determined on the consistent hash ring, which is how each physical cluster is mapped onto the ring.
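The formula Hash(cluster1) = hash(cluster1) % 2^32 can be sketched as below. This is a minimal sketch, not the applicant's implementation: the name `ring_position`, the UTF-8 encoding, and the big-endian interpretation of the digest as an integer are assumptions made for illustration.

```python
import hashlib

RING_SIZE = 2 ** 32  # number of logical points on the ring


def ring_position(identifier):
    """Map an identification code to an integer position on the ring."""
    digest = hashlib.md5(identifier.encode("utf-8")).digest()
    # Assumption: the 128-bit digest is read as a big-endian integer.
    return int.from_bytes(digest, "big") % RING_SIZE


position = ring_position("cluster1-1")
print(0 <= position < RING_SIZE)  # True
```

Because the digest is deterministic, the same identification code always lands on the same point of the ring.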
206. Calculate the hash value of the file to be stored from the identification code of the file.
The calculation in this step is the same as in step 204: after the identification code of the file to be stored is obtained, it is converted into a hash value.
207. Use the hash value to determine the logical node position of the file to be stored on the consistent hash ring.
In this embodiment, correspondingly, the logical node position of the file to be stored can be determined from its hash value as follows: the hash value obtained for the file is taken modulo 2^32, and the result is the corresponding logical node on the ring, that is, Hash(obj1) = hash(obj1) % 2^32, where hash(obj1) is the hash value obtained from the identification code of the file to be stored and Hash(obj1) is the logical node corresponding to the file. The result of this formula is always an integer between 0 and 2^32-1, and this integer represents the file to be stored. Because the integer always lies between 0 and 2^32-1, the logical node position corresponding to the file can always be determined on the consistent hash ring.
208. Determine as the target physical cluster the first second physical cluster or first sub-physical cluster found by searching clockwise on the consistent hash ring from the logical node position.
In a specific application scenario, the target physical cluster for the file to be stored can be determined with the consistent hash algorithm. The principle of the algorithm is as follows: after the physical clusters and the file to be stored have all been mapped onto the hash ring, the first physical cluster encountered clockwise from the position of the file is the physical cluster in which the file will be stored. Because the hash values of the file and of the physical clusters are fixed, the file is always stored in the same physical cluster as long as the physical clusters do not change. The next time the file needs to be accessed, the same algorithm is simply applied again to calculate where the file is stored, and the file is looked up directly in the corresponding physical cluster.
For example, as shown in FIG. 3, suppose the preceding steps determine that the logical node position of the file to be stored on the consistent hash ring is key1. The physical clusters are then searched clockwise starting from key1. If the physical node key2 on the ring is the first second physical cluster or first sub-physical cluster found, the physical cluster corresponding to key2 is determined as the target physical cluster for the file at key1.
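The clockwise search of step 208 amounts to a binary search over the sorted node positions with wrap-around. The sketch below is illustrative; the function name `first_clockwise` and the small integer positions are hypothetical and chosen only so the result can be checked by hand.

```python
import bisect


def first_clockwise(ring, position):
    """Return the node reached first when walking clockwise from `position`.

    `ring` is a list of (position, node_id) pairs sorted by position.
    A node sitting exactly at `position` counts as found.
    """
    positions = [p for p, _ in ring]
    i = bisect.bisect_left(positions, position)
    # Past the last node, the walk wraps around to the first node on the ring.
    return ring[i % len(ring)][1]


# Hypothetical positions; real positions would come from the hash step.
ring = [(100, "key4"), (300, "key2"), (700, "key3")]
print(first_clockwise(ring, 150))  # key2
print(first_clockwise(ring, 900))  # key4 (wraps past zero)
```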
209. Receive an instruction to add or delete a physical cluster.
In this embodiment, in a specific application scenario, the physical clusters are fault-tolerant and scalable. When a physical cluster is found to have failed, it needs to be removed so that data storage is not affected. In addition, to increase the storage space of the cluster storage, physical clusters can be added to the consistent hash ring as circumstances require.
210. Update the second physical clusters and/or sub-physical clusters on the physical nodes according to the add or delete instruction.
In this embodiment, after an instruction to delete a physical cluster is received, the physical node at which that physical cluster is located can be cleared. After it is determined that a new physical cluster needs to be added, an identification code is configured for it according to the naming rule, and its physical node position on the consistent hash ring is determined from that identification code.
211. Adjust the storage location of the stored files that meet a preset condition.
In this embodiment, in a specific application scenario, if an instruction to add a physical cluster is received in step 209, step 211 specifically includes: obtaining the newly added second physical cluster or sub-physical clusters; configuring an identification code for the newly added second physical cluster or each sub-physical cluster according to the naming rule; determining, from the identification code, the new physical node position of the newly added second physical cluster or each sub-physical cluster on the consistent hash ring; extracting the data to be migrated that lies between the new physical node position and the physical node position of the previous cluster in the ring space, where the previous cluster's position is the position of the first second physical cluster or first sub-physical cluster found by searching counterclockwise from the new physical node position; and migrating the data to be migrated into the sub-physical cluster or second physical cluster corresponding to the new physical node position.
For example, as shown in FIG. 4, when a new physical node key5 is added on the consistent hash ring just ahead of physical node key2 in the clockwise direction, all data to be migrated that lies between the new physical node position key5 and the previous cluster's physical node position key4 in the ring space needs to be determined. If the data file key1 is found to lie in this range, key1, which was originally stored on physical node key2, is migrated to the physical cluster corresponding to physical node key5; the other stored data files do not need to change. Thus, when a physical cluster is added, the only data affected is the data between the new physical cluster and the previous physical cluster in the ring space (that is, the first physical cluster encountered when moving counterclockwise); all other data is unaffected.
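The selection of data to migrate when a node is added can be sketched as follows. The function names `in_arc` and `files_to_migrate`, the dictionary layout, and the integer positions are assumptions used for illustration; the labels key1, key2, key4, and key5 mirror the roles they play in FIG. 4.

```python
def in_arc(position, start, end):
    """True if `position` lies on the clockwise arc (start, end]."""
    if start < end:
        return start < position <= end
    return position > start or position <= end  # the arc wraps past zero


def files_to_migrate(file_positions, node_positions, new_node):
    """List files that move to a node newly added at position `new_node`.

    `file_positions` maps file id -> ring position; `node_positions` holds
    the positions of the existing nodes (at least one is assumed).
    """
    # Previous node: first existing node found counterclockwise of the new one.
    prev = max((p for p in node_positions if p < new_node),
               default=max(node_positions))
    return [f for f, pos in file_positions.items() if in_arc(pos, prev, new_node)]


# Hypothetical layout: key4 at 100, key2 at 300, new node key5 at 200.
files = {"key1": 150, "other": 250, "early": 50}
print(files_to_migrate(files, [100, 300], 200))  # ['key1']
```

Only the file between key4 and key5 moves; the file at 250 stays on key2 and the file at 50 stays on key4.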
相应的,若基于实施例步骤209接收到对物理集群的删减指令,则实施例步骤211具体包括:确定待删减的第二物理集群或子物理集群;将待删减的第二物理集群或子物理集群中的所有存储文件全部迁移存储到顺时针方向上的首个第二物理集群或首个子物理集群中;在完成对存储文件的迁移存储后,将待删减的第二物理集群或子物理集群删除。Correspondingly, if a deletion instruction for a physical cluster is received based on step 209 of the embodiment, step 211 of the embodiment specifically includes: determining the second physical cluster or sub-physical cluster to be deleted; and setting the second physical cluster to be deleted Or all storage files in the sub-physical cluster are migrated and stored in the first second physical cluster or the first sub-physical cluster in the clockwise direction; after the storage files are migrated and stored, the second physical cluster to be deleted Or delete the sub-physical cluster.
For example, as shown in Figure 5, if the physical cluster corresponding to physical node position key2 is found to be faulty, key2 only needs to be removed from the hash ring. Before key2 is removed, the data file key1 previously stored on key2 must be migrated to the physical cluster corresponding to key3, because, starting from the position of key1 and moving clockwise, the first physical cluster encountered other than key2 is the one corresponding to key3. In this embodiment, if a physical cluster is deleted, the only data affected is the data stored in the cluster being deleted; other data is unaffected. In other words, if a physical cluster becomes unavailable, the only data affected is the data between that cluster and the previous physical cluster in the ring space (the first physical cluster encountered when walking counterclockwise); other data is unaffected.
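The removal order described above (migrate first, then delete) can be sketched as follows. The sketch assumes a hypothetical in-memory `placement` map from a node's ring position to the names of the files it holds; the positions mirror Figure 5 (key4 at 100, key2 at 200, key3 at 300) and are not taken from the source.

```python
import bisect


def successor(node_positions: list[int], position: int) -> int:
    """Return the first node strictly clockwise of position, with wrap-around."""
    ordered = sorted(node_positions)
    index = bisect.bisect_right(ordered, position)
    return ordered[index % len(ordered)]


def remove_node(placement: dict[int, set[str]], node: int) -> int:
    """Move every file on `node` to its clockwise successor, then drop the node.

    Returns the ring position of the node that received the files.
    """
    if len(placement) < 2:
        raise ValueError("cannot remove the last remaining node")
    target = successor([p for p in placement if p != node], node)
    placement[target] |= placement[node]  # migrate the stored files first ...
    del placement[node]                   # ... and only then delete the node
    return target


# Hypothetical state mirroring Figure 5: key1 is stored on key2 (position 200).
placement = {100: {"f3"}, 200: {"key1"}, 300: {"f2"}}
receiver = remove_node(placement, 200)
```

After the call, `receiver` is 300 (key3), `placement[300]` holds both `f2` and `key1`, and the files on the node at position 100 are untouched.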
In a specific application scenario, to make stored data easy to retrieve, as a preferred option, after the data to be stored has been stored in the physical cluster, the method may further include: receiving a query request for the file to be stored; determining the target physical cluster from the hash value of the file to be stored; and retrieving the data file from the target physical cluster.
For this embodiment, as long as the set of physical clusters does not change, a given file to be stored is always placed on the same physical cluster. The next time the file needs to be accessed, running the same algorithm again yields the location where the file was stored, and the file can be looked up directly in the corresponding physical cluster.
With the above data cluster storage method, a physical cluster with a large storage space can be split into multiple sub-physical clusters of equal size, so that the physical clusters are evenly distributed and different data can be stored evenly in the corresponding locations. This balances the storage pressure across the physical clusters and avoids the storage avalanche caused by concentrated data storage. Identity codes are then configured, according to the naming rule, for the second physical clusters whose storage space is below the preset threshold and for each sub-physical cluster; hash values are computed from the identity codes; and the second physical clusters and sub-physical clusters are thereby mapped evenly onto the consistent hash ring. The consistent hashing algorithm is then used to determine the optimal target storage cluster for the file to be stored, and the file is stored in that cluster. In addition, instructions to add or delete physical clusters can be received, which meets the capacity expansion needs of mass storage. The second physical clusters and/or sub-physical clusters on the physical nodes are then updated according to the add/delete instruction, and the storage locations of files to be stored that meet the preset condition are adjusted promptly. This keeps stored data queryable: when querying a file to be stored, there is no need to consider where the data was originally stored; the physical cluster that currently holds it can be located accurately from the file's hash value alone, which makes locating stored data efficient.
Further, as a concrete implementation of the methods shown in Figure 1 and Figure 2, an embodiment of this application provides an apparatus for data cluster storage. As shown in Figure 3, the apparatus includes an acquisition module 31, a mapping module 32, a determining module 33, and a storage module 34.
The acquisition module 31 can be used to acquire all physical clusters used for storing data;
The mapping module 32 can be used to map the physical clusters onto the physical nodes of the consistent hash ring;
The determining module 33 can be used to determine the optimal target physical cluster for storage according to the hash value of the file to be stored;
The storage module 34 can be used to store the file to be stored in the target physical cluster.
In a specific application scenario, to map the physical clusters onto the physical nodes of the consistent hash ring, the mapping module 32 can specifically be used to: obtain the storage space of each physical cluster; split each first physical cluster, whose storage space is greater than or equal to a preset threshold, into multiple sub-physical clusters of equal size according to a preset ratio; configure identity codes, according to the naming rule, for each second physical cluster, whose storage space is below the preset threshold, and for each sub-physical cluster; determine the hash values of the second physical clusters and of each sub-physical cluster from the identity codes; and use the hash values to compute the physical node positions of the second physical clusters and sub-physical clusters on the consistent hash ring.
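The mapping steps above can be sketched as follows. Several details are assumptions made for illustration, since the source does not fix them: the "preset ratio" is modelled as a number of equal parts, the naming rule is `name#index`, the hash is the first four bytes of an MD5 digest (giving a ring space of 2^32 positions), and the names `Cluster`, `split_clusters`, and `build_ring` are hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    name: str          # identity code under the assumed naming rule
    capacity_gb: int   # storage space of the (sub-)cluster


def ring_position(identifier: str) -> int:
    """Hash an identity code to a position in [0, 2**32)."""
    digest = hashlib.md5(identifier.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def split_clusters(clusters: list[Cluster], threshold_gb: int,
                   parts: int) -> list[Cluster]:
    """Split clusters at or above the threshold into `parts` equal sub-clusters."""
    result = []
    for cluster in clusters:
        if cluster.capacity_gb >= threshold_gb:      # a "first" physical cluster
            share = cluster.capacity_gb // parts
            for i in range(parts):
                result.append(Cluster(f"{cluster.name}#{i}", share))
        else:                                        # a "second" physical cluster
            result.append(cluster)
    return result


def build_ring(clusters: list[Cluster], threshold_gb: int,
               parts: int) -> list[tuple[int, str]]:
    """Return (position, identity code) pairs sorted by ring position."""
    nodes = split_clusters(clusters, threshold_gb, parts)
    return sorted((ring_position(c.name), c.name) for c in nodes)


ring = build_ring([Cluster("A", 400), Cluster("B", 100)], threshold_gb=200, parts=4)
```

With these inputs, cluster `A` becomes four 100 GB sub-clusters `A#0` to `A#3`, cluster `B` stays whole, and `ring` holds five nodes of equal capacity. Equal capacity does not by itself guarantee evenly spaced ring positions; that depends on how well the hash spreads the identity codes.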
Correspondingly, to determine the optimal target physical cluster for the file to be stored, the determining module 33 can specifically be used to: compute the hash value of the file to be stored from the file's identity code; use the hash value to determine the file's logical node position on the consistent hash ring; and take as the target physical cluster the first second physical cluster or first sub-physical cluster found by searching clockwise on the consistent hash ring from the logical node position.
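The clockwise search performed by the determining module can be sketched with a sorted list and a binary search. This is an illustrative sketch only: the `HashRing` class, the MD5-based `ring_position` helper, and the sample identifiers are assumptions, not details given in the source.

```python
import bisect
import hashlib


def ring_position(identifier: str) -> int:
    """Hash an identity code to a position in [0, 2**32)."""
    digest = hashlib.md5(identifier.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class HashRing:
    """Sorted ring of cluster positions supporting clockwise lookup."""

    def __init__(self, cluster_ids: list[str]) -> None:
        self._nodes = sorted((ring_position(c), c) for c in cluster_ids)
        self._positions = [position for position, _ in self._nodes]

    def target_cluster(self, file_id: str) -> str:
        """Return the first cluster at or clockwise of the file's position."""
        if not self._nodes:
            raise LookupError("ring has no clusters")
        index = bisect.bisect_left(self._positions, ring_position(file_id))
        # An index past the end means the search wrapped past the zero point.
        return self._nodes[index % len(self._nodes)][1]


ring = HashRing(["A#0", "A#1", "B"])
first = ring.target_cluster("report.pdf")
second = ring.target_cluster("report.pdf")
```

Because the lookup depends only on the file's identity code and the current set of clusters, `first` and `second` are always the same cluster. This is the property that lets a later query find a stored file without recording where it was originally placed.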
In a specific application scenario, to remove faulty physical clusters or to expand the set of physical clusters, as shown in Figure 6, the apparatus further includes a receiving module 35, an update module 36, and an adjustment module 37.
The receiving module 35 can be used to receive instructions to add or delete physical clusters;
The update module 36 can be used to update the second physical clusters and/or sub-physical clusters on the physical nodes according to the add/delete instruction;
The adjustment module 37 can be used to adjust the storage locations of files to be stored that meet the preset condition.
In a specific application scenario, if the receiving module 35 receives an instruction to add a physical cluster, the adjustment module 37 can specifically be used to: acquire the newly added second physical cluster or sub-physical clusters; configure an identity code for the newly added second physical cluster or each sub-physical cluster according to the naming rule; determine, based on the identity code, the new physical node position of the newly added second physical cluster or each sub-physical cluster on the consistent hash ring; extract the data to be migrated that lies between the new physical node position and the position of the previous cluster's physical node in the ring space, where the previous cluster's physical node position is the physical node position of the first second physical cluster or first sub-physical cluster found by searching counterclockwise from the new physical node position; and migrate the data to be migrated to the sub-physical cluster or second physical cluster corresponding to the new physical node position.
Correspondingly, if the receiving module 35 receives an instruction to delete a physical cluster, the adjustment module 37 can specifically be used to: determine the second physical cluster or sub-physical cluster to be deleted; migrate all files stored in the second physical cluster or sub-physical cluster to be deleted to the first second physical cluster or first sub-physical cluster in the clockwise direction; and, once the stored files have been migrated, delete the second physical cluster or sub-physical cluster to be deleted.
In a specific application scenario, to support queries for the storage location of the file to be stored, as shown in Figure 7, the apparatus further includes a retrieval module 38.
The receiving module 35 can also be used to receive a query request for the file to be stored;
The determining module 33 can also be used to determine the target physical cluster according to the hash value of the file to be stored;
The retrieval module 38 can be used to retrieve the data file from the target physical cluster.
It should be noted that, for other descriptions of the functional units of the data cluster storage apparatus provided in this embodiment, reference may be made to the corresponding descriptions of Figure 1 and Figure 2, which are not repeated here.
Based on the methods shown in Figure 1 and Figure 2, an embodiment of this application correspondingly provides a storage medium on which computer-readable instructions are stored; when executed by a processor, the computer-readable instructions implement the data cluster storage method shown in Figure 1 and Figure 2.
Based on this understanding, the technical solution of this application can be embodied as a software product. The software product can be stored in a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a portable hard disk) and includes a number of instructions that cause a computer device (such as a personal computer, a server, or a network device) to execute the methods of the implementation scenarios of this application.
Based on the methods shown in Figure 1 and Figure 2 and the virtual apparatus embodiments shown in Figure 6 and Figure 7, and to achieve the above purpose, an embodiment of this application further provides a computer device, which may specifically be a personal computer, a server, a network device, or the like. This physical device includes a storage medium and a processor: the storage medium stores a computer program, and the processor executes the computer program to implement the data cluster storage method shown in Figure 1 and Figure 2.
Optionally, the computer device may also include a user interface, a network interface, a camera, a radio frequency (RF) circuit, sensors, an audio circuit, a Wi-Fi module, and so on. The user interface may include a display and an input unit such as a keyboard, and may optionally also include a USB interface, a card reader interface, and the like. The network interface may optionally include a standard wired interface, a wireless interface (such as a Bluetooth interface or a Wi-Fi interface), and the like.
Those skilled in the art will understand that the computer device structure provided in this embodiment does not limit the physical device, which may include more or fewer components, combine certain components, or arrange the components differently.
The non-volatile readable storage medium may also include an operating system and a network communication module. The operating system consists of computer-readable instructions that manage the hardware and software resources of the physical device for data cluster storage, and it supports the running of the information-processing computer-readable instructions as well as other software and/or computer-readable instructions. The network communication module implements communication among the components inside the non-volatile readable storage medium, as well as communication with other hardware and software in the physical device.
From the description of the above implementations, those skilled in the art can clearly understand that this application can be implemented by software plus a necessary general-purpose hardware platform, or by hardware. Compared with the prior art, applying the technical solution of this application allows a physical cluster with a large storage space to be split into multiple sub-physical clusters of equal size, so that the physical clusters are evenly distributed and different data can be stored evenly in the corresponding locations. This balances the storage pressure across the physical clusters and avoids the storage avalanche caused by concentrated data storage. Identity codes are then configured, according to the naming rule, for the second physical clusters whose storage space is below the preset threshold and for each sub-physical cluster; hash values are computed from the identity codes; and the second physical clusters and sub-physical clusters are thereby mapped evenly onto the consistent hash ring. The consistent hashing algorithm is then used to determine the optimal target storage cluster for the file to be stored, and the file is stored in that cluster. In addition, instructions to add or delete physical clusters can be received, which meets the capacity expansion needs of mass storage. The second physical clusters and/or sub-physical clusters on the physical nodes are then updated according to the add/delete instruction, and the storage locations of files to be stored that meet the preset condition are adjusted promptly. This keeps stored data queryable: when querying a file to be stored, there is no need to consider where the data was originally stored; the physical cluster that currently holds it can be located accurately from the file's hash value alone, which makes locating stored data efficient.
Those skilled in the art will understand that the accompanying drawings are only schematic diagrams of a preferred implementation scenario, and that the modules or processes in the drawings are not necessarily required to implement this application. Those skilled in the art will also understand that the modules of the apparatus in an implementation scenario may be distributed within the apparatus as described for that scenario, or may be changed accordingly and located in one or more apparatuses different from that scenario. The modules of the above implementation scenarios may be combined into one module or further split into multiple sub-modules.
The serial numbers used in this application are for description only and do not indicate the relative merits of the implementation scenarios. What is disclosed above is only a few specific implementation scenarios of this application; the application is not limited to them, and any variation that those skilled in the art can conceive of shall fall within the scope of protection of this application.

Claims (20)

  1. A method for data cluster storage, characterized by comprising:
    acquiring all physical clusters used for storing data;
    evenly mapping the physical clusters onto physical nodes of a consistent hash ring;
    wherein evenly mapping the physical clusters onto the physical nodes of the consistent hash ring specifically comprises: obtaining the storage space of the physical clusters; splitting a first physical cluster, whose storage space is greater than or equal to a preset threshold, into multiple sub-physical clusters of equal size according to a preset ratio; configuring identity codes, according to a naming rule, for a second physical cluster, whose storage space is less than the preset threshold, and for each of the sub-physical clusters; determining hash values of the second physical cluster and of each of the sub-physical clusters according to the identity codes; and using the hash values to calculate the physical node positions of the second physical cluster and the sub-physical clusters on the consistent hash ring;
    determining an optimal target physical cluster for storage according to a hash value of a file to be stored;
    storing the file to be stored in the target physical cluster.
  2. The method according to claim 1, characterized in that determining the optimal target physical cluster for storage according to the hash value of the file to be stored specifically comprises:
    calculating the hash value of the file to be stored according to an identity code of the file to be stored;
    using the hash value to determine a logical node position of the file to be stored on the consistent hash ring;
    determining, as the target physical cluster, the first second physical cluster or first sub-physical cluster found by searching clockwise on the consistent hash ring from the logical node position.
  3. The method according to claim 2, characterized in that, after storing the file to be stored in the target physical cluster, the method further comprises:
    receiving an instruction to add or delete a physical cluster;
    updating the second physical cluster and/or the sub-physical clusters on the physical nodes according to the add/delete instruction;
    adjusting the storage location of the file to be stored that meets a preset condition.
  4. The method according to claim 3, characterized in that, if an instruction to add to the physical clusters is received, adjusting the storage location of the file to be stored that meets the preset condition specifically comprises:
    acquiring a newly added second physical cluster or sub-physical clusters;
    configuring an identity code for the newly added second physical cluster or each sub-physical cluster according to the naming rule;
    determining, based on the identity code, a new physical node position of the newly added second physical cluster or each sub-physical cluster on the consistent hash ring;
    extracting data to be migrated that lies between the new physical node position and a previous cluster physical node position in the ring space, wherein the previous cluster physical node position is the physical node position corresponding to the first second physical cluster or first sub-physical cluster found by searching counterclockwise from the new physical node position;
    migrating the data to be migrated to the sub-physical cluster or second physical cluster corresponding to the new physical node position.
  5. The method according to claim 3, characterized in that, if an instruction to delete from the physical clusters is received, adjusting the storage location of the file to be stored that meets the preset condition specifically comprises:
    determining a second physical cluster or sub-physical cluster to be deleted;
    migrating all files stored in the second physical cluster or sub-physical cluster to be deleted to the first second physical cluster or first sub-physical cluster in the clockwise direction;
    after the migration of the stored files is complete, deleting the second physical cluster or sub-physical cluster to be deleted.
  6. The method according to claim 1, characterized in that, after storing the file to be stored in the target physical cluster, the method specifically further comprises:
    receiving a query request for the file to be stored;
    determining the target physical cluster according to the hash value of the file to be stored;
    retrieving the data file from the target physical cluster.
  7. An apparatus for data cluster storage, characterized by comprising:
    an acquisition module, configured to acquire all physical clusters used for storing data;
    a mapping module, configured to map the physical clusters onto physical nodes of a consistent hash ring;
    the mapping unit being specifically configured to: obtain the storage space of the physical clusters; split a first physical cluster, whose storage space is greater than or equal to a preset threshold, into multiple sub-physical clusters of equal size according to a preset ratio; configure identity codes, according to a naming rule, for a second physical cluster, whose storage space is less than the preset threshold, and for each of the sub-physical clusters; determine hash values of the second physical cluster and of each of the sub-physical clusters according to the identity codes; and use the hash values to calculate the physical node positions of the second physical cluster and the sub-physical clusters on the consistent hash ring;
    a determining module, configured to determine an optimal target physical cluster for storage according to a hash value of a file to be stored;
    a storage module, configured to store the file to be stored in the target physical cluster.
  8. The apparatus according to claim 7, characterized in that the determining module is specifically configured to: calculate the hash value of the file to be stored according to an identity code of the file to be stored; use the hash value to determine a logical node position of the file to be stored on the consistent hash ring; and determine, as the target physical cluster, the first second physical cluster or first sub-physical cluster found by searching clockwise on the consistent hash ring from the logical node position.
  9. The apparatus according to claim 8, characterized in that the apparatus further comprises: a receiving module, an update module, and an adjustment module;
    the receiving module being configured to receive an instruction to add or delete a physical cluster;
    the update module being configured to update the second physical cluster and/or the sub-physical clusters on the physical nodes according to the add/delete instruction;
    the adjustment module being configured to adjust the storage location of the file to be stored that meets a preset condition.
  10. The apparatus according to claim 9, characterized in that, if an instruction to add to the physical clusters is received, the adjustment module is specifically configured to: acquire a newly added second physical cluster or sub-physical clusters; configure an identity code for the newly added second physical cluster or each sub-physical cluster according to the naming rule; determine, based on the identity code, a new physical node position of the newly added second physical cluster or each sub-physical cluster on the consistent hash ring; extract data to be migrated that lies between the new physical node position and a previous cluster physical node position in the ring space, wherein the previous cluster physical node position is the physical node position corresponding to the first second physical cluster or first sub-physical cluster found by searching counterclockwise from the new physical node position; and migrate the data to be migrated to the sub-physical cluster or second physical cluster corresponding to the new physical node position.
  11. The apparatus according to claim 9, characterized in that, if an instruction to delete from the physical clusters is received, the adjustment module is specifically configured to: determine a second physical cluster or sub-physical cluster to be deleted; migrate all files stored in the second physical cluster or sub-physical cluster to be deleted to the first second physical cluster or first sub-physical cluster in the clockwise direction; and, after the migration of the stored files is complete, delete the second physical cluster or sub-physical cluster to be deleted.
  12. The apparatus according to claim 7, characterized by further comprising: a retrieval module;
    the receiving module being further configured to receive a query request for the file to be stored;
    the determining module being specifically configured to determine the target physical cluster according to the hash value of the file to be stored;
    the retrieval module being configured to retrieve the data file from the target physical cluster.
  13. A non-volatile readable storage medium having computer-readable instructions stored thereon, wherein the computer-readable instructions, when executed by a processor, implement a data cluster storage method comprising:
    Obtaining all physical clusters used to store data;
    Evenly mapping the physical clusters onto physical nodes of a consistent hash ring;
    Wherein evenly mapping the physical clusters onto the physical nodes of the consistent hash ring specifically comprises: obtaining the storage space of each physical cluster; dividing each first physical cluster, whose storage space is greater than or equal to a preset threshold, into a plurality of sub-physical clusters of equal space according to a preset ratio; configuring, according to a naming rule, an identity code for each second physical cluster, whose storage space is less than the preset threshold, and for each sub-physical cluster; determining hash values of the second physical clusters and of the sub-physical clusters according to the identity codes; and using the hash values to calculate the physical node positions of the second physical clusters and the sub-physical clusters on the consistent hash ring;
    Determining, according to the hash value of a file to be stored, the target physical cluster for optimal storage;
    Storing the file to be stored in the target physical cluster.
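The mapping step of claim 13 can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the MD5 hash, the 2^32 ring size, the `name#i` naming rule, and the `unit` parameter (the space given to each sub-cluster) are all hypothetical choices that the claim leaves open.

```python
import hashlib
from typing import Dict, List, Tuple

RING_SIZE = 2 ** 32  # assumed ring size; the claim does not fix one


def ring_position(identity_code: str) -> int:
    """Hash an identity code to a ring position (MD5 is an assumption)."""
    digest = hashlib.md5(identity_code.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % RING_SIZE


def build_ring(clusters: Dict[str, int], threshold: int, unit: int) -> List[Tuple[int, str]]:
    """Return (position, identity_code) pairs sorted by position.

    `clusters` maps a cluster name to its storage space. A cluster whose
    space is >= `threshold` (a "first physical cluster") is split into
    equal sub-clusters of `unit` space each; a smaller cluster (a "second
    physical cluster") is placed on the ring as a single node.
    """
    ring: List[Tuple[int, str]] = []
    for name, space in clusters.items():
        if space >= threshold:
            parts = max(1, space // unit)
            # Hypothetical naming rule: "<cluster>#<index>"
            ids = [f"{name}#{i}" for i in range(parts)]
        else:
            ids = [name]
        ring.extend((ring_position(i), i) for i in ids)
    return sorted(ring)
```

With `build_ring({"A": 100, "B": 400}, threshold=200, unit=100)`, cluster A stays whole and cluster B contributes four nodes, so a larger cluster occupies proportionally more positions on the ring.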
  14. The computer-readable storage medium according to claim 13, wherein the computer-readable instructions, when executed by the processor, implement the determining of the target physical cluster for optimal storage according to the hash value of the file to be stored by:
    Calculating the hash value of the file to be stored according to the identity code of the file to be stored;
    Using the hash value to determine the logical node position of the file to be stored on the consistent hash ring;
    Determining, as the target physical cluster, the first second physical cluster or sub-physical cluster found by searching clockwise on the consistent hash ring from the logical node position.
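The clockwise lookup in claim 14 can be sketched with a binary search over sorted ring positions. The function names, the MD5 hash, and the tuple layout of the ring are assumptions made for illustration; the claim only requires that the first node found clockwise from the file's logical position becomes the target.

```python
import bisect
import hashlib
from typing import List, Tuple


def ring_position(identity_code: str) -> int:
    """Hash an identity code to a ring position (MD5 is an assumption)."""
    digest = hashlib.md5(identity_code.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def first_clockwise(ring: List[Tuple[int, str]], position: int) -> str:
    """Return the node at or after `position`, wrapping past the end.

    `ring` must be a list of (position, identity_code) sorted by position.
    """
    positions = [p for p, _ in ring]
    idx = bisect.bisect_left(positions, position)
    # idx == len(ring) means we passed the last node: wrap to the first.
    return ring[idx % len(ring)][1]


def find_target(ring: List[Tuple[int, str]], file_id: str) -> str:
    """Locate the target cluster for a file from its identity code."""
    return first_clockwise(ring, ring_position(file_id))
```

For a hand-built ring `[(10, "A"), (20, "B#0"), (30, "B#1")]`, position 11 resolves to `B#0` and position 31 wraps around to `A`.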
  15. The computer-readable storage medium according to claim 14, wherein the computer-readable instructions, when executed by the processor, further implement, after storing the file to be stored in the target physical cluster:
    Receiving an instruction to add or delete a physical cluster;
    Updating the second physical cluster and/or the sub-physical cluster on the physical nodes according to the add/delete instruction;
    Adjusting the storage location of each file to be stored that meets a preset condition.
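One way to read the adjustment step of claim 15 is sketched below. The claim does not define the preset condition; this sketch assumes it means "the file's target cluster differs between the old ring and the updated ring". The function names and the use of precomputed file positions are hypothetical.

```python
import bisect
from typing import Dict, List, Tuple

Ring = List[Tuple[int, str]]  # (position, identity_code), sorted by position


def owner(ring: Ring, position: int) -> str:
    """First node at or clockwise after `position`, with wrap-around."""
    positions = [p for p, _ in ring]
    return ring[bisect.bisect_left(positions, position) % len(ring)][1]


def files_to_move(old_ring: Ring, new_ring: Ring,
                  file_positions: Dict[str, int]) -> Dict[str, Tuple[str, str]]:
    """Map each affected file to its (old_target, new_target) pair.

    Assumed condition: a file is relocated only if its target changed
    after the ring was updated.
    """
    moves = {}
    for file_id, pos in file_positions.items():
        before, after = owner(old_ring, pos), owner(new_ring, pos)
        if before != after:
            moves[file_id] = (before, after)
    return moves
```

Adding a node at position 20 to a ring with nodes at 10 and 30 relocates only the files whose positions fall in the interval (10, 20]; every other file keeps its cluster, which is the usual motivation for consistent hashing.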
  16. The computer-readable storage medium according to claim 15, wherein the computer-readable instructions, when executed by the processor, further implement, after storing the file to be stored in the target physical cluster:
    Receiving a query request for the file to be stored;
    Determining the target physical cluster according to the hash value of the file to be stored;
    Retrieving the data file from the target physical cluster.
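The query path of claim 16 reuses the storage-time lookup: the same hash yields the same target cluster, so only that cluster needs to be searched. The in-memory `ClusterStore` class below is a hypothetical stand-in for real clusters, and the MD5 hash is an assumption.

```python
import bisect
import hashlib


class ClusterStore:
    """Toy store: each ring node is backed by an in-memory dict."""

    def __init__(self, node_ids):
        self.ring = sorted((self._pos(n), n) for n in node_ids)
        self.data = {n: {} for n in node_ids}

    @staticmethod
    def _pos(identity_code):
        digest = hashlib.md5(identity_code.encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big")

    def _target(self, file_id):
        # First node at or clockwise after the file's position.
        positions = [p for p, _ in self.ring]
        idx = bisect.bisect_left(positions, self._pos(file_id))
        return self.ring[idx % len(self.ring)][1]

    def store(self, file_id, content):
        target = self._target(file_id)
        self.data[target][file_id] = content
        return target

    def retrieve(self, file_id):
        # Recompute the target from the hash; no other cluster is searched.
        return self.data[self._target(file_id)].get(file_id)
```

A file stored with `store("report.pdf", b"x")` is returned by `retrieve("report.pdf")`, while an unknown identifier yields `None`.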
  17. A computer device, comprising a non-volatile readable storage medium, a processor, and computer-readable instructions stored on the non-volatile readable storage medium and executable on the processor, wherein the processor, when executing the computer-readable instructions, implements a data cluster storage method comprising:
    Obtaining all physical clusters used to store data;
    Evenly mapping the physical clusters onto physical nodes of a consistent hash ring;
    Wherein evenly mapping the physical clusters onto the physical nodes of the consistent hash ring specifically comprises: obtaining the storage space of each physical cluster; dividing each first physical cluster, whose storage space is greater than or equal to a preset threshold, into a plurality of sub-physical clusters of equal space according to a preset ratio; configuring, according to a naming rule, an identity code for each second physical cluster, whose storage space is less than the preset threshold, and for each sub-physical cluster; determining hash values of the second physical clusters and of the sub-physical clusters according to the identity codes; and using the hash values to calculate the physical node positions of the second physical clusters and the sub-physical clusters on the consistent hash ring;
    Determining, according to the hash value of a file to be stored, the target physical cluster for optimal storage;
    Storing the file to be stored in the target physical cluster.
  18. The computer device according to claim 17, wherein the computer-readable instructions, when executed by the processor, implement the determining of the target physical cluster for optimal storage according to the hash value of the file to be stored by:
    Calculating the hash value of the file to be stored according to the identity code of the file to be stored;
    Using the hash value to determine the logical node position of the file to be stored on the consistent hash ring;
    Determining, as the target physical cluster, the first second physical cluster or sub-physical cluster found by searching clockwise on the consistent hash ring from the logical node position.
  19. The computer device according to claim 18, wherein the computer-readable instructions, when executed by the processor, further implement, after storing the file to be stored in the target physical cluster:
    Receiving an instruction to add or delete a physical cluster;
    Updating the second physical cluster and/or the sub-physical cluster on the physical nodes according to the add/delete instruction;
    Adjusting the storage location of each file to be stored that meets a preset condition.
  20. The computer device according to claim 19, wherein the computer-readable instructions, when executed by the processor, further implement, after storing the file to be stored in the target physical cluster:
    Receiving a query request for the file to be stored;
    Determining the target physical cluster according to the hash value of the file to be stored;
    Retrieving the data file from the target physical cluster.
PCT/CN2019/118232 2019-07-11 2019-11-13 Data cluster storage method and apparatus, and computer device WO2021003935A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910625543.5 2019-07-11
CN201910625543.5A CN110489059B (en) 2019-07-11 2019-07-11 Data cluster storage method and device and computer equipment

Publications (1)

Publication Number Publication Date
WO2021003935A1 (en) 2021-01-14

Family

ID=68547014

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/118232 WO2021003935A1 (en) 2019-07-11 2019-11-13 Data cluster storage method and apparatus, and computer device

Country Status (2)

Country Link
CN (1) CN110489059B (en)
WO (1) WO2021003935A1 (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113672524A (en) * 2021-08-20 2021-11-19 上海哔哩哔哩科技有限公司 Data processing method and system based on multi-level cache
CN113689103A (en) * 2021-08-18 2021-11-23 国电南瑞南京控制系统有限公司 Adaptive load balancing employing flow distribution intelligent scheduling management method, device and system
CN113708937A (en) * 2021-10-28 2021-11-26 湖南天河国云科技有限公司 Processing method and system for block chain transaction
CN113934377A (en) * 2021-10-28 2022-01-14 山东英信计算机技术有限公司 Metadata cluster deployment method, device, equipment and readable storage medium
CN114666338A (en) * 2022-05-19 2022-06-24 杭州指令集智能科技有限公司 Message-based multi-instance load balancing method and system
CN115001969A (en) * 2022-05-24 2022-09-02 中欣链证数字科技有限公司 Data storage node deployment method, data storage method, device and equipment
CN115002131A (en) * 2022-05-24 2022-09-02 中欣链证数字科技有限公司 User request distribution method, device, equipment and system

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111258508B (en) * 2020-02-16 2020-11-10 西安奥卡云数据科技有限公司 Metadata management method in distributed object storage
CN111756828B (en) * 2020-06-19 2023-07-14 广东浪潮大数据研究有限公司 Data storage method, device and equipment
CN113778341A (en) * 2021-09-17 2021-12-10 北京航天泰坦科技股份有限公司 Distributed storage method and device for remote sensing data and remote sensing data reading method
CN114489483A (en) * 2021-12-24 2022-05-13 深圳市捷顺科技实业股份有限公司 Disk management method based on object storage and object storage module

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130179481A1 (en) * 2012-01-11 2013-07-11 Tonian Inc. Managing objects stored in storage devices having a concurrent retrieval configuration
CN103929500A (en) * 2014-05-06 2014-07-16 刘跃 Method for data fragmentation of distributed storage system
CN106572153A (en) * 2016-10-21 2017-04-19 乐视控股(北京)有限公司 Data storage method and device of cluster
CN106909557A (en) * 2015-12-23 2017-06-30 中国电信股份有限公司 The storage method and device of main memory cluster, the read method and device of main memory cluster
CN107844269A (en) * 2017-10-17 2018-03-27 华中科技大学 A kind of layering mixing storage system and method based on uniformity Hash
CN109639777A (en) * 2018-11-28 2019-04-16 优刻得科技股份有限公司 Data synchronous method, apparatus, system and non-volatile memory medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104754000B (en) * 2013-12-30 2018-08-24 国家电网公司 A kind of load-balancing method and system
US9860316B2 (en) * 2014-09-19 2018-01-02 Facebook, Inc. Routing network traffic based on social information
CN109271391B (en) * 2018-09-29 2021-05-28 武汉极意网络科技有限公司 Data storage method, server, storage medium and device

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113689103A (en) * 2021-08-18 2021-11-23 国电南瑞南京控制系统有限公司 Adaptive load balancing employing flow distribution intelligent scheduling management method, device and system
CN113689103B (en) * 2021-08-18 2023-11-24 国电南瑞南京控制系统有限公司 Mining and shunting intelligent scheduling management method, device and system for self-adaptive load balancing
CN113672524A (en) * 2021-08-20 2021-11-19 上海哔哩哔哩科技有限公司 Data processing method and system based on multi-level cache
CN113708937A (en) * 2021-10-28 2021-11-26 湖南天河国云科技有限公司 Processing method and system for block chain transaction
CN113934377A (en) * 2021-10-28 2022-01-14 山东英信计算机技术有限公司 Metadata cluster deployment method, device, equipment and readable storage medium
CN114666338A (en) * 2022-05-19 2022-06-24 杭州指令集智能科技有限公司 Message-based multi-instance load balancing method and system
CN114666338B (en) * 2022-05-19 2022-08-26 杭州指令集智能科技有限公司 Message-based multi-instance load balancing method and system
CN115001969A (en) * 2022-05-24 2022-09-02 中欣链证数字科技有限公司 Data storage node deployment method, data storage method, device and equipment
CN115002131A (en) * 2022-05-24 2022-09-02 中欣链证数字科技有限公司 User request distribution method, device, equipment and system
CN115002131B (en) * 2022-05-24 2024-03-01 中欣链证数字科技有限公司 User request distribution method, device, equipment and system

Also Published As

Publication number Publication date
CN110489059B (en) 2022-04-12
CN110489059A (en) 2019-11-22

Similar Documents

Publication Publication Date Title
WO2021003935A1 (en) Data cluster storage method and apparatus, and computer device
US10977277B2 (en) Systems and methods for database zone sharding and API integration
US10997211B2 (en) Systems and methods for database zone sharding and API integration
US11144651B2 (en) Secure cloud-based storage of data shared across file system objects and clients
CN107408128B (en) System and method for providing access to a sharded database using caching and shard topology
US10467245B2 (en) System and methods for mapping and searching objects in multidimensional space
US10331641B2 (en) Hash database configuration method and apparatus
RU2475988C2 (en) Method and system to use local cash supported with host node and cryptographic hash functions in order to reduce network traffic
US20200125758A1 (en) File System Permission Setting Method and Apparatus
US20190340171A1 (en) Data Redistribution Method and Apparatus, and Database Cluster
US10860604B1 (en) Scalable tracking for database udpates according to a secondary index
US20200065306A1 (en) Bloom filter partitioning
CN110445822B (en) Object storage method and device
EP4231167A1 (en) Data storage method and apparatus based on blockchain network
US20180225048A1 (en) Data Processing Method and Apparatus
US20170212939A1 (en) Method and mechanism for efficient re-distribution of in-memory columnar units in a clustered rdbms on topology change
EP3093789B1 (en) Storing structured information
CN110737663A (en) data storage method, device, equipment and storage medium
US10664349B2 (en) Method and device for file storage
CN111026711A (en) Block chain based data storage method and device, computer equipment and storage medium
US20150278543A1 (en) System and Method for Optimizing Storage of File System Access Control Lists
US10924452B1 (en) Auditing IP address assignments
WO2019174558A1 (en) Data indexing method and device
JP6233846B2 (en) Variable-length nonce generation
CN115563073A (en) Method and device for data processing of distributed metadata and electronic equipment

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application (Ref document number: 19937110; Country of ref document: EP; Kind code of ref document: A1)
122 EP: PCT application non-entry in European phase (Ref document number: 19937110; Country of ref document: EP; Kind code of ref document: A1)