CN106909556B - Memory cluster storage balancing method and device - Google Patents

Memory cluster storage balancing method and device

Info

Publication number
CN106909556B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510976653.8A
Other languages
Chinese (zh)
Other versions
CN106909556A (en)
Inventor
杨维
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianyi Cloud Technology Co Ltd
Original Assignee
China Telecom Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Telecom Corp Ltd
Priority to CN201510976653.8A
Publication of CN106909556A
Application granted
Publication of CN106909556B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2255 Hash tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures

Abstract

The invention discloses a storage balancing method and device for a memory cluster, and relates to the field of databases. The method comprises: slicing the table partition segment of an original primary key K1 to obtain a fragment number; generating a balanced primary key K1' from the original primary key K1 and the fragment number; and storing key-value data in the form (K1', DO). The original primary key K1 is the original logical key of a custom object DO and consists of a database segment, a data table segment and a table partition segment; the balanced primary key K1' is the balanced logical key of the custom object DO. Data is thereby distributed evenly across the data nodes of the memory cluster, so that the storage capacity of the memory cluster is fully utilized.

Description

Memory cluster storage balancing method and device
Technical Field
The present invention relates to the field of databases, and in particular, to a storage balancing method and apparatus for a memory cluster.
Background
In a cluster, keys are mapped to nodes by consistent hashing performed by the proxy server, and ideally the keys are distributed uniformly across the cluster nodes. However, partitions usually hold different amounts of data, so even when the keys are evenly distributed, the difference in partition data volumes may leave the storage usage of the cluster nodes unbalanced.
As shown in FIG. 1, assume that a cluster has 4 nodes, each with a storage capacity of 10. Data for 8 partitions is stored in two batches: the first batch stores 4 partitions (4 keys) with data volumes of 1, 2, 4 and 8, respectively; the second batch stores 4 more partitions (4 keys), each with a data volume of 2. If the partitions of each batch are spread evenly over the 4 nodes, one per node, the nodes end up holding 1+2=3, 2+2=4, 4+2=6 and 8+2=10, respectively. If more data is then inserted into the cluster, it may well be consistently hashed onto the node that already holds 10, and the store fails. At that point the other 3 nodes still have varying amounts of free space, but it cannot be used.
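The arithmetic of the FIG. 1 scenario can be checked with a few lines of Python. This is only an illustrative sketch: the variable names are hypothetical, and it assumes each batch places exactly one partition on each node, as the example describes.

```python
# Worked arithmetic for the FIG. 1 scenario (names are illustrative only).
CAPACITY = 10                      # storage capacity of each node
first_batch = [1, 2, 4, 8]         # data volume of the 4 partitions stored first
second_batch = [2, 2, 2, 2]        # data volume of the 4 partitions stored second

# One partition per node in each batch, so node i holds first[i] + second[i].
node_usage = [a + b for a, b in zip(first_batch, second_batch)]
free = [CAPACITY - used for used in node_usage]
full_nodes = [i for i, used in enumerate(node_usage) if used >= CAPACITY]
unused_total = sum(free)

print("usage per node:", node_usage)
print("free per node :", free)
print("full nodes    :", full_nodes)
print("total free space that a write hashed to a full node cannot use:", unused_total)
```

Although one node is full, the remaining three together still have free capacity, which is the imbalance the background section describes.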
Thus the traditional consistent hashing algorithm does not take partition sizes into account and treats all data members as equals. Moreover, the virtual nodes added to consistent hashing in recent years aggravate the uneven distribution of partitions. The overall performance bottleneck caused by partition data being excessively concentrated on a few service nodes therefore needs to be solved.
Disclosure of Invention
The technical problem to be solved by the invention is how to distribute data evenly across the data nodes of a memory cluster, so that the storage capacity of the memory cluster is fully utilized.
According to one aspect of the embodiments of the present invention, a storage balancing method for a memory cluster is provided, comprising: slicing the table partition segment of an original primary key K1 to obtain a fragment number; generating a balanced primary key K1' from the original primary key K1 and the fragment number; and storing key-value data in the form (K1', DO); wherein the original primary key K1 is the original logical key of a custom object DO and consists of a database segment, a data table segment and a table partition segment, and the balanced primary key K1' is the balanced logical key of the custom object DO.
According to another aspect of the embodiments of the present invention, a storage balancing apparatus for a memory cluster is provided, comprising: a slicing module, configured to slice the table partition segment of an original primary key K1 to obtain a fragment number; a balanced primary key generation module, configured to generate a balanced primary key K1' from the original primary key K1 and the fragment number; and a key-value data storage module, configured to store key-value data in the form (K1', DO); wherein the original primary key K1 is the original logical key of a custom object DO and consists of a database segment, a data table segment and a table partition segment, and the balanced primary key K1' is the balanced logical key of the custom object DO.
The invention has at least the following advantages:
By further slicing the table partition segment of the original key to obtain a fragment number, reconstructing a balanced key from the original key and the fragment number, and storing data under the balanced key, the data is distributed evenly across the data nodes of the memory cluster, and the storage capacity of the memory cluster is fully utilized.
Other features of the present invention and advantages thereof will become apparent from the following detailed description of exemplary embodiments thereof, which proceeds with reference to the accompanying drawings.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
Fig. 1 illustrates a schematic diagram of a storage resource occupation imbalance of cluster nodes caused by a traditional hash algorithm.
Fig. 2 is a flowchart illustrating a storage balancing method for a memory cluster according to an embodiment of the present invention.
FIG. 3 is a schematic diagram of slicing the table partition segment of the original primary key K1 to obtain the fragment number according to the present invention.
FIG. 4 is a schematic diagram of one embodiment of slicing the table partition segment of the original primary key K1 to obtain the fragment number.
FIG. 5 is a schematic diagram of another embodiment of slicing the table partition segment of the original primary key K1 to obtain the fragment number.
Fig. 6 is a schematic diagram illustrating an embodiment of a storage balancing apparatus for a memory cluster according to the present invention.
FIG. 7 illustrates a schematic diagram of one embodiment of a key-value data storage module of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the invention, its application, or uses. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The following describes a storage balancing method for a memory cluster according to an embodiment of the present invention with reference to fig. 2.
Fig. 2 is a flowchart illustrating a storage balancing method for a memory cluster according to an embodiment of the present invention. As shown in fig. 2, the method of this embodiment includes:
In step S202, the table partition segment of the original primary key K1 is sliced to obtain a fragment number. The original primary key K1 is the original logical key of the custom object DO and consists of a database segment, a data table segment and a table partition segment, which provides a database / data table / partition organization of data on the Redis cluster.
The database (db) is the largest unit of data isolation and is a logical concept. A user may create multiple databases. Generally, one class of data should be placed in one in-memory database. A data table is a collection of data sharing the same theme, service or concept, such as a user table or a detail table, and is also a logical concept. A table partition is a unit that further logically splits a large data table, based on the characteristics of the Redis cluster. Data in different partitions is stored separately, which makes the data distribution more uniform.
The logical concepts of database, data table and partition may be described by metadata. The key under which metadata is stored is STORE, and its value type is Hash; all metadata is stored in the hash table keyed by STORE. The metadata describing every database is stored in this metadata hash table in Key-Value form, and the key name of a database may be DB_[database name]. The metadata describing every data table is likewise stored in the metadata hash table in Key-Value form, and the key name of a data table may be TB_[database name]_[data table name], so that the key name itself expresses the relationship between the data table and its database. A data partition is a further logical division of a data table and is recorded as metadata within the metadata of its data table. The key name of a data partition may be [database name]_[data table name]_[partition name].
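The key-naming scheme described above can be sketched in Python as follows. The function names and the layout of the metadata values are hypothetical; only the `STORE` key and the `DB_`, `TB_` and partition name patterns come from the description, and a plain dictionary stands in for the Redis hash.

```python
def database_meta_key(database: str) -> str:
    """Metadata key of a database: DB_[database name]."""
    return f"DB_{database}"


def table_meta_key(database: str, table: str) -> str:
    """Metadata key of a data table: TB_[database name]_[data table name]."""
    return f"TB_{database}_{table}"


def partition_key(database: str, table: str, partition: str) -> str:
    """Key name of a partition, i.e. the original primary key K1."""
    return f"{database}_{table}_{partition}"


# A dict stands in for the hash stored under the key STORE (assumption:
# the field contents below are made up purely for illustration).
metadata = {"STORE": {}}
metadata["STORE"][database_meta_key("db1")] = {"description": "example database"}
metadata["STORE"][table_meta_key("db1", "table1")] = {"partitions": ["p1"]}

k1 = partition_key("db1", "table1", "p1")
print(k1)
```

Here `k1` plays the role of the original primary key K1 whose table partition segment is sliced in the later steps.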
The custom object DO stores (K2, V), where the secondary key K2 is the data key of the custom object DO and V is the data value of the custom object DO.
In step S204, a balanced primary key K1' is generated from the original primary key K1 and the fragment number. The balanced primary key K1' is the balanced logical key of the custom object DO.
In step S206, the key-value data is stored in the form (K1', DO).
With this method, the table partition segment of the original primary key K1 is further sliced to obtain a fragment number, the balanced key is reconstructed from the original key and the fragment number, and data is stored under the balanced key, so that the data is distributed evenly across the data nodes of the memory cluster and the storage capacity of the memory cluster is fully utilized.
FIG. 3 is a schematic diagram of slicing the table partition segment of the original primary key K1 to obtain the fragment number according to the present invention. The planned number of fragments is calculated from the number of nodes on which the user stores data. Before any data is stored, it cannot be determined how many fragments a partition will actually be divided into. When the user stores a Key-Value pair, a hash value is computed from the Key and then taken modulo the planned number of fragments. The result is combined with the complete key name of the original partition to form a new partition key name.
A method for slicing the table partition segment of the original primary key K1 to obtain the fragment number according to one embodiment of the present invention is described below with reference to FIG. 4.
FIG. 4 is a schematic diagram of one embodiment of slicing the table partition segment of the original primary key K1 to obtain the fragment number. As shown in FIG. 4, the custom object DO stores (K2, V), where the secondary key K2 is the data key of the custom object DO and V is the data value of the custom object DO. One specific method for slicing the table partition segment of the original primary key K1 to obtain the fragment number comprises:
In step S402, a hash operation is performed on the original primary key K1 and the secondary key K2.
In step S404, the hash value obtained by the hash operation is taken modulo the planned number of fragments M.
In step S406, the result of the modulo operation is used as the fragment number.
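Steps S402 to S406 can be sketched in Python as below. Several details are assumptions rather than part of the description: the text does not say how K1 and K2 are combined before hashing (a `:` separator is used here), nor which hash function is used (`zlib.crc32` is chosen because Python's built-in `hash()` for strings varies between processes). The function names are hypothetical.

```python
import zlib


def hash_fragment_number(k1: str, k2: str, planned_fragments: int) -> int:
    """Return a fragment number in the range [0, planned_fragments)."""
    if planned_fragments <= 0:
        raise ValueError("planned_fragments must be positive")
    # S402: hash the original primary key K1 together with the secondary key K2.
    hash_value = zlib.crc32(f"{k1}:{k2}".encode("utf-8"))
    # S404/S406: the remainder modulo M is the fragment number.
    return hash_value % planned_fragments


def balanced_key(k1: str, fragment: int) -> str:
    """Build K1' by appending the fragment number to K1."""
    return f"{k1}_{fragment}"


fragment = hash_fragment_number("db1_table1_p1", "user42", 40)
print(balanced_key("db1_table1_p1", fragment))
```

Because the fragment number depends only on K1, K2 and M, a reader can recompute the same K1' at lookup time without any extra index.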
For example, suppose the cluster has 4 nodes and the planned number of fragments is set to 40. When the user stores a new Key-Value pair into table1 of db1, the fragment corresponding to this value is computed as: String slice = String.valueOf(Math.abs(key.hashCode()) % 40). Assume the result is 23. If the complete partition key name is db1_table1_p1, the new partition key name is db1_table1_p1_23.
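The expression in this example reads like Java, taking the remainder of the key's `hashCode()` modulo 40. The following Python sketch emulates Java's `String.hashCode()` so that the example can be reproduced outside a JVM. The key `"hello"` is a hypothetical input, and the emulation is valid only for characters in the Basic Multilingual Plane.

```python
def java_string_hash(text: str) -> int:
    """Emulate Java's String.hashCode(): h = 31*h + c with 32-bit signed wraparound."""
    h = 0
    for ch in text:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    # Reinterpret the unsigned 32-bit value as a signed int, as Java does.
    return h - 0x100000000 if h >= 0x80000000 else h


def fragment_of(key: str, planned_fragments: int = 40) -> int:
    # Mirrors Math.abs(key.hashCode()) % 40 from the example.
    return abs(java_string_hash(key)) % planned_fragments


key = "hello"                                   # hypothetical user key
new_partition_key = f"db1_table1_p1_{fragment_of(key)}"
print(java_string_hash(key), fragment_of(key), new_partition_key)
# prints: 99162322 2 db1_table1_p1_2
```

One caveat applies to the Java form: `Math.abs(Integer.MIN_VALUE)` is itself negative, so a production implementation would need to guard against that single hash value to avoid a negative fragment number.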
On top of partitioning, the data is thus further sliced into fragments. When the number of fragments is larger than the number of nodes, the fragments are distributed evenly across the nodes according to the characteristics of Twemproxy. In principle, the more fragments there are, the finer the storage granularity and the more uniform the data distribution, so that data is distributed evenly across the data nodes of the memory cluster and the storage capacity of the memory cluster is fully utilized. However, the number of fragments should not be too large; it is generally kept at N times the number of nodes (N ≤ 10).
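The effect of fragmenting skewed partitions can be explored with a small simulation. This is a sketch under stated assumptions: the partition sizes are made up, `zlib.crc32` modulo the node count stands in for the proxy's real key-to-node mapping, and K1 and K2 are joined with a `:` before hashing. None of these details come from the description.

```python
import zlib
from collections import Counter

NODES = 4
N = 10                      # multiplier; the text suggests N <= 10
M = N * NODES               # planned number of fragments


def stable_hash(text: str) -> int:
    return zlib.crc32(text.encode("utf-8"))


def node_of(key: str) -> int:
    # Stand-in for the proxy's placement of a key on a node (assumption).
    return stable_hash(key) % NODES


# Hypothetical partitions with strongly skewed record counts.
partitions = {
    "db1_table1_p1": 1000,
    "db1_table1_p2": 2000,
    "db1_table1_p3": 4000,
    "db1_table1_p4": 8000,
}

unsharded = Counter()       # records per node when K1 itself is the storage key
sharded = Counter()         # records per node when K1' = K1_fragment is the storage key
for k1, records in partitions.items():
    for i in range(records):
        k2 = f"row{i}"
        unsharded[node_of(k1)] += 1
        fragment = stable_hash(f"{k1}:{k2}") % M
        sharded[node_of(f"{k1}_{fragment}")] += 1

total = sum(partitions.values())
print("per-node load, partition keys only:", [unsharded[n] for n in range(NODES)])
print("per-node load, with fragment keys :", [sharded[n] for n in range(NODES)])
```

Without fragments, the largest partition lands entirely on one node; with fragments, each partition is split into up to M smaller keys that can be placed independently, which is the granularity argument made above.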
A method for slicing the table partition segment of the original primary key K1 to obtain the fragment number according to another embodiment of the present invention is described below with reference to FIG. 5.
FIG. 5 is a schematic diagram of another embodiment of slicing the table partition segment of the original primary key K1 to obtain the fragment number. As shown in FIG. 5, another specific method for slicing the table partition segment of the original primary key K1 to obtain the fragment number comprises:
In step S502, an MD5 operation is performed on the original primary key K1 and the secondary key K2.
In step S504, the last n bits of the digest obtained by the MD5 operation are taken as the fragment number, where the maximum value that the n bits can represent is not greater than the planned number of fragments M.
Because the fragment number is obtained by applying MD5 to the original primary key K1 and the secondary key K2, the storage granularity becomes finer and the data distribution more uniform, so that data is distributed evenly across the data nodes of the memory cluster and the storage capacity of the memory cluster is fully utilized. Compared with the embodiment shown in FIG. 4, the MD5 algorithm computes a fragment number that is uniquely determined by the primary key K1 and the secondary key K2, but MD5 consumes more CPU and affects performance under high concurrency.
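The MD5 variant of steps S502 and S504 might look like the following Python sketch. It assumes that "the last n bits" means the least significant bits of the digest read as a big-endian integer, that K1 and K2 are joined with a `:` before hashing, and that n is chosen as the largest value with 2^n - 1 ≤ M. The function names are hypothetical.

```python
import hashlib


def bits_for(planned_fragments: int) -> int:
    """Largest n such that the maximum n-bit value, 2**n - 1, does not exceed M."""
    if planned_fragments < 1:
        raise ValueError("planned_fragments must be at least 1")
    return (planned_fragments + 1).bit_length() - 1


def md5_fragment_number(k1: str, k2: str, planned_fragments: int) -> int:
    n = bits_for(planned_fragments)
    # S502: MD5 over the original primary key K1 and the secondary key K2.
    digest = hashlib.md5(f"{k1}:{k2}".encode("utf-8")).digest()
    # S504: keep only the last n bits of the 128-bit digest.
    return int.from_bytes(digest, "big") & ((1 << n) - 1)


print(bits_for(40), md5_fragment_number("db1_table1_p1", "user42", 40))
```

Under this reading, M = 40 gives n = 5, so only fragment numbers 0 to 31 are ever produced; choosing M so that M + 1 is a power of two would make use of the full planned range.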
Furthermore, the custom object DO may be stored in the form of a MAP.
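Storing (K1', DO) with the custom object DO held as a map from K2 to V can be sketched as follows. A nested Python dictionary stands in for the memory cluster, and the fragment-number function is passed in as a parameter so that either embodiment could be plugged in. All names are hypothetical, and the `crc32`-based function in the usage lines is only a placeholder.

```python
import zlib
from typing import Callable, Dict, Optional

Cluster = Dict[str, Dict[str, str]]       # K1' -> DO, where DO maps K2 -> V
FragmentFn = Callable[[str, str], int]    # (K1, K2) -> fragment number


def store(cluster: Cluster, k1: str, k2: str, value: str, fragment_of: FragmentFn) -> str:
    """Store (K2, V) inside the DO kept under the balanced key K1'."""
    k1_balanced = f"{k1}_{fragment_of(k1, k2)}"
    cluster.setdefault(k1_balanced, {})[k2] = value
    return k1_balanced


def load(cluster: Cluster, k1: str, k2: str, fragment_of: FragmentFn) -> Optional[str]:
    """Recompute K1' from K1 and K2, then look V up inside the DO."""
    return cluster.get(f"{k1}_{fragment_of(k1, k2)}", {}).get(k2)


# Usage with a placeholder fragment function (assumption: crc32 modulo 40).
def crc_fragment(k1: str, k2: str) -> int:
    return zlib.crc32(f"{k1}:{k2}".encode("utf-8")) % 40


cluster: Cluster = {}
stored_under = store(cluster, "db1_table1_p1", "user42", "Alice", crc_fragment)
print(stored_under, load(cluster, "db1_table1_p1", "user42", crc_fragment))
```

On a Redis cluster the map would plausibly correspond to a Redis hash stored under K1', although the description does not state this.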
The following describes a storage balancing apparatus for a memory cluster according to an embodiment of the present invention with reference to fig. 6.
Fig. 6 is a schematic diagram illustrating an embodiment of a storage balancing apparatus for a memory cluster according to the present invention. As shown in fig. 6, the storage balancing apparatus 60 of the memory cluster of this embodiment includes:
A slicing module 602, configured to slice the table partition segment of the original primary key K1 to obtain a fragment number.
A balanced primary key generation module 604, configured to generate a balanced primary key K1' from the original primary key K1 and the fragment number.
A key-value data storage module 606, configured to store key-value data in the form (K1', DO).
The original primary key K1 is the original logical key of the custom object DO and consists of a database segment, a data table segment and a table partition segment, and the balanced primary key K1' is the balanced logical key of the custom object DO.
In one embodiment, (K2, V) is stored in the custom object DO, where the secondary key K2 is the data key of the custom object DO and V is the data value of the custom object DO;
the slicing module 602 is configured to: perform a hash operation on the original primary key K1 and the secondary key K2; take the hash value obtained by the hash operation modulo the planned number of fragments M; and use the result of the modulo operation as the fragment number.
In another embodiment, the slicing module 602 may be configured to: perform an MD5 operation on the original primary key K1 and the secondary key K2; and take the last n bits of the digest obtained by the MD5 operation as the fragment number, where the maximum value that the n bits can represent is not greater than the planned number of fragments M.
The storage balancing apparatus 60 of the memory cluster may further include:
A planned fragment number determination module 608, configured to determine the planned number of fragments M according to the number of nodes on which the user stores data, where M is N times the number of nodes and N is a natural number not greater than 10.
A key-value data storage module of one embodiment of the present invention is described below in conjunction with fig. 7.
FIG. 7 illustrates a schematic diagram of one embodiment of a key-value data storage module of the present invention. As shown in fig. 7, the key value data storage module 606 of this embodiment includes a custom object storage unit 6062 for storing a custom object DO in the form of a MAP.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (6)

1. A storage balancing method for a memory cluster, comprising:
slicing the table partition segment of an original primary key K1 to obtain a fragment number;
generating a balanced primary key K1' according to the original primary key K1 and the fragment number;
storing key-value data in the data storage form (K1', DO);
wherein the original primary key K1 is the original logical key of a custom object DO, the original primary key K1 consists of a database segment, a data table segment and a table partition segment, and the balanced primary key K1' is the balanced logical key of the custom object DO; (K2, V) is stored in the custom object DO, wherein the secondary key K2 is the data key of the custom object DO and V is the data value of the custom object DO;
wherein slicing the table partition segment of the original primary key K1 to obtain the fragment number comprises:
performing a hash operation on the original primary key K1 and the secondary key K2; taking the hash value obtained by the hash operation modulo the planned number of fragments M; and using the result of the modulo operation as the fragment number;
or,
performing an MD5 operation on the original primary key K1 and the secondary key K2; and taking the last n bits of the digest obtained by the MD5 operation as the fragment number, wherein the maximum value represented by the n bits is not greater than the planned number of fragments M.
2. The method of claim 1, further comprising:
determining the planned number of fragments M according to the number of nodes on which the user stores data, wherein the planned number of fragments M is N times the number of nodes, and N is a natural number not greater than 10.
3. The method according to claim 1, wherein the custom object DO is stored in the form of a MAP.
4. A storage balancing apparatus for a memory cluster, comprising:
a slicing module, configured to slice the table partition segment of an original primary key K1 to obtain a fragment number;
a balanced primary key generation module, configured to generate a balanced primary key K1' according to the original primary key K1 and the fragment number;
a key-value data storage module, configured to store key-value data in the data storage form (K1', DO);
wherein the original primary key K1 is the original logical key of a custom object DO, the original primary key K1 consists of a database segment, a data table segment and a table partition segment, and the balanced primary key K1' is the balanced logical key of the custom object DO; (K2, V) is stored in the custom object DO, wherein the secondary key K2 is the data key of the custom object DO and V is the data value of the custom object DO; the slicing module is configured to: perform a hash operation on the original primary key K1 and the secondary key K2; take the hash value obtained by the hash operation modulo the planned number of fragments M; and use the result of the modulo operation as the fragment number; or, the slicing module is configured to: perform an MD5 operation on the original primary key K1 and the secondary key K2; and take the last n bits of the digest obtained by the MD5 operation as the fragment number, wherein the maximum value represented by the n bits is not greater than the planned number of fragments M.
5. The apparatus of claim 4, further comprising:
a planned fragment number determination module, configured to determine the planned number of fragments M according to the number of nodes on which the user stores data, wherein the planned number of fragments M is N times the number of nodes, and N is a natural number not greater than 10.
6. The apparatus of claim 4, wherein the key-value data storage module further comprises a custom object storage unit for storing the custom object DO in the form of a MAP.
CN201510976653.8A 2015-12-23 2015-12-23 Memory cluster storage balancing method and device Active CN106909556B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510976653.8A CN106909556B (en) 2015-12-23 2015-12-23 Memory cluster storage balancing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510976653.8A CN106909556B (en) 2015-12-23 2015-12-23 Memory cluster storage balancing method and device

Publications (2)

Publication Number Publication Date
CN106909556A CN106909556A (en) 2017-06-30
CN106909556B true CN106909556B (en) 2020-03-20

Family

ID=59200301

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510976653.8A Active CN106909556B (en) 2015-12-23 2015-12-23 Memory cluster storage balancing method and device

Country Status (1)

Country Link
CN (1) CN106909556B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109101635B (en) * 2018-08-16 2020-09-11 广州小鹏汽车科技有限公司 Data processing method and device based on Redis Hash structure
CN109344161A (en) * 2018-12-04 2019-02-15 大唐网络有限公司 A kind of mass data storage means based on mongodb
CN110287197B (en) * 2019-06-28 2022-02-08 微梦创科网络科技(中国)有限公司 Data storage method, migration method and device
CN110427434B (en) * 2019-06-28 2022-06-07 苏宁云计算有限公司 Multidimensional data query method and device
CN113791740B (en) * 2021-11-10 2022-02-18 深圳市杉岩数据技术有限公司 Method for recording object storage bucket statistics and counting

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101876983A (en) * 2009-04-30 2010-11-03 国际商业机器公司 Method for partitioning database and system thereof
CN103699676A (en) * 2013-12-30 2014-04-02 厦门市美亚柏科信息股份有限公司 MSSQL SERVER based table partition and automatic maintenance method and system
CN103838770A (en) * 2012-11-26 2014-06-04 中国移动通信集团北京有限公司 Logic data partition method and system

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9519668B2 (en) * 2013-05-06 2016-12-13 International Business Machines Corporation Lock-free creation of hash tables in parallel
US20140351239A1 (en) * 2013-05-23 2014-11-27 Microsoft Corporation Hardware acceleration for query operators

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Oracle数据库系统优化调整;刘超 等;《信息安全与技术》;20140710;第103-104页 *

Also Published As

Publication number Publication date
CN106909556A (en) 2017-06-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220128

Address after: 100007 room 205-32, floor 2, building 2, No. 1 and No. 3, qinglonghutong a, Dongcheng District, Beijing

Patentee after: Tianyiyun Technology Co.,Ltd.

Address before: No.31, Financial Street, Xicheng District, Beijing, 100033

Patentee before: CHINA TELECOM Corp.,Ltd.
