CN112765262B - Data redistribution method, electronic equipment and storage medium - Google Patents


Info

Publication number
CN112765262B
Authority
CN
China
Prior art keywords
data
target
migrated
number identifier
barrel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911070817.5A
Other languages
Chinese (zh)
Other versions
CN112765262A (en)
Inventor
郭龙波
岳新新
丁岩
刘志文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jinzhuan Xinke Co Ltd
Original Assignee
Jinzhuan Xinke Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jinzhuan Xinke Co Ltd filed Critical Jinzhuan Xinke Co Ltd
Priority to CN201911070817.5A priority Critical patent/CN112765262B/en
Priority to PCT/CN2020/116284 priority patent/WO2021088531A1/en
Publication of CN112765262A publication Critical patent/CN112765262A/en
Application granted granted Critical
Publication of CN112765262B publication Critical patent/CN112765262B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F16/278 Data partitioning, e.g. horizontal or vertical partitioning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the invention relate to the field of communication and disclose a data redistribution method, an electronic device, and a storage medium. The method comprises the following steps: adding a bucket number identifier to a data record to be stored, and storing the data record in the fragment database corresponding to its bucket number identifier according to a preset correspondence between bucket number identifiers and fragment databases; in response to a data redistribution request, acquiring the target bucket number identifier of the data to be migrated; and migrating the data to be migrated to a target fragment database according to the target bucket number identifier. Because the data is partitioned logically, the scalability of redistributing very large volumes of data is improved, the hot-spot problem caused by the limits of physical partitioning when the data volume is too large is avoided, and the reliability of the system is improved. Because the data records that share one bucket number identifier serve as the basic unit of migration, the performance and quality of data migration are ensured, the impact on services is reduced, and the usability of data redistribution is improved.

Description

Data redistribution method, electronic equipment and storage medium
Technical Field
The embodiment of the invention relates to the field of distributed data storage, in particular to a data redistribution method, electronic equipment and a storage medium.
Background
With the development of communication technology, data storage is no longer limited to centralized storage. Large data sets are split and distributed: in a distributed database, data is stored on different data nodes according to a distribution key, forming one logically large database, which improves the overall utilization of storage resources and increases storage capacity. Data redistribution is an important basic function of a distributed database. In the prior art, redistribution methods include offline redistribution and online redistribution based on data pre-fragmentation. Offline redistribution shuts down service operation and then redistributes the data. Online redistribution based on pre-fragmentation creates a number of tables on each node in advance, splits the data by physical partitioning, fills the data into the corresponding tables according to a mapping relation, and stores the tables in the corresponding fragments according to the mapping relation; the fragment is the basic unit of data change during redistribution.
The inventors found that at least the following problems exist in the related art: offline redistribution cannot meet the requirement that services keep running, and the existing online redistribution method has low indexing efficiency and can reduce system reliability; when a very large volume of data is redistributed, scalability is poor, the hot-spot problem still occurs, and the migration may fail because of limited disk capacity.
Disclosure of Invention
An object of embodiments of the present invention is to provide a data redistribution method, an electronic device, and a storage medium, which enable efficient online redistribution of data, reduce the influence of data migration on services, improve the scalability of data redistribution, avoid the problem of hot spots, and ensure the efficiency and quality of data migration.
In order to solve the above technical problem, an embodiment of the present invention provides a data redistribution method, including: adding a bucket number identifier to a data record to be stored, and storing the data record in the fragment database corresponding to its bucket number identifier according to a preset correspondence between bucket number identifiers and fragment databases; in response to a data redistribution request, acquiring the target bucket number identifier of the data to be migrated; and migrating the data to be migrated to the target fragment database according to the target bucket number identifier.
An embodiment of the present invention also provides an electronic device, including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the data redistribution method described above.
Embodiments of the present invention also provide a computer-readable storage medium storing a computer program, which when executed by a processor implements the data redistribution method described above.
Compared with the prior art, the embodiments of the invention add a bucket number identifier to each received data record and store the record according to that identifier and the correspondence between bucket number identifiers and fragment databases, so that the data is partitioned logically. No separate table of stored data records has to be maintained for each fragment database, which removes the limit on the number of tables that applies when tables are built per physical partition. Service data is divided into smaller units for storage, which improves the scalability of data redistribution, avoids the hot-spot problem caused by an excessive data volume, and improves the reliability of the system. During redistribution, the target bucket number identifier of the data to be migrated is determined and the data corresponding to it is migrated to the target database, so that the data records sharing one bucket number identifier form the basic unit of migration. This ensures the efficiency and quality of data migration, reduces the impact on services, and improves the usability of data redistribution.
In addition, migrating the data to be migrated to the target fragment database according to the target bucket number identifier includes: dividing the target bucket number identifiers of the data to be migrated into N groups, where N is an integer greater than 1 and the data corresponding to the target bucket number identifiers of one group forms one batch; and migrating the data to the target database batch by batch according to the target bucket number identifiers of each group. Migrating in batches by bucket number identifier avoids moving too much data at once, which would occupy too many storage resources and degrade migration performance, and it also reduces the impact on services during migration.
In addition, migrating in batches to the target database according to the target bucket number identifiers of each group includes: locking the target bucket number identifiers of the current batch; migrating the data of the current batch to the target database; and unlocking the target bucket number identifiers of the current batch after the migration is completed. Locking the operation permission on the data of a group while it is being migrated prevents problems such as data loss caused by the data being accessed and modified during migration, and so guarantees migration quality.
In addition, migrating the data of the current batch to the target database includes: sending a data migration instruction to a database agent, where the instruction carries the target bucket number identifiers of the current batch and the target fragment database, so that the database agent obtains the data of the current batch from the source fragment database according to the target bucket number identifiers and migrates it to the target fragment database. The migration command is thus issued according to the bucket number identifiers of the data to be migrated and the target fragment database, and the database agent reads and migrates the data.
In addition, before the bucket number identifier is added to the data record to be stored, the method further includes calculating the number M of bucket number identifiers according to the following formula:
$$M = \frac{S}{L}$$
where S is the estimated number of data records to be stored and L is the preset maximum number of uses of each bucket number identifier. Presetting a sufficient number of bucket number identifiers avoids hot spots during data distribution and storage and distributes the data as evenly as possible.
In addition, adding the bucket number identifier to the data record to be stored includes: determining the bucket number identifier to be added according to a random hash algorithm; and adding the determined identifier to an extension field of the data record to be stored. Assigning bucket number identifiers through a random hash algorithm ensures that data is distributed evenly when records are stored in the fragment databases according to their bucket number identifiers.
In addition, after the bucket number identifier is added to the data record to be stored, the method further includes: detecting whether the number of uses of the added bucket number identifier has reached the preset maximum number of uses; and, if so, marking the added bucket number identifier as prohibited. The method further includes: after the bucket number identifier to be added is determined according to the random hash algorithm and before it is added to the extension field of the data record to be stored, detecting whether the determined identifier is marked as prohibited; if it is, determining the bucket number identifier to be added again according to the random hash algorithm; if it is not, adding the determined identifier to the extension field of the data record to be stored. A bucket number identifier is thus prohibited once it has been used often enough, and a new one is assigned whenever the computed result corresponds to a prohibited identifier, which prevents migration performance from degrading because too much data accumulates under one bucket number identifier.
In addition, after the data to be migrated is migrated to the target fragment database according to the target bucket number identifier, the method further includes updating the correspondence between bucket number identifiers and fragment databases. Updating the correspondence avoids data access errors and ensures the quality of data migration.
Drawings
One or more embodiments are illustrated by the corresponding figures in the drawings, which are not meant to be limiting.
Fig. 1 is a flowchart of a data redistribution method according to a first embodiment of the present invention;
Fig. 2 is a flowchart of a data redistribution method according to a second embodiment of the present invention;
Fig. 3 is a system architecture diagram of a data redistribution method according to a second embodiment of the present invention;
Fig. 4 is a schematic structural diagram of an electronic device according to a third embodiment of the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present invention more apparent, embodiments of the present invention will be described in detail below with reference to the accompanying drawings. However, it will be appreciated by those of ordinary skill in the art that numerous technical details are set forth in order to provide a better understanding of the present application in various embodiments of the present invention. However, the technical solution claimed in the present application can be implemented without these technical details and various changes and modifications based on the following embodiments. The following embodiments are divided for convenience of description, and should not constitute any limitation to the specific implementation manner of the present invention, and the embodiments may be mutually incorporated and referred to without contradiction.
The first embodiment of the invention relates to a data redistribution method. In this embodiment, a bucket number identifier is added to a data record to be stored, and the data record is stored in the fragment database corresponding to its bucket number identifier according to a preset correspondence between bucket number identifiers and fragment databases; in response to a data redistribution request, the target bucket number identifier of the data to be migrated is acquired; and the data to be migrated is migrated to the target fragment database according to the target bucket number identifier. Because the data is partitioned logically, the hot spots that easily arise from the poor scalability of physical partitioning are avoided.
The following describes implementation details of a data redistribution method according to the present embodiment, and the following is only provided for easy understanding and is not essential to the present solution.
A specific flow of a data redistribution method in this embodiment is shown in fig. 1, and specifically includes the following steps:
Step 101, adding a bucket number identifier to a data record to be stored and storing the data record in the fragment database corresponding to the bucket number identifier.
Specifically, after receiving the data to be stored, the metadata server adds a bucket number identifier to the data record to be stored and stores the data record in the fragment database corresponding to its bucket number identifier according to a preset correspondence between bucket number identifiers and fragment databases.
In one example, before storing data, the metadata server estimates the number of data records to be stored and, taking into account the capacity required for migrated data during data migration, presets a maximum number of uses for each bucket number identifier. From the estimated number of data records and the preset maximum number of uses, it then calculates the number M of bucket number identifiers to be generated according to the following formula:
$$M = \frac{S}{L}$$
where S is the estimated number of data records to be stored and L is the preset maximum number of uses of each bucket number identifier.
For example, the metadata server estimates that 1 billion data records are to be stored, and, according to the fragmentation characteristics and empirical values of the fragment database, one bucket number identifier may correspond to at most 1 million data records; that is, the maximum number of uses of one bucket number identifier is 1 million. According to the formula, the number M of bucket number identifiers to be generated equals the number S of data records to be stored divided by the preset maximum number of uses L, so M = 1 billion / 1 million = 1000. The metadata server generates 1000 bucket number identifiers according to this result, constructs a BGMT mapping table that records the correspondence between bucket number identifiers and fragment databases, and fills each bucket number identifier into the mapping table under its corresponding fragment database.
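The arithmetic in this example can be checked with a short sketch. The function name `bucket_count` is hypothetical, and rounding up when S is not an exact multiple of L is an assumption; the text itself only gives M as S divided by L.

```python
def bucket_count(estimated_records: int, max_uses_per_bucket: int) -> int:
    """Number M of bucket number identifiers for S records and L uses each."""
    if estimated_records <= 0 or max_uses_per_bucket <= 0:
        raise ValueError("S and L must both be positive")
    # Ceiling division (an assumption, see above): when S is not a multiple
    # of L, one extra bucket absorbs the remainder.
    return -(-estimated_records // max_uses_per_bucket)


# Values from the worked example: S = 1 billion, L = 1 million.
M = bucket_count(1_000_000_000, 1_000_000)
print(M)  # 1000
```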
In practical application, the preset maximum number of uses of a bucket number identifier can be adjusted according to actual conditions and needs; this embodiment does not limit how it is set.
When receiving data to be stored, the metadata server determines the bucket number identifier to be added to the data record by means of a random hash algorithm, adds the determined identifier to an extension field of the data record, and stores the data record in the fragment database corresponding to its bucket number identifier according to the preset correspondence between bucket number identifiers and fragment databases.
For example, the metadata server uses a hash algorithm and has created 1000 bucket number identifiers. Because the results of the hash algorithm correspond one to one with the bucket number identifiers, 1000 hash values are possible. When a new piece of data to be stored is received, it is input into the hash algorithm; if the calculated hash value is 23, the bucket number identifier 23 is added to the extension field of the data record. The fragment database corresponding to bucket number identifier 23 is then looked up in the table of correspondences between bucket number identifiers and fragment databases; if the lookup returns fragment database 2, the data is stored in fragment database 2.
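A minimal in-memory sketch of this store path follows. The text does not name the hash algorithm, so SHA-256 reduced modulo the bucket count is an assumption, as are the names `bucket_for`, `store`, and `bgmt`, the `key` and `bucket_id` fields, the shard count of 4, and the round-robin initial mapping.

```python
import hashlib

NUM_BUCKETS = 1000  # M from the worked example
NUM_SHARDS = 4      # hypothetical number of fragment databases

# Hypothetical BGMT table: bucket number identifier -> fragment database.
# The round-robin initial placement is an assumption for illustration.
bgmt = {bucket: bucket % NUM_SHARDS + 1 for bucket in range(NUM_BUCKETS)}


def bucket_for(distribution_key: str) -> int:
    """Map a record's key to one of NUM_BUCKETS bucket number identifiers."""
    digest = hashlib.sha256(distribution_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_BUCKETS


def store(record: dict, shards: dict) -> int:
    """Tag the record with its bucket and append it to the mapped shard."""
    bucket = bucket_for(record["key"])
    tagged = dict(record, bucket_id=bucket)  # extension field on the record
    shard = bgmt[bucket]                     # lookup in the mapping table
    shards.setdefault(shard, []).append(tagged)
    return shard


shards = {}
shard = store({"key": "order-42", "amount": 10}, shards)
print(shard, shards[shard][0]["bucket_id"])
```

Because the bucket is stored on the record itself, later steps can select all records of a bucket without recomputing the hash.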
In practical application, the random hash algorithm used to determine the bucket number identifier added to a data record, and the one used to determine the correspondence between bucket number identifiers and fragment databases in the BGMT mapping table, may be adjusted according to actual needs.
Step 102, when a data redistribution request is received, acquiring the target bucket number identifier of the data to be migrated.
Specifically, after receiving the data redistribution request, the metadata server, in response to the data redistribution request, obtains a target bucket number identifier of the data to be migrated, that is, determines which data corresponding to the bucket number identifiers are to be migrated.
In one example, after receiving a data redistribution request issued by an upper layer, the metadata server generates a complete data redistribution task plan according to the information on the metadata to be redistributed and determines the bucket number identifiers of the data to be migrated. For example, if the received data redistribution request specifies the 5 bucket number identifiers 5 to 9, these 5 bucket number identifiers are used as the target bucket number identifiers.
Step 103, migrating the data to be migrated to the target fragment database.
Specifically, after determining the bucket number identifiers whose data is to be migrated, the metadata server determines the target fragment database of each target bucket number identifier and then migrates the data to be migrated to that target fragment database according to the target bucket number identifier.
In one example, once the metadata server has determined the bucket number identifiers whose data is to be migrated, it obtains the fragment database to which the data of each target bucket number identifier is to be migrated. For example, if the target fragment database of the 3 bucket number identifiers 5 to 7 is database 4 and the target fragment database of bucket number identifiers 8 and 9 is database 1, all data with bucket numbers 5 to 7 is migrated to database 4 and all data with bucket numbers 8 and 9 is migrated to database 1. After the data migration is completed, the mapping table between bucket number identifiers and fragment databases is updated.
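The migration step can be sketched in memory as follows. The bucket and database numbers form an illustrative plan rather than values taken from any real system, the names `migrate_bucket`, `plan`, and `bucket_id` are hypothetical, and updating the mapping entry right after each bucket moves (instead of once at the end) is a simplifying assumption.

```python
def migrate_bucket(bucket: int, target: int, bgmt: dict, shards: dict) -> int:
    """Move every record of one bucket to the target shard; return the count."""
    source = bgmt[bucket]
    if source == target:
        return 0
    rows = shards.get(source, [])
    moving = [r for r in rows if r["bucket_id"] == bucket]
    shards[source] = [r for r in rows if r["bucket_id"] != bucket]
    shards.setdefault(target, []).extend(moving)
    bgmt[bucket] = target  # keep the mapping table in step with the data
    return len(moving)


# Illustrative starting state: buckets 5 and 6 on database 2, 7 to 9 on database 3.
bgmt = {5: 2, 6: 2, 7: 3, 8: 3, 9: 3}
shards = {
    2: [{"id": "a", "bucket_id": 5}, {"id": "b", "bucket_id": 6}],
    3: [{"id": "c", "bucket_id": 7}, {"id": "d", "bucket_id": 8},
        {"id": "e", "bucket_id": 9}],
}

# Illustrative plan: three buckets go to database 4, the other two to database 1.
plan = {5: 4, 6: 4, 7: 4, 8: 1, 9: 1}
for bucket, target in plan.items():
    migrate_bucket(bucket, target, bgmt, shards)

print(bgmt)
```

Only records tagged with a target bucket are touched, which is what makes one bucket the unit of migration.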
This embodiment therefore provides a data redistribution method. Enough bucket number identifiers are created according to the estimated number of data records to be stored. When data to be stored is received, a bucket number identifier is added to it according to the result of a random hash algorithm, and the data is stored in the corresponding fragment database according to its bucket number identifier and the correspondence between bucket number identifiers and fragment databases. Because the data is partitioned logically, no separate table of stored data records has to be maintained for each fragment database, the limits of physical table creation are avoided, and service data can be divided into smaller units for storage. This improves the scalability of data redistribution, avoids potential hot spots, and protects the reliability of the system. During redistribution, the target bucket number identifiers whose data is to be migrated are determined, and the data is migrated to and stored in the target fragment database corresponding to each target bucket number identifier. Using the data of one bucket number identifier as the basic unit of migration avoids excessive disk demand and usage, improves the efficiency and quality of data migration, reduces the impact on services during migration, and improves the availability of online redistribution.
The second embodiment of the invention relates to a data redistribution method and is substantially the same as the first embodiment. In this embodiment, after a bucket number identifier is assigned to the data to be stored through a random hash algorithm, it is detected whether that identifier is prohibited, and if so a new bucket number identifier is assigned. During data redistribution, the target bucket number identifiers are grouped and the data to be migrated is migrated in batches according to the bucket number identifiers in each group. In addition, when the data of one group of bucket number identifiers is migrated, the operation permission on that data is locked first and is unlocked after the migration is finished.
A specific flow of a data redistribution method in this embodiment is shown in fig. 2, and specifically includes the following steps:
Step 201, determining the bucket number identifier to be added to the data to be stored.
Specifically, when the data record to be stored is obtained, the metadata server determines the bucket number identifier to be added to the data record by means of a random hash algorithm.
Step 202, detecting whether the determined bucket number identifier is marked as prohibited; if it is, returning to step 201; if it is not, proceeding to step 203.
Specifically, after the bucket number identifier to be added is determined according to the random hash algorithm and before it is added to the extension field of the data record to be stored, it is detected whether the identifier is marked as prohibited. If it is, the flow returns to step 201 and the bucket number identifier is determined again according to the random hash algorithm; if it is not, the flow proceeds to step 203.
Step 203, adding the bucket number identifier to the data record to be stored and storing the data record in the fragment database corresponding to the bucket number identifier.
Specifically, when the bucket number identifier to be added is not marked as prohibited, it is added to the extension field of the data record to be stored. After that, it is detected whether the number of uses of the added identifier has reached the preset maximum number of uses; if it has, the identifier is marked as prohibited. The data is then stored in the fragment database according to the correspondence between bucket number identifiers and fragment databases.
In one example, when the bucket number identifier to be added is found not to be marked as prohibited, it is added to the extension field of the data record to be stored, and the data record is stored in the fragment database corresponding to its bucket number identifier according to the preset correspondence between bucket number identifiers and fragment databases. The number of uses of the added identifier is then checked against the preset maximum number of uses: if the maximum has been reached, the identifier is marked as prohibited; if not, the recorded number of uses of the identifier is updated.
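Steps 201 to 203 can be sketched as a small allocator. The text does not say how the hash is re-run so that it yields a different result; salting the key with an attempt counter is an assumption made here, as are SHA-256 and the names `BucketAllocator`, `assign`, and `forbidden`.

```python
import hashlib


class BucketAllocator:
    """Assigns bucket number identifiers and retires each one after max_uses."""

    def __init__(self, num_buckets: int, max_uses: int):
        self.num_buckets = num_buckets
        self.max_uses = max_uses
        self.uses = [0] * num_buckets
        self.forbidden = set()  # identifiers marked as prohibited

    def _hash(self, key: str, attempt: int) -> int:
        # The attempt counter is an assumed way to obtain a new result on retry.
        data = f"{key}#{attempt}".encode("utf-8")
        digest = hashlib.sha256(data).digest()
        return int.from_bytes(digest[:8], "big") % self.num_buckets

    def assign(self, key: str) -> int:
        if len(self.forbidden) == self.num_buckets:
            raise RuntimeError("every bucket number identifier is prohibited")
        attempt = 0
        bucket = self._hash(key, attempt)      # step 201
        while bucket in self.forbidden:        # step 202: prohibited, hash again
            attempt += 1
            bucket = self._hash(key, attempt)
        self.uses[bucket] += 1                 # step 203: identifier is used
        if self.uses[bucket] >= self.max_uses:
            self.forbidden.add(bucket)         # maximum reached: prohibit it
        return bucket


alloc = BucketAllocator(num_buckets=4, max_uses=2)
assigned = [alloc.assign(f"rec-{i}") for i in range(8)]
print(sorted(assigned))
```

With 4 buckets of 2 uses each, the eight assignments fill every bucket exactly twice, and a ninth call raises because no identifier is left.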
Step 204, acquiring the target bucket number identifiers of the data to be migrated and grouping them.
Specifically, when the metadata server responds to the data redistribution request, it generates a complete redistribution plan according to the metadata information. It first obtains the target bucket number identifiers of the data to be migrated, that is, the bucket number identifiers of all data records that are to be migrated, and then divides these target bucket number identifiers into N groups, where N is an integer greater than 1.
In one example, after receiving a data redistribution request, the metadata server determines from the request that the data records corresponding to bucket number identifiers 1 to 100 are to be redistributed. To ensure migration performance, the metadata server groups these one hundred bucket number identifiers, for example into 1 to 10, 11 to 20, and so on, with every ten bucket number identifiers forming one group.
In practical application, the number of groups and the number of bucket number identifiers in each group may be set according to actual needs; this embodiment does not limit the grouping method of the target bucket number identifiers.
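The grouping in the example above (identifiers 1 to 100 in groups of ten) can be sketched as follows. The function name `group_buckets` and the choice of fixed-size consecutive groups are illustrative assumptions; the embodiment leaves the grouping method open.

```python
def group_buckets(bucket_ids, group_size: int) -> list:
    """Split target bucket number identifiers into consecutive groups."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    ordered = sorted(bucket_ids)
    return [ordered[i:i + group_size]
            for i in range(0, len(ordered), group_size)]


groups = group_buckets(range(1, 101), 10)
print(len(groups))  # 10
print(groups[0])    # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```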
Step 205, migrating the data to be migrated to the target database in batches.
Specifically, after the target bucket number identifiers of the data to be migrated are grouped, the data to be migrated corresponding to the target bucket number identifiers of the same group are used as the data to be migrated of the same batch, and then the data to be migrated is migrated to the target database in batches according to the target bucket number identifiers of each group.
In one example, after grouping the bucket number identifiers of the data to be migrated, the metadata server randomly selects one group of bucket number identifiers and migrates the corresponding data. For example, when the target bucket number identifiers of the current batch are 1 to 10, the metadata server stores the bucket number identifiers to be locked in the metadata and issues a bucket locking request to each database agent. On receiving the request, each database agent updates the bucket locking information in its memory and checks the bucket number identifiers of each shard database; when it finds a bucket number identifier that is to be locked, it stops performing DML (data manipulation language) operations on the corresponding data. Suppose that, during bucket locking, the database agent detects that part of the data under bucket number identifier 7 is being accessed and changed by a running service, while the data under the other bucket number identifiers is not being accessed. The operation authority for the data under the nine bucket number identifiers 1 to 6 and 8 to 10 is then locked, so that services can no longer access or change that data, and the data under these nine bucket number identifiers is used as the data to be migrated in this batch. The data under bucket number identifier 7 is migrated later, once it too can no longer be accessed.
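The locking behaviour in this example can be sketched as below. The names `lock_batch` and `dml_allowed`, and the representation of lock state as a plain set, are hypothetical; the sketch only shows how a bucket that is still in use (bucket 7 in the example) is deferred while the rest of the batch is locked.

```python
def lock_batch(batch, in_use, locked):
    """Lock every idle bucket in the batch; defer buckets still in use.

    batch  -- bucket IDs of the current batch
    in_use -- bucket IDs whose data a running service is still changing
    locked -- mutable set holding the agent's bucket-lock information
    Returns (locked_now, deferred).
    """
    locked_now, deferred = [], []
    for bucket in batch:
        if bucket in in_use:
            deferred.append(bucket)   # migrate later, once it is idle
        else:
            locked.add(bucket)
            locked_now.append(bucket)
    return locked_now, deferred


def dml_allowed(bucket_id, locked):
    """A DML statement is rejected while its bucket is locked."""
    return bucket_id not in locked
```

Calling `lock_batch(range(1, 11), {7}, locked)` locks buckets 1 to 6 and 8 to 10 and returns bucket 7 as deferred, so DML on bucket 7 is still allowed while DML on the other nine buckets is refused.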
After the bucket number identifiers of the current batch are locked, a data migration instruction is sent to a database agent. The instruction carries the target bucket number identifiers of the current batch and the target shard database. The database agent obtains the data to be migrated of the current batch from the source shard database according to the target bucket number identifiers and migrates it to the target shard database, after which the correspondence between bucket number identifiers and shard databases is updated.
For example, after the bucket number identifiers of the data to be migrated are locked, the metadata server selects database agent 1 to perform the migration and sends it a data migration instruction. Because the data under bucket number identifier 7 is not migrated for the time being, database agent 1 constructs data migration SQL (structured query language) statements according to the instruction, extracts the data with bucket number identifiers 1 to 6 from the source shard database into its own storage space, imports that data from its storage space into shard database 11, and clears the data cache of the shard database. The data with bucket number identifiers 8 to 10 is then migrated into shard database 9 in the same manner.
After the migration of this batch is finished, database agent 1 clears both the data it has cached and the migrated data in the shard database from which it was migrated, updates the bucket number identifiers corresponding to each shard database, and then reports successful migration to the metadata server. A system architecture diagram of the data migration method is shown in fig. 3. The metadata server receives a redistribution request, selects a group of bucket number identifiers of the data to be migrated, locks the corresponding data by issuing a bucket locking instruction to each database agent, and then issues a data migration instruction to a designated database agent. The selected database agent caches data from the shard database into its storage space, taking the data (Bn) corresponding to one bucket number identifier as the basic unit, and then imports the cached data into the target shard database of the data to be migrated. The bucket number identifiers whose data migration is complete are then unlocked, and finally the correspondence between bucket number identifiers and shard databases is updated.
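The extract, import, and clear sequence performed by the database agent can be sketched with two in-memory SQLite databases standing in for the source and target shard databases. This is only an illustration under assumptions: the table name `records`, its columns, and the functions `make_shard` and `migrate_bucket` are hypothetical, and a real deployment would use the shard databases' own SQL dialect and transaction handling.

```python
import sqlite3


def make_shard():
    """Create an in-memory stand-in for one shard database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE records ("
        "id INTEGER PRIMARY KEY, payload TEXT, bucket_id INTEGER)"
    )
    return conn


def migrate_bucket(source, target, bucket_id):
    """Move all rows of one bucket from source to target; return the count."""
    # 1. Extract the bucket's rows into the agent's own storage space.
    buffered = source.execute(
        "SELECT id, payload, bucket_id FROM records WHERE bucket_id = ?",
        (bucket_id,),
    ).fetchall()
    # 2. Import the buffered rows into the target shard.
    with target:
        target.executemany(
            "INSERT INTO records (id, payload, bucket_id) VALUES (?, ?, ?)",
            buffered,
        )
    # 3. Clear the migrated rows from the source shard.
    with source:
        source.execute("DELETE FROM records WHERE bucket_id = ?", (bucket_id,))
    moved = len(buffered)
    buffered.clear()  # 4. Clear the agent-side cache.
    return moved
```

One bucket is the unit of work, matching the description of Bn as the basic unit; a batch is migrated by calling `migrate_bucket` once per locked bucket number identifier.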
In practical applications, which bucket number identifiers are migrated first can be decided according to the actual situation; this embodiment does not limit the order of data migration.
After the data to be migrated of the current batch has been migrated to the target database, the target bucket number identifiers whose data has been migrated are unlocked. The data corresponding to another group of bucket number identifiers is then selected for redistribution, and this repeats until all data covered by the data redistribution request has been migrated, at which point the data migration result is fed back.
In one example, the metadata server receives feedback from a database agent that the data corresponding to bucket number identifiers 1 to 4 has been successfully migrated to shard database 2. The metadata server releases the bucket locking information for bucket number identifiers 1 to 4 in the metadata and issues an unlocking request to each database agent. On receiving the request, each database agent unlocks the data corresponding to bucket number identifiers 1 to 4, restores access to and modification of that data through the database agent, and reports successful unlocking to the metadata server. After unlocking is complete, the metadata server updates the correspondence between bucket number identifiers and shard databases according to the feedback received after the migration, so that the data corresponding to bucket number identifiers 1 to 4 can again be modified and accessed. The metadata server then selects a new group from the grouped bucket number identifiers and migrates the corresponding data, repeating this until all data covered by the data redistribution request has been migrated, and then feeds back the data migration result.
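The overall batch loop (lock a group, migrate it, unlock it, update the mapping, then move to the next group) can be sketched in memory as follows. The function `redistribute`, the dictionary-based shards, and the single `target_shard` argument are simplifying assumptions made for illustration; in particular, the sketch updates the mapping inside the locked section, which is one possible ordering and not necessarily the one used in the embodiment.

```python
def redistribute(groups, bucket_to_shard, shards, target_shard):
    """Migrate buckets group by group; return the number of records moved.

    groups          -- list of lists of bucket IDs (one list per batch)
    bucket_to_shard -- mutable mapping: bucket ID -> shard name
    shards          -- mutable mapping: shard name -> list of record dicts
    target_shard    -- shard name that receives the migrated buckets
    """
    locked = set()
    moved = 0
    for group in groups:
        locked.update(group)  # lock the whole batch before migrating
        for bucket in group:
            source = bucket_to_shard[bucket]
            if source == target_shard:
                continue
            moving = [r for r in shards[source] if r["bucket_id"] == bucket]
            shards[target_shard].extend(moving)
            shards[source] = [
                r for r in shards[source] if r["bucket_id"] != bucket
            ]
            bucket_to_shard[bucket] = target_shard  # update the mapping
            moved += len(moving)
        locked.difference_update(group)  # unlock the finished batch
    return moved
```

Only one group is locked at any time, which is what keeps the impact on running services limited to the buckets of the current batch.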
This embodiment provides a data redistribution method. Checking the number of uses of each to-be-added bucket number identifier prevents too much data from accumulating under a single bucket number identifier, which would degrade data migration performance. Grouping the bucket number identifiers of the data to be migrated enables batch migration, which improves migration performance and reduces the impact on services. Completing the migration inside the database agent avoids an additional import and export of the data through a file system, improving the performance and efficiency of data migration. Locking the operation authority for the current batch of data before migration and unlocking it afterwards avoids data loss and the service errors it would cause. Together, these measures achieve efficient and accurate online redistribution of data, improve the flexibility and usability of data redistribution, and improve the user experience.
The steps of the above methods are divided as they are for clarity of description. In implementation, steps may be combined into one step, or a step may be split into multiple steps; as long as the same logical relationship is included, such variations fall within the scope of protection of this patent. Adding insignificant modifications to the algorithms or processes, or introducing insignificant designs, without changing the core design of the algorithms and processes, also falls within the scope of protection of this patent.
A third embodiment of the invention relates to an electronic device, as shown in fig. 4, comprising at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the data redistribution method as described above.
The memory and the processor are connected by a bus. The bus may comprise any number of interconnected buses and bridges, and it connects together various circuits of the one or more processors and the memory. The bus may also connect various other circuits such as peripherals, voltage regulators, and power management circuits, which are well known in the art and are therefore not described further here. A bus interface provides an interface between the bus and the transceiver. The transceiver may be a single element or a plurality of elements, such as a plurality of receivers and transmitters, providing a means for communicating with various other apparatus over a transmission medium. Data processed by the processor is transmitted over a wireless medium via an antenna; the antenna also receives data and passes it to the processor.
The processor is responsible for managing the bus and general processing and may also provide various functions including timing, peripheral interfaces, voltage regulation, power management, and other control functions. The memory may be used to store data used by the processor in performing operations.
A fourth embodiment of the present invention relates to a computer-readable storage medium storing a computer program. The computer program realizes the above-described method embodiments when executed by a processor.
That is, as those skilled in the art will understand, all or part of the steps of the methods in the embodiments described above may be implemented by a program instructing the relevant hardware. The program is stored in a storage medium and includes several instructions that cause a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and various other media capable of storing program code.
It will be understood by those of ordinary skill in the art that the foregoing embodiments are specific examples of carrying out the invention, and that in practice various changes in form and detail may be made to them without departing from the spirit and scope of the invention.

Claims (9)

1. A data redistribution method, comprising:
adding a bucket number identifier to a data record to be stored, and storing the data record in the shard database corresponding to the bucket number identifier of the data record according to a preset correspondence between bucket number identifiers and shard databases;
in response to a data redistribution request, acquiring a target bucket number identifier of the data to be migrated;
migrating the data to be migrated to a target shard database according to the target bucket number identifier;
wherein, after adding the bucket number identifier to the data record to be stored, the method further comprises:
detecting whether the number of uses of the added bucket number identifier has reached a preset maximum number of uses;
if the preset maximum number of uses has been reached, marking the added bucket number identifier as forbidden;
the method further comprises:
after determining a bucket number identifier to be added according to a random hash algorithm, and before adding the determined bucket number identifier to an extension field of the data record to be stored, detecting whether the determined bucket number identifier is marked as forbidden;
if the determined bucket number identifier is marked as forbidden, determining a bucket number identifier to be added again according to the random hash algorithm;
and if the determined bucket number identifier is not marked as forbidden, adding the determined bucket number identifier to the extension field of the data record to be stored.
2. The data redistribution method of claim 1, wherein migrating the data to be migrated to a target shard database according to the target bucket number identifier comprises:
dividing the target bucket number identifiers of the data to be migrated into N groups, wherein the data to be migrated corresponding to the target bucket number identifiers of the same group is used as the data to be migrated of the same batch, and N is an integer greater than 1;
and migrating the data to be migrated to the target database in batches according to the target bucket number identifiers of each group.
3. The data redistribution method of claim 2, wherein migrating the data to be migrated to the target database in batches according to the target bucket number identifiers of each group comprises:
locking the target bucket number identifiers of the data to be migrated of the current batch;
and migrating the data to be migrated of the current batch to the target database, and unlocking the target bucket number identifiers of the data to be migrated of the current batch after the migration is completed.
4. The data redistribution method of claim 3, wherein migrating the data to be migrated of the current batch to the target database comprises:
sending a data migration instruction to a database agent, wherein the data migration instruction carries the target bucket number identifiers of the data to be migrated of the current batch and a target shard database, so that the database agent obtains the data to be migrated of the current batch from a source shard database according to the target bucket number identifiers and migrates the data to be migrated of the current batch to the target shard database.
5. The data redistribution method of claim 1, wherein before adding the bucket number identifier to the data record to be stored, the method further comprises: calculating the number M of bucket number identifiers according to the following formula:
Figure DEST_PATH_IMAGE001
where S is the estimated number of data records to be stored, and L is the preset maximum number of uses of each bucket number identifier.
6. The data redistribution method of any one of claims 1 to 5, wherein adding a bucket number identifier to the data record to be stored comprises:
determining a bucket number identifier to be added according to a random hash algorithm;
and adding the determined bucket number identifier to an extension field of the data record to be stored.
7. The data redistribution method of any one of claims 1 to 5, wherein after migrating the data to be migrated to a target shard database according to the target bucket number identifier, the method further comprises:
updating the correspondence between bucket number identifiers and shard databases.
8. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the data redistribution method of any one of claims 1 to 7.
9. A computer-readable storage medium storing a computer program which, when executed by a processor, carries out the data redistribution method of any one of claims 1 to 7.
CN201911070817.5A 2019-11-05 2019-11-05 Data redistribution method, electronic equipment and storage medium Active CN112765262B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201911070817.5A CN112765262B (en) 2019-11-05 2019-11-05 Data redistribution method, electronic equipment and storage medium
PCT/CN2020/116284 WO2021088531A1 (en) 2019-11-05 2020-09-18 Data redistribution method, electronic device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911070817.5A CN112765262B (en) 2019-11-05 2019-11-05 Data redistribution method, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112765262A CN112765262A (en) 2021-05-07
CN112765262B true CN112765262B (en) 2023-02-28

Family

ID=75692567

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911070817.5A Active CN112765262B (en) 2019-11-05 2019-11-05 Data redistribution method, electronic equipment and storage medium

Country Status (2)

Country Link
CN (1) CN112765262B (en)
WO (1) WO2021088531A1 (en)

Also Published As

Publication number Publication date
WO2021088531A1 (en) 2021-05-14
CN112765262A (en) 2021-05-07

Legal Events

Date Code Title Description
PB01 Publication
TA01 Transfer of patent application right

Effective date of registration: 20220110

Address after: 100176 floor 18, building 8, courtyard 10, KEGU 1st Street, Beijing Economic and Technological Development Zone, Daxing District, Beijing (Yizhuang group, high-end industrial area of Beijing Pilot Free Trade Zone)

Applicant after: Jinzhuan Xinke Co.,Ltd.

Address before: 518057 Zhongxing building, science and technology south road, Nanshan District hi tech Industrial Park, Guangdong, Shenzhen

Applicant before: ZTE Corp.

SE01 Entry into force of request for substantive examination
GR01 Patent grant