CN118113774A - Method, system, equipment cluster and storage medium for redistributing database - Google Patents

Info

Publication number
CN118113774A
Authority
CN
China
Prior art keywords
data
data node
redistribution
original data
database
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211472353.2A
Other languages
Chinese (zh)
Inventor
任伟明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Cloud Computing Technologies Co Ltd
Original Assignee
Huawei Cloud Computing Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Cloud Computing Technologies Co Ltd
Priority to CN202211472353.2A (publication CN118113774A)
Priority to PCT/CN2023/125968 (publication WO2024109415A1)
Publication of CN118113774A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G06F16/23 Updating
    • G06F16/2308 Concurrency control
    • G06F16/2315 Optimistic concurrency control
    • G06F16/2329 Optimistic concurrency control using versioning
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a method, a system, a device cluster and a storage medium for database redistribution. The method comprises the following steps: a cloud platform receives a data redistribution instruction from a user and starts a redistribution device; the redistribution device instructs an original data node to mount a cloud hard disk; the original data node creates a second data table on the cloud hard disk and copies the first data table of the original data node to the second data table; finally, the redistribution device redistributes the second data table on the cloud hard disk to the original data node and a new data node, completing the redistribution of the database. The method can save disk space and improve the disk utilization of the database.

Description

Method, system, equipment cluster and storage medium for redistributing database
Technical Field
The present application relates to the field of database technologies, and in particular, to a method, a system, an apparatus cluster, and a storage medium for redistributing a database.
Background
A database cluster is a distributed database system formed by two or more relatively independent database servers and a high-speed communication network. Each data node of the database cluster (i.e., each server in the cluster) is an independent server running its own services, and the data nodes can communicate with each other. As service demands change, the database cluster can scale elastically and be expanded online. For example, when physical storage space becomes insufficient because of service growth, the database cluster is expanded by adding data nodes and redistributing part of the data on the original data nodes to the newly added data nodes.
In the prior art, adding data nodes is a common way to expand database capacity. Specifically, a new data node is first added to the database cluster and the metadata of the data nodes is synchronized; data redistribution is then performed, and table switching completes the process. Data redistribution creates a temporary table in the cluster, imports the data of the original table into the temporary table, and, after the import is completed, hands the metadata of the original table to the temporary table for use. However, when the database is expanded in this way, the temporary table occupies roughly as much disk space as the original data, so service data can occupy at most 45% of the disk. The database can also be expanded by adding data nodes under a hash bucket (HashBucket) scheme, but this requires the database index to be redesigned; the scheme is difficult to implement, complex to operate, and time-consuming, so it serves only as a temporary transitional solution.
Disclosure of Invention
The application provides a method, a system, a device cluster and a storage medium for redistributing a database, which can realize capacity expansion without reserving 50% of disk space and can improve the disk utilization rate of the database.
In a first aspect, a method of database redistribution is provided. Firstly, a cloud platform receives a redistribution instruction of a user, and after the redistribution instruction is received, the cloud platform starts a redistribution device; then, the redistribution device instructs the original data node to mount the cloud hard disk; and finally, the original data node stores the first data table as a second data table of the cloud hard disk, and the redistribution device redistributes the second data table of the cloud hard disk to the original data node and the new data node.
According to this scheme, the original data node mounts the cloud hard disk, the first data table on the original data node is stored as the second data table on the cloud hard disk, and the table data of the second data table is then redistributed to the original data node and the new data node. Capacity expansion therefore does not require reserving 50% of disk space, and the disk utilization of the database can be improved.
With reference to the first aspect, in one possible implementation manner, the method includes: the original data node deletes the first data table of the original data node.
With reference to the first aspect, in another possible implementation manner, the method specifically may include: if the data storage utilization rate of the original data node is larger than the target threshold value, the cloud platform prompts a user whether to redistribute or not.
Wherein the target threshold is a value greater than one-half.
In another possible implementation manner, the original data node stores the first data table thereof as the second data table of the cloud hard disk, and specifically includes: the original data node creates a second data table on the cloud hard disk, and then copies the table data of the first data table to the second data table.
With reference to the first aspect and any one of the possible implementation manners of the first aspect, in another possible implementation manner of the first aspect, the method includes: the cloud platform deploys the new data nodes.
In one possible implementation, the method includes: the original data node unmounts the cloud hard disk.
According to this implementation, the cloud hard disk is mounted on the original data node, the first data table of the original data node is stored as the second data table of the cloud hard disk, and the second data table is then redistributed to the original data node and the new data node. This completes the capacity expansion of the database, saves disk space, and improves the disk utilization of the database.
In a second aspect, there is provided a database redistribution system, the system comprising: the cloud platform is used for receiving a redistribution instruction of a user and starting a redistribution device; the redistribution device is used for indicating the original data node to mount the cloud hard disk; the original data node is used for storing a first data table of the original data node as a second data table of the cloud hard disk; and the redistribution device is also used for redistributing the second data table in the cloud hard disk to the original data node and the new data node.
With reference to the second aspect, in one possible implementation manner, the original data node is used to delete the first data table in the original data node.
With reference to the second aspect, in another possible implementation manner, the cloud platform is configured to prompt a user whether to redistribute data if a data storage utilization of an original data node is greater than a target threshold.
In another possible implementation manner, the storing, by the original data node, the first data table as the second data table of the cloud hard disk specifically includes: the original data node is used for creating a second data table on the cloud hard disk; the original data node is further configured to copy table data of the first data table to the second data table.
In another possible implementation manner, the cloud platform is used for deploying new data nodes.
With reference to the second aspect and any one of the possible implementation manners of the second aspect, in another possible implementation manner of the second aspect, the original data node is used to unmount the cloud hard disk.
In a third aspect, a cluster of computing devices is provided, the cluster of computing devices including at least one computing device, each computing device including a processor and a memory; wherein the processor of the at least one computing device is configured to execute instructions stored in the memory of the at least one computing device to cause the cluster of computing devices to perform a method as provided above in the first aspect or any one of the possible implementations of the first aspect.
In a fourth aspect, there is provided a computer program product comprising instructions which, when executed by a cluster of computing devices, cause said cluster of computing devices to perform a method as provided by the above-described first aspect or any of the possible implementations of the first aspect.
In a fifth aspect, a computer readable storage medium is provided, comprising computer program instructions which, when executed by a cluster of computing devices, perform a method as provided by the above-mentioned first aspect or any of the possible implementations of the first aspect.
Drawings
In order to more clearly describe the embodiments of the present application or the technical solutions in the background art, the following description will describe the drawings that are required to be used in the embodiments of the present application or the background art.
Fig. 1 is a schematic structural diagram of a distributed database provided by the present application.
Fig. 2 is a schematic diagram of adding data nodes and redistributing data in a database provided by the present application.
Fig. 3 is a schematic flow chart of database redistribution provided by the present application.
Fig. 4 is a schematic diagram of an initial stage of database redistribution provided by the present application.
Fig. 5 is a schematic diagram of a database redistribution process according to the present application.
Fig. 6 is a schematic diagram of the completion of database redistribution provided by the present application.
Fig. 7 is a schematic structural diagram of a database redistribution system according to the present application.
Fig. 8 is a schematic structural diagram of a computing device provided by the present application.
Fig. 9 is a schematic structural diagram of a computing device cluster according to the present application.
Fig. 10 is a schematic diagram of a structure of a computing device connected by a network according to the present application.
Detailed Description
The application provides a method, a system, a device cluster and a storage medium for database redistribution, and the method, the system, the device cluster and the storage medium are described below with reference to the accompanying drawings.
In order to make the technical scheme provided by the application clearer, before the technical scheme provided by the application is specifically described, explanation of related terms is firstly carried out.
(1) Structured query language (Structured Query Language, SQL): the standard language of relational databases. SQL is a general-purpose, highly functional relational database language, the standard interface for relational database access, and the basis for interoperation between different database systems. SQL is based on relational algebra and tuple relational calculus, and integrates data query, data manipulation, and data control functions.
(2) Shared-nothing architecture (Shared Nothing Structure): each node in the database cluster has its own independent CPU, memory, and storage medium, and no resources are shared. The nodes communicate through a protocol, which gives good parallelism and scalability. Each node processes its local data, and processing results can be aggregated to an upper layer or passed between nodes through the communication protocol.
(3) Multi-version concurrency control (Multi-Version Concurrency Control, MVCC): one of the database concurrency control protocols, used to enable concurrent access to the database. Its basic idea is that one tuple can have multiple versions and different queries can work on different versions, which improves the concurrency performance of the database and handles read-write conflicts more gracefully.
(4) Online transaction processing (On-Line Transaction Processing, OLTP): also called transaction-oriented processing, the process of managing transaction data with a computer system. Its basic characteristic is that user data received at the front end is immediately transmitted to the computing center for processing and the result is returned within a short time, so it is one way of responding quickly to user operations. OLTP is a major application of traditional relational databases and is used for airline booking, banking, stock trading, supermarket sales, restaurant management, and the like.
(5) Online analytical processing (On-Line Analytical Processing, OLAP): a technique for organizing large business databases and supporting complex analysis. OLAP is a major application of data warehouse systems; it supports complex analysis operations, focuses on decision support, and provides query results to decision makers in an intuitive and understandable form without negatively impacting the transaction system.
(6) Data node (Data Node, DN): used to store data and independently maintain its own metadata; a DN is typically a complete database. Each DN stores a subset of the data, each subset being a partition, and is responsible for actually executing the storage and query operations on table data.
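As an illustration of term (3), the following minimal Python sketch shows the basic idea that one tuple can have several versions and that different queries can read different versions. The class and field names (`VersionedStore`, `Version`, `created_at`) and the integer clock are assumptions made for this example; they are not taken from the application and do not describe the storage engine of any particular database.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Version:
    value: str
    created_at: int  # logical timestamp of the write (hypothetical field)


@dataclass
class VersionedStore:
    """Toy multi-version store: every write appends a new version of a tuple."""
    clock: int = 0
    rows: Dict[str, List[Version]] = field(default_factory=dict)

    def write(self, key: str, value: str) -> int:
        # Each write gets a new timestamp and never overwrites older versions.
        self.clock += 1
        self.rows.setdefault(key, []).append(Version(value, self.clock))
        return self.clock

    def snapshot(self) -> int:
        # A reader remembers the clock value at which it started.
        return self.clock

    def read(self, key: str, snapshot: int) -> Optional[str]:
        # Return the newest version that already existed at the snapshot.
        visible = [v for v in self.rows.get(key, []) if v.created_at <= snapshot]
        return visible[-1].value if visible else None


store = VersionedStore()
store.write("account:1", "balance=100")
reader_snapshot = store.snapshot()        # a query starts here
store.write("account:1", "balance=80")    # a later writer adds a new version

old_view = store.read("account:1", reader_snapshot)
new_view = store.read("account:1", store.snapshot())
print(old_view, new_view)
```

The earlier reader keeps seeing the version that existed when it started, while a later reader sees the newer version, so the read does not have to wait for the write.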
The structure of a distributed database according to an embodiment of the present application is briefly described below.
Fig. 1 is a schematic structural diagram of a distributed database according to an embodiment of the present application. The distributed database is based on a shared-nothing architecture, can handle mixed workloads of complex transactions, supports strongly consistent distributed transactions, and supports elastic expansion by adding data nodes. Under this architecture, data nodes are easy to add, and adding data nodes improves query, storage, and loading performance. The database includes a database thread pool 101, an SQL engine layer 102, and a storage engine layer 103.
The database thread pool 101 adopts a thread pool model: threads can execute in parallel, the switching cost under high concurrency is low, and the memory overhead is low. By contrast, a process model implements communication and data sharing through shared memory, and each process corresponds to one concurrent connection, which incurs a switching performance loss.
The SQL engine layer 102 is the core layer of the database, providing all the logical functions of the data, and is divided into four components:
The SQL interface receives and processes SQL commands from the client, sends them to the other components, receives the result data returned by the other components, and returns the result data to the client.
The SQL parser parses query statements and finally generates a syntax tree. The parser first parses the query statement; if the statement has a syntax error, the parser returns the corresponding error information. After the syntax check passes, the parser queries the cache, and if the cache holds the corresponding statement, the result is returned directly without further query operations.
The SQL optimizer optimizes query statements, including selecting a suitable index and data access method.
The parallel execution component executes query statements and can execute them in parallel.
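The order of operations described for these components (syntax check first, then a cache lookup, then execution only on a cache miss) can be sketched as follows. This is a toy illustration under stated assumptions: `ToySqlEngine`, `SqlSyntaxError`, the token-based "syntax check", and the text-keyed cache are all hypothetical and far simpler than a real SQL engine, and the optimizer stage is omitted.

```python
from typing import Callable, Dict, List


class SqlSyntaxError(Exception):
    """Raised by the toy parser when a statement fails the syntax check."""


class ToySqlEngine:
    """Hypothetical stand-in for the interface, parser, and execution components."""

    def __init__(self, executor: Callable[[List[str]], str]) -> None:
        self.executor = executor          # plays the role of the execution component
        self.cache: Dict[str, str] = {}   # normalized statement text -> cached result
        self.executions = 0               # counts statements that reached the executor

    def parse(self, statement: str) -> List[str]:
        # Greatly simplified syntax check: the statement must start with SELECT
        # and contain FROM somewhere after it.
        tokens = statement.strip().rstrip(";").split()
        keywords = [token.upper() for token in tokens]
        if len(tokens) < 4 or keywords[0] != "SELECT" or "FROM" not in keywords:
            raise SqlSyntaxError(f"cannot parse: {statement!r}")
        return tokens

    def run(self, statement: str) -> str:
        tokens = self.parse(statement)    # a syntax error is reported before anything else
        key = " ".join(tokens)
        if key in self.cache:             # cache hit: return without further query work
            return self.cache[key]
        self.executions += 1
        result = self.executor(tokens)
        self.cache[key] = result
        return result


engine = ToySqlEngine(lambda tokens: f"rows from {tokens[-1]}")
first = engine.run("SELECT id FROM t1;")
second = engine.run("SELECT id FROM t1")   # same statement, served from the cache
print(first, second, engine.executions)
```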
The storage engine layer 103 adopts a multi-engine structure, including an MVCC row storage engine, a memory engine, and a column storage engine. The MVCC row storage engine stores the tables of the database row by row under multi-version concurrency control, keeping all attributes of each row together; it suits queries that read all attributes of a row and is commonly used for databases in OLTP scenarios. The memory engine controls the memory usage of threads and ensures that it does not exceed the available memory. The column storage engine stores table data column by column, keeping the values of many records for each column together; column storage is generally used in OLAP scenarios, where queries are complex and the data volume is large.
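The difference between the row layout and the column layout can be seen in a small Python sketch. The sample records and the helper names `to_row_store` and `to_column_store` are invented for illustration and say nothing about the on-disk format of the engines described here.

```python
from typing import Dict, List, Tuple

records = [
    {"id": 1, "name": "a", "amount": 10},
    {"id": 2, "name": "b", "amount": 20},
    {"id": 3, "name": "c", "amount": 30},
]


def to_row_store(rows: List[dict]) -> List[Tuple]:
    # Row layout: all attributes of one record are kept together.
    return [tuple(row.values()) for row in rows]


def to_column_store(rows: List[dict]) -> Dict[str, list]:
    # Column layout: the values of one attribute across many records are kept together.
    columns: Dict[str, list] = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


row_store = to_row_store(records)
column_store = to_column_store(records)

full_row = row_store[1]                      # OLTP-style access: one row, every attribute
total_amount = sum(column_store["amount"])   # OLAP-style access: one column, every record
print(full_row, total_amount)
```

Fetching a whole record touches one contiguous entry in the row layout, whereas aggregating a single attribute touches one contiguous list in the column layout.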
Based on the shared-nothing architecture shown in Fig. 1, after new data nodes are added to the database, part of the data on the original data nodes is redistributed to the newly added data nodes to complete capacity expansion. Fig. 2 is a schematic diagram of adding data nodes and redistributing data in a database according to an embodiment of the present application. The example assumes that the current database includes 2 data nodes DN1 and DN2 and that 2 data nodes DN3 and DN4 are added.
As shown in Fig. 2, the database cluster includes 2 original data nodes DN1 and DN2 and 2 new data nodes DN3 and DN4, each connected to a storage medium. After the new data nodes DN3 and DN4 are added, the old tuples on the original data nodes DN1 and DN2 are redistributed to the new data nodes DN3 and DN4, and the old tuples on the original data nodes are deleted. The data of a database system is typically stored in the form of tables, and tuples are the basic elements of a table.
In the existing approach, adding data nodes and redistributing data works as follows: a temporary table is created in the database, with its sub-tables distributed across the storage media connected to the original data nodes and the new data nodes; after the temporary table is built, the old tuples on the original data nodes are imported into the temporary table, and the data of the temporary table is then redistributed to the original data nodes and the new data nodes. Specifically, the old tuples are originally distributed on the original data nodes DN1 and DN2, and during database expansion they need to be redistributed across DN1 to DN4, which completes the redistribution of the database. However, when the database is expanded in this way, the temporary table occupies roughly as much disk space as the table data of the original data nodes, so service data can occupy at most 45% of the disk.
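The space constraint of the temporary-table approach can be made concrete with a little arithmetic. The sketch below assumes, as the text states, that the temporary copy is about as large as the original table data; the function names and the `copy_ratio` and `usable_fraction` parameters are hypothetical. The application does not explain how the 45% figure is derived; reading it as half of a 90% usable share of the disk is an assumption made here for illustration only.

```python
def peak_usage_with_temp_table(service_fraction: float, copy_ratio: float = 1.0) -> float:
    """Peak disk fraction while the original data and its temporary copy coexist."""
    return service_fraction * (1.0 + copy_ratio)


def max_service_fraction(usable_fraction: float, copy_ratio: float = 1.0) -> float:
    """Largest share of the disk that service data may occupy before expansion."""
    return usable_fraction / (1.0 + copy_ratio)


# With a full-size temporary copy, 45% of service data peaks at 90% of the disk.
peak = peak_usage_with_temp_table(0.45)

# Even if the whole disk were usable, service data could not exceed one half.
hard_limit = max_service_fraction(1.0)

# Assumed example: if only 90% of the disk is usable, the limit drops to 45%.
assumed_limit = max_service_fraction(0.90)
print(peak, hard_limit, assumed_limit)
```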
In order to solve the problems, the application provides a method, a system, a device cluster and a storage medium for redistributing a database, which are characterized in that a cloud hard disk is newly added as a transition disk, table data on an original data node is written on the cloud hard disk, and then the data just written on the cloud hard disk are redistributed on the original data node and a new data node, so that expansion without reserving 50% of disk space is realized, and the disk utilization rate of the database is improved.
In order to more clearly understand the method, system, device cluster and storage medium for database redistribution provided by the present application, the following detailed descriptions are provided with reference to the corresponding drawings.
Fig. 3 is a flowchart of a database redistribution method according to an embodiment of the present application, including steps S301 to S305:
In step S301, the cloud platform receives a data redistribution instruction of the user, and starts the redistribution device.
If the data storage utilization of a data node of the current database is greater than a target threshold, the cloud platform asks the user whether to redistribute the database, where the target threshold is a value greater than one half of the disk space of the data node. When the user chooses to redistribute the database, the cloud platform receives the user's redistribution instruction, deploys the new data nodes, and starts the redistribution device.
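The check that triggers the prompt can be sketched in a few lines of Python. The function names, the byte-based inputs, and the threshold value 0.6 used in the example are assumptions for illustration; the application only states that the prompt is issued when utilization exceeds the target threshold.

```python
from typing import Dict, List, Tuple


def utilization(used_bytes: int, capacity_bytes: int) -> float:
    """Fraction of a data node's disk that is occupied."""
    if capacity_bytes <= 0:
        raise ValueError("capacity_bytes must be positive")
    return used_bytes / capacity_bytes


def nodes_over_threshold(
    nodes: Dict[str, Tuple[int, int]], target_threshold: float
) -> List[str]:
    """Return the names of data nodes whose utilization exceeds the threshold."""
    return [
        name
        for name, (used, capacity) in nodes.items()
        if utilization(used, capacity) > target_threshold
    ]


# Hypothetical cluster: node name -> (used bytes, capacity bytes).
cluster = {"DN1": (70, 100), "DN2": (55, 100)}
over = nodes_over_threshold(cluster, target_threshold=0.6)
should_prompt = bool(over)   # the platform would ask the user whether to redistribute
print(over, should_prompt)
```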
Step S302, the redistribution device indicates the original data node to mount the cloud hard disk.
The redistribution device instructs the original data node to mount the cloud hard disk. The cloud hard disk is separate from the storage media of the original data node and the new data node, so it does not occupy storage space in those storage media.
Step S303, the original data node creates a second data table in the cloud hard disk, and copies the table data of the first data table to the second data table.
After the cloud hard disk is mounted on the original data node, the original data node creates a second data table on the cloud hard disk and copies the first data table into the second data table. The first data table on the original data node is then deleted, releasing the disk space of the original data node.
Step S304, the redistribution device redistributes the second data table of the cloud hard disk to the original data node and the new data node.
After the steps are completed, the redistribution device redistributes the second data table of the cloud hard disk to the original data node and the new data node.
Step S305, the original data node unmounts the cloud hard disk.
Through the above steps, the database is redistributed, and the table data of the first data table is distributed across the original data node and the new data node. After the database is redistributed, the original data node also needs to unmount the cloud hard disk.
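Steps S301 to S305 can be walked through with a small in-memory simulation. Everything in the sketch is an assumption made for illustration: the class and function names, the use of integer rows, and in particular the round-robin placement in S304, since the application does not specify how rows are assigned to nodes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CloudDisk:
    """Transition disk that holds the second data table while it is staged."""
    tables: Dict[str, List[int]] = field(default_factory=dict)


@dataclass
class DataNode:
    name: str
    first_table: List[int] = field(default_factory=list)  # rows on the local disk
    mounted: Optional[CloudDisk] = None

    def mount(self, disk: CloudDisk) -> None:
        self.mounted = disk

    def stage(self, table_name: str) -> None:
        # S303: create the second table on the cloud disk, copy the first table
        # into it, then delete the first table to release local disk space.
        if self.mounted is None:
            raise RuntimeError(f"{self.name} has no cloud disk mounted")
        self.mounted.tables.setdefault(table_name, []).extend(self.first_table)
        self.first_table = []

    def unmount(self) -> None:
        self.mounted = None


def redistribute(originals: List[DataNode], new_nodes: List[DataNode],
                 disk: CloudDisk, table_name: str = "second_table") -> None:
    for node in originals:                      # S302: mount the cloud disk
        node.mount(disk)
    for node in originals:                      # S303: stage data on the cloud disk
        node.stage(table_name)
    targets = originals + new_nodes
    staged = disk.tables.pop(table_name, [])
    for index, row in enumerate(staged):        # S304: round-robin placement (assumed)
        targets[index % len(targets)].first_table.append(row)
    for node in originals:                      # S305: unmount the cloud disk
        node.unmount()


def on_redistribution_instruction(originals: List[DataNode],
                                  new_names: List[str]) -> List[DataNode]:
    # S301: the platform receives the instruction, deploys the new nodes,
    # and starts the redistribution (modelled here as a plain function call).
    new_nodes = [DataNode(name) for name in new_names]
    redistribute(originals, new_nodes, CloudDisk())
    return originals + new_nodes


dn1 = DataNode("DN1", list(range(0, 6)))
dn2 = DataNode("DN2", list(range(6, 12)))
cluster = on_redistribution_instruction([dn1, dn2], ["DN3", "DN4"])
layout = {node.name: node.first_table for node in cluster}
print(layout)
```

In this sketch the local disk of an original node never holds more than its original table, because the staged copy lives on the cloud disk rather than next to the original data.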
The technical solution of the database expansion method of the present application is described in detail below through an example.
For a clearer understanding of the above embodiment, the example again assumes that the current database includes 2 data nodes DN1 and DN2 and that 2 data nodes DN3 and DN4 are added.
Fig. 4 is a schematic diagram of an initial stage of database redistribution, fig. 5 is a schematic diagram during the database redistribution, and fig. 6 is a schematic diagram of the completion of the database redistribution.
As shown in Fig. 4, the current database includes two data nodes DN1 and DN2, each connected to a storage medium. The table data of the original data nodes is divided into sub-tables dn1.old and dn2.old, which are stored in the storage medium of the corresponding node; for convenience of description, the tables in the storage media are drawn separately in Figs. 4 to 6. The suffix ".old" identifies the tuples in a table as old tuples, and ".new" identifies them as new tuples. Because the data storage utilization of the data nodes of the current database is greater than the target threshold, the database needs to be expanded: two data nodes DN3 and DN4 are added and, together with the original data nodes DN1 and DN2, form a new database cluster of four data nodes, in which the original table data is distributed on the original data nodes DN1 and DN2 and the new data nodes DN3 and DN4 hold no table data. Fig. 5 shows how the new data nodes DN3 and DN4 are connected after joining the database cluster: DN3 and DN4 communicate with the original data nodes in the cluster to form the new database cluster.
After the data nodes DN3 and DN4 are added, the redistribution device instructs the original data nodes DN1 and DN2 to mount the cloud hard disk, and the original data nodes create a data table t1_tmp on the cloud hard disk. As shown in Fig. 4, the data table t1_tmp is divided into a plurality of sub-tables stored on the cloud hard disk and is used for the data redistribution shown in Fig. 5.
As shown in Fig. 5, after the data table t1_tmp has been created on the cloud hard disk, the old tuples on the original data nodes are copied to t1_tmp, so that the data of the original data nodes is stored on the cloud hard disk. New tables dn3.new and dn4.new are then created on the new data nodes DN3 and DN4, and the table data is redistributed from t1_tmp on the cloud hard disk to the original tables of the original data nodes and the new tables of the new data nodes. Specifically, as shown in Fig. 5, the old tuples are originally distributed on DN1 and DN2, and when the database cluster is expanded they need to be redistributed across DN1 to DN4.
Through the above steps, the database redistribution process is completed. As shown in Fig. 6, the old tuples are evenly distributed across the data nodes DN1 to DN4, and after the data redistribution is completed, the original data nodes unmount the cloud hard disk.
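The DN1 to DN4 example can also be followed in terms of disk space. The sketch below tracks local usage per node and usage of the cloud hard disk across the stages; the 80-unit starting figures are invented, the even final split follows the statement that tuples end up evenly distributed, and the "old tables released" stage follows the earlier description of step S303 in which the first data table is deleted after copying.

```python
from typing import Dict, List, Tuple

Stage = Tuple[str, Dict[str, float], float]   # (label, local usage per node, cloud usage)


def simulate_stages(old_usage: Dict[str, float], new_nodes: List[str]) -> List[Stage]:
    local = dict(old_usage)
    for name in new_nodes:
        local[name] = 0.0                       # new nodes start without table data
    stages: List[Stage] = [("initial", dict(local), 0.0)]

    cloud = sum(old_usage.values())             # old tuples copied into t1_tmp
    stages.append(("copied to t1_tmp", dict(local), cloud))

    for name in old_usage:                      # old tables deleted on original nodes
        local[name] = 0.0
    stages.append(("old tables released", dict(local), cloud))

    share = cloud / len(local)                  # even split across all nodes
    for name in local:
        local[name] = share
    stages.append(("redistributed", dict(local), 0.0))
    return stages


stages = simulate_stages({"DN1": 80.0, "DN2": 80.0}, ["DN3", "DN4"])
peak_local = max(max(usage.values()) for _, usage, _ in stages)
final_usage = stages[-1][1]
for label, usage, cloud in stages:
    print(label, usage, cloud)
```

Under these assumptions the highest local usage seen on any node is its starting usage, since the extra copy is held on the cloud hard disk.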
In summary, the application provides a database redistribution method that trades time for space by using a cloud hard disk mounted on the original data node as a transition disk: the table data on the original data node is copied to the cloud hard disk, and the table data on the cloud hard disk is then redistributed to the original data node and the new data node. This achieves capacity expansion without reserving 50% of disk space and improves the disk utilization of the database.
The database redistribution method provided by the application is described in detail above, and in order to facilitate better implementation of the above scheme provided by the application, correspondingly, a database redistribution system, a device cluster and a storage medium for implementing the above scheme in a matching manner are also provided below.
The present application also provides a database redistribution system. As shown in Fig. 7, the database redistribution system 700 includes a cloud platform 710, a redistribution device 720, and an original data node 730.
Cloud platform 710 for receiving a redistribution instruction from a user.
Cloud platform 710 is also used to activate the redistribution device.
The redistribution device 720 is configured to instruct the original data node to mount the cloud hard disk.
The original data node 730 is configured to store a first data table of the original data node as a second data table of the cloud hard disk.
The redistribution device 720 is further configured to redistribute the second data table in the cloud hard disk to the original data node and the new data node.
In one possible implementation, the original data node 730 is configured to delete the first data table in the original data node.
In one possible implementation, cloud platform 710 is configured to prompt a user whether to redistribute data if the data storage utilization of the original data node is greater than a target threshold.
In one possible implementation, cloud platform 710 is used to deploy new data nodes.
In one possible implementation, the original data node 730 is used to offload cloud hard disk.
Cloud platform 710, redistribution means 720, and primary data nodes 730 may all be implemented by software, or may be implemented by hardware. Illustratively, an implementation of cloud platform 710 is described next. Similarly, the implementation of the redistribution means 720 and the original data nodes 730 may refer to the implementation of the cloud platform 710.
Modules as one example of a software functional unit, cloud platform 710 may include code that runs on a computing instance. Wherein the computing instance may be at least one of a physical host (computing device), a virtual machine, a container, etc. computing device. Further, the computing device may be one or more. For example, cloud platform 710 may include code that runs on multiple hosts/virtual machines/containers. It should be noted that, multiple hosts/virtual machines/containers for running the application may be distributed in the same region, or may be distributed in different regions. Multiple hosts/virtual machines/containers for running the code may be distributed among the same AZ or among different AZs, each AZ including one data center or multiple geographically close data centers. Wherein typically a region may comprise a plurality of AZs.
Likewise, the multiple hosts, virtual machines, or containers that run the code may be distributed in the same virtual private cloud (VPC) or across multiple VPCs. A VPC is typically placed within one region. For communication between two VPCs in the same region, and for cross-region communication between VPCs in different regions, a communication gateway is set up in each VPC, and the VPCs are interconnected through these communication gateways.
As an example of a hardware functional unit, cloud platform 710 may include at least one computing device, such as a server. Alternatively, cloud platform 710 may be a device implemented using an application-specific integrated circuit (ASIC) or a programmable logic device (PLD). The PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.
Cloud platform 710 may include multiple computing devices distributed in the same region or in different regions, and in the same AZ or in different AZs. Likewise, the multiple computing devices included in cloud platform 710 may be distributed in the same VPC or across multiple VPCs. The multiple computing devices may be any combination of computing devices such as servers, ASICs, PLDs, CPLDs, FPGAs, and GALs.
As can be seen from the above, and as shown in fig. 7, in the database redistribution system provided by the present application, cloud platform 710 starts redistribution device 720 when it receives a data redistribution instruction from a user. Redistribution device 720 instructs original data node 730 to mount a cloud hard disk. Original data node 730 creates a second data table in the cloud hard disk and copies the data of the first data table to the second data table. Redistribution device 720 then redistributes the second data table in the cloud hard disk to the original data node and the new data node, which completes the redistribution of the database and improves disk space utilization.
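The overall flow summarized above can be sketched in Python with in-memory stand-ins. This is a simplified illustration, not the application's implementation: tables are plain lists of row tuples, the cloud hard disk is a dictionary, and the placement rule (CRC32 of the first column modulo the number of nodes) is an assumption, since the text does not specify how rows are assigned to nodes. The names `DataNode` and `redistribute` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional
from zlib import crc32


@dataclass
class DataNode:
    name: str
    tables: dict = field(default_factory=dict)  # table name -> list of row tuples
    cloud_disk: Optional[dict] = None           # mounted cloud hard disk, if any


def redistribute(original: DataNode, new_nodes: list, table: str) -> None:
    # The redistribution device instructs the original node to mount the cloud disk.
    original.cloud_disk = {}
    # The original node stores the first table as a second table on the cloud disk,
    staged = table + "_staged"
    original.cloud_disk[staged] = list(original.tables[table])
    # then deletes the first table to free local space.
    del original.tables[table]
    # The redistribution device spreads the staged rows over all nodes.
    targets = [original] + list(new_nodes)
    for node in targets:
        node.tables[table] = []
    for row in original.cloud_disk[staged]:
        # Assumed placement rule: CRC32 of the first column modulo node count.
        index = crc32(repr(row[0]).encode()) % len(targets)
        targets[index].tables[table].append(row)
    # The original node unmounts the cloud disk once redistribution is done.
    original.cloud_disk = None


# Hypothetical cluster: one full original node and one newly deployed node.
dn1 = DataNode("dn1", {"orders": [(i, f"item-{i}") for i in range(100)]})
dn2 = DataNode("dn2")
redistribute(dn1, [dn2], "orders")
```

Staging the table on the cloud disk before deleting the local copy is what lets the original node receive its share of the rows without first needing free local space for a second full copy.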
The present application also provides a computing device 800. As shown in fig. 8, computing device 800 includes bus 802, processor 804, memory 806, and communication interface 808. Processor 804, memory 806, and communication interface 808 communicate with one another via bus 802. Computing device 800 may be a server or a terminal device. It should be understood that the present application does not limit the number of processors or memories in computing device 800.
Bus 802 may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. Buses may be classified into address buses, data buses, control buses, and so on. For ease of illustration, only one line is shown in fig. 8, but this does not mean that there is only one bus or only one type of bus. Bus 802 may include a path for transferring information between the components of computing device 800 (e.g., memory 806, processor 804, communication interface 808).
The processor 804 may include any one or more of a central processing unit (CPU), a graphics processing unit (GPU), a microprocessor (MP), or a digital signal processor (DSP).
Memory 806 may include volatile memory, such as random access memory (RAM). The memory 806 may also include non-volatile memory, such as read-only memory (ROM), flash memory, a hard disk drive (HDD), or a solid-state drive (SSD).
The memory 806 stores executable code, and the processor 804 executes the executable code to implement the functions of cloud platform 710, redistribution device 720, and original data node 730, respectively, thereby implementing the database redistribution method. That is, the memory 806 stores instructions for performing the database redistribution method.
The communication interface 808 enables communication between the computing device 800 and other devices or communication networks using a transceiver module such as, but not limited to, a network interface card, transceiver, or the like.
The embodiment of the application also provides a computing device cluster. The cluster of computing devices includes at least one computing device. The computing device may be a server, such as a central server, an edge server, or a local server in a local data center. In some embodiments, the computing device may also be a terminal device such as a desktop, notebook, or smart phone.
As shown in fig. 9, the cluster of computing devices includes at least one computing device 800. The same instructions for performing the database redistribution method may be stored in memory 806 in one or more computing devices 800 in the computing device cluster.
In some possible implementations, portions of instructions for performing the database redistribution method may also be stored separately in the memory 806 of one or more computing devices 800 in the computing device cluster. In other words, a combination of one or more computing devices 800 may collectively execute instructions for performing a database redistribution method.
It should be noted that the memory 806 in different computing devices 800 in the computing device cluster may store different instructions, each for performing part of the functions of cloud platform 710, redistribution device 720, and original data node 730.
In some possible implementations, the computing devices in the computing device cluster may be connected through a network, which may be a wide area network, a local area network, or the like. Fig. 10 shows one possible implementation. As shown in fig. 10, two computing devices 800A and 800B are connected through a network; specifically, each computing device connects to the network through its communication interface. In this type of implementation, the memory 806 in different computing devices 800 in the computing device cluster may store different instructions for performing part of the functions of the database redistribution system. That is, the instructions stored in the memory 806 of different computing devices 800 may implement the functions of one or more of cloud platform 710, redistribution device 720, and original data node 730.
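The idea that each computing device hosts only some of the components can be illustrated with a short Python sketch. The device labels 800A and 800B follow fig. 10, but the particular assignment of components to devices and the helper name `missing_components` are assumptions made for illustration.

```python
# Components of the database redistribution system named in the description.
COMPONENTS = {"cloud_platform", "redistribution_device", "original_data_node"}


def missing_components(placement):
    """Return the components that no computing device in the cluster hosts."""
    hosted = set()
    for components in placement.values():
        hosted.update(components)
    return COMPONENTS - hosted


# Hypothetical split: 800A runs two components, 800B runs the third.
placement = {
    "800A": {"cloud_platform", "redistribution_device"},
    "800B": {"original_data_node"},
}
uncovered = missing_components(placement)
```

An empty result means the devices together hold instructions for every component, which is the condition under which the cluster as a whole can perform the method.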
Embodiments of the present application also provide a computer program product comprising instructions. The computer program product may be software or a program product that contains instructions and that can run on a computing device or be stored in any available medium. When the computer program product runs on at least one computing device, it causes the at least one computing device to perform the database redistribution method.
The embodiment of the application also provides a computer-readable storage medium. The computer-readable storage medium may be any available medium that can be stored by a computing device, or a data storage device, such as a data center, that contains one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid-state drive). The computer-readable storage medium includes instructions that instruct a computing device to perform the database redistribution method.
The foregoing describes specific embodiments of the present application. The above embodiments only illustrate the technical solution of the present application and do not limit it. Although the application has been described in detail with reference to the foregoing embodiments, those skilled in the art will appreciate that variations may be made to the embodiments described above, or equivalents may be substituted for elements thereof. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to depart from the protection scope of the technical solutions of the embodiments of the present application.

Claims (16)

1. A method of database redistribution, the method comprising:
the cloud platform receives a redistribution instruction of a user;
the cloud platform starts a redistribution device;
the redistribution device instructs the original data node to mount the cloud hard disk;
the original data node stores a first data table of the original data node as a second data table of the cloud hard disk;
the redistribution device redistributes the second data table in the cloud hard disk to the original data node and the new data node.
2. The method according to claim 1, characterized in that the method comprises:
the original data node deletes the first data table in the original data node.
3. The method according to claim 1 or 2, characterized in that the method comprises:
if the data storage utilization of the original data node is greater than a target threshold, the cloud platform prompts the user whether to redistribute the data.
4. A method according to claim 3, wherein the target threshold is a value greater than one half.
5. The method of any of claims 1 to 4, wherein the original data node stores a first data table of the original data node as a second data table of the cloud hard disk, comprising:
the original data node creates a second data table in the cloud hard disk;
the original data node copies table data of the first data table to the second data table.
6. The method according to any one of claims 1 to 5, characterized in that the method comprises:
The cloud platform deploys new data nodes.
7. The method according to any one of claims 1 to 6, characterized in that the method comprises:
the original data node unmounts the cloud hard disk.
8. A system for database redistribution, the system comprising:
The cloud platform is used for receiving a redistribution instruction of a user;
the cloud platform is also used for starting the redistribution device;
the redistribution device is used for instructing the original data node to mount the cloud hard disk;
the original data node is used for storing a first data table of the original data node as a second data table of the cloud hard disk;
The redistribution device is further configured to redistribute the second data table in the cloud hard disk to the original data node and the new data node.
9. The system of claim 8, wherein the database redistribution system comprises:
the original data node is configured to delete the first data table in the original data node.
10. The system according to claim 8 or 9, wherein the database redistribution system comprises:
the cloud platform is used for prompting a user whether to redistribute the data if the data storage utilization of the original data node is greater than a target threshold.
11. The system according to any one of claims 8 to 10, wherein the original data node is configured to store a first data table of the original data node as a second data table of the cloud hard disk, comprising:
the original data node is configured to create a second data table in the cloud hard disk;
the original data node is further configured to copy table data of the first data table to the second data table.
12. The system according to any one of claims 8 to 11, wherein the database redistribution system comprises:
the cloud platform is used for deploying new data nodes.
13. The system according to any one of claims 8 to 12, wherein the database redistribution system comprises:
the original data node is used for unmounting the cloud hard disk.
14. A cluster of computing devices, comprising at least one computing device, each computing device comprising a processor and a memory;
The processor of the at least one computing device is configured to execute instructions stored in the memory of the at least one computing device to cause the cluster of computing devices to perform the method of any one of claims 1 to 7.
15. A computer program product containing instructions that, when executed by a cluster of computing devices, cause the cluster of computing devices to perform the method of any of claims 1 to 7.
16. A computer readable storage medium comprising computer program instructions which, when executed by a cluster of computing devices, perform the method of any of claims 1 to 7.
CN202211472353.2A 2022-11-23 2022-11-23 Method, system, equipment cluster and storage medium for redistributing database Pending CN118113774A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202211472353.2A CN118113774A (en) 2022-11-23 2022-11-23 Method, system, equipment cluster and storage medium for redistributing database
PCT/CN2023/125968 WO2024109415A1 (en) 2022-11-23 2023-10-23 Database redistribution method and system, and device cluster and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211472353.2A CN118113774A (en) 2022-11-23 2022-11-23 Method, system, equipment cluster and storage medium for redistributing database

Publications (1)

Publication Number Publication Date
CN118113774A true CN118113774A (en) 2024-05-31

Family

ID=91195157

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211472353.2A Pending CN118113774A (en) 2022-11-23 2022-11-23 Method, system, equipment cluster and storage medium for redistributing database

Country Status (2)

Country Link
CN (1) CN118113774A (en)
WO (1) WO2024109415A1 (en)

Also Published As

Publication number Publication date
WO2024109415A1 (en) 2024-05-30

Legal Events

Date Code Title Description
PB01 Publication