CN116150160A - Adjustment method and device for database cluster processing nodes and storage medium - Google Patents


Info

Publication number
CN116150160A
Authority
CN
China
Prior art keywords
target
processing node
data
node
database cluster
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310404326.XA
Other languages
Chinese (zh)
Other versions
CN116150160B (en
Inventor
杨刚
任王义
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Gushu Polytron Technologies Inc
Original Assignee
Beijing Gushu Polytron Technologies Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Gushu Polytron Technologies Inc filed Critical Beijing Gushu Polytron Technologies Inc
Priority to CN202310404326.XA priority Critical patent/CN116150160B/en
Publication of CN116150160A publication Critical patent/CN116150160A/en
Application granted granted Critical
Publication of CN116150160B publication Critical patent/CN116150160B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a method, a device and a storage medium for adjusting a database cluster processing node, wherein the method comprises the following steps: in response to receiving a processing node adjustment request, adjusting the number of initial processing nodes in a database cluster to obtain at least one target processing node, determining at least one target logical partition of target data, distributing the target logical partition to each target processing node according to a preset distribution strategy to obtain a mapping relation between the target logical partition and the target processing nodes, wherein the mapping relation is used for enabling each target processing node to determine a corresponding target logical partition, and determining at least one corresponding data block according to index information of at least one data block included in the corresponding target logical partition so as to perform data processing on the corresponding data block. The method can rapidly complete the elastic expansion of the database cluster, and solves the problem that the elastic expansion of the database cluster cannot be rapidly completed in the related technology.

Description

Adjustment method and device for database cluster processing nodes and storage medium
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a method and an apparatus for adjusting a database cluster processing node, and a storage medium.
Background
In a distributed database cluster, user data is typically stored and processed by database nodes in the database cluster.
In the related art, an analytical distributed database cluster generally adopts a Shared-Nothing architecture, in which the user data that each database node stores and processes is fixed. When the scale of the database cluster is adjusted, the user data held by the database nodes before the adjustment must be migrated across all the database nodes after the adjustment, so that the user data remains evenly distributed over the adjusted database nodes.
However, data migration is often time-consuming, so the adjusted database cluster cannot quickly resume providing services externally. That is, the related art has the technical problem that elastic expansion and contraction of the database cluster cannot be completed quickly.
Disclosure of Invention
The embodiment of the application provides a method, a device and a storage medium for adjusting a database cluster processing node, which are used for solving the technical problem that the elastic expansion of a database cluster cannot be completed rapidly in the prior art.
In a first aspect, the present application provides a method for adjusting a processing node of a database cluster, where the database cluster includes at least one initial processing node, and the method is applied to a coordinating node, and includes: in response to receiving the processing node adjustment request, adding or deleting a preset number of processing nodes in the at least one initial processing node to obtain at least one target processing node; determining at least one target logic fragment of target data, wherein each target logic fragment comprises index information of at least one data block in the target data; distributing at least one target logic fragment to each target processing node according to a preset distribution strategy so as to acquire a mapping relation between the target logic fragment and the target processing node; the mapping relationship is used for enabling each target processing node to determine a corresponding target logic fragment, and determining at least one corresponding data block according to index information included in the corresponding target logic fragment so as to perform data processing on the at least one corresponding data block.
In a second aspect, the present application provides a method for adjusting a processing node of a database cluster, where the database cluster includes at least one initial processing node, and the method is applied to a coordinating node determining device, and the method includes: load information of at least one initial processing node is obtained; determining a coordination node from at least one initial processing node according to the load information; the coordination node is used for responding to the received processing node adjustment request, and adding or deleting a preset number of processing nodes in at least one initial processing node so as to acquire at least one target processing node; determining at least one target logic fragment of target data, wherein each target logic fragment comprises index information of at least one data block in the target data; distributing at least one target logic fragment to each target processing node according to a preset distribution strategy so as to acquire a mapping relation between the target logic fragment and the target processing node; the mapping relation is used for enabling at least one target processing node to respectively determine corresponding target logic fragments, and determining at least one corresponding data block according to index information included in the corresponding target logic fragments so as to perform data processing on the at least one corresponding data block.
In a third aspect, the present application provides an adjustment apparatus for a processing node of a database cluster, where the database cluster includes at least one initial processing node, the apparatus includes: the first acquisition module is used for adding or deleting a preset number of processing nodes in at least one initial processing node in response to receiving the processing node adjustment request so as to acquire at least one target processing node; the first determining module is used for determining at least one target logic fragment of target data, wherein each target logic fragment comprises index information of at least one data block in the target data; the second acquisition module is used for distributing at least one target logic fragment to each target processing node according to a preset distribution strategy so as to acquire the mapping relation between the target logic fragment and the target processing node; the mapping relationship is used for enabling each target processing node to determine a corresponding target logic fragment, and determining at least one corresponding data block according to index information included in the corresponding target logic fragment so as to perform data processing on the at least one corresponding data block.
In a fourth aspect, the present application provides an adjustment apparatus for a processing node of a database cluster, where the database cluster includes at least one initial processing node, the apparatus includes: the third acquisition module is used for acquiring the load information of at least one initial processing node; the second determining module is used for determining a coordination node from at least one initial processing node according to the load information; the coordination node is used for responding to the received processing node adjustment request, and adding or deleting a preset number of processing nodes in at least one initial processing node so as to acquire at least one target processing node; determining at least one target logic fragment of target data, wherein each target logic fragment comprises index information of at least one data block in the target data; distributing at least one target logic fragment to each target processing node according to a preset distribution strategy so as to acquire a mapping relation between the target logic fragment and the target processing node; the mapping relationship is used for enabling each target processing node to determine a corresponding target logic fragment, and determining at least one corresponding data block according to index information included in the corresponding target logic fragment so as to perform data processing on the at least one corresponding data block.
In a fifth aspect, the present application provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of the method for adjusting a database cluster processing node according to either of the first and second aspects.
In a sixth aspect, the present application provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method for adjusting a database cluster processing node according to either of the first and second aspects.
The embodiments of the present application provide a method, a device and a storage medium for adjusting database cluster processing nodes. The database cluster comprises at least one initial processing node, and the method is applied to a coordination node and comprises the following steps: in response to receiving the processing node adjustment request, adding or deleting a preset number of processing nodes in the at least one initial processing node to obtain at least one target processing node; determining at least one target logical partition of target data, wherein each target logical partition comprises index information of at least one data block in the target data; and distributing the at least one target logical partition to each target processing node according to a preset distribution strategy so as to acquire a mapping relationship between the target logical partitions and the target processing nodes. The mapping relationship is used for enabling each target processing node to determine its corresponding target logical partition, and to determine at least one corresponding data block according to the index information included in that target logical partition so as to perform data processing on the at least one corresponding data block. In this embodiment, the target logical partition is a virtual data file that includes index information of the data blocks rather than real user data. Therefore, when the processing nodes of the database cluster are adjusted, the data blocks corresponding to each target processing node can be determined from the mapping relationship between the target logical partitions and the target processing nodes together with the index information of the data blocks in each target logical partition. That is, adjusting the processing nodes in the database cluster does not involve migrating the target data, yet the distribution of the target data among the adjusted target processing nodes is still completed.
Therefore, the time for data migration can be saved, the adjustment of the processing nodes in the database cluster can be rapidly completed, the elastic expansion of the database cluster can be rapidly realized, and the technical problem that the elastic expansion of the database cluster cannot be rapidly completed in the related technology is solved.
Drawings
For a clearer description of the present application or of the prior art, the drawings that are used in the description of the embodiments or of the prior art will be briefly described, it being apparent that the drawings in the description below are some embodiments of the present application, and that other drawings may be obtained from these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of an application scenario of a method for adjusting a database cluster processing node according to an embodiment of the present application;
FIG. 2 is a flowchart of a method for adjusting a database cluster processing node according to an embodiment of the present disclosure;
FIG. 3 is a flowchart of a method for adjusting a database cluster processing node according to another embodiment of the present disclosure;
FIG. 4 is a schematic diagram of a database cluster structure before adjustment of a processing node according to an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of a database cluster structure after adjustment of a processing node according to an embodiment of the present disclosure;
FIG. 6 is a schematic structural diagram of an adjustment apparatus for a database cluster processing node according to an embodiment of the present disclosure;
FIG. 7 is a schematic structural diagram of an adjustment apparatus for a database cluster processing node according to another embodiment of the present application;
FIG. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the present application more apparent, the technical solutions in the present application will be clearly and completely described below with reference to the drawings in the present application, and it is apparent that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.
In the related art, a conventional analytical distributed database cluster is generally built on a Shared-Nothing architecture, under which each database node has independent computing and storage resources and the database nodes communicate with each other over a high-speed interconnection network. In this mode, each database instance is only responsible for managing the data located on its local storage device and does not compete for shared resources. However, the user data corresponding to each database node is fixed, and the computing and storage functions for that data are bound together, so a cluster with this compute-storage-bound architecture has poor dynamic scalability. When the cluster scale is adjusted, data migration is required to ensure that the adjusted cluster still delivers good query and analysis performance: the existing data is redistributed over all the database nodes so that it is evenly distributed. That is, in this mode, redistributing the data costs the time spent on data migration or handling. The larger the data volume, the longer the migration takes; in some practical applications it usually takes days or weeks. Because the migration takes so long, the cluster cannot quickly resume providing services externally after its scale is adjusted. Therefore, in the related art, the user cannot flexibly adjust the cluster scale according to actual requirements and the load on the database nodes while keeping the resource and time cost of data migration low. That is, the related art has the technical problem that elastic expansion and contraction of the database cluster cannot be completed quickly.
To solve the problems in the related art, migration of user data between different processing nodes must be avoided when the processing nodes of the database cluster are adjusted. Specifically, the target logical partitions of the target data are determined, a mapping relationship between the target processing nodes and the target logical partitions is established, and the data blocks corresponding to each adjusted target processing node are determined according to the mapping relationship and the index information of the data blocks in the target logical partitions. In this way the adjustment of the processing nodes in the database cluster is completed, and elastic expansion and contraction of the database cluster is realized.
The following describes an application scenario of the method for adjusting a database cluster processing node. FIG. 1 is a schematic diagram of an application scenario of the method according to an embodiment of the present application. As shown in FIG. 1, the application scenario includes a terminal device 101, a load balancing module 102, a computing module 103, and a data storage module 104. The terminal device 101 is communicatively connected to the load balancing module 102, and the load balancing module 102 is communicatively connected to each of at least one processing node 105 in the computing module 103. Each processing node 105 corresponds to its respective logical partition 106, each logical partition 106 includes at least one data segment 107, and each data segment 107 includes at least one data block 108. Each logical partition 106 and each data segment 107 is management metadata, whereas the at least one data block 108 of the target data is real user data.
Specifically, both the management metadata and the target data are stored in the shared data storage module 104. The load balancing module 102 receives the processing node adjustment request sent by the terminal device 101 and determines a coordination node from the at least one initial processing node; the coordination node then receives the processing node adjustment request, adjusts the number of initial processing nodes, and determines the mapping relationship between the target logical partitions and the target processing nodes. It should be noted that the method of the present application is applicable at least to scenarios in which all cluster nodes share storage, for example local server cluster deployment, cloud platform deployment, and containerized deployment.
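The metadata hierarchy of FIG. 1 (logical partition 106, data segment 107, data block 108) can be illustrated with a minimal sketch. The class names, field names and the string form of a block location are assumptions made for illustration only; the application does not specify a concrete data layout.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class BlockIndex:
    """Index information for one data block (hypothetical fields)."""
    block_id: str
    location: str  # assumed: a path or object key in the shared storage


@dataclass
class DataSegment:
    """Management metadata: groups the index entries of one or more blocks."""
    segment_id: str
    blocks: List[BlockIndex] = field(default_factory=list)


@dataclass
class LogicalPartition:
    """Management metadata: a virtual file that holds no user data itself."""
    partition_id: str
    segments: List[DataSegment] = field(default_factory=list)

    def block_locations(self) -> List[str]:
        # Walk partition -> segments -> block index entries.
        return [b.location for s in self.segments for b in s.blocks]
```

Because only these small metadata objects are reassigned between processing nodes, the data blocks themselves can stay where they are in the shared storage.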
The following describes the technical solution of the present application and how the technical solution of the present application solves the above technical problems in detail with specific embodiments with reference to the accompanying drawings.
Fig. 2 is a flowchart of a method for adjusting a database cluster processing node according to an embodiment of the present application, where an execution body of the method in this embodiment is a coordinating node. The coordinating node may be one of at least one initial processing node of the database cluster, or may be a node set in the database cluster according to the requirements. Referring to fig. 2, the method may include the steps of:
S100, in response to receiving the processing node adjustment request, adding or deleting a preset number of processing nodes in at least one initial processing node to acquire at least one target processing node.
The processing node adjustment request includes at least an instruction to add or delete a preset number of processing nodes among the at least one initial processing node. The initial processing nodes are the processing nodes before the processing node adjustment is performed on the database cluster, and the target processing nodes are the processing nodes obtained after the preset number of processing nodes has been added to or deleted from the at least one initial processing node.
For example, each initial processing node in the database cluster is deployed separately, each initial processing node may be a data processing process, or may be another module carrying a data processing program, and computing capacity and memory of each processing node may be the same, or different memory and computing capacity may be configured for each processing node according to a user requirement. Each initial processing node is located in a computing module, and the target data is stored in a data storage module, which is a shared module accessible to each target processing node. Each initial processing node is only used to process the target data and is not responsible for storing the target data, i.e. the data processing and data storage functions of each initial processing node are separate. The plurality of processing nodes in the computing module are deployed in a stateless mode, so that each node can be added and removed rapidly, and the cluster building efficiency is improved.
In some embodiments, adding the preset number of processing nodes to the database cluster may be done by starting multiple groups of database processes, by starting multiple database containers, or by another method for adding processing nodes, which is not limited herein.
In one embodiment, the coordination node receives the processing node adjustment request sent by a request sending device, and adds or deletes processing nodes in the database cluster according to the preset number of processing nodes to be added or deleted that is specified in the processing node adjustment request. The request sending device may be a resource monitoring auxiliary module or a terminal device.
The resource monitoring auxiliary module runs a resource monitoring auxiliary program that monitors the computing resource occupancy rate of the database cluster. When the computing resource occupancy rate exceeds the upper limit of a preset resource occupancy rate range, the program sends the coordination node a processing node adjustment request for adding the preset number of processing nodes; when the computing resource occupancy rate falls below the lower limit of the preset resource occupancy rate range, it sends the coordination node a processing node adjustment request for deleting the preset number of processing nodes. Illustratively, the terminal device receives user input and generates a processing node adjustment request, which may be a Structured Query Language (SQL) request.
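The threshold rule applied by the resource monitoring auxiliary program can be sketched as follows. The function name, the dictionary shape of the request and the example thresholds are hypothetical; the application only states that a request is sent when the occupancy rate leaves the preset range.

```python
from typing import Optional


def adjustment_request(occupancy: float, lower: float, upper: float,
                       preset_number: int) -> Optional[dict]:
    """Return a request for the coordination node, or None if no change is needed.

    The request format is an assumption made for this sketch.
    """
    if lower > upper:
        raise ValueError("lower limit must not exceed upper limit")
    if occupancy > upper:
        # Above the preset range: ask for more processing nodes.
        return {"action": "add", "count": preset_number}
    if occupancy < lower:
        # Below the preset range: ask for fewer processing nodes.
        return {"action": "delete", "count": preset_number}
    return None
```

For instance, with a preset range of 0.3 to 0.8, an occupancy rate of 0.9 would produce an "add" request and 0.1 would produce a "delete" request.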
S102, determining at least one target logical partition of target data, wherein each target logical partition comprises index information of at least one data block in the target data.
The target data are user data stored and processed in the database cluster. The target logical partition is obtained by dividing target data, is a virtual logical file comprising index information of corresponding data blocks, and the data blocks are the minimum units of real data obtained by dividing the target data. The size of the data block can be set according to the requirements of users.
In one embodiment, whether the target data needs to be repartitioned to obtain at least one target logical partition is determined based on the total number of initial logical partitions in the database cluster and the total number of target processing nodes. Each target logical partition includes index information of the data blocks assigned to that target logical partition. The initial logical partitions are the logical partitions before the processing node adjustment is performed on the database cluster.
S104, distributing the at least one target logical partition to each target processing node according to a preset distribution strategy so as to acquire the mapping relationship between the target logical partitions and the target processing nodes; the mapping relationship is used for enabling each target processing node to determine its corresponding target logical partition, and to determine at least one corresponding data block according to the index information included in that target logical partition so as to perform data processing on the at least one corresponding data block.
In one embodiment, the preset allocation policy is to allocate a preset number of target logical partitions to each target processing node. The preset allocation policy may also be set according to user requirements; for example, when the target processing nodes have different amounts of memory, the number of target logical partitions allocated to each target processing node may be determined according to its memory size.
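One way a memory-based policy could work is sketched below. Proportional allocation with a largest-remainder rule is an assumption chosen for illustration; the application says only that the number of partitions per node may depend on memory size. The function name and the unit (GB) are hypothetical.

```python
from typing import Dict


def partitions_per_node(total_partitions: int,
                        memory_gb: Dict[str, int]) -> Dict[str, int]:
    """Split a partition count across nodes in proportion to their memory."""
    total_memory = sum(memory_gb.values())
    if total_memory <= 0:
        raise ValueError("total memory must be positive")
    # Integer floor of each node's proportional share.
    counts = {node: total_partitions * mem // total_memory
              for node, mem in memory_gb.items()}
    leftover = total_partitions - sum(counts.values())
    # Give the remaining partitions to the nodes with the largest remainders.
    by_remainder = sorted(
        memory_gb,
        key=lambda node: (total_partitions * memory_gb[node]) % total_memory,
        reverse=True,
    )
    for node in by_remainder[:leftover]:
        counts[node] += 1
    return counts
```

With equal memory on every node this reduces to an even split, which matches the even allocation policy described elsewhere in the embodiments.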
Illustratively, a mapping relationship between the target logical partition and the target processing node is constructed according to the identification of the target processing node and the identification of the target logical partition.
Illustratively, the index information of the data block includes at least location information of the data block. The target processing node can determine the corresponding target logical partition according to the mapping relation, determine the position of the data block according to the index information of the data block in the target logical partition, and access the data in the database and perform corresponding processing on the data based on the position of the data block.
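The mapping relationship and the lookup performed by a target processing node can be sketched as follows. Round-robin assignment and the plain-dictionary representation are assumptions for illustration; the application states only that the mapping is built from the identifiers of the target processing nodes and target logical partitions.

```python
from typing import Dict, List


def build_mapping(partition_ids: List[str],
                  node_ids: List[str]) -> Dict[str, List[str]]:
    """Map each node identifier to the partition identifiers it handles."""
    if not node_ids:
        raise ValueError("at least one target processing node is required")
    mapping: Dict[str, List[str]] = {node_id: [] for node_id in node_ids}
    for k, partition_id in enumerate(partition_ids):
        # Round-robin is an assumed policy, not taken from the source.
        mapping[node_ids[k % len(node_ids)]].append(partition_id)
    return mapping


def blocks_for_node(node_id: str,
                    mapping: Dict[str, List[str]],
                    partition_index: Dict[str, List[str]]) -> List[str]:
    """Resolve a node's partitions to data block locations via index info."""
    return [location
            for partition_id in mapping[node_id]
            for location in partition_index[partition_id]]
```

Here `partition_index` stands in for the index information stored in each target logical partition: a node reads block locations from it and then accesses the blocks in shared storage, so no block is copied when the mapping changes.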
In this alternative embodiment, the target logical partition is a virtual data file that includes index information of the data blocks rather than real user data. Therefore, when the processing nodes of the database cluster are adjusted, the data blocks corresponding to each target processing node can be determined from the mapping relationship between the target logical partitions and the target processing nodes together with the index information of the data blocks in each target logical partition. That is, adjusting the processing nodes in the database cluster does not involve migrating the target data, yet the distribution of the target data among the adjusted target processing nodes is still completed. As a result, the time for data migration is saved, the adjustment of the processing nodes in the database cluster can be completed quickly, elastic expansion and contraction of the database cluster can be realized quickly, and the technical problem in the related art that elastic expansion and contraction of the database cluster cannot be completed quickly is solved.
In some alternative embodiments, determining at least one target logical partition of target data includes:
S200, determining, according to the total number of initial logical partitions in the database cluster and the total number of target processing nodes, whether the initial logical partitions can be allocated to each target processing node according to the preset allocation policy, wherein the initial logical partitions are the logical partitions before the processing node adjustment is performed on the database cluster.
For example, the preset allocation policy is to allocate the logical partitions evenly to each target processing node. If the total number of initial logical partitions is 2 and the total number of target processing nodes is 3, it is determined that the requirement of the preset allocation policy is not met; if the total number of initial logical partitions is 6 and the total number of target processing nodes is 3, it is determined that the requirement of the preset allocation policy is met.
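The check in S200 under an even allocation policy can be sketched as a divisibility test. Treating "can be evenly allocated" as "the partition count is a positive multiple of the node count" is an assumption that is consistent with the two examples above; the application does not state the exact criterion.

```python
def satisfies_even_allocation(total_partitions: int, total_nodes: int) -> bool:
    """Return True if the partitions can be split evenly across the nodes."""
    if total_nodes <= 0:
        raise ValueError("total_nodes must be positive")
    # Assumed criterion: a positive partition count divisible by the node count.
    return total_partitions > 0 and total_partitions % total_nodes == 0
```

When this returns False, the flow continues with regrouping the data blocks (S202); when it returns True, the initial logical partitions are reused as they are (S204).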
S202, if not, executing the following operations: determining the target number of data blocks corresponding to each target processing node according to the total number of target processing nodes and the total number of data blocks, grouping the at least one data block according to the target number, and generating at least one target logical partition according to the grouping result.
In one embodiment, the preset allocation policy includes allocating the logical partitions evenly to each target processing node. Correspondingly, the target number $B_{new}$ of data blocks corresponding to each target processing node may be determined as follows:
$$B_{new} = \frac{\sum_{i=1}^{n} \sum_{j=1}^{P_n(i)} B(P(i,j))}{n + m}$$
wherein each initial processing node manages a fragment table; $n$ represents the total number of initial processing nodes; $i$ represents the $i$-th initial processing node; $P(i,j)$ represents the $j$-th segment file in the fragment table managed by the $i$-th initial processing node; $B(P(i,j))$ represents the number of data blocks contained in that segment file; and $P_n(i)$ represents the total number of segment files in the fragment table managed by the $i$-th initial processing node. $m$ represents the preset number of processing nodes added or deleted: $m$ takes a positive value if processing nodes are added and a negative value if they are deleted. The target number $B_{new}$ is thus equal to the ratio of the total number of data blocks in the database cluster to the total number of target processing nodes.
For example, if the database cluster includes 6 data blocks, where the total number of nodes of the target processing nodes is 3, the target number of data blocks allocated to each target processing node is 2. The 6 data blocks may be grouped in a group of 2 by means of random grouping, thereby obtaining 3 target logical slices, each of which includes index information of 2 data blocks corresponding thereto. When the data blocks are grouped, a random grouping method may be used, or the grouping method may be set according to the user requirement, which is not limited herein.
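The calculation of the target number and the grouping step can be sketched as below. This is only an illustration under stated assumptions: the helper names `target_block_count` and `group_blocks`, the dictionary layout of the fragment tables, and the refusal to handle a remainder are all assumptions, since the source does not say how an uneven total is treated.

```python
import random
from typing import Dict, List, Optional


def target_block_count(segment_block_counts: Dict[int, List[int]], m: int) -> int:
    """Target number of blocks per node: total blocks / (n + m).

    segment_block_counts maps an initial node id to the block counts of the
    segment files in its fragment table (hypothetical layout).
    m > 0 adds nodes, m < 0 deletes nodes.
    """
    n = len(segment_block_counts)
    target_nodes = n + m
    if target_nodes <= 0:
        raise ValueError("at least one target processing node must remain")
    total_blocks = sum(sum(counts) for counts in segment_block_counts.values())
    # Assumption: the total divides evenly; remainder handling is not specified.
    if total_blocks % target_nodes != 0:
        raise ValueError("blocks cannot be split evenly across target nodes")
    return total_blocks // target_nodes


def group_blocks(block_ids: List[str], b_new: int,
                 seed: Optional[int] = None) -> List[List[str]]:
    """Randomly group block identifiers into target logical partitions."""
    if b_new <= 0:
        raise ValueError("b_new must be positive")
    shuffled = list(block_ids)
    random.Random(seed).shuffle(shuffled)
    # Each slice holds only block identifiers (index information), not data.
    return [shuffled[k:k + b_new] for k in range(0, len(shuffled), b_new)]
```

For the example above, two initial nodes holding 3 blocks each and `m = 1` give a target number of 2, and grouping the 6 block identifiers yields 3 target logical partitions of 2 entries each.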
S204, if yes, executing the following operation: determining the initial logical partition as the target logical partition.
The execution sequence between step S202 and step S204 is not limited.
Illustratively, the initial logical partition information is stored in a mapping relationship metadata table. If a new target logical partition is generated, the initial logical partition information in the mapping relationship metadata table is deleted and the information of the target logical partition is inserted; otherwise, the table is not modified. The mapping relationship metadata table stores at least the identification information and storage mode of the logical partitions and the position information of the real data blocks.
In this optional embodiment, it is determined whether the initial logical partitions can be allocated to each target processing node in accordance with the preset allocation policy. When they cannot, the target number of data blocks corresponding to each target processing node is determined according to the total number of target processing nodes and the total number of data blocks, the data blocks are grouped according to the target number, and at least one target logical partition is generated according to the grouping result; when they can, the initial logical partitions are determined as the target logical partitions. This ensures that target logical partitions satisfying the preset allocation policy are obtained.
In one embodiment, the preset allocation policy includes: at least one target logical partition is evenly distributed to each target processing node. By evenly distributing at least one target logical partition to each target processing node, it may be ensured that each target processing node reaches a load-balanced state.
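One simple way to realise an even allocation is round-robin assignment, sketched below. Round-robin is an assumption chosen for illustration; the source only requires that the target logical partitions be evenly distributed, and the function name `allocate_partitions` is hypothetical.

```python
from typing import Dict, List


def allocate_partitions(partition_ids: List[str],
                        node_ids: List[str]) -> Dict[str, List[str]]:
    """Assign logical partitions to nodes round-robin; returns node -> partitions."""
    if not node_ids:
        raise ValueError("no target processing nodes")
    mapping: Dict[str, List[str]] = {node: [] for node in node_ids}
    for k, partition in enumerate(partition_ids):
        mapping[node_ids[k % len(node_ids)]].append(partition)
    return mapping
```

The returned dictionary plays the role of the mapping relationship between target logical partitions and target processing nodes.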
In some alternative embodiments, before determining the at least one target logical partition of the target data, the method further comprises: setting the running state of the database cluster to a suspended state, wherein the suspended state indicates that the database cluster stops receiving transaction requests sent to it by devices outside the database cluster, and a transaction request instructs a target processing node to process the data of its at least one corresponding data block.
It should be noted that, when the current state of the database cluster is set to the suspended state, the database cluster refuses to receive new transaction requests. Illustratively, the new transaction requests include SQL requests. It should be appreciated that during elastic scaling of the database cluster (i.e., while the processing nodes of the database cluster are being adjusted), the mapping relationship between the target logical partitions and the target processing nodes needs to be redetermined. During this period a target processing node cannot determine its corresponding data blocks and so may be unable to access the corresponding user data, which would cause transaction execution to fail. In addition, elastic scaling of the database cluster requires cache data solidification. If new data modifications were generated while the cache data is being solidified, the modified data could not be persisted to the data storage module; in that case, even if the transaction executed successfully, the data could still be lost after the elastic scaling of the cluster is completed, and the reliability of the data could not be ensured.
In this alternative embodiment, setting the running state of the database cluster to the suspended state avoids both failed transaction execution and loss of data after execution.
In an alternative embodiment, after the coordinating node sets the current running state of the database cluster to the suspended state, the method further includes: it is checked whether there are currently active transactions and if so, it waits for all active transactions to complete commit or rollback. Thus, the transaction in the activity can be ensured to be executed and completed, and the failure of executing the transaction in the activity is avoided.
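The suspend-then-drain behaviour described above can be sketched with a condition variable. This is a single-process illustration only; the class name `ClusterGate` and its methods are hypothetical, and a real cluster would coordinate this state across nodes.

```python
import threading
from typing import Optional


class ClusterGate:
    """Tracks the running state and the set of active transactions."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._suspended = False
        self._active = set()

    def begin(self, txn_id: int) -> bool:
        """Admit a new transaction unless the cluster is suspended."""
        with self._cond:
            if self._suspended:
                return False  # new requests are refused while suspended
            self._active.add(txn_id)
            return True

    def finish(self, txn_id: int) -> None:
        """Record a commit or rollback and wake any waiter."""
        with self._cond:
            self._active.discard(txn_id)
            self._cond.notify_all()

    def suspend_and_drain(self, timeout: Optional[float] = None) -> bool:
        """Suspend, then wait until no transaction is active.

        Returns False if the timeout expires while transactions remain.
        """
        with self._cond:
            self._suspended = True
            return self._cond.wait_for(lambda: not self._active, timeout)

    def resume(self) -> None:
        """Leave the suspended state so new requests are accepted again."""
        with self._cond:
            self._suspended = False
```

Setting the suspended flag before waiting means no new transaction can join the active set while the existing ones are draining.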
In an alternative embodiment, after the current running state of the database cluster is set to the suspended state, the coordinating node reads, from the processing node metadata table, the node information of the active initial processing nodes in the database system and the connection information of the corresponding initial processing nodes. The coordinating node establishes a communication connection with each initial processing node other than itself based on the node information and the connection information. After the communication connections are established, the coordinating node generates a data cleaning task list and asynchronously sends the data cleaning tasks to all processing nodes, including the coordinating node itself, for execution; each initial processing node executes the corresponding data preprocessing in response to the received data preprocessing instruction.
In an alternative embodiment, before determining at least one target logical partition of the target data, the method further comprises: sending a cache data solidification instruction to at least one initial processing node so that the at least one initial processing node solidifies its corresponding cache data into a data storage module, wherein the cache data is generated after the initial processing node processes, in a preset cache area, the data in its corresponding at least one data block, and the data storage module is used for storing the at least one data block of the target data. Illustratively, the plurality of initial processing nodes correspond one-to-one to a plurality of preset cache areas.
Illustratively, the coordination node is one of the initial processing nodes, and the coordination node generates a data preprocessing instruction in response to receiving the processing node adjustment request and sends the data preprocessing instruction to all the initial processing nodes including the coordination node, so that all the initial processing nodes execute corresponding cache data solidification operation.
In one embodiment, the initial processing node solidifies the corresponding cache data to the data storage module through the following steps. The initial processing node checks the position of the refreshing point in its transaction log. If the position of the refreshing point is the same as the position of the latest transaction log record, all data modifications in memory have already been persisted to disk and no solidification operation is needed. Otherwise, starting from the latest refreshing point, the initial processing node flushes all dirty pages in the preset cache area to the data storage module and updates the position of the refreshing point, thereby completing the persistence of the cache data in memory. For example, the preset cache area may be a shared buffer pool, and the shared buffer pool may be a partition preset in the computing module. While running, the database system records transaction behavior and data changes in a transaction log, and each time the cache data in a preset cache area is solidified it records a refreshing point in the transaction log, which means that all changes made by transactions before the refreshing point have been solidified.
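The comparison of the refreshing point with the latest log position can be sketched as follows. This is a toy model: `NodeCache`, `solidify`, and the use of a dictionary as the data storage module are assumptions made for illustration, not the actual on-disk structures.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class NodeCache:
    """Toy model of one processing node's log positions and buffer pool."""
    latest_log_pos: int
    flush_point: int
    dirty_pages: Dict[int, bytes] = field(default_factory=dict)


def solidify(cache: NodeCache, storage: Dict[int, bytes]) -> bool:
    """Flush dirty pages if the flush point lags the log.

    Returns True if any work was done, False if everything was already persisted.
    """
    if cache.flush_point == cache.latest_log_pos:
        return False  # every modification is already on storage
    storage.update(cache.dirty_pages)          # write dirty pages to storage
    cache.dirty_pages.clear()
    cache.flush_point = cache.latest_log_pos   # record the new flush point
    return True
```

Calling `solidify` a second time without new log records is a no-op, which mirrors the case where the refreshing point already equals the latest log position.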
It should be noted that, to preserve the performance of the database system and avoid frequent disk input/output (I/O), the processing node does not immediately synchronize data modifications such as deletions and updates to the disk for persistence. The method of this embodiment ensures that the cached data is solidified on the data storage device, thereby avoiding data loss.
In some alternative embodiments, before determining the at least one target logical partition of the target data, the method further comprises: transmitting a tuple state modification instruction to at least one initial processing node, so that the at least one initial processing node performs the following operation on all target tuples in its corresponding data blocks, a target tuple being a tuple on which no delete operation has been performed: writing, in the state flag bit of the target tuple, a preset keyword identifying that the target tuple is visible to the target transaction.
In one embodiment, the initial processing nodes in the database cluster adopt a multi-version concurrency control mechanism, and each tuple of the target data has a status flag bit. For example, a DELETE statement sets a tuple to the deleted state through the status flag bit; specifically, a keyword indicating the deleted state is written in the status flag bit. In addition, a tuple may exist in different versions; for example, the new tuple generated by an UPDATE statement and the old tuple it replaces form two versions of the tuple. Deleted tuples and the old tuples left behind after an UPDATE statement is executed are dead tuples. The coordination node sends a dead tuple delete instruction to the initial processing nodes so that they delete these dead tuples completely from the data files.
In one embodiment, the current transaction number of processing node 2 is 610 and the current transaction number of processing node 3 is 503. The transaction that inserted tuple A has transaction number 498 and the transaction that deleted tuple A has transaction number 512; the transaction that inserted tuple B has transaction number 604, and the delete transaction number of tuple B is 0, meaning that no delete operation has been performed on it. The method by which the initial processing node modifies the state of the target tuples to be visible to all transactions, based on the tuple state modification instruction, is further described below in connection with this embodiment. In this embodiment, all transactions have completed commit, and tuple A and tuple B are managed by processing node 2 before the elastic scaling of the database cluster is performed. For processing node 2 (current transaction number 610), the transaction that deleted tuple A (transaction number 512) precedes the current transaction number, so tuple A is a dead tuple and is invisible to the user (it cannot be queried). The transaction that inserted tuple B (transaction number 604) also precedes the current transaction number and no delete operation has been performed on tuple B, so tuple B is visible to the user (it can be queried).
If the logical partitions were repartitioned directly at this point and the management of tuple A and tuple B were taken over by processing node 3 (current transaction number 503), then for processing node 3 the transaction that inserted tuple A (transaction number 498) precedes the current transaction number while the transaction that deleted tuple A (transaction number 512) follows it, so tuple A is visible to the user; the transaction that inserted tuple B (transaction number 604) follows the current transaction number, so tuple B is not visible to the user. The target data in the database system would then be inconsistent: without any data operation, the data queried by the user would change after the elastic scaling, and the data storage would become unreliable. The state of the tuples in the data files therefore needs to be modified. In actual processing, because the database system is not receiving new requests at this time and all transactions have completed commit, no visibility determination needs to be made for the tuples: every tuple except those marked as deleted is a visible target tuple. Specifically, a target tuple may be made visible by writing, in its status flag bit, a preset keyword identifying that the tuple is visible to the target transaction.
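The visibility flip in this example can be reproduced with a small sketch. The rule below is a simplification inferred from the example (a tuple is visible if it was inserted before the node's current transaction number and not deleted before it) and assumes all transactions have committed; `TupleVersion` and `is_visible` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class TupleVersion:
    insert_txn: int
    delete_txn: int = 0  # 0 means no delete operation has been performed


def is_visible(t: TupleVersion, current_txn: int) -> bool:
    """Simplified visibility rule taken from the worked example."""
    inserted = t.insert_txn < current_txn
    deleted = t.delete_txn != 0 and t.delete_txn < current_txn
    return inserted and not deleted


tuple_a = TupleVersion(insert_txn=498, delete_txn=512)
tuple_b = TupleVersion(insert_txn=604)

# Processing node 2 (current transaction number 610): A hidden, B visible.
node2_view = (is_visible(tuple_a, 610), is_visible(tuple_b, 610))
# Processing node 3 (current transaction number 503): A visible, B hidden.
node3_view = (is_visible(tuple_a, 503), is_visible(tuple_b, 503))
```

The two views disagree on both tuples, which is the inconsistency that the tuple state modification is meant to prevent.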
In one embodiment, according to the index information of the tuples, the initial processing node sets to 0 the storage space occupied by the dead tuples marked as deleted in the data files it manages, and sets the flag bits of all visible tuples to be retained to a "cleaned state". The cleaned state is the preset keyword written in the status flag bit, so that the target tuple is visible to all transactions. It should be noted that the preset keyword may be "cleaned state" as in this embodiment, or may be another word set according to user requirements. With continued reference to FIG. 3, tuple A is completely deleted and tuple B is placed in the "cleaned state"; when a visibility determination is subsequently made, tuple B is directly determined to be visible, and its transaction information is not modified until a transaction modifies the tuple.
In this optional embodiment, by writing a preset keyword in the status flag bit of the target tuple, the status of all target tuples not performing the deletion operation can be quickly modified to be visible to the target transaction, so that the target transaction can be ensured to identify the target tuple, so as to perform corresponding data processing on the target tuple.
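A minimal sketch of the cleaning step follows. It assumes all transactions have committed, so every tuple marked as deleted is dead; it drops dead tuples from a list rather than zeroing their storage space, and the keyword spelling `CLEANED` and the names `StoredTuple`, `clean_tuples`, and `is_visible_after_cleaning` are hypothetical.

```python
from dataclasses import dataclass
from typing import List

CLEANED = "cleaned"  # hypothetical spelling of the preset keyword


@dataclass
class StoredTuple:
    key: str
    insert_txn: int
    delete_txn: int = 0  # 0 means never deleted
    flag: str = ""


def clean_tuples(tuples: List[StoredTuple]) -> List[StoredTuple]:
    """Drop tuples marked deleted; flag the rest as visible to all transactions."""
    kept = [t for t in tuples if t.delete_txn == 0]
    for t in kept:
        t.flag = CLEANED
    return kept


def is_visible_after_cleaning(t: StoredTuple, current_txn: int) -> bool:
    """A cleaned tuple is visible without consulting transaction numbers."""
    if t.flag == CLEANED:
        return True
    # Fallback: simplified rule for tuples modified after cleaning.
    return t.insert_txn < current_txn and not (0 < t.delete_txn < current_txn)
```

After cleaning, tuple B from the earlier example is visible to a node whose current transaction number is 503, even though its insert transaction number is 604.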
In some alternative embodiments, the cache data curing instruction and the tuple state modification instruction may be included in the same instruction and sent simultaneously or may be sent separately, without limitation.
In some alternative embodiments, before the coordinating node sends the cache data curing instruction and the tuple state modification instruction to the at least one initial processing node, the method further comprises: checking whether there are currently active transactions and, if so, waiting for all active transactions to complete commit or rollback. This ensures that active transactions can run to completion. For example, the initial processing node determines the field in which the transaction number of an active transaction is written, and determines whether there is currently an active transaction based on whether the field is empty. Specifically, if the field is empty, it is determined that there is currently no active transaction; otherwise, it is determined that there is an active transaction.
In some alternative embodiments, before sending the cache data curing instruction and the tuple state modification instruction to the at least one initial processing node, the method further comprises: establishing communication connections with the respective initial processing nodes that are currently active. An active initial processing node is an available processing node; any other initial processing node is an unavailable processing node. The information of each currently active initial processing node includes at least the node identifier of the initial processing node, and the connection information of the initial processing node includes at least the node identifier, the node Internet Protocol (IP) address, and the port. The metadata table stores the node identifier, IP address, port, and node state of each initial processing node. The node identifier is used to establish the mapping relationship between the initial processing node and the logical partitions. The IP address and port are used by the coordination node to establish communication connections with the initial processing nodes other than itself; the coordination node generates a data preprocessing task list and sends the tasks asynchronously to the corresponding initial processing nodes over these connections. The node state is used to determine whether the corresponding initial processing node is available. Sending the data preprocessing tasks asynchronously improves the efficiency of data preprocessing.
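The asynchronous dispatch can be sketched with Python's `asyncio`. The record layout `NodeRecord` mirrors the metadata fields listed above, but `send_task` is only a stand-in for a real network call, and both names, together with `dispatch`, are hypothetical.

```python
import asyncio
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class NodeRecord:
    node_id: str
    ip: str
    port: int
    available: bool


async def send_task(node: NodeRecord, task: str) -> str:
    """Stand-in for sending a task to one processing node over its connection."""
    await asyncio.sleep(0)  # yield control, as a real socket write would
    return f"{node.node_id}:{task}:done"


async def dispatch(nodes: List[NodeRecord], task: str) -> Dict[str, str]:
    """Send the task to every available node concurrently and collect replies."""
    active = [n for n in nodes if n.available]
    replies = await asyncio.gather(*(send_task(n, task) for n in active))
    return {n.node_id: r for n, r in zip(active, replies)}
```

Because `asyncio.gather` returns results in the order of its arguments, each reply can be paired with the node it came from; unavailable nodes are skipped entirely.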
Fig. 3 is a flowchart of a method for adjusting a processing node of a database cluster according to another embodiment of the present application, where the database cluster includes at least one initial processing node, and the method is applied to a coordinating node determining device, where the coordinating node determining device may be a device for determining a coordinating node set in the database cluster by a user, and may also be a load balancing module in the database cluster. Correspondingly, referring to fig. 3, the adjustment method of the database cluster processing node may include the following steps:
S300, load information of at least one initial processing node is obtained;
The load balancing module obtains a processing node adjustment request sent by the terminal device or the resource monitoring auxiliary module, and calculates the load information of each initial processing node in the database cluster. The load information may be set according to user requirements and may include, for example, the memory usage of the processing node, the occupancy rate of its central processing unit (CPU), or other information that represents the load condition of the processing node.
S302, determining a coordination node from at least one initial processing node according to the load information. Wherein, the coordination node is specifically configured to: in response to receiving the processing node adjustment request, adding or deleting a preset number of processing nodes in the at least one initial processing node to obtain at least one target processing node; determining at least one target logic fragment of target data, wherein each target logic fragment comprises index information of at least one data block in the target data; distributing at least one target logic fragment to each target processing node according to a preset distribution strategy so as to acquire a mapping relation between the target logic fragment and the target processing node; the mapping relation is used for enabling at least one target processing node to respectively determine corresponding target logic fragments, and determining at least one corresponding data block according to index information included in the corresponding target logic fragments so as to perform data processing on the at least one corresponding data block.
In one embodiment, the load balancing module obtains load information of each initial processing node, determines a processing node with a smaller load in each processing node as a coordination node, and establishes communication connection between the coordination node and the client or the resource monitoring auxiliary module.
Illustratively, the less loaded initial processing node is determined to be the coordinating node. Of course, the coordination node may also be determined from at least one of the initial processing nodes according to user requirements.
In this alternative embodiment, load information of at least one initial processing node is obtained; and dynamically selecting a coordination node from at least one initial processing node according to the load information. Therefore, no coordination node is required to be additionally arranged in the database cluster, and resources can be saved.
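Selecting the coordination node by load can be sketched in a few lines. The sketch assumes the load of each node has already been reduced to a single number (for example CPU occupancy); how several metrics would be combined is not specified by the source, and `pick_coordinator` is a hypothetical name.

```python
from typing import Dict


def pick_coordinator(load_by_node: Dict[str, float]) -> str:
    """Return the id of the initial processing node with the smallest load."""
    if not load_by_node:
        raise ValueError("the cluster has no initial processing nodes")
    # On ties, min() keeps the first node encountered.
    return min(load_by_node, key=lambda node: load_by_node[node])
```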
In one embodiment, after each target processing node determines a corresponding data block from the shared data storage module, the target processing node processes the data of the corresponding data block, and sends a processing result to the coordination node, and the coordination node integrates the received processing result and returns the integrated processing result to the terminal device.
Based on the foregoing embodiments and optional embodiments, the present application further provides an optional implementation manner of a method for adjusting a database cluster processing node, which is specifically described below.
S400, the load balancing module acquires a processing node adjustment request sent by the terminal equipment or the resource monitoring auxiliary module. Load information of each initial processing node is obtained, a processing node with smaller load in each processing node is determined to be a coordination node, and communication connection between the coordination node and a client or a resource monitoring auxiliary module is established.
S401, the coordination node receives the processing node adjustment request.
S402, the coordination node sets the current running state of the database cluster to be a suspension state, and refuses to receive a new request; it is checked whether there are currently active transactions and if so, it waits for all active transactions to complete commit or rollback.
S403, the coordination node reads the current active node information of the database system from the processing node metadata table, and reads the connection information of the corresponding node from the processing node metadata table. The coordinating node establishes communication connection with each initial processing node except the coordinating node based on the current active node information and the connection information of the nodes.
S404, the coordination node generates a data cleaning task list and asynchronously sends the data cleaning task to all processing nodes including the coordination node to execute.
S405, the initial processing node receives the data cleaning task and performs the following operations: solidifying the cache data in memory into the data storage module; and traversing the transaction information of the tuples in the table file and modifying the flag bit of each tuple so that the tuple is visible to any transaction.
S406, after the initial processing nodes finish data cleaning, a data cleaning finishing response is sent to the coordination node, so that the coordination node determines that all the initial processing nodes finish data preprocessing operation.
S407, after determining that every initial processing node has completed the data cleaning operation, the coordination node processes the initial logical partitions in the database cluster as follows: judging, according to the number of initial processing nodes and the number of initial logical partitions in a data table of the database cluster, whether the initial logical partitions can be allocated to the at least one target processing node according to the preset allocation policy. If yes, the initial logical partitions are not modified and are used directly as the target logical partitions; otherwise, the block indexes of the database table file are integrated and repartitioned to generate new logical partition files.
The database table file is a file of the target data, and the block index of the database table file includes at least the identifier of each data block and the position information of the data block. Integrating and repartitioning the block indexes of the database table file to generate new logical partition files is equivalent to regrouping the at least one data block of the target data into a plurality of groups, wherein the block indexes of the groups correspond one-to-one to the target logical partitions.
Both the target data and the management metadata are stored in a shared data storage module. The management metadata includes node metadata, fragment metadata, mapping relationship metadata, and the like, and is used for managing nodes and logical partitions and for storing the logical partition files abstracted by the database system. A logical partition acts as an index: it stores the address information of the corresponding real data files (data blocks) and supplies the processing node with the correct user data, so that the processing node in effect manages the user data directly. When the database cluster is elastically scaled, the target data can therefore be repartitioned quickly, which achieves data redistribution.
S408, in response to the determination of the target logical partitions being completed, the coordination node adds or deletes a preset number of processing nodes among the at least one initial processing node of the database cluster to obtain at least one target processing node, and inserts the node information of the at least one target processing node into the processing node metadata table.
S409, modifying metadata information to modify the mapping relation between the target processing node and the logical partition.
It should be noted that step S408 may be performed after step S407 or after step S401, and is not limited herein. After deleting the preset number of processing nodes, the coordination node sets the state of the corresponding processing nodes in the node information metadata table to be unavailable.
S410, the coordination node modifies its running state, allows terminal devices to initiate connection requests, and starts processing services.
In this alternative embodiment, the adjustment of the cluster size can be completed quickly. During elastic scaling of the database cluster, the block indexes of the database table file are integrated and re-divided solely through virtual logical partition files, so the files are split quickly without migrating any data. The adjustment of the cluster scale is then completed simply by modifying the mapping relationship between the target processing nodes and the target logical partitions. Computing resources can thus be reallocated flexibly according to the current load while the database system is running, which avoids wasting computing resources when the system is idle and avoids performance degradation caused by insufficient computing resources when the service is busy.
The present application also provides another alternative embodiment, specifically described below.
S500, the coordination node receives the elastic scaling request, generates a data cleaning command, and issues the task. Specifically: the coordination node receives the elastic scaling command, stops receiving new requests, and checks whether there are active transactions. After all transactions have completed, the coordination node generates a data cleaning command and sends it to all processing nodes in the cluster for execution. The data cleaning command may include a dead tuple delete instruction, a cache data curing instruction, and a tuple state modification instruction. Each processing node performs data cleaning based on the data cleaning command and returns the cleaning result to the coordination node after the cleaning work is completed. The data cleaning process includes at least: persisting the cache data in memory to the storage device; and resetting the transaction information in the tuples.
S501: splitting and remapping of the data files is performed. The method specifically comprises the following steps: after the coordination node waits for all the processing nodes to finish data cleaning, the coordination node finishes re-splitting of the data file according to the increased number of the processing nodes. Further, the modification of the mapping relationship between the processing node and the logical partition is completed by modifying the metadata information, which specifically includes: the coordination node releases the mapping relation between the initial processing node and the logic fragment, and deletes the related record in the mapping relation metadata table; adding the processing node information of the newly added node into a node metadata table; and corresponding the added processing nodes with the target logical fragments, and adding the mapping relation into the fragment metadata table.
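The metadata-only nature of the remapping in S501 can be sketched as below. Plain dictionaries stand in for the node metadata table and the mapping table, and the function name `remap` and the table layouts are assumptions for illustration.

```python
from typing import Dict


def remap(node_table: Dict[str, dict], mapping_table: Dict[str, str],
          new_nodes: Dict[str, dict], new_mapping: Dict[str, str]) -> None:
    """Rewrite metadata in place; no data block is read or moved.

    mapping_table and new_mapping map a logical partition id to a node id.
    """
    # Validate before mutating so a bad request leaves the tables untouched.
    known = set(node_table) | set(new_nodes)
    unknown = set(new_mapping.values()) - known
    if unknown:
        raise KeyError(f"mapping refers to unregistered nodes: {sorted(unknown)}")
    mapping_table.clear()              # release the old node/partition mapping
    node_table.update(new_nodes)       # register the newly added nodes
    mapping_table.update(new_mapping)  # record the new partition -> node mapping
```

Only the two metadata dictionaries change, which reflects why the adjustment does not require data migration.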
In this alternative embodiment, the structure of the database cluster before adjustment is as shown in fig. 4. Referring to fig. 4, there are two initial processing nodes in the database cluster, namely processing node 1 and processing node 2. Each processing node manages one initial logical partition; the two initial logical partitions are logical partition 1 and logical partition 2. Each initial logical partition corresponds to a plurality of data segment files, and each segment file corresponds to a plurality of data blocks. In this application scenario, the coordination node determined by the load balancing module is processing node 1. The processing node adjustment request is described by taking as an example the addition of one processing node to the database cluster.
The database system pulls up 1 new database processing node and adds the new database processing node to the current database cluster, thereby obtaining a database cluster adjusted by the processing node, the structure diagram of the cluster is shown in fig. 5, and referring to fig. 5, the newly added processing node is the processing node 3. With reference to fig. 4, only 2 logical slices exist in the database cluster before adjustment, after elastic capacity expansion is performed, the logical slices cannot be uniformly distributed to 3 processing nodes, and new logical slices need to be re-abstracted.
With continued reference to FIG. 4, in the database cluster before adjustment each logical partition contains 2 segment files and each segment file contains 3 data blocks, so the database cluster before adjustment contains a total of 12 data blocks (2 logical partitions × 2 segment files × 3 data blocks). The target number B_new of data blocks contained in each target logical partition of the adjusted database cluster is therefore 12 / 3 = 4. The data block indexes of the table file are integrated and repartitioned accordingly, and can be abstracted into 3 target logical partitions, each corresponding to 4 data blocks, as shown in fig. 5. Because the target logical partitions differ from the initial logical partitions of the database cluster before adjustment, the partition information in the metadata information needs to be modified.
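The arithmetic of this example can be reproduced with the short sketch below. The function name `regroup_blocks` and the numbering of data blocks from 1 to 12 are assumptions made for illustration; the case in which the block total is not divisible by the node count is not described in this example, so the sketch simply rejects it.

```python
def regroup_blocks(block_indexes, node_count):
    """Group data block indexes into equally sized target logical partitions."""
    if node_count <= 0:
        raise ValueError("node_count must be positive")
    if not block_indexes:
        return []
    if len(block_indexes) % node_count != 0:
        # Uneven totals are outside the scope of this example.
        raise ValueError("block total is not divisible by the node count")
    per_partition = len(block_indexes) // node_count  # B_new in the text
    return [
        block_indexes[i:i + per_partition]
        for i in range(0, len(block_indexes), per_partition)
    ]


# Layout of fig. 4: 2 logical partitions x 2 segment files x 3 data blocks.
block_indexes = list(range(1, 2 * 2 * 3 + 1))
target_partitions = regroup_blocks(block_indexes, node_count=3)
```

Only the block indexes are regrouped here; the data blocks themselves are not moved, which matches the description of a logical partition as a set of index information.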
S502: The coordination node modifies its own running state, allows clients to initiate connection requests, and starts to process services. At this point, the database cluster has completed the elastic capacity expansion and can resume receiving client requests and executing the received SQL commands.
In the elastic capacity reduction process, in which a processing node is deleted, the state of the processing node to be deleted is first set to unavailable, and the mapping relationship with the target logical partitions is then modified based on the currently available processing nodes. After the elastic capacity reduction of the database cluster is completed, the whole database cluster resumes normal service; a background process then recycles the processing node marked as unavailable, cleans up the information of that processing node in the metadata table, and reclaims the resources occupied by that processing node.
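A minimal sketch of this two-phase capacity reduction is given below, assuming the node states are kept in a dictionary and the target logical partitions have already been determined. The names `scale_in` and `recycle_unavailable`, the state strings and the round-robin assignment are hypothetical.

```python
def scale_in(node_states, target_partitions, node_to_remove):
    """Mark a node unavailable, then map target partitions to the remaining nodes."""
    node_states[node_to_remove] = "unavailable"
    available = sorted(n for n, s in node_states.items() if s == "available")
    if not available:
        raise ValueError("at least one processing node must remain available")
    # Round-robin assignment over available nodes (an assumption of this sketch).
    return {
        partition: available[i % len(available)]
        for i, partition in enumerate(target_partitions)
    }


def recycle_unavailable(node_states):
    """Background clean-up: remove metadata of nodes marked unavailable."""
    removed = sorted(n for n, s in node_states.items() if s == "unavailable")
    for name in removed:
        del node_states[name]
    return removed


states = {"node1": "available", "node2": "available", "node3": "available"}
mapping = scale_in(states, target_partitions=[1, 2], node_to_remove="node3")
# The cluster can resume service here; clean-up runs afterwards.
recycled = recycle_unavailable(states)
```

Separating the remapping from the clean-up reflects the order described above: service resumes as soon as the mapping is consistent, and resource recovery is deferred to the background.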
Fig. 6 is a schematic structural diagram of an adjustment device for a database cluster processing node according to an embodiment of the present application, where the database cluster includes at least one initial processing node. Referring to fig. 6, the apparatus includes a first acquisition module 600, a first determining module 602, and a second acquisition module 604.
The first acquisition module 600 is configured to add or delete a preset number of processing nodes in the at least one initial processing node in response to receiving a processing node adjustment request, so as to obtain at least one target processing node. The first determining module 602 is configured to determine at least one target logical partition of target data, where each target logical partition includes index information of at least one data block in the target data. The second acquisition module 604 is configured to allocate the at least one target logical partition to each target processing node according to a preset allocation policy, so as to obtain a mapping relationship between the target logical partitions and the target processing nodes. The mapping relationship is used for enabling each target processing node to determine its corresponding target logical partition, and to determine at least one corresponding data block according to the index information included in that target logical partition, so as to perform data processing on the at least one corresponding data block.
Optionally, the first determining module 602 is specifically configured to: determine, according to the total number of initial logical partitions in the database cluster and the total number of target processing nodes, whether the initial logical partitions can be allocated to each target processing node according to the preset allocation policy, where an initial logical partition is a logical partition of the database cluster before the processing node adjustment; if not, perform the following operations: determine the target number of data blocks corresponding to each target processing node according to the total number of target processing nodes and the total number of data blocks, group the at least one data block according to the target number, and generate the at least one target logical partition according to the grouping result; if so, perform the following operation: determine the initial logical partitions as the target logical partitions.
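The branch performed by the first determining module 602 can be sketched as follows, assuming the preset allocation policy is even distribution, so that the check reduces to a divisibility test on the number of initial logical partitions. The function name `determine_target_partitions` and the representation of a logical partition as a list of data block indexes are assumptions of this sketch.

```python
def determine_target_partitions(initial_partitions, node_count):
    """Reuse the initial partitions if they divide evenly, otherwise regroup blocks."""
    if node_count <= 0:
        raise ValueError("node_count must be positive")
    # "If so": the initial partitions can be spread evenly over the target nodes.
    if len(initial_partitions) % node_count == 0:
        return initial_partitions
    # "If not": pool all block indexes and regroup them by the target number.
    all_blocks = [b for partition in initial_partitions for b in partition]
    if not all_blocks:
        return []
    if len(all_blocks) % node_count != 0:
        raise ValueError("block total is not divisible by the node count")
    target_number = len(all_blocks) // node_count
    return [
        all_blocks[i:i + target_number]
        for i in range(0, len(all_blocks), target_number)
    ]


initial = [[1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12]]
unchanged = determine_target_partitions(initial, node_count=2)
regrouped = determine_target_partitions(initial, node_count=3)
```

With two nodes the two initial partitions are reused as they are; with three nodes the twelve block indexes are regrouped into three partitions of four.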
Optionally, the apparatus further includes a setting module, configured to set, before the first determining module 602 performs the corresponding operation, an operation state of the database cluster to a suspension state, where the suspension state is used to indicate that the database cluster stops receiving a transaction request sent by a device other than the database cluster to the database cluster, where the transaction request is used to instruct the target processing node to perform data processing on the corresponding at least one data block.
Optionally, the apparatus further includes a data solidification module, configured to send a cache data solidification instruction to at least one initial processing node before the first determining module 602 performs the corresponding operation, so that the at least one initial processing node solidifies corresponding cache data into the data storage module, where the cache data is data generated after the initial processing node performs data processing on data in the corresponding at least one data block in the preset cache area, and the data storage module is configured to store at least one data block of the target data.
Optionally, the apparatus further includes a state modifying module, configured to send a tuple state modifying instruction to the at least one initial processing node before the first determining module 602 performs the corresponding operation, so that the at least one initial processing node performs the following operation on all target tuples in the corresponding data block on which no deletion operation has been performed: writing, in the state flag bit of the target tuple, a preset keyword identifying that the target tuple is visible to the target transaction.
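The three preparatory operations described above (suspending the cluster, solidifying cached data, and marking undeleted tuples as visible) can be illustrated together by the sketch below. The classes `Row` and `ProcessingNode`, the keyword value `"VISIBLE"` and the dictionary used as the data storage module are all hypothetical simplifications; in particular, the sketch marks tuples centrally rather than on each processing node.

```python
from dataclasses import dataclass, field

VISIBLE_FLAG = "VISIBLE"  # hypothetical preset keyword


@dataclass
class Row:
    payload: str
    deleted: bool = False
    state_flag: str = ""


@dataclass
class ProcessingNode:
    # block index -> rows processed in the cache area but not yet persisted
    cache: dict = field(default_factory=dict)


def prepare_for_adjustment(cluster_state, nodes, storage):
    """Suspend the cluster, persist cached rows, then flag undeleted rows visible."""
    # 1. Stop accepting external transaction requests.
    cluster_state["status"] = "suspended"
    # 2. Solidify each node's cached data into the data storage module.
    for node in nodes:
        for block_id, rows in node.cache.items():
            storage.setdefault(block_id, []).extend(rows)
        node.cache.clear()
    # 3. Write the visibility keyword into every tuple that was not deleted.
    for rows in storage.values():
        for row in rows:
            if not row.deleted:
                row.state_flag = VISIBLE_FLAG


state = {"status": "running"}
storage = {1: [Row("a"), Row("b", deleted=True)]}
node = ProcessingNode(cache={2: [Row("c")]})
prepare_for_adjustment(state, [node], storage)
```

Performing the flag update after the cache has been persisted ensures that rows still held in a cache area are also marked.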
Fig. 7 is a schematic structural diagram of an adjustment apparatus for a processing node of a database cluster according to another embodiment of the present application, and referring to fig. 7, the database cluster includes at least one initial processing node, and the apparatus includes a third obtaining module 700 and a second determining module 702.
The third obtaining module 700 is configured to obtain load information of at least one initial processing node; a second determining module 702, configured to determine, according to the load information, a coordination node from at least one initial processing node; the coordination node is used for responding to the received processing node adjustment request, and adding or deleting a preset number of processing nodes in at least one initial processing node so as to acquire at least one target processing node; determining at least one target logic fragment of target data, wherein each target logic fragment comprises index information of at least one data block in the target data; distributing at least one target logic fragment to at least one target processing node according to a preset distribution strategy so as to acquire a mapping relation between the target logic fragment and the target processing node; the mapping relation is used for enabling at least one target processing node to respectively determine corresponding target logic fragments, and determining at least one corresponding data block according to index information included in the corresponding target logic fragments so as to perform data processing on the at least one corresponding data block.
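As an illustration of the second determining module 702, the sketch below selects the coordination node from the load information. Choosing the node with the lowest load, representing load as a single number, and breaking ties by node name are assumptions of this sketch; the embodiment only states that the coordination node is determined according to the load information.

```python
def choose_coordinator(load_by_node):
    """Return the name of the least-loaded node (ties broken by node name)."""
    if not load_by_node:
        raise ValueError("at least one initial processing node is required")
    # min() returns the first minimal element, so sorting the names first
    # makes the tie-break deterministic.
    return min(sorted(load_by_node), key=load_by_node.__getitem__)


coordinator = choose_coordinator({"node1": 0.2, "node2": 0.7})
```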
Since the apparatus of the embodiments of the present application operates on the same principle as the method of the foregoing embodiments, a more detailed explanation of the apparatus is not repeated here.
It should be noted that, in the embodiments of the present application, the related functional modules may be implemented by a hardware processor.
Fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application. Referring to fig. 8, the electronic device may include a processor 81, a communication interface 82, a memory 83 and a communication bus 84, where the processor 81, the communication interface 82 and the memory 83 communicate with each other through the communication bus 84. The processor 81 may invoke logic instructions in the memory 83 to perform the method for adjusting the database cluster processing nodes.
Further, the logic instructions in the memory 83 described above may be implemented in the form of software functional units and, when sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
In another aspect, the present application further provides a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the method for adjusting a database cluster processing node provided by the foregoing method embodiments.
In yet another aspect, the present application further provides a non-transitory computer readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, performs the method for adjusting a database cluster processing node provided in the foregoing embodiments.
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the present invention without creative effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on such understanding, the foregoing technical solutions may be embodied essentially or in part in the form of a software product, which may be stored in a computer-readable storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to perform the various embodiments or methods of some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and are not limiting thereof; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the corresponding technical solutions.

Claims (11)

1. A method for adjusting a processing node of a database cluster, wherein the database cluster includes at least one initial processing node, the method being applied to a coordinating node, the method comprising:
in response to receiving a processing node adjustment request, adding or deleting a preset number of processing nodes in the at least one initial processing node to obtain at least one target processing node;
determining at least one target logic fragment of target data, wherein each target logic fragment comprises index information of at least one data block in the target data;
distributing the at least one target logical partition to each target processing node according to a preset distribution strategy so as to acquire a mapping relation between the target logical partition and the target processing node; the mapping relation is used for enabling each target processing node to determine a corresponding target logic fragment, and determining at least one corresponding data block according to index information included in the corresponding target logic fragment so as to perform data processing on the at least one corresponding data block.
2. The method of claim 1, wherein said determining at least one target logical slice of target data comprises:
determining, according to the total number of initial logical partitions in the database cluster and the total number of target processing nodes, whether the initial logical partitions can be allocated to each target processing node according to the preset distribution strategy, wherein an initial logical partition is a logical partition of the database cluster before the processing node adjustment;
if not, the following operations are executed: determining the target number of data blocks corresponding to the at least one target processing node according to the total number of nodes of the target processing node and the total number of data blocks of the target data, grouping the at least one data block according to the target number, and generating the at least one target logic fragment according to a grouping result;
if so, the following operations are performed: and determining the initial logic fragment as the target logic fragment.
3. The method of claim 1, wherein the preset allocation policy comprises: and evenly distributing the at least one target logic fragment to each target processing node.
4. The method of claim 1, further comprising, prior to determining the at least one target logical slice of the target data: setting the running state of the database cluster to a suspension state, wherein the suspension state is used for indicating the database cluster to stop receiving transaction requests sent to the database cluster by other devices except the database cluster, and the transaction requests are used for indicating the target processing node to process data of at least one corresponding data block.
5. The method of claim 1, further comprising, prior to determining the at least one target logical slice of the target data:
and sending a cache data solidification instruction to the at least one initial processing node so that the at least one initial processing node solidifies corresponding cache data into a data storage module, wherein the cache data is generated after the initial processing node performs data processing on data in the corresponding at least one data block in a preset cache region, and the data storage module is used for storing the at least one data block of the target data.
6. The method of claim 1, further comprising, prior to determining the at least one target logical slice of the target data:
transmitting a tuple state modification instruction to the at least one initial processing node, so that the at least one initial processing node performs the following operation on all target tuples in the corresponding data block on which no deletion operation has been performed: writing, in the state flag bit of the target tuple, a preset keyword identifying that the target tuple is visible to all transactions.
7. A method for adjusting a processing node of a database cluster, wherein the database cluster comprises at least one initial processing node, the method being applied to a coordinating node determining device, the method comprising:
acquiring load information of the at least one initial processing node;
determining a coordination node from the at least one initial processing node according to the load information;
the coordination node is used for responding to the received processing node adjustment request, and adding or deleting a preset number of processing nodes in the at least one initial processing node so as to acquire at least one target processing node; determining at least one target logic fragment of target data, wherein each target logic fragment comprises index information of at least one data block in the target data; distributing the at least one target logical partition to each target processing node according to a preset distribution strategy so as to acquire a mapping relation between the target logical partition and the target processing node; the mapping relationship is used for enabling the at least one target processing node to respectively determine corresponding target logic fragments, and determining at least one corresponding data block according to index information included in the corresponding target logic fragments so as to perform data processing on the at least one corresponding data block.
8. An adjustment device for a processing node of a database cluster, wherein the database cluster comprises at least one initial processing node, the device comprising:
the first acquisition module is used for responding to the received processing node adjustment request, and adding or deleting a preset number of processing nodes in the at least one initial processing node so as to acquire at least one target processing node;
the first determining module is used for determining at least one target logic fragment of target data, wherein each target logic fragment comprises index information of at least one data block in the target data;
the second acquisition module is used for distributing the at least one target logic fragment to each target processing node according to a preset distribution strategy so as to acquire the mapping relation between the target logic fragment and the target processing node; the mapping relation is used for enabling each target processing node to determine a corresponding target logic fragment, and determining at least one corresponding data block according to index information included in the corresponding target logic fragment so as to perform data processing on the at least one corresponding data block.
9. An adjustment device for a processing node of a database cluster, wherein the database cluster comprises at least one initial processing node, the device comprising:
a third obtaining module, configured to obtain load information of the at least one initial processing node;
the second determining module is used for determining a coordination node from the at least one initial processing node according to the load information;
the coordination node is used for responding to the received processing node adjustment request, and adding or deleting a preset number of processing nodes in the at least one initial processing node so as to acquire at least one target processing node; determining at least one target logic fragment of target data, wherein each target logic fragment comprises index information of at least one data block in the target data; distributing the at least one target logical partition to each target processing node according to a preset distribution strategy so as to acquire a mapping relation between the target logical partition and the target processing node; the mapping relation is used for enabling each target processing node to determine a corresponding target logic fragment, and determining at least one corresponding data block according to index information included in the corresponding target logic fragment so as to perform data processing on the at least one corresponding data block.
10. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the program, implements the steps of the method for adjusting a database cluster processing node according to any one of claims 1 to 7.
11. A non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method for adjusting a database cluster processing node according to any one of claims 1 to 7.
CN202310404326.XA 2023-04-17 2023-04-17 Adjustment method and device for database cluster processing nodes and storage medium Active CN116150160B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310404326.XA CN116150160B (en) 2023-04-17 2023-04-17 Adjustment method and device for database cluster processing nodes and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310404326.XA CN116150160B (en) 2023-04-17 2023-04-17 Adjustment method and device for database cluster processing nodes and storage medium

Publications (2)

Publication Number Publication Date
CN116150160A true CN116150160A (en) 2023-05-23
CN116150160B CN116150160B (en) 2023-06-23

Family

ID=86362088

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310404326.XA Active CN116150160B (en) 2023-04-17 2023-04-17 Adjustment method and device for database cluster processing nodes and storage medium

Country Status (1)

Country Link
CN (1) CN116150160B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116471274A (en) * 2023-06-20 2023-07-21 深圳富联富桂精密工业有限公司 Database node deployment method and device, electronic equipment and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200026576A1 (en) * 2017-01-19 2020-01-23 Nutanix, Inc. Determining a number of nodes required in a networked virtualization system based on increasing node density
CN110730238A (en) * 2019-10-21 2020-01-24 中国民航信息网络股份有限公司 Cluster calling system, method and device
US10754696B1 (en) * 2017-07-20 2020-08-25 EMC IP Holding Company LLC Scale out capacity load-balancing for backup appliances
CN111913670A (en) * 2020-08-07 2020-11-10 北京百度网讯科技有限公司 Load balancing processing method and device, electronic equipment and storage medium
CN113051250A (en) * 2021-03-24 2021-06-29 北京金山云网络技术有限公司 Database cluster capacity expansion method and device, electronic equipment and storage medium
CN113297166A (en) * 2020-07-27 2021-08-24 阿里巴巴集团控股有限公司 Data processing system, method and device
CN113391757A (en) * 2020-03-12 2021-09-14 杭州海康威视数字技术股份有限公司 Node expansion method and device and migration node
CN115422165A (en) * 2022-09-22 2022-12-02 北京奥星贝斯科技有限公司 Database data migration method and database

Also Published As

Publication number Publication date
CN116150160B (en) 2023-06-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant