WO2024001079A1 - Method and system for accelerating database primary/standby synchronization operations - Google Patents

Method and system for accelerating database primary/standby synchronization operations

Info

Publication number
WO2024001079A1
Authority
WO
WIPO (PCT)
Prior art keywords
database
operation module
data
execution
module
Prior art date
Application number
PCT/CN2022/139851
Other languages
English (en)
French (fr)
Inventor
刘睿民
李文峰
莫明勋
Original Assignee
北京柏睿数据技术股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京柏睿数据技术股份有限公司
Publication of WO2024001079A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F16/275 Synchronous replication
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/217 Database tuning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083 Techniques for rebalancing the load in a distributed system

Definitions

  • The present invention relates to the field of database technology, and in particular to a method and system for accelerating synchronization operations between primary and standby databases.
  • Data synchronization and read/write separation are important means of improving the read/write performance and synchronization efficiency of a database system and of ensuring data consistency, especially in database clusters composed of one primary node and multiple standby nodes.
  • In existing data synchronization methods, the primary database node first completes the relevant database operations and then synchronizes the data through asynchronous or synchronous replication, including incremental synchronization and full synchronization.
  • Both asynchronous and synchronous replication can achieve incremental data synchronization while ensuring data consistency.
  • However, data synchronization must wait until the primary node has completed certain database operations before the primary database node can send the data over the data transmission network to the standby database nodes for replication. This results in a large synchronization delay, which seriously affects the operating efficiency and overall performance of the database cluster.
  • Data synchronization is performed only after the database operation is completed, which causes a large synchronization delay.
  • Database operations are completed by the CPU of the primary database node, which consumes a large amount of the primary node's resources; the resulting performance bottleneck on the primary node reduces the performance of the entire cluster.
  • The present invention proposes a method and system for accelerating synchronization operations between primary and standby databases.
  • With the present invention, the database operation and data synchronization can be performed and completed simultaneously.
  • In the existing technology, data synchronization is performed only after the database operation completes, resulting in a large synchronization delay.
  • The present invention can release database node resources.
  • Existing database operations are all completed by the CPU of the primary database node, which consumes a large amount of the primary node's resources.
  • As a result, the performance of the entire cluster is reduced.
  • The operation modules of the present invention are managed in a unified manner and coordinated dynamically to achieve load balancing.
  • The existing technology cannot respond dynamically to a user's read request according to the idle status of the nodes in the cluster, resulting in a large response delay.
  • The present invention provides a method for accelerating synchronization operations between primary and standby databases, applied to a database cluster composed of primary and standby nodes.
  • The method includes: S1, initialization: performing time information synchronization and downloading data from the database nodes; S2, receiving user requests in real time and completing the corresponding database operations and data synchronization according to the request type; S3, returning the database operation results to the user.
  • Step S1 (initialization: performing time information synchronization and downloading data from the database nodes) specifically includes:
  • the main operation module obtains time information through the time module and sends initialization operation instructions to the operation modules on the primary and standby database nodes;
  • the operation module deployed on each primary or standby database node downloads data from its corresponding database node and extracts the data information on that node;
  • the main operation module compares the instruction completion result information to determine whether the initialization data of the operation modules on the database nodes is consistent;
  • S1-6, collect statistics on the idle status of each operation module, where the idle status includes whether the module is executing a task, the number of executed tasks, and the type of the task currently being executed.
  • Step S2 (receiving user requests in real time and completing the corresponding database operations and data synchronization according to the request type) specifically includes:
  • the main operation module receives the client request in real time and stamps it with the time tag T_start;
  • the main operation module extracts the SQL statement from the user request, parses, compiles and optimizes the SQL, and generates an execution plan carrying the time tag T_start;
  • the main operation module distributes the execution plan to the operation modules on the database nodes for execution according to the request type and the idle status and primary/standby relationship of each operation module, where the request types include data addition, deletion, update and query.
  • Step S3 (returning the database operation results to the user) specifically includes:
  • the main operation module receives the execution result and determines its type;
  • if the execution result comes from an add, delete or update operation, then upon receiving the first execution result the main operation module returns the database operation result to the user, extracts and updates the data information and database operation information, records the completion time T_end, and updates the idle status of the operation module, where the database operation result indicates whether the data operation succeeded and/or the volume of data operated on;
  • if the execution result comes from a data query, then upon receiving the execution result the main operation module returns the query result to the user, records the time information T_end, and updates the idle status of the operation module, where the database operation result is the data query result.
  • the present invention provides an acceleration system for synchronized operation of master and backup databases, including a master operation unit and operation modules deployed on multiple database nodes;
  • the main operation unit includes a main operation module and a time module.
  • the time module is connected to the main operation module through the IO bus; the main operation module is connected to the user terminal and the operation module deployed on the database node respectively;
  • the time module is used to obtain time information
  • The main operation module specifically includes:
  • an input unit, used to receive request instructions from the user;
  • a first synchronization unit, used to obtain time information from the time module, stamp each client request received with a time tag, and synchronize time information between the main operation module and the operation modules deployed on the database nodes;
  • a first data operation execution unit, used to parse, optimize and compile the user request and generate the corresponding execution plan, and to collect statistics on the idle status of the operation modules deployed on the database nodes;
  • a first sending unit, used to send the execution plan to the operation modules deployed on the database nodes;
  • a first receiving unit, used to receive the execution results returned by the operation modules;
  • a first cache unit, used to cache the execution plan and the execution results returned by the operation modules;
  • a first data storage unit, used to store the idle status information of each operation module, the returned execution results, and data information;
  • a first node and operation device interface, used to connect the main operation module with the operation modules deployed on the database nodes;
  • an output unit, used to return to the user the execution result corresponding to the user's request.
  • The operation module deployed on a database node specifically includes:
  • a second synchronization unit, used to obtain the time information sent by the main operation module and, after the operation module completes a database operation, stamp that operation with its completion time tag;
  • a second data operation execution unit, used to execute the execution plan received by the operation module;
  • a second sending unit, used to return the execution result to the main operation module;
  • a second receiving unit, used to receive the execution plan and the synchronized time information sent by the main operation module;
  • a second cache unit, used to cache the execution plan, the intermediate results of executing it, and the data information and data extracted from the database;
  • a second data storage unit, used to store the data information and data extracted from the database;
  • a second node and operation device interface, used to connect the main operation module with the operation module deployed on the database node.
  • Through the deployment and centralized management of operation modules on the primary and standby nodes, the present invention offloads the computation of database operations (add, delete and update) and the data to be synchronized to the operation modules, which complete the processing in a unified manner. After processing, the database operation results and data are synchronized through each operation module to the primary and standby nodes in the database cluster for execution. When a user reads data, the main operation module parses and optimizes the read instruction, coordinates the operation modules according to their idle status, and distributes the task to idle operation modules for execution, so that all operation modules can respond to read requests.
  • the present invention at least has the following beneficial effects:
  • The database operations previously completed by the database can be offloaded to the operation modules, which reduces the workload of the database itself and improves the read/write performance and efficiency of the database cluster;
  • The operation modules complete the data synchronization of each database node in the primary/standby cluster, and the main operation module ensures data consistency through unified verification. This avoids the delay caused by data synchronization operations that require network transmission between database nodes, and addresses at its root the way the primary/standby synchronization mechanism restricts cluster performance. It provides a new method that effectively improves both the real-time performance of data synchronization between primary and standby database nodes and the performance of the database cluster;
  • The main operation module performs unified management, time synchronization and load balancing of the operation modules deployed on the database nodes, making full and effective use of idle resources in the cluster and further improving the data synchronization, overall operating efficiency and availability of the database cluster.
  • With the present invention, the operation modules process and accelerate database operations while also completing data synchronization between the primary and standby database nodes and between the operation modules and the database nodes, and the primary and standby nodes respond dynamically to read requests. Therefore, while the data consistency of the primary and standby nodes is ensured, synchronous data replication between nodes is omitted and the real-time performance of data synchronization is improved. The computing resources of the nodes are further saved, the performance of the database system in providing external read and write services is improved, and load balancing of the system is achieved through unified management and dynamic coordination.
  • Figure 1 is a schematic flowchart of the method for accelerating the synchronization operation of the main and backup databases in the present invention.
  • Figure 2 is an architecture diagram of the accelerated system for synchronizing the main and backup database operations of the present invention.
  • Figure 3 is a schematic structural diagram of the main operation module in the acceleration system for synchronized operation of the main and backup databases of the present invention.
  • Figure 4 is a schematic structural diagram of an operation module in the acceleration system for synchronized operation of the main and backup databases of the present invention.
  • an embodiment of the present invention provides a method for accelerating the synchronization operation of active and standby databases, which is applied to a database cluster composed of active and standby nodes.
  • The databases are mainly mainstream databases such as MySQL, Oracle, Informix and Db2.
  • The cluster is built mainly on a single type of database.
  • the method for accelerating primary and secondary database synchronization operations performs the following process:
  • Perform time information and data initialization for the operation modules, including time information synchronization and downloading data from the database nodes.
  • The main operation module can obtain time information from the time module through the IO bus.
  • S102: determine whether the initialization of time information and data succeeded. If not, return to S101 and initialize the time information and data of the operation modules again; if so, proceed to the next step.
  • The main operation module receives the client request in real time and stamps it with a time tag.
  • Because data query operations are synchronized differently from other operations, the method checks whether the request type is a data query.
  • If the request is not a data query, the main operation module delivers the execution plan to every operation module on the data nodes for execution and updates the idle status of each operation module.
  • Each operation module executes the execution plan and generates an execution result that includes data information (metadata, operation status, time consumption, data volume and so on), database operation information and the completion time.
  • The main operation module receives the execution result, returns the database operation result to the user, records the time information, and updates the idle status of the operation module.
  • If the request is a data query, the main operation module, based on the tasks each operation module on the database nodes is executing or waiting to execute, preferentially assigns the execution plan to an operation module with idle resources or fewer remaining tasks and updates that module's idle status;
  • The operation module generates an execution result after completing the execution plan.
  • The execution result includes information such as the query result and the query completion time.
  • The operation module returns the execution result to the main operation module.
  • S122 is also executed.
  • The main operation module receives the execution result, returns the database operation result to the user, records the time information, and updates the idle status of the operation module.
  • The database operations previously completed by the database are offloaded to the operation modules. While completing the database operations, the operation modules also complete the data synchronization of each database node in the primary/standby cluster.
  • The present invention thus enables the operation modules to process and accelerate database operations while also completing data synchronization between the primary and standby database nodes and between the operation modules and the database nodes, with the primary and standby nodes responding dynamically to read requests. This eliminates the need for synchronous data replication between nodes, improves the real-time performance of data synchronization, further saves the computing resources of the nodes, and improves the performance of the database system in providing external read and write services; unified management and dynamic coordination also achieve load balancing of the system.
  • The method for accelerating primary/standby database synchronization operations in the present invention is summarized as follows: S1, initialization: performing time information synchronization and downloading data from the database nodes; S2, receiving user requests in real time and completing the corresponding database operations and data synchronization according to the request type; S3, returning the database operation results to the user.
  • Step S1 (initialization: performing time information synchronization and downloading data from the database nodes) specifically includes:
  • the main operation module obtains time information through the time module and sends initialization operation instructions to the operation modules on the primary and standby database nodes;
  • the operation module deployed on each primary or standby database node downloads data from its corresponding database node and extracts the data information on that node;
  • the main operation module compares the instruction completion result information to determine whether the initialization data of the operation modules on the database nodes is consistent;
  • S1-6, collect statistics on the idle status of each operation module, where the idle status includes whether the module is executing a task, the number of executed tasks, and the type of the task currently being executed.
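The idle status collected in S1-6 could be modelled as a small record per operation module. The sketch below is illustrative only: the class name `IdleStatus`, its field names, and the `start`/`finish` helpers are assumptions made for this example, not terms from the patent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IdleStatus:
    """Hypothetical per-module record of the three items named in S1-6."""
    executing: bool = False                    # whether a task is currently running
    executed_tasks: int = 0                    # number of tasks executed so far
    current_task_type: Optional[str] = None    # e.g. "query" or "update"

    def start(self, task_type: str) -> None:
        """Mark the module as busy with a task of the given type."""
        self.executing = True
        self.current_task_type = task_type

    def finish(self) -> None:
        """Mark the current task as done and count it."""
        self.executing = False
        self.executed_tasks += 1
        self.current_task_type = None


status = IdleStatus()
status.start("query")
busy_snapshot = (status.executing, status.current_task_type)
status.finish()
```

A main operation module could keep one such record per database node and consult it when distributing execution plans.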
  • Obtaining time information through the time module and sending initialization operation instructions to the operation modules on the primary and standby database nodes specifically includes:
  • the main operation module obtains time information from the time module through the IO bus and records it as the database cluster initialization start time T_su, where the time module is connected to the main operation module through the IO bus and its time information is kept synchronized with real time;
  • the main operation module sends the initialization operation instructions to the operation modules on the primary and standby database nodes and sends the initialization start time T_su along with them, where the initialization operation instructions are operations generated by SQL compilation that the database system can execute directly.
  • The operation instruction includes the data initialization ratio P, where:
  • CM_max is the maximum storage space available to the operation module;
  • P_operation module is the preset proportion of storage space that the operation module is allowed to use, such as 60% or 80%;
  • CM_data is the storage space occupied by all data on the database node.
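The expression for P itself is not present in this text, so the following is only one plausible reading of the three quantities defined above: the share of the node's data that fits into the storage the operation module is allowed to use, capped at 100%. The function name, the cap, and the input checks are all assumptions.

```python
def initialization_ratio(cm_max: float, p_module: float, cm_data: float) -> float:
    """Assumed reading: P = min(1, CM_max * P_module / CM_data)."""
    if cm_max < 0 or cm_data <= 0 or not 0 <= p_module <= 1:
        raise ValueError("need cm_max >= 0, cm_data > 0 and p_module in [0, 1]")
    usable = cm_max * p_module           # storage the module may actually use
    return min(1.0, usable / cm_data)    # never load more than all of the data


# Hypothetical sizes: 128 GB module, 50% usable, 256 GB of data on the node
partial = initialization_ratio(128, 0.5, 256)
# The module could hold everything, so the ratio is capped at 1.0
full = initialization_ratio(1000, 0.8, 400)
```

Under this reading, a module that can hold a quarter of the node's data would be initialized with P = 0.25.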
  • Downloading data from the corresponding database node and extracting the data information on that node specifically includes:
  • the operation module deployed on each primary or standby database node decomposes the received instruction to obtain the initialization start time T_su and the initialization operation instruction;
  • the database executes the initialization operation instruction to obtain the share of data given by the initialization ratio P and downloads that data to the operation module deployed on the node;
  • the operation module obtains the time information synchronized by the main operation module and records it as the data loading completion time T_SE;
  • the operation instruction completion result information includes at least the following: metadata, operation status, time consumption, and data size.
  • After the instruction completion result information and time information are returned to the main operation module, the method further includes:
  • after receiving the operation instruction completion result information, the main operation module extracts the data and compares it with the information returned by the operation module deployed on the primary node;
  • S1-3-3, if the information is inconsistent, back up the data on the primary database node to the inconsistent database node and perform step S1-3-2 again until initialization succeeds, then record the database cluster initialization end time T_SE.
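The comparison step can be sketched as a function that checks each standby module's completion report against the primary's and lists the nodes that need to be re-synchronized. The patent does not say which fields are compared; the `InitResult` fields (a metadata digest, a data size, and a status string) and the node names are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class InitResult:
    """Completion info reported by one operation module (fields assumed)."""
    metadata_digest: str   # hypothetical checksum of the loaded metadata
    data_size: int         # bytes loaded into the module
    status: str            # "ok" or "failed"


def inconsistent_nodes(primary: InitResult,
                       standbys: Dict[str, InitResult]) -> List[str]:
    """Return the standby node ids whose report differs from the primary's."""
    return sorted(
        node for node, res in standbys.items()
        if res.status != "ok"
        or res.metadata_digest != primary.metadata_digest
        or res.data_size != primary.data_size
    )


primary = InitResult("abc123", 4096, "ok")
standbys = {
    "standby-1": InitResult("abc123", 4096, "ok"),
    "standby-2": InitResult("abc123", 2048, "ok"),   # loaded less data
    "standby-3": InitResult("ffff00", 4096, "ok"),   # different metadata
}
to_resync = inconsistent_nodes(primary, standbys)
```

Each node in `to_resync` would then receive a backup of the primary node's data and repeat initialization, as S1-3-3 describes.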
  • Step S2 (receiving client requests in real time and completing the corresponding database operations and data synchronization according to the request type) specifically includes:
  • the main operation module receives the client request in real time and stamps it with the time tag T_start;
  • the main operation module extracts the SQL statement from the user request, parses, compiles and optimizes the SQL, and generates an execution plan carrying the time tag T_start;
  • the main operation module distributes the execution plan to the operation modules on the database nodes for execution according to the request type and the idle status and primary/standby relationship of each operation module, where the request types include data addition, deletion, update and query and can be determined from the SQL statement in the request.
  • Distributing the execution plan according to the request type and the idle status and primary/standby relationship of each operation module specifically includes:
  • the main operation module determines and identifies the request type from the SQL statement;
  • if the request is an add, delete or update, the main operation module, based on the tasks each operation module on the database nodes is executing or waiting to execute, delivers the execution plan in turn to every operation module on the database nodes for execution and updates the idle status of each operation module;
  • if the request is a query, the main operation module, based on the tasks each operation module on the database nodes is executing or waiting to execute, preferentially assigns the execution plan to an operation module with idle resources or fewer remaining tasks and updates that module's idle status; the operation module generates an execution result after completing the execution plan, where the execution result includes the query result and the query completion time T_end.
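The distribution rule described above (writes go to every operation module, queries go to the least loaded one) can be sketched as follows. This is a simplified illustration under stated assumptions: classifying by the leading SQL keyword, the `Module` record, and the `pending_tasks` counter are all hypothetical, and a real planner would work on a parsed execution plan rather than raw SQL text.

```python
from dataclasses import dataclass
from typing import List

WRITE_KEYWORDS = {"INSERT": "add", "DELETE": "delete", "UPDATE": "update"}


def request_type(sql: str) -> str:
    """Classify a statement by its leading keyword (simplified assumption)."""
    parts = sql.split(None, 1)
    if not parts:
        raise ValueError("empty statement")
    keyword = parts[0].upper()
    if keyword == "SELECT":
        return "query"
    if keyword in WRITE_KEYWORDS:
        return WRITE_KEYWORDS[keyword]
    raise ValueError(f"unsupported statement: {keyword}")


@dataclass
class Module:
    node_id: str
    pending_tasks: int = 0   # tasks running or queued on this module


def dispatch(sql: str, modules: List[Module]) -> List[str]:
    """Pick targets: all modules for writes, the least busy one for queries."""
    if request_type(sql) == "query":
        targets = [min(modules, key=lambda m: m.pending_tasks)]
    else:
        targets = list(modules)      # every node must apply the change
    for m in targets:
        m.pending_tasks += 1         # update the tracked idle status
    return [m.node_id for m in targets]


modules = [Module("primary", 2), Module("standby-1", 0), Module("standby-2", 1)]
query_targets = dispatch("SELECT * FROM t", modules)
write_targets = dispatch("update t set a = 1", modules)
```

Here the query lands on `standby-1` because it has no pending work, while the update is sent to all three modules so that every node ends up with the same data.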
  • Generating an execution result after the operation module completes the execution plan specifically includes:
  • the operation module receives the execution plan and first determines the storage location of the data involved in the execution plan;
  • the operation module sends the execution plan to the database on its node for execution. After execution completes, the query result is returned to the operation module, which records the time at which it received the query result as the query completion time T_end and returns the query result and T_end to the main operation module as the execution result.
  • Step S3 (returning the database operation results to the user) specifically includes:
  • the main operation module receives the execution result and determines its type;
  • if the execution result comes from an add, delete or update operation, then upon receiving the first execution result the main operation module returns the database operation result to the user, extracts and updates the data information and database operation information, records the completion time T_end, and updates the idle status of the operation module, where the database operation result indicates whether the data operation succeeded and/or the volume of data operated on;
  • the database operation result returned by an add, delete or update operation takes the form: add/delete/update successful, n rows added/deleted/updated.
  • if the execution result comes from a data query, then upon receiving the execution result the main operation module returns the query result to the user, records the time information T_end, and updates the idle status of the operation module, where the database operation result is the data query result.
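The result handling in S3 can be sketched as a handler that answers the user on the first execution result for a write and on the single result for a query. The `Pending` record, the reply strings, and the choice to silently absorb later write acknowledgements are assumptions made for this example; the patent only states that the reply is sent when the first result arrives.

```python
import time
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class Pending:
    """State the main module keeps for one in-flight request (assumed)."""
    kind: str                       # "add", "delete", "update" or "query"
    t_start: float                  # time tag T_start
    t_end: Optional[float] = None   # completion time T_end
    answered: bool = False


def on_result(req: Pending, result: Dict[str, Any]) -> Optional[str]:
    """Return the reply for the user, or None if one was already sent."""
    if req.answered:
        return None                 # later write acknowledgements are absorbed
    req.answered = True
    req.t_end = time.time()         # record the completion time T_end
    if req.kind == "query":
        return f"rows: {result['rows']}"
    return f"{req.kind} successful, {result['affected']} rows affected"


write = Pending("update", t_start=time.time())
first = on_result(write, {"affected": 3})    # first module to finish
second = on_result(write, {"affected": 3})   # a slower module
query = Pending("query", t_start=time.time())
answer = on_result(query, {"rows": [(1, "a")]})
```

Replying on the first write result keeps user-visible latency tied to the fastest module, while the remaining modules finish applying the same plan in the background.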
  • an embodiment of the present invention also provides an acceleration system for database master-slave synchronization operation, which includes a master operation unit and operation modules deployed on multiple database nodes.
  • the main operation unit includes a main operation module and a time module.
  • the time module is connected to the main operation module through an IO bus; the main operation module is connected to the user terminal and the operation module deployed on the database node.
  • the time module is used to obtain time information.
  • The main operation module specifically includes:
  • an input unit, used to receive request instructions from the user;
  • a first synchronization unit, used to obtain time information from the time module, stamp each client request received with a time tag, and synchronize time information between the main operation module and the operation modules deployed on the database nodes;
  • a first data operation execution unit, used to parse, optimize and compile the user request and generate the corresponding execution plan, and to collect statistics on the idle status of the operation modules deployed on the database nodes;
  • a first sending unit, used to send the execution plan to the operation modules deployed on the database nodes;
  • a first receiving unit, used to receive the execution results returned by the operation modules;
  • a first cache unit, used to cache the execution plan and the execution results returned by the operation modules;
  • a first data storage unit, used to store the idle status information of each operation module, the returned execution results, and data information;
  • a first node and operation device interface, used to connect the main operation module with the operation modules deployed on the database nodes;
  • the interface can be an Ethernet port, an IB port, etc.;
  • an output unit, used to return to the user the execution result corresponding to the user's request.
  • The operation module deployed on a database node specifically includes:
  • a second synchronization unit, used to obtain the time information sent by the main operation module and, after the operation module completes a database operation (add, delete, modify or read), stamp that operation with its completion time tag;
  • a second data operation execution unit, used to execute the execution plan received by the operation module;
  • a second sending unit, used to return the execution result to the main operation module;
  • a second receiving unit, used to receive the execution plan and the synchronized time information sent by the main operation module;
  • a second cache unit, used to cache the execution plan, the intermediate results of executing it, and the data information and data extracted from the database;
  • a second data storage unit, used to store the data information and data extracted from the database;
  • a second node and operation device interface, used to connect the main operation module with the operation module deployed on the database node.
  • The interface may be an Ethernet port, an IB port, etc.
  • the present invention downloads the calculations of database operations (add, delete and update) and the data that need to be synchronized to the operation module through the deployment and centralized management of the database operation module on the active and backup nodes, and the operation module completes the processing in a unified manner, and After the processing is completed, the database operation results and data are synchronized to the active and standby nodes in the database cluster through each operation module for execution; when the user reads the data, the main operation module parses and optimizes the read instructions, and performs the operations according to the requirements of each operation module.
  • the idle state is unified and coordinated, and tasks are distributed to idle operation modules for execution, so that all operation modules can respond to read requests.
  • the present invention at least has the following beneficial effects:
  • database operations previously performed by the database are offloaded to the operation module, which reduces the workload of the database itself and improves the read/write performance and efficiency of the database cluster;
  • while completing a database operation, the operation modules also complete data synchronization across the database nodes of the primary/standby cluster, and the main operation module guarantees data consistency through unified verification. This avoids the latency of synchronizing data between database nodes over the network and fundamentally solves the problem of the primary/standby synchronization mechanism limiting cluster performance. It provides a new method that effectively improves the real-time performance of data synchronization between the primary and standby database nodes as well as the performance of the database cluster;
  • the main operation module performs unified management, time synchronization, and load balancing for the operation modules deployed on the database nodes, making full and effective use of idle resources in the database cluster and further improving the efficiency and availability of data synchronization and of the cluster's overall operation.
  • the present invention enables the operation modules, while processing and accelerating database operations, to also complete data synchronization between the primary and standby database nodes and between the operation modules and the database nodes, and enables the primary and standby nodes to respond dynamically to read requests. On the basis of guaranteeing data consistency between the primary and standby nodes, the inter-node synchronous replication of data is eliminated, the real-time performance of data synchronization is improved, node computing resources are further saved, and the performance of the read/write services that the database system provides externally is improved; through unified management and dynamic coordination, load balancing of the system is achieved.
  • embodiments of the present invention may be provided as methods, systems, or computer program products.
  • the invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects.
  • the invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, magnetic disk storage, optical storage, etc.) embodying computer-usable program code therein.
  • These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • These computer program instructions may also be loaded onto a computer or other programmable data processing device, causing a series of operational steps to be performed on the computer or other programmable device to produce computer-implemented processing, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

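An acceleration method and system for primary/standby synchronization operations of a database. The method and system fundamentally improve the operating efficiency and read/write performance of a database cluster and solve the problems that existing data synchronization methods and read/write splitting schemes cause for a database cluster, including insufficient cluster performance, large response latency, and computing resources that are difficult to fully utilize, thereby greatly improving the cluster's performance in processing database operations.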

Description

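Acceleration method and system for primary/standby synchronization operations of a database
Technical Field
The present invention relates to the field of database technology, and in particular to an acceleration method and system for primary/standby synchronization operations of a database.
Background
In the field of database technology, data synchronization and read/write splitting are important means of improving the read/write performance and data synchronization efficiency of a database system and of ensuring data consistency, especially in a database cluster composed of one primary and multiple standby database nodes.
At present, data synchronization is mainly performed by having the primary database node complete the relevant database operations and then synchronizing the data through asynchronous or synchronous replication, which includes incremental synchronization and full synchronization. Both asynchronous and synchronous replication can achieve incremental data synchronization while ensuring data consistency. However, synchronization has to wait until the primary node has completed certain database operations, and only then can the primary database node replicate the data to the standby database nodes over the data transmission network. Data synchronization therefore still suffers from considerable latency, which seriously affects the operating efficiency and overall performance of the database cluster.
On this basis, read/write splitting was proposed to improve cluster performance: the primary database node handles write operations (data insertion, deletion, and modification), and the standby nodes handle read operations. This approach relieves much of the load on the primary database node and improves the read/write performance of the cluster. However, because the primary database node remains a performance bottleneck and the scheme is still constrained by current data synchronization methods, read/write splitting does not fundamentally resolve what limits cluster performance. The cluster still has performance bottlenecks, large response latency, and database node computing resources that are difficult to fully utilize. Specifically, the following problems exist:
1. Data synchronization is performed only after the database operation completes, so data synchronization has considerable latency;
2. All database operations are performed by the CPU of the primary database node, which consumes a large amount of primary-node resources; the primary node's performance bottleneck in turn lowers the performance of the whole cluster;
3. When data is read, users' read requests cannot be answered dynamically according to which nodes in the cluster are idle, so responses have considerable latency.
It is therefore necessary to introduce a new method and system that solves the problems caused by existing data synchronization methods and read/write splitting schemes, namely insufficient cluster performance, large response latency, and computing resources that are difficult to fully utilize, and thereby fundamentally improves the operating efficiency and read/write performance of the database cluster while guaranteeing data consistency.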
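Summary of the Invention
In view of the shortcomings of the prior art, the present invention proposes an acceleration method and system for primary/standby synchronization operations of a database. By adding and deploying database operation modules in a database cluster composed of primary and standby nodes, the computation of database operations (insert, delete, and update), the data to be synchronized, and the database read service are all offloaded to the operation modules, which solves the following problems:
First, in the present invention the database operation and the data synchronization can proceed and complete together, whereas conventionally data synchronization is performed only after the database operation completes, so data synchronization has considerable latency.
Second, the present invention frees database node resources, whereas in the prior art all database operations are performed by the CPU of the primary database node, which consumes a large amount of primary-node resources, and the primary node's performance bottleneck lowers the performance of the whole cluster.
Finally, the operation modules of the present invention are managed in a unified way and coordinated dynamically, which achieves load balancing, whereas in the prior art read requests cannot be answered dynamically according to which nodes in the cluster are idle, so responses have considerable latency.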
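In a first aspect, the present invention provides an acceleration method for primary/standby synchronization operations of a database, applied to a database cluster composed of primary and standby nodes. The method comprises: S1, an initialization operation: performing time information synchronization and downloading data from the database nodes; S2, receiving client requests in real time and completing the corresponding database operation and data synchronization according to the request type; S3, returning the database operation result to the client.
Step S1 (initialization operation: performing time information synchronization and downloading data from the database nodes) specifically comprises:
S1-1, the main operation module obtains time information through the time module and sends an initialization operation instruction to the operation modules on the primary and standby database nodes;
S1-2, after receiving the instruction, the operation modules deployed on the primary and standby database nodes each download data from their corresponding database node and extract the data information on that node;
S1-3, after the data download completes, the instruction completion result information and the time information are returned to the main operation module;
S1-4, the main operation module compares the instruction completion result information to determine whether the initialization data of the operation modules on the database nodes is consistent;
S1-5, when the initialization data is consistent, the data information on the database nodes is obtained and stored;
S1-6, the idle status of each operation module is collected, where the idle status includes whether a task is being executed, the number of tasks being executed, and the type of the task currently being executed.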
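Step S2 (receiving client requests in real time and completing the corresponding database operation and data synchronization according to the request type) specifically comprises:
S2-1, the main operation module receives client requests in real time and attaches a timestamp $T_{start}$;
S2-2, the main operation module extracts the SQL statement from the client request, parses, compiles, and optimizes the SQL, and generates an execution plan carrying the timestamp $T_{start}$;
S2-3, the main operation module distributes the execution plan to the operation modules on the database nodes for execution according to the request type, the idle status of each operation module on the database nodes, and the primary/standby relationship, where the request types include data insertion, deletion, update, and query;
S2-4, after executing the execution plan, the operation module returns the execution result.
Step S3 (returning the database operation result to the client) specifically comprises:
S3-1, the main operation module receives the execution result and determines its type;
S3-2, if the execution result is that of an insert, delete, or update database operation, the main operation module returns the database operation result to the client as soon as it receives the first execution result, and at the same time extracts and updates the data information and the database operation information, records the completion time $T_{end}$, and updates the idle status of the operation module; here the database operation result is whether the data operation succeeded and/or the amount of data operated on;
S3-3, if the execution result is a data query result, the main operation module returns the data query result to the client upon receiving the execution result, records the time information $T_{end}$, and updates the idle status of the operation module; here the database operation result is the data query result.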
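In a second aspect, the present invention provides an acceleration system for primary/standby synchronization operations of a database, comprising a main operation unit and operation modules deployed on a plurality of database nodes;
The main operation unit comprises a main operation module and a time module, the time module being connected to the main operation module through an IO bus; the main operation module is connected to the client and to the operation modules deployed on the database nodes;
The time module is used to obtain time information;
The main operation module specifically comprises:
an input unit, used to receive request instructions from the client;
a first synchronization unit, used to obtain time information from the time module and attach a timestamp to each client request received from the client, and to synchronize time information between the main operation module and the operation modules deployed on the database nodes;
a first data operation execution unit, used to parse, optimize, and compile the client request and generate the corresponding execution plan, and to collect the idle status information of the operation modules deployed on the database nodes;
a first sending unit, used to send the execution plan to the operation modules deployed on the database nodes;
a first receiving unit, used to receive the execution results returned by the operation modules;
a first cache unit, used to cache the execution plan and the execution results returned by the operation modules;
a first data storage unit, used to store the idle status information of each operation module, the returned execution results, and the data information;
a first node and operation device interface, used to provide an interface for the connection between the main operation module and the operation modules deployed on the database nodes;
an output unit, used to return to the client the execution result corresponding to the client request;
The operation module deployed on a database node specifically comprises:
a second synchronization unit, used to obtain the time information sent by the main operation module and, after the operation module completes a database operation, to attach to that operation a timestamp marking its completion;
a second data operation execution unit, used to execute the execution plan received by the operation module;
a second sending unit, used to return the execution result to the main operation module;
a second receiving unit, used to receive the execution plan and the synchronized time information sent by the main operation module;
a second cache unit, used to cache the execution plan and the intermediate results of its execution, as well as the data information and data extracted from the database;
a second data storage unit, used to store the data information and data extracted from the database;
a second node and operation device interface, used to provide an interface for the connection between the main operation module and the operation modules deployed on the database nodes.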
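As can be seen, by deploying database operation modules on the primary and standby nodes and managing them centrally, the present invention offloads the computation of database operations (insert, delete, and update) and the data to be synchronized to the operation modules, which process them in a unified way; after processing ends, each operation module synchronizes the database operation results and data to the primary and standby nodes in the database cluster for execution. When a user reads data, the main operation module parses and optimizes the read instruction, coordinates centrally according to the idle status of each operation module, and distributes the task to idle operation modules for execution, so that every operation module can respond to read requests. Compared with the prior art, the present invention has at least the following beneficial effects:
First, by using a self-designed operation module for accelerating primary/standby database synchronization operations, database operations previously performed by the database are offloaded to the operation module, which reduces the workload of the database itself and improves the read/write performance and efficiency of the database cluster;
Second, while completing a database operation, the operation modules also complete data synchronization across the database nodes of the primary/standby cluster, and the main operation module guarantees data consistency through unified verification. This avoids the latency of synchronizing data between database nodes over the network and fundamentally solves the problem of the primary/standby synchronization mechanism limiting cluster performance. It provides a new method that effectively improves the real-time performance of data synchronization between the primary and standby database nodes as well as the performance of the database cluster;
Finally, the main operation module performs unified management, time synchronization, and load balancing for the operation modules deployed on the database nodes, making full and effective use of idle resources in the database cluster and further improving the efficiency and availability of data synchronization and of the cluster's overall operation.
In summary, the present invention enables the operation modules, while processing and accelerating database operations, to also complete data synchronization between the primary and standby database nodes and between the operation modules and the database nodes, and enables the primary and standby nodes to respond dynamically to read requests. On the basis of guaranteeing data consistency between the primary and standby nodes, the inter-node synchronous replication of data is eliminated, the real-time performance of data synchronization is improved, node computing resources are further saved, and the performance of the read/write services that the database system provides externally is improved; through unified management and dynamic coordination, load balancing of the system is achieved.
Other features and advantages of the present invention will be set forth in the description that follows and will in part become apparent from the description or be understood by practicing the invention. The objectives and other advantages of the invention may be realized and obtained by the structures particularly pointed out in the written description, the claims, and the accompanying drawings.
The technical solutions of the present invention are described in further detail below with reference to the drawings and embodiments.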
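Brief Description of the Drawings
The accompanying drawings provide a further understanding of the present invention and constitute a part of the specification. Together with the embodiments of the invention they serve to explain the invention and do not limit it.
FIG. 1 is a schematic flowchart of the acceleration method for primary/standby synchronization operations of a database according to the present invention.
FIG. 2 is an architecture diagram of the acceleration system for primary/standby synchronization operations of a database according to the present invention.
FIG. 3 is a schematic structural diagram of the main operation module in the acceleration system according to the present invention.
FIG. 4 is a schematic structural diagram of the operation module in the acceleration system according to the present invention.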
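Detailed Description
Exemplary embodiments of the present disclosure are described in more detail below with reference to the drawings. Although the drawings show exemplary embodiments of the present disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure will be understood more thoroughly and its scope fully conveyed to those skilled in the art.
Referring to FIG. 1, an embodiment of the present invention provides an acceleration method for primary/standby synchronization operations of a database, applied to a database cluster composed of primary and standby nodes. The databases are mainly mainstream databases such as MySQL, Oracle, Informix, and DB2, and the cluster mainly consists of databases of a single type.
Preferably, in one embodiment, the acceleration method performs the following flow:
S101, initialize the time information and data of the operation modules, including performing time information synchronization and downloading data from the database nodes. The main operation module may obtain time information from the time module through the IO bus.
S102, determine whether the time information and data initialization succeeded; if not, return to S101 and initialize again; if so, proceed to the next step.
S103, the main operation module receives client requests in real time and attaches a timestamp.
S104, extract the SQL statement, parse, compile, and optimize it, and generate an execution plan carrying the timestamp.
S105, because data query operations are synchronized differently from other operations, determine whether the request type is a data query operation.
If it is not a data query operation, go to S106: the main operation module sends the execution plan to each operation module on the database nodes for execution and updates the idle status of each operation module.
S107, each operation module executes the execution plan and generates an execution result containing data information such as metadata, operation status, time consumed, and data volume, together with database operation information and the completion time.
S108, the execution result is returned to the main operation module and, at the same time, sent to the database on the database node where the operation module is deployed, so that the data on every database node is updated in step.
Finally, S122 is performed: the main operation module receives the execution result, returns the database operation result to the client, records the time information, and updates the idle status of the operation modules.
If it is a data query operation, go to S109: according to the tasks that each operation module on the database nodes is executing or has pending, the main operation module preferentially sends the execution plan to an operation module whose resources are idle or that has fewer remaining tasks, and updates the idle status of that operation module.
S120, after executing the execution plan, the operation module generates an execution result that includes the query result and the query completion time.
S121, the operation module returns the execution result to the main operation module.
Finally, S122 is likewise performed: the main operation module receives the execution result, returns the database operation result to the client, records the time information, and updates the idle status of the operation modules.
By using the operation modules, database operations previously performed by the database are offloaded to the operation modules, and while completing a database operation the operation modules also complete data synchronization across the database nodes of the primary/standby cluster.
As can be seen, the present invention enables the operation modules, while processing and accelerating database operations, to also complete data synchronization between the primary and standby database nodes and between the operation modules and the database nodes, and enables the primary and standby nodes to respond dynamically to read requests. On the basis of guaranteeing data consistency between the primary and standby nodes, the inter-node synchronous replication of data is eliminated, the real-time performance of data synchronization is improved, node computing resources are further saved, and the performance of the read/write services that the database system provides externally is improved; through unified management and dynamic coordination, load balancing of the system is achieved.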
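In summary, the acceleration method of the present invention can be summarized as follows: S1, an initialization operation: performing time information synchronization and downloading data from the database nodes; S2, receiving client requests in real time and completing the corresponding database operation and data synchronization according to the request type; S3, returning the database operation result to the client.
Preferably, step S1 (initialization operation: performing time information synchronization and downloading data from the database nodes) specifically comprises:
S1-1, the main operation module obtains time information through the time module and sends an initialization operation instruction to the operation modules on the primary and standby database nodes;
S1-2, after receiving the instruction, the operation modules deployed on the primary and standby database nodes each download data from their corresponding database node and extract the data information on that node;
S1-3, after the data download completes, the instruction completion result information and the time information are returned to the main operation module;
S1-4, the main operation module compares the instruction completion result information to determine whether the initialization data of the operation modules on the database nodes is consistent;
S1-5, when the initialization data is consistent, the data information on the database nodes is obtained and stored;
S1-6, the idle status of each operation module is collected, where the idle status includes whether a task is being executed, the number of tasks being executed, and the type of the task currently being executed.
Preferably, step S1-1 (the main operation module obtains time information through the time module and sends an initialization operation instruction to the operation modules on the primary and standby database nodes) specifically comprises:
S1-1-1, during the initialization operation, the main operation module obtains time information from the time module through the IO bus and records it as the database cluster initialization start time $T_{su}$; the time module is connected to the main operation module through the IO bus, and the time module's time information is updated in step with real time;
S1-1-2, an initialization operation instruction is sent to the operation modules on the primary and standby database nodes, together with the initialization start time $T_{su}$; the initialization operation instruction is an operation instruction generated by compiling SQL that can be executed directly by the database system, and it contains the data initialization ratio $P$:
$$P = \frac{CM_{max} \times P_{module}}{CM_{data}} \times 100\%$$
where $CM_{max}$ is the maximum storage space available to the operation module; $P_{module}$ is the preset proportion of storage space that the operation module is allowed to use, for example 60% or 80%; and $CM_{data}$ is the size of the storage space occupied by all data on the database node.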
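The ratio formula above can be illustrated with a short Python sketch. The function name `initialization_ratio`, the choice of units, and the cap at 100% are assumptions made for this illustration; the source only gives the formula and does not say what happens when the module's usable space exceeds the node's data size.

```python
def initialization_ratio(cm_max: float, p_module: float, cm_data: float) -> float:
    """Return the data initialization ratio P as a percentage.

    cm_max   -- maximum storage space available to the operation module
    p_module -- preset fraction of that space the module may use (e.g. 0.6, 0.8)
    cm_data  -- storage space occupied by all data on the database node
    cm_max and cm_data must use the same unit.
    """
    if cm_max < 0 or cm_data <= 0:
        raise ValueError("cm_max must be non-negative and cm_data positive")
    if not 0.0 <= p_module <= 1.0:
        raise ValueError("p_module must be a fraction between 0 and 1")
    ratio = (cm_max * p_module) / cm_data * 100.0
    # Assumption (not stated in the source): a module cannot load more than
    # all of the node's data, so the ratio is capped at 100%.
    return min(ratio, 100.0)


# A module with 64 units of space, allowed to use 80% of it, on a node
# holding 256 units of data, would load roughly one fifth of the data.
print(initialization_ratio(64, 0.8, 256))
```

Under these assumptions, a larger module or a smaller data set pushes the ratio toward 100%, at which point the module holds a full copy of the node's data.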
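Preferably, step S1-2 (after receiving the instruction, the operation modules deployed on the primary and standby database nodes each download data from their corresponding database node and extract the data information on that node) specifically comprises:
S1-2-1, after receiving the instruction, the operation module deployed on a primary or standby database node decomposes it to obtain the initialization start time $T_{su}$ and the initialization operation instruction;
S1-2-2, the initialization operation instruction is pushed down to the database on the same node for execution;
S1-2-3, the executing database obtains, according to the initialization operation instruction, a proportion of the data equal to the initialization ratio $P$ and downloads that data to the operation module deployed on the node;
S1-2-4, when data loading completes, the operation module obtains the time information synchronized from the main operation module and records it as the data load completion time $T_{SE}$;
S1-2-5, the data load completion time $T_{SE}$ and the operation instruction completion result information are returned to the main operation module; the operation instruction completion result information includes at least the following data: metadata, operation status, time consumed, and data volume.
Preferably, step S1-3 (after the data download completes, the instruction completion result information and the time information are returned to the main operation module) further comprises:
S1-3-1, after receiving the operation instruction completion result information, the main operation module extracts the data in it and compares it with the information returned by the operation module deployed on the primary node;
S1-3-2, if they are consistent, initialization succeeds, and the database cluster initialization end time $T_{SE}$ is recorded;
S1-3-3, if they are inconsistent, the data on the primary database node is backed up again to the inconsistent database node and step S1-3-2 is performed, until initialization succeeds and the database cluster initialization end time $T_{SE}$ is recorded.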
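The comparison in S1-3-1 to S1-3-3 can be sketched in Python as follows. This is only an illustration under stated assumptions: the `InitReport` type, its fields (in particular `metadata_digest` as a stand-in for the metadata), and the function `find_inconsistent_nodes` are hypothetical names, and the source does not specify how the completion result information is actually compared.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class InitReport:
    """Hypothetical completion report returned by one operation module."""
    node: str
    metadata_digest: str  # assumed summary of the loaded metadata
    data_size: int        # amount of data loaded
    status: str           # operation status, e.g. "ok"

    def fingerprint(self) -> Tuple[str, int, str]:
        # Fields assumed to be identical on every consistent node.
        return (self.metadata_digest, self.data_size, self.status)


def find_inconsistent_nodes(primary: InitReport,
                            standbys: List[InitReport]) -> List[str]:
    """Return the standby nodes whose report differs from the primary's."""
    return [r.node for r in standbys
            if r.fingerprint() != primary.fingerprint()]


primary = InitReport("primary", "abc123", 1024, "ok")
standbys = [
    InitReport("standby1", "abc123", 1024, "ok"),
    InitReport("standby2", "ffff00", 1000, "ok"),
]
# Nodes listed here would have the primary's data backed up to them again.
print(find_inconsistent_nodes(primary, standbys))
```

An empty list corresponds to the "initialization succeeds" branch; a non-empty list names the nodes that would be re-synchronized from the primary before the check is repeated.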
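Preferably, step S2 (receiving client requests in real time and completing the corresponding database operation and data synchronization according to the request type) specifically comprises:
S2-1, the main operation module receives client requests in real time and attaches a timestamp $T_{start}$;
S2-2, the main operation module extracts the SQL statement from the client request, parses, compiles, and optimizes the SQL, and generates an execution plan carrying the timestamp $T_{start}$;
S2-3, the main operation module distributes the execution plan to the operation modules on the database nodes for execution according to the request type, the idle status of each operation module on the database nodes, and the primary/standby relationship; the request types include data insertion, deletion, update, and query, and can be determined from the SQL statement in the request;
S2-4, after executing the execution plan, the operation module returns the execution result.
Preferably, step S2-3 (the main operation module distributes the execution plan to the operation modules on the database nodes for execution according to the request type, the idle status of each operation module, and the primary/standby relationship) specifically comprises:
S2-3-1, the main operation module determines and identifies the request type from the SQL statement;
S2-3-2, if the request type is an insert, delete, or update database operation, S2-3-3 is performed; if the request type is a data query, S2-3-4 is performed;
S2-3-3, according to the tasks that each operation module on the database nodes is executing or has pending, the main operation module sends the execution plan in turn to each operation module on the database nodes for execution and updates the idle status of each operation module; after executing the execution plan, each operation module generates an execution result, which includes at least the following data information: metadata and data volume, together with database operation information and the completion time $T_{end}$;
S2-3-4, according to the tasks that each operation module on the database nodes is executing or has pending, the main operation module preferentially sends the execution plan to an operation module whose resources are idle or that has fewer remaining tasks, and updates the idle status of that operation module; after executing the execution plan, the operation module generates an execution result, which includes the query result and the query completion time $T_{end}$.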
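The dispatch rule in S2-3-1 to S2-3-4 can be sketched in Python as follows. This is a simplified illustration, not the patented implementation: `ModuleStatus`, `classify_request`, and `dispatch` are hypothetical names, the request type is guessed from the first SQL keyword only, and the idle status is reduced to a single pending-task counter.

```python
from dataclasses import dataclass
from typing import List

WRITE_KEYWORDS = {"insert", "delete", "update"}


@dataclass
class ModuleStatus:
    """Hypothetical idle-status record kept by the main operation module."""
    name: str
    pending_tasks: int = 0


def classify_request(sql: str) -> str:
    """Classify a statement as 'write' or 'query' from its first keyword."""
    words = sql.split()
    keyword = words[0].lower() if words else ""
    if keyword in WRITE_KEYWORDS:
        return "write"
    if keyword == "select":
        return "query"
    raise ValueError(f"unsupported statement: {sql!r}")


def dispatch(sql: str, modules: List[ModuleStatus]) -> List[str]:
    """Pick target modules for a request and update their idle status."""
    if not modules:
        raise ValueError("no operation modules available")
    if classify_request(sql) == "write":
        # Writes go to every module so that all nodes stay in sync.
        targets = list(modules)
    else:
        # Queries go to the module with the fewest pending tasks.
        targets = [min(modules, key=lambda m: m.pending_tasks)]
    for module in targets:
        module.pending_tasks += 1
    return [module.name for module in targets]


modules = [ModuleStatus("primary", 2), ModuleStatus("standby1", 0),
           ModuleStatus("standby2", 1)]
print(dispatch("SELECT * FROM orders", modules))
print(dispatch("UPDATE orders SET paid = 1", modules))
```

Sending writes to every module is what lets the operation and the synchronization complete together, while routing queries to the least-loaded module is the load-balancing part of the scheme.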
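Preferably, in step S2-3-4, the operation module generating an execution result after executing the execution plan specifically comprises:
S2-3-4-1, upon receiving the execution plan, the operation module first determines where the data involved in the execution plan is stored;
S2-3-4-2, if all of the data is stored in the operation module, S2-3-4-3 is performed; if only part of the data is stored in the operation module, S2-3-4-4 is performed;
S2-3-4-3, once the operation module completes the query operation, it returns the query result and the query completion time information $T_{end}$ to the main operation module;
S2-3-4-4, the operation module sends the execution plan to the database on its node for execution; after execution ends, the query result is returned to the operation module, which records the time at which it received the query result as the query completion time information $T_{end}$ and returns the query result and the time information $T_{end}$ to the main operation module as the execution result.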
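The branch in S2-3-4-1 to S2-3-4-4 can be sketched in Python as follows. The function `run_query_plan` and its parameters are hypothetical, and data location is modeled at table granularity purely for illustration; the source does not say how the operation module tracks which data it holds.

```python
import time
from typing import Any, Callable, Dict, List, Set


def run_query_plan(required_tables: Set[str],
                   cached_tables: Set[str],
                   run_in_module: Callable[[], List[Any]],
                   run_in_database: Callable[[], List[Any]],
                   clock: Callable[[], float] = time.time) -> Dict[str, Any]:
    """Execute a query plan locally if possible, else push it to the database.

    required_tables -- tables the plan reads (assumed granularity)
    cached_tables   -- tables fully loaded into this operation module
    """
    if required_tables <= cached_tables:
        # All data is in the module: answer without touching the database.
        rows = run_in_module()
        source = "module"
    else:
        # Only part of the data is in the module: let the local database run it.
        rows = run_in_database()
        source = "database"
    # T_end is taken when the module has the result in hand.
    return {"rows": rows, "source": source, "t_end": clock()}


result = run_query_plan({"orders"}, {"orders", "users"},
                        run_in_module=lambda: [("order-1",)],
                        run_in_database=lambda: [],
                        clock=lambda: 5.0)
print(result)
```

The `clock` parameter stands in for the time information synchronized from the main operation module; it is injectable here only so the example is deterministic.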
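Preferably, step S3 (returning the database operation result to the client) specifically comprises:
S3-1, the main operation module receives the execution result and determines its type;
S3-2, if the execution result is that of an insert, delete, or update database operation, the main operation module returns the database operation result to the client as soon as it receives the first execution result, and at the same time extracts and updates the data information and the database operation information, records the completion time $T_{end}$, and updates the idle status of the operation module; here the database operation result is whether the data operation succeeded and/or the amount of data operated on. For example, the database operation result returned for an insert, delete, or update operation is: insert/delete/update succeeded, n rows inserted/deleted/updated.
S3-3, if the execution result is a data query result, the main operation module returns the data query result to the client upon receiving the execution result, records the time information $T_{end}$, and updates the idle status of the operation module; here the database operation result is the data query result.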
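The reply rule in S3-1 to S3-3 can be sketched in Python as follows. `PendingRequest` and `on_execution_result` are hypothetical names, and the result dictionaries are an assumed shape; the sketch only shows that the client is answered on the first execution result, and it leaves out the idle-status and data-information updates that the text also describes.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class PendingRequest:
    """Hypothetical bookkeeping for one client request."""
    kind: str                      # "write" or "query"
    replied: bool = False
    t_end: Optional[float] = None  # completion time recorded on first result


def on_execution_result(request: PendingRequest,
                        result: Dict[str, Any],
                        now: float) -> Optional[Dict[str, Any]]:
    """Return the reply for the client, or None if one was already sent."""
    if request.replied:
        # Later results from other modules do not trigger a second reply.
        return None
    request.replied = True
    request.t_end = now
    if request.kind == "write":
        # Writes report success and the number of rows operated on.
        return {"ok": result["ok"], "rows_affected": result["rows_affected"]}
    # Queries return the rows themselves.
    return {"rows": result["rows"]}


request = PendingRequest("write")
print(on_execution_result(request, {"ok": True, "rows_affected": 3}, now=10.0))
print(on_execution_result(request, {"ok": True, "rows_affected": 3}, now=10.2))
```

Replying on the first write result keeps client latency tied to the fastest module rather than the slowest; whether and how the remaining results are verified is not covered by this sketch.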
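Referring to FIG. 2, an embodiment of the present invention further provides an acceleration system for primary/standby synchronization operations of a database, comprising a main operation unit and operation modules deployed on a plurality of database nodes.
The main operation unit comprises a main operation module and a time module, the time module being connected to the main operation module through an IO bus; the main operation module is connected to the client and to the operation modules deployed on the database nodes.
Preferably, the time module is used to obtain time information.
In one embodiment, referring to FIG. 3, the main operation module specifically comprises:
an input unit, used to receive request instructions from the client;
a first synchronization unit, used to obtain time information from the time module and attach a timestamp to each client request received from the client, and to synchronize time information between the main operation module and the operation modules deployed on the database nodes;
a first data operation execution unit, used to parse, optimize, and compile the client request and generate the corresponding execution plan, and to collect the idle status information of the operation modules deployed on the database nodes;
a first sending unit, used to send the execution plan to the operation modules deployed on the database nodes;
a first receiving unit, used to receive the execution results returned by the operation modules;
a first cache unit, used to cache the execution plan and the execution results returned by the operation modules;
a first data storage unit, used to store the idle status information of each operation module, the returned execution results, and the data information;
a first node and operation device interface, used to provide an interface for the connection between the main operation module and the operation modules deployed on the database nodes; the interface may be an Ethernet port, an InfiniBand (IB) port, or the like;
an output unit, used to return to the client the execution result corresponding to the client request.
In one embodiment, referring to FIG. 4, the operation module deployed on a database node specifically comprises:
a second synchronization unit, used to obtain the time information sent by the main operation module and, after the operation module completes a database operation (insert, delete, modify, or read), to attach to that operation a timestamp marking its completion;
a second data operation execution unit, used to execute the execution plan received by the operation module;
a second sending unit, used to return the execution result to the main operation module;
a second receiving unit, used to receive the execution plan and the synchronized time information sent by the main operation module;
a second cache unit, used to cache the execution plan and the intermediate results of its execution, as well as the data information and data extracted from the database;
a second data storage unit, used to store the data information and data extracted from the database;
a second node and operation device interface, used to provide an interface for the connection between the main operation module and the operation modules deployed on the database nodes; the interface may be an Ethernet port, an InfiniBand (IB) port, or the like.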
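As can be seen, by deploying database operation modules on the primary and standby nodes and managing them centrally, the present invention offloads the computation of database operations (insert, delete, and update) and the data to be synchronized to the operation modules, which process them in a unified way; after processing ends, each operation module synchronizes the database operation results and data to the primary and standby nodes in the database cluster for execution. When a user reads data, the main operation module parses and optimizes the read instruction, coordinates centrally according to the idle status of each operation module, and distributes the task to idle operation modules for execution, so that every operation module can respond to read requests. Compared with the prior art, the present invention has at least the following beneficial effects:
First, by using a self-designed operation module for accelerating primary/standby database synchronization operations, database operations previously performed by the database are offloaded to the operation module, which reduces the workload of the database itself and improves the read/write performance and efficiency of the database cluster;
Second, while completing a database operation, the operation modules also complete data synchronization across the database nodes of the primary/standby cluster, and the main operation module guarantees data consistency through unified verification. This avoids the latency of synchronizing data between database nodes over the network and fundamentally solves the problem of the primary/standby synchronization mechanism limiting cluster performance. It provides a new method that effectively improves the real-time performance of data synchronization between the primary and standby database nodes as well as the performance of the database cluster;
Finally, the main operation module performs unified management, time synchronization, and load balancing for the operation modules deployed on the database nodes, making full and effective use of idle resources in the database cluster and further improving the efficiency and availability of data synchronization and of the cluster's overall operation.
In summary, the present invention enables the operation modules, while processing and accelerating database operations, to also complete data synchronization between the primary and standby database nodes and between the operation modules and the database nodes, and enables the primary and standby nodes to respond dynamically to read requests. On the basis of guaranteeing data consistency between the primary and standby nodes, the inter-node synchronous replication of data is eliminated, the real-time performance of data synchronization is improved, node computing resources are further saved, and the performance of the read/write services that the database system provides externally is improved; through unified management and dynamic coordination, load balancing of the system is achieved.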
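Those skilled in the art will appreciate that embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage and optical storage) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, causing a series of operational steps to be performed on the computer or other programmable device to produce computer-implemented processing, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Obviously, those skilled in the art can make various changes and variations to the present invention without departing from its spirit and scope. Thus, if these modifications and variations fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.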

Claims (10)

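  1. An acceleration method for primary/standby synchronization operations of a database, applied to a database cluster composed of primary and standby nodes, characterized in that the method comprises:
    S1, performing an initialization operation: performing time information synchronization and downloading data from the database nodes;
    S2, receiving client requests in real time and completing the corresponding database operation and data synchronization according to the request type;
    S3, returning the database operation result to the client.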
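  2. The method according to claim 1, wherein step S1 (performing an initialization operation: performing time information synchronization and downloading data from the database nodes) specifically comprises:
    S1-1, the main operation module obtains time information through the time module and sends an initialization operation instruction to the operation modules on the primary and standby database nodes;
    S1-2, after receiving the instruction, the operation modules deployed on the primary and standby database nodes each download data from their corresponding database node and extract the data information on that node;
    S1-3, after the data download completes, the instruction completion result information and the time information are returned to the main operation module;
    S1-4, the main operation module compares the instruction completion result information to determine whether the initialization data of the operation modules on the database nodes is consistent;
    S1-5, when the initialization data is consistent, the data information on the database nodes is obtained and stored;
    S1-6, the idle status of each operation module is collected, where the idle status includes whether a task is being executed, the number of tasks being executed, and the type of the task currently being executed.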
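  3. The method according to claim 2, wherein step S1-1 (the main operation module obtains time information through the time module and sends an initialization operation instruction to the operation modules on the primary and standby database nodes) specifically comprises:
    S1-1-1, during the initialization operation, the main operation module obtains time information from the time module through the IO bus and records it as the database cluster initialization start time $T_{su}$; the time module is connected to the main operation module through the IO bus, and the time module's time information is updated in step with real time;
    S1-1-2, an initialization operation instruction is sent to the operation modules on the primary and standby database nodes, together with the initialization start time $T_{su}$; the initialization operation instruction is an operation instruction generated by compiling SQL that can be executed directly by the database system, and it contains the data initialization ratio $P$:
    $$P = \frac{CM_{max} \times P_{module}}{CM_{data}} \times 100\%$$
    where $CM_{max}$ is the maximum storage space available to the operation module;
    $P_{module}$ is the preset proportion of storage space that the operation module is allowed to use;
    $CM_{data}$ is the size of the storage space occupied by all data on the database node.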
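  4. The method according to claim 3, wherein step S1-2 (after receiving the instruction, the operation modules deployed on the primary and standby database nodes each download data from their corresponding database node and extract the data information on that node) specifically comprises:
    S1-2-1, after receiving the instruction, the operation module deployed on a primary or standby database node decomposes it to obtain the initialization start time $T_{su}$ and the initialization operation instruction;
    S1-2-2, the initialization operation instruction is pushed down to the database on the same node for execution;
    S1-2-3, the executing database obtains, according to the initialization operation instruction, a proportion of the data equal to the initialization ratio $P$ and downloads that data to the operation module deployed on the node;
    S1-2-4, when data loading completes, the operation module obtains the time information synchronized from the main operation module and records it as the data load completion time $T_{SE}$;
    S1-2-5, the data load completion time $T_{SE}$ and the operation instruction completion result information are returned to the main operation module; the operation instruction completion result information includes at least the following data: metadata, operation status, time consumed, and data volume.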
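  5. The method according to claim 2, wherein step S1-3 (after the data download completes, the instruction completion result information and the time information are returned to the main operation module) further comprises:
    S1-3-1, after receiving the operation instruction completion result information, the main operation module extracts the data in it and compares it with the information returned by the operation module deployed on the primary node;
    S1-3-2, if they are consistent, initialization succeeds, and the database cluster initialization end time $T_{SE}$ is recorded;
    S1-3-3, if they are inconsistent, the data on the primary database node is backed up again to the inconsistent database node and step S1-3-2 is performed, until initialization succeeds and the database cluster initialization end time $T_{SE}$ is recorded.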
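  6. The method according to claim 1, wherein step S2 (receiving client requests in real time and completing the corresponding database operation and data synchronization according to the request type) specifically comprises:
    S2-1, the main operation module receives client requests in real time and attaches a timestamp $T_{start}$;
    S2-2, the main operation module extracts the SQL statement from the client request, parses, compiles, and optimizes the SQL, and generates an execution plan carrying the timestamp $T_{start}$;
    S2-3, the main operation module distributes the execution plan to the operation modules on the database nodes for execution according to the request type, the idle status of each operation module on the database nodes, and the primary/standby relationship, where the request types include data insertion, deletion, update, and query;
    S2-4, after executing the execution plan, the operation module returns the execution result.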
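  7. The method according to claim 6, wherein step S2-3 (the main operation module distributes the execution plan to the operation modules on the database nodes for execution according to the request type, the idle status of each operation module, and the primary/standby relationship) specifically comprises:
    S2-3-1, the main operation module determines and identifies the request type from the SQL statement;
    S2-3-2, if the request type is an insert, delete, or update database operation, S2-3-3 is performed; if the request type is a data query, S2-3-4 is performed;
    S2-3-3, according to the tasks that each operation module on the database nodes is executing or has pending, the main operation module sends the execution plan in turn to each operation module on the database nodes for execution and updates the idle status of each operation module; after executing the execution plan, each operation module generates an execution result, which includes at least the following data information: metadata and data volume, together with database operation information and the completion time $T_{end}$;
    S2-3-4, according to the tasks that each operation module on the database nodes is executing or has pending, the main operation module preferentially sends the execution plan to an operation module whose resources are idle or that has fewer remaining tasks, and updates the idle status of that operation module; after executing the execution plan, the operation module generates an execution result, which includes the query result and the query completion time $T_{end}$.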
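  8. The method according to claim 7, wherein in step S2-3-4 the operation module generating an execution result after executing the execution plan specifically comprises:
    S2-3-4-1, upon receiving the execution plan, the operation module first determines where the data involved in the execution plan is stored;
    S2-3-4-2, if all of the data is stored in the operation module, S2-3-4-3 is performed; if only part of the data is stored in the operation module, S2-3-4-4 is performed;
    S2-3-4-3, once the operation module completes the query operation, it returns the query result and the query completion time information $T_{end}$ to the main operation module;
    S2-3-4-4, the operation module sends the execution plan to the database on its node for execution; after execution ends, the query result is returned to the operation module, which records the time at which it received the query result as the query completion time information $T_{end}$ and returns the query result and the time information $T_{end}$ to the main operation module as the execution result.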
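  9. The method according to claim 1, wherein step S3 (returning the database operation result to the client) specifically comprises:
    S3-1, the main operation module receives the execution result and determines its type;
    S3-2, if the execution result is that of an insert, delete, or update database operation, the main operation module returns the database operation result to the client as soon as it receives the first execution result, and at the same time extracts and updates the data information and the database operation information, records the completion time $T_{end}$, and updates the idle status of the operation module; here the database operation result is whether the data operation succeeded and/or the amount of data operated on;
    S3-3, if the execution result is a data query result, the main operation module returns the data query result to the client upon receiving the execution result, records the time information $T_{end}$, and updates the idle status of the operation module; here the database operation result is the data query result.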
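  10. An acceleration system for primary/standby synchronization operations of a database, characterized in that the system comprises a main operation unit and operation modules deployed on a plurality of database nodes;
    the main operation unit comprises a main operation module and a time module, the time module being connected to the main operation module through an IO bus; the main operation module is connected to the client and to the operation modules deployed on the database nodes;
    wherein the time module is used to obtain time information;
    wherein the main operation module specifically comprises:
    an input unit, used to receive request instructions from the client;
    a first synchronization unit, used to obtain time information from the time module and attach a timestamp to each client request received from the client, and to synchronize time information between the main operation module and the operation modules deployed on the database nodes;
    a first data operation execution unit, used to parse, optimize, and compile the client request and generate the corresponding execution plan, and to collect the idle status information of the operation modules deployed on the database nodes;
    a first sending unit, used to send the execution plan to the operation modules deployed on the database nodes;
    a first receiving unit, used to receive the execution results returned by the operation modules;
    a first cache unit, used to cache the execution plan and the execution results returned by the operation modules;
    a first data storage unit, used to store the idle status information of each operation module, the returned execution results, and the data information;
    a first node and operation device interface, used to provide an interface for the connection between the main operation module and the operation modules deployed on the database nodes;
    an output unit, used to return to the client the execution result corresponding to the client request;
    wherein the operation module deployed on a database node specifically comprises:
    a second synchronization unit, used to obtain the time information sent by the main operation module and, after the operation module completes a database operation, to attach to that operation a timestamp marking its completion;
    a second data operation execution unit, used to execute the execution plan received by the operation module;
    a second sending unit, used to return the execution result to the main operation module;
    a second receiving unit, used to receive the execution plan and the synchronized time information sent by the main operation module;
    a second cache unit, used to cache the execution plan and the intermediate results of its execution, as well as the data information and data extracted from the database;
    a second data storage unit, used to store the data information and data extracted from the database;
    a second node and operation device interface, used to provide an interface for the connection between the main operation module and the operation modules deployed on the database nodes.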
PCT/CN2022/139851 2022-06-29 2022-12-19 Acceleration method and system for primary/standby synchronization operations of a database WO2024001079A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210751846.3A CN114942965B (zh) 2022-06-29 2022-06-29 Acceleration method and system for primary/standby synchronization operations of a database
CN202210751846.3 2022-06-29

Publications (1)

Publication Number Publication Date
WO2024001079A1 true WO2024001079A1 (zh) 2024-01-04

Family

ID=82911068

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/139851 WO2024001079A1 (zh) 2022-06-29 2022-12-19 Acceleration method and system for primary/standby synchronization operations of a database

Country Status (2)

Country Link
CN (1) CN114942965B (zh)
WO (1) WO2024001079A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
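CN114942965B (zh) * 2022-06-29 2022-12-16 北京柏睿数据技术股份有限公司 Acceleration method and system for primary/standby synchronization operations of a database
CN116303791A (zh) * 2023-03-22 2023-06-23 合肥申威睿思信息科技有限公司 Data synchronization method and apparatus based on an acceleration system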

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170075965A1 (en) * 2015-09-16 2017-03-16 Turn Inc. Table level distributed database system for big data storage and query
US20200349172A1 (en) * 2019-04-30 2020-11-05 Microsoft Technology Licensing, Llc Managing code and data in multi-cluster environments
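CN113779087A (zh) * 2021-09-09 2021-12-10 苏州浪潮智能科技有限公司 Method and system for database high availability based on remote direct memory access
CN114328743A (zh) * 2021-12-30 2022-04-12 瀚高基础软件股份有限公司 Method and system for implementing peer-to-peer services in a cluster
CN114942965A (zh) * 2022-06-29 2022-08-26 北京柏睿数据技术股份有限公司 Acceleration method and system for primary/standby synchronization operations of a database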

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
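CN107277144B (zh) * 2017-06-22 2021-02-09 浙江力石科技股份有限公司 Distributed high-concurrency cloud storage database system and load balancing method thereof
CN108090222B (zh) * 2018-01-05 2020-07-07 中国科学院计算技术研究所 Data synchronization system between database cluster nodes
CN110209726B (zh) * 2018-02-12 2023-10-20 金篆信科有限责任公司 Distributed database cluster system, data synchronization method, and storage medium
US20210216502A1 (en) * 2020-01-09 2021-07-15 Salesforce.Com, Inc. System and method for synchronizing delete operations between primary and secondary databases
CN111858097A (zh) * 2020-07-22 2020-10-30 安徽华典大数据科技有限公司 Distributed database system and database access method
CN112417033A (zh) * 2020-10-19 2021-02-26 中国科学院计算机网络信息中心 Method and system for implementing multi-node data consistency in a distributed graph database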

Also Published As

Publication number Publication date
CN114942965A (zh) 2022-08-26
CN114942965B (zh) 2022-12-16


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22949155

Country of ref document: EP

Kind code of ref document: A1