WO2024114105A1 - Database data update method, system, and computing device cluster - Google Patents

Database data update method, system, and computing device cluster

Info

Publication number
WO2024114105A1
WO2024114105A1 · PCT/CN2023/123472
Authority
WO
WIPO (PCT)
Prior art keywords
data
memory database
database cluster
cluster
master node
Prior art date
Application number
PCT/CN2023/123472
Other languages
English (en)
French (fr)
Inventor
舒熙 (SHU, Xi)
余汶龙 (YU, Wenlong)
杨科伟 (YANG, Kewei)
梁潇 (LIANG, Xiao)
Original Assignee
华为云计算技术有限公司 (Huawei Cloud Computing Technologies Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为云计算技术有限公司 (Huawei Cloud Computing Technologies Co., Ltd.)
Publication of WO2024114105A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/0804 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches, with main memory updating
    • G06F 16/18 File systems; File servers; file system types
    • G06F 16/23 Information retrieval of structured data, e.g. relational data; updating
    • G06F 16/2455 Query processing; query execution
    • G06F 16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Definitions

  • the data storage module of the in-memory database cluster obtains write operations from the log file;
  • the data storage module is deployed on the master node of the in-memory database cluster, on a slave node, or on a node other than the master and slave nodes.
  • the solution shown in this application also includes:
  • a master node of the in-memory database cluster receives multiple write requests, each of the multiple write requests records a write operation, and the multiple write requests include multiple write operations received at different times for updating data of the same object;
  • the master node stores the above multiple write operations in log files in the shared storage of the memory database cluster;
  • the data storage module of the memory database cluster obtains the above multiple write operations from the log file
  • the data storage module stores the latest data in the above multiple write operations in a data file in the shared storage of the memory database cluster.
  • the log file may save the operation information of the master node and metadata information such as the serial number corresponding to the operation.
  • the log file may be an AOF file (Append Only File).
  • the database state saved in the data file and the log file is exactly the same.
  • the data file can be regarded as a snapshot of the database at a certain moment, recording the latest data in the write operation in the log file.
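The write path and the log/data-file relationship described above can be sketched as follows. This is an illustrative sketch only, not the patent's implementation; the JSON-lines layout and the field names `seq` and `cmd` are assumptions for the example.

```python
import json

def append_write_op(log_path, seqno, command):
    """Append one write operation, tagged with its sequence number
    (the per-operation metadata the log file keeps), to an
    append-only log file: one JSON record per line."""
    with open(log_path, "a") as f:
        f.write(json.dumps({"seq": seqno, "cmd": command}) + "\n")

def read_write_ops(log_path):
    """Replay the log: return the recorded operations in order,
    as the data storage module (or a slave node) would read them."""
    with open(log_path) as f:
        return [json.loads(line) for line in f if line.strip()]
```

A reader of this log sees every write in arrival order; the data file, by contrast, would keep only the last value per object.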
  • the present application provides a database system, which is an in-memory database cluster and uses Redis as a cache server.
  • the system includes multiple nodes and shared storage.
  • the multiple nodes specifically include a master node and multiple slave nodes.
  • Some application scenarios may also include one or more other nodes.
  • the master node is used to deploy the Redis master instance, and the slave node is used to deploy the Redis slave instance.
  • the Redis master instance of the master node can receive the user's read and write requests and record the write operations to the log file in the shared storage; the Redis slave instance of the slave node can keep data synchronized with the master node by continuously reading the log files in the shared storage and replaying them.
  • the database system includes:
  • a master node of the in-memory database cluster used to receive a write request, wherein the write request records a write operation
  • a master node configured to store the write operation in a log file in a shared storage of the memory database cluster
  • a data storage module of the memory database cluster used for obtaining the write operation from the log file
  • the data storage module is used to store the latest data in the write operation into a data file in the shared storage of the memory database cluster.
  • the database system provided by the present application further includes: a slave node of the memory database cluster, configured to read the log file in the shared storage to respond to a read request of the memory database cluster.
  • when the master node is switched to a new slave node, the new slave node is used to read the log file in the shared storage in order to respond to the read request of the memory database cluster; when a slave node is switched to a new master node, the new master node is used to receive a new write request and store the write operation in the new write request to the log file in the shared storage.
  • the data storage module can be deployed on the master node of the memory database cluster, on a slave node, or on a node other than the master and slave nodes.
  • the database system provided by this application also includes:
  • the data storage module of the memory database cluster is used to obtain the above multiple write operations from the log file;
  • the data storage module is used to store the latest data in the above multiple write operations into a data file in the shared storage of the memory database cluster.
  • the present application provides a computing device cluster, comprising at least one computing device, each computing device comprising a processor and a memory; the processor of the at least one computing device is used to execute instructions stored in the memory of the at least one computing device, so that the computing device cluster executes a method for updating database data provided in the first aspect or any possible implementation of the first aspect.
  • the present application provides a computer program product comprising instructions, which, when executed by a computing device cluster, enables the computing device cluster to execute a method for updating database data as provided in the first aspect or any possible implementation of the first aspect.
  • the present application provides a computer-readable storage medium comprising computer program instructions which, when executed by a computing device cluster, cause the computing device cluster to execute a method for updating database data as provided in the first aspect or any possible implementation of the first aspect.
  • FIG. 1 is a schematic diagram of the architecture of an in-memory database system provided in an embodiment of the present application.
  • FIG. 2 is a flow chart of a method for updating data in a database provided in an embodiment of the present application.
  • FIG. 3 is a flow chart of a master-slave switching method when a database fails, provided in an embodiment of the present application.
  • FIG. 4 is a flow chart of a database cluster reconstruction method provided in an embodiment of the present application.
  • FIG. 5 is a schematic diagram of a computing device provided in an embodiment of the present application.
  • FIG. 6 is a schematic diagram of a computing device cluster provided in an embodiment of the present application.
  • FIG. 7 is a schematic diagram of an implementation method of a computing device cluster provided in an embodiment of the present application.
  • Remote Dictionary Server (Redis): a high-performance storage system that supports multiple data structures such as key-value, and provides direct access services for data types such as string, hash, list, set structures (Set, Sorted Set), and stream. Data reading and writing can be based on memory or persisted to disk. As a cache service, Redis can greatly reduce the load pressure on the backend databases and applications in the database system. An in-memory database system cluster using the Redis storage system can provide support for frequently requested items with sub-millisecond response time.
  • Fork: used to create a child process for a process.
  • the child process runs at the same time as the parent process and uses the same data as the parent process.
  • the forked child process points to the same memory address space as the parent process.
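The shared-then-private behavior of fork can be demonstrated with a minimal POSIX-only sketch (`os.fork` is unavailable on Windows); the sample dictionary and the exit-code convention are illustrative assumptions, not part of the patent:

```python
import os

def fork_shares_parent_data():
    """Fork a child; the child initially sees the same data as the
    parent (the address spaces start out identical), while writes
    made after the fork stay private to each process."""
    data = {"key": "value"}
    pid = os.fork()
    if pid == 0:                       # child process
        ok = data == {"key": "value"}  # same data as the parent
        data["key"] = "child"          # private after this write
        os._exit(0 if ok else 1)
    _, status = os.waitpid(pid, 0)     # parent: wait for the child
    # Parent's copy is untouched by the child's write.
    return os.WEXITSTATUS(status) == 0 and data["key"] == "value"
```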
  • Rewrite: a function that creates a new file to replace an old file.
  • the user's write request will be recorded in the log file for data persistence.
  • the Redis server can use this function to create a new log file to replace the existing log file.
  • the database status saved in the new and old log files is exactly the same, but the new log file after rewriting only contains the minimum commands required to rebuild the current data set, and does not contain any redundant commands that waste space.
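A simplified model of the rewrite function, restricted to hypothetical SET/DEL commands (a tiny subset chosen for illustration), shows how the new log can contain only the minimum commands needed to rebuild the current data set:

```python
def rewrite_log(commands):
    """Rewrite an append-only command log into the minimal command
    set that rebuilds the same final data set (simplified model:
    SET and DEL commands only)."""
    state = {}
    for op, key, *val in commands:
        if op == "SET":
            state[key] = val[0]       # later SETs overwrite earlier ones
        elif op == "DEL":
            state.pop(key, None)      # deleted keys leave no command behind
    # One SET per surviving key suffices to rebuild the data set.
    return [("SET", k, v) for k, v in state.items()]
```

Redundant commands (the overwritten SET and the SET/DEL pair) disappear, which is why the rewritten file is smaller while saving exactly the same database state.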
  • In-memory database: a database that operates directly on data in memory. Compared with disks, the data read and write speed of memory is several orders of magnitude higher, so storing data in memory can greatly improve the performance of applications compared with accessing it from disk.
  • the database that uses the Redis cache service is an in-memory database.
  • the in-memory database can adopt a master-slave architecture. An in-memory database in this mode is called an in-memory database cluster.
  • the embodiment of the present application relates to a scenario in which a memory database cluster using a Redis cache service is used for data persistence.
  • the Redis system there are usually two ways for the Redis system to persist data: one is that Redis forks a child process, which generates a time point snapshot of a data set within a specified time interval; the other is that Redis records the write operation to the memory data in the form of a log, and when the log file volume is large, forks a child process, which rewrites the log file into a new log file containing only the minimum set of commands required to restore the current data set.
  • because both of the above persistence methods require a child process to be forked, Redis cannot maximize the use of memory, and performance jitter occurs.
  • the slave node replicates and synchronizes the data of the master node through the network. Since the replication synchronization is not real-time, when the master node fails and a master-slave switch is performed, any data not yet synchronized to the slave node is lost; that is, there is a risk of data loss.
  • Fig. 1 is a schematic diagram of the architecture of an in-memory database system provided in an embodiment of the present application.
  • the in-memory database system uses Redis as a cache service and includes multiple nodes and shared storage; the multiple nodes specifically include a master node and multiple slave nodes, and some application scenarios may also include one or more other nodes.
  • FIG. 1 shows, by way of example, only two slave nodes and one other node.
  • the master node is used to deploy the Redis master instance, and the slave node is used to deploy the Redis slave instance.
  • the Redis master instance of the master node can receive the user's read and write requests and record the write operations to the log file in the shared storage; the Redis slave instance of the slave node can keep data synchronized with the master node by continuously reading the log file in the shared storage and replaying it.
  • the in-memory database system shown in FIG. 1 further includes a data storage module, which can be deployed on the master node, on a slave node, or on a node other than the master and slave nodes.
  • the data storage module can continuously read log files in the shared storage, and convert them into data files and store them in the shared storage, wherein the data file can be regarded as a snapshot of the database at a certain moment, recording the latest data in the write operation in the log file.
  • a database data update method provided by the present application can adopt the memory database system architecture shown in Figure 1.
  • the method realizes Redis data persistence by using shared storage, and replaces the rewrite operation of Redis with a newly added data storage module.
  • the Redis master node does not need to perform the rewrite operation, thus avoiding the use of Fork and the performance jitter problem caused by the Fork process.
  • data synchronization is realized between the master and slave nodes through the log files in the shared storage. In the master-slave switching scenario, the slave node can still read the latest data from the log file without the risk of data loss.
  • a method for updating database data provided by the present application is described in detail below in conjunction with the flowchart shown in FIG2 .
  • the method is based on the in-memory database system shown in FIG1 , and specifically includes the following steps 201 to 204 .
  • Step 201 The master node receives a write request.
  • the write request records the write operation that the client requires the master node to perform.
  • Step 202 The master node stores the write operation in a log file in the shared storage of the memory database cluster.
  • after the master node updates the data in the memory according to the write operation in the received write request, it records the write operation in a log file and stores the log file in the shared storage of the database cluster.
  • the log file can be any form of log file such as an AOF file (Append Only File), which is not specifically limited here.
  • the log file stores the operation information of the master node and metadata information such as the sequence number corresponding to the operation.
  • Step 203 The data storage module obtains a write operation from the log file.
  • the data storage module can directly access the log files and obtain the write operations therein.
  • Step 204 The data storage module stores the latest data in the write operation into a data file in the shared storage.
  • the data storage module is required to convert the log file into a data file, where the data file saves exactly the same database state as the log file, but the data file can directly record the latest data of the write operations using data structures such as key-value, and therefore occupies less space than the log file.
  • specifically, the conversion of the log file into the data file is realized by the data storage module storing the latest data in the write operations into the data file in the shared storage.
  • the data file can be a snapshot of the database at a certain moment.
  • the data storage module may also delete the converted log file to avoid the infinite expansion of the log file and occupying too much space resources.
  • when a user terminal issues a read request, the slave node can directly read the log file in the shared storage and provide the data information read from the log file to the client.
  • the slave node achieves consistency with the master node data through shared storage.
  • the write request received by the master node in step 201 may be multiple write requests received at different times and containing multiple write operations for updating data of the same object.
  • the data storage module in step 204 stores the latest data of the multiple write operations in the data file.
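The compaction performed by the data storage module in step 204 can be sketched as follows; the `(seq, key, value)` tuple layout and the `max_seq`/`data` snapshot fields are assumptions for illustration, not the patent's on-disk format:

```python
def build_data_file(log_entries):
    """Compact logged write operations into a data-file snapshot:
    only the latest value per object survives, and the snapshot
    records the highest sequence number it covers."""
    snapshot, max_seq = {}, 0
    for seq, key, value in log_entries:
        snapshot[key] = value          # later writes overwrite earlier ones
        max_seq = max(max_seq, seq)
    return {"max_seq": max_seq, "data": snapshot}
```

Recording `max_seq` is what later lets a recovering node (or the data storage module itself) replay only the log entries the data file has not yet absorbed.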
  • FIG. 3 is a flow chart of the master-slave switching method provided by the present application, including a flow chart (a) of downgrading the master node to a new slave node and a flow chart (b) of upgrading a slave node to a new master node.
  • the method for switching from a slave node to a new master node includes the following steps 301 to 302:
  • Step 301 The slave node reads the log file in the shared storage.
  • the slave node needs to first read all log files in the shared storage to ensure that the data in the memory is consistent with that before the master node fails, and no data is lost.
  • Step 302 After reading all log files in the shared storage, the slave node switches its own state and provides read and write services to the outside.
  • after the slave node has read all log files in the shared storage, the data in its memory is synchronized with all the latest data before the master node failed.
  • the slave node can complete the switching of its own state by obtaining a lease for shared storage, that is, it is converted from a state that can only respond to external read requests to a state that can provide read and write services to the outside world. This also means that the slave node has been switched to a new master node.
  • the new master node can then receive a new write request, and store the write operation in the new write request into a log file in the shared storage.
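Steps 301 and 302 can be sketched as below. Purely as an assumption for the example, the lease is modeled by atomic exclusive file creation in the shared storage; a real system would use the storage service's own lease mechanism.

```python
import os

def try_acquire_lease(lease_path):
    """Attempt to take the shared-storage lease via atomic,
    exclusive file creation; only one node can succeed."""
    try:
        fd = os.open(lease_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        os.close(fd)
        return True
    except FileExistsError:
        return False

def promote_slave(memory, log_entries, lease_path):
    """Step 301: replay every remaining log entry so no data is lost.
    Step 302: switch state by taking the lease; return the new role."""
    for _, key, value in log_entries:
        memory[key] = value
    return "master" if try_acquire_lease(lease_path) else "slave"
```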
  • the method for switching the master node to a new slave node includes the following steps 303 to 304:
  • Step 303 The master node reads all data files in the shared storage to restore the memory data to the full data at a certain time point t.
  • after the master node fails and restarts, some memory data will be lost. At this time, the master node needs to first read all data files in the shared storage to restore the memory data to the full data at a certain time point t.
  • this step may also include the master node obtaining the maximum sequence number of the data in all data files in the shared storage.
  • Step 304 The master node reads the data after time point t in the log file in the shared storage.
  • after completing data recovery, the master node acts as a new slave node and can read log files in the shared storage and respond to new read requests from the client.
  • in step 303, the master node has already restored the full amount of data at a certain time point t by reading the data files.
  • therefore, as a new slave node, the master node only needs to read the data after the time point t in the log file.
  • the master node can read the data after the time point t in the log file in the shared storage by reading the data whose sequence number in the log file of the shared storage is after the maximum sequence number of the data in the data file.
  • in some scenarios, the failed master node has not been restarted.
  • in this case, the method for switching the master node to a new slave node only needs to execute step 304; that is, the master node, acting as a new slave node, reads the log file in the shared storage and responds to new read requests from the user end.
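Steps 303 and 304 amount to restoring a snapshot and then replaying only the log tail; a sketch under the assumed `(seq, key, value)` log layout and `max_seq` snapshot field (illustrative names, not the patent's formats):

```python
def recover_memory(data_file, log_entries):
    """Step 303: restore memory from the data-file snapshot, i.e. the
    full data at time point t. Step 304: replay only the log entries
    whose sequence number exceeds the snapshot's maximum sequence
    number, since everything up to that point is already covered."""
    memory = dict(data_file["data"])
    for seq, key, value in log_entries:
        if seq > data_file["max_seq"]:
            memory[key] = value
    return memory
```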
  • the present application also provides a reconstruction method for the memory database system shown in FIG1 , and FIG4 is a flow chart of the reconstruction method.
  • the reconstruction method includes a data storage module reconstruction method and a master-slave node reconstruction method, wherein:
  • the data storage module reconstruction method includes: the data storage module reads the data in the shared storage log file which had not been read by the data storage module before the cluster failure, and writes the data into the shared storage data file.
  • the data storage module can first read the maximum sequence number of the data in the shared storage data file, and then read the data in the shared storage log file whose sequence number is after the maximum sequence number of the data in the data file, to read the data in the shared storage log file that has not been read by the data storage module before the cluster failure.
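The data storage module's catch-up after a failure can be sketched in the same assumed format: read the data file's maximum sequence number first, then fold in only the log entries beyond it.

```python
def rebuild_data_file(data_file, log_entries):
    """Bring the data file up to date after a cluster failure by
    applying the log entries the data storage module had not yet
    consumed: those with a sequence number beyond the data file's
    recorded maximum."""
    max_seq = data_file["max_seq"]        # read the watermark first
    for seq, key, value in log_entries:
        if seq > max_seq:                 # unread entries only
            data_file["data"][key] = value
            max_seq = max(max_seq, seq)
    data_file["max_seq"] = max_seq
    return data_file
```

Because entries at or below the watermark are skipped, re-running the catch-up after a partial failure is harmless (the operation is idempotent).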
  • the master-slave node reconstruction method includes the following steps 401 to 403:
  • Step 401 The Redis node reads all data files in the shared storage to restore the memory data to the full data at a certain time point T.
  • the Redis node is the node where the Redis instance is deployed in the memory database cluster.
  • this step may also include, the Redis node obtaining the maximum sequence number of data in all data files in the shared storage.
  • Step 402 The Redis node reads the data after time point T in the log file in the shared storage.
  • the Redis node can read the data after the time point T in the log file in the shared storage by reading the data whose sequence number in the log file in the shared storage is after the maximum sequence number of the data in the data file.
  • Step 403 The Redis node switched to the master node provides read-write services; the Redis node switched to the slave node provides read-only services.
  • the Redis nodes can complete the switching of their own states by obtaining the lease of shared storage.
  • the Redis nodes that successfully obtain the lease are switched to the master node, and the Redis nodes that fail to successfully obtain the lease are switched to the slave node.
  • the new master node can then receive a new write request, and store the write operation in the new write request into a log file in the shared storage.
  • the present application also provides a database system, which may be the in-memory database system shown in Figure 1.
  • the database system is an in-memory database cluster, which uses Redis as a cache service, and includes multiple nodes and shared storage, wherein the multiple nodes specifically include a master node and multiple slave nodes, and may also include one or more other nodes in some application scenarios.
  • the master node is used to deploy the Redis master instance, and the slave node is used to deploy the Redis slave instance.
  • the Redis master instance of the master node can receive the user's read and write requests and record the write operations to the log file in the shared storage; the Redis slave instance of the slave node can keep data synchronized with the master node by continuously reading the log file in the shared storage and replaying it.
  • the memory database system also includes a data storage module, which can be deployed on the master node, on a slave node, or on a node other than the master and slave nodes.
  • the data storage module can continuously read the log files in the shared storage, and convert them into data files and store them in the shared storage, wherein the data file can be regarded as a snapshot of the database at a certain moment, recording the latest data in the write operation in the log file.
  • the database system includes:
  • a master node of the memory database cluster used to receive a write request, wherein the write request records the write operation
  • a master node configured to store the write operation in a log file in a shared storage of the memory database cluster
  • a data storage module of the memory database cluster used for obtaining the write operation from the log file
  • the data storage module is used to store the latest data in the write operation into a data file in the shared storage of the memory database cluster.
  • the database system provided by the present application further includes: a slave node of the memory database cluster, configured to read the log file in the shared storage to respond to a read request of the memory database cluster.
  • when the master node is switched to a new slave node, the new slave node is used to read the log file in the shared storage in order to respond to the read request of the memory database cluster; when a slave node is switched to a new master node, the new master node is used to receive a new write request and store the write operation in the new write request to the log file in the shared storage.
  • the data storage module can be deployed on the master node of the memory database cluster, on a slave node, or on a node other than the master and slave nodes.
  • the database system provided by this application also includes:
  • a master node of the in-memory database cluster configured to receive multiple write requests, each of the multiple write requests recording a write operation, the multiple write requests including multiple write operations received at different times for updating data of the same object;
  • the master node is used to store the above multiple write operations in the log files in the shared storage of the memory database cluster;
  • the data storage module of the memory database cluster is used to obtain the above multiple write operations from the log file;
  • the data storage module is used to store the latest data in the above multiple write operations into a data file in the shared storage of the memory database cluster.
  • a node may include code running on a computing instance.
  • a computing instance may be at least one of a physical host (computing device), a virtual machine, a container, or other computing devices.
  • there may be one or more computing devices.
  • a node may include code running on multiple hosts, virtual machines, or containers. It should be noted that the multiple hosts, virtual machines, or containers used to run the code can be distributed in the same availability zone (AZ) or in different AZs, where each AZ includes one data center or multiple data centers that are geographically close. The multiple hosts, virtual machines, or containers used to run the code can also be distributed in the same region or in different regions, where a region usually includes multiple AZs, and a virtual private cloud (VPC) is set up within a region. Communication between two VPCs in the same region, and between VPCs in different regions, requires a communication gateway to be set up in each VPC; interconnection between VPCs is achieved through the communication gateways.
  • multiple hosts or virtual machines or containers for running the code may be distributed in the same VPC or in multiple VPCs, wherein usually a region may include multiple AZs.
  • the node may include at least one computing device, such as a server, etc.
  • the node may also be a device implemented using an application-specific integrated circuit (ASIC) or a programmable logic device (PLD), etc.
  • the PLD may be implemented using a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.
  • the multiple computing devices included in the node can be distributed in the same AZ or in different AZs.
  • the multiple computing devices included in the node can be distributed in the same region or in different regions.
  • the multiple computing devices included in the node can be distributed in the same VPC or in multiple VPCs.
  • the multiple computing devices can be any combination of computing devices such as servers, ASICs, PLDs, CPLDs, FPGAs, and GALs.
  • the present application also provides a computing device 500, and FIG5 is a schematic diagram of the structure of the computing device 500.
  • the computing device 500 may be a master node or a slave node or other node in the above-mentioned embodiment, and includes: a bus 502, a processor 504, a memory 506, and a communication interface 508.
  • the processor 504, the memory 506, and the communication interface 508 communicate with each other through the bus 502.
  • the computing device 500 may be a server or a terminal device. It should be understood that the present application does not limit the number of processors and memories in the computing device 500.
  • the bus 502 may be a peripheral component interconnect (PCI) bus or an extended industry standard architecture (EISA) bus, etc.
  • the bus may be divided into an address bus, a data bus, a control bus, etc.
  • the bus is represented by only one line in FIG. 5, but this does not mean that there is only one bus or one type of bus.
  • the bus 502 may include a path for transmitting information between various components of the computing device 500 (e.g., the memory 506, the processor 504, and the communication interface 508).
  • Processor 504 may include any one or more of a central processing unit (CPU), a graphics processing unit (GPU), a microprocessor (MP), or a digital signal processor (DSP).
  • the memory 506 may include a volatile memory, such as a random access memory (RAM).
  • the memory 506 may also include a non-volatile memory, such as a read-only memory (ROM), a flash memory, a hard disk drive (HDD), or a solid state drive (SSD).
  • the memory 506 stores executable program codes, and the processor 504 executes the executable program codes to respectively implement the functions of the plurality of nodes in the aforementioned database system, thereby implementing the database data update method, the master-slave switching method, and the system reconstruction method. That is, the memory 506 stores instructions for executing the database data update method, the master-slave switching method, and the system reconstruction method.
  • the communication interface 508 uses a transceiver module such as, but not limited to, a network interface card or a transceiver to implement communication between the computing device 500 and other devices or a communication network.
  • FIG6 is a schematic diagram of the computing device cluster, which can implement the database system in the above embodiment.
  • the computing device cluster includes at least one computing device 500 as shown in FIG5, and the computing device 500 can be a server, such as a central server, an edge server, or a local server in a local data center.
  • the computing device can also be a terminal device such as a desktop computer, a laptop computer, or a smart phone.
  • Instructions for executing the database data update method, the master-slave switching method, and the system reconstruction method may be stored in the memory 506 of one or more computing devices 500 in the computing device cluster.
  • the computing device cluster may implement the database data update method, the master-slave switching method, and the system reconstruction method as described in the embodiment.
  • the memory 506 of one or more computing devices 500 in the computing device cluster may also store partial instructions for executing the database data update method, the master-slave switching method, and the system reconstruction method.
  • the combination of one or more computing devices 500 can jointly execute the instructions for executing the database data update method, the master-slave switching method, and the system reconstruction method.
  • the memory 506 in different computing devices 500 in the computing device cluster can store different instructions, which are respectively used to execute part of the functions of the database system. That is, the instructions stored in the memory 506 in different computing devices 500 can implement the functions of one or more objects in the master node, multiple slave nodes and data storage module.
  • one or more computing devices in the computing device cluster may be connected via a network.
  • the network may be a wide area network or a local area network, etc.
  • FIG. 7 shows a possible implementation. As shown in FIG. 7 , two computing devices 500A and 500B are connected via a network. Specifically, the two computing devices are connected to the network via a communication interface in each computing device.
  • the memory 506 in the computing device 500A stores instructions for executing the functions of the data storage module. Meanwhile, the memory 506 in the computing device 500B stores instructions for executing the functions of the Redis master node and the Redis slave node.
  • connection method between the computing device clusters shown in FIG. 7 may be based on the consideration that the database data update method provided in the present application requires a large amount of data to be stored, and therefore the functions implemented by the data storage module may be delegated to the computing device 500A for execution.
  • the functions of the computing device 500A shown in FIG. 7 may also be completed by multiple computing devices 500.
  • the functions of the computing device 500B may also be completed by multiple computing devices 500.
  • the present application also provides a computer program product including instructions.
  • the computer program product may be software or a program product including instructions that can be run on a computing device or stored in any available medium.
  • when the computer program product runs on at least one computing device, the at least one computing device executes the database data update method, the master-slave switching method, and the system reconstruction method.
  • the present application also provides a computer-readable storage medium.
  • the computer-readable storage medium may be any available medium that can be accessed by a computing device, or a data storage device, such as a data center, that contains one or more available media.
  • the available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a solid state drive (SSD)).
  • the computer-readable storage medium includes instructions that instruct the computing device to execute the database data update method, the master-slave switching method, and the system reconstruction method.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

本申请提供了一种数据库的数据更新方法、系统和计算设备集群,属于数据库技术领域。该方法包括:内存数据库集群的主节点接收写请求,该写请求记录写操作;主节点将写操作存储至内存数据库集群的共享存储中的日志文件;内存数据库集群的数据存储模块从日志文件获取写操作;数据存储模块将写操作中的最新数据存储至内存数据库集群的共享存储中的数据文件。该方法可以避免内存数据库集群中由子进程完成数据持久化过程中的性能与可靠性隐患,同时避免主从切换时数据丢失的风险。

Description

一种数据库的数据更新方法、系统及计算设备集群
本申请要求于2022年11月30日提交中国国家知识产权局、申请号为202211521755.7、发明名称为“一种数据库的数据更新方法、系统及计算设备集群”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及数据库技术领域,尤其涉及一种数据库的数据更新方法、系统以及计算设备集群、计算机可读存储介质、计算机程序产品。
背景技术
Redis(Remote Dictionary Server,远程字典服务)是一款基于内存运行的存储系统,在数据库中作为缓存服务得到了广泛应用。Redis运行时数据保存在内存中,为防止服务或系统宕机导致数据丢失,需对Redis中的数据进行持久化。
目前Redis中的数据持久化通常由子进程完成,存在数据持久化过程中Redis不能最大限度地使用内存、易发生性能抖动的问题。同时,在Redis存储系统采用主从架构实现高可靠性和高可用性时,从节点通过网络复制同步主节点的数据,由于复制同步并不是实时的,当主节点故障、进行主从切换时,会丢失未同步到从节点的数据,存在数据丢失的风险。
发明内容
本申请提供了一种数据库的数据更新方法,可以避免内存数据库集群中由子进程完成数据持久化过程中的性能与可靠性隐患,同时消除主从切换时数据丢失的风险。本申请还提供了对应的数据库系统、计算设备集群、计算机可读存储介质以及计算机程序产品。
第一方面,本申请提供了一种数据库的数据更新方法,该方法通过使用共享存储实现Redis数据持久化,由新增的数据存储模块替代Redis的rewrite操作,Redis主节点无需执行rewrite操作,避免了Fork的使用,没有Fork进程带来的性能抖动问题;同时,主从节点间通过共享存储中的日志文件实现数据同步,主从切换场景下,从节点依然能从日志文件中读取到最新的数据,无数据丢失的风险。具体地,该方法包括:
内存数据库集群的主节点接收写请求,该写请求记录写操作;
主节点将写操作存储至内存数据库集群的共享存储中的日志文件;
内存数据库集群的数据存储模块从日志文件获取写操作;
数据存储模块将写操作中的最新数据存储至内存数据库集群的共享存储中的数据文件。
本申请所示的方案还包括:所述内存数据库集群的从节点读取所述共享存储中的日志文件,以便响应所述内存数据库集群的读请求。
本申请所示的方案中,主从切换场景下,主节点切换为新从节点时,新从节点读取共享存储中的日志文件,以便响应内存数据库集群的读请求;从节点切换为新主节点时,新主节点接收新写请求,并将新写请求中的写操作存储至共享存储中的日志文件。
本申请所示的方案中,数据存储模块部署在内存数据库集群的主节点、或者部署在从节点、或者部署在主从节点外的其它节点。
本申请所示的方案还包括:
内存数据库集群的主节点接收多个写请求,多个写请求中的每个写请求记录写操作,多个写请求包括在不同时间接收的用于更新同一对象的数据的多个写操作;
主节点将上述多个写操作存储至内存数据库集群的共享存储中的日志文件;
内存数据库集群的数据存储模块,从日志文件获取上述多个写操作;
数据存储模块将上述多个写操作中的最新数据存储至内存数据库集群的共享存储中的数据文件。
本申请所示方案中,日志文件可以保存有主节点的操作信息以及操作对应的序号等元数据信息,一种可能的实现方式中,该日志文件可以是AOF文件(Append Only File)。
本申请所示方案中,数据文件与日志文件所保存的数据库状态完全相同,该数据文件可以视为数据库某时刻的快照,记录了日志文件中写操作中最新的数据。
第二方面,本申请提供了一种数据库系统,该数据库系统为内存数据库集群,采用Redis作为缓存服务,包括多个节点和共享存储,多个节点具体包括主节点和多个从节点,一些应用场景中可能还包括一个或多个其它节点。
其中,主节点用于部署Redis主实例,从节点用于部署Redis从实例。主节点的Redis主实例可以接收用户的读写请求,并将写操作记录到共享存储中的日志文件;从节点的Redis从实例可以通过不断读取共享存储中的日志文件并重放,和主节点保持数据同步。
此外,该内存数据库系统还包括数据存储模块,该数据存储模块可以部署在主节点、或者部署在从节点、或者部署在主从节点之外的其他节点。该数据存储模块可以不断读取共享存储中的日志文件,并将其转换成数据文件存储在共享存储中,其中,该数据文件可以视为数据库某时刻的快照,记录了日志文件中写操作中最新的数据。
具体地,该数据库系统包括:
内存数据库集群的主节点,用于接收写请求,所述写请求记录写操作;
主节点,用于将所述写操作存储至内存数据库集群的共享存储中的日志文件;
内存数据库集群的数据存储模块,用于从日志文件获取所述写操作;
数据存储模块,用于将所述写操作中的最新数据存储至内存数据库集群的共享存储中的数据文件。
本申请提供的数据库系统还包括:所述内存数据库集群的从节点读取所述共享存储中的日志文件,以便响应所述内存数据库集群的读请求。
本申请提供的数据库系统中,主从切换场景下,主节点切换为新从节点时,新从节点用于读取共享存储中的日志文件,以便响应内存数据库集群的读请求;从节点切换为新主节点时,新主节点用于接收新写请求,并将新写请求中的写操作存储至共享存储中的日志文件。
本申请提供的数据库系统中,数据存储模块可以部署在内存数据库集群的主节点、或者部署在从节点、或者部署在主从节点外的其它节点。
本申请提供的数据库系统还包括:
内存数据库集群的主节点,用于接收多个写请求,多个写请求中的每个写请求记录写操作,多个写请求包括在不同时间接收的用于更新同一对象的数据的多个写操作;
主节点,用于将上述多个写操作存储至内存数据库集群的共享存储中的日志文件;
内存数据库集群的数据存储模块,用于从日志文件获取上述多个写操作;
数据存储模块,用于将上述多个写操作中的最新数据存储至内存数据库集群的共享存储中的数据文件。
第三方面,本申请提供了一种计算设备集群,包括至少一个计算设备,每个计算设备包括处理器和存储器;该至少一个计算设备的处理器用于执行该至少一个计算设备的存储器中存储的指令,以使得该计算设备集群执行如上述第一方面或第一方面中任一种可能的实现方式所提供的数据库的数据更新的方法。
第四方面,本申请提供了一种包含指令的计算机程序产品,当该指令被计算设备集群运行时,使得该计算设备集群执行如上述第一方面或第一方面中任一种可能的实现方式所提供的数据库的数据更新的方法。
第五方面,本申请提供了一种计算机可读存储介质,包括计算机程序指令,当该计算机程序指令由计算设备集群执行时,该计算设备集群执行如上述第一方面或第一方面中任一种可能的实现方式所提供的数据库的数据更新的方法。
附图说明
图1是本申请实施例提供的一种内存数据库系统的架构示意图。
图2是本申请实施例提供的一种数据库的数据更新方法的流程图。
图3是本申请实施例提供的一种数据库故障时主从切换方法的流程图。
图4是本申请实施例提供的一种数据库集群重建方法的流程图。
图5是本申请实施例提供的一种计算设备的示意图。
图6是本申请实施例提供的一种计算设备集群的示意图。
图7是本申请实施例提供的一种计算设备集群的实现方式示意图。
具体实施方式
下面将结合附图,对本申请提供的技术方案进行详细描述。
为了使本申请提供的技术方案更清晰,在具体描述本申请提供的技术方案之前,首先进行相关术语的解释。
(1)远程字典服务(Remote Dictionary Server,Redis):是一款高性能存储系统,支持键-值(Key-Value)等多种数据结构,提供字符串(String)、哈希(Hash)、列表(List)、集合结构(Set、Sorted_Set)、流(Stream)等类型数据的直接存取服务,数据读写可基于内存也可持久化到磁盘。Redis作为缓存服务可以大幅减轻数据库系统中后端数据库与应用程序之间的负载压力,采用Redis存储系统的内存数据库集群能够以亚毫秒级的响应时间为频繁请求的项目提供支持。
(2)复刻(Fork):用于为某一进程创建一个子进程,子进程与父进程同时运行,且与父进程使用相同的数据。Fork出的子进程指向与父进程相同的内存地址空间。
(3)重写(rewrite):一种为旧文件创建一个新的替代文件的功能。Redis系统中会将用户端的写请求记录入日志文件以进行数据持久化,为解决日志文件体积膨胀的问题,Redis服务器可以通过该功能创建一个新的日志文件来替代现有的日志文件,新旧两个日志文件所保存的数据库状态完全相同,但重写(rewrite)后的新的日志文件只包含重建当前数据集所需的最少命令,不包含任何浪费空间的冗余命令。
(4)内存数据库集群:是将数据放在内存中直接操作的数据库,相对于磁盘,内存的数据读写速度要高出几个数量级,将数据保存在内存中相比从磁盘上访问能够极大地提高应用的性能。本申请实例中,采用了Redis缓存服务的数据库便是内存数据库。进一步地,为实现数据库的高可靠性和可用性,内存数据库可以采用主从架构,该模式下的内存数据库便成为了内存数据库集群。
首先,对本申请实施例涉及的应用场景和系统架构进行简要说明。
本申请实施例涉及采用Redis缓存服务的内存数据库集群进行数据持久化的场景。目前,Redis系统进行数据持久化的方式通常有两种:一种是Redis通过Fork(复刻)一个子进程,该子进程在指定的时间间隔内生成数据集的时间点快照;一种是Redis以日志的形式记录对内存数据的写操作,同时当日志文件体积较大时Fork(复刻)一个子进程,该子进程将日志文件rewrite(重写)成只包含恢复当前数据集所需的最小命令集合的新日志文件。但上述两种持久化方式由于均需Fork一个子进程,存在Redis不能最大限度地使用内存、发生性能抖动的问题。此外,在Redis存储系统采用主从架构实现高可靠性和高可用性时,从节点通过网络复制同步主节点的数据,由于复制同步并不是实时的,当主节点故障、进行主从切换时,会丢失未同步到从节点的数据,存在数据丢失的风险。
图1是本申请实施例提供的一种内存数据库系统的架构示意图。如图1所示,该内存数据库系统采用Redis作为缓存服务,包括多个节点和共享存储,多个节点具体包括主节点和多个从节点,一些应用场景中可能还包括一个或多个其它节点,图1中只是示例性地给出了两个从节点和一个其它节点。
其中,主节点用于部署Redis主实例,从节点用于部署Redis从实例。主节点的Redis主实例可以接收用户的读写请求,并将写操作记录到共享存储中的日志文件;从节点的Redis从实例可以通过不断读取共享存储中的日志文件并重放,和主节点保持数据同步。
此外,图1所示的内存数据库系统还包括数据存储模块,该数据存储模块可以部署在主节点、或者部署在从节点、或者部署在主从节点之外的其他节点。该数据存储模块可以不断读取共享存储中的日志文件,并将其转换成数据文件存储在共享存储中,其中,该数据文件可以视为数据库某时刻的快照,记录了日志文件中写操作中最新的数据。
本申请提供的一种数据库的数据更新方法,可以采用图1所示的内存数据库系统架构,该方法通过使用共享存储实现Redis数据持久化,由新增的数据存储模块替代Redis的rewrite操作,Redis主节点无需执行rewrite操作,避免了Fork的使用,没有Fork进程带来的性能抖动问题;同时,主从节点间通过共享存储中的日志文件实现数据同步,主从切换场景下,从节点依然能从日志文件中读取到最新的数据,无数据丢失的风险。
下面,结合图2所示的流程示意图详细描述本申请提供的一种数据库的数据更新方法,该方法基于图1所示的内存数据库系统,具体包括如下步骤201至步骤204。
步骤201、主节点接收写请求。
其中,写请求记录了客户端需要主节点执行的写操作。
步骤202、主节点将写操作存储至内存数据库集群的共享存储中的日志文件。
具体地,主节点根据接收到的写请求中的写操作更新内存中的数据后,将写操作记录入日志文件,并将日志文件存储至数据库集群的共享存储。其中,该日志文件可以是AOF文件(Append Only File)等任意形式的日志文件,此处不做具体限定,该日志文件中保存有主节点的操作信息以及操作对应的序号等元数据信息。
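一种可能的示意中,日志文件里的每条写操作记录可携带单调递增的序号,便于后续按序重放与断点续读。以下为假设性的追加写日志示意(AppendOnlyLog 等名称均为示例,并非本申请限定的实现方式):

```python
import itertools


class AppendOnlyLog:
    """AOF 风格日志文件的示意:每条写操作记录携带单调递增的序号等元数据信息。"""

    def __init__(self):
        self.entries = []
        self._seq = itertools.count(1)  # 单调递增的序号

    def append(self, op):
        """将写操作连同序号追加到日志末尾,返回该条记录的序号。"""
        entry = {"seq": next(self._seq), "op": op}
        self.entries.append(entry)
        return entry["seq"]

    def read_after(self, seq):
        """读取序号大于 seq 的记录,用于增量重放。"""
        return [e for e in self.entries if e["seq"] > seq]


# 主节点依次记录两次写操作
log = AppendOnlyLog()
log.append({"set": ("a", 1)})
log.append({"set": ("a", 2)})
```

从节点或数据存储模块随后可以用 `read_after` 从任意序号处继续读取日志。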
步骤203、数据存储模块从日志文件获取写操作。
由于日志文件存储在数据库集群的共享存储中,数据存储模块可以直接访问到日志文件,获取其中的写操作。
步骤204、数据存储模块将写操作中的最新数据存储至共享存储中的数据文件。
实际应用中,随着用户端写请求的增多,日志文件的体积会逐渐膨胀,为避免日志文件占用较多的空间资源,需要数据存储模块将日志文件转化为数据文件,其中,数据文件与日志文件所保存的数据库状态完全相同,但数据文件可以直接以键-值(Key-Value)等数据结构记录日志文件中写操作中的最新数据,相比日志文件占用更小的空间。
因此,具体实现日志文件向数据文件转化的方式便可以是数据存储模块将写操作中的最新数据存储至共享存储中的数据文件。一种可能的实施例中,数据文件可以是数据库某时刻的快照。
本申请实施例中,数据存储模块通过执行步骤203和步骤204,将日志文件转化为数据文件之后,还包括可以将已完成转化的该日志文件删除,以避免日志文件的无限膨胀占用太多空间资源。通过将日志文件转化为数据文件并删除已转化的日志文件,有助于大大缩减数据恢复或集群重建时需要读取的数据。
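作为理解上述步骤201至步骤204的一个极简示意(假设性实现,其中的类名与数据结构均为示例,并非本申请限定的实现方式),主节点追加写操作、数据存储模块将日志文件转化为数据文件并删除已转化日志的流程可表达如下:

```python
class SharedStorage:
    """共享存储的示意:日志文件与数据文件均保存于其中。"""

    def __init__(self):
        self.log = []        # 日志文件:按序记录写操作
        self.data_file = {}  # 数据文件:仅保留每个对象的最新数据


class MasterNode:
    """主节点:接收写请求,并将写操作追加到共享存储中的日志文件。"""

    def __init__(self, storage):
        self.storage = storage
        self.memory = {}  # 内存中的数据

    def handle_write(self, key, value):
        # 步骤201-202:更新内存数据,并将写操作存储至共享存储中的日志文件
        self.memory[key] = value
        self.storage.log.append({"key": key, "value": value})


class DataStoreModule:
    """数据存储模块:从日志文件获取写操作,把其中的最新数据存入数据文件。"""

    def __init__(self, storage):
        self.storage = storage

    def compact(self):
        # 步骤203-204:按序重放写操作,后写覆盖先写,数据文件只保留最新数据
        for op in self.storage.log:
            self.storage.data_file[op["key"]] = op["value"]
        self.storage.log.clear()  # 删除已转化的日志,避免日志文件无限膨胀


storage = SharedStorage()
master = MasterNode(storage)
module = DataStoreModule(storage)
master.handle_write("k", 1)
master.handle_write("k", 2)  # 不同时间更新同一对象的数据
module.compact()
```

转化完成后,数据文件中同一对象只保留最新数据,日志文件被清空,体现了"日志转数据文件 + 删除已转化日志"的设计。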
本申请实施例中,当用户端有读请求时,从节点可以直接读取共享存储中的日志文件,将日志文件中读取的数据信息提供给客户端,从节点通过共享存储实现了与主节点数据保持一致。
一种可能的实施例中,步骤201中主节点接收的写请求可以是在不同时间接收的包含用于更新同一对象的数据的多个写操作的多个写请求,此时数据存储模块从日志文件获取上述多个写操作后,步骤204中数据存储模块是将多个写操作中的最新数据存储至数据文件。
在本申请提供的上述数据库的数据更新方法的基础上,本申请还为图1所示的内存数据库系统提供了一种主从切换的方法。图3是本申请提供的主从切换方法的流程图,包括主节点切换为新的从节点的主降从流程图(a)和从节点切换为新的主节点的从升主流程图(b)。具体地,该方法中:
从节点切换为新的主节点的方法包括以下步骤301至步骤302:
步骤301、从节点读取共享存储中的日志文件。
实际应用中,当主节点发生故障,需要进行主从切换时,从节点需要先通过读取共享存储中的所有日志文件以保证内存中的数据与主节点故障前一致,无数据丢失。
步骤302、从节点在读取完共享存储中的所有日志文件后,切换自身状态,对外提供读写服务。
当从节点读取完共享存储中的所有日志文件后,便将内存中的数据同步到了主节点故障前的所有最新数据。
其中,从节点可以通过获取共享存储的租约来完成自身状态的切换,即由只能响应外部读请求的状态转换为可对外提供读写服务的状态,这也意味着从节点已切换为新的主节点。
具体地,从节点切换为新的主节点后,可以接收新的写请求,并将新的写请求中的写操作存储至共享存储中的日志文件。
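上述从升主流程(步骤301至步骤302)可以用如下示意代码表达(其中共享存储的租约以一个布尔标志代替,仅为假设性示意,并非实际的租约实现):

```python
class SlaveNode:
    """从节点的示意:先重放共享存储中的全部日志,再通过获取租约切换为主节点。"""

    def __init__(self, shared_log):
        self.shared_log = shared_log
        self.memory = {}
        self.is_master = False

    def replay_all(self):
        # 步骤301:读取共享存储中的所有日志文件,保证内存数据与主节点故障前一致
        for entry in self.shared_log:
            self.memory[entry["key"]] = entry["value"]

    def acquire_lease(self, lease):
        # 步骤302:成功获取共享存储的租约后切换自身状态,对外提供读写服务
        if not lease["held"]:
            lease["held"] = True
            self.is_master = True
        return self.is_master


shared_log = [{"key": "x", "value": 1}, {"key": "x", "value": 2}]
lease = {"held": False}
node = SlaveNode(shared_log)
node.replay_all()
node.acquire_lease(lease)
```

重放完成后该节点内存中已是故障前的最新数据,获取租约后即可作为新主节点接收新写请求。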
主节点切换为新的从节点的方法包括以下步骤303至步骤304:
步骤303、主节点读取共享存储中的所有数据文件,让内存数据恢复到某个时间点t的全量数据。
实际应用中,当主节点发生故障重启时,会丢失部分内存数据,这时,主节点需要先通过读取共享存储中的所有数据文件,让内存数据恢复到某个时间点t的全量数据。
具体地,本步骤还可以包括,主节点获取共享存储中所有数据文件中数据的最大序列号。
步骤304、主节点读取共享存储中的日志文件中时间点t之后的数据。
在完成数据恢复后,主节点作为新的从节点,可读取共享存储中的日志文件,响应用户端新的读请求。
具体地,步骤303中主节点已通过读数据文件恢复了某个时间点t的全量数据,此步骤中,主节点作为新的从节点只需读取日志文件中时间点t之后的数据。本申请实施例中,主节点可以通过读取共享存储的日志文件中序列号在数据文件中数据的最大序列号之后的数据,实现读取共享存储中的日志文件中时间点t之后的数据。
一种可能的实施例中,故障主节点并未重启,此时主节点切换为新的从节点的方法只需执行步骤304即可,即主节点作为新的从节点读取共享存储中的日志文件,响应用户端新的读请求。
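上述主降从流程中"快照 + 增量日志"的数据恢复(步骤303至步骤304)可示意如下(序号字段与数据结构均为假设性示例):

```python
def recover(data_files, log_entries):
    """先读数据文件恢复时间点 t 的全量数据,再仅重放日志中序号更大的增量。"""
    memory, max_seq = {}, 0
    # 步骤303:读取共享存储中的所有数据文件,并获取其中数据的最大序列号
    for df in data_files:
        for key, (seq, value) in df.items():
            memory[key] = value
            max_seq = max(max_seq, seq)
    # 步骤304:只读取日志文件中序列号在 max_seq 之后的数据
    for e in log_entries:
        if e["seq"] > max_seq:
            memory[e["key"]] = e["value"]
    return memory, max_seq


data_files = [{"a": (3, "v3")}]
log = [
    {"seq": 3, "key": "a", "value": "v3"},   # 已包含在数据文件中,无需重放
    {"seq": 4, "key": "a", "value": "v4"},   # 时间点 t 之后的增量
]
state, max_seq = recover(data_files, log)
```

借助最大序列号,恢复过程避免了重复读取已转化为数据文件的日志内容。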
在本申请提供的上述数据库的数据更新方法和主从切换方法的基础上,本申请还为图1所示的内存数据库系统提供了一种重建方法,图4是该重建方法的流程图。该重建方法包括数据存储模块重建方法和主从节点重建方法,其中,
数据存储模块重建方法包括:数据存储模块读取共享存储的日志文件中在集群故障前尚未被数据存储模块读取的数据,并将其写入共享存储的数据文件中。
具体地,数据存储模块可以先读取共享存储的数据文件中数据的最大序列号,之后通过读取共享存储的日志文件中序列号在数据文件中数据的最大序列号之后的数据,实现读取共享存储的日志文件中在集群故障前尚未被数据存储模块读取的数据。
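数据存储模块按数据文件中的最大序列号断点续读日志的过程可示意如下(假设性实现,数据文件中每个键对应"(序号, 值)"的结构仅为示例):

```python
def resume_compaction(data_file, log_entries):
    """重建时先取数据文件中数据的最大序列号,仅把日志中序号更大的写操作并入数据文件。"""
    # 先读取数据文件中数据的最大序列号(数据文件为空时取 0)
    max_seq = max((seq for seq, _ in data_file.values()), default=0)
    # 仅处理集群故障前尚未被数据存储模块读取的日志数据
    for e in log_entries:
        if e["seq"] > max_seq:
            data_file[e["key"]] = (e["seq"], e["value"])
    return data_file


df = {"a": (2, "v2")}
log = [
    {"seq": 2, "key": "a", "value": "v2"},  # 故障前已转化,跳过
    {"seq": 3, "key": "b", "value": "v1"},  # 故障前尚未读取的数据
]
resume_compaction(df, log)
```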
主从节点重建方法包括以下步骤401至步骤403:
步骤401、Redis节点读取共享存储中的所有数据文件,让内存数据恢复到某个时间点T的全量数据。
其中,Redis节点是内存数据库集群中部署Redis实例的节点。
具体地,本步骤还可包括,Redis节点获取共享存储的所有数据文件中数据的最大序列号。
实际应用中,集群重建时,所有Redis节点均以从节点的状态拉起进程,所有Redis节点均先执行步骤401进行数据恢复。
步骤402、Redis节点读取共享存储中的日志文件中时间点T之后的数据。
具体地,Redis节点可以通过读取共享存储的日志文件中序列号在数据文件中数据的最大序列号之后的数据,实现读取共享存储中的日志文件中时间点T之后的数据。
步骤403、切换为主节点的Redis节点提供读写服务;切换为从节点的Redis节点提供只读服务。
实际应用中,所有Redis节点通过步骤401至步骤402完成内存中全部数据的恢复后,Redis节点可以通过获取共享存储的租约来完成自身状态的切换,成功获取到租约的Redis节点切换为主节点,未成功获取到租约的Redis节点切换为从节点。
具体地,从节点切换为新的主节点后,可以接收新的写请求,并将新的写请求中的写操作存储至共享存储中的日志文件。
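集群重建时所有Redis节点先以从节点状态恢复数据、再通过竞争租约确定角色的过程(步骤401至步骤403)可示意如下(租约同样以布尔标志代替,为假设性示意):

```python
def rebuild(nodes, lease):
    """所有节点完成全量数据恢复后竞争租约:获取到租约者为主节点,其余为从节点。"""
    roles = {}
    for name in nodes:
        # 步骤403:按获取租约的结果切换自身状态
        if not lease["held"]:
            lease["held"] = True
            roles[name] = "master"  # 提供读写服务
        else:
            roles[name] = "slave"   # 提供只读服务
    return roles


roles = rebuild(["node-1", "node-2", "node-3"], {"held": False})
```

该示意中第一个竞争到租约的节点成为主节点,与"成功获取到租约的Redis节点切换为主节点"的描述一致。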
本申请还提供了一种数据库系统,该数据库系统可以是上述图1所示的内存数据库系统。该数据库系统为内存数据库集群,采用Redis作为缓存服务,包括多个节点和共享存储,多个节点具体包括主节点和多个从节点,一些应用场景中可能还包括一个或多个其它节点。
其中,主节点用于部署Redis主实例,从节点用于部署Redis从实例。主节点的Redis主实例可以接收用户的读写请求,并将写操作记录到共享存储中的日志文件;从节点的Redis从实例可以通过不断读取共享存储中的日志文件并重放,和主节点保持数据同步。
此外,该内存数据库系统还包括数据存储模块,该数据存储模块可以部署在主节点、或者部署在从节点、或者部署在主从节点之外的其他节点。该数据存储模块可以不断读取共享存储中的日志文件,并将其转换成数据文件存储在共享存储中,其中,该数据文件可以视为数据库某时刻的快照,记录了日志文件中写操作中最新的数据。
具体地,该数据库系统包括:
内存数据库集群的主节点,用于接收写请求,所述写请求记录写操作;
主节点,用于将所述写操作存储至内存数据库集群的共享存储中的日志文件;
内存数据库集群的数据存储模块,用于从日志文件获取所述写操作;
数据存储模块,用于将所述写操作中的最新数据存储至内存数据库集群的共享存储中的数据文件。
本申请提供的数据库系统还包括:所述内存数据库集群的从节点读取所述共享存储中的日志文件,以便响应所述内存数据库集群的读请求。
本申请提供的数据库系统中,主从切换场景下,主节点切换为新从节点时,新从节点用于读取共享存储中的日志文件,以便响应内存数据库集群的读请求;从节点切换为新主节点时,新主节点用于接收新写请求,并将新写请求中的写操作存储至共享存储中的日志文件。
本申请提供的数据库系统中,数据存储模块可以部署在内存数据库集群的主节点、或者部署在从节点、或者部署在主从节点外的其它节点。
本申请提供的数据库系统还包括:
内存数据库集群的主节点,用于接收多个写请求,多个写请求中的每个写请求记录写操作,多个写请求包括在不同时间接收的用于更新同一对象的数据的多个写操作;
主节点,用于将上述多个写操作存储至内存数据库集群的共享存储中的日志文件;
内存数据库集群的数据存储模块,用于从日志文件获取上述多个写操作;
数据存储模块,用于将上述多个写操作中的最新数据存储至内存数据库集群的共享存储中的数据文件。
节点可以包括运行在计算实例上的代码。其中,计算实例可以是物理主机(计算设备)、虚拟机、容器等计算设备中的至少一种。进一步地,上述计算设备可以是一台或者多台。例如,节点可以包括运行在多个主机、或者虚拟机、或者容器上的代码。需要说明的是,用于运行该代码的多个主机、或者虚拟机、或者容器可以分布在相同的可用区(availability zone,AZ)中,也可以分布在不同的AZ中,每个AZ包括一个数据中心或多个地理位置相近的数据中心。用于运行该代码的多个主机、或者虚拟机、或者容器可以分布在相同的地域(region)中,也可以分布在不同的region中;通常一个region可以包括多个AZ。同样,用于运行该代码的多个主机、或者虚拟机、或者容器可以分布在同一个虚拟私有云(virtual private cloud,VPC)中,也可以分布在多个VPC中,VPC设置在一个region内。同一region内两个VPC之间,以及不同region的VPC之间跨区通信需在每个VPC内设置通信网关,经通信网关实现VPC之间的互连。
节点可以包括至少一个计算设备,如服务器等。或者,节点也可以是利用专用集成电路(application-specific integrated circuit,ASIC)实现、或可编程逻辑器件(programmable logic device,PLD)实现的设备等。其中,上述PLD可以是复杂可编程逻辑器件(complex programmable logic device,CPLD)、现场可编程门阵列(field-programmable gate array,FPGA)、通用阵列逻辑(generic array logic,GAL)或其任意组合实现。
节点包括的多个计算设备可以分布在相同的AZ中,也可以分布在不同的AZ中。节点包括的多个计算设备可以分布在相同的region中,也可以分布在不同的region中。同样,节点包括的多个计算设备可以分布在同一个VPC中,也可以分布在多个VPC中。其中,所述多个计算设备可以是服务器、ASIC、PLD、CPLD、FPGA和GAL等计算设备的任意组合。
本申请还提供一种计算设备500,图5是计算设备500的结构示意图。该计算设备500可以是上述实施例中的主节点或从节点或其它节点,包括:总线502、处理器504、存储器506和通信接口508。处理器504、存储器506和通信接口508之间通过总线502通信。计算设备500可以是服务器或终端设备。应理解,本申请不限定计算设备500中的处理器和存储器的个数。
总线502可以是外设部件互连标准(peripheral component interconnect,PCI)总线或扩展工业标准结构(extended industry standard architecture,EISA)总线等。总线可以分为地址总线、数据总线、控制总线等。为便于表示,图5中仅用一条线表示,但并不表示仅有一根总线或一种类型的总线。总线502可包括在计算设备500各个部件(例如,存储器506、处理器504、通信接口508)之间传送信息的通路。
处理器504可以包括中央处理器(central processing unit,CPU)、图形处理器(graphics processing unit,GPU)、微处理器(micro processor,MP)或者数字信号处理器(digital signal processor,DSP)等处理器中的任意一种或多种。
存储器506可以包括易失性存储器(volatile memory),例如随机存取存储器(random access memory,RAM)。存储器506还可以包括非易失性存储器(non-volatile memory),例如只读存储器(read-only memory,ROM),快闪存储器,机械硬盘(hard disk drive,HDD)或固态硬盘(solid state drive,SSD)。
存储器506中存储有可执行的程序代码,处理器504执行该可执行的程序代码以分别实现前述数据库系统中多个节点的功能,从而实现数据库的数据更新方法、主从切换方法、以及系统重建方法。也即,存储器506上存有用于执行数据库的数据更新方法、主从切换方法、以及系统重建方法的指令。
通信接口508使用例如但不限于网络接口卡、收发器一类的收发模块,来实现计算设备500与其他设备或通信网络之间的通信。
本申请还提供了一种计算设备集群,图6是该计算设备集群的示意图,该计算设备集群可实现上述实施例中的数据库系统。如图6所示,该计算设备集群包括至少一台如图5所示的计算设备500,该计算设备500可以是服务器,例如是中心服务器、边缘服务器,或者是本地数据中心中的本地服务器。在一些实施例中,计算设备也可以是台式机、笔记本电脑或者智能手机等终端设备。
计算设备集群中的一个或多个计算设备500中的存储器506中可以存有用于执行数据库的数据更新方法、主从切换方法、以及系统重建方法的指令。当计算设备集群中至少一个计算设备执行该指令时,可以使该计算设备集群实现如实施例所述的数据库的数据更新方法、主从切换方法、以及系统重建方法。
在一些可能的实现方式中,该计算设备集群中的一个或多个计算设备500的存储器506中也可以分别存有用于执行数据库的数据更新方法、主从切换方法、以及系统重建方法的部分指令。换言之,一个或多个计算设备500的组合可以共同执行用于执行数据库的数据更新方法、主从切换方法、以及系统重建方法的指令。
需要说明的是,计算设备集群中的不同的计算设备500中的存储器506可以存储不同的指令,分别用于执行数据库系统的部分功能。也即,不同的计算设备500中的存储器506存储的指令可以实现主节点、多个从节点和数据存储模块中的一个或多个对象的功能。
在一些可能的实现方式中,计算设备集群中的一个或多个计算设备可以通过网络连接。其中,所述网络可以是广域网或局域网等等。图7示出了一种可能的实现方式。如图7所示,两个计算设备500A和500B之间通过网络进行连接。具体地,通过各个计算设备中的通信接口与所述网络进行连接。在这一类可能的实现方式中,计算设备500A中的存储器506中存有执行数据存储模块的功能的指令。同时,计算设备500B中的存储器506中存有执行Redis主节点和Redis从节点的功能的指令。
图7所示的计算设备集群之间的连接方式可以是考虑到本申请提供的数据库的数据更新方法需要存储大量数据,因此考虑将数据存储模块实现的功能交由计算设备500A执行。
应理解,图7中示出的计算设备500A的功能也可以由多个计算设备500完成。同样,计算设备500B的功能也可以由多个计算设备500完成。
本申请还提供了一种包含指令的计算机程序产品。所述计算机程序产品可以是包含指令的,能够运行在计算设备上或被储存在任何可用介质中的软件或程序产品。当所述计算机程序产品在至少一个计算设备上运行时,使得至少一个计算设备执行数据库的数据更新方法、主从切换方法、以及系统重建方法。
本申请还提供了一种计算机可读存储介质。所述计算机可读存储介质可以是计算设备能够存储的任何可用介质或者是包含一个或多个可用介质的数据中心等数据存储设备。所述可用介质可以是磁性介质(例如,软盘、硬盘、磁带)、光介质(例如,DVD)、或者半导体介质(例如,固态硬盘)等。该计算机可读存储介质包括指令,所述指令指示计算设备执行数据库的数据更新方法、主从切换方法以及系统重建方法。
最后应说明的是:以上实施例仅用以说明本发明的技术方案,而非对其限制;尽管参照前述实施例对本发明进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本发明各实施例技术方案的保护范围。

Claims (15)

  1. 一种数据库的数据更新方法,其特征在于,所述方法包括:
    内存数据库集群的主节点接收写请求,所述写请求记录写操作;
    所述主节点将所述写操作存储至所述内存数据库集群的共享存储中的日志文件;
    所述内存数据库集群的数据存储模块从所述日志文件获取所述写操作;
    所述数据存储模块将所述写操作中的最新数据存储至所述内存数据库集群的共享存储中的数据文件。
  2. 根据权利要求1所述的方法,其特征在于,所述方法包括:
    所述内存数据库集群的从节点读取所述共享存储中的日志文件,以便响应所述内存数据库集群的读请求。
  3. 根据权利要求1或2所述的方法,其特征在于,所述方法包括:
    在所述内存数据库集群中,所述主节点切换为新从节点时,所述新从节点读取所述共享存储中的日志文件,以便响应内存数据库集群的读请求。
  4. 根据权利要求1至3任一项所述的方法,其特征在于,所述方法包括:
    在所述内存数据库集群中,所述从节点切换为新主节点时,所述新主节点接收新写请求,并将所述新写请求中的写操作存储至所述共享存储中的日志文件。
  5. 根据权利要求1至4任一项所述的方法,其特征在于,
    所述数据存储模块部署在所述内存数据库集群的主节点;或者,
    所述数据存储模块部署在所述内存数据库集群的从节点;或者,
    所述数据存储模块部署在所述内存数据库集群的其它节点。
  6. 一种数据库的数据更新方法,其特征在于,所述方法包括:
    内存数据库集群的主节点接收多个写请求,所述多个写请求中的每个写请求记录写操作,所述多个写请求包括在不同时间接收的用于更新同一对象的数据的多个写操作;
    所述主节点将所述多个写操作存储至所述内存数据库集群的共享存储中的日志文件;
    所述内存数据库集群的数据存储模块,从所述日志文件获取所述多个写操作;
    所述数据存储模块将所述多个写操作中的最新数据存储至所述内存数据库集群的共享存储中的数据文件。
  7. 一种数据库系统,其特征在于,所述数据库系统为内存数据库集群,包括:
    内存数据库集群的主节点,用于接收写请求,所述写请求记录写操作;
    所述主节点,用于将所述写操作存储至所述内存数据库集群的共享存储中的日志文件;
    所述内存数据库集群的数据存储模块,用于从所述日志文件获取所述写操作;
    所述数据存储模块,用于将所述写操作中的最新数据存储至所述内存数据库集群的共享存储中的数据文件。
  8. 根据权利要求7所述的数据库系统,其特征在于,
    所述内存数据库集群的从节点,用于读取所述共享存储中的日志文件,以便响应所述内存数据库集群的读请求。
  9. 根据权利要求7或8所述的数据库系统,其特征在于,
    在所述内存数据库集群中,所述主节点切换为新从节点时,所述新从节点用于读取所述共享存储中的日志文件,以便响应内存数据库集群的读请求。
  10. 根据权利要求7至9任一项所述的数据库系统,其特征在于,
    在所述内存数据库集群中,所述从节点切换为新主节点时,所述新主节点用于接收新写请求,并将所述新写请求中的写操作存储至所述共享存储中的日志文件。
  11. 根据权利要求7至10任一项所述的数据库系统,其特征在于,
    所述数据存储模块,用于部署在所述内存数据库集群的主节点;或者,
    所述数据存储模块,用于部署在所述内存数据库集群的从节点;或者,
    所述数据存储模块,用于部署在所述内存数据库集群的其它节点。
  12. 一种数据库系统,其特征在于,所述数据库系统为内存数据库集群,包括:
    内存数据库集群的主节点,用于接收多个写请求,所述多个写请求中的每个写请求记录写操作,所述多个写请求包括在不同时间接收的用于更新同一对象的数据的多个写操作;
    所述主节点,用于将所述多个写操作存储至所述内存数据库集群的共享存储中的日志文件;
    所述内存数据库集群的数据存储模块,用于从所述日志文件获取所述多个写操作;
    所述数据存储模块,用于将所述多个写操作中的最新数据存储至所述内存数据库集群的共享存储中的数据文件。
  13. 一种计算设备集群,其特征在于,包括至少一个计算设备,每个计算设备包括处理器和存储器;
    所述至少一个计算设备的处理器用于执行所述至少一个计算设备的存储器中存储的指令,以使得所述计算设备集群执行如权利要求1至6中任一项所述的数据库的数据更新方法。
  14. 一种包含指令的计算机程序产品,其特征在于,当所述指令被计算设备集群运行时,使得所述计算设备集群执行如权利要求1至6中任一项所述的数据库的数据更新方法。
  15. 一种计算机可读存储介质,其特征在于,包括计算机程序指令,当所述计算机程序指令由计算设备集群执行时,所述计算设备集群执行如权利要求1至6中任一项所述的数据库的数据更新方法。
PCT/CN2023/123472 2022-11-30 2023-10-09 一种数据库的数据更新方法、系统及计算设备集群 WO2024114105A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211521755.7A CN118152409A (zh) 2022-11-30 2022-11-30 一种数据库的数据更新方法、系统及计算设备集群
CN202211521755.7 2022-11-30

Publications (1)

Publication Number Publication Date
WO2024114105A1 true WO2024114105A1 (zh) 2024-06-06

Family

ID=91295392

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/123472 WO2024114105A1 (zh) 2022-11-30 2023-10-09 一种数据库的数据更新方法、系统及计算设备集群

Country Status (2)

Country Link
CN (1) CN118152409A (zh)
WO (1) WO2024114105A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105227657A (zh) * 2015-09-29 2016-01-06 北京京东尚科信息技术有限公司 一种数据同步的方法和装置
CN111340414A (zh) * 2020-02-14 2020-06-26 上海东普信息科技有限公司 云仓大数据处理方法、云仓系统、计算机设备和存储介质
US20210216351A1 (en) * 2020-01-15 2021-07-15 Purdue Research Foundation System and methods for heterogeneous configuration optimization for distributed servers in the cloud
CN114168380A (zh) * 2021-11-23 2022-03-11 阿里巴巴(中国)有限公司 数据库配置方法、设备、系统和存储介质
CN115033642A (zh) * 2022-05-26 2022-09-09 度小满科技(北京)有限公司 一种Redis集群的数据同步的方法和装置

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105227657A (zh) * 2015-09-29 2016-01-06 北京京东尚科信息技术有限公司 一种数据同步的方法和装置
US20210216351A1 (en) * 2020-01-15 2021-07-15 Purdue Research Foundation System and methods for heterogeneous configuration optimization for distributed servers in the cloud
CN111340414A (zh) * 2020-02-14 2020-06-26 上海东普信息科技有限公司 云仓大数据处理方法、云仓系统、计算机设备和存储介质
CN114168380A (zh) * 2021-11-23 2022-03-11 阿里巴巴(中国)有限公司 数据库配置方法、设备、系统和存储介质
CN115033642A (zh) * 2022-05-26 2022-09-09 度小满科技(北京)有限公司 一种Redis集群的数据同步的方法和装置

Also Published As

Publication number Publication date
CN118152409A (zh) 2024-06-07

Similar Documents

Publication Publication Date Title
US11153380B2 (en) Continuous backup of data in a distributed data store
JP7329518B2 (ja) 追加専用記憶デバイスを使用するデータベース管理のためのシステム及び方法
US20220188003A1 (en) Distributed Storage Method and Device
US10698773B2 (en) Replicating a source data set to a target data store
US11301379B2 (en) Access request processing method and apparatus, and computer device
US10769035B2 (en) Key-value index recovery by log feed caching
JP4809040B2 (ja) ストレージ装置及びスナップショットのリストア方法
US11321291B2 (en) Persistent version control for data transfer between heterogeneous data stores
EP3206128B1 (en) Data storage method, data storage apparatus, and storage device
US11093387B1 (en) Garbage collection based on transmission object models
US10628298B1 (en) Resumable garbage collection
US11442894B2 (en) Methods for scalable file backup catalogs and devices thereof
CN113568566A (zh) 利用索引物件来进行简易存储服务无缝迁移的方法、主装置以及存储服务器
US10223184B1 (en) Individual write quorums for a log-structured distributed storage system
WO2022033269A1 (zh) 数据处理的方法、设备及系统
CN112988680B (zh) 数据加速方法、缓存单元、电子设备及存储介质
WO2020088499A1 (en) Journaling overhead reduction with remapping interface
WO2024114105A1 (zh) 一种数据库的数据更新方法、系统及计算设备集群
CN110119389B (zh) 虚拟机块设备的写操作方法、快照创建方法及装置
US12019977B2 (en) Fast fill for computerized data input
US11418589B1 (en) Object synchronization of server nodes in a network computing environment
US20220138164A1 (en) Address mirroring of a file system journal
JP2017208113A (ja) データ格納方法、データストレージ装置、及びストレージデバイス
US11675668B2 (en) Leveraging a cloud-based object storage to efficiently manage data from a failed backup operation
WO2024119924A1 (zh) 进程的迁移方法、装置及系统