WO2020207078A1 - Data processing method and device, and distributed database system - Google Patents

Data processing method and device, and distributed database system

Info

Publication number
WO2020207078A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
log
processed
management module
execution modules
Prior art date
Application number
PCT/CN2020/070576
Other languages
English (en)
Chinese (zh)
Inventor
刘强
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2020207078A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/1734 Details of monitoring file system events, e.g. by the use of hooks, filter drivers, logs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • This application relates to the field of data processing technology, in particular to data processing methods, devices and distributed database systems.
  • AZ is the abbreviation of availability zone.
  • An AZ includes multiple storage nodes, and different copies of data are stored in different storage nodes. Utilities such as water and electricity are isolated between different AZs, so that a water supply, power supply, or similar problem does not cause storage nodes in multiple AZs to fail at the same time.
  • Tolerating an AZ+1 failure means that when all storage nodes in one AZ fail and one storage node in another AZ also fails, the database system is still available, that is, users can still access data in the database system. Considering data reliability and storage cost, the number of AZs is usually 3.
  • The Aurora database of Amazon Web Services uses the Quorum distributed protocol.
  • The Quorum distributed protocol defines the following: the number of replicas that must be written successfully W, the number of replicas that must be read successfully R, and the total number of replicas N must satisfy W+R>N and W>N/2. Considering the storage cost, W and R usually take the minimum values that satisfy the Quorum distributed protocol.
  • The database system needs to use 6 data copies to tolerate AZ+1 failures, which increases storage costs.
  • When an AZ+1 failure occurs, the condition that 4 copies be written successfully cannot be met, so writing cannot be performed normally, which affects the overall performance of the system.
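  • The arithmetic behind the two preceding statements can be checked with a short sketch. It assumes, which the text does not state explicitly, that the 6 copies are spread evenly over 3 AZs (2 per AZ) and that each copy sits on its own storage node; the function names are illustrative only.

```python
from typing import Tuple


def min_quorum(n: int) -> Tuple[int, int]:
    """Smallest (W, R) that satisfy W > N/2 and W + R > N."""
    w = n // 2 + 1   # smallest integer strictly greater than N/2
    r = n - w + 1    # smallest integer with W + R > N
    return w, r


def survivors_after_az_plus_1(copies_per_az: int, az_count: int) -> int:
    """Copies left after one whole AZ fails plus one node in another AZ.

    Assumes every copy is stored on a separate node.
    """
    total = copies_per_az * az_count
    return total - copies_per_az - 1


w, r = min_quorum(6)
alive = survivors_after_az_plus_1(copies_per_az=2, az_count=3)
print("W =", w, "R =", r, "surviving copies =", alive)
print("reads possible:", alive >= r)
print("writes possible:", alive >= w)
```

  • Under these assumptions, the surviving copies reach the read quorum but fall short of the write quorum, which is the write unavailability described above.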
  • The embodiments of the present application provide data processing methods, devices, and distributed database systems, which help reduce storage costs and improve overall system performance.
  • An embodiment of the present application provides a distributed database system that includes a plurality of nodes, on which a log management module, a data management module, and a plurality of data execution modules are deployed.
  • The multiple nodes may include one or more structured query language (SQL) nodes, and/or multiple storage nodes.
  • The log management module and/or the data management module are deployed on the SQL node, and the data execution module is deployed on the storage node.
  • The log management module is used to send the pending log to the data management module after determining that the pending log has been successfully written to M of the multiple nodes, where M is an integer greater than or equal to 3 and the pending log records an update operation for updating the data to be processed. The data management module is used to send the pending log to N of the multiple data execution modules, where N is an integer greater than or equal to 1. Each of the N data execution modules is used to update the data to be processed according to the update operation recorded in the pending log.
  • This technical solution can store one or more copies (such as 3 copies) of data in the database and still tolerate AZ+1 failures, which reduces storage costs compared with the prior art.
  • The database can continue to be read and written, that is, neither reading data nor writing data is affected. Therefore, compared with the prior art, the overall system performance can be improved.
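  • The division of work between the three kinds of modules can be illustrated with a minimal in-memory sketch. All class and method names are hypothetical, the update operation is modeled as adding a number to a stored value, and node failure is reduced to a boolean flag; the sketch does not handle retries or cleanup of partially written logs.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PendingLog:
    data_id: str   # identifies the data to be processed
    delta: int     # hypothetical update operation: add delta to the stored value


@dataclass
class Node:
    healthy: bool = True
    logs: List[PendingLog] = field(default_factory=list)

    def write_log(self, log: PendingLog) -> bool:
        if self.healthy:
            self.logs.append(log)
        return self.healthy


@dataclass
class DataExecutionModule:
    data: Dict[str, int] = field(default_factory=dict)

    def apply(self, log: PendingLog) -> None:
        self.data[log.data_id] = self.data.get(log.data_id, 0) + log.delta


class DataManagementModule:
    def __init__(self, executors: List[DataExecutionModule]):
        assert len(executors) >= 1   # N >= 1
        self.executors = executors

    def dispatch(self, log: PendingLog) -> None:
        for executor in self.executors:
            executor.apply(log)


class LogManagementModule:
    def __init__(self, nodes: List[Node], data_mgr: DataManagementModule):
        assert len(nodes) >= 3       # M >= 3
        self.nodes, self.data_mgr = nodes, data_mgr

    def submit(self, log: PendingLog) -> bool:
        # Forward the log only after all M log writes have succeeded.
        if not all([node.write_log(log) for node in self.nodes]):
            return False
        self.data_mgr.dispatch(log)
        return True


executors = [DataExecutionModule({"page-1": 10}) for _ in range(3)]
nodes = [Node() for _ in range(3)]
log_mgr = LogManagementModule(nodes, DataManagementModule(executors))
ok = log_mgr.submit(PendingLog("page-1", +1))
print(ok, [e.data["page-1"] for e in executors])
```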
  • Each data execution module is also used to return data writing success indication information to the data management module after the update operation is performed.
  • The data management module is also used to return the data writing success indication information to the log management module after receiving the data writing success indication information returned by some of the N data execution modules (for example, one data execution module).
  • This helps the system tolerate slow input/output (IO), thereby improving overall system performance.
  • Slow IO refers to an IO request with a relatively long response time, that is, a long time between the data execution module receiving the pending log and returning the data writing success indication information.
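  • One way to picture "return success after the first acknowledgement" is the following sketch built on Python's standard `concurrent.futures` module. The module names and delays are made up for illustration, and a `time.sleep` call stands in for the IO performed by each data execution module.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from typing import Dict


def apply_pending_log(module_name: str, delay_s: float) -> str:
    """Stand-in for a data execution module applying the pending log."""
    time.sleep(delay_s)   # simulated IO latency
    return module_name    # acts as the data writing success indication


def dispatch_and_wait_for_first(delays: Dict[str, float]) -> str:
    """Send the log to all modules; return as soon as one reports success."""
    pool = ThreadPoolExecutor(max_workers=len(delays))
    futures = [pool.submit(apply_pending_log, name, d) for name, d in delays.items()]
    done, _still_running = wait(futures, return_when=FIRST_COMPLETED)
    first = next(iter(done)).result()
    pool.shutdown(wait=False)   # slow modules finish in the background
    return first


# Two modules suffer slow IO; the caller is not held up by them.
first_ack = dispatch_and_wait_for_first(
    {"data-exec-1": 0.5, "data-exec-2": 0.01, "data-exec-3": 0.5}
)
print("first success indication from:", first_ack)
```

  • The slow modules still complete their updates; only the acknowledgement to the caller is decoupled from them.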
  • The distributed database system also includes multiple log execution modules.
  • The log management module is also used to send the log to be processed to M of the multiple log execution modules; each of the M log execution modules is used to write the log to be processed into a node. For example, each log execution module writes the log to be processed into the storage node where the log execution module is located.
  • Different log execution modules are deployed on different nodes. This possible design provides a specific implementation for writing multiple copies of the log.
  • The N data execution modules are also used to synchronize the updated data to be processed; the log management module is also used to control the M log execution modules to delete the pending logs written to the M nodes after determining that the N data execution modules have synchronized the updated data to be processed. This helps reduce storage costs.
  • Because the premise of deleting the log is that the data execution modules have synchronized multiple copies (such as 3) of the updated data, the latest data can be read as long as one of the multiple copies is available, so an AZ+1 failure can be tolerated.
  • Each of the N data execution modules records one or more versions of the data to be processed, and one version of the data to be processed corresponds to one log of the data to be processed.
  • The log management module is specifically used to: obtain the smallest version number among the N latest version numbers of the data to be processed recorded by the N data execution modules, where each of the N data execution modules records one latest version number of the data to be processed; and control the M log execution modules to delete the target log of the data to be processed, where the version number of the data to be processed corresponding to the target log is less than or equal to the smallest version number, and the target log includes the log to be processed.
  • The latest version number of the page to be queried may be the latest log sequence number (LSN) of the log of the page to be queried.
  • The log management module is also used to send a query command to the data management module, where the query command is used to query the N latest version numbers of the data to be processed recorded by the N data execution modules. The data management module is also used to send the query command to each of the N data execution modules. Each of the N data execution modules is also used to return the latest version number of the data to be processed that it has recorded to the data management module. The data management module is also used to return the N latest version numbers of the data to be processed returned by the N data execution modules to the log management module.
  • This possible design provides a specific implementation for the log management module to obtain the latest version numbers of the data to be processed.
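  • The deletion rule described above reduces to taking a minimum and filtering. The sketch below assumes that version numbers are LSNs represented as integers; the function names and sample numbers are illustrative.

```python
from typing import List


def truncation_point(latest_versions: List[int]) -> int:
    """Smallest of the N latest version numbers reported by the data execution modules."""
    return min(latest_versions)


def remaining_logs(log_lsns: List[int], latest_versions: List[int]) -> List[int]:
    """LSNs kept after deleting every target log (LSN <= smallest latest version)."""
    floor = truncation_point(latest_versions)
    return [lsn for lsn in log_lsns if lsn > floor]


# Three data execution modules have applied logs up to LSN 7, 5 and 9 respectively.
reported = [7, 5, 9]
kept = remaining_logs([3, 4, 5, 6, 7, 8, 9], reported)
print(truncation_point(reported), kept)
```

  • Logs above the minimum are kept because at least one data execution module has not yet applied them.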
  • M log execution modules are deployed on M nodes, and different log execution modules are deployed on different nodes.
  • Different nodes among the M nodes belong to different AZs.
  • N data execution modules are deployed on N nodes, and different data execution modules are deployed on different nodes.
  • Different nodes among the N nodes belong to different AZs.
  • Different execution modules are deployed on different nodes.
  • The data to be processed is updated at page granularity. That is to say, the data to be processed in the first aspect and any of its possible designs may be pages to be processed.
  • The embodiments of the present application provide a data processing method, which is applied to a distributed database system.
  • The distributed database system includes multiple nodes on which a log management module, a data management module, and multiple data execution modules are deployed.
  • The method includes: the log management module determines that the log to be processed has been successfully written to M of the multiple nodes, where M is an integer greater than or equal to 3 and the log to be processed records an update operation for updating the data to be processed; after determining that the log to be processed has been successfully written to the M nodes, the log management module sends the log to be processed to the data management module; the log to be processed is used by the data management module to control N of the multiple data execution modules to update the data to be processed according to the update operation recorded in the log to be processed, where N is an integer greater than or equal to 1.
  • The distributed database system also includes multiple log execution modules.
  • The method further includes: the log management module sends the log to be processed to M of the multiple log execution modules; the log to be processed is used by the M log execution modules to write the log to be processed into the M nodes.
  • The method further includes: after the log management module determines that the N data execution modules have synchronized the updated data to be processed, controlling the M log execution modules to delete the pending logs written to the M nodes.
  • The log management module controlling the M log execution modules to delete the pending logs written to the M nodes includes: obtaining the smallest version number among the N latest version numbers of the data to be processed recorded by the N data execution modules, where each of the N data execution modules records one latest version number of the data to be processed; and controlling the M log execution modules to delete the target log of the data to be processed, where the version number of the data to be processed corresponding to the target log is less than or equal to the smallest version number, and the target log includes the log to be processed.
  • The method further includes: the log management module sends a query command to the data management module, where the query command is used to query the N latest version numbers of the data to be processed recorded by the N data execution modules; the log management module receives the N latest version numbers of the data to be processed returned by the data management module.
  • An embodiment of the present application provides a data processing device, which can be used to execute any method provided by the second aspect or any possible design of the second aspect.
  • The device may be the log management module in the second aspect or any possible design of the second aspect, or a node (such as a storage node or an SQL node) on which that log management module is deployed.
  • The device can be divided into functional modules according to the method provided by the second aspect or any of the possible designs of the second aspect.
  • For example, each functional module can be divided corresponding to each function, or two or more functions can be integrated into one processing module.
  • The device is specifically a node containing the log management module in the second aspect or any one of the possible designs of the second aspect.
  • The node includes a memory and a processor, and the memory is used to store computer programs.
  • The processor is used to call the computer program to realize the function of the log management module.
  • For the function of the log management module, reference may be made to the foregoing second aspect or any possible design of the second aspect.
  • An embodiment of the present application provides a device to implement the function of the data processing device provided in the third aspect or any one of the possible designs of the third aspect.
  • The device includes a processor and an interface.
  • The device may be a chip, for example.
  • The processor may be realized by hardware or software. When realized by hardware, the processor may be a logic circuit, an integrated circuit, etc.; when realized by software, the processor may be a general-purpose processor that is realized by reading software code stored in a memory.
  • The memory may be integrated in the processor, or may be located outside the processor and exist independently. The interface is used for information exchange between the device and other modules/devices/equipment.
  • The embodiments of the present application provide a computer-readable storage medium, such as a non-transitory computer-readable storage medium.
  • A computer program is stored thereon, and when the computer program is run on a computer, the computer is caused to execute any method provided by the second aspect or any possible design of the second aspect.
  • The computer may be a node (such as a storage node or an SQL node) in a distributed database system.
  • The embodiments of the present application provide a computer program product, which when running on a computer, enables any method provided in the second aspect or any possible design of the second aspect to be executed.
  • The computer may be a node (such as a storage node or an SQL node) in a distributed database system.
  • An embodiment of the present application provides a node cluster including at least one node. Each node includes a memory and a processor; the memory of each node is used to store a computer program, and the processor of the at least one node is used to execute the computer program so as to execute any method provided by the second aspect or any possible design of the second aspect.
  • An embodiment of the present application provides a computer-readable storage medium including a computer program; when the computer program runs on a node cluster, the node cluster executes any method provided by the second aspect or any possible design of the second aspect.
  • An embodiment of the present application provides a computing device program product.
  • When the computing device program product is executed by a node cluster, the node cluster executes any method provided by the second aspect or any possible design of the second aspect.
  • Any of the methods, devices, computer storage media, or computer program products provided above can be applied to the corresponding distributed database system provided above. Therefore, for the beneficial effects that they can achieve, refer to the beneficial effects of the corresponding distributed database system; details are not repeated here.
  • FIG. 1 is a schematic diagram of an AZ+1 failure in a distributed database system provided by the prior art;
  • FIG. 2 is a schematic structural diagram of a distributed database system applicable to embodiments of the present application;
  • FIG. 3 is a schematic structural diagram of a node applicable to an embodiment of the present application;
  • FIG. 4 is a schematic structural diagram of another distributed database system applicable to the embodiments of the present application;
  • FIG. 5 is a schematic interaction diagram of a method for writing data provided by an embodiment of this application;
  • FIG. 6 is a schematic interaction diagram of a method for deleting a log (or truncating a log) provided by an embodiment of this application;
  • FIG. 7 is a schematic interaction diagram of a method for reading data provided by an embodiment of this application;
  • FIG. 8 is a schematic structural diagram of a data processing device provided by an embodiment of this application.
  • "At least one" in the embodiments of the present application includes one or more.
  • "Multiple" means two or more.
  • For example, at least one of A, B and C includes: A alone, B alone, C alone, A and B together, A and C together, B and C together, and A, B and C together.
  • "/" means "or"; for example, A/B can mean A or B.
  • "And/or" in this document merely describes an association relationship between associated objects and indicates that three relationships are possible. For example, "A and/or B" can mean: A exists alone, both A and B exist, or B exists alone.
  • Words such as "first" and "second" are used to distinguish the same items or similar items with substantially the same function and effect. Those skilled in the art can understand that words such as "first" and "second" do not limit the quantity or the order of execution, and do not require that the items be different.
  • FIG. 2 is a schematic structural diagram of a distributed database system applicable to embodiments of the present application.
  • The distributed database system includes a proxy layer 101, a SQL layer 102, a storage layer 103, a cluster manager 104, and an application layer 105 running on the proxy layer 101.
  • The application layer 105 includes one or more applications (APPs).
  • The proxy layer 101 has functions such as database and table partitioning, transparent access, read-write separation, and load balancing.
  • The SQL layer 102 may include multiple SQL nodes, and each SQL node may include a SQL engine and a storage adapter; the SQL engine has functions such as SQL parsing and transaction concurrency control; the storage adapter has functions such as storage layer interface adaptation and access routing.
  • The storage layer 103 may include multiple storage nodes, and each storage node may be used to store data and/or logs.
  • The storage layer 103 also has functions such as request processing and distribution, log sequence number (LSN) preserving control, and management and maintenance command processing.
  • The storage layer 103 also has functions such as using a replication protocol to implement synchronous and asynchronous data replication between data copies, maintaining database state machines, processing recovery logs generated by write nodes, and providing asynchronous pages and corresponding page reading services.
  • The cluster manager 104 has functions such as cluster management, software start and stop, and fault monitoring and processing.
  • The storage node may be a server.
  • The distributed database system shown in FIG. 2 is only an example, which does not limit the structure of the distributed database system applicable to the embodiments of the present application.
  • FIG. 3 is a schematic diagram of a structure of a node (such as a SQL node or a storage node) applicable to the embodiment of the present application.
  • The node includes at least one processor 201, a communication line 202, a memory 203, and at least one communication interface 204.
  • The processor 201 can be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling the execution of the programs of this application.
  • The communication line 202 may include a path for transferring information between the aforementioned components.
  • The communication interface 204 may be any device such as a transceiver for communicating with other devices or a communication network.
  • The communication network may be Ethernet, a radio access network (RAN), or a wireless local area network (WLAN).
  • The memory 203 can be a read-only memory (ROM) or another type of static storage device that can store static information and instructions, a random access memory (RAM) or another type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited to these.
  • The memory 203 may exist independently and be connected to the processor 201 through the communication line 202.
  • The memory 203 may also be integrated with the processor 201.
  • The memory 203 provided by the embodiment of the present application may generally be non-volatile. The memory 203 is used to store computer instructions for executing the solution of the present application.
  • The processor 201 is configured to execute computer instructions stored in the memory 203, so as to implement the method provided in the embodiment of the present application.
  • The computer instructions in the embodiments of the present application may also be referred to as application program codes.
  • The aforementioned communication interface 204 may be optional.
  • A node may include multiple processors, such as the processor 201 and the processor 205 in FIG. 3. Each of these processors can be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor.
  • The processor here may refer to one or more devices, circuits, and/or processing cores for processing data (for example, computer program instructions).
  • The structure of the node shown in FIG. 3 is only an example, which does not limit the structure of the applicable node (such as the SQL node or the storage node) in the embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of another distributed database system applicable to the embodiments of the present application.
  • The distributed database system includes: a database proxy module (database proxy) 31, an SQL module 32, a log management module 33, a data management module 34, a log execution module 35, and a data execution module 36.
  • The database proxy module 31 may correspond to the proxy layer 101 in FIG. 2.
  • The SQL module 32 may correspond to the SQL layer 102 in FIG. 2, that is, the SQL module 32 may be deployed on the SQL node.
  • The SQL module 32 may correspond to the SQL engine in the SQL layer 102.
  • One SQL module 32 can be implemented by a SQL engine, but of course it is not limited to this.
  • The log execution module 35 and the data execution module 36 may correspond to the storage layer 103 in FIG. 2, that is, the log execution module 35 and the data execution module 36 may be deployed on the storage node.
  • The log management module 33 and/or the data management module 34 may correspond to the SQL layer 102 or the storage layer 103.
  • The log management module 33 and/or the data management module 34 may be deployed on the SQL node or the storage node.
  • The log management module 33 and/or the data management module 34 may correspond to a storage adapter in the SQL layer 102.
  • A log management module 33 or a data management module 34 can be implemented through a storage adapter, but of course it is not limited to this.
  • The database proxy module 31 can be connected to one or more SQL modules 32 to provide access to the database, and has functions such as database and table partitioning, transparent access, read-write separation, and load balancing.
  • The SQL module 32 can be connected to one or more log management modules 33 and one or more data management modules 34, and has functions such as SQL parsing and transaction concurrency control.
  • The log management module 33 may be connected to one or more log execution modules 35 to manage the log execution modules 35, distribute logs to the log execution modules 35, and send logs to the data management module 34.
  • The log execution module 35 is used to store and synchronize logs. For example, it stores the received log in the storage space of the storage node where the log execution module 35 is located, and synchronizes the stored logs with other log execution modules 35.
  • The data management module 34 may be connected to one or more data execution modules 36 to manage the data execution modules 36 and distribute logs to the data execution modules 36.
  • The data execution module 36 is used for updating, storing and synchronizing data in the database. For example, it updates the corresponding data according to the received log, stores the updated data (for example, in the storage space of the storage node where the data execution module 36 is located), and synchronizes the updated data with other data execution modules 36.
  • The network connection may be a wired connection or a wireless connection.
  • The connection may also be made through a communication bus, such as a peripheral component interconnect express (PCIe) bus, which is a high-speed serial computer expansion bus standard.
  • Some or all of the database proxy module 31, SQL module 32, log management module 33, data management module 34, log execution module 35, and data execution module 36 may be stored in the form of computer programs on the nodes where they are deployed.
  • The processor in the node (such as the processor 201 in FIG. 3) can call the computer program to realize the function of the corresponding module.
  • Any one of the aforementioned database proxy module 31, SQL module 32, log management module 33, data management module 34, log execution module 35, and data execution module 36 can be deployed in a node cluster.
  • The node cluster may include at least one node.
  • An AZ can be understood as a collection of multiple storage nodes.
  • The number of AZs is usually 3.
  • The embodiment of the present application is not limited to this.
  • The following description uses 3 AZs as an example.
  • The data in the database is ultimately stored in the form of pages, and all data in a database can be regarded as a collection of pages.
  • The log is usually used to record update operations on the database.
  • A log is generated every time the database is updated. Making changes to a page means updating the page. For example, if the data stored in the database is 10 and a +1 operation is performed on the data, a log is generated in the database, and the update operation recorded in the log is the +1 operation.
  • A log includes the identifier of a page and an update operation.
  • An update operation refers to updating data, such as adding, modifying, or deleting data.
  • When the update operation includes an add operation, the update operation is used to request that the page be added to the database; when the update operation includes a modify operation, the update operation is used to request modification of the page in the database; when the update operation includes a delete operation, the update operation is used to request deletion of the page from the database.
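  • A log of this kind can be pictured as a small record that is replayed against a page store. In the sketch below, the field names, the operation strings, and the modeling of "modify" as adding a number to the page value are assumptions made for illustration; the source does not define a log format.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Log:
    page_id: str                 # identifier of the page
    op: str                      # "add", "modify" or "delete"
    value: Optional[int] = None  # initial value for "add", delta for "modify"


def apply_log(pages: Dict[str, int], log: Log) -> None:
    """Replay one log against an in-memory page store."""
    if log.op == "add":
        pages[log.page_id] = log.value
    elif log.op == "modify":
        pages[log.page_id] += log.value   # e.g. the "+1 operation"
    elif log.op == "delete":
        del pages[log.page_id]
    else:
        raise ValueError(f"unknown update operation: {log.op}")


pages: Dict[str, int] = {}
apply_log(pages, Log("page-1", "add", 10))
apply_log(pages, Log("page-1", "modify", 1))
print(pages)                     # the stored value 10 has become 11
apply_log(pages, Log("page-1", "delete"))
print(pages)                     # the page has been removed
```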
  • The methods provided in the following embodiments can all be applied to the distributed database system shown in FIG. 4.
  • The following description uses an example in which the number of AZs is 3, the number of log copies is 3, and the number of data copies is 3.
  • The description also uses updating the database at page granularity as an example. In actual implementation, the database may also be updated at other granularities.
  • FIG. 5 is a schematic interaction diagram of a method for writing data provided by an embodiment of this application.
  • The method includes:
  • the SQL module determines the page to be processed, and generates a log of the page to be processed (that is, the log to be processed). Among them, the log to be processed includes the identification and update operation of the page to be processed.
  • the application in the distributed database system can send one or more write requests to the database proxy module; the database proxy module allocates the received one or more write requests to connect to the database proxy module One or more SQL modules. For each SQL module that receives a write request, the identifier of one or more pages to be processed can be determined according to the received one or more write requests.
  • the SQL module submits a log to be processed to one of the log management modules connected to the SQL module, and this log management module is referred to as a target log management module hereinafter.
  • the embodiment of the present application does not limit the specific implementation manner of determining the target log management module by the SQL module, for example, reference may be made to the prior art.
  • the page to be processed may be updated multiple times; accordingly, the SQL module can generate multiple logs containing the identifier of the page to be processed.
  • the SQL module may submit multiple pending logs of the page to be processed in a single submission.
  • the target log management module sends the logs to be processed to the first log execution module, the second log execution module, and the third log execution module, respectively.
  • the first log execution module, the second log execution module, and the third log execution module are respectively deployed on different storage nodes. In this way, even if the storage node where one of the log execution modules is located fails, the other log execution modules can still operate normally, thereby improving the reliability of the log.
  • the storage nodes on which the first log execution module, the second log execution module, and the third log execution module are deployed belong to different AZs. In this way, even if the storage node in the AZ where one of the log execution modules is located fails, the other log execution modules can still operate normally, thereby improving the reliability of the log.
  • an AZ can correspond to a log execution module, and the log execution module is responsible for the storage and synchronization of logs in that AZ.
  • the first log execution module, the second log execution module, and the third log execution module may be log execution modules corresponding to 3 AZs, respectively.
  • the embodiment of the present application is not limited to this.
  • one AZ may correspond to one or more log execution modules.
  • Each of the first log execution module, the second log execution module, and the third log execution module writes the received pending log into the storage space of the storage node where that log execution module is located and, after the write succeeds, returns log-write success indication information to the target log management module.
  • After receiving the log-write success indication information returned by the first log execution module, the second log execution module, and the third log execution module, the target log management module sends the log to be processed to one of the data management modules connected to the target log management module.
  • the data management module is referred to as the target data management module.
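  • The log persistence steps above can be sketched as follows: the log is forwarded to the data management module only after every log execution module has reported a successful write. This is a simplified, single-process sketch; the class names, the `healthy` flag, and the choice to simply return `False` when a write fails are assumptions for illustration, since the embodiment does not specify failure handling at this point.

```python
class LogExecutionModule:
    """Stand-in for a log execution module on one storage node."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy   # hypothetical switch to simulate a node failure
        self.stored = []         # stands in for the node's storage space

    def write(self, log) -> bool:
        # Return True as the log-write success indication.
        if not self.healthy:
            return False
        self.stored.append(log)
        return True


class DataManagementModule:
    """Stand-in for the target data management module."""

    def __init__(self):
        self.received = []

    def submit(self, log):
        self.received.append(log)


def persist_then_forward(log, log_executors, data_manager) -> bool:
    """Write the log to all log execution modules, then forward it."""
    acks = [executor.write(log) for executor in log_executors]
    if not all(acks):
        return False             # assumed behaviour: do not forward yet
    data_manager.submit(log)
    return True
```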
  • the embodiment of this application does not limit the specific implementation manner of the target log management module determining the target data management module.
  • the correspondence between the identifiers of multiple pages and the multiple data management modules can be predefined; based on this correspondence, the identifier of the page to be processed is looked up among the identifiers of the multiple pages to determine the data management module corresponding to the page to be processed, and the determined data management module is used as the target data management module.
  • S105 may be executed first and then S106 may be executed, or S106 may be executed first and then S105 may be executed, or S105 and S106 may be executed simultaneously.
  • the target data management module sends the logs to be processed to the first data execution module, the second data execution module, and the third data execution module, respectively.
  • the embodiment of the present application does not limit the specific implementation manners of the target data management module determining the first data execution module, the second data execution module, and the third data execution module.
  • the first data execution module, the second data execution module, and the third data execution module are respectively deployed on different storage nodes, which helps to improve data reliability.
  • the storage nodes on which the first data execution module, the second data execution module, and the third data execution module are deployed belong to different AZs, which helps to further improve the reliability of the data.
  • the first data execution module, the second data execution module, the third data execution module, the first log execution module, the second log execution module, and the third log execution module are respectively deployed on different storage nodes. This helps to improve the reliability of data and logs.
  • Each of the first data execution module, the second data execution module, and the third data execution module performs the update operation recorded in the pending log on the page to be processed and, after the update succeeds, returns data-write success indication information to the target data management module.
  • If a data execution module successfully performs the update operation on the page to be processed, a copy of the page to be processed can be considered successfully written to the database.
  • the target data management module can return the data-write success indication information to the target log management module after receiving the data-write success indication returned by only some of the data execution modules (for example, one of them).
  • In a specific implementation, different data execution modules may respond to IO requests at different speeds because the hardware resources of their storage nodes differ. Specifically, the time from receiving the log to be processed to returning the data-write success indication information differs between data execution modules. In the embodiments of this application, an IO request with a longer response time is referred to as "slow IO".
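  • To illustrate why acknowledging after only some data execution modules have responded hides slow IO, the sketch below computes the moment at which the data management module could acknowledge a write, given simulated response times. `WriteResult`, `ack_time`, and the latency figures are hypothetical and serve only to make the comparison concrete.

```python
from typing import List, NamedTuple, Optional


class WriteResult(NamedTuple):
    module: str
    latency_ms: float   # simulated time until the module reports back
    ok: bool            # whether the update succeeded


def ack_time(results: List[WriteResult], required_acks: int = 1) -> Optional[float]:
    """Return the simulated time at which the write can be acknowledged,
    or None if fewer than `required_acks` modules succeeded."""
    successes = sorted(r.latency_ms for r in results if r.ok)
    if len(successes) < required_acks:
        return None
    # The acknowledgement is gated by the slowest of the required successes.
    return successes[required_acks - 1]
```

  • With one required acknowledgement, the write completes as soon as the fastest module succeeds; requiring all three would make the write wait for the slow IO.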
  • the target log management module returns the writing success indication information to the SQL module.
  • the database has successfully written at least one copy (ie, data copy) of the page to be processed.
  • the data execution modules can synchronize copies of the page to be processed by executing background tasks. For example, assume that when S108 is executed, the first data execution module has successfully written a copy of the page to be processed, but the second data execution module and the third data execution module have not. The first data execution module can then synchronize the successfully written copy of the page to be processed to the second data execution module and the third data execution module, thereby improving the reliability of the data.
  • the specific synchronization process can refer to the prior art, which will not be repeated here.
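  • The embodiment leaves the synchronization process to the prior art. Purely as an illustration, one simple approach is to copy the newest version held by any data execution module to the modules that lag behind. The function name `synchronize` and the data layout (module name mapped to a dict of page identifier to a `(version, content)` pair) are assumptions for this sketch.

```python
def synchronize(replicas: dict, page_id: str) -> None:
    """Bring every replica of `page_id` up to the newest known version.

    `replicas` maps a module name to {page_id: (version, content)}.
    """
    holders = [pages[page_id] for pages in replicas.values() if page_id in pages]
    if not holders:
        return  # no module holds the page, nothing to synchronize
    newest = max(holders, key=lambda version_content: version_content[0])
    for pages in replicas.values():
        # Overwrite missing or older copies with the newest one.
        if page_id not in pages or pages[page_id][0] < newest[0]:
            pages[page_id] = newest
```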
  • the target data management module sends the log to be processed to some or all of the data execution modules among the first data execution module, the second data execution module and the third data execution module.
  • the data execution module that receives the to-be-processed log performs the update operation recorded in the to-be-processed log on the page to be processed, and after the update is successful, returns the data writing success indication information to the target data management module.
  • After determining that the first data execution module, the second data execution module, and the third data execution module have synchronized copies of the page to be processed, the target log management module may control the first log execution module, the second log execution module, and the third log execution module to delete the pending log written in S104. In this way, storage resources can be saved.
  • the embodiment of the present application does not limit how the target log management module determines whether the first data execution module, the second data execution module, and the third data execution module have synchronized the copy of the page to be processed.
  • An example provided by the embodiment of the present application may be as shown in FIG. 6.
  • The first stage: persisting the log, that is, storing multiple copies of the log to be processed.
  • The second stage: applying the log to the data, that is, performing the update operation recorded in the log to be processed on the data to be processed, so that multiple copies of the data are stored.
  • Since the log is read-only, it does not need to be modified after being written. Therefore, when the storage node holding one copy of the log fails, the log can be written to another storage node. As long as one of the multiple copies of the persisted log is available, the latest data can be read, so AZ+1 failures can be tolerated. In addition, even if the only successfully written copy of the page to be processed is lost, the latest data can be recovered from a copy of the persisted log, so AZ+1 failures can still be tolerated.
  • The prerequisite for deleting these logs is that the data execution modules have synchronized multiple copies (for example, three) of the updated data. Therefore, as long as one of these copies is available, the latest data can be read, so AZ+1 failures can be tolerated.
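  • The recovery argument above can be made concrete with a small sketch: starting from an older surviving copy of a page, the logs from any available log copy are replayed in order to rebuild the latest content. The function name `recover_page`, the tuple layout `(lsn, page_id, op, value)`, and the use of `None` to mark an unavailable log copy are assumptions for illustration.

```python
def recover_page(page_id, base_version, base_content, log_copies):
    """Rebuild the latest state of a page from an older copy plus logs.

    `log_copies` holds one list of (lsn, page_id, op, value) tuples per log
    execution module; None marks a copy whose storage node is unavailable.
    """
    for copy in log_copies:
        if copy is None:
            continue  # try the next log copy
        version, content = base_version, base_content
        for lsn, pid, op, value in sorted(copy, key=lambda record: record[0]):
            if pid != page_id or lsn <= version:
                continue  # other page, or update already applied
            content = None if op == "delete" else value
            version = lsn
        return version, content
    raise RuntimeError("no log copy available")
```

  • One available log copy is enough for the replay to succeed, which mirrors the statement that a single surviving copy of the persisted log suffices.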
  • the embodiments of the present application can store three copies of data in the database, that is, tolerate AZ+1 failures, and can reduce storage costs compared to the prior art.
  • the database can continue to be read and written, that is, it neither affects the read data nor the write data. Therefore, compared with the prior art, the overall system performance can be improved.
  • the log storage mode in the embodiment of the present application is append-only, that is, a log is not modified after being written and is read-only; the data storage mode is write-in-place, that is, data can be modified after being written.
  • the page to be processed is updated multiple times, that is, each data execution module of the first data execution module, the second data execution module, and the third data execution module saves one or more versions of the page to be processed
  • the process of deleting logs (also called truncating logs) by the target log management module will be described.
  • FIG. 6 is a schematic diagram of the interaction in a method for deleting a log (or truncating a log) provided by an embodiment of this application.
  • the method shown in FIG. 6 may include the following steps:
  • the target log management module sends a query command to the target data management module.
  • the query command may include the identification of the page to be queried.
  • the query command is used to query the log persistence status of the page to be queried, or to query the update status of the page to be queried.
  • the "page to be queried" may be the page to be processed above.
  • the update status of the page to be queried can be characterized by the latest version number of the page to be queried.
  • the latest version number of the page to be queried may be the latest LSN of the log of the page to be queried.
  • The data execution module can record the latest W versions of each page it manages, together with their version numbers, where W is an integer greater than or equal to 1. For how the value of W is chosen, refer to the prior art.
  • A version of a page managed by a data execution module may be obtained by updating the page when a log of the page sent by the data management module is received, or it may be obtained through data synchronization between the data execution module and other data execution modules. Because different data execution modules write data at different rates (that is, they take different amounts of time to process IO requests), the latest version numbers recorded by different data execution modules may diverge as the number of updates to the same page increases. Of course, the latest version numbers of the page recorded by different data execution modules may also be the same. It should be noted that the embodiments of this application take as an example the case where the version number of a page increases with the number of times the page is updated.
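  • A data execution module that keeps only the latest W versions of each page could be sketched as below. The class `PageVersions` and its methods are hypothetical, and the sketch assumes that versions of a page arrive in increasing version-number order.

```python
from collections import OrderedDict


class PageVersions:
    """Keep the latest W versions of each page (W >= 1)."""

    def __init__(self, w: int):
        if w < 1:
            raise ValueError("W must be an integer greater than or equal to 1")
        self.w = w
        self._versions = {}  # page_id -> OrderedDict of version -> content

    def put(self, page_id, version, content):
        versions = self._versions.setdefault(page_id, OrderedDict())
        versions[version] = content
        while len(versions) > self.w:
            versions.popitem(last=False)  # evict the oldest stored version

    def latest_version(self, page_id):
        versions = self._versions.get(page_id)
        return next(reversed(versions)) if versions else None

    def get(self, page_id, version):
        # Returns None when the page or the requested version is not stored.
        return self._versions.get(page_id, {}).get(version)
```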
  • the target data management module sends the query command to the first data execution module, the second data execution module, and the third data execution module respectively.
  • the first data execution module, the second data execution module, and the third data execution module are data execution modules connected to the target data management module and corresponding to the identifier of the page to be queried.
  • the first data execution module, the second data execution module, and the third data execution module may be data execution modules responsible for updating, storing and/or synchronizing the page to be queried.
  • each of the first data execution module, the second data execution module, and the third data execution module queries the latest version number of the page to be queried that it has recorded and returns it to the target data management module.
  • the target data management module returns a query result to the target log management module, where the query result may include: the latest version number of the page to be queried returned by each data execution module.
  • the target log management module selects the minimum version number of the page to be queried from the query result, and sends the minimum version number to the first log execution module, the second log execution module, and the third log execution module, respectively.
  • the above S204 to S205 can be replaced with: the target data management module selects the minimum version number of the page to be queried from the query results and returns the minimum version number to the target log management module; the target log management module sends the minimum version number to the first log execution module, the second log execution module, and the third log execution module, respectively.
  • the larger the version number the newer the page, that is, the more times the page is updated.
  • the smaller the version number the older the data, that is, the fewer times the page has been updated.
  • S206 The first log execution module, the second log execution module, and the third log execution module delete the logs corresponding to the minimum version number and all version numbers less than the minimum version number.
  • For example, a minimum version number of 50 indicates that the data execution modules have all performed the first 50 update operations on the page to be queried (that is, the first data execution module, the second data execution module, and the third data execution module have synchronized the data resulting from the first 50 update operations on the page to be queried).
  • In this case, the first log execution module, the second log execution module, and the third log execution module can delete the logs of the page to be queried with LSN ≤ 50.
  • In this embodiment, the smallest version number among the latest version numbers recorded by these data execution modules is determined, and the logs of the page recorded by the log execution modules whose version numbers do not exceed the minimum version number are deleted. This achieves the purpose of deleting the logs of pages that have already been synchronized, thereby saving storage resources.
  • the specific implementation is not limited to this.
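  • The truncation procedure above reduces to taking the minimum of the latest version numbers and deleting every log at or below it. The following sketch shows this on in-memory lists; `truncate_logs` and the `(lsn, record)` layout are assumptions for illustration, with the LSN used as the version number as in the example above.

```python
def truncate_logs(latest_versions, log_copies):
    """Delete logs that every data execution module has already applied.

    `latest_versions` is the latest version number of the page reported by
    each data execution module; `log_copies` holds one list of
    (lsn, record) pairs per log execution module. Returns the minimum
    version number that was used as the truncation point.
    """
    min_version = min(latest_versions)
    for copy in log_copies:
        # Keep only logs newer than the minimum version, editing in place.
        copy[:] = [(lsn, record) for lsn, record in copy if lsn > min_version]
    return min_version
```

  • With reported latest versions of 50, 62, and 57, the truncation point is 50, so logs with LSN 51 and above are retained for the modules that have not yet caught up.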
  • FIG. 7 is a schematic diagram of the interaction in a method for reading data provided by an embodiment of this application.
  • the method shown in FIG. 7 may include the following steps:
  • the SQL module determines the identifier of the page to be read.
  • the page to be read may be the page to be processed above.
  • the SQL module can also determine the version number of the page to be read, such as the LSN of the log of the page to be read.
  • By default, the SQL module may treat this read as a read of the latest version of the page to be read, although the embodiment of the application is not limited to this.
  • the application in the distributed database system can send one or more read requests to the database proxy module; the database proxy module distributes the received read requests to one or more SQL modules connected to the database proxy module.
  • the embodiment of the present application does not limit how to allocate.
  • the prior art can be referred to.
  • the identifier of one or more pages to be read can be determined according to the received one or more read requests, and the specific implementation manner can refer to the prior art.
  • reading data in the database based on page granularity is taken as an example for description. In actual implementation, data in the database may also be read at other granularities.
  • the SQL module in the distributed database system can be an integrated read-write SQL module, that is, an SQL module that can process both read requests and write requests; or it can be an SQL module with reads and writes separated. Information can be synchronized between SQL modules. Therefore, the "SQL module" in this embodiment and the "SQL module" in the embodiment shown in FIG. 5 may be the same or different. Correspondingly, other modules in this embodiment may be the same as or different from the modules with the same name in the embodiment shown in FIG. 5.
  • the SQL module sends the identifier of the page to be read to one of the data management modules connected to the SQL module.
  • the data management module is referred to as a target data management module.
  • the target data management module sends the identifier of the page to be read to any one of the first data execution module, the second data execution module, or the third data execution module.
  • S304 The data execution module that receives the identifier of the page to be read returns the page to be read (specifically, the content of the page to be read) to the target data management module.
  • the data execution module that receives the identifier of the page to be read sends the target version of the page to be read to the target data management module, where the target version is the version determined in the optional implementation of S301.
  • the data execution module that receives the identifier of the page to be read determines whether it stores the target version, identified by its version number; if so, it sends the target version of the page to be read to the target data management module. If not, it can return read error indication information to the target data management module.
  • For example, suppose the version number of the target version is 70 and the data execution module that receives the identifier of the page to be read stores versions 50 to 60; the data execution module then returns read error indication information to the target data management module. In that case, the target data management module may send the identifier of the page to be read to other data execution modules to read the page.
  • S305 The target data management module returns the page to be read to the SQL module.
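  • The versioned read with fallback described in the steps above can be sketched as follows. Each data execution module is modelled as a dict from page identifier to a dict of stored versions; `read_page` and `ReadError` are hypothetical names, and trying the modules in list order is an assumption, since the embodiment allows any of them to be asked first.

```python
class ReadError(Exception):
    """Raised when no data execution module stores the requested version."""


def read_page(page_id, target_version, data_executors):
    """Return the content of `target_version` of a page.

    `data_executors` is a list of dicts, each mapping a page id to a dict
    of {version: content}, standing in for the data execution modules.
    """
    for executor in data_executors:
        versions = executor.get(page_id, {})
        if target_version in versions:
            return versions[target_version]
        # Otherwise this module would return a read error indication,
        # and the data management module asks the next module.
    raise ReadError(f"version {target_version} of {page_id} not found")
```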
  • this embodiment is only an example of data reading provided based on the data writing method provided above, and it does not limit the applicable data reading method of the embodiment of the present application.
  • the embodiments of the present application may divide the log management module or its deployed nodes into functional modules based on the foregoing method examples.
  • each functional module may be divided corresponding to each function, or two or more functions may be integrated in one processing module.
  • the above-mentioned integrated modules can be implemented in the form of hardware or software functional modules. It should be noted that the division of modules in the embodiments of the present application is illustrative, and is only a logical function division, and there may be other division methods in actual implementation.
  • FIG. 8 is a schematic structural diagram of a data processing device 80 provided by an embodiment of this application.
  • the device 80 can be used to execute a data processing method, such as the method shown in FIG. 4 or FIG. 5 above.
  • the device 80 may be a log management module in a distributed database system or a node on which a log management module is deployed.
  • the distributed database system includes at least two nodes, and a log management module, a data management module, and multiple data execution modules are deployed on the at least two nodes.
  • the device 80 includes a processing unit 801 and a sending unit 802.
  • the processing unit 801 is configured to determine that the to-be-processed log is successfully written to M of the multiple nodes; where M is an integer greater than or equal to 3, and the to-be-processed log is used to record an update operation for updating the to-be-processed data.
  • the sending unit 802 is configured to send the to-be-processed log to the data management module after the processing unit 801 determines that the to-be-processed log is successfully written to the M nodes; the to-be-processed log is used by the data management module to control N data execution modules of the multiple data execution modules to update the data to be processed according to the update operation recorded in the log to be processed; where N is an integer greater than or equal to 1.
  • the processing unit 801 may be used to perform the receiving step corresponding to S104, and the sending unit 802 may be used to perform S106.
  • the distributed database system further includes multiple log execution modules.
  • the sending unit 802 is further configured to send the logs to be processed to the M log execution modules of the plurality of log execution modules, respectively; the logs to be processed are used for the M log execution modules to write the logs to be processed into the M nodes.
  • the processing unit 801 may be used to execute S103.
  • the processing unit 801 is further configured to, after determining that the N data execution modules have synchronized the updated data to be processed, control the M log execution modules to delete the pending logs written to the M nodes.
  • the processing unit 801 is specifically configured to: obtain the smallest version number among the N latest version numbers of the data to be processed recorded by the N data execution modules, wherein each of the N data execution modules records one latest version number of the data to be processed; and control the M log execution modules to delete the target log of the data to be processed, where the version number of the data to be processed corresponding to the target log is less than or equal to the minimum version number, and the target log includes the log to be processed.
  • the processing unit 801 may be used to execute the receiving step corresponding to S203 and S206.
  • the sending unit 802 may also be used to send a query command to the data management module, where the query command is used to query the N latest version numbers of the data to be processed recorded by the N data execution modules.
  • the device 80 may also include a receiving unit 803 for receiving the N latest version numbers of the data to be processed returned by the data management module.
  • the sending unit 802 may be used to execute S202.
  • the device 80 may specifically correspond to the SQL layer 102 or the storage layer 103 in FIG.
  • the device 80 may be implemented by the processor 201 in FIG. 3 calling a computer program in the memory 203.
  • the embodiment of the present application also provides a distributed database system including at least two nodes, on which the log management module provided above (such as the above device 80) is deployed; in addition, a data management module, multiple data execution modules, and log execution modules can also be deployed on the nodes.
  • the foregoing embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof.
  • When implemented using a software program, the embodiments may be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium.
  • the computer instructions may be transmitted from a website, computer, server, or data center.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media.
  • the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, an SSD).

Abstract

Provided are a data processing method and device and a distributed database system, which relate to the technical field of data processing, help reduce storage costs, and improve overall system performance. The distributed database system comprises a plurality of nodes, and a log management module (33), a data management module (34), and a plurality of data execution modules (36) are deployed on the plurality of nodes; the log management module (33) is used to send a log to be processed to the data management module (34) after determining that said log has been successfully written to M nodes of the plurality of nodes, M being an integer greater than or equal to 3, and said log being used to record an update operation for updating data to be processed; the data management module (34) is used to send said log to N data execution modules (36) of the plurality of data execution modules, N being an integer greater than or equal to 1; and each data execution module (36) of the N data execution modules (36) is used to update said data according to the update operation recorded in said log.
PCT/CN2020/070576 2019-04-09 2020-01-07 Data processing method and device, and distributed database system WO2020207078A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910282391.3A CN111797062B (zh) 2019-04-09 2019-04-09 数据处理方法、装置和分布式数据库系统
CN201910282391.3 2019-04-09

Publications (1)

Publication Number Publication Date
WO2020207078A1 true WO2020207078A1 (fr) 2020-10-15

Family

ID=72751519

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/070576 WO2020207078A1 (fr) 2019-04-09 2020-01-07 Data processing method and device, and distributed database system

Country Status (2)

Country Link
CN (1) CN111797062B (fr)
WO (1) WO2020207078A1 (fr)

Also Published As

Publication number Publication date
CN111797062A (zh) 2020-10-20
CN111797062B (zh) 2023-10-27

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20787465

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20787465

Country of ref document: EP

Kind code of ref document: A1