CN116860398A - Transaction processing method, device, equipment and storage medium based on time sequence database - Google Patents

Transaction processing method, device, equipment and storage medium based on time sequence database

Info

Publication number
CN116860398A
Authority
CN
China
Prior art keywords
node
data
transaction
written
control node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310915398.0A
Other languages
Chinese (zh)
Inventor
周小华
隋鹏飞
杨宇轩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Zhiyu Technology Co ltd
Original Assignee
Zhejiang Zhiyu Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Zhiyu Technology Co ltd
Priority to CN202310915398.0A
Publication of CN116860398A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/466 Transaction processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1471 Saving, restoring, recovering or retrying involving logging of persistent data for recovery
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2474 Sequence data queries, e.g. querying versioned data

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a transaction processing method, apparatus, device, and storage medium based on a time series database. In the method, a coordination node initiates a transaction and interacts with the control node and each node to be written. After the data has been written successfully to every node to be written, the coordination node commits to the control node and each node to be written; after the commit, it updates the transaction on the control node and each node to be written; finally, it submits the transaction completion identifier to the storage engine of the time series database. The application can ensure the atomicity of distributed transactions, the consistency of cluster data, the isolation of read and write transactions, and the durability of data. By introducing a transaction completion identifier, storing it in each row of time series data, and reading and writing data based on it, the application avoids read errors caused by partially updated data under concurrent reads and writes, and thereby improves the accuracy of subsequent transaction decisions.

Description

Transaction processing method, device, equipment and storage medium based on time sequence database
Technical Field
The present application relates to the field of time sequence databases, and in particular, to a method, an apparatus, a device, and a storage medium for transaction processing based on a time sequence database.
Background
A time series database is a database optimized for time-stamped or time series data. Time series data are measurements or events that are tracked, monitored, downsampled, and aggregated over time, such as server metrics, application performance monitoring data, network data, sensor data, events, and clicks.
A database transaction is a unit of work performed on a database within a database management system that processes data with consistency and reliability independent of other transactions.
However, time series databases hold large volumes of data, place strict requirements on time ordering, and usually store data in a distributed manner, so most of them do not use a transaction mechanism. As a result, a time series database cannot guarantee the atomicity, consistency, and isolation of data reads and writes, which poses challenges for operations and maintenance and for demanding application scenarios such as finance and high-end equipment monitoring.
Disclosure of Invention
The application aims to overcome the above deficiencies of the prior art by providing a transaction processing method, apparatus, device, and storage medium based on a time series database, so as to solve the prior-art problem that transactions cannot be used in time series databases.
In order to achieve the above purpose, the technical scheme adopted by the application is as follows:
in a first aspect, the present application provides a transaction processing method based on a time-series database, the time-series database including a control node and a plurality of data nodes, the method comprising:
the coordination node in the plurality of data nodes initiates a transaction, applies for a transaction start identifier to the control node and determines at least one node to be written in the plurality of data nodes;
the coordination node writes data into each node to be written according to the transaction start identification, and applies for transaction completion identification to the control node when data writing operation to all nodes to be written is completed;
the coordination node sends a transaction commit request to the control node and each node to be written according to the transaction start identifier and the transaction completion identifier;
the control node and each node to be written respond to the transaction submitting request of the coordination node and return feedback information to the coordination node;
and the coordination node sends a transaction update request to the control node and each node to be written, each node to be written updates the version information of the data partition corresponding to the transaction, and the transaction completion identifier is persisted to the time series database.
Optionally, the method further comprises:
the control node receives reporting information from each data node, wherein the reporting information comprises version information of a data partition in the data node and the transaction completion identifier of the data partition;
the control node determines at least one data node to be recovered according to the report information of each data node, and puts the data node to be recovered into a recovery queue;
and the control node determines a source data node and a target data node from the recovery queue, and the source data node recovers the data information of the source data node to the target data node.
Optionally, the control node determines at least one data node to be recovered according to the report information of each data node, including:
and the control node compares the report information of each data node and determines the data nodes with inconsistent version information as the data nodes to be recovered.
Optionally, the control node determines a source data node and a target data node from the recovery queue, and recovers, by the source data node, data information of the source data node to the target data node, including:
determining the data node to be recovered with the latest version information as the source data node, and determining the data node to be recovered with the oldest version information as the target data node;
restoring, by the source data node, the data information of the source data node to the target data node according to an asynchronous restoration strategy;
and if the data difference between the source data node and the target data node is smaller than a preset threshold, or the number of data round trips between the source data node and the target data node is larger than a preset limit, the source data node restores its data information to the target data node according to a synchronous restoration strategy.
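As a minimal sketch of the strategy switch described above, the following Python function chooses between asynchronous and synchronous restoration. The function name, the parameter names, and the string return values are hypothetical; the text specifies only the two conditions, not units or concrete threshold values.

```python
def choose_restore_strategy(data_gap: int, round_trips: int,
                            gap_threshold: int, round_trip_limit: int) -> str:
    """Return "sync" when either condition from the text holds, otherwise "async"."""
    # Condition 1: the remaining difference between source and target is small enough.
    # Condition 2: asynchronous restoration has already needed too many round trips.
    if data_gap < gap_threshold or round_trips > round_trip_limit:
        return "sync"
    return "async"


# Illustrative values only; none of these numbers come from the source.
large_gap = choose_restore_strategy(data_gap=5000, round_trips=1,
                                    gap_threshold=100, round_trip_limit=10)
small_gap = choose_restore_strategy(data_gap=20, round_trips=1,
                                    gap_threshold=100, round_trip_limit=10)
too_many_rounds = choose_restore_strategy(data_gap=5000, round_trips=11,
                                          gap_threshold=100, round_trip_limit=10)
```

Under these assumptions, restoration starts asynchronously and switches to the synchronous strategy once the target has nearly caught up or the asynchronous rounds have not converged.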
Optionally, the applying for the transaction start identifier to the control node and determining at least one node to be written in the plurality of data nodes includes:
the coordination node sends a transaction starting request to the control node, wherein the transaction starting request comprises an identification of a data partition related to the transaction;
and the control node determines the at least one node to be written containing the data partition according to the transaction start request, and distributes the transaction start identification to the coordination node.
Optionally, after the coordinating node writes data to each node to be written according to the transaction start identifier, the method further includes:
and if any node to be written fails to write the data, the coordination node sends a transaction rollback request to the control node and each node to be written so as to terminate the current transaction.
Optionally, the method further comprises:
receiving a data read request, wherein the data read request includes a transaction completion identifier;
and determining target data information corresponding to the transaction completion identification from the time sequence database according to the data reading request, and returning the target data information.
In a second aspect, the present application provides a transaction processing apparatus based on a time series database, the apparatus comprising a control node and a plurality of data nodes, wherein:
the coordination node in the plurality of data nodes is used for initiating a transaction, applying for a transaction start identifier to the control node and determining at least one node to be written in the plurality of data nodes;
the coordination node is used for writing data into each node to be written according to the transaction start identifier, and applying to the control node for a transaction completion identifier when the data writing operation on all the nodes to be written is completed;
the coordination node is used for sending a transaction commit request to the control node and each node to be written according to the transaction start identifier and the transaction completion identifier;
the control node and each node to be written are used for responding to a transaction submitting request of the coordination node and returning feedback information to the coordination node;
the coordination node is used for sending a transaction update request to the control node and each node to be written, each node to be written updates the version information of the data partition corresponding to the transaction, and the transaction completion identifier is persisted to the time series database.
Optionally, the control node is further configured to receive reporting information from each data node, where the reporting information includes version information of a data partition in the data node and the transaction completion identifier of the data partition;
the control node is further used for determining at least one data node to be recovered according to the report information of each data node, and placing the data node to be recovered into a recovery queue;
the control node is further configured to determine a source data node and a target data node from the recovery queue, and restore, by the source data node, data information of the source data node to the target data node.
Optionally, the control node is further configured to compare the report information of each data node, and determine the data node with inconsistent version information as the data node to be recovered.
Optionally, the control node is further configured to determine a data node to be restored with latest version information as a source data node, and determine a data node to be restored with oldest version information as a target data node;
restoring the data information of the source data node to the target data node by the source data node according to an asynchronous restoration strategy;
and if the data difference between the source data node and the target data node is smaller than a preset threshold, or the number of data round trips between the source data node and the target data node is larger than a preset limit, the source data node restores its data information to the target data node according to a synchronous restoration strategy.
Optionally, the coordination node is further configured to send a transaction start request to the control node, where the transaction start request includes an identifier of a data partition involved in the transaction;
the control node is further configured to determine, according to the transaction start request, the at least one node to be written including the data partition, and allocate the transaction start identifier to the coordination node.
Optionally, the coordination node is further configured to: if any node to be written fails to write the data, send a transaction rollback request to the control node and each node to be written so as to terminate the current transaction.
Optionally, the apparatus further comprises a reading module for:
receiving a data read request, wherein the data read request includes a transaction completion identifier;
and determining target data information corresponding to the transaction completion identification from the time sequence database according to the data reading request, and returning the target data information.
In a third aspect, the present application provides an electronic device, comprising: a processor, a storage medium, and a bus, the storage medium storing machine-readable instructions executable by the processor, the processor in communication with the storage medium via the bus when the electronic device is running, the processor executing the machine-readable instructions to perform the steps of a time-series database-based transaction method as described above.
In a fourth aspect, the present application provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of a time-series database based transaction processing method as described above.
The beneficial effects of the application are as follows: the application enables decision making based on transactions in a time series database. By adding a transaction identifier to the data written to the data nodes, it can distinguish the data of different transaction versions and isolate concurrent read transactions from write transactions, which ensures data consistency and thereby improves the accuracy of decisions. In addition, the application commits only when all data nodes have written the data successfully, which ensures the atomicity of distributed transactions and the consistency of cluster data.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and other related drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 shows a schematic view of an application scenario provided by an embodiment of the present application;
FIG. 2 is a flow chart of a transaction processing method based on a time-series database according to an embodiment of the present application;
FIG. 3 is a flowchart of recovering data of a data node according to an embodiment of the present application;
FIG. 4 is a flow chart illustrating yet another method for recovering data from a data node according to an embodiment of the present application;
FIG. 5 is a flowchart of determining a node to be written according to an embodiment of the present application;
FIG. 6 illustrates a flow chart of performing a data read transaction provided by an embodiment of the present application;
FIG. 7 is a schematic diagram of a time series database according to an embodiment of the present application;
FIG. 8 shows a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. It should be understood that the drawings are for illustration and description only and are not intended to limit the scope of the present application, and that the schematic drawings are not drawn to scale. The flowcharts used in this disclosure illustrate operations implemented according to some embodiments of the present application. It should be understood that the operations of a flowchart need not be performed in the order shown, and steps with no logical dependency on one another may be performed in reverse order or concurrently. Moreover, under the guidance of this disclosure, those skilled in the art may add one or more operations to a flowchart or remove one or more operations from it.
In addition, the described embodiments are only some, but not all, embodiments of the application. The components of the embodiments of the present application generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the application, as presented in the figures, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by a person skilled in the art without making any inventive effort, are intended to be within the scope of the present application.
It should be noted that the term "comprising" will be used in embodiments of the application to indicate the presence of the features stated hereafter, but not to exclude the addition of other features.
A database transaction is a unit of work executed on a database within a database management system. However, current time series databases hold large volumes of data, place strict requirements on time ordering, and often store data across data nodes in a distributed manner, so most of them do not use a transaction mechanism.
For example, suppose a database transaction needs to read its data from data node A, data node B, and data node C, where the data in question may be, for example, the data most recently stored in the time series database. While the data on data node A is being read, the server or electronic device may still be writing data to data node B. When the transaction then reads from data node B, it may read the data that has just been written, which clearly does not belong to the same point in time as the data already read from data node A. When reads and writes run concurrently, a record may also be read when only half of it has been written. Decisions made on such data may be wrong because the data is inconsistent in time.
To address these problems, the application provides a transaction processing method based on a time series database that enables decision making based on transactions in the time series database. By adding a transaction identifier to the data written to the data nodes, it can distinguish the data of different transaction versions and isolate concurrent read transactions from write transactions, which ensures data consistency and thereby improves the accuracy of decisions.
Referring to FIG. 1, a time series database may include a control node and a plurality of data nodes. After a server or an electronic device determines the write transaction to be executed, the data node that initiates the write transaction may serve as the coordination node. Through interaction among the coordination node, the control node, and the data nodes involved in the transaction, the data is written to the corresponding nodes and a transaction completion identifier is generated for it. When a transaction request to read data is later received, the corresponding data may be returned according to the transaction completion identifier.
Next, the transaction processing method based on a time series database provided by the present application is further described with reference to FIG. 2. The method may be executed by an electronic device or a server on which the time series database is deployed, and the time series database may be, for example, DolphinDB. As shown in FIG. 2, the method includes:
S201: The coordination node among the plurality of data nodes initiates a transaction, applies to the control node for a transaction start identifier, and determines at least one node to be written among the plurality of data nodes.
Optionally, any data node may initiate a transaction that writes data, and the coordination node is the data node that initiated the transaction. A data node serves as the coordination node for the transaction it initiates; when another data node initiates a new transaction after the current transaction has ended, that data node serves as the coordination node for the new transaction.
Optionally, the control node may be connected to an external component; when the coordination node initiates a transaction and applies to the control node for a transaction start identifier, the control node may generate the transaction start identifier through the external component.
The transaction start identifier indicates that the current transaction has started. The transaction start identifier may be incremented so as to distinguish the order in which the data of each transaction is written.
It should be noted that, when the coordination node initiates a transaction, it may also send to the control node the data partitions in which the data involved in the transaction is located, so that the control node can determine which data nodes hold that data and use those data nodes as the nodes to be written.
For example, suppose the coordination node initiates transaction A, which needs to write table A into the time series database, and the data nodes associated with the table are data node A and data node B. Both data node A and data node B then need to serve as nodes to be written, so that table A and its copy are stored on data node A and data node B.
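Step S201 can be illustrated with a short Python sketch. The `ControlNode` class, its `partition_map` field, and the `begin_transaction` method are hypothetical names introduced for illustration, and an in-process counter stands in for the external component that generates identifiers.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ControlNode:
    """Hypothetical control node: maps partitions to data nodes and issues start identifiers."""
    partition_map: dict                      # partition id -> set of data node names holding it
    _start_ids: count = field(default_factory=lambda: count(1))  # stand-in for the external component

    def begin_transaction(self, partitions: list) -> tuple:
        """Return (transaction start identifier, nodes to be written) for the given partitions."""
        nodes_to_write = set()
        for partition in partitions:
            nodes_to_write |= self.partition_map[partition]
        return next(self._start_ids), nodes_to_write


control = ControlNode({"p1": {"A", "B"}, "p2": {"B", "C"}})
first_id, first_nodes = control.begin_transaction(["p1"])
second_id, second_nodes = control.begin_transaction(["p1", "p2"])
```

In this sketch the second transaction receives a larger start identifier than the first, which is what allows the write order of the two transactions to be distinguished.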
S202: The coordination node writes data into each node to be written according to the transaction start identifier, and applies to the control node for the transaction completion identifier when the data writing operation on all the nodes to be written is completed.
Optionally, the control node may put the data nodes involved in the transaction start identifier into a transaction state, and the coordination node may write data to the nodes to be written that have entered the transaction state.
As a possible implementation, the coordination node may write data into each node to be written through a write function, and when all the nodes to be written have written the data successfully, the coordination node may apply to the control node for the transaction completion identifier.
Optionally, the transaction completion identifier indicates that the current transaction is completed, and it may be incremented. The control node may generate the transaction completion identifier through an external component, and the transaction completion identifier may differ from the transaction start identifier.
It should be noted that, in the embodiments of the present application, the coordination node applies to the control node for the transaction completion identifier only when all the nodes to be written have written the data successfully; if even one node to be written fails to write the data, the transaction completion identifier cannot be applied for.
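The all-or-nothing rule of step S202 can be sketched as follows. The dictionary-based node model, the function names, and the module-level counter that stands in for the control node's identifier source are all assumptions made for this illustration.

```python
from itertools import count

completion_ids = count(1)  # hypothetical incrementing identifier source on the control node


def write_to_node(node: dict, start_id: int, rows: list) -> bool:
    """Stage rows on one node under the transaction start identifier; False means the write failed."""
    if not node["online"]:
        return False
    node["staged"][start_id] = list(rows)
    return True


def write_phase(nodes: list, start_id: int, rows: list):
    """Return a transaction completion identifier only if every node wrote the data."""
    results = [write_to_node(node, start_id, rows) for node in nodes]
    if all(results):
        return next(completion_ids)
    return None  # at least one write failed, so no completion identifier is requested


node_a = {"online": True, "staged": {}}
node_b = {"online": True, "staged": {}}
node_c = {"online": False, "staged": {}}   # simulates an offline node
rows = [("2023-07-25T09:30:00", 10.5)]

ok_id = write_phase([node_a, node_b], start_id=1, rows=rows)
failed_id = write_phase([node_a, node_c], start_id=2, rows=rows)
```

In the failing case the coordination node would go on to send a rollback request, which this sketch leaves out.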
S203: The coordination node sends a transaction commit request to the control node and each node to be written according to the transaction start identifier and the transaction completion identifier.
Optionally, the coordination node may submit the transaction start identifier and the transaction completion identifier to all participants of the transaction, namely the control node and each node to be written, to verify again that the data has been written successfully.
Optionally, the transaction commit request may include the transaction start identifier and the transaction completion identifier.
It should be noted that the data has already been written to the nodes to be written before the commit operation is performed. During the commit, whether an error occurred while writing the data can be verified again, and the commit proceeds only if no error is found. This prevents a node to be written from being mistakenly judged as having written the data successfully and improves the reliability of transaction execution.
S204: The control node and each node to be written respond to the transaction commit request of the coordination node and return feedback information to the coordination node.
Optionally, after the control node and each node to be written receive the transaction commit request sent by the coordination node, they may respond by sending feedback information to the coordination node, where the feedback information indicates whether the transaction commit operation succeeded or failed.
It should be noted that, if the transaction commit succeeds, the coordination node may update the transaction on all transaction participants; if the commit does not succeed, a data node has failed to write the data, and the completed write operations are cancelled to keep the data on the nodes to be written and the control node consistent.
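Steps S203 and S204 can be sketched as a commit round followed by a check of the feedback. The `Participant` class and `commit_phase` function are hypothetical, and the sketch models the control node and the nodes to be written uniformly, which is a simplification of the roles described in the text.

```python
class Participant:
    """Hypothetical transaction participant (the control node or a node to be written)."""

    def __init__(self, healthy: bool = True):
        self.healthy = healthy
        self.staged = {}      # start identifier -> rows written before the commit
        self.committed = {}   # start identifier -> completion identifier

    def commit(self, start_id: int, completion_id: int) -> bool:
        """Re-check the staged write and report success or failure as feedback."""
        if not self.healthy or start_id not in self.staged:
            return False
        self.committed[start_id] = completion_id
        return True

    def rollback(self, start_id: int) -> None:
        """Cancel any write or commit recorded for this transaction."""
        self.staged.pop(start_id, None)
        self.committed.pop(start_id, None)


def commit_phase(participants: list, start_id: int, completion_id: int) -> bool:
    """Send the commit request to every participant; cancel completed writes on any failure."""
    feedback = [p.commit(start_id, completion_id) for p in participants]
    if all(feedback):
        return True
    for p in participants:
        p.rollback(start_id)
    return False


good = [Participant(), Participant()]
for p in good:
    p.staged[1] = ["row"]
committed = commit_phase(good, start_id=1, completion_id=7)

mixed = [Participant(), Participant(healthy=False)]
for p in mixed:
    p.staged[2] = ["row"]
rolled_back = commit_phase(mixed, start_id=2, completion_id=8)
```

The second call shows the failure path: one participant reports failure, so the writes already recorded on the healthy participant are cancelled as well.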
S205: The coordination node sends a transaction update request to the control node and each node to be written, each node to be written updates the version information of the data partition corresponding to the transaction, and the transaction completion identifier is persisted to the time series database.
Optionally, after the coordination node receives feedback information indicating that the control node and each node to be written have committed successfully, the coordination node may send a transaction update request to the control node and each node to be written.
Optionally, a data node may include a plurality of data partitions, and the version information may represent the transaction history of a data partition in the data node. The version information may be updated each time the data partition completes a transaction, so the newer the version information of a data partition, the more transactions the partition has undergone and completed.
Optionally, the transaction update request may be used to update the transaction on the control node and each node to be written and to end the transaction. For example, after receiving the transaction update request, the control node and each node to be written may end the current transaction, write the version information of the transaction to the data partition where the data is located, and persist the transaction completion identifier of the transaction to the time series database.
It should be noted that the completion identifier of the transaction may be written to the control node and each node to be written in the commit stage. For each transaction, after the transaction ends, the coordination node submits its completion identifier to the storage engine of the time series database for unified storage and backup. A corresponding transaction completion identifier column is added to the time series database, and the transaction completion identifiers may increase sequentially in order of completion, from earliest to latest.
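The effect of a transaction update request on one data partition can be sketched as below. The `Partition` class, the integer version counter, and the three-field row layout are assumptions for illustration; the text states only that version information is updated and that a transaction completion identifier column is stored with the data.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """Hypothetical data partition: a version counter plus rows tagged with a completion identifier."""
    version: int = 0
    rows: list = field(default_factory=list)  # (timestamp, value, completion identifier)

    def apply_update(self, staged_rows: list, completion_id: int) -> None:
        """Handle a transaction update request for this partition."""
        for timestamp, value in staged_rows:
            # The completion identifier is stored as an extra column on every row.
            self.rows.append((timestamp, value, completion_id))
        self.version += 1  # one version step per completed transaction


partition = Partition()
partition.apply_update([("09:30:00", 10.5), ("09:30:01", 10.6)], completion_id=1)
partition.apply_update([("09:30:02", 10.7)], completion_id=2)
```

After the two updates, the partition version reflects two completed transactions and every row carries the identifier of the transaction that wrote it.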
In the embodiments of the present application, the coordination node initiates the transaction and performs a two-phase commit with the control node and each node to be written: it commits to all transaction participants after the data has been written in full, updates the transaction on all participants after the commit, and finally submits the transaction completion identifier to the storage engine of the time series database. Because the transaction completion identifier distinguishes data by transaction at write time, the corresponding data can be read accurately by that identifier, which avoids read errors caused by partially updated data under concurrent reads and writes and improves the accuracy of subsequent transaction decisions.
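A read that uses the transaction completion identifier might look like the following sketch. The visibility rule used here, that a read at a given identifier sees every row whose identifier is less than or equal to it, is an assumption; the text says only that the data corresponding to the identifier is returned. The function name and row layout are likewise hypothetical.

```python
def read_as_of(rows: list, completion_id: int) -> list:
    """Return (timestamp, value) pairs visible at the given transaction completion identifier."""
    return [
        (timestamp, value)
        for timestamp, value, row_completion_id in rows
        # Rows without an identifier belong to a transaction that has not completed.
        if row_completion_id is not None and row_completion_id <= completion_id
    ]


rows = [
    ("09:30:00", 10.5, 1),
    ("09:30:01", 10.6, 2),
    ("09:30:02", 10.7, None),  # written by a transaction that has not completed yet
]
snapshot = read_as_of(rows, completion_id=1)
latest = read_as_of(rows, completion_id=2)
```

Because the half-written row never carries a completion identifier, neither read can observe it, which is the isolation property the paragraph above describes.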
When writing data, there may be a case that some data nodes are not on line or data is not written due to network abnormality, and in order to ensure consistency of the data, the data of the data nodes needs to be synchronized.
It should be noted that a transaction completion identifier is generated after each transaction completes, and that the identifier is incremented each time. A data node with inconsistent data may therefore lack some transactions compared with a data node whose data is consistent. By comparing the transaction completion identifiers of the data nodes, it is possible to determine which transactions a data node is missing and to restore the data of that node.
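As a small illustration of this comparison, the sketch below assumes that each data node can list the transaction completion identifiers it holds; the function name `missing_transactions` is hypothetical.

```python
def missing_transactions(reference_ids, node_ids):
    """Return completion identifiers present on the reference node but absent on the lagging node."""
    return sorted(set(reference_ids) - set(node_ids))
```

A set difference is used here so that gaps in the middle of the sequence are detected as well as a missing tail.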
FIG. 3 is a flowchart of the steps for recovering the data of a data node according to the present application. As shown in FIG. 3, the transaction processing method based on a time-series database according to the present application further includes:
S301: The control node receives reporting information from each data node, wherein the reporting information includes version information of the data partition in the data node and the transaction completion identifier of the data partition.
Optionally, a data node may have missed writes because of a machine restart and recovery process or a network recovery. In this case the data node may actively send reporting information to the control node once it resumes normal operation.
As another possible implementation, the control node may also monitor whether the data nodes involved in a write transaction match the data nodes to which data was actually written. When they do not match, the control node records the identifiers of the data nodes that were not written, and when such a data node comes online, the control node obtains reporting information from it.
Optionally, the report information may include version information of the data partition that the data node has recently written into and a transaction completion identifier of the data partition. The control node can know which transactions are completed in the data partition according to the version information of the data partition and the completion identification of the data partition.
S302: The control node determines at least one data node to be recovered according to the reporting information of each data node, and puts the data node to be recovered into a recovery queue.
Optionally, the control node may compare the reporting information of each data node, and use the data node with inconsistent data as the data node to be recovered.
Optionally, the control node may include a recovery queue, and the identifier of each data node whose reporting information indicates that its data needs to be recovered may be placed in the recovery queue.
In an exemplary embodiment, the control node receives reporting information from data node A and data node B. The reporting information of data node A indicates that the version information of its latest data partition is 6 and its transaction completion identifier is 10; the reporting information of data node B indicates that the version information of its latest data partition is 5 and its transaction completion identifier is 8. Data node A and data node B thus differ in both version information and transaction completion identifier, so the node identifiers of data node A and data node B can be placed in the recovery queue.
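The example with data node A and data node B can be sketched as follows. The `Report` type and `build_recovery_queue` function are hypothetical names. The sketch assumes that whenever the reports for a partition disagree, every reporting node is queued, so that both the eventual source and the eventual target can later be picked from the queue.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    node_id: str
    partition_version: int
    completion_id: int


def build_recovery_queue(reports):
    """Queue the reporting nodes when their reports disagree with one another."""
    states = {(r.partition_version, r.completion_id) for r in reports}
    queue = deque()
    if len(states) > 1:  # at least two nodes disagree
        for r in reports:
            queue.append(r.node_id)
    return queue
```

With the values from the example, `build_recovery_queue([Report("A", 6, 10), Report("B", 5, 8)])` queues both nodes, while two identical reports leave the queue empty.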
S303: the control node determines a source data node and a target data node from the recovery queue, and the source data node recovers the data information of the source data node to the target data node.
Optionally, the control node may determine the source data node and the target data node from the recovery queue according to the reporting information currently received, where the source data node serves as the reference for recovering the data of the target data node.
As a possible implementation, assume that data node A is the source data node with a maximum transaction completion identifier of 600, and that the maximum transaction completion identifier of the target data node is 550. The recovery flow then only needs to read the data whose transaction completion identifier falls in the range 551-600. This data is only a small part of the source data node's memory, and only this data needs to be synchronized to the target data node.
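The incremental read in this example can be sketched as a filter on the transaction completion identifier. The function name `rows_to_sync` and the representation of rows as `(completion_id, payload)` pairs are assumptions made for illustration.

```python
def rows_to_sync(source_rows, target_max_id):
    """Select rows whose completion identifier is newer than anything the target holds."""
    return [(cid, payload) for cid, payload in source_rows if cid > target_max_id]
```

For a source holding identifiers 501 through 600 and a target whose maximum is 550, only the 50 rows in the range 551-600 are selected.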
In the embodiment of the application, the consistency of the data across data nodes can be ensured by recovering the data of data nodes that missed writes because they were offline or recovering from a network abnormality.
The following describes the step in S302 in which the control node determines at least one data node to be recovered according to the reporting information of each data node. The step includes:
The control node compares the report information of each data node and determines the data nodes with inconsistent version information as the data nodes to be recovered.
As a possible implementation manner, the control node may store the version information of the data partition, so the control node may compare the version information in the reporting information of the data node with the version information of the corresponding data partition in the control node, and if the comparison result is inconsistent, it indicates that the data node is not written with data, and at this time, the data node may be determined as the data node to be recovered.
As another possible implementation manner, the control node may compare the report information of the received multiple data nodes, determine whether the version information of the data nodes are consistent, if not, determine that the data nodes may need to perform data recovery, and determine that all the data nodes that are inconsistent are data nodes to be recovered.
After determining the data node to be restored, the control node may determine the source data node and the target data node from the restoration queue, and restore the data information of the source data node to the target data node by the source data node, as shown in fig. 4, where the step S303 includes:
S401: Determine the data node to be recovered with the latest version information as the source data node, and determine the data node to be recovered with the oldest version information as the target data node.
Optionally, for the data partitions in the data nodes to be recovered, the version information of a data partition is updated after each transaction completes. The version information of the data partitions that most recently completed a transaction can be compared across these data nodes; the data node holding the data partition with the latest version information is determined to be the source data node, and the data node holding the data partition with the oldest version information is determined to be the target data node.
It should be appreciated that the latest version information may be the one whose last transaction completion time is closest to the current time, and the oldest version information may be the one whose last transaction completion time is farthest from the current time.
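A minimal sketch of this selection, assuming that version information can be represented as a number where a higher value means a more recent last transaction; the function name `pick_source_and_target` is hypothetical.

```python
def pick_source_and_target(versions):
    """versions maps node id -> partition version; higher is assumed to mean more recent."""
    source = max(versions, key=versions.get)
    target = min(versions, key=versions.get)
    return source, target
```

If version information were stored as a timestamp instead, the same comparison would apply unchanged.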
S402: The source data node recovers its data information to the target data node according to an asynchronous recovery strategy.
Optionally, a source data node and a target data node with a large data difference may need a long data recovery time. If other data write operations on the target data node were forbidden during recovery, write efficiency would be reduced. An asynchronous recovery strategy may therefore be adopted: while the source data node and the target data node perform data recovery, other transactions are still allowed to write data into the target data node. That is, under the asynchronous recovery strategy the state of the data partition in the control node is not set to RECOVERING, so writes are not blocked. In this case, when a client obtains from the control node the information of the data nodes participating in a write, the control node returns only the data nodes whose data is consistent.
It should be noted that, during data recovery under the asynchronous recovery strategy, once the data difference between the two data nodes is smaller than a preset threshold, or the number of data round trips between the source data node and the target data node is larger than a preset limit, the asynchronous recovery strategy may be ended and the remaining data recovered using the synchronous recovery strategy.
It will be appreciated that an incomplete transaction has no transaction completion identifier in the data node, and that even when new writes occur, the old data written before a read can be identified by its transaction completion identifier. Performing data recovery while the target data node is being written therefore has no adverse effect.
S403: if the data difference between the source data node and the target data node is smaller than a preset threshold value or the data round trip times of the source data node and the target data node are larger than a preset limit value, the source data node restores the data information of the source data node to the target data node according to a synchronous restoring strategy.
Optionally, when the source data node performs data recovery, it needs to obtain the maximum transaction completion identifier of the target data node and write data to the target data node. During this process the source data node may record the number of data round trips. If the data difference between the source data node and the target data node is smaller than a preset threshold, or the number of data round trips between them is larger than a preset limit, then to improve data recovery efficiency other data write operations on the target data node may be prohibited, and only the data recovery operation from the source data node to the target data node is performed.
In the embodiment of the application, different recovery strategies are adopted according to the difference of data and the round trip times among the data nodes when the data is recovered, so that the data recovery efficiency can be improved.
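The switch from the asynchronous to the synchronous strategy can be simulated with a short sketch. Everything here is an illustrative assumption: nodes are modeled as lists of increasing transaction completion identifiers, the target holds a prefix of the source, each asynchronous round copies a fixed-size batch, and one concurrent write lands on the source per round. The names `recover`, `diff_threshold`, `round_trip_limit`, and `batch` are hypothetical.

```python
def recover(source, target, new_writes, diff_threshold=2, round_trip_limit=3, batch=2):
    """Run asynchronous rounds, then finish synchronously; return the strategies used."""
    log = []
    round_trips = 0
    pending_writes = list(new_writes)
    while True:
        # The source first asks the target for its maximum completion identifier.
        target_max = max(target, default=0)
        missing = [cid for cid in source if cid > target_max]
        if len(missing) < diff_threshold or round_trips > round_trip_limit:
            # Synchronous strategy: writes are blocked, so the rest is copied in one pass.
            target.extend(missing)
            log.append("sync")
            return log
        # Asynchronous strategy: copy one batch while a concurrent write still lands.
        target.extend(missing[:batch])
        if pending_writes:
            source.append(pending_writes.pop(0))
        round_trips += 1
        log.append("async")
```

The round-trip limit matters because, with writes still arriving, the difference alone might never drop below the threshold; the limit guarantees that recovery eventually finishes under the synchronous strategy.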
The following describes the above step of applying to the control node for the transaction start identifier and determining at least one node to be written among the plurality of data nodes. As shown in FIG. 5, step S201 includes:
S501: The coordination node sends a transaction start request to the control node, wherein the transaction start request includes the identifier of the data partition involved in the transaction.
Optionally, when a data node is to initiate a transaction, that data node may be regarded as the coordination node of the current transaction. The coordination node may send a transaction start request to the control node in order to obtain the nodes to be written that the transaction involves.
Optionally, the transaction start request may include an identifier of a data partition related to the current transaction, where the identifier of the data partition may indicate a corresponding data partition and a data node where the data partition is located.
S502: The control node determines at least one node to be written containing the data partition according to the transaction start request, and allocates a transaction start identifier to the coordination node.
Optionally, the control node may determine, according to the data partition identifiers in the transaction start request, which data nodes include the data partitions, make the data nodes enter a transaction state, return the identifiers of the data nodes of the data partitions to the coordinator node, and allocate a transaction start identifier to the coordinator node.
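Steps S501 and S502 can be sketched as a lookup from data partition identifiers to the data nodes that hold them. The `ControlNode` class, its `placement` map, and the `begin` method are hypothetical; how the control node actually stores partition placement is not specified in the application.

```python
from itertools import count


class ControlNode:
    """Hypothetical control node holding a partition-to-node placement map."""

    def __init__(self, placement):
        self.placement = placement   # partition id -> list of data node ids
        self._ids = count(1)         # assumed source of increasing start identifiers
        self.in_transaction = {}     # start id -> nodes taking part

    def begin(self, partition_ids):
        # Collect every data node that holds one of the requested partitions.
        nodes = sorted({n for p in partition_ids for n in self.placement[p]})
        start_id = next(self._ids)
        self.in_transaction[start_id] = nodes  # these nodes enter the transaction state
        return start_id, nodes
```

The coordination node would call `begin` with the partition identifiers from its transaction start request and receive both the transaction start identifier and the nodes to be written.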
In the embodiment of the present application, data may be written once a transaction start identifier has been allocated. It should be noted that a transaction completion identifier is applied for only after the write operations on all data have completed successfully. After the coordination node writes data to each node to be written according to the transaction start identifier in step S202, the method further includes:
if the node to be written which fails to write the data exists, the coordination node sends a transaction rollback request to the control node and each node to be written so as to terminate the current transaction.
Optionally, if there is a node to be written that fails to write data, in order to ensure consistency of data, the coordination node in the present application may send a transaction rollback request to the control node and each node to be written, so as to cancel the previous write operation and terminate the current transaction.
It should be noted that, when the coordinating node performs the commit operation to the control node and each node to be written, if a commit error occurs, a transaction rollback request may also be sent to the control node and each node to be written, so as to terminate the current transaction.
In the embodiment of the application, a rollback is performed when part of the data fails to be written, so that the written data remains consistent and decision errors caused by subsequently inconsistent data are avoided.
After writing data, in the embodiment of the present application, isolation of the read transaction may also be achieved by the transaction completion identifier, as shown in fig. 6, the step of performing the data read transaction includes:
S601: Receive a data read request, wherein the data read request includes a transaction completion identifier.
Optionally, when data is written, the control node may generate the transaction completion identifier through an external component. When data is read, after the client determines that data needs to be read or queried, the control node may query the transaction completion identifier corresponding to the data through the external component, and the transaction completion identifier is carried in the data read request.
S602: Determine the target data information corresponding to the transaction completion identifier from the time-series database according to the data read request, and return the target data information.
In the present application, the control node or the data node that initiates the query transaction can read the target data information corresponding to the transaction completion identifier from the time-series database according to the transaction completion identifier in the data read request.
Optionally, because transaction completion identifiers are incremental, this property can be used to read from the time-series database all data whose transaction completion identifier is less than the transaction completion identifier in the data read request.
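A read under this rule can be sketched as a filter on the transaction completion identifier column. The function name `snapshot_read` is hypothetical, and representing data from unfinished transactions with an identifier of `None` is an assumption made for illustration.

```python
def snapshot_read(rows, read_completion_id):
    """Return payloads of rows committed before the reader's completion identifier."""
    return [
        payload
        for cid, payload in rows
        if cid is not None and cid < read_completion_id
    ]
```

Rows written by later transactions, and rows that have no completion identifier yet, are invisible to the reader, so a concurrent partial update cannot leak into the result.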
In the application, a data partition can be further divided into a plurality of data blocks, which may be an internal division of the storage engine.
To improve read speed, the transaction completion identifier of the data read request can be compared with the minimum and maximum transaction completion identifiers of each data block in the data partition of the data node. If the transaction completion identifier is smaller than the maximum and larger than the minimum transaction completion identifier of a data block, the target data information lies within that data block. If the transaction completion identifier is larger than the maximum transaction completion identifier, all data on the data node can be accessed. This avoids comparing the data in the data node item by item and greatly reduces the performance overhead of data queries.
Assume a read transaction whose transaction completion identifier is 590. The time-series database DolphinDB compares against the structure covering the range 551-600 in the in-memory part, finds that part of its data satisfies the condition of a transaction completion identifier less than 590, and obtains the data that satisfies the snapshot isolation level by reading the specific values of the transaction completion identifier column within the read transaction. For the data in the older in-memory transaction completion identifier range 501-550 and the data on disk, the incremental property of the transaction completion identifier shows, without any row-level comparison, that their transaction completion identifiers are less than 590, so this data can be read directly.
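The block-level comparison can be sketched as follows. The `Block` type and `read_with_pruning` function are hypothetical names, and the two blocks in the usage note reuse the identifier ranges 501-550 and 551-600 from the recovery example purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Block:
    min_id: int
    max_id: int
    rows: list  # (completion id, payload)


def read_with_pruning(blocks, read_id):
    """Use per-block min/max identifiers to skip row-by-row comparison where possible."""
    result, scanned = [], 0
    for block in blocks:
        if block.max_id < read_id:
            result.extend(p for _, p in block.rows)  # whole block is visible
        elif block.min_id >= read_id:
            continue                                 # whole block is invisible
        else:
            scanned += 1                             # only this block needs row checks
            result.extend(p for cid, p in block.rows if cid < read_id)
    return result, scanned
```

For a read with identifier 590 over a block covering 501-550 and a block covering 551-600, the first block is returned whole and only the second is scanned row by row.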
In the embodiment of the application, data in the time-series database can be queried according to the transaction completion identifier without comparing the data item by item, which can greatly improve the efficiency of data queries.
Based on the same inventive concept, the embodiment of the present application further provides a time sequence database corresponding to the transaction processing method based on the time sequence database, and since the principle of solving the problem by the time sequence database in the embodiment of the present application is similar to that of the transaction processing method based on the time sequence database in the embodiment of the present application, the implementation of the apparatus may refer to the implementation of the method, and the repetition is omitted.
Referring to FIG. 7, a schematic diagram of a time-series database according to an embodiment of the present application is shown, where the time-series database includes: a control node 701, a coordination node 702, and a node 703 to be written; wherein:
the coordination node 702 in the plurality of data nodes is used for initiating a transaction, applying for a transaction start identifier to the control node 701 and determining at least one node 703 to be written in the plurality of data nodes;
the coordination node 702 is configured to write data to each node 703 to be written according to the transaction start identifier, and apply for a transaction completion identifier to the control node 701 when data writing operations to all the nodes 703 to be written are completed;
The coordination node 702 is configured to send a transaction commit request to the control node 701 and each node 703 to be written according to the transaction start identifier and the transaction completion identifier;
the control node 701 and each node to be written 703 are configured to respond to a transaction commit request of the coordination node 702, and return feedback information to the coordination node 702;
the coordination node 702 is configured to send a transaction update request to the control node 701 and each node to be written 703, update version information of a data partition corresponding to the transaction by each node to be written 703, and store the transaction completion identifier in the time-series database in a persistent manner.
Optionally, the control node 701 is further configured to:
receiving reporting information of each data node, wherein the reporting information comprises version information of the data partition in the data node and transaction completion identification of the data partition;
the control node 701 is further configured to determine at least one data node to be restored according to the report information of each data node, and place the data node to be restored into a restoration queue;
the control node 701 is further configured to determine a source data node and a target data node from the recovery queue, and recover data information of the source data node to the target data node by the source data node.
Optionally, the control node 701 is further configured to compare the report information of each data node, and determine the data node with inconsistent version information or transaction completion identifier as the data node to be recovered.
Optionally, the control node 701 is further configured to:
determining the data node to be recovered with the latest version information as a source data node, and determining the data node to be recovered with the oldest version information as a target data node;
restoring the data information of the source data node to the target data node by the source data node according to an asynchronous restoration strategy;
if the data difference between the source data node and the target data node is smaller than a preset threshold value or the data round trip times of the source data node and the target data node are larger than a preset limit value, the source data node restores the data information of the source data node to the target data node according to a synchronous restoring strategy.
Optionally, the coordinating node 702 is further configured to:
sending a transaction start request to the control node 701, wherein the transaction start request comprises an identification of a data partition related to a transaction;
the control node 701 is further configured to determine at least one node to be written 703 containing a data partition according to the transaction start request, and allocate a transaction start identifier to the coordinating node 702.
Optionally, the coordinating node 702 is further configured to:
if there are nodes to be written 703 that fail to write data, the coordinating node 702 sends a transaction rollback request to the controlling node 701 and each node to be written 703 to terminate the current transaction.
Optionally, the control node 701 and the data node in the time-series database are further configured to:
receiving a data read request, wherein the data read request includes a transaction completion identifier;
and determining target data information corresponding to the transaction completion identification from the time sequence database according to the data reading request, and returning the target data information.
For a description of the processing flow of each node in the time series database, and the interaction flow between each node, reference is made to the relevant description in the above method embodiment, and will not be described in detail here.
In the embodiment of the application, the coordination node initiates the transaction and performs a two-phase commit with the control node and each node to be written: after all data has been written successfully, it commits to all transaction participants; after the commit finishes, it updates the transaction on all participants; finally, it submits the transaction completion identifier to the storage engine of the time-series database. Because a commit is possible only when the data has been written successfully on all data nodes, the atomicity of the distributed transaction and the consistency of the cluster data can be ensured. Because the transaction completion identifier distinguishes data by transaction at write time, a read can accurately select the corresponding data according to the transaction completion identifier. This avoids read errors caused by partially updated data under concurrent reads and writes, and further improves the accuracy of subsequent transaction decisions.
The embodiment of the application also provides an electronic device. FIG. 8 is a schematic structural diagram of the electronic device provided by the embodiment of the application, which includes a processor 801, a memory 802, and a bus. The memory 802 stores machine-readable instructions executable by the processor 801 (for example, the execution instructions corresponding to the control node 701, the coordination node 702, and the node 703 to be written in the time-series database in FIG. 7). When the electronic device is running, the processor 801 communicates with the memory 802 through the bus, and the machine-readable instructions, when executed by the processor 801, perform the processing of the transaction processing method based on the time-series database.
The embodiment of the application also provides a computer readable storage medium, wherein the computer readable storage medium stores a computer program which is executed by a processor to execute the steps of the transaction processing method based on the time sequence database.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described system and apparatus may refer to corresponding procedures in the method embodiments, and are not repeated in the present disclosure. In the several embodiments provided by the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. The above-described apparatus embodiments are merely illustrative, and the division of the modules is merely a logical function division, and there may be additional divisions when actually implemented, and for example, multiple modules or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be through some communication interface, indirect coupling or communication connection of devices or modules, electrical, mechanical, or other form.
In addition, the functional units in the embodiments of the present application may be integrated in one processing unit, each unit may exist alone physically, or two or more units may be integrated in one unit. The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence or in the part contributing to the prior art or in part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code.
The foregoing is merely illustrative of the present application, and the present application is not limited thereto, and any person skilled in the art will readily appreciate variations or alternatives within the scope of the present application.

Claims (10)

1. A method of transaction processing based on a time series database, the time series database comprising a control node and a plurality of data nodes, comprising:
the coordination node in the plurality of data nodes initiates a transaction, applies for a transaction start identifier to the control node and determines at least one node to be written in the plurality of data nodes;
the coordination node writes data into each node to be written according to the transaction start identification, and applies for transaction completion identification to the control node when data writing operation to all nodes to be written is completed;
the coordination node sends a transaction commit request to the control node and each node to be written according to the transaction start identifier and the transaction completion identifier;
the control node and each node to be written respond to the transaction submitting request of the coordination node and return feedback information to the coordination node;
and the coordination node sends a transaction update request to the control node and each node to be written, each node to be written updates version information of a data partition corresponding to the transaction, and the transaction completion identification is stored in a time sequence database in a lasting mode.
2. The method according to claim 1, wherein the method further comprises:
the control node receives reporting information of each data node, wherein the reporting information comprises version information of a data partition in the data node and transaction completion identification of the data partition;
the control node determines at least one data node to be recovered according to the report information of each data node, and puts the data node to be recovered into a recovery queue;
and the control node determines a source data node and a target data node from the recovery queue, and the source data node recovers the data information of the source data node to the target data node.
3. The method according to claim 2, wherein the control node determines at least one data node to be restored according to the report information of each data node, comprising:
and the control node compares the report information of each data node and determines the data nodes with inconsistent version information as the data nodes to be recovered.
4. The method of claim 2, wherein the control node determining a source data node and a target data node from the recovery queue and recovering, by the source data node, data information of the source data node onto the target data node, comprises:
Determining the data node to be recovered with the latest version information as a source data node, and determining the data node to be recovered with the oldest version information as a target data node;
restoring the data information of the source data node to the target data node by the source data node according to an asynchronous restoration strategy;
and if the data difference between the source data node and the target data node is smaller than a preset threshold value or the data round trip times of the source data node and the target data node are larger than a preset limit value, the source data node restores the data information of the source data node to the target data node according to a synchronous restoration strategy.
5. The method of claim 1, wherein the applying for transaction initiation identification from the control node and determining at least one of the plurality of data nodes to be written to comprises:
the coordination node sends a transaction starting request to the control node, wherein the transaction starting request comprises an identification of a data partition related to the transaction;
and the control node determines the at least one node to be written containing the data partition according to the transaction start request, and distributes the transaction start identification to the coordination node.
6. The method of claim 3, wherein after the coordinating node writes data to each of the nodes to be written according to the transaction start identification, further comprising:
and if the node to be written which fails to write the data exists, the coordination node sends a transaction rollback request to the control node and each node to be written so as to terminate the current transaction.
7. The method according to any one of claims 1-6, further comprising:
receiving a data read request, wherein the data read request comprises a transaction completion identifier;
and determining target data information corresponding to the transaction completion identification from the time sequence database according to the data reading request, and returning the target data information.
8. A time series database comprising a control node and a plurality of data nodes, wherein:
a coordination node among the plurality of data nodes is used for initiating a transaction, applying to the control node for a transaction start identification, and determining at least one node to be written among the plurality of data nodes;
the coordination node is used for writing data into each node to be written according to the transaction start identification, and applying to the control node for a transaction completion identification when the data writing operations of all the nodes to be written are completed;
the coordination node is used for sending a transaction commit request to the control node and each node to be written according to the transaction start identification and the transaction completion identification;
the control node and each node to be written are used for responding to the transaction commit request of the coordination node and returning feedback information to the coordination node;
the coordination node is used for sending a transaction update request to the control node and each node to be written, each node to be written updating version information of the data partition corresponding to the transaction and persistently storing the transaction completion identification in the time sequence database.
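The successful path through the roles described in this claim can be sketched end to end. Everything here is illustrative: `Control`, `Node` and `run_transaction` are hypothetical names, both identifications are assumed to come from one shared counter on the control node, the version information is assumed to be a simple counter, and persistence is modeled as an in-memory list.

```python
import itertools


class Control:
    """Hypothetical control node issuing start and completion identifications."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.committed = {}  # start id -> completion id

    def next_id(self):
        return next(self._ids)

    def commit(self, start_id, completion_id):
        self.committed[start_id] = completion_id
        return True  # feedback information returned to the coordination node


class Node:
    """Hypothetical node to be written."""

    def __init__(self):
        self.pending = {}         # start id -> uncommitted rows
        self.rows = []            # (timestamp, value, completion id)
        self.version = 0          # assumed partition version counter
        self.completion_ids = []  # stands in for persistent storage

    def write(self, start_id, rows):
        self.pending[start_id] = list(rows)
        return True

    def commit(self, start_id, completion_id):
        return start_id in self.pending  # feedback: ready to apply

    def update(self, start_id, completion_id):
        for ts, value in self.pending.pop(start_id):
            self.rows.append((ts, value, completion_id))
        self.version += 1
        self.completion_ids.append(completion_id)


def run_transaction(control, nodes, rows):
    """Coordination node logic: start, write, complete, commit, update."""
    start_id = control.next_id()
    if not all(n.write(start_id, rows) for n in nodes):
        return None
    completion_id = control.next_id()
    acks = [control.commit(start_id, completion_id)]
    acks += [n.commit(start_id, completion_id) for n in nodes]
    if not all(acks):
        return None
    for n in nodes:
        n.update(start_id, completion_id)
    return start_id, completion_id
```

Separating the commit request from the update request means rows become visible only after every participant has acknowledged the commit, which is the property the two-step exchange in the claim appears to aim for.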
9. An electronic device, comprising: a processor, a storage medium and a bus, the storage medium storing program instructions executable by the processor, the processor and the storage medium communicating over the bus when the electronic device is running, and the processor executing the program instructions to perform the steps of the time-series database-based transaction processing method of any one of claims 1 to 7.
10. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon a computer program which, when executed by a processor, performs the steps of the time-series database-based transaction processing method according to any one of claims 1 to 7.
CN202310915398.0A 2023-07-25 2023-07-25 Transaction processing method, device, equipment and storage medium based on time sequence database Pending CN116860398A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310915398.0A CN116860398A (en) 2023-07-25 2023-07-25 Transaction processing method, device, equipment and storage medium based on time sequence database

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310915398.0A CN116860398A (en) 2023-07-25 2023-07-25 Transaction processing method, device, equipment and storage medium based on time sequence database

Publications (1)

Publication Number Publication Date
CN116860398A true CN116860398A (en) 2023-10-10

Family

ID=88224953

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310915398.0A Pending CN116860398A (en) 2023-07-25 2023-07-25 Transaction processing method, device, equipment and storage medium based on time sequence database

Country Status (1)

Country Link
CN (1) CN116860398A (en)

Similar Documents

Publication Publication Date Title
JP2708386B2 (en) Method and apparatus for recovering duplicate database through simultaneous update and copy procedure
US9575849B2 (en) Synchronized backup and recovery of database systems
US7197632B2 (en) Storage system and cluster maintenance
US9582382B1 (en) Snapshot hardening
US6247023B1 (en) Method for providing database recovery across multiple nodes
US20060095478A1 (en) Consistent reintegration a failed primary instance
JPH06168169A (en) Distributed transaction processing using two-phase commit protocol provided with assumption commit without log force
US11436110B2 (en) Distributed database remote backup
US20090043845A1 (en) Method, system and computer program for providing atomicity for a unit of work
CN107533474B (en) Transaction processing method and device
CN112214649B (en) Distributed transaction solution system of temporal graph database
US6842763B2 (en) Method and apparatus for improving message availability in a subsystem which supports shared message queues
CN109726211B (en) Distributed time sequence database
US20230315713A1 (en) Operation request processing method, apparatus, device, readable storage medium, and system
WO2021082925A1 (en) Transaction processing method and apparatus
EP4060514A1 (en) Distributed database system and data disaster backup drilling method
CN111404737B (en) Disaster recovery processing method and related device
CN111198920B (en) Method and device for determining comparison table snapshot based on database synchronization
CN116860398A (en) Transaction processing method, device, equipment and storage medium based on time sequence database
CN113064768B (en) Method and device for switching fragment nodes in block chain system
CN111984665B (en) Distributed transaction processing method, device and system
CN114930315A (en) Processing delete requests based on updated change summaries
CN110362428A (en) The on-line automatic method and system for restoring database block
CN117632598B (en) GBase8a database online backup method
CN113238892B (en) Time point recovery method and device for global consistency of distributed system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination