WO2021249207A1 - Database transaction processing method, apparatus, server, and storage medium - Google Patents

Database transaction processing method, apparatus, server, and storage medium

Info

Publication number
WO2021249207A1
Authority
WO
WIPO (PCT)
Prior art keywords
metadata
version information
target
global
transaction
Prior art date
Application number
PCT/CN2021/096691
Other languages
English (en)
French (fr)
Inventor
卞昊穹
叶盛
雷海林
孙康
李海翔
潘安群
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Priority to EP21822853.4A priority Critical patent/EP4030315A4/en
Priority to KR1020227015834A priority patent/KR20220076522A/ko
Priority to JP2022555830A priority patent/JP7497907B2/ja
Publication of WO2021249207A1 publication Critical patent/WO2021249207A1/zh
Priority to US17/743,293 priority patent/US20220276998A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2308 Concurrency control
    • G06F16/2315 Optimistic concurrency control
    • G06F16/2322 Optimistic concurrency control using timestamps
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/219 Managing data history or versioning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2379 Updates performed during online database operations; commit processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Definitions

  • This application relates to the field of database technology, and more specifically, to a database transaction processing method, device, server, and storage medium.
  • Metadata is data used to describe data. For example, data used to describe the organization and structure of data objects in a database can be regarded as a kind of metadata.
  • the distributed database system includes global metadata storage and multiple working nodes. Global metadata information is maintained in the global metadata store, which includes all metadata in the system. Part or all of global metadata information is stored in the local cache of each working node.
  • Before a working node executes a transaction or operation statement for a data object, it usually needs to perform some processing based on the metadata of the data object.
  • The working node first tries to obtain the metadata from its local cache, and obtains it from the global metadata store only if that attempt fails. Therefore, it is very important to ensure the consistency of the metadata cached by each working node.
  • the embodiment of the application provides a database transaction processing method, which is applied to a working node of a distributed database system.
  • The method includes: when a target transaction is started, obtaining the transaction timestamp of the target transaction and the current global latest version information of the distributed database system, where the target transaction includes at least one operation statement for a target data object, and the global latest version information is the version information of the most recently generated metadata among the metadata stored in the distributed database system; determining the latest version of the metadata of the target data object according to the current global latest version information, and determining the to-be-accessed user data of the target transaction according to the transaction timestamp; and executing the operation statement in the target transaction on the to-be-accessed user data based on the latest version of the metadata of the target data object.
  • the embodiment of the present application provides a database transaction processing device, which is applied to a working node of a distributed database system.
  • the device includes: an acquisition module, a determination module, and a transaction processing module.
  • The acquisition module is used to obtain the transaction timestamp of the target transaction and the current global latest version information of the distributed database system when the working node starts the target transaction, wherein the target transaction includes at least one operation statement for the target data object, and the global latest version information is the version information of the most recently generated metadata among the metadata stored in the distributed database system.
  • the determining module is used for determining the metadata of the latest version of the target data object according to the current global latest version information, and determining the to-be-accessed user data of the target transaction according to the transaction timestamp.
  • The transaction processing module is used to execute the operation statements in the target transaction on the to-be-accessed user data, based on the metadata of the latest version of the target data object.
  • the database transaction processing device provided by the embodiment of the present application further includes a change module.
  • The change module is used to: receive a change instruction for the metadata of any data object, and generate the changed metadata of the data object according to the change instruction; submit the changed metadata to the global metadata store; and, when the changed metadata is successfully submitted, update the global latest version information to the version information of the changed metadata.
  • The change module is also used to: before submitting the changed metadata to the global metadata store, when the changed metadata is generated, send a first timestamp allocation request to the global timestamp manager; receive the timestamp returned by the global timestamp manager in response to the first timestamp allocation request; and determine the timestamp as the version information of the changed metadata.
  • The global latest version information is stored in the global timestamp manager, and the change module updates the global latest version information to the version information of the changed metadata by sending a submission success notification to the global timestamp manager, so that the global timestamp manager updates the stored global latest version information to the version information of the changed metadata according to the submission success notification.
  • The local cache of the working node stores the version information of the local latest metadata of the data object, and the change module is also used to: when the changed metadata is successfully submitted, write the changed metadata into the local cache; and, in the local cache, update the version information of the local latest metadata of the data object whose metadata has changed to the version information of the changed metadata.
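The change flow described above (allocate a version, submit to the global store, notify the timestamp manager, refresh the local cache) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class names, the integer counter used as a timestamp source, and the in-memory dictionaries standing in for the global metadata store and the local cache are all hypothetical.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class GlobalTimestampManager:
    """Hypothetical GTM: issues increasing timestamps and stores the global latest version."""
    _clock: count = field(default_factory=lambda: count(1))
    global_latest_version: int = 0

    def allocate_timestamp(self) -> int:
        return next(self._clock)

    def notify_commit_success(self, version: int) -> None:
        # Assumption: only move forward, so a late notification cannot roll the value back.
        self.global_latest_version = max(self.global_latest_version, version)


@dataclass
class WorkingNode:
    gtm: GlobalTimestampManager
    global_store: dict                                  # object name -> {version: metadata}
    cache: dict = field(default_factory=dict)           # object name -> {version: metadata}
    local_latest: dict = field(default_factory=dict)    # object name -> version

    def change_metadata(self, name: str, new_metadata: dict) -> int:
        version = self.gtm.allocate_timestamp()                          # first timestamp allocation request
        self.global_store.setdefault(name, {})[version] = new_metadata   # submit to the global store
        self.gtm.notify_commit_success(version)                          # GTM updates the global latest version
        self.cache.setdefault(name, {})[version] = new_metadata          # write into the local cache
        self.local_latest[name] = version                                # update the local latest version
        return version
```

In this sketch the submission always succeeds; a real system would notify the timestamp manager and update the cache only after the global store confirms the write.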
  • The determining module determines the latest version of the metadata of the target data object according to the current global latest version information by: obtaining the version information of the local latest metadata of the target data object from the local cache; comparing whether the version information of the local latest metadata of the target data object is the same as the current global latest version information; and, if they are the same, obtaining the local latest metadata of the target data object from the metadata stored in the local cache as the metadata of the latest version of the target data object.
  • If the version information of the local latest metadata of the target data object differs from the current global latest version information, the determining module searches the versions of the target data object's metadata stored in the local cache and the global metadata store for target metadata, that is, metadata whose version information is newer than that of the local latest metadata of the target data object and not newer than the current global latest version information; if such target metadata exists, the target metadata with the newest version information is determined as the metadata of the latest version of the target data object.
  • The determining module is further configured to: after determining the target metadata with the newest version information as the metadata of the latest version of the target data object, write that target metadata into the local cache; and update the version information of the local latest metadata of the target data object to the version information of that target metadata.
  • If the target metadata does not exist, the determining module determines the local latest metadata of the target data object as the metadata of the latest version of the target data object, and updates the version information of the local latest metadata of the target data object to the current global latest version information.
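The three cases handled by the determining module (versions equal, a newer target version exists, no newer target version exists) can be sketched as a single lookup function. This is an illustrative sketch only: `CacheEntry`, `resolve_latest_metadata`, and the dictionary layout are hypothetical, versions are modeled as integers, and for brevity the search covers only the global store rather than both the local cache and the global store.

```python
from dataclasses import dataclass


@dataclass
class CacheEntry:
    version: int     # version information of the local latest metadata
    metadata: dict   # the local latest metadata itself


def resolve_latest_metadata(name, global_latest, cache, global_store):
    """Return the metadata of `name` that a transaction started at `global_latest` should use.

    cache:        object name -> CacheEntry
    global_store: object name -> {version: metadata}
    """
    entry = cache[name]
    if entry.version == global_latest:
        # Case 1: the cached copy is already at the global latest version.
        return entry.metadata

    # Candidate versions: newer than the local latest, not newer than the global latest.
    candidates = [v for v in global_store.get(name, {})
                  if entry.version < v <= global_latest]
    if candidates:
        # Case 2: a newer version of this object exists; cache the newest one.
        newest = max(candidates)
        cache[name] = CacheEntry(newest, global_store[name][newest])
    else:
        # Case 3: the global version advanced because of some other object;
        # the cached metadata is still current, so only its version is bumped.
        entry.version = global_latest
    return cache[name].metadata
```

Bumping the cached version in the third case means the next transaction that sees the same global latest version takes the fast path and does not search the global store again.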
  • The global latest version information is stored in the global timestamp manager of the distributed database system, and the acquisition module obtains the transaction timestamp of the target transaction and the current global latest version information of the distributed database system by: sending a second timestamp allocation request corresponding to the target transaction to the global timestamp manager; and receiving response information returned by the global timestamp manager based on the second timestamp allocation request, the response information including the transaction timestamp of the target transaction and the current global latest version information.
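The single round trip in which the timestamp manager returns both values can be sketched as follows. The class and method names and the integer counter are assumptions made for illustration; the source only states that one response carries the transaction timestamp together with the current global latest version information.

```python
import itertools
from typing import NamedTuple


class BeginResponse(NamedTuple):
    transaction_timestamp: int
    global_latest_version: int


class GlobalTimestampManager:
    """Hypothetical GTM that answers a transaction's timestamp allocation request."""

    def __init__(self) -> None:
        self._clock = itertools.count(1)   # stand-in for a monotonically increasing timestamp function
        self.global_latest_version = 0     # updated elsewhere when a metadata change is committed

    def allocate_for_transaction(self) -> BeginResponse:
        # One response carries both values, so the working node sees a metadata
        # version that was current at the moment its timestamp was issued.
        return BeginResponse(next(self._clock), self.global_latest_version)
```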
  • The transaction processing module executes the operation statements in the target transaction on the to-be-accessed user data, based on the metadata of the latest version of the target data object, by: analyzing the operation statement in the target transaction according to the schema information of the latest version of the target data object; and processing the to-be-accessed user data according to the result of the analysis.
  • An embodiment of the present application provides a server, including: one or more processors; a memory; and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, and the one or more programs are configured to execute the above-mentioned methods.
  • the embodiments of the present application provide a computer-readable storage medium with program code stored thereon, and the program code can be invoked by a processor to execute the above-mentioned method.
  • Fig. 1 shows a schematic diagram of the architecture of a distributed database system provided by an embodiment of the present application.
  • FIG. 2 shows a schematic flowchart of a method for processing database transactions provided by an embodiment of the present application.
  • FIG. 3 shows a schematic diagram of sub-steps of step S201 shown in FIG. 2.
  • FIG. 4 shows another schematic flowchart of a database transaction processing method provided by an embodiment of the present application.
  • FIG. 5 shows another schematic flowchart of the database transaction processing method provided by the embodiment of the present application.
  • FIG. 6 shows a schematic diagram of sub-steps of step S202 shown in FIG. 2.
  • FIG. 7 shows another schematic flowchart of the database transaction processing method provided by the embodiment of the present application.
  • FIG. 8 shows an interaction flowchart in a specific example of the database transaction processing method provided by the embodiment of the present application.
  • FIG. 9 shows an interaction flowchart in another specific example of the database transaction processing method provided by the embodiment of the present application.
  • Fig. 10 shows an interaction flow chart in another specific example of the database transaction processing method provided by the embodiment of the present application.
  • FIG. 11 shows an interaction flow chart in another specific example of the database transaction processing method provided by the embodiment of the present application.
  • Fig. 12 shows a block diagram of a database transaction processing apparatus provided by an embodiment of the present application.
  • FIG. 13 is a block diagram of a server for executing the database transaction processing method according to an embodiment of the present application.
  • FIG. 14 shows a storage unit for storing or carrying program code that implements the database transaction processing method according to an embodiment of the present application.
  • FIG. 1 is a schematic diagram of the architecture of a distributed database system 10 according to an embodiment of the present application.
  • The distributed database system 10 includes a distributed storage system 200, a global metadata storage 300, a global timestamp manager (Global Timestamp Manager, GTM) 400, and at least two working nodes, such as the working nodes 110, 120, and 130 shown in FIG. 1.
  • the distributed storage system 200 may include multiple physical storage nodes (for example, storage servers, hosts, etc.) for storing user data.
  • The distributed storage system 200 may also include multiple data storage (Data Store, DS) nodes. Each DS node corresponds to a part of the storage resources in the storage resource pool formed by the multiple physical storage nodes, and is used to manage the user data stored by that part of the storage resources.
  • The working node may also be called a computation node (Computation Node, CN). It is used to receive transaction requests submitted by database users and to execute the operation statements in those transactions.
  • the database user here can be understood as a database object in a distributed database system, which can also be called a user object, and can be regarded as a connection bridge between user login information and user data.
  • If the distributed database system 10 adopts an architecture in which storage and computing are separated, the working node and the DS node are usually different processes. If the distributed database system 10 does not adopt such an architecture, the working node and the DS node are only logical modules, and may actually belong to the same process.
  • the above-mentioned working nodes and DS nodes may run on physical storage nodes in the distributed storage system 200, or may run on other physical servers outside the distributed storage system 200, which are not limited in the embodiment of the present application.
  • the global metadata store stores all metadata in the distributed database system 10, and each metadata can have at least two versions.
  • Each working node has a local cache.
  • the working node 110 has a local cache 111
  • the working node 120 has a local cache 121
  • the working node 130 has a local cache 131.
  • the local here can be understood as the physical device where the working node is located.
  • the local cache of the working node may be a storage space on the physical server where the working node is located, and the storage space can be used for the working node to cache part or all of the global metadata information of the distributed database system 10.
  • the global metadata storage may be, for example, a global data dictionary (Global Data Dictionary, GDD), and the local cache of a working node may be a data dictionary cache (Data Dictionary Cache, DDC).
  • Schema information refers to the description information of the logical structure and characteristics of all data in the database system, which can be understood as the logical data view of the database user.
  • each data object can have a corresponding schema object, where the schema object can be understood as a collection of data object schema information combined according to a certain data structure.
  • the data object refers to the data stored in the database system.
  • the data objects are also different.
  • some database systems use data tables (Data Table) as basic storage objects, and the data objects in these database systems can be understood as data tables.
  • some database systems use documents as basic objects for storage, and the data objects in these database systems can be understood as documents.
  • Take the schema object of a data table as an example: it usually includes the definition information of the data table, such as the table name, table type, attribute names in the table, attribute types, indexes, and data constraints.
  • Before a working node performs any data query or data operation on a data object, it usually needs to access the schema object of the data object to determine whether the target data of the query or operation exists. For example, if the working node 110 needs to delete the data of column C1 from table t1, it needs to access the schema object of table t1 to check whether column C1 exists in table t1.
  • The working node 110 first looks up the schema object of table t1 in the local cache 111 and, when the schema object cannot be found there, looks it up in the global metadata store 300, so as to check, according to the schema object, whether column C1 exists in table t1. Once the check confirms that column C1 exists in table t1, the operation of deleting the data of column C1 from table t1 is performed.
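The cache-first lookup and the column check in the example above can be sketched as follows. The function names and the dictionary representation of a schema object are hypothetical, and populating the cache on a miss is an assumption added for illustration.

```python
def find_schema(table, local_cache, global_store):
    """Look up a schema object in the local cache first, then in the global metadata store."""
    schema = local_cache.get(table)
    if schema is None:
        schema = global_store.get(table)
        if schema is not None:
            local_cache[table] = schema   # assumption: keep a copy for later lookups
    return schema


def delete_column_data(table, column, local_cache, global_store):
    """Check that the column exists in the schema before deleting its data."""
    schema = find_schema(table, local_cache, global_store)
    if schema is None or column not in schema["columns"]:
        raise LookupError(f"column {column!r} does not exist in table {table!r}")
    # A real system would now issue the delete against the storage layer.
    return f"deleted data of column {column} from {table}"
```

Note that this lookup never notices when another working node has changed the schema object after it was cached, which is the inconsistency the rest of the description sets out to address.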
  • The schema object may still be the schema object from before the change to data object O1, which may cause abnormalities in the processing of database transactions or operation statements. That is, when a working node changes the metadata of a data object, it is difficult for other working nodes to learn of the change in time, which easily leads to inconsistencies in the metadata that the working nodes use when processing transactions, and in turn to abnormal transaction processing.
  • A transaction is a logical unit of work composed of one or more operation statements (for example, operation statements used to add, delete, query, or modify user data) in the database system.
  • Transactions usually have ACID characteristics.
  • ACID is the abbreviation of Atomicity, Consistency, Isolation, and Durability.
  • Consistency means that data integrity must be preserved before and after the transaction; equivalently, a transaction moves the database system from one correct state to another correct state. A correct state here means that the data in the database system satisfies the predefined constraints.
  • Isolation means that when multiple users access the database system concurrently, the transactions opened by the database system for each user will not interfere with each other.
  • Durability means that once a transaction is committed, the changes made by the transaction to the data in the database system are permanent.
  • the working node 110 changes the schema object of the table t1. For example, after the working node 110 deletes the C1 column of the table t1, the index based on the C1 column is deleted from the schema object of the table t1. In this way, a new version of the schema object of the table t1 will be generated, that is, the V3 version of the schema object.
  • the worker node 120 starts a transaction T1, which in turn includes the following three operation statements: 1. Query the columns of the table t1; 2.
  • snapshot isolation technology In order to maintain the isolation of concurrent transactions, working nodes in a distributed database system often use snapshot isolation technology to process transactions.
  • the basic idea of snapshot isolation is to create a new version of a data item when a data item is added, deleted, or modified, and assign a timestamp to the new version based on the current time as version information (eg, version number).
  • the timestamp here is usually a value generated by processing the current time using the timestamp function, and it can increase as time increases.
  • When a working node executes an operation statement of any transaction, it can use the transaction timestamp of the transaction to obtain a user data snapshot, and the version information of the user data snapshot is the transaction timestamp. Only user data whose version information is less than or equal to the version information of the user data snapshot is visible to the transaction, that is, can be used as the to-be-accessed user data of the transaction.
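The visibility rule for a user data snapshot can be sketched as a small function. The name `visible_value` and the dictionary of versions are hypothetical, and timestamps are modeled as integers.

```python
def visible_value(versions, transaction_timestamp):
    """Return the value of one data item as seen by a transaction.

    versions: dict mapping version timestamp -> value for a single data item.
    Only versions with timestamp <= transaction_timestamp are visible; among
    those, the newest one is returned. Returns None if no version is visible.
    """
    eligible = [ts for ts in versions if ts <= transaction_timestamp]
    return versions[max(eligible)] if eligible else None
```

For example, with versions written at timestamps 3 and 8, a transaction with timestamp 5 sees the version written at 3, while a transaction with timestamp 8 or later sees the version written at 8.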
  • the table t1 no longer includes the data in the C1 column.
  • In the V2 version of the schema object that it accesses, there is an index on the C1 column.
  • The same transaction or operation statement has different isolation levels for user data and metadata, so the user data and the metadata visible to the working node while it processes the transaction or operation statement may not correspond to each other. This may lead to errors in the final processing results.
  • the distributed database system uses a transaction commit protocol based on two-phase commit (2PC) to implement the change and synchronization of schema objects.
  • A working node can treat a schema object change as a special transaction. The working node that initiates the schema change transaction serves as the coordinator of the transaction commit, and the other working nodes whose schema objects need to be changed synchronously are the participants of the schema change transaction.
  • The premise of this scheme is that all other nodes (that is, the participants) in the distributed database system are known and reachable, and that the coordinator and each participant maintain their own node state and write logs for use by the transaction commit protocol. In other words, the working nodes must be stateful, and no working node can exit or join the system during the schema change and synchronization process; otherwise the change and synchronization will fail.
  • the change and synchronization of schema objects can be implemented in a timing manner.
  • the work node that changes the schema object can write the changed schema object to the global metadata store, and other work nodes regularly read the schema object from the global metadata store to achieve synchronization.
  • a lease and intermediate state-based approach can be used to implement changes to the schema object.
  • The schema object on each working node is regarded as a state of the schema object (herein referred to as a "schema state"), and the change and synchronization of the schema object can be regarded as the change and synchronization of the schema state.
  • For example, if the change to the schema object of table t1 is to add an index on a certain column, the schema object without the index can be regarded as being in the absent state, and the schema object with the index added can be regarded as being in the public state.
  • each schema state has a lease, and each worker node must renew the lease before the lease ends (that is, the lease expires).
  • When a working node renews its lease, it needs to check whether there is a new schema state in the global metadata store and, if so, load the new state into the local cache. If a working node cannot complete the lease renewal because of a failure or abnormal situation, it no longer provides services to database users.
  • the lease mechanism ensures that any working node that can provide services to users can obtain the latest schema status within a lease period.
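The lease-based approach described above can be sketched as follows. This is a simplified illustration of the background technique, not of the method proposed in this application: the class name, the explicit `now` arguments used in place of a real clock, and the single versioned schema state held in a dictionary are all assumptions.

```python
class LeasedWorker:
    """Hypothetical working node that must renew a lease to keep serving."""

    def __init__(self, lease_seconds: float, now: float) -> None:
        self.lease_seconds = lease_seconds
        self.lease_expires_at = now + lease_seconds
        self.schema_state_version = 0
        self.schema_state = None

    def can_serve(self, now: float) -> bool:
        # A node whose lease has expired no longer serves database users.
        return now < self.lease_expires_at

    def renew(self, global_store: dict, now: float) -> bool:
        """global_store: {'version': int, 'state': str}, the newest schema state."""
        if not self.can_serve(now):
            return False   # renewal came too late; the node stays out of service
        if global_store["version"] > self.schema_state_version:
            # Load the new schema state into the local cache before extending the lease.
            self.schema_state_version = global_store["version"]
            self.schema_state = global_store["state"]
        self.lease_expires_at = now + self.lease_seconds
        return True
```

Because every serving node renews at least once per lease period, a new schema state reaches all serving nodes within one lease period, at the cost of a delay of up to that period.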
  • This type of scheme uses intermediate schema states, and an intermediate schema state is valid only for specific types of operation statements (such as delete, insert, and update). With this approach, in the above example, the index added to the schema object of table t1 on a working node is externally visible only when the schema object on that working node is in the public state; when the schema object is in the absent state or another intermediate state, the added index either does not exist or is not externally visible.
  • the embodiments of the present application propose a database transaction processing method, device, and server, which can ensure that the metadata that is changed before the target transaction is opened is timely learned by the working nodes that need to use the metadata. This content is explained below.
  • FIG. 2 is a schematic flowchart of a database processing method provided by an embodiment of the present application. The method can be applied to any of the working nodes shown in FIG. 1. The detailed description is as follows.
  • S201 When the target transaction is started, obtain the transaction timestamp of the target transaction and the current global latest version information of the distributed database system, where the target transaction includes at least one operation statement for the target data object, and the global latest version information is the version information of the most recently generated metadata among the metadata stored in the distributed database system.
  • opening the target transaction may mean that the target transaction starts to be executed.
  • the transaction timestamp of each transaction may be a value generated by the global timestamp manager 400 based on the current time information when the transaction is started. For example, when the transaction is started, the current time information can be processed through a timestamp function to obtain a function value, which can be used as the transaction timestamp of the transaction.
  • the timestamp function is usually monotonically increasing.
  • the transaction timestamp of the transaction also increases with the increase of the transaction start time.
  • a working node when a working node obtains metadata of any version of any data object, it may request the global timestamp manager 400 to allocate a version of the metadata.
  • the version information of the metadata may also be a value generated by the global timestamp manager 400 using a timestamp function to process the current time information.
  • The global timestamp manager 400 can generate the version information of the metadata and the transaction timestamp of the transaction through the same timestamp function. In this case, the version information of the metadata and the transaction timestamp of the transaction belong to the same timestamp sequence.
  • The global timestamp manager 400 can also generate the version information of the metadata and the transaction timestamp of the transaction through different timestamp functions. In this case, the version information of the metadata and the transaction timestamp of the transaction belong to different timestamp sequences. This embodiment places no limitation on this.
  • When the version information of metadata is a timestamp generated from time information, the version information of the different versions of all metadata in the distributed database system 10 is ordered in time. For example, suppose the distributed database system 10 contains metadata md1 and md2, where md1 has two versions, V11 and V12, and md2 has two versions, V21 and V22. Then V11, V12, V21, and V22 have a chronological order, which is reflected by their numerical order. When the timestamp function is monotonically increasing, the larger the version information, the later the metadata was generated and the newer the metadata. In the following, unless otherwise specified, a monotonically increasing timestamp function is used as an example.
  • The metadata with the largest version information is therefore the most recently generated metadata. For example, if the distributed database system 10 stores only metadata md1 and md2 and V22 is the largest version information, then the metadata md2 whose version information is V22 is the most recently generated metadata.
  • the global timestamp manager 400 may store a global latest version information.
  • the global latest version information is the version information of the newly generated metadata among all metadata of the distributed database system 10.
  • the version information in the above example is V22.
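As a small worked illustration of the md1/md2 example, the global latest version information is simply the largest version across all metadata. The numeric values below are hypothetical stand-ins for V11, V12, V21, and V22, chosen only so that V22 is the largest, as in the example; the function name is likewise an assumption.

```python
def newest_metadata(versions_by_metadata):
    """Return (metadata_name, version) for the largest version information."""
    return max(
        ((name, v) for name, vs in versions_by_metadata.items() for v in vs),
        key=lambda pair: pair[1],
    )


# Hypothetical numeric values standing in for V11, V12, V21, V22.
versions = {"md1": [11, 12], "md2": [21, 22]}
print(newest_metadata(versions))  # ('md2', 22)
```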
  • The current global latest version information described in S201 refers to the global latest version information obtained from the global timestamp manager when the working node starts the target transaction, that is, the version information of the latest metadata generated in the distributed database system 10 before the start time of the target transaction.
  • S202 Determine the metadata of the latest version of the target data object according to the current global latest version information, and determine the to-be-accessed user data of the target transaction according to the transaction timestamp.
  • The working node may access the global metadata storage 300 based on the global latest version information obtained in S201, that is, the current global latest version information, so as to obtain a metadata snapshot. This is equivalent to using metadata whose version information is less than or equal to the current global latest version information as the metadata to be accessed. The working node can then search the metadata to be accessed for the metadata of the latest version of the target data object, that is, among the versions of the target data object's metadata, the one with the largest version information.
  • the working node can also use the transaction timestamp of the target transaction to take a snapshot of user data, so that the user data visible to the target transaction can be determined, that is, the user data whose version information is less than or equal to the transaction timestamp.
  • the determined user data visible to the target transaction is the user data to be accessed, that is, the user data that can be accessed in the process of processing the target transaction.
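The two snapshot rules in S202 can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Versioned` record, the `latest_visible` helper, and the integer versions are assumptions introduced here. The same "newest version not exceeding a bound" rule is applied to metadata (bounded by the current global latest version information) and to user data (bounded by the transaction timestamp).

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Versioned:
    name: str      # hypothetical: data object name (metadata) or row key (user data)
    version: int   # timestamp-derived version information
    value: object


def latest_visible(items: Sequence[Versioned], name: str, bound: int) -> Optional[Versioned]:
    """Return the newest item called `name` whose version is <= bound, or None."""
    visible = [i for i in items if i.name == name and i.version <= bound]
    return max(visible, key=lambda i: i.version, default=None)


# Hypothetical contents: three metadata versions of table t1, two versions of one row.
metadata = [
    Versioned("t1", 10, "schema-v10"),
    Versioned("t1", 11, "schema-v11"),
    Versioned("t1", 50, "schema-v50"),  # generated after the transaction started
]
rows = [Versioned("row1", 35, "a"), Versioned("row1", 45, "b")]

current_global_latest_version = 11  # obtained in S201
transaction_timestamp = 40          # obtained in S201

schema = latest_visible(metadata, "t1", current_global_latest_version)
row = latest_visible(rows, "row1", transaction_timestamp)
```

Here `schema` is the version-11 metadata and `row` is the version-35 row; the version-50 metadata and version-45 row are invisible to the transaction.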
  • S203 Execute the operation statement in the target transaction on the user data to be accessed based on the latest version of the metadata of the target data object.
  • the working node can access the latest version of the metadata of the target data object (such as schema information), analyze the operation statement in the target transaction according to that metadata, and process the user data to be accessed according to the analysis result.
  • the analysis can be, for example, checking whether the processing object of the operation statement in the target transaction (such as the data table, or a row, column, or primary key in the data table) exists; when the analysis shows that it exists, execution of the operation statement can continue.
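The existence check described for S203 might look like the sketch below. The schema layout (a dict with `table` and `columns` keys) and the function name `statement_is_valid` are assumptions made for illustration; the source only states that the analysis checks whether the objects referenced by the statement exist.

```python
from typing import Iterable


def statement_is_valid(schema: dict, table: str, columns: Iterable[str]) -> bool:
    """Return True when the table and every referenced column exist in the schema."""
    if schema.get("table") != table:
        return False
    known = set(schema.get("columns", ()))
    return all(column in known for column in columns)


# Hypothetical latest-version schema of data table t1.
schema_t1 = {"table": "t1", "columns": ["id", "name"]}

ok = statement_is_valid(schema_t1, "t1", ["id", "name"])       # statement may proceed
missing = statement_is_valid(schema_t1, "t1", ["id", "email"])  # column not in schema
```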
  • any metadata change in the distributed database system can trigger the update of the global version information
  • when the working node starts the target transaction, it determines the latest version of the metadata of the target data object based on the current global latest version information; therefore, it can be ensured that metadata changed before the target transaction is started is known to the working node in time.
  • this solution can ensure that the metadata used by the worker node during the execution of the target transaction is the changed metadata.
  • because this solution can ensure that the working node uses the changed metadata during the entire execution of the target transaction, it avoids inconsistency in the metadata accessed while executing different operation statements of the same transaction.
  • the worker node gets the changed metadata at the beginning, which avoids the problem in the previous example, where the user data and the metadata visible in the same transaction do not correspond to each other because the metadata changes cannot be seen.
  • S201-1 Send a second time stamp allocation request corresponding to the target transaction to the global time stamp manager.
  • the second timestamp allocation request refers to a timestamp allocation request issued for a certain transaction, and is used to request the global timestamp manager 400 to allocate a transaction timestamp for the transaction. It can be understood that the second time stamp allocation request here is only for distinguishing from the first time stamp allocation request described later, and does not limit the importance of the time stamp request through "first" and "second".
  • S201-2 Receive response information returned by the global timestamp manager based on the second timestamp allocation request, where the response information includes the transaction timestamp of the target transaction and the current global latest version information.
  • when the global timestamp manager 400 receives the second timestamp allocation request, it may generate a timestamp based on the current time information as the transaction timestamp, obtain the current global latest version information, fill both into the response information corresponding to the second timestamp allocation request, and send the response information to the working node.
  • the two operations of generating the transaction timestamp and obtaining the current global latest version information are atomic. In other words, either both operations occur or neither occurs.
  • the atomicity can be achieved by setting atomic locks for these two operations.
  • the atomicity of the two operations can also be achieved in other ways, which is not limited in this embodiment.
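One way to make the two operations atomic is a single mutex, sketched below under stated assumptions. The class and method names are hypothetical, and `time.time_ns()` with a tie-breaking increment stands in for whatever timestamp function the system actually uses. The lock guarantees that no update of the global latest version information can slip in between generating the transaction timestamp and reading that information.

```python
import threading
import time


class GlobalTimestampManager:
    """Hypothetical sketch of the timestamp manager's two request types."""

    def __init__(self, initial_global_latest_version: int = 0):
        self._lock = threading.Lock()
        self._last_ts = 0
        self._global_latest_version = initial_global_latest_version

    def _next_timestamp(self) -> int:
        # Caller must hold the lock. Stays strictly increasing even if the
        # clock returns the same value twice.
        self._last_ts = max(time.time_ns(), self._last_ts + 1)
        return self._last_ts

    def allocate_for_transaction(self):
        """Second timestamp allocation request: returns both values atomically."""
        with self._lock:
            return self._next_timestamp(), self._global_latest_version

    def on_submission_success(self, version: int) -> None:
        """Update the global latest version information after a successful commit."""
        with self._lock:
            self._global_latest_version = version


manager = GlobalTimestampManager(initial_global_latest_version=20)
ts1, glsv1 = manager.allocate_for_transaction()
manager.on_submission_success(21)
ts2, glsv2 = manager.allocate_for_transaction()
```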
  • the transaction timestamp of the target transaction and the current global latest version information basically correspond to the same point in time.
  • the user data to be accessed, determined based on the transaction timestamp, and the metadata of the latest version of the target data object, determined based on the current global latest version information, are versions that correspond to each other.
  • the global latest version information maintained by the global timestamp manager 400 can be updated through the process shown in FIG. 4.
  • S401 Receive a change instruction for the metadata of any data object, and generate changed metadata of the data object according to the change instruction.
  • the change instruction here may be any instruction that can change the metadata, for example, an instruction used to implement operations such as adding, deleting, and modifying.
  • the change instruction may be input by the user or the database administrator, or may be triggered by an operation on the data object, which is not limited in this embodiment.
  • When any working node in the distributed database system 10 receives a change instruction for the metadata of any data object, it can take a snapshot of the metadata of the current version of the data object; the version information of the snapshot can be a timestamp generated based on the current time information. Then, the change instruction is executed on the metadata of the data object to obtain the metadata of the new version of the data object, that is, the changed metadata.
  • the global timestamp manager 400 may be requested to allocate a piece of version information for the changed metadata.
  • the database transaction processing method provided in this embodiment may further include the steps shown in FIG. 5.
  • S501 When generating changed metadata, send a first timestamp allocation request to the global timestamp manager.
  • S502 Receive a timestamp returned by the global timestamp manager based on the first timestamp allocation request, and determine the timestamp as version information of the changed metadata.
  • the first timestamp allocation request refers to a timestamp allocation request sent for metadata, and the timestamp returned based on the first timestamp allocation request is the version information of the metadata.
  • When the global timestamp manager 400 receives the first timestamp allocation request sent by the working node, it may generate a timestamp based on the current time information, carry the timestamp in the response information corresponding to the first timestamp allocation request, and return it to the working node.
  • the working node receives the response information, extracts the time stamp in the response information, and determines the extracted time stamp as the version information of the changed metadata.
  • the steps shown in FIG. 5 may be executed before S402.
  • the worker node can submit the changed metadata and its version information to the global metadata storage 300.
  • the changed metadata and its version information will be written into the global metadata storage 300.
  • the global metadata store 300 may return a submission confirmation message.
  • the changed metadata and its version information will be persistently stored in the global metadata storage 300.
  • the working node can send a submission success notification to the global timestamp manager 400.
  • the submission success notification can include the version information of the changed metadata. Therefore, the global timestamp manager 400 can modify the global latest version information stored in it to the version information of the changed metadata according to the received submission success notification.
  • when the global timestamp manager 400 receives the submission success notification, it can determine, based on the version information of the changed metadata carried in the notification, that the metadata with that version information has been successfully submitted to the global metadata storage 300. It may then update the stored global latest version information to the version information carried in the submission success notification, that is, the version information of the changed metadata.
  • the working node may repeat S402. It can be understood that the worker node maintains a persistent log or task queue during the metadata change, and the persistent log or task queue records the progress of the metadata change. If the worker node performing the metadata change fails during the process, that worker node, after recovery, or other worker nodes can continue to perform the metadata change based on the persistent log or task queue until the changed metadata is submitted successfully.
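The metadata change path (allocate a version, submit, retry on failure, then notify the timestamp manager) can be sketched as below. All class and function names are hypothetical, a counter stands in for the timestamp function, and the in-memory retry loop is a simplification of the persistent log or task queue mentioned in the text. The point illustrated is the ordering: the global latest version information changes only after the store has accepted the changed metadata.

```python
import itertools


class FlakyStore:
    """Hypothetical global metadata store whose first submission attempt fails."""

    def __init__(self):
        self.data = {}
        self.attempts = 0

    def commit(self, obj, version, metadata) -> bool:
        self.attempts += 1
        if self.attempts == 1:
            return False  # simulated transient failure
        self.data[(obj, version)] = metadata
        return True


class TimestampManager:
    """Hypothetical stand-in for the global timestamp manager."""

    def __init__(self, start: int, global_latest_version: int):
        self._counter = itertools.count(start)
        self.global_latest_version = global_latest_version

    def allocate_version(self) -> int:
        # First timestamp allocation request (S501/S502).
        return next(self._counter)

    def notify_submission_success(self, version: int) -> None:
        self.global_latest_version = version


def change_metadata(store, tsm, obj, changed, max_attempts: int = 3) -> int:
    version = tsm.allocate_version()
    for _ in range(max_attempts):            # repeat the submission on failure
        if store.commit(obj, version, changed):
            tsm.notify_submission_success(version)  # only after a successful commit
            return version
    raise RuntimeError("changed metadata could not be submitted")


store = FlakyStore()
tsm = TimestampManager(start=11, global_latest_version=10)
new_version = change_metadata(store, tsm, "t1", {"columns": ["id", "name"]})
```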
  • the global latest version information can be understood as a variable, and the update of the global latest version information can be understood as an update to the value of the variable.
  • the global latest version information may be referred to as the first variable.
  • the target data object targeted by the target transaction is referred to as the first data object.
  • any change to the metadata of any working node will trigger the update of the global latest version information in the global timestamp manager 400. Therefore, there may be situations in which no new version of metadata is generated for the target data object before the target transaction is started, or the new version of metadata has been synchronized to the local cache of the working node that started the target transaction.
  • the latest metadata of the target data object in the local cache of the working node is already the latest metadata of the target data object in the distributed database system 10. At this time, the working node does not need to access the global metadata storage 300 during the execution of S202.
  • S202 shown in FIG. 2 can be implemented through the process shown in FIG. 6.
  • the local cache refers to the local cache of the working node that starts the target transaction.
  • the local cache of each working node can maintain the version information of the local latest metadata of each data object.
  • the version information of the local latest metadata described here can be understood as a variable, which is referred to herein as a second variable.
  • the value of the second variable can be updated.
  • the local cache 121 of the working node 120 there is a second variable of each data object in the distributed database system 10, and the value of the second variable represents the version information of the latest metadata of the data object in the local cache 121.
  • the local cache 121 stores the metadata of each version of the data table t1, and the value of the second variable of the data table t1 in the local cache 121 represents the latest one among the version information of those versions of the metadata.
  • the transaction T2 includes at least one operation statement for the data table t2. Then, the transaction T2 can be regarded as the target transaction, the working node 120 can be regarded as the node that starts and processes the target transaction T2, the data table t2 can be regarded as the target data object (or the first data object), and the local cache 121 of the working node 120 can be regarded as the local cache in S202-1.
  • the working node 120 can look up the second variable of the data table t2 in the local cache 121 and obtain the current value of the second variable. The current value is the version information of the latest metadata of the data table t2 in the local cache 121.
  • S202-2 Compare whether the version information of the local latest metadata of the target data object is the same as the current global latest version information. If yes, execute S202-3; if not, execute S202-4.
  • S202-3 Obtain the latest local metadata of the target data object from the metadata stored in the local cache as the latest version of the metadata of the target data object.
  • the version information of the local latest metadata of the target data object obtained through S202-1 is compared with the current global latest version information obtained through S201.
  • the local latest metadata of the target data object in the local cache is the global latest metadata in the distributed database system 10.
  • the local latest metadata can be obtained directly from the local cache without having to access the global metadata store 300.
  • S202-4 Search the metadata of each version of the target data object stored in the local cache and the global metadata store for target metadata, that is, metadata whose version information is newer than the version information of the local latest metadata of the target data object and not newer than the current global latest version information. If such metadata exists, S202-5 can be executed; if not, S202-6 can be executed.
  • S202-5 Determine the target metadata with the latest version information in the found target metadata as the metadata of the latest version of the target data object.
  • if the version information of the local latest metadata of the target data object in the local cache is not the same as the current global latest version information, it means that one or more metadata changes have occurred before the target transaction is started.
  • the one or more metadata changes triggered the update of the global latest version information, and they may or may not include a change to the metadata of the target data object.
  • the target metadata is bound to exist in the local cache or the global metadata storage 300, that is, metadata whose version information lies between the version information of the local latest metadata of the first data object and the current global latest version information. This is because the global latest version information is updated only after the metadata whose change triggers the update has been successfully submitted to the global metadata storage 300; therefore, the metadata having the current global latest version information, and the metadata having earlier version information, have already been written into the global metadata storage 300 and can be found.
  • the target metadata with the latest version information is generated by the last change, that is, the metadata of the latest version of the target data object in the distributed database system 10.
  • the working node can write the target metadata with the latest version information into the local cache, so that the latest version of the target data object's metadata is stored in the local cache, and update the version information of the local latest metadata of the target data object to the version information of that target metadata (that is, the version information of the metadata of the latest version of the target data object).
  • S202-6 Determine the latest local metadata of the target data object as the latest version of the metadata of the target data object.
  • the local latest metadata of the target data object is already the latest version of the metadata of the target data object in the distributed database system 10.
  • the metadata of the latest version of the target data object described in S202, S202-3, S202-5, and S202-6 all refer to the metadata of the latest version of the target data object in the entire distributed database system 10.
  • the working node may also update the version information (that is, the value of the second variable) of the local latest metadata of the target data object in the local cache to the current global latest version information (that is, the current value of the first variable).
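Steps S202-1 to S202-6 can be put together as the following sketch. The `LocalCache` structure, the dict-based global store, and the integer versions are assumptions for illustration; only the control flow follows the text. Note that the second variable may end up larger than any version actually held for the object, which is why the local latest metadata is looked up by the largest stored version rather than by the second variable's value.

```python
from dataclasses import dataclass, field


@dataclass
class LocalCache:
    versions: dict = field(default_factory=dict)  # object -> {version: metadata}
    second_variable: dict = field(default_factory=dict)  # object -> version info

    def local_latest(self, obj):
        by_version = self.versions[obj]
        return by_version[max(by_version)]


def resolve_latest_metadata(cache: LocalCache, store: dict, obj: str, global_latest: int):
    local_version = cache.second_variable[obj]               # S202-1
    if local_version == global_latest:                       # S202-2
        return cache.local_latest(obj)                       # S202-3: no store access
    # S202-4: target metadata has a version in (local_version, global_latest].
    candidates = {}
    for source in (cache.versions.get(obj, {}), store.get(obj, {})):
        for version, metadata in source.items():
            if local_version < version <= global_latest:
                candidates[version] = metadata
    if candidates:                                           # S202-5
        newest = max(candidates)
        cache.versions.setdefault(obj, {})[newest] = candidates[newest]
        cache.second_variable[obj] = newest
        return candidates[newest]
    cache.second_variable[obj] = global_latest               # S202-6
    return cache.local_latest(obj)


# Hypothetical state: t2 changed globally (20 -> 21), t3 did not change.
cache = LocalCache(
    versions={"t2": {20: "O2@20"}, "t3": {15: "O3@15"}},
    second_variable={"t2": 20, "t3": 15},
)
store = {"t2": {20: "O2@20", 21: "O2@21"}, "t3": {15: "O3@15"}}

t2_schema = resolve_latest_metadata(cache, store, "t2", 21)
t3_schema = resolve_latest_metadata(cache, store, "t3", 21)
t3_again = resolve_latest_metadata(cache, store, "t3", 21)  # now a pure cache hit
```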
  • the changed metadata not only needs to be written into the global metadata storage 300, but also needs to be synchronized to each working node.
  • the working node that generates the changed metadata can synchronize the changed metadata through the process shown in FIG. 7.
  • the work node that generates the changed metadata can update the second variable of the data object whose metadata has changed.
  • the working node when the working node submits the changed metadata to the global metadata storage 300 and receives the submission confirmation information returned by the global metadata storage 300, the changed metadata can be written into the local cache.
  • S702 In the local cache, update the version information of the local latest metadata of the data object whose metadata has changed to the version information of the changed metadata.
  • the data object whose metadata is changed refers to the data object to which the changed metadata belongs.
  • the worker node After the worker node successfully writes the changed metadata into the local cache, it can modify the version information (that is, the value of the second variable) of the local latest metadata of the data object to which the changed metadata belongs to the changed metadata The version information of the metadata.
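The local synchronization in S701 and S702 amounts to two writes that happen only after the store has confirmed the submission. A minimal sketch, assuming plain dicts for the cached metadata and the second variables (both hypothetical structures):

```python
def sync_after_submission(cached_versions: dict, second_variable: dict,
                          obj: str, version: int, changed, confirmed: bool) -> bool:
    """Write changed metadata into the local cache once its submission is confirmed."""
    if not confirmed:
        return False  # nothing is cached before the global store confirms
    cached_versions.setdefault(obj, {})[version] = changed  # S701
    second_variable[obj] = version                          # S702
    return True


# Hypothetical local cache of the node that generated version 11 of t1's metadata.
cached_versions = {"t1": {10: "O1@10"}}
second_variable = {"t1": 10}

skipped = sync_after_submission(cached_versions, second_variable, "t1", 11, "O1@11", confirmed=False)
applied = sync_after_submission(cached_versions, second_variable, "t1", 11, "O1@11", confirmed=True)
```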
  • the metadata of the data table t1 is the schema object O1 of the V10 version, the metadata of the data table t2 is the schema object O2 of the V20 version, and the metadata of the data table t3 is the schema object O3 of the V30 version.
  • the global metadata storage 300 stores the schema object O1 of the V10 version, the schema object O2 of the V20 version, and the schema object O3 of the V30 version, and the global timestamp manager 400 maintains the first variable GLSV (global largest schema version). The current value of GLSV is V20, which means that the current largest version information in the distributed database system 10 is V20, that is, the latest version information is V20.
  • GLSV global largest schema version, the global maximum schema version
  • the local caches 111, 121, and 131 all cache the V10 version of the schema object O1, the V20 version of the schema object O2, and the V30 version of the schema object O3, and all maintain the second variable EGSV-1 (equivalent global schema version-1) corresponding to the data table t1, the second variable EGSV-2 corresponding to the data table t2, and the second variable EGSV-3 corresponding to the data table t3.
  • the current value of EGSV-1 is V10
  • the current value of EGSV-2 is V20
  • the current value of EGSV-3 is V30.
  • the process of the database transaction processing method provided in this embodiment may include the process shown in FIG. 8.
  • the working node 110 receives a change instruction for the schema object O1 of the data table t1, and generates a changed schema object O1 according to the change instruction.
  • the working node 110 sends a first time stamp allocation request r1 to the global time stamp manager 400.
  • the global time stamp manager 400 generates a time stamp V11 according to the current time information.
  • the global time stamp manager 400 sends the time stamp V11 to the working node 110.
  • the working node 110 uses the timestamp V11 as the version information of the changed schema object O1 to obtain the schema object O1 of the V11 version.
  • the working node 110 submits the schema object O1 of the V11 version to the global metadata storage 300.
  • the global time stamp manager 400 updates the current value of GLSV to V11 according to the notification.
  • S8 and S10 can be executed in parallel.
  • the working node 110 starts the transaction T1
  • the transaction T1 includes an operation statement for the data table t1
  • it can be processed according to the following process.
  • the global time stamp manager 400 receives r2, generates a time stamp V40 based on the current time information, obtains the current value V11 of GLSV, and fills V40 and V11 into the response information corresponding to r2.
  • the global timestamp manager 400 returns the response information corresponding to r2 to the working node 110.
  • the working node 110 obtains the current value V11 of EGSV-1 from the local cache 111, determines that the current value V11 of EGSV-1 is the same as the current value V11 of GLSV, and obtains the schema object O1 whose version information is V11 from the local cache 111.
  • the working node 110 uses V40 as the transaction timestamp of the transaction T1, obtains a user data snapshot based on V40, and determines the user data whose version information is less than or equal to V40 as the to-be-accessed user data data1 of the transaction T1.
  • the database transaction processing method provided in this embodiment can first be processed in accordance with S11-S13, and after S13 it can be processed in accordance with the process shown in FIG. 9.
  • the working node 110 obtains the current value V10 of EGSV-1 from the local cache 111, determines that the current value V10 of EGSV-1 is different from the current value V11 of GLSV, and then searches the schema objects O1 stored in the local cache 111 and the global metadata storage 300 for the schema object O1 (that is, target metadata) whose version information is greater than V10 and less than or equal to V11.
  • the working node 110 finds the schema object O1 of the V11 version in the global metadata storage 300, determines it as the latest version of the schema object O1, synchronizes the schema object O1 of the V11 version to the local cache 111, and updates the current value of EGSV-1 in the local cache 111 to V11.
  • the working node 110 obtains a user data snapshot based on the transaction timestamp V40, and determines the user data whose version information is less than or equal to V40 as the to-be-accessed user data data1 of the transaction T1.
  • the working node 110 analyzes the operation statement in the transaction T1 based on the schema object O1 of the V11 version, and executes the operation statement in the transaction T1 on the data1 according to the analysis result.
  • when the worker node 110 receives the change instruction for the schema object O2 of the data table t2 at time 2, it can process it according to a process similar to S1-S10 above. After the process is completed, a new version (V21) of the schema object O2 is added to the global metadata storage 300, and the current value of GLSV is V21. A schema object O2 of version V21 is newly added to the local cache 111 of the working node 110, and the current value of EGSV-2 in the local cache 111 is V21. However, the schema object O2 of the V21 version does not exist on the working nodes 120 and 130, and the current value of EGSV-2 in the local caches 121 and 131 is still V20.
  • the transaction T2 includes an operation statement for the data table t2.
  • the database transaction processing method provided by this embodiment can be processed according to the process shown in FIG. 10:
  • the global time stamp manager 400 receives r3, generates a time stamp V41 based on the current time information, obtains the current value V21 of GLSV, and fills V41 and V21 into the response information corresponding to r3.
  • S23 The global timestamp manager 400 returns the response information to the working node 120.
  • the working node 120 obtains the current value V20 of EGSV-2 from the local cache 121, determines that the current value V20 of EGSV-2 is different from the current value V21 of GLSV, and then searches the schema objects O2 stored in the local cache 121 and the global metadata storage 300 for the schema object O2 (that is, target metadata) whose version information is greater than V20 and less than or equal to V21.
  • the working node 120 finds the schema object O2 of the V21 version in the global metadata storage 300, determines it as the latest version of the schema object O2, synchronizes the schema object O2 of the V21 version to the local cache 121, and updates the current value of EGSV-2 in the local cache 121 to V21.
  • the working node 120 determines the to-be-accessed user data data2 of the transaction T2 based on the transaction timestamp V41, parses the operation statement in the transaction T2 based on the schema object O2 of the V21 version, and executes the operation statement in the transaction T2 on the data2 based on the analysis result.
  • the working node 130 starts the transaction T3 at time 3, and the transaction T3 includes an operation statement for the data table t3. Then the database transaction processing method provided in this embodiment can be processed according to the process shown in FIG. 11:
  • the global time stamp manager 400 receives r4, generates a time stamp V42 based on the current time information, obtains the current value V21 of GLSV, and fills V42 and V21 into the response information corresponding to r4.
  • S29 The global timestamp manager 400 returns the response information to the working node 130.
  • the working node 130 obtains the current value V30 of EGSV-3 from the local cache 131, determines that the current value V30 of EGSV-3 is different from the current value V21 of GLSV, and then searches the schema objects O3 stored in the local cache 131 and the global metadata storage 300 for the schema object O3 (that is, target metadata) whose version information is greater than V30 and less than or equal to V21.
  • the working node 130 determines the schema object O3 of the V30 version as the latest version of the schema object O3, and updates the current value of EGSV-3 in the local cache 131 to V21.
  • the working node 130 determines the to-be-accessed user data data3 of the transaction T3 based on the transaction timestamp V42, parses the operation statement in the transaction T3 based on the schema object O3 of the V21 version (that is, the schema object O3 of the V30 version), and executes the operation statement in the transaction T3 on data3 based on the analysis result.
  • the distributed database system maintains the global latest version information, and the global latest version information refers to the version information of the latest metadata among all the metadata of the system.
  • when a worker node starts a target transaction for the target data object, it can obtain the transaction timestamp and the current global latest version information, determine the latest version metadata of the target data object according to the current global latest version information, determine the user data visible to the target transaction as the user data to be accessed according to the transaction timestamp, and execute the operation statement in the target transaction on the user data to be accessed based on the latest version metadata of the target data object.
  • FIG. 12 shows a structural block diagram of a database transaction processing apparatus 1200 provided by an embodiment of the present application.
  • the apparatus 1200 may include: an acquisition module 1210, a determination module 1220, and a transaction processing module 1230.
  • the obtaining module 1210 is configured to obtain the transaction timestamp of the target transaction and the current global latest version information of the distributed database system when the target transaction is started by the working node, wherein the target transaction includes at least one operation statement for the target data object, and the global latest version information is the version information of the most recently generated metadata among the metadata stored in the distributed database system.
  • the determining module 1220 is configured to determine the metadata of the latest version of the target data object according to the current global latest version information, and determine the to-be-accessed user data of the target transaction according to the transaction timestamp.
  • the transaction processing module 1230 is configured to execute the operation statement in the target transaction on the user data to be accessed based on the latest version of the metadata of the target data object.
  • the device 1200 may also include a change module.
  • the change module is used to: receive a change instruction for the metadata of any data object, generate the changed metadata of the data object according to the change instruction; submit the changed metadata to the global metadata store; When the changed metadata is successfully submitted, the global latest version information is updated to the version information of the changed metadata.
  • the change module may also be used to: when generating the changed metadata, send a first timestamp allocation request to the global timestamp manager; receive that the global timestamp manager allocates based on the first timestamp Request the returned timestamp, and determine the timestamp as the version information of the changed metadata.
  • the local cache of the working node stores the version information of the local latest metadata of the data object.
  • the change module may also be used to: when the changed metadata is successfully submitted, write the changed metadata into the local cache; in the local cache, the data object whose metadata has been changed The version information of the latest local metadata of is updated to the version information of the changed metadata.
  • the determining module 1220 may determine the latest version of the metadata of the target data object according to the current global latest version information by: obtaining the version information of the local latest metadata of the target data object from a local cache; comparing whether the version information of the local latest metadata of the target data object is the same as the current global latest version information; and, if they are the same, obtaining the local latest metadata of the target data object from the metadata stored in the local cache as the metadata of the latest version of the target data object.
  • the determining module 1220 may also determine the latest version of the metadata of the target data object according to the current global latest version information as follows: if the version information of the local latest metadata of the target data object is not the same as the current global latest version information, search the metadata of each version of the target data object stored in the local cache and the global metadata store for target metadata, that is, metadata whose version information is newer than the obtained local latest version information and not newer than the current global latest version information; if it exists, determine the target metadata with the latest version information among the found target metadata as the metadata of the latest version of the target data object.
  • the determining module 1220 may be further configured to: after determining the target metadata with the latest version information among the found target metadata as the metadata of the latest version of the target data object, write that target metadata into the local cache, and update the version information of the local latest metadata of the target data object to the version information of that target metadata.
  • the determining module 1220 may also determine the latest version of the metadata of the target data object according to the current global latest version information as follows: if the target metadata does not exist, determine the local latest metadata of the target data object as the metadata of the latest version of the target data object, and update the version information of the local latest metadata of the target data object to the current global latest version information.
  • the global latest version information is stored in the global timestamp manager of the distributed database system.
  • the obtaining module 1210 may obtain the transaction timestamp of the target transaction and the current global latest version information of the distributed database system by: sending a second timestamp allocation request corresponding to the target transaction to the global timestamp manager; and receiving response information returned by the global timestamp manager based on the second timestamp allocation request, the response information including the transaction timestamp of the target transaction and the current global latest version information.
  • the transaction processing module 1230 may execute the operation statements in the target transaction on the user data to be accessed, based on the metadata of the latest version of the target data object, by: parsing the operation statements in the target transaction according to the schema information of the latest version of the target data object; and processing the user data to be accessed according to the parsing result.
  • the coupling, direct coupling, or communication connection between the displayed or discussed modules may be through some interfaces, and the indirect coupling or communication connection between apparatuses or modules may be electrical, mechanical, or in other forms.
  • each functional module in each embodiment of the present application may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module.
  • the above-mentioned integrated modules can be implemented in the form of hardware or software function modules.
  • FIG. 13 shows a structural block diagram of a server 1300 provided by an embodiment of the present application.
  • the server 1300 may be the server where the above-mentioned worker node is located; it may be an independent server or a server cluster composed of multiple physical servers, or it may be a cloud server that provides basic cloud computing services such as cloud computing, big data, and artificial intelligence platforms.
  • the server 1300 in this application may include one or more of the following components: a processor 1310, a memory 1320, and one or more application programs, where the one or more application programs may be stored in the memory 1320 and configured to be executed by the one or more processors 1310, and the one or more programs are configured to execute the method described in the foregoing method embodiment.
  • the processor 1310 may include one or more processing cores.
  • the processor 1310 uses various interfaces and lines to connect the various parts of the entire server 1300, and performs the various functions of the server 1300 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 1320 and by calling data stored in the memory 1320.
  • the processor 1310 may be implemented in at least one hardware form of digital signal processing (DSP), field-programmable gate array (FPGA), and programmable logic array (PLA).
  • the processor 1310 may integrate one or a combination of a central processing unit (CPU), a graphics processing unit (GPU), a modem, and the like.
  • the CPU mainly processes the operating system, user interface and application programs, etc.; the GPU is used for rendering and drawing of display content; the modem is used for processing wireless communication. It can be understood that the above-mentioned modem may not be integrated into the processor 1310, but may be implemented by a communication chip alone.
  • the memory 1320 may include random access memory (RAM) or read-only memory (ROM).
  • the memory 1320 may be used to store instructions, programs, codes, code sets or instruction sets.
  • the memory 1320 may include a program storage area and a data storage area, where the program storage area may store instructions for implementing the operating system, instructions for implementing at least one function (such as a touch function, a sound playback function, or an image playback function), instructions for implementing the various method embodiments, and so on.
  • the storage data area may also store data (such as metadata) created by the server 1300 during use.
  • FIG. 14 shows a structural block diagram of a computer-readable storage medium provided by an embodiment of the present application.
  • the computer-readable storage medium 1400 stores program code, and the program code can be invoked by a processor to execute the method described in the foregoing method embodiment.
  • the computer-readable storage medium 1400 may be an electronic memory such as flash memory, EEPROM (Electrically Erasable Programmable Read Only Memory), EPROM, hard disk, or ROM.
  • the computer-readable storage medium 1400 includes a non-transitory computer-readable storage medium.
  • the computer-readable storage medium 1400 has storage space for the program code 1410 for executing any method steps in the above-mentioned methods. These program codes can be read from or written into one or more computer program products.
  • the program code 1410 may, for example, be compressed in a suitable form.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

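This application discloses a database transaction processing method, apparatus, and server, and relates to the field of database technology. When starting a target transaction, a worker node of a distributed database system obtains the transaction timestamp of the target transaction and the current global latest version information, where the target transaction includes at least one operation statement for a target data object, and the global latest version information is the version information of the most recently generated metadata among the metadata stored in the distributed database system. The worker node determines the metadata of the latest version of the target data object according to the current global latest version information, determines the user data to be accessed by the target transaction according to the transaction timestamp, and executes the operation statements in the target transaction on the user data to be accessed based on the metadata of the latest version of the target data object.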

Description

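Database transaction processing method and apparatus, server, and storage medium
This application claims priority to Chinese Patent Application No. 202010520559.2, entitled "Database transaction processing method and apparatus, and server", filed with the China Patent Office on June 10, 2020, which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the field of database technology, and more specifically to a database transaction processing method and apparatus, a server, and a storage medium.
Background
Metadata is data that describes data. For example, data describing the organization and structure of data objects in a database can be regarded as metadata. A distributed database system includes a global metadata store and multiple worker nodes. The global metadata store maintains the global metadata information, which includes all metadata in the system. The local cache of each worker node stores part or all of the global metadata information.
Before executing a transaction or operation statement on a data object, a worker node usually needs to perform some processing based on the metadata of that data object. The worker node first tries to obtain the metadata from its local cache and, only if that fails, obtains it from the global metadata store. It is therefore very important to keep the metadata cached by the worker nodes consistent.
Summary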
本申请实施例提供了一种数据库事务处理方法,应用于分布式数据库系统的工作节点,该方法包括:在开启目标事务时,获取目标事务的事务时间戳及分布式数据库系统当前的全局最新版本信息,其中,目标事务包括针对目标数据对象的至少一个操作语句,全局版本信息是分布式数据库系统存储的各个元数据中最新生成的元数据的版本信息;根据当前的全局版本信息确定目标数据对象的最新版本的元数据,以及根据事务时间戳确定目标事务的待访问用户数据;基于目标数据对象的最新版本的元数据对待访问用户数据执行目标事务中的操作语句。
本申请实施例提供了一种数据库事务处理装置,应用于分布式数据库系统的工作节点,该装置包括:获取模块、确定模块及事务处理模块。其中,获取模块用于当工作节点开启目标事务时,获取目标事务的事务时间戳及所述分布式数据库系统当前的全局最新版本信息,其中,所述目标事务包括针对目标数据对象的至少一个操作语句,所述全局最新版本信息是所述分布式数据库系统存储的各个元数据中最新生成的元数据的版本信息。确定模块用于根据当前的全局最新版本信息确定目标数据对象的最新版本的元数据,以及根据事务时间戳确定目标事务的待访问用户数据。事务处理模块用于基于目标数据对象的最新版本的元数据对待访问用户数据执行目标事务中的操作语句。
本申请实施例提供的数据库事务处理装置还包括变更模块。变更模块用于:接收针对任一数据对象的元数据的变更指令,根据变更指令生成该数据对象的变更后的元数据; 向全局元数据存储提交变更后的元数据;当变更后的元数据成功提交时,将全局最新版本信息更新为变更后的元数据的版本信息。
本申请实施例提供的数据库事务处理装置中,变更模块还用于:在向全局元数据存储提交变更后的元数据之前,当生成变更后的元数据时,向全局时间戳管理器发送第一时间戳分配请求;接收全局时间戳管理器基于第一时间戳分配请求返回的时间戳,并将该时间戳确定为变更后的元数据的版本信息。
本申请实施例提供的数据库事务处理装置中,全局最新版本信息存储于全局时间戳管理器中,变更模块将全局最新版本信息更新为变更后的元数据的版本信息的方式为:向全局时间戳管理器发送提交成功通知,使全局时间戳管理器根据提交成功通知将存储的全局最新版本信息更新为变更后的元数据的版本信息。
本申请实施例提供的数据库事务处理装置中,工作节点的本地缓存存储有数据对象的本地最新元数据的版本信息,变更模块还用于:当变更后的元数据被成功提交时,将变更后的元数据写入本地缓存;在本地缓存中,将元数据发生变更的数据对象的本地最新元数据的版本信息更新为变更后的元数据的版本信息。
本申请实施例提供的数据库事务处理装置中,确定模块根据当前的全局最新版本信息确定目标数据对象的最新版本的元数据的方式为:从本地缓存中获取目标数据对象的本地最新元数据的版本信息;对比目标数据对象的本地最新元数据的版本信息与当前的全局最新版本信息是否相同;若相同,从本地缓存存储的元数据中,获取目标数据对象的本地最新元数据作为目标数据对象的最新版本的元数据。
本申请实施例提供的数据库事务处理装置中,确定模块根据当前的全局最新版本信息确定目标数据对象的最新版本的元数据的方式为:若目标数据对象的本地最新元数据的版本信息与当前的全局最新版本信息不相同,从本地缓存和全局元数据存储中存储的目标数据对象的各版本的元数据中,查找是否存在目标元数据,查找到的目标元数据的版本信息比目标数据对象的本地最新元数据的版本信息新、且不比当前的全局最新版本信息新;若存在,将所述查找到的目标元数据中版本信息最新的目标元数据确定为目标数据对象的最新版本的元数据。
本申请实施例提供的数据库事务处理装置中,确定模块还用于:在将版本信息最新的目标元数据确定为目标数据对象的最新版本的元数据之后,将版本信息最新的目标元数据更新至本地缓存;将目标数据对象的本地最新元数据的版本信息,更新为版本信息最新的目标元数据对应的版本信息。
本申请实施例提供的数据库事务处理装置中,确定模块根据当前的全局最新版本信息确定目标数据对象的最新版本的元数据的方式为:若不存在所述目标元数据,将目标数据对象的本地最新元数据确定为目标数据对象的最新版本的元数据,以及将目标数据对象的本地最新元数据的版本信息更新为当前的全局最新版本信息。
本申请实施例提供的数据库事务处理装置中,全局最新版本信息存储于所述分布式数据库系统的全局时间戳管理器中,获取模块获取目标事务的事务时间戳及分布式数据库系统当前的全局最新版本信息的方式为:向全局时间戳管理器发送目标事务对应的第二时间戳分配请求;接收全局时间戳管理器基于第二时间戳分配请求返回的响应信息,该响应信息包括目标事务的事务时间戳及当前的全局最新版本信息。
本申请实施例提供的数据库事务处理装置中,事务处理模块基于目标数据对象的最新版本的元数据对待访问用户数据执行目标事务中的操作语句的方式为:根据目标数据 对象的最新版本的模式信息,对目标事务中的操作语句进行解析;根据解析结果对待访问用户数据进行处理。
另一方面,本申请实施例提供了一种服务器,包括:一个或多个处理器;存储器;一个或多个程序,其中所述一个或多个程序被存储在存储器中并被配置为由所述一个或多个处理器执行,所述一个或多个程序配置用于执行上述的方法。
另一方面,本申请实施例提供了一种计算机可读存储介质,其上存储有程序代码,该程序代码可被处理器调用执行上述的方法。
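Brief Description of the Drawings
To describe the technical solutions in the embodiments of this application more clearly, the drawings needed for describing the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of this application, and a person skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic architecture diagram of a distributed database system provided by an embodiment of this application.
FIG. 2 is a schematic flowchart of a database transaction processing method provided by an embodiment of this application.
FIG. 3 is a schematic diagram of the sub-steps of step S201 shown in FIG. 2.
FIG. 4 is another schematic flowchart of the database transaction processing method provided by an embodiment of this application.
FIG. 5 is yet another schematic flowchart of the database transaction processing method provided by an embodiment of this application.
FIG. 6 is a schematic diagram of the sub-steps of step S202 shown in FIG. 2.
FIG. 7 is still another schematic flowchart of the database transaction processing method provided by an embodiment of this application.
FIG. 8 is an interaction flowchart of the database transaction processing method provided by an embodiment of this application in one specific example.
FIG. 9 is an interaction flowchart of the database transaction processing method provided by an embodiment of this application in another specific example.
FIG. 10 is an interaction flowchart of the database transaction processing method provided by an embodiment of this application in yet another specific example.
FIG. 11 is an interaction flowchart of the database transaction processing method provided by an embodiment of this application in still another specific example.
FIG. 12 is a block diagram of a database transaction processing apparatus provided by an embodiment of this application.
FIG. 13 is a block diagram of a server, according to an embodiment of this application, for executing the database transaction processing method according to the embodiments of this application.
FIG. 14 shows a storage unit, according to an embodiment of this application, for storing or carrying program code that implements the database transaction processing method according to the embodiments of this application.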
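Detailed Description
To help a person skilled in the art better understand the solutions of this application, the technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings in the embodiments.
Referring to FIG. 1, FIG. 1 is a schematic architecture diagram of a distributed database system 10 provided by an embodiment of this application. The distributed database system 10 includes a distributed storage system 200, a global metadata store 300, a global timestamp manager (GTM) 400, and at least two worker nodes, for example the worker nodes 110, 120, and 130 shown in FIG. 1.
The distributed storage system 200 may include multiple physical storage nodes (for example, storage servers or hosts) for storing user data. The distributed storage system 200 may also include multiple data store (DS) nodes. Each DS node corresponds to a portion of the storage resources in the storage resource pool formed by the physical storage nodes, and manages the user data stored in that portion.
A worker node, which may also be called a computation node (CN), is a process that receives and executes the transaction requests submitted by database users and the operation statements in those transactions. A database user here can be understood as a database object in the distributed database system, also called a user object, which serves as the bridge between user login information and user data.
In the embodiments of this application, if the distributed database system 10 adopts an architecture that separates storage from computation, the worker nodes and the DS nodes are usually different processes. If it does not, the worker nodes and the DS nodes are only logical modules and may in fact belong to the same process.
The worker nodes and DS nodes may run on the physical storage nodes of the distributed storage system 200 or on other physical servers outside the distributed storage system 200. The embodiments of this application impose no limitation in this respect.
The global metadata store holds all metadata in the distributed database system 10, and each piece of metadata may have at least two versions. Each worker node has a local cache: worker node 110 has local cache 111, worker node 120 has local cache 121, and worker node 130 has local cache 131. "Local" here refers to the physical device on which the worker node runs. Accordingly, the local cache of a worker node may be storage space on the physical server where the worker node is located, in which the worker node caches part or all of the global metadata information of the distributed database system 10. In the embodiments of this application, the global metadata store may be, for example, a global data dictionary (GDD), and the local cache of a worker node may be a data dictionary cache (DDC).
A common kind of metadata in a database system is schema information. Schema information describes the logical structure and characteristics of all data in the database system, and can be understood as the logical data view of a database user. Each data object in a database system may have a corresponding schema object, which can be understood as the set formed by organizing the schema information of the data object in a certain data structure.
A data object is the data stored in a database system, and it differs with the type of database system. For example, some database systems use the data table as the basic unit of storage, so their data objects are data tables. Other database systems use the document as the basic unit of storage, so their data objects are documents. Taking the schema object of a data table as an example, it usually includes the definition of the table, such as the table name, the table type, the attribute names and attribute types in the table, the indexes, and the data constraints.
Taking schema information as the metadata, in the distributed database system 10 a worker node usually needs to access the schema object of a data object before performing any data query or data operation on it, to determine whether the target of the query or operation exists. For example, if worker node 110 needs to delete the data of column C1 from table t1, it needs to access the schema object of table t1 to check whether column C1 exists in table t1. Specifically, worker node 110 first looks up the schema object of table t1 in local cache 111. If it cannot be found there, worker node 110 looks it up in the global metadata store 300, and then checks against that schema object whether column C1 exists in table t1. Only when the check confirms that column C1 exists in table t1 does the worker node delete the data of column C1 from table t1.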
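The lookup order described above (local cache first, then the global metadata store) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of the embodiments: the dictionaries, the `columns` key, and the function names `find_schema` and `column_exists` are hypothetical.

```python
def find_schema(table, local_cache, global_store):
    """Return the schema object of `table`, preferring the local cache."""
    schema = local_cache.get(table)
    if schema is None:
        # Cache miss: fall back to the global metadata store.
        schema = global_store.get(table)
    return schema


def column_exists(table, column, local_cache, global_store):
    """Check whether `column` is defined in the schema object of `table`."""
    schema = find_schema(table, local_cache, global_store)
    return schema is not None and column in schema["columns"]


# Hypothetical contents of local cache 111 and global metadata store 300.
local_cache_111 = {}
global_store_300 = {"t1": {"columns": ["C1", "C2"]}}

if column_exists("t1", "C1", local_cache_111, global_store_300):
    print("column C1 exists; the delete can proceed")
```

The sketch deliberately ignores versions; it only shows why a stale cached schema object would make the existence check give the wrong answer.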
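It follows that after one worker node changes the schema object of a data object O1, if the other worker nodes cannot learn of the change in time, the schema object they access when executing a database transaction or operation statement on data object O1 may still be the one from before the change, which may cause the processing of the transaction or statement to go wrong. In other words, after a worker node changes the metadata of a data object, it is difficult for the other worker nodes to learn of the change in time, so the worker nodes may use inconsistent metadata when processing transactions, leading to transaction processing anomalies.
In the embodiments of this application, a transaction is a logical unit of work consisting of one or more operation statements on the user data in the database system (for example, statements that insert, delete, query, or update). A transaction usually has the ACID properties: Atomicity, Consistency, Isolation, and Durability. Atomicity means that a transaction is an indivisible unit of work: the operations indicated by its statements either all happen or none happen. Consistency means that data integrity must be preserved before and after the transaction; equivalently, a transaction moves the database system from one correct state to another, where a correct state is one in which the data in the database system satisfies the predefined constraints. Isolation means that when multiple users access the database system concurrently, the transactions started for each user do not interfere with one another. Durability means that once a transaction is committed, its changes to the data in the database system are permanent.
The anomalies that arise when metadata changes cannot be learned by every worker node in time can take several forms.
In one example, suppose that in the distributed database system 10 the schema object of table t1 currently has two versions, V1 and V2. Worker node 110 then changes the schema object of table t1: after deleting column C1 from table t1, it removes the index built on column C1 from the schema object of table t1. This produces a new version of the schema object of table t1, namely version V3. Before the V3 schema object has been synchronized to worker node 120, worker node 120 starts a transaction T1 consisting of the following three operation statements in order: 1. query which columns table t1 has; 2. write one row of data to table t1; 3. write another row of data to table t1. Suppose the V3 schema object is synchronized to worker node 120 after statement 2 of transaction T1 has been executed. Then, while processing statements 1 and 2 of transaction T1, worker node 120 accesses the V2 schema object of table t1, which contains the index on column C1; while processing statement 3, it accesses the V3 schema object of table t1, which does not contain that index. That is, the metadata is inconsistent within the processing of a single transaction.
To preserve the isolation of concurrent transactions, worker nodes in a distributed database system often process transactions using snapshot isolation. The basic idea of snapshot isolation is to create a new version of a data item whenever it is inserted, deleted, or updated, and to assign that new version a timestamp based on the current time as its version information (for example, a version number). The timestamp is usually a value generated by applying a timestamp function to the current time and can increase as time advances. When executing the operation statements of a transaction, the worker node can use the transaction timestamp of that transaction to obtain a snapshot of the user data whose version information is that transaction timestamp. Only user data whose version information is less than or equal to that of the snapshot is visible to the transaction, that is, can serve as the user data to be accessed by the transaction.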
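The visibility rule of snapshot isolation described above can be illustrated with a small sketch. It assumes integer timestamps that increase monotonically; the `Version` class and the `visible_version` function are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    value: str   # contents of the data item in this version
    ts: int      # timestamp assigned when this version was created


def visible_version(versions, txn_ts):
    """Return the newest version whose timestamp is <= txn_ts, or None."""
    candidates = [v for v in versions if v.ts <= txn_ts]
    return max(candidates, key=lambda v: v.ts, default=None)


# Three versions of one data item, created at timestamps 10, 20 and 30.
history = [Version("a", 10), Version("b", 20), Version("c", 30)]

# A transaction with timestamp 25 sees the version created at 20.
snapshot = visible_version(history, 25)
```

A transaction whose timestamp precedes every version sees nothing, which is why `visible_version` may return `None`.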
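In the example above, while worker node 120 executes statements 1 and 2, table t1 in the user data to be accessed no longer contains the data of column C1, yet the V2 schema object it accesses still contains the index on column C1. In other words, when metadata changes cannot be learned by every worker node in time, the same transaction or operation statement sees user data and metadata at different isolation levels; put differently, the user data and the metadata visible to a worker node while it processes a transaction or statement may not correspond to each other. This may make the final processing result wrong.
It is therefore very important that, after a worker node changes the metadata of a data object, the other worker nodes learn of the change in time. For this reason, in some implementations a distributed database system uses a transaction commit protocol such as two-phase commit (2PC) to change and synchronize schema objects.
Specifically, a worker node can treat a schema object change as a special kind of transaction. The worker node that initiates the schema change transaction acts as the coordinator of the commit, and the other worker nodes that need to synchronize the changed schema object act as participants. This approach presupposes that all other nodes in the distributed database system (that is, the participants) are known and reachable, and that the coordinator and every participant maintain their own node state and write logs for use by the commit protocol. In other words, worker nodes must be stateful, and no worker node may leave or join the system while a schema change is being made and synchronized; otherwise the change and synchronization fail.
However, to achieve better scalability and cost-effectiveness, most distributed databases today adopt an architecture that separates storage from computation and uses stateless worker nodes. In such a system, a stateless worker node is decoupled from the data storage layer. It only receives data operation requests from users, accesses data in the storage layer according to those requests, and executes processing tasks; it does not itself keep any persistent data or state. A stateless worker node may also leave or be added at any time because of failures, maintenance, scale-out, or scale-in. Metadata change and synchronization based on a commit protocol such as two-phase commit is therefore generally unsuitable for this kind of distributed database system.
In another implementation, schema objects may be changed and synchronized on a timer. Specifically, the worker node that changes a schema object writes the changed schema object to the global metadata store, and the other worker nodes periodically read schema objects from the global metadata store to synchronize.
In yet another implementation, changes to schema objects may be implemented using leases and intermediate states. In this approach, the schema object on each worker node is regarded as a state of the schema object (referred to here as a "schema state"), and changing and synchronizing a schema object is regarded as changing and synchronizing the schema state. For example, if the change made to the schema object of table t1 is to add an index on a column, the schema object without the index can be regarded as being in the absent state and the schema object with the index as being in the public state.
Under the lease mechanism, every schema state has a lease, and every worker node must renew the lease before it expires. When renewing, a worker node checks whether a new schema state exists in the global metadata store and, if so, loads the new state into its local cache. A worker node that cannot complete the renewal because of a failure or anomaly stops serving database users. The lease mechanism guarantees that any worker node able to serve users obtains the latest schema state within one lease period. To prevent data inconsistency caused by different worker nodes holding different schema states at the same moment, this kind of solution uses intermediate schema states, each of which is effective only for particular types of operation statement (for example, delete, insert, or update). With this approach, in the example above, the index added to the schema object of table t1 on a worker node is externally visible only when the schema object on that worker node is in the public state; when the schema object is in the absent state or another intermediate state, the added index either does not exist or is not externally visible.
As can be seen, with both the timer-based approach and the approach based on leases and intermediate states, after one worker node changes a schema object, the other worker nodes can synchronize the changed schema object only after some delay, that is, they cannot learn of the changed metadata in time. The situation described above may therefore occur: synchronization happens in the middle of a transaction, so that different operation statements of the same transaction see different schema information, or the user data and the schema information visible to the same transaction do not correspond.
On this basis, the embodiments of this application propose a database transaction processing method, apparatus, and server that ensure metadata changed before a target transaction starts becomes known in time to the worker node that needs to use it. This is described below.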
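Referring to FIG. 2, FIG. 2 is a schematic flowchart of a database transaction processing method provided by an embodiment of this application. The method can be applied to any of the worker nodes shown in FIG. 1, and is described in detail below.
S201: when starting a target transaction, obtain the transaction timestamp of the target transaction and the current global latest version information of the distributed database system, where the target transaction includes at least one operation statement for a target data object, and the global latest version information is the version information of the most recently generated metadata among the metadata stored in the distributed database system.
Starting the target transaction may mean that the target transaction begins to be executed. The transaction timestamp of each transaction may be a value generated by the global timestamp manager 400 from the current time information when the transaction starts. For example, when a transaction starts, a timestamp function may be applied to the current time information, and the resulting function value may serve as the transaction timestamp. In some embodiments, the timestamp function is monotonically increasing, so the transaction timestamp also increases with the start time of the transaction.
In this embodiment, when a worker node obtains any version of the metadata of any data object, it may request the global timestamp manager 400 to allocate version information for that metadata. Specifically, the version information of the metadata may also be a value generated by the global timestamp manager 400 by applying a timestamp function to the current time information. The global timestamp manager 400 may use the same timestamp function to generate both metadata version information and transaction timestamps, in which case the two belong to the same timestamp sequence. It may also use different timestamp functions for the two, in which case they belong to different timestamp sequences. This embodiment imposes no limitation in this respect.
When metadata version information is a timestamp generated from time information, the version information of different versions of the same metadata is ordered in time, and so is the version information of different metadata.
For example, suppose the distributed database system 10 contains metadata md1 and md2, where md1 has two versions, V11 and V12, and md2 has two versions, V21 and V22. Then V11, V12, V21, and V22 are ordered in time, and this order is reflected in their numerical order. When the timestamp function is monotonically increasing, metadata with larger version information was generated later and is newer. Unless otherwise stated, the description below assumes a monotonically increasing timestamp function.
It follows that among the version information of all metadata stored in the distributed database system 10 there must be a largest value, and the metadata with that version information is the most recently generated metadata. For example, when the distributed database system 10 stores only metadata md1 and md2, if V22 is the largest version information, then metadata md2 with version information V22 is the most recently generated metadata.
The global timestamp manager 400 may store the global latest version information, which is the version information of the most recently generated metadata among all metadata of the distributed database system 10, for example V22, the version information of metadata md2 in the example above. The global latest version information of the distributed database system 10 therefore changes as the metadata in the system 10 changes: whenever new metadata is produced in the system 10, the global latest version information is updated. Accordingly, the current global latest version information described in S201 is the global latest version information that the worker node obtains from the global timestamp manager when starting the target transaction. It represents the version information of the latest metadata produced in the distributed database system 10 before the target transaction starts.
S202: determine the metadata of the latest version of the target data object according to the current global latest version information, and determine the user data to be accessed by the target transaction according to the transaction timestamp.
In this embodiment, the worker node may access the global metadata store 300 based on the global latest version information obtained in S201, that is, the current global latest version information, and thereby obtain a metadata snapshot. This amounts to treating the metadata whose version information is less than or equal to the current global latest version information as the metadata to be accessed. The worker node can then look up, within the metadata to be accessed, the metadata of the latest version of the target data object, that is, the one with the largest version information among the versions of the metadata of the target data object.
In addition, the worker node may take a user data snapshot using the transaction timestamp of the target transaction, and thereby determine the user data visible to the target transaction, that is, the user data whose version information is less than or equal to the transaction timestamp. The user data determined to be visible to the target transaction is the user data to be accessed, that is, the user data that may be accessed while processing the target transaction.
S203: execute the operation statements in the target transaction on the user data to be accessed based on the metadata of the latest version of the target data object.
In practice, the worker node may access the metadata of the latest version of the target data object (for example, schema information), parse the operation statements in the target transaction according to that metadata, and process the user data to be accessed according to the parsing result. Parsing here may mean, for example, checking whether the objects that an operation statement of the target transaction acts on (such as a data table, or the rows, columns, or primary key of a data table) exist. If the parsing result shows that they exist, the worker node can go on to execute the operation statement.
With the flow shown in FIG. 2, any metadata change in the distributed database system triggers an update of the global latest version information, and the worker node determines the metadata of the latest version of the target data object from the current global latest version information when it starts the target transaction. This ensures that metadata changed before the target transaction starts becomes known to the worker node in time. In other words, if a metadata change occurs before the target transaction starts and the worker node happens to need that metadata while executing the target transaction, this solution ensures that the worker node uses the changed metadata throughout the execution of the target transaction.
Furthermore, because this solution ensures that the worker node uses the changed metadata throughout the execution of the target transaction, it avoids the situation in which different operation statements of the same transaction access inconsistent metadata. In addition, the worker node obtains the changed metadata from the very beginning, which avoids the problem in the earlier example where the user data and the metadata visible to the same transaction did not correspond because the metadata change was not visible in time.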
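Referring to FIG. 2 and FIG. 3 together, the flow shown in FIG. 2 is described further below. Specifically, the step in S201 of obtaining the transaction timestamp of the target transaction and the current global latest version information of the distributed database system can be implemented by the flow shown in FIG. 3.
S201-1: send a second timestamp allocation request corresponding to the target transaction to the global timestamp manager.
The second timestamp allocation request is a timestamp allocation request issued for a transaction, asking the global timestamp manager 400 to allocate a transaction timestamp for that transaction. The name "second timestamp allocation request" is used only to distinguish it from the first timestamp allocation request described later; "first" and "second" do not rank the requests by importance.
S201-2: receive the response information returned by the global timestamp manager based on the second timestamp allocation request, the response information including the transaction timestamp of the target transaction and the current global latest version information.
In this embodiment, on receiving the second timestamp allocation request, the global timestamp manager 400 may generate a timestamp from the current time information as the transaction timestamp, obtain the current global latest version information, fill both into the response information corresponding to the second timestamp allocation request, and send the response information to the worker node. Note that generating the transaction timestamp and obtaining the current global latest version information are atomic: either both operations happen or neither does. In practice, this atomicity can be achieved by protecting the two operations with an atomic lock. It can of course be achieved in other ways, and this embodiment imposes no limitation in this respect.
The flow shown in FIG. 3 ensures that the transaction timestamp of the target transaction and the current global latest version information correspond to essentially the same point in time. Accordingly, the user data to be accessed, determined from the transaction timestamp, and the metadata of the latest version of the target data object, determined from the current global latest version information, are mutually corresponding versions.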
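One possible way to make the two operations in S201-2 atomic is to perform both under a single lock, as sketched below. This is an assumption-laden illustration, not the GTM of the embodiments: the class `GlobalTimestampManager`, its methods, and the use of a simple counter in place of a time-based timestamp function are all hypothetical.

```python
import threading


class GlobalTimestampManager:
    """Toy GTM; a counter stands in for the timestamp function (assumption)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._clock = 0   # last timestamp handed out
        self._glsv = 0    # global latest version information

    def set_global_latest_version(self, version):
        """Record the version of the most recently committed metadata."""
        with self._lock:
            self._glsv = version

    def begin_transaction(self):
        """Handle a second timestamp allocation request."""
        with self._lock:
            # Both steps run under one lock, so no update of the global
            # latest version information can slip in between them.
            self._clock += 1
            return self._clock, self._glsv


gtm = GlobalTimestampManager()
txn_ts, current_glsv = gtm.begin_transaction()
```

Because the lock covers both the timestamp generation and the read, the returned pair always describes a single point in time from the manager's perspective.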
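In this embodiment, the global latest version information maintained by the global timestamp manager 400 can be updated by the flow shown in FIG. 4.
S401: receive a change instruction for the metadata of any data object, and generate the changed metadata of that data object according to the change instruction.
The change instruction may be any instruction that causes metadata to change, for example an instruction that inserts, deletes, or modifies. It may be entered by a user or a database administrator, or triggered by an operation on a data object. This embodiment imposes no limitation in this respect.
When any worker node in the distributed database system 10 receives a change instruction for the metadata of a data object, it may take a snapshot of the current version of the metadata of that data object, where the version information of the snapshot may be a timestamp generated from the current time information. It then executes the change instruction on the metadata of the data object to obtain a new version of the metadata, that is, the changed metadata.
At this point, the worker node may request the global timestamp manager 400 to allocate version information for the changed metadata. Accordingly, the database transaction processing method provided by this embodiment may further include the steps shown in FIG. 5.
S501: when the changed metadata is generated, send a first timestamp allocation request to the global timestamp manager.
S502: receive the timestamp returned by the global timestamp manager based on the first timestamp allocation request, and determine that timestamp as the version information of the changed metadata.
In this embodiment, the first timestamp allocation request is a timestamp allocation request sent for metadata, and the timestamp returned in response to it is the version information of the metadata.
On receiving the first timestamp allocation request from the worker node, the global timestamp manager 400 may generate a timestamp from the current time information and return it to the worker node in the response information corresponding to the first timestamp allocation request. The worker node receives the response information, extracts the timestamp from it, and determines the extracted timestamp as the version information of the changed metadata.
S402: commit the changed metadata to the global metadata store.
In this embodiment, the steps shown in FIG. 5 may be performed before S402. In practice, the worker node may commit the changed metadata together with its version information to the global metadata store 300. The changed metadata and its version information are written to the global metadata store 300, which may return a commit acknowledgement once the write succeeds. At that point, the changed metadata and its version information are persistently stored in the global metadata store 300.
S403: when the changed metadata is successfully committed, update the global latest version information to the version information of the changed metadata.
If the worker node receives the commit acknowledgement returned by the global metadata store 300, it can consider the changed metadata successfully committed. On receiving the commit acknowledgement, the worker node may therefore send a commit success notification to the global timestamp manager 400. For example, the commit success notification may include the version information of the changed metadata, so that the global timestamp manager 400 can, according to the received notification, change the global latest version information it stores to the version information of the changed metadata.
Specifically, on receiving the commit success notification, the global timestamp manager 400 can determine, from the version information of the changed metadata carried in the notification, that the metadata with that version information has been successfully committed to the global metadata store 300, and can then update the stored global latest version information to the version information carried in the notification, that is, the version information of the changed metadata.
In this embodiment, if the changed metadata is not committed successfully, the worker node may repeat S402. It can be understood that the worker node maintains a persistent log or task queue while changing metadata, and the persistent log or task queue records the progress of the metadata change. If the worker node performing the metadata change fails during the change, that worker node or another worker node can, after recovery, continue the metadata change based on the persistent log or task queue until the changed metadata is committed successfully.
In this embodiment, the global latest version information can be understood as a variable, and updating the global latest version information as updating the value of that variable. To distinguish it from the other variables described later, the global latest version information is called the first variable in this document.
The flow shown in FIG. 4 ensures that the global latest version information stored by the global timestamp manager 400 is the version information of the most recently generated metadata in the distributed database system 10.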
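The change flow of FIG. 4 and FIG. 5 (allocate version information, commit to the global metadata store, then notify the global timestamp manager) can be sketched as below. This is a simplified single-process illustration under stated assumptions: the classes `TimestampManager` and `GlobalMetadataStore`, the function `change_metadata`, and the `max_retries` bound are hypothetical, and persistence, logging, and failure recovery are omitted.

```python
class TimestampManager:
    """Minimal stand-in for the global timestamp manager (assumption)."""

    def __init__(self):
        self.clock = 0
        self.glsv = 0   # first variable: global latest version information

    def allocate(self):
        """Answer a first timestamp allocation request (S501/S502)."""
        self.clock += 1
        return self.clock

    def notify_committed(self, version):
        """Handle a commit success notification (S403)."""
        self.glsv = version


class GlobalMetadataStore:
    """Minimal stand-in for the global metadata store (assumption)."""

    def __init__(self):
        self.versions = {}   # object name -> {version: metadata}

    def commit(self, obj, version, metadata):
        self.versions.setdefault(obj, {})[version] = metadata
        return True          # plays the role of the commit acknowledgement


def change_metadata(gtm, store, obj, new_metadata, max_retries=3):
    version = gtm.allocate()                      # S501/S502
    for _ in range(max_retries):                  # repeat S402 on failure
        if store.commit(obj, version, new_metadata):
            gtm.notify_committed(version)         # S403
            return version
    raise RuntimeError("metadata commit did not succeed")


gtm = TimestampManager()
store = GlobalMetadataStore()
v = change_metadata(gtm, store, "t1", {"columns": ["C2"]})
```

The ordering is the point of the sketch: the global latest version information is advanced only after the store has acknowledged the commit, so any version a transaction later observes is already readable from the store.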
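For ease of description, the target data object that the target transaction operates on is referred to here as the first data object. As described above, a metadata change by any worker node triggers an update of the global latest version information in the global timestamp manager 400. It is therefore possible that, before the target transaction starts, no new version of the metadata of the target data object has been produced, or that the new version has already been synchronized to the local cache of the worker node that starts the target transaction. In that case, the latest metadata of the target data object in the local cache of the worker node is already the latest metadata of the target data object in the distributed database system 10, and the worker node does not need to access the global metadata store 300 while performing S202.
Therefore, to avoid unnecessary accesses by the worker node to the global metadata store 300, and thereby reduce the number of network IO (input/output) operations, reduce network bandwidth usage, and improve system performance, S202 shown in FIG. 2 can be implemented in this embodiment by the flow shown in FIG. 6.
S202-1: obtain the version information of the local latest metadata of the target data object from the local cache.
The local cache here is the local cache of the worker node that starts the target transaction. The local cache of each worker node may maintain the version information of the local latest metadata of each data object. The version information of the local latest metadata can be understood as a variable, referred to here as the second variable. The value of the second variable can be updated, and the local cache of each worker node holds a second variable for each data object.
For example, local cache 121 of worker node 120 holds a second variable for each data object in the distributed database system 10, and the value of the second variable represents the version information of the latest metadata of that data object in local cache 121. Taking data table t1 as the data object, local cache 121 stores the versions of the metadata of data table t1, and the value of the second variable of data table t1 in local cache 121 is the newest among the version information of those versions.
Taking the case where the database transaction processing method provided by this embodiment is applied to worker node 120 shown in FIG. 1, suppose worker node 120 starts a transaction T2 that includes at least one operation statement for data table t2. Then transaction T2 can be regarded as the target transaction, worker node 120 as the node that starts and processes target transaction T2, data table t2 as the target data object (or first data object), and local cache 121 of worker node 120 as the local cache in S202-1.
In practice, worker node 120 may look up the second variable of data table t2 in local cache 121 and obtain its current value, which is the version information of the latest metadata of data table t2 in local cache 121.
S202-2: compare whether the version information of the local latest metadata of the target data object is the same as the current global latest version information. If so, perform S202-3; if not, S202-4 may be performed.
S202-3: from the metadata stored in the local cache, obtain the local latest metadata of the target data object as the metadata of the latest version of the target data object.
In this embodiment, to determine whether the local latest metadata of the target data object in the local cache is the global latest metadata of the target data object, the version information of the local latest metadata obtained in S202-1 can be compared with the current global latest version information obtained in S201.
If the two are the same, the local latest metadata of the target data object in the local cache is the global latest metadata in the distributed database system 10, and it can be obtained directly from the local cache without accessing the global metadata store 300.
S202-4: search the metadata of each version of the target data object stored in the local cache and the global metadata store for target metadata, where the version information of the target metadata is newer than the version information of the local latest metadata of the target data object and not newer than the current global latest version information. If target metadata is found, S202-5 may be performed; if not, S202-6 may be performed.
S202-5: determine the target metadata with the latest version information among the found target metadata as the metadata of the latest version of the target data object.
If the version information of the local latest metadata of the target data object in the local cache differs from the current global latest version information, one or more metadata changes have occurred before the target transaction started, and those changes triggered updates of the global latest version information. The changes may or may not include a change to the metadata of the target data object.
If the one or more metadata changes include a change to the metadata of the target data object, target metadata must exist in the local cache or the global metadata store 300, that is, metadata whose version information lies between the version information of the local latest metadata of the target data object and the current global latest version information. This is because the global latest version information is updated only after the changed metadata that triggers the update has been successfully committed to the global metadata store 300. Therefore, all metadata whose version information is the current global latest version information or earlier has already been written to the global metadata store 300 and can be found.
Among the versions of the metadata of the target data object, the target metadata with the latest version information was produced by the most recent change, and is therefore the metadata of the latest version of the target data object in the distributed database system 10. Accordingly, the worker node can update that target metadata to the local cache, so that the local cache holds the metadata of the latest version of the target data object, and update the version information of the local latest metadata of the target data object to the version information of that target metadata (that is, the version information of the metadata of the latest version of the target data object).
S202-6: determine the local latest metadata of the target data object as the metadata of the latest version of the target data object.
If the one or more metadata changes do not include a change to the metadata of the target data object, the local latest metadata of the target data object is already the metadata of the latest version of the target data object in the distributed database system 10. In other words, no target metadata exists among the metadata of the target data object stored in the local cache or the global metadata store 300. Therefore, when the worker node cannot find target metadata in S202-4, it can directly use the local latest metadata of the target data object as the metadata of the latest version of the target data object.
It can be understood that the metadata of the latest version of the target data object described in S202, S202-3, S202-5, and S202-6 always refers to the metadata of the latest version of the target data object in the entire distributed database system 10.
In this embodiment, after performing S202-6, the worker node may also update the version information of the local latest metadata of the target data object in the local cache (that is, the value of the second variable) to the current global latest version information (that is, the current value of the first variable).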
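The decision flow of FIG. 6 (S202-1 to S202-6) can be summarized in a short sketch. It is only an illustration under stated assumptions: versions are plain integers that increase over time, and `LocalCacheEntry`, `resolve_latest_metadata`, and the dictionary layouts are hypothetical names, not structures defined by the embodiments.

```python
from dataclasses import dataclass


@dataclass
class LocalCacheEntry:
    versions: dict   # version -> metadata cached locally for one data object
    egsv: int        # second variable: version info of the local latest metadata


def resolve_latest_metadata(obj, glsv, local_cache, global_store):
    """Return the latest metadata of `obj` as of global version `glsv`."""
    entry = local_cache[obj]                                  # S202-1
    local_latest = entry.versions[max(entry.versions)]
    if entry.egsv == glsv:                                    # S202-2
        return local_latest                                   # S202-3
    # S202-4: look in both stores for versions in (egsv, glsv].
    pool = {**global_store.get(obj, {}), **entry.versions}
    targets = [v for v in pool if entry.egsv < v <= glsv]
    if targets:                                               # S202-5
        newest = max(targets)
        entry.versions[newest] = pool[newest]   # refresh the local cache
        entry.egsv = newest
        return pool[newest]
    entry.egsv = glsv                                         # S202-6
    return local_latest


# Hypothetical state: table t2 was changed at version 4, table t3 was not.
global_store_300 = {"t2": {2: "O2@2", 4: "O2@4"}, "t3": {3: "O3@3"}}
cache_121 = {
    "t2": LocalCacheEntry({2: "O2@2"}, egsv=2),
    "t3": LocalCacheEntry({3: "O3@3"}, egsv=3),
}
latest_t2 = resolve_latest_metadata("t2", 4, cache_121, global_store_300)
latest_t3 = resolve_latest_metadata("t3", 4, cache_121, global_store_300)
```

After the two calls, both entries carry the value 4 in `egsv`, so a later transaction that observes the same global latest version information can be served from the local cache alone.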
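In addition, in this embodiment the changed metadata needs not only to be written to the global metadata store 300 but also to be synchronized to the worker nodes. The worker node that produces the changed metadata can synchronize it using the flow shown in FIG. 7. Accordingly, that worker node can update the second variable of the data object whose metadata has changed. This is described in detail below.
S701: when the changed metadata is successfully committed, write the changed metadata to the local cache.
In practice, after the worker node commits the changed metadata to the global metadata store 300 and receives the commit acknowledgement returned by the global metadata store 300, it can write the changed metadata to its local cache.
S702: in the local cache, update the version information of the local latest metadata of the data object whose metadata has changed to the version information of the changed metadata.
The data object whose metadata has changed is the data object to which the changed metadata belongs. After the worker node has successfully written the changed metadata to the local cache, it can change the version information of the local latest metadata of that data object (that is, the value of the second variable) to the version information of the changed metadata.
To help a person skilled in the art understand the solution more clearly, some specific examples are given below, based on the scenario shown in FIG. 1, to describe in detail the database transaction processing method provided by this embodiment.
Suppose that in the initial state the distributed database system 10 contains data tables t1, t2, and t3, where the metadata of data table t1 is the V10 version of schema object O1, the metadata of data table t2 is the V20 version of schema object O2, and the metadata of data table t3 is the V30 version of schema object O3.
The global metadata store 300 stores the V10 version of schema object O1, the V20 version of schema object O2, and the V30 version of schema object O3. The global timestamp manager 400 maintains the first variable GLSV (global largest schema version). The current value of GLSV is V20, indicating that the largest version information currently in the distributed database system 10 is V20, that is, the latest version information is V20.
Local caches 111, 121, and 131 each cache the V10 version of schema object O1, the V20 version of schema object O2, and the V30 version of schema object O3, and each maintains the second variable EGSV-1 (equivalent global schema version-1) for data table t1, the second variable EGSV-2 for data table t2, and the second variable EGSV-3 for data table t3. The current value of EGSV-1 is V10, the current value of EGSV-2 is V20, and the current value of EGSV-3 is V30.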
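In one example, the database transaction processing method provided by this embodiment may include the flow shown in FIG. 8.
S1: worker node 110 receives a change instruction for schema object O1 of data table t1, and generates the changed schema object O1 according to the change instruction.
S2: worker node 110 sends a first timestamp allocation request r1 to the global timestamp manager 400.
S3: the global timestamp manager 400 generates a timestamp V11 from the current time information.
S4: the global timestamp manager 400 sends timestamp V11 to worker node 110.
S5: worker node 110 uses timestamp V11 as the version information of the changed schema object O1, obtaining the V11 version of schema object O1.
S6: worker node 110 commits the V11 version of schema object O1 to the global metadata store 300.
S7: after storing the V11 version of schema object O1, the global metadata store 300 returns a commit acknowledgement to worker node 110.
S8: on receiving the commit acknowledgement, worker node 110 sends a notification carrying version information V11 to the global timestamp manager 400.
S9: the global timestamp manager 400 updates the current value of GLSV to V11 according to the notification.
S10: on receiving the commit acknowledgement, worker node 110 writes the V11 version of schema object O1 to local cache 111 and updates the current value of EGSV-1 in local cache 111 to V11.
S8 and S10 may be performed in parallel.
If worker node 110 now starts transaction T1, which includes operation statements for data table t1, processing can proceed as follows.
S11: when starting transaction T1, worker node 110 sends a second timestamp allocation request r2 to the global timestamp manager 400. Transaction T1 here is the target transaction.
S12: the global timestamp manager 400 receives r2, generates a timestamp V40 from the current time information, obtains the current value V11 of GLSV, and fills V40 and V11 into the response information corresponding to r2.
S13: the global timestamp manager 400 returns the response information corresponding to r2 to worker node 110.
S14: worker node 110 obtains the current value V11 of EGSV-1 from local cache 111, determines that the current value V11 of EGSV-1 is the same as the current value V11 of GLSV, and obtains schema object O1 with version information V11 from local cache 111.
S15: worker node 110 uses V40 as the transaction timestamp of transaction T1, obtains a user data snapshot based on V40, and determines the user data whose version information is less than or equal to V40 as the user data data1 to be accessed by transaction T1.
S16: worker node 110 parses the operation statements in transaction T1 according to the V11 version of schema object O1, and executes the operation statements in transaction T1 on data1 according to the parsing result.
In another example, if worker node 110 starts transaction T1 after S9 has been performed but before S10 has been performed, the database transaction processing method provided by this embodiment may first proceed according to S11 to S13, and after S13 may proceed according to the flow shown in FIG. 9.
S17: worker node 110 obtains the current value V10 of EGSV-1 from local cache 111 and determines that the current value V10 of EGSV-1 differs from the current value V11 of GLSV. It then searches the versions of schema object O1 stored in local cache 111 and the global metadata store 300 for a schema object O1 whose version information is greater than V10 and less than or equal to V11 (that is, target metadata).
S18: worker node 110 finds the V11 version of schema object O1 in the global metadata store 300, determines it as the latest version of schema object O1, synchronizes the V11 version of schema object O1 to local cache 111, and updates the current value of EGSV-1 in local cache 111 to V11.
S19: worker node 110 obtains a user data snapshot based on transaction timestamp V40, and determines the user data whose version information is less than or equal to V40 as the user data data1 to be accessed by transaction T1.
S20: worker node 110 parses the operation statements in transaction T1 based on the V11 version of schema object O1, and executes the operation statements in transaction T1 on data1 according to the parsing result.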
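Building on the examples above, suppose worker node 110 receives at time 2 a change instruction for schema object O2 of data table t2. It can be processed by a flow similar to S1 to S10 above. After that processing, the global metadata store 300 contains a new version (V21) of schema object O2, and the current value of GLSV is V21. Local cache 111 of worker node 110 contains the V21 version of schema object O2, and the current value of EGSV-2 in local cache 111 is V21. Worker nodes 120 and 130 do not have the V21 version of schema object O2, and the current value of EGSV-2 in local caches 121 and 131 is still V20.
Suppose worker node 120 now starts transaction T2, which includes operation statements for data table t2. In this case, the database transaction processing method provided by this embodiment may proceed according to the flow shown in FIG. 10:
S21: when starting transaction T2, worker node 120 sends a second timestamp allocation request r3 to the global timestamp manager 400. Transaction T2 here can be regarded as the target transaction.
S22: the global timestamp manager 400 receives r3, generates a timestamp V41 from the current time information, obtains the current value V21 of GLSV, and fills V41 and V21 into the response information corresponding to r3.
S23: the global timestamp manager 400 returns the response information to worker node 120.
S24: worker node 120 obtains the current value V20 of EGSV-2 from local cache 121 and determines that the current value V20 of EGSV-2 differs from the current value V21 of GLSV. It then searches the versions of schema object O2 stored in local cache 121 and the global metadata store 300 for a schema object O2 whose version information is greater than V20 and less than or equal to V21 (that is, target metadata).
S25: worker node 120 finds the V21 version of schema object O2 in the global metadata store 300, determines it as the latest version of schema object O2, synchronizes the V21 version of schema object O2 to local cache 121, and updates the current value of EGSV-2 in local cache 121 to V21.
S26: worker node 120 determines the user data data2 to be accessed by transaction T2 based on transaction timestamp V41, parses the operation statements in transaction T2 based on the V21 version of schema object O2, and executes the operation statements in transaction T2 on data2 based on the parsing result.
Building on the examples above, suppose worker node 130 starts transaction T3 at time 3, and transaction T3 includes operation statements for data table t3. The database transaction processing method provided by this embodiment may then proceed according to the flow shown in FIG. 11:
S27: when starting transaction T3, worker node 130 sends a second timestamp allocation request r4 to the global timestamp manager 400. Transaction T3 here can be regarded as the target transaction.
S28: the global timestamp manager 400 receives r4, generates a timestamp V42 from the current time information, obtains the current value V21 of GLSV, and fills V42 and V21 into the response information corresponding to r4.
S29: the global timestamp manager 400 returns the response information to worker node 130.
S30: worker node 130 obtains the current value V30 of EGSV-3 from local cache 131 and determines that the current value V30 of EGSV-3 differs from the current value V21 of GLSV. It then searches the versions of schema object O3 stored in local cache 131 and the global metadata store 300 for a schema object O3 whose version information is greater than V30 and less than or equal to V21 (that is, target metadata).
S31: worker node 130 finds no target metadata, so it determines the V30 version of schema object O3 as the latest version of schema object O3, and updates the current value of EGSV-3 in local cache 131 to V21.
S32: worker node 130 determines the user data data3 to be accessed by transaction T3 based on transaction timestamp V42, parses the operation statements in transaction T3 based on the schema object O3 that is equivalent to version V21 (that is, the V30 version of schema object O3), and executes the operation statements in transaction T3 on data3 based on the parsing result.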
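In the technical solution provided by this application, the distributed database system maintains global latest version information, which is the version information of the metadata with the latest version information among all metadata of the system. When starting a target transaction for a target data object, a worker node can obtain the transaction timestamp and the current global latest version information, determine the metadata of the latest version of the target data object according to the current global latest version information, determine the user data visible to the target transaction as the user data to be accessed according to the transaction timestamp, and execute the operation statements in the target transaction on the user data to be accessed based on the metadata of the latest version of the target data object. Because any metadata change in the distributed database system triggers an update of the global latest version information, and the worker node determines the metadata of the latest version of the target data object from the current global latest version information when starting the target transaction, metadata changed before the target transaction starts becomes visible in time to the worker node that needs to use it. In addition, the technical solution of this application is easy to implement and apply in mainstream database management systems that already support multi-version concurrency control, introduces no significant additional performance overhead, and does not affect the availability or scalability of the distributed database system.
Referring to FIG. 12, a structural block diagram of a database transaction processing apparatus 1200 provided by an embodiment of this application is shown. The apparatus 1200 may include an obtaining module 1210, a determining module 1220, and a transaction processing module 1230.
The obtaining module 1210 is configured to obtain, when the worker node starts a target transaction, the transaction timestamp of the target transaction and the current global latest version information of the distributed database system, where the target transaction includes at least one operation statement for a target data object, and the global latest version information is the version information of the most recently generated metadata among the metadata stored in the distributed database system.
The determining module 1220 is configured to determine the metadata of the latest version of the target data object according to the current global latest version information, and to determine the user data to be accessed by the target transaction according to the transaction timestamp.
The transaction processing module 1230 is configured to execute the operation statements in the target transaction on the user data to be accessed based on the metadata of the latest version of the target data object.
The apparatus 1200 may further include a change module. The change module is configured to: receive a change instruction for the metadata of any data object, and generate the changed metadata of that data object according to the change instruction; commit the changed metadata to the global metadata store; and, when the changed metadata is successfully committed, update the global latest version information to the version information of the changed metadata.
The change module may be further configured to: when the changed metadata is generated, send a first timestamp allocation request to the global timestamp manager; and receive the timestamp returned by the global timestamp manager based on the first timestamp allocation request, and determine that timestamp as the version information of the changed metadata.
The local cache of the worker node stores the version information of the local latest metadata of data objects. The change module may be further configured to: when the changed metadata is successfully committed, write the changed metadata to the local cache; and, in the local cache, update the version information of the local latest metadata of the data object whose metadata has changed to the version information of the changed metadata.
The determining module 1220 may determine the metadata of the latest version of the target data object according to the current global latest version information by: obtaining the version information of the local latest metadata of the target data object from the local cache; comparing whether the version information of the local latest metadata of the target data object is the same as the current global latest version information; and, if they are the same, obtaining the local latest metadata of the target data object from the metadata stored in the local cache as the metadata of the latest version of the target data object.
The determining module 1220 may also determine the metadata of the latest version of the target data object according to the current global latest version information by: if the version information of the local latest metadata of the target data object differs from the current global latest version information, searching the metadata of each version of the target data object stored in the local cache and the global metadata store for target metadata, where the version information of the target metadata is newer than the obtained local latest version information and not newer than the current global latest version information; and, if such target metadata exists, determining the target metadata with the latest version information among the found target metadata as the metadata of the latest version of the target data object.
The determining module 1220 may be further configured to: after determining the target metadata with the latest version information among the found target metadata as the metadata of the latest version of the target data object, update that target metadata to the local cache; and update the version information of the local latest metadata of the target data object to the version information of that target metadata.
The determining module 1220 may also determine the metadata of the latest version of the target data object according to the current global latest version information by: if the target metadata does not exist, determining the local latest metadata of the target data object as the metadata of the latest version of the target data object, and updating the version information of the local latest metadata of the target data object to the current global latest version information.
The global latest version information is stored in the global timestamp manager of the distributed database system. The obtaining module 1210 may obtain the transaction timestamp of the target transaction and the current global latest version information of the distributed database system by: sending a second timestamp allocation request corresponding to the target transaction to the global timestamp manager; and receiving response information returned by the global timestamp manager based on the second timestamp allocation request, the response information including the transaction timestamp of the target transaction and the current global latest version information.
The transaction processing module 1230 may execute the operation statements in the target transaction on the user data to be accessed, based on the metadata of the latest version of the target data object, by: parsing the operation statements in the target transaction according to the schema information of the latest version of the target data object; and processing the user data to be accessed according to the parsing result.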
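A person skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the apparatus and modules described above can be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
In the several embodiments provided by this application, the coupling, direct coupling, or communication connection between the displayed or discussed modules may be through some interfaces, and the indirect coupling or communication connection between apparatuses or modules may be electrical, mechanical, or in other forms.
In addition, the functional modules in the embodiments of this application may be integrated into one processing module, each module may exist alone physically, or two or more modules may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module.
Referring to FIG. 13, a structural block diagram of a server 1300 provided by an embodiment of this application is shown. The server 1300 may be the server where the above-mentioned worker node is located. It may be an independent server or a server cluster composed of multiple physical servers, or a cloud server that provides basic cloud computing services such as cloud computing, big data, and artificial intelligence platforms. The server 1300 in this application may include one or more of the following components: a processor 1310, a memory 1320, and one or more application programs, where the one or more application programs may be stored in the memory 1320 and configured to be executed by the one or more processors 1310, and the one or more programs are configured to execute the method described in the foregoing method embodiments.
The processor 1310 may include one or more processing cores. The processor 1310 uses various interfaces and lines to connect the various parts of the entire server 1300, and performs the various functions of the server 1300 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 1320 and by calling data stored in the memory 1320. Optionally, the processor 1310 may be implemented in at least one hardware form of digital signal processing (DSP), field-programmable gate array (FPGA), and programmable logic array (PLA). The processor 1310 may integrate one or a combination of a central processing unit (CPU), a graphics processing unit (GPU), a modem, and the like. The CPU mainly handles the operating system, user interface, application programs, and so on; the GPU is responsible for rendering and drawing display content; and the modem handles wireless communication. It can be understood that the modem may also not be integrated into the processor 1310 and may instead be implemented by a separate communication chip.
The memory 1320 may include random access memory (RAM) or read-only memory (ROM). The memory 1320 may be used to store instructions, programs, code, code sets, or instruction sets. The memory 1320 may include a program storage area and a data storage area, where the program storage area may store instructions for implementing the operating system, instructions for implementing at least one function (such as a touch function, a sound playback function, or an image playback function), instructions for implementing the various method embodiments, and so on. The data storage area may also store data (such as metadata) created by the server 1300 during use.
Referring to FIG. 14, a structural block diagram of a computer-readable storage medium provided by an embodiment of this application is shown. The computer-readable storage medium 1400 stores program code, and the program code can be invoked by a processor to execute the method described in the foregoing method embodiments.
The computer-readable storage medium 1400 may be an electronic memory such as flash memory, EEPROM (electrically erasable programmable read-only memory), EPROM, a hard disk, or ROM. In some embodiments, the computer-readable storage medium 1400 includes a non-transitory computer-readable storage medium. The computer-readable storage medium 1400 has storage space for program code 1410 that executes any of the method steps in the methods described above. The program code can be read from or written to one or more computer program products. The program code 1410 may, for example, be compressed in a suitable form.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this application.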

Claims (16)

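  1. A database transaction processing method, applied to a worker node of a distributed database system, the method comprising:
    when starting a target transaction, obtaining a transaction timestamp of the target transaction and current global latest version information of the distributed database system, wherein the target transaction comprises at least one operation statement for a target data object, and the global latest version information is the version information of the most recently generated metadata among the metadata stored in the distributed database system;
    determining metadata of the latest version of the target data object according to the current global latest version information, and determining user data to be accessed by the target transaction according to the transaction timestamp; and
    executing the operation statement in the target transaction on the user data to be accessed based on the metadata of the latest version of the target data object.
  2. The method according to claim 1, wherein the method further comprises:
    receiving a change instruction for the metadata of any data object, and generating changed metadata of the data object according to the change instruction;
    committing the changed metadata to a global metadata store; and
    when the changed metadata is successfully committed, updating the global latest version information to the version information of the changed metadata.
  3. The method according to claim 2, wherein, before the committing the changed metadata to the global metadata store, the method further comprises:
    when the changed metadata is generated, sending a first timestamp allocation request to a global timestamp manager; and
    receiving a timestamp returned by the global timestamp manager based on the first timestamp allocation request, and determining the timestamp as the version information of the changed metadata.
  4. The method according to claim 2, wherein the global latest version information is stored in a global timestamp manager, and the updating the global latest version information to the version information of the changed metadata comprises:
    sending a commit success notification to the global timestamp manager, so that the global timestamp manager updates the stored global latest version information to the version information of the changed metadata according to the commit success notification.
  5. The method according to any one of claims 2 to 4, wherein a local cache of the worker node stores version information of local latest metadata of data objects, and the method further comprises:
    when the changed metadata is successfully committed, writing the changed metadata to the local cache; and
    in the local cache, updating the version information of the local latest metadata of the data object whose metadata has changed to the version information of the changed metadata.
  6. The method according to any one of claims 1 to 4, wherein the determining metadata of the latest version of the target data object according to the current global latest version information comprises:
    determining, according to the current global latest version information, metadata to be accessed in a global metadata store of the distributed database system, and searching the determined metadata to be accessed for the metadata of the latest version of the target data object, wherein the metadata to be accessed is the metadata in the global metadata store whose version information is less than or equal to the current global latest version information.
  7. The method according to any one of claims 1 to 4, wherein the determining metadata of the latest version of the target data object according to the current global latest version information comprises:
    obtaining version information of local latest metadata of the target data object from a local cache;
    comparing whether the version information of the local latest metadata of the target data object is the same as the current global latest version information; and
    if they are the same, obtaining, from the metadata stored in the local cache, the local latest metadata of the target data object as the metadata of the latest version of the target data object.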
  8. 根据权利要求7所述的方法,其中,所述根据所述当前的全局最新版本信息确定所述目标数据对象的最新版本的元数据,还包括:
    若所述目标数据对象的本地最新元数据的版本信息与所述当前的全局最新版本信息不相同,从所述本地缓存和全局元数据存储中存储的所述目标数据对象的各版本的元数据中,查找是否存在目标元数据,查找到的目标元数据的版本信息比所述目标数据对象的本地最新元数据的版本信息新、且不比所述当前的全局最新版本信息新;
    若存在,将所述查找到的目标元数据中版本信息最新的目标元数据确定为所述目标数据对象的最新版本的元数据。
  9. 根据权利要求8所述的方法,其中,在所述将所述查找到的目标元数据中版本信息最新的目标元数据确定为所述目标数据对象的最新版本的元数据之后,所述方法还包括:
    将所述版本信息最新的目标元数据更新至所述本地缓存;
    将所述目标数据对象的本地最新元数据的版本信息,更新为所述版本信息最新的目标元数据对应的版本信息。
  10. 根据权利要求8所述的方法,其中,所述根据所述当前的全局最新版本信息确定所述目标数据对象的最新版本的元数据,还包括:
    若不存在所述目标元数据,将所述目标数据对象的本地最新元数据确定为所述目标数据对象的最新版本的元数据。
  11. 根据权利要求10所述的方法,其中,所述方法还包括:
    若不存在目标元数据,将所述目标数据对象的本地最新元数据的版本信息更新为所述当前的全局最新版本信息。
  12. 根据权利要求1-4中任意一项所述的方法,其中,所述全局最新版本信息存储于所述分布式数据库系统的全局时间戳管理器中,所述获取所述目标事务的事务时间戳及所述分布式数据库系统当前的全局最新版本信息,包括:
    向所述全局时间戳管理器发送所述目标事务对应的第二时间戳分配请求;
    接收所述全局时间戳管理器基于所述第二时间戳分配请求返回的响应信息,该响应信息包括所述目标事务的事务时间戳及所述当前的全局最新版本信息。
  13. 根据权利要求1-4中任意一项所述的方法,其中,所述基于所述目标数据对象的最新版本的元数据对所述待访问用户数据执行所述目标事务中的操作语句,包括:
    根据所述目标数据对象的最新版本的模式信息,对所述目标事务中的操作语句进行解析;
    根据解析结果对所述待访问用户数据进行处理。
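  14. A database transaction processing apparatus, applied to a worker node of a distributed database system, the apparatus comprising:
    an obtaining module, configured to obtain, when the worker node starts a target transaction, a transaction timestamp of the target transaction and current global latest version information of the distributed database system, wherein the target transaction comprises at least one operation statement directed at a target data object, and the global latest version information is the version information of the most recently generated metadata among all metadata stored in the distributed database system;
    a determining module, configured to determine the latest version of metadata of the target data object according to the current global latest version information, and to determine to-be-accessed user data of the target transaction according to the transaction timestamp;
    a transaction processing module, configured to execute the operation statement in the target transaction on the to-be-accessed user data based on the latest version of metadata of the target data object.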
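  15. A server, comprising:
    one or more processors;
    a memory;
    one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs being configured to perform the method according to any one of claims 1 to 13.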
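  16. A computer-readable storage medium, storing program code, the program code being callable by a processor to perform the method according to any one of claims 1 to 13.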
PCT/CN2021/096691 2020-06-10 2021-05-28 Database transaction processing method and apparatus, server, and storage medium WO2021249207A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
EP21822853.4A EP4030315A4 (en) 2020-06-10 2021-05-28 METHOD AND APPARATUS FOR DATABASE TRANSACTION PROCESSING, AND SERVER AND STORAGE MEDIUM
KR1020227015834A KR20220076522A (ko) 2020-06-10 2021-05-28 Database transaction processing method and apparatus, server, and storage medium
JP2022555830A JP7497907B2 (ja) 2020-06-10 2021-05-28 Database transaction processing method, database transaction processing apparatus, server, and computer program
US17/743,293 US20220276998A1 (en) 2020-06-10 2022-05-12 Database transaction processing method and apparatus, server, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010520559.2 2020-06-10
CN202010520559.2A CN111427966B (zh) 2020-06-10 2020-06-10 Database transaction processing method, apparatus, and server

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/743,293 Continuation US20220276998A1 (en) 2020-06-10 2022-05-12 Database transaction processing method and apparatus, server, and storage medium

Publications (1)

Publication Number Publication Date
WO2021249207A1 (zh) 2021-12-16

Family

ID=71551257

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/096691 WO2021249207A1 (zh) 2020-06-10 2021-05-28 Database transaction processing method and apparatus, server, and storage medium

Country Status (5)

Country Link
US (1) US20220276998A1 (zh)
EP (1) EP4030315A4 (zh)
KR (1) KR20220076522A (zh)
CN (1) CN111427966B (zh)
WO (1) WO2021249207A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114594914A (zh) * 2022-03-17 2022-06-07 阿里巴巴(中国)有限公司 Control method and system for a distributed storage system
CN115470008A (zh) * 2022-11-14 2022-12-13 杭州拓数派科技发展有限公司 Metadata access method and apparatus, and storage medium

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
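CN111427966B (zh) * 2020-06-10 2020-09-22 腾讯科技(深圳)有限公司 Database transaction processing method, apparatus, and server
CN113297320B (zh) * 2020-07-24 2024-05-14 阿里巴巴集团控股有限公司 Distributed database system and data processing method
CN112053207A (zh) * 2020-09-01 2020-12-08 珠海随变科技有限公司 Order information acquisition method and apparatus, computer device, and storage medium
CN112214171B (zh) * 2020-10-12 2022-08-05 华东师范大学 Non-volatile memory buffer design method for SQLite databases
CN112685433B (zh) * 2021-01-07 2022-08-05 网易(杭州)网络有限公司 Metadata update method and apparatus, electronic device, and computer-readable storage medium
CN112948064B (zh) * 2021-02-23 2023-11-03 北京金山云网络技术有限公司 Data reading method and apparatus, and data reading system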
US20220398232A1 (en) * 2021-06-14 2022-12-15 Microsoft Technology Licensing, Llc Versioned metadata using virtual databases
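CN113778975B (zh) * 2021-09-15 2023-11-03 京东科技信息技术有限公司 Data processing method and apparatus based on a distributed database
CN113868273B (zh) * 2021-09-23 2022-10-04 北京百度网讯科技有限公司 Metadata snapshot method and apparatus
CN113656384B (zh) * 2021-10-18 2022-04-08 阿里云计算有限公司 Data processing method, distributed database system, electronic device, and storage medium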
US20230119834A1 (en) * 2021-10-19 2023-04-20 Sap Se Multi-tenancy using shared databases
CN114254036A (zh) * 2021-11-12 2022-03-29 阿里巴巴(中国)有限公司 Data processing method and system
CN114443773A (zh) * 2022-01-30 2022-05-06 中国农业银行股份有限公司 Distributed system data synchronization method, apparatus, device, and storage medium
US11921708B1 (en) * 2022-08-29 2024-03-05 Snowflake Inc. Distributed execution of transactional queries
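CN115827651B (zh) * 2022-11-22 2023-07-04 中国科学院软件研究所 Low-energy-consumption in-memory transaction management method and system for airborne embedded databases
CN116303661B (zh) * 2023-01-12 2023-09-12 北京万里开源软件有限公司 Method, apparatus, and system for processing sequences in a distributed database
CN117914944A (zh) * 2024-03-20 2024-04-19 暗物智能科技(广州)有限公司 Distributed three-level caching method and apparatus based on the Internet of Things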

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8661068B1 (en) * 2011-09-29 2014-02-25 Emc Corporation Managing global metadata caches in data storage systems
US20140067884A1 (en) * 2012-08-30 2014-03-06 International Business Machines Corporation Atomic incremental load for map-reduce systems on append-only file systems
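CN110019530A (zh) * 2017-12-29 2019-07-16 百度在线网络技术(北京)有限公司 Transaction processing method and apparatus based on a distributed database
CN110196760A (zh) * 2018-07-12 2019-09-03 腾讯科技(深圳)有限公司 Method and apparatus for implementing distributed transaction consistency
CN110245149A (zh) * 2019-06-25 2019-09-17 北京明略软件系统有限公司 Metadata version management method and apparatus
CN111427966A (zh) * 2020-06-10 2020-07-17 腾讯科技(深圳)有限公司 Database transaction processing method, apparatus, and server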

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20000038101A (ko) * 1998-12-03 2000-07-05 정선종 Global query processing apparatus and method in a multi-database system
US6892205B1 (en) * 2001-02-28 2005-05-10 Oracle International Corporation System and method for pre-compiling a source cursor into a target library cache
US7480654B2 (en) * 2004-12-20 2009-01-20 International Business Machines Corporation Achieving cache consistency while allowing concurrent changes to metadata
US7747663B2 (en) * 2008-03-05 2010-06-29 Nec Laboratories America, Inc. System and method for content addressable storage
US8572130B2 (en) * 2011-06-27 2013-10-29 Sap Ag Replacement policy for resource container
US9811560B2 (en) * 2015-08-12 2017-11-07 Oracle International Corporation Version control based on a dual-range validity model
CN106021381A (zh) * 2016-05-11 2016-10-12 北京搜狐新媒体信息技术有限公司 Data access/storage method and apparatus for a cloud storage service system
US10585873B2 (en) * 2017-05-08 2020-03-10 Sap Se Atomic processing of compound database transactions that modify a metadata entity
CN108829713B (zh) * 2018-05-04 2021-10-22 华为技术有限公司 Distributed cache system, and cache synchronization method and apparatus
CN110018845B (zh) * 2019-04-16 2020-09-18 成都四方伟业软件股份有限公司 Metadata version comparison method and apparatus

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4030315A4

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
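CN114594914A (zh) * 2022-03-17 2022-06-07 阿里巴巴(中国)有限公司 Control method and system for a distributed storage system
CN114594914B (zh) * 2022-03-17 2024-04-02 阿里巴巴(中国)有限公司 Control method and system for a distributed storage system
CN115470008A (zh) * 2022-11-14 2022-12-13 杭州拓数派科技发展有限公司 Metadata access method and apparatus, and storage medium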

Also Published As

Publication number Publication date
CN111427966A (zh) 2020-07-17
KR20220076522A (ko) 2022-06-08
US20220276998A1 (en) 2022-09-01
EP4030315A4 (en) 2023-01-25
CN111427966B (zh) 2020-09-22
JP2023518374A (ja) 2023-05-01
EP4030315A1 (en) 2022-07-20

Similar Documents

Publication Publication Date Title
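WO2021249207A1 (zh) Database transaction processing method and apparatus, server, and storage medium
CN110502507B (zh) Management system, method, device, and storage medium for a distributed database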
US9946735B2 (en) Index structure navigation using page versions for read-only nodes
US11132350B2 (en) Replicable differential store data structure
US10691722B2 (en) Consistent query execution for big data analytics in a hybrid database
US9740582B2 (en) System and method of failover recovery
US9146934B2 (en) Reduced disk space standby
US9305056B1 (en) Results cache invalidation
US10180812B2 (en) Consensus protocol enhancements for supporting flexible durability options
CN111338766A (zh) Transaction processing method and apparatus, computer device, and storage medium
US10191936B2 (en) Two-tier storage protocol for committing changes in a storage system
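WO2019109854A1 (zh) Distributed database data processing method and apparatus, storage medium, and electronic apparatus
CN115668141A (zh) Distributed processing of transactions in a network using timestamps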
US20230099664A1 (en) Transaction processing method, system, apparatus, device, storage medium, and program product
US20230418811A1 (en) Transaction processing method and apparatus, computing device, and storage medium
US20240028598A1 (en) Transaction Processing Method, Distributed Database System, Cluster, and Medium
CN113391885A (zh) Distributed transaction processing system
US20230014427A1 (en) Global secondary index method for distributed database, electronic device and storage medium
WO2022127866A1 (zh) Data processing method and apparatus, electronic device, and storage medium
EP4356257A2 (en) Versioned metadata using virtual databases
WO2022135471A1 (zh) Multi-version concurrency control and log clearing method, node, device, and medium
US11789971B1 (en) Adding replicas to a multi-leader replica group for a data set
US11514080B1 (en) Cross domain transactions
Bravo et al. Reducing the vulnerability window in distributed transactional protocols
CN116635846A (zh) Schema and data modification concurrency in query processing pushdown

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21822853

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2021822853

Country of ref document: EP

Effective date: 20220413

ENP Entry into the national phase

Ref document number: 20227015834

Country of ref document: KR

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 2022555830

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE