CN112988913B

CN112988913B - Data processing method and related device

Info

Publication number: CN112988913B
Application number: CN202110512407.2A
Authority: CN
Inventors: 熊亮春; 王晓宇
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2021-05-11
Filing date: 2021-05-11
Publication date: 2021-08-13
Anticipated expiration: 2041-05-11
Also published as: CN112988913A

Abstract

The embodiments of the application disclose a data processing method and a related device. Computing nodes such as a first node and a second node can read and modify data objects stored in a distributed database system. The method comprises: acquiring a target identification pair used by the first node for changing a data object, wherein the target identification pair comprises an object identification and a target version identification of the data object, and the target version identification identifies the data version in which first data in the data object is changed into second data; storing the target identification pair in association with the data object according to the object identification; acquiring a first read request of the second node, wherein the first read request comprises a first identification pair, and the first identification pair comprises the object identification and a first to-be-verified version identification of the data object; and if the target version identification is different from the first to-be-verified version identification and the data object comprising the second data has not been acquired from the first node, rejecting the first read request, thereby avoiding data synchronization anomalies for the data object.

Description

Data processing method and related device

Technical Field

The present application relates to the field of data processing, and in particular, to a data processing method and a related apparatus.

Background

A distributed database system is a commonly used data storage system, which is composed of a plurality of computing nodes for data storage. Data objects (Schema) are stored in the distributed database system, and the computing nodes can create the data objects in the system and also can change the data objects in the system.

To ensure that a data object remains correct while it is being changed, the related art mainly adopts strong synchronization based on a lock mechanism: incompatible operations are blocked while the data object is being changed, which avoids inconsistency of the data object within the system and prevents operations of other computing nodes from causing data anomalies during the change.

However, strong synchronization requires all computing nodes in the distributed database system to agree before the lock can be granted. The large number of interactions increases the system overhead of acquiring the lock and reduces the processing capacity of the system, and this additional burden grows rapidly as the number of computing nodes increases, which hinders the expansion and use of the distributed database system.

Disclosure of Invention

In order to solve the above technical problem, the present application provides a data processing method and a related apparatus for ensuring data correctness of a data object in a process of being changed.

The embodiment of the application discloses the following technical scheme:

In one aspect, the present application provides a data processing method, including:

acquiring a target identification pair used by a first node for changing a data object in a distributed database system, wherein the first node is a computing node in the distributed database system, the target identification pair comprises an object identification and a target version identification of the data object, and the target version identification is used for identifying the data version in which first data in the data object is changed into second data;

storing the target identification pair associated with the data object according to the object identification;

obtaining a first read request of a second node for the data object, wherein the second node is a different computing node from the first node in the distributed database system, the first read request comprises a first identifier pair, and the first identifier pair comprises the object identifier and a first to-be-verified version identifier of the data object;

and if it is determined that the target version identification is different from the first to-be-verified version identification and the data object comprising the second data has not been acquired from the first node, rejecting the first read request.

In another aspect, the present application provides a data processing apparatus, the apparatus including an obtaining unit, a storage unit, and an executing unit;

the obtaining unit is configured to obtain a target identifier pair used by a first node to change a data object in the distributed database system, where the first node is a computing node in the distributed database system, the target identifier pair includes an object identifier and a target version identifier of the data object, and the target version identifier is used to identify the data version in which first data in the data object is changed into second data;

the storage unit is used for storing the target identification pair associated with the data object according to the object identification;

the obtaining unit is further configured to obtain a first read request of a second node for the data object, where the second node is a different computing node in the distributed database system from the first node, and the first read request includes a first identifier pair, and the first identifier pair includes the object identifier and a first to-be-verified version identifier of the data object;

the execution unit is configured to reject the first read request if it is determined that the target version identifier is different from the first to-be-verified version identifier and the data object including the second data has not been obtained from the first node.

In another aspect, the present application provides a computer device comprising a processor and a memory:

the memory is used for storing program codes and transmitting the program codes to the processor;

the processor is configured to perform the method of the above aspect according to instructions in the program code.

In another aspect, the present application provides a computer-readable storage medium for storing a computer program for executing the method of the above aspect.

In another aspect, embodiments of the present application provide a computer program product or a computer program, which includes computer instructions stored in a computer-readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the method of the above aspect.

According to the above technical solution, the distributed database system comprises a plurality of computing nodes including a first node and a second node. If the first node needs to change first data included in a data object into second data, the first node sends a target identification pair comprising an object identification and a target version identification of the data object. The data object to be changed can be found in the distributed database system through the object identification, and the data version to which the first node changes the data object is determined through the target version identification. Because a computing node must carry an identification pair that includes a version identification when reading or writing a data object, the system stores the target identification pair in association with the data object before the data object including the second data is acquired from the first node, so that the version identification of the data object in the system is updated from the data version identifying the first data to the target version identification identifying the second data.

Therefore, if a first read request for the data object is obtained from another node, for example the second node, while the first node is changing the data object, the first to-be-verified version identification in the first identification pair provided by the second node differs from the target version identification, and the data object including the second data has not yet been acquired from the first node. The system then determines that the change to the data object has not been completed, so the first data currently held in the system is not the final version and cannot be provided to the second node. The system therefore rejects the first read request, that is, the non-final version of the data object is not provided to the second node, which avoids data synchronization anomalies for the data object. As a result, the second node cannot use a non-final version of the data object to change data, and the correctness and consistency of the data object in the distributed database system are ensured.

Drawings

To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. Obviously, the drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is a schematic view of an application scenario of a data processing method according to an embodiment of the present application;

FIG. 2 is a flowchart of a data processing method according to an embodiment of the present application;

FIG. 3 is a flowchart of modifying a data object according to an embodiment of the present application;

FIG. 4 is a flowchart of creating a data object according to an embodiment of the present application;

FIG. 5 is a flowchart of deleting a data object according to an embodiment of the present application;

FIG. 6 is a schematic diagram of a data processing apparatus according to an embodiment of the present application;

FIG. 7 is a schematic structural diagram of a server according to an embodiment of the present application;

FIG. 8 is a schematic structural diagram of a terminal device according to an embodiment of the present application.

Detailed Description

Embodiments of the present application are described below with reference to the accompanying drawings.

As described above, the lock-based strong synchronization of the related art increases the additional burden of the distributed database system. The embodiments of the present application therefore provide a data processing method and a related device that reduce this additional burden while ensuring the data correctness of a data object during the process of being changed.

The data processing method provided by the embodiments of the application is implemented based on cloud computing. Cloud storage is a concept extended and developed from cloud computing. A distributed cloud storage system (hereinafter referred to as a storage system) is a storage system that, through functions such as cluster applications, grid technology and distributed storage file systems, uses application software or application interfaces to make a large number of storage devices of various types in a network (storage devices are also referred to as storage nodes) work cooperatively, and that provides data storage and service access functions externally.

At present, a storage system stores data as follows. Logical volumes are created, and each logical volume is allocated physical storage space when it is created; the physical storage space may be composed of the disks of one storage device or of several storage devices. A client stores data on a logical volume, that is, on a file system. The file system divides the data into a plurality of parts, each of which is an object; an object contains not only data but also additional information such as a data identification (ID). The file system writes each object into the physical storage space of the logical volume and records the storage location information of each object, so that when the client requests access to the data, the file system can let the client access the data according to the storage location information of each object.

The storage system allocates physical storage space to a logical volume as follows: physical storage space is divided into stripes in advance according to a set of capacity estimates for the objects to be stored in the logical volume (the estimates often leave a large margin relative to the capacity of the objects actually stored) and according to the Redundant Array of Independent Disks (RAID) configuration. One logical volume can be understood as one stripe, and physical storage space is thereby allocated to the logical volume.

In the data processing method provided by the embodiments of the application, a plurality of computing nodes in a distributed database system managed through cloud storage technology process the data stored in the storage nodes, for example by adding, deleting or modifying it, while the data correctness of a data object is ensured during the process of being changed.

The data processing method provided by the application can be applied to data processing equipment with data processing capacity, such as terminal equipment and servers. The terminal device may be a smart phone, a desktop computer, a notebook computer, a tablet computer, and the like, but is not limited thereto; the server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing cloud computing services. The terminal device and the server may be directly or indirectly connected through wired or wireless communication, and the application is not limited herein.

In the data processing method or the related apparatus according to the embodiment of the present application, a plurality of servers can be grouped into a blockchain, and the servers are nodes on the blockchain.

In order to facilitate understanding of the technical solution of the present application, a data processing method provided in the embodiments of the present application is described below with reference to an actual application scenario and a server as a data processing device.

FIG. 1 is a schematic view of an application scenario of a data processing method according to an embodiment of the present application. In the application scenario shown in FIG. 1, a distributed database system includes a plurality of computing nodes and storage nodes, and each computing node can process the data objects stored in the storage nodes. The following description takes as an example a distributed database system that includes a first node 101, a second node 102, and a storage node 200.

If the first node 101 needs to change the first data included in data object A into the second data, the first node 101 sends a target identification pair for data object A, <A, V2.0>, to the storage node 200, where the object identification is A and the target version identification is V2.0. Through A, the storage node 200 can find data object A among its stored data objects; through V2.0, it can determine that the first node 101 wants to change the data version of data object A from 1.0 to 2.0.

After receiving the target identification pair <A, V2.0>, the storage node 200 stores <A, V2.0> in association with data object A according to A, at which point the version identification of data object A in the storage node 200 is 2.0. The first node 101 may then start to change the first data into the second data, so that the storage node 200 can subsequently obtain data object A including the second data from the first node 101.

Suppose the storage node 200 obtains a first read request from the second node 102 while the first node 101 is changing data object A, where the first read request includes a first identification pair <A, V1.0>, in which the object identification is A and the first to-be-verified version identification is V1.0. Through A, the storage node 200 can determine that the data object the second node 102 wants to change is data object A; through V1.0, it can determine that the data version of data object A to be changed is 1.0.

The data version that the second node 102 wants to change is 1.0, but the data version that the storage node 200 currently records for data object A is 2.0. That is, the first to-be-verified version identification differs from the target version identification, and the second node 102 is not working on the final version of data object A. Furthermore, the first node 101 is still changing the first data included in data object A into the second data, so the storage node 200 has not yet acquired data object A including the second data from the first node 101, and the data object A held by the storage node 200 is not the final version.

If the storage node 200 provided the non-final version 1.0 of data object A to the second node 102, both the second node 102 and the first node 101 could change version 1.0 of data object A. This could cause a data synchronization anomaly for data object A, and the correctness of data object A could not be guaranteed.

Therefore, if the storage node 200 determines that the target version identification is different from the first to-be-verified version identification and that data object A including the second data has not been obtained from the first node 101, it rejects the first read request of the second node 102. That is, the storage node 200 does not provide the non-final version of data object A to the second node 102.

This ensures that the second node 102 does not use a non-final version of data object A to change data, avoids data synchronization anomalies for data object A, and ensures the correctness and consistency of data object A in the distributed database system.
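The scenario of FIG. 1 can be walked through with a small in-memory model. This is only an illustrative sketch: the class and method names (`StorageNode`, `announce_change`, `deliver_change`, `read`, `ReadRejected`) are hypothetical and do not come from the application, and cases that this passage does not describe are deliberately left unimplemented.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


class ReadRejected(Exception):
    """Raised when the storage node refuses a read request."""


@dataclass
class StoredObject:
    version: str          # version identification currently associated with the object
    data: str             # data most recently received for the object
    change_pending: bool  # True until the announced change has been delivered


class StorageNode:
    """Hypothetical stand-in for storage node 200."""

    def __init__(self) -> None:
        self.objects: Dict[str, StoredObject] = {}

    def announce_change(self, pair: Tuple[str, str]) -> None:
        # Store the target identification pair before the new data arrives.
        object_id, target_version = pair
        obj = self.objects[object_id]
        obj.version = target_version
        obj.change_pending = True

    def deliver_change(self, object_id: str, new_data: str) -> None:
        # The changing node hands over the data object including the second data.
        obj = self.objects[object_id]
        obj.data = new_data
        obj.change_pending = False

    def read(self, pair: Tuple[str, str]) -> str:
        object_id, version_to_verify = pair
        obj = self.objects[object_id]
        if obj.version != version_to_verify and obj.change_pending:
            raise ReadRejected(f"{object_id}: {version_to_verify} is not the final version")
        if obj.version == version_to_verify and not obj.change_pending:
            return obj.data
        # Assumption: the remaining combinations are outside this passage.
        raise NotImplementedError("case not described in this passage")


node_200 = StorageNode()
node_200.objects["A"] = StoredObject(version="V1.0", data="first data", change_pending=False)

node_200.announce_change(("A", "V2.0"))      # first node 101 sends <A, V2.0>
try:
    node_200.read(("A", "V1.0"))             # second node 102 sends <A, V1.0>
    outcome = "served"
except ReadRejected:
    outcome = "rejected"

node_200.deliver_change("A", "second data")  # first node 101 finishes the change
latest = node_200.read(("A", "V2.0"))
print(outcome, latest)
```

In this sketch the read with `<A, V1.0>` is refused while the change is pending, and a read with the matching version succeeds once the second data has been delivered.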

A data processing method provided in the embodiments of the present application is described below with reference to the accompanying drawings, where a server is used as a data processing device.

FIG. 2 is a flowchart of a data processing method according to an embodiment of the present application. As shown in FIG. 2, the data processing method includes the following steps:

S201: Acquire a target identification pair used by the first node for changing a data object in the distributed database system.

The distributed database system comprises a plurality of computing nodes and storage nodes, and different computing nodes can process data objects stored in the storage nodes, such as operations of adding, deleting, modifying and the like. The data object is an object used for actually storing data, such as a table, an index, a column, and the like.

The following takes two nodes in the distributed database system changing one data object as an example, where changing the data object means performing a modification operation on it. Suppose the first node wants to modify the first chapter of an original paper A stored in the storage node; while the first node is making the modification, the storage node has not yet received paper A with the modified first chapter. If the second node wants to modify the second chapter of paper A during this period, the storage node holds only the original paper A, so the second node can obtain only the original paper A. Consequently, whichever of the two nodes stores its modified paper A first, the paper finally held by the storage node will not contain the modifications to both the first chapter and the second chapter of the original paper A, and a data synchronization anomaly for paper A may occur.
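The anomaly in this example is what is commonly called a lost update. A minimal sketch, assuming the paper is modeled as a dictionary of chapters and using hypothetical names (`simulate_unchecked_writes`, `chapter_1`, `chapter_2`) that do not come from the application, shows that without any check one of the two modifications is lost whichever node saves first:

```python
from typing import Dict


def simulate_unchecked_writes(first_node_saves_first: bool) -> Dict[str, str]:
    """Both nodes edit their own copy of the original paper A and write it back whole."""
    stored_paper = {"chapter_1": "original", "chapter_2": "original"}

    # Both nodes read while the storage node still holds only the original paper A.
    first_copy = dict(stored_paper)
    second_copy = dict(stored_paper)

    first_copy["chapter_1"] = "modified by first node"
    second_copy["chapter_2"] = "modified by second node"

    # Each write replaces the whole stored paper, so the later write wins.
    if first_node_saves_first:
        write_order = [first_copy, second_copy]
    else:
        write_order = [second_copy, first_copy]
    for paper in write_order:
        stored_paper = dict(paper)
    return stored_paper


for order in (True, False):
    final = simulate_unchecked_writes(order)
    both_kept = final["chapter_1"] != "original" and final["chapter_2"] != "original"
    print(f"first node saves first={order}: both modifications kept={both_kept}")
```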

To avoid data synchronization anomalies, the related art adopts strong synchronization based on a lock mechanism: data synchronization is ensured by blocking incompatible operations on the data objects stored by the storage nodes while they are being changed, but this increases the additional burden of the distributed database system.

Based on this, the data processing method provided in the embodiments of the present application uses an identifier pair that includes an object identifier and a version identifier of a data object. The object identifier identifies one data object, so that the data object to be changed can be found. The version identifier, which may also be referred to as a Writing barrier (Writing success), makes clear the data version of the data object after it is changed by a computing node. Other computing nodes cannot read or modify the data object using an identifier pair that carries an incorrect version identifier, which prevents them from modifying an incorrect version of the data object and thus prevents data consistency and integrity from being damaged. In addition, when a data problem occurs, the version identifier makes rollback and backtracking easier, so that the data object of the required version can be obtained.

Therefore, before processing a data object, each computing node needs to send an identifier pair carrying a version identifier to the storage node. Continuing with the example of two computing nodes, the storage node obtains a target identifier pair used by the first node to change a data object in the distributed database system, where the target identifier pair includes an object identifier and a target version identifier of the data object, and the target version identifier identifies the data version in which the first data in the data object is changed into the second data.

The target version identifier makes clear that the first node wants to modify the data object corresponding to the object identifier by changing the first data in the data object into the second data. From that point on, the final version of the data object recorded by the storage node is the version corresponding to the data object including the second data.

For example, if the first node wants to modify the first chapter of the original paper A, it needs to send a target identification pair that includes the object identification of the original paper A and a target version identification indicating that the original paper A is changed into paper A with the modified first chapter, for example, that paper A is changed from version V1.0 to version V2.0. After receiving the target identification pair, the storage node knows that the first node wants to modify the original paper A, and while the first node is making the modification, the second node is not allowed to modify the original paper A. This avoids data synchronization anomalies and, compared with the lock-based strong synchronization of the related art, reduces the additional burden of the distributed database system.
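An identifier pair can be pictured as a small immutable record. The sketch below is only one possible representation; the type name `IdentifierPair`, its field names, and the `<object, version>` text form are illustrative assumptions rather than a format defined by the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdentifierPair:
    object_id: str   # identifies which data object is meant
    version_id: str  # identifies a data version of that data object

    def __str__(self) -> str:
        return f"<{self.object_id}, {self.version_id}>"


# Pair sent by the first node before it modifies paper A.
target_pair = IdentifierPair("A", "V2.0")
# Pair carried by another node that still knows only the original paper A.
first_pair = IdentifierPair("A", "V1.0")

print(target_pair, first_pair)
print("same object:", target_pair.object_id == first_pair.object_id)
print("same version:", target_pair.version_id == first_pair.version_id)
```

Both pairs name the same data object but different versions, which is the situation the storage node has to detect.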

S202: Store the target identification pair in association with the data object according to the object identification.

A computing node needs to send an identifier pair including a version identifier to the storage node before it processes a data object. To ensure the consistency of the data object, that is, to prevent other computing nodes such as the second node from acquiring an incorrect version of the data object and modifying it, which would destroy the integrity of the data object, the storage node stores the target identifier pair in association with the corresponding data object before it acquires the data object including the second data from the first node. The version identifier of the data object in the storage node is thereby updated from the data version identifying the first data to the target version identifier identifying the second data.

The target version identifier can therefore be used to determine whether other nodes have permission to modify the data object. If another computing node carries an identifier pair whose version identifier is not the target version identifier, that computing node is not allowed to perform operations such as modifying the data object. This prevents the first node and the second node from modifying the same version of the data object while it is being changed, which would make it impossible to ensure the consistency and integrity of the data.

The embodiment of the present application is not particularly limited to the manner of association storage, and one manner is described below as an example.

The data object may include metadata and user data. The metadata describes the attribute definitions of the user data and is often called "data about data", that is, the attributes of the data. The user data consists of instances, that is, records. For example, if the data object is a two-dimensional table, each row represents the user data of one user, and each column corresponds to one type of metadata. The target identification pair can therefore be stored, according to the object identification, in the metadata corresponding to the data object and in the user data corresponding to the data object, so that the target identification pair is stored in association with the data object.
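One way to picture this association is a table whose metadata and rows both carry the version identification. The layout below is purely an assumption made for illustration (the application does not specify how the pair is laid out in metadata or user data), and `store_pair_in_association` is a hypothetical helper name.

```python
from typing import Any, Dict, Tuple


def store_pair_in_association(data_object: Dict[str, Any], pair: Tuple[str, str]) -> None:
    """Record the identification pair in both the metadata and the user data."""
    object_id, version_id = pair
    # The object identification selects the data object the pair belongs to.
    if data_object["metadata"]["object_id"] != object_id:
        raise KeyError(f"pair is for {object_id!r}, not for this data object")
    data_object["metadata"]["version_id"] = version_id
    # Assumption: each row (user data) is tagged with the same version identification.
    for row in data_object["user_data"]:
        row["version_id"] = version_id


table_a = {
    "metadata": {"object_id": "A", "version_id": "V1.0", "columns": ["name", "city"]},
    "user_data": [
        {"name": "user 1", "city": "X", "version_id": "V1.0"},
        {"name": "user 2", "city": "Y", "version_id": "V1.0"},
    ],
}

store_pair_in_association(table_a, ("A", "V2.0"))
print(table_a["metadata"]["version_id"], [row["version_id"] for row in table_a["user_data"]])
```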

S203: a first read request of a second node for a data object is obtained.

If another computing node in the distributed database system, such as the second node, wants to perform an operation such as modifying a data object, it first needs to read the final version of the data object currently stored by the storage node in order to ensure the consistency of the data object. The second node therefore sends a first read request to the storage node, where the first read request includes a first identifier pair, and the first identifier pair includes the object identifier and a first to-be-verified version identifier.

The object identifier is used to find the data object that the second node wants to read. The first to-be-verified version is the version of the data object that the second node currently wants to read, and it may be the same as or different from the final version currently stored by the storage node. To ensure that all computing nodes read the same version of the data object, every node should read the final version currently stored by the storage node. If the first to-be-verified version differs from the final version currently stored by the storage node, the first to-be-verified version needs to be updated to that final version, so that the second node obtains the same version of the data object as the other computing nodes.

Accordingly, the storage node determines from the object identifier in the first identifier pair that the data object the second node wants to read is the same data object that the first node is changing, and determines whether the second node has the right to read the data object of the first to-be-verified version by checking whether the first to-be-verified version identifier is the same as the target version identifier. This avoids the data synchronization anomaly that would arise if the first node and the second node modified the same version of the data object.

S204: If it is determined that the target version identification is different from the first to-be-verified version identification and the data object including the second data has not been acquired from the first node, reject the first read request.

If the first to-be-verified version identifier differs from the target version identifier and the storage node has not acquired the data object including the second data from the first node, the data object that the second node wants to read is not the final version currently recorded by the storage node. The first node has already updated the final version of the data object through the target version identifier, but it is still modifying the data object and has not yet stored the modified data object, so the storage node does not hold the data object corresponding to the target version identifier and cannot provide the final version to the second node. The storage node therefore rejects the first read request, that is, it does not provide the non-final version of the data object to the second node. This avoids data synchronization anomalies for the data object, ensures that the second node does not use a non-final version of the data object to change data, and ensures the consistency and integrity of the data objects in the distributed database system.
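The rejection condition of S204 is a conjunction of two checks, which can be written as a single predicate. The function name `should_reject_read` and its parameter names are hypothetical; the sketch covers only the condition stated in S204 and says nothing about how the other combinations are handled.

```python
def should_reject_read(target_version: str, version_to_verify: str,
                       second_data_acquired: bool) -> bool:
    """True only when the versions differ AND the changed data has not arrived yet."""
    return target_version != version_to_verify and not second_data_acquired


# Enumerate the four combinations for a target version identification of V2.0.
for version_to_verify in ("V1.0", "V2.0"):
    for acquired in (False, True):
        decision = should_reject_read("V2.0", version_to_verify, acquired)
        print(f"to-be-verified={version_to_verify} acquired={acquired} reject={decision}")
```

Only the combination of a mismatched version and a change that is still in progress yields a rejection under this predicate.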

To ensure the consistency and integrity of a data object, each computing node needs to read the data object from the storage node before modifying it, and the storage node needs to determine whether each computing node has the right to read the data object. Therefore, before the first node sends the target identifier pair to the storage node and changes the first data in the data object into the second data, the storage node needs to verify whether the first node has the right to read the data object, so as to ensure that the first node is allowed to change the data object and that the data object it is about to change, namely the data object including the first data, is the final version. Accordingly, the following steps S2001 to S2003 may be performed before S201.

S2001: a second read request for the data object by the first node is obtained.

For a data object including the first data, the second read request includes a second identification pair, which includes the object identification and a second to-be-verified version identification of the data object. The second to-be-verified version identification is the version identification of the data object that the first node currently stores locally.

The first node sends the second read request to the storage node, and the storage node determines, according to the second identification pair included in the second read request, the data object and the version of the data object that the first node wants to read.

It should be noted that the second read request may carry a read identifier or a modification identifier, so that the storage node can tell whether the first node wants to read or to modify the data object. Alternatively, the storage node can infer the intent from the version identification itself. The second to-be-verified version identification identifies only one version, namely the final version that the first node currently stores, so the storage node determines that the first node wants to read the data object. The target version identification identifies two versions, namely the update of the data object from the data version identifying the first data to the target version identification identifying the second data, so the storage node determines that the first node wants to modify the data object.

S2002: Determine, according to the object identification, a third identification pair stored in association with the corresponding data object.

After the data object that the first node wants to modify is determined according to the object identifier, the third identifier pair stored in association with the data object can be obtained in the storage node. The third identifier pair includes the object identifier and the first version identifier, and can be found through the object identifier. The first version identifier identifies the data version of the data object that includes the first data; at this time, the first version identifier is the final version stored in the storage node.

S2003: and if the second version identification to be verified is the same as the first version identification, returning the data object corresponding to the first version identification to the first node.

If the second version identifier to be verified is the same as the first version identifier, it indicates that the data version of the data object that the first node wants to read is the final version, and the storage node stores the data object of the final version at this time, that is, there is no other node to modify the data object, the storage node may return the data object corresponding to the first version identifier, that is, the data object including the first data, to the first node, so that the first node changes the first data in the data object to the second data, and at the same time, changes the first version identifier to the target version identifier.

The embodiment of the present application does not specifically limit the manner of updating the version identifier. For example, the first version identifier may be increased or decreased by a preset step to obtain the target version identifier: if the first version identifier is V10 and the preset step is +10, the target version identifier is V20; if the first version identifier is V100 and the preset step is -1, the target version identifier is V99.
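The step-based update can be illustrated with a minimal sketch. It assumes, purely for illustration, that a version identifier is a one-letter prefix followed by an integer, as in the V10 and V100 examples; the function name `next_version` is hypothetical and not part of the embodiment.

```python
def next_version(current: str, step: int) -> str:
    """Advance a version identifier such as "V10" by a preset step.

    Assumption: the identifier is a one-letter prefix followed by an integer.
    """
    prefix, number = current[0], int(current[1:])
    return f"{prefix}{number + step}"


# The two examples given in the text.
increased = next_version("V10", 10)
decreased = next_version("V100", -1)
```

Any other update rule would serve equally well, provided that the target version identifier differs from the first version identifier.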

Therefore, by determining that the second version identifier to be verified is the same as the first version identifier, the data object read by the first node can be ensured to be the final version currently stored by the storage node, so that the first node is ensured to acquire the data object with the correct version for changing, and the correctness of the data is ensured. The case where the second version identification to be verified is not identical to the first version identification is explained by S2004.

S2004: and if the second version identification to be verified is not the same as the first version identification, indicating the first node to update the second version identification to be verified to the first version identification, and retransmitting the reading request aiming at the data object.

If the second version identifier to be verified is different from the first version identifier, the version of the data object that the first node wants to read is not the final version currently stored by the storage node. If the first node read and then modified that version of the data object, a data synchronization abnormality could occur. Therefore, the first node can be instructed to update the second version identifier to be verified to the final version currently stored by the storage node, namely the first version identifier, and to resend the read request for the data object. The storage node thereby ensures that the version of the data object read by the first node is the currently stored final version, and sends the currently stored final version of the data object, that is, the data object including the first data, to the first node. This ensures that the first node acquires the correct version of the data object for modification and that the data remain correct.
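Steps S2001-S2004 can be sketched as follows. This is a minimal in-memory illustration, not the embodiment's implementation: the names `StorageNode`, `ReadResult`, `handle_read` and `read_with_retry` are hypothetical, version identifiers are modeled as integers, and the sketch assumes the storage node returns its stored version identifier so that the first node can update and retry.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Tuple


@dataclass
class ReadResult:
    ok: bool
    data: Optional[Any] = None
    current_version: Optional[int] = None  # assumed hint used for the retry in S2004


@dataclass
class StorageNode:
    # object identifier -> (stored version identifier, data object)
    objects: Dict[int, Tuple[int, Any]] = field(default_factory=dict)

    def handle_read(self, object_id: int, version_to_verify: int) -> ReadResult:
        # S2002: look up the identifier pair stored for this object identifier.
        stored_version, data = self.objects[object_id]
        # S2003: versions match, so return the data object.
        if version_to_verify == stored_version:
            return ReadResult(ok=True, data=data, current_version=stored_version)
        # S2004: versions differ, so tell the node which version to use.
        return ReadResult(ok=False, current_version=stored_version)


def read_with_retry(node: StorageNode, object_id: int, local_version: int) -> ReadResult:
    """First-node side: send the read request, update the version and resend once if told to."""
    result = node.handle_read(object_id, local_version)
    if not result.ok:
        result = node.handle_read(object_id, result.current_version)
    return result


node = StorageNode({1001: (1, "first data")})
stale = node.handle_read(1001, 0)        # version 0 is out of date
fresh = read_with_retry(node, 1001, 0)   # retried with the stored version
```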

The following description will be made by taking the modification of the data object as an example with reference to fig. 3. In this embodiment, the first node has a data object including first data acquired from the storage node, and the first node wants to modify the first data in the data object into second data. Referring to fig. 3, a flowchart of modifying a data object according to an embodiment of the present application is shown.

S301: the first node sets a target identification pair.

The target identification pair includes an object identification of the data object and a target version identification for identifying a version of the data object that includes the second data.

S302: and the storage node stores the target identification pair into the data dictionary according to the object identification.

In a broad sense, the data dictionary is subordinate to the metadata, and the data dictionary can be regarded as data itself, and is generally mainly used for explaining data structure meanings such as data tables and data fields, value ranges of the data fields, representative meanings of the data values, and the like.

S303: the storage node judges whether the target identifier pair is stored in the data dictionary, if so, S304 is executed, and if not, S309 is executed.

S304: the first node writes the target identification pair in the user data.

S305: the storage node stores the target identification pair into the user data corresponding to the data object.

S306: the storage node determines whether the target identifier pair is stored in the user data, if so, executes S307, and if not, executes S309.

S307: the first node executes the corresponding DDL logic.

The Data Definition Language (DDL), that is, the database schema definition language, is used to describe operations on the data objects to be stored in a storage node, such as a modify operation, a delete operation, a create operation, and the like.

S308: the first node judges whether the DDL logic is completed, if so, the modification process is ended, and if not, S301 is executed.

S309: the recovery and cleanup logic is executed, and then S308 is executed.
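The modification flow S301-S309 can be sketched as below. This is a simplified illustration under stated assumptions: the data dictionary and the user data are modeled as plain dictionaries mapping an object identifier to a version identifier, the names `modify_object` and `_restore` are hypothetical, and the retry from S308 back to S301 is left to the caller (the function simply returns `False`).

```python
from typing import Callable, Dict, Optional


def _restore(store: Dict[int, int], object_id: int, old: Optional[int]) -> None:
    """S309: put a store back into the state it had before the attempt."""
    if old is None:
        store.pop(object_id, None)
    else:
        store[object_id] = old


def modify_object(data_dictionary: Dict[int, int],
                  user_data: Dict[int, int],
                  object_id: int,
                  target_version: int,
                  run_ddl: Callable[[], None]) -> bool:
    """Advance the version identifier in both stores, then run the DDL logic."""
    old_dict = data_dictionary.get(object_id)
    old_user = user_data.get(object_id)
    try:
        data_dictionary[object_id] = target_version            # S302
        if data_dictionary.get(object_id) != target_version:   # S303
            raise RuntimeError("data dictionary write failed")
        user_data[object_id] = target_version                  # S304-S305
        if user_data.get(object_id) != target_version:         # S306
            raise RuntimeError("user data write failed")
        run_ddl()                                               # S307
        return True                                             # S308: DDL completed
    except Exception:
        _restore(data_dictionary, object_id, old_dict)          # S309
        _restore(user_data, object_id, old_user)
        return False


dd, ud = {1001: 1}, {1001: 1}
succeeded = modify_object(dd, ud, 1001, 2, lambda: None)
```

Writing the target identification pair before the DDL logic runs is what lets the storage node refuse stale readers while the change is still in progress.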

If the second node also wants to modify the data object, it needs to first read the final version of the data object currently stored by the storage node from the storage node, which is specifically referred to S311-S319.

S311: the second node locally obtains the first identification pair.

For example, the second node opens a locally saved data object and loads a first identification pair corresponding to the data object. The first identification pair comprises an object identification and a first to-be-verified version identification for the data object.

S312: the second node sends the first identification pair to the storage node.

S313: the storage node judges whether the first version identification to be verified is the same as the target version identification. If so, go to S314, otherwise, go to S316.

S314: and the storage node sends the data object comprising the second data according to the object identification.

S315: and the second node reads the data object and ends the reading process.

S316: the storage node returns a result of whether the second node can read the data object.

If the first to-be-verified version identification is different from the target version identification and the storage node does not obtain the data object including the second data from the first node, a result that the second node is not allowed to read the data object is returned.

If the first version identifier to be verified is different from the target version identifier, but the first node has modified the data object, the storage node obtains the data object including the second data, and returns a result that allows the second node to read the data object, but the second node needs to update the data version of the data object that is desired to be read.

S317: the second node determines whether the current read operation is allowed, if yes, then S318 is executed, and if not, then S319 is executed.

S318: the second node updates the first to-be-verified version identifier in the first identifier pair, sends the updated first identifier pair to the storage node, and executes S313.

S319: and the second node rolls back and ends the reading process.

After reading the data object including the second data, the second node may modify the data object including the second data in a manner as S201-S204.
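The read flow S311-S319 can be sketched as follows. The sketch is illustrative only: `StoredObject`, `check_read` and `second_node_read` are hypothetical names, the `change_pending` flag stands in for "the storage node has not yet obtained the data object including the second data from the first node", and it is assumed that the second node learns the target version identifier from the storage node's response when it is told to update.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class StoredObject:
    version: int          # target version identifier, stored in advance
    data: Any             # data object currently held by the storage node
    change_pending: bool  # True until the data object with the second data arrives


def check_read(obj: StoredObject, version_to_verify: int) -> str:
    """Storage-node side of S313 and S316."""
    if version_to_verify == obj.version:   # S313 -> S314
        return "send"
    if obj.change_pending:                 # S316: change not finished, reading not allowed
        return "reject"
    return "update"                        # S316: allowed, but the version must be updated


def second_node_read(obj: StoredObject, local_version: int) -> Optional[Any]:
    """Second-node side of S311-S319; returns the data object or None after a rollback."""
    decision = check_read(obj, local_version)    # S312-S313
    if decision == "update":                     # S317 -> S318
        local_version = obj.version
        decision = check_read(obj, local_version)
    if decision == "send":
        return obj.data                          # S315
    return None                                  # S319: roll back


pending = StoredObject(version=2, data="first data", change_pending=True)
done = StoredObject(version=2, data="second data", change_pending=False)
```

With `pending`, a request carrying version 1 is rejected; with `done`, the same request is answered after one version update.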

The computing node may not only modify the data object, but also create and delete the data object, and the first node is taken as an example below to explain the creation process of the data object.

After acquiring a creation request of the first node for a data object, the storage node generates an object identifier for the data object according to the creation request and returns the object identifier to the first node. The first node pairs the object identifier with an initial version identifier and returns the two to the storage node as an initial identifier pair, where the initial version identifier represents the first version of the data object. The storage node then stores the initial identifier pair in association with the data object.

During the creation of the data object, other computing nodes cannot sense the existence of the data object, and cannot process the data object, so that the problem of abnormal data synchronization cannot occur, and the data object is in a data consistency and integrity state.

If the data object is created by multiple computing nodes at the same time, for example by a first node and a second node, then after the storage node obtains the creation requests of the first node and the second node for the data object, the storage node generates a first object identifier for the data object according to the creation request of the first node and returns the first object identifier to the first node. The first node pairs the first object identifier with an initial version identifier and returns the two to the storage node as a first initial identifier pair.

Similarly, the storage node generates a second object identifier for the data object according to the creation request of the second node and returns the second object identifier to the second node. The second node pairs the second object identifier with the initial version identifier and returns the two to the storage node as a second initial identifier pair.

Although the same data object has a plurality of initial identification pairs, the initial identification pairs created by different nodes are different, so that the problem of abnormal data synchronization does not occur, and the data object remains in a consistent and complete state. If it is subsequently found that there are multiple initial identification pairs for the same data object, the redundant data objects can be deleted by a delete operation.

The creation of a data object is described below in conjunction with FIG. 4. Referring to fig. 4, this figure is a flowchart of creating a data object according to an embodiment of the present application.

S401: the first node creates a structure of data objects and sends the structure to the storage node.

When creating the data object, the first node adds an identifier pair, such as <object identifier, version identifier>, in the structure of the data object, and the storage node obtains a creation request of the first node for the data object.

S402: the storage node generates an object identification for the data object.

The storage node generates an object identifier for the data object according to the acquired creation request, and the object identifier may be represented by a TINDEX_ID, such as TINDEX_ID: 1001.

S403: the storage node determines whether the object identifier is successfully generated, if so, S404 is executed, and if not, S411 is executed.

S404: the first node sets an initial identification pair for the data object and sends the initial identification pair to the storage node.

Wherein, the initial identification pair can be expressed as <TINDEX_ID, SCHEMA_VERSION>, where SCHEMA_VERSION represents the version identification, such as SCHEMA_VERSION: 1. The storage node obtains an initial identifier pair corresponding to the data object, where the initial identifier pair includes an object identifier and an initial version identifier, for example, the initial identifier pair is <1001, 1>.

S405: and the storage node stores the initial identification pair into the data dictionary according to the object identification.

S406: the storage node judges whether the initial identifier pair is stored in the data dictionary, if so, S407 is executed, and if not, S411 is executed.

S407: the first node writes the initial pair of identities in the user data.

The initial identification pair is <1001, 1>.

S408: the storage node stores the initial identification pair into the user data corresponding to the data object.

S409: the storage node determines whether the initial identifier pair is stored in the user data, if so, performs S410, and if not, performs S411.

S410: and the first node successfully creates the data object and ends the creation process.

S411: the recovery and cleanup logic is executed, and the creation process ends.

In this case the first node has failed to create the data object, and the data generated during the creation process is cleared so that the data object can be created again.
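The creation flow S401-S411 can be sketched as below. This is an in-memory illustration under assumptions: the class `StorageNode` and the function `create_object` are hypothetical names, object identifiers are assumed to be drawn from a counter starting at 1001 (matching the TINDEX_ID example), and the initial version identifier is 1. Dictionary writes do not fail in memory, so the checks only mark where S406 and S409 would sit in a real system.

```python
import itertools
from typing import Any, Dict, Optional, Tuple


class StorageNode:
    def __init__(self, first_id: int = 1001) -> None:
        self._ids = itertools.count(first_id)
        self.data_dictionary: Dict[int, int] = {}        # object identifier -> version identifier
        self.user_data: Dict[int, Tuple[int, Any]] = {}  # object identifier -> (version, structure)

    def generate_object_id(self) -> int:
        """S402: generate an object identifier for a creation request."""
        return next(self._ids)


def create_object(node: StorageNode, structure: Any,
                  initial_version: int = 1) -> Optional[Tuple[int, int]]:
    """Return the initial identification pair, or None if creation was cleaned up."""
    object_id = node.generate_object_id()                         # S401-S403
    pair = (object_id, initial_version)                           # S404
    try:
        node.data_dictionary[object_id] = initial_version         # S405
        if object_id not in node.data_dictionary:                 # S406
            raise RuntimeError("data dictionary write failed")
        node.user_data[object_id] = (initial_version, structure)  # S407-S408
        if object_id not in node.user_data:                       # S409
            raise RuntimeError("user data write failed")
        return pair                                               # S410
    except Exception:
        node.data_dictionary.pop(object_id, None)                 # S411: clean up
        node.user_data.pop(object_id, None)
        return None


node = StorageNode()
first_pair = create_object(node, {"name": "t"})   # created by the first node
second_pair = create_object(node, {"name": "t"})  # same structure, created by a second node
```

Because each creation request receives its own object identifier, two nodes creating the same structure end up with different initial identification pairs.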

Therefore, in the embodiment of the application, the added version identifiers are merged into the process of creating one data object, and the corresponding object identifiers are initialized at the computing nodes and the storage nodes, so that before the creation of the data object is completed, the object identifiers and the version identifiers which are respectively initialized for the data object by the computing nodes and the storage nodes need to be paired, and after the identifier pairs are formed, the created data object can be normally used by a user of the distributed database system, thereby ensuring the consistency and the integrity of data.

The above is the creation process of the data object, and the following is to continue to describe the deletion process of the data object by taking the first node as an example.

If the first node wants to delete the data object stored in the storage node, it needs to be ensured that other computing nodes do not modify the data object during the deletion period, so as to ensure that the data object does not have the problem of synchronization abnormality. Therefore, after the storage node acquires the deletion request of the first node for the data object, the storage node stores the identifier pair to be deleted included in the deletion request in association with the data object, deletes the data object according to the object identifier in the identifier pair to be deleted, and deletes the identifier pair to be deleted after the deletion is completed.

The identifier pair to be deleted includes an object identifier and a second version identifier for the data object, and the second version identifier is obtained by updating according to the target version identifier. For example, if the first node wants to delete a data object including the second data, the target version identification is updated to the second version identification.

During deletion, the updated version identification makes it clear that other nodes cannot read or modify the data object. For example, a third read request of another computing node, such as the second node, for the data object is obtained, where the third read request includes a fourth identification pair, and the fourth identification pair includes the object identification and a third to-be-verified version identification. If the second version identification is different from the third to-be-verified version identification while the data object is being deleted, the version of the data object that the second node wants to read is different from the final version currently stored by the storage node, and the first node has not completed the deletion of the data object, so the third read request is rejected. The consistency and integrity of the data object are thereby ensured.

Deletion of a data object is described below with reference to fig. 5. Referring to fig. 5, a flowchart of deleting a data object according to an embodiment of the present application is shown.

S501: the first node reads the target identification pair of the data object.

S502: the first node sets an identification pair to be deleted and sends the identification pair to be deleted to the storage node.

The first node firstly updates the target version identification in the target identification pair into a second version identification, forms an identification pair to be deleted with the object identification and the second version identification, and sends a deletion request carrying the identification pair to be deleted to the storage node.

S503: and the storage node stores the identification pair to be deleted into the data dictionary according to the object identification.

S504: the storage node judges whether the identifier pair to be deleted is stored in the data dictionary, if so, S505 is executed, and if not, S510 is executed.

S505: and the first node writes the identification pair to be deleted in the user data.

S506: the storage node stores the identifier pair to be deleted into the user data corresponding to the data object.

S507: the storage node determines whether the to-be-deleted identifier pair is stored in the user data, if so, S508 is executed, and if not, S510 is executed.

S508: the first node deletes the data object.

At this time, the data object is in a deleted state. The first node may send an empty data object to the storage node, so that the storage node replaces the data object to be deleted with the empty data object.

S509: and the first node deletes the identification pair to be deleted in the data dictionary, and the deletion process is finished.

S510: the recovery and cleanup logic is executed, and the deletion process ends.

If the second node wants to read the data object while the first node is deleting it, the second node may send a third read request carrying a third to-be-verified version identifier. If the third to-be-verified version identifier is different from the second version identifier and the deletion of the data object has not been completed, the storage node rejects the third read request; for details, refer to S311-S319.

If the data object and the second version identifier are deleted after the data object is deleted, the second node cannot find the data object through the version identifier, so that the second node is prevented from processing the data object with the incorrect version, and the integrity of the data object is prevented from being damaged.
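The deletion flow and its effect on concurrent readers can be sketched as follows. The sketch is illustrative only: the class `StorageNode` and its methods `begin_delete`, `finish_delete` and `handle_read` are hypothetical names, version identifiers are integers, and the data dictionary and user data are collapsed into a single `versions` mapping for brevity.

```python
from typing import Any, Dict, Set


class StorageNode:
    def __init__(self) -> None:
        self.versions: Dict[int, int] = {}  # object identifier -> version identifier
        self.objects: Dict[int, Any] = {}   # object identifier -> data object
        self.deleting: Set[int] = set()     # objects whose deletion is in progress

    def begin_delete(self, object_id: int, second_version: int) -> None:
        """S503-S507: store the identification pair to be deleted before deleting."""
        self.versions[object_id] = second_version
        self.deleting.add(object_id)

    def finish_delete(self, object_id: int) -> None:
        """S508-S509: delete the data object, then the identification pair itself."""
        self.objects.pop(object_id, None)
        self.versions.pop(object_id, None)
        self.deleting.discard(object_id)

    def handle_read(self, object_id: int, version_to_verify: int) -> str:
        if object_id not in self.versions:
            return "not_found"   # pair already deleted: the object can no longer be found
        if version_to_verify == self.versions[object_id]:
            return "ok"
        if object_id in self.deleting:
            return "reject"      # stale version while the deletion is in progress
        return "update"


node = StorageNode()
node.versions[1001] = 2
node.objects[1001] = "data object"
before = node.handle_read(1001, 2)
node.begin_delete(1001, 3)
during = node.handle_read(1001, 2)
node.finish_delete(1001)
after = node.handle_read(1001, 2)
```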

Therefore, by advancing the version identifier ahead of the change, the consistency and integrity of the corresponding data object can be effectively protected while the data object is being modified, and the states of the computing nodes do not need to be strongly synchronized. Once a computing node has created a data object and initialized its version identifier, every subsequent data operation carries a version identifier that determines whether the read request is allowed, so the computing nodes are decoupled from one another. By judging whether a computing node may execute the corresponding DDL logic, the distributed database system ensures, at minimal cost, that the computing node can correctly modify or read the corresponding data object.

In order to better understand the data processing method provided by the embodiment of the present application, a distributed database system including a first node, a second node, and a storage node is taken as an example to describe the data processing method provided by the embodiment of the present application.

The first node wants to save the A paper to the storage node. While creating the data object that stores the A paper, the initial identification pair <TINDEX_ID, SCHEMA_VERSION> is merged into the process of creating the data object, where TINDEX_ID is set to 1001 and SCHEMA_VERSION is set to V1. The specific creation process can be seen in S401-S411, whereby the first node stores the A paper carrying the initial identification pair <1001, V1> into the storage node.

If the first node wants to modify the first chapter of the A paper, the A paper needs to be acquired from the storage node. The first node sends a second read request carrying a second identifier pair to the storage node to verify that the A paper that the first node wants to modify is the currently stored final version, where the second identifier pair includes the object identifier and the second version identifier to be verified, and may be denoted as <1001, V1>. When the verification passes, the storage node returns the A paper to the first node. The specific verification process may refer to S2001-S2003, whereby the first node may modify the first chapter of the A paper.

The first node sends <1001, V2> to the storage node, where <1001, V2> is the target identification pair. The storage node stores <1001, V2> in association with the A paper and thereby determines that the final version of the A paper is now V2. Until the A paper including the modified first chapter is acquired, the storage node knows that the first node is modifying the A paper and does not allow other nodes to modify it. The specific modification process can be seen in S301-S309.

If the second node wants to modify the second chapter of the A paper, the second node sends <1001, V1> to the storage node, where <1001, V1> is the first identification pair and V1 is the first to-be-verified version identification. The storage node verifies that the A paper the second node wants to modify is not the currently stored final version; since the storage node has not yet acquired the A paper including the modified first chapter from the first node, it rejects the read request of the second node. The specific process can be seen in S311-S319.

If the first node wants to delete the A paper, the first node sends <1001, V3> to the storage node, where <1001, V3> is the identification pair to be deleted. The storage node stores <1001, V3> in association with the A paper and determines that the final version of the A paper is now V3. The A paper currently stored by the storage node is locked through the version identification V3, so that other nodes cannot modify it, and after the A paper is deleted, <1001, V3> is deleted as well. The specific deletion process can be seen in S501-S510.
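The A paper example can be traced with a short script. It is a toy model rather than the embodiment itself: the `store` dictionary, the `read` function and the outcome strings are hypothetical, and the second element of each entry stands for "a change or deletion announced by the first node has not yet completed".

```python
# object identifier -> [version identifier, change_pending]
store = {}


def read(object_id, version):
    """Decide how the storage node answers a read request carrying an identification pair."""
    if object_id not in store:
        return "not found"
    current, pending = store[object_id]
    if version == current:
        return "allowed"
    return "rejected" if pending else "update version"


trace = []
store[1001] = ["V1", False]      # the A paper is created with <1001, V1>
trace.append(read(1001, "V1"))   # first node reads before modifying the first chapter
store[1001] = ["V2", True]       # first node announces <1001, V2>; modified paper not yet received
trace.append(read(1001, "V1"))   # second node asks with <1001, V1>
store[1001] = ["V2", False]      # the modified A paper arrives
trace.append(read(1001, "V1"))   # second node asks again with the stale version
store[1001] = ["V3", True]       # first node announces deletion with <1001, V3>
trace.append(read(1001, "V2"))   # second node asks during the deletion
del store[1001]                  # the A paper and <1001, V3> are deleted
trace.append(read(1001, "V2"))   # second node asks after the deletion
```

After the script runs, `trace` holds one outcome per request, in the order described in the example.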

Thus, in the process of executing the DDL logic, the version identification change corresponding to the data object is advanced to block the destructive operation which may be generated on the data object by other computing nodes in the changing process. Other compute nodes access the data object using only read requests that include the version identification, passively updating the version identification to latest as the case may be, with the goal of protecting the data object with the version identification.

Aiming at the data processing method provided by the embodiment, the embodiment of the application also provides a data processing device.

Referring to fig. 6, this figure is a schematic diagram of a data processing apparatus according to an embodiment of the present application. As shown in fig. 6, the data processing apparatus 600 includes: an acquisition unit 601, a storage unit 602, and an execution unit 603;

the obtaining unit 601 is configured to obtain a target identifier pair used by a first node to change a data object in a distributed database system, where the first node is a computing node in the distributed database system, the target identifier pair includes an object identifier and a target version identifier of the data object, and the target version identifier is used to identify a data version in the data object where a first data is changed to a second data;

the storage unit 602 is configured to store the target identifier pair associated with the data object according to the object identifier;

the obtaining unit 601 is further configured to obtain a first read request of a second node for the data object, where the second node is a different computing node in the distributed database system from the first node, and the first read request includes a first identifier pair, and the first identifier pair includes the object identifier and a first to-be-verified version identifier of the data object;

the executing unit 603 is configured to reject the first read request if it is determined that the target version identifier is different from the first to-be-verified version identifier and the data object including the second data is not obtained from the first node.

As a possible implementation manner, the apparatus further includes a verification unit, configured to:

obtaining a second read request of the first node for the data object, wherein the second read request comprises a second identifier pair of the data object, and the second identifier pair comprises the object identifier and a second to-be-verified version identifier of the data object;

determining a third identifier pair stored in association with the data object according to the object identifier, where the third identifier pair includes the object identifier and a first version identifier, and the first version identifier is used to identify a data version of the first data in the data object;

and if the second version identification to be verified is the same as the first version identification, returning the data object corresponding to the first version identification to the first node.

As a possible implementation manner, the verification unit is further configured to:

and if the second version identification to be verified is determined to be different from the first version identification, indicating the first node to update the second version identification to be verified to the first version identification, and retransmitting the reading request aiming at the data object.

As a possible implementation manner, the target version identifier is obtained by increasing or decreasing the first version identifier based on a preset step size.

As a possible implementation manner, the storage unit 602 is configured to:

and storing the target identification pair into metadata corresponding to the data object and storing the target identification pair into user data corresponding to the data object according to the object identification, wherein the metadata is used for describing attribute definition of the user data.

As a possible implementation manner, the apparatus further includes a creating unit configured to:

acquiring a creation request of the first node for the data object;

returning an object identifier generated for the data object to the first node according to the creation request;

acquiring an initial identification pair corresponding to the data object, wherein the initial identification pair comprises the object identification and an initial version identification;

storing the initial identification pair in association with the data object.

As a possible implementation manner, the apparatus further includes a deleting unit, configured to:

acquiring a deletion request of the first node for the data object, wherein the deletion request comprises an identifier pair to be deleted, the identifier pair to be deleted comprises the object identifier and a second version identifier, and the second version identifier is obtained by updating according to the target version identifier;

storing the identifier pair to be deleted in association with the data object;

and deleting the data object, and deleting the identification pair to be deleted after the deletion is finished.

As a possible implementation manner, the deleting unit is further configured to:

acquiring a third read request of the second node for the data object, wherein the third read request comprises a fourth identifier pair, and the fourth identifier pair comprises the object identifier and a third version identifier to be verified;

denying the third read request in response to determining that the second version identification is different from the third to-be-verified version identification and during deletion of the data object.

In the data processing apparatus provided in the embodiment of the application, the distributed database system includes a plurality of computing nodes including a first node and a second node, and if the first node needs to change first data included in a data object into second data, the first node sends a target identifier pair including an object identifier and a target version identifier of the data object, and can search for the data object to be changed in the distributed database system through the object identifier, and specify a data version of the data object changed by the first node through the target version identifier. Since the computing node needs to carry the identifier pair including the version identifier when reading and writing the data object, before acquiring the data object including the second data from the first node, the system stores the target identifier pair in association with the data object in advance, so that the version identifier of the data object in the system is updated from the data version for identifying the first data to the target version identifier for identifying the second data. 
Therefore, if a first read request of another node, for example, a second node, to the data object is obtained during the period when the first node changes the data object, because the first to-be-verified version identifier included in the first identifier pair provided by the second node is not the same as the target version identifier, and the data object including the second data is not obtained from the first node at this time, the system will determine that the data object has not been changed at this time, and the first data in the data object in the system is not the final version at present and cannot be provided to the second node, so that the system will reject the first read request, that is, the data object of the non-final version will not be provided to the second node, thereby avoiding the data synchronization abnormality problem for the data object. Therefore, the second node can not utilize the data object of the non-final version to change data, and the correctness and consistency of the data object in the distributed database system are ensured.

The aforementioned data processing device may be a computer device, which may be a server, and may also be a terminal device, and the computer device provided in the embodiments of the present application will be described below from the perspective of hardware implementation. Fig. 7 is a schematic structural diagram of a server, and fig. 8 is a schematic structural diagram of a terminal device.

Referring to fig. 7, fig. 7 is a schematic diagram of a server 1400 according to an embodiment of the present application, where the server 1400 may have a relatively large difference due to different configurations or performances, and may include one or more Central Processing Units (CPUs) 1422 (e.g., one or more processors) and a memory 1432, one or more storage media 1430 (e.g., one or more mass storage devices) for storing applications 1442 or data 1444. Memory 1432 and storage media 1430, among other things, may be transient or persistent storage. The program stored on storage medium 1430 may include one or more modules (not shown), each of which may include a sequence of instructions operating on a server. Still further, a central processor 1422 may be disposed in communication with storage medium 1430 for executing a series of instruction operations on storage medium 1430 on server 1400.

The server 1400 may also include one or more power supplies 1426, one or more wired or wireless network interfaces 1450, one or more input-output interfaces 1458, and/or one or more operating systems 1441, such as Windows Server, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.

The steps performed by the server in the above embodiments may be based on the server structure shown in fig. 7.

The CPU 1422 is configured to perform the following steps:

acquiring a first reading request of a second node for the data object, wherein the second node is a computing node in the distributed database system different from the first node, the first reading request comprises a first identification pair, and the first identification pair comprises the object identification and a first to-be-verified version identification of the data object;

Optionally, the CPU 1422 may further execute method steps of any specific implementation manner of the data processing method in the embodiment of the present application.

Referring to fig. 8, fig. 8 is a schematic structural diagram of a terminal device according to an embodiment of the present application. Fig. 8 is a block diagram illustrating a partial structure of a smartphone related to the terminal device provided in an embodiment of the present application, where the smartphone includes: a Radio Frequency (RF) circuit 1510, a memory 1520, an input unit 1530, a display unit 1540, a sensor 1550, an audio circuit 1560, a wireless fidelity (WiFi) module 1570, a processor 1580, and a power supply 1590. Those skilled in the art will appreciate that the smartphone structure shown in fig. 8 is not limiting: the smartphone may include more or fewer components than shown, may combine some components, or may use a different arrangement of components.

The following specifically describes each component of the smartphone with reference to fig. 8:

The RF circuit 1510 may be configured to receive and transmit signals during information transmission and reception or during a call. In particular, it receives downlink information from a base station and delivers it to the processor 1580 for processing; in addition, it transmits uplink data to the base station. In general, the RF circuit 1510 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a Low Noise Amplifier (LNA), a duplexer, and the like. In addition, the RF circuit 1510 may also communicate with networks and other devices via wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to Global System for Mobile communication (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), email, Short Message Service (SMS), and the like.

The memory 1520 may be used to store software programs and modules, and the processor 1580 implements the various functional applications and data processing of the smartphone by running the software programs and modules stored in the memory 1520. The memory 1520 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required for at least one function (such as a sound playing function or an image playing function), and the like; the data storage area may store data created through use of the smartphone (such as audio data or a phonebook), and the like. Further, the memory 1520 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage device.

The input unit 1530 may be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the smartphone. Specifically, the input unit 1530 may include a touch panel 1531 and other input devices 1532. The touch panel 1531, also referred to as a touch screen, can collect touch operations performed by a user on or near it (e.g., operations performed on or near the touch panel 1531 with a finger, a stylus, or any other suitable object or accessory) and drive the corresponding connection device according to a preset program. Optionally, the touch panel 1531 may include two parts: a touch detection device and a touch controller. The touch detection device detects the position of the user's touch, detects the signal produced by the touch operation, and transmits the signal to the touch controller. The touch controller receives the touch information from the touch detection device, converts it into touch point coordinates, sends the coordinates to the processor 1580, and can receive and execute commands sent by the processor 1580. In addition, the touch panel 1531 may be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave. Besides the touch panel 1531, the input unit 1530 may include other input devices 1532, which may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys and switch keys), a trackball, a mouse, a joystick, and the like.

The display unit 1540 may be used to display information input by the user, information provided to the user, and the various menus of the smartphone. The display unit 1540 may include a display panel 1541, which may optionally be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like. Further, the touch panel 1531 may cover the display panel 1541. When the touch panel 1531 detects a touch operation on or near it, the touch operation is transmitted to the processor 1580 to determine the type of the touch event, and the processor 1580 then provides a corresponding visual output on the display panel 1541 according to the type of the touch event. Although in fig. 8 the touch panel 1531 and the display panel 1541 are two separate components that implement the input and output functions of the smartphone, in some embodiments the touch panel 1531 and the display panel 1541 may be integrated to implement those functions.

The smartphone may also include at least one sensor 1550, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor and a proximity sensor: the ambient light sensor may adjust the brightness of the display panel 1541 according to the brightness of ambient light, and the proximity sensor may turn off the display panel 1541 and/or the backlight when the smartphone is moved to the ear. As one type of motion sensor, an accelerometer may detect the magnitude of acceleration in each direction (generally along three axes) and, when stationary, the magnitude and direction of gravity. It may be used in applications that recognize the attitude of the smartphone (such as switching between landscape and portrait, related games, and magnetometer attitude calibration) and in vibration-recognition functions (such as a pedometer and tap detection). The smartphone may further be configured with other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, which are not described herein.

The audio circuit 1560, a speaker 1561, and a microphone 1562 may provide an audio interface between the user and the smartphone. The audio circuit 1560 may convert received audio data into an electrical signal and transmit it to the speaker 1561, which converts the electrical signal into a sound signal for output. Conversely, the microphone 1562 converts a collected sound signal into an electrical signal, which the audio circuit 1560 receives and converts into audio data. The audio data is then output to the processor 1580 for processing and either sent through the RF circuit 1510 to, for example, another smartphone, or output to the memory 1520 for further processing.

WiFi is a short-range wireless transmission technology. Through the WiFi module 1570, the smartphone can help a user send and receive e-mail, browse web pages, access streaming media, and the like, providing the user with wireless broadband internet access. Although fig. 8 shows the WiFi module 1570, it is understood that the module is not an essential component of the smartphone and may be omitted as needed without changing the essence of the invention.

The processor 1580 is the control center of the smartphone. It connects the various parts of the entire smartphone through various interfaces and lines, and performs the various functions of the smartphone and processes data by running or executing the software programs and/or modules stored in the memory 1520 and calling the data stored in the memory 1520, thereby monitoring the smartphone as a whole. Optionally, the processor 1580 may include one or more processing units; preferably, the processor 1580 may integrate an application processor, which mainly handles the operating system, user interfaces, application programs, and the like, and a modem processor, which mainly handles wireless communication. It is to be appreciated that the modem processor may alternatively not be integrated into the processor 1580.

The smartphone also includes a power supply 1590 (e.g., a battery) for powering the various components. Preferably, the power supply may be logically connected to the processor 1580 via a power management system, so that charging, discharging, and power consumption are managed through the power management system.

Although not shown, the smart phone may further include a camera, a bluetooth module, and the like, which are not described herein.

In an embodiment of the application, the smartphone includes a memory 1520 that can store program code and transmit the program code to the processor.

The processor 1580 included in the smart phone may execute the data processing method provided in the foregoing embodiments according to the instructions in the program code.

The embodiment of the present application further provides a computer-readable storage medium for storing a computer program, where the computer program is used to execute the data processing method provided by the foregoing embodiment.

Embodiments of the present application also provide a computer program product or computer program comprising computer instructions stored in a computer-readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the data processing method provided in the various alternative implementations of the above aspects.

Those of ordinary skill in the art will understand that all or part of the steps of the method embodiments may be implemented by program instructions running on related hardware. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the method embodiments. The aforementioned storage medium may be at least one of the following media capable of storing program code: read-only memory (ROM), RAM, a magnetic disk, or an optical disk.

It should be noted that, in the present specification, all the embodiments are described in a progressive manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the apparatus and system embodiments, since they are substantially similar to the method embodiments, they are described in a relatively simple manner, and reference may be made to some of the descriptions of the method embodiments for related points. The above-described embodiments of the apparatus and system are merely illustrative, and the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.

The above description is only one specific embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present application should be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims

1. A method of data processing, the method comprising:

obtaining a first read request of a second node for the data object, wherein the second node is a different computing node from the first node in the distributed database system, the first read request comprises a first identifier pair, and the first identifier pair comprises the object identifier and a first to-be-verified version identifier of the data object; the first to-be-verified version identifier represents the data version of the data object that the second node desires to change;

and if it is determined that the target version identification is newer than the first to-be-verified version identification and the data object including the second data has not been acquired from the first node, rejecting the first read request, wherein determining whether the first to-be-verified version identification is the same as the target version identification is used to determine whether the second node has the right to read the data object of the first to-be-verified version, so as to prevent the first node and the second node from modifying the data object of the same version and thereby causing abnormal data synchronization.

2. The method of claim 1, wherein prior to said obtaining the target identification pair used by the first node to change the data object in the distributed database system, the method further comprises:

3. The method of claim 2, further comprising:

4. The method of claim 2, wherein the target version identification is obtained by increasing or decreasing the first version identification based on a preset step size.

5. The method of claim 1, wherein storing the target identification pair in association with the data object according to the object identification comprises:

6. The method according to any one of claims 1-5, further comprising:

acquiring a creation request of the first node for the data object;

storing the initial identification pair in association with the data object.

7. The method according to any one of claims 1-5, further comprising:

storing the identifier to be deleted in association with the data object;

8. The method of claim 7, further comprising:

9. A data processing apparatus, characterized in that the apparatus comprises an obtaining unit, a storage unit and an execution unit;

the obtaining unit is configured to obtain a target identifier pair used by a first node to change a data object in a distributed database system, where the first node is a computing node in the distributed database system, the target identifier pair includes an object identifier and a target version identifier of the data object, and the target version identifier is used to identify the data version in which first data in the data object is changed to second data;

the obtaining unit is further configured to obtain a first read request of a second node for the data object, where the second node is a different computing node in the distributed database system from the first node, the first read request includes a first identifier pair, and the first identifier pair includes the object identifier and a first to-be-verified version identifier of the data object; the first to-be-verified version identifier represents the data version of the data object that the second node desires to change;

the execution unit is configured to reject the first read request if it is determined that the target version identifier is newer than the first to-be-verified version identifier and the data object including the second data has not been obtained from the first node, where determining whether the first to-be-verified version identifier is the same as the target version identifier is used to determine whether the second node has the right to read the data object of the first to-be-verified version, so as to prevent the first node and the second node from modifying the data object of the same version and thereby causing abnormal data synchronization.

10. The apparatus of claim 9, further comprising a verification unit to:

11. The apparatus of claim 9, wherein the storage unit is configured to:

12. The apparatus according to any of claims 9-11, wherein the apparatus further comprises a creating unit configured to:

acquiring a creation request of the first node for the data object;

storing the initial identification pair in association with the data object.

13. The apparatus according to any of claims 9-11, wherein the apparatus further comprises a deletion unit configured to:

storing the identifier to be deleted in association with the data object;

14. A computer device, the device comprising a processor and a memory:

the processor is configured to perform the method of any of claims 1-8 according to instructions in the program code.

15. A computer-readable storage medium, characterized in that the computer-readable storage medium is used to store a computer program for performing the method of any one of claims 1-8.