WO2020119709A1 - Method, device, system, and storage medium for implementing data merge - Google Patents

Method, device, system, and storage medium for implementing data merge

Info

Publication number
WO2020119709A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
copy
copies
log
primary key
Prior art date
Application number
PCT/CN2019/124491
Inventor
司文武
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2020119709A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Definitions

  • The present disclosure relates to the field of distributed data technology, and in particular to a method, device, system, and storage medium for implementing data merge between copies of a distributed database.
  • UDSF: Unstructured Data Storage Function.
  • The 5G core network faces demanding performance requirements, such as multiple services sharing access to the database and extremely low response latency. UDSF network elements are therefore required to provide higher system throughput, greater data storage capacity, and extremely low response latency.
  • DDB: Distributed Database.
  • The access mode of the copies can be divided into master-slave mode and master-master mode.
  • In master-slave mode, all copies are readable and only the elected master copy is writable; in master-master mode, all copies are readable and writable. Therefore, the master-master access mode has higher access performance.
  • However, master-master mode introduces a new problem that must be solved:
  • In master-master mode, the same data may be modified concurrently on multiple copies.
  • When logs are applied between copies, the latest operation time of the modified data is found to differ from the time carried in the log, so the log cannot be applied.
  • The main purpose of the present disclosure is to provide a method, device, system, and storage medium for implementing data merge between copies of a distributed database, so as to solve the technical problem of data update conflicts that arise in a distributed system when logs are discontinuous or under multi-master access, and thereby achieve data consistency between copies.
  • A data merge implementation method provided by the present disclosure is applied to a distributed database system.
  • The method includes: when the distributed database performs copy access in a master-master working mode and data inconsistency among multiple copies is detected, comparing the time of the most recent operation on the data on the different copies; retaining, according to those times, the last update result of the data on each copy; and performing, based on the last update result of the data on each copy, a mutual merge operation on the data between the copies so that the data between the copies is consistent.
  • An embodiment of the present disclosure also proposes a data merge implementation device, including: a comparison module configured to compare, when the distributed database performs copy access in the master-master working mode and data inconsistency among multiple copies is detected, the time of the most recent operation on the data on the different copies; a save module configured to retain the last update result of the data on each copy according to the time of the most recent operation on the data on the different copies; and a merge operation module configured to perform the mutual merge operation on the data between the copies so that the data between the copies is consistent.
  • An embodiment of the present disclosure also provides a data merge implementation system, including: a memory, a processor, and a computer program stored on the memory and executable on the processor, where the computer program, when executed by the processor, implements the steps of the data merge implementation method described above.
  • Embodiments of the present disclosure also provide a computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of the data merge implementation method described above.
  • FIG. 1 is a schematic flowchart of a first embodiment of a method for implementing data merge of the present disclosure
  • FIG. 2 is a schematic flowchart of a second embodiment of a method for implementing data merge of the present disclosure
  • FIG. 3 is a schematic flowchart of a third embodiment of a method for implementing data merge of the present disclosure
  • FIG. 4 is a schematic flowchart of a fourth embodiment of a method for implementing data merge of the present disclosure
  • FIG. 5 is a schematic flowchart of a fifth embodiment of a method for implementing data merge of the present disclosure
  • FIG. 6 is a schematic flowchart of a sixth embodiment of a method for implementing data merge of the present disclosure
  • FIG. 7 is a schematic flowchart of a seventh embodiment of a method for implementing data merge of the present disclosure
  • FIG. 8 is a schematic diagram of a system architecture involved in an operating environment according to an embodiment of the present disclosure.
  • The main solution of the embodiments of the present disclosure is as follows: when the distributed database performs copy access in the master-master working mode and data inconsistency among multiple copies is detected, the time of the most recent operation on the data on the different copies is compared; according to that time, the last update result of the data on each copy is retained; and based on the last update result of the data on each copy, a mutual merge operation is performed on the data between the copies so that the data between the copies is consistent. In this way, under log replication, the technical problem of data update conflicts that arise in a distributed system when logs are discontinuous or under multi-master access is solved.
  • In addition, saving the primary keys of deleted data in a preset primary key storage queue supports the merge processing of delete operations with add and modify operations when a data merge occurs, ensuring the consistency of the data between the copies.
  • When the distributed database accesses copy data in master-master mode, the following situations can arise.
  • A copy's own log buffer overflows, so other copies cannot read a continuous log and subsequent logs cannot be applied, which results in inconsistent data between copies; or the same data is modified concurrently on multiple copies.
  • In the latter case, when logs are applied between copies, the latest operation time of the modified data differs from the time carried in the log, which causes a data update conflict, prevents the log from being applied, and likewise leaves the data between copies inconsistent.
  • The present disclosure provides a solution to the technical problem of data update conflicts caused by log discontinuity or multi-master access in a distributed system, so as to achieve data consistency between copies.
  • The first embodiment of the present disclosure proposes a data merge implementation method.
  • The method is applied to a distributed database system, and the method includes:
  • Step S101: when the distributed database performs copy access in the master-master working mode and data inconsistency among multiple copies is detected, compare the time of the most recent operation on the data on the different copies;
  • The solution of this embodiment relates to replicating data between copies in a distributed database.
  • Log replication is used to ensure data consistency among multiple copies.
  • The application scenario of this embodiment is the master-master working mode of copy access.
  • In the master-master working mode, all copies are readable and writable, so this access mode has higher access performance.
  • An embodiment of the present disclosure proposes a data merging method for resolving conflicts to achieve data consistency between copies.
  • Specifically, when the distributed database performs copy access in the master-master working mode, if data inconsistency among multiple copies is detected, the time of the most recent operation on the data on the different copies is compared.
  • Step S102: according to the time of the most recent operation on the data on the different copies, retain the last update result of the data on each copy;
  • Step S103: based on the last update result of the data on each copy, perform the mutual merge operation on the data between the copies so that the data between the copies remains consistent.
  • That is, the last update result of the data on each copy is retained, and based on it the data merge operation between the copies is performed to make the data between the copies consistent.
  • An example is as follows: when a log buffer overflow makes the log discontinuous, or logs cannot be applied between the copies, a data merge between the copies is initiated; if data updates have occurred on both copies, a bidirectional data merge is initiated.
  • The mutual merge operation on the data between the copies can cover not only the merge of a single record in the database but also the data merge of the entire distributed database system.
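  • Steps S101 to S103 can be illustrated with a minimal Python sketch, assuming each copy is an in-memory dictionary keyed by primary key. The `Record` type, the `op_time` field, and the `merge_copies` function are hypothetical names chosen for illustration and do not appear in the disclosure. Deletions are not handled in this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    value: str
    op_time: int  # assumed: time of the most recent operation on this record


def merge_copies(copy_a: dict, copy_b: dict) -> None:
    """Bidirectional merge: for every key, both copies keep the latest update."""
    for key in set(copy_a) | set(copy_b):
        rec_a, rec_b = copy_a.get(key), copy_b.get(key)
        if rec_a is None:
            winner = rec_b
        elif rec_b is None:
            winner = rec_a
        else:
            # S101/S102: compare the most recent operation times, keep the later one.
            winner = rec_a if rec_a.op_time >= rec_b.op_time else rec_b
        # S103: write the retained result to both copies so they agree.
        copy_a[key] = winner
        copy_b[key] = winner
```

  • On equal timestamps this sketch arbitrarily keeps the record from the first copy; the disclosure does not state a tie-breaking rule.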
  • A second embodiment of the present disclosure proposes a data merge implementation method.
  • Before the step of comparing the time of the most recent operation on the data on the different copies when the distributed database performs copy access in the master-master working mode and data inconsistency among multiple copies is detected, the method also includes:
  • Step S100: when the distributed database performs copy access in the master-master working mode, detect whether the data among multiple copies is inconsistent.
  • This embodiment further includes a solution for detecting whether the data between multiple copies is inconsistent.
  • The following scheme can be adopted to detect whether the data between multiple copies is inconsistent.
  • Specifically, when the distributed database performs copy access in the master-master working mode, if the same data is modified concurrently on multiple copies, then when a log is applied between the copies, it is checked whether the latest operation time of the modified data is consistent with the time carried in the log. If the two are inconsistent, data inconsistency among multiple copies is detected.
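  • The detection in step S100 can be sketched as a timestamp comparison. The `LogEntry` layout below is an assumption that follows the log fields described elsewhere in this document (operation timestamp, type, key, value, and the previous operation timestamp for non-add logs); the names `LogEntry` and `is_inconsistent` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LogEntry:
    op_type: str                        # "add", "modify", or "delete"
    key: str
    value: Optional[str]
    op_time: int                        # timestamp of this operation
    prev_op_time: Optional[int] = None  # previous operation time (non-add logs)


def is_inconsistent(local_op_time: Optional[int], entry: LogEntry) -> bool:
    """True when the time carried in the log disagrees with the local latest operation time."""
    if entry.op_type == "add":
        return False  # add logs carry no previous timestamp to compare
    return local_op_time != entry.prev_op_time
```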
  • a third embodiment of the present disclosure proposes a data merge implementation method. Based on the embodiment shown in FIG. 1 or FIG. 2, the method further includes:
  • Step S104: when data on the current copy is deleted, either upon receiving a delete request or by applying a delete operation log from another copy, the primary key of the deleted data on the current copy and the deletion operation time are saved in a preset primary key storage queue.
  • This embodiment further includes a solution for deleting data on a copy.
  • This embodiment presets a primary key storage queue (DEL_PK_QUE, Deleted data Primary Key Queue), which stores the primary key of the deleted data and the deletion operation time whenever data is deleted by a delete request or by applying a delete operation log.
  • The primary key of the deleted data and the deletion operation time are stored in DEL_PK_QUE.
  • The log contains the timestamp of this operation, the operation type, and the operation key and value; for logs of non-add operations, it also contains the timestamp of the last time the data was operated on.
  • Step 1: copy 1 receives a request to delete the data;
  • Step 2: copy 1 deletes the data from the database;
  • Step 3: copy 1 inserts the primary key of the deleted data and the time of the delete operation into DEL_PK_QUE;
  • Step 4: copy 1 generates a log of the delete operation (the log contains the timestamp of this operation, the operation type, and the operation key and value; for logs of non-add operations, it also contains the timestamp of the last time the data was operated on);
  • Step 5: copy 1 sends a delete operation response to the initiator of the delete request;
  • Step 6: the log of the delete operation on copy 1 is synchronized to copy 2;
  • Step 7: copy 2 applies the delete operation log; the data is deleted only if the data to be deleted exists and the delete operation time carried in the log is greater than the latest operation time of the data;
  • Step 8 the data on
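  • The delete flow above can be sketched as follows, assuming an in-memory copy whose database maps each primary key to a (value, latest operation time) pair. The `Copy` class and its method names are hypothetical, and the response and log transport of steps 5 and 6 are omitted.

```python
class Copy:
    def __init__(self):
        self.db = {}          # primary key -> (value, latest operation time)
        self.del_pk_que = {}  # primary key -> deletion operation time
        self.log = []         # operation logs to synchronize to other copies

    def handle_delete(self, key, now):
        """Steps 1-4: delete locally, record the deletion, and generate a log."""
        if key not in self.db:
            return None
        _, prev_time = self.db.pop(key)
        self.del_pk_que[key] = now
        entry = {"type": "delete", "key": key,
                 "op_time": now, "prev_op_time": prev_time}
        self.log.append(entry)
        return entry

    def apply_delete_log(self, entry):
        """Step 7: delete only if the data exists and the log's delete time is newer."""
        current = self.db.get(entry["key"])
        if current is None or entry["op_time"] <= current[1]:
            return False
        del self.db[entry["key"]]
        # Per step S104, a deletion caused by applying a log is also recorded.
        self.del_pk_que[entry["key"]] = entry["op_time"]
        return True
```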
  • The fourth embodiment of the present disclosure proposes a method for implementing data merge. Based on the embodiment shown in FIG. 3, the method further includes:
  • Step S105: when a request to add data is received and new data is added to the copy, the primary key of the newly added data is used to delete the corresponding queue element from the primary key storage queue.
  • This embodiment further includes an operation of adding new data.
  • The primary key of the newly added data is used to delete any matching queue element from DEL_PK_QUE.
  • Step 1: copy 1 receives a request to add data;
  • Step 2: copy 1 inserts the data into the database;
  • Step 3: copy 1 uses the primary key of the data to delete any matching deletion information from DEL_PK_QUE (for performance reasons, this step can be skipped);
  • Step 4: copy 1 generates the log of this add operation;
  • Step 5: copy 1 sends an add operation response to the initiator of the add request;
  • Step 6: the log of this add operation on copy 1 is synchronized to copy 2;
  • Step 7: copy 2 applies the add operation log: it queries DEL_PK_QUE, and if deletion operation information exists and the deletion operation time is greater than the add operation time carried in the log, the data is not inserted; otherwise, the data is inserted;
  • Step 8: if the data is inserted on copy 2 and deletion operation information for it exists in DEL_PK_QUE, the deletion information of the inserted data is deleted from DEL_PK_QUE (for performance reasons, this step can be skipped).
  • A fifth embodiment of the present disclosure proposes a data merge implementation method.
  • The method further includes: Step S106: when a copy applies an add operation log, the primary key of the data to be added is used to query the primary key storage queue; if the primary key storage queue holds no delete operation record for the data to be added, the data is inserted into the current copy; otherwise, the add operation time carried in the add operation log is compared with the deletion operation time saved in the primary key storage queue, and the data is inserted if the former is greater than the latter and is not inserted if the former is not greater than the latter.
  • This embodiment further includes an operation of applying an add operation log.
  • The primary key of the data to be added is used to query DEL_PK_QUE. If there is no delete operation corresponding to the data to be added, the new data is inserted into the current copy; otherwise, the add operation time carried in the log is compared with the deletion operation time saved in DEL_PK_QUE: if the former is greater than the latter, the new data is inserted; if the former is not greater than the latter, the new data is not inserted.
  • Step 1: copy 1 receives a request to add data;
  • Step 2: copy 1 inserts the data into the database;
  • Step 3: copy 1 uses the primary key of the data to delete any matching deletion information from DEL_PK_QUE (for performance reasons, this step can be skipped);
  • Step 4: copy 1 generates the log of this add operation;
  • Step 5: copy 1 sends an add operation response to the initiator of the add request;
  • Step 6: the log of this add operation on copy 1 is synchronized to copy 2;
  • Step 7: copy 2 applies the add operation log: it queries DEL_PK_QUE, and if deletion operation information exists and the deletion operation time is greater than the add operation time carried in the log, the data is not inserted; otherwise, the data is inserted;
  • Step 8: if the data is inserted on copy 2 and deletion operation information for it exists in DEL_PK_QUE, the deletion information of the inserted data is deleted from DEL_PK_QUE (for performance reasons, this step can be skipped).
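  • Steps S105 and S106 can be sketched with two small functions over plain dictionaries. The function names and the dictionary-based log format are hypothetical. The comparison follows the wording of step S106 (insert only if the add time is strictly greater than the saved deletion time). The case where the key already exists on the receiving copy is not covered by this passage and is not handled here.

```python
def handle_add(db, del_pk_que, key, value, now):
    """Local add (S105): insert the data and drop any stale deletion record."""
    db[key] = (value, now)
    del_pk_que.pop(key, None)
    return {"type": "add", "key": key, "value": value, "op_time": now}


def apply_add_log(db, del_pk_que, entry):
    """Apply an add log from another copy (S106). Returns True if inserted."""
    deleted_at = del_pk_que.get(entry["key"])
    if deleted_at is not None and entry["op_time"] <= deleted_at:
        return False  # the recorded deletion is at least as recent as the add
    db[entry["key"]] = (entry["value"], entry["op_time"])
    # Step 8: the deletion record for the inserted data is no longer needed.
    del_pk_que.pop(entry["key"], None)
    return True
```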
  • a sixth embodiment of the present disclosure proposes a data merge implementation method. Based on the above embodiment shown in FIG. 5, the method further includes:
  • Step S107: when a copy applies a modification operation log, the pre-modification operation time of the data carried in the log is compared with the most recent operation time of the data being modified; if the two are not equal, a data merge operation is initiated for this piece of data.
  • This embodiment further includes a solution for applying a modification operation log.
  • Specifically, when a copy applies a modification operation log and finds that the most recent operation time of the data being modified is not equal to the pre-modification operation time carried in the log, a data merge is initiated for that piece of data.
  • Step 1: a modification operation log is generated on copy 2;
  • Step 2: the modification log on copy 2 is synchronized to copy 1;
  • Step 3: applying the log on copy 1 conflicts: the pre-modification operation time of the data on copy 2, carried in the modification log, is not equal to the last operation time of the data on copy 1;
  • Step 4: copy 1 sends copy 2 a request to read the modified data involved in the log application conflict;
  • Step 5: copy 1 receives the read data from copy 2;
  • Step 6: copy 1 compares the operation times of the data on the two copies and selects the data with the latest operation time as the modification result;
  • Step 7: the current log application conflict is resolved, and copy 1 continues reading the subsequent logs of copy 2 from the current log;
  • Step 8: copy 1 and copy 2 enter the normal log replication and application process.
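  • The conflict handling in steps 3 to 6 can be sketched as below. `read_from_peer` is a hypothetical callback standing in for the read request that copy 1 sends to copy 2; the record layout (value, operation time) and the return strings are likewise assumptions for illustration.

```python
def apply_modify_log(local_db, entry, read_from_peer):
    """Apply a modify log; on a timestamp mismatch, merge this record with the peer's."""
    key = entry["key"]
    local = local_db.get(key)
    local_time = local[1] if local is not None else None
    if local_time == entry["prev_op_time"]:
        # No conflict: the log continues from the state this copy already has.
        local_db[key] = (entry["value"], entry["op_time"])
        return "applied"
    # Step 3: conflict. Steps 4-5: read the peer's current record.
    peer = read_from_peer(key)
    # Step 6: keep the record with the latest operation time.
    candidates = [rec for rec in (local, peer) if rec is not None]
    if candidates:
        local_db[key] = max(candidates, key=lambda rec: rec[1])
    return "merged"
```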
  • The solution of the embodiment of the present disclosure may also perform inter-copy data recovery processing.
  • Step 1: copy 1 sends copy 2 a request to restore all data in DEL_PK_QUE;
  • Step 2: copy 1 receives the response from copy 2 carrying the data read from DEL_PK_QUE;
  • Step 3: copy 1 inserts the delete operation information returned in step 2 into DEL_PK_QUE;
  • Step 4: copy 1 sends copy 2 a request to restore all data, that is, to copy all data from copy 2;
  • Step 5: copy 1 receives several read-data responses sent by copy 2;
  • Step 6: copy 1 inserts the data returned in step 5 into the database;
  • Step 7: after data recovery on copy 1 is completed, copy 1 reads from copy 2 the logs that may have been generated during data recovery;
  • Step 8: copy 1 receives several read-log responses sent by copy 2;
  • Step 9: copy 1 applies the logs returned in step 8 based on the delete operation information stored in DEL_PK_QUE (for operation log application, refer to FIG. 1, FIG. 2, and FIG. 3);
  • Step 10: copy 1 and copy 2 enter the normal log replication and application process.
  • The solution of the embodiment of the present disclosure may also perform data merge processing based on the deletion information in DEL_PK_QUE.
  • Step 1: copy 1 initiates a request to read all data from copy 2;
  • Step 2: copy 1 receives several read-data responses sent by copy 2;
  • Step 3: copy 1 processes the received data according to the following steps:
  • Step 3.1: if the data exists on copy 1, compare the operation times of the data on the two copies and select the data with the latest operation time as the record content;
  • Step 3.2: if the data does not exist on copy 1, query DEL_PK_QUE; if deletion operation information exists and the deletion operation time is greater than the operation time of the data on copy 2, the data is not inserted; otherwise, the data is inserted;
  • Step 3.3: generate an operation log according to the execution results of steps 3.1 and 3.2;
  • Step 4: after the merge processing of all the data on copy 2 is completed, copy 1 reads from copy 2 the logs generated during the data merge;
  • Step 5: copy 1 receives several read-log responses sent by copy 2;
  • Step 6: copy 1 applies the log returned in Step 3.3 based on the delete operation information stored in DEL_PK_QUE;
  • Step 7: copy 1 and copy 2 enter the normal log replication and application process.
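  • Steps 3.1 to 3.3 can be sketched as a single pass over the records read from copy 2. The function name, the dictionary-based record and log formats, and the choice to emit a log only when the local copy actually changes are assumptions made for illustration; the disclosure does not specify the contents of the generated log.

```python
def merge_from_peer(local_db, del_pk_que, peer_records):
    """Merge records read from another copy and return the generated operation logs."""
    logs = []
    for key, (value, op_time) in peer_records.items():
        local = local_db.get(key)
        if local is not None:
            # Step 3.1: keep whichever record was operated on most recently.
            if op_time > local[1]:
                local_db[key] = (value, op_time)
                logs.append({"type": "modify", "key": key, "value": value,
                             "op_time": op_time, "prev_op_time": local[1]})
        else:
            # Step 3.2: a more recent recorded deletion blocks the insert.
            deleted_at = del_pk_que.get(key)
            if deleted_at is not None and deleted_at > op_time:
                continue
            local_db[key] = (value, op_time)
            # Step 3.3: record the change that was made.
            logs.append({"type": "add", "key": key, "value": value,
                         "op_time": op_time})
    return logs
```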
  • The solution of the embodiment of the present disclosure can also implement the insertion processing of DEL_PK_QUE.
  • The insertion process of DEL_PK_QUE can be exemplified as follows:
  • Step 1: insert delete operation information (primary key and delete operation time) into DEL_PK_QUE;
  • Step 2: perform the insertion according to the following steps;
  • Step 2.1: if there is free space in the cache area that stores the primary key and operation time information, allocate space to store the primary key and the delete operation time;
  • Step 2.2.
  • Step 2.3: if the delete operation information being inserted does not exist in DEL_PK_QUE, insert it into DEL_PK_QUE directly; if it already exists in DEL_PK_QUE, update the delete operation time of the corresponding primary key.
  • a seventh embodiment of the present disclosure proposes a data merge implementation method. Based on the embodiment shown in FIG. 6 above, the method further includes:
  • Step S108: if it is detected that the cache space for storing the primary key and the deletion operation time is used up, release the space occupied by the queue element with the oldest deletion operation time in the primary key storage queue.
  • Saving the primary keys of deleted data in the preset primary key storage queue supports the merge processing of delete operations with add and modify operations when a data merge occurs, ensuring the consistency of the data between the copies.
  • When the cache space for storing primary keys and operation times is used up, the space occupied by the queue element with the oldest deletion operation time saved in DEL_PK_QUE is released to save storage resources.
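  • The insertion and space-release behavior of DEL_PK_QUE can be sketched as a bounded map. The `DelPkQue` class, its `capacity` parameter, and the dictionary-based storage are hypothetical; the disclosure does not specify the underlying data structure. This sketch checks for an existing primary key before releasing space, so that an update never evicts another element.

```python
class DelPkQue:
    """Bounded store of primary key -> deletion operation time."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.entries = {}

    def insert(self, primary_key, delete_time):
        if primary_key in self.entries:
            # Already present: update the deletion time of that primary key.
            self.entries[primary_key] = delete_time
            return
        if len(self.entries) >= self.capacity:
            # S108: space is used up, release the element with the oldest deletion time.
            oldest = min(self.entries, key=self.entries.get)
            del self.entries[oldest]
        self.entries[primary_key] = delete_time
```

  • The linear scan for the oldest element keeps the sketch short; a heap or time-ordered list would avoid the scan on every eviction.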
  • An embodiment of the present disclosure also provides a data merge implementation device, including: a comparison module configured to compare, when the distributed database performs copy access in the master-master working mode and data inconsistency among multiple copies is detected, the time of the most recent operation on the data on the different copies; a save module configured to retain the last update result of the data on each copy according to the time of the most recent operation on the data on the different copies; and a merge operation module configured to perform, based on the last update result of the data on each copy, the merge operation on the data between the copies so that the data between the copies is consistent.
  • The merge operation module is also configured to save, when data on the current copy is deleted upon receiving a delete request or by applying a delete operation log from another copy, the primary key of the deleted data on the current copy and the deletion operation time in the preset primary key storage queue.
  • The merge operation module is also configured to delete, when a request to add data is received and new data is added to the copy, the corresponding queue element from the primary key storage queue using the primary key of the newly added data.
  • The merge operation module is also configured to query the primary key storage queue using the primary key of the data to be added when a copy applies an add operation log; if the primary key storage queue holds no delete operation record for the data to be added, the data is inserted into the current copy; otherwise, the add operation time carried in the add operation log is compared with the deletion operation time saved in the primary key storage queue: if the former is greater than the latter, the data is inserted; if the former is not greater than the latter, the data is not inserted.
  • The merge operation module is also configured to compare, when a copy applies a modification operation log, the pre-modification operation time of the data carried in the log with the most recent operation time of the data being modified; if the two are not equal, a data merge operation is initiated for that piece of data.
  • The merge operation module is further configured to release the space occupied by the queue element with the oldest deletion operation time in the primary key storage queue if it is detected that the cache space storing the primary key and the deletion operation time is used up.
  • In this embodiment, when the distributed database performs copy access in the master-master working mode and data inconsistency among multiple copies is detected, the time of the most recent operation on the data on the different copies is compared; according to that time, the last update result of the data on each copy is retained; and based on the last update result of the data on each copy, the mutual merge operation is performed on the data between the copies so that the data between the copies is consistent. In this way, under log replication, the technical problem of data update conflicts that arise in a distributed system when logs are discontinuous or under multi-master access is solved.
  • In addition, saving the primary keys of deleted data in the preset primary key storage queue supports the merge processing of delete operations with add and modify operations when a data merge occurs, ensuring the consistency of the data between the copies.
  • An embodiment of the present disclosure also provides a data merge implementation system, including: a memory, a processor, and a computer program stored on the memory and executable on the processor, where the computer program, when executed by the processor, implements the steps of the data merge implementation method described above.
  • The system in this embodiment may include a processor 1001 (such as a CPU), a network interface 1004, a user interface 1003, a memory 1005, and a communication bus 1002.
  • The communication bus 1002 is used to implement connection and communication between these components.
  • The user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard), and may optionally further include a standard wired interface and a wireless interface.
  • The network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface).
  • The memory 1005 may be a high-speed RAM memory or a non-volatile memory, such as a disk memory.
  • The memory 1005 may optionally be a storage device independent of the foregoing processor 1001.
  • The structure shown in FIG. 8 does not constitute a limitation on the system, which may include more or fewer components than those illustrated, combine certain components, or arrange the components differently.
  • The memory 1005, as a computer storage medium, may include an operating system, a network communication module, a user interface module, and a data merge implementation program.
  • The network interface 1004 is mainly used to connect to a network server and perform data communication with the network server;
  • the user interface 1003 is mainly used to connect to a client and perform data communication with the client;
  • the processor 1001 may be used to call the data merge implementation program stored in the memory 1005 and perform the following operations: when the distributed database performs copy access in the master-master working mode and data inconsistency among multiple copies is detected, compare the time of the most recent operation on the data on the different copies; according to that time, retain the last update result of the data on each copy; and based on the last update result of the data on each copy, perform the mutual merge operation on the data between the copies so that the data between the copies remains consistent.
  • The processor 1001 may be used to call the data merge implementation program stored in the memory 1005 and also perform the following operation: when data on the current copy is deleted upon receiving a delete request or by applying a delete operation log from another copy, save the primary key of the deleted data on the current copy and the deletion operation time in the preset primary key storage queue.
  • The processor 1001 may be used to call the data merge implementation program stored in the memory 1005 and also perform the following operation: when a request to add data is received and new data is added to the copy, use the primary key of the newly added data to delete the corresponding queue element from the primary key storage queue.
  • The processor 1001 may be used to call the data merge implementation program stored in the memory 1005 and also perform the following operations: when a copy applies an add operation log, use the primary key of the data to be added to query the primary key storage queue; if the primary key storage queue holds no delete operation record for the data to be added, insert the data into the current copy; otherwise, compare the add operation time carried in the add operation log with the deletion operation time saved in the primary key storage queue: if the former is greater than the latter, the data is inserted; if the former is not greater than the latter, the data is not inserted.
  • The processor 1001 may be used to call the data merge implementation program stored in the memory 1005 and also perform the following operations: when a copy applies a modification operation log, compare the pre-modification operation time of the data carried in the log with the latest operation time of the data being modified; if the latest operation time of the data being modified is not equal to the operation time carried in the modification operation log, initiate a data merge operation for that piece of data.
  • The processor 1001 may be used to call the data merge implementation program stored in the memory 1005 and also perform the following operations: perform recovery processing on data between copies, or insert information into the primary key storage queue on the copy.
  • The processor 1001 may be used to call the data merge implementation program stored in the memory 1005 and also perform the following operation: if it is detected that the cache space for storing the primary key and the deletion operation time is used up, release the space occupied by the queue element with the oldest deletion operation time in the primary key storage queue.
  • An embodiment of the present disclosure also provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of the data merge implementation method described above.
  • In the data merge implementation method, device, system, and storage medium proposed in the embodiments of the present disclosure, when the distributed database performs copy access in the master-master working mode and data inconsistency among multiple copies is detected, the time of the most recent operation on the data on the different copies is compared; according to that time, the last update result of the data on each copy is retained; and based on the last update result of the data on each copy, the merge operation is performed on the data between the copies so that the data between the copies remains consistent. In this way, under log replication, the technical problem of data update conflicts caused by log discontinuity or multi-master access in a distributed system is solved. In addition, saving the primary keys of deleted data in the preset primary key storage queue supports the merge processing of delete operations with add and modify operations when a data merge occurs, ensuring the consistency of the data between the copies.
  • the present disclosure solves the technical problem of data update conflicts when logs cannot be connected or multi-master accesses in a distributed system.
  • the merge processing of the deletion operation and the addition and modification operations when the data merge occurs is solved.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

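A method, apparatus, and system for implementing data merging, and a storage medium. The method comprises: when a distributed database performs replica access in master-master working mode and data inconsistency among multiple replicas is detected, comparing the most recent operation times of the data on the different replicas (S101); retaining, according to the most recent operation times of the data on the different replicas, the last update result of the data on each replica (S102); and, based on the last update result of the data on each replica, performing mutual merge operations on the data between the replicas so that the data between the replicas remains consistent (S103).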

Description

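Data merge implementation method, apparatus, system, and storage medium
The present disclosure claims priority to Chinese patent application CN201811510709.0, filed on December 11, 2018 and entitled "Data merge implementation method, apparatus, system, and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of distributed data technology, and in particular to a method, apparatus, and system for implementing data merging between replicas of a distributed database, and a storage medium.
Background
As the database network element that stores unstructured data in the 5G core network, the UDSF (Unstructured Data Storage network function) faces many performance requirements, such as shared access to the database by multiple services and extremely low response latency. The UDSF network element is therefore required to offer higher system throughput, larger data storage capacity, and extremely low response latency.
Distributed database (DDB) technology is a necessary technical means of meeting the 5G core network's high-performance data access requirements. In a distributed database system, the replica access mode can be master-standby or master-master. In master-standby mode, all replicas can be read, but writes are performed only on the elected master replica; in master-master mode, all replicas can be both read and written. Master-master access therefore offers higher access performance, but it introduces new scenarios that must be handled:
In master-master mode, when a replica's own log buffer overflows, other replicas cannot read a continuous log and therefore cannot apply subsequent logs, which makes the data inconsistent between replicas;
In master-master mode, when the same data is modified concurrently on multiple replicas, a replica applying another replica's log finds that the most recent operation time of the modified data differs from the time carried in the log, and therefore cannot apply the log.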
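Summary
The main purpose of the present disclosure is to provide a method, apparatus, and system for implementing data merging between replicas of a distributed database, and a storage medium, so as to resolve the technical problem of data update conflicts in a distributed system caused by log discontinuity or multi-master access, and to achieve data consistency between replicas.
To achieve this purpose, the present disclosure provides a data merge implementation method applied to a distributed database system. The method comprises: when the distributed database performs replica access in master-master working mode and data inconsistency among multiple replicas is detected, comparing the most recent operation times of the data on the different replicas; retaining, according to those times, the last update result of the data on each replica; and, based on the last update result of the data on each replica, performing mutual merge operations on the data between the replicas so that the data between the replicas remains consistent.
An embodiment of the present disclosure further provides a data merge implementation apparatus, comprising: a comparison module, configured to compare the most recent operation times of the data on different replicas when the distributed database performs replica access in master-master working mode and data inconsistency among multiple replicas is detected; a saving module, configured to retain, according to those times, the last update result of the data on each replica; and a merge operation module, configured to perform, based on the last update result of the data on each replica, mutual merge operations on the data between the replicas so that the data between the replicas remains consistent.
An embodiment of the present disclosure further provides a data merge implementation system, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the data merge implementation method described above.
An embodiment of the present disclosure further provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of the data merge implementation method described above.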
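Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a first embodiment of the data merge implementation method of the present disclosure;
FIG. 2 is a schematic flowchart of a second embodiment of the data merge implementation method of the present disclosure;
FIG. 3 is a schematic flowchart of a third embodiment of the data merge implementation method of the present disclosure;
FIG. 4 is a schematic flowchart of a fourth embodiment of the data merge implementation method of the present disclosure;
FIG. 5 is a schematic flowchart of a fifth embodiment of the data merge implementation method of the present disclosure;
FIG. 6 is a schematic flowchart of a sixth embodiment of the data merge implementation method of the present disclosure;
FIG. 7 is a schematic flowchart of a seventh embodiment of the data merge implementation method of the present disclosure;
FIG. 8 is a schematic diagram of the system architecture of the operating environment involved in embodiments of the present disclosure.
The realization of the purpose, the functional features, and the advantages of the present disclosure are further described below with reference to the embodiments and the accompanying drawings.
To make the technical solutions of the present disclosure clearer, they are described in further detail below with reference to the accompanying drawings.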
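Detailed Description
It should be understood that the embodiments described here are only intended to explain the present disclosure and are not intended to limit it.
The main solution of the embodiments of the present disclosure is as follows: when the distributed database performs replica access in master-master working mode and data inconsistency among multiple replicas is detected, the most recent operation times of the data on the different replicas are compared; according to those times, the last update result of the data on each replica is retained; and, based on those last update results, the data is merged between the replicas so that it remains consistent. Log replication thereby resolves the technical problem of data update conflicts in a distributed system caused by log discontinuity or multi-master access. In addition, saving the primary keys of deleted data in the preset primary key storage queue handles the merging of delete operations with add and modify operations when a data merge occurs, ensuring data consistency between replicas.
In some cases, when a distributed database accesses replica data in master-master mode, a replica's own log buffer may overflow, so that other replicas cannot read a continuous log and cannot apply subsequent logs, which makes the data inconsistent between replicas. Alternatively, the same data may be modified concurrently on multiple replicas; when logs are then applied between replicas, the most recent operation time of the modified data differs from the time carried in the log, which produces a data update conflict that prevents the log from being applied and likewise makes the data inconsistent between replicas.
The present disclosure provides a solution that resolves the technical problem of data update conflicts in a distributed system caused by log discontinuity or multi-master access, and achieves data consistency between replicas.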
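In one embodiment, as shown in FIG. 1, a first embodiment of the present disclosure provides a data merge implementation method applied to a distributed database system, the method comprising:
Step S101: when the distributed database performs replica access in master-master working mode and data inconsistency among multiple replicas is detected, compare the most recent operation times of the data on the different replicas;
The solution of this embodiment concerns the technique of replicating data between replicas in a distributed database. In a distributed system, log replication is used to keep the data consistent across multiple replicas.
The application scenario of this embodiment is the master-master working mode of replica access, in which all replicas can be both read and written. Master-master access therefore offers higher access performance.
In some cases, when replicas are accessed in master-master working mode, log discontinuity or multi-master access leads to data update conflicts and makes the data inconsistent between replicas. The embodiments of the present disclosure propose a conflict-resolving data merge method that achieves data consistency between replicas.
In one embodiment, when the distributed database performs replica access in master-master working mode, if data inconsistency among multiple replicas is detected, the most recent operation times of the data on the different replicas are compared.
Step S102: according to the most recent operation times of the data on the different replicas, retain the last update result of the data on each replica;
Step S103: based on the last update result of the data on each replica, perform mutual merge operations on the data between the replicas so that the data between the replicas remains consistent.
That is, the last update result of the data on each replica is retained according to the most recent operation times of the data on the different replicas, and the data is then merged between the replicas on the basis of those last update results so that the replicas remain consistent.
For example: when a log buffer overflow makes the log discontinuous, or when logs cannot be applied between replicas, a data merge between the replicas is initiated; if data updates have occurred on both replicas, a two-way data merge is initiated.
The mutual merge operation on data between replicas may merge a single record in the database, or it may merge the data of the entire distributed database system.
When a distributed database accesses replica data in master-master mode, a replica's own log buffer may overflow, so that other replicas cannot read a continuous log and cannot apply subsequent logs, which makes the data inconsistent between replicas. Alternatively, the same data may be modified concurrently on multiple replicas; when logs are then applied between replicas, the most recent operation time of the modified data differs from the time carried in the log, which produces a data update conflict that prevents the log from being applied and likewise makes the data inconsistent between replicas.
With the above scheme, when either scenario occurs, this embodiment compares the most recent operation times of the data on the different replicas and retains the last update result, so that the final result of concurrent operations on multiple replicas is not lost; mutual merging of data between the replicas then keeps the replicas consistent.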
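The comparison in steps S101 to S103 can be sketched as a last-update-wins merge. This is a minimal illustration, not the implementation described in the disclosure: the `Record` type, the integer `op_time` field, and both function names are assumptions made here, and deletions are ignored at this stage because they are handled separately through the primary key storage queue in later embodiments.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Record:
    value: str
    op_time: int  # time of the most recent operation on this record (assumed integer)


def merge_record(local: Optional[Record], remote: Optional[Record]) -> Optional[Record]:
    """Keep whichever version was updated last; on a tie, keep the local one."""
    if local is None:
        return remote
    if remote is None:
        return local
    return remote if remote.op_time > local.op_time else local


def merge_replicas(a: Dict[str, Record], b: Dict[str, Record]) -> Tuple[Dict[str, Record], Dict[str, Record]]:
    """Two-way merge: afterwards both replicas hold the same, last-updated records."""
    merged = {key: merge_record(a.get(key), b.get(key)) for key in a.keys() | b.keys()}
    return dict(merged), dict(merged)
```

Because the merged map is computed once and copied to both sides, a timestamp tie cannot leave the two replicas with different values.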
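As shown in FIG. 2, a second embodiment of the present disclosure provides a data merge implementation method. Based on the embodiment shown in FIG. 1, before step S101 (comparing the most recent operation times of the data on the different replicas when the distributed database performs replica access in master-master working mode and data inconsistency among multiple replicas is detected), the method further comprises:
Step S100: when the distributed database performs replica access in master-master working mode, detect whether the data is inconsistent among multiple replicas.
Compared with the embodiment shown in FIG. 1, this embodiment additionally includes a scheme for detecting whether the data is inconsistent among multiple replicas. Depending on the application scenario, the following schemes may be used.
As one implementation, when the distributed database performs replica access in master-master working mode, detect whether a replica's own log buffer has overflowed so that other replicas cannot read a continuous log;
If so, data inconsistency among multiple replicas is detected.
As another implementation, when the distributed database performs replica access in master-master working mode, if the same data is modified concurrently on multiple replicas, then when logs are applied between replicas, check whether the most recent operation time of the modified data matches the time carried in the log; if not, data inconsistency among multiple replicas is detected.
With the above scheme, this embodiment detects whether the data is inconsistent among multiple replicas when the distributed database performs replica access in master-master working mode. When inconsistency is detected, it compares the most recent operation times of the data on the different replicas; retains, according to those times, the last update result of the data on each replica; and, based on those last update results, merges the data between the replicas so that they remain consistent. Log replication thereby resolves the technical problem of data update conflicts in a distributed system caused by log discontinuity or multi-master access.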
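The two detection conditions of step S100 can be sketched as two small predicates. The log sequence number `seq` and the notion of an "oldest sequence number still in the peer's buffer" are assumptions introduced here to make a gap observable; the disclosure only states that a continuous log can no longer be read. The field names are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LogEntry:
    seq: int                     # hypothetical log sequence number
    key: str                     # primary key of the operated record
    op_time: int                 # timestamp of this operation
    prev_op_time: Optional[int]  # last-operation time of the record before this
                                 # operation; None for add operations


def log_gap(next_expected_seq: int, oldest_available_seq: int) -> bool:
    """True if the peer's buffer has already overwritten entries we still need."""
    return oldest_available_seq > next_expected_seq


def timestamp_conflict(local_op_time: Optional[int], entry: LogEntry) -> bool:
    """True if the local record's last-operation time differs from the one the log expects."""
    return entry.prev_op_time is not None and local_op_time != entry.prev_op_time
```

Either predicate returning `True` would correspond to "data inconsistency among multiple replicas is detected" and would trigger the merge of step S101 onward.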
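As shown in FIG. 3, a third embodiment of the present disclosure provides a data merge implementation method. Based on the embodiment shown in FIG. 1 or FIG. 2, the method further comprises:
Step S104: upon receiving a delete request, or upon applying a delete operation log from another replica, so as to delete data on the current replica, save the primary key of the deleted data on the current replica and the delete operation time in the preset primary key storage queue.
Compared with the embodiment shown in FIG. 1 or FIG. 2, this embodiment additionally includes a scheme for deleting replica data.
In one embodiment, a primary key storage queue (DEL_PK_QUE, Deleted data Primary Key Queue) is set up in advance. It stores the primary key and the delete operation time of data that has been deleted because of a delete request or because a delete operation log was applied.
In this embodiment, when a delete request is received, or a delete operation log from another replica is applied, so as to delete data on a replica, the primary key of the deleted data and the delete operation time are saved in DEL_PK_QUE.
A log contains the operation timestamp, the operation type, and the key and value operated on; a log for any operation other than an add also contains the timestamp of the most recent previous operation on the data.
Taking two replicas as an example (Replica_1 as replica 1, Replica_2 as replica 2), deleting replica data or applying a delete operation log, and storing the primary key and operation time of the data, proceeds as follows:
Step 1. Replica 1 receives a request to delete data;
Step 2. Replica 1 deletes the data from the database;
Step 3. Replica 1 inserts the primary key of the deleted data and the delete operation time into DEL_PK_QUE;
Step 4. Replica 1 generates the log for this delete operation (the log contains the timestamp of this operation, the operation type, and the key and value operated on; a log for any operation other than an add also contains the timestamp of the most recent previous operation on the data);
Step 5. Replica 1 sends a delete response to the initiator of the delete request;
Step 6. The log of this delete operation on replica 1 is synchronized to replica 2;
Step 7. Replica 2 applies the delete operation log; the data is deleted only if the data to be deleted exists and the delete operation time carried in the log is greater than the most recent operation time of the data;
Step 8. If the data on replica 2 is deleted, the primary key of the deleted data and the delete operation time are inserted into DEL_PK_QUE.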
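The delete walkthrough above can be sketched with a small in-memory replica. This is an illustrative model only: the `Replica` class, its dictionary-based storage, and the log-entry keys (`op`, `key`, `op_time`, `prev_op_time`) are names assumed here, and log transport between replicas is reduced to passing the entry directly.

```python
class Replica:
    """Toy replica: data, a tombstone map standing in for DEL_PK_QUE, and a log."""

    def __init__(self):
        self.data = {}        # primary key -> (value, last operation time)
        self.del_pk_que = {}  # primary key -> delete operation time
        self.log = []         # outgoing operation log entries

    def delete(self, key, now):
        """Handle a client delete request (steps 1 to 5)."""
        if key not in self.data:
            return False
        _, prev_time = self.data.pop(key)            # step 2: remove the record
        self.del_pk_que[key] = now                   # step 3: remember the deletion
        self.log.append({"op": "delete", "key": key,
                         "op_time": now, "prev_op_time": prev_time})  # step 4
        return True                                  # step 5: respond to the requester

    def apply_delete_log(self, entry):
        """Apply a peer's delete log (steps 7 and 8)."""
        current = self.data.get(entry["key"])
        # Delete only if the record exists and the delete is newer than it.
        if current is None or entry["op_time"] <= current[1]:
            return False
        del self.data[entry["key"]]
        self.del_pk_que[entry["key"]] = entry["op_time"]
        return True
```

A delete log that is older than the local record is ignored, which is what lets a later local update survive an earlier remote delete.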
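With the above scheme, when the distributed database performs replica access in master-master working mode and data inconsistency among multiple replicas is detected, this embodiment compares the most recent operation times of the data on the different replicas; retains, according to those times, the last update result of the data on each replica; and, based on those last update results, merges the data between the replicas so that they remain consistent. Log replication thereby resolves the technical problem of data update conflicts in a distributed system caused by log discontinuity or multi-master access. In addition, saving the primary keys of deleted data in the preset primary key storage queue handles the merging of delete operations with other operations when a data merge occurs, ensuring data consistency between replicas.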
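As shown in FIG. 4, a fourth embodiment of the present disclosure provides a data merge implementation method. Based on the embodiment shown in FIG. 3, the method further comprises:
Step S105: upon receiving an add-data request to add new data on a replica, use the primary key of the new data to delete the corresponding queue element from the primary key storage queue.
Compared with the embodiment shown in FIG. 3, this embodiment additionally includes the operation of adding new data.
In one embodiment, upon receiving an add-data request to add new data on a replica, the primary key of the new data is used to delete any matching queue element from DEL_PK_QUE.
Again taking two replicas as an example (Replica_1 as replica 1, Replica_2 as replica 2), adding data on a replica, applying an add operation log, and deleting from and querying DEL_PK_QUE proceed as follows:
Step 1. Replica 1 receives an add-data request;
Step 2. Replica 1 inserts the data into the database;
Step 3. Replica 1 uses the primary key of the data to delete any existing delete information from DEL_PK_QUE (for performance reasons, this step may be skipped);
Step 4. Replica 1 generates the log for this add operation;
Step 5. Replica 1 sends an add response to the initiator of the add request;
Step 6. The log of this add operation on replica 1 is synchronized to replica 2;
Step 7. Replica 2 applies the add operation log: it queries DEL_PK_QUE and, if delete information exists and the delete operation time is greater than the add operation time carried in the log, does not insert the data; in all other cases it inserts the data;
Step 8. If the data is inserted on replica 2 and DEL_PK_QUE holds delete information for it, that delete information is removed from DEL_PK_QUE (for performance reasons, this step may be skipped).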
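Steps 1 to 5 of this walkthrough, the local handling of an add request, can be sketched as a single function. The function name, the tuple layout of stored records, and the log-entry keys are assumptions made for illustration; the disclosure does not prescribe a data layout.

```python
def handle_add_request(data, del_pk_que, log, key, value, now):
    """Handle a client add request on the local replica (steps 1 to 5)."""
    data[key] = (value, now)       # step 2: insert the record with its operation time
    del_pk_que.pop(key, None)      # step 3: drop any stale delete information (optional)
    log.append({"op": "add", "key": key,
                "value": value, "op_time": now})  # step 4: log entry for replication
    return "ok"                    # step 5: response to the requester
```

Step 3 is marked optional in the text for performance reasons; skipping it appears safe because applying an add log compares the add time against any stored delete time, so a stale entry only matters until it is overwritten or evicted.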
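With the above scheme, when the distributed database performs replica access in master-master working mode and data inconsistency among multiple replicas is detected, this embodiment compares the most recent operation times of the data on the different replicas; retains, according to those times, the last update result of the data on each replica; and, based on those last update results, merges the data between the replicas so that they remain consistent. Log replication thereby resolves the technical problem of data update conflicts in a distributed system caused by log discontinuity or multi-master access. In addition, saving the primary keys of deleted data in the preset primary key storage queue handles the merging of delete operations with add operations when a data merge occurs, ensuring data consistency between replicas.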
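As shown in FIG. 5, a fifth embodiment of the present disclosure provides a data merge implementation method. Based on the embodiment shown in FIG. 4, the method further comprises step S106: when a replica applies an add operation log, query the primary key storage queue using the primary key of the data to be added; if the queue holds no delete operation record for that data, insert the data into the current replica; otherwise, compare the add operation time carried in the applied add operation log with the delete operation time saved in the queue, and insert the data if the former is greater than the latter, or do not insert it if the former is not greater than the latter.
Compared with the preceding embodiments, this embodiment additionally includes the operation of applying an add operation log.
In one embodiment, when a replica applies an add operation log, it queries DEL_PK_QUE using the primary key of the data to be added. If no delete operation exists for that data, the new data is inserted into the current replica; otherwise, the add operation time carried in the log is compared with the delete operation time saved in DEL_PK_QUE, and the new data is inserted if the former is greater than the latter, or not inserted if the former is not greater than the latter.
Again taking two replicas as an example (Replica_1 as replica 1, Replica_2 as replica 2), adding data on a replica, applying an add operation log, and deleting from and querying DEL_PK_QUE proceed as follows:
Step 1. Replica 1 receives an add-data request;
Step 2. Replica 1 inserts the data into the database;
Step 3. Replica 1 uses the primary key of the data to delete any existing delete information from DEL_PK_QUE (for performance reasons, this step may be skipped);
Step 4. Replica 1 generates the log for this add operation;
Step 5. Replica 1 sends an add response to the initiator of the add request;
Step 6. The log of this add operation on replica 1 is synchronized to replica 2;
Step 7. Replica 2 applies the add operation log: it queries DEL_PK_QUE and, if delete information exists and the delete operation time is greater than the add operation time carried in the log, does not insert the data; in all other cases it inserts the data;
Step 8. If the data is inserted on replica 2 and DEL_PK_QUE holds delete information for it, that delete information is removed from DEL_PK_QUE (for performance reasons, this step may be skipped).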
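The decision made when a replica applies an add log (step S106, steps 7 and 8 above) can be sketched as follows. The function and parameter names are assumptions for illustration. The sketch follows the wording of step S106, under which an add whose time equals the stored delete time is not inserted; the case where the record already exists locally is not covered by the text and is not handled here.

```python
def apply_add_log(data, del_pk_que, key, value, add_time):
    """Apply a peer's add log, consulting the stored delete information first."""
    deleted_at = del_pk_que.get(key)
    if deleted_at is not None and add_time <= deleted_at:
        return False                 # the delete is not older than the add: skip
    data[key] = (value, add_time)    # no tombstone, or the add is newer: insert
    del_pk_que.pop(key, None)        # step 8: optional cleanup of the delete information
    return True
```

Without the stored delete time, a replica that had already deleted the record could not tell a late-arriving add log from a genuinely new record.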
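With the above scheme, when the distributed database performs replica access in master-master working mode and data inconsistency among multiple replicas is detected, this embodiment compares the most recent operation times of the data on the different replicas; retains, according to those times, the last update result of the data on each replica; and, based on those last update results, merges the data between the replicas so that they remain consistent. Log replication thereby resolves the technical problem of data update conflicts in a distributed system caused by log discontinuity or multi-master access. In addition, saving the primary keys of deleted data in the preset primary key storage queue handles the merging of delete operations with add and modify operations when a data merge occurs, ensuring data consistency between replicas.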
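As shown in FIG. 6, a sixth embodiment of the present disclosure provides a data merge implementation method. Based on the embodiment shown in FIG. 5, the method further comprises:
Step S107: when a replica applies a modify operation log, compare the operation time of the data carried in the applied modify operation log with the most recent operation time of the data being modified; if the two are not equal, initiate a data merge operation for that record.
Compared with the embodiment shown in FIG. 5, this embodiment additionally includes a scheme for applying a modify operation log.
In one embodiment, when a replica applies a modify operation log and finds that the most recent operation time of the data being modified is not equal to the operation time, carried in the log, that the data had before the modification, a data merge is initiated for that record.
Again taking two replicas as an example (Replica_1 as replica 1, Replica_2 as replica 2), applying a modify operation log on a replica and triggering a merge of a single record proceed as follows:
Step 1. A modify operation log is generated on replica 2;
Step 2. The modify log on replica 2 is synchronized to replica 1;
Step 3. Applying the log on replica 1 produces a conflict: the pre-modification operation time of the data on replica 2, as carried in the modify log, is not equal to the most recent operation time of that data on replica 1;
Step 4. Replica 1 asks replica 2 for the modified data involved in the log application conflict;
Step 5. Replica 1 receives the data read from replica 2;
Step 6. The operation times of the data on the two replicas are compared, and the data with the latest operation time is chosen as the modification result;
Step 7. The current log application conflict is resolved, and replica 1 continues reading replica 2's subsequent logs from the current log;
Step 8. Replica 1 and replica 2 enter the normal log replication and application flow.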
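The conflict check and single-record merge in this walkthrough can be sketched as below. The `fetch_remote` callback stands in for reading the record from the peer replica (steps 4 and 5) and is an assumption of this sketch, as are the log-entry keys and the return strings. Treating a locally missing record as a conflict is also an assumption; the text does not address that case.

```python
def apply_modify_log(data, entry, fetch_remote):
    """Apply a peer's modify log; on a timestamp mismatch, merge that one record."""
    key = entry["key"]
    local = data.get(key)
    if local is not None and local[1] == entry["prev_op_time"]:
        # No conflict: the peer modified exactly the version we hold.
        data[key] = (entry["value"], entry["op_time"])
        return "applied"
    # Conflict (step 3): read the peer's current version and keep the newer one.
    remote = fetch_remote(key)                        # steps 4 and 5
    if remote is not None and (local is None or remote[1] > local[1]):
        data[key] = remote                            # step 6: latest operation time wins
    return "merged"
```

After the merge, the replica resumes reading the peer's log from the current entry, as in steps 7 and 8.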
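With the above scheme, when the distributed database performs replica access in master-master working mode and data inconsistency among multiple replicas is detected, this embodiment compares the most recent operation times of the data on the different replicas; retains, according to those times, the last update result of the data on each replica; and, based on those last update results, merges the data between the replicas so that they remain consistent. Log replication thereby resolves the technical problem of data update conflicts in a distributed system caused by log discontinuity or multi-master access. In addition, saving the primary keys of deleted data in the preset primary key storage queue handles the merging of delete operations with add and modify operations when a data merge occurs, ensuring data consistency between replicas.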
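In addition, in another embodiment, the scheme of the embodiments of the present disclosure can also perform data recovery between replicas.
An example of the data recovery process between replicas is as follows:
Step 1. Replica 1 sends replica 2 a request to recover all data in DEL_PK_QUE;
Step 2. Replica 1 receives the responses sent by replica 2 for reading the data in DEL_PK_QUE;
Step 3. Replica 1 inserts the delete operation information returned in step 2 into its DEL_PK_QUE;
Step 4. Replica 1 sends replica 2 a request to recover all data, that is, to read all of that replica's data from replica 2;
Step 5. Replica 1 receives the several read-data responses sent by replica 2;
Step 6. Replica 1 inserts the data returned in step 5 into the database;
Step 7. Data recovery on replica 1 is complete, and replica 1 asks replica 2 for any logs generated during the recovery;
Step 8. Replica 1 receives the several read-log responses sent by replica 2;
Step 9. Replica 1 applies the logs returned in step 8 on the basis of the delete operation information stored in DEL_PK_QUE (for applying operation logs, see FIG. 1, FIG. 2, and FIG. 3);
Step 10. Replica 1 and replica 2 enter the normal log replication and application flow.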
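The recovery sequence above (copy the delete information, copy the data, then apply the logs produced in the meantime) can be sketched in one function. Everything here is a simplified assumption: the peer's state is passed in as plain dictionaries rather than fetched over the network, logs are `(op, key, value, op_time)` tuples, and modify logs are resolved by simply keeping the newer operation time instead of the full conflict handling.

```python
def recover_from_peer(peer_del_pk_que, peer_data, logs_during_recovery):
    """Rebuild a replica from a peer: tombstones first, then data, then late logs."""
    del_pk_que = dict(peer_del_pk_que)   # steps 1 to 3: copy the delete information
    data = dict(peer_data)               # steps 4 to 6: copy all records
    for op, key, value, op_time in logs_during_recovery:   # steps 7 to 9
        current = data.get(key)
        if op == "delete":
            if current is not None and op_time > current[1]:
                del data[key]
                del_pk_que[key] = op_time
        elif op == "add":
            deleted_at = del_pk_que.get(key)
            not_deleted_later = deleted_at is None or op_time > deleted_at
            not_stale = current is None or op_time > current[1]
            if not_deleted_later and not_stale:
                data[key] = (value, op_time)
        elif op == "modify":
            if current is None or op_time > current[1]:
                data[key] = (value, op_time)
    return data, del_pk_que
```

Copying the delete information before the data means that every log applied afterwards can be checked against deletions the recovering replica never witnessed itself.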
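In addition, in another embodiment, the scheme of the embodiments of the present disclosure can also perform data merging based on the delete information in DEL_PK_QUE.
An example of data merging between replicas based on the delete information in DEL_PK_QUE is as follows:
Step 1. Replica 1 sends replica 2 a request to read all data;
Step 2. Replica 1 receives the several read-data responses sent by replica 2;
Step 3. For each data record returned in step 2, the following merge processing is performed:
Step 3.1. If the data exists on replica 1, the operation times of the data on the two replicas are compared and the data with the latest operation time is chosen as the record content;
Step 3.2. If the data does not exist on replica 1, DEL_PK_QUE is queried; if delete information exists and the delete operation time is greater than the operation time of the data on replica 2, the data is not inserted; in all other cases it is inserted;
Step 3.3. An operation log is generated based on the results of steps 3.1 and 3.2;
Step 4. Once all data from replica 2 has been merged, replica 1 asks replica 2 for the logs generated during the merge;
Step 5. Replica 1 receives the several read-log responses sent by replica 2;
Step 6. Replica 1 applies the returned logs on the basis of the delete operation information stored in DEL_PK_QUE;
Step 7. Replica 1 and replica 2 enter the normal log replication and application flow.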
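Step 3 of this merge, the per-record decision, can be sketched as follows. The function name, the `(key, value, op_time)` tuple format for records read from the peer, and the shape of the generated log entries are assumptions made for illustration.

```python
def merge_from_peer(data, del_pk_que, peer_records):
    """Merge records read from a peer into the local replica; return generated log entries."""
    log = []
    for key, value, op_time in peer_records:
        local = data.get(key)
        if local is not None:
            # Step 3.1: the record exists locally, keep the latest operation time.
            if op_time > local[1]:
                data[key] = (value, op_time)
                log.append(("modify", key, op_time))
        else:
            # Step 3.2: record missing locally, so check whether it was deleted later.
            deleted_at = del_pk_que.get(key)
            if deleted_at is not None and deleted_at > op_time:
                continue
            data[key] = (value, op_time)
            log.append(("add", key, op_time))
    return log  # step 3.3: log entries describing what actually changed
```

The delete information is what distinguishes "never seen here" from "deleted here after the peer's last update"; only the first case should bring the record back.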
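In addition, in another embodiment, the scheme of the embodiments of the present disclosure can also implement insertion into DEL_PK_QUE.
An example of the DEL_PK_QUE insertion process is as follows:
Step 1. Delete operation information, namely the primary key and the delete operation time, is to be inserted into DEL_PK_QUE;
Step 2. The insertion is processed as follows:
Step 2.1. If the buffer that stores primary keys and operation times has free space, space is allocated to store the primary key and the delete operation time;
Step 2.2. If the buffer that stores primary keys and operation times has no free space, the resources occupied by the queue element with the oldest delete operation time are released from DEL_PK_QUE, and space is then allocated to store the primary key and the delete operation time;
Step 2.3. If the delete operation information being inserted does not yet exist in DEL_PK_QUE, it is inserted directly;
Step 2.4. If the delete operation information being inserted already exists in DEL_PK_QUE, the delete operation time of the corresponding primary key is updated.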
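A bounded DEL_PK_QUE with oldest-first eviction can be sketched as below. The class name, the entry-count `capacity` parameter (standing in for the size of the buffer), and the `lookup` method are assumptions for illustration. The sketch also checks for an existing key before evicting, so that updating an entry never pushes another one out; the text lists the space check first, so this ordering is a design choice of the sketch.

```python
class DelPkQueue:
    """Bounded store of (primary key -> delete operation time), evicting the oldest."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.entries = {}  # primary key -> delete operation time

    def insert(self, key, delete_time):
        if key in self.entries:
            # Entry already present: just refresh its delete operation time.
            self.entries[key] = delete_time
            return
        if len(self.entries) >= self.capacity:
            # No free space: release the entry with the oldest delete operation time.
            oldest = min(self.entries, key=self.entries.get)
            del self.entries[oldest]
        self.entries[key] = delete_time

    def lookup(self, key):
        """Return the stored delete operation time, or None if there is none."""
        return self.entries.get(key)
```

Eviction here scans all entries, which is linear in the queue size; a heap or an ordered structure keyed by delete time would be a natural replacement if the queue is large.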
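As shown in FIG. 7, a seventh embodiment of the present disclosure provides a data merge implementation method. Based on the embodiment shown in FIG. 6, the method further comprises:
Step S108: if the cache space for storing primary keys and delete operation times is detected to be used up, release the space occupied by the queue element with the oldest delete operation time in the primary key storage queue.
As described above, DEL_PK_QUE stores the primary key and the delete operation time of data that has been deleted because of a delete request or because a delete operation log was applied.
In this embodiment, when the cache space for storing primary keys and operation times is used up, the space occupied by the queue element with the oldest delete operation time in DEL_PK_QUE is released.
With the above scheme, when the distributed database performs replica access in master-master working mode and data inconsistency among multiple replicas is detected, this embodiment compares the most recent operation times of the data on the different replicas; retains, according to those times, the last update result of the data on each replica; and, based on those last update results, merges the data between the replicas so that they remain consistent. Log replication thereby resolves the technical problem of data update conflicts in a distributed system caused by log discontinuity or multi-master access. In addition, saving the primary keys of deleted data in the preset primary key storage queue handles the merging of delete operations with add and modify operations when a data merge occurs, ensuring data consistency between replicas. Furthermore, when the cache space for storing primary keys and operation times is used up, the space occupied by the queue element with the oldest delete operation time in DEL_PK_QUE is released, which saves storage resources.
It should be noted that the above embodiments may be combined with one another as circumstances require, which is not described further here.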
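In addition, an embodiment of the present disclosure further provides a data merge implementation apparatus, comprising: a comparison module, configured to compare the most recent operation times of the data on different replicas when the distributed database performs replica access in master-master working mode and data inconsistency among multiple replicas is detected; a saving module, configured to retain, according to those times, the last update result of the data on each replica; and a merge operation module, configured to perform, based on the last update result of the data on each replica, mutual merge operations on the data between the replicas so that the data between the replicas remains consistent.
In one embodiment, the merge operation module is further configured to, upon receiving a delete request or upon applying a delete operation log from another replica so as to delete data on the current replica, save the primary key of the deleted data on the current replica and the delete operation time in the preset primary key storage queue.
In one embodiment, the merge operation module is further configured to, upon receiving an add-data request to add new data on a replica, use the primary key of the new data to delete the corresponding queue element from the primary key storage queue.
In one embodiment, the merge operation module is further configured to, when a replica applies an add operation log, query the primary key storage queue using the primary key of the data to be added; if the queue holds no delete operation record for that data, insert the data into the current replica; otherwise, compare the add operation time carried in the applied add operation log with the delete operation time saved in the queue, and insert the data if the former is greater than the latter, or not insert it if the former is not greater than the latter.
In one embodiment, the merge operation module is further configured to, when a replica applies a modify operation log, compare the operation time of the data carried in the applied modify operation log with the most recent operation time of the data being modified;
if the most recent operation time of the data being modified is not equal to the operation time of the data carried in the modify operation log, a data merge operation is initiated for that record.
In one embodiment, the merge operation module is further configured to, if the cache space for storing primary keys and delete operation times is detected to be used up, release the space occupied by the queue element with the oldest delete operation time in the primary key storage queue.
For the principle by which this embodiment merges data between replicas, refer to the embodiments above; it is not repeated here.
When the distributed database performs replica access in master-master working mode and data inconsistency among multiple replicas is detected, this embodiment compares the most recent operation times of the data on the different replicas; retains, according to those times, the last update result of the data on each replica; and, based on those last update results, merges the data between the replicas so that they remain consistent. Log replication thereby resolves the technical problem of data update conflicts in a distributed system caused by log discontinuity or multi-master access. In addition, saving the primary keys of deleted data in the preset primary key storage queue handles the merging of delete operations with add and modify operations when a data merge occurs, ensuring data consistency between replicas.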
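In addition, an embodiment of the present disclosure further provides a data merge implementation system, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the data merge implementation method described above.
In one embodiment, as shown in FIG. 8, the system of this embodiment may include a processor 1001 (for example a CPU), a network interface 1004, a user interface 1003, a memory 1005, and a communication bus 1002. The communication bus 1002 implements connection and communication between these components. The user interface 1003 may include a display and an input unit such as a keyboard, and may optionally also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a Wi-Fi interface). The memory 1005 may be high-speed RAM or non-volatile memory such as disk storage. Optionally, the memory 1005 may also be a storage device independent of the processor 1001.
Those skilled in the art will understand that the system structure shown in FIG. 8 does not limit the platform, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
As shown in FIG. 8, the memory 1005, as a computer storage medium, may include an operating system, a network communication module, a user interface module, and a data merge implementation program.
In the system shown in FIG. 8, the network interface 1004 is mainly used to connect to a network server and exchange data with it; the user interface 1003 is mainly used to connect to a client and exchange data with it; and the processor 1001 may be used to call the data merge implementation program stored in the memory 1005 and perform the following operations: when the distributed database performs replica access in master-master working mode and data inconsistency among multiple replicas is detected, compare the most recent operation times of the data on the different replicas; retain, according to those times, the last update result of the data on each replica; and, based on the last update result of the data on each replica, perform mutual merge operations on the data between the replicas so that the data between the replicas remains consistent.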
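In one embodiment, the processor 1001 may be used to call the data merge implementation program stored in the memory 1005 and further perform the following operation: upon receiving a delete request, or upon applying a delete operation log from another replica, so as to delete data on the current replica, save the primary key of the deleted data on the current replica and the delete operation time in the preset primary key storage queue.
In one embodiment, the processor 1001 may be used to call the data merge implementation program stored in the memory 1005 and further perform the following operation: upon receiving an add-data request to add new data on a replica, use the primary key of the new data to delete the corresponding queue element from the primary key storage queue.
In one embodiment, the processor 1001 may be used to call the data merge implementation program stored in the memory 1005 and further perform the following operations: when a replica applies an add operation log, query the primary key storage queue using the primary key of the data to be added; if the queue holds no delete operation record for that data, insert the data into the current replica; otherwise, compare the add operation time carried in the applied add operation log with the delete operation time saved in the queue, and insert the data if the former is greater than the latter, or do not insert it if the former is not greater than the latter.
In one embodiment, the processor 1001 may be used to call the data merge implementation program stored in the memory 1005 and further perform the following operations: when a replica applies a modify operation log, compare the operation time of the data carried in the applied modify operation log with the most recent operation time of the data being modified; if the two are not equal, initiate a data merge operation for that record.
In one embodiment, the processor 1001 may be used to call the data merge implementation program stored in the memory 1005 and further perform the following operation: perform recovery processing on data between replicas, or perform insertion processing of information into the primary key storage queue on a replica.
In one embodiment, the processor 1001 may be used to call the data merge implementation program stored in the memory 1005 and further perform the following operation: if the cache space for storing primary keys and delete operation times is detected to be used up, release the space occupied by the queue element with the oldest delete operation time in the primary key storage queue.
For the principle by which this embodiment merges data between replicas, refer to the embodiments above; it is not repeated here.
In addition, an embodiment of the present disclosure further provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of the data merge implementation method described above.
For the principle by which this embodiment merges data between replicas, refer to the embodiments above; it is not repeated here.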
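Compared with some cases, the data merge implementation method, apparatus, system, and storage medium proposed in the embodiments of the present disclosure operate as follows: when the distributed database performs replica access in master-master working mode and data inconsistency among multiple replicas is detected, the most recent operation times of the data on the different replicas are compared; according to those times, the last update result of the data on each replica is retained; and, based on those last update results, the data is merged between the replicas so that it remains consistent. Log replication thereby resolves the technical problem of data update conflicts in a distributed system caused by log discontinuity or multi-master access. In addition, saving the primary keys of deleted data in the preset primary key storage queue handles the merging of delete operations with add and modify operations when a data merge occurs, ensuring data consistency between replicas.
The present disclosure resolves the technical problem of data update conflicts in a distributed system caused by log discontinuity or multi-master access. In addition, saving the primary keys of deleted data in the preset primary key storage queue handles the merging of delete operations with add and modify operations when a data merge occurs.
The above are only preferred embodiments of the present disclosure and do not thereby limit its patent scope. Any equivalent structural or procedural transformation made using the contents of the specification and drawings of the present disclosure, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present disclosure.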

Claims (13)

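  1. A data merge implementation method, wherein the method is applied to a distributed database system and comprises:
    when the distributed database performs replica access in master-master working mode and data inconsistency among multiple replicas is detected, comparing the most recent operation times of the data on the different replicas;
    retaining, according to the most recent operation times of the data on the different replicas, the last update result of the data on each replica;
    performing, based on the last update result of the data on each replica, mutual merge operations on the data between the replicas so that the data between the replicas remains consistent.
  2. The data merge implementation method according to claim 1, wherein, before the step of comparing the most recent operation times of the data on the different replicas when the distributed database performs replica access in master-master working mode and data inconsistency among multiple replicas is detected, the method further comprises:
    when the distributed database performs replica access in master-master working mode, detecting whether a replica's own log buffer has overflowed so that other replicas cannot read a continuous log;
    if so, detecting data inconsistency among multiple replicas.
  3. The data merge implementation method according to claim 1, wherein, before the step of comparing the most recent operation times of the data on the different replicas when the distributed database performs replica access in master-master working mode and data inconsistency among multiple replicas is detected, the method further comprises:
    when the distributed database performs replica access in master-master working mode, if the same data is modified concurrently on multiple replicas, then when logs are applied between replicas, checking whether the most recent operation time of the modified data matches the time carried in the log, and if not, detecting data inconsistency among multiple replicas.
  4. The data merge implementation method according to claim 1, wherein the step of performing, based on the last update result of the data on each replica, mutual merge operations on the data between the replicas so that the data between the replicas remains consistent comprises:
    if data updates have occurred on multiple replicas, initiating a two-way data merge.
  5. The data merge implementation method according to claim 1, wherein the method further comprises:
    upon receiving a delete request, or upon applying a delete operation log from another replica, so as to delete data on the current replica, saving the primary key of the deleted data on the current replica and the delete operation time in a preset primary key storage queue.
  6. The data merge implementation method according to claim 5, wherein the method further comprises:
    upon receiving an add-data request to add new data on a replica, using the primary key of the new data to delete the corresponding queue element from the primary key storage queue.
  7. The data merge implementation method according to claim 6, wherein the method further comprises:
    when a replica applies an add operation log, querying the primary key storage queue using the primary key of the data to be added;
    if the primary key storage queue holds no delete operation record for the data to be added, inserting the data to be added into the current replica;
    if the primary key storage queue holds a delete operation record for the data to be added, comparing the add operation time carried in the applied add operation log with the delete operation time saved in the primary key storage queue, and inserting the data to be added if the former is greater than the latter, or not inserting it if the former is not greater than the latter.
  8. The data merge implementation method according to any one of claims 5 to 7, wherein the method further comprises:
    when a replica applies a modify operation log, comparing the operation time of the data carried in the applied modify operation log with the most recent operation time of the data being modified;
    if the most recent operation time of the data being modified is not equal to the operation time of the data carried in the modify operation log, initiating a data merge operation for that record.
  9. The data merge implementation method according to claim 8, wherein the method further comprises:
    performing recovery processing on data between replicas, or performing insertion processing of information into the primary key storage queue on a replica.
  10. The data merge implementation method according to claim 9, wherein the method further comprises:
    if the cache space for storing primary keys and delete operation times is detected to be used up, releasing the space occupied by the queue element with the oldest delete operation time in the primary key storage queue.
  11. A data merge implementation apparatus, comprising:
    a comparison module, configured to compare the most recent operation times of the data on different replicas when the distributed database performs replica access in master-master working mode and data inconsistency among multiple replicas is detected;
    a saving module, configured to retain, according to the most recent operation times of the data on the different replicas, the last update result of the data on each replica;
    a merge operation module, configured to perform, based on the last update result of the data on each replica, mutual merge operations on the data between the replicas so that the data between the replicas remains consistent.
  12. A data merge implementation system, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the data merge implementation method according to any one of claims 1 to 10.
  13. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the data merge implementation method according to any one of claims 1 to 10.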
PCT/CN2019/124491 2018-12-11 2019-12-11 Data merge implementation method, apparatus, system, and storage medium WO2020119709A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811510709.0A CN111309799A (zh) 2018-12-11 2018-12-11 Data merge implementation method, apparatus, system, and storage medium
CN201811510709.0 2018-12-11

Publications (1)

Publication Number Publication Date
WO2020119709A1 true WO2020119709A1 (zh) 2020-06-18

Family

ID=71075844

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/124491 WO2020119709A1 (zh) 2018-12-11 2019-12-11 Data merge implementation method, apparatus, system, and storage medium

Country Status (2)

Country Link
CN (1) CN111309799A (zh)
WO (1) WO2020119709A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11743722B2 (en) 2019-04-29 2023-08-29 Telefonaktiebolaget Lm Ericsson (Publ) Handling of multiple authentication procedures in 5G
US11922026B2 (en) 2022-02-16 2024-03-05 T-Mobile Usa, Inc. Preventing data loss in a filesystem by creating duplicates of data in parallel, such as charging data in a wireless telecommunications network

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117149097B (zh) * 2023-10-31 2024-02-06 苏州元脑智能科技有限公司 Data access control method and apparatus for a distributed storage system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140324785A1 (en) * 2013-04-30 2014-10-30 Amazon Technologies, Inc. Efficient read replicas
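CN105447046A (zh) * 2014-09-02 2016-03-30 阿里巴巴集团控股有限公司 Data consistency processing method, apparatus, and system for a distributed system
CN107357920A (zh) * 2017-07-21 2017-11-17 北京奇艺世纪科技有限公司 Incremental multi-replica data synchronization method and system
CN108228678A (zh) * 2016-12-22 2018-06-29 华为技术有限公司 Multi-replica data recovery method and apparatus
CN108897822A (zh) * 2018-06-21 2018-11-27 郑州云海信息技术有限公司 Data update method, apparatus, device, and readable storage medium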

Also Published As

Publication number Publication date
CN111309799A (zh) 2020-06-19

Similar Documents

Publication Publication Date Title
US11604597B2 (en) Data processing method and apparatus
US10140185B1 (en) Epoch based snapshot summary
US11442961B2 (en) Active transaction list synchronization method and apparatus
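WO2020119709A1 (zh) Data merge implementation method, apparatus, system, and storage medium
US10657150B2 (en) Secure deletion operations in a wide area network
KR20180021679A (ko) Backup and restore in a distributed database using consistent database snapshots
US20150121130A1 (en) Data storage method, data storage apparatus, and storage device
WO2020093501A1 (zh) File storage method, deletion method, server, and storage medium
WO2020025049A1 (zh) Data synchronization method and apparatus, database host, and storage medium
CN111125021B (zh) Method and system for efficiently restoring a consistent view of a file system image from an asynchronous remote system
US20240028598A1 (en) Transaction Processing Method, Distributed Database System, Cluster, and Medium
CN112307119A (zh) Data synchronization method, apparatus, device, and storage medium
WO2023197404A1 (zh) Object storage method and apparatus based on a distributed database
WO2022048416A1 (zh) Operation request processing method, apparatus, device, readable storage medium, and system
CN112334891B (zh) Centralized storage for search servers
US9563521B2 (en) Data transfers between cluster instances with delayed log file flush
WO2022135471A1 (zh) Multi-version concurrency control and log clearing method, node, device, and medium
WO2023071043A1 (zh) File aggregation compatibility method and apparatus, computer device, and storage medium
JP7450735B2 (ja) Reducing requests using probabilistic data structures
CN111240891A (zh) Data recovery method and apparatus based on data consistency among multiple database tables
WO2020107352A1 (zh) Log sequence number generation method and apparatus, and readable storage medium
KR20190096837A (ko) Parallel journaling method using a conflict page list, and apparatus therefor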
US11755425B1 (en) Methods and systems for synchronous distributed data backup and metadata aggregation
US20200342065A1 (en) Replicating user created snapshots
US10706012B2 (en) File creation

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 19895123; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 19895123; Country of ref document: EP; Kind code of ref document: A1)
32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established (Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 22.10.2021))