CN107026880A - Method of data synchronization and device - Google Patents

Method of data synchronization and device

Info

Publication number
CN107026880A
Authority
CN
China
Prior art keywords
transaction journal
transaction
storage
snapshot
repeater
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610069972.5A
Other languages
Chinese (zh)
Inventor
杨军
吴元清
章孜
杨中锋
周谦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd and Beijing Jingdong Shangke Information Technology Co Ltd
Priority to CN201610069972.5A
Publication of CN107026880A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1095 Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a data synchronization method and device. The data synchronization method includes: requesting a transaction log from a relay and, if the requested transaction log is not present in the relay, requesting the transaction log from a snapshot server; receiving the transaction log from the relay or the snapshot server; and applying the transaction log to synchronize data.

Description

Method of data synchronization and device
Technical field
The present invention relates to the field of computer networks, and in particular to a data synchronization method and device.
Background
With the spread of the Internet, and because round-the-clock (7*24) service is an inherent attribute of Internet applications, website availability is increasingly important to large Internet companies. Even a few minutes of downtime can cause heavy losses for an Internet company and its users. The factors that cause website downtime mainly include servers, network equipment, data centers, ISP network links, and force majeure such as weather and natural disasters. Redundancy is the main way to improve website availability. Large Internet companies usually deploy important website systems across multiple servers, multiple data centers, multiple cities, or even worldwide. For data systems, in a reasonably stable and reliable network environment, data redundancy is generally achieved with the replication technology provided by the database itself. When the network environment becomes unstable, the replication technology provided by the database itself is often not robust enough and cannot provide sufficiently low replication delay. Therefore, in cross-data-center, cross-region network environments, providing a low-latency, highly available database synchronization system becomes the key to improving availability.
One prior art is MySQL Replication, i.e. MySQL master-slave replication. MySQL replication is a master-slave synchronization technology based on transaction log transmission (log shipping). When a transaction is written on the master server, the master server first writes the transaction information to the transaction log (Write Ahead Log), then performs the data changes contained in the transaction and modifies the data files. After the slave server executes the operation that designates the master server, the slave server runs an IO thread, connects to the master server, and requests transaction log transmission from the master server. The IO thread on the slave server writes the received transaction logs to a local transaction log file (relay log). The slave server simultaneously runs an SQL thread, which reads the local transaction log file, executes the data changes contained in the transactions one by one, and finally writes them to the slave server's data files.
Another prior art is Oracle Golden Gate, a replication technology that supports multiple types of databases such as Oracle and MySQL. Its principle is similar to that of MySQL replication in that it is also based on transaction log transmission; the main difference is that Oracle Golden Gate runs as a third-party component and is deployed independently of the database system.
However, both technologies have obvious shortcomings. MySQL master-slave replication is widely used in day-to-day database operations, mainly because it is provided directly by MySQL itself, requires no third-party components, and is relatively simple to use. However, MySQL replication has long been criticized for its replication performance; only with MySQL 5.6 did MySQL begin to optimize it. In addition, because MySQL replication is provided by MySQL itself, peripheral systems cannot intervene, so advanced functions such as heterogeneous replication, bidirectional replication, and message distribution cannot be implemented. Oracle Golden Gate, thanks to its good support for Oracle, is widely used in traditional industries such as telecommunications and banking. However, its nature as a commercial product limits its extensibility, and it cannot adapt to the complex and changing usage scenarios of Internet companies.
Summary of the invention
In view of this, the present invention provides a low-latency, highly available data synchronization method and device. It improves replication performance by a factor of 5 to 10 compared with MySQL replication. In addition, the invention is designed to provide the architectural flexibility to distribute data changes through transaction log transmission.
According to a first aspect of embodiments of the present invention, a data synchronization method is provided, including: requesting a transaction log from a relay (Relay) and, if the transaction log cannot be received from the relay, requesting the transaction log from a snapshot server; receiving the transaction log from the relay or the snapshot (Snapshot) server; and applying the transaction log to synchronize data.
According to a second aspect of embodiments of the present invention, a data synchronization device is also provided, including: a requesting unit, configured to request a transaction log from a relay and, if the transaction log cannot be received from the relay, to request the transaction log from a snapshot server; a receiving unit, configured to receive the transaction log from the relay or the snapshot server; a transaction log applying unit, configured to apply the transaction log to synchronize data; and an applying unit, configured to receive and apply the transaction log to synchronize data.
Brief description of the drawings
Fig. 1 is a schematic diagram showing a data synchronization center according to an embodiment of the present invention.
Fig. 2 is a flowchart showing a data synchronization method according to an embodiment of the present invention.
Fig. 3 is a flowchart showing a method for a snapshot server according to an embodiment of the present invention.
Fig. 4 is a block diagram showing a data synchronization device according to an embodiment of the present invention.
Detailed description
The basic principle of the present invention is the same as that of MySQL replication: the transaction logs of the master database are transferred to the slave database and then applied to the slave database one by one. The present invention improves the apply algorithm on the slave database side through a high degree of parallelization, thereby improving replication performance.
Fig. 1 shows a schematic diagram of a data synchronization center according to an embodiment of the present invention. The data synchronization center consists mainly of three components: the relay (Relay), the snapshot (Snapshot), and the replicator (Replicator). The core workflow of the whole system is as follows:
(1) The Relay extracts transaction logs from the data source and places them in a ring buffer (Ring Buffer), which serves as an in-memory log queue;
(2) The Replicator establishes a connection with the Relay and requests online changes (Online Transfer), that is, a segment of incremental transaction logs;
(3) The Relay takes the corresponding transaction log block out of the Ring Buffer and sends it to the Replicator;
(4) After receiving the transaction logs, the Replicator establishes a connection with the target database and applies the transaction logs;
(5) If the Relay does not find the corresponding transaction log block in step (3), it sends a not-found notification (for example, 404) to the Replicator;
(6) After receiving the not-found notification (404), the Replicator establishes a connection with the Snapshot and requests a snapshot delta (Snapshot Delta);
(7) The Snapshot takes the corresponding snapshot delta out of its storage and sends it to the Replicator;
(8) After receiving the snapshot delta, the Replicator establishes a connection with the target database and applies the transaction logs;
(9) During steps (2) to (8), the Snapshot, as a special consumer of the Relay, continuously subscribes to all online changes from the Relay and writes them to its storage, as a fallback for when the Relay's data has expired.
The main components of the present invention, namely the relay (Relay), the replicator (Replicator), and the snapshot (Snapshot), are described below.
1. Relay
The Relay extracts transaction logs from the source database and provides a log subscription service to the Replicator; its role is equivalent to the MySQL Slave IO Thread. By design, the Relay connects to the source database in single-threaded mode and uses a ring buffer (Ring Buffer) as the data structure for transaction log storage, which ensures read performance. In addition, for databases with heavy write volume, the Relay can also use memory-mapped files as transaction log storage, ensuring that it can absorb a sufficiently high write volume.
The Relay workflow is as follows:
1.1 The Event Producer thread establishes a connection to the data source, initiates a transaction log extraction request, and continuously pulls transaction logs;
1.2 The Event Producer parses the pulled transaction logs and serializes them into Avro format according to a predefined Avro schema;
1.3 The Event Producer adds the Avro-format transaction logs to the Ring Buffer;
1.4 When the specified checkpoint interval is reached, the Event Producer writes the largest transaction log sequence number written to the Ring Buffer so far to the checkpoint.
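The ring buffer and checkpoint steps of the Relay workflow can be illustrated with a minimal Python sketch. It is not taken from the patent: the class name `RelayRingBuffer`, its method names, and the choice to signal the "not found" (404) case by returning `None` are assumptions made for illustration, and Avro serialization is replaced by opaque byte payloads. It also assumes that sequence numbers are appended in ascending order.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple

# Hypothetical event shape: (transaction log sequence number, serialized payload).
Event = Tuple[int, bytes]


class RelayRingBuffer:
    """Fixed-capacity in-memory log queue; the oldest events are overwritten."""

    def __init__(self, capacity: int) -> None:
        self._events: Deque[Event] = deque(maxlen=capacity)
        self.checkpoint: Optional[int] = None

    def append(self, seq: int, payload: bytes) -> None:
        # Step 1.3: add a serialized transaction log to the buffer.
        self._events.append((seq, payload))

    def save_checkpoint(self) -> None:
        # Step 1.4: record the largest sequence number written so far.
        if self._events:
            self.checkpoint = self._events[-1][0]

    def read_from(self, start_seq: int, limit: int = 100) -> Optional[List[Event]]:
        """Return up to `limit` events with seq >= start_seq.

        Returns None when start_seq is older than the oldest buffered event,
        which stands in for the "not found" (404) notification.
        """
        if self._events and start_seq < self._events[0][0]:
            return None
        return [e for e in self._events if e[0] >= start_seq][:limit]
```

A consumer that asks for a sequence number that has already been overwritten gets `None` and would then turn to the Snapshot; a consumer that is fully caught up gets an empty list.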
2. Replicator
The Replicator is the consumer of transaction logs. It pulls transaction logs from the Relay or the Snapshot and applies them to the target database according to the configured consistency; its role is equivalent to the MySQL Slave SQL Thread.
Depending on the consistency configuration, the Replicator can flexibly select an apply algorithm with parallelism at the database, table, or row level; for large batches of continuous data writes, it uses batch commits. The Replicator also provides a unified transaction log consumption interface, which removes the coupling to the transaction log format of the data source.
The main workflow of the Replicator is as follows:
2.1 The Relay Puller thread establishes a connection with the Relay, subscribes to transaction logs, and writes the received transaction logs to the Ring Buffer;
2.2 The Dispatcher takes batches of transaction log blocks out of the Ring Buffer and, according to the consistency configuration, groups and merges the transaction logs, generates the apply statements (the statements that write the changes into the target database), and sends them to the Callback threads;
2.3 The Callback thread pool commits the grouped apply statements in parallel according to the consistency configuration;
2.4 When the Relay Puller thread receives a 404 from the Relay in step 2.1, the Snapshot Puller thread is started and sends a subscription request to the Snapshot; after transaction logs are received from the Snapshot, the remaining processing steps continue;
2.5 After each Callback thread executes successfully, it writes the sequence number of the successfully committed transaction to the checkpoint corresponding to that thread.
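The grouping, merging, parallel commit, and per-thread checkpoint steps of the Replicator workflow can be sketched as follows. This is a hedged illustration under stated assumptions, not the patent's algorithm: it assumes table-level parallelism, assumes each event carries a full row image so that only the latest change per primary key needs to be kept, and uses hypothetical names (`group_and_merge`, `apply_in_parallel`, `apply_group`). A real apply function would execute SQL against the target database.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

# Hypothetical event shape: (sequence number, table, primary key, row image).
Event = Tuple[int, str, str, dict]


def group_and_merge(events: List[Event]) -> Dict[str, List[Event]]:
    """Group events by table and keep only the latest change per primary key."""
    latest: Dict[str, Dict[str, Event]] = defaultdict(dict)
    for event in sorted(events, key=lambda e: e[0]):
        _, table, key, _ = event
        latest[table][key] = event  # a later change overwrites an earlier one
    return {
        table: sorted(rows.values(), key=lambda e: e[0])
        for table, rows in latest.items()
    }


def apply_in_parallel(
    groups: Dict[str, List[Event]],
    apply_group: Callable[[str, List[Event]], None],
    workers: int = 4,
) -> Dict[str, int]:
    """Apply each group on a worker thread; return one checkpoint per group."""

    def run(item: Tuple[str, List[Event]]) -> Tuple[str, int]:
        table, batch = item
        apply_group(table, batch)      # commit the whole batch for this table
        return table, batch[-1][0]     # highest committed sequence number

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(run, groups.items()))
```

Because each table is handled by exactly one worker, changes to the same row are still applied in sequence order, while unrelated tables proceed concurrently.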
3. Snapshot
The Snapshot is responsible for subscribing to all transaction logs from the Relay and writing them to persistent storage as a snapshot, while providing a batch log subscription service to the Replicator; its role is equivalent to the MySQL Slave Relay Log.
As described above, under normal circumstances the Replicator connects directly to the Relay and consumes the transaction logs in the Relay's in-memory queue. In some cases, however, factors such as network jitter or excessive load on the target database may cause the Replicator to fall far behind the Relay. In addition, when a new consumer joins the subscribers of the same data source, the new consumer faces a cold-start problem. To avoid taking a full snapshot from the data source again, the Snapshot acts as a special consumer of the Relay: using a high-throughput consumption mode, it continuously consumes online transaction logs from the Relay and, by processing them, maintains a consistent snapshot of the data source, that is, a snapshot containing the latest state of every row in the databases and tables of the data source, while also retaining transaction logs with a longer lookback time than the Relay buffer.
The Snapshot mainly includes the Snapshot incremental workflow and the Snapshot snapshot workflow.
The Snapshot incremental workflow is as follows:
3.1 When the Replicator sends a subscription request to the Snapshot, the Snapshot checks the initial transaction log sequence number in the request parameters; if it is within the range of the Snapshot's current transaction log store (Log Store), the snapshot incremental workflow starts;
3.2 The Snapshot Server reads a transaction log block from the transaction log store and sends it to the Replicator;
3.3 Step 3.2 is repeated until all logs in the transaction log store have been consumed;
3.4 At this point the Replicator has caught up with the Relay, so it reconnects to the Relay and continues to consume online changes.
The Snapshot snapshot workflow is as follows:
3.1 When the Replicator sends a subscription request to the Snapshot, the Snapshot checks the initial transaction log sequence number in the request parameters; if it is not within the range of the Snapshot's current transaction log store, the snapshot transfer process starts;
3.2 The Snapshot Server reads a transaction log block from the snapshot storage and sends it to the Replicator;
3.3 Step 3.2 is repeated until all logs in the snapshot storage have been consumed;
3.4 The Replicator continues to send requests to the Snapshot and enters the snapshot incremental workflow.
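The choice between the incremental workflow and the snapshot workflow can be sketched in Python as below. This is an illustrative model only: the `SnapshotServer` class, its in-memory `log_store` list and `snapshot` dictionary, and the `"delta"` / `"snapshot"` return labels are assumptions, whereas the patent describes persistent storage and block-by-block transfer.

```python
from typing import Dict, List, Tuple

# Hypothetical log entry shape: (sequence number, primary key, row image).
LogEntry = Tuple[int, str, dict]


class SnapshotServer:
    """Keeps a retained log store plus the latest state of every row."""

    def __init__(self) -> None:
        self.log_store: List[LogEntry] = []  # ascending sequence numbers
        self.snapshot: Dict[str, dict] = {}  # primary key -> latest row

    def consume(self, entry: LogEntry) -> None:
        # The Snapshot consumes every online change from the Relay.
        _, key, row = entry
        self.log_store.append(entry)
        self.snapshot[key] = row

    def truncate_before(self, seq: int) -> None:
        # Drop logs older than the retained lookback window.
        self.log_store = [e for e in self.log_store if e[0] >= seq]

    def subscribe(self, start_seq: int) -> Tuple[str, list]:
        # Incremental workflow: the requested sequence number is still retained.
        if self.log_store and start_seq >= self.log_store[0][0]:
            return "delta", [e for e in self.log_store if e[0] >= start_seq]
        # Snapshot workflow: fall back to the latest state of every row.
        return "snapshot", sorted(self.snapshot.items())
```

After a `"snapshot"` response, a consumer would issue a further request to pick up the retained logs, mirroring step 3.4 of the snapshot workflow.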
In addition, the present invention also provides a high-availability design, including:
4.1 The checkpoint (Checkpoint) mechanism guarantees the safe production and consumption of transaction logs throughout the system. To ensure high availability of checkpoints across machines and across data centers, we choose ZooKeeper as the checkpoint storage mechanism. ZooKeeper is deployed across data centers, which ensures that checkpoints are available across data centers.
4.2 The Relay, as a high-performance transaction log relay station, is extremely important for guaranteeing the end-to-end low latency of the whole system. To ensure high availability of the Relay, we choose ZooKeeper to implement an Active-Standby Failover mechanism for the Relay.
In addition, the present invention also provides a general API that makes it easy for third-party systems to subscribe to data changes on the master database.
Fig. 2 is a flowchart showing a data synchronization method 200 according to an embodiment of the present invention. The method 200 can be performed in the Replicator.
In step 201, a transaction log is requested from the relay. In step 202, it is determined whether the requested transaction log exists in the relay. If the requested transaction log exists in the relay, then in step 204 the transaction log is received from the relay. If the requested transaction log does not exist in the relay, then in step 203 the transaction log is requested from the snapshot server, and in step 204 the transaction log is received from the snapshot server. In step 205, the transaction log is applied to synchronize data.
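The control flow of method 200 is small enough to sketch directly. The following Python function is an illustration under assumptions: `request_relay`, `request_snapshot`, and `apply_logs` are hypothetical callables supplied by the caller, and a `None` return from the relay stands in for the not-found notification.

```python
from typing import Callable, List, Optional


def synchronize(
    start_seq: int,
    request_relay: Callable[[int], Optional[List[dict]]],
    request_snapshot: Callable[[int], List[dict]],
    apply_logs: Callable[[List[dict]], None],
) -> str:
    """Run one round of method 200 and report which source served the logs."""
    logs = request_relay(start_seq)          # steps 201 and 202
    source = "relay"
    if logs is None:                         # the relay no longer holds the logs
        logs = request_snapshot(start_seq)   # step 203
        source = "snapshot"
    apply_logs(logs)                         # steps 204 and 205
    return source
```

Passing the three operations in as callables keeps the fallback logic independent of how the relay, the snapshot server, and the target database are actually reached.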
Fig. 3 is a flowchart showing a method 300 for a snapshot server according to an embodiment of the present invention.
In step 301, the snapshot server receives a transaction log request from the Replicator. In step 302, it is determined whether the requested transaction log exists in the transaction log store. If the requested transaction log exists in the transaction log store, then in step 303 the transaction log is sent to the Replicator. If the requested transaction log does not exist in the transaction log store, then in step 304 the transaction log is obtained from the snapshot storage and sent to the Replicator.
Fig. 4 is a block diagram showing a data synchronization device 400 according to an embodiment of the present invention. The data synchronization device 400 includes: a requesting unit 401, configured to request a transaction log from the relay and, if the transaction log cannot be received from the relay, to request the transaction log from the snapshot server; a receiving unit 402, configured to receive the transaction log from the relay or the snapshot server; and an applying unit 403, configured to apply the transaction log to synchronize data.
In summary, through a distributed in-memory queue data structure and an optimized algorithm for parallel apply on the consumer side, the present invention can provide a low-latency, highly available database synchronization mechanism in cross-data-center, cross-region scenarios.

Claims (10)

1. A data synchronization method, including:
requesting a transaction log from a relay and, if the requested transaction log does not exist in the relay, requesting the transaction log from a snapshot server;
receiving the transaction log from the relay or the snapshot server; and
applying the transaction log to synchronize data.
2. The method according to claim 1, wherein the transaction log is stored in a relay cache, and if the requested transaction log is not in the relay cache, a not-found notification is received from the relay.
3. The method according to claim 1 or 2, wherein the transaction log is also stored in a transaction log store and a snapshot storage of the snapshot server, and if the requested transaction log is found in the transaction log store, the transaction log is received from the transaction log store, and if the requested transaction log is not found in the transaction log store, the transaction log is received from the snapshot storage.
4. The method according to claim 3, wherein the snapshot storage is non-volatile memory.
5. The method according to claim 1, further including:
storing the received transaction log in a replicator cache,
performing grouping, merging, and apply statement generation on the transaction log according to a consistency configuration,
committing the apply statements in parallel using callback threads according to the consistency configuration, and
after each callback thread executes successfully, writing the sequence number of the successfully committed transaction log to the corresponding checkpoint.
6. A data synchronization device, including:
a requesting unit, configured to request a transaction log from a relay and, if the transaction log cannot be received from the relay, to request the transaction log from a snapshot server;
a receiving unit, configured to receive the transaction log from the relay or the snapshot server; and
an applying unit, configured to apply the transaction log to synchronize data.
7. The data synchronization device according to claim 6, wherein the transaction log is stored in a relay cache, and
the receiving unit is further configured to receive a not-found notification from the relay if the requested transaction log is not in the relay cache.
8. The device according to claim 6 or 7, wherein the transaction log is also stored in a transaction log store and a persistent snapshot storage of the snapshot server, and if the requested transaction log is found in the transaction log store, the transaction log is received from the transaction log store, and if the requested transaction log is not found in the transaction log store, the transaction log is received from the snapshot storage.
9. The device according to claim 8, wherein the snapshot storage is non-volatile memory.
10. The device according to claim 6, wherein the applying unit is further configured to:
store the received transaction log in a replicator cache,
perform grouping, merging, and apply statement generation on the transaction log according to a consistency configuration,
commit the apply statements in parallel using callback threads according to the consistency configuration, and
after each callback thread executes successfully, write the sequence number of the successfully committed transaction log to the corresponding checkpoint.
CN201610069972.5A 2016-02-01 2016-02-01 Method of data synchronization and device Pending CN107026880A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610069972.5A CN107026880A (en) 2016-02-01 2016-02-01 Method of data synchronization and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610069972.5A CN107026880A (en) 2016-02-01 2016-02-01 Method of data synchronization and device

Publications (1)

Publication Number Publication Date
CN107026880A true CN107026880A (en) 2017-08-08

Family

ID=59524192

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610069972.5A Pending CN107026880A (en) 2016-02-01 2016-02-01 Method of data synchronization and device

Country Status (1)

Country Link
CN (1) CN107026880A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104657382A (en) * 2013-11-21 2015-05-27 阿里巴巴集团控股有限公司 Method and device for detecting consistency of data of MySQL master and slave servers
CN104809200A (en) * 2015-04-24 2015-07-29 联动优势科技有限公司 Database synchronization method and device

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110019062A (en) * 2017-08-14 2019-07-16 北京京东尚科信息技术有限公司 Method of data synchronization and system
CN109189748A (en) * 2018-08-20 2019-01-11 郑州云海信息技术有限公司 A kind of buffer consistency processing method and nfs server
CN109358817A (en) * 2018-10-26 2019-02-19 北京百度网讯科技有限公司 Methods, devices and systems for replicate data
CN109358817B (en) * 2018-10-26 2022-02-18 北京百度网讯科技有限公司 Method, device and system for copying data
CN111930692A (en) * 2020-05-28 2020-11-13 武汉达梦数据库有限公司 Transaction merging execution method and device based on log analysis synchronization
CN111930693A (en) * 2020-05-28 2020-11-13 武汉达梦数据库有限公司 Transaction merging execution method and device based on log analysis synchronization
CN111930692B (en) * 2020-05-28 2022-05-13 武汉达梦数据库股份有限公司 Transaction merging execution method and device based on log analysis synchronization
CN111930693B (en) * 2020-05-28 2024-02-06 武汉达梦数据库股份有限公司 Transaction merging execution method and device based on log analysis synchronization
CN113190528A (en) * 2021-04-21 2021-07-30 中国海洋大学 Parallel distributed big data architecture construction method and system
CN113190528B (en) * 2021-04-21 2022-12-06 中国海洋大学 Parallel distributed big data architecture construction method and system
CN114297291A (en) * 2021-12-09 2022-04-08 武汉达梦数据库股份有限公司 Transaction combination-based parallel execution method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170808
