CN103763368B - Method for data synchronization across data centers - Google Patents
Method for data synchronization across data centers
- Publication number
- CN103763368B (application CN201410023373.0A)
- Authority
- CN
- China
- Prior art keywords
- data center
- data
- log
- module
- primary
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
- G06F16/273—Asynchronous replication or reconciliation
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Computing Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Information Transfer Between Computers (AREA)
Abstract
The present invention provides a method for data synchronization across data centers. The method proceeds as follows: data is written and the operations are recorded in a log; synchronization is scheduled and the log is pushed; the log is replayed to complete data synchronization; and data is accessed across data centers, achieving asynchronous data synchronization. Compared with the prior art, the method achieves asynchronous data synchronization across data centers and improves data safety; it makes effective use of the I/O resources within each data center and the network resources between data centers; and it is practical and easy to adopt.
Description
Technical field
The present invention relates to the technical field of computer data storage, and specifically to a method for data synchronization across data centers.
Background
With the arrival of the Internet era, interactive websites aimed at ordinary Internet users, such as social networks, microblogs, and location-based services, are growing rapidly. Google, Facebook, Twitter, and, in China, Renren and microblogging services provide interactive services over the Internet and wireless networks to hundreds of millions of users. Internet users around the world interact in many different ways every day and generate data at every moment; the volume of this data is many times that of the stand-alone era.
To store this data, Internet companies have built large data centers around the world; a single data center hosts anywhere from hundreds to tens of thousands of machines. According to information from Google, the company operates dozens of data centers and more than ten million servers worldwide, storing the massive volume of data its global users generate every day. Managing and using this data poses major challenges, including reading and storing data, indexing and addressing interfaces, configuration and management, and data replication between data centers. Among these, the need for support for, and research on, data synchronization between multiple data centers is especially urgent.
Research on mass data storage is still in its infancy, and data synchronization between data centers leaves much room for study and improvement. Take HBase as an example: HBase replication relies on a master/slave architecture, and basic replication between two data centers was only added in version 0.90.0; replication tasks have no priority queue, and there is no unified scheduling based on data center load. In addition, traditional cross-data-center synchronization algorithms generally rely on transferring and overwriting whole blocks of data, which consumes substantial network and I/O resources.
To address this situation, a log-replay-based method for data synchronization across data centers is provided.
Summary of the invention
The technical task of the present invention is to remedy the deficiencies of the prior art by providing a method for data synchronization across data centers.
The technical solution of the present invention is implemented as follows. The method for data synchronization across data centers proceeds as follows:
Step 1, data writing and log recording: a log recording module runs in the primary data center. When the primary data center receives a data request from a client, this module records the operation required by the request as a log entry at the primary data center. The module is integrated into the business flow of the primary data center as an embedded component or a plug-in.
Step 2, synchronization scheduling and pushing: a scheduling module runs in the primary data center and is responsible for scheduling data replay operations. Based on information such as the load of the primary data center, the load of the backup data center, and the scheduling policy, it triggers log push and replay operations. The push operations requested by the scheduling module are carried out by a log push module, which runs in the primary data center and transmits the data operation logs to the backup data center.
Step 3, log replay and completion of data synchronization: the data operation logs pushed by the log push module of the primary data center are received by a log replay module, which runs in the backup data center and replays the data operation logs there, synchronizing the data of the two data centers.
Step 4, data access across data centers, achieving asynchronous data synchronization.
The detailed process of Step 1 is as follows. The client identifies, from its local configuration, the primary data center that holds the client's data and sends all data operations to it, where they are handled by the data nodes of the primary data center. After receiving a client request, the primary data center performs the client's operation according to the requested operation and its content. During this process, the log recording module captures the requested operation and its associated data by intercepting the request. The log recording module determines whether the operation needs to modify data in the data center. If so, the operation must be included in cross-data-center synchronization, and the module saves the requested operation and its associated data, in a proprietary log format, to the asynchronous log recording region of the primary data center. Everything in this region is content that must be replayed across data centers.
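The interception and filtering described in Step 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class names, the in-memory dictionary standing in for the data nodes, and the JSON entry layout with a `seq` field are all hypothetical, since the source only says the log uses a proprietary format.

```python
import json
from dataclasses import dataclass, field

# Assumption: only these operations modify data; reads are never logged.
MUTATING_OPS = {"write", "delete"}


@dataclass
class AsyncLogRegion:
    """In-memory stand-in for the asynchronous log recording region."""
    entries: list = field(default_factory=list)

    def append(self, encoded: str) -> None:
        self.entries.append(encoded)


class LogRecordingModule:
    """Captures intercepted requests and logs the ones that modify data."""

    def __init__(self, region: AsyncLogRegion):
        self.region = region
        self.next_seq = 0

    def intercept(self, op: str, key: str, value=None) -> bool:
        """Return True if the request was recorded for later replay."""
        if op not in MUTATING_OPS:
            return False
        # Hypothetical entry layout; the real format is proprietary.
        entry = {"seq": self.next_seq, "op": op, "key": key, "value": value}
        self.region.append(json.dumps(entry))
        self.next_seq += 1
        return True


class PrimaryDataCenter:
    """Toy primary: a dict plays the role of the data nodes."""

    def __init__(self):
        self.store = {}
        self.region = AsyncLogRegion()
        self.recorder = LogRecordingModule(self.region)

    def handle(self, op: str, key: str, value=None):
        # The recorder sees every request as part of the normal request flow.
        self.recorder.intercept(op, key, value)
        if op == "write":
            self.store[key] = value
            return None
        if op == "delete":
            self.store.pop(key, None)
            return None
        if op == "read":
            return self.store.get(key)
        raise ValueError(f"unknown operation: {op}")
```

In this sketch a read leaves the log region untouched, while each write or delete adds exactly one entry, which mirrors the rule that only data-modifying operations need to be replayed.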
The detailed process of Step 2 is as follows. The scheduling module running in the primary data center first monitors the following conditions:
1) the number of log entries in the asynchronous log recording region and the volume of data they involve;
2) the load of the primary data center, including network I/O and disk I/O;
3) the load of the backup data center, including network I/O and disk I/O.
When these three conditions satisfy the scheduling policy set by the configuration administrator, a log push operation is triggered. The push is performed by the log push module, which writes the data operation logs from the asynchronous log recording region of the primary data center to the asynchronous log execution region of the backup data center. When the log push module finishes transmitting the logs across data centers, it notifies the scheduling module, which then directs the log replay module in the backup data center to replay the logs.
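The hand-off between the scheduling, push, and replay modules in Step 2 can be sketched as below. The class and method names are hypothetical, plain Python lists stand in for the two log regions, and an in-process list copy stands in for the cross-data-center transfer; the policy check itself is reduced to a boolean argument.

```python
class LogPushModule:
    """Moves pending entries from the recording region to the execution region."""

    def __init__(self, recording_region: list, execution_region: list):
        self.recording_region = recording_region
        self.execution_region = execution_region

    def push(self) -> int:
        batch = list(self.recording_region)
        # Stand-in for the network transfer to the backup data center.
        self.execution_region.extend(batch)
        # Remove only what was transferred, so late arrivals are kept.
        del self.recording_region[:len(batch)]
        return len(batch)


class ReplayStub:
    """Placeholder for the log replay module; counts replay requests."""

    def __init__(self):
        self.notified = 0

    def replay(self) -> None:
        self.notified += 1


class SchedulingModule:
    """Triggers a push when the policy allows, then drives the replay."""

    def __init__(self, pusher: LogPushModule, replayer: ReplayStub):
        self.pusher = pusher
        self.replayer = replayer

    def run_once(self, policy_satisfied: bool) -> int:
        if not policy_satisfied:
            return 0
        pushed = self.pusher.push()
        if pushed:
            # The push module has reported completion; start the replay.
            self.replayer.replay()
        return pushed
```

A real deployment would call `run_once` periodically and compute `policy_satisfied` from the monitored conditions; both choices are assumptions of this sketch rather than details given in the source.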
The detailed process of Step 3 is as follows. After receiving the notification from the scheduling module, the log replay module running in the backup data center begins replaying the logs. It reads the data logs stored in the asynchronous log execution region and decodes them to obtain the operation and associated data for each entry. It then re-executes each operation on the relevant nodes of the backup data center, so that the data in the backup data center is consistent with the data in the primary data center, achieving cross-data-center synchronization.
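The decode-and-re-execute loop of Step 3 might look like the following sketch. The JSON entry layout (`seq`, `op`, `key`, `value`) is a hypothetical stand-in for the proprietary log format, a dictionary stands in for the backup's data nodes, and the sequence-number check that skips duplicate entries is an added assumption that the source does not describe.

```python
import json


class LogReplayModule:
    """Replays logged operations against the backup's store, in order."""

    def __init__(self, execution_region: list, store: dict):
        self.execution_region = execution_region
        self.store = store
        self.last_applied_seq = -1

    def replay(self) -> int:
        """Apply all pending entries; return how many were applied."""
        applied = 0
        while self.execution_region:
            entry = json.loads(self.execution_region.pop(0))
            if entry["seq"] <= self.last_applied_seq:
                # Assumption: a re-sent entry is skipped rather than re-applied.
                continue
            if entry["op"] == "write":
                self.store[entry["key"]] = entry["value"]
            elif entry["op"] == "delete":
                self.store.pop(entry["key"], None)
            else:
                raise ValueError(f"cannot replay operation: {entry['op']}")
            self.last_applied_seq = entry["seq"]
            applied += 1
        return applied
```

Because entries are applied in sequence order, the backup store ends in the same state as the primary for the logged operations, which is the consistency outcome the step describes.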
In Step 4, the client obtains data by accessing the backup data center in the following two situations: when the client cannot connect to the primary data center; and when the client can connect to the primary data center but the primary data center is under heavy load.
Compared with the prior art, the present invention provides the following benefits:
The method achieves asynchronous data synchronization across data centers and improves data safety. In addition to accessing the primary data center, users can also obtain data from the backup data center. Because replay requires transmitting only the changes to the data (the logged operations) rather than the data itself, the method reduces the volume of data transmitted and the bandwidth that synchronization consumes between data centers. In addition, the scheduling module can schedule according to data center load, making effective use of the I/O resources within each data center and the network resources between data centers and providing load balancing. The method is practical and easy to adopt.
Brief description of the drawings
Figure 1 is a schematic diagram of the implementation process of the present invention.
Detailed description
The method for data synchronization across data centers of the present invention is described in detail below with reference to the accompanying drawing.
As shown in Figure 1, the method achieves asynchronous data synchronization between data centers by replaying data operation logs. It is implemented as follows:
First, the following modules are implemented in software:
(1) Log recording module. Runs in the primary data center. When the primary data center receives a data request from a client, this module records the operation required by the request as a log entry at the primary data center. The module is integrated into the business flow of the primary data center as an embedded component or a plug-in.
(2) Scheduling module. Runs in the primary data center and is responsible for scheduling data replay operations. Based on information such as the load of the primary data center, the load of the backup data center, and the scheduling policy, it triggers log push and replay operations.
(3) Log push module. Runs in the primary data center and performs the push operations requested by the scheduling module, transmitting data operation logs to the backup data center.
(4) Log replay module. Runs in the backup data center. It receives the data operation logs pushed by the log push module of the primary data center and replays them in the backup data center, synchronizing the data of the two data centers.
These modules carry out the following operations:
Step 1: data writing and log recording.
Under normal conditions, the client identifies, from its local configuration, the primary data center that holds the client's data and sends all data operations, including reads, writes, and deletes, to the primary data center, where they are handled by its data nodes.
After receiving a client request, the primary data center performs the client's operation according to the requested operation and its content. During this process, the log recording module captures the requested operation and its associated data by intercepting the request.
The log recording module determines whether the client's operation needs to modify data in the data center. If so, the operation must be included in cross-data-center synchronization. The module then saves the requested operation and its associated data, in a proprietary log format, to the asynchronous log recording region of the primary data center. Everything in this region is content that must be replayed across data centers.
Step 2: synchronization scheduling and log pushing.
The scheduling module running in the primary data center monitors the following conditions:
1) The number of log entries in the asynchronous log recording region and the volume of data they involve.
2) The load of the primary data center, including network I/O and disk I/O.
3) The load of the backup data center, mainly including network I/O and disk I/O.
When these three conditions satisfy the scheduling policy set by the configuration administrator, a log push operation is triggered. Typically, the trigger requires the backlog in condition 1) to be high and the loads in conditions 2) and 3) to be low.
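The trigger rule above (a large backlog combined with low load on both sides) can be written as a small predicate. This is a sketch under stated assumptions: the threshold values, the normalization of load to a 0.0 to 1.0 utilization figure, and all names are hypothetical, because the source leaves the policy to the configuration administrator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Load:
    """Utilization of one data center, each value between 0.0 and 1.0."""
    network_io: float
    disk_io: float

    def peak(self) -> float:
        return max(self.network_io, self.disk_io)


@dataclass(frozen=True)
class SchedulingPolicy:
    """Hypothetical thresholds set by the configuration administrator."""
    min_pending_entries: int = 1000
    min_pending_bytes: int = 4 * 1024 * 1024
    max_primary_load: float = 0.6
    max_backup_load: float = 0.6


def should_push(pending_entries: int, pending_bytes: int,
                primary: Load, backup: Load,
                policy: SchedulingPolicy) -> bool:
    """Condition 1) must be high; conditions 2) and 3) must be low."""
    backlog_high = (pending_entries >= policy.min_pending_entries
                    or pending_bytes >= policy.min_pending_bytes)
    loads_low = (primary.peak() <= policy.max_primary_load
                 and backup.peak() <= policy.max_backup_load)
    return backlog_high and loads_low
```

Taking the peak of network and disk I/O is one simple way to treat a data center as busy when either resource is saturated; a weighted combination would be an equally plausible choice.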
The log push operation is performed by the log push module, which writes the data operation logs from the asynchronous log recording region of the primary data center to the asynchronous log execution region of the backup data center.
When the log push module finishes transmitting the logs across data centers, it notifies the scheduling module. The scheduling module then directs the log replay module in the backup data center to replay the logs.
Step 3: log replay.
After receiving the notification from the scheduling module, the log replay module running in the backup data center begins replaying the logs.
The log replay module reads the data logs stored in the asynchronous log execution region and decodes them to obtain the operation and associated data for each entry. It then re-executes each operation on the relevant nodes of the backup data center, so that the data in the backup data center is consistent with the data in the primary data center. This achieves cross-data-center synchronization of the data.
Step 4: data access across data centers.
The client may obtain data by accessing the backup data center in two situations:
First, when the client cannot connect to the primary data center. This may be because the primary data center has failed, or because the network between the primary data center and the client has been interrupted.
In this situation, the client attempts to read from the backup data center and can perform read operations only. In addition, because it is uncertain at this point whether synchronization between the backup and primary data centers has completed, the client displays a notice informing the user that the data comes from the backup data center and may be inconsistent with the primary.
Second, when the client can connect to the primary data center but the primary data center is under heavy load.
In this situation, if the client determines that its operation is read-only, it can ask the primary data center whether the data involved has been synchronized to the backup data center. If the primary data center confirms that synchronization is complete, the client can read the data from the backup data center. Here the backup data center provides load balancing.
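The two access situations can be summarized as a client-side routing decision. The function below is a sketch, not part of the source: its name, its boolean inputs, and the choice to raise an error for writes while the primary is unreachable are assumptions made for illustration.

```python
from enum import Enum


class Target(Enum):
    PRIMARY = "primary"
    BACKUP = "backup"


def route(op: str, primary_reachable: bool, primary_busy: bool,
          sync_confirmed: bool):
    """Return (target, stale_warning) for a client operation.

    stale_warning is True when the user should be told that the data comes
    from the backup data center and may be inconsistent with the primary.
    """
    is_read = op == "read"
    if not primary_reachable:
        # Situation 1: only reads can be served, with a consistency notice.
        if not is_read:
            raise ConnectionError("primary unreachable; backup serves reads only")
        return Target.BACKUP, True
    if primary_busy and is_read and sync_confirmed:
        # Situation 2: the primary confirmed the data is already synchronized.
        return Target.BACKUP, False
    return Target.PRIMARY, False
```

In the second situation `sync_confirmed` represents the primary's answer to the client's synchronization query; without that confirmation the sketch keeps the read on the primary so that it never returns data of unknown freshness.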
The method of the present invention achieves asynchronous data synchronization across data centers and improves data safety. In addition to accessing the primary data center, users can also obtain data from the backup data center. The method effectively reduces the volume of data generated by synchronization between data centers: because replay requires transmitting only the changes to the data rather than the data itself, less data is transmitted and synchronization consumes less bandwidth between data centers. Scheduling can follow data center load, making effective use of the I/O resources within each data center and the network resources between data centers and providing load balancing.
The foregoing is merely an embodiment of the present invention. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Claims (3)
1. A method for data synchronization across data centers, characterized in that it is implemented as follows:
Step 1, data writing and log recording: a log recording module runs in the primary data center; when the primary data center receives a data request from a client, this module records the operation required by the request as a log entry at the primary data center; the module is integrated into the business flow of the primary data center as an embedded component or a plug-in;
The detailed process of Step 1 is: the client identifies, from its local configuration, the primary data center that holds the client's data and sends all data operations to the primary data center, where they are handled by the data nodes of the primary data center; after receiving a client request, the primary data center performs the client's operation according to the requested operation and its content, and during this process the log recording module captures the requested operation and its associated data by intercepting the request; the log recording module determines whether the client's operation needs to modify data in the data center, and if so, the operation must be included in cross-data-center synchronization, whereupon the log recording module saves the requested operation and its associated data, in a proprietary log format, to the asynchronous log recording region of the primary data center, everything in this region being content that must be replayed across data centers;
Step 2, synchronization scheduling and pushing: a scheduling module runs in the primary data center and is responsible for scheduling data replay operations; based on the load of the primary data center, the load of the backup data center, and the scheduling policy, it triggers log push and replay operations; the push operations requested by the scheduling module are carried out by a log push module, which runs in the primary data center and transmits the data operation logs to the backup data center;
Step 3, log replay and completion of data synchronization: the data operation logs pushed by the log push module of the primary data center are received by a log replay module, which runs in the backup data center and replays the data operation logs there, synchronizing the data of the two data centers;
The detailed process of Step 3 is: after receiving the notification from the scheduling module, the log replay module running in the backup data center begins replaying the logs; the log replay module reads the data logs stored in the asynchronous log execution region, decodes them to obtain the operation and associated data for each entry, and then re-executes each operation on the relevant nodes of the backup data center, so that the data in the backup data center is consistent with the data in the primary data center, achieving cross-data-center synchronization of the data;
Step 4, data access across data centers, achieving asynchronous data synchronization.
2. The method for data synchronization across data centers according to claim 1, characterized in that the detailed process of Step 2 is: the scheduling module running in the primary data center first monitors the following conditions,
1) the number of log entries in the asynchronous log recording region and the volume of data they involve;
2) the load of the primary data center, including network I/O and disk I/O;
3) the load of the backup data center, including network I/O and disk I/O;
When these three conditions satisfy the scheduling policy set by the configuration administrator, a log push operation is triggered; the push is performed by the log push module, which writes the data operation logs from the asynchronous log recording region of the primary data center to the asynchronous log execution region of the backup data center; when the log push module finishes transmitting the logs across data centers, it notifies the scheduling module, which then directs the log replay module in the backup data center to replay the logs.
3. The method for data synchronization across data centers according to claim 2, characterized in that in Step 4 the client obtains data by accessing the backup data center in the following two situations: when the client cannot connect to the primary data center; and when the client can connect to the primary data center but the primary data center is under heavy load.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410023373.0A CN103763368B (en) | 2014-01-20 | 2014-01-20 | Method for data synchronization across data centers |
PCT/CN2015/070416 WO2015106656A1 (en) | 2014-01-20 | 2015-01-09 | Cross-data-center data synchronization method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410023373.0A CN103763368B (en) | 2014-01-20 | 2014-01-20 | Method for data synchronization across data centers |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103763368A (en) | 2014-04-30 |
CN103763368B (en) | 2016-07-06 |
Family
ID=50530527
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410023373.0A Active CN103763368B (en) | 2014-01-20 | 2014-01-20 | Method for data synchronization across data centers |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN103763368B (en) |
WO (1) | WO2015106656A1 (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103763368B (en) * | 2014-01-20 | 2016-07-06 | 浪潮电子信息产业股份有限公司 | Method for data synchronization across data centers |
CN104219288B (en) * | 2014-08-14 | 2018-03-23 | 中国南方电网有限责任公司超高压输电公司 | Distributed Data Synchronization method and its system based on multithreading |
CN104519130B (en) * | 2014-12-16 | 2018-02-27 | 北京中交兴路车联网科技有限公司 | A kind of data sharing caching method across IDC |
CN104899278B (en) * | 2015-05-29 | 2019-05-03 | 北京京东尚科信息技术有限公司 | A kind of generation method and device of Hbase database data operation log |
CN106557530B (en) * | 2015-09-30 | 2019-10-11 | 腾讯科技(深圳)有限公司 | Operation system, data recovery method and device |
CN105610917B (en) * | 2015-12-22 | 2019-12-20 | 腾讯科技(深圳)有限公司 | Method and system for realizing synchronous data repair in system |
CN110290214A (en) * | 2019-06-28 | 2019-09-27 | 苏州浪潮智能科技有限公司 | A kind of transmitting data file method and system |
CN110750594B (en) * | 2019-09-30 | 2023-05-30 | 上海视云网络科技有限公司 | Real-time cross-network database synchronization method based on mysql incremental log |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1677931A (en) * | 2004-04-02 | 2005-10-05 | 鸿富锦精密工业(深圳)有限公司 | Network daily-record data management system and method |
CN101043375A (en) * | 2007-03-15 | 2007-09-26 | 华为技术有限公司 | Distributed system journal collecting method and system |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8214329B2 (en) * | 2008-08-26 | 2012-07-03 | Zeewise, Inc. | Remote data collection systems and methods |
CN102075556B (en) * | 2009-11-19 | 2014-11-26 | 北京明朝万达科技有限公司 | Method for designing service architecture with large-scale loading capacity |
WO2012081050A1 (en) * | 2010-12-14 | 2012-06-21 | Hitachi, Ltd. | Failure recovery method in information processing system and information processing system |
CN103500229B (en) * | 2013-10-24 | 2017-04-19 | 北京奇虎科技有限公司 | Database synchronization method and database system |
CN103763368B (en) * | 2014-01-20 | 2016-07-06 | 浪潮电子信息产业股份有限公司 | Method for data synchronization across data centers |
- 2014-01-20: CN application CN201410023373.0A, patent CN103763368B (en), status Active
- 2015-01-09: WO application PCT/CN2015/070416, publication WO2015106656A1 (en), status Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN103763368A (en) | 2014-04-30 |
WO2015106656A1 (en) | 2015-07-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103763368B (en) | Method for data synchronization across data centers | |
US10956601B2 (en) | Fully managed account level blob data encryption in a distributed storage environment | |
US9213719B2 (en) | Peer-to-peer redundant file server system and methods | |
US10659225B2 (en) | Encrypting existing live unencrypted data using age-based garbage collection | |
CN105335513B (en) | A kind of distributed file system and file memory method | |
CN103780638B (en) | Method of data synchronization and system | |
CN106446159B (en) | A kind of method of storage file, the first virtual machine and name node | |
CN105187464B (en) | Method of data synchronization, apparatus and system in a kind of distributed memory system | |
CN103647797A (en) | Distributed file system and data access method thereof | |
CN104391930A (en) | Distributed file storage device and method | |
CN106303428A (en) | A kind of security protection cloud platform | |
CN103455577A (en) | Multi-backup nearby storage and reading method and system of cloud host mirror image file | |
US20130031221A1 (en) | Distributed data storage system and method | |
CN105828017B (en) | A kind of cloud storage access system and method towards video conference | |
CN104899161B (en) | A kind of caching method of the continuous data protection based on cloud storage environment | |
CN112416892A (en) | Emergency video data cloud storage system | |
CN104517067B (en) | Access the method, apparatus and system of data | |
CN102820998A (en) | Dual-fault-tolerant service system applicable to office applications and data storage method of dual-fault-tolerant service system | |
CN102541693A (en) | Multi-copy storage management method and system of data | |
CN116304390B (en) | Time sequence data processing method and device, storage medium and electronic equipment | |
CN105022779A (en) | Method for realizing HDFS file access by utilizing Filesystem API | |
CN106919574B (en) | Method for processing remote synchronous file in real time | |
CN106528667A (en) | Low-power-consumption mass data full-text retrieval system frame capable of carrying out read-write separation | |
CN104869056A (en) | Institution personnel data synchronization method based on relational data separation | |
CN109257403A (en) | Date storage method and equipment, distributed memory system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |