CN103763368B - A method for data synchronization across data centers - Google Patents

Publication number
CN103763368B
CN103763368B CN201410023373.0A CN201410023373A CN103763368B CN 103763368 B CN103763368 B CN 103763368B CN 201410023373 A CN201410023373 A CN 201410023373A CN 103763368 B CN103763368 B CN 103763368B
Authority
CN
China
Prior art keywords
data center
data
log
module
primary
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410023373.0A
Other languages
Chinese (zh)
Other versions
CN103763368A (en)
Inventor
王恩东
文中领
张立强
袁冬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Electronic Information Industry Co Ltd
Original Assignee
Inspur Electronic Information Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Electronic Information Industry Co Ltd
Priority to CN201410023373.0A
Publication of CN103763368A
Priority to PCT/CN2015/070416 (WO2015106656A1)
Application granted
Publication of CN103763368B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F16/273 Asynchronous replication or reconciliation

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The present invention provides a method for data synchronization across data centers. Its implementation process is: write the data and record the log; schedule and push the synchronization; replay the log to complete data synchronization; and perform data access across data centers, achieving asynchronous data synchronization. Compared with the prior art, this method achieves asynchronous data synchronization across data centers and improves data safety. It makes effective use of the I/O resources within each data center and the network resources between data centers, is practical, and is easy to adopt.

Description

A method for data synchronization across data centers
Technical field
The present invention relates to the technical field of computer data storage, and specifically to a method for data synchronization across data centers.
Background
With the arrival of the Internet era, interactive websites aimed at ordinary Internet users, such as social networks, microblogs, and location-based services, have surged. Sites such as Google, Facebook, and Twitter, and in China Renren and microblogging services, provide interactive services over the Internet and wireless networks to hundreds of millions of users. Internet users around the world interact in many ways every day and generate data constantly; the volume of this data is many times that of the stand-alone computing era.
To store this data, Internet companies have built huge data centers around the world; the number of hosts in a single data center ranges from hundreds to tens of thousands. According to information from Google, Google has dozens of data centers and more than ten million servers worldwide, storing the massive data generated by its global users every day. Managing and using this data is a huge challenge, covering reading and storing data, interfaces for indexing and addressing, configuration and management, and data replication between data centers. Among these, the need for support of, and research into, data synchronization between multiple data centers is particularly urgent.
Research on mass data storage is still at an early stage, and data synchronization between data centers still has many aspects that merit study and improvement. Take HBase as an example: HBase replication relies on a Master/Slave architecture; simple data replication between two data centers was added only in version 0.90.0; replication tasks have no priority queue implementation; and there is no unified scheduling based on data center load. In addition, traditional cross-data-center synchronization algorithms usually transfer and overwrite whole blocks of data, which consumes a large amount of network and I/O resources.
To address this situation, a method for data synchronization across data centers based on log replay is invented.
Summary of the invention
The technical task of the present invention is to address the deficiencies of the prior art by providing a method for data synchronization across data centers.
The technical solution of the present invention is realized as follows. The implementation process of this method for data synchronization across data centers is:
Step 1, write the data and record the log: a log recording module runs in the primary data center. When the primary data center receives a data request from a client, this module records the operation required by the request in the primary data center as a log entry. The module is integrated into the business flow of the primary data center as an embedded component or a plug-in.
Step 2, synchronization scheduling and push: a scheduling module runs in the primary data center and is responsible for scheduling data replay operations. Based on the load of the primary data center, the load of the backup data center, and the scheduling policy, it triggers log push and replay operations. The push operations requested by the scheduling module are carried out by a log push module, which runs in the primary data center and transmits data operation logs to the backup data center.
Step 3, replay the log and complete data synchronization: the data operation logs pushed by the log push module of the primary data center are received by a log replay module. This module runs in the backup data center and replays the data operation logs in its local data center, synchronizing the data of the two data centers.
Step 4, perform data access across data centers, achieving asynchronous data synchronization.
The detailed process of step 1 is: the client identifies the primary data center that holds the client's data according to its local configuration, and sends all data operations to the primary data center, where they are handled by the data nodes of the primary data center. After receiving the client's request, the primary data center performs the client operation according to the requested operation and its content. During this process, the log recording module captures the requested operation and its related data by intercepting the request. The log recording module determines whether the client operation needs to modify data in the data center. If so, the operation must be included in cross-data-center synchronization, and the log recording module saves the requested operation and its related data, in a proprietary log format, to the asynchronous log recording region of the primary data center. Everything in this region is content that needs to be replayed across data centers.
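The patent gives no code, so the following is only a minimal sketch of how the interception in step 1 might look. The class name `PrimaryDataCenter`, the operation names, and the dictionary-based store and log region are all assumptions made for illustration. The point shown is that every request passes through a recording hook, but only data-modifying operations are appended to the asynchronous log recording region.

```python
from dataclasses import dataclass, field
from typing import Any

# Assumption: these operation names are the ones that modify data.
MUTATING_OPS = {"write", "delete"}


@dataclass
class PrimaryDataCenter:
    """Hypothetical stand-in for a primary data center with a recording hook."""
    store: dict[str, Any] = field(default_factory=dict)
    # Stand-in for the "asynchronous log recording region".
    async_log_region: list[dict[str, Any]] = field(default_factory=list)

    def handle_request(self, op: str, key: str, value: Any = None) -> Any:
        # The hook sees every request before it is executed (interception).
        self._record_if_mutating(op, key, value)
        if op == "write":
            self.store[key] = value
            return None
        if op == "delete":
            self.store.pop(key, None)
            return None
        if op == "read":
            return self.store.get(key)
        raise ValueError(f"unsupported operation: {op}")

    def _record_if_mutating(self, op: str, key: str, value: Any) -> None:
        # Reads do not change data, so they are never logged for replay.
        if op in MUTATING_OPS:
            self.async_log_region.append({"op": op, "key": key, "value": value})
```

In this sketch a read leaves the log region untouched, while a write followed by a delete leaves two entries in request order, which is the order a later replay would need.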
The detailed process of step 2 is: the scheduling module running in the primary data center first monitors the following conditions:
1) the number of log entries in the asynchronous log recording region and the amount of data they involve;
2) the load of the primary data center, including network I/O and disk I/O;
3) the load of the backup data center, including network I/O and disk I/O.
When these three conditions satisfy the scheduling policy set by the configuration administrator, a log push operation is triggered. The log push operation is performed by the log push module, which writes the data operation logs in the asynchronous log recording region of the primary data center to the asynchronous log execution region of the backup data center. After the log push module completes the cross-data-center transfer of the logs, it notifies the scheduling module, which then drives the log replay module in the backup data center to replay the logs.
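The scheduling decision in step 2 can be sketched as a single predicate over the three monitored conditions. The patent leaves the policy itself to the configuration administrator, so the threshold fields in `SchedulingPolicy`, the 0.0 to 1.0 utilization scale, and the function name `should_push` are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Load:
    """Assumed load measure: utilization between 0.0 and 1.0."""
    network_io: float
    disk_io: float


@dataclass
class SchedulingPolicy:
    """Hypothetical administrator-configured thresholds."""
    min_pending_entries: int  # backlog size that justifies a push
    min_pending_bytes: int    # backlog volume that justifies a push
    max_load: float           # highest I/O utilization at which a push may run


def should_push(pending_entries: int, pending_bytes: int,
                primary: Load, backup: Load, policy: SchedulingPolicy) -> bool:
    # Condition 1: enough log entries or enough data has accumulated.
    backlog_high = (pending_entries >= policy.min_pending_entries
                    or pending_bytes >= policy.min_pending_bytes)

    # Conditions 2 and 3: both data centers have spare network and disk I/O.
    def has_headroom(load: Load) -> bool:
        return max(load.network_io, load.disk_io) <= policy.max_load

    return backlog_high and has_headroom(primary) and has_headroom(backup)
```

With thresholds of 100 entries and 0.5 utilization, for example, a backlog of 150 entries triggers a push only while both data centers report loads at or below 0.5.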
The detailed process of step 3 is: after receiving the notification from the scheduling module, the log replay module running in the backup data center starts replaying the logs. It reads the data logs stored in the asynchronous log execution region and decodes their content to obtain the operation and related data for each log entry. It then re-executes the operation on the relevant nodes of the backup data center, so that the data in the backup data center becomes consistent with the data in the primary data center, achieving cross-data-center data synchronization.
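A minimal sketch of the replay in step 3 follows, assuming log entries are JSON strings with `op`, `key`, and `value` fields and that the backup store is a plain dictionary. None of these details come from the patent, which only says the log uses a proprietary format. The sketch decodes each entry in order and re-executes it against the backup store.

```python
import json


def replay(execution_region: list[str], backup_store: dict) -> int:
    """Decode and re-execute every entry in the execution region, in order.

    Returns the number of entries applied. The region is cleared only after
    all entries succeed; write and delete are idempotent here, so a retry
    after a failure re-applies earlier entries harmlessly.
    """
    applied = 0
    for raw in execution_region:
        entry = json.loads(raw)  # decode the log content
        if entry["op"] == "write":
            backup_store[entry["key"]] = entry["value"]
        elif entry["op"] == "delete":
            backup_store.pop(entry["key"], None)
        else:
            raise ValueError(f"cannot replay operation: {entry['op']}")
        applied += 1
    execution_region.clear()
    return applied
```

Replaying entries in their recorded order matters: a write to a key followed by a delete of the same key must leave the key absent on the backup, as it is on the primary.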
In step 4, the client obtains data by accessing the backup data center in the following two situations: when the client cannot connect to the primary data center; and when the client can connect to the primary data center but the primary data center is busy.
Compared with the prior art, the present invention has the following beneficial effects:
The method for data synchronization across data centers of the present invention achieves asynchronous data synchronization across data centers and improves data safety. When the primary data center is unreachable or busy, users can also obtain data by accessing the backup data center. Because only the differences in the data need to be transmitted during replay, not the data itself, the method also reduces the amount of data transferred and the bandwidth that synchronization consumes between data centers. In addition, the scheduling module can schedule according to data center load, making effective use of the I/O resources within each data center and the network resources between data centers, and providing load balancing. The method is practical and easy to adopt.
Brief description of the drawings
Figure 1 is a schematic diagram of the implementation process of the present invention.
Detailed description
The method for data synchronization across data centers of the present invention is described in detail below with reference to the accompanying drawing.
As shown in Figure 1, the method achieves asynchronous data synchronization between data centers by replaying data operation logs. Its implementation process is:
First, the following modules are set up through programming:
(1) Log recording module. Runs in the primary data center. When the primary data center receives a data request from a client, it records the operation required by the request in the primary data center as a log entry. This module is integrated into the business flow of the primary data center as an embedded component or a plug-in.
(2) Scheduling module. Runs in the primary data center and is responsible for scheduling data replay operations. It triggers log push and replay operations based on information such as the load of the primary data center, the load of the backup data center, and the scheduling policy.
(3) Log push module. Runs in the primary data center and is responsible for performing the push operations requested by the scheduling module, transmitting data operation logs to the backup data center.
(4) Log replay module. Runs in the backup data center. It receives the data operation logs pushed by the log push module of the primary data center and replays them in its local data center, synchronizing the data of the two data centers.
The modules above carry out the following operations:
Step 1: writing data and recording the log.
Under normal conditions, the client identifies the primary data center that holds the client's data according to its local configuration, and sends all data operations, including reads, writes, and deletes, to the primary data center, where they are handled by the data nodes of the primary data center.
After receiving the client's request, the primary data center performs the client operation according to the requested operation and its content. During this process, the log recording module captures the requested operation and its related data by intercepting the request.
The log recording module determines whether the client operation needs to modify data in the data center. If so, the operation must be included in cross-data-center synchronization. The log recording module then saves the requested operation and its related data, in a proprietary log format, to the asynchronous log recording region of the primary data center. Everything in this region is content that needs to be replayed across data centers.
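The patent does not disclose the proprietary log format. Purely as an illustration of what such a record has to carry (the operation, the data it touches, and some way to preserve order), the sketch below defines a hypothetical `LogEntry` with a sequence number and encodes it as UTF-8 JSON. The field names and the choice of JSON are assumptions, not the patent's format.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class LogEntry:
    """Hypothetical log record; field names are illustrative only."""
    seq: int                      # position in the log, preserves operation order
    op: str                       # e.g. "write" or "delete"
    key: str                      # which data item the operation touches
    value: Optional[str] = None   # payload for writes, absent for deletes


def encode(entry: LogEntry) -> bytes:
    # Serialize to bytes suitable for storing in the log region or sending.
    return json.dumps(asdict(entry), sort_keys=True).encode("utf-8")


def decode(raw: bytes) -> LogEntry:
    # Inverse of encode(): recover the operation and its related data.
    return LogEntry(**json.loads(raw.decode("utf-8")))
```

Whatever the real format is, it needs the round-trip property shown here: decoding an encoded entry must yield exactly the operation and data that were recorded, otherwise replay on the backup would diverge from the primary.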
Step 2: synchronization scheduling and log push.
The scheduling module running in the primary data center monitors the following conditions:
1) The number of log entries in the asynchronous log recording region and the amount of data they involve.
2) The load of the primary data center, including network I/O and disk I/O.
3) The load of the backup data center, mainly including network I/O and disk I/O.
When these three conditions satisfy the scheduling policy set by the configuration administrator, a log push operation is triggered. A push is typically triggered when the value in condition 1) is high and the loads in conditions 2) and 3) are low.
The log push operation is performed by the log push module, which writes the data operation logs in the asynchronous log recording region of the primary data center to the asynchronous log execution region of the backup data center.
After the log push module completes the cross-data-center transfer of the logs, it notifies the scheduling module. The scheduling module then drives the log replay module in the backup data center to replay the logs.
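The hand-off described above (move the logs, then notify the scheduler) can be sketched as follows. In this sketch both regions are in-memory lists and the "transfer" is a list append, which stands in for a real network transfer; the function name `push_logs` and the callback `on_complete` are hypothetical.

```python
from typing import Callable


def push_logs(log_region: list[bytes],
              execution_region: list[bytes],
              on_complete: Callable[[int], None]) -> int:
    """Move all pending entries to the backup's execution region.

    `on_complete` models the notification sent to the scheduling module
    once the transfer has finished; it receives the number of entries moved.
    """
    batch = list(log_region)         # snapshot of what is pending right now
    if not batch:
        return 0                     # nothing to push, nothing to announce
    execution_region.extend(batch)   # placeholder for the cross-data-center transfer
    del log_region[:len(batch)]      # drop only what was actually transferred
    on_complete(len(batch))          # scheduler can now trigger replay
    return len(batch)
```

Removing entries from the primary's region only after they have been written to the backup's region keeps the sketch safe against losing entries if the transfer step fails partway.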
Step 3: log replay.
After receiving the notification from the scheduling module, the log replay module running in the backup data center starts replaying the logs.
The log replay module reads the data logs stored in the asynchronous log execution region and decodes their content to obtain the operation and related data for each log entry. It then re-executes the operation on the relevant nodes of the backup data center, so that the data in the backup data center becomes consistent with the data in the primary data center. This achieves cross-data-center data synchronization.
Step 4: data access across data centers.
The client may obtain data by accessing the backup data center in two situations:
First, when the client cannot connect to the primary data center. This may be because the primary data center has failed, or because the network between the primary data center and the client is interrupted.
In this case, the client attempts to read from the backup data center, and only read operations are allowed. In addition, because it is uncertain whether the data between the backup data center and the primary data center has finished synchronizing, the client can display relevant information to the user, stating that the data now comes from the backup data center and that there may be a data consistency problem.
Second, when the client can connect to the primary data center but the primary data center is busy.
In this case, if the client determines that its operation is read-only, it can ask the primary data center whether the data involved has already been synchronized to the backup data center. If the primary data center reports that synchronization is complete, the client can read the data from the backup data center. Here the backup data center provides load balancing.
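The two access cases can be condensed into one routing function on the client side. This is a sketch under assumptions: the function name `choose_source`, its boolean inputs, and the use of `PermissionError` for rejected writes are illustrative, and `synced` stands for the primary data center's answer to the client's query about whether the relevant data has been synchronized.

```python
from enum import Enum


class Source(Enum):
    PRIMARY = "primary"
    BACKUP = "backup"


def choose_source(op: str, primary_reachable: bool,
                  primary_busy: bool, synced: bool) -> tuple[Source, bool]:
    """Return (data center to use, whether to warn about possible stale data)."""
    if not primary_reachable:
        # Case 1: primary failed or network interrupted. Reads only, and the
        # user is told the data may be inconsistent with the primary.
        if op != "read":
            raise PermissionError("only reads are allowed while the primary is unreachable")
        return Source.BACKUP, True
    if primary_busy and op == "read" and synced:
        # Case 2: primary confirmed the data is synchronized, so the backup
        # can serve the read and relieve the primary.
        return Source.BACKUP, False
    # Everything else, including all modifying operations, goes to the primary.
    return Source.PRIMARY, False
```

The warning flag is set only in the first case because there the client cannot ask the primary whether synchronization has completed; in the second case the primary has already confirmed it.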
The method for data synchronization across data centers of the present invention achieves asynchronous data synchronization across data centers and improves data safety. When the primary data center is unreachable or busy, users can also obtain data by accessing the backup data center. The method effectively reduces the amount of data generated by synchronization between data centers: because only the differences in the data need to be transmitted during replay, not the data itself, less data is transferred and synchronization consumes less bandwidth between data centers. Scheduling can follow data center load, making effective use of the I/O resources within each data center and the network resources between data centers, and providing load balancing.
The above are only embodiments of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (3)

1. A method for data synchronization across data centers, characterized in that its implementation process is:
Step 1, write the data and record the log: a log recording module runs in the primary data center; when the primary data center receives a data request from a client, this module records the operation required by the request in the primary data center as a log entry; the module is integrated into the business flow of the primary data center as an embedded component or a plug-in;
The detailed process of step 1 is: the client identifies the primary data center that holds the client's data according to its local configuration, and sends all data operations to the primary data center, where they are handled by the data nodes of the primary data center; after receiving the client's request, the primary data center performs the client operation according to the requested operation and its content, and during this process the log recording module captures the requested operation and its related data by intercepting the request; the log recording module determines whether the client operation needs to modify data in the data center, and if so, the operation must be included in cross-data-center synchronization, in which case the log recording module saves the requested operation and its related data, in a proprietary log format, to the asynchronous log recording region of the primary data center, everything in this region being content that needs to be replayed across data centers;
Step 2, synchronization scheduling and push: a scheduling module runs in the primary data center and is responsible for scheduling data replay operations; based on the load of the primary data center, the load of the backup data center, and the scheduling policy, it triggers log push and replay operations; the push operations requested by the scheduling module are carried out by a log push module, which runs in the primary data center and transmits data operation logs to the backup data center;
Step 3, replay the log and complete data synchronization: the data operation logs pushed by the log push module of the primary data center are received by a log replay module, which runs in the backup data center and replays the data operation logs in its local data center, synchronizing the data of the two data centers;
The detailed process of step 3 is: after receiving the notification from the scheduling module, the log replay module running in the backup data center starts replaying the logs; it reads the data logs stored in the asynchronous log execution region, decodes their content to obtain the operation and related data for each log entry, and then re-executes the operation on the relevant nodes of the backup data center, so that the data in the backup data center becomes consistent with the data in the primary data center, achieving cross-data-center data synchronization;
Step 4, perform data access across data centers, achieving asynchronous data synchronization.
2. The method for data synchronization across data centers according to claim 1 and 2, characterized in that the detailed process of step 2 is: the scheduling module running in the primary data center first monitors the following conditions:
1) the number of log entries in the asynchronous log recording region and the amount of data they involve;
2) the load of the primary data center, including network I/O and disk I/O;
3) the load of the backup data center, including network I/O and disk I/O;
When these three conditions satisfy the scheduling policy set by the configuration administrator, a log push operation is triggered; the log push operation is performed by the log push module, which writes the data operation logs in the asynchronous log recording region of the primary data center to the asynchronous log execution region of the backup data center; after the log push module completes the cross-data-center transfer of the logs, it notifies the scheduling module, which then drives the log replay module in the backup data center to replay the logs.
3. The method for data synchronization across data centers according to claim 2, characterized in that in step 4 the client obtains data by accessing the backup data center in the following two situations: when the client cannot connect to the primary data center; and when the client can connect to the primary data center but the primary data center is busy.
CN201410023373.0A 2014-01-20 2014-01-20 A method for data synchronization across data centers Active CN103763368B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201410023373.0A CN103763368B (en) 2014-01-20 2014-01-20 A method for data synchronization across data centers
PCT/CN2015/070416 WO2015106656A1 (en) 2014-01-20 2015-01-09 Cross-data-center data synchronization method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410023373.0A CN103763368B (en) 2014-01-20 2014-01-20 A method for data synchronization across data centers

Publications (2)

Publication Number Publication Date
CN103763368A CN103763368A (en) 2014-04-30
CN103763368B true CN103763368B (en) 2016-07-06

Family

ID=50530527

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410023373.0A Active CN103763368B (en) 2014-01-20 2014-01-20 A method for data synchronization across data centers

Country Status (2)

Country Link
CN (1) CN103763368B (en)
WO (1) WO2015106656A1 (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103763368B (en) * 2014-01-20 2016-07-06 浪潮电子信息产业股份有限公司 A method for data synchronization across data centers
CN104219288B (en) * 2014-08-14 2018-03-23 中国南方电网有限责任公司超高压输电公司 Distributed Data Synchronization method and its system based on multithreading
CN104519130B (en) * 2014-12-16 2018-02-27 北京中交兴路车联网科技有限公司 A kind of data sharing caching method across IDC
CN104899278B (en) * 2015-05-29 2019-05-03 北京京东尚科信息技术有限公司 A kind of generation method and device of Hbase database data operation log
CN106557530B (en) * 2015-09-30 2019-10-11 腾讯科技(深圳)有限公司 Operation system, data recovery method and device
CN105610917B (en) * 2015-12-22 2019-12-20 腾讯科技(深圳)有限公司 Method and system for realizing synchronous data repair in system
CN110290214A (en) * 2019-06-28 2019-09-27 苏州浪潮智能科技有限公司 A kind of transmitting data file method and system
CN110750594B (en) * 2019-09-30 2023-05-30 上海视云网络科技有限公司 Real-time cross-network database synchronization method based on mysql incremental log

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1677931A (en) * 2004-04-02 2005-10-05 鸿富锦精密工业(深圳)有限公司 Network daily-record data management system and method
CN101043375A (en) * 2007-03-15 2007-09-26 华为技术有限公司 Distributed system journal collecting method and system

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8214329B2 (en) * 2008-08-26 2012-07-03 Zeewise, Inc. Remote data collection systems and methods
CN102075556B (en) * 2009-11-19 2014-11-26 北京明朝万达科技有限公司 Method for designing service architecture with large-scale loading capacity
WO2012081050A1 (en) * 2010-12-14 2012-06-21 Hitachi, Ltd. Failure recovery method in information processing system and information processing system
CN103500229B (en) * 2013-10-24 2017-04-19 北京奇虎科技有限公司 Database synchronization method and database system
CN103763368B (en) * 2014-01-20 2016-07-06 浪潮电子信息产业股份有限公司 A method for data synchronization across data centers

Also Published As

Publication number Publication date
CN103763368A (en) 2014-04-30
WO2015106656A1 (en) 2015-07-23

Similar Documents

Publication Publication Date Title
CN103763368B (en) A method for data synchronization across data centers
US10956601B2 (en) Fully managed account level blob data encryption in a distributed storage environment
US9213719B2 (en) Peer-to-peer redundant file server system and methods
US10659225B2 (en) Encrypting existing live unencrypted data using age-based garbage collection
CN105335513B (en) A kind of distributed file system and file memory method
CN103780638B (en) Method of data synchronization and system
CN106446159B (en) A kind of method of storage file, the first virtual machine and name node
CN105187464B (en) Method of data synchronization, apparatus and system in a kind of distributed memory system
CN103647797A (en) Distributed file system and data access method thereof
CN104391930A (en) Distributed file storage device and method
CN106303428A (en) A kind of security protection cloud platform
CN103455577A (en) Multi-backup nearby storage and reading method and system of cloud host mirror image file
US20130031221A1 (en) Distributed data storage system and method
CN105828017B (en) A kind of cloud storage access system and method towards video conference
CN104899161B (en) A kind of caching method of the continuous data protection based on cloud storage environment
CN112416892A (en) Emergency video data cloud storage system
CN104517067B (en) Access the method, apparatus and system of data
CN102820998A (en) Dual-fault-tolerant service system applicable to office applications and data storage method of dual-fault-tolerant service system
CN102541693A (en) Multi-copy storage management method and system of data
CN116304390B (en) Time sequence data processing method and device, storage medium and electronic equipment
CN105022779A (en) Method for realizing HDFS file access by utilizing Filesystem API
CN106919574B (en) Method for processing remote synchronous file in real time
CN106528667A (en) Low-power-consumption mass data full-text retrieval system frame capable of carrying out read-write separation
CN104869056A (en) Institution personnel data synchronization method based on relational data separation
CN109257403A (en) Date storage method and equipment, distributed memory system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant