CN111581030A - Data synchronization system and method based on difference data - Google Patents

Data synchronization system and method based on difference data

Info

Publication number
CN111581030A
Authority
CN
China
Prior art keywords
file
name information
data
data block
checksum
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010400280.0A
Other languages
Chinese (zh)
Inventor
刘举
高志会
苏亮彪
陈勇铨
周华
吕爱民
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Yingfang Software Co ltd
Original Assignee
Shanghai Yingfang Software Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Yingfang Software Co ltd
Priority to CN202010400280.0A
Publication of CN111581030A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data synchronization system and method based on difference data. The system includes a source end and a standby end. The source end sends the file name information and file attributes of the files to be synchronized to the standby end, continuously receives the file name information and data block checksums sent by the standby end, calculates the data block checksums of the corresponding file on the source end's local machine, compares them with the received checksums, and sends difference data to the standby end for synchronization according to the comparison result. The standby end stores the file name information and file attributes sent by the source end in a storage file, continuously reads them back from the storage file, obtains the corresponding file on the standby end's local machine according to the file name information, calculates the checksum of each data block in that file, sends the file name information and the data block checksums to the source end, and, after receiving the difference data, writes it into the corresponding file on the standby end's local machine.

Description

Data synchronization system and method based on difference data
Technical Field
The invention relates to the technical field of computer data backup, in particular to a data synchronization system and a data synchronization method based on difference data for continuous comparison between production end data and target end data in a data synchronization process.
Background
Data backup is the basis of disaster tolerance, and the backup process synchronizes the data of the production end to the target end. At present, the most common method is to synchronize all of the production end's data to the target end during every backup. Although this guarantees data integrity, transmitting the complete data set on every backup greatly reduces data transmission efficiency.
Disclosure of Invention
In order to overcome the defects in the prior art, the present invention provides a data synchronization system and method based on difference data, so as to synchronize the difference data by comparing the data of the production end with the data of the target end in the process of synchronizing the data, thereby effectively improving the data synchronization efficiency.
To achieve the above object, the present invention provides a data synchronization system based on difference data, comprising:
the source end sends the file name information and the file attribute of the file to be synchronized to the standby end, continuously receives the file name information and the data block checksum sent by the standby end, calculates the data block checksum of the file corresponding to the source end local machine, compares the data block checksum with the data block checksum of the corresponding file received from the standby end, and sends differential data to the standby end for synchronization according to the comparison result;
the standby end stores the file name information and file attributes of the files to be synchronized, sent by the source end, in a storage file, continuously reads the file name information and file attributes from the storage file, obtains the corresponding file on the standby end's local machine according to the read file name information, calculates the checksum of each data block in the corresponding file, sends the file name information and the calculated data block checksums to the source end, and, after receiving the difference data sent by the source end, writes the difference data into the corresponding file on the standby end's local machine.
Preferably, the source further comprises:
the synchronous file information traversal sending unit is used for traversing all file information to be synchronized on the source end and sending file name information and file attributes of all files to be synchronized to the standby end;
the data block checksum receiving unit is used for continuously receiving the file name information sent by the standby terminal and the checksum of each data block of the corresponding file;
the checksum calculation and comparison unit is used for obtaining the corresponding file on the source end's local machine according to the file name information received from the standby end, calculating its data block checksums, and comparing the calculation result with the data block checksums of the corresponding file received from the standby end;
and the comparison result processing unit is used for determining difference data according to the comparison result of the checksum calculation comparison unit, sending the difference data to the standby terminal, and sending a file synchronization completion mark to the standby terminal after the current file synchronization is completed.
Preferably, the standby end further comprises:
the storage unit is used for writing the file name information and file attributes of the files to be synchronized, sent by the source end, into a separate storage file;
the checksum calculation unit is used for continuously reading the file name information and file attributes held in each record (storage unit) of the storage file, obtaining the corresponding file on the standby end's local machine according to the read file name information, calculating the checksum of each data block in that file, and sending the file name information and the checksum of each data block of the file to the source end;
and the data synchronization unit is used for receiving the difference data sent by the source end and writing the difference data into the corresponding file on the standby end's local machine.
Preferably, the standby end further comprises:
and the check thread starting unit is used for starting the check thread, so that the checksum calculation unit starts running once the check thread has started.
Preferably, each time the checksum calculation unit reads the file name information and file attributes of one file that the source end needs to synchronize, it loads them into a linear table, reads the corresponding file of the standby end according to the file name information, and calculates the checksum of each data block in the corresponding file.
Preferably, when the data synchronization unit receives the difference data, it takes the corresponding file name information and file attributes from the loaded linear table, obtains the corresponding file of the standby end according to the file name information, and writes the difference data into that file.
Preferably, if the comparison result indicates that no difference data is generated, the comparison result processing unit still sends a file synchronization completion flag to the standby end after the comparison is completed.
Preferably, when the data synchronization unit receives the file synchronization completion flag sent by the source end, it sets the file attributes of the corresponding file of the standby end according to the file attributes taken from the linear table, and deletes the corresponding record from the linear table.
In order to achieve the above object, the present invention further provides a data synchronization method based on difference data, including the following steps:
step S1, the source end sends the file name information and file attributes of the files to be synchronized to the standby end, and the standby end stores them in a separate storage file;
step S2, the standby end continuously reads the file name information and file attributes from the storage file, obtains the corresponding file on the standby end's local machine according to the read file name information, calculates the checksum of each data block of the corresponding file, and sends the file name information and the data block checksums of the file to the source end;
step S3, the source end continuously receives the file name information and data block checksums sent by the standby end, obtains the corresponding file on the source end's local machine according to the file name information, calculates its data block checksums, compares them with the data block checksums received from the standby end, and sends the difference data to the standby end according to the comparison result;
step S4, after receiving the difference data sent by the source end, the standby end writes the difference data into the corresponding file on the standby end's local machine.
Preferably, in step S2, each time the standby end reads from the storage file the file name information and file attributes of one file that the source end needs to synchronize, it loads them into a linear table, reads the corresponding file of the standby end according to the file name information, and calculates the checksum of each data block in the corresponding file.
Compared with the prior art, in the data synchronization system and method based on difference data of the present invention, the source end sends the file name information and file attributes of all files to be synchronized to the standby end, and the standby end stores them in a separate storage file. The standby end then starts a check thread that continuously reads the file name information and file attributes from the storage file, obtains the corresponding file on the standby end's local machine according to the file name information, calculates the checksum of each data block in that file, and continuously sends the file name information and the data block checksums to the source end. The source end continuously receives the file name information and data block checksums sent by the standby end, calculates the data block checksums of the corresponding file on the source end's local machine, compares them with the checksums sent by the standby end, and sends the difference data to the standby end according to the comparison result, so that only the difference data is synchronized.
Drawings
FIG. 1 is a system architecture diagram of a data synchronization system based on difference data according to the present invention;
FIG. 2 is a flow chart illustrating the steps of a data synchronization method based on difference data according to the present invention;
FIG. 3 is a flow chart of data synchronization based on difference data according to an embodiment of the present invention.
Detailed Description
Other advantages and effects of the present invention will be readily apparent to those skilled in the art from this disclosure, which describes embodiments of the invention through specific examples in conjunction with the accompanying drawings. The invention can also be implemented or applied through other, different embodiments, and its details can be modified in various respects without departing from the spirit and scope of the present invention.
FIG. 1 is a system architecture diagram of a data synchronization system based on difference data according to the present invention. As shown in fig. 1, the present invention provides a data synchronization system based on difference data, which includes:
the source end 10, i.e. the production end, sends the file name information and file attributes of all files to be synchronized to the standby end 20, continuously receives the file name information and the data block checksums of each file sent by the standby end 20, calculates the data block checksums of the corresponding synchronized file on the source end 10, compares them with the data block checksums of the corresponding file sent by the standby end 20, and sends the difference data to the standby end 20 according to the comparison result.
Specifically, the source terminal 10 further includes:
the synchronized file information traversal sending unit 101 is configured to traverse all files that need to be synchronized on the source end 10 and send the file name information and file attributes of all those files to the standby end 20. In the embodiment of the present invention, the file name information includes, but is not limited to, a folder name and a file name: the folder name marks the path where the file is located and the file name marks the file to be synchronized, so the file name information contains the full-path file name of the file to be synchronized. The file attributes include, but are not limited to, the access rights, creation time, modification time, and size of the file.
The data block checksum receiving unit 102 is configured to continuously receive the file name information of each synchronized file sent by the standby end and the checksum of each data block of the corresponding file. Specifically, when the source end sends the file name information and file attributes of all files to be synchronized to the standby end 20, the standby end stores them in a separate storage file. Once the check thread is started, the standby end continuously reads the file name information and attributes of each synchronized file from the storage file, derives the full-path file name from the file name information, obtains the corresponding file on the standby end's local machine, and calculates the checksum of each data block in that file using the MD5 algorithm, i.e. its MD5 value. It continuously sends the file name information of the synchronized file and the calculated checksum of each data block to the data block checksum receiving unit 102 of the source end 10. Because the file name information sent by the standby end to the source end is the full-path file name, formed from the folder name and the specific file name that the standby end read from the storage file, the source end can match the corresponding synchronized file according to that full-path file name.
The checksum calculation and comparison unit 103 is configured to obtain the corresponding synchronized file on the source end's local machine according to the file name information received from the standby end, calculate its data block checksums, and compare the result with the data block checksums of the corresponding file received from the standby end. The checksums are calculated in the same way as on the standby end, so the details are not repeated here.
The comparison result processing unit 104 is configured to determine the difference data, that is, the data blocks whose checksums are inconsistent, according to the comparison result of the checksum calculation and comparison unit 103, and to send the difference data to the standby end 20, which writes it into the corresponding file of the standby end. The standby end 20 takes the file name and attributes of the synchronized file from the loaded linear table in order, matches them to the corresponding file of the standby end, and writes the difference data into that file. Specifically, after the standby end 20 receives the file name information and file attributes of the synchronized files sent by the source end 10 and writes them into a separate storage file, it starts a check thread. The check thread continuously reads the storage file in sequence. Each read yields the file name information and file attributes of one source-end file to be synchronized, which are appended to a linear table as one recording unit; at the same time, the corresponding file of the standby end is matched according to the file name information, the checksums of its data blocks are calculated, and the file name information and the calculated data block checksums are sent to the source end 10. This ensures that the order in which the source end 10 receives the file name information of the synchronized files is consistent with the order of the recording units loaded in the linear table. As a result, when the comparison result processing unit 104 sends difference data, the standby end 20 can directly take the first recording unit from the linear table: its file name information and file attributes are those of the file to which the difference data belongs, so the standby end 20 can match the corresponding file according to that file name information and write the difference data into it. When the source end 10 finishes synchronizing a file, it sends a file synchronization completion flag to the standby end 20. After receiving the flag, the standby end 20 sets the file attributes of the corresponding standby-end file according to the file attributes in the recording unit taken from the linear table and deletes the first recording unit from the linear table, so that the next file synchronized by the source end 10 again corresponds to the first recording unit in the linear table.
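The checksum comparison performed by units 103 and 104 can be illustrated with a minimal Python sketch. It assumes fixed-size data blocks and treats source blocks beyond the end of the standby file as differences; the text names MD5 but does not specify the block size or how files of different lengths are handled, so `BLOCK_SIZE`, `block_checksums`, and `differing_blocks` are hypothetical names chosen for illustration.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical block size in bytes, kept tiny for the demo


def block_checksums(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Return the MD5 hex digest of each fixed-size block of data."""
    return [
        hashlib.md5(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def differing_blocks(source: bytes, standby_sums: list,
                     block_size: int = BLOCK_SIZE) -> list:
    """Compare source blocks with standby checksums.

    Returns (offset, block) pairs for every source block whose checksum
    differs from, or has no counterpart in, the standby checksum list.
    """
    diffs = []
    for index, digest in enumerate(block_checksums(source, block_size)):
        if index >= len(standby_sums) or standby_sums[index] != digest:
            offset = index * block_size
            diffs.append((offset, source[offset:offset + block_size]))
    return diffs


source_data = b"AAAABBBBCCCCDD"   # file content on the source end
standby_data = b"AAAAXXXXCCCC"    # stale copy on the standby end

diffs = differing_blocks(source_data, block_checksums(standby_data))
print(diffs)  # [(4, b'BBBB'), (12, b'DD')]
```

Only the second block and the trailing partial block would be transmitted in this example; the two unchanged blocks are skipped.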
The standby end 20, i.e. the target end, stores the file name information and file attributes of the files to be synchronized, sent by the source end 10, in a separate storage file, continuously reads the file name information and file attributes from the storage file, obtains the corresponding file on the local machine of the standby end 20 according to the file name information of each file, calculates the checksum of each data block in the corresponding file, continuously sends the file name information and the checksum of each data block to the source end, and, after receiving the difference data sent by the source end 10, writes the difference data into the corresponding file on the local machine of the standby end 20.
Specifically, the standby end 20 further includes:
the storage unit 201 is configured to write the file name information and file attributes of the files to be synchronized, sent by the source end 10, into a separate storage file.
The checksum calculating unit 201 is configured to continuously read the file name information and file attributes of each file from the storage file, obtain the corresponding file on the standby end's local machine according to the file name information of each file, calculate the checksum of each data block in that file, and continuously send the file name information of the synchronized file and the checksum of each data block to the source end 10.
The data synchronization unit 202 is configured to receive the difference data sent by the source end 10 and write the difference data into the corresponding file on the standby end's local machine.
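As a rough illustration of what the data synchronization unit does with received difference data, the following Python sketch writes each differing block at its offset in the standby-end file. The `(offset, block)` representation of difference data and the final truncation to the source file size are assumptions made for this sketch; the text does not specify the wire format. `apply_differences` is a hypothetical helper name.

```python
import os
import tempfile


def apply_differences(path, diffs, final_size):
    """Write each (offset, block) pair into the file at path.

    The file is then truncated to final_size, an assumed step that lets
    the standby copy shrink when the source file became shorter.
    """
    mode = "r+b" if os.path.exists(path) else "w+b"
    with open(path, mode) as f:
        for offset, block in diffs:
            f.seek(offset)
            f.write(block)
        f.truncate(final_size)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "standby.bin")
    with open(target, "wb") as f:
        f.write(b"AAAAXXXXCCCC")          # stale standby-end content

    apply_differences(target, [(4, b"BBBB"), (12, b"DD")], final_size=14)

    with open(target, "rb") as f:
        result = f.read()

print(result)  # b'AAAABBBBCCCCDD'
```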
Preferably, the standby end 20 further comprises:
a check thread starting unit, configured to start the check thread so that the checksum calculating unit 201 starts running once the check thread has started. The checksum calculating unit 201 continuously reads from the storage file the file names and file attributes that the source end sent for synchronization. Each read yields the file name information and file attributes of one source-end file to be synchronized, which are loaded into a linear table; at the same time, the corresponding file on the standby end's local machine is obtained according to the read file name information, the file is read, the checksum of each of its data blocks is calculated, and the file name information of the file and the calculated data block checksums are sent to the source end 10. By repeating this cycle, the check thread continuously sends the file name information, the file attributes, and the data block checksums of the files to be synchronized to the source end.
Preferably, when the data synchronization unit 202 of the standby end 20 receives difference data, it takes one unit, i.e. the corresponding file name and file attributes, from the linear table in order and writes the difference data into the corresponding file, thereby synchronizing the difference data. When the source end 10 finishes synchronizing a file, it sends a synchronization completion flag to the standby end 20. After the standby end 20 receives the flag, the data synchronization unit 202 sets the attributes of the corresponding standby-end file according to the file attributes taken from the linear table, so that the file attributes are consistent at both ends. Once data synchronization and attribute synchronization are complete, the corresponding record must be deleted from the linear table, so that the next unit of the linear table, i.e. the file name and file attributes of the next file, is taken in order when the next file is synchronized. It should be noted that if the comparison between the source end and the standby end produces no difference data, the source end 10 still sends a synchronization completion flag after the comparison is completed, and the standby end 20 performs the same operations on receiving it, that is, it sets the attributes and deletes the corresponding record from the linear table, so that the order of the files synchronized by the source end stays consistent with the order of the records in the linear table.
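The role of the linear table can be sketched as a first-in, first-out queue. The Python example below assumes that difference data and completion flags arrive in the same order in which the check thread sent checksums, which is the property the description relies on. Files are modelled as in-memory byte arrays, and all names (`linear_table`, `load_record`, `on_difference_data`, `on_sync_complete`) are hypothetical.

```python
from collections import deque

# The "linear table": one record per file, in the order checksums were sent.
linear_table = deque()

# Hypothetical in-memory stand-ins for standby-end files and applied attributes.
standby_files = {
    "/data/a.txt": bytearray(b"AAAAXXXX"),
    "/data/b.txt": bytearray(b"same"),
}
applied_attrs = {}


def load_record(name, attrs):
    """Check thread: append a record when a file's checksums are sent."""
    linear_table.append({"name": name, "attrs": attrs})


def on_difference_data(offset, block):
    """Write a difference block into the file named by the first record."""
    record = linear_table[0]  # peek only; the record stays until completion
    data = standby_files[record["name"]]
    data[offset:offset + len(block)] = block


def on_sync_complete():
    """Apply the stored attributes, then delete the first record."""
    record = linear_table.popleft()
    applied_attrs[record["name"]] = record["attrs"]


load_record("/data/a.txt", {"mode": 0o644})
load_record("/data/b.txt", {"mode": 0o600})

on_difference_data(4, b"BBBB")  # belongs to /data/a.txt, the first record
on_sync_complete()              # /data/a.txt finished
on_sync_complete()              # /data/b.txt had no differences, flag still sent

print(bytes(standby_files["/data/a.txt"]), applied_attrs, len(linear_table))
```

Because the completion flag is sent even when a file has no differences, the head of the queue always names the file the source end is currently processing.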
FIG. 2 is a flowchart illustrating steps of a data synchronization method based on difference data according to the present invention. As shown in fig. 2, the data synchronization method based on difference data of the present invention includes the following steps:
step S1, the source end sends the file name information and file attributes of all files that need to be synchronized to the standby end, and the standby end stores them in a separate storage file. In the specific embodiment of the present invention, the file name information includes, but is not limited to, a folder name and a file name: the folder name marks the path where the file is located and the file name marks the file to be synchronized, so the file name information contains the full-path file name of the file to be synchronized. The file attributes include, but are not limited to, the access rights, creation time, modification time, and size of the file. That is, the source end traverses the files to be synchronized and sends the folder, file name, and file attributes to the standby end, and the standby end receives them and writes them into the storage file.
Step S2, the standby end continuously reads the file name information and file attributes from the storage file, obtains the corresponding file on the standby end's local machine according to the file name information of each file, calculates the checksum of each data block in the corresponding file, and continuously sends the file name information of the corresponding file and the checksum of each data block to the source end.
Step S3, the source end continuously receives the file name information sent by the standby end and the checksum of each data block of the file, calculates the data block checksums of the corresponding synchronized file on the source end's local machine, compares them with the data block checksums of the corresponding file sent by the standby end, and sends the difference data to the standby end according to the comparison result.
Step S4, after receiving the difference data sent by the source end, the standby end writes the difference data into the corresponding file on the standby end's local machine.
Preferably, in step S2, the standby end starts the check thread. Once started, the check thread continuously reads storage units from the storage file, where each storage unit holds the file name information (folder and file name) and file attributes of one file that the source end needs to synchronize. Each read therefore yields the file name information and file attributes of one such file, which are loaded into a linear table; at the same time, the corresponding file on the standby end's local machine is obtained according to the read file name information, the file is read, the checksum of each of its data blocks is calculated, and the file name, the file attributes, and the calculated data block checksums are sent to the source end. By repeating this cycle, the check thread continuously sends the file name information and data block checksums of the files to be synchronized to the source end. Meanwhile, when the standby end receives difference data sent by the source end, it takes the corresponding file name information and file attributes from the loaded linear table and writes the difference data into the corresponding file, thereby synchronizing the difference data. After receiving the synchronization completion flag from the source end, that is, after synchronization of the current file is complete, the standby end sets the attributes of the corresponding standby-end file according to the file attributes taken from the linear table, which keeps the file attributes consistent at both ends, and deletes the corresponding record from the linear table.
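Step S1 and the storage file it produces can be sketched in Python as follows. The sketch assumes one JSON record per line as the storage-file layout and uses the POSIX-style attributes exposed by `os.stat`; the text does not define the storage format, so the JSON layout, the field names, and the helpers `collect_file_records`, `append_to_storage_file`, and `read_storage_file` are all hypothetical. Creation time is omitted because it is not available portably.

```python
import json
import os
import stat
import tempfile


def collect_file_records(root):
    """Source end: traverse root and describe every file to be synchronized."""
    records = []
    for folder, dirs, names in os.walk(root):
        dirs.sort()  # deterministic traversal order
        for name in sorted(names):
            st = os.stat(os.path.join(folder, name))
            records.append({
                "folder": folder,                 # path where the file lives
                "name": name,                     # file name within the folder
                "mode": stat.S_IMODE(st.st_mode), # access rights
                "mtime": st.st_mtime,             # modification time
                "size": st.st_size,               # file size in bytes
            })
    return records


def append_to_storage_file(storage_path, records):
    """Standby end: persist received records, one JSON object per line."""
    with open(storage_path, "a", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")


def read_storage_file(storage_path):
    """Check thread: yield the stored records in their original order."""
    with open(storage_path, encoding="utf-8") as f:
        for line in f:
            yield json.loads(line)


with tempfile.TemporaryDirectory() as tmp:
    root = os.path.join(tmp, "root")
    os.mkdir(root)
    with open(os.path.join(root, "a.txt"), "wb") as f:
        f.write(b"hello")

    storage = os.path.join(tmp, "storage.jsonl")
    append_to_storage_file(storage, collect_file_records(root))
    loaded = list(read_storage_file(storage))

print(loaded[0]["name"], loaded[0]["size"])  # a.txt 5
```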
Examples
As shown in fig. 3, in the present embodiment, the data synchronization process based on difference data of the present invention is as follows:
Step 1. The source end (that is, the production end) traverses all files that need to be synchronized and sends their file name information and file attributes to the backup end (that is, the target end).
Step 2. The backup end receives the file name information and file attributes of the files to be synchronized from the source end and writes them into a separate storage file.
Step 3. The backup end starts a check thread, which works as follows: it continuously reads the file name information and file attributes of the files to be synchronized that are stored in the storage file; each read yields the file name information and attributes of one file that the source end needs to synchronize; it loads the read file name information and attributes into a linear table; it locates the corresponding local file on the backup end according to the file name information, reads that file, and calculates the checksum of each of its data blocks; and it sends the file name information and the data block checksums to the source end. Repeating this cycle, the check thread on the backup end continuously sends the file name information and data block checksums of the files to be synchronized to the source end and loads the file name information and file attributes into the linear table.
Step 4. The source end continuously receives the file name information sent by the backup end together with the data block checksums of the corresponding file on the backup end. According to the file name information, the source end reads its own copy of the file, calculates its data block checksums, compares them with the received checksums, and thereby determines and synchronizes the difference data.
Step 5. The backup end receives the difference data, takes the file name information and attributes out of the loaded linear table, and writes the difference data according to the file name information. After synchronization of the current file is complete, it sets the file attributes of the current file and deletes the corresponding record from the linear table, so that when the next file is synchronized, the next unit of the linear table, namely the file name and file attributes of that file, is taken out in order.
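The source-side comparison in step 4 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the method as implemented by the applicant: the function name `find_difference`, the fixed 4096-byte block size, the SHA-256 checksum, and the representation of difference data as `(offset, bytes)` pairs are all hypothetical choices, since the text does not specify them.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed; must match the block size used on the backup end


def find_difference(source_path, backup_checksums, block_size=BLOCK_SIZE):
    """Return (offset, data) for every source block whose checksum differs
    from the checksum the backup end reported for the same block index."""
    diffs = []
    with open(source_path, "rb") as f:
        index = 0
        while True:
            block = f.read(block_size)
            if not block:
                break
            local = hashlib.sha256(block).hexdigest()
            # Blocks beyond the end of the backup file have no remote checksum
            # and are therefore always treated as different.
            remote = backup_checksums[index] if index < len(backup_checksums) else None
            if local != remote:
                diffs.append((index * block_size, block))
            index += 1
    return diffs
```

An empty result means the two copies already agree block for block, in which case only the synchronization completion mark needs to be sent.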
In the present invention, step 3 and step 4 can proceed continuously and concurrently. The backup end starts an independent check thread that continuously reads the storage file, obtains the file name and attributes of each file to be synchronized, calculates the data block checksums, and sends them to the source end. While the backup end keeps sending file name information and data block checksums, the source end keeps receiving them, comparing them, and synchronizing the difference data, which improves the efficiency of data synchronization.
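The overlap between checksum calculation and comparison can be illustrated with a small producer/consumer sketch in Python. Everything here is an assumption made for illustration: both ends are simulated inside one process, a bounded `queue.Queue` stands in for the network connection, and `run_pipeline`, `checksum_fn`, and `compare_fn` are hypothetical names.

```python
import queue
import threading


def run_pipeline(names, checksum_fn, compare_fn):
    """Run the check thread (producer) concurrently with the comparison (consumer)."""
    channel = queue.Queue(maxsize=8)  # stands in for the backup-to-source link
    done = object()                   # sentinel marking the end of the file list
    results = []

    def check_thread():
        # Backup end: compute checksums file by file and send them on.
        for name in names:
            channel.put((name, checksum_fn(name)))
        channel.put(done)

    worker = threading.Thread(target=check_thread)
    worker.start()

    # Source end: compare each file as soon as its checksums arrive,
    # without waiting for the remaining files to be checked.
    while True:
        item = channel.get()
        if item is done:
            break
        name, sums = item
        results.append((name, compare_fn(name, sums)))

    worker.join()
    return results
```

With this arrangement the comparison of one file overlaps with the checksum calculation of the next, so neither end waits for the other to finish the whole file list.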
In summary, in the data synchronization system and method based on difference data of the present invention, the source end sends the file name information and file attributes of all files to be synchronized to the backup end, and the backup end saves them into a separate storage file. The backup end then starts a check thread, which continuously reads the file name information and file attributes from the storage file, obtains the corresponding file on the backup end local machine according to the file name information, calculates the checksum of each data block in that file, and continuously sends the file name information and the data block checksums to the source end. The source end continuously receives the file name information and data block checksums sent by the backup end, calculates the data block checksums of the corresponding file on the source end local machine, compares them with the checksums sent by the backup end, and sends the difference data to the backup end according to the comparison result, thereby synchronizing the difference data.
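One possible layout for the separate storage file is sketched below in Python. The text does not describe the on-disk format, so this is purely an assumption: each storage unit is written as one JSON line, and the check thread remembers a byte offset so that it can keep reading while the receiver is still appending. The names `append_record` and `read_records` are hypothetical.

```python
import json


def append_record(storage_path, name, attrs):
    """Append one storage unit (file name information plus attributes)."""
    with open(storage_path, "a", encoding="utf-8") as f:
        f.write(json.dumps({"name": name, "attrs": attrs}) + "\n")


def read_records(storage_path, start_offset=0):
    """Read all complete records from `start_offset` onward.

    Returns (records, next_offset); pass next_offset to the following call
    to continue where this one stopped.
    """
    records = []
    with open(storage_path, "rb") as f:
        f.seek(start_offset)
        while True:
            line = f.readline()
            if not line.endswith(b"\n"):
                break  # end of file, or a record that is still being written
            records.append(json.loads(line))
            start_offset = f.tell()
    return records, start_offset
```

Keeping the list of pending files on disk rather than in memory means the backup end only has to hold the records it has already checked but not yet finished synchronizing.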
The foregoing embodiments are merely illustrative of the principles and utilities of the present invention and are not intended to limit the invention. Modifications and variations can be made to the above-described embodiments by those skilled in the art without departing from the spirit and scope of the present invention. Therefore, the scope of the invention should be determined from the following claims.

Claims (10)

1. A data synchronization system based on difference data, comprising:
a source end, which sends the file name information and file attributes of the files to be synchronized to a backup end, continuously receives the file name information and data block checksums sent by the backup end, calculates the data block checksums of the corresponding file on the source end local machine, compares them with the data block checksums of the corresponding file received from the backup end, and sends difference data to the backup end for synchronization according to the comparison result;
the backup end, which stores the file name information and file attributes of the files to be synchronized, as sent by the source end, into a storage file, continuously reads the file name information and file attributes from the storage file, obtains the corresponding file on the backup end local machine according to the read file name information, calculates the checksum of each data block in the corresponding file, sends the file name information and the calculated data block checksums to the source end, and, after receiving the difference data sent by the source end, writes the difference data into the corresponding file on the backup end local machine.
2. The difference data based data synchronization system of claim 1, wherein the source end further comprises:
a synchronization file information traversal and sending unit, configured to traverse the information of all files to be synchronized on the source end and to send the file name information and file attributes of all files to be synchronized to the backup end;
a data block checksum receiving unit, configured to continuously receive the file name information sent by the backup end and the checksum of each data block of the corresponding file;
a checksum calculation and comparison unit, configured to obtain the corresponding file on the source end local machine according to the file name information received from the backup end, calculate its data block checksums, and compare the calculation result with the data block checksums of the corresponding file received from the backup end;
and a comparison result processing unit, configured to determine the difference data according to the comparison result of the checksum calculation and comparison unit, send the difference data to the backup end, and send a file synchronization completion mark to the backup end after synchronization of the current file is completed.
3. The difference data based data synchronization system of claim 2, wherein the backup end further comprises:
a storage unit, configured to write the file name information and file attributes of the files to be synchronized, as sent by the source end, into a separate storage file;
a checksum calculation unit, configured to continuously read the file name information and file attributes stored in each storage unit of the storage file, obtain the corresponding file on the backup end local machine according to the read file name information, calculate the checksum of each data block in that file, and send the file name information and the calculated checksum of each data block of the file to the source end;
and a data synchronization unit, configured to receive the difference data sent by the source end and write the difference data into the corresponding file on the backup end local machine.
4. The difference data based data synchronization system of claim 3, wherein the backup end further comprises:
a check thread starting unit, configured to start the check thread, so that the checksum calculation unit is started after the check thread is started.
5. The difference data based data synchronization system of claim 4, wherein: when the checksum calculation unit obtains the file name information and file attributes that the source end needs to synchronize, it loads the read file name information and file attributes into a linear table, reads the corresponding file on the backup end according to the read file name information, and calculates the checksum of each data block in the corresponding file.
6. The difference data based data synchronization system of claim 5, wherein: when the data synchronization unit receives the difference data, it takes the corresponding file name information and file attributes out of the loaded linear table, obtains the corresponding file on the backup end according to the file name information, and writes the difference data into the corresponding file on the backup end.
7. The difference data based data synchronization system of claim 5, wherein: if the comparison result processing unit determines from the comparison result that there is no difference data, it sends a file synchronization completion mark to the backup end after the comparison is completed.
8. The difference data based data synchronization system according to claim 6 or 7, wherein: when the data synchronization unit receives the file synchronization completion mark sent by the source end, it sets the file attributes of the corresponding file on the backup end according to the file attributes taken out of the linear table, and deletes the corresponding record from the linear table.
9. A data synchronization method based on difference data, comprising the following steps:
step S1, the source end sends the file name information and file attributes of the files to be synchronized to the backup end, and the backup end stores the file name information and file attributes into a separate storage file;
step S2, the backup end continuously reads the file name information and file attributes from the storage file, obtains the corresponding file on the backup end local machine according to the read file name information, calculates the checksum of each data block of the corresponding file, and sends the file name information and the calculated data block checksums of the file to the source end;
step S3, the source end continuously receives the file name information and data block checksums sent by the backup end, obtains the corresponding file on the source end local machine according to the file name information, calculates its data block checksums, compares them with the data block checksums received from the backup end, and sends the difference data to the backup end according to the comparison result;
step S4, after receiving the difference data sent by the source end, the backup end writes the difference data into the corresponding file on the backup end local machine.
10. The data synchronization method based on difference data according to claim 9, wherein: in step S2, when the backup end obtains from the storage file the file name information and file attributes that the source end needs to synchronize, the backup end loads the read file name information and file attributes into a linear table, reads the corresponding file on the backup end according to the file name information, and calculates the checksum of each data block in the corresponding file.
CN202010400280.0A 2020-05-13 2020-05-13 Data synchronization system and method based on difference data Pending CN111581030A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010400280.0A CN111581030A (en) 2020-05-13 2020-05-13 Data synchronization system and method based on difference data

Publications (1)

Publication Number Publication Date
CN111581030A (en) 2020-08-25

Family

ID=72113522

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010400280.0A Pending CN111581030A (en) 2020-05-13 2020-05-13 Data synchronization system and method based on difference data

Country Status (1)

Country Link
CN (1) CN111581030A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112015816A (en) * 2020-08-27 2020-12-01 北京字节跳动网络技术有限公司 Data synchronization method, device, medium and electronic equipment
CN112767767A (en) * 2021-01-29 2021-05-07 重庆子元科技有限公司 Virtual training system
CN114253924A (en) * 2021-12-21 2022-03-29 上海英方软件股份有限公司 Synchronization method, synchronization equipment and storage medium
CN115982109A (en) * 2023-03-20 2023-04-18 北京飞轮数据科技有限公司 Data synchronization method and device, electronic equipment and computer readable medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103605768A (en) * 2013-11-27 2014-02-26 浪潮电子信息产业股份有限公司 Massive file synchronization speed increasing method in storage systems
US20150154222A1 (en) * 2013-12-04 2015-06-04 International Business Machines Corporation Efficiency of File Synchronization in a Linear Tape File System
CN109542679A (en) * 2018-11-09 2019-03-29 安徽典典科技发展有限责任公司 A kind of variance data compares and synchronous method
CN109885421A (en) * 2019-02-18 2019-06-14 安徽典典科技发展有限责任公司 A kind of data difference comparative approach
CN110389937A (en) * 2019-07-26 2019-10-29 上海英方软件股份有限公司 A kind of method and system based on database in phase transmission file
CN110908830A (en) * 2019-10-18 2020-03-24 上海英方软件股份有限公司 Method for realizing file system to object storage difference comparison and backup through database

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112015816A (en) * 2020-08-27 2020-12-01 北京字节跳动网络技术有限公司 Data synchronization method, device, medium and electronic equipment
CN112767767A (en) * 2021-01-29 2021-05-07 重庆子元科技有限公司 Virtual training system
CN112767767B (en) * 2021-01-29 2022-12-23 重庆子元科技有限公司 Virtual training system
CN114253924A (en) * 2021-12-21 2022-03-29 上海英方软件股份有限公司 Synchronization method, synchronization equipment and storage medium
CN115982109A (en) * 2023-03-20 2023-04-18 北京飞轮数据科技有限公司 Data synchronization method and device, electronic equipment and computer readable medium
CN115982109B (en) * 2023-03-20 2023-07-25 北京飞轮数据科技有限公司 Data synchronization method, device, electronic equipment and computer readable medium

Similar Documents

Publication Publication Date Title
CN111581030A (en) Data synchronization system and method based on difference data
CN109522160B (en) Method and system for comparing and backing up file directory by saving file information abstract
US9773059B2 (en) Tape data management
US20070156778A1 (en) File indexer
CN107479881B (en) Method for synchronizing difference codes, storage medium, electronic device and system
JP5237661B2 (en) File synchronization apparatus, file synchronization method, and file synchronization program
CN103339615B (en) storage system and information processing method
CN103838645B (en) Remote difference synthesis backup method based on Hash
CN113254394B (en) Snapshot processing method, system, equipment and storage medium
CN110647514A (en) Metadata updating method and device and metadata server
CN107577549A (en) It is a kind of to store the method for testing for deleting function again
CN112463026A (en) Method and apparatus for deduplication of supplemental data in a distributed object storage system
CN113190448B (en) Test code updating method and device, electronic equipment and storage medium
CN109344163B (en) Data verification method and device and computer readable medium
CN110908830A (en) Method for realizing file system to object storage difference comparison and backup through database
CN109101644A (en) A kind of sound state journal file scanning collecting method
CN113419897A (en) File processing method and device, electronic equipment and storage medium thereof
CN112000850A (en) Method, device, system and equipment for data processing
CN114491145B (en) Metadata design method based on stream storage
KR102400723B1 (en) Apparatus for recovering metadata of deleted files based on FAT32 and apparatus method thereof
CN114741552A (en) Video file storage method and medium with custom format
CN114217741A (en) Storage method of storage device and storage device
CN112131194A (en) File storage control method and device of read-only file system and storage medium
CN114356232B (en) Data reading and writing method and device
CN111737252B (en) Data fusion method and system based on data center

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200825