WO2017097106A1 - Method and apparatus for transmitting a file difference - Google Patents

Method and apparatus for transmitting a file difference

Info

Publication number
WO2017097106A1
Authority
WO
WIPO (PCT)
Prior art keywords
file
blocks
difference
data
matching
Prior art date
Application number
PCT/CN2016/106650
Other languages
English (en)
French (fr)
Inventor
祝彪
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2017097106A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/40 Support for services or applications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/06 Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Definitions

  • the present invention relates to the field of communications, and in particular, to a method and apparatus for transmitting a file difference.
  • The "ignoring" upload technique means that, after determining the file A to be uploaded, the client calculates a feature code of file A and sends the feature code to the server.
  • The feature code may be computed with version 5 of the Message-Digest Algorithm (MD5) or version 1 of the Secure Hash Algorithm (SHA1).
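  • As a minimal sketch of the feature-code calculation described above, the following uses Python's standard `hashlib` module. The function name `feature_code` and the sample bytes are illustrative assumptions and are not taken from the patent.

```python
import hashlib

def feature_code(data: bytes, algorithm: str = "md5") -> str:
    """Return a hex digest that serves as the file's feature code.

    `algorithm` may be "md5" or "sha1", the two options named in the text.
    """
    return hashlib.new(algorithm, data).hexdigest()

# Hypothetical content of file A, used only for illustration.
file_a = b"hello world"
code_md5 = feature_code(file_a)
code_sha1 = feature_code(file_a, "sha1")
```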
  • If a file with the same feature code already exists on the server, the server can directly return an upload success message to the client, and the client does not need to upload file A. If no file with the same feature code exists on the server, but there is a file C with the same file name as file A,
  • the client can also calculate, by the Rsync algorithm, the difference data of file A compared to file C and send the difference data to the server. The server combines file C with the difference data and generates a file consistent with file A on the server, so that the client does not need to upload the entire file A to the server.
  • the client sequentially sends the identifiers of the difference data and the matching data to the server according to the order of the difference data and the matching data in the file A.
  • The server combines the difference data and the data in file C corresponding to the matching data identifiers according to the receiving order. That is, when uploading the difference data, the client needs to send an upload request to the server for each piece of difference data, so the uploading process is complicated. When the difference data between file A and file C is large and scattered across multiple positions in file A, the repeated uploads increase the load on both the client and the server.
  • A first aspect provides a method for transmitting a file difference. A server stores a first file, and the method is used to save a second file to the server. The method includes: the client determines, in the second file, N pieces of difference data and L matching blocks of the second file relative to the first file, where the first file contains a data block consistent with each matching block, the second file consists of the N pieces of difference data and the L matching blocks, and N and L are both positive integers greater than 0; the client merges the N pieces of difference data into M difference blocks, where M is a positive integer less than N; and the client sends the M difference blocks and attribute information of the L matching blocks to the server, where the attribute information of the L matching blocks is used to merge the M difference blocks and the data blocks in the first file that are consistent with the matching blocks into the second file.
  • Thus, when uploading the difference data, the client does not send one upload request to the server for each piece of difference data at a different location. Instead, it first merges the difference data into difference blocks with a larger data volume and uploads with the difference block as the smallest unit, which reduces the number of requests the client sends to the server and thereby the load on the server.
  • The attribute information of the L matching blocks includes an identifier of the data block in the first file that is consistent with each matching block, and the offset of each matching block in the second file. The identifier of the data block is used to determine the L data blocks in the first file that are consistent with the L matching blocks. The offset is the difference between the start address of each matching block in the second file and the first address of the second file, and the offsets of the L matching blocks are used to merge the M difference blocks and the L data blocks into the second file.
  • the attribute information of the matching block needs to correctly indicate the matching block and the difference.
  • The attribute information of the matching block includes the identifier of the data block in the first file that is consistent with the matching block, and the offset of the matching block in the second file.
  • The client determining, in the second file, the N pieces of difference data and the L matching blocks of the second file relative to the first file includes: receiving file block information of the first file sent by the server; and determining, according to the file block information and by using the Rsync algorithm, the matching blocks in the second file that are consistent with data blocks in the first file.
  • The file block information includes an identifier of each data block in the first file.
  • After determining a matching block in the second file that is consistent with a data block in the first file, the client records the identifier of the data block that is consistent with the matching block, so that after receiving that identifier, the server can determine the data block in the first file that is consistent with the matching block.
  • A method for transmitting a file difference is provided, where a server stores a first file and the method is used to save a second file to the server. The method includes: the server receives M difference blocks and attribute information of L matching blocks sent by the client, where the M difference blocks are obtained by merging N pieces of difference data of the second file relative to the first file, the first file contains a data block consistent with each matching block, the second file consists of the N pieces of difference data and the L matching blocks, N and L are both positive integers greater than 0, and M is a positive integer less than N; and the server merges, according to the attribute information of the L matching blocks, the M difference blocks and the data blocks in the first file that are consistent with the matching blocks into the second file.
  • The server receives the difference data uploaded by the client with the difference block as the minimum unit. Compared with the prior art, in which the server receives N upload requests sent by the client for the N pieces of difference data, the above solution reduces the number of upload requests processed by the server and thereby the load on the server.
  • The attribute information of the L matching blocks includes an identifier of the data block in the first file that is consistent with each matching block, and the offset of each matching block in the second file. The offset is the difference between the start address of the matching block in the second file and the first address of the second file.
  • Merging, according to the attribute information of the L matching blocks, the M difference blocks and the data blocks in the first file that are consistent with the matching blocks into the second file includes: determining, according to the identifiers of the data blocks, the L data blocks in the first file that are consistent with the L matching blocks; and merging the M difference blocks and the L data blocks into the second file according to the offsets of the L matching blocks.
  • A third aspect provides a client, where a server stores a first file and the client is configured to save a second file to the server. The client includes: a determining unit, configured to determine, in the second file, N pieces of difference data and L matching blocks of the second file relative to the first file, where the first file contains a data block consistent with each matching block, the second file consists of the N pieces of difference data and the L matching blocks, and N and L are both positive integers greater than 0; a combining unit, configured to merge the N pieces of difference data into M difference blocks, where M is a positive integer less than N; and a sending unit, configured to send the M difference blocks and attribute information of the L matching blocks to the server, where the attribute information of the L matching blocks is used to merge the M difference blocks and the data blocks in the first file that are consistent with the matching blocks into the second file.
  • The attribute information of the L matching blocks includes an identifier of the data block in the first file that is consistent with each matching block, and the offset of each matching block in the second file. The identifier of the data block is used to determine the L data blocks in the first file that are consistent with the L matching blocks. The offset is the difference between the start address of each matching block in the second file and the first address of the second file, and the offsets of the L matching blocks are used to merge the M difference blocks and the L data blocks into the second file.
  • The above division of the client's functional units is only a logical division of functions.
  • The physical implementation of the functional units may also take different forms.
  • For example, the determining unit may be a central processing unit or an application-specific integrated circuit, the combining unit may be a data reading and writing device capable of reading and writing data according to instructions of a processor, and the sending unit may be a transmitter.
  • A fourth aspect provides a server that stores a first file. The server includes: a receiving unit, configured to receive M difference blocks and attribute information of L matching blocks sent by the client, where the M difference blocks are obtained by merging N pieces of difference data, relative to the first file, of a second file to be saved to the server, the first file contains a data block consistent with each matching block, the second file consists of the N pieces of difference data and the L matching blocks, N and L are both positive integers greater than 0, and M is a positive integer less than N; and a merging unit, configured to merge, according to the attribute information of the L matching blocks, the M difference blocks and the data blocks in the first file that are consistent with the matching blocks into the second file.
  • The attribute information of the L matching blocks includes an identifier of the data block in the first file that is consistent with each matching block, and the offset of each matching block in the second file. The offset is the difference between the start address of the matching block in the second file and the first address of the second file.
  • The merging unit is configured to: determine, according to the identifiers of the data blocks, the L data blocks in the first file that are consistent with the L matching blocks; and merge the M difference blocks and the L data blocks into the second file according to the offsets of the L matching blocks.
  • The above division of the server's functional units is only a logical division of functions, and an actual implementation may use another division.
  • For example, the merging unit may be a data reading and writing device capable of reading and writing data according to instructions of a processor, and the receiving unit may be a receiver.
  • A client is provided, where a server stores a first file and the client is configured to save a second file to the server. The client includes a processor, a transmitter, a receiver, and a communication bus, where the processor, the transmitter, and the receiver communicate with each other through the communication bus. The processor is configured to: determine, in the second file, N pieces of difference data and L matching blocks of the second file relative to the first file, where the first file contains a data block consistent with each matching block, the second file consists of the N pieces of difference data and the L matching blocks, and N and L are both positive integers greater than 0; merge the N pieces of difference data into M difference blocks, where M is a positive integer less than N; and send the M difference blocks and attribute information of the L matching blocks to the server, where the attribute information of the L matching blocks is used to merge the M difference blocks and the data blocks in the first file that are consistent with the matching blocks into the second file.
  • The attribute information of the L matching blocks includes an identifier of the data block in the first file that is consistent with each matching block, and the offset of each matching block in the second file. The identifier of the data block is used to determine the L data blocks in the first file that are consistent with the L matching blocks. The offset is the difference between the start address of each matching block in the second file and the first address of the second file, and the offsets of the L matching blocks are used to merge the M difference blocks and the L data blocks into the second file.
  • A server is provided, including a memory, a processor, a transmitter, a receiver, and a communication bus, where the processor, the transmitter, and the receiver communicate with each other through the communication bus, and the memory stores the first file. The processor is configured to: receive M difference blocks and attribute information of L matching blocks sent by the client, where the M difference blocks are obtained by merging N pieces of difference data, relative to the first file, of a second file to be saved to the server, the first file contains a data block consistent with each matching block, the second file consists of the N pieces of difference data and the L matching blocks, N and L are both positive integers greater than 0, and M is a positive integer less than N; and merge, according to the attribute information of the L matching blocks, the M difference blocks and the data blocks in the first file that are consistent with the matching blocks into the second file.
  • The attribute information of the L matching blocks includes an identifier of the data block in the first file that is consistent with each matching block, and the offsets of the L matching blocks in the second file. The offset is the difference between the start address of the matching block in the second file and the first address of the second file. The processor is specifically configured to: determine, according to the identifiers of the data blocks, the L data blocks in the first file that are consistent with the L matching blocks; and merge the M difference blocks and the L data blocks into the second file according to the offsets of the L matching blocks.
  • The client merges the N pieces of difference data into M difference blocks in ascending order of the addresses of the N pieces of difference data in the second file.
  • The server merging the M difference blocks and the L data blocks into the second file according to the offsets of the L matching blocks includes: starting from storage address $D_1 + P_K$ of the server, writing the data block in the first file that is consistent with the Kth matching block, where $D_1$ is the first address of the storage area in the server used to store the second file, $P_K$ is the offset of the Kth matching block, and $1 \le K \le L$; and, starting from storage address $D_1$ of the server, sequentially writing the difference data in the M difference blocks at the free addresses of the storage area.
  • The above is only one example of how the server merges the M difference blocks and the L data blocks; those skilled in the art should understand that the present invention does not limit the order in which the difference blocks and the data blocks are merged.
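  • The merge variant just described (place each data block at its offset, then fill the remaining free addresses with difference data) can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: index 0 of the buffer stands in for address D_1, data blocks are identified by 1-based numbers with a fixed block size, and the total size of the second file is assumed to be known to the server. The function name and sample data are hypothetical.

```python
def merge_by_free_addresses(first_file: bytes, block_size: int,
                            matches: list[tuple[int, int]],
                            diff_blocks: list[bytes], total_size: int) -> bytes:
    """matches holds (offset P_K in the second file, number of the data block in the first file)."""
    area = bytearray(total_size)   # storage area; index 0 plays the role of D_1
    free = [True] * total_size     # which addresses are still unwritten
    # Write every matching data block at address D_1 + P_K.
    for offset, number in matches:
        start = (number - 1) * block_size
        block = first_file[start:start + block_size]
        area[offset:offset + len(block)] = block
        free[offset:offset + len(block)] = [False] * len(block)
    # Fill the free addresses, in ascending order, with the difference data.
    stream = iter(b"".join(diff_blocks))
    for addr in range(total_size):
        if free[addr]:
            area[addr] = next(stream)
    return bytes(area)

second = merge_by_free_addresses(b"AAAABBBBCCCC", 4, [(2, 2), (7, 3)],
                                 [b"xxy", b"zz"], 13)
```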
  • FIG. 1 is a flowchart of a method for determining a difference data between two files by using an Rsync algorithm according to an embodiment of the present invention
  • FIG. 3 is a schematic flowchart diagram of a method for transmitting a file difference according to an embodiment of the present disclosure
  • FIG. 4 is a schematic diagram of a server combining a delta block and a data block in a first file according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of a client combining the difference data into a difference block according to an embodiment of the present invention
  • FIG. 6 is an example of a server merging difference block and a data block in a first file according to an embodiment of the present invention
  • FIG. 7 is a schematic structural diagram of a client according to an embodiment of the present disclosure.
  • FIG. 8 is a schematic structural diagram of a server according to an embodiment of the present disclosure.
  • FIG. 9 is a schematic structural diagram of another client according to an embodiment of the present disclosure.
  • FIG. 10 is a schematic structural diagram of another server according to an embodiment of the present invention.
  • the Rsync algorithm is an efficient algorithm for synchronizing files, which can find different data in two files.
  • the method for the client to apply the Rsync algorithm to determine different data of file A compared to file C includes:
  • the client receives file block information of the file C sent by the server.
  • The file block information includes the number of each data block of file C and a block feature code of each data block. The block feature code includes a weak check code and a strong check code of the data block; the weak check code may be Adler-32, and the strong check code may be MD5.
  • Each data block in file C has the same block size.
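  • A minimal sketch of how such file block information could be built, assuming Adler-32 (via `zlib.adler32`) as the weak check code and MD5 (via `hashlib.md5`) as the strong check code, as the text suggests. The `BlockInfo` type, the function name, and the 1-based numbering are illustrative assumptions.

```python
import hashlib
import zlib
from typing import NamedTuple

class BlockInfo(NamedTuple):
    number: int   # number of the data block in file C (1-based here by assumption)
    weak: int     # weak check code (Adler-32)
    strong: str   # strong check code (MD5 hex digest)

def file_block_info(data: bytes, block_size: int) -> list[BlockInfo]:
    """Split `data` into fixed-size blocks and compute both check codes for each."""
    infos = []
    for number, start in enumerate(range(0, len(data), block_size), start=1):
        block = data[start:start + block_size]
        infos.append(BlockInfo(number, zlib.adler32(block),
                               hashlib.md5(block).hexdigest()))
    return infos

info = file_block_info(b"aaaabbbbcccc", 4)
```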
  • the client reads a first data block of a block size from a starting position of the file A.
  • The client calculates a weak check code of the first data block and checks whether file C contains a second data block whose weak check code is the same as that of the first data block.
  • If such a second data block exists, step S104 is performed; if file C contains no second data block with the same weak check code as the first data block, step S108 is performed.
  • The client calculates a strong check code of the first data block and determines whether the strong check code of the second data block is the same as the strong check code of the first data block.
  • If the strong check codes of the first data block and the second data block differ, step S105 is performed; if they are the same, steps S106 and S107 are performed.
  • the client offsets one byte from a starting position of the first data block, and reads a third data block of a block size.
  • the client re-executes from step S103 by using the third data block as the first data block.
  • the client marks the first data block to match the second data block.
  • the client establishes a correspondence between the first data block and the number of the second data block.
  • the client offsets a block size from a starting position of the first data block, and reads a fourth data block of a block size.
  • the client starts the execution from step S103 by using the fourth data block as the first data block.
  • The client offsets one byte from the starting position of the first data block and reads a fifth data block of one block size.
  • the client re-executes from step S103 by using the fifth data block as the first data block.
  • the client can determine the data block in the file A that matches the data block of the file C through the above steps S101 to S108.
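  • The matching loop of steps S101 to S108 can be sketched as below. This is a simplified illustration, not the patent's implementation: it recomputes Adler-32 for every window instead of using a rolling checksum, and the function names, return shapes, and sample files are assumptions made for the example.

```python
import hashlib
import zlib

def block_table(file_c: bytes, size: int) -> dict[int, list[tuple[str, int]]]:
    """Map weak check code -> [(strong check code, block number)] for file C."""
    table: dict[int, list[tuple[str, int]]] = {}
    for number, start in enumerate(range(0, len(file_c), size), start=1):
        block = file_c[start:start + size]
        table.setdefault(zlib.adler32(block), []).append(
            (hashlib.md5(block).hexdigest(), number))
    return table

def find_matches(file_a: bytes, table, size: int):
    """Return (matches, diffs): matches are (offset in A, block number in C),
    diffs are (offset in A, difference bytes)."""
    matches, diffs = [], []
    pos = diff_start = 0
    while pos + size <= len(file_a):
        block = file_a[pos:pos + size]                 # read one block-sized window
        number = None
        candidates = table.get(zlib.adler32(block))    # weak check code lookup
        if candidates:
            strong = hashlib.md5(block).hexdigest()    # strong check code comparison
            number = next((n for s, n in candidates if s == strong), None)
        if number is None:
            pos += 1                                   # no match: slide by one byte
            continue
        if diff_start < pos:                           # bytes skipped so far are difference data
            diffs.append((diff_start, file_a[diff_start:pos]))
        matches.append((pos, number))                  # record the matching block
        pos += size                                    # match: advance by one block size
        diff_start = pos
    if diff_start < len(file_a):                       # trailing difference data
        diffs.append((diff_start, file_a[diff_start:]))
    return matches, diffs

file_c = b"AAAABBBBCCCC"
file_a = b"xxBBBByCCCCzz"
matches, diffs = find_matches(file_a, block_table(file_c, 4), 4)
```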
  • file C includes data block 1, data block 2, data block 3, data block 4, and number
  • the client determines, by the Rsync algorithm, the matching block 2 in the file A that is consistent with the data block 2, the matching block 3 that is consistent with the data block 3, and the matching block 4 that is consistent with the data block 4,
  • The difference data of file A relative to file C is each piece of data in file A that is separated off by the matching blocks, such as difference data 1, difference data 2, difference data 3, and difference data 4 shown in FIG. 2.
  • The client sequentially sends to the server difference data 1, the number of matching block 2, difference data 2, the number of matching block 3, the number of matching block 4, difference data 3, the number of matching block 3, and difference data 4.
  • the server combines the received difference data and the data block corresponding to the matching block number in the file C in the order of reception.
  • the client described in this document may be a client device such as a mobile phone, a computer, or a tablet computer
  • the server may be a cloud storage server.
  • An embodiment of the present invention provides a method for transmitting a file difference, which addresses the technical problem in the prior art that uploading once for each piece of difference data increases the load on the server. In the method, the server stores the first file.
  • the method includes:
  • the client determines a second file to be uploaded, and calculates a feature code of the second file.
  • The client sends the file name and the feature code of the second file to the server.
  • the server queries whether there is a first file with the same file name as the second file.
  • If not, step S304 and step S305 are performed; if yes, step S306 to step S313 are performed.
  • the server sends an upload notification message to the client.
  • The server calculates a feature code of the first file and determines whether the feature code of the second file is the same as the feature code of the first file.
  • If yes, step S307 is performed; if not, steps S308 to S313 are performed.
  • the server sends an upload success message to the client.
  • The server divides the first file into a plurality of data blocks and determines the file block information of the first file.
  • the server sends the file block information to the client.
  • The client determines, according to the file block information and by using the Rsync algorithm, the N pieces of difference data and the L matching blocks of the second file relative to the first file.
  • N and L are both positive integers greater than zero.
  • The block size the server uses to partition the first file directly affects the data block matching between the second file and the first file.
  • The block size used by the server to partition the first file may be preset, for example, to 10 kilobytes (KB). In that case, starting from the first address of the first file, the server divides every 10 KB of data into one data block, where the last data block may be smaller than 10 KB.
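  • The fixed-size partitioning in the example above can be sketched as follows. The function name and the choice to return (offset, size) pairs are assumptions made for illustration; the 25 KB file size is a hypothetical input.

```python
BLOCK_SIZE = 10 * 1024  # preset block size of 10 KB, as in the example above

def partition(file_size: int, block_size: int = BLOCK_SIZE) -> list[tuple[int, int]]:
    """Return (offset, size) for each data block, starting at the first address of the file."""
    return [(start, min(block_size, file_size - start))
            for start in range(0, file_size, block_size)]

# A hypothetical 25 KB first file yields two full blocks and a shorter last block.
blocks = partition(25 * 1024)
```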
  • the client combines the N pieces of difference data into M difference blocks.
  • M is a positive integer less than N.
  • The size of the difference block directly determines how many times the client uploads data to the server: the larger the difference block, the fewer difference blocks there are, and the fewer uploads the client performs. However, the size of the difference block is limited by the transmission capability of the client. In a specific implementation, the size of the difference block may be preset according to the client's current network bandwidth and the data transmission format. The difference blocks may have different sizes or the same size; for example, the size of the difference block may be set equal to the size of the matching block. The embodiment of the present invention does not limit this.
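  • One possible way to merge N pieces of difference data into M difference blocks under a preset size limit is a greedy packing in ascending address order, sketched below. The packing rule, the function name, and the sample data are assumptions; the text only requires that the pieces be merged in address order and that the block size be preset.

```python
def merge_into_blocks(diffs: list[tuple[int, bytes]], max_size: int) -> list[bytes]:
    """Greedily pack difference data, in ascending address order, into difference blocks.

    `diffs` holds (address in the second file, difference bytes). A piece that would
    push the current block past `max_size` starts a new block.
    """
    blocks: list[bytes] = []
    current = b""
    for _, data in sorted(diffs, key=lambda d: d[0]):
        if current and len(current) + len(data) > max_size:
            blocks.append(current)
            current = b""
        current += data
    if current:
        blocks.append(current)
    return blocks

# Four hypothetical pieces of difference data (N = 4) become three blocks (M = 3).
diffs = [(11, b"zz"), (0, b"xx"), (6, b"y"), (20, b"wwww")]
blocks = merge_into_blocks(diffs, max_size=4)
```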
  • the client sends the M difference blocks and the attribute information of the L matching blocks to the server.
  • The client may perform the following operations for each difference block: the client sends an upload request to the server and receives the response message that the server sends for the upload request; according to the response message, the client sends to the server the difference block and the attribute information of the matching blocks adjacent to each piece of difference data in the difference block.
  • In this way, the server receives M upload requests sent by the client for the M difference blocks.
  • Compared with the prior art, in which the server receives N upload requests sent by the client for the N pieces of difference data, this solution reduces the number of upload requests the server processes, which in turn reduces the load on the server.
  • the client may also send an upload request for multiple difference blocks, that is, the client may upload the difference block as a minimum unit, which is not limited by the present invention.
  • After receiving the M difference blocks and the attribute information of the L matching blocks, the server merges, according to the attribute information of the L matching blocks, the M difference blocks and the data blocks in the first file that are consistent with the matching blocks into the second file.
  • The M difference blocks are transmitted to the server in an order-preserving manner, that is, the order in which the client sends the difference blocks is the same as the order in which the server receives them, so that the server can select the difference blocks for merging in order of reception.
  • With the above solution, the client determines, in the second file, the N pieces of difference data and the L matching blocks of the second file relative to the first file, merges the N pieces of difference data into M difference blocks, and sends the M difference blocks and the attribute information of the L matching blocks to the server, so that the server merges, according to the attribute information of the L matching blocks, the M difference blocks and the data blocks in the first file that are consistent with the matching blocks into the second file.
  • In the prior art, the client uploads one piece of difference data to the server each time.
  • In this solution, the client can upload one difference block to the server each time, which reduces the number of client uploads and thereby the load on the server.
  • The client may also send all the difference blocks to the server first and then send the attribute information of the L matching blocks to the server, or transmit them simultaneously.
  • This embodiment of the present invention does not limit this.
  • the embodiments described in the specification are all preferred embodiments, and the actions involved are not necessarily required by the present invention.
  • The attribute information of the L matching blocks may include an identifier of the data block in the first file that is consistent with each matching block, and the offsets of the L matching blocks in the second file. The offset is the difference between the start address of the matching block in the second file and the first address of the second file.
  • the server determines, according to the identifier of the data block that is consistent with each matching block in the first file, the L data blocks that are consistent with the L matching blocks in the first file.
  • the identifier of the data block may be a number of the data block.
  • The file block information sent by the server includes the numbers of the data blocks of the first file, so that after determining, by the Rsync algorithm, a matching block in the second file that matches a data block of the first file, the client can record the number of that data block and send the number to the server as the identifier of the data block in the first file that is consistent with the matching block.
  • The identifier of the data block may also be the block size of the data block in the first file together with the offset of the data block in the first file. That is, in step S309 shown in FIG. 3, the file block information sent by the server to the client includes the block size and the offset of each data block in the first file.
  • The client records the block size and the offset of the data block. After receiving this identifier of the data block from the client, the server can use the offset and the block size to determine the data block in the first file that is consistent with the matching block.
  • The server reads the offset $P_1$ of the first of the L matching blocks, reads from the M difference blocks, in order of increasing address, the first piece of difference data, whose size is $P_1$, and writes it starting from the first address $D_1$ of the storage area for the second file in the server.
  • In step S402, $P_1$ is not equal to 0. If $P_1$ is equal to 0, the first file and the second file begin with the same data block, so the server writes the first matching block directly from address $D_1$; step S402 is skipped and step S403 is executed directly.
  • The server writes, starting from address $D_1 + P_1$ in the server, the data block in the first file that is consistent with the first matching block.
  • The server reads the offset $P_K$ of the Kth matching block in the L matching blocks, reads from the M difference blocks, in order of increasing address, the Kth piece of difference data, whose size is $P_K - (D_{K-1} + P_{K-1})$, and writes it starting from address $D_K$ in the server.
  • the server may read the attribute information of the matching block in the order of the offset from small to large.
  • The client merges the N pieces of difference data into M difference blocks in ascending order of the addresses of the N pieces of difference data in the second file. Because the client transmits the difference blocks with an order-preserving mechanism, the server can select the difference data in order of increasing address, following the order in which the difference blocks were received, which ensures that the data in the file obtained by the server's merge is in the same order as in the second file.
  • The client may combine the difference data into difference blocks in order of increasing address, so that the difference data the server reads sequentially after receiving the difference blocks is in order of increasing address.
  • step S404 P K is greater than D K-1 + P K-1 ; if P K is equal to D K-1 + P K-1 , it indicates that the Kth matching block is compared with the k-1th matching block. Neighbor, that is, there is no difference data between the Kth matching block and the k-1th matching block, and in this case, the above step S404 need not be performed.
  • the server starts from the address D K +P K in the server, and writes a data block in the first file that is consistent with the Kth matching block.
  • the K is sequentially determined to be 2, 3, 4, ..., L, and the steps are repeated for each value of K.
  • S404 and S405 if there is still residual residual data not merged at the end, the client starts from the address D L in the server and writes the remaining difference data.
  • the server receives the M difference blocks sent by the client and the attribute information of the L matching blocks; and according to the attribute information of the L matching blocks, the M difference blocks and the first file and each The data blocks that match the matching blocks are merged into the second file. That is to say, the server receives the sent difference data of the client with the difference block as the minimum unit.
  • the server receives the N upload requests sent by the client for the N pieces of difference data, and the method is reduced. The server handles the number of upload requests, which reduces the load on the server.
  • the file A in FIG. 2 is taken as the second file, and the file C is used as the first file, and the server details the difference block and the data block in the first file that is consistent with the matching block to be merged into the second file. process.
  • the client can combine the difference data 1 to the difference data 4 into three difference blocks.
  • difference block 1 includes difference data 1, difference data 2, and the first part of difference data 3 (marked 1 in FIG. 5); difference block 2 includes the second part of difference data 3 (marked 2 in FIG. 5); difference block 3 includes the third part of difference data 3 (marked 3 in FIG. 5) and difference data 4.
  • the client may first send difference block 1 and the attribute information of matching block 2, matching block 3, and matching block 4 to the server; the attribute information is shown in FIG. 5 as (B2, 4KB) (B3, 18KB) (B4, 28KB).
  • B2 is the number of matching block 2
  • 4KB is the offset of matching block 2
  • B3 is the number of matching block 3
  • 18KB is the offset of matching block 3
  • B4 is the number of matching block 4
  • 28KB is the offset of matching block 4
  • the number of a matching block is the same as the number of the data block in the first file that is consistent with that matching block.
  • after receiving the difference data and attribute information sent by the client for the first time, because the offset of the first matching block, matching block 2, is 4KB, the server reads 4KB of difference data, namely difference data 1, from difference block 1 in order of increasing address, reads data block 2 in the first file, which is consistent with matching block 2, according to the number B2 of matching block 2, and merges difference data 1 with data block 2;
  • the offset of the second matching block, matching block 3, is 18KB. Since difference data 1 and data block 2 already occupy 14KB, the server reads another 4KB of difference data, namely difference data 2, from difference block 1 in order of increasing address and merges it with data block 2; it then reads data block 3 in the first file, which is consistent with matching block 3, according to the number B3 of matching block 3, and merges data block 3 with difference data 2;
  • the offset of the third matching block, matching block 4, is 28KB. Since difference data 1, data block 2, difference data 2, and data block 3 already occupy 28KB, it can be determined that there is no difference data between matching block 3 and matching block 4; the server reads data block 4 in the first file according to the number B4 of matching block 4 and merges data block 4 with data block 3;
  • the server can write the remaining difference data, that is, the first part of difference data 3, directly after data block 4.
  • after receiving the difference blocks and matching-block attribute information that the client sends subsequently, the server can merge the data in the same way until a file consistent with the second file is obtained; details are not repeated here.
  • the server may also, after receiving all of the M difference blocks and the attribute information of the L matching blocks, write the data block in the first file that is consistent with the K-th matching block starting at storage address D_1 + P_K of the server, where D_1 is the first address of the storage area of the server used to store the second file, P_K is the offset of the K-th matching block, and 1 ≤ K ≤ L; after all the data blocks have been written, it writes the difference data in the M difference blocks in turn into the free addresses of the storage area, starting from storage address D_1.
  • an embodiment of the present invention provides a client 70, configured to execute the corresponding method in the foregoing method embodiment and save a second file to a server that holds a first file. As shown in FIG. 7, the client 70 includes:
  • a determining unit 71, configured to determine, in the second file, N pieces of difference data of the second file relative to the first file and L matching blocks; the first file contains a data block consistent with each matching block, and the second file is composed of the N pieces of difference data and the L matching blocks; N and L are both positive integers greater than 0;
  • a merging unit 72 configured to merge the N pieces of difference data into M difference blocks; M is a positive integer smaller than N;
  • a sending unit 73, configured to send the M difference blocks and the attribute information of the L matching blocks to the server, where the attribute information of the L matching blocks is used to merge the M difference blocks and the data blocks in the first file that are consistent with each matching block into the second file.
  • the client may perform the following for each difference block: send an upload request to the server and, after receiving the response message that the server sends in reply to the upload request, send the difference block to the server.
  • when uploading difference data, the client does not request one upload from the server for every piece of difference data located at a different position; it first merges the difference data into larger difference blocks and requests one upload per difference block. Using the above client therefore reduces the number of requests sent to the server and thus the load on the server.
  • the order in which the client sends the difference blocks and the attribute information of the matching blocks is not limited: it may send the attribute information of the L matching blocks first and then the M difference blocks, or the M difference blocks first and then the attribute information, or, while sending each difference block, also send the attribute information of the matching blocks adjacent to the difference data in that block; this is not limited in the embodiments of the present invention.
  • the attribute information of the matching block needs to correctly indicate the position between the matching block and the difference data.
  • the attribute information of the matching block includes an identifier of the data block in the first file that is consistent with each matching block and an offset of the matching block in the second file.
  • the specific working process of each unit of the client described above may refer to the corresponding process in the foregoing method embodiment, and details are not described herein again.
  • the above division of the client functional unit is only a logical function division, and the actual implementation may have another division manner, and the physical implementation of the above functional unit may also have different manners.
  • the determining unit may specifically be a central processing unit or a specific integrated circuit
  • the merging unit may be a data reading and writing device capable of reading and writing data according to instructions of the processor
  • the transmitting unit may be a transmitter.
  • the embodiment of the present invention provides a server 80, which stores a first file, and is used to execute the corresponding method in the foregoing method embodiment.
  • the server 80 includes:
  • a receiving unit 81, configured to receive the M difference blocks and the attribute information of the L matching blocks sent by the client, where the M difference blocks are formed by merging the N pieces of difference data of the second file to be saved relative to the first file; the first file contains a data block consistent with each matching block, and the second file is composed of the N pieces of difference data and the L matching blocks; N and L are both positive integers greater than 0, and M is a positive integer less than N;
  • the merging unit 82 is configured to merge the M delta blocks and the data blocks in the first file that are consistent with the matching block into the second file according to the attribute information of the L matching blocks.
  • the attribute information of the L matching blocks includes an identifier of a data block in the first file that is consistent with each matching block, and an offset of the each matching block in the second file; the offset The quantity is the difference between the start address of the matching block in the second file and the first address of the second file; the merging unit 82 is specifically configured to: determine, according to the identifier of the data block, the first file L data blocks in which the L matching blocks are identical; the M difference blocks and the L data blocks are merged into the second file according to the offset of the L matching blocks.
  • the above division of the server into functional units is only a logical division, and an actual implementation may divide them differently; the merging unit may specifically be a data reading and writing device capable of reading and writing data according to instructions of the processor.
  • the receiving unit may be a receiver.
  • on the premise that the difference data and the first file can be merged correctly, the server only needs to receive upload requests that the client sends with the difference block as the minimum upload unit; compared with the prior art, in which the server receives N upload requests for the N pieces of difference data, the embodiment of the present invention reduces the load on the server.
  • the embodiment of the present invention provides another client 90, wherein the server saves the first file, and the client 90 is configured to save the second file to the server.
  • the client 90 includes:
  • the processor 91 may be a multi-core central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention.
  • the processor 91 is configured to implement the following operations:
  • determining, in the second file, N pieces of difference data of the second file relative to the first file and L matching blocks; the first file contains a data block consistent with each matching block, and the second file is composed of the N pieces of difference data and the L matching blocks; N and L are both positive integers greater than 0;
  • merging the N pieces of difference data into M difference blocks; M is a positive integer smaller than N
  • the attribute information of the L matching blocks includes an identifier of the data block in the first file that is consistent with each matching block, and the offset of each matching block in the second file;
  • the identifier of the data block is used to determine the L data blocks in the first file that are consistent with the L matching blocks; the offset is the difference between the start address of each matching block in the second file and the first address of the second file; and the offsets of the L matching blocks are used by the server to merge the M difference blocks and the L data blocks into the second file.
  • the server 10 includes:
  • a memory 101, a processor 102, a receiver 103, a transmitter 104, and a communication bus 105, where the memory 101, the processor 102, the receiver 103, and the transmitter 104 communicate with one another through the communication bus 105.
  • the memory 101 holds the first file.
  • the processor 102 may be a multi-core central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention.
  • CPU central processing unit
  • ASIC application-specific integrated circuit
  • the processor 102 is configured to implement the following operations:
  • receiving the M difference blocks and the attribute information of the L matching blocks sent by the client, where the M difference blocks are formed by merging the N pieces of difference data of the second file to be saved to the server relative to the first file; the first file contains a data block consistent with each matching block, and the second file is composed of the N pieces of difference data and the L matching blocks; N and L are both positive integers greater than 0, and M is a positive integer less than N;
  • the attribute information of the L matching blocks includes an identifier of the data block in the first file that is consistent with each matching block, and the offset of each matching block in the second file; the offset is the difference between the start address of the matching block in the second file and the first address of the second file;
  • the disclosed system, apparatus, and method may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division of the unit is only a logical function division.
  • there may be another division manner in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • the functional units in the embodiments of the present invention may be integrated into one processing unit, may each exist physically on their own, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of hardware plus software functional units.
  • the above-described integrated unit implemented in the form of a software functional unit can be stored in a computer readable storage medium.
  • the software functional units described above are stored in a storage medium and include instructions for causing a computer device (which may be a personal computer, server, or network device, etc.) to perform portions of the steps of the methods described in various embodiments of the present invention.
  • the foregoing storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or any other medium that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

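A file difference transmission method and apparatus, relating to the communications field, for solving the technical problem that a client must send one upload request to the server for each part of the difference data. The method includes: a client determines, in a second file, N pieces of difference data of the second file relative to a first file and L matching blocks; merges the N pieces of difference data into M difference blocks, where M is a positive integer smaller than N; and sends the M difference blocks and attribute information of the L matching blocks to the server, where the attribute information of the L matching blocks is used to merge the M difference blocks and the data blocks in the first file that are consistent with each matching block into the second file. The embodiments of the present invention are used for difference data transmission.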

Description

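File difference transmission method and apparatus
Technical Field
The present invention relates to the communications field, and in particular to a file difference transmission method and apparatus.
Background
With the spread of cloud storage services, more and more people upload personal files to network drives for backup, and expectations for upload speed keep rising. This has led to techniques that use a file signature to implement "skip" upload. Under limited network bandwidth, "skip" upload improves file upload efficiency.
Specifically, in "skip" upload the client, after determining a file A to be uploaded, computes a signature of file A and sends the signature to the server. The signature may be computed with Message-Digest Algorithm 5 (MD5) or with Secure Hash Algorithm 1 (SHA1). If a file B with the same signature as file A exists on the server, the server can return an upload-success message directly and the client need not upload file A. If no file with the same signature exists on the server, but a file C with the same file name as file A does, the client can use the Rsync algorithm to compute the difference data of file A relative to file C and send that difference data to the server; the server combines file C with the difference data to generate a file identical to file A, so the client does not have to upload the whole of file A.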
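The server-side decision described in the preceding paragraph can be sketched as follows. This is only an illustration under stated assumptions: the function name `handle_signature`, the in-memory `stored` dictionary, and the three return labels are hypothetical, MD5 is used as the signature, and the final branch (neither signature nor name found, so the whole file is uploaded) is an assumption the text does not spell out.

```python
import hashlib


def handle_signature(stored: dict, name: str, signature: str) -> str:
    """Decide how the client should proceed.

    stored maps file name -> file content held by the server (hypothetical model).
    """
    known = {hashlib.md5(data).hexdigest() for data in stored.values()}
    if signature in known:
        # A file B with the same signature exists: nothing needs uploading.
        return "upload_success"
    if name in stored:
        # A file C with the same name exists: upload only the difference data.
        return "upload_delta"
    # Assumed fallback: upload the whole file.
    return "upload_full_file"
```

The signature check comes first because it is the cheapest way to avoid any transfer at all; the file-name check only selects the base file for the difference computation.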
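Moreover, in the prior art, so that the server can correctly merge file C with the difference data, the client sends the difference data and the identifiers of the matching data to the server in the order in which they appear in file A, and the server merges the difference data and the data in file C corresponding to the matching-data identifiers in the order received. That is, when uploading difference data, the client must send one upload request to the server for each part of the difference data, which makes the upload process complex. When there is a lot of difference data between file A and file C, scattered across many different positions in file A, the client's repeated uploads increase the load on both the client and the server.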
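Summary
The object of the present invention is to provide a file difference transmission method and apparatus to solve the technical problem in the prior art that the client must send one upload request to the server for each part of the difference data.
The above object is achieved by the features of the independent claims. Further implementations are set out in the dependent claims, the description, and the drawings.
According to a first aspect, a file difference transmission method is provided, in which a server holds a first file and the method is used to save a second file to the server. The method includes: a client determines, in the second file, N pieces of difference data of the second file relative to the first file and L matching blocks, where the first file contains a data block consistent with each matching block, the second file is composed of the N pieces of difference data and the L matching blocks, and N and L are both positive integers greater than 0; merges the N pieces of difference data into M difference blocks, where M is a positive integer smaller than N; and sends the M difference blocks and attribute information of the L matching blocks to the server, where the attribute information of the L matching blocks is used to merge the M difference blocks and the data blocks in the first file that are consistent with each matching block into the second file.
That is, when uploading difference data, the client does not request one upload from the server for every piece of difference data located at a different position. Instead it first merges the difference data into larger difference blocks and uploads with the difference block as the minimum unit, which reduces the number of requests the client sends to the server and thus the load on the server.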
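In a first possible implementation of the first aspect, the attribute information of the L matching blocks includes an identifier of the data block in the first file that is consistent with each matching block, and the offset of each matching block in the second file. The identifier of the data block is used to determine the L data blocks in the first file that are consistent with the L matching blocks; the offset is the difference between the start address of each matching block in the second file and the first address of the second file; and the offsets of the L matching blocks are used to merge the M difference blocks and the L data blocks into the second file. That is, to ensure that the server, after receiving the difference blocks, can correctly merge them with the data blocks in the first file into the second file, the attribute information of a matching block must correctly indicate the position of the matching block relative to the difference data. The above scheme provides a preferred implementation: the attribute information of a matching block includes the identifier of the data block in the first file that is consistent with the matching block, and the offset of the matching block in the second file.
With reference to the first aspect, or in a second possible implementation of the first aspect, the client determining, in the second file, the N pieces of difference data of the second file relative to the first file and the L matching blocks includes: receiving file block information of the first file sent by the server; and determining, by means of the Rsync algorithm and according to the file block information, the matching blocks in the second file that are consistent with data blocks in the first file. The file block information includes identifiers of the data blocks in the first file. After determining a matching block in the second file that is consistent with a data block in the first file, the client can record the identifier of that data block, so that the server, on receiving the identifier, can locate the consistent data block in the first file.
According to a second aspect, a file difference transmission method is provided, in which a server holds a first file and the method is used to save a second file to the server. The method includes: the server receives M difference blocks and attribute information of L matching blocks sent by a client, where the M difference blocks are formed by merging N pieces of difference data of the second file relative to the first file, the first file contains a data block consistent with each matching block, the second file is composed of the N pieces of difference data and the L matching blocks, N and L are both positive integers greater than 0, and M is a positive integer smaller than N; and, according to the attribute information of the L matching blocks, merges the M difference blocks and the data blocks in the first file that are consistent with each matching block into the second file.
In the above scheme, the server receives the difference data that the client uploads with the difference block as the minimum unit. Compared with the prior art, in which the server must receive N upload requests sent by the client for the N pieces of difference data, the scheme reduces the number of upload requests the server handles and thus the load on the server.
In a first possible implementation of the second aspect, the attribute information of the L matching blocks includes an identifier of the data block in the first file that is consistent with each matching block, and the offset of each matching block in the second file; the offset is the difference between the start address of the matching block in the second file and the first address of the second file. Merging the M difference blocks and the data blocks in the first file that are consistent with each matching block into the second file according to the attribute information of the L matching blocks includes: determining, according to the identifiers of the data blocks, the L data blocks in the first file that are consistent with the L matching blocks; and merging the M difference blocks and the L data blocks into the second file according to the offsets of the L matching blocks.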
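According to a third aspect, a client is provided, where a server holds a first file and the client is configured to save a second file to the server. The client includes: a determining unit, configured to determine, in the second file, N pieces of difference data of the second file relative to the first file and L matching blocks, where the first file contains a data block consistent with each matching block, the second file is composed of the N pieces of difference data and the L matching blocks, and N and L are both positive integers greater than 0; a merging unit, configured to merge the N pieces of difference data into M difference blocks, where M is a positive integer smaller than N; and a sending unit, configured to send the M difference blocks and attribute information of the L matching blocks to the server, where the attribute information of the L matching blocks is used to merge the M difference blocks and the data blocks in the first file that are consistent with each matching block into the second file.
With reference to the third aspect, in a first possible implementation of the third aspect, the attribute information of the L matching blocks includes an identifier of the data block in the first file that is consistent with each matching block, and the offset of each matching block in the second file. The identifier of the data block is used to determine the L data blocks in the first file that are consistent with the L matching blocks; the offset is the difference between the start address of each matching block in the second file and the first address of the second file; and the offsets of the L matching blocks are used to merge the M difference blocks and the L data blocks into the second file.
It is worth noting that the above division of the client into functional units is only a logical division, and an actual implementation may divide them differently. The physical implementation of the functional units may also vary: for example, the determining unit may be a central processing unit or a specific integrated circuit, the merging unit may be a data reading and writing device capable of reading and writing data according to instructions of a processor, and the sending unit may be a transmitter.
According to a fourth aspect, a server is provided, where the server holds a first file and includes: a receiving unit, configured to receive M difference blocks and attribute information of L matching blocks sent by a client, where the M difference blocks are formed by merging N pieces of difference data of a second file to be saved to the server relative to the first file, the first file contains a data block consistent with each matching block, the second file is composed of the N pieces of difference data and the L matching blocks, N and L are both positive integers greater than 0, and M is a positive integer smaller than N; and a merging unit, configured to merge, according to the attribute information of the L matching blocks, the M difference blocks and the data blocks in the first file that are consistent with each matching block into the second file.
In a first possible implementation of the fourth aspect, the attribute information of the L matching blocks includes an identifier of the data block in the first file that is consistent with each matching block, and the offset of each matching block in the second file; the offset is the difference between the start address of the matching block in the second file and the first address of the second file. The merging unit is specifically configured to: determine, according to the identifiers of the data blocks, the L data blocks in the first file that are consistent with the L matching blocks; and merge the M difference blocks and the L data blocks into the second file according to the offsets of the L matching blocks.
It is worth noting that the above division of the server into functional units is only a logical division, and an actual implementation may divide them differently. The merging unit may be a data reading and writing device capable of reading and writing data according to instructions of a processor, and the receiving unit may be a receiver.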
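According to a fifth aspect, a client is provided, where a server holds a first file and the client is configured to save a second file to the server. The client includes a processor, a transmitter, a receiver, and a communication bus, where the processor, the transmitter, and the receiver communicate with one another through the communication bus. The processor is configured to: determine, in the second file, N pieces of difference data of the second file relative to the first file and L matching blocks, where the first file contains a data block consistent with each matching block, the second file is composed of the N pieces of difference data and the L matching blocks, and N and L are both positive integers greater than 0; merge the N pieces of difference data into M difference blocks, where M is a positive integer smaller than N; and send the M difference blocks and attribute information of the L matching blocks to the server, where the attribute information of the L matching blocks is used to merge the M difference blocks and the data blocks in the first file that are consistent with each matching block into the second file.
With reference to the fifth aspect, in a first possible implementation of the fifth aspect, the attribute information of the L matching blocks includes an identifier of the data block in the first file that is consistent with each matching block, and the offset of each matching block in the second file. The identifier of the data block is used to determine the L data blocks in the first file that are consistent with the L matching blocks; the offset is the difference between the start address of each matching block in the second file and the first address of the second file; and the offsets of the L matching blocks are used to merge the M difference blocks and the L data blocks into the second file.
According to a sixth aspect, a server is provided, including a memory, a processor, a transmitter, a receiver, and a communication bus, where the processor, the transmitter, and the receiver communicate with one another through the communication bus, and the memory holds a first file. The processor is configured to: receive M difference blocks and attribute information of L matching blocks sent by a client, where the M difference blocks are formed by merging N pieces of difference data of a second file to be saved to the server relative to the first file, the first file contains a data block consistent with each matching block, the second file is composed of the N pieces of difference data and the L matching blocks, N and L are both positive integers greater than 0, and M is a positive integer smaller than N; and merge, according to the attribute information of the L matching blocks, the M difference blocks and the data blocks in the first file that are consistent with each matching block into the second file.
In a first possible implementation of the sixth aspect, the attribute information of the L matching blocks includes an identifier of the data block in the first file that is consistent with each matching block, and the offsets of the L matching blocks in the second file; the offset is the difference between the start address of the matching block in the second file and the first address of the second file. The processor is specifically configured to: determine, according to the identifiers of the data blocks, the L data blocks in the first file that are consistent with the L matching blocks; and merge the M difference blocks and the L data blocks into the second file according to the offsets of the L matching blocks.
In possible implementations of some of the above aspects, the client merges the N pieces of difference data into M difference blocks in order of increasing address of the N pieces of difference data in the second file. The server merging the M difference blocks and the L data blocks into the second file according to the offsets of the L matching blocks then includes: writing the data block in the first file that is consistent with the K-th matching block starting at storage address D_1 + P_K of the server, where D_1 is the first address of the storage area of the server used to store the second file, P_K is the offset of the K-th matching block, and 1 ≤ K ≤ L; and, starting from storage address D_1 of the server, writing the difference data in the M difference blocks in turn into the free addresses of the storage area. This is only one example of how the server may merge the M difference blocks and the L data blocks; a person skilled in the art will understand that the present invention does not limit the order in which the difference blocks and the data blocks are merged.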
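Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a flowchart of a method, provided by an embodiment of the present invention, for determining the difference data between two files by applying the Rsync algorithm;
FIG. 2 shows the difference data and matching blocks of file A relative to file C, determined by the method shown in FIG. 1;
FIG. 3 is a schematic flowchart of a file difference transmission method provided by an embodiment of the present invention;
FIG. 4 is a flowchart of a method, provided by an embodiment of the present invention, by which the server merges difference blocks with data blocks in the first file;
FIG. 5 is an example, provided by an embodiment of the present invention, of the client merging difference data into difference blocks;
FIG. 6 is an example, provided by an embodiment of the present invention, of the server merging difference blocks with data blocks in the first file;
FIG. 7 is a schematic structural diagram of a client provided by an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a server provided by an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of another client provided by an embodiment of the present invention;
FIG. 10 is a schematic structural diagram of another server provided by an embodiment of the present invention.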
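Detailed Description
To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings.
The Rsync algorithm is first introduced briefly so that a person skilled in the art can more easily understand the technical solutions provided by the present invention.
The Rsync algorithm is an efficient algorithm for synchronizing files; it can find the data that differs between two files. As shown in FIG. 1, the method by which the client applies the Rsync algorithm to determine the data in file A that differs from file C includes:
S101. The client receives the file block information of file C sent by the server.
The file block information includes the number of each data block of file C and the block signature of each data block. The block signature includes a weak checksum and a strong checksum of the data block; the weak checksum may be Adler 32 and the strong checksum may be MD5. Every data block in file C has the same block size.
S102. The client reads a first data block of one block size from the start of file A.
S103. The client computes the weak checksum of the first data block and looks in file C for a second data block with the same weak checksum as the first data block.
If file C contains a second data block with the same weak checksum as the first data block, step S104 is executed; if it does not, step S108 is executed.
S104. The client computes the strong checksum of the first data block and determines whether the strong checksum of the second data block is the same as that of the first data block.
It is worth noting that when the client determines that the strong checksums of two data blocks are the same, it can effectively conclude that the two data blocks are the same.
If the strong checksums of the first and second data blocks differ, step S105 is executed; if they are the same, steps S106 and S107 are executed.
S105. The client moves one byte forward from the start of the first data block and reads a third data block of one block size.
The client then treats the third data block as the first data block and resumes from step S103.
S106. The client marks the first data block as matching the second data block.
For example, the client establishes a correspondence between the first data block and the number of the second data block.
S107. The client moves one block size forward from the start of the first data block and reads a fourth data block of one block size.
The client then treats the fourth data block as the first data block and resumes from step S103.
S108. The client moves one byte forward from the start of the first data block and reads a fifth data block of one block size.
The client then treats the fifth data block as the first data block and resumes from step S103.
In this way, through steps S101 to S108, the client can determine the data blocks in file A that match data blocks of file C.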
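Steps S101 to S108 can be sketched in a few lines. This is a minimal illustration, not the implementation described in the patent: the function names `block_signatures` and `find_matches` are hypothetical, `zlib.adler32` stands in for the weak checksum and `hashlib.md5` for the strong checksum, and the weak checksum is recomputed at every position instead of being rolled forward incrementally.

```python
import hashlib
import zlib


def block_signatures(first_file: bytes, block_size: int) -> dict:
    """Server side (input to S101): weak checksum -> [(block number, MD5 digest)]."""
    table = {}
    for number, start in enumerate(range(0, len(first_file), block_size)):
        block = first_file[start:start + block_size]
        table.setdefault(zlib.adler32(block), []).append(
            (number, hashlib.md5(block).digest()))
    return table


def find_matches(second_file: bytes, table: dict, block_size: int):
    """Client side (S102 to S108).

    Returns (matches, diffs):
      matches: (block number in the first file, offset in the second file)
      diffs:   byte runs of the second file separated by the matching blocks
    """
    matches, diffs = [], []
    pos, diff_start = 0, 0
    while pos + block_size <= len(second_file):
        window = second_file[pos:pos + block_size]
        hit = None
        for number, strong in table.get(zlib.adler32(window), []):  # S103
            if hashlib.md5(window).digest() == strong:              # S104
                hit = number
                break
        if hit is None:
            pos += 1                  # S105 / S108: slide forward one byte
            continue
        if pos > diff_start:
            diffs.append(second_file[diff_start:pos])
        matches.append((hit, pos))    # S106: record the correspondence
        pos += block_size             # S107: skip one whole block
        diff_start = pos
    if diff_start < len(second_file):
        diffs.append(second_file[diff_start:])  # trailing difference data
    return matches, diffs
```

The weak checksum is cheap and filters out most positions; the strong checksum is computed only when the weak one hits, which is the division of labour the steps describe.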
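As shown in FIG. 2, suppose file C includes data block 1, data block 2, data block 3, data block 4, and data block 5, and the client determines by the Rsync algorithm that file A contains matching block 2, consistent with data block 2; matching block 3, consistent with data block 3; and matching block 4, consistent with data block 4. As FIG. 2 shows, file A contains two matching blocks 3 consistent with data block 3. The difference data of file A relative to file C is then each run of data in file A separated by the matching blocks: difference data 1, difference data 2, difference data 3, and difference data 4 in FIG. 2.
Still taking FIG. 2 as the example, in the prior art, so that the server can correctly merge the difference data in file A with file C into a file identical to file A, the client sends to the server, in order: difference data 1, the number of matching block 2, difference data 2, the number of matching block 3, the number of matching block 4, difference data 3, the number of matching block 3, and difference data 4. The server merges the received difference data and the data blocks in file C corresponding to the matching-block numbers in the order received.
In addition, the client described herein may be a user device such as a mobile phone, a computer, or a tablet, and the server may be a cloud storage server.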
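An embodiment of the present invention provides a file difference transmission method to solve the technical problem in the prior art that uploading each piece of difference data separately increases the load on the server, where the server holds a first file. As shown in FIG. 3, the method includes:
S301. The client determines a second file to be uploaded and computes the signature of the second file.
S302. The client sends the file name of the second file and the signature to the server.
S303. The server checks whether a first file with the same file name as the second file exists.
If not, steps S304 and S305 are executed; if so, steps S306 to S313 are executed.
S304. The server sends an upload notification message to the client.
S305. After receiving the upload notification message, the client uploads the second file to the server.
S306. The server computes the signature of the first file and determines whether the signature of the second file is the same as that of the first file.
If they are the same, step S307 is executed; if not, steps S308 to S313 are executed.
S307. The server sends an upload-success message to the client.
That is, for a file that already exists on the server, a client uploading it again only needs to send the file name and signature to the server and does not need to upload the file data, which improves upload efficiency.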
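S308. The server divides the first file into multiple data blocks and determines the file block information of the first file.
S309. The server sends the file block information to the client.
S310. The client determines, by means of the Rsync algorithm and according to the file block information, N pieces of difference data of the second file relative to the first file and L matching blocks in the second file.
N and L are both positive integers greater than 0.
For the detailed process of steps S308 to S310, refer to the method steps shown in FIG. 1; details are not repeated here.
It is worth noting that in step S308 the block size the server uses for the first file directly affects which data blocks match between the second file and the first file. In general, the smaller the block, the more data blocks of the second file match the first file; the larger the block, the larger the difference data of the second file relative to the first file. In a specific implementation, the block size for the first file can therefore be preset, for example to 10 kilobytes (KB); the server then starts from the first address of the first file and cuts every 10KB of data into one data block, where the last data block may be smaller than 10KB.
S311. The client merges the N pieces of difference data into M difference blocks.
M is a positive integer smaller than N.
It is worth noting that the size of a difference block is directly related to the number of times the client uploads data to the server: the larger the difference block, the fewer the difference blocks and the fewer the uploads. However, the size of a difference block is limited by the client's transmission capability. In a specific implementation, the difference block size can be preset according to the client's current network bandwidth, the data transmission format, and so on. The difference blocks may have different sizes or the same size; for example, every difference block may be set to the same size as a matching block. This is not limited in the embodiments of the present invention.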
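S312. The client sends the M difference blocks and the attribute information of the L matching blocks to the server.
In a possible implementation of the embodiment of the present invention, the client may perform the following for each difference block: the client sends an upload request to the server and, on receiving the response message that the server sends in reply to the upload request, sends to the server the difference block together with the attribute information of the matching blocks adjacent to each piece of difference data in that difference block.
In this way the server receives M upload requests sent by the client for the M difference blocks. Compared with the prior art, in which the server must receive N upload requests sent by the client for the N pieces of difference data, the scheme reduces the number of upload requests the server handles and thus the load on the server.
The above is only an example. In an actual implementation, the client may also send one upload request for several difference blocks; that is, the client may upload with the difference block as the minimum unit. The present invention does not limit this.
S313. After receiving the M difference blocks and the attribute information of the L matching blocks, the server merges, according to the attribute information of the L matching blocks, the M difference blocks and the data blocks in the first file that are consistent with each matching block into the second file.
It is worth noting that the M difference blocks are transmitted to the server in an order-preserving manner: the order in which the client sends the difference blocks is the order in which the server receives them, so that the server can select difference blocks for merging in the order received.
With the above method, the client determines, in the second file, the N pieces of difference data of the second file relative to the first file and the L matching blocks, merges the N pieces of difference data into M difference blocks, and sends the M difference blocks and the attribute information of the L matching blocks to the server, so that the server can merge, according to that attribute information, the M difference blocks and the data blocks in the first file that are consistent with each matching block into the second file. Compared with the prior art, in which the client uploads one piece of difference data at a time, the client in the present invention can upload one difference block at a time, which reduces the number of uploads and thus the load on the server.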
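The merging in step S311 can be sketched as concatenating the difference data in address order and cutting the result into blocks. This is a sketch under assumptions: the function name `merge_into_difference_blocks` is hypothetical, a single fixed `block_size` is used for every difference block (the text also allows differing sizes), and the function itself does not enforce M < N; that depends on the chosen block size.

```python
def merge_into_difference_blocks(diffs: list, block_size: int) -> list:
    """Concatenate the N pieces of difference data in increasing address
    order and cut the stream into difference blocks of at most block_size bytes."""
    stream = b"".join(diffs)
    return [stream[i:i + block_size] for i in range(0, len(stream), block_size)]
```

For instance, four pieces of 4, 4, 16, and 6 bytes with a block size of 10 give three blocks, the third piece being split across all three; the 6-byte size of the last piece is an illustrative value only.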
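It should be noted that, for simplicity, the above method embodiment is described as a series of combined actions, but a person skilled in the art should understand that the present invention is not limited by the described order of actions. For example, regarding the sending of the difference blocks and the matching-block attribute information in step S312, the client may also first send every difference block to the server and then send the attribute information of the L matching blocks, or send them at the same time; this is not limited in the embodiments of the present invention. A person skilled in the art should also understand that the embodiments described in the specification are preferred embodiments, and the actions involved are not necessarily required by the present invention.
Step S313 is described in detail below. The attribute information of the L matching blocks may include the identifier of the data block in the first file that is consistent with each matching block, and the offsets of the L matching blocks in the second file; the offset is the difference between the start address of the matching block in the second file and the first address of the second file. After receiving the M difference blocks and the attribute information of the L matching blocks, the server can thus determine, according to the identifiers of the data blocks, the L data blocks in the first file that are consistent with the L matching blocks.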
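Further, to help a person skilled in the art better understand how the server merges the data, a detailed example follows. As shown in FIG. 4, the method includes:
S401. The server determines, according to the identifiers of the data blocks in the first file that are consistent with each matching block, the L data blocks in the first file that are consistent with the L matching blocks.
Optionally, the identifier of a data block may be the number of the data block. With reference to the description of step S106 in FIG. 1, the file block information sent by the server includes the numbers of the data blocks of the first file, so after determining by the Rsync algorithm a matching block in the second file that matches a data block of the first file, the client can record the number of that data block and send the number to the server as the identifier of the data block in the first file that is consistent with the matching block.
Alternatively, the identifier of a data block may be the block size of the data blocks in the first file together with the offset of the data block in the first file. That is, in step S309 shown in FIG. 3, the file block information the server sends to the client includes the block size of the data blocks in the first file and the offsets of the data blocks; after determining a matching block in the second file that is consistent with a data block, the client records the block size and offset of that data block, so that the server, on receiving the identifier of the data block sent by the client, can locate the data block in the first file that is consistent with the matching block by using the offset and the block size.
S402. The server reads the offset P_1 of the first of the L matching blocks, reads the first piece of difference data, of size P_1, from the M difference blocks in order of increasing address, and writes that first piece of difference data starting at the first address D_1 of the first file in the server.
It is worth noting that in step S402, P_1 is not equal to 0. If P_1 equals 0, the server writes the first matching block directly from address D_1; that is, the first file and the second file have a consistent data block and matching block right from their first addresses. In that case step S402 need not be executed and step S403 is executed directly.
S403. The server writes the data block in the first file that is consistent with the first matching block, starting at address D_1 + P_1 in the server.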
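S404. The server reads the offset P_K of the K-th of the L matching blocks, reads the K-th piece of difference data, of size P_K - (D_{K-1} + P_{K-1}), from the M difference blocks in order of increasing address, and writes that K-th piece of difference data starting at address D_K in the server.
It is worth noting that in step S404, 1 < K ≤ L, P_{K-1} < P_K, and D_K = D_{K-1} + P_{K-1}. From P_{K-1} < P_K it follows that in the embodiment of the present invention the server may read the attribute information of the matching blocks in order of increasing offset. The client merges the N pieces of difference data into M difference blocks in order of increasing address in the second file. Because the client transmits the difference blocks with an order-preserving mechanism, the server can take the difference blocks in the order received and, within each block, select the difference data in order of increasing address for merging, which ensures that the data in the merged file is in the same order as in the second file.
Preferably, in step S311 shown in FIG. 3, the client merges the difference data into difference blocks in order of increasing address, so that the difference data the server reads sequentially after receiving a difference block has increasing addresses.
In addition, in step S404, P_K is greater than D_{K-1} + P_{K-1}. If P_K equals D_{K-1} + P_{K-1}, the K-th matching block is adjacent to the (K-1)-th matching block, that is, there is no difference data between them, and step S404 need not be performed.
S405. The server writes the data block in the first file that is consistent with the K-th matching block, starting at address D_K + P_K in the server.
It is worth noting that, in a specific implementation, after the server has executed steps S401 and S403 for the first time, K takes the values 2, 3, 4, ..., L in turn, and steps S404 and S405 are repeated for each value of K. If unmerged difference data still remains at the end, the server writes the remaining difference data starting at address D_L in the server.
With the above method, the server receives the M difference blocks and the attribute information of the L matching blocks sent by the client and, according to that attribute information, merges the M difference blocks and the data blocks in the first file that are consistent with each matching block into the second file. That is, the server receives the difference data from the client with the difference block as the minimum unit. Compared with the prior art, in which the server must receive N upload requests sent by the client for the N pieces of difference data, the method reduces the number of upload requests the server handles and thus the load on the server.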
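The merge loop of steps S401 to S405 can be sketched as follows. This is an illustration under assumptions, not the patent's implementation: the function name `merge_second_file` is hypothetical, data-block identifiers are taken to be block numbers, all difference blocks are assumed to have arrived before merging starts, and the amount of difference data to read before each matching block is computed from the number of bytes already written rather than with the D and P notation used in the text.

```python
def merge_second_file(first_file: bytes, block_size: int,
                      diff_blocks: list, attributes: list) -> bytes:
    """Rebuild the second file on the server.

    attributes: (block number in the first file, offset in the second file),
                sorted by increasing offset.
    """
    stream = b"".join(diff_blocks)   # difference data in increasing address order
    out = bytearray()
    read = 0                         # bytes of difference data consumed so far
    for number, offset in attributes:
        gap = offset - len(out)      # 0 when two matching blocks are adjacent
        out += stream[read:read + gap]                 # S402 / S404
        read += gap
        start = number * block_size                    # S401: locate the data block
        out += first_file[start:start + block_size]    # S403 / S405
    out += stream[read:]             # remaining difference data after the last block
    return bytes(out)
```

Because the offsets are measured from the start of the second file, the only state the loop needs is how much output has been written and how much difference data has been consumed.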
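Taking file A in FIG. 2 as the second file and file C as the first file, the following describes in detail how the server merges the difference blocks and the data blocks in the first file that are consistent with the matching blocks into the second file.
As shown in FIG. 2, suppose each matching block is 10KB in size, the offset of matching block 2 is 4KB, the offset of matching block 3 is 18KB, the offset of matching block 4 is 28KB, and the offset of the other matching block 3 is 54KB. In this case, the client can merge difference data 1 through difference data 4 into three difference blocks. For example, as shown in FIG. 5, difference block 1 includes difference data 1, difference data 2, and the first part of difference data 3 (marked 1 in FIG. 5); difference block 2 includes the second part of difference data 3 (marked 2 in FIG. 5); difference block 3 includes the third part of difference data 3 (marked 3 in FIG. 5) and difference data 4.
The client can then first send to the server difference block 1 and the attribute information of matching block 2, matching block 3, and matching block 4, shown in FIG. 5 as (B2, 4KB) (B3, 18KB) (B4, 28KB). B2 is the number of matching block 2 and 4KB is its offset; B3 is the number of matching block 3 and 18KB is its offset; B4 is the number of matching block 4 and 28KB is its offset. The number of a matching block is the same as the number of the data block in the first file that is consistent with that matching block.
Further, as shown in FIG. 6, after receiving the difference data and attribute information sent by the client for the first time, because the offset of the first matching block, matching block 2, is 4KB, the server reads 4KB of difference data, namely difference data 1, from difference block 1 in order of increasing address, reads data block 2 in the first file, which is consistent with matching block 2, according to the number B2 of matching block 2, and merges difference data 1 with data block 2;
The offset of the second matching block, matching block 3, is 18KB. Since difference data 1 and data block 2 already occupy 14KB, the server reads another 4KB of difference data, namely difference data 2, from difference block 1 in order of increasing address and merges it with data block 2; it then reads data block 3 in the first file, which is consistent with matching block 3, according to the number B3 of matching block 3, and merges data block 3 with difference data 2;
The offset of the third matching block, matching block 4, is 28KB. Since difference data 1, data block 2, difference data 2, and data block 3 already occupy 28KB, it can be determined that there is no difference data between matching block 3 and matching block 4; the server reads data block 4 in the first file according to the number B4 of matching block 4 and merges data block 4 with data block 3;
Further, difference block 1 still contains difference data that has not been merged. In this case, the server can write the remaining difference data, that is, the first part of difference data 3, directly after data block 4.
By analogy, after receiving the difference blocks and matching-block attribute information that the client sends subsequently, the server can merge the data in the same way until a file consistent with the second file is obtained; details are not repeated here.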
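The arithmetic in this example can be checked with a short calculation: the difference data in front of each matching block is the block's offset minus everything already placed. The helper name `gaps_before_matches` is hypothetical; the offsets (4, 18, 28, 54 KB) and the 10KB block size are the values given for FIG. 2.

```python
def gaps_before_matches(offsets_kb: list, block_kb: int) -> list:
    """Size in KB of the difference data that precedes each matching block."""
    gaps, written = [], 0
    for offset in offsets_kb:
        gaps.append(offset - written)   # difference data needed to reach this offset
        written = offset + block_kb     # the matching block itself follows
    return gaps


# Matching blocks 2, 3, 4 and the second matching block 3 of FIG. 2
print(gaps_before_matches([4, 18, 28, 54], 10))  # [4, 4, 0, 16]
```

The result reads as: 4KB of difference data 1, 4KB of difference data 2, nothing between matching blocks 3 and 4, and 16KB of difference data 3 before the second matching block 3. The size of difference data 4 cannot be derived from the offsets alone, since it follows the last matching block.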
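It is worth noting that, as a person skilled in the art should understand, the present invention does not limit the order in which the difference blocks and the data blocks are merged. For example, the server may also, after receiving all of the M difference blocks and the attribute information of the L matching blocks, write the data block in the first file that is consistent with the K-th matching block starting at storage address D_1 + P_K of the server, where D_1 is the first address of the storage area of the server used to store the second file, P_K is the offset of the K-th matching block, and 1 ≤ K ≤ L; after all the data blocks have been written, it writes the difference data in the M difference blocks in turn into the free addresses of the storage area, starting from storage address D_1.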
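This alternative order, data blocks first and difference data into the free addresses afterwards, can be sketched as below. The sketch rests on assumptions: the name `merge_blocks_first` is hypothetical, the storage area is modelled as a `bytearray` whose index 0 plays the role of D_1, data-block identifiers are block numbers, and the size of the second file is taken to be the total size of the difference data plus the matched data blocks.

```python
def merge_blocks_first(first_file: bytes, block_size: int,
                       diff_blocks: list, attributes: list) -> bytes:
    """attributes: (block number in the first file, offset P_k in the second file)."""
    stream = b"".join(diff_blocks)
    blocks = [(offset, first_file[n * block_size:(n + 1) * block_size])
              for n, offset in attributes]
    total = len(stream) + sum(len(block) for _, block in blocks)
    out = bytearray(total)                  # storage area; index 0 stands for D_1
    free = bytearray(b"\x01") * total       # 1 marks a free address
    for offset, block in blocks:            # pass 1: data blocks at D_1 + P_k
        out[offset:offset + len(block)] = block
        free[offset:offset + len(block)] = bytes(len(block))
    diff_bytes = iter(stream)
    for address in range(total):            # pass 2: fill free addresses in order
        if free[address]:
            out[address] = next(diff_bytes)
    return bytes(out)
```

The design trade-off relative to the sequential merge is that the data blocks can be placed in any order, at the cost of tracking which addresses are still free.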
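An embodiment of the present invention provides a client 70, configured to execute the corresponding method in the foregoing method embodiment and save a second file to a server that holds a first file. As shown in FIG. 7, the client 70 includes:
a determining unit 71, configured to determine, in the second file, N pieces of difference data of the second file relative to the first file and L matching blocks; the first file contains a data block consistent with each matching block, and the second file is composed of the N pieces of difference data and the L matching blocks; N and L are both positive integers greater than 0;
a merging unit 72, configured to merge the N pieces of difference data into M difference blocks; M is a positive integer smaller than N;
a sending unit 73, configured to send the M difference blocks and the attribute information of the L matching blocks to the server, where the attribute information of the L matching blocks is used to merge the M difference blocks and the data blocks in the first file that are consistent with each matching block into the second file.
In a possible implementation of the embodiment of the present invention, the client may perform the following for each difference block: send an upload request to the server and, after receiving the response message that the server sends in reply to the upload request, send the difference block to the server.
That is, when uploading difference data, the client does not request one upload from the server for every piece of difference data located at a different position; it first merges the difference data into larger difference blocks and requests one upload per difference block. Using the above client therefore reduces the number of requests sent to the server and thus the load on the server.
It is worth noting that the order in which the client sends the difference blocks and the attribute information of the matching blocks is not limited: it may send the attribute information of the L matching blocks first and then the M difference blocks, or the M difference blocks first and then the attribute information, or, while sending each difference block, also send the attribute information of the matching blocks adjacent to the difference data in that block. This is not limited in the embodiments of the present invention.
In addition, to ensure that the server, after receiving the difference blocks, can correctly merge them with the first file into the second file, the attribute information of a matching block must correctly indicate the position of the matching block relative to the difference data. In a preferred implementation of the embodiment of the present invention, the attribute information of a matching block includes the identifier of the data block in the first file that is consistent with the matching block and the offset of the matching block in the second file. For the process by which the server merges the difference blocks and the data blocks in the first file that are consistent with the matching blocks into the second file according to the attribute information, refer to the corresponding description in the foregoing method embodiment; details are not repeated here.
A person skilled in the art will clearly understand that, for convenience and brevity of description, the specific working process of each unit of the client described above can be found in the corresponding process of the foregoing method embodiment and is not repeated here. Moreover, the above division of the client into functional units is only a logical division, and an actual implementation may divide them differently. The physical implementation of the functional units may also vary: for example, the determining unit may be a central processing unit or a specific integrated circuit, the merging unit may be a data reading and writing device capable of reading and writing data according to instructions of a processor, and the sending unit may be a transmitter.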
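An embodiment of the present invention provides a server 80, which holds a first file and is configured to execute the corresponding method in the foregoing method embodiment. As shown in FIG. 8, the server 80 includes:
a receiving unit 81, configured to receive M difference blocks and attribute information of L matching blocks sent by a client, where the M difference blocks are formed by merging N pieces of difference data of a second file to be saved relative to the first file; the first file contains a data block consistent with each matching block, and the second file is composed of the N pieces of difference data and the L matching blocks; N and L are both positive integers greater than 0, and M is a positive integer smaller than N;
a merging unit 82, configured to merge, according to the attribute information of the L matching blocks, the M difference blocks and the data blocks in the first file that are consistent with the matching blocks into the second file.
Optionally, the attribute information of the L matching blocks includes an identifier of the data block in the first file that is consistent with each matching block, and the offset of each matching block in the second file; the offset is the difference between the start address of the matching block in the second file and the first address of the second file. The merging unit 82 is specifically configured to: determine, according to the identifiers of the data blocks, the L data blocks in the first file that are consistent with the L matching blocks; and merge the M difference blocks and the L data blocks into the second file according to the offsets of the L matching blocks.
The above division of the server into functional units is only a logical division, and an actual implementation may divide them differently. The merging unit may be a data reading and writing device capable of reading and writing data according to instructions of a processor, and the receiving unit may be a receiver.
In addition, a person skilled in the art will clearly understand that, for convenience and brevity of description, the specific working process of each unit of the server described above can be found in the corresponding process of the foregoing method embodiment and is not repeated here.
With the server provided by the embodiment of the present invention, on the premise that the difference data and the first file can be merged correctly, the server only needs to receive upload requests that the client sends with the difference block as the minimum upload unit. Compared with the prior art, in which the server must receive N upload requests sent by the client for the N pieces of difference data, the embodiment of the present invention reduces the load on the server.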
本发明实施例提供另一种客户端90,其中,服务器保存第一文件,该客户端90用于将第二文件保存到该服务器,如图9所示,该客户端90包括:
处理器(processor)91、接收机92、发射机93和通信总线94;其中,所述处理器91、所述接收机92和所述发射机93通过所述通信总线94完成相互间的通信。
处理器91可能是一个多核中央处理器CPU,或者是特定集成电路(英文全称:Application Specific Integrated Circuit,简称:ASIC),或者是被配置成实施本发明实施例的一个或多个集成电路。
所述处理器91用于实现以下操作:
确定所述第二文件中,所述第二文件相比所述第一文件的N份差量数据以 及L个匹配块;所述第一文件中存在与每个所述匹配块相一致的数据块,所述第二文件由所述N份差量数据以及所述L个匹配块组成;N和L均是大于0的正整数;
将所述N份差量数据合并为M个差量块;M是小于N的正整数;
将所述M个差量块以及所述L个匹配块的属性信息发送至所述服务器;其中,所述L个匹配块的属性信息被用于将所述M个差量块以及所述第一文件中与每个所述匹配块相一致的数据块,合并为所述第二文件。
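Optionally, the attribute information of the L matching blocks includes the identifier of the data block in the first file that is identical to each of the matching blocks, and the offset of each of the matching blocks in the second file,
where the identifiers of the data blocks are used to determine the L data blocks in the first file that are identical to the L matching blocks; the offset is the difference between the start address of each matching block in the second file and the first address of the second file; and the offsets of the L matching blocks are used by the server to merge the M delta blocks and the L data blocks into the second file.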
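The first of the processor operations above, determining the delta data and the matching blocks, might look like the sketch below. It makes several assumptions that this passage does not specify: the first file is divided into fixed-size blocks, a block's identifier is its index, blocks are compared by SHA-256 digest, and the client can obtain the digests of the first file's blocks (computed here directly from the first file for simplicity). The byte-by-byte rescan is deliberately naive; a practical client would likely use a cheaper rolling checksum. The function name `find_deltas_and_matches` is hypothetical.

```python
import hashlib
from typing import Dict, List, Tuple


def find_deltas_and_matches(first_file: bytes, second_file: bytes,
                            block_size: int) -> Tuple[List[bytes], List[Tuple[int, int]]]:
    """Split second_file into delta pieces and matching blocks.

    Returns (pieces, attrs), where attrs holds one (block_id, offset) pair
    per matching block, with offset measured from the start of second_file.
    """
    # Index every full block of the first file by its digest.
    index: Dict[bytes, int] = {}
    for block_id in range(len(first_file) // block_size):
        block = first_file[block_id * block_size:(block_id + 1) * block_size]
        index.setdefault(hashlib.sha256(block).digest(), block_id)

    pieces: List[bytes] = []
    attrs: List[Tuple[int, int]] = []
    current = bytearray()  # delta bytes collected since the last match
    pos = 0
    while pos < len(second_file):
        window = second_file[pos:pos + block_size]
        block_id = None
        if len(window) == block_size:
            block_id = index.get(hashlib.sha256(window).digest())
        if block_id is not None:
            if current:  # close the pending piece of delta data
                pieces.append(bytes(current))
                current = bytearray()
            attrs.append((block_id, pos))
            pos += block_size
        else:
            current.append(second_file[pos])
            pos += 1
    if current:
        pieces.append(bytes(current))
    return pieces, attrs


pieces, attrs = find_deltas_and_matches(b"AAAABBBBCCCC", b"xxBBBByyAAAAzz", 4)
```

In this example the second file yields N = 3 pieces of delta data (`xx`, `yy`, `zz`) and L = 2 matching blocks, identified as block 1 at offset 2 and block 0 at offset 8.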
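An embodiment of the present invention provides another server 10. As shown in FIG. 10, the server 10 includes:
a memory 101, a processor 102, a receiver 103, a transmitter 104, and a communication bus 105, where the memory 101, the processor 102, the receiver 103, and the transmitter 104 communicate with one another through the communication bus 105.
The memory 101 stores the first file.
The processor 102 may be a multi-core central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention.
The processor 102 is configured to perform the following operations:
receiving M delta blocks and attribute information of L matching blocks sent by a client, where the M delta blocks are obtained by merging N pieces of delta data of a second file to be saved to the server relative to the first file; the first file contains a data block identical to each of the matching blocks; the second file consists of the N pieces of delta data and the L matching blocks; N and L are both positive integers greater than 0, and M is a positive integer less than N; and
merging, according to the attribute information of the L matching blocks, the M delta blocks and the data blocks in the first file that are identical to each of the matching blocks into the second file.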
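Optionally, the attribute information of the L matching blocks includes the identifier of the data block in the first file that is identical to each of the matching blocks, and the offset of each of the matching blocks in the second file, where the offset is the difference between the start address of the matching block in the second file and the first address of the second file.
The merging, according to the attribute information of the L matching blocks, of the M delta blocks and the data blocks in the first file that are identical to each of the matching blocks into the second file includes:
determining, according to the identifiers of the data blocks, the L data blocks in the first file that are identical to the L matching blocks; and
merging, according to the offsets of the L matching blocks, the M delta blocks and the L data blocks into the second file.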
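The two merge steps above can be illustrated with a short Python sketch. It is not the patented implementation and rests on stated assumptions: data blocks in the first file have a fixed `block_size`, a block identifier is the block's index, and the M delta blocks, concatenated in the order received, contain the N pieces of delta data in file order. The names `MatchAttr` and `rebuild_second_file` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class MatchAttr:
    block_id: int  # identifier (here: index) of the data block in the first file
    offset: int    # start of the matching block relative to the start of the second file


def rebuild_second_file(first_file: bytes, delta_blocks: List[bytes],
                        attrs: List[MatchAttr], block_size: int) -> bytes:
    # Step 1: use the identifiers to locate the L data blocks in the first file.
    located: List[Tuple[int, bytes]] = []
    for attr in sorted(attrs, key=lambda a: a.offset):
        start = attr.block_id * block_size
        data_block = first_file[start:start + block_size]
        if len(data_block) != block_size:
            raise ValueError(f"unknown data block id {attr.block_id}")
        located.append((attr.offset, data_block))

    # Step 2: use the offsets to interleave delta bytes and data blocks.
    delta = b"".join(delta_blocks)
    out = bytearray()
    consumed = 0  # number of delta bytes already placed
    for offset, data_block in located:
        gap = offset - len(out)  # bytes of delta data that precede this block
        if gap < 0 or consumed + gap > len(delta):
            raise ValueError("inconsistent offsets or missing delta data")
        out += delta[consumed:consumed + gap]
        consumed += gap
        out += data_block
    out += delta[consumed:]  # trailing delta data after the last matching block
    return bytes(out)


first_file = b"AAAABBBBCCCC"
attrs = [MatchAttr(block_id=1, offset=2), MatchAttr(block_id=0, offset=8)]
delta_blocks = [b"xxy", b"yzz"]  # M = 2 blocks carrying N = 3 pieces: xx, yy, zz
second_file = rebuild_second_file(first_file, delta_blocks, attrs, block_size=4)
```

Note that the boundaries of the delta blocks need not coincide with the boundaries of the original pieces of delta data: because the offsets fix where each matching block starts, the server can cut the concatenated delta bytes to fit the gaps.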
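In the several embodiments provided in this application, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is merely a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
The integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform some of the steps of the methods described in the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.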
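Although preferred embodiments of the present invention have been described, persons skilled in the art can make further changes and modifications to these embodiments once they learn of the basic inventive concept. The appended claims are therefore intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the present invention.
Obviously, persons skilled in the art can make various changes and variations to the present invention without departing from its spirit and scope. Thus, if these modifications and variations fall within the scope of the claims of the present invention and their equivalent technologies, the present invention is also intended to cover them.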

Claims (12)

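  1. A file delta transmission method, characterized in that a server stores a first file, and the method is used to save a second file to the server and comprises:
    determining, by a client, in the second file, N pieces of delta data of the second file relative to the first file and L matching blocks, wherein the first file contains a data block identical to each of the matching blocks, the second file consists of the N pieces of delta data and the L matching blocks, and N and L are both positive integers greater than 0;
    merging the N pieces of delta data into M delta blocks, wherein M is a positive integer less than N; and
    sending the M delta blocks and attribute information of the L matching blocks to the server, wherein the attribute information of the L matching blocks is used to merge the M delta blocks and the data blocks in the first file that are identical to each of the matching blocks into the second file.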
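  2. The method according to claim 1, characterized in that the attribute information of the L matching blocks comprises an identifier of the data block in the first file that is identical to each of the matching blocks, and an offset of each of the matching blocks in the second file,
    wherein the identifiers of the data blocks are used to determine the L data blocks in the first file that are identical to the L matching blocks, the offset is the difference between the start address of each of the matching blocks in the second file and the first address of the second file, and the offsets of the L matching blocks are used to merge the M delta blocks and the L data blocks into the second file.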
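  3. A method for file delta transmission, characterized in that a server stores a first file, and the method is used to save a second file to the server and comprises:
    receiving, by the server, M delta blocks and attribute information of L matching blocks sent by a client, wherein the M delta blocks are obtained by merging N pieces of delta data of the second file relative to the first file; the first file contains a data block identical to each of the matching blocks; the second file consists of the N pieces of delta data and the L matching blocks; N and L are both positive integers greater than 0, and M is a positive integer less than N; and
    merging, according to the attribute information of the L matching blocks, the M delta blocks and the data blocks in the first file that are identical to each of the matching blocks into the second file.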
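  4. The method according to claim 3, characterized in that the attribute information of the L matching blocks comprises an identifier of the data block in the first file that is identical to each of the matching blocks, and an offset of each of the matching blocks in the second file, wherein the offset is the difference between the start address of the matching block in the second file and the first address of the second file; and
    the merging, according to the attribute information of the L matching blocks, of the M delta blocks and the data blocks in the first file that are identical to each of the matching blocks into the second file comprises:
    determining, according to the identifiers of the data blocks, the L data blocks in the first file that are identical to the L matching blocks; and
    merging, according to the offsets of the L matching blocks, the M delta blocks and the L data blocks into the second file.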
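  5. A client, characterized in that a server stores a first file, the client is configured to save a second file to the server, and the client comprises:
    a determining unit, configured to determine, in the second file, N pieces of delta data of the second file relative to the first file and L matching blocks, wherein the first file contains a data block identical to each of the matching blocks, the second file consists of the N pieces of delta data and the L matching blocks, and N and L are both positive integers greater than 0;
    a merging unit, configured to merge the N pieces of delta data into M delta blocks, wherein M is a positive integer less than N; and
    a sending unit, configured to send the M delta blocks and attribute information of the L matching blocks to the server, wherein the attribute information of the L matching blocks is used to merge the M delta blocks and the data blocks in the first file that are identical to each of the matching blocks into the second file.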
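  6. The client according to claim 5, characterized in that the attribute information of the L matching blocks comprises an identifier of the data block in the first file that is identical to each of the matching blocks, and an offset of each of the matching blocks in the second file,
    wherein the identifiers of the data blocks are used to determine the L data blocks in the first file that are identical to the L matching blocks, the offset is the difference between the start address of each of the matching blocks in the second file and the first address of the second file, and the offsets of the L matching blocks are used to merge the M delta blocks and the L data blocks into the second file.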
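  7. A server, characterized in that the server stores a first file and comprises:
    a receiving unit, configured to receive M delta blocks and attribute information of L matching blocks sent by a client, wherein the M delta blocks are obtained by merging N pieces of delta data of a second file to be saved to the server relative to the first file; the first file contains a data block identical to each of the matching blocks; the second file consists of the N pieces of delta data and the L matching blocks; N and L are both positive integers greater than 0, and M is a positive integer less than N; and
    a merging unit, configured to merge, according to the attribute information of the L matching blocks, the M delta blocks and the data blocks in the first file that are identical to each of the matching blocks into the second file.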
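  8. The server according to claim 7, characterized in that the attribute information of the L matching blocks comprises an identifier of the data block in the first file that is identical to each of the matching blocks, and an offset of each of the matching blocks in the second file, wherein the offset is the difference between the start address of the matching block in the second file and the first address of the second file; and
    the merging unit is specifically configured to:
    determine, according to the identifiers of the data blocks, the L data blocks in the first file that are identical to the L matching blocks; and
    merge, according to the offsets of the L matching blocks, the M delta blocks and the L data blocks into the second file.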
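  9. A client, characterized in that a server stores a first file, the client is configured to save a second file to the server, and the client comprises a processor, a transmitter, a receiver, and a communication bus, wherein the processor, the transmitter, and the receiver communicate with one another through the communication bus, and the processor is configured to:
    determine, in the second file, N pieces of delta data of the second file relative to the first file and L matching blocks, wherein the first file contains a data block identical to each of the matching blocks, the second file consists of the N pieces of delta data and the L matching blocks, and N and L are both positive integers greater than 0;
    merge the N pieces of delta data into M delta blocks, wherein M is a positive integer less than N; and
    send the M delta blocks and attribute information of the L matching blocks to the server, wherein the attribute information of the L matching blocks is used to merge the M delta blocks and the data blocks in the first file that are identical to each of the matching blocks into the second file.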
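  10. The client according to claim 9, characterized in that the attribute information of the L matching blocks comprises an identifier of the data block in the first file that is identical to each of the matching blocks, and an offset of each of the matching blocks in the second file,
    wherein the identifiers of the data blocks are used to determine the L data blocks in the first file that are identical to the L matching blocks, the offset is the difference between the start address of each of the matching blocks in the second file and the first address of the second file, and the offsets of the L matching blocks are used to merge the M delta blocks and the L data blocks into the second file.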
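  11. A server, characterized by comprising a memory, a processor, a transmitter, a receiver, and a communication bus, wherein the processor, the transmitter, and the receiver communicate with one another through the communication bus; the memory stores a first file; and the processor is configured to:
    receive M delta blocks and attribute information of L matching blocks sent by a client, wherein the M delta blocks are obtained by merging N pieces of delta data of a second file to be saved to the server relative to the first file; the first file contains a data block identical to each of the matching blocks; the second file consists of the N pieces of delta data and the L matching blocks; N and L are both positive integers greater than 0, and M is a positive integer less than N; and
    merge, according to the attribute information of the L matching blocks, the M delta blocks and the data blocks in the first file that are identical to each of the matching blocks into the second file.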
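  12. The server according to claim 11, characterized in that the attribute information of the L matching blocks comprises an identifier of the data block in the first file that is identical to each of the matching blocks, and the offsets of the L matching blocks in the second file, wherein the offset is the difference between the start address of the matching block in the second file and the first address of the second file; and
    the processor is specifically configured to:
    determine, according to the identifiers of the data blocks, the L data blocks in the first file that are identical to the L matching blocks; and
    merge, according to the offsets of the L matching blocks, the M delta blocks and the L data blocks into the second file.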
PCT/CN2016/106650 2015-12-09 2016-11-21 File delta transmission method and apparatus WO2017097106A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510909294.4 2015-12-09
CN201510909294.4A CN105554081B (zh) 2015-12-09 2015-12-09 File delta transmission method and apparatus

Publications (1)

Publication Number Publication Date
WO2017097106A1 true WO2017097106A1 (zh) 2017-06-15

Family

ID=55833013

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/106650 WO2017097106A1 (zh) 2015-12-09 2016-11-21 File delta transmission method and apparatus

Country Status (2)

Country Link
CN (1) CN105554081B (zh)
WO (1) WO2017097106A1 (zh)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
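CN105554081B (zh) * 2015-12-09 2019-01-18 华为技术有限公司 File delta transmission method and apparatus
CN105871629B (zh) * 2016-05-30 2019-11-15 自连电子科技(上海)有限公司 Method and system for transmitting data by Internet of Things devices
CN105959407A (zh) * 2016-06-27 2016-09-21 乐视控股(北京)有限公司 Data upload method and apparatus
CN107480267A (zh) * 2017-08-17 2017-12-15 无锡清华信息科学与技术国家实验室物联网技术中心 Method for improving file differential synchronization speed by exploiting locality
CN109067924A (zh) * 2018-09-26 2018-12-21 东莞华贝电子科技有限公司 File transmission method and apparatus
CN113868013A (zh) * 2020-06-30 2021-12-31 华为技术有限公司 Data transmission method, system, apparatus, device, and medium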

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102685159A (zh) * 2011-03-10 2012-09-19 腾讯科技(深圳)有限公司 File transmission method and apparatus
CN103685509A (zh) * 2013-12-12 2014-03-26 深圳市彩讯科技有限公司 File delta synchronization method
US20140250067A1 (en) * 2013-03-04 2014-09-04 Vmware, Inc. Cross-file differential content synchronization using cached patches
CN105554081A (zh) * 2015-12-09 2016-05-04 华为技术有限公司 File delta transmission method and apparatus

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101232501A (zh) * 2007-12-10 2008-07-30 腾讯科技(深圳)有限公司 Improved system and method for sending multiple files

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102685159A (zh) * 2011-03-10 2012-09-19 腾讯科技(深圳)有限公司 File transmission method and apparatus
US20140250067A1 (en) * 2013-03-04 2014-09-04 Vmware, Inc. Cross-file differential content synchronization using cached patches
CN103685509A (zh) * 2013-12-12 2014-03-26 深圳市彩讯科技有限公司 File delta synchronization method
CN105554081A (zh) * 2015-12-09 2016-05-04 华为技术有限公司 File delta transmission method and apparatus

Also Published As

Publication number Publication date
CN105554081A (zh) 2016-05-04
CN105554081B (zh) 2019-01-18


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16872298

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16872298

Country of ref document: EP

Kind code of ref document: A1