WO2006049581A1 - A method to transmit and update a transmitted electronic document - Google Patents

A method to transmit and update a transmitted electronic document

Info

Publication number
WO2006049581A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
document
block
data blocks
blocks
Prior art date
Application number
PCT/SG2005/000080
Other languages
French (fr)
Inventor
Shou Kwong Richard Fam
Original Assignee
Dramtech (Asia Pacific) Pte Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dramtech (Asia Pacific) Pte Ltd
Publication of WO2006049581A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/178 Techniques for file synchronisation in file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/123 Storage facilities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/131 Fragmentation of text files, e.g. creating reusable text-blocks; Linking to fragments, e.g. using XInclude; Namespaces
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/194 Calculation of difference between files

Definitions

  • a method of storing data at a remote location and keeping it readily accessible is to transmit the data through a network and to have it stored in a computer's hard disk at a remote site.
  • the data needs to remain encrypted both during transmission and when it resides at the remote site, as the transmission network (for example the internet, a wide area network (WAN), a local area network (LAN), etc.) and even the remote site itself (for example a 3rd-party hosting centre) may not be secure.
  • existing tools are used to synchronize unencrypted documents over a network.
  • the unencrypted amendments are synchronized with the transmitted documents to update the amendments.
  • the transmitted document is first decrypted, before synchronization with the amended document.
  • the updated documents are then encrypted again for security.
  • it will be necessary to set up a secure network route or infrastructure prior to synchronization, as the amendments are not encrypted during transmission and are therefore exposed to security threats. Again, this is time consuming and resource intensive, especially for large electronic documents like e-mails. More importantly, data security is compromised during transmission, as the documents are not encrypted prior to and during transmission.
  • a method to securely transmit a document to a remote storage location including partitioning the document into a plurality of distinct data blocks, said data block having a pre-determined block length, categorizing each distinct data block into one of two categories - incomplete and new, said incomplete data block having a data length less than a pre-determined block length, and said new block having a data length equal to the pre-determined block length, assigning a digital signature to each classified data block, preparing a signature table to tabulate the assigned digital signatures, storing the assigned digital signatures to the prepared signature table, preparing a delta file to receive the data of the data blocks and finally transmitting the data blocks to the remote storage location.
  • the method further includes encrypting the data blocks prior to transmitting the data blocks to the remote storage location.
  • This is to provide further security to the documents when transmitting through a network.
  • a method to securely incorporate amendments to a transmitted document including partitioning the amended document into a plurality of distinct data blocks, said data block having a pre-determined block length, preparing a subsequent signature table, taking the assigned signatures and storing them in the subsequent signature table, identifying the data blocks that have been amended by identifying identical signatures in the subsequent signature table and the signature table of the original document, identifying complete blocks of data in the amended document, classifying the identified complete blocks of data as new, computing and assigning signatures to the classified new data blocks, placing the assigned signatures into the subsequent signature table, classifying the remaining data blocks as incomplete, preparing an amendment delta file, preparing a delta file to receive the data of the data blocks, transmitting only the amended data blocks to the remote storage location and integrating the transmitted amended data blocks with the transmitted documents, thereby incorporating the amendments.
  • the method further includes encrypting the data prior to transmission of the data blocks to the remote storage location.
  • the method of classifying the data blocks includes classifying a data block as processed where the data block has an assigned digital signature.
  • the method of classifying the data blocks includes classifying a data block as new where the data block has no assigned digital signature, and has a data length equal to the pre-determined data length.
  • the method of classifying the data blocks includes classifying a data block as incomplete where the data block has no assigned digital signature, and has a data length less than the pre-determined data length.
  • Figure 1 shows the setup for performing the method of transmitting a document
  • Figure 2 shows a flowchart illustrating a method of transmitting a document securely to a remote storage location for the first time
  • Figures 3A, 3B and 4 show the partitioning of the data blocks, the assigning of signatures and the transmission of data of the document;
  • Figure 5 shows a flowchart illustrating a method of incorporating amendments to a transmitted document at a remote storage location
  • Figure 6 shows how the data blocks are identified and classified
  • Figures 7A, 7B show the respective signature tables of the original document and the amended document
  • Figure 8 shows the format of the amendment delta file 700 to be transmitted by the user
  • Figure 9 shows a flowchart illustrating the method of generating a delta file from an amended document
  • Figure 10 shows the construction of a new encoded file from the existing encoded file
  • Figure 11 is a flowchart illustrating a method of constructing a new encoded file from the delta file.
  • Figure 1 shows an information network 105 suitable for implementing the methods according to the present invention.
  • the information network 105 includes a user computer 101 connected to a remote storage location shown in Figure 1 as a server computer 103 through a network 102. Documents prepared at a user computer 101 will be subsequently transmitted through the network 102 to the server computer 103.
  • the user computer 101 may be any programmable device, for example, a general-purpose computer, a programmable hand phone, a personal digital assistant (PDA) or any other programmable digital appliance.
  • the server computer 103 as shown in Figure 1 is a general-purpose computer.
  • the network 102 may be a Local Area Network, Wide Area Network, Intranet, Internet, etc.
  • the document is first partitioned and categorized before encrypting and transmitting to the server 103.
  • Figure 2 depicts a flow chart illustrating a method according to the present invention for processing a document for its secure transmission to the server 103.
  • a pre-determined block size of L bytes is first determined. Once determined, this block size L is maintained throughout the method of the invention.
  • the pre-determined block size L may be determined based on the electronic size of each individual document which is to be transmitted, so that, for example, a larger sized document may have longer data blocks, and vice versa. In another variation, the pre-determined block size L may be determined based on the type of document, for example, information intensive documents like spreadsheets, graphics or documents with text.
  • Once the block size L is determined, the document is partitioned into blocks of L bytes. The programme then proceeds to classify these data blocks as "new" or "incomplete".
  • a data block is classified as "new" if its partitioned block size is equal to the pre-determined block size L, and classified as "incomplete" if its partitioned block size is less than the pre-determined block size L. As this is the first time these data blocks are processed, the programme classifies the full-length data blocks as "new". Where the size of the document is not an exact multiple of L bytes, the last block will be shorter than the pre-determined block length L and is hence classified as an "incomplete" block.
  • a signature table 210 is then prepared so that the digital signatures, once computed and assigned, may be tabulated.
  • a digital signature 20 is computed and assigned for each data block 10 classified as "new", using, for example, an algorithm such as 128-bit MD5 (message digest).
  • the digital signature 20 is computed for all "new" blocks and these signatures are stored in the prepared signature table 210. No digital signature is computed or assigned for an "incomplete" block.
  • the data contained in all the data blocks, i.e. blocks classified as either "new" or "incomplete", is encoded using any method and appended to a delta file; finally, the delta file is transmitted to the server computer.
  • Figure 3A shows a document 200 that has been partitioned into data blocks arbitrarily termed A1 through to A6 using any known conventional data partitioning methods.
  • the document is partitioned into the pre-determined block size L, for example, 8192 bytes.
  • the document size is not an exact multiple of the pre-determined block size.
  • the final data block A6 may fall short of the pre-determined block size.
  • data block A6 is classified as an "incomplete" block. All the other complete data blocks 10, arbitrarily termed A1 through to A5, are of the pre-determined block size L.
  • data blocks A1 through A5 are classified as "new".
  • a digital signature 20, for example of 128 bits, is computed for each block classified as "new". Therefore, digital signatures 20 are computed and assigned to data blocks A1 through to A5.
  • the digital signatures 20 are computed and assigned such that they are unique. This may be achieved by adopting one of several commonly known algorithms used in cryptography. An example of one such algorithm is the 128-bit MD5 message digest.
  • the digital signatures 20 and their assigned corresponding data blocks 10 are stored in a signature table 210, as seen in Figure 3B.
  • This signature table 210 functions as a digital fingerprint or as an electronic identity of the document 200.
  • Digital signatures 20 are computed and assigned only for data blocks categorized as "new", and therefore a digital signature is not computed for block A6, categorized as an "incomplete" block, as seen in Figure 3B.
  • Next, as seen in Figure 4, the data contained in the "new" and "incomplete" blocks is encoded and placed in a delta file 301 to be transmitted to the server 103 through the network 102.
  • the encoding of data is optional. To enhance security of transmission, encoding may be performed by encrypting the contents with any of the encryption algorithms commonly found in the market.
  • the form of encoding will depend on several factors. For instance, if security of the document 200 is of concern, then the encoding takes the form of an encryption. If the document is not previously in a compressed format, then encoding takes the form of data compression. Any other form of encoding can be applied provided that any chosen conventional algorithm used to encode the data, will not result in a loss in data integrity when the document is subsequently decoded.
  • the next step is to transmit the delta file 301 to the server computer 103 where the encoded data will be extracted and stored.
  • encoding may involve compressing the document with any commonly known compression algorithm. Encoding may also involve first compressing and then encrypting the content of each data block. It is to be understood that the present invention is not limited to the type or choice of algorithm for compression and/or encryption, and the invention may be performed with the use of any known encoding method, with a requirement that documents or data encoded may similarly be decoded with no loss in data integrity. It is further to be understood that the terms adopted in this specification ("incomplete", "new" and "processed") are purely arbitrary, and have been chosen as examples. Other terms may also be adopted, for example "partial" instead of "incomplete", "unprocessed" instead of "new", and "executed" instead of "processed", without departing from the spirit of the invention.
  • Figure 4 shows the layout of the delta file 301 prepared for the document as described in the method of Figure 2.
  • the delta file 301 is a file that includes instructions to the system to prepare the data blocks prior to transmission to the remote location, or server 103.
  • This delta file 301 is prepared and contains instructions to insert all the data contents of the document 200 carried by the delta file 301, and is transmitted by a user to the server 103.
  • the instructions contained in the delta file 301 include the instructions necessary to enable the server 103 to reconstruct the encoded document 202 as data blocks B1 to B6, arbitrarily named, representing the encoded, transmitted form of data blocks A1 to A6.
  • the data blocks B1 to B6 form the encoded data file 202 constructed at the server from the contents of the document 200. Therefore, the data blocks A1 to A6 of the document 200 are encoded and transmitted to form the encoded data file made up of the corresponding data blocks B1 to B6. This is done sequentially from the first block A1 to the last block A6.
  • Figure 5 shows a method where amendments made to the original document 200 are identified and prepared prior to transmission to the server 103.
  • a signature table (seen in Figure 7B) is prepared for the amended document 205, known as an amendment signature table 602.
  • Data blocks of the amended document 205 with computed signatures found in the signature table 210 of the original document 200 are first identified ("processed" data blocks), and their assigned signatures are stored in the subsequent signature table 602. These data blocks do not need to be prepared for transmission, as they have previously been transmitted. This saves time, and is an advantage of the present invention.
  • complete data blocks of block size L are identified in the amended document 205. These data blocks are classified as "new". A digital signature is computed and assigned to each "new" data block, and these signatures are stored in the subsequent signature table 602. The remaining data blocks are classified as "incomplete".
  • the original document 200 is partitioned into blocks A1 through to A6 and the signatures assigned to each block are taken and stored into a signature table 210, as shown in Figure 3, in accordance with the steps outlined in the flow chart depicted in Figure 2.
  • the programme first identifies which blocks in the amended document 205 relate to the original document 200.
  • its digital signature can be found in the signature table 210.
  • there are 3 blocks common to both the original document 200 and the amended document 205, as they have identical digital signatures, i.e. blocks A3 and D2 have an identical digital signature S(A3),
  • blocks A2 and D5 have an identical digital signature S(A2), and
  • blocks A4 and D6 have an identical digital signature S(A4).
  • Data blocks D2, D5 and D6 of the amended document 205 are classified as "processed" blocks. They are classified thus as they have been previously processed and their encoded data can be found in the server computer 103, making it unnecessary to transmit this information again.
  • the subsequent signature table 602 of the amended document 205 is updated with the digital signatures of the "processed" blocks, i.e. S(A3) for block D2, S(A2) for block D5, and S(A4) for block D6. Having identified the blocks common to both the original document 200 and the amended document 205, the next step is to identify complete blocks of data in the amended document 205, i.e. blocks with the previously determined data size of L bytes.
  • Block D3 508 is one such block, and its digital signature S(D3) is computed, assigned and placed in the subsequent signature table 602.
  • This data block is classified as a "new" block, as its signature is not found in the signature table 210 and its contents are recognised as not having previously been transmitted to the server 103; they are sent through the amendment delta file 700.
  • the remaining blocks D1, D4 and D7 are classified as "incomplete" blocks, as their data sizes are less than the pre-determined block size of L bytes. These "incomplete" blocks have not been previously assigned digital signatures and therefore, these signatures are not present in the signature table 602. Nevertheless, the content of the "incomplete" blocks represents new information that must be transmitted to the server 103 via the amendment delta file 700.
  • Figure 8 shows the format of the amendment delta file 700 to be transmitted by the user to the server 103 to update the existing encoded document at the server 103 to reflect the amendments made to the document 200 at the client computer 101.
  • the amendment delta file 700 comprises a sequence of instructions for the server computer 103 to execute. There are basically two instructions - either (i) to insert data carried by the delta amendment file 700 or (ii) to copy data that is already on record in the server computer 103 in order to update the encoded document at the server end.
  • the amendment delta file 700 is constructed after having identified the types of blocks present in the amended document 205.
  • D1 is an "incomplete" block with no assigned digital signature. Its data is encoded and appended to the delta file 701 with instruction (i) for the server computer 103 to insert this data into the updated encoded document file.
  • D2 is a "processed" block and its data is common to A3 (block 3) of the original document 200. Hence, instruction (ii) is called, and its entry in the amendment delta file 702 is an instruction for the server computer 103 to use block 3 of the current encoded document to update the new encoded document.
  • D3 is a "new" block, and therefore instruction (i) is called.
  • D4 and D7 are "incomplete" blocks and are processed in the same fashion as block D1, calling instruction (i).
  • D5 and D6 are "processed" blocks and are processed in the same fashion as block D2, i.e. calling instruction (ii).
  • Figure 9 is a flowchart illustrating the method of generating a delta file from an amended document 205.
  • the data blocks of the amended document are first classified or identified as "processed", "new" and "incomplete" blocks of the amended document.
  • a new delta file and a signature table are first created for the amended document.
  • the data blocks classified as "processed" blocks are the first blocks to be identified, as their signatures will match those in the signature table belonging to the original document 200.
  • the blocks are partitioned according to the basic pre-determined block length of L bytes of data. These data blocks are classified as "new" blocks, as they have not previously been processed.
  • the remaining blocks are classified as "incomplete", as their data sizes are smaller than the basic pre-determined block size of L bytes.
  • each block is considered in turn. If the block is classified as a "processed" block, then the newly created signature table is updated with the data block's digital signature (which can be found in the old signature table). The delta file is appended with the instruction to copy data from the block number found in the old signature table. There is no need to insert the block's data into the delta file for transmission, as its data is already on record in the server computer 103. If the data block is classified as a "new" block, then the block's signature is computed and tabulated into the new signature table. The block's data is then encoded and appended to the delta file for transmission to the server computer, where it will be stored.
  • If this block is found in subsequent modifications to the document, then it will be classified as a "processed" block and its data need not be re-transmitted to the server computer. If the block under consideration is neither a "processed" nor a "new" block, then it must be an "incomplete" block. The block's data is then encoded and appended to the delta file for transmission to the server computer, where it will be stored. Note that the signature of an "incomplete" block is neither computed nor recorded, and as such the block can never be identified in subsequent modifications to the document. Having processed the final block of the amended document 205, the current signature file is then discarded and replaced by the new signature file. The construction of the delta file is now complete and it can be transmitted to the server computer for processing.
  • Figure 10 shows the construction of the new encoded file 902 from the existing encoded file 903 found in the server and the delta file 901 received from the client.
  • the delta file, reproduced from step 700, comprises a set of instructions.
  • New data (comprising the contents of "new" and "incomplete" blocks) is inserted into the new data file as illustrated in steps 904, 906, 907 and 910.
  • Data that has been previously processed (the contents of "processed" blocks) is copied from the existing data file to the new data file as illustrated in steps 905, 908 and 909.
  • Figure 11 is a flowchart illustrating a method according to the present invention of constructing a new encoded file from the delta file and existing encoded file by the server computer 103.
  • Each block in the delta file is processed sequentially, starting from the first data block and ending with the last data block. If the block under consideration contains a "copy" instruction (ii), then this is a "processed" block and its data is extracted from the existing encoded data file and appended to the new encoded data file 1005. If the block under consideration contains an "insert" instruction (i), then this is either a "new" or an "incomplete" block and its data is appended to the new encoded data file 1006. After the last block of the delta file has been processed, the newly encoded data file replaces the existing encoded data file 1008.
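
The amendment steps listed above (identify "processed" blocks by signature, split the remaining data into "new" and "incomplete" blocks, emit copy or insert instructions, then rebuild the file at the server) can be sketched roughly as follows. This is only an illustration under several assumptions that are not in the source: the names `sign`, `build_amendment_delta` and `apply_delta` are hypothetical, block numbers are 0-based rather than the 1-based numbering used in the text, the optional encoding step is left out, and matches are found by recomputing MD5 at every byte offset, which is simple but slow.

```python
import hashlib

def sign(block: bytes) -> bytes:
    """128-bit MD5 digest used as the block signature."""
    return hashlib.md5(block).digest()

def build_amendment_delta(amended: bytes, old_table: dict, block_len: int):
    """old_table maps signature -> block number in the file stored at the server.
    Returns (new_table, delta); delta is a list of ("copy", n) / ("insert", data)."""
    new_table = {}          # signature -> block number in the rebuilt file
    delta = []
    pending = bytearray()   # bytes not covered by any "processed" block

    def flush():
        # Split unmatched bytes into full-length "new" blocks plus an
        # "incomplete" remainder; both are sent with an insert instruction.
        for i in range(0, len(pending), block_len):
            chunk = bytes(pending[i:i + block_len])
            if len(chunk) == block_len:          # "new": signature recorded
                new_table.setdefault(sign(chunk), len(delta))
            delta.append(("insert", chunk))      # "incomplete": no signature
        pending.clear()

    pos = 0
    while pos < len(amended):
        window = amended[pos:pos + block_len]
        sig = sign(window) if len(window) == block_len else None
        if sig in old_table:                     # "processed" block found
            flush()
            new_table.setdefault(sig, len(delta))
            delta.append(("copy", old_table[sig]))
            pos += block_len
        else:
            pending.append(amended[pos])
            pos += 1
    flush()
    return new_table, delta

def apply_delta(existing_blocks: list, delta: list) -> list:
    """Server side: build the new file from the existing blocks and the delta."""
    new_blocks = []
    for op, arg in delta:
        new_blocks.append(existing_blocks[arg] if op == "copy" else arg)
    return new_blocks

# Toy run with L = 4 that mirrors the D1..D7 example in the text.
L = 4
original = b"AAAABBBBCCCCDDDDEE"
stored = [original[i:i + L] for i in range(0, len(original), L)]
old_table = {sign(b): i for i, b in enumerate(stored) if len(b) == L}
amended = b"xCCCCyyyyzBBBBDDDDw"
new_table, delta = build_amendment_delta(amended, old_table, L)
rebuilt = apply_delta(stored, delta)
print([op for op, _ in delta])
```

In this toy run the seven delta entries correspond to D1 through D7: three copies reuse stored blocks, and only the six bytes carried by the four insert entries would need to be transmitted.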

Landscapes

  • Engineering & Computer Science
  • Theoretical Computer Science
  • Physics & Mathematics
  • General Engineering & Computer Science
  • General Physics & Mathematics
  • Audiology, Speech & Language Pathology
  • Computational Linguistics
  • General Health & Medical Sciences
  • Health & Medical Sciences
  • Artificial Intelligence
  • Data Mining & Analysis
  • Databases & Information Systems
  • Information Transfer Between Computers

Abstract

The invention relates to a method to securely transmit a document to a storage location, including partitioning the document into data blocks, categorizing each data block, assigning a digital signature to each classified data block, storing the assigned digital signatures to a signature table, preparing a delta file to receive the data of the data blocks and transmitting the data blocks. The invention further provides a method to securely transmit amendments to a document, including partitioning the amended document into data blocks, preparing a subsequent signature table, taking the assigned signatures and storing them in the subsequent signature table, identifying the data blocks that have been amended by identifying identical signatures in the subsequent signature table and the signature table of the original document, identifying and classifying the identified complete blocks of data as new, computing and assigning signatures to the classified new data blocks, placing the signatures into the subsequent signature table, classifying the remaining data blocks as incomplete, preparing an amendment delta file, preparing a delta file to receive the data of the data blocks, and transmitting only the amended data blocks to integrate them with the transmitted documents, thereby incorporating the amendments.

Description

A METHOD TO TRANSMIT AND UPDATE A TRANSMITTED ELECTRONIC DOCUMENT
BACKGROUND OF THE INVENTION
Today's commercial environment is moving towards a paperless system. In this system, commercial documents or data are usually electronically stored on a local database environment. However, it is increasingly important to provide a backup system at a remote location. This is advantageous so that in the event of a disaster where the locally stored data may become unreadable or un-retrievable, (for example due to a fire, a flood, or a computer virus attack on the local database, etc) electronic information will not be lost, and an organization can proceed to recover its data from the remote location.
Traditionally, a method of storing data at a remote location and keeping it readily accessible is to transmit the data through a network and to have it stored in a computer's hard disk at a remote site.
It is often critical for businesses to maintain the confidentiality and integrity of their electronic data. Therefore, the data needs to remain encrypted both during transmission and when it resides at the remote site, as the transmission network (for example the internet, a wide area network (WAN), a local area network (LAN), etc.) and even the remote site itself (for example a 3rd-party hosting centre) may not be secure.
The problem faced with the traditional method of storing data is that when a transmitted document is amended or modified, the entire document will have to be encrypted and re-transmitted to the remote site. This is regardless of the extent of amendment or modification made on the transmitted document. This is time consuming. As some businesses rely on timely data, this method of storing data is unacceptable.
In another method, existing tools are used to synchronize unencrypted documents over a network. In this method the unencrypted amendments are synchronized with the transmitted documents to update the amendments. As the existing tools can only synchronize unencrypted documents, the transmitted document is first decrypted, before synchronization with the amended document. Once synchronization is complete, the updated documents are then encrypted again for security. In this method, it will be necessary to set up a secure network route or infrastructure prior to synchronization, as the amendments are not encrypted during transmission and are therefore exposed to security threats. Again, this is time consuming and resource intensive, especially for large electronic documents like e-mails. More importantly, data security is compromised during transmission, as the documents are not encrypted prior to and during transmission.
It is an object of the present invention to overcome or at least ameliorate one or more of the above problems in the prior art.
Discussion of any one of the prior art mentioned above is not to be taken as an admission of the state of common general knowledge of the skilled addressee.
SUMMARY OF THE INVENTION
According to the invention, there is provided a method to securely transmit a document to a remote storage location, including partitioning the document into a plurality of distinct data blocks, said data block having a pre-determined block length, categorizing each distinct data block into one of two categories - incomplete and new, said incomplete data block having a data length less than a pre-determined block length, and said new block having a data length equal to the pre-determined block length, assigning a digital signature to each classified data block, preparing a signature table to tabulate the assigned digital signatures, storing the assigned digital signatures to the prepared signature table, preparing a delta file to receive the data of the data blocks and finally transmitting the data blocks to the remote storage location.
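As a rough illustration of this first-time transmission, the sketch below partitions a document, records an MD5 signature for each full-length ("new") block, and places every block's data in a delta file of insert instructions. The function names, the in-memory representation of the signature table and delta file, and the 0-based block numbers are assumptions made for the example; the encoding step (compression or encryption) is omitted.

```python
import hashlib

def partition(document: bytes, block_len: int) -> list[bytes]:
    """Split the document into consecutive blocks of block_len bytes.
    The last block may be shorter."""
    return [document[i:i + block_len] for i in range(0, len(document), block_len)]

def first_transmission(document: bytes, block_len: int):
    """Return (signature_table, delta) for a document sent for the first time."""
    signature_table = {}  # MD5 digest -> 0-based block number
    delta = []            # one insert instruction per block
    for number, block in enumerate(partition(document, block_len)):
        if len(block) == block_len:
            # "new" block: full length, so a signature is computed and tabulated
            signature_table.setdefault(hashlib.md5(block).digest(), number)
        # "incomplete" blocks get no signature, but their data is still sent
        delta.append(("insert", block))  # encoding omitted in this sketch
    return signature_table, delta

table, delta = first_transmission(b"AAAABBBBCC", block_len=4)
print(len(table), [data for _, data in delta])
# prints: 2 [b'AAAA', b'BBBB', b'CC']
```

With a block length of 4, the ten-byte document yields two "new" blocks with signatures and one two-byte "incomplete" block without one, and all three are transmitted.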
Preferably, the method further includes encrypting the data blocks prior to transmitting the data blocks to the remote storage location. This is to provide further security to the documents when transmitting through a network.
According to another aspect of the invention, there is provided a method to securely incorporate amendments to a transmitted document including partitioning the amended document into a plurality of distinct data blocks, said data block having a pre-determined block length, preparing a subsequent signature table, taking the assigned signatures and storing them in the subsequent signature table, identifying the data blocks that have been amended by identifying identical signatures in the subsequent signature table and the signature table of the original document, identifying complete blocks of data in the amended document, classifying the identified complete blocks of data as new, computing and assigning signatures to the classified new data blocks, placing the assigned signatures into the subsequent signature table, classifying the remaining data blocks as incomplete, preparing an amendment delta file, preparing a delta file to receive the data of the data blocks, transmitting only the amended data blocks to the remote storage location and integrating the transmitted amended data blocks with the transmitted documents, thereby incorporating the amendments.
Preferably, the method further includes encrypting the data prior to transmission of the data blocks to the remote storage location.
Still preferably the method of classifying the data blocks includes classifying a data block as processed where the data block has an assigned digital signature.
Preferably, the method of classifying the data blocks includes classifying a data block as new where the data block has no assigned digital signature, and has a data length equal to the pre-determined data length.
Preferably, the method of classifying the data blocks includes classifying a data block as incomplete where the data block has no assigned digital signature, and has a data length less than the pre-determined data length.
DESCRIPTION OF FIGURES
In order that the invention might be more fully understood, embodiments of the invention will be described by way of example only, with reference to the accompanying drawings, in which:
Figure 1 shows the setup for performing the method of transmitting a document;
Figure 2 shows a flowchart illustrating a method of transmitting a document securely to a remote storage location for the first time;
Figures 3A, 3B and 4 show the partitioning of the data blocks, the assigning of signatures and the transmission of data of the document;
Figure 5 shows a flowchart illustrating a method of incorporating amendments to a transmitted document at a remote storage location;
Figure 6 shows how the data blocks are identified and classified;
Figures 7A, 7B show the respective signature tables of the original document and the amended document;
Figure 8 shows the format of the amendment delta file 700 to be transmitted by the user;
Figure 9 shows a flowchart illustrating the method of generating a delta file from an amended document;
Figure 10 shows the construction of a new encoded file from the existing encoded file; and
Figure 11 is a flowchart illustrating a method of constructing a new encoded file from the delta file.
The figures are not necessarily drawn to scale.
DETAILED DESCRIPTION OF THE INVENTION
Reference will now be made in detail to the preferred embodiments of the invention, examples of which are illustrated in the accompanying drawings. The preferred embodiments of the invention are not intended to limit the invention in its broadest aspect to these embodiments. On the contrary, the invention is intended to cover alternatives, modifications and equivalents, which may be included within the spirit and scope of the invention as defined by the appended claims. Furthermore, in the following detailed description of the embodiments, numerous specific details are set forth in order to provide an understanding of the present embodiments.
Referring to the drawings, Figure 1 shows an information network 105 suitable for implementing the methods according to the present invention. The information network 105 includes a user computer 101 connected to a remote storage location, shown in Figure 1 as a server computer 103, through a network 102. Documents prepared at a user computer 101 will be subsequently transmitted through the network 102 to the server computer 103. The user computer 101 may be any programmable device, for example, a general-purpose computer, a programmable hand phone, a personal digital assistant (PDA) or any other programmable digital appliance. The server computer 103, as shown in Figure 1, is a general-purpose computer. The network 102 may be a Local Area Network, Wide Area Network, Intranet, Internet, etc.
According to the method of the invention, the document is first partitioned and categorized before encrypting and transmitting to the server 103.
Figure 2 depicts a flow chart illustrating a method according to the present invention for processing a document for its secure transmission to the server 103. A pre-determined block size of L bytes is first determined. Once determined, this block size L is maintained throughout the method of the invention. The pre-determined block size L may be determined based on the electronic size of each individual document which is to be transmitted, so that, for example, a larger sized document may have longer data blocks, and vice versa. In another variation, the pre-determined block size L may be determined based on the type of document, for example, information intensive documents like spreadsheets, graphics or documents with text. Once the block size L is determined, the document is partitioned into equal blocks of L bytes. The programme then proceeds to classify these data blocks as "new" or "incomplete". A data block is classified as "new" if its partitioned block size is equal to the pre-determined block size L, and classified as "incomplete" if its partitioned block size is less than the pre-determined block size L. As it is the first time these data blocks are processed, the programme classifies these data blocks as "new". In the situation where the size of the document is not an exact multiple of L bytes, the last block will be shorter than the pre-determined block length L and hence classified as an "incomplete" block.
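The partitioning and classification step described above can be sketched as follows. This is a minimal illustration only, not the applicant's implementation: the function name partition, the tuple layout and the block size of 4 bytes used in the example are assumptions made for readability.

```python
from typing import List, Tuple


def partition(document: bytes, block_size: int) -> List[Tuple[str, bytes]]:
    """Split a document into consecutive blocks of block_size bytes (L).

    Each block is labelled "new" when it is exactly block_size bytes long,
    and "incomplete" when it is shorter (only the last block can be).
    """
    if block_size <= 0:
        raise ValueError("block_size must be a positive number of bytes")
    blocks = []
    for offset in range(0, len(document), block_size):
        chunk = document[offset:offset + block_size]
        label = "new" if len(chunk) == block_size else "incomplete"
        blocks.append((label, chunk))
    return blocks


# A 10-byte document with L = 4 yields two full blocks and one short block.
for label, chunk in partition(b"abcdefghij", 4):
    print(label, chunk)
```

When the document length is an exact multiple of L, the loop produces no "incomplete" block at all, which matches the description of the last block being short only when the size is not an exact multiple.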
A signature table 210 is then prepared so that digital signatures computed and assigned, may be tabulated.
A digital signature 20 is computed and assigned for each data block 10 classified as "new", using, for example, an algorithm such as the 128-bit MD5 message digest. The digital signature 20 is computed for all "new" blocks and these signatures are stored in the prepared signature table 210. There is no digital signature computed and assigned for an "incomplete" block. Next, the data contained in all the data blocks, i.e., blocks classified as either "new" or "incomplete", are encoded using any method and appended to a delta file, and finally the delta file is transmitted to the server computer.
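A first transmission along these lines might look like the sketch below. It assumes the signature table is a mapping from an MD5 hex digest to a 1-based block number, and that the delta file is simply an ordered list of block payloads; both layouts, and the name first_transmission, are hypothetical. Encoding is left out here because the description treats it as optional.

```python
import hashlib
from typing import Dict, List, Tuple


def first_transmission(document: bytes, block_size: int) -> Tuple[Dict[str, int], List[bytes]]:
    """Build the signature table and the delta file for a new document."""
    signature_table: Dict[str, int] = {}  # MD5 hex digest -> block number
    delta_file: List[bytes] = []          # payloads in document order
    offsets = range(0, len(document), block_size)
    for number, offset in enumerate(offsets, start=1):
        block = document[offset:offset + block_size]
        if len(block) == block_size:
            # "new" block: compute and tabulate its 128-bit MD5 signature.
            # Assumption: if two blocks are identical, keep the first number.
            signature_table.setdefault(hashlib.md5(block).hexdigest(), number)
        # "new" and "incomplete" blocks alike are appended to the delta file.
        delta_file.append(block)
    return signature_table, delta_file


table, delta = first_transmission(b"AAAABBBBCCCCDD", 4)
print(len(table), "signatures,", len(delta), "blocks in the delta file")
```

The short final block is carried in the delta file but never appears in the signature table, as the paragraph above requires.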
The method as described in Figure 2 will be illustrated in the following figures 3A, 3B and 4.
Applying the method as described above, Figure 3A shows a document 200 that has been partitioned into data blocks arbitrarily termed A1 through to A6 using any known conventional data partitioning methods. The document is partitioned into the pre-determined block size L, for example, 8192 bytes.
It is quite likely that the document size is not an exact multiple of the pre-determined block size. In this example, the final data block A6 may fall short of the pre-determined block size. In this situation, data block A6 is classified as an "incomplete" block. All the other complete data blocks 10, arbitrarily termed A1 through to A5, are of the pre-determined block size, L. As the document 200 has not been previously processed, data blocks A1 through A5 are classified as "new". Next, following the method described in Figure 2, a digital signature 20, for example of 128 bits, is computed for each classified "new" block. Therefore, digital signatures 20 are computed and assigned to data blocks A1 through to A5.
The digital signatures 20 are computed and assigned such that they are unique. This may be achieved by adopting any of several commonly known algorithms used in cryptography. An example of one such algorithm is the 128-bit MD5 message digest. The digital signatures 20 and their assigned corresponding data blocks 10 are stored in a signature table 210, as seen in Figure 3B. This signature table 210 functions as a digital fingerprint or as an electronic identity of the document 200. Digital signatures 20 are computed and assigned only for data blocks categorized as "new", and therefore a digital signature is not computed for block A6 categorized as an "incomplete" block as seen in Figure 3B. Next, as seen in Figure 4, the data contained in the "new" and "incomplete" blocks are encoded and placed in a delta file 301 to be transmitted to the server 103 through the network 102. The encoding of data is optional. To enhance security of transmission, encoding may be performed by encrypting the contents with any of the encryption algorithms found commonly in the market.
The form of encoding will depend on several factors. For instance, if security of the document 200 is of concern, then the encoding takes the form of an encryption. If the document is not previously in a compressed format, then encoding takes the form of data compression. Any other form of encoding can be applied provided that any chosen conventional algorithm used to encode the data, will not result in a loss in data integrity when the document is subsequently decoded. The next step is to transmit the delta file 301 to the server computer 103 where the encoded data will be extracted and stored.
If network bandwidth is of concern, encoding may involve compressing the document with any commonly known compression algorithm. Encoding may also involve first compressing followed by encrypting of each data block content. It is to be understood that the present invention is not limited to the type or choice of algorithm for compression and/or encryption, and the invention may be performed with the use of any known encoding method, with a requirement that documents or data encoded may similarly be decoded with no loss in data integrity. It is further to be understood that the terms adopted in this specification - "incomplete", "new" and "processed" are purely arbitrary, and have been termed as such, as an example. Other terms may also be adopted, for example, the use of the term "partial" instead of "incomplete", "unprocessed" instead of "new", and "executed" instead of "processed", without departing from the spirit of the invention.
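The requirement that encoded data be decodable with no loss in data integrity can be illustrated with a round trip through a lossless compressor. The sketch below uses zlib purely as an example of "any commonly known compression algorithm"; the specification does not name one. The function names encode and decode are assumptions, and an encryption step, which the description allows after compression, is omitted.

```python
import zlib


def encode(block: bytes) -> bytes:
    """Encode one data block by lossless compression (zlib, as an example)."""
    return zlib.compress(block)


def decode(encoded: bytes) -> bytes:
    """Recover the original block contents from its encoded form."""
    return zlib.decompress(encoded)


# An 8192-byte block of repetitive data, such as a spreadsheet might contain.
block = b"spreadsheet row " * 512
encoded = encode(block)

# The round trip must reproduce the block exactly.
print(decode(encoded) == block, len(block), len(encoded))
```

Any encoder could be substituted for zlib here, provided the same round-trip property holds.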
Figure 4 shows the layout of the delta file 301 prepared for the document as described in the method of Figure 2.
The delta file 301 is a file prepared that includes instructions to the system, to prepare the data blocks prior to transmission to the remote location, or server 103.
This delta file 301 contains instructions to insert all the data contents of the document 200 carried by the delta file 301, and is transmitted by a user to the server 103. The instructions contained in the delta file 301 include those necessary to enable the server 103 to reconstruct the encoded document 202 as data blocks B1 to B6, arbitrarily named, representing the encoded, transmitted form of data blocks A1 to A6. The data blocks B1 to B6 form the encoded data file 202 at the server, constructed from the contents of the document 200. Therefore, the data blocks A1 to A6 of the document 200 are encoded and transmitted to form the encoded data file made up of the corresponding data blocks B1 to B6. This is done sequentially from the first block A1 to the last block A6.
Figure 5 shows a method where amendments made to the original document 200 are identified and prepared prior to transmission to the server 103. A signature table (seen in Figure 7B) is prepared for the amended document 205, known as an amendment signature table 602.
Data blocks of the amended document 205 with computed signatures found in the signature table 210 of the original document 200 are first identified ("processed" data blocks), and their assigned signatures are stored in the subsequent signature table 602. These data blocks do not need to be prepared for transmission, as they have previously been transmitted. This saves time, and is an advantage of the present invention. Next, complete data blocks of block size L are identified in the amended document 205. These data blocks are classified as "new". Again, digital signatures are computed for data blocks classified as "new". A digital signature is computed and assigned to each "new" data block, and these signatures are stored in the subsequent signature table 602. The remaining data blocks are classified as "incomplete".
Figures 3, 6, 7A and 7B apply the method as described in Figure 5.
The original document 200 is partitioned into blocks A1 through to A6 and the signatures assigned to each block are taken and stored into a signature table 210, as shown in Figure 3, in accordance with the steps outlined in the flow chart depicted in Figure 2.
The programme first identifies which blocks in the amended document 205 relate to the original document 200. As certain data blocks of the original document 200 may be common with the amended document 205, their digital signatures can be found in the signature table 210. In the present example shown in Figures 3, 6 and 7, it can be seen that there are 3 blocks common to both the original document 200 and amended document 205 as they have identical digital signatures, i.e. blocks A3 and D2 have an identical digital signature S(A3), blocks A2 and D5 have an identical digital signature S(A2), and blocks A4 and D6 have an identical digital signature S(A4). Data blocks D2, D5 and D6 of the amended document 205 are classified as "processed" blocks. They are classified thus as they have been previously processed and their encoded data can be found in the server computer 103, therefore making it unnecessary to transmit this information again.
The subsequent signature table 602 of the amended document 205 is updated with the digital signatures of the "processed" blocks, i.e. S(A3) for block D2, S(A2) for block D5, and S(A4) for block D6. Having identified the blocks common to both the original 200 and amended documents 205, the next step is to identify complete blocks of data in the amended document 205, i.e. blocks with the previously determined data size of L bytes. Block D3 508 is one such block, and its digital signature S(D3) is computed and placed in the subsequent signature table 602. This data block is classified as a "new" block as its signature is not found in the signature table 210 and its contents are recognised as not having previously been transmitted to the server 103; they are therefore transmitted through the amendment delta file 700. The remaining blocks D1, D4 and D7 are classified as "incomplete" blocks, as their data sizes are less than the pre-determined block size of L bytes. These "incomplete" blocks have not been previously assigned digital signatures and therefore, these signatures are not present in the signature table 602. Nevertheless, the content of the "incomplete" blocks represents new information that must be transmitted to the server 103 via the amendment delta file 700.
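The three-way classification of the amended document can be sketched as below. The specification does not state how matching blocks are located once insertions have shifted their positions, so this sketch assumes a simple byte-by-byte scan that tests every L-byte window against the original signature table. That scan, the helper names and the 4-byte block size are illustrative assumptions, not the patented procedure.

```python
import hashlib
from typing import Dict, List, Tuple, Union

Classified = Tuple[str, Union[bytes, int]]


def signature_table(document: bytes, block_size: int) -> Dict[str, int]:
    """MD5 hex digest -> 1-based block number, for full-size blocks only."""
    table: Dict[str, int] = {}
    for number, offset in enumerate(range(0, len(document), block_size), start=1):
        block = document[offset:offset + block_size]
        if len(block) == block_size:
            table.setdefault(hashlib.md5(block).hexdigest(), number)
    return table


def classify_amended(amended: bytes, old_table: Dict[str, int],
                     block_size: int) -> List[Classified]:
    """Label the amended document as processed / new / incomplete blocks."""
    result: List[Classified] = []
    pending = bytearray()  # bytes not matched against the original document

    def flush() -> None:
        # Unmatched data is cut into L-byte "new" blocks; a short remainder
        # becomes an "incomplete" block.
        for offset in range(0, len(pending), block_size):
            chunk = bytes(pending[offset:offset + block_size])
            kind = "new" if len(chunk) == block_size else "incomplete"
            result.append((kind, chunk))
        pending.clear()

    i = 0
    while i < len(amended):
        window = amended[i:i + block_size]
        digest = hashlib.md5(window).hexdigest()
        if len(window) == block_size and digest in old_table:
            flush()
            result.append(("processed", old_table[digest]))
            i += block_size
        else:
            pending.append(amended[i])
            i += 1
    flush()
    return result


original = b"aaaabbbbccccddddeeeeff"                 # A1..A5 full, A6 short
amended = b"x" + b"cccc" + b"WXYZ" + b"q" + b"bbbb" + b"dddd" + b"zz"
for entry in classify_amended(amended, signature_table(original, 4), 4):
    print(entry)
```

The example input is arranged to mirror blocks D1 to D7: an incomplete block, a copy of A3, a new full block, an incomplete block, copies of A2 and A4, and a final incomplete block. Hashing every window is costly; a practical implementation would likely use a cheaper rolling check first, which is outside what the specification describes.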
Figure 8 shows the format of the amendment delta file 700 to be transmitted by the user to the server 103 to update the existing encoded document at the server 103 to reflect the amendments made to the document 200 at the client computer 101. The amendment delta file 700 comprises a sequence of instructions for the server computer 103 to execute. There are basically two instructions - either (i) to insert data carried by the amendment delta file 700 or (ii) to copy data that is already on record in the server computer 103 in order to update the encoded document at the server end.
The amendment delta file 700 is constructed after having identified the types of blocks present in the amended document 205. D1 is an "incomplete" block with no assigned digital signature. Its data is encoded and appended to the delta file 701 with instruction (i) for the server computer 103 to insert this data into the updated encoded document file. D2 is a "processed" block and its data is common to A3 (block 3) of the original document 200. Hence, instruction (ii) is called, and its entry in the amendment delta file 702 is an instruction for the server computer 103 to use block 3 of the current encoded document to update the new encoded document. D3 is a "new" block, and therefore instruction (i) is called. Its data is encoded and appended to the amendment delta file 703 with the instruction for the server computer to insert this data into the updated encoded document file. D4 and D7 are "incomplete" blocks and are processed in the same fashion as block D1, calling instruction (i). D5 and D6 are "processed" blocks and are processed in the same fashion as block D2, i.e. calling instruction (ii).
Figure 9 is a flowchart illustrating the method of generating a delta file from an amended document 205. The data blocks of the amended document are first classified or identified as "processed", "new" and "incomplete" blocks of the amended document. A new delta file and a signature table are first created for the amended document. The data blocks classified as "processed" blocks are the first blocks to be identified, as their signatures will match those in the signature table belonging to the original document 200. From the remaining data, the blocks are partitioned according to the basic pre-determined block length of L bytes of data. These data blocks are classified as "new" blocks, as they have not previously been processed. The remaining blocks are classified as "incomplete" as their data sizes are smaller than the basic pre-determined block size of L bytes.
After identifying the 3 types of blocks, each block is considered in turn. If the block is classified as a "processed" block, then the newly created signature table is updated with the data block's digital signature (which can be found in the old signature table). The delta file is appended with the instruction to copy data from the block number found in the old signature table. There is no need to insert the block's data into the delta file for transmission as its data is already on record in the server computer 103. If the data block is classified as a "new" block, then the block's signature is computed and tabulated into the new signature table. The block's data is then encoded and the data appended to the delta file for transmission to the server computer where it will be stored. Note that if this block is found in subsequent modifications to the document, then it will be classified as a "processed" block and its data need not be re-transmitted to the server computer. If the block under consideration is neither a "processed" nor a "new" block, then it must be an "incomplete" block. The block's data is then encoded and the data appended to the delta file for transmission to the server computer where it will be stored. Note that the signature of an "incomplete" block is neither computed nor recorded and as such the block can never be identified in subsequent modifications to the document. Having processed the final block of the amended document 205, the current signature file is then discarded and replaced by the new signature file. The construction of the delta file is now complete and it can be transmitted to the server computer for processing.
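The per-block loop of Figure 9 might be sketched as follows, taking already classified blocks as input. The input layout, the ("insert", data) and ("copy", block number) tuples, the use of zlib as the encoder, and the choice to record each signature against its position in the amended document are all assumptions made for this illustration.

```python
import hashlib
import zlib
from typing import Dict, List, Tuple


def build_delta(classified: List[Tuple[str, object]]) -> Tuple[List[Tuple[str, object]], Dict[str, int]]:
    """Turn classified blocks into delta instructions and a new signature table.

    Hypothetical input layout, one entry per block in amended-document order:
      ("processed", (signature, old_block_number))
      ("new", data) or ("incomplete", data)
    """
    delta: List[Tuple[str, object]] = []
    new_table: Dict[str, int] = {}
    for number, (kind, payload) in enumerate(classified, start=1):
        if kind == "processed":
            signature, old_block_number = payload
            new_table[signature] = number              # reuse the old signature
            delta.append(("copy", old_block_number))   # instruction (ii), no data sent
        elif kind == "new":
            new_table[hashlib.md5(payload).hexdigest()] = number
            delta.append(("insert", zlib.compress(payload)))  # instruction (i)
        elif kind == "incomplete":
            # No signature is computed or recorded for an incomplete block.
            delta.append(("insert", zlib.compress(payload)))  # instruction (i)
        else:
            raise ValueError(f"unknown block classification: {kind!r}")
    return delta, new_table


def sig(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()


# Blocks D1..D7 of the worked example, with 4-byte stand-in contents.
example = [
    ("incomplete", b"x"),
    ("processed", (sig(b"cccc"), 3)),
    ("new", b"WXYZ"),
    ("incomplete", b"q"),
    ("processed", (sig(b"bbbb"), 2)),
    ("processed", (sig(b"dddd"), 4)),
    ("incomplete", b"zz"),
]
delta, new_table = build_delta(example)
print([instruction for instruction, _ in delta])
```

Only the four "insert" entries carry data, so the three "processed" blocks cost a block number each rather than a full payload.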
Figure 10 shows the construction of the new encoded file 902 from the existing encoded file 903 found in the server and the delta file 901 received from the client. The delta file, reproduced from step 700, comprises a set of instructions. New data (comprising contents of "new" and "incomplete" blocks) is inserted into the new data file as illustrated in steps 904, 906, 907 and 910. Data that has been previously processed (contents of "processed" blocks) is copied from the existing data file to the new data file as illustrated in steps 905, 908 and 909.
Figure 11 is a flowchart illustrating a method according to the present invention of constructing a new encoded file from the delta file and existing encoded file by the server computer 103. Each block in the delta file is processed sequentially starting from the first data block to the last data block. If the block under consideration contains a "copy" instruction (ii), then this is a "processed" block and data is extracted from the existing encoded data file and appended to the new encoded data file 1005. If the block under consideration contains an "insert" instruction (i), then this is either a "new" or "incomplete" block and its data is appended to the new encoded data file 1006. After the last block of the delta file has been processed, the newly encoded data file replaces the existing encoded data file 1008.
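The server-side reconstruction can be sketched in a few lines. This illustration assumes the existing encoded file is held as a list of encoded blocks and that the delta file is a list of ("insert", data) and ("copy", block number) tuples with 1-based block numbers; the specification does not fix either layout, and the name apply_delta is hypothetical.

```python
from typing import List, Tuple, Union

Instruction = Tuple[str, Union[bytes, int]]


def apply_delta(existing_blocks: List[bytes], delta: List[Instruction]) -> List[bytes]:
    """Construct the new encoded file from the existing one and a delta file."""
    new_blocks: List[bytes] = []
    for instruction, payload in delta:
        if instruction == "copy":
            # "processed" block: reuse encoded data already held by the server.
            if not isinstance(payload, int) or not 1 <= payload <= len(existing_blocks):
                raise ValueError(f"copy refers to unknown block {payload!r}")
            new_blocks.append(existing_blocks[payload - 1])
        elif instruction == "insert":
            # "new" or "incomplete" block: data arrives in the delta file.
            new_blocks.append(payload)
        else:
            raise ValueError(f"unknown instruction: {instruction!r}")
    return new_blocks


# Existing encoded blocks B1..B6 and a delta shaped like the D1..D7 example.
existing = [b"B1", b"B2", b"B3", b"B4", b"B5", b"B6"]
delta = [("insert", b"E1"), ("copy", 3), ("insert", b"E3"), ("insert", b"E4"),
         ("copy", 2), ("copy", 4), ("insert", b"E7")]
new_file = apply_delta(existing, delta)
print(new_file)
```

Building the result in a separate list and swapping it in afterwards mirrors the final step of Figure 11, where the new encoded file replaces the existing one only after the last instruction has been processed.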

Claims

1. A method to securely transmit a document to a remote storage location, including partitioning the document into a plurality of distinct data blocks, said data block having a pre-determined block length; categorizing each distinct data block into one of two categories - incomplete and new, said incomplete data block having a data length less than a pre-determined block length, and said new block having a data length equal to the pre-determined block length; assigning a digital signature to each classified data block; preparing a signature table to tabulate the assigned digital signatures; storing the assigned digital signatures to the prepared signature table; preparing a delta file to receive the data of the data blocks; and transmitting the data blocks to the remote storage location.
2. A method to securely transmit a document to a remote storage location according to claim 1 wherein the method further includes encrypting the data blocks prior to transmitting the data blocks to the remote storage location.
3. A method to securely incorporate amendments to a transmitted document, said document transmitted in a method according to claim 1, including partitioning the amended document into a plurality of distinct data blocks, said data block having a pre-determined block length; preparing a subsequent signature table; taking the assigned signatures and storing them into the subsequent signature table; identifying the data blocks that have been amended by identifying identical signatures in the subsequent signature table with the signature table of the original document; identifying complete blocks of data in the amended document; classifying the identified complete blocks of data as new; computing and assigning signatures to the classified new data blocks; placing the assigned signatures into the subsequent signature table; classifying the remaining data blocks as incomplete; preparing an amendment delta file; preparing a delta file to receive the data of the data blocks; transmitting only the amended data blocks to the remote storage location; and integrating the transmitted amended data blocks to the transmitted documents thereby incorporating the amendments.
4. A method to securely incorporate amendments to a transmitted document according to claim 3, wherein the method further includes encrypting the data prior to transmission of the data blocks to the remote storage location.
5. A method to securely incorporate amendments to a transmitted document according to any one of claims 3 or 4 wherein the method of classifying the data blocks includes classifying a data block as processed where the data block has an assigned digital signature.
6. A method to securely incorporate amendments to a transmitted document according to any one of claims 3 to 5 wherein the method of classifying the data blocks includes classifying a data block as new where the data block has no assigned digital signature, and has a data length equal to the pre-determined data length.
7. A method to securely incorporate amendments to a transmitted document according to claim 3 wherein the method of classifying the data blocks includes classifying a data block as incomplete where the data block has no assigned digital signature, and has a data length less than the pre-determined data length.
PCT/SG2005/000080 2004-11-05 2005-03-15 A method to transmit and update a transmitted electronic document WO2006049581A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
SG200406422-6 2004-11-05
SG200406422A SG121917A1 (en) 2004-11-05 2004-11-05 A method to transmit and update a transmitted electronic document

Publications (1)

Publication Number Publication Date
WO2006049581A1 true WO2006049581A1 (en) 2006-05-11

Family

ID=36319463

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/SG2005/000080 WO2006049581A1 (en) 2004-11-05 2005-03-15 A method to transmit and update a transmitted electronic document

Country Status (2)

Country Link
SG (1) SG121917A1 (en)
WO (1) WO2006049581A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1996025801A1 (en) * 1995-02-17 1996-08-22 Trustus Pty. Ltd. Method for partitioning a block of data into subblocks and for storing and communicating such subblocks
FR2767937A1 (en) * 1997-09-04 1999-03-05 Michel Gouget Method of file duplication reducing volume of data transferred

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2441598A (en) * 2006-09-07 2008-03-12 Fujin Technology Plc Categorisation of Data using Structural Analysis
GB2495813A (en) * 2011-10-17 2013-04-24 Ibm Managing digital signatures in interactive documents
GB2495813B (en) * 2011-10-17 2013-12-11 Ibm Managing digital signatures
US20130144602A1 (en) * 2011-12-02 2013-06-06 Institute For Information Industry Quantitative Type Data Analyzing Device and Method for Quantitatively Analyzing Data

Also Published As

Publication number Publication date
SG121917A1 (en) 2006-05-26

Similar Documents

Publication Publication Date Title
US10256978B2 (en) Content-based encryption keys
CN108270874B (en) Application program updating method and device
US20080002830A1 (en) Method, system, and computer-readable medium to maintain and/or purge files of a document management system
US6934845B2 (en) Method and system of reversibly marking a text document with a pattern of extra blanks for authentication
US11403414B2 (en) Method and system for secure storage of digital data
US11671263B2 (en) Cryptographically securing data files in a collaborative environment
US10721058B2 (en) Ultra-secure blockchain enabled analytics
US20110289310A1 (en) Cloud computing appliance
CN103118104B (en) A kind of data restoration method and server based on version vector
CN108810112B (en) Node synchronization method and device for market supervision block chain system
CN112491989A (en) Data transmission method, device, equipment and storage medium
CN112565393A (en) File uploading method, file downloading method, file uploading device, file downloading device, computer equipment and storage medium
CN101771548A (en) File synchronizing method and system
US10360354B1 (en) Method and apparatus of performing distributed steganography of a data message
CN111159100A (en) Block chain file access method and device, computer equipment and storage medium
JP2009110061A (en) Version management system and version management method
WO2006049581A1 (en) A method to transmit and update a transmitted electronic document
US6714950B1 (en) Methods for reproducing and recreating original data
JP2002135247A (en) Digital information storing method
CN112329029A (en) Block chain-based electronic archive file safe storage method and system
CN108563396B (en) Safe cloud object storage method
Tian et al. Sed‐Dedup: An efficient secure deduplication system with data modifications
CN109271811B (en) Group signature-based electronic material evidence tamper-proof storage method
CN111782615A (en) Block chain-based large file storage method and system and computer equipment
JP2009237934A (en) File converting device, and file converting method and program

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 05722327

Country of ref document: EP

Kind code of ref document: A1