US20020010702A1 - System and method for differential compression of data from a plurality of binary sources - Google Patents
- Publication number
- US20020010702A1 (application US08/794,134)
- Authority
- US
- United States
- Prior art keywords
- algorithm
- file
- data
- version
- string
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10—TECHNICAL SUBJECTS COVERED BY FORMER USPC
- Y10S—TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10S707/00—Data processing: database and file management or data structures
- Y10S707/99931—Database or file accessing
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10—TECHNICAL SUBJECTS COVERED BY FORMER USPC
- Y10S—TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10S707/00—Data processing: database and file management or data structures
- Y10S707/99931—Database or file accessing
- Y10S707/99933—Query processing, i.e. searching
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10—TECHNICAL SUBJECTS COVERED BY FORMER USPC
- Y10S—TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10S707/00—Data processing: database and file management or data structures
- Y10S707/99931—Database or file accessing
- Y10S707/99933—Query processing, i.e. searching
- Y10S707/99936—Pattern matching access
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10—TECHNICAL SUBJECTS COVERED BY FORMER USPC
- Y10S—TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10S707/00—Data processing: database and file management or data structures
- Y10S707/99941—Database schema or data structure
- Y10S707/99942—Manipulating data structure, e.g. compression, compaction, compilation
Definitions
- the invention generally relates to the field of data compression. More specifically, the invention relates to techniques, applicable to data which occurs in different versions, for finding differences between the versions.
- Differencing algorithms compress data by taking advantage of statistical correlations between different versions of the same data sets. Strictly speaking, differencing algorithms achieve compression by finding common sequences between two versions of the same data that can be encoded using a copy reference.
- file will be used to indicate a linear data set to be addressed by a differencing algorithm. Typically, a file is modified one or more times, each modification producing a successive “version” of the file.
- a differencing algorithm is defined as an algorithm that finds and outputs the changes made between two versions of the same file by locating common sequences to be copied, and by finding unique sequences to be added explicitly.
- a delta file (Δ) is the encoding of the output of a differencing algorithm.
- An algorithm that creates a delta file takes as input two versions of a file, a base file and a version file to be encoded, and outputs a delta file representing the incremental changes made between versions.
- FIG. 1 is an illustration of the process of creating a delta file from a base file and a version file.
- a base file 2 and a version file 4 are shown schematically, in a linear “memory map” format. They are lined up parallel to each other for illustrative purposes.
- Different versions of a file may be characterized as having sequences of data or content. Some of the sequences are unchanged between the versions, and may be paired up with each other. See, for instance, unchanged sequences 6 and 8 . By contrast, a sequence of one version (e.g., a sequence 10 in the base file) may have been changed to a different sequence in the version file (e.g., 12 ).
- One possible encoding of a delta file, shown as 14, consists of a linear array of editing directives. These directives include copy commands, such as 16, which are references to a location in the base file 2 where the same data as that in the version file 4 exists. They further include add commands, such as 18, which are instructions to add data into the version file 4; each add command 18 is followed by the data (e.g., 20) to be added.
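The copy and add directives described above can be replayed against the base file to rebuild the version file. The following is a minimal sketch under stated assumptions: the tuple shapes `("copy", base_offset, length)` and `("add", data)` and the name `apply_delta` are hypothetical, chosen for illustration, and are not the byte-level format of the delta file 14.

```python
from typing import List, Tuple, Union

# Hypothetical in-memory directives (not the patent's on-disk format).
Copy = Tuple[str, int, int]   # ("copy", base_offset, length)
Add = Tuple[str, bytes]       # ("add", data)


def apply_delta(base: bytes, delta: List[Union[Copy, Add]]) -> bytes:
    """Rebuild the version file by replaying the directives in order."""
    out = bytearray()
    for directive in delta:
        if directive[0] == "copy":
            _, offset, length = directive
            out += base[offset:offset + length]   # data shared with the base
        elif directive[0] == "add":
            out += directive[1]                   # data unique to the version
        else:
            raise ValueError(f"unknown directive: {directive[0]!r}")
    return bytes(out)


base = b"the quick brown fox"
delta = [("copy", 0, 4), ("add", b"slow"), ("copy", 9, 10)]
print(apply_delta(base, delta))  # b'the slow brown fox'
```

Only the four added bytes travel in the delta as literal data; the remaining fifteen bytes of the version are expressed as two references into the base.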
- the '906 patent describes that backup and restore can be limited by both bandwidth on the network, often 10 MB/s, and poor throughput to secondary and tertiary storage devices, often 500 KB/s to tape storage. Since resource limitations frequently make backing up just the changes to a file system infeasible over a single night or even weekend, differential file compression has great potential to alleviate bandwidth problems by using available processor cycles to reduce the amount of data transferred. This technology can be used to provide backup and restore services on a subscription basis over any network including the Internet.
- Differencing has its origins in longest common subsequence (LCS) algorithms, and in the string-to-string correction problem.
- C. Guerra “Fast linear-space computations of longest common subsequences”, Theoretical Computer Science, 92(1):3-17, 1992 and Claus Rick, “A new flexible algorithm for the longest common subsequence problem”, Proceedings of the 6th Annual Symposium on Combinatorial Pattern Matching Espoo, Finland, 5-7 July 1995.
- R. A. Wagner and M. J. Fischer “The string-to-string correction problem”, Journal of the ACM, 21(1):168-173, January 1974.
- Some of the first applications of differencing updated the screens of slow terminals by sending a set of edits to be applied locally rather than retransmitting a screen full of data.
- Another early application was the UNIX “diff” utility, which used the LCS method to find and output the changes to a text file. diff was useful for source code development and for primitive document control.
- LCS algorithms find the longest common subsequence between two strings by optimally removing symbols in both files, leaving identical and sequential symbols. (A string/substring contains all consecutive symbols between and including its first and last symbol, whereas a sequence/subsequence may omit symbols with respect to the corresponding string.)
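The distinction between a subsequence and a substring can be made concrete with two short dynamic programs. This is an illustrative sketch using the textbook quadratic-time recurrences, not the linear-space algorithms cited above; the function names are hypothetical.

```python
def lcs_length(a: bytes, b: bytes) -> int:
    """Length of the longest common subsequence (symbols may be skipped)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def longest_common_substring_length(a: bytes, b: bytes) -> int:
    """Length of the longest common run of consecutive symbols."""
    best = 0
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            if x == y:
                cur[j] = prev[j - 1] + 1   # extend the run ending at j - 1
                best = max(best, cur[j])
        prev = cur
    return best


print(lcs_length(b"ABCBDAB", b"BDCABA"))                       # 4
print(longest_common_substring_length(b"ABCBDAB", b"BDCABA"))  # 2
```

The common subsequence here has length 4 (for example, BCBA), while no run of more than two consecutive symbols is shared.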
- RCS detects the modified lines in a file, and encodes a delta file by adding these lines and indicating lines to be copied from the base version. This is referred to as “differencing at line granularity.”
- the delta file is a line-by-line edit script applied to a base file to convert it to the new version.
- SCCS version control system Marc J. Rochkind, “The source code control system”, IEEE Transactions on Software Engineering, SE-1 (4):364-370, December 1975.
- RCS generates minimal line granularity delta files, and is the definitive previous work in version control.
- Source code control has been the major application for differencing. These packages allow authors to store and recall file versions. Software releases may be restored exactly, and changes are recoverable. Version control has also been integrated into a line editor, so that on every change a minimal delta is retained. See Christopher W. Fraser and Eugene W. Myers, “An editor for revision control”, ACM Transactions on Programming Languages and Systems, 9(2):277-295, April 1987. This allows for an unlimited undo facility without excessive storage.
- Greedy algorithms A well-known class of differencing algorithms may be termed “greedy” algorithms. Greedy algorithms often provide simple solutions to optimization problems by making what appears to be the best decision, i.e., the “greedy” decision, at each step. For differencing files, the greedy algorithm takes the longest match it can find at a given offset on the assumption that this match provides the best compression. It makes a locally optimal decision with the hope that this decision is part of the optimal solution over the input.
- the greedy algorithm provides an optimal encoding of a delta file, but it requires time proportional to the product of the sizes of the input files.
- the greedy algorithm for constructing differential files finds and encodes the longest copy in the base file corresponding to the first offset in the version file. After advancing the offset in the version file past the encoded copy, it looks for the longest copy starting at the current offset. If at a given offset, it cannot find a copy, the symbol at this offset is marked to be added and the algorithm advances to the following offset.
- the first task the algorithm performs is to construct a hash table and a link list out of the base version of the input files.
- the hash table allows an algorithm to store or identify the offset of a string with a given footprint.
- the link list stores the offsets of the footprints, beyond the initial footprint, that hash to the same value.
- strings at offset A 1 , A 2 , A 3 , and A 4 all have a footprint with value A.
- the link list effectively performs as a re-hash function for this data structure.
- the algorithm finds the matching strings in the file.
- the FindBestMatch function in FIG. 4 hashes the string at the current offset and returns the longest match that contains the string identified by the footprint.
- the function exhaustively searches through all strings that have matching footprints by fully traversing the link list for the matched hash entry. If the current offset in the version file verFile has footprint A, the function looks up the A-th element in the hash table to find a string with footprint A in the base file. In hashtable[A], we store the offset of the string with a matching footprint.
- the string at the current offset in the version file is compared with the string at hashtable[A] in the base file.
- the length of the matching string at these offsets is recorded.
- the function then moves to linktable[hashtable[A]] to find the next matching string. Each successive string in the link table is compared in turn.
- the longest matching string with offset copy_start and length copy_length is returned by the function FindBestMatch.
- the algorithm finds a match for the current offset, the unmatched symbols previous to this match are encoded and output to the delta file, using the EmitAdd function, and the matching strings are output using the EmitCopy function.
- the algorithm terminates by outputting the end code to the delta file with the EmitEnd function.
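The greedy procedure described above can be sketched as follows. This is a simplified illustration, not the pseudocode of FIG. 4: the footprint function (CRC-32 of a 4-byte prefix, reduced modulo the table size), the constants `PREFIX` and `TABLE_SIZE`, and the tuple form of the emitted codewords are all assumptions made for the example.

```python
import zlib

PREFIX = 4             # footprint length in bytes (assumed)
TABLE_SIZE = 1 << 12   # number of hash table entries (assumed)


def footprint(data: bytes, offset: int) -> int:
    return zlib.crc32(data[offset:offset + PREFIX]) % TABLE_SIZE


def build_tables(base: bytes):
    """hashtable[fp] holds one base offset; linktable chains the others."""
    hashtable = [-1] * TABLE_SIZE
    linktable = [-1] * len(base)
    for off in range(len(base) - PREFIX, -1, -1):
        fp = footprint(base, off)
        linktable[off] = hashtable[fp]   # previous head becomes the next link
        hashtable[fp] = off
    return hashtable, linktable


def match_length(base: bytes, b: int, ver: bytes, v: int) -> int:
    n = 0
    while b + n < len(base) and v + n < len(ver) and base[b + n] == ver[v + n]:
        n += 1
    return n


def find_best_match(base, ver, v, hashtable, linktable):
    """Traverse the whole chain and return the longest match (start, length)."""
    best_start, best_len = -1, 0
    if v + PREFIX > len(ver):
        return best_start, best_len
    cand = hashtable[footprint(ver, v)]
    while cand != -1:
        n = match_length(base, cand, ver, v)
        if n > best_len:
            best_start, best_len = cand, n
        cand = linktable[cand]
    return best_start, best_len


def greedy_delta(base: bytes, ver: bytes):
    hashtable, linktable = build_tables(base)
    delta, v, add_start = [], 0, 0
    while v < len(ver):
        start, length = find_best_match(base, ver, v, hashtable, linktable)
        if length >= PREFIX:
            if add_start < v:                       # unmatched bytes so far
                delta.append(("add", ver[add_start:v]))
            delta.append(("copy", start, length))
            v += length
            add_start = v
        else:
            v += 1                                  # no copy at this offset
    if add_start < len(ver):
        delta.append(("add", ver[add_start:]))
    delta.append(("end",))
    return delta


print(greedy_delta(b"ABCDEFGHIJKLMNOP", b"ABCDEFxxGHIJKLMNOP"))
```

Because every chain is traversed in full at every offset, the running time of this sketch grows with the number of matching footprints, which is the cost the text attributes to the greedy approach.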
- Common strings may be quickly identified by common footprints, the value of a hash function over a fixed length prefix of a string.
- the greedy algorithm must examine all matching footprints and extend the matches in order to find the longest matching string.
- the number of matching footprints between the base and version file can grow with respect to the product of the sizes of the input files, i.e. O(M × N) for files of size M and N, and the algorithm uses time proportional to the number of matching footprints.
- revision control needs to be generalized to include binary files. This allows binary data, such as edited multimedia, binary software releases, database files, etc., to be revised with the same version control and recoverability guarantees as text. Whereas revision control is currently a programmer's tool, binary revision control systems will enable the publisher, film maker, and graphic artist to realize the benefits of data versioning. It also enables developers to place image data, resource files, databases and binaries under their revision control system. Some existing version control packages have been modified to handle binary files, but in doing so they impose an arbitrary line structure. This results in delta files that achieve little or no compression as compared to storing the versions uncompressed.
- the set closely approximates a set produced by a method of scanning the entire second version to find a best possible match with a string in the first version.
- the invention describes a plurality of methods for binary differencing that can be integrated to form algorithms that efficiently compress versioned data.
- algorithms based on these methods are presented. These algorithms can difference any stream of data without a priori knowledge of the format or contents of the input.
- the algorithms drawn from the invention can difference data at any granularity, including operating at the level of a byte or even a bit. Furthermore, these algorithms perform this task using linear run time and a constant amount of space. The algorithms accept arbitrarily large input files without a degradation in the rate of compression. Finally, these methods can be used to produce a steady and reliable stream of data for real time applications.
- the invention is disclosed in several parts. Techniques useful to algorithms that generate binary differences are presented and these techniques are then integrated into algorithms to difference versioned data. It is understood that a person of ordinary skill in the art could assemble these techniques into one of many possible algorithms. The methods described as the invention then outline a family of algorithms for binary differencing using a combination of methods drawn from this invention.
- an article of manufacture such as a pre-recorded disk or other similar computer program product, for use with a data processing system, could include a storage medium and program means recorded thereon for directing the data processing system to facilitate the practice of the method of the invention. It will be understood that such apparatus and articles of manufacture also fall within the spirit and scope of the invention.
- FIG. 1 is a schematic, memory-map view of two versions of a file, showing a differencing scheme according to the invention, the result of which is a difference file including markers for identical sections of the two versions.
- FIG. 2 is a snapshot representation of a data stream, showing, superimposed thereon, a group of fixed-length symbol strings used for footprinting in accordance with the invention.
- FIG. 3 is a schematic representation of conventional hash and link tables.
- FIG. 4 is a pseudocode implementation of a conventional “greedy” differencing technique, employing the tables of FIG. 3.
- FIG. 5 is a pseudocode implementation of a “linear” embodiment of the method of the invention.
- FIG. 6 is a pseudocode implementation of a procedure called from the pseudocode of FIG. 5.
- FIGS. 7A, 7B, and 7C are illustrations of implementations of functions as per the embodiment of FIGS. 5 and 6.
- FIGS. 8A and 8B are illustrations of version matching scenarios described in connection with the invention.
- FIG. 9 is a diagram of a data structure produced and used by a system and method according to the invention.
- FIG. 10 is a pseudocode implementation of a “one-and-a-half-pass” embodiment of the method of the invention.
- FIG. 11 is a pseudocode implementation of a procedure called from the pseudocode of FIG. 10.
- FIG. 12 is a pseudocode implementation of a “one pass” embodiment of the method of the invention.
- FIG. 13 is a pseudocode implementation of a procedure called from the pseudocode of FIG. 12.
- FIG. 14 is a pseudocode implementation of another procedure called from the pseudocode of FIG. 12.
- Binary differencing algorithms all perform the same basic task: at the granularity of a byte, they encode a set of version data as a set of changes from a base version of the same data. Due to this common task, all of the algorithms we examine share certain features. All binary differencing algorithms partition a file into two classes of variable length byte strings: those strings that appear in the base version and those that are unique to the version being encoded.
- the binary algorithms under consideration operate on data streams.
- We define a data stream to be a data source that is byte addressable, allows random access, and stores consecutive data contiguously.
- the data stream abstraction is more appropriate for this application than the file abstraction, as the file abstraction provides a greater level of detail than the algorithms require.
- Files consist of multiple blocks of data which may exist on multiple devices, in addition to being non-contiguous in storage or memory. In UNIX parlance, this is called the i-node interface. Files also lack byte addressability. Reads on a file are generally performed at the granularity of a file block, anywhere from 512 bytes to 64 kilobytes.
- a data stream or file is composed of successive symbols from an alphabet, where symbols are a fundamental and indivisible element of data.
- symbols may be considered bytes and the alphabet is the set of all bytes, i.e., all combinations of 8 bits. While bytes are not truly indivisible, they do represent a fundamental unit for write, read and copy operations in the data streams that we address. Any combination of sequential and contiguous bytes comprises a string.
- a differencing algorithm finds the changes between two versions of the same data by partitioning the data into strings that have changed and strings that have not changed. Those strings that have not changed may be compressed by encoding them with a reference to the same data in the other file.
- the quality of a differencing algorithm depends upon its ability to find the maximum number of matching strings.
- the algorithm that produces the minimal delta finds a maximum total length of strings to be copied between files. In a minimal delta, the amount of data not copied represents the changed data between versions.
- the Reichenberger encoding consists of three types of codewords. There is an ADD codeword, which is followed by the length of the string to add and the string itself, a COPY codeword, which is followed by the length of the copy and an offset in the base version that references the matching string, and an END codeword, which indicates the end of input.
- the formats of these codewords are summarized below. If required, such as in the case of the COPY command, a codeword may also specify additional bytes to follow.
- An ‘s’ indicates a following byte used to encode the offset in the base version.
- An ‘I’ indicates a following byte used to encode the length of the copy.
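The three codeword types can be illustrated with a small serializer and a matching decoder. The byte layout below is hypothetical and is not the Reichenberger format: it assumes a one-byte opcode followed by fixed four-byte big-endian length and offset fields, purely to show how ADD, COPY and END codewords carry their arguments.

```python
import struct

# Hypothetical opcodes and field widths, chosen for this sketch only.
END, ADD, COPY = 0x00, 0x01, 0x02


def encode(delta) -> bytes:
    """Serialize ("add", data) and ("copy", offset, length) tuples."""
    out = bytearray()
    for cw in delta:
        if cw[0] == "add":
            out += struct.pack(">BI", ADD, len(cw[1])) + cw[1]
        elif cw[0] == "copy":
            _, offset, length = cw
            out += struct.pack(">BII", COPY, length, offset)
        else:
            raise ValueError(f"unknown codeword: {cw[0]!r}")
    out.append(END)                      # END marks the end of input
    return bytes(out)


def decode(blob: bytes, base: bytes) -> bytes:
    """Replay a serialized delta against the base version."""
    out, pos = bytearray(), 0
    while True:
        op = blob[pos]
        pos += 1
        if op == END:
            return bytes(out)
        if op == ADD:
            (length,) = struct.unpack_from(">I", blob, pos)
            pos += 4
            out += blob[pos:pos + length]   # the added string itself
            pos += length
        elif op == COPY:
            length, offset = struct.unpack_from(">II", blob, pos)
            pos += 8
            out += base[offset:offset + length]
        else:
            raise ValueError(f"bad opcode {op:#x} at byte {pos - 1}")


base = b"the quick brown fox"
blob = encode([("copy", 0, 4), ("add", b"slow"), ("copy", 9, 10)])
print(decode(blob, base))  # b'the slow brown fox'
```

A practical encoding would likely use variable-width length and offset fields so that short copies cost fewer bytes; the fixed widths here only keep the sketch short.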
- a footprint does not uniquely represent a string, but does exhibit the following property: two matching strings will always express matching footprints, or equivalently, footprints that do not match always imply that the strings they represent differ.
- Differencing algorithms will use footprints to remember and locate strings that have been seen previously. These algorithms use a hash table with size equal to the cardinality of the set of footprints, i.e. there is a one-to-one correspondence between potential footprint values and hash table entries. Each hash table entry holds, at a minimum, a reference to the string that generated the footprint.
- Footprints are generated by a hashing function.
- a good hashing function for this application must be both run time efficient and generate a near uniform distribution of footprints over all footprint values.
- a non-uniform distribution of footprints results in differing strings hashing to the same footprint with higher probability.
- FIG. 2 is a snapshot of a data stream made up of symbols, identified for the sake of illustration by an index, beginning at 0, and running through r, r+n, r+n+3, etc.
- Strings of n consecutive symbols are designated as X, with subscripts showing the index of the first of the n symbols.
- Brackets shown below the data stream, uniformly n symbols in width, are superimposed on the data stream to represent n-symbol strings.
- a hash function run over these strings, generates footprints.
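Footprints over successive n-symbol strings can be computed incrementally, so that each new footprint costs constant time. The sketch below uses a Karp-Rabin style polynomial hash; the radix `RADIX`, the modulus `MODULUS`, and the generator name `footprints` are assumptions for illustration, not values taken from the patent.

```python
RADIX = 257                 # assumed radix, larger than any byte value
MODULUS = (1 << 31) - 1     # assumed prime modulus


def footprints(data: bytes, n: int):
    """Yield (offset, footprint) for every n-byte string in data."""
    if n <= 0 or len(data) < n:
        return
    high = pow(RADIX, n - 1, MODULUS)    # weight of the outgoing byte
    fp = 0
    for byte in data[:n]:
        fp = (fp * RADIX + byte) % MODULUS
    yield 0, fp
    for r in range(1, len(data) - n + 1):
        # Drop data[r - 1], shift, and bring in data[r + n - 1].
        fp = ((fp - data[r - 1] * high) * RADIX + data[r + n - 1]) % MODULUS
        yield r, fp


fps = dict(footprints(b"abracadabra", 4))
print(fps[0] == fps[7])  # True: "abra" occurs at offsets 0 and 7
```

Matching strings always yield matching footprints, as the text requires. The converse does not hold, so a footprint match still has to be confirmed by comparing the strings themselves.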
- the simple linear differencing algorithm of the invention generates delta files in a single, linear time pass over the input files, and uses a constant amount of memory space to do so.
- the linear algorithm achieves its runtime bounds by implementing the “next match policy”. It attempts to take the longest match at a given offset by taking the longest matching string at the next matching prefix beyond the offset at which the previous match was encoded. In effect, it encodes the first matching string found, rather than searching all matching footprints for the best matching string.
- matching strings are often sequential, i.e., they occur in the same order in both files. When strings that match are sequential, the next matching footprint approximates the best match extremely well. In fact, this property holds for all changes that are insertions and deletions.
- FIGS. 7A, 7B, and 7C respectively illustrate the encoding of an insertion and a deletion, a deletion, and an insertion. They exhibit a schematic of two versions of the same file in a base file ( 1 ) and a version file ( 2 ).
- the base file is the older version
- the version file is the new version, to be encoded as a set of changes from ( 1 ).
- the first portion ( 4 ) of the base file ( 1 ) has been deleted, and is not present in the version file ( 2 ).
- the version file ( 2 ) then starts with a string ( 3 ) that matches a later offset in the base file ( 1 ), and is encoded as a copy.
- the version file starts with new data ( 5 ) that was not present in the base file ( 1 ). This region has been added to the version file ( 2 ), and is encoded as such. The following data in the version file ( 2 ) copies the start of the base file ( 3 ).
- FIG. 7A the version file ( 2 ) starts with data ( 5 ) not in the base file and this data is encoded by an ADD command.
- the following data in ( 2 ) is not from the start of the base file ( 1 ), but copies a later portion of the base file ( 3 ).
- FIG. 7A represents the modification to a version when a delete ( 4 ) and an insert ( 5 ) occur together.
- EmitCodes outputs all of the data in the version file, between the end of the previous copy and the offset of the current copy, as an add command.
- the algorithm updates the current offsets in both files to point to the end of the encoded copy. If the files are versions of each other, the copies should represent the same data in both files, and the end of both copies should be a point of file pointer synchronization.
- a “point of synchronization”, in this case, is defined to be the relative offsets of the same data in the two file versions.
- the task of the linear differencing algorithm can be described as detecting points of synchronization, and subsequently copying from synchronized offsets.
- the presented algorithm operates both in linear time and constant space. At all times, the algorithm maintains a hash table of constant size. After finding a match, hash entries are flushed and the same hash table is reused to find the next matching footprint. Since this hash table neither grows nor is deallocated, the algorithm operates in constant space, roughly the size of the hash table, on all inputs.
- Since the maximum number of hash entries does not necessarily depend on the input file size, the size of the hash table need not grow with the size of the file.
- the maximum number of hash entries is bounded by twice the number of bytes between the end of the previous copied string and the following matching footprint. On highly correlated files, we expect a small maximum number of hash entries, since we expect to find matching strings frequently.
- the algorithm operates in time linear in the size of the input files as we are guaranteed to advance either the base file offset or the version file offset by one byte each time through the inside loop of the program.
- In identity mode, both the base offset and the version offset are incremented by one byte at each step.
- In hashing mode, each time a new offset is hashed, at least one of the offsets is incremented, as matching footprints are always found between the current offset in one file and a previous offset in another.
- Identity mode guarantees to advance the offsets in both files at every step, whereas hashing mode guarantees only to advance the offset in one file. Therefore, identity mode proceeds through the input at as much as twice the rate of hashing mode.
- the byte identity function is far easier to compute than the Karp-Rabin hashing function.
- On highly correlated versions, the algorithm spends more time in identity mode than it would on less correlated versions. We can then state that the algorithm executes faster on more highly correlated inputs, and that the simple linear algorithm operates best on its most common input: similar version files.
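The interplay of hashing mode, identity mode and table flushing can be sketched in a few dozen lines. This is a simplified illustration of the next match policy, not the pseudocode of FIGS. 5 and 6: the footprint function (CRC-32 of a 4-byte string), the constants `N` and `TABLE_SIZE`, the use of two dictionaries as the hash tables, and the tuple form of the codewords are all assumptions.

```python
import zlib

N = 4                  # footprint length in bytes (assumed)
TABLE_SIZE = 1 << 12   # number of distinct footprint values (assumed)


def footprint(data: bytes, off: int) -> int:
    return zlib.crc32(data[off:off + N]) % TABLE_SIZE


def linear_delta(base: bytes, ver: bytes):
    delta = []
    base_seen, ver_seen = {}, {}   # footprint -> first offset since last flush
    b = v = 0                      # current offsets in base and version
    committed = 0                  # version bytes before this are encoded
    while b + N <= len(base) or v + N <= len(ver):
        match = None
        # Hashing mode: hash one string from each file per step.
        if b + N <= len(base):
            fp = footprint(base, b)
            base_seen.setdefault(fp, b)          # first footprint is retained
            cand = ver_seen.get(fp)
            if cand is not None and ver[cand:cand + N] == base[b:b + N]:
                match = (b, cand)
        if match is None and v + N <= len(ver):
            fp = footprint(ver, v)
            ver_seen.setdefault(fp, v)
            cand = base_seen.get(fp)
            if cand is not None and base[cand:cand + N] == ver[v:v + N]:
                match = (cand, v)
        if match is None:
            b, v = b + 1, v + 1
            continue
        # Identity mode: extend the first match found byte by byte.
        bm, vm = match
        length = N
        while (bm + length < len(base) and vm + length < len(ver)
               and base[bm + length] == ver[vm + length]):
            length += 1
        if committed < vm:
            delta.append(("add", ver[committed:vm]))
        delta.append(("copy", bm, length))
        b, v = bm + length, vm + length          # point of synchronization
        committed = v
        base_seen.clear()                        # flush and reuse the tables
        ver_seen.clear()
    if committed < len(ver):
        delta.append(("add", ver[committed:]))
    return delta


print(linear_delta(b"ABCDEFGHIJKLMNOP", b"ABCDEFxxGHIJKLMNOP"))
# [('copy', 0, 6), ('add', b'xx'), ('copy', 6, 10)]
```

Every footprint match is verified against the underlying bytes before a copy is emitted, so the delta always reproduces the version file. A poor match only costs compression, because the bytes it fails to cover are emitted as adds.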
- The algorithm achieves less than optimal compression when it falsely believes that the offsets are synchronized, when the assumption that all changes between versions consist of insertions and deletions fails to hold, or when the implemented hashing function exhibits less than ideal behavior. Examples are given in FIGS. 8A and 8B.
- the algorithm fails to find the copy of tokens ABCD since the string has been rearranged.
- the algorithm encodes EFG as a copy, and then flushes the hash table, removing symbols ABCD that previously appeared in the base file. When hashing mode restarts, the ABCD match has been missed, and will be encoded as an add.
- the algorithm misses the true start of the string ABCDEF in the base file (best match), in favor of the previous string at AB (next match).
- Upon detecting and encoding a “spurious” match, the algorithm achieves some degree of compression, just not the best compression. Furthermore, the algorithm never bypasses “synchronized offsets” in favor of a spurious match. This also follows directly from choosing the next match, and not the best match.
- Hashing functions are, unfortunately, not ideal. Consequently, the algorithm may also experience the blocking of footprints.
- a fixed length string hashes to a footprint. If there is another footprint from a non-matching string in the same file, which is already occupying that entry in the hash table, then we say that the footprint is being blocked.
- the second footprint is ignored and the first one is retained. This is the correct procedure to implement next match, assuming that each footprint represents a unique string.
- hash functions generally hash a large number of inputs to a smaller number of keys, and are therefore not unique. Strings that hash to the same value may differ, and the algorithm loses the ability to find strings matching the discarded footprint.
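Footprint blocking is easy to reproduce with a deliberately weak hash. In the sketch below, the footprint is simply the sum of the bytes, so any two strings that are permutations of each other collide. The hash, the function names, and the returned `blocked` list are hypothetical and exist only to make the effect visible.

```python
def weak_footprint(s: bytes) -> int:
    """Deliberately poor hash: strings with the same byte sum collide."""
    return sum(s) % 251


def index_first_wins(data: bytes, n: int):
    """Store one offset per footprint, keeping the first and ignoring the rest.

    Returns the table and the offsets whose differing strings were blocked.
    """
    table, blocked = {}, []
    for off in range(len(data) - n + 1):
        fp = weak_footprint(data[off:off + n])
        if fp not in table:
            table[fp] = off
        elif data[table[fp]:table[fp] + n] != data[off:off + n]:
            blocked.append(off)   # a different string already holds this entry
    return table, blocked


table, blocked = index_first_wins(b"abcba", 2)
print(table)    # {195: 0, 197: 1}
print(blocked)  # [2, 3]: "cb" is blocked by "bc", "ba" is blocked by "ab"
```

The strings at offsets 2 and 3 can no longer be found through the table, which is the loss described above.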
- Footprint blocking could be addressed by any rehash function, or by hash chaining. However, this solution would destroy the constant space utilization bound on the algorithm. It also turns out to be unnecessary, as will be discussed below, in connection with the “more advanced algorithms”.
- the first method we term “undoing the damage”.
- a differencing algorithm finds strings to be copied and strings to be added and outputs them to a delta file.
- This scheme can be best thought of as a first-in-first-out (FIFO) queue that caches the most recent encodings made by the algorithm.
- an algorithm has the opportunity to recall a given encoding, and to exchange it for a better one.
- an algorithm that uses this technique can make a quick decision as to an encoding and, if this decision turns out not to be the best one, the encoding will be undone in favor of a better encoding.
- Checkpointing takes a subset of all possible footprint values and calls these checkpoints. All footprints that are not in this subset are discarded and the algorithm runs on only the remaining checkpoints. This allows the file size and consequently the execution time to be reduced by an arbitrarily large factor. There is, unfortunately, a corresponding loss of compression with the runtime speedup.
- the technique is orthogonal to our other methods and can be applied to any of these algorithms.
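Checkpointing can be sketched as a filter applied to footprints before they reach the hash table. The selection rule used here, keeping a footprint only when it is divisible by a factor `k`, is an assumption for illustration, as are the CRC-32 footprint and the function name; the text above says only that some subset of footprint values is chosen.

```python
import zlib


def checkpoint_footprints(data: bytes, n: int, k: int):
    """Return (offset, footprint) pairs whose footprint is a checkpoint.

    A footprint is treated as a checkpoint when it is divisible by k, so
    roughly one footprint in k survives (assumed selection rule).
    """
    kept = []
    for off in range(len(data) - n + 1):
        fp = zlib.crc32(data[off:off + n])
        if fp % k == 0:
            kept.append((off, fp))
    return kept


data = b"the quick brown fox jumps over the lazy dog"
print(len(checkpoint_footprints(data, 4, 1)))   # 40: every footprint is kept
print(len(checkpoint_footprints(data, 4, 8)))   # a subset of those 40
```

Because the test depends only on the footprint value, a string that is a checkpoint in the base file is also a checkpoint wherever it recurs in the version file, so surviving footprints can still be matched across files.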
- a linear run time differencing algorithm often has to encode stretches of input without complete information.
- the algorithm may have found a common string between the base and version files which represents the best known encoding seen in the files up to this point.
- it may find a longer common string that would encode the same region of the file more compactly. Under these circumstances, it becomes beneficial to let the algorithm change its mind and re-encode a portion of the file. This is termed “undoing the damage” and allows the algorithm to recover from previous bad decisions.
- an algorithm performs the best known encoding of some portion of a version file as its current version file pointer passes through that region. If it later encounters a string in the base file that would better encode this region, the old encoding is discarded in favor of the new encoding.
- the hash table acts as a short term memory and allows the algorithm to remember strings of tokens, so that when it sees them again, it may encode them as copies. This occurs when the algorithm finds a prior string in the base file that matches the current offset in the version file. Undoing the damage uses the symmetric case: matching strings between the current offset in the base file and a previous offset in the version file.
- the short term memory also allows the algorithm to recall and examine previous encoding decisions by recalling strings in the version file. These may then be re-encoded if the current offset in the base file provides a better encoding than the existing codewords.
- the algorithm buffers codewords rather than writing them directly to a file.
- the buffer, in this instance, is a fixed-size first-in-first-out (FIFO) queue of file encodings called the “codeword lookback buffer”.
- When a region of the file is logically encoded, the appropriate codewords are written to the lookback buffer.
- the buffer collects code words until it is full. Then, when writing a codeword to a full buffer, the oldest codeword gets pushed out and is written to the file. When a codeword “falls out of the cache”, it becomes immutable and has been committed to the file.
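The buffering behavior described above can be sketched with a bounded queue. The class name `LookbackBuffer`, its methods, and the use of a Python list as a stand-in for the delta file are hypothetical; the sketch shows only the commit-on-overflow rule and the ability to recall the most recent uncommitted codeword.

```python
from collections import deque


class LookbackBuffer:
    """Fixed-size FIFO of codewords that are not yet committed to the file."""

    def __init__(self, capacity: int, sink: list):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.pending = deque()
        self.sink = sink               # stands in for the delta file

    def push(self, codeword) -> None:
        if len(self.pending) == self.capacity:
            # The oldest codeword falls out and becomes immutable.
            self.sink.append(self.pending.popleft())
        self.pending.append(codeword)

    def pop_newest(self):
        """Recall the most recent uncommitted codeword, or None if empty."""
        return self.pending.pop() if self.pending else None

    def flush(self) -> None:
        while self.pending:
            self.sink.append(self.pending.popleft())


written = []
buf = LookbackBuffer(2, written)
for cw in [("add", b"x"), ("copy", 0, 4), ("copy", 10, 3)]:
    buf.push(cw)
buf.pop_newest()                 # undo the short copy
buf.push(("copy", 8, 9))         # replace it with a longer one
buf.flush()
print(written)  # [('add', b'x'), ('copy', 0, 4), ('copy', 8, 9)]
```

Only codewords still held in `pending` can be exchanged for better encodings; anything already appended to `written` is final.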
- the algorithm performs two types of undoing the damage.
- the first type of undoing the damage occurs when the algorithm encodes a new portion of the version file. When the algorithm finds a matching string at the current offset in the file being encoded, new data will be encoded and added to the lookback buffer. The algorithm attempts to extend the new matching string backwards from the current offset in the version file. If this backward matching string exceeds the length of the previous codeword, that encoding is discarded and replaced with the new, longer copy command. The algorithm will “swallow” and discard codewords from the top of the lookback buffer as long as the codewords in question are either:
- a copy command that may be wholly re-encoded (if the command may only be partially re-encoded, the codeword may not be reclaimed and no additional compression can be attained); or an add command, which may be reclaimed in whole or in part.
- the second type of undoing the damage is more general and may change any previous encoding, not just the most recent encoding. If a matching string is found between the current offset in the base file and a previous offset in the version file, the algorithm determines if the current encoding of this offset of the version file may be improved using this matching string. The algorithm searches through the buffer to find the first codeword that encodes a portion of the version file where the matching string was found. The matching string is then used to re-encode this portion, reclaiming partial add commands and whole copy commands.
- the codeword lookback buffer must be both searchable and editable, as the algorithm must efficiently look up previous encodings and potentially modify or erase those entries.
- the obvious implementation of the codeword lookback buffer is a linked list that contains the codewords, in order, as they were emitted from a differencing algorithm. This data structure has the advantage of simply supporting the insert, edit and delete operations on codewords.
- FIG. 9 is a snapshot of such a FIFO buffer. Hatch-marked squares contain encodings, and Xs mark dummy encodings.
- The allocated memory region is divided into fixed-size elements. Each element is an entry in the codeword lookback buffer. An element in the lookback buffer contains the necessary data to emit its codeword. It also contains the version offset, which identifies the region of the version file that this entry encodes.
- the circular queue uses a fixed amount of memory.
- the pointers “first” and “last” mark the boundaries of the allocated region. Within this region, the data structure maintains pointers “head” and “tail”, which are the logical beginning and end of the FIFO. These pointers allow the queue to wrap around the end of the allocated region. As per common buffering practice, simple pointer arithmetic around these four pointers supports the access of any element in the queue in constant time.
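A circular queue of this kind might look like the sketch below. It is an assumption-laden illustration: list indices replace the “first”, “last”, “head” and “tail” pointers, and the method names are hypothetical. The point it shows is that modular arithmetic gives constant-time access to any element while the memory footprint stays fixed.

```python
class CircularQueue:
    """Fixed-memory FIFO with constant-time indexed access (sketch)."""

    def __init__(self, capacity):
        self.slots = [None] * capacity  # the allocated region
        self.head = 0                   # index of the oldest element
        self.count = 0                  # number of live elements

    def __len__(self):
        return self.count

    def push(self, item):
        """Append item; return the evicted oldest item if the queue was full."""
        evicted = None
        if self.count == len(self.slots):
            evicted = self.slots[self.head]
            self.head = (self.head + 1) % len(self.slots)
            self.count -= 1
        tail = (self.head + self.count) % len(self.slots)
        self.slots[tail] = item
        self.count += 1
        return evicted

    def __getitem__(self, i):
        # i = 0 is the oldest element; the index wraps around the region.
        if not 0 <= i < self.count:
            raise IndexError(i)
        return self.slots[(self.head + i) % len(self.slots)]

    def pop_newest(self):
        """Remove and return the most recently pushed element."""
        if self.count == 0:
            raise IndexError("pop from empty queue")
        self.count -= 1
        idx = (self.head + self.count) % len(self.slots)
        item, self.slots[idx] = self.slots[idx], None
        return item
```

`pop_newest` is what the backward "swallowing" of codewords would use, while the value returned by `push` is the codeword that has just been committed.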
- the one-and-a-half-pass algorithm modifies a greedy algorithm, such as the algorithm developed by Reichenberger (supra), producing a new algorithm that uses linear run time and a constant amount of memory.
- the greedy algorithm is guaranteed to find the best encoding because it searches exhaustively through its data structures for the longest matching string at any given footprint. At first glance it would seem that this method cannot be improved with undoing the damage. However, the greedy algorithm uses both memory and execution time inefficiently. As a consequence of linear memory growth and quadratic execution time growth, it fails to scale well and cannot be used on arbitrarily large files.
- the one-and-a-half-pass algorithm modifies the greedy algorithm by altering data structures and search policies, to achieve execution time that grows linearly in the size of the input. Linear run-time comes at a price, and the modifications reduce the one-and-a-half-pass algorithm's ability to compactly represent versions. We can then use the undoing-the-damage technique to improve the compression that the algorithm achieves. The resulting algorithm compresses data comparably to the greedy algorithm, and executes faster on all inputs.
- the significant modification from the greedy algorithm in the one-and-a-half-pass algorithm is that the latter uses the first matching string that it finds at any given footprint, rather than searching exhaustively through all matching footprints.
- the algorithm discards the link table that was used in the greedy algorithm. Using the hash table only, the algorithm maintains a single string reference at each footprint value.
- By storing only a single string reference for each footprint, the algorithm implements a first matching string, rather than a best matching string, policy when comparing footprints. This could be potentially unsatisfactory, as the algorithm would consistently be selecting inferior encodings. Yet, by undoing the damage the algorithm avoids incurring the penalties for a bad decision. By choosing a first match policy, the algorithm spends constant time on any given footprint, resulting in linear execution time. By maintaining only a single hash table of fixed size, the algorithm operates in constant space.
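The first-match policy can be illustrated with a small sketch. Several details here are assumptions rather than the patent's choices: the footprint is a CRC-32 of a 4-byte prefix, the table has 65,536 bins, and on a conflict the earliest base offset is the one kept.

```python
import zlib

FOOTPRINT_LEN = 4      # assumed length of the hashed string prefix
TABLE_SIZE = 1 << 16   # assumed fixed number of hash bins


def footprint(data, pos):
    """Hash the prefix starting at pos into a table slot (assumed hash)."""
    return zlib.crc32(data[pos:pos + FOOTPRINT_LEN]) % TABLE_SIZE


def build_first_match_table(base):
    """Store a single base offset per footprint; there is no link table."""
    table = [None] * TABLE_SIZE
    for pos in range(len(base) - FOOTPRINT_LEN + 1):
        slot = footprint(base, pos)
        if table[slot] is None:
            table[slot] = pos
    return table


def find_match(table, base, version, ver_pos):
    """Constant-time lookup: one probe, then one verification."""
    if ver_pos + FOOTPRINT_LEN > len(version):
        return None
    base_pos = table[footprint(version, ver_pos)]
    if base_pos is None:
        return None
    # Colliding footprints do not guarantee identical strings.
    probe = version[ver_pos:ver_pos + FOOTPRINT_LEN]
    if base[base_pos:base_pos + FOOTPRINT_LEN] != probe:
        return None
    return base_pos
```

For a base of `b"abcdXabcdY"`, a lookup of `abcd` returns offset 0 even though offset 5 would give the longer match `abcdY`. This is the kind of inferior first choice that undoing the damage is meant to repair.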
- hashing ceases to behave well when the hash table becomes densely populated. So, our first requirement is that the total number of stored footprints, i.e., the length of the input file, is smaller than the number of storage bins in our hash table.
- a preferred implementation of the one-and-a-half-pass algorithm is given, in pseudocode form, in FIG. 10.
- the algorithm first passes over the base file, baseFile, footprinting a string prefix at every byte offset, and storing these footprints for future lookup in a hash table. Having processed the base file, the algorithm footprints the first offset in the version file, verFile. The algorithm examines the hash table for a colliding footprint. If no footprints collide, we advance to the next offset by incrementing ver_pos and repeat this process.
- the algorithm uses the Verify function to check the strings for identity. Strings that pass the identity test are then encoded and output to the fixup buffer. All symbols in the version file between the end of the last output codeword, add_start, and the beginning of the matching strings, ver_pos, are output as an add command. The matching strings are then output to the fixup buffer using the FixupBufferInsertCopy function.
- the function FixupBufferInsertCopy (FIG. 11) not only outputs the matching strings to the fixup buffer, it also implements undoing the damage. Before encoding the matching strings, the algorithm determines if they match backwards. If they do, it deletes the last encoding out of the queue and re-encodes that portion of the version file by integrating it into the current copy command. Having reclaimed as many backwards codewords as possible, the function simply dumps a copy command in the buffer and returns. This one type of undoing the damage is adequate in this case as the algorithm has complete information about the base file as it encodes the version file.
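The backward extension can be sketched roughly as below. This is not the pseudocode of FIG. 11: the tuple layout of the codewords, the function name `insert_copy`, and the assumption that buffered codewords are contiguous and end at `ver_pos` are all choices made for this example.

```python
def insert_copy(buffer, base, version, base_pos, ver_pos, length):
    """Append a copy codeword, first swallowing codewords it can re-encode.

    `buffer` is a list of uncommitted codewords, oldest first (assumed layout):
      ("add", ver_start, length) or ("copy", ver_start, base_start, length)
    """
    # Committed data cannot be re-encoded, so stop at the oldest buffered offset.
    floor = buffer[0][1] if buffer else ver_pos
    back = 0
    while (ver_pos - back > floor and base_pos - back > 0
           and base[base_pos - back - 1] == version[ver_pos - back - 1]):
        back += 1

    reclaimed = 0
    while buffer and reclaimed < back:
        kind, start, *rest = buffer[-1]
        size = rest[-1]
        room = back - reclaimed
        if size <= room:
            # The whole codeword is covered by the backward match.
            buffer.pop()
            reclaimed += size
        elif kind == "add":
            # A partially covered add is shortened.
            buffer[-1] = ("add", start, size - room)
            reclaimed += room
        else:
            # A partially covered copy cannot be reclaimed.
            break

    buffer.append(("copy", ver_pos - reclaimed, base_pos - reclaimed,
                   length + reclaimed))
```

For example, with base `b"0123456789"`, version `b"xx3456789"`, a buffered `("add", 0, 5)` and a new match of `6789`, the add shrinks to two bytes and the copy grows from four bytes to seven.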
- this algorithm is termed one-and-a-half-pass because it processes the base file twice and the version file once. Initially, this technique takes a single pass over the base file in order to build the hash table. Then, as the algorithm encodes the version file, random access is performed on the matching strings in the base file, inspecting only those strings whose footprints collide with footprints from the version file.
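The overall flow just described can be sketched as a simplified differ. It is deliberately not a transcription of FIG. 10: the fixup buffer and undoing the damage are omitted, the CRC-32 footprint and its 4-byte length are assumptions, and `apply_delta` is a hypothetical helper added only to check the round trip.

```python
import zlib

K = 4  # assumed footprint length in bytes


def _slot(data, pos, size):
    return zlib.crc32(data[pos:pos + K]) % size


def one_and_a_half_pass(base, version, table_size=1 << 16):
    """Return a delta as a list of ("add", bytes) and ("copy", base_pos, n)."""
    # Pass over the base file: one reference per footprint.
    table = [None] * table_size
    for pos in range(len(base) - K + 1):
        s = _slot(base, pos, table_size)
        if table[s] is None:
            table[s] = pos

    # Single pass over the version file.
    delta, add_start, ver_pos = [], 0, 0
    while ver_pos + K <= len(version):
        base_pos = table[_slot(version, ver_pos, table_size)]
        if (base_pos is None
                or base[base_pos:base_pos + K] != version[ver_pos:ver_pos + K]):
            ver_pos += 1          # no verified match at this offset
            continue
        n = K                     # extend the verified match forwards
        while (base_pos + n < len(base) and ver_pos + n < len(version)
               and base[base_pos + n] == version[ver_pos + n]):
            n += 1
        if add_start < ver_pos:   # unmatched bytes before the match
            delta.append(("add", version[add_start:ver_pos]))
        delta.append(("copy", base_pos, n))
        ver_pos += n
        add_start = ver_pos
    if add_start < len(version):
        delta.append(("add", version[add_start:]))
    return delta


def apply_delta(base, delta):
    """Rebuild the version file from the base file and the delta."""
    out = bytearray()
    for cmd in delta:
        if cmd[0] == "add":
            out += cmd[1]
        else:
            out += base[cmd[1]:cmd[1] + cmd[2]]
    return bytes(out)
```

Applying the delta to the base file should reproduce the version file exactly, which is a convenient sanity check for any variant of the encoder.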
- the algorithm of the invention has been found to run in linear time for all known inputs, including all files in a distributed file system, and further including databases.
- the algorithm generates a hash key for a footprint at each offset.
- the generation of a hash key takes constant time and must be done once for each footprint in the file, requiring total time linearly proportional to the size of the base file.
- the version file is encoded.
- At each offset in the version file, the algorithm either generates a hash key for the footprint at that offset, or uses the identity function to match the symbol as a copy of another symbol in the base file. In either case, the algorithm uses a constant amount of time at every offset, for total time proportional to the size of the version file.
- This algorithm has the potential to encode delta files as well as the greedy algorithm when the decision of choosing the first match is equally as good as choosing the best match.
- the first match well represents the best match when the footprint hashing function generates “false matches” infrequently. Therefore, to achieve good compression, with respect to the greedy algorithm's compression, we must select a suitably long footprint. If the footprints uniquely represent the strings, the algorithms behave identically. However, the one-and-a-half-pass algorithm guarantees linear performance on all inputs, and cannot be slowed by many strings with the same footprint.
- FIG. 12 is a pseudocode implementation of a “one pass” algorithm according to the invention.
- the one pass algorithm improves the compression of the simple linear differencing algorithm without a significant depreciation in the execution time.
- the algorithm discards information that could later be valuable. If the algorithm were to make an encoding that was not from a point of synchronization, the chance to later find a point of synchronization from that string would be lost. The one pass algorithm does not flush the hash table, so that it can find potentially missed points of synchronization.
- the algorithm must then avoid the pitfall of not incrementing the file pointer when matching a frequently occurring common string.
- the algorithm does this by guaranteeing that the file pointers in both files are non-decreasing always and that when offsets are hashed, the pointers in both files advance. So, rather than trying to find the exact point of synchronization, the algorithm collects data about all previous footprints.
- the data that it accumulates arrives incrementally, as it advances through the input files.
- the algorithm uses a replacement rule to update the hash table when there are identical footprints from the same file. This rule discards old information, and preferentially keeps information close to the point of synchronization. The algorithm need not worry about making a bad encoding.
- the one pass algorithm starts at offset zero in both files, generates footprints at these offsets and stores them in the hash tables. Footprints from verFile go into verhashtbl and footprints from baseFile in bashashtbl. It continues by advancing the file pointers, ver_pos and base_pos, and generating footprints at subsequent offsets. When the algorithm finds footprints that match, it first ensures that the strings these footprints represent are identical, using the Verify function.
- the EmitCodes subroutine has been modified from its previous incarnation (FIG. 6), to output codewords to the fixup buffer, rather than outputting data directly to the file.
- the data that precedes the start of the copy is encoded in an add command using the function FixupBufferInsertAdd.
- the matched data is then output, using the function FixupBufferInsertCopy.
- FixupBufferInsertCopy implements one type of undoing the damage. Before encoding the current copy, the string is checked to see if it matches backwards. If the match extends backwards, the function re-encodes the previous codewords, if it produces a more compact encoding.
- the one pass algorithm also implements undoing the damage when the current offset in baseFile matches a previous offset in verFile. This case of undoing the damage is different as it attempts to repair an encoding from an arbitrary point in the cache, rather than just re-encoding the last elements placed in the codeword fixup buffer. In fact, the target codeword may have fallen out of the cache and not even be in the fixup buffer.
- FixupEncoding performs this type of undoing the damage. After finding the first codeword that encodes a portion of the string found in the version file, as many encodings as possible are reclaimed to be integrated into a single copy command.
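One way to picture this second type of undoing the damage is the sketch below. It is a hypothetical reconstruction, not the patent's FixupEncoding routine: codewords are assumed to be tuples sorted by version offset, the function returns a new list instead of editing the buffer in place, and it does not test whether the replacement is actually more compact.

```python
def fixup_encoding(buffer, ver_start, base_start, length):
    """Re-encode version[ver_start:ver_start+length] as one copy where allowed.

    Codewords (assumed layout), sorted by version offset:
      ("add", ver_off, length) or ("copy", ver_off, base_off, length)
    """
    lo, hi = ver_start, ver_start + length
    # A copy that is only partly covered cannot be reclaimed,
    # so the match is trimmed to start or end at its boundary.
    for cw in buffer:
        s, e = cw[1], cw[1] + cw[-1]
        if cw[0] == "copy":
            if s < lo < e:
                lo = e
            if s < hi < e:
                hi = s
    if hi <= lo:
        return list(buffer)  # nothing left that can be re-encoded

    new_copy = ("copy", lo, base_start + (lo - ver_start), hi - lo)
    out, placed = [], False
    for cw in buffer:
        s, e = cw[1], cw[1] + cw[-1]
        if e <= lo:
            out.append(cw)                       # entirely before the match
            continue
        if not placed:
            if cw[0] == "add" and s < lo:
                out.append(("add", s, lo - s))   # keep the head of a partial add
            out.append(new_copy)
            placed = True
        if s >= hi:
            out.append(cw)                       # entirely after the match
        elif e > hi:
            out.append(("add", hi, e - hi))      # keep the tail of a partial add
        # Codewords wholly inside [lo, hi) are dropped.
    if not placed:
        out.append(new_copy)
    return out
```

In the first test case below, a 14-byte match reclaims part of two add commands and a whole copy command, leaving three codewords where the head and tail adds are shortened.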
- the outer loop in the routine OnePass only runs when either the base_active or version_active flag is set. These flags indicate whether the file pointer has reached the end of the input. It is necessary to read the whole version file in order to complete the encoding. It is also necessary to finish processing of the base file, even if the version file has been wholly read, as the algorithm may use this information to undo the damage. This also differs from the simple linear differencing algorithm which completes after finishing processing in the version file. The simple linear differencing algorithm has no motivation to continue footprinting the base file after the version file has been encoded, as it cannot modify previous encodings.
- the per-file hash tables in the one pass algorithm remember the most recent occurrence of each footprint in each file. This follows because the algorithm elects to replace existing footprints in the hash table with conflicting new occurrences of the same footprints.
- the hash tables tend to have complete information for the footprints from the most recent offsets. For older offsets, the hash table becomes incomplete with these footprints being overwritten. It is appropriate to consider this “memory” of previous strings through footprints as a window into the most recent offsets in each file. This window is the region over which the algorithm can act at any given time. A footprint that has been expelled from this window cannot be used to create a copy command or to undo the damage.
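The replacement rule and the resulting window can be sketched as follows. The helper names `remember` and `recall`, the CRC-32 footprint and the 4-byte footprint length are assumptions for illustration only.

```python
import zlib

K = 4  # assumed footprint length in bytes


def remember(table, data, pos):
    """Record the footprint at pos, overwriting any older entry in its bin."""
    table[zlib.crc32(data[pos:pos + K]) % len(table)] = pos


def recall(table, data, probe):
    """Return the remembered offset of probe's prefix, or None if expelled."""
    pos = table[zlib.crc32(probe[:K]) % len(table)]
    if pos is not None and data[pos:pos + K] == probe[:K]:
        return pos
    return None


def footprint_file(data, table_size):
    """Footprint every offset of data in order, newest entries winning."""
    table = [None] * table_size
    for pos in range(len(data) - K + 1):
        remember(table, data, pos)
    return table
```

With a large table, `abcd` in `b"abcdXXabcd"` is recalled at offset 6, the most recent occurrence. With a single-bin table, only the very last footprint survives, which is the extreme case of a footprint being expelled from the window.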
- the window in the past does not consist of contiguous data, but data about past footprints that gets less dense at offsets further in the past from the current file offset.
- This window determines how effectively the algorithm can detect transposed data.
- Consider two data streams composed of long strings A and B. One version of this data can be described by AB and the other by BA.
- This type of rearranged data can be detected and efficiently encoded assuming that the window into the past covers some portion of the transposed data. It is thereby beneficial for encoding transpositions to have a hash table that can contain all of the footprints in the base file.
- the one pass algorithm may perform random access in either file, but on highly correlated inputs this access should always be near the current file pointers and not to distant offsets in the past. What distinguishes the one pass algorithm from other algorithms is its on-line nature. Since the algorithm starts encoding the version file upon initiation, and does not first fill a hash table with footprints from the base file, it emits a constant stream of output data. In fact, the algorithm can be described as having a data rate. This is a very important feature if one uses the algorithm to serve a network channel or for any other real time application.
- the one pass algorithm behaves well under arbitrarily long input streams in that it only loses the ability to detect transposed data. The same cannot be said of the one-and-a-half-pass algorithm. Since it has only a single hash table with no ability to re-hash, when that hash table is full, the algorithm must discard footprints. This results in pathologically poor performance on inputs that overflow the one-and-a-half-pass algorithm's hash table. Note that both algorithms fail to perform optimally when the input is such that their hash tables are filled. In the next section, we will address this problem using a method called checkpointing.
- the checkpointing method declares a certain subset of all possible footprints to be checkpoints. The algorithm will then only operate on footprints that are in this checkpoint subset. We still need to run the hashing function at every offset, but only those footprints that are in the checkpoint subset participate in finding matches. This reduces the number of entries in the hash table and allows algorithms to accept longer inputs without the footprint entry and lookup operations breaking down.
- An algorithm must also ensure that the set C of all checkpoints can address every element in the hash table, i.e.
- a non-zero value for k is selected, to ensure that the string of all zeros is not in the checkpoint set. Many types of data stuff zeros for alignment or empty space. Therefore, this string, with the corresponding checkpoint equal to zero, is frequently occurring, and is therefore not beneficial.
- This implementation is orthogonal to the algorithms that use it, and can be isolated to the one step where the algorithm generates the next footprint.
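A possible shape for checkpoint selection is sketched below. The selection rule is an assumption, since the exact condition is not reproduced in the text above: a footprint is treated as a checkpoint when its value modulo an assumed `modulus` equals `k`. The polynomial footprint is also an assumption, chosen so that a string of all zero bytes has footprint zero.

```python
PRIME = 1000003  # assumed modulus for the illustrative polynomial footprint


def footprint(data, pos, length=4):
    """Polynomial hash of a fixed-length prefix; all-zero bytes hash to 0."""
    value = 0
    for byte in data[pos:pos + length]:
        value = (value * 256 + byte) % PRIME
    return value


def checkpoint_positions(data, modulus, k=1, length=4):
    """Offsets whose footprint is in the checkpoint subset (assumed rule)."""
    return [pos for pos in range(len(data) - length + 1)
            if footprint(data, pos, length) % modulus == k]
```

Under these assumptions, a run of zero padding yields no checkpoints when `k` is non-zero, whereas `k = 0` would admit every offset in the run. Because the filter sits in the footprint-generation step, the rest of a differencing algorithm would not need to change.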
- Checkpointing alleviates the failure of the one-and-a-half-pass algorithm operating on large input files. By choosing an appropriate number of checkpoints as shown above, the algorithm can fit the contents of any file into its hash table.
- checkpointing has a negative effect on the ability of the algorithm to detect small matching strings between file versions. If an algorithm is to detect and encode matching strings, then one of the footprints of this string must be a checkpoint. Short matching strings will have few colliding footprints, and will consequently be missed with greater likelihood. On the other hand, for versioned data, we expect highly correlated input streams, and can expect long matching strings which contain checkpoints with increasing probability.
- the checkpointing technique relies upon undoing the damage, and performs better on the one-and-a-half-pass algorithm than on the greedy algorithm. Since checkpointing does not look at every footprint, an algorithm is likely to miss the starting offset for matching strings. With undoing the damage, this missed offset is handled transparently, and the algorithm finds the true start of matching strings without additional modifications to the code.
- the one pass algorithm has problems detecting transpositions when its hash table becomes over-utilized. This feature is not so much a mode of failure as a property of the algorithm. Applying checkpointing as we did in the one-and-a-half-pass algorithm allows such transpositions to be detected. Yet, if the modification of the data does not exhibit transpositions, then the algorithm sacrifices the ability to detect fine grained matches and gains no additional benefit.
- the appropriate checkpoint value depends on the nature of the input data. For data that exhibits only insert and delete modifications, checkpointing should be disregarded altogether. Any policy decision as to the number of checkpoints is subject to differing performance, and the nature of the input data needs to be considered to formulate such a policy.
- the one pass algorithm adds undoing the damage and checkpointing to the simple linear algorithm. These additions allow the one pass algorithm to find strings that would better encode portions of the version file and consequently improve compression.
- the one pass algorithm is
- the one-and-a-half-pass algorithm differences files in a single pass over the version file and a double pass over the base file.
- the algorithm first passes over the base version of the file collecting information in its hash table. After complete processing of the base file, it passes over the version file finding strings with matching footprints and verifying them in the base file. This algorithm does implement undoing the damage and checkpointing.
- the one-and-a-half-pass algorithm does not have the on-line property, but compresses data as rapidly as, and more compactly than, the on-line algorithms.
- the invention may be implemented using standard programming and/or engineering techniques using computer programming software, firmware, hardware or any combination or subcombination thereof.
- Any such resulting program(s), having computer readable program code means may be embodied or provided within one or more computer readable or usable media such as fixed (hard) drives, disk, diskettes, optical disks, magnetic tape, semiconductor memories such as read-only memory (ROM), etc., or any transmitting/receiving medium such as the Internet or other communication network or link, thereby making a computer program product, i.e., an article of manufacture, according to the invention.
- the article of manufacture containing the computer programming code may be made and/or used by executing the code directly from one medium, by copying the code from one medium to another medium, or by transmitting the code over a network.
- An apparatus for making, using, or selling the invention may be one or more processing systems including, but not limited to, a central processing unit (CPU), memory, storage devices, communication links, communication devices, servers, I/O devices, or any subcomponents or individual parts of one or more processing systems, including software, firmware, hardware or any combination or subcombination thereof, which embody the invention as set forth in the claims.
- User input may be received from the keyboard, mouse, pen, voice, touch screen, or any other means by which a human can input data to a computer, including through other programs such as application programs.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US08/794,134 US6374250B2 (en) | 1997-02-03 | 1997-02-03 | System and method for differential compression of data from a plurality of binary sources |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US08/794,134 US6374250B2 (en) | 1997-02-03 | 1997-02-03 | System and method for differential compression of data from a plurality of binary sources |
Publications (2)
Publication Number | Publication Date |
---|---|
US20020010702A1 (en) | 2002-01-24 |
US6374250B2 (en) | 2002-04-16 |
Family
ID=25161817
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US08/794,134 Expired - Lifetime US6374250B2 (en) | 1997-02-03 | 1997-02-03 | System and method for differential compression of data from a plurality of binary sources |
Country Status (1)
Country | Link |
---|---|
US (1) | US6374250B2 |
Cited By (78)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020184559A1 (en) * | 2001-06-01 | 2002-12-05 | Farstone Technology Inc. | Backup/recovery system and methods regarding the same |
US20030229643A1 (en) * | 2002-05-29 | 2003-12-11 | Digimarc Corporation | Creating a footprint of a computer file |
US20040093564A1 (en) * | 2002-11-07 | 2004-05-13 | International Business Machines Corporation | Method and apparatus for visualizing changes in data |
US20040205539A1 (en) * | 2001-09-07 | 2004-10-14 | Mak Mingchi Stephen | Method and apparatus for iterative merging of documents |
US20040210885A1 (en) * | 2000-11-14 | 2004-10-21 | Microsoft Corporation | Methods for comparing versions of a program |
US20040220980A1 (en) * | 2000-03-01 | 2004-11-04 | Forster Karl J. | Method and system for updating an archive of a computer file |
US20050044294A1 (en) * | 2003-07-17 | 2005-02-24 | Vo Binh Dao | Method and apparatus for window matching in delta compressors |
US20050069151A1 (en) * | 2001-03-26 | 2005-03-31 | Microsoft Corporaiton | Methods and systems for synchronizing visualizations with audio streams |
US20050235043A1 (en) * | 2004-04-15 | 2005-10-20 | Microsoft Corporation | Efficient algorithm and protocol for remote differential compression |
US20060059173A1 (en) * | 2004-09-15 | 2006-03-16 | Michael Hirsch | Systems and methods for efficient data searching, storage and reduction |
US20060059207A1 (en) * | 2004-09-15 | 2006-03-16 | Diligent Technologies Corporation | Systems and methods for searching of storage data with reduced bandwidth requirements |
US20060085561A1 (en) * | 2004-09-24 | 2006-04-20 | Microsoft Corporation | Efficient algorithm for finding candidate objects for remote differential compression |
US20060112264A1 (en) * | 2004-11-24 | 2006-05-25 | International Business Machines Corporation | Method and Computer Program Product for Finding the Longest Common Subsequences Between Files with Applications to Differential Compression |
US20060143168A1 (en) * | 2004-12-29 | 2006-06-29 | Rossmann Albert P | Hash mapping with secondary table having linear probing |
US20060193159A1 (en) * | 2005-02-17 | 2006-08-31 | Sensory Networks, Inc. | Fast pattern matching using large compressed databases |
US20060253438A1 (en) * | 2005-05-09 | 2006-11-09 | Liwei Ren | Matching engine with signature generation |
US20060277197A1 (en) * | 2005-06-03 | 2006-12-07 | Bailey Michael P | Data format for website traffic statistics |
US20070028226A1 (en) * | 2000-11-17 | 2007-02-01 | Shao-Chun Chen | Pattern detection preprocessor in an electronic device update generation system |
EP1754322A1 (en) * | 2004-06-10 | 2007-02-21 | Samsung Electronics Co., Ltd. | Apparatus and method for efficient generation of delta files for over-the-air upgrades in a wireless network |
US20070169073A1 (en) * | 2002-04-12 | 2007-07-19 | O'neill Patrick | Update package generation and distribution network |
US20070207800A1 (en) * | 2006-02-17 | 2007-09-06 | Daley Robert C | Diagnostics And Monitoring Services In A Mobile Network For A Mobile Device |
US20070276794A1 (en) * | 2006-02-24 | 2007-11-29 | Hiroyasu Nishiyama | Pointer compression/expansion method, a program to execute the method and a computer system using the program |
US20070300206A1 (en) * | 2006-06-22 | 2007-12-27 | Microsoft Corporation | Delta compression using multiple pointers |
US20080005506A1 (en) * | 2006-06-30 | 2008-01-03 | Data Equation Limited | Data processing |
US20080005135A1 (en) * | 2006-06-30 | 2008-01-03 | Microsoft Corporation | Defining and extracting a flat list of search properties from a rich structured type |
US7320009B1 (en) * | 2003-03-28 | 2008-01-15 | Novell, Inc. | Methods and systems for file replication utilizing differences between versions of files |
US20080086513A1 (en) * | 2006-10-04 | 2008-04-10 | O'brien Thomas Edward | Using file backup software to generate an alert when a file modification policy is violated |
US20080148147A1 (en) * | 2006-12-13 | 2008-06-19 | Pado Metaware Ab | Method and system for facilitating the examination of documents |
US20080163189A1 (en) * | 2002-08-22 | 2008-07-03 | Shao-Chun Chen | System for generating efficient and compact update packages |
US20080177782A1 (en) * | 2007-01-10 | 2008-07-24 | Pado Metaware Ab | Method and system for facilitating the production of documents |
US20080183734A1 (en) * | 2007-01-31 | 2008-07-31 | Anurag Sharma | Manipulating the original content of at least one original read-only computer file in a computer file-system in a computer system |
US20080313243A1 (en) * | 2007-05-24 | 2008-12-18 | Pado Metaware Ab | method and system for harmonization of variants of a sequential file |
US20090199090A1 (en) * | 2007-11-23 | 2009-08-06 | Timothy Poston | Method and system for digital file flow management |
US20090228716A1 (en) * | 2008-02-08 | 2009-09-10 | Pado Metawsre Ab | Method and system for distributed coordination of access to digital files |
US7596632B1 (en) * | 2004-10-05 | 2009-09-29 | At&T Intellectual Property Ii, L.P. | Windowing by prefix matching |
US20090271528A1 (en) * | 2004-04-15 | 2009-10-29 | Microsoft Corporation | Efficient chunking algorithm |
US7849054B2 (en) | 2000-12-27 | 2010-12-07 | Microsoft Corporation | Method and system for creating and maintaining version-specific properties in a file |
US20100318759A1 (en) * | 2009-06-15 | 2010-12-16 | Microsoft Corporation | Distributed rdc chunk store |
JP2011501839A (ja) * | 2007-10-04 | 2011-01-13 | グローバル インフィニプール ゲーエムベーハー | データエンティティ及びそのバージョンにアクセスするための方法 |
US20110113016A1 (en) * | 2009-11-06 | 2011-05-12 | International Business Machines Corporation | Method and Apparatus for Data Compression |
US20110295894A1 (en) * | 2010-05-27 | 2011-12-01 | Samsung Sds Co., Ltd. | System and method for matching pattern |
US20130024435A1 (en) * | 2011-07-19 | 2013-01-24 | Exagrid Systems, Inc. | Systems and methods for managing delta version chains |
US8468515B2 (en) | 2000-11-17 | 2013-06-18 | Hewlett-Packard Development Company, L.P. | Initialization and update of software and/or firmware in electronic devices |
US8495019B2 (en) | 2011-03-08 | 2013-07-23 | Ca, Inc. | System and method for providing assured recovery and replication |
US8526940B1 (en) | 2004-08-17 | 2013-09-03 | Palm, Inc. | Centralized rules repository for smart phone customer care |
US8555273B1 (en) | 2003-09-17 | 2013-10-08 | Palm. Inc. | Network for updating electronic devices |
US8578361B2 (en) | 2004-04-21 | 2013-11-05 | Palm, Inc. | Updating an electronic device with update agent code |
US8711013B2 (en) | 2012-01-17 | 2014-04-29 | Lsi Corporation | Coding circuitry for difference-based data transformation |
US8752044B2 (en) | 2006-07-27 | 2014-06-10 | Qualcomm Incorporated | User experience and dependency management in a mobile device |
US8893110B2 (en) | 2006-06-08 | 2014-11-18 | Qualcomm Incorporated | Device management in a network |
US8930431B2 (en) | 2010-12-15 | 2015-01-06 | International Business Machines Corporation | Parallel computation of a remainder by division of a sequence of bytes |
US20150074291A1 (en) * | 2005-09-29 | 2015-03-12 | Silver Peak Systems, Inc. | Systems and methods for compressing packet data by predicting subsequent data |
US9397951B1 (en) | 2008-07-03 | 2016-07-19 | Silver Peak Systems, Inc. | Quality of service using multiple flows |
US9438538B2 (en) | 2006-08-02 | 2016-09-06 | Silver Peak Systems, Inc. | Data matching using flow based packet data storage |
US9549048B1 (en) | 2005-09-29 | 2017-01-17 | Silver Peak Systems, Inc. | Transferring compressed packet data over a network |
US9584403B2 (en) | 2006-08-02 | 2017-02-28 | Silver Peak Systems, Inc. | Communications scheduler |
US9613071B1 (en) | 2007-11-30 | 2017-04-04 | Silver Peak Systems, Inc. | Deferred data storage |
US9626224B2 (en) | 2011-11-03 | 2017-04-18 | Silver Peak Systems, Inc. | Optimizing available computing resources within a virtual environment |
US20170195446A1 (en) * | 2015-12-31 | 2017-07-06 | International Business Machines Corporation | Enhanced storage clients |
US9712463B1 (en) | 2005-09-29 | 2017-07-18 | Silver Peak Systems, Inc. | Workload optimization in a wide area network utilizing virtual switches |
US9717021B2 (en) | 2008-07-03 | 2017-07-25 | Silver Peak Systems, Inc. | Virtual network overlay |
US20170364701A1 (en) * | 2015-06-02 | 2017-12-21 | ALTR Solutions, Inc. | Storing differentials of files in a distributed blockchain |
US9875344B1 (en) | 2014-09-05 | 2018-01-23 | Silver Peak Systems, Inc. | Dynamic monitoring and authorization of an optimization device |
US9906630B2 (en) | 2011-10-14 | 2018-02-27 | Silver Peak Systems, Inc. | Processing data packets in performance enhancing proxy (PEP) environment |
US9948496B1 (en) | 2014-07-30 | 2018-04-17 | Silver Peak Systems, Inc. | Determining a transit appliance for data traffic to a software service |
US9967056B1 (en) | 2016-08-19 | 2018-05-08 | Silver Peak Systems, Inc. | Forward packet recovery with constrained overhead |
US10164861B2 (en) | 2015-12-28 | 2018-12-25 | Silver Peak Systems, Inc. | Dynamic monitoring and visualization for network health characteristics |
US10257082B2 (en) | 2017-02-06 | 2019-04-09 | Silver Peak Systems, Inc. | Multi-level learning for classifying traffic flows |
US20190251189A1 (en) * | 2018-02-09 | 2019-08-15 | Exagrid Systems, Inc. | Delta Compression |
US10432484B2 (en) | 2016-06-13 | 2019-10-01 | Silver Peak Systems, Inc. | Aggregating select network traffic statistics |
US10637721B2 (en) | 2018-03-12 | 2020-04-28 | Silver Peak Systems, Inc. | Detecting path break conditions while minimizing network overhead |
US10771394B2 (en) | 2017-02-06 | 2020-09-08 | Silver Peak Systems, Inc. | Multi-level learning for classifying traffic flows on a first packet from DNS data |
US10805840B2 (en) | 2008-07-03 | 2020-10-13 | Silver Peak Systems, Inc. | Data transmission via a virtual wide area network overlay |
US10892978B2 (en) | 2017-02-06 | 2021-01-12 | Silver Peak Systems, Inc. | Multi-level learning for classifying traffic flows from first packet data |
US10970413B2 (en) | 2015-06-02 | 2021-04-06 | ALTR Solutions, Inc. | Fragmenting data for the purposes of persistent storage across multiple immutable data structures |
US11044202B2 (en) | 2017-02-06 | 2021-06-22 | Silver Peak Systems, Inc. | Multi-level learning for predicting and classifying traffic flows from first packet data |
US11212210B2 (en) | 2017-09-21 | 2021-12-28 | Silver Peak Systems, Inc. | Selective route exporting using source type |
US11316530B2 (en) | 2015-12-31 | 2022-04-26 | International Business Machines Corporation | Adaptive compression for data services |
Families Citing this family (110)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU8397298A (en) * | 1997-07-15 | 1999-02-10 | Pocket Soft, Inc. | System for finding differences between two computer files and updating the computer files |
US6138254A (en) * | 1998-01-22 | 2000-10-24 | Micron Technology, Inc. | Method and apparatus for redundant location addressing using data compression |
JP3478130B2 (ja) * | 1998-06-30 | 2003-12-15 | Victor Company of Japan, Ltd. | Information recording medium |
EP1056010A1 (en) * | 1999-05-28 | 2000-11-29 | Hewlett-Packard Company | Data integrity monitoring in trusted computing entity |
EP1055990A1 (en) | 1999-05-28 | 2000-11-29 | Hewlett-Packard Company | Event logging in a computing platform |
US6470345B1 (en) * | 2000-01-04 | 2002-10-22 | International Business Machines Corporation | Replacement of substrings in file/directory pathnames with numeric tokens |
EP2148284A1 (en) * | 2000-01-10 | 2010-01-27 | Iron Mountain Incorporated | Administration of a differential backup system in a client-server environment |
US8156074B1 (en) * | 2000-01-26 | 2012-04-10 | Synchronoss Technologies, Inc. | Data transfer and synchronization system |
US8620286B2 (en) | 2004-02-27 | 2013-12-31 | Synchronoss Technologies, Inc. | Method and system for promoting and transferring licensed content and applications |
US6671757B1 (en) * | 2000-01-26 | 2003-12-30 | Fusionone, Inc. | Data transfer and synchronization system |
US7509420B2 (en) * | 2000-02-18 | 2009-03-24 | Emc Corporation | System and method for intelligent, globally distributed network storage |
US6470329B1 (en) * | 2000-07-11 | 2002-10-22 | Sun Microsystems, Inc. | One-way hash functions for distributed data synchronization |
US8073954B1 (en) | 2000-07-19 | 2011-12-06 | Synchronoss Technologies, Inc. | Method and apparatus for a secure remote access system |
US7895334B1 (en) | 2000-07-19 | 2011-02-22 | Fusionone, Inc. | Remote access communication architecture apparatus and method |
US6810398B2 (en) * | 2000-11-06 | 2004-10-26 | Avamar Technologies, Inc. | System and method for unorchestrated determination of data sequences using sticky byte factoring to determine breakpoints in digital sequences |
US7818435B1 (en) | 2000-12-14 | 2010-10-19 | Fusionone, Inc. | Reverse proxy mechanism for retrieving electronic content associated with a local network |
US20020099726A1 (en) * | 2001-01-23 | 2002-07-25 | International Business Machines Corporation | Method and system for distribution of file updates |
GB2372595A (en) * | 2001-02-23 | 2002-08-28 | Hewlett Packard Co | Method of and apparatus for ascertaining the status of a data processing environment. |
GB2372592B (en) * | 2001-02-23 | 2005-03-30 | Hewlett Packard Co | Information system |
GB2372591A (en) * | 2001-02-23 | 2002-08-28 | Hewlett Packard Co | Method of investigating transactions in a data processing environment |
GB2372594B (en) * | 2001-02-23 | 2004-10-06 | Hewlett Packard Co | Trusted computing environment |
US8615566B1 (en) | 2001-03-23 | 2013-12-24 | Synchronoss Technologies, Inc. | Apparatus and method for operational support of remote network systems |
US7310687B2 (en) * | 2001-03-23 | 2007-12-18 | Cisco Technology, Inc. | Methods and systems for managing class-based condensation |
US7500017B2 (en) * | 2001-04-19 | 2009-03-03 | Microsoft Corporation | Method and system for providing an XML binary format |
US6732248B2 (en) * | 2001-06-28 | 2004-05-04 | International Business Machines Corporation | System and method for ghost offset utilization in sequential byte stream semantics |
EP1282023A1 (en) * | 2001-07-30 | 2003-02-05 | Hewlett-Packard Company | Trusted platform evaluation |
GB2378272A (en) * | 2001-07-31 | 2003-02-05 | Hewlett Packard Co | Method and apparatus for locking an application within a trusted environment |
US6904430B1 (en) * | 2002-04-26 | 2005-06-07 | Microsoft Corporation | Method and system for efficiently identifying differences between large files |
US6973465B2 (en) * | 2002-04-26 | 2005-12-06 | Sun Microsystems, Inc. | Mechanism for migrating a file sequence with versioning information from one workspace to another |
US6925467B2 (en) * | 2002-05-13 | 2005-08-02 | Innopath Software, Inc. | Byte-level file differencing and updating algorithms |
US7702636B1 (en) * | 2002-07-31 | 2010-04-20 | Cadence Design Systems, Inc. | Federated system and methods and mechanisms of implementing and using such a system |
US20040049767A1 (en) * | 2002-09-05 | 2004-03-11 | International Business Machines Corporation | Method and apparatus for comparing computer code listings |
US7096311B2 (en) * | 2002-09-30 | 2006-08-22 | Innopath Software, Inc. | Updating electronic files using byte-level file differencing and updating algorithms |
US6836657B2 (en) * | 2002-11-12 | 2004-12-28 | Innopath Software, Inc. | Upgrading of electronic files including automatic recovery from failures and errors occurring during the upgrade |
US20040098361A1 (en) * | 2002-11-18 | 2004-05-20 | Luosheng Peng | Managing electronic file updates on client devices |
US7320010B2 (en) * | 2002-11-18 | 2008-01-15 | Innopath Software, Inc. | Controlling updates of electronic files |
US20040098421A1 (en) * | 2002-11-18 | 2004-05-20 | Luosheng Peng | Scheduling updates of electronic files |
US7003534B2 (en) * | 2002-11-18 | 2006-02-21 | Innopath Software, Inc. | Generating difference files using module information of embedded software components |
US7007049B2 (en) * | 2002-11-18 | 2006-02-28 | Innopath Software, Inc. | Device memory management during electronic file updating |
US7844734B2 (en) * | 2002-11-18 | 2010-11-30 | Innopath Software, Inc. | Dynamic addressing (DA) using a centralized DA manager |
US7099884B2 (en) * | 2002-12-06 | 2006-08-29 | Innopath Software | System and method for data compression and decompression |
US7143115B2 (en) * | 2003-04-15 | 2006-11-28 | Pocket Soft, Inc. | Method and apparatus for finding differences between two computer files efficiently in linear time and for using these differences to update computer files |
US7089270B2 (en) * | 2003-06-20 | 2006-08-08 | Innopath Software | Processing software images for use in generating difference files |
EP1652069B1 (en) * | 2003-07-07 | 2010-08-25 | Red Bend Ltd. | Method and system for updating versions of content stored in a storage device |
US20050010870A1 (en) * | 2003-07-09 | 2005-01-13 | Jinsheng Gu | Post-processing algorithm for byte-level file differencing |
US20050010576A1 (en) * | 2003-07-09 | 2005-01-13 | Liwei Ren | File differencing and updating engines |
WO2005010715A2 (en) | 2003-07-21 | 2005-02-03 | Fusionone, Inc. | Device message management system |
US7031972B2 (en) | 2003-07-21 | 2006-04-18 | Innopath Software, Inc. | Algorithms for block-level code alignment of software binary files |
US20050020308A1 (en) * | 2003-07-23 | 2005-01-27 | David Lai | Dynamically binding Subscriber Identity Modules (SIMs)/User Identity Modules (UIMs) with portable communication devices |
KR100871778B1 (ko) * | 2003-10-23 | 2008-12-05 | InnoPath Software, Inc. | Dynamic addressing method and apparatus using a centralized dynamic addressing manager |
US7203708B2 (en) * | 2003-11-06 | 2007-04-10 | Microsoft Corporation | Optimizing file replication using binary comparisons |
US7870161B2 (en) * | 2003-11-07 | 2011-01-11 | Qiang Wang | Fast signature scan |
US8135683B2 (en) * | 2003-12-16 | 2012-03-13 | International Business Machines Corporation | Method and apparatus for data redundancy elimination at the block level |
US7200603B1 (en) * | 2004-01-08 | 2007-04-03 | Network Appliance, Inc. | In a data storage server, for each subsets which does not contain compressed data after the compression, a predetermined value is stored in the corresponding entry of the corresponding compression group to indicate that corresponding data is compressed |
US7079051B2 (en) * | 2004-03-18 | 2006-07-18 | James Andrew Storer | In-place differential compression |
US7739679B2 (en) * | 2004-04-06 | 2010-06-15 | Hewlett-Packard Development Company, L.P. | Object ordering tool for facilitating generation of firmware update friendly binary image |
US8572280B2 (en) * | 2004-05-06 | 2013-10-29 | Valve Corporation | Method and system for serialization of hierarchically defined objects |
US9542076B1 (en) | 2004-05-12 | 2017-01-10 | Synchronoss Technologies, Inc. | System for and method of updating a personal profile |
CN1998224A (zh) | 2004-05-12 | 2007-07-11 | FusionOne, Inc. | Advanced contact identification system |
US20050262167A1 (en) * | 2004-05-13 | 2005-11-24 | Microsoft Corporation | Efficient algorithm and protocol for remote differential compression on a local device |
US7516451B2 (en) | 2004-08-31 | 2009-04-07 | Innopath Software, Inc. | Maintaining mobile device electronic files including using difference files when upgrading |
US7680805B2 (en) * | 2004-12-30 | 2010-03-16 | Sap Ag | Synchronization method for an object oriented information system (IS) model |
US7849462B2 (en) * | 2005-01-07 | 2010-12-07 | Microsoft Corporation | Image server |
US8073926B2 (en) * | 2005-01-07 | 2011-12-06 | Microsoft Corporation | Virtual machine image server |
US20070094348A1 (en) * | 2005-01-07 | 2007-04-26 | Microsoft Corporation | BITS/RDC integration and BITS enhancements |
US8380686B2 (en) * | 2005-03-14 | 2013-02-19 | International Business Machines Corporation | Transferring data from a primary data replication appliance in a primary data facility to a secondary data replication appliance in a secondary data facility |
EP2194476B1 (en) | 2005-03-22 | 2014-12-03 | Hewlett-Packard Development Company, L.P. | Method and apparatus for creating a record of a software-verification attestation |
US20060236319A1 (en) * | 2005-04-15 | 2006-10-19 | Microsoft Corporation | Version control system |
JP4810915B2 (ja) * | 2005-07-28 | 2011-11-09 | NEC Corporation | Data search apparatus and method, and computer program |
US7822278B1 (en) * | 2005-09-20 | 2010-10-26 | Teradici Corporation | Methods and apparatus for encoding a digital video signal |
WO2007023497A1 (en) * | 2005-08-23 | 2007-03-01 | Red Bend Ltd. | Method and system for in-place updating content stored in a storage device |
KR100772399B1 (ko) * | 2006-02-28 | 2007-11-01 | Samsung Electronics Co., Ltd. | Method for generating a patch file and computer-readable recording medium storing a program for performing the method |
CA2648428C (en) * | 2006-04-07 | 2017-11-21 | Data Storage Group | Data compression and storage techniques |
US8832045B2 (en) | 2006-04-07 | 2014-09-09 | Data Storage Group, Inc. | Data compression and storage techniques |
US20070276912A1 (en) * | 2006-05-05 | 2007-11-29 | Mike Rybak | Apparatus and method for forming and communicating a responsive data message |
US20100293141A1 (en) * | 2006-05-31 | 2010-11-18 | Pankaj Anand | Method and a System for Obtaining Differential Backup |
US7636728B2 (en) * | 2006-06-22 | 2009-12-22 | Microsoft Corporation | Media difference files for compressed catalog files |
US20080098170A1 (en) * | 2006-10-23 | 2008-04-24 | Guthrie William L | System and method for incremental RPO-type algorithm in disk drive |
US20080154986A1 (en) * | 2006-12-22 | 2008-06-26 | Storage Technology Corporation | System and Method for Compression of Data Objects in a Data Storage System |
US7925886B2 (en) | 2007-06-13 | 2011-04-12 | International Business Machines Corporation | Encryption output data generation method and system |
US8805799B2 (en) * | 2007-08-07 | 2014-08-12 | International Business Machines Corporation | Dynamic partial uncompression of a database table |
US7747585B2 (en) * | 2007-08-07 | 2010-06-29 | International Business Machines Corporation | Parallel uncompression of a partially compressed database table determines a count of uncompression tasks that satisfies the query |
US8630981B1 (en) * | 2007-12-21 | 2014-01-14 | Symantec Corporation | Techniques for differencing binary installation packages |
US8181111B1 (en) | 2007-12-31 | 2012-05-15 | Synchronoss Technologies, Inc. | System and method for providing social context to digital activity |
JP2009225260A (ja) * | 2008-03-18 | 2009-10-01 | Fujitsu Ten Ltd | Control device, control method, vehicle control device, and vehicle control system |
US20090287986A1 (en) * | 2008-05-14 | 2009-11-19 | Ab Initio Software Corporation | Managing storage of individually accessible data units |
US8849772B1 (en) | 2008-11-14 | 2014-09-30 | Emc Corporation | Data replication with delta compression |
US8447740B1 (en) * | 2008-11-14 | 2013-05-21 | Emc Corporation | Stream locality delta compression |
US8751462B2 (en) * | 2008-11-14 | 2014-06-10 | Emc Corporation | Delta compression after identity deduplication |
US8527465B1 (en) * | 2008-12-24 | 2013-09-03 | Emc Corporation | System and method for modeling data change over time |
US8438558B1 (en) | 2009-03-27 | 2013-05-07 | Google Inc. | System and method of updating programs and data |
US8412848B2 (en) | 2009-05-29 | 2013-04-02 | Exagrid Systems, Inc. | Method and apparatus for content-aware and adaptive deduplication |
US8255006B1 (en) | 2009-11-10 | 2012-08-28 | Fusionone, Inc. | Event dependent notification system and method |
US8633838B2 (en) * | 2010-01-15 | 2014-01-21 | Neverfail Group Limited | Method and apparatus for compression and network transport of data in support of continuous availability of applications |
US20120124496A1 (en) | 2010-10-20 | 2012-05-17 | Mark Rose | Geographic volume analytics apparatuses, methods and systems |
US8645338B2 (en) | 2010-10-28 | 2014-02-04 | International Business Machines Corporation | Active memory expansion and RDBMS meta data and tooling |
US8943428B2 (en) | 2010-11-01 | 2015-01-27 | Synchronoss Technologies, Inc. | System for and method of field mapping |
BR112013010406A2 (pt) * | 2010-11-02 | 2016-08-09 | I Ces Innovative Compression Engineering Solutions | Method for compressing digital values of image, audio and/or video files |
US10438176B2 (en) | 2011-07-17 | 2019-10-08 | Visa International Service Association | Multiple merchant payment processor platform apparatuses, methods and systems |
US10318941B2 (en) | 2011-12-13 | 2019-06-11 | Visa International Service Association | Payment platform interface widget generation apparatuses, methods and systems |
US9953378B2 (en) * | 2012-04-27 | 2018-04-24 | Visa International Service Association | Social checkout widget generation and integration apparatuses, methods and systems |
US10096022B2 (en) * | 2011-12-13 | 2018-10-09 | Visa International Service Association | Dynamic widget generator apparatuses, methods and systems |
US9110964B1 (en) * | 2013-03-05 | 2015-08-18 | Emc Corporation | Metadata optimization for network replication using differential encoding |
US9235475B1 (en) | 2013-03-05 | 2016-01-12 | Emc Corporation | Metadata optimization for network replication using representative of metadata batch |
US11216468B2 (en) | 2015-02-08 | 2022-01-04 | Visa International Service Association | Converged merchant processing apparatuses, methods and systems |
US20170038978A1 (en) * | 2015-08-05 | 2017-02-09 | HGST Netherlands B.V. | Delta Compression Engine for Similarity Based Data Deduplication |
US10282127B2 (en) | 2017-04-20 | 2019-05-07 | Western Digital Technologies, Inc. | Managing data in a storage system |
US10809928B2 (en) | 2017-06-02 | 2020-10-20 | Western Digital Technologies, Inc. | Efficient data deduplication leveraging sequential chunks or auxiliary databases |
US10503608B2 (en) | 2017-07-24 | 2019-12-10 | Western Digital Technologies, Inc. | Efficient management of reference blocks used in data deduplication |
US11669496B2 (en) * | 2021-07-21 | 2023-06-06 | Huawei Technologies Co., Ltd. | Method and apparatus for replicating a target file between devices |
Family Cites Families (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5297038A (en) * | 1985-09-27 | 1994-03-22 | Sharp Kabushiki Kaisha | Electronic dictionary and method of codifying words therefor |
US4807182A (en) * | 1986-03-12 | 1989-02-21 | Advanced Software, Inc. | Apparatus and method for comparing data groups |
GB8719572D0 (en) * | 1987-08-19 | 1987-09-23 | Krebs M S | Sigscan text retrieval system |
US5003307A (en) * | 1989-01-13 | 1991-03-26 | Stac, Inc. | Data compression apparatus with shift register search means |
US5129082A (en) * | 1990-03-27 | 1992-07-07 | Sun Microsystems, Inc. | Method and apparatus for searching database component files to retrieve information from modified files |
CA2051939A1 (en) | 1990-10-02 | 1992-04-03 | Gary A. Ransford | Digital data registration and differencing compression system |
US5278979A (en) * | 1990-12-20 | 1994-01-11 | International Business Machines Corp. | Version management system using pointers shared by a plurality of versions for indicating active lines of a version |
EP0578207B1 (en) * | 1992-07-06 | 1999-12-01 | Microsoft Corporation | Method for naming and binding objects |
US5384598A (en) | 1992-10-20 | 1995-01-24 | International Business Machines Corporation | System and method for frame differencing video compression and decompression with frame rate scalability |
US5392072A (en) | 1992-10-23 | 1995-02-21 | International Business Machines Inc. | Hybrid video compression system and method capable of software-only decompression in selected multimedia systems |
JP3132738B2 (ja) * | 1992-12-10 | 2001-02-05 | Xerox Corporation | Text search method |
US5649200A (en) * | 1993-01-08 | 1997-07-15 | Atria Software, Inc. | Dynamic rule-based version control system |
US5574906A (en) | 1994-10-24 | 1996-11-12 | International Business Machines Corporation | System and method for reducing storage requirement in backup subsystems utilizing segmented compression and differencing |
US5745906A (en) * | 1995-11-14 | 1998-04-28 | Deltatech Research, Inc. | Method and apparatus for merging delta streams to reconstruct a computer file |
US5715454A (en) * | 1996-03-11 | 1998-02-03 | Hewlett-Packard Company | Version control of documents by independent line change packaging |
US5768532A (en) * | 1996-06-17 | 1998-06-16 | International Business Machines Corporation | Method and distributed database file system for implementing self-describing distributed file objects |
US5794254A (en) * | 1996-12-03 | 1998-08-11 | Fairbanks Systems Group | Incremental computer file backup using a two-step comparison of first two characters in the block and a signature with pre-stored character and signature sets |
US5787431A (en) * | 1996-12-16 | 1998-07-28 | Borland International, Inc. | Database development system with methods for java-string reference lookups of column names |
- 1997-02-03: US application US08/794,134, patent US6374250B2 (not active, Expired - Lifetime)
Cited By (172)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7730031B2 (en) * | 2000-03-01 | 2010-06-01 | Computer Associates Think, Inc. | Method and system for updating an archive of a computer file |
US8019730B2 (en) | 2000-03-01 | 2011-09-13 | Computer Associates Think, Inc. | Method and system for updating an archive of a computer file |
US8019731B2 (en) | 2000-03-01 | 2011-09-13 | Computer Associates Think, Inc. | Method and system for updating an archive of a computer file |
US20110010345A1 (en) * | 2000-03-01 | 2011-01-13 | Computer Associates Think, Inc. | Method and system for updating an archive of a computer file |
US20040220980A1 (en) * | 2000-03-01 | 2004-11-04 | Forster Karl J. | Method and system for updating an archive of a computer file |
US20100174685A1 (en) * | 2000-03-01 | 2010-07-08 | Computer Associates Think, Inc. | Method and system for updating an archive of a computer file |
US20040210885A1 (en) * | 2000-11-14 | 2004-10-21 | Microsoft Corporation | Methods for comparing versions of a program |
US8479189B2 (en) | 2000-11-17 | 2013-07-02 | Hewlett-Packard Development Company, L.P. | Pattern detection preprocessor in an electronic device update generation system |
US8468515B2 (en) | 2000-11-17 | 2013-06-18 | Hewlett-Packard Development Company, L.P. | Initialization and update of software and/or firmware in electronic devices |
US20070028226A1 (en) * | 2000-11-17 | 2007-02-01 | Shao-Chun Chen | Pattern detection preprocessor in an electronic device update generation system |
US7849054B2 (en) | 2000-12-27 | 2010-12-07 | Microsoft Corporation | Method and system for creating and maintaining version-specific properties in a file |
US20050069151A1 (en) * | 2001-03-26 | 2005-03-31 | Microsoft Corporation | Methods and systems for synchronizing visualizations with audio streams |
US6910151B2 (en) * | 2001-06-01 | 2005-06-21 | Farstone Technology Inc. | Backup/recovery system and methods regarding the same |
US20020184559A1 (en) * | 2001-06-01 | 2002-12-05 | Farstone Technology Inc. | Backup/recovery system and methods regarding the same |
US20040205539A1 (en) * | 2001-09-07 | 2004-10-14 | Mak Mingchi Stephen | Method and apparatus for iterative merging of documents |
US20070169073A1 (en) * | 2002-04-12 | 2007-07-19 | O'neill Patrick | Update package generation and distribution network |
US20030229643A1 (en) * | 2002-05-29 | 2003-12-11 | Digimarc Corporation | Creating a footprint of a computer file |
US20080163189A1 (en) * | 2002-08-22 | 2008-07-03 | Shao-Chun Chen | System for generating efficient and compact update packages |
US20040093564A1 (en) * | 2002-11-07 | 2004-05-13 | International Business Machines Corporation | Method and apparatus for visualizing changes in data |
US9547703B2 (en) | 2003-03-28 | 2017-01-17 | Oracle International Corporation | Methods and systems for file replication utilizing differences between versions of files |
US8306954B2 (en) | 2003-03-28 | 2012-11-06 | Oracle International Corporation | Methods and systems for file replication utilizing differences between versions of files |
US7320009B1 (en) * | 2003-03-28 | 2008-01-15 | Novell, Inc. | Methods and systems for file replication utilizing differences between versions of files |
US9934301B2 (en) | 2003-03-28 | 2018-04-03 | Oracle International Corporation | Methods and systems for file replication utilizing differences between versions of files |
US20080040375A1 (en) * | 2003-07-17 | 2008-02-14 | Vo Binh D | Method and apparatus for windowing in entropy encoding |
US20050044294A1 (en) * | 2003-07-17 | 2005-02-24 | Vo Binh Dao | Method and apparatus for window matching in delta compressors |
US8200680B2 (en) | 2003-07-17 | 2012-06-12 | At&T Intellectual Property Ii, L.P. | Method and apparatus for windowing in entropy encoding |
US7454431B2 (en) * | 2003-07-17 | 2008-11-18 | At&T Corp. | Method and apparatus for window matching in delta compressors |
US20110173167A1 (en) * | 2003-07-17 | 2011-07-14 | Binh Dao Vo | Method and apparatus for windowing in entropy encoding |
US7925639B2 (en) | 2003-07-17 | 2011-04-12 | At&T Intellectual Property Ii, L.P. | Method and apparatus for windowing in entropy encoding |
US8555273B1 (en) | 2003-09-17 | 2013-10-08 | Palm, Inc. | Network for updating electronic devices |
US7555531B2 (en) | 2004-04-15 | 2009-06-30 | Microsoft Corporation | Efficient algorithm and protocol for remote differential compression |
US8117173B2 (en) | 2004-04-15 | 2012-02-14 | Microsoft Corporation | Efficient chunking algorithm |
US20050235043A1 (en) * | 2004-04-15 | 2005-10-20 | Microsoft Corporation | Efficient algorithm and protocol for remote differential compression |
US20090271528A1 (en) * | 2004-04-15 | 2009-10-29 | Microsoft Corporation | Efficient chunking algorithm |
US8578361B2 (en) | 2004-04-21 | 2013-11-05 | Palm, Inc. | Updating an electronic device with update agent code |
EP1754322A1 (en) * | 2004-06-10 | 2007-02-21 | Samsung Electronics Co., Ltd. | Apparatus and method for efficient generation of delta files for over-the-air upgrades in a wireless network |
EP1754322A4 (en) * | 2004-06-10 | 2012-03-07 | Samsung Electronics Co Ltd | DEVICE AND METHOD FOR EFFICIENTLY PRODUCING DELTA FILES FOR AIR RADIO UPGRADES IN A WIRELESS NETWORK |
US8526940B1 (en) | 2004-08-17 | 2013-09-03 | Palm, Inc. | Centralized rules repository for smart phone customer care |
US20090228453A1 (en) * | 2004-09-15 | 2009-09-10 | International Business Machines Corporation | Systems and Methods for Efficient Data Searching, Storage and Reduction |
US20090234821A1 (en) * | 2004-09-15 | 2009-09-17 | International Business Machines Corporation | Systems and Methods for Efficient Data Searching, Storage and Reduction |
US7523098B2 (en) * | 2004-09-15 | 2009-04-21 | International Business Machines Corporation | Systems and methods for efficient data searching, storage and reduction |
US8275782B2 (en) | 2004-09-15 | 2012-09-25 | International Business Machines Corporation | Systems and methods for efficient data searching, storage and reduction |
US9400796B2 (en) * | 2004-09-15 | 2016-07-26 | International Business Machines Corporation | Systems and methods for efficient data searching, storage and reduction |
US20090228454A1 (en) * | 2004-09-15 | 2009-09-10 | International Business Machines Corporation | Systems and Methods for Efficient Data Searching, Storage and Reduction |
US8275756B2 (en) | 2004-09-15 | 2012-09-25 | International Business Machines Corporation | Systems and methods for efficient data searching, storage and reduction |
US20090228455A1 (en) * | 2004-09-15 | 2009-09-10 | International Business Machines Corporation | Systems and Methods for Efficient Data Searching, Storage and Reduction |
US20090228534A1 (en) * | 2004-09-15 | 2009-09-10 | International Business Machines Corporation | Systems and Methods for Efficient Data Searching, Storage and Reduction |
US9378211B2 (en) | 2004-09-15 | 2016-06-28 | International Business Machines Corporation | Systems and methods for efficient data searching, storage and reduction |
US20090228456A1 (en) * | 2004-09-15 | 2009-09-10 | International Business Machines Corporation | Systems and Methods for Efficient Data Searching, Storage and Reduction |
US10282257B2 (en) | 2004-09-15 | 2019-05-07 | International Business Machines Corporation | Systems and methods for efficient data searching, storage and reduction |
US20090234855A1 (en) * | 2004-09-15 | 2009-09-17 | International Business Machines Corporation | Systems and Methods for Efficient Data Searching, Storage and Reduction |
US20060059173A1 (en) * | 2004-09-15 | 2006-03-16 | Michael Hirsch | Systems and methods for efficient data searching, storage and reduction |
US9430486B2 (en) * | 2004-09-15 | 2016-08-30 | International Business Machines Corporation | Systems and methods for efficient data searching, storage and reduction |
US10649854B2 (en) | 2004-09-15 | 2020-05-12 | International Business Machines Corporation | Systems and methods for efficient data searching, storage and reduction |
US8275755B2 (en) * | 2004-09-15 | 2012-09-25 | International Business Machines Corporation | Systems and methods for efficient data searching, storage and reduction |
US20060059207A1 (en) * | 2004-09-15 | 2006-03-16 | Diligent Technologies Corporation | Systems and methods for searching of storage data with reduced bandwidth requirements |
US8725705B2 (en) * | 2004-09-15 | 2014-05-13 | International Business Machines Corporation | Systems and methods for searching of storage data with reduced bandwidth requirements |
US8112496B2 (en) | 2004-09-24 | 2012-02-07 | Microsoft Corporation | Efficient algorithm for finding candidate objects for remote differential compression |
US20100064141A1 (en) * | 2004-09-24 | 2010-03-11 | Microsoft Corporation | Efficient algorithm for finding candidate objects for remote differential compression |
US20060085561A1 (en) * | 2004-09-24 | 2006-04-20 | Microsoft Corporation | Efficient algorithm for finding candidate objects for remote differential compression |
US7613787B2 (en) | 2004-09-24 | 2009-11-03 | Microsoft Corporation | Efficient algorithm for finding candidate objects for remote differential compression |
US7596632B1 (en) * | 2004-10-05 | 2009-09-29 | At&T Intellectual Property Ii, L.P. | Windowing by prefix matching |
US7487169B2 (en) * | 2004-11-24 | 2009-02-03 | International Business Machines Corporation | Method for finding the longest common subsequences between files with applications to differential compression |
US20060112264A1 (en) * | 2004-11-24 | 2006-05-25 | International Business Machines Corporation | Method and Computer Program Product for Finding the Longest Common Subsequences Between Files with Applications to Differential Compression |
US20060143168A1 (en) * | 2004-12-29 | 2006-06-29 | Rossmann Albert P | Hash mapping with secondary table having linear probing |
US7788240B2 (en) * | 2004-12-29 | 2010-08-31 | Sap Ag | Hash mapping with secondary table having linear probing |
US20060193159A1 (en) * | 2005-02-17 | 2006-08-31 | Sensory Networks, Inc. | Fast pattern matching using large compressed databases |
US20060253438A1 (en) * | 2005-05-09 | 2006-11-09 | Liwei Ren | Matching engine with signature generation |
US7516130B2 (en) * | 2005-05-09 | 2009-04-07 | Trend Micro, Inc. | Matching engine with signature generation |
US20090193018A1 (en) * | 2005-05-09 | 2009-07-30 | Liwei Ren | Matching Engine With Signature Generation |
US8171002B2 (en) | 2005-05-09 | 2012-05-01 | Trend Micro Incorporated | Matching engine with signature generation |
US8538969B2 (en) * | 2005-06-03 | 2013-09-17 | Adobe Systems Incorporated | Data format for website traffic statistics |
US20060277197A1 (en) * | 2005-06-03 | 2006-12-07 | Bailey Michael P | Data format for website traffic statistics |
US9363309B2 (en) * | 2005-09-29 | 2016-06-07 | Silver Peak Systems, Inc. | Systems and methods for compressing packet data by predicting subsequent data |
US20150074291A1 (en) * | 2005-09-29 | 2015-03-12 | Silver Peak Systems, Inc. | Systems and methods for compressing packet data by predicting subsequent data |
US9549048B1 (en) | 2005-09-29 | 2017-01-17 | Silver Peak Systems, Inc. | Transferring compressed packet data over a network |
US9712463B1 (en) | 2005-09-29 | 2017-07-18 | Silver Peak Systems, Inc. | Workload optimization in a wide area network utilizing virtual switches |
US20070207800A1 (en) * | 2006-02-17 | 2007-09-06 | Daley Robert C | Diagnostics And Monitoring Services In A Mobile Network For A Mobile Device |
US20070276794A1 (en) * | 2006-02-24 | 2007-11-29 | Hiroyasu Nishiyama | Pointer compression/expansion method, a program to execute the method and a computer system using the program |
US7539695B2 (en) * | 2006-02-24 | 2009-05-26 | Hitachi Ltd. | Pointer compression/expansion method, a program to execute the method and a computer system using the program |
US8893110B2 (en) | 2006-06-08 | 2014-11-18 | Qualcomm Incorporated | Device management in a network |
US8776022B2 (en) | 2006-06-22 | 2014-07-08 | Microsoft Corporation | Delta compression using multiple pointers |
US20070300206A1 (en) * | 2006-06-22 | 2007-12-27 | Microsoft Corporation | Delta compression using multiple pointers |
US8793655B2 (en) | 2006-06-22 | 2014-07-29 | Microsoft Corporation | Delta compression using multiple pointers |
US7861224B2 (en) * | 2006-06-22 | 2010-12-28 | Microsoft Corporation | Delta compression using multiple pointers |
US20080005506A1 (en) * | 2006-06-30 | 2008-01-03 | Data Equation Limited | Data processing |
US7502807B2 (en) * | 2006-06-30 | 2009-03-10 | Microsoft Corporation | Defining and extracting a flat list of search properties from a rich structured type |
US8886656B2 (en) * | 2006-06-30 | 2014-11-11 | Data Equation Limited | Data processing |
US20080005135A1 (en) * | 2006-06-30 | 2008-01-03 | Microsoft Corporation | Defining and extracting a flat list of search properties from a rich structured type |
US8752044B2 (en) | 2006-07-27 | 2014-06-10 | Qualcomm Incorporated | User experience and dependency management in a mobile device |
US9438538B2 (en) | 2006-08-02 | 2016-09-06 | Silver Peak Systems, Inc. | Data matching using flow based packet data storage |
US9584403B2 (en) | 2006-08-02 | 2017-02-28 | Silver Peak Systems, Inc. | Communications scheduler |
US9961010B2 (en) | 2006-08-02 | 2018-05-01 | Silver Peak Systems, Inc. | Communications scheduler |
US20080086513A1 (en) * | 2006-10-04 | 2008-04-10 | O'brien Thomas Edward | Using file backup software to generate an alert when a file modification policy is violated |
US7769731B2 (en) * | 2006-10-04 | 2010-08-03 | International Business Machines Corporation | Using file backup software to generate an alert when a file modification policy is violated |
US20080148147A1 (en) * | 2006-12-13 | 2008-06-19 | Pado Metaware Ab | Method and system for facilitating the examination of documents |
US8209605B2 (en) | 2006-12-13 | 2012-06-26 | Pado Metaware Ab | Method and system for facilitating the examination of documents |
US20080177782A1 (en) * | 2007-01-10 | 2008-07-24 | Pado Metaware Ab | Method and system for facilitating the production of documents |
US20080183734A1 (en) * | 2007-01-31 | 2008-07-31 | Anurag Sharma | Manipulating the original content of at least one original read-only computer file in a computer file-system in a computer system |
US8082260B2 (en) * | 2007-01-31 | 2011-12-20 | International Business Machines Corporation | Handling content of a read-only file in a computer's file system |
US20080313243A1 (en) * | 2007-05-24 | 2008-12-18 | Pado Metaware Ab | method and system for harmonization of variants of a sequential file |
US8010507B2 (en) * | 2007-05-24 | 2011-08-30 | Pado Metaware Ab | Method and system for harmonization of variants of a sequential file |
JP2011501839A (ja) * | 2007-10-04 | 2011-01-13 | Global Infinipool GmbH | Method for accessing data entities and their versions |
US20090199090A1 (en) * | 2007-11-23 | 2009-08-06 | Timothy Poston | Method and system for digital file flow management |
US9613071B1 (en) | 2007-11-30 | 2017-04-04 | Silver Peak Systems, Inc. | Deferred data storage |
US20090228716A1 (en) * | 2008-02-08 | 2009-09-10 | Pado Metaware Ab | Method and system for distributed coordination of access to digital files |
US9397951B1 (en) | 2008-07-03 | 2016-07-19 | Silver Peak Systems, Inc. | Quality of service using multiple flows |
US10313930B2 (en) | 2008-07-03 | 2019-06-04 | Silver Peak Systems, Inc. | Virtual wide area network overlays |
US11419011B2 (en) | 2008-07-03 | 2022-08-16 | Hewlett Packard Enterprise Development Lp | Data transmission via bonded tunnels of a virtual wide area network overlay with error correction |
US11412416B2 (en) | 2008-07-03 | 2022-08-09 | Hewlett Packard Enterprise Development Lp | Data transmission via bonded tunnels of a virtual wide area network overlay |
US10805840B2 (en) | 2008-07-03 | 2020-10-13 | Silver Peak Systems, Inc. | Data transmission via a virtual wide area network overlay |
US9717021B2 (en) | 2008-07-03 | 2017-07-25 | Silver Peak Systems, Inc. | Virtual network overlay |
US20100318759A1 (en) * | 2009-06-15 | 2010-12-16 | Microsoft Corporation | Distributed rdc chunk store |
US20110113016A1 (en) * | 2009-11-06 | 2011-05-12 | International Business Machines Corporation | Method and Apparatus for Data Compression |
US8380688B2 (en) * | 2009-11-06 | 2013-02-19 | International Business Machines Corporation | Method and apparatus for data compression |
US20110295894A1 (en) * | 2010-05-27 | 2011-12-01 | Samsung Sds Co., Ltd. | System and method for matching pattern |
US9392005B2 (en) * | 2010-05-27 | 2016-07-12 | Samsung Sds Co., Ltd. | System and method for matching pattern |
US8930431B2 (en) | 2010-12-15 | 2015-01-06 | International Business Machines Corporation | Parallel computation of a remainder by division of a sequence of bytes |
US8935310B2 (en) | 2010-12-15 | 2015-01-13 | International Business Machines Corporation | Parallel computation of a remainder by division of a sequence of bytes |
US9405509B2 (en) | 2010-12-15 | 2016-08-02 | International Business Machines Corporation | Parallel computation of a remainder by division of a sequence of bytes |
US8495019B2 (en) | 2011-03-08 | 2013-07-23 | Ca, Inc. | System and method for providing assured recovery and replication |
US8589363B2 (en) * | 2011-07-19 | 2013-11-19 | Exagrid Systems, Inc. | Systems and methods for managing delta version chains |
US20140122425A1 (en) * | 2011-07-19 | 2014-05-01 | Jamey C. Poirier | Systems And Methods For Managing Delta Version Chains |
US20130024435A1 (en) * | 2011-07-19 | 2013-01-24 | Exagrid Systems, Inc. | Systems and methods for managing delta version chains |
US9430546B2 (en) * | 2011-07-19 | 2016-08-30 | Exagrid Systems, Inc. | Systems and methods for managing delta version chains |
US9906630B2 (en) | 2011-10-14 | 2018-02-27 | Silver Peak Systems, Inc. | Processing data packets in performance enhancing proxy (PEP) environment |
US9626224B2 (en) | 2011-11-03 | 2017-04-18 | Silver Peak Systems, Inc. | Optimizing available computing resources within a virtual environment |
US8711013B2 (en) | 2012-01-17 | 2014-04-29 | Lsi Corporation | Coding circuitry for difference-based data transformation |
US11374845B2 (en) | 2014-07-30 | 2022-06-28 | Hewlett Packard Enterprise Development Lp | Determining a transit appliance for data traffic to a software service |
US11381493B2 (en) | 2014-07-30 | 2022-07-05 | Hewlett Packard Enterprise Development Lp | Determining a transit appliance for data traffic to a software service |
US10812361B2 (en) | 2014-07-30 | 2020-10-20 | Silver Peak Systems, Inc. | Determining a transit appliance for data traffic to a software service |
US9948496B1 (en) | 2014-07-30 | 2018-04-17 | Silver Peak Systems, Inc. | Determining a transit appliance for data traffic to a software service |
US10719588B2 (en) | 2014-09-05 | 2020-07-21 | Silver Peak Systems, Inc. | Dynamic monitoring and authorization of an optimization device |
US20210192015A1 (en) * | 2014-09-05 | 2021-06-24 | Silver Peak Systems, Inc. | Dynamic monitoring and authorization of an optimization device |
US11954184B2 (en) | 2014-09-05 | 2024-04-09 | Hewlett Packard Enterprise Development Lp | Dynamic monitoring and authorization of an optimization device |
US11921827B2 (en) * | 2014-09-05 | 2024-03-05 | Hewlett Packard Enterprise Development Lp | Dynamic monitoring and authorization of an optimization device |
US9875344B1 (en) | 2014-09-05 | 2018-01-23 | Silver Peak Systems, Inc. | Dynamic monitoring and authorization of an optimization device |
US11868449B2 (en) | 2014-09-05 | 2024-01-09 | Hewlett Packard Enterprise Development Lp | Dynamic monitoring and authorization of an optimization device |
US10885156B2 (en) | 2014-09-05 | 2021-01-05 | Silver Peak Systems, Inc. | Dynamic monitoring and authorization of an optimization device |
US20170364701A1 (en) * | 2015-06-02 | 2017-12-21 | ALTR Solutions, Inc. | Storing differentials of files in a distributed blockchain |
US10970413B2 (en) | 2015-06-02 | 2021-04-06 | ALTR Solutions, Inc. | Fragmenting data for the purposes of persistent storage across multiple immutable data structures |
US10121019B2 (en) * | 2015-06-02 | 2018-11-06 | ALTR Solutions, Inc. | Storing differentials of files in a distributed blockchain |
US11336553B2 (en) | 2015-12-28 | 2022-05-17 | Hewlett Packard Enterprise Development Lp | Dynamic monitoring and visualization for network health characteristics of network device pairs |
US10164861B2 (en) | 2015-12-28 | 2018-12-25 | Silver Peak Systems, Inc. | Dynamic monitoring and visualization for network health characteristics |
US10771370B2 (en) | 2015-12-28 | 2020-09-08 | Silver Peak Systems, Inc. | Dynamic monitoring and visualization for network health characteristics |
US10701172B2 (en) | 2015-12-31 | 2020-06-30 | International Business Machines Corporation | Clients for storage services |
US11316530B2 (en) | 2015-12-31 | 2022-04-26 | International Business Machines Corporation | Adaptive compression for data services |
US20170195446A1 (en) * | 2015-12-31 | 2017-07-06 | International Business Machines Corporation | Enhanced storage clients |
US10015274B2 (en) * | 2015-12-31 | 2018-07-03 | International Business Machines Corporation | Enhanced storage clients |
US10715623B2 (en) | 2015-12-31 | 2020-07-14 | International Business Machines Corporation | Caching for data store clients using expiration times |
US10320933B2 (en) * | 2015-12-31 | 2019-06-11 | International Business Machines Corporation | Caching for data store clients |
US11757740B2 (en) | 2016-06-13 | 2023-09-12 | Hewlett Packard Enterprise Development Lp | Aggregation of select network traffic statistics |
US11757739B2 (en) | 2016-06-13 | 2023-09-12 | Hewlett Packard Enterprise Development Lp | Aggregation of select network traffic statistics |
US11601351B2 (en) | 2016-06-13 | 2023-03-07 | Hewlett Packard Enterprise Development Lp | Aggregation of select network traffic statistics |
US10432484B2 (en) | 2016-06-13 | 2019-10-01 | Silver Peak Systems, Inc. | Aggregating select network traffic statistics |
US10326551B2 (en) | 2016-08-19 | 2019-06-18 | Silver Peak Systems, Inc. | Forward packet recovery with constrained network overhead |
US10848268B2 (en) | 2016-08-19 | 2020-11-24 | Silver Peak Systems, Inc. | Forward packet recovery with constrained network overhead |
US9967056B1 (en) | 2016-08-19 | 2018-05-08 | Silver Peak Systems, Inc. | Forward packet recovery with constrained overhead |
US11424857B2 (en) | 2016-08-19 | 2022-08-23 | Hewlett Packard Enterprise Development Lp | Forward packet recovery with constrained network overhead |
US10892978B2 (en) | 2017-02-06 | 2021-01-12 | Silver Peak Systems, Inc. | Multi-level learning for classifying traffic flows from first packet data |
US10771394B2 (en) | 2017-02-06 | 2020-09-08 | Silver Peak Systems, Inc. | Multi-level learning for classifying traffic flows on a first packet from DNS data |
US10257082B2 (en) | 2017-02-06 | 2019-04-09 | Silver Peak Systems, Inc. | Multi-level learning for classifying traffic flows |
US11582157B2 (en) | 2017-02-06 | 2023-02-14 | Hewlett Packard Enterprise Development Lp | Multi-level learning for classifying traffic flows on a first packet from DNS response data |
US11729090B2 (en) | 2017-02-06 | 2023-08-15 | Hewlett Packard Enterprise Development Lp | Multi-level learning for classifying network traffic flows from first packet data |
US11044202B2 (en) | 2017-02-06 | 2021-06-22 | Silver Peak Systems, Inc. | Multi-level learning for predicting and classifying traffic flows from first packet data |
US11212210B2 (en) | 2017-09-21 | 2021-12-28 | Silver Peak Systems, Inc. | Selective route exporting using source type |
US11805045B2 (en) | 2017-09-21 | 2023-10-31 | Hewlett Packard Enterprise Development Lp | Selective routing |
US11126594B2 (en) * | 2018-02-09 | 2021-09-21 | Exagrid Systems, Inc. | Delta compression |
US20190251189A1 (en) * | 2018-02-09 | 2019-08-15 | Exagrid Systems, Inc. | Delta Compression |
US10637721B2 (en) | 2018-03-12 | 2020-04-28 | Silver Peak Systems, Inc. | Detecting path break conditions while minimizing network overhead |
US10887159B2 (en) | 2018-03-12 | 2021-01-05 | Silver Peak Systems, Inc. | Methods and systems for detecting path break conditions while minimizing network overhead |
US11405265B2 (en) | 2018-03-12 | 2022-08-02 | Hewlett Packard Enterprise Development Lp | Methods and systems for detecting path break conditions while minimizing network overhead |
Also Published As
Publication number | Publication date |
---|---|
US6374250B2 (en) | 2002-04-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US6374250B2 (en) | System and method for differential compression of data from a plurality of binary sources | |
US6542906B2 (en) | Method of and an apparatus for merging a sequence of delta files | |
Burns | Differential compression: a generalized solution for binary files | |
US5742818A (en) | Method and system of converting data from a source file system to a target file system | |
US9454318B2 (en) | Efficient data storage system | |
US7539685B2 (en) | Index key normalization | |
CN107210753B (zh) | Lossless reduction of data by deriving data from prime data elements residing in a content-associative sieve | |
JP4364790B2 (ja) | Byte-level file differencing and updating algorithm | |
US6233589B1 (en) | Method and system for reflecting differences between two files | |
US6574591B1 (en) | File systems image transfer between dissimilar file systems | |
US8306954B2 (en) | Methods and systems for file replication utilizing differences between versions of files | |
US5717912A (en) | Method and apparatus for rapid full text index creation | |
AU2005284737B2 (en) | Systems and methods for searching and storage of data | |
Reichenberger | Delta storage for arbitrary non-text files | |
US7647291B2 (en) | B-tree compression using normalized index keys | |
US20020078062A1 (en) | File processing method, data processing apparatus and storage medium | |
Burns et al. | In-place reconstruction of delta compressed files | |
Burns et al. | A linear time, constant space differencing algorithm | |
US7379940B1 (en) | Focal point compression method and apparatus | |
US7685186B2 (en) | Optimized and robust in-place data transformation | |
Shapira et al. | In place differential file compression | |
Crowley | Data structures for text sequences | |
US20070220026A1 (en) | Efficient caching for large scale distributed computations | |
Ossefort | A sorting routine of intermediate size and speed | |
Oles | ACF: An Automatically Compressed File System for UNIX |
Legal Events
Code | Title | Description |
---|---|---|
AS | Assignment | Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW Y; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:AJTAI, MIKLOS;BURNS, RANDAL CHILTON;FAGIN, RONALD;AND OTHERS;REEL/FRAME:008612/0499;SIGNING DATES FROM 19970310 TO 19970407 |
STCF | Information on status: patent grant | Free format text: PATENTED CASE |
FPAY | Fee payment | Year of fee payment: 4 |
FPAY | Fee payment | Year of fee payment: 8 |
REMI | Maintenance fee reminder mailed | |
AS | Assignment | Owner name: TWITTER, INC., CALIFORNIA; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:INTERNATIONAL BUSINESS MACHINES CORPORATION;REEL/FRAME:032075/0404; Effective date: 20131230 |
FPAY | Fee payment | Year of fee payment: 12 |
SULP | Surcharge for late payment | Year of fee payment: 11 |