WO2009126644A3 - Methods and systems for improved throughput performance in a distributed data de-duplication environment

Info

Publication number
WO2009126644A3
Authority
WO
WIPO (PCT)
Prior art keywords
data
block
systems
methods
server
Prior art date
Application number
PCT/US2009/039801
Other languages
French (fr)
Other versions
WO2009126644A2 (en)
Inventor
Roderick B. Wideman
Original Assignee
Quantum Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Quantum Corporation
Priority to EP09730872.0A (EP2263188A4)
Publication of WO2009126644A2
Publication of WO2009126644A3

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/064 Management of blocks
    • G06F3/0641 De-duplication techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0608 Saving storage space on storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

In accordance with some embodiments of the systems and methods described here, a data storage system that may include data de-duplication may receive a stream of data and parse the stream of data into a block at a local client node. Additionally, in some embodiments, a code that represents the block of data might be determined at the local client node. This code, representing the block of data, may be sent to a server. In accordance with various embodiments, the server may determine whether a block is unique, for example, based on the code received at the server. In various embodiments, the server might write a unique block to a file at the local client node and update metadata.
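
The flow in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the fixed block size, the use of SHA-256 as the representative code, the in-memory `block_store` dictionary standing in for storage, and every name below (`parse_blocks`, `block_code`, `DedupServer`, `client_store`, `restore`) are assumptions made for illustration. The sketch also simplifies who performs the write: the unique block is stored from the client side once the server reports the code as new, so block data never travels to the server.

```python
import hashlib
from typing import Dict, Iterator, List, Set

BLOCK_SIZE = 4  # hypothetical fixed block size, kept tiny for demonstration


def parse_blocks(stream: bytes, size: int = BLOCK_SIZE) -> Iterator[bytes]:
    """Client side: parse the incoming stream into fixed-size blocks."""
    for offset in range(0, len(stream), size):
        yield stream[offset:offset + size]


def block_code(block: bytes) -> str:
    """Client side: compute a code representing the block (SHA-256 is an assumption)."""
    return hashlib.sha256(block).hexdigest()


class DedupServer:
    """Server side: sees only codes, decides uniqueness, and keeps metadata."""

    def __init__(self) -> None:
        self.known_codes: Set[str] = set()
        self.metadata: Dict[str, List[str]] = {}

    def is_unique(self, code: str) -> bool:
        """Return True the first time a code is seen, and remember it."""
        if code in self.known_codes:
            return False
        self.known_codes.add(code)
        return True

    def update_metadata(self, name: str, code: str) -> None:
        """Record the ordered list of codes that make up a named stream."""
        self.metadata.setdefault(name, []).append(code)


def client_store(name: str, stream: bytes, server: DedupServer,
                 block_store: Dict[str, bytes]) -> int:
    """Store a stream, writing only unique blocks; return how many were written."""
    written = 0
    for block in parse_blocks(stream):
        code = block_code(block)
        if server.is_unique(code):       # only the code is sent to the server
            block_store[code] = block    # the unique block is written once
            written += 1
        server.update_metadata(name, code)
    return written


def restore(name: str, server: DedupServer, block_store: Dict[str, bytes]) -> bytes:
    """Rebuild a stream from its metadata and the stored unique blocks."""
    return b"".join(block_store[code] for code in server.metadata[name])
```

For example, storing `b"AAAABBBBAAAACCCC"` with this sketch writes three blocks rather than four, because the repeated `AAAA` block is recognised by its code, and `restore` reassembles the original bytes from the recorded metadata.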
PCT/US2009/039801 2008-04-08 2009-04-07 Methods and systems for improved throughput performance in a distributed data de-duplication environment WO2009126644A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP09730872.0A EP2263188A4 (en) 2008-04-08 2009-04-07 Methods and systems for improved throughput performance in a distributed data de-duplication environment

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US12/099,698 2008-04-08
US12/099,698 US8751561B2 (en) 2008-04-08 2008-04-08 Methods and systems for improved throughput performance in a distributed data de-duplication environment

Publications (2)

Publication Number Publication Date
WO2009126644A2 WO2009126644A2 (en) 2009-10-15
WO2009126644A3 WO2009126644A3 (en) 2010-01-14

Family

ID=41134248

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2009/039801 WO2009126644A2 (en) 2008-04-08 2009-04-07 Methods and systems for improved throughput performance in a distributed data de-duplication environment

Country Status (3)

Country Link
US (1) US8751561B2 (en)
EP (1) EP2263188A4 (en)
WO (1) WO2009126644A2 (en)

Families Citing this family (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10642794B2 (en) * 2008-09-11 2020-05-05 Vmware, Inc. Computer storage deduplication
US8452731B2 (en) 2008-09-25 2013-05-28 Quest Software, Inc. Remote backup and restore
US8200923B1 (en) 2008-12-31 2012-06-12 Emc Corporation Method and apparatus for block level data de-duplication
WO2010135430A1 (en) 2009-05-19 2010-11-25 Vmware, Inc. Shortcut input/output in virtual machine systems
US8121993B2 (en) * 2009-10-28 2012-02-21 Oracle America, Inc. Data sharing and recovery within a network of untrusted storage devices using data object fingerprinting
US8499131B2 (en) * 2010-04-13 2013-07-30 Hewlett-Packard Development Company, L.P. Capping a number of locations referred to by chunk references
US9053032B2 (en) 2010-05-05 2015-06-09 Microsoft Technology Licensing, Llc Fast and low-RAM-footprint indexing for data deduplication
US8694703B2 (en) 2010-06-09 2014-04-08 Brocade Communications Systems, Inc. Hardware-accelerated lossless data compression
US9401967B2 (en) 2010-06-09 2016-07-26 Brocade Communications Systems, Inc. Inline wire speed deduplication system
GB2470498B (en) * 2010-07-19 2011-04-06 Quantum Corp Establishing parse scope
GB2470497B (en) * 2010-07-19 2011-06-15 Quantum Corp Collaborative, distributed, data de-duplication
US10394757B2 (en) 2010-11-18 2019-08-27 Microsoft Technology Licensing, Llc Scalable chunk store for data deduplication
US9110936B2 (en) 2010-12-28 2015-08-18 Microsoft Technology Licensing, Llc Using index partitioning and reconciliation for data deduplication
US9639543B2 (en) 2010-12-28 2017-05-02 Microsoft Technology Licensing, Llc Adaptive index for data deduplication
US8438137B2 (en) 2011-02-28 2013-05-07 Hewlett-Packard Development Company, L.P. Automatic selection of source or target deduplication
US8904128B2 (en) 2011-06-08 2014-12-02 Hewlett-Packard Development Company, L.P. Processing a request to restore deduplicated data
US8990171B2 (en) 2011-09-01 2015-03-24 Microsoft Corporation Optimization of a partially deduplicated file
US8620886B1 (en) * 2011-09-20 2013-12-31 Netapp Inc. Host side deduplication
US8712963B1 (en) 2011-12-22 2014-04-29 Emc Corporation Method and apparatus for content-aware resizing of data chunks for replication
US8639669B1 (en) * 2011-12-22 2014-01-28 Emc Corporation Method and apparatus for determining optimal chunk sizes of a deduplicated storage system
US9417811B2 (en) 2012-03-07 2016-08-16 International Business Machines Corporation Efficient inline data de-duplication on a storage system
WO2014089760A1 (en) * 2012-12-11 2014-06-19 华为技术有限公司 Method and apparatus for compressing data
CN103092927B (en) * 2012-12-29 2016-01-20 华中科技大学 File rapid read-write method under a kind of distributed environment
US10437784B2 (en) * 2015-01-30 2019-10-08 SK Hynix Inc. Method and system for endurance enhancing, deferred deduplication with hardware-hash-enabled storage device
US10275467B2 (en) * 2015-12-15 2019-04-30 Microsoft Technology Licensing, Llc Multi-level high availability model for an object storage service
CN105721607A (en) * 2016-03-30 2016-06-29 苏州美天网络科技有限公司 Data storage system applicable to stream media
TWI758825B (en) * 2020-08-18 2022-03-21 鴻海精密工業股份有限公司 Method and device of compressing configuration data, and method and device of decompressing configuration data

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020174180A1 (en) * 2001-03-16 2002-11-21 Novell, Inc. Client-server model for synchronization of files
US20060069719A1 (en) * 2002-10-30 2006-03-30 Riverbed Technology, Inc. Transaction accelerator for client-server communication systems
WO2006083958A2 (en) * 2005-02-01 2006-08-10 Newsilike Media Group, Inc. Systems and methods for use of structured and unstructured distributed data
US20070124415A1 (en) * 2005-11-29 2007-05-31 Etai Lev-Ran Method and apparatus for reducing network traffic over low bandwidth links
US20080077630A1 (en) * 2006-09-22 2008-03-27 Keith Robert O Accelerated data transfer using common prior data segments

Family Cites Families (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3608280A (en) * 1969-03-26 1971-09-28 Bendix Corp Microwave energy shielding system
US5616928A (en) * 1977-04-13 1997-04-01 Russell; Virginia Protecting personnel and the environment from radioactive emissions by controlling such emissions and safely disposing of their energy
US5122332A (en) * 1977-04-13 1992-06-16 Virginia Russell Protecting organisms and the environment from harmful radiation by controlling such radiation and safely disposing of its energy
DE3611919A1 (en) * 1985-05-18 1986-11-20 OLMAC B.V., Nievwejein RADIATION PROTECTION MATTRESS
US4703133A (en) * 1986-06-05 1987-10-27 Miller John S Electromagnetic shield
US5153378A (en) * 1991-05-10 1992-10-06 Garvy Jr John W Personal space shielding apparatus
US5990810A (en) * 1995-02-17 1999-11-23 Williams; Ross Neil Method for partitioning a block of data into subblocks and for storing and communcating such subblocks
US5761053A (en) * 1996-05-08 1998-06-02 W. L. Gore & Associates, Inc. Faraday cage
DE69728650D1 (en) * 1996-10-04 2004-05-19 Matsushita Electric Ind Co Ltd PROTECTIVE DEVICE FROM ELECTROMAGNETIC FIELDS
US6374266B1 (en) * 1998-07-28 2002-04-16 Ralph Shnelvar Method and apparatus for storing information in a data processing system
TW415232U (en) * 1999-10-26 2000-12-11 Taipei Veterans General Hospit Electric physiology Faraday box
US20020011189A1 (en) * 2000-02-25 2002-01-31 Leightner Paul E. System and method for blocking geopathic radiation
AT408838B (en) * 2000-04-20 2002-03-25 Ali Alishahi ENERGY WAVE SHIELDING CAPSULE + GENERATOR FOR THE GENERATION OF NEGATIVE WAVES, FOR COMBATING DIFFERENT FORMS OF CANCER
GB0104910D0 (en) * 2001-02-28 2001-04-18 Ibm Devices to reduce electro-magnetic field radiation
US6992314B2 (en) * 2001-06-26 2006-01-31 Kabushiki Kaisha Shunkosha Device for eliminating electromagnetic waves
AU2002304842A1 (en) * 2001-08-20 2003-03-10 Datacentertechnologies N.V. File backup system and method
US20030046335A1 (en) * 2001-08-30 2003-03-06 International Business Machines Corporation Efficiently serving large objects in a distributed computing network
US7496604B2 (en) * 2001-12-03 2009-02-24 Aol Llc Reducing duplication of files on a network
US7055008B2 (en) * 2003-01-22 2006-05-30 Falconstor Software, Inc. System and method for backing up data
US20060212439A1 (en) * 2005-03-21 2006-09-21 Microsoft Corporation System and method of efficient data backup in a networking environment
GB0517113D0 (en) * 2005-08-20 2005-09-28 Ibm Methods, apparatus and computer programs for data communication efficiency
US8862841B2 (en) * 2006-04-25 2014-10-14 Hewlett-Packard Development Company, L.P. Method and system for scaleable, distributed, differential electronic-data backup and archiving
US8131682B2 (en) * 2006-05-11 2012-03-06 Hitachi, Ltd. System and method for replacing contents addressable storage
US7840537B2 (en) * 2006-12-22 2010-11-23 Commvault Systems, Inc. System and method for storing redundant information
US8768895B2 (en) * 2007-04-11 2014-07-01 Emc Corporation Subsegmenting for efficient storage, resemblance determination, and transmission
US20090132571A1 (en) * 2007-11-16 2009-05-21 Microsoft Corporation Efficient use of randomness in min-hashing
US9766983B2 (en) * 2008-03-05 2017-09-19 Ca, Inc. Proximity and in-memory map based signature searching for duplicate data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP2263188A4 *

Also Published As

Publication number Publication date
EP2263188A4 (en) 2014-03-05
US8751561B2 (en) 2014-06-10
WO2009126644A2 (en) 2009-10-15
US20090254609A1 (en) 2009-10-08
EP2263188A2 (en) 2010-12-22

Similar Documents

Publication Publication Date Title
WO2009126644A3 (en) Methods and systems for improved throughput performance in a distributed data de-duplication environment
WO2008080143A3 (en) Method and system for searching stored data
WO2006102621A3 (en) System and method for tracking changes to files in streaming applications
WO2010114777A3 (en) Differential file and system restores from peers and the cloud
WO2011072072A3 (en) Methods and systems for providing a unified namespace for multiple network protocols
AU2016219688A1 (en) Matching techniques for cross-platform monitoring and information
WO2007095619A3 (en) Systems and methods for indexing and searching data records based on distance metrics
WO2013101826A3 (en) Crowd determined file size uploading methods, devices and systems
WO2009134772A3 (en) Peer-to-peer redundant file server system and methods
EP2472829A8 (en) Methods, systems and devices for horizontally scalable high-availability dynamic context-based routing
GB201212411D0 (en) Transmission of map-reduce data based on a storage network or a storage network file system
WO2007071343A3 (en) Systems and methods for finding log files generated by a distributed computer
WO2008070484A3 (en) Methods and systems for quick and efficient data management and/or processing
WO2010077972A3 (en) Method and apparatus to implement a hierarchical cache system with pnfs
WO2008024317A3 (en) Automatic load spreading in a clustered network storage system
WO2006012317A3 (en) Methods and systems for indexing files and adding associated metadata to index and metadata databases based upon the power state of a data processing device
WO2008005731A3 (en) Systems and methods for power management in relation to a wireless storage device
TW200708943A (en) Intelligent auto-archiving
WO2008113647A3 (en) Shared disk clones
WO2008139640A1 (en) Download program, information storage medium, download system and download method
WO2010037031A3 (en) System and method for aggregating web feeds relevant to a geographical locale from multiple sources
WO2005019985A3 (en) System for incorporating information about a source and usage of a media asset into the asset itself
IL186953A0 (en) System and method for caching network file systems
WO2008103447A3 (en) Implementation of a structured query language interface in a distributed database
ATE471624T1 (en) CLIENT ERROR FENCE MECHANISM TO FENCE NETWORK FILE SYSTEM DATA IN A HOST CLUSTER ENVIRONMENT

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 09730872; Country of ref document: EP; Kind code of ref document: A2)
WWE WIPO information: entry into national phase (Ref document number: 2009730872; Country of ref document: EP)
NENP Non-entry into the national phase (Ref country code: DE)