CN103617177A - Stackable repeating data deletion file system - Google Patents


Info

Publication number
CN103617177A
Authority
CN
China
Prior art keywords
data
file system
repeating
deletion
service module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201310541623.5A
Other languages
Chinese (zh)
Inventor
王恩东
文中领
张立强
孟圣智
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Beijing Electronic Information Industry Co Ltd
Original Assignee
Inspur Beijing Electronic Information Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Beijing Electronic Information Industry Co Ltd
Priority to CN201310541623.5A
Publication of CN103617177A
Priority to PCT/CN2014/089303 (WO2015067128A1)
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/174 Redundancy elimination performed by the file system
    • G06F16/1748 De-duplication implemented within the file system, e.g. based on file segments

Abstract

The invention provides a stackable data deduplication file system comprising a file system service module and a deduplication service module. For normal data, the file system service module imports the data of the underlying file system into the stackable deduplication file system by direct interface conversion; for data that has already been deduplicated, it reads the corresponding data attribute identifier and redirects the IO flow, giving transparent, seamless access to the deduplicated data. The deduplication service module reads the file system log data exported by the file system service module, parses the log content, then computes data signatures, detects and deletes duplicate data, and marks the data as deduplicated. The system can make full use of the storage capacity of an existing storage system without any hardware upgrade, which keeps additional investment to a minimum. Because of its stackable software design, it provides deduplication on top of an existing file system, optimizes the data storage structure, and reduces the space the storage system occupies.

Description

A stackable data deduplication file system
Technical field
The present invention relates to the field of computer storage, and specifically to a data deduplication file system implemented with stackable file system technology.
Background
In large storage systems, the conflict between rapid data growth and the relatively slow pace of storage device upgrades is acute. To ease capacity growth in the storage system, reduce the space that data occupies, cut costs, and make maximum use of existing resources, data deduplication has become an indispensable key technology in large-scale systems.
With data deduplication, users can obtain a marked reduction in data volume, which can greatly reduce the bandwidth demand of the storage system and cut operating and maintenance costs. Data reduction greatly decreases the back-end storage capacity actually required, which in turn simplifies storage administration and lowers management cost.
However, most popular data deduplication schemes target nearline storage and backup storage and are often tightly coupled to a backup system, so they cannot currently provide general file system services. Few products offer deduplication directly in an online system, and all of them require a proprietary file system format. These proprietary file systems often have many limitations in performance, functionality, reliability, and scalability, which makes them difficult to apply directly in large-scale online storage systems.
Existing large storage systems are usually built on mature file systems such as ext3, ext4, xfs, and lustre, which have no deduplication function of their own. Adding deduplication means adopting a proprietary file system, accepting a clearly perceptible performance drop, and carrying out a large-scale data migration. The resulting time and space costs are so high that this is not feasible for a storage system that already holds massive amounts of data.
In view of this situation, the present invention designs a stackable data deduplication file system that provides deduplication on top of an existing mature file system, largely preserves the performance of the original storage system, and requires almost no data migration.
Summary of the invention
The present invention designs and implements a stackable data deduplication file system that makes full use of the storage capacity of an existing storage system without any hardware upgrade, keeping additional investment to a minimum. Through its stackable software design, it provides deduplication on top of an existing file system, optimizes the data storage structure, and reduces the space the storage system occupies.
The system comprises:
A file system service module which, for normal data, imports the data of the underlying file system into this file system by direct interface conversion, and which, for data that has undergone deduplication, reads the corresponding data attribute identifier and redirects the IO flow, providing transparent, seamless access to the deduplicated data;
A deduplication service module, which reads the file system log data exported by the file system service module, parses the log content, then computes data signatures, detects and deletes duplicate data, and marks the data as deduplicated.
The beneficial effects of the invention are as follows. The design based on a stackable file system makes full use of the existing storage system. Simply installing the software described in this patent lets an existing file system support data deduplication and save storage space, with no data migration, while retaining the IO performance of the original storage system. Existing equipment is fully reused and the investment is protected.
Brief description of the drawings
Figure 1 is an architecture diagram of the stackable data deduplication file system proposed in this patent.
Detailed description
With reference to Figure 1, the following describes the invention and the process of implementing this architecture through a specific example.
As stated in the summary of the invention, the architecture mainly comprises a file system service module and a deduplication service module.
The file system service module implements a file system with complete POSIX support. It follows the design strategy of a stackable file system: by mapping and rewriting calls at the file system interface layer, it exposes the full service of the underlying file system. For normal data, the module imports the data of the underlying file system into this file system by direct interface conversion, giving seamless access to normal data. For data that has undergone deduplication, the module follows the conventions of the file system described in this invention: it reads the corresponding data attribute identifier and redirects the IO flow, giving transparent, seamless access to the deduplicated data.
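The two read paths described above can be illustrated with a minimal sketch. The patent does not specify how the data attribute identifier is stored, so everything here is an assumption made for illustration: a deduplicated file is represented by a small stub that starts with a hypothetical marker `DEDUP1:` followed by the signature of a shared copy, and the names `StackedFS`, `lower_dir`, and `store_dir` are invented. A real stackable file system would intercept calls at the file system interface layer rather than expose a Python method.

```python
import os
import tempfile

# Hypothetical convention (not from the patent): a deduplicated file is a stub
# whose content is this marker followed by the signature of the shared copy.
DEDUP_MARKER = b"DEDUP1:"


class StackedFS:
    """Toy stackable layer over a directory of the underlying file system."""

    def __init__(self, lower_dir, store_dir):
        self.lower_dir = lower_dir  # files as the underlying file system holds them
        self.store_dir = store_dir  # shared copies, named by signature

    def read(self, name):
        path = os.path.join(self.lower_dir, name)
        with open(path, "rb") as f:
            head = f.read(len(DEDUP_MARKER))
            if head != DEDUP_MARKER:
                # Normal data: pass the read straight through to the lower layer.
                return head + f.read()
            signature = f.read().decode("ascii").strip()
        # Deduplicated data: redirect the IO to the shared copy.
        with open(os.path.join(self.store_dir, signature), "rb") as f:
            return f.read()


def demo():
    with tempfile.TemporaryDirectory() as root:
        lower = os.path.join(root, "lower")
        store = os.path.join(root, "store")
        os.makedirs(lower)
        os.makedirs(store)
        with open(os.path.join(lower, "plain.txt"), "wb") as f:
            f.write(b"hello")
        with open(os.path.join(store, "abc123"), "wb") as f:
            f.write(b"shared payload")
        with open(os.path.join(lower, "dup.txt"), "wb") as f:
            f.write(DEDUP_MARKER + b"abc123")
        fs = StackedFS(lower, store)
        return fs.read("plain.txt"), fs.read("dup.txt")
```

In this sketch the caller of `read` cannot tell which path was taken, which is the sense in which access to deduplicated data is transparent.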
The deduplication service module runs independently and out of band. It uses a multithreaded design that exploits the parallel computing capability of multi-core systems to provide very high-speed data deduplication. The module reads the file system log data exported by the file system service module, parses the log content, then computes data signatures, detects and deletes duplicate data, and marks the data as deduplicated. It can run at the same time as the file system service module; fine-grained locks designed into the file system service module guarantee the atomicity of data processing and provide reliable parallel data processing.
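A minimal sketch of one out-of-band pass, under stated assumptions, may make the sequence concrete. The patent gives no log format, signature algorithm, or deduplication granularity, so the following are all hypothetical: log records of the form `WRITE <name>`, SHA-256 as the signature, whole-file granularity, a stub marker `DEDUP1:`, and per-path locks standing in for the fine-grained locks of the file system service module. The sketch also assumes the log lists only files that have not yet been deduplicated.

```python
import hashlib
import os
import tempfile
import threading
from concurrent.futures import ThreadPoolExecutor

DEDUP_MARKER = b"DEDUP1:"  # hypothetical stub marker, not from the patent
_locks = {}
_locks_guard = threading.Lock()


def _lock_for(path):
    """Return the per-path lock (a stand-in for fine-grained file locks)."""
    with _locks_guard:
        return _locks.setdefault(path, threading.Lock())


def signature(path):
    """Compute a SHA-256 signature of a file while holding its lock."""
    h = hashlib.sha256()
    with _lock_for(path), open(path, "rb") as f:
        for block in iter(lambda: f.read(1 << 16), b""):
            h.update(block)
    return h.hexdigest()


def dedup_pass(log_lines, lower_dir, store_dir, workers=4):
    """Process 'WRITE <name>' records; return the names found to be duplicates."""
    names = sorted({line.split(" ", 1)[1].strip()
                    for line in log_lines if line.startswith("WRITE ")})
    paths = [os.path.join(lower_dir, n) for n in names]
    # Signatures are independent per file, so compute them in parallel.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        sigs = list(pool.map(signature, paths))
    duplicates = []
    for name, path, sig in zip(names, paths, sigs):
        shared = os.path.join(store_dir, sig)
        with _lock_for(path):
            if os.path.exists(shared):
                duplicates.append(name)   # content already stored once
            else:
                os.replace(path, shared)  # first copy becomes the shared copy
            # Mark the file as deduplicated; for a duplicate this drops its data.
            with open(path, "wb") as f:
                f.write(DEDUP_MARKER + sig.encode("ascii"))
    return duplicates


def demo():
    with tempfile.TemporaryDirectory() as root:
        lower, store = os.path.join(root, "lower"), os.path.join(root, "store")
        os.makedirs(lower)
        os.makedirs(store)
        for name, data in [("a.txt", b"same"), ("b.txt", b"same"), ("c.txt", b"other")]:
            with open(os.path.join(lower, name), "wb") as f:
                f.write(data)
        log = ["WRITE a.txt", "WRITE b.txt", "READ a.txt", "WRITE c.txt"]
        return dedup_pass(log, lower, store), len(os.listdir(store))
```

The sketch does not handle a file that changes between signature computation and replacement; a production design would need to recheck the content under the lock or rely on the locking protocol of the file system service module.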
In a typical configuration, the file system service module and the deduplication service module are installed on the host system as ordinary application software. Once the software has been configured, both modules can be started; the file system described in this invention can then be mounted on the host and used for data access. After a period of file system IO has completed, the deduplication service module automatically computes data signatures, detects and deletes duplicate data according to the configuration parameters, and marks the data as deduplicated.
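The patent does not list the configuration parameters. As a hedged illustration of how such parameters might gate a deduplication pass, the sketch below invents two of them: `settle_seconds` (how long a file must have been idle since its last write) and `min_size_bytes` (the smallest file worth processing). Both names, their defaults, and the `select_candidates` helper are assumptions, not part of the described system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DedupConfig:
    settle_seconds: float = 300.0  # hypothetical: idle time required after the last write
    min_size_bytes: int = 4096     # hypothetical: skip files smaller than this


def select_candidates(last_writes, now, cfg):
    """Pick files eligible for deduplication.

    last_writes maps a file name to (time of last write, size in bytes),
    as might be recovered from the exported file system log.
    """
    return sorted(
        name
        for name, (written_at, size) in last_writes.items()
        if now - written_at >= cfg.settle_seconds and size >= cfg.min_size_bytes
    )
```

For example, with `DedupConfig(settle_seconds=60, min_size_bytes=100)`, a file written long ago and large enough is selected, while a recently written file or a very small one is left alone until a later pass.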
At this point the whole stackable data deduplication file system is complete. It provides a high-performance deduplication service on top of an existing file system, greatly improves the space utilization of the storage system, and effectively protects the customer's investment.
Of course, the present invention may also have various other embodiments. Without departing from the spirit and essence of the invention, those of ordinary skill in the art may make corresponding changes and variations according to the invention, but all such changes and variations shall fall within the scope of protection of the claims of the invention.

Claims (1)

1. A stackable data deduplication file system, characterized by comprising:
A file system service module which, for normal data, imports the data of the underlying file system into this file system by direct interface conversion, and which, for data that has undergone deduplication, reads the corresponding data attribute identifier and redirects the IO flow, providing transparent, seamless access to the deduplicated data;
A deduplication service module, which reads the file system log data exported by the file system service module, parses the log content, then computes data signatures, detects and deletes duplicate data, and marks the data as deduplicated.
CN201310541623.5A 2013-11-05 2013-11-05 Stackable repeating data deletion file system Pending CN103617177A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201310541623.5A CN103617177A (en) 2013-11-05 2013-11-05 Stackable repeating data deletion file system
PCT/CN2014/089303 WO2015067128A1 (en) 2013-11-05 2014-10-23 Stackable data duplication file system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310541623.5A CN103617177A (en) 2013-11-05 2013-11-05 Stackable repeating data deletion file system

Publications (1)

Publication Number Publication Date
CN103617177A (en) 2014-03-05

Family

ID=50167880

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310541623.5A Pending CN103617177A (en) 2013-11-05 2013-11-05 Stackable repeating data deletion file system

Country Status (2)

Country Link
CN (1) CN103617177A (en)
WO (1) WO2015067128A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104133888A (en) * 2014-07-30 2014-11-05 宇龙计算机通信科技(深圳)有限公司 Multi-system data processing method, device and terminal
CN104391915A (en) * 2014-11-19 2015-03-04 湖南国科微电子有限公司 Duplicated data delete method
WO2015067128A1 (en) * 2013-11-05 2015-05-14 浪潮(北京)电子信息产业有限公司 Stackable data duplication file system
CN105205094A (en) * 2015-08-12 2015-12-30 浪潮(北京)电子信息产业有限公司 Multi-control share storage system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100082700A1 (en) * 2008-09-22 2010-04-01 Riverbed Technology, Inc. Storage system for data virtualization and deduplication
US20100082547A1 (en) * 2008-09-22 2010-04-01 Riverbed Technology, Inc. Log Structured Content Addressable Deduplicating Storage
CN101908073A (en) * 2010-08-13 2010-12-08 清华大学 Method for deleting duplicated data in file system in real time
CN103051671A (en) * 2012-11-22 2013-04-17 浪潮电子信息产业股份有限公司 Repeating data deletion method for cluster file system

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB0104227D0 (en) * 2001-02-21 2001-04-11 Ibm Information component based data storage and management
CN103279502B (en) * 2013-05-06 2016-01-20 北京赛思信安技术有限公司 A kind of framework and method with the data de-duplication file system be combined with parallel file system
CN103617177A (en) * 2013-11-05 2014-03-05 浪潮(北京)电子信息产业有限公司 Stackable repeating data deletion file system

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015067128A1 (en) * 2013-11-05 2015-05-14 浪潮(北京)电子信息产业有限公司 Stackable data duplication file system
CN104133888A (en) * 2014-07-30 2014-11-05 宇龙计算机通信科技(深圳)有限公司 Multi-system data processing method, device and terminal
CN104133888B (en) * 2014-07-30 2019-08-02 宇龙计算机通信科技(深圳)有限公司 A kind of multisystem data processing method, device and terminal
CN104391915A (en) * 2014-11-19 2015-03-04 湖南国科微电子有限公司 Duplicated data delete method
CN104391915B (en) * 2014-11-19 2016-02-24 湖南国科微电子股份有限公司 A kind of data heavily delete method
CN105205094A (en) * 2015-08-12 2015-12-30 浪潮(北京)电子信息产业有限公司 Multi-control share storage system

Also Published As

Publication number Publication date
WO2015067128A1 (en) 2015-05-14

Similar Documents

Publication Publication Date Title
US9852055B2 (en) Multi-level memory compression
US8417912B2 (en) Management of low-paging space conditions in an operating system
US9760500B2 (en) Caching scheme synergy for extent migration between tiers of a storage system
CN101866359B (en) Small file storage and visit method in avicade file system
CN103218224A (en) Method and terminal for improving utilization ratio of memory space
CN102158349A (en) Log management device and method thereof
CN102982182B (en) Data storage planning method and device
CN105630810B (en) A method of mass small documents are uploaded in distributed memory system
CN103051671A (en) Repeating data deletion method for cluster file system
CN103067480A (en) Synchronized method and system of network disk
CN101398823A (en) Method and system for implementing remote storage by virtual file systems technology
CN103617177A (en) Stackable repeating data deletion file system
CN103744618A (en) Method and system for achieving team shared storage
CN112328592A (en) Data storage method, electronic device and computer readable storage medium
CN103942301A (en) Distributed file system oriented to access and application of multiple data types
CN103116475A (en) Method of automatic simplifying allocation expansion
CN112783887A (en) Data processing method and device based on data warehouse
CN103294407A (en) Storage device and data read-write method
CN205263797U (en) Adopt memory of solid state hard drives SSD as L2 cache
US20140258347A1 (en) Grouping files for optimized file operations
CN107220342A (en) The control method and system of a kind of distributed data base
CN202309769U (en) Data storage system based on cloud computing
CN102495902B (en) Method and system for simultaneously realizing ETL (Extract Transform and Load) process of spatial data and attribute data
CN102663140B (en) Terabyte (TB)-level-based panoramic image data quick access method
CN105117282A (en) Input and output request splitting method and device

Legal Events

Date Code Title Description
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20140305