CN103617177A - Stackable repeating data deletion file system - Google Patents
- Publication number
- CN103617177A (application CN201310541623.5A)
- Authority
- CN
- China
- Prior art keywords
- data
- file system
- repeating
- deletion
- service module
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/174—Redundancy elimination performed by the file system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/174—Redundancy elimination performed by the file system
- G06F16/1748—De-duplication implemented within the file system, e.g. based on file segments
Abstract
The invention provides a stackable data deduplication file system comprising a file system service module and a deduplication service module. For normal data, the file system service module imports the data of the underlying file system into the stackable deduplication file system through direct interface translation; for data that has been deduplicated, it reads the corresponding data attribute identifier and redirects the I/O flow, giving transparent, seamless access to the deduplicated data. The deduplication service module reads the file system log data exported by the file system service module, parses the log content, computes data signatures, detects and deletes duplicate data, and marks the data as deduplicated. The system makes full use of the capacity of an existing storage system without a hardware upgrade, which keeps additional investment to a minimum. Because of its stackable software design, it adds deduplication on top of an existing file system, optimizes the data storage structure, and reduces the space occupied in the storage system.
Description
Technical field
The present invention relates to the field of computer storage, and in particular to a data deduplication file system implemented with stackable file system technology.
Background art
In large storage systems, the conflict between rapid data growth and the comparatively slow pace of storage device upgrades is acute. To ease the capacity growth problem, reduce the space that data occupies, lower cost, and make the most of existing resources, data deduplication has become an essential key technology in large-scale systems.
With data deduplication, users can obtain a marked data reduction, which can greatly reduce the bandwidth demand on the storage system and cut operating and maintenance costs. Data reduction greatly lowers the capacity actually needed at the back end, which in turn simplifies storage management and reduces management cost.
However, most popular deduplication schemes today target nearline and backup storage and are often tightly coupled to a backup system, so they cannot provide a general-purpose file system service. Few products offer deduplication directly in an online system, and all of them require a proprietary file system format. These proprietary file systems tend to have many limitations in performance, functionality, reliability, and scalability, which makes them difficult to apply directly in large-scale online storage systems.
Existing large storage systems are usually built on mature file systems such as ext3, ext4, xfs, and lustre, which do not themselves provide data deduplication. Adding deduplication to such a system means adopting a proprietary file system, accepting a clearly noticeable performance drop, and carrying out a large-scale data migration. The time and space costs are high, and for a storage system that already holds massive amounts of data the approach is too expensive to be feasible.
In view of this situation, the present invention designs a stackable data deduplication file system that provides deduplication on top of an existing mature file system, largely preserves the performance of the original storage system, and requires almost no data migration.
Summary of the invention
The present invention designs and implements a stackable data deduplication file system that makes full use of the capacity of an existing storage system without a hardware upgrade, keeping investment to a minimum. Through a stackable software design, it provides data deduplication on top of an existing file system, optimizes the data storage structure, and reduces the space occupied in the storage system.
The system comprises:
A file system service module, which, for normal data, imports the data of the underlying file system into this file system through direct interface translation, and, for data that has been deduplicated, reads the corresponding data attribute identifier and redirects the I/O flow, giving transparent, seamless access to the deduplicated data;
A deduplication service module, which reads the file system log data exported by the file system service module, parses the log content, computes data signatures, detects and deletes duplicate data, and marks the data as deduplicated.
The beneficial effects of the invention are as follows. The stackable file system design makes full use of the existing storage system: installing the software described in this patent is enough for an existing file system to support data deduplication and save storage space, with no data migration. The I/O performance of the original storage system is preserved, so existing equipment is fully reused and the investment is protected.
Brief description of the drawings
Figure 1 is an architecture diagram of the stackable data deduplication file system proposed in this patent.
Detailed description
With reference to Figure 1, the invention is described below through a concrete example of how this architecture is implemented.
As stated in the summary, the architecture of the invention mainly comprises a file system service module and a deduplication service module.
The file system service module implements a complete file system that supports the POSIX interface. It follows the design approach of a stackable file system: by mapping and rewriting calls at the file system interface layer, it exposes the full service of the underlying file system. For normal data, the module imports the data of the underlying file system into this file system through direct interface translation, which gives seamless access to normal data. For data that has been deduplicated, the module follows the conventions of the file system described here: it reads the corresponding data attribute identifier and redirects the I/O flow, giving transparent, seamless access to the deduplicated data.
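The two read paths described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the class `StackedFS`, its method names, the whole-file granularity, and the in-memory `redirects` map (standing in for the data attribute identifier) are all assumptions made for illustration. A directory plays the role of the underlying file system.

```python
import os


class StackedFS:
    """Toy stacking layer over a directory that acts as the underlying file system.

    Hypothetical model: the 'data attribute identifier' is represented by an
    in-memory map from a deduplicated path to the path of the retained copy.
    """

    def __init__(self, lower_root):
        self.lower_root = lower_root
        self.redirects = {}  # deduplicated path -> path holding the retained data

    def _lower(self, path):
        # Translate a path in the stacked view to a path in the underlying file system.
        return os.path.join(self.lower_root, path)

    def write(self, path, data):
        # Normal data: pass the call straight through to the underlying file system.
        with open(self._lower(path), "wb") as f:
            f.write(data)
        # A rewritten file is normal data again, so drop any stale redirect.
        self.redirects.pop(path, None)

    def mark_deduplicated(self, path, target):
        # Record the identifier and release the duplicate's space in the lower layer.
        # Assumes 'target' is itself a normal (non-redirected) file.
        self.redirects[path] = target
        open(self._lower(path), "wb").close()

    def read(self, path):
        # Deduplicated data: redirect the I/O to the retained copy.
        # Normal data: no entry in the map, so the read passes through unchanged.
        target = self.redirects.get(path, path)
        with open(self._lower(target), "rb") as f:
            return f.read()
```

A caller reads `b.txt` the same way before and after `mark_deduplicated("b.txt", "a.txt")`; only the layer knows the bytes now come from `a.txt`. The sketch deliberately leaves out what a real layer would need, such as copy-on-write when the retained copy is modified, persistence of the identifiers, and sub-file granularity.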
The deduplication service module runs independently, out of band. It uses a multi-threaded design to exploit the parallel computing capability of multi-core systems and provide high-speed deduplication. The module reads the file system log data exported by the file system service module, parses the log content, computes data signatures, detects and deletes duplicate data, and marks the data as deduplicated. It can run at the same time as the file system service module; fine-grained locks designed into the file system service module guarantee the atomicity of data processing and make parallel data processing reliable.
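The log-driven pass described above might look roughly like the following Python sketch. Several things here are assumptions rather than details from the patent: the log format (`WRITE <path>` lines), SHA-256 as the data signature, whole-file granularity, and the per-path locks that stand in for the fine-grained locking. The function names are hypothetical.

```python
import hashlib
import os
import threading
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-path locks standing in for the fine-grained locks in the text.
_locks = defaultdict(threading.Lock)
_locks_guard = threading.Lock()


def lock_for(path):
    # Guard the lock table itself so two threads never create two locks for one path.
    with _locks_guard:
        return _locks[path]


def parse_log(lines):
    """Return the written paths in first-seen order (assumed format: 'WRITE <path>')."""
    seen = {}
    for line in lines:
        op, _, path = line.strip().partition(" ")
        if op == "WRITE" and path:
            seen.setdefault(path, None)
    return list(seen)


def signature(root, path):
    """Compute the SHA-256 signature of one file while holding that file's lock."""
    digest = hashlib.sha256()
    with lock_for(path):
        with open(os.path.join(root, path), "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                digest.update(chunk)
    return path, digest.hexdigest()


def find_duplicates(root, log_lines, workers=4):
    """Map each duplicate path to the first path seen with the same signature."""
    paths = parse_log(log_lines)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order, so the outcome is deterministic.
        signatures = list(pool.map(lambda p: signature(root, p), paths))
    first_with_signature = {}
    redirects = {}
    for path, sig in signatures:
        if sig in first_with_signature:
            redirects[path] = first_with_signature[sig]
        else:
            first_with_signature[sig] = path
    return redirects
```

The returned map is what the service would use to delete the duplicate data and mark each file as deduplicated. A production service would likely also compare bytes before deleting, since a matching signature alone is only strong evidence of identical content.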
In a typical configuration, the file system service module and the deduplication service module are installed on the host system as ordinary application software. Once the software has been configured, both modules can be started; the file system described in this invention can then be mounted on the host and used for data access. After a period of file system I/O has completed, the deduplication service module automatically computes data signatures, detects and deletes duplicate data according to its configuration parameters, and marks the data as deduplicated.
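The patent does not name the configuration parameters. As a purely hypothetical illustration of how such parameters could gate a deduplication pass, the Python sketch below assumes two invented settings, `quiet_seconds` (how long a file must have been idle) and `min_size_bytes` (files below this size are skipped), and an invented `LogEntry` record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DedupConfig:
    # Both parameters are hypothetical; the patent only says "configuration parameters".
    quiet_seconds: float = 300.0  # minimum idle time since the last write
    min_size_bytes: int = 4096    # ignore files smaller than this


@dataclass(frozen=True)
class LogEntry:
    path: str
    size: int
    last_write: float  # seconds since the epoch


def eligible(entries, now, cfg):
    """Return the paths that are idle long enough and large enough to deduplicate."""
    return [
        e.path
        for e in entries
        if now - e.last_write >= cfg.quiet_seconds and e.size >= cfg.min_size_bytes
    ]
```

With the defaults, a 1 MiB file last written ten minutes ago would be selected, while a file written a few seconds ago or a 100-byte file would be left alone until a later pass.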
At this point the whole stackable data deduplication file system is fully implemented. It provides a high-performance deduplication service on top of an existing file system, greatly improves the space utilization of the storage system, and effectively protects the customer's investment.
Of course, the invention may also have various other embodiments. Without departing from the spirit and essence of the invention, those of ordinary skill in the art may make corresponding changes and variations according to the invention, but all such changes and variations shall fall within the scope of protection of the claims of the invention.
Claims (1)
1. A stackable data deduplication file system, characterized in that it comprises:
A file system service module, which, for normal data, imports the data of the underlying file system into this file system through direct interface translation, and, for data that has been deduplicated, reads the corresponding data attribute identifier and redirects the I/O flow, giving transparent, seamless access to the deduplicated data;
A deduplication service module, which reads the file system log data exported by the file system service module, parses the log content, computes data signatures, detects and deletes duplicate data, and marks the data as deduplicated.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310541623.5A CN103617177A (en) | 2013-11-05 | 2013-11-05 | Stackable repeating data deletion file system |
PCT/CN2014/089303 WO2015067128A1 (en) | 2013-11-05 | 2014-10-23 | Stackable data duplication file system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310541623.5A CN103617177A (en) | 2013-11-05 | 2013-11-05 | Stackable repeating data deletion file system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN103617177A true CN103617177A (en) | 2014-03-05 |
Family
ID=50167880
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310541623.5A Pending CN103617177A (en) | 2013-11-05 | 2013-11-05 | Stackable repeating data deletion file system |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN103617177A (en) |
WO (1) | WO2015067128A1 (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100082700A1 (en) * | 2008-09-22 | 2010-04-01 | Riverbed Technology, Inc. | Storage system for data virtualization and deduplication |
US20100082547A1 (en) * | 2008-09-22 | 2010-04-01 | Riverbed Technology, Inc. | Log Structured Content Addressable Deduplicating Storage |
CN101908073A (en) * | 2010-08-13 | 2010-12-08 | 清华大学 | Method for deleting duplicated data in file system in real time |
CN103051671A (en) * | 2012-11-22 | 2013-04-17 | 浪潮电子信息产业股份有限公司 | Repeating data deletion method for cluster file system |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB0104227D0 (en) * | 2001-02-21 | 2001-04-11 | Ibm | Information component based data storage and management |
CN103279502B (en) * | 2013-05-06 | 2016-01-20 | 北京赛思信安技术有限公司 | A kind of framework and method with the data de-duplication file system be combined with parallel file system |
CN103617177A (en) * | 2013-11-05 | 2014-03-05 | 浪潮(北京)电子信息产业有限公司 | Stackable repeating data deletion file system |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2015067128A1 (en) * | 2013-11-05 | 2015-05-14 | 浪潮(北京)电子信息产业有限公司 | Stackable data duplication file system |
CN104133888A (en) * | 2014-07-30 | 2014-11-05 | 宇龙计算机通信科技(深圳)有限公司 | Multi-system data processing method, device and terminal |
CN104133888B (en) * | 2014-07-30 | 2019-08-02 | 宇龙计算机通信科技(深圳)有限公司 | A kind of multisystem data processing method, device and terminal |
CN104391915A (en) * | 2014-11-19 | 2015-03-04 | 湖南国科微电子有限公司 | Duplicated data delete method |
CN104391915B (en) * | 2014-11-19 | 2016-02-24 | 湖南国科微电子股份有限公司 | A kind of data heavily delete method |
CN105205094A (en) * | 2015-08-12 | 2015-12-30 | 浪潮(北京)电子信息产业有限公司 | Multi-control share storage system |
Also Published As
Publication number | Publication date |
---|---|
WO2015067128A1 (en) | 2015-05-14 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | ||
Application publication date: 20140305 |