CN106293537B - Lightweight autonomous block management method for a data-intensive file system - Google Patents

Lightweight autonomous block management method for a data-intensive file system

Info

Publication number
CN106293537B
CN106293537B
Authority
CN
China
Prior art keywords
data
memory node
block
data memory
data block
Prior art date
Legal status
Active
Application number
CN201610665489.3A
Other languages
Chinese (zh)
Other versions
CN106293537A (en)
Inventor
陈付梅
韩德志
毕坤
王军
Current Assignee
Shanghai Maritime University
Original Assignee
Shanghai Maritime University
Priority date
Filing date
Publication date
Application filed by Shanghai Maritime University
Priority to CN201610665489.3A
Publication of CN106293537A
Application granted
Publication of CN106293537B
Legal status: Active


Classifications

    • G - Physics
    • G06 - Computing; Calculating or Counting
    • G06F - Electric Digital Data Processing
    • G06F 3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 - Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 - Interfaces specially adapted for storage systems
    • G06F 3/0602 - Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F 3/0604 - Improving or facilitating administration, e.g. storage management
    • G06F 3/0628 - Interfaces specially adapted for storage systems making use of a particular technique
    • G06F 3/0638 - Organizing or formatting or addressing of data
    • G06F 3/064 - Management of blocks
    • G06F 3/0643 - Management of files

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a lightweight autonomous block management method for data-intensive file systems. Intersected Shifted Declustering (ISD) is used to realize the mapping of data blocks to data storage nodes, the fast lookup of data blocks on the data storage nodes, the fast recovery of data blocks when a data storage node fails, and the fast redistribution of data blocks when a new data storage node is added. The master node of the data-intensive file system is thus responsible only for storing and maintaining the file namespace, while the storage and maintenance of the block-to-node mapping information, the replacement of data blocks when a data storage node fails, and the redistribution of data blocks when a new data storage node is added are all completed autonomously by the data storage nodes. The invention saves memory on the master node of the data-intensive file system, improves the processing capacity of the master node, and can substantially increase the block management efficiency of data-intensive file systems in big-data environments.

Description

Lightweight autonomous block management method for a data-intensive file system
Technical field
The present invention relates to computer storage and security techniques, and more particularly to a lightweight autonomous block management method for data-intensive file systems.
Background art
Data-intensive file systems (DiFS), such as the Google File System (GFS) and the Hadoop Distributed File System (HDFS), have become the main file systems for big-data storage management. Current data-intensive file systems use a master-slave architecture: the master node (metadata server) manages all metadata, while the slave nodes (data storage nodes) are responsible only for data storage. To maintain high availability, these storage systems usually split data files into fixed-size blocks; each data block typically has three replicas, which are assigned to data storage nodes in different clusters. The master node must record the addresses of hundreds or thousands of data storage nodes and the mapping of every data block of every file to those storage nodes. Moreover, the master node must periodically check for changes in the address mapping information of all data blocks. As the data volume keeps growing, this metadata occupies a large amount of the master node's memory, degrades its processing capacity, and severely limits its scalability.
To solve these problems of data-intensive file systems, the allocation and maintenance of the physical blocks of data files are separated from metadata management, and the maintenance of the block-to-node mapping information is carried out by each data storage node. With this approach the master node no longer needs to keep a large amount of data block metadata or block-to-node mapping information; instead, a group of invertible mapping functions between data blocks and data storage nodes, and between data storage nodes and data blocks, completes the mapping.
A data-intensive file system manages massive amounts of data with the following characteristics: 1) the data volume is large and grows quickly; 2) high data storage performance is required; 3) high reliability and recoverability are required: when data is lost or a data storage node fails, the data must be recovered quickly without affecting normal operation; 4) the storage location of a data block must be found quickly; 5) the memory footprint on the master node and the impact on its processing capacity must be as small as possible.
The analysis above shows that the management methods of traditional file systems do not suit data-intensive file systems, for the following main reasons: 1) as the data volume keeps growing, the file block address table occupies a large amount of storage space; 2) the master node is responsible for maintaining the file block address table, and as the table keeps growing the processing capacity of the master node drops sharply; 3) the continuous growth of data not only occupies a large amount of the master node's storage but also increases metadata maintenance costs such as addressing, while reducing the scalability of the master node; 4) every data storage node must first consult the master node for storage and query operations, which increases addressing time.
Summary of the invention
To meet the block storage and lookup management requirements of data-intensive file systems, the present invention provides a lightweight autonomous block management method in which the allocation and lookup of physical data blocks and the maintenance of the related metadata are separated from traditional metadata management and carried out by each data storage node, reducing the storage overhead and burden on the master node. The invention improves the scalability of data-intensive file systems in big-data environments, reduces block addressing time, and can greatly improve the performance of the whole system.
The technical principle of the invention is as follows: the invention realizes the autonomous management of data blocks through an Intersected Shifted Declustering (ISD) method, i.e., a group of reversible mathematical functions realizes the mapping from data blocks to data storage nodes and from data storage nodes to data blocks, completing the distributed storage and fast lookup of data blocks.
The invention specifically comprises the following operations:
Operation 1: data block storage;
Operation 2: data block lookup;
Operation 3: failure handling for a failed data storage node;
Operation 4: addition of a new data storage node.
(1) The data block storage operation comprises the following steps:
Step 1.1: the master node selects the logical group (LG) for the data block by a reversible linear hash function;
Step 1.2: the master node selects, by a reversible shift-partition (shifted declustering) function, the data storage nodes in the logical group that will store the block data;
Step 1.3: the data storage nodes store the data of the block and the block address mapping information.
(2) The data block lookup operation comprises the following steps:
Step 2.1: the data storage node holding data block b computes, from its own index number and with the reverse invertible function, the new ID of the logical group containing block b;
Step 2.2: the data storage node holding block b computes, from that logical group ID and with the reverse invertible function, the physical ID of block b, providing the basis for the file system to recover the complete data file;
Step 2.3: the data storage node obtains, from the physical ID of the block, the mapping information of the block on the storage node;
Step 2.4: the data storage node returns the data of block b to the file system according to the mapping information of block b.
(3) The failure handling operation for a failed data storage node comprises the following steps:
Step 3.1: determine the logical groups that contain the failed data storage node;
Step 3.2: in each of those logical groups, select the least-loaded data storage node other than the failed node as the backup node;
Step 3.3: using the intelligent regrouping mapping method, the multiple backup data storage nodes copy in parallel, each within its own logical group, the data of the failed data storage node contained in that group.
(4) The operation of adding a new data storage node comprises the following steps:
Step 4.1: compute the average load COV_ave of the data storage nodes over all logical groups in the whole system;
Step 4.2: select a logical group and compute the maximum load COV_max over all data storage nodes in that group;
Step 4.3: compare COV_max with COV_ave; if COV_max ≥ COV_ave, replace the most heavily loaded data storage node in that logical group with the newly added data storage node; otherwise, choose the next logical group and repeat steps 4.1, 4.2 and 4.3, until the load of the newly added data storage node reaches or approaches the average load of the data storage nodes in the system.
The advantages of this autonomous block management method for data-intensive file systems are:
(1) The storage overhead of the master node is greatly reduced. The block-to-storage-node mapping information is separated from the traditional metadata and is stored and managed autonomously by each data storage node, so the master node no longer needs to keep and maintain a large amount of block address information; the metadata kept by the master node is reduced by more than 90% compared with traditional file systems.
(2) The processing capacity of the master node is greatly improved. The mapping information between data blocks and data storage nodes is stored and maintained autonomously by each data storage node, which removes this burden from the master node. Compared with the distributed file system HDFS, this method can improve the processing performance of the master node by more than 30%.
(3) The recoverability and scalability of the system are improved. By using the intelligent regrouping mapping method when a data storage node fails and the decoupled address mapping method when a new data storage node is added, only a small number of data blocks need to be migrated to recover the data of a failed data node or to populate a newly added data node, which greatly improves the recoverability and scalability of the system.
Brief description of the drawings
Fig. 1 is a flow chart of the specific operations of the invention;
Fig. 2 is a schematic diagram of the division of management functions between the master node and the data storage nodes in the invention;
Fig. 3 is an example of the mapping of consecutive blocks to data nodes and of the lookup of blocks from a data node;
Fig. 4 is an example of the recovery process after a data node failure;
Fig. 5 is an example of the process of adding a new data node.
Detailed description of the embodiments
In order to make the technical means, creative features, objectives and effects of the present invention easy to understand, the lightweight autonomous block management method for data-intensive file systems proposed by the present invention is further explained below with reference to the drawings and specific embodiments.
In the lightweight autonomous block management method for data-intensive file systems, a group of reversible mathematical functions realizes the mapping from data blocks to data nodes and from data nodes to data blocks. As shown in Fig. 2, the management functions of the nodes are divided as follows: the master node is responsible only for maintaining the system namespace, allocating data blocks to data storage nodes, and managing the data storage nodes; each data storage node is responsible for the consistency checking of data blocks, data block recovery, and the storage and maintenance of the data storage node's mapping information.
As shown in Fig. 1, the autonomous block management method of the invention specifically comprises the following operations:
Operation 1: data block storage;
Operation 2: data block lookup;
Operation 3: failure handling for a failed data storage node;
Operation 4: addition of a new data storage node.
(1) The data block storage operation comprises the following steps:
Step 1.1: the master node selects the logical group (LG) for the data block by the reversible linear hash function;
Step 1.2: the master node selects, by the reversible shift-partition function, the data storage nodes in the logical group that will store the block data;
Step 1.3: the data storage nodes store the data of the block and the block address mapping information.
In step 1.1 of the data block storage operation, the logical group (LG) for the data block is selected by the reversible linear hash function of formula (1), where g is the logical group ID to be mapped to, x is the current number of logical groups in the system, X is the number of logical groups when the system starts, b is the block ID of the data block to be stored within its file, and s is the number of newly added logical groups.
In step 1.2 of the data block storage operation, the process of selecting the data storage nodes in the logical group by the reversible shift-partition function comprises:
a) computing, by formula (2), the new block identifier of data block b after it is mapped into logical group g, where a is the new identifier of the data block within logical group g, x is the current number of logical groups, X is the initial number of logical groups, b is the given block ID, and s is the number of newly added logical groups;
b) computing the index of the data storage node in logical group g to which data block b is mapped:
d = node(a, i) = (a + i) % 4    (3)
where a is the new identifier of data block b within logical group g, i is the replica number of block b (0, 1 or 2), and d is the index (0, 1, 2 or 3) of the data storage node selected in the logical group for block b.
The replica number reflects the fact that the data-intensive file system keeps three replicas of each data block, numbered 0, 1 and 2, to fully guarantee its availability; the index of a data storage node is its number among all data storage nodes of a logical group, and in the present invention each logical group contains 4 data storage nodes, with indices 0, 1, 2 and 3.
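As an illustration only, the following Python sketch implements the block placement of steps 1.1 and 1.2 under stated assumptions: select_group and the in-group identifier a are hypothetical stand-ins for formulas (1) and (2), which are not reproduced in this text, while node_index follows formula (3) with 4 data storage nodes per logical group and 3 replicas per block.

```python
# Illustrative sketch only: select_group() and the in-group identifier are
# assumptions standing in for formulas (1) and (2); node_index() follows
# formula (3) of the description.

NODES_PER_GROUP = 4   # each logical group holds 4 data storage nodes
REPLICAS = 3          # each data block keeps 3 replicas, numbered 0, 1, 2

def select_group(block_id: int, num_groups: int) -> int:
    """Step 1.1 (assumed): map a block ID to a logical group with a simple
    reversible modulo hash; the patent's formula (1) is not shown."""
    return block_id % num_groups

def node_index(a: int, replica: int) -> int:
    """Step 1.2 b), formula (3): d = node(a, i) = (a + i) % 4."""
    return (a + replica) % NODES_PER_GROUP

def place_block(block_id: int, num_groups: int):
    """Return (logical group, storage node index) for every replica of a block."""
    g = select_group(block_id, num_groups)
    a = block_id // num_groups   # assumed stand-in for formula (2)
    return [(g, node_index(a, i)) for i in range(REPLICAS)]

# Example: place block 42 in a system that currently has 10 logical groups.
print(place_block(42, 10))   # [(2, 0), (2, 1), (2, 2)]
```

Because d = (a + i) % 4 assigns the three replicas of one block to three different indices, the replicas of a block never land on the same data storage node within a logical group.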
(2) The data block lookup operation comprises the following steps:
Step 2.1: the data storage node holding data block b computes, from its own index number and with the reverse invertible function, the new ID of the logical group containing block b;
Step 2.2: the data storage node holding block b computes, from that logical group ID and with the reverse invertible function, the physical ID of block b, providing the basis for the file system to recover the complete data file;
Step 2.3: the data storage node obtains, from the physical ID of the block, the mapping information of the block on the storage node;
Step 2.4: the data storage node sends the obtained data of block b to the file system according to the mapping information of block b.
In step 2.1 of the data block lookup operation, the new ID of the logical group containing data block b is computed from the index number d of the data storage node by the reverse invertible function:
d = (a + i) % 4  →  (invertible)  →  a = 4j + (d - i) % 4    (4)
where i denotes the replica number of the data block and is iterated over 0, 1 and 2, and j can take the values 0, 1, 2, ..., n.
In step 2.2 of the data block lookup operation, the physical ID of data block b is computed by the reverse invertible function of formula (5), where g is the index of the logical group containing the given data storage node.
Fig. 3(a) shows consecutive data blocks being mapped to the logical groups by linear hashing and then distributed, by shift partitioning, over the data storage nodes within each logical group; Fig. 3(b) demonstrates, taking data node 2 as an example, the reverse process of looking up data blocks through the invertible functions.
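A minimal sketch of the reverse step of formula (4) follows. It only enumerates the candidate in-group block identifiers a = 4j + (d - i) % 4 held by a storage node of index d; the further inverses that yield the logical group ID and the physical block ID correspond to formulas (1), (2) and (5), which are not reproduced in this text, so they are omitted, and the bound max_j is an assumption for illustration.

```python
# Sketch of formula (4): from a storage node's index d, enumerate the
# candidate (replica number, in-group block identifier) pairs that the
# node can hold, i.e. all a with d = (a + i) % 4.

NODES_PER_GROUP = 4
REPLICAS = 3

def candidate_block_ids(d: int, max_j: int):
    """Yield (replica i, in-group id a) pairs with a = 4*j + (d - i) % 4."""
    for j in range(max_j + 1):
        for i in range(REPLICAS):
            yield i, NODES_PER_GROUP * j + (d - i) % NODES_PER_GROUP

# Example: the first candidates held by the node with index 2 (cf. Fig. 3(b)).
for i, a in candidate_block_ids(d=2, max_j=1):
    print(f"replica {i} of in-group block {a}")
```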
(3) The failure handling operation for a failed data storage node comprises the following steps:
Step 3.1: determine the logical groups that contain the failed data storage node;
Step 3.2: in each of those logical groups, select the least-loaded data storage node other than the failed node as the backup node;
Step 3.3: using the intelligent regrouping mapping method, the multiple backup data storage nodes copy in parallel, each within its own logical group, the data of the failed data storage node contained in that group.
In the intelligent regrouping mapping method, the number of backup data storage nodes chosen is equal to the number of logical groups that contain the failed data storage node: a failed data storage node may be contained in several logical groups, and each backup data storage node is responsible only for copying the part of the failed node's data that belongs to its own logical group.
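The following Python sketch illustrates this per-group recovery plan under stated assumptions: the structures groups (logical group to member nodes), load (node to current load) and blocks_in_group (the failed node's blocks, grouped by logical group) are hypothetical and serve only to show the selection of one least-loaded backup node per affected group and the parallel copy assignment.

```python
# Sketch of the intelligent regrouping mapping method (steps 3.1 - 3.3).
# groups, load and blocks_in_group are hypothetical structures used only
# to illustrate per-group backup selection and the parallel copy plan.

def plan_recovery(failed_node, groups, load, blocks_in_group):
    """For every logical group containing the failed node, pick its
    least-loaded surviving member as backup; each backup copies only the
    failed node's blocks that belong to its own group."""
    plan = {}
    for gid, members in groups.items():
        if failed_node not in members:
            continue                                  # group not affected
        survivors = [n for n in members if n != failed_node]
        backup = min(survivors, key=lambda n: load[n])
        plan[gid] = (backup, blocks_in_group[gid])    # copies run in parallel
    return plan

# Example with two logical groups that both contain the failed node n2.
groups = {"LG0": ["n0", "n1", "n2", "n3"], "LG1": ["n2", "n4", "n5", "n6"]}
load = {"n0": 5, "n1": 2, "n3": 4, "n4": 1, "n5": 3, "n6": 2}
blocks = {"LG0": ["b7", "b11"], "LG1": ["b3"]}
print(plan_recovery("n2", groups, load, blocks))
# {'LG0': ('n1', ['b7', 'b11']), 'LG1': ('n4', ['b3'])}
```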
Fig. 4 demonstrates, taking the failure of data node 2 as an example, how each logical group replaces and recovers data node 2.
(4) The operation of adding a new data storage node mainly uses the decoupled address mapping method and comprises the following steps:
Step 4.1: compute the average load COV_ave of the data storage nodes over all logical groups in the whole system;
Step 4.2: select a logical group and compute the maximum load COV_max over all data storage nodes in that group;
Step 4.3: compare COV_max with COV_ave; if COV_max ≥ COV_ave, replace the most heavily loaded data storage node in that logical group with the newly added data storage node; otherwise, choose the next logical group and repeat steps 4.1, 4.2 and 4.3, until the load of the newly added data storage node reaches or approaches the average load of the data storage nodes in the system.
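A minimal sketch of the node-addition loop of steps 4.1 to 4.3 is given below, under the assumption that per-node loads are available as a hypothetical group-to-node-to-load mapping; the new node takes the place of the most heavily loaded node of each group whose maximum load is at least the system-wide average, and the loop stops once the new node's accumulated load reaches that average.

```python
# Sketch of the decoupled address mapping loop (steps 4.1 - 4.3).
# group_loads is a hypothetical {group: {node: load}} structure.

def add_new_node(new_node, group_loads):
    all_loads = [l for nodes in group_loads.values() for l in nodes.values()]
    cov_ave = sum(all_loads) / len(all_loads)        # step 4.1: COV_ave
    new_load = 0.0
    replaced = []
    for gid, nodes in group_loads.items():
        if new_load >= cov_ave:                      # new node is balanced: stop
            break
        heaviest = max(nodes, key=nodes.get)         # step 4.2: COV_max of the group
        cov_max = nodes[heaviest]
        if cov_max >= cov_ave:                       # step 4.3: take over this node
            new_load += nodes.pop(heaviest)
            nodes[new_node] = cov_max
            replaced.append((gid, heaviest))
    return replaced, new_load, cov_ave

# Example with three logical groups of four data storage nodes each.
group_loads = {
    "LG0": {"n0": 9, "n1": 2, "n2": 3, "n3": 2},
    "LG1": {"n4": 2, "n5": 2, "n6": 2, "n7": 2},
    "LG2": {"n8": 8, "n9": 3, "n10": 2, "n11": 3},
}
print(add_new_node("n_new", group_loads))
```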
Fig. 5 demonstrates the data block migration process of the whole system when the newly added data node node_128 forms the new logical group LG_1000.
As can be seen from Fig. 4 and Fig. 5, by means of the invertible functions together with the intelligent regrouping mapping method and the decoupled address mapping method, only very few data blocks are migrated when a data node fails or a new data node is added, which fully guarantees the stability of the system and its availability to users.
The method is illustrated below with an example.
HDFS is selected as the data-intensive file system, and a big-data environment with 10,000 data nodes and 1,000,000 data blocks is simulated. The memory usage of the master node with and without the lightweight autonomous block management method is shown in Table 1, and the CPU usage of the master node is shown in Table 2. The 1,000,000 data blocks are distributed roughly evenly over the 10,000 data nodes, and each data block is 64 MB.
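For scale, a quick back-of-the-envelope computation of the simulated data volume follows; the three-replica factor is taken from the general description above and is an assumption about how the simulated blocks are counted.

```python
# Rough scale of the simulated environment used for Tables 1 and 2.
blocks = 1_000_000
block_mb = 64
nodes = 10_000

raw_tb = blocks * block_mb / 1_000_000           # 64 TB of unique data
per_node_blocks = blocks / nodes                 # ~100 blocks per data node
per_node_gb = per_node_blocks * block_mb / 1000  # ~6.4 GB per data node
replicated_tb = raw_tb * 3                       # ~192 TB if 3 replicas are kept (assumed)
print(raw_tb, per_node_blocks, per_node_gb, replicated_tb)
```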
Table 1. Master node memory usage for data block management
Number of data nodes:                    1000   2000   5000   7000   9000   10000
Memory used with optimization (MB):        15     20     27     36     42      50
Memory used without optimization (MB):    180    186    189    192    194     196
Table 2. Master node CPU usage for data block management
Number of data nodes:                     500   2000   3000   4000   5000
CPU usage with optimization (%):          1.4    2.3    2.5    3.1    4.2
CPU usage without optimization (%):       6.3   12.1   16.6   19.8   23.2
Tables 1 and 2 show that, after the lightweight autonomous block management method for data-intensive file systems is adopted, the memory usage and CPU usage of the master node are substantially lower than without it.
Although the content of the present invention has been described in detail through the preferred embodiments above, it should be understood that the above description is not to be regarded as a limitation of the present invention. Various modifications and substitutions of the present invention will be apparent to those skilled in the art after reading the above content. Therefore, the protection scope of the present invention shall be defined by the appended claims.

Claims (6)

1. A lightweight autonomous block management method for a data-intensive file system, characterized in that the data-intensive file system realizes the autonomous management of data blocks by an intersected shifted declustering method, i.e., a group of reversible mathematical functions is used to realize the mapping from data blocks to data storage nodes and from data storage nodes to data blocks, completing the distributed storage and lookup of data blocks;
the autonomous block management method realizes a data block storage operation comprising the following steps:
step 1.1: the master node selects the logical group for the data block by a reversible linear hash function;
step 1.2: the master node selects the data storage nodes in the logical group by a reversible shift-partition function;
step 1.3: the selected data storage nodes store the data of the data block and the block address mapping information;
the autonomous block management method realizes a failure handling operation for a failed data storage node comprising the following steps:
step 3.1: determining the logical groups that contain the failed data storage node;
step 3.2: in each of those logical groups, selecting the least-loaded data storage node other than the failed node as the backup node;
step 3.3: using the intelligent regrouping mapping method, the multiple backup data storage nodes copy in parallel, each within its own logical group, the data of the failed data storage node contained in that group;
wherein in the intelligent regrouping mapping method the number of backup data storage nodes selected is equal to the number of logical groups that contain the failed data storage node, a failed data storage node being contained in a plurality of logical groups, and each backup data storage node copies only the part of the failed node's data that belongs to its own logical group.
2. The lightweight autonomous block management method for a data-intensive file system according to claim 1, characterized in that
the autonomous block management method realizes a data block lookup operation comprising the following steps:
step 2.1: the data storage node holding the data block computes, from its own index number and with the reverse invertible function, the new ID of the logical group containing the block;
step 2.2: the data storage node holding the data block computes, from the new logical group ID and with the reverse invertible function, the physical ID of the block;
step 2.3: the data storage node obtains, from the physical ID of the block, the mapping information of the block on the storage node;
step 2.4: the data storage node sends the obtained data of the block to the data-intensive file system according to the mapping information of the block.
3. The lightweight autonomous block management method for a data-intensive file system according to claim 1 or 2, characterized in that
the autonomous block management method realizes an operation of adding a new data storage node, which uses the decoupled address mapping method and comprises the following steps:
step 4.1: computing the average load COV_ave of the data storage nodes over all logical groups in the whole system;
step 4.2: selecting any one logical group and computing the maximum load COV_max over all data storage nodes in that logical group;
step 4.3: comparing COV_max with COV_ave; if COV_max ≥ COV_ave, replacing the most heavily loaded data storage node in the logical group with the newly added data storage node; otherwise, choosing the next logical group and repeating steps 4.1, 4.2 and 4.3, until the load of the newly added data storage node reaches or approaches the average load of the data storage nodes in the system.
4. The lightweight autonomous block management method for a data-intensive file system according to claim 1, characterized in that
in step 1.1, the logical group for the data block is selected by the reversible linear hash function of formula (1), where g is the logical group ID, x is the current number of logical groups in the system, X is the initial number of logical groups in the system, b is the block ID of the data block to be stored within its file, and s is the number of newly added logical groups;
in step 1.2, the process of selecting the data storage nodes in the logical group by the reversible shift-partition function comprises:
a) computing, by formula (2), the new block identifier of data block b after it is mapped into logical group g, where a is the new identifier of data block b within logical group g;
b) computing the index of the data storage node in logical group g to which data block b is mapped:
d = node(a, i) = (a + i) % 4    (3)
where i is the replica number of data block b and d is the index of the data storage node selected in the logical group for data block b.
5. The lightweight autonomous block management method for a data-intensive file system according to claim 4, characterized in that
in step 2.1, the new ID of the logical group containing data block b is computed with the reverse invertible function
a = 4j + (d - i) % 4    (4)
obtained by inverting the formula d = (a + i) % 4,
where the replica number i of the data block is iterated over 0, 1 and 2, and j takes zero or a positive integer;
in step 2.2, the physical ID of data block b is computed with the reverse invertible function of formula (5).
6. The lightweight autonomous block management method for a data-intensive file system according to claim 1, characterized in that
in the data-intensive file system, the master node maintains the system namespace, allocates data blocks to data storage nodes, and manages the data storage nodes;
and each data storage node is responsible for the consistency checking of data blocks, data block recovery, and the storage and maintenance of the data storage node mapping information.
CN201610665489.3A 2016-08-12 2016-08-12 Lightweight autonomous block management method for a data-intensive file system Active CN106293537B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610665489.3A CN106293537B (en) 2016-08-12 2016-08-12 Lightweight autonomous block management method for a data-intensive file system


Publications (2)

Publication Number Publication Date
CN106293537A CN106293537A (en) 2017-01-04
CN106293537B true CN106293537B (en) 2019-11-12

Family

ID=57670722

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610665489.3A Active CN106293537B (en) 2016-08-12 2016-08-12 Lightweight autonomous block management method for a data-intensive file system

Country Status (1)

Country Link
CN (1) CN106293537B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109783398B (en) * 2019-01-18 2020-09-15 上海海事大学 Performance optimization method for FTL (fiber to the Home) solid state disk based on relevant perception page level
CN114844911B (en) * 2022-04-20 2024-07-09 网易(杭州)网络有限公司 Data storage method, device, electronic equipment and computer readable storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101488104A (en) * 2009-02-26 2009-07-22 北京世纪互联宽带数据中心有限公司 System and method for implementing high-efficiency security memory
CN104052576A (en) * 2014-06-07 2014-09-17 华中科技大学 Data recovery method based on error correcting codes in cloud storage
CN104077423A (en) * 2014-07-23 2014-10-01 山东大学(威海) Consistent hash based structural data storage, inquiry and migration method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012147087A1 (en) * 2011-04-29 2012-11-01 Tata Consultancy Services Limited Archival storage and retrieval system


Also Published As

Publication number Publication date
CN106293537A (en) 2017-01-04

Similar Documents

Publication Publication Date Title
US10761758B2 (en) Data aware deduplication object storage (DADOS)
CN106066896B (en) Application-aware big data deduplication storage system and method
Chen et al. The dynamic cuckoo filter
CN106776967B (en) Method and device for storing massive small files in real time based on time sequence aggregation algorithm
TWI472935B (en) Scalable segment-based data de-duplication system and method for incremental backups
CN110058822B (en) Transverse expansion method for disk array
CN102855294B (en) Intelligent hash data layout method, cluster storage system and method thereof
CN105069111B (en) Block level data duplicate removal method based on similitude in cloud storage
US8965856B2 (en) Increase in deduplication efficiency for hierarchical storage system
CN104077423A (en) Consistent hash based structural data storage, inquiry and migration method
CN105683898A (en) Set-associative hash table organization for efficient storage and retrieval of data in a storage system
CN101079034A (en) System and method for eliminating redundancy file of file storage system
JP2013514560A (en) Storage system
CN107667363A (en) Object-based storage cluster with plurality of optional data processing policy
CN103929454A (en) Load balancing storage method and system in cloud computing platform
CN103139300A (en) Virtual machine image management optimization method based on data de-duplication
CN105117351A (en) Method and apparatus for writing data into cache
US11755557B2 (en) Flat object storage namespace in an object storage system
CN103902735A (en) Application perception data routing method oriented to large-scale cluster deduplication and system
US20180373456A1 (en) Metadata Load Distribution Management
CN105354250A (en) Data storage method and device for cloud storage
CN108073472B (en) Memory erasure code distribution method based on heat perception
US20200341639A1 (en) Lattice layout of replicated data across different failure domains
CN108415671A (en) A kind of data de-duplication method and system of Oriented Green cloud computing
CN106293537B (en) A kind of autonomous block management method of the data-intensive file system of lightweight

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant