CN102819535A - Data block migration - Google Patents


Publication number
CN102819535A
Authority
CN
China
Prior art keywords
data
node
key
file
new
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2011102868481A
Other languages
Chinese (zh)
Inventor
V·贾亚拉曼
A·丁卡尔
M·泰勒
G·拉奥
M·E·罗特
M·巴什依姆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dell Products LP
Original Assignee
Dell Products LP
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US13/158,289 external-priority patent/US9400799B2/en
Application filed by Dell Products LP filed Critical Dell Products LP
Publication of CN102819535A publication Critical patent/CN102819535A/en
Pending legal-status Critical Current


Abstract

Techniques and mechanisms are provided for migrating data blocks around a cluster during node addition and node deletion. Migration requires no downtime, as a newly added node is immediately operational while the data blocks are being moved. Blockmap files and deduplication dictionaries need not be updated.

Description

Data block migration
Technical field
The present disclosure relates to data block migration.
Background
Maintaining vast amounts of data is resource intensive, not just in terms of physical hardware costs but also in terms of system administration and infrastructure costs. Some mechanisms provide compression of data to save resources. For example, some file formats, such as the Portable Document Format (PDF), are compressed. Some other utilities allow compression on an individual file level in a relatively inefficient manner.
Data deduplication refers to the ability of a system to eliminate data duplication across files in order to increase the efficiency of storage, transmission, and/or processing. A storage system that incorporates deduplication technology stores a single instance of a data segment that is common across multiple files. In some examples, data sent to the storage system is divided into fixed-size or variable-size segments. Each segment is given a segment identifier (ID), such as a digital signature or a hash of the actual data. Once the segment ID is generated, it can be used to determine whether the data segment already exists in the system. If the data segment exists, it does not need to be stored again.
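As a rough illustration of the segment ID check described above, the following C sketch splits a buffer into fixed-size segments and stores a segment only when its ID has not been seen before. The names `segment_id`, `store_segment`, and `store_buffer`, the 8-byte segment size, and the use of FNV-1a in place of a digital signature or cryptographic hash are all assumptions made for brevity, not details taken from the disclosure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SEGMENT_SIZE 8   /* tiny fixed segment size, for illustration only */
#define MAX_SEGMENTS 64  /* capacity of this toy store (assumption) */

/* FNV-1a stands in for the digital signature or hash named in the text. */
uint64_t segment_id(const unsigned char *data, size_t len) {
    uint64_t h = 14695981039346656037ULL;
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 1099511628211ULL;
    }
    return h;
}

/* Hypothetical store that remembers only the IDs of segments it holds. */
struct segment_store {
    uint64_t ids[MAX_SEGMENTS];
    size_t count;
};

/* Returns 1 if the segment was newly stored, 0 if its ID was already known. */
int store_segment(struct segment_store *s, const unsigned char *data, size_t len) {
    uint64_t id = segment_id(data, len);
    for (size_t i = 0; i < s->count; i++)
        if (s->ids[i] == id)
            return 0;               /* duplicate: nothing is stored again */
    assert(s->count < MAX_SEGMENTS);
    s->ids[s->count++] = id;
    return 1;
}

/* Splits buf into fixed-size segments; returns how many were newly stored. */
size_t store_buffer(struct segment_store *s, const unsigned char *buf, size_t len) {
    size_t stored = 0;
    for (size_t off = 0; off < len; off += SEGMENT_SIZE) {
        size_t n = len - off < SEGMENT_SIZE ? len - off : SEGMENT_SIZE;
        stored += (size_t)store_segment(s, buf + off, n);
    }
    return stored;
}
```

With an empty store, passing the 24 bytes `AAAAAAAABBBBBBBBAAAAAAAA` to `store_buffer` keeps two segments, because the third segment repeats the first. A production system would use a collision-resistant hash and would typically also keep the segment data and a reference count.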
In many conventional implementations, data blocks may need to be migrated around a cluster. However, mechanisms for migrating data blocks are limited. Consequently, mechanisms for improving data block migration are provided here.
Brief description of the drawings
The disclosure may best be understood by reference to the following description taken in conjunction with the accompanying drawings, which illustrate particular embodiments of the present invention.
Fig. 1 illustrates a particular example of a system that can use the techniques and mechanisms of the present invention.
Fig. 2 illustrates one example of a locker.
Fig. 3A illustrates one example of adding a node.
Fig. 3B illustrates one example of performing data access.
Fig. 4A illustrates one example of a filemap.
Fig. 4B illustrates one example of a datastore suitcase.
Fig. 5 illustrates a particular example of a deduplication dictionary.
Fig. 6A illustrates a particular example of a file having a single data segment.
Fig. 6B illustrates a particular example of a file having multiple data segments and components.
Fig. 7 illustrates a particular example of a computer system.
Detailed description
Reference will now be made in detail to some specific examples of the invention, including the best modes contemplated by the inventors for carrying out the invention. Examples of these specific embodiments are illustrated in the accompanying drawings. While the invention is described in conjunction with these specific embodiments, it will be understood that this is not intended to limit the invention to the described embodiments. On the contrary, it is intended to cover alternatives, modifications, and equivalents as may be included within the spirit and scope of the invention as defined by the appended claims.
For example, the techniques and mechanisms of the present invention are described in the context of data blocks. However, it should be noted that the techniques and mechanisms of the present invention apply to a variety of data constructs, including variations of data blocks. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. Particular example embodiments of the present invention may be implemented without some or all of these specific details. In other instances, well-known process operations have not been described in detail in order not to unnecessarily obscure the present invention.
Various techniques and mechanisms of the present invention are sometimes described in singular form for clarity. However, it should be noted that some embodiments include multiple iterations of a technique or multiple instantiations of a mechanism unless noted otherwise. For example, a system uses a processor in a variety of contexts. However, it will be appreciated that a system can use multiple processors while remaining within the scope of the present invention unless otherwise noted. Furthermore, the techniques and mechanisms of the present invention sometimes describe a connection between two entities. It should be noted that a connection between two entities does not necessarily mean a direct, unimpeded connection, as a variety of other entities may reside between the two entities. For example, a processor may be connected to memory, but it will be appreciated that a variety of bridges and controllers may reside between the processor and memory. Consequently, a connection does not necessarily mean a direct, unimpeded connection unless otherwise noted.
Overview
Techniques and mechanisms are provided for migrating data blocks around a cluster during node addition and node deletion. Migration requires no downtime, because a newly added node is immediately operational while the data blocks are being moved. Blockmap files and deduplication dictionaries need not be updated.
Illustrative embodiments
Maintaining, managing, transmitting, and/or processing large amounts of data can be very costly. These costs include not only power and cooling costs but also system maintenance, network bandwidth, and hardware costs.
Some efforts have been made to reduce the footprint of data maintained by file servers and to reduce the associated network traffic. A variety of utilities compress individual files before the data is written to file servers. Compression algorithms are well developed and widely available. Some compression algorithms target specific types of data or specific types of files. Compression algorithms operate in a variety of ways, but many analyze data to determine source sequences that can be mapped to shorter code words. In many implementations, the most frequent source sequences or the most frequent long source sequences are replaced with the shortest possible code words.
Data deduplication reduces storage footprints by reducing the amount of redundant data. Deduplication may involve identifying variable-size or fixed-size segments. According to various embodiments, each segment of data is processed using a hash algorithm such as MD5 or SHA-1. This process generates a unique ID, hash, or reference for each segment. That is, if only a few bytes of a document or presentation are changed, only the changed portions are saved. In some instances, a deduplication system searches for matching sequences using a fixed or sliding window and uses references to identify matching sequences instead of storing the matching sequences again.
In a data deduplication system, a backup server working in conjunction with a backup agent identifies candidate files for backup, creates a backup stream, and sends the data to the deduplication system. A typical target system in a deduplication system deduplicates data as data segments are received. A block that already has a copy stored on the deduplication system does not need to be stored again. However, other information such as references and reference counts may need to be updated. Some implementations allow candidate data to be moved directly to the deduplication system without using backup software, by exposing a NAS drive that a user can manipulate to back up and archive files.
In an active file system, nodes may need to be added or removed during system operation. It is often desirable to be able to migrate data around a cluster in the face of node additions and node deletions. According to various embodiments, each blockmap and datastore suitcase in a cluster has a suitcase ID, or SCID. The SCID identifies a node and a blockmap or datastore suitcase, and can therefore globally identify a file located within the cluster.
According to various embodiments, the techniques and mechanisms of the present invention allow the mapping of nodes to SCIDs to be changed as nodes are added and removed. Node mappings can be changed while limiting or avoiding data copying. In particular embodiments, it is not necessary to scan each SCID and refresh every blockmap in order to modify the SCIDs. The techniques of the present invention can be applied to any clustered environment with any number of nodes. When a new node is added, data can be rebalanced across the nodes. Similarly, when a node is scheduled for removal, data can be redistributed from that node simply by copying the data off the node being removed.
Many existing mapping functions have a number of drawbacks. Many mapping functions are difficult to compute and may require numerous processor cycles. A mapping function may require keys to be rewritten when the mapping function changes, and may require additional copying of data between existing nodes when a node is added. When a new node is added to a cluster of two nodes, a less efficient solution might copy data to the new node and also copy data from node 1 to node 2 and from node 2 to node 1. According to various embodiments of the present invention, data is copied only to the new node.
According to various embodiments, the node number can be obtained from the SCID using a function such as #define get_the_node_number_from_the_scid(_scid_) scid_to_node_array[_scid_ % MAX_CLUSTER_SIZE]. The mapping function allows a key to be used to determine the node that holds the data. According to various embodiments, the mapping function may change when new keys are generated. The key itself may include the node number, so keys can be allocated independently on each node without communication between nodes. In particular embodiments, data blocks need not be reassigned to different nodes by rewriting keys during node addition or deletion. When a node is added, any amount of data can be copied from each node to the new node in order to rebalance data across the cluster.
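The following C sketch shows one way the macro above could be exercised. The table `scid_to_node_array` and the constant `MAX_CLUSTER_SIZE` come from the text; the value 8, the helper `make_scid`, and its key layout (a per-node sequence number combined with a slot number) are hypothetical choices made only to show that a key can carry enough information to be minted locally and still resolve to a node through the table.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CLUSTER_SIZE 8  /* assumed value; the text does not give one */

/* Slot-to-node table indexed by scid % MAX_CLUSTER_SIZE. */
int scid_to_node_array[MAX_CLUSTER_SIZE];

/* Macro from the text, with the argument parenthesized for safety. */
#define get_the_node_number_from_the_scid(_scid_) \
    scid_to_node_array[(_scid_) % MAX_CLUSTER_SIZE]

/* Hypothetical key layout: the low part of the key is a slot number, so a
 * node can mint new keys from its own sequence counter without talking to
 * other nodes, and the slot can be recovered with a modulus. */
uint64_t make_scid(uint64_t local_sequence, unsigned slot) {
    assert(slot < MAX_CLUSTER_SIZE);
    return local_sequence * MAX_CLUSTER_SIZE + slot;
}
```

Under this layout, changing which node owns a slot is a single table update: every existing key in that slot resolves to the new owner, and no key has to be rewritten.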
Fig. 1 illustrates a multi-tenant on-demand infrastructure. Multiple virtual machines, including virtual machines corresponding to virtual images 101, 103, 105, 107, and 109, run on a server platform that shares multiple processor cores 141. According to various embodiments, virtual image A 101 runs a server operating system, a database server, and one or more custom applications. Virtual images 103 and 105 are clones of virtual image A 101. According to various embodiments, virtual image B 107 runs a server operating system, a database server, a web server, and/or one or more custom applications. Virtual image 109 is a clone of virtual image B 107. In particular embodiments, user 111 is connected to virtual image A 101. Users 113, 115, and 117 are connected to clone 103 of virtual image A. Users 119 and 121 are connected to clone 105 of virtual image A. Users 123, 125, and 127 are connected to virtual image B 107. Users 129 and 131 are connected to clone 109 of virtual image B.
Computing cloud service providers allow users to create new instances of virtual images on demand. These new instances may be clones of existing virtual machine images. An object optimization system provides an application programming interface (API) that can be used to clone a file instantly. When the API is used, a new stub is placed in the user namespace and a blockmap file is cloned.
In particular embodiments, every file maintained in the object optimization system is represented by a blockmap, which in turn represents all objects found in that file. The blockmap file includes the offset and size of each object. Each entry in the blockmap file then points to a particular offset inside a datastore suitcase. According to various embodiments, many blockmap files point to a smaller number of datastore suitcases, so that multiple files share the same data blocks.
According to various embodiments, the cloned blockmap file keeps all of the same offsets and location pointers as the blockmap of the original file, so no user file data needs to be copied. In particular embodiments, if the cloned file is later modified, the behavior is the same as when a deduplicated file is modified.
Fig. 2 illustrates one example of an optimized file structure. According to various embodiments, an optimization system is told where to store its data structures, where the data input stream comes from, what the scope of optimization is, which optimization actions to apply to the stream, and how to mark data as having been optimized. The data is then optimized. In particular embodiments, optimized data is stored in a locker 221. The locker 221 can be a directory, volume, partition, or an interface to a persistent object store. Inside the locker 221, optimized data is stored in containers or structures such as suitcase 271. In a file system, each suitcase 271 could be a file. In block or object stores, other formats may be used. A user-viewable namespace 201 includes multiple stub files 211. According to various embodiments, stub files 211 correspond to virtual image A 213 and virtual image B 215. Virtual image A 213 is associated with extended attribute information 217, which includes file size data and/or other metadata. Virtual image B 215 is associated with extended attribute information 219, which includes file size data and/or other metadata.
According to various embodiments, optimized data is maintained in the locker 221. Blockmap files 261 include offset, length, and location identifiers for locating the appropriate data segments in a datastore suitcase 271. Multiple blockmap files may point to the same data segments in a datastore suitcase. Each blockmap file also has extended attribute information 231 and 241, corresponding to directory handle virtual image A 233 and directory handle virtual image B 243, respectively.
Fig. 3A illustrates one example of a technique for adding a node to a cluster. Although the technique is described in the context of node addition, it will be recognized that various techniques can also be applied to node deletion or modification. At 301, a data imbalance is detected. According to various embodiments, a multi-cluster system may determine that some nodes are heavily used while other nodes are used relatively little. In other examples, the system may detect that additional nodes are needed based on storage usage. In still other examples, nodes may be added or removed even if no data imbalance has been determined. At 303, a request to add a node is received. Adding a node may correspond to connecting an additional storage array or bringing a storage device online in a storage cluster.
At 305, multiple keys are generated. In particular embodiments, the mapping function may be rewritten at 307. In particular embodiments, the multiple keys may be suitcase identifiers and/or may correspond to particular blockmap files. According to various embodiments, the mapping function specifies that a key identifies or corresponds to a particular node. The mapping function may be rewritten when the multiple keys are generated. At 309, data is copied from existing nodes to the new node in order to rebalance data across the data storage cluster. According to various embodiments, blockmap files need not be scanned, accessed, analyzed, or modified during node addition, deletion, or modification. In particular embodiments, the blockmap files remain unchanged.
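One possible reading of steps 305 through 309 is sketched below in C, assuming the mapping function is a slot-to-node table like the `scid_to_node_array` mentioned in the text. The function `add_node`, the constant value 8, the assumption that existing nodes are numbered 0 to `old_node_count - 1`, and the rule of taking one slot at a time from the most loaded existing node are all illustrative assumptions, not the patented procedure. The point of the sketch is only that slots move from existing nodes to the new node and never between existing nodes.

```c
#include <assert.h>

#define MAX_CLUSTER_SIZE 8  /* assumed number of slots in the mapping table */

/* Rewrites the slot-to-node table so the new node owns roughly an equal
 * share of slots. The new node is numbered old_node_count (assumption).
 * Returns the number of slots whose data must be copied to the new node. */
int add_node(int slot_to_node[MAX_CLUSTER_SIZE], int old_node_count) {
    assert(old_node_count > 0 && old_node_count < MAX_CLUSTER_SIZE);
    int new_node = old_node_count;
    int target = MAX_CLUSTER_SIZE / (old_node_count + 1);
    int moved = 0;

    while (moved < target) {
        /* Count how many slots each node currently owns. */
        int owned[MAX_CLUSTER_SIZE] = {0};
        for (int s = 0; s < MAX_CLUSTER_SIZE; s++)
            owned[slot_to_node[s]]++;

        /* Pick the most loaded existing node as the donor. */
        int donor = 0;
        for (int n = 1; n < old_node_count; n++)
            if (owned[n] > owned[donor])
                donor = n;

        /* Hand one of the donor's slots to the new node. */
        for (int s = 0; s < MAX_CLUSTER_SIZE; s++) {
            if (slot_to_node[s] == donor) {
                slot_to_node[s] = new_node;
                moved++;
                break;
            }
        }
    }
    return moved;
}
```

Starting from two nodes that alternate over eight slots, calling `add_node(slots, 2)` reassigns two slots to node 2 and leaves the other six with their original owners, so only the data behind those two slots is copied.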
Fig. 3B illustrates one example of a technique for performing data access after data migration. At 351, a stub file is accessed. The stub file corresponds to a virtual image of an optimized file and includes extended attribute information such as metadata. According to various embodiments, the stub file provides a suitcase identifier (SCID). In particular embodiments, the suitcase identifier specifies a node. In particular embodiments, at 353, the extended attribute information and metadata can be accessed immediately in user space. At 355, the node specified by the SCID is determined. In particular embodiments, the node number is determined by indexing an array with the SCID modulo the maximum cluster size. In some examples, the node number is obtained from the SCID using the following function:
#define get_the_node_number_from_the_scid(_scid_) \
    scid_to_node_array[_scid_ % MAX_CLUSTER_SIZE]
At 357, the blockmap file is accessed. The blockmap file includes offset, length, and location information identifying data segments in a datastore suitcase. According to various embodiments, blockmap files need not be accessed, scanned, or updated when data is migrated. At 359, the datastore suitcase on the appropriate node is accessed. At 361, metadata in the datastore suitcase can be obtained. At 363, data segments in the datastore suitcase can be obtained. At 365, the data segments can be reflated and/or decompressed to obtain unoptimized data.
Fig. 4A illustrates one example of a blockmap file or filemap, and Fig. 4B illustrates the corresponding datastore suitcase created after optimizing a file X. Filemap file X 401 includes offset 403, index 405, and lname 407 fields. According to various embodiments, each segment in the filemap for file X is 8K in size. In particular embodiments, each data segment has an index of the form <datastore suitcase ID>.<data table index>. For example, 0.1 corresponds to suitcase ID 0 and data table index 1, while 2.3 corresponds to suitcase ID 2 and data table index 3. The segments corresponding to offsets 0K, 8K, and 16K all reside in suitcase ID 0, with data table indices 1, 2, and 3, respectively. The lname field 407 is NULL in the filemap because no segment has previously been referenced by any file.
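The filemap layout for file X can be pictured with the short C sketch below. The field meanings (offset, suitcase ID, data table index, lname) and the values for file X follow the text; the struct name `filemap_entry`, the array `file_x_map`, and the lookup helper `find_segment` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define SEGMENT_BYTES (8 * 1024)  /* each segment of file X is 8K */

struct filemap_entry {
    size_t offset;         /* byte offset of the segment within the file */
    unsigned suitcase_id;  /* datastore suitcase holding the segment */
    unsigned data_index;   /* index into that suitcase's data table */
    const char *lname;     /* NULL when no earlier file referenced it */
};

/* File X as described: offsets 0K, 8K, 16K map to 0.1, 0.2, 0.3. */
const struct filemap_entry file_x_map[] = {
    { 0 * SEGMENT_BYTES, 0, 1, NULL },
    { 1 * SEGMENT_BYTES, 0, 2, NULL },
    { 2 * SEGMENT_BYTES, 0, 3, NULL },
};

/* Returns the entry covering file_offset, or NULL past the end of the map. */
const struct filemap_entry *find_segment(const struct filemap_entry *map,
                                         size_t entries, size_t file_offset) {
    assert(map != NULL);
    size_t i = file_offset / SEGMENT_BYTES;
    return i < entries ? &map[i] : NULL;
}
```

A read at byte 9000 of file X falls in the second segment, so `find_segment(file_x_map, 3, 9000)` yields the entry for index 0.2. Note that nothing in an entry names a node; the node is derived from the suitcase ID, which is why the filemap can stay unchanged when data migrates.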
Fig. 4B illustrates one example of a datastore suitcase corresponding to filemap file X 401. According to various embodiments, datastore suitcase 471 includes an index portion and a data portion. The index portion includes indices 453, data offsets 455, and data reference counts 457. The data portion includes indices 453, data 461, and last file references 463. According to various embodiments, arranging data table 451 in this manner allows the system to perform a bulk read of the index portion to obtain offset data, which in turn allows large amounts of data in the data portion to be read in parallel.
According to various embodiments, datastore suitcase 471 includes three offset and reference count pairs, which map to the data segments of filemap file X 401. In the index portion, index 1, corresponding to data at offset data A, has been referenced once. Index 2, corresponding to data at offset data B, has been referenced once. Index 3, corresponding to data at offset data C, has been referenced once. In the data portion, index 1 includes data A and a reference to file X 401, which was the last file to reference data A. Index 2 includes data B and a reference to file X 401, which was the last file to reference data B. Index 3 includes data C and a reference to file X 401, which was the last file to reference data C.
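A minimal C sketch of the suitcase bookkeeping follows. The split into an index portion (offsets and reference counts) and a data portion (payload and last referencing file) mirrors the text; the struct layout, the slot count of 4, and the helpers `reference_segment` and `total_references` are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>

#define SUITCASE_SLOTS 4  /* assumed capacity of this toy suitcase */

struct suitcase {
    /* Index portion: small and contiguous, so it can be read in bulk. */
    size_t data_offset[SUITCASE_SLOTS];
    unsigned refcount[SUITCASE_SLOTS];
    /* Data portion: payload plus the last file that referenced it. */
    const char *data[SUITCASE_SLOTS];
    const char *last_file[SUITCASE_SLOTS];
};

/* Records that `file` references the segment at a 1-based data table index. */
void reference_segment(struct suitcase *sc, unsigned index, const char *file) {
    assert(index >= 1 && index <= SUITCASE_SLOTS);
    sc->refcount[index - 1]++;
    sc->last_file[index - 1] = file;
}

/* Sums the reference counts across the whole index portion. */
unsigned total_references(const struct suitcase *sc) {
    unsigned total = 0;
    for (size_t i = 0; i < SUITCASE_SLOTS; i++)
        total += sc->refcount[i];
    return total;
}
```

After file X references indices 1, 2, and 3, each count is 1, matching the state described above. If a second file later shares the segment at index 2, only that count and its last file reference change; the segment data is not stored again.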
According to various embodiments, the dictionary is a key component of the deduplication system. The dictionary is used to identify duplicate data segments and to point to the location of each data segment. When numerous small data segments exist in a system, the dictionary can become inefficiently large. Furthermore, when multiple optimizer nodes operate on the same data set, each of them creates its own dictionary. This approach can lead to suboptimal deduplication: a first node may have already identified a redundant data segment, but a second node is not aware of it because the dictionary is not shared between the two nodes. The second node therefore stores the same data segment as the original segment. Sharing the entire dictionary is possible with a locking mechanism and a mechanism for merging updates from multiple nodes. However, such mechanisms can be complicated and can adversely affect performance.
Consequently, a work partitioning scheme can be applied based on the segment ID or hash value ranges of the various data segments. Ranges of hash values are assigned to different nodes within the cluster. If a node is processing a data segment whose hash value maps to another node, it contacts the node that owns that range to find out whether the data segment already exists in a datastore.
Fig. 5 illustrates multiple dictionaries assigned to different segment ID or hash value ranges. Although hash value ranges are described, it should be recognized that the dictionary index can be hash value ranges, reference values, or other types of keys. According to various embodiments, the hash values are SHA-1 hash values. In particular embodiments, a first node uses dictionary 501, which covers hash range 0x0000 0000 0000 0000 to 0x0000 0000 FFFF FFFF. A second node uses dictionary 551, which covers hash range 0x0000 0001 0000 0000 to 0x0000 0001 FFFF FFFF. For simplicity, hash values 511 within the range of dictionary 501 are represented by symbols a, b, and c, and hash values 561 within the range of dictionary 551 are represented by symbols i, j, and k. According to various embodiments, each hash value in dictionary 501 maps to a particular storage location 521, such as location 523, 525, or 527. Each hash value in dictionary 551 maps to a particular storage location 571, such as location 573, 575, or 577.
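The range-based partitioning of dictionaries can be sketched in C as a simple owner lookup. The two ranges are the ones given in the text, treated here as 64-bit values (an assumption, since a full SHA-1 value is 160 bits); the struct `dictionary_range`, the node numbers 1 and 2, and the function `dictionary_owner` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct dictionary_range {
    uint64_t first;  /* lowest hash value in the range, inclusive */
    uint64_t last;   /* highest hash value in the range, inclusive */
    int node;        /* node that owns the dictionary for this range */
};

/* Ranges of dictionaries 501 and 551, as 64-bit values (assumption). */
const struct dictionary_range ranges[] = {
    { 0x0000000000000000ULL, 0x00000000FFFFFFFFULL, 1 },
    { 0x0000000100000000ULL, 0x00000001FFFFFFFFULL, 2 },
};

/* Returns the node whose dictionary covers `hash`, or -1 if none does. */
int dictionary_owner(uint64_t hash) {
    for (size_t i = 0; i < sizeof ranges / sizeof ranges[0]; i++) {
        assert(ranges[i].first <= ranges[i].last);
        if (hash >= ranges[i].first && hash <= ranges[i].last)
            return ranges[i].node;
    }
    return -1;
}
```

A node processing a segment would call `dictionary_owner` on the segment's hash and, if the result is another node, ask that node whether the segment already exists instead of consulting a private dictionary.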
Having numerous small segments increases the likelihood that duplicates will be found. However, having numerous small segments decreases the efficiency of using the dictionary itself, as well as the efficiency of using the associated filemaps and datastore suitcases.
Fig. 6A illustrates one example of a non-container file. According to various embodiments, container files such as ZIP files, archives, and productivity suite documents such as .docx and .xlsx files include multiple objects of different types. Non-container files such as images and simple text files do not contain disparate objects.
According to various embodiments, it is recognized that certain types of non-container files do not benefit from a segment size smaller than the size of the file itself. For example, many image files such as .jpg and .gif files do not have many segments in common with other .jpg and .gif files. Consequently, selecting small segments for such file types is inefficient. The segment boundaries for an image file may therefore be the boundaries of the file itself. For example, non-container data 601 includes file 603 of a type that does not benefit from finer grained segmentation. File types that do not benefit from finer grained segmentation include image files such as .jpg, .png, .gif, and .bmp files. Consequently, file 603 is provided with a single segment 605, and a single segment is maintained in the deduplication dictionary. Providing a single large segment that encompasses the entire file also allows the segment to be compressed more efficiently. According to various embodiments, multiple segments encompassing multiple files of the same type are compressed at the same time. In particular embodiments, only segments having data from the same type of file are compressed using a single compression context. It is recognized that specialized compressors may be applied to particular segments associated with the same file type.
Fig. 6B illustrates one example of a container file having multiple disparate objects. Data 651 includes a container file that does benefit from more intelligent segmentation. According to various embodiments, segmentation can be performed intelligently while allowing multiple segments to be compressed using a single compression context. Segmentation can be implemented in an intelligent manner for deduplication while improving compression efficiency. Instead of selecting a single segment size or using a sliding segment window, file 653 is delayered to extract its components. For example, a .docx file may include text, images, and other container files. File 653 may include components 655, 659, and 663. Component 655 may be a component that does not benefit from finer grained segmentation and consequently includes only segment 657. Similarly, component 659 also includes a single segment 661. By contrast, component 663 is actually an embedded container file 663 that includes not only data that does benefit from additional segmentation but also another component 673. For example, data 665 may include text. According to various embodiments, the segment size for text may be a predetermined size or a dynamic or tunable size. In particular embodiments, the text is separated into equal-sized segments 667, 669, and 671. The data may also include a non-text object 673 that is provided with segment boundaries aligned with the object boundaries 675.
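The segmentation policy described for Figs. 6A and 6B can be summarized in a small C sketch: components whose type does not benefit from finer grained segmentation get one segment spanning the whole component, while other data such as text is cut into equal-sized segments. The image extensions are the ones listed in the text; the function names and the 8K text segment size are assumptions (the text only says the size may be predetermined, dynamic, or tunable).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TEXT_SEGMENT_BYTES (8 * 1024)  /* assumed predetermined size */

/* Returns 1 for file types that get a single segment spanning the file. */
int is_single_segment_type(const char *ext) {
    static const char *const types[] = { ".jpg", ".png", ".gif", ".bmp" };
    for (size_t i = 0; i < sizeof types / sizeof types[0]; i++)
        if (strcmp(ext, types[i]) == 0)
            return 1;
    return 0;
}

/* Number of segments produced for one extracted component. */
size_t segment_count(const char *ext, size_t size) {
    assert(ext != NULL);
    if (size == 0)
        return 0;
    if (is_single_segment_type(ext))
        return 1;  /* segment boundary equals the component boundary */
    /* Otherwise cut into equal-sized segments, rounding up for the tail. */
    return (size + TEXT_SEGMENT_BYTES - 1) / TEXT_SEGMENT_BYTES;
}
```

Under these assumptions a 1,000,000-byte .jpg component yields one segment, while a 20,000-byte text component yields three. A delayering step for container files would call `segment_count` once per extracted component rather than once per container.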
A variety of devices and applications can implement particular examples of network-efficient deduplication. Fig. 7 illustrates one example of a computer system. According to particular example embodiments, a system 700 suitable for implementing particular embodiments of the present invention includes a processor 701, a memory 703, an interface 711, and a bus 715 (for example, a PCI bus). When acting under the control of appropriate software or firmware, the processor 701 is responsible for tasks such as optimization. Various specially configured devices can also be used in place of the processor 701 or in addition to the processor 701. The complete implementation can also be done in custom hardware. The interface 711 is typically configured to send and receive data packets or data segments over a network. Particular examples of interfaces the device supports include Ethernet interfaces, frame relay interfaces, cable interfaces, DSL interfaces, token ring interfaces, and the like.
In addition, various very high-speed interfaces may be provided, such as Fast Ethernet interfaces, Gigabit Ethernet interfaces, ATM interfaces, HSSI interfaces, POS interfaces, FDDI interfaces, and the like. Generally, these interfaces may include ports appropriate for communication with the appropriate media. In some cases, they may also include an independent processor and, in some instances, volatile RAM. The independent processors may control communications-intensive tasks such as packet switching, media control, and management.
According to particular example embodiments, the system 700 uses memory 703 to store data and program instructions and to maintain a local side cache. The program instructions may control the operation of an operating system and/or one or more applications, for example. The memory or memories may also be configured to store received metadata and batch requested metadata.
Because such information and program instructions may be employed to implement the systems and methods described herein, the present invention relates to tangible, machine-readable media that include program instructions, state information, and the like for performing the various operations described herein. Examples of machine-readable media include hard disks, floppy disks, magnetic tape, optical media such as CD-ROM disks and DVDs, magneto-optical media such as optical disks, and hardware devices that are specially configured to store and perform program instructions, such as read-only memory (ROM) devices and programmable read-only memory (PROM) devices. Examples of program instructions include both machine code, such as that produced by a compiler, and files containing higher level code that may be executed by the computer using an interpreter.
Although many of the components and processes are described above in the singular for convenience, it will be appreciated by one of skill in the art that multiple components and repeated processes can also be used to practice the techniques of the present invention.
While the invention has been particularly shown and described with reference to specific embodiments thereof, it will be understood by those skilled in the art that changes in the form and details of the disclosed embodiments may be made without departing from the spirit or scope of the invention. It is therefore intended that the invention be interpreted to include all variations and equivalents that fall within the true spirit and scope of the present invention.

Claims (20)

1. A method, comprising:
receiving a request to add a new node to a data storage cluster;
generating a plurality of new keys associated with a mapping function, the mapping function identifying a particular node corresponding to a particular key; and
copying data from a plurality of existing nodes to the new node to rebalance the data on the data storage cluster.
2. The method of claim 1, wherein each of the plurality of new keys includes a node number.
3. The method of claim 1, wherein the plurality of new keys corresponds to a plurality of blockmap files.
4. The method of claim 3, wherein each blockmap file includes offset, length, and location identifiers used to identify segments in a plurality of suitcases.
5. The method of claim 3, wherein the plurality of blockmap files remains unchanged after the new node is added.
6. The method of claim 1, wherein the plurality of new keys is a plurality of suitcase identifiers (scids).
7. The method of claim 6, wherein the mapping function comprises the following formula:
#define get_the_node_number_from_the_scid(_scid_) \
    scid_to_node_array[_scid_ % MAX_CLUSTER_SIZE]
8. The method of claim 1, wherein the new node is a storage device.
9. The method of claim 1, wherein the new node is a storage array.
10. The method of claim 1, wherein a plurality of segments is copied to suitcases maintained by the new node.
11. The method of claim 1, wherein the mapping function may change when the plurality of new keys is generated.
12. A system, comprising:
an interface configured to receive a request to add a new node to a data storage cluster; and
a processor configured to generate a plurality of new keys associated with a mapping function, the mapping function identifying a particular node corresponding to a particular key, the processor being further configured to copy data from a plurality of existing nodes to the new node to rebalance the data on the data storage cluster.
13. The system of claim 12, wherein each of the plurality of new keys includes a node number.
14. The system of claim 12, wherein the plurality of new keys corresponds to a plurality of blockmap files.
15. The system of claim 14, wherein each blockmap file includes offset, length, and location identifiers used to identify segments in a plurality of suitcases.
16. The system of claim 14, wherein the plurality of blockmap files remains unchanged after the new node is added.
17. The system of claim 12, wherein the plurality of new keys is a plurality of suitcase identifiers (scids).
18. A method, comprising:
receiving a request to add a new node to a data storage cluster;
generating a plurality of new keys associated with a mapping function, the mapping function identifying a particular node corresponding to a particular key; and
copying data from a plurality of existing nodes to the new node to rebalance the data on the data storage cluster.
19. The method of claim 18, wherein each of the plurality of new keys includes a node number.
20. The method of claim 18, wherein the plurality of new keys corresponds to a plurality of blockmap files.
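
As an illustration only, the table lookup recited in claim 7 and the unchanged blockmap files of claim 5 might be sketched in C as follows. This is one possible reading, not the claimed implementation: the value of MAX_CLUSTER_SIZE, the struct blockmap_entry layout, and the helpers init_cluster, add_node, and node_for_entry are all assumptions introduced for this example, and the sketch reassigns existing table slots rather than modeling how new keys are generated.

```c
#include <assert.h>

#define MAX_CLUSTER_SIZE 8  /* assumed slot count; the claims give no value */

/* Table consulted by the claim 7 macro: slot index -> node number. */
int scid_to_node_array[MAX_CLUSTER_SIZE];

#define get_the_node_number_from_the_scid(_scid_) \
    scid_to_node_array[(_scid_) % MAX_CLUSTER_SIZE]

/* Hypothetical blockmap entry with the three fields named in claim 4. */
struct blockmap_entry {
    unsigned long offset;  /* where the segment starts */
    unsigned long length;  /* segment length in bytes */
    unsigned int  scid;    /* location identifier */
};

/* Hypothetical helper: assign slots to nodes 0..node_count-1 round-robin. */
void init_cluster(int node_count)
{
    assert(node_count > 0 && node_count <= MAX_CLUSTER_SIZE);
    for (int slot = 0; slot < MAX_CLUSTER_SIZE; slot++)
        scid_to_node_array[slot] = slot % node_count;
}

/* Hypothetical helper: hand slots from nodes that own more than an even
 * share to new_node. Returns the number of slots reassigned; a real
 * system would copy the data behind each of those slots. */
int add_node(int new_node, int node_count_after)
{
    int owned[MAX_CLUSTER_SIZE] = {0};
    int target, moved = 0;

    assert(node_count_after > 0 && node_count_after <= MAX_CLUSTER_SIZE);
    assert(new_node >= 0 && new_node < node_count_after);
    target = MAX_CLUSTER_SIZE / node_count_after;

    /* Count how many slots each node currently owns. */
    for (int slot = 0; slot < MAX_CLUSTER_SIZE; slot++)
        owned[scid_to_node_array[slot]]++;

    /* Reassign slots, in slot order, until the new node has its share. */
    for (int slot = 0; slot < MAX_CLUSTER_SIZE && moved < target; slot++) {
        int owner = scid_to_node_array[slot];
        if (owner != new_node && owned[owner] > target) {
            scid_to_node_array[slot] = new_node;
            owned[owner]--;
            moved++;
        }
    }
    return moved;
}

/* Resolve an entry to a node; the entry itself is never modified. */
int node_for_entry(const struct blockmap_entry *e)
{
    return get_the_node_number_from_the_scid(e->scid);
}
```

Under these assumptions, after init_cluster(3) an entry whose scid is 8 resolves to node 0. Calling add_node(3, 4) reassigns slots 0 and 1 to the new node, after which the same unmodified entry resolves to node 3. Only the lookup table changes, which is one way the stored blockmap data could remain untouched when a node is added.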
CN2011102868481A 2011-06-10 2011-09-23 Data block migration Pending CN102819535A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US13/158,289 US9400799B2 (en) 2010-10-04 2011-06-10 Data block migration
US13/158,289 2011-06-10

Publications (1)

Publication Number Publication Date
CN102819535A true CN102819535A (en) 2012-12-12

Family

ID=47305351

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2011102868481A Pending CN102819535A (en) 2011-06-10 2011-09-23 Data block migration

Country Status (1)

Country Link
CN (1) CN102819535A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060143248A1 (en) * 2001-06-27 2006-06-29 Yukio Nakano Database management system with rebalance architectures
US20100036887A1 (en) * 2008-08-05 2010-02-11 International Business Machines Corporation Efficient transfer of deduplicated data
CN101917396A (en) * 2010-06-25 2010-12-15 清华大学 Real-time repetition removal and transmission method for data in network file system

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103229487A (en) * 2012-12-27 2013-07-31 华为技术有限公司 Partition balance method, device and server in distributed storage system
WO2014101044A1 (en) * 2012-12-27 2014-07-03 华为技术有限公司 Partition balancing method, device and server in distributed storage system
CN105493043A (en) * 2013-08-21 2016-04-13 森普利维蒂公司 System and method for virtual machine conversion
CN105493043B (en) * 2013-08-21 2019-03-19 慧与发展有限责任合伙企业 System and method for virtual machine conversion
US10762038B2 (en) 2013-08-21 2020-09-01 Hewlett Packard Enterprise Development Lp System and method for virtual machine conversion
CN105743671A (en) * 2014-12-10 2016-07-06 华为技术有限公司 Capacity expanding method and system, and controller
CN107295046A (en) * 2016-03-31 2017-10-24 阿里巴巴集团控股有限公司 A kind of method and apparatus of user's migration
CN107295046B (en) * 2016-03-31 2020-06-05 阿里巴巴集团控股有限公司 User migration method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20121212