CN106020739A - Data storage method and system for distributed storage - Google Patents

Info

Publication number
CN106020739A
Authority
CN
China
Prior art keywords
storage
queue
described
storage device
data set
Prior art date
Application number
CN201610547862.5A
Other languages
Chinese (zh)
Inventor
吴兴义
Original Assignee
乐视控股(北京)有限公司
乐视云计算有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 乐视控股(北京)有限公司 and 乐视云计算有限公司
Priority to CN201610547862.5A
Publication of CN106020739A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16 Error detection or correction of the data by redundancy in hardware
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from or digital output to record carriers, e.g. RAID, emulated record carriers, networked record carriers
    • G06F3/0601 Dedicated interfaces to storage systems
    • G06F3/0602 Dedicated interfaces to storage systems specifically adapted to achieve a particular effect
    • G06F3/0614 Improving the reliability of storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from or digital output to record carriers, e.g. RAID, emulated record carriers, networked record carriers
    • G06F3/0601 Dedicated interfaces to storage systems
    • G06F3/0602 Dedicated interfaces to storage systems specifically adapted to achieve a particular effect
    • G06F3/0614 Improving the reliability of storage systems
    • G06F3/0619 Improving the reliability of storage systems in relation to data integrity, e.g. data losses, bit errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from or digital output to record carriers, e.g. RAID, emulated record carriers, networked record carriers
    • G06F3/0601 Dedicated interfaces to storage systems
    • G06F3/0628 Dedicated interfaces to storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0647 Migration mechanisms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from or digital output to record carriers, e.g. RAID, emulated record carriers, networked record carriers
    • G06F3/0601 Dedicated interfaces to storage systems
    • G06F3/0668 Dedicated interfaces to storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Abstract

An embodiment of the invention provides a data storage method for distributed storage. The method comprises: monitoring the working state of each storage device in a cluster; when a failed storage device is detected, determining at least one storage array in which the failed storage device is located, the remaining surviving storage devices in the at least one storage array, and all data sets associated with the at least one storage array; generating a first queue and a second queue based on user access to at least one of the data sets, wherein the first queue corresponds to the data sets that have not been accessed and the second queue corresponds to the data sets that have been accessed; selecting an available storage device from the cluster to replace the failed storage device; and migrating the data sets from the remaining surviving storage devices in the at least one storage array to the available storage device, with the second queue taking priority over the first queue. Because hotspot data is migrated first, the probability of losing hotspot data is reduced and data security is improved.

Description

Data storage method and system for distributed storage

Technical field

The present invention relates to the field of computer networks, and in particular to a data storage method and system for distributed storage.

Background art

A distributed storage system splits data according to certain rules and scatters it across many independent, general-purpose storage servers. A traditional network storage system keeps all data on a centralized storage server; that server becomes the bottleneck of system performance and the focal point of reliability and security concerns, and it cannot meet the needs of large-scale storage applications. A distributed storage system instead uses a scalable architecture: multiple storage servers share the storage load, and a location server is used to locate the stored information. This improves the reliability, availability, and access efficiency of the system and also makes it easy to scale. With thousands of servers in a storage cluster, data can be stored with ample redundancy, which significantly improves data security.

In the storage field, the annualized failure rate (AFR) is commonly used to characterize disk reliability. The AFR of typical disks on the market today is about 4%. If a cluster has 365 disks and disk failures are treated as independent, the probability that at least one disk fails within a year is 1 - pow(0.96, 365) ≈ 0.9999996619351175; that is, a disk failure within the year is all but certain. In practice, a distributed storage cluster usually has thousands of disks, so coping with disk failure is a problem that every storage system must solve.
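The arithmetic behind this figure can be checked with a short sketch. It assumes that disk failures are independent and that every disk has the same AFR; the function name is illustrative and not part of the patent.

```python
def prob_at_least_one_failure(afr: float, disks: int) -> float:
    """Probability that at least one of `disks` drives fails within a year.

    Assumes independent failures with an identical annualized failure
    rate `afr` for every drive.
    """
    prob_all_survive = (1.0 - afr) ** disks
    return 1.0 - prob_all_survive


p = prob_at_least_one_failure(afr=0.04, disks=365)
print(f"{p:.10f}")
```

Note that pow(0.96, 365) on its own is the probability that no disk fails, which is on the order of 3.4e-7; the near-certain figure is its complement.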

In the prior art, disk failure is handled through data redundancy. Typically, each piece of data is stored on three or more disks; when one disk fails, the data can be recovered from the remaining two replicas, which prevents data loss when a disk fails or a server goes down. In practice, however, the data recovery strategy must be carefully designed, or data loss can still occur. When building a storage cluster, an enterprise usually buys many storage servers and disks in the same batch. Disks from the same batch tend to share similar hardware specifications and drive firmware, so they may fail at around the same time. In that case, when one disk starts to fail, all data stored on it enters a degraded state. Worse, the probability that other disks from the same batch fail soon afterward also rises. If another disk then fails, part of the data is left with only one surviving replica and the degradation deepens. If that data cannot be repaired quickly, it may be lost entirely when the disk holding the last replica fails.

A common approach to the disk failure problem is to increase the number of replicas, for example by changing the original three-replica policy to four replicas. This brute-force approach significantly increases the enterprise's storage cost, and the additional replicas also reduce write performance, so it is not a good solution.

User data typically shows a clear difference between hot and cold. The hottest data, that is, the most frequently accessed data, is usually also the most important. This is one manifestation of the well-known principle of locality in computer systems: data that a user accessed at some moment is more likely to be accessed again in the future. Distributed storage systems provide high availability: even if one replica of a piece of data is damaged, the data can still be read from another replica and returned when the user accesses it. However, if important hotspot data that users access is not repaired quickly and disks continue to fail, that data may be lost, causing heavy losses for the user.

Summary of the invention

Embodiments of the present invention provide a data storage method and system for distributed storage that preferentially migrate the hotspot data accessed by users. This reduces the probability of losing hotspot data and safeguards it, thereby significantly improving the data security and availability of the system.

An embodiment of the present invention provides a data storage method for distributed storage, comprising:

monitoring the working state of each storage device in a distributed storage cluster;

when a failed storage device is detected, determining at least one storage array in which the failed storage device is located, the remaining surviving storage devices in the at least one storage array, and all data sets associated with the at least one storage array;

generating a first queue and a second queue based on user access to at least one of the data sets, wherein the first queue corresponds to the data sets that have not been accessed and the second queue corresponds to the data sets that have been accessed;

selecting an available storage device from the cluster to replace the failed storage device; and

migrating the data sets from the remaining surviving storage devices in the at least one storage array to the replacement available storage device, with the second queue taking priority over the first queue.

An embodiment of the present invention provides a data storage system for distributed storage, comprising:

a monitoring module, configured to monitor the working state of each storage device in a distributed storage cluster;

a migration data determining module, configured to determine, when a failed storage device is detected, at least one storage array in which the failed storage device is located, the remaining surviving storage devices in the at least one storage array, and all data sets associated with the at least one storage array;

a migration queue generation module, configured to generate a first queue and a second queue based on user access to at least one of the data sets, wherein the first queue corresponds to the data sets that have not been accessed and the second queue corresponds to the data sets that have been accessed;

a storage repair module, configured to select an available storage device from the cluster to replace the failed storage device; and

a data migration module, configured to migrate the data sets from the remaining surviving storage devices in the at least one storage array to the replacement available storage device, with the second queue taking priority over the first queue.

Brief description of the drawings

To illustrate the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed to describe them are briefly introduced below. The drawings described below show some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a flowchart of a data storage method for distributed storage according to an embodiment of the present invention;

Fig. 2 is a flowchart of one embodiment of the data storage method for distributed storage of the present invention;

Fig. 3 is a flowchart of another embodiment of the data storage method for distributed storage of the present invention;

Fig. 4 is a schematic structural diagram of a data storage system for distributed storage according to an embodiment of the present invention;

Fig. 5 is a schematic structural diagram of an embodiment of the data storage system for distributed storage of the present invention.

Detailed description of the invention

To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.

A data storage method for distributed storage provided according to an embodiment of the present invention, as shown in Fig. 1, comprises:

monitoring the working state of each storage device in a distributed storage cluster;

when a failed storage device is detected, determining at least one storage array in which the failed storage device is located, the remaining surviving storage devices in the at least one storage array, and all data sets associated with the at least one storage array;

generating a first queue and a second queue based on user access to at least one of the data sets, wherein the first queue corresponds to the data sets that have not been accessed and the second queue corresponds to the data sets that have been accessed;

selecting an available storage device from the cluster to replace the failed storage device; and

migrating the data sets from the remaining surviving storage devices in the at least one storage array to the replacement available storage device, with the second queue taking priority over the first queue.

In distributed storage, each data set is stored on N storage devices in the cluster, where N is a constant. In a preferred embodiment, N = 3.

In an optional embodiment, each data set is stored on 3 storage devices in the cluster, and a mapping list from data sets to their storage arrays and a list of available storage devices in the cluster are established. An available storage device may be one that is already in use but still has free space, or one that has not been used at all.

In an optional embodiment, the first queue and the second queue may be obtained as follows. After determining the at least one storage array in which the failed storage device is located, the remaining surviving storage devices in the at least one storage array, and all data sets associated with the at least one storage array, a first queue containing all of those data sets and an empty second queue are generated. If a user accesses at least one of the data sets, the accessed data set is added to the second queue and, once added, deleted from the first queue. Both queues store the names of the data sets, not the data sets themselves.
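The queue construction described above can be sketched in a few lines. This is a minimal illustration rather than the patented implementation; the function name `build_queues` and the labels `dg1` to `dg5` are placeholders chosen for the sketch.

```python
from collections import deque


def build_queues(affected, accessed):
    """Split data-set names into (first_queue, second_queue).

    first_queue holds the names of data sets users have not accessed;
    second_queue holds the names of data sets users have accessed.
    Only names are queued, never the data itself.
    """
    first_queue = deque(affected)   # starts with every affected data set
    second_queue = deque()          # starts empty
    for name in accessed:
        if name in first_queue:
            second_queue.append(name)   # add to the second queue first,
            first_queue.remove(name)    # then delete from the first queue
    return first_queue, second_queue


first_queue, second_queue = build_queues(
    ["dg1", "dg2", "dg3", "dg4", "dg5"], accessed=["dg1", "dg4"]
)
print(list(first_queue), list(second_queue))
```

Adding the name to the second queue before removing it from the first mirrors the order given in the text, so a name is never absent from both queues.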

While data sets are being migrated from the remaining surviving storage devices in the at least one storage array to the replacement available storage device, a user may request access to one of the data sets. If that data set is in the first queue, it is moved to the second queue; if it is already in the second queue, it is moved to the head of the second queue and migrated first. Within the second queue, the more times a data set has been accessed, the nearer it is to the head.
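One way to realize this rule is sketched below. It assumes the second queue is kept sorted by access count, highest first, with ties broken by arrival order; the text does not specify tie-breaking, so that part is an assumption, as are the names `on_access` and `hits`.

```python
from collections import Counter


def on_access(name, first_queue, second_queue, hits):
    """Update both queues in place when a user accesses data set `name`."""
    if name not in first_queue and name not in second_queue:
        return  # already migrated, or not affected by the failure
    hits[name] += 1
    if name in first_queue:
        first_queue.remove(name)
        second_queue.append(name)
    # Keep the second queue ordered by access count, highest first.
    # list.sort is stable, so equally hot data sets keep their order.
    second_queue.sort(key=lambda n: -hits[n])


first_queue = ["dg2", "dg3", "dg5"]
second_queue = ["dg1", "dg4"]
hits = Counter({"dg1": 1, "dg4": 1})

on_access("dg3", first_queue, second_queue, hits)  # first access: promoted
on_access("dg3", first_queue, second_queue, hits)  # second access: moves to the head
print(second_queue)
```

A plain move-to-head policy and a count-ordered policy agree in this case; they differ only when a data set with fewer total accesses is touched again, and the count-ordered variant is the one chosen here.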

In some optional embodiments, the storage device may be any of various kinds of memory, such as RAM or ROM, or another storage medium capable of storing data, such as a hard disk or floppy disk.

In some optional embodiments, each storage device holds at least one data set, which improves the utilization of the storage device and reduces storage cost.

In some optional embodiments, each data set is stored on 3 disks in the cluster, forming a mapping list from data sets to their storage arrays and a list of available disks in the cluster. Part of the mapping list is shown in Table 1 below, and part of the available disk list is shown in Table 2 below.

Table 1:

Data set      Storage array
dg1, dg2      (d4, d666, d77)
dg3, dg4      (d4, d8, d666)
dg5           (d4, d123, d10)

Table 2:

Available disk list
d110
d20
d456
d77

In this embodiment, "dg" followed by a number labels a data set and "d" followed by a number labels a disk; the present disclosure places no limitation on this naming.

In some optional embodiments, the storage devices in a storage array are ordered by position. On the one hand, the storage device at the front of the storage array is responsible for receiving data sets being written and forwarding them to the other storage devices in the same array. On the other hand, when a storage device fails, the frontmost surviving storage device is used to migrate data sets to the new storage device.

In some optional embodiments, the working state of the disks in the mapping list may be monitored by continuously performing read and write operations on each disk, by performing one read/write operation on each disk every 20 s, or by using an existing monitoring tool such as smartmontools.
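A read/write probe of this kind might look like the following sketch. It assumes each disk is mounted at its own directory; the function names and the mount-point mapping are hypothetical, and a read that follows a write may be served from the operating system's cache rather than the device, so a production monitor would need a stricter check.

```python
import os
import tempfile


def probe_disk(mount_point: str, payload: bytes = b"health-check") -> bool:
    """Return True if a small write/read round trip on `mount_point` succeeds."""
    try:
        fd, path = tempfile.mkstemp(dir=mount_point)
        try:
            with os.fdopen(fd, "wb") as f:
                f.write(payload)
                f.flush()
                os.fsync(f.fileno())  # push the write down to the device
            with open(path, "rb") as f:
                return f.read() == payload
        finally:
            os.remove(path)           # never leave probe files behind
    except OSError:
        return False                  # any I/O error counts as a failed probe


def find_failed(mount_points: dict) -> list:
    """Return the names of disks whose probe failed."""
    return [name for name, mp in mount_points.items() if not probe_disk(mp)]
```

A monitor could call `find_failed` in a loop with a 20-second sleep between rounds to match the interval mentioned above.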

When a disk fails, for example disk d4, Table 1 is used to determine the storage arrays in which the failed disk d4 is located: (d4, d666, d77), (d4, d8, d666), and (d4, d123, d10). The data sets associated with storage array (d4, d666, d77) are dg1 and dg2; those associated with (d4, d8, d666) are dg3 and dg4; and the one associated with (d4, d123, d10) is dg5. All of these data sets are traversed to generate a first queue Q1 of data to be migrated and an empty second queue Q2, where Q1 = {dg1, dg2, dg3, dg4, dg5}.
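The lookup against Table 1 can be sketched as follows. The dictionary layout and the function name `affected_by` are assumptions made for illustration; only the disk and data set labels come from the tables.

```python
# Mapping list from Table 1: data set name -> storage array (ordered disks).
mapping = {
    "dg1": ("d4", "d666", "d77"),
    "dg2": ("d4", "d666", "d77"),
    "dg3": ("d4", "d8", "d666"),
    "dg4": ("d4", "d8", "d666"),
    "dg5": ("d4", "d123", "d10"),
}


def affected_by(failed, mapping):
    """Return (arrays, survivors, data_sets) touched by the failed disk."""
    arrays, survivors, data_sets = [], [], []
    for name, array in mapping.items():
        if failed not in array:
            continue
        data_sets.append(name)
        if array not in arrays:
            arrays.append(array)
        for disk in array:
            if disk != failed and disk not in survivors:
                survivors.append(disk)
    return arrays, survivors, data_sets


arrays, survivors, data_sets = affected_by("d4", mapping)
print(arrays, survivors, data_sets)
```

The returned `data_sets` list is exactly the content of the initial first queue.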

Suppose the user then requests access to data sets dg1 and dg4. These two data sets are deleted from the first queue Q1 and moved to the second queue Q2, so that Q2 = {dg1, dg4}.

Both the first queue and the second queue store the names of the data sets, not the data sets themselves.

If, during migration, the user requests access to data set dg4 again, dg4 is already in the second queue but not at its head, so it is moved to the head of the queue and migrated first.

In some optional embodiments, each data set is deleted from its queue once its migration completes, and the first queue and the second queue are deleted after all data sets have been migrated.
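The migration loop implied here can be sketched as below. It is single-threaded for clarity, and `migrate` is a hypothetical callback standing in for the actual copy of one data set; the demo callback simulates a user access that arrives while dg1 is being copied.

```python
def drain(first_queue, second_queue, migrate):
    """Migrate every queued data set, always serving the second queue first."""
    order = []
    while first_queue or second_queue:
        # Re-check the second queue before every pick, because user
        # accesses during migration may have promoted more data sets.
        queue = second_queue if second_queue else first_queue
        name = queue[0]
        migrate(name)  # hypothetical callback that copies one data set
        # Delete the name only after its migration has completed.
        for q in (first_queue, second_queue):
            if name in q:
                q.remove(name)
        order.append(name)
    return order


first_queue = ["dg2", "dg3", "dg5"]
second_queue = ["dg1", "dg4"]


def migrate(name):
    # Simulate a user access to dg5 arriving while dg1 is being copied.
    if name == "dg1" and "dg5" in first_queue:
        first_queue.remove("dg5")
        second_queue.append("dg5")


order = drain(first_queue, second_queue, migrate)
print(order)
```

Once `drain` returns, both queues are empty and can be discarded.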

In some optional embodiments, the data sets in the first queue may be sorted by name or by size, or according to the user's specific needs. The present invention places no limitation on the order of data sets in the first queue, which is not described further here.

In some optional embodiments, three different available disks may be selected from the available disk list to replace the failed disk d4 in each of the three storage arrays, giving for example (d110, d666, d77), (d20, d8, d666), and (d456, d123, d10). Alternatively, a single available disk may replace the failed disk d4 in all three storage arrays, giving for example (d110, d666, d77), (d110, d8, d666), and (d110, d123, d10).
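Both replacement strategies can be sketched with one function. The rule that a spare must not already belong to the array it joins is an assumption added here (the available disk list in Table 2 contains d77, which is already a member of one array); the function and parameter names are illustrative.

```python
def replace_failed(arrays, failed, available, share_one_spare=False):
    """Return new arrays with `failed` swapped for a spare, keeping its position."""
    pool = list(available)
    result = []
    if share_one_spare:
        # One spare for all arrays: it must not already be in any of them.
        in_use = {d for array in arrays for d in array}
        spare = next(d for d in pool if d not in in_use)
    for array in arrays:
        if not share_one_spare:
            # A different spare per array, skipping disks already in this array.
            spare = next(d for d in pool if d not in array)
            pool.remove(spare)
        result.append(tuple(spare if d == failed else d for d in array))
    return result


arrays = [("d4", "d666", "d77"), ("d4", "d8", "d666"), ("d4", "d123", "d10")]
available = ["d110", "d20", "d456", "d77"]

print(replace_failed(arrays, "d4", available))
print(replace_failed(arrays, "d4", available, share_one_spare=True))
```

Spreading the replacement across several spares lets the three arrays rebuild in parallel onto different disks, while a single spare keeps the mapping list simpler; the text allows either.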

In some optional embodiments, a new available disk, for example disk d110, may be selected from the available disk list to replace disk d4 immediately after d4 is determined to have failed. Alternatively, a new disk may be selected from the available disk list to replace d4 only if d4 has not been repaired within a certain period, for example 15 minutes, after it is determined to have failed. The mapping list is updated after the failed disk is replaced.

Referring to Fig. 2, when there are two or more remaining surviving storage devices, a storage device recovery order is generated according to the position of each remaining surviving storage device in the storage array. Specifically, the storage device recovery order may be the positional order of the storage devices in the storage array, so that the frontmost storage device is used to migrate data to the new storage device. For example, when disk d4 fails, the storage arrays in which it is located are determined to be (d4, d666, d77), (d4, d8, d666), and (d4, d123, d10). For data sets dg1 and dg2, disk d666 migrates the data sets to the new disk, for example d110, with the second queue taking priority over the first queue. For data sets dg3 and dg4, disk d8 migrates the data sets to the new disk, for example d110, with the second queue taking priority over the first queue. For data set dg5, disk d123 migrates the data set to the new disk, for example d110, with the second queue taking priority over the first queue.
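Choosing the source disk by position reduces to picking the first surviving member of each array. A minimal sketch, with `pick_source` as an illustrative name:

```python
def pick_source(array, failed):
    """Return the frontmost surviving disk in `array`.

    `array` is the original storage array in positional order and
    `failed` is the set of failed disks.
    """
    for disk in array:
        if disk not in failed:
            return disk
    raise LookupError("no surviving disk left in this storage array")


arrays = [("d4", "d666", "d77"), ("d4", "d8", "d666"), ("d4", "d123", "d10")]
sources = [pick_source(array, failed={"d4"}) for array in arrays]
print(sources)
```

The function takes the array as it was before replacement, so the newly inserted disk is never mistaken for a source.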

Referring to Fig. 3, when there are two or more failed storage devices, a storage device recovery order is generated according to the positions in the storage array of the at least two available storage devices that correspond to the failed storage devices. Specifically, the storage device recovery order may be the positional order of the available storage devices in the storage array, with an available storage device nearer the front taking priority over one further back. For example, if disks d4 and d666 in storage array (d4, d666, d77) both fail, disk d110 replaces d4 and disk d20 replaces d666; the relevant data sets are then migrated to disk d110 first, and afterwards to disk d20.

The data storage method is illustrated below, taking failed disk d4 as an example.

The storage arrays in which failed disk d4 is located are (d4, d666, d77), (d4, d8, d666), and (d4, d123, d10). The remaining surviving disks are d666, d77, d8, d123, and d10. The data sets associated with these storage arrays are dg1, dg2, dg3, dg4, and dg5. All of these data sets are traversed to generate a first queue and an empty second queue, where the first queue Q1 = {dg1, dg2, dg3, dg4, dg5}. Disk d110 is selected from the available disk list to replace failed disk d4, so the storage arrays after replacement are (d110, d666, d77), (d110, d8, d666), and (d110, d123, d10).

Suppose the user then requests access to data sets dg1 and dg4. They are deleted from the first queue and moved to the second queue, so the second queue Q2 = {dg1, dg4}.

When the remaining surviving disks migrate data sets to disk d110, the data sets in the second queue are migrated first. If, during migration, the user requests access to data set dg3, dg3 is moved to the second queue, and Q2 becomes Q2' = {dg1, dg4, dg3} (assuming data set dg1 has not yet finished migrating). If the user then requests access to dg3 again, dg3 has been accessed 2 times while dg1 and dg4 have each been accessed once, so dg3 is moved to the head of the queue and Q2' becomes Q2'' = {dg3, dg1, dg4}. The data sets in Q2'' are migrated to disk d110 first; once they are done, the data sets in the first queue are migrated.
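When several disks in one array fail, the order in which their replacements are rebuilt follows array position. A minimal sketch, assuming the replacements are recorded as a mapping from failed disk to spare (the names are illustrative):

```python
def recovery_targets(array, replacements):
    """Return the replacement disks in the order they should be rebuilt.

    `array` is the original storage array in positional order and
    `replacements` maps each failed disk to the spare that takes its slot.
    A spare nearer the front of the array is rebuilt first.
    """
    return [replacements[disk] for disk in array if disk in replacements]


targets = recovery_targets(
    ("d4", "d666", "d77"), replacements={"d666": "d20", "d4": "d110"}
)
print(targets)
```

The result depends only on positions in the array, not on the order in which the failures were recorded.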

A data storage system 1000 for distributed storage provided according to an embodiment of the present invention, as shown in Fig. 4, comprises:

a monitoring module 100, configured to monitor the working state of each storage device in a distributed storage cluster;

a migration data determining module 200, configured to determine, when a failed storage device is detected, at least one storage array in which the failed storage device is located, the remaining surviving storage devices in the at least one storage array, and all data sets associated with the at least one storage array;

a migration queue generation module 300, configured to generate a first queue and a second queue based on user access to at least one of the data sets, wherein the first queue corresponds to the data sets that have not been accessed and the second queue corresponds to the data sets that have been accessed;

a storage repair module 400, configured to select an available storage device from the cluster to replace the failed storage device; and

a data migration module 500, configured to migrate the data sets from the remaining surviving storage devices in the at least one storage array to the replacement available storage device, with the second queue taking priority over the first queue.

This data storage system is used to perform the data storage method described above and achieves the same technical effects as the method.

Fig. 5 is a schematic structural diagram of another data storage system 1200 provided by an embodiment of the present application; the specific embodiments of the present application do not limit the concrete implementation of user equipment 1200. As shown in Fig. 5, user equipment 1200 may include:

a processor 1210, a communications interface 1220, a memory 1230, and a communication bus 1240, wherein:

the processor 1210, the communications interface 1220, and the memory 1230 communicate with one another over the communication bus 1240;

the communications interface 1220 is used to communicate with network elements such as clients;

the processor 1210 is used to execute a program 1232, and may specifically perform the relevant steps of the method embodiments above.

Specifically, the program 1232 may include program code, and the program code includes computer operation instructions.

The processor 1210 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application.

The device embodiments described above are merely illustrative. Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. A person of ordinary skill in the art can understand and implement this without creative effort.

From the description of the embodiments above, a person skilled in the art can clearly understand that each embodiment may be implemented by software plus a necessary general-purpose hardware platform, or of course by hardware. Based on this understanding, the technical solution above, in essence or in the part that contributes over the prior art, may be embodied as a software product. The computer software product may be stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes instructions that cause a computer device (which may be a personal computer, a server, or a network device) to perform the methods described in the embodiments or in parts of the embodiments.

Finally, it should be noted that the embodiments above are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in those embodiments may still be modified, or some of their technical features may be replaced with equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (8)

1. the date storage method for distributed storage, it is characterised in that including:
Each duty storing device in the cluster of monitoring distributed storage;
When there is inefficacy storage device, determine at least one storage at described storage device place of losing efficacy In array, at least one storage array described remaining survival storage device and with described at least one deposit The total data group that storage array is relevant;
Based on user's access at least one data set in described total data group, generate first team Row and the second queue, wherein, be not accessed for data set pair in first queue and described total data group Should, the second queue is with to be accessed for data set in described total data group corresponding;
Available storage is selected to replace the described storage device that lost efficacy from cluster;
Based on the second queue prior to the order of first queue, utilize at least one storage array described surplus Remaining survival storage device available storage after replacing migrates data set.
Date storage method the most according to claim 1, it is characterised in that for the second team Data set in row, accessed number of times is the most, the most forward in the second queue.
3. The data storage method according to claim 1, characterized in that, after determining the at least one storage array in which the failed storage device is located, the remaining surviving storage devices in the at least one storage array, and all data sets related to the at least one storage array, the method further comprises:
when the number of remaining surviving storage devices is two or more, generating a storage device recovery order according to the position of each remaining surviving storage device in the storage array;
wherein migrating the data sets, in the order in which the second queue takes priority over the first queue, from the remaining surviving storage devices in the at least one storage array to the replacement available storage device comprises:
migrating the data sets, in the order in which the second queue takes priority over the first queue and according to the storage device recovery order, from the remaining surviving storage devices in the at least one storage array to the replacement available storage device.
4. The data storage method according to claim 1, characterized in that, after determining the at least one storage array in which the failed storage device is located, the remaining surviving storage devices in the at least one storage array, and all data sets related to the at least one storage array, the method further comprises:
when the number of failed storage devices is two or more, generating a storage device recovery order according to the positions in the storage array of the at least two available storage devices corresponding to the failed storage devices;
wherein migrating the data sets, in the order in which the second queue takes priority over the first queue, from the remaining surviving storage devices in the at least one storage array to the replacement available storage device comprises:
migrating the data sets, in the order in which the second queue takes priority over the first queue and according to the storage device recovery order, from the remaining surviving storage devices in the at least one storage array to the replacement available storage devices.
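The queue generation and migration priority recited in claims 1 and 2 can be sketched as follows. This is a minimal illustration rather than the claimed implementation: the `DataSet` class, the `access_count` field, and the function names are hypothetical, and a data set is assumed to count as "accessed" when its access count is greater than zero.

```python
from dataclasses import dataclass


@dataclass
class DataSet:
    """Hypothetical model of a data set on the affected storage arrays."""
    name: str
    access_count: int = 0  # assumed: number of user accesses observed


def build_queues(data_sets):
    """Split data sets into the first (unaccessed) and second (accessed) queues."""
    first_queue = [d for d in data_sets if d.access_count == 0]
    # Claim 2: the more often a data set was accessed, the nearer the front.
    second_queue = sorted(
        (d for d in data_sets if d.access_count > 0),
        key=lambda d: d.access_count,
        reverse=True,
    )
    return first_queue, second_queue


def migration_order(data_sets):
    """Return data sets in migration order: second queue before first queue."""
    first_queue, second_queue = build_queues(data_sets)
    return second_queue + first_queue


data_sets = [
    DataSet("logs", 0),
    DataSet("video-a", 12),
    DataSet("backup", 0),
    DataSet("video-b", 40),
]
order = [d.name for d in migration_order(data_sets)]
print(order)  # ['video-b', 'video-a', 'logs', 'backup']
```

Here the accessed data sets are restored first, most frequently accessed at the front, and the unaccessed data sets follow in their original order.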
5. A data storage system for distributed storage, characterized by comprising:
a monitoring module, configured to monitor the working state of each storage device in a distributed storage cluster;
a migration data determination module, configured to determine, when a failed storage device exists, at least one storage array in which the failed storage device is located, the remaining surviving storage devices in the at least one storage array, and all data sets related to the at least one storage array;
a migration queue generation module, configured to generate a first queue and a second queue based on user access to at least one data set among all the data sets, wherein the first queue corresponds to the data sets among all the data sets that have not been accessed, and the second queue corresponds to the data sets among all the data sets that have been accessed;
a storage repair module, configured to select an available storage device from the cluster to replace the failed storage device; and
a data migration module, configured to migrate the data sets, in an order in which the second queue takes priority over the first queue, from the remaining surviving storage devices in the at least one storage array to the replacement available storage device.
6. The data storage system according to claim 5, characterized in that, for the data sets in the second queue, the more times a data set has been accessed, the nearer to the front of the second queue it is placed.
7. The data storage system according to claim 5, characterized in that the system further comprises a storage order determination module, configured to generate, when the number of remaining surviving storage devices is two or more, a storage device recovery order according to the position of each remaining surviving storage device in the storage array;
the data migration module is configured to migrate the data sets, in the order in which the second queue takes priority over the first queue and according to the storage device recovery order, from the remaining surviving storage devices in the at least one storage array to the replacement available storage device.
8. The data storage system according to claim 5, characterized in that the system further comprises a storage order determination module, configured to generate, when the number of failed storage devices is two or more, a storage device recovery order according to the positions in the storage array of the at least two available storage devices corresponding to the failed storage devices;
the data migration module is configured to migrate the data sets, in the order in which the second queue takes priority over the first queue and according to the storage device recovery order, from the remaining surviving storage devices in the at least one storage array to the replacement available storage devices.
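The storage device recovery order of claims 3, 4, 7 and 8 might be combined with the queue priority roughly as sketched below. The claims do not spell out how the two orderings interact, so this sketch assumes that data sets are taken in queue order and that, for each data set, devices are handled in ascending order of their position in the storage array. All names (`recovery_order`, `migration_plan`, the slot indices) are hypothetical.

```python
from itertools import product


def recovery_order(device_positions):
    """Order devices by their position (assumed: slot index) in the storage array."""
    return [dev for dev, _ in sorted(device_positions.items(), key=lambda kv: kv[1])]


def migration_plan(second_queue, first_queue, device_positions):
    """List (data_set, device) steps: accessed data sets first, devices by slot."""
    devices = recovery_order(device_positions)
    # product() iterates the data sets in the outer loop and the devices in the
    # inner loop, so queue priority dominates and slot order breaks ties.
    return list(product(second_queue + first_queue, devices))


# Hypothetical surviving devices mapped to their slot in the storage array.
survivors = {"disk-c": 2, "disk-a": 0, "disk-d": 3}
plan = migration_plan(["hot"], ["cold"], survivors)
print(plan[:3])
```

The same `recovery_order` helper could equally be applied to the replacement devices of claims 4 and 8, since both variants order devices by their position in the array.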
CN201610547862.5A 2016-07-12 2016-07-12 Data storage method and system for distributed storage CN106020739A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610547862.5A CN106020739A (en) 2016-07-12 2016-07-12 Data storage method and system for distributed storage

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610547862.5A CN106020739A (en) 2016-07-12 2016-07-12 Data storage method and system for distributed storage

Publications (1)

Publication Number Publication Date
CN106020739A true CN106020739A (en) 2016-10-12

Family

ID=57109489

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610547862.5A CN106020739A (en) 2016-07-12 2016-07-12 Data storage method and system for distributed storage

Country Status (1)

Country Link
CN (1) CN106020739A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101692227A (en) * 2009-09-25 2010-04-07 中国人民解放军国防科学技术大学 Building method of large-scale and high-reliable filing storage system
CN101692226A (en) * 2009-09-25 2010-04-07 中国人民解放军国防科学技术大学 Storage method of mass filing stream data
CN102193746A (en) * 2010-03-11 2011-09-21 Lsi公司 System and method for optimizing redundancy restoration in distributed data layout environments
US20140331086A1 (en) * 2010-04-26 2014-11-06 Cleversafe, Inc. Prioritizing rebuilding of stored data in a dispersed storage network
CN104813276A (en) * 2012-11-26 2015-07-29 亚马逊科技公司 Streaming restore of a database from a backup system
CN105637487A (en) * 2013-06-13 2016-06-01 数据引力公司 Live restore for a data intelligent storage system

Similar Documents

Publication Publication Date Title
Baker et al. Megastore: Providing scalable, highly available storage for interactive services
US7606844B2 (en) System and method for performing replication copy storage operations
CN105027070B (en) Roll up the security of operation
CN102667709B (en) For providing the system and method for the longer-term storage of data
US8868711B2 (en) Dynamic load balancing in a scalable environment
US8504571B2 (en) Directed placement of data in a redundant data storage system
US8352424B2 (en) System and method for managing replicas of objects in a distributed storage system
US9448731B2 (en) Unified snapshot storage management
US8555106B2 (en) Data migration management apparatus and information processing system
JP3786955B2 (en) Data storage management system for network interconnection processor
US8090792B2 (en) Method and system for a self managing and scalable grid storage
US9720989B2 (en) Dynamic partitioning techniques for data streams
US8572330B2 (en) Systems and methods for granular resource management in a storage network
CN103874980B (en) Mapping in a storage system
JP2012507075A (en) Configuration management in distributed data systems.
US8862847B2 (en) Distributed storage method, apparatus, and system for reducing a data loss that may result from a single-point failure
US20110004683A1 (en) Systems and Methods for Granular Resource Management in a Storage Network
US9424274B2 (en) Management of intermediate data spills during the shuffle phase of a map-reduce job
US20170206141A1 (en) Unified snapshot storage management, using an enhanced storage manager and enhanced media agents
DE202009019149U1 (en) Asynchronous distributed garbage collection for replicated storage clusters
US8918392B1 (en) Data storage mapping and management
JP5730271B2 (en) Network data storage system and data access method thereof
US7788303B2 (en) Systems and methods for distributed system scanning
US8661216B2 (en) Systems and methods for migrating components in a hierarchical storage network
US9448892B2 (en) Systems and methods for migrating components in a hierarchical storage network

Legal Events

Date Code Title Description
PB01 Publication
C06 Publication
SE01 Entry into force of request for substantive examination
C10 Entry into substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20161012