CN108512918A - Data processing method for a heterogeneous distributed storage system - Google Patents

Publication number
CN108512918A
Authority
CN
China
Prior art keywords
node
memory
data
memory node
storage system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810245382.2A
Other languages
Chinese (zh)
Inventor
曹叶文
艾伦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong University
Original Assignee
Shandong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong University
Priority to CN201810245382.2A
Publication of CN108512918A
Legal status: Pending

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/01: Protocols
    • H04L67/10: Protocols in which an application is distributed across nodes in the network
    • H04L67/1097: Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06: Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601: Interfaces specially adapted for storage systems
    • G06F3/0602: Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061: Improving I/O performance
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06: Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601: Interfaces specially adapted for storage systems
    • G06F3/0628: Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0629: Configuration or reconfiguration of storage systems
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06: Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601: Interfaces specially adapted for storage systems
    • G06F3/0668: Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067: Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/01: Protocols
    • H04L67/06: Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data processing method for a heterogeneous distributed storage system. The storage nodes of the distributed storage system are divided into two parts: a set of coded-repair nodes and a set of replica-repair nodes. The original data is encoded and placed on storage nodes drawn from both sets. Using the max-flow min-cut theorem, the basic condition that the nodes must satisfy to reconstruct the original data file is determined. Under that condition, the download cost is made as low as possible, the minimum repair bandwidth point and the minimum storage point are calculated, and reconstruction is completed.

Description

Data processing method for a heterogeneous distributed storage system
Technical field
The present invention relates to a data processing method for a heterogeneous distributed storage system.
Background
With the development of computer networks, the volume of networked data keeps growing, and traditional file storage systems can no longer meet the demand for high capacity, high reliability and high performance. Distributed storage systems are favoured for their good scalability and high reliability. However, the individual nodes that store data in a distributed storage system are unreliable.
To provide a reliable storage service on top of unreliable storage nodes, redundancy must be introduced into the storage system. The simplest way to introduce redundancy is to replicate the original data directly. Replication is simple, but its storage efficiency and system reliability are not high; introducing redundancy through coding improves storage efficiency.
In current storage systems, the coding method is usually an MDS (Maximum Distance Separable) code, which achieves optimal storage-space efficiency. An (n, k) MDS erasure code divides an original file into k equal-sized blocks and generates n mutually independent coded blocks by linear coding. The n blocks are stored on n different nodes and satisfy the MDS property: any k of the n coded blocks suffice to reconstruct the original file. This coding technique plays an important role in providing efficient redundancy for network storage and is particularly suitable for storing large files and for archival backup applications.
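The MDS property described above can be illustrated with a small sketch. The code below is not taken from the patent: it assumes a toy polynomial-evaluation code over the prime field GF(257), and the names `encode`, `decode` and `P` are hypothetical. The k data symbols are used as polynomial coefficients, the n coded symbols are evaluations at n distinct points, and any k of them recover the data by Lagrange interpolation.

```python
from itertools import combinations

P = 257  # small prime chosen for illustration; all arithmetic is in GF(P)


def encode(data, n):
    """Use the k data symbols as polynomial coefficients and evaluate
    the polynomial at x = 1..n, giving n coded (x, y) symbols."""
    return [(x, sum(c * pow(x, i, P) for i, c in enumerate(data)) % P)
            for x in range(1, n + 1)]


def decode(shares, k):
    """Recover the k coefficients from any k coded symbols by
    Lagrange interpolation modulo P."""
    assert len(shares) == k
    coeffs = [0] * k
    for j, (xj, yj) in enumerate(shares):
        basis = [1]   # coefficients of prod_{m != j} (x - xm), lowest degree first
        denom = 1     # prod_{m != j} (xj - xm)
        for m, (xm, _) in enumerate(shares):
            if m == j:
                continue
            new = [0] * (len(basis) + 1)
            for i, b in enumerate(basis):
                new[i] = (new[i] - xm * b) % P      # contribution of -xm * b
                new[i + 1] = (new[i + 1] + b) % P   # contribution of x * b
            basis = new
            denom = denom * (xj - xm) % P
        scale = yj * pow(denom, -1, P) % P          # modular inverse (Python 3.8+)
        for i, b in enumerate(basis):
            coeffs[i] = (coeffs[i] + scale * b) % P
    return coeffs


data = [10, 20, 30]            # k = 3 source symbols, each smaller than P
shares = encode(data, 5)       # n = 5 coded symbols, one per storage node
ok = all(decode(list(s), 3) == data for s in combinations(shares, 3))
print("any 3 of 5 symbols reconstruct the data:", ok)
```

Real storage systems typically use codes over GF(2^8) rather than a prime field; the prime field is used here only to keep the arithmetic short.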
However, as nodes fail or files are lost, the redundancy of the system gradually decreases over time, so a mechanism is needed to maintain it. The erasure codes (EC) discussed in [R. Rodrigues and B. Liskov, "High Availability in DHTs: Erasure Coding vs. Replication", Workshop on Peer-to-Peer Systems (IPTPS), 2005] are efficient in storage overhead, but the communication overhead needed to restore redundancy is comparatively large.
To reduce the bandwidth used during repair, [A. G. Dimakis, P. G. Godfrey, M. J. Wainwright, K. Ramchandran, "Network Coding for Distributed Storage Systems", IEEE Proc. INFOCOM, Anchorage, Alaska, May 2007] applied ideas from network coding theory and proposed regenerating codes (RGC), which also satisfy the MDS property. When a regenerating code is repaired, the new node connects to d of the surviving storage nodes and downloads β units of data from each of them, so the repair bandwidth of an RGC is dβ.
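For orientation, the two extreme operating points of a homogeneous regenerating code can be computed directly. The closed forms below are the well-known minimum-storage (MSR) and minimum-bandwidth (MBR) points from the cited Dimakis et al. paper, not formulas stated in this patent; the function names are hypothetical.

```python
from fractions import Fraction


def msr_point(M, k, d):
    """Minimum-storage point of a homogeneous (n, k, d) regenerating code:
    returns (alpha, gamma) with gamma = d * beta the repair bandwidth."""
    alpha = Fraction(M, k)
    gamma = Fraction(M * d, k * (d - k + 1))
    return alpha, gamma


def mbr_point(M, k, d):
    """Minimum-bandwidth point: per-node storage equals repair bandwidth."""
    alpha = Fraction(2 * M * d, 2 * k * d - k * k + k)
    return alpha, alpha


# Example: file of size M = 4, k = 2, d = 3 helper nodes.
M, k, d = 4, 2, 3
print("MSR (alpha, gamma):", msr_point(M, k, d))
print("MBR (alpha, gamma):", mbr_point(M, k, d))
```

In both cases the repair bandwidth is below the file size M, which is what a plain MDS code would download to rebuild one lost block.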
In practice, however, many distributed storage systems are heterogeneous: the amounts of data that a new node downloads from the d surviving storage nodes it connects to differ from node to node, and so do the download costs of those d nodes. In short, heterogeneous distributed storage systems suffer from excessive repair bandwidth and disk I/O and from high download cost.
Summary of the invention
To solve the above problems, the present invention proposes a data processing method for a heterogeneous distributed storage system. The invention combines heterogeneous regenerating codes with replication, which not only reduces the repair bandwidth effectively but also reduces the download cost of the repair process, together with the disk I/O.
To achieve the above object, the present invention adopts the following technical solution:
A data processing method for a heterogeneous distributed storage system, comprising the following steps:
The storage nodes of the distributed storage system are divided into two parts: a set of coded-repair nodes and a set of replica-repair nodes. The original data is encoded and placed on storage nodes drawn from both sets. Using the max-flow min-cut theorem, the basic condition that the nodes must satisfy to reconstruct the original data file is determined. Under that condition, the download cost is made as low as possible, the minimum repair bandwidth point and the minimum storage point are calculated, and reconstruction is completed.
Further, the set of coded-repair nodes in the distributed storage system is divided into two class subsets. The two classes have different download costs, and nodes of the same class have the same download cost.
Further, the original data of size M is encoded and placed on (n-m+λm) storage nodes: (n-m) coded-repair nodes and m (m ≤ k-1) replica-repair nodes, each of which is kept in λ copies. Each storage node stores α units of data. The original data is reconstructed from the m replica-repair nodes together with any (k-m) coded-repair nodes.
Further, a fraction f_1 of the coded-repair nodes fails permanently. When a node in the coded-repair node set fails, a new coded-repair node selects d_1 live storage nodes from the first class subset and downloads β_1 units of data from each at download cost C_1, and selects d_2 live storage nodes from the second class subset and downloads β_2 units of data from each at download cost C_2. The repair bandwidth is γ = d_1β_1 + d_2β_2 and the total download cost is C_T = C_1d_1β_1 + C_2d_2β_2, where β_1 ≥ β_2 and β_1 = k'β_2 with k' a positive integer.
A fraction f_2 of the replica-repair nodes fails permanently. When a node in the replica-repair node set fails, a new replica-repair node downloads the data from a node storing a replica of the failed node.
Further, when the number of live storage nodes selected from the first class subset is greater than or equal to the total number of storage nodes k (d_1 ≥ k), the basic condition for reconstructing the original data file is:
where M is the size of the original data, k is the total number of storage nodes (the k nodes used for reconstruction), α is the amount of data stored on each storage node, and m is the number of replica-repair nodes.
Further, when the number of live storage nodes selected from the first class subset is greater than or equal to the total number of storage nodes (d_1 ≥ k), the minimum repair bandwidth point is:
Further, when the number of live storage nodes selected from the first class subset is greater than or equal to the total number of storage nodes (d_1 ≥ k), the minimum storage point is:
Further, when the number of live storage nodes selected from the first class subset is less than the total number of storage nodes (d_1 < k), the basic condition for reconstructing the original data file is:
Further, when the number of live storage nodes selected from the first class subset is less than the total number of storage nodes (d_1 < k), the minimum repair bandwidth point is:
Further, when the number of live storage nodes selected from the first class subset is less than the total number of storage nodes (d_1 < k), the minimum storage point is:
Compared with the prior art, the beneficial effects of the present invention are:
Compared with existing regenerating codes, the present invention has several advantages. First, it is better matched to practice in heterogeneous settings. Second, it yields better MBR and MSR points. In addition, it has lower download cost and lower disk I/O than previous research.
Brief description of the drawings
The accompanying drawings, which form a part of this application, provide a further understanding of the application. The illustrative embodiments of the application and their description serve to explain the application and do not unduly limit it.
Fig. 1 shows the relationship between the storage capacity α of each node and the repair bandwidth γ when a failed node is repaired;
Fig. 2 and Fig. 3 show how the average repair bandwidth at the MSR point under the method of the invention varies with the abscissa k' = β_1/β_2;
Fig. 4 shows how the average disk I/O at the MBR point under the method of the invention varies with the abscissa k' = β_1/β_2;
Fig. 5 shows how the ratio of the download cost of the proposed method to that of the earlier method from the literature varies with m.
Detailed description:
The invention is further described below with reference to the drawings and embodiments.
It should be noted that the following detailed description is illustrative and intended to provide further explanation of the application. Unless otherwise indicated, all technical and scientific terms used herein have the same meaning as commonly understood by a person of ordinary skill in the art to which this application belongs.
It should be noted that the terminology used herein is for describing specific embodiments only and is not intended to limit the exemplary embodiments of the application. As used herein, unless the context clearly indicates otherwise, the singular forms are intended to include the plural forms as well. It should further be understood that the terms "comprising" and/or "including", when used in this specification, indicate the presence of features, steps, operations, devices, components and/or combinations thereof.
In the present invention, terms such as "upper", "lower", "left", "right", "front", "rear", "vertical", "horizontal", "side" and "bottom" indicate orientations or positional relationships based on those shown in the drawings. They are relative terms used only to facilitate describing the structural relationship of the components or elements of the invention, do not refer specifically to any component or element, and shall not be construed as limiting the invention.
In the present invention, terms such as "fixed", "connected" and "coupled" shall be understood broadly: the connection may be fixed, integral or detachable, and it may be direct or indirect through an intermediary. Researchers and technicians in the field can determine the specific meaning of these terms in the present invention according to the circumstances; the terms shall not be construed as limiting the invention.
As described in the background, many distributed storage systems are heterogeneous in practice: the amounts of data that a new node downloads from the d surviving storage nodes it connects to differ from node to node, and so do the download costs of those d nodes. To address the high repair bandwidth in this heterogeneous setting, the invention proposes a method that combines heterogeneous regenerating codes with replication, and derives the minimum storage point and the minimum repair bandwidth point for this method. The method not only reduces the repair bandwidth effectively but also reduces the download cost of the repair process, together with the disk I/O.
Specifically, the technical solution adopted by the present invention comprises:
Step 1: The storage nodes of the distributed storage system are divided into two parts: coded-repair nodes {X_i} and replica-repair nodes {Y_j}. The coded-repair nodes {X_i} are further divided into two classes. The two classes have different download costs, and nodes of the same class have the same download cost.
The classification can be based mainly on where the data comes from, for example on different data centers. Specifically, consider a realistic scenario of interacting data centers comprising a local data center and a remote data center. Nodes in different data centers are placed in different classes. When a node fails, the new node downloads β_1 bits from each helper node in the local data center and β_2 bits from each helper node in the remote data center to complete the repair. The download cost generally differs between data centers: it is C_1 when downloading from the local data center and C_2 when downloading from the remote data center. A common case is that the link capacity of the local data center exceeds that of the remote data center, so that β_1 = k'β_2 with k' ≥ 1.
Step 2: The original data of size M is encoded and placed on (n-m+λm) storage nodes: (n-m) nodes {X_i} and m (m ≤ k-1) nodes {Y_j}, each of which is kept in λ copies. Each storage node stores α units of data. The original data is reconstructed from the m nodes Y_j together with any (k-m) nodes X_i (k storage nodes in total). A fraction f_1 (0 < f_1 < 1) of the X_i fails permanently. When a node in X_i fails, a new X_i selects d_1 live storage nodes from the first class and downloads β_1 (β_1 ≤ α) units of data from each at download cost C_1, and selects d_2 live storage nodes from the second class and downloads β_2 (β_2 ≤ α) units of data from each at download cost C_2 (C_2 ≥ C_1). The repair bandwidth is γ = d_1β_1 + d_2β_2 (α ≤ γ) and the total download cost is C_T = C_1d_1β_1 + C_2d_2β_2. In the present invention β_1 ≥ β_2 and β_1 = k'β_2 (in practice k' should be a positive integer). A fraction f_2 (0 < f_2 < 1) of the Y_j fails permanently. When a node in Y_j fails, a new Y_j downloads the data from a node storing a replica of the failed node.
Preferably, f_1 and f_2 are chosen with reference to Table 1 of [A. G. Dimakis, P. G. Godfrey, M. J. Wainwright, K. Ramchandran, "Network Coding for Distributed Storage Systems", IEEE Proc. INFOCOM, Anchorage, Alaska, May 2007]. Of course, other choices may be used in other embodiments.
For a replica-repair node the repair bandwidth is α. Here n, m, λ, k, d_1 and d_2 are non-negative integers, and α, β_1 and β_2 are non-negative real numbers. At both the MBR and MSR points the total storage is S = [n + (λ-1)m]α, the average disk I/O is f_1(n-m)dα + f_2λmα, and the average repair bandwidth is f_1(n-m)γ + f_2λmα.
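The quantities defined in Steps 1 and 2 are plain arithmetic, and a short sketch makes the bookkeeping concrete. The function name `repair_metrics` and the example values for m, λ, α, β_2, C_1, C_2, f_1 and f_2 are assumptions chosen only for illustration; (n, k', d_1, d_2) = (15, 2, 11, 3) matches the parameter model used later in the simulation. The average disk I/O is left out because the text does not define the symbol d that appears in it.

```python
from fractions import Fraction


def repair_metrics(n, m, lam, alpha, d1, d2, beta2, k_prime, C1, C2, f1, f2):
    """Evaluate the repair formulas stated in the text for one parameter set."""
    beta1 = k_prime * beta2                         # beta_1 = k' * beta_2
    gamma = d1 * beta1 + d2 * beta2                 # repair bandwidth of a coded-repair node
    cost = C1 * d1 * beta1 + C2 * d2 * beta2        # total download cost C_T
    storage = (n + (lam - 1) * m) * alpha           # total storage S over n - m + lam*m nodes
    avg_bw = f1 * (n - m) * gamma + f2 * lam * m * alpha  # average repair bandwidth
    return {"beta1": beta1, "gamma": gamma, "cost": cost,
            "storage": storage, "avg_repair_bw": avg_bw}


# Hypothetical example values (not from the patent) except n, k', d1, d2.
r = repair_metrics(n=15, m=4, lam=1, alpha=20, d1=11, d2=3, beta2=1,
                   k_prime=2, C1=1, C2=2,
                   f1=Fraction(1, 10), f2=Fraction(1, 10))
for name, value in r.items():
    print(name, "=", value)
```

With these values β_1 = 2 ≤ α and α = 20 ≤ γ = 25, so the constraints stated in Step 2 hold.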
To better explain how the lower download cost is obtained, the invention is described separately for the two cases d_1 ≥ k and d_1 < k.
Step 3: In the case d_1 ≥ k, by the max-flow min-cut theorem, if the minimum of the min cuts separating the source from the sink is at least the size M of the original data, then there exists a linear network code over a finite field F that allows the sink to recover the original data. From this theorem, the invention obtains the basic condition for reconstructing the original data file in the case d_1 ≥ k:
To simplify the analysis of formula (1), let b_i = (d_1 - (k-1) + i)β_1 + d_2β_2.   (2)
C(α) is a piecewise linear function.
The minimum value of α is determined by C(α) ≥ M. Let α* denote this minimum value of α; then:
Since β_1 = k'β_2, this gives the tradeoff between β_2 and α*, and hence between γ and α*, for the case considered here.
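The max-flow min-cut argument used in this step can be tried out on a toy information flow graph. The sketch below is an assumption-laden illustration, not the patent's heterogeneous graph: it uses a homogeneous setting with four storage nodes of capacity α, one of which has failed, a newcomer (node 5) that downloads β from each of the three survivors, and a data collector DC that reads node 1 and node 5. All node names are hypothetical. The maximum flow from the source S to DC, computed with a plain Edmonds-Karp routine, is the largest file size that this collector can recover.

```python
from collections import defaultdict, deque

INF = float("inf")


def max_flow(cap, s, t):
    """Edmonds-Karp: augment along shortest residual paths until none is left."""
    res = defaultdict(lambda: defaultdict(int))   # residual capacities
    for u in cap:
        for v, c in cap[u].items():
            res[u][v] += c
    flow = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:          # breadth-first search
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow                           # no augmenting path remains
        bottleneck, v = INF, t
        while parent[v] is not None:              # smallest residual on the path
            bottleneck = min(bottleneck, res[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:              # push flow along the path
            u = parent[v]
            res[u][v] -= bottleneck
            res[v][u] += bottleneck
            v = u
        flow += bottleneck


def toy_graph(alpha, beta):
    """Survivors 1..3 (node 4 has failed), newcomer 5, collector DC."""
    cap = defaultdict(dict)
    for i in (1, 2, 3):
        cap["S"][f"in{i}"] = INF
        cap[f"in{i}"][f"out{i}"] = alpha          # node i stores alpha
        cap[f"out{i}"]["in5"] = beta              # newcomer downloads beta from node i
    cap["in5"]["out5"] = alpha                    # newcomer stores alpha
    cap["out1"]["DC"] = INF                       # collector reads nodes 1 and 5
    cap["out5"]["DC"] = INF
    return cap


# alpha = 2, beta = 1: the flow is 4, so a file of size M = 4 is recoverable.
print(max_flow(toy_graph(alpha=2, beta=1), "S", "DC"))
```

Raising α without raising β does not help beyond a point, because the cut through the download edges into the newcomer stays at a multiple of β; this is the storage versus repair bandwidth tradeoff that the condition C(α) ≥ M captures.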
Step 4: The minimum repair bandwidth point (MBR) for this case is obtained by calculation:
Step 5: The minimum storage point (MSR) for this case is obtained by calculation:
Step 6: For comparison with the prior art, existing literature 1 [S. Akhlaghi, A. Kiani, M. R. Ghanavati, "Cost-bandwidth tradeoff in distributed storage systems", Computer Communications 2010; 33(17): 2105-2115, doi:10.1016/j.comcom.2010.07.022] is preferably chosen as the reference in order to highlight the advantages of the method of the invention; other embodiments are not limited to this choice.
The aim is to show that, in the case d_1 ≥ k, the download cost at the minimum repair bandwidth point of the method of the invention is lower than that of the method of existing literature 1.
Denote the method of the invention by RRMBR and RRMSR and the method of the literature by GMBR and GMSR; the download cost ratios are η_MBR and η_MSR.
Note that η_MBR = 1 when m = 0 and when m = 2k-1. When η_MBR < 1, the download cost of RRMBR is lower than that of GMBR. η_MBR < 1 holds for 0 < m < 2k-1, in which case the method of the invention gives a lower download cost.
Note that η_MSR = 1 when m = 0. For η_MSR < 1 to hold, η_MSR(m) must be a decreasing function of m. In the case d_1 ≥ k, (d_1k' + d_2 - kk' + k')k' > 0, so the method of the invention gives a lower download cost.
Step 7: In the case d_1 < k, by the max-flow min-cut theorem, if the minimum of the min cuts separating the source from the sink is at least the size M of the original data, then there exists a linear network code over a finite field F that allows the sink to recover the original data.
From this theorem, the invention obtains the basic condition for reconstructing the original data file in the case d_1 < k:
To simplify the analysis of formula (5), let h(x, y) = d_1 + d_2 - k + x + yk'.   (6)
C(α) is a piecewise linear function.
The minimum value of α is determined by C(α) ≥ M. Let α* denote this minimum value of α; then:
where:
g_1(i) = i(2d - 2k + 2m + i + 1)
g_2(i) = (i+1)(2d_2 + ik')
β_2min can therefore be calculated:
β_2min = f_2(d_1 - 1)
Since β_1 = k'β_2, this gives the tradeoff between β_2 and α*, and hence between γ and α*, for the case considered here.
Step 8: The minimum repair bandwidth point (MBR) for this case is obtained by calculation:
Step 9: The minimum storage point (MSR) for this case is obtained by calculation:
Step 10: The aim is to show that, in the case d_1 < k, the download cost at the minimum repair bandwidth point of the method of the invention is lower than that of the method of existing literature 1. Denote the method of the invention by RRMBR and RRMSR and the method of the literature by GMBR and GMSR; the download cost ratios are η_MBR and η_MSR.
Analysis of formula (12) shows that η_MBR is always less than 1, so the download cost of this method is lower than that of the method of existing literature 1.
Analysis of formula (13) shows that η_MBR is less than 1 when 0 < m < 2k-1-d, so under this condition the download cost of this method is lower than that of the method proposed in existing literature 1.
To further verify the scheme proposed by the invention, simulation experiments can be carried out in MATLAB. Of course, those skilled in the art may use other software for the simulation.
In this experiment, the parameter model is assumed to be (n, k, k', d_1, d_2) = (15, 10, 2, 11, 3).
Fig. 1 shows the relationship between the storage capacity α of each node and the repair bandwidth γ when a failed node is repaired. The case m = 0 is the method of existing literature 1. As Fig. 1 shows, within the range of m that is meaningful in practice, better MBR and MSR points are obtained as m increases.
Fig. 2 and Fig. 3 show how the average repair bandwidth at the MSR point under the method of the invention varies with the abscissa k' = β_1/β_2. In Fig. 2, m = 0 is the conventional scheme, and the average repair bandwidth decreases visibly as m increases. Fig. 3 shows that the average repair bandwidth increases somewhat as the number of copies λ increases, so λ should be kept small; λ = 1 is the most suitable value.
Fig. 4 shows how the average disk I/O at the MBR point under the method of the invention varies with the abscissa k' = β_1/β_2. The average disk I/O is reduced effectively as m increases.
Fig. 5 shows how the ratio of the download cost of the proposed method to that of the earlier method from the literature varies with m. The method of the invention has a lower download cost for m between 0 and 9, and the download cost is lowest when m = 4.
In summary, the present invention has several advantages over existing methods. First, it is better matched to practice in heterogeneous settings. Second, it yields better MBR and MSR points. In addition, it has lower download cost and lower disk I/O than previous research.
The foregoing describes only preferred embodiments of the application and is not intended to limit it. For those skilled in the art, the application may have various modifications and variations. Any modification, equivalent replacement or improvement made within the spirit and principles of the application shall fall within its scope of protection.
Although specific embodiments of the invention are described above with reference to the drawings, they do not limit the scope of protection of the invention. Those skilled in the art should understand that various modifications or variations that can be made on the basis of the technical solution of the invention without creative effort remain within the scope of protection of the invention.

Claims (10)

1. a kind of data processing method of heterogeneous distributed storage system, it is characterized in that:Include the following steps:
Memory node in distributed memory system is divided into coding and repairs node set and copy reparation node set two parts, It is put into after initial data is encoded in the memory node that multiple codings repair node and copy reparation node composition, according to maximum Minimal cut theorem is flowed, determines that the primary condition of each node satisfaction reconstruct raw data file obtains under the limitation of the primary condition The low download cost of the amount of exhausting is calculated minimum reparation bandwidth point and minimum memory point, completes reconstruct.
2. a kind of data processing method of heterogeneous distributed storage system as described in claim 1, it is characterized in that:It will be distributed Coding in storage system repairs node set and is divided into two class unit sets, has different download costs per one kind and same class Node have identical download cost.
3. a kind of data processing method of heterogeneous distributed storage system as described in claim 1, it is characterized in that:It is to size It is put into (n-m+ λ m) a memory node after the initial data coding of M, wherein (n-m) a coding repairs node and number of copies is equal Node is repaired for a copies of m (m≤k-1) of λ;The size of data of each memory node storage is α;Initial data is secondary using m This reparation node and arbitrary (k-m) a coding repair node to complete to reconstruct.
4. a kind of data processing method of heterogeneous distributed storage system as claimed in claim 3, it is characterized in that:Coding is repaired The ratio of the eventual failure of node set is f1, when coding repairs a node failure in node set, new coding It repairs in node set and selects d from first kind unit set1A effective memory node downloading data is downloaded from each node Size of data is β1, download cost is C1, d is selected from first kind unit set2A effective memory node downloading data, from every The size of data that a node is downloaded is β2, download cost is C2, repair bandwidth γ=d1β1+d2β2, total download cost is CT= C1d1β1+C2d2β2, and β1≥β2, while having β1=k ' β2, k ' takes positive integer value.
5. a kind of data processing method of heterogeneous distributed storage system as claimed in claim 3, it is characterized in that:Copy reparation The ratio of eventual failure is f in node set2, when copy repairs a node failure in node set, new copy Repair replica node downloading data of the node set from a storage failure.
6. a kind of data processing method of heterogeneous distributed storage system as claimed in claim 3, it is characterized in that:Work as the first kind When the effective memory node number selected in unit set is more than or equal to memory node sum, meet the base of reconstruct raw data file This condition is:
Wherein, M is initial data capacity, and k is memory node sum, and α is the size of data of each memory node storage, and m is pair This reparation number of nodes.
7. a kind of data processing method of heterogeneous distributed storage system as claimed in claim 3, it is characterized in that:Work as the first kind When the effective memory node number selected in unit set is more than or equal to memory node sum, minimum repairs bandwidth point and is:
8. a kind of data processing method of heterogeneous distributed storage system as claimed in claim 3, it is characterized in that:Work as the first kind When the effective memory node number selected in unit set is more than or equal to memory node sum, minimum memory point is:
9. The data processing method of a heterogeneous distributed storage system as claimed in claim 3, characterized in that: when the number of effective memory nodes selected from the first-type unit set is less than the total number of memory nodes, the basic condition for reconstructing the original data file is:
10. The data processing method of a heterogeneous distributed storage system as claimed in claim 3, characterized in that: when the number of effective memory nodes selected from the first-type unit set is less than the total number of memory nodes, the minimum repair bandwidth point is:
or, when the number of effective memory nodes selected from the first-type unit set is less than the total number of memory nodes, the minimum storage point is:
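The repair bandwidth and total download cost recited in claim 4 can be illustrated with a minimal Python sketch. The class name `RepairPlan`, its field names, and the sample values are assumptions made for illustration and do not appear in the patent; C1 and C2 are read here as costs per unit of downloaded data, which is what the formula for the total download cost implies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RepairPlan:
    """Hypothetical download plan for repairing one failed coding repair node."""
    d1: int        # number of helper nodes in the first group
    beta1: float   # data downloaded from each first-group helper
    c1: float      # assumed cost per unit of data, first group
    d2: int        # number of helper nodes in the second group
    beta2: float   # data downloaded from each second-group helper
    c2: float      # assumed cost per unit of data, second group

    def __post_init__(self):
        # Claim 4 requires beta1 >= beta2 and beta1 = k' * beta2, k' a positive integer.
        if self.beta2 <= 0:
            raise ValueError("beta2 must be positive")
        ratio = self.beta1 / self.beta2
        if ratio < 1 or abs(ratio - round(ratio)) > 1e-9:
            raise ValueError("beta1 must be a positive integer multiple of beta2")

    def repair_bandwidth(self) -> float:
        # gamma = d1*beta1 + d2*beta2
        return self.d1 * self.beta1 + self.d2 * self.beta2

    def total_download_cost(self) -> float:
        # C_T = C1*d1*beta1 + C2*d2*beta2
        return self.c1 * self.d1 * self.beta1 + self.c2 * self.d2 * self.beta2


# Illustrative values only: 3 helpers at beta1 = 2, 2 helpers at beta2 = 1 (k' = 2).
plan = RepairPlan(d1=3, beta1=2.0, c1=1.0, d2=2, beta2=1.0, c2=4.0)
print(plan.repair_bandwidth(), plan.total_download_cost())
```

With these sample values the sketch evaluates γ = 3·2 + 2·1 and C_T = 1·3·2 + 4·2·1, showing how drawing more data from the cheaper group lowers the total cost for a given repair bandwidth.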

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810245382.2A CN108512918A (en) 2018-03-23 2018-03-23 The data processing method of heterogeneous distributed storage system

Publications (1)

Publication Number Publication Date
CN108512918A true CN108512918A (en) 2018-09-07

Family

ID=63378347

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810245382.2A Pending CN108512918A (en) 2018-03-23 2018-03-23 The data processing method of heterogeneous distributed storage system

Country Status (1)

Country Link
CN (1) CN108512918A (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104113562A (en) * 2013-04-17 2014-10-22 王磊 Distributed data storage and recovery system based on network coding and method thereof
US20160026543A1 (en) * 2014-07-24 2016-01-28 At&T Intellectual Property I, L.P. Distributed Storage of Data

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
SOROUSH AKHLAGHI等: ""Cost-bandwidth tradeoff in distributed storage system"", 《COMPUTER COMMUNICATIONS》 *
丁炳辰等: ""与副本结合的部分再生码"", 《计算机科学》 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109445990A (en) * 2018-10-29 2019-03-08 哈尔滨工业大学(深圳) MDS buffering scheme based on double rendition
CN109828723A (en) * 2019-02-13 2019-05-31 山东大学 A kind of distributed memory system and its precise information restorative procedure and device
CN109828723B (en) * 2019-02-13 2020-05-05 山东大学 Distributed storage system and accurate data restoration method and device thereof
CN111971945A (en) * 2019-04-03 2020-11-20 东莞理工学院 Rack sensing regeneration code for data center
CN111131457A (en) * 2019-12-25 2020-05-08 上海交通大学 Capacity and bandwidth compromise method and system for heterogeneous distributed storage
CN111131457B (en) * 2019-12-25 2021-11-30 上海交通大学 Capacity and bandwidth compromise method and system for heterogeneous distributed storage
CN116610645A (en) * 2023-07-17 2023-08-18 山东管理学院 Data distributed storage method and system based on heterogeneous regenerated code conversion
CN116610645B (en) * 2023-07-17 2023-12-05 山东管理学院 Data distributed storage method and system based on heterogeneous regenerated code conversion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 2018-09-07)