CN104008152B - Architecture method for a distributed file system supporting massive data access - Google Patents

Info

Publication number
CN104008152B
CN104008152B CN201410216506.6A CN201410216506A CN104008152B CN 104008152 B CN104008152 B CN 104008152B CN 201410216506 A CN201410216506 A CN 201410216506A CN 104008152 B CN104008152 B CN 104008152B
Authority
CN
China
Prior art keywords
node
file
motion
distributed
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410216506.6A
Other languages
Chinese (zh)
Other versions
CN104008152A (en)
Inventor
董敏
金泽豪
毕盛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT
Priority to CN201410216506.6A
Publication of CN104008152A
Application granted
Publication of CN104008152B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G06F16/134 Distributed indices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems

Abstract

The invention discloses an architecture method for a distributed file system that supports massive data access. The method is based on a distributed hash table (DHT): the node to access is obtained by hashing the file path. The design is fully distributed and decentralized, and a new node can join the cluster after a few rounds of communication. Nodes address one another with the Kademlia algorithm, which partitions the routing table and computes the distance between nodes by XOR, so that a request is redirected to the nearest neighbor. The PaxosLease algorithm elects a leader to handle the operations mapped to a node, which solves the consistency problem. The actual file data is stored in fixed-size blocks and redundantly backed up on several nodes, which meets the needs of data security and distributed computation. A system built this way can significantly improve processing efficiency when handling massive numbers of files, and also performs well in environments where latency requirements are relatively low.

Description

Architecture method for a distributed file system supporting massive data access
Technical field
The present invention relates to the field of distributed file system research, and in particular to an architecture method for a distributed file system that supports massive data access.
Background
With the development of Internet technology, "cloud computing" is receiving more and more attention. It is a new, user-oriented service concept formed by fusing established technologies such as distributed computing, parallel computing, utility computing, network storage, virtualization, and load balancing. "Cloud storage" is among the cloud services closest to ordinary Internet users.
In early distributed file systems, files and their metadata were not redundantly backed up: once a server failed, the files stored on it became unavailable. As the number of files grew, such systems also became larger, hard to scale, and hard to manage. Modern distributed file systems pay more attention to the metadata distribution strategy. Separating file metadata from data storage improves concurrency and service availability, and makes full use of the disk I/O of the machines in the cluster that actually store the data.
Common distributed file systems today include GFS, HDFS, Lustre, and MogileFS, each suited to a different field. The most active is HDFS on Hadoop, whose architecture is shown in Fig. 8. It targets distributed computing and uses a single metadata server. The system is simple and suits large files: files written by appending often reach hundreds or thousands of GB, and files are stored in blocks. For distributed data processing and computation scenarios HDFS is adequate, and there are many successful deployments. However, its single master node easily becomes a bottleneck and is a single point of failure. MogileFS supports reading and writing large numbers of small files and replicates files automatically, but it does not support random reads and writes, depends heavily on a database, and also has a single point of failure. Lustre uses object storage technology and suits reading and writing large files; it splits large files into fragments and relies on RAID on the storage nodes for reliability, so the system itself does not provide redundant backup with multiple replicas.
Summary of the invention
The main object of the present invention is to overcome the shortcomings and deficiencies of the prior art by providing an architecture method for a distributed file system that supports massive data access. The system draws on the strengths of various distributed file systems; its decentralized architecture and redundant backup mechanism provide upper layers with a safe, reliable, and efficient distributed file access service and access to massive data.
The object of the present invention is achieved by the following technical scheme. An architecture method for a distributed file system supporting massive data access comprises the following steps:
(1) Use a non-blocking network communication framework; on Linux, use the epoll selector. This keeps performance high under a large number of connections and heavy I/O;
(2) Use a simple and efficient remote procedure call (RPC, Remote Procedure Call) mechanism based on dynamic proxies, which reduces system complexity;
(3) As in a traditional client/server (C/S) architecture, the client accesses the file system through an API. The nodes in the cluster communicate with one another over Ethernet, and each node maintains a routing table, metadata, and file data. The client can operate on files by connecting to any node that has registered the service;
(4) Files are mapped to nodes by a consistent hashing algorithm, so that the distribution of files is independent of the number of nodes and the impact of nodes joining or leaving on the system and on data migration is minimized. The distributed hash table uses the Kademlia algorithm, which minimizes the time spent locating a file;
(5) Large files are split into blocks, and both file data and metadata are backed up on 3 different nodes. When a node goes down, the system can fail over quickly, keeping the data safe and available;
(6) In the fully distributed structure a file has backups on several nodes, so an operation on a file must determine which backup is actually operable. The system uses PaxosLease, an algorithm that quickly elects a leader among several nodes; the leader performs the operation, which is then synchronized to the other backups.
The non-blocking network communication framework in step (1) is MINA, a Java NIO-based library that provides an abstract event-driven API over TCP/UDP. It also has a well-designed filter chain and a multithreaded controller model: packets are quickly encoded and decoded and then handed to the multithreaded controller for processing. A complete RPC call over MINA takes about 0.5 ms.
The traditional model of remote procedure call referred to in step (2) has three layers:
(2-1) Stub/Skeleton layer: the client-side stub (proxy) and the server-side skeleton.
(2-2) Remote Reference layer: handles remote reference behavior.
(2-3) Transport layer: establishes and manages connections and tracks remote objects.
Java's built-in RMI performs excessive internal exception checking, carries unnecessary information during transmission, and generates stubs that complicate code management (see Fig. 11). The dynamic proxy approach (see Fig. 6) instead generates proxy objects dynamically at run time as needed. The method name and parameters of a call are packaged and sent to the server; on receiving the request, the server looks up the registered service entity, invokes the entity's method, and sends the packaged return value and exception back to the client.
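The dynamic proxy flow described above can be sketched in a few lines. This is an illustrative Python sketch, not the patent's Java implementation: the names `ServiceRegistry`, `DynamicProxy`, and `FileService` are hypothetical, `pickle` stands in for whatever packaging format the system uses, and the transport is a plain in-process function call rather than a MINA connection.

```python
import pickle


class ServiceRegistry:
    """Server side: looks up registered service entities and invokes them."""

    def __init__(self):
        self._services = {}

    def register(self, name, entity):
        self._services[name] = entity

    def handle(self, request_bytes):
        # Unpack the request: service name, method name, positional arguments.
        service, method, args = pickle.loads(request_bytes)
        try:
            result = getattr(self._services[service], method)(*args)
            return pickle.dumps(("ok", result))
        except Exception as exc:
            # Package the exception so the client can see the failure.
            return pickle.dumps(("error", repr(exc)))


class DynamicProxy:
    """Client side: builds call stubs at run time instead of generating them."""

    def __init__(self, service, transport):
        self._service = service
        self._transport = transport  # callable taking bytes and returning bytes

    def __getattr__(self, method):
        # Only reached for names that are not real attributes of the proxy.
        def remote_call(*args):
            reply = self._transport(pickle.dumps((self._service, method, args)))
            status, payload = pickle.loads(reply)
            if status == "error":
                raise RuntimeError(payload)
            return payload

        return remote_call


class FileService:
    """Hypothetical service entity registered on the server."""

    def exists(self, path):
        return path == "/data/a.txt"


registry = ServiceRegistry()
registry.register("fs", FileService())
fs = DynamicProxy("fs", registry.handle)  # in-process transport for the sketch
print(fs.exists("/data/a.txt"))
```

No stub class is written or generated for `FileService`; the proxy resolves `exists` only when it is called, which is the property the text contrasts with RMI stubs.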
The metadata in step (3) follows the ideas of Linux file system index nodes (inodes) and GFS. It includes a file's child node information, file size, permission mode, data block information, and so on, and forms a tree structure. Each file block in the system is 64 MB; large files are stored in blocks, which are maintained by the block information linked list in the file metadata. The file operation API includes creating a node, checking existence, creating a directory, deleting, listing directory files, and similar operations found in other operating systems.
The Kademlia algorithm in step (4) works as follows:
(4-1) Machine features (such as the IP address) and file paths are each hashed to obtain an ID. The system uses the fast and robust 64-bit CityHash algorithm, which has a uniform distribution and a low collision rate.
(4-2) IDs are distributed on a ring of size 2^64. To find the node closest to the ID that the current key maps to, the distances to known nodes must be computed. In the Kademlia algorithm, the distance between two IDs is obtained by XOR:
d(x, y) = x ⊕ y
XOR is unidirectional: for any given node x and distance D, there is always exactly one node y such that d(x, y) = D;
(4-3) The Kad routing table is made up of data structures called K-buckets. A K-bucket actually stores <K,V> pair mappings; each K-bucket has an ID and the distance range of the ID values it contains. When enough <K,V> pairs have been inserted, a K-bucket splits; in the final state, a machine's Kad routing table has 64 K-buckets. If a K-bucket is full, entries are replaced using the LRU algorithm, which helps manage the currently active nodes.
(4-4) The Kad routing table is an unbalanced segment binary tree, but the routing table of a single node does not grow too large, and the average time complexity of a query is O(log N). Its operations are insertion, deletion, and finding the entry closest to a given ID value.
The partitioning strategy for large files in step (5) follows that of HDFS: a file larger than 64 MB is split into blocks, each 64 MB by default. Each block is backed up to 3 neighboring nodes. Files are written in append mode by default, and the write process is chained: each node passes the data on to the next node after receiving it. A write is considered successful once at least the first node has verified the data; if a block write fails, the thread that checks data backups initiates synchronization.
The PaxosLease algorithm in step (6) is as follows:
When a proposer (Proposer) puts forward a proposal, the proposal must be approved by more than half of the acceptors (Acceptor) before it passes and is synchronized to all the learners (Learner) that execute it. Acceptors and message carriers are not always on duty (corresponding to node and network failures in a distributed system), so a proposal is considered passed as soon as more than half of the acceptors (1 + n/2) have approved it.
Its constraints are:
P1: An acceptor must accept the first proposal it receives;
P2: Once a proposal with value v is approved, every proposal approved afterwards must have value v. (Every proposal carries a value; for a real-world tax proposal, for example, the value could be the tax rate.)
Approving a value v means that multiple acceptors have accepted it, so P2 can be strengthened:
P2a: Once a proposal with value v is approved, every proposal that any acceptor accepts afterwards must have value v.
Because communication is asynchronous, constraint P2a can conflict with constraint P1. Suppose that after a value v has been approved, a proposer and an acceptor wake from dormancy and the proposer issues a proposal with a new value. By P1 the acceptor should accept it, but by P2a it should not, so P2a and P1 contradict each other in this scenario. The behavior of proposers therefore needs to be constrained:
P2b: Once a proposal with value v is approved, every proposal that any proposer issues afterwards must have value v.
P2b implies P2a and is a stronger constraint, but it is hard to implement. A constraint P2c that implies P2b can be found:
P2c: If a proposal numbered n has value v, then there exists a majority of acceptors such that either none of them has accepted any proposal numbered less than n, or the highest-numbered proposal numbered less than n that they have accepted has value v.
Compared with the prior art, the present invention has the following advantages and beneficial effects:
(1) The invention draws on the strengths of various distributed file systems, such as HDFS file blocking, and on that basis proposes a distributed file system based on hash mapping. Each node serves as a data access node, a metadata storage node, and a relay node for routing queries and redirection. This removes the traditional single point of failure and the pressure of maintaining metadata on a single node, which greatly improves system stability.
(2) The invention is a fully distributed architecture. Each node is an inexpensive PC whose computing and I/O capacity is fully exploited. The impact of a node leaving on data migration and on the system is minimized, and nodes can be added flexibly, giving high scalability.
(3) In the method, the lookup for a file operation uses the Kademlia algorithm and takes logarithmic time, minimizing the time spent on lookup. Operations on the 3 replicas go through a leader elected by PaxosLease, which is highly reliable. Operations in both stages have very low latency.
Brief description of the drawings
Fig. 1 is a schematic diagram of the two-layer system architecture of the embodiment.
Fig. 2 is the RPC communication model of the embodiment.
Fig. 3 is a schematic diagram of the transmission pipeline used when writing file data in the embodiment.
Fig. 4 illustrates one insertion into a K-bucket in the Kademlia algorithm of the embodiment.
Fig. 5 is a schematic diagram of one ID lookup in the Kademlia algorithm of the embodiment.
Fig. 6 is a schematic diagram of the dynamic proxy.
Fig. 7 is a schematic diagram of one round of competition in the PaxosLease algorithm of the embodiment.
Fig. 8 is the architecture diagram of the prior-art Hadoop file system HDFS.
Fig. 9 is the MINA network architecture diagram.
Fig. 10 is the RPC structural framework diagram.
Fig. 11 is a schematic diagram of a call in Java's built-in RPC framework (RMI).
Detailed description of the embodiments
The present invention is described in further detail below with reference to an embodiment and the accompanying drawings, but embodiments of the invention are not limited thereto.
Embodiment 1
The hardware network structure used by the distributed file system supporting massive data access in this embodiment is shown in Fig. 1. It is a two-layer system architecture comprising a client and several servers, where each server contains a name node (NameNode) and a data node (DataNode). As in a traditional C/S architecture, the client accesses the file system through an API, the nodes in the cluster communicate with one another over Ethernet, and each node maintains a routing table, metadata, and file data. The client supports the following operations: a. connect to an arbitrary node; b. connect to a specific server. The client can operate on files by connecting to any node that has registered the service.
The architecture method of this embodiment is based on a distributed hash table: the node to access is obtained by hashing the file path. The system uses a fully distributed, decentralized architecture, and a new node can join the cluster after a few rounds of communication. Nodes address one another with the Kademlia algorithm, which partitions the routing table and computes the distance between nodes by XOR so that a request is redirected to the nearest neighbor. The PaxosLease algorithm elects a leader to handle the operations mapped to a node, which solves the consistency problem. A file's metadata is mapped to a node by hashing the file's absolute path and is stored on that node. Metadata objects are kept directly in memory to serve access requests, while an image is saved on disk for failure recovery. The actual file data is stored in fixed-size blocks and redundantly backed up on several nodes, meeting the needs of data security and distributed computation. The system can significantly improve processing efficiency when handling massive numbers of files, and also performs well in environments where latency requirements are relatively low. The specific steps of the method are described below with reference to the drawings.
First, the embodiment uses MINA, a Java NIO-based library; on Linux, the epoll selector is used. The MINA network framework is shown in Fig. 9.
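The event-driven, non-blocking style that a selector enables can be illustrated outside MINA. The sketch below uses Python's standard `selectors` module, whose `DefaultSelector` is epoll-backed on Linux; it is an analogy to the Java NIO selector underneath MINA, not MINA's API. The socket pair and the `on_readable` handler are hypothetical stand-ins for a real client connection and a protocol handler.

```python
import selectors
import socket

# DefaultSelector picks the most efficient mechanism available; on Linux that is epoll.
selector = selectors.DefaultSelector()

# A connected socket pair stands in for a real client/server TCP connection.
server_side, client_side = socket.socketpair()
server_side.setblocking(False)
client_side.settimeout(1.0)

received = []


def on_readable(conn):
    """Event handler: called only when the selector reports that data is ready."""
    data = conn.recv(4096)
    if data:
        received.append(data)
        conn.send(data.upper())  # reply without ever blocking on recv


# Store the handler as the registration's data so the loop can dispatch to it.
selector.register(server_side, selectors.EVENT_READ, on_readable)

client_side.send(b"ping")
for key, _mask in selector.select(timeout=1.0):
    key.data(key.fileobj)

reply = client_side.recv(4096)

selector.unregister(server_side)
selector.close()
server_side.close()
client_side.close()
```

One thread can watch many registered connections this way, which is why the approach holds up under a large number of connections.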
Second, a simple and efficient remote procedure call (RPC, Remote Procedure Call) mechanism based on dynamic proxies is used to reduce system complexity. The RPC communication model is shown in Fig. 2.
The traditional RPC model has three layers, as shown in Fig. 10: the Stub/Skeleton layer, for the client-side stub (proxy) and the server-side skeleton; the Remote Reference layer, for remote reference behavior; and the Transport layer, for establishing and managing connections and tracking remote objects.
Third, as in a traditional C/S architecture, the client accesses the file system through an API. The nodes in the cluster communicate with one another over Ethernet, and each node maintains a routing table, metadata, and file data. The client can operate on files by connecting to any node that has registered the service.
The metadata follows the ideas of Linux file system index nodes (inodes) and GFS. It includes a file's child node information, file size, permission mode, data block information, and so on, and forms a tree structure. Each file block in the system is 64 MB; large files are stored in blocks, which are maintained by the block information linked list in the file metadata. The file operation API includes creating a node, checking existence, creating a directory, deleting, listing directory files, and similar operations found in other operating systems. For example, the two metadata structures defined in this embodiment, INode and BlockInfo, are as follows: INode contains FsVersion, path, type, mode, createTime, modifyTime, children, size, blockInfos, and other information; BlockInfo contains path, blocklength, offset, seqNum, replica, and other information.
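The two metadata structures can be sketched as plain data classes. The field names are taken from the text; the field types, the default values, and the `append_block` helper are assumptions made for illustration, since the patent does not specify them.

```python
from dataclasses import dataclass, field
from typing import List

BLOCK_SIZE = 64 * 1024 * 1024  # 64 MB block size stated in the text


@dataclass
class BlockInfo:
    path: str          # path of the file the block belongs to
    blocklength: int   # bytes actually stored in this block
    offset: int        # byte offset of the block within the file
    seqNum: int        # position of the block in the file's block list
    replica: List[str] = field(default_factory=list)  # nodes holding a copy (assumed type)


@dataclass
class INode:
    FsVersion: int
    path: str
    type: str          # e.g. "file" or "dir" (assumed encoding)
    mode: int          # permission bits
    createTime: float
    modifyTime: float
    children: List[str] = field(default_factory=list)
    size: int = 0
    blockInfos: List[BlockInfo] = field(default_factory=list)

    def append_block(self, length, replicas):
        """Hypothetical helper: record a newly appended block in the metadata."""
        block = BlockInfo(self.path, length, self.size, len(self.blockInfos), list(replicas))
        self.blockInfos.append(block)
        self.size += length
        return block


inode = INode(FsVersion=1, path="/logs/app.log", type="file", mode=0o644,
              createTime=0.0, modifyTime=0.0)
first = inode.append_block(BLOCK_SIZE, ["node-a", "node-b", "node-c"])
second = inode.append_block(1024, ["node-a", "node-b", "node-c"])
```

A Python list plays the role of the block information linked list here; each appended block records its own offset, so a reader can locate the block that covers any byte position.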
Fourth, files are mapped to nodes by a consistent hashing algorithm. The distributed hash table uses the Kademlia algorithm, which minimizes the time spent locating a file.
The Kademlia algorithm works as follows:
(4-1) Machine features (such as the IP address) and file paths are each hashed to obtain an ID. The system uses the fast and robust 64-bit CityHash algorithm, which has a uniform distribution and a low collision rate.
(4-2) IDs are distributed on a ring of size 2^64. To find the node closest to the ID that the current key maps to, the distances to known nodes must be computed. In the Kademlia algorithm, the distance between two IDs is obtained by XOR:
d(x, y) = x ⊕ y
XOR is unidirectional: for any given node x and distance D, there is always exactly one node y such that d(x, y) = D;
(4-3) The Kad routing table is made up of data structures called K-buckets. A K-bucket actually stores <K,V> pair mappings; each K-bucket has an ID and the distance range of the ID values it contains. When enough <K,V> pairs have been inserted, a K-bucket splits; in the final state, a machine's Kad routing table has 64 K-buckets. If a K-bucket is full, entries are replaced using the LRU algorithm, which helps manage the currently active nodes.
(4-4) The Kad routing table is an unbalanced segment binary tree, but the routing table of a single node does not grow too large, and the average time complexity of a query is O(log N). Its operations are insertion, deletion, and finding the entry closest to a given ID value.
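Steps (4-1) and (4-2) can be made concrete with a short sketch. One assumption is flagged up front: CityHash is not in the Python standard library, so an 8-byte BLAKE2b digest stands in as the 64-bit hash. The helper names `make_id`, `distance`, `closest_node`, and `bucket_index` are hypothetical.

```python
import hashlib

ID_BITS = 64


def make_id(text):
    """Map a machine feature or file path to a 64-bit ID (BLAKE2b stand-in for CityHash)."""
    digest = hashlib.blake2b(text.encode("utf-8"), digest_size=ID_BITS // 8).digest()
    return int.from_bytes(digest, "big")


def distance(x, y):
    """Kademlia distance: d(x, y) = x XOR y."""
    return x ^ y


def closest_node(key_id, node_ids):
    """Return the known node whose ID is nearest to key_id under XOR distance."""
    return min(node_ids, key=lambda node_id: distance(node_id, key_id))


def bucket_index(own_id, other_id):
    """Index of the highest bit in which the two IDs differ (-1 for identical IDs)."""
    return distance(own_id, other_id).bit_length() - 1


nodes = [make_id(ip) for ip in ("10.0.0.1", "10.0.0.2", "10.0.0.3")]
owner = closest_node(make_id("/home/user/report.txt"), nodes)

# Unidirectionality: for a given x and D, y = x XOR D is the only ID at distance D.
x, D = 0b1010, 0b0110
y = x ^ D
```

With 64-bit IDs, `bucket_index` takes 64 distinct values for distinct IDs, which matches the 64 K-buckets of the final routing table in step (4-3).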
One lookup with the Kademlia algorithm is described below with reference to Figs. 4 and 5:
(1) K-bucket splitting
Every machine has a Kad routing table. A K-bucket actually stores <K,V> pair mappings; each K-bucket has an ID and the distance range of the ID values it contains. When enough <K,V> pairs have been inserted, the K-bucket splits. See Fig. 4.
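The LRU replacement that applies to a full K-bucket can be sketched as follows. This is a simplified illustration: the `KBucket` class and its `touch` method are hypothetical names, bucket splitting is left out, and the capacity of 2 in the usage lines is chosen only to keep the example small.

```python
from collections import OrderedDict


class KBucket:
    """A fixed-capacity bucket of <K,V> pairs with least-recently-used replacement."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # node ID -> address, least recently seen first

    def touch(self, node_id, address):
        """Record contact with a node; return the evicted node ID, if any."""
        if node_id in self.entries:
            # Known node: refresh its address and mark it most recently seen.
            self.entries[node_id] = address
            self.entries.move_to_end(node_id)
            return None
        evicted = None
        if len(self.entries) >= self.capacity:
            # Bucket full: drop the least recently seen entry.
            evicted, _ = self.entries.popitem(last=False)
        self.entries[node_id] = address
        return evicted


bucket = KBucket(capacity=2)
bucket.touch(1, "10.0.0.1")
bucket.touch(2, "10.0.0.2")
bucket.touch(1, "10.0.0.1")            # node 1 is active again
evicted = bucket.touch(3, "10.0.0.3")  # node 2 is now the least recently seen
```

Nodes that keep responding stay in the bucket, while silent ones are the first to be replaced.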
(2) ID lookup
Setup:
Node ID    Routing information
0          0, 1, 11, 15
1          1, 2, 10, 15
2          2, 3, 11, 13
3          3, 4, 12, 14
4          4, 5, 12, 13
5          5, 6, 13, 15
6          6, 7, 12, 14
7          7, 8, 10, 12
8          8, 9, 11, 13
9          3, 9, 10, 15
10         0, 6, 10, 11
11         0, 7, 11, 12
12         0, 9, 12, 13
13         1, 8, 13, 14
14         2, 7, 14, 15
15         0, 9, 12, 15
In this embodiment, node 13 is to be looked up starting from node 0. With the table above and Fig. 5, the lookup proceeds as follows:
a) At node 0, the findNear (find neighbors) operation returns the three nodes 0, 11, and 15 as the ones that may know node 13. Node 0 has already been visited and is not visited again.
b) Nodes 0, 11, and 12 are obtained from node 11, and nodes 0, 12, and 15 from node 15. Nodes 0, 11, and 15 have already been visited and are not visited again. The remaining node, 12, is the next hop.
c) Node 13 is obtained from node 12 (a hit), together with the IP address of node 13.
It follows that an ID lookup in the Kad network needs no more than log N RPC requests. As the system runs longer, Kad routing information becomes richer: adjacent nodes know one another's state better, and frequently used distant nodes also come to be trusted and retained. In the ideal case a node lookup completes in 1 to 2 communications, an advantage that other DHT technologies do not have.
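The lookup can be replayed against the routing table above. The sketch below is a simplified, sequential variant: at each step it queries only the closest unvisited node under XOR distance, whereas the walk-through in the text queries nodes 11 and 15 in the same round. The function name `lookup` is hypothetical, and the `ROUTING` dictionary copies the table.

```python
ROUTING = {
    0: [0, 1, 11, 15],  1: [1, 2, 10, 15],  2: [2, 3, 11, 13],  3: [3, 4, 12, 14],
    4: [4, 5, 12, 13],  5: [5, 6, 13, 15],  6: [6, 7, 12, 14],  7: [7, 8, 10, 12],
    8: [8, 9, 11, 13],  9: [3, 9, 10, 15],  10: [0, 6, 10, 11], 11: [0, 7, 11, 12],
    12: [0, 9, 12, 13], 13: [1, 8, 13, 14], 14: [2, 7, 14, 15], 15: [0, 9, 12, 15],
}


def lookup(start, target, routing):
    """Greedy iterative lookup; returns (found node or None, nodes queried in order)."""
    queried = []
    known = {start}
    while True:
        candidates = [node for node in known if node not in queried]
        if not candidates:
            return None, queried
        # Ask the closest node (by XOR distance to the target) not yet queried.
        current = min(candidates, key=lambda node: node ^ target)
        queried.append(current)
        contacts = routing[current]
        if target in contacts:
            return target, queried
        known.update(contacts)


found, path = lookup(0, 13, ROUTING)
print(found, path)
```

In this variant the query goes from node 0 to node 15 and then to node 12, whose routing information contains node 13. Node 11 is never contacted because node 15 is closer to the target.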
Fifth, large files are split into blocks, and both file data and metadata are backed up on 3 different nodes. When a node goes down, the system can fail over quickly, keeping the data safe and available.
The partitioning strategy for large files follows that of HDFS: a file larger than 64 MB is split into blocks, each 64 MB by default. Each block is backed up to 3 neighboring nodes, as shown in Fig. 3. Files are written in append mode by default, and the write process is chained: each node passes the data on to the next node after receiving it. A write is considered successful once at least the first node has verified the data; if a block write fails, the thread that checks data backups initiates synchronization.
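The blocking rule and the chained write can be sketched together. The sketch rests on several assumptions: `StorageNode`, `split_into_blocks`, and `chain_write` are hypothetical names, CRC32 stands in for whatever verification the system performs, and a node failure is modelled by a simple `healthy` flag.

```python
import zlib

BLOCK_SIZE = 64 * 1024 * 1024  # 64 MB default block size


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return the (offset, length) pairs of the blocks covering a file."""
    blocks = []
    offset = 0
    while offset < file_size:
        length = min(block_size, file_size - offset)
        blocks.append((offset, length))
        offset += length
    return blocks


class StorageNode:
    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.blocks = {}

    def store(self, block_id, data, checksum, downstream):
        """Verify and store the block, then pass it down the chain.

        Returns the names of the nodes that stored the block, in chain order.
        """
        if not self.healthy or zlib.crc32(data) != checksum:
            return []  # the chain stops here
        self.blocks[block_id] = data
        if downstream:
            return [self.name] + downstream[0].store(block_id, data, checksum, downstream[1:])
        return [self.name]


def chain_write(block_id, data, replicas):
    """Write one block through the replica chain.

    The write succeeds once the first node has verified the data; replicas
    that did not store it are returned so a background check can sync them.
    """
    checksum = zlib.crc32(data)
    stored_on = replicas[0].store(block_id, data, checksum, replicas[1:])
    needs_sync = [node.name for node in replicas if node.name not in stored_on]
    return len(stored_on) >= 1, needs_sync


chain = [StorageNode("a"), StorageNode("b", healthy=False), StorageNode("c")]
ok, pending = chain_write("blk-0", b"hello", chain)
```

In the usage lines the second node is down, so the write still succeeds on the first node while the other two replicas are left for the backup-checking thread to synchronize.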
Sixth, the system uses PaxosLease, an algorithm that quickly elects a leader among several nodes; the leader performs the operation, which is then synchronized to the other backups. The algorithm is as follows:
When a proposer (Proposer) puts forward a proposal, the proposal must be approved by more than half of the acceptors (Acceptor) before it passes and is synchronized to all the learners (Learner) that execute it. Acceptors and message carriers are not always on duty (corresponding to node and network failures in a distributed system), so a proposal is considered passed as soon as more than half of the acceptors (1 + n/2) have approved it.
The PaxosLease procedure is described below with reference to Fig. 7:
1) The proposer wishes to obtain a lease of T seconds (T < M). It first prepares a proposal number [request.ballotNumber] and sends it to a majority of acceptors.
2) On receiving a request, an acceptor compares the proposal number of the request [request.ballotNumber] with the highest proposal number promised in its local state [state.highestPromised]. If the request's number is lower, the acceptor can ignore the request or send back a rejection. If it is equal or higher, the acceptor constructs a prepare response containing the currently accepted proposal [state.acceptedProposal], which is either empty or the current leader. The acceptor also sets its local highest promised number to the request's proposal number, and sends the highest promised number back to the proposer together with the currently accepted proposal.
3) The proposer examines the prepare responses sent back by the acceptors. If a majority of acceptors reply that their accepted proposal is empty, they can accept a new proposal. The proposer then names itself as the lease owner, that is, the leader, and starts a countdown T; the lease expires after time T. The proposer sends all acceptors a propose request consisting of the countdown T, the proposal number, and the proposal value.
4) On receiving a propose request, an acceptor checks whether the number [request.ballotNumber] is lower than the highest number promised in its local state. If it is lower, the acceptor ignores the request or sends back a rejection. If it is equal or higher, the acceptor accepts the proposal: it sets the highest proposal number, starts the countdown T, and sets the lease owner (leader). It then constructs and sends back a propose response containing the proposal number. When the countdown expires, the lease owner in the local state is set to empty. Unless the system restarts, acceptors do not reset their highest promised number.
5) The proposer examines the propose responses it receives. If a majority of acceptors reply that they accepted the proposal, the proposer holds the lease until the countdown set in step 3 expires. On receiving replies from a majority, the proposer changes its own state to "holds the lease".
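The five steps can be condensed into an in-process sketch. It simplifies in ways the text does not: messages are direct method calls, a logical clock value `now` is passed in instead of running real timers, every acceptor is contacted in both phases, and the names `Acceptor` and `acquire_lease` are hypothetical.

```python
class Acceptor:
    def __init__(self):
        self.highest_promised = 0
        self.accepted = None  # (ballot, owner, expires_at), or None when no lease is held

    def _expire(self, now):
        # Countdown expiry: the lease owner in the local state becomes empty.
        if self.accepted is not None and now >= self.accepted[2]:
            self.accepted = None

    def on_prepare(self, ballot, now):
        """Step 2: promise the ballot and report the currently accepted proposal."""
        self._expire(now)
        if ballot < self.highest_promised:
            return None  # rejection
        self.highest_promised = ballot
        return (ballot, self.accepted)

    def on_propose(self, ballot, owner, duration, now):
        """Step 4: accept the lease unless a higher ballot has been promised."""
        self._expire(now)
        if ballot < self.highest_promised:
            return None  # rejection
        self.accepted = (ballot, owner, now + duration)
        return ballot


def acquire_lease(proposer_id, ballot, duration, acceptors, now):
    """Steps 1, 3 and 5 from the proposer's side; True if the lease was obtained."""
    majority = len(acceptors) // 2 + 1
    promises = [acceptor.on_prepare(ballot, now) for acceptor in acceptors]
    # Count only the acceptors that promised and currently hold no lease.
    open_promises = [p for p in promises if p is not None and p[1] is None]
    if len(open_promises) < majority:
        return False
    accepts = [acceptor.on_propose(ballot, proposer_id, duration, now) for acceptor in acceptors]
    return sum(1 for a in accepts if a is not None) >= majority


acceptors = [Acceptor() for _ in range(3)]
first = acquire_lease("A", ballot=1, duration=10, acceptors=acceptors, now=0)
blocked = acquire_lease("B", ballot=2, duration=10, acceptors=acceptors, now=5)
after_expiry = acquire_lease("B", ballot=3, duration=10, acceptors=acceptors, now=10)
stale = acquire_lease("A", ballot=1, duration=10, acceptors=acceptors, now=30)
```

Proposer B cannot take the lease while A's countdown is running, obtains it once the countdown has expired, and A's later attempt with an old ballot number is rejected because acceptors never lower their highest promised number.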
The embodiment above is a preferred embodiment of the present invention, but embodiments of the invention are not limited by it. Any change, modification, substitution, combination, or simplification made without departing from the spirit and principle of the invention is an equivalent replacement and falls within the scope of protection of the invention.

Claims (8)

1. An architecture method for a distributed file system supporting massive data access, characterized by comprising the following steps:
(1) using a non-blocking network communication framework and, on Linux, the epoll selector;
(2) using remote procedure call based on dynamic proxies, the traditional model of remote procedure call having three layers:
(2-1) a stub/skeleton layer: for the client-side stub/proxy and the server-side skeleton;
(2-2) a remote reference layer: for remote reference behavior;
(2-3) a transport layer: for establishing and managing connections and tracking remote objects;
the dynamic proxy mode in step (2) being: proxy objects are generated dynamically at run time as needed; the method name and parameters of a call are packaged and sent to the server; on receiving the request, the server looks up the registered service entity, invokes the entity's method, and sends the packaged return value and exception to the client;
(3) the client accesses the file system through an API; the nodes in the cluster communicate with one another over Ethernet, and each node maintains a routing table, metadata, and file data; the client operates on files by connecting to any node that has registered the service;
(4) files are mapped to nodes by a consistent hashing algorithm, and the distributed hash table uses the Kademlia algorithm; nodes address one another with the Kademlia algorithm, which partitions the routing table and computes the distance between nodes by XOR so that a request is redirected to the nearest neighbor;
(5) large files are split into blocks, and both file data and metadata are backed up on several different nodes; the blocking of large files being: a file larger than 64 MB is split into blocks, each 64 MB by default; each block is backed up to 3 neighboring nodes; files are written in append mode by default, and the write process is chained, each node passing the data on to the next node after receiving it; a write is considered successful once at least the first node has verified the data; if a block write fails, the thread that checks data backups initiates synchronization;
(6) in the fully distributed structure a file has backups on several nodes; for an operation on a file, the PaxosLease algorithm elects a leader among the nodes, the leader performs the operation, and the result is then synchronized to the other backups.
2. the framework method of the distributed file system according to claim 1 for supporting mass data to access, its feature exist In, the Network Communication Framework of the non-obstruction described in step (1), it is the NIO storehouses MINA based on Java, its offer support TCP/ The event driven API being abstracted on UDP, unpacking is packaged to packet, and gives the processing of Multi-thread control device.
3. The framework method of the distributed file system supporting mass data access according to claim 1, characterized in that the metadata described in step (3) draws on the ideas of the Linux file system index node (inode) and of GFS; it contains a file's child-node information, file size, permission mode, and data block information, and forms a tree structure; each file block in the system is 64 MB in size; large files are stored in blocks, which are maintained through the block information linked list of the file metadata; the file operation API includes creating a node, checking whether a node exists, creating a directory, deleting, and listing the files in a directory.
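The metadata tree and the file operation API of this claim might look roughly like the following Python sketch. The names `Meta` and `Namespace`, the block identifier format `path#index`, and the use of a Python list in place of the block information linked list are assumptions made for illustration.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 64 * 1024 * 1024  # block size stated in the claim (64 MB)


@dataclass
class Meta:
    """Inode-like metadata record; directories hold children, files hold block ids."""
    name: str
    is_dir: bool
    mode: int = 0o644
    size: int = 0
    blocks: list = field(default_factory=list)    # block ids in file order
    children: dict = field(default_factory=dict)  # child name -> Meta


class Namespace:
    def __init__(self):
        self.root = Meta("/", is_dir=True, mode=0o755)

    def _walk(self, path: str):
        """Follow the path from the root; return None if any component is missing."""
        node = self.root
        for part in (p for p in path.split("/") if p):
            if node is None:
                return None
            node = node.children.get(part)
        return node

    def _insert(self, path: str, is_dir: bool, size: int = 0) -> Meta:
        parent_path, _, name = path.rstrip("/").rpartition("/")
        parent = self._walk(parent_path)
        if parent is None or not parent.is_dir or not name or name in parent.children:
            raise ValueError(f"cannot create {path}")
        n_blocks = -(-size // BLOCK_SIZE)  # ceiling division
        meta = Meta(name, is_dir, size=size,
                    blocks=[f"{path}#{i}" for i in range(n_blocks)])
        parent.children[name] = meta
        return meta

    def create(self, path: str, size: int = 0) -> Meta:
        return self._insert(path, is_dir=False, size=size)

    def mkdir(self, path: str) -> Meta:
        return self._insert(path, is_dir=True)

    def exists(self, path: str) -> bool:
        return self._walk(path) is not None

    def delete(self, path: str) -> None:
        parent_path, _, name = path.rstrip("/").rpartition("/")
        parent = self._walk(parent_path)
        if parent is None or name not in parent.children:
            raise ValueError(f"no such entry {path}")
        del parent.children[name]

    def listdir(self, path: str) -> list:
        node = self._walk(path)
        if node is None or not node.is_dir:
            raise ValueError(f"not a directory: {path}")
        return sorted(node.children)
```

Under these assumptions a 150 MB file created with `create("/data/big.bin", size=150 * 1024 * 1024)` is recorded with three block entries, two full 64 MB blocks and one partial block.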
4. The framework method of the distributed file system supporting mass data access according to claim 1, characterized in that the Kademlia algorithm of step (4) proceeds as follows:
(4-1) the machine characteristics and the file path are each hashed to obtain an ID;
(4-2) IDs are distributed on a ring of size $2^{64}$; to find the node closest to the ID to which the current key maps, the distances to the known nodes must be calculated; in the Kademlia algorithm, the distance between two IDs is obtained by XOR:
$$d(x, y) = x \oplus y;$$
XOR is unidirectional: for any given node x and distance D, there is always exactly one node y such that d(x, y) = D;
(4-3) the Kad routing table is composed of data structures called K-buckets; a K-bucket actually stores <K,V> pair mappings; each K-bucket has an ID and the distance range of the ID values it contains; when enough <K,V> pairs have been inserted, the K-bucket splits, and in its final state the Kad routing table of a machine holds 64 K-buckets; if a K-bucket is full, entries are replaced using the LRU algorithm;
(4-4) the Kad routing table is an unbalanced segment binary tree whose operations are insertion, deletion, and lookup of the entry closest to a given ID value.
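Steps (4-2) and (4-3) can be sketched in Python as follows. The function and class names, the bucket capacity of 8, and the flat sorted lookup (instead of the segment binary tree of step (4-4)) are assumptions made to keep the example short.

```python
from collections import OrderedDict

ID_BITS = 64  # IDs live in a space of size 2**64


def distance(x: int, y: int) -> int:
    """Kademlia distance d(x, y) = x XOR y."""
    return x ^ y


def bucket_index(own_id: int, other_id: int) -> int:
    """Index (0..63) of the K-bucket for other_id: position of the highest differing bit."""
    d = distance(own_id, other_id)
    if d == 0:
        raise ValueError("a node does not store itself in a bucket")
    return d.bit_length() - 1


def closest(target: int, known_ids, count: int = 1) -> list:
    """Return the known IDs with the smallest XOR distance to target."""
    return sorted(known_ids, key=lambda node_id: distance(node_id, target))[:count]


class KBucket:
    """Bounded <K,V> store with least-recently-used replacement when full."""

    def __init__(self, capacity: int = 8):  # capacity is an assumed value
        self.capacity = capacity
        self.entries = OrderedDict()  # node id -> address, least recently used first

    def touch(self, node_id: int, address: str) -> None:
        if node_id in self.entries:
            self.entries.move_to_end(node_id)     # mark as most recently used
        elif len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)      # evict the least recently used entry
        self.entries[node_id] = address
```

The unidirectional property can be checked directly: for a node `x` and a distance `D`, the unique node at that distance is `x ^ D`, because `distance(x, x ^ D) == D`.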
5. The framework method of the distributed file system supporting mass data access according to claim 4, characterized in that in step (4-1) the machine characteristics and the file path are each converted to an ID by the 64-bit CityHash algorithm.
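Mapping machines and file paths into the same 64-bit ID space can be sketched as below. CityHash is not part of the Python standard library, so this sketch substitutes an 8-byte BLAKE2b digest as a stand-in; the resulting IDs therefore differ from what 64-bit CityHash would produce. Using an `address:port` string as the machine characteristic is also an assumption.

```python
import hashlib


def id64(text: str) -> int:
    """Hash a string to a 64-bit ID (BLAKE2b stand-in, not CityHash)."""
    digest = hashlib.blake2b(text.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def owner(path: str, machines: list) -> str:
    """Pick the machine whose ID has the smallest XOR distance to the path's ID."""
    key = id64(path)
    return min(machines, key=lambda machine: id64(machine) ^ key)
```

Because machines and paths share one ID space, the same XOR comparison decides both routing between nodes and the placement of a file.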
6. The framework method of the distributed file system supporting mass data access according to claim 1, characterized in that the PaxosLease algorithm of step (6) is as follows:
When a proposer puts forward a proposal, the proposal is approved only if it obtains the approval of more than half of the acceptors; only then can it be synchronized to all those who execute the proposal; its constraints include:
P1: an acceptor must accept the first proposal it receives;
P2: once a proposal with value v is approved, every proposal approved afterwards must have value v.
7. The framework method of the distributed file system supporting mass data access according to claim 6, characterized in that constraint P2 is further strengthened:
P2a: once a proposal with value v is approved, every proposal that any acceptor accepts afterwards must have value v;
At the same time, the behavior of proposers is constrained:
P2b: once a proposal with value v is approved, every proposal that any proposer puts forward afterwards must have value v.
8. The framework method of the distributed file system supporting mass data access according to claim 7, characterized in that the constraints further include:
P2c: if a proposal numbered n has value v, then there exists a majority such that either none of its members has accepted any proposal numbered less than n, or, among all proposals numbered less than n that its members have accepted, the one with the highest number has value v.
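The majority rule and constraint P2c can be illustrated with a minimal single-decree Paxos round in Python. This sketch covers only the prepare/accept exchange described by the constraints; lease expiry timers, message loss, and networking are not modelled, and the names `Acceptor` and `propose` are assumptions.

```python
class Acceptor:
    def __init__(self):
        self.promised = 0     # highest proposal number promised so far
        self.accepted = None  # (number, value) of the last accepted proposal, if any

    def prepare(self, n: int):
        """Promise to ignore proposals numbered below n; report any accepted proposal."""
        if n > self.promised:
            self.promised = n
            return True, self.accepted
        return False, None

    def accept(self, n: int, value) -> bool:
        """Accept the proposal unless a higher-numbered promise has been made."""
        if n >= self.promised:
            self.promised = n
            self.accepted = (n, value)
            return True
        return False


def propose(acceptors, n: int, value):
    """Run one round; return the approved value, or None if no majority was reached."""
    majority = len(acceptors) // 2 + 1
    replies = [a.prepare(n) for a in acceptors]
    promises = [accepted for ok, accepted in replies if ok]
    if len(promises) < majority:
        return None
    earlier = [p for p in promises if p is not None]
    if earlier:
        # P2c: adopt the value of the highest-numbered proposal already accepted.
        value = max(earlier, key=lambda p: p[0])[1]
    votes = sum(a.accept(n, value) for a in acceptors)
    return value if votes >= majority else None
```

With three acceptors, a first call `propose(nodes, 1, "node-A")` approves `"node-A"`; a later call `propose(nodes, 2, "node-B")` still returns `"node-A"`, because the proposer must adopt the value already accepted by the majority. This is how a leader, once elected, stays the agreed choice for that round.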
CN201410216506.6A 2014-05-21 2014-05-21 Support the framework method of the distributed file system of mass data access Active CN104008152B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410216506.6A CN104008152B (en) 2014-05-21 2014-05-21 Support the framework method of the distributed file system of mass data access

Publications (2)

Publication Number Publication Date
CN104008152A CN104008152A (en) 2014-08-27
CN104008152B true CN104008152B (en) 2017-12-01

Family

ID=51368809

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410216506.6A Active CN104008152B (en) 2014-05-21 2014-05-21 Support the framework method of the distributed file system of mass data access

Country Status (1)

Country Link
CN (1) CN104008152B (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant