Background Art
A peer-to-peer (P2P) network is a class of distributed system that uses resources at the edge of the Internet (storage, computing power, content, and so on) and that is decentralized, self-organizing, and fully or mostly symmetric in its connections. In a P2P environment each participating node acts as both a client and a server: by contributing all or part of its own resources, it gains access to the resources shared in the P2P network. There is no central coordinator or database, and no peer has a view of the whole system; global behavior emerges from local interactions. All existing data and services are open to every peer, peers are autonomous, and their connections are unreliable.
P2P technology weakens the role of centralized servers and emphasizes the role of every individual in the network. It stresses direct communication and contact between individuals, between systems, and between computers, and each participant is both a client and a server. This raises people's sharing behavior on the Internet to a much broader level and lets them take part in the network more actively. It differs fundamentally from the existing Client/Server model adopted by middleware-based distributed computing technology. Judging from current applications, the strength of P2P lies mainly in large-scale sharing and search. It has mainly given rise to, or better solved, four major classes of network application: peer computing, collaborative work, search engines, and file exchange.
Research on P2P-based grid computing faces roughly three prominent problems: resource location, security, and manageability. Manageability rests on the other two, so in essence, solving resource location and security would allow P2P applications to develop dramatically. The key problem P2P must solve is this: in a dynamic environment with no central server, every node must be able to maintain consistent network topology information. Because nodes join and leave a P2P network very frequently, traditional route-propagation methods struggle with this problem, so an efficient mechanism for maintaining consistent information is needed. For example, network stability must be restored quickly when the topology changes, and the concurrent joining and leaving of many nodes makes this harder. In addition, letting users find the resources and services they need among a large number of scattered nodes is also a challenge.
In P2P, there are three main resource lookup modes:
(1) Centralized search: a central server holds the file index, and all searches are carried out on it. Advantages: less network traffic, since no broadcast is needed; fast response as long as the server is not overloaded. Disadvantages: a single point of failure; potential congestion; quality of service is hard to guarantee because of the large number of connections; and service is interrupted if the central server goes down.
(2) Distributed flooding search: there is no central server, and queries are broadcast within a certain range. Advantages: no fixed network topology and no fixed data or index locations. Disadvantages: broadcast queries generate very heavy network traffic; unrelated computers are forced to process queries; the network scales poorly; and the query negotiation mechanism makes responses very slow.
(3) Distributed hash search: the basic idea is that the application system uses the Secure Hash Algorithm (SHA) to map the resources to be accessed into an identifier space (globally unique identifiers). Given a resource identifier, the system uses a distributed query protocol to find the node responsible for it and then accesses the resource. This location mechanism replaces the server-centered model of existing networks, lets any two hosts exchange data and share information, and offers good scalability and load balance. Its characteristics are that it uses a distributed hash table (DHT); files are mapped and queried through hash keys; and hash keys speed up lookups and reduce information propagation. The DHT mode has the following advantages. Load balance: the distributed hash function spreads index entries evenly over the nodes, so load balance is inherent. Decentralization: it is fully distributed, and all nodes are equal, which effectively improves the robustness of the system. Scalability: the cost of a DHT lookup is proportional to the logarithm of the number of nodes, so even very large systems are feasible. Fast location of information. Of the three modes, DHT is currently the most promising P2P resource lookup mode.
Summary of the invention
Technical problem: the object of the present invention is to provide a resource location method that uses a distributed hash table for peer computing in a peer-to-peer network, addressing the resource location bottleneck in P2P networks. With the proposed method, a distributed hash table can be built in the JXTA development system (JXTA is a P2P software development platform from Sun), with the aim of significantly reducing resource lookup time in the P2P network.
Technical solution: the method of the present invention is an improvement on Chord (a ring-shaped distributed hash table). The main change is that consistent mapping is performed in both the clockwise and counterclockwise directions on the hash ring, which remedies Chord's weakness of uneven lookup cost when resources are located in one direction only. In addition, the rendezvous points in the P2P network are organized into the ring of the BChord distributed hash table, and each rendezvous point maps to several ordinary peers. This further reduces resource location time and enlarges the effective management domain of the P2P network.
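The asymmetry that motivates searching in both directions can be illustrated with a little ring arithmetic. The sketch below is not the BChord pointer table itself, whose layout is not spelled out at this point; it only compares clockwise and counterclockwise distances on an identifier ring. The function names and the 6-bit ring size are assumptions chosen for illustration.

```python
def clockwise_distance(a: int, b: int, m: int) -> int:
    """Steps from identifier a to identifier b going clockwise on a 2**m ring."""
    return (b - a) % (2 ** m)


def counterclockwise_distance(a: int, b: int, m: int) -> int:
    """Steps from identifier a to identifier b going counterclockwise."""
    return (a - b) % (2 ** m)


def shorter_direction(a: int, b: int, m: int) -> str:
    """Name the direction in which b is closer to a (ties go clockwise)."""
    cw = clockwise_distance(a, b, m)
    ccw = counterclockwise_distance(a, b, m)
    return "clockwise" if cw <= ccw else "counterclockwise"


if __name__ == "__main__":
    m = 6  # assumed identifier length: a ring of 64 positions
    # From identifier 8, identifier 1 is far away clockwise but close counterclockwise.
    print(clockwise_distance(8, 1, m), counterclockwise_distance(8, 1, m))
    print(shorter_direction(8, 1, m))
```

A target that sits just behind the querying node is almost a full turn away when only clockwise search is allowed, which is the kind of unevenness a second search direction is meant to remove.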
The scheme, in which the rendezvous points of the peer-to-peer network form a distributed hash ring space and the pointer table of that ring space is searched clockwise and counterclockwise simultaneously, comprises the following steps:
Creation of the peer group and the rendezvous domain:
Step 1) Peer group creation: create the P2P peer group and have peers join the P2P network.
Step 2) Rendezvous domain creation: generate the rendezvous point; closely connected peers form a rendezvous domain, and resources and messages are shared among all peers within the domain.
Creation of the distributed hash ring:
Step 3) Hash the rendezvous point onto the distributed hash ring space of the peer-to-peer network: grant the rendezvous point the authority to use the distributed hash, and map it by hashing onto the ring space of the peer-to-peer network.
Step 4) Update the information held by the other rendezvous points on the distributed hash ring space: notify the other rendezvous points on the ring space, which respond accordingly.
Resource location between peers in the peer-to-peer network:
Step 5) Store resource location information: a peer in the rendezvous domain publishes a resource; the rendezvous point of the domain hashes the resource's key information; and the resource location information is stored on the corresponding rendezvous point according to the distributed hash algorithm.
Step 6) Locate resources between peers in the peer-to-peer network: a peer in the rendezvous domain searches for a resource, and the rendezvous point of the domain locates the resource's key information by hashing.
Step 7) Request resources between peers in the peer-to-peer network: the rendezvous point obtains the key information of the wanted resource from other rendezvous points according to the distributed hash algorithm, and the requester asks the owner of the resource for it.
Step 8) Handle rendezvous point exit and failure: when a rendezvous point fails or leaves the distributed hash ring normally, the related rendezvous points on the ring make the corresponding adjustments according to the distributed hash algorithm.
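Steps 3 and 5 to 7 can be sketched as a toy in-memory model. This is a minimal illustration under stated assumptions, not the JXTA implementation: the class name RendezvousRing, the methods publish and locate, the 32-bit identifier length, and the use of a rendezvous point's name as its hash input are all hypothetical, and no networking or JXTA API is involved.

```python
import hashlib
from bisect import bisect_left

M = 32  # assumed identifier length in bits


def ident(text: str) -> int:
    """Hash a string to an M-bit identifier using SHA-1."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** M)


class RendezvousRing:
    """Toy ring of rendezvous points; each one holds part of the resource index."""

    def __init__(self, rendezvous_names):
        # Step 3: map every rendezvous point onto the ring by hashing its name.
        self.index = {ident(name): {} for name in rendezvous_names}
        self.ring = sorted(self.index)

    def responsible(self, key: int) -> int:
        """First rendezvous identifier >= key, wrapping around the ring."""
        i = bisect_left(self.ring, key)
        return self.ring[i % len(self.ring)]

    def publish(self, resource: str, owner: str) -> None:
        """Step 5: store the owner's location under the hashed resource key."""
        key = ident(resource)
        self.index[self.responsible(key)].setdefault(key, set()).add(owner)

    def locate(self, resource: str) -> set:
        """Steps 6 and 7: hash the key and ask the responsible rendezvous point."""
        key = ident(resource)
        return set(self.index[self.responsible(key)].get(key, set()))


if __name__ == "__main__":
    ring = RendezvousRing(["R1", "R2", "R3"])
    ring.publish("report.pdf", "P7")
    print(ring.locate("report.pdf"))
```

The requester would then contact the returned owner directly to fetch the resource; that transfer is outside the scope of this sketch.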
I. Consistent Hashing
BChord assigns keywords to nodes using a variant of consistent hashing. Consistent hashing has several desirable properties. First, the hash function balances load: all nodes receive roughly the same number of keywords. Second, when the Nth node joins or leaves the system, only about 1/N of the keywords need to move to another location. BChord improves the scalability of consistent hashing. In BChord, a node does not need to know about every other node; it only needs a small amount of "routing" information about other nodes. In a system of N nodes, each node maintains information about only O(log N) other nodes, and each lookup requires only O(log N) messages. When a node joins or leaves the system, BChord must update the routing information, and each join or leave requires O(log^2 N) messages.
The consistent hash function assigns each node and each keyword an m-bit identifier, which can be produced with a hash function such as SHA-1. A node's identifier can be obtained by hashing the node's IP address, and a keyword's identifier by hashing the keyword directly. For example, the node with IP address 120.10.10.1 has identifier 54 after SHA-1 hashing, and the keyword "LifAtGo" has identifier 30 after hashing. The identifier length m must be large enough that the probability of two nodes or keywords hashing to the same identifier is negligible.
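As a minimal sketch of how such identifiers might be produced, the function below truncates a SHA-1 digest to m bits. The function name identifier and the truncation by modulo are assumptions for illustration; the values 54 and 30 quoted above are illustrative figures and are not expected to be reproduced by this particular truncation.

```python
import hashlib


def identifier(text: str, m: int) -> int:
    """Map a string (an IP address or a keyword) to an m-bit identifier.

    The SHA-1 digest is read as a big-endian integer and reduced modulo 2**m.
    """
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


if __name__ == "__main__":
    m = 6  # assumed small identifier length, giving a ring of 64 positions
    print(identifier("120.10.10.1", m))  # identifier of a node, from its IP address
    print(identifier("LifAtGo", m))      # identifier of a keyword
```

With m as small as 6, collisions are likely once more than a handful of names are hashed, which is why a practical system would use a much longer identifier.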
In consistent hashing, each keyword is stored on its successor node. The successor node of keyword K, written successor(K), is the first node whose identifier is greater than or equal to the identifier of K. If identifiers are m bits long and the numbers 0 to 2^m - 1 are arranged in a circle, then successor(K) is the first node reached going clockwise from K, such as node 14 in Fig. 1. Another property of consistent hashing is that a node joining or leaving the system disturbs the network as little as possible. When node n joins the network, some keywords previously assigned to n's successor are reassigned to n to keep the hash mapping consistent. When node n leaves the network, all keywords assigned to it are reassigned to n's successor. No other change occurs in the network.
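The successor rule can be sketched in a few lines. The node set below is an assumed example ring with m = 6; it includes the node identifiers mentioned in the text (8, 14, 32, 42, 56), but the remaining identifiers are hypothetical.

```python
from bisect import bisect_left

# Assumed example ring with m = 6 (identifiers 0..63).
NODES = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]


def successor(nodes, k, m=6):
    """Return the first node identifier >= k going clockwise on a 2**m ring."""
    ring = sorted(nodes)
    i = bisect_left(ring, k % (2 ** m))
    return ring[i % len(ring)]  # past the largest node, wrap to the smallest


if __name__ == "__main__":
    for key in (10, 30, 54, 60):
        print(key, "->", successor(NODES, key))
```

Keyword 60 has no node at or above it, so it wraps around the ring and is stored on the smallest node identifier.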
II. BChord Lookup
Each node maintains a small amount of routing information, which makes lookups more efficient. Let m be the number of bits in keyword and node identifiers (in binary representation). Each node maintains a routing table of at most m entries, called the pointer table (finger table). The i-th entry in the table of node n contains s = successor((n + 2^(i-1)) mod 2^m), where 1 <= i <= m and all arithmetic is modulo 2^m. s is called the i-th pointer of node n and is written n.finger[i].node. The other fields of the pointer table have the following meanings:
(1) finger[k].start: (n + 2^(k-1)) mod 2^m, 1 <= k <= m
(2) finger[k].interval: [finger[k].start, finger[k+1].start)
(3) finger[k].node: the first node whose identifier is >= n.finger[k].start
(4) finger[k].successor: the next node on the identifier ring, finger[1].node
(5) finger[k].predecessor: the previous node on the identifier ring
For example, in Fig. 2 the sixth pointer of node 8 starts at identifier (8 + 2^5) mod 2^6 = 40, and its successor node is 42. This scheme has two important properties. First, each node needs to know about only some of the nodes, and it knows more about nodes close to it on the ring than about nodes farther away. Second, a node's pointer table does not contain the location of every keyword. For example, node 8 in Fig. 2 does not know the location of keyword 54, because none of node 8's pointers points directly to node 56, the successor of keyword 54.
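The construction of a pointer table can be sketched as follows. The node set is an assumed example ring with m = 6 that contains the nodes named in the text; the function names are hypothetical, and only the clockwise table of classical Chord is built here.

```python
from bisect import bisect_left

# Assumed example ring with m = 6 (identifiers 0..63).
NODES = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]


def successor(nodes, k, m):
    """First node identifier >= k going clockwise on a 2**m ring."""
    ring = sorted(nodes)
    i = bisect_left(ring, k % (2 ** m))
    return ring[i % len(ring)]


def finger_table(n, nodes, m):
    """Return a list of (start, node) pairs for pointers 1..m of node n."""
    table = []
    for i in range(1, m + 1):
        start = (n + 2 ** (i - 1)) % (2 ** m)  # finger[i].start
        table.append((start, successor(nodes, start, m)))  # finger[i].node
    return table


if __name__ == "__main__":
    for i, (start, node) in enumerate(finger_table(8, NODES, 6), start=1):
        print(f"finger[{i}]: start={start:2d} node={node}")
```

For node 8 the last two entries start at 24 and 40 and point to nodes 32 and 42, matching the example given for Fig. 2.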
What happens when node n does not know the successor of keyword K? If n can find a node whose identifier is closer to K, that node will know more about the keyword. Accordingly, n searches its pointer table, finds the first node j whose identifier is greater than K, and asks j whether it knows of a node closer to K. By repeating this process, n eventually learns the successor of K.
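The repeated hand-off can be sketched as an iterative lookup. As an assumption, this sketch uses the forwarding rule of classical Chord (forward to the pointer that most closely precedes K), which may differ in detail from the selection rule described above, and it searches clockwise only. The node set and all function names are hypothetical.

```python
from bisect import bisect_left

M = 6
SIZE = 2 ** M
NODES = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]  # assumed example ring, sorted


def successor(k):
    """First node identifier >= k going clockwise."""
    i = bisect_left(NODES, k % SIZE)
    return NODES[i % len(NODES)]


def dist(a, b):
    """Clockwise distance from identifier a to identifier b."""
    return (b - a) % SIZE


# Pointer table of every node; entry 0 is the node's immediate successor.
FINGERS = {n: [successor(n + 2 ** (i - 1)) for i in range(1, M + 1)] for n in NODES}


def closest_preceding_finger(n, key):
    """Farthest pointer of n that lies strictly between n and key."""
    limit = dist(n, key) or SIZE
    for f in reversed(FINGERS[n]):
        if 0 < dist(n, f) < limit:
            return f
    return n


def lookup(n, key):
    """Return (successor of key, nodes visited), starting the query at node n."""
    path = [n]
    while True:
        succ = FINGERS[n][0]
        if 0 < dist(n, key) <= (dist(n, succ) or SIZE):
            return succ, path  # key falls in the interval (n, succ]
        nxt = closest_preceding_finger(n, key)
        if nxt == n:
            return succ, path  # no closer pointer is known
        n = nxt
        path.append(n)


if __name__ == "__main__":
    print(lookup(8, 54))
```

Starting at node 8, the query for keyword 54 is handed to node 42 and then to node 51, whose immediate successor 56 is the node that stores the keyword.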
III. Node Joining and Failure Handling in BChord
The joining of node n has three stages. (1) Initialize the pointer table of the new node using BChord lookup: assume that, before joining the network, node n learns of some node n' already in the network through some mechanism. To initialize its pointer table, n asks n' to look up the entries of the pointer table on its behalf. (2) Update the pointer tables of the existing nodes: after joining the network, the node calls the update function of the other nodes so that they update their pointer tables. (3) Transfer keywords from the successor to node n: all keywords whose successor is now n are moved to n. The time complexity of the whole join operation is O(log^2 N).
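Stage (3) can be sketched as a filter over the keywords held by the successor: a keyword moves to the new node when it lies in the ring interval between the new node's predecessor (exclusive) and the new node (inclusive). The function name and the example identifiers are assumptions for illustration.

```python
def keys_to_transfer(new_node, predecessor, successor_keys, m=6):
    """Keywords held by the successor that now belong to new_node.

    A keyword k moves if it lies in the interval (predecessor, new_node]
    on a ring of 2**m identifiers.
    """
    size = 2 ** m
    span = (new_node - predecessor) % size
    return sorted(k for k in successor_keys if 0 < (k - predecessor) % size <= span)


if __name__ == "__main__":
    # Assumed situation: node 26 joins between nodes 21 and 32,
    # and node 32 currently stores keywords 24 and 30.
    print(keys_to_transfer(26, 21, {24, 30}))
```

Keyword 24 moves to the new node 26, while keyword 30 stays on node 32 because 32 is still its successor.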
In a P2P network a peer node may leave or fail at any time, so handling node failure is an important problem. The most critical step in failure handling is maintaining a correct successor pointer. To guarantee this, each BChord node maintains a successor list containing its r nearest successors. If node n notices that its successor has failed, it replaces the failed node with the first live node in the successor list.
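The replacement rule can be sketched as a scan of the successor list. The function name and the is_alive callback are hypothetical; how liveness is actually detected (for example by timeouts) is not specified here.

```python
from typing import Callable, Iterable, Optional


def first_live_successor(
    successor_list: Iterable[int],
    is_alive: Callable[[int], bool],
) -> Optional[int]:
    """Return the first node in the successor list that is still alive.

    Returns None when every one of the r listed successors has failed.
    """
    for node in successor_list:
        if is_alive(node):
            return node
    return None


if __name__ == "__main__":
    failed = {14, 21}  # assumed set of failed nodes
    # Node 8 keeps r = 3 successors; the first two have failed.
    print(first_live_successor([14, 21, 32], lambda n: n not in failed))
```

A larger r makes it less likely that all listed successors fail at once, at the cost of more state to keep up to date.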
IV. Application of the New DHT Structure (BChord)
In the BChord model every node may act as an intermediary for information, and node performance varies widely. If a node's communication fails, lookup performance drops. Therefore only nodes with fast communication, large cache capacity, and high computing speed should be allowed to forward messages. In JXTA, the rendezvous point (Rendezvous Peer) has exactly this function. A rendezvous point is first of all a Peer, and one that can process requests from other Peers. A rendezvous point can also delegate requests to other Peers, which must themselves be rendezvous points. A main purpose of rendezvous points is to make it easier to search for advertisements outside the local network. A rendezvous point usually has more resources and can store a large amount of information about the Peers around it. A rendezvous point can also act as a forwarder of searches: it can forward a discovery request to other rendezvous points (the original rendezvous point learns of them through advertisement exchange with other Peers). Each rendezvous point forwards the request if it does not itself hold the requested information.
In the P2P network shown in Fig. 5, one rendezvous point R and several peers P form a rendezvous domain RZ. Within a rendezvous domain, communication between peers (including the rendezvous point) uses multicast, and the peers keep their information locally consistent with one another. Outside the rendezvous domain, BChord-based DHT message routing is used to achieve global consistency of information. That is, the rendezvous points maintain the consistency of global messages in the P2P network, while ordinary peers need only maintain local consistency. In this way communication is faster and messages still propagate consistently. This approach takes full advantage of BChord's resource load balancing and also further reduces the resource lookup time in BChord.
The Peer sets of the computing platform are connected into a ring by the BChord distributed hash table protocol. Keeping this BChord ring stable is the key to the Peer aggregate; this includes maintaining predecessors and successors when a set is created or leaves, and periodically stabilizing the BChord ring. In the BChord protocol each position on the ring is a single node, and a node's ID on the ring is obtained by applying the hash function to the node's IP address. Here, by contrast, each position on the ring is a Peer set containing several nodes, which is where we innovate on the BChord protocol. How to select a rendezvous point among these nodes, and have it handle communication between sets on behalf of the nodes in the set under a unique Peer set identifier, is also key to the Peer aggregate. Locating a node is likewise key: a node collects information about itself, such as IP address, online time, system status, and available resources, and then joins the system. It is mapped onto the one-dimensional ring according to this information to obtain its coordinates, then uses the BChord ring to find the interval that is physically close to it (that is, a Peer set) and joins that interval, after which the set adjusts dynamically.
Beneficial effects: the BChord technique has the following advantages:
1. Independence from programming language and platform
The BChord-based resource location technique for peer computing is developed on the JXTA platform. JXTA is designed to be independent of programming language, system platform, and network platform, and to be implementable on any digital device, including sensors, consumer electronics, network routers, desktop computers, servers, and storage devices.
2. The common advantages of distributed hash tables
(1) Load balance: the distributed hash function spreads index entries evenly over the nodes, so load balance is inherent. (2) Decentralization: it is fully distributed, and all nodes are equal, which effectively improves the robustness of the system. (3) Good scalability: the cost of a DHT lookup is proportional to the logarithm of the number of nodes, so even very large systems are feasible. (4) Fast location of information.
3. Further reduction of resource location time
The Chord hash table is improved by performing consistent mapping in both the clockwise and counterclockwise directions on the hash ring, which remedies Chord's weakness of uneven lookup cost when resources are located in one direction only. In addition, the rendezvous points in the P2P network are organized into the ring of the BChord distributed hash table, and each rendezvous point maps to several ordinary peers. This further reduces resource location time.
Embodiment
Fig. 1 is an example of consistent hashing. Identifiers beginning with N denote nodes, and identifiers beginning with K denote resource keywords. In the figure, for example, the identifier obtained by SHA-1 hashing for the node with IP address 120.10.10.1 is K54, which is stored on its successor node N56; the identifier obtained by hashing the keyword "LifAtGo" is K30, which is stored on its successor node N32.
Fig. 2 is an example of BChord data organization. Identifiers beginning with N denote nodes, and identifiers beginning with K denote resource keywords. Each node maintains a routing table of at most m entries, called the pointer table (finger table). Node N8 would normally query through the pointer N(8+16), that is, N24; since there is no node N24, N8 instead queries through N32, the successor of N24. Similarly, N8 obtains query information from N42 through the pointer N(8+32).
Fig. 3 shows the application model of BChord in a P2P network, where R is a rendezvous point, P is a peer, RZ is a rendezvous domain, and DHT is the distributed hash table.
Fig. 4 shows the flow of an embodiment of BChord-based DHT resource location in a P2P network. It comprises three stages: creation of the P2P peer group and rendezvous domain; creation of the BChord-based DHT ring; and resource location and search between peers in the P2P network.
The specific procedure is:
Step 1) P2P peer group creation: create the P2P peer group (the P2P network), have peers join the P2P network, and so on.
Step 2) Rendezvous domain creation: generate the rendezvous point, form rendezvous domains from closely connected peers, share resources and messages among all peers within a rendezvous domain, and so on.
Step 3) Creation of the BChord-based DHT ring: grant the rendezvous point the authority to use the BChord distributed hash, map the rendezvous point by hashing onto the P2P ring space, notify the other rendezvous points on the P2P ring space, have those rendezvous points respond accordingly, and so on.
Step 4) Store resource location information: a peer in the rendezvous domain publishes a resource, the rendezvous point of the domain hashes the resource's key information, and the resource location information is stored on the corresponding rendezvous point according to the BChord algorithm.
Step 5) Resource location and request between peers in the P2P network: a peer in the rendezvous domain searches for a resource, the rendezvous point of the domain hashes the resource's key information, the rendezvous point obtains the key information of the wanted resource from other rendezvous points according to the BChord algorithm, and the requester asks the owner of the resource for it.
Step 6) Rendezvous point exit and failure handling: when a rendezvous point fails or leaves the BChord DHT ring normally, the related rendezvous points on the DHT ring make the corresponding adjustments according to the BChord algorithm, and so on.