CN103139226A - Copy storage system and method based on peer-to-peer (P2P) on-line information aggregation

Info

Publication number: CN103139226A
Application number: CN2011103746049A
Authority: CN
Inventors: 王劲林; 宋军; 尤佳莉; 苏杭; 李晓林; 郑鹏飞; 薛娇; 吕阳
Original assignee: Beijing Zhongke Huaying Media Technology Co ltd; Institute of Acoustics CAS
Current assignee: Zhengzhou Xinrand Network Technology Co ltd; Institute of Acoustics CAS
Priority date: 2011-11-23
Filing date: 2011-11-23
Publication date: 2013-06-05
Anticipated expiration: 2031-11-23
Also published as: CN103139226B

Abstract

The invention provides a replica storage system and method based on P2P online information aggregation. The method selects, as file replica storage nodes, neighbor nodes that have a high probability of being online at the same time as the local node. In the method, each node in a P2P storage system collects the historical online information of its neighbor nodes, and the collecting node predicts its future online information from that historical online information. When a local node in the P2P storage system needs to store a file, it uses the historical online information and the predicted future online information of its neighbor nodes to obtain a set number of neighbor nodes that have a high probability of being online at the same time as the local node, and stores the file replicas on them. The neighbor nodes include neighbors of one or more levels: first-level neighbor nodes are the neighbors of a given node, and second-level neighbor nodes are the neighbors of that node's neighbors. This storage strategy effectively improves the file reliability and availability of the P2P storage system.

Description

A replica storage system and method based on P2P online information aggregation

Technical field

The invention belongs to the technical field of peer-to-peer (P2P) network storage, and specifically relates to a replica storage system and method based on P2P online information aggregation.

Background technology

P2P technology is an important branch of distributed systems. It is characterized by good scalability and autonomy. Unlike the traditional client/server (C/S) architecture, each node in the system is both a service provider and a service consumer. This peer-to-peer property reduces users' dependence on servers and makes full use of idle resources on the network, improving resource utilization.

P2P storage is one concrete application of P2P technology and covers two aspects: file sharing and file storage. Napster, Gnutella, BitTorrent, eDonkey and the like are P2P-based file sharing systems; PAST, OceanStore, CFS and the like are P2P-based file storage systems; and some currently popular network disk services also use P2P technology.

In a P2P file storage system, most systems adopt a replica mechanism to guarantee the reliability and availability of stored files: when a file is stored, the system selects a certain number of replica nodes from all online nodes according to some strategy and then stores the file on every one of them. This replica mechanism safeguards the reliability and availability of stored files. However, P2P systems are highly dynamic: nodes join and leave frequently, so the number of replica nodes keeps shrinking. If no strategy is adopted to maintain the number of replica nodes, all replica nodes may eventually leave the system, seriously harming the reliability and availability of the file.

To address the shrinking number of replica nodes caused by the high dynamics of P2P systems, many current systems adopt a replica migration mechanism: when a replica node leaves the system, the files stored on it are migrated to another online node, which then serves as a replica node for those files. Even though the original node has left, the number of replica nodes for the file does not decrease, so the reliability and availability of the file are preserved.

However, a replica migration mechanism intended to improve file reliability and availability consumes a large amount of the P2P storage system's bandwidth and puts great pressure on system operation. Because the P2P system is highly dynamic, replica nodes leave frequently, and each departure triggers a file migration to another online node. Frequent departures therefore cause frequent file migrations, and much of the system's bandwidth is spent on them.

Summary of the invention

The object of the invention is to overcome the above problems by providing a replica storage system and method based on P2P online information aggregation.

To achieve this object, the invention provides a replica storage method based on P2P online information aggregation. The method selects, as file replica storage nodes, neighbor nodes that have a high probability of being online at the same time as the local node, and comprises:

Step 1: each node in the P2P storage system collects the historical online information of its neighbor nodes, and the collecting node predicts its own future online information from this historical online information.

Step 2: when a local node in the P2P storage system needs to store file replicas, it uses the historical online information and the future online information of its neighbor nodes to obtain a set number of neighbor nodes that have a high probability of being online at the same time as the local node, and stores the file replicas on them.

Here, the neighbor nodes include neighbors of one or more levels: a first-level neighbor node is a neighbor of a given node, a second-level neighbor node is a neighbor of a neighbor of that node, and neighbors of any further level are defined by analogy.

In the above technical solution, step 1 further comprises the following substeps:

Step 1-1: a node in the P2P storage system collects the historical online information of its neighbor nodes; the collecting node and its neighbor nodes form a node set.

Step 1-2: a feature vector is extracted from the historical online information of each node in the node set.

Step 1-3: the node set is clustered according to the extracted feature vectors to obtain the cluster containing the collecting node; all nodes of that cluster are then used as the training set for prediction, from which the future online information of the collecting node is predicted.

In step 1-1, if the P2P storage system is small, so that a node finds only a few first-level neighbor nodes, the node continues to collect the historical information of its second-level neighbor nodes, and so on, until the accumulated number of neighbor nodes reaches a certain scale. If the P2P storage system is large, only the historical information of first-level neighbor nodes is collected.

In step 1-2, the dimension of the feature vector is set according to the overall online/offline pattern of the P2P nodes in the actual P2P storage system.

Optionally, the neighbor nodes with a high probability of being online at the same time as the local node are found by taking the nodes whose online behavior is most similar to that of the local node as replica storage nodes, where similarity can be determined by a Euclidean distance algorithm or a longest continuous online time algorithm.

Optionally, in step 2, when the local node needs to store file replicas:

The local node first checks whether the total number of nodes in its neighbor node table is smaller than the required number of replica nodes. If so, it recursively collects its second-level neighbor node table, and so on, traversing as many levels of neighbor nodes as needed until the total number of collected neighbor nodes is at least the number of replica nodes. It then selects the required number of replica nodes from all collected neighbor nodes to store the file replicas.

Based on the above method, the invention further provides an optional replica storage system based on P2P online information aggregation. The system selects, as file replica storage nodes, neighbor nodes that have a high probability of being online at the same time as the local node, and comprises:

A search and prediction subsystem, deployed on each node of the P2P storage system, which collects the historical online information of the neighbor nodes of its host node and predicts the future online information of that node from the historical online information.

A replica storage node selection subsystem, deployed on each node of the P2P storage system. When the host node of this subsystem has a file whose replicas need to be stored, the local node uses the historical online information and the future online information of its neighbor nodes to obtain a set number of neighbor nodes that have a high probability of being online at the same time as the local node, and stores the file replicas on them.

Here, the neighbor nodes include neighbors of one or more levels: a first-level neighbor node is a neighbor of a given node, a second-level neighbor node is a neighbor of a neighbor of that node, and multi-level neighbor nodes are defined by analogy.

In the above system, the search and prediction subsystem further comprises:

A historical information search module, which collects the historical online information of the neighbor nodes of its host node.

A feature vector extraction module, which extracts a feature vector from each piece of historical online information; and

A cluster-based prediction module, which clusters the neighbor nodes using the extracted feature vectors to obtain the cluster containing the host node of the historical information search module, then uses all nodes in that cluster as the training set for prediction, thereby predicting the future online information of the host node of the historical information search module.

The historical information search module can dynamically adjust the number of neighbor levels it searches: if the P2P storage system is small, so that there are few neighbor nodes, it searches multiple levels of neighbor nodes in turn; if the P2P storage system is large, it collects only the historical information of first-level neighbor nodes.

Preferably, the cluster-based prediction module further comprises:

A clusterer, which clusters the neighbor nodes using the extracted feature vectors to obtain the cluster containing the host node of the historical information search module, then feeds all nodes in that cluster to the predictor as its training set; and

A predictor, which predicts the future online information of the host node of the historical information search module from the input training set.

Optionally, the feature vector extraction module sets the dimension of the feature vector according to the overall online/offline pattern of the user nodes in the actual P2P storage system.

Optionally, the neighbor nodes with a high probability of being online at the same time as the local node are found by taking the nodes whose online behavior is most similar to that of the local node as replica storage nodes;

Here, similarity can be determined by a Euclidean distance algorithm or a longest continuous online time algorithm.

The replica storage node selection subsystem further comprises the following modules:

A judgment module, which determines whether the total number of nodes in the local node's neighbor node table is smaller than the set number of replica nodes and, if so, feeds the result back to the cumulative search module.

A cumulative search module, which, on receiving the feedback from the judgment module, recursively collects the second-level neighbor node table of the local node, and so on, traversing several levels of neighbor nodes until the total number of collected neighbor nodes reaches a certain scale, thereby providing a larger node selection space for replica selection; and

A final replica storage node determination module, which selects the set number of replica nodes from all collected neighbor nodes to store the file replicas. Here, the second level refers to the neighbor nodes of the local node's neighbor nodes.

The invention provides a storage strategy based on P2P online information aggregation, consisting of two parts: a prediction strategy for P2P node online information and an aggregation storage strategy based on the predicted information. The prediction strategy collects the historical online information of all neighbor nodes and uses it to predict the future online information of the node, that is, whether the node will be online in some future time period; the prediction model used is the Cluster-Based Predictor. The aggregation storage strategy based on predicted information acts when the local node stores a file: it collects the online information of the neighbor nodes, then applies a replica selection strategy based on this online information to select the specified number of replica nodes from all its neighbor nodes and store the file on them.

Compared with the prior art, the technical advantage of the invention is as follows. Because the future online status of nodes is considered when replicas are chosen, the likelihood that the replica nodes are online at the same time as the local node when it later retrieves the stored file is greatly increased, which improves the reliability and availability of the replicas. Storing files on nodes whose online/offline behavior is similar to that of the local node improves reliability and availability while avoiding the heavy bandwidth consumption of replica migration, so the performance of P2P storage is improved.

Description of drawings

Fig. 1-a is a schematic diagram of the network structure of the storage strategy based on P2P online information aggregation according to an embodiment of the invention;

Fig. 1-b is a flow chart of the storage strategy based on P2P online information aggregation of the invention;

Fig. 2 is a schematic structural diagram of the Cluster-Based Predictor in the storage strategy based on P2P online information aggregation of the invention;

Fig. 3 is a flow chart of node online information prediction in the storage strategy based on P2P online information aggregation of the invention;

Fig. 4 is a flow chart of the aggregation storage strategy based on node prediction information in the storage strategy based on P2P online information aggregation of the invention.

Embodiment

The storage strategy based on P2P online information aggregation of the invention is described in detail below with reference to the accompanying drawings and specific examples.

The invention proposes a storage strategy for P2P file storage systems. Without using a replica migration mechanism, the strategy guarantees the reliability and availability of the system, thereby avoiding frequent replica migration, greatly reducing the bandwidth consumption of the P2P storage system and improving system performance. To this end, the invention provides a storage strategy based on P2P online information aggregation.

The invention is characterized in that the storage strategy based on P2P online information aggregation consists of two parts: a prediction strategy for P2P node online information and an aggregation storage strategy based on the predicted information. The prediction of P2P node online information collects the historical online information of neighbor nodes and predicts whether the node will be online at a future time; the prediction model is the Cluster-Based Predictor. The aggregation storage strategy based on predicted information aggregates the collected online information of the neighbor nodes, then applies a replica selection strategy designed on this online information to choose replica nodes and store the file.

In the storage strategy of the invention, the function and processing method of each sub-strategy are as follows:

1) Prediction strategy for P2P node online information: to predict its own future online information, the local node first collects the historical online information of all its neighbor nodes, then extracts a feature vector from the historical online information of each neighbor node, and finally uses the Cluster-Based Predictor model to predict the future online information of the local node. The advantages of this sub-strategy are:

Using the historical online information of neighbor nodes as the predictor's training set makes the training set more complete, compensates for the weakness of using only the local node's own historical online information as the training set, and improves prediction accuracy;

By clustering the neighbor nodes and using all neighbor nodes in the local node's cluster as the predictor's training set, that is, by filtering out from all neighbor nodes those whose online behavior is most similar to the local node's, the training set becomes more effective, which further improves prediction accuracy.

When collecting the historical online information of all neighbor nodes of the local node, the number of neighbor levels to traverse can be chosen according to the scale of the P2P storage system. If the system is small, so that the local node has few neighbor nodes, then in addition to these first-level neighbors the node can also collect the neighbors of those neighbors, that is, its second-level neighbor nodes. If the system is large, collecting only first-level neighbor nodes is sufficient. Ensuring a certain number of neighbor nodes avoids the uncertainty of a small sample and improves the clustering result.
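The level-by-level collection described above can be sketched as a breadth-first expansion. This is only an illustration: the function name `collect_neighbors`, the adjacency-dictionary representation of neighbor tables, and the `max_level` cap are assumptions made here, not details given in the patent.

```python
def collect_neighbors(graph, local, min_count, max_level=3):
    """Collect neighbors of `local` one level at a time.

    graph     : dict mapping node id -> list of neighbor ids (hypothetical
                stand-in for each node's neighbor node table)
    min_count : stop expanding once at least this many neighbors are collected
    max_level : safety cap on how many neighbor levels are traversed (assumed)
    Returns (collected neighbor ids, number of levels traversed).
    """
    seen = {local}
    collected = []
    frontier = [local]
    level = 0
    while frontier and len(collected) < min_count and level < max_level:
        level += 1
        next_frontier = []
        for node in frontier:
            for nb in graph.get(node, ()):
                if nb not in seen:  # skip the local node and duplicates
                    seen.add(nb)
                    collected.append(nb)
                    next_frontier.append(nb)
        frontier = next_frontier
    return collected, level


# Hypothetical small overlay: A has two first-level neighbors.
overlay = {
    "A": ["B", "C"],
    "B": ["A", "D", "E"],
    "C": ["A", "F"],
    "D": ["B"], "E": ["B"], "F": ["C"],
}
print(collect_neighbors(overlay, "A", min_count=2))  # first level suffices
print(collect_neighbors(overlay, "A", min_count=4))  # expands to second level
```

The sketch always finishes a whole level before checking the count, so the result may contain more neighbors than `min_count`; whether a real system should stop mid-level is left open by the text.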

When extracting a feature vector (which characterizes a node's online/offline pattern) from the historical online information of each neighbor node, the dimension of the feature vector must be decided according to the online/offline pattern that the nodes of the actual P2P storage system exhibit as a whole. If the nodes as a whole exhibit an online/offline pattern with a period of one day, the dimension of the feature vector can be set to the number of historical information samples in one day; if the period is 2 days or one week, the dimension can be set to the number of samples in 2 days or one week. Setting the dimension in this way improves the clustering result.
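A minimal sketch of this feature extraction follows, assuming the history is stored as a list of 0/1 samples (1 meaning online) taken at a fixed interval. The function name `extract_feature_vector` and the data layout are illustrative assumptions; the 20-minute interval and 7-day history are the figures used in the embodiment.

```python
def extract_feature_vector(samples, dimension):
    """Per-slot online probability over all complete periods in `samples`.

    samples   : list of 0/1 flags, oldest first (assumed representation)
    dimension : number of samples in one online/offline period
    """
    if dimension <= 0:
        raise ValueError("dimension must be positive")
    periods = len(samples) // dimension
    if periods == 0:
        raise ValueError("history is shorter than one period")
    totals = [0] * dimension
    # Fold the sequence onto one period and count online samples per slot.
    for i, flag in enumerate(samples[:periods * dimension]):
        totals[i % dimension] += flag
    return [t / periods for t in totals]


samples_per_day = 24 * 60 // 20  # 72 samples per day at a 20-minute interval
# Hypothetical node that is online from 08:00 to 16:00 every day for 7 days.
one_day = [0] * 24 + [1] * 24 + [0] * 24
history = one_day * 7            # 7 * 72 = 504 samples
vector = extract_feature_vector(history, samples_per_day)
print(len(vector), vector[0], vector[24])
```

Each component of the resulting vector is the fraction of days on which the node was online in that time slot, which is one straightforward reading of the per-time-point probability described in the text.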

Predicting the future online information of the local node with the Cluster-Based Predictor model involves two parts: the clusterer clusters the neighbor nodes, and the cluster containing the local node is fed to the predictor as its training set, as follows:

■ Clustering of neighbor nodes by the clusterer: a clusterer is chosen and applied to the feature vectors of the neighbor nodes. It partitions all neighbor nodes into clusters, and the nodes in each cluster have similar online/offline habits;

■ Prediction with the local node's cluster as the predictor's training set: all nodes of the cluster containing the local node are used as the training set, and this complete training set improves prediction accuracy.

2) Aggregation storage strategy based on predicted information: when the local node needs to store a file, it collects the online information of all its neighbor nodes (the online information here comprises the historical online information and the future online information predicted from it as described above). When there are few first-level neighbor nodes, it can go on to collect second-level neighbor nodes. The local node then applies a replica selection strategy based on this online information to choose replica nodes and store the file. The advantage of this sub-strategy is as follows. Because the future online status of nodes is considered when replicas are chosen, the likelihood that the replica nodes are online at the same time as the local node when it later retrieves the stored file is greatly increased, which improves the reliability and availability of the replicas. Storing files on nodes whose online/offline behavior is similar to that of the local node improves reliability and availability while avoiding the heavy bandwidth consumption of replica migration, so the performance of P2P storage is improved.

Embodiments of the invention are explained in detail below with reference to the accompanying drawings.

Fig. 1-a is a schematic structural diagram of the storage strategy based on P2P online information aggregation of the invention. As shown in Fig. 1-a, the storage strategy comprises two parts: prediction of P2P node online information, and a storage strategy based on the predicted information. The prediction part collects the historical online information of the neighbor nodes and then predicts the future online information of the node. The storage strategy part aggregates the collected online information of the neighbor nodes (comprising historical and predicted online information), then applies a replica selection strategy based on this online information to choose the specified number of replica nodes from all neighbor nodes, and finally stores the file on the selected replica nodes.

Fig. 1-b is a schematic flow chart of the storage strategy based on P2P online information aggregation of the invention. The replica storage method based on P2P online information aggregation of the invention selects, as file replica storage nodes, neighbor nodes that have a high probability of being online at the same time as the local node, and comprises:

Step 101: each node in the P2P storage system collects the historical online information of its neighbor nodes, and the collecting node predicts its own future online information from this historical online information.

Step 102: when a local node in the P2P storage system needs to store file replicas, it uses the historical online information and the future online information of its neighbor nodes to obtain a set number of neighbor nodes that have a high probability of being online at the same time as the local node, and stores the file replicas on them.

Step 101 can be performed by the search and prediction subsystem of Fig. 1-a, and step 102 by the replica storage node selection subsystem of Fig. 1-a.

Fig. 2 is a schematic structural diagram of the Cluster-Based Predictor in the storage strategy based on P2P online information aggregation of the invention. As shown in Fig. 2, the input of the Cluster-Based Predictor is the set of feature vectors extracted from the collected historical online information of the neighbor nodes. The feature vectors first enter the clusterer for clustering, and the result is then fed to the predictor for prediction. The Cluster-Based Predictor differs from other prediction models in that a clusterer is placed in front of the predictor: it filters out, from all neighbor nodes, the nodes whose behavior is similar to the local node's and uses them as the predictor's training set, making the training set more effective and thereby improving prediction accuracy.

Fig. 3 is a flow chart of online information prediction for a node in the storage strategy based on P2P online information aggregation of the invention. As shown in Fig. 3, the steps are as follows:

1) Collect the historical online information of the neighbor nodes. When there are few neighbor nodes, continue by traversing the second-level neighbor nodes; ensuring a certain number of neighbor nodes avoids the uncertainty of a small sample and improves the clustering result;

2) Extract a feature vector from the collected historical online information of each neighbor node. The dimension of the feature vector is set mainly according to the overall online/offline pattern of the nodes in the actual P2P storage system. For example, if the online information sampling interval of a P2P storage system is 20 minutes, the number of samples in one day is 24*60/20=72. If the overall online/offline pattern of the system's nodes has a period of one day, the dimension of the feature vector can be set to 72; if the period is 2 days, the dimension can be set to 144. Once the dimension is determined, the feature vector can be extracted. For example, if the collected historical online information of a neighbor node covers 7 days (a sample sequence of length 7*72=504), its feature vector is obtained by computing, over these 7 days, the probability that the node is online at each corresponding time point of the day (with the dimension set to 72);

3) With the feature vectors available, prediction is performed by the Cluster-Based Predictor, which contains a clusterer and a predictor. The purpose of the clusterer is to select, from all neighbor nodes, the nodes whose behavior is similar to the local node's as the input of the predictor, making the predictor's training set more complete and thereby improving prediction accuracy. For example, the clusterer can be K-Means or the like, and the predictor can use N-Gram or the like;

4) After prediction by the cluster-based prediction module (the Cluster-Based Predictor model), the future online status of the node is obtained.
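The cluster-then-predict flow of steps 3) and 4) can be sketched as follows, using K-Means as the clusterer and an N-Gram model as the predictor, the two examples named in the text. Everything else is an assumption made for illustration: the function names, the choice of N = 3, seeding the centroids with the first k vectors, the tie-breaking rule, and the representation of histories as 0/1 sample lists.

```python
import math
from collections import Counter


def kmeans(vectors, k, iterations=20):
    """Plain k-means. Centroids are seeded with the first k vectors, an
    assumption made so the sketch is deterministic. Returns one label per vector."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [min(range(len(centroids)),
                      key=lambda c: math.dist(v, centroids[c]))
                  for v in vectors]
        # Update step: each centroid becomes the mean of its members.
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def train_ngram(sequences, n=3):
    """Count every run of n consecutive states in the training sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - n + 1):
            counts[tuple(seq[i:i + n])] += 1
    return counts


def predict_next(counts, context):
    """Predict the state (1 = online, 0 = offline) seen most often after
    `context`; on a tie, assume the node keeps its latest state."""
    online = counts[context + (1,)]
    offline = counts[context + (0,)]
    if online == offline:
        return context[-1]
    return 1 if online > offline else 0


def cluster_based_predict(local_id, histories, features, k=2, n=3):
    """Cluster the node set, train on the local node's cluster, and predict
    the local node's next state. Requires n >= 2."""
    ids = list(histories)
    labels = kmeans([features[i] for i in ids], k)
    local_label = labels[ids.index(local_id)]
    training = [histories[i] for i, lab in zip(ids, labels) if lab == local_label]
    counts = train_ngram(training, n)
    context = tuple(histories[local_id][-(n - 1):])
    return predict_next(counts, context)
```

In this sketch the node set passed in contains the local node together with its neighbors, as in step 1-1 of the summary. Seeding with the first k vectors is fragile when those vectors coincide; a production clusterer would need a more careful initialization.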

Fig. 4 is a flow chart of the aggregation storage strategy based on node prediction information in the storage strategy based on P2P online information aggregation of the invention. As shown in Fig. 4, the steps are as follows:

1) When the local node needs to store a file, it first collects the online information of all its neighbor nodes, comprising the historical online information and the future online information predicted from it. When there are few neighbor nodes, it can go on to traverse and collect the online information of second-level neighbor nodes. Ensuring a certain number of neighbor nodes avoids the uncertainty of a small sample and gives the subsequent replica selection a larger range to choose from;

2) With the online information of the neighbor nodes available, replicas are selected on the basis of this information. For example, if the number of replicas is k, the online information can be used to find the top k neighbor nodes with the longest online time (counted from the historical online information) as replica nodes; alternatively, the k nodes with the most similar online behavior (computed as the Euclidean distance between feature vectors, a smaller distance meaning more similar behavior) can be taken as replica nodes, and so on. The replicas chosen by these selection strategies share one feature: when the local node is online, the corresponding replica nodes are also online with high probability. Consequently, when the local node retrieves the file, the chance that all replica nodes are offline is very small, which guarantees the reliability and availability of storage. In other words, to improve the reliability and availability of data replicas, the local node's replica selection strategy is based on the online information of all neighbor nodes. Euclidean distance is just one way of finding neighbor nodes whose behavior is similar to the local node's; other measures, such as the longest continuous online time, can also characterize the behavioral similarity between a neighbor node and the local node. What matters here is only that the replica selection strategy is based on online information.

3) Finally, store the file on the replica nodes selected in 2).
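The two example selection rules in step 2) can be sketched as below. The function names and the dictionary-based inputs are assumptions for illustration; online time is counted here as the number of online samples in the history, which is one possible reading of the parenthetical in the text.

```python
import math


def select_by_similarity(local_vector, neighbor_vectors, k):
    """Pick the k neighbors whose feature vectors have the smallest Euclidean
    distance to the local node's feature vector.

    neighbor_vectors : dict of node id -> feature vector (assumed layout)
    """
    ranked = sorted(neighbor_vectors,
                    key=lambda nid: math.dist(local_vector, neighbor_vectors[nid]))
    return ranked[:k]


def select_by_online_time(neighbor_histories, k):
    """Pick the k neighbors with the most online samples in their history.

    neighbor_histories : dict of node id -> list of 0/1 samples (assumed layout)
    """
    ranked = sorted(neighbor_histories,
                    key=lambda nid: sum(neighbor_histories[nid]),
                    reverse=True)
    return ranked[:k]


# Hypothetical data for three neighbors of one local node.
local = [1.0, 1.0, 0.0, 0.0]
vectors = {"a": [1.0, 1.0, 0.0, 0.0], "b": [0.0, 0.0, 1.0, 1.0], "c": [1.0, 0.5, 0.0, 0.0]}
histories = {"a": [1, 1, 0, 0], "b": [1, 1, 1, 0], "c": [0, 0, 0, 1]}
print(select_by_similarity(local, vectors, k=2))
print(select_by_online_time(histories, k=2))
```

The two rules can disagree: a neighbor that is online for a long time but at different hours from the local node ranks high under the second rule and low under the first. The text leaves the choice open, requiring only that selection be based on online information.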

The storage strategy based on P2P online information aggregation of the invention uses the historical online information of neighbor nodes to predict the future online information of a node. The Cluster-Based Predictor model combines a clusterer and a predictor; the purpose of the clusterer is to select, from all neighbor nodes, the nodes whose behavior is similar to the local node's as the input of the predictor, making the predictor's training set more complete and thereby improving prediction accuracy. Then, when the local node needs to store a file, it collects the online information of the neighbor nodes (both historical and predicted) and applies a replica selection strategy based on this online information to select replica nodes and store the file. Because these replica nodes are online with high probability whenever the local node is online, the chance that all replica nodes are offline when the local node retrieves the file is very small, which guarantees the reliability and availability of storage. The invention thus enables the system to maintain high reliability and availability even without replica migration, avoids the heavy bandwidth consumption caused by frequent replica migration, and improves system performance.

Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the invention, not to limit it. Although the invention has been described in detail with reference to the embodiments, those of ordinary skill in the art should understand that modifications or equivalent replacements of the technical solution of the invention that do not depart from its spirit and scope shall all fall within the scope of the claims of the invention.

Claims

1. A copy storage method based on P2P online-information aggregation, the method being used to select neighbor nodes that have a high probability of being online at the same time as a local node as file copy storage nodes, the method comprising:

Step 1: each node in the P2P storage system collects the historical online information of its neighbor nodes, and the collecting node predicts its future online information according to the historical online information;

Step 2: when a local node in the P2P storage system needs to store file copies, the local node uses the historical online information and the future online information of its neighbor nodes to obtain a set number of neighbor nodes that have a higher probability of being online at the same time as the local node, on which to store the file copies;

2. The copy storage method based on P2P online-information aggregation according to claim 1, characterized in that step 1 further comprises the following sub-steps:

Step 1-1: a node in the P2P storage system collects the historical online information of its neighbor nodes, and the collecting node and its neighbor nodes form a node set;

Step 1-2: the feature vector of the historical online information of each node in the node set is extracted;

Step 1-3: the node set is clustered according to the extracted feature vectors to obtain the cluster to which the collecting node belongs, and all nodes of that cluster are then used as the training set for prediction, thereby predicting the future online information of the collecting node;

Wherein, in step 1-1: if the P2P storage system is small, so that the number of first-level neighbor nodes found by a node is small, the node continues to collect the historical information of second-level neighbor nodes, and so on, until the accumulated number of neighbor nodes searched reaches a certain scale;

If the P2P storage system is large, only the historical information of first-level neighbor nodes is collected.
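The level-by-level collection in step 1-1 can be sketched as a breadth-first expansion that stops once enough neighbors have been accumulated. The sketch assumes the overlay is available as an adjacency mapping and that "a certain scale" is a caller-supplied threshold `min_count`; both are assumptions made for illustration, since the claim does not fix a data structure or a threshold value.

```python
from typing import Dict, List, Set


def collect_neighbors(
    graph: Dict[str, List[str]], local: str, min_count: int
) -> List[str]:
    """Collect neighbors level by level until at least min_count are found.

    Level 1 is always searched in full; deeper levels are searched only while
    the accumulated number of neighbors is still below min_count.
    """
    seen: Set[str] = {local}
    collected: List[str] = []
    frontier = [local]
    while frontier and len(collected) < min_count:
        next_frontier: List[str] = []
        for node in frontier:
            for neighbor in graph.get(node, []):
                if neighbor not in seen:
                    seen.add(neighbor)
                    collected.append(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier  # move one level further out
    return collected
```

In a large system the first level alone usually satisfies the threshold, so the loop runs once; in a small system it keeps widening until the threshold is met or no unvisited nodes remain.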

3. The copy storage method based on P2P online-information aggregation according to claim 2, characterized in that step 1-2 sets the dimension of the feature vector according to the overall online/offline pattern of the P2P nodes in the actual P2P storage system.

4. The copy storage method based on P2P online-information aggregation according to claim 1, characterized in that obtaining the neighbor nodes that have a higher probability of being online at the same time as the local node means finding the nodes whose online behavior is most similar to that of the local node and using them as copy storage nodes;

5. The copy storage method based on P2P online-information aggregation according to claim 1 or 4, characterized in that, in step 2, when the local node needs to store file copies:

6. A copy storage system based on P2P online-information aggregation, the system being used to select neighbor nodes that have a high probability of being online at the same time as a local node as file copy storage nodes, the system comprising:

A search and prediction subsystem arranged on each node of the P2P storage system, which collects the historical online information of the neighbor nodes of the node on which it resides and predicts the future online information of that node according to the historical online information; and

A copy storage node selection subsystem arranged on each node of the P2P storage system, used so that, when the node on which the subsystem resides has a file for which copies need to be stored, this local node uses the historical online information and the future online information of its neighbor nodes to obtain a set number of neighbor nodes that have a higher probability of being online at the same time as this local node, on which to store the file copies;

7. The copy storage system based on P2P online-information aggregation according to claim 6, characterized in that the search and prediction subsystem further comprises:

A historical information search module, used to collect the historical online information of the neighbor nodes of the node on which the module resides;

A feature vector extraction module, used to extract the feature vector of each piece of historical online information; and

A cluster-based prediction module, which uses the extracted feature vectors to cluster the neighbor nodes, obtains the cluster to which the node hosting the historical information search module belongs, and then uses all nodes in that cluster as the training set for prediction, thereby predicting the future online information of the node hosting the historical information search module;

Wherein the historical information search module can dynamically adjust the number of neighbor levels it searches: if the P2P storage system is small, so that the number of neighbor nodes is small, multiple levels of neighbor nodes need to be searched in turn;

If the P2P storage system is large, only the historical information of first-level neighbor nodes is collected.
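The feature extraction, clustering, and prediction modules above can be sketched as follows. This is only an illustration under stated assumptions: the feature vector folds a 0/1 online history into `dim` periodic buckets, the clustering step is replaced by a simple distance threshold `radius` around the local node (a stand-in, not the clustering algorithm of the invention), and the predictor averages the cluster's feature vectors and thresholds the result. All names and parameters are hypothetical.

```python
from typing import Dict, List, Sequence


def feature_vector(history: Sequence[int], dim: int) -> List[float]:
    """Fold a 0/1 history into dim periodic buckets of online fractions."""
    totals = [0] * dim
    counts = [0] * dim
    for index, online in enumerate(history):
        totals[index % dim] += online
        counts[index % dim] += 1
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]


def cluster_of(
    local: str, features: Dict[str, List[float]], radius: float
) -> List[str]:
    """Stand-in clustering: nodes within Euclidean distance radius of the local node."""
    reference = features[local]

    def distance(vector: List[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(reference, vector)) ** 0.5

    return [name for name, vector in features.items() if distance(vector) <= radius]


def predict_online(
    cluster: List[str], features: Dict[str, List[float]], threshold: float = 0.5
) -> List[int]:
    """Predict one 0/1 value per bucket from the cluster's average online fraction."""
    dim = len(features[cluster[0]])
    averages = [
        sum(features[name][i] for name in cluster) / len(cluster) for i in range(dim)
    ]
    return [1 if p >= threshold else 0 for p in averages]
```

Because the local node is always within distance 0 of itself, the cluster is never empty, so the predictor always has at least one training node.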

8. The copy storage system based on P2P online-information aggregation according to claim 7, characterized in that the cluster-based prediction module further comprises:

9. The copy storage method based on P2P online-information aggregation according to claim 6, characterized in that obtaining the neighbor nodes that have a higher probability of being online at the same time as the local node means finding the nodes whose online behavior is most similar to that of the local node and using them as copy storage nodes;

10. The copy storage method based on P2P online-information aggregation according to claim 6, characterized in that the copy storage node selection subsystem further comprises the following modules:

A judgment module, used to judge whether the total number of nodes in the neighbor node table of the local node is less than the set number of replica nodes and, if so, to feed the result back to the cumulative search module;

A cumulative search module, which, on receiving the feedback from the judgment module, recursively collects the second-level neighbor node table of the local node, and so on, traversing several levels of neighbor nodes until the total number of neighbor nodes collected reaches a certain scale, thereby providing a larger node selection space for copy selection; and

A module for determining the final copy storage nodes, used to select the set number of replica nodes from all the collected neighbor nodes and to store the file copies on them.