WO2012136091A2 - 一种对等网络中数据迁移的方法及系统 - Google Patents

一种对等网络中数据迁移的方法及系统

Info

Publication number
WO2012136091A2
WO2012136091A2 (PCT/CN2012/072071)
Authority
WO
WIPO (PCT)
Prior art keywords
data
node
version
migration
incremental
Prior art date
Application number
PCT/CN2012/072071
Other languages
English (en)
French (fr)
Other versions
WO2012136091A3 (zh)
Inventor
王炜
陶全军
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司 filed Critical 中兴通讯股份有限公司
Publication of WO2012136091A2 publication Critical patent/WO2012136091A2/zh
Publication of WO2012136091A3 publication Critical patent/WO2012136091A3/zh

Links

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • H04L67/104Peer-to-peer [P2P] networks

Definitions

  • the present invention relates to Peer-to-Peer (P2P) network technology, and in particular to a method and system for data migration in a P2P network.
  • P2P Peer-to-Peer
  • P2P network technology has developed rapidly as a distributed Internet technology.
  • unlike traditional Client/Server (C/S) technology, all nodes in a P2P network can act as both a server and a client.
  • data in a P2P network is stored on the nodes in a distributed manner, and services are implemented by the nodes in a distributed fashion.
  • because the nodes in a P2P network store the data necessary for the network to operate, data migration is required between nodes when a node joins or exits the network or when load balancing is performed; that is, the data saved by one node is copied to another node to ensure that no data is lost and that data integrity is maintained.
  • for example, in the widely used REsource LOcation And Discovery (RELOAD) P2P network protocol, data is organized in the form of resources.
  • the network assigns each resource a resource identifier that is unique across the whole network and is used to look up the data under that resource identifier.
  • each node can be responsible for the data storage and processing of one or more resource identifiers.
  • when a new node joins, the node originally responsible sends the data under the resource identifiers that the joining node should take over to the joining node.
  • when a node exits, the exiting node hands its data over so that every resource identifier in the network always has a corresponding node responsible for its data.
  • Step 101: When the joining node wants to join the P2P network, it sends an access request to the boot node and establishes a connection with the boot node;
  • Step 102: Under the guidance of the boot node, the joining node establishes a connection with the receiving node;
  • here, the receiving node refers to the node currently responsible for storing the data that the joining node needs to obtain.
  • Step 103 The joining node sends a join request to the receiving node, requests to access the P2P network, and hopes to take over the data under the partial resource identifier of the receiving node;
  • Step 104 The receiving node sends the data under the resource identifier that needs to be transferred to the joining node.
  • Step 105: If the amount of data is relatively large, the receiving node may send the data in multiple batches until all of the data has been sent.
  • Step 106 After the data is sent, the receiving node notifies the joining node to update the P2P network route, and the joining node is officially responsible for storing and processing the data.
  • Step 107: The joining node sends routing updates to the other relevant nodes, notifying them that it has formally joined the P2P network.
  • in the existing migration procedure, all of the data requested by the joining node has to be sent to it over the network. Even when all of the data is sent over a private network (for example an operator's dedicated network, whose connection speed is far faster than that of the Internet), the large data volume still places a heavy burden on the network.
  • moreover, if the nodes of the P2P network are distributed across the Internet, the connection speed between nodes may be slow; in that case the data requested by the joining node is prone to transmission errors and the transfer takes a long time. Since the joining node can formally join the P2P network only after it has obtained all of the data, completing the node-join or load-balancing process takes a long time.
  • the main purpose of the present invention is to provide a method and system for data migration in a P2P network that can greatly reduce the data volume and the time of data migration in a P2P network, thereby improving the speed and reliability of data migration.
  • the present invention provides a method for data migration in a P2P network. The method includes: when data migration is required, the node that needs to migrate data out compares the current data version of the data to be migrated with the corresponding data version of the migrating-in node, determines the incremental data, and sends the incremental data to the migrating-in node;
  • the migrating-in node recovers the latest version of the data from the incremental data and the data it already stores.
  • after the incremental data is sent to the migrating-in node, the method further includes: the node that migrates data out saves its own current version of the migrated data.
  • data migration is required: when a node joins the peer-to-peer network; when a node exits the peer-to-peer network; when a node in the peer-to-peer network is overloaded and the load needs to be balanced; when the load needs to be balanced for management reasons; or when a node needs to be added as a backup of the data of a node in the peer-to-peer network.
  • the node that needs to migrate data compares the current data version of the migrated data with the data version corresponding to the migrated node, and is:
  • the node that needs to migrate data compares the data version information of its current migration data with the data version information of the migrated node.
  • the data version information is: a data version number, and/or a modified time stamp, and/or summary information of the data.
  • the node that needs to migrate data compares the current data version of the migrated data with the data version corresponding to the migrated node, and is:
  • the node that needs to migrate data compares the hash value of its current migration data with the hash value corresponding to the migrated node.
  • the node that needs to migrate data compares the hash value of its current migrated data with the hash value corresponding to the migrating-in node by: using a Merkle-tree hash comparison, or using a variable-granularity hash comparison.
  • the ingress node recovers the latest version of the data according to the incremental data and the data stored by itself, and is:
  • the inbound node uses the data content in the incremental data to modify the corresponding data content in the data stored by itself, and obtain the latest version of the data.
  • the present invention also provides a system for data migration in a P2P network, the system comprising: a first node, and a second node;
  • the first node is configured to compare the current data version of the migrated data with the data version corresponding to the second node, determine the incremental data, and send the incremental data to the second node;
  • the second node is configured to recover the latest version of the data according to the incremental data and the data stored by itself after receiving the incremental data sent by the first node.
  • the number of the second nodes is one or more.
  • the first node is further configured to save the migrated data of the current version after sending the incremental data to the second node.
  • with the method and system for data migration in a P2P network provided by the present invention, when data migration is required, the node that needs to migrate data out compares its current data version with the data version of the migrating-in node, determines the incremental data, and sends the incremental data to the migrating-in node; the migrating-in node recovers the latest version of the data from the incremental data and the data it already stores. The node migrating data out does not need to send all of the data to be migrated; it sends only the incremental data, which greatly reduces the data volume and the time of data migration in the P2P network and thereby improves the speed and reliability of data migration.
  • in addition, after sending the incremental data, the node that migrates data out saves its own current version of the migrated data, so that when that node becomes responsible for the corresponding data again, the amount of data to be transmitted is reduced; this further improves the speed and reliability of data migration and keeps the network operating normally.
  • FIG. 1 is a schematic diagram of a migration process of data in a node joining process in the prior art
  • FIG. 2 is a schematic flowchart of a method for data migration in a P2P network according to the present invention
  • FIG. 3 is a schematic flowchart of the method for implementing data migration in Embodiment 1;
  • FIG. 4 is a schematic flowchart of a method for implementing data migration in Embodiment 2;
  • FIG. 5 is a schematic flowchart of a method for implementing data migration in Embodiment 3.
  • FIG. 6 is a schematic diagram showing how the hash values of a Merkle tree are generated;
  • FIG. 7 is a schematic structural diagram of a system for data migration in a P2P network according to the present invention.
  • Step 201: When data migration is required, the node that needs to migrate data out compares the current data version of the data to be migrated with the corresponding data version of the migrating-in node, determines the incremental data, and sends the incremental data to the migrating-in node;
  • here, data migration is required: when a node joins the P2P network; when a node exits the P2P network; when a node in the P2P network is overloaded and the load needs to be balanced; when the load needs to be balanced for management reasons; or when a node needs to be added as a backup of another node's data. There may be more than one migrating-in node;
  • when comparing, the node that migrates data out may determine the incremental data by comparing the current data version information of the data to be migrated with the corresponding data version information of the migrating-in node, or the incremental data may be determined by comparing hash values;
  • the data version information is information from which the node migrating data out can determine the incremental data, given that information and its own related records; it may specifically be a data version number, and/or a modification timestamp, and/or summary information of the data; when the migrating-in node has no relevant data version information, the data version information may simply be a character indicating null;
  • the specific process of determining the incremental data by hash-value comparison may follow an existing hash-comparison procedure; the hash comparison may specifically be a Merkle-tree hash comparison, a variable-granularity hash comparison, or the like;
  • the incremental data is the data on which the node migrating data out and the migrating-in node are inconsistent, and may specifically be: all data under the modified resource identifiers, all data in a particular resource data segment, log information of the data modifications, or the like;
  • after the incremental data is sent to the migrating-in node, the method may further include: the node that migrates data out saves its own current version of the migrated data.
  • Step 202: The migrating-in node recovers the latest version of the data from the incremental data and the data it already stores;
  • specifically, the migrating-in node uses the data content in the incremental data to modify the corresponding content in the data it stores, obtaining the latest version of the data. For example, if the incremental data contains: the content of the 100th data item is 123456, the migrating-in node changes the content of its own 100th data item to 123456; as another example, if the incremental data contains log information recording that the 100th data item was changed to 123456, the migrating-in node modifies the content of its own 100th data item to 123456 according to that log information.
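  • As a rough illustration of step 202, the following Python sketch assumes the stored data is a simple mapping from record id to content and that the incremental data arrives either as new values or as modification-log entries; the function names and data layout are illustrative assumptions, not the protocol's actual format.

```python
# Illustrative sketch only: a minimal model of step 202, assuming the migrated
# data is a mapping from record id to content and the incremental data is
# either direct replacement values or modification-log entries.

def apply_incremental(stored, incremental):
    """Return the latest version of the data.

    stored      -- dict mapping record id -> content (data the migrating-in
                   node already holds)
    incremental -- dict mapping record id -> new content sent by the node
                   migrating data out
    """
    latest = dict(stored)          # keep the local copy intact
    latest.update(incremental)     # overwrite only the changed records
    return latest


def apply_modification_log(stored, log_entries):
    """Replay modification-log entries, e.g. ('set', 100, '123456')."""
    latest = dict(stored)
    for op, record_id, value in log_entries:
        if op == "set":
            latest[record_id] = value
        elif op == "delete":
            latest.pop(record_id, None)
    return latest


if __name__ == "__main__":
    stored = {99: "aaaaaa", 100: "000000"}
    # The example from the text: record 100 changed to 123456.
    print(apply_incremental(stored, {100: "123456"}))
    print(apply_modification_log(stored, [("set", 100, "123456")]))
```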
  • Embodiment 1:
  • the application scenario of this embodiment is: a node joining process.
  • in the following description, the node requesting to join is referred to as the joining node, and the node currently responsible for storing the data that the joining node needs to obtain is referred to as the receiving node.
  • the method for implementing data migration in this embodiment includes the following steps: Step 301: The joining node wants to join the P2P network, sends an access request to the guiding node, and establishes a connection with the guiding node.
  • Step 302 Under the guidance of the boot node, the joining node establishes a connection with the receiving node.
  • Step 303 The joining node sends a join request to the receiving node, requests to join the P2P network, and hopes to take over the data under the partial resource identifier of the receiving node;
  • the joining node can use the prior art to know which part of the resource identification of the receiving node should be taken over.
  • Step 304: For the resource identifiers requested by the joining node, the receiving node requests the corresponding locally stored data version information from the joining node.
  • Step 305: The joining node sends the corresponding data version information to the receiving node;
  • here, in step 303 the joining node may instead send the locally stored data version information corresponding to the requested resource identifiers directly to the receiving node; in that case, steps 304-305 need not be performed;
  • the data version information is information from which the receiving node can determine the incremental data, given the version information and the related information it holds about the data; it may specifically be: a version number of the data, and/or a modification timestamp, and/or summary information of the data; when the joining node has no data at all under the requested resource identifiers, the data version information may be a character indicating null.
  • Step 306: The receiving node compares the data version information it stores with the data version information of the joining node and determines the incremental data that needs to be provided to the joining node;
  • here, when comparing, if the data version information of the joining node is too old for the receiving node to perform the comparison, the receiving node can exchange messages with the joining node and perform a hash-value comparison until the incremental data is determined; when the data version information is a character indicating null, the incremental data is all of the data under the resource identifiers requested by the joining node.
  • Step 307 The receiving node sends the incremental data to the joining node.
  • here, when the amount of incremental data is relatively large, the data may be sent in multiple batches; the specific processing may follow existing procedures.
  • Step 308: After the incremental data has been sent, the joining node recovers the latest version of the data from the locally stored data and the received incremental data; at the same time, the receiving node notifies the joining node to update the P2P network route, and the joining node becomes formally responsible for the storage and processing of the data under the requested resource identifiers;
  • here, the receiving node stores the migrated-out current-version data in a local cache for later use; when storing, it may keep only part of the data, for example the quasi-static data, i.e. data under a resource identifier that changes little over time. In addition, the receiving node may delete some stale data as needed and replace that stale data with the current version of the migrated data, for example deleting data that is used relatively infrequently, based on usage frequency, in order to save local cache resources.
  • Step 309: The joining node sends routing updates to the other relevant nodes, notifying them that it has formally joined the P2P network.
  • it should be noted that the process by which nodes determine the incremental data in the node-exit procedure and in the load-balancing procedure is similar to the process provided in this embodiment, except that the joining node and the receiving node are replaced by the target node and the source node of the data migration, respectively, and the messages are adapted accordingly for the different procedures.
  • for example, in the node-exit procedure the target node of the data migration is the newly responsible node and the source node is the exiting node; in the load-balancing procedure the target node of the data migration is a lightly loaded node and the source node is the overloaded node.
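  • The cache kept by the receiving node in step 308 could, for illustration, look like the sketch below; the usage-frequency counter and the eviction rule (drop the least frequently used entry) are assumptions for the example rather than behaviour specified by the text.

```python
# Illustrative sketch of the step 308 caching behaviour: the receiving node
# keeps migrated-out (e.g. quasi-static) data in a local cache and, when space
# is needed, evicts the entry with the lowest use count first.

class MigratedDataCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}        # resource id -> content
        self.use_count = {}      # resource id -> how often it was read

    def store(self, resource_id, content):
        if resource_id not in self.entries and len(self.entries) >= self.capacity:
            # Evict the stalest entry: lowest use count.
            victim = min(self.entries, key=lambda rid: self.use_count.get(rid, 0))
            self.entries.pop(victim)
            self.use_count.pop(victim, None)
        self.entries[resource_id] = content
        self.use_count.setdefault(resource_id, 0)

    def read(self, resource_id):
        if resource_id in self.entries:
            self.use_count[resource_id] += 1
            return self.entries[resource_id]
        return None


if __name__ == "__main__":
    cache = MigratedDataCache(capacity=2)
    cache.store("res:1", "subscription-data")
    cache.store("res:2", "number-data")
    cache.read("res:1")
    cache.store("res:3", "new-data")     # res:2 (never read) is evicted
    print(sorted(cache.entries))
```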
  • Embodiment 2:
  • the application scenario of this embodiment is: a node joining process in which the requested data includes both quasi-static data and dynamic data; the quasi-static data is data under a resource identifier that changes little over time, and the dynamic data is data under a resource identifier that changes frequently.
  • in practice, whether data is quasi-static or dynamic can be decided by the specific service. For example, in a P2P Internet Protocol (IP) telephony service, all data of a user can be stored under the same resource identifier; the user's number information and service subscription information change little and can therefore be treated as quasi-static data, while the user's online status and location information change quickly and can be treated as dynamic data.
  • in the following description, the node requesting to join is referred to as the joining node, and the node currently responsible for storing the data that the joining node needs to obtain is referred to as the receiving node.
  • the method for implementing data migration in this embodiment, as shown in FIG. 4, includes the following steps:
  • Step 401 The joining node wants to join the P2P network, sends an access request to the guiding node, and establishes a connection with the guiding node.
  • Step 402 Under the guidance of the boot node, the joining node establishes a connection with the receiving node.
  • Step 403 The joining node sends a join request to the receiving node, requests to join the P2P network, and hopes to take over the data under the partial resource identifier of the receiving node;
  • the joining node can use the prior art to know which part of the resource identification of the receiving node should be taken over.
  • Step 404 The receiving node requests the joining node to send the locally stored data version information of the quasi-static data under the requested resource identifier.
  • here, the data version information is information from which the receiving node can determine the incremental data; it may specifically be: a version number of the data, and/or a modification timestamp, and/or summary information of the data; when the joining node has no data at all under the requested resource identifiers, the data version information may be a character indicating null.
  • Step 405: The joining node sends the locally stored data version information of the quasi-static data under the requested resource identifiers to the receiving node;
  • here, in step 403 the joining node may instead send the locally stored data version information of the quasi-static data under the requested resource identifiers directly to the receiving node; in that case, steps 404-405 need not be performed;
  • the data version information is as described in step 404: a version number of the data, and/or a modification timestamp, and/or summary information of the data, or a character indicating null when the joining node has no data under the requested resource identifiers.
  • Step 406: The receiving node compares its own quasi-static data version information with that of the joining node and determines the incremental data of the quasi-static data that needs to be provided to the joining node;
  • here, when comparing, if the data version information of the joining node is too old for the receiving node to perform the comparison, the receiving node can exchange messages with the joining node and perform a hash-value comparison until the incremental data is determined;
  • the incremental data is all quasi-static data under the resource identifier requested by the joining node.
  • Step 407 The receiving node sends the incremental data of the quasi-static data to the joining node.
  • Step 408 After the incremental data transmission of the quasi-static data is completed, the receiving node sends all the dynamic data to the joining node.
  • Step 409 After the data is sent, the joining node recovers the latest version of the quasi-static data according to the locally stored quasi-static data and the received incremental data, and directly uses the received dynamic data as the latest version of the dynamic data.
  • the receiving node notifies the joining node to update the P2P network route, and the joining node is officially responsible for storing and processing the data under the requested resource identifier;
  • here, the receiving node stores the migrated-out quasi-static data in its local cache for later use.
  • at the same time, the receiving node can directly delete the migrated-out dynamic data.
  • Step 410: The joining node sends routing updates to the other relevant nodes, notifying them that it has formally joined the P2P network.
  • it should be noted that the process by which nodes determine the incremental data in the node-exit procedure and in the load-balancing procedure is similar to the process provided in this embodiment, except that the joining node and the receiving node are replaced by the target node and the source node of the data migration, respectively, and the messages are adapted accordingly for the different procedures.
  • for example, in the node-exit procedure the target node of the data migration is the newly responsible node and the source node is the exiting node; in the load-balancing procedure the target node of the data migration is a lightly loaded node and the source node is the overloaded node.
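  • For illustration, the split between quasi-static and dynamic data in this embodiment might be planned as in the sketch below, which assumes that quasi-static records carry a version while dynamic records do not; the record layout and names are assumptions made for the example.

```python
# Illustrative sketch of the Embodiment 2 idea: quasi-static data is sent
# incrementally (only records whose version differs), while dynamic data is
# always sent in full.

def plan_transfer(receiving_quasi_static, receiving_dynamic, joining_versions):
    """
    receiving_quasi_static -- dict: record id -> (version, content) on the
                              receiving node
    receiving_dynamic      -- dict: record id -> content (changes too often
                              to be worth versioning)
    joining_versions       -- dict: record id -> version already held by the
                              joining node (empty dict plays the role of the
                              "null" version information)
    """
    incremental = {
        rid: content
        for rid, (version, content) in receiving_quasi_static.items()
        if joining_versions.get(rid) != version
    }
    return incremental, dict(receiving_dynamic)


if __name__ == "__main__":
    quasi = {"subscription:alice": (3, "plan-B"), "number:alice": (1, "+86-1000")}
    dynamic = {"location:alice": "cell-17", "online:alice": "yes"}
    joining = {"subscription:alice": 3}          # already up to date
    inc, dyn = plan_transfer(quasi, dynamic, joining)
    print(inc)   # only number:alice is sent incrementally
    print(dyn)   # all dynamic data is sent as-is
```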
  • Embodiment 3:
  • the application scenario of this embodiment is: the data version information is a data version number. In the following description, the node that migrates data out is called the source node, and the node that the data is migrated into is called the target node.
  • the method for implementing data migration in this embodiment, as shown in FIG. 5, includes the following steps:
  • Step 501 When data migration is required, the target node sends the locally stored data version number to the source node.
  • here, the format of the data version number may be a sequentially increasing integer, a major-version-plus-minor-version format, a timestamp, or the like. For example, if the format is a sequentially increasing integer, the data version number may be 1, 2, 3, and so on; if the format is a major version number plus a minor version number, the data version number may be 1.1, 1.2, 1.3, and so on.
  • Step 502: After receiving the data version number sent by the target node, the source node compares the target node's data version number with its own current data version number and, using the version modification information between the two versions, determines the incremental data that needs to be provided to the target node;
  • here, the source node records modification information in advance under its current data version number; the modification information may specifically be a record of data modification operations, a record of the modified resource identifiers, or the like, and from it the source node can derive the version modification information between its own version and the target node's version;
  • when determining the incremental data from the version modification information between the two versions, suppose for example that the version modification information contains the specific operation that modified the 100th data item; from this information the source node determines that the 100th data item needs to be sent to the target node, i.e. the 100th data item is one piece of the incremental data;
  • when the source node compares the target node's data version number with its own current data version number, if it finds that the target node's data version number is so old that it no longer holds version modification information between the two versions, the source node may send all of the data requested by the target node to the target node as the incremental data, or the source node may exchange messages with the target node, determine the incremental data by hash-value comparison, and send it to the target node; the process of determining incremental data by hash-value comparison may follow existing procedures.
  • Step 503 The source node sends the incremental data to the target node, and after receiving the incremental data, the target node recovers the latest version of the data according to the locally stored data and the incremental data.
  • the source node may send the version modification information between the versions to the target node together, so that the target node can better recover the latest version of the data;
  • for the example above, after receiving the incremental data the target node replaces its own 100th data item with the corresponding data in the incremental data, and so on, thereby recovering the latest version of the data.
  • Step 504: The target node increments its data version number accordingly, to describe the difference between the data versions stored by itself and by the source node;
  • the operation of modifying the data version number may also be performed by the source node in step 503.
  • the function of the data version number is to distinguish the data versions of the target node and the source node, for use in subsequent data migrations;
  • in practice, when the data version number of the target node or the source node becomes too large, data belonging to some old data versions may be deleted. For example, if the current data version number of the target node or the source node is 3.5 and the data corresponding to version numbers earlier than 3.5 is still kept on the hard disk, then all data corresponding to version numbers earlier than 1 can be deleted to save disk space.
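  • A minimal sketch of how a source node might keep per-version modification records and derive the incremental data from them, in the spirit of steps 501-504, is shown below; the in-memory structures (a set of changed resource identifiers per published version) are assumptions made for the example.

```python
# Illustrative sketch of Embodiment 3: the source node records which resource
# identifiers were modified after each published version and derives the
# incremental data for a target node from those records.

class SourceNode:
    def __init__(self):
        self.version = 0
        self.data = {}            # resource id -> content
        self.changes_since = {}   # published version -> resource ids modified after it

    def write(self, resource_id, content):
        self.data[resource_id] = content
        for changed in self.changes_since.values():
            changed.add(resource_id)

    def publish_version(self):
        """Close the current version and start recording changes against it."""
        self.version += 1
        self.changes_since[self.version] = set()
        return self.version

    def incremental_for(self, target_version):
        """Step 502: derive the incremental data from the version difference."""
        changed = self.changes_since.get(target_version)
        if changed is None:
            # The target's version is too old: fall back to sending everything
            # (or to a hash comparison, as the text allows).
            return dict(self.data)
        return {rid: self.data[rid] for rid in changed if rid in self.data}

    def prune(self, oldest_version_to_keep):
        """Drop modification records for versions that will no longer be compared."""
        self.changes_since = {v: c for v, c in self.changes_since.items()
                              if v >= oldest_version_to_keep}


if __name__ == "__main__":
    src = SourceNode()
    src.write("res:100", "000000")
    v1 = src.publish_version()
    src.write("res:100", "123456")       # change recorded under version v1
    print(src.incremental_for(v1))       # {'res:100': '123456'}
    print(src.incremental_for(0))        # unknown old version -> full data
    src.prune(oldest_version_to_keep=v1)
```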
  • in practice, when data version numbers are used for comparison, the source node needs to record the version modification information between multiple old versions and the current version so that it can determine incremental data for different data version numbers; in that case, if the amount of data change is large, the amount of the corresponding version modification information is also correspondingly large, so the source node may instead determine the incremental data by hash-value comparison, where the hash comparison may specifically be a Merkle-tree hash comparison or a variable-granularity hash comparison.
  • FIG. 6 is a schematic diagram of how the hash values of a Merkle tree are generated. The Merkle-tree comparison proceeds as follows: first, the source node and the target node divide the data to be compared into segments at the smallest granularity, for example by single resource identifier or by a small resource identifier segment, and then compute a hash value for each data segment, obtaining the hash values of the leaf nodes, such as the hash values of leaf nodes C, D, E and F in FIG. 6.
  • the node then computes a hash over the hash values of several leaf nodes to generate the hash value of a subtree node, such as the hash values of subtrees A and B in FIG. 6, then computes a hash over the hash values of several subtree nodes, and so on, until the hash value of the root node is generated.
  • when the source node compares the data versions of the data to be migrated, it first compares the hash values of the root nodes. If the root hash values of the source node and the target node are identical, the data content of the two nodes is exactly the same and no changed data needs to be transmitted. If the root hash values differ, the data content of the two nodes differs; the source node then compares the hash value of each subtree node separately, and if the hash values of a subtree node match, all the data nodes under that subtree have identical content and the source node does not need to transmit changed data for it.
  • if the hash values of a subtree node do not match, the hash values of the child nodes under the mismatching subtree node are compared in turn, and so on, until the changed nodes are found at a suitable granularity; once the parts on which the two sides' data content differs have been determined, the source node sends the changed data to the target node as the incremental data.
  • in practice, the changed node at the suitable granularity may be a leaf node or a subtree node.
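  • The Merkle-tree comparison described above can be sketched roughly as follows; the sketch assumes a simple binary tree built with SHA-256 over fixed leaf segments, which is one possible construction rather than the exact one used in the patent.

```python
# Illustrative sketch of the Merkle-tree comparison: build the tree bottom-up,
# then walk top-down and descend only into subtrees whose hashes differ.

import hashlib


def leaf_hashes(segments):
    """Hash each smallest-granularity data segment (one leaf per segment)."""
    return [hashlib.sha256(seg.encode()).hexdigest() for seg in segments]


def build_levels(leaves):
    """Each parent hashes the concatenation of its (up to two) children."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        parent = [hashlib.sha256("".join(prev[i:i + 2]).encode()).hexdigest()
                  for i in range(0, len(prev), 2)]
        levels.append(parent)
    return levels            # levels[0] = leaves ... levels[-1] = [root]


def changed_leaves(src_segments, dst_segments):
    """Return indices of leaf segments whose content differs between the nodes."""
    src_levels = build_levels(leaf_hashes(src_segments))
    dst_levels = build_levels(leaf_hashes(dst_segments))
    if src_levels[-1] == dst_levels[-1]:
        return []                         # identical root hash: nothing to send
    suspects = [0]                        # node indices at the current level
    for depth in range(len(src_levels) - 2, -1, -1):
        next_suspects = []
        for idx in suspects:
            for child in (2 * idx, 2 * idx + 1):
                if (child < len(src_levels[depth]) and
                        src_levels[depth][child] != dst_levels[depth][child]):
                    next_suspects.append(child)
        suspects = next_suspects
    return suspects                       # leaf indices = incremental data to send


if __name__ == "__main__":
    source = ["seg-a", "seg-b", "seg-c", "seg-X"]   # source node's segments
    target = ["seg-a", "seg-b", "seg-c", "seg-d"]   # target node's segments
    print(changed_leaves(source, target))            # -> [3]
```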
  • the basic idea of the variable-granularity hash comparison is to estimate the probability that the data under a single resource identifier, or under the smallest resource identifier segment, is inconsistent. Specifically, suppose the data under n resource identifiers needs to be migrated and the probability that the data of a single resource identifier has changed is p. All data under a group of k resource identifiers is hashed on the source node and on the target node respectively, and the resulting hash values are compared; the probability that the hash values over all the data of the k resource identifiers are inconsistent is then 1-(1-p)^k;
  • when estimating p, k can take different sizes, producing different granularities; hash values are computed over the data segments at the different granularities and the hash mismatches are compared. Moreover, for a given granularity, hash values can also be computed over different data segments and their mismatches compared, so as to obtain more accurate hash-change-rate data. From the hash change rates at the different granularities, the most likely value of p is fitted;
  • substituting the p value into the formula k* = -1/ln(1-p) gives the optimal hash granularity k*; the data to be migrated is divided into equal-sized data segments of k* resource identifiers each, producing n/k* hash values, which are then compared.
  • when the source node compares the data versions of the data to be migrated, if all n/k* hash values match, the data content of the source node and the target node is completely consistent and no further comparison or transmission of changed data is needed.
  • for any of the n/k* hash values that do not match, the above process is repeated on the data segments whose content corresponds to the mismatching hash values, until the points where the data changes are found at the smallest granularity; here, the p value may be re-estimated or extrapolated from the previously estimated p value.
  • during the iteration, the p value gradually increases and the hash granularity k gradually decreases; in practice, the k value can also be forced to shrink to at most half of its previous value at each step.
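  • The variable-granularity formulas above translate into code roughly as follows; estimating p from a single observed group mismatch rate and clamping k* are simplifications assumed for the example.

```python
# Illustrative sketch of the variable-granularity idea: fit p, compute the
# group size k* = -1/ln(1-p), and hash the resources in groups of k*.

import hashlib
import math


def estimate_p(mismatch_rate, k):
    """Invert 1-(1-p)^k: mismatch_rate is the observed share of mismatching group hashes."""
    return 1.0 - (1.0 - mismatch_rate) ** (1.0 / k)


def optimal_group_size(p):
    """k* = -1/ln(1-p), clamped so the formula stays well-defined and k* >= 1."""
    p = min(max(p, 1e-9), 1.0 - 1e-9)
    return max(1, int(round(-1.0 / math.log(1.0 - p))))


def group_hashes(resources, k):
    """Hash the data of every k consecutive resource identifiers together."""
    ids = sorted(resources)
    out = {}
    for i in range(0, len(ids), k):
        chunk = ids[i:i + k]
        digest = hashlib.sha256("".join(resources[r] for r in chunk).encode())
        out[tuple(chunk)] = digest.hexdigest()
    return out


if __name__ == "__main__":
    # Suppose 40% of the group hashes mismatched when groups of 8 identifiers were used.
    p = estimate_p(0.4, 8)
    k_star = optimal_group_size(p)
    print(round(p, 3), k_star)                       # fitted p and the new granularity
    data = {"res:%d" % i: "value-%d" % i for i in range(10)}
    print(list(group_hashes(data, k_star).keys()))   # one hash per k*-sized group
```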
  • in the embodiments described above, the data in the P2P network is organized as follows: data is organized by resource, each resource has its own unique resource identifier, resources of the same kind store their data according to a specific data structure, and specific data can be looked up through the resource identifier index; each node may be responsible for a segment of the resource identifier space, i.e. for the storage and processing of the corresponding resource data within that resource identifier segment, where the processing includes operations such as reading data, updating data and deleting data. Those skilled in the art will appreciate that the invention can easily be extended to other ways of organizing the data.
  • the technical solution provided by the present invention is particularly suitable for situations in which a large amount of quasi-static data needs to be stored in a P2P network. For example, in a P2P Voice over Internet Protocol (VoIP) system, the nodes of the P2P network need to store users' subscription information; this information changes relatively little, i.e. it is quasi-static data, and it needs to be stored in the distributed system. If nodes join or exit frequently, this quasi-static data has to be migrated between nodes frequently and consumes a large amount of bandwidth; with the technical solution of the present invention, the amount of data migrated can be greatly reduced, since only the changed user subscription data needs to be transmitted.
  • the technical solution of the present invention is also applicable to situations in which nodes need to perform scheduled load adjustment, for example periodic daily load adjustment.
  • in such a case, the P2P network periodically wakes up and joins a number of nodes, for example every morning, to handle a foreseeable large amount of traffic, and periodically has some nodes exit, for example every night, to save energy.
  • in this situation a node is often repeatedly responsible for the same data and services; with the technical solution of the present invention, the node only needs to be brought up to the latest version of the data at each migration, so that the data it already stores locally is put to better use.
  • the present invention further provides a system for data migration in a P2P network.
  • the system includes: a first node 71, and a second node 72;
  • the first node 71 is configured to compare the current data version of the migrated data with the data version corresponding to the second node 72, determine the incremental data, and send the incremental data to the second node 72;
  • the second node 72 is configured to recover the latest version of the data according to the incremental data and the data stored by itself after receiving the incremental data sent by the first node 71.
  • the number of the second nodes 72 may be one or more.
  • the first node 71 is further configured to, after sending the incremental data to the second node 72, save its own current version of the migrated data.
  • the second node 72 is specifically configured to: use the data content in the incremental data to modify the corresponding data in the data it stores, obtaining the latest version of the data.
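  • Putting the pieces together, a minimal sketch of the claimed two-node system, with a first node that determines and sends incremental data and a second node that recovers the latest version, could look like the following; message passing is reduced to direct calls for illustration.

```python
# Illustrative end-to-end sketch of the claimed system; the tuple-of-(version,
# content) record layout is an assumption for the example.

class FirstNode:
    """Determines the incremental data and 'sends' it (here: returns it)."""

    def __init__(self, data):
        self.data = data                  # resource id -> (version, content)

    def incremental_for(self, peer_versions):
        return {rid: vc for rid, vc in self.data.items()
                if peer_versions.get(rid) != vc[0]}


class SecondNode:
    """Recovers the latest version from the incremental data plus its own data."""

    def __init__(self, data):
        self.data = data                  # resource id -> (version, content)

    def versions(self):
        return {rid: ver for rid, (ver, _content) in self.data.items()}

    def recover(self, incremental):
        self.data.update(incremental)
        return self.data


if __name__ == "__main__":
    first = FirstNode({"r1": (2, "new"), "r2": (1, "same")})
    second = SecondNode({"r1": (1, "old"), "r2": (1, "same")})
    delta = first.incremental_for(second.versions())
    print(delta)                          # only r1 needs to be transferred
    print(second.recover(delta))          # second node now holds the latest data
```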

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Description

一种对等网络中数据迁移的方法及系统 技术领域
本发明涉及对等(P2P, Peer-to-Peer ) 网络技术, 特别涉及一种 P2P网 络中数据迁移的方法及系统。 背景技术
近年来, P2P网络技术作为一种分布式互联网技术, 发展非常迅速。 与 传统的客户端 /服务器 (C/S, Client/Server )技术不同的是: P2P 网络中的 所有节点均可作为服务器和客户端。 P2P网络中的数据以分布式存储在节点 上, 业务由节点通过分布式实现。
由于 P2P网络中的所有节点均存储有网络运行所必需的数据, 因此, 当有节点加入、 或退出网络、 或进行负载均衡时, 节点之间需要进行数据 迁移, 即: 将某个节点所保存的数据拷贝到另一个节点上, 以确保数据不 会丟失, 并维持数据的完整性。 举个例子来说, 在目前广泛应用的资源定 位与发现( RELOAD, REsource LOcation And Discovery ) P2P网络协议中, 数据是以资源的形式组织的。 网络为每个资源分配一个全网唯一的资源标 识, 用于查找该资源标识下的数据。 每个节点可以负责一个或多个资源标 识的数据存储及处理。 在新节点加入时, 原有的负责节点会将加入节点应 负责的资源标识下的数据发送给加入节点。 当节点退出时, 退出节点会将 网络中始终保持每个资源标识下的数据都有一个对应节点负责。
下面以节点加入流程为例, 描述现有技术中数据的迁移流程, 如图 1 所示, 包括以下步驟:
步驟 101 :加入节点希望加入 Ρ2Ρ网络中时,向引导节点发送接入请求, 同引导节点建立连接;
步驟 102: 在引导节点的引导下, 加入节点同接纳节点建立连接; 这里, 所述接纳节点就是指: 当前负责存储加入节点所需获取的数据 的节点。
步驟 103: 加入节点向接纳节点发送加入请求, 请求接入 P2P网络, 并 且希望接管接纳节点的部分资源标识下的数据;
步驟 104: 接纳节点将需要转移的资源标识下的数据发送给加入节点; 步驟 105: 如果数据量比较大, 接纳节点则可以分多次发送数据, 直至 所有数据全部都发送完毕;
步驟 106:数据发送完毕后,接纳节点通知加入节点更新 P2P网络路由, 加入节点正式负责该部分数据的存储和处理;
步驟 107: 加入节点向其它相关节点发送路由更新, 通知已正式加入 P2P网络。
从上面的描述中可以看出, 在现有的迁移流程中, 数据迁移时, 需要 将加入节点所请求的所有数据通过网络发送给加入节点。 即使通过私有网 络发送所有数据, 由于数据量较大, 依然会对网络造成较为严重地沖击, 这里, 所述私有网络具体可以是运营商的专用网络等, 且私有网络的网络 连接速度大大快于互联网的网络连接速度。 另外, 如果 P2P网络中的节点 分布在互联网上, 则节点间的网络连接速度可能会比较慢, 在这种情况下, 加入节点所请求的所有数据在传输过程中容易出错, 且传输时间比较长, 此时, 由于加入节点必须在获取所有的数据后才能正式加入 P2P网络, 因 此, 会造成完成节点加入或负载均衡的过程需要的时间较长。 发明内容
有鉴于此, 本发明的主要目的在于提供一种 P2P网络中数据迁移的方 法及系统, 能大大减少 P2P网络中数据迁移的数据量及迁移时间, 进而提 高数据迁移的速度及可靠性。
为达到上述目的, 本发明的技术方案是这样实现的:
本发明提供了一种 P2P网络中数据迁移的方法, 该方法包括: 需要进行数据迁移时, 需要迁出数据的节点将迁移数据当前的数据版 本与迁入节点对应的数据版本进行比较, 确定增量数据, 并将增量数据发 送给迁入节点;
迁入节点根据增量数据及自身存储的数据, 恢复出最新版本的数据。 上述方案中, 在将增量数据发送给迁入节点后, 该方法进一步包括: 需要迁出数据的节点保存自身当前版本的迁移数据。
上述方案中, 需要进行数据迁移的时机为: 有节点加入对等网络时, 或者, 有节点退出对等网络时, 或者, 对等网络中有节点负载超载, 需要 均衡负载时, 或者, 由于管理原因需要均衡负载时, 或者, 需要增加节点 作为对等网络中的节点的数据的备份时。
上述方案中, 所述需要迁出数据的节点将迁移数据当前的数据版本与 迁入节点对应的数据版本进行比较, 为:
需要迁出数据的节点将自身当前的迁移数据的数据版本信息与迁入节 点的数据版本信息进行比较。
上述方案中, 所述数据版本信息为: 数据版本号、 和 /或为修改时间戳、 和 /或为数据的摘要信息。
上述方案中, 所述需要迁出数据的节点将迁移数据当前的数据版本与 迁入节点对应的数据版本进行比较, 为:
需要迁出数据的节点将自身当前的迁移数据的哈希值与迁入节点对应 的哈希值进行比较。
上述方案中, 所述需要迁出数据的节点将自身当前的迁移数据的哈希 值与迁入节点对应的哈希值进行比较, 为: 采用 Merkd树的哈希(hash )对比方式, 或者, 采用可变粒度的 hash 3于比方式。
上述方案中, 所述迁入节点根据增量数据及自身存储的数据, 恢复出 最新版本的数据, 为:
迁入节点采用增量数据中的数据内容, 修改自身存储的数据中对应的 数据内容, 得到最新版本的数据。
本发明还提供了一种 P2P网络中数据迁移的系统, 该系统包括: 第一 节点、 以及第二节点; 其中,
第一节点, 用于需要进行数据迁移时, 将迁移数据当前的数据版本与 第二节点对应的数据版本进行比较, 确定增量数据, 并将增量数据发送给 第二节点;
第二节点, 用于收到第一节点发送的增量数据后, 根据增量数据及自 身存储的数据, 恢复出最新版本的数据。
上述方案中, 所述第二节点的个数为一个以上。
上述方案中, 所述第一节点, 还用于将增量数据发送给第二节点后, 保存自身当前版本的迁移数据。
本发明提供的 P2P网络中数据迁移的方法及系统, 需要进行数据迁移 时, 需要迁出数据的节点将自身当前的数据版本与迁入节点的数据版本进 行比较, 确定增量数据, 并将增量数据发送给迁入节点; 迁入节点根据增 量数据及自身存储的数据, 恢复出最新版本的数据, 需要迁出数据的节点 不需要将所有需要迁出的数据均发送给迁入节点, 只将增量数据发送给迁 入节点, 如此, 能大大减少 P2P网络中数据迁移的数据量及迁移时间, 进 而提高数据迁移的速度及可靠性。
除此以外, 在将增量数据发送给迁入节点后, 需要迁出数据的节点保 存自身当前版本的迁移数据, 如此, 当需要迁出数据的节点再次负责相应 的数据时, 能减少数据的传输量, 进一步提高数据迁移的速度及可靠性, 保证网络的正常运行。 附图说明
图 1为现有技术中节点加入流程中数据的迁移流程示意图;
图 2为本发明 P2P网络中数据迁移的方法流程示意图;
图 3为实施例一实现数据迁移的方法流程示意图;
图 4为实施例二实现数据迁移的方法流程示意图;
图 5为实施例三实现数据迁移的方法流程示意图;
图 6为 Merkel树的 hash值生成方式示意图;
图 7为本发明 P2P网络中数据迁移的系统结构示意图。 具体实施方式
下面结合附图及具体实施例对本发明再作进一步详细的说明。
本发明 P2P网络中数据迁移的方法, 如图 2所示, 包括以下步驟: 步驟 201 : 需要进行数据迁移时, 需要迁出数据的节点将迁移数据当前 的数据版本与迁入节点对应的数据版本进行比较, 确定增量数据, 并将增 量数据发送给迁入节点;
这里, 需要进行数据迁移的时机为: 有节点加入 P2P网络时, 或者, 有节点退出 P2P网络时, 或者, P2P网络中有节点负载超载, 需要均衡负 载时, 或者, 由于管理原因需要均衡负载时, 或者, 需要增加节点作为某 个节点的数据的备份时; 所述迁入节点的个数可以为一个以上;
在比较时, 需要迁出数据的节点可以通过比较迁移数据当前的数据版 本信息与迁入节点对应的数据版本信息, 确定增量数据; 或者, 可以通过 hash值对比的方式确定增量数据; 其中, 所述数据版本信息是指: 迁出数 据的节点能根据所述数据版本信息及自身的相关信息即可确定出增量数据 的信息, 所述数据版本信息具体可以是: 数据版本号、 和 /或修改时间戳、 和 /或数据的摘要信息等; 当迁入节点没有相关的数据版本信息时, 所述数 据版本信息具体还可以是表示空的字符;
通过 hash值对比的方式确定增量数据的具体处理过程可采用现有的 hash值对比的处理过程; 所述 hash值对比的方式具体可以是: Merkd树的 hash对比方式或可变粒度的 hash对比方式等;
所述增量数据是指: 需要迁出数据的节点与迁入节点不一致的数据, 具体可以是: 修改过的资源标识下的全部数据、 或特定资源数据段下的全 部数据、 或数据修改的日志信息等;
在将增量数据发送给迁入节点后, 该方法还可以进一步包括: 需要迁出数据的节点保存自身当前版本的迁移数据。
步驟 202: 迁入节点根据增量数据及自身存储的数据, 恢复出最新版本 的数据;
具体地, 采用增量数据中的数据内容, 修改自身存储的数据中对应的 数据内容, 得到最新版本的数据; 举个例子来说, 假设增量数据包含: 第 100条数据的内容为 123456, 则迁入节点将自身存储的第 100条数据的内 容修改为 123456; 再举个例子来说, 假设增量数据包含: 第 100条数据修 改为 123456的日志信息,则迁入节点根据该日志信息,将自身存储的第 100 条数据的内容修改为 123456。
下面结合实施例对本发明再作进一步详细的描述。
实施例一:
本实施例的应用场景为: 节点加入过程, 在以下的描述中, 将请求加 入的节点称为加入节点, 将当前正在负责存储加入节点所需获取数据的节 点称为接纳节点。 本实施例实现数据迁移的方法, 如图 3 所示, 包括以下 步驟: 步驟 301 :加入节点希望加入到 P2P网络中,向引导节点发送接入请求, 同引导节点建立连接。
步驟 302: 在引导节点的引导下, 加入节点同接纳节点建立连接。
步驟 303: 加入节点向接纳节点发送加入请求, 请求加入 P2P网络, 并 且希望接管接纳节点的部分资源标识下的数据;
这里, 加入节点利用现有技术可获知应接管接纳节点的哪部分资源标 识下的数据。
步驟 304: 接纳节点针对加入节点请求的资源标识, 向加入节点请求对 应的本地存储的数据版本信息。
步驟 305: 加入节点将对应的数据版本信息发送给接纳节点;
这里, 在步驟 303 中, 加入节点还可以直接将请求的资源标识对应的 本地存储的数据版本信息发送给接纳节点, 此时, 则不需要再执行步驟
304-305;
这里, 所述数据版本信息为: 接纳节点能根据所述数据版本信息及自 身的相关信息即可确定出增量数据的信息 , 所述数据版本信息具体可以是: 数据的版本号、 和 /或修改时间戳、 和 /或数据的摘要信息等; 当加入节点完 全没有所请求的资源标识下的数据时, 所述数据版本信息具体还可以是表 示空的字符。
步驟 306:接纳节点将自身存储的数据版本信息与加入节点的数据版本 信息进行对比, 确定需要提供给加入节点的增量数据;
这里, 在进行对比时, 如果由于加入节点的数据版本信息过于陈旧, 使得接纳节点无法进行对比, 此时, 接纳节点可以和加入节点进行消息交 互, 进行 hash值对比, 直至确定出增量数据;
当数据版本信息为表示空的字符时, 所述增量数据为加入节点所请求 的资源标识下的所有数据。 步驟 307: 接纳节点将增量数据发送给加入节点;
这里, 当增量数据的数据量比较大时, 可以分多次发送数据, 具体处 理过程可采用现有的处理过程。
步驟 308: 增量数据发送完毕后, 加入节点根据本地存储的数据和收到 的增量数据恢复出最新版本的数据, 同时,接纳节点通知加入节点更新 P2P 网络路由, 加入节点正式负责所请求的资源标识下的数据的存储和处理; 这里, 接纳节点将已迁出的当前版本数据存储在本地緩存中, 以备后 续使用, 在存储时, 接纳节点可以选择部分数据进行存储, 比如准静态数 据等; 其中, 所述准静态数据是指: 在资源标识下随时间变化不大的数据; 另外, 在存储时, 接纳节点还可以依据需要删除部分陈旧的数据, 用当前 版本的迁移数据代替这些陈旧的数据, 比如: 依据数据的使用频率等信息, 删除使用频率相对较低的数据, 以节约本地緩存资源。
步驟 309: 加入节点向其它相关节点发送路由更新, 通知已正式加入 P2P网络。
这里, 需要说明的是: 对于节点退出流程和负载均衡流程中节点确定 增量数据的过程与本实施例提供的过程类似, 所不同的是: 加入节点和接 纳节点分别替换为数据迁移的目标节点和源节点, 并针对不同的流程在消 息上进行相应的修改。 举个例子来说, 在节点退出流程中, 数据迁移的目 标节点为新负责节点, 而源节点则为退出节点; 在负载均衡流程中, 数据 迁移的目标节点为轻载节点, 而源节点则为过载节点。 这些修改可以在现 有的节点退出和负载均衡流程中直接得出, 这里不再赘述。
实施例二:
本实施例的应用场景为: 节点加入过程, 且请求的数据包含准静态数 据及动态数据; 其中, 所述准静态数据是指: 在资源标识下随时间变化不 大的数据, 所述动态数据是指: 在资源标识下经常变化的数据。 在实际应 用时, 数据是准静态数据还是动态数据可以由具体的业务确定。 例如, 在
P2P因特网协议 ( IP, Internet Protocol ) 电话服务中, 每个用户的所有数据 可以存储在同一个资源标识下, 但是, 其中的用户号码信息及用户服务签 约信息等变化较少, 因此, 可以作为准静态数据。 而用户的在线信息及位 置信息变化较快, 因此, 可以作为动态数据。 在以下的描述中, 将请求加 入的节点称为加入节点, 将当前正在负责存储加入节点所需获取数据的节 点称为接纳节点。 本实施例实现数据迁移的方法, 如图 4所示, 包括以下 步驟:
步驟 401 :加入节点希望加入到 P2P网络中,向引导节点发送接入请求, 同引导节点建立连接。
步驟 402: 在引导节点的引导下, 加入节点同接纳节点建立连接。
步驟 403: 加入节点向接纳节点发送加入请求, 请求加入 P2P网络, 并 且希望接管接纳节点的部分资源标识下的数据;
这里, 加入节点利用现有技术可获知应接管接纳节点的哪部分资源标 识下的数据。
步驟 404: 接纳节点针对加入节点请求的资源标识,要求加入节点发送 所请求的资源标识下的准静态数据的本地存储的数据版本信息;
这里, 所述数据版本信息为: 接纳节点能根据所述数据版本信息即可 确定出增量数据的信息, 所述数据版本信息具体可以是: 数据的版本号、 和 /或修改时间戳、 和 /或数据的摘要信息等; 当加入节点完全没有所请求的 资源标识下的数据时, 所述数据版本信息具体还可以是表示空的字符。
步驟 405: 加入节点将本地存储的、 所请求的资源标识下的准静态数据 的数据版本信息发送给接纳节点;
这里, 在步驟 403 中, 加入节点还可以直接将本地存储的、 所请求的 资源标识下的准静态数据的数据版本信息发送给接纳节点, 此时, 则不需 要再执行步驟 404~405;
所述数据版本信息为: 接纳节点能根据所述数据版本信息及自身的相 关信息即可确定出增量数据的信息, 所述数据版本信息具体可以是: 数据 的版本号、 和 /或修改时间戳、 和 /或数据的摘要信息等; 当加入节点完全没 有所请求的资源标识下的数据时, 所述数据版本信息具体还可以是表示空 的字符。
步驟 406: 接纳节点对比加入节点的准静态数据版本信息, 确定需要提 供给加入节点的准静态数据的增量数据;
这里, 在进行对比时, 如果由于加入节点的数据版本信息过于陈旧, 使得接纳节点无法进行对比, 此时, 接纳节点可以和加入节点进行消息交 互, 进行 hash值对比, 直至确定出增量数据;
当数据版本信息为表示空的字符时, 所述增量数据为加入节点所请求 的资源标识下的所有准静态数据。
步驟 407: 接纳节点将准静态数据的增量数据发送给加入节点。
步驟 408: 准静态数据的增量数据发送完成后,接纳节点将所有的动态 数据发送给加入节点;
这里, 本步驟还可以在步驟 405、 或步驟 406、 或步驟 407之前执行。 步驟 409: 数据发送完毕后, 加入节点根据本地存储的准静态数据和收 到的增量数据恢复出最新版本的准静态数据, 并直接采用收到的动态数据 作为最新版本的动态数据, 同时, 接纳节点通知加入节点更新 P2P网络路 由, 加入节点正式负责所请求的资源标识下的数据的存储和处理;
这里, 接纳节点将已迁出的准静态数据存储在本地緩存中, 以备后续 使用。 同时, 接纳节点可以直接删除迁出的动态数据。
步驟 410: 加入节点向其它相关节点发送路由更新, 通知已正式加入 P2P网络。 这里, 需要说明的是: 对于节点退出流程和负载均衡流程中节点确定 增量数据的过程与本实施例提供的过程类似, 所不同的是: 加入节点和接 纳节点分别替换为数据迁移的目标节点和源节点, 并针对不同的流程在消 息上进行相应的修改。 举个例子来说, 在节点退出流程中, 数据迁移的目 标节点为新负责节点, 而源节点则为退出节点; 在负载均衡流程中, 数据 迁移的目标节点为轻载节点, 而源节点则为过载节点。 这些修改可以在现 有的节点退出和负载均衡流程中直接得出, 这里不再赘述。
实施例三:
本实施例的应用场景为: 数据版本信息为数据版本号, 在以下的描述 中, 将迁出数据的节点称为源节点, 将迁入数据的节点称为目标节点, 本 实施例实现数据迁移的方法, 如图 5所示, 包括以下步驟:
步驟 501 : 需要进行数据迁移时, 目标节点向源节点发送本地存储的数 据版本号;
这里, 所述数据版本号的格式可以是依序增加的整数、 主版本号加次 版本号的格式或时间戳等格式, 举个例子来说, 如果数据版本号的格式为 依序增加的整数, 则数据版本号可以是 1、 2、 3等等, 如果数据版本号的 格式为主版本号加次版本号的格式, 则数据版本号可以是 1.1、 1.2、 1.3等 等。
步驟 502: 源节点收到目标节点发送的数据版本号后, 将目标节点的数 据版本号与自身当前的数据版本号进行比较, 通过两个版本之间的版本修 改信息, 确定需要提供给目标节点的增量数据;
这里, 源节点事先会在当前数据版本号下记录修改信息, 所述修改信 息具体可以是数据修改操作的记录、 或修改的资源标识记录等, 源节点据 此可以获知源节点与目标节点的两个版本之间的版本修改信息; 所述版本 修改信息可以是数据修改操作的记录、 或修改的资源标识记录等; 在通过两个版本之间的版本修改信息, 确定需要提供给目标节点的增 量数据时, 举个例子来说, 如果版本修改信息中包含: 对第 100条数据进 行数据修改的具体操作, 则源节点根据这条信息, 确定需要将第 100条数 据发送给目标节点, 即: 第 100条数据即为增量数据中的一条;
在源节点将目标节点的数据版本号与自身当前的数据版本号进行比较 时, 如果源节点发现目标节点的数据版本号过于陈旧, 导致自身没有两个 版本之间的版本修改信息时, 源节点可以将目标节点所请求的全部数据作 为增量数据, 发送给目标节点, 或者, 源节点可以和目标节点进行消息交 互,采用 hash值对比方式确定增量数据,发送给目标节点;其中,采用 hash 值对比方式确定增量数据的过程可采用现有的处理过程。
步驟 503: 源节点将增量数据发送给目标节点, 目标节点收到增量数据 后, 根据本地存储的数据和增量数据恢复出最新版本的数据;
这里, 源节点可以将版本之间的版本修改信息一起发送给目标节点, 以便目标节点能更好地恢复出最新版本的数据;
对于上述例子, 目标节点收到增量数据后, 将自身的第 100条数据用 增量数据中相应的数据进行修改, 如此类推, 从而恢复出最新版本的数据。
步驟 504: 目标节点增加相应的数据版本号, 以描述自身和源节点存储 的数据版本的区别;
这里, 修改数据版本号的操作还可以由源节点在步驟 503 中进行, 所 述数据版本号的功能为: 区别目标节点和源节点的数据版本, 以便后续进 行数据迁移时使用;
在实际应用时, 当目标节点或源节点的数据版本号过大时, 可以删除 部分旧数据版本的数据, 举个例子来说, 目标节点或源节点当前的数据版 本号为 3.5 , 3.5之前的数据版本号对应的数据依然保存在硬盘当中, 此时, 可以将数据版本号为 1之前的数据版本号对应的所有数据均删除, 以节约 硬盘。
在实际应用过程中, 当采用数据版本号进行对比时, 源节点需要记录 多个版本和当前版本的版本修改信息, 以便能针对不同数据版本号, 确定 出增量数据, 此时, 如果数据变化量比较大时, 对应的版本修改信息的数 据量也会相应的比较大, 因此, 源节点还可以采用 hash值对比的方式确定 增量数据; 其中, 所述 hash值对比的方式具体可以是: Merkd树的 hash 对比方式或可变粒度的 hash对比方式等。
下面对这两种 hash 对比方式分别进行详细的描述。
图 6为 Merkd树的 hash值生成方式示意图, 对于 Merkd树的对比方 式, 具体为: 首先, 源节点和目标节点将需要对比的数据段按照最小粒度 分段, 比如: 按照单个资源标识或较小的资源标识段分段, 之后对每个数 据段分别进行 hash值计算, 得到叶节点的 hash值, 如图 6中的叶节点 C、 D、 E、 及 F的 hash值。 随后, 节点对多个叶节点的哈希值再次进行 hash 值计算, 生成子树节点的 hash值, 如图 6中的子树八、 及 B的 hash值, 之 后再对多个子树节点的 hash值再次进行 hash值计算, 如此循环, 直至生成 根节点的 hash值。
源节点在进行迁移数据的数据版本对比时, 如图 6所示, 首先比较根 节点的 hash值, 如果源节点与目标节点的根节点 hash值一致, 则表明源节 点与目标节点双方的数据内容完全一致, 不需要传送变化的数据, 如果根 节点的 hash值不一致, 则说明双方数据内容不同; 源节点进一步分别比较 每个子树节点的 hash值, 如果每个子树节点的 hash值均一致, 则说明每个 子树节点下所有的数据节点内容均一致, 源节点不需要传送变化的数据, 如果子树节点的 hash值不一致, 则进一步比较不一致的子树节点下的子节 点的 hash值, 如此循环, 直至到找到合适粒度的变化节点为止; 确定了双 方数据内容不一致的部分后, 源节点将变化的数据作为增量数据发送给目 标节点即可; 其中, 在实际应用时, 所述合适粒度的变化节点可以是叶子 节点或子树节点。
对于可变粒度的 hash对比方式的基本思想是: 判断单个资源标识或最 小资源标识段下的数据不一致的可能性。 具体地, 假设需要迁移 n个资源 标识下的数据,设单个资源标识的数据变化的可能性为 p, 将源节点及目标 节点的 k个资源标识下的所有数据分别进行 hash值计算,并将得到 hash值 进行对比, 此时, k个资源标识的所有数据的 hash值不一致的可能性为: i-d-p)k;
其中, 在估算 p值时, k可以取不同的大小, 从而产生不同的粒度, 对 不同粒度上的数据段进行 hash值计算, 并比较 hash不一致的情况, 并且, 对于一个粒度, 还可以采用不同的数据段进行 hash值计算, 并比较 hash值 不一致的情况, 以获取较准确的哈希变化率数据。 根据不同粒度的 hash变 化率数据, 拟合出最有可能的 p值;
将 p值代入公式: k*=-l/(ln(l-p)), 得到最佳的 hash粒度, 即: k*; 将迁移数据的数据段按 k*个资源标识分为等大小的数据段, 生成 n/k* 个 hash值, 并进行对比, 源节点在进行迁移数据的数据版本对比时, 如果 n/k*个 hash值均一致, 则说明源节点与目标节点双方数据内容完全一致, 不需要进一步对比及传送变化的数据。 对于 n/k*个 hash值中不一致的, 则 将不一致的 hash值对应的数据内容不一致的数据段, 重复上述过程, 直至 找到在最小粒度上数据发生变化的点; 这里, p值可以重新估算或利用前次 估计的 p值推算。在迭代过程中, p值逐渐会增加, hash粒度 k会逐渐减小, 在实际操作中, 也可以强制要求每次 k值至少缩小为原来的二分之一。
上述实施例中, 描述的 P2P网络中数据组织方式均是: 将数据按资源 组织, 每个资源拥有自己唯一的资源标识, 同类资源按照特定的数据结构 来存储相关数据, 特定的数据可通过资源标识索引来进行查找, 每个节点 可负责一部分资源标识空间段, 即: 负责该资源标识段内对应的资源数据 的存储和处理, 所述处理包括: 读取数据、 更新数据、 以及删除数据等操 作。 这里, 本领域的技术人员应当理解: 本发明可以很容易地扩展到其它 数据结构组织方式。
本发明提供的技术方案, 特别适用于 P2P网络中需要存储大量准静态 数据的情况, 比如: 在 P2P 互联网协议语音 (VoIP, Voice over Internet Protocol ) 系统中, P2P网络中的节点需要保存用户的签约信息, 这些信息 变化量比较小, 即为准静态数据, 需要通过分布式系统存储。 如果节点加 入或退出频繁时, 这些准静态数据需要经常在节点之间进行迁移并占用大 量带宽, 采用本发明的技术方案后, 数据迁移量可以大大减小, 只需要传 输改变的用户签约数据即可。
本发明的技术方案也可适用于节点需要进行定时的负载调整的情况, 比如: 每日定期进行负载调整。 在这种情况下, P2P网络会定期如在每曰清 晨唤醒和加入一部分节点以应付可预见的大量业务, 并定期如在每日夜晚 让部分节点退出以节约能源。 在这种情况下, 节点往往会重复负责某些数 据和业务, 采用本发明的技术方案后, 节点只要在每次迁移时更新到最新 版本的数据即可, 如此, 节点本地存储的数据可以得到较好地利用。
基于上述方法, 本发明还提供了一种 P2P网络中数据迁移的系统, 如 图 7所示, 该系统包括: 第一节点 71、 以及第二节点 72; 其中,
第一节点 71 , 用于需要进行数据迁移时, 将迁移数据当前的数据版本 与第二节点 72对应的数据版本进行比较, 确定增量数据, 并将增量数据发 送给第二节点 72;
第二节点 72, 用于收到第一节点 71发送的增量数据后,根据增量数据 及自身存储的数据, 恢复出最新版本的数据。 这里, 需要说明的是: 所述第二节点 72的个数可以为一个以上。 所述第一节点 71 ,还用于将增量数据发送给第二节点 72后,保存自身 当前版本的迁移数据。
所述第二节点 72, 具体用于: 采用增量数据中的数据内容, 修改自身 存储的数据中对应的数据, 得到最新版本的数据。
这里, 本发明的所述系统中的第一节点的具体处理过程已在上文中详 述, 不再赘述。
以上所述, 仅为本发明的较佳实施例而已, 并非用于限定本发明的保 护范围。

Claims

权利要求书
1、 一种对等(P2P ) 网络中数据迁移的方法, 其特征在于, 该方法包 括:
需要进行数据迁移时, 需要迁出数据的节点将迁移数据当前的数据版 本与迁入节点对应的数据版本进行比较, 确定增量数据, 并将增量数据发 送给迁入节点;
迁入节点根据增量数据及自身存储的数据, 恢复出最新版本的数据。
2、 根据权利要求 1所述的方法, 其特征在于, 在将增量数据发送给迁 入节点后, 该方法进一步包括:
需要迁出数据的节点保存自身当前版本的迁移数据。
3、 根据权利要求 1所述的方法, 其特征在于, 需要进行数据迁移的时 机为: 有节点加入对等网络时, 或者, 有节点退出对等网络时, 或者, 对 等网络中有节点负载超载, 需要均衡负载时, 或者, 由于管理原因需要均 衡负载时, 或者, 需要增加节点作为对等网络中的节点的数据的备份时。
4、 根据权利要求 1、 2或 3所述的方法, 其特征在于, 所述需要迁出 数据的节点将迁移数据当前的数据版本与迁入节点对应的数据版本进行比 较, 为:
需要迁出数据的节点将自身当前的迁移数据的数据版本信息与迁入节 点的数据版本信息进行比较。
5、 根据权利要求 4所述的方法, 其特征在于, 所述数据版本信息为: 数据版本号、 和 /或为修改时间戳、 和 /或为数据的摘要信息。
6、 根据权利要求 1、 2或 3所述的方法, 其特征在于, 所述需要迁出 数据的节点将迁移数据当前的数据版本与迁入节点对应的数据版本进行比 较, 为:
需要迁出数据的节点将自身当前的迁移数据的哈希值与迁入节点对应 的哈希值进行比较。
7、 根据权利要求 6所述的方法, 其特征在于, 所述需要迁出数据的节 点将自身当前的迁移数据的哈希值与迁入节点对应的哈希值进行比较, 为: 采用 Merkd树的哈希对比方式,或者,采用可变粒度的哈希对比方式。
8、 根据权利要求 1、 2或 3所述的方法, 其特征在于, 所述迁入节点 根据增量数据及自身存储的数据, 恢复出最新版本的数据, 为:
迁入节点采用增量数据中的数据内容, 修改自身存储的数据中对应的 数据内容, 得到最新版本的数据。
9、 一种对等网络中数据迁移的系统, 其特征在于, 该系统包括: 第一 节点、 以及第二节点; 其中,
第一节点, 用于需要进行数据迁移时, 将迁移数据当前的数据版本与 第二节点对应的数据版本进行比较, 确定增量数据, 并将增量数据发送给 第二节点;
第二节点, 用于收到第一节点发送的增量数据后, 根据增量数据及自 身存储的数据, 恢复出最新版本的数据。
10、 根据权利要求 9所述的系统, 其特征在于, 所述第二节点的个数 为一个以上。
11、 根据权利要求 9或 10所述的系统, 其特征在于, 所述第一节点, 还用于将增量数据发送给第二节点后, 保存自身当前版本的迁移数据。
PCT/CN2012/072071 2011-04-02 2012-03-07 一种对等网络中数据迁移的方法及系统 WO2012136091A2 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2011100842595A CN102739704A (zh) 2011-04-02 2011-04-02 一种对等网络中数据迁移的方法及系统
CN201110084259.5 2011-04-02

Publications (2)

Publication Number Publication Date
WO2012136091A2 true WO2012136091A2 (zh) 2012-10-11
WO2012136091A3 WO2012136091A3 (zh) 2012-11-29

Family

ID=46969602

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2012/072071 WO2012136091A2 (zh) 2011-04-02 2012-03-07 一种对等网络中数据迁移的方法及系统

Country Status (2)

Country Link
CN (1) CN102739704A (zh)
WO (1) WO2012136091A2 (zh)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103237247A (zh) * 2013-03-29 2013-08-07 东莞宇龙通信科技有限公司 一种终端与网络播放器同步显示的系统及方法
CN106034080A (zh) * 2015-03-10 2016-10-19 中兴通讯股份有限公司 分布式系统中元数据的迁移方法及装置
CN106383731B (zh) * 2016-09-14 2019-08-27 Oppo广东移动通信有限公司 一种数据迁移方法及移动终端
CN106844510B (zh) * 2016-12-28 2021-01-15 北京五八信息技术有限公司 一种分布式数据库集群的数据迁移方法和装置
CN111147226B (zh) * 2018-11-02 2023-07-18 杭州海康威视系统技术有限公司 数据存储方法、装置及存储介质

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008033424A2 (en) * 2006-09-12 2008-03-20 Foleeo, Inc. Hive-based peer-to-peer network
CN101626389A (zh) * 2008-07-12 2010-01-13 Tcl集团股份有限公司 一种网络节点的管理方法
CN101651709A (zh) * 2009-09-01 2010-02-17 中国科学院声学研究所 一种p2p下载文件完整性校验方法

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7996679B2 (en) * 2005-10-05 2011-08-09 International Business Machines Corporation System and method for performing a trust-preserving migration of data objects from a source to a target
CN101127915B (zh) * 2007-09-20 2011-04-20 中兴通讯股份有限公司 一种基于增量式的电子节目导航数据同步方法及系统
CN101242356B (zh) * 2007-12-06 2010-08-18 中兴通讯股份有限公司 Iptv系统中内存数据库的实现方法及iptv系统
CN101453451A (zh) * 2007-12-07 2009-06-10 北京闻言科技有限公司 一种增量下载数据的方法

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008033424A2 (en) * 2006-09-12 2008-03-20 Foleeo, Inc. Hive-based peer-to-peer network
CN101626389A (zh) * 2008-07-12 2010-01-13 Tcl集团股份有限公司 一种网络节点的管理方法
CN101651709A (zh) * 2009-09-01 2010-02-17 中国科学院声学研究所 一种p2p下载文件完整性校验方法

Also Published As

Publication number Publication date
CN102739704A (zh) 2012-10-17
WO2012136091A3 (zh) 2012-11-29

Similar Documents

Publication Publication Date Title
US9923786B2 (en) System and method for performing a service discovery for virtual networks
US10681127B2 (en) File upload method and system
US7978631B1 (en) Method and apparatus for encoding and mapping of virtual addresses for clusters
US12015666B2 (en) Systems and methods for distributing partial data to subnetworks
US9143452B2 (en) Data processing
EP1825654B1 (en) Routing a service query in an overlay network
US9992274B2 (en) Parallel I/O write processing for use in clustered file systems having cache storage
WO2013178082A1 (zh) 图片上传方法、系统、客户端及网络服务器、计算机存储介质
WO2018049966A1 (zh) 视频监控系统的控制方法、装置及系统
US20120191769A1 (en) Site-aware distributed file system access from outside enterprise network
US10956501B2 (en) Network-wide, location-independent object identifiers for high-performance distributed graph databases
US20170097941A1 (en) Highly available network filer super cluster
WO2012136091A2 (zh) 一种对等网络中数据迁移的方法及系统
WO2022242361A1 (zh) 数据下载方法、装置、计算机设备和存储介质
JP2015509232A (ja) クラウドにおけるメッセージングのための方法および装置
US9426246B2 (en) Method and apparatus for providing caching service in network infrastructure
WO2020024445A1 (zh) 数据存储方法、装置、计算机设备及计算机存储介质
WO2008089616A1 (fr) Serveur, système réseau p2p et procédé d'acheminement et de transfert de l'affectation de la clé de ressource de ce dernier.......
KR101585413B1 (ko) 소프트웨어 정의 네트워크 기반 클라우드 컴퓨팅 시스템을 위한 오픈플로우 컨트롤러 및 재해복구 방법
CN112948052B (zh) 跨数据中心的虚拟机迁移方法、数据中心及计算机介质
US11601497B1 (en) Seamless reconfiguration of distributed stateful network functions
US11996981B2 (en) Options template transport for software defined wide area networks
Liu et al. Software‐Defined Edge Cloud Framework for Resilient Multitenant Applications
US20240012575A1 (en) Methods and storage nodes to decrease delay in resuming input output (i/o) operations after a non-disruptive event for a storage object of a distributed storage system by utilizing asynchronous inflight replay of the i/o operations
Chen et al. Research of distributed file system based on massive resources and application in the network teaching system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 12767922

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase in:

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 12767922

Country of ref document: EP

Kind code of ref document: A2