CN111046065B - Extensible high-performance distributed query processing method and device - Google Patents

Extensible high-performance distributed query processing method and device

Publication number
CN111046065B
Authority
CN
China
Prior art keywords
node
list
nodes
message
network system
Prior art date
Legal status
Active
Application number
CN201911032931.9A
Other languages
Chinese (zh)
Other versions
CN111046065A (en
Inventor
黄罡
刘佳皓
景翔
蔡华谦
Current Assignee
Peking University
Original Assignee
Peking University
Priority date
Filing date
Publication date
Application filed by Peking University
Priority to CN201911032931.9A
Publication of CN111046065A
Application granted
Publication of CN111046065B
Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2471Distributed queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2453Query optimisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange

Abstract

The invention provides an extensible high-performance distributed query processing method and device, applied to a P2P network system comprising a plurality of nodes. The invention maintains the P2P network system as a tree structure with high fault tolerance and load balance and broadcasts the query condition to the nodes over that tree. After receiving the query request, each node returns the local data that satisfies the query condition to its parent node in the P2P network system. The parent node deduplicates and settles the data returned by all of its child nodes together with its own local query result and returns the processed result to its own parent node, so that the data is aggregated layer by layer back to the root node. In this process, hop-count optimization, delay optimization, and a neighbor node management protocol are used to address the scalability and fault tolerance of the query function of a distributed ledger based on graph-structured random storage.

Description

Extensible high-performance distributed query processing method and device
Technical Field
The present invention relates to the field of blockchain technology, and in particular to an expandable high-performance distributed query processing method and an expandable high-performance distributed query processing apparatus.
Background
Owing to characteristics such as tamper resistance, anonymity, and decentralization, blockchain technology has received extensive attention from industry and academia and has given rise to a large number of blockchain applications, including Bitcoin and Ethereum. A traditional blockchain uses a chain-structured ledger and maintains a single, globally agreed longest chain through a network-wide consensus mechanism. It has low transaction throughput and high transaction cost and cannot scale, so it cannot be applied to scenarios with demanding real-time and throughput requirements, such as banks and exchanges.
To address this problem, recent blockchain research and practice have produced a new kind of distributed ledger based on a graph structure. Unlike a chain ledger, a graph-structured ledger usually adopts a consensus algorithm that does not rely on proof of work (PoW). In consortium-chain scenarios, to further improve transaction throughput, data is not synchronized across the whole network. Instead, a randomized storage strategy is adopted: each transaction is stored on several randomly chosen nodes in the network, and the consensus algorithm relies on the retrieval results of all relevant nodes. This distributed ledger based on graph-structured random storage poses two challenges for querying transactions. The first is to ensure the scalability of the query function as the number of nodes and the network-wide TPS (transactions per second) increase. The second is to ensure the fault tolerance of the query function: the relevant data on all nodes must be retrieved to support the consensus algorithm even though the online state of the nodes and the state of the links between them change dynamically.
Disclosure of Invention
The invention provides an expandable high-performance distributed query processing method and an expandable high-performance distributed query processing device, so as to overcome the technical problems.
In order to solve the above problems, the present invention discloses an expandable high-performance distributed query processing method, which is applied to a peer-to-peer (P2P) network system. The P2P network system includes a plurality of nodes, and each node maintains an Active List and a Passive List, the Active List being divided into an active part, the Eager List, and an inactive part, the Lazy List. The number of nodes in the Active List is a fixed value. The Eager List stores the nodes with which the node has established a TCP connection in the P2P network system and is used for transmitting messages. The Lazy List stores the remaining nodes of the Active List and is used for transmitting the digest or the ID of a message, for optimization and fault tolerance of the P2P network system. The Passive List stores random nodes, which are used to replace nodes disconnected from the Active List and to keep the node connected to the network in the P2P network system. The method comprises the following steps:
in the P2P network system, a first node obtains a query request broadcast by a parent node thereof, wherein the first node is any node in the P2P network system;
the first node broadcasts the query request to its own child nodes through a tree maintenance program; each child node in turn broadcasts the query request to its own child nodes using the tree structure of the P2P network system, and this broadcasting step is repeated until the query request has been broadcast to all nodes in the P2P network system; after receiving the query request, each node searches its local database, waits for its child nodes to return their results, performs settlement and deduplication once the data returned by all of its child nodes has been collected, and returns the result to its parent node; after this layer-by-layer feedback, when the root node that received the user's query request has received the results returned by all of its child nodes, it performs the final settlement and deduplication to generate the final query result and returns the final query result to the user;
the tree maintenance program comprises an expandable maintenance program and a fault tolerance maintenance program;
for the extensible maintenance program, the method includes:
when the first node broadcasts the query request to the child nodes of the first node, the first node sends an IHAVE message to a second node in the child nodes of the first node, wherein the IHAVE message comprises a message ID;
the second node checks whether it has received a NORMAL message corresponding to the message ID for delivering the query request;
if the second node does not receive the NORMAL message corresponding to the message ID within the timeout period, executing the following steps:
the second node generates a GRAFT message for repairing the P2P network system; the GRAFT message includes the message ID and a request to receive the IHAVE message;
the second node sends the GRAFT message to the first node, and moves the first node from the Lazy List of the second node to the Eager List, so that the first node repairs the P2P network system;
if the second node has received a NORMAL message corresponding to the message ID within a timeout period, performing the following steps:
the second node calculates the difference between the receiving hop count of the IHAVE message and the receiving hop count of the NORMAL message;
the second node judges whether the hop count difference exceeds a hop count threshold value;
if the hop count difference exceeds a hop count threshold, the second node repairs the P2P network system;
for the fault tolerance maintenance program, the method comprises:
when the connection between a first node and a second node constituting an edge of the P2P network system is disconnected, the first node removes the second node from its Eager List;
the first node initiates a query request to each first target node in its Passive List in turn; the query request comprises an instruction for checking whether the first target node is online and an instruction for querying the size of the Lazy List of the first target node;
the first node receives a query result returned by each first target node aiming at the query request, and selects a second target node with the smallest Lazy List and the lowest delay from the first target nodes according to the delay in the query result and the Lazy List of each first target node;
and the first node adds the second target node to its own Lazy List and repairs the P2P network system by using the node in the Lazy List as a replacement edge.
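The decision that the second node makes in the extensible maintenance program can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the names `Peer`, `on_ihave_timeout`, and the action strings are hypothetical, the hop count difference is assumed to be taken as NORMAL hops minus IHAVE hops, and the repair action itself is left abstract because the text above does not spell it out.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    """Hypothetical per-node state holding the two halves of the Active List."""
    eager: set = field(default_factory=set)       # peers that exchange full NORMAL messages
    lazy: set = field(default_factory=set)        # peers that exchange only IHAVE announcements
    received: dict = field(default_factory=dict)  # message ID -> hop count of the NORMAL copy


def on_ihave_timeout(peer, sender, msg_id, ihave_hops, hop_threshold):
    """Decide what the second node does when the timeout for an IHAVE expires.

    Returns "GRAFT", "REPAIR" or "NONE" (illustrative action names).
    """
    if msg_id not in peer.received:
        # No NORMAL message arrived in time: move the sender from the
        # Lazy List to the Eager List and answer with a GRAFT message.
        peer.lazy.discard(sender)
        peer.eager.add(sender)
        return "GRAFT"
    # A NORMAL message did arrive: compare the hop counts of the two paths.
    # Assumption: the difference is NORMAL hops minus IHAVE hops.
    hop_diff = peer.received[msg_id] - ihave_hops
    if hop_diff > hop_threshold:
        return "REPAIR"
    return "NONE"
```

For example, a node that never received the NORMAL copy of message `m1` would promote the announcing node and return `"GRAFT"`, while a node that received it over a path three hops longer than the announced one would, with a threshold of 1, return `"REPAIR"`.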
In order to solve the above problems, the present invention also discloses a scalable high-performance distributed query processing apparatus, which is applied to a P2P network system. The P2P network system includes a plurality of nodes, and each node maintains an Active List and a Passive List, the Active List being divided into an active part, the Eager List, and an inactive part, the Lazy List. The number of nodes in the Active List is a fixed value. The Eager List stores the nodes with which the node has established a TCP connection in the P2P network system and is used for transmitting messages. The Lazy List stores the remaining nodes of the Active List and is used for transmitting the digest or the ID of a message, for optimization and fault tolerance of the P2P network system. The Passive List stores random nodes, which are used to replace nodes disconnected from the Active List and to keep the node connected to the network in the P2P network system. The apparatus comprises:
a query request obtaining module, configured in a first node and used to obtain, in the P2P network system, a query request broadcast by its parent node, where the first node is any node in the P2P network system;
a query request broadcasting module, configured in the first node and used to broadcast the query request to the child nodes of the first node through a tree maintenance program; each child node in turn broadcasts the query request to its own child nodes using the tree structure of the P2P network system, and this broadcasting step is repeated until the query request has been broadcast to all nodes in the P2P network system; after receiving the query request, each node searches its local database, waits for its child nodes to return their results, performs settlement and deduplication once the data returned by all of its child nodes has been collected, and returns the result to its parent node; after this layer-by-layer feedback, when the root node that received the user's query request has received the results returned by all of its child nodes, it performs the final settlement and deduplication to generate the final query result and returns the final query result to the user;
the tree maintenance program comprises an expandable maintenance program and a fault tolerance maintenance program;
for the extensible maintenance program, the apparatus comprises:
an IHAVE message sending module, configured in the first node, for sending an IHAVE message to a second node in the child nodes when the query request is broadcast to the child nodes, where the IHAVE message includes a message ID;
a NORMAL message checking module, configured in the second node, for checking whether it has received a NORMAL message corresponding to the message ID for delivering the query request;
a GRAFT message generating module, configured in the second node and used to generate a GRAFT message for repairing the P2P network system when no NORMAL message corresponding to the message ID is received within a timeout period; the GRAFT message includes the message ID and a request to receive the IHAVE message;
a GRAFT message sending module, configured in the second node and used to send the GRAFT message to the first node when no NORMAL message corresponding to the message ID is received within the timeout period, and to move the first node from the Lazy List of the second node to its Eager List, so that the first node repairs the P2P network system;
a hop count difference calculation module, configured in the second node and used to calculate the difference between the receiving hop count of the IHAVE message and the receiving hop count of the NORMAL message when the NORMAL message corresponding to the message ID has been received within the timeout period;
a hop count difference determining module, configured in the second node and used to determine whether the hop count difference exceeds a hop count threshold when the NORMAL message corresponding to the message ID has been received within the timeout period;
a first system repair module, configured in the second node and used to repair the P2P network system when the hop count difference exceeds the hop count threshold;
for the fault tolerance maintenance program, the apparatus comprises:
a second node removing module configured in the first node, for removing the second node from its Eager List when a connection between the first node and the second node constituting an edge of the P2P network system is disconnected;
a query request initiating module, configured in the first node and used to initiate a query request to each first target node in the Passive List of the first node in turn; the query request comprises an instruction for checking whether the first target node is online and an instruction for querying the size of the Lazy List of the first target node;
a query result receiving module, configured in the first node and used to receive the query result returned by each first target node for the query request and to select from the first target nodes, according to the delay in the query result and the size of the Lazy List of each first target node, a second target node with the smallest Lazy List and the lowest delay;
and a second system repair module, configured in the first node and used to add the second target node to the Lazy List of the first node and to repair the P2P network system by using the node in the Lazy List as a replacement edge.
Compared with the prior art, the invention has the following advantages:
The invention maintains the P2P network system as a tree structure with high fault tolerance and load balance and broadcasts the query condition to the nodes over that tree. After receiving the query request, each node returns the local data that satisfies the query condition to its parent node in the P2P network system. The parent node deduplicates and settles the data returned by all of its child nodes together with its own local query result and returns the processed result to its own parent node, so that the data is aggregated layer by layer back to the root node, thereby reducing the load of the proxy node and ensuring low delay.
To address the scalability of the query function of a distributed ledger based on graph-structured random storage, the invention adopts a hop-count optimization method: it optimizes the P2P network system according to the hop counts of message transmission and constructs a more balanced P2P network system, so that the processing of query results is evenly distributed across all nodes in the network. The out-degree of each node can be adjusted dynamically according to its computing capacity, so that load balance is ensured without greatly affecting query delay, which ensures the scalability of the system.
To address the fault tolerance of the query function of a distributed ledger based on graph-structured random storage, the invention adopts delay optimization and a neighbor node management protocol. Delay optimization ensures that a query message is still received by lower-layer nodes when an upper-layer node is down, and the neighbor node management protocol dynamically replaces nodes that leave the network with newly online nodes, thereby ensuring the connectivity of the whole network.
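The replacement step of the neighbor node management protocol, in which a node probes the candidates in its Passive List and picks the one with the smallest Lazy List and the lowest delay, can be sketched as follows. The names `ProbeResult` and `pick_replacement` are hypothetical. The text does not say how the two criteria are ranked against each other, so this sketch assumes that Lazy List size is compared first and delay breaks ties.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ProbeResult:
    """Hypothetical answer of one Passive List candidate to the probe request."""
    node_id: str
    online: bool
    lazy_size: int   # size of the candidate's Lazy List
    delay_ms: float  # measured round-trip delay of the probe


def pick_replacement(results: Iterable[ProbeResult]) -> Optional[str]:
    """Return the ID of the candidate to add to the Lazy List, or None."""
    candidates = [r for r in results if r.online]
    if not candidates:
        return None
    # Assumption: smallest Lazy List first, lowest delay as the tie-breaker.
    best = min(candidates, key=lambda r: (r.lazy_size, r.delay_ms))
    return best.node_id
```

Preferring candidates with a small Lazy List steers new edges towards lightly connected nodes, which is one plausible way to keep the load balanced as nodes join and leave.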
Drawings
FIG. 1 is a flowchart of the steps of a scalable high-performance distributed query processing method according to an embodiment of the invention;
FIG. 2.1 is a schematic diagram of the first generation of the P2P network system;
FIG. 2.2 is a schematic diagram of the P2P network system after the first message transmission is completed;
FIG. 3 is a flowchart illustrating the steps of an extensible maintenance program according to an embodiment of the present invention;
FIG. 4 is a flowchart illustrating the operation of the second node after receiving the IHAVE message according to an embodiment of the present invention;
FIG. 5 is a flowchart illustrating the steps of a fault tolerance maintenance procedure according to an embodiment of the present invention;
FIG. 6.1 is a schematic diagram of the number of nodes versus hop count according to the embodiment of the present invention;
FIG. 6.2 is a structural diagram of hop count distribution according to an embodiment of the present invention;
FIG. 6.3 is a schematic structural diagram of the node number-deduplication ratio according to the embodiment of the present invention;
FIG. 6.4 is a schematic diagram of failure rate versus hop count for an embodiment of the present invention;
FIG. 6.5 is a schematic diagram of a failure rate-hop count distribution of an embodiment of the present invention;
FIG. 6.6 is a schematic of failure rate vs. deduplication rate for an embodiment of the present invention;
FIG. 6.7 is a schematic structural diagram of the total node out-degree and the hop count according to the embodiment of the present invention;
FIG. 6.8 is a schematic structural diagram of the total node out-degree and the de-duplication rate according to the embodiment of the present invention;
FIG. 6.9 is a schematic structural diagram of total node out-degree and duplicate removal rate under the fixed root node out-degree in the embodiment of the present invention;
FIG. 6.10 is a schematic structural diagram of total node out-degree and hop count under fixed and unfixed root node out-degree in the embodiment of the present invention;
FIG. 7.1 is a field description diagram of a query result;
FIG. 7.2 is a diagram of the results of a multi-conditional query;
FIG. 8 is a schematic structural diagram of a scalable high-performance distributed query processing apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
A distributed ledger can securely record user data on distributed network nodes, but once the data has been securely recorded, a reliable and efficient mechanism is still needed to retrieve it. Unlike a traditional centralized database, a distributed ledger is a consensus database that replicates, shares, and synchronizes data across different geographic locations without a central administrator or centralized data store. Reliable storage of transactions and prevention of tampering are the core of distributed ledger technology, but once transactions are reliably stored, users need to query the ledger for transaction data, for example to query balances, transaction histories, and the contents of transactions. Such queries are not only exact queries; they may also be fuzzy queries, range queries, or multi-condition queries. The query function should support not only the retrieval of transaction data but also tracing and auditing when transaction data is disputed. How to process a user's query request, find the data on the distributed ledger that meets the conditions, and return the result to the user accurately and quickly is therefore as important as data storage.
In architectures such as Bitcoin and Ethereum, a miner node packs transactions into a block and then broadcasts the block to all nodes on the network. On receiving a newly packed block, each node adds the block to the blockchain structure it maintains. Every node in the network therefore holds all transaction data, so any node can act as the proxy node for a query request and respond by retrieving the matching data from its own database. However, Bitcoin has produced 193 GiB of data since its genesis block in 2009; every full node on the Bitcoin network needs 193 GiB of disk space, and the amount of data increases over time. To increase transaction throughput and save disk space, the new type of distributed ledger abandons network-wide data synchronization and instead stores transaction data partially and at random. That is, no node on the network holds the full data set; each node stores only a random portion of the transaction data. Because nodes in such an architecture do not store the full data set, the query methods of Bitcoin and Ethereum cannot be applied.
Another intuitive way to implement queries is to synchronize all data on the network to a node that acts as the proxy for query requests. The proxy node obtains the data of all nodes on the network in some way, verifies and aggregates it, stores it in its own database, and provides the query function externally. However, this scheme is suitable only for scenarios with a small number of nodes and a low TPS. When the number of nodes and the TPS reach a certain threshold, the volume of transaction data exceeds the bandwidth and computing capacity of the proxy node and the query function becomes unavailable; in other words, the architecture of the query system is not scalable. In addition, the network environment and the online state of nodes are complex and changeable, and the frequent joining and leaving of nodes should not affect the use of the query function, so the query system should have a certain degree of fault tolerance.
Therefore, the query system to be realized needs the following characteristics: 1) scalability: the query function should not become unavailable as the number of nodes on the network and the TPS increase; 2) fault tolerance: the frequent joining and leaving of nodes in the network should not affect the use of the query function.
In a common prior-art P2P file sharing system, a search for files meeting a given condition only needs to find one copy of a matching file. In the distributed ledger considered here, data present on the network is not necessarily valid: a valid transaction must be stored on, and signed by, three different nodes. Moreover, because the data is stored randomly in the network, a structured network cannot be built from the attributes of the transactions and a global index cannot be established, so transactions in the network can only be queried by broadcasting. To avoid misjudging the validity of a query result, the query request should not be broadcast to only some of the nodes in the network; it must be sent to all nodes in the network. The system is completely decentralized, and any node on the network can receive and respond to query requests.
When a node receives the query request and holds data that satisfies the query condition, it returns the query result to the proxy node of the query service (the node that initiated the broadcast). All nodes holding matching data send their query results to the proxy node in this way, which requires the proxy node to establish a connection with each of these nodes in turn and accept the data each one returns. This places a heavy overhead on the proxy node: when many nodes request connections to the proxy node at the same time and the number of query results is large, the bandwidth, computing, or storage capacity of the proxy node is likely to be exceeded. As a result, the validity of transactions may not be verified and correct results may not be returned, and the proxy node may even go down. If, instead, the transmission paths between the nodes form a chain, each node returns data to only one node on the layer above. Although this reduces the number of connections to the proxy node and its data processing load, it causes high delay and wastes a large amount of network bandwidth.
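The rule that a valid transaction must be held by three different nodes explains why a partial broadcast can misjudge validity. The following minimal sketch counts the distinct nodes that returned each transaction. The function name, the `(node_id, tx_id)` result format, and the constant name are assumptions made for illustration, and signature verification is omitted.

```python
from collections import defaultdict

REQUIRED_COPIES = 3  # "three different nodes", as stated in the text


def valid_transactions(results):
    """Return the IDs of transactions reported by enough distinct nodes.

    `results` is an iterable of (node_id, tx_id) pairs, a hypothetical
    format for the per-node query answers.
    """
    holders = defaultdict(set)
    for node_id, tx_id in results:
        holders[tx_id].add(node_id)  # repeated answers from one node count once
    return {tx for tx, nodes in holders.items() if len(nodes) >= REQUIRED_COPIES}
```

If the query reached only two of the three nodes holding a transaction, the same check would wrongly reject it, which is why the request has to reach every node.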
The embodiment of the present invention targets architectures of this kind, in which data is stored randomly and redundantly in the P2P network and the validity of data is verified from the retrieval results of all or some of the nodes. Referring to FIG. 1, a flowchart of the steps of a scalable high-performance distributed query processing method according to an embodiment of the present invention is shown. The method is applied to a P2P network system that includes a plurality of nodes, and each node maintains an Active List and a Passive List, the Active List being divided into an active part, the Eager List, and an inactive part, the Lazy List. The number of nodes in the Active List is a fixed value. The Eager List stores the nodes with which the node has established a TCP connection in the P2P network system and is used for transmitting messages. The Lazy List stores the remaining nodes of the Active List and is used for transmitting the digest or the ID of a message, for optimization and fault tolerance of the P2P network system. The Passive List stores random nodes, which are used to replace nodes disconnected from the Active List and to keep the node connected to the network in the P2P network system. The method may specifically include:
step S101, a first node obtains a query request broadcasted by a father node of the first node, wherein the first node is any node in the P2P network system;
step S102, the first node broadcasts the query request to its own child nodes through a tree maintenance program; each child node in turn broadcasts the query request to its own child nodes using the tree structure of the P2P network system, and this broadcasting step is repeated until the query request has been broadcast to all nodes in the P2P network system; after receiving the query request, each node searches its local database, waits for its child nodes to return their results, performs settlement and deduplication once the data returned by all of its child nodes has been collected, and returns the result to its parent node; after this layer-by-layer feedback, when the root node that received the user's query request has received the results returned by all of its child nodes, it performs the final settlement and deduplication to generate the final query result and returns the final query result to the user.
In the embodiment of the invention, a P2P network system (hereinafter, the system is referred to as a P2P network system or tree for short) with a tree structure with high fault tolerance and load balance is provided, and the P2P network system is not constructed when the protocol starts to run. Instead, when the first message is delivered, a P2P network system is formed along the propagation path of the message, and the tree is optimized and repaired by the propagation of the subsequent messages. In the query method for the P2P network system, the embodiment of the present invention adopts a query request method for a neighbor node (a neighbor node refers to a parent node or a child node of a certain node), broadcasts the query request to the child node of the node through a tree maintenance program, broadcasts the child node to its corresponding child node, repeats the above steps, and broadcasts the query message to all nodes on the network by using the tree structure. After receiving the query request, the node retrieves the local database and waits for the results of the child nodes to return, and in the process, the 'divide and conquer concept' is adopted to uniformly distribute the duplicate removal, verification and transmission of all the results of the query to all the nodes on the network. Firstly, a tree structure is formed among nodes, except a root node, each node transmits a query result to a father node (the father node in the embodiment of the invention), after the father node receives all data returned by all child nodes, the query result is subjected to duplicate removal and verification, after layer-by-layer feedback, when the root node receiving a user query request receives the returned results of all child nodes, final settlement and duplicate removal operation is carried out, a final query result is generated, and the final query result is returned to the user. 
Through the divergent query method over neighbor nodes and the deduplication and verification of query results during return transmission, the embodiment of the invention can reduce the load on the proxy node and ensure low delay. The tree is used not only to collect the query results but also to transmit the query request.
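The broadcast-then-aggregate flow described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the names TreeNode and query are hypothetical, records are assumed to be (ID, value) pairs, deduplication is assumed to key on the record ID, and the verification step is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class TreeNode:
    """One peer: a local database plus its child nodes in the tree (hypothetical type)."""
    name: str
    local_records: list                       # assumed shape: [(record_id, value), ...]
    children: list = field(default_factory=list)


def query(node, predicate):
    """Forward the query to all children, then merge and deduplicate on the way back."""
    # Retrieve the local database first.
    merged = {rec[0]: rec for rec in node.local_records if predicate(rec)}
    # Collect the (already deduplicated) result of every child node.
    for child in node.children:
        for rec in query(child, predicate):
            merged.setdefault(rec[0], rec)    # keep one copy per record ID
    return list(merged.values())


# Small example tree: n1 is the root that received the user query.
leaf_a = TreeNode("n4", [("tx1", 10), ("tx2", 20)])
leaf_b = TreeNode("n5", [("tx2", 20), ("tx3", 30)])
mid = TreeNode("n2", [("tx1", 10)], [leaf_a, leaf_b])
root = TreeNode("n1", [("tx3", 30)], [mid])
result = query(root, lambda rec: rec[1] >= 20)
```

In this sketch the duplicate copies of tx2 and tx3 are removed at n2 and n1 respectively, so the root only ever handles one copy per record coming from each subtree.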
In terms of specific implementation, the distributed ledger based on graph-structured random storage is a completely decentralized P2P application. No node on the network can store the node information of the whole network, so some related data structures need to be defined to store information about part of the nodes on the network. The information about all nodes is stored in a dispersed manner across all nodes of the network, that is, the partial information maintained by all nodes together reflects the topology of the whole network. Each node can update dynamically according to the connection state between itself and the nodes it maintains, so as to ensure its connection to the nodes of the whole network. The construction of the P2P network system depends on the above data structures. Because the connection state and delay between nodes change dynamically, the P2P network system is repaired and optimized according to these changes, and the repair and optimization process requires some additional data structures, such as the data structure for caching messages, with which the P2P network system can be dynamically optimized and repaired according to the arrival order of messages. How to define and use the above data structures to maintain a global tree therefore becomes critical. First, tree maintenance is mainly divided into the following three parts:
1. Construction of the tree
In the existing network topology, some connection edges are deleted and an optimal spanning tree satisfying the above conditions is constructed. This part runs in the initialization phase of the protocol.
2. Optimization of trees
The connections between nodes in the network and the online state of the nodes are constantly changing, so the tree cannot remain unchanged. It must be dynamically optimized as the network environment changes, for example by optimizing the transmission delay, the transmission hop count and the out-degree of the nodes.
3. Tree repair
When a node in the P2P network system leaves the network, or an edge of the tree is temporarily disconnected, the nodes below it are affected in receiving transmitted messages. Tree repair ensures that, when nodes leave or connections are broken, the P2P network system is repaired so that all nodes can still receive the broadcast message and can transmit the query results, and that the tree continues to be repaired during subsequent propagation.
In order to ensure the rapid transmission of the messages in the transmission process, the connection between the nodes adopts a TCP long connection mode, and the TCP long connection can ensure the reliable transmission of the messages, avoid the overhead of connection establishment every time and also can rapidly detect the failure of the nodes or the disconnection of the connection.
In order to implement the embodiment of the present invention, the P2P network system includes 3 protocols, namely BroadcastTree, MsgTransferProt and PartialView. The BroadcastTree is responsible for the maintenance work of the P2P network system; the MsgTransferProt is responsible for broadcasting the query message and for verifying and transmitting the query result; the PartialView is responsible for managing the neighbor nodes of each node, and the neighbor nodes comprise parent nodes and child nodes; wherein the Active List and the Passive List are located in the PartialView protocol of the P2P network system;
each node comprises a first Map cache, a second Map cache and a third Map cache, wherein the first Map cache is a ReceivedMsgMap, which stores the mapping from message ID to message and is used for caching the currently received messages so as to respond to requests for a message from other nodes that have not received it;
the second Map cache is a NotReceivedMsgMap, which caches the mapping from message ID to the node that sent the message; when the specified duration is reached and the message sent by the node in the Eager List has still not been received, a Timer is triggered, and the Timer is used for requesting the message from the node that sent the message ID and for repairing the P2P network system;
the third Map cache is a TimingCacheMsgMap, which is responsible for caching the currently received messages; if a message sent by a node in the Lazy List is received within a specified time range, the hop counts of the two messages are compared to decide whether to optimize the tree, which gives the P2P network system a new possibility for optimization.
For each query request, the structure of the P2P network system may change during the query because of network changes. Therefore, the transmission path up to the current node needs to be recorded so as to provide a path for transmitting the query result. For this purpose a transferPathMap<QueryID, Path> is defined, which caches the mapping from the ID of the query request message to the current transmission Path.
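The per-node state described above can be summarized in a minimal Python sketch. The class name NodeState, the field names and the value shapes are assumptions made for illustration; the source only names the four maps and the three lists, and it does not say whether the recorded Path includes the current node (it is assumed here that it does).

```python
from dataclasses import dataclass, field


@dataclass
class NodeState:
    """Hypothetical container for the lists and Map caches kept by one node."""
    node_id: str
    eager_list: set = field(default_factory=set)      # tree edges (TCP long connections)
    lazy_list: set = field(default_factory=set)       # backup edges, receive IHAVE only
    passive_list: set = field(default_factory=set)    # random candidates from the network
    received_msg_map: dict = field(default_factory=dict)       # message ID -> message
    not_received_msg_map: dict = field(default_factory=dict)   # message ID -> announcing node
    timing_cache_msg_map: dict = field(default_factory=dict)   # message ID -> hop count (assumed)
    transfer_path_map: dict = field(default_factory=dict)      # query ID -> path

    def record_path(self, query_id, path_so_far):
        """Remember the path a query took to reach this node, for the return trip."""
        self.transfer_path_map[query_id] = list(path_so_far) + [self.node_id]


state = NodeState("n3")
state.record_path("q1", ["n1", "n2"])
```

Keeping the path per query ID, rather than relying on the current parent, lets a result travel back along the route the request actually took even if the tree has been reshaped in the meantime.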
Specifically, assuming that a complete P2P network system has not yet been formed, a user sends a query message to node 1 (the first node), and node 1 broadcasts the message to all nodes in its Eager List, i.e., node 2, node 3, and node 6. After receiving the message, each of these nodes first checks whether the message has already been received, that is, whether the current message exists in its ReceivedMsgMap. If so, the node that sent the message is moved from the Eager List to the Lazy List and a PRUNE message is sent to the node that sent the message. If not, the message is cached in the ReceivedMsgMap and sent to all nodes in the Eager List. Referring to fig. 2.1, a schematic diagram of the first generation of a P2P network system is shown; in fig. 2.1, the dotted lines represent nodes in a Lazy List, used for fault tolerance, and the solid lines represent the edges of the P2P network system, used for transmitting query messages. The P2P network system after the first message transmission is completed is shown in fig. 2.2, where some nodes in the Eager List have been moved to the Lazy List, and the connections between each node and the nodes in its Eager List form the edges of the spanning tree. Subsequent messages need only be passed to all nodes in the Eager List, while the ID of the message is sent to all nodes in the Lazy List. However, the initial P2P network system is generated according to the transmission speed of the transmission paths, i.e. a node with a fast transmission speed may have more subtree nodes than a node with a slow transmission speed, and the balance of the tree is not taken into account.
The above process maintains three important lists: Eager List, Lazy List, and Passive List. The three lists are the basis for running the maintenance algorithm of the P2P network system, and the quality of their maintenance directly influences the structure of the P2P network system, the delay of the query, and the deduplication and verification of the query result. The Eager List and the Lazy List hold the nodes that maintain a long connection with the current node, and the Passive List is a candidate list for these two lists. Its purpose is to obtain nodes on the network at random, which prevents the situation in which the network becomes partitioned into several connected components whose nodes cannot communicate with each other.
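How the first message carves a tree out of the Eager Lists can be sketched with a small single-process simulation. This is an illustrative sketch only: the function name build_tree is hypothetical, message delivery is modeled as a FIFO queue (so arrival order stands in for transmission speed), and a node is assumed not to forward the message back to the node it came from.

```python
from collections import deque


def build_tree(eager, root):
    """eager: dict node -> set of Eager List peers (symmetric); mutated in place."""
    lazy = {node: set() for node in eager}
    received = {root}                                   # stands in for ReceivedMsgMap
    queue = deque((root, peer) for peer in sorted(eager[root]))
    while queue:
        sender, receiver = queue.popleft()
        if receiver in received:
            # Duplicate: demote the sender to the Lazy List and PRUNE the edge.
            if sender in eager[receiver]:
                eager[receiver].discard(sender)
                lazy[receiver].add(sender)
                eager[sender].discard(receiver)         # effect of the PRUNE at the sender
                lazy[sender].add(receiver)
            continue
        received.add(receiver)                          # cache the new message
        for peer in sorted(eager[receiver]):
            if peer != sender:                          # assumption: skip the sender
                queue.append((receiver, peer))
    return eager, lazy


eager = {1: {2, 3, 6}, 2: {1, 3, 4}, 3: {1, 2, 5}, 4: {2, 5}, 5: {3, 4}, 6: {1}}
eager, lazy = build_tree(eager, root=1)
tree_edges = sum(len(peers) for peers in eager.values()) // 2
```

With six nodes, the surviving Eager List connections form five edges, which is exactly a spanning tree; the pruned connections (2-3 and 4-5 in this example) remain available in the Lazy Lists as backup edges.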
Next, the embodiment of the present invention sets forth a specific scheme for how to dynamically maintain the three lists in a complex network environment, thereby optimizing the delay and calculation of transmission and enhancing the connectivity between nodes, taking the first node as an example.
Firstly, before the first node joins the P2P network system, three lists of the first node need to be initialized, and the method includes:
step 1: the first node acquires partial topology information of the P2P network system, and initializes its Eager List, Lazy List and Passive List by applying the topology information;
in the embodiment of the invention, a node management scheme based on HyParView is adopted. The original HyParView paper defines one or more contact nodes (Contact Nodes); when a node joins the network, it first connects to a contact node, and the contact node sends its own three lists to the joining node. However, this easily leads to network partitioning, in which several connected components are generated and not all nodes can communicate with each other. Therefore, the embodiment of the invention adopts a node joining algorithm based on KAD (Kademlia). The KAD algorithm is a distributed hash table algorithm proposed in 2002 by Petar Maymounkov and David Mazières of New York University. The algorithm measures the degree of association between two nodes by the XOR distance of their IDs. Through an effective layering of the node IDs, each node only needs to store information about part of the nodes on the network, and only log(N) hops are needed to find a resource or locate a node. If no resource or node meets the condition, the resource or node with the nearest XOR distance is found instead. Based on these characteristics, the embodiment of the invention uses the KAD algorithm to initialize the neighbor node list maintained by the node itself.
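The XOR distance used by KAD can be illustrated with a short Python sketch. It shows only the distance metric and a selection of the nearest known IDs; the iterative log(N)-hop lookup is not modeled, and the function names and the 4-bit example IDs are assumptions made for illustration.

```python
def xor_distance(a: int, b: int) -> int:
    """Kademlia-style distance: the bitwise XOR of two node IDs, read as an integer."""
    return a ^ b


def closest_nodes(target_id, known_ids, count):
    """Return the `count` known IDs with the smallest XOR distance to target_id."""
    return sorted(known_ids, key=lambda nid: xor_distance(nid, target_id))[:count]


# Example with 4-bit IDs: distances to 0b1010 are 2, 1, 8 and 4 respectively.
known = [0b1000, 0b1011, 0b0010, 0b1110]
nearest = closest_nodes(0b1010, known, 2)
```

IDs that share a longer common prefix with the target have a smaller XOR distance, which is why 0b1011 and 0b1000 are selected here.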
The topology information comprises Node ID distributed randomly; step 1 may specifically include the following substeps:
substep 11: the first Node initiates a request to a network of a P2P network system by utilizing the Node ID and KAD algorithm, and searches a neighbor Node nearest to the Node ID;
substep 12: the first node selects partial nodes from the neighbor nodes to initialize own Eager List, Lazy List and Passive List.
In the embodiment of the invention, each node is randomly assigned a Node ID during initialization, and uses the Node ID and the KAD algorithm to initiate a request to the network to search for the several nodes nearest to the Node ID. The node then selects some of these nodes to initialize its own three lists. Each time, the several nodes nearest to the node are selected as neighbor nodes; preferably, the nodes with lower delay among the neighbor nodes are selected for the Active List, and the remaining nodes are used for the Passive List. In a preferred embodiment of the present invention, the substep 12 further comprises:
the first node, according to its initial out-degree m, selects from the Eager Lists of the neighbor nodes the m nodes whose Eager Lists have the smallest out-degree and adds them to its own Eager List; according to the number n of nodes in its own Passive List, it randomly selects n nodes from the Passive Lists of the neighbor nodes and adds them to its own Passive List; and it initializes its Lazy List to be empty, thereby completing the initialization of its own three lists.
Further, when m nodes meeting the conditions cannot be found from the Eager Lists of the neighbor nodes, the first node selects d nodes with the lowest delay from the returned Passive Lists of the neighbor nodes, and adds the d nodes into the Eager List of the first node. Thereby ensuring that the Eager List and the Passive List of the node can be initialized to the List with the specified size. The Lazy List is initialized to be empty, and the nodes in the Eager List are transferred to the Lazy List in the subsequent tree building and optimizing process.
Step 2: after the initialization of the Eager List, Lazy List and Passive List is completed, the first node establishes a TCP long connection with each of the nodes in its Eager List, so as to form the edges of the P2P network system;
Step 3: the first node repairs the P2P network system by using the nodes in the Lazy List as alternate edges, leaving only the one edge with relatively fast transmission speed and the fewest hops; the remaining nodes are finally moved to the Lazy List.
Through the above series of operations, the joining node finally serves as a leaf node of the P2P network system, and participates in the transmission of messages and the aggregation of results.
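The list initialization in substep 12 can be sketched in Python as follows. This is a minimal sketch under assumptions: the function name init_lists and the input shapes are hypothetical, "smallest out-degree" is read as the smallest reported Eager List size, and the number d of fallback nodes is assumed to be the shortfall m minus the number of Eager candidates found.

```python
import random


def init_lists(eager_candidates, passive_candidates, m, n, rng=random):
    """
    eager_candidates:   dict node_id -> Eager List size reported for that node
    passive_candidates: dict node_id -> measured delay in milliseconds
    Returns (eager_list, lazy_list, passive_list) for the joining node.
    """
    # Prefer the m candidates whose own Eager Lists are least loaded.
    eager = sorted(eager_candidates, key=lambda nid: (eager_candidates[nid], nid))[:m]
    # Fallback: fill the shortfall with the lowest-delay Passive List candidates.
    shortfall = m - len(eager)
    if shortfall > 0:
        spare = [nid for nid in passive_candidates if nid not in eager]
        spare.sort(key=lambda nid: (passive_candidates[nid], nid))
        eager += spare[:shortfall]
    # Passive List: a random sample of the remaining candidates.
    pool = sorted(nid for nid in passive_candidates if nid not in eager)
    passive = rng.sample(pool, min(n, len(pool)))
    # The Lazy List starts empty and is filled later by tree building and optimization.
    return set(eager), set(), set(passive)


eager_list, lazy_list, passive_list = init_lists(
    eager_candidates={"a": 3, "b": 1},
    passive_candidates={"c": 40, "d": 15, "e": 25},
    m=3,
    n=1,
)
```

In this example only two Eager candidates are available for m = 3, so the lowest-delay passive candidate "d" is promoted, and one of the remaining nodes is sampled into the Passive List.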
Due to changes in network environments and online states of nodes, the nodes and the nodes in Eager List and Lazy List cannot always remain connected, and the nodes in Eager List may be replaced by the nodes in Lazy List, and the nodes in Lazy List may be replaced by the nodes in Passive List. Therefore, the list needs to be dynamically maintained, so that all nodes can grasp the topology structure of the whole network through the neighbor list maintained by the nodes according to the change of the network environment, thereby providing support for the maintenance of the P2P network system.
The tree maintenance program of the embodiment of the invention comprises an expandable maintenance program and a fault tolerance maintenance program;
based on the above, referring to fig. 3, a flowchart of steps of the extensible maintenance program according to the embodiment of the present invention is shown, where the method includes:
step S301, when broadcasting the query request to the child nodes, the first node sends an IHAVE message to a second node in the child nodes, wherein the IHAVE message includes a message ID;
step S302, the second node checks whether it has received a NORMAL message corresponding to the message ID and used for transmitting the query request;
if the second node does not receive the NORMAL message corresponding to the message ID within the timeout period, executing the following steps:
step S303, the second node generates a GRAFT message for repairing the P2P network system; the GRAFT message includes the message ID and a request to receive the IHAVE message;
step S304, the second node sends the GRAFT message to the first node, and moves the first node from its Lazy List to Eager List, so that the first node repairs the P2P network system;
if the second node has received a NORMAL message corresponding to the message ID within a timeout period, performing the following steps:
step S305, the second node calculates the difference between the receiving hop count of the IHAVE message and the receiving hop count of the NORMAL message;
step S306, the second node judges whether the hop count difference exceeds a hop count threshold value;
step S307, if the hop count difference exceeds the hop count threshold, the second node repairs the P2P network system.
In the embodiment of the invention, the NORMAL message carries the query message or the query result transmitted in the P2P network system, and is sent through the TCP long connections established with the nodes in the Eager List; the IHAVE message contains the message ID of a message received by the current node, and is sent through the TCP long connections established with the nodes in the Lazy List to inform them that the message corresponding to the message ID can be acquired from this node; the GRAFT message is a repair message of the P2P network system, used for requesting a message that has not been received from the sending node of the IHAVE message, and for replacing a Lazy List edge with an Eager List edge so as to repair the P2P network system; the PRUNE message is used to prune redundant edges of the P2P network system to prevent broadcast storms.
The timeout is set as follows: when the second node receives a message, it checks through the ReceivedMsgMap whether the IHAVE message or the NORMAL message has already been received, and discards the IHAVE message or the NORMAL message if it has; if it has not, the second node checks whether the ID of the IHAVE message or the NORMAL message has been received; if not, the message is discarded; otherwise, the ID of the IHAVE message or of the NORMAL message is added to the NotReceivedMsgMap, and a Timer timeout event is set for the IHAVE message or the NORMAL message. The second node can therefore determine, from the Timer timeout event, whether the IHAVE message sent by the first node among its neighbor nodes has been received within the timeout period. The timeout may be set to 0.1 s. Referring to fig. 4, a flowchart illustrating the operation of the second node after receiving the IHAVE message according to an embodiment of the present invention is shown.
And if the second node does not receive the NORMAL message corresponding to the message ID and used for transmitting the query request within the timeout period, the second node indicates that the parent node (the first node) of the second node is possibly not online or has too high delay, and then the GRAFT message is sent to the first node of the sender of the message ID. The P2P network system is repaired by requesting the first node for the message that was not received, and by moving the first node from Lazy List to Eager List.
Of course, for network reasons the query request may fail to arrive the first time, and the second node then sends the request for receiving the IHAVE message to the first node again, so that the first node may send the query message to the second node multiple times. To prevent repeated messages from occupying node memory and affecting the balance of the tree, the embodiment of the invention also provides the following method: when receiving the query message, the second node checks whether the query message exists in its ReceivedMsgMap; when the query message exists in the ReceivedMsgMap, the second node moves the first node, which sent the query message, from its own Eager List into its own Lazy List, and sends a PRUNE message to the first node; when the query message does not exist in the ReceivedMsgMap, the second node caches the query message in the ReceivedMsgMap and sends the query message to all nodes in its Eager List; and the first node, on receiving the PRUNE message sent by the second node, deletes the second node from the Eager List of the first node and puts the second node into the Lazy List of the first node.
If the second node has received the message within the timeout period, this means that the delay from the parent node (first node) of the second node to the second node is merely somewhat high, and the hop counts of the two messages need to be compared. The tree is repaired only if the Lazy hop count is smaller than the Eager hop count by more than a certain threshold. This keeps the tree stable, so that its structure does not change continuously because of frequent network changes. It also keeps the tree balanced, because the tree is not optimized according to transmission delay alone without regard to hop count. In specific implementation, when receiving a NORMAL message sent by the first node, the second node deletes the record of that NORMAL message from the NotReceivedMsgMap and stores it in the ReceivedMsgMap, so that the data is kept accurate in time and can be used later for maintaining the balance of the tree.
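The decision taken by the second node when the Timer for an IHAVE message fires (steps S302 to S307) can be sketched as a small Python function. The function name and the returned labels "GRAFT", "OPTIMIZE" and "KEEP" are hypothetical, and the received messages are assumed to be tracked as a map from message ID to the hop count of the NORMAL copy.

```python
def on_ihave_timer(msg_id, ihave_hops, received_hops, hop_threshold):
    """
    msg_id:         ID announced by the IHAVE message
    ihave_hops:     hop count carried by the IHAVE message (Lazy path)
    received_hops:  dict message ID -> hop count of the NORMAL copy (Eager path)
    hop_threshold:  configured hop count threshold
    """
    normal_hops = received_hops.get(msg_id)
    if normal_hops is None:
        # No NORMAL message arrived in time: request it and repair the tree.
        return "GRAFT"
    if normal_hops - ihave_hops > hop_threshold:
        # The Lazy path is shorter by more than the threshold: switch edges.
        return "OPTIMIZE"
    # Otherwise leave the tree unchanged to keep it stable.
    return "KEEP"


decisions = [
    on_ihave_timer("m1", 3, {}, 2),
    on_ihave_timer("m1", 3, {"m1": 7}, 2),
    on_ihave_timer("m1", 3, {"m1": 4}, 2),
]
```

Using a threshold rather than a plain comparison means that small hop differences do not trigger a change, which is the stability property the text describes.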
Based on the above, referring to fig. 5, a flowchart of steps of the fault tolerance maintenance program according to the embodiment of the present invention is shown, where the method includes:
step S501, when the connection between a first node and a second node of an edge of the P2P network system is disconnected, the first node removes the second node from the Eager List of the first node;
step S502, the first node initiates a query request to a first target node in the Passive List of the first node in sequence; the query request comprises an instruction for checking whether the first target node is online or not and an instruction for querying the size of the Lazy List of the first target node;
step S503, the first node receives a query result returned by each first target node for the query request, and selects a second target node with the smallest Lazy List and the lowest delay from the first target nodes according to the delay in the query result and the Lazy List of each first target node;
step S504, the first node adds a second target node to its Lazy List, and repairs the P2P network system by using the node in the Lazy List as a replacement edge.
In an embodiment of the present invention, the TCP long connections between the first node and the nodes in the Eager List maintained by the first node constitute the edges of the P2P network system, and the TCP long connections between the first node and the nodes in the Lazy List maintained by the first node constitute the backup edges of the P2P network system. When a connection in the P2P network system is disconnected or a node goes offline, the tree is automatically repaired by the nodes in the Lazy List at the next message broadcast. To keep the Active List and the Passive List balanced, a corresponding node needs to be selected from the Passive List to replace the node in the Active List. The connectivity of the whole tree is thereby ensured, and the fault tolerance is increased.
In steps S501 to S504, it is assumed that a TCP long connection is maintained between the first node and the second node, and that each of them is in the Eager List of the other, so that the connection constitutes an edge of the P2P network system. When the connection between the first node and the second node is broken, the first node first removes the second node from its Eager List. At this point the Active List of the first node has one element fewer, and a node needs to be selected from the Passive List to replace it. The first node initiates a query request to the nodes in its Passive List in sequence. The query request has two purposes: to check whether the node is online, and to query the size of the Lazy List of the node. Finally, the latest access time of each node is updated, and a node is selected by jointly considering the delay and the size of the node's Lazy List. A small Lazy List indicates that the node is at a low layer number in the P2P network system, so grafting onto such a node, and choosing one with low delay, provides more possibilities for tree optimization. Therefore, the Passive List selection strategy is to select the node with the smallest Lazy List and, when several nodes have the smallest Lazy List, the one with the lowest delay. The original HyParView paper instead replaces the node in the original Eager List directly with a node from the Passive List, and sends join requests of different priorities to the nodes in the Passive List according to the current size of the Eager List of the first node. The original paper uses a fixed-size strategy for the Eager List and the Lazy List, which makes this kind of repair necessary.
However, since the policy that the sum of the sizes of the Eager List and the Lazy List is not changed is adopted, even if the current Eager List of the first node only contains one node, the number of the nodes in the Lazy List of the first node is k-1(k is a configured degree parameter when the node joins the network), even if the current Eager List is disconnected, the disconnected edge can be finally repaired by the nodes in the Lazy List. The mode adopted in the original text greatly increases the maintenance cost and the maintenance complexity, and is not suitable for the current scene of the embodiment of the invention. Therefore, in the embodiment of the invention, only the nodes in the Lazy List need to be replaced, and even if the only connection is disconnected, the user only needs to wait for the later rounds of Lazy repair.
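The Passive List selection rule used in steps S502 and S503 can be sketched in Python as follows. The function name pick_replacement and the shape of the probe results are assumptions made for illustration; only the ordering rule (smallest Lazy List first, lowest delay as tie-breaker, offline nodes skipped) comes from the description.

```python
def pick_replacement(probe_results):
    """
    probe_results: list of dicts with the hypothetical keys
        "node", "online", "lazy_size", "delay_ms"
    Returns the chosen node ID, or None if no probed node is online.
    """
    online = [p for p in probe_results if p["online"]]
    if not online:
        return None
    # Smallest Lazy List wins; delay breaks ties among equally small lists.
    best = min(online, key=lambda p: (p["lazy_size"], p["delay_ms"]))
    return best["node"]


probes = [
    {"node": "a", "online": True, "lazy_size": 2, "delay_ms": 5},
    {"node": "b", "online": True, "lazy_size": 1, "delay_ms": 30},
    {"node": "c", "online": True, "lazy_size": 1, "delay_ms": 12},
    {"node": "d", "online": False, "lazy_size": 0, "delay_ms": 1},
]
chosen = pick_replacement(probes)
```

Here node "d" is ignored because it is offline, and "c" is preferred over "b" because both have the smallest Lazy List and "c" has the lower delay.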
Steps S501 to S504 mainly provide a solution for maintaining Eager List and Lazy List, and then, in another optional embodiment of the present invention, a solution for maintaining Passive List is provided.
The Passive List is used for providing a candidate of Active List nodes, and each node uses the Passive List to ensure that the online nodes on the whole network are connected. If the maintenance algorithm of the Passive List is poor, multiple connected components may occur in the network, and each node can initiate a connection request only with a part of nodes in the network, so that all nodes on the whole network cannot receive the query message.
Disconnected nodes are continually added to the Passive List, while nodes with lower delay and a smaller Lazy List are continually removed from the Passive List and added to the Lazy List, so fewer and fewer online nodes remain available in the Passive List. A corresponding strategy is therefore needed to update the Passive List. The whole updating process follows the idea proposed in the original HyParView paper, namely randomized updating. However, the updating strategy adopted differs considerably from the original paper: a low-cost, lazy updating strategy is used, which is better suited to the maintenance scenario of the embodiment of the invention. The nodes in the Passive List of each node are randomly saved nodes of the current network. In the original HyParView paper, at every fixed interval a node sends a Shuffle message with a random hop count (TTL) through its Eager List, and the node receiving the message and the node sending the message perform a Shuffle operation on the nodes in their Passive Lists, so that random nodes of the whole network are maintained in the Passive List of each node. Because all nodes need to perform the Shuffle operation at fixed intervals, the load on the network is increased. To reduce this load, the embodiment of the invention obtains random nodes on the network in a lazy manner (Lazy Load): an update operation is only initiated when the number of online nodes in the Passive List is less than a certain threshold.
In the embodiment of the present invention, the update operation may include the following steps:
when the number of online nodes in a Passive List in the first node is smaller than a preset threshold, the first node generates a fixed hop count TTL;
the first node sends a message carrying the TTL to a random third target node in the Eager List of the first node; after receiving the message, the third target node sends the node IDs in its latest Passive List to the first node, subtracts 1 from the TTL, and then forwards the message with the TTL to a random node in the Eager List of the third target node; these steps are repeated until the TTL is 0;
the first node collects all received node IDs, randomly selects M nodes from all the received nodes, and adds the M nodes to its Passive List, where M is the maximum number of nodes that can be stored in the Passive List minus the number of nodes currently stored in the Passive List.
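The lazy Passive List update can be sketched as a TTL-limited random walk in Python. This is a single-process illustration under assumptions: the function name refresh_passive and the dict-based view of the network are hypothetical, and the sketch additionally filters out the first node itself and IDs it already stores, which the description does not state.

```python
import random


def refresh_passive(start, eager, passive, ttl, capacity, rng=random):
    """
    eager:   dict node -> list of Eager List peers
    passive: dict node -> set of Passive List entries (updated for `start`)
    ttl:     fixed hop count generated by the first node
    """
    collected = set()
    current = start
    while ttl > 0 and eager[current]:
        current = rng.choice(eager[current])    # forward to a random Eager peer
        collected |= passive[current]           # that peer reports its Passive List
        ttl -= 1
    collected -= passive[start] | {start}       # assumption: skip self and known IDs
    free_slots = capacity - len(passive[start])           # this is M in the text
    count = min(max(free_slots, 0), len(collected))
    chosen = rng.sample(sorted(collected), count)
    passive[start].update(chosen)
    return chosen


eager = {"a": ["b"], "b": ["c"], "c": ["a"]}
passive = {"a": {"x"}, "b": {"x", "y"}, "c": {"z", "a"}}
added = refresh_passive("a", eager, passive, ttl=2, capacity=3)
```

In the example each Eager List has a single entry, so the walk a, b, c is deterministic; node "a" learns "y" and "z" and fills its two free slots.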
In addition, in order to further ensure the number of online nodes available in the Passive List, an optional embodiment of the present invention further provides the following steps:
after the first node removes the second node from the Eager List of the first node, it adds the second node to the Passive List of the first node. This further increases the number of nodes in the Passive List.
In summary, the embodiments of the present invention use the three Eager List, Lazy List and Passive List to maintain, and use the Eager List to ensure the connection between nodes in the whole P2P network system, thereby ensuring the transmission of messages; the Passive List is maintained in a random sampling mode to provide support for updating of Eager List and Lazy List, and initialization, repair and optimization algorithms of the P2P network system can be completed. The quality of the List maintenance will affect the stability, transmission delay, and fault tolerance of the spanning tree.
In the embodiment of the invention, the repair and optimization of the tree are realized by exchanging entries between the Active List and Passive List data structures. For example, disconnected nodes are put into the Passive List, nodes in the Lazy List with low delay and few hops replace nodes in the Eager List, and online nodes in the Passive List replace disconnected nodes in the Active List.
In the embodiment of the present invention, nodes in the Eager List are replaced by nodes in the Lazy List according to the delay and hop count of the received messages. As a result, the Eager Lists of the upper-layer nodes of the P2P network system are as full as possible, the out-degree of the nodes is used to the maximum, and the height of the whole tree is reduced, while the lower-layer nodes have enough Lazy List entries for optimization and fault tolerance. The Eager List of a node is acquired randomly through the KAD algorithm, and the Lazy List, which consists of nodes demoted from the Eager List, is therefore also random. The Lazy Lists of the lower-layer nodes thus store random nodes of the whole tree, and this randomness increases the fault tolerance of the whole tree: it avoids the situation in which lower-layer nodes cannot receive the broadcast message because an upper-layer edge is broken. Because the control messages are lightweight, both the optimization and the repair of the tree use a Lazy Repair approach, and nodes on a consortium chain are relatively reliable and do not leave and join very frequently. The optimization and repair process therefore does not occupy a large amount of network bandwidth, and the whole maintenance process is low-load.
Because of the particular design of the scheme of the embodiment of the invention, the role of the Lazy List is not the same as in HyParView. In the embodiment of the present invention, only the total size of the Lazy List and the Eager List is fixed, and the size of the Lazy List itself is not fixed as in the original HyParView paper, so the Lazy List is only a candidate list for the Eager List. The Lazy List is filled by nodes demoted from the Eager List on the P2P network system, and no separate list maintenance mechanism needs to be introduced to maintain it. Owing to this design, which differs from the original HyParView paper, the size of the Lazy List is almost 0 at the nodes with a lower layer number in the P2P network system, and is relatively large at the nodes with a higher layer number. The TCP long connections established between nodes are fully used to transmit messages, the out-degree of the nodes is as large as possible, the height of the P2P network system is reduced, and low transmission delay is ensured. The nodes with a higher layer number in the P2P network system have enough Lazy List entries pointing to other nodes of the P2P network system, which ensures the fault tolerance of the tree. The fault tolerance of the scheme provided by the embodiment of the invention is lower than that of HyParView; it can tolerate a percentage of faulty nodes not exceeding 30%.
It should be noted that, in the embodiments of the present invention, the first and the second are only used for distinguishing different nodes, and there is no substantial order meaning.
Next, aiming at the steps provided by the embodiment of the present invention, the overall scheme of the embodiment of the present invention is analyzed:
(1) Time complexity of the query.
The query process first broadcasts the query message to the whole network through the P2P network system, then performs layer-by-layer verification and statistics through the nodes of the whole network, and finally returns the query result to the root node of the P2P network system. If the processing of the query result on each node is ignored, the whole query process amounts to two level-order traversals of the tree, so the delay of the query is related to the longest path on the P2P network system. Assume that each node is configured with the same out-degree k, that the out-degree of each layer i is k with probability p_i (0 < p_i < 1), that the number of all nodes in the tree is N, and that the height of the tree is h; then
Figure GDA0002368847490000141
Solving for the height of the tree gives h = O(log_k N). Only log_k N hops are needed to traverse all nodes of the whole network and gather the query results returned by all nodes, so the time complexity of the query is O(log_k N).
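The logarithmic bound can be checked numerically with a short sketch. It covers only the idealized case in which every node uses its full out-degree k (that is, every p_i equals 1, which is an assumption made here for simplicity), and the function name `min_tree_height` is hypothetical.

```python
def min_tree_height(n_nodes, k):
    """Smallest height h such that a full k-ary tree of height h holds n_nodes nodes.

    Idealized assumption: every node has exactly k children (all p_i = 1).
    """
    if n_nodes < 1 or k < 2:
        raise ValueError("need n_nodes >= 1 and k >= 2")
    height, capacity, level_size = 0, 1, 1
    while capacity < n_nodes:
        level_size *= k          # number of nodes on the next layer
        capacity += level_size   # total nodes in the tree so far
        height += 1
    return height
```

Under this assumption, `min_tree_height(10000, 7)` gives 5 and `min_tree_height(10000, 3)` gives 9; a real tree whose nodes keep part of their out-degree in reserve would be somewhat taller.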
(2) Load reduction rate.
The data deduplication rate is defined as the difference between the total amount of data that the root node would receive without this architecture and the total amount of data that it actually receives during the query, divided by the total amount of data that would be received without this architecture. Assuming that the number of transactions satisfying the query condition in one query is n, and that the amount of transaction data actually received by the root node of the P2P network system is m, the deduplication rate of the data is 1 - m/(12 × n).
If such a query architecture is not employed, the amount of data that the querying proxy node should receive is 12 times the number of all transactions satisfying the condition. The returned results are deduplicated by the intermediate nodes as they are passed up the spanning tree, which reduces the total amount of data processed by the root node of the P2P network system. Assuming that the out-degree of each node is k and that all data is evenly distributed over the whole network, the query data satisfying the condition should be evenly distributed among all child nodes of the root node of the P2P network system, and the amount of data finally returned should be k × n, so the deduplication rate of the query is 1 - k/12.
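The two expressions above can be written out as a small sketch. The factor 12 is taken from the text as given; the parameter name `copies` and both function names are hypothetical and are introduced here only for illustration.

```python
def dedup_rate(received, matching, copies=12):
    """Deduplication rate 1 - m / (12 * n), with m = received and n = matching."""
    return 1 - received / (copies * matching)


def expected_dedup_rate(k, copies=12):
    """Deduplication rate 1 - k / 12 when the root receives k * n records."""
    return 1 - k / copies
```

For example, `expected_dedup_rate(3)` evaluates to 0.75, and `dedup_rate(300, 100)` gives the same value because the root then receives 3 × n records.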
Taken together, the deduplication rate of the data is negatively correlated with the out-degree k of the node, and the hop count of the query, that is, the delay of the query, is also negatively correlated with k. In other words, the larger k is, the larger the amount of data received by the root node and the heavier its data processing load, but the lower the delay of the query. The choice should therefore be made according to the scenario: when the computing power of the query node is weak, a low out-degree can be adopted to reduce the load of the root node at the cost of query delay; when the computing power of the node is high and the requirement on query delay is strict, a high out-degree can be adopted to reduce the query delay at the cost of part of the computing power of the root node.
(3) Scalability analysis.
Extensibility is an important attribute of programs, systems and networks, and programs, systems and networks with extensibility can gracefully handle the problem of increased workload at minimal cost, such as by increasing processing resources. For a centralized service architecture, such as a search engine, it must be scalable to handle more users' query requests and to properly index more resources. Centralized service architectures generally employ a horizontally-expanding approach to increase the processing power of the overall system by adding servers.
The scalability of a network differs from the scalability of a centralized server. For example, the scalability of a routing protocol concerns its ability to cope with changes in network scale: assuming that the number of nodes in the network is N, a routing protocol can be considered scalable if the size of its routing table is on the order of O(log N). Early P2P applications such as Gnutella were not scalable, since other nodes could be found only by flooding broadcasts. The distributed hash table solved the problem that the routing protocol could not scale. A pure P2P system has no centralized structure, so it can be extended indefinitely without any resources other than nodes. The time complexity analysis shows that only O(log N) hops are needed for a query, so the protocol can be considered scalable at the level of the P2P network. In addition to the routing of messages, query results must also be transmitted and processed. The analysis of the load reduction rate shows that, in the scheme of the embodiment of the present invention, the deduplication rate is negatively correlated with the out-degree k of the node, that is, the load of a node grows with its out-degree. The processing load of the nodes can therefore be dynamically reduced or increased by adjusting their out-degree: nodes with higher computing power can be given a higher out-degree and nodes with lower computing power a lower out-degree, which further increases the scalability of the query system.
In addition to the aforementioned extensibility, the scheme provided by the embodiment of the present invention has strong functional extensibility, and provides a good basis for further implementing a more powerful query system in the future, such as stronger extensibility and smaller overhead.
(4) Fault tolerance analysis.
The fault tolerance function depends entirely on the selection and replacement of neighbor nodes. When a node detects that one of its neighbor nodes has left the network, it first removes that node from its neighbor nodes and then selects a new neighbor node through the neighbor node selection strategy. The selection of the new neighbor node is randomized in three respects: first, the hop count of the request message; second, the neighbor node to which the request message is forwarded; and third, the result selected from the Passive List for return. Although the specific fault tolerance ratio is difficult to estimate, randomization and timely replacement better ensure the connectivity among the nodes and avoid isolated nodes. The original HyparView paper achieves a fault tolerance of 80%, but the overhead of maintaining its structure is large, and a distributed ledger application need not sacrifice a large amount of bandwidth and computing power to guarantee such a high fault tolerance.
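The replacement step described above can be sketched as follows. This is a simplified illustration that covers only the random choice of a replacement from the Passive List; the randomized hop count and the randomized forwarding neighbor of the request message are not modeled, and the function name `replace_failed_neighbor` and its parameters are hypothetical.

```python
import random


def replace_failed_neighbor(active, passive, failed, rng=random):
    """Drop `failed` from `active` and refill the slot from a random passive entry.

    Returns the chosen replacement, or None if no candidate is available.
    """
    if failed in active:
        active.remove(failed)
    # Candidates are passive entries that are not already active neighbors.
    candidates = [p for p in passive if p not in active and p != failed]
    if not candidates:
        return None
    chosen = rng.choice(candidates)
    passive.remove(chosen)
    active.append(chosen)
    return chosen
```

Passing a seeded `random.Random` instance as `rng` makes the choice reproducible, which is convenient when simulating failures.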
To verify the effect of the embodiment of the present invention, the following example verifies the availability of the query operation on real machines and the scalability of the scheme with 10000 nodes in a simulation environment. The verified metrics include the average hop count of the query operation, the average maximum hop count, the deduplication rate of the data, and the like. The experimental results show that the scheme is scalable, with the latency of a query within O(log N) hops, and that it also has good fault tolerance: correct query results can still be returned within a small number of hops even when the node failure rate is 30%.
The independent variables of the experiment are set mainly according to the evaluation indexes of a P2P network protocol, such as the number of nodes and the failure rate of the nodes; owing to the particular design of the scheme, the out-degree of the nodes is also selected as an independent variable. Scalability in a P2P network is generally verified by increasing the number of nodes and checking that the functions remain available, so the number of nodes is used as an independent variable to verify the scalability of the query system. The fault tolerance of the query system is verified by checking the availability of the query function under different node failure rates. Taking the out-degree of the nodes as an independent variable allows the influence of the out-degree on the load and transmission delay of the nodes to be verified, so that the parameters can be better tuned to improve the scalability of the system.
The observed indexes mainly comprise three dependent variables, namely the load of the node, the query delay and the balance of the tree, which correspond respectively to the data deduplication rate, the hop count of one query, and the distribution of hop counts from the leaf nodes to the root node. The load of a node is characterized by the data deduplication rate: the larger the data deduplication rate, the smaller the load of the node, and vice versa. By observing the load of a node, its out-degree can be adjusted appropriately so as to maintain a balance between query delay and load. Because P2P networks are heterogeneous, the delay between nodes varies from one network environment to another; to characterize the delay of a query better, the hop count of the query is used herein instead of the time consumed by the query. The balance of the tree is the key to load balancing: by maintaining a relatively balanced tree, the processing and transmission of query results are loaded evenly on every node. The balance of the tree is characterized by the distribution of hop counts from the leaf nodes to the root node; if these hop counts are all concentrated around log(N), the tree is relatively balanced.
(I) Analysis of the influence of the number of nodes on performance
Fig. 6.1 shows the average maximum hop count and the average hop count of the leaf nodes in the P2P network system as the number of nodes increases, with the Active List size of each node being 7. As can be seen, the average hop count is on the order of log_7(N), and the average maximum hop count is 1 to 2 hops greater than the average hop count owing to the randomness of the network topology. This proves that the P2P network system makes maximal use of the out-degree of each node and sufficiently reduces the tree height of the P2P network system. Fig. 6.2 shows the path length from the leaf nodes to the root node in the P2P network system for different numbers of nodes. When the number of nodes is 10000, the path lengths are mainly concentrated at 5 hops and 6 hops. This proves that the tree is relatively balanced, and the balance of the tree directly determines whether the returned result information is loaded evenly when query results are aggregated. In conclusion, it is verified that the maintenance scheme of the P2P network system makes full use of the out-degree of each node and indeed yields a balanced tree with a small maximum hop count, thereby ensuring a low query delay and an even load for receiving query results. Fig. 6.3 shows the data deduplication rate of the root node on the P2P network system with the Active List size of each node being 7; the data deduplication rate does not change significantly as the number of nodes increases, and fluctuates around 51%. This indirectly proves that the data deduplication rate is independent of the number of nodes and is related only to the node out-degree in the P2P network system.
The experiment verifies that, as the number of nodes increases, the query delay (measured by hop count in the experiment) grows only logarithmically, so the low-delay characteristic of the query function is preserved as the number of nodes increases. The hop count distribution from the leaf nodes to the root node verifies that the P2P network system is balanced and that the receiving of query requests and the processing of query results are load balanced, so the query function remains available as the number of nodes increases, which verifies the scalability of the scheme.
(II) Analysis of the influence of the failure rate on performance
In order to increase the fault tolerance of the whole system, a Passive List and a Lazy List are introduced in the scheme design and are used to replace entries of the Active List and the Eager List respectively, so that the structure of the P2P network system is optimized and its fault tolerance is increased. The results shown below are the average maximum hop count, the average hop count, and the distribution of the hop counts of all leaf nodes for 10000 nodes, an out-degree k of 7, and failure rates from 0% to 30%.
The test initiates 50 query messages and randomly makes a portion of the nodes exit at the start of each message broadcast, so the statistics are the average maximum hop count and the average hop count during the repair process of the P2P network system. The nodes corresponding to the failure rate are not all made to exit at once when the first request broadcast is initiated, because in that case the tree structure could become completely stable after the first few repairs, which would make the final experimental result too ideal and unrealistic. The evaluation therefore uses hop counts measured during the repair process rather than hop counts measured after stabilization, so as to approximate a real network environment as closely as possible.
As shown in fig. 6.4, as the failure rate of the nodes increases, both the average maximum hop count and the average hop count increase during the repair process. The slow increase of the average hop count proves that, although nodes fail during transmission, the repair process still keeps the P2P network system stable. The increase of the maximum hop count shows that some of the nodes disconnected from other nodes are grafted onto other subtrees, which raises the average maximum hop count. As shown in fig. 6.5, during the fault repair process some leaf nodes that needed only 5 hops to reach the root node are disconnected from their parent nodes and are finally grafted onto other 5-hop leaf nodes, becoming 6-hop nodes and, in a very small number of cases, 7-hop nodes. During the repair process the hop counts of the leaf nodes remain stable, concentrated mainly at 5, 6 and 7 hops, and the query delay does not increase sharply. The fault tolerance of the scheme is thus verified: even during the repair process at a failure rate of 30%, the scheme retains a good query delay and a relatively balanced tree structure. As shown in fig. 6.6, the deduplication rate of the data received by the root node does not change with the failure rate and still fluctuates around 51%. This shows that, when the data is evenly distributed over the network, the data deduplication rate is independent of the failure rate of the nodes. The data deduplication rate reflects the amount of data to be processed by the root node and determines the query load of the root node, and it needs to be adjusted dynamically according to the computing capacity of the root node.
In this experiment, the node failure rate in the distributed ledger network is taken as the variable, and the hop count from the leaf nodes to the root node, the distribution of the hop counts and other information are measured. It is verified that the query message can be broadcast to the whole network within a small number of hops (5.87 hops) even at a failure rate of 30%, that the hop count of the query does not increase significantly, and that the distribution of the hop counts remains balanced. The availability of the query function is therefore still ensured in the presence of node failures, and the P2P network system, which is the core algorithm of the query function, changes only slightly, which verifies the fault tolerance of the scheme.
(III) Analysis of the influence of node out-degree on performance
In this experiment, the node out-degree is the only independent variable, and its influence on the maximum hop count of the P2P network system and on the data deduplication rate is described. The experiment shows the influence of the node out-degree on the evaluation indexes of the query system with 10000 nodes, a failure rate of 0%, and node out-degrees from 3 to 10.
As shown in fig. 6.7, the larger the total node out-degree, the smaller the height of the P2P network system. Because some of the neighbors of a node need to be kept in the Lazy List for recovery from failures, which increases fault tolerance, the total out-degree is not equal to the number of child nodes of each node in the P2P network system, so the tree height of the P2P network system is slightly higher than log_k N. The average hop count is nevertheless within the range of log_k N, which ensures that the tree-like structure is relatively balanced.
When the total out-degree of each node is 10, only 6 hops are needed to broadcast to the whole network, and likewise to collect the query results from the whole network. In current network environments, it is very easy for each node to maintain long-lived connections with 10 nodes. Assume that the average delay between two nodes is 100 ms, that the TPS of each node averages 60 t/s, and that the number of nodes is 10000. Retrieving transaction data from a chain with 10000 nodes and a total of 600,000 TPS then takes only 1.2 s (6 hops for the broadcast plus 6 hops for the return, at 100 ms per hop), which is completely within the range acceptable to users.
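The estimate above is simple arithmetic and can be reproduced with a short sketch, assuming that the query travels the same number of hops outward and back and that every hop costs the same average delay. The function names are hypothetical.

```python
def round_trip_latency_s(hops, per_hop_delay_ms):
    """Broadcast plus result collection: two traversals of `hops` hops each."""
    return 2 * hops * per_hop_delay_ms / 1000


def aggregate_tps(n_nodes, tps_per_node):
    """Total transaction rate across all nodes."""
    return n_nodes * tps_per_node
```

With the figures from the text, `round_trip_latency_s(6, 100)` evaluates to 1.2 seconds and `aggregate_tps(10000, 60)` to 600000.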
As shown in fig. 6.8, the data deduplication rate decreases continuously as the total node out-degree increases; combined with all the above experimental results, this shows that the data deduplication rate is related only to the node out-degree. The larger the out-degree of the node, the smaller the data deduplication rate, but the smaller the average maximum hop count and the lower the query delay. The smaller the out-degree of the node, the larger the data deduplication rate, but the larger the average maximum hop count and the higher the query delay.
When the out-degree of the node is 3, the deduplication rate is as high as 75.19%, and the computing load of the root node is greatly reduced, but the average maximum hop count for transmission correspondingly rises to 12. A trade-off therefore needs to be made between the two. When the computing power of the node is poor, a small out-degree can be selected, so that the processing load of the node is greatly reduced although the delay of the query process increases. When the requirement on query delay is strict and the computing power of the nodes is high, the out-degree of the nodes can be increased to reduce the query delay further. The out-degree of a node therefore needs to be determined comprehensively by balancing the computing power of the node, the delay requirement of the query, the bandwidth of the network environment and other factors.
The larger the out-degree of the root node, the more child nodes it has, and each child node returns a query result. The query results are evenly distributed over the nodes of the whole network, so each child node receives approximately the same amount of data and passes its query result to the root node. The deduplication rate of the root node should therefore be related only to the out-degree of the root node. Fig. 6.9 shows the relationship between the data deduplication rate and the total out-degree of the other nodes in an experimental environment with 10000 nodes on the network, a total out-degree of 7 for all nodes except the root node, an out-degree of the root node held constant at 3, and a node failure rate of 0%. Although the out-degree of the nodes other than the root node is varied, the data deduplication rate always fluctuates around 75.20%, with a variation of only 0.01%. The data deduplication rate of the root node is thus unrelated to the out-degree of the other nodes and is related only to the out-degree of the root node. The processing load can therefore be increased or decreased by setting the total out-degree of a node according to its computing power.
The above experimental results show that, when the out-degree of the root node is not fixed and the out-degrees of all nodes are the same, the average hop count of the P2P network system decreases as the out-degree of the nodes increases. However, fixing the out-degree of the root node according to its computing power may affect the structure of the P2P network system, so it is necessary to verify the maximum hop count and the average hop count of the P2P network system when the out-degree of the other nodes changes while that of the root node is fixed. Fig. 6.10 shows the relationship between the average hop count and the total node out-degree of the P2P network system under these conditions.
It can be seen that, when the out-degree of the root node is fixed, the average maximum hop count and the average hop count of the P2P network system differ only slightly from those obtained when the out-degree of the root node is not fixed. Determining the out-degree of the root node according to its computing power therefore does not greatly affect the characteristics of the P2P network system. In this experiment, the node out-degree is taken as the variable and the hop count from the root node to the leaf nodes as the index of query delay, and it is verified that the larger the node out-degree, the smaller the height of the P2P network system and the lower the query delay. With the data deduplication rate taken as the index of the root node load, it is also verified that the larger the out-degree of a node, the larger its load. By varying the out-degree of the other nodes while fixing the out-degree of the root node, it is verified that the load of a node is related to its own out-degree and not to the out-degrees of other nodes, so the load of a node can be reduced by reducing its out-degree. To determine whether reducing the load of a node affects the query delay, the experiment represented in fig. 6.10 was performed, from which it can be concluded that changing the out-degree of the root node does not significantly affect the delay of the query. The load of nodes with lower computing power and bandwidth can therefore be reduced by lowering their out-degree without affecting the query delay. Dynamically adjusting the out-degree of a node according to its computing power further improves the scalability of the embodiment of the present invention, and this scalability is thereby further verified.
Fig. 6.1 to 6.10 evaluate the scalability, fault tolerance, load balancing and other properties of the query system, and verify that the scheme is scalable, fault tolerant and load balanced. The query API is tested and demonstrated below; for convenience of exposition, only one API is tested, with a multi-condition query request that includes both conditional queries and range queries. The implementation environment is 10 Alibaba Cloud machines, the delay between nodes is 20 ms to 80 ms, the maximum out-degree of each node is 7, the optimization hop count threshold is set to 1, and no node fails.
Fig. 7.1 shows a description of the fields of the query result. transactioniD is the unique identifier of the transaction, from is the initiator of the transaction, to is the receiver of the transaction, and timeStamp is the timestamp at which the transaction was initiated. data and signature are binary representations of the data content and the data signature respectively; when the interface returns data, the content shown in the figure is produced by Base64 encoding. type is the type of the transaction, which includes uploading a contract, starting a contract, calling a contract, closing a contract and the like.
When a transaction on the distributed ledger is disputed, the ledger should provide tracing and auditing functions. For example, if user A and user B disagree about the transactions that occurred in a certain time period, those transactions need to be retrieved precisely. The tested query condition is all transactions from 2019-05-12 20:05:31 to 2019-05-12 20:22:21 whose initiator account address is 7f18ee43-f142-439c-527a-442171780b38 and whose receiver account address is 79a3bd9f-81bc-4403-4271-3d33e98b8ec0. As shown in fig. 7.2, the latency of the query is 66 ms, and only 1 transaction satisfies the query condition.
The above example respectively takes the number of nodes, the failure rate and the node out degree as variables, verifies the expandability, the fault tolerance and the load balance of the scheme, and finally verifies that the scheme is expandable, high in fault tolerance and balanced in load.
In the experimental environment with 10000 nodes and a TPS of 60,000, when the out-degree of each node is 7, the query condition can be broadcast to all nodes in only 5.5 hops on average, and the query results returned by the nodes of the whole network can be received. The amount of query result data received by the root node is only 49% of the original data amount. When the out-degree of the root node is 3, the received data is only 25% of the original data amount; by balancing the load of processing the whole query result across all nodes of the network, the processing load of the root node is greatly reduced. In the experiment analyzing the influence of the failure rate on performance, with a node failure rate of 30% and during the repair of the P2P network system, all nodes in the whole network are still guaranteed to receive the query message within 5.87 hops on average and to return a correct query result. The experiment analyzing the influence of node out-degree on performance demonstrates the influence of the out-degree on the data deduplication rate, and shows that changing the out-degree of a single node does not greatly affect the evaluation indexes of the scheme, so the total out-degree of a node can be specified according to comprehensive indexes such as its bandwidth and computing capacity. This has no great influence on query delay while reducing the load of the query process. Dynamically adjusting the out-degree of the nodes according to their computing capacity therefore further improves the scalability of the system, and the scalability of the scheme is thereby further verified.
The subsequent evaluation of the API of the query system shows that the delay of one query is 66 ms and that conditional queries, range queries and multi-condition queries are supported, which overcomes the limited query functionality of the distributed hash table. Moreover, the example completes the core work of the query process, namely the maintenance of the P2P network system and the transmission of query results, and provides an extensible framework for subsequently implementing more types of queries.
It should be noted that, in various embodiments of the present invention, the node may be a service gateway, a server accessing a network, a computer, or other terminals.
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or combinations of acts, but those skilled in the art will recognize that the present invention is not limited by the described order of acts, as some steps may be performed in other orders or concurrently in accordance with the embodiments of the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are preferred embodiments and that the acts involved are not necessarily required by the present invention.
Referring to fig. 8, a schematic structural diagram of a scalable high-performance distributed query processing apparatus according to an embodiment of the present invention is shown, where the apparatus is applied to a P2P network system, the P2P network system includes a plurality of nodes, and each node includes an Active List and a Passive List, where the Active List is divided into an active list, the Eager List, and an inactive list, the Lazy List; the number of nodes in the Active List is a fixed value, and the Eager List stores the nodes that have established TCP connections with the node on the P2P network system and is used for transmitting messages; the Lazy List stores the remaining nodes of the Active List other than those in the Eager List, is used for transmitting the digest of a message or the ID of a message, and is used for the optimization and fault tolerance of the P2P network system; the Passive List stores random nodes used to replace nodes disconnected from the Active List and to ensure the connection between the node and the network in the P2P network system; corresponding to the method of fig. 1 according to the embodiment of the present invention, the apparatus 800 includes the following modules:
a query request obtaining module 801 configured in a first node, configured to obtain, in the P2P network system, a query request broadcasted by a parent node thereof, where the first node is any node in the P2P network system;
a query request broadcasting module 802 configured in the first node, and broadcasting the query request to its own child node through the tree maintenance program; the child nodes are used for broadcasting the query request to the child nodes corresponding to the child nodes by using the tree structure of the P2P network system, and the child nodes corresponding to the child nodes repeat the broadcasting steps until the query request is broadcast to all nodes on the P2P network system; after receiving the query request, each node retrieves a local database, waits for the result return of the child node, performs settlement and deduplication operations after collecting the data returned by all the child nodes, and returns the result to the parent node; after layer-by-layer feedback, when the root node receiving the user query request receives the returned results of all child nodes, final settlement and duplicate removal operation is carried out to generate a final query result, and the final query result is returned to the user;
the tree maintenance program comprises an expandable maintenance program and a fault tolerance maintenance program;
for the expandable maintenance program, corresponding to the method of fig. 3 in the embodiment of the present invention, the apparatus 800 includes the following modules:
an IHAVE message sending module 803, configured in the first node, for sending an IHAVE message to a second node in the child nodes when the query request is broadcasted to the child nodes, where the IHAVE message includes a message ID;
a NORMAL message checking module 804, configured in the second node, for checking whether it has received a NORMAL message corresponding to the message ID for delivering the query request;
a GRAFT message generating module 805 configured in the second node, configured to generate a GRAFT message for repairing the P2P network system when a NORMAL message corresponding to the message ID is not received within a timeout period; the GRAFT message includes the message ID and a request to receive the IHAVE message;
a GRAFT message sending module 806, configured in the second node, configured to send the GRAFT message to the first node and move the first node from its Lazy List to Eager List when the NORMAL message corresponding to the message ID is not received within a timeout period, so that the first node repairs the P2P network system;
a hop count difference calculation module 807 configured in the second node, configured to calculate a difference between a received hop count of the IHAVE message and a received hop count of the NORMAL message when the NORMAL message corresponding to the message ID has been received within a timeout period;
a hop count difference determining module 808, configured in the second node, configured to determine whether the hop count difference exceeds a hop count threshold when the NORMAL message corresponding to the message ID has been received within a timeout period;
a first system repair module 809 configured in the second node, configured to repair the P2P network system when the hop count difference exceeds a hop count threshold;
for the fault tolerance maintenance program, corresponding to the method of fig. 5 in the embodiment of the present invention, the apparatus 800 includes:
a second node removing module 810 configured in the first node, for removing the second node from its Eager List when a connection between the first node and the second node constituting an edge of the P2P network system is disconnected;
a query request initiating module 811, configured in the first node, for sequentially initiating a query request to each first target node in its Passive List; the query request comprises an instruction for checking whether the first target node is online and an instruction for querying the size of the Lazy List of the first target node;
a query result receiving module 812, configured in the first node, for receiving the query result returned by each first target node for the query request, and selecting, according to the delay in the query result and the size of the Lazy List of each first target node, a second target node with the smallest Lazy List and the lowest delay from the first target nodes;
a second system repair module 813 configured in the first node, configured to add the second target node into its Lazy List, and repair the P2P network system by using the node in the Lazy List as a replacement edge.
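The selection performed by modules 811 to 813 can be sketched as follows. This is a minimal illustration, not the patented implementation: the `ProbeResult` type and the `select_replacement` function are hypothetical names, and the text does not say how "smallest Lazy List" and "lowest delay" are combined, so the sketch assumes the Lazy List size is compared first and the delay breaks ties.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ProbeResult:
    """Hypothetical reply of one Passive List candidate to the probe."""
    node_id: str
    online: bool
    delay_ms: float
    lazy_list_size: int


def select_replacement(results: Iterable[ProbeResult]) -> Optional[ProbeResult]:
    """Return the online candidate with the smallest Lazy List.

    Assumption: ties on Lazy List size are broken by the lowest delay.
    Returns None when no candidate is online.
    """
    online = [r for r in results if r.online]
    if not online:
        return None
    return min(online, key=lambda r: (r.lazy_list_size, r.delay_ms))
```

The chosen node would then be added to the first node's Lazy List and used as a replacement edge, as described for module 813.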
In a preferred embodiment of the present invention, the apparatus further comprises:
a hop count TTL generation module, configured in the first node, for generating a fixed hop count TTL when the number of online nodes in the Passive List of the first node is smaller than a preset threshold;
a TTL message sending module, configured in the first node, for sending a message carrying the TTL to a random third target node in its Eager List; after receiving the message, the third target node sends the node IDs in its latest Passive List to the first node, subtracts 1 from the TTL, and then forwards the message to a random node in its own Eager List, and these steps are repeated until the TTL is 0;
a node selection module, configured in the first node, for collecting all received node IDs, randomly selecting M nodes from all received nodes, and adding the M nodes into the Passive List; wherein M is the maximum number of nodes that the Passive List can store minus the number of nodes currently stored in the Passive List.
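The Passive List refill described above can be sketched as a TTL-bounded random walk. This is a simplified, single-process illustration under stated assumptions: the `Node` class, `walk_and_collect`, and `refill_passive` are hypothetical names, network messages are replaced by direct object access, and the sketch additionally filters out the node's own ID and IDs it already stores.

```python
import random
from dataclasses import dataclass, field
from typing import List, Set


@dataclass(eq=False, repr=False)
class Node:
    """Hypothetical in-memory stand-in for a P2P node."""
    node_id: str
    passive_capacity: int
    eager: List["Node"] = field(default_factory=list)
    passive: Set[str] = field(default_factory=set)


def walk_and_collect(start: Node, ttl: int, rng: random.Random) -> Set[str]:
    """Follow random Eager List links for `ttl` hops, gathering Passive List IDs."""
    collected: Set[str] = set()
    current = start
    while ttl > 0 and current.eager:
        current = rng.choice(current.eager)   # forward to a random eager peer
        collected |= current.passive          # peer reports its Passive List
        ttl -= 1                              # peer decrements the TTL
    return collected


def refill_passive(node: Node, ttl: int, rng: random.Random) -> List[str]:
    """Add up to M = capacity - current size randomly chosen new IDs."""
    candidates = walk_and_collect(node, ttl, rng) - node.passive - {node.node_id}
    m = max(0, node.passive_capacity - len(node.passive))
    chosen = rng.sample(sorted(candidates), min(m, len(candidates)))
    node.passive.update(chosen)
    return chosen
```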
In a preferred embodiment of the present invention, the apparatus further comprises:
a node changing module, configured in the first node, for adding the second node into the Passive List of the first node after removing the second node from the Eager List of the first node.
Since the device embodiment is basically similar to the method embodiment, its description is relatively brief; for relevant details, refer to the corresponding parts of the description of the method embodiment.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
The extensible high-performance distributed query processing method and the extensible high-performance distributed query processing apparatus provided by the present invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the present invention, and the description of the above embodiments is only intended to help in understanding the method of the present invention and its core idea. Meanwhile, a person skilled in the art may, following the idea of the present invention, make changes to the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as a limitation of the present invention.

Claims (10)

1. An extensible high-performance distributed query processing method, wherein the method is applied to a peer-to-peer (P2P) network system; the P2P network system comprises a plurality of nodes, each node comprises an Active List and a Passive List, and the Active List is divided into an active list, the Eager List, and an inactive list, the Lazy List; the number of nodes in the Active List is a fixed value; the Eager List stores the nodes with which the node has established TCP connections on the P2P network system, and is used for transmitting messages; the Lazy List stores the remaining nodes of the Active List other than those in the Eager List, is used for transmitting the digest of a message or the ID of a message, and serves the optimization and fault tolerance of the P2P network system; the Passive List stores random nodes, which are used to replace nodes disconnected from the Active List and to keep the node connected to the network in the P2P network system; the method comprises the following steps:
in the P2P network system, a first node obtains a query request broadcast by a parent node thereof, wherein the first node is any node in the P2P network system;
the first node broadcasts the query request to own child nodes through a tree maintenance program; the child nodes are used for broadcasting the query request to the child nodes corresponding to the child nodes by using the tree structure of the P2P network system, and the child nodes corresponding to the child nodes repeat the broadcasting steps until the query request is broadcast to all nodes on the P2P network system; after receiving the query request, each node retrieves a local database, waits for the result return of the child node, performs settlement and deduplication operations after collecting the data returned by all the child nodes, and returns the result to the parent node; after layer-by-layer feedback, when the root node receiving the user query request receives the returned results of all child nodes, final settlement and duplicate removal operation is carried out to generate a final query result, and the final query result is returned to the user;
the tree maintenance program comprises an expandable maintenance program and a fault tolerance maintenance program;
for the extensible maintenance program, the method includes:
when the first node broadcasts the query request to the child nodes of the first node, the first node sends an IHAVE message to a second node in the child nodes of the first node, wherein the IHAVE message comprises a message ID;
the second node checks whether it has received a NORMAL message corresponding to the message ID for delivering the query request;
if the second node does not receive the NORMAL message corresponding to the message ID within the timeout period, executing the following steps:
the second node generates a GRAFT message for repairing the P2P network system; the GRAFT message includes the message ID and a request to receive the IHAVE message;
the second node sends the GRAFT message to the first node, and moves the first node from a Lazy List of the second node to an Eager List, so that the first node repairs the P2P network system;
if the second node has received a NORMAL message corresponding to the message ID within a timeout period, performing the following steps:
the second node calculates the difference between the receiving hop count of the IHAVE message and the receiving hop count of the NORMAL message;
the second node judges whether the hop count difference exceeds a hop count threshold value;
if the hop count difference exceeds a hop count threshold, the second node repairs the P2P network system;
for the fault tolerance maintenance program, the method comprises:
when the connection between a first node and a second node constituting an edge of the P2P network system is disconnected, the first node removes the second node from its Eager List;
the first node sequentially initiates a query request to a first target node in a Passive List of the first node; the query request comprises an instruction for checking whether the first target node is online or not and an instruction for querying the size of the Lazy List of the first target node;
the first node receives the query result returned by each first target node for the query request, and selects, according to the delay in the query result and the size of the Lazy List of each first target node, a second target node with the smallest Lazy List and the lowest delay from the first target nodes;
and the first node adds a second target node into the Lazy List of the first node, and repairs the P2P network system by using the node in the Lazy List as a substitute edge.
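The layer-by-layer query and result collection recited in claim 1 can be sketched as a recursive traversal of the broadcast tree. This is a minimal single-process sketch under stated assumptions: `query_tree`, `children`, `local_db`, and `predicate` are hypothetical names, recursion stands in for network messages, and deduplication is modelled with a set union; the settlement step is not specified in this section and is not modelled.

```python
from typing import Callable, Dict, List, Set


def query_tree(
    node: str,
    children: Dict[str, List[str]],
    local_db: Dict[str, List[str]],
    predicate: Callable[[str], bool],
) -> Set[str]:
    """Return the deduplicated matches of `node` and its whole subtree.

    Each node searches its local database, waits for every child's
    result, merges them without duplicates, and hands the merged set
    to its parent (here: the caller).
    """
    results = {record for record in local_db.get(node, []) if predicate(record)}
    for child in children.get(node, []):
        results |= query_tree(child, children, local_db, predicate)
    return results
```

Calling `query_tree` on the root corresponds to the root node producing the final query result for the user.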
2. The method of claim 1, wherein the P2P network system includes three protocols, BroadcastTree, MsgTransferProt and PartialView; the BroadcastTree is responsible for maintenance work of the P2P network system; the MsgTransferProt is responsible for broadcasting the query message and verifying and transmitting the query result; the PartialView is responsible for managing the neighbor nodes of each node, the neighbor nodes comprising parent nodes and child nodes; wherein the Active List and the Passive List are located in the PartialView protocol of the P2P network system;
each node comprises a first Map cache, a second Map cache and a third Map cache, wherein the first Map cache is the ReceivedMsgMap, which stores the mapping between a message ID and the message and is used for caching the currently received messages so as to respond to requests for a message from other nodes that have not received it;
the second Map cache is the NotReceivedMsgMap, which caches the mapping between a message ID and the node sending the message; when the specified duration is reached and the message sent by a node in the Eager List has still not been received, a Timer is triggered, the Timer being used for requesting the message from the node that sent it and for repairing the P2P network system;
the third Map cache is the TimingCacheMsgMap, which is responsible for caching the currently received messages; if a message sent by a node in the Lazy List is received within a specified time range, the hop counts of the messages are compared to determine whether to optimize the P2P network system.
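How the three caches could cooperate with the IHAVE, NORMAL and GRAFT messages of claim 1 is sketched below. This is an illustrative model only: the `TreeNode` class, its method names, and the string return values are hypothetical, timers are replaced by an explicit `on_timeout` call, and the sign of the hop count difference is an assumption (the sketch treats the tree as worth optimizing when the lazy path is shorter than the eager path by more than the threshold).

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple


@dataclass
class TreeNode:
    """Hypothetical per-node state for lazy-push tree maintenance."""
    hop_threshold: int = 2
    eager: Set[str] = field(default_factory=set)
    lazy: Set[str] = field(default_factory=set)
    received_msg_map: Dict[str, bytes] = field(default_factory=dict)
    not_received_msg_map: Dict[str, str] = field(default_factory=dict)
    timing_cache_msg_map: Dict[str, int] = field(default_factory=dict)

    def on_normal(self, msg_id: str, payload: bytes, hops: int) -> None:
        """Store a full message; duplicates are ignored."""
        if msg_id in self.received_msg_map:
            return
        self.received_msg_map[msg_id] = payload
        self.timing_cache_msg_map[msg_id] = hops
        self.not_received_msg_map.pop(msg_id, None)  # cancels the pending timer

    def on_ihave(self, msg_id: str, hops: int, sender: str) -> Optional[str]:
        """Handle an announcement that `sender` holds message `msg_id`."""
        if msg_id not in self.received_msg_map:
            self.not_received_msg_map[msg_id] = sender  # a timer would start here
            return "WAIT"
        normal_hops = self.timing_cache_msg_map.get(msg_id)
        if normal_hops is not None and normal_hops - hops > self.hop_threshold:
            return "OPTIMIZE"  # assumption: the lazy path is notably shorter
        return None

    def on_timeout(self, msg_id: str) -> Optional[Tuple[str, str, str]]:
        """On timer expiry, promote the announcer and build a GRAFT request."""
        sender = self.not_received_msg_map.pop(msg_id, None)
        if sender is None:
            return None
        self.lazy.discard(sender)
        self.eager.add(sender)
        return ("GRAFT", msg_id, sender)
```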
3. The method of claim 2, further comprising:
when the second node receives a message, checking through the ReceivedMsgMap whether the IHAVE message or the NORMAL message has already been received;
discarding the IHAVE message or the NORMAL message if it has already been received;
if the IHAVE message or the NORMAL message has not been received, checking whether the ID of the IHAVE message or the NORMAL message has been received;
if not, discarding the message; otherwise, adding the ID of the IHAVE message or the ID of the NORMAL message into the NotReceivedMsgMap, and setting a timeout event for the IHAVE message or the NORMAL message.
4. The method of claim 1, wherein for the fault tolerance maintenance program, the method further comprises:
when the number of online nodes in a Passive List in the first node is smaller than a preset threshold, the first node generates a fixed hop count TTL;
the first node sends a message carrying the TTL to a random third target node in its Eager List; after receiving the message, the third target node sends the node IDs in its latest Passive List to the first node, subtracts 1 from the TTL, and then forwards the message to a random node in its own Eager List, and these steps are repeated until the TTL is 0;
the first node collects all received node IDs, randomly selects M nodes from all received nodes, and adds the M nodes into its Passive List; wherein M is the maximum number of nodes that the Passive List can store minus the number of nodes currently stored in the Passive List.
5. The method according to claim 1, wherein before the first node joins the P2P network system, the method comprises:
the first node acquires partial topology information of the P2P network system, and initializes its Eager List, Lazy List and Passive List by applying the topology information;
after the initialization of the Eager List, Lazy List and Passive List is completed, the first node establishes a TCP long connection with each node in its Eager List, so as to form the edges of the P2P network system;
the first node repairs the P2P network system by using the nodes in the Lazy List as replacement edges, retaining only the one edge with relatively fast transmission speed and the fewest hops, and the remaining nodes are finally moved to the Lazy List.
6. The method of claim 5, wherein the topology information includes a randomly assigned node ID (NodeID); the step in which the first node acquires partial topology information of the P2P network system and initializes its Eager List, Lazy List and Passive List by applying the topology information comprises the following steps:
the first node initiates a request to a network of a P2P network system by utilizing the NodeID and KAD algorithm, and searches a neighbor node nearest to the NodeID;
the first node selects partial nodes from the neighbor nodes to initialize Eager List, Lazy List and Passive List of the first node.
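The notion of the neighbor "nearest to the NodeID" in claim 6 can be illustrated with the XOR distance metric that Kademlia (KAD) uses. This is only a local sketch: `xor_distance` and `closest_nodes` are hypothetical names, node IDs are modelled as plain integers, and a real KAD lookup queries remote nodes iteratively rather than sorting a locally known set.

```python
from typing import Iterable, List


def xor_distance(a: int, b: int) -> int:
    """Kademlia distance between two node IDs: their bitwise XOR."""
    return a ^ b


def closest_nodes(target: int, known_ids: Iterable[int], k: int) -> List[int]:
    """Return the k known IDs with the smallest XOR distance to `target`."""
    return sorted(known_ids, key=lambda node_id: xor_distance(node_id, target))[:k]
```

For example, with target `0b1010`, the IDs `0b1011` and `0b1000` are at distances 1 and 2 and are therefore closer than `0b0010`, which is at distance 8.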
7. The method of claim 6, wherein the step of the first node selecting partial nodes from the neighbor nodes to initialize Eager List, Lazy List and Passive List further comprises:
the first node selects, according to its initial degree m, the m nodes with the smallest Eager List out-degree from the Eager Lists of the neighbor nodes and adds them to its own Eager List; randomly selects, according to the number n of nodes in its Passive List, n nodes from the Passive Lists of the neighbor nodes and adds them to its own Passive List; and at the same time initializes its Lazy List to be empty, thereby completing the initialization of the three lists of the first node.
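The list initialization of claim 7 can be sketched as follows. The data layout is an assumption made for illustration: each neighbor is represented by a hypothetical `Neighbor` record whose `eager` field maps a node ID to that node's Eager List out-degree, and `init_lists` is a hypothetical helper. Ties on out-degree are broken by node ID so the result is deterministic, which the claim does not require.

```python
import random
from dataclasses import dataclass, field
from typing import Dict, List, Sequence, Tuple


@dataclass
class Neighbor:
    """Hypothetical view of one neighbor found during the join lookup."""
    node_id: str
    eager: Dict[str, int] = field(default_factory=dict)  # node ID -> out-degree
    passive: List[str] = field(default_factory=list)


def init_lists(
    self_id: str,
    neighbors: Sequence[Neighbor],
    m: int,
    n: int,
    rng: random.Random,
) -> Tuple[List[str], List[str], List[str]]:
    """Return (eager, lazy, passive) for a joining node."""
    degree: Dict[str, int] = {}
    for nb in neighbors:
        degree.update(nb.eager)
    degree.pop(self_id, None)
    # m nodes with the smallest Eager List out-degree
    eager = sorted(degree, key=lambda nid: (degree[nid], nid))[:m]
    # n random nodes from the neighbors' Passive Lists
    pool = sorted({nid for nb in neighbors for nid in nb.passive} - {self_id} - set(eager))
    passive = rng.sample(pool, min(n, len(pool)))
    return eager, [], passive  # the Lazy List starts empty
```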
8. An extensible high-performance distributed query processing apparatus, wherein the apparatus is applied to a P2P network system; the P2P network system comprises a plurality of nodes, each node comprises an Active List and a Passive List, and the Active List is divided into an active list, the Eager List, and an inactive list, the Lazy List; the number of nodes in the Active List is a fixed value; the Eager List stores the nodes with which the node has established TCP connections on the P2P network system, and is used for transmitting messages; the Lazy List stores the remaining nodes of the Active List other than those in the Eager List, is used for transmitting the digest of a message or the ID of a message, and serves the optimization and fault tolerance of the P2P network system; the Passive List stores random nodes, which are used to replace nodes disconnected from the Active List and to keep the node connected to the network in the P2P network system; the apparatus comprises:
a query request obtaining module configured in a first node, configured to obtain, in the P2P network system, a query request broadcasted by a parent node thereof, where the first node is any node in the P2P network system;
a query request broadcasting module, configured in the first node, for broadcasting the query request to its own child nodes through the tree maintenance program; the child nodes are used for broadcasting the query request to the child nodes corresponding to the child nodes by using the tree structure of the P2P network system, and the child nodes corresponding to the child nodes repeat the broadcasting steps until the query request is broadcast to all nodes on the P2P network system; after each node receives the query request, retrieving a local database, waiting for the return of the results of the child nodes, performing settlement and deduplication operations after the data returned by all the child nodes are collected, and returning the results to the parent node; after layer-by-layer feedback, when the root node receiving the user query request receives the returned results of all child nodes, final settlement and duplicate removal operation is carried out to generate a final query result, and the final query result is returned to the user;
the tree maintenance program comprises an expandable maintenance program and a fault tolerance maintenance program;
for the extensible maintenance program, the apparatus comprises:
an IHAVE message sending module, configured in the first node, for sending an IHAVE message to a second node in the child nodes when the query request is broadcast to the child nodes, where the IHAVE message includes a message ID;
a NORMAL message checking module, configured in the second node, for checking whether it has received a NORMAL message corresponding to the message ID for delivering the query request;
a GRAFT message generating module configured in the second node, configured to generate a GRAFT message for repairing the P2P network system when a NORMAL message corresponding to the message ID is not received within a timeout period; the GRAFT message includes the message ID and a request to receive the IHAVE message;
a GRAFT message sending module configured in the second node, configured to send the GRAFT message to the first node when a NORMAL message corresponding to the message ID is not received within a timeout period, and move the first node from its Lazy List to Eager List, so that the first node repairs the P2P network system;
a hop count difference calculation module configured in the second node, configured to calculate a difference between a received hop count of the IHAVE message and a received hop count of the NORMAL message when the NORMAL message corresponding to the message ID has been received within a timeout period;
a hop count difference determining module, configured in the second node, configured to determine, when a NORMAL message corresponding to the message ID has been received within a timeout period, whether the hop count difference exceeds a hop count threshold;
a first system repair module configured in the second node, configured to repair the P2P network system when the hop count difference exceeds a hop count threshold;
for the fault tolerance maintenance program, the apparatus comprises:
a second node removing module configured in the first node, for removing the second node from its Eager List when a connection between the first node and the second node constituting an edge of the P2P network system is disconnected;
a query request initiating module, configured in the first node, for sequentially initiating a query request to each first target node in its Passive List; the query request comprises an instruction for checking whether the first target node is online and an instruction for querying the size of the Lazy List of the first target node;
a query result receiving module, configured in the first node, for receiving the query result returned by each first target node for the query request, and selecting, according to the delay in the query result and the size of the Lazy List of each first target node, a second target node with the smallest Lazy List and the lowest delay from the first target nodes;
a second system repair module, configured in the first node, for adding the second target node into the Lazy List of the first node and repairing the P2P network system by using the node in the Lazy List as a replacement edge.
9. The apparatus of claim 8, further comprising:
a hop TTL generation module, configured in a first node, for generating a fixed hop TTL when the number of online nodes in a Passive List in the first node is less than a preset threshold;
a TTL message sending module, configured in the first node, for sending a message carrying the TTL to a random third target node in its Eager List; after receiving the message, the third target node sends the node IDs in its latest Passive List to the first node, subtracts 1 from the TTL, and then forwards the message to a random node in its own Eager List, and these steps are repeated until the TTL is 0;
a node selection module, configured in the first node, for collecting all received node IDs, randomly selecting M nodes from all received nodes, and adding the M nodes into the Passive List; wherein M is the maximum number of nodes that the Passive List can store minus the number of nodes currently stored in the Passive List.
10. The apparatus of claim 8, further comprising:
a node changing module, configured in the first node, for adding the second node into the Passive List of the first node after removing the second node from the Eager List of the first node.
CN201911032931.9A 2019-10-28 2019-10-28 Extensible high-performance distributed query processing method and device Active CN111046065B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911032931.9A CN111046065B (en) 2019-10-28 2019-10-28 Extensible high-performance distributed query processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911032931.9A CN111046065B (en) 2019-10-28 2019-10-28 Extensible high-performance distributed query processing method and device

Publications (2)

Publication Number Publication Date
CN111046065A CN111046065A (en) 2020-04-21
CN111046065B true CN111046065B (en) 2022-06-17

Family

ID=70232895

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911032931.9A Active CN111046065B (en) 2019-10-28 2019-10-28 Extensible high-performance distributed query processing method and device

Country Status (1)

Country Link
CN (1) CN111046065B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112860799A (en) * 2021-02-22 2021-05-28 浪潮云信息技术股份公司 Management method for data synchronization of distributed database
CN115328579B (en) * 2022-10-11 2023-02-24 山东海量信息技术研究院 Scheduling method and system for neural network training and computer readable storage medium
CN115550251B (en) * 2022-12-01 2023-03-10 杭州蚂蚁酷爱科技有限公司 Block chain network, node set maintenance method and device
CN116108238B (en) * 2023-04-12 2023-06-16 杭州悦数科技有限公司 Optimization method, system and device for multi-hop query in graph database

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1953413A (en) * 2006-09-13 2007-04-25 华中科技大学 An organization method for tree network of control stream in the stream media living broadcast system
CN101488898A (en) * 2009-03-04 2009-07-22 北京邮电大学 Tree shaped fast connection establishing method based on multi-Agent cooperation
CN101959054A (en) * 2009-07-14 2011-01-26 中国电信股份有限公司 Integrated P2P (Peer-To-Peer) VOD (Video-On-Demand) system and partner node selecting method
CN102006238A (en) * 2010-12-14 2011-04-06 武汉大学 Balanced quick searching method in structureless P2P (Peer-to-Peer) network
CN104734962A (en) * 2015-02-26 2015-06-24 北京交通大学 Resource searching method for unstructured P2P network
CN106547796A (en) * 2015-09-23 2017-03-29 南京中兴新软件有限责任公司 The execution method and device of data base
CN107612980A (en) * 2017-08-28 2018-01-19 西安电子科技大学 It can adjust in a kind of structured P 2 P network and reliable consistency maintaining method
CN108205561A (en) * 2016-12-19 2018-06-26 北京国双科技有限公司 data query system, method and device
CN108768690A (en) * 2018-04-13 2018-11-06 华侨大学 A kind of the P2P self-organization network structures and resource search method of structuring
CN109684375A (en) * 2018-12-07 2019-04-26 深圳市智税链科技有限公司 Method, accounting nodes and the medium of Transaction Information are inquired in block chain network
CN109977274A (en) * 2019-03-31 2019-07-05 杭州复杂美科技有限公司 A kind of data query and verification method, system, equipment and storage medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8015211B2 (en) * 2004-04-21 2011-09-06 Architecture Technology Corporation Secure peer-to-peer object storage system
US20060209819A1 (en) * 2005-03-21 2006-09-21 Jennings Raymond B Iii Method and apparatus for efficiently expanding a P2P network
US8699377B2 (en) * 2008-09-04 2014-04-15 Trilliant Networks, Inc. System and method for implementing mesh network communications using a mesh network protocol
US20100250589A1 (en) * 2009-03-26 2010-09-30 Grasstell Networks Llc Tree structured P2P overlay database system
US8266170B2 (en) * 2010-04-26 2012-09-11 International Business Machines Corporation Peer to peer (P2P) missing fields and field valuation feedback
US10120902B2 (en) * 2014-02-20 2018-11-06 Citus Data Bilgi Islemleri Ticaret A.S. Apparatus and method for processing distributed relational algebra operators in a distributed database

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhang Jingmei et al. Research on trust models for P2P network security. Application Research of Computers (《计算机应用研究》). 2004, (No. 3), pp. 76-77. *

Also Published As

Publication number Publication date
CN111046065A (en) 2020-04-21

Similar Documents

Publication Publication Date Title
US11018980B2 (en) Data-interoperability-oriented trusted processing method and system
CN111046065B (en) Extensible high-performance distributed query processing method and device
Zhao et al. Tapestry: A resilient global-scale overlay for service deployment
CN110866046B (en) Extensible distributed query method and device
JP4652435B2 (en) Optimal operation of hierarchical peer-to-peer networks
Ledlie et al. Distributed, secure load balancing with skew, heterogeneity and churn
JP4806203B2 (en) Routing in peer-to-peer networks
Li et al. Efficient and scalable consistency maintenance for heterogeneous peer-to-peer systems
Liu et al. An efficient and trustworthy P2P and social network integrated file sharing system
WO2010127618A1 (en) System and method for implementing streaming media content service
CN110956463B (en) Credible certificate storing method and system based on extensible distributed query system
Xiaoqiang et al. An in-network caching scheme based on betweenness and content popularity prediction in content-centric networking
Shen et al. A proximity-aware interest-clustered P2P file sharing system
CN110990448B (en) Distributed query method and device supporting fault tolerance
Graffi et al. Skyeye. kom: An information management over-overlay for getting the oracle view on structured p2p systems
JP4533923B2 (en) Super-peer with load balancing function in hierarchical peer-to-peer system and method of operating the super-peer
Yoichi et al. Consistency preservation of replicas based on access frequency for content sharing in hybrid peer-to-peer networks
Yu et al. Granary: A sharing oriented distributed storage system
Medrano-Chávez et al. A performance comparison of Chord and Kademlia DHTs in high churn scenarios
Rahmani et al. A comparative study of replication schemes for structured P2P networks
Du et al. Highly available component sharing in large-scale multi-tenant cloud systems
Ma et al. A cloud‐assisted publish/subscribe service for time‐critical dissemination of bulk content
Chan et al. Malugo: A peer-to-peer storage system
Cherbal et al. Peer-to-Peer lookup process based on data popularity
Li et al. Optimal layer division for low latency in DHT‐based hierarchical P2P network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant