WO2022021357A1

WO2022021357A1 - File block download method and apparatus

Info

Publication number: WO2022021357A1
Application number: PCT/CN2020/106280
Authority: WO
Inventors: 徐聪; 袁庭球
Original assignee: 华为技术有限公司
Priority date: 2020-07-31
Filing date: 2020-07-31
Publication date: 2022-02-03
Also published as: CN114375565A; CN114375565B

Abstract

The present invention provides a file block download method. In a P2P network, an index node sends, to a requesting node according to a download request sent from the requesting node, preselected storage nodes that are selected, and then the requesting node selects, from a storage node set formed by the preselected storage nodes and storage nodes in a storage node list, a target storage node for downloading a target file block. In the file download method provided by the present invention, on the basis of the storage nodes in the storage node list, the iterative selection of the target storage node enables the selected target storage node to get closer to a global optimal storage node having the best performance among all the storage nodes storing a target file block, so as to obtain a global optimal solution, thereby improving download efficiency and download quality of file blocks.

Description

A file block downloading method and device

Technical field

The present invention mainly relates to the technical field of network communication, and in particular, to a file block downloading method and device.

Background

With the spread of high-speed Internet access and the growth in the computing and storage capabilities of personal computers, peer-to-peer (P2P) technology has come into wide use. With P2P technology, Internet users can make effective use of the large number of storage-capable network nodes scattered across the Internet and distribute files among them. Each network node contributes its free storage space to storage tasks, so that mass storage is achieved. Once a file has been stored in the P2P network, a user who requests the file at some network node in the P2P network must download it according to the storage principle of the P2P network.

Usually, a file is divided into multiple file blocks before it is stored in the P2P network, so downloading the file amounts to downloading its file blocks one after another. After the user issues, at a requesting node in the P2P network, a download request for a target file block, the requesting node can find all the storage nodes that store the target file block through an index node in the P2P network. The requesting node then initiates the download from every storage node at the same time, and the download request is complete as soon as the download from any one storage node finishes. This approach wastes a large amount of bandwidth. To reduce bandwidth occupancy, the requesting node can first run a performance test on each storage node and select the best-performing one as the target of the download request. However, testing the performance of a large number of storage nodes consumes considerable time and overhead, which reduces the download speed of file blocks.

Summary of the invention

The present application provides a method and device for downloading file blocks, so as to improve the downloading speed of files.

In a first aspect, the present invention provides a method for downloading file blocks, which is applied to a P2P network, where the P2P network includes a requesting node and an index node. The method includes: the index node receives a download request sent by the requesting node, where the download request is used to download a target file block; the index node obtains, according to the download request, all storage nodes corresponding to the target file block in a sub-distributed hash table (sub-DHT), where the sub-DHT is the part of the DHT corresponding to the P2P network that is stored in the index node; the index node selects a plurality of preselected storage nodes from all the storage nodes; the index node sends the plurality of preselected storage nodes to the requesting node; the requesting node obtains a storage node list, where the storage nodes in the storage node list are used to form a storage node set together with the preselected storage nodes; and the requesting node downloads the target file block based on the storage node set.

Generally, the index node may select the plurality of preselected storage nodes from all the storage nodes at random. In this way, by iteratively selecting the target storage node on the basis of the storage nodes in the storage node list, the selected target storage node moves closer and closer to the globally optimal storage node, that is, the best-performing node among all the storage nodes storing the target file block. The selection thus approaches the global optimal solution, which improves the download efficiency and download quality of file blocks.

In a second aspect, the present invention provides a method for downloading file blocks, which is applied to an index node in a P2P network. The method includes: receiving a download request sent by a requesting node, where the download request is used to download a target file block; obtaining, according to the download request, all storage nodes corresponding to the target file block in a sub-distributed hash table (sub-DHT), where the sub-DHT is the part of the DHT corresponding to the P2P network that is stored in the index node; selecting a plurality of preselected storage nodes from all the storage nodes; and sending the plurality of preselected storage nodes to the requesting node.

Generally, the index node may select the plurality of preselected storage nodes from all the storage nodes at random. In this way, the index node hands the requesting node only a randomly selected part of the storage nodes from which to choose a suitable storage node for downloading the target file block. This reduces the number of performance tests the requesting node must run and the time they take, and so improves the download speed of the target file block.
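As an illustration only, the random preselection performed by the index node might be sketched as follows. The function name preselect_storage_nodes, the dictionary layout of the sub-DHT, and the parameter k are assumptions made for this sketch and do not appear in the application.

```python
import random

def preselect_storage_nodes(sub_dht, object_id, k, rng=None):
    """Return up to k randomly chosen storage nodes for object_id.

    sub_dht is assumed to map an Object ID to the list of storage
    nodes that hold the corresponding file block.
    """
    rng = rng or random.Random()
    candidates = sub_dht.get(object_id, [])
    # Sample without replacement; never ask for more nodes than exist.
    return rng.sample(candidates, min(k, len(candidates)))

sub_dht = {"block-a": ["peer1", "peer2", "peer3", "peer4", "peer5"]}
chosen = preselect_storage_nodes(sub_dht, "block-a", 3, random.Random(0))
```

With k smaller than the number of storage nodes, the requesting node only has to test k candidates instead of all of them.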

In an implementation manner, if the index node has no storage node corresponding to the target file block, the download request is sent to an adjacent index node, and all storage nodes corresponding to the target file block are obtained from the adjacent index node, where the adjacent index node is an index node in the P2P network that has a specified algorithmic relationship with the index node.

In this way, it can be ensured that the requesting node can obtain the storage node corresponding to the target file block, thereby ensuring the downloadability of the target file block.

In an implementation manner, sending the plurality of preselected storage nodes to the requesting node includes: generating a local DHT, where the local DHT includes the plurality of preselected storage nodes; and sending the local DHT to the requesting node.

In this way, the index node sends the preselected storage nodes to the requesting node in the form of a local DHT, which keeps the node information of these preselected storage nodes complete and in a standard form and makes it easier for the requesting node to test the performance of the preselected storage nodes later.

In an implementation manner, after sending the local DHT to the requesting node, the method further includes: clearing the storage nodes in the local DHT.

In this way, when the index node receives the next download request, it can write the corresponding preselected storage nodes into an empty local DHT, which avoids confusion with the preselected storage nodes of the previous download request.
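The generate, send, and clear sequence for the local DHT might look like the following minimal sketch. The class IndexNode, its method answer, and the send callback are hypothetical names introduced here; the application does not specify a data layout or transport.

```python
class IndexNode:
    def __init__(self, sub_dht):
        self.sub_dht = sub_dht   # Object ID -> list of storage node info
        self.local_dht = {}      # scratch table, valid for one request only

    def answer(self, object_id, preselected, send):
        # Generate the local DHT for this download request.
        self.local_dht = {object_id: list(preselected)}
        # Hand a copy to the requesting node (send is an assumed callback).
        send(dict(self.local_dht))
        # Clear the table so the next request starts from an empty one.
        self.local_dht.clear()

received = []
index_node = IndexNode(sub_dht={"block-a": ["peer1", "peer2", "peer3"]})
index_node.answer("block-a", ["peer1", "peer3"], received.append)
```

Sending a copy rather than the table itself keeps the message intact when the index node clears its own local DHT.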

In a third aspect, the present invention provides a method for downloading file blocks, which is applied to a requesting node in a P2P network. The method includes: receiving preselected storage nodes sent by an index node, where the preselected storage nodes are storage nodes selected from all the storage nodes corresponding to a target file block in a sub-distributed hash table (sub-DHT), and the sub-DHT is the part of the DHT corresponding to the P2P network that is stored in the index node; obtaining a storage node list of the requesting node, where the storage nodes in the storage node list are used to form a storage node set together with the preselected storage nodes; and downloading the target file block based on the storage node set.

In this way, the requesting node can test the performance of each preselected storage node and select the best-performing ones to download the target file block, which reduces the number of storage nodes that download the target file block in parallel and raises the quality of the storage nodes used for the download. In addition, during the download of each target file block, the storage node set formed by the storage nodes in the requesting node's storage node list and the preselected storage nodes serves as the basis for selecting the target storage node. This iterative process ensures that, even though the target storage nodes used for each target file block are only a part of all the storage nodes, the download process as a whole can still reach the best-performing storage node, that is, converge to the global optimal solution, which improves file download efficiency and quality.

In an implementation manner, the storage nodes in the storage node list are storage nodes that were selected after the previous file block download operation, whose preset performance index falls within a preset performance index range, and that store the target file block.

In this way, the storage nodes in the storage node list can be used as the basis for iteration, so as to improve the fit between the determined target storage node and the storage node with the best performance among the global storage nodes.

In an implementation manner, before acquiring the storage node list of the requesting node, the method further includes: detecting whether the storage node list exists; and if the storage node list does not exist, forming the storage node set from the preselected storage nodes.

In this way, for downloading the first target file block, since there is no storage node list, it is only necessary to perform a performance test on the pre-selected storage nodes.
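A minimal sketch of how the requesting node might form the storage node set, covering the case in which no storage node list exists yet. The function name build_storage_node_set and the use of None for a missing list are assumptions of this sketch.

```python
def build_storage_node_set(preselected, storage_node_list=None):
    """Merge the carried-over list with the preselected nodes.

    Order is preserved and duplicates are dropped. When no list exists
    (first file block), only the preselected nodes form the set.
    """
    if not storage_node_list:
        return list(dict.fromkeys(preselected))
    return list(dict.fromkeys(list(storage_node_list) + list(preselected)))

first_block = build_storage_node_set(["p1", "p2"])
later_block = build_storage_node_set(["p2", "p3"], ["p1", "p2"])
```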

In an implementation manner, downloading the target file block based on the storage node set includes: testing a preset performance index of each storage node in the storage node set; obtaining node information of at least one target storage node whose test result meets a preset performance index threshold; and downloading the target file block according to the node information of the target storage node.

In this way, one or more storage nodes whose performance meets the standard can be selected as the target storage node from the storage node set as required to download the target file block.
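The test-and-threshold step could be sketched as below. The probe function measure, the choice of round-trip delay in milliseconds as the index, and the rule that a lower value is better are all assumptions made for illustration.

```python
def select_target_nodes(node_set, measure, threshold):
    """Return nodes whose measured value is at or below threshold, best first.

    measure is a hypothetical probe, for example a round-trip-delay test
    that returns milliseconds for a given node.
    """
    results = {node: measure(node) for node in node_set}
    passing = [node for node in node_set if results[node] <= threshold]
    return sorted(passing, key=results.get)

# Stand-in measurements instead of real network probes.
rtt_ms = {"p1": 80.0, "p2": 25.0, "p3": 40.0}
targets = select_target_nodes(list(rtt_ms), rtt_ms.get, threshold=50.0)
```

Here p2 and p3 pass the 50 ms threshold and p1 does not, so the target file block would be downloaded from p2, from p3, or from both.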

In an implementation manner, if node information of a plurality of target storage nodes whose test results meet the preset performance index threshold is obtained, the number of target storage nodes is determined according to the number of preselected storage nodes and the number of storage nodes in the storage node list.

In this way, in order to further improve the quality of the storage nodes for downloading target file blocks, multiple target storage nodes may be selected for parallel download.

In an implementation manner, after downloading the target file block, the method further includes: selecting storage nodes from the storage node set, where each selected storage node is a storage node whose preset performance index ranks at or above a preset ranking within the storage node set; and updating the storage nodes in the storage node list with the selected storage nodes.

In this way, to refine the optimal storage nodes over the iterations, the optimal storage node list is updated after each download of a target file block, which improves the performance of each optimal storage node in the list.
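The list update by ranking might be sketched as follows. The name update_storage_node_list, the parameter keep (standing in for the preset ranking), and the convention that a lower score is better are assumptions of this sketch.

```python
def update_storage_node_list(node_set, scores, keep):
    """Keep the `keep` best-ranked nodes for the next file block.

    scores maps each node to its measured performance index;
    a lower score is assumed to mean better performance.
    """
    ranked = sorted(node_set, key=lambda node: scores[node])
    return ranked[:keep]

scores = {"p1": 80.0, "p2": 25.0, "p3": 40.0}
new_list = update_storage_node_list(["p1", "p2", "p3"], scores, keep=2)
```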

In an implementation manner, the preset performance index is one or a combination of transmission delay, round-trip delay, and available bandwidth.

In this way, according to actual needs, one or several suitable performance indicators can be selected to test the performance of the storage node.
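The application does not say how several indicators are combined. One possible combination, shown purely as an assumption, is a weighted sum in which the two delays count directly and the available bandwidth counts inversely, so that a lower score always means a better node.

```python
def performance_score(delay_ms, rtt_ms, bandwidth_mbps, weights=(1.0, 1.0, 1.0)):
    """Hypothetical combined index; lower is better.

    Bandwidth enters as 1000 / bandwidth so that more bandwidth lowers
    the score. Setting a weight to 0 drops that indicator entirely.
    """
    w_delay, w_rtt, w_bw = weights
    return w_delay * delay_ms + w_rtt * rtt_ms + w_bw * (1000.0 / bandwidth_mbps)

combined = performance_score(10.0, 20.0, 100.0)
rtt_only = performance_score(10.0, 20.0, 100.0, weights=(0.0, 1.0, 0.0))
```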

In a fourth aspect, the present invention provides an index node. The index node includes a receiver, a processor, a memory, and a transmitter, which are coupled to one another. The processor invokes the program instructions in the memory so that the index node executes the following method: receiving a download request sent by a requesting node, where the download request is used to download a target file block; obtaining, according to the download request, all storage nodes corresponding to the target file block in a sub-distributed hash table (sub-DHT), where the sub-DHT is the part of the DHT corresponding to the P2P network that is stored in the index node; selecting a plurality of preselected storage nodes from all the storage nodes; and sending the plurality of preselected storage nodes to the requesting node.

Generally, the index node may select the plurality of preselected storage nodes from all the storage nodes at random. In this way, the index node hands the requesting node only a part of the storage nodes from which to choose a suitable storage node for downloading the target file block. This reduces the number of performance tests the requesting node must run and the time they take, and so improves the download speed of the target file block.

In an implementation manner, the method performed by the index node further includes: if the index node has no storage node corresponding to the target file block, sending the download request to an adjacent index node and acquiring all storage nodes corresponding to the target file block from the adjacent index node, where the adjacent index node is an index node in the P2P network that has a specified algorithmic relationship with the index node.

In an implementation manner, the method performed by the index node further includes: generating a local DHT, where the local DHT includes the plurality of preselected storage nodes; and sending the local DHT to the requesting node.

In an implementation manner, the method performed by the index node further includes: after sending the local DHT to the requesting node, clearing the storage nodes in the local DHT.

In a fifth aspect, the present invention provides a requesting node. The requesting node includes a receiver, a processor, a memory, and a transmitter, which are coupled to one another. The processor invokes the program instructions in the memory so that the requesting node executes the following method: receiving preselected storage nodes sent by an index node, where the preselected storage nodes are storage nodes selected from all the storage nodes corresponding to a target file block in a sub-distributed hash table (sub-DHT), and the sub-DHT is the part of the DHT corresponding to the P2P network that is stored in the index node; obtaining a storage node list of the requesting node, where the storage nodes in the storage node list are used to form a storage node set together with the preselected storage nodes; and downloading the target file block based on the storage node set.

In this way, the requesting node can test the performance of each preselected storage node and select the best-performing ones to download the target file block, which reduces the number of storage nodes that download the target file block in parallel and raises the quality of the storage nodes used for the download. In addition, during the download of each target file block, the storage nodes from the download of the previous target file block serve as the basis for selecting the target storage node. Therefore, even though the target storage nodes used for each target file block are only a part of all the storage nodes, the download process as a whole can still reach the best-performing storage node, that is, converge to the global optimal solution, which improves file download efficiency and quality.

In an implementation manner, the method performed by the requesting node further includes: before acquiring the storage node list of the requesting node, detecting whether the storage node list exists; and if the storage node list does not exist, forming the storage node set from the preselected storage nodes.

In this way, for downloading the first target file block, since there is no optimal storage node list, it is only necessary to perform a performance test on the preselected storage nodes.

In an implementation manner, the method performed by the requesting node further includes: testing the preset performance index of each storage node in the storage node set; acquiring node information of at least one target storage node whose test result meets the preset performance index threshold; and downloading the target file block according to the node information of the target storage node.

In an implementation manner, the method performed by the requesting node further includes: if node information of a plurality of target storage nodes whose test results meet the preset performance index threshold is acquired, determining the number of target storage nodes according to the number of preselected storage nodes and the number of storage nodes in the storage node list.

In an implementation manner, the method performed by the requesting node further includes: after downloading the target file block, selecting storage nodes from the storage node set, where each selected storage node is a storage node whose preset performance index ranks at or above a preset ranking within the storage node set; and updating the storage nodes in the storage node list with the selected storage nodes.

In this way, in order to optimize the optimal storage node in the iterative process, it is necessary to update the storage node list after each download of the target file block to ensure the optimal performance of each storage node in the storage node list.

In a sixth aspect, the present invention provides a P2P network that includes at least one requesting node and at least one index node, where the index node is configured to execute the related file block downloading method and the requesting node is configured to execute the related file block downloading method.

In this way, the index node can obtain the storage nodes corresponding to the target file block according to the download request sent by the requesting node and randomly select a plurality of preselected storage nodes to send to the requesting node, which reduces the number of storage nodes whose performance the requesting node must test and thus improves the download efficiency of the requesting node. Further, the requesting node determines the target storage node for downloading the target file block by testing the performance of the preselected storage nodes and the optimal storage nodes. During the download of each target file block, the storage nodes from the download of the previous target file block serve as the basis for selecting the target storage node. This iterative process therefore ensures that, even if the target storage nodes used for each target file block are only a part of all the storage nodes, the download process as a whole can still reach the best-performing storage node, that is, converge to the global optimal solution, which improves the download efficiency and quality of the file.
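The convergence argument can be illustrated with a small simulation. This is a sketch under strong simplifying assumptions that are not taken from the application: every node stores every block, node latency is fixed over time, lower latency means better performance, and the names simulate, preselect_k, and keep are hypothetical.

```python
import random

def simulate(latency, blocks, preselect_k, keep, seed=0):
    """Return the best latency in the carried list after each block download."""
    rng = random.Random(seed)
    nodes = list(latency)
    carried = []            # storage node list kept by the requesting node
    best_per_block = []
    for _ in range(blocks):
        preselected = rng.sample(nodes, preselect_k)          # index node side
        node_set = list(dict.fromkeys(carried + preselected)) # storage node set
        ranked = sorted(node_set, key=latency.get)            # performance test
        carried = ranked[:keep]                               # update the list
        best_per_block.append(latency[carried[0]])
    return best_per_block

rng = random.Random(1)
latency = {f"peer{i}": rng.uniform(5.0, 200.0) for i in range(200)}
history = simulate(latency, blocks=30, preselect_k=5, keep=3)
```

Because the best node found so far always stays in the carried list, the values in history never increase, although only five new candidates are tested per block. Under these assumptions the list can only approach the global optimum as more blocks are downloaded; it is not guaranteed to reach it within a fixed number of blocks.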

Description of drawings

FIG. 1 is a schematic structural diagram of a P2P network according to an embodiment of the present invention;

FIG. 2 is a schematic structural diagram of a P2P network according to an embodiment of the present invention;

FIG. 3 is a schematic structural diagram of a requesting node according to an embodiment of the present invention;

FIG. 4 is a schematic structural diagram of an index node according to an embodiment of the present invention;

FIG. 5 is a schematic flowchart of a method for downloading a file according to an embodiment of the present invention;

FIG. 6 is a schematic structural diagram of a sub-DHT of an index node according to an embodiment of the present invention;

FIG. 7 is a schematic structural diagram of a local DHT provided by an embodiment of the present invention;

FIG. 8 is a schematic structural diagram of a storage node list according to an embodiment of the present invention;

FIG. 9 is a simulation diagram for selecting a target storage node and a globally optimal storage node according to an embodiment of the present invention.

Detailed description

FIG. 1 is a schematic structural diagram of a P2P network according to an embodiment of the present invention. As shown in FIG. 1, the P2P network includes nodes 1 to 4. All of these nodes can store file blocks, so they may also be called storage nodes. Usually, a file is divided into multiple file blocks that are stored on the storage nodes. The P2P network has a global DHT, which stores file block names together with the node information of the storage nodes corresponding to each file block name. A file block name can be represented by a resource identifier (Object ID); the resource identifier can be a unique hash value generated from the file block name by a hash operation. The node information of a storage node can include the index address of the storage node, the node ID of the storage node, and so on. Each storage node in the P2P network stores a part of the global DHT, and the part stored on a given storage node can be called a sub-DHT. Each storage node can use a specific algorithm, such as Tapestry, Pastry, Chord, or Kademlia, to determine in which sub-DHT, that is, on which storage node, the correspondence between a file block name and its storage nodes is stored.
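To make the mapping concrete, the sketch below derives an Object ID from a file block name and assigns it to a node with a simplified successor rule in the spirit of Chord. The choice of SHA-1, the integer node IDs, and the function names are assumptions for illustration; the sketch is not a full implementation of Chord or of any of the algorithms named above.

```python
import hashlib
from bisect import bisect_left

def object_id(block_name):
    """Hash a file block name into a 160-bit integer identifier."""
    digest = hashlib.sha1(block_name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")

def responsible_node(key, node_ids):
    """Return the first node ID >= key, wrapping around the ring."""
    ring = sorted(node_ids)
    i = bisect_left(ring, key)
    return ring[i % len(ring)]

# Small integers keep the ring easy to follow.
owner = responsible_node(25, [10, 20, 30])
wrapped = responsible_node(35, [10, 20, 30])
```

The node returned by responsible_node is the one whose sub-DHT would hold the entry for that key under this simplified rule.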

One or several storage nodes in the P2P network can download a target file block by initiating a download request to an adjacent node, that is, by querying the storage nodes corresponding to the target file block through the adjacent node and then downloading the target file block using the node information of those storage nodes. The storage node that initiates the download request is the requesting node, and an adjacent node is a storage node that has a specified algorithmic relationship with the requesting node. The specified algorithm usually refers to the algorithm used to determine where the correspondence between file block names and storage nodes is stored. As shown in FIG. 1, if node 1 is the requesting node, the adjacent node is node 2.

When the adjacent node receives the download request, it queries the locally stored sub-DHT for the node information of the storage nodes corresponding to the target file block. Because the adjacent node is then responsible for indexing the node information of the storage nodes of the target file block, it can also be called an index node. If the adjacent node holds no node information for the storage nodes of the target file block, the node information must be queried through the adjacent node of that adjacent node, which then acts as the index node. This process is repeated until the node information of the storage nodes of the target file block is found. Each node in the P2P network can thus be both a storage node and an index node, that is, it both stores file blocks and answers queries for the node information of the storage nodes of target file blocks.
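The repeated hand-off from one index node to its adjacent node can be sketched as a simple loop. The mapping neighbor_of abstracts away the specified algorithm that defines adjacency, and max_hops is a safety limit added for this sketch; neither appears in the application.

```python
def lookup(start, object_id, sub_dhts, neighbor_of, max_hops=16):
    """Walk from index node to adjacent index node until the block is found.

    sub_dhts:    node name -> {Object ID: [storage node info]}
    neighbor_of: node name -> adjacent node name (assumed to be given)
    Returns (index node that answered, storage nodes) or (None, []).
    """
    node = start
    for _ in range(max_hops):
        storage_nodes = sub_dhts.get(node, {}).get(object_id)
        if storage_nodes:
            return node, storage_nodes
        node = neighbor_of[node]  # forward the request to the adjacent node
    return None, []

sub_dhts = {"node2": {}, "node3": {"blk": ["node4"]}}
neighbor_of = {"node2": "node3", "node3": "node2"}
found = lookup("node2", "blk", sub_dhts, neighbor_of)
```

In this example node 2 has no entry for the block, so the request moves on to node 3, which then acts as the index node.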

Further, FIG. 2 is a schematic structural diagram of a P2P network provided by an embodiment of the present invention, in which download requests may be initiated by multiple requesting nodes at the same time. As shown in FIG. 2, requesting node 1 and requesting node 2 initiate download requests at the same time; the two requesting nodes may download the same target file block or different target file blocks.

There can also be one or more index nodes. As shown in FIG. 1, if node 2 undertakes the query work, there is one index node. If the node information of the storage nodes of the target file block cannot be found in node 2, and node 3 and node 4 then undertake the query work, there are two index nodes. As shown in FIG. 2, for requesting node 1, node 3 and node 4 undertake the query work, so requesting node 1 has two index nodes; for requesting node 2, node 4 undertakes the query work, so requesting node 2 has one index node. For the P2P network as a whole, there are then two index nodes in total.

The following embodiments describe one requesting node in the P2P network and the index node corresponding to it. Where the P2P network has multiple requesting nodes and multiple index nodes, the methods of the following embodiments apply by analogy.

FIG. 3 is a schematic structural diagram of a requesting node according to an embodiment of the present invention. The requesting node may be a terminal device with functions such as data storage, data query, and data processing, for example a computer, a mobile phone, a tablet computer, a server, or a cloud server. The requesting node may include at least one receiver, at least one processor, at least one memory, and at least one transmitter. Taking FIG. 3 as an example, the requesting node includes a receiver 101, a processor 102, a memory 103, and a transmitter 104, which are coupled to one another, and program instructions are stored in the memory 103. The processor 102 can call the program instructions in the memory 103 to make the requesting node execute the relevant file download method, for example, generating download requests, testing node performance, and generating optimal node lists.

FIG. 4 is a schematic structural diagram of an index node according to an embodiment of the present invention. The index node may be a terminal device with functions such as data storage, data query, and data processing, for example a computer, a mobile phone, a tablet computer, a server, or a cloud server. The index node may include at least one receiver, at least one processor, at least one memory, and at least one transmitter. Taking FIG. 4 as an example, the index node includes a receiver 201, a processor 202, a memory 203, and a transmitter 204, which are coupled to one another, and program instructions are stored in the memory 203. The processor 202 can call the program instructions in the memory 203 to make the index node execute the relevant file download method, for example, querying the node information of the storage nodes of the target file block, randomly selecting the preselected storage nodes, and generating a local DHT.

The receivers and transmitters mentioned in the embodiments of the present invention may be communication interfaces on a terminal device, and the communication interfaces may be one or more optical fiber link interfaces, Ethernet interfaces, microwave link interfaces, copper wire interfaces, or the like. Specifically, the communication interface may include a network adapter, a network interface card, a local area network adapter (LAN adapter), a network interface controller (NIC), a modem, and the like. The communication interface may be an independent device, or may be partially or fully integrated or packaged in the processor as a part of the processor.

The processor mentioned in the embodiments of the present invention may include one or more processing units, such as a system on a chip (SoC), a central processing unit (CPU), a microcontroller (microcontroller, MCU), memory controller, etc. Wherein, different processing units may be independent devices, or may be integrated in one or more processors.

The memory mentioned in the embodiments of the present invention may include one or more storage units. For example, it may include volatile memory, such as dynamic random access memory (DRAM) and static random access memory (SRAM), and may also include non-volatile memory (NVM), such as read-only memory (ROM) and flash memory. Different storage units may be independent devices, or may be integrated or packaged in one or more processors or communication interfaces as a part of those processors or communication interfaces.

FIG. 5 is a schematic flowchart of a file downloading method provided by an embodiment of the present invention. As shown in FIG. 5 , the method includes:

S1. The requesting node sends a download request to the index node.

The requesting node generates the corresponding download request in response to an operation initiated by the user, such as a file download instruction that the user enters at the requesting node or triggers there by clicking. As described above, a file is stored in the P2P network by first splitting it into multiple file blocks and then storing each file block separately. Downloading a file is therefore also a process of downloading each of its file blocks separately. When the requesting node receives the user's file download instruction, it can first determine the file block names corresponding to the file from the file name given in the download instruction, for example as hash values calculated from the file name by a hash algorithm. In this way, the requesting node knows all the file blocks that must be downloaded to complete the file download and generates a download request for each of them; during the download, each such file block is called a target file block. Downloading a file block involves two steps: querying the node information of the storage nodes where the file block is located, and using that node information to download the file block. If some file blocks are already stored on the requesting node, the requesting node can identify them (they do not need to be downloaded) by querying its own database or file log and generate download requests only for the remaining file blocks. This saves query time and download time and improves the overall efficiency of the file download.
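A minimal sketch of how the requesting node might skip blocks that are already stored locally. The scheme for deriving block names (SHA-256 over the file name and a block index), the known block count, and the function names are assumptions; the application only states that the names can be obtained with a hash algorithm.

```python
import hashlib

def block_names(file_name, block_count):
    """Derive one hypothetical Object ID per block from file name and index."""
    return [hashlib.sha256(f"{file_name}#{i}".encode("utf-8")).hexdigest()
            for i in range(block_count)]

def pending_requests(file_name, block_count, local_blocks):
    """Create a download request for every block not already stored locally."""
    return [{"object_id": oid}
            for oid in block_names(file_name, block_count)
            if oid not in local_blocks]

names = block_names("movie.mp4", 4)
# Pretend the second block is already in the local database or file log.
requests = pending_requests("movie.mp4", 4, local_blocks={names[1]})
```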

The download request usually includes the file block name of the target file block, such as the Object ID mentioned above. Usually, one download request corresponds to one target file block. The requesting node can generate the download request for the next target file block after the previous target file block has been downloaded, or it can generate download requests for all target file blocks at once after receiving the file download instruction from the user. The requesting node may determine which file block to download first according to a priority or weight.
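The request-generation step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the embodiment: the block-naming scheme (SHA-256 of the file name plus a block index), the `DownloadRequest` type, and the priority convention (lower value first) are all hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class DownloadRequest:
    block_name: str   # file block name (e.g. an Object ID)
    priority: int     # lower value = download earlier (assumed convention)


def block_names(file_name: str, block_count: int) -> list[str]:
    """Derive one hash-based name per block (illustrative scheme only)."""
    return [
        hashlib.sha256(f"{file_name}#{i}".encode()).hexdigest()
        for i in range(block_count)
    ]


def build_requests(file_name: str, block_count: int,
                   local_blocks: set[str]) -> list[DownloadRequest]:
    """Generate one request per block that is not already stored locally."""
    requests = [
        DownloadRequest(name, priority=i)
        for i, name in enumerate(block_names(file_name, block_count))
        if name not in local_blocks
    ]
    return sorted(requests, key=lambda r: r.priority)
```

Blocks found in `local_blocks` (standing in for the requesting node's own database or file log) produce no request, so no query or download time is spent on them.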

S2. The index node receives the download request sent by the requesting node, where the download request is used to download the target file block.

After receiving the download request, the index node can obtain the file block name, download priority and other information of the target file block included in the download request by parsing the download request.

S3. The index node obtains, according to the download request, all storage nodes corresponding to the target file block in the sub-distributed hash table (sub-DHT), where the sub-DHT is the part of the DHT of the P2P network that is stored in the index node.

As described above, the index node stores part of the DHT of the P2P network, namely the sub-DHT. FIG. 6 is a schematic structural diagram of a sub-DHT of an index node provided by an embodiment of the present invention. As shown in FIG. 6, the sub-DHT includes file block names and the node information of the storage nodes corresponding to each file block name. Specifically, the sub-DHT includes M file block names and the node information of the corresponding storage nodes, where Hash identifies a file block name, Peer identifies the node identifier of a storage node corresponding to the file block, and add_ identifies the index address of the storage node. The index node can thus determine whether it stores the node information of the storage nodes corresponding to the target file block by matching the file block name of the target file block against the file block names in the sub-DHT.

In one query result, if the file block name of the target file block is Hash a, the storage nodes in FIG. 6 that correspond to the matching file block name are Peer a1, Peer a2, Peer a3, Peer z5, and so on. Assuming that the number of storage nodes corresponding to the matching file block name is N, then N≤M. In this way, all storage nodes corresponding to the target file block can be obtained from the sub-DHT of the index node.

In another query result, if the file block name of the target file block is Hash c, FIG. 6 contains no file block name identical to that of the target file block, so the node information of the storage nodes storing the target file block cannot be found in this index node. In this case, the index node needs to send the download request to an adjacent node, which queries for the node information of the storage nodes of the target file block. Because the adjacent node takes over the query, it can also be called an adjacent index node. This process is repeated until some adjacent index node finds the node information of the storage nodes of the target file block. The query result described in the previous paragraph is then obtained at that adjacent index node, that is, all storage nodes corresponding to the target file block are obtained.
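The lookup in the sub-DHT and the forwarding to adjacent index nodes can be illustrated with the sketch below. It assumes, for simplicity, that each index node has a single adjacent node and that forwarding stops after `max_hops` nodes; the `IndexNode` type and both of these choices are hypothetical and are not specified in the description.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class IndexNode:
    name: str
    # sub-DHT: file block name -> list of (node identifier, index address)
    sub_dht: dict[str, list[tuple[str, str]]] = field(default_factory=dict)
    neighbor: Optional["IndexNode"] = None  # adjacent index node, if any


def find_storage_nodes(start: IndexNode, block_name: str,
                       max_hops: int = 16) -> list[tuple[str, str]]:
    """Walk adjacent index nodes until one of them knows the block."""
    node: Optional[IndexNode] = start
    for _ in range(max_hops):      # hop limit is an assumption of this sketch
        if node is None:
            break
        if block_name in node.sub_dht:
            # Match found: return all storage nodes recorded for the block.
            return list(node.sub_dht[block_name])
        node = node.neighbor       # forward the request to the adjacent node
    return []
```

With a sub-DHT holding only Hash a on the first index node, a query for Hash c is answered by the first adjacent index node whose sub-DHT contains Hash c.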

S4. The index node selects a plurality of preselected storage nodes from all the storage nodes.

To reduce the number of nodes whose performance the requesting node must subsequently test, the index node selects multiple preselected storage nodes from all the storage nodes obtained. For example, m preselected storage nodes are selected from the above N storage nodes, where 1≤m≤N. The value of m can be set by the user based on experience or set by the index node through a related algorithm; the method of setting m is not limited here.

Further, so that the subsequently chosen target storage node better approximates the best-performing storage node among all the storage nodes, the preselected storage nodes may be chosen at random, which reduces the contingency of the selection.

To distinguish the preselected storage nodes from the other entries of the sub-DHT in the index node, the preselected storage nodes can be specially marked in the sub-DHT, for example by coloring the corresponding rows. In this embodiment, a separate DHT, namely a local DHT, is instead generated for the preselected storage nodes. FIG. 7 is a schematic structural diagram of a local DHT provided in this embodiment. As shown in FIG. 7, the local DHT includes at least the node identifiers of the m preselected storage nodes and the file block name of the corresponding target file block. The local DHT may also include the index addresses of the preselected storage nodes. That is, the content of the local DHT can be identical to the content for the preselected storage nodes in the sub-DHT, written by copying or another method, or it can be a selected part of that content written into the local DHT. The numbers 1 to m in Peer 1 to Peer m in FIG. 7 number the preselected storage nodes and do not represent specific node identifiers. The local DHT can be a table established in advance by the index node. After the index node randomly selects the preselected storage nodes, the corresponding node identifiers, file block name, index addresses and other information can be written directly into this table to produce the final local DHT, which improves the efficiency of generating the local DHT.
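The random preselection and the construction of the local DHT might look as follows. This is a sketch under assumptions: the row layout (`hash`, `peer`, `addr` keys) and the function names are hypothetical, and the random generator is passed in only so that the example is reproducible.

```python
import random


def preselect(all_nodes: list[dict], m: int,
              rng: random.Random) -> list[dict]:
    """Randomly pick m of the N storage nodes found for the block (1 <= m <= N)."""
    if not 1 <= m <= len(all_nodes):
        raise ValueError("m must satisfy 1 <= m <= N")
    return rng.sample(all_nodes, m)


def build_local_dht(block_name: str, chosen: list[dict]) -> list[dict]:
    """Copy the preselected rows into a separate table (the local DHT)."""
    return [
        {"hash": block_name, "peer": node["peer"], "addr": node["addr"]}
        for node in chosen
    ]
```

Because `rng.sample` draws without replacement, the m preselected storage nodes are distinct, and every one of the N storage nodes is equally likely to be chosen.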

S5. The index node sends the plurality of preselected storage nodes to the requesting node.

The index node can send the node information of the preselected storage nodes recorded in the sub-DHT, such as node identifiers and index addresses, directly to the requesting node. The index node can also send the generated local DHT directly to the requesting node. Doing so makes it less likely that node information of a preselected storage node is missed and saves the time of querying that node information from the sub-DHT. The requesting node also receives node information in a standard format, which simplifies its subsequent data processing.

Further, if the index node generates a local DHT, the node information of the storage nodes in the local DHT can be cleared after the index node sends the local DHT to the requesting node. When the index node receives a new download request, it can then write the node information of the preselected storage nodes for the new download request into the local DHT to form a new local DHT, so that the local DHT is reusable.

S6. The requesting node receives the preselected storage nodes sent by the index node, where the preselected storage nodes are storage nodes randomly selected from all storage nodes corresponding to the target file block in the sub-distributed hash table (sub-DHT), and the sub-DHT is the part of the DHT of the P2P network that is stored in the index node.

The requesting node receives the node information of the preselected storage nodes sent by the index node for the download request, such as the node information of Peer 1 to Peer m in the above example. If the index node generated a local DHT, the requesting node instead receives the local DHT shown in FIG. 7, in which the node information of each preselected storage node is laid out clearly and can be obtained quickly and accurately.

S7. The requesting node obtains a storage node list, where the storage nodes in the storage node list are used to form a storage node set together with the preselected storage nodes.

S8. The requesting node downloads the target file block based on the storage node set.

S9. The requesting node tests the preset performance indicators of each storage node in the storage node set.

After receiving the preselected storage nodes sent by the index node, the requesting node does not directly download the target file block from all the preselected storage nodes; it first tests the performance of each preselected storage node. It also tests the performance of each storage node in the storage node list held by the requesting node, where the storage nodes in the storage node list are the storage nodes that were selected after the previous file block download operation, whose preset performance indicators fall within the preset performance indicator range, and that store the target file block.

For example, (1) suppose the download request for the current file block download operation on the requesting node is A1, which requests the first file block a1 of file a. While file block a1 is being downloaded, the index node sends the node information of m preselected storage nodes to the requesting node according to download request A1. Because this is the first file block of file a to be downloaded, the requesting node has no record of storage nodes related to the file blocks of file a. The requesting node therefore only needs to test the preset performance indicators of the m preselected storage nodes. After the file block download operation ends, the requesting node can select the n best-performing storage nodes from the m preselected storage nodes and record them to generate a storage node list, where 1≤n≤m. FIG. 8 is a schematic structural diagram of a storage node list provided by an embodiment of the present invention. In FIG. 8, the numbers 1 to n in Peer 1 to Peer n number the storage nodes in the storage node list and do not represent specific node identifiers.

(2) Further, building on the download of file block a1, suppose the download request for the current file block download operation is A2, which requests the second file block a2 of file a. While file block a2 is being downloaded, the index node again sends the node information of m preselected storage nodes to the requesting node according to download request A2. The requesting node can now download file block a2 not only from the m preselected storage nodes but also from the n storage nodes recorded after the previous file block download operation. It therefore needs to consider the performance of the m preselected storage nodes together with the performance of the n recorded storage nodes to determine the storage node from which to download file block a2. After the file block download operation is completed, the requesting node selects the n best-performing storage nodes from among the m preselected storage nodes and the n recorded storage nodes and records them in the storage node list.

It should be noted that if a storage node A in the P2P network stores file a, that is, stores all the file blocks of file a, then the storage-node information recorded for each of these file blocks includes the node information of storage node A. Suppose the requesting node initiates a download request for target file block a1 (a file block of file a), the preselected storage nodes returned by the index node include storage node A, and the performance test of the storage nodes yields several optimal storage nodes that include storage node A. When the requesting node then requests another file block of file a, such as file block a2, storage node A also holds file block a2. Storage node A can therefore still serve as a storage node for downloading file block a2, and the problem of an optimal storage node not holding the target file block does not arise. In other words, the recorded range of storage nodes remains valid.

(3) Further, when the requesting node continues to download other file blocks of file a, the download process in (2) is repeated. Each time the n storage nodes are determined again, the requesting node can replace the node information in the storage node list with the node information of these n storage nodes to update the list. Generally, OPT_index and OPT_metric can be used to represent a storage node in the storage node list, where OPT_index can be the hash value (node identifier) of the storage node and OPT_metric indicates the load of the transmission path from the requesting node to the storage node.
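Steps (1) to (3) amount to one selection round that is repeated for every file block. A minimal sketch of such a round is given below, assuming that a single scalar cost (for example a delay, lower being better) summarizes the preset performance indicators; the function name and the `measure` callable are hypothetical.

```python
def select_and_update(preselected: list[str], node_list: list[str],
                      measure, n: int) -> tuple[str, list[str]]:
    """Return (target node, updated storage node list).

    `measure(node)` returns a cost such as delay; lower is better.
    """
    # Storage node set = preselected nodes + previously recorded nodes,
    # de-duplicated while keeping first-seen order.
    candidates = list(dict.fromkeys(preselected + node_list))
    ranked = sorted(candidates, key=measure)
    # Best node is the download target; the best n are recorded for next time.
    return ranked[0], ranked[:n]
```

For the first file block, `node_list` is empty and only the m preselected storage nodes are ranked. In later rounds the recorded nodes compete with the new preselected nodes, so the cost of the chosen target never gets worse as long as the measured costs stay the same.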

Based on the above process, after the requesting node receives the node information of the preselected storage nodes sent by the index node, it first needs to detect whether it holds a storage node list. In one implementation, the requesting node establishes a table for the storage node list before downloading a file and, after each target file block is downloaded, writes the node information of the selected storage nodes into that table. Whether a storage node list exists can then be determined by checking whether the table contains any characters, or any characters related to node information. If the table contains no characters, or none related to node information, it is determined that no storage node list exists and that the current file block download operation is for the first file block of the file being downloaded. In another implementation, the requesting node generates a storage node list from the node information of the selected storage nodes after each target file block is downloaded. In this case, it can directly detect whether the requesting node has a storage node list.

After the storage node set is obtained, a preset performance indicator test is performed on each storage node in the set, where the preset performance indicator may be one or a combination of transmission delay, round-trip delay, and available bandwidth. For example, the PING tool can be used to measure the transmission delay from the requesting node to each storage node, and the IPerf tool can be used to measure the available bandwidth from the requesting node to each storage node. A tested storage node can be recorded as Peer(Hash), Hash_metric, where Peer is the storage node, Hash is the hash value identifying the storage node, and Hash_metric indicates the load of the transmission path from the requesting node to the storage node.

In one implementation, the requesting node tests the performance of each storage node in the storage node set for every download request. In another implementation, the requesting node tests the performance of the storage nodes in the storage node list periodically; when no test is due, it refers directly to the storage node performance recorded in the storage node list. In yet another implementation, the requesting node tests the performance of the storage nodes in the storage node list when it receives a notice that the state of a storage node has changed; otherwise it refers directly to the storage node performance recorded in the storage node list.
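The periodic and change-triggered testing variants can be sketched as a small cache of recorded metrics. The class below is an illustration only: `MetricCache`, the `probe` callable (standing in for a PING or IPerf measurement), and the `max_age_s` re-test period are all assumptions introduced for this example.

```python
import time


class MetricCache:
    """Remember measured metrics and re-measure only when they are stale."""

    def __init__(self, probe, max_age_s: float, clock=time.monotonic):
        self._probe = probe          # callable: node -> metric (hypothetical)
        self._max_age_s = max_age_s  # re-test period (assumed parameter)
        self._clock = clock
        self._entries: dict[str, tuple[float, float]] = {}

    def metric(self, node: str) -> float:
        now = self._clock()
        cached = self._entries.get(node)
        if cached is not None and now - cached[0] < self._max_age_s:
            return cached[1]          # reuse the recorded performance
        value = self._probe(node)     # periodic (or first) test
        self._entries[node] = (now, value)
        return value

    def invalidate(self, node: str) -> None:
        """Force a re-test, e.g. after a state-change notice for the node."""
        self._entries.pop(node, None)
```

Setting `max_age_s` to zero reproduces the first variant, in which every download request triggers a fresh test of each storage node.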

S10. The requesting node acquires node information of at least one target storage node whose test result meets a preset performance indicator threshold.

The requesting node obtains the test results of the preset performance indicators for each storage node through the above test. Based on these results, a storage node that meets the preset performance indicator threshold can be selected as the target storage node from which the target file block is finally downloaded.

In an implementation manner, a storage node with the best performance may be selected from all storage nodes as a target storage node, so that the download speed can be effectively improved and the resource occupation of each storage node by the download operation can be reduced.

In another implementation, if the P2P network is relatively unstable and any storage node may exit the P2P network at any time, downloading from a single target storage node is prone to failure when that node leaves. To make the download process as reliable as possible, multiple target storage nodes can be selected. For example, the number of target storage nodes is determined according to the number of preselected storage nodes and the number of storage nodes in the storage node list. For example, assuming that every storage node in the P2P network has the same failure probability p₀, and that the required reliability of the download process from the storage nodes in the storage node set is P, the requesting node can select N target storage nodes from the m preselected storage nodes and the n storage nodes in the storage node list according to the following formula:

Based on this implementation, the requesting node can use the N storage nodes with the best performance as the target storage nodes. Further, in this implementation manner, the requesting node may select n storage nodes to update the storage node list, where n may be equal to N, that is, the N target storage nodes are directly recorded as storage nodes; n may not be equal to N, that is, select n target storage nodes with the best performance as storage nodes for recording.
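The formula referred to above is not reproduced in this text. As an illustration only, one plausible reading is sketched below under the assumption that storage nodes fail independently, so that a download from N target storage nodes fails only if all N fail. The reliability is then 1 - p₀^N, and the smallest N with 1 - p₀^N ≥ P is taken, capped at the m + n available candidates. The function name and the cap are hypothetical.

```python
import math


def min_target_nodes(p0: float, reliability: float, m: int, n: int) -> int:
    """Smallest N with 1 - p0**N >= reliability, capped at m + n candidates."""
    if not (0.0 < p0 < 1.0 and 0.0 < reliability < 1.0):
        raise ValueError("p0 and reliability must lie strictly between 0 and 1")
    # 1 - p0**N >= P  <=>  N >= ln(1 - P) / ln(p0), since ln(p0) < 0.
    needed = math.ceil(math.log(1.0 - reliability) / math.log(p0))
    return max(1, min(needed, m + n))
```

For instance, with p₀ = 0.2 and a required reliability of 0.99, two nodes give 1 - 0.04 = 0.96, which is too low, while three nodes give 1 - 0.008 = 0.992, so three target storage nodes would be selected under this assumption.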

S11. The requesting node downloads the target file block according to the node information of the target storage node.

In an implementation manner, if there is only one target storage node, the requesting node only needs to obtain the index address of the target storage node, and download the target file block according to the index address.

In another implementation, if there are multiple target storage nodes, the requesting node obtains the index address of each target storage node and downloads the target file block from the corresponding target storage nodes according to these index addresses. When the requesting node completes the download from any one target storage node, the overall download process ends.
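Downloading from several target storage nodes and stopping at the first completed copy can be sketched with a thread pool. This is a minimal illustration, assuming a caller-supplied `fetch(address)` function (hypothetical) that returns the block bytes or raises when the storage node is unreachable; the real transfer protocol is outside the scope of the sketch.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def download_from_first(addresses: list[str], fetch) -> tuple[str, bytes]:
    """Start one download per target node and return the first success.

    `fetch(address)` is a caller-supplied (hypothetical) function that
    returns the block bytes or raises if the node has left the network.
    """
    if not addresses:
        raise ValueError("at least one target storage node is required")
    errors = []
    pool = ThreadPoolExecutor(max_workers=len(addresses))
    try:
        futures = {pool.submit(fetch, addr): addr for addr in addresses}
        for future in as_completed(futures):
            try:
                return futures[future], future.result()
            except Exception as exc:  # this node failed; wait for another
                errors.append((futures[future], exc))
    finally:
        # Do not wait for slower nodes once a copy has arrived.
        pool.shutdown(wait=False, cancel_futures=True)
    raise RuntimeError(f"all target storage nodes failed: {errors}")
```

A failure of one target storage node does not abort the download; the call fails only when every target storage node has failed, which is the situation the multi-node selection is meant to make unlikely.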

From the above process of selecting a target storage node, it follows that for every file block of the file other than the first one downloaded, the target storage node is selected by jointly considering the best-performing storage nodes recorded from the previous file block download operation and the preselected storage nodes newly added in the current operation. The download of each target file block is thus one step of an iterative selection process, which picks a better-performing storage node from among storage nodes that already perform well. At the same time, the index node keeps adding new candidates, which keeps the selection up to date and widens the range from which target storage nodes are chosen. Moreover, as the iteration proceeds, the target storage node for each file block moves closer and closer to the global optimal storage node, the best-performing node among all storage nodes that store the target file block; that is, the global optimal solution is obtained. Over the entire file download, the file downloading method provided by the embodiment of the present invention therefore provides a relatively high-performance target storage node for each file block, improving the download efficiency and download quality of the file blocks.

The process of obtaining the global optimal solution through the above iterative process can be proved by combining the queuing theory model and the Lyapunov drift function. The relevant model parameters of the theoretical proof are defined as follows:

N: The total number of storage nodes corresponding to each file block, or the maximum concurrent download requests for the same file block

δ_i: the mean download request sending rate of the i-th requesting node

λ_j: the mean arrival rate of download requests at the j-th storage node

μ_j: the mean download rate of the j-th storage node

q_k(t): at time t, the request backlog of the k-th storage node under the file download method provided by the embodiment of the present invention

At time t, the request backlog of the target storage node selected by the i-th requesting node under the file download method provided by the embodiment of the present invention

q^*(t): at time t, the request backlog of the global optimal storage node

K: At the same time, the maximum number of download requests allowed to arrive concurrently for each storage node

In this model, the overall file download time is used as the performance indicator, and the global optimal and local optimal storage nodes are compared. The global optimal storage node is the best-performing storage node among all the storage nodes corresponding to the target file block; the local optimal storage node is the target storage node selected for the target file block. Consider the case with the slowest convergence, n=m=1, where m is the number of preselected storage nodes and n is the number of best-performing storage nodes recorded after the target file block is downloaded. When n=1, the local optimal storage node recorded by the requesting node is the target storage node. If the file download method converges in this case, variants of the method that converge faster also converge. It is assumed that at each point in time, each storage node processes at most one download request, corresponding to one file block, from its request cache queue. According to the queuing theory model, at any moment the probability that a file block download request arrives at the i-th storage node is:

Similarly, the probability that the jth storage node has a download request processed is:

When n=m=1, the probability that the solution of the present invention selects the global optimal storage node each time is at least 1/N. In addition, each storage node can accommodate at most K download requests scheduled to it at the same time; any requests beyond K are discarded. Therefore, when j=i:

Similarly, at any time, the probability that the jth storage node has a download request successfully transmitted is:

Assume first that the record of the best-performing storage node from the previous round is not used in the embodiment of the present invention (n=0). The scheduling policy then does not converge. The proof is as follows:

Consider the Nth storage node: For any requesting node, the maximum probability that the download request is scheduled to the Nth storage node is: m/N. Therefore, the maximum possible request arrival rate for the Nth storage node is:

For the remaining N-1 storage nodes, the minimum download request arrival rate sum is:

If the following formula holds, according to the conclusion of the queuing theory model, it can be proved that the whole scheduling scheme is not convergent:

The download request generation and processing rates only satisfy:

The specific mean value distribution can be made uneven, so a set of input and output rate distributions can be found to satisfy:

Obviously, the average rate distribution of download request arrival and processing satisfying (1) is easy to construct. At this time, the subsystem composed of the first N-1 storage nodes is unstable. So the whole system is also unstable. So in the case of n=0, it is not convergent.

Next, it is proved that when the record of the best-performing storage node from the previous file block download operation is introduced (n>0), the entire random download strategy converges to the global optimal storage node:

The quadratic Lyapunov function V is constructed, and the convergence of the algorithm is demonstrated by proving that the Lyapunov function has a negative expected single-step drift. The Lyapunov function is constructed as follows:

The following proves that there exist ε > 0 and k > 0 such that:

E[V(t+1)-V(t)|V(t)]≤εV(t)+k

Since each storage node can accommodate at most K download requests scheduled at the same time at the same time, there are:

At the same time, each storage node processes at most one download request at the same time, so there are:

q ^* (t)-q ^* (t+1)≤1 (3)

Combining (2) (3) with:

Examining the first term of the Lyapunov function, let:

There are (probability that a download request is scheduled to storage node i, multiplied by the square of the queue difference):

Substitute (4) into (5) and scale the Lyapunov function to get:

The upper bound of the first term of the Lyapunov function is as follows:

Let's continue to examine the second term of the Lyapunov function, let

Among them, k=i, then there are:

The inferences from the queuing model are:

Substituting (8) into (7), and further scaling the Lyapunov function, we get:

Further sorting, namely:

Combining (6) and (9) we get:

Next, define:

Then there are:

Since q^*(t) is the backlog of the global optimal storage node, q^*(t) ≤ q_i(t). Further:

The initial conditions of the queuing model are:

So there are:

Assume that, when n=m=1, the target storage node under the scheme of the present invention does not converge to the global optimal storage node. Then the value of the Lyapunov function V(t) becomes extremely large when t is large enough. Since V(t)=V1(t)+V2(t), at least one of V1(t) and V2(t) must be extremely large.

Assuming that V1(t) is extremely large, since

Then the deviation between the target storage node and the global optimal storage node is large and does not converge to a fixed upper bound. Since the values of δ_i and μ_i are known, c is a fixed value:

Therefore there must be some value t1 such that:

At this time, substitute (11), (12), (13) into (10), we can get:

E[V(t+1)-V(t)|V(t)]<0

That is, there exists ε>0, such that:

E[V(t+1)-V(t)|V(t)]<-ε

The Lyapunov function has a negative expectation drift, that is, the local optimal storage node download strategy converges to the global optimal storage node, which contradicts the assumption. Therefore, V1(t) will not have a very large value.

Assuming that V2(t) is extremely large, there must be some value t₂ such that, when t≥t₂, the load on the whole queue is heavy and the value of q_i(t) is extremely large and keeps increasing. Suppose first that the value of the global optimal solution q^*(t) is not large at this time, that is, q^*(t) and q_i(t) differ greatly. Since c is a fixed value:

At the same time, according to the properties of the perfect square:

Substitute (12), (14), (15) into (10) to have:

E[V(t+1)-V(t)|V(t)]<0

That is, there exists ε>0, such that:

E[V(t+1)-V(t)|V(t)]<-ε

The Lyapunov function has a negative expectation drift, which contradicts the hypothesis. Therefore, V2(t) will not have a very large value.

Now suppose that the backlog q^*(t) of the global optimal storage node is also extremely large at this time, so that q^*(t) and q_i(t) do not differ much. Although formula (15) then does not hold, there must be some t≥t₃ such that:

Substitute (11), (15), (16) into (10) to get:

E[V(t+1)-V(t)|V(t)]<0

That is, there exists ε>0, such that:

E[V(t+1)-V(t)|V(t)]<-ε

All cases in which the Lyapunov function fails to converge have now been ruled out, which demonstrates that when n=m=1 the local optimal storage node under this scheme converges to the global optimal storage node. Further, embodiments using an (n, m) policy with n>1 or m>1 converge faster, so the convergence result holds for them as well. Therefore, the target storage node selected for each file block download in the embodiment of the present invention converges to the global optimal storage node, that is, to the global optimal solution.

To further illustrate the technical effect of the embodiments of the present invention, a P2P storage network composed of 50 storage nodes is simulated, and the first 100 file blocks of a file download are examined. For each file block, the global optimal storage node (ignoring the time overhead of performance measurement) and the target storage node chosen by the file download method provided by the embodiment of the present invention (taking n=1, m=2 as an example) are selected, and the download delay of the target storage node is compared with that of the global optimal storage node. The results are shown in FIG. 9. After about the 60th file block, the target storage node selected by the present invention coincides with the global optimal storage node. The target storage node selected by the embodiment of the present invention can therefore converge to the global optimal storage node.
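A toy version of such a simulation is sketched below to make the iteration concrete. It is not the simulation behind FIG. 9: the per-node delays are synthetic, drawn once and kept constant, and queueing effects are ignored. All names and the delay range are assumptions of this sketch.

```python
import random


def simulate(num_nodes: int = 50, blocks: int = 100, m: int = 2,
             n: int = 1, seed: int = 0) -> tuple[list[float], float]:
    """Return (delay of the chosen node per block, global optimal delay)."""
    rng = random.Random(seed)
    # Static per-node download delay in ms (synthetic, not from the source).
    delay = [rng.uniform(10.0, 200.0) for _ in range(num_nodes)]
    recorded: list[int] = []          # storage node list kept by the requester
    chosen_delays = []
    for _ in range(blocks):
        preselected = rng.sample(range(num_nodes), m)   # index node's pick
        candidates = sorted(set(preselected) | set(recorded),
                            key=lambda node: delay[node])
        chosen_delays.append(delay[candidates[0]])      # target storage node
        recorded = candidates[:n]                       # update the list
    return chosen_delays, min(delay)
```

Under these simplifications the delay of the chosen node never increases from one block to the next, because the previously recorded node always remains a candidate. The block at which the chosen node first coincides with the global optimum depends on the random draws.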

The above verification further shows that the file download method provided by the embodiment of the present invention selects the target storage node finally used to download the target file block by jointly considering the best-performing storage nodes recorded from the previous file block download operation and the preselected storage nodes newly added in the current operation. Through the iterative process, the target storage node corresponding to each file block moves closer and closer to the global optimal storage node, the best-performing node among all storage nodes that store the target file block; that is, the global optimal solution is obtained. Over the entire file download, the file downloading method provided by the embodiment of the present invention therefore provides a relatively high-performance target storage node for each file block, improving the download efficiency and download quality of the file blocks.

The above specific embodiments further describe the purpose, technical solutions and beneficial effects of the present invention in detail. It should be understood that the above are only specific embodiments of the present invention, and are not intended to limit the protection scope of the present invention. On the basis of the technical solutions of the present invention, any modifications, equivalent replacements, improvements, etc. made shall be included within the protection scope of the present invention.

Claims

A method for downloading file blocks, characterized in that it is applied to a P2P network, wherein the P2P network includes a request node and an index node, and the method includes:

receiving, by the index node, a download request sent by the requesting node, where the download request is used to download target file blocks;

The index node obtains, according to the download request, all storage nodes corresponding to the target file block in a sub-distributed hash table (sub-DHT), where the sub-DHT is the part of the DHT corresponding to the P2P network that is stored in the index node;

The index node selects a plurality of preselected storage nodes from all the storage nodes;

sending, by the index node, the plurality of preselected storage nodes to the requesting node;

the requesting node receives the plurality of preselected storage nodes;

obtaining, by the requesting node, a list of storage nodes, where the storage nodes in the storage node list are used to form a storage node set together with the preselected storage nodes;

The requesting node downloads the target file block based on the set of storage nodes.
A method for downloading file blocks, characterized in that it is applied to a requesting node in a P2P network, the method comprising:

Receive a preselected storage node sent by the index node, where the preselected storage node is a storage node selected from all storage nodes corresponding to the target file block in a sub-distributed hash table (sub-DHT), and the sub-DHT is the part of the DHT corresponding to the P2P network that is stored in the index node;

obtaining a storage node list of the requesting node, where the storage nodes in the storage node list are used to form a storage node set together with the preselected storage node;

The target file block is downloaded based on the set of storage nodes.
The method according to claim 2, wherein the storage node in the storage node list is a storage node that was selected after the previous file block download operation, whose preset performance index conforms to a preset performance index range, and that stores the target file block.
The method according to claim 2 or 3, wherein before the acquiring the storage node list of the requesting node, the method further comprises:

detecting whether the storage node list exists;

If the storage node list does not exist, a storage node set is formed by the preselected storage nodes.
The method according to any one of claims 2-4, wherein the downloading the target file block based on the storage node set comprises:

testing the preset performance indicators of each storage node in the storage node set;

Obtain node information of at least one target storage node whose test result meets the preset performance indicator threshold;

The target file block is downloaded according to the node information of the target storage node.
The method according to claim 5, wherein if the node information of a plurality of target storage nodes whose test results meet the preset performance indicator threshold is obtained, the number of the target storage nodes is determined according to the number of the preselected storage nodes and the number of storage nodes in the storage node list.
The method according to any one of claims 5-7, wherein after downloading the target file block, the method further comprises:

Selecting a storage node from the storage node set, where the selected storage node is a storage node with a preset performance index ranking greater than or equal to a preset ranking in the storage node set;

The storage nodes in the storage node list are updated using the selected storage nodes.
The method according to any one of claims 2-7, wherein the preset performance index is one or a combination of transmission delay, round-trip delay, and available bandwidth.
A requesting node, characterized in that the requesting node comprises: a receiver, a processor, a memory and a transmitter, wherein the receiver, the processor, the memory and the transmitter are coupled, and the processor invokes program instructions in the memory to cause the requesting node to perform the following method:

Receive a preselected storage node sent by the index node, where the preselected storage node is a storage node selected from all storage nodes corresponding to the target file block in a sub-distributed hash table (sub-DHT), and the sub-DHT is the part of the DHT corresponding to the P2P network that is stored in the index node;

obtaining a storage node list of the requesting node, where the storage nodes in the storage node list are used to form a storage node set together with the preselected storage node;

The target file block is downloaded based on the set of storage nodes.
The requesting node according to claim 9, wherein the storage node in the storage node list is a storage node that was selected after the previous file block download operation, whose preset performance index conforms to a preset performance index range, and that stores the target file block.
The requesting node according to claim 9 or 10, wherein the method performed by the requesting node further comprises:

Before acquiring the storage node list of the requesting node, detecting whether the storage node list exists;

If the storage node list does not exist, a storage node set is formed by the preselected storage nodes.
The requesting node according to any one of claims 9-11, wherein the method performed by the requesting node further comprises:

testing the preset performance indicators of each storage node in the storage node set;

Obtain node information of at least one target storage node whose test result meets the preset performance indicator threshold;

The target file block is downloaded according to the node information of the target storage node.
The requesting node according to claim 12, wherein the method performed by the requesting node further comprises:

If the node information of a plurality of target storage nodes whose test results meet the preset performance indicator threshold is obtained, the number of the target storage nodes is determined according to the number of the preselected storage nodes and the number of storage nodes in the storage node list.
The requesting node according to any one of claims 9-13, wherein the method executed by the requesting node further comprises:

After downloading the target file block, a storage node is selected from the storage node set, and the selected storage node is a storage node with a preset performance index ranking in the storage node set greater than or equal to a preset ranking;

The storage nodes in the storage node list are updated using the selected storage nodes.
The requesting node according to any one of claims 9-14, wherein the preset performance indicator is one or a combination of transmission delay, round-trip delay, and available bandwidth.
A P2P network, comprising at least one requesting node and at least one indexing node, wherein the indexing node is used to receive a download request sent by the requesting node, and the download request is used to download a target file block;

Obtain all storage nodes corresponding to the target file block in the sub-distributed hash table DHT according to the download request, where the sub-DHT is a part of the DHT stored in the index node in the DHT corresponding to the P2P network;

selecting a plurality of preselected storage nodes from all the storage nodes;

sending the plurality of preselected storage nodes to the requesting node;

The requesting node is configured to perform the method according to any one of claims 5-9.
The P2P network according to claim 16, wherein if the index node does not have a storage node corresponding to the target file block, the index node is further configured to send the download request to an adjacent index node, and obtain all storage nodes corresponding to the target file block from the adjacent index node, wherein the adjacent index node is an index node in the P2P network that has a specified algorithmic relationship with the index node.
The P2P network according to claim 16 or 17, wherein the index node is further used for:

generating a local DHT, the local DHT including the plurality of preselected storage nodes;

The local DHT is sent to the requesting node.
The P2P network according to claim 18, wherein the index node is further used for:

After sending the local DHT to the requesting node, the storage nodes in the local DHT are emptied.