CN117240851B - Data distribution method, device, equipment and storage medium - Google Patents

Data distribution method, device, equipment and storage medium

Info

Publication number
CN117240851B
Authority
CN
China
Prior art keywords
node
source node
target
nodes
current
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311514857.0A
Other languages
Chinese (zh)
Other versions
CN117240851A (en
Inventor
王晓通
郭锋
贾伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Metabrain Intelligent Technology Co Ltd
Original Assignee
Suzhou Metabrain Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Metabrain Intelligent Technology Co Ltd
Priority to CN202311514857.0A
Publication of CN117240851A
Application granted
Publication of CN117240851B
Legal status: Active
Anticipated expiration


Landscapes

  • Information Transfer Between Computers (AREA)

Abstract

The embodiment of the application provides a data distribution method, a device, equipment and a storage medium, which belong to the technical field of data processing, wherein the method comprises the steps of executing a data distribution strategy, and the data distribution strategy comprises the following steps: respectively determining target nodes corresponding to each source node in the current source node list from a plurality of nodes in the current node list to be distributed, wherein the target nodes corresponding to any two source nodes are not repeated; each source node in the current source node list respectively sends a target sending file to a corresponding target node; deleting the target node successfully receiving the target sending file from the current node list to be distributed, adding the target node into the current source node list, and executing the data distribution strategy again according to the updated source node list and the updated node list to be distributed. Embodiments of the present application aim to improve the performance of data distribution.

Description

Data distribution method, device, equipment and storage medium
Technical Field
The embodiment of the application relates to the technical field of data processing, in particular to a data distribution method, a device, equipment and a storage medium.
Background
With the continuous development of information technology and the spread of innovative applications, data transmission and distribution have long been an active research topic.
In the early stages, data transmission was mainly aimed at meeting the needs of different application scenarios. With the rapid development of digitization, more and more application scenarios require large-scale data to be processed and distributed more efficiently and reliably. Methods for distributing data files are therefore being explored in depth, and optimization strategies and improvements are being proposed to further raise their efficiency and reliability.
Existing data distribution algorithms include data transmission methods based on a centralized architecture, such as FTP (File Transfer Protocol). A centralized architecture generally requires a central server that acts as the single point for storing and distributing files, and each client has to send requests to the central server to acquire the files. Although this approach is easy to implement and convenient to manage, its reliance on a single server leads to performance bottlenecks and reliability problems.
Therefore, how to reliably distribute data with high efficiency remains a concern.
Disclosure of Invention
The embodiment of the application provides a data distribution method, a device, equipment and a storage medium, aiming at improving the data distribution performance.
In a first aspect, an embodiment of the present application provides a data distribution method, where the method includes:
executing a data distribution policy, the data distribution policy comprising:
respectively determining target nodes corresponding to each source node in the current source node list in a plurality of nodes of the current node list to be distributed, wherein the target nodes corresponding to any two source nodes are not repeated;
each source node in the current source node list respectively sends a target sending file to a corresponding target node;
deleting the target node successfully receiving the target sending file from the current node list to be distributed and adding the target node into the current source node list to obtain an updated source node list and an updated node list to be distributed;
and executing the data distribution strategy again according to the updated source node list and the updated node list to be distributed.
Optionally, before executing the data distribution policy, the method further comprises:
determining an initial source node, and constructing an initial source node list and an initial node list to be distributed, wherein the initial source node list comprises initial source nodes, and the initial node list to be distributed comprises all nodes except the initial source nodes in a target node cluster.
Optionally, determining the initial source node includes:
acquiring respective current performance states and current load states of all nodes in the target node cluster;
and taking the node with the best current performance state and the smallest current load state of all the nodes as an initial source node.
Optionally, determining the initial source node includes:
and taking the node meeting the preset condition as an initial source node among all the nodes of the target node cluster.
Optionally, among all the nodes of the target node cluster, a node meeting a preset condition is taken as an initial source node, including:
acquiring respective current performance states and current load states of all nodes in the target node cluster;
and randomly selecting one node from all nodes of which the current performance state meets the performance preset condition and the current load state meets the load preset condition as an initial source node.
Optionally, determining the initial source node includes:
and determining the initial source node in response to the selection operation of the initial source node.
Optionally, determining, among the plurality of nodes in the current node list to be distributed, a target node corresponding to each source node in the current source node list, respectively, includes:
among the plurality of nodes of the current list of nodes to be distributed, a target node is determined for each source node in the current list of source nodes.
Optionally, among the plurality of nodes of the current node to be distributed list, determining a target node for each source node in the current source node list includes:
and for any source node in the current source node list, selecting a node with the smallest distance with the source node from a plurality of nodes in the current node list to be distributed as a target node of the source node, wherein the target nodes corresponding to any two source nodes are different.
Optionally, among the plurality of nodes of the current node to be distributed list, determining a target node for each source node in the current source node list includes:
and for any source node in the current source node list, selecting a node which has the smallest distance with the source node and has the bandwidth larger than a bandwidth calibration value from a plurality of nodes in the current node list to be distributed as a target node of the source node, wherein the target nodes corresponding to any two source nodes are different.
Optionally, among the plurality of nodes of the current node to be distributed list, determining a target node for each source node in the current source node list includes:
and for any source node in the current source node list, selecting a node with the minimum distance from the source node, the bandwidth larger than a bandwidth calibration value and the load smaller than a load calibration value from a plurality of nodes in the current node list to be distributed as a target node of the source node, wherein the target nodes corresponding to any two source nodes are different.
Optionally, each source node in the current source node list sends a target sending file to a corresponding target node respectively, including:
monitoring, in real time, network transmission information while any source node sends the target sending file to its corresponding target node, and determining, according to the network transmission information, whether to terminate the current transmission process between the source node and its corresponding target node.
Optionally, the network transmission information includes network communication fault information, and monitoring the network transmission information in real time and determining whether to terminate the current transmission process includes:
monitoring, in real time, network communication fault information while any source node sends the target sending file to its corresponding target node;
and if network communication fault information is detected, terminating the current transmission process, and reselecting a target node for the source node from the plurality of nodes in the current node list to be distributed.
Optionally, the network transmission information includes a transmission duration, and monitoring the network transmission information in real time and determining whether to terminate the current transmission process includes:
monitoring, in real time, the transmission duration while any source node sends the target sending file to its corresponding target node;
and if the transmission duration is detected to be greater than a transmission duration threshold, terminating the current transmission process, and reselecting a target node for the source node from the plurality of nodes in the current node list to be distributed.
Optionally, the method further comprises:
and for any source node, if no target node corresponding to the source node can be determined after all nodes in the current node list to be distributed have been polled, deleting the source node from the source node list.
Optionally, the method further comprises:
and responding to a joining request of a new node, and adding the new node into a node list to be distributed.
Optionally, the method further comprises:
detecting a fault signal of a fault node, and deleting the fault node from the current source node list or the current node list to be distributed.
Optionally, after detecting a fault signal of a fault node and deleting the fault node from the current source node list or the current node list to be distributed, the method further includes:
responding to a fault recovery signal of the fault node, and checking whether the target transmission file exists in the fault node;
if the target sending file exists in the fault node, adding the fault node into the source node list;
and if the target sending file does not exist in the fault node, adding the fault node into the node list to be distributed.
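The optional membership handling described above (a new node joining, a faulty node being removed, and a recovered node being re-admitted) can be sketched as follows. This is only an illustration under assumptions: the function names, the plain Python lists and the `has_file` flag are hypothetical and are not taken from the application.

```python
def on_join(node, pending_list):
    """A new node has no file yet, so it waits in the list to be distributed."""
    if node not in pending_list:
        pending_list.append(node)


def on_fault(node, source_list, pending_list):
    """Drop a faulty node from whichever list currently holds it."""
    if node in source_list:
        source_list.remove(node)
    if node in pending_list:
        pending_list.remove(node)


def on_recovery(node, has_file, source_list, pending_list):
    """Re-admit a recovered node according to whether it still holds the file."""
    destination = source_list if has_file else pending_list
    if node not in destination:
        destination.append(node)


# Hypothetical walk-through.
sources, pending = ["n1", "n2"], ["n3"]
on_join("n4", pending)                      # n4 waits for the file
on_fault("n2", sources, pending)            # n2 leaves the source list
on_recovery("n2", True, sources, pending)   # n2 kept the file: source again
on_fault("n3", sources, pending)            # n3 leaves the pending list
on_recovery("n3", False, sources, pending)  # n3 has no file: waits again
```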
In a second aspect, an embodiment of the present application provides a data distribution apparatus, including:
the first execution module is used for executing a data distribution strategy, wherein the data distribution strategy comprises the following steps: respectively determining target nodes corresponding to each source node in the current source node list in a plurality of nodes of the current node list to be distributed, wherein the target nodes corresponding to any two source nodes are not repeated; each source node in the current source node list respectively sends a target sending file to a corresponding target node; deleting the target node successfully receiving the target sending file from the current node list to be distributed and adding the target node into the current source node list to obtain an updated source node list and an updated node list to be distributed;
and the second execution module is used for executing the data distribution strategy again according to the updated source node list and the updated node list to be distributed.
In a third aspect, embodiments of the present application provide a computer device, comprising: at least one processor, and a memory storing a computer program executable on the processor, wherein the processor performs the data distribution method according to the first aspect of the embodiment when executing the computer program.
In a fourth aspect, embodiments of the present application provide a non-volatile readable storage medium storing a computer program, where the computer program, when executed by a processor, performs the data distribution method according to the first aspect of the embodiments.
The beneficial effects are that:
executing a data distribution policy, the data distribution policy comprising: respectively determining target nodes corresponding to each source node in the current source node list in a plurality of nodes of the current node list to be distributed, wherein the target nodes corresponding to any two source nodes are not repeated; each source node in the current source node list respectively sends a target sending file to a corresponding target node; deleting the target node successfully receiving the target sending file from the current node list to be distributed and adding the target node into the current source node list to obtain an updated source node list and an updated node list to be distributed; and executing the data distribution strategy again according to the updated source node list and the updated node list to be distributed.
When data is distributed, every source node in the current source node list already holds the target sending file. A corresponding target node is determined for each source node from the current node list to be distributed, and each source node sends the target sending file to its corresponding target node. A target node that successfully receives the target sending file is added to the current source node list and deleted from the current node list to be distributed, so that in the next round of distribution it acts as a source node and distributes the target sending file together with the other source nodes in the source node list. Decentralized data distribution is thereby realized, which avoids the distribution failures caused by the single point of failure in centralized data distribution methods and improves the reliability of data distribution. In addition, any node that holds the target sending file can serve as a distribution node, and the target nodes of different source nodes are not repeated, which improves the efficiency of data distribution and provides a data distribution method with better performance.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.
FIG. 1 shows a flowchart of the steps of a data distribution method provided in an embodiment of the present application;
FIG. 2 shows a flowchart of the steps of a data distribution strategy provided in an embodiment of the present application;
FIG. 3 shows a flowchart of the data distribution method provided in an embodiment of the present application;
FIG. 4 shows a data distribution schematic provided in an embodiment of the present application;
FIG. 5 shows a functional block diagram of a data distribution device provided in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application.
To make the objects, technical solutions and advantages of the embodiments of the present application clearer, each embodiment of the present application is described in detail below with reference to the accompanying drawings. Those of ordinary skill in the art will understand that numerous technical details are set forth in the embodiments in order to provide a better understanding of the present application. The technical solutions claimed in the present application can nevertheless be implemented without these technical details and with various changes and modifications based on the following embodiments. The following embodiments are divided for convenience of description and should not be construed as limiting the specific implementation of the present application; the embodiments may be combined with and refer to one another where there is no contradiction.
Referring to FIG. 1, a flowchart of the steps of a data distribution method provided in an embodiment of the present application is shown, where the method may include the following steps:
S101: Determining an initial source node, and constructing an initial source node list and an initial node list to be distributed.
Firstly, in an application environment requiring data distribution, an initial source node is selected from a target node cluster, wherein the target node cluster comprises a plurality of equipment nodes which are connected with each other in a wired or wireless communication manner, for example, a target node cluster formed by various network equipment and computing equipment such as a cloud, a server, a firewall, a switch, a router and the like, a target node cluster formed by a plurality of storage server nodes in a distributed storage system, and even a target node cluster formed by element nodes such as a BMC (baseboard management controller) and a CPU (central processing unit) in a computer.
In a possible implementation manner, determining the initial source node may be to obtain respective current performance states and current load states of all nodes in the target node cluster; and taking the node with the best current performance state and the smallest current load state of all the nodes as an initial source node.
The current performance states and current load states of all nodes in the target node cluster are compared to find this node. The performance state can be evaluated comprehensively along multiple dimensions, such as the hardware configuration of the node, its storage space, its running memory and its current network state, and it can be customized for different application scenarios according to actual application requirements; this embodiment does not limit it.
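As an illustration of selecting the node with the best performance state and the smallest load state, the following minimal sketch ranks nodes by performance first and uses load as a tie-breaker. The scoring fields, the tie-breaking order and all names are assumptions; the application leaves the way the two states are combined open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeStatus:
    name: str
    performance: float  # hypothetical score, higher is better
    load: float         # hypothetical fraction of capacity in use, lower is better


def pick_initial_source(nodes):
    """Return the node with the best performance, breaking ties by smallest load."""
    return max(nodes, key=lambda n: (n.performance, -n.load))


cluster = [
    NodeStatus("node-a", performance=0.9, load=0.5),
    NodeStatus("node-b", performance=0.9, load=0.2),
    NodeStatus("node-c", performance=0.6, load=0.1),
]
initial_source = pick_initial_source(cluster)  # node-b: ties node-a on performance, lower load
```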
In another possible implementation manner, determining the initial source node may also be taking, among all nodes of the target node cluster, a node that meets a preset condition as the initial source node. By way of example, the respective current performance states and current load states of all nodes in the target node cluster are obtained, and one node is randomly selected as the initial source node from all nodes whose current performance state meets a performance preset condition and whose current load state meets a load preset condition.
By analyzing the performance states and load states of all nodes of the target node cluster, any node whose performance state meets the performance preset condition and whose current load state meets the load preset condition can be selected as the source node, so that the source node is a node with good performance and low load. The performance preset condition and the load preset condition can be customized according to the requirements of the actual application; for example, the load preset condition can be set as the currently carried load being not more than one third of the rated load of the node.
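The preset-condition variant can be sketched as a filter followed by a random choice. The dictionary keys, the performance threshold and the sample values are hypothetical; only the "not more than one third of the rated load" condition comes from the text.

```python
import random


def pick_by_preset_conditions(nodes, min_performance, max_load_fraction=1 / 3, rng=random):
    """Randomly pick one node that satisfies both preset conditions, or None."""
    eligible = [
        n for n in nodes
        if n["performance"] >= min_performance
        and n["load"] <= max_load_fraction * n["rated_load"]
    ]
    return rng.choice(eligible) if eligible else None


cluster = [
    {"name": "node-a", "performance": 0.9, "load": 20, "rated_load": 90},
    {"name": "node-b", "performance": 0.8, "load": 50, "rated_load": 90},  # too loaded
    {"name": "node-c", "performance": 0.4, "load": 10, "rated_load": 90},  # too slow
    {"name": "node-d", "performance": 0.7, "load": 25, "rated_load": 90},
]
chosen = pick_by_preset_conditions(cluster, min_performance=0.6, rng=random.Random(0))
```

Passing a seeded `random.Random` instance makes the choice reproducible in tests; in normal use the default module-level generator is sufficient.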
In another possible implementation manner, the initial source node may be specified manually: the source node is selected by an operator as required by the application environment of the current data distribution, which improves the adaptability of the method.
The source node should contain a target transmission file which needs to be subjected to data distribution, and the target transmission file can be different under different data distribution scenes.
For example, in a scenario where a server BMC product distributes data such as files, installation packages and upgrade packages, the target sending file may be an installation package, an upgrade package or configuration information used when batch firmware upgrade, configuration and similar operations are performed under BMC federated management. It may also be a configuration file, image data, a software installation package, an upgrade package or the like transmitted between network devices and computing devices such as clouds, servers, firewalls, switches and routers, or data that needs to be replicated and backed up in a distributed storage system. The target sending file may also be text, video, voice or other multimedia data.
After the initial source node is determined, an initial source node list and an initial node list to be distributed are constructed, wherein the initial source node list comprises the initial source node, and the initial node list to be distributed comprises all nodes in the target node cluster except the initial source node.
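The construction of the two initial lists can be sketched as follows; the function name and the use of plain Python lists are assumptions made for illustration.

```python
def build_initial_lists(cluster, initial_source):
    """Split the cluster into the initial source list and the list to be distributed."""
    if initial_source not in cluster:
        raise ValueError("the initial source node must belong to the target node cluster")
    source_list = [initial_source]
    pending_list = [node for node in cluster if node != initial_source]
    return source_list, pending_list


source_list, pending_list = build_initial_lists(["n1", "n2", "n3", "n4"], "n2")
```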
S102: Executing a data distribution strategy.
Referring to FIG. 2, a flowchart of the steps of a data distribution strategy provided in an embodiment of the present application is shown, where the data distribution strategy includes the following steps:
S1021: Respectively determining a target node corresponding to each source node in the current source node list from a plurality of nodes in the current node list to be distributed.
Each source node in the current source node list has already downloaded the target sending file, whereas none of the nodes in the current node list to be distributed has. A target node is first determined for each source node; the target node corresponding to a source node is the node to which that source node needs to send the target sending file. The target nodes corresponding to any two source nodes are not repeated, that is, in one round of data transmission any node receives the target sending file from only one source node.
In one possible implementation, multiple target nodes may be determined for each source node, i.e., one source node may send target send files to multiple target nodes simultaneously.
However, when one source node sends the target sending file to a plurality of target nodes, the load of the source node increases, which may lengthen the transmission duration, stall the transmission or even cause a crash. Therefore, in a feasible implementation manner, only one target node is determined for each source node in the current source node list from the plurality of nodes in the current node list to be distributed, so that transmission is one-to-one and the transmission efficiency can be improved.
Specifically, when a target node is determined for each source node in the current source node list from the plurality of nodes in the current node list to be distributed, the target node may be selected in any one of the following ways:
a1: and for any source node in the current source node list, selecting a node with the smallest distance with the source node from a plurality of nodes in the current node list to be distributed as a target node of the source node, wherein the target nodes corresponding to any two source nodes are different.
The distance refers to a communication transmission distance between the source node and any target node, and in an exemplary wireless transmission scenario, a node closest to the source node is preferably selected as the target node, so that the success rate and the transmission speed of data distribution and transmission can be improved.
A2: and for any source node in the current source node list, selecting a node which has the smallest distance with the source node and has the bandwidth larger than a bandwidth calibration value from a plurality of nodes in the current node list to be distributed as a target node of the source node, wherein the target nodes corresponding to any two source nodes are different.
The target node may also be selected based on both the communication transmission distance and the transmission bandwidth. For example, for source node 1, let the bandwidth calibration value be D. The communication transmission distance between node 1 and source node 1 is 10 m and the bandwidth is D1, where D1 is smaller than D; the communication transmission distance between node 2 and source node 1 is 20 m and the bandwidth is D2, where D2 is larger than D; the communication transmission distance between node 3 and source node 1 is 15 m and the bandwidth is D3, where D3 is larger than D2. Node 1 is excluded because its bandwidth does not exceed D, and of the remaining nodes, node 3 is closer than node 2, so node 3 is selected as the target node of source node 1.
A3: and for any source node in the current source node list, selecting a node with the minimum distance from the source node, the bandwidth larger than a bandwidth calibration value and the load smaller than a load calibration value from a plurality of nodes in the current node list to be distributed as a target node of the source node, wherein the target nodes corresponding to any two source nodes are different.
The target node may also be selected based on the communication transmission distance, the transmission bandwidth and the load of the node, for example, for the source node 2, the bandwidth calibration value is D, the load calibration value is F, the communication transmission distance between the node a and the source node 2 is 12m, the bandwidth is greater than D, but the load is greater than F; and if the communication transmission distance between the node B and the source node 2 is 15m, the bandwidth is larger than D, and the load is smaller than F, the node B is selected as a target node of the source node 2.
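Selection modes A1 to A3 differ only in which filters are applied before the nearest node is taken, so they can be sketched with one function whose thresholds are optional. The class, function and parameter names are hypothetical, and the concrete numbers (D = 100, D1 = 80, D2 = 120, D3 = 150, F = 0.5) are assumed values chosen only to satisfy the inequalities stated in the two examples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    distance_m: float
    bandwidth: float
    load: float = 0.0


def select_target(candidates, min_bandwidth=None, max_load=None):
    """Nearest candidate passing the optional bandwidth and load filters, or None."""
    eligible = [
        c for c in candidates
        if (min_bandwidth is None or c.bandwidth > min_bandwidth)
        and (max_load is None or c.load < max_load)
    ]
    return min(eligible, key=lambda c: c.distance_m, default=None)


def assign_targets(candidates_by_source, min_bandwidth=None, max_load=None):
    """Give each source one target; a node already taken is not offered again."""
    taken, assignment = set(), {}
    for source, candidates in candidates_by_source.items():
        free = [c for c in candidates if c.name not in taken]
        choice = select_target(free, min_bandwidth, max_load)
        if choice is not None:
            assignment[source] = choice.name
            taken.add(choice.name)
    return assignment


# Source node 1 from the A2 example (assumed D = 100).
around_source_1 = [
    Candidate("node 1", 10, 80),
    Candidate("node 2", 20, 120),
    Candidate("node 3", 15, 150),
]
a1_choice = select_target(around_source_1)                     # distance only
a2_choice = select_target(around_source_1, min_bandwidth=100)  # distance + bandwidth

# Source node 2 from the A3 example (assumed D = 100, F = 0.5).
around_source_2 = [
    Candidate("node A", 12, 120, load=0.8),
    Candidate("node B", 15, 120, load=0.3),
]
a3_choice = select_target(around_source_2, min_bandwidth=100, max_load=0.5)

# Two sources whose nearest node is the same: the second one falls back.
pairs = assign_targets({
    "s1": [Candidate("n1", 1, 120), Candidate("n2", 5, 120)],
    "s2": [Candidate("n1", 2, 120), Candidate("n2", 3, 120)],
})
```

The greedy, in-order assignment in `assign_targets` is one simple way to keep targets distinct; the application does not prescribe how conflicts between source nodes are resolved.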
When only one target node is set for each source node, idle source nodes in the current source node list can be monitored in real time, and corresponding target nodes are determined only for the idle source nodes. After a source node that is transmitting data finishes the transmission, it is treated as an idle source node in response to the transmission-finished signal it sends. For example, an idle or non-idle flag can be added to each source node, so that a target node can be determined for an idle source node promptly, further improving the efficiency of data distribution.
S1022: Each source node in the current source node list respectively sends a target sending file to the corresponding target node.
After the target node of a source node is determined, the source node sends the target sending file to the target node. The mode adopted in the data transmission process is not limited in this embodiment and may be, for example, serial packet sending or parallel packet sending.
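The idle and non-idle flags mentioned above could be kept in a small registry such as the one below. The class and method names are hypothetical; a real system would update the flags from the transmission-finished signals sent by the source nodes.

```python
class SourceRegistry:
    """Tracks which source nodes are idle and may be given a new target."""

    def __init__(self):
        self._idle = {}

    def add(self, source):
        self._idle[source] = True   # a fresh source node starts idle

    def mark_busy(self, source):
        self._idle[source] = False  # a transfer has started

    def on_transfer_finished(self, source):
        self._idle[source] = True   # the transmission-finished signal arrived

    def idle_sources(self):
        return [s for s, idle in self._idle.items() if idle]


registry = SourceRegistry()
registry.add("s1")
registry.add("s2")
registry.mark_busy("s1")
idle_during_transfer = registry.idle_sources()
registry.on_transfer_finished("s1")
idle_after_transfer = registry.idle_sources()
```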
In a possible implementation manner, when each source node sends the target sending file to its corresponding target node, network transmission information is monitored in real time during the transmission, and whether to terminate the current transmission process between the source node and its corresponding target node is determined according to the network transmission information. The network transmission information may include network communication fault information, the transmission duration and the like.
By way of example, network communication fault information is monitored in real time while any source node sends the target sending file to its corresponding target node; if network communication fault information is detected, the current transmission process is terminated, and a target node is reselected for the source node from the plurality of nodes in the current node list to be distributed.
When network communication fault information is detected during data transmission, the communication environment between the source node and its corresponding target node is not suitable for data transmission and the transmission is likely to fail, so the current transmission process can be terminated and a target node can be reselected for the source node from the plurality of nodes in the current node list to be distributed.
By way of example, the transmission duration can also be monitored in real time while any source node sends the target sending file to its corresponding target node; if the transmission duration is detected to be greater than the transmission duration threshold, the current transmission process is terminated, and a target node is reselected for the source node from the plurality of nodes in the current node list to be distributed.
When the transmission duration is detected to be greater than the transmission duration threshold during data transmission, the communication network between the source node and its corresponding target node is poor, or the source node or the target node has failed, so the current transmission process can be terminated and a target node can be reselected for the source node from the plurality of nodes in the current node list to be distributed.
By timely detecting network transmission information in the process of transmitting data from a source node to a target node, the data transmission object of the source node can be timely adjusted, the time consumption is reduced, and the efficiency and quality of data distribution are integrally improved.
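The termination rule (a reported fault, or a duration above the threshold) and the subsequent reselection can be sketched as follows. The callback `attempt_transfer`, which returns a `(fault_reported, elapsed_seconds)` pair, is a hypothetical stand-in for the real-time monitor; an actual implementation would abort the transfer while it is still in progress rather than inspect the outcome afterwards.

```python
def should_terminate(fault_reported, elapsed_seconds, max_seconds):
    """Terminate on a reported network fault or when the duration threshold is exceeded."""
    return fault_reported or elapsed_seconds > max_seconds


def send_with_reselection(source, candidates, attempt_transfer, max_seconds):
    """Try candidate targets in order; return the first that completes, else None."""
    for target in candidates:
        fault_reported, elapsed_seconds = attempt_transfer(source, target)
        if not should_terminate(fault_reported, elapsed_seconds, max_seconds):
            return target
    return None


# Hypothetical observations: n1 reports a fault, n2 is too slow, n3 succeeds.
observed = {"n1": (True, 2.0), "n2": (False, 45.0), "n3": (False, 8.0)}
receiver = send_with_reselection(
    "s1", ["n1", "n2", "n3"], lambda s, t: observed[t], max_seconds=30.0
)
```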
The process of re-determining the target node for a source node may be any of steps A1-A3.
In a possible implementation, for any source node, if no target node corresponding to the source node has been determined after all nodes in the current node list to be distributed have been polled, the source node is no longer suitable for data distribution, and continuing to search for a target node for it would consume a great deal of time. The source node is therefore deleted from the source node list, which reduces the data processing cost and further improves the efficiency and quality of data distribution.
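The polling and deletion rule above can be sketched as follows. This is an illustrative sketch only: `try_send` is a hypothetical callback that returns `True` when the source can deliver the file to the candidate, and the candidates are polled in list order rather than by the selection criteria of steps A1-A3.

```python
from typing import Callable, List, Optional


def find_target_by_polling(
    source: str,
    pending: List[str],
    try_send: Callable[[str, str], bool],
) -> Optional[str]:
    """Poll the nodes to be distributed in order; return the first reachable one."""
    for candidate in list(pending):
        if try_send(source, candidate):
            return candidate
    return None


def reselect_or_drop(
    source: str,
    sources: List[str],
    pending: List[str],
    try_send: Callable[[str, str], bool],
) -> Optional[str]:
    """Reselect a target for `source`; delete the source if every candidate fails."""
    target = find_target_by_polling(source, pending, try_send)
    if target is None and source in sources:
        # All nodes were polled without success: the source is no longer suitable.
        sources.remove(source)
    return target
```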
S1023: deleting the target node successfully receiving the target sending file from the current node list to be distributed and adding the target node into the current source node list to obtain an updated source node list and an updated node list to be distributed.
Specifically, after the target node successfully receives the target transmission file, the target node is marked as a source node and added into the current source node list in response to a data reception success signal sent by the target node, an updated source node list is obtained, and the target node is deleted from the current node list to be distributed, so that an updated node list to be distributed is obtained.
S103: and executing the data distribution strategy again according to the updated source node list and the updated node list to be distributed.
In this embodiment, every source node in the source node list holds the target sending file, and each source node sends the target sending file to its corresponding target node. A target node that successfully receives the target sending file also becomes a source node and is added to the source node list; in the next round of distribution, it distributes the target sending file together with the other source nodes in the source node list. This implements decentralized data distribution, avoids the distribution failures caused by the single point of failure in centralized data distribution methods, and improves the reliability and robustness of data distribution. Any node that holds the target sending file can become a distribution node, and the target nodes of the source nodes are not repeated, which improves the efficiency of data distribution and thus provides a data distribution method with better performance.
In a possible embodiment, the method further comprises a process of adding a new node or processing a failed node.
Specifically, in response to a joining request from a new node, the new node is added to the node list to be distributed. When a fault signal of a fault node is detected, the fault node is deleted from the current source node list or the current node list to be distributed, and detection of a fault recovery signal from the fault node continues. In response to the fault recovery signal of the fault node, whether the target sending file exists in the fault node is checked: if the target sending file exists in the fault node, the fault node is added to the source node list; if the target sending file does not exist in the fault node, the fault node is added to the node list to be distributed.
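The join, fault and recovery handling described above amounts to a small amount of bookkeeping on the two lists. The sketch below shows one way to express it; the class name `Membership`, its method names, and the `has_file` callback are hypothetical and stand in for whatever signalling the real system uses.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Membership:
    sources: List[str] = field(default_factory=list)  # nodes that hold the file
    pending: List[str] = field(default_factory=list)  # nodes to be distributed

    def on_join(self, node: str) -> None:
        """A new node always starts in the node list to be distributed."""
        if node not in self.sources and node not in self.pending:
            self.pending.append(node)

    def on_fault(self, node: str) -> None:
        """A fault node is removed from whichever list currently holds it."""
        if node in self.sources:
            self.sources.remove(node)
        if node in self.pending:
            self.pending.remove(node)

    def on_recovery(self, node: str, has_file: Callable[[str], bool]) -> None:
        """A recovered node rejoins as a source only if it still has the file."""
        bucket = self.sources if has_file(node) else self.pending
        if node not in bucket:
            bucket.append(node)
```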
Referring to fig. 3, a flowchart of an implementation of the data distribution method provided by the embodiment of the present application is shown. In a feasible implementation, when data is distributed, a source node is selected first, and the source node list and the node list to be distributed are then initialized. Whether all nodes in the target node cluster have downloaded the file is then judged; if not, a target node corresponding to the source node is selected from the node list to be distributed, and whether a communication fault or network unreachability occurs during data transmission is detected.
If a communication fault occurs or the network is unreachable, a new target node is selected for the source node; if no suitable target node exists, the process ends, and if a suitable target node exists, the flow returns to selecting the target node corresponding to the source node.
If there is no communication fault and the network is reachable, the source node sends the file to the target node, and the target node that receives the file is added to the source node list as a source node. Whether all nodes in the target node cluster have downloaded the file is then judged again, and the cycle repeats until all nodes in the target node cluster have downloaded the file, at which point the process ends.
Assume that the set of all nodes is S1, the selected source node is a1, the node list to be distributed is L, and the source node list is M; then:
S1={a1,a2,a3,a4,a5...an}
L={a2,a3,a4,a5...an}
M={a1}
If a2 is selected as the target node, the source node a1 sends the file to the target node a2; if the sending succeeds, the target node a2 is marked as a source node and stored in M, and then:
M={a1,a2}
L={a3,a4,a5...an}
This loops until all of a1 to an have the file.
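The evolution of M and L in this example can be reproduced with a short simulation. It is a simplified sketch under stated assumptions: every transfer succeeds, each source sends to exactly one target per round, and sources are paired with pending nodes in list order rather than by distance, bandwidth or load. The function name `simulate_rounds` is hypothetical.

```python
from typing import List, Tuple


def simulate_rounds(nodes: List[str]) -> List[List[Tuple[str, str]]]:
    """Return, per round, the (source, target) pairs until every node has the file."""
    sources = nodes[:1]   # M = {a1}
    pending = nodes[1:]   # L = {a2, ..., an}
    rounds = []
    while pending:
        # Each source gets one distinct target; zip stops at the shorter list.
        pairs = list(zip(sources, pending))
        receivers = [target for _, target in pairs]
        pending = pending[len(pairs):]   # delete receivers from L
        sources = sources + receivers    # add receivers to M
        rounds.append(pairs)
    return rounds
```

With eight nodes a1 to a8, the first round pairs a1 with a2, the second pairs a1 with a3 and a2 with a4, and the third covers the remaining four nodes, so the source list doubles in every round.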
Referring to fig. 4, a schematic diagram of data distribution provided by the embodiment of the present application is shown. The data distribution method provided by this embodiment is in effect a hierarchical streaming data distribution process that distributes data layer by layer: in the first layer, a1 distributes data to a2; in the second layer, a1 distributes data to a3 and a2 distributes data to a4; in the third layer, a1 distributes data to a6, a2 distributes data to a5, a3 distributes data to a9, and a4 distributes data to a7; and so on, until each node in the target node cluster has received the target sending file.
In an ideal case, if each node distributes files to two new nodes exactly at the same time and with the same time consumption, a recursion can be derived to calculate the number of iterations needed to complete the distribution:
f(n) = f(n/2) + 1, where n represents the number of nodes that have not yet downloaded the target sending file, and f(n) represents the number of iterations required to complete distribution.
The initial condition is f(1) = 0, i.e., when only one node needs to download a file, the distribution has been completed.
The correctness of the above recursion can be demonstrated by mathematical induction. Assume that the recursion holds for all positive integers n less than or equal to k, and consider the case n = 2k.
First, by the assumption, f(k) = f(k/2) + 1, meaning that k nodes can complete distribution in f(k) iterations. These k nodes are then considered as two parts: a first half and a second half, each containing k/2 nodes.
Each node in the first part selects two new nodes to distribute the file to, so that the k/2 nodes of the first part transmit the file to k nodes in total; the same is true for each node of the second part, so that 2k nodes receive the file in total.
Since the distribution processes of the first half and the second half are independent and run in parallel, covering 2k nodes takes only one iteration more than covering k nodes, i.e., f(2k) = f(k) + 1, which agrees with the recursion and completes the proof.
Thus, the number of iterations required to complete the distribution can be derived as f(N) = log2(N). This matches the binary tree structure, in which the tree height is log2(N), so the number of iterations required is also log2(N).
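As a worked check of this closed form, the recursion f(n) = f(n/2) + 1 with f(1) = 0 can be unrolled directly. The derivation below assumes that n is a power of two, n = 2^m, which the text does not state explicitly.

```latex
\begin{aligned}
f(2^m) &= f(2^{m-1}) + 1 \\
       &= f(2^{m-2}) + 2 \\
       &\;\;\vdots \\
       &= f(2^{0}) + m \\
       &= f(1) + m = m = \log_2 n .
\end{aligned}
```

When n is not a power of two, the result depends on how n/2 is rounded; if it is rounded up at each step, the same unrolling gives the ceiling of log2(n).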
However, the data distribution method provided in this embodiment differs from binary-tree-based file distribution in three essential respects:
The algorithm ideas are different: a file distribution algorithm based on a binary tree structure takes a central node as the starting point, establishes bidirectional connections between the nodes, and distributes data level by level; except in the last layer, every node in each intermediate layer distributes the file to two nodes, which increases the load on the distributing nodes.
Algorithm efficiency is different: in this embodiment, all nodes that have already received the file are reused to set up distribution tasks toward the nodes that have not yet received it, whereas file distribution based on a binary tree structure does not reuse the earlier nodes and is therefore less efficient. Compared with a distribution method based on a binary tree structure, the method avoids distributing data to the same node multiple times by intelligently selecting target nodes and source nodes, which reduces the access load on the nodes and further improves distribution efficiency.
Robustness is different: the data distribution method in this embodiment adopts a source node list management mechanism; when the network between some nodes is unreachable or fails, an available node can be reselected as the target node, which improves the robustness of data distribution.
Compared with traditional FTP, the method is a decentralized algorithm with no single point of failure, in which every node can connect to every other node and exchange data. The method also adopts an intelligent target node selection algorithm and a source node list management mechanism, so that data can be distributed more efficiently and reliably.
The hierarchical data distribution method in this embodiment is suitable for scenarios in which large files such as video, audio, or software installation packages are distributed to a plurality of nodes. It can effectively reduce the network load between source nodes and target nodes, reduce network congestion, and improve data transmission efficiency. It is also suitable for scenarios that require rapid transmission of large amounts of data, such as live video streaming, video on demand, software upgrades, and data backup.
Referring to fig. 5, a functional block diagram of a data distribution device according to an embodiment of the present application is shown, where the device includes:
a first execution module 100, configured to execute a data distribution policy, where the data distribution policy includes: respectively determining target nodes corresponding to each source node in the current source node list in a plurality of nodes of the current node list to be distributed, wherein the target nodes corresponding to any two source nodes are not repeated; each source node in the current source node list respectively sends a target sending file to a corresponding target node; deleting the target node successfully receiving the target sending file from the current node list to be distributed and adding the target node into the current source node list to obtain an updated source node list and an updated node list to be distributed;
and the second execution module 200 is configured to execute the data distribution policy again according to the updated source node list and the updated node list to be distributed.
Optionally, the apparatus further comprises:
the initialization module is used for determining an initial source node, constructing an initialization source node list and an initialization node list to be distributed, wherein the initialization source node list comprises the initial source node, and the initialization node list to be distributed comprises all nodes except the initial source node in a target node cluster.
Optionally, the initialization module includes:
the first initial source node determining unit is used for acquiring the current performance states and the current load states of all nodes in the target node cluster; and taking the node with the best current performance state and the smallest current load state of all the nodes as an initial source node.
Optionally, the initialization module includes:
the second initial source node determining unit is used for taking the node meeting the preset condition as an initial source node in all the nodes of the target node cluster.
Optionally, the second initial source node determining unit includes:
a random selection subunit, configured to obtain respective current performance states and current load states of all nodes in the target node cluster; and randomly selecting one node from all nodes of which the current performance state meets the performance preset condition and the current load state meets the load preset condition as an initial source node.
Optionally, the initialization module includes:
and the third initial source node determining unit is used for determining the initial source node in response to the selection operation of the initial source node.
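Two of the initial source node strategies listed above (best performance with smallest load, and random choice among nodes that meet preset conditions) can be sketched as follows. The `NodeState` fields and both function names are hypothetical, and the source does not say how performance and load are combined; this sketch assumes performance is compared first and load breaks ties.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class NodeState:
    name: str
    performance: float  # hypothetical score, higher is better
    load: float         # hypothetical score, lower is better


def pick_best(nodes: List[NodeState]) -> NodeState:
    """Highest performance first; ties are broken by the lowest load."""
    return max(nodes, key=lambda n: (n.performance, -n.load))


def pick_random_eligible(
    nodes: List[NodeState],
    min_performance: float,
    max_load: float,
    rng: Optional[random.Random] = None,
) -> Optional[NodeState]:
    """Pick randomly among nodes meeting both preset conditions, or None."""
    rng = rng if rng is not None else random.Random()
    eligible = [
        n for n in nodes
        if n.performance >= min_performance and n.load <= max_load
    ]
    return rng.choice(eligible) if eligible else None
```

The third strategy, in which an operator selects the initial source node, needs no selection logic and is omitted here.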
Optionally, the first execution module includes:
the target node determining unit is used for determining, among the plurality of nodes in the current node list to be distributed, a target node for each source node in the current source node list.
Optionally, the target node determining unit includes:
the first determining subunit is configured to, for any source node in the current source node list, select, from multiple nodes in the current node list to be distributed, a node with a minimum distance from the source node as a target node of the source node, where the target nodes corresponding to any two source nodes are different.
Optionally, the target node determining unit includes:
and the second determining subunit is used for selecting, for any source node in the current source node list, a node which has the smallest distance with the source node and has a bandwidth larger than a bandwidth calibration value from among the plurality of nodes in the current node list to be distributed as a target node of the source node, wherein the target nodes corresponding to any two source nodes are different.
Optionally, the target node determining unit includes:
and the third determining subunit is configured to select, for any source node in the current source node list, a node with a minimum distance from the source node, a bandwidth greater than a bandwidth calibration value and a load less than a load calibration value from among multiple nodes in the current node list to be distributed, as a target node of the source node, where the target nodes corresponding to any two source nodes are different.
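The three determining subunits differ only in which filters are applied before the nearest node is chosen, so they can be sketched with one function and two optional thresholds. This is an illustrative sketch: `Candidate`, `assign_targets` and the `distance` callback are hypothetical names, "smallest distance" is read as nearest among the nodes that pass the filters, and sources are served greedily in list order, which the source does not specify.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Candidate:
    name: str
    bandwidth: float
    load: float


def assign_targets(
    sources: List[str],
    pending: List[Candidate],
    distance: Callable[[str, str], float],
    min_bandwidth: Optional[float] = None,
    max_load: Optional[float] = None,
) -> Dict[str, str]:
    """Map each source to a distinct target; sources with no eligible node are skipped."""
    taken = set()
    plan: Dict[str, str] = {}
    for src in sources:
        eligible = [
            c for c in pending
            if c.name not in taken  # targets of any two sources must differ
            and (min_bandwidth is None or c.bandwidth > min_bandwidth)
            and (max_load is None or c.load < max_load)
        ]
        if not eligible:
            continue
        best = min(eligible, key=lambda c: distance(src, c.name))
        taken.add(best.name)
        plan[src] = best.name
    return plan
```

Leaving both thresholds unset corresponds to the first subunit, setting only `min_bandwidth` to the second, and setting both to the third.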
Optionally, the first execution module includes:
and the transmission detection module is used for monitoring, in real time, network transmission information while any source node sends the target sending file to its corresponding target node, and determining, according to the network transmission information, whether to terminate the current transmission process between the source node and its corresponding target node.
Optionally, the transmission detection module includes:
the first transmission detection unit is used for monitoring network communication fault information in the process that any source node sends the target sending file to a target node corresponding to the source node in real time;
and if the network communication fault information is monitored, terminating the current transmission process, and reselecting a target node for the source node from a plurality of nodes in the current node list to be distributed.
Optionally, the transmission detection module includes:
the second transmission detection unit is used for monitoring, in real time, the transmission duration while any source node sends the target sending file to its corresponding target node;
and if the transmission duration is greater than the transmission duration threshold, terminating the current transmission process, and reselecting a target node for the source node from a plurality of nodes in the current node list to be distributed.
Optionally, the apparatus further comprises:
and the deleting module is used for deleting any source node from the source node list if the target node corresponding to the source node is not determined after all nodes of the current node list to be distributed are polled.
Optionally, the apparatus further comprises:
and the adding module is used for responding to the adding request of the new node and adding the new node into the node list to be distributed.
Optionally, the apparatus further comprises:
the fault processing module is used for detecting fault signals of the fault nodes and deleting the fault nodes from the current source node list or the current node list to be distributed.
Optionally, the apparatus further comprises:
the fault recovery module is used for responding to the fault recovery signal of the fault node and checking whether the target sending file exists in the fault node; if the target sending file exists in the fault node, adding the fault node into the source node list; and if the target sending file does not exist in the fault node, adding the fault node into the node list to be distributed.
The embodiment of the application also provides a computer device, which comprises: at least one processor, and a memory storing a computer program executable on the processor, wherein the processor performs the data distribution method according to the embodiment when executing the program.
The present application also provides a non-volatile readable storage medium storing a computer program, wherein the computer program, when executed by a processor, performs the data distribution method according to the embodiment.
In this specification, the embodiments are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may be referred to one another.
It will be apparent to those skilled in the art that embodiments of the present application may be provided as a method, apparatus, or computer program product. Accordingly, the present embodiments may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present application may take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
Embodiments of the present application are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal device, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications to those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the present application.
Finally, it is further noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal device that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article or terminal device comprising the element.
The principles and embodiments of the present application are described herein with specific examples; the above examples are provided only to assist in understanding the method of the present application and its core ideas. Meanwhile, those skilled in the art may, in accordance with the ideas of the present application, make changes to the specific embodiments and the scope of application. In view of the above, the content of this description should not be construed as limiting the present application.

Claims (19)

1. A method of data distribution, the method comprising:
executing a data distribution policy, the data distribution policy comprising:
determining, among a plurality of nodes in the current node list to be distributed, one target node corresponding to each source node in the current source node list so as to improve the data transmission efficiency, wherein the process of selecting the target node corresponding to any source node comprises any one of: selecting the node with the smallest distance from the source node as the target node of the source node; selecting the node with the smallest distance from the source node and with a bandwidth larger than a bandwidth calibration value as the target node of the source node; and selecting the node with the smallest distance from the source node, with a bandwidth larger than the bandwidth calibration value and with a load smaller than a load calibration value as the target node of the source node; and wherein the target nodes corresponding to any two source nodes are not repeated;
Each source node in the current source node list respectively sends a target sending file to a corresponding target node;
deleting the target node successfully receiving the target sending file from the current node list to be distributed and adding the target node into the current source node list to obtain an updated source node list and an updated node list to be distributed;
and executing the data distribution strategy again according to the updated source node list and the updated node list to be distributed.
2. The method of claim 1, wherein prior to executing the data distribution policy, the method further comprises:
determining an initial source node, and constructing an initial source node list and an initial node list to be distributed, wherein the initial source node list comprises initial source nodes, and the initial node list to be distributed comprises all nodes except the initial source nodes in a target node cluster.
3. The method of claim 2, wherein determining the initial source node comprises:
acquiring respective current performance states and current load states of all nodes in the target node cluster;
And taking the node with the best current performance state and the smallest current load state of all the nodes as an initial source node.
4. The method of claim 2, wherein determining the initial source node comprises:
and taking the node meeting the preset condition as an initial source node among all the nodes of the target node cluster.
5. The method according to claim 4, wherein among all nodes of the target node cluster, a node satisfying a preset condition is taken as an initial source node, comprising:
acquiring respective current performance states and current load states of all nodes in the target node cluster;
and randomly selecting one node from all nodes of which the current performance state meets the performance preset condition and the current load state meets the load preset condition as an initial source node.
6. The method of claim 2, wherein determining the initial source node comprises:
and determining the initial source node in response to the selection operation of the initial source node.
7. The method of claim 1, wherein among the plurality of nodes of the current list of nodes to be distributed, determining a target node for each source node in the current list of source nodes comprises:
And for any source node in the current source node list, selecting a node with the smallest distance with the source node from a plurality of nodes in the current node list to be distributed as a target node of the source node, wherein the target nodes corresponding to any two source nodes are different.
8. The method of claim 1, wherein among the plurality of nodes of the current list of nodes to be distributed, determining a target node for each source node in the current list of source nodes comprises:
and for any source node in the current source node list, selecting a node which has the smallest distance with the source node and has the bandwidth larger than a bandwidth calibration value from a plurality of nodes in the current node list to be distributed as a target node of the source node, wherein the target nodes corresponding to any two source nodes are different.
9. The method of claim 1, wherein among the plurality of nodes of the current list of nodes to be distributed, determining a target node for each source node in the current list of source nodes comprises:
and for any source node in the current source node list, selecting a node with the minimum distance from the source node, the bandwidth larger than a bandwidth calibration value and the load smaller than a load calibration value from a plurality of nodes in the current node list to be distributed as a target node of the source node, wherein the target nodes corresponding to any two source nodes are different.
10. The method according to claim 1, wherein each source node in the current source node list sends a target send file to a respective corresponding target node, respectively, comprising:
monitoring, in real time, network transmission information while any source node sends the target sending file to its corresponding target node, and determining, according to the network transmission information, whether to terminate the current transmission process between the source node and its corresponding target node.
11. The method according to claim 10, wherein the network transmission information includes network communication fault information, and wherein monitoring, in real time, network transmission information while any source node sends the target sending file to its corresponding target node, and determining, according to the network transmission information, whether to terminate the current transmission process between the source node and its corresponding target node, comprises:
monitoring network communication fault information in the process that any source node sends the target sending file to a target node corresponding to the source node in real time;
and if the network communication fault information is monitored, terminating the current transmission process, and reselecting a target node for the source node from a plurality of nodes in the current node list to be distributed.
12. The method according to claim 10, wherein the network transmission information includes a transmission duration, and wherein monitoring, in real time, network transmission information while any source node sends the target sending file to its corresponding target node, and determining, according to the network transmission information, whether to terminate the current transmission process between the source node and its corresponding target node, comprises:
monitoring, in real time, the transmission duration while any source node sends the target sending file to its corresponding target node;
and if the transmission duration is greater than the transmission duration threshold, terminating the current transmission process, and reselecting a target node for the source node from a plurality of nodes in the current node list to be distributed.
13. The method according to claim 11 or 12, characterized in that the method further comprises:
and for any source node, if no target node corresponding to the source node has been determined after all nodes of the current node list to be distributed have been polled, deleting the source node from the source node list.
14. The method according to claim 1, wherein the method further comprises:
And responding to a joining request of a new node, and adding the new node into a node list to be distributed.
15. The method according to claim 1, wherein the method further comprises:
detecting a fault signal of a fault node, and deleting the fault node from the current source node list or the current node list to be distributed.
16. The method of claim 15, wherein after detecting the fault signal of the fault node and deleting the fault node from the current source node list or the current node list to be distributed, the method further comprises:
responding to a fault recovery signal of the fault node, and checking whether the target transmission file exists in the fault node;
if the target sending file exists in the fault node, adding the fault node into the source node list;
and if the target sending file does not exist in the fault node, adding the fault node into the node list to be distributed.
17. A data distribution apparatus, the apparatus comprising:
the first execution module is used for executing a data distribution strategy, wherein the data distribution strategy comprises the following steps: determining, among a plurality of nodes in the current node list to be distributed, one target node corresponding to each source node in the current source node list so as to improve the data transmission efficiency, wherein the process of selecting the target node corresponding to any source node comprises any one of: selecting the node with the smallest distance from the source node as the target node of the source node; selecting the node with the smallest distance from the source node and with a bandwidth larger than a bandwidth calibration value as the target node of the source node; and selecting the node with the smallest distance from the source node, with a bandwidth larger than the bandwidth calibration value and with a load smaller than a load calibration value as the target node of the source node; and wherein the target nodes corresponding to any two source nodes are not repeated; each source node in the current source node list respectively sends a target sending file to a corresponding target node; deleting the target node successfully receiving the target sending file from the current node list to be distributed and adding the target node into the current source node list to obtain an updated source node list and an updated node list to be distributed;
And the second execution module is used for executing the data distribution strategy again according to the updated source node list and the updated node list to be distributed.
18. A computer device, comprising: at least one processor, and a memory storing a computer program executable on the processor, wherein the processor, when executing the computer program, performs the data distribution method according to any one of claims 1-16.
19. A non-transitory readable storage medium storing a computer program, wherein the computer program, when executed by a processor, performs the data distribution method according to any one of claims 1-16.
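The strategy recited in claims 16 and 17 can be illustrated with a minimal Python sketch. It is not the patented implementation. The Node class, the one-dimensional position used as the distance measure, the send callback, and the names pick_target, run_round, distribute, and on_recovery are all assumptions introduced here for illustration. The sketch shows the three selection variants (nearest node; nearest node above a bandwidth calibration value; nearest node above the bandwidth calibration value and below a load calibration value), the rule that no two source nodes share a target node, the promotion of successful receivers to source nodes, and the handling of a recovered fault node.

```python
from dataclasses import dataclass


@dataclass(eq=False)  # identity comparison, so list.remove() drops the exact node
class Node:
    name: str
    position: float            # hypothetical 1-D coordinate standing in for "distance"
    bandwidth: float = 100.0   # illustrative units
    load: float = 0.0
    has_file: bool = False


def pick_target(source, candidates, min_bandwidth=None, max_load=None):
    """Return the candidate nearest to `source`, optionally requiring
    bandwidth > min_bandwidth and load < max_load (the calibration values)."""
    eligible = [
        n for n in candidates
        if (min_bandwidth is None or n.bandwidth > min_bandwidth)
        and (max_load is None or n.load < max_load)
    ]
    if not eligible:
        return None
    return min(eligible, key=lambda n: abs(n.position - source.position))


def run_round(sources, pending, send, **limits):
    """Execute the strategy once and return the updated (sources, pending)."""
    remaining = list(pending)
    assignments = []
    for src in sources:
        target = pick_target(src, remaining, **limits)
        if target is None:
            continue
        remaining.remove(target)   # targets of any two sources never repeat
        assignments.append((src, target))

    new_sources, new_pending = list(sources), list(pending)
    for src, target in assignments:
        if send(src, target):      # `send` is an assumed transfer callback
            target.has_file = True
            new_pending.remove(target)
            new_sources.append(target)
    return new_sources, new_pending


def distribute(sources, pending, send, **limits):
    """Re-execute the strategy on the updated lists until nothing is pending."""
    rounds = 0
    while pending:
        new_sources, new_pending = run_round(sources, pending, send, **limits)
        if len(new_pending) == len(pending):
            break                  # no progress in this round; avoid looping forever
        sources, pending = new_sources, new_pending
        rounds += 1
    return sources, pending, rounds


def on_recovery(node, sources, pending):
    """Claim 16: a recovered node rejoins as a source if it holds the file,
    otherwise it rejoins the list of nodes to be distributed."""
    if node.has_file:
        sources.append(node)
    else:
        pending.append(node)
```

With one initial source node, seven nodes to be distributed, and a send callback that always succeeds, distribute finishes in three rounds, because the number of source nodes doubles each round (1, 2, 4, 8). Passing min_bandwidth or max_load switches between the selection variants without changing the rest of the loop.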
CN202311514857.0A 2023-11-14 2023-11-14 Data distribution method, device, equipment and storage medium Active CN117240851B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311514857.0A CN117240851B (en) 2023-11-14 2023-11-14 Data distribution method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311514857.0A CN117240851B (en) 2023-11-14 2023-11-14 Data distribution method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN117240851A (en) 2023-12-15
CN117240851B (en) 2024-02-20

Family

ID=89093379

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311514857.0A Active CN117240851B (en) 2023-11-14 2023-11-14 Data distribution method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117240851B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118694759A (en) * 2024-08-26 2024-09-24 南京普天通信股份有限公司 Distributed multi-element data rapid transmission method

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102833293A (en) * 2011-06-17 2012-12-19 腾讯科技(深圳)有限公司 Method for downloading resources in peer to server and peer (P2SP) network, and client
CN106993054A (en) * 2017-05-05 2017-07-28 腾讯科技(深圳)有限公司 Document distribution method, node and system
CN109040308A (en) * 2018-09-12 2018-12-18 杭州趣链科技有限公司 A kind of document distribution system and document distribution method based on IPFS
CN110891079A (en) * 2018-09-11 2020-03-17 北京奇虎科技有限公司 File distribution method, source server, node server and file distribution system
CN115102986A (en) * 2022-06-15 2022-09-23 之江实验室 Internet of things data distribution and storage method and system in edge environment
CN115665135A (en) * 2022-10-28 2023-01-31 中银金融科技有限公司 File distribution method and related device

Also Published As

Publication number Publication date
CN117240851A (en) 2023-12-15

Similar Documents

Publication Publication Date Title
US10674486B2 (en) System, security and network management using self-organizing communication orbits in distributed networks
CN117240851B (en) Data distribution method, device, equipment and storage medium
CN108023812B (en) Content distribution method and device of cloud computing system, computing node and system
US20070230415A1 (en) Methods and apparatus for cluster management using a common configuration file
US20130110931A1 (en) Scalable Peer To Peer Streaming For Real-Time Data
US7764683B2 (en) Reliable multicast operating system (OS) provisioning
US20110126256A1 (en) Method for live broadcasting in a distributed network and apparatus for the same
US10931529B2 (en) Terminal device management method, server, and terminal device for managing terminal devices in local area network
WO2023077968A1 (en) On-board communication method, apparatus and device, and storage medium
KR101276155B1 (en) Method, apparatus, and server for spreading file transfer notifications in time
CN105357048A (en) Method and system for data synchronization of network equipment
CN115328579B (en) Scheduling method and system for neural network training and computer readable storage medium
CN107547374B (en) Aggregation route processing method and device
WO2016070566A1 (en) Cloud terminal upgrade method and system, network management server and proxy server
CN112202877A (en) Gateway linkage method, gateway, cloud server and user terminal
US10225339B2 (en) Peer-to-peer (P2P) network management system and method of operating the P2P network management system
CN109189403B (en) Operating system OS batch installation method and device and network equipment
CN114090342A (en) Storage disaster tolerance link management method, message execution node and storage control cluster
US8566681B1 (en) Distributed data distribution
CN116684416A (en) Mirror image distribution method, device and system in network element cluster
US12052162B2 (en) Systems and methods for path determination in a network
CN108632066B (en) Method and device for constructing video multicast virtual network
CN112910997B (en) Resource acquisition method of local area network
CN117499017B (en) Block chain network transmission method, system, storage medium and terminal equipment
CN102201965A (en) Data transmission method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant