CN115242782A - Large file fragment transmission method and transmission architecture between super-computing centers - Google Patents

Large file fragment transmission method and transmission architecture between super-computing centers

Info

Publication number
CN115242782A
CN115242782A CN202211148476.0A
Authority
CN
China
Prior art keywords
size
super
fragment
sender
receiver
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211148476.0A
Other languages
Chinese (zh)
Other versions
CN115242782B (en)
Inventor
余冬冬
俞圣亮
方启明
秦亦
孔丽娟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Lab
Original Assignee
Zhejiang Lab
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Lab filed Critical Zhejiang Lab
Priority to CN202211148476.0A priority Critical patent/CN115242782B/en
Publication of CN115242782A publication Critical patent/CN115242782A/en
Application granted granted Critical
Publication of CN115242782B publication Critical patent/CN115242782B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00: Network arrangements or protocols for supporting network services or applications
    • H04L 67/01: Protocols
    • H04L 67/06: Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention relates to the technical field of supercomputer resource management, and discloses a large-file fragment transmission method and transmission architecture between supercomputing centers. The method comprises the following steps: step one, collect packet-transmission file data between supercomputing centers and initialize the supercomputing center serving as the sender; step two, after the initialization work is finished, the sender's supercomputing center obtains an initial state, dynamically adjusts the fragment size of the file data using a reinforcement learning algorithm, then fragments the file data according to that size and transmits it to the receiver's supercomputing center; step three, the receiver's supercomputing center sends transmission feedback to the sender's supercomputing center according to its receiving state; step four, update and check the size of the remaining file data to judge whether the file data has been completely transmitted, and if not, repeat steps two to four until the whole file data has been transmitted. The invention can effectively reduce the waste of system resources and improve overall system efficiency.

Description

Large file fragment transmission method and transmission architecture between super-computing centers
Technical Field
The invention relates to the technical field of resource management of supercomputers, in particular to a large file fragment transmission method and a transmission framework between supercomputer centers.
Background
Files transmitted between supercomputing centers are generally large files, and fragment transmission must be supported so that a transmission interruption does not force retransmission of the whole file. If the large-file fragment transmission mechanism can be well optimized, the overall performance of data communication between supercomputing centers will undoubtedly improve.
At present, the common file-fragmentation strategy is fixed-size fragmentation: the sender and receiver agree on the fragment size in advance, the sender then fragments the file and transmits the fragments one by one, and if a fragment transmission fails because of network instability, the sender retransmits that fragment. The fixed fragmentation strategy has a significant problem: the fragment size cannot be dynamically adjusted according to actual conditions. When the fragment is too large, once its transmission fails, subsequent retransmissions of the fragment may still fail, and the cost of retransmitting it becomes very high, wasting system network resources.
Therefore, a large file fragmentation strategy is needed to solve the problems in the above technical solutions.
Disclosure of Invention
In order to solve the technical problems in the prior art, the invention provides a large file fragment transmission method and a transmission architecture between supercomputing centers, which are used for improving the transmission efficiency of large files between supercomputing centers and reducing the consumption of network bandwidth of the supercomputing centers, and the specific technical scheme is as follows:
A large file fragment transmission method between supercomputing centers comprises the following steps:
step one, collecting the file data of packets transmitted among supercomputing centers, statistically calculating the average packet-sending rate, and initializing the supercomputing center serving as the sender;
step two, after the initialization work is finished, the sender's supercomputing center obtains an initial state, dynamically adjusts the fragment size of the file data using a reinforcement learning algorithm, and then fragments the file data according to that size and transmits it to the supercomputing center serving as the receiver;
step three, the receiver's supercomputing center sends transmission feedback to the sender's supercomputing center according to its own receiving state, and calculates the average packet-sending rate between the supercomputing centers according to the transmission result;
and step four, updating and checking the size of the remaining file data to judge whether the file data has been completely transmitted, and if not, repeating steps two to four until the whole file data has been transmitted.
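Purely as an illustrative sketch (not part of the claimed method), the four-step loop can be expressed with hypothetical callbacks `choose_shard`, `send_shard` and `learn` standing in for the reinforcement-learning, transmission and feedback steps:

```python
def transmit_file(total_size, choose_shard, send_shard, learn):
    """Sketch of the four-step loop: pick a fragment size, send it,
    feed the result back, and update the remaining size until done."""
    remaining = total_size
    while remaining > 0:                    # step four's completion test
        c = choose_shard(remaining)         # step two: pick the fragment size
        ok = send_shard(min(c, remaining))  # step two: fragment and transmit
        learn(ok)                           # step three: transmission feedback
        if ok:
            remaining -= c                  # step four: T_{t+1} = T_t - c_t
    return remaining                        # <= 0 once the file is fully sent
```

With a fixed 4-unit fragment on a lossless link, for example, an 8-unit file completes in two rounds.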
Further, the specific content of the initialization work is as follows: initializing a state set {S}, a shard set {C}, an expected reward Q(S, C), a reward mechanism R, a discount factor γ, a model learning rate α, and a sampling threshold ε.
The states of the state set {S} include: network running state, data residual size, sender resource load and receiver resource load;
the shard set {C} is the set of shard sizes that can be adopted, and supports various sharding strategies;
the expected reward Q(S, C) refers to the expected reward after fragmentation is carried out with fragment size C in each state S;
the reward mechanism R is set as follows: let r be the instant reward feedback; if a slicing strategy is adopted in the current state and the resulting average packet-sending rate is v_t, then r = v_t − v̄, where v̄ is the average packet-sending rate obtained by statistical calculation in step one;
the discount factor γ is used to weaken the reward feedback of future states on the current state, i.e. the larger the time interval between a future state and the current state, the smaller the reward feedback for the same reward value;
the model learning rate α is initially set such that 0 < α < 1;
the sampling threshold ε is a greedy-policy threshold, initially set such that 0 < ε < 1.
Further, the plurality of slicing strategies include:
arithmetic arrangement: the user specifies the maximum fragment size, the minimum fragment size and the interval between fragments, and the system automatically generates a fragment set between the minimum and maximum sizes, arranged from small to large at the specified interval;
geometric arrangement: the user specifies the maximum fragment size, the minimum fragment size and the growth ratio between fragments, and the system automatically generates a fragment set between the minimum and maximum sizes, arranged from small to large according to the growth ratio;
user-defined function: the user specifies the maximum fragment size, the minimum fragment size and a user-defined function, and the system generates integer values between the minimum and maximum sizes as the fragment set according to that function;
user-defined fragment sizes: the user manually enters each fragment size.
Further, the second step specifically includes the following sub-steps:
Step 2.1, after finishing the initialization work, the supercomputing center acting as the sender acquires the current state s_t, obtaining the remaining file size T_t, the sender resource load, the network running state and the receiver resource load;
Step 2.2, the fragment size of the file data is dynamically adjusted with the Q-learning reinforcement learning algorithm, specifically: the system model of the sender's supercomputing center randomly generates a number x in the range [0, 1];
if x > ε, the fragment size that yields the best expected reward is obtained according to argmax_c Q(s_t, c) and used as the size of the next fragment; if several fragment sizes are equally optimal, one of them is selected at random as the next fragment size c_t, where Q(s_t, c_t) denotes the expected reward received after adopting fragment size c_t in state s_t, and max_c Q(s_{t+1}, c) denotes the maximum expected reward obtainable in the state s_{t+1} reached from state s_t after adopting fragment size c;
if x ≤ ε, one fragment size is selected at random from the fragment set {C} as the size of the next fragment c_t;
Step 2.3, the sender's supercomputing center fragments the file data according to the fragment size c_t and transmits the fragments to the receiver's supercomputing center.
Further, the third step is specifically: if the receiver's supercomputing center receives the data successfully, it feeds back a reception success to the sender's supercomputing center; if the receiver's supercomputing center detects a data check error, it feeds back a reception failure to the sender's supercomputing center; if the sender's supercomputing center receives no feedback for a long time, i.e. a timeout error occurs, the result is handled as a reception failure fed back by the receiver's supercomputing center. When the transmission fails, the current average packet-sending rate is v_t = 0; when the transmission succeeds, the transmission start time t_{n1} and the transmission end time t_{n2} are recorded, and the average packet-sending rate is calculated as v_t = c_t / (t_{n2} − t_{n1}).
Further, the fourth step specifically includes:
if the receiver's supercomputing center receives the data successfully, i.e. the transmission succeeded, then T_{t+1} = T_t − c_t; otherwise, if the transmission failed, T_{t+1} = T_t, where T_i denotes the size of the file data remaining after i transmissions;
then judging whether the file data has been completely transmitted:
if T_{t+1} ≤ 0, the file data has been completely transmitted and the packet-sending process ends; if T_{t+1} > 0, transmission is not complete, so the current average packet-sending rate v_{t+1} is compared with the average packet-sending rate v̄ and the instant feedback reward is calculated as r_{t+1} = v_{t+1} − v̄;
the expected reward is then updated according to the reward mechanism:
Q(s_t, c_t) ← Q(s_t, c_t) + α[r_{t+1} + γ·max_c Q(s_{t+1}, c) − Q(s_t, c_t)]
where the left-hand Q(s_t, c_t) denotes the expected reward after adopting fragment size c_t in state s_t; the current state s_{t+1}, including the remaining file size T_{t+1}, is then acquired and updated, and steps two to four are repeated until the whole file has been transmitted successfully.
A transmission architecture of a large file fragmentation transmission method between super-computation centers comprises the super-computation center of a sender and the super-computation center of a receiver, wherein the super-computation center of the sender comprises a learner, a fragmenter and a transmitter, and the super-computation center of the receiver comprises a receiver and a feedback device;
the system comprises a slicer, a transmitter, a receiver, a learner, a feedback value acquisition module, a feedback value feedback module and a feedback module, wherein the slicer is used for acquiring the size of a file, cutting file data and outputting the sliced data to the transmitter; the learner is used for updating the state parameters according to the feedback provided by the transmitter, calculating the size of the next fragment and transmitting the value to the fragment device; the receiver is used for receiving the fragment data of the sender, verifying the data and sending a verification result to the feedback device; the feedback device receives the feedback state of the receiver and then sends the value to the sender of the sender.
A device for transmitting large files among supercomputing centers in a segmented mode comprises one or more processors and is used for achieving the method for transmitting the large files among the supercomputing centers in the segmented mode.
A computer readable storage medium, on which a program is stored, which when executed by a processor, implements the method for fragmented transmission of large files between supercomputing centers.
The invention has the advantages and beneficial effects that:
compared with the traditional fixed fragmentation mode, the method has the sensing capability on the external environment, can adapt to the change of the external environment better, accurately regulates and controls the size of fragments, and effectively reduces the waste of system resources; when the network is unstable and continuous repeated retransmission fails, the method can reduce the size of the fragments, quickly locate the size of the fragments which can be successfully transmitted, and avoid resource waste caused by repeated retransmission; when the network condition is stable, the method can increase the size of the fragments properly, quickly position the maximum size of the fragments which can ensure the successful transmission, and improve the overall system efficiency.
Drawings
Fig. 1 is a schematic block diagram of a large file fragmentation transmission architecture between supercomputing centers according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a file fragment transmission flow of an overall module architecture to which the method of the embodiment of the present invention is applied;
FIG. 3 is a schematic flow chart of a method for transmitting large files in a fragmented manner between supercomputing centers according to the present invention;
fig. 4 is a schematic structural diagram of a large file fragmentation transmission device between supercomputing centers according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and technical effects of the present invention more clearly apparent, the present invention is further described in detail below with reference to the accompanying drawings and examples.
As shown in fig. 1, the architecture for transmitting large file fragments between supercomputing centers of the present invention mainly considers 2 nodes of a sender and a receiver, wherein the supercomputing center as the sender includes a learner, a fragmenter, and a transmitter, and the supercomputing center as the receiver includes a receiver and a feedback device.
The slicer is used for acquiring the file size, cutting the file data and outputting the sliced data to the transmitter; the transmitter is used for receiving the sliced data output by the slicer and sending it to the receiver of the receiving side, and also acquires the data-sending feedback and transmits the feedback value to the learner; the learner is used for updating the state parameters according to the feedback provided by the transmitter, calculating the size of the next fragment and transmitting that value to the slicer; the receiver is used for receiving the fragment data from the sender, verifying the data and sending the verification result to the feedback device; the feedback device receives the reception state from the receiver and then sends the value to the transmitter of the sending side.
As shown in fig. 2 and fig. 3, a method for transmitting large files in a fragmented manner between supercomputing centers according to an embodiment of the present invention includes the following steps:
step one, collecting file data of a package transmitted among super-computation centers, counting and computing average package sending rate, and initializing the super-computation center serving as a sending party.
The method for collecting packet-transmission data between supercomputing centers and calculating the average packet-sending rate is specifically as follows: with supercomputing center A as the sender and supercomputing center B as the receiver, a large amount of packet-transmission data between center A and center B is collected, and the average packet-sending rate v̄ is calculated statistically.
The initialization work is carried out on the supercomputing center acting as the sender, and specifically comprises: initializing the state set {S}, the shard set {C}, the expected reward Q(S, C) (by default, the expected reward of every action in every state is 0), the reward mechanism R, the discount factor γ, the model learning rate α and the sampling threshold ε.
The states S of the set of states { S } include: network running state, data residual size, sender resource load and receiver resource load.
The shard set {C} is the set of shard sizes that can be adopted, and a plurality of sharding strategies are supported, including:
arithmetic arrangement: the user specifies the maximum fragment size, the minimum fragment size and the interval between fragments, and the system automatically generates a fragment set between the minimum and maximum sizes, arranged from small to large at the specified interval;
geometric arrangement: the user specifies the maximum fragment size, the minimum fragment size and the growth ratio between fragments, and the system automatically generates a fragment set between the minimum and maximum sizes, arranged from small to large according to the growth ratio;
user-defined function: the user specifies the maximum fragment size, the minimum fragment size and a user-defined function, and the system generates integer values between the minimum and maximum sizes as the fragment set according to that function;
user-defined fragment sizes: the user manually enters each fragment size.
The expected reward Q (S, C) refers to an expected reward after slicing according to the slicing size C in each state S.
The reward mechanism R is set as follows: let r be the immediate reward feedback; if the slicing strategy c_t is adopted in the current state s_t and the measured average packet-sending rate is v_t, then r = v_t − v̄. The reward mechanism R can guide the sender's supercomputing center to obtain an optimal slicing scheme according to the task completion state fed back by the receiver's supercomputing center.
The discount factor γ is used to weaken the reward feedback of the future state to the current state, i.e. if the time interval between the future state and the current state is larger, the reward feedback should be smaller under the same reward value.
The model learning rate α is initially set to: alpha is more than 0 and less than 1.
The sampling threshold ε is a greedy-policy threshold, initially set such that 0 < ε < 1.
Step two, after the initialization work is completed, the sender's supercomputing center obtains an initial state, dynamically adjusts the fragment size of the file data using the Q-learning reinforcement learning algorithm, and then fragments the file data according to that size and transmits it to the supercomputing center serving as the receiver; this specifically comprises the following sub-steps:
Step 2.1, after the initialization is finished, the supercomputing center acting as the sender acquires the current state s_t, obtaining the remaining file size T_t, the sender resource load, the network running state and the receiver resource load.
Step 2.2, the fragment size of the file data is dynamically adjusted with the Q-learning reinforcement learning algorithm, specifically: the system model of the sender's supercomputing center randomly generates a number x in the range [0, 1];
if x > ε, the fragment size that yields the best expected reward is obtained according to argmax_c Q(s_t, c) and used as the size of the next fragment; if several fragment sizes are equally optimal, one of them is selected at random as the next fragment size c_t, where Q(s_t, c_t) denotes the expected reward received after adopting fragment size c_t in state s_t, and max_c Q(s_{t+1}, c) denotes the maximum expected reward obtainable in the state s_{t+1} reached from state s_t after adopting fragment size c;
if x ≤ ε, one fragment size is selected at random from the fragment set {C} as the size of the next fragment.
Step 2.3, the super-computation center of the sender fragments the file data with corresponding size according to the fragment size and transmits the file data to the super-computation center of the receiver;
specifically, the supercomputing center of the sender cuts the current file once according to the size of the fragment to form data fragments with corresponding sizes, and transmits the data fragments to the supercomputing center of the receiver.
And step three, the super-computation center of the receiver sends transmission feedback to the super-computation center of the sender according to the receiving state of the super-computation center, and calculates the average packet sending rate among the super-computation centers according to the transmission result.
Specifically, transmission feedback is obtained as follows: if the receiver's supercomputing center receives the data successfully, it feeds back a reception success to the sender's supercomputing center through the feedback device; if the receiver detects a data check error, the receiver's supercomputing center feeds back a reception failure to the sender's supercomputing center; if the sender's supercomputing center receives no feedback for a long time, i.e. a timeout error occurs, the result is handled as a reception failure fed back by the receiver's supercomputing center.
If the transmission fails, the current average packet-sending rate is v_t = 0, meaning the receiving side could not receive the fragment data.
If the transmission succeeds, the transmission start time t_{n1} and the transmission end time t_{n2} are recorded, and the average packet-sending rate is calculated as v_t = c_t / (t_{n2} − t_{n1}), where c_t is the size of the fragment in the t-th transmission. For example, if sending 10 kB of data takes 2 s, the average packet-sending rate is v_t = 10 kB / 2 s = 5 kB/s; the larger the average packet-sending rate, the faster the data transmission.
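A minimal sketch of this rate computation, consistent with the 10 kB / 2 s = 5 kB/s example (the function name is illustrative):

```python
def average_send_rate(shard_size, t_start, t_end, success):
    """Average packet-sending rate v_t for one fragment transmission:
    0 on failure, bytes per second on success."""
    if not success or t_end <= t_start:
        return 0.0
    return shard_size / (t_end - t_start)
```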
And step four, updating and judging the size of the residual file data so as to judge whether the file data is completely transmitted, and if not, repeating the step two to the step four until the whole file data is completely transmitted.
Specifically, the size of the remaining file is updated as follows: if the receiver's supercomputing center receives the data successfully, i.e. the transmission succeeded, then T_{t+1} = T_t − c_t; otherwise, if the transmission failed, T_{t+1} = T_t, where T_i denotes the size of the file data remaining after i transmissions.
Then judge whether the file data has been completely transmitted:
if T_{t+1} ≤ 0, the file data has been completely transmitted and the packet-sending process ends;
if T_{t+1} > 0, the file data has not been completely transmitted, so the size of a new fragment must be calculated and transmitted: the current average packet-sending rate v_{t+1} is compared with the average packet-sending rate v̄, and the instant feedback reward is calculated as r_{t+1} = v_{t+1} − v̄.
The expected reward is then updated according to the reward mechanism:
Q(s_t, c_t) ← Q(s_t, c_t) + α[r_{t+1} + γ·max_c Q(s_{t+1}, c) − Q(s_t, c_t)]
where the left-hand Q(s_t, c_t) denotes the expected reward after adopting fragment size c_t in state s_t; the current state s_{t+1}, including the remaining file size T_{t+1}, the network running state, the sender resource load and the receiver resource load, is then acquired and updated, and the process repeats until the whole file has been transmitted successfully.
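The expected-reward update is the standard Q-learning rule; a sketch with a dict-backed Q table (names are illustrative, not from the patent):

```python
def q_update(Q, s, c, r, s_next, shard_set, alpha, gamma):
    """One Q-learning step:
    Q(s,c) <- Q(s,c) + alpha * (r + gamma * max_c' Q(s',c') - Q(s,c))."""
    best_next = max(Q.get((s_next, cc), 0.0) for cc in shard_set)
    old = Q.get((s, c), 0.0)
    Q[(s, c)] = old + alpha * (r + gamma * best_next - old)
    return Q[(s, c)]
```

For instance, starting from an all-zero table, a reward of 1.0 with α = 0.5 moves Q(s_t, c_t) to 0.5, halfway toward the observed return.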
According to the invention, the Q-learning reinforcement learning algorithm is utilized to determine the size of the current file fragment according to the network average time delay of the last fragment transmission, then the reward value is generated according to the network average time delay of the current transmission, the model is dynamically adjusted, and then the size of the next fragment is more accurately and reasonably adjusted, so that the network transmission efficiency of the large file is improved, the fragment sending success rate is improved, and the network bandwidth utilization rate is further improved.
Corresponding to the embodiment of the method for transmitting the large file fragments among the supercomputing centers, the invention also provides an embodiment of a device for transmitting the large file fragments among the supercomputing centers.
Referring to fig. 4, an embodiment of the present invention provides a large-file fragment transmission apparatus between supercomputing centers, including one or more processors, configured to implement a large-file fragment transmission method between supercomputing centers in the foregoing embodiment.
The embodiment of the large file fragmentation transmission device between super computing centers can be applied to any equipment with data processing capability, such as computers and other equipment or devices. The device embodiments may be implemented by software, or by hardware, or by a combination of hardware and software. The software implementation is taken as an example, and as a logical device, the device is formed by reading corresponding computer program instructions in the nonvolatile memory into the memory for running through the processor of any device with data processing capability. In terms of hardware, as shown in fig. 4, the present invention is a hardware structure diagram of an arbitrary device with data processing capability where a large file fragmentation transmission device between super computing centers is located, except for the processor, the memory, the network interface, and the nonvolatile memory shown in fig. 4, in an embodiment, the arbitrary device with data processing capability where the device is located may also include other hardware according to the actual function of the arbitrary device with data processing capability, which is not described again.
The implementation process of the functions and actions of each unit in the above device is specifically described in the implementation process of the corresponding step in the above method, and is not described herein again.
For the device embodiments, since they substantially correspond to the method embodiments, reference may be made to the partial description of the method embodiments for relevant points. The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the scheme of the invention. One of ordinary skill in the art can understand and implement it without inventive effort.
The embodiment of the invention also provides a computer readable storage medium, which stores a program, and when the program is executed by a processor, the method for transmitting the large file segments among the supercomputing centers is realized.
The computer readable storage medium may be an internal storage unit, such as a hard disk or a memory, of any data processing capability device described in any of the foregoing embodiments. The computer readable storage medium may also be an external storage device such as a plug-in hard disk, a Smart Media Card (SMC), an SD Card, a Flash memory Card (Flash Card), etc. provided on the device. Further, the computer readable storage medium may include both an internal storage unit and an external storage device of any data processing capable device. The computer-readable storage medium is used for storing the computer program and other programs and data required by the arbitrary data processing-capable device, and may also be used for temporarily storing data that has been output or is to be output.
The above description is only a preferred embodiment of the present invention and is not intended to limit the invention in any way. Although the practice of the invention has been described in detail above, those skilled in the art may still modify the technical solutions described in the foregoing examples or substitute equivalents for certain features. All changes, equivalents and modifications that come within the spirit and scope of the invention are intended to be protected.

Claims (9)

1. A method for fragmented transmission of large files between super-computing centers, characterized by comprising the following steps:
step one, collecting the file data of packets transmitted between super-computing centers, statistically calculating the average packet-sending rate, and performing initialization work on the super-computing center acting as the sender;
step two, after the initialization work is completed, the sender's super-computing center obtains an initial state, dynamically adjusts the fragment size of the file data using a reinforcement-learning algorithm, then fragments the file data according to that size and transmits it to the receiver's super-computing center;
step three, the receiver's super-computing center sends transmission feedback to the sender's super-computing center according to its reception state, and the average packet-sending rate between the super-computing centers is calculated from the transmission result;
step four, updating and judging the size of the remaining file data to determine whether the file data has been completely transmitted; if not, repeating steps two to four until the whole file has been transmitted.
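The four-step loop above can be sketched in a few lines (a minimal illustration only, not the patented implementation; all function names are hypothetical placeholders supplied by the caller):

```python
def transmit_file(total_size, choose_fragment_size, send_fragment, update_model):
    """Sketch of the claimed four-step loop: choose a fragment size,
    send, learn from the feedback, and update the remaining size."""
    remaining = total_size
    avg_rate = 0.0  # statistical average packet-sending rate
    while remaining > 0:
        # step two: the RL model picks the next fragment size
        c = min(choose_fragment_size(remaining, avg_rate), remaining)
        # steps two/three: send the fragment, receive (ok, rate) feedback
        ok, rate = send_fragment(c)
        avg_rate = update_model(ok, rate, c)  # step three: learn
        if ok:
            remaining -= c                    # step four: update remainder
    return remaining  # 0 once the whole file has been transmitted
```

The three callables stand in for the learner, the network transfer, and the model update described in the later claims.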
2. The method for fragmented transmission of large files between super-computing centers according to claim 1, wherein the initialization work specifically comprises: initializing the state set {S}, the fragment set {C}, the expected reward Q(S, C), the reward mechanism R, the discount factor γ, the model learning rate α, and the sampling threshold ϵ;
the states of the state set {S} include: the network running state, the remaining data size, the sender resource load and the receiver resource load;
the fragment set {C} is the set of fragment sizes that can be adopted, and supports a plurality of fragmentation strategies;
the expected reward Q(s, c) is the expected reward obtained after fragmenting with fragment size c in state s;
the reward mechanism R is set as follows: let r be the instant reward feedback; if fragmenting under the current state yields a calculated average packet-sending rate ρ_t, then r = ρ_t − ρ_avg, where ρ_avg is the average packet-sending rate obtained by statistical calculation;
the discount factor γ weakens the reward feedback of future states on the current state, that is, for the same reward value, the larger the time interval between a future state and the current state, the smaller the feedback;
the model learning rate α is initialized so that 0 < α < 1;
the sampling threshold ϵ is the greedy-policy threshold and is initialized so that 0 < ϵ < 1.
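The initialization work of claim 2 can be sketched as follows (a hedged illustration: the table layout and the default parameter values are assumptions, not taken from the patent):

```python
def init_q_learning(fragment_sizes, alpha=0.5, gamma=0.9, epsilon=0.1):
    """Initialize the quantities listed in claim 2: the fragment set {C},
    an (empty) expected-reward table Q, the learning rate alpha with
    0 < alpha < 1, the discount factor gamma, and the greedy sampling
    threshold epsilon with 0 < epsilon < 1."""
    assert 0 < alpha < 1 and 0 < epsilon < 1
    return {
        "Q": {},                    # (state, fragment size) -> expected reward
        "C": list(fragment_sizes),  # candidate fragment sizes
        "alpha": alpha,
        "gamma": gamma,
        "epsilon": epsilon,
    }
```

An untrained table defaults every Q(s, c) to zero, which matches the initial state before any transmission feedback has been observed.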
3. The method for fragmented transmission of large files between super-computing centers according to claim 2, wherein the plurality of fragmentation strategies comprise:
arithmetic progression: the user specifies the maximum fragment size, the minimum fragment size and the step between fragments, and the system automatically generates a fragment set between the minimum and maximum sizes, arranged from small to large at that step;
geometric progression: the user specifies the maximum fragment size, the minimum fragment size and the growth ratio between fragments, and the system automatically generates a fragment set between the minimum and maximum sizes, arranged from small to large at that ratio;
user-defined function: the user specifies the maximum fragment size, the minimum fragment size and a custom function, and the system generates integer values between the minimum and maximum sizes as the fragment set according to that function;
user-defined fragment sizes: the user manually enters each fragment size.
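The fragment-set generation strategies above can be illustrated with small helpers (hypothetical functions; the patent does not prescribe an implementation):

```python
def arithmetic_set(min_size, max_size, step):
    """Arithmetic progression: min_size, min_size+step, ... up to max_size."""
    return list(range(min_size, max_size + 1, step))

def geometric_set(min_size, max_size, ratio):
    """Geometric progression: each size grows by `ratio` up to max_size."""
    sizes, size = [], min_size
    while size <= max_size:
        sizes.append(int(size))
        size *= ratio
    return sizes

def custom_set(min_size, max_size, fn, n):
    """User-defined: integer values of fn(0..n-1), clamped into range."""
    values = {min(max(int(fn(i)), min_size), max_size) for i in range(n)}
    return sorted(values)
```

The fourth strategy (manual entry) is simply a user-supplied list and needs no generator.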
4. The method for fragmented transmission of large files between super-computing centers according to claim 2, wherein the second step specifically comprises the following substeps:
step 2.1, after completing the initialization work, the super-computing center acting as the sender acquires the current state s_t, comprising the remaining file size T_t, the sender resource load, the network running state and the receiver resource load;
step 2.2, a Q-learning reinforcement-learning algorithm is adopted to dynamically adjust the fragment size of the file data, specifically: the system model of the sender's super-computing center randomly generates a number x in the range [0, 1];
if x > ϵ, the fragment size that yields the best expected reward is obtained according to c_t = argmax_{c∈{C}} Q(s_t, c) and used as the size of the next fragment; if a plurality of fragment sizes are equally optimal, one of them is randomly selected as the next fragment size c_t; here Q(s_t, c_t) denotes the expected reward obtained after fragmenting with size c_t in state s_t, and max_c Q(s_{t+1}, c) denotes the maximum expected reward obtainable in state s_{t+1}, the state reached from s_t after fragmenting with size c;
if x ≤ ϵ, a fragment size is randomly selected from the fragment set {C} as the next fragment size c_t;
step 2.3, the sender's super-computing center fragments the file data according to the fragment size c_t and transmits it to the receiver's super-computing center.
5. The method for fragmented transmission of large files between super-computing centers according to claim 4, wherein the third step is specifically: if the receiver's super-computing center successfully receives the data, it feeds back reception success to the sender's super-computing center; if the receiver's super-computing center detects a data check error, it feeds back reception failure to the sender's super-computing center; if the sender's super-computing center receives no feedback for a long time, that is, a timeout error occurs, it is handled in the same way as a reception-failure feedback from the receiver's super-computing center; when the transmission fails, the current average packet-sending rate is ρ_t = 0, meaning the receiver cannot receive the fragmented data; when the transmission succeeds, the transmission start time t_{n1} and the transmission end time t_{n2} are recorded, and the average packet-sending rate is calculated as ρ_t = c_t/(t_{n2} − t_{n1}), where c_t is the size of the fragment sent the t-th time.
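The feedback rule of claim 5 reduces to a small function (a sketch only; the rate is taken as size divided by elapsed time, so a larger value means faster sending):

```python
def packet_rate(ok, c_t, t_start=None, t_end=None):
    """Feedback rule of claim 5: a failed (or timed-out) transfer yields
    rate 0; a successful transfer of c_t bytes between t_start and t_end
    yields c_t / (t_end - t_start), i.e. size per unit time."""
    if not ok:
        return 0.0  # receiver could not receive the fragment data
    return c_t / (t_end - t_start)
```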
6. The method for fragmented transmission of large files between super-computing centers according to claim 5, wherein the fourth step is specifically:
if the receiver's super-computing center successfully receives the data, that is, the transmission succeeds, then T_{t+1} = T_t − c_t; otherwise, if the transmission fails, T_{t+1} = T_t, where T_i denotes the size of the file data remaining after i transmissions;
judging whether the file data has been completely transmitted:
if T_{t+1} ≤ 0, the file data has been completely transmitted and the packet-sending process ends; if T_{t+1} > 0, the transmission is not complete, so the current average packet-sending rate ρ_{t+1} is compared with the statistical average packet-sending rate ρ_avg, and the instant feedback reward is calculated as r_{t+1} = ρ_{t+1} − ρ_avg;
the expected reward is then updated according to the reward mechanism:
Q(s_t, c_t) ← Q(s_t, c_t) + α[r_{t+1} + γ·max_c Q(s_{t+1}, c) − Q(s_t, c_t)],
where Q(s_t, c_t) on the left of the equation denotes the expected reward after using fragment size c_t in state s_t; the current state s_{t+1}, including the remaining file size T_{t+1}, is then obtained by updating, and steps two to four are repeated until the whole file has been transmitted successfully.
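The update in claim 6 is the standard Q-learning rule; a minimal sketch (function and variable names are hypothetical):

```python
def q_update(q_table, state, c_t, next_state, fragment_sizes,
             rate, avg_rate, alpha, gamma):
    """One expected-reward update per claim 6:
    r = rate - avg_rate, then
    Q(s, c) <- Q(s, c) + alpha * (r + gamma * max_c' Q(s', c') - Q(s, c))."""
    r = rate - avg_rate  # instant feedback reward
    best_next = max(q_table.get((next_state, c), 0.0) for c in fragment_sizes)
    old = q_table.get((state, c_t), 0.0)
    q_table[(state, c_t)] = old + alpha * (r + gamma * best_next - old)
    return q_table[(state, c_t)]
```

A fragment size that sent faster than the running average earns a positive reward, nudging the learner toward sizes that sustain higher packet-sending rates.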
7. A transmission architecture adopting the method for fragmented transmission of large files between super-computing centers according to any one of claims 1 to 6, comprising a sender's super-computing center and a receiver's super-computing center, wherein the sender's super-computing center comprises a learner, a fragmenter and a transmitter, and the receiver's super-computing center comprises a receiver and a feedback device;
the fragmenter acquires the file size, cuts the file data, and outputs the fragmented data to the transmitter; the transmitter receives the fragmented data output by the fragmenter, sends it to the receiver on the receiving side, acquires the data-sending feedback and passes the feedback value to the learner; the learner updates the state parameters according to the feedback provided by the transmitter, calculates the size of the next fragment, and passes that value to the fragmenter; the receiver receives the sender's fragment data, verifies it and sends the verification result to the feedback device; the feedback device receives the receiver's state and sends the value to the transmitter on the sending side.
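The component flow of claim 7 can be mimicked in-process (a toy sketch: the class names and the modular-sum checksum are illustrative assumptions, not the patent's verification scheme):

```python
class Fragmenter:
    """Cuts file data into fragments of the size chosen by the learner."""
    def cut(self, data, size):
        return [data[i:i + size] for i in range(0, len(data), size)]

class Receiver:
    """Receives a fragment, verifies it, and reports the result."""
    def receive(self, fragment, checksum):
        return sum(fragment) % 256 == checksum  # toy verification only

def transmit(fragmenter, receiver, data, size):
    """Transmitter: send each fragment and collect the feedback values
    that would be forwarded to the learner via the feedback device."""
    feedback = []
    for frag in fragmenter.cut(data, size):
        ok = receiver.receive(frag, sum(frag) % 256)
        feedback.append(ok)
    return feedback
```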
8. An apparatus for fragmented transmission of large files between super-computing centers, comprising one or more processors configured to implement the method for fragmented transmission of large files between super-computing centers according to any one of claims 1 to 6.
9. A computer-readable storage medium on which a program is stored, wherein the program, when executed by a processor, implements the method for fragmented transmission of large files between super-computing centers according to any one of claims 1 to 6.
CN202211148476.0A 2022-09-21 2022-09-21 Large file fragment transmission method and transmission architecture between super-computing centers Active CN115242782B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211148476.0A CN115242782B (en) 2022-09-21 2022-09-21 Large file fragment transmission method and transmission architecture between super-computing centers

Publications (2)

Publication Number Publication Date
CN115242782A true CN115242782A (en) 2022-10-25
CN115242782B CN115242782B (en) 2023-01-03

Family

ID=83680885

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211148476.0A Active CN115242782B (en) 2022-09-21 2022-09-21 Large file fragment transmission method and transmission architecture between super-computing centers

Country Status (1)

Country Link
CN (1) CN115242782B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104022852A (en) * 2014-06-25 2014-09-03 北京奇艺世纪科技有限公司 Document transmission method and device
CN104168081A (en) * 2013-05-20 2014-11-26 腾讯科技(深圳)有限公司 Document transmission method and device
CN106302589A (en) * 2015-05-27 2017-01-04 腾讯科技(深圳)有限公司 Document transmission method and terminal
CN111314022A (en) * 2020-02-12 2020-06-19 四川大学 Screen updating transmission method based on reinforcement learning and fountain codes
EP3929745A1 (en) * 2020-06-27 2021-12-29 INTEL Corporation Apparatus and method for a closed-loop dynamic resource allocation control framework
US20220067063A1 (en) * 2020-08-27 2022-03-03 Industry-Academic Cooperation Foundation, Yonsei University Apparatus and method for adaptively managing sharded blockchain network based on deep q network
WO2022083029A1 (en) * 2020-10-19 2022-04-28 深圳大学 Decision-making method based on deep reinforcement learning
US11373062B1 (en) * 2021-01-29 2022-06-28 EMC IP Holding Company LLC Model training method, data processing method, electronic device, and program product
US20220261295A1 (en) * 2021-02-05 2022-08-18 Qpicloud Technologies Private Limited Method and system for ai based automated capacity planning in data center

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JUN YAO: "Adaptive rate decision algorithm for DASH based on deep reinforcement learning", 2020 International Conference on Robots & Intelligent System (ICRIS) *
FENG Suliu et al.: "Research on DASH adaptive bitrate decision algorithms based on reinforcement learning", Journal of Communication University of China (Science and Technology) *

Also Published As

Publication number Publication date
CN115242782B (en) 2023-01-03

Similar Documents

Publication Publication Date Title
Marx et al. Same standards, different decisions: A study of QUIC and HTTP/3 implementation diversity
US8583977B2 (en) Method and system for reliable data transfer
US8085781B2 (en) Bulk data transfer
CN107257329B (en) A kind of data sectional unloading sending method
CN113489570B (en) Data transmission method, device and equipment of PCIe link
US20230015180A1 (en) Error Correction in Network Packets Using Soft Information
JP2023090883A (en) Optimizing network parameter for enabling network coding
CN115276920A (en) Audio data processing method and device, electronic equipment and storage medium
EP1531577B1 (en) Method for transmitting and processing command and data
US20090010157A1 (en) Flow control in a variable latency system
CN115242782B (en) Large file fragment transmission method and transmission architecture between super-computing centers
CN112995329B (en) File transmission method and system
CN112202896A (en) Edge calculation method, frame, terminal and storage medium
WO2015100932A1 (en) Network data transmission method, device and system
CN115499173A (en) Credible communication method and system based on UDP protocol
CN110912969B (en) High-speed file transmission source node, destination node device and system
AU2014200413B2 (en) Bulk data transfer
Henze A Machine-Learning Packet-Classification Tool for Processing Corrupted Packets on End Hosts
CN109005011B (en) Data transmission method and system for underwater acoustic network and readable storage medium
CN111698176A (en) Data transmission method and device, electronic equipment and computer readable storage medium
US8065374B2 (en) Application-level lossless compression
CN112953686B (en) Data retransmission method, device, equipment and storage medium
US11973744B2 (en) Systems and methods for establishing consensus in distributed communications
US20240031402A1 (en) System and method for suppressing transmissions from a wireless device
US6981194B1 (en) Method and apparatus for encoding error correction data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant