CN103986744A - Throughput-based file parallel transmission method - Google Patents
- Publication number: CN103986744A
- Application number: CN201310578190.0A
- Authority: CN (China)
- Prior art keywords: throughput, connection, transmission, files, time period
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion)
- Classification: Data Exchanges In Wide-Area Networks
Abstract
The invention discloses a throughput-based file parallel transmission method. The method comprises the following steps: extracting the file size and dividing the file into blocks; establishing parallel connections; comparing the number of file blocks with the number of parallel connections; carrying out parallel transmission of the file blocks; measuring and calculating throughput; and adjusting the degree of parallelism according to the throughput. The advantages are as follows: 1, end-to-end file transmission performance is substantially improved; 2, the method is broadly applicable: it is not restricted to specific network environments, systems, or hardware environments, and network throughput can be improved wherever the scheme is applied; and 3, the degree of parallelism is adjusted in real time according to throughput, so that the method adapts to changes in the network environment and makes full use of the network bandwidth.
Description
Technical field
The present invention relates to the technical field of computer networks, and specifically to a throughput-based file parallel transmission method.
Background art
With the development of communication, computer, and Internet technology, the Internet is evolving toward high bandwidth, long delay, intelligent wireless access, and space communication. Mobile terminal devices such as smartphones are updated constantly, and Internet application data grows rapidly. Massive scientific data sets from fields such as high-energy physics, astronomical observation, and aviation, together with new application models such as distributed networking and cloud computing, place ever higher demands on network transmission. At present, network architecture is relatively stable and network protocols are mature, so making maximum use of existing network resources to improve file transmission speed has significant research value and broad application prospects. Parallel data transmission is a bandwidth aggregation technique: multiple connections are used simultaneously to transfer data between a source host and a destination host. It can overcome the low efficiency of single-connection transmission and significantly improve end-to-end network throughput and transmission efficiency.
Research on parallel transmission is concentrated on three layers: the application layer, the transport layer, and the data link layer.
At the application layer, many protocols make use of parallel TCP (Transmission Control Protocol) streams, for example the grid data transmission protocol GridFTP (Grid File Transfer Protocol). Because the single-connection transmission mode of traditional FTP (File Transfer Protocol) cannot meet the need for fast transmission and storage of large-scale data in a grid, GridFTP comprehensively extends FTP. By extending the FTP commands and channels, it supports parallel data transfer: data is transmitted over multiple TCP connections simultaneously, which significantly improves transfer performance.
At the transport layer, end-to-end parallel transmission is mainly based on either the Transmission Control Protocol (TCP) or the Stream Control Transmission Protocol (SCTP). Researchers have proposed MulTCP, a method that replaces real parallel streams with N virtual streams at the TCP layer and realizes the idea of parallel TCP within the transmission of a single TCP stream. Stochastic TCP is also based on the MulTCP algorithm. MulTCP treats the congestion window as a set of N virtual TCP streams and assumes that these N streams are identical, whereas Stochastic TCP assumes that the N streams differ: the window size of each virtual stream is random, and each stream is handled independently. SCTP has several features, a key one being support for multiple streams: SCTP data can be sent in different streams, which improves data throughput, and other paths can be used for data transfer when the primary path fails, which guarantees the reliability of service transmission.
At the data link layer, the bandwidth of multiple network interfaces can be aggregated. The Linux bonding technology binds multiple network interfaces into one virtual interface, so that user data is scheduled across the interfaces according to a given algorithm, achieving load balancing and bandwidth aggregation. IPMP (IP network multipathing) in Solaris implements multi-interface bandwidth aggregation and parallel data transmission on the Sun operating system.
Each of these three lines of research has limitations. Application-layer approaches must be deployed in specific network environments. Transport-layer approaches require corresponding changes to the kernel and so far remain at the stage of theoretical research without large-scale deployment. Data-link-layer approaches need the support of extra hardware. None of them is suitable for end-to-end parallel transmission by ordinary users.
Summary of the invention
The object of the invention is to provide a throughput-based file parallel transmission method implemented at the application layer, which makes maximum use of existing network resources and improves file transmission speed.
The technical scheme of the invention is as follows: a throughput-based file parallel transmission method, comprising
Step 1: extract the size FileSize of the file to be transmitted; set the file block size to SegmentSize; divide the file to be transmitted into m file blocks, m = ceil(FileSize / SegmentSize);
Step 2: establish n connections;
Step 3: if m < n, use m connections to transmit the m file blocks in parallel until all file blocks have been transmitted; otherwise go to step 4;
Step 4: choose n blocks from the m blocks and transmit them in parallel over the n connections, so that the transmission degree of parallelism is n; set the connection flag of each connection to true; start timing when parallel transmission starts, and stop and restart the timer after each duration t, which yields time periods k, k = 1, 2, ..., N;
Step 5: measure and calculate the throughput parameters of the parallel transmission, comprising
501: measure the valid data amount transmitted by each connection:
The valid data amount transmitted by connection i in time period k is D(i, k), i = 1, 2, ..., n;
502: calculate the throughput of each connection:
The throughput of connection i in time period k is D(i, k) / t;
503: calculate the total throughput of all connections:
The total throughput of all connections in time period k is all_throughput(k), the sum of the throughputs of connections 1 to n in that period;
504: calculate the smoothed throughput:
smooth_throughput(k+1) = smooth_throughput(k) + α all_throughput(k+1), where α is the smoothing factor and smooth_throughput(1) = all_throughput(1);
505: calculate the average throughput of each connection after smoothing:
506: calculate the expected throughput:
expect_throughput(k+1) = smooth_throughput(k) + Dev(k),
where Dev(k) is the deviation variable of time period k,
Dev(k+1) = (1-β) Dev(k) + β |smooth_throughput(k+1) - all_throughput(k+1)|, where β is the smoothing factor of the deviation variable;
Step 6: adjust the transmission degree of parallelism according to the throughput parameters, comprising
601: determine whether the total throughput of time period k+1 is greater than the total throughput of time period k; if so, continue with 602; otherwise set to false the connection flag of every connection whose throughput in time period k is less than the average throughput in time period k, then go to step 7;
602: determine whether the smoothed throughput of time period k+1 is greater than the expected throughput of that time period; if so, create n new connections to transmit untransmitted file blocks in parallel, and the adjusted transmission degree of parallelism is n' = 2n; otherwise create one new connection to transmit an untransmitted file block in parallel, and the adjusted transmission degree of parallelism is n' = n + 1;
Step 7: when any connection of the parallel transmission finishes transmitting a file block, check whether the connection flag of that connection is true; if so, choose an untransmitted file block from the m blocks and transmit it over this connection; otherwise close the connection; after a connection is closed, the transmission degree of parallelism is n' = n - 1;
Step 8: repeat steps 5 to 7 until all file blocks have been transmitted.
In the above technical scheme, the smoothing factor α equals 0.5, the smoothing factor β of the deviation variable equals 0.8, and the connections are based on FTP.
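The step 5 quantities can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the class name `ThroughputTracker` and its fields are hypothetical, the initial deviation Dev(1) = 0 is assumed because the text gives no starting value, and the smoothing update is written in the conventional exponentially weighted form (1-α)·smooth_throughput(k) + α·all_throughput(k+1), by analogy with the Dev update. The text prints the update without the (1-α) factor, so that choice is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ThroughputTracker:
    """Hypothetical helper that tracks the step 5 quantities per time period."""
    t: float                        # duration of one time period, in seconds
    alpha: float = 0.5              # smoothing factor for the total throughput
    beta: float = 0.8               # smoothing factor for the deviation variable
    smooth: Optional[float] = None  # smooth_throughput of the latest period
    dev: float = 0.0                # Dev of the latest period (initial 0 is an assumption)

    def update(self, data_per_connection: List[float]) -> dict:
        # 502: throughput of each connection is D(i, k) / t
        per_connection = [d / self.t for d in data_per_connection]
        # 503: total throughput of all connections in this period
        total = sum(per_connection)
        if self.smooth is None:
            # First period: smooth_throughput(1) = all_throughput(1)
            self.smooth = total
            expect = None  # no expected value exists before the second period
        else:
            # 506: expectation for this period uses the previous smooth and Dev
            expect = self.smooth + self.dev
            # 504: assumed exponentially weighted form (see lead-in)
            self.smooth = (1 - self.alpha) * self.smooth + self.alpha * total
            # Deviation update as given in the text
            self.dev = (1 - self.beta) * self.dev + self.beta * abs(self.smooth - total)
        return {
            "per_connection": per_connection,
            "total": total,
            "smooth": self.smooth,
            "expect": expect,
            "dev": self.dev,
        }
```

Calling `update` once per time period with the byte counts D(i, k) of every active connection yields the values that step 6 compares.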
The beneficial effects of the invention are: 1, end-to-end file transfer performance is significantly improved; 2, the method is broadly applicable: it is not limited to a specific network environment, system, or hardware environment, and the scheme can be applied in any of them to improve network throughput; 3, the degree of parallelism is adjusted in real time according to throughput, so the method adapts to changes in the network environment and makes full use of the network bandwidth.
Description of the drawings
Fig. 1 compares the transmission performance of the method of the invention and SmartFTP when network conditions are good;
Fig. 2 compares the transmission performance of the method of the invention and SmartFTP when network conditions are poor.
Embodiment
The throughput-based file parallel transmission method is intended for end-to-end file transfer. Either party can act as the client or the server, and the method can be used both to push and to fetch files. In this method, each connection is based on the FTP protocol.
Step 1: extract the size FileSize of the file to be transmitted; set the file block size to SegmentSize; divide the file to be transmitted into m file blocks, m = ceil(FileSize / SegmentSize);
When the file size is extracted for a push, the file is split virtually by size on the local side, and attributes such as the block number, the block start pointer, and the block end pointer are attached to each block. When data is fetched from the other party, one connection is first established to obtain the relevant file information, and the block size is then set.
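The virtual split described for step 1 can be illustrated as follows. This is a sketch under assumptions: the names `FileBlock` and `split_virtually` are hypothetical, block numbers start at 0, and the start and end pointers are modelled as byte offsets with an exclusive end. No data is copied; each block only records where it begins and ends in the file.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FileBlock:
    number: int  # block number (0-based numbering is an assumption)
    start: int   # byte offset of the first byte in the block
    end: int     # byte offset one past the last byte in the block


def split_virtually(file_size: int, segment_size: int) -> List[FileBlock]:
    """Divide a file of file_size bytes into blocks of at most segment_size bytes."""
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    if file_size < 0:
        raise ValueError("file_size must not be negative")
    # Integer ceiling division: m = ceil(file_size / segment_size)
    m = (file_size + segment_size - 1) // segment_size
    return [
        FileBlock(
            number=i,
            start=i * segment_size,
            # The last block may be shorter than segment_size
            end=min((i + 1) * segment_size, file_size),
        )
        for i in range(m)
    ]
```

For example, `split_virtually(10, 4)` describes three blocks covering bytes 0 to 4, 4 to 8, and 8 to 10. With an FTP-based connection, a block's start offset could be used as the restart position of a transfer, for instance through the `rest` argument of `ftplib.FTP.retrbinary`; stopping at the end offset would be left to the caller.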
Step 2: establish n connections;
Step 3: if m < n, use m connections to transmit the m file blocks in parallel until all file blocks have been transmitted; otherwise go to step 4;
Step 4: choose n blocks from the m blocks and transmit them in parallel over the n connections, so that the transmission degree of parallelism is n; set the connection flag of each connection to true; start timing when parallel transmission starts, and stop and restart the timer after each duration t, which yields time periods k, k = 1, 2, ..., N. Here the duration t is much smaller than the transmission time of a file block.
Step 5: measure and calculate the throughput parameters of the parallel transmission, comprising
501: measure the valid data amount transmitted by each connection:
The valid data amount transmitted by connection i in time period k is D(i, k), i = 1, 2, ..., n;
502: calculate the throughput of each connection:
The throughput of connection i in time period k is D(i, k) / t;
503: calculate the total throughput of all connections:
The total throughput of all connections in time period k is all_throughput(k), the sum of the throughputs of connections 1 to n in that period;
504: calculate the smoothed throughput:
smooth_throughput(k+1) = smooth_throughput(k) + α all_throughput(k+1), where α is the smoothing factor and smooth_throughput(1) = all_throughput(1). Here, α is 0.5.
505: calculate the average throughput of each connection after smoothing:
506: calculate the expected throughput:
expect_throughput(k+1) = smooth_throughput(k) + Dev(k),
where Dev(k) is the deviation variable of time period k,
Dev(k+1) = (1-β) Dev(k) + β |smooth_throughput(k+1) - all_throughput(k+1)|, where β is the smoothing factor of the deviation variable.
Here, β is 0.8.
Step 6: adjust the transmission degree of parallelism according to the throughput parameters, comprising
601: determine whether the total throughput of time period k+1 is greater than the total throughput of time period k; if so, continue with 602; otherwise set to false the connection flag of every connection whose throughput in time period k is less than the average throughput in time period k, then go to step 7;
602: determine whether the smoothed throughput of time period k+1 is greater than the expected throughput of that time period; if so, create n new connections to transmit untransmitted file blocks in parallel, and the adjusted transmission degree of parallelism is n' = 2n; otherwise create one new connection to transmit an untransmitted file block in parallel, and the adjusted transmission degree of parallelism is n' = n + 1;
Step 7: when any connection of the parallel transmission finishes transmitting a file block, check whether the connection flag of that connection is true; if so, choose an untransmitted file block from the m blocks and transmit it over this connection; otherwise close the connection; after a connection is closed, the transmission degree of parallelism is n' = n - 1;
Step 8: repeat steps 5 to 7 until all file blocks have been transmitted.
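The decision logic of steps 6 and 7 can be sketched as two small functions. The names `Connection`, `adjust_parallelism`, and `on_block_finished` are hypothetical, and the throughput values are passed in by the caller rather than measured here. Returning no block when the pending queue is empty is an assumption of the sketch; the text only covers the case where untransmitted blocks remain.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Connection:
    ident: int
    flag: bool = True  # connection flag, set to true in step 4


def adjust_parallelism(
    conns: List[Connection],
    prev_total: float,                   # total throughput of time period k
    curr_total: float,                   # total throughput of time period k+1
    smooth: float,                       # smoothed throughput of time period k+1
    expect: float,                       # expected throughput of time period k+1
    prev_throughput: Dict[int, float],   # per-connection throughput in period k
    prev_average: float,                 # average throughput in period k
) -> int:
    """Step 6: return how many new connections to create (0, 1, or n)."""
    if curr_total <= prev_total:
        # 601: total throughput did not grow, so mark the slower-than-average
        # connections for closing; step 7 closes them after their current block.
        for conn in conns:
            if prev_throughput[conn.ident] < prev_average:
                conn.flag = False
        return 0
    # 602: total throughput grew; double if the smoothed value beats the
    # expectation (n' = 2n), otherwise add a single connection (n' = n + 1).
    return len(conns) if smooth > expect else 1


def on_block_finished(conn: Connection, pending: List[int]) -> Optional[int]:
    """Step 7: return the next block number for conn, or None to close it."""
    if conn.flag and pending:
        return pending.pop(0)
    return None  # the caller closes the connection, giving n' = n - 1
```

Keeping both functions free of network code makes the policy easy to exercise with recorded throughput values before it is wired to real FTP connections.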
Fig. 1 compares file transfer with the method of the invention (denoted throughputFTP in the figure) and with SmartFTP when network conditions are good. As the figure shows, when network conditions are good and the method of the invention is used to transfer large files (file size FileSize greater than 160 MB), the file transmission time is significantly shortened. Fig. 2 shows the same comparison when network conditions are poor. Even for smaller files, the method of the invention has a clear advantage in transmission time.
Claims (3)
1. A throughput-based file parallel transmission method, characterized by comprising
Step 1: extract the size FileSize of the file to be transmitted; set the file block size to SegmentSize; divide the file to be transmitted into m file blocks, m = ceil(FileSize / SegmentSize);
Step 2: establish n connections;
Step 3: if m < n, use m connections to transmit the m file blocks in parallel until all file blocks have been transmitted; otherwise go to step 4;
Step 4: choose n blocks from the m blocks and transmit them in parallel over the n connections, so that the transmission degree of parallelism is n; set the connection flag of each connection to true; start timing when parallel transmission starts, and stop and restart the timer after each duration t, which yields time periods k, k = 1, 2, ..., N;
Step 5: measure and calculate the throughput parameters of the parallel transmission, comprising
501: measure the valid data amount transmitted by each connection:
The valid data amount transmitted by connection i in time period k is D(i, k), i = 1, 2, ..., n;
502: calculate the throughput of each connection:
The throughput of connection i in time period k is D(i, k) / t;
503: calculate the total throughput of all connections:
The total throughput of all connections in time period k is all_throughput(k), the sum of the throughputs of connections 1 to n in that period;
504: calculate the smoothed throughput:
smooth_throughput(k+1) = smooth_throughput(k) + α all_throughput(k+1), where α is the smoothing factor and smooth_throughput(1) = all_throughput(1);
505: calculate the average throughput of each connection after smoothing:
506: calculate the expected throughput:
expect_throughput(k+1) = smooth_throughput(k) + Dev(k),
where Dev(k) is the deviation variable of time period k,
Dev(k+1) = (1-β) Dev(k) + β |smooth_throughput(k+1) - all_throughput(k+1)|, where β is the smoothing factor of the deviation variable;
Step 6: adjust the transmission degree of parallelism according to the throughput parameters, comprising
601: determine whether the total throughput of time period k+1 is greater than the total throughput of time period k; if so, continue with 602; otherwise set to false the connection flag of every connection whose throughput in time period k is less than the average throughput in time period k, then go to step 7;
602: determine whether the smoothed throughput of time period k+1 is greater than the expected throughput of that time period; if so, create n new connections to transmit untransmitted file blocks in parallel, and the adjusted transmission degree of parallelism is n' = 2n; otherwise create one new connection to transmit an untransmitted file block in parallel, and the adjusted transmission degree of parallelism is n' = n + 1;
Step 7: when any connection of the parallel transmission finishes transmitting a file block, check whether the connection flag of that connection is true; if so, choose an untransmitted file block from the m blocks and transmit it over this connection; otherwise close the connection; after a connection is closed, the transmission degree of parallelism is n' = n - 1;
Step 8: repeat steps 5 to 7 until all file blocks have been transmitted.
2. The parallel transmission method as claimed in claim 1, characterized in that the smoothing factor α equals 0.5 and the smoothing factor β of the deviation variable equals 0.8.
3. The parallel transmission method as claimed in claim 1 or 2, characterized in that the connections are based on FTP.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310578190.0A (CN103986744B) | 2013-11-18 | 2013-11-18 | Throughput-based file parallel transmission method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310578190.0A (CN103986744B) | 2013-11-18 | 2013-11-18 | Throughput-based file parallel transmission method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103986744A | 2014-08-13 |
CN103986744B | 2017-02-08 |
Family
ID=51278567
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310578190.0A (CN103986744B, active) | 2013-11-18 | 2013-11-18 | Throughput-based file parallel transmission method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103986744B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107453944A (en) * | 2017-07-07 | 2017-12-08 | 上海斐讯数据通信技术有限公司 | A kind of method and system for the optimal test connection number for determining network throughput test |
CN112019447A (en) * | 2020-08-19 | 2020-12-01 | 博锐尚格科技股份有限公司 | Data flow control method, device, system, electronic equipment and storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030223430A1 (en) * | 2002-06-04 | 2003-12-04 | Sandeep Lodha | Distributing unused allocated bandwidth using a borrow vector |
CN101133599A (en) * | 2004-12-24 | 2008-02-27 | 阿斯帕拉公司 | Bulk data transfer |
CN101136791A (en) * | 2006-11-16 | 2008-03-05 | 中兴通讯股份有限公司 | File transfer protocol based network throughput testing approach |
CN101616077A (en) * | 2009-07-29 | 2009-12-30 | 武汉大学 | The rapid transmission method of the big file in the Internet |
Also Published As
Publication number | Publication date |
---|---|
CN103986744B (en) | 2017-02-08 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |