CN103986744B - Throughput-based file parallel transmission method - Google Patents

Throughput-based file parallel transmission method

Info

Publication number
CN103986744B
CN103986744B CN201310578190.0A CN201310578190A CN103986744B CN 103986744 B CN103986744 B CN 103986744B CN 201310578190 A CN201310578190 A CN 201310578190A CN 103986744 B CN103986744 B CN 103986744B
Authority
CN
China
Prior art keywords
throughput
connection
time period
files
transmission
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310578190.0A
Other languages
Chinese (zh)
Other versions
CN103986744A (en)
Inventor
王俊峰
牟璇
黄辛
黄一辛
王敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan University
Original Assignee
Sichuan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2013-11-18
Filing date
2013-11-18
Publication date
2017-02-08
Application filed by Sichuan University
Priority to CN201310578190.0A
Publication of CN103986744A
Application granted
Publication of CN103986744B
Current legal status: Active

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a throughput-based file parallel transmission method. The method comprises the following steps: extracting the file size and dividing the file into blocks; establishing parallel connections; comparing the number of file blocks with the number of parallel connections; carrying out parallel transmission of the file blocks; measuring and calculating throughput; and adjusting the degree of parallelism according to the throughput. The advantages are as follows: 1. end-to-end file transmission performance is substantially improved; 2. the method is broadly applicable: it is not restricted to a specific network environment, system, or hardware environment, and applying it can improve network throughput; 3. the degree of parallelism is adjusted in real time based on throughput, so the method adapts to changes in the network environment and makes full use of the network bandwidth.

Description

Throughput-based file parallel transmission method
Technical field
The present invention relates to the technical field of computer networks, and specifically to a throughput-based file parallel transmission method.
Background art
With the continuous development of communication, computer, and Internet technologies, the Internet is evolving toward high bandwidth, long delay, intelligent wireless access, and space communication. Mobile terminal devices such as smartphones are updated constantly, so Internet application data is growing explosively. Massive scientific data from fields such as high-energy physics, astronomical observation, and aviation will continue to be produced, and the development of new application models such as distributed networks and cloud computing places higher demands on network transmission. At this stage the network structure is relatively stable and network protocols are mature, so making maximum use of existing network resources to improve file transmission speed has important research significance and broad application prospects. Parallel data transmission is a bandwidth aggregation technique in which multiple connections are used simultaneously to transfer data between the source host and the destination host. It addresses the low efficiency of single-connection transmission and significantly improves end-to-end network throughput and transmission efficiency.
Research on parallel transmission concentrates on three layers: the application layer, the transport layer, and the data link layer. Many application-layer protocols already make use of parallel TCP (Transmission Control Protocol) streams, for example the grid data transfer protocol GridFTP (Grid File Transfer Protocol). Because the single-connection transfer mode of traditional FTP (File Transfer Protocol) cannot meet the need for fast transfer and storage of large-scale data in grids, GridFTP comprehensively extends FTP: by extending the FTP commands and channels it supports parallel data transmission, in which multiple TCP connections transmit data simultaneously, and data transfer performance improves significantly. At the transport layer, end-to-end parallel transmission is mainly based on TCP or on the Stream Control Transmission Protocol (SCTP). Researchers have proposed MulTCP, which replaces real parallel streams at the TCP layer with N virtual streams and so realizes the idea of parallel TCP within the transmission of a single TCP flow. Stochastic TCP is also based on the MulTCP algorithm. MulTCP treats the congestion window as a set of N virtual TCP flows and assumes that the N flows are identical, whereas Stochastic TCP assumes that the N flows differ: the window size of each virtual stream is random, and each stream should be operated independently. SCTP has several features, an important one being support for multiple streams: SCTP data can be sent on different streams, which improves data throughput, and when the primary path fails, data can be sent over other paths, which ensures the reliability of service transmission. At the data link layer, the bandwidth of multiple network interfaces can be aggregated. The bonding technology of Linux binds multiple network interfaces into one virtual interface, and user data is scheduled across the interfaces according to a given algorithm to achieve load balancing and bandwidth aggregation. IPMP (IP (Internet Protocol) network multipathing) in Solaris implements multi-interface bandwidth aggregation and parallel data transmission on the SUN operating system.
Each of these three levels of research has a limitation. Application-layer research and applications need to be deployed in specific network environments. Transport-layer research requires corresponding changes to the kernel; it currently remains at the theoretical stage and has not been adopted on a large scale. Data-link-layer parallelism requires additional hardware support. None of these approaches is suitable for end-to-end parallel transmission by ordinary users.
Summary of the invention
The object of the present invention is to provide a throughput-based file parallel transmission method implemented at the application layer, which makes maximum use of existing network resources and improves the transmission speed of files.
The technical solution of the present invention is as follows: a throughput-based file parallel transmission method, comprising
Step 1: Extract the size FileSize of the file to be transmitted; set the file block size to SegmentSize; divide the file to be transmitted into m file blocks, m = ⌈FileSize / SegmentSize⌉;
Step 2: Establish n connections;
Step 3: If m < n, use m connections to transmit the m file blocks in parallel until all file blocks have been transmitted; otherwise go to Step 4;
Step 4: Select n blocks from the m blocks and transmit these file blocks in parallel over the n connections; set the transmission parallelism to n; set the connection flag of each connection to true; start timing when the parallel transmission starts, and stop and restart the timer after each duration t, obtaining time periods k, k = 1, 2, ..., N;
Step 5: Measure and calculate the throughput parameters of the parallel transmission, including
501: Measure the valid data amount transmitted by each connection:
The valid data amount transmitted by connection i in time period k is D(i, k), i = 1, 2, ..., n;
502: Calculate the throughput of each connection:
The throughput of connection i in time period k is throughput(i, k) = D(i, k) / t;
503: Calculate the total throughput of all connections:
The total throughput of all connections in time period k is all_throughput(k) = throughput(1, k) + throughput(2, k) + ... + throughput(n, k);
504: Calculate the smoothed throughput:
smooth_throughput(k+1) = smooth_throughput(k) + α · all_throughput(k+1), where α is the smoothing factor and smooth_throughput(1) = all_throughput(1);
505: Calculate the average throughput of each connection after smoothing:
avg_throughput(k) = smooth_throughput(k) / n(k), where n(k) is the transmission parallelism in time period k;
506: Calculate the expected throughput:
expect_throughput(k+1) = smooth_throughput(k) + Dev(k),
where Dev(k) is the deviation variable of time period k,
Dev(k+1) = (1 - β) · Dev(k) + β · |smooth_throughput(k+1) - all_throughput(k+1)|, where β is the smoothing factor of the deviation variable;
Step 6: Adjust the transmission parallelism according to the throughput parameters, including
601: Determine whether the total throughput of time period k+1 is greater than the total throughput of time period k. If so, continue; otherwise, set to false the connection flag of every connection whose throughput in time period k is less than the average throughput in time period k, then go to Step 7;
602: Determine whether the smoothed throughput of time period k+1 is greater than the expected throughput of that time period. If so, create n new connections to transmit untransmitted file blocks in parallel, so that the adjusted transmission parallelism is n' = 2n; otherwise, create one new connection to transmit an untransmitted file block in parallel, so that the adjusted transmission parallelism is n' = n + 1;
Step 7: After any connection in the parallel transmission finishes transmitting a file block, check whether the connection flag of that connection is true. If so, select an untransmitted file block from the m blocks and transmit it over this connection; otherwise, revoke this connection. After one connection is revoked, the transmission parallelism is n' = n - 1;
Step 8: Repeat Step 5 to Step 7 until all file blocks have been transmitted.
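Steps 1 to 3 can be illustrated with a short Python sketch. It is a minimal sketch under stated assumptions, not the patented implementation: it assumes that m is FileSize divided by SegmentSize and rounded up, so that a final partial block is counted, and the `Block` record, its field names, and both function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    number: int  # block number (1-based here, an assumption)
    start: int   # byte offset of the first byte of the block
    end: int     # byte offset one past the last byte of the block


def split_file(file_size: int, segment_size: int) -> List[Block]:
    """Step 1: virtually divide a file of file_size bytes into m blocks."""
    if file_size < 0 or segment_size <= 0:
        raise ValueError("file_size must be >= 0 and segment_size must be > 0")
    # Integer ceiling division: assumed form of m = ceil(FileSize / SegmentSize).
    m = (file_size + segment_size - 1) // segment_size
    return [
        Block(number=i + 1,
              start=i * segment_size,
              end=min((i + 1) * segment_size, file_size))
        for i in range(m)
    ]


def connections_to_use(m: int, n: int) -> int:
    """Step 3: with fewer blocks than connections, only m connections are used."""
    return m if m < n else n
```

For example, a 10-byte file with 4-byte blocks gives three blocks, the last of which covers only the remaining two bytes.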
In the above technical solution, the smoothing factor α is equal to 0.5, the smoothing factor β of the deviation variable is equal to 0.8, and the connections are based on FTP.
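The throughput parameters of Step 5 can be sketched in Python as follows. This is an illustrative sketch rather than the authors' implementation, and it rests on several assumptions: per-connection throughput is taken as D(i, k) / t, the per-connection average as the smoothed throughput divided by the number of connections, and Dev(1) as 0, none of which is stated explicitly in this text. The smoothing update follows the text as written; a conventional exponentially weighted average would also scale the previous value by (1 - α), but the text does not state that weight, so the sketch does not apply it. `PeriodStats` and `measure_period` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional

ALPHA = 0.5  # smoothing factor α given in the text
BETA = 0.8   # smoothing factor β of the deviation variable given in the text


@dataclass
class PeriodStats:
    per_connection: List[float]  # throughput of each connection in this period
    total: float                 # all_throughput(k)
    smooth: float                # smooth_throughput(k)
    dev: float                   # Dev(k)
    expect: Optional[float]      # expect_throughput(k); None for the first period
    average: float               # assumed: smooth_throughput(k) / n(k)


def measure_period(data_amounts: List[float], t: float,
                   prev: Optional[PeriodStats] = None,
                   alpha: float = ALPHA, beta: float = BETA,
                   initial_dev: float = 0.0) -> PeriodStats:
    """Compute the Step 5 parameters for one time period of length t."""
    if t <= 0 or not data_amounts:
        raise ValueError("t must be > 0 and at least one connection is required")
    per_connection = [d / t for d in data_amounts]  # 502 (assumed D(i, k) / t)
    total = sum(per_connection)                     # 503
    if prev is None:
        # First period: smooth_throughput(1) = all_throughput(1).
        smooth, dev, expect = total, initial_dev, None
    else:
        smooth = prev.smooth + alpha * total        # 504, as written in the text
        expect = prev.smooth + prev.dev             # 506
        dev = (1 - beta) * prev.dev + beta * abs(smooth - total)
    average = smooth / len(data_amounts)            # 505 (assumed form)
    return PeriodStats(per_connection, total, smooth, dev, expect, average)
```

Calling `measure_period` once per time period and passing the previous result as `prev` reproduces the recurrences of items 504 and 506.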
The beneficial effects of the invention are as follows: 1. End-to-end file transmission performance is significantly improved. 2. The method is broadly applicable: it is not limited to a specific network environment, system, or hardware environment, and applying the solution of the present invention can improve network throughput. 3. The degree of parallelism is adjusted in real time based on throughput, so the method adapts to changes in the network environment and makes full use of the network bandwidth.
Brief description of the drawings
Fig. 1 compares the transmission performance of the method of the present invention with that of SmartFTP when network conditions are good;
Fig. 2 compares the transmission performance of the method of the present invention with that of SmartFTP when network conditions are poor.
Detailed description of the embodiments
The throughput-based file parallel transmission method targets end-to-end file transmission. Either party can act as the client or the server, and the method can be used both to push files to the other party and to fetch files from it. In this method, every connection is based on the FTP protocol.
Step 1: Extract the size FileSize of the file to be transmitted; set the file block size to SegmentSize; divide the file to be transmitted into m file blocks, m = ⌈FileSize / SegmentSize⌉. When the file size is extracted, two cases arise. If a file is being pushed, the file is split virtually at the local end according to the block size, and each block is given attributes such as a block number, a block head pointer, and a block tail pointer. If data is being transferred from the other party, a connection is first established to obtain the relevant information about the file, and the block size is then set.
Step 2: Establish n connections;
Step 3: If m < n, use m connections to transmit the m file blocks in parallel until all file blocks have been transmitted; otherwise go to Step 4;
Step 4: Select n blocks from the m blocks and transmit these file blocks in parallel over the n connections; set the transmission parallelism to n; set the connection flag of each connection to true; start timing when the parallel transmission starts, and stop and restart the timer after each duration t, obtaining time periods k, k = 1, 2, ..., N, where the duration t is much smaller than the transmission time of a file block.
Step 5: Measure and calculate the throughput parameters of the parallel transmission, including
501: Measure the valid data amount transmitted by each connection:
The valid data amount transmitted by connection i in time period k is D(i, k), i = 1, 2, ..., n;
502: Calculate the throughput of each connection:
The throughput of connection i in time period k is throughput(i, k) = D(i, k) / t;
503: Calculate the total throughput of all connections:
The total throughput of all connections in time period k is all_throughput(k) = throughput(1, k) + throughput(2, k) + ... + throughput(n, k);
504: Calculate the smoothed throughput:
smooth_throughput(k+1) = smooth_throughput(k) + α · all_throughput(k+1), where α is the smoothing factor and smooth_throughput(1) = all_throughput(1). Here, α is 0.5.
505: Calculate the average throughput of each connection after smoothing:
avg_throughput(k) = smooth_throughput(k) / n(k), where n(k) is the transmission parallelism in time period k;
506: Calculate the expected throughput:
expect_throughput(k+1) = smooth_throughput(k) + Dev(k),
where Dev(k) is the deviation variable of time period k,
Dev(k+1) = (1 - β) · Dev(k) + β · |smooth_throughput(k+1) - all_throughput(k+1)|, where β is the smoothing factor of the deviation variable. Here, β is 0.8.
Step 6: Adjust the transmission parallelism according to the throughput parameters, including
601: Determine whether the total throughput of time period k+1 is greater than the total throughput of time period k. If so, continue; otherwise, set to false the connection flag of every connection whose throughput in time period k is less than the average throughput in time period k, then go to Step 7;
602: Determine whether the smoothed throughput of time period k+1 is greater than the expected throughput of that time period. If so, create n new connections to transmit untransmitted file blocks in parallel, so that the adjusted transmission parallelism is n' = 2n; otherwise, create one new connection to transmit an untransmitted file block in parallel, so that the adjusted transmission parallelism is n' = n + 1;
Step 7: After any connection in the parallel transmission finishes transmitting a file block, check whether the connection flag of that connection is true. If so, select an untransmitted file block from the m blocks and transmit it over this connection; otherwise, revoke this connection. After one connection is revoked, the transmission parallelism is n' = n - 1;
Step 8: Repeat Step 5 to Step 7 until all file blocks have been transmitted.
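The decisions in Steps 6 and 7 can be sketched in Python as two small functions. This is a hedged illustration only: the function names and argument layout are hypothetical, the current parallelism n is taken to be the number of connection flags, and the connections themselves (FTP sessions in the method) are left out so that only the control logic is shown.

```python
from typing import List, Optional, Tuple


def adjust_parallelism(flags: List[bool],
                       throughput_prev: List[float],
                       average_prev: float,
                       total_prev: float,
                       total_next: float,
                       smooth_next: float,
                       expect_next: float) -> Tuple[List[bool], int]:
    """Step 6: return the updated connection flags and the number of
    new connections to open (0, 1, or n)."""
    if total_next <= total_prev:
        # 601, "otherwise" branch: total throughput did not grow, so flag
        # every connection that was slower than the average in period k.
        new_flags = [flag and thr >= average_prev
                     for flag, thr in zip(flags, throughput_prev)]
        return new_flags, 0
    # 602: total throughput grew, so increase the parallelism.
    n = len(flags)
    new_connections = n if smooth_next > expect_next else 1
    return list(flags), new_connections


def next_block_or_revoke(flag: bool, pending: List[int]) -> Optional[int]:
    """Step 7: after a connection finishes a block, return the number of the
    next untransmitted block, or None if the connection should be revoked."""
    if flag and pending:
        return pending.pop(0)
    return None
```

A flagged-false connection is not interrupted mid-block in this sketch: it finishes its current block and is only revoked when `next_block_or_revoke` returns `None`, which matches the order of Steps 6 and 7.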
Fig. 1 compares, under good network conditions, file transmission using the method of the present invention (denoted ThroughputFTP) with file transmission using SmartFTP. It can be seen that, under good network conditions, the file transmission time is significantly shortened when the method of the present invention is used to transmit large files (file size FileSize greater than 160 MB). Fig. 2 compares file transmission using the method of the present invention with file transmission using SmartFTP under poor network conditions. It can be seen that, even when the file is small, the method of the present invention has a clear advantage in transmission time.

Claims (3)

1. A throughput-based file parallel transmission method, characterized by comprising
Step 1: Extract the size FileSize of the file to be transmitted; set the file block size to SegmentSize; divide the file to be transmitted into m file blocks, m = ⌈FileSize / SegmentSize⌉;
Step 2: Establish n connections;
Step 3: If m < n, use m connections to transmit the m file blocks in parallel until all file blocks have been transmitted; otherwise go to Step 4;
Step 4: Select n blocks from the m blocks and transmit these file blocks in parallel over the n connections; set the transmission parallelism to n; set the connection flag of each connection to true; start timing when the parallel transmission starts, and stop and restart the timer after each duration t, obtaining time periods k, k = 1, 2, ..., N;
Step 5: Measure and calculate the throughput parameters of the parallel transmission, including
501: Measure the valid data amount transmitted by each connection:
The valid data amount transmitted by connection i in time period k is D(i, k), i = 1, 2, ..., n;
502: Calculate the throughput of each connection:
The throughput of connection i in time period k is throughput(i, k) = D(i, k) / t;
503: Calculate the total throughput of all connections:
The total throughput of all connections in time period k is all_throughput(k) = throughput(1, k) + throughput(2, k) + ... + throughput(n, k);
504: Calculate the smoothed throughput:
smooth_throughput(k+1) = smooth_throughput(k) + α · all_throughput(k+1), where α is the smoothing factor and smooth_throughput(1) = all_throughput(1);
505: Calculate the average throughput of each connection after smoothing:
avg_throughput(k) = smooth_throughput(k) / n(k), where n(k) is the transmission parallelism in time period k;
506: Calculate the expected throughput:
expect_throughput(k+1) = smooth_throughput(k) + Dev(k),
where Dev(k) is the deviation variable of time period k,
Dev(k+1) = (1 - β) · Dev(k) + β · |smooth_throughput(k+1) - all_throughput(k+1)|, where β is the smoothing factor of the deviation variable;
Step 6: Adjust the transmission parallelism according to the throughput parameters, including
601: Determine whether the total throughput of time period k+1 is greater than the total throughput of time period k. If so, continue; otherwise, set to false the connection flag of every connection whose throughput in time period k is less than the average throughput in time period k, then go to Step 7;
602: Determine whether the smoothed throughput of time period k+1 is greater than the expected throughput of that time period. If so, create n new connections to transmit untransmitted file blocks in parallel, so that the adjusted transmission parallelism is n' = 2n; otherwise, create one new connection to transmit an untransmitted file block in parallel, so that the adjusted transmission parallelism is n' = n + 1;
Step 7: After any connection in the parallel transmission finishes transmitting a file block, check whether the connection flag of that connection is true. If so, select an untransmitted file block from the m blocks and transmit it over this connection; otherwise, revoke this connection. After one connection is revoked, the transmission parallelism is n' = n - 1;
Step 8: Repeat Step 5 to Step 7 until all file blocks have been transmitted.
2. The throughput-based file parallel transmission method according to claim 1, characterized in that the smoothing factor α is equal to 0.5 and the smoothing factor β of the deviation variable is equal to 0.8.
3. The throughput-based file parallel transmission method according to claim 1 or 2, characterized in that the connections are based on FTP.
CN201310578190.0A 2013-11-18 2013-11-18 Throughput-based file parallel transmission method Active CN103986744B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310578190.0A CN103986744B (en) 2013-11-18 2013-11-18 Throughput-based file parallel transmission method

Publications (2)

Publication Number Publication Date
CN103986744A CN103986744A (en) 2014-08-13
CN103986744B CN103986744B (en) 2017-02-08

Family

ID=51278567

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310578190.0A Active CN103986744B (en) 2013-11-18 2013-11-18 Throughput-based file parallel transmission method

Country Status (1)

Country Link
CN (1) CN103986744B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107453944B (en) * 2017-07-07 2021-04-02 台州市吉吉知识产权运营有限公司 Method and system for determining optimal test connection number of network throughput test
CN112019447B (en) * 2020-08-19 2024-06-25 博锐尚格科技股份有限公司 Data flow control method, device, system, electronic equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101133599A (en) * 2004-12-24 2008-02-27 阿斯帕拉公司 Bulk data transfer
CN101136791A (en) * 2006-11-16 2008-03-05 中兴通讯股份有限公司 File transfer protocol based network throughput testing approach
CN101616077A (en) * 2009-07-29 2009-12-30 武汉大学 The rapid transmission method of the big file in the Internet

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7352761B2 (en) * 2002-06-04 2008-04-01 Lucent Technologies Inc. Distributing unused allocated bandwidth using a borrow vector

Also Published As

Publication number Publication date
CN103986744A (en) 2014-08-13

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant