CN101860479B - Method for improving data transmission efficiency in grid environment - Google Patents

Method for improving data transmission efficiency in grid environment

Info

Publication number
CN101860479B
Authority
CN
China
Prior art keywords
server
transmission
score
cpu
bandwidth
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2010101694276A
Other languages
Chinese (zh)
Other versions
CN101860479A (en)
Inventor
吴卿
张奇锋
倪永军
周兴武
金恭华
赵俊杰
郁伟炜
吴鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dianzi University filed Critical Hangzhou Dianzi University
Priority to CN2010101694276A priority Critical patent/CN101860479B/en
Publication of CN101860479A publication Critical patent/CN101860479A/en
Application granted granted Critical
Publication of CN101860479B publication Critical patent/CN101860479B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical


Landscapes

  • Information Transfer Between Computers (AREA)

Abstract

The invention relates to a method for improving data transmission efficiency in a grid environment. File transmission with traditional methods is inefficient. The method comprises the following steps: first, in the initial stage, each grid FTP server is assigned a file fragment transmission task of the same size, where the fragment size is determined by bandwidth and weight; then, when a server finishes its current file fragment transmission task, it is assigned a new one, and a failed transmission is retransmitted; finally, a termination condition is set to avoid generating fragments that are too small: if the remaining data to be transmitted is smaller than the fragment size set in the initial stage, the final remaining fragment is transmitted immediately. The method maximizes resource utilization.

Description

A method for improving data transmission efficiency in a grid environment
Technical field
The invention belongs to the field of grid computing and specifically relates to a method for improving data transmission efficiency in a grid environment.
Background art
A grid is a virtual environment for resource sharing and coordinated work. It can absorb a wide variety of resources and convert them into a resource that is available anywhere, reliable, standardized, and at the same time economical. A very important role of grid computing is to use these resources effectively to carry out large-scale computation quickly and accurately. To devote as much of the available time as possible to actual computation, data transmission time must be reduced as far as possible.
In grid environments, a co-allocation architecture is currently adopted, which addresses this problem by transmitting concurrently from the multiple replicas that exist in the grid. The traditional co-allocation architecture has three strategies for distributing data blocks among the replicas to improve network transmission efficiency:
1) Brute-force co-allocation
The file size is divided equally among the connections, without considering the bandwidth differences between the individual client-server connections. For example, if a client is connected to 3 servers, each server is allocated 1/3 of the file.
2) History-based co-allocation
Each connection is allocated a share of the file in proportion to its predicted transmission speed. For example, if the transmission performance of 3 connections is in the ratio 1:2:3, the first server is allocated 1/6 of the file, the second 2/6, and the third 3/6.
3) Conservative load balancing co-allocation
Conservative load balancing divides the file to be transmitted into a number of fragments of equal size, and each connected server is assigned one fragment to transmit in a striped fashion. When a server completes the transmission of a fragment, it is assigned another fragment, and this continues until the whole file has been downloaded. The load placed on each flow is thus adjusted dynamically, so a faster server transmits a relatively larger portion of the file.
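The three traditional strategies can be sketched as follows. This is a minimal illustration rather than code from the patent: the function names are hypothetical, server speeds are assumed constant, and the conservative load balancing function is a simple simulation in which whichever server becomes free first receives the next equal-size fragment.

```python
import heapq


def brute_force_split(file_size, n_servers):
    """Equal share for every server, ignoring bandwidth differences."""
    return [file_size / n_servers] * n_servers


def history_based_split(file_size, predicted_speeds):
    """Share proportional to each connection's predicted speed."""
    total = sum(predicted_speeds)
    return [file_size * speed / total for speed in predicted_speeds]


def conservative_load_balancing(file_size, fragment_size, speeds):
    """Simulate equal-size fragments handed to whichever server is free first.

    Returns the amount of data each server ends up transferring.
    Speeds are assumed constant, which a real network would not guarantee.
    """
    transferred = [0] * len(speeds)
    free_at = [(0.0, i) for i in range(len(speeds))]  # (time server is free, index)
    heapq.heapify(free_at)
    remaining = file_size
    while remaining > 0:
        now, server = heapq.heappop(free_at)
        size = min(fragment_size, remaining)
        transferred[server] += size
        remaining -= size
        heapq.heappush(free_at, (now + size / speeds[server], server))
    return transferred


# The two examples from the text:
# brute_force_split(300, 3)            -> [100.0, 100.0, 100.0]
# history_based_split(600, [1, 2, 3])  -> [100.0, 200.0, 300.0]
```

In the simulation, faster servers request fragments more often and so carry a larger part of the file, which is the dynamic behavior the text attributes to conservative load balancing.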
The key to history-based co-allocation is predicting the connection speed. This is of course idealized: when the amount of data for each replica is allocated according to an average transmission rate obtained by testing, there is always a risk that the transmission speed suddenly rises or falls. The accuracy of the prediction has a large effect on transmission performance, and the load allocation in this strategy is completed before the bulk data transmission begins and does not change once made. Like brute-force co-allocation, its greatest shortcoming is that it cannot adapt to dynamic changes in network performance.
Fragment size is the key factor determining how well the conservative load balancing algorithm performs. If fragments are too large, the granularity of co-allocation becomes too coarse and its benefit declines; if they are too small, connections must request new fragments so frequently that performance suffers. A suitable fragment length must therefore be chosen as a trade-off between the two.
The fragment size should satisfy the following points:
1. The number of unit packets into which the data is divided, which is determined by the fragment size, should be far greater than the number of data transfer servers.
2. A unit packet should be as small as possible. On the one hand, this lets all data transfer servers finish at nearly the same time, improving overall server utilization; on the other hand, it adapts better to changes in the transmission performance of each server.
3. A unit packet should be relatively large, so that the time taken to transmit one unit packet is far greater than the idle time between unit packets.
A suitable fragment size is hard to choose. In addition, overall transmission performance is determined by the server with the lowest transmission efficiency. When the transmission speed of one server increases, it transmits more data than the other servers. In this case the effect is actually fairly small, and an effective remedy is to dynamically give that server a larger amount of data so as to lighten the load on the other servers. When the transmission speed of one server decreases, however, it finishes its task later than all the other servers. For the reason just given, this server alone determines the overall transmission performance, so this case is far more damaging than the previous one.
Traditional co-allocation strategies do not overcome the drawback that faster servers must wait for the slowest server to finish transmitting the last file fragment. In most cases this wastes a great deal of time and ultimately degrades overall transmission performance.
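The effect of a single slow server on a fixed allocation can be made concrete with a small calculation. The sketch below is illustrative only: the function name and the numbers are assumptions, and it models a static allocation in which the overall completion time is simply the finishing time of the last server.

```python
def completion_time(shares, speeds):
    """Overall completion time of a static allocation.

    shares[i] is the amount of data assigned to server i and speeds[i] its
    actual transfer rate; the transfer ends when the slowest server finishes.
    """
    return max(share / speed for share, speed in zip(shares, speeds))


shares = [100, 200, 300]  # allocated for predicted speeds in the ratio 1:2:3

as_predicted = completion_time(shares, [1, 2, 3])      # 100.0
third_faster = completion_time(shares, [1, 2, 6])      # still 100.0
third_slower = completion_time(shares, [1, 2, 1.5])    # 200.0
```

A server that speeds up merely finishes early and waits, so the total time is unchanged, whereas a server that slows down to half its predicted speed doubles the total time in this example.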
Summary of the invention
The present invention addresses the deficiencies of the prior art and provides a method for improving data transmission efficiency in a grid environment.
The goal of network transmission in a grid environment is to achieve the best transmission performance, that is, to minimize the transmission time of each replica. One way to reach this goal is to keep every replica transmitting data at all times: each replica transmits without interruption from the start of its data transfer to the end, and all replicas finish transmitting at almost the same time. This is achieved by allocating a specific amount of data to each replica according to the expected future transfer rate between that replica and the destination, so that all replicas complete their transfers simultaneously and the best transmission performance is obtained.
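The requirement that all replicas finish at the same time fixes how much data each should carry. As a short worked derivation under assumed notation that does not appear in the original (D for the total file size, r_i for the expected transfer rate of replica i, d_i for the data assigned to it, T for the common finishing time):

```latex
% Assumed notation (not from the original text):
%   D   total file size
%   r_i expected transfer rate between replica i and the destination
%   d_i amount of data assigned to replica i
%   T   common finishing time of all replicas
\begin{align*}
\frac{d_i}{r_i} &= T \quad (i = 1, \dots, n), &
\sum_{i=1}^{n} d_i &= D \\
\Rightarrow\quad T &= \frac{D}{\sum_{k=1}^{n} r_k}, &
d_i &= D \cdot \frac{r_i}{\sum_{k=1}^{n} r_k}
\end{align*}
```

Under these assumptions each replica's share is its rate divided by the sum of all rates, so the allocation is only as good as the rate estimate, which is why the method re-evaluates server state throughout the transfer instead of fixing the shares once.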
The method of the invention is a dynamic adjustment strategy, and more specifically an improved dynamic co-allocation method. It reduces waiting time and thereby improves overall transmission performance.
The steps taken by the invention to solve the technical problem are:
Step (1): In the initial stage, each grid FTP server is assigned a file fragment transmission task of the same size; the fragment size is determined by bandwidth and weight.
Step (2): When a server completes its current file fragment transmission task, it is assigned a new file fragment transmission task; if a transmission fails, it is retransmitted.
Step (3): A termination condition is set to avoid generating fragments that are too small. If the remaining file size to be transmitted is smaller than the fragment size set in the initial stage, the final remaining fragment is transmitted immediately.
The beneficial effect of the method is that each grid service server dynamically adjusts its transmission strategy according to changes in bandwidth and weight in the network environment. This allows each server to make full use of the network bandwidth and of its own performance, thereby maximizing resource utilization.
Embodiment
A method for improving data transmission efficiency in a grid environment comprises the following steps:
Step (1): In the initial stage, each grid FTP server is assigned a file fragment transmission task of the same size; the fragment size is determined by bandwidth and weight.
First, an upper limit is set on the initial file fragment size, which depends on the client's maximum bandwidth and the number of replica sources. Although replicas can be downloaded from multiple servers in parallel, the file fragments downloaded over the many connections must all pass through the client's single link. The client bandwidth is therefore clearly the bottleneck that limits how much the co-allocation architecture can speed up the download. The upper limit of the initial file fragment size, InitialPT, is computed as follows:
$$\mathrm{InitialPT} = \frac{\mathrm{ClientMaxBandwidth}}{\mathrm{NumberOfReplicaSource}} \tag{1.1}$$
In formula (1.1), ClientMaxBandwidth is the maximum bandwidth of the client and NumberOfReplicaSource is the total number of replica sources.
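Formula (1.1) translates directly into code. In the sketch below the function name is hypothetical, and the unit convention is an assumption: the text equates a bandwidth with a fragment size without stating a time base, so the result is read here as the amount of data the client link can absorb per replica source in one unit of time.

```python
def initial_fragment_limit(client_max_bandwidth, number_of_replica_sources):
    """Upper limit InitialPT of the initial fragment size, formula (1.1).

    Assumption: client_max_bandwidth is expressed per unit time (for example
    MB per second), so the result is a size per replica source per time unit.
    """
    if number_of_replica_sources < 1:
        raise ValueError("at least one replica source is required")
    return client_max_bandwidth / number_of_replica_sources


# Example with made-up numbers: a client limited to 100 units and 4 replica
# sources gives an initial fragment limit of 25 units per source.
limit = initial_fragment_limit(100.0, 4)
```

Dividing by the number of sources keeps the sum of all initial fragments within what the client's single link can receive, which is the bottleneck argument made in the text.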
An auxiliary parameter Score is first defined according to the state of each server:
$$\mathrm{Score}_i = P_i^{CPU} \times R_{CPU} + P_i^{MEM} \times R_{MEM} + P_i^{BW} \times R_{BW} \tag{1.2}$$
where $R_{CPU} + R_{MEM} + R_{BW} = 1$. The parameters in formula (1.2) are as follows:
$\mathrm{Score}_i$: the score of server i, where 1 ≤ i ≤ n and n is the number of servers.
$P_i^{CPU}$: the percentage of CPU idle time of server i.
$R_{CPU}$: the user-defined proportion of the CPU load factor.
$P_i^{MEM}$: the percentage of free memory of server i.
$R_{MEM}$: the user-defined proportion of the free memory factor.
$P_i^{BW}$: the bandwidth from server i to the client as a percentage of the total server bandwidth. It can be obtained by dividing the current bandwidth by the theoretical maximum bandwidth.
$R_{BW}$: the user-defined proportion of the bandwidth factor.
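A minimal sketch of formula (1.2) follows. The function and argument names are illustrative, and the percentages are assumed to be given as fractions between 0 and 1.

```python
def score(p_cpu, p_mem, p_bw, r_cpu, r_mem, r_bw):
    """Score of one server according to formula (1.2).

    p_cpu, p_mem, p_bw: CPU idle, free memory, and bandwidth percentages of
    the server, assumed here to be fractions in [0, 1].
    r_cpu, r_mem, r_bw: user-defined proportions, which must sum to 1.
    """
    if abs(r_cpu + r_mem + r_bw - 1.0) > 1e-9:
        raise ValueError("R_CPU + R_MEM + R_BW must equal 1")
    return p_cpu * r_cpu + p_mem * r_mem + p_bw * r_bw


# Made-up server state: 50% CPU idle, 50% memory free, 80% of the
# theoretical bandwidth available, with proportions 0.1, 0.1, 0.8.
example = score(0.5, 0.5, 0.8, 0.1, 0.1, 0.8)  # 0.05 + 0.05 + 0.64
```

Because the proportions sum to 1 and each percentage is at most 1, the score always lies between 0 and 1, which makes scores of different servers directly comparable.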
Once $\mathrm{Score}_i$ has been obtained for every server, the weight $\mathrm{weighing}_i$ of each server can be computed:
$$\mathrm{weighing}_i = \frac{\mathrm{Score}_i}{\sum_{k=1}^{n} \mathrm{Score}_k} \tag{1.3}$$
Once the weight $\mathrm{weighing}_i$ of each server has been obtained, the size $\mathrm{NewPT}_i$ of the next file fragment to be transmitted by each server is given by formula (1.4):
$$\mathrm{NewPT}_i = \mathrm{ClientBandwidth} \times \mathrm{weighing}_i \tag{1.4}$$
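Formulas (1.3) and (1.4) can be sketched together. The function names are hypothetical; the scores are taken as already computed, and the client bandwidth is assumed to use the same unit convention as the fragment sizes.

```python
def weights(scores):
    """Normalize server scores into weights, formula (1.3)."""
    total = sum(scores)
    if total <= 0:
        raise ValueError("at least one server must have a positive score")
    return [s / total for s in scores]


def new_fragment_sizes(client_bandwidth, scores):
    """Next fragment size NewPT_i for every server, formula (1.4)."""
    return [client_bandwidth * w for w in weights(scores)]


# Made-up scores for three servers; the third scores twice as high as the
# others and therefore receives half of the client bandwidth.
sizes = new_fragment_sizes(100.0, [1.0, 1.0, 2.0])  # [25.0, 25.0, 50.0]
```

Since the weights sum to 1, the fragment sizes handed out in one round add up to the client bandwidth, so the client link is neither oversubscribed nor left idle.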
Step (2): When a server completes its current file fragment transmission task, it is assigned a new file fragment transmission task; if a transmission fails, it is retransmitted.
Each time server i finishes its current file fragment transmission, it obtains a new transmission task, which is derived from the real-time state of server i. On each assignment, the dynamic adjustment strategy adjusts each server's next transmission task according to server load and bandwidth. The lighter a server's load, the larger the transmission task it is assigned.
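The reassignment loop of step (2) can be illustrated with a small event-driven simulation. Everything in it is an assumption made for illustration: the function name, constant server speeds, fragment sizes that stay fixed per server instead of being recomputed from live state, and a failure model in which a listed server fails its first fragment once and retransmits it itself.

```python
import heapq


def simulate_step2(file_size, speeds, fragment_sizes, failing_first_attempt=()):
    """Simulate step (2) and return the data delivered by each server.

    speeds[i]: transfer rate of server i (assumed constant).
    fragment_sizes[i]: fragment size granted to server i on every request
        (in the method this would be recomputed from the server's state).
    failing_first_attempt: servers whose first fragment fails once and is
        retransmitted (hypothetical failure model).
    """
    delivered = [0.0] * len(speeds)
    pending_failure = set(failing_first_attempt)
    events = []          # (finish time, server, fragment size)
    next_offset = 0.0    # amount of the file already handed out

    def assign(now, server):
        nonlocal next_offset
        size = min(fragment_sizes[server], file_size - next_offset)
        if size <= 0:
            return       # nothing left to hand out
        next_offset += size
        heapq.heappush(events, (now + size / speeds[server], server, size))

    for server in range(len(speeds)):
        assign(0.0, server)
    while events:
        now, server, size = heapq.heappop(events)
        if server in pending_failure:
            # Transmission failed: send the same fragment again.
            pending_failure.discard(server)
            heapq.heappush(events, (now + size / speeds[server], server, size))
            continue
        delivered[server] += size
        assign(now, server)  # finished server immediately gets a new task
    return delivered
```

With `simulate_step2(1000.0, [1.0, 4.0], [10.0, 40.0])` the faster server delivers four times as much data as the slower one and both finish together, which is the behavior the step is meant to achieve.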
Step (3): A termination condition is set to avoid generating fragments that are too small. If the remaining file size to be transmitted is smaller than the file fragment size of the initial stage, the final remaining fragment is transmitted immediately.
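The termination condition of step (3) amounts to one comparison before each assignment. In this sketch the function and parameter names are hypothetical; `proposed` stands for the fragment size the strategy would otherwise grant.

```python
def next_fragment_size(remaining, proposed, initial_fragment_size):
    """Apply the termination condition of step (3).

    remaining: amount of the file not yet handed out.
    proposed: fragment size the strategy would normally grant.
    initial_fragment_size: fragment size set in the initial stage.
    """
    if remaining <= 0:
        return 0                      # transfer is complete
    if remaining < initial_fragment_size:
        return remaining              # send the whole remainder at once
    return min(proposed, remaining)   # never hand out more than is left


# Example with made-up numbers: 5 units remain and the initial fragment
# size was 10, so the remainder is sent as one final fragment of 5 units.
last = next_fragment_size(5, 40, 10)
```

Sending the remainder in one piece avoids splitting it into fragments whose request overhead would outweigh their transfer time.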
The advantages of the invention are illustrated below by comparative experiments.
Before the experiments, the traditional co-allocation algorithms and the method of the invention are compared qualitatively, as shown in Table 1.
Table 1: Comparison of co-allocation algorithms
[Table 1 appears as an image in the original publication and is not reproduced here.]
Algorithm testing and analysis
(1) Setting of input parameters
For the dynamic adjustment strategy to achieve the best transmission efficiency, its input parameters must be set before the experiments. There are three influencing factors: CPU idle state, free memory, and bandwidth. The proportion assigned to each factor must be set.
First, the influence of bandwidth must be understood. When the file is small, transmission speeds differ little. As the file to be transmitted grows, the influence of bandwidth becomes more evident. When the bandwidth proportion is below 0.6, transmission speed gradually decreases. When the bandwidth proportion is 0.8, transmission speed reaches its maximum.
Next, the influence of CPU computing power on transmission efficiency must be understood. Three machines with the same memory and bandwidth but different CPU types were tested. The results show that better CPU performance gives better transmission performance, but the effect is not large: transmission performance does not improve significantly as CPU performance improves.
Finally, the influence of memory size on transmission efficiency was tested. Three machines with different amounts of memory downloaded files from the server, and the results show that memory size likewise has no pronounced effect on transmission efficiency.
Taken as a whole, the experimental results show that CPU computing power and memory size can improve transmission performance, but not by much. This strengthens the conclusion that bandwidth is the main factor affecting transmission performance, and the proportion assigned to bandwidth should be greater than that of the other two factors. In the experiments that follow, the input parameters $R_{CPU}$, $R_{MEM}$, and $R_{BW}$ of the dynamic adjustment strategy are set in the ratio 1:1:8.
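Since the three proportions must sum to 1, the ratio 1:1:8 corresponds to the values 0.1, 0.1, and 0.8. The helper below is a hypothetical convenience function, not part of the original, that performs this normalization for any ratio.

```python
def ratios_to_proportions(cpu, mem, bw):
    """Convert a CPU : memory : bandwidth ratio into proportions summing to 1."""
    total = cpu + mem + bw
    if total <= 0:
        raise ValueError("the ratio terms must have a positive sum")
    return cpu / total, mem / total, bw / total


# The ratio used in the experiments:
r_cpu, r_mem, r_bw = ratios_to_proportions(1, 1, 8)  # (0.1, 0.1, 0.8)
```

The bandwidth proportion of 0.8 matches the value at which the text reports the highest transmission speed.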
(2) Construction of the experimental environment
A regionalized grid environment was built with 4 different regions. The regions are connected by 100 Mbps physical links in a fully connected topology. The configuration of each region is given in Table 2. Every server runs Globus Toolkit 4.0.0 or later, and the experiments use the data transfer tool GridFTP that it provides.
Table 2: Machine configuration by region
[Table 2 appears as an image in the original publication and is not reproduced here.]
(3) Experimental results and analysis
Two experiments were carried out to verify the performance of the method of the invention and the effect of the number of replica sources on that performance.
Experiment 1: verifying the performance of the method of the invention.
Region A acts as the client requesting the data transfer and obtains data from the 3 other regions. After the data request is issued and the LDAP server has successfully located the 3 data replicas, data files from 100 MB to 2 GB are tested with 5 methods: brute-force co-allocation, history-based co-allocation, the conservative load balancing strategy, the dynamic allocation strategy, and single-replica transmission. Single-replica transmission obtains the data separately from regions B, C, and D, and the fragment length in history-based co-allocation is 5% of the file length.
From the results of Experiment 1, the following conclusions can be drawn:
(1) When transmitting a small file, such as 100 MB, the benefit of the method of the invention is not evident, and single-replica transmission can even outperform the multi-replica transmission of the method. This is because the method's own computation time accounts for too large a share of the total transmission time. As the file size increases, this share decreases, and at 2 GB the method of the invention performs twice as well as single-site transmission.
(2) When the file is larger than 100 MB, the two dynamic co-allocation methods, namely the conservative load balancing strategy and the method of the invention, outperform single-replica transmission and the other two co-allocation methods in overall transmission performance. The method of the invention completes the transmission task in the shortest time and has the best transmission performance.
(3) In the experimental results, single-replica transmission sometimes outperformed the traditional co-allocation methods. This is because the transmission performance of traditional co-allocation depends on when the last file fragment finishes transmitting; when some servers transmit poorly, the overall performance of co-allocation suffers.
During transmission, the experiment also simulated the failure of one server. Single-replica transmission then waits indefinitely, whereas the method of the invention promptly reassigns the task that the failed server did not complete to servers that have finished transmitting. This effectively improves the reliability of grid data transmission, which benefits data sharing and collaborative work in complex, changing wide-area grid systems.
Experiment 2: verifying the effect of the number of replica sources on the performance of the method of the invention.
Files were downloaded with the method of the invention from several combinations of servers, and overall transmission performance is expressed as transmission speed. Table 3 lists all server combinations used in the experiment. For example, the server combination BD means that the file is downloaded from the servers in regions B and D.
Table 3: Replica source server combinations
Server combination name | Server combination
B | Region B
D | Region D
BD | Regions B and D
CD | Regions C and D
BCD | Regions B, C, and D
From the results of Experiment 2, the following conclusions can be drawn:
(1) When the file is small, for example 10 MB, the method of the invention has no advantage.
(2) In most cases, the overall performance of the method improves as the number of replica sources increases.
(3) The experimental results show that overall performance is best when the two replica servers C and D are selected, and that performance instead declines when a further server is added. Increasing the number of replica sources therefore does not necessarily improve overall performance, so a suitable number of replica sources must be selected for transmission performance to be at its best.
The above experiments show that the method proposed by the invention for network transmission in a grid environment achieves better transmission efficiency.

Claims (1)

1. A method for improving data transmission efficiency in a grid environment, characterized in that the method comprises the following steps:
Step (1): In the initial stage, each grid FTP server is assigned a file fragment transmission task of the same size, specifically:
First, the upper limit InitialPT of the initial file fragment size is set:
$$\mathrm{InitialPT} = \frac{\mathrm{ClientMaxBandwidth}}{\mathrm{NumberOfReplicaSource}}$$
where ClientMaxBandwidth denotes the maximum bandwidth of the client and NumberOfReplicaSource denotes the total number of replica sources;
Then an auxiliary parameter Score is defined according to the state of each server:
$$\mathrm{Score}_i = P_i^{CPU} \times R_{CPU} + P_i^{MEM} \times R_{MEM} + P_i^{BW} \times R_{BW}$$
where $R_{CPU} + R_{MEM} + R_{BW} = 1$; $\mathrm{Score}_i$ denotes the score of server i, 1 ≤ i ≤ n, and n is the number of servers;
$P_i^{CPU}$ denotes the percentage of CPU idle time of server i; $R_{CPU}$ denotes the defined proportion of the CPU load factor;
$P_i^{MEM}$ denotes the percentage of free memory of server i; $R_{MEM}$ denotes the user-defined proportion of the free memory factor;
$P_i^{BW}$ denotes the bandwidth from server i to the client as a percentage of the total server bandwidth; $R_{BW}$ denotes the user-defined proportion of the bandwidth factor;
Once $\mathrm{Score}_i$ of every server has been obtained, the weight $\mathrm{weighing}_i$ of each server is computed:
$$\mathrm{weighing}_i = \frac{\mathrm{Score}_i}{\sum_{k=1}^{n} \mathrm{Score}_k}$$
Finally, from the weight $\mathrm{weighing}_i$ obtained for each server, the size $\mathrm{NewPT}_i$ of the next file fragment to be transmitted by each server is computed:
$$\mathrm{NewPT}_i = \mathrm{ClientBandwidth} \times \mathrm{weighing}_i$$
Step (2): When a server completes its current file fragment transmission task, it is assigned a new file fragment transmission task, and if a transmission fails, it is retransmitted, specifically:
Each time server i finishes its current file fragment transmission, it obtains a new transmission task, which is derived from the real-time state of server i; the lighter a server's load, the larger the transmission task it is assigned;
Step (3): A termination condition is set: when the remaining file size to be transmitted is smaller than the file fragment size of the initial stage, the remaining file fragment to be transmitted is transmitted immediately.
CN2010101694276A 2010-05-11 2010-05-11 Method for improving data transmission efficiency in grid environment Expired - Fee Related CN101860479B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2010101694276A CN101860479B (en) 2010-05-11 2010-05-11 Method for improving data transmission efficiency in grid environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2010101694276A CN101860479B (en) 2010-05-11 2010-05-11 Method for improving data transmission efficiency in grid environment

Publications (2)

Publication Number Publication Date
CN101860479A CN101860479A (en) 2010-10-13
CN101860479B true CN101860479B (en) 2012-07-25

Family

ID=42946145

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2010101694276A Expired - Fee Related CN101860479B (en) 2010-05-11 2010-05-11 Method for improving data transmission efficiency in grid environment

Country Status (1)

Country Link
CN (1) CN101860479B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104184753B (en) 2013-05-20 2018-04-27 腾讯科技(深圳)有限公司 A kind of document transmission method and device
CN104168081B (en) 2013-05-20 2018-09-07 腾讯科技(深圳)有限公司 A kind of document transmission method and device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1997013A (en) * 2006-12-22 2007-07-11 华中科技大学 Grid data transfer system based on multiple copies with the quality assurance
CN101187931A (en) * 2007-12-12 2008-05-28 浙江大学 Distribution type file system multi-file copy management method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8903981B2 (en) * 2008-05-05 2014-12-02 International Business Machines Corporation Method and system for achieving better efficiency in a client grid using node resource usage and tracking


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
郭权 等.网格数据传输问题的一个优化算法.《大连理工大学学报》.2005,第45卷(第3期),438-442. *

Also Published As

Publication number Publication date
CN101860479A (en) 2010-10-13


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20101013

Assignee: Zhejiang Topthinking Information Technology Co., Ltd.

Assignor: Hangzhou Electronic Science and Technology Univ

Contract record no.: 2013330000096

Denomination of invention: Method for improving data transmission efficiency in grid environment

Granted publication date: 20120725

License type: Common License

Record date: 20130424

Application publication date: 20101013

Assignee: Zhejiang Maximal Forklift Co., Ltd.

Assignor: Hangzhou Electronic Science and Technology Univ

Contract record no.: 2013330000094

Denomination of invention: Method for improving data transmission efficiency in grid environment

Granted publication date: 20120725

License type: Common License

Record date: 20130423

LICC Enforcement, change and cancellation of record of contracts on the licence for exploitation of a patent or utility model
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20120725

Termination date: 20170511
