CN101483650B - File fast transmission method based on data grid under campus network circumstance - Google Patents

File fast transmission method based on data grid under campus network circumstance

Info

Publication number
CN101483650B
Authority
CN
China
Prior art keywords
data
module
ftp
function
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2009100246532A
Other languages
Chinese (zh)
Other versions
CN101483650A (en)
Inventor
王汝传
孔强
任勋益
付雄
邓松
邓勇
季一木
易侃
杨明慧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Post and Telecommunication University
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing Post and Telecommunication University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Post and Telecommunication University filed Critical Nanjing Post and Telecommunication University
Priority to CN2009100246532A priority Critical patent/CN101483650B/en
Publication of CN101483650A publication Critical patent/CN101483650A/en
Application granted granted Critical
Publication of CN101483650B publication Critical patent/CN101483650B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention discloses a method for fast transmission of campus network data files based on data grids. By adding an interface mechanism to the FTP system widely used in ordinary campus networks, the system can transition quickly to a data grid environment and use the advantages of data grids to transmit files quickly. Through the operation of a system discovery module, a resource discovery module, a function discovery module, a conversion module, a request-reply conversion module and the like, the resources of the original FTP system are imported automatically into the new data grid environment, and existing data transmission software remains compatible while the data grid is running. This lowers the barrier to using data grids and overcomes their disadvantages: a long application development cycle, difficulty in combining with existing resources, and difficulty for ordinary users to master.

Description

Fast file transmission method based on data grids in a campus network environment
Technical field
The present invention is a method for fast transmission of campus network data files based on data grids. By adding an interface mechanism on top of the FTP system widely used in ordinary campus networks, it transitions quickly to a data grid environment and uses the advantages of data grids to transmit files quickly. The invention builds the new data grid environment while changing the configuration of the existing running system as little as possible, and it remains compatible with existing data transmission software. It thereby lowers the threshold for using data grids and overcomes shortcomings such as the long development cycle of data grid applications, the difficulty of combining them with existing resources, and the difficulty ordinary users have in mastering them.
Background art
A campus network is a computer local area network, within the scope of a school, that provides resource sharing, information exchange and collaborative work for teaching, research, management and other educational activities. According to the 22nd Statistical Report on Internet Development in China, published by the China Internet Network Information Center (CNNIC) in July 2008, China had 253 million Internet users by the end of June 2008. Users with a junior college, bachelor's, or master's degree or above accounted for 31.2% of the total, that is, 78.93 million people, which is a huge user group. In addition, 13.1% of Chinese Internet users chose to go online at school. Campus network construction, with its huge user base and demand, is therefore very important. Whether the technology currently used in campus networks can meet the growing file storage and transmission requirements of so large a user base is also a major issue.
At present, operators generally use a traditional FTP (File Transfer Protocol) system to store data. FTP is among the most widely used protocols on the Internet today, has a large base of technology and resources, and is one of the standard data transfer protocols of the Internet. A system using FTP provides centralized, interactive access: authorized users upload data files to the server, or download the files they need from it, through a client. A traditional FTP system also hides the details of the interconnected computers and of the transfer process, so it is well suited to transferring data across heterogeneous networks.
However, the growth of the campus network user base and of large files such as multimedia has brought a great volume of network data access and transmission, and the demand for large-scale data sharing keeps increasing. The ordinary FTP system, being centralized, is less and less able to keep up with this development. Meanwhile, the development of data grids offers a new choice.
A data grid is a grid oriented toward large-scale distributed storage and processing. It connects distributed, heterogeneous storage and data resources through a high-performance network and provides mechanisms that let users access and process large-scale distributed data sets transparently. It can handle today's rapidly expanding resource data well.
However, building a data grid environment currently faces three main problems: (1) Building an entirely new data grid system is time-consuming and laborious. In many cases construction must begin from the most basic system, the original long-running system is abandoned completely, and systems and software with the corresponding functions must be developed again. (2) The data resources, user permissions, logging functions and so on of the existing FTP system must be arranged again; they cannot be migrated conveniently, quickly and automatically. (3) Users must learn new software to transfer data, yet users tend to choose the software they are used to, so building a brand-new data grid environment limits the user population.
Existing technical schemes for building data grids on campus networks are basically designs for constructing a new data grid environment. They do not consider making reasonable use of the resources of the existing FTP system, and they do not provide a data grid scheme that identifies existing resources automatically and supports the users' original software.
Summary of the invention
Technical problem: the purpose of the invention is to provide a fast file transmission method based on data grids in a campus network environment that quickly upgrades campus network FTP into a data grid and solves the problem that data grid environments are inconvenient to use, thereby providing the resource and operational guarantees for large-scale data transmission that exploits the advantages of data grids.
Technical scheme: the invention adds a conversion mechanism to the original campus network FTP system so that it transitions smoothly to a campus network data grid environment. On the basis of the grid functions provided by the grid middleware OGSA-DAI (a grid middleware that simplifies access to data in a data grid), the system is extended in two respects. On the one hand, the system discovery module, resource discovery module, function discovery module, format conversion module, data message module and release module import the original data and control resources into the new data grid environment. On the other hand, the request-reply conversion module automatically converts the original data transmission software's use of the ordinary FTP system into use of the new data grid environment. The upgrade to the data grid environment is thus seamless at the user level, the advantages of the data grid are brought into full play, and high-speed data transmission is achieved.
Step 1: the original system is prepared for upgrading. First the system discovery module is started. It searches for FTP systems that are currently installed and in use or running in memory, obtains their name, version number, installation path and configuration file path, and reads the corresponding contents. Then the resource discovery module is run. Using the FTP system name, version and configuration file path read in the previous step, it reads and parses the specific configuration information to find the data resources stored on the storage nodes, scans the path, size and type of each data resource and the permissions assigned to each user, and obtains the data description information, which the data message module stores in the database;
Step 2: the function discovery module is run. Using the same FTP system name, version and configuration file path, it obtains and parses the specific configuration information of the target environment to find the additional functions of the particular FTP system, such as logging, backup and timing functions, which the data message module stores in the database;
Step 3: before the upgrade, the format conversion module is started. It converts the data description information that has been found into a form the new data grid can recognize, maps the functions that have been found onto the corresponding functions of the data grid system, and converts the requirements described in the configuration file into requirement descriptions that can be implemented;
Step 4: a data grid system environment is built by the usual data grid construction method and OGSA-DAI is installed. Once the data grid environment has been set up and tested as usable, the release module is run. It publishes the converted data resources into the data grid environment through the functions provided by OGSA-DAI, and it submits the converted functional requirements, also through the functions provided by OGSA-DAI, to the corresponding functional modules of the data grid system. The publishing step is repeated until all the original resources have been converted;
Step 5: finally the request-reply conversion module is started to respond to FTP requests from the user's original software. When a new user request is found, the module parses the original FTP request into a data grid transmission request, passes it to the data grid environment, and converts the result into the original FTP reply information, which it returns to the user;
Step 6: after the above upgrade work is finished, the system waits for client file transfer requests. When a client sends a file transfer request, the system responds and determines whether the user is using the original system's FTP transmission software or transmission software that supports data grids. If it is the original system's old FTP transmission software, the system communicates with it through the request-reply conversion module and invokes parallel data transmission and block data transmission under the data grid, so that files are transferred quickly. If it is new transmission software that supports data grids, the request is handed directly to the data grid environment, which uses parallel data transmission, block data transmission and third-party (tripartite) data transmission.
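The dispatch in Step 6 and the idea of block-wise parallel transmission can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names (`block_ranges`, `fetch_block`, `parallel_fetch`, `handle_request`), the client labels `"ftp"` and `"grid"`, and the use of an in-memory byte string in place of a storage node are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def block_ranges(size, block_size):
    """Split the byte range [0, size) into consecutive (offset, length) blocks."""
    return [(off, min(block_size, size - off)) for off in range(0, size, block_size)]


def fetch_block(source, offset, length):
    # Hypothetical stand-in for reading one block from a storage node.
    return source[offset:offset + length]


def parallel_fetch(source, block_size=4, workers=3):
    """Fetch all blocks concurrently and reassemble them in their original order."""
    ranges = block_ranges(len(source), block_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in the order of the inputs.
        parts = pool.map(lambda r: fetch_block(source, *r), ranges)
        return b"".join(parts)


def handle_request(client_kind, source):
    """Step 6 dispatch: old FTP clients go through the conversion module,
    grid-aware clients are handed to the data grid directly."""
    if client_kind == "ftp":
        return "via request-reply conversion module", parallel_fetch(source)
    if client_kind == "grid":
        return "direct to data grid", parallel_fetch(source)
    raise ValueError("unknown client kind: " + client_kind)
```

In this sketch both paths end in the same block-parallel fetch; the only difference is whether the request first passes through the conversion layer, which mirrors the two branches described in Step 6.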
Beneficial effects: the growth of the campus network user base and of large files such as multimedia has brought a great volume of network data access and transmission, and the demand for large-scale data sharing keeps increasing. The traditional centralized FTP system is less and less able to keep up with this development: it is easily limited by the performance of a particular server, and its overall load is hard to balance. An ordinary distributed FTP system, for its part, cannot easily change its topology dynamically, and its data sharing is insufficient. Although a data grid system can overcome these shortcomings, building an entirely new one is time-consuming and laborious. In many cases construction must begin from the most basic system, the original long-running system is abandoned completely, systems and software with the corresponding functions must be developed again, and a convenient, fast and stable transition is impossible. Users must also learn new software to transfer data, which limits the user population.
The solution provided by this scheme lets the original FTP system transition smoothly to a data grid environment. On the one hand, it makes full use of existing resources and remains compatible with the original FTP transmission software, so the upgrade to the data grid environment is seamless at the user level. On the other hand, it uses OGSA-DAI to realize the data grid environment quickly, providing unified management and resource discovery for the whole grid, while parallel data transmission, block data transmission and third-party data transmission under the data grid make full use of network resources and achieve high-speed data transmission.
Description of drawings
Fig. 1 is the overall flowchart of campus data grid file transmission,
Fig. 2 is the module relationship diagram,
Fig. 3 is the parallel data transmission diagram,
Fig. 4 is the block data transmission diagram,
Fig. 5 is the third-party data transmission diagram,
Fig. 6 is the integrated data transmission diagram.
Embodiment
A specific implementation is given below, as shown in Fig. 1.
I. Architecture
The whole scheme comprises the system discovery module, resource discovery module, function discovery module, data message module, format conversion module, release module and request-reply conversion module. Their relationships are shown in Fig. 2.
The individual modules are described below.
System discovery module: the function of this module is to find FTP systems that are installed or in use. Under Windows, for example, it can search the registry for the installation information of common FTP server software, or scan the running services and use the service description and service name to find the specific processes that provide FTP. After an existing FTP system is found, the module scans and analyzes it to obtain its name, version number, location, configuration file name and configuration file location. It then compares these with the common systems stored in advance, selects the corresponding parsing method according to the comparison result, and parses the contents of the configuration file to obtain the resource locations and user information, which generally comprise the following parts:
1. Resource locations: a list of the locations of existing data files.
2. Network description: IP information and the like.
3. User description: the users in use, in effect, or deleted, and the configuration of each user.
4. Domain information: information describing each domain and the list of users under it.
5. Permission information: the permissions each user holds, such as read, write, append, modify and delete, with a description of the corresponding resource locations.
6. Other information: other descriptive information.
The function of the system discovery module is thus to read out the descriptive configuration information of FTP services built with different software so that, after semantic-level conversion, it can be imported into the new data grid environment, which can then provide the same functions as the old FTP system. The system discovery module also provides the basis for the other discovery modules.
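As an illustration of how a parsed configuration could be turned into the items listed above, the following Python sketch reads a made-up INI-style configuration. The file layout, the section names (`server`, `user:<name>`) and the function name `discover` are assumptions; real FTP servers each use their own configuration format, which is why the module selects a parsing method per product.

```python
import configparser

# Hypothetical configuration of an imaginary FTP server, for illustration only.
SAMPLE_CONFIG = """
[server]
name = ExampleFTP
version = 1.0
ip = 192.0.2.10
port = 21

[user:alice]
home = /srv/ftp/alice
permissions = read,write,delete

[user:bob]
home = /srv/ftp/bob
permissions = read
"""


def discover(config_text):
    """Parse the configuration into a system discovery record."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    users = {}
    for section in parser.sections():
        if section.startswith("user:"):
            name = section.split(":", 1)[1]
            raw = parser[section]["permissions"]
            users[name] = {
                "home": parser[section]["home"],
                "permissions": [p.strip() for p in raw.split(",")],
            }
    return {
        "name": parser["server"]["name"],
        "version": parser["server"]["version"],
        "network": {
            "ip": parser["server"]["ip"],
            "port": parser.getint("server", "port"),
        },
        # Resource locations are taken to be the users' home directories.
        "resource_locations": sorted({u["home"] for u in users.values()}),
        "users": users,
    }
```

The returned record covers the resource locations, network description, user description and permission information; domain information is omitted because the sample configuration has no notion of domains.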
Resource discovery module: the function of this module is to find existing data resources, for example the data files each user has uploaded. It scans the storage at the resource locations found by the system discovery module and stores the valid scanning results through the data message module. The main items scanned are:
1. User information: the user to whom the file belongs.
2. Location information: the path of the file and its physical file name.
3. Date and time: the creation and modification times of the file.
4. Attribute information: the permission attributes of the file, which are described differently on different systems, such as read-only, system and hidden under Windows, or read, write and execute under Linux.
5. File size: the size of the file.
The main function of the resource discovery module is to prepare for the smooth import of resources into the data grid environment. It scans and saves the information about data that already exists in the original system so that other modules can publish it into the data grid environment automatically, avoiding inefficient manual work.
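A scan of this kind might look like the following Python sketch, which walks one resource location and records the items listed above. The record field names and the `owner` argument are assumptions (the patent derives ownership from the FTP configuration). Only the modification time is recorded, because a portable creation time is not available from `os.stat` on every platform.

```python
import os
import stat
import tempfile


def scan_resources(root, owner):
    """Walk one resource location and describe every file found under it."""
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            info = os.stat(path)
            records.append({
                "user": owner,                         # user information
                "path": path,                          # location information
                "modified": info.st_mtime,             # date and time (seconds since epoch)
                "mode": stat.filemode(info.st_mode),   # attribute information, e.g. '-rw-r--r--'
                "size": info.st_size,                  # file size in bytes
            })
    return records


def demo():
    """Create a throwaway directory with one file and scan it."""
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "notes.txt"), "wb") as handle:
            handle.write(b"hello grid")
        return scan_resources(root, owner="alice")
```

In the scheme described here, each such record would then be handed to the data message module for storage rather than returned to the caller.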
Function discovery module: the function of this module is to preserve the other specific service functions of the original system so that the transition is smooth. The system discovery module locates the specific system; this module then checks whether that system contains common FTP extension functions. If it does, the module analyzes the configuration file that describes them and works out what each extension function means, such as a logging, backup or timing function.
Data message module: the function of this module is to record the scanning results of the system discovery module, resource discovery module and function discovery module. It can be implemented with any of various databases.
Format conversion module: the function of this module is to convert the scanning results of the system discovery module, resource discovery module and function discovery module, as stored by the data message module, into resources and functions that the data grid can recognize. The conversion has two aspects: conversion of the data description format and conversion of functions.
Data description format conversion: the descriptions of data files in the original FTP system, and of the user information associated with each file, are converted into descriptions of data under the new data grid resources. If required, the data files can also be copied or moved to a new location.
Function conversion: using the function descriptions found by the function discovery module, the module looks for functional modules in the data grid environment that are equivalent to each extension function, calculates the parameters needed to achieve the same function, and saves the result.
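The data description format conversion can be pictured as a mapping from an FTP-side file record to a grid-side description. The sketch below is only an illustration: the target schema (`logical_name`, `rights`, `unmapped`), the `campus-grid` namespace and the permission vocabulary are hypothetical and do not come from OGSA-DAI or from the patent.

```python
import posixpath

# Hypothetical mapping from FTP permission names to grid-side rights.
PERMISSION_MAP = {
    "read": "grid:read",
    "write": "grid:write",
    "append": "grid:write",
    "modify": "grid:write",
    "delete": "grid:delete",
}


def to_grid_description(record, permissions, namespace="campus-grid"):
    """Convert one FTP file record into a grid resource description."""
    rights = sorted({PERMISSION_MAP[p] for p in permissions if p in PERMISSION_MAP})
    # Permissions with no grid equivalent are kept for manual review.
    unmapped = sorted(p for p in permissions if p not in PERMISSION_MAP)
    logical_name = "{}:/{}/{}".format(
        namespace, record["user"], posixpath.basename(record["path"])
    )
    return {
        "logical_name": logical_name,
        "physical_path": record["path"],
        "owner": record["user"],
        "size": record["size"],
        "rights": rights,
        "unmapped": unmapped,
    }
```

Keeping the unmapped permissions, instead of silently dropping them, corresponds to the manual configuration path the description provides for cases where automatic conversion is inaccurate.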
Release module: the function of this module is to publish the converted data stored by the data message module into the data grid environment. This scheme uses OGSA-DAI (Open Grid Services Architecture - Data Access and Integration) for this purpose.
OGSA-DAI is a middleware whose design goal is to provide a simple way to access and integrate data in a grid environment. OGSA-DAI comprises the following components:
1. Grid Data Service (GDS): a service through which a data resource, such as a relational database, XML data or a file resource, can be accessed.
2. Factory: a service used to create a GDS instance for accessing a specific data resource.
3. Service group registry: through it, the required GDS can be found, or a factory that can create the required GDS.
4. Perform document: an XML document in which the user describes the operations to be carried out on the GDS, for example copying or converting files or operating on a database.
5. Response document: an XML document that contains the result returned after the GDS has processed the perform document.
6. Activity (activities): the core that implements program functions and realizes the various applications.
The usual flow is:
1. The data grid container is run;
2. According to the specific requirement, the user requests a factory lookup through the service group registry;
3. A GDS is created by the corresponding factory;
4. The user sends a perform document to the GDS to interact with it;
5. The GDS returns a response document;
6. The user terminates the GDS or lets it expire automatically.
Using this flow provided by OGSA-DAI, data files whose format has been converted are published to the corresponding GDS through a perform document, and the corresponding activity is run to publish the data file. For functional modules, the corresponding GDS is located, the commands and parameters are published to the GDS through a perform document, and the activity is run to realize the function.
Request-reply module: the function of this module is to remain compatible with the software used with the original FTP system, so that users need not learn and use new transmission software, while leaving users free to choose new transmission software that supports the data grid environment and so better exploits the advantages of data grids. It is implemented as follows:
When the user starts the transmission software and a command is submitted to the server, the module first identifies the type of transmission software. If it is ordinary FTP software, the module analyzes the FTP protocol it uses, determines the specific protocol version, and then virtually generates the service of the corresponding original FTP version. No actual service is created, but reply information is returned, so that to the transmission software there is no difference from a real service. Next, when the user issues a file transfer request or another request through the transmission software, the module converts it into the request with the same function in the data grid environment and publishes it to the GDS as an XML perform document. It then extracts the contents of the response document, converts them into a reply in the format of the old FTP version, and returns that reply to the user's transmission software.
Through this module, the transmission software that users are accustomed to remains compatible, making the system upgrade transparent to users. The flow is shown in Fig. 2.
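The two directions of this conversion can be sketched in Python as follows. The FTP verbs and reply codes (226, 250, 502, 550) are those defined in RFC 959; the grid-side request format (a dictionary with `operation` and `target`) and the function names are hypothetical, and the sketch skips the actual XML perform document and response document.

```python
# Hypothetical mapping from FTP commands to grid-side operations.
FTP_TO_GRID_OPERATION = {
    "RETR": "download",
    "STOR": "upload",
    "DELE": "delete",
    "LIST": "list",
}


def ftp_to_grid(command_line):
    """Translate one FTP command line into a grid request, or None if unsupported."""
    verb, _, argument = command_line.strip().partition(" ")
    operation = FTP_TO_GRID_OPERATION.get(verb.upper())
    if operation is None:
        return None
    return {"operation": operation, "target": argument.strip()}


def grid_to_ftp(request, succeeded):
    """Translate the outcome of a grid request back into an FTP reply line."""
    if request is None:
        return "502 Command not implemented."
    if not succeeded:
        return "550 Requested action not taken."
    if request["operation"] == "delete":
        return "250 Requested file action okay, completed."
    # Transfers and listings end by closing the data connection.
    return "226 Closing data connection."
```

From the client's side the replies are indistinguishable from those of an ordinary FTP server, which is the transparency property the module aims for.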
II. Method flow
1, Smooth upgrade flow
1. The user first selects versions of OGSA-DAI and of the Globus Toolkit Java Web Service Core, installs them by the usual procedure, builds a general data grid environment, and tests that it works normally;
2. The system discovery module is run to scan the system before the upgrade. It first detects the operating system in use and, according to the result, switches to the scanning module for that operating system, because the details of scanning differ between operating systems. The means of reading configuration information and parsing configuration files must likewise be handled separately.
3. Using the scanning method appropriate to the detected operating system, the module looks for FTP systems that are installed and in use, including searching storage and memory for instances with the characteristics of an FTP system. It scans and analyzes each automatically to obtain its name, version number, location, configuration file name and configuration file location, compares these with the characteristics of common systems stored in advance, selects the corresponding parsing method according to the comparison result, and parses the contents of the configuration file to obtain information such as resource locations and user descriptions. Finally the results are recorded by the data message module for use by the other modules.
4. If the automatic detection result is inaccurate or the user needs to modify it, manual configuration of the system discovery module is entered. The user selects a user-defined FTP system from a list, together with its configuration files and related configuration information. Where necessary the configuration information is parsed semantically: the user selects from a custom list the meaning of each configuration item and of the parameters it uses. The results are recorded by the data message module for use by the other modules.
5. The resource discovery module is run. It scans the specific resource locations that the system discovery module obtained and the data message module recorded, and compares them with the resource description information to obtain, for each file, the corresponding user information, location information, date and time, attribute information and file size. Finally the results are stored by the data message module.
6. If the automatic detection result is inaccurate or the user needs to modify it, manual configuration of the resource discovery module is entered. First the user specifies where the file storage information is recorded and the format in which the file information is recorded, and a rescan is performed according to this information to obtain the scanning result. Second, if the format of the recorded file information cannot be analyzed automatically, the user must parse the format of the description file semantically, explaining the corresponding meanings and selecting from a list the actual meaning of each parameter description. Finally, the file storage results obtained by scanning with the user-defined information either overwrite the old scanning results or are stored as new results through the data message module.
7. The function discovery module is run; its purpose is to preserve the other specific service functions of the original system so that the transition is smooth. Using the information provided by the system discovery module, it locates the specific system and then judges from the configuration file whether the system before the upgrade contains extension functions of common FTP. If it does, the module analyzes the configuration that describes the extension functions and works out what each means, such as a logging, backup or timing function, and the results are recorded by the data message module.
8. If the automatic detection result is inaccurate or the user needs to modify it, manual configuration of the function discovery module is entered. First the functions that can be realized are listed for the user to choose from. After the user selects a function, the parameters needed to realize it must be configured, and the user can select suitable parameters according to the actual situation. Second, if the list of realizable functions does not contain a function that the old FTP system could realize, two cases are distinguished. In the first case, the function of the old FTP system is a combination of several functions that the new system supports; the functions supported by the new system can then be merged, and unneeded modules removed as the user requires. In the second case, the function of the old FTP system is not supported by the new system; it can then be abandoned, or another function extension module can be used through an interface, as the user requires. Finally the descriptions and parameters of the user-defined functions are stored by the data message module.
9. The conversion module is run. The conversion has two aspects: conversion of the data description format and conversion of functions.
Data description format conversion: the descriptions of data files in the original FTP system, and of the user information associated with each file, are converted into descriptions of data under the new data grid resources. If required, the data files can also be copied or moved to a new location.
Function conversion: using the function descriptions found by the function discovery module, the module looks for functional modules in the data grid environment that are equivalent to each extension function, calculates the parameters needed to achieve the same function, and saves the result.
10. The release module is run: perform documents are submitted to the GDS corresponding to the data to be published, the corresponding services such as the GDS are set up, the permissions corresponding to each user are created, and the data files and related content collected by the resource discovery module are published. On the one hand, each data file is published into the OGSA-DAI data grid environment according to its converted data description format. On the other hand, the parameters calculated by the function conversion are loaded and submitted through a perform document to a GDS that can realize the same function, achieving the same functional effect as the old system.
2, Flow for compatibility with ordinary FTP software
The request-reply module is run so that the ordinary FTP software used with the old system remains compatible and the upgrade is transparent to users. Once running, the module takes over the corresponding ports, according to the information collected by the system discovery module, and waits for commands submitted by the user's transmission software.
When the user starts the transmission software and a command is submitted to the server, the module first identifies the type of transmission software. If it is ordinary FTP software, the module analyzes the FTP protocol it uses, determines the specific protocol version, and then virtually generates the service of the corresponding original FTP version. No real service is actually created; only reply information is returned, so that to the transmission software there is no difference from a real service. Next, when the user issues a file transfer request or another request through the transmission software, the module converts it into the request with the same function in the data grid environment and publishes it to the GDS as an XML perform document. It then extracts the contents of the response document, converts them into a reply in the format of the old FTP version, and returns that reply to the user's transmission software. During the actual file transfer, the module connects the two parties to the transfer through an interface of its own. From the point of view of the transmission software, it is communicating with a specific, ordinary centralized FTP server; from the point of view of the storage nodes, the data transfer is built on the data grid file transfer mechanism.
3, High-speed transmission flow under the data grid environment
Both the traditional centralized passive-mode FTP system and the distributed FTP system use data transmission processes that have certain shortcomings.
The data transmission process of traditional centralized passive-mode FTP is as follows: the client (FTPClient) uses the PASV command to send a transfer request to the server (FTPServer); FTPServer replies with the transfer address and the corresponding port; FTPClient then sets up a data transmission channel as described in the reply and transfers the data; after the transfer is finished, the corresponding transfer process on FTPServer disconnects, and FTPServer returns an end-of-transfer reply to FTPClient. This completes one transfer.
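For reference, the server's reply to PASV in standard FTP (RFC 959) has the form `227 Entering Passive Mode (h1,h2,h3,h4,p1,p2)`, where the data port is `p1 * 256 + p2`. The short Python sketch below shows how a client could extract the address and port from such a reply; the function name `parse_pasv_reply` is an assumption, and the exact wording of the reply text varies between servers.

```python
import re


def parse_pasv_reply(reply):
    """Extract (host, port) from a '227 Entering Passive Mode (h1,h2,h3,h4,p1,p2)' reply."""
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if not reply.startswith("227") or match is None:
        raise ValueError("not a valid PASV reply: " + reply)
    numbers = [int(group) for group in match.groups()]
    if any(number > 255 for number in numbers):
        raise ValueError("field out of range in PASV reply: " + reply)
    host = ".".join(str(number) for number in numbers[:4])
    # The port is sent as two bytes, high byte first.
    port = numbers[4] * 256 + numbers[5]
    return host, port
```

For example, the reply `227 Entering Passive Mode (192,0,2,10,19,137)` describes host 192.0.2.10 and port 19 * 256 + 137 = 5001, to which the client then opens the data connection.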
This centralized approach has the following shortcomings:
(1) It is limited by server performance, including limits on server storage space, server I/O speed, and server outbound bandwidth.
(2) The load is unbalanced. In practice, some FTP servers are overloaded while many other FTP servers sit idle.
Because the command processing and the data processing of the traditional centralized passive-mode FTP system can be completely separated, the distributed FTP system arose, in which the data channel is separated from the command channel. Although this design alleviates the unbalanced server load to some extent, it still has the following two shortcomings in a campus network:
(1) The topology is hard to extend: adding or removing a server greatly inconveniences administrators and users, the topology cannot be changed easily and quickly, and dynamic addition and removal of servers is not considered.
(2) Data sharing is insufficient: the main bottleneck of a campus network is its interface to the public network. Although many resources exist inside the campus network, users are often unaware of them and therefore search for and download them from the public network. Making full use of the resources inside the campus network would greatly improve network quality and reduce unnecessary bandwidth consumption.
A grid is an extension of traditional distributed technology and addresses these shortcomings of the traditional distributed FTP system well. This scheme therefore adopts an FTP architecture based on a data grid. On the one hand, an existing FTP system can be upgraded quickly; on the other hand, the original FTP programs and data remain compatible, so users need not change anything.
Besides compatibility with the original FTP, the FTP system structure under the data grid must implement the following functions in order to exploit the advantages of the grid:
1. Unified resource management in the data grid. Data of the old system is published into the grid by the release module, which guarantees the integrity of the resource representation and places resource information under centralized, unified management.
2. Unified user management in the data grid. User information of the old system is published by the release module into the data grid user management module, which reflects the grid characteristics of combining shared and exclusive use of resources and combining unified resource management with grid node autonomy.
3. Unified authorization and authentication management in the data grid. Through the resource discovery module, the old user permissions and the old authorization and authentication management are moved under the unified authorization and authentication mechanism of the data grid.
With functions 1, 2, and 3, resources, users, and permissions are managed uniformly, which resolves the hard-to-extend topology of the traditional distributed FTP system.
4. Monitoring and discovery of grid FTP resources.
In a data grid environment, the set of available resources changes: new data resources or service functions may be added, existing data resources or service functions may be deleted or modified, and resource attributes may change. Because of this dynamism, discovering such changes and deciding how to allocate resources in response matters a great deal. Likewise, because the grid is dynamic and distributed, monitoring, that is, tracking currently existing data resources or running service functions for a given purpose, is also essential.
This scheme uses the interface provided by OGSA-DAI together with MDS (Monitoring and Discovery System) to discover and monitor resources and function services. MDS provides state information about grid resources. Its main components are the Index service, the Trigger service, and the Aggregator service.
MDS is a set of Web services for monitoring and discovering data grid resources and services. It can collect, manage, index, and answer queries about descriptive information on resources, function services, and node state. The core component of MDS is the Index Service, which gathers information about the various resources in the data grid and provides a query and subscription interface for that information. Through this interface, resources that want to join or leave the data grid system can be managed effectively. With MDS, the resources a user needs are easy to find, which to some extent solves the insufficient data sharing of the traditional distributed FTP system.
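The idea of an index that resources join and leave, and that offers query and subscription, can be sketched in a few lines of Python. This is an in-memory illustration only; the class and method names are assumptions and do not reproduce the MDS or OGSA-DAI interfaces.

```python
class ResourceIndex:
    """Toy index: resources register and deregister, clients query or subscribe."""

    def __init__(self):
        self._entries = {}       # resource name -> attribute dictionary
        self._subscribers = []   # callbacks invoked as callback(event, name)

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def register(self, name, attributes):
        self._entries[name] = dict(attributes)
        self._notify("registered", name)

    def deregister(self, name):
        # Notify only if the resource was actually present.
        if self._entries.pop(name, None) is not None:
            self._notify("deregistered", name)

    def query(self, **wanted):
        """Return the names of resources whose attributes match all given values."""
        return sorted(
            name for name, attrs in self._entries.items()
            if all(attrs.get(key) == value for key, value in wanted.items())
        )

    def _notify(self, event, name):
        for callback in self._subscribers:
            callback(event, name)
```

A subscriber sees each join and leave as it happens, which corresponds to the monitoring role described above, while `query` corresponds to discovery.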
5. Use the parallel data transmission, block (striped) data transmission, and tripartite (third-party) data transfer of GridFTP under the data grid. The main purpose is to use resources sensibly and handle load balancing properly, which further addresses the insufficient data sharing of the traditional FTP system. These modes are combined with the request-reply conversion module to stay compatible with the original FTP transfer software.
GridFTP is a protocol for data transfer in grid environments that the Globus project team created by extending the existing FTP protocol and related techniques. OGSA-DAI provides an interface through which GridFTP can be used, which makes it easy to design with and use the parallel data transmission, block data transmission, and tripartite data transfer that GridFTP offers.
(1) Parallel data transmission:
Unlike traditional FTP, which uses only one TCP channel for a data transfer, this mode uses several parallel TCP connections and can therefore effectively raise the total data transfer bandwidth. GridFTP supports parallel data transfer through extensions to the command and data channels.
The parallel data mode is shown in Figure 3. Several TCP connections are set up, and different parts of the file are sent over different data channels to the destination node. If n data channels are set up between the source node and the destination node and the transfer rate of channel i is $v_i$ ($1 \le i \le n$), the total transfer rate can reach $V = \sum_{i=1}^{n} v_i$. As long as the bandwidth at the source node and at the destination node has not reached its upper limit, the transfer speed can be raised by adding new data channels, which reduces the effect of other factors along the transmission path on the data transfer. When one of the TCP channels is blocked, data transfer on the other TCP channels is unaffected. The bandwidth between the source node and the destination node is therefore used well.
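The relation between per-channel rates and the total rate can be made concrete with a short Python sketch. It assumes, as the text implies, that the sum of the channel rates is capped by the bandwidth available at the endpoints; the function name and the single `endpoint_limit` parameter are assumptions made for this example.

```python
def aggregate_rate(channel_rates, endpoint_limit):
    """Total rate of parallel channels: the sum of v_i, capped by the endpoint bandwidth."""
    return min(sum(channel_rates), endpoint_limit)
```

For example, three channels at 10, 12, and 8 units under a limit of 100 give a total of 30, and a blocked channel (rate 0) lowers the total only by its own share.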
The concrete flow of a parallel data transmission is usually as follows:
Step 1. Create a GridFTP parallel transfer instance and authenticate;
Step 2. Set the transfer type and transfer mode;
Step 3. Set the degree of parallelism;
Step 4. Set the other parameters that the FTP transfer needs;
Step 5. Connect;
Step 6. Read the data file in blocks and transfer them;
Step 7. When the transfer ends, close the connection.
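The steps above can be sketched locally in Python, with worker threads standing in for parallel TCP channels and a file copy standing in for the network transfer. This is an illustration under those assumptions, not GridFTP itself; authentication and connection setup (Steps 1, 2, 4, and 5) are omitted, and all names are hypothetical.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def split_ranges(size, parallelism):
    """Split [0, size) into at most `parallelism` contiguous (offset, length) ranges."""
    if parallelism < 1:
        raise ValueError("parallelism must be at least 1")
    base, extra = divmod(size, parallelism)
    ranges, offset = [], 0
    for index in range(parallelism):
        length = base + (1 if index < extra else 0)
        if length:
            ranges.append((offset, length))
        offset += length
    return ranges


def parallel_copy(src, dst, parallelism=4):
    """Copy src to dst, with each worker moving one byte range (Steps 3, 6, 7)."""
    size = os.path.getsize(src)
    # Pre-size the destination so every worker can write at its own offset.
    with open(dst, "wb") as out:
        out.truncate(size)

    def copy_range(byte_range):
        offset, length = byte_range
        with open(src, "rb") as fin, open(dst, "r+b") as fout:
            fin.seek(offset)
            fout.seek(offset)
            fout.write(fin.read(length))

    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        # list() forces completion and re-raises any worker exception.
        list(pool.map(copy_range, split_ranges(size, parallelism)))
```

Because each range carries its own offset, the ranges can arrive in any order and a slow worker does not hold up the others.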
(2) Block data transmission:
In a data grid environment, large-scale data can be distributed across multiple storage nodes, which is called block data storage; the transfer of data between these storage nodes is block data transmission. A block data transmission can either scatter the different parts of one complete data set to different destination nodes, or gather related data subsets distributed over several source nodes to one location (the same node), where a new data set is generated according to the relationships between the subsets.
The mode for block data storage is shown in Figure 4. The core idea is to use multiple TCP channels to transfer the data subsets distributed over different nodes. On top of parallel data transmission, block data transmission further improves overall bandwidth utilization and balances the load better.
The concrete flow of a block data transmission is usually as follows:
Step 1. Create a GridFTP block transfer instance and authenticate;
Step 2. Set the transfer type and transfer mode, and add the block transfer flag before the mode;
Step 3. Set the degree of parallelism;
Step 4. Set the other parameters that the FTP transfer needs, and set the protection buffer on each destination node;
Step 5. Connect;
Step 6. Read the data file in blocks and transfer them;
Step 7. When the transfer ends, close the connection.
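The gathering case of block data transmission, where subsets held on several source nodes are combined into one data set on a single node, can be sketched as follows. The sketch assumes each block arrives tagged with its byte offset in the final data set; that tagging scheme and the function name are assumptions for illustration.

```python
def assemble_blocks(blocks, total_size):
    """Rebuild one data set from (offset, data) blocks that arrive in any order."""
    buffer = bytearray(total_size)
    expected = 0
    # Process blocks in offset order and insist that they tile the data set exactly.
    for offset, data in sorted(blocks, key=lambda block: block[0]):
        if offset != expected:
            raise ValueError(f"gap or overlap at offset {expected}")
        buffer[offset:offset + len(data)] = data
        expected = offset + len(data)
    if expected != total_size:
        raise ValueError("blocks do not cover the declared size")
    return bytes(buffer)
```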
(3) Tripartite data transfer:
This mode serves to manage separate storage nodes, so that a user or an application can use data resources at several sites. It also creates a new security mechanism for establishing trust between the two parties of a remote storage communication: authentication is performed by a third party outside the two communicating parties, which yields a security system that better matches common practice. GridFTP provides authenticated data transfer under third-party control.
The data transfer model under third-party control is shown in Figure 5. It contains one GridFTP client and two GridFTP servers. The client sets up a control channel to each of the two servers and performs permission authentication and auditing. Once these succeed, control commands are sent over the two control channels to control the corresponding servers. A data channel is then established between the servers, its parameters are set according to the control commands that the client sends over the control channels, and data is transferred over the data channel according to those parameters.
The concrete flow of a tripartite data transfer is usually as follows:
(1) Create two GridFTP instances and set up a control channel to each controlled server;
(2) Perform user permission authentication with both servers;
(3) Set the size of the protection buffer;
(4) Set the transfer type and mode of the data;
(5) Set the parameters needed for parallel transmission and block transmission;
(6) Set the transfer mode of the controlled servers;
(7) Set up a data channel between the controlled servers;
(8) Transfer the data over the data channel;
(9) Close the connections that were set up.
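The key property of this flow, that the client only controls the transfer while the data moves directly between the two servers, can be shown with a small in-memory Python simulation. The classes and methods below are hypothetical stand-ins, not the GridFTP API; buffer and mode settings (steps 3 to 6) are left out.

```python
class StorageServer:
    """Toy server: holds files and accepts commands over a simulated control channel."""

    def __init__(self, name, users):
        self.name = name
        self._users = dict(users)   # user name -> password
        self.files = {}             # file name -> bytes
        self.authenticated = False

    def authenticate(self, user, password):
        self.authenticated = self._users.get(user) == password
        return self.authenticated

    def send_to(self, peer, filename):
        """Push a file straight to the peer over the simulated server-to-server data channel."""
        if not (self.authenticated and peer.authenticated):
            raise PermissionError("both control channels must be authenticated")
        peer.files[filename] = self.files[filename]


def third_party_transfer(source, target, filename, user, password):
    # Steps (1)-(2): the client opens a control channel to each server and authenticates.
    if not (source.authenticate(user, password) and target.authenticate(user, password)):
        raise PermissionError("authentication failed")
    # Steps (7)-(8): the data flows between the servers; the client never handles it.
    source.send_to(target, filename)
```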
Overall, the concrete flow that combines parallel data transmission, block data transmission, and tripartite data transfer is shown in Figure 6. Software that supports the data grid can use all three transfer methods for fast file transfer; the FTP transfer software of the original system can also use parallel and block transmission through the request-reply conversion module to increase its transfer speed.
The object of the invention is to add a conversion mechanism to the original campus network FTP system so that it transitions smoothly to a campus network data grid environment. On top of the grid functions provided by OGSA-DAI, the system is extended in two respects. First, the system discovery module, resource discovery module, function discovery module, format conversion module, data message module, and release module import the old data and control resources into the new data grid environment. Second, the request-reply conversion module automatically converts the old transfer software's use of the ordinary FTP system into use of the new data grid environment. The upgrade to the data grid environment is therefore seamless at the user level, the advantages of the data grid are exploited more fully, and high-speed data transfer is achieved.
The concrete steps are:
1. The user first selects versions of OGSA-DAI and Globus Toolkit Java Web Service Core, installs them following the usual procedure, builds a general data grid environment, and tests that it works normally;
2. Configure and build the MDS module, the grid-wide user management module, the grid-wide resource management module, and the grid-wide permission authentication module, and test them;
3. Configure the GridFTP transfer module and test it;
4. Run the system discovery module to scan the pre-upgrade system. It first detects the operating system in use and, according to the result, switches to the scan module for that operating system, because the details of scanning differ between operating systems. Reading configuration information and parsing configuration files also have to be handled per system.
5. Using the scanning method that matches the detected operating system, search for FTP systems that are installed and in use, including looking in storage and in memory for instances with the characteristics of an FTP system. Each one found is scanned and analyzed automatically to obtain its name, version number, installation path, configuration file name, and configuration file location. These are compared with the stored characteristics of common systems, and according to the comparison result the matching parsing method is selected to parse the configuration file and obtain information such as resource locations and user information descriptions. Finally, the data message module records the results for use by other modules.
6. If the automatic detection result is inaccurate or the user needs to change it, enter the manual configuration of the system discovery module. The user selects from a list the user-defined FTP system and the files and configuration information associated with its configuration. Where necessary, the configuration information is parsed semantically: the user selects from a custom list the meaning that each configuration item expresses and the meaning of the parameters used. The data message module records the results for use by other modules.
7. Run the resource discovery module. It scans the concrete resource locations that the system discovery module obtained and recorded in the data message module, and compares them against the resource description information to obtain, for each file, the corresponding user information, location, date and time, attributes, and file size. The results are stored with the data message module.
8. If the automatic detection result is inaccurate or the user needs to change it, enter the manual configuration of the resource discovery module. First, the user specifies where the records of file storage information are kept and the format of those file records, and the scan is rerun with the user-specified information. Second, if the format of the file records cannot be analyzed automatically, the user parses the format of the description file semantically, explaining what each part means by selecting the actual meaning of each parameter from a list. Finally, the file storage results obtained with the user-defined information either overwrite the old scan results or are stored as new results through the data message module.
9. Run the function discovery module, whose purpose is to preserve the other specific service functions of the original system and so achieve a smooth transition. Using the information from the system discovery module, it locates the concrete system and, together with the configuration file, determines whether the pre-upgrade system contains extended functions beyond ordinary FTP. If it does, the module analyzes the configuration that describes the extended functions and works out what each one means, for example a logging function, a backup function, or a timing function, and records it with the data message module.
10. If the automatic detection result is inaccurate or the user needs to change it, enter the manual configuration of the function discovery module. The module first lists the functions it can implement for the user to choose from. After the user selects a function, the parameters needed to implement it must be configured, and the user can choose suitable values for the actual situation. If the list of implementable functions lacks a function of the old FTP system, there are two cases. First, the old function is a combination of several functions that the new system supports: the supported functions are then combined, and modules the user does not need are removed. Second, the old function is not supported by the new system: depending on the user's requirements, it is either dropped or provided through an interface by another function extension module. Finally, the description and parameters of the user-defined functions are stored with the data message module.
11. The conversion module performs two kinds of conversion: conversion of the data description format and conversion of functions.
Data description format conversion: the descriptions of data files and of the user information associated with each file in the original FTP system are converted into descriptions of data under the new data grid resources. If required, the data files can also be copied or moved to a new location at the same time.
Function conversion: for each function description found by the function discovery module, find the function module in the data grid environment that provides the same function as the extended function, compute the parameters needed to achieve the same function, and save the result.
12. Run the release module. By submitting perform documents to the corresponding GDS and related services, it publishes the data, configures the corresponding GDS, creates the permissions for each user, and publishes the data files and related content collected by the resource discovery module. On the one hand, each data file is published into the OGSA-DAI data grid environment in the converted data description format. On the other hand, the parameters computed by the function conversion are loaded and submitted in perform documents to the GDS that can implement the same function, so that the result matches the function of the old system.
13. Run the request-reply conversion module so that the ordinary FTP software used with the old system stays compatible and the upgrade is transparent to the user. Once running, the module takes over the corresponding ports according to the information collected by the system discovery module and waits for the user's transfer software to submit commands.
When the user starts the transfer software and its commands are submitted to the server, the module first identifies the type of transfer software. If ordinary FTP software is in use, the module analyzes the file transfer protocol it uses, determines the specific protocol version, and virtually generates a service corresponding to the original old FTP version. No real service is actually created; the module only returns reply messages, so that from the transfer software's point of view there is no difference from a real service. Next, when the user issues a file transfer request or another request through the transfer software, the module converts it into the request with the same function in the data grid environment and publishes it to the GDS as an XML perform document; the content of the response document is then extracted, converted into a reply in the format of the old FTP version, and returned to the user's transfer software. During the actual file transfer, the module connects the two transfer parties through its own interface. From the transfer software's side, the peer appears to be an ordinary centralized FTP server; from the storage node's side, the data transfer is built on the data grid file transfer mechanism.
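The discovery and conversion steps above (steps 4, 5, 7, 9, and 11) can be sketched as a small Python pipeline. Everything specific here is an assumption made for illustration: the server name `example-ftpd`, its configuration paths, the option names in `FEATURE_OPTIONS`, and the XML layout are hypothetical and do not come from the patent or from OGSA-DAI.

```python
import os
import platform
import xml.etree.ElementTree as ET

# Hypothetical table of known FTP servers and candidate configuration paths per OS.
KNOWN_FTP_CONFIGS = {
    "Linux": {"example-ftpd": ["/etc/example-ftpd.conf", "/etc/example-ftpd/ftpd.conf"]},
}
# Hypothetical configuration options that indicate extended functions.
FEATURE_OPTIONS = {"log_enable": "logging", "backup_dir": "backup", "schedule": "timing"}


def discover_ftp_systems(os_name=None, exists=os.path.exists, known=KNOWN_FTP_CONFIGS):
    """Steps 4-5: pick the scan table for the detected OS and look for config files."""
    os_name = os_name or platform.system()
    found = []
    for name, paths in known.get(os_name, {}).items():
        path = next((p for p in paths if exists(p)), None)
        if path is not None:
            found.append({"name": name, "config_path": path})
    return found


def scan_resources(root):
    """Step 7: record relative path, size, and modification time of each file under root."""
    records = []
    for dirpath, _dirs, files in os.walk(root):
        for filename in files:
            full = os.path.join(dirpath, filename)
            info = os.stat(full)
            records.append({"path": os.path.relpath(full, root),
                            "size": info.st_size, "mtime": int(info.st_mtime)})
    return sorted(records, key=lambda record: record["path"])


def discover_functions(config_text, feature_options=FEATURE_OPTIONS):
    """Step 9: map key=value options to the extended functions they switch on."""
    features = {}
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        feature = feature_options.get(key)
        if feature and value.upper() not in ("", "NO", "OFF", "FALSE"):
            features[feature] = value
    return features


def to_grid_description(record):
    """Step 11: express one file record as an XML resource description."""
    resource = ET.Element("resource", {"path": record["path"]})
    ET.SubElement(resource, "size").text = str(record["size"])
    ET.SubElement(resource, "mtime").text = str(record["mtime"])
    return ET.tostring(resource, encoding="unicode")
```

Passing `exists` and the lookup tables as parameters keeps the automatic scan separate from the manual configuration of steps 6, 8, and 10: a user-supplied table can replace the defaults without changing the scan logic.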

Claims (1)

  1. A method for fast file transfer based on a data grid in a campus network environment, characterized in that, on top of the grid functions provided by the grid middleware OGSA-DAI, the system is extended in two respects: on the one hand, the system discovery module, resource discovery module, function discovery module, format conversion module, data message module, and release module import the old data and control resources into the new data grid environment; on the other hand, the request-reply conversion module automatically converts the old transfer software's use of the ordinary File Transfer Protocol (FTP) system into use of the new data grid environment, so that the upgrade to the data grid environment is seamless at the user level, the advantages of the data grid are exploited more fully, and high-speed data transfer is achieved;
    Step 1: Prepare the original system for the upgrade and upgrade it: first start the system discovery module, which searches for FTP systems that are currently installed and in use or running in memory, obtains each one's name, version number, installation path, and configuration file path, and reads the corresponding content; then run the resource discovery module, which uses the FTP system name, version number, and configuration file path read in the previous step to read and parse the specific configuration information, thereby discovers the data resources stored on the storage nodes, and scans the path, size, and kind of each data resource and the permissions assigned to each user to obtain the data description information; the data message module then stores the data description information in the database;
    Step 2: Run the function discovery module, which likewise uses the FTP system name, version number, and configuration file path read earlier to obtain and parse the specific configuration information of the target environment, thereby discovers the additional logging, backup, and timing functions attached to each concrete FTP system, and stores them in the database with the data message module;
    Step 3: Before upgrading, start the format conversion module, which converts the discovered data description information into a form that the new data grid can recognize, maps the discovered functions to the corresponding functions in the data grid system, and converts the requirements described in the configuration files into requirement descriptions that can be implemented;
    Step 4: Build the data grid system environment by the general data grid construction method and install OGSA-DAI; after the data grid environment has been set up and tested as usable, run the release module, which on the one hand publishes the converted data resources into the data grid environment through the functions provided by OGSA-DAI, and on the other hand submits the converted function requirements, through the functions provided by OGSA-DAI, to the corresponding function modules of the data grid system; repeat the publishing step until all original resources have been converted;
    Step 5: Finally, start the request-reply conversion module, which responds to the FTP requests of the user's original system software: when it detects a new user request, it parses the original FTP request into a data grid transfer request, passes it to the data grid environment, and converts the result into an original FTP reply that is returned to the user;
    Step 6: After the above upgrade work is complete, wait for client file transfer requests; when a client sends a file transfer request, the system responds to the request and determines whether the user is using the original system's FTP transfer software or transfer software that supports the data grid; if it is the old FTP transfer software of the original system, the system communicates with it through the request-reply conversion module and invokes parallel data transmission and block data transmission under the data grid to transfer the file quickly; if it is transfer software that supports the data grid, the request is handed directly to the data grid environment, which uses parallel data transmission, block data transmission, and tripartite data transfer.
CN2009100246532A 2009-02-25 2009-02-25 File fast transmission method based on data grid under campus network circumstance Expired - Fee Related CN101483650B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2009100246532A CN101483650B (en) 2009-02-25 2009-02-25 File fast transmission method based on data grid under campus network circumstance


Publications (2)

Publication Number Publication Date
CN101483650A CN101483650A (en) 2009-07-15
CN101483650B true CN101483650B (en) 2011-08-03

Family

ID=40880577

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2009100246532A Expired - Fee Related CN101483650B (en) 2009-02-25 2009-02-25 File fast transmission method based on data grid under campus network circumstance

Country Status (1)

Country Link
CN (1) CN101483650B (en)





Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20090715

Assignee: Jiangsu Nanyou IOT Technology Park Ltd.

Assignor: Nanjing Post & Telecommunication Univ.

Contract record no.: 2016320000220

Denomination of invention: File fast transmission method based on data grid under campus network circumstance

Granted publication date: 20110803

License type: Common License

Record date: 20161121

LICC Enforcement, change and cancellation of record of contracts on the licence for exploitation of a patent or utility model
EC01 Cancellation of recordation of patent licensing contract

Assignee: Jiangsu Nanyou IOT Technology Park Ltd.

Assignor: Nanjing Post & Telecommunication Univ.

Contract record no.: 2016320000220

Date of cancellation: 20180116

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20110803

Termination date: 20180225
