CN1275157C - Open network computing platform based on improved linux virtual server structure - Google Patents

Open network computing platform based on improved linux virtual server structure

Publication number
CN1275157C
Authority
CN
China
Prior art keywords
server
module
volunteer
backup
application
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN 200410012810
Other languages
Chinese (zh)
Other versions
CN1560742A (en)
Inventor
金海
鄢娟
章勤
韩宗芬
龚文君
康达祥
铁婧
王成伟
张超
袁泉
王玮
邹德清
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology filed Critical Huazhong University of Science and Technology
Priority to CN 200410012810 priority Critical patent/CN1275157C/en
Publication of CN1560742A publication Critical patent/CN1560742A/en
Application granted granted Critical
Publication of CN1275157C publication Critical patent/CN1275157C/en
Anticipated expiration legal-status Critical
Expired - Fee Related legal-status Critical Current

Landscapes

  • Hardware Redundancy (AREA)
  • Computer And Data Communications (AREA)

Abstract

The present invention discloses an open network computing platform based on an improved Linux Virtual Server (LVS) architecture. The platform comprises a server system and a volunteer machine system. The server system comprises a front-end machine and a plurality of Haoyu servers, and the volunteer machine system comprises a plurality of Haoyu volunteer machines. The front-end machine comprises a load balancing module and a front-end machine backup module; each Haoyu server comprises a storage system, an application module, a monitoring module, a server backup module and a scheduling module; each Haoyu volunteer machine comprises a volunteer interface module, a control module, a communication module and a computation module. The advantages of the invention include: the implementation details of the network computing platform are transparent to application scientists; multiple applications share volunteer machine resources; and application selection, application upgrades and the resource allocation strategy are transparent to volunteers. The server system inherits all the characteristics of LVS. Volunteer software for Windows and Linux, a user-friendly volunteer interface and computation threads started at the lowest priority help attract more volunteers and so give the platform strong computing capacity.

Description

Open network computing platform based on an improved Linux Virtual Server architecture
Technical field
The invention belongs to the field of network computing and specifically relates to an open network computing platform based on an improved LVS (Linux Virtual Server) architecture.
Background art
With the development of Internet technology, people have begun to consider how to build network computing platforms that use the idle resources widely distributed across the Internet to run computation-intensive distributed applications. The computers that contribute idle resources are called volunteer machines. In the past few years, well-known network computing projects around the world have included SETI@Home, GIMPS, D2OL and XtremWeb, and their success has confirmed the feasibility of this computing model.
Among these projects, SETI@Home, GIMPS and D2OL are widely used. They have the following advantages:
(1) The server system is powerful and can handle requests from a large, dynamically changing population of volunteer machines efficiently, reliably and stably. The SETI@Home, GIMPS and D2OL projects have as many as 4.5 million, 200,000 and 55,000 volunteer machines respectively. These machines may join or leave the network computing platform at any time and send various requests to the server; only a very powerful server system can handle these requests efficiently, reliably and in a timely manner.
(2) They provide volunteer machine systems for multiple operating system platforms, including Windows. Windows is the most popular operating system on the Internet, so a volunteer machine system for the Windows platform means that most computers on the Internet become potential volunteer machines. Support for multiple operating system platforms allows many other computers on the Internet to be used by the network computing platform as well.
However, all of the above projects are dedicated platforms designed for one specific application: SETI@Home is dedicated to searching for extraterrestrial life, GIMPS to searching for Mersenne primes, and D2OL to searching for a cure for SARS. The structure of a dedicated platform is shown in Fig. 1. Such platforms have the following shortcomings:
(1) The implementation details of the network computing platform are not transparent to application scientists. To run an application, application scientists have to build a dedicated network computing platform themselves.
(2) Resources are difficult to share between different applications. For example, during the large-scale SARS outbreak, people might have wished to devote more computing resources to searching for a SARS cure. But the D2OL project had only 55,000 volunteer machines, and although the SETI@Home project had as many as 4.5 million registered volunteer machines, D2OL could hardly make use of them.
(3) Application selection, application upgrades and the resource allocation strategy are not transparent to volunteers. A volunteer who wishes to run several applications has to download and install the volunteer machine software of each project and manually coordinate how the different applications use computing resources; whenever application code is upgraded, the volunteer has to upgrade the volunteer machine software.
In 2001, G. Fedak published the paper "XtremWeb: a generic global computing system" at the Cluster Computing and the Grid conference. The paper first proposed the design of an open network computing platform, XtremWeb, whose structure is shown in Fig. 2. Unlike a dedicated platform, XtremWeb is not designed for one particular application and can run multiple different applications. Its open design largely overcomes the above shortcomings of dedicated platforms. However, XtremWeb has not been widely adopted because of the following problems:
(1) The server system is built on a single PC, so it can hardly handle requests from a large, dynamically changing population of volunteer machines, and can hardly provide a service that is available 24 hours a day, 7 days a week.
(2) The XtremWeb project has so far developed only a Linux-based volunteer machine system, so it can hardly make use of Internet computers running other operating systems (especially Windows).
(3) The volunteer machine software is not user-friendly. It provides no graphical interface, so the volunteer can hardly interact with it while it runs; computation threads run at normal priority and, once started, immediately take up all CPU resources, making it hard for the volunteer to work normally.
Summary of the invention
The object of the invention is to overcome the shortcomings of existing network computing platforms and to provide an open network computing platform based on an improved LVS architecture. The open design of the invention makes the implementation details of the network computing platform transparent to application scientists and lets multiple applications share volunteer machine resources; its server system has all the good characteristics of the LVS architecture.
The open network computing platform based on an improved Linux Virtual Server architecture provided by the invention comprises a server system based on the improved Linux Virtual Server architecture and a volunteer machine system that contributes idle computing resources.
The server system comprises a front-end machine and m Haoyu servers, where m is a positive integer; the volunteer machine system comprises n Haoyu volunteer machines, where n is a positive integer.
The front-end machine comprises a load balancing module and a front-end machine backup module. When a Haoyu volunteer machine sends a request to the front-end machine, the load balancing module forwards the request to the scheduling module of a Haoyu server. The front-end machine backup module builds and maintains, according to an adaptive strategy, a backup relationship vector table between the Haoyu servers, and sends this table to the server backup modules.
Each Haoyu server comprises a storage system, an application module, a monitoring module, a server backup module and a scheduling module. The storage system comprises a database system and a file system. The application module receives applications submitted by clients, divides each application into tasks with a small amount of computation, then, by command, stores the application information and task information in the database system and stores the binary code of the application and the data files of the tasks in the file system. The monitoring module reads information from the database system and displays it after processing, so as to monitor the running state of the Haoyu server. The server backup module backs up the database information and file information in the storage system to the corresponding server according to the backup relationship vector table. When the communication module of a Haoyu volunteer machine requests a computation task, the scheduling module uniformly schedules tasks according to the task information that the application module has written into the database system and assigns them, via the communication module of the Haoyu volunteer machine, to volunteer machines for execution; after the communication module returns a computation result, the scheduling module stores the result file in the file system and updates the task information in the database system.
Each Haoyu volunteer machine comprises a volunteer interface module, a control module, a communication module and a computation module. The volunteer interface module listens in real time for commands issued by the volunteer, executes them, and modifies the corresponding items in the configuration file accordingly. The control module runs in a loop: according to the settings in the configuration file, when the conditions for starting computation are met, it starts the communication module to request computation tasks from a Haoyu server; after the scheduling module returns a computation task, it starts the computation module, which computes the task at the lowest priority. During computation, the communication module periodically sends "alive" messages to the scheduling module, and the computation result is returned to the scheduling module through the communication module.
The front-end machine backup module builds and maintains the backup relationship vector table as follows. The servers are numbered from 1 to j. For a server numbered i1 with i1 < j, the information of server i1 is backed up to server i1+1; the information of server j is backed up to server 1. When a new server joins, it is numbered j+1; the information of server j is then backed up to server j+1, the information of server j+1 is backed up to server 1, and the backup relationships between the other servers remain unchanged. When the server numbered i2 leaves, the numbers of all servers greater than i2 are decreased by 1, server i2-1 backs up to the new server i2, and the backup relationships between the other servers remain unchanged. Here i1, i2 and j are positive integers.
The server system and the volunteer machine system of this computing platform each separate the application module from the kernel software, so that applications can easily use the platform for computation, which overcomes the shortcomings of dedicated platforms. The server system, being based on the LVS architecture, inherits all the good characteristics of LVS, including high availability, scalability and a high performance-to-price ratio; the backup module developed on top of LVS greatly improves the fault tolerance of the system. Volunteer machine software for the Windows and Linux platforms makes most computers on the Internet potential Haoyu volunteer machines. The "people-oriented" design of the volunteer interface module lets the volunteer control the volunteer machine software easily and respects the volunteer's final control over the volunteer machine; the computation module starts computing in the background at the lowest priority and does not disturb the volunteer's work. Specifically, the invention has the following advantages and effects:
(1) The server system has all the good characteristics of the LVS architecture
The server system based on the LVS architecture inherits all the characteristics of LVS, including:
Scalability: as the number of volunteer machines grows and the server load increases, the server system can be expanded to meet demand without degrading service quality.
High availability: even if some hardware or software of the server system fails, the service of the whole system remains available 24 hours a day, 7 days a week.
High performance-to-price ratio: compared with a supercomputer, the network computing platform should have greater computing power, while the whole system is economical and affordable to implement.
(2) Good fault tolerance
Fault tolerance bears on the availability of the system and on users' confidence in it. The backup module developed on top of LVS gives the server system good fault tolerance. A sudden failure of any one server does not lead to catastrophic consequences: using the backup information, the server can quickly be restored to a correct state from before the failure.
(3) User-friendly volunteer machine software helps attract more volunteers
The release of volunteer machine software for the Windows and Linux platforms makes most computers on the Internet potential Haoyu volunteer machines; the user-friendly volunteer interface module gives the volunteer control over the volunteer machine; computation threads started at the lowest priority let Haoyu genuinely compute with the idle resources of volunteer machines without disturbing the volunteer's work. Together, these three features help attract more Internet users to join the platform.
(4) The implementation details of the network computing platform are transparent to application scientists
Application scientists only need to submit an application to Haoyu to compute on the existing network computing platform; they do not need to build their own network computing platform or recruit volunteers. Application scientists are thus freed from the implementation details of the network computing platform and can concentrate on research in their own fields.
(5) Multiple applications share volunteer machine resources
All applications running on Haoyu share the volunteer machine resources. The benefits are: 1. idle volunteer machine resources are fully used: even if some application temporarily has too few tasks to compute, the use of idle volunteer machine resources is not affected; 2. when some application is more urgent, a large amount of computing resources can be concentrated on it quickly and easily.
(6) Application selection and upgrades are transparent to volunteers
Through the volunteer interface module, the volunteer can freely choose one or more applications to compute and the strategy for allocating idle resources among them; the volunteer machine software computes applications according to the volunteer's settings. The volunteer machine software can also upgrade application code transparently.
Description of drawings
Fig. 1 is a structural diagram of a dedicated network computing platform, where 1 is the server and 2.1, ..., 2.n are volunteer machines;
Fig. 2 is a structural diagram of XtremWeb, where 3 is the XtremWeb server, 4.1, ..., 4.n are XtremWeb volunteer machines and 5.1, ..., 5.k are clients;
Fig. 3 is a structural diagram of the Haoyu network computing platform, where 6 is the front-end machine, 7.1, ..., 7.m are Haoyu servers and 8.1, ..., 8.n are Haoyu volunteer machines;
Fig. 4 is a structural diagram of a Haoyu server;
Fig. 5 is a structural diagram of a Haoyu volunteer machine;
Fig. 6 is a module structure diagram of the LVS-based open network computing platform;
Fig. 7 shows how the adaptive strategy adjusts when a new node joins;
Fig. 8 shows how the adaptive strategy adjusts when an existing node leaves;
Fig. 9 is the workflow diagram of a Haoyu server;
Fig. 10 is the workflow diagram of a volunteer machine.
Embodiment
The open network computing platform provided by the invention is also called the Haoyu network computing platform; its structure is shown in Fig. 3. In terms of working principle, the platform can be divided into two parts: the first is server system 20, based on the improved LVS architecture, comprising front-end machine 6 and Haoyu servers 7.1, ..., 7.m; the second is volunteer machine system 21, widely distributed across the Internet and contributing idle computing resources, comprising Haoyu volunteer machines 8.1, ..., 8.n.
The structure of Haoyu server 7 is shown in Fig. 4; it comprises storage system 9, application module 10, monitoring module 11, server backup module 12 and scheduling module 13.
The structure of Haoyu volunteer machine 8 is shown in Fig. 5; it comprises volunteer interface module 14, control module 15, communication module 16 and computation module 17.
As shown in Fig. 6, server system 20 is built on a cluster and adopts the LVS architecture. One node serves as front-end machine 6, and the other nodes serve as Haoyu servers 7.1, ..., 7.m. Load balancing module 18 and front-end machine backup module 19 are set up on the front-end machine; front-end machine backup module 19 was developed by the inventors on top of LVS.
Haoyu servers 7.1, ..., 7.m have the same composition. In the figure, the components of Haoyu server 7.1 are labelled storage system 9.1, application module 10.1, monitoring module 11.1, server backup module 12.1 and scheduling module 13.1; correspondingly, the components of Haoyu server 7.m are labelled storage system 9.m, application module 10.m, monitoring module 11.m, server backup module 12.m and scheduling module 13.m.
Each time it communicates, Haoyu volunteer machine 8 actually sends its request to front-end machine 6; load balancing module 18 then seamlessly dispatches the request to a suitable Haoyu server 7. Load balancing module 18 is part of the LVS architecture itself, and the invention uses its original functionality. After scheduling module 13 on that server has handled the request, it sends the result directly to Haoyu volunteer machine 8. The structure of the server cluster is transparent to the volunteer machine, which appears to be accessing a single high-performance, highly available server.
Front-end machine backup module 19 builds and maintains, according to the adaptive strategy, the backup relationship vector table between the Haoyu servers; server backup module 12 backs up information to the designated Haoyu server according to the vector table. With the backup modules in place, a failed server can recover its information from its backup server quickly and easily.
The advantage of the adaptive strategy is that when a server joins or leaves, the strategy adjusts itself with minimal impact on the existing backup relationships between the Haoyu servers. The adjustment method is shown in Figs. 7 and 8 and works as follows. The servers are numbered from 1 to j. For a server numbered i with i < j, the information of server i is backed up to server i+1; the information of server j is backed up to server 1. When a new server joins, it is numbered j+1; the information of server j is then backed up to server j+1, the information of server j+1 is backed up to server 1, and the backup relationships between the other servers remain unchanged. When the server numbered p leaves, the numbers of all servers greater than p are decreased by 1, server p-1 backs up to the new server p, and the backup relationships between the other servers remain unchanged.
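The ring-shaped backup relationship described above can be illustrated with a minimal Python sketch. This is not the patent's implementation: the function names, the use of an ordered list to stand for the server numbering, and the node name "Node12" in the usage note are assumptions made for illustration only.

```python
def backup_relations(servers):
    """Map each server to the server that holds its backup.

    servers[k] stands for the server numbered k+1. Each server backs up
    to the next one, and the last server backs up to server 1.
    (With a single server the ring degenerates to a self-reference.)
    """
    n = len(servers)
    return {s: servers[(k + 1) % n] for k, s in enumerate(servers)}


def join(servers, new_server):
    """A joining server receives the next number, j+1."""
    return servers + [new_server]


def leave(servers, gone):
    """Removing a server renumbers all higher-numbered servers down by 1."""
    return [s for s in servers if s != gone]
```

For example, `backup_relations(["Node9", "Node11"])` pairs the two servers with each other. After `join(..., "Node12")`, only the relationships of the last two servers change; after `leave(..., "Node11")`, only the predecessor of the removed server gets a new backup target.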
Each Haoyu server 7 comprises storage system 9, application module 10, monitoring module 11, server backup module 12 and scheduling module 13. The storage system is the one that an existing server already has and comprises a database system and a file system.
Through the interface provided by application module 10, a client can submit an application; the submission includes the application's executable file, data files, command-line parameters and so on. After receiving an application, application module 10 divides the data files and thereby splits the application into a large number of tasks, each with a small amount of computation. Application module 10 then installs the application by command: the application information is written into the database, and the binary code and data files are stored in designated directories. Finally, application module 10 stores the task information in the database by command and stores the task data files in designated directories.
Scheduling module 13 uniformly schedules the tasks waiting to be processed, based on the task information that application module 10 has written into the database, and assigns them to volunteer machines for execution. When communication module 16 of a volunteer machine requests a computation task, scheduling module 13 selects a suitable task for it according to the volunteer machine's operating system and CPU type and the applications requested; if the volunteer machine has not yet cached the binary code of the application, scheduling module 13 transmits it to communication module 16. After a task has been assigned, if scheduling module 13 does not receive an "alive" message from communication module 16 within a certain period, it schedules the task again and assigns it to another volunteer machine. After communication module 16 returns the computation result, scheduling module 13 stores the result file in the file system and updates the task information in the database system.
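The scheduling behaviour described above (matching tasks to a volunteer machine, and rescheduling a task whose "alive" messages stop arriving) can be sketched as follows. This is a simplified in-memory illustration, not the patent's implementation: the class name, the dictionary keys `os`, `cpu` and `app`, the timeout value and the injectable clock are all assumptions, and the real system keeps this state in a database.

```python
import time


class Scheduler:
    """In-memory sketch of task assignment with an "alive" timeout."""

    def __init__(self, tasks, alive_timeout=60.0, clock=time.monotonic):
        self.pending = list(tasks)   # tasks waiting to be assigned
        self.running = {}            # wid -> (task, time of last "alive")
        self.results = {}            # wid -> returned result
        self.alive_timeout = alive_timeout
        self.clock = clock

    def request_task(self, os_name, cpu_type, apps):
        """Return a task matching the machine's OS, CPU type and requested apps."""
        self._requeue_silent_tasks()
        for task in self.pending:
            if (task["os"], task["cpu"]) == (os_name, cpu_type) and task["app"] in apps:
                self.pending.remove(task)
                self.running[task["wid"]] = (task, self.clock())
                return task
        return None

    def alive(self, wid):
        """Record an "alive" message for a running task."""
        if wid in self.running:
            self.running[wid] = (self.running[wid][0], self.clock())

    def submit_result(self, wid, result):
        """Store the result of a running task and mark it finished."""
        if self.running.pop(wid, None) is not None:
            self.results[wid] = result

    def _requeue_silent_tasks(self):
        """Put tasks back in the queue when no "alive" arrived in time."""
        now = self.clock()
        for wid, (task, last_alive) in list(self.running.items()):
            if now - last_alive > self.alive_timeout:
                del self.running[wid]
                self.pending.append(task)
```

Passing the clock as a parameter is a design choice made here so that the timeout logic can be exercised without waiting in real time.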
While the server is running, monitoring module 11 monitors the running state of the Haoyu server in real time. The module can be designed according to user requirements. For example, the monitoring module reads application information, volunteer information, volunteer machine information and task information from the database and, after processing, displays it as tables, pie charts, bar charts and/or line charts, so that the administrator can monitor the current running state of Haoyu server 7 remotely, in real time, intuitively and conveniently. The monitoring pages can include: basic application information, volunteers' personal information, volunteer machine information, result tracking information for completed tasks, the history of the number of tasks completed by each volunteer machine, volunteer machine activity status and task status information.
Server backup module 12, following the backup relationships established by front-end machine backup module 19, applies a suitable backup policy to back up the important information in storage system 9 to storage system 9 of the corresponding server. On top of conventional full and incremental backups, the backup policy adds real-time backup, so that data is backed up synchronously whenever it changes. The database information that is backed up comprises application information, task information, volunteer information and volunteer machine statistics. The files that are backed up comprise the application binary code, the application data, the task data files and the computation results. Application information, volunteer information, application binary code and application data are backed up in real time; all other information is backed up incrementally once a day, and all information is fully backed up once a week.
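The backup policy above amounts to a small classification rule, sketched below. The item names, the function name and the choice of weekday for the weekly full backup are assumptions for illustration; the text only states which kinds of information are backed up in real time, daily or weekly.

```python
# Items the text says are backed up in real time (identifiers are hypothetical).
REALTIME_ITEMS = frozenset({
    "application_info",
    "volunteer_info",
    "application_binary",
    "application_data",
})


def backup_modes(item, weekday, full_backup_weekday=6):
    """Return the backup modes that apply to an item on a given weekday (0-6).

    Real-time items are synchronised on every change; everything else gets
    a daily incremental backup. All items get a weekly full backup; which
    weekday that falls on is an assumption of this sketch.
    """
    modes = ["realtime"] if item in REALTIME_ITEMS else ["incremental"]
    if weekday == full_backup_weekday:
        modes.append("full")
    return modes
```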
The workflow of the Haoyu server is shown in Fig. 9.
Haoyu volunteer machine 8 comprises volunteer interface module 14, control module 15, communication module 16 and computation module 17. As soon as the volunteer machine software starts, volunteer interface module 14 and control module 15 are started. Volunteer interface module 14 listens in real time for commands issued by the volunteer and executes them. The commands are:
(1) view the work log of the volunteer machine software;
(2) pause/resume computation;
(3) modify the applications the volunteer wishes to run and the idle resource allocation strategy;
(4) modify the volunteer's personal information;
(5) modify the volunteer machine's resource contribution information;
(6) view the current total CPU utilization and the CPU utilization of the computation threads;
(7) view the current total memory usage and the memory usage of the computation threads;
(8) view the animated introduction to the Haoyu network computing platform;
(9) view statistics;
(10) exit the volunteer machine software.
When the volunteer issues command (1), (6), (7), (8) or (9), volunteer interface module 14 calls the corresponding program segment and displays the information in the specified format; when the volunteer issues command (2), (3), (4), (5) or (10), volunteer interface module 14 modifies the corresponding items in the configuration file.
Control module 15 runs in a loop. According to the settings in the configuration file, it judges whether the conditions for starting computation are currently met. If not, it waits for a while. If so, it starts communication module 16, which requests from the server computation tasks for the applications the volunteer machine is currently suited to run and, at the same time, tells the server which of these applications' binary code is already cached. After a task has been obtained, computation module 17 is started and computes the task at the lowest priority. During computation, communication module 16 periodically sends "alive" messages to scheduling module 13. When computation has finished, communication module 16 sends the computation result to the Haoyu server.
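One pass of the control loop described above can be sketched as follows. This is an illustrative outline, not the patent's code: the function signature, the configuration keys `paused`, `apps` and `cached_binaries`, and the idea of sending one "alive" message per computation step are assumptions, and lowering the process priority is left out because it is platform-specific.

```python
def run_once(config, is_idle, request_task, compute_steps, send_alive, send_result):
    """Run one pass of the volunteer-side control loop.

    The callables stand in for the communication and computation modules:
    request_task(apps, cached) returns a task dict or None, compute_steps(task)
    yields intermediate results, send_alive(wid) and send_result(wid, result)
    talk to the server's scheduling module.
    """
    # Start conditions come from the configuration file and the machine state.
    if config.get("paused") or not is_idle():
        return "waiting"

    # Ask for a task and report which application binaries are cached.
    task = request_task(config["apps"], config.get("cached_binaries", []))
    if task is None:
        return "no task"

    # Compute step by step, reporting "alive" after each step.
    result = None
    for result in compute_steps(task):
        send_alive(task["wid"])

    send_result(task["wid"], result)
    return "done"
```

A caller would invoke `run_once` repeatedly, sleeping for a while whenever it returns `"waiting"` or `"no task"`.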
After Haoyu volunteer machine 8 starts, control module 15 uses the volunteer settings received by volunteer interface module 14 to determine which applications the volunteer wishes to run and the strategy for allocating idle resources among them, and then checks whether the volunteer machine has cached the binary code of these applications. When the computing conditions are met, control module 15 actively requests computation tasks for these applications from the server through communication module 16 and reports the binary code cache status of each application. Communication module 16 receives the tasks assigned by server scheduling module 13 together with the binary code of any application that is not yet cached. Control module 15 starts computation module 17 to execute the computation task; when computation has finished, communication module 16 returns the result to server scheduling module 13 and then requests the next task, and the cycle repeats.
The workflow of the volunteer machine is shown in Fig. 10.
Volunteer machine software has so far been developed for the Windows and Linux platforms, and extension interfaces are provided for other operating system platforms.
Example:
The server system of the Haoyu network computing platform is built on 3 nodes of a cluster; their basic configuration is shown in Table 1.
CPU       Memory  Hard disk  Operating system  Network
PIII 866  256M    30G        Linux 7.2         100M switch
Table 1: Hardware and network configuration of each node
One node serves as the front-end machine, and the other two serve as Haoyu servers.
The concrete implementation is as follows. All nodes are configured with LVS. Node 3 acts as the front-end machine and has both the real IP 211.69.206.236 and the virtual IP address 192.168.1.203. Node 9 and node 11 act as back-end servers and have the virtual IP addresses 192.168.1.220 and 192.168.1.211 respectively. The back-end servers run the Haoyu server software, and the front-end machine is responsible for maintaining the load balance of the back-end servers.
The volunteer machine system is built on PCs on the Internet; their basic configuration is shown in Table 2.
CPU                Memory        Hard disk     Operating system                Network
PIII 450 or above  64M or above  10G or above  Win2000, XP, 2003 or Linux 7.2  10M switch or above
Table 2: Hardware and network configuration of the volunteer machines
The implementation of the whole system is described as follows:
(1) Adaptive fault-tolerant backup system based on LVS
According to the adaptive strategy, the front-end machine builds and maintains the backup relationship vector table between the Haoyu servers; an example is shown in Table 3.
PeerName  PeerIP         Status  Port  BKPeerIP       Description
Node9     192.168.1.220  true    8848  192.168.1.211  Server1
Node11    192.168.1.211  true    8849  192.168.1.220  Server2
Table 3: Backup relationship vector table maintained by the front-end machine
The fields are explained as follows:
PeerName: the name of the Haoyu server;
PeerIP: the IP address of the Haoyu server;
Status: the current state of the Haoyu server; true means running normally, false means not running normally;
Port: the port on which the Haoyu server listens for backup information;
BKPeerIP: the IP address of the server to which the Haoyu server's information is backed up;
Description: descriptive information about the Haoyu server.
As Table 3 shows, the front-end machine currently maintains the backup information between two Haoyu servers, node9 and node11. The IP address of node9 is 192.168.1.220; it is currently active and listens on port 8848 for commands from the front-end machine backup system. According to the adaptive strategy, the important information of node9 is backed up to server 192.168.1.211. The backup information of node11 can likewise be read from the vector table.
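As an illustration of how the vector table can be read, the sketch below stores the two rows of Table 3 as Python dictionaries and looks up which server holds a given server's backup. The data structure and the function name are assumptions for illustration; only the field names and values come from Table 3.

```python
# The two rows of Table 3, keyed by the table's own field names.
VECTOR_TABLE = [
    {"PeerName": "Node9", "PeerIP": "192.168.1.220", "Status": True,
     "Port": 8848, "BKPeerIP": "192.168.1.211", "Description": "Server1"},
    {"PeerName": "Node11", "PeerIP": "192.168.1.211", "Status": True,
     "Port": 8849, "BKPeerIP": "192.168.1.220", "Description": "Server2"},
]


def backup_holder(table, peer_name):
    """Return the table row of the server that stores peer_name's backup."""
    by_ip = {row["PeerIP"]: row for row in table}
    for row in table:
        if row["PeerName"] == peer_name:
            return by_ip[row["BKPeerIP"]]
    raise KeyError(peer_name)
```

A recovery procedure for a failed server could start from such a lookup to find where its information was backed up.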
Node9 and node11 will carry out the unify backup of file system of data base set according to the regulation of vector table.The mysqldump+Crontab mode is adopted in the Database Systems backup, and the NFS+shellscript+Crontab mode is adopted in the file system backup.
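The backup relationships in Table 3 are consistent with the ring rule stated in claim 2: each server backs up to the next server in numbering order, and the last server backs up to the first. A minimal Python sketch of that rule follows. The function and variable names are illustrative assumptions rather than the patent's implementation, and keeping the servers in an ordered list stands in for the explicit renumbering that the claim describes.

```python
def build_backup_table(servers):
    """Map each server to the server that holds its backup (ring order).

    `servers` is an ordered list of server identifiers; position k holds
    the server numbered k + 1. With fewer than two servers there is no
    backup peer, so the table is empty.
    """
    n = len(servers)
    if n < 2:
        return {}
    return {servers[k]: servers[(k + 1) % n] for k in range(n)}


def add_server(servers, new_server):
    """A joining server takes the next number, i.e. the end of the list."""
    return servers + [new_server]


def remove_server(servers, leaving):
    """A leaving server is dropped; servers after it move up by one number."""
    return [s for s in servers if s != leaving]


# Hypothetical usage with the two back-end addresses from Table 3.
servers = ["192.168.1.220", "192.168.1.211"]
print(build_backup_table(servers))
```

Because the table is rebuilt from the ordered list, a join or a departure changes only the entries next to the affected position, which matches the claim's statement that the other backup relationships stay the same.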
(2) Server system that separates the application system from the kernel software
After receiving an application, the application system writes the application's information to the file app.conf. The main entries are as follows:
app.os.name: the operating system platform on which the application is suited to run;
app.cpu.type: the CPU type on which the application is suited to run;
app.name: the name of the application;
app.bin: the path where the application's binary code is stored;
app.client_id: the ID of the client that submitted the application;
app.priority: the priority of the application.
After app.conf has been edited, run the command make installApp. Based on the contents of app.conf, this command creates the application home directory ${HowU home directory}/db/${application name}, then creates the directory ${application home directory}/bin, copies the application's binary code from the app.bin directory to ${application home directory}/bin, and writes the application information into the apps table of the database. The application is then installed in the server system. Next, the application system must decompose the application and submit the tasks. At present, HowU mainly runs SPMD-type applications, so dividing an application amounts to dividing its data file, and each data file produced by the division corresponds to one task. Each task is assigned a wid, that is, a task ID, and the task's data file is named after the task ID. After decomposition, run the command javaSubTasks; the system creates a directory ${application home directory}/{task ID} for each task, copies the task's data file into that directory, and writes the task information into the works table of the database.
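The directory layout produced by these two commands can be sketched in Python as follows. This is only an illustration under stated assumptions: the `key=value` syntax for app.conf, the function names `parse_app_conf`, `install_app`, and `submit_tasks`, and the use of the data file's base name as the task ID are hypothetical, and the writes to the apps and works tables are omitted.

```python
import shutil
from pathlib import Path


def parse_app_conf(text):
    """Parse app.conf, assuming one `key=value` entry per line (assumed syntax)."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, value = line.split("=", 1)
            conf[key.strip()] = value.strip()
    return conf


def install_app(howu_home, conf):
    """Create <howu_home>/db/<app.name>/bin and copy the binaries from app.bin."""
    app_home = Path(howu_home) / "db" / conf["app.name"]
    bin_dir = app_home / "bin"
    bin_dir.mkdir(parents=True, exist_ok=True)
    for item in Path(conf["app.bin"]).iterdir():
        if item.is_file():
            shutil.copy2(item, bin_dir / item.name)
    return app_home


def submit_tasks(app_home, data_files):
    """Create one directory per task and copy its data file into it.

    The data file is named after the task ID, so the file's base name
    (without extension) is used as the task ID here.
    """
    task_ids = []
    for data_file in map(Path, data_files):
        task_id = data_file.stem
        task_dir = Path(app_home) / task_id
        task_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(data_file, task_dir / data_file.name)
        task_ids.append(task_id)
    return task_ids
```

Keeping the binaries under bin and each task's data under its own directory means the application's files never mix with the server software itself, which is the separation this subsection describes.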
The main entries in the works table are as follows:
wid: the ID of the task;
priority: the priority of the task;
os: the operating system platform on which the task is suited to run;
nameApp: the name of the application to which the task belongs;
status: the current state of the task;
dirInName: the directory containing the task's data file, that is, ${application home directory}/{task ID};
dirOutName: the directory where the task's result file is stored.
The scheduling system periodically scans the works table in the database, selects tasks with status="waiting" according to their operating system type and priority, places them in the task pool, and then changes their status to "inqueue". When a volunteer machine requests a computing task, the scheduling system selects a suitable task from the task pool, reads the data file from the task's dirInName directory, and sends it to the volunteer machine. If the volunteer machine needs the binary code, the scheduling system obtains the application name from the nameApp entry of the works table and sends the application's binary code under ${application home directory}/bin to the volunteer machine. After receiving the task, the volunteer machine starts computing and returns the result; the scheduling system saves the result file to ${application home directory}/{task ID} and changes the task's status to "completed".
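The task life cycle described here (waiting, inqueue, completed) can be sketched in Python. The `Task` and `Scheduler` names, the in-memory list standing in for the works table, and the rule that a larger priority value is served first are assumptions for illustration; the patent does not specify the priority ordering.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Task:
    wid: str
    priority: int
    os: str
    name_app: str
    status: str = "waiting"


class Scheduler:
    def __init__(self, works):
        self.works = works   # stands in for the works table
        self.pool = []       # task pool: heap of (-priority, sequence, task)
        self._seq = 0        # tie-breaker so tasks themselves are never compared

    def scan(self):
        """Periodic scan: move every waiting task into the pool."""
        for task in self.works:
            if task.status == "waiting":
                heapq.heappush(self.pool, (-task.priority, self._seq, task))
                self._seq += 1
                task.status = "inqueue"

    def request(self, volunteer_os):
        """Return the highest-priority pooled task matching the volunteer's OS."""
        skipped, chosen = [], None
        while self.pool:
            entry = heapq.heappop(self.pool)
            if entry[2].os == volunteer_os:
                chosen = entry[2]
                break
            skipped.append(entry)
        for entry in skipped:  # put back tasks meant for other platforms
            heapq.heappush(self.pool, entry)
        return chosen

    def complete(self, task):
        """Called after the result file has been stored."""
        task.status = "completed"
```

A volunteer machine running Linux would call `request("Linux")` and receive the pooled Linux task with the largest priority value, or `None` when no suitable task is queued.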
(3) User-friendly volunteer interface system
The volunteer interface system comprises three main menus, "Status", "Setting", and "Help", and four buttons, "Log", "CPU Utility", "MEM Utility", and "Introduction".
The "Status" drop-down menu contains "Statistics ...", "Pause", "Continue", and "Exit". Selecting them causes the volunteer machine software to, respectively, display statistics on the tasks it has completed, pause computation, resume computation, and exit.
The "Setting" drop-down menu contains "User Information ...", "CPU and Memory ...", and "Distribution Strategy ...". Selecting "User Information ..." opens a dialog box that shows the volunteer's personal information, which the volunteer can modify. Selecting "CPU and Memory ..." opens a dialog box that shows the amount of CPU and memory resources the volunteer machine currently contributes, which the volunteer can modify. Selecting "Distribution Strategy ..." opens a dialog box that shows the applications the volunteer machine can currently run and how resources are allocated among them; the volunteer can modify both the applications and the resource allocation strategy.
The "Help" drop-down menu contains "Help Topics" and "About us". Selecting either one displays help information.
Clicking the "Log" button displays the log of the running volunteer machine software. Clicking "CPU Utility" displays the volunteer machine's current total CPU utilization and the CPU utilization of the computing thread. Clicking "MEM Utility" displays the volunteer machine's current total memory usage and the memory usage of the computing thread. Clicking "Introduction" displays an animated introduction to the HowU network computing platform.
(4) Easy porting between Windows and Linux
To make it convenient to develop volunteer machine systems for multiple operating system platforms, the inventors adopted the highly portable Java language. The volunteer machine software that the inventors developed under Linux runs on the Windows platform after only minor modifications, and volunteer machine software for other operating system platforms can also be developed in the future.
(5) Volunteer machine system that separates the application system from the kernel software
When the volunteer machine software runs for the first time, it creates the directory howuCache under ${volunteer home directory}/Temp; all application binary code that the volunteer machine subsequently obtains from the server is stored in this directory.
After the volunteer machine software starts, it waits until the conditions for starting computation are met. Then, according to the volunteer's settings, it determines which applications the volunteer machine can currently run, queries the directory ${volunteer home directory}/Temp/howuCache, and determines whether the binary code of these applications is already cached. The volunteer machine software then actively requests tasks of these applications from the server and reports the binary-code caching status. Finally, it receives from the server the tasks and the binary code of any application it is computing for the first time.
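The cache check in this start-up sequence might look like the following Python sketch. The assumption that each application's binary code is cached under an entry named after the application, and the names `cache_dir` and `cache_report`, are hypothetical; the patent states only the location of the howuCache directory.

```python
from pathlib import Path


def cache_dir(volunteer_home):
    """Return <volunteer_home>/Temp/howuCache, creating it on first run."""
    path = Path(volunteer_home) / "Temp" / "howuCache"
    path.mkdir(parents=True, exist_ok=True)
    return path


def cache_report(volunteer_home, runnable_apps):
    """Map each runnable application name to True if its binary is cached.

    The report is what the volunteer machine would send with its task
    request so the server knows which binaries it still has to transfer.
    """
    cache = cache_dir(volunteer_home)
    return {app: (cache / app).exists() for app in runnable_apps}
```

With such a report, the server needs to send binary code only for applications marked False, so a volunteer machine downloads each application's binaries once.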

Claims (2)

1. An open network computing platform based on an improved Linux Virtual Server architecture, characterized in that: the computing platform comprises a server system (20) based on the improved Linux Virtual Server architecture and a volunteer machine system (21) that contributes idle computing resources;
the server system (20) comprises an FEP (6) and m HowU servers (7.1, ..., 7.m), where m is a positive integer; the volunteer machine system (21) comprises n HowU volunteer machines (8.1, ..., 8.n), where n is a positive integer;
the FEP (6) comprises a load balancing module (18) and an FEP backup module (19); wherein the load balancing module (18), when a HowU volunteer machine sends a request to the FEP, forwards the request to the scheduling module (13) of a HowU server; the FEP backup module (19) builds and maintains, according to an adaptive strategy, the backup relationship vector table for the HowU servers, and sends the backup relationship vector table to the server backup module (12);
the HowU server (7) comprises a storage system (9), an application module (10), a monitoring module (11), a server backup module (12), and a scheduling module (13), wherein the storage system comprises a database system and a file system; the application module (10) receives applications submitted by clients, divides each application into tasks with small computational loads, and then, by command, stores the application information and task information in the database system and stores the application's binary code and the tasks' data files in the file system; the monitoring module (11) reads information from the database system, processes it, and displays it, so as to monitor the operation of the HowU server; the server backup module (12) backs up the database information and file information in the storage system to the corresponding server according to the backup relationship vector table; the scheduling module (13), when the communication module (16) of a HowU volunteer machine requests a computing task, schedules tasks uniformly according to the task information that the application module (10) has written into the database system and distributes them, via the communication module (16) of the HowU volunteer machine, to the volunteer machine for execution; it also, after the communication module (16) returns a computation result, stores the result file in the file system and modifies the task information in the database system;
the HowU volunteer machine (8) comprises a volunteer interface module (14), a control module (15), a communication module (16), and a computing module (17); wherein the volunteer interface module (14) listens for and executes, in real time, commands issued by the volunteer, and modifies the corresponding entries in the configuration file according to the commands; the control module (15) loops according to the settings in the configuration file: when the conditions for starting computation are met, it starts the communication module (16) to request a computing task from a HowU server; after the scheduling module (13) returns a computing task, it starts the computing module (17), which computes the task at the lowest priority; during computation the communication module (16) periodically sends "alive" messages to the scheduling module (13), and the computation result is returned to the scheduling module (13) through the communication module (16).
2. The computing platform according to claim 1, characterized in that: the FEP backup module (19) builds and maintains the backup relationship vector table in the following manner: the servers are numbered from 1 to j; for a server numbered i, when i<j, the information of server i is backed up to server i+1; the information of server j is backed up to server 1; when a new server joins, it is numbered j+1, the information of server j is then backed up to server j+1, the information of server j+1 is backed up to server 1, and the backup relationships among the other servers are unchanged; when the server numbered p leaves, the numbers of all servers greater than p are decreased by 1, server p-1 is backed up to the new server p, and the backup relationships among the other servers are unchanged; wherein i, j, and p are positive integers.
CN 200410012810 2004-03-04 2004-03-04 Open network computing platform based on improved linux virtual server structure Expired - Fee Related CN1275157C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 200410012810 CN1275157C (en) 2004-03-04 2004-03-04 Open network computing platform based on improved linux virtual server structure

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 200410012810 CN1275157C (en) 2004-03-04 2004-03-04 Open network computing platform based on improved linux virtual server structure

Publications (2)

Publication Number Publication Date
CN1560742A CN1560742A (en) 2005-01-05
CN1275157C true CN1275157C (en) 2006-09-13

Family

ID=34440109

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200410012810 Expired - Fee Related CN1275157C (en) 2004-03-04 2004-03-04 Open network computing platform based on improved linux virtual server structure

Country Status (1)

Country Link
CN (1) CN1275157C (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101087237B (en) * 2007-07-03 2010-07-14 中兴通讯股份有限公司 A magnetic array share file system and its implementation method

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100461719C (en) * 2006-06-15 2009-02-11 华为技术有限公司 System and method for detecting service healthiness
CN100403271C (en) * 2006-08-23 2008-07-16 华为技术有限公司 Method for data backup and recovery

Also Published As

Publication number Publication date
CN1560742A (en) 2005-01-05

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20060913