CN105183554B - Hybrid high-performance computing and cloud computing system and resource management method thereof - Google Patents

Publication number
CN105183554B
Authority
CN
China
Prior art keywords
cloud computing
compute node
performance computing
resource
management system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510466360.5A
Other languages
Chinese (zh)
Other versions
CN105183554A (en
Inventor
胡耀国
晏望龙
李鹏
常艺伟
张转转
刘孟博
陈开渠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NATIONAL SUPERCOMPUTING CENTER IN SHENZHEN (SHENZHEN CLOUD COMPUTING CENTER)
Original Assignee
NATIONAL SUPERCOMPUTING CENTER IN SHENZHEN (SHENZHEN CLOUD COMPUTING CENTER)

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Hardware Redundancy (AREA)

Abstract

The present invention relates to a hybrid high-performance computing (HPC) and cloud computing system and a resource management method for it. In the method, a cloud computing agent service module is installed on every compute node of the HPC resource pool and is connected to both the HPC scheduling system and the cloud computing management system. When cloud computing resources are insufficient, the cloud computing management system sends a node resource request to the HPC scheduling system and, based on the idle compute node information reported by the HPC scheduling system, sends a first cleanup command to the corresponding cloud computing agent service module, which cleans up the idle compute node so that it can be added to the cloud computing resource pool. When cloud computing resources are redundant, the cloud computing management system releases some of the compute nodes and sends a second cleanup command to the corresponding cloud computing agent service module, which cleans up the compute nodes to be released; the HPC scheduling system then reclaims them into the HPC resource pool. The invention avoids idle resources and improves resource utilization.

Description

Hybrid high-performance computing and cloud computing system and resource management method thereof
Technical field
The present invention relates to computing technology, and more specifically to a hybrid high-performance computing and cloud computing system and a resource management method thereof.
Background art
High-performance computing (HPC) has achieved great success in engineering computation, scientific research, genetics, and financial risk analysis. After decades of development, HPC now serves not only as the third major means of scientific research alongside experiment and theory, but is also regarded as a key measure of a country's overall national strength. Cloud computing is a model for adding, using, and delivering Internet-based services; it usually involves providing dynamically scalable, and often virtualized, resources over the Internet.
China currently gives strong support to both cloud computing and HPC and has established many supercomputing centers that operate large-scale computing clusters. The key technology behind cloud computing is virtualization, which allows resources to be scheduled on demand but inevitably costs some performance. HPC, by contrast, pursues computing speed above all, so it conflicts with cloud computing to some degree where performance is concerned. Supercomputing centers therefore generally divide their computing clusters by business and service type into a cloud computing partition and an HPC partition: the cloud computing partition runs a cloud operating system and serves cloud computing workloads, while the HPC partition runs an HPC job scheduling system and serves HPC workloads. When HPC business is light and cloud computing business is under pressure, the conventional fixed partitioning leaves HPC resources idle. A better technique is therefore needed for allocating and using the computing resources of a supercomputing center.
Summary of the invention
The technical problem to be solved by the present invention is to provide, in view of the above shortcomings of conventional computing center resource allocation, a hybrid HPC and cloud computing system and a resource management method thereof.
In a first aspect, the technical solution adopted by the present invention to solve this problem is a hybrid HPC and cloud computing system comprising an HPC resource pool and a cloud computing resource pool. The HPC resource pool comprises an HPC scheduling system and a plurality of compute nodes managed by it; the cloud computing resource pool comprises a cloud computing management system and a plurality of compute nodes managed by it. Each compute node in the HPC resource pool is provided with a cloud computing agent service module, which is connected to both the HPC scheduling system and the cloud computing management system;
When it detects that resources are insufficient, the cloud computing management system sends a node resource request to the HPC scheduling system and, based on the idle compute node information reported by the HPC scheduling system, sends a first cleanup command to the cloud computing agent service module of the idle compute node, so that the agent service module cleans up the idle compute node and the node is added to the cloud computing resource pool. When it detects that resources are redundant, the cloud computing management system releases some of the compute nodes obtained from the HPC resource pool and sends a second cleanup command to the cloud computing agent service module of each compute node to be released, so that the agent service module cleans up that node;
Based on the node resource request sent by the cloud computing management system, the HPC scheduling system schedules idle compute nodes and reports them to the cloud computing management system; and, based on the information returned by the cloud computing agent service module after it cleans up a compute node in response to the second cleanup command, it reclaims the cleaned-up compute node into the HPC resource pool.
In one embodiment of the first aspect, the HPC scheduling system scheduling idle compute nodes based on the node resource request and reporting them to the cloud computing management system further comprises: setting the node resource request to the highest priority and, when an idle compute node becomes available, locking that node and reporting its information to the cloud computing management system.
In one embodiment of the first aspect, the cloud computing agent service module cleaning up the idle compute node based on the first cleanup command further comprises:
terminating all HPC job processes;
unmounting the distributed file system used by HPC jobs;
setting the firewall policy and the cgroup resource policy to forbid HPC users from accessing the idle compute node;
switching the idle compute node from the services required by the HPC environment to the services required by the cloud computing environment;
obtaining from the cloud computing management system the connection data of the distributed file system or storage resources that cloud computing needs, mounting the corresponding file system or storage resources, and feeding the connection information back to the cloud computing management system;
creating a virtual switch and returning the virtual switch information to the cloud computing management system.
In one embodiment of the first aspect, the cloud computing agent service module cleaning up the corresponding compute node based on the second cleanup command further comprises:
unmounting the distributed file system or storage resources used by cloud computing;
switching the compute node from the services required by the cloud computing environment to the services required by the HPC environment and, on success, notifying the cloud computing management system so that it deletes the compute node from the cloud computing resource pool;
mounting the distributed file system required by HPC;
setting the firewall and cgroup resource policies to allow HPC users to access the compute node;
returning the corresponding information to the HPC scheduling system so that the compute node is reclaimed into the HPC resource pool.
In a second aspect, the technical solution adopted by the present invention to solve this problem is a resource management method for a hybrid HPC and cloud computing system, wherein the system comprises an HPC resource pool and a cloud computing resource pool, the HPC resource pool comprises an HPC scheduling system and a plurality of compute nodes managed by it, and the cloud computing resource pool comprises a cloud computing management system and a plurality of compute nodes managed by it. The method comprises the following steps:
S1: providing a cloud computing agent service module on each compute node of the HPC resource pool, the cloud computing agent service module being connected to both the HPC scheduling system and the cloud computing management system;
S2: when the cloud computing management system detects that the resources of the cloud computing resource pool are insufficient, the cloud computing management system sends a node resource request to the HPC scheduling system and, based on the idle compute node information reported by the HPC scheduling system, sends a first cleanup command to the cloud computing agent service module of the idle compute node, so that the agent service module cleans up the idle compute node and the node is added to the cloud computing resource pool;
S3: when the cloud computing management system detects that the resources of the cloud computing resource pool are redundant, the cloud computing management system releases some of the compute nodes obtained from the HPC resource pool and sends a second cleanup command to the cloud computing agent service module of each compute node to be released, so that the agent service module cleans up that node; the HPC scheduling system then reclaims the cleaned-up compute node into the HPC resource pool based on the information returned by the agent service module after cleanup.
In one embodiment of the second aspect, step S2 further comprises:
the HPC scheduling system setting the node resource request sent by the cloud computing management system to the highest priority and, when an idle compute node becomes available, locking that node and reporting its information to the cloud computing management system.
In one embodiment of the second aspect, the cloud computing agent service module cleaning up the idle compute node in step S2 further comprises:
terminating all HPC job processes;
unmounting the distributed file system used by HPC jobs;
setting the firewall policy and the cgroup resource policy to forbid HPC users from accessing the idle compute node;
switching the idle compute node from the services required by the HPC environment to the services required by the cloud computing environment;
obtaining from the cloud computing management system the connection data of the distributed file system or storage resources that cloud computing needs, mounting the corresponding file system or storage resources, and feeding the connection information back to the cloud computing management system;
creating a virtual switch and returning the virtual switch information to the cloud computing management system.
In one embodiment of the second aspect, in step S2 insufficient resources are detected by checking whether the remaining resources in the cloud computing resource pool are below a preset remaining-resource threshold, or by checking whether the remaining resources in the cloud computing resource pool can satisfy a resource request.
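The two checks named in this embodiment can be sketched as follows. This is a minimal illustration only: the `PoolUsage` type, the use of CPU counts as the resource unit, and the choice to combine both checks in one function are assumptions, not details given in the patent.

```python
from dataclasses import dataclass


@dataclass
class PoolUsage:
    """Hypothetical snapshot of the cloud computing resource pool (CPU cores only)."""
    total_cpus: int
    used_cpus: int

    @property
    def free_cpus(self) -> int:
        return self.total_cpus - self.used_cpus


def is_insufficient(usage: PoolUsage, min_free_cpus: int,
                    pending_request_cpus: int = 0) -> bool:
    """Return True if either check from the text reports a shortage."""
    # Check 1: remaining resources are below the preset threshold.
    below_threshold = usage.free_cpus < min_free_cpus
    # Check 2: remaining resources cannot satisfy the pending request.
    cannot_serve_request = usage.free_cpus < pending_request_cpus
    return below_threshold or cannot_serve_request
```

A real monitor would probably track memory and storage as well; the same comparison applies per resource type.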
In one embodiment of the second aspect, the cloud computing agent service module cleaning up the compute node to be released in step S3 further comprises:
unmounting the distributed file system or storage resources used by cloud computing;
switching the compute node from the services required by the cloud computing environment to the services required by the HPC environment and, on success, notifying the cloud computing management system so that it deletes the compute node from the cloud computing resource pool;
mounting the distributed file system required by HPC;
setting the firewall and cgroup resource policies to allow HPC users to access the compute node;
returning the corresponding information to the HPC scheduling system so that the compute node is reclaimed into the HPC resource pool.
In one embodiment of the second aspect, the cloud computing management system releasing some of the compute nodes obtained from the HPC resource pool in step S3 further comprises:
the cloud computing management system determining the number of compute nodes that can be released by counting the remaining resources in the cloud computing resource pool and, when there are not enough completely idle compute nodes, migrating the virtual machines away from compute nodes that run only a few virtual machines until there are enough idle compute nodes.
With the hybrid HPC and cloud computing system and its resource management method, idle HPC resources can be given the corresponding management settings and added to the cloud computing resource pool to serve cloud computing business, and compute nodes that were added to the cloud computing resource pool can be released when cloud computing resources are redundant and reclaimed by the HPC resource pool. This avoids idle resources and improves resource utilization. The hybrid computing system of the invention combines the application characteristics of HPC with the advantages of a cloud computing platform. Through virtualization and automation technologies it supports physical-machine and virtual-machine environments at the same time, achieves unified management, allocation, deployment, and monitoring of hardware resources, breaks the exclusive hold of any single business on resources, and provides a dynamic computing service platform.
Brief description of the drawings
The present invention is further described below with reference to the accompanying drawings and embodiments, in which:
Fig. 1 is a schematic diagram of the initial-state structure of a hybrid HPC and cloud computing system according to one embodiment of the invention;
Fig. 2 is a schematic diagram of the structure of the hybrid HPC and cloud computing system shown in Fig. 1 after resource scheduling;
Fig. 3 is a flow chart of a resource management method for a hybrid HPC and cloud computing system according to one embodiment of the invention;
Fig. 4 is a flow chart of a resource management method for a hybrid HPC and cloud computing system according to another specific embodiment of the invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here merely illustrate the invention and do not limit it.
Today's supercomputing centers typically divide the entire computing resource pool directly into two large partitions according to business demand, an HPC partition and a cloud computing partition, which handle HPC and cloud computing business respectively. For HPC, resources are generally fairly fixed, and utilization is usually improved through sensible scheduling. In an HPC cluster, all compute nodes run Linux and are managed centrally by the HPC scheduling system, and every compute node can mount a distributed file system. For cloud computing, every compute node is deployed with a virtualization hypervisor, and all hypervisors are managed centrally by the cloud computing management system. A hypervisor, also called a virtual machine monitor (VMM), is an intermediate software layer that runs between the physical server and the operating systems and allows multiple operating systems and applications to share one set of underlying physical hardware. It can therefore be regarded as the "meta" operating system of a virtual environment: it coordinates access to all physical devices and virtual machines on the server. The hypervisor is the core of all virtualization technology, and the ability to migrate multiple workloads without interruption is a basic hypervisor function. When a server boots and runs the hypervisor, it allocates an appropriate amount of memory, CPU, network, and disk to each virtual machine and loads the guest operating systems of all virtual machines.
When the entire computing resource pool is physically divided into an HPC partition and a cloud computing partition, resources sit idle whenever the workloads of the two are unbalanced. The present invention therefore proposes a hybrid system in which idle HPC resources can be added to the cloud computing resource pool to serve cloud computing business.
Fig. 1 is a schematic diagram of the initial-state structure of a hybrid HPC and cloud computing system 100 according to one embodiment of the invention. As shown in Fig. 1, the hybrid system 100 includes an HPC resource pool 110 and a cloud computing resource pool 120. The HPC resource pool 110 includes an HPC scheduling system 111 and a plurality of compute nodes 112 managed by it. The cloud computing resource pool 120 includes a cloud computing management system 121 and a plurality of compute nodes 122 managed by it. So that idle compute nodes 112 in the HPC resource pool 110 can be added to the cloud computing resource pool 120 to serve cloud computing business, the present application provides a cloud computing agent service module 1121 on each compute node 112 in the HPC resource pool 110, which enables the node to receive deployment commands sent by the cloud computing management system 121. Each cloud computing agent service module 1121 is connected to both the HPC scheduling system 111 and the cloud computing management system 121, for example by communicating through the API (Application Programming Interface) of the HPC scheduling system 111 and the API of the cloud computing management system 121.
The HPC scheduling system 111 uses cgroups to strictly control the resources used by HPC users' jobs. Cgroups is a mechanism provided by the Linux kernel that can limit, account for, and isolate the physical resources (such as CPU, memory, and I/O) used by process groups. For idle compute nodes 112 in the HPC resource pool 110 to be able to join the cloud computing resource pool 120, the Linux operating system used for HPC (for example CentOS, SLES, or Ubuntu) must be able to support the hypervisor of the cloud computing resource pool 120. The mainstream hypervisors for Linux are KVM and Xen; the present application prefers KVM. The biggest difference between KVM and Xen is architectural. KVM is built on the Linux kernel and turns the kernel itself into a hypervisor, developing the functions KVM needs from the kernel's existing functionality. The Xen hypervisor, by contrast, is built from scratch: its hardware resource scheduling, its virtual machine management, and many of its interfaces are incompatible with the Linux kernel, so the kernel must be modified, and an operating system modified in this way is not well suited to HPC.
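The requirement that an HPC node be able to host the cloud pool's hypervisor could be checked before a node is offered to the cloud pool. The patent does not describe such a check; the sketch below is an assumption for illustration. It relies on the fact that KVM on x86 needs hardware virtualization support, which Linux reports as the `vmx` (Intel VT-x) or `svm` (AMD-V) CPU flag in `/proc/cpuinfo`. The function names are hypothetical.

```python
from typing import Set


def cpu_flags(cpuinfo_text: str) -> Set[str]:
    """Collect the CPU feature flags from the text of /proc/cpuinfo (x86 layout)."""
    flags: Set[str] = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            flags.update(value.split())
    return flags


def supports_kvm_hw_virtualization(cpuinfo_text: str) -> bool:
    """True if the CPU advertises Intel VT-x (vmx) or AMD-V (svm)."""
    return bool({"vmx", "svm"} & cpu_flags(cpuinfo_text))
```

On a real node the text would come from reading `/proc/cpuinfo`; passing it in as a string keeps the check easy to test.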
In the hybrid system 100 shown in Fig. 1, when the cloud computing management system 121 of the cloud computing resource pool 120 detects that cloud computing resources are insufficient (for example, the remaining resources in the cloud computing resource pool are below a preset remaining-resource threshold, or are not enough to satisfy a resource request), it sends a node resource request to the HPC scheduling system 111 of the HPC resource pool 110. The cloud computing management system 121 may call the API of the HPC scheduling system 111 and request node resources in the form of an ordinary job. Based on this request, the HPC scheduling system 111 schedules idle compute nodes 112 for the cloud computing management system 121. For example, the HPC scheduling system 111 may set the node resource request to the highest priority; once an idle compute node 112 is available, it locks that node and reports its information to the cloud computing management system 121. Based on the idle compute node information reported by the HPC scheduling system 111, the cloud computing management system 121 sends a first cleanup command to the cloud computing agent service module 1121 of the idle compute node 112, which cleans up the node. After cleanup, the node is added to the cloud computing resource pool 120, as shown in Fig. 2.
In the hybrid system 100 shown in Fig. 2, when the cloud computing management system 121 detects that cloud computing resources are redundant (for example, because demand for cloud computing business has dropped), it releases some of the compute nodes 112 previously obtained from the HPC resource pool 110 so that they can be returned to the HPC resource pool 110. The cloud computing management system 121 sends a second cleanup command to the cloud computing agent service module 1121 of each compute node 112 to be released, and the agent service module 1121 cleans up that node. The HPC scheduling system 111 then reclaims the cleaned-up compute node 112 into the HPC resource pool based on the information returned by the agent service module 1121 after cleanup.
Based on the hybrid HPC and cloud computing system described above, the present invention also proposes a resource management method for such a system. Fig. 3 shows a flow chart of a resource management method 200 for a hybrid HPC and cloud computing system according to one embodiment of the invention. As shown in Fig. 3, the resource management method 200 includes the following steps:
In step S201, a cloud computing agent service module is provided on each compute node of the HPC resource pool; the agent service module is connected to both the HPC scheduling system and the cloud computing management system.
In step S202, when the cloud computing management system detects that the resources of the cloud computing resource pool are insufficient, it sends a node resource request to the HPC scheduling system and, based on the idle compute node information reported by the HPC scheduling system, sends a first cleanup command to the cloud computing agent service module of the idle compute node, so that the agent service module cleans up the idle compute node and the node is added to the cloud computing resource pool.
In step S203, when the cloud computing management system detects that the resources of the cloud computing resource pool are redundant, it releases some of the compute nodes obtained from the HPC resource pool and sends a second cleanup command to the cloud computing agent service module of each compute node to be released, so that the agent service module cleans up that node. The HPC scheduling system then reclaims the cleaned-up compute node into the HPC resource pool based on the information returned by the agent service module after cleanup.
With the resource management method described above, the hybrid HPC and cloud computing system of the invention can give idle HPC resources the corresponding management settings and add them to the cloud computing resource pool to serve cloud computing business, and can release compute nodes that were added to the cloud computing resource pool when cloud computing resources are redundant so that the HPC resource pool reclaims them. This avoids idle resources and improves resource utilization.
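One way to picture the life cycle of a borrowed node under steps S202 and S203 is as a small state machine. The state and event names below are hypothetical labels chosen for illustration; the patent describes the transitions but does not name them.

```python
from enum import Enum


class NodeState(Enum):
    HPC_IDLE = "hpc_idle"                  # idle in the HPC resource pool
    LOCKED_FOR_CLOUD = "locked_for_cloud"  # locked by the HPC scheduling system
    IN_CLOUD_POOL = "in_cloud_pool"        # added to the cloud resource pool
    RELEASING = "releasing"                # second cleanup in progress


# (current state, event) -> next state; event names are illustrative.
TRANSITIONS = {
    (NodeState.HPC_IDLE, "lock"): NodeState.LOCKED_FOR_CLOUD,
    (NodeState.LOCKED_FOR_CLOUD, "first_cleanup_done"): NodeState.IN_CLOUD_POOL,
    (NodeState.IN_CLOUD_POOL, "release"): NodeState.RELEASING,
    (NodeState.RELEASING, "second_cleanup_done"): NodeState.HPC_IDLE,
}


def advance(state: NodeState, event: str) -> NodeState:
    """Apply one event, rejecting any transition the flow does not allow."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} is not valid in state {state.name}") from None
```

Making the transitions explicit prevents, for example, a node from being released before its first cleanup has finished.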
Fig. 4 shows a flow chart of a resource management method 300 for a hybrid HPC and cloud computing system according to another specific embodiment of the invention. As shown in Fig. 4, the detailed flow of the resource management method 300 is as follows:
In step S301, the cloud computing management system monitors the usage of cloud computing resources. For example, it may determine whether resources are insufficient or redundant by checking whether the remaining resources in the cloud computing resource pool are below a preset remaining-resource threshold, or whether they can satisfy a resource request. When cloud computing resources are insufficient, method 300 proceeds to step S302; when they are redundant, method 300 proceeds to step S308.
In step S302, when cloud computing resources are insufficient, the cloud computing management system sends a node resource request to the HPC scheduling system to request idle compute nodes. For example, the cloud computing management system may call the API of the HPC scheduling system and request node resources in the form of an ordinary job.
In step S303, the HPC scheduling system schedules idle compute nodes. It may set the node resource request sent by the cloud computing management system to the highest priority. If no compute node is currently idle, method 300 proceeds to step S304 and waits for an idle compute node to appear. Once an idle compute node is available, method 300 proceeds to step S305: the HPC scheduling system locks the idle compute node and reports its information to the cloud computing management system.
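Steps S303 to S305 can be sketched with a priority queue. This is an illustrative model, not the scheduler the patent uses: the `HpcScheduler` class, its method names, and the numeric priority values are all assumptions.

```python
import heapq
from itertools import count
from typing import Callable, Iterable, List, Set, Tuple

CLOUD_PRIORITY = 0   # smallest value is served first, so cloud requests win
JOB_PRIORITY = 10    # assumed default for ordinary HPC jobs


class HpcScheduler:
    """Toy model of the HPC scheduling system handling node requests."""

    def __init__(self, idle_nodes: Iterable[str],
                 notify_cloud: Callable[[str, str], None]) -> None:
        self.idle_nodes: List[str] = list(idle_nodes)
        self.locked: Set[str] = set()
        self._queue: List[Tuple[int, int, str]] = []
        self._seq = count()              # tie-breaker keeps FIFO order
        self._notify_cloud = notify_cloud

    def submit(self, name: str, priority: int = JOB_PRIORITY) -> None:
        heapq.heappush(self._queue, (priority, next(self._seq), name))

    def request_node_for_cloud(self, request_id: str) -> None:
        # S303: the cloud request is queued with the highest priority.
        self.submit(request_id, CLOUD_PRIORITY)

    def node_became_idle(self, node: str) -> None:
        self.idle_nodes.append(node)

    def dispatch(self) -> List[Tuple[str, str]]:
        """Assign idle nodes to queued entries; with no idle node, entries wait (S304)."""
        assigned = []
        while self._queue and self.idle_nodes:
            priority, _, name = heapq.heappop(self._queue)
            node = self.idle_nodes.pop(0)
            if priority == CLOUD_PRIORITY:
                self.locked.add(node)            # S305: lock the idle node
                self._notify_cloud(name, node)   # S305: report it to the cloud side
            assigned.append((name, node))
        return assigned
```

For example, if an ordinary job and a cloud request are both waiting when a single node becomes idle, `dispatch()` gives the node to the cloud request, locks it, and leaves the ordinary job queued.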
In step S306, based on the idle compute node information provided by the high-performance computing scheduling system, the cloud computing management system sends a first cleanup command to the cloud computing proxy service module of the idle compute node, so that the cloud computing proxy service module cleans up that node. In a specific embodiment, the cloud computing proxy service module performs the following cleanup tasks:
1. Terminate all high-performance computing job processes. The cloud computing proxy service module forces all users logged in over ssh (including root) offline, together with the ssh server child processes, to prevent them from affecting subsequent operations; it then traverses all system processes and terminates every process that is not a built-in system program.
2. Unmount the distributed file system used by high-performance computing jobs.
3. Set the firewall policy and the cgroup resource policy so that high-performance computing users are denied access to the compute node.
4. Switch the services on the compute node, that is, switch the idle compute node from the services required by the high-performance computing environment to the services required by the cloud computing environment.
5. Call the cloud computing management system API to obtain the connection data for the storage resources that cloud computing needs, such as a distributed file system, IP-SAN, or FC-SAN; mount the corresponding file system or storage resource; and report the connection information back to the cloud computing management system.
6. Create a virtual switch and return the virtual switch information to the cloud computing management system.
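The six tasks above form a fixed, ordered sequence whose later steps report information back to the cloud computing management system. A minimal sketch of such a runner follows; the step labels, the `handlers` mapping, and `run_first_cleanup` are hypothetical names chosen for illustration, and stopping on the first exception is an assumption, since the description does not cover failure handling.

```python
from typing import Callable, Dict

# Order of the six cleanup tasks performed for the first cleanup command.
FIRST_CLEANUP_STEPS = (
    "terminate_hpc_processes",
    "unmount_hpc_filesystem",
    "deny_hpc_user_access",      # firewall and cgroup policies
    "switch_services_to_cloud",
    "mount_cloud_storage",       # reports connection information
    "create_virtual_switch",     # reports virtual switch information
)


def run_first_cleanup(handlers: Dict[str, Callable[[], dict]]) -> dict:
    """Run every cleanup task in order and collect what each one reports.

    Each handler is a hypothetical callable that does the real work on the
    node and returns a dict of information for the management system.
    An exception raised by a handler stops the sequence.
    """
    report = {}
    for step in FIRST_CLEANUP_STEPS:
        report[step] = handlers[step]()
    return report
```

Keeping the order in one tuple makes the dependency explicit: storage is mounted and the virtual switch is created only after the node has been cleared of high-performance computing work and switched to cloud services.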
In step S307, the cloud computing management system adds the resources of the cleaned-up compute node, such as its CPU, memory, and local disk, to the cloud computing resource pool.
This completes the process by which the cloud computing resource pool requests computing resources from the high-performance computing resource pool.
When cloud computing resource redundancy is detected in step S301 (for example, when demand for cloud computing services drops), method 300 proceeds to step S308.
In step S308, the cloud computing management system releases some of the compute nodes that were obtained from the high-performance computing resource pool. In a specific embodiment, the cloud computing management system determines how many compute nodes can be returned to the high-performance computing resource pool by tallying the remaining resources, such as CPU and memory, in the cloud computing resource pool. If there are not enough completely idle compute nodes, the virtual machines on compute nodes that run only a few virtual machines are migrated away until a sufficient number of idle compute nodes is available.
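The release decision in step S308 has two parts: how many nodes the pool can give up, and which nodes to empty first. The sketch below illustrates one possible reading, assuming identically sized nodes; `releasable_node_count`, `plan_release`, and their parameters are hypothetical names, and the actual migration of virtual machines is not modeled.

```python
from typing import Dict, List, Tuple


def releasable_node_count(free_cpus: int, free_mem_gb: int,
                          cpus_per_node: int, mem_gb_per_node: int) -> int:
    """Number of whole nodes whose CPU and memory the pool can give up."""
    return min(free_cpus // cpus_per_node, free_mem_gb // mem_gb_per_node)


def plan_release(vms_per_node: Dict[str, int],
                 count: int) -> Tuple[List[str], List[str]]:
    """Pick `count` nodes to release, preferring those with the fewest VMs.

    Returns (nodes_to_release, nodes_to_drain); the second list holds the
    chosen nodes that still run virtual machines and must be migrated first.
    """
    ranked = sorted(vms_per_node, key=lambda node: (vms_per_node[node], node))
    chosen = ranked[:count]
    to_drain = [node for node in chosen if vms_per_node[node] > 0]
    return chosen, to_drain
```

Sorting by virtual machine count puts completely idle nodes first, so migration is only planned for the lightly loaded nodes needed to reach the target count.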
In step S309, the cloud computing management system sends a second cleanup command to the cloud computing proxy service module of each compute node to be released, so that the cloud computing proxy service module cleans up that node. In a specific embodiment, the cloud computing proxy service module performs the following cleanup tasks:
1. Unmount the distributed file system or storage resources used by cloud computing. After confirming that no virtual machine is running on the node, the cloud computing proxy service module unmounts the storage resources used by cloud computing, such as the distributed file system, IP-SAN, or FC-SAN.
2. Switch the services on the compute node, that is, switch the compute node from the services required by the cloud computing environment to the services required by the high-performance computing environment, and, once the switch succeeds, notify the cloud computing management system so that it removes the compute node from the cloud computing resource pool.
3. Mount the distributed file system required by high-performance computing.
4. Set the firewall and cgroup resource policies so that high-performance computing users can access the compute node normally through the high-performance computing scheduling system.
5. Return the corresponding information to the high-performance computing scheduling system, indicating that the compute node currently has no jobs and can accept new ones.
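The second cleanup has one precondition (no virtual machine may still be running) and two notifications, one to each management system. The following sketch records the actions as labels rather than performing them; `run_second_cleanup`, `NodeBusyError`, and the two callback parameters are hypothetical names introduced for illustration.

```python
from typing import Callable, List


class NodeBusyError(RuntimeError):
    """Raised when virtual machines are still running on the node."""


def run_second_cleanup(node: str, running_vms: int,
                       notify_cloud: Callable[[str], None],
                       notify_hpc: Callable[[str], None]) -> List[str]:
    """Return the ordered list of actions taken to hand a node back.

    The actions are recorded as labels only; real unmounting, service
    switching, and policy changes are outside the scope of this sketch.
    """
    if running_vms > 0:
        # Task 1 unmounts storage only after confirming no VM is running.
        raise NodeBusyError(f"{node} still runs {running_vms} virtual machine(s)")
    actions = ["unmount_cloud_storage", "switch_services_to_hpc"]
    notify_cloud(node)   # management system removes the node from the cloud pool
    actions += ["mount_hpc_filesystem", "allow_hpc_user_access"]
    notify_hpc(node)     # scheduler learns the node has no jobs and can accept new ones
    return actions
```

The cloud computing management system is notified right after the service switch and the scheduler only at the very end, mirroring the order of tasks 2 and 5.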
Then, in step S310, the high-performance computing scheduling system reclaims the cleaned-up compute node into the high-performance computing resource pool and makes it available to high-performance computing workloads.
This completes the process of returning computing resources that the cloud computing resource pool obtained from the high-performance computing resource pool.
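Taken together, steps S305 to S310 move a compute node through a small cycle of states. The sketch below models that cycle as a transition table; the state and event names are assumptions made for illustration and do not appear in the description.

```python
from enum import Enum


class NodeState(Enum):
    HPC = "in the high-performance computing pool"
    LOCKED = "locked for the cloud pool (S305)"
    CLOUD = "in the cloud computing pool (S307)"
    RELEASING = "being released (S308 to S309)"


# (current state, event) -> next state
TRANSITIONS = {
    (NodeState.HPC, "lock"): NodeState.LOCKED,
    (NodeState.LOCKED, "first_cleanup_done"): NodeState.CLOUD,
    (NodeState.CLOUD, "release"): NodeState.RELEASING,
    (NodeState.RELEASING, "second_cleanup_done"): NodeState.HPC,
}


def advance(state: NodeState, event: str) -> NodeState:
    """Apply one event to a node state; reject transitions not in the table."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} is not valid in state {state.name}") from None
```

Rejecting unknown transitions is a design choice of this sketch: it prevents, for example, a node from being released before it has ever joined the cloud computing resource pool.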
The foregoing describes only preferred embodiments of the present invention and is not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (6)

1. A hybrid high-performance computing and cloud computing system, comprising a high-performance computing resource pool and a cloud computing resource pool, the high-performance computing resource pool comprising a high-performance computing scheduling system and a plurality of compute nodes managed by it, and the cloud computing resource pool comprising a cloud computing management system and a plurality of compute nodes managed by it, characterized in that the compute nodes in the high-performance computing resource pool are provided with a cloud computing proxy service module, the cloud computing proxy service module being connected to both the high-performance computing scheduling system and the cloud computing management system;
The cloud computing management system, upon detecting insufficient resources, sends a node resource request to the high-performance computing scheduling system and, based on the idle compute node information provided by the high-performance computing scheduling system, sends a first cleanup command to the cloud computing proxy service module of the idle compute node so that the cloud computing proxy service module cleans up the idle compute node, which is then added to the cloud computing resource pool; and, upon detecting redundant resources, releases some of the compute nodes obtained from the high-performance computing resource pool and sends a second cleanup command to the cloud computing proxy service module of each compute node to be released so that the cloud computing proxy service module cleans up that compute node;
The high-performance computing scheduling system schedules an idle compute node based on the node resource request sent by the cloud computing management system and notifies the cloud computing management system, and, based on the information returned by the cloud computing proxy service module after it cleans up the corresponding compute node in response to the second cleanup command, reclaims the cleaned-up compute node into the high-performance computing resource pool;
Wherein the cleanup of the idle compute node by the cloud computing proxy service module in response to the first cleanup command comprises:
Terminating all high-performance computing job processes;
Unmounting the distributed file system used by high-performance computing jobs;
Setting a firewall policy and a cgroup resource policy to deny high-performance computing users access to the idle compute node;
Switching the idle compute node from the services required by the high-performance computing environment to the services required by the cloud computing environment;
Obtaining from the cloud computing management system the connection data of the distributed file system or storage resource to be used by cloud computing, mounting the corresponding file system or storage resource, and reporting the connection information back to the cloud computing management system;
Creating a virtual switch and returning the virtual switch information to the cloud computing management system;
Wherein the cleanup of the corresponding compute node by the cloud computing proxy service module in response to the second cleanup command comprises:
Unmounting the distributed file system or storage resource used by cloud computing;
Switching the compute node from the services required by the cloud computing environment to the services required by the high-performance computing environment and, once the switch succeeds, notifying the cloud computing management system so that the cloud computing management system removes the compute node from the cloud computing resource pool;
Mounting the distributed file system required by high-performance computing;
Setting firewall and cgroup resource policies to allow high-performance computing users to access the compute node;
Returning the corresponding information to the high-performance computing scheduling system so that the compute node is reclaimed into the high-performance computing resource pool.
2. The system according to claim 1, characterized in that the high-performance computing scheduling system scheduling an idle compute node based on the node resource request sent by the cloud computing management system and notifying the cloud computing management system further comprises: setting the node resource request to the highest priority and, when an idle compute node is available, locking the idle compute node and notifying the cloud computing management system of the idle compute node's information.
3. A resource management method for a hybrid high-performance computing and cloud computing system, wherein the system comprises a high-performance computing resource pool and a cloud computing resource pool, the high-performance computing resource pool comprising a high-performance computing scheduling system and a plurality of compute nodes managed by it, and the cloud computing resource pool comprising a cloud computing management system and a plurality of compute nodes managed by it, characterized in that the method comprises the following steps:
S1: providing a cloud computing proxy service module on each compute node of the high-performance computing resource pool, the cloud computing proxy service module being connected to both the high-performance computing scheduling system and the cloud computing management system;
S2: when the cloud computing management system detects that the resources of the cloud computing resource pool are insufficient, sending, by the cloud computing management system, a node resource request to the high-performance computing scheduling system, and, based on the idle compute node information provided by the high-performance computing scheduling system, sending a first cleanup command to the cloud computing proxy service module of the idle compute node so that the cloud computing proxy service module cleans up the idle compute node, which is then added to the cloud computing resource pool;
S3: when the cloud computing management system detects that the resources of the cloud computing resource pool are redundant, releasing, by the cloud computing management system, some of the compute nodes obtained from the high-performance computing resource pool, and sending a second cleanup command to the cloud computing proxy service module of each compute node to be released so that the cloud computing proxy service module cleans up that compute node, after which the high-performance computing scheduling system, based on the information returned by the cloud computing proxy service module after cleaning up the corresponding compute node, reclaims the cleaned-up compute node into the high-performance computing resource pool;
Wherein cleaning up the idle compute node by the cloud computing proxy service module in step S2 further comprises:
Terminating all high-performance computing job processes;
Unmounting the distributed file system used by high-performance computing jobs;
Setting a firewall policy and a cgroup resource policy to deny high-performance computing users access to the idle compute node;
Switching the idle compute node from the services required by the high-performance computing environment to the services required by the cloud computing environment;
Obtaining from the cloud computing management system the connection data of the distributed file system or storage resource to be used by cloud computing, mounting the corresponding file system or storage resource, and reporting the connection information back to the cloud computing management system;
Creating a virtual switch and returning the virtual switch information to the cloud computing management system;
Wherein cleaning up the compute node to be released by the cloud computing proxy service module in step S3 further comprises:
Unmounting the distributed file system or storage resource used by cloud computing;
Switching the compute node from the services required by the cloud computing environment to the services required by the high-performance computing environment and, once the switch succeeds, notifying the cloud computing management system so that the cloud computing management system removes the compute node from the cloud computing resource pool;
Mounting the distributed file system required by high-performance computing;
Setting firewall and cgroup resource policies to allow high-performance computing users to access the compute node;
Returning the corresponding information to the high-performance computing scheduling system so that the compute node is reclaimed into the high-performance computing resource pool.
4. The method according to claim 3, characterized in that step S2 further comprises:
Setting, by the high-performance computing scheduling system, the node resource request sent by the cloud computing management system to the highest priority and, when an idle compute node is available, locking the idle compute node and notifying the cloud computing management system of the idle compute node's information.
5. The method according to claim 3, characterized in that, in step S2, insufficient resources are detected by determining whether the remaining resources in the current cloud computing resource pool are below a preset remaining-resource threshold, or whether the remaining resources in the current cloud computing resource pool can satisfy a resource request.
6. The method according to claim 3, characterized in that, in step S3, releasing, by the cloud computing management system, some of the compute nodes obtained from the high-performance computing resource pool further comprises:
Determining, by the cloud computing management system, the number of compute nodes that can be released by tallying the remaining resources in the cloud computing resource pool, and, when the number of completely idle compute nodes is insufficient, migrating away the virtual machines on compute nodes that run only a few virtual machines until a sufficient number of idle compute nodes is available.
CN201510466360.5A 2015-07-31 2015-07-31 High-performance calculation and cloud computing hybrid system and its method for managing resource Active CN105183554B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510466360.5A CN105183554B (en) 2015-07-31 2015-07-31 High-performance calculation and cloud computing hybrid system and its method for managing resource

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510466360.5A CN105183554B (en) 2015-07-31 2015-07-31 High-performance calculation and cloud computing hybrid system and its method for managing resource

Publications (2)

Publication Number Publication Date
CN105183554A CN105183554A (en) 2015-12-23
CN105183554B true CN105183554B (en) 2019-07-09

Family

ID=54905650

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510466360.5A Active CN105183554B (en) 2015-07-31 2015-07-31 High-performance calculation and cloud computing hybrid system and its method for managing resource

Country Status (1)

Country Link
CN (1) CN105183554B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106020969A (en) * 2016-05-05 2016-10-12 云神科技投资股份有限公司 High-performance cloud computing hybrid computing system and method
CN106199696B (en) * 2016-06-29 2019-01-18 中国石油天然气股份有限公司 seismic data processing system and method
CN106250562B (en) * 2016-08-24 2019-08-23 苏州蓝海彤翔系统科技有限公司 Processing data information system
CN106453550B (en) * 2016-10-09 2019-08-27 烽火通信科技股份有限公司 A kind of deep-packet detection system and method based on cloud computing
CN106803842A (en) * 2017-02-15 2017-06-06 无锡十月中宸科技有限公司 A kind of distributed management architecture and method based on expansible and high-performance calculation
CN108334409B (en) * 2018-01-15 2020-10-09 北京大学 Fine-grained high-performance cloud resource management scheduling method
CN110119405B (en) * 2019-03-28 2023-10-13 江苏瑞中数据股份有限公司 Distributed parallel database resource management method
CN110225111A (en) * 2019-06-06 2019-09-10 武汉市智驾科技有限公司 A kind of high-performance calculation and cloud computing hybrid algorithm system and method for managing resource
CN110716790A (en) * 2019-09-12 2020-01-21 中城智慧(北京)城市规划设计研究院有限公司 Method for building high-performance hybrid cloud computing platform
CN113157429B (en) * 2020-01-22 2024-04-09 中移智行网络科技有限公司 SAAS cloud service implementation method and system
CN112532696B (en) * 2020-11-16 2024-09-17 广州华数云计算有限公司 Method and system for uploading data to cloud server to perform cloud computing
CN113507441B (en) * 2021-06-08 2023-04-28 中国联合网络通信集团有限公司 Secure resource expansion method, secure protection management platform and data node
CN114827236B (en) * 2022-01-29 2023-07-14 中国银联股份有限公司 Firewall virtual connection processing method and device and computer readable storage medium
CN114464269B (en) * 2022-04-07 2022-07-08 国家超级计算天津中心 Virtual medicine generation method and device and computer equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103533086A (en) * 2013-10-31 2014-01-22 中国科学院计算机网络信息中心 Uniform resource scheduling method in cloud computing system
CN104216782A (en) * 2014-08-19 2014-12-17 东南大学 Dynamic resource management method for high-performance computing and cloud computing hybrid environment
US9015708B2 (en) * 2011-07-28 2015-04-21 International Business Machines Corporation System for improving the performance of high performance computing applications on cloud using integrated load balancing

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140196054A1 (en) * 2013-01-04 2014-07-10 International Business Machines Corporation Ensuring performance of a computing system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant