CN104123183A - Cluster assignment dispatching method and device

Info

Publication number: CN104123183A
Application number: CN201410363745.4A
Authority: CN
Inventors: 马四腾
Original assignee: Inspur Beijing Electronic Information Industry Co Ltd
Current assignee: Inspur Beijing Electronic Information Industry Co Ltd
Priority date: 2014-07-28
Filing date: 2014-07-28
Publication date: 2014-10-29
Anticipated expiration: 2034-07-28
Also published as: CN104123183B

Abstract

The invention provides a cluster job scheduling method and device. In the method, jobs submitted by users are grouped and the grouped jobs are scheduled to at least two virtual machines on management nodes; the virtual machines process the jobs using shared resources configured in advance in shared storage. If a virtual machine processing jobs fails, its jobs are switched to a backup virtual machine; if a management node is under maintenance or fails, the virtual machines on that node are migrated to other management nodes. Because the grouped jobs are scheduled to different virtual machines, and those virtual machines all process jobs using the shared resources in the shared storage, the job scheduling system achieves high fault tolerance and high availability.

Description

Cluster job scheduling method and apparatus

Technical field

The present invention relates to the field of computer technology, and in particular to a cluster job scheduling method and apparatus.

Background art

At present, advances in networked computing have driven the development and widespread use of cluster systems. High-performance workstations or PCs are connected into a cluster over a high-speed network in a certain topology to perform parallel computing, so that the performance of mainframes and parallel machines can be obtained at very low cost. As high-performance computing clusters are applied at ever larger scale, the problem of managing the cluster becomes increasingly prominent.

A job scheduling system is generally deployed on the management node of a high-performance cluster system. It is mainly responsible for receiving the job requests submitted by users and, according to specific scheduling rules and the users' requirements for their jobs, selecting suitable resources to fulfil those requests. From the user's point of view, with the help of the job scheduling system the high-performance computing cluster looks like one large server with many CPUs that multiple users can use at the same time. The job scheduling system manages the job requests submitted by users and allocates resources to each request in a reasonable way, so that the computing power of the cluster is fully used and results are obtained as quickly as possible. The job scheduling system is therefore extremely important to cluster management.

Traditional job scheduling systems are deployed in one of two ways. The first is stand-alone deployment on a single management node of the cluster: the job scheduling software, for example the open-source Torque+Maui software, is deployed directly on one management node. With this approach, however, once that management node fails, the job scheduling system of the whole cluster stops working, the cluster's jobs can no longer be scheduled reasonably and efficiently, and running jobs stall, which seriously degrades the operating efficiency of the system.

The second is a heartbeat scheme: the job scheduling software is deployed on two management nodes of the cluster, and heartbeat is also deployed on both nodes. The job scheduling system on one management node provides the job scheduling service; after that node fails, heartbeat switches the service over to the other management node, which continues to provide it. However, heartbeat can only monitor the management nodes, not the resources of the job scheduling system. Once such a resource fails, for example when the Maui service fails, resource switching cannot be carried out effectively, so the jobs of the whole cluster again cannot be scheduled reasonably and efficiently, which seriously degrades the operating efficiency of the system.

Summary of the invention

To solve the above technical problem, the invention provides a cluster job scheduling method and apparatus that can give the job scheduling system high fault tolerance and high availability.

To achieve the object of the invention, the invention provides a cluster job scheduling method, comprising: grouping the jobs submitted by users, the job scheduling system scheduling the grouped jobs to at least two virtual machines on a management node, and the at least two virtual machines processing the jobs using shared resources in pre-configured shared storage; if a virtual machine processing jobs on the management node fails, the job scheduling system switching the jobs to a backup virtual machine, and the backup virtual machine processing the jobs using the shared resources in the shared storage; if the management node is under maintenance or the management node fails, the job scheduling system migrating the virtual machines on the management node to other management nodes, and the migrated virtual machines processing the jobs using the shared resources in the shared storage.

Further, the method also comprises: deploying shared storage on the management node, the shared storage comprising the shared resources.

Further, the method also comprises: deploying at least two virtual machines on the management node, specifically, deploying the at least two virtual machines on the management node by means of the Kernel-based Virtual Machine (KVM); if the virtual machines use the same job scheduling system, first deploying one virtual machine by means of KVM and then deploying the other virtual machines by cloning (Clone).

Further, the method also comprises: deploying the job scheduling system on the virtual machines.

Further, determining that a virtual machine processing jobs on the management node has failed comprises: if the job scheduling system cannot schedule the jobs processed by the virtual machine, determining that the virtual machine processing jobs on the management node has failed.

Further, the backup virtual machine is a redundant virtual machine deployed in advance, or is created by cloning (Clone) after the virtual machine is determined to have failed.

The invention also provides a cluster job scheduling device, comprising: a grouping module, configured to group the jobs submitted by users; a scheduling module, configured to schedule the grouped jobs to at least two virtual machines on a management node, the at least two virtual machines processing the jobs using shared resources in shared storage; a switching module, configured to switch the jobs to a backup virtual machine if a virtual machine processing jobs on the management node fails, the backup virtual machine processing the jobs using the shared resources in the shared storage; and a migration module, configured to migrate the virtual machines on the management node to other management nodes if the management node is under maintenance or the management node fails, the migrated virtual machines processing the jobs using the shared resources in the shared storage.

Further, the device also comprises: a first deployment module, configured to deploy shared storage on the management node, the shared storage comprising the shared resources.

Further, the device also comprises: a second deployment module, configured to deploy at least two virtual machines on the management node, the virtual machines being stored in the shared storage. Specifically, the second deployment module deploys the at least two virtual machines on the management node by means of the Kernel-based Virtual Machine (KVM); if the virtual machines use the same job scheduling system, it first deploys one virtual machine by means of KVM and then deploys the other virtual machines by cloning (Clone).

Further, the device also comprises: a third deployment module, configured to deploy the job scheduling system on the virtual machines.

Further, the device also comprises: a first judgment module, configured to judge whether a virtual machine processing jobs on the management node has failed; if the job scheduling system cannot schedule the jobs processed by the virtual machine, the virtual machine processing jobs on the management node is judged to have failed.

Further, the device also comprises: a second judgment module, configured to judge whether the management node is under maintenance or whether the management node has failed; the management node is judged to be under maintenance according to a preset management cycle; if the network between management nodes fails or the management node goes down, the management node is judged to have failed.

Compared with the prior art, the invention comprises: grouping the jobs submitted by users, the job scheduling system scheduling the grouped jobs to at least two virtual machines on a management node, and the at least two virtual machines processing the jobs using shared resources in pre-configured shared storage; if a virtual machine processing jobs on the management node fails, the job scheduling system switching the jobs to a backup virtual machine, and the backup virtual machine processing the jobs using the shared resources in the shared storage; if the management node is under maintenance or the management node fails, the job scheduling system migrating the virtual machines on the management node to other management nodes, and the migrated virtual machines processing the jobs using the shared resources in the shared storage. By grouping the jobs submitted by users and scheduling the grouped jobs to different virtual machines on the management node, and because the virtual machines do not interfere with one another, the invention not only distributes the load of the submitted jobs but also ensures that a problem in one virtual machine does not affect the normal operation of the other virtual machines on the management node, thereby giving the job scheduling system high fault tolerance. In addition, the invention deploys shared storage on the management node. If a virtual machine switchover is performed, the backup virtual machine processes jobs using the shared resources in the shared storage; if a virtual machine migration is performed, the migrated virtual machine likewise processes jobs using the shared resources in the shared storage. This avoids the problem that resource switching cannot be carried out effectively when a resource fails, thereby giving the job scheduling system high availability.

Brief description of the drawings

Fig. 1 is a schematic flowchart of the cluster job scheduling method of the invention.

Fig. 2 is a schematic structural diagram of the cluster job scheduling device of the invention.

Detailed description of the embodiments

The invention is described below with reference to the embodiments shown in the drawings.

The cluster job scheduling system of the invention comprises management nodes, virtual machines and shared storage, wherein:

A management node is a physical node, and the cluster job scheduling system comprises at least two management nodes;

Virtual machines are deployed on the management nodes by means of the Kernel-based Virtual Machine (KVM). KVM is an open-source system virtualization module that is built into the Linux operating system kernel. Virtual machines can be created and managed on a management node through KVM; several virtual machines can be deployed on one management node, and each virtual machine can have its own independent job scheduling system deployed on it;

Shared storage is deployed on the management nodes. The shared storage may be a network file system (NFS, Network File System) or network attached storage (NAS, Network Attached Storage), and is used to share the data of the management nodes, to store the virtual machines, and so on.

Fig. 1 is a schematic flowchart of the cluster job scheduling method of the invention. As shown in Fig. 1, the method comprises:

Step 11: deploy shared storage on the management node; the shared storage comprises shared resources.

In this step, shared storage is created on the management node and its storage resources are configured; specifically, the storage resources of the shared storage can be configured with virsh commands or through a graphical interface.
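As an illustration of the virsh route, the following sketch builds the command lines that would define, start and autostart an NFS-backed libvirt storage pool. The patent does not give the actual commands; the pool name, NFS host, export path and mount point below are hypothetical, and the sketch only prints the commands rather than running them.

```python
from typing import List


def nfs_pool_commands(pool: str, nfs_host: str, export_path: str,
                      mount_point: str) -> List[List[str]]:
    """Return virsh argv lists that define, start and autostart an NFS pool.

    All argument values are supplied by the caller; nothing here is taken
    from the patent text.
    """
    return [
        # "netfs" is the libvirt pool type for network file systems such as NFS.
        ["virsh", "pool-define-as", pool, "netfs",
         "--source-host", nfs_host,
         "--source-path", export_path,
         "--target", mount_point],
        ["virsh", "pool-start", pool],
        ["virsh", "pool-autostart", pool],
    ]


# Hypothetical values, for illustration only.
cmds = nfs_pool_commands("jobsched-shared", "nfs.example.org",
                         "/export/jobsched", "/var/lib/libvirt/jobsched")
for argv in cmds:
    print(" ".join(argv))
```

On a real management node the mount point would have to exist (or be created, for example with `virsh pool-build`) before the pool is started, and the same pool would be defined on every management node so that migrated virtual machines see the same storage.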

Step 12: deploy at least two virtual machines on the management node; the virtual machines are stored in the shared storage.

In this step, the virtual machines can be created by means of KVM. If the virtual machines use the same job scheduling system, one virtual machine can be created first and the others then created by cloning, by executing the Clone command.

The virtual machines can be given identical hardware resources when they are created, for example 1 GB of memory, 2 CPUs and 10 GB of disk space; after creation, the hardware configuration can be modified as required.
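The create-one-then-clone approach can be sketched as follows. The patent only says that a Clone command is executed; this sketch assumes the `virt-clone` tool that commonly accompanies KVM/libvirt, and the template name and naming scheme are hypothetical. The commands are printed, not executed.

```python
from typing import List


def clone_commands(template: str, total_vms: int,
                   prefix: str = "sched-vm") -> List[List[str]]:
    """Return virt-clone argv lists that bring the VM count up to total_vms.

    The template VM counts as the first VM, so total_vms - 1 clones are made.
    """
    if total_vms < 2:
        raise ValueError("the method requires at least two virtual machines")
    return [
        # --auto-clone lets virt-clone pick disk paths for the new VM.
        ["virt-clone", "--original", template,
         "--name", f"{prefix}-{i}", "--auto-clone"]
        for i in range(2, total_vms + 1)
    ]


# Hypothetical template name; three VMs in total means two clones.
clone_cmds = clone_commands("sched-vm-1", total_vms=3)
for argv in clone_cmds:
    print(" ".join(argv))
```

Note that `virt-clone` expects the original virtual machine to be shut off or paused while it is being cloned.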

Step 13: deploy the job scheduling system on the virtual machines.

In this step, each virtual machine can run an independent operating system, so a job scheduling system is deployed on each virtual machine. The job scheduling system is mainly responsible for queuing and scheduling the jobs submitted by users and for allocating the resources the jobs need, such as memory and input/output devices; when a job finishes, it is responsible for reclaiming system resources. The job scheduling system may be the open-source Torque+Maui software.

Step 14: group the jobs submitted by users; the job scheduling system schedules the grouped jobs to at least two virtual machines on the management node, and the at least two virtual machines process the jobs using the shared resources in the shared storage.

In this step, the jobs submitted by users may be grouped by job type or according to a grouping defined by the user.

The job scheduling system schedules the grouped jobs to different virtual machines, which process them. On the one hand this distributes the load of the jobs submitted by users; on the other hand, if one virtual machine has a problem, the normal operation of the other virtual machines is not affected.
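A minimal sketch of the grouping and dispatching in step 14 is given below. It assumes grouping by job type and a simple round-robin assignment of groups to virtual machines; the patent does not prescribe an assignment policy, and the `Job` fields, VM names and job types are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Job:
    job_id: str
    job_type: str  # hypothetical grouping key; a user-defined key works the same way


def group_jobs(jobs: List[Job]) -> Dict[str, List[Job]]:
    """Group submitted jobs by job type, preserving submission order."""
    groups: Dict[str, List[Job]] = defaultdict(list)
    for job in jobs:
        groups[job.job_type].append(job)
    return dict(groups)


def dispatch(groups: Dict[str, List[Job]], vms: List[str]) -> Dict[str, List[Job]]:
    """Assign whole groups to VMs in round-robin order (an assumed policy)."""
    if len(vms) < 2:
        raise ValueError("the method schedules to at least two virtual machines")
    assignment: Dict[str, List[Job]] = {vm: [] for vm in vms}
    for index, key in enumerate(sorted(groups)):
        assignment[vms[index % len(vms)]].extend(groups[key])
    return assignment


jobs = [Job("j1", "mpi"), Job("j2", "serial"), Job("j3", "mpi"), Job("j4", "render")]
assignment = dispatch(group_jobs(jobs), ["vm-a", "vm-b"])
for vm, assigned in assignment.items():
    print(vm, [job.job_id for job in assigned])
```

Keeping each group on a single virtual machine means that a fault in one virtual machine affects only the groups assigned to it, which is the isolation property the step relies on.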

Step 15: if a virtual machine processing jobs on the management node fails, the job scheduling system switches the jobs to a backup virtual machine, and the backup virtual machine processes the jobs using the shared resources in the shared storage.

In this step, if the job scheduling system cannot schedule the jobs processed by a virtual machine, for example when the job status shows that a job has not been completed within a predetermined time, or when the job scheduling service status shows that scheduling has failed, the virtual machine processing the jobs can be judged to have failed.

The job scheduling system switches the jobs on the failed virtual machine to a backup virtual machine. The backup virtual machine may be a redundant virtual machine deployed in advance: for example, two virtual machines are set up as a backup pair in advance, the job scheduling system assigns jobs to one of them and the other serves as the backup, and when the virtual machine processing the jobs fails, the job scheduling system switches the jobs to the backup virtual machine. Alternatively, the backup virtual machine may be created quickly by cloning through KVM after the job scheduling system judges that a virtual machine has failed.
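The failure test and switchover of step 15 can be sketched as below. The two failure signals (a job overrunning a predetermined time, or a scheduling service reporting failure) come from the text; the data structures, the state string `"scheduling_failed"` and the timeout value are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class JobStatus:
    job_id: str
    started_at: float      # seconds, on any monotonic clock
    finished: bool = False


def vm_has_failed(jobs: List[JobStatus], now: float, timeout_s: float,
                  service_state: str = "ok") -> bool:
    """Judge a VM as failed if scheduling failed or a job overran the timeout."""
    if service_state == "scheduling_failed":
        return True
    return any(not job.finished and now - job.started_at > timeout_s
               for job in jobs)


def fail_over(assignment: Dict[str, List[str]], failed_vm: str,
              backup_vm: str) -> Dict[str, List[str]]:
    """Return a new assignment with the failed VM's jobs moved to the backup VM."""
    updated = {vm: list(ids) for vm, ids in assignment.items() if vm != failed_vm}
    updated.setdefault(backup_vm, []).extend(assignment.get(failed_vm, []))
    return updated


statuses = [JobStatus("j1", started_at=0.0), JobStatus("j2", started_at=50.0, finished=True)]
failed = vm_has_failed(statuses, now=400.0, timeout_s=300.0)
after = fail_over({"vm-a": ["j1", "j2"], "vm-b": ["j4"]}, "vm-a", "vm-a-backup")
print(failed, after)
```

Because the backup virtual machine reads the same shared storage as the failed one, the switchover only has to move the job assignment, not the job data.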

Step 16: if the management node is under maintenance or the management node fails, the job scheduling system migrates the virtual machines on the management node to other management nodes, and the migrated virtual machines process the jobs using the shared resources in the shared storage.

In this step, to keep the management nodes stable, they need regular maintenance, for example according to a preset management cycle. When a management node is maintained according to this management cycle, the job scheduling system live-migrates the virtual machines running on that management node to other management nodes; after maintenance of the management node is finished, the job scheduling system migrates the virtual machines back.

If a management node fails, for example because the network between management nodes has a problem or the management node goes down, so that other nodes cannot access it, the virtual machines running on that management node are live-migrated to other management nodes.
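The migration in step 16 can be sketched as a planning function that picks a destination for each virtual machine, plus the `virsh migrate --live` command line for the maintenance case. The least-loaded-node policy, node names and connection URI form are assumptions; the patent does not say how the destination is chosen, nor how a virtual machine is brought up elsewhere when its node is already unreachable.

```python
from typing import Dict, List


def plan_migration(placement: Dict[str, List[str]], source: str) -> Dict[str, str]:
    """Map each VM on `source` to the currently least-loaded other node."""
    load = {node: len(vms) for node, vms in placement.items() if node != source}
    if not load:
        raise ValueError("migration needs at least one other management node")
    plan: Dict[str, str] = {}
    for vm in placement[source]:
        # Ties are broken by node name so the plan is deterministic.
        dest = min(sorted(load), key=load.get)
        plan[vm] = dest
        load[dest] += 1
    return plan


def live_migrate_argv(vm: str, dest_node: str) -> List[str]:
    """virsh command line for a live migration over SSH (assumed transport)."""
    return ["virsh", "migrate", "--live", vm, f"qemu+ssh://{dest_node}/system"]


placement = {"mgmt-1": ["vm-a", "vm-b", "vm-c"], "mgmt-2": ["vm-d"], "mgmt-3": []}
plan = plan_migration(placement, "mgmt-1")
for vm, dest in plan.items():
    print(" ".join(live_migrate_argv(vm, dest)))
```

Migrating the virtual machines back after maintenance is the same operation with the roles of source and destination swapped; since the virtual machine disks live in the shared storage, only the running state has to move between nodes.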

Fig. 2 is a schematic structural diagram of the cluster job scheduling device of the invention. As shown in Fig. 2, the device comprises:

A first deployment module, configured to deploy shared storage on the management node; the shared storage comprises shared resources.

A second deployment module, configured to deploy at least two virtual machines on the management node; the virtual machines are stored in the shared storage.

A third deployment module, configured to deploy the job scheduling system on the virtual machines.

A grouping module, configured to group the jobs submitted by users.

A scheduling module, configured to schedule the grouped jobs to at least two virtual machines on the management node; the at least two virtual machines process the jobs using the shared resources in the shared storage.

A first judgment module, configured to judge whether a virtual machine processing jobs on the management node has failed.

A switching module, configured to switch the jobs to a backup virtual machine if a virtual machine processing jobs on the management node fails; the backup virtual machine processes the jobs using the shared resources in the shared storage.

A second judgment module, configured to judge whether the management node is under maintenance or whether the management node has failed.

A migration module, configured to migrate the virtual machines on the management node to other management nodes if the management node is under maintenance or the management node fails; the migrated virtual machines process the jobs using the shared resources in the shared storage.

By grouping the jobs submitted by users and scheduling the grouped jobs to different virtual machines on the management node, and because the virtual machines do not interfere with one another, the invention not only distributes the load of the submitted jobs but also ensures that a problem in one virtual machine does not affect the normal operation of the other virtual machines on the management node, thereby giving the job scheduling system high fault tolerance.

In addition, the invention deploys shared storage on the management node. If a virtual machine switchover is performed, the backup virtual machine processes jobs using the shared resources in the shared storage; if a virtual machine migration is performed, the migrated virtual machine likewise processes jobs using the shared resources in the shared storage. This avoids the problem that resource switching cannot be carried out effectively when a resource fails, thereby giving the job scheduling system high availability.

It should be understood that although this specification is described in terms of embodiments, not every embodiment contains only one independent technical solution. The specification is written this way only for the sake of clarity; those skilled in the art should treat the specification as a whole, and the technical solutions in the embodiments may be combined as appropriate to form other embodiments that those skilled in the art can understand.

The detailed descriptions listed above are only specific illustrations of feasible embodiments of the invention and are not intended to limit its scope of protection; all equivalent embodiments or modifications that do not depart from the technical spirit of the invention shall fall within its scope of protection.

Claims

1. A cluster job scheduling method, characterized by comprising:

grouping the jobs submitted by users, a job scheduling system scheduling the grouped jobs to at least two virtual machines on a management node, and the at least two virtual machines processing the jobs using shared resources in pre-configured shared storage;

if a virtual machine processing jobs on the management node fails, the job scheduling system switching the jobs to a backup virtual machine, and the backup virtual machine processing the jobs using the shared resources in the shared storage;

if the management node is under maintenance or the management node fails, the job scheduling system migrating the virtual machines on the management node to other management nodes, and the migrated virtual machines processing the jobs using the shared resources in the shared storage.

2. The cluster job scheduling method according to claim 1, characterized in that the method further comprises:

deploying shared storage on the management node, the shared storage comprising the shared resources.

3. The cluster job scheduling method according to claim 1, characterized in that the method further comprises: deploying at least two virtual machines on the management node, specifically,

deploying the at least two virtual machines on the management node by means of the Kernel-based Virtual Machine (KVM);

if the virtual machines use the same job scheduling system, first deploying one virtual machine by means of KVM and then deploying the other virtual machines by cloning (Clone).

4. The cluster job scheduling method according to claim 1, characterized in that the method further comprises: deploying the job scheduling system on the virtual machines.

5. The cluster job scheduling method according to claim 1, characterized in that the failure of a virtual machine processing jobs on the management node comprises:

if the job scheduling system cannot schedule the jobs processed by the virtual machine, judging that the virtual machine processing jobs on the management node has failed.

6. The cluster job scheduling method according to claim 1, characterized in that the backup virtual machine is a redundant virtual machine deployed in advance, or is created by cloning (Clone) after the virtual machine is judged to have failed.

7. A cluster job scheduling device, characterized by comprising:

a grouping module, configured to group the jobs submitted by users;

a scheduling module, configured to schedule the grouped jobs to at least two virtual machines on a management node, the at least two virtual machines processing the jobs using shared resources in shared storage;

a switching module, configured to switch the jobs to a backup virtual machine if a virtual machine processing jobs on the management node fails, the backup virtual machine processing the jobs using the shared resources in the shared storage;

a migration module, configured to migrate the virtual machines on the management node to other management nodes if the management node is under maintenance or the management node fails, the migrated virtual machines processing the jobs using the shared resources in the shared storage.

8. The cluster job scheduling device according to claim 7, characterized in that the device further comprises:

9. The cluster job scheduling device according to claim 7, characterized in that the device further comprises:

a second deployment module, configured to deploy at least two virtual machines on the management node, the virtual machines being stored in the shared storage;

the second deployment module deploying at least two virtual machines on the management node specifically comprises:

the second deployment module deploying the at least two virtual machines on the management node by means of the Kernel-based Virtual Machine (KVM); if the virtual machines use the same job scheduling system, first deploying one virtual machine by means of KVM and then deploying the other virtual machines by cloning (Clone).

10. The cluster job scheduling device according to claim 7, characterized in that the device further comprises:

11. The cluster job scheduling device according to claim 7, characterized in that the device further comprises:

12. The cluster job scheduling device according to claim 7, characterized in that the device further comprises: a second judgment module, configured to judge whether the management node is under maintenance or whether the management node has failed;

the management node is judged to be under maintenance according to a preset management cycle;

if the network between management nodes fails or the management node goes down, the management node is judged to have failed.