WO2007043142A1 - Job management device and job management program

Job management device and job management program

Info

Publication number
WO2007043142A1
Authority
WO
WIPO (PCT)
Prior art keywords
job
nodes
parallel
queue
node
Application number
PCT/JP2005/018418
Other languages
French (fr)
Japanese (ja)
Inventor
Makoto Kanno
Original Assignee
Fujitsu Limited
Application filed by Fujitsu Limited
Priority to PCT/JP2005/018418
Publication of WO2007043142A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/5017 Task decomposition

Definitions

  • The present invention relates to a job management apparatus and a job management program that allocate a queued job to one of a plurality of nodes, and more particularly to a job management apparatus and a job management program that allocate sequential jobs and parallel jobs to nodes.
  • Patent Documents 1 and 2 do not consider job assignment when sequential jobs and parallel jobs are mixed. Therefore, in a system that executes processing of sequential jobs and parallel jobs, job execution efficiency cannot be sufficiently improved.
  • A sequential job is a job executed on one node.
  • A parallel job is a job executed on multiple nodes in parallel.
  • Sequential jobs tolerate a decline in execution performance well, so multiple sequential jobs can be executed on one node.
  • In a parallel job, information is transferred between the processes executed in parallel. Therefore, if multiple jobs are executed on one node, the execution performance deteriorates. If the timing for passing information to other nodes is delayed, waiting time occurs on the other nodes, and their processing is also delayed.
  • FIG. 15 is a diagram showing job assignment by the stuffing method. As shown in FIG. 15, six nodes 921 to 926 are connected via a network 910. Each of the nodes 921 to 926 is assumed to have two processors. The job management apparatus 931 assigns jobs to the nodes 921 to 926 by the stuffing method.
  • In this example, a parallel job 945 is queued after four consecutive sequential jobs 941 to 944.
  • The parallel number of the parallel job 945 (the number of nodes that perform parallel processing) is “3”.
  • In the figure, the order in which the jobs were queued is shown in each job.
  • In the stuffing method, jobs are assigned one after another, in a predetermined node order, to nodes that can execute them. The first and second sequential jobs 941, 942 are assigned to node 921. Next, the third and fourth sequential jobs 943, 944 are assigned to node 922. The fifth job, the parallel job 945, is assigned to nodes 923 to 925.
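  • The stuffing-only assignment of FIG. 15 can be sketched in a few lines of Python. This is a minimal illustration, not the conventional apparatus itself: the function name `stuffing_only`, the tuple encoding of jobs, and the rule that a parallel job only takes nodes running no other job are assumptions made for the example.

```python
def stuffing_only(node_names, cpus_per_node, queue):
    """Assign jobs in queue order, filling nodes in a fixed node order.

    queue holds (job_name, parallel_number) pairs; a parallel number of 1
    marks a sequential job (an encoding assumed for this sketch).
    """
    used = {name: 0 for name in node_names}  # occupied processors per node
    placement = {}
    for job_name, parallel_number in queue:
        if parallel_number > 1:
            # Assumption: a parallel job only takes nodes running nothing else.
            targets = [n for n in node_names if used[n] == 0][:parallel_number]
            if len(targets) < parallel_number:
                raise RuntimeError(f"not enough free nodes for {job_name}")
            for n in targets:
                used[n] = cpus_per_node
        else:
            # Sequential job: first node, in the fixed order, with a free processor.
            candidates = [n for n in node_names if used[n] < cpus_per_node]
            if not candidates:
                raise RuntimeError(f"no processor available for {job_name}")
            targets = candidates[:1]
            used[targets[0]] += 1
        placement[job_name] = targets
    return placement


nodes = ["921", "922", "923", "924", "925", "926"]
queue = [("941", 1), ("942", 1), ("943", 1), ("944", 1), ("945", 3)]
result = stuffing_only(nodes, 2, queue)
```

  • With the queue of FIG. 15, the sketch places jobs 941 and 942 on node 921, jobs 943 and 944 on node 922, and job 945 on nodes 923 to 925, which matches the assignment described for the stuffing method.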
  • FIG. 16 is a diagram showing job assignment in the distributed arrangement method.
  • In FIG. 16, the same jobs as in FIG. 15 are assigned to the nodes 921 to 926 by the job management apparatus 932 using the distributed arrangement method.
  • In the distributed arrangement method, the generated jobs are distributed and placed so that the load on each node is equalized. Specifically, the first to fourth sequential jobs 941 to 944
  • Patent Document 1: Japanese Patent Laid-Open No. 8-305671
  • Patent Document 2: JP 2000-315199 A
  • The present invention has been made in view of the above points, and an object thereof is to provide a job management apparatus and a job management program capable of performing efficient job allocation even when the ratio at which sequential jobs and parallel jobs are generated changes.
  • A job management apparatus 1 as shown in FIG. 1 assigns sequential jobs, each executed on one node, and parallel jobs, each executed by parallel processing on a plurality of nodes, to nodes 2a, 2b, .... For this purpose, it has the following functions.
  • The queue buffer 1a stores job queues in the input order.
  • The reserved node number determination means 1b determines, based on the job queues stored in the queue buffer 1a that indicate parallel job processing requests, the number of reserved nodes, that is, the number of nodes to be reserved in advance for parallel job processing.
  • The free node number calculating means 1c monitors whether or not a job is being executed on each of the plurality of nodes, and calculates the number of free nodes, that is, the number of nodes that are not executing a job.
  • The job queue acquisition means 1d acquires the job queues stored in the queue buffer 1a in the order in which they were stored.
  • The job type determination means 1e determines whether the job queue acquired by the job queue acquisition means 1d is a parallel job processing request or a sequential job processing request.
  • When the acquired job queue is a parallel job processing request, the parallel job allocation means 1f determines as many free nodes as the parallel number of the parallel job as the allocation destination.
  • When the acquired job queue is a sequential job processing request, the allocation method selection means 1g determines whether the number of free nodes is excessive or insufficient based on the number of free nodes and the number of reserved nodes. If the number of free nodes is insufficient, the stuffing method is selected as the job allocation method, and if the number of free nodes is sufficient, the distributed arrangement method is selected as the job allocation method.
  • When the stuffing method is selected, the stuffing method job allocation means 1h preferentially determines a node that is already executing another job as the allocation destination of the job acquired by the job queue acquisition means 1d.
  • When the distributed arrangement method is selected, the distributed arrangement method job allocation means 1i preferentially determines a node that is not executing a job as the allocation destination of the job acquired by the job queue acquisition means 1d.
  • The processing request transmission means 1j transmits the processing request for the job indicated by the job queue acquired by the job queue acquisition means 1d to the node determined as the allocation destination.
  • In such a job management apparatus, job queues are stored in the queue buffer 1a in the input order.
  • The reserved node number determining means 1b determines the number of reserved nodes, that is, the number of nodes that should be reserved in advance for processing parallel jobs.
  • The free node number calculating means 1c monitors whether or not a job is being executed on each of the plurality of nodes, and calculates the number of free nodes, that is, the number of nodes that are not executing a job.
  • The job queues stored in the queue buffer 1a are acquired by the job queue acquisition means 1d in the order in which they were stored. The job type determination means 1e then determines whether the job queue acquired by the job queue acquisition means 1d is a parallel job processing request or a sequential job processing request.
  • When the acquired job queue is a parallel job processing request, the parallel job allocation means 1f determines as many free nodes as the parallel number of the parallel job as the allocation destination.
  • When the job queue acquired by the job queue acquisition means 1d is a sequential job processing request, the allocation method selection means 1g determines whether the number of free nodes is excessive or insufficient based on the number of free nodes and the number of reserved nodes. If the number of free nodes is insufficient, the stuffing method is selected as the job allocation method, and if the number of free nodes is sufficient, the distributed arrangement method is selected as the job allocation method.
  • When the stuffing method is selected, the stuffing method job allocation means 1h preferentially determines a node that is already executing another job as the allocation destination of the job acquired by the job queue acquisition means 1d.
  • When the distributed arrangement method is selected, the distributed arrangement method job allocation means 1i preferentially determines a node that is not executing a job as the allocation destination of the job acquired by the job queue acquisition means 1d.
  • The processing request transmission means 1j transmits the processing request for the job indicated by the job queue acquired by the job queue acquisition means 1d to the node determined as the allocation destination.
  • Also provided is a job management program that causes a computer to function as: a queue buffer that stores job queues in the input order; reserved node number determination means for determining, based on the job queues stored in the queue buffer that indicate parallel job processing requests, the number of reserved nodes indicating the number of nodes to be reserved in advance for parallel job processing; free node number calculating means for monitoring whether or not a job is being executed on each of the plurality of nodes and calculating the number of free nodes indicating the number of nodes that are not executing a job; job queue acquisition means for acquiring the job queues stored in the queue buffer in the order of storage; job type determination means for determining whether the job queue acquired by the job queue acquisition means is a parallel job processing request or a sequential job processing request; parallel job allocation means for determining, when the job queue acquired by the job queue acquisition means is a parallel job processing request, free nodes corresponding to the parallel number of the parallel job as an allocation destination; allocation method selection means for determining, when the job queue acquired by the job queue acquisition means is a sequential job processing request, whether the number of free nodes is excessive or insufficient based on the number of free nodes and the number of reserved nodes, selecting the stuffing method as the job allocation method when the number of free nodes is insufficient, and selecting the distributed arrangement method as the job allocation method when the number of free nodes is sufficient; stuffing method job allocation means for preferentially determining, when the stuffing method is selected, a node already executing another job as the allocation destination of the job acquired by the job queue acquisition means; distributed arrangement method job allocation means for preferentially determining, when the distributed arrangement method is selected, a node that is not executing a job as the allocation destination of the job acquired by the job queue acquisition means; and processing request transmission means for transmitting the processing request of the job indicated by the job queue acquired by the job queue acquisition means to the node determined as the allocation destination.
  • In the present invention, the number of reserved nodes is determined based on the job queues stored in the queue buffer that indicate parallel job processing requests, and whether the number of free nodes is excessive or insufficient is determined based on the number of free nodes and the number of reserved nodes. If the number of free nodes is insufficient, a sequential job is assigned to a node using the stuffing method. If the number of free nodes is sufficient, a sequential job is assigned to a node using the distributed arrangement method. Consequently, when there are few parallel job processing requests, the distributed arrangement method, which prioritizes sequential jobs, is adopted, and when parallel job processing requests increase, the stuffing method, which prioritizes parallel jobs, is adopted. As a result, efficient job assignment can always be performed.
  • FIG. 1 is a diagram showing an outline of the present embodiment.
  • FIG. 2 is a diagram showing an example of job distribution.
  • FIG. 3 is a diagram showing a system configuration example according to the present embodiment.
  • FIG. 4 is a diagram illustrating a hardware configuration example of a job management server used in the present embodiment.
  • FIG. 5 is a block diagram showing functions of the job management server.
  • FIG. 6 is a block diagram illustrating functions of a job assignment unit.
  • FIG. 7 is a flowchart showing a procedure for job assignment processing.
  • FIG. 8 shows an example of a queued job.
  • FIG. 9 is a diagram showing a state where up to the third job is assigned.
  • FIG. 10 is a diagram showing a state after completion of parallel job processing.
  • FIG. 11 is a diagram showing a state where up to the sixth job is assigned.
  • FIG. 12 is a diagram showing a state in which all jobs are assigned.
  • FIG. 13 is a diagram illustrating an example in which jobs are assigned only by the stuffing method.
  • FIG. 14 is a diagram showing an example in which jobs are assigned only by a distributed arrangement method.
  • FIG. 15 is a diagram showing job assignment by the stuffing method.
  • FIG. 16 is a diagram showing job assignment in a distributed arrangement method.
  • FIG. 1 is a diagram showing an outline of the present embodiment.
  • The job management apparatus 1 assigns sequential jobs, each executed on one node, and parallel jobs, each executed by parallel processing on a plurality of nodes, to the nodes 2a, 2b, ....
  • The job management apparatus 1 includes a queue buffer 1a, a reserved node number determining means 1b, a free node number calculating means 1c, a job queue acquisition means 1d, a job type determination means 1e, a parallel job allocation means 1f, an allocation method selection means 1g, a stuffing method job allocation means 1h, a distributed arrangement method job allocation means 1i, and a processing request transmission means 1j.
  • The queue buffer 1a stores job queues in the input order.
  • The input job processing requests include processing requests for sequential jobs and for parallel jobs.
  • Each job processing request includes information indicating whether the job is a parallel job or a sequential job.
  • A parallel job processing request also indicates the parallel number of the parallel job.
  • The reserved node number determination means 1b determines, based on the job queues stored in the queue buffer 1a that indicate parallel job processing requests, the number of reserved nodes, that is, the number of nodes to be reserved in advance for processing parallel jobs. For example, by referring to the job queues in the queue buffer 1a that indicate parallel job processing requests, the total of the parallel numbers of those parallel jobs can be set as the number of reserved nodes.
  • The free node number calculating means 1c monitors whether or not a job is being executed on each of the plurality of nodes 2a, 2b, ..., and calculates the number of free nodes, that is, the number of nodes that are not executing a job.
  • The job queue acquisition means 1d acquires the job queues stored in the queue buffer 1a in the order of storage.
  • The job type determination means 1e determines whether the job queue acquired by the job queue acquisition means 1d is a parallel job processing request or a sequential job processing request.
  • If the job queue acquired by the job queue acquisition means 1d is a parallel job processing request, the parallel job allocation means 1f determines as many free nodes as the parallel number of the parallel job as the allocation destination.
  • If the job queue acquired by the job queue acquisition means 1d is a sequential job processing request, the allocation method selection means 1g determines whether the number of free nodes is excessive or insufficient based on the number of free nodes and the number of reserved nodes. For example, the allocation method selection means 1g determines that the number of free nodes is insufficient if it is equal to or less than the number of reserved nodes, and that it is sufficient if it exceeds the number of reserved nodes. The allocation method selection means 1g selects the stuffing method as the job allocation method if the number of free nodes is insufficient, and the distributed arrangement method if the number of free nodes is sufficient.
  • When the stuffing method is selected, the stuffing method job allocation means 1h preferentially determines a node that is executing another job as the allocation destination of the job acquired by the job queue acquisition means 1d.
  • The stuffing method job allocation means 1h selects, from among the nodes that are executing another job, a node that can execute an additional job, and makes it the allocation target.
  • When the distributed arrangement method is selected, the distributed arrangement method job allocation means 1i preferentially determines a node that is not executing a job as the allocation destination of the job acquired by the job queue acquisition means 1d.
  • The processing request transmission means 1j transmits a processing request for the job indicated by the job queue acquired by the job queue acquisition means 1d to the node determined as the allocation destination. In response to this processing request, the job is processed at each node.
  • Such a job management apparatus 1 is used in a system in which a plurality of nodes 2a, 2b, ... are connected via a network. Each of the nodes 2a, 2b, ... has a plurality of CPUs.
  • The job management apparatus 1 assigns each queued job to a node, where the job is placed and executed.
  • The node to which a job is allocated is determined based on the number of nodes to which no job is assigned (free nodes) in the entire system, the remaining time of running jobs, and the types of the queued jobs.
  • When few parallel jobs are queued, the reserved node number determining means 1b sets the number of reserved nodes to a small value.
  • As a result, sequential jobs can be allocated by the distributed arrangement method. In the distributed arrangement method, jobs are preferentially assigned to free nodes, so the execution efficiency of the entire system can be improved.
  • When the number of queued parallel jobs increases, the reserved node number determination means 1b increases the number of reserved nodes. As a result, even if the number of parallel jobs increases, parallel jobs can be prevented from being left waiting for execution.
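  • The interplay between the number of reserved nodes and the allocation method can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the apparatus itself: the `Job` class, its field names, and the convention that a parallel number of 1 denotes a sequential job are hypothetical. The two rules it encodes come from the description above: the number of reserved nodes is the total of the parallel numbers of the queued parallel jobs, and the stuffing method is chosen when the number of free nodes is equal to or less than that total.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    parallel_number: int = 1  # 1 = sequential job (encoding assumed for this sketch)


def reserved_node_count(queue):
    """Total of the parallel numbers of the parallel jobs waiting in the queue."""
    return sum(job.parallel_number for job in queue if job.parallel_number > 1)


def select_method(free_nodes, reserved_nodes):
    """Pick the allocation method for a sequential job."""
    # Free nodes are "insufficient" when they do not exceed the reserved count.
    return "stuffing" if free_nodes <= reserved_nodes else "distributed"


queue = [Job("s1"), Job("p1", 3), Job("p2", 2)]
reserved = reserved_node_count(queue)
```

  • With this queue the number of reserved nodes is 5, so a sequential job is spread onto a free node only while more than five nodes are free. An empty queue gives a reserved count of 0, and any free node then qualifies for the distributed arrangement method.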
  • FIG. 2 is a diagram illustrating an example of job distribution.
  • In this example, jobs are assigned according to the present embodiment under the same conditions as in the conventional examples (FIG. 15 and FIG. 16).
  • As shown in FIG. 2, six nodes 2a, 2b, 2c, 2d, 2e, 2f are connected via a network 3.
  • Each of the nodes 2a, 2b, 2c, 2d, 2e, and 2f has two CPUs.
  • Assume that, in the queue buffer 1a of the job management apparatus 1, the job queues indicating processing requests for the sequential jobs 4a, 4b, 4c, and 4d are followed by a job queue indicating a processing request for a parallel job 4e with a parallel number of 3.
  • The total of the parallel numbers of the parallel jobs in the queue buffer 1a is the number of reserved nodes, which is 3 in this example.
  • First, the sequential jobs 4a, 4b, 4c are assigned to the nodes 2a, 2b, 2c, respectively, by the distributed arrangement method.
  • At this point, the number of free nodes matches the number of reserved nodes. Therefore, the next sequential job 4d is assigned to the node 2a by the stuffing method.
  • The parallel job 4e is then assigned to the nodes 2d, 2e, and 2f.
  • In this way, sequential jobs are distributed across the nodes to the extent that they do not hinder the execution of parallel jobs, so queued jobs are cleared more quickly. In addition, because free nodes are put to use, the throughput of the entire system is improved.
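  • The FIG. 2 walk-through can be reproduced with a short simulation. The Python sketch below is only an illustration under stated assumptions: the function name `assign_jobs` and the tuple encoding of jobs are hypothetical, no job finishes during the run, and under the stuffing method the first busy node with a spare CPU is chosen (FIG. 2 only shows that node 2a receives job 4d).

```python
def assign_jobs(node_names, cpus_per_node, queue):
    """Assign queued jobs, switching between the two methods per job.

    queue holds (job_name, parallel_number) pairs; 1 marks a sequential job.
    """
    load = {name: 0 for name in node_names}  # jobs running on each node
    placement = {}
    pending = list(queue)
    while pending:
        job_name, parallel_number = pending.pop(0)
        free = [n for n in node_names if load[n] == 0]
        if parallel_number > 1:
            # Parallel job: take as many free nodes as the parallel number.
            targets = free[:parallel_number]
            if len(targets) < parallel_number:
                raise RuntimeError(f"not enough free nodes for {job_name}")
        else:
            # Reserved nodes: total parallel number of parallel jobs still queued.
            reserved = sum(p for _, p in pending if p > 1)
            if len(free) > reserved:
                targets = free[:1]  # distributed arrangement method
            else:
                # Stuffing method: prefer a busy node that still has a spare CPU.
                busy = [n for n in node_names if 0 < load[n] < cpus_per_node]
                targets = (busy or free)[:1]
                if not targets:
                    raise RuntimeError(f"no node can run {job_name}")
        for n in targets:
            load[n] += 1
        placement[job_name] = targets
    return placement


nodes = ["2a", "2b", "2c", "2d", "2e", "2f"]
queue = [("4a", 1), ("4b", 1), ("4c", 1), ("4d", 1), ("4e", 3)]
placement = assign_jobs(nodes, 2, queue)
```

  • Under these assumptions, jobs 4a, 4b, and 4c land on nodes 2a, 2b, and 2c, job 4d is stuffed onto node 2a once only three free nodes remain, and job 4e receives nodes 2d, 2e, and 2f.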
  • FIG. 3 is a diagram showing a system configuration example of the present embodiment.
  • The job management server 100 is connected to six nodes 21 to 26 via the network 10.
  • The nodes 21 to 26 are connected to one another via a network 30 capable of high-speed communication.
  • The network 30 is used to communicate information passed between parallel jobs executed on the nodes 21 to 26.
  • A processing request for a job to be executed is first registered in the job management server 100.
  • A registered job processing request is automatically sent at a predetermined time.
  • The job management server 100 performs job processing request distribution processing, and each job processing request is assigned to one of the nodes 21 to 26.
  • The job management server 100 has the following hardware configuration to perform the job processing request distribution processing.
  • FIG. 4 is a diagram illustrating a hardware configuration example of the job management server used in the present embodiment.
  • The job management server 100 is entirely controlled by the CPU 101.
  • A random access memory (RAM) 102, a hard disk drive (HDD) 103, a graphic processing device 104, an input interface 105, and a communication interface 106 are connected to the CPU 101 via a bus 107.
  • At least a part of the OS (Operating System) program and the application programs to be executed by the CPU 101 is temporarily stored in the RAM 102.
  • The RAM 102 also stores various data necessary for processing by the CPU 101.
  • The HDD 103 stores the OS and application programs.
  • A monitor 11 is connected to the graphic processing device 104.
  • The graphic processing device 104 displays an image on the screen of the monitor 11 in accordance with a command from the CPU 101.
  • A keyboard 12 and a mouse 13 are connected to the input interface 105.
  • The input interface 105 transmits signals sent from the keyboard 12 or the mouse 13 to the CPU 101 via the bus 107.
  • The communication interface 106 is connected to the network 10.
  • The communication interface 106 transmits and receives data to and from other computers via the network 10.
  • With the hardware configuration described above, the processing functions of the present embodiment can be realized.
  • Although FIG. 4 shows the hardware configuration of the job management server 100 as an example, the nodes 21 to 26 can also be realized with the same hardware configuration.
  • Each of the nodes 21 to 26 has two CPUs.
  • FIG. 5 is a block diagram illustrating functions of the job management server.
  • The job management server 100 includes a communication unit 111, a job input unit 112, a queue buffer 113, a reserved node number management unit 114, a free node number management unit 115, an execution job remaining time management unit 116, and a job allocation unit 120.
  • The communication unit 111 communicates with the nodes 21 to 26 to collect various management information, transmit job processing requests for assigned jobs, and the like.
  • The job input unit 112 inputs job processing requests to the queue buffer 113.
  • The job input unit 112 inputs a job processing request for a batch job to the queue buffer 113 at a preset time. Further, when a job processing request is issued by an operation input from the user, the job input unit 112 inputs the job processing request for the corresponding job to the queue buffer 113.
  • The queue buffer 113 queues the input job processing requests in the order of input.
  • The queue buffer 113 passes the job processing requests to the job allocation unit 120 in the order of their input times.
  • The reserved node number management unit 114 calculates the number of reserved nodes based on the parallel jobs registered in the queue buffer 113.
  • The number of reserved nodes for the parallel jobs in the queue buffer 113 is calculated as follows. The calculation is divided into a selection process, which selects the parallel jobs to be counted, and a parallel number calculation process based on the selected parallel jobs.
  • In the selection process, consecutive parallel jobs, counted from the head of the queue, are selected as calculation targets. For example, if a sequential job follows a parallel job, the parallel jobs after that sequential job are excluded from selection.
  • In the parallel number calculation process, the parallel numbers of the selected parallel jobs are totaled in queued order. Parallel jobs are selected until the total of the parallel numbers exceeds a preset maximum number, and the parallel jobs after the maximum number is exceeded are excluded.
  • the maximum number is
  • the value is half the number of nodes in the system (50 for a 100-node system)
  • the maximum number of parallel jobs that exist is the number of reserved nodes (for example, 4 for 2 parallels, 2 for 4 parallels, or 8 for 8 parallels).
  • the reserved node number management unit 114 passes the reserved node number calculated in this way to the job allocation unit 120.
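  • The two-stage calculation of the number of reserved nodes can be sketched as below. The text leaves several details open, so this Python sketch rests on explicit assumptions: the function name `reserved_nodes` and the integer encoding of the queue are hypothetical, sequential jobs ahead of the first parallel job are skipped, the maximum number is taken to be half the number of nodes in the system, and a parallel job whose parallel number would push the total past that maximum is not counted.

```python
def reserved_nodes(queue, total_nodes):
    """Estimate the number of reserved nodes from the queued jobs.

    queue is a list of parallel numbers in queued order (1 = sequential job).
    """
    max_reserved = total_nodes // 2  # assumed maximum: half the nodes in the system
    total = 0
    in_run = False  # True once the first parallel job has been seen
    for parallel_number in queue:
        if parallel_number > 1:
            in_run = True
            if total + parallel_number > max_reserved:
                break  # assumption: the job that would exceed the maximum is excluded
            total += parallel_number
        elif in_run:
            break  # a sequential job ends the run of consecutive parallel jobs
    return total
```

  • For example, `reserved_nodes([2, 2, 1, 4], 100)` counts only the two leading parallel jobs and returns 4, while `reserved_nodes([4, 4, 4], 20)` stops at 8 because a third job would exceed the assumed maximum of 10.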
  • the free node number management unit 115 acquires status information indicating whether or not the nodes 21 to 26 are processing jobs from the nodes 21 to 26 via the communication unit 111. Then, the free node number management unit 115 passes the number of free nodes to the job assignment unit 120.
  • The execution job remaining time management unit 116 calculates, via the communication unit 111, the remaining execution time of the job being processed on each of the nodes 21 to 26. For example, the maximum processing time of a job may be specified in advance, with the program that executes the job designed so that the job always finishes within that maximum processing time. In this case, the execution job remaining time management unit 116 takes the value obtained by subtracting the current execution time of each job from its maximum processing time as the remaining time of that job. The execution job remaining time management unit 116 passes the remaining time of the job being executed on each of the nodes 21 to 26 to the job allocation unit 120.
  • The job allocation unit 120 distributes the job processing requests queued in the queue buffer 113 to the nodes 21 to 26 based on the number of reserved nodes, the number of free nodes, and the remaining time of the job being executed on each of the nodes 21 to 26.
  • FIG. 6 is a block diagram illustrating functions of the job assignment unit.
  • The job allocation unit 120 includes a job queue acquisition unit 121, a job type determination unit 122, a parallel job allocation unit 123, a job allocation method selection unit 124, a stuffing method job allocation unit 125, a distributed arrangement method job allocation unit 126, and a job processing request transmission unit 127.
  • the job queue acquisition unit 121 acquires the first job queue from the queue buffer 113 at the job distribution timing.
  • the job distribution timing is when a job queue is queued in the empty queue buffer 113 or when any of the jobs being executed in the nodes 21 to 26 is completed.
  • the job queue acquisition unit 121 passes the acquired job queue to the job type determination unit 122.
  • The job type determination unit 122 analyzes the content of the job processing request indicated by the job queue and determines whether the job is a sequential job or a parallel job. If the job is a parallel job, the job type determination unit 122 also determines its parallel number. In the case of a parallel job, the job type determination unit 122 passes the job processing request indicated by the job queue to the parallel job allocation unit 123 and notifies it of the parallel number. In the case of a sequential job, the job type determination unit 122 passes the job processing request indicated by the job queue to the job allocation method selection unit 124.
  • The parallel job allocation unit 123 selects as many free nodes as the parallel number of the parallel job and determines them as the allocation destination.
  • The job allocation method selection unit 124 selects either the stuffing method or the distributed arrangement method as the job allocation method, based on the job queues stored in the queue buffer 113 and the job processing status of the nodes 21 to 26. Specifically, if the number of free nodes is larger than the number of reserved nodes, the job allocation method selection unit 124 selects the distributed arrangement method as the allocation method. If the number of free nodes is equal to or less than the number of reserved nodes, the job allocation method selection unit 124 selects the stuffing method as the allocation method.
  • When the stuffing method is selected, the job allocation method selection unit 124 passes the job processing request indicated by the acquired job queue to the stuffing method job allocation unit 125.
  • When the distributed arrangement method is selected, the job allocation method selection unit 124 passes the job processing request indicated by the acquired job queue to the distributed arrangement method job allocation unit 126.
  • Upon receiving the job processing request, the stuffing method job allocation unit 125 determines the job allocation destination by the stuffing method based on the current job processing status of the nodes 21 to 26. Specifically, the stuffing method job allocation unit 125 detects the nodes that are executing a job and can execute an additional job. The stuffing method job allocation unit 125 then acquires the remaining job execution time of each detected node from the execution job remaining time management unit 116, and determines the node with the shortest remaining time as the allocation destination. If none of the nodes that are executing a job can execute an additional job, the stuffing method job allocation unit 125 determines a free node as the allocation destination. Then, the stuffing method job allocation unit 125 notifies the job processing request transmission unit 127 of the determined allocation destination.
  • Upon receiving the job processing request, the distributed arrangement method job allocation unit 126 determines the job allocation destination by the distributed arrangement method based on the current job processing status of the nodes 21 to 26. Specifically, the distributed arrangement method job allocation unit 126 detects the free nodes and determines one node from among them as the allocation destination. Then, the distributed arrangement method job allocation unit 126 notifies the job processing request transmission unit 127 of the determined allocation destination.
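  • The destination choices of the two allocation units can be sketched as follows. This Python sketch is an illustration only: the `Node` class and the function names are hypothetical, each node is assumed to report a single remaining time for its running work, and the distributed arrangement method is assumed to take the first free node, since the text only says that one of the free nodes is chosen.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    running: int = 0             # number of jobs currently executing
    remaining_time: float = 0.0  # remaining time of the executing job
    cpus: int = 2


def job_remaining_time(max_processing_time, elapsed_time):
    """Remaining time = maximum processing time minus time executed so far."""
    return max_processing_time - elapsed_time


def pick_stuffing(nodes):
    """Busy node with spare capacity and the shortest remaining time, else a free node."""
    candidates = [n for n in nodes if 0 < n.running < n.cpus]
    if candidates:
        return min(candidates, key=lambda n: n.remaining_time)
    free = [n for n in nodes if n.running == 0]
    return free[0] if free else None


def pick_distributed(nodes):
    """One of the free nodes (the first one, by assumption)."""
    free = [n for n in nodes if n.running == 0]
    return free[0] if free else None


nodes = [
    Node("21", running=1, remaining_time=job_remaining_time(60.0, 30.0)),
    Node("22", running=1, remaining_time=job_remaining_time(60.0, 50.0)),
    Node("23", running=2, remaining_time=5.0),  # both CPUs busy, not a candidate
    Node("24"),
]
```

  • Here `pick_stuffing(nodes)` returns node 22, the busy node with spare capacity and the least remaining time, while `pick_distributed(nodes)` returns the free node 24. Both functions return `None` when no suitable node exists, a case the text does not describe.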
  • the job processing request transmission unit 127 transmits a job processing request to the determined assignment destination node.
  • FIG. 7 is a flowchart showing a procedure of job assignment processing. In the following, the process shown in FIG. 7 will be described in order of step number.
  • Step S 11 The job queue acquisition unit 121 acquires the first job queue from the queue buffer 113.
  • Step S12 The job type determination unit 122 analyzes the content of the job processing request indicated by the acquired job queue, and determines whether the job is a parallel job or a sequential job. If it is a parallel job, the process proceeds to step S13. If the job is a sequential job, the process proceeds to step S14.
  • the parallel job allocation unit 123 selects the same number of free nodes as the number of parallel jobs for which a processing request has been issued, and determines it as an allocation destination. After that, the processing power is advanced S22.
  • the job allocation method selection unit 124 obtains the number of free nodes among the nodes 21 to 26 from the free node number management unit 115.
  • Step S15 The job allocation method selection unit 124 determines whether or not there is a free node among the nodes 21 to 26. If there is a free node, the process proceeds to step S16. If there is no free node, the job allocation method selection unit 124 selects the stuffing method as the job allocation method and advances the process to step S20.
  • Step S16 The job allocation method selection unit 124 acquires, from the reserved node number management unit 114, whether a parallel job is queued and the number of reserved nodes.
  • Step S17 The job allocation method selection unit 124 determines whether or not a job queue for a parallel job is queued in the queue buffer 113. If there is a parallel job, the process proceeds to step S18. If there is no parallel job, the job allocation method selection unit 124 selects the distributed placement method as the job allocation method and advances the process to step S19.
  • Step S18 The job allocation method selection unit 124 determines whether or not the number of free nodes exceeds the number of reserved nodes. If the number of free nodes exceeds the number of reserved nodes, the job allocation method selection unit 124 selects the distributed placement method as the job allocation method and advances the process to step S19. If the number of free nodes is equal to or less than the number of reserved nodes, the job allocation method selection unit 124 selects the stuffing method as the job allocation method and advances the process to step S20.
  • Step S19 The distributed placement method job allocation unit 126 allocates the job to a free node. Thereafter, the process proceeds to step S22.
  • Step S20 The stuffing method job allocation unit 125 acquires the remaining time of the job being executed on each node from the execution job remaining time management unit 116.
  • Step S21 The stuffing method job allocation unit 125 selects the node with the shortest remaining time and determines it as the job allocation destination.
  • Step S22 The job processing request transmission unit 127 transmits a job processing request to the node determined as the allocation destination by the parallel job allocation unit 123, the stuffing method job allocation unit 125, or the distributed placement method job allocation unit 126.
  • In this way, the stuffing method and the distributed placement method are switched according to the number of reserved nodes and the number of free nodes. This lets jobs drain from the queue buffer 113 more quickly and improves overall system throughput.
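  • The method selection in steps S15 to S18 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the function name `select_allocation_method` and the string labels are hypothetical, and the three arguments stand in for the values that the description obtains from the free node number management unit 115 and the reserved node number management unit 114.

```python
def select_allocation_method(free_nodes: int,
                             reserved_nodes: int,
                             parallel_job_queued: bool) -> str:
    """Return "stuffing" or "distributed" for a sequential job (steps S15 to S18)."""
    # Step S15: with no free node, only the stuffing method is possible.
    if free_nodes == 0:
        return "stuffing"
    # Step S17: with no parallel job waiting, nothing needs to be held back.
    if not parallel_job_queued:
        return "distributed"
    # Step S18: spread jobs out only while free nodes exceed the reservation.
    if free_nodes > reserved_nodes:
        return "distributed"
    return "stuffing"
```

  • Note that the boundary case, where the number of free nodes equals the number of reserved nodes, falls on the stuffing side, as step S18 specifies.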
  • FIG. 8 is a diagram illustrating an example of a queued job.
  • Job queues 41 to 48 are queued in the queue buffer 113.
  • The numerical values (#1 to #8) on the job queues 41 to 48 indicate the queuing order.
  • The first two job queues 41 and 42 indicate sequential job processing requests.
  • The next job queue 43 indicates a processing request for a parallel job with a parallel number of five.
  • The following four job queues 44 to 47 indicate sequential job processing requests.
  • The last job queue 48 indicates a processing request for a parallel job with a parallel number of two.
  • First, the job queue 41 for a sequential job is assigned.
  • At this point, the number of reserved nodes is "7" (the parallel number "5" of the parallel job indicated by the job queue 43 plus the parallel number "2" of the parallel job indicated by the job queue 48). Since the number of free nodes is "6", "number of reserved nodes ≥ number of free nodes" is satisfied, and the stuffing method is adopted.
  • The sequential job indicated by the job queue 41 may be assigned to any of the nodes 21 to 26. In this example, it is assumed that it is assigned to the node 21.
  • Next, the job queue 42 for a sequential job is assigned.
  • At this point, the number of reserved nodes is "7" and the number of free nodes is "5". Therefore, "number of reserved nodes ≥ number of free nodes" is satisfied, and the stuffing method is adopted.
  • In the stuffing method, if an additional job can be assigned to a node that is already executing a job, the job is assigned to that node. Accordingly, the job queue 42 is assigned to the node 21.
  • Next, the parallel job indicated by the job queue 43 is assigned. Since this parallel job must run on as many free nodes as its parallel number, it is assigned only to free nodes. Therefore, it is assigned to the nodes 22 to 26.
  • FIG. 9 is a diagram showing the state after the third job has been assigned.
  • On the node 21, the sequential job 41a indicated by the job queue 41 and the sequential job 42a indicated by the job queue 42 are executed.
  • On the nodes 22 to 26, the parallel job 43a indicated by the job queue 43 is executed.
  • Because the parallel job 43a is being executed, there is no free node.
  • The node 21 is already executing as many jobs as it has CPUs, so no additional job can be assigned to it. Therefore, the job queue 44 waits until the processing of one of the jobs is completed.
  • FIG. 10 is a diagram illustrating the state after processing of the parallel job is completed. As shown in FIG. 10, when processing of the parallel job 43a is completed, the nodes 22 to 26 become free. Assignment of the job queue 44 and the subsequent job queues therefore starts.
  • The number of reserved nodes when the job queue 44 for a sequential job is assigned is "2".
  • Since the number of free nodes is "5", the number of reserved nodes is less than the number of free nodes. Therefore, the job queues 44 to 46 for sequential jobs are allocated by the distributed placement method. As a result, it is assumed that the job queue 44 is assigned to the node 22, the job queue 45 to the node 23, and the job queue 46 to the node 24.
  • FIG. 11 is a diagram showing the state after jobs up to the sixth have been assigned.
  • On the node 22, the sequential job 44a indicated by the job queue 44 is executed.
  • On the node 23, the sequential job 45a indicated by the job queue 45 is executed.
  • On the node 24, the sequential job 46a indicated by the job queue 46 is executed. At this point, there are two free nodes.
  • Since the number of reserved nodes is now equal to the number of free nodes, the next sequential job is assigned by the stuffing method. That is, the job queue 47 is assigned to whichever of the nodes 22 to 24 has the shortest remaining job execution time. In this example, it is assumed that the job queue 47 is assigned to the node 22.
  • Finally, the job queue 48 for a parallel job is assigned. At this point, two free nodes have been kept in reserve, so the job queue 48 is assigned to the nodes 25 and 26.
  • FIG. 12 shows the state in which all jobs have been assigned.
  • On the node 22, the sequential job 47a indicated by the job queue 47 is executed in addition to the job already running there.
  • On the nodes 25 and 26, the parallel job 48a indicated by the job queue 48 is executed.
  • In this example, all of the queued jobs are thus being executed.
  • As a result, the overall system can be operated with high efficiency.
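  • The walkthrough of the eight queued jobs can be replayed with a small simulation. This is only a sketch under assumptions that the description does not state: the class and function names are hypothetical, job times are invented so that the node 22 has the shortest remaining time, free nodes are taken in node order, a node running part of a parallel job accepts no other job, and a sequential job falls back to a free node when the stuffing method is chosen but no node is running anything yet.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Job:
    name: str
    parallel: int = 1   # 1 = sequential job; otherwise the parallel number
    time: float = 1.0   # assumed remaining execution time once started


@dataclass
class Node:
    name: str
    cpus: int = 2
    jobs: Dict[str, float] = field(default_factory=dict)  # job name -> remaining time
    exclusive: bool = False  # True while the node runs part of a parallel job

    def is_free(self) -> bool:
        return not self.jobs

    def can_stuff(self) -> bool:
        return bool(self.jobs) and not self.exclusive and len(self.jobs) < self.cpus


def choose(job: Job, nodes: List[Node], waiting: List[Job]) -> Optional[List[Node]]:
    """Pick destination nodes for `job`, or None if it has to wait."""
    free = [n for n in nodes if n.is_free()]
    if job.parallel > 1:  # a parallel job needs that many free nodes
        return free[:job.parallel] if len(free) >= job.parallel else None
    reserved = sum(j.parallel for j in waiting if j.parallel > 1)
    if free and (reserved == 0 or len(free) > reserved):
        return [free[0]]  # distributed placement
    stuffable = [n for n in nodes if n.can_stuff()]
    if stuffable:  # stuffing: node with the shortest remaining time
        return [min(stuffable, key=lambda n: min(n.jobs.values()))]
    return [free[0]] if free else None  # assumed fallback, see lead-in


def dispatch(queue: List[Job], nodes: List[Node], placement: Dict[str, List[str]]) -> None:
    """Assign jobs in queue order until the head of the queue has to wait."""
    while queue:
        targets = choose(queue[0], nodes, queue[1:])
        if targets is None:
            return
        job = queue.pop(0)
        for n in targets:
            n.jobs[job.name] = job.time
            n.exclusive = job.parallel > 1
        placement[job.name] = [n.name for n in targets]


def finish(job_name: str, nodes: List[Node]) -> None:
    """Remove a completed job from every node that was running it."""
    for n in nodes:
        if n.jobs.pop(job_name, None) is not None and not n.jobs:
            n.exclusive = False


nodes = [Node(str(i)) for i in range(21, 27)]
queue = [Job("41", time=100), Job("42", time=100), Job("43", parallel=5, time=50),
         Job("44", time=10), Job("45", time=20), Job("46", time=30),
         Job("47"), Job("48", parallel=2)]
placement: Dict[str, List[str]] = {}
dispatch(queue, nodes, placement)  # assigns 41, 42 and 43; job 44 has to wait
finish("43", nodes)                # the parallel job 43 completes
dispatch(queue, nodes, placement)  # assigns 44 to 48
for name, where in placement.items():
    print(name, where)
```

  • Under these assumptions the placements match the example: jobs 41 and 42 share the node 21, job 43 occupies the nodes 22 to 26, jobs 44 to 46 go to the nodes 22 to 24, job 47 is stuffed onto the node 22, and job 48 takes the nodes 25 and 26.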
  • FIG. 13 is a diagram showing an example in which jobs are assigned only by the stuffing method.
  • In this case, the state up to the assignment of the third job is the same as in the preceding example.
  • After that, however, the node 26 is left as a free node, as the figure shows. In other words, the operating rate of the entire system is poor.
  • In addition, sequential jobs are executed two at a time on one node, so execution performance is degraded by resource contention.
  • FIG. 14 is a diagram showing an example in which jobs are assigned only by the distributed arrangement method.
  • If every job from the fourth sequential job 44a onward is assigned by the distributed placement method, only one free node remains when the parallel job 48a, whose parallel number is two, is to be assigned. Therefore, the job queue 48 for the parallel job waits for execution until the number of free nodes increases. This delays the dispatch of parallel jobs. In addition, since the node 26 is left free, the operating rate of the entire system remains poor.
  • In the above embodiment, the job allocation method may also be selected by adding a predetermined correction value α to the number of reserved nodes and comparing the sum with the number of free nodes.
  • In this case, the number of reserved nodes, based on the parallel numbers of the parallel jobs in the job queue, and the number of free nodes are collected. If the number of free nodes is greater than the number of reserved nodes + α, the distributed placement method is selected. If it is less than the number of reserved nodes + α, the stuffing method is selected.
  • The correction value α represents the number of nodes reserved for the most recently submitted parallel job. It is, for example, specified by the system administrator, or the job assignment unit 120 calculates the average parallel number of the parallel jobs submitted in the past and uses the result as the correction value.
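  • The comparison with a correction value can be sketched as below. The function names are hypothetical. Rounding the historical average up to a whole node is an illustrative choice, and the case where the number of free nodes exactly equals the number of reserved nodes plus the correction value, which the text leaves open, is treated as stuffing here by analogy with step S18.

```python
import math
from typing import Sequence


def correction_from_history(past_parallel_numbers: Sequence[int]) -> int:
    """Average parallel number of past parallel jobs, rounded up (an assumption)."""
    if not past_parallel_numbers:
        return 0
    return math.ceil(sum(past_parallel_numbers) / len(past_parallel_numbers))


def select_with_correction(free_nodes: int, reserved_nodes: int, correction: int) -> str:
    """Distributed placement only while free nodes exceed reserved nodes + correction."""
    if free_nodes > reserved_nodes + correction:
        return "distributed"
    return "stuffing"  # equality is treated as stuffing (assumption)


# Assumed history of three parallel jobs with parallel numbers 2, 4 and 3.
correction = correction_from_history([2, 4, 3])
print(correction, select_with_correction(free_nodes=5, reserved_nodes=2, correction=correction))
```

  • A larger correction value keeps more nodes in reserve and therefore switches to the stuffing method earlier.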
  • The number of reserved nodes can also be fixed to a preset value.
  • For example, "number of parallel jobs that can be submitted" × "parallel number" is set as the number of reserved nodes.
  • The number of parallel jobs that can be submitted is a value set arbitrarily by the system administrator in advance.
  • The parallel number may be a value specified by the system administrator in advance, or the average parallel number in the past.
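  • As a worked instance of this fixed setting, the product can be computed as below. The function name and the sample values are assumptions chosen for illustration.

```python
def fixed_reserved_nodes(max_parallel_jobs: int, parallel_number: int) -> int:
    """Reserved nodes = number of parallel jobs that can be submitted x parallel number."""
    if max_parallel_jobs < 0 or parallel_number < 0:
        raise ValueError("both values must be non-negative")
    return max_parallel_jobs * parallel_number


# Assumed values: room for 2 parallel jobs, each spanning 3 nodes.
print(fixed_reserved_nodes(2, 3))
```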
  • Alternatively, if the number of sequential jobs being executed and the number of sequential jobs queued in the queue buffer are together smaller than the number of nodes that preferentially execute sequential jobs (the number of sequential-job nodes), jobs may be placed by the distributed placement method; otherwise, jobs may be placed by the stuffing method.
  • The number of sequential-job nodes can be, for example, a value specified in advance by the system administrator, or the average number of sequential jobs per unit time in the past.
  • When a sequential job is assigned by the stuffing method, it may instead be assigned to the node whose running job has been executing the longest among the nodes that already have jobs assigned. For example, if the maximum processing time was not specified when the job program was designed, the remaining time of the job being processed cannot be calculated. In this case, the sequential job is assigned to the node on which the longest time has elapsed since job processing started.
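  • This fallback can be sketched as follows. The class and function names are hypothetical, and the rule that elapsed time is used for all candidates as soon as any remaining time is unknown is an assumption, since the text only covers the case where the remaining time cannot be calculated.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RunningNode:
    name: str
    elapsed: float                     # time since the running job started
    remaining: Optional[float] = None  # None when no maximum processing time is known


def pick_stuffing_target(nodes: List[RunningNode]) -> RunningNode:
    """Choose the node that receives an additional sequential job."""
    if not nodes:
        raise ValueError("no candidate nodes")
    if all(n.remaining is not None for n in nodes):
        # Remaining times are known: take the node that will finish soonest.
        return min(nodes, key=lambda n: n.remaining)
    # Otherwise fall back to the node whose job has been running longest.
    return max(nodes, key=lambda n: n.elapsed)


known = [RunningNode("22", elapsed=5, remaining=30), RunningNode("23", elapsed=50, remaining=10)]
unknown = [RunningNode("22", elapsed=5), RunningNode("23", elapsed=50, remaining=10),
           RunningNode("24", elapsed=80)]
print(pick_stuffing_target(known).name, pick_stuffing_target(unknown).name)
```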
  • The above processing functions can be realized by a computer.
  • In that case, a program describing the processing contents of the functions that the job management apparatus should have is provided.
  • The program describing the processing contents can be recorded on a computer-readable recording medium.
  • Examples of the computer-readable recording medium include a magnetic recording device, an optical disk, a magneto-optical recording medium, and a semiconductor memory.
  • Magnetic recording devices include hard disk drives (HDD), flexible disks (FD), and magnetic tapes.
  • Optical discs include DVD (Digital Versatile Disc), DVD-RAM (Random Access Memory), CD-ROM (Compact Disc Read Only Memory), and CD-R (Recordable)/RW (ReWritable).
  • Magneto-optical recording media include MO (Magneto-Optical disk).
  • To distribute the program, for example, portable recording media such as DVDs or CD-ROMs on which the program is recorded are sold. The program can also be stored in a storage device of a server computer and transferred from the server computer to other computers via a network.
  • A computer that executes the program stores, for example, the program recorded on the portable recording medium or the program transferred from the server computer in its own storage device. The computer then reads the program from its own storage device and executes processing according to the program. The computer can also read the program directly from the portable recording medium and execute processing according to it. The computer can also execute processing according to the received program each time a program is transferred from the server computer.

Abstract

Jobs can be allocated efficiently even when the ratio of sequential jobs to parallel jobs changes. When a job queue is a parallel job processing request, parallel job allocation means (1f) determines, as the allocation destination, free nodes corresponding to the parallel number of the parallel job. When a job queue is a sequential job processing request, allocation method selection means (1g) judges whether the free nodes are sufficient based on the number of free nodes and the number of reserved nodes. If the free nodes are insufficient, a node that is already executing another job is preferentially determined as the allocation destination of the job acquired by the job queue acquisition means (1d). If the free nodes are sufficient, a node that is not executing a job is preferentially determined as the allocation destination of the job acquired by the job queue acquisition means (1d).

Description

Specification
Job management apparatus and job management program
Technical field
[0001] The present invention relates to a job management apparatus and a job management program that assign a queued job to one of a plurality of nodes, and more particularly to a job management apparatus and a job management program that assign sequential jobs and parallel jobs to nodes.
Background art
[0002] To have a plurality of computers (nodes) process jobs (units of processing execution) in a distributed manner, the jobs must be assigned efficiently. For this purpose, a server or the like is used to assign jobs based on, for example, the load state of each node (see, for example, Patent Documents 1 and 2).
[0003] However, the techniques disclosed in Patent Documents 1 and 2 do not consider job assignment when sequential jobs and parallel jobs are mixed. Consequently, a system that processes both sequential jobs and parallel jobs cannot raise job execution efficiency sufficiently. Here, a sequential job is a job executed on one node, and a parallel job is a job executed in parallel on a plurality of nodes.
[0004] Assigning sequential jobs and assigning parallel jobs to nodes differ greatly in whether a plurality of jobs can be executed on one node. If one node has a plurality of CPUs (Central Processing Units), a plurality of jobs can be executed simultaneously on that node. However, when a plurality of jobs are executed on one node, contention for the resources the jobs use (memory and the like) lowers execution performance compared with running a job alone.
[0005] A sequential job tolerates a wide range of degradation in execution performance, so a plurality of sequential jobs can be executed on one node. In a parallel job, on the other hand, information is passed between the processes executed in parallel. If execution performance drops because a plurality of jobs run on one node, and the timing of passing information to another node is delayed as a result, the other node has to wait and its processing is delayed as well.
[0006] To eliminate the waiting time for obtaining information from other nodes when executing a parallel job, the execution performance of the nodes should be uniform. Therefore, when a parallel job is executed, only one job is executed on each node.
[0007] Accordingly, when a parallel job is assigned to nodes, the destination nodes must be free nodes (nodes on which the number of running jobs is "0"). In other words, assigning a parallel job requires as many free nodes as the number of processes executed in parallel (the parallel number). If there are not enough free nodes, the parallel job waits for execution. Thus, the conditions for assigning a job differ between sequential jobs and parallel jobs.
[0008] Conventionally, a system that executes a mix of sequential jobs and parallel jobs has adopted either the stuffing method or the distributed placement method as its allocation method. The stuffing method gives priority to the execution of parallel jobs by assigning a plurality of sequential jobs to one node, up to the number of processors the node has. The distributed placement method assigns sequential jobs one per node as well, in order to avoid resource contention on a node as much as possible.
[0009] FIG. 15 is a diagram showing job assignment by the stuffing method. As shown in FIG. 15, six nodes 921 to 926 are connected via a network 910. Each of the nodes 921 to 926 is assumed to have two processors. A job management apparatus 931 assigns jobs to the nodes 921 to 926 by the stuffing method.
[0010] In this example, four sequential jobs 941 to 944 arrive in succession, followed by a parallel job 945. The parallel number of the parallel job 945 (the number of nodes on which it is processed in parallel) is "3". In FIG. 15, the order in which the jobs were queued is shown inside each job.
[0011] In the stuffing method, jobs are assigned to the nodes in a predetermined node order, each node receiving as many jobs as it can execute. The first and second sequential jobs 941 and 942 are assigned to the node 921. Next, the third and fourth sequential jobs 943 and 944 are assigned to the node 922. The fifth job, the parallel job 945, is assigned to the nodes 923 to 925.
[0012] FIG. 16 is a diagram showing job assignment by the distributed placement method. In this example, the same jobs as in FIG. 15 are assigned to the nodes 921 to 926 by a job management apparatus 932 using the distributed placement method.
[0013] In the distributed placement method, arriving jobs are distributed so that the load on the nodes is equalized. Specifically, the first to fourth sequential jobs 941 to 944 are assigned to the nodes 921 to 924, respectively. At this point, only two nodes, the node 925 and the node 926, are free. Therefore, the parallel job 945, whose parallel number is "3", waits until the number of free nodes reaches "3".
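The two conventional methods can be compared on the same five jobs with a short sketch. It is an illustration only: the function name `place_jobs` is hypothetical, node order is used to break ties, a node held by a parallel job is assumed to accept no other job, and a job that cannot be placed is assumed to block the jobs queued behind it.

```python
from typing import Dict, List, Optional, Tuple


def place_jobs(jobs: List[Tuple[str, int]], method: str,
               node_names: List[str], cpus: int = 2) -> Dict[str, Optional[List[str]]]:
    """Assign jobs in queue order with one fixed method ("stuffing" or "distributed").

    Each job is (name, parallel number); a parallel number of 1 means a sequential job.
    A value of None in the result means the job has to wait.
    """
    load = {n: 0 for n in node_names}            # running jobs per node
    exclusive = {n: False for n in node_names}   # node is held by a parallel job
    result: Dict[str, Optional[List[str]]] = {}
    for name, parallel in jobs:
        if parallel > 1:
            free = [n for n in node_names if load[n] == 0]
            chosen = free[:parallel] if len(free) >= parallel else None
        else:
            spare = [n for n in node_names if not exclusive[n] and load[n] < cpus]
            if not spare:
                chosen = None
            elif method == "stuffing":
                chosen = [spare[0]]                           # fill nodes in order
            else:
                chosen = [min(spare, key=lambda n: load[n])]  # least-loaded node
        result[name] = chosen
        if chosen is None:
            break  # later jobs queue up behind this one
        for n in chosen:
            load[n] += 1
            exclusive[n] = parallel > 1
    return result


node_names = ["921", "922", "923", "924", "925", "926"]
jobs = [("941", 1), ("942", 1), ("943", 1), ("944", 1), ("945", 3)]
stuffed = place_jobs(jobs, "stuffing", node_names)
spread = place_jobs(jobs, "distributed", node_names)
print(stuffed)
print(spread)
```

With the stuffing method the parallel job 945 starts at once on the nodes 923 to 925, while with the distributed placement method it is left waiting because only two nodes remain free.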
Patent Document 1: Japanese Patent Laid-Open No. 8-305671
Patent Document 2: Japanese Patent Laid-Open No. 2000-315199
Disclosure of the invention
Problems to be solved by the invention
[0014] Conventionally, however, the job allocation method was selected only after investigating job trends in advance and deciding an operation policy, so the throughput of the entire system drops when the trend of the submitted jobs changes.
[0015] For example, when there are many parallel jobs, the stuffing method, which gives priority to parallel jobs, is adopted. With the stuffing method, however, when parallel jobs become fewer, the number of free nodes increases and the operating rate of the entire system drops. In addition, one node more often executes a plurality of sequential jobs, so the execution performance of sequential jobs also drops.
[0016] When there are many sequential jobs, the distributed placement method, which avoids resource contention, is adopted. With the distributed placement method, however, when parallel jobs become more numerous, free nodes run short and parallel jobs wait a long time for execution.
[0017] The present invention has been made in view of these points, and an object thereof is to provide a job management apparatus and a job management program capable of efficient job assignment even when the ratio of sequential jobs to parallel jobs changes.
Means for solving the problem
[0018] To solve the above problem, the present invention provides a job management apparatus 1 as shown in FIG. 1. The job management apparatus 1 according to the present invention has the following functions in order to assign, to nodes, sequential jobs each executed on one of the nodes 2a, 2b, ... and parallel jobs each executed by parallel processing on a plurality of nodes.
[0019] When a job queue indicating a job processing request is input, the queue buffer 1a stores the job queues in the order in which they were input. The reserved node number determination means 1b determines, based on the job queues stored in the queue buffer 1a that indicate parallel job processing requests, the number of reserved nodes, that is, the number of nodes to be secured in advance for processing parallel jobs. The free node number calculation means 1c monitors whether each of the plurality of nodes is executing a job and calculates the number of free nodes, that is, the number of nodes not executing any job. The job queue acquisition means 1d acquires the job queues stored in the queue buffer in the order in which they were stored. The job type determination means 1e determines whether the job queue acquired by the job queue acquisition means 1d is a parallel job processing request or a sequential job processing request. When the acquired job queue is a parallel job processing request, the parallel job allocation means 1f determines, as the allocation destination, free nodes corresponding to the parallel number of the parallel job. When the acquired job queue is a sequential job processing request, the allocation method selection means 1g judges from the number of free nodes and the number of reserved nodes whether the free nodes are sufficient; it selects the stuffing method as the job allocation method if the free nodes are insufficient, and the distributed placement method if they are sufficient. When the stuffing method is selected, the stuffing method job allocation means 1h preferentially determines a node that is already executing another job as the allocation destination of the job acquired by the job queue acquisition means 1d. When the distributed placement method is selected, the distributed placement method job allocation means 1i preferentially determines a node that is not executing a job as the allocation destination of the job acquired by the job queue acquisition means 1d. The processing request transmission means 1j transmits, to the node determined as the allocation destination, the processing request for the job indicated by the job queue acquired by the job queue acquisition means 1d.
With such a job management apparatus, when a job queue indicating a job processing request is input, the job queues are stored in the queue buffer 1a in the order of input. The reserved node number determination means 1b determines the number of reserved nodes, that is, the number of nodes to be secured in advance for processing parallel jobs. The free node number calculation means 1c monitors whether each of the plurality of nodes is executing a job and calculates the number of free nodes, that is, the number of nodes not executing any job. The job queue acquisition means 1d then acquires the job queues stored in the queue buffer in the order in which they were stored, and the job type determination means 1e determines whether the acquired job queue is a parallel job processing request or a sequential job processing request. When the acquired job queue is a parallel job processing request, the parallel job allocation means 1f determines free nodes corresponding to the parallel number of the parallel job as the allocation destination. When the acquired job queue is a sequential job processing request, the allocation method selection means 1g judges from the number of free nodes and the number of reserved nodes whether the free nodes are sufficient; the stuffing method is selected as the job allocation method if the free nodes are insufficient, and the distributed placement method is selected if they are sufficient. When the stuffing method is selected, the stuffing method job allocation means 1h preferentially determines a node that is already executing another job as the allocation destination of the job acquired by the job queue acquisition means 1d. When the distributed placement method is selected, the distributed placement method job allocation means 1i preferentially determines a node that is not executing a job as the allocation destination of the job acquired by the job queue acquisition means 1d. The processing request transmission means 1j then transmits, to the node determined as the allocation destination, the processing request for the job indicated by the job queue acquired by the job queue acquisition means 1d.
To solve the above problem, there is also provided a job management program for assigning, to nodes, sequential jobs each executed on one node and parallel jobs each executed by parallel processing on a plurality of nodes, the program causing a computer to function as: a queue buffer that, when a job queue indicating a job processing request is input, stores the job queues in the order of input; reserved node number determination means that determines, based on the job queues stored in the queue buffer that indicate parallel job processing requests, the number of reserved nodes, that is, the number of nodes to be secured in advance for processing parallel jobs; free node number calculation means that monitors whether each of the plurality of nodes is executing a job and calculates the number of free nodes, that is, the number of nodes not executing any job; job queue acquisition means that acquires the job queues stored in the queue buffer in the order in which they were stored; job type determination means that determines whether the job queue acquired by the job queue acquisition means is a parallel job processing request or a sequential job processing request; parallel job allocation means that, when the acquired job queue is a parallel job processing request, determines free nodes corresponding to the parallel number of the parallel job as the allocation destination; allocation method selection means that, when the acquired job queue is a sequential job processing request, judges from the number of free nodes and the number of reserved nodes whether the free nodes are sufficient, selects the stuffing method as the job allocation method if the free nodes are insufficient, and selects the distributed placement method if they are sufficient; stuffing method job allocation means that, when the stuffing method is selected, preferentially determines a node that is already executing another job as the allocation destination of the job acquired by the job queue acquisition means; distributed placement method job allocation means that, when the distributed placement method is selected, preferentially determines a node that is not executing a job as the allocation destination of the job acquired by the job queue acquisition means; and processing request transmission means that transmits, to the node determined as the allocation destination, the processing request for the job indicated by the job queue acquired by the job queue acquisition means.
[0022] By causing a computer to execute such a job management program, the computer realizes the same functions as the job management device described above.
Effects of the Invention
[0023] In the present invention, the number of reserved nodes is determined based on the job queues in the queue buffer that indicate parallel job processing requests, and whether the number of free nodes is sufficient is judged from the number of free nodes and the number of reserved nodes. If the number of free nodes is insufficient, sequential jobs are assigned to nodes by the stuffing method; if it is sufficient, sequential jobs are assigned to nodes by the distributed placement method. Consequently, when there are few parallel job processing requests, the distributed placement method, which favors sequential jobs, is used, and when parallel job processing requests increase, the stuffing method, which favors parallel jobs, is used. As a result, jobs can always be assigned efficiently.
[0024] The above and other objects, features, and advantages of the present invention will become apparent from the following description taken in conjunction with the accompanying drawings, which show preferred embodiments as examples of the present invention.
Brief Description of the Drawings
[0025] [FIG. 1] A diagram showing an outline of the present embodiment.
[FIG. 2] A diagram showing an example of job distribution.
[FIG. 3] A diagram showing a system configuration example of the present embodiment.
[FIG. 4] A diagram showing a hardware configuration example of the job management server used in the present embodiment.
[FIG. 5] A block diagram showing the functions of the job management server.
[FIG. 6] A block diagram showing the functions of the job assignment unit.
[FIG. 7] A flowchart showing the procedure of the job assignment process.
[FIG. 8] A diagram showing an example of queued jobs.
[FIG. 9] A diagram showing the job assignment status up to the third job.
[FIG. 10] A diagram showing the state after processing of a parallel job has ended.
[FIG. 11] A diagram showing the state in which jobs up to the sixth have been assigned.
[FIG. 12] A diagram showing the state in which all jobs have been assigned.
[FIG. 13] A diagram showing an example in which jobs are assigned using only the stuffing method.
[FIG. 14] A diagram showing an example in which jobs are assigned using only the distributed placement method.
[FIG. 15] A diagram showing job assignment by the stuffing method.
[FIG. 16] A diagram showing job assignment by the distributed placement method.
Best Mode for Carrying Out the Invention
[0026] Embodiments of the present invention are described below with reference to the drawings.
FIG. 1 is a diagram showing an outline of the present embodiment. The job management device 1 assigns sequential jobs, each executed on one node, and parallel jobs, each executed by parallel processing on a plurality of nodes, to nodes 2a, 2b, and so on. For this purpose, the job management device 1 has a queue buffer 1a, reserved node number determining means 1b, free node number calculating means 1c, job queue acquiring means 1d, job type determining means 1e, parallel job assigning means 1f, assignment method selecting means 1g, stuffing method job assigning means 1h, distributed placement method job assigning means 1i, and processing request transmitting means 1j.
[0027] When job queues indicating job processing requests are input, the queue buffer 1a stores them in the order of input. The input requests are a mix of sequential job and parallel job processing requests. Each job queue contains information indicating whether the job is a parallel job or a sequential job, and a processing request for a parallel job also indicates the degree of parallelism of that job.
[0028] The reserved node number determining means 1b determines, based on the job queues in the queue buffer 1a that indicate parallel job processing requests, the number of reserved nodes, that is, the number of nodes that should be secured in advance for processing parallel jobs. For example, it can refer to the job queues in the queue buffer 1a that indicate parallel job processing requests and take the sum of the degrees of parallelism of those parallel jobs as the number of reserved nodes.
[0029] The free node number calculating means 1c monitors whether each of the nodes 2a, 2b, and so on is executing a job, and calculates the number of free nodes, that is, the number of nodes not executing a job.
[0030] The job queue acquiring means 1d acquires the job queues stored in the queue buffer 1a in the order in which they were stored.
The job type determining means 1e determines whether the job queue acquired by the job queue acquiring means 1d is a parallel job processing request or a sequential job processing request.
[0031] When the job queue acquired by the job queue acquiring means 1d is a parallel job processing request, the parallel job assigning means 1f determines as assignment destinations free nodes corresponding to the degree of parallelism of the parallel job.
[0032] When the job queue acquired by the job queue acquiring means 1d is a sequential job processing request, the assignment method selecting means 1g judges from the number of free nodes and the number of reserved nodes whether the number of free nodes is sufficient. For example, it judges that free nodes are insufficient if the number of free nodes is less than or equal to the number of reserved nodes, and sufficient if the number of free nodes exceeds the number of reserved nodes. It then selects the stuffing method as the job assignment method if free nodes are insufficient, and the distributed placement method if they are sufficient.
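The selection rule in this paragraph can be sketched as a small function. This is a minimal illustration under stated assumptions, not the implementation described in the specification: the function name `select_assignment_method` and the string labels are hypothetical.

```python
def select_assignment_method(free_nodes: int, reserved_nodes: int) -> str:
    """Pick the assignment method for a sequential job.

    Free nodes count as insufficient when their number is less than or
    equal to the number of reserved nodes, as in the example rule above.
    The labels "stuffing" and "distributed" are illustrative names only.
    """
    if free_nodes <= reserved_nodes:
        return "stuffing"     # keep the free nodes for queued parallel jobs
    return "distributed"      # enough free nodes: spread sequential jobs out
```

Note the boundary case: when the number of free nodes equals the number of reserved nodes, the sketch returns the stuffing method, matching the "less than or equal to" wording.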
[0033] When the stuffing method is selected, the stuffing method job assigning means 1h preferentially determines a node that is already executing another job as the assignment destination of the job acquired by the job queue acquiring means 1d. That is, from among the nodes executing other jobs, it selects a node that can execute a further job and makes that node the assignment destination.
[0034] When the distributed placement method is selected, the distributed placement method job assigning means 1i preferentially determines a node that is not executing a job as the assignment destination of the job acquired by the job queue acquiring means 1d.
[0035] The processing request transmitting means 1j transmits, to the node determined as the assignment destination, the job processing request indicated by the job queue acquired by the job queue acquiring means 1d. Each node processes jobs in response to these processing requests.
[0036] Such a job management device 1 operates in a system in which a plurality of nodes 2a, 2b, and so on are connected via a network. Each of the nodes 2a, 2b, and so on has a plurality of CPUs. When a job is submitted, it is queued in the queue buffer 1a. The job management device 1 assigns the queued jobs to nodes, and places and executes the jobs on those nodes. The node on which a job is placed is determined from the nodes to which no job is assigned (free nodes), the number of free nodes in the entire system, the remaining time of the jobs being executed, and the types of the queued jobs.
[0037] For example, when there are few parallel jobs, few reserved nodes are needed. Therefore, if the queue buffer 1a contains few job queues indicating parallel job processing requests, the reserved node number determining means 1b sets a small number of reserved nodes. As long as the number of free nodes is larger than the number of reserved nodes, sequential jobs can then be assigned by the distributed placement method. Because the distributed placement method preferentially assigns jobs to free nodes, the execution efficiency of the entire system can be improved.
[0038] Conversely, when there are many parallel jobs, parallel jobs wait a long time for execution unless many free nodes are secured. Therefore, as the job queues indicating parallel job processing requests in the queue buffer 1a increase, the reserved node number determining means 1b increases the number of reserved nodes. Thus, even when parallel jobs increase, parallel jobs can be kept from having to wait for execution.
[0039] FIG. 2 is a diagram showing an example of job distribution. It shows jobs assigned according to the present embodiment under the same conditions as the conventional examples shown in FIGS. 15 and 16. In this example, six nodes 2a, 2b, 2c, 2d, 2e, and 2f are connected via a network 3, and each of the nodes has two CPUs. The queue buffer 1a of the job management device 1 holds four consecutive job queues indicating processing requests for sequential jobs 4a, 4b, 4c, and 4d, followed by a job queue indicating a processing request for a parallel job 4e with a degree of parallelism of 3. The total degree of parallelism of the parallel jobs in the queue buffer 1a is taken as the number of reserved nodes.
[0040] In this example, the sequential jobs 4a, 4b, and 4c are distributed to the nodes 2a, 2b, and 2c, respectively, by the distributed placement method. At this point, the number of free nodes equals the number of reserved nodes. The next sequential job 4d is therefore assigned to the node 2a by the stuffing method. The parallel job 4e is then assigned to the nodes 2d, 2e, and 2f.
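The FIG. 2 walk-through can be reproduced with a short simulation. The sketch below is illustrative only and rests on several assumptions not stated in the specification: jobs are modeled as hypothetical `(name, parallelism)` pairs with parallelism 1 meaning a sequential job, a node can take as many jobs as it has CPUs, jobs never finish during the run, and ties are broken by node order.

```python
from collections import deque

def assign_jobs(queue, node_names, cpus_per_node):
    """Assign queued jobs to nodes; return {job name: [node names]}.

    queue is a list of (job_name, parallelism) pairs in submission order;
    parallelism 1 marks a sequential job (a modeling assumption).
    """
    load = {name: 0 for name in node_names}   # jobs running on each node
    placement = {}
    pending = deque(queue)
    while pending:
        job, parallelism = pending.popleft()
        free = [n for n in node_names if load[n] == 0]
        if parallelism > 1:
            # Parallel job: take as many free nodes as its parallelism.
            if len(free) < parallelism:
                raise RuntimeError(f"not enough free nodes for {job}")
            targets = free[:parallelism]
        else:
            # Reserved nodes: total parallelism of parallel jobs still queued.
            reserved = sum(p for _, p in pending if p > 1)
            if len(free) > reserved:
                targets = [free[0]]           # distributed placement
            else:
                # Stuffing: prefer a busy node that still has a spare CPU.
                busy = [n for n in node_names if 0 < load[n] < cpus_per_node]
                candidates = busy or free
                if not candidates:
                    raise RuntimeError(f"no node can accept {job}")
                targets = [candidates[0]]
        for n in targets:
            load[n] += 1
        placement[job] = targets
    return placement

nodes = ["2a", "2b", "2c", "2d", "2e", "2f"]
queue = [("4a", 1), ("4b", 1), ("4c", 1), ("4d", 1), ("4e", 3)]
result = assign_jobs(queue, nodes, cpus_per_node=2)
```

With these inputs, `result` maps 4a, 4b, and 4c to nodes 2a, 2b, and 2c, maps 4d to node 2a, and maps 4e to nodes 2d, 2e, and 2f, which is the placement described for FIG. 2. A real scheduler would leave a job waiting instead of raising an error when no node is available.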
[0041] Comparing the assignment result in FIG. 2 with the case where only the stuffing method is applied (see FIG. 15) shows that fewer nodes execute multiple jobs, so the execution efficiency of the entire system is improved. Comparing the assignment result in FIG. 2 with the case where only the distributed placement method is applied (see FIG. 16) shows that the parallel job can be executed immediately, so the processing efficiency of parallel jobs is improved.
[0042] That is, if jobs are assigned by the distributed placement method alone, parallel jobs wait longer for execution (parallel jobs clear slowly). If jobs are assigned by the stuffing method alone, multiple sequential jobs are packed into one node even though free nodes exist, and sequential jobs clear slowly.
[0043] In the present embodiment, sequential jobs are placed so as to spread across the nodes without hindering the execution of parallel jobs, so jobs clear quickly. It can also be seen that no node is left idle and the throughput of the entire system improves.
[0044] Next, the present embodiment is described in detail.
FIG. 3 is a diagram showing a system configuration example of the present embodiment. The job management server 100 is connected to six nodes 21 to 26 via a network 10. The nodes 21 to 26 are also connected to one another via a network 30 capable of high-speed communication. The network 30 is used to communicate information passed between the processes of parallel jobs executed on the nodes 21 to 26.
[0045] In a system with this configuration, a job processing request for a job to be executed is first registered in the job management server 100. The registered job processing requests include requests for batch jobs that start automatically at a predetermined time and requests for interactive jobs generated in response to operation input from a user and the like.
[0046] The job management server 100 then distributes the job processing requests, assigning each request to one of the nodes 21 to 26. To perform this distribution, the job management server 100 has the following hardware configuration.
[0047] FIG. 4 is a diagram showing a hardware configuration example of the job management server used in the present embodiment. The entire job management server 100 is controlled by a CPU 101. A RAM (Random Access Memory) 102, a hard disk drive (HDD) 103, a graphics processing device 104, an input interface 105, and a communication interface 106 are connected to the CPU 101 via a bus 107.
[0048] The RAM 102 temporarily stores at least part of the OS (Operating System) program and the application programs to be executed by the CPU 101. The RAM 102 also stores various data needed for processing by the CPU 101. The HDD 103 stores the OS and the application programs.
[0049] A monitor 11 is connected to the graphics processing device 104. The graphics processing device 104 displays images on the screen of the monitor 11 in accordance with commands from the CPU 101. A keyboard 12 and a mouse 13 are connected to the input interface 105. The input interface 105 transmits signals sent from the keyboard 12 and the mouse 13 to the CPU 101 via the bus 107.
[0050] The communication interface 106 is connected to the network 10. The communication interface 106 exchanges data with other computers via the network 10.
[0051] The processing functions of the present embodiment can be realized with the hardware configuration described above. Although FIG. 4 shows a hardware configuration example of the job management server 100, each of the nodes 21 to 26 can be realized with a similar hardware configuration. In the present embodiment, however, each of the nodes 21 to 26 has two CPUs.
[0052] Next, the processing functions that the job management server 100 needs in order to distribute job processing requests are described.
FIG. 5 is a block diagram showing the functions of the job management server. The job management server 100 has a communication unit 111, a job submission unit 112, a queue buffer 113, a reserved node number management unit 114, a free node number management unit 115, a running job remaining time management unit 116, and a job assignment unit 120.
[0053] The communication unit 111 communicates with the nodes 21 to 26 to collect various management information and to transmit job processing requests for assigned jobs, among other tasks.
The job submission unit 112 submits job processing requests to the queue buffer 113. For example, the job submission unit 112 submits the job processing request for a batch job to the queue buffer 113 at a preset time. When a user's operation input issues a request to process a job, the job submission unit 112 submits the job processing request for that job to the queue buffer 113.
[0054] The queue buffer 113 queues the submitted job processing requests in order of submission. It then passes the job processing requests to the job assignment unit 120, starting with the one submitted earliest.
[0055] The reserved node number management unit 114 calculates the number of reserved nodes based on the parallel jobs registered in the queue buffer 113. The number of reserved nodes for the parallel jobs in the queue buffer 113 is calculated as follows. The calculation is divided into a process that selects the parallel jobs to be counted and a process that calculates the degree of parallelism from the selected parallel jobs.
[0056] (1) Selection of the parallel jobs to be counted
One of the following methods is used to select the parallel jobs for which the number of reserved nodes is calculated.
[0057] In the first selection method, all queued parallel jobs are selected for processing.
In the second selection method, the consecutive parallel jobs counted from the head of the queue are selected for processing. For example, if a sequential job follows a parallel job, the parallel jobs after that sequential job are excluded from selection.
[0058] In the third selection method, the degrees of parallelism of the parallel jobs are summed in queue order. The parallel jobs up to the point before the sum exceeds a preset maximum are selected, and the parallel jobs after the maximum is exceeded are excluded from selection. The maximum is set, for example, to half the number of nodes in the system (50 for a 100-node system).
[0059] (2) Calculation of the degree of parallelism
The sum of the degrees of parallelism of the parallel jobs in the queue buffer 113 is calculated and used as the number of reserved nodes (for example, with four 2-parallel jobs, two 4-parallel jobs, and one 8-parallel job, 2 × 4 + 4 × 2 + 8 × 1 = 24). Alternatively, the largest degree of parallelism among the parallel jobs present is used as the number of reserved nodes (8 for the same set of jobs).
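The selection methods in (1) and the two calculations in (2) can be sketched as follows. This is a minimal illustration, not the specification's implementation: the function names are hypothetical, and the queue is modeled as a list of degrees of parallelism in queue order, with 1 standing for a sequential job. Two points are assumptions of this sketch where the text is not explicit: the second method skips any sequential jobs ahead of the first parallel job, and the third method excludes the job whose addition pushes the sum past the maximum.

```python
from typing import List

def select_all(queue: List[int]) -> List[int]:
    """First method: every queued parallel job."""
    return [p for p in queue if p > 1]

def select_first_run(queue: List[int]) -> List[int]:
    """Second method: the first unbroken run of parallel jobs.

    Assumption: sequential jobs ahead of the first parallel job are skipped.
    """
    selected: List[int] = []
    for p in queue:
        if p > 1:
            selected.append(p)
        elif selected:
            break            # a sequential job ends the run
    return selected

def select_capped(queue: List[int], max_total: int) -> List[int]:
    """Third method: parallel jobs in queue order until the sum would exceed max_total."""
    selected: List[int] = []
    total = 0
    for p in queue:
        if p <= 1:
            continue         # sequential jobs do not count toward the sum
        total += p
        if total > max_total:
            break            # this job and all later ones are excluded
        selected.append(p)
    return selected

def reserved_by_sum(selected: List[int]) -> int:
    """Number of reserved nodes as the total degree of parallelism."""
    return sum(selected)

def reserved_by_max(selected: List[int]) -> int:
    """Number of reserved nodes as the largest degree of parallelism (0 if none)."""
    return max(selected, default=0)

example = [2, 2, 2, 2, 4, 4, 8]   # four 2-parallel, two 4-parallel, one 8-parallel
```

For `example`, `reserved_by_sum(select_all(example))` gives 24 and `reserved_by_max(select_all(example))` gives 8, matching the figures in the text.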
[0060] The reserved node number management unit 114 passes the number of reserved nodes calculated in this way to the job assignment unit 120.
The free node number management unit 115 obtains from each of the nodes 21 to 26, via the communication unit 111, status information indicating whether that node is processing a job. The free node number management unit 115 then passes the number of free nodes to the job assignment unit 120.
[0061] The running job remaining time management unit 116 calculates, via the communication unit 111, the remaining execution time of the job being processed on each of the nodes 21 to 26. For example, the maximum processing time of a job may be specified in advance, with the program that executes the job designed so that the job always finishes within that maximum processing time. In this case, the running job remaining time management unit 116 takes the value obtained by subtracting each job's current execution time from the maximum processing time as that job's remaining time. The running job remaining time management unit 116 then passes the remaining times of the jobs running on the nodes 21 to 26 to the job assignment unit 120.
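The remaining-time estimate described here is a single subtraction. A minimal sketch follows, with the hypothetical name `remaining_time` and an assumed unit of seconds; clamping at zero is a defensive addition of this sketch, since the text assumes jobs always finish within the maximum processing time.

```python
def remaining_time(max_processing_time: float, elapsed_time: float) -> float:
    """Estimate a job's remaining time as maximum processing time minus elapsed time.

    Values are in seconds (an assumption). The result is clamped at zero in
    case a job overruns its maximum, which the text assumes never happens.
    """
    return max(0.0, max_processing_time - elapsed_time)
```

For example, a job with a 60-second maximum that has run for 45 seconds has an estimated 15 seconds remaining.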
[0062] The job assignment unit 120 distributes the job processing requests queued in the queue buffer 113 to the nodes 21 to 26, based on the number of reserved nodes, the number of free nodes, and the remaining times of the jobs running on the nodes 21 to 26.
[0063] The functions of the job assignment unit 120 are described in detail below.
FIG. 6 is a block diagram showing the functions of the job assignment unit. As shown in FIG. 6, the job assignment unit 120 has a job queue acquisition unit 121, a job type determination unit 122, a parallel job assignment unit 123, a job assignment method selection unit 124, a stuffing method job assignment unit 125, a distributed placement method job assignment unit 126, and a job processing request transmission unit 127.
[0064] When the job distribution timing arrives, the job queue acquisition unit 121 acquires the job queue at the head of the queue buffer 113. The job distribution timing is either when a job queue is queued in an empty queue buffer 113 or when any job running on the nodes 21 to 26 ends. The job queue acquisition unit 121 passes the acquired job queue to the job type determination unit 122.
[0065] The job type determination unit 122 analyzes the content of the job processing request indicated by the job queue and determines whether the job is a sequential job or a parallel job. If the job is a parallel job, the job type determination unit 122 also determines its degree of parallelism. For a parallel job, the job type determination unit 122 passes the job processing request indicated by the job queue to the parallel job assignment unit 123 and notifies it of the degree of parallelism. For a sequential job, the job type determination unit 122 passes the job processing request indicated by the job queue to the job assignment method selection unit 124.
[0066] On receiving a job processing request from the job type determination unit 122, the parallel job assignment unit 123 selects, from among the free nodes, nodes corresponding to the degree of parallelism of the parallel job and determines them as assignment destinations.
[0067] On receiving a sequential job processing request from the job type determination unit 122, the job assignment method selection unit 124 selects either the stuffing method or the distributed placement method as the job assignment method, according to the job queues stored in the queue buffer 113 and the job processing status of the nodes 21 to 26. Specifically, the job assignment method selection unit 124 selects the distributed placement method if the number of free nodes is greater than the number of reserved nodes, and selects the stuffing method if the number of free nodes is less than or equal to the number of reserved nodes.
[0068] When it selects the stuffing method, the job assignment method selection unit 124 passes the job processing request indicated by the acquired job queue to the stuffing method job assignment unit 125. When it selects the distributed placement method, the job assignment method selection unit 124 passes the job processing request indicated by the acquired job queue to the distributed placement method job assignment unit 126.
[0069] On receiving a job processing request, the stuffing method job assignment unit 125 determines the assignment destination of the job by the stuffing method, based on the current job processing status of the nodes 21 to 26. Specifically, the stuffing method job assignment unit 125 detects nodes that are executing a job and can execute an additional job. It then obtains the remaining job execution time of each detected node from the running job remaining time management unit 116 and determines the node with the least remaining time as the assignment destination. If none of the nodes executing a job can execute an additional job, the stuffing method job assignment unit 125 determines a free node as the assignment destination. The stuffing method job assignment unit 125 then notifies the job processing request transmission unit 127 of the determined assignment destination.
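The destination choice made by the stuffing method can be sketched as below. This is an illustrative sketch under stated assumptions: the `NodeState` class and its fields are hypothetical, a node is taken to be able to run an additional job while it runs fewer jobs than it has CPUs, and when several free nodes exist the first one in list order is used.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class NodeState:
    """Hypothetical snapshot of one node used by this sketch."""
    name: str
    running_jobs: int
    cpus: int
    remaining_time: float   # remaining time of the node's running work

def choose_stuffing_target(nodes: List[NodeState]) -> Optional[str]:
    """Return the name of the node chosen by the stuffing method, or None."""
    # Nodes that are busy but can still take one more job (assumed: jobs < CPUs).
    candidates = [n for n in nodes if 0 < n.running_jobs < n.cpus]
    if candidates:
        # Prefer the busy node whose running work will finish soonest.
        return min(candidates, key=lambda n: n.remaining_time).name
    # No busy node has room: fall back to a free node, if any.
    free = [n for n in nodes if n.running_jobs == 0]
    return free[0].name if free else None

snapshot = [
    NodeState("21", running_jobs=1, cpus=2, remaining_time=30.0),
    NodeState("22", running_jobs=1, cpus=2, remaining_time=10.0),
    NodeState("23", running_jobs=2, cpus=2, remaining_time=5.0),
    NodeState("24", running_jobs=0, cpus=2, remaining_time=0.0),
]
target = choose_stuffing_target(snapshot)
```

Here `target` is node 22: node 23 has the least remaining time but is already full, and free node 24 is used only when no busy node has room.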
[0070] On receiving a job processing request, the distributed placement method job assignment unit 126 determines the assignment destination of the job by the distributed placement method, based on the current job processing status of the nodes 21 to 26. Specifically, the distributed placement method job assignment unit 126 detects the free nodes and determines one of them as the assignment destination. It then notifies the job processing request transmission unit 127 of the determined assignment destination.
[0071] The job processing request transmission unit 127 transmits the job processing request to the node determined as the allocation destination.
FIG. 7 is a flowchart showing the procedure of the job allocation process. The process shown in FIG. 7 is described below in order of step number.
[0072] [Step S11] The job queue acquisition unit 121 acquires the job queue at the head of the queue buffer 113.
[Step S12] The job type determination unit 122 analyzes the job processing request indicated by the acquired job queue and determines whether it is a parallel job or a sequential job. For a parallel job, the process proceeds to step S13. For a sequential job, the process proceeds to step S14.
[0073] [Step S13] The parallel job allocation unit 123 selects as many free nodes as the parallel count of the requested parallel job and determines them as the allocation destinations. The process then proceeds to step S22.
[0074] [Step S14] The job allocation method selection unit 124 acquires the number of free nodes among the nodes 21 to 26 from the free node number management unit 115.
[Step S15] The job allocation method selection unit 124 determines whether there is a free node among the nodes 21 to 26. If there is a free node, the process proceeds to step S16. If there is no free node, the job allocation method selection unit 124 selects the stuffing method as the job allocation method and advances the process to step S20.
[0075] [Step S16] The job allocation method selection unit 124 acquires the presence or absence of parallel jobs and the number of reserved nodes from the reserved node number management unit 114.
[Step S17] The job allocation method selection unit 124 determines whether a job queue for a parallel job is queued in the queue buffer 113. If there is a parallel job, the process proceeds to step S18. If there is no parallel job, the job allocation method selection unit 124 selects the distributed placement method as the job allocation method and advances the process to step S19.
[0076] [Step S18] The job allocation method selection unit 124 determines whether the number of free nodes exceeds the number of reserved nodes. If it does, the job allocation method selection unit 124 selects the distributed placement method as the job allocation method and advances the process to step S19. If the number of free nodes is equal to or less than the number of reserved nodes, the job allocation method selection unit 124 selects the stuffing method as the job allocation method and advances the process to step S20.
[0077] [Step S19] The distributed placement method job allocation unit 126 allocates the job to a free node. The process then proceeds to step S22.
[Step S20] The stuffing method job allocation unit 125 acquires the remaining time of the job being executed on each node from the execution job remaining time management unit 116.
[0078] [Step S21] The stuffing method job allocation unit 125 selects the node with the shortest remaining time and determines it as the allocation destination of the job.
[Step S22] The job processing request transmission unit 127 transmits the job processing request to the node or nodes determined as the allocation destination by the parallel job allocation unit 123, the stuffing method job allocation unit 125, or the distributed placement method job allocation unit 126.
[0079] In this way, the stuffing method and the distributed placement method can be switched according to the number of reserved nodes and the number of free nodes. As a result, jobs in the queue buffer 113 are cleared more quickly, and the throughput of the entire system improves.
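The method selection of steps S15 to S18 for a sequential job can be summarized in a short Python sketch. The enum, the function name, and the parameter names are hypothetical; only the three comparisons are taken from the flowchart description.

```python
from enum import Enum


class Method(Enum):
    STUFFING = "stuffing"
    DISTRIBUTED = "distributed"


def select_method(free_nodes: int, reserved_nodes: int, parallel_queued: bool) -> Method:
    """Choose the allocation method for a sequential job (steps S15 to S18)."""
    if free_nodes == 0:              # S15: no free node, so pack onto a busy node
        return Method.STUFFING
    if not parallel_queued:          # S17: no parallel job is waiting in the queue
        return Method.DISTRIBUTED
    if free_nodes > reserved_nodes:  # S18: free nodes exceed the reservation
        return Method.DISTRIBUTED
    return Method.STUFFING           # S18: free nodes <= reserved nodes
```

Note that equality between the two counts leads to the stuffing method, which keeps exactly the reserved number of nodes free for the queued parallel jobs.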
[0080] Next, a specific example of job allocation is described.
FIG. 8 is a diagram showing an example of queued jobs. In this example, eight job queues 41 to 48 are queued in the queue buffer 113. In FIG. 8, the queuing order of the job queues 41 to 48 is indicated by the numbers #1 to #8.
[0081] The first two job queues 41 and 42 indicate processing requests for sequential jobs. The next job queue 43 indicates a processing request for a parallel job with a parallel count of 5. The next four job queues 44 to 47 indicate processing requests for sequential jobs. The last job queue 48 indicates a processing request for a parallel job with a parallel count of 2.
[0082] In the following description, the number of reserved nodes is calculated as the sum of the parallel counts of all queued parallel jobs.
These jobs are processed by six nodes 21 to 26, each having two CPUs. Before execution of the jobs shown in FIG. 8 starts, all of the nodes 21 to 26 are assumed to be free.
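The reserved node count used in this example can be checked with a few lines of Python. The tuple layout `(job queue number, kind, parallel count)` and the function name are assumptions chosen for illustration.

```python
def reserved_node_count(queue):
    """Sum the parallel counts of every parallel job still waiting in the queue."""
    return sum(parallel for _, kind, parallel in queue if kind == "parallel")


# The queue of FIG. 8: (job queue number, kind, parallel count)
fig8_queue = [
    (41, "sequential", 1),
    (42, "sequential", 1),
    (43, "parallel", 5),
    (44, "sequential", 1),
    (45, "sequential", 1),
    (46, "sequential", 1),
    (47, "sequential", 1),
    (48, "parallel", 2),
]

reserved_at_start = reserved_node_count(fig8_queue)       # 5 + 2
reserved_after_43 = reserved_node_count(fig8_queue[3:])   # only job queue 48 remains
```

Here `reserved_at_start` is 7 and `reserved_after_43` is 2, the two values that appear in the walkthrough.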
[0083] First, the job queue 41 for a sequential job is assigned. At this time, the number of reserved nodes is 7 (the parallel count 5 of the parallel job indicated by the job queue 43 plus the parallel count 2 of the parallel job indicated by the job queue 48). Since the number of free nodes is 6, the condition "number of reserved nodes ≥ number of free nodes" is satisfied, and the stuffing method is adopted. At this stage all of the nodes 21 to 26 are free, so the sequential job indicated by the job queue 41 is assigned to one of the nodes 21 to 26. In this example, it is assumed to be assigned to the node 21.
[0084] Next, the job queue 42 for a sequential job is assigned. At this point, the number of reserved nodes is 7 and the number of free nodes is 5. The condition "number of reserved nodes ≥ number of free nodes" is therefore satisfied, and the stuffing method is adopted. Under the stuffing method, if an additional job can be assigned to a node that is already executing a job, the job is assigned to that node. The job queue 42 is therefore assigned to the node 21.
[0085] Next, the parallel job of the job queue 43 is assigned. A parallel job must run on as many free nodes as its parallel count, so it is assigned to free nodes. The parallel job of the job queue 43 is therefore assigned to the nodes 22 to 26.
[0086] FIG. 9 is a diagram showing the job allocation status up to the third job. On the node 21, the sequential job 41a indicated by the job queue 41 and the sequential job 42a indicated by the job queue 42 are being executed. On the nodes 22 to 26, the parallel job 43a indicated by the job queue 43 is being executed.
[0087] Because the parallel job 43a is being executed, no free node remains. The node 21 is also executing as many jobs as it has CPUs, so no additional job can be assigned to it. The job queue 44 therefore waits until the processing of one of the jobs ends.
[0088] Assume now that the processing of the parallel job 43a has ended.
FIG. 10 is a diagram showing the state after the processing of the parallel job has ended. As shown in FIG. 10, the nodes 22 to 26 become free when the processing of the parallel job 43a ends. Allocation of the jobs from the job queue 44 onward then starts.
[0089] When the job queue 44 for a sequential job is assigned, the number of reserved nodes is 2. Since the number of free nodes is currently 5, the relation "number of reserved nodes < number of free nodes" holds. It continues to hold while the job queues 45 and 46 are assigned, when 4 and 3 nodes are free, respectively. The job queues 44 to 46 for sequential jobs are therefore assigned by the distributed placement method. As a result, the job queue 44 is assumed to be assigned to the node 22, the job queue 45 to the node 23, and the job queue 46 to the node 24.
[0090] FIG. 11 is a diagram showing the state in which the jobs up to the sixth have been assigned. On the node 22, the sequential job 44a indicated by the job queue 44 is being executed. On the node 23, the sequential job 45a indicated by the job queue 45 is being executed. On the node 24, the sequential job 46a indicated by the job queue 46 is being executed. At this point, there are two free nodes.
[0091] When the job queue 47 for a sequential job is assigned next, the number of reserved nodes equals the number of free nodes, so the job is assigned by the stuffing method. That is, the job queue 47 is assigned to whichever of the nodes 22 to 24 has the shortest remaining job execution time. In this example, the job queue 47 is assumed to be assigned to the node 22.
[0092] Finally, the job queue 48 for a parallel job is assigned. At this point two free nodes have been secured, so the job queue 48 is assigned to the nodes 25 and 26.
[0093] FIG. 12 is a diagram showing the state in which all jobs have been assigned. On the node 22, the sequential job 47a indicated by the job queue 47 is being executed. On the nodes 25 and 26, the parallel job 48a indicated by the job queue 48 is being executed. As shown in FIG. 12, all jobs are being executed in this example. Moreover, no free node remains, so high operational efficiency is obtained for the system as a whole.
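The walkthrough of FIGS. 8 to 12 can be replayed with a small simulation. This is a sketch under stated assumptions, not the disclosed implementation: a parallel job is modeled as occupying every CPU of its nodes (consistent with job queue 44 having to wait in FIG. 9), and because the example gives no remaining times, the stuffing method here simply takes the lowest-numbered busy node with spare capacity. All names are hypothetical.

```python
CAPACITY = 2                      # CPUs per node, as in the example
NODES = [21, 22, 23, 24, 25, 26]


def dispatch(queue, running):
    """Assign jobs from the head of the queue until the head cannot be placed."""
    while queue:
        job_id, parallel = queue[0]
        free = [n for n in NODES if not running[n]]
        if parallel > 1:
            if len(free) < parallel:
                break                                  # parallel job must wait
            for n in free[:parallel]:
                running[n] = [job_id] * CAPACITY       # assumption: whole node is used
        else:
            reserved = sum(p for _, p in queue if p > 1)
            if free and (reserved == 0 or len(free) > reserved):
                target = free[0]                       # distributed placement method
            else:                                      # stuffing method
                busy = [n for n in NODES if 0 < len(running[n]) < CAPACITY]
                candidates = busy or free
                if not candidates:
                    break                              # nothing can take the job
                target = candidates[0]
            running[target].append(job_id)
        queue.pop(0)


def finish(job_id, running):
    """Remove a finished job from every node it was running on."""
    for n in NODES:
        running[n] = [j for j in running[n] if j != job_id]


def run_example():
    # (job queue number, parallel count); a parallel count of 1 is a sequential job
    queue = [(41, 1), (42, 1), (43, 5), (44, 1), (45, 1), (46, 1), (47, 1), (48, 2)]
    running = {n: [] for n in NODES}
    dispatch(queue, running)                 # places 41, 42, 43 (FIG. 9)
    waiting = [job for job, _ in queue]      # job queues still waiting
    finish(43, running)                      # FIG. 10
    dispatch(queue, running)                 # places 44 to 48 (FIGS. 11 and 12)
    return waiting, running
```

Under these assumptions, `run_example()` reports job queues 44 to 48 as waiting after the first pass and ends with node 21 running jobs 41 and 42, node 22 running jobs 44 and 47, nodes 23 and 24 running jobs 45 and 46, and nodes 25 and 26 running parallel job 48.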
[0094] For comparison, the following describes an example in which jobs are assigned by the stuffing method only and an example in which jobs are assigned by the distributed placement method only.
FIG. 13 is a diagram showing an example in which jobs are assigned by the stuffing method only. When jobs are assigned by the stuffing method, the state after the third job has been assigned is the same as in FIG. 10. If the fourth sequential job 44a and each subsequent job are then assigned by the stuffing method, the node 26 is left free, as shown in FIG. 13. That is, the utilization of the system as a whole is poor. Moreover, the sequential jobs are executed two at a time on one node, so execution performance degrades due to resource contention.
[0095] Next, the case in which the fourth and subsequent jobs are assigned by the distributed placement method, starting from the state shown in FIG. 10, is described.
FIG. 14 is a diagram showing an example in which jobs are assigned by the distributed placement method only. If the fourth sequential job 44a and each subsequent job are assigned by the distributed placement method, only one free node is left when the parallel job 48a with a parallel count of 2 is to be assigned. The job queue 48 for the parallel job therefore waits for execution until the number of free nodes increases, and parallel jobs are cleared more slowly. Moreover, the node 26 remains free, so the utilization of the system as a whole also remains poor.
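The two single-method comparisons can be reproduced with a short sketch that starts from the state of FIG. 10. As assumptions for illustration, a parallel job occupies every CPU of its nodes, and the lowest-numbered candidate node is always chosen because the example gives no remaining times; the function and variable names are hypothetical.

```python
CAPACITY = 2
NODES = [21, 22, 23, 24, 25, 26]


def assign_all(policy):
    """Place job queues 44 to 48 from the state of FIG. 10 using one method only.

    Returns (job queues left waiting, nodes left idle).
    """
    running = {n: [] for n in NODES}
    running[21] = [41, 42]                         # node 21 is already full
    queue = [(44, 1), (45, 1), (46, 1), (47, 1), (48, 2)]
    while queue:
        job_id, parallel = queue[0]
        free = [n for n in NODES if not running[n]]
        if parallel > 1:
            if len(free) < parallel:
                break                              # parallel job has to wait
            for n in free[:parallel]:
                running[n] = [job_id] * CAPACITY   # assumption: whole node is used
        else:
            busy = [n for n in NODES if 0 < len(running[n]) < CAPACITY]
            candidates = (busy or free) if policy == "stuffing" else free
            if not candidates:
                break
            running[candidates[0]].append(job_id)
        queue.pop(0)
    waiting = [job for job, _ in queue]
    idle = [n for n in NODES if not running[n]]
    return waiting, idle
```

With `"stuffing"` every job is placed but node 26 stays idle, as in FIG. 13; with `"distributed"` job queue 48 is left waiting and node 26 is again idle, as in FIG. 14.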
[0096] Comparing the job allocation result of the present embodiment shown in FIG. 12 with the job allocation results shown in FIGS. 13 and 14 shows that, with the present embodiment, all remaining jobs are placed without waiting once the parallel job 43a has ended. Moreover, in the present embodiment no free node is left over, so the system utilization improves and parallel jobs are cleared more quickly. Furthermore, only two nodes execute multiple sequential jobs simultaneously, which reduces the probability of resource contention and improves execution performance.
[0097] The job allocation method can also be determined by adding a predetermined correction value α to the number of reserved nodes in the above embodiment and comparing the sum with the number of free nodes. Specifically, the number of reserved nodes, based on the parallel counts of the parallel jobs in the job queue, and the number of free nodes are obtained. If the number of free nodes is greater than the number of reserved nodes + α, the distributed placement method is selected. If the number of free nodes is equal to or less than the number of reserved nodes + α, the stuffing method is selected. The correction value α represents the number of nodes to be kept available for parallel jobs that are about to be submitted. For example, α is specified by the system administrator, or the job allocation unit 120 calculates the average parallel count of parallel jobs submitted in the past and uses the result as α.
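A minimal sketch of the comparison with the correction value α follows. The function names and the use of a plain list for the submission history are assumptions; only the comparison itself and the idea of averaging past parallel counts come from the text.

```python
def select_with_margin(free_nodes, reserved_nodes, alpha):
    """Compare the free node count against the reservation plus the margin alpha."""
    if free_nodes > reserved_nodes + alpha:
        return "distributed"
    return "stuffing"


def average_parallel_count(history):
    """One way to derive alpha: the mean parallel count of past parallel jobs."""
    return sum(history) / len(history) if history else 0
```

For instance, with 5 free nodes and 2 reserved nodes, α = 0 selects the distributed placement method, while α = 3 (the average of past parallel counts 5, 2, and 2) selects the stuffing method.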
[0098] The number of reserved nodes can also be fixed at a preset value. For example, "number of parallel jobs that can be submitted" × "parallel count" is set as the number of reserved nodes. Here, the number of parallel jobs that can be submitted is a value set arbitrarily in advance by the system administrator. The parallel count may be a value specified in advance by the system administrator or the past average parallel count.
[0099] Alternatively, jobs may be placed by the distributed placement method when the total number of sequential jobs being executed and sequential jobs queued in the queue buffer is smaller than the number of nodes on which sequential jobs are preferentially executed (the number of nodes for sequential jobs), and by the stuffing method otherwise. The number of nodes for sequential jobs may be, for example, a value specified in advance by the system administrator or the past average number of sequential jobs per unit time.
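This alternative criterion reduces to a single comparison, sketched below in Python with hypothetical parameter names.

```python
def select_by_sequential_load(running_sequential, queued_sequential, sequential_nodes):
    """Use distributed placement while sequential jobs fit on the nodes set aside for them."""
    if running_sequential + queued_sequential < sequential_nodes:
        return "distributed"
    return "stuffing"
```

With 4 nodes for sequential jobs, 1 running and 2 queued sequential jobs give the distributed placement method, whereas 2 running and 2 queued give the stuffing method.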
[0100] As a method of selecting the node to which a sequential job is assigned under the stuffing method, the job may instead be assigned to the node, among the nodes already in use, whose running job has the longest execution time so far. For example, if no maximum processing time was defined when the job's program was designed, the remaining time of the job being processed cannot be calculated. In that case, the sequential job is assigned to the node whose job has been running the longest since its processing started.
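The elapsed-time variant can be sketched as follows. The mapping from node number to job start time and the function name are assumptions for illustration.

```python
def choose_longest_running(start_times, now):
    """Pick the busy node whose job has been running the longest.

    start_times maps a node id to the start time of the job running on it.
    Returns None when no node is running a job.
    """
    if not start_times:
        return None
    return max(start_times, key=lambda node: now - start_times[node])
```

For jobs started at times 100, 40, and 70 on nodes 22, 23, and 24, a call at time 120 selects node 23, whose job has been running for 80 time units.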
[0101] The above processing functions can be realized by a computer. In that case, a program describing the processing contents of the functions that the job management apparatus should have is provided. The above processing functions are realized on the computer by executing the program on the computer. The program describing the processing contents can be recorded on a computer-readable recording medium. Computer-readable recording media include magnetic recording devices, optical discs, magneto-optical recording media, and semiconductor memories. Magnetic recording devices include hard disk drives (HDD), flexible disks (FD), and magnetic tapes. Optical discs include DVD (Digital Versatile Disc), DVD-RAM (Random Access Memory), CD-ROM (Compact Disc Read Only Memory), and CD-R (Recordable)/RW (ReWritable). Magneto-optical recording media include MO (Magneto-Optical disk).
[0102] To distribute the program, portable recording media such as DVDs and CD-ROMs on which the program is recorded are sold, for example. The program can also be stored in a storage device of a server computer and transferred from the server computer to other computers via a network.
[0103] A computer that executes the program stores, for example, the program recorded on a portable recording medium or the program transferred from the server computer in its own storage device. The computer then reads the program from its own storage device and executes processing according to the program. The computer can also read the program directly from the portable recording medium and execute processing according to it. The computer can also execute processing according to the received program each time a program is transferred from the server computer.
[0104] The above merely illustrates the principle of the present invention. Many variations and modifications are possible for those skilled in the art, and the present invention is not limited to the exact configurations and applications shown and described above. All corresponding modifications and equivalents are regarded as falling within the scope of the present invention as defined by the appended claims and their equivalents.
Explanation of symbols
[0105] 1 Job management apparatus
1a Queue buffer
1b Reserved node number determining means
1c Free node number calculating means
1d Job queue acquisition means
1e Job type determination means
1f Parallel job allocation means
1g Allocation method selection means
1h Stuffing method job allocation means
1i Distributed placement method job allocation means
1j Processing request transmission means
2a, 2b, ... Nodes

Claims

A job management apparatus that allocates, to nodes, sequential jobs each executed on one node and parallel jobs each executed by parallel processing on a plurality of nodes, the job management apparatus comprising:
a queue buffer that, when job queues indicating job processing requests are input, stores the job queues in the order of input;
reserved node number determining means for determining, based on the job queues indicating parallel job processing requests stored in the queue buffer, a number of reserved nodes indicating the number of nodes to be secured in advance for processing parallel jobs;
free node number calculating means for monitoring whether each of the plurality of nodes is executing a job and calculating a number of free nodes indicating the number of nodes that are not executing a job;
job queue acquisition means for acquiring the job queues stored in the queue buffer in the order in which they were stored;
job type determination means for determining whether the job queue acquired by the job queue acquisition means is a parallel job processing request or a sequential job processing request;
parallel job allocation means for determining, when the job queue acquired by the job queue acquisition means is a parallel job processing request, free nodes corresponding to the parallel count of the parallel job as allocation destinations;
allocation method selection means for judging, when the job queue acquired by the job queue acquisition means is a sequential job processing request, whether the number of free nodes is sufficient based on the number of free nodes and the number of reserved nodes, selecting a stuffing method as the job allocation method if the number of free nodes is insufficient, and selecting a distributed placement method as the job allocation method if the number of free nodes is sufficient;
stuffing method job allocation means for preferentially determining, when the stuffing method is selected, a node that is already executing another job as the allocation destination of the job acquired by the job queue acquisition means;
distributed placement method job allocation means for preferentially determining, when the distributed placement method is selected, a node that is not executing a job as the allocation destination of the job acquired by the job queue acquisition means; and
processing request transmission means for transmitting, to the node determined as the allocation destination, the processing request for the job indicated by the job queue acquired by the job queue acquisition means.
[2] The job management apparatus according to claim 1, wherein the reserved node number determining means uses, as the number of reserved nodes, the sum of the parallel counts indicated by the parallel job processing requests stored in the queue buffer.
[3] The job management apparatus according to claim 1, wherein the reserved node number determining means uses, as the number of reserved nodes, the maximum of the parallel counts indicated by the parallel job processing requests stored in the queue buffer.
[4] The job management apparatus according to claim 1, wherein the allocation method selection means judges that the number of free nodes is insufficient if the number of free nodes is equal to or less than the number of reserved nodes, and judges that the number of free nodes is sufficient if the number of free nodes exceeds the number of reserved nodes.
[5] The job management apparatus according to claim 1, wherein the allocation method selection means judges that the number of free nodes is insufficient if the number of free nodes is equal to or less than the value obtained by adding a predetermined value to the number of reserved nodes, and judges that the number of free nodes is sufficient if the number of free nodes exceeds the value obtained by adding the predetermined value to the number of reserved nodes.
[6] The job management apparatus according to claim 1, wherein the stuffing method job allocation means determines, as the allocation destination, the node whose running job has the shortest remaining time among the nodes that are already executing a job.
[7] The job management apparatus according to claim 1, wherein the stuffing method job allocation means determines, as the allocation destination, the node with the longest elapsed time since the start of job execution among the nodes that are already executing a job.
[8] 1つのノードで実行される逐次ジョブと複数のノードによる並列処理で実行される並 列ジョブとのノードへの割り当てを行うジョブ管理プログラムにお 、て、 [8] In a job management program that assigns to a node a sequential job executed on one node and a parallel job executed by parallel processing by a plurality of nodes.
コンピュータを、  Computer
ジョブの処理要求を示すジョブキューが入力されると、入力された順に前記ジョブキ ユーを格納するキューバッファ、 When a job queue indicating a job processing request is input, the job queues are input in the input order. Queue buffer to store users,
前記キューバッファに格納された並列ジョブの処理要求を示すジョブキューに基づ V、て、並列ジョブの処理を行うために予め確保しておくべきノードの数を示す予約ノ 一ド数を決定する予約ノード数決定手段、  Based on the job queue indicating the parallel job processing request stored in the queue buffer, V determines the number of reserved nodes indicating the number of nodes to be reserved in advance for processing the parallel job. Reservation node number determination means,
前記複数のノードそれぞれでのジョブの実行の有無を監視し、ジョブを実行して ヽ ないノードの数を示す空きノード数を算出する空きノード数計算手段、  A number of empty nodes calculating means for monitoring the presence or absence of execution of the job in each of the plurality of nodes and calculating the number of empty nodes indicating the number of nodes that should not execute the job;
Job queue acquisition means for acquiring the job queues stored in the queue buffer in the order in which they were stored;
Job type determination means for determining whether the job queue acquired by the job queue acquisition means is a processing request for a parallel job or a processing request for a sequential job;
Parallel job assignment means for determining, when the job queue acquired by the job queue acquisition means is a processing request for a parallel job, free nodes in accordance with the degree of parallelism of the parallel job as the assignment destination;
Assignment method selection means for judging, when the job queue acquired by the job queue acquisition means is a processing request for a sequential job, whether the number of free nodes is sufficient based on the number of free nodes and the number of reserved nodes, selecting the packing method as the job assignment method if the number of free nodes is insufficient, and selecting the distributed arrangement method as the job assignment method if the number of free nodes is sufficient;
Packing-method job assignment means for preferentially determining, when the packing method is selected, a node that is already executing another job as the assignment destination of the job acquired by the job queue acquisition means;
Distributed-arrangement-method job assignment means for preferentially determining, when the distributed arrangement method is selected, a node that is not executing a job as the assignment destination of the job acquired by the job queue acquisition means; and
Processing request transmission means for transmitting, to the node determined as the assignment destination, the processing request for the job indicated by the job queue acquired by the job queue acquisition means; the job management program being characterized by causing a computer to function as the means above.
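The choice between the packing (stuffing) method and the distributed arrangement method for a sequential job can be illustrated with a minimal Python sketch. The claim only states that sufficiency is judged from the number of free nodes and the number of reserved nodes, so the comparison `free <= reserved`, the tie-breaking rule, and the names `select_method` and `choose_node` are all assumptions made for illustration, not the claimed implementation.

```python
from typing import Dict


def select_method(free_nodes: int, reserved_nodes: int) -> str:
    """Return "packing" when free nodes are short, otherwise "distributed".

    Assumption: free nodes count as insufficient when they do not
    exceed the number of nodes reserved for parallel jobs.
    """
    return "packing" if free_nodes <= reserved_nodes else "distributed"


def choose_node(running: Dict[str, int], reserved_nodes: int) -> str:
    """Pick the assignment destination for one sequential job.

    `running` maps a node name to the number of jobs it is executing.
    """
    if not running:
        raise ValueError("no nodes available")
    free = [n for n, jobs in running.items() if jobs == 0]
    busy = [n for n, jobs in running.items() if jobs > 0]
    if select_method(len(free), reserved_nodes) == "packing":
        # Prefer nodes that are already executing another job.
        candidates = busy or free
    else:
        # Prefer nodes that are not executing any job.
        candidates = free or busy
    # Hypothetical tie-break: least-loaded node, then name order.
    return min(candidates, key=lambda n: (running[n], n))


if __name__ == "__main__":
    load = {"n1": 2, "n2": 1, "n3": 0, "n4": 0}
    print(choose_node(load, reserved_nodes=2))  # two free, two reserved
    print(choose_node(load, reserved_nodes=1))  # two free, one reserved
```

With two free nodes and two reserved nodes the sketch packs the job onto a busy node, leaving the free nodes available for a waiting parallel job; with only one reserved node it spreads the job onto a free node.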
A job management method in which a computer assigns, to nodes, sequential jobs each executed on a single node and parallel jobs each executed by parallel processing on a plurality of nodes, the method comprising:
when a job processing request is input, queuing it in a queue buffer in the order of input;
determining, based on the job queues stored in the queue buffer that indicate processing requests for parallel jobs, a number of reserved nodes indicating the number of nodes that should be secured in advance for processing parallel jobs;
monitoring whether each of the plurality of nodes is executing a job, and calculating a number of free nodes indicating the number of nodes that are not executing a job;
acquiring the job queues stored in the queue buffer in the order in which they were stored;
determining whether the acquired job queue is a processing request for a parallel job or a processing request for a sequential job;
when the acquired job queue is a processing request for a parallel job, determining free nodes in accordance with the degree of parallelism of the parallel job as the assignment destination;
when the acquired job queue is a processing request for a sequential job, judging whether the number of free nodes is sufficient based on the number of free nodes and the number of reserved nodes, selecting the packing method as the job assignment method if the number of free nodes is insufficient, and selecting the distributed arrangement method as the job assignment method if the number of free nodes is sufficient;
when the packing method is selected, preferentially determining a node that is already executing another job as the assignment destination of the job acquired by the job queue acquisition means;
when the distributed arrangement method is selected, preferentially determining a node that is not executing a job as the assignment destination of the job acquired by the job queue acquisition means; and
transmitting, to the node determined as the assignment destination, the processing request for the job indicated by the job queue acquired by the job queue acquisition means.
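The queuing, reservation, and parallel-assignment steps of the method can be sketched in Python as follows. The claim does not say how the number of reserved nodes is derived from the queued parallel jobs, nor what happens when too few nodes are free, so taking the largest queued degree of parallelism, returning `None` on shortage, and the names `Job`, `reserved_node_count`, `free_node_count`, and `assign_parallel` are assumptions made for illustration only.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict, List, Optional


@dataclass
class Job:
    name: str
    parallelism: int = 1  # 1 = sequential job, >1 = parallel job


def reserved_node_count(queue: Deque[Job]) -> int:
    """Number of nodes to keep aside for queued parallel jobs.

    Assumption: reserve enough nodes for the widest queued parallel job.
    """
    return max((j.parallelism for j in queue if j.parallelism > 1), default=0)


def free_node_count(running: Dict[str, int]) -> int:
    """Count the nodes that are not executing any job."""
    return sum(1 for jobs in running.values() if jobs == 0)


def assign_parallel(job: Job, running: Dict[str, int]) -> Optional[List[str]]:
    """Give a parallel job one free node per unit of parallelism.

    Returns None (an assumed behaviour) when too few nodes are free.
    """
    free = sorted(n for n, jobs in running.items() if jobs == 0)
    if len(free) < job.parallelism:
        return None
    chosen = free[: job.parallelism]
    for node in chosen:
        running[node] += 1  # mark the node as executing the job
    return chosen


if __name__ == "__main__":
    # Requests are queued in input order and taken out in the same order.
    queue: Deque[Job] = deque([Job("seq-a"), Job("par-b", 3), Job("par-c", 2)])
    nodes = {"n1": 1, "n2": 0, "n3": 0, "n4": 0}
    print(reserved_node_count(queue), free_node_count(nodes))
    print(assign_parallel(Job("par-b", 3), nodes))
```

A `deque` is used so that requests are appended at one end and removed from the other, which matches the first-in, first-out order stated in the claim.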
Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2005/018418 WO2007043142A1 (en) 2005-10-05 2005-10-05 Job management device and job management program

Publications (1)

Publication Number Publication Date
WO2007043142A1 (en) 2007-04-19

Family

ID=37942412

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2005/018418 WO2007043142A1 (en) 2005-10-05 2005-10-05 Job management device and job management program

Country Status (1)

Country Link
WO (1) WO2007043142A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH10240549A (en) * 1997-02-24 1998-09-11 Hitachi Ltd Parallel job multiplex scheduling method and device
JP2001184326A (en) * 1999-12-27 2001-07-06 Hitachi Ltd Multiprocessor scheduling method and computer system for multiprocessor scheduling
JP2002014829A (en) * 2000-06-30 2002-01-18 Japan Research Institute Ltd Parallel processing control system, method for the same and medium having program for parallel processing control stored thereon

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHOU BB ET AL: "Job Scheduling Strategies for Networks of Workstations.", LECTURE NOTES IN COMPUTER SCIENCE., vol. 1459, 1998, pages 143 - 157, XP002994942 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009057208A1 (en) * 2007-10-31 2009-05-07 Fujitsu Limited Resource assignment program, management node, resource assignment method, and parallel computer system
JP2019053587A (en) * 2017-09-15 2019-04-04 株式会社日立製作所 Storage system
US10824425B2 (en) 2017-09-15 2020-11-03 Hitachi, Ltd. Selecting destination for processing management instructions based on the processor buffer size and uncompleted management instructions
CN108062254A (en) * 2017-12-12 2018-05-22 腾讯科技(深圳)有限公司 Job processing method, device, storage medium and equipment
CN109871266A (en) * 2018-12-15 2019-06-11 中国平安人寿保险股份有限公司 Task delay process method, apparatus, computer installation and storage medium
CN109871266B (en) * 2018-12-15 2024-05-14 中国平安人寿保险股份有限公司 Task delay processing method and device, computer device and storage medium

Similar Documents

Publication Publication Date Title
JP4921054B2 (en) Load balancing control system and load balancing control method
US7958509B2 (en) Method and system for scheduling of jobs
JP3678414B2 (en) Multiprocessor system
JP5744909B2 (en) Method, information processing system, and computer program for dynamically managing accelerator resources
US8996756B2 (en) Using process location to bind IO resources on NUMA architectures
US6658449B1 (en) Apparatus and method for periodic load balancing in a multiple run queue system
US8239868B2 (en) Computer system, servers constituting the same, and job execution control method and program
JP4922496B2 (en) Method for giving priority to I / O requests
KR101644800B1 (en) Computing system and method
JP3977698B2 (en) Storage control device, storage control device control method, and program
TWI235952B (en) Thread dispatch mechanism and method for multiprocessor computer systems
JP2003256221A (en) Parallel process executing method and multi-processor type computer
JP2004362459A (en) Network information recording device
JP2011165223A (en) Method for allocating resource computer-based system
US20030191794A1 (en) Apparatus and method for dispatching fixed priority threads using a global run queue in a multiple run queue system
JP3664021B2 (en) Resource allocation method by service level
WO2007043142A1 (en) Job management device and job management program
US20170344266A1 (en) Methods for dynamic resource reservation based on classified i/o requests and devices thereof
JP2012059152A (en) System for performing data processing and method for allocating memory
US20220222013A1 (en) Scheduling storage system tasks to promote low latency and sustainability
JP2007328413A (en) Method for distributing load
US8245229B2 (en) Temporal batching of I/O jobs
US7086059B2 (en) Throttling queue
JP2010097566A (en) Information processing apparatus, and method for assigning batch processing in information processing system
JP2924725B2 (en) Buffer allocation control system

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 05790542; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: JP)