WO2012105056A1 - Data transfer control method for parallel distributed processing system, parallel distributed processing system, and storage medium - Google Patents
Data transfer control method for parallel distributed processing system, parallel distributed processing system, and storage medium
- Publication number
- WO2012105056A1 (PCT/JP2011/052435, JP2011052435W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- parallel distributed
- distributed processing
- server
- processing execution
- data
- Prior art date
Links
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5083—Techniques for rebalancing the load in a distributed system
Definitions
- the present invention relates to a data transfer control method and apparatus using data processing history information, server operation history information, and the like.
- Parallel distributed processing technology is attracting attention as a technology for analyzing large amounts of data in a short time.
- For log data that has not been utilized so far, there is no established way of using or analyzing the data, so trial and error is required.
- In parallel distributed processing, processing is divided and assigned to a plurality of servers and executed in parallel in a distributed manner, so a large number of servers must be prepared. For this reason, the return on investment from introducing a parallel distributed processing system is unclear at the initial stage, and the barrier to adoption for customers is high.
- Each server calculates the ratio of processing-allocated data to stored data, and a scheduling method that allocates processing of the data stored in the server having the smallest ratio is disclosed (see Patent Document 1).
- By using the technique described in Non-Patent Document 1, data stored locally on each server is processed as much as possible, so the occurrence of data transfers can be suppressed. Further, when processing of data stored in another server is assigned, the cost of that data processing can be suppressed because processing of data stored in the server closest on the network is assigned.
- With Non-Patent Document 1, however, the data transfer cost cannot always be suppressed in a situation where an existing system with a higher priority and the parallel distributed processing system operate together.
- Patent Document 1 is known as one of the methods for suppressing frequent data transfer in a parallel distributed processing system.
- In Patent Document 1, the ratio of allocated data on each server is calculated, and based on that ratio each server decides whether to allocate processing of the data it stores. This determination reduces the possibility that data must be transferred from a server with little unallocated data to the local server. Each server can therefore process data stored on its own server as much as possible, reducing the number of data transfers.
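- As an illustration only (not the claimed method), the following is a minimal Python sketch of this ratio-based scheduling idea from the prior art; the server names, counters, and the pick_server helper are hypothetical, not defined in the patent.

```python
# Minimal sketch of ratio-based scheduling (prior-art idea): assign work for
# data held by the server whose allocated/stored ratio is currently smallest.
# All names (ServerState, pick_server) are illustrative, not from the patent.
from dataclasses import dataclass

@dataclass
class ServerState:
    name: str
    stored: int      # number of data blocks stored on this server
    allocated: int   # number of those blocks already allocated to tasks

def pick_server(servers):
    # Choose the server with the smallest allocation ratio; its locally
    # stored, still-unallocated data is processed next, avoiding transfers.
    return min(servers, key=lambda s: s.allocated / s.stored if s.stored else 1.0)

servers = [ServerState("A", stored=10, allocated=7),
           ServerState("B", stored=8, allocated=2),
           ServerState("C", stored=12, allocated=9)]
print(pick_server(servers).name)  # -> "B"
```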
- However, in a situation where a parallel distributed processing system coexists with an existing system that has a higher priority, conventional techniques such as Patent Document 1 are not necessarily sufficient.
- In such a case, the parallel distributed processing system needs to change its execution multiplicity in accordance with the load fluctuation and resource fluctuation of the existing system. When the load on the existing system increases, the execution multiplicity of the lower-priority parallel distributed processing system must be reduced, so a task to which processing has already been assigned may become unable to execute. Furthermore, since the execution multiplicity differs from server to server, the amount of data each server can process per unit time also differs; even a server with a small ratio of allocated data may be able to process a large amount of data per unit time and finish its unprocessed data sooner than other servers.
- The amount of I/O resources that the parallel distributed processing system can use for data transfer also varies from server to server. If the transfer destination server has a large amount of free I/O resources and could receive a large amount of data at once, but a transfer source server with only a small amount of free I/O resources is selected, data must be transferred at the rate of the smaller I/O resource amount, and the free I/O resources of the destination server cannot be fully utilized. This lengthens the data transfer time and may affect the processing time of the entire parallel distributed processing system.
- Thus, in an environment where the parallel distributed processing system runs alongside a higher-priority existing system, the computer resources that can be allocated to the parallel distributed processing system fluctuate and its effective execution multiplicity changes. The problem is how to allocate data to each server so that the subsequent parallel processing can still be executed efficiently in terms of the number of data transfers and the amount of resources used.
- The present invention has been made in view of the above problems, and its object is to reduce the number of data transfers in a situation where the computer resources that can be allocated to the parallel distributed processing system change and the execution multiplicity of the parallel distributed processing changes.
- The present invention is a data transfer control method for a parallel distributed processing system, the system comprising a plurality of parallel distributed processing execution servers, each of which includes a processor and a storage device, stores data blocks obtained by dividing the data to be processed in advance in the storage device, and executes on the processor tasks that process the data blocks in parallel, and a management computer that controls the plurality of parallel distributed processing execution servers, wherein the management computer selects a second parallel distributed processing execution server as the transmission source of a data block to be assigned to a task of a first parallel distributed processing execution server.
- The method comprises: a first step in which the management computer receives, from the first parallel distributed processing execution server, a completion notification of the task; a second step in which the management computer collects the resource usage of each of the plurality of parallel distributed processing execution servers; a third step in which the management computer acquires the data blocks and task states held by the plurality of parallel distributed processing execution servers; a fourth step in which the management computer selects the second parallel distributed processing execution server that will transfer a data block to the first parallel distributed processing execution server, based on the progress of processing of the data blocks held by each of the parallel distributed processing execution servers and on the resource usage of the plurality of parallel distributed processing execution servers; a fifth step in which the management computer transmits, to the selected second parallel distributed processing execution server, a command to transfer the data block to the first parallel distributed processing execution server; and a step in which the management computer allocates the transferred data block to the first parallel distributed processing execution server.
- According to the present invention, in a parallel distributed processing system in which the available computer resources change during execution and the execution multiplicity of the parallel distributed processing therefore changes, the number of data transfers is reduced and the data transfer time is shortened, so that a data transfer control method and apparatus capable of efficiently executing the parallel distributed processing as a whole can be provided.
- FIGS. 1A and 1B are block diagrams illustrating an example of a computer system according to a first embodiment of this invention, showing each server of the computer system in detail.
- FIGS. 2A and 2B are flowcharts according to the first embodiment of this invention, showing the overall flow of the parallel distributed processing and the detailed processing performed in the allocation process S116 of FIG. 2A.
- FIG. 3 is a diagram showing an example of the data information management table according to the first embodiment of this invention.
- FIG. 12 is a diagram showing an example of the resource usage management table according to the first embodiment of this invention.
- The drawings for a second embodiment of this invention show an example of the configuration of the computer system, a block diagram of each server in the computer system, and a flowchart of the overall parallel distributed processing.
- A first embodiment of the present invention will now be described with reference to FIGS. 1 to 13.
- FIGS. 1A and 1B are block diagrams illustrating an example of a configuration of a computer system according to the first embodiment of this invention.
- In this computer system, a client device 110, a plurality of parallel distributed processing execution servers 120-1 to 120-n, a parallel distributed processing control server 130, a data transfer control server 140, and a resource usage management server 150 are connected to each other via a network 100.
- The network 100 is a LAN (Local Area Network), or a global network such as a WAN (Wide Area Network) or the Internet.
- the network 100 may be divided into a plurality of networks 100.
- the parallel distributed processing execution servers 120-1 to 120-n are collectively referred to as the parallel distributed processing execution server 120.
- the client device 110 is a computer that includes a network interface 111, a CPU 112, a main storage device 113, a secondary storage device 114, and a bus (or interconnect) 115 that interconnects them.
- the network interface 111 is an interface for the client device 110 to connect to the network 100.
- the CPU 112 is an arithmetic processing unit that implements a predetermined function of the client device 110 by executing a program stored in the main storage device 113.
- the main storage device 113 is a storage device such as a RAM that stores a program executed by the CPU 112 and data necessary for executing the program.
- the program is, for example, a program for realizing the functions of the OS and the client processing unit 1131 (not shown).
- the secondary storage device 114 is a non-volatile storage medium such as a hard disk device that stores programs, data, and the like necessary for realizing predetermined functions of the client device 110.
- the secondary storage device 114 is not limited to a magnetic storage medium such as a hard disk device, and may be a non-volatile semiconductor storage medium such as a flash memory.
- the parallel distributed processing execution server 120 is a computer that includes a network interface 121, a CPU 122, a main storage device 123, a secondary storage device 124, and a bus (or interconnect) 125 that interconnects these.
- the parallel distributed processing execution servers 120-1 to 120-n have the same configuration.
- the network interface 121 is an interface for the parallel distributed processing execution server 120 to connect to the network 100.
- The CPU 122 is an arithmetic processing unit that implements a predetermined function of the parallel distributed processing execution server 120 by executing a program stored in the main storage device 123.
- the main storage device 123 is a storage device such as a RAM that stores a program executed by the CPU 122 and data necessary for executing the program.
- the program is, for example, a program for realizing the functions of the OS (not shown) and the user definition processing execution unit 1231 and the data management unit 1232.
- The secondary storage device 124 is a non-volatile storage medium such as a hard disk device that stores programs necessary for the parallel distributed processing execution server 120 to realize its predetermined functions, and data such as the input data 1241, the output data 1242, and a data management table 1243.
- the secondary storage device 124 is not limited to a magnetic storage medium such as a hard disk device, and may be a non-volatile semiconductor storage medium such as a flash memory.
- the input data 1241 is logical data composed of a plurality of data blocks divided into a predetermined size, and includes a name and information for identifying the data block constituting the data.
- the information for identifying the data block of the input data 1241 is, for example, the address information of the parallel distributed processing execution server 120 that stores the data block and the name of the data block.
- the substance of data is stored in the parallel distributed processing execution server 120 as a data block.
- the output data 1242 is data output by the parallel distributed processing described above.
- the user definition processing execution unit 1231 executes the assigned task.
- the data management unit 1232 manages allocation of data blocks (input data 1241) to tasks.
- the parallel distributed processing control server 130 is a server for assigning a process to each parallel distributed processing execution server 120 and controlling the execution of the entire parallel distributed processing.
- The parallel distributed processing control server 130 includes a network interface 131, a CPU 132, a main storage device 133, a secondary storage device 134, and a bus 135 that interconnects them.
- the network interface 131 is an interface for the parallel distributed processing control server 130 to connect to the network 100.
- the CPU 132 is an arithmetic processing unit that implements a predetermined function of the parallel distributed processing control server by executing a program stored in the main storage device 133.
- the main storage device 133 is a storage device such as a RAM that stores a program executed by the CPU 132 and data necessary for executing the program.
- the programs are, for example, programs for realizing the functions of an OS (not shown) and a process allocation control unit 1331, a data information management unit 1332, a process execution server management unit 1333, and a task management unit 1334.
- The secondary storage device 134 is a non-volatile storage medium such as a hard disk device that stores programs necessary for the parallel distributed processing control server 130 to realize its predetermined functions, and data such as a data information management table 300, a processing execution server management table 500, and a task management table 600.
- the secondary storage device 134 is not limited to a magnetic storage medium such as a hard disk device, and may be a non-volatile semiconductor storage medium such as a flash memory.
- The data transfer control server 140 is a server that, when processing of a data block stored in a parallel distributed processing execution server 120 different from the parallel distributed processing execution server 120 to which the processing is assigned, selects the parallel distributed processing execution server storing the data block to be processed; it includes a network interface 141, a CPU 142, a main storage device 143, a secondary storage device 144, and a bus 145 that interconnects them.
- the network interface 141 is an interface for connecting the data transfer control server 140 to the network 100.
- the CPU 142 is an arithmetic processing unit that implements a predetermined function of the data transfer control server 140 by executing a program stored in the main storage device 143.
- the main storage device 143 is a storage device such as a RAM that stores a program executed by the CPU 142 and data necessary for executing the program.
- the program is a program for realizing the functions of an OS (not shown), a processing delay server extraction processing unit 1431, a free I / O resource comparison processing unit 1432, and a processing status management unit 1433.
- The secondary storage device 144 is a non-volatile storage medium such as a hard disk device that stores programs necessary for the data transfer control server 140 to realize its predetermined functions, and data such as the processing status management table 800 and the processing delay threshold management table 1000.
- the secondary storage device 144 is not limited to a magnetic storage medium such as a hard disk device, and may be a non-volatile semiconductor storage medium such as a flash memory.
- The functions of the processing delay server extraction processing unit 1431, the free I/O resource amount comparison processing unit 1432, and the processing status management unit 1433 will be described later in the description of the processing.
- The resource usage management server 150 is a server for managing the I/O resource usage of each server, and includes a network interface 151, a CPU 152, a main storage device 153, a secondary storage device 154, and a bus 155 that interconnects them.
- the network interface 151 is an interface for the resource usage management server 150 to connect to the network 100.
- the CPU 152 is an arithmetic processing device that implements a predetermined function of the resource usage management server 150 by executing a program stored in the main storage device 153.
- the main storage device 153 is a storage device such as a RAM that stores a program executed by the CPU 152 and data necessary for executing the program.
- the program is, for example, a program for realizing the functions of the OS (not shown) and the resource usage management unit 1531 and the resource usage monitoring unit 1532.
- the secondary storage device 154 is a non-volatile storage medium such as a hard disk device that stores a program necessary for the resource usage management server 150 to realize a predetermined function and data such as the resource usage management table 1200.
- the secondary storage device 154 is not limited to a magnetic storage medium such as a hard disk device, and may be a non-volatile semiconductor storage medium such as a flash memory.
- the hardware configuration and software configuration of each device have been described above.
- The configurations of the client device 110, the parallel distributed processing execution servers 120, the parallel distributed processing control server 130, the data transfer control server 140, and the resource usage management server 150 are not limited to the configurations shown in FIGS. 1A and 1B.
- The parallel distributed processing control server 130, the data transfer control server 140, and the resource usage management server 150 may be configured to operate on either the client device 110 or a parallel distributed processing execution server 120.
- In the above description, the parallel distributed processing control server 130, the data transfer control server 140, and the resource usage management server 150 are executed on different servers, but some or all of them may be executed on the same server.
- In that case, the program of each server only needs to function as a parallel distributed processing control unit, a data transfer control unit, and a resource usage management unit.
- The programs of the parallel distributed processing control unit, the data transfer control unit, and the resource usage management unit can be installed on the same server from a program distribution server or a non-transitory computer-readable storage medium.
- Based on the configuration of FIGS. 1A and 1B, the processing of the first embodiment of the present invention will now be described with reference to FIGS. 2A to 13.
- FIG. 2A and FIG. 2B are flowcharts showing the first embodiment of the present invention and showing the overall processing flow related to parallel distributed processing.
- a data load request for processing target data (input data 1241) is transmitted from the client apparatus 110 to the parallel distributed processing control server 130.
- The parallel distributed processing control server 130 divides the load target data into data blocks (input data 1241) of a prescribed size, and distributes and loads them onto a plurality of parallel distributed processing execution servers 120 (S101).
- a parallel distributed processing execution request is transmitted from the client device 110 to the parallel distributed processing control server 130.
- The process allocation control unit 1331 of the parallel distributed processing control server 130 transmits a processing execution request for the loaded data blocks to each parallel distributed processing execution server 120, and the data information management unit 1332 updates the allocation state 303 of the data blocks to be processed in the data information management table 300 to "allocated" (S102).
- the process execution request transmitted by the process allocation control unit 1331 includes the data block ID of the data block to be processed and the task ID of the task that executes the process of each data block.
- the task is a program that executes a predetermined process using the data block to be processed as input data 1241.
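- A minimal sketch of the contents of such a process execution request follows; the field names and values are illustrative assumptions, not identifiers defined by the patent.

```python
# Illustrative process execution request: it carries the ID of the data block
# to be processed and the ID of the task that should process it.
process_execution_request = {
    "data_block_id": "blk-017",   # hypothetical data block identifier
    "task_id": "task-03",         # hypothetical task identifier
}
print(process_execution_request)
```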
- The user-defined process execution unit 1231 of the parallel distributed processing execution server 120 that has received the process execution request executes a predetermined process in each task using the data block specified by the process execution request as the input data 1241 (S103). That is, the user-defined process execution unit 1231 activates the task specified by the process execution request, assigns the specified data block to that task as the input data 1241, and executes the process.
- When the parallel distributed processing execution server 120 completes the processing of the data block designated as the input data 1241 for an assigned task, it transmits a processing completion notification containing the task ID of the completed task to the parallel distributed processing control server 130 (S104).
- the data information management unit 1332 of the parallel distributed processing control server 130 that has received the processing completion notification updates the processing state 304 of the corresponding data block ID 302 in the data information management table 300 to “processed” (S105).
- the processing allocation control unit 1331 of the parallel distributed processing control server 130 refers to the data management information table 300 and determines whether or not the processing state 304 of all the data blocks is “processed” (S106).
- If all the data blocks are "processed" as a result of the determination in step S106 (S106: Yes), the process allocation control unit 1331 of the parallel distributed processing control server 130 notifies the client apparatus 110 of the completion of the parallel distributed processing, and the computer system 10 ends the parallel distributed processing.
- If any data block is not "processed" as a result of the determination in step S106 (S106: No), the parallel distributed processing control server 130 performs the allocation process of FIG. 2B in step S116, returns to step S103, and repeats the above processing.
- In the allocation process, the parallel distributed processing control server 130 refers to the data information management table 300 and determines whether the parallel distributed processing execution server 120 that transmitted the processing completion notification has a data block whose allocation state 303 is "unallocated" (S107).
- As a result of the determination in step S107, when there is an unallocated data block in the parallel distributed processing execution server 120 that transmitted the processing completion notification (S107: Yes), the process allocation control unit 1331 of the parallel distributed processing control server 130 arbitrarily selects one data block from the unallocated data blocks existing in that server (S108).
- If the result of the determination in step S107 is that there is no unallocated data block in the parallel distributed processing execution server 120 that sent the processing completion notification (S107: No), the process proceeds to step S109.
- The processing allocation control unit 1331 of the parallel distributed processing control server 130 generates, from the data information management table 300, the server name of the parallel distributed processing execution server 120 that transmitted the processing completion notification and a server name list of the parallel distributed processing execution servers 120 having unallocated data blocks, and transmits them to the data transfer control server 140 (S109).
- the processing delay server extraction processing unit 1431 of the data transfer control server 140 extracts a processing delay server from the parallel distributed processing execution server 120 included in the received server name list having unallocated data blocks.
- the processing delay server refers to the parallel distributed processing execution server 120 in which the progress of the processing is delayed.
- The ratio of the number of processed data blocks, that is, the blocks already processed by executing tasks, to the total number of data blocks held by the parallel distributed processing execution server 120 is obtained as the processed data rate.
- the parallel distributed processing execution server 120 whose processed data rate is less than the threshold value 1001 is extracted as a processing delay server whose processing progress is delayed.
- The processing delay server extraction processing unit 1431 of the data transfer control server 140 transmits to the resource usage management server 150 the server name of the parallel distributed processing execution server 120 that transmitted the processing completion notification and the server name list of the extracted servers, and requests the free I/O resource amount of each parallel distributed processing execution server 120 (S110).
- The resource usage management unit 1531 of the resource usage management server 150 refers to the resource usage management table 1200, acquires the free I/O resource amounts of the parallel distributed processing execution server 120 that transmitted the processing completion notification and of the parallel distributed processing execution servers 120 included in the server name list, and transmits them to the data transfer control server 140 (S111).
- the free I / O resource amount includes the free I / O resource amount of the network I / O and the free I / O resource amount of the disk I / O.
- the free I / O resource amount of the network I / O indicates the data transfer rate (Gbit / sec) that can be used by the network interface 121 of the parallel distributed processing execution server 120 as a ratio.
- For the free I/O resource amount, an average value of the data transfer rate over a predetermined time (for example, one minute) may be used, or a value obtained by multiplying the effective or theoretical link speed of the network interface 121 by a predetermined ratio may be used.
- the free I / O resource amount of the disk I / O indicates a data transfer rate (MByte / sec) that can be used by the secondary storage device 124 of the parallel distributed processing execution server 120 as a ratio.
- Likewise, an average value of the data transfer rate over a predetermined time (for example, one minute) may be used, or a value obtained by multiplying the effective or theoretical data transfer rate of the secondary storage device 124 by a predetermined ratio may be used.
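- The following is a hedged Python sketch of how such a free I/O resource amount could be derived; the measured averages, capacity values, and function names are assumptions for illustration, not values defined by the patent.

```python
# Illustrative only: derive a free I/O resource amount as (capacity - used),
# where "used" is an average transfer rate over a recent window (e.g. one
# minute) and "capacity" is an effective/theoretical maximum scaled by a ratio.
def free_io_resource(avg_used_rate, max_rate, usable_ratio=1.0):
    capacity = max_rate * usable_ratio          # portion of capacity assumed usable
    return max(capacity - avg_used_rate, 0.0)   # free amount, never negative

# Hypothetical numbers: a 10 Gbit/s NIC averaging 4 Gbit/s of traffic,
# and a disk rated 200 MByte/s averaging 150 MByte/s.
free_net = free_io_resource(avg_used_rate=4.0, max_rate=10.0)     # Gbit/s
free_disk = free_io_resource(avg_used_rate=150.0, max_rate=200.0)  # MByte/s
print(free_net, free_disk)  # -> 6.0 50.0
```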
- The free I/O resource amount comparison processing unit 1432 of the data transfer control server 140 compares the free I/O resource amount of the parallel distributed processing execution server 120 that transmitted the processing completion notification with the free I/O resource amount of each processing delay server, selects the processing delay server having the smallest difference in free I/O resource amount, and transmits the name of the selected processing delay server to the parallel distributed processing control server 130 (S112).
- The processing allocation control unit 1331 of the parallel distributed processing control server 130 arbitrarily selects one data block from among the unallocated data blocks existing in the parallel distributed processing execution server 120 corresponding to the processing delay server name received from the data transfer control server 140 (S113). That is, the process allocation control unit 1331 refers to the data information management table 300 and arbitrarily selects one data block ID from the data block IDs whose data placement server name 302 matches the received processing delay server name and whose allocation state 303 is "unallocated". This selection may be made by a well-known method such as ascending order of data block IDs or round robin.
- The data information management unit 1332 of the parallel distributed processing control server 130 updates the allocation state 303 of the selected data block in the data information management table 300 to "allocated". Further, the process allocation control unit 1331 transmits a processing request for the data block updated to "allocated" to the parallel distributed processing execution server 120 that transmitted the processing completion notification, and the computer system 10 returns to step S103 in FIG. 2A (S114).
- the data block processing request transmitted to the parallel distributed processing execution server 120 includes a data block ID of the selected data block and a task ID for executing the processing of the data block.
- FIG. 3 is a diagram illustrating an example of the data information management table 300 according to the first embodiment of this invention.
- The data information management table 300 manages the plurality of data blocks obtained by dividing the input data 1241 into a specified size. As attribute information for managing each data block, one entry is composed of a data block ID 301 for identifying the data block, a data placement server name 302, an allocation state 303 indicating whether processing of the data block has already been allocated, and a processing state 304 indicating whether processing of the data block has been completed.
- the data arrangement server name 302 stores the name or identifier of the parallel distributed processing execution server 120.
- "Server A" corresponds to the parallel distributed processing execution server 120-1 in FIG. 1A, and "Server B" corresponds to the parallel distributed processing execution server 120-2.
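- For readability, here is a minimal sketch of how entries of the data information management table 300 might be represented in code; the field values are hypothetical examples, not figures from the patent.

```python
# Illustrative representation of data information management table 300 entries:
# data block ID (301), data placement server name (302), allocation state (303),
# and processing state (304). All values are made up for the example.
data_info_table = [
    {"data_block_id": "blk-001", "placement_server": "Server A",
     "allocation_state": "allocated", "processing_state": "processed"},
    {"data_block_id": "blk-002", "placement_server": "Server B",
     "allocation_state": "unallocated", "processing_state": "unprocessed"},
]

# Example query: unallocated blocks stored on a given server.
unallocated_on_b = [e["data_block_id"] for e in data_info_table
                    if e["placement_server"] == "Server B"
                    and e["allocation_state"] == "unallocated"]
print(unallocated_on_b)  # -> ['blk-002']
```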
- FIGS. 4A and 4B are flowcharts showing the allocation process of S116 of FIG. 2A and showing the detailed procedure regarding the process allocation to the parallel distributed processing execution server 120 of steps S107 to S114 of FIG. 2B.
- The processes in FIGS. 4A and 4B are performed by the processing allocation control unit 1331, the data information management unit 1332, and the task management unit 1334 on the parallel distributed processing control server 130.
- the process allocation control unit 1331 that has received the task ID as a process completion notification from the parallel distributed processing execution server 120 requests the task management unit 1334 to update the task management table 600. Further, the process allocation control unit 1331 requests the server name list of the parallel distributed processing execution server 120 having the unallocated data block from the data information management unit 1332 (S201).
- the task management unit 1334 that has received the update request for the task management table 600 updates the execution state 603 of the corresponding task in the task management table 600 to “waiting” and updates the processing data block ID 604 to a NULL value ( S202).
- The data information management unit 1332 refers to the data information management table 300, calculates the processed data rate of the parallel distributed processing execution server 120 that transmitted the processing completion notification, and transmits the processed data rate and the name of that parallel distributed processing execution server 120 to the data transfer control server 140.
- the data transfer control server 140 that has received the processed data rate updates the processed data rate 802 in the processing status management table 800 for the received server name (S203).
- The processed data rate is calculated by the data information management unit 1332 as the ratio of the number of data blocks whose processing state 304 in the data information management table 300 is "processed" to the number of data blocks stored in the parallel distributed processing execution server 120. That is, the processed data rate is the ratio of data blocks that have been processed so far to all data blocks stored in the parallel distributed processing execution server 120, and is calculated by the following equation (1).
- Processed data rate = (number of stored "processed" data blocks) / (total number of data blocks stored in the parallel distributed processing execution server) ... (1)
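- As a worked illustration of equation (1), here is a short Python sketch; the server's block counts are hypothetical.

```python
# Equation (1): processed data rate = processed blocks / total stored blocks.
def processed_data_rate(num_processed, num_stored):
    return num_processed / num_stored if num_stored else 0.0

# Hypothetical server holding 10 blocks, 4 of which are already processed.
print(processed_data_rate(4, 10))  # -> 0.4, i.e. 40%
```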
- The data information management unit 1332 refers to the data information management table 300, creates a server name list of the parallel distributed processing execution servers 120 having a data block whose allocation state 303 is "unallocated", and transmits it to the process allocation control unit 1331 (S204).
- the process allocation control unit 1331 that has received the server name list of the parallel distributed processing execution server 120 having an unallocated data block determines whether or not the parallel distributed processing execution server 120 that has transmitted the process completion notification exists in the list. (S205).
- As a result of the determination in step S205, when the parallel distributed processing execution server 120 that transmitted the processing completion notification exists in the list (S205: Yes), the processing allocation control unit 1331 refers to the data information management table 300 and arbitrarily selects one data block from the unallocated data blocks stored in that server (S206).
- If the result of the determination in step S205 is that the parallel distributed processing execution server 120 that transmitted the processing completion notification does not exist in the list (S205: No), the processing proceeds to FIG. 4B.
- The process allocation control unit 1331 requests the data transfer control server 140 for the server name of the parallel distributed processing execution server 120 having the data block to be processed next by the task of the parallel distributed processing execution server 120 that transmitted the processing completion notification (S207).
- the server name request includes the server name of the parallel distributed processing execution server 120 that has transmitted the processing completion notification, and the server name list of the parallel distributed processing execution server 120 having an unallocated data block.
- The data transfer control server 140 extracts the parallel distributed processing execution server 120 storing the data block to be processed next by the parallel distributed processing execution server that transmitted the processing completion notification, and transmits its server name to the process allocation control unit 1331 (S208). Details of the processing in step S208 will be described later.
- the process allocation control unit 1331 arbitrarily selects one data block from the unallocated data blocks stored in the parallel distributed processing execution server 120 corresponding to the server name received from the data transfer control server 140. Select (S209).
- the process allocation control unit 1331 requests the parallel distributed processing execution server 120 corresponding to the received server name to execute the transfer of the selected data block to the parallel distributed processing execution server 120 that has transmitted the process completion notification.
- the data block transfer request includes the data block ID of the selected data block and the server name of the parallel distributed processing execution server 120 that has transmitted the processing completion notification.
- the process allocation control unit 1331 requests the data information management unit 1332 to update the data information management table 300, and requests the task management unit 1334 to update the task management table 600 (S211).
- The update request for the data information management table 300 includes the data block ID 301 and the data placement server name 302 of the selected data block, and the update request for the task management table 600 includes the task ID 601 received from the parallel distributed processing execution server 120 as the processing completion notification and the processing data block ID 604 of the selected data block.
- The data information management unit 1332 updates the allocation state 303 of the data block corresponding to the data block ID 301 included in the received update request to "allocated". The task management unit 1334 updates the execution state 603 of the task corresponding to the received task ID in the task management table 600 to "being executed", and updates the processing data block ID 604 with the received data block ID (S212).
- the process assignment control unit 1331 transmits a process execution request for the selected data block to the parallel distributed process execution server 120 that has transmitted the process completion notification (S213).
- the process execution request includes the data block ID of the selected data block and the task ID to which the process is assigned.
- FIG. 5 is a diagram illustrating an example of the process execution server management table 500 according to the first embodiment of this invention.
- the process execution server management table 500 includes information of a server name 501 for identifying each parallel distributed process execution server 120 as attribute information for managing the parallel distributed process execution server 120.
- FIG. 6 is a diagram illustrating an example of a task management table 600 managed by the task management unit 1334 according to the first embodiment of this invention.
- The task management table 600 includes, as attribute information for managing each task of the parallel distributed processing execution servers 120, a task ID 601 storing an identifier for identifying each task, a server name 602 storing the name or identifier of the parallel distributed processing execution server 120 on which the task exists, an execution state 603 indicating whether the task is executing processing, and a processing data block ID 604 indicating the identifier of the data block being processed by the task; these constitute one entry.
- the execution status 603 column stores information such as “being executed” and “waiting”.
- FIGS. 7A and 7B are flowcharts showing the detailed processing procedure of step S208 in FIG. 4B.
- The processes in FIGS. 7A and 7B are performed by the processing delay server extraction processing unit 1431, the free I/O resource amount comparison processing unit 1432, and the processing status management unit 1433 on the data transfer control server 140.
- In step S208, the data transfer control server 140 extracts the parallel distributed processing execution server 120 storing the data block to be processed next by the parallel distributed processing execution server that transmitted the processing completion notification, and transmits the server name of the server having that data block to the process allocation control unit 1331 of the parallel distributed processing control server 130.
- The processing delay server extraction processing unit 1431 of the data transfer control server 140 receives, from the parallel distributed processing control server 130, the server name of the parallel distributed processing execution server 120 that transmitted the processing completion notification and a server name list of the parallel distributed processing execution servers 120 having unallocated data blocks (S301).
- This list of server names is a server name list of the parallel distributed processing execution server 120 having unallocated data blocks generated by the processing allocation control unit 1331 in step S207 of FIG. 4B.
- the processing delay server extraction processing unit 1431 requests the processing status management unit 1433 for the processed data rate of the parallel distributed processing execution server 120 having unallocated data blocks (S302).
- The request for the processed data rate includes the server name list of the parallel distributed processing execution servers 120 having unallocated data blocks that was received by the processing delay server extraction processing unit 1431 of the data transfer control server 140 (see step S203 of FIG. 4A).
- The processing status management unit 1433 that received the server name list refers to the processing status management table 800, extracts the processed data rates of the parallel distributed processing execution servers 120 included in the server name list, and transmits the processed data rates to the processing delay server extraction processing unit 1431 (S303).
- the processing delay server extraction processing unit 1431 acquires the processing delay threshold value 1001 with reference to the processing delay threshold value table 1000.
- The processing delay server extraction processing unit 1431 compares the processed data rate of each parallel distributed processing execution server 120 included in the server name list received from the parallel distributed processing control server 130 with the processing delay threshold 1001, and extracts the parallel distributed processing execution servers 120 whose processed data rate is smaller than the processing delay threshold 1001, that is, whose processing progress is slow (S304).
- the processing delay threshold 1001 is set in advance by a user or an administrator before executing the parallel distributed processing.
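- A minimal sketch of this threshold comparison (S304) follows; the threshold value and the per-server rates are hypothetical.

```python
# Illustrative extraction of processing delay servers (S304): keep the servers
# whose processed data rate falls below the processing delay threshold 1001.
def extract_delayed(processed_rates, threshold):
    # processed_rates: {server_name: processed data rate in [0, 1]}
    return [name for name, rate in processed_rates.items() if rate < threshold]

rates = {"Server B": 0.6, "Server C": 0.3, "Server D": 0.2}  # hypothetical
print(extract_delayed(rates, threshold=0.5))  # -> ['Server C', 'Server D']
```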
- the processing delay server extraction processing unit 1431 determines whether there is a parallel distributed processing execution server 120 whose processing progress extracted in step S304 is slow (S305).
- As a result of the determination in step S305, if there is a parallel distributed processing execution server 120 with slow processing progress (S305: Yes), the processing delay server extraction processing unit 1431 determines whether there is only one such server (S308 in FIG. 7B).
- As a result of the determination in step S308, when there is only one parallel distributed processing execution server 120 whose processing progress is slow (S308: Yes), the processing delay server extraction processing unit 1431 transmits to the parallel distributed processing control server 130 the server name of that parallel distributed processing execution server as the transfer source server of the data block to be processed next by the parallel distributed processing execution server 120 that transmitted the processing completion notification (S315).
- As a result of the determination in step S308, when there are a plurality of parallel distributed processing execution servers 120 with slow processing progress (S308: No), the processing delay server extraction processing unit 1431 requests the processing status management unit 1433 for the execution multiplicity of the parallel distributed processing execution servers 120 whose processing progress is slow (S309).
- the execution multiplicity of the parallel distributed processing execution server 120 indicates the number of tasks executed simultaneously.
- the multiplicity of execution is handled as the processing performance per unit time of each parallel distributed processing execution server 120.
- the request for execution multiplicity includes a server name list of the parallel distributed processing execution server 120 whose processing progress is slow.
- As the value indicating the processing performance per unit time, a value indicating a hardware specification such as CPU processing capacity or memory capacity may be used.
- The processing status management unit 1433 that received the request for the execution multiplicity of the servers with slow processing progress refers to the processing status management table 800, and transmits the execution multiplicity 803 of each parallel distributed processing execution server 120 included in the server name list to the processing delay server extraction processing unit 1431 (S311).
- the processing delay server extraction processing unit 1431 extracts the parallel distributed processing execution server 120 having the smallest execution multiplicity as the processing delay server among the parallel distributed processing execution servers 120 whose processing progress is slow (S311).
- Alternatively, the processing delay server extraction processing unit 1431 may extract, from among the parallel distributed processing execution servers 120 whose processing progress is slow, the parallel distributed processing execution server 120 having the lowest hardware specification as the processing delay server.
- FIG. 9 shows a modified example of the processing status management table, in which hardware specifications 903 to 905 are used in place of the execution multiplicity 803 of the processing status management table 800 shown in FIG. 8.
- In place of the execution multiplicity 803 of the processing status management table 800 shown in FIG. 8, the processing status management table 800' of FIG. 9 composes one entry from a CPU type 903 storing the type of the CPU 122 of the parallel distributed processing execution server 120, a number of cores 904, and a memory type 905 storing the type of the main storage device 123.
- In the CPU type 903, "CPU1" indicates a high-spec (high performance) CPU 122 and "CPU2" indicates a low-spec CPU; in the memory type, "Memory 1" indicates a high-spec (high performance) memory and "Memory 2" indicates a low-spec memory. The parallel distributed processing execution server 120 with the lowest hardware spec is therefore the one with the single-core "CPU2" and the low-spec "Memory 2".
- the CPU processing capacity may be represented by the number of operating clocks of the CPU 122, the number of cores, and the cache capacity.
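- The following is a hedged Python sketch of narrowing multiple delayed servers by execution multiplicity (S309 to S311), together with a variant corresponding to FIG. 9 that ranks by a simple hardware score; the scoring of CPU cores and memory class is an assumption for illustration, not a rule from the patent.

```python
# Illustrative narrowing of processing delay server candidates: prefer the
# server with the smallest execution multiplicity (fewest concurrent tasks),
# i.e. the lowest processing performance per unit time.
def least_capable_by_multiplicity(multiplicities):
    # multiplicities: {server_name: number of tasks executed simultaneously}
    return min(multiplicities, key=multiplicities.get)

# Variant corresponding to FIG. 9: rank by a hardware score instead.
# The weighting of core count and memory class is made up for the example.
def least_capable_by_spec(specs):
    # specs: {server_name: (cpu_cores, memory_class)}; higher score = faster.
    return min(specs, key=lambda s: specs[s][0] * 10 + specs[s][1])

print(least_capable_by_multiplicity({"Server C": 1, "Server D": 2}))    # -> Server C
print(least_capable_by_spec({"Server C": (1, 1), "Server D": (2, 2)}))  # -> Server C
```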
- the processing delay server extraction processing unit 1431 determines whether or not the number of processing delay servers extracted in step S311 is one (S312).
- If it is determined in step S312 that the number of processing delay servers is one (S312: Yes), the process proceeds to step S315, and the server name of the processing delay server is transmitted to the parallel distributed processing control server 130 (S315).
- If the result of the determination in step S312 is that there are a plurality of processing delay servers (S312: No), the processing delay server extraction processing unit 1431 transmits the server name of the parallel distributed processing execution server 120 that sent the processing completion notification and the server name list of the processing delay servers to the free I/O resource amount comparison processing unit 1432 (S313).
- the free I / O resource amount comparison processing unit 1432 requests the resource usage management server 150 for the amount of free I / O resources of the parallel distributed processing execution server 120 that transmitted the processing completion notification and the processing delay server. (S314).
- the request for the free I / O resource amount includes the server name of the parallel distributed processing execution server 120 that has transmitted the processing completion notification and the server name list of the processing delay servers.
- As a result of the determination in step S305, when there is no parallel distributed processing execution server 120 with slow processing progress (S305: No), the processing delay server extraction processing unit 1431 transmits, to the free I/O resource amount comparison processing unit 1432, the server name of the parallel distributed processing execution server 120 that transmitted the processing completion notification and the server name list of the parallel distributed processing execution servers 120 having unallocated data blocks (S306).
- The free I/O resource amount comparison processing unit 1432 transmits, to the resource usage management server 150, the server name of the parallel distributed processing execution server 120 that transmitted the processing completion notification and the server name list of the parallel distributed processing execution servers 120 having unallocated data blocks, and requests the free I/O resource amount of each parallel distributed processing execution server 120 (S307).
- the request for the free I / O resource amount includes the server name of the parallel distributed processing execution server that has transmitted the processing completion notification and the server name list of the parallel distributed processing execution server 120 having an unallocated data block.
- The resource usage management server 150 transmits, to the free I/O resource amount comparison processing unit 1432, the free I/O resource amounts of the parallel distributed processing execution server 120 that transmitted the processing completion notification and of the parallel distributed processing execution servers 120 included in the server name list received from the data transfer control server 140 (S316 in FIG. 7A).
- the free I / O resource amount includes the free resource amount of the network I / O and the free resource amount of the disk I / O. Details of the processing in step S316 will be described later.
- The free I/O resource amount comparison processing unit 1432 compares the free I/O resource amount of the parallel distributed processing execution server 120 that transmitted the processing completion notification with the free I/O resource amount of each parallel distributed processing execution server 120, and extracts the parallel distributed processing execution server 120 having the smallest difference in free I/O resource amount as the transfer source server of the data block to be processed next by the parallel distributed processing execution server 120 that transmitted the processing completion notification. The free I/O resource amount comparison processing unit 1432 then transmits the server name of the extracted parallel distributed processing execution server 120 to the parallel distributed processing control server 130 (S317).
- For the comparison of free I/O resource amounts, the free I/O resource amount comparison processing unit 1432 obtains the absolute value of the difference in the free disk I/O resource amount and the absolute value of the difference in the free network I/O resource amount, and takes the larger of the two as the difference in free I/O resource amount. That is, the difference in the free I/O resource amount is calculated by the following equation (2).
- Difference in free I/O resource amount = max(|difference in free disk I/O resource amount|, |difference in free network I/O resource amount|) ... (2)
- Alternatively, in order to use the free I/O resources of the parallel distributed processing execution server that transmitted the processing completion notification as fully as possible, the parallel distributed processing execution server having the smallest difference in free I/O resource amount may be extracted from among those parallel distributed processing execution servers 120 whose free disk I/O and network I/O resource amounts are larger than those of the parallel distributed processing execution server 120 that transmitted the processing completion notification.
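- As a sketch of equation (2) and the resulting transfer source choice (S317), here is a short Python example; the resource figures are hypothetical.

```python
# Equation (2): the difference in free I/O resource amount between two servers
# is the larger of the absolute disk I/O difference and the absolute network
# I/O difference. The transfer source is the candidate with the smallest value.
def io_difference(a, b):
    # a, b: (free_disk_io, free_network_io) for two servers
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))

def pick_transfer_source(requester, candidates):
    # candidates: {server_name: (free_disk_io, free_network_io)}
    return min(candidates, key=lambda n: io_difference(requester, candidates[n]))

requester = (50.0, 6.0)                                          # server that finished its task
candidates = {"Server C": (55.0, 5.0), "Server D": (90.0, 9.0)}  # hypothetical
print(pick_transfer_source(requester, candidates))               # -> Server C
```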
- FIG. 8 is a diagram illustrating an example of the processing status management table 800 according to the first embodiment of this invention.
- FIG. 9 shows a modified example of the processing status management table, in which hardware specifications 903 to 905 are used in place of the execution multiplicity 803 of the processing status management table 800 shown in FIG. 8.
- The processing status management table 800 includes, as attribute information for managing the processing progress status of each parallel distributed processing execution server 120, a server name 801 for identifying each parallel distributed processing execution server 120, a processed data rate 802 indicating the ratio of processed data blocks to the data blocks stored in each parallel distributed processing execution server, and an execution multiplicity 803 indicating the processing performance per unit time of each parallel distributed processing execution server 120.
- In the processing status management table 800' of FIG. 9, the execution multiplicity 803 is removed from the processing status management table 800, and information on a CPU type 903 indicating the specification name of the CPU, a number of CPU cores 904, and a memory type 905 indicating the specification name of the memory is added as the processing performance per unit time of each parallel distributed processing execution server 120.
- FIG. 10 is a diagram illustrating an example of a processing delay threshold management table according to the first embodiment of this invention.
- The processing delay threshold management table 1000 holds, as attribute information for managing the criterion for judging whether the progress of processing of each parallel distributed processing execution server 120 is delayed, threshold information for determining whether the processed data rate of a parallel distributed processing execution server 120 is so low that its processing is regarded as delayed.
- FIG. 11 is a flowchart showing a detailed processing procedure of step S316 in FIG. 7A. Note that the processing in FIG. 11 is performed by the resource usage management unit 1531 on the resource usage management server 150.
- the resource usage management unit 1531 receives from the data transfer control server 140 the server name of the parallel distributed processing execution server 120 that transmitted the processing completion notification and the server name list (S401).
- The resource usage management unit 1531 refers to the resource usage management table 1200, calculates the free network I/O resource amount and the free disk I/O resource amount of each parallel distributed processing execution server 120, and transmits them to the data transfer control server 140 (S316).
- FIG. 12 is a diagram illustrating an example of the resource usage management table 1200 according to the first embodiment of this invention.
- The resource usage management table 1200 includes, as attribute information for managing the I/O resource usage of each parallel distributed processing execution server 120, a server name 1201 for identifying each parallel distributed processing execution server 120, a network I/O usage amount 1202, and a disk I/O usage amount 1203.
- The network I/O usage and the disk I/O usage are expressed as the ratio of the used bandwidth to the total I/O bandwidth of each parallel distributed processing execution server; alternatively, the used I/O bandwidth may be used as it is.
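- A sketch of how the free I/O resource amounts could be derived from such a table (server names and percentages are made up for illustration):

```python
# Usage is recorded as a percentage of each server's total I/O bandwidth,
# so the free amount is simply the unused share.
resource_usage_table = {
    # server name: (network I/O usage %, disk I/O usage %)
    "server A": (50, 40),
    "server C": (55, 70),
    "server D": (60, 45),
}

def free_io_resources(server_name):
    net_usage, disk_usage = resource_usage_table[server_name]
    return {"net_free": 100 - net_usage, "disk_free": 100 - disk_usage}

print(free_io_resources("server A"))  # {'net_free': 50, 'disk_free': 60}
```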
- FIG. 13 shows an example in which, in an environment where four parallel distributed processing execution servers 120-1 to 120-4 are operating, the parallel distributed processing control server 130 selects a data block transfer source server when it receives a process completion notification from the parallel distributed processing execution server A.
- In this example, the parallel distributed processing execution server 120-1 is denoted as the parallel distributed processing execution server A, and the parallel distributed processing execution servers 120-2 to 120-4 are denoted as the parallel distributed processing execution servers B to D.
- The processing allocation control unit 1331 acquires, from the data information management unit 1332, the server name list "server B, server C, server D" of the parallel distributed processing execution servers 120 having unallocated data blocks, and confirms that there is no unallocated data block in the parallel distributed processing execution server A that transmitted the processing completion notification. Subsequently, the processing allocation control unit 1331 transmits, to the data transfer control server 140, the server name "server A" of the parallel distributed processing execution server 120 that transmitted the processing completion notification and the server name list "server B, server C, server D" of the parallel distributed processing execution servers 120 having the unallocated data blocks 20, 30, and 40.
- The processing delay server extraction processing unit 1431 of the data transfer control server 140 obtains, from the processing status management unit 1433, the processed data rates of the parallel distributed processing execution servers B, C, and D included in the received server name list. It then compares the threshold stored in the processing delay threshold management table 1000 with the processed data rates of the parallel distributed processing execution servers B, C, and D. As a result of the comparison, the parallel distributed processing execution servers C and D, whose processed data rates are smaller than the threshold of 50%, are extracted as the parallel distributed processing execution servers 120 whose processing progress is slow, and the server name list "server C, server D" of the extracted parallel distributed processing execution servers 120 is transmitted to the processing status management unit 1433 to obtain the execution multiplicity of each of these parallel distributed processing execution servers 120.
- The execution multiplicities of these parallel distributed processing execution servers 120 are compared, and since the execution multiplicity is 1 for both the parallel distributed processing execution servers C and D, the parallel distributed processing execution servers C and D are extracted as processing delay servers. The extracted server names are then transmitted to the free I/O resource amount comparison processing unit 1432 together with the server name of the parallel distributed processing execution server A that transmitted the processing completion notification.
- The free I/O resource amount comparison processing unit 1432 acquires, from the resource usage management server 150, the free I/O resource amounts of the parallel distributed processing execution server A that transmitted the processing completion notification and of the processing delay servers C and D. It then compares the free I/O resource amount of the parallel distributed processing execution server A with that of each of the processing delay servers C and D.
- In this example, the difference in the amount of free I/O resources between the parallel distributed processing execution server A and the parallel distributed processing execution server C is 30%, while the difference in the amount of free I/O resources between the parallel distributed processing execution server A and the parallel distributed processing execution server D is smaller.
- Therefore, the free I/O resource amount comparison processing unit 1432 selects the parallel distributed processing execution server D, which has the smaller difference in the free I/O resource amount, as the data block transfer source server. Subsequently, the free I/O resource amount comparison processing unit 1432 transmits the server name "server D" of the selected parallel distributed processing execution server 120 to the parallel distributed processing control server 130.
- The process allocation control unit 1331 of the parallel distributed processing control server 130 that has received the server name "server D" selects "data 40" from the unallocated data blocks stored in the parallel distributed processing execution server D, and requests the parallel distributed processing execution server D to transfer "data 40" to the parallel distributed processing execution server A. Then, the process allocation control unit 1331 transmits, to the parallel distributed processing execution server A, a process execution request for processing the transferred data block "data 40" in the task A3.
- As described above, in the first embodiment, a server whose processing progress is slow is extracted by using the processing status of each parallel distributed processing execution server and its data processing capacity per unit time, the free I/O resource amounts of the extracted parallel distributed processing execution servers and of the parallel distributed processing execution server to which the data block processing is to be allocated are compared, and the server with the closest free I/O resource amount is selected as the data block transfer source server. By selecting the transfer source server in this way, the number of data transfers is reduced, the data transfer time is shortened, and the entire parallel distributed processing can be executed efficiently.
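- Putting the steps together, a hedged end-to-end sketch of this selection flow might look like the following (all names, fields, and the threshold are assumptions for illustration):

```python
def choose_transfer_source(notifier, candidates, threshold=0.5):
    """Sketch of the first-embodiment flow: threshold -> multiplicity -> I/O."""
    # (1) Servers whose processed data rate is below the threshold are
    #     treated as candidates whose progress is slow.
    delayed = [c for c in candidates if c["processed_rate"] < threshold]
    if not delayed:
        return None
    # (2) Among them, keep the ones with the lowest execution multiplicity
    #     (the lowest processing performance per unit time).
    lowest = min(c["multiplicity"] for c in delayed)
    delayed = [c for c in delayed if c["multiplicity"] == lowest]
    # (3) Pick the server whose free I/O resource amounts are closest to
    #     those of the server that finished its task (equation (2)).
    def diff(c):
        return max(abs(notifier["net_free"] - c["net_free"]),
                   abs(notifier["disk_free"] - c["disk_free"]))
    return min(delayed, key=diff)
```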
- In the above description, the execution multiplicity is used as an example of selecting the parallel distributed processing execution server 120 having the lowest performance in the process of step S311 in FIG. 7B. Alternatively, the hardware specification shown in FIG. 9 may be used; using this value, the parallel distributed processing execution server 120 with the lowest hardware specification can be selected as the computer with the lowest processing performance. In other words, the processing delay server extraction processing unit 1431 may extract, from among the parallel distributed processing execution servers 120 whose processing progress is slow, the parallel distributed processing execution server 120 having the lowest hardware specification as a processing delay server.
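- A small sketch of such a selection (ranking by CPU core count alone is an assumption; the patent only states that the server with the lowest hardware specification is chosen):

```python
def lowest_spec_server(servers):
    # servers: list of dicts such as {"name": ..., "cpu_cores": ...}
    return min(servers, key=lambda s: s["cpu_cores"])

print(lowest_spec_server([
    {"name": "server C", "cpu_cores": 8},
    {"name": "server D", "cpu_cores": 4},
])["name"])  # -> server D
```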
- The determination of the processing delay server is not limited to the above-described comparison of the processed data rate with the threshold value 1001. For example, a parallel distributed processing execution server 120 that has not transmitted a processing completion notification even after a predetermined time has elapsed from the start of processing (task) execution may be regarded as a processing delay server. Alternatively, the processing delay server may be determined by comparing a threshold value with the unprocessed ratio of data blocks after a predetermined time has elapsed.
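- These alternative criteria could be sketched as follows (the timeout value and field names are illustrative assumptions):

```python
import time

def is_delayed_by_timeout(task_start_time, completed, timeout_sec=600.0):
    # No completion notification within a fixed time after the task started.
    return (not completed) and (time.time() - task_start_time) > timeout_sec

def is_delayed_by_unprocessed_ratio(unprocessed_blocks, total_blocks,
                                    ratio_threshold=0.5):
    # Unprocessed ratio of data blocks still above a threshold after a
    # predetermined time has elapsed.
    return total_blocks > 0 and unprocessed_blocks / total_blocks > ratio_threshold
```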
- In the above example, the free I/O resource amount is expressed as a ratio based on the actual usage amount and the theoretical value, but a value obtained by multiplying the theoretical value by a predetermined ratio may be used instead.
- In the first embodiment described above, each parallel distributed processing execution server 120 operates as one physical server, and when a parallel distributed processing execution server 120 processes data blocks stored in other parallel distributed processing execution servers 120, the data block transfer source server is selected based on the processing progress rate of each parallel distributed processing execution server 120 and the amount of free resources.
- In contrast, when a plurality of parallel distributed processing execution servers 120 are executed as virtual servers on a physical server, data transfer between the parallel distributed processing execution servers 120 operating on the same physical server does not have to pass through the network 100, so the data block transfer time may be shorter than for data transfer between parallel distributed processing execution servers 120 operating on different physical servers.
- Therefore, in the second embodiment, when the data block to be processed by one parallel distributed processing execution server 120 is stored in another parallel distributed processing execution server 120, the data block transfer source server is selected based on the processing progress rate and the amount of free resources of each parallel distributed processing execution server 120, giving priority to parallel distributed processing execution servers 120 operating on the same physical server.
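- A minimal sketch of that preference (the mapping and helper names are assumptions; the actual selection still uses the progress rate and free resource comparison described above):

```python
def candidate_sources(notifier_vm, servers_with_unallocated, physical_of):
    """Prefer virtual servers hosted on the same physical server.

    physical_of maps a virtual server name to its physical server name.
    """
    same_host = [vm for vm in servers_with_unallocated
                 if physical_of[vm] == physical_of[notifier_vm]]
    return same_host if same_host else servers_with_unallocated
```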
- FIGS. 14A and 14B are block diagrams illustrating an example of the computer system 20 according to the second embodiment of this invention.
- the parallel distributed processing execution server 120 is executed by the physical server 210-1 as a virtual server that executes parallel distributed processing.
- The physical servers 210-1 to 210-n are collectively referred to as the physical server 210.
- the virtualization unit 230 executed on the physical resource 220 provides a plurality of virtual servers, and each virtual server is executed as the parallel distributed processing execution server 120.
- the virtualization unit 230 includes a hypervisor and a VMM (Virtual Machine Monitor) that allocates the physical resource 220 to the plurality of parallel distributed processing execution servers 120.
- the physical resource 220 includes a CPU 122, a main storage device 123, a network interface 121, and a secondary storage device 124.
- the virtualization unit 230 provides each parallel distributed processing execution server 120 with a virtual (or logical) CPU 122v, a virtual main storage device 123v, a virtual network interface 121v, and a virtualized secondary storage device 124v.
- The other components of each parallel distributed processing execution server 120 are the same as those in the first embodiment.
- the parallel distributed processing control server 130, the data transfer control server 140, and the resource usage management server 150 are also configured in the same manner as in the first embodiment.
- The processing execution server management table 1700 and the processing status management table 1800 differ from the first embodiment in that the correspondence relationship between the physical server 210 and the virtual servers is added to the configuration of the first embodiment.
- FIG. 15A and FIG. 15B are flowcharts showing the second embodiment of the present invention and showing the procedure of the overall processing related to the parallel distributed processing execution method. 15A and 15B are processes corresponding to FIGS. 2A and 2B of the first embodiment.
- a data load request for processing target data (input data 1241) is transmitted from the client apparatus 110 to the parallel distributed processing control server 130.
- Next, the parallel distributed processing control server 130 divides the load target data into data blocks (input data 1241) of a prescribed size, and distributes and loads the data to a plurality of parallel distributed processing execution servers 120 (S501).
- a parallel distributed processing execution request is transmitted from the client device 110 to the parallel distributed processing control server 130.
- Next, the process allocation control unit 1331 of the parallel distributed processing control server 130 transmits the processing execution request for the loaded data blocks to each parallel distributed processing execution server 120, and the data information management unit 1332 updates the allocation state 303 of the data blocks to be processed in the data information management table 300 to "allocated" (S502).
- the processing execution request for the data block transmitted by the processing allocation control unit 1331 includes the data block ID of the data block to be processed and the task ID of the task that executes the processing of each data block.
- the task is a program that executes a predetermined process using the data block to be processed as input data 1241.
- The user-defined process execution unit 1231 of the parallel distributed process execution server 120 that has received the process execution request executes a predetermined process using the data block specified by the process execution request in each task as the input data 1241 (S503). That is, the user-defined process execution unit 1231 of the parallel distributed process execution server 120 activates the task specified by the process execution request, allocates the data block specified by the process execution request to each task as the input data 1241, and executes the process.
- When the processing is completed, the parallel distributed processing execution server 120 transmits the task ID of the completed task to the parallel distributed processing control server 130 as a processing completion notification (S504).
- the data information management unit 1332 of the parallel distributed processing control server 130 that has received the processing completion notification updates the processing state 304 of the corresponding data block ID 301 in the data information management table 300 to “processed” (S505).
- the processing allocation control unit 1331 of the parallel distributed processing control server 130 refers to the data information management table 300 and determines whether or not the processing state 304 of all the data blocks is “processed” (S506).
- If all the data blocks are "processed" as a result of the determination in step S506 (S506 → Yes), the processing allocation control unit 1331 of the parallel distributed processing control server 130 transmits a parallel distributed processing completion notification to the client device 110, and the computer system 20 ends the process.
- If not all the data blocks are "processed" (S506 → No), the parallel distributed processing control server 130 refers to the data information management table 300 and determines whether or not there is a data block whose allocation state 303 is "unallocated" in the parallel distributed processing execution server 120 that transmitted the processing completion notification (S507).
- If there is an unallocated data block in the parallel distributed processing execution server 120 that transmitted the processing completion notification as a result of the determination in step S507 (S507 → Yes), the processing allocation control unit 1331 of the parallel distributed processing control server 130 arbitrarily selects one data block from the unallocated data blocks existing in the parallel distributed processing execution server 120 that transmitted the processing completion notification (S508).
- If, as a result of the determination in step S507, there is no unallocated data block in the parallel distributed processing execution server 120 that transmitted the processing completion notification (S507 → No), the parallel distributed processing control server 130 refers to the data information management table 300 and determines whether or not another parallel distributed processing execution server 120 on the physical server 210 on which the parallel distributed processing execution server 120 that transmitted the processing completion notification operates has a data block whose allocation state 303 is "unallocated" (S509). That is, a virtual server having an unallocated data block on the same physical server 210 is extracted.
- If the result of the determination in step S509 is that there is an unallocated data block (S509 → Yes), the parallel distributed processing control server 130 transmits, to the data transfer control server 140, the virtual server name of the parallel distributed processing execution server 120 that sent the processing completion notification and the virtual server name list of the parallel distributed processing execution servers 120 that operate on the same physical server as that parallel distributed processing execution server 120 and store unallocated data blocks (S510).
- If there is no unallocated data block on the same physical server 210 as a result of the determination in step S509 (S509 → No), the process allocation control unit 1331 of the parallel distributed processing control server 130 transmits, to the data transfer control server 140, the virtual server name of the parallel distributed processing execution server 120 that transmitted the processing completion notification and the virtual server name list of the parallel distributed processing execution servers 120 having unallocated data blocks (S511).
- The processing delay server extraction processing unit 1431 of the data transfer control server 140 extracts processing delay servers from the parallel distributed processing execution servers 120 included in the virtual server name list of servers having unallocated data blocks received from the parallel distributed processing control server 130. It then transmits the virtual server name of the parallel distributed processing execution server 120 that transmitted the processing completion notification and the virtual server name list of the extracted servers to the resource usage management server 150, and requests the free I/O resource amount of each parallel distributed processing execution server 120 (S512).
- The resource usage management unit 1531 of the resource usage management server 150 refers to the resource usage management table 1200 and transmits, to the data transfer control server 140, the free I/O resource amounts of the parallel distributed processing execution server 120 that transmitted the processing completion notification and of the parallel distributed processing execution servers 120 included in the virtual server name list (S513).
- the free I / O resource amount is the same as that in the first embodiment, and includes the free I / O resource amount of the network I / O and the free I / O resource amount of the disk I / O.
- The free I/O resource amount comparison processing unit 1432 of the data transfer control server 140 compares the free I/O resource amount of each processing delay server with that of the parallel distributed processing execution server 120 that transmitted the processing completion notification, extracts the processing delay server with the smallest difference in free I/O resource amount, and transmits the virtual server name of the extracted processing delay server to the parallel distributed processing control server 130 (S514).
- The processing allocation control unit 1331 of the parallel distributed processing control server 130 arbitrarily selects one data block from the unallocated data blocks existing in the parallel distributed processing execution server 120 corresponding to the virtual server name received from the data transfer control server 140 (S515).
- the data information management unit 1332 of the parallel distributed processing control server 130 updates the allocation state 303 of the data block selected in the data information management table 300 to “allocated”.
- the processing allocation control unit 1331 transmits a processing request for the selected data block to the parallel distributed processing execution server 120 that transmitted the processing completion notification, and the computer system 20 returns the processing to step S503 (S516).
- the data block processing request includes the data block ID of the selected data block and the task ID for executing the processing of the data block.
- FIG. 16A and FIG. 16B are flowcharts showing a detailed procedure regarding the allocation of processing to the parallel distributed processing execution server 120 from step S507 to step S516 shown in FIG. 15A and FIG. 15B.
- The processes in FIGS. 16A and 16B are performed by the processing allocation control unit 1331, the data information management unit 1332, and the task management unit 1334 on the parallel distributed processing control server 130, and correspond to the processes of FIGS. 4A and 4B of the first embodiment.
- the process allocation control unit 1331 that has received the task ID as the process completion notification from the parallel distributed processing execution server 120 requests the task management unit 1334 to update the task management table 600. Further, the process allocation control unit 1331 requests the virtual server name list of the parallel distributed processing execution server 120 having unallocated data blocks from the data information management unit 1332 (S601).
- The task management unit 1334 that has received the update request for the task management table 600 updates the execution state 603 of the corresponding task in the task management table 600 to "waiting", and updates the processing data block ID 604 to a NULL value (S602).
- Next, the data information management unit 1332 refers to the data information management table 300, calculates the processed data rate of the parallel distributed processing execution server 120 that transmitted the processing completion notification, and transmits the processed data rate and the virtual server name of that parallel distributed processing execution server 120 to the data transfer control server 140. The data transfer control server 140 that has received the processed data rate updates the processed data rate 1803 in the processing status management table 1800 for the received virtual server name (S603).
- The method for calculating the processed data rate is the same as in the first embodiment: the data information management unit 1332 refers to the data information management table 300 and obtains the ratio of the number of data blocks whose processing state 304 is "processed" to the number of data blocks stored in the parallel distributed processing execution server 120. That is, the processed data rate is the ratio of data blocks that have been processed so far to all data blocks stored in the parallel distributed processing execution server 120, and is calculated using the equation (1) shown in the first embodiment.
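- Formula (1) amounts to the following small computation (a sketch; the state strings follow the "processed" value used in the data information management table 300):

```python
def processed_data_rate(block_states):
    # block_states: processing states of the data blocks stored on one
    # parallel distributed processing execution server.
    if not block_states:
        return 0.0
    processed = sum(1 for s in block_states if s == "processed")
    return processed / len(block_states)

print(processed_data_rate(["processed", "processed", "unprocessed", "unprocessed"]))  # 0.5
```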
- the data information management unit 1332 refers to the data information management table 300, creates a virtual server name list of the parallel distributed processing execution server 120 having a data block whose allocation state 303 is “unallocated”, and a process allocation control unit It transmits to 1331 (S604).
- The process allocation control unit 1331 that has received the virtual server name list of the parallel distributed processing execution servers 120 having unallocated data blocks determines whether or not the parallel distributed processing execution server 120 that transmitted the process completion notification is included in the virtual server name list (S605).
- If the parallel distributed processing execution server 120 that transmitted the process completion notification is present in the list as a result of the determination in step S605 (S605 → Yes), the process allocation control unit 1331 refers to the data information management table 300 and arbitrarily selects one data block from the unallocated data blocks stored in the parallel distributed processing execution server 120 that transmitted the process completion notification (S606).
- If the parallel distributed processing execution server 120 that transmitted the processing completion notification does not exist in the list (S605 → No), the process proceeds to FIG. 16B.
- In FIG. 16B, the process allocation control unit 1331 requests, from the processing execution server management unit 1333, the virtual server name list of the parallel distributed processing execution servers 120 operating on the same physical server as the parallel distributed processing execution server 120 that transmitted the process completion notification (S607). The request for the virtual server name list includes the virtual server name of the parallel distributed processing execution server 120 that transmitted the processing completion notification. The process execution server management unit 1333 transmits, to the process allocation control unit 1331, the virtual server name list of the other parallel distributed process execution servers 120 on the same physical server as the parallel distributed process execution server 120 corresponding to the received virtual server name (S608).
- If another parallel distributed processing execution server 120 exists on the same physical server as a result of the determination in step S609 (S609 → Yes), the process allocation control unit 1331 transmits, to the data transfer control server 140, the virtual server name of the parallel distributed processing execution server 120 that transmitted the process completion notification and the virtual server name list of the parallel distributed processing execution servers 120 that operate on the same physical server and have unallocated data blocks, and requests the parallel distributed processing execution server 120 that holds the data block to be processed next (S611).
- If no such parallel distributed processing execution server 120 exists (S609 → No), the process allocation control unit 1331 requests, from the data transfer control server 140, the virtual server name of the parallel distributed processing execution server 120 having the data block to be processed next by the task of the parallel distributed processing execution server 120 that transmitted the process completion notification (S610). The virtual server name request includes the virtual server name of the parallel distributed processing execution server 120 that transmitted the processing completion notification and the virtual server name list of the parallel distributed processing execution servers 120 having unallocated data blocks.
- the data transfer control server 140 extracts the parallel distributed processing execution server 120 storing the data block to be processed next by the parallel distributed processing execution server that has transmitted the processing completion notification, and transmits it to the parallel distributed processing control server 130 ( S612). Note that the processing in step S612 is processing corresponding to the processing in FIGS. 7A and 7B of the first embodiment.
- Next, the process allocation control unit 1331 arbitrarily selects one data block from the unallocated data blocks stored in the parallel distributed processing execution server 120 corresponding to the received virtual server name (S613).
- Next, the process allocation control unit 1331 requests the parallel distributed processing execution server 120 corresponding to the virtual server name received from the data transfer control server 140 to transfer the data block selected in step S613 to the parallel distributed processing execution server 120 that transmitted the processing completion notification (S614).
- the data block transfer request includes the data block ID of the selected data block and the virtual server name of the parallel distributed processing execution server 120 that has transmitted the processing completion notification.
- Next, the process allocation control unit 1331 returns to the process of FIG. 16A, requests the data information management unit 1332 to update the data information management table 300, and requests the task management unit 1334 to update the task management table 600 (S615).
- The update request for the data information management table 300 includes the data block ID 301 of the selected data block, and the update request for the task management table 600 includes the task ID received from the parallel distributed processing execution server 120 as the processing completion notification and the data block ID 601 of the selected data block.
- the data information management unit 1332 updates the allocation state 303 of the data information management table 300 corresponding to the data block corresponding to the data block ID included in the received update request to “allocated”.
- the task management unit 1334 updates the task execution state 603 of the task management table 600 corresponding to the received task ID to “being executed”, and updates the processing target data block ID 604 with the received data block ID (S616).
- the process allocation control unit 1331 transmits a process execution request for the selected data block to the parallel distributed process execution server 120 that has transmitted the process completion notification (S617).
- the process execution request includes the data block ID of the selected data block and the task ID to which the process is assigned.
- FIG. 17 is a diagram illustrating an example of the process execution server management table 1700 according to the second embodiment of this invention.
- FIG. 17 is a table corresponding to the process execution server management table 500 shown in FIG. 5 of the first embodiment.
- The process execution server management table 1700 includes, as attribute information for managing the physical servers 210 and the parallel distributed process execution servers 120, a physical server name 1701 for identifying each physical server 210 and a virtual server name 1702 for identifying each parallel distributed process execution server 120 configured as a virtual server; one entry is configured from these items.
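- Represented as a simple mapping, the table might look like the following sketch (server names are illustrative):

```python
# Physical server name (1701) -> virtual server names (1702) hosted on it.
execution_server_table = {
    "physical server 210-1": ["virtual server A", "virtual server B"],
    "physical server 210-2": ["virtual server C", "virtual server D"],
}

def other_virtual_servers_on_same_host(table, virtual_name):
    for vms in table.values():
        if virtual_name in vms:
            return [vm for vm in vms if vm != virtual_name]
    return []

print(other_virtual_servers_on_same_host(execution_server_table, "virtual server A"))
# -> ['virtual server B']
```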
- FIG. 18 is a diagram illustrating another example of the processing status management table 1800 according to the second embodiment of this invention.
- FIG. 18 is a table corresponding to the processing status management table 800 shown in FIG. 8 of the first embodiment.
- The processing status management table 1800 includes, as attribute information for managing the processing progress status of each parallel distributed processing execution server 120, a physical server name 1801 for identifying each physical server 210, a virtual server name 1802 for identifying each parallel distributed processing execution server 120, a processed data rate 1803 indicating the ratio of processed data blocks to the data blocks stored in each parallel distributed processing execution server, and an execution multiplicity 1804 indicating the processing performance per unit time of each parallel distributed processing execution server; one entry is configured from these items.
- processing status management table 1800 does not show hardware specification attribute information such as CPU and memory, but may include these hardware specifications.
- the present invention can be applied to a parallel distributed processing system, and is particularly suitable for a parallel distributed processing system including a parallel distributed processing execution server in which physical resource allocation varies.
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
Free I/O resource amount of network I/O = (theoretical value - currently used bandwidth) ÷ theoretical value
The currently used bandwidth can be, for example, the average data transfer rate over a predetermined period (for example, one minute). Instead of the theoretical value of the data transfer rate, the effective value of the link speed of the network interface 121, or a value obtained by multiplying the theoretical value by a predetermined ratio, may be used.
Free I/O resource amount of disk I/O = (theoretical value - currently used bandwidth) ÷ theoretical value
The currently used bandwidth can be, for example, the average data transfer rate over a predetermined period (for example, one minute). The effective value of the data transfer rate of the secondary storage device 124, or a value obtained by multiplying the theoretical value by a predetermined ratio, may also be used.
The processing status management table 800 has, as attribute information for managing the processing progress of each parallel distributed processing execution server 120, a server name 801 for identifying each parallel distributed processing execution server 120, a processed data rate 802 indicating the ratio of processed data blocks to the data blocks stored in each parallel distributed processing execution server, and an execution multiplicity 803 indicating the processing performance per unit time of each parallel distributed processing execution server 120.
Free I/O resource amount of network I/O = (effective value - currently used bandwidth) ÷ effective value
Similarly, using the data transfer rate (MByte/sec) available to the secondary storage device 124,
Free I/O resource amount of disk I/O = (effective value - currently used bandwidth) ÷ effective value
can be used. Instead of the effective value, a value obtained by multiplying the theoretical value by a predetermined ratio may be used.
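In code, these ratios reduce to the following sketch (a one-minute averaging window and the bandwidth figures are illustrative assumptions):

```python
def free_io_ratio(reference_bandwidth, used_bandwidth):
    # reference_bandwidth: theoretical value, effective value, or the
    # theoretical value multiplied by a predetermined ratio.
    # used_bandwidth: e.g. the average transfer rate over about one minute.
    return (reference_bandwidth - used_bandwidth) / reference_bandwidth

print(free_io_ratio(1000.0, 400.0))  # a 1000 MB/s link with 400 MB/s in use -> 0.6
```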
Claims (13)
- A data transfer control method for a parallel distributed processing system, the system comprising a plurality of parallel distributed processing execution servers, each including a processor and a storage device, storing in the storage device data blocks divided in advance as data to be processed, the processor executing in parallel tasks that process the data blocks, and a management computer that controls the plurality of parallel distributed processing execution servers, wherein the management computer selects a second parallel distributed processing execution server to be the transmission source of a data block to be allocated to a task of a first parallel distributed processing execution server, the method comprising:
a first step in which the management computer receives, from the first parallel distributed processing execution server, a completion notification indicating that the task has been completed;
a second step in which the management computer collects the resource usage amounts of each of the plurality of parallel distributed processing execution servers;
a third step in which the management computer acquires the data blocks held by the plurality of parallel distributed processing execution servers and the states of the tasks;
a fourth step in which the management computer selects, based on the progress of processing of the data blocks held by each of the plurality of parallel distributed processing execution servers and the resource usage amounts of the plurality of parallel distributed processing execution servers, a second parallel distributed processing execution server that transfers a data block to the first parallel distributed processing execution server;
a fifth step in which the management computer transmits, to the selected second parallel distributed processing execution server, an instruction to transfer the data block to the first parallel distributed processing execution server; and
a sixth step in which the management computer transmits, to the first parallel distributed processing execution server, an instruction to execute a task that processes the transferred data block. - The data transfer control method for a parallel distributed processing system according to claim 1, wherein
the fourth step includes:
a seventh step of calculating, as a processed data rate, the ratio of processed data blocks for which processing by a task has been completed among the data blocks held by the plurality of parallel distributed processing execution servers;
an eighth step of extracting, as processing delay servers, parallel distributed processing execution servers whose processed data rate is less than a preset threshold; and
a ninth step of selecting the second parallel distributed processing execution server from the extracted processing delay servers. - The data transfer control method for a parallel distributed processing system according to claim 2, wherein
the ninth step includes:
acquiring the execution multiplicity of the tasks as the processing capacity per unit time of the parallel distributed processing execution servers, and selecting the processing delay server with the smallest execution multiplicity as the second parallel distributed processing execution server. - The data transfer control method for a parallel distributed processing system according to claim 2, wherein
the ninth step includes:
acquiring a value indicating the hardware specification of the parallel distributed processing execution servers as the processing capacity per unit time of the parallel distributed processing execution servers, and selecting the processing delay server whose value indicating the hardware specification is the lowest as the second parallel distributed processing execution server. - The data transfer control method for a parallel distributed processing system according to claim 2, wherein
the ninth step includes:
a tenth step of obtaining the free I/O resource amount of the first parallel distributed processing execution server;
an eleventh step of obtaining the free I/O resource amounts of the processing delay servers; and
a twelfth step of selecting, as the second parallel distributed processing execution server, the processing delay server with the smallest difference between the free I/O resource amount of the first parallel distributed processing execution server and the free I/O resource amount of the processing delay server. - The data transfer control method for a parallel distributed processing system according to claim 5, wherein
the tenth step includes:
obtaining the free I/O resource amount of the network and the free resource amount of the disk I/O of the first parallel distributed processing execution server,
the eleventh step includes:
obtaining the free I/O resource amount of the network and the free resource amount of the disk I/O of the processing delay servers, and
the twelfth step includes:
obtaining, as a first absolute value, the absolute value of the difference between the free network I/O resource amount of the first parallel distributed processing execution server and the free network I/O resource amount of a processing delay server,
obtaining, as a second absolute value, the absolute value of the difference between the free disk I/O resource amount of the first parallel distributed processing execution server and the free disk I/O resource amount of the processing delay server, and
selecting the larger of the first absolute value and the second absolute value as the absolute value of that processing delay server, and selecting the processing delay server with the smallest such absolute value as the second parallel distributed processing execution server. - The data transfer control method for a parallel distributed processing system according to claim 2, wherein
the ninth step includes:
a tenth step of obtaining the free I/O resource amount of the first parallel distributed processing execution server;
an eleventh step of obtaining the free I/O resource amounts of the processing delay servers; and
a twelfth step of selecting, as the second parallel distributed processing execution server, the processing delay server with the smallest difference in free I/O resource amount among the processing delay servers having a free I/O resource amount larger than that of the first parallel distributed processing execution server. - The data transfer control method for a parallel distributed processing system according to claim 1, wherein
the parallel distributed processing execution servers are executed as virtual servers provided by a virtualization unit running on a physical server, and
the fourth step includes:
preferentially selecting, as the second parallel distributed processing execution server, a virtual server executed on the same physical server as the physical server that executes the first parallel distributed processing execution server. - A parallel distributed processing system comprising: a plurality of parallel distributed processing execution servers, each including a processor and a storage device, storing in the storage device data blocks divided in advance as data to be processed, the processor executing in parallel tasks that process the data blocks; and
a management computer that controls the plurality of parallel distributed processing execution servers, wherein
the management computer includes:
a parallel distributed processing control unit that controls the tasks for the data blocks of the plurality of parallel distributed processing execution servers;
a data transfer control unit that selects, for a first parallel distributed processing execution server whose task is to be allocated a data block among the plurality of parallel distributed processing execution servers, a second parallel distributed processing execution server to be the transmission source of the data block; and
a resource usage management unit that manages the resource usage amounts of each of the plurality of parallel distributed processing execution servers,
and wherein
the parallel distributed processing control unit
receives, from the first parallel distributed processing execution server, a completion notification indicating that the task has been completed, and acquires the data blocks held by the plurality of parallel distributed processing execution servers and the states of the tasks,
the data transfer control unit
selects, based on the progress of processing of the data blocks held by each of the plurality of parallel distributed processing execution servers and the resource usage amounts of the plurality of parallel distributed processing execution servers, a second parallel distributed processing execution server that transfers a data block to the first parallel distributed processing execution server, and transmits, to the selected second parallel distributed processing execution server, an instruction to transfer the data block to the first parallel distributed processing execution server, and
the parallel distributed processing control unit
transmits, to the first parallel distributed processing execution server, an instruction to execute a task that processes the transferred data block. - The parallel distributed processing system according to claim 9, wherein
the data transfer control unit
calculates, as a processed data rate, the ratio of processed data blocks for which processing by a task has been completed among the data blocks held by the plurality of parallel distributed processing execution servers, extracts, as processing delay servers, parallel distributed processing execution servers whose processed data rate is less than a preset threshold, and selects the second parallel distributed processing execution server from the extracted processing delay servers. - The parallel distributed processing system according to claim 10, wherein
the data transfer control unit
obtains the free I/O resource amount of the first parallel distributed processing execution server and the free I/O resource amounts of the processing delay servers, and selects, as the second parallel distributed processing execution server, the processing delay server with the smallest difference between the free I/O resource amount of the first parallel distributed processing execution server and the free I/O resource amount of the processing delay server. - The parallel distributed processing system according to claim 9, wherein
the parallel distributed processing execution servers are executed as virtual servers provided by a virtualization unit running on a physical server, and
the data transfer control unit
preferentially selects, as the second parallel distributed processing execution server, a virtual server on the same physical server as the physical server that executes the first parallel distributed processing execution server. - A storage medium storing a program for selecting, from among a plurality of parallel distributed processing execution servers, a second parallel distributed processing execution server to be the transmission source of a data block to be allocated to a task of a first parallel distributed processing execution server, the program causing a computer to execute:
a first procedure of receiving, from the first parallel distributed processing execution server, a completion notification indicating that the task has been completed;
a second procedure of collecting the resource usage amounts of each of the plurality of parallel distributed processing execution servers;
a third procedure of acquiring the data blocks held by the plurality of parallel distributed processing execution servers and the states of the tasks;
a fourth procedure of selecting, based on the progress of processing of the data blocks held by each of the plurality of parallel distributed processing execution servers and the resource usage amounts of the plurality of parallel distributed processing execution servers, a second parallel distributed processing execution server that transfers a data block to the first parallel distributed processing execution server;
a fifth procedure of transmitting, to the selected second parallel distributed processing execution server, an instruction to transfer the data block to the first parallel distributed processing execution server; and
a sixth procedure of transmitting, to the first parallel distributed processing execution server, an instruction to execute a task that processes the transferred data block;
the storage medium being a non-transitory computer-readable storage medium storing a program that causes a computer to execute the above procedures.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/978,798 US9244737B2 (en) | 2011-02-04 | 2011-02-04 | Data transfer control method of parallel distributed processing system, parallel distributed processing system, and recording medium |
PCT/JP2011/052435 WO2012105056A1 (ja) | 2011-02-04 | 2011-02-04 | 並列分散処理システムのデータ転送制御方法、並列分散処理システム及び記憶媒体 |
JP2012555678A JP5484601B2 (ja) | 2011-02-04 | 2011-02-04 | 並列分散処理システムのデータ転送制御方法、並列分散処理システム及び記憶媒体 |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2011/052435 WO2012105056A1 (ja) | 2011-02-04 | 2011-02-04 | 並列分散処理システムのデータ転送制御方法、並列分散処理システム及び記憶媒体 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2012105056A1 true WO2012105056A1 (ja) | 2012-08-09 |
Family
ID=46602297
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2011/052435 WO2012105056A1 (ja) | 2011-02-04 | 2011-02-04 | 並列分散処理システムのデータ転送制御方法、並列分散処理システム及び記憶媒体 |
Country Status (3)
Country | Link |
---|---|
US (1) | US9244737B2 (ja) |
JP (1) | JP5484601B2 (ja) |
WO (1) | WO2012105056A1 (ja) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2014102740A (ja) * | 2012-11-21 | 2014-06-05 | Fujitsu Ltd | 情報処理方法、プログラム、情報処理装置、及び情報処理システム。 |
JP2014235734A (ja) * | 2013-06-04 | 2014-12-15 | 富士通株式会社 | プロセスマイグレーション方法、プロセスマイグレーションを実行するよう動作するコンピュータシステム、そのようなシステム内の中間計算リソース、及びプロセスマイグレーション方法のためのパーティショニング前の計算リソースの選択方法 |
WO2015145598A1 (ja) * | 2014-03-26 | 2015-10-01 | 株式会社 日立製作所 | 並列演算処理システムのデータ配分装置、データ配分方法、及びデータ配分プログラム |
JP2019035996A (ja) * | 2017-08-10 | 2019-03-07 | 株式会社日立製作所 | 分散処理システム、分散処理方法、及び分散処理プログラム |
US10936377B2 (en) | 2017-02-28 | 2021-03-02 | Hitachi, Ltd. | Distributed database system and resource management method for distributed database system |
CN114006898A (zh) * | 2021-10-30 | 2022-02-01 | 杭州迪普信息技术有限公司 | 版本更换方法、装置及系统 |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9298760B1 (en) * | 2012-08-03 | 2016-03-29 | Google Inc. | Method for shard assignment in a large-scale data processing job |
US20150081400A1 (en) * | 2013-09-19 | 2015-03-19 | Infosys Limited | Watching ARM |
US11327779B2 (en) * | 2015-03-25 | 2022-05-10 | Vmware, Inc. | Parallelized virtual machine configuration |
WO2017006346A1 (en) * | 2015-07-09 | 2017-01-12 | Sai Venkatesh | System of disseminated parallel control computing in real time |
US11734064B2 (en) | 2016-02-05 | 2023-08-22 | Sas Institute Inc. | Automated virtual machine resource management in container-supported many task computing |
US11107037B2 (en) | 2017-12-15 | 2021-08-31 | Siemens Industry Software Inc. | Method and system of sharing product data in a collaborative environment |
US11748159B2 (en) | 2018-09-30 | 2023-09-05 | Sas Institute Inc. | Automated job flow cancellation for multiple task routine instance errors in many task computing |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2000268013A (ja) * | 1999-03-18 | 2000-09-29 | Nec Corp | 分散ジョブ制御システムおよび分散ジョブ制御方法 |
JP2008299791A (ja) * | 2007-06-04 | 2008-12-11 | Hitachi Ltd | 仮想計算機システム |
JP2010140134A (ja) * | 2008-12-10 | 2010-06-24 | Hitachi Ltd | 仮想マシン管理方法、プログラムおよび管理サーバ |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH1188373A (ja) * | 1997-09-12 | 1999-03-30 | Nec Corp | コネクション振り分けによる負荷分散方式 |
US8443372B2 (en) * | 2006-03-23 | 2013-05-14 | International Business Machines Corporation | Methods and systems for partitioning data in parallel processing systems |
US9378066B2 (en) * | 2008-02-25 | 2016-06-28 | Sap Se | Dynamic resizing of applications running on virtual machines |
US8819106B1 (en) * | 2008-12-12 | 2014-08-26 | Amazon Technologies, Inc. | Managing distributed execution of programs |
US8370493B2 (en) * | 2008-12-12 | 2013-02-05 | Amazon Technologies, Inc. | Saving program execution state |
US8793365B2 (en) * | 2009-03-04 | 2014-07-29 | International Business Machines Corporation | Environmental and computing cost reduction with improved reliability in workload assignment to distributed computing nodes |
JP5323554B2 (ja) | 2009-03-27 | 2013-10-23 | 株式会社日立製作所 | ジョブ処理方法、ジョブ処理プログラムを格納したコンピュータ読み取り可能な記録媒体、および、ジョブ処理システム |
US8266289B2 (en) * | 2009-04-23 | 2012-09-11 | Microsoft Corporation | Concurrent data processing in a distributed system |
- 2011-02-04 US US13/978,798 patent/US9244737B2/en not_active Expired - Fee Related
- 2011-02-04 JP JP2012555678A patent/JP5484601B2/ja not_active Expired - Fee Related
- 2011-02-04 WO PCT/JP2011/052435 patent/WO2012105056A1/ja active Application Filing
Non-Patent Citations (2)
Title |
---|
MASAAKI YONEDA: "Cloud o Ikasu Kigyo Cookpad", MITE WAKARU CLOUD MAGAZINE, vol. 1, 10 May 2010 (2010-05-10), pages 7 - 10 * |
YOSHIKI YAZAWA ET AL.: "The Design and Performance Evaluation of the Process Migratable MPI Program", IPSJ SIG NOTES, vol. 2008, no. 43, 13 May 2008 (2008-05-13), pages 31 - 36 * |
Also Published As
Publication number | Publication date |
---|---|
JPWO2012105056A1 (ja) | 2014-07-03 |
US9244737B2 (en) | 2016-01-26 |
US20130290979A1 (en) | 2013-10-31 |
JP5484601B2 (ja) | 2014-05-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP5484601B2 (ja) | 並列分散処理システムのデータ転送制御方法、並列分散処理システム及び記憶媒体 | |
JP5614226B2 (ja) | 仮想マシン制御装置、仮想マシン制御プログラムおよび仮想マシン制御方法 | |
JP5939740B2 (ja) | 動的にリソースを割り当てる方法、システム及びプログラム | |
CN106933669B (zh) | 用于数据处理的装置和方法 | |
KR101781063B1 (ko) | 동적 자원 관리를 위한 2단계 자원 관리 방법 및 장치 | |
CN107111517B (zh) | 针对归约器任务的虚拟机优化分配和/或生成 | |
JP5664098B2 (ja) | 複合イベント分散装置、複合イベント分散方法および複合イベント分散プログラム | |
CN104915253B (zh) | 一种作业调度的方法及作业处理器 | |
US10193973B2 (en) | Optimal allocation of dynamically instantiated services among computation resources | |
EP2755133B1 (en) | Application execution controller and application execution method | |
WO2015001850A1 (ja) | タスク割り当て判定装置、制御方法、及びプログラム | |
US10891164B2 (en) | Resource setting control device, resource setting control system, resource setting control method, and computer-readable recording medium | |
US20170220385A1 (en) | Cross-platform workload processing | |
JP6107801B2 (ja) | 情報処理装置、情報処理システム、タスク処理方法、及び、プログラム | |
JP6519111B2 (ja) | データ処理制御方法、データ処理制御プログラムおよびデータ処理制御装置 | |
JP2017041191A (ja) | リソース管理装置、リソース管理プログラム、及びリソース管理方法 | |
US10606650B2 (en) | Methods and nodes for scheduling data processing | |
JP2014186411A (ja) | 管理装置、情報処理システム、情報処理方法、及びプログラム | |
JP2016004328A (ja) | タスク割当プログラム、タスク割当方法およびタスク割当装置 | |
JP2017191387A (ja) | データ処理プログラム、データ処理方法およびデータ処理装置 | |
US11561843B2 (en) | Automated performance tuning using workload profiling in a distributed computing environment | |
CN117472570A (zh) | 用于调度加速器资源的方法、装置、电子设备和介质 | |
JP6158751B2 (ja) | 計算機資源割当装置及び計算機資源割当プログラム | |
US9710311B2 (en) | Information processing system, method of controlling information processing system, and recording medium | |
WO2013065151A1 (ja) | 計算機システム、データ転送方法、および、データ転送プログラム |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 11857607 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2012555678 Country of ref document: JP Kind code of ref document: A |
|
WWE | Wipo information: entry into national phase |
Ref document number: 13978798 Country of ref document: US |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 11857607 Country of ref document: EP Kind code of ref document: A1 |