WO2014016950A1 - Parallel computer system, and method for arranging processing load in parallel computer system - Google Patents

Parallel computer system, and method for arranging processing load in parallel computer system Download PDF

Info

Publication number
WO2014016950A1
Authority
WO
WIPO (PCT)
Prior art keywords
processing
load
computer system
parallel computer
processing load
Prior art date
Application number
PCT/JP2012/069077
Other languages
French (fr)
Japanese (ja)
Inventor
泰幸 工藤
加藤 猛
雅士 高田
幸二 福田
Original Assignee
株式会社日立製作所 (Hitachi, Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hitachi, Ltd. (株式会社日立製作所)
Priority to JP2014526681A priority Critical patent/JPWO2014016950A1/en
Priority to PCT/JP2012/069077 priority patent/WO2014016950A1/en
Publication of WO2014016950A1 publication Critical patent/WO2014016950A1/en

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083Techniques for rebalancing the load in a distributed system
    • G06F9/5088Techniques for rebalancing the load in a distributed system involving task migration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • G06F2209/50Indexing scheme relating to G06F9/50
    • G06F2209/5019Workload prediction

Definitions

  • The present invention relates to a parallel computer system and, more particularly, to processing load placement in a parallel computer system.
  • There is a technique described in Patent Document 1 as background art in this technical field.
  • This publication describes a load distribution method for appropriately distributing the load of a plurality of computers. Specifically, it describes a method of obtaining, for a certain process (called a program in Patent Document 1) executed on its own calculation processor (called an executor in Patent Document 1), the pre-movement communication time with each calculation processor and the post-movement communication time with each processor if the process were moved to another calculation processor, and moving the process to another calculation processor so that the post-movement communication time becomes shorter than the pre-movement communication time. There is also a technique described in Patent Document 2 as background art in this technical field.
  • This publication describes a method, in a parallel computer system, of moving a running process to another processor so that the communication efficiency of the entire system is optimized. Specifically, when the communication usage for transmission from a running process to another calculation processor exceeds a communication-amount threshold, the calculation processor with the smallest communication usage among the calculation processors near the receiving side is selected, and the running process is transferred to that processor.
  • JP-A-8-30558 (Patent Document 1); JP-A-11-154143 (Patent Document 2)
  • ABS: Agent-based Simulation
  • In the computer system of Patent Document 1 described above, the determination as to whether to move a process to another calculation processor is based on the sum of the process execution time and the communication time associated with the process movement; the communication between calculation nodes that occurs while the process is executing is not considered. It is therefore difficult to appropriately distribute the processing load, including communication, in a process such as ABS in which a large amount of communication between calculation processors occurs.
  • The present invention has been made in view of the above problems, and its object is to provide a parallel computer system, and a processing load placement method for a parallel computer system, capable of appropriately distributing the processing load amount including the effect of increases and decreases in the communication load between calculation processors accompanying process movement.
  • In the present invention, the above problem is solved by a parallel computer system that distributes a plurality of processing loads across a plurality of processing nodes and executes them: information on the amount of communication between the processing loads is acquired; based on the acquired information, a pair of processing nodes whose arrangement is to be changed is selected from among the plurality of processing nodes; the change in the communication load between that pair of processing nodes when a processing load is moved between the selected pair is predicted; and the plurality of processing loads are rearranged based on the prediction result.
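The four claimed steps (acquire communication amounts, select the node pair to change, predict the load change of a candidate move, rearrange) can be sketched as follows. This is an illustrative Python sketch, not the patent's literal implementation; the function and variable names are invented here, and connection counts stand in for communication amounts as in the embodiment described later.

```python
def rebalance_once(placement, comm_pairs):
    """One rearrangement step (illustrative sketch).

    placement: dict mapping agent -> node id (mutated in place).
    comm_pairs: iterable of (agent, agent) pairs that must communicate.
    """
    comm_pairs = list(comm_pairs)
    # 1. Acquire communication-amount information: count the connections
    #    crossing each pair of nodes (agents' traffic assumed equal).
    link = {}
    for a, b in comm_pairs:
        na, nb = placement[a], placement[b]
        if na != nb:
            k = frozenset((na, nb))
            link[k] = link.get(k, 0) + 1
    if not link:
        return placement
    # 2. Select the pair of nodes with the highest communication load.
    n1, n2 = tuple(max(link, key=link.get))

    # 3. Predict, for each agent on either node, the change in the pair's
    #    connection count if that agent were moved across.
    def delta(agent):
        here = placement[agent]
        there = n2 if here == n1 else n1
        d = 0
        for a, b in comm_pairs:
            other = b if a == agent else (a if b == agent else None)
            if other is None:
                continue
            if placement[other] == here:
                d += 1   # connection would start crossing the nodes
            elif placement[other] == there:
                d -= 1   # connection would become intra-node
        return d

    candidates = [a for a in placement if placement[a] in (n1, n2)]
    best = min(candidates, key=delta)
    # 4. Rearrange only when the prediction says the load decreases.
    if delta(best) < 0:
        placement[best] = n2 if placement[best] == n1 else n1
    return placement

# Example mirroring the embodiment below: agent G on node 1 talks to
# E, K, Q on node 2 and J on node 1.
placement = {'G': 1, 'J': 1, 'E': 2, 'K': 2, 'Q': 2}
rebalance_once(placement, [('G', 'E'), ('G', 'K'), ('G', 'Q'), ('G', 'J')])
print(placement['G'])  # 2: moving G removes three crossings and adds one
```

Note the sketch has no capacity guard; the second embodiment below adds a CPU/memory threshold before accepting a move.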
  • FIG. 1 is a functional block diagram of the parallel computer system 100 of this embodiment.
  • the parallel computer system 100 includes a plurality of information processing apparatuses.
  • the plurality of information processing apparatuses of the parallel computer system 100 include one management node 101 and three processing nodes 102 to 104.
  • Each of the management node 101 and the processing nodes 102 to 104 is a server device, for example.
  • When referring to a specific processing node, the processing node 102 is referred to as processing node 1, the processing node 103 as processing node 2, and the processing node 104 as processing node 3.
  • In the management node 101, program information 105, processing load placement information 106, monitoring result information 107, processing load placement means 108, communication control means 109, and calculation result aggregation means 110 are arranged. In each processing node, processing information 111, calculation-necessary information 112, calculation processing means 113, communication control means 114, and load monitoring means 115 are arranged.
  • the parallel computer system 100 includes a storage device 117. The management node 101, the storage device 117, and the processing nodes 102 to 104 are connected via the communication path control means 116.
  • FIG. 2 shows the functional block of FIG. 1 as a hardware image.
  • Each of the management node 101 and the processing nodes 102 to 104 has a central processing unit (CPU) 201, a memory 202, and communication means 203.
  • CPU: central processing unit
  • Various network devices can be used for the communication path control means 116; in this embodiment, a network switch is used.
  • FIG. 3 is a flowchart showing the operation of the parallel computer system 100.
  • First, a program is loaded from the storage device 117 into the memory 202 of the management node 101 and stored as the program information 105.
  • the stored program information 105 includes calculation information performed by each agent of the ABS, information related to communication performed by the agents, and the like.
  • In this embodiment, an ABS composed of 18 agents A to R will be described. The processing is repeated 20 times while changing the calculation coefficients of the agents.
  • the processing load allocation 301 and subsequent steps include processing execution 302, load analysis 303, processing load rearrangement 304, setting change 305, processing execution 306, load analysis 307, and result output 308.
  • the processing load placement 301 is executed by the processing load placement means 108 and the communication control means 109 of the management node 101.
  • the processing load arrangement unit 108 reads the stored program information 105 and assigns, for example, agents A to R cyclically to the processing nodes 1, 2, and 3.
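The cyclic assignment mentioned above might look like this minimal sketch (variable names are illustrative, not from the patent):

```python
# Deal the 18 agents A..R round-robin ("cyclically") onto processing nodes 1..3.
agents = [chr(c) for c in range(ord('A'), ord('R') + 1)]  # 'A' .. 'R'
nodes = [1, 2, 3]
placement = {agent: nodes[i % len(nodes)] for i, agent in enumerate(agents)}
# A, D, G, J, M, P -> node 1; B, E, H, K, N, Q -> node 2; C, F, I, L, O, R -> node 3
```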
  • Computation information executed by each agent in each processing node, information on the communication performed by the agents, and the like are transmitted using the communication control means 109 and stored as processing information 111 in the memory 202 of each processing node.
  • FIG. 4 is a list showing the arrangement destination of each agent on the processing node and the relationship between the communicating agents, and this information is stored as the processing load arrangement information 106.
  • FIG. 5 is an image of the list shown in FIG. 4.
  • Each agent is indicated by a circle, agents that need to communicate with each other are connected by solid lines, and the region assigned to each processing node is represented by a dotted line.
  • the process execution 302 is executed by the arithmetic processing unit 113 of each processing node.
  • the arithmetic processing means 113 reads the processing information 111 and the calculation necessary information 112, executes the calculation of each agent in each processing node from these information, and outputs the result.
  • the calculation necessary information 112 is information necessary for an agent in each processing node to perform a calculation, for example, a result of a calculation executed by another agent.
  • the calculation order and parallelism in the calculation processing unit 113, synchronization with other processing nodes, and the like are determined based on the calculation necessary information 112, and these are included in the processing information 111.
  • The calculation results of the calculation processing means 113 are transmitted using the communication control means 114, stored in the calculation result aggregation means 110 of the management node 101, and also stored in the calculation-necessary information 112 of any processing node that needs them.
  • the load analysis 303 is executed by the load monitoring unit 115 of each processing node.
  • the load monitoring unit 115 monitors the communication load of the communication control unit 114, and the monitoring result is transmitted to the management node 101 using the communication control unit 114 and stored as monitoring result information 107.
  • The communication load to be monitored is a value obtained by measuring the communication time between processing nodes during the first process execution 302; it is measured by, for example, the communication means 203 of each processing device or the communication path control means 116.
  • This index is selected because communication between agents across processing nodes is much slower than communication within a processing node and is therefore considered likely to become a processing bottleneck.
  • FIG. 6 is an example of the monitoring result information 107 transmitted from each processing node.
  • In FIG. 6, the communication time is proportional to the number of connections between the processing nodes in the image diagram of FIG. 5 because, for convenience of explanation, the communication amounts of the agents are all assumed to be equal. That is, in this embodiment, the communication amount between the processing nodes is evaluated by the number of connections.
  • The processing load rearrangement 304 is executed by the processing load placement means 108 of the management node 101.
  • the processing load placement unit 108 reads the monitoring result information 107 and determines the bottleneck route with the highest communication load. For example, in FIG. 6, since the average usage rate between the processing node 1 and the processing node 2 is the highest at 100%, this route is determined as a bottleneck route.
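As a sketch, the bottleneck-route determination reduces to taking the maximum over the monitored routes. Only the 100% figure for the node 1 - node 2 route comes from the text; the other values and the data layout are illustrative:

```python
# Average usage rate (%) per inter-node route, keyed by the unordered node pair.
average_usage = {
    frozenset({1, 2}): 100,  # from FIG. 6 as described in the text
    frozenset({2, 3}): 60,   # illustrative
    frozenset({3, 1}): 40,   # illustrative
}
bottleneck = max(average_usage, key=average_usage.get)
print(sorted(bottleneck))  # [1, 2]: the node 1 - node 2 route is the bottleneck
```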
  • the processing load placement unit 108 selects an agent to be moved to alleviate the bottleneck. The concept for this selection will be described with reference to the image diagram of FIG.
  • The processing load placement means 108 of the management node 101 selects the pair of the processing node 1 and the processing node 2, which has the highest average usage rate, and considers moving an agent between the pair.
  • For example, when the agent D is moved from the processing node 1 to the processing node 2, the connection with the agent B moves inside the processing node 2, so the number of connections between the processing node 1 and the processing node 2 can be reduced by one. However, three connections, with the agents A, M, and J, are newly generated between the processing node 1 and the processing node 2, so the number of connections between the two nodes increases by two compared with before the movement. On the other hand, when the agent G is moved from the processing node 1 to the processing node 2, its connections with the agents E, K, and Q move inside the processing node 2, so the number of connections between the processing node 1 and the processing node 2 can be reduced by three, while only one connection, with the agent J, is newly generated between them. The number of connections between the processing node 1 and the processing node 2 is therefore reduced by two compared with before the movement. In general, if the number of connections an agent has with agents in the destination processing node is subtracted from the number of connections it has with agents remaining in its own processing node, and the result is negative, moving that agent reduces the number of connections between the processing nodes.
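The subtraction rule above can be stated as a small function. The helper name and data layout are illustrative; the D and G examples reproduce the increase of two and decrease of two worked out in the text:

```python
# For an agent considered for a move from node src to node dst, the change in
# the src-dst connection count is (links to agents staying in src) minus
# (links to agents already in dst); a negative result means the move helps.
def pair_delta(agent, src, dst, placement, comm_pairs):
    created = internalized = 0
    for a, b in comm_pairs:
        other = b if a == agent else (a if b == agent else None)
        if other is None:
            continue
        if placement[other] == src:
            created += 1        # would become an inter-node connection
        elif placement[other] == dst:
            internalized += 1   # would become an intra-node connection
    return created - internalized

placement = {'D': 1, 'G': 1, 'A': 1, 'M': 1, 'J': 1,
             'B': 2, 'E': 2, 'K': 2, 'Q': 2}
pairs = [('D', 'B'), ('D', 'A'), ('D', 'M'), ('D', 'J'),
         ('G', 'E'), ('G', 'K'), ('G', 'Q'), ('G', 'J')]
print(pair_delta('D', 1, 2, placement, pairs))  # prints 2 (an increase of two)
print(pair_delta('G', 1, 2, placement, pairs))  # prints -2 (a decrease of two)
```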
  • Based on the processing load placement information 106 shown in FIG. 4, the processing load placement means 108 of the management node 101 calculates the change in the number of connections between the processing nodes when a processing load is moved between the selected pair.
  • FIG. 7A is an example of the calculated changes in the number of connections between processing nodes for the processing node 1; it includes, for each agent in the processing node 1, the number of agents with which it communicates and the change (increase/decrease) in the number of connections between processing nodes if that agent were moved to the processing node 2.
  • The change in the number of connections between the processing node 1 and the processing node 2 is obtained by subtracting, for each agent, the number of communicating agents arranged in the processing node 2 from the number of communicating agents arranged in the processing node 1. The number of connections between the processing nodes 2 and 3 increases by the number of communicating agents arranged in the processing node 3, and the number of connections between the processing nodes 3 and 1 decreases by that same number; the magnitude of the increase between the processing nodes 2 and 3 and the magnitude of the decrease between the processing nodes 3 and 1 therefore coincide.
  • From this calculation, the processing load placement means 108 can determine that moving the agent G to the processing node 2 reduces the number of connections between the processing nodes 1 and 2, while the number of connections between the other processing nodes neither increases nor decreases.
  • FIG. 7B is an example of the corresponding list for the processing node 2, and includes the increase/decrease in the number of connections between processing nodes when each agent in the processing node 2 is moved to the processing node 1.
  • In this way, the processing load placement means 108 determines that the agent G in the processing node 1 should be moved to the processing node 2 in order to alleviate the bottleneck. If there are multiple candidate moves that would relieve the bottleneck, the candidate that reduces the number of connections the most, that is, the candidate that most reduces the overall communication load, is selected.
  • the processing load placement means 108 updates the list of FIG. 4 so that the agent G is placed in the processing node 2 and stores it again as processing load placement information 106.
  • The content of the setting change 305 is included in the program information 105, and the management node 101 updates the content of the processing information 111 of each processing node using the communication control means 109 in accordance with the number of completed process executions.
  • the setting change 305 is a change of the calculation coefficient of each agent.
  • the second processing execution 302 is performed based on the processing load arrangement information 106 and the processing information 111 updated by the first processing procedure.
  • In the second processing procedure, the parallel computer system 100 performs functionally the same operations as the first process execution 302 and load analysis 303. Specifically, the parallel computer system 100 performs the process execution 302 based on the processing load arrangement shown in the image diagram of FIG. 8 and obtains the monitoring result information 107 shown in FIG. 9.
  • From the monitoring result information 107 shown in FIG. 9, the processing load placement means 108 determines that the bottleneck route is between the processing node 2 and the processing node 3, which has the longest communication time.
  • As in the first processing procedure, the increase/decrease in the number of connections between the processing nodes is calculated, and based on the calculation result shown in FIG. 10, the processing load placement means 108 determines that the agent N should be moved from the processing node 3 to the processing node 2 to reduce the number of connections between the processing nodes 2 and 3.
  • the processing load placement unit 108 updates the list of FIG. 4 so that the agent N is placed in the processing node 2 and stores the updated list as the processing load placement information 106 again.
  • FIG. 11 is an image diagram of the updated processing load arrangement information 106. Thereafter, the operation of the parallel computer system 100 shifts to the setting change 305, and the contents of the processing information 111 are updated as in the first processing procedure.
  • the second processing procedure is completed, and the procedure proceeds to the third processing procedure.
  • In the third and subsequent processing procedures, the processing load rearrangement 304 is not executed; the process execution 306, the load analysis 307, and the setting change 305 are executed, and the same operations are repeated until the twentieth processing procedure is completed.
  • the operations of the process execution 306, the load analysis 307, and the setting change 305 are functionally similar to the first and second processing procedures.
  • The result of the load analysis 307 becomes the monitoring result information 107 shown in FIG. 12, from which it can be seen that the communication time is further reduced compared with the monitoring result information 107 shown in FIG. 9.
  • The calculation results from the first to the twentieth time are stored in the calculation result aggregation means 110, and the parallel computer system 100 outputs these results based on the program information 105.
  • The output destination of the calculation results is not particularly limited; for example, an image display device, a printer, or an external storage can be used.
  • As described above, the second processing procedure can alleviate the communication bottleneck relative to the first, and the third processing procedure can alleviate it further relative to the second. That is, it is possible to provide a parallel computer system that can appropriately distribute the processing load, including communication, when executing parallel processing such as ABS in which a large amount of communication between calculation processors occurs.
  • In this embodiment, the processing load rearrangement is performed twice, but the number of times is not limited to this and may be one, or three or more.
  • Since the processing load rearrangement itself also takes processing time, it is not efficient to keep repeating it after the bottleneck has largely been resolved. It is therefore desirable to determine the number of repetitions of the processing load rearrangement in consideration of the balance between the time to execute one processing procedure multiplied by its number of repetitions and the time to execute the processing load rearrangement. Since it is desirable for the bottleneck to be eliminated at an early stage, it is desirable to perform the processing load rearrangement in the early processing procedures, as described above.
  • In this embodiment, the processing load rearrangement is triggered by comparing communication times, but the present invention is not limited to this; for example, a threshold may be set for the communication time, and the processing load rearrangement performed only when the threshold is exceeded.
  • Further, the communication bandwidth may differ between processing nodes, in which case the communication time also differs between processing nodes. In such a case, the communication time per unit amount of information between the processing nodes can be obtained in advance from measurement or from the specifications of the apparatus, and the processing load rearrangement can take this information into account to equalize the communication times.
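One way to take differing link speeds into account, as suggested above, is to weight each route's connection count by its pre-measured communication time per unit of information. All numbers and names here are illustrative, not from the text:

```python
# Pre-measured relative communication time per unit of information per route.
time_per_unit = {frozenset({1, 2}): 1.0,
                 frozenset({2, 3}): 2.5,   # a slower link
                 frozenset({3, 1}): 1.0}
# Connection counts per route (the proxy for communication amount).
connections = {frozenset({1, 2}): 4,
               frozenset({2, 3}): 2,
               frozenset({3, 1}): 2}
# Weighted load: estimated communication time per route.
weighted = {k: connections[k] * time_per_unit[k] for k in connections}
bottleneck = max(weighted, key=weighted.get)
print(sorted(bottleneck))  # [2, 3]: fewer connections, but the slower link wins
```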
  • ABS is taken as an example of parallel computation, but the present invention is not limited to this, and can be applied to analysis called graph processing such as shortest path search.
  • In graph processing, the processing for each graph vertex corresponds to a processing load.
  • one management node and three processing nodes are used, but it goes without saying that the present invention is not limited to this.
  • In the second embodiment, the basic configuration and operation are the same as those of the parallel computer system 100 shown in the first embodiment of the present invention. The differences from the first embodiment are the operations of the load monitoring means 115 and the resulting content of the monitoring result information 107, and the operations of the processing load placement means 108 and the resulting content of the processing load placement information 106. These differences are described in detail below.
  • The load monitoring means 115 monitors the communication load of the communication control means 114 as in the first embodiment of the present invention, and additionally monitors the calculation load of the calculation processing means 113. The calculation load to be monitored is, for example, the average load rate of the CPU 201 and the average usage rate of the memory 202 during the operation of the process execution 302.
  • FIG. 13 is an example of the monitoring result information 107 transmitted from each processing node, assuming the processing load arrangement image shown in FIG. 5. In FIG. 13, the CPU load rate and the memory usage rate of each processing node are proportional to the number of agents in the image diagram of FIG. 5, because the calculation loads of the agents are assumed to be all equal.
  • The processing load placement means 108 performs the same operation as in the first embodiment of the present invention, with the difference that an agent is not moved if the CPU load rate or the memory usage rate exceeds a threshold value. For example, with the threshold of 65% used in the present embodiment, the CPU load rate and the memory usage rate are both 60% in the monitoring result information 107 of FIG. 13, so the agent G is moved as in the first embodiment.
  • In the second processing procedure, however, both the CPU load rate and the memory usage rate are 70%, exceeding the 65% threshold. The candidate agent N is therefore not moved, the second arrangement image shown in FIG. 8 is kept unchanged, and the third and subsequent processing procedures are executed.
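The guard described in this embodiment can be sketched as a simple predicate. The 65% threshold and the 60%/70% values come from the text; the function shape is illustrative:

```python
# A candidate move is applied only if neither the CPU load rate nor the
# memory usage rate of the monitored node exceeds the threshold.
THRESHOLD = 65.0  # percent, from the embodiment's example

def move_allowed(cpu_load, mem_usage, threshold=THRESHOLD):
    return cpu_load <= threshold and mem_usage <= threshold

print(move_allowed(60.0, 60.0))  # True:  agent G is moved (first procedure)
print(move_allowed(70.0, 70.0))  # False: agent N is not moved (second procedure)
```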
  • As described above, in the present embodiment the communication bottleneck can be mitigated in the second processing procedure relative to the first, while moving an agent to reduce the communication bottleneck further would adversely affect the calculation processing; the processing load arrangement at this point is therefore judged to be optimal. That is, it is possible to provide a parallel computer system that can appropriately distribute the processing load, including communication, when executing processing such as ABS in which a large amount of communication between calculation processors occurs.
  • In this embodiment, the threshold for the CPU load rate and the memory usage rate is set to 65%, but the present invention is not limited to this; it is desirable to determine the threshold in consideration of the balance between communication time and calculation time.
  • 100: parallel computer system, 101: management node, 102-104: processing nodes 1-3, 105: program information, 106: processing load placement information, 107: monitoring result information, 108: processing load placement means, 109: communication control means, 110: calculation result aggregation means, 111: processing information, 112: calculation-necessary information, 113: calculation processing means, 114: communication control means, 115: load monitoring means, 116: communication path control means, 117: storage device, 201: CPU, 202: memory, 203: communication means, 301: processing load placement, 302: process execution, 303: load analysis, 304: processing load rearrangement, 305: setting change, 306: process execution, 307: load analysis, 308: result output.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multi Processors (AREA)
  • Computer And Data Communications (AREA)

Abstract

The present invention addresses the problem of providing a method for arranging the processing load in a parallel computer system, said method being capable of appropriately distributing a processing load that includes the influence of increases/decreases in the communication load between computer processors in conjunction with the migration of processes. The present invention solves this problem by means of a parallel computer system that distributes multiple processing loads with respect to multiple processing nodes, wherein: information regarding the amount of communication between processing loads is obtained; pairs of processing nodes are selected from the multiple processing nodes on the basis of the obtained information, as nodes for which the arrangement is to be changed; a prediction is made regarding the change in the amount of the communication load between these pairs of processing nodes when the processing load is migrated between the selected pairs of processing nodes; and the multiple processing loads are rearranged on the basis of the prediction results.

Description

Parallel computer system and processing load allocation method for a parallel computer system
The present invention relates to a parallel computer system and, more particularly, to processing load placement in a parallel computer system.
There is a technique described in Patent Document 1 as background art in this technical field. This publication describes a load distribution method for appropriately distributing the load of a plurality of computers. Specifically, it describes a method of obtaining, for a certain process (called a program in Patent Document 1) executed on its own calculation processor (called an executor in Patent Document 1), the pre-movement communication time with each calculation processor and the post-movement communication time with each processor if the process were moved to another calculation processor, and moving the process to another calculation processor so that the post-movement communication time becomes shorter than the pre-movement communication time. There is also a technique described in Patent Document 2 as background art in this technical field. This publication describes a method, in a parallel computer system, of moving a running process to another processor so that the communication efficiency of the entire system is optimized. Specifically, when the communication usage for transmission from a running process to another calculation processor exceeds a communication-amount threshold, the calculation processor with the smallest communication usage among the calculation processors near the receiving side is selected, and the running process is transferred to that processor.
JP-A-8-30558 (Patent Document 1); JP-A-11-154143 (Patent Document 2)
In recent years, demand has been increasing for a technique called agent-based simulation (ABS) for analyzing and predicting complex economic systems and social movements. An ABS is composed of a plurality of agents that make autonomous decisions, and the behavior of the system is generally studied while taking the interactions between the agents into account. When executing an ABS using parallel computation, the agents are grouped and allocated to the calculation processors, and the interactions between agents are propagated through the network.
Here, in the computer system of Patent Document 1 described above, the determination as to whether to move a process to another calculation processor is based on the sum of the process execution time and the communication time associated with the process movement; the communication between calculation nodes that occurs while the process is executing is not considered. It is therefore difficult to appropriately distribute the processing load, including communication, in a process such as ABS in which a large amount of communication between calculation processors occurs.
On the other hand, although the computer system of Patent Document 2 described above uses the communication load between calculation processors during process execution as the criterion for process movement, it does not consider the increase or decrease in the communication load that arises on other communication paths as a result of the movement. Therefore, in a process such as ABS in which a large amount of communication between calculation processors occurs, communication on other paths is likely to increase as a process moves, and it has been difficult to appropriately distribute the processing load including communication.
The present invention has been made in view of the above problems, and an object thereof is to provide a method of arranging processing loads in a parallel computer system that can appropriately distribute the processing load, including the effect of increases and decreases in the communication load between computation processors caused by process movement.
In the present invention, in a parallel computer system that distributes a plurality of processing loads across a plurality of processing nodes and executes them, information on the amount of communication between the processing loads is acquired; based on the acquired information, a pair of processing nodes whose arrangement is to be changed is selected from among the plurality of processing nodes; the change in the communication load between the pair of processing nodes that would result from moving a processing load between them is predicted; and the plurality of processing loads are rearranged based on the prediction result, thereby solving the above problem.
According to the present invention, even for processes such as an ABS in which a large amount of communication occurs between computation processors, the processing load including the communication component can be distributed appropriately, and the processing speed of the parallel computer system can be increased.
It is a block diagram illustrating the functions of a parallel computer system.
It is a block diagram illustrating a hardware image of a parallel computer system.
It is a flowchart illustrating the operation of a parallel computer system.
It is a list for explaining the operation of the processing load placement means.
It is a diagram for explaining an operation image of the processing load placement means.
It is a list for explaining the operation of the load monitoring means.
It is a list for explaining the operation of the processing load placement means.
It is a list for explaining the operation of the processing load placement means.
It is a diagram for explaining an operation image of the processing load placement means.
It is a list for explaining the operation of the load monitoring means.
It is a list for explaining the operation of the processing load placement means.
It is a list for explaining the operation of the processing load placement means.
It is a diagram for explaining an operation image of the processing load placement means.
It is a list for explaining the operation of the load monitoring means.
It is a list for explaining the operation of the load monitoring means.
It is a list for explaining the operation of the load monitoring means.
Hereinafter, embodiments will be described with reference to the drawings.
This embodiment describes an example of a parallel computer system that can appropriately distribute the processing load, including the communication component, when executing a computation involving parallel processing such as an ABS.
FIG. 1 is a functional block diagram of the parallel computer system 100 of this embodiment. The parallel computer system 100 comprises a plurality of information processing apparatuses: one management node 101 and three processing nodes 102 to 104. The management node 101 and the processing nodes 102 to 104 are each, for example, a server apparatus. In the following description, when a specific processing node is referred to, the processing node 102 is called processing node 1, the processing node 103 is called processing node 2, and the processing node 104 is called processing node 3. The management node 101 holds program information 105, processing load placement information 106, monitoring result information 107, processing load placement means 108, communication control means 109, and calculation result aggregation means 110. Each processing node holds processing information 111, calculation-required information 112, arithmetic processing means 113, communication control means 114, and load monitoring means 115. The parallel computer system 100 further comprises a storage device 117. The management node 101, the storage device 117, and the processing nodes 102 to 104 are connected via communication path control means 116.
FIG. 2 shows the functional blocks of FIG. 1 as a hardware image. The management node 101 and each of the processing nodes 102 to 104 have a central processing unit (CPU) 201, a memory 202, and communication means 203. Various network devices can be used as the communication path control means 116; in this embodiment, a network switch is used.
Next, the operation of the parallel computer system 100 of this embodiment will be described. FIG. 3 is a flowchart showing the operation of the parallel computer system 100.
First, a program is loaded from the storage device 117 into the memory 202 of the management node 101 and stored as the program information 105. The stored program information 105 includes information on the computations performed by each agent of the ABS, information on the communication performed between agents, and the like. This embodiment is described using an ABS composed of 18 agents, A through R. In the ABS of this embodiment, the processing is repeated 20 times while the calculation coefficients of the agents are changed.
Next, the operation of the parallel computer system 100 from the processing load placement 301 onward will be described. In FIG. 3, the steps following the processing load placement 301 are processing execution 302, load analysis 303, processing load rearrangement 304, setting change 305, processing execution 306, load analysis 307, and result output 308.
First, the processing load placement 301 is executed by the processing load placement means 108 and the communication control means 109 of the management node 101. The processing load placement means 108 reads the stored program information 105 and assigns, for example, the agents A through R cyclically to processing node 1, processing node 2, and processing node 3. Information on the computations performed by each agent in each processing node, information on the communication performed between agents, and the like are then transmitted using the communication control means 109 and stored as the processing information 111 in the memory 202 of each processing node. FIG. 4 is a list showing the node to which each agent is assigned and which agents communicate with each other; this information is stored as the processing load placement information 106. FIG. 5 depicts the list of FIG. 4 as an image, in which each agent is drawn as a circle, agents that need to communicate with each other are joined by solid lines, and the region handled by each processing node is enclosed by a dotted line.
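As an illustrative sketch (not part of the patent; the function and variable names are hypothetical), the cyclic assignment of agents A through R to the three processing nodes described above could be expressed as:

```python
import string

def cyclic_assignment(agents, num_nodes):
    """Assign each agent to a processing node in round-robin (cyclic) order.

    Returns a dict mapping agent name -> node number (1-based).
    """
    return {agent: (i % num_nodes) + 1 for i, agent in enumerate(agents)}

agents = list(string.ascii_uppercase[:18])  # agents A .. R
placement = cyclic_assignment(agents, 3)
# A -> node 1, B -> node 2, C -> node 3, D -> node 1, ...
```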
Next, the operation of the parallel computer system 100 in the first processing execution 302 will be described. The processing execution 302 is carried out by the arithmetic processing means 113 of each processing node. The arithmetic processing means 113 reads the processing information 111 and the calculation-required information 112, executes the computation of each agent in the processing node based on this information, and outputs the result. Here, the calculation-required information 112 is the information that the agents in each processing node need in order to compute, for example the results of computations executed by other agents. The order and degree of parallelism of the computations in the arithmetic processing means 113, synchronization with other processing nodes, and the like are determined based on the calculation-required information 112 and are included in the processing information 111. The computation results of the arithmetic processing means 113 are transmitted using the communication control means 114 and stored in the calculation result aggregation means 110 of the management node 101, and also in the calculation-required information 112 of any processing node that needs them.
Next, the operation of the load analysis 303 will be described. The load analysis 303 is executed by the load monitoring means 115 of each processing node. The load monitoring means 115 monitors the communication load on the communication control means 114; the monitoring result is transmitted to the management node 101 using the communication control means 114 and stored as the monitoring result information 107. The communication load monitored here is the communication time between processing nodes measured during the first processing execution 302, measured for example by the communication means 203 of each apparatus or by the communication path control means 116. This metric was chosen because communication between agents that spans processing nodes is far slower than communication within a processing node and therefore tends to become the bottleneck of the processing. FIG. 6 is an example of the monitoring result information 107 transmitted from each processing node. In FIG. 6, the communication time is proportional to the number of connections between processing nodes in the image of FIG. 5; this is because, for convenience of explanation, the amount of communication between every pair of agents is assumed to be equal. In other words, in this embodiment, the amount of communication between processing nodes is evaluated by the number of connections.
Next, the operation of the processing load rearrangement 304 will be described. The processing load rearrangement 304 is executed by the processing load placement means 108 of the management node 101. The processing load placement means 108 reads the monitoring result information 107 and determines the bottleneck path, that is, the path with the highest communication load. In FIG. 6, for example, the average utilization between processing node 1 and processing node 2 is the highest at 100%, so this path is determined to be the bottleneck path. The processing load placement means 108 then selects an agent to be moved in order to relieve the bottleneck. The reasoning behind this selection is explained using the image of FIG. 5.
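A minimal sketch of this bottleneck determination (the function name and the utilization figures are illustrative, not taken from FIG. 6):

```python
def find_bottleneck(avg_utilization):
    """Return the node pair whose communication path shows the highest
    average utilization, i.e. the bottleneck path."""
    return max(avg_utilization, key=avg_utilization.get)

# Hypothetical utilization figures in the spirit of FIG. 6 (percent):
utilization = {(1, 2): 100, (2, 3): 60, (3, 1): 60}
bottleneck = find_bottleneck(utilization)
# bottleneck == (1, 2): the path between processing nodes 1 and 2
```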
As stated above, the communication time is assumed to be proportional to the number of connections between processing nodes, so reducing the number of connections between processing node 1 and processing node 2 achieves an appropriate distribution of the processing load including the communication component. The processing load placement means 108 of the management node 101 therefore selects the pair of processing node 1 and processing node 2, which has the highest average utilization, and attempts to move an agent to be processed between the pair.
For example, in FIG. 5, if agent D is moved from processing node 1 to processing node 2, the connection with agent B becomes internal to processing node 2, which removes one connection between processing node 1 and processing node 2. However, three new connections, with agents A, M, and J, appear between processing node 1 and processing node 2, so the number of connections between the two nodes increases by two compared with before the move. In contrast, if agent G is moved from processing node 1 to processing node 2, the connections with agents E, K, and Q become internal to processing node 2, removing three connections between processing node 1 and processing node 2, while one new connection, with agent J, appears between them; the number of connections between the two nodes therefore decreases by two. In general, subtracting the number of connections an agent has with agents in the destination processing node from the number of connections it has with agents in its own processing node gives the change; if the result is negative, the move reduces the number of connections between the processing nodes compared with before the move.
The processing load placement means 108 of the management node 101 therefore calculates, based on the processing load placement information 106 shown in FIG. 4, the change in the number of connections between processing nodes when a processing load is moved between the selected pair. FIG. 7(a) is an example of this calculation for processing node 1; it contains, for each agent in processing node 1, the number of agents with which it communicates and the change (increase or decrease) in the number of connections between processing nodes if that agent is moved to processing node 2. The change in the number of connections between processing node 1 and processing node 2 is obtained by subtracting, from the number of agents within processing node 1 with which the agent communicates, the number of its communication partners placed in processing node 2. The number of connections between processing node 2 and processing node 3 increases by the number of the agent's communication partners placed in processing node 3. The change in the number of connections between processing node 3 and processing node 1 is obtained by subtracting that same number from zero. Note that the increase in the number of connections between processing node 2 and processing node 3 and the decrease in the number of connections between processing node 3 and processing node 1 are equal in magnitude.
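The bookkeeping above can be sketched as follows for a three-node system; the helper names and the small connection graph are illustrative only. For an agent moving from a source node to a destination node, the source-destination count changes by (partners in source) minus (partners in destination), the destination-third count grows by the number of partners on the third node, and the third-source count shrinks by the same amount:

```python
def connection_delta(agent, src, dst, third, placement, partners):
    """Change in inter-node connection counts if `agent` moves src -> dst."""
    in_src = sum(1 for p in partners[agent] if placement[p] == src)
    in_dst = sum(1 for p in partners[agent] if placement[p] == dst)
    in_third = sum(1 for p in partners[agent] if placement[p] == third)
    return {
        (src, dst): in_src - in_dst,  # internal links become cross-node and vice versa
        (dst, third): in_third,       # partners on the third node now cross dst-third
        (third, src): -in_third,      # ... and no longer cross third-src
    }

# Agent G of FIG. 5: partners E, K, Q on node 2 and J on node 1.
placement = {"G": 1, "J": 1, "E": 2, "K": 2, "Q": 2}
partners = {"G": ["E", "K", "Q", "J"]}
delta = connection_delta("G", 1, 2, 3, placement, partners)
# delta == {(1, 2): -2, (2, 3): 0, (3, 1): 0}
```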
Looking at the column for the change in the number of connections between processing node 1 and processing node 2, only agent G has a negative value, and for agent G the changes in the number of connections between processing node 2 and processing node 3 and between processing node 3 and processing node 1 are both zero. The processing load placement means 108 can therefore determine that moving agent G to processing node 2 reduces the number of connections between processing node 1 and processing node 2 by two without changing the number of connections between any other pair of processing nodes. Similarly, FIG. 7(b) is an example of the list for processing node 2, containing the change in the number of connections between processing nodes if each agent is moved to processing node 1. Looking at the column for the change between processing node 1 and processing node 2 in FIG. 7(b), no agent has a negative value. 
From the above, the processing load placement means 108 determines that, to relieve the bottleneck, agent G in processing node 1 should be moved to processing node 2. When there are multiple candidate moves for eliminating the bottleneck, selecting the candidate that reduces the number of connections the most, that is, the candidate that reduces the overall communication load the most, lowers the communication load most efficiently. Also, when there are multiple candidates, selecting a candidate that does not increase the number of connections between any pair of processing nodes other than the pair involved in the move prevents the move from creating a new bottleneck. The move of agent G described above is both the candidate that reduces the communication load the most and an ideal move that avoids creating a new bottleneck.
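Continuing the same sketch (again hypothetical code, not the patent's wording), the selection rule described above, namely prefer the move that reduces the bottleneck path the most while rejecting moves that add connections on any other path, might look like:

```python
def pick_move(candidates, bottleneck):
    """candidates: dict mapping agent -> connection-count delta per node pair.

    Keep only moves that reduce the bottleneck path without increasing any
    other path; among those, pick the agent with the largest reduction.
    """
    viable = {
        agent: delta for agent, delta in candidates.items()
        if delta[bottleneck] < 0
        and all(v <= 0 for pair, v in delta.items() if pair != bottleneck)
    }
    if not viable:
        return None  # no beneficial move exists between this pair of nodes
    return min(viable, key=lambda agent: viable[agent][bottleneck])

# With the FIG. 7 figures, only agent G qualifies:
candidates = {
    "D": {(1, 2): 2, (2, 3): 0, (3, 1): 0},
    "G": {(1, 2): -2, (2, 3): 0, (3, 1): 0},
}
chosen = pick_move(candidates, (1, 2))
# chosen == "G"
```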
Based on the above analysis, the processing load placement means 108 updates the list of FIG. 4 so that agent G is placed in processing node 2, and stores it again as the processing load placement information 106.
Next, the operation of the parallel computer system 100 in the setting change 305 will be described. The content of the setting change 305 is included in the program information 105, and the management node 101 updates the contents of the processing information 111 of each processing node, according to the number of completed processing executions, using the communication control means 109. In this embodiment, the setting change 305 is a change of the calculation coefficients of the agents.
With the above operations, the first processing cycle is complete and operation proceeds to the second processing execution 302, which is performed based on the processing load placement information 106 and the processing information 111 updated in the first cycle.
Although the calculation coefficients of the agents have changed and agent G is now placed in processing node 2, the parallel computer system 100 performs the processing execution 302 and the load analysis 303 functionally in the same way as in the first cycle. Specifically, the parallel computer system 100 performs the processing execution 302 based on the processing load placement shown in the image of FIG. 8, and obtains the monitoring result information 107 shown in FIG. 9 from the load analysis 303.
Next, in the processing load rearrangement 304, the processing load placement means 108 determines from the monitoring result information 107 shown in FIG. 9 that the bottleneck path, the one with the longest communication time, is between processing node 2 and processing node 3.
The changes in the number of connections between processing nodes are then calculated, and based on the calculation results shown in FIG. 10, the processing load placement means 108 determines that the number of connections between processing node 2 and processing node 3 can be reduced by moving agent N from processing node 3 to processing node 2. The processing load placement means 108 then updates the list of FIG. 4 so that agent N is placed in processing node 2, and stores the updated list again as the processing load placement information 106.
FIG. 11 is an image of the updated processing load placement information 106. The operation of the parallel computer system 100 then proceeds to the setting change 305, and the contents of the processing information 111 are updated as in the first processing cycle.
With the above operations, the second processing cycle is complete and operation proceeds to the third. From the third cycle onward, the processing load rearrangement 304 is not executed; the processing execution 306, the load analysis 307, and the setting change 305 are carried out, and the same operations are repeated until the twentieth cycle is complete. The operations of the processing execution 306, the load analysis 307, and the setting change 305 are functionally the same as in the first and second cycles. In the third and subsequent cycles, the result of the load analysis 307 is the monitoring result information 107 shown in FIG. 12, which shows that the communication time has decreased further compared with the monitoring result information 107 shown in FIG. 9. By rearranging the processing loads at an early stage of the parallel processing in this way, the benefit of the reduced communication time is enjoyed over many subsequent cycles.
Finally, the operation of the parallel computer system 100 in the result output 308 will be described. The computation results of the first through twentieth cycles are stored in the calculation result aggregation means 110, and the parallel computer system 100 outputs these results based on the program information 105. Although the output destination of the computation results is not specified in this embodiment, an image display device, a printer, external storage, or the like can be used, for example.
According to the first embodiment of the present invention described above, the second processing cycle relieves the communication bottleneck more than the first, and the third relieves it more than the second. In other words, it is possible to provide a parallel computer system that can appropriately distribute the processing load, including the communication component, when executing parallel processing such as an ABS in which a large amount of communication occurs between computation processors.
Although the processing load rearrangement is performed twice in this embodiment of the present invention, the invention is not limited to this; it may be performed once, or three times or more. However, because the processing load rearrangement itself takes processing time, it is not efficient to keep repeating it after the bottleneck has largely been eliminated. It is therefore desirable to decide the number of repetitions of the processing load rearrangement by weighing the time required for one processing cycle and the number of cycles against the time required to execute the rearrangement. Since it is preferable to eliminate the bottleneck at an early stage, it is desirable to perform the processing load rearrangement in the first processing cycle as described above.
In this embodiment, whether to perform the processing load rearrangement is decided by comparing communication times, but the invention is not limited to this; for example, a threshold may be set on the communication time so that the processing load rearrangement is executed only when the threshold is exceeded.
It is also possible to repeat the processing load rearrangement before executing the processing cycles so as to equalize the number of connections between processing nodes in advance. In practice, however, the actual communication time between processing nodes does not necessarily depend on the number of connections, owing to various factors. It is therefore desirable to provide the communication load monitoring function and the rearrangement function even when this procedure is used. If the communication time clearly increases after a processing load rearrangement because of such factors, it is possible to execute the next candidate placement or to revert to the original placement.
Depending on the network topology, the communication bandwidth may also differ between processing nodes, in which case the communication time may differ between node pairs even when the same amount of information is transferred. In this case, the communication time per unit amount of information between each pair of processing nodes can be measured in advance, or calculated from the specifications of the apparatus, and the processing load rearrangement can take this information into account to equalize the communication times.
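One way to fold in such per-pair link differences, under the assumptions stated above, is to weight each pair's connection count by a measured (or spec-derived) time per unit of traffic; all names and figures below are hypothetical:

```python
def weighted_comm_time(edge_counts, time_per_unit):
    """Estimated communication time per node pair: connection count times
    the measured time per unit of traffic on that pair."""
    return {pair: edge_counts[pair] * time_per_unit[pair] for pair in edge_counts}

counts = {(1, 2): 10, (2, 3): 7, (3, 1): 7}
per_unit = {(1, 2): 1.0, (2, 3): 1.5, (3, 1): 1.0}  # the (2, 3) link is slower
times = weighted_comm_time(counts, per_unit)
# times == {(1, 2): 10.0, (2, 3): 10.5, (3, 1): 7.0}
# The (2, 3) pair, not (1, 2), is now the predicted bottleneck.
```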
Furthermore, although an ABS is taken as the example of parallel computation in this embodiment, the invention is not limited to this; it can also be applied, for example, to the class of analyses called graph processing, such as shortest path search. In the case of graph processing, the processing for each graph vertex corresponds to a processing load. In addition, although one management node and three processing nodes are used in this embodiment for convenience of explanation, it goes without saying that the invention is not limited to this.
 This embodiment describes an example of a parallel computing system that monitors not only the communication load but also the computational load, and that can distribute the processing load appropriately by taking both load amounts into account.
 In this embodiment, the basic configuration and operation are the same as those of the parallel computer system 100 shown in the first embodiment. The differences from the first embodiment are the operations performed by the load monitoring means 115 and the resulting contents of the monitoring result information 107, and the operations performed by the processing load arrangement means 108 and the resulting contents of the processing load arrangement information 106. These differences are described in detail below.
 First, the load monitoring means 115 monitors the communication load of the communication control means 114 as in the first embodiment, and additionally monitors the computational load of the arithmetic processing means 113. The computational load to be monitored is, for example, the average load factor of the CPU 201 and the average usage rate of the memory 202 of each processing node during the operation of process execution 302. FIG. 13 shows an example of the monitoring result information 107 transmitted from each processing node, assuming the processing-load arrangement image shown in FIG. 5. In FIG. 13, the CPU load factor and the memory usage rate of each processing node are proportional to the number of agents in the image of FIG. 5; this is because, for convenience of explanation, the amount of computation executed by every agent is assumed to be equal.
 Next, the processing load arrangement means 108 operates in the same way as in the first embodiment, with one difference: when the CPU load factor or the memory usage rate exceeds a certain threshold, the agent migration is not executed. For example, if the threshold in this embodiment is 65%, then in the monitoring result information 107 of FIG. 13 the CPU load factor and the memory usage rate are both 60%, so it is judged that an agent may be moved, and agent G is moved as in the first embodiment.
 The second execution of the processing procedure then begins, and the result of monitoring the load in the same manner is the monitoring result information 107 of FIG. 14. In FIG. 14, the CPU load factor and the memory usage rate are both 70%, exceeding the 65% threshold; therefore the candidate agent N is not moved, and the third and subsequent executions of the processing procedure proceed without changing the second arrangement image shown in FIG. 8.
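The threshold check applied in the two iterations above can be sketched as a simple gate. The 65% threshold and the 60%/70% monitoring values are the example figures from the text (FIGS. 13 and 14); the function itself is an illustrative sketch, not the patented implementation:

```python
THRESHOLD = 65.0  # percent; the example value used in this embodiment

def may_move_agent(cpu_load_pct, mem_usage_pct, threshold=THRESHOLD):
    """Permit agent migration only while both the CPU load factor and the
    memory usage rate are at or below the threshold; if either exceeds it,
    further rearrangement would start to affect the computation itself."""
    return cpu_load_pct <= threshold and mem_usage_pct <= threshold

print(may_move_agent(60.0, 60.0))  # True  (first iteration, FIG. 13: agent G moves)
print(may_move_agent(70.0, 70.0))  # False (second iteration, FIG. 14: agent N stays)
```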
 According to the second embodiment described above, the second execution of the processing procedure mitigates the communication bottleneck more than the first. However, moving further agents to relieve the communication bottleneck would affect the computation, so the processing-load arrangement at this point is judged to be optimal. In other words, it is possible to provide a parallel computer system that can appropriately distribute the processing load, including its communication component, when executing processing such as ABS in which a large amount of communication occurs between computing processors.
 In this embodiment, a CPU load factor or memory usage rate of 65% was used as the threshold for deciding whether to execute the load rearrangement, but the invention is not limited to this value; it is desirable to determine the threshold in consideration of the balance between communication time and computation time.
 100: parallel computing system, 101: management node, 102-104: processing nodes, 105: program information, 106: processing load arrangement information, 107: monitoring result information, 108: processing load arrangement means, 109: communication control means, 110: calculation result aggregation means, 111: processing information, 112: information required for calculation, 113: arithmetic processing means, 114: communication control means, 115: load monitoring means, 116: communication path control means, 117: storage device, 201: CPU, 202: memory, 203: communication means, 301: processing load arrangement, 302: process execution, 303: load analysis, 304: processing load rearrangement, 305: setting change, 306: process execution, 307: load analysis, 308: result output.

Claims (15)

  1.  A processing load arrangement method for a parallel computer system that distributes a plurality of processing loads across a plurality of processing nodes and executes them, the method comprising:
     acquiring information on the amount of communication between the processing loads;
     selecting, based on the acquired information, a pair of processing nodes among the plurality of processing nodes whose arrangement is to be changed;
     predicting the change in communication load between the selected pair of processing nodes that would result from moving a processing load between them; and
     rearranging the plurality of processing loads based on the prediction result.
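The procedure of claim 1 can be sketched with hypothetical data structures: a traffic map giving the communication amount between each pair of processing loads, and a placement map assigning each load to a processing node. The names and the numbers are illustrative assumptions, not part of the claimed method:

```python
from collections import defaultdict

def inter_node_traffic(traffic, placement):
    """Aggregate per-load communication amounts into per-node-pair totals."""
    totals = defaultdict(float)
    for (load_a, load_b), amount in traffic.items():
        na, nb = placement[load_a], placement[load_b]
        if na != nb:  # only traffic that crosses node boundaries counts
            totals[(min(na, nb), max(na, nb))] += amount
    return dict(totals)

def predict_after_move(traffic, placement, load, dest):
    """Predict the inter-node communication loads if `load` moved to `dest`."""
    trial = dict(placement)
    trial[load] = dest
    return inter_node_traffic(traffic, trial)

traffic = {("A", "B"): 8.0, ("B", "C"): 1.0}   # communication amounts between loads
placement = {"A": 1, "B": 2, "C": 2}           # load -> processing node
print(inter_node_traffic(traffic, placement))         # {(1, 2): 8.0}
print(predict_after_move(traffic, placement, "A", 2)) # {} : the hot link disappears
```

Here the pair (1, 2) carries the heaviest inter-node traffic, so it would be selected for a change of arrangement, and the prediction shows that moving load A onto node 2 removes that traffic entirely.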
  2.  The processing load arrangement method according to claim 1, wherein, when making the prediction, the change in communication load between each pair of processing nodes other than the selected pair is also predicted.
  3.  The processing load arrangement method according to claim 1, wherein the parallel computer system repeats processing a plurality of times, and the rearrangement is performed after a first processing among the plurality of repetitions.
  4.  The processing load arrangement method according to claim 3, wherein the first processing is the initial processing among the plurality of repetitions.
  5.  The processing load arrangement method according to claim 1, wherein each processing load corresponds to an agent in an agent-based simulation.
  6.  The processing load arrangement method according to claim 1, wherein each processing load corresponds to a node of a graph.
  7.  A parallel computer system that distributes a plurality of processing loads across a plurality of processing nodes and executes them, comprising:
     means for acquiring information on the amount of communication between the processing loads; and
     means for selecting, based on the information acquired by the acquiring means, a pair of processing nodes among the plurality of processing nodes whose arrangement is to be changed, predicting the change in communication load between the selected pair of processing nodes that would result from moving a processing load between them, and rearranging the plurality of processing loads based on the prediction result.
  8.  The parallel computer system according to claim 7, wherein the rearranging means, when making the prediction, also predicts the change in communication load between each pair of processing nodes other than the selected pair.
  9.  The parallel computer system according to claim 7, wherein the parallel computer system repeats processing a plurality of times, and the rearranging means performs the rearrangement after a first processing among the plurality of repetitions.
  10.  The parallel computer system according to claim 9, wherein the first processing is the initial processing among the plurality of repetitions.
  11.  The parallel computer system according to claim 7, wherein each processing load corresponds to an agent in an agent-based simulation.
  12.  A processing load arrangement method for a parallel computer system that distributes a plurality of processing loads across a plurality of processing nodes and executes them, the method comprising:
     acquiring information on the amount of communication between the processing loads;
     predicting, for a change of the distributed arrangement, the change in communication load between a first processing node and a second processing node of the plurality of processing nodes, the change in communication load between the second processing node and a third processing node of the plurality of processing nodes, and the change in communication load between the third processing node and the first processing node; and
     rearranging the plurality of processing loads based on the acquired information and the prediction results.
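Claim 12 predicts the change on all three node pairs of a triple. Under the same hypothetical traffic-matrix representation as above (illustrative numbers, not the claimed implementation), the three pairwise deltas can be computed as:

```python
def pair_delta(before, after, node_a, node_b):
    """Change in communication load on the node pair (node_a, node_b)."""
    key = (min(node_a, node_b), max(node_a, node_b))
    return after.get(key, 0.0) - before.get(key, 0.0)

before = {(1, 2): 8.0, (2, 3): 2.0}  # inter-node loads before the change
after = {(2, 3): 5.0, (1, 3): 1.0}   # predicted loads after the change

print(pair_delta(before, after, 1, 2))  # -8.0 (first-second pair relieved)
print(pair_delta(before, after, 2, 3))  # 3.0  (second-third pair increases)
print(pair_delta(before, after, 3, 1))  # 1.0  (third-first pair increases)
```

The rearrangement decision then weighs all three deltas together with the acquired traffic information, rather than looking at the moved pair in isolation.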
  13.  The processing load arrangement method according to claim 12, wherein, in performing the rearrangement, a pair of processing nodes among the plurality of processing nodes whose arrangement is to be changed is selected based on the acquired information, and the distributed arrangement is changed by moving a processing load between the selected pair of processing nodes.
  14.  The processing load arrangement method according to claim 12, wherein the parallel computer system repeats processing a plurality of times, and the rearrangement is performed after a first processing among the plurality of repetitions.
  15.  The processing load arrangement method according to claim 14, wherein the first processing is the initial processing among the plurality of repetitions.
PCT/JP2012/069077 2012-07-27 2012-07-27 Parallel computer system, and method for arranging processing load in parallel computer system WO2014016950A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2014526681A JPWO2014016950A1 (en) 2012-07-27 2012-07-27 Parallel computer system and processing load allocation method to parallel computer system
PCT/JP2012/069077 WO2014016950A1 (en) 2012-07-27 2012-07-27 Parallel computer system, and method for arranging processing load in parallel computer system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2012/069077 WO2014016950A1 (en) 2012-07-27 2012-07-27 Parallel computer system, and method for arranging processing load in parallel computer system

Publications (1)

Publication Number Publication Date
WO2014016950A1 true WO2014016950A1 (en) 2014-01-30

Family

ID=49996783

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2012/069077 WO2014016950A1 (en) 2012-07-27 2012-07-27 Parallel computer system, and method for arranging processing load in parallel computer system

Country Status (2)

Country Link
JP (1) JPWO2014016950A1 (en)
WO (1) WO2014016950A1 (en)


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009187115A (en) * 2008-02-04 2009-08-20 Internatl Business Mach Corp <Ibm> Multi-node server system, load distribution method, resource management server, and program
JP2010079504A (en) * 2008-09-25 2010-04-08 Mitsubishi Electric Information Systems Corp Apparatus, system, method, and program for distributed processing
JP2010079622A (en) * 2008-09-26 2010-04-08 Hitachi Ltd Multi-core processor system and task control method thereof
JP2012089015A (en) * 2010-10-21 2012-05-10 Hitachi Ltd Distributed information processing system, distributed information processing method and data transfer unit

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
JPH0934847A (en) * 1995-07-14 1997-02-07 Hitachi Ltd Method for distributing load of parallel computer system
JP5471166B2 (en) * 2009-08-26 2014-04-16 日本電気株式会社 Management system, management device, network device, management method and program


Non-Patent Citations (1)

Title
HIROSHI ARIKAWA: "A Large-scale Multi Agent Simulation Considering Environmental Information and Its Implementation on Parallel Computer", JOURNAL OF JAPAN SOCIETY FOR FUZZY THEORY AND INTELLIGENT INFORMATICS, vol. 22, no. 2, 15 April 2010 (2010-04-15), pages 211 - 221 *

Cited By (4)

Publication number Priority date Publication date Assignee Title
US10095555B2 (en) 2015-07-31 2018-10-09 Honda Motor Co., Ltd. Task control system
WO2022084784A1 (en) * 2020-10-23 2022-04-28 International Business Machines Corporation Auto-scaling a query engine for enterprise-level big data workloads
GB2615466A (en) * 2020-10-23 2023-08-09 Ibm Auto-scaling a query engine for enterprise-level big data workloads
US11809424B2 (en) 2020-10-23 2023-11-07 International Business Machines Corporation Auto-scaling a query engine for enterprise-level big data workloads

Also Published As

Publication number Publication date
JPWO2014016950A1 (en) 2016-07-07


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 12881655; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 2014526681; Country of ref document: JP; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 12881655; Country of ref document: EP; Kind code of ref document: A1)