JP4571090B2 - Scheduler program, server system, scheduler device - Google Patents

Scheduler program, server system, scheduler device

Info

Publication number
JP4571090B2
Authority
JP
Japan
Prior art keywords
task
node
data
transaction
processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
JP2006089542A
Other languages
Japanese (ja)
Other versions
JP2007265043A (en)
Inventor
弘美 宇和田
Original Assignee
株式会社野村総合研究所
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 株式会社野村総合研究所 filed Critical 株式会社野村総合研究所
Priority to JP2006089542A
Publication of JP2007265043A
Application granted
Publication of JP4571090B2

Description

  The present invention relates to a task processing technique in a server system including a plurality of nodes.

  Service requests that users transmit to server systems via the Internet using a web browser or the like are increasing year by year. Because the user executes such a request interactively from the web browser, the session with the server may last a long time. To respond immediately to the intermittent requests arriving over a session, session information and programs must be kept in the server's memory. As service demand grows, the memory resources required by the server tend to increase, and there is a demand for securing memory resources at low cost.

  Accordingly, attention has been paid to grid computing that realizes high-speed processing by connecting a plurality of relatively inexpensive servers or personal computers in a network and distributing tasks derived from service requests. Grid computing users can use the enormous processing power and storage capacity pooled in the grid.

In general, processing a service request involves frequent reads from and writes to a database. Therefore, if the time required for the transaction with the database can be shortened, the response time for the service request can also be shortened. For example, Patent Document 1 discloses a technique that, in order to reduce the time required for a commit, which is the process of storing data to be persisted in a database within a transaction, uses volatile memory that can be written faster than a disk.
JP 2006-12142 A

  When a plurality of tasks having a predetermined execution order are executed, even if the tasks are distributed to a plurality of nodes, the processing of the subsequent task cannot be started unless the previous task is completed. A major factor causing this restriction is that when a transaction with a database is performed in the previous task, the subsequent task cannot be started until the data change in the transaction is confirmed by the commit. That is, if it takes time to commit the transaction, the response to the service request is delayed accordingly.

  The present invention has been made in view of such circumstances, and an object thereof is to provide a technique for improving the response speed of a service request.

  One aspect of the present invention is a scheduler program executed by a schedule node responsible for scheduling tasks among the nodes of a server system that includes a plurality of nodes, each having a processor, and a database storing predetermined data. The program controls the execution, on the individual nodes, of a plurality of tasks whose execution order is determined in response to a service request given from a client terminal. The program causes the schedule node to provide an assignment function that assigns the plurality of tasks to different task execution nodes in the server system; a task pre-execution instruction function that, before one task execution node completes the processing of the current task, which includes use or editing of data in the database, causes another task execution node to execute the processing of the subsequent task in advance based on a prediction of the processing result of the current task; and a data replication function that transmits the data required for the connection with the client terminal to the other task execution node in order to hand it over to the subsequent task.

  In this aspect, in distributed computing in which a plurality of tasks having a predetermined execution order are assigned to different nodes, execution of the subsequent task is started before the processing of the current task being executed is completed. As a result, subsequent tasks can be completed earlier, and the overall response time to a service request can be reduced. In addition, substantial parallel processing can be realized even for tasks whose execution order is determined.

  The task pre-execution instruction function may execute the processing of the subsequent task in advance when the processing of the current task is a data persistence process that finalizes data writing or data rewriting that occurred in the transaction between the database and the task execution node to which the current task is assigned.

  The current task must wait until the data persistence process is finished, but according to this aspect the subsequent task is executed in advance during that waiting time, so the effective waiting time of the current task can be reduced regardless of whether the transaction load on the database is heavy or light.

  Another aspect of the present invention is a server system including a plurality of nodes each including a processor and a database storing predetermined data. This system includes a child node, which is a schedule node that executes the scheduler program; a grandchild node, which is a task execution node that executes a task assigned by the assignment function; and a transaction node that handles processing for the database.

  The server system may further include a storage device that is connected to the plurality of nodes, is configured to be accessible from any node, and stores the above-described scheduler program. Thus, whichever node in the server system is chosen as a child node, the scheduler program can be loaded onto that node.

  It should be noted that any combination of the above-described components and a representation of the present invention by a method, apparatus, system, recording medium, and computer program are also effective as an aspect of the present invention.

  According to the present invention, the response speed of service requests can be improved.

According to an embodiment of the present invention, in a server system including a plurality of nodes each having a processor, a plurality of tasks derived from a service request and having a predetermined execution order are deployed to different nodes in the system, and before the processing of the current task is completed, the subsequent task is executed in advance on another node based on a prediction of the processing result of the current task. By starting the processing of the subsequent task before the previous task is completed, the response time to the service request can be improved.
Hereinafter, this embodiment will be described in detail with reference to the drawings.

  FIG. 1 is an overall configuration diagram of a lattice computer system 100 according to an embodiment of the present invention and a client terminal 12 connected thereto. Here, the “lattice computer system” targeted by the present embodiment refers to a system in which a plurality of nodes each including a processor, such as a server or a personal computer, are connected in a lattice form.

  As shown in FIG. 1, the lattice computer system 100 includes a node group 10 that provides a specific service in response to a request issued from the client terminal 12. In the node group 10, the nodes are arranged so as to form a lattice having a plurality of columns and a plurality of rows (four rows and four columns in FIG. 1). In FIG. 1, each node is represented by a white square. A plurality of routers (not shown) are provided between the nodes so that the nodes arranged in the lattice can communicate with all of the other nodes, and these routers are connected to a network 14 such as the Internet, a LAN, or a WAN. The lattice computer system 100 is installed in a company data center or the like and can respond to a large number of service requests simultaneously.

  The node group 10 in FIG. 1 is shown as a lattice model in which each node is represented as a logical node. A logical node stands for a plurality of servers or personal computers connected to a single router, drawn as a single node. However, each node in FIG. 1 may correspond to one server or personal computer. In FIG. 1, the node group 10 is arranged in four rows and four columns, but it goes without saying that it may be composed of a larger or smaller number of nodes.

Each node in the node group 10 is configured to be accessible to the storage device 24 via a router. The scheduler program and application program necessary for processing the service request are stored in the storage device 24, and the program and data can be transmitted from the storage device 24 to each node as necessary. A table necessary for executing the application is stored in the database 26. Each node in the node group 10 is configured to be able to access the database 26 via the transaction node 16. Details of the transaction node 16 will be described later with reference to FIG.
The storage device 24 and the database 26 are generally hard disk devices, and are configured by collecting a plurality of disks so as to exhibit performance commensurate with write requests from a large number of nodes. The storage device 24 and the database 26 may be a magneto-optical disk device or a nonvolatile memory.

  The client terminal 12 may be a personal computer including an input device such as a keyboard and a mouse and an output device such as a display, or a mobile phone including equivalent input/output devices. In the case of a mobile phone, however, wireless communication is assumed. The user issues a service request to the lattice computer system 100 using a web browser or the like on the client terminal 12. The service request may be, for example, a securities ordering process or a travel reservation process. When the server processes a service request of this kind, the user builds up the service request interactively using the web browser.

  FIG. 2 shows the configuration of each node constituting the node group 10. Each node includes at least a processor 92 that executes various processes according to a program, a memory 94 that temporarily stores data and programs, a storage device 96 such as a hard disk drive or a DVD drive whose recorded contents are retained even when the node is restarted, a network interface 98 that is connected to the network and executes various input/output processes, and a bus 90 interconnecting them. Each node may have an input device such as a keyboard and a mouse and an output device such as a display as necessary. One node may have two or more network interfaces 98.

  Each node has a compact shape suitable for configuring a lattice type computer system, that is, a blade-type housing on which a processor, a memory, a hard disk, a bus, and the like are mounted. The node group 10 is preferably arranged with a large number of blade-type housings arranged in a rack, but may be in other forms.

  By the way, in grid computing using a plurality of nodes connected to each other as shown in FIG. 1, how to assign a task derived from a service request between nodes constituting the grid is a big problem.

For example, when a number of tasks are derived from one service request and information is exchanged between the tasks, the request can be processed more efficiently if the nodes to which the tasks are assigned are grouped close together. However, if any service request can freely occupy nodes in the grid, the load on particular nodes may increase because data flows cross, and as a result the performance of the entire system may decrease. Therefore, in grid computing, there is a strong need for task assignment that takes the characteristics of service requests into account.
Therefore, in the present embodiment, a super scheduler and a scheduler that execute task assignment according to the characteristics of the service request are arranged in the nodes in the node group 10.

  Returning to FIG. 1, in this specification, within the node group 10, a node that has the super scheduler 22 and oversees task assignment for the entire lattice computer system 100 is called a “parent node”, a node that has the scheduler 30 and oversees task assignment within a certain range according to a service request is called a “child node”, and a node to which a task is assigned by the scheduler 30 of a child node is called a “grandchild node”.

  Further, in this specification, a “task” is a unit into which the program code of an application that achieves a certain purpose is divided, and when tasks are derived from a “service request”, it is assumed that the order relationship of their execution, such as parallel or serial, is clear. A task may or may not include a transaction. An example of the latter is a task that involves executing a script embedded in a web page or calling a component but does not access the database. This embodiment is effective when at least one of the tasks derived from a service request includes a transaction.

  Next, an outline of operations of the super scheduler 22 and the scheduler 30 will be described.

  A service request issued from the client terminal 12 to the lattice computer system 100 is first received by the parent node 18. When receiving the service request, the super scheduler 22 of the parent node 18 analyzes the service request. Specifically, the required amount of resources is estimated from information such as the types of service requests arriving at the system 100 and the number of concurrent transactions. This estimate may be input manually by an operator of the system 100, or may be calculated by the super scheduler 22 based on past statistics.

  Subsequently, according to this estimate, the super scheduler 22 secures in the node group 10 the number of nodes required for each service request, determines the number of child nodes 20 that serve as starting points for service request processing, and deploys those child nodes 20 within the node group 10.

  When the placement of the child node is determined, the scheduler program is transmitted from the storage device 24 to the node determined as the child node, and functions as the scheduler 30 in the child node 20. The scheduler 30 notifies the superscheduler 22 of resources necessary for processing its service request. The super scheduler 22 notifies the client terminal 12 outside the system of the network address of the child node 20. Subsequent service requests are sent directly to the child node 20.

  The scheduler 30 executes processing for assigning tasks to several nodes in the node group 10 based on the characteristics of the service request. First, the scheduler 30 calls an application previously associated with each service request, and assigns a task processing program to other nodes in the node group 10 in order to process tasks constituting the application. These become grandchild nodes 50 and 54. The task processing program code is transmitted from the storage device 24 to the grandchild nodes 50 and 54, read into the memory, and then waits on the memory in preparation for a service request. Then, in the grandchild nodes 50 and 54 to which the task is assigned, the task processing is executed by the task processing program. The result of the task processing is transmitted to the client terminal 12 via the network 14. The client terminal 12 displays the result on the display based on the transmitted data.
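
  The assignment flow just described can be pictured with a short sketch. The following Python fragment is a minimal, hypothetical illustration, not code from the patent: the names Scheduler, Node, and assign_tasks, and the dictionary standing in for the storage device 24, are all assumptions.

```python
# Minimal sketch (hypothetical): a child-node scheduler assigning the tasks of
# an application to grandchild nodes and staging each task's program onto its node.
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    loaded_programs: dict = field(default_factory=dict)  # task name -> program code

    def load_program(self, task_name: str, program_code: str) -> None:
        # The task processing program is read into memory and waits for requests.
        self.loaded_programs[task_name] = program_code


@dataclass
class Scheduler:
    storage: dict        # stands in for the shared storage device 24
    free_nodes: list     # candidate grandchild nodes in the node group

    def assign_tasks(self, application_tasks: list) -> dict:
        """Assign each task of the application to a different grandchild node."""
        assignment = {}
        for task_name in application_tasks:
            node = self.free_nodes.pop(0)                        # pick an appropriate node
            node.load_program(task_name, self.storage[task_name])
            assignment[task_name] = node
        return assignment


# Usage: two serially related tasks placed on grandchild nodes A and B.
scheduler = Scheduler(
    storage={"current_task": "<program code>", "next_task": "<program code>"},
    free_nodes=[Node("grandchild-A"), Node("grandchild-B")],
)
print(scheduler.assign_tasks(["current_task", "next_task"]))
```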

In general, deploying a task derived from a service request to a plurality of nodes has the following advantages.
First, it is possible to secure resources necessary for executing a task. Also, resource contention that occurs when a plurality of tasks are activated simultaneously on the same node is less likely to occur.
Furthermore, the task switching overhead in the operating system is less than when a plurality of tasks are executed on the same node. Therefore, the response and throughput can be improved by dividing the task into different nodes.

  By the way, in the case of an advanced service request, execution order control reflecting the dependency between tasks is required. For example, on a website that accepts reservations for overseas travel via the Internet, it is necessary to sequentially execute tasks such as “reservation of transportation facilities” and “reservation of accommodation facilities” following “input of desired itinerary”. In such a case, it is necessary to take over the information of the previous task in order to execute the subsequent task.

  Such a task that has a dependency relationship between tasks and needs to be processed in series is called a “task in a serial relationship” in this specification. A task in a serial relationship usually cannot start a subsequent task until the previous task is completed normally. For example, in a two-phase commit in a distributed transaction described later, the task needs to wait until the commit result is notified to the task, and cannot move to the next task process.

  Therefore, in this embodiment, before the transaction between a task being executed on a grandchild node (hereinafter referred to as the “current task”) and the database is completed, the result of the task being executed, specifically the result of the transaction, is predicted, and the subsequent task is executed in advance on another grandchild node based on that prediction.

  FIG. 3 is a functional block diagram showing the configuration of the scheduler 30 of the child node 20 and of the transaction node 16 that enable the pre-execution of a task. In terms of hardware, these configurations can be realized by a processor, a memory, a bus, and the like; in terms of software, they are realized by a program loaded into the memory. FIG. 3 depicts functional blocks realized by their cooperation. Accordingly, those skilled in the art will understand that these functional blocks can be realized in various forms by hardware only, software only, or a combination thereof.

  The scheduler 30 is implemented entirely as software, and its program is loaded from the storage device 96 of the node or from the storage device 24 shared by the nodes. As shown in FIG. 3, the scheduler 30 includes a task assignment unit 32, a task pre-execution instruction unit 34, a data replication unit 42, a screen display control unit 44, and a failure processing unit 46.

  The task assigning unit 32 assigns each task derived from the service request to an appropriate grandchild node in the node group 10 according to the above-described procedure. In FIG. 3, the current task 52 and the next task 56 are assigned to the grandchild node A and the grandchild node B, respectively. The current task 52 and the next task 56 are in a serial relationship.

  The task pre-execution instruction unit 34 causes the processing of the next task, assigned to another node, to be executed in advance based on a prediction of the processing result of the current task, before the processing of the current task, which includes use or editing of data in the database 26, is completed. More specifically, when the current task 52 on the grandchild node A requests the transaction manager 36 to commit (persist data), that is, to finalize the data writing or data rewriting that occurred in the transaction between the current task 52 and the database 26, the task pre-execution instruction unit 34 instructs the grandchild node B to start processing the next task 56 on the assumption that the transaction has succeeded, without waiting for the commit completion notification. The function and processing of the transaction manager 36 will be described later.

  The data replication unit 42 transmits the session data necessary for executing the next task 56 to the grandchild node B, to which the next task 56 is assigned, without waiting for the commit of the current task 52 to complete, so that the data is handed over to the next task 56 in advance. The data replication unit 42 may copy the session data held in the memory image of the current task 52 on the grandchild node A and then transmit it to the grandchild node B, or it may instruct the grandchild node A to transmit the session data to the grandchild node B.
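
  The cooperation between the task pre-execution instruction unit 34 and the data replication unit 42 can be summarized in the following sketch. It is a minimal illustration under assumed names (PreExecutionScheduler, GrandchildNode, on_commit_requested); it is not the patent's implementation.

```python
# Hypothetical sketch: when the current task reports that it has requested a
# commit, the scheduler copies the session data to the node of the next task
# and starts the next task on the assumption that the commit will succeed.
class GrandchildNode:
    """Stub for the node to which the next task is assigned (grandchild node B)."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.session_data = {}

    def receive_session_data(self, task_name, data):
        self.session_data[task_name] = data

    def start_task(self, task_name):
        print(f"{self.node_id}: pre-executing {task_name} with {self.session_data[task_name]}")


class PreExecutionScheduler:
    def __init__(self, next_node, next_task_name):
        self.next_node = next_node
        self.next_task_name = next_task_name

    def on_commit_requested(self, session_data):
        """Called when the current task tells the scheduler that a commit is in flight."""
        # Data replication unit: hand the session data over without waiting for the commit.
        self.next_node.receive_session_data(self.next_task_name, dict(session_data))
        # Task pre-execution instruction unit: start the next task in advance,
        # predicting that the transaction will end normally.
        self.next_node.start_task(self.next_task_name)


# Usage with a stand-in node and session data.
scheduler = PreExecutionScheduler(GrandchildNode("grandchild-B"), "next_task")
scheduler.on_commit_requested({"user": "alice", "pending_write": {"order": 42}})
```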

  Session data includes data required for communication with the client terminal, such as the IP address of the client terminal, the user name, the ID of the node on which the task is executed, and the task status, as well as the data that the current task is to write to the database in the transaction and the data necessary for executing the next task.
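
  The items listed above could be collected in a structure like the following; the field names are hypothetical and only mirror the enumeration in the text.

```python
# Hypothetical container for the session data handed from the current task to the next task.
from dataclasses import dataclass, field


@dataclass
class SessionData:
    client_ip: str                                        # IP address of the client terminal
    user_name: str                                        # name of the user
    node_id: str                                          # ID of the node executing the task
    task_status: str                                      # current status of the task
    pending_writes: dict = field(default_factory=dict)    # data the current task is writing in the transaction
    next_task_input: dict = field(default_factory=dict)   # data needed to execute the next task
```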

  Because the data replication unit 42 duplicates the session data on both the grandchild node A and the grandchild node B, the session data is preserved even if a failure occurs in either node, and subsequent communication with the client terminal remains possible; this also has the effect of increasing the fault tolerance of the system.

  Once the data replication unit 42 has delivered the session data to the next task 56 and the task pre-execution instruction unit 34 has instructed the grandchild node B to execute the next task 56, execution of the next task 56 can be started.

  When both the current task 52 and the next task 56 involve a screen display on the client terminal 12, executing the two tasks in parallel and sending the processing result of each task to the client terminal 12 for display would confuse the user. Therefore, the screen display control unit 44 coordinates the screen display when a plurality of tasks are executed simultaneously.

  FIG. 4 is a diagram for explaining control by the screen display control unit 44. In the figure, the horizontal axis represents the passage of time, and the solid line represents the state in which the screen is displayed on the display of the client terminal 12.

  While the screen for the current task is being displayed, a commit is requested to the transaction manager 36 by the current task, and when this is triggered, the task preceding execution instruction unit 34 instructs the start of the next task (t1). The screen display control unit 44 switches the reception destination of the screen display data as the task processing result from the current task to the next task. Then, the screen display data of the current task is accumulated in the buffer, and the screen display data of the next task based on the prediction of the processing result of the current task is transmitted to the client terminal 12 (t1 to t2). This makes it appear to the user that the current task is finished and control is transferred to the next task.

  When the transaction of the current task is completed normally (t2), the screen display data of the current task accumulated in the buffer is discarded, and transmission of the screen display data of the next task 56 is continued (after t2). When the current task is not completed normally, a failure processing process is executed by the failure processing unit 46 described later.

  The screen display control unit 44 may have a function of buffering input made by the user via the client terminal 12 while the receiving destination of the screen display data is switched from the current task to the next task (t1 to t2). The input made during this time is sent to the next task after the transaction of the current task is completed normally (after t2).

  In this way, the data replication unit 42 and the screen display control unit 44 keep the user operating the client terminal 12 from noticing that two tasks are being executed simultaneously, so the pre-execution of the next task can proceed without the user being aware of it.
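
  A rough sketch of the screen display control between t1 and t2 follows. The class and method names are assumptions made for illustration; the controller buffers the current task's display data and the user's input until the commit result is known.

```python
# Hypothetical sketch of the screen display control unit 44 (FIG. 4, t1 to t2).
class ScreenDisplayController:
    def __init__(self, send_to_client, deliver_input):
        self.send_to_client = send_to_client   # transmits screen display data to the client terminal
        self.deliver_input = deliver_input     # forwards user input to the task in charge
        self.switched = False                  # True between t1 and t2
        self.buffered_display = []             # display data of the current task, held back
        self.buffered_input = []               # user input made between t1 and t2

    def switch_to_next_task(self):             # t1: current task requested a commit, next task started
        self.switched = True

    def on_display_data(self, source_task, data):
        if self.switched and source_task == "current":
            self.buffered_display.append(data)  # accumulated in the buffer, not shown
        else:
            self.send_to_client(data)           # the next task's display data reaches the client

    def on_user_input(self, data):
        if self.switched:
            self.buffered_input.append(data)    # held until the commit result is known
        else:
            self.deliver_input(data)

    def on_commit_result(self, success):        # t2: transaction result of the current task arrives
        self.switched = False
        if success:
            self.buffered_display.clear()       # discard the current task's buffered display data
            for data in self.buffered_input:
                self.deliver_input(data)        # send the held-back input on to the next task
        self.buffered_input.clear()
        # On failure, the failure processing unit described below takes over.
```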

  Returning to FIG. 3, when a failure occurs before the transaction of the current task 52 is completed, the failure processing unit 46 causes the grandchild node A and the grandchild node B, to which the current task 52 and the next task 56 are assigned, to carry out a failure handling process predetermined for each current task or service request. If the transaction of the current task 52 ends normally, nothing is done. The failure handling process consists, for example, of (1) interrupting the next task, (2) rolling back the database if the next task has written to or changed it based on the predicted success of the commit of the current task, (3) sending an error message to the client terminal if a message has already been sent to it based on the predicted success of the commit of the current task, and (4) leaving the failure status in a log.
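
  As an illustration only, the four example measures above might be combined in a routine like the following; the function names, arguments, and messages are assumptions, not part of the patent.

```python
# Hypothetical failure handling routine run when the current task's commit fails.
import logging

logging.basicConfig(level=logging.INFO)


class NextTaskHandle:
    """Stub handle to the pre-executed next task."""

    def __init__(self, transaction_id=None):
        self.transaction_id = transaction_id   # set if the next task already opened a transaction
        self.running = True

    def interrupt(self):
        self.running = False


def handle_commit_failure(next_task, rollback, notify_client, message_already_sent, reason):
    next_task.interrupt()                                   # (1) interrupt the next task
    if next_task.transaction_id is not None:
        rollback(next_task.transaction_id)                  # (2) roll back changes made on the prediction
    if message_already_sent:
        notify_client("An error occurred; the request was not completed.")  # (3) error message
    logging.info("commit failure handled: %s", reason)      # (4) leave the failure status in a log


# Usage with trivial stand-ins.
handle_commit_failure(
    NextTaskHandle(transaction_id="tx-42"),
    rollback=lambda tx: logging.info("rollback %s", tx),
    notify_client=print,
    message_already_sent=True,
    reason="database capacity exceeded",
)
```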

  Next, the transaction node 16 will be described. The transaction node 16 is a node in charge of writing, rewriting, and reading data in the database. In FIG. 1, the transaction node 16 is drawn outside the node group 10, but it may be a node included in the node group 10. The transaction node 16 includes a transaction manager 36, a database writer 38, and a log writer 40.

  The transaction manager 36 sits between the task and the database and has the function of controlling and monitoring the transaction. When the current task 52 executing on the grandchild node A needs to access the database 26, a transaction ID is assigned to the transaction, and the current task 52 calls the transaction manager 36 in the transaction node 16 by referring to a call library held in advance. The transaction ID is passed to the scheduler 30 and the database writer 38. When the current task 52 requests the transaction manager 36 to commit, this fact is also conveyed to the scheduler 30 by the current task 52.

  When the task pre-execution instruction unit 34 of the scheduler 30 receives the commit request information, it predicts the commit result of the current task. Normally, it predicts that the commit of the current task will succeed. On the assumption that the commit succeeds, the task pre-execution instruction unit 34 instructs the start of the next task 56 assigned to the grandchild node B.

  Once the task requests the transaction manager 36 to commit data, the transaction manager 36 executes the subsequent processing. Therefore, it is not necessary to issue a plurality of instructions to the database on the task side, and the transaction processing is hidden from the task.

  The transaction manager 36 also communicates the success or failure of the commit to the scheduler via the current task. Based on this, the failure processing unit 46 of the scheduler decides whether to execute the failure handling process.

  The database writer 38 writes and rewrites data in the database 26.

  The log writer 40 records changes in the database state by the database writer 38 and the order of commits in a log. By referring to this log, the database can be rolled back when a failure occurs. “Rollback” is a process of canceling database changes, and even when new data is written to the database, the writing can be discarded before committing. When a rollback is performed, the database is maintained in its original state with no data changes.

  FIG. 5 shows a general processing procedure of a distributed transaction by the transaction manager 36. In a distributed transaction that accesses a plurality of databases in one transaction, a method called two-phase commit is adopted to maintain data consistency of each database. In the two-phase commit, changes to all databases are reflected together, or the process is terminated without changing any of them.

  One transaction is composed of one or more SQL statements. First, the task requests the transaction manager to commit (S100). Subsequently, as the first phase, the transaction manager 36 transmits a preparation command (prepare), together with the SQL statements, to check whether database A and database B can commit (S102, S106). At this point, the data to be written is recorded in the database log as SQL statements, and the actual writing is not performed until a commit command is issued from the transaction manager to the database. Upon receiving the command, database A and database B inform the transaction manager 36 whether or not the log writing was performed normally. When the log writer 40 confirms that the log writing was performed normally, preparation completion (ok) is transmitted to the transaction manager 36 (S104, S108). If the log writing ended abnormally, the transaction manager is informed that preparation has not been completed.

  In the second phase, the transaction manager 36 transmits a commit command or a rollback command to all the databases based on the notification from the databases A and B that the preparation is completed or not. Only when the notification of preparation completion is received from both the database A and the database B, the transaction manager sends a commit command to both databases (S110, S114). Database A and database B interpret SQL statements written in the log and reflect them in the table.

  If some failure occurs before the transaction manager orders the commit, or if a notification that preparation is not complete is received from either database, the transaction manager sends a rollback command to all the databases.

  Examples of failures include a time stamp that does not match the database update rules (outside permitted hours, etc.), a table being locked by another process, no notification from the database within the specified time, hardware failures, and communication failures.

  When the data update in databases A and B succeeds (S112, S116), this information is transmitted to the transaction manager (S118, S119). The transaction manager 36 notifies the current task of the transaction execution result. If the transaction manager has issued a commit command and the update in the databases has succeeded, transaction success information is sent to the task (S120). If the transaction manager has ordered a rollback, or if it has ordered a commit but no data update success notification is received from the databases within the specified time, transaction failure information is sent to the task.

  In this way, by executing all the write operations for a plurality of databases within one transaction, it is possible to maintain the data consistency of all the databases even if some failure occurs during the transaction.
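
  The sequence S100 to S120 can be condensed into the following sketch. The Database stub stands in for databases A and B, and the prepare step corresponds to the log writing described above; names and structure are assumptions, not the patent's code.

```python
# Hypothetical sketch of the two-phase commit coordinated by the transaction manager.
class Database:
    def __init__(self, name, prepare_ok=True):
        self.name = name
        self.prepare_ok = prepare_ok
        self.log = []      # SQL written by the log writer during the prepare phase
        self.table = []    # data reflected in the table after a commit

    def prepare(self, sql):        # phase 1: record the SQL in the log, report ok / not completed
        if self.prepare_ok:
            self.log.append(sql)
        return self.prepare_ok

    def commit(self):              # phase 2: interpret the logged SQL and reflect it in the table
        self.table.extend(self.log)
        self.log.clear()

    def rollback(self):            # discard the logged SQL, leaving the table unchanged
        self.log.clear()


def two_phase_commit(databases, sql):
    """Transaction-manager side: commit everywhere or roll back everywhere."""
    prepared = all(db.prepare(sql) for db in databases)   # first phase (prepare)
    if prepared:
        for db in databases:
            db.commit()                                    # second phase (commit)
        return "success"                                   # reported back to the current task
    for db in databases:
        db.rollback()                                      # second phase (rollback)
    return "failure"


print(two_phase_commit([Database("A"), Database("B")], "INSERT INTO orders VALUES (1)"))
print(two_phase_commit([Database("A"), Database("B", prepare_ok=False)], "INSERT INTO orders VALUES (2)"))
```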

  The transactions in the present embodiment preferably conform to BTP (Business Transaction Protocol) established by OASIS (Organization for the Advancement of Structured Information Standards), but they are not limited to this and may conform to other protocols.

  FIG. 6 is a flowchart for explaining the pre-execution of a task in the lattice type computer system 100 according to the present embodiment.

  When the current task 52 requests the transaction manager 36 to commit data (S10), the current task 52 informs the scheduler 30 that a commit has been requested (S12). In response, the task pre-execution instruction unit 34 predicts that the commit will complete normally and instructs the grandchild node B, to which the next task 56 has been assigned, to start executing it (S14). Note that the task processing program of the next task 56 may be loaded onto the grandchild node B in advance of S14, or the grandchild node B may be instructed in S14 to acquire the corresponding task processing program from the storage device 24.

  The data replication unit 42 copies the session data of the current task 52 and transmits it to the next task 56, and the pre-execution of the next task 56 is started (S16). The transaction manager 36 controls the transaction with the database 26 and notifies the scheduler 30 of the transaction result via the current task 52 (S18). If the transaction succeeds (Y in S20), the current task 52 is terminated, and execution of the next task 56 is continued on the grandchild node B (S22). If the transaction has failed (N in S20), the next task is terminated, and a predetermined failure handling process is executed by the failure processing unit 46 (S24).

  Of course, the notification of the transaction result in S18 may arrive not after but before execution of the next task 56 has started. In that case, it is preferable that the task pre-execution instruction unit 34 does not pre-execute the next task 56 but instead instructs the grandchild node B to execute the next task 56 after confirming the result of the commit of the current task 52.

  When the commit of the current task fails, processing is performed according to a failure processing process predetermined according to the type of task and each failure case, or a failure processing process predetermined by the scheduler. However, if a failure handling process is defined at the application level according to the service request, it is desirable to give priority to this. For example, if a fault handling process that terminates all tasks when a commit fails is defined in the application, it is meaningless to execute the fault handling process related to each task.

  Next, two cases of processing a service request including a distributed transaction in the lattice type computer system 100 will be described, and the task pre-execution process in this embodiment will be described more specifically.

(Example 1)
FIG. 7 is an overall configuration diagram of a system for making a reservation to purchase a car via the Internet. In FIG. 7, elements denoted by the same reference numerals as those in FIGS. 1 and 3 have the same functions as those described above with reference to FIGS. 1 and 3, and detailed description of them is omitted. Further, since FIG. 7 assumes that only one service request is handled, the parent node that assigns child nodes, the super scheduler, and other nodes that are not involved in the service are omitted.

  When the user accesses the site that accepts car purchase reservations via the client terminal 12, a desired vehicle type data input screen 70 and a customer attribute data input screen 72 are displayed on the display of the client terminal 12. When the user enters the required items on these screens and clicks the send button, a service request including the desired vehicle type data and the customer attribute data is transmitted to the child node 20. The desired vehicle type data includes information such as the name, model, and color of the vehicle to be reserved for purchase, and the customer attribute data includes information such as the user's name, telephone number, and mail address.

  When the scheduler 30 of the child node 20 receives the service request, it reads out an application associated in advance from the storage device. Here, it is assumed that this application includes a purchase reservation task 62 and a delivery date notification task 66. The task assignment unit 32 of the scheduler 30 assigns the purchase reservation task 62 to the grandchild node A and the delivery date notification task 66 to the grandchild node B.

  First, the purchase reservation task 62 is executed in the grandchild node A. The purchase reservation task 62 needs to write the desired vehicle type data to the order management table 82 at the same time as writing the customer attribute data to the customer management table 80 in the database. Therefore, the purchase reservation task 62 calls the transaction manager in the transaction node 16 and requests data commit.

  When the task pre-execution instruction unit 34 of the scheduler 30 learns that the purchase reservation task 62 has requested the transaction manager to commit data, it instructs the grandchild node B to start executing the delivery date notification task 66 before the commit is completed. In addition, the data replication unit 42 of the scheduler 30 copies the desired vehicle type data 68, which is held by the purchase reservation task 62 and is being written to the order management table 82, and delivers it to the delivery date notification task 66.

  The delivery date notification task 66, on the premise that the commit requested by the purchase reservation task 62 will succeed, refers to the desired vehicle type data 68 and searches the inventory management table 84 and the production management table 86 of the database 26 to find out when the desired vehicle can be shipped. The inventory management table 84 records the number of vehicles in stock for each car name, model, and color, and the production management table 86 records the planned production numbers for each car name, model, and color. By checking these, the delivery date notification task 66 calculates the delivery date of the car and transmits a message 74 including the delivery date to the client terminal 12. The client terminal 12 displays the message 74 on the screen.

  If the commit by the purchase reservation task 62 is successful, the purchase reservation task 62 ends and the delivery date notification task 66 is executed as it is. If the commit fails, the failure processing unit 46 of the scheduler 30 executes a predetermined failure processing process and transmits an error message 76 to the client terminal 12.

  Thus, by executing the delivery date notification task before the commit request by the purchase reservation task is completed, the overall response time to the service request can be shortened. The reason why such a task can be started in advance is based on the fact that table update processing by the transaction manager succeeds with high probability.

  FIG. 8 is a flowchart for explaining the task pre-execution process in the system of FIG. 7. First, a service request is made from the client terminal 12 (S30). The task assignment unit 32 of the scheduler 30 assigns the purchase reservation task 62 and the delivery date notification task 66 corresponding to the service request to appropriate nodes in the system (S32). Execution of the purchase reservation task 62 is started, and the transaction manager 36 of the transaction node 16 is requested to commit data (S34). Subsequently, the desired vehicle type data copied from the purchase reservation task 62 by the data replication unit 42 is transmitted to the delivery date notification task 66, and the task pre-execution instruction unit 34 instructs the start of execution of the delivery date notification task 66 (S36). The delivery date notification task 66 refers to the database 26 and transmits a message 74 including a delivery date to the client terminal 12 (S38).

  The transaction manager 36 notifies the scheduler 30 of the success or failure of the commit requested from the purchase reservation task 62 via the purchase reservation task 62 (S40). If the commit is successful (Y in S42), the purchase reservation task 62 is terminated and the delivery date notification task 66 is continued (S44). If the commit has failed (N in S42), the failure processing unit 46 executes a predetermined failure processing process (S46).
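
  Purely as an illustration, the flow S30 to S46 could be simulated as follows; the task functions, the data, and the stand-in tables are hypothetical and chosen only to mirror the example.

```python
# Hypothetical end-to-end simulation of FIG. 8: the delivery date notification
# task is pre-executed while the purchase reservation task's commit is pending.
def delivery_date_notification(vehicle, send_to_client):
    """Pre-executed next task: looks up stock/production and reports a delivery date."""
    stock = {("X", "red"): "2 weeks"}                      # stands in for tables 84 and 86
    send_to_client(f"Estimated delivery: {stock[(vehicle['name'], vehicle['color'])]}")


def process_service_request(commit_succeeds=True):
    messages = []
    # S34: the purchase reservation task prepares its data and requests a commit.
    order = {"customer": "alice", "vehicle": {"name": "X", "color": "red"}}
    # S36-S38: before the commit finishes, the delivery date task runs with the copied data.
    delivery_date_notification(order["vehicle"], messages.append)
    # S40-S46: the commit result arrives; on failure the failure handling process runs.
    if not commit_succeeds:
        messages.append("Error: your reservation could not be completed.")
    return messages


print(process_service_request(commit_succeeds=True))   # delivery date only
print(process_service_request(commit_succeeds=False))  # delivery date, then error message
```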

The failure processing process in S46 differs depending on the cause of the failure.
FIG. 9 is a flowchart of an example of a failure handling process. Here, it is assumed that the commit fails due to insufficient database capacity or hardware failure. In this case, the transaction manager 36 notifies the scheduler 30 that the commit has failed via the purchase reservation task 62 (S50). The purchase reservation task 62 transmits to the client terminal 12 an error message 76 prepared in advance for output when a failure occurs (S52). The failure processing unit 46 of the scheduler 30 interrupts the delivery date notification task 66 and forcibly terminates it (S54). If a transaction with the database 26 has already been executed, the delivery date notification task 66 notifies the transaction manager 36 to roll back the transaction (S56).

  FIG. 10 is a flowchart of another example of the failure handling process. Here, it is assumed that the commit fails due to an application error such as an abnormal stop of the purchase reservation task 62. In this case, an error notification from the transaction manager 36 is not made. Therefore, the scheduler 30 periodically checks the operation status of the purchase reservation task 62 to detect a commit failure (S60). The failure processing unit 46 notifies the transaction manager 36 to roll back the transaction with the purchase reservation task 62 using the transaction ID received in advance from the purchase reservation task 62 (S62). Since no error message is output from the purchase reservation task 62 this time, the scheduler 30 transmits an error message 76 prepared in advance to the client terminal 12 (S64). The failure processing unit 46 interrupts the delivery date notification task 66 and forcibly terminates it (S66). If a transaction with the database 26 has been executed, the delivery date notification task 66 notifies the transaction manager 36 to roll back the transaction (S68).
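
  A sketch of the polling-based detection in S60 to S62 is given below; the callbacks, the polling interval, and the return values are assumptions introduced for illustration.

```python
# Hypothetical sketch: the scheduler polls the current task and rolls back by
# transaction ID if the task has stopped before its commit completed.
import time


def watch_current_task(is_task_alive, commit_finished, rollback, transaction_id,
                       poll_interval=1.0, max_checks=5):
    for _ in range(max_checks):
        if commit_finished():
            return "committed"
        if not is_task_alive():                 # S60: abnormal stop detected by polling
            rollback(transaction_id)            # S62: roll back using the stored transaction ID
            return "rolled back"
        time.sleep(poll_interval)
    return "still running"


# Usage with trivial stand-ins: the task has crashed and the commit never finished.
print(watch_current_task(
    is_task_alive=lambda: False,
    commit_finished=lambda: False,
    rollback=lambda tx: print(f"rollback {tx}"),
    transaction_id="tx-99",
))
```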

(Example 2)
FIG. 7 described the pre-execution of a task in a case that is completed by accessing a database inside the system, but this embodiment can also be applied when a database outside the system is accessed.

  FIG. 11 is an overall configuration diagram of a travel arrangement system in a travel agency. In this system, other systems built outside are accessed to complete the task processing. In FIG. 11 as well, elements having the same reference numerals as those in FIGS. 1 and 3 have the same functions as those described above with reference to FIGS. 1 and 3. Since FIG. 11 also assumes that only one service request is handled, the parent node that assigns child nodes, the super scheduler, and other nodes that are not involved in the service are omitted.

  In this example, the air ticket reservation system and the accommodation reservation system are provided as independent services on the network, and the travel arrangement system accesses these external systems to complete the travel arrangement processing.

  When the user accesses a site that accepts travel arrangements via the client terminal 12, an itinerary input screen 160 is displayed on the display of the client terminal 12. When the user inputs data such as his / her name, flight boarding date / time, departure place, destination, and the like and clicks the send button, a service request including these data is sent to the child node 20.

  When the scheduler 30 of the child node 20 receives the service request, it reads out an application associated with it in advance from the storage device. This application includes an itinerary management task 112, an airline ticket arrangement task 114, an accommodation arrangement task 116, and a billing task 118. The task assignment unit 32 of the scheduler 30 assigns the itinerary management task 112 to the grandchild node A, the airline ticket arrangement task 114 to the grandchild node B, the accommodation arrangement task 116 to the grandchild node C, and the billing task 118 to the grandchild node D.

  First, in the grandchild node A, the itinerary management task 112 is executed. The itinerary management task 112 needs to write the itinerary data in the itinerary management table 156 in the database. Therefore, the itinerary management task 112 calls the transaction manager 36 in the transaction node 16 and requests data commit.

  When the task pre-execution instruction unit 34 of the scheduler 30 learns that the itinerary management task 112 has requested the transaction manager 36 to commit data, it instructs the grandchild node B and the grandchild node C, before the commit is completed, to start executing the airline ticket arrangement task 114 and the accommodation arrangement task 116, respectively. The data replication unit 42 of the scheduler 30 copies, from the itinerary data held by the itinerary management task 112 and being written to the itinerary management table 156, the airline ticket data 120 necessary for arranging the airline ticket, and delivers it to the airline ticket arrangement task 114. The data replication unit 42 also copies the accommodation data 122 necessary for arranging the accommodation from the itinerary data and delivers it to the accommodation arrangement task 116.

  The airline ticket arrangement task 114 and the accommodation arrangement task 116 operate on the premise that the commit requested by the itinerary management task 112 will succeed. The airline ticket arrangement task 114 accesses the air ticket reservation system 140 outside the system via the network, refers to the seat reservation table 142, and executes the airline ticket reservation procedure based on the airline ticket data 120. The seat reservation table 142 records the reservation status of a large number of flights by date and time, departure place, and destination, and the airline ticket arrangement task 114 reserves the flights that meet the conditions.

  The accommodation arrangement task 116 accesses the accommodation reservation system 150 outside the system via the network, refers to the accommodation reservation table 152, and executes the accommodation reservation procedure based on the accommodation data 122. The accommodation reservation table 152 records the reservation status of a large number of accommodation facilities by date and destination, and the accommodation arrangement task 116 reserves the accommodation facilities that meet the conditions.

  The airline ticket arrangement task 114 and the accommodation arrangement task 116 also involve transactions with databases in the external systems they access, and the data must be updated when a reservation is made. Therefore, when the airline ticket arrangement task 114 and the accommodation arrangement task 116 access the external systems and request data commits, the task pre-execution instruction unit 34 of the scheduler 30 may instruct the start of execution of the billing task 118. At this time, the data replication unit 42 copies the airline ticket data 124 from the airline ticket arrangement task 114 and the accommodation data 126 from the accommodation arrangement task 116, and delivers them to the billing task 118. Here, the airline ticket data 124 includes information on the flight that the airline ticket arrangement task 114 is about to reserve, and the accommodation data 126 includes information on the accommodation facility that the accommodation arrangement task 116 is about to reserve.

  The billing task 118 operates on the premise that the commits requested for the reservations by the airline ticket arrangement task 114 and the accommodation arrangement task 116 will succeed. The billing task 118 accesses the air ticket reservation system 140 via the network and refers to the air fare table 144 to look up the fare of the flight that the airline ticket arrangement task 114 is reserving. The air fare table 144 records the fare for each flight. The billing task 118 also accesses the accommodation reservation system 150 via the network and refers to the accommodation fee table 154 to look up the fee of the accommodation facility that the accommodation arrangement task 116 is reserving. The accommodation fee table 154 records the fee for each accommodation facility. The billing task 118 transmits to the client terminal 12 a message 162 that includes the air fare and accommodation fee data looked up in this way. The client terminal 12 displays the message 162 on the screen.

  If the commits by the itinerary management task 112, the airline ticket arrangement task 114, and the accommodation arrangement task 116 succeed, these tasks are terminated and the billing task 118 continues to be executed. If any of these commits fails, the failure processing unit 46 of the scheduler 30 executes a predetermined failure handling process, all the transactions executed by the tasks are rolled back, and an error message is sent to the client terminal 12.

  FIG. 12 is a flowchart for explaining the task pre-execution process in the system of FIG. 11. First, a service request is made from the client terminal 12 (S70). The task assignment unit 32 of the scheduler 30 assigns each of the tasks 112 to 118 corresponding to the service request to an appropriate node in the system (S72). Execution of the itinerary management task 112 is started, and the transaction manager 36 of the transaction node 16 is requested to commit data (S74). In response, the airline ticket data 120 copied from the itinerary management task 112 by the data replication unit 42 is transmitted to the airline ticket arrangement task 114, and the task pre-execution instruction unit 34 instructs the grandchild node B to start executing the airline ticket arrangement task 114. Likewise, the accommodation data 122 copied from the itinerary management task 112 is transmitted to the accommodation arrangement task 116, and the task pre-execution instruction unit 34 instructs the grandchild node C to execute the accommodation arrangement task 116 (S76). Further, while the airline ticket arrangement task 114 and the accommodation arrangement task 116 are requesting commits of data to the corresponding tables, the airline ticket data 124 and the accommodation data 126 may be copied by the data replication unit 42 and transmitted to the billing task 118, and the billing task 118 may calculate the fee to be charged to the user based on these data (S78). The billing task 118 refers to the air fare table 144 and the accommodation fee table 154 and transmits a message containing the processing result of the task, including the billed fee, to the client terminal 12 (S80).

  The transaction manager 36 notifies the scheduler 30 of the success or failure of the commit requested from the itinerary management task 112 via the itinerary management task 112 (S82). If the commit is successful (Y in S84), the itinerary management task 112 is terminated and other tasks are continued (S86). If the commit has failed (N in S84), the failure processing unit 46 performs a predetermined failure processing process (S88).

  As described above, according to the present embodiment, in distributed computing in which a plurality of serially related tasks derived from a service request are assigned to different nodes, execution of the subsequent task is started in advance, before the commit of the task currently being committed is completed. As a result, regardless of the transaction load on the database, the waiting time for the commit of the current task is substantially reduced, and the overall response time to the service request can be shortened. In this way, the present embodiment can create a situation in which a plurality of tasks in a serial relationship are temporarily processed in parallel.

  The present invention has been described based on several embodiments. Those skilled in the art will understand that these embodiments are illustrative, that various modifications can be made to the combinations of the constituent elements and processing steps, and that such modifications are also within the scope of the present invention.

  It should also be understood by those skilled in the art that the functions to be fulfilled by the constituent elements described in the claims are realized by a single function block shown in the present embodiment or a combination thereof.

  Data communication between nodes may be performed using a shared storage device, or may be performed directly between nodes using, for example, a DAFS (Direct Access File System) protocol.

  The transaction manager may be omitted, and the task itself may execute the transaction with the database directly. In this case, the scheduler needs to monitor the success or failure of the transaction periodically.

  In the embodiment, it has been described that the transaction manager uses a two-phase commit technique to coordinate transactions with the database. However, the above-described embodiment can be applied to any method in which there is a waiting time until the commit for finalizing writing to the database is completed.

  In the embodiment, it was described that the scheduler assigns serially related tasks to grandchild nodes adjacent to the child node. This is because high-speed processing is expected when serial tasks are placed on nodes adjacent to each other, but the present invention can be applied regardless of where in the node group the node executing a task is located.

  Although the embodiment has been described using a lattice computer system, the present invention can be applied to any processing system capable of distributing tasks.

Brief Description of the Drawings

FIG. 1 is an overall configuration diagram of a lattice computer system according to an embodiment of the present invention and a client terminal connected thereto.
FIG. 2 is a diagram showing the configuration of each node constituting the node group.
FIG. 3 is a functional block diagram showing the configuration of the scheduler of a child node, which enables the pre-execution of a task, and of the transaction node.
FIG. 4 is a diagram explaining control by the screen display control unit.
FIG. 5 is a diagram showing the general processing procedure of a distributed transaction by the transaction manager.
FIG. 6 is a flowchart explaining the pre-execution of a task in the lattice computer system according to this embodiment.
FIG. 7 is an overall configuration diagram of a system for making a car purchase reservation via the Internet.
FIG. 8 is a flowchart explaining the task pre-execution process in the system of FIG. 7.
FIG. 9 is a flowchart of an example of a failure handling process.
FIG. 10 is a flowchart of another example of a failure handling process.
FIG. 11 is an overall configuration diagram of a travel arrangement system in a travel agency.
FIG. 12 is a flowchart explaining the task pre-execution process in the system of FIG. 11.

Explanation of symbols

10 node group, 12 client terminal, 14 network, 16 transaction node, 18 parent node, 20 child node, 22 super scheduler, 24 storage device, 30 scheduler, 32 task assignment unit, 34 task pre-execution instruction unit, 36 transaction manager, 38 database writer, 40 log writer, 42 data replication unit, 44 screen display control unit, 46 failure processing unit, 50 grandchild node A, 52 current task, 54 grandchild node B, 56 next task, 100 lattice computer system.

Claims (8)

  1. A scheduler program executed by a schedule node in a server system that includes a schedule node responsible for scheduling tasks among a plurality of nodes each having a processor, a task execution node for executing a task assigned by the schedule node, and a transaction node that handles processing for a database storing predetermined data, the scheduler program causing the schedule node to implement:
    an assignment function that, in response to a service request given from a client terminal connected to the server system, assigns a plurality of tasks whose execution order is determined to different task execution nodes in the server system;
    a task pre-execution instruction function that, when the transaction node has been requested to perform data persistence processing for finalizing data writing or data rewriting that occurred in a transaction between one task execution node and the database, starts processing of a subsequent task at another task execution node before notification that the persistence processing has completed normally is received from the transaction node, while the current task is being executed at the one task execution node; and
    a data replication function that transmits the data required for executing the processing of the subsequent task to the other task execution node on which the subsequent task is executed.
  2. The scheduler program according to claim 1, further comprising a screen display control function that, when the processing of the subsequent task is started, switches the delivery of screen display data, which is a task processing result, from the current task to the subsequent task, stores the screen display data of the current task in a buffer, and transmits the screen display data of the subsequent task to the client terminal.
  3. The scheduler program according to claim 1, further comprising a failure processing function that, when a failure occurs before the transaction of the current task is completed, causes a failure handling process predetermined for each current task or for each service request to be executed at the task execution nodes to which the current task and the subsequent task are assigned.
  4. A server system including a plurality of nodes each having a processor, and a database storing predetermined data, the server system comprising:
    a child node that is a schedule node for executing the scheduler program according to claim 1;
    a grandchild node that is a task execution node for executing a task assigned by the allocation function; and
    a transaction node that handles processing for the database.
  5. The server system according to claim 4, further comprising a storage device that is connected to the plurality of nodes, is accessible from any of the nodes, and stores the scheduler program according to claim 1.
  6. A scheduler device arranged in a server system that includes a plurality of task execution nodes, each having a processor and executing an assigned task, and a transaction node handling processing for a database storing predetermined data, the scheduler device comprising:
    a task allocation unit that, in response to a service request given from a client terminal connected to the server system, allocates a plurality of tasks whose execution order is predetermined to different task execution nodes in the server system;
    a task pre-execution instruction unit that, when the transaction node is requested to perform data persistence processing for finalizing data writing or data rewriting occurring in a transaction between one task execution node and the database, starts processing of a subsequent task at another task execution node before notification that the persistence processing has completed normally is received from the transaction node, while the current task is still being executed at the one task execution node; and
    a data replication unit that transmits the data required for executing the processing of the subsequent task to the other task execution node where the subsequent task is executed.
  7. The scheduler device according to claim 6, further comprising a screen display control unit that, when the processing of the subsequent task is started, switches the delivery of screen display data, which is a task processing result, from the current task to the subsequent task, stores the screen display data of the current task in a buffer, and transmits the screen display data of the subsequent task to the client terminal.
  8. The scheduler device according to claim 6, further comprising a failure processing unit that, when a failure occurs before the transaction of the current task is completed, causes a failure handling process predetermined for each current task or for each service request to be executed at the task execution nodes to which the current task and the subsequent task are assigned.
JP2006089542A 2006-03-28 2006-03-28 Scheduler program, server system, scheduler device Expired - Fee Related JP4571090B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2006089542A JP4571090B2 (en) 2006-03-28 2006-03-28 Scheduler program, server system, scheduler device

Publications (2)

Publication Number Publication Date
JP2007265043A JP2007265043A (en) 2007-10-11
JP4571090B2 true JP4571090B2 (en) 2010-10-27

Family

ID=38637962

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2006089542A Expired - Fee Related JP4571090B2 (en) 2006-03-28 2006-03-28 Scheduler program, server system, scheduler device

Country Status (1)

Country Link
JP (1) JP4571090B2 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5326308B2 (en) 2008-03-13 2013-10-30 日本電気株式会社 Computer link method and system
US8549536B2 (en) * 2009-11-30 2013-10-01 Autonomy, Inc. Performing a workflow having a set of dependancy-related predefined activities on a plurality of task servers

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000047887A (en) * 1998-07-30 2000-02-18 Toshiba Corp Speculative multi-thread processing method and its device
JP2001075802A (en) * 1999-09-08 2001-03-23 Fujitsu Ltd Speculative execution device and verification device
JP2001101045A (en) * 1999-09-29 2001-04-13 Toshiba Corp Method and system for processing transaction
JP2005063139A (en) * 2003-08-12 2005-03-10 Toshiba Corp Computer system and program

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH06332773A (en) * 1993-05-21 1994-12-02 Nec Corp Data base updating system
JPH1031606A (en) * 1996-07-17 1998-02-03 Nec Corp Method and system for updating interactive file
JP3550289B2 (en) * 1997-11-28 2004-08-04 富士通株式会社 Operation information management method in multi-cluster system, multi-cluster system and program storage medium for online operation information management

Also Published As

Publication number Publication date
JP2007265043A (en) 2007-10-11

Legal Events

A621  Written request for application examination (effective date: 2008-09-08)
A977  Report on retrieval (intermediate code A971007; effective date: 2009-12-03)
A131  Notification of reasons for refusal (effective date: 2009-12-22)
A521  Written amendment (intermediate code A523; effective date: 2010-02-22)
A131  Notification of reasons for refusal (effective date: 2010-06-01)
A521  Written amendment (intermediate code A523; effective date: 2010-07-16)
TRDD  Decision of grant or rejection written
A01   Written decision to grant a patent or to grant a registration (utility model) (effective date: 2010-08-10)
A61   First payment of annual fees during grant procedure (effective date: 2010-08-11)
FPAY  Renewal fee payment (payment until: 2013-08-20; year of fee payment: 3)
R150  Certificate of patent or registration of utility model
R250  Receipt of annual fees (recorded four times)
LAPS  Cancellation because of no payment of annual fees