JPH08235043A - Cooperative distributed system

Info

Publication number: JPH08235043A
Application number: JP7064830A
Authority: JP
Inventors: Kenichi Abe (阿部賢一); Yukiharu Imafuku (今福幸春); Hitoshi Kirita (切田仁); Toshiyuki Inoue (井上利行)
Original assignee: N T T DATA TSUSHIN KK; NTT Data Communications Systems Corp
Current assignee: N T T DATA TSUSHIN KK; NTT Data Corp
Priority date: 1995-02-28
Filing date: 1995-02-28
Publication date: 1996-09-13

Abstract

PURPOSE: To provide a cooperative distributed system that can process transactions correctly and smoothly and can always assure the consistency of its databases. CONSTITUTION: A plurality of processors 1a, 1b, 1c... are connected via a communication network 2, receive transactions from terminals 3, and process them in a distributed manner. When the processor 1a receives a transaction, it issues tentative update instructions to the other processors 1b, 1c... whose databases 16 must be updated for that transaction. The tentative update instructions are recorded in a journal 18, and commit instructions are then sent to the follower processors 1b, 1c.... Each follower processor updates its database 16 according to the contents of the tentative update instruction. If one of the follower processors suffers a fault such as a system crash, then after the fault is eliminated it obtains the tentative update instruction for the transaction from the journal of the processor 1a and updates its database 16 correctly.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates generally to online database processing, such as depositing or transferring money to a bank account online, or reserving and issuing airline tickets. In particular, it relates to an improved technique for completing transaction processing accurately and smoothly in a cooperative distributed system, in which a plurality of processing devices connected by communication lines, buses, channels, or the like each take charge of a portion of a divided database and cooperate to process transactions in a distributed manner.

[0002]

[Prior Art] For example, depositing or transferring money to a bank account online actually consists of a series of steps: a request message is sent from a terminal to a host, the host accesses and processes the database, and the host returns a response message to the terminal. Taken together, these steps form a logical unit of online database processing, and this logical unit is called a "transaction".

[0003] A transaction changes the contents of the database, and the contents after the change must be consistent and carry a correct meaning for the user. A transaction must therefore never be left in a half-finished state: it must either be completed or not be processed at all.

[0004] As one form of online database processing system, a cooperative distributed system has been proposed in which a plurality of processing devices connected by communication lines each take charge of a portion of a divided database and cooperate to process transactions in a distributed manner. Compared with the most common arrangement, in which a single large mainframe processes the database centrally, this cooperative distributed system is superior in terms of cost reduction, distribution of the risk of accidents, and flexibility in expanding or shrinking the system.

[0005] In this type of distributed system, however, a transaction involves database processing on a plurality of processing devices. The database processing within a transaction must therefore be consistent across those devices; that is, a state in which one device has updated its database for a given transaction while another device has not must never arise. As a transaction processing method for this purpose, the two-phase commit method devised by N. J. Gray is known.

[0006] In the two-phase commit method, a transaction is divided into two phases: tentative update and actual update. When a terminal issues a transaction request, one processing device accepts it. The tentative update phase comes first: the processing device that accepted the transaction (hereinafter, the master device) asks each processing device in charge of a database to be updated (hereinafter, a slave device) whether the update is possible, and each slave device replies to the master device whether it can update. Only if every slave device replies that it can update does processing move to the actual update phase: the master device sends a commit instruction (that is, an instruction to complete the transaction) to the slave devices, each slave device that receives the commit instruction actually updates its database, and each slave device then returns an update-complete message to the master device.
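The two phases described above can be sketched as follows. This is a minimal illustration, not code from the patent: the `Slave` class, the `two_phase_commit` function, and the message counter are hypothetical names introduced here, and network transport is replaced by direct method calls. The counter makes visible that each slave takes part in four messages per transaction (inquiry, reply, commit, completion).

```python
from dataclasses import dataclass, field


@dataclass
class Slave:
    """Hypothetical slave device holding one part of the database."""
    name: str
    can_update: bool = True
    database: dict = field(default_factory=dict)

    def prepare(self, updates):
        # Phase 1 reply: report whether the update is possible.
        return self.can_update

    def commit(self, updates):
        # Phase 2: actually update the database, then acknowledge.
        self.database.update(updates)
        return "done"


def two_phase_commit(slaves, updates):
    """Run both phases; return (committed, total messages exchanged)."""
    messages = 0
    votes = []
    for s in slaves:
        messages += 2  # inquiry from master + reply from slave
        votes.append(s.prepare(updates[s.name]))
    if not all(votes):
        return False, messages  # no slave updates anything
    for s in slaves:
        messages += 2  # commit instruction + update-complete message
        s.commit(updates[s.name])
    return True, messages
```

With two slaves that can both update, the function reports eight messages in total, that is, four per slave.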

[0007] With this two-phase processing, the commit instruction is issued and the databases are actually updated only when every processing device taking part in the distributed processing of the transaction is able to update its database.

[0008]

[Problems to Be Solved by the Invention] The conventional two-phase commit method, however, has several problems, as follows.

[0009] First, as described above, at least four exchanges are required between the master device and the slave devices, which causes a communication overhead that cannot be ignored. As a result, the processing speed of the system as a whole is reduced.

[0010] In addition, transaction processing advances by a procedure in which the master device sends a message to a slave device and the slave device returns a response to that message. Consequently, if a fault such as blocking occurs in any processing device during transaction processing, the processing of all the processing devices stops at that point. One remedy is to cancel a process that has been stalled for a certain time or longer and move on to the next process, but in either case the processing speed of the system as a whole is reduced.

[0011] Furthermore, once the actual update phase has been entered and the commit instruction has been issued, if any processing device fails in its actual update, the other processing devices can no longer be instructed to abort the actual update or to roll back. The other processing devices therefore carry out the actual update, and as a result the consistency of the database cannot be guaranteed.

[0012] Thus, the two-phase commit method is not fully satisfactory for carrying out transaction processing accurately and smoothly.

[0013] Apart from the problems of the two-phase commit method, conventionally known cooperative distributed systems also have the following problem.

[0014] In a conventional system, the master device and the slave devices are fixed in advance. The processing devices therefore cannot freely act as master device or slave device as circumstances require.

[0015] Accordingly, a first object of the present invention is to enable transaction processing in a cooperative distributed system to be carried out correctly and smoothly, with the consistency of the database always guaranteed.

[0016] A second object of the present invention is to provide a flexible cooperative distributed system in which each of a plurality of distributed processing devices can freely act as either a master device or a slave device.

[0017]

[Means for Solving the Problems] The present invention provides a novel improvement in a cooperative distributed system in which a plurality of processing devices capable of communicating with one another each take charge of a portion of divided resources and cooperate to process transactions in a distributed manner. In the system of the present invention, at least one of the plurality of processing devices can act as a master device that accepts a transaction, and the other processing devices can act as slave devices that process the accepted transaction in a distributed manner.

[0018] The master device has update information issuing means for issuing update information, which indicates the update contents of the resources to be updated in a transaction, to the slave devices in charge of those resources. Either the master device or a slave device has a journal in which all of the update contents of the update information issued by the master device are recorded together with a transaction identifier. In addition, each slave device has actual update means for updating the resources in its charge on the basis of the update information, and recovery means for identifying a past transaction for which the update by the actual update means failed, obtaining the update information corresponding to the identified transaction from the journal, and updating the resources on the basis of the obtained update information.

[0019]

[Operation] In the system of the present invention, the master device that has accepted a transaction first issues update information indicating the update contents of the resources to the slave devices that process the transaction in a distributed manner. All of the update contents indicated by the update information issued by the master device are recorded as a journal together with the transaction identifier and are held somewhere in the system.

[0020] Each slave device updates its resources on the basis of the update information issued by the master device. If a slave device fails to update a resource during the processing of some transaction, it later identifies that transaction and obtains the update contents for it from the journal. It then updates the resource correctly on the basis of the obtained update information.

[0021] Accordingly, even though the master device takes no notice of whether the slave devices have updated correctly, each slave device recovers its own resources so that they hold the correct post-update contents, and resource consistency is in effect ensured across all the devices. Moreover, because the master device need not concern itself with the success or failure of the updates on the slave devices, the number of exchanges between the master device and the slave devices is reduced and the communication overhead becomes shorter. In addition, because the devices depend less on one another, a fault such as blocking in one device is less likely to bring other devices to a halt.

[0022] In a preferred embodiment, each of the plurality of distributed devices has the update information issuing means, the journal, the actual update means, and the recovery means, so that each can act as either a master device or a slave device.

[0023] Also in a preferred embodiment, each device has a journal in which the update information it issued while acting as master device is recorded. The accuracy of the journal is thereby guaranteed.

[0024] Also in a preferred embodiment, each device has exclusive control means, which locks each of its resources to be updated before the update and unlocks it after the update, and a lock/unlock log, which records the history of locking and unlocking of each of its resources. The exclusive control means can lock and unlock a resource in units of the smallest accessible unit of information (for example, a record). Thus, when a transaction locks part of a resource, that part is the smallest unit of access, so the access restriction imposed on the processing of other transactions is kept to a minimum, and the concurrent processing performance of transactions improves as a result.
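Locking at the granularity of a single record can be illustrated with a small lock table. This is a sketch under assumptions: the class name `RecordLockTable`, its methods, and the use of a `(resource, record_key)` pair as the lock unit are hypothetical and do not come from the patent. It shows that two transactions touching different records of the same resource do not block each other.

```python
class RecordLockTable:
    """Hypothetical lock table keyed by the smallest accessible unit."""

    def __init__(self):
        # (resource name, record key) -> transaction id holding the lock
        self._owner = {}

    def lock(self, txid, resource, record_key):
        """Try to lock one record; return False if another transaction holds it."""
        key = (resource, record_key)
        holder = self._owner.get(key)
        if holder is not None and holder != txid:
            return False
        self._owner[key] = txid
        return True

    def unlock(self, txid, resource, record_key):
        """Release the lock if this transaction holds it."""
        key = (resource, record_key)
        if self._owner.get(key) == txid:
            del self._owner[key]
```

A coarser design that locked the whole database would make the second transaction wait even when it touches an unrelated record, which is the restriction the record-level unit avoids.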

[0025] Furthermore, in a preferred embodiment, each device identifies a transaction whose update failed by referring to the lock/unlock log of its own resources. Each device can therefore carry out recovery processing individually, independently of the other devices.

[0026]

[Embodiment] FIG. 1 is a block diagram showing the overall configuration of an embodiment of the cooperative distributed system of the present invention. As shown in FIG. 1, a plurality of distributed processing devices (hereinafter, servers) 1a, 1b, 1c..., which cooperate to process transactions in a distributed manner, are connected via a communication network (for example, a LAN) 2. A large number of terminal devices 3a, 3b, 3c... are also connected to the communication network 2. The terminal devices 3a, 3b, 3c... are, for example, workstations or personal computers, and exchange transaction-related messages with the servers 1a, 1b, 1c... over the communication network 2. Typically, a user's transaction request is sent from a terminal device 3a, 3b, 3c... to one of the servers.

[0027] The servers 1a, 1b, 1c... each include the same components and carry out transaction processing in cooperation with one another. Any of the servers 1a, 1b, 1c... can act as a master device that accepts a transaction requested from a terminal, and can also act as a slave device that performs distributed processing of a transaction accepted by another server.

[0028] FIG. 1 mainly shows the configuration of the first server 1a. As illustrated, each of the servers 1a, 1b, 1c... comprises a communication manager 11, a business processing program 12, a distributed resource synchronization processing unit 13, a database management unit 14, a resource management unit 15, a storage unit (database) 16, a memory table/file unit 17, and a journal unit 18.

[0029] The communication manager 11 communicates with the terminals 3a, 3b, 3c... and with the other servers over the communication network 2.

[0030] The business processing program 12 processes the business of this system; that is, it carries out the specific transaction processing.

[0031] The distributed resource synchronization processing unit 13 cooperates with the business processing program 12 to carry out the local processing of a transaction on its own server, the processing needed for global distributed processing by other servers, processing for transaction failure recovery, and the like.

[0032] The database management unit 14 manages the local database 16 for which its server is responsible. The database 16 stores the set of records that the server is to manage directly.

[0033] The resource management unit 15 manages the local memory table/file 17 for which its server is responsible.

[0034] In this specification, the database 16 and the memory table/file 17 are referred to collectively as "resources".

[0035] The journal 18 is a chronological record of all the changes that are to be made (regardless of whether they have actually been made) to the resources of its own server and of other servers in the processing of every transaction accepted by its server. In other words, the journal 18 records, in chronological order, every change to every resource in the system that should result from the completion of all the transactions accepted by the server. Therefore, even if a database fault occurs on some server during the distributed processing of a transaction and the update fails, the database can later be updated correctly by referring to the journal 18 of the server that accepted the transaction.
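One possible shape for such a journal is an append-only list of intended changes, each tagged with the transaction identifier and the server whose resource should change. The following is a sketch only: the `JournalEntry` and `Journal` names, the field layout, and the `lookup` method are assumptions made for illustration, since the specification does not define a record format.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class JournalEntry:
    """One intended change; it is recorded whether or not it was applied."""
    txid: str
    server: str       # server whose resource should change
    resource: str     # e.g. "database" or "memory_table"
    key: str          # record key within the resource
    new_value: object


@dataclass
class Journal:
    """Append-only, chronological journal kept by the accepting server."""
    entries: list = field(default_factory=list)

    def record(self, txid, updates):
        # updates: iterable of (server, resource, key, new_value) tuples
        for server, resource, key, value in updates:
            self.entries.append(JournalEntry(txid, server, resource, key, value))

    def lookup(self, txid, server):
        """Return, in recorded order, the changes one server should hold for txid."""
        return [e for e in self.entries if e.txid == txid and e.server == server]
```

Because entries cover the resources of every server, a server that missed an update can ask for exactly its own slice of one transaction.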

[0036] The operation of this configuration is described below, taking as an example the case in which the first server 1a acts as master device and the other servers 1b, 1c... act as slave devices.

[0037] In the server 1a, the communication manager 11 receives from the communication network 2 a transaction message issued by, for example, the terminal 3a, and passes it to the business processing program 12. The business processing program 12 starts processing the transaction on the basis of the message issued by the terminal 3a.

[0038] In this transaction processing, the business processing program 12 performs roughly the following three operations. First, through the distributed resource synchronization processing unit 13, it instructs the database management unit 14 and the resource management unit 15 to update the local database 16 and memory table/file 17 for which the server 1a is responsible. Second, through the distributed resource synchronization processing unit 13, it sends to the communication network 2 update instructions for the local resources of the other servers 1b, 1c..., which are the slave devices. Third, through the distributed resource synchronization processing unit 13, it writes to the journal 18 the contents of all the updates that this transaction processing is to make to the resources of the server 1a and of the other servers.

[0039] In each of the other servers 1b, 1c..., which are the slave devices, the communication manager 11 receives from the communication network 2 the commit instruction from the server 1a, the master device, and passes it to the distributed resource synchronization processing unit 13. Each distributed resource synchronization processing unit 13 updates its own database 16 and memory table/file 17 on the basis of that instruction.

[0040] In the above operation, the server 1a, as master device, takes no notice of whether the other servers 1b, 1c..., as slave devices, have carried out the update correctly as instructed. It assumes that they have, and once it has finished the three operations described above it sends the terminal 3a a message to the effect that the transaction is complete.

[0041] If a resource fault occurs on one of the slave devices and its update cannot be carried out correctly, only the resources of that slave device remain in their pre-update state while the resources of the other devices are updated, so the consistency of the resource contents is lost. In this case, however, after the fault has been cleared, the slave device obtains the update history for the transaction from the journal 18 of the master device and, referring to it, forcibly redoes the correct update. The consistency of the resource contents is thereby ensured.
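The overall behaviour, in which the master reports completion without checking the slaves and a failed slave later repairs itself from the master's journal, can be sketched as below. Everything here is illustrative: the `Server` class, its `faulty` flag, and the `failed` set (standing in for whatever mechanism the slave uses to identify the failed transaction) are assumptions, and the journal is written before the instructions are sent, which is one of the orderings the description allows.

```python
class Server:
    """Hypothetical server that can act as master or slave."""

    def __init__(self, name):
        self.name = name
        self.db = {}
        self.journal = {}    # txid -> {server name: {key: value}}
        self.failed = set()  # txids whose local update failed
        self.faulty = False

    def apply(self, txid, updates):
        # Slave side: apply the update, or remember that it failed.
        if self.faulty:
            self.failed.add(txid)
            return
        self.db.update(updates)

    def run_transaction(self, txid, updates_by_server, peers):
        # Master side: journal every intended change, then instruct each
        # server without waiting to learn whether its update succeeded.
        self.journal[txid] = updates_by_server
        for name, updates in updates_by_server.items():
            target = self if name == self.name else peers[name]
            target.apply(txid, updates)
        return "completed"

    def recover(self, masters):
        # After the fault is cleared: redo each failed update from the
        # journal of whichever master accepted that transaction.
        self.faulty = False
        for txid in sorted(self.failed):
            for m in masters:
                if txid in m.journal:
                    self.db.update(m.journal[txid][self.name])
        self.failed.clear()
```

In this sketch the master sends one instruction per server and receives nothing back, in contrast to the inquiry, reply, commit, and completion messages of the two-phase method.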

[0042] FIG. 2 is a block diagram showing the above operation and the functions of each unit in more detail.

[0043] As shown in the block for the server 1a, the master device, the business processing program 12 comprises a main part (hereinafter, the AP main part) 20 and an access part (hereinafter, the AP access part) 21. The AP main part 20 performs the core of the transaction processing, as illustrated by the flowchart, and the AP access part 21 handles access to the database 16 and the memory table/file unit 17 during this transaction processing.

[0044] The distributed resource synchronization processing unit 13 comprises a global transaction manager (hereinafter, GTM) 22 and a local transaction manager (hereinafter, LTM) 23. The GTM 22 manages the processing, within transaction processing, of global resources, that is, those of its own server and of other servers. The LTM 23 manages the processing of the local resources of its own server.

[0045] The distributed resource synchronization processing unit 13 also has two logs used for exclusive control of the resources of its server: a lock log 24 and an unlock log 25.

[0046] The database 16 has an update image log 26 for recording, in chronological order, the updates made to the database 16. Likewise, the memory table/file unit 17 has an update image log 27 for recording, in chronological order, the updates made to the memory table/file unit 17. Both update image logs 26 and 27 record a local update history, whereas the journal 18 records a global update history that includes all updates to the resources not only of its own server but also of other servers.

[0047] In the above configuration, when the AP main part 20 of the business processing program 12 of the server 1a receives a message issued by a terminal, it starts the transaction processing requested by that message (step S201) and sends a transaction start instruction to the GTM 22 of the distributed resource synchronization processing unit 13. The GTM 22 generates a transaction identifier for the transaction.

[0048] Next, the AP main part 20 issues to the GTM 22 a lock instruction for exclusive control (an instruction prohibiting access by the processing of other transactions) covering every resource in the system that is to be accessed (step S202). The GTM 22 sends a lock command for the resources in the server 1a to the LTM 23 of the server 1a, and sends lock commands for the resources in the other servers 1b, 1c..., the slave devices, to the LTMs 23 of those servers over the communication network 2. On receiving a lock command, the LTM 23 of each server records lock information for the resources to be locked, together with the transaction identifier, in the lock log 24. The lock log 24 of each server thus records which records (or memory tables or files) among the server's resources have been locked (access prohibited) on behalf of which transaction.

[0049] Next, the AP main part 20 issues to the GTM 22 a tentative update instruction for all the resources to be updated (the database 16 and the memory table/file unit 17) (steps S203 and S204). This tentative update instruction contains information on the update contents for all the resources to be updated. The update contents may be a set of multiple accesses to the database by the AP access part 21, in which case the AP main part 20 can invoke the set as a whole; the tentative update command that is held in memory and recorded in the journal 18 is then a command that invokes that set of accesses as a whole. The GTM 22 first holds the tentative update instruction in memory. It then sends the tentative update command for the resources of the server 1a, through the AP access part 21, to the database management unit 14 and the resource management unit 15, and sends the tentative update commands for the resources of the other servers 1b, 1c... over the communication network 2, through the AP access part 21 of each of those servers, to the database management unit 14 and resource management unit 15 of that server. The database management unit 14 and resource management unit 15 of each server issue the tentative update command to their database 16 and memory table/file unit 17, respectively.

[0050] If any fault occurs in the course of the processing up to and including the tentative update, the transaction is rolled back, and every part is returned to the state it was in before the transaction began.

[0051] If the tentative update succeeds, the AP main part 20 of the server 1a next issues to the GTM 22 an actual update (commit) instruction for all the resources to be updated (step S205). The GTM 22 first records in the journal 18 the update contents for all the resources to be updated, on the basis of the tentative update instruction it held in memory during the preceding tentative update processing. Next, the GTM 22 sends a commit command to the LTM 23 of the server 1a and to the LTMs 23 of the other servers 1b, 1c... involved. The LTM 23 of each server then issues the commit command, through its database management unit 14 and resource management unit 15, to its database 16 and memory table/file unit 17. The actual update is thereby carried out in each database 16 and memory table/file unit 17, and when it succeeds the update contents are recorded in the respective update image logs 26 and 27.

【００５２】また、それぞれの資源での実更新が成功す
ると、それぞれのＬＴＭ２３は、その更新した資源のロ
ックを解除した旨の情報（アンロック情報）をアンロッ
クログ２５に所得する。このアンロックログ２５は、ロ
ックログ２４とペアとなるよう対応づけられる。When the actual update of a resource succeeds, the corresponding LTM 23 records in the unlock log 25 information (unlock information) indicating that the lock on the updated resource has been released. Each entry in the unlock log 25 is associated with its counterpart in the lock log 24 so that the two form a pair.

【００５３】上記コミット処理が終了すると、サーバ１
ａのＡＰ主要部２０は、トランザクションが完遂した旨
のメッセージを端末に返し、これにより１つのトランザ
クション処理が終了する。When the commit process is completed, the AP main unit 20 of server 1a returns to the terminal a message indicating that the transaction has been completed, and the processing of that transaction ends.

【００５４】ところで、上記コミット処理において、或
る資源に障害が発生して実更新が失敗した場合、その資
源については、アンロックログ２５が取得されないた
め、トランザクション処理の終了後も、ロックログ２４
が単独で存在している状態、つまりロック状態のままに
維持される。そのため、ロック状態の資源については、
後続のトランザクション処理でのアクセスが一切禁止さ
れ、障害発生前の状態のままに保持される。If a failure occurs in some resource during the commit process and its actual update fails, no unlock log 25 entry is recorded for that resource. Even after the transaction ends, the lock log 24 entry therefore exists on its own, that is, the resource remains locked. Subsequent transactions are denied all access to a locked resource, so it is held in the state it had before the failure occurred.

【００５５】しかしながら、障害の発生した資源につい
ては、その障害の除去処理がなされた後に、次のように
して、リカバリ処理を行って強制的に正しい状態に修正
することができる。即ち、まず、ロック／アンロックロ
グ２４、２５を検索して、障害発生時のトランザクショ
ンを判明させる。次に、その障害発生時のトランザクシ
ョン処理で主装置となったサーバに対し、そのジャーナ
ル１８内の当該トランザクションに関する情報の送信を
依頼する。既に説明したように、このジャーナル１８に
は、各トランザクション処理における正しい更新内容が
記録されているから、その正しい更新内容に基づいて、
その資源を正しく更新し直す。尚、障害発生後に、その
資源に対するアクセスを必要とする他のトランザクショ
ンが発生していた場合には、それら他のトランザクショ
ンはロックの段階でロールバックされるため、リカバリ
処理では、それら後続のトランザクションを考慮する必
要はなく、障害発生時のトランザクションに関してのみ
再更新を行えばよい。Once the failure has been removed, however, the failed resource can be forcibly corrected to the right state by the following recovery process. First, the lock/unlock logs 24 and 25 are searched to identify the transaction that was in progress when the failure occurred. Next, the server that acted as the main device for that transaction is asked to send the information about the transaction held in its journal 18. As already explained, the journal 18 records the correct update content of every transaction, so the resource is updated again on the basis of that content. Any other transaction that needed access to the resource after the failure was rolled back at the lock stage, so the recovery process does not need to consider such later transactions; only the transaction in progress at the time of the failure has to be re-applied.

【００５６】以上のようにして、いずれかの資源で障害
が発生して一時的に更新不可能となっても、障害が除去
された後の強制的なリカバリ処理によって、全ての資源
を通じて実質的に常に一貫性のある状態が確保されるこ
とになる。In this way, even if a failure makes one of the resources temporarily impossible to update, the forced recovery process performed after the failure is removed ensures that a substantially consistent state is always maintained across all resources.

【００５７】しかも、個々のトランザクション処理で
は、主装置は、従装置にコミット指示を送った後、従装
置での実際のコミットの成功・失敗にかかわらず、トラ
ンザクション完遂とみなすので、後続のトランザクショ
ン処理へと速やかに移行することができる。また、主装
置は、従装置でのコミット成功・失敗を関知しないの
で、主装置と受装置間の交信回数も従来の２フェーズ・
コミット方式より少なくなる。結果として、通信オーバ
ーヘッドが短縮され、且つ、或る従装置でブロッキング
が発生してもシステム全体の稼働が停止することがなく
なり、よって、システム全体の稼働が高速且つ円滑にな
る。Moreover, in each transaction the main device, after sending the commit instruction to the slave devices, regards the transaction as complete regardless of whether the commit actually succeeds or fails at each slave device, so it can move on quickly to subsequent transactions. Because the main device does not track commit success or failure at the slave devices, the number of exchanges between the main device and the slave devices is also smaller than in the conventional two-phase commit method. As a result, communication overhead is reduced and the system as a whole does not stop even if blocking occurs at some slave device, so the whole system runs quickly and smoothly.

【００５８】尚、既に述べたように、資源に障害が発生
した場合、その障害発生部分はリカバリ処理が行われる
までロック状態に維持されるため、後続のトランザクシ
ョン処理でのアクセスが禁止される。そのため、排他制
御（ロック／アンロック）の粒度が大きいと、後続のト
ランザクション処理に対する影響が大きくなり好ましく
ない。そこで、排他制御の粒度は、アクセスの最小単位
であるレコード単位とすることが望ましい。As already described, when a resource fails, the failed part is kept in the locked state until the recovery process is performed, so that access in the subsequent transaction process is prohibited. Therefore, if the granularity of exclusive control (lock / unlock) is large, the subsequent transaction processing is greatly affected, which is not preferable. Therefore, it is desirable that the granularity of exclusive control is set to a record unit, which is the minimum unit of access.

【００５９】図３は、本発明のシステムで採用される一
つの典型的なシステム構成例を示す。図示のように、通
信網によって複数のサーバ３００ａ、３００ｂ…が並列
的に接続されておる。サーバ３００ａ、３００ｂ…はい
ずれも、主装置になることができる。例えば、銀行の取
引処理システムに適用すれば、各サーバ３００ａ、３０
０ｂ…は銀行の各店舗に配置され、それぞれの店舗の端
末から入力されたトランザクション（預金、払い戻し、
振込など）に対して主装置となることができる。また、
このシステム構成では、サーバ３００ａ、３００ｂ…間
に上下関係がないため、例えばトラヒックの増大に応じ
てサーバを増設していくというようなシステムの拡大・
縮小が容易に行える。FIG. 3 shows a typical system configuration used in the system of the present invention. As illustrated, a plurality of servers 300a, 300b... are connected in parallel by a communication network, and any of them can act as the main device. When applied to a bank transaction processing system, for example, the servers 300a, 300b... are placed in the individual branches of the bank, and each can act as the main device for transactions (deposits, withdrawals, transfers, and so on) entered from the terminals of its own branch. Because there is no hierarchy among the servers 300a, 300b... in this configuration, the system is easy to expand or shrink, for example by adding servers as traffic increases.

【００６０】図４は、本発明で採用できる別のシステム
構成例を示す。これは、階層構造をもって複数のサーバ
４００ａ、４００ｂ、４００ｃ…を接続したものであ
る。例えば、銀行システムに適用した場合、最上位層の
サーバ４００ａは本店の装置、他のサーバ４００ｂ、４
００ｃ…は支店の装置とし、個々の支店で閉じたトラン
ザクション処理は、支店のサーバが個々に行い、本店と
関連した処理は支店と本店のサーバと協調して分散処理
するといった使い方ができる。このようなシステム構成
は、大規模なシステムを構築するのに適している。この
場合、システムの拡大・縮小は、各層毎に他層から独立
して行えるというメリットがある。FIG. 4 shows another system configuration that can be used in the present invention, in which a plurality of servers 400a, 400b, 400c... are connected in a hierarchical structure. When applied to a banking system, for example, the top-layer server 400a is the head office machine and the other servers 400b, 400c... are branch machines; transactions confined to a single branch are processed by that branch's server alone, while transactions involving the head office are processed in a distributed manner by the branch and head office servers working together. This configuration is suited to building large-scale systems, and has the advantage that each layer can be expanded or shrunk independently of the other layers.

【００６１】ところで、上述した実施例では、各サーバ
が、各々が主装置となって処理したトランザクションに
関する一切のジャーナルを保持するようになっている
が、必ずしもそのようにする必要はない。例えば、シス
テム内の特定の１台又は２台以上のサーバが代表して一
切のジャーナルを保持したり、或は、特別にジャーナル
管理専用のサーバを設けてもよい。この場合、ジャーナ
ルを保持しないサーバから保持するサーバへ、仮更新指
示などのジャーナルの元となる情報を送信する必要があ
るため、通信障害などがあるとジャーナルの正確性が保
証されないという問題がある。しかし、反面、ジャーナ
ルが一括管理されているので、リカバリ処理時などでの
ジャーナル使用には便利である。こうしたメリット、デ
メリットを考慮して、どのようにジャーナルを管理する
かをシステム毎に選択するべきである。また、上述の一
括管理と実施例のような分散管理とを併用することによ
り、両者のメリットを共に活かすこともできる。In the embodiment described above, each server holds all journals for the transactions it processed as the main device, but this is not essential. For example, one or more specific servers in the system may hold all journals on behalf of the others, or a server dedicated to journal management may be provided. In that case, the information from which the journal is built, such as temporary update instructions, must be sent from the servers that do not hold the journal to the server that does, so the accuracy of the journal cannot be guaranteed if a communication failure or the like occurs. On the other hand, because the journals are managed in one place, they are convenient to use during recovery and similar processing. How the journal is managed should be chosen for each system in light of these advantages and disadvantages. Centralized management can also be combined with the distributed management of the embodiment to obtain the advantages of both.

【００６２】図５は、上記した実施例の構成において、
サーバ１ａが主装置である場合を例にとり、特に、ジャ
ーナルの取得と障害発生後のリカバリ処理とに関連する
部分を詳細に示したものである。FIG. 5 shows in detail, for the configuration of the embodiment described above and taking the case where server 1a is the main device as an example, the parts involved in acquiring the journal and in the recovery process after a failure.

【００６３】図５において、主装置たるサーバ１ａのコ
ミット管理部５０１及びリカバリ管理部５０２は共に、
図２に示したサーバ１ａのＧＴＭ２２に含まれる処理部
である。また、従装置たるサーバ１ｂ、１ｃ…の各々の
ロック管理部５０３及びリカバリ管理部５０４は、図２
に示した各サーバ１ｂ、１ｃ…のＬＴＭ２３に含まれる
処理部である。尚、図５では、各サーバのデータベース
１６及びメモリテーブル／ファイル部１７は纏めて資源
として示してあり、それらの管理部１４、１５も更新イ
メージログ２６、２７もそれぞれ纏めて１ブロックとし
て示してある。In FIG. 5, the commit management unit 501 and the recovery management unit 502 of server 1a, the main device, are both processing units included in the GTM 22 of server 1a shown in FIG. 2. The lock management unit 503 and the recovery management unit 504 of each of the slave servers 1b, 1c... are processing units included in the LTM 23 of each server 1b, 1c... shown in FIG. 2. In FIG. 5, the database 16 and the memory table/file unit 17 of each server are shown together as resources, and their management units 14 and 15 and update image logs 26 and 27 are likewise each shown together as a single block.

【００６４】主装置１ａのコミット管理部５０１は、Ａ
Ｐ主要部２０からの指示により次のような処理を行う。The commit management unit 501 of the main device 1a performs the following processing in response to instructions from the AP main unit 20.

【００６５】トランザクション開始指示に従い、トラ
ンザクション識別子を生成する。ここで、トランザクシ
ョン識別子とは、個々のトランザクションに固有の識別
コードである。このトランザクション識別子には、その
トランザクション処理の主装置がどのサーバであるかを
示すサーバの識別コードも含まれている。A transaction identifier is generated according to the transaction start instruction. Here, the transaction identifier is an identification code unique to each transaction. This transaction identifier also includes a server identification code indicating which server is the main device of the transaction processing.
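The identifier layout described above can be sketched as follows. This is only an illustration under stated assumptions: the text requires that the identifier be unique per transaction and that it carry the identification code of the main device, but the class names, the field names, and the use of a per-server counter are hypothetical choices made for the example.

```python
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class TransactionId:
    server_id: str  # identification code of the main device (hypothetical field name)
    sequence: int   # per-server counter (assumed; the text only requires uniqueness)

    def __str__(self) -> str:
        return f"{self.server_id}-{self.sequence}"


class TransactionIdGenerator:
    """Issues identifiers for transactions accepted by one server."""

    def __init__(self, server_id: str) -> None:
        self._server_id = server_id
        self._counter = itertools.count(1)

    def next_id(self) -> TransactionId:
        return TransactionId(self._server_id, next(self._counter))
```

Because the server code is part of the identifier, a slave device that later finds the identifier in its own logs can tell which server to ask for the journal entry.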

【００６６】ロック指示に従い、ロック命令を各サー
バのロック管理部５０３へ発行する。According to the lock instruction, a lock command is issued to the lock management unit 503 of each server.

【００６７】仮更新指示に従い、トランザクション識
別子、ロック識別子及び仮更新命令をメモリ上に保持
し、そして、仮更新命令を各サーバのアクセス管理部２
１へ発行する。In response to a temporary update instruction, it holds the transaction identifier, the lock identifiers, and the temporary update command in memory, and then issues the temporary update command to the AP access unit 21 of each server.

【００６８】コミット指示に従い、メモリ上に保持さ
れたトランザクション識別子、ロック識別子及び仮更新
命令をジャーナル１８として記録し、そして、各サーバ
のロック管理部５０３にアンロック命令を発行する。According to the commit instruction, the transaction identifier, the lock identifier, and the temporary update command held in the memory are recorded as the journal 18, and the unlock command is issued to the lock management unit 503 of each server.

【００６９】ロールバック指示に従い、当該トランザ
クションに関する一切の情報を破棄し、そして、各サー
バのロック管理部５０３にロールバック命令を発行す
る。According to the rollback instruction, all information regarding the transaction is discarded, and a rollback command is issued to the lock management unit 503 of each server.
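The handlers of the commit management unit 501 listed above might be sketched as follows. This is a minimal illustration, not the patented implementation: the class name, the method names, the dictionary layout of a held record, and the `send` callable that stands in for the communication network are all assumptions.

```python
class CommitManager:
    """Sketch of the commit management unit 501 (all names are hypothetical)."""

    def __init__(self, send):
        self.send = send    # callable(server, command, payload); stands in for the network
        self.pending = {}   # transaction id -> record held in memory only
        self.journal = []   # stands in for journal 18 (append-only, time ordered)

    def temp_update(self, txn_id, lock_ids, update_cmd, servers):
        # Hold the identifiers and the command in memory, then forward the command.
        self.pending[txn_id] = {"txn_id": txn_id, "lock_ids": lock_ids,
                                "update_cmd": update_cmd, "servers": servers}
        for server in servers:
            self.send(server, "temp_update", update_cmd)

    def commit(self, txn_id):
        # Write the journal first, then tell every lock manager to unlock and commit.
        record = self.pending.pop(txn_id)
        self.journal.append(record)
        for server in record["servers"]:
            self.send(server, "unlock", {"txn_id": txn_id, "commit": True})

    def rollback(self, txn_id):
        # Discard everything held for the transaction; nothing reaches the journal.
        record = self.pending.pop(txn_id)
        for server in record["servers"]:
            self.send(server, "rollback", {"txn_id": txn_id})
```

The point of the ordering in `commit` is that the journal entry exists before any slave is told to commit, so a slave that fails afterwards can always retrieve the update from the main device.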

【００７０】主装置１ａのリカバリ管理部５０２は、従
装置１ｂ、１ｃ…におけるシステムダウン等の障害発生
後のリカバリ処理において、従装置１ｂ、１ｃ…のリカ
バリ管理部５０２からの依頼に従って、要求されたトラ
ンザクションの仮更新命令を主装置１ａのジャーナル１
８から検索し、これを依頼元たる従装置の１ｂ、１ｃ…
のリカバリ管理部５０２へ送信するものである。In the recovery process that follows a failure such as a system down at a slave device 1b, 1c..., the recovery management unit 502 of the main device 1a responds to a request from the recovery management unit 504 of that slave device by retrieving the temporary update command of the requested transaction from the journal 18 of the main device 1a and sending it to the recovery management unit 504 of the requesting slave device.

【００７１】主装置１ａのジャーナル１８には、既に述
べたように、この主装置１ａが受け付けてコミットした
全てのトランザクションに関するトランザクション識別
子、ロック識別子及び仮更新命令が時系列的に記録され
ている。In the journal 18 of the main device 1a, as already described, the transaction identifiers, lock identifiers, and temporary update commands relating to all transactions accepted and committed by the main device 1a are recorded in time series.

【００７２】各従装置１ｂ、１ｃ…のロック管理部５０
３は、以下のような処理を行う。The lock management unit 503 of each slave device 1b, 1c... performs the following processing.

【００７３】主装置１ａのコミット管理部５０１から
のロック命令に応答して、ロック取得処理、データベー
ス／資源管理部１４、１５へのトランザクション開始命
令の発行、ロック識別子の生成、及びロック／アンロッ
クログ２４、２５へのロック情報の取得を行う。ここ
で、ロック識別子とは、個々のロック情報に固有の識別
コードである。In response to a lock command from the commit management unit 501 of the main device 1a, it performs lock acquisition, issues a transaction start command to the database/resource management units 14 and 15, generates a lock identifier, and records lock information in the lock/unlock logs 24 and 25. Here, a lock identifier is an identification code unique to each item of lock information.

【００７４】コミット管理部５０１からのアンロック
命令に応答して、データベース／資源管理部１４、１５
へコミット命令を発行し、そしてロック／アンロックロ
グ２４、２５にアンロック情報を取得する。In response to an unlock command from the commit management unit 501, it issues a commit command to the database/resource management units 14 and 15 and then records unlock information in the lock/unlock logs 24 and 25.

【００７５】コミット管理部５０１からのロールバッ
ク命令に応答して、データベース／資源管理部１４、１
５へロールバック命令を発行し、そしてロック／アンロ
ックログ２４、２５にアンロック情報を取得する。In response to a rollback command from the commit management unit 501, it issues a rollback command to the database/resource management units 14 and 15 and then records unlock information in the lock/unlock logs 24 and 25.

【００７６】各従装置１ｂ、１ｃ…のＡＰアクセス部２
１は、主装置１ａのコミット管理部５０１から仮更新命
令を受けて、データベース／資源管理部１４、１５に仮
更新命令を発行するものである。The AP access unit 21 of each slave device 1b, 1c... receives the temporary update command from the commit management unit 501 of the main device 1a and issues it to the database/resource management units 14 and 15.

【００７７】各従装置１ｂ、１ｃ…のリカバリ管理部５
０４は、以下のような処理を行う。The recovery management unit 504 of each slave device 1b, 1c... performs the following processing.

【００７８】リカバリ処理において、ロック／アンロ
ックログ２４、２５からアンロック情報のないロック情
報を検索し、この情報に基づいて主装置１ａのリカバリ
管理部５０２にリカバリ処理を依頼する。In the recovery processing, the lock / unlock log 24, 25 is searched for lock information without unlock information, and the recovery management unit 502 of the main unit 1a is requested to perform recovery processing based on this information.

【００７９】主装置１ａのリカバリ管理部５０２から
仮更新命令を受けて、リカバリを実行する。Upon receiving a temporary update command from the recovery management unit 502 of the main unit 1a, the recovery is executed.

【００８０】各従装置１ｂ、１ｃ…のデータベース／資
源管理部１４、１５は、各々がもつ局域的な更新イメー
ジログ２６、２７に基づくリカバリ機能を有している。The database / resource management units 14 and 15 of the slaves 1b, 1c, ... Have a recovery function based on the local update image logs 26 and 27, respectively.

【００８１】各従装置１ｂ、１ｃ…のロック／アンロッ
クログ２４、２５は、それぞれの資源１６、１７に関す
るロック情報及びアンロック情報の時系列的な記録であ
る。ロック情報には、トランザクション識別子、ロック
マーク、ロック識別子及びレコード名が含まれている。
アンロック情報には、トランザクション識別子及びアン
ロックマークが含まれている。ロック情報とアンロック
情報は、それに含まれるトランザクション識別子によっ
て、互いに対応付けられている。もし、アンロック情報
と対応付けられていないロック情報があったならば、そ
れは、そのトランザクションにおいて障害が発生したこ
とを意味し、そのロック情報内のトランザクション識別
子によって、障害発生時のトランザクションと主装置と
を識別することができる。リカバリ処理では、このよう
にして障害発生時のトランザクションと主装置とを識別
する。The lock/unlock logs 24 and 25 of each slave device 1b, 1c... are time-series records of the lock information and unlock information for the respective resources 16 and 17. Lock information contains a transaction identifier, a lock mark, a lock identifier, and a record name. Unlock information contains a transaction identifier and an unlock mark. Lock information and unlock information are associated with each other through the transaction identifier they contain. Lock information with no associated unlock information means that a failure occurred during that transaction, and the transaction identifier in the lock information identifies both the transaction in progress at the time of the failure and its main device. This is how the recovery process identifies that transaction and main device.
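The pairing rule for lock and unlock information can be illustrated with a short sketch. The record fields follow the description above (transaction identifier, mark, lock identifier, record name); the class name, the function name, and the string values used for the marks are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class LogEntry:
    txn_id: str
    mark: str                          # "LOCK" or "UNLOCK" (assumed encoding of the marks)
    lock_id: Optional[str] = None      # present only in lock information
    record_name: Optional[str] = None  # present only in lock information


def find_unpaired_locks(log: List[LogEntry]) -> List[LogEntry]:
    """Return lock entries whose transaction has no matching unlock entry."""
    unlocked = {entry.txn_id for entry in log if entry.mark == "UNLOCK"}
    return [entry for entry in log
            if entry.mark == "LOCK" and entry.txn_id not in unlocked]
```

Each entry returned by `find_unpaired_locks` names a record that is still locked and a transaction whose update has to be fetched from the main device encoded in its identifier.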

【００８２】以下、トランザクション処理の手順及びリ
カバリ処理の手順を、詳細に説明する。The procedure of transaction processing and the procedure of recovery processing will be described in detail below.

【００８３】まず。トランザクション処理手順を図６〜
図１０を参照して説明する。First, the transaction processing procedure is described with reference to FIGS. 6 to 10.

【００８４】図６は、トランザクション処理における主
装置１ａのＡＰ主要部２０及びコミット管理部５０１の
処理手順を示す。図７は、各従装置１ｂ、１ｃ…のロッ
ク管理部５０３のロック処理手順を示す。図８は、各従
装置１ｂ、１ｃ…のＡＰアクセス部２１の仮更新命令処
理手順を示す。図９は、各従装置１ｂ、１ｃ…のロック
管理部５０３のアンロック命令処理手順を示す。図１０
は、各従装置１ｂ、１ｃ…のロック管理部５０３のロー
ルバック命令処理手順を示す。FIG. 6 shows the processing procedure of the AP main unit 20 and the commit management unit 501 of the main device 1a in transaction processing. FIG. 7 shows the lock processing procedure of the lock management unit 503 in each of the slave devices (1b, 1c, ...). FIG. 8 shows the temporary update command processing procedure of the AP access unit 21 in each slave device. FIG. 9 shows the unlock command processing procedure of the lock management unit 503 in each slave device. FIG. 10 shows the rollback command processing procedure of the lock management unit 503 in each slave device.

【００８５】図６に示すように、まず、主装置１ａにお
いて、ＡＰ主要部２０がトランザクション開始指示を発
行し（ステップＳ３０１）、これを受けてコミット管理
部５１がそのトランザクションに対するトランザクショ
ン識別子を生成する（ステップＳ３０２）。次に、ＡＰ
主要部２０がロック指示を発行し（ステップＳ３０３
０）、これを受けてコミット管理部５０１が各従装置の
１ｂ、１ｃ…のロック管理部５０３へロック命令を発行
する（ステップＳ３０４）。As shown in FIG. 6, in the main device 1a the AP main unit 20 first issues a transaction start instruction (step S301), and in response the commit management unit 501 generates a transaction identifier for the transaction (step S302). Next, the AP main unit 20 issues a lock instruction (step S303), and in response the commit management unit 501 issues a lock command to the lock management unit 503 of each slave device 1b, 1c... (step S304).

【００８６】各従装置の１ｂ、１ｃ…のロック管理部５
０３は、図７に示すように、主装置１ａからのロック命
令を受け付けると（ステップＳ３３０）、ロック取得処
理を行う（ステップＳ３３１）。ロック取得処理では、
受け付けたロック命令をメモリ上に取得し、そのロック
対象となっているレコード（又はテーブル、ファイル）
が、他のトランザクションによって既にロックされてい
ないか調べ、ロックされていなければ、ロック成功と判
断し、既にロックされている場合はロック失敗と判断す
る（ステップＳ３３２）。そして、ロック失敗の場合
は、メモリ上に取得したロック命令を破棄し、ロック失
敗の旨を主装置１ａに通知する（ステップＳ３３３）。As shown in FIG. 7, when the lock management unit 503 of each slave device 1b, 1c... receives the lock command from the main device 1a (step S330), it performs lock acquisition (step S331). In lock acquisition, it takes the received lock command into memory and checks whether the record (or table or file) to be locked is already locked by another transaction; if it is not locked, the lock is judged to have succeeded, and if it is already locked, the lock is judged to have failed (step S332). On failure, it discards the lock command held in memory and notifies the main device 1a of the lock failure (step S333).

【００８７】一方、ロック成功の場合は、ロック管理部
５０３は次に、資源１６、１７にトランザクション開始
命令を発行し（ステップＳ３３４）、当該ロックに対応
するロック識別子を生成し（ステップＳ３３５）、そし
て、トランザクション識別子（ロック命令に含まれてい
た）、ロックマーク、ロック識別子及びロック対象のレ
コード名をロックログ２４に取得し（ステップＳ３３
６）、その後、ロック識別子を主装置１ａに送る（ステ
ップＳ３３７）。If the lock succeeds, the lock management unit 503 next issues a transaction start command to the resources 16 and 17 (step S334), generates a lock identifier for the lock (step S335), records in the lock log 24 the transaction identifier (which was contained in the lock command), a lock mark, the lock identifier, and the name of the locked record (step S336), and then sends the lock identifier to the main device 1a (step S337).
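The lock command processing of FIG. 7 could look roughly like the sketch below. The class and method names, the in-memory lock table, and the `L<n>` format of the lock identifier are hypothetical; returning `None` stands in for the lock failure notification of step S333.

```python
import itertools


class LockManager:
    """Sketch of lock command handling in the lock management unit 503."""

    def __init__(self, begin_transaction):
        self.begin_transaction = begin_transaction  # callable(txn_id); stands in for S334
        self.locked = {}     # record name -> transaction id holding the lock
        self.lock_log = []   # stands in for lock log 24
        self._ids = itertools.count(1)

    def handle_lock(self, txn_id, record_name):
        # S331/S332: the lock fails if the record is already locked.
        # (This sketch does not support re-entrant locks.)
        if record_name in self.locked:
            return None                          # S333: report lock failure
        self.locked[record_name] = txn_id
        self.begin_transaction(txn_id)           # S334
        lock_id = f"L{next(self._ids)}"          # S335
        self.lock_log.append((txn_id, "LOCK", lock_id, record_name))  # S336
        return lock_id                           # S337: sent back to the main device
```

A failed attempt leaves no trace in the lock log, which matters later: only locks that were actually granted can show up as unpaired entries during recovery.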

【００８８】再び図６を参照して、主装置１ａのコミッ
ト管理部５０１は、ステップＳ３０７でロック命令を発
行した後、各従装置１ｂ、１ｃ…からロック識別子（ロ
ック成功の通知）又はロック失敗の通知を受け取ると、
これをＡＰ主要部２０に報告する。ＡＰ主要部２０は、
全ての従装置１ｂ、１ｃ…がロックに成功したか否かを
チェックし（ステップＳ３０５）、全ての従装置がロッ
クに成功すれば、各従装置の資源に対する仮更新指示を
発行する（ステップＳ３０８）。この仮更新指示を受け
たコミット管理部５０１は、トランザクション識別子、
各資源のロック識別子及び仮更新命令を一旦メモリ上に
取得し（ステップＳ３０９）、そして、各従装置のＡＰ
アクセス部２１へ、それぞれの資源に対する仮更新命令
を発行する（ステップＳ３１０）。Referring again to FIG. 6, after issuing the lock command in step S304, the commit management unit 501 of the main device 1a receives from each slave device 1b, 1c... either a lock identifier (notification of lock success) or a notification of lock failure, and reports it to the AP main unit 20. The AP main unit 20 checks whether all slave devices 1b, 1c... succeeded in locking (step S305), and if so, issues a temporary update instruction for the resources of each slave device (step S308). On receiving this instruction, the commit management unit 501 first holds the transaction identifier, the lock identifier of each resource, and the temporary update command in memory (step S309), and then issues the temporary update command for the respective resources to the AP access unit 21 of each slave device (step S310).

【００８９】各従装置では、図８に示すように、ＡＰア
クセス部２１が、仮更新命令を受け付けると（ステップ
Ｓ３４０）、それぞれの資源に仮更新命令を発行し（ス
テップＳ３４１）、そして、各資源から仮更新成功か否
かの回答をもらい（ステップ３４１）、回答に応じて成
功又は失敗のメッセージを主装置へ送る（ステップＳ３
４２、Ｓ３４３）。In each slave device, as shown in FIG. 8, when the AP access unit 21 receives the temporary update command (step S340), it issues the temporary update command to each of its resources (step S341), receives from each resource a reply indicating whether the temporary update succeeded (step S341), and sends a success or failure message to the main device according to the reply (steps S342, S343).

【００９０】再び、図６を参照して、主装置のコミット
管理部５０１は、各従装置より仮更新成功又は失敗の回
答を受け取ると（ステップＳ３１０）、ＡＰ主要部２０
に報告する。ＡＰ主要部２０は、全ての従装置が仮更新
に成功したかチェックし（ステップＳ３１１）、全て成
功したなら、次にコミット指示を発行する（ステップＳ
３１４）。このコミット指示を受けたコミット管理部５
０１は、ステップＳ３０９でメモリ上に取得したトラン
ザクション識別子、各資源のロック識別子及び仮更新命
令をジャーナル１８に取得し（ステップＳ３１５）、そ
の後、各従装置のロック管理部５０３へ、アンロック命
令を発行する（ステップＳ３１６）。ここで、アンロッ
ク命令には、コミットするか否かを示すコミットフラグ
が含まれており、ステップＳ３１６で発行されるアンロ
ック命令では、コミットフラグ＝ＯＮとなっている。Referring again to FIG. 6, when the commit management unit 501 of the main device receives the temporary update success or failure replies from the slave devices (step S310), it reports them to the AP main unit 20. The AP main unit 20 checks whether all slave devices succeeded in the temporary update (step S311), and if so, issues a commit instruction (step S314). On receiving the commit instruction, the commit management unit 501 records in the journal 18 the transaction identifier, the lock identifier of each resource, and the temporary update command that it held in memory in step S309 (step S315), and then issues an unlock command to the lock management unit 503 of each slave device (step S316). The unlock command contains a commit flag indicating whether to commit, and in the unlock command issued in step S316 the commit flag is ON.

【００９１】各従装置では、図９に示すように、ロック
管理部５０３が、アンロック命令を受け取ると（ステッ
プＳ３５１）、コミットフラグをチェックし（ステップ
Ｓ３５２）、コミットフラグがＯＮであれば、それぞれ
の資源１６、１７に対しコミット命令を発行する（ステ
ップＳ３５３）。コミット命令を受けた各資源１６、１
７は、実更新を行うと共に、自身のリカバリーのための
更新イメージログ２６、２７を取得する（ステップＳ３
５４）。その後、ロック管理部５０３は、先のロック命
令処理で取得したロックを開放し（ステップＳ３５
５）、トランザクション識別子及びアンロックマークを
アンロックログ２５に取得する（ステップＳ３５６）。In each slave device, as shown in FIG. 9, when the lock management unit 503 receives the unlock command (step S351), it checks the commit flag (step S352), and if the flag is ON, issues a commit command to each of the resources 16 and 17 (step S353). On receiving the commit command, each resource 16, 17 performs the actual update and records the update image log 26, 27 used for its own recovery (step S354). The lock management unit 503 then releases the lock acquired in the earlier lock command processing (step S355) and records the transaction identifier and an unlock mark in the unlock log 25 (step S356).
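The unlock command processing of FIG. 9 can be sketched as below, mainly to show why a failed commit leaves the record locked. The function signature, the `FlakyResource` test double, and the use of an exception to model a resource failure are assumptions for illustration only.

```python
class FlakyResource:
    """Test double for a resource 16/17 whose actual update may fail."""

    def __init__(self, fail=False):
        self.fail = fail
        self.committed = []

    def commit(self, txn_id):
        if self.fail:
            raise RuntimeError("resource failure during actual update")
        self.committed.append(txn_id)  # stands in for the update and its image log


def handle_unlock(txn_id, commit, resources, locked, unlock_log):
    """locked: dict of record name -> transaction id; unlock_log stands in for log 25."""
    if commit:                                   # S352
        for resource in resources:
            resource.commit(txn_id)              # S353/S354; raises on failure
    # S355: release every lock held by this transaction.
    for record in [r for r, t in locked.items() if t == txn_id]:
        del locked[record]
    unlock_log.append((txn_id, "UNLOCK"))        # S356
```

If `commit` raises, neither the lock release nor the unlock log entry is reached, so the record stays locked and the lock log entry stays unpaired until recovery runs.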

【００９２】再び、図６を参照して、主装置のコミット
管理部５０１は、ステップＳ３１６でコミット命令を発
行した後、各従装置でコミットが実際に成功したか否か
に関知せず、直ちにトランザクションの終了をＡＰ主要
部２０に通知し、ＡＰ主要部２０はトランザクション完
遂を端末に報告して、当該トランザクション処理を終了
する。Referring again to FIG. 6, after issuing the commit command in step S316, the commit management unit 501 of the main device immediately notifies the AP main unit 20 that the transaction has ended, without regard to whether the commit actually succeeded at each slave device. The AP main unit 20 then reports completion of the transaction to the terminal and ends the processing of that transaction.

【００９３】ところで、主装置のＡＰ主要部２０は、ス
テップＳ３０５で従装置のいずれか１台でもロックに失
敗したことを認識すると、ロールバック指示を発行する
（ステップＳ３０６）。また、ステップＳ３１１で従装
置のいずれか１台でも仮更新に失敗したことを認識した
場合も、同様にロールバック指示を発行する（ステップ
Ｓ３０６）。コミット管理部５０１は、ロールバック指
示を受けると、トランザクション情報（トランザクショ
ン識別子、ロック識別子等）を破棄し、そして各従装置
のロック管理部５０３へロールバック命令を発行する
（ステップＳ３０７、３１３）。If the AP main unit 20 of the main device recognizes in step S305 that even one slave device failed to lock, it issues a rollback instruction (step S306). It likewise issues a rollback instruction if it recognizes in step S311 that even one slave device failed in the temporary update (step S306). On receiving the rollback instruction, the commit management unit 501 discards the transaction information (transaction identifier, lock identifiers, and so on) and issues a rollback command to the lock management unit 503 of each slave device (steps S307, S313).
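The control flow of FIG. 6 as seen from the main device can be condensed into the following sketch. The slave interface (`lock`, `temp_update`, `unlock`, `rollback`), the `StubSlave` test double, and the return strings are hypothetical; swallowing exceptions from `unlock` is only a stand-in for the main device not waiting for the commit result.

```python
class StubSlave:
    """Test double; a real slave would reach its lock manager and AP access unit."""

    def __init__(self, lock_ok=True, update_ok=True):
        self.lock_ok = lock_ok
        self.update_ok = update_ok
        self.calls = []

    def lock(self, txn_id):
        self.calls.append("lock")
        return "L1" if self.lock_ok else None

    def temp_update(self, txn_id, cmd):
        self.calls.append("temp_update")
        return self.update_ok

    def unlock(self, txn_id, commit):
        self.calls.append("unlock")

    def rollback(self, txn_id):
        self.calls.append("rollback")


def run_transaction(txn_id, slaves, update_cmd, journal):
    lock_ids = [s.lock(txn_id) for s in slaves]             # S303-S304
    if any(lock_id is None for lock_id in lock_ids):        # S305
        for s in slaves:
            s.rollback(txn_id)                              # S306-S307
        return "rolled back"
    held = {"txn_id": txn_id, "lock_ids": lock_ids,
            "update_cmd": update_cmd}                       # S309: held in memory
    results = [s.temp_update(txn_id, update_cmd) for s in slaves]  # S310
    if not all(results):                                    # S311
        for s in slaves:
            s.rollback(txn_id)                              # S313
        return "rolled back"
    journal.append(held)                                    # S315: journal before commit
    for s in slaves:
        try:
            s.unlock(txn_id, commit=True)                   # S316
        except Exception:
            pass  # the commit outcome at the slave is deliberately not examined
    return "completed"
```

Failures before the journal write lead to a rollback everywhere, while failures after it are left to the recovery process of the affected slave.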

【００９４】ロールバック命令を受けた各従装置では、
図１０に示すように、ロック管理部５０３が、ロールバ
ック命令を受取り（ステップＳ３６１）、それぞれの資
源１６、１７にロールバック命令を発行する（ステップ
Ｓ３６２）。各資源１６、１７では、自身のロールバッ
ク処理を行う（ステップＳ３６３）。そして、ロック管
理部５０３は、取得したロックを開放し（ステップＳ３
６４）、トランザクション識別子及びアンロックマーク
をアンロックログ２５に取得する（ステップＳ３６
５）。In each slave device that receives the rollback command, as shown in FIG. 10, the lock management unit 503 receives the command (step S361) and issues a rollback command to each of the resources 16 and 17 (step S362). Each resource 16, 17 performs its own rollback (step S363). The lock management unit 503 then releases the lock it had acquired (step S364) and records the transaction identifier and an unlock mark in the unlock log 25 (step S365).

【００９５】以上のようにして、トランザクション処理
が行われる。その過程で、ロック命令処理や仮更新命令
処理において障害が発生した場合は、トランザクション
はロールバックされるが、コミット命令処理においてい
ずれかの従装置で障害が発生した場合は、トランザクシ
ョンはロールバックされずコミットされたものとして処
理される。そのため、障害が発生した従装置では、後に
リカバリ処理を行うことにより、正しくコミットされた
状態に資源を復帰する。Transaction processing is performed as described above. In the process, if a failure occurs in the lock instruction processing or temporary update instruction processing, the transaction is rolled back, but if a failure occurs in any of the slaves in the commit instruction processing, the transaction is rolled back. Processed as if it had been committed. Therefore, in the slave device in which the failure has occurred, the resource is restored to the correctly committed state by performing the recovery process later.

【００９６】以下に、このリカバリの処理を図１１から
図１４を参照して説明する。The recovery process will be described below with reference to FIGS. 11 to 14.

【００９７】図１１は、障害発生からリカバリ処理の全
体の処理手順を示す。図１２は、各従装置のリカバリ管
理部５０４のリカバリ依頼処理手順を示す。図１３は、
主装置のリカバリ管理部５０２のリカバリ処理手順を示
す。図１４は、各従装置のリカバリ実行処理手順を示
す。FIG. 11 shows the overall procedure from the occurrence of a failure through the recovery process. FIG. 12 shows the recovery request procedure of the recovery management unit 504 of each slave device. FIG. 13 shows the recovery procedure of the recovery management unit 502 of the main device. FIG. 14 shows the recovery execution procedure of each slave device.

【００９８】図１１に示すように、いずれかの従装置で
システムダウン等の障害が発生すると、まずその従装置
において、障害原因の除去作業が行われ、その後にシス
テムが再起動され（ステップＳ４０１）、この再起動の
直後にリカバリ処理が開始される（ステップＳ４０
２）。リカバリ処理では、まず、個々の資源１６、１７
が自身のリカバリ処理を、自身の更新イメージログ２
６、２７に基づいて実行する（ステップＳ４０３）。即
ち、更新イメージログ２６、２７に記録されているコミ
ット済のトランザクションに関する更新イメージに従っ
て、自身の内容を更新し、また、未コミットのトランザ
クションについてはロールバックする。このリカバリ処
理により、個々の資源１６、１７内での整合性が回復さ
れる。As shown in FIG. 11, when a failure such as a system down occurs at a slave device, the cause of the failure is first removed at that slave device, the system is then restarted (step S401), and the recovery process starts immediately after the restart (step S402). In the recovery process, each resource 16, 17 first performs its own recovery based on its own update image log 26, 27 (step S403). That is, it updates its contents according to the update images of committed transactions recorded in the update image log 26, 27, and rolls back uncommitted transactions. This restores consistency within each individual resource 16, 17.

【００９９】個々の資源でのリカバリ処理の次に、主装
置と従装置のリカバリ管理部５０２、５０４によるリカ
バリ処理が行われる（ステップＳ４０４）。このリカバ
リ処理は、システム全体を通じて資源の整合性を回復さ
せるものであり、次の３つの段階、つまり、従装置のリ
カバリ管理部５０４から主装置のリカバリ管理部５０２
へのリカバリ依頼（図１２）、主装置のリカバリ管理部
５０２でのリカバリ処理（図１３）、及び従装置のリカ
バリ管理部５０４でのリカバリ実行処理（図１４）から
構成される。After recovery within the individual resources, the recovery management units 502 and 504 of the main device and the slave device carry out a further recovery process (step S404). This process restores the consistency of resources across the whole system and consists of three stages: a recovery request from the recovery management unit 504 of the slave device to the recovery management unit 502 of the main device (FIG. 12), recovery processing in the recovery management unit 502 of the main device (FIG. 13), and recovery execution in the recovery management unit 504 of the slave device (FIG. 14).

【０１００】まず、従装置のリカバリ管理部５０４から
主装置のリカバリ管理部５０２への依頼処理では、図１
２に示すように、まず、ロック／アンロックログ２４、
２５からロックマークに対応するアンロックマークが取
得されていないトランザクション識別子を検索する（ス
テップＳ５１０）。例えば、図示の例では、トランザク
ション識別子“Ａ”は、ロックマークに対応するアンロ
ックマークがあるため、検索対象外である。何故なら、
障害発生時には既にトランザクションが完遂されていた
からである。一方、トランザクション識別子“Ｂ”は、
アンロックマークがないため検索対象となる。何故な
ら、障害発生によりトランザクションが完遂されなかっ
たからである。First, in the request from the recovery management unit 504 of the slave device to the recovery management unit 502 of the main device, as shown in FIG. 12, the lock/unlock logs 24 and 25 are searched for transaction identifiers that have a lock mark but no corresponding unlock mark (step S510). In the illustrated example, transaction identifier “A” has an unlock mark matching its lock mark and is therefore not selected, because that transaction had already been completed when the failure occurred. Transaction identifier “B”, on the other hand, has no unlock mark and is selected, because the failure prevented that transaction from completing.

【０１０１】このようにして、完遂されていないトラン
ザクション識別子を見つけ出すと、次に、主装置のリカ
バリ管理部５０２へ、当該トランザクション識別子、ロ
ック識別子を送ってリカバリを依頼する（ステップＳ５
１１、５１２）。Once a transaction identifier that was not completed has been found in this way, the transaction identifier and lock identifier are sent to the recovery management unit 502 of the main device to request recovery (steps S511, S512).

[0102] As shown in FIG. 13, when the recovery management unit 502 of the main device accepts the recovery request (step S520), it searches the journal 18 for the temporary update command corresponding to the transaction identifier and lock identifier (step S521). If the temporary update command is found (step S522), it is issued to the recovery management unit 504 of the requesting slave device (step S523). If it is not found, a "no temporary update command" message is issued to the recovery management unit 504 of the requesting slave device (step S524).
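
A minimal Python sketch of the main-device side (steps S520 to S524) follows. It assumes, purely for illustration, that the journal 18 can be modeled as a dictionary keyed by (transaction identifier, lock identifier); the names `handle_recovery_request`, `COMMANDS`, and `NO_COMMAND` are hypothetical and do not appear in the specification.

```python
# Hypothetical model of journal 18:
# (transaction identifier, lock identifier) -> recorded temporary update commands
Journal = dict[tuple[str, str], list[str]]

COMMANDS = "TEMPORARY_UPDATE_COMMANDS"
NO_COMMAND = "NO_TEMPORARY_UPDATE_COMMAND"


def handle_recovery_request(
    journal: Journal, txn_id: str, lock_id: str
) -> tuple[str, list[str]]:
    """Answer one recovery request from a slave device (S520)."""
    commands = journal.get((txn_id, lock_id))  # S521: search the journal
    if commands:                               # S522: found?
        return (COMMANDS, list(commands))      # S523: issue the commands
    return (NO_COMMAND, [])                    # S524: "no temporary update command"


journal: Journal = {("B", "L2"): ["UPDATE account SET balance = 900 WHERE id = 1"]}
hit = handle_recovery_request(journal, "B", "L2")
miss = handle_recovery_request(journal, "C", "L3")
```

Here `hit` carries the journaled command for transaction "B", while `miss` carries only the "no temporary update command" marker.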

[0103] As shown in FIG. 14, the recovery management unit 504 of the requesting slave device receives the requested temporary update command (or the "no temporary update command" message) from the recovery management unit 502 of the main device (step S530). If a temporary update command was received (step S531), it issues a transaction start command to each of the resources 16, 17 (step S532) and then issues the temporary update command (step S533). The temporary update command is thereby re-executed on each of the resources 16, 17 (step S534). Next, the recovery management unit 504 issues an unlock command with the commit flag set to ON to the lock management unit 503 (step S535). The lock management unit 503 then performs the unlock command processing shown in FIG. 9, and as a result the actual update of the resources 16, 17 is executed.

[0104] If, on the other hand, the message is determined in step S531 to be "no temporary update command", the recovery management unit 504 issues an unlock command with the commit flag set to OFF to the lock management unit 503 (step S536). The lock management unit 503 then performs the unlock command processing shown in FIG. 9, and as a result an unlock log is acquired without the resources 16, 17 being updated.
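
The slave-side branch in steps S530 to S536 can be sketched as follows. This is an illustrative model only: the `Resource` and `LockManager` classes, the key/value form of a temporary update command, and the assumption that an unlock with the commit flag OFF simply discards pending changes are all hypothetical simplifications of the units 16, 17 and 503.

```python
class Resource:
    """Hypothetical resource (16 or 17) with tentative and actual state."""

    def __init__(self, data: dict[str, int]) -> None:
        self.data = dict(data)
        self.pending: dict[str, int] = {}

    def begin(self) -> None:                       # transaction start command
        self.pending = {}

    def tentative_update(self, key: str, value: int) -> None:
        self.pending[key] = value                  # not yet visible in data


class LockManager:
    """Hypothetical stand-in for lock management unit 503."""

    def __init__(self, resource: Resource) -> None:
        self.resource = resource
        self.unlock_log: list[tuple[str, str]] = []

    def unlock(self, txn_id: str, lock_id: str, commit: bool) -> None:
        if commit:                                 # commit flag = ON: actual update
            self.resource.data.update(self.resource.pending)
        self.resource.pending = {}                 # commit flag = OFF: nothing applied
        self.unlock_log.append((txn_id, lock_id))  # unlock log is acquired either way


def execute_recovery(commands, txn_id, lock_id, resource, lock_manager) -> None:
    """commands is a list of (key, value) pairs, or None for 'no temporary update command'."""
    if commands is not None:                       # S531
        resource.begin()                           # S532
        for key, value in commands:                # S533, S534
            resource.tentative_update(key, value)
        lock_manager.unlock(txn_id, lock_id, commit=True)   # S535
    else:
        lock_manager.unlock(txn_id, lock_id, commit=False)  # S536


res = Resource({"balance": 1000})
lm = LockManager(res)
execute_recovery([("balance", 900)], "B", "L2", res, lm)
execute_recovery(None, "C", "L3", res, lm)
```

After both calls, the resource reflects the re-executed update of transaction "B" only, and both transactions have an unlock log entry, so neither would be picked up again by a later scan of the lock/unlock logs.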

[0105] Through the recovery processing described above, the resources of the failed server are forcibly returned to a state in which all transactions have been correctly committed, so resource consistency is ensured across the entire system. In addition, since the business processing program does not need to re-execute transaction processing in order to perform recovery, the load on the business processing program is also reduced.

[0106] FIGS. 15 and 16 show a modification of the recovery processing. In this modification, instead of sending temporary update commands one at a time from the main device to the slave device as in the recovery processing described above, all the temporary update commands required for recovery are sent from the main device to the slave device in a single batch.

[0107] As shown in FIG. 15, the recovery management unit 504 of the slave device that has started recovery processing first searches the lock/unlock logs 24, 25 for all transaction identifiers for which no unlock mark corresponding to a lock mark has been recorded, and stores those transaction identifiers and lock identifiers in memory (step S600). If there is at least one transaction identifier without an unlock mark (step S601), it sends all the transaction identifiers and lock identifiers stored in memory to the main device and requests a search for the corresponding temporary update commands (step S602).

[0108] As shown in FIG. 16, the recovery management unit 502 of the main device accepts this search request (step S610), searches the journal 18 for the temporary update commands corresponding to all the requested transaction identifiers and lock identifiers (step S611), and returns all the temporary update commands found to the recovery management unit 504 of the requesting slave device (step S612).

[0109] Referring again to FIG. 15, when the recovery management unit 504 of the slave device has received all the requested temporary update commands from the main device (step S602), it issues the received temporary update commands to the resources 16, 17 in order (step S603). Then, to unlock all the transaction identifiers stored in memory, the recovery management unit 504 issues unlock commands with the commit flag set to ON to the lock management unit 503 (step S604). For any transaction for which no temporary update command was received from the main device, it issues an unlock command with the commit flag set to OFF. As a result, the updates for all uncommitted transactions are executed, and the resources 16, 17 are restored to the correct state.
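
The batch variant of FIGS. 15 and 16 can be sketched in Python as a single round trip. The data shapes (lists of (transaction identifier, lock identifier) pairs, a dictionary for journal 18) and the callbacks `apply` and `unlock` are assumptions for illustration, not structures defined in the specification.

```python
from typing import Callable

Key = tuple[str, str]  # (transaction identifier, lock identifier)


def batch_recover(
    lock_marks: list[Key],
    unlock_marks: list[Key],
    journal: dict[Key, list[str]],
    apply: Callable[[str], None],
    unlock: Callable[[Key, bool], None],
) -> None:
    """Recover all incomplete transactions with one request to the main device."""
    done = set(unlock_marks)
    pending = [k for k in lock_marks if k not in done]   # S600: keep in memory
    if not pending:                                      # S601: nothing to recover
        return
    # S602 / S610-S612: one request carrying all keys, one reply carrying all commands.
    reply = {k: journal[k] for k in pending if k in journal}
    for k in pending:                                    # S603: re-issue in order
        for command in reply.get(k, []):
            apply(command)
    for k in pending:                                    # S604: commit flag ON if commands
        unlock(k, k in reply)                            # were received, otherwise OFF


applied: list[str] = []
unlocked: list[tuple[Key, bool]] = []
batch_recover(
    lock_marks=[("A", "L1"), ("B", "L2"), ("C", "L3")],
    unlock_marks=[("A", "L1")],
    journal={("B", "L2"): ["set x=1", "set y=2"]},
    apply=applied.append,
    unlock=lambda key, commit: unlocked.append((key, commit)),
)
```

Compared with the one-at-a-time procedure of FIGS. 12 to 14, this trades one request per transaction for a single larger message, at the cost of holding the list of pending identifiers in memory.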

【０１１０】[0110]

[Effects of the Invention] According to the cooperative distributed system of the present invention, transaction processing can be performed correctly and smoothly, and the consistency of the databases is always guaranteed.

[Brief description of drawings]

FIG. 1 is a block diagram showing the overall configuration of an embodiment of the cooperative distributed system of the present invention.

FIG. 2 is a block diagram showing the operation of the embodiment of FIG. 1 and the function of each unit in more detail.

FIG. 3 is a block diagram showing one typical system configuration example adopted in the system of the present invention.

FIG. 4 is a block diagram showing another system configuration example that can be adopted in the present invention.

FIG. 5 is a block diagram showing in detail, for the embodiment of FIG. 1 and taking as an example the case where the server 1a is the main device, the portions related in particular to journal acquisition and to recovery processing after a failure occurs.

FIG. 6 is a flowchart showing the processing procedure of the AP main unit 20 and the commit management unit 501 of the main device 1a in transaction processing.

FIG. 7 is a flowchart showing the lock processing procedure of the lock management unit 503 of each of the slave devices 1b, 1c, ....

FIG. 8 is a flowchart showing the temporary update command processing procedure of the AP access unit 21 of each of the slave devices 1b, 1c, ....

FIG. 9 is a flowchart showing the unlock command processing procedure of the lock management unit 503 of each of the slave devices 1b, 1c, ....

FIG. 10 is a flowchart showing the rollback command processing procedure of the lock management unit 503 of each of the slave devices 1b, 1c, ....

FIG. 11 is a flowchart showing the overall processing procedure from the occurrence of a failure through recovery processing.

FIG. 12 is a flowchart showing the recovery request processing procedure of the recovery management unit 504 of each slave device.

FIG. 13 is a flowchart showing the recovery processing procedure of the recovery management unit 502 of the main device.

FIG. 14 is a flowchart showing the recovery execution processing procedure of each slave device.

FIG. 15 is a flowchart showing the processing procedure of the recovery management unit 504 in another recovery process.

FIG. 16 is a flowchart showing the processing procedure of the recovery management unit 502 in another recovery process.

[Explanation of symbols]

1 Distributed processing device (server)
2 Communication network
3 Terminal
11 Communication manager
12 Business processing program
13 Distributed resource synchronization processing unit
14 Database management unit
15 Resource management unit
16 Database
17 Memory table / file unit
18 Journal
20 AP main unit
21 AP access unit
22 Global transaction management unit (GTM)
23 Local transaction management unit (LTM)
24 Lock log
25 Unlock log
26, 27 Update image log
501 Commit management unit
502, 504 Recovery management unit
503 Lock management unit

Continuation of front page: (72) Inventor: Toshiyuki Inoue, c/o NTT Data Communications Systems Corp., 3-3-3 Toyosu, Koto-ku, Tokyo

Claims

[Claims]

1. A cooperative distributed system in which a plurality of processing devices capable of communicating with each other each take charge of a divided portion of the resources and cooperatively perform distributed processing of transactions, wherein: at least one of the plurality of processing devices is a main device that accepts a transaction, and the other processing devices are slave devices for distributed processing of the accepted transaction; the main device has update information issuing means for issuing update information, indicating the update contents of the resources to be updated in the transaction, to the slave devices in charge of the resources to be updated; one of the main device and the slave devices has a journal in which all the update contents of the update information issued by the main device are recorded together with a transaction identifier; and each of the slave devices has actual update means for updating the resources that it manages, and recovery means for identifying a past transaction for which the update by the actual update means failed, fetching the update information corresponding to the identified transaction from the journal, and updating the resources based on the fetched update information.

2. The system according to claim 1, wherein each of the plurality of processing devices can be either the main device or the slave device, and comprises the update information issuing means, the journal, the actual update means, and the recovery means.

3. The system according to claim 1, wherein the main device has the journal.

4. The system according to claim 1, wherein each of the slave devices further comprises exclusive control means for locking the resource to be updated before the update and unlocking it after the update, and a lock/unlock log in which the history of locking and unlocking of the resource is recorded, and wherein the exclusive control means can perform locking and unlocking for each minimum accessible information unit of the resource.

5. The system according to claim 1, wherein each of the slave devices further comprises exclusive control means for locking the resource to be updated before the update and unlocking it after the update, and a lock/unlock log in which the history of locking and unlocking of the resource is recorded together with the transaction identifier, and wherein the recovery means identifies the transaction for which the update failed by referring to the lock/unlock log.

6. A cooperative distributed processing method in which a plurality of processing devices capable of communicating with each other each take charge of a divided portion of the resources and cooperatively perform distributed processing of transactions, the method comprising: a step in which one of the plurality of processing devices, acting as a main device, accepts a transaction; a step in which the main device issues update information, indicating the update contents of the resources to be updated in the transaction, to the other devices in charge of the resources to be updated; a step in which one of the main device and the other devices acquires all the update contents of the update information issued by the main device into a journal together with a transaction identifier; a step in which each of the other devices updates its own resources based on the update information; and a step in which each of the other devices identifies a past transaction for which that update failed, fetches the update information corresponding to the identified transaction from the journal, and updates the resources based on the fetched update information.