WO1996029651A1

WO1996029651A1 - Distributed telegraphic message processing system

Info

Publication number: WO1996029651A1
Application number: PCT/JP1995/000454
Authority: WO
Inventors: Kenichi Abe; Yukiharu Imahuku; Hitoshi Kirita; Toshiyuki Inoue
Original assignee: Ntt Data Communications Systems Corporation
Priority date: 1995-03-17
Filing date: 1995-03-17
Publication date: 1996-09-26
Also published as: GB2302970B; FR2731858A1; DE19581619T1; FR2731858B1; GB9624185D0; GB2302970A

Abstract

Telegraphic messages input from external terminals through a communication line are processed in a distributed manner by a plurality of telegraphic message processing devices (10a), as shown in the figure. The common resources in a storage unit (database) (14) are updated on the basis of the results of this processing. Each of the telegraphic message processing devices (10a) is provided with a lock sequence determination unit (15) and a lock processor (16). Lock requests issued by a device for common resources held by that device or by other devices are sorted into a predetermined order common to all the devices and issued sequentially, and the issuing of each subsequent lock request is held until the lock processing based on the preceding lock request has finished. Each device is also provided, in its telegraphic message processing unit (13), with an autonomous telegraphic message assurance member (133) which, in the event of a failure in updating in that device, removes the cause of the abnormality and executes the updating process autonomously. The system is thereby configured to improve overall processing efficiency while preventing contradictions among the common resources.

Description

Distributed electronic message processing system

Technical field

The present invention relates to a distributed message processing system including a plurality of message processing devices that perform distributed processing of input messages, and more particularly to a technique for avoiding inconsistency of shared resources while maintaining parallel execution of processing by each message processing device.

Background art

In a system such as an online transaction processing system, in which a single transaction is distributed to and processed by a plurality of message processing devices, it is common to use the information stored in the resource storage unit provided in each message processing device as a shared resource. In a system having such a configuration, the major challenges are to avoid simultaneous use of a shared resource by a plurality of processes, to avoid the reduction in processing efficiency caused by cooperative processing among the message processing devices, and to avoid inconsistency of the shared resources caused by their use by each message processing device.

Conventionally, to avoid simultaneous use of a shared resource by multiple processes, each message processing device is generally provided with a means for issuing an exclusive setting (hereinafter referred to as “lock”) request to the resource storage unit of its own device or another device, and a means for performing lock processing on the resource storage unit of its own device upon receiving a lock request from its own device or another device, so that access to the shared resource by other processes is prohibited until a lock release request is issued (hereinafter, Conventional Example 1). In this case, when the lock request is made per table, access to the entire table is prohibited, and when the lock request is made per data record in the table, access to that data record is prohibited; a message processing device that has issued a lock request for a resource that is already locked enters a waiting state.

On the other hand, to avoid an inconsistent state of the shared resources when a single update process updates the data records of multiple message processing devices, control is conventionally performed either to complete the update of all the data records normally or to invalidate all the updates, that is, update synchronization control. For this purpose, the two-phase commit method or an improved version of it can be used (hereinafter, Conventional Example 2). In these methods, the update is divided into two stages, "temporary update" and "actual update". When an input message is received, in the first stage, update information based on the processing result of the message is created and the data records are placed in the temporarily updated state. One message processing device, which becomes the main device according to the type of message, then determines whether the other message processing devices (slave devices) are ready to actually update (commit) the data records. If any device is not ready, the main device requests update cancellation (rollback) from all devices, including itself. Causes of not being ready include failures of a message processing device. When preparation for the actual update is complete in all devices, an actual update request based on the input message is issued to all devices, and the second stage, that is, the process of confirming the update information, is performed.
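The two-stage flow of Conventional Example 2 can be sketched as follows. This is only an illustrative sketch, not the method of any particular system: the class `Participant` and the function `two_phase_commit` are hypothetical names, and communication between devices is replaced by direct method calls.

```python
class Participant:
    """Hypothetical stand-in for one message processing device."""

    def __init__(self, name, ready=True):
        self.name = name
        self.ready = ready      # False simulates a device that cannot prepare
        self.pending = None     # temporarily updated value
        self.value = None       # actually updated (committed) value

    def prepare(self, update):
        # First stage: temporary update; report whether commit is possible.
        if self.ready:
            self.pending = update
        return self.ready

    def commit(self):
        # Second stage: make the temporary update permanent.
        self.value, self.pending = self.pending, None

    def rollback(self):
        # Cancel the temporary update.
        self.pending = None


def two_phase_commit(participants, update):
    """Return True if every device committed, False if all rolled back."""
    votes = [p.prepare(update) for p in participants]
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False


devices = [Participant("main"), Participant("slave1"), Participant("slave2")]
ok = two_phase_commit(devices, {"record": 100})

faulty = [Participant("main"), Participant("slave1", ready=False)]
failed = two_phase_commit(faulty, {"record": 200})
```

In the second call a single unready device causes every device to roll back, which is the all-or-nothing behavior described above.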

However, each conventional example has the following problems. In Conventional Example 1, when locks on a shared resource are requested in data record units, a deadlock may occur in relation to other data records in the same table, and the locks may become impossible to release. A lock request per table is effective mainly in preventing this deadlock, but it degrades the concurrency of other processes. Furthermore, when lock requests are issued a plurality of times from the same lock request issuing means, deadlock cannot be prevented in some cases. For example, suppose that four message processing devices are interconnected by a communication line, that table a in the second message processing device is locked by a lock request issued from the first message processing device to the second message processing device, and that table b in the fourth message processing device is locked by a lock request issued from the third message processing device to the fourth message processing device. If a lock request for table b in the fourth message processing device is then issued from the first message processing device and, at the same time, a lock request for table a in the second message processing device is issued from the third message processing device, tables a and b are already locked, so both requesting devices, the first and the third, remain in a waiting state and the locks on tables a and b can never be released.
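The four-device example above can be expressed as a wait-for graph, in which a deadlock appears as a cycle. The following is a minimal sketch under stated assumptions: the names `has_cycle`, `holders`, and `requests` are hypothetical, and each lock is attributed to the device that requested it.

```python
def has_cycle(wait_for):
    """Detect a cycle in a wait-for graph given as {waiter: holder}."""
    for start in wait_for:
        seen = set()
        node = start
        while node in wait_for:
            if node in seen:
                return True
            seen.add(node)
            node = wait_for[node]
    return False


# Locks already held: table a (on device 2) by device 1,
# table b (on device 4) by device 3.
holders = {"table a": "device1", "table b": "device3"}

# New requests: device 1 wants table b, device 3 wants table a.
requests = [("device1", "table b"), ("device3", "table a")]

# Each requester waits for the current holder of the table it wants.
wait_for = {req: holders[tbl] for req, tbl in requests if holders[tbl] != req}
deadlocked = has_cycle(wait_for)
```

Here `wait_for` contains a two-node cycle, so neither request can ever be granted.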

In Conventional Example 2, requests and responses are exchanged at least twice between the master device and each slave device, once for the temporary update and once for the actual update, so communication must be performed at least four times and communication overhead is likely to occur. In addition, when a failure such as blocking occurs in one specific device, the processing of all devices is stopped, and the processing performance of the entire system is significantly degraded. Furthermore, if a heuristic error occurs in the resource storage unit of a specific message processing device after an actual update request, the conventional system allows the update of information in that message processing device to be canceled even at this stage. In this case, that message processing device also cancels the temporary update state created by the input message, so its update falls out of synchronization with the resource storage units of the other devices, and inconsistency of the shared resources could not be avoided.

An object of the present invention is to solve these problems and to provide a distributed message processing system configured to fully guarantee the update processing of a shared resource while maintaining parallel execution of processing by each message processing device.

Disclosure of the invention

The present invention provides a distributed message processing system that performs distributed processing of an input message by a plurality of message processing devices and updates a shared resource shared by the respective message processing devices based on a processing result of the input message. A message processing device that performs distributed processing of the input message includes the following elements.

(1) means for issuing a request for exclusive control, for example a lock request, against other processes for the shared resource to be updated,

(2) means for quantifying the route information from the own device to the shared resource and the lock request, thereby generating sequence information,

(3) means for rearranging the sequence information in a predetermined order common to all the message processing devices and sequentially issuing the sequence information together with the lock request,

(4) means for suspending the issuance of the next sequence information until the exclusive control process of the shared resource based on the issuance of one piece of sequence information, for example the lock process, is completed.
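Elements (3) and (4) above can be illustrated with a short sketch. It is an assumption-laden illustration rather than the claimed implementation: the function `issue_in_common_order` and the callable `issue` are hypothetical, and sequence information is modeled as a tuple of numbers.

```python
def issue_in_common_order(requests, issue, key=None):
    """Issue lock requests one at a time in an order shared by every device.

    `issue` is a hypothetical callable that returns True once the lock
    processing for a request has completed, and False on failure.
    """
    issued = []
    for request in sorted(requests, key=key):
        # The next request is not issued until this call has returned.
        if not issue(request):
            break
        issued.append(request)
    return issued


log = []


def fake_lock_processor(request):
    # Assumption: lock processing always succeeds in this sketch.
    log.append(request)
    return True


# Requests are (node number, table number, record number) tuples.
device_x = issue_in_common_order([(2, 1, 1), (1, 1, 1)], fake_lock_processor)
device_y = issue_in_common_order([(1, 1, 1), (2, 1, 1)], fake_lock_processor)
```

Whatever order the requests arrive in, both callers issue them in the same sorted order, which is the property the common ordering is meant to provide.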

In the above configuration, it is preferable to set the shared resources in data record units from the viewpoint of ensuring parallel operation. In this case, the means for generating the sequence information is configured to set and quantify the route information and the lock request in data record units. In addition, the message processing device that updates the shared resource includes means for executing the required lock process based on the sequence information before updating the shared resource. As necessary, it is further provided with means for detecting the cause of the abnormality when the update of the shared resource ends abnormally, means for autonomously removing the cause of the abnormality and re-executing the update, and means for restricting the access of other processes to the abnormal part.

In the distributed message processing system having the above-described configuration, the sequence information of the shared resources is set in advance in a sequence setting table. This sequence information can be composed of, for example, a node name for uniquely identifying each message processing device, a node number indicating the order of lock requests for each node, a table name included in the node, a table number indicating the order of lock requests for each table, a record name included in each table, and a record number indicating the order of lock requests for each record.
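One possible in-memory shape for such a sequence setting table is sketched below. The field names follow the six items listed above; the table contents and the record names `r1` and `r2` are hypothetical placeholders, and `SequenceEntry` and `lookup` are names introduced only for this illustration.

```python
from typing import NamedTuple


class SequenceEntry(NamedTuple):
    node_name: str
    node_number: int
    table_name: str
    table_number: int
    record_name: str
    record_number: int


# Hypothetical contents; a real table would be loaded at system start-up.
SEQUENCE_TABLE = [
    SequenceEntry("A", 1, "a", 1, "r1", 1),
    SequenceEntry("A", 1, "a", 1, "r2", 2),
    SequenceEntry("A", 1, "b", 2, "r1", 1),
    SequenceEntry("B", 2, "c", 1, "r1", 1),
]


def lookup(node_name, table_name, record_name):
    """Return (node number, table number, record number) for one record."""
    for entry in SEQUENCE_TABLE:
        names = (entry.node_name, entry.table_name, entry.record_name)
        if names == (node_name, table_name, record_name):
            return (entry.node_number, entry.table_number, entry.record_number)
    raise KeyError((node_name, table_name, record_name))
```

The returned tuple of numbers can be compared directly, so it can serve as the sort key for ordering lock requests.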

When a lock request is issued from one message processing device to a shared resource, the sequence information relating to that shared resource is retrieved from the sequence setting table. The sequence information is then arranged in a predetermined order and issued to the corresponding shared resources in that order. The arrangement order is defined in common for all message processing devices; for example, the items are arranged in numerical or alphabetical order based on at least one of the node name, node number, table name, and record name. As a result, a plurality of pieces of sequence information can be given a predetermined order, and unified management for each shared resource can be performed. After one piece of sequence information is issued, the output of the next is suspended until the lock processing based on it is completed, so that the occurrence of deadlock is completely suppressed. When the lock request and the corresponding lock processing are performed in data record units, parallel execution of processing on other data records is ensured.

On the other hand, when a shared resource is updated by one message processing device, the message processing device holding the shared resource to be updated checks whether the update process (temporary update or actual update) of its own shared resource has failed. If a failure is confirmed, the cause of the abnormality is detected and removed autonomously, and the update is executed again. As a result, communication overhead between the message processing devices is avoided, and it is also possible to avoid stopping the processing of other devices because of an abnormality in one message processing device. The problems of the conventional examples are thus solved together.

Brief description of the figures

FIG. 1 is a schematic configuration diagram of a distributed message processing system of the present invention.
FIG. 2 is a block diagram of an individual message processing device.
FIG. 3 is a block diagram of the lock sequence determination unit in FIG. 2.
FIG. 4 is an explanatory diagram showing a setting example of the lock sequence setting table of FIG. 3.
FIG. 5 is an explanatory diagram showing an example of exclusive control based on the setting example of FIG. 4.
FIG. 6 is an explanatory diagram of the exclusive control of FIG. 5.
FIG. 7 is a block diagram of the autonomous message assurance unit shown in FIG. 2.
FIG. 8 is a procedure explanatory diagram for executing a shared resource update process based on a result of the distributed process.

Best mode for carrying out the invention

Next, preferred embodiments of the present invention will be described in detail with reference to the drawings.

FIG. 1 is a schematic configuration diagram of a distributed electronic message processing system according to one embodiment of the present invention. In the distributed message processing system 1 of this embodiment, for example, four message processing devices 10a to 10d having the same elements are connected to one another by a high-speed bus, a LAN (Local Area Network), or the like, and are connected to terminal devices 3a to 3n through a communication line 2. Messages are processed in a distributed manner, with one of the devices acting as the master device and the others as slave devices. General-purpose workstations or personal computers can be used as the terminal devices 3a to 3n.

As shown in FIG. 2, each of the message processing devices 10a to 10d includes an information input unit 11, a message distribution processing unit 12, a message processing unit 13, a resource storage unit 14, a lock sequence determination unit 15, a lock processing unit 16, a journal acquisition unit 17, and an information output unit 18. The resource storage unit 14 stores, table by table, the data records that are the shared resources of the message processing devices 10a to 10d. Hereinafter, the message processing device 10a will be described as the main device, and the other message processing devices 10b to 10d as slave devices.

In the message processing device 10a, the information input unit 11 receives information from the communication line 2, extracts a message having a predetermined data structure, and passes the extracted message to the message distribution processing unit 12.

The message distribution processor 12 distributes the message according to the destination of the input message.

The message processing unit 13 performs the required processing on the messages assigned to its own device. Based on the processing result, the update processing unit updates the shared resources stored in the resource storage unit 14 of the own device or in the resource storage unit of another device, and detects whether the processing ended abnormally. When the processing is abnormal, this fact is notified to the autonomous message assurance unit 133. The autonomous message assurance unit 133 detects the cause of the abnormality, removes it autonomously, and completes the update process. This processing will be described later in detail. When a shared resource is to be updated, a lock request is issued from the lock request unit 131 before the update is executed.

The lock sequence determination unit 15 quantifies the lock requests issued by its own device, that is, the request information asking that access by other processes to the shared resource be prohibited, and determines the order in which they are issued. The lock processing unit 16 determines the validity of a lock request received from the own device or another device and, when the received lock request is valid, locks the data record in the resource storage unit 14. The lock can be released at any time. In this case, all shared resources may be unlocked at the same time, or they may be unlocked individually.

The journal acquisition unit 17 acquires and records a failure recovery journal, that is, update information that has been processed normally. The information output unit 18 edits the processing result of the message processing unit 13 and outputs it to the source of the received message or to a designated destination.

Next, the issuance of lock requests and the lock process, which are one feature of the present embodiment, will be described in detail with reference to the drawings.

FIG. 3 is a block diagram of the lock sequence determination unit 15. As shown in the figure, the lock sequence determination unit 15 includes a lock sequence setting table 151 in which order information, the basis of the sequence numbers indicating the lock request order, is registered in advance in data record units; a sort processing unit 152 that sorts the received lock requests; and a lock request result output unit 153 that issues the lock requests to the lock processing unit 16 in the sorted order and outputs the result. These units are included in all the message processing devices in common, so that lock processing is possible in both directions, on the own device and on other devices.

FIG. 4 shows a setting example of the lock sequence setting table 151. The setting example in FIG. 4 is based on the configuration in FIG. 5, which shows the concept of lock requests between two message processing devices and the accompanying lock process. The table holds the node names uniquely assigned to identify the nodes (such as A and B), the node numbers indicating the order of lock requests for each node, the table names a to d included in the nodes, the table numbers indicating the order of lock requests for each table, the record names included in each table, and the record numbers indicating the order of lock requests for each record. These contents are merely examples, and the identification information such as the individual node names and the order information including the route information may of course take other formats. The contents of the lock sequence setting table 151 are loaded into the memory of each of the message processing devices 10a to 10d when the distributed message processing system 1 is started.

Here, the operation will be described for the case shown in FIG. 5, in which lock requests are issued from the lock sequence determination units 15a and 15b of the two message processing devices 10a and 10b to the lock processing units 16a and 16b of the own device and the other device. First, an operation example of the lock sequence determination unit 15a will be described with reference to FIG. 6. In FIG. 6, S indicates a processing step.

The lock sequence determination unit 15a receives a lock request issued from the lock request unit 131 of the own device (S100). The format of the lock request specifies the specific data record on each node, so the node name (identification information of the message processing device), the table name (identification information of the table in the resource storage unit 14), and the record name (Identification information of the data record in the table). Next, referring to the lock sequence setting table 151, the node number, table number, and record number corresponding to the node name, table name, and record name specified in the received lock request are acquired (S101). A sequence number representing the lock order is created (S102). In other words, it is quantified. For example, for a record represented by "Aa", the node number "1", the table number "1", the record number "1", and the record number "1" Get the record number "Γ" and create the sequence number "111". Similarly, for the record represented by "Be force", create the sequence number "21 Γ". The sort processing unit 152 sorts these sequence numbers (S103), and if there is an unprocessed lock request, repeats the same processing (S104). If there is no unprocessed lock request, the sorted first lock request ((1) shown) is issued to the lock processing unit 16a (S105). Then, the result of the processing based on the request is received from the lock processing unit 16a (S106), and it is determined whether or not the processing has been performed (S107). If β¾ϋ (OK), it is again determined whether or not an unprocessed lock request remains (S108). Issued to lock processing unit 16b. Thereafter, the same processing is repeated. When there is no more outstanding lock processing, a lock processing result report is output to the lock requesting unit 131 (S109). 
If the result of the lock process is abnormal (NG), information indicating "abnormal termination" is set in the above result report and output (S110) o
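Steps S100 to S110 can be traced with a small sketch. It is a simplified model under several assumptions: the record name `r1` is a hypothetical placeholder, the table holds only the two entries needed to produce the sequence numbers "111" and "211", and each lock processing unit is modeled as a callable that returns True on success.

```python
# Hypothetical excerpt of a lock sequence setting table:
# (node name, table name, record name) -> (node no., table no., record no.)
LOCK_SEQUENCE_TABLE = {
    ("A", "a", "r1"): (1, 1, 1),
    ("B", "c", "r1"): (2, 1, 1),
}


def sequence_number(request):
    node_no, table_no, record_no = LOCK_SEQUENCE_TABLE[request]   # S101
    # S102: concatenating digits sorts correctly only while every number
    # has the same width; that is an assumption of this sketch.
    return f"{node_no}{table_no}{record_no}"


def determine_and_issue(requests, lock_processors):
    numbered = sorted((sequence_number(r), r) for r in requests)  # S103-S104
    for seq, request in numbered:                                  # S105, S108
        node_name = request[0]
        ok = lock_processors[node_name](seq)                       # S106
        if not ok:                                                 # S107
            return {"status": "abnormal termination", "failed": seq}  # S110
    return {"status": "normal", "order": [seq for seq, _ in numbered]}  # S109


issued = []


def always_ok(seq):
    issued.append(seq)
    return True


report = determine_and_issue(
    [("B", "c", "r1"), ("A", "a", "r1")],
    {"A": always_ok, "B": always_ok},
)
```

Although the request for node B is received first, the request numbered "111" is issued before "211", and the second is issued only after the first has returned.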

Next, the operation when the acknowledgment request is issued in the order of "Be power" and "Aa" from the hacking sequence 15b of the other message processing device 10b in parallel with the lock processing in the message processing device 10a. Will be described again with reference to FIG.

The lock sequence determining unit 15b determines a sequence number based on the lock request sequence setting table, and sorts the sequence number in the sort processing unit, as in the processing in the message processing device 10a. As a result, the lock requests are sorted as "Aa", "Be force", and are issued in this order ((3) and (4) in the figure). If the previous lock request (1) is already valid, the next lock request (3) will be sent to the lock processing unit 16a until the previous lock request (1) is released by the message processing device 10a. The lock sequence. The determination unit 15b also processes the lock request (3) in the lock processing unit 16a and waits until receiving the result to request the next lock request for "Be force". Do not issue (4).

As described above, even if there are a plurality of lock requests to the data record expressed by "Aa" and "Be force", the sequence numbers of these lock requests are created in each of the message processing devices 10a and 10b. Further, by sorting this to give a certain order, the occurrence of deadlock is reliably prevented. In addition, since each of the above processes is not a lock process for each table but a lock process for each data record, a process for accessing a data record other than the data records represented by “AA” and “Bc force” ( Application Parallel processing can be performed for the program. Note that the sequence number does not necessarily need to be numerical information, but may be character information.
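The effect described above can be reproduced with two threads standing in for the message processing devices 10a and 10b. This is a sketch under assumptions: each data record is guarded by a `threading.Lock` keyed by its sequence number, and the function `device` is a hypothetical stand-in for a device's lock sequence determination and lock processing.

```python
import threading

# One lock per data record, keyed by sequence number.
locks = {"111": threading.Lock(), "211": threading.Lock()}
finished = []


def device(name, requested):
    # Sort into the common order before acquiring anything.
    ordered = sorted(requested)
    for seq in ordered:
        locks[seq].acquire()    # the next lock is requested only after this one
    finished.append((name, ordered))
    for seq in reversed(ordered):
        locks[seq].release()


threads = [
    threading.Thread(target=device, args=("10a", ["111", "211"])),
    threading.Thread(target=device, args=("10b", ["211", "111"])),
]
for t in threads:
    t.start()
for t in threads:
    t.join(timeout=5)
```

Because both threads acquire "111" before "211", neither can hold one lock while waiting for the other in the opposite order, so both run to completion.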

Next, the autonomous message assurance unit 133, another feature of the present embodiment, will be described. As shown in FIG. 7, the autonomous message assurance unit 133 includes an abnormal location detection processing unit 233 that identifies the abnormal location by detecting the name of the abnormal device, the record number, and the like; an abnormal location recovery processing unit 333 that performs self-recovery from the abnormality; an update processing unit 433 that performs the update process after the abnormal location has been recovered; and a re-update determination processing unit 533 that determines whether the update has succeeded.

The self-recovery performed by the abnormal location recovery processing unit 333 is, for example, as follows: in the event of a failure in the resource storage unit 14, the stored information is replaced using the journal of the journal acquisition unit 17 and restored to the normal state; in the event of a program error, the error is repaired by debugging or the like. This self-recovery and the subsequent re-update are executed repeatedly until the update processing (actual update) in the message processing unit 13 ends normally.
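The detect, recover, and re-update cycle can be sketched as a retry loop. Everything here is an illustrative assumption: the store and journal are plain dictionaries, damage is simulated with a `_damaged` flag, and the `max_attempts` bound is added for safety even though the text describes repetition until the update ends normally.

```python
class StorageFault(Exception):
    """Hypothetical error raised when the resource store is damaged."""


def detect_abnormal_location(store):            # role of unit 233
    return "store" if store.get("_damaged") else None


def recover(store, journal):                    # role of unit 333
    # Replace the stored information with the journaled normal state.
    store.clear()
    store.update(journal)


def try_update(store, key, value):              # role of unit 433
    if detect_abnormal_location(store):
        raise StorageFault(key)
    store[key] = value


def assured_update(store, journal, key, value, max_attempts=3):
    """Return the attempt number on success, or None if all attempts fail."""
    for attempt in range(1, max_attempts + 1):
        try:
            try_update(store, key, value)
            return attempt                      # role of unit 533: success
        except StorageFault:
            recover(store, journal)
    return None


store = {"balance": 100, "_damaged": True}
journal = {"balance": 100}
attempts = assured_update(store, journal, "balance", 150)
```

The first attempt fails on the damaged store, the journal restores it, and the second attempt completes the update without involving any other device.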

Next, the operation when the shared resource is updated in the message processing device 10a will be described with reference to FIG.

The message processing unit 13 receives the message to be processed for update from the message distribution processing unit 12 (S200) and checks the validity of its content. Specifically, it checks whether the message is in an allowed format or, if the message includes monetary amount information, whether the amount is an abnormal value such as a negative value. If the result of the check is abnormal, the processing is terminated and the next message is input. If the check is normal, update information is created (S201), and a lock request is issued to the lock sequence determination unit 15 (S202). If the message requires updating a file in the resource storage unit of the own device or another device, a temporary update request is made at the same time to the lock processing unit that manages that resource storage unit (S203). The temporary update request to another device may be made directly to its lock processing unit or via the message processing unit of that device. It is then determined whether the temporary update has been performed normally in all the related message processing devices (S204). If even one of the temporary updates has failed (NO), a request to cancel the temporary update of the resource storage unit is made to the lock processing unit 16 of the own device and, at the same time, to the other related devices (S207). As a result, the update processing in all devices is stopped. Then, a lock release request is issued to the related lock processing units (S208).

On the other hand, if all of the temporary update processing has been performed normally in S204 (YES), a journal acquisition request is made to the journal acquisition unit 17 (S205), and the update information is recorded in the journal file as failure recovery information. At this time, it is determined whether the data has been recorded normally in the journal file (S206); if recording has failed, the processing of S207 to S208 is performed. When the information has been recorded normally, an actual update request is made to the lock processing unit 16 of the own device or the lock processing unit of the related device (S209). The actual update request to another device may be sent directly to the lock processing unit that manages the resource storage unit or via the message processing unit of that device.

It is determined whether the actual update in the own device has been completed normally (S210). If the actual update has not been completed normally for some reason, the autonomous message assurance unit 133 is notified (S211). The autonomous message assurance unit 133 autonomously removes the cause of the abnormality and completes the actual update by means of the configuration described above. When the actual update is performed by another device, that device likewise determines whether its actual update has been completed normally; if not, the autonomous message assurance unit of that device autonomously removes the cause of the abnormality and completes the actual update.

After the actual update is completed, the information output unit 18 is requested to output a message (S212), and the message processing result is sent to the source of the message.

If the autonomous message assurance unit of the own device or another device cannot remove the abnormal part and the actual update cannot be performed normally, an update cancellation request can be made by the same procedure as for the temporary update cancellation.

After that, the lock processing unit 16 is requested to release the lock (S213), and the update processing ends. In this way, even if the processing based on the first actual update request fails in a related message processing device, the autonomous message assurance unit of that device removes the cause of the error and autonomously completes the update to the resource storage unit, so each message processing device can continue the actual update process in its own device without waiting for a response from another device. As a result, it is possible to effectively prevent inconsistency in the shared resources while preventing a decrease in processing efficiency.
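The overall update procedure (S202 to S213) can be summarized in a compact sketch. It is a simplified model, not the embodiment itself: `Device` and `process_update` are hypothetical names, locking is only recorded as a step label rather than performed, and a failed actual update is assumed to succeed once the device has retried locally.

```python
class Device:
    """Hypothetical message processing device with one resource store."""

    def __init__(self, fail_temp=False, fail_actual_once=False):
        self.data, self.pending = {}, {}
        self.fail_temp = fail_temp
        self.fail_actual_once = fail_actual_once

    def temp_update(self, update):              # S203
        if self.fail_temp:
            return False
        self.pending = dict(update)
        return True

    def cancel(self):                           # S207
        self.pending = {}

    def actual_update(self):                    # S209
        if self.fail_actual_once:
            self.fail_actual_once = False       # cause removed before the retry
            return False
        self.data.update(self.pending)
        self.pending = {}
        return True


def process_update(devices, update, journal):
    steps = ["lock"]                                         # S202
    if not all([d.temp_update(update) for d in devices]):    # S204
        for d in devices:
            d.cancel()                                       # S207
        return steps + ["cancel", "unlock"]                  # S208
    journal.append(dict(update))                             # S205
    for d in devices:
        while not d.actual_update():                         # S210
            steps.append("self-recovery")                    # S211, local retry
    return steps + ["commit", "unlock"]                      # S213
```

A device whose actual update fails recovers and retries on its own inside the loop, so the other devices are never asked to roll back at that stage.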

In this embodiment, distributed processing based on the two-phase commit method has been described, in which the message processing device 10a is the main device and the other message processing devices 10b to 10d are slave devices. The present invention can also be applied to a system in which each message processing device processes messages addressed to its own device at an independent timing.

Industrial applicability

As is clear from the above description, in the distributed message processing system of the present invention, when a message processing device that performs distributed processing of an input message issues a lock request for a shared resource, the route information from the own device to the shared resource and the lock request are quantified and rearranged in a predetermined order common to all devices, and the issuance of the next lock request is suspended until the lock process based on one lock request is completed. Deadlock can therefore be reliably prevented without impairing the parallel execution of processing. In addition, when a plurality of lock requests for a shared resource are generated, their order can be ensured easily and reliably, and centralized management of the order is possible.

Furthermore, even if an abnormality occurs in one of the plurality of message processing devices, that device removes the cause of the abnormality and completes the update process, so the process based on the first update request is completed autonomously and loss of update synchronization between shared resources is reliably prevented. Also, in the case of distributed processing based on the two-phase commit method, for example, one message processing device can complete the processing and updating of a message addressed to its own device without waiting for a response from another message processing device, so the processing efficiency is significantly improved compared with a conventional system that employs the two-phase commit method. Furthermore, since communication between the message processing devices related to updating needs to be performed only once, communication overhead is suppressed. In the distributed message processing system of the present invention, access to the abnormal location from other devices is restricted until the cause of the abnormality in the own device is eliminated, so the occurrence of abnormal information is effectively prevented and the reliability of the overall system can be improved.

Claims

1. A distributed message processing system that performs distributed processing of input messages by a plurality of message processing devices and updates a shared resource shared by the message processing devices based on the processing results of the input messages, wherein the message processing device that performs distributed processing of the input message comprises: means for issuing a request for exclusive control against other processes for the shared resource to be updated; means for quantifying route information from the own device to the shared resource and the exclusive control request, thereby generating sequence information; means for rearranging the sequence information in a predetermined order common to all the message processing devices and sequentially issuing the sequence information together with the exclusive control request; and means for suspending the issuance of the next sequence information until the exclusive control process of the shared resource based on the issuance of one piece of sequence information is completed; and wherein the message processing device that updates the shared resource comprises means for executing a required exclusive control process based on the sequence information prior to the update.

2. The distributed message processing system according to claim 1, wherein the shared resource is set in data record units, and the means for generating the sequence information sets and quantifies the route information and the exclusive control request in data record units.

3. The distributed message processing system according to claim 1, wherein the message processing device that updates the shared resource further comprises: means for detecting the cause of an abnormality when the update of the shared resource ends abnormally; and means for autonomously removing the cause of the abnormality and re-executing the update.

4. The distributed message processing system according to claim 3, wherein the message processing device for updating the shared resource includes means for restricting access of another process to an abnormal point.