CN100334554C

CN100334554C - A method of backup and load control in a distributed data processing system

Info

Publication number: CN100334554C
Application number: CNB028299396A
Authority: CN
Inventors: 李海鹏; 戴存军
Original assignee: ZTE Corp
Current assignee: ZTE Corp
Priority date: 2002-12-31
Filing date: 2002-12-31
Publication date: 2007-08-29
Anticipated expiration: 2022-12-31
Also published as: CN1695120A; WO2004059484A1; AU2002357568A1

Abstract

The present invention relates to a distributed processing system. The distributed processing system comprises a plurality of load groups, an interface processing module and a plurality of processors. The load groups generate service requests. The interface processing module receives the service requests from the load groups and sends them to the corresponding processors. Each processor processes the services of a plurality of load groups, and each load group uses one of the processors as its master processor, which bears the processing of its services, and another of the processors as its standby processor. When the master processor fails, the standby processor processes the services of the load group.

Description

Backup and load control method in a fully distributed processing system

Technical field:

The present invention relates to a fully distributed processing system, and in particular to a backup and load control method in such a system.

Background art:

In distributed systems, the commonly adopted backup and control mode is the active/standby arrangement: an active processing system is paired with a standby processing system. Under normal circumstances the standby system handles no services; only when the active system fails is the entire load transferred to the standby system.

Another commonly used mode is load sharing: two systems each carry half of the load, with no distinction between active and standby. If one of them fails, the entire load is handled by the other. Such a distributed data processing system is disclosed in patent document EP 1 139 235 A2, "Distributed data processing system and method for processing data in distributed data processing system".

However, both of these modes must reserve half of the processing capacity for backup operation, so the utilization of system resources cannot exceed 50%. Such a control mode may indeed be suitable for a small distributed processing system. In a high-capacity fully distributed processing system, however, where advances in the prior art have already reduced the probability of processor failure to a very low level, such a utilization rate is undoubtedly wasteful.

To improve system utilization, patent document US5408649, "Distributed data access system including a plurality of database access processors with one-for-n redundancy", proposes an N+1 backup mode: among a plurality of processing systems, one is kept in the backup state. When one of the normally operating processing systems fails and withdraws from service, the backup processing system is started to take over the work of the failed system. In this arrangement of N+1 processing systems, it cannot be known in advance which of the N working systems will need the backup system, so none of the N working systems backs up its data to the backup system in real time. When one of the working systems fails, the data that should have been backed up is lost, and the backup system, once started, has to begin processing anew.

For a distributed system used for data processing, and especially for a communication system with very high real-time requirements, in which both data and processing state must be backed up in real time, this mode of operation is clearly unacceptable.

Summary of the invention:

The object of the present invention is to provide a backup and load control method and system for use in a high-capacity fully distributed processing system. The method and system not only satisfy the requirement for real-time backup, thereby improving system reliability, but also effectively improve system utilization.

A distributed processing system provided by the present invention comprises: a plurality of load groups that generate service requests; an interface processing module that receives the service requests from each load group and distributes them to the corresponding processors; and a plurality of processors. Each processor is responsible for handling the services of a plurality of load groups. For each load group, one of the processors serves as the master processor and bears its service processing, and another of the processors serves as the spare processor; when the master processor fails, the spare processor is responsible for handling the services of that load group. Each processor includes a synchronization trigger module for synchronizing the information in the master processor to the spare processor when the master processor fails.

A processing method carried out in a distributed processing system, as provided by the present invention, comprises the steps of: receiving service requests from a plurality of load groups; distributing each service request to the corresponding processor among a plurality of processors for handling, where each processor can handle the services of a plurality of load groups; and, when the processor acting as master processor for a service request fails, having the service request handled by another of the processors acting as the spare processor for that service request. The method is characterized in that, when the master processor fails, the information in the master processor is synchronized to the spare processor.

In the distributed processing system and processing method provided by the invention as described above, the plurality of processors comprises at least three processors.

Detailed description:

The invention is described in detail below with reference to the drawings:

Fig. 1 shows the specific processing relationship between the processors and the load groups;

Fig. 2 shows how the load is handled when one of the processors fails;

Fig. 3 shows how the load is handled when two adjacent processors fail;

Fig. 4 is a flow block diagram of the implementation of this load sharing mode;

Fig. 5 shows a more complex load sharing mode covered by the invention.

As shown in Fig. 1, this distributed processing system has four processors. Under normal operating conditions, processor 1 is responsible for handling the services from load groups 1 and 2, processor 2 for load groups 3 and 4, processor 3 for load groups 5 and 6, and processor 4 for load groups 7 and 8.

Handling the services of a load group requires a certain surrounding environment, for example data and memory conditions. Because of such physical constraints, a single processor cannot be equipped to handle the services of every load group. The correspondence between processors and load groups therefore has to be determined in advance, and the environment prepared at startup.

In the example of Fig. 1, processor 1 is also able to handle the services of load group 8 and load group 3. That is, if processor 4, the master processor bearing the service processing of load group 8, fails, load group 8 can be transferred to processor 1 for continued handling. Moreover, if processor 4 synchronizes its intermediate information to processor 1 as a backup while it handles the services of load group 8 normally, then when processor 4 fails and processor 1 takes over the services of load group 8, the processor switchover can be carried out without interrupting the services.

Likewise, every other load group has its backup conditions on an adjacent processor, and each processor can in turn handle, as backup, two adjacent load groups. This forms a load backup chain whose links interlock in a ring, and the processors complete the service processing jointly.
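As an illustration only, the ring backup chain of Fig. 1 can be sketched as a table that maps each load group to its master and spare processor. The function name `build_ring_table` and the numbering rule (processor p is master for load groups 2p-1 and 2p, with the first group backed up on the previous processor in the ring and the second on the next) are assumptions chosen to reproduce the four-processor example; they are not stated as a general rule in the description.

```python
def build_ring_table(num_processors):
    """Return {load_group: (master, spare)} for a ring backup chain.

    Assumed layout: processor p is master for load groups 2p-1 and 2p.
    Group 2p-1 is backed up on the previous processor in the ring and
    group 2p on the next one, so each processor backs up two groups.
    """
    table = {}
    for p in range(1, num_processors + 1):
        prev_p = (p - 2) % num_processors + 1  # neighbour before p (wraps around)
        next_p = p % num_processors + 1        # neighbour after p (wraps around)
        table[2 * p - 1] = (p, prev_p)
        table[2 * p] = (p, next_p)
    return table


table = build_ring_table(4)
# Load group 8 is mastered by processor 4 and backed up on processor 1;
# load group 3 is mastered by processor 2 and backed up on processor 1.
```

With four processors this yields the relationships described for Fig. 1: processor 1 holds the backup conditions for load groups 3 and 8.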

If one processor fails, the situation is as shown in Fig. 2. Suppose processor 2 fails: of the load groups originally handled by the failed processor, the load of load group 3 is taken over by processor 1 and the load of load group 4 by processor 3. Spreading the load in this way avoids the load shock that would result if a single other processor had to absorb all of it, and reduces the likelihood of cascading failures. In this situation each takeover processor gains only one load group on top of the two it already had, so under normal conditions the load factor of a processor can reach 66%.
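The redistribution in Fig. 2 can be checked with a short sketch. The table below is a hand-written rendering of the Fig. 1 relationships, and the helper name `groups_per_processor` is hypothetical; the sketch also assumes every load group carries an equal load.

```python
from collections import Counter

# Static table for the four-processor ring: load group -> (master, spare).
TABLE = {1: (1, 4), 2: (1, 2), 3: (2, 1), 4: (2, 3),
         5: (3, 2), 6: (3, 4), 7: (4, 3), 8: (4, 1)}


def groups_per_processor(table, failed):
    """Count the load groups each working processor carries after failures."""
    load = Counter()
    for group, (master, spare) in table.items():
        if master not in failed:
            load[master] += 1
        elif spare not in failed:
            load[spare] += 1
        # If both master and spare have failed, the group is not served.
    return load


after = groups_per_processor(TABLE, failed={2})
# Processors 1 and 3 each carry three groups, processor 4 still carries two.
normal_load_factor = 2 / max(after.values())  # two groups out of a capacity of three
```

Because no surviving processor carries more than three load groups, a processor that normally carries two can run at roughly two thirds of its capacity.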

In a traditional system of pairwise mutual backup, if load groups 1 to 4 are borne by processors 1 and 2, which back each other up, then in the extreme case where processor 1 and processor 2 both fail, the services of all of load groups 1 to 4 are inevitably interrupted.

Fig. 3 shows how the load is distributed under the backup mode proposed by the present invention when processors 1 and 2 fail at the same time. Load group 1 is adjacent to processor 4, and processor 4 holds the conditions for handling this load group, so load group 1 is transferred to processor 4 for handling; likewise, load group 4 is transferred to processor 3. The services of only two load groups are interrupted.
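The comparison between pairwise backup and the ring backup chain under a double failure can be sketched as follows. The `PAIRWISE` layout for processors 3 and 4 and the helper name `interrupted_groups` are assumptions added for illustration; the description only specifies the pairing of processors 1 and 2.

```python
# Ring backup chain as in Fig. 1: load group -> (master, spare).
RING = {1: (1, 4), 2: (1, 2), 3: (2, 1), 4: (2, 3),
        5: (3, 2), 6: (3, 4), 7: (4, 3), 8: (4, 1)}

# Traditional pairwise backup (assumed layout): processors 1/2 and 3/4
# back each other up.
PAIRWISE = {1: (1, 2), 2: (1, 2), 3: (2, 1), 4: (2, 1),
            5: (3, 4), 6: (3, 4), 7: (4, 3), 8: (4, 3)}


def interrupted_groups(table, failed):
    """Return the load groups whose master and spare have both failed."""
    return sorted(group for group, (master, spare) in table.items()
                  if master in failed and spare in failed)


ring_lost = interrupted_groups(RING, failed={1, 2})
pair_lost = interrupted_groups(PAIRWISE, failed={1, 2})
# ring_lost contains load groups 2 and 3; pair_lost contains load groups 1 to 4.
```

In the ring layout only the two load groups shared between the failed neighbours lose service, whereas the pairwise layout loses all four.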

In general, a multi-processor distributed processing system presents only one interface processing module 10 to the outside, which receives service requests from outside. After a service request enters this interface module, it is distributed in some manner to one of the processors for handling. The method described in the present invention also adopts this approach; its implementation and block diagram are shown in Fig. 4.

In the distributed processing system of the present invention shown in Fig. 4, all data in the system and the parameters required by the load groups are stored in a centralized high-capacity database. In the database, these data and units are organized and distinguished by logical load group (units with similar service capability are assigned to the same load group). The total number of load groups is twice the number of processors.

When a processor starts, the data and other conditions relating to the load groups assigned to it, together with those of the load groups it must handle as backup, are loaded into the processor, so that it is ready to process their services.

In the interface processing module, when a service request is received, the load unit from which the request originates is determined first. Then, as described above, the load group to which the request belongs is determined from the grouping of load units. The service distribution table in a service inquiry module 30 is then consulted; from the entries of this table, each consisting of a load group, a master processor and a spare processor, the master processor and spare processor responsible for the service request are determined. The service inquiry module may be located in the interface processing module or in each processor. The service distribution table in the service inquiry module is statically configured in advance; the table shown in Fig. 4 is set up according to the ring backup chain structure of Fig. 1.

Once the master processor and spare processor for the service request have been obtained from the service distribution table, the state of the processors must also be considered to decide whether the service should currently be handled by the master processor or by the spare processor.

The interface processing module contains a system management module 20, which is responsible for recording the status information of each processor. Through its contact with the individual processors, the system management module maintains the status information of each processor: when a processor in the system fails, the system management module reacts immediately and updates the status record of that processor.

If the information in the system management module shows that the master processor for the service request is in a normal state, the service request is distributed to the master processor. If it shows that the master processor has failed, the service request is distributed to the spare processor for handling.
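The dispatch decision described above can be sketched as a table lookup followed by a status check. The names `SystemManager`, `mark_failed` and `dispatch` are hypothetical, and the table is a hand-written rendering of the Fig. 1 ring; the description does not specify any particular data structure or interface.

```python
# Statically configured service distribution table: load group -> (master, spare).
DISTRIBUTION_TABLE = {1: (1, 4), 2: (1, 2), 3: (2, 1), 4: (2, 3),
                      5: (3, 2), 6: (3, 4), 7: (4, 3), 8: (4, 1)}


class SystemManager:
    """Keeps a status record for each processor (True means normal)."""

    def __init__(self, processors):
        self.status = {p: True for p in processors}

    def mark_failed(self, processor):
        self.status[processor] = False

    def is_normal(self, processor):
        return self.status[processor]


def dispatch(load_group, table, manager):
    """Return the processor that should handle a request from load_group."""
    master, spare = table[load_group]
    if manager.is_normal(master):
        return master
    # The master has failed, so the request goes to the spare. The case in
    # which the spare has also failed is not covered by this sketch.
    return spare


manager = SystemManager(range(1, 5))
before_failure = dispatch(8, DISTRIBUTION_TABLE, manager)  # processor 4
manager.mark_failed(4)
after_failure = dispatch(8, DISTRIBUTION_TABLE, manager)   # processor 1
```

Keeping the table static and the status record separate mirrors the split between the service inquiry module and the system management module.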

Inside a processor, the handling of each service has a corresponding memory area, which records each intermediate step of the service processing and determines how the service proceeds. Each processor contains a synchronization trigger module that is triggered externally and that holds a synchronization program, and a fast synchronization channel exists between adjacent processors. Whenever a situation arises during service processing that needs to be saved and synchronized to the spare processor, the synchronization program is triggered and synchronizes the content of the memory area in the master processor for that service request to the spare processor. Thus, even if the master processor fails during service processing, when subsequent messages of the service are delivered to the spare processor, the content recorded for this service in the master processor's memory area can still be found there, and service processing continues from that recorded content. Continuous service processing is thereby achieved when processors are switched because of a failure.
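A minimal sketch of this synchronization, assuming an in-process copy stands in for the fast synchronization channel: the `Processor` class, its `handle` method, the `sync` flag and the service identifier `"call-1"` are all hypothetical and serve only to illustrate how a spare processor could continue from synchronized state.

```python
class Processor:
    """Holds one memory area per service, recording its intermediate steps."""

    def __init__(self, pid):
        self.pid = pid
        self.sessions = {}  # service id -> list of recorded intermediate steps

    def handle(self, service_id, message, spare=None, sync=False):
        """Record one processing step and optionally trigger synchronization."""
        self.sessions.setdefault(service_id, []).append(message)
        if sync and spare is not None:
            # Synchronization trigger: copy this service's memory area
            # to the spare processor.
            spare.sessions[service_id] = list(self.sessions[service_id])


master, spare = Processor(4), Processor(1)
master.handle("call-1", "setup", spare=spare, sync=True)
master.handle("call-1", "connected", spare=spare, sync=True)

# Suppose the master fails here. The follow-up message is delivered to the
# spare, which continues from the synchronized record.
spare.handle("call-1", "release")
```

After the switchover the spare processor's record for the service contains the two synchronized steps followed by the step it handled itself.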

The core idea of the present invention is the formation of a ring backup chain: by organizing a suitable correspondence between load groups and processors, backup and load control of a high-capacity distributed system are achieved.

Fig. 1 shows one typical form of ring backup chain constructed according to the inventive concept; the content of the present invention is not limited to it. Following the idea of the invention, more complex relationships between load groups and processors can be defined to construct other, more complex forms of backup chain, so as to achieve better backup and load control in a high-capacity distributed system. One such complex backup chain is shown in Fig. 5. A complex backup chain offers better performance and load control; particularly for systems with high or very high load capacity, and for systems that distribute processing over more processors, it further improves the utilization and stability of the whole system.

Beneficial effect

In the distributed processing system and method according to the present invention, each processor is responsible for handling the services of a plurality of load groups; for each load group, one of the processors bears its service processing as the master processor and another serves as the spare processor, which handles the services of the load group when the master processor fails. Therefore, when one processor node fails, the load groups handled on it are shared among its adjacent processors. That is, provided the load must not exceed full capacity when one processor fails, the load a processor can handle in normal operation is N/(N+1) × 100%, where N is the number of adjacent backup processors, which equals the number of load groups per processor. Thus, when N = 2 the load that can be handled is 66.7%, and when N = 3 it is 75%; the larger N is, the higher the load that can be handled in normal operation, which improves the utilization of system resources.
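The relationship between N and the usable load can be worked through numerically. The function name `normal_load_ratio` is hypothetical; the sketch assumes, as the passage does, that each load group carries an equal share and that a processor absorbs at most one extra load group when a neighbour fails.

```python
def normal_load_ratio(n):
    """Fraction of capacity usable in normal operation with n groups per processor.

    A processor normally carries n load groups and must leave room for one
    more when a neighbour fails, so it can run at n / (n + 1) of capacity.
    """
    return n / (n + 1)


ratios = {n: round(normal_load_ratio(n) * 100, 1) for n in (1, 2, 3)}
# n = 1 corresponds to the traditional pairwise case (50 percent);
# n = 2 gives 66.7 percent and n = 3 gives 75 percent.
```

The n = 1 entry is included only for comparison with the 50% ceiling of the active/standby and load sharing modes discussed in the background.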

In addition, if two adjacent processors fail at the same time, then with the distributed processing system and method of the present invention their load groups are handled by the respective adjacent processors, which reduces the number of load groups put out of service and improves system reliability.

Furthermore, because each processor has a synchronization trigger module, the content of the memory area corresponding to a service request in the master processor can be synchronized to the spare processor, which ensures continuity of service processing and improves system reliability.

Claims

1. A distributed processing system, comprising: a plurality of load groups for generating service requests; an interface processing module for receiving the service requests from each load group and distributing the service requests to the corresponding processors; and a plurality of said processors, wherein each processor is responsible for handling the services of a plurality of said load groups, and for each load group one of said plurality of processors bears its service processing as a master processor and another of said plurality of processors serves as a spare processor, the spare processor being responsible for handling the services of the load group when the master processor fails, characterized in that: said processor comprises a synchronization trigger module for synchronizing the information in the master processor to the spare processor when said master processor fails.

2. The distributed processing system as claimed in claim 1, wherein said plurality of processors comprises at least three processors.

3. The distributed processing system as claimed in claim 1 or 2, further comprising: a service inquiry module for providing mapping information between said service requests and said plurality of processors.

4. The distributed processing system as claimed in claim 1 or 2, wherein said interface further comprises: a system management module for providing status information of the processors.

5. The distributed processing system as claimed in claim 3, wherein said interface further comprises: a system management module for providing status information of the processors.

6. A processing method carried out in a distributed processing system, comprising the steps of: receiving service requests from a plurality of load groups; distributing said service requests to the corresponding processors among a plurality of processors for handling, wherein each processor can handle the services of a plurality of said load groups; and, when the processor acting as master processor for a service request among said plurality of processors fails, having the service request handled by another of said plurality of processors acting as spare processor for the service request, characterized in that: when said master processor fails, the information in the master processor is synchronized to the spare processor.

7. The processing method carried out in a distributed processing system as claimed in claim 6, wherein said plurality of processors comprises at least three processors.

8. The processing method carried out in a distributed processing system as claimed in claim 6 or 7, further comprising the step of: providing mapping information between said service requests and said plurality of processors.

9. The processing method carried out in a distributed processing system as claimed in claim 6 or 7, further comprising the step of: providing status information of the processors.

10. The processing method carried out in a distributed processing system as claimed in claim 8, further comprising the step of: providing status information of the processors.