CN1744554A - Expandable dynamic fault-tolerant method for cooperative system - Google Patents

Expandable dynamic fault-tolerant method for cooperative system

Info

Publication number
CN1744554A
Authority
CN
China
Prior art keywords
node
service
ring
load
token
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN 200510019586
Other languages
Chinese (zh)
Other versions
CN100341298C (en)
Inventor
金海
王玎
李胜利
袁平鹏
李昌清
孙盛
黎时才
邝坪
战治国
王辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology
Priority to CNB2005100195867A
Publication of CN1744554A
Application granted
Publication of CN100341298C
Expired - Fee Related
Anticipated expiration

Abstract

The method improves on the primary-backup approach. The primary service node receives and handles client requests. Backup nodes are allocated dynamically according to the redundancy of each task. The backup processes on the backup service nodes periodically communicate with the primary service process on the primary service node to stay synchronized with it. When the primary service process fails, one of the backup service processes is chosen as the new primary service process. Any service node can act as both a primary service node and a backup node, so that system resources are used as fully as possible. The invention reduces the time that backup processes spend in the "idle" state. Based on node capability and load state, the invention adjusts the service redundancy to improve service efficiency and load balance.

Description

Expandable dynamic fault-tolerant method for a cooperative system
Technical field
The invention belongs to the field of computer supported cooperative work (Computer Supported Cooperative Work, CSCW) and is a novel expandable dynamic fault-tolerant method.
Background technology
Computer supported cooperative work uses computer and network technology to let multiple users complete a task together through coordination and cooperation. With the rapid development of network technology, CSCW gives people who are dispersed in time and space a "face-to-face", "what you see is what I see" working environment, and supports cooperation among group members whose work is separated in time, distributed in space, and interdependent. Representative CSCW systems developed in China and abroad in the past several years include: the multimedia collaborative work system developed by the Stanford Research Institute (SRI) (The Collaborative Environment for Concurrent Engineering Design, CECED); the SHASTRA collaborative multimedia design system developed by Purdue University; the BERKOM multimedia collaborative server jointly developed by the IBM European Networking Center, DEC, and others; and other CSCW systems, such as the capital intervisibility video conference system and the 3C (CAD/CAPP/CAM) intelligent negotiation system. Cooperative systems are now widely used to support group collaboration. Because a cooperative work support system involves a group of users, any system failure prevents the group from continuing or causes results to be lost, which strongly affects the efficiency and outcome of the collaboration. Therefore, in designing a cooperative system, ensuring the high reliability and performance of the cooperative service nodes is a key factor in users' confidence in the system.
Achieving high reliability requires not only highly reliable hardware but also a good fault-tolerance mechanism. Practice has shown that fault-tolerant design is very effective in improving the reliability of computer application systems. Traditional fault-tolerant design generally uses service redundancy: the service process on a service node (called the primary service process) is replicated into several copies (called backup service processes) that run on different nodes. Depending on how the redundant services behave, the approaches divide into active replication and primary-backup. In active replication, the primary service process and all backup service processes receive and handle each client request at the same time, all of them return their results to the client, and the client chooses among the returned results. Failure of a service process is transparent to the client, but the communication overhead is high, and because system resources are limited, having every redundant service respond to every request lowers overall system performance. In primary-backup, the primary service process receives and handles client requests, and the backup processes periodically communicate with the primary service process to stay synchronized with it. After the primary service process fails, one of the backup service processes is chosen as the new primary service process. Compared with active replication, this greatly reduces communication overhead, but the backup processes sit idle while the primary process is working normally, which wastes system resources and leaves the system load unbalanced. As the number of cooperative tasks grows, the cooperative service node easily becomes a system bottleneck and performance drops. Both fault-tolerance approaches therefore have problems when applied in a cooperative environment. In addition, in a cooperative work support environment the number of cooperative tasks changes dynamically, which shows up as dynamic changes in service node performance, so the fault tolerance of the service nodes must be dynamically expandable. As the analysis above shows, neither active replication nor primary-backup adapts well to the dynamic nature of a cooperative system.
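The primary-backup behaviour described above can be illustrated with a minimal sketch. The class and method names (`Replica`, `PrimaryBackupGroup`, `handle`, `sync`, `fail_over`) are hypothetical and are not taken from the patent; the sketch only shows the three ingredients named in the text: one process handles requests, backups copy its state periodically, and a backup is promoted after a failure.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    name: str
    state: dict = field(default_factory=dict)
    alive: bool = True


class PrimaryBackupGroup:
    """Hypothetical model of the primary-backup approach."""

    def __init__(self, names):
        self.replicas = [Replica(n) for n in names]
        self.primary = self.replicas[0]

    def handle(self, key, value):
        # Only the primary handles a client request.
        if not self.primary.alive:
            self.fail_over()
        self.primary.state[key] = value
        return self.primary.name

    def sync(self):
        # Periodic synchronization: live backups copy the primary's state.
        for r in self.replicas:
            if r is not self.primary and r.alive:
                r.state = dict(self.primary.state)

    def fail_over(self):
        # Promote the first live replica to primary.
        for r in self.replicas:
            if r.alive:
                self.primary = r
                return r
        raise RuntimeError("no live replica left")
```

Between failures the backups do no request handling at all, which is the idle capacity the invention sets out to reclaim.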
Summary of the invention
The purpose of the present invention is to address the deficiencies of the prior art by providing an expandable dynamic fault-tolerant method for a cooperative system, which gives the system good fault tolerance and load balancing.
The expandable dynamic fault-tolerant method for a cooperative system provided by the invention comprises the following steps:
(1) When a service request arrives, the service manager allocates a service node for it as follows:
(1.1) Determine whether the maximum node number RN on the ring is zero. If RN = 0, build an r-element basic service ring, set the token number to 0, and go to step (1.2); otherwise go directly to step (1.2);
(1.2) Assign cooperative task t to the service node N(i) that holds the token, increase its load by 1, and add task t to the task set of the token-holding node;
(1.3) Determine whether the token number is greater than the maximum node number. If so, add the token-holding node to the service ring, set RN = RN + 1, and go to step (1.4); otherwise go directly to step (1.4);
(1.4) Determine whether the redundancy r is greater than the maximum node number RN. If so, add new nodes from the spare nodes to the service ring so that it expands into an r-element service ring, and go to step (1.5); otherwise go directly to step (1.5);
(1.5) Back up task t to the r-1 nodes on the service ring nearest to the service node holding the token;
(1.6) Determine whether the service ring contains a live node whose load is below its threshold. If not, pass the token to the next spare node outside the ring that is about to join the service ring, assign it a node number equal to the maximum node number plus 1, and go to step (2); if so, pass the token to the next live node on the ring whose load is below its threshold, and go to step (2);
(2) Service node N(i) executes the cooperative task. During execution, check whether the service ring contains a failed node N(i). If there is no failed node, go to step (3); otherwise the service manager rebuilds the service ring as follows:
(2.1) Determine whether the load of the failed node N(i) is zero. If so, go to step (2.4); otherwise go to step (2.2);
(2.2) Take a task t from the task set of the failed node N(i) and check whether the logical ring of task t contains a live node N(j) whose load is below the load threshold. If not, take a node from the spare nodes to replace node N(i) and go to step (3); otherwise go to step (2.3);
(2.3) Use node N(j) as the substitute for N(i): add task t to the task set of node N(j), decrease the load of node N(i) by 1, and go to step (2.1);
(2.4) Delete node N(i) and decrease by 1 the node number of every node numbered above i;
(2.5) Determine whether the load of the token-holding node equals its threshold. If not, go to step (3); otherwise go to step (2.6);
(2.6) Determine whether the service ring contains a live node whose load is below its threshold. If so, pass the token to the next node whose load is below its threshold, and go to step (3); otherwise pass the token to a spare node outside the ring, assign it a node number equal to the maximum node number plus 1, and go to step (3);
(3) Determine whether task t has finished. If not, go to step (2); otherwise perform the following steps:
(3.1) Delete task t from the task set of node N(i), delete the logical ring T_C corresponding to task t in node N(i), and decrease the load of node N(i) by 1;
(3.2) Determine whether the token number is greater than the maximum node number. If so, pass the token to node N(i) and end; otherwise end directly.
The r-element basic service ring in step (1.1) is built as follows: according to the service redundancy r of task t, take r service nodes from the spare nodes and number them 0, 1, ..., r-1, denoted N(0), N(1), ..., N(r-1); then assign node N(0) to the cooperative task as the primary service node, with the other nodes as backup nodes of N(0); finally connect the r nodes N(0), N(1), ..., N(r-1) into a ring, forming the r-element basic service ring {N(0), N(1), ..., N(r-1)}.
The service ring in step (1.4) is expanded as follows: first add r-RN new nodes, numbered N(r), N(r+1), ..., N(RN-1), then rebuild the service ring according to the basic service ring construction method.
The present invention is an improved structure based on the primary-backup approach. The primary service node receives and handles client requests, and backup nodes are allocated dynamically according to the redundancy of each task, unlike the primary-backup structure, which requires every task to have the same backup nodes. The backup processes on the backup service nodes periodically communicate with the primary service process on the primary service node to stay synchronized with it. After the primary service process fails, one of the backup service processes is chosen as the new primary service process. Any service node can act as both a primary service node and a backup node for system tasks, so that system resources are used as fully as possible. Compared with primary-backup, this greatly reduces the time backup processes spend in the "idle" state while the primary process is working normally, and uses system resources effectively. The present invention can dynamically set the load threshold of each service node, that is, the number of cooperative tasks it serves, according to node performance and the load of the cooperative system, and can change the service redundancy; this improves service efficiency and achieves load balance in a simple and effective way. Specifically, the present invention has the following main features:
(1) Dynamic: the invention uses system load information to define the load threshold of each service node dynamically. The threshold can be determined from the system load and from the number and performance of the service nodes the system can provide, and the service redundancy can be changed, which improves system reliability and the efficiency of the message service without occupying excessive system resources.
(2) Expandable: the state replication strategy of the service nodes is separated from the communication protocol and does not involve the underlying communication mechanism, so the method scales well.
(3) Fine-grained load balancing: the invention uses load-based scheduling, which can locate the most lightly loaded node accurately and achieves very good load balance. A service node serves cooperative tasks while also acting as a backup node, which avoids wasted resources and the system "bottleneck" caused by a single node providing service.
(4) Transparent to users: the fault-tolerance mechanism is fully transparent to users; faults are handled promptly, the system recovers quickly, and the overhead is small.
(5) Good cost-effectiveness: compared with dedicated high-availability servers, a service system using the present invention has better fault tolerance and greater computing power, and is economical and easy to deliver.
Description of drawings
Fig. 1 is a flow chart of the present invention;
Fig. 2 is a flow chart of service node allocation;
Fig. 3 is a structure diagram of the k-element service ring of the present invention;
Fig. 4 is a flow chart of service ring reconstruction after a node failure;
Fig. 5 is a flow chart of cooperative task deletion.
Embodiment
The present invention is explained in further detail below with reference to the accompanying drawings and an example.
As shown in Fig. 1, the present invention comprises the following steps:
(1) When a service request arrives, the service manager allocates a service node for it, and the service request becomes a cooperative task;
(2) Execute the cooperative task. During execution, periodically check whether the service ring contains a failed node. If it does, the service manager issues an instruction to rebuild the service ring. If it does not, go to step (3);
(3) Check whether the task has finished. If it has, the service node deletes the finished task and the method ends; otherwise return to step (2).
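The three steps above form a simple control loop, sketched below. The function `run_task` and its five callback parameters are hypothetical names introduced for illustration; the patent does not define such an interface.

```python
def run_task(task, allocate, failed_nodes, rebuild, is_done, delete):
    """Drive one cooperative task through steps (1) to (3).

    All five callbacks are assumptions of this sketch:
    allocate(task) returns the primary node number, failed_nodes()
    returns the node numbers found failed by the periodic check,
    rebuild(n) reconstructs the ring after N(n) fails, is_done(task)
    reports completion, and delete(task) removes the finished task.
    """
    primary = allocate(task)            # step (1)
    while True:
        for n in failed_nodes():        # step (2): periodic failure check
            rebuild(n)
        if is_done(task):               # step (3)
            delete(task)
            return primary
```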
For different service requests, the system can select a suitable load threshold for each node according to the node's performance and set the service redundancy r of the task. The load threshold of each service node, the number of cooperative tasks, and their redundancy r together determine the number of nodes in the service ring. The nodes on the ring serve cooperative tasks in parallel, which both improves message handling capacity and uses system resources effectively. When some nodes on the ring fail, the service ring structure ensures that live service nodes can take over the work of the failed service nodes at any time and rebuild the service ring automatically, providing cooperative tasks with a continuous and reliable message service so that service is not interrupted.
To explain the working principle of the service ring structure more clearly, the node whose node number equals the token number is called the token-holding node. The working principle and flow of the three steps above are described in detail below:
(1) Service node allocation:
Set the service redundancy r of the task according to the service request, then allocate a service node for it. As shown in Fig. 2, the steps are as follows:
(1.1) Determine whether the service ring exists, that is, whether the maximum node number RN on the ring is zero. If it is zero, the service ring does not exist: build an r-element basic service ring and set the token number to 0. If the service ring exists, go to step (1.2);
The r-element service ring is built as follows:
First, according to the service redundancy r of task t, take r service nodes from the spare nodes and number them 0, 1, ..., r-1, denoted N(0), N(1), ..., N(r-1); then assign node N(0) to the cooperative task as the primary service node, with the other nodes as backup nodes of N(0); finally connect the r nodes N(0), N(1), ..., N(r-1) into a ring. The r-element basic service ring formed by the r nodes {N(0), N(1), ..., N(r-1)} is connected as follows:
(a) When the number of service nodes is r (r is a positive integer, r ≥ 2) and the maximum service node number is RN = r-1, N(i) is connected to N(j) and N(l), where j = (i-1) mod r, l = (i+1) mod r, and j, l ≥ 0 are integers.
(b) In the r-element service ring, the distance between any two nodes N(i) and N(j) is d(i, j) = (i-j) mod r. The structure of the r-element service ring is shown in Fig. 3.
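Rules (a) and (b) translate directly into modular arithmetic. The following sketch uses the hypothetical function names `neighbors` and `distance`; it relies on the fact that Python's `%` operator returns a non-negative result for a positive modulus, which matches the requirement j, l ≥ 0.

```python
from __future__ import annotations


def neighbors(i: int, r: int) -> tuple[int, int]:
    """Return (j, l), the numbers of the two nodes connected to N(i)."""
    if r < 2:
        raise ValueError("a service ring needs at least 2 nodes")
    return (i - 1) % r, (i + 1) % r


def distance(i: int, j: int, r: int) -> int:
    """d(i, j) = (i - j) mod r, as stated in rule (b)."""
    return (i - j) % r
```

For a 4-element ring, `neighbors(0, 4)` gives `(3, 1)`, so N(0) is connected to N(3) and N(1). Note that d as defined is directional: `distance(0, 3, 4)` is 1 while `distance(3, 0, 4)` is 3.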
(1.2) Assign cooperative task t to the service node that holds the token, increase its load by 1, and add task t to the task set of the token-holding node;
(1.3) Determine whether the token number is greater than the maximum node number. If so, the allocated node is outside the service ring: add the token-holding node to the service ring and increase the maximum node number by 1;
(1.4) Determine whether the redundancy r is greater than the maximum node number RN. If so, the number of nodes on the service ring does not meet the redundancy requirement of task t, and new nodes must be added from the spare nodes so that the ring expands into an r-element service ring. The service ring is expanded as follows:
First add r-RN new nodes, numbered N(r), N(r+1), ..., N(RN-1), then rebuild the service ring according to the ring construction method above;
(1.5) Back up task t to the r-1 nodes on the service ring nearest to the service node holding the token:
If the service ring is a k-element service ring, then when a task t with redundancy r is assigned to primary service node N(i), the r-1 nodes on the service ring nearest to the primary service node are chosen as the backup nodes of task t, following the nearest-first principle. The backup node numbers are (i-m) mod k and (i+n) mod k, where:
[Formula image A20051001958600102: ranges of m and n]
After the backup is complete, the token is passed to the next node whose load is below its threshold;
(1.6) Determine whether the service ring contains a live node whose load is below its threshold. If not, pass the token to the next spare node outside the ring that is about to join the service ring, and assign it a node number equal to the maximum node number plus 1; if so, pass the token to the next live node on the ring whose load is below its threshold.
After the service node has been allocated, the logical ring T_C of task t is the set consisting of its primary service node and its backup service nodes. The system uses token passing to assign cooperative tasks to lightly loaded service nodes on the service ring, which keeps the load on the ring's nodes balanced. The service ring also adapts to dynamic changes in the number and redundancy of cooperative tasks, so it scales well.
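Steps (1.1) to (1.6) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the names `Node`, `ServiceRing`, `backup_nodes`, `allocate`, `_join` and `_pass_token` are hypothetical; the ring size `len(self.ring)` stands in for the RN bookkeeping; a spare that holds the token is joined to the ring just before the task is assigned, so that it can be indexed like any other node; and because the exact ranges of m and n are given only in a formula image, ties between (i-m) and (i+n) are broken here by taking the lower-numbered side first.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    threshold: int                      # load threshold set from machine performance
    load: int = 0                       # tasks served as primary node
    tasks: set = field(default_factory=set)


def backup_nodes(i, r, k):
    """Numbers of the r-1 nodes nearest to N(i) on a k-element ring."""
    if r > k:
        raise ValueError("ring too small for redundancy r")
    picked, step = [], 1
    while len(picked) < r - 1:
        # Assumed tie-break: (i - step) is taken before (i + step).
        for cand in ((i - step) % k, (i + step) % k):
            if cand != i and cand not in picked and len(picked) < r - 1:
                picked.append(cand)
        step += 1
    return picked


class ServiceRing:
    def __init__(self, spares):
        self.spares = list(spares)      # spare nodes outside the ring
        self.ring = []                  # ring[n] is node N(n)
        self.token = 0                  # number of the token-holding node
        self.logical_ring = {}          # task -> [primary, backups...]

    def _join(self, count):
        for _ in range(count):
            self.ring.append(self.spares.pop(0))

    def allocate(self, task, r):
        if not self.ring:                               # (1.1) build basic ring
            self._join(r)
            self.token = 0
        if self.token >= len(self.ring):                # (1.3) spare holds token
            self._join(1)
        holder = self.ring[self.token]                  # (1.2)
        holder.load += 1
        holder.tasks.add(task)
        if r > len(self.ring):                          # (1.4) expand the ring
            self._join(r - len(self.ring))
        primary = self.token
        self.logical_ring[task] = [primary] + backup_nodes(
            primary, r, len(self.ring))                 # (1.5)
        self._pass_token()                              # (1.6)
        return primary

    def _pass_token(self):
        k = len(self.ring)
        for off in range(1, k + 1):
            n = (self.token + off) % k
            if self.ring[n].load < self.ring[n].threshold:
                self.token = n
                return
        self.token = k      # every node is full: the next spare gets number k
```

Only the primary node's load is increased, as in step (1.2); holding a backup copy does not count towards the threshold in this sketch.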
(2) Service ring reconstruction:
After service node N(i) fails, first find substitute service nodes for the tasks on N(i) and rebuild the service ring; then check whether the token-holding node still satisfies the condition that its load is below the threshold. If it does not, pass the token to the next live node whose load is below its threshold.
The service ring reconstruction NodeFailure(N(i)) after node N(i) fails is shown in Fig. 4. The steps are as follows:
(2.1) Determine whether the load of the failed node N(i) is zero. If so, go to step (2.4); otherwise continue with the next step;
(2.2) Take a task t from the task set of node N(i) and check whether the logical ring of task t contains a live node N(j) whose load is below the load threshold. If not, take a node from the spare nodes to replace node N(i) and end; otherwise continue with the next step;
(2.3) Use node N(j) as the substitute for N(i): add task t to the task set of node N(j), decrease the load of node N(i) by 1, and go to step (2.1);
(2.4) Delete node N(i) and decrease by 1 the node number of every node numbered above i;
(2.5) Determine whether the load of the token-holding node equals its threshold. If not, end; otherwise continue with the next step;
(2.6) Determine whether the service ring contains a live node whose load is below its threshold. If so, pass the token to the next node whose load is below its threshold; otherwise every node on the ring has reached its maximum load, so pass the token to a spare node outside the ring and assign it a node number equal to the maximum node number plus 1.
Suppose node N(i) on the service ring fails. A substitute node is then sought for each cooperative task on N(i), that is, for each task in the cooperative task set of N(i). For a task t in the task set with logical ring T_C, after the primary service node of t fails, the live node in T_C that is nearest to the failed node N(i) and whose load is below the load threshold is sought as its substitute. If such a node exists, the failed node is deleted and the service ring is rebuilt; otherwise a node is taken from the spare nodes as the substitute for node N(i) and the service ring is rebuilt.
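A sketch of NodeFailure(N(i)) following steps (2.1) to (2.6) is given below. Several details are assumptions of this sketch rather than statements of the source: load is modelled as the number of tasks in a node's task list, so moving a task raises the substitute's load; a substituting spare inherits number i and the tasks still on N(i); if the failed node held the token, its successor inherits it; the check in step (2.5) is written as "load below threshold"; and the ring is assumed to keep at least one node after the deletion. The names `Node` and `node_failure` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    threshold: int
    tasks: list = field(default_factory=list)   # tasks served as primary
    alive: bool = True

    @property
    def load(self) -> int:
        return len(self.tasks)


def node_failure(ring, spares, logical_ring, token, i):
    """Rebuild the ring after N(i) fails and return the new token number."""
    failed = ring[i]
    failed.alive = False
    k = len(ring)
    while failed.load != 0:                                    # (2.1)
        t = failed.tasks[0]                                    # (2.2)
        candidates = [j for j in logical_ring[t]
                      if j != i and ring[j].alive
                      and ring[j].load < ring[j].threshold]
        if not candidates:
            spare = spares.pop(0)       # a spare takes over number i
            spare.tasks = failed.tasks  # and the tasks still on N(i)
            ring[i] = spare
            return token
        # Nearest candidate to N(i), counted either way round the ring.
        j = min(candidates, key=lambda n: min((n - i) % k, (i - n) % k))
        ring[j].tasks.append(t)                                # (2.3)
        failed.tasks.remove(t)
        logical_ring[t] = [j] + [n for n in logical_ring[t] if n not in (i, j)]
    del ring[i]                                                # (2.4)
    for t in list(logical_ring):
        logical_ring[t] = [n - 1 if n > i else n
                           for n in logical_ring[t] if n != i]
    if token > i:
        token -= 1                      # the holder was renumbered
    elif token == i:
        token = i % len(ring)           # assumption: successor inherits token
    if token >= len(ring):              # token waits with a spare outside ring
        return token
    if ring[token].load < ring[token].threshold:               # (2.5)
        return token
    for off in range(1, len(ring) + 1):                        # (2.6)
        n = (token + off) % len(ring)
        if ring[n].load < ring[n].threshold:
            return n
    return len(ring)                    # all nodes full: next spare's number
```

Renumbering in step (2.4) also has to be applied to every logical ring and to the token, which the source leaves implicit; the sketch does both in one pass after the deletion.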
(3) Cooperative task deletion:
The load of a service node is the number of cooperative tasks on the node, and changes dynamically as cooperative tasks are created and deleted. The tasks on a node reflect the current state of cooperative tasks on that service node, so when a task t with redundancy r on node N(i) ends, exits, or is deleted, the load on the node changes accordingly.
The deletion TaskDeleting(t) of a cooperative task t with redundancy r on node N(i) is shown in the flow chart of Fig. 5. The steps are as follows:
(3.1) Determine whether task t on node N(i) is to be deleted. If so, delete task t from the task set of node N(i); otherwise end;
(3.2) Delete the logical ring T_C corresponding to task t in node N(i);
(3.3) Decrease the load of node N(i) by 1;
(3.4) Determine whether the token number is greater than the maximum node number. If so, the token-holding node is a node outside the service ring, so pass the token to node N(i).
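The deletion steps can be sketched as a single function. The name `task_deleting` and the list-based representation (`tasks[n]` and `loads[n]` for node N(n)) are assumptions of this sketch, and step (3.1) is read here as "task t is present on N(i)".

```python
def task_deleting(tasks, loads, logical_ring, token, i, t):
    """Delete task t from N(i) and return the (possibly moved) token number."""
    if t not in tasks[i]:                 # (3.1) nothing to delete
        return token
    tasks[i].remove(t)
    logical_ring.pop(t, None)             # (3.2) drop the logical ring T_C of t
    loads[i] -= 1                         # (3.3)
    max_node_number = len(tasks) - 1
    if token > max_node_number:           # (3.4) token is with a spare outside the ring
        token = i
    return token
```

One way to read step (3.4): once N(i) has a free slot again, there is no need to bring a spare into the ring, so the token returns to N(i).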
Example
Using the fault-tolerance method of the present invention, 10 physical servers were set up in a laboratory; every node can provide service node allocation, cooperative task deletion, service ring reconstruction, and related services. The hardware configuration of the 10 physical servers and the load thresholds set according to machine performance are as follows:
Machine name   CPU          Memory   Hard disk   Load threshold
Server 1-2     PIII 550M    256M     10.2G       2
Server 3-5     PIIII 1.4G   256M     40G         5
Server 6-10    PIIII 1.8G   512M     60G         8
When the first cooperative task t1, with redundancy 4, is created, the system builds the basic service ring: 4 service nodes are taken from the 10 spare machines and denoted N(0), N(1), N(2), N(3). They are connected into a service ring as follows: node N(0) is connected to nodes N(3) and N(1); node N(1) is connected to nodes N(0) and N(2); node N(2) is connected to nodes N(1) and N(3); node N(3) is connected to nodes N(2) and N(0). N(0) is the primary service node of task t1, and N(1), N(2), and N(3) are the backup service nodes of t1. t1 becomes a task on node N(0), with logical ring T_C = {N(0), N(1), N(2), N(3)}. Building the basic ring took 5 milliseconds.
When the second cooperative task t2, with redundancy 7, is created, the service ring has only 4 nodes and does not meet the redundancy requirement of task t2, so 3 more nodes are taken from the spare nodes and added to the service ring, denoted N(4), N(5), N(6). The service ring expands from a 4-element service ring to a 7-element service ring. Node N(1) is then assigned as the primary service node of task t2, with backup nodes N(0), N(2), N(3), N(4), N(5), N(6). t2 becomes a task on node N(1), with logical ring T_C = {N(1), N(0), N(2), N(3), N(4), N(5), N(6)}.
When the third cooperative task t3, with redundancy 3, is created, node N(2) is assigned as the primary service node of task t3, with backup nodes N(1) and N(3). t3 becomes a task on node N(2), with logical ring T_C = {N(1), N(2), N(3)}.
In the same way, the system creates cooperative tasks t4, t5, ... in turn, assigning a lightly loaded node on the ring as the primary service node of each and determining the task's logical ring. When the load of every node on the ring has reached its load threshold, a new node is added to the service ring to maintain service performance.
Repeated tests show that, with the expandable service ring of the fault-tolerance method of the present invention, the service ring continues to work normally after a node fails: the tasks on the failed node are reassigned to live nodes and the service ring is rebuilt, so service requests in progress are not affected.

Claims (3)

1. An expandable dynamic fault-tolerant method for a cooperative system, comprising the following steps:
(1) When a service request arrives, the service manager allocates a service node for it as follows:
(1.1) Determine whether the maximum node number RN on the ring is zero. If RN = 0, build an r-element basic service ring, set the token number to 0, and go to step (1.2); otherwise go directly to step (1.2);
(1.2) Assign cooperative task t to the service node N(i) that holds the token, increase its load by 1, and add task t to the task set of the token-holding node;
(1.3) Determine whether the token number is greater than the maximum node number. If so, add the token-holding node to the service ring, set RN = RN + 1, and go to step (1.4); otherwise go directly to step (1.4);
(1.4) Determine whether the redundancy r is greater than the maximum node number RN. If so, add new nodes from the spare nodes to the service ring so that it expands into an r-element service ring, and go to step (1.5); otherwise go directly to step (1.5);
(1.5) Back up task t to the r-1 nodes on the service ring nearest to the service node holding the token;
(1.6) Determine whether the service ring contains a live node whose load is below its threshold. If not, pass the token to the next spare node outside the ring that is about to join the service ring, assign it a node number equal to the maximum node number plus 1, and go to step (2); if so, pass the token to the next live node on the ring whose load is below its threshold, and go to step (2);
(2) Service node N(i) executes the cooperative task. During execution, check whether the service ring contains a failed node N(i). If there is no failed node, go to step (3); otherwise the service manager rebuilds the service ring as follows:
(2.1) Determine whether the load of the failed node N(i) is zero. If so, go to step (2.4); otherwise go to step (2.2);
(2.2) Take a task t from the task set of the failed node N(i) and check whether the logical ring of task t contains a live node N(j) whose load is below the load threshold. If not, take a node from the spare nodes to replace node N(i) and go to step (3); otherwise go to step (2.3);
(2.3) Use node N(j) as the substitute for N(i): add task t to the task set of node N(j), decrease the load of node N(i) by 1, and go to step (2.1);
(2.4) Delete node N(i) and decrease by 1 the node number of every node numbered above i;
(2.5) Determine whether the load of the token-holding node equals its threshold. If not, go to step (3); otherwise go to step (2.6);
(2.6) Determine whether the service ring contains a live node whose load is below its threshold. If so, pass the token to the next node whose load is below its threshold, and go to step (3); otherwise pass the token to a spare node outside the ring, assign it a node number equal to the maximum node number plus 1, and go to step (3);
(3) Determine whether task t has finished. If not, go to step (2); otherwise perform the following steps:
(3.1) Delete task t from the task set of node N(i), delete the logical ring T_C corresponding to task t in node N(i), and decrease the load of node N(i) by 1;
(3.2) Determine whether the token number is greater than the maximum node number. If so, pass the token to node N(i) and end; otherwise end directly.
2. The method according to claim 1, characterized in that the r-element basic service ring in step (1.1) is built as follows: according to the service redundancy r of task t, take r service nodes from the spare nodes and number them 0, 1, ..., r-1, denoted N(0), N(1), ..., N(r-1); then assign node N(0) to the cooperative task as the primary service node, with the other nodes as backup nodes of N(0); finally connect the r nodes N(0), N(1), ..., N(r-1) into a ring, forming the r-element basic service ring {N(0), N(1), ..., N(r-1)}.
3. The method according to claim 2, characterized in that the service ring in step (1.4) is expanded as follows: first add r-RN new nodes, numbered N(r), N(r+1), ..., N(RN-1), then rebuild the service ring according to the basic service ring construction method.
CNB2005100195867A 2005-10-13 2005-10-13 Expandable dynamic fault-tolerant method for cooperative system Expired - Fee Related CN100341298C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2005100195867A CN100341298C (en) 2005-10-13 2005-10-13 Expandable dynamic fault-tolerant method for cooperative system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNB2005100195867A CN100341298C (en) 2005-10-13 2005-10-13 Expandable dynamic fault-tolerant method for cooperative system

Publications (2)

Publication Number Publication Date
CN1744554A true CN1744554A (en) 2006-03-08
CN100341298C CN100341298C (en) 2007-10-03

Family

ID=36139755

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2005100195867A Expired - Fee Related CN100341298C (en) 2005-10-13 2005-10-13 Expandable dynamic fault-tolerant method for cooperative system

Country Status (1)

Country Link
CN (1) CN100341298C (en)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH09269931A (en) * 1996-01-30 1997-10-14 Canon Inc Cooperative work environment constructing system, its method and medium
US6981019B1 (en) * 2000-05-02 2005-12-27 International Business Machines Corporation System and method for a computer based cooperative work system
CN1308278A (en) * 2001-02-15 2001-08-15 华中科技大学 IP fault-tolerant method for colony server

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101227530B (en) * 2008-02-03 2011-04-06 中兴通讯股份有限公司 Method and apparatus of resource backup
CN102984184B (en) * 2011-09-05 2017-09-19 上海可鲁系统软件有限公司 The service load balancing method and device of a kind of distributed system
CN102833103A (en) * 2012-08-24 2012-12-19 上海创件信息科技有限公司 Error detecting and processing method relate to electronic map collaboration mark
CN106663030A (en) * 2014-08-13 2017-05-10 微软技术许可有限责任公司 Scalable fault resilient communications within distributed clusters
US11290524B2 (en) 2014-08-13 2022-03-29 Microsoft Technology Licensing, Llc Scalable fault resilient communications within distributed clusters
CN105721545A (en) * 2016-01-20 2016-06-29 浪潮(北京)电子信息产业有限公司 Multi-level cluster management realization method
CN105721545B (en) * 2016-01-20 2019-01-22 浪潮(北京)电子信息产业有限公司 A kind of multi-level cluster management implementation method
CN116991562A (en) * 2023-09-28 2023-11-03 宁波银行股份有限公司 Data processing method and device, electronic equipment and storage medium
CN116991562B (en) * 2023-09-28 2023-12-26 宁波银行股份有限公司 Data processing method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN100341298C (en) 2007-10-03

Similar Documents

Publication Publication Date Title
CN100341298C (en) Expandable dynamic fault-tolerant method for cooperative system
WO2015096656A1 (en) Thread creation method, service request processing method and related device
CN104102548B (en) task resource scheduling processing method and system
CN102833289B (en) A kind of distributed cloud computing resources tissue and method for allocating tasks
CN103023805A (en) MapReduce system
CN1835453A (en) Method of realizing load sharing in distributing system
CN102622275A (en) Load balancing realization method in cloud computing environment
CN1710865A (en) Method for raising reliability of software system based on strucural member
CN1251103C (en) Method for improving serviceability of business machine group
CN103473848B (en) Network invoice checking framework and method based on high concurrency
CN106354574A (en) Acceleration system and method used for big data K-Mean clustering algorithm
CN108282526B (en) Dynamic allocation method and system for servers between double clusters
CN102682117A (en) Method for quickly copying cluster data in database
CN1455347A (en) Distributed parallel scheduling wide band network server system
CN111858033A (en) Load balancing method based on cluster and multiple processes
CN114389955B (en) Method for managing heterogeneous resource pool of embedded platform
CN100357930C (en) Large scale data parallel computing main system and method under network environment
CN110427270A (en) The dynamic load balancing method of distributed connection operator under a kind of network towards RDMA
CN1315045C (en) A method for implementing centralized concurrent management to cluster
CN103401951B (en) Based on the elastic cloud distribution method of peer-to-peer architecture
Bendjoudi et al. An adaptive hierarchical master–worker (AHMW) framework for grids—Application to B&B algorithms
US7774311B2 (en) Method and apparatus of distributing data in partioned databases operating on a shared-nothing architecture
CN108616398A (en) A kind of container dynamic capacity reduction method based on DNS load-balancing techniques
Reddy et al. A hierarchical load balancing algorithm for efficient job scheduling in a computational grid testbed
CN103647712B (en) The method and system of distributed route processing business

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20071003

Termination date: 20101013