GB2485866A - Replication of state data of processing tasks from an active entity to a backup entity

Publication number
GB2485866A
Authority
GB
United Kingdom
Prior art keywords
entity
processing
state data
replication
backup
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
GB1114826.9A
Other versions
GB201114826D0 (en
GB2485866B (en
Inventor
Richard Sugarman
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Metaswitch Networks Ltd
Original Assignee
Metaswitch Networks Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Metaswitch Networks Ltd filed Critical Metaswitch Networks Ltd
Priority to GB1114826.9A priority Critical patent/GB2485866B/en
Publication of GB201114826D0 publication Critical patent/GB201114826D0/en
Publication of GB2485866A publication Critical patent/GB2485866A/en
Application granted granted Critical
Publication of GB2485866B publication Critical patent/GB2485866B/en
Application status is Active legal-status Critical
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16Error detection or correction of the data by redundancy in hardware
    • G06F11/20Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/202Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
    • G06F11/2038Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant with a single idle spare processing component
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16Error detection or correction of the data by redundancy in hardware
    • G06F11/1658Data re-synchronization of a redundant component, or initial sync of replacement, additional or spare unit
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16Error detection or correction of the data by redundancy in hardware
    • G06F11/20Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/202Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16Error detection or correction of the data by redundancy in hardware
    • G06F11/20Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/202Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
    • G06F11/2048Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant where the redundant components share neither address space nor persistent storage
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16Error detection or correction of the data by redundancy in hardware
    • G06F11/20Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/2097Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements maintaining the standby controller/processing unit updated
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/34Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3409Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/81Threshold

Abstract

An active entity executes a number of processing tasks. Each task persists for a finite time period. A backup entity is configured to take over processing from the active entity if the active entity fails. When a trigger indicates that the state of the tasks on the active entity should be replicated on the backup entity, any new processing tasks are replicated between the entities. Tasks that are already executing on the active entity are not replicated until after a delay period. Then any of those tasks that are still executing are replicated. The delay may be dependent on the average duration of the tasks or on the resource usage on the entities. The delay period may change and may be different for different tasks. The tasks may be handling telephone calls and may be SIP processes.

Description

DATA REPLICATION FOR A BACKUP ENTITY

Technical Field

The present invention relates to data replication in a system that comprises an active entity and a backup entity, the backup entity being configured to take over processing activity from the active entity on failure of the active entity. It is suited to implementation in, but not necessarily limited to, telephony systems wherein state data associated with a processing task persists for a finite time period.

Background

A telephony device that holds state information for Voice over Internet Protocol (VoIP) phone calls, for example using the Session Initiation Protocol (SIP), will often be a highly available device. Highly available devices are arranged so that a single failure, whether in hardware and/or software, does not result in a device losing state information for any calls that it is handling. High availability is commonly achieved by using a pair of redundant processing entities: a primary entity handles call processing, and replicates state information for each call to a backup entity that can take over if the primary entity fails.

In normal operation, the primary entity is designed to handle processing tasks as an active entity. During this time, the backup entity keeps a copy of the state information associated with the processing tasks on the active entity. On failure of the active entity, the backup entity takes over the processing tasks handled by the failed entity and becomes an active backup entity. Typically this is performed in such a way that users of the system are unaware of the failure. Following failure (referred to as a "failover" in the art), the primary entity is recovered, e.g. restarted, reset or replaced. Following recovery, the primary entity becomes a standby primary entity while the active backup entity continues to handle the transferred processing tasks.

The standby primary entity waits for a switch-over from the active backup entity so the system can return to normal operation. During its time as a standby entity the primary entity may also back up the active backup entity in case the latter fails prior to any switch-over. After switch-over, normal operation returns and the entities return to their normal active and backup roles.

US2002/0073409 A1 describes a telecommunications platform that comprises a cluster of processors which perform a central processing function. A background replication mode may be used that immediately stores state data in a memory accessible by a first processor of the cluster, followed by delayed storing of the state data in a memory accessible by a second processor of the cluster.

A telephony device might support many tens of thousands, or even hundreds of thousands, of active VoIP calls. This means that the process of replicating state information between an active and backup entity can both take a long time and consume a significant proportion of the processing resources of the active entity. A telephony device therefore may not be able to meet designated call-handling volumes (its published call-rate). For example, call rates of around 500 calls per second are not uncommon. The process of data replication can take many minutes. It is not acceptable to have such a long period of time where the telephony device cannot perform at its rated capacity.

There is thus a problem of maintaining designated processing volumes while at the same time maintaining high availability.

Summary

In accordance with one aspect of the present invention, there is provided a method for replicating state data between an active entity and a backup entity, the backup entity configured to take over processing activity from the active entity on failure of the active entity, the processing activity comprising one or more processing tasks, wherein state data associated with a processing task persists for a finite time period, the method comprising: replicating state data for new processing tasks assigned to the active entity from a trigger point, the state data being replicated between the active entity and the backup entity; delaying, for a delay period, replication of state data for processing tasks assigned to the active entity before the trigger point; and after the delay period, replicating, between the active entity and the backup entity, state data for said processing tasks assigned to the active entity before the trigger point.

By waiting for a period to expire, in effect delaying a "catch-up" replication process, the processing demands on an active entity can be reduced, enabling it to maintain designated processing volumes while also maintaining high availability. In certain embodiments, state replication for new processes and existing processes is effectively separated, spreading the extra load on an active entity caused by replication over a longer period of time. The trigger point may be, amongst others, the activation of a new server, the insertion of a new processing card or the activation of a new backup instance.
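The method aspect above can be illustrated with a minimal in-memory sketch. It is not taken from the specification: the class `DelayedReplicator`, its method names and the use of plain dictionaries to stand in for the active and backup entities are all hypothetical, and time is passed in explicitly so that the behaviour is easy to follow.

```python
from dataclasses import dataclass, field


@dataclass
class DelayedReplicator:
    """Toy model of delayed catch-up replication (all names are hypothetical)."""

    delay: float                                 # delay period, in seconds
    trigger_time: float = 0.0                    # time of the trigger point
    active: dict = field(default_factory=dict)   # task id -> state on the active entity
    backup: dict = field(default_factory=dict)   # task id -> state held by the backup entity
    pending: set = field(default_factory=set)    # pre-trigger tasks awaiting catch-up

    def trigger(self, now):
        """Record the trigger point; existing tasks are deferred, not copied."""
        self.trigger_time = now
        self.pending = set(self.active)

    def assign(self, task_id, state):
        """A task assigned from the trigger point onwards is replicated at once."""
        self.active[task_id] = state
        self.backup[task_id] = state

    def finish(self, task_id):
        """State data persists only for the lifetime of the task."""
        self.active.pop(task_id, None)
        self.backup.pop(task_id, None)
        self.pending.discard(task_id)

    def tick(self, now):
        """Once the delay period has elapsed, replicate the surviving old tasks."""
        if now - self.trigger_time < self.delay:
            return 0
        copied = 0
        for task_id in sorted(self.pending):
            self.backup[task_id] = self.active[task_id]
            copied += 1
        self.pending.clear()
        return copied


rep = DelayedReplicator(delay=180.0)
rep.active.update({"call-1": "s1", "call-2": "s2", "call-3": "s3"})  # pre-trigger calls
rep.trigger(now=0.0)
rep.assign("call-4", "s4")        # new call: replicated immediately
rep.finish("call-1")              # two of the old calls end during the delay period
rep.finish("call-2")
copied_early = rep.tick(now=60.0)    # delay period not yet elapsed
copied_late = rep.tick(now=180.0)    # only the surviving old call is copied
```

In this run only one of the three pre-trigger calls is ever copied, which is the saving the delay period is intended to produce.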

In some embodiments, the step of replicating state data comprises replicating state data for a processing task following assignment of said processing task, i.e. dynamically. For example, state data may be replicated immediately or near-immediately following the assignment of a processing task. In this way, as new processing tasks are typically assigned over a time period, rather than all at once, the processing load on the active entity can be more efficiently managed. Spikes in replication activity that may detrimentally affect the processing of the processing tasks themselves may also be avoided.

In some embodiments, the delay period is configurable. It may be calculated based on one or more processing-activity metrics. This enables the delay period to be tailored for a particular system based on the processing activity, i.e. the period is dependent in some manner on a characteristic of the processing activity. In certain embodiments, the delay period may be calculated dynamically, e.g. at regular intervals based on changing system conditions. This, in effect, only delays the "catch-up" replication process by a significant time period when the conditions require it. In some embodiments, the delay period need not be a time period, but could be a function of one or more processing-activity metrics, for example a certain number of processor cycles.

In some embodiments, one of the processing-activity metrics comprises an average duration of a processing task. For example, in a VoIP or Signalling System #7 (SS7) processing system this may be the average duration of a telephone call. This has the advantage that, by the time the "catch-up" replication process is initiated, many of the calls whose state data was to be replicated would have completed. As, in some embodiments, the state data only persists for the length of the call, the quantity of state data that needs to be replicated after the delay period may be greatly reduced.

This not only reduces the amount of data that need be transferred between the active and backup entities but also reduces the amount of processing that needs to be performed by the active entity, for example in packaging and sending the state data, and the backup entity, for example in receiving and un-packaging the state data. A reduction in the amount of processing that needs to be performed by the active entity again helps the active entity maintain designated call-rates, or even manage higher call-volumes, enhancing scalability. For example, by freeing up processing resources the active entity is able to handle a larger number of calls, making it easy to scale a system comprising the active entity to handle more calls.

In some embodiments, one of the processing-activity metrics comprises a measurement of resource usage on at least one of the active entity and the backup entity. For example, resource usage may comprise the central processing unit (CPU) usage of one or more processors or processing cores, or a percentage of system memory committed to current processing activity. This has the advantage of making the system adaptable. Under heavy loads the delay period may be longer, enabling an active entity to process backup state data at a later time when some of said data may no longer be required. Under light loads the delay may be shorter, or even zero, to reduce the risk of data loss if failure of the active entity occurs within the delay period, when existing calls have not yet been replicated.
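One way such a load-dependent delay period might be derived is sketched below. The specification does not fix a formula; the linear ramp, the function name `delay_from_cpu` and the default values (`max_delay` of 180 seconds, `idle_below` of 0.3) are assumptions chosen purely for illustration.

```python
def delay_from_cpu(cpu_usage, max_delay=180.0, idle_below=0.3):
    """Map CPU usage in [0, 1] to a delay period in seconds.

    Assumed policy (not from the specification): zero delay while usage is
    at or below `idle_below`, then a linear ramp up to `max_delay` at full load.
    """
    if not 0.0 <= cpu_usage <= 1.0:
        raise ValueError("cpu_usage must be between 0 and 1")
    if cpu_usage <= idle_below:
        return 0.0
    fraction = (cpu_usage - idle_below) / (1.0 - idle_below)
    return max_delay * fraction
```

A lightly loaded entity therefore performs catch-up replication immediately, while a fully loaded one defers it for the whole of `max_delay`.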

In some embodiments, the delay period is initially zero and the method comprises setting the delay period to a non-zero value when at least one of the one or more processing-activity metrics meets one or more predetermined conditions. For example, there may be no delayed "catch-up" replication if loading is below a certain threshold, and there may be delayed "catch-up" replication once loading rises above a certain threshold. The delay period value may also be returned to a zero (or lower) value when one or more processing-activity metrics no longer meet one or more predetermined conditions. In other embodiments, the predetermined condition may reflect factors such as the time of day: peak call periods may require delayed "catch-up" replication whereas night periods may omit the delayed "catch-up" replication.

This again increases the flexibility of a high availability system.
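The threshold and time-of-day conditions described above might be combined as in the following sketch. The function name `select_delay`, the threshold of 0.8, the peak period of 08:00 to 20:00 and the 180 second delay are illustrative assumptions rather than values given in the specification.

```python
def select_delay(load, hour, load_threshold=0.8,
                 peak_hours=range(8, 20), busy_delay=180.0):
    """Return the delay period in seconds for the current conditions.

    The delay is zero by default and becomes non-zero only while a
    predetermined condition holds: high load, or a peak hour of the day.
    All default values are hypothetical.
    """
    if load >= load_threshold or hour in peak_hours:
        return busy_delay
    return 0.0
```

Re-evaluating the function at regular intervals returns the delay to zero as soon as neither condition holds.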

In some embodiments, the method comprises determining whether a processing task belongs to a first group or a second group, and setting the delay period to a first value for the first group and a second value for the second group, wherein replicating state data comprises replicating state data for processing tasks in the first group and replicating state data for processing tasks in the second group. This enables delayed "catch-up" replication to be applied to a subset of calls. For example, it may not be suitable to perform delayed "catch-up" replication on emergency calls as these may require a high level of fault tolerance. The first group may thus comprise emergency calls with a low or zero configurable time (i.e. delay) period value and the second group may comprise non-emergency calls with a higher configurable time (i.e. delay) period value. There may be more than two groups, for example groups for different subscriber types.
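A per-group delay period could be applied as sketched below. The group names, the `emergency` flag on each task record and the delay values are hypothetical; the sketch only shows that each existing task receives a catch-up time derived from the delay period of its group.

```python
# Hypothetical delay periods, in seconds, for each group of processing tasks.
GROUP_DELAYS = {"emergency": 0.0, "standard": 180.0}


def classify(task):
    """Assign a task to a group; the 'emergency' flag is an assumed field."""
    return "emergency" if task.get("emergency") else "standard"


def schedule_catch_up(tasks, trigger_time, delays=GROUP_DELAYS):
    """Return (due_time, task_id) pairs for existing tasks, earliest first."""
    schedule = [(trigger_time + delays[classify(task)], task["id"])
                for task in tasks]
    return sorted(schedule)


existing = [{"id": "a"}, {"id": "b", "emergency": True}]
plan = schedule_catch_up(existing, trigger_time=100.0)
# Emergency call "b" is due at the trigger point; call "a" is due 180 s later.
```

Further groups, for example per subscriber type, would only require additional entries in the delay table and a richer `classify` function.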

In some embodiments, the processing task comprises a telephone call such as a VoIP/SIP or SS7 call, the state data persisting for at least the lifetime of the telephone call. Certain embodiments are particularly suited to telephony which has processing tasks with a limited time duration, e.g. most phone calls are of the order of minutes, high volumes, and wherein state data is only maintained for the duration of the call or for a limited time period after the call has finished.

In some embodiments, the method comprises replicating system data for the active entity from the trigger point, which may occur at a first time. This system data may comprise global configuration data for the processing activity. This data may need to be replicated to enable switch-over to the backup entity.

In accordance with one aspect of the present invention, there is provided a system for replicating state data comprising: an active entity for performing one or more processing tasks, a processing task having associated state data, the state data persisting for a finite time period; a backup entity arranged to take over processing activity from the active entity on failure of the active entity; and a replication manager for managing the replication of state data between the active entity and the backup entity, the replication manager comprising a timer component for measuring a delay period from a trigger point, the replication manager being arranged to instruct the replication of state data for new processing tasks assigned to the active entity from the trigger point and to delay, until the delay period has elapsed, instructing replication of state data for processing tasks assigned to the active entity before the trigger point.

This system provides similar advantages to those discussed above for the method aspect.

In some embodiments, the system comprises a monitoring component for determining one or more processing-activity metrics, the replication manager being arranged to set the delay period based on at least one of the one or more processing-activity metrics.

In some embodiments, one or more of the processing-activity metrics comprise: an average duration of a processing task and a measurement of resource usage on at least one of the active entity and the backup entity.

In some embodiments, the delay period is initialised to a zero value and the replication manager is arranged to set the delay period to a non-zero value when the processing-activity metric meets a predetermined condition.

In some embodiments, a processing task is assigned to one of a first group and a second group, the replication manager being arranged to set the delay period to a first value for the first group and a second value for the second group and replicate state data for each group accordingly.

In some embodiments, the active entity and the backup entity comprise entities for processing SIP messages.

Hence, system features corresponding to method variations may be provided, the advantages of the method variations set out above applying in a corresponding manner to the system features.

In accordance with one aspect of the present invention, there is provided an apparatus for replicating state data comprising: a processor for performing one or more processing tasks; a memory for storing state data associated with a processing task, the state data persisting for a finite time period; a replication manager for managing the replication of state data between the memory and a backup entity, the backup entity arranged to take over processing activity from the apparatus on failure of the apparatus, the replication manager being arranged to instruct replication of state data from the memory to the backup entity for new processing tasks assigned to the apparatus from a first trigger point, the replication manager further being arranged to delay, until a second trigger point, instructing replication of state data from the memory to the backup entity for processing tasks assigned to the active entity before the first trigger point.

This apparatus provides one possible implementation of an active entity. A suitably arranged apparatus may also provide a possible implementation of a backup entity and/or any management entity.

In accordance with one aspect of the present invention, there is provided a computer program product comprising computer program code configured to perform the steps of the method set out above, and any of the method variations, when said computer program code is processed by one or more processors.

Further features and advantages of the invention will become apparent from the following description of preferred embodiments of the invention, given by way of example only, which is made with reference to the accompanying drawings.

Brief Description of the Drawings

Figure 1 is a schematic diagram showing an exemplary active entity and an exemplary backup entity; and

Figure 2 is a sequence diagram showing an exemplary data replication method.

Detailed Description

Figure 1 shows an exemplary highly available system 100. The highly available system comprises a primary server 105-A and a secondary server 105-B. In this example, each server comprises processing cards configured to process VoIP calls; however, a server may comprise any computing hardware capable of acting as a server for processing services. Each processing card comprises one or more processors (i.e. CPUs) and associated memory. A processing card may also comprise disk and/or solid-state storage. The processing card also has access to a communications interface for receiving and sending messages including SIP messages.

Each server 105 in Figure 1 comprises a number of components: a hardware manager 110, a system manager 120 and a SIP entity 130. These components may be implemented by at least one of: hardware such as dedicated or programmable integrated circuits; and software such as modules implemented in a known programming language such as Java, C, or C++, computer program code for the modules being stored in memory and processed in operation by one or more processors. A combination of software and hardware can be used. In certain implementations, an operating support system may be provided. The operating support system may be configured to operate on many different server hardware configurations and/or operating systems. Components may then be configured to operate within the operating support system. The components described herein may also act at the level of an operating system, replacing one or more operating system control functions.

A hardware manager 110 is responsible for determining the state of the server hardware. For example, it may be responsible for detecting whether one or more processing cards are available, i.e. are not in a fault state. A SIP entity 130, in an active configuration, is responsible for performing SIP processing. A system manager 120 is responsible for managing one or more SIP entities 130. For example, in the present example, it determines and records the locations of all SIP entities in the highly available system 100. A hardware manager 110 communicates with a system manager 120 over a hardware manager interface 115. A system manager 120 communicates with a SIP entity 130 over a fault tolerance interface 125.

In use, one of the SIP entities 130 is designated an active SIP entity and one of the SIP entities is designated a backup SIP entity. In the present example, primary server 105-A hosts an active primary SIP entity 130-A and secondary server 105-B hosts a backup secondary SIP entity 130-B. At any point in time, the system manager on the primary server, i.e. the primary system manager 120-A, knows which SIP entities exist across the servers 105. This may be implemented using a message from one or more components as described in more detail below. The primary system manager 120-A is responsible for communicating with the existing SIP entities 130 to inform them when a new entity has been created or destroyed. This may define a trigger point. In the present example, in normal operation, the active primary SIP entity 130-A deals with SIP signalling messages, and replicates the creation, modification, or deletion of SIP dialogs to the backup SIP entity across a replication interface 135 between the primary and secondary servers 105. The replication interface 135 can be a transport level connection between the two servers 105, for example Transmission Control Protocol (TCP) or any other reliable transport protocol.
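The replication of dialog creation, modification and deletion can be pictured with the following sketch. The wire format is not described in the specification; the newline-terminated JSON records, the operation names and the functions `encode` and `apply_update` are assumptions, chosen because such records could be carried over any reliable byte stream such as a TCP connection.

```python
import json


def encode(op, dialog_id, state=None):
    """Active side: serialise one replication record as a JSON line (assumed format)."""
    record = {"op": op, "id": dialog_id, "state": state}
    return (json.dumps(record) + "\n").encode("utf-8")


def apply_update(backup_dialogs, line):
    """Backup side: apply one received record to the local copy of the dialogs."""
    record = json.loads(line.decode("utf-8"))
    op, dialog_id = record["op"], record["id"]
    if op in ("create", "modify"):
        backup_dialogs[dialog_id] = record["state"]
    elif op == "delete":
        backup_dialogs.pop(dialog_id, None)
    else:
        raise ValueError(f"unknown replication op: {op!r}")


backup_dialogs = {}
apply_update(backup_dialogs, encode("create", "dialog-1", {"cseq": 1}))
apply_update(backup_dialogs, encode("modify", "dialog-1", {"cseq": 2}))
apply_update(backup_dialogs, encode("delete", "dialog-1"))
# backup_dialogs is empty again once the dialog has been torn down.
```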

On initialisation of a new backup entity, whether that be a normal backup entity or a standby primary entity, the new backup entity must communicate with the active entity, whether that be a primary entity or a backup entity in active mode, in order that state data for all calls that are active on the active entity is replicated to the new backup entity. This is known as "catch-up" replication. Replication may occur, amongst other techniques, using a "push" from an active entity, i.e. initiated by the active entity, using a "pull" from a backup entity, i.e. initiated by the backup entity, or using a third entity such as a hardware and/or system manager. For an entity under normal loading, catch-up replication may take a few minutes. The new backup entity must also replicate new calls assigned to the corresponding active entity, in order that the backup entity has an up-to-date copy of the state of all active calls. In conventional systems, when a failover occurs, most of the resources devoted to catch-up replication are wasted. This is because an average VoIP phone call is estimated to last approximately three minutes. Hence, by the time catch-up replication has finished, most of the calls whose state data is replicated are statistically likely to have finished, e.g. would have been torn down by people hanging up.

Certain embodiments make use of the realisation described above. They introduce the concept of delayed catch-up replication. When a new backup entity is initialised, e.g. when a new server is added to a cluster or when failover occurs, a configurable time period (or general "delay period") is defined during which an active entity replicates any new calls to the new backup entity, but delays replication of existing calls, i.e. the calls that survive the failover. After the configurable time period has passed, catch-up replication of any active existing calls that survived the failover is performed; however, most of these calls are likely to have been torn down if the configurable time period is equal to, or longer than, an average call duration. Hence, a reduced amount of state data needs to be replicated, which requires fewer system resources and results in a much shorter time during which performance of an active entity is affected. As normal catch-up replication typically lasted for a few minutes, due to the need to transfer and process a large quantity of state data, the risk of calls being lost or dropped during the configurable time period is comparable to that of prior methods. Also, a device that fails every few minutes is rare, as such a device would not be a valid functioning piece of networking equipment. In certain variations a more general delay period may be used which may be based on time or other measured values, such as CPU cycles or SIP operations.
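The size of the saving can be estimated under a simple model. Assuming, purely for illustration, that call durations are exponentially distributed (an assumption not made in the specification), the remaining lifetime of a call already in progress has the same distribution, so the fraction of existing calls still active after a delay D with mean duration T is exp(-D/T). The function names and the figure of 100,000 existing calls below are hypothetical.

```python
import math


def surviving_fraction(delay, mean_duration):
    """Fraction of pre-trigger calls still active after `delay` seconds.

    Assumes exponentially distributed call durations (memoryless), which is
    an illustrative model only.
    """
    if mean_duration <= 0:
        raise ValueError("mean_duration must be positive")
    if delay < 0:
        raise ValueError("delay must be non-negative")
    return math.exp(-delay / mean_duration)


def expected_catch_up_size(existing_calls, delay, mean_duration):
    """Expected number of calls whose state still needs catch-up replication."""
    return existing_calls * surviving_fraction(delay, mean_duration)


# Delay period equal to a three-minute mean call duration, 100,000 existing calls.
remaining = expected_catch_up_size(100_000, delay=180.0, mean_duration=180.0)
```

Under this model a delay equal to the mean call duration leaves roughly 37% of the existing calls to replicate, and a delay of twice the mean duration leaves under 14%.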

Figure 2 is an exemplary sequence diagram showing how the components of Figure 1 may interact to implement delayed catch-up replication. It is to be understood that the sequence of operations is provided as an example in order to explain an embodiment. The exact sequence of messages, their form, and the originating components may vary depending on the hardware, software, and/or system requirements of a particular implementation.

Figure 2 shows messages sent between the primary hardware manager 110-A and the primary system manager 120-A over the hardware manager interface 115. It further shows messages sent between the primary system manager 120-A and the active 130-A and backup 130-B SIP entities over the fault tolerance interface 125. The sequence of Figure 2 demonstrates the interactions of said components when a new processing card comprising the secondary server 105-B is installed; however, there may be other initiating actions such as the recovery of a failed primary server or entity. In this latter case, the backup SIP entity 130-B may comprise a standby primary SIP entity.

Returning to Figure 2, the method begins when a new backup card comprising secondary server 105-B is plugged in. In some embodiments, this may define a first trigger point. In previous methods catch-up replication would normally commence at this point. In the present example, however, at step 210 the primary hardware manager 110-A sends a LOCATION_STATUS IND message to the primary system manager 120-A indicating that a new backup location is available. This may implement a registration procedure that enables the system manager 120 to determine the location of all SIP entities, or alternatively other procedures may be used to determine the location of entities. At step 215, the primary system manager 120-A sends a START_LOCATION request to the primary hardware manager 110-A to start a new backup entity at the indicated new backup location, i.e. the processing card comprising secondary server 105-B. In reply to the START_LOCATION request the primary hardware manager 110-A sends a START_LOCATION response to the primary system manager 120-A. This START_LOCATION response comprises a field "delay repl" that may be set to indicate that delayed replication should be used, for example "delay repl = ATG_YES". At step 220, in response to the communications at step 215, the primary system manager 120-A sends a command to the secondary server 105-B to create a backup SIP entity 130-B on the new backup location. This command may be received and enacted by at least one of the secondary system manager 120-B and the secondary hardware manager 110-B. Step 220 is shown as a dashed line to indicate that it is received to create the backup SIP entity 130-B. At step 225, the primary system manager 120-A sends an INITIALIZE_INSTANCE command to the backup SIP entity 130-B to configure the new entity. The backup SIP entity 130-B responds with an acknowledgement indicating that the initialisation procedure is complete and that the backup SIP entity 130-B is ready to receive new backup commands.

At step 230, the primary system manager 120-A sends a NEW_BACKUP request to the primary SIP entity 130-A. This request indicates that a new backup procedure is to be performed and that delayed replication is to be used. At step 235, a START_REPL request from the primary SIP entity 130-A instructs the backup SIP entity 130-B to expect one or more replication requests for essential system or state data. A response to the request is also sent from the backup SIP entity 130-B to the primary SIP entity 130-A to inform the primary SIP entity 130-A that the backup SIP entity 130-B is ready to receive the replication requests. At step 240, one or more CB_REPL messages are exchanged between the primary SIP entity 130-A and the backup SIP entity 130-B. These messages contain data comprising each piece or portion of essential system or state data. A response to each CB_REPL message is also sent from the backup SIP entity 130-B to the primary SIP entity 130-A to confirm that each piece or portion of data has been successfully stored. Based on the exchanges of steps 235 and 240, the primary SIP entity 130-A replicates essential system data, for example global configurations or parameter values, to the backup SIP entity 130-B, omitting SIP dialogs that comprise state data for existing calls on the primary SIP entity 130-A. At step 245, the initial replication procedure completes, which may be indicated by a REPL_COMPLETE message sent by the primary SIP entity 130-A and acknowledged by the backup SIP entity 130-B. At step 250, the primary SIP entity 130-A informs the primary system manager 120-A that the backup SIP entity 130-B is now a valid backup entity able to take over if the primary SIP entity 130-A fails. This may be performed using an acknowledgement to the NEW_BACKUP message at step 230. In some embodiments, this may form a first trigger point, the trigger point being any event following which replication of new calls is performed but replication of existing calls is delayed.
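The exchange of steps 235 to 245 can be sketched in Python as follows. This is a minimal illustration rather than the implementation of the embodiment: the `BackupEntity` class, the `send` helper, the dictionary layout of the primary's state and the "ACK" return value are all assumptions made for the example. Only the message names START_REPL, CB_REPL and REPL_COMPLETE come from the description.

```python
from dataclasses import dataclass, field


@dataclass
class BackupEntity:
    """Hypothetical stand-in for the backup SIP entity 130-B."""
    store: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def handle(self, message, key=None, value=None):
        # Record every message and acknowledge it, since each request in
        # steps 235 to 245 is answered by a response.
        self.log.append(message)
        if message == "CB_REPL":
            self.store[key] = value
        return "ACK"


def send(backup, message, key=None, value=None):
    """Send one message and fail loudly if it is not acknowledged."""
    if backup.handle(message, key, value) != "ACK":
        raise RuntimeError(f"{message} was not acknowledged")


def initial_replication(primary_state, backup):
    """Replicate essential system data only, skipping per-call state.

    `primary_state` is assumed to be a dict with two keys:
    "system" (global configuration) and "dialogs" (existing calls).
    Returns the ids of the dialogs left for the delayed catch-up phase.
    """
    send(backup, "START_REPL")
    for key, value in primary_state["system"].items():
        # One CB_REPL per piece of essential data.
        send(backup, "CB_REPL", key, value)
    send(backup, "REPL_COMPLETE")
    return list(primary_state["dialogs"])


primary_state = {
    "system": {"max_calls": 1000, "codec": "G.711"},
    "dialogs": {"call-1": "confirmed", "call-2": "early"},
}
backup = BackupEntity()
deferred = initial_replication(primary_state, backup)
```

After this runs, the hypothetical backup holds only the two configuration values, and `deferred` lists the two existing calls whose state has not yet been copied.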

Following step 250, dynamic replication can now occur between the primary SIP entity 130-A and the backup SIP entity 130-B to replicate state data associated with new calls. For example, this may involve replicating state data for a call in real-time or near real-time, i.e. following assignment of a new call to the primary SIP entity 130-A for processing after step 250, state data is replicated to the backup SIP entity 130-B as, or shortly after, it is created. This involves the passing of data over replication interface 135. It may involve replicating data held in memory and/or storage of primary server 105-A to memory and/or storage of secondary server 105-B. Once replication for a new call has been set up or started, it may intermittently continue as state data relating to the call changes. In any case, state data for new calls is replicated between the two entities 130 in a configurable time period t before step 255 is initiated. This configurable time period starts from a first time, effectively when the backup SIP entity 130-B is ready for backup operation. In this example, the time period t is configurable; however, in other embodiments the delay period need not be configurable nor need it be explicitly linked to a time period.
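The dynamic replication of new calls may be illustrated with a short Python sketch. The `ActiveEntity` class and its method names are hypothetical, and a plain dictionary stands in for the memory of the backup entity; the sketch only shows the rule that calls assigned after step 250 are replicated as their state changes, while calls assigned earlier are not.

```python
class ActiveEntity:
    """Hypothetical active entity that replicates only new calls."""

    def __init__(self):
        self.calls = {}            # call_id -> latest state
        self.new_calls = set()     # calls assigned after the trigger point
        self.backup_ready = False
        self.replica = {}          # stands in for the backup's memory

    def mark_backup_ready(self):
        # Corresponds to step 250: the backup is now a valid backup.
        self.backup_ready = True

    def assign_call(self, call_id, state):
        if self.backup_ready:
            self.new_calls.add(call_id)
        self.update_call(call_id, state)

    def update_call(self, call_id, state):
        self.calls[call_id] = state
        # Replicate as, or shortly after, the state is created or changed,
        # but only for calls assigned after the trigger point.
        if call_id in self.new_calls:
            self.replica[call_id] = state

    def end_call(self, call_id):
        self.calls.pop(call_id, None)
        self.new_calls.discard(call_id)
        self.replica.pop(call_id, None)


entity = ActiveEntity()
entity.assign_call("old-1", "confirmed")   # assigned before step 250
entity.mark_backup_ready()
entity.assign_call("new-1", "early")       # assigned after step 250
entity.update_call("new-1", "confirmed")
entity.update_call("old-1", "on-hold")     # changes, but is not replicated yet
```

Here only "new-1" reaches the replica; "old-1" waits for the delayed catch-up phase.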

In this example, the primary hardware manager 110-A is responsible for monitoring the passing of configurable time period t. This may involve waiting while a timer component or object monitors for a time equal to the first time plus the configurable time period t. This timer component or object may generate an event that is sent to the primary hardware manager 110-A. This event may comprise a second trigger point. However the system is implemented, at a second time equal to the first time plus the configurable time period, step 255 commences and the hardware manager sends a LOCATION_STATUS_IND indication for the backup location to the primary system manager 120-A. This indication has a delay replication field value, in this case "delay_repl = ATG_NO", to indicate that replication should not be delayed any longer. At step 260, the primary system manager 120-A sends a BACKUP_CONTINUE request to the primary SIP entity 130-A. This request indicates that catch-up replication should commence for the new backup location, i.e. for backup SIP entity 130-B. This catch-up replication comprises the replication of state data for processing tasks active at the second time that were also active at the first time, i.e. state data associated with calls that existed at step 250 when the primary system manager 120-A was informed that the backup SIP entity 130-B was ready for backup operation and that still exist following the configurable time period t.
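The selection of calls for catch-up replication can be sketched as a simple filter. In the Python sketch below, the function name, the argument layout and the example times are assumptions chosen for illustration; the rule itself, namely calls that existed at the first time and still exist at the second time, is taken from the description.

```python
def calls_needing_catch_up(call_start_times, active_at_second_time, first_time):
    """Return calls that existed at the first time and are still active.

    call_start_times: dict of call_id -> start time in seconds (assumed layout)
    active_at_second_time: set of call ids still active when period t expires
    first_time: time at which the backup became ready (step 250)
    """
    return sorted(
        call_id
        for call_id in active_at_second_time
        if call_start_times[call_id] < first_time
    )


# Backup ready at t = 100 s; period t = 180 s, so the second time is 280 s.
start_times = {"a": 10, "b": 40, "c": 120, "d": 90}
still_active = {"b", "c", "d"}   # call "a" ended during the delay period
catch_up = calls_needing_catch_up(start_times, still_active, first_time=100)
```

In this example call "a" ended before the second time and is never replicated, and call "c" started after the first time and was already replicated dynamically, so only "b" and "d" are caught up.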

Configurable time period t is typically set as the average call duration for the system.
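The effect of this choice can be illustrated numerically. The sketch below assumes, purely for illustration, that call durations follow an exponential distribution; neither this assumption nor the function name comes from the description. Under that assumption, the fraction of calls existing at the first time that are still active after a delay t is exp(-t / mean duration).

```python
import math


def expected_fraction_remaining(delay_seconds, mean_call_seconds):
    """Fraction of pre-existing calls expected to survive the delay.

    Assumes exponentially distributed call durations (an assumption made
    for this example only), for which the remaining duration of an active
    call has the same mean as a new call.
    """
    return math.exp(-delay_seconds / mean_call_seconds)


# Delay equal to an assumed average call duration of 180 seconds.
fraction = expected_fraction_remaining(180.0, 180.0)   # about 0.37
```

Under this assumed model, waiting for one average call duration means that roughly a third of the calls that existed at the first time still need catch-up replication; the rest have ended and need no replication at all.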

At step 265, a START_DELAY_REPL request from the primary SIP entity 130-A instructs the backup SIP entity 130-B to expect one or more replication requests for non-essential state data, i.e. state data relating to existing calls. A response to the request is also sent from the backup SIP entity 130-B to the primary SIP entity 130-A to inform the primary SIP entity 130-A that the backup SIP entity 130-B is ready to receive the replication requests. At step 270, one or more CB_REPL messages are exchanged between the primary SIP entity 130-A and the backup SIP entity 130-B. These messages contain data comprising each piece or portion of non-essential state data. A response to each CB_REPL message is also sent from the backup SIP entity 130-B to the primary SIP entity 130-A to confirm that each piece or portion of data has been successfully stored. Hence, at steps 265 and 270, the primary SIP entity 130-A replicates SIP dialogs and/or SIP subscriptions omitted during the initial replication procedure of steps 235 to 245. This may comprise sending data over replication interface 135 to the secondary server 105-B and in particular to the backup SIP entity 130-B. At step 275, the delayed catch-up replication completes, which may be indicated by a DELAY_REPL_COMPLETE message sent by the primary SIP entity 130-A and acknowledged by the backup SIP entity 130-B. At step 280, the primary SIP entity 130-A informs the system manager 120-A that the backup SIP entity 130-B has now fully replicated the SIP state of the primary SIP entity 130-A. This may be performed using an acknowledgement to the BACKUP_CONTINUE message at step 260.

As can be seen from Figure 2, certain embodiments comprise a first replication phase wherein state data for new processes and/or essential system data is replicated between a primary or active SIP entity and a backup SIP entity, followed by a delay, followed by a second replication phase wherein state data for non-essential data and/or processes that existed at the time of the first replication phase, and that still exist, are replicated between a primary or active SIP entity and a backup SIP entity.

The above embodiments are to be understood as illustrative examples of the invention. Further embodiments of the invention are envisaged. For example, the system and/or methods described herein may be applied to any kind of "highly available" device that holds persistent state and where each piece of state has a finite lifetime. For example, it can be applied to any device that handles voice and/or video calls (or any time limited process), including those using protocols other than SIP such as SS7. Even though the example of Figure 2 has a primary hardware manager that monitors the passing of the configurable time period, any of the components discussed herein, or an additionally supplied component, could perform this function.

For example, in a peer-to-peer arrangement the primary and secondary servers may only comprise SIP entities, in which case the functionality described herein may be incorporated into the SIP entity. In this manner, the primary or backup SIP entity could implement the delay between the replication phases.

In certain variations the configurable time period is determined based on one or more measured variables. For example, the average call duration may be calculated based on historic and/or concurrent processing, i.e. dynamically, such that if the average call duration were to change the configurable time period would be updated accordingly. In some variations delayed catch-up replication may be withheld for certain call types. For example, for emergency calls the configurable time period may be set to zero or a small value, or steps 265 to 275 may be performed concurrently with steps 235 to 245; for non-emergency calls the method of Figure 2 may be used with a larger configurable time period value. Likewise, delayed catch-up replication may only be applied to particular state data, in effect having a first group of data that is replicated with the essential system data in steps 235 to 245 and a second group of data that is replicated in steps 265 to 275. More than two groups may be used, with each group having a defined configurable time period value.
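These variations can be sketched together in Python. The `DelayPolicy` class, the group names "emergency" and "ordinary", and the use of an incremental mean are assumptions made for this example; the description only requires that the period may track a measured average call duration and may differ between groups.

```python
class DelayPolicy:
    """Hypothetical policy mapping call groups to delay periods."""

    def __init__(self, initial_average_seconds):
        # Used only until the first completed call has been measured.
        self.average_seconds = initial_average_seconds
        self.completed_calls = 0

    def record_completed_call(self, duration_seconds):
        # Incremental mean over completed calls, so the delay period
        # follows the measured average call duration.
        self.completed_calls += 1
        self.average_seconds += (
            duration_seconds - self.average_seconds
        ) / self.completed_calls

    def delay_for(self, group):
        # Emergency calls are caught up immediately; all other calls
        # wait for one average call duration.
        return 0.0 if group == "emergency" else self.average_seconds


policy = DelayPolicy(initial_average_seconds=180.0)
initial_delay = policy.delay_for("ordinary")
policy.record_completed_call(100.0)
policy.record_completed_call(200.0)
updated_delay = policy.delay_for("ordinary")
emergency_delay = policy.delay_for("emergency")
```

A deployment wanting more than two groups could replace the conditional in `delay_for` with a lookup table of per-group values.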

In certain variations, system resources such as CPU load may be monitored, with the configurable time period t being based on one or more monitored system resource values. If one or more monitored system resource values are below one or more respective thresholds, catch-up replication may proceed immediately, i.e. with the configurable time period t equal to zero, and if not, catch-up replication may be delayed as described herein. In some variations, a delay period acting as the configurable time period may be calculated as a function of one or more processing metrics. The value of the configurable time period t may also be chosen so as to pace the progress of replication so that replication of new calls has priority over catch-up replication. In some embodiments, the delay period need not be a time period, but could be a function of one or more processing-activity metrics, for example a certain number of processor cycles.
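The threshold rule can be sketched as follows. The function name, the metric names and the choice to require every monitored value to be below its threshold are assumptions for this example; the description leaves open how several metrics are combined.

```python
def catch_up_delay(metrics, thresholds, configured_delay_seconds):
    """Return 0 when the system is lightly loaded, else the configured delay.

    metrics: dict of metric name -> current value (hypothetical names)
    thresholds: dict of metric name -> threshold for that metric
    """
    all_below = all(
        metrics[name] < limit for name, limit in thresholds.items()
    )
    return 0.0 if all_below else configured_delay_seconds


thresholds = {"cpu_load": 0.5, "memory_use": 0.8}
idle_delay = catch_up_delay(
    {"cpu_load": 0.2, "memory_use": 0.4}, thresholds, 180.0
)
busy_delay = catch_up_delay(
    {"cpu_load": 0.9, "memory_use": 0.4}, thresholds, 180.0
)
```

With the example values, the lightly loaded system proceeds to catch-up replication immediately, while the busy system waits for the configured period.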

It is to be understood that any feature described in relation to any one embodiment may be used alone, or in combination with other features described, and may also be used in combination with one or more features of any other of the embodiments, or any combination of any other of the embodiments. Furthermore, equivalents and modifications not described above may also be employed without departing from the scope of the invention, which is defined in the accompanying claims.

Claims (19)

  1. A method for replicating state data between an active entity and a backup entity, the backup entity configured to take over processing activity from the active entity on failure of the active entity, the processing activity comprising one or more processing tasks, wherein state data associated with a processing task persists for a finite time period, the method comprising: replicating state data for new processing tasks assigned to the active entity from a trigger point, the state data being replicated between the active entity and the backup entity; delaying, for a delay period, replication of state data for processing tasks assigned to the active entity before the trigger point; and after the delay period, replicating, between the active entity and the backup entity, state data for said processing tasks assigned to the active entity before the trigger point.
  2. The method of claim 1, wherein the step of replicating state data for new processing tasks comprises replicating state data for a processing task following assignment of said processing task to said active entity.
  3. The method of any one of the preceding claims, wherein the delay period is calculated based on one or more processing-activity metrics.
  4. The method of claim 3, wherein the one or more processing-activity metrics comprise an average duration of a processing task.
  5. The method of claim 3 or 4, wherein the one or more processing-activity metrics comprise a measurement of resource usage on at least one of the active entity and the backup entity.
  6. The method of any one of claims 3, 4 or 5, wherein the delay period is initially zero, the method comprising: setting the delay period to a non-zero value when at least one of the one or more processing-activity metrics meet a predetermined condition.
  7. The method of any one of the preceding claims comprising: determining whether a processing task belongs to a first group or a second group; and setting the delay period to a first value for the first group and a second value for the second group, wherein replicating state data comprises replicating state data for processing tasks in the first group and replicating state data for processing tasks in the second group.
  8. The method of any one of the preceding claims, wherein a processing task comprises a telephone call, the state data persisting for at least the lifetime of the telephone call.
  9. The method of any one of the preceding claims, wherein a processing task comprises a SIP process, the state data persisting for at least the lifetime of the SIP process.
  10. The method of any one of the preceding claims, comprising: replicating system data for the active entity from the trigger point.
  11. The method of any one of the preceding claims, wherein the delay period is configurable.
  12. A system for replicating state data comprising: an active entity for performing one or more processing tasks, a processing task having associated state data, the state data persisting for a finite time period; a backup entity arranged to take over processing activity from the active entity on failure of the active entity; and a replication manager for managing the replication of state data between the active entity and the backup entity, the replication manager comprising a timer component for measuring a delay period from a trigger point, the replication manager being arranged to instruct the replication of state data for new processing tasks assigned to the active entity from the trigger point and to delay, until the delay period has elapsed, instructing replication of state data for processing tasks assigned to the active entity before the trigger point.
  13. The system of claim 12, comprising: a monitoring component for determining one or more processing-activity metrics, the resource manager being arranged to set the delay period based on the processing activity metric.
  14. The system of claim 13, wherein the one or more processing-activity metrics comprise at least one of: an average duration of a processing task; and a measurement of resource usage on at least one of the active entity and the backup entity.
  15. The system of claim 13 or 14, wherein the delay period is initialised to a zero value and the replication manager is arranged to set the delay period to a non-zero value when at least one of the one or more processing-activity metrics meet a predetermined condition.
  16. The system of any one of claims 12 to 15, wherein a processing task is assigned to one of a first group and a second group, the replication manager being arranged to set the delay period to a first value for the first group and a second value for the second group and replicate state data for each group accordingly.
  17. The system of any one of claims 12 to 16, wherein the active entity and the backup entity comprise entities for processing SIP messages.
  18. Apparatus for replicating state data comprising: a processor for performing one or more processing tasks; a memory for storing state data associated with a processing task, the state data persisting for a finite time period; a replication manager for managing the replication of state data between the memory and a backup entity, the backup entity arranged to take over processing activity from the apparatus on failure of the apparatus, the replication manager being arranged to instruct replication of state data from the memory to the backup entity for new processing tasks assigned to the apparatus from a first trigger point, the replication manager further being arranged to delay, until a second trigger point, instructing replication of state data from the memory to the backup entity for processing tasks assigned to the active entity before the first trigger point.
  19. A computer program product comprising computer program code configured to perform the steps of any one of method claims 1 to 11 when said computer program code is processed by one or more processors.
GB1114826.9A 2011-08-26 2011-08-26 Data replication for a backup entity Active GB2485866B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB1114826.9A GB2485866B (en) 2011-08-26 2011-08-26 Data replication for a backup entity

Publications (3)

Publication Number Publication Date
GB201114826D0 GB201114826D0 (en) 2011-10-12
GB2485866A true GB2485866A (en) 2012-05-30
GB2485866B GB2485866B (en) 2012-10-10

Family

ID=44838812

Family Applications (1)

Application Number Title Priority Date Filing Date
GB1114826.9A Active GB2485866B (en) 2011-08-26 2011-08-26 Data replication for a backup entity

Country Status (1)

Country Link
GB (1) GB2485866B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9178833B2 (en) 2011-10-25 2015-11-03 Nicira, Inc. Chassis controller

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020107966A1 (en) * 2001-02-06 2002-08-08 Jacques Baudot Method and system for maintaining connections in a network
US7321992B1 (en) * 2002-05-28 2008-01-22 Unisys Corporation Reducing application downtime in a cluster using user-defined rules for proactive failover
US20100325485A1 (en) * 2009-06-22 2010-12-23 Sandeep Kamath Systems and methods for stateful session failover between multi-core appliances

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10033579B2 (en) 2012-04-18 2018-07-24 Nicira, Inc. Using transactions to compute and propagate network forwarding state
US9843476B2 (en) 2012-04-18 2017-12-12 Nicira, Inc. Using transactions to minimize churn in a distributed network control system
US10135676B2 (en) 2012-04-18 2018-11-20 Nicira, Inc. Using transactions to minimize churn in a distributed network control system
US9667447B2 (en) 2013-07-08 2017-05-30 Nicira, Inc. Managing context identifier assignment across multiple physical domains
US10069676B2 (en) 2013-07-08 2018-09-04 Nicira, Inc. Storing network state at a network controller
WO2015006143A1 (en) * 2013-07-08 2015-01-15 Nicira, Inc. Unified replication mechanism for fault-tolerance of state
US10218564B2 (en) 2013-07-08 2019-02-26 Nicira, Inc. Unified replication mechanism for fault-tolerance of state
US9973382B2 (en) 2013-08-15 2018-05-15 Nicira, Inc. Hitless upgrade for network control applications
US10091120B2 (en) 2014-05-05 2018-10-02 Nicira, Inc. Secondary input queues for maintaining a consistent network state
US10164894B2 (en) 2014-05-05 2018-12-25 Nicira, Inc. Buffered subscriber tables for maintaining a consistent network state
US9967134B2 (en) 2015-04-06 2018-05-08 Nicira, Inc. Reduction of network churn based on differences in input state
US9923760B2 (en) 2015-04-06 2018-03-20 Nicira, Inc. Reduction of churn in a network control system
US10204122B2 (en) 2015-09-30 2019-02-12 Nicira, Inc. Implementing an interface between tuple and message-driven control entities
