US20170357552A1 - Technologies for data center environment checkpointing - Google Patents
Info
- Publication number
- US20170357552A1 (Application No. US 15/670,707)
- Authority
- US
- United States
- Prior art keywords
- node
- computing node
- checkpoint
- computing
- applications
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1446—Point-in-time backing up or restoration of persistent data
- G06F11/1448—Management of the data involved in backup or backup restore
- G06F11/1451—Management of the data involved in backup or backup restore by selection of backup contents
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1415—Saving, restoring, recovering or retrying at system level
- G06F11/1438—Restarting or rejuvenating
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1446—Point-in-time backing up or restoration of persistent data
- G06F11/1458—Management of the backup or restore process
- G06F11/1469—Backup restoration techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
- G06F2201/84—Using snapshots, i.e. a logical point-in-time copy of the data
Definitions
- High-performance computing (HPC) and cloud computing environments may incorporate distributed or multi-tier applications and workloads.
- more than one instance of a workload may be executing at the same time across multiple applications and/or computing devices (e.g., servers). Crashes or other errors occurring in the course of processing such distributed workloads may cause the loss of application state and thus may require large amounts of computational work to be repeated. Accordingly, crashes in large-scale computing environments may be quite costly and time-consuming.
- Some HPC and cloud computing environments support software-based application checkpointing.
- Typical application checkpointing solutions are purely software-based and allow the computing environment to store periodic snapshots (i.e., checkpoints) of the state of a running application, a virtual machine, or a workload in a non-distributed or single-tier computing environment. Based on the saved checkpoints, a suspended or interrupted application may be resumed or replayed starting from the state of a saved checkpoint, which may allow for quicker or less-expensive crash recovery.
- software checkpointing support may require the checkpointing software to be re-engineered for each supported application and/or operating system.
- FIG. 1 is a simplified block diagram of at least one embodiment of a system for supporting data center environment checkpointing that includes an orchestration node and working computing nodes;
- FIG. 2 is a simplified block diagram of at least one embodiment of a computing node of the system of FIG. 1 ;
- FIG. 3 is a simplified block diagram of at least one embodiment of an environment that may be established by the orchestration node of FIG. 1 ;
- FIG. 4 is a simplified block diagram of at least one embodiment of an environment that may be established by at least one of the additional computing nodes of FIG. 2 ;
- FIG. 5 is a simplified flow diagram of at least one embodiment of a method for initializing a distributed application that may be executed by one or more of the working computing nodes of FIG. 4 ;
- FIG. 6 is a simplified flow diagram of at least one embodiment of a method for administering an environment checkpointing event that may be executed by the orchestration node of FIG. 3 ;
- FIG. 7 is a simplified flow diagram of at least one embodiment of a method for performing a checkpointing event that may be executed by one or more of the working computing nodes of FIG. 4 ;
- FIG. 8 is a simplified flow diagram of at least one embodiment of a method for performing an environment restore event that may be executed by one or more of the working computing nodes of FIG. 4 .
- references in the specification to “one embodiment,” “an embodiment,” “an illustrative embodiment,” etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may or may not necessarily include that particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to effect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.
- items included in a list in the form of “at least one of A, B, and C” can mean (A); (B); (C); (A and B); (A and C); (B and C); or (A, B, and C).
- items listed in the form of “at least one of A, B, or C” can mean (A); (B); (C); (A and B); (A and C); (B and C); or (A, B, and C).
- the disclosed embodiments may be implemented, in some cases, in hardware, firmware, software, or any combination thereof.
- the disclosed embodiments may also be implemented as instructions carried by or stored on one or more transitory or non-transitory machine-readable (e.g., computer-readable) storage media, which may be read and executed by one or more processors.
- a machine-readable storage medium may be embodied as any storage device, mechanism, or other physical structure for storing or transmitting information in a form readable by a machine (e.g., a volatile or non-volatile memory, a media disc, or other media device).
- a system 100 for data center environment checkpointing includes a plurality of computing nodes 102 communicatively coupled via a backplane management controller 112 in a compute environment 114 .
- Each of the plurality of computing nodes 102 is capable of executing one or more applications, or services, and responding to checkpointing events.
- the illustrative computing nodes 102 include an orchestration node 104 for managing resources (e.g., central processing unit (CPU) resources, storage resources, network resources) and/or distributing workloads across working computing nodes 110 (e.g., illustratively, the computing nodes 106 , 108 ), which are registered with the orchestration node 104 .
- the illustrative working computing nodes 110 include a first computing node, which is designated as computing node ( 1 ) 106 , and a second computing node, which is designated as computing node (N) 108 (i.e., the “Nth” computing node of the working computing nodes 110 , wherein “N” is a positive integer and designates one or more additional computing nodes 110 that are registered with the orchestration node 104 ).
- Each of the plurality of computing nodes 102 is capable of executing one or more applications and includes hardware capable of supporting checkpointing (i.e., hardware-assisted checkpointing support).
- Hardware checkpointing support may allow for improved checkpointing performance, reliability, and scalability compared to software-only implementations. Additionally, because hardware checkpointing may be transparent to executing applications, checkpointing support may be provided for existing applications without requiring re-engineering (e.g., modifying code, recompiling code, etc.) of the underlying software.
- the orchestration node 104 is additionally configured to administer an environment checkpointing event. To do so, in use, the orchestration node 104 provides a checkpoint initialization signal, distributed via the backplane management controller 112 , to the working computing nodes 110 . Each of the working computing nodes 110 that receive the checkpoint initialization signal pauses the execution of local applications (i.e., workload processing processes, threads, virtual machines, etc.) presently running on the corresponding working computing node 110 , atomically saves the states of the paused applications (i.e., the application checkpointing data) using the hardware checkpoint support, and transmits the application checkpointing data back to the orchestration node 104 .
- the orchestration node 104 then aggregates the application checkpointing data received from each of the working computing nodes 110 and, upon having received the application checkpointing data from all of the working computing nodes 110 , provides a checkpoint complete signal to the working computing nodes 110 to indicate to the working computing nodes 110 that they may resume execution of the previously paused applications.
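- The checkpoint handshake described above can be summarized as a small set of signals exchanged over the backplane management controller 112 . The following sketch is illustrative only: the class names, the `time_sync` field, and the message layout are assumptions introduced for clarity and are not drawn from the specification.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Any, Dict


class Signal(Enum):
    """Signals sent by the orchestration node to the working computing nodes.

    The names are hypothetical; the specification describes these signals in
    prose (checkpoint initialization, checkpoint complete, checkpoint restore).
    """
    CHECKPOINT_INIT = auto()
    CHECKPOINT_COMPLETE = auto()
    CHECKPOINT_RESTORE = auto()


@dataclass
class CheckpointInit:
    """Checkpoint initialization signal; carries time-sync information."""
    time_sync: float  # hypothetical representation of the time sync information


@dataclass
class CheckpointingData:
    """Application checkpointing data returned by one working computing node."""
    node_id: str
    app_states: Dict[str, Any] = field(default_factory=dict)  # application id -> saved execution state
```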
- any of the computing nodes 102 of the compute environment 114 (i.e., illustratively, the computing node 106 or the computing node 108 ) may be designated as the “orchestration” node and is referred to as such in the following description.
- the plurality of computing nodes 102 and the backplane management controller 112 may be configured in a physical housing that facilitates the communication enabling connections between the computing nodes 102 and the backplane management controller 112 .
- the physical housing may be a rack in a rack-mounted configuration (i.e., the computing nodes 102 are rack-mounted servers), a blade server chassis in a blade server configuration (i.e., the computing nodes 102 are blade servers), or any other type of physical housing capable of facilitating the communication enabling connections between the computing nodes 102 and the backplane management controller 112 .
- the compute environment 114 may additionally include various other components, such as power supplies, fans, etc., which are not illustrated herein for clarity of the description. It should be appreciated, however, that in some embodiments, the process and/or workload distribution may not be self-contained to just the computing nodes 102 on the rack or in the chassis, such as in a cross-rack orchestration or a cross cloud orchestration, for example. In such embodiments, the compute environment 114 may encompass the various network devices and computing nodes 102 associated with the cross-rack orchestration or the cross cloud orchestration.
- the backplane management controller 112 may be embodied as any type of circuitry and/or components capable of performing the functions described herein, such as an enclosure management controller (EMC), a baseboard management controller (BMC), a chassis management controller (CMC), or any type of backplane management controller capable of facilitating the backend connectivity and transmission of communications across the computing nodes 102 , such as between the orchestration node 104 and the working computing nodes 110 .
- the computing nodes 102 may be embodied as any type of computation or computer device capable of performing the functions described herein, including, without limitation, a computer, a multiprocessor system, a server, a rack-mounted server, a blade server, a smartphone, a tablet computer, a laptop computer, a notebook computer, a mobile computing device, a wearable computing device, a network appliance, a web appliance, a distributed computing system, a processor-based system, and/or a consumer electronic device.
- As shown in FIG. 2, one of the computing nodes 102 illustratively includes a processor 202 , an input/output (I/O) subsystem 208 , a memory 212 , a data storage device 214 , and communication circuitry 216 .
- the computing node 102 may include other or additional components, such as those commonly found in a computer (e.g., various input/output devices), in other embodiments.
- one or more of the illustrative components may be incorporated in, or otherwise form a portion of, another component.
- the memory 212 or portions thereof, may be incorporated in the processor 202 in some embodiments.
- the processor 202 may be embodied as any type of processor capable of performing the functions described herein.
- the processor 202 may be embodied as a single or multi-core processor(s), digital signal processor, microcontroller, or other processor or processing/controlling circuit.
- the processor 202 illustratively includes hardware checkpoint support 204 and a hardware event monitor 206 .
- the hardware checkpoint support 204 may be embodied as any hardware component, microcode, firmware, or other component of the processor 202 capable of saving the execution state (e.g., a virtual memory state) of a currently executing application.
- the hardware checkpoint support 204 may be embodied as one or more dedicated processor instructions and associated memory management functions of the processor 202 that causes all or part of the virtual memory space of the current application to be saved to nonvolatile storage.
- the hardware checkpoint support 204 or a portion thereof, may be embodied as firmware or software executable by the processor 202 or other component of the computing node 102 .
- the hardware checkpoint support 204 (or instructions thereof) may be stored in memory (e.g., the memory 212 ).
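- Because the hardware checkpoint support 204 is described only functionally, the sketch below models it as an opaque callable wrapped by a software system hook, purely to illustrate how an operating system might expose such a primitive. Every name here (`hw_save_state`, `save_checkpoint`) is hypothetical and not part of any real instruction set or API.

```python
from typing import Any, Callable, Dict


def make_checkpoint_hook(hw_save_state: Callable[[int], bytes]) -> Callable[[int], Dict[str, Any]]:
    """Return a system-hook style function around a stand-in for the hardware
    checkpoint support 204.

    `hw_save_state` is assumed to take a process id and return an opaque blob
    representing all or part of that process's virtual memory space, as the
    dedicated processor instructions described above would.
    """
    def save_checkpoint(pid: int) -> Dict[str, Any]:
        blob = hw_save_state(pid)            # invoke the (hypothetical) hardware primitive
        return {"pid": pid, "state": blob}   # caller persists this to nonvolatile storage
    return save_checkpoint


# Usage with a dummy stand-in for the hardware primitive:
hook = make_checkpoint_hook(lambda pid: b"\x00" * 16)
assert hook(1234)["pid"] == 1234
```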
- the hardware event monitor 206 may be embodied as any hardware component, microcode, firmware, or other component of the processor 202 capable of notifying software executed by the processor 202 of system events occurring within the processor 202 , such as memory access events, cache access events, and/or checkpointing events.
- the hardware event monitor 206 may be embodied as one or more performance counters, performance monitoring units, cache monitoring units, or other hardware counters of the processor 202 .
- the computing node 102 may facilitate the orchestration of the checkpointing event through a main platform firmware, or pre-boot firmware, such as an extension of the Intel platform chipset or the platform Basic Input/Output System (BIOS) based on the Unified Extensible Firmware Interface (“UEFI”) specification, which has several versions published by the Unified EFI Forum.
- BIOS may reside in the memory 212 and include instructions to initialize the computing node 102 during the boot process.
- the hardware event monitor 206 or a portion thereof, may be embodied as firmware or software executable by the processor 202 or other component of the computing node 102 .
- the hardware event monitor 206 (or instructions thereof) may be stored in memory (e.g., the memory 212 ).
- the memory 212 may be embodied as any type of volatile or non-volatile memory or data storage capable of performing the functions described herein.
- the memory 212 may store various data and software used during operation of the computing node 102 such as operating systems, applications, programs, libraries, and drivers.
- the memory 212 is communicatively coupled to the processor 202 via the I/O subsystem 208 , which may be embodied as circuitry and/or components to facilitate input/output operations with the processor 202 , the memory 212 , and other components of the computing node 102 .
- the I/O subsystem 208 may be embodied as, or otherwise include, memory controller hubs, input/output control hubs, firmware devices, communication links (i.e., point-to-point links, bus links, wires, cables, light guides, printed circuit board traces, etc.) and/or other components and subsystems to facilitate the input/output operations.
- the I/O subsystem 208 further includes an I/O buffering device 210 .
- the I/O buffering device 210 may be embodied as any hardware component, microcode, firmware, or other component of the I/O subsystem 208 that is capable of buffering I/O signals during a checkpointing event and notifying software executed by the processor 202 of system I/O events occurring within the computing node 102 , such as disk access events, memory access events, network access events, checkpointing events, or other system events.
- the I/O buffering device 210 may be embodied as one or more bit identifiers, performance counters, performance monitoring units, or other hardware counters of the I/O subsystem 208 .
- the I/O subsystem 208 may form a portion of a system-on-a-chip (SoC) and be incorporated, along with the processor 202 , the memory 212 , and other components of the computing node 102 , on a single integrated circuit chip.
- the data storage device 214 may be embodied as any type of device or devices configured for short-term or long-term storage of data such as, for example, memory devices and circuits, memory cards, hard disk drives, solid-state drives, or other data storage devices. In use, as described below, the data storage device 214 may store application checkpointing data such as saved execution states or other, similar data.
- the communication circuitry 216 of the computing node 102 may be embodied as any communication circuit, device, or collection thereof, capable of enabling communications between the computing node 102 and the orchestration node 104 , and between the computing node 102 and remote devices over a network (not shown).
- the communication circuitry 216 may be configured to use any one or more communication technology (e.g., wired or wireless communications) and associated protocols (e.g., Ethernet, Bluetooth®, Wi-Fi®, WiMAX, etc.) to effect such communication.
- the computing node 102 may also include a checkpoint cache 218 .
- the checkpoint cache 218 may be embodied as any type of device or devices configured for short-term or long-term storage of data such as, for example, memory devices and circuits, memory cards, hard disk drives, solid-state drives, or other data storage devices.
- the checkpoint cache 218 may be embodied as a flash-based memory storage device for storing persistent information.
- the checkpoint cache 218 may store application checkpointing data such as saved execution states or other, similar data.
- the computing node 102 may also include one or more peripheral devices 220 .
- the peripheral devices 220 may include any number of additional input/output devices, interface devices, and/or other peripheral devices.
- the peripheral devices 220 may include a display, touch screen, graphics circuitry, keyboard, mouse, speaker system, and/or other input/output devices, interface devices, and/or peripheral devices.
- each of the computing nodes 102 of FIG. 1 is capable of being configured as the orchestration node 104 or one of the working computing nodes 110 (i.e., illustratively, the computing nodes 106 , 108 ). Accordingly, the environments that may be established during operation of the computing nodes 102 , as shown in FIGS. 3 and 4 , depend on whether a particular computing node 102 is functioning as the orchestration node 104 or as one of the working computing nodes 110 .
- the orchestration node 104 establishes an environment 300 during operation.
- the illustrative environment 300 includes a registration interface module 310 , a distributed application coordination module 320 , and an environment checkpoint administration module 330 .
- Each of the modules, logic, and other components of the environment 300 may be embodied as hardware, software, firmware, or a combination thereof.
- each of the modules, logic, and other components of the environment 300 may form a portion of, or otherwise be established by, the processor 202 or other hardware components of the orchestration node 104 .
- one or more of the modules of the environment 300 may be embodied as a circuit or collection of electrical devices (e.g., a registration interface circuit, a distributed application coordination circuit, an environment checkpoint administration circuit, etc.).
- the registration interface module 310 , the distributed application coordination module 320 , and/or the environment checkpoint administration module 330 may be embodied as one or more components of a virtualization framework of the orchestration node 104 such as a hypervisor or virtual machine monitor (VMM).
- the orchestration node 104 may include other components, sub-components, modules, sub-modules, and devices commonly found in a computing device, which are not illustrated in FIG. 3 for clarity of the description.
- the orchestration node 104 additionally includes environment checkpointing data 302 and registration data 304 , each of which may be accessed by one or more of the various modules and/or sub-modules of the orchestration node 104 .
- the environment checkpointing data 302 may include a hash table of the state of various connections and processes associated with each distributed application (i.e., master persistency).
- the environment checkpointing data 302 may be transmitted by the orchestration node 104 to the working computing nodes 110 (i.e., environment persistency).
- the registration interface module 310 is configured to receive registration information from the working computing nodes 110 that includes identification information related to the working computing nodes 110 , such as a computing node identifier, a process identifier, a virtual machine identifier, etc.
- the registration interface module 310 is further configured to register the working computing nodes 110 with the orchestration node 104 based on the received registration information.
- the registration information may include computing node identification data, application (see application 410 of FIG. 4 ) related data, such as thread or process identifiers (e.g., names, IDs, etc.).
- the registration interface module 310 may store the received registration information in the registration data 304 .
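- A minimal sketch of the bookkeeping the registration interface module 310 might perform, assuming a simple in-memory table for the registration data 304 ; the class and field names are illustrative only.

```python
import itertools
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Registration:
    """Registration information received from one working computing node."""
    node_id: str
    processes: List[str] = field(default_factory=list)    # thread or process names/IDs
    connections: List[str] = field(default_factory=list)  # connections between applications


class RegistrationInterface:
    """In-memory stand-in for the registration data 304."""

    def __init__(self) -> None:
        self._table: Dict[str, Registration] = {}
        self._counter = itertools.count(1)

    def register(self, node_id: str, processes: List[str], connections: List[str]) -> Dict[str, str]:
        """Store the registration information and return an identifier assigned
        by the orchestration node for each registered object."""
        self._table[node_id] = Registration(node_id, processes, connections)
        return {obj: f"{node_id}/{next(self._counter)}" for obj in processes + connections}
```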
- the distributed application coordination module 320 is configured to coordinate (i.e., spawn, distribute, monitor, adjust, allocate resources, terminate, etc.) each application, hypervisor, or master thread of a distributed application to be performed at the working computing nodes 110 . To do so, the distributed application coordination module 320 is configured to initiate applications, record dependencies, and generate child thread identifiers and/or connection identifiers based on objects registered by the working computing nodes 110 for the initiated applications, such as the registration information received and registered at the registration interface module 310 . Additionally, the distributed application coordination module 320 may be further configured to coordinate (i.e., track) the various signals and events triggered by and/or received at the orchestration node 104 .
- the environment checkpoint administration module 330 is configured to administer atomic checkpointing operations and track the checkpointing operations across the working computing nodes 110 .
- the computing nodes 102 may span a rack, a blade, a data center, or any number of cloud-provider environments.
- the environment checkpoint administration module 330 is configured to transmit a checkpoint initialization signal that includes time sync information to the working computing nodes 110 of the compute environment 114 that are presently registered with the orchestration node 104 .
- the environment checkpoint administration module 330 is further configured to receive checkpointing information (e.g., application state information) from each other working computing node 110 registered with the orchestration node 104 upon completion of the checkpointing event at each other working computing node 110 .
- Upon receipt, the environment checkpoint administration module 330 stores the checkpointing information, which may be aggregated with previously received checkpointing data from other computing nodes 102 of the compute environment 114 .
- the received checkpointing data may be stored in the environment checkpointing data 302 .
- the environment checkpointing data 302 may include checkpointing information related to each other working computing node 110 of the compute environment 114 registered with the orchestration node 104 , such as child process (thread) information, connection information, virtual memory contents, processor register state, processor flags, process tables, file descriptors, file handles, or other data structures relating to the current state of the application running on the working computing nodes 110 at the time that the checkpoint initialization signal was received at the working computing nodes 110 .
- in some embodiments, the functions of the environment checkpoint administration module 330 may be performed by one or more sub-modules, such as an environment checkpointing data management module 332 .
- the environment checkpoint administration module 330 may be further configured to administer a restore event based on the environment checkpointing data 302 .
- the environment checkpoint administration module 330 may be further configured to restore the execution state (e.g., a virtual memory state) of a distributed application by using the hardware checkpoint support 204 of the processor 202 to restore the execution state based on at least some of the environment checkpointing data 302 .
- the environment checkpoint administration module 330 may transmit a checkpoint restore signal to the working computing nodes 110 to indicate to each of the working computing nodes 110 to pause executing any presently executing applications and start execution of one or more applications based on the checkpoint restore signal and/or any additional environment checkpointing data. It should be appreciated that the presently executing applications paused as a result of the restore operation may or may not be the same applications.
- each working computing node 110 establishes an environment 400 during operation.
- the illustrative environment 400 includes an application 410 and a checkpoint management module 420 .
- Each of the modules, logic, and other components of the environment 400 may be embodied as hardware, software, firmware, or a combination thereof.
- each of the modules, logic, and other components of the environment 400 may form a portion of, or otherwise be established by, the processor 202 or other hardware components of the computing node 102 .
- one or more of the modules of the environment 400 may be embodied as a circuit or collection of electrical devices (e.g., a checkpoint management circuit, etc.).
- the checkpoint management module 420 may be embodied as one or more components of a virtualization framework of the working computing node 110 such as a hypervisor or virtual machine monitor (VMM).
- the computing node 102 may include other components, sub-components, modules, sub-modules, and devices commonly found in a computing device, which are not illustrated in FIG. 4 for clarity of the description.
- the application 410 may be embodied as any program, process, thread, task, or other executable component of the working computing node 110 .
- the application 410 may be embodied as a process, a thread, a native code application, a managed code application, a virtualized application, a virtual machine, or any other similar application.
- the application 410 may be compiled to target the processor 202 specifically; that is, the application 410 may include code to access the hardware checkpoint support 204 , such as via specialized processor instructions.
- the application 410 is initialized by a main thread, which maintains and modifies an execution state that may include, for example, a virtual memory state (i.e., virtual memory contents), processor register state, processor flags, process tables, file descriptors, file handles, or other data structures relating to the current state of the application 410 .
- the application 410 may be a distributed application, such that at least a portion of the application processing is performed on a first working computing node 110 (e.g., the computing node 106 ) and at least another portion of the application processing is performed on another working computing node 110 (e.g., the computing node 108 ). It should be further appreciated that, in some embodiments, the application 410 may be a multi-tiered application, commonly referred to as an n-tier application, such that the multi-tiered application consists of more than one application developed and distributed among more than one layer (e.g., operational layer).
- the checkpoint management module 420 is configured to detect and handle occurrences of checkpointing events (e.g., any hardware or software event that triggers a checkpointing operation) received from an orchestration node 104 communicatively coupled to the working computing node 110 . In response to detecting checkpointing events, the checkpoint management module 420 may call one or more hardware hooks (e.g., system calls, processor instructions, etc.) to cause the working computing node 110 to save a checkpoint.
- the checkpoint management module 420 is configured to lock the context of the working computing node 110 (i.e., pause further execution of presently executing applications 410 and block newly received data) and buffer the presently executing tasks using an I/O buffering mechanism to ensure that all I/O signals (e.g., memory, disk, network, etc.) are buffered until the working computing node 110 receives an indication that the computing node 102 may resume the paused applications 410 .
- the I/O buffering device 210 may buffer the I/O signals until the working computing node 110 provides an indication to the orchestration node 104 that the requested checkpointing event has completed (i.e., states saved and transmitted to the orchestration node 104 ) and receives an indication from the orchestration node 104 that the computing node 102 may resume operation, such as via the checkpoint complete signal.
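- The buffering behavior can be pictured as a hold queue that absorbs I/O events while a checkpoint is in flight and replays them once the checkpoint complete signal arrives. The sketch below is a conceptual software analogue, not a description of the I/O buffering device 210 hardware; the class and method names are assumptions.

```python
from collections import deque
from typing import Any, Callable, Deque


class IOBuffer:
    """Hold I/O events during a checkpoint, release them when it completes."""

    def __init__(self, deliver: Callable[[Any], None]) -> None:
        self._deliver = deliver
        self._held: Deque[Any] = deque()
        self._checkpointing = False

    def begin_checkpoint(self) -> None:
        """Start buffering: corresponds to locking the node's context."""
        self._checkpointing = True

    def submit(self, event: Any) -> None:
        """Route a memory, disk, or network event through the buffer."""
        if self._checkpointing:
            self._held.append(event)
        else:
            self._deliver(event)

    def end_checkpoint(self) -> None:
        """Release buffered events in order once the checkpoint complete signal arrives."""
        self._checkpointing = False
        while self._held:
            self._deliver(self._held.popleft())
```

In this picture, `begin_checkpoint` corresponds to receiving the checkpoint initialization signal and `end_checkpoint` to receiving the checkpoint complete signal.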
- the checkpoint management module 420 is configured to buffer and save memory pages and states behind the main thread, and flush memory pages and states ahead of the main thread.
- the checkpoint management module 420 is further configured to atomically save the execution state of each application 410 (i.e., the checkpointing information) that was being executed by the computing node 102 at the time the checkpointing event signal was received.
- the computing node 102 additionally includes checkpointing data 402 , which may be accessed by one or more of the various modules and/or sub-modules of the computing node 102 .
- the checkpointing information may be saved in the checkpointing data 402 .
- the checkpointing data 402 may include checkpointing related data, such as virtual memory contents, processor register state, processor flags, process tables, file descriptors, file handles, or other data structures relating to the current state of the application 410 at the time of the checkpointing request.
- the functions of the checkpoint management module 420 may be performed by one or more sub-modules, such as an application management module 422 to manage the pausing and restarting of each application 410 , an atomicity management module 424 to atomically save the checkpointing information, and a checkpointing data management module 426 to interface with the checkpointing data 402 to store the retrieved checkpointing information and retrieve the checkpointing information for transmission to the orchestration node 104 . Additionally, in some embodiments, the checkpoint management module 420 may be further configured to receive and store environment checkpointing data 302 that may be received from the orchestration node 104 .
- the computing node 102 can be configured to assume the role of orchestration node 104 in the event the computing node 102 presently configured to function as the orchestration node crashes or otherwise fails. Accordingly, the checkpointing data 402 and/or the environment checkpointing data 302 may be stored in the checkpoint cache 218 .
- the checkpoint management module 420 may be further configured to manage a restore event based on the environment checkpointing data 302 .
- the checkpoint management module 420 may be configured to receive a checkpoint restore signal from the orchestration node 104 (e.g., the environment checkpoint administration module 330 of the environment 300 (see FIG. 3 )) that indicates to the checkpoint management module 420 to restore the execution state of a distributed application.
- the checkpoint management module 420 may pause executing any presently executing applications and start execution of one or more applications based on the checkpoint restore signal and/or any additional environment checkpointing data.
- the checkpoint management module 420 may use the hardware checkpoint support 204 of the processor 202 to restore the execution state based on the checkpointing data 402 and/or the environment checkpointing data 302 . It should be appreciated that the presently executing applications paused as a result of the restore operation may or may not be the same applications.
- a working computing node 110 may execute a method 500 for initializing a distributed application.
- the method 500 begins at block 502 , in which the working computing node 110 receives a request to perform a task, such as processing a workload, via a distributed application.
- the working computing node 110 initializes the distributed application via an application management module (e.g., the application management module 422 of FIG. 4 ) to control the distributed application.
- the application management module 422 may be embodied as a hypervisor, main process, or master thread to execute and manage one or more of the applications 410 of FIG. 4 .
- the applications 410 may be embodied as any process, thread, managed code, or other task executed by the working computing node 110 .
- the applications 410 may be embodied as virtualized applications, for example as applications or operating systems executed by a hypervisor and performed in a virtual machine created by the distributed application coordinator. During execution, the applications 410 may perform calculations, update regions of the memory 212 , or perform any other operations typical of a computer application.
- the application management module 422 of the working node 110 spawns objects based on the task to be performed via the distributed application. For example, in some embodiments, one or more child processes may be spawned based on requirements of the distributed application at block 508 . In such embodiments, each spawned child process may be run via a virtual machine instance or directly by the working computing node 110 . Accordingly, in such embodiments, one or more virtual machines may need to be spawned based on the child processes spawned. Additionally or alternatively, in some embodiments, one or more connections may be spawned based on the requirements of the distributed application at block 510 .
- the working computing node 110 registers the one or more spawned objects with the orchestration node 104 .
- the working computing node 110 may register each of the spawned child processes with the orchestration node 104 .
- the spawned child processes may be running on either a virtual machine or directly by the working computing node 110 . Accordingly, the virtual machine instances may also be registered with the orchestration node 104 .
- the working computing node 110 may register each of the spawned connections between applications with the orchestration node 104 , such as for multi-tiered application embodiments.
- the working computing node 110 receives identifiers from the orchestration node 104 for each of the objects registered with the orchestration node 104 .
- in some embodiments, the working computing node 110 receives a child process identifier for each child process registered with the orchestration node 104 .
- additionally or alternatively, the working computing node 110 may receive a connection identifier for each of the connections registered with the orchestration node 104 .
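- The initialization flow of method 500 might be condensed as follows; the `register` callable stands in for the working computing node's registration exchange with the orchestration node 104 , and all object names are hypothetical.

```python
from typing import Callable, Dict, List

RegisterFn = Callable[[str, List[str], List[str]], Dict[str, str]]


def initialize_distributed_app(register: RegisterFn, node_id: str, n_children: int) -> Dict[str, str]:
    """Condensed control flow of method 500 (FIG. 5)."""
    # Spawn child processes (block 508) and connections (block 510) required
    # by the distributed application; reduced here to generating names.
    children: List[str] = [f"child-{i}" for i in range(n_children)]
    connections: List[str] = [f"conn-{i}" for i in range(n_children)]

    # Register the spawned objects with the orchestration node and receive an
    # identifier for each registered child process and connection.
    return register(node_id, children, connections)


# Usage with a stub registration exchange:
ids = initialize_distributed_app(
    lambda node, procs, conns: {o: f"{node}/{o}" for o in procs + conns},
    node_id="node-1", n_children=2)
assert ids["child-0"] == "node-1/child-0"
```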
- the orchestration node 104 may execute a method 600 for performing an environment checkpointing event.
- the method 600 begins at block 602 , in which the orchestration node 104 determines whether an environment checkpoint initialization signal was received.
- An environment checkpoint initialization signal may include time sync information and may be embodied as any hardware or software event that triggers a checkpointing operation.
- the orchestration node 104 may use any technique to monitor for environment checkpoint initialization signals, including polling for events, handling interrupts, registering callback functions or event listeners, or other techniques.
- the environment checkpoint initialization signal may be embodied as a hardware event such as an interrupt, a memory access, or an I/O operation; as a software event such as a modification of a data structure in memory; as a user-generated event such as an application programming interface (API) call, or as any other event.
- the method 600 advances to block 604 , wherein the orchestration node 104 transmits (e.g., via the backplane management controller 112 of FIG. 1 ) a checkpoint initialization signal to one or more of the working computing nodes 110 (e.g., illustratively, the computing node 106 or the computing node 108 ) registered with the orchestration node 104 .
- the orchestration node 104 may only transmit the checkpoint initialization signal to a subset of the registered working computing nodes 110 at a time.
- the checkpointing event may be directed to a particular application, which may be distributed across multiple virtual machines on a single working computing node 110 or across a number of working computing nodes 110 . Accordingly, a checkpoint initialization signal may only be sent to those computing node(s) that are processing the distributed application.
- the orchestration node 104 determines whether checkpointing data has been received from one of the working computing nodes 110 (e.g., illustratively, the computing node 106 or the computing node 108 ). As described previously, the checkpointing data may include saved execution states or other, similar data. If checkpointing data has not been received, the method 600 loops back to block 606 to continue to monitor for checkpointing data. If checkpointing data has been received, the method 600 advances to block 608 , wherein the orchestration node 104 stores the received checkpointing data. In some embodiments, the checkpointing data may be stored in the data storage device 214 or the checkpoint cache 218 of FIG. 2 .
- the orchestration node 104 determines whether all of the working computing nodes 110 to which the checkpoint initialization signals were transmitted are required to have completed checkpointing (i.e., to have returned their checkpointing data to the orchestration node 104 ) before proceeding. If so, the method 600 branches to block 616 , which is described in further detail below; otherwise, the method 600 advances to block 612 , wherein the orchestration node 104 transmits a checkpoint complete signal to the working computing node 110 from which the checkpointing data was received.
- the method 600 may advance to block 614 , wherein the orchestration node 104 transmits the environment checkpointing data (i.e., the checkpointing data for all of the working computing nodes 110 ) to registered working computing nodes 110 such that in the event the orchestration node 104 crashes, another working computing node 110 may assume the role of the orchestration node 104 , thus avoiding a single point of failure.
- the environment checkpointing data may include a hash table of the state of various connections and processes associated with each distributed application for each working computing node 110 of the compute environment 114 .
- the orchestration node 104 determines whether the checkpointing data has been received from all of the working computing nodes 110 to which the checkpoint initialization signals were transmitted. If not, the method 600 loops back to block 602 to continue monitoring for checkpointing data received at the orchestration node 104 ; otherwise, the method 600 advances to block 618 .
- the orchestration node 104 transmits a checkpoint complete signal to all of the working computing nodes 110 to which the checkpoint initialization signals were transmitted. In some embodiments, the method advances from block 618 to block 614 to transmit the environment checkpointing data to all the registered working computing nodes 110 .
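- Method 600 can be sketched as the loop below. The callables stand in for transmissions over the backplane management controller 112 , and `recv_data` is assumed to block until one working computing node returns its checkpointing data; the function and parameter names are illustrative, not taken from the specification.

```python
from typing import Any, Callable, Dict, Iterable, Tuple


def run_environment_checkpoint(nodes: Iterable[str],
                               send_init: Callable[[str], None],
                               recv_data: Callable[[], Tuple[str, Any]],
                               send_complete: Callable[[str], None],
                               send_env_data: Callable[[str, Dict[str, Any]], None],
                               wait_for_all: bool = True) -> Dict[str, Any]:
    """Condensed control flow of method 600 (FIG. 6)."""
    pending = set(nodes)
    env_data: Dict[str, Any] = {}

    for node in pending:                      # block 604: transmit checkpoint init signal
        send_init(node)

    while pending:                            # blocks 606-610: collect checkpointing data
        node_id, data = recv_data()
        env_data[node_id] = data              # block 608: store the received data
        pending.discard(node_id)
        if not wait_for_all:
            send_complete(node_id)            # block 612: per-node checkpoint complete signal

    if wait_for_all:
        for node_id in env_data:              # block 618: complete signal to all nodes
            send_complete(node_id)

    for node_id in env_data:                  # block 614: distribute environment checkpointing
        send_env_data(node_id, env_data)      # data so another node could assume the role
    return env_data
```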
- each working computing node 110 may execute a method 700 for performing a checkpointing event.
- the method 700 begins at block 702 , in which the working computing node 110 determines whether a checkpoint initialization signal was received.
- the checkpoint initialization signal may include time sync information and may be embodied as any hardware or software event that triggers a checkpointing operation.
- the working computing node 110 may use any technique to monitor for checkpoint initialization signals, including polling for events, handling interrupts, registering callback functions or event listeners, or other techniques. For example, during initialization of the working computing node 110 , the working computing node 110 may perform any initialization routines or other processes required to activate the hardware checkpoint support 204 , as well as any required software initialization routines. In such embodiments, the working computing node 110 may initialize interrupt vectors, timers, or other system hooks used to invoke the hardware checkpoint support 204 .
- the method 700 advances to block 704 , wherein the working computing node 110 pauses executing applications that are being executed at the time the checkpoint initialization signal was received.
- to pause the executing applications, the working computing node 110 may pause the main thread of each application (e.g., the application 410 ), which is the thread responsible for performing the task associated with the application.
- the working computing node 110 locks the context of the computing node. In other words, any new data received at the working computing node 110 is blocked.
- the working computing node 110 buffers the paused applications. To do so, the computing node 102 buffers and saves any memory pages, or states, that are lagging behind the main thread and flushes any memory pages, or states, that are ahead of the main thread.
- the working computing node 110 saves the checkpointing data, which may include saved execution states of the applications paused at block 704 and buffered at block 708 , as well as any other data related to the state of the paused applications (e.g., the stack, the heap, the allocated pages, the process table, other parts of the memory 212 , the processor 202 flags, states, or other processor 202 information, etc.).
- in some embodiments, software (e.g., the operating system) executing on the working computing node 110 may execute a system hook (i.e., a hardware hook) to save the execution state of the applications that were executing at the time the checkpoint initialization signal was received.
- the working computing node 110 may save the execution states of the applications using the hardware checkpoint support 204 of FIG. 2 .
- the hardware hook may be embodied as any technique usable to invoke the hardware checkpoint support 204 of the processor 202 . It should be appreciated that different software executing on the same working computing node 110 may execute different system hooks.
- the checkpointing data may be stored in a persistent storage device, such as the checkpoint cache 218 of FIG. 2 .
- the working computing node 110 transmits the saved checkpointing data to the orchestration node 104 .
- the working computing node 110 may transmit the saved checkpointing data via the backplane management controller 112 of FIG. 1 .
- the working computing node 110 determines whether a checkpoint complete signal was received from the orchestration node 104 in response to having transmitted the saved checkpointing data.
- the orchestration node 104 may be configured to wait until all of the working computing nodes 110 that received the checkpoint initialization signal have responded with their respective checkpointing data before the orchestration node 104 transmits the checkpoint complete signal to the applicable working computing nodes 110 .
- if the checkpoint complete signal has not been received, the method 700 loops back to block 716 to continue to monitor for it. If the checkpoint complete signal has been received at block 716 , the method 700 advances to block 718 , wherein the context locked at block 706 is unlocked (i.e., new data is again accepted by the working computing node 110 ).
- each working computing node 110 in the compute environment 114 may store a copy of the environment checkpointing data, which may include application state information for all of the working computing nodes 110 of the compute environment 114 , such that any working computing node 110 may be capable of being configured as the orchestration node 104 .
- any other working computing node 110 of the environment can assume the role of the orchestration node 104 .
- the working computing node 110 determines whether environment checkpointing data was received. If not, the method 700 loops back to block 722 to continue to monitor for the environment checkpointing data; otherwise, the method 700 advances to block 724 .
- the working computing node 110 stores the environment checkpointing data.
- the working computing node 110 stores the environment checkpointing data to a persistent storage device (e.g., the checkpoint cache 218 of FIG. 2 ).
- the environment checkpointing data may include a hash table of the state of various connections and processes associated with each distributed application for each working computing node 110 of the compute environment 114 to which the working computing node 110 is connected.
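- On the working computing node side, method 700 reduces to the sequence below, with each step expressed as a caller-supplied callable; these callables are placeholders for the behaviors the specification attributes to the checkpoint management module 420 , the I/O buffering device 210 , and the hardware checkpoint support 204 .

```python
from typing import Any, Callable, Dict


def handle_checkpoint_event(pause_apps: Callable[[], None],
                            lock_context: Callable[[], None],
                            save_states: Callable[[], Dict[str, Any]],
                            transmit: Callable[[Dict[str, Any]], None],
                            wait_for_complete: Callable[[], None],
                            unlock_context: Callable[[], None],
                            resume_apps: Callable[[], None],
                            store_env_data: Callable[[], None]) -> None:
    """Condensed control flow of method 700 (FIG. 7)."""
    pause_apps()            # block 704: pause presently executing applications
    lock_context()          # block 706: block newly received data
    states = save_states()  # block 708 onward: buffer the paused applications, then
                            # atomically save their execution states (checkpointing data)
    transmit(states)        # transmit the saved checkpointing data to the orchestration node
    wait_for_complete()     # block 716: wait for the checkpoint complete signal
    unlock_context()        # block 718: new data is again accepted
    resume_apps()           # resume the previously paused applications
    store_env_data()        # blocks 722-724: store any environment checkpointing data received
```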
- each working computing node 110 may execute a method 800 for performing an environment restore event.
- the method 800 begins at block 802 , in which the computing node 102 determines whether a checkpoint restore signal was received.
- the checkpoint restore signal may include data indicative of the environment checkpointing data to be referenced (i.e., used) for the restore. If the checkpoint restore signal was not received, the method 800 loops back to block 802 to continue to monitor for the checkpoint restore signal. If the checkpoint restore signal was received, the method 800 advances to block 804 , wherein the computing node 102 pauses executing applications that are being executed at the time the checkpoint restore signal was received.
- the working computing node 110 executes a system hook to load the saved execution state of one or more requested applications (e.g., the application 410 of FIG. 4 ) into memory (e.g., the memory 212 of FIG. 2 ). Similar to saving the execution state, the system hook for loading the execution state may be embodied as any technique usable to invoke the hardware checkpoint support 204 of the processor 202 .
- the working computing node 110 loads the execution state of the requested applications to be restored into the memory 212 using the hardware checkpoint support 204 . After loading the execution state of the requested applications to be restored into the memory 212 , at block 810 the working computing node 110 resumes execution of the restored applications based on the saved execution state. After resuming the restored applications, the method 800 loops back to block 802 to continue monitoring for the presence of a checkpoint restore signal.
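- The restore flow of method 800 is symmetric; in the sketch below, `load_state` is a placeholder for the system hook that invokes the hardware checkpoint support 204 , and `env_checkpoint` stands in for the environment checkpointing data referenced by the checkpoint restore signal. All names are illustrative.

```python
from typing import Any, Callable, Dict


def handle_restore_event(pause_apps: Callable[[], None],
                         load_state: Callable[[Dict[str, Any]], None],
                         resume_apps: Callable[[], None],
                         env_checkpoint: Dict[str, Any]) -> None:
    """Condensed control flow of method 800 (FIG. 8)."""
    pause_apps()                # block 804: pause presently executing applications
    load_state(env_checkpoint)  # load the saved execution state into memory via the
                                # hardware checkpoint support
    resume_apps()               # block 810: resume execution from the restored state
```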
- An embodiment of the technologies disclosed herein may include any one or more, and any combination of, the examples described below.
- Example 1 includes a computing node for performing a checkpointing event, the computing node comprising a hardware event monitor to receive a checkpoint initialization signal from an orchestration node communicatively coupled to the computing node; a checkpoint management module to (i) pause one or more applications being presently executed on the computing node in response to having received the checkpoint initialization signal and (ii) buffer, by an input/output (I/O) buffering device, input/output (I/O) signals of the one or more paused applications; and a hardware checkpoint support to save checkpointing data to a memory storage device of the computing node, wherein the checkpointing data includes an execution state of each of the one or more applications, wherein the checkpoint management module is further to transmit the checkpointing data to the orchestration node.
- Example 2 includes the subject matter of Example 1, and wherein the checkpoint management module is further to lock context of the computing node to block any new data received by the computing node from being processed by the computing node in response to having received the checkpoint initialization signal.
- Example 3 includes the subject matter of any of Examples 1 and 2, and wherein the hardware event monitor is further to receive, by the hardware event monitor, a checkpoint complete signal from the orchestration node, and wherein the checkpoint management module is further to resume the one or more paused applications in response to having received the checkpoint complete signal.
- Example 4 includes the subject matter of any of Examples 1-3, and wherein to resume the one or more paused applications comprises to (i) unlock context of the computing node to allow any new data to be received by the computing node and (ii) release the input/output (I/O) signals of the one or more applications from the input/output (I/O) buffering device.
- Example 5 includes the subject matter of any of Examples 1-4, and wherein the checkpoint management module is further to register with the orchestration node, wherein to register includes to provide an indication that the checkpointing event is to be initiated by the orchestration node.
- Example 6 includes the subject matter of any of Examples 1-5, and wherein the checkpoint management module is further to (i) receive environment checkpointing data from the orchestration node, wherein the environment checkpointing data includes execution state data of other computing nodes communicatively coupled to the orchestration node, and (ii) store the environment checkpointing data in a local storage.
- Example 7 includes the subject matter of any of Examples 1-6, and wherein the checkpoint management module is further to receive a checkpoint restore signal from the orchestration node, wherein the hardware checkpoint support is further to load a saved execution state of at least one of the one or more applications into a memory of the computing node, and wherein the checkpoint management module is further to resume execution of the at least one of the one or more applications from the saved execution state loaded into the memory.
- Example 8 includes the subject matter of any of Examples 1-7, and wherein to load the saved execution state comprises to load a saved execution state based at least in part on the environment checkpointing data.
- Example 9 includes the subject matter of any of Examples 1-8, and wherein the checkpoint management module is further to execute a distributed application using a main thread initiated by the computing node, wherein to save the checkpointing data comprises to save an execution state of the distributed application, and wherein the execution state is indicative of a virtual memory state of the distributed application.
- Example 10 includes the subject matter of any of Examples 1-9, and wherein the checkpoint management module is further to (i) save memory pages that correspond to a first application of the one or more applications in a memory of the computing node in response to a determination that the first application is lagging behind the main thread and (ii) flush memory pages that correspond to a second application of the one or more applications in the memory in response to a determination that the second application is working ahead of the main thread.
- Example 11 includes the subject matter of any of Examples 1-10, and wherein to buffer the input/output (I/O) signals of the one or more paused applications comprises to buffer memory access events.
- Example 12 includes the subject matter of any of Examples 1-11, and wherein to buffer the input/output (I/O) signals of the one or more paused applications comprises to buffer disk access events.
- Example 13 includes the subject matter of any of Examples 1-12, and wherein to buffer the input/output (I/O) signals of the one or more paused applications comprises to buffer network access events.
- Example 14 includes an orchestration node for administering an environment checkpointing event, the orchestration node comprising an environment checkpoint administration module to (i) transmit a checkpoint initialization signal to each of a plurality of working computing nodes communicatively coupled to the orchestration node in response to an environment checkpoint initialization signal indicative of a checkpoint event, (ii) receive checkpointing data from each working computing node in response to the checkpoint initialization signal, wherein the checkpoint data includes an execution state of at least one application of a corresponding working computing node, (iii) store the received checkpointing data, and (iv) transmit a checkpoint complete signal to each of the plurality of working computing nodes.
- Example 15 includes the subject matter of Example 14, and wherein to transmit the checkpoint complete signal to the plurality of working computing nodes comprises to transmit the checkpoint complete signal to the plurality of working computing nodes in response to a determination that the checkpointing data has been received from each of the plurality of working computing nodes.
- Example 16 includes the subject matter of any of Examples 14 and 15, and wherein the environment checkpoint administration module is further to transmit the received checkpointing data from each of the plurality of working computing nodes to each of the plurality of working computing nodes communicatively coupled to and registered with the orchestration node.
- Example 17 includes a method for performing a checkpointing event, the method comprising receiving, by a hardware event monitor of a computing node, a checkpoint initialization signal from an orchestration node communicatively coupled to the computing node; pausing, by a processor of the computing node, one or more applications presently executing on the computing node in response to receiving the checkpoint initialization signal; buffering, by an input/output (I/O) buffering device of the computing node, input/output (I/O) signals of the one or more paused applications; saving, by a hardware checkpoint support of the computing node, checkpointing data to a memory storage device of the computing node, wherein the checkpointing data includes an execution state of each of the one or more applications; and transmitting, by the computing node, the checkpointing data to the orchestration node.
- Example 18 includes the subject matter of Example 17, and further including locking, by the computing node, context of the computing node to block any new data received by the computing node from being processed by the computing node in response to receiving the checkpoint initialization signal.
- Example 19 includes the subject matter of any of Examples 17 and 18, and further including receiving, by the hardware event monitor of a computing node, a checkpoint complete signal from the orchestration node; and resuming the one or more paused applications in response to receiving the checkpoint complete signal.
- Example 20 includes the subject matter of any of Examples 17-19, and wherein resuming the one or more paused applications comprises (i) unlocking context of the computing node to allow any new data to be received by the computing node and (ii) releasing the input/output (I/O) signals of the one or more applications from the input/output (I/O) buffering device of the computing node.
- Example 21 includes the subject matter of any of Examples 17-20, and further including registering, by the computing node, with the orchestration node to provide an indication that the checkpointing event is to be initiated by the orchestration node.
- Example 22 includes the subject matter of any of Examples 17-21, and further including receiving, by the computing node, environment checkpointing data from the orchestration node, wherein the environment checkpointing data includes execution state data of other computing nodes communicatively coupled to the orchestration node; and storing, by the computing node, the environment checkpointing data in a local storage.
- Example 23 includes the subject matter of any of Examples 17-22, and further including receiving, by the computing node, a checkpoint restore signal from the orchestration node; loading, by the hardware checkpoint support, a saved execution state of at least one of the one or more applications into a memory of the computing node; and resuming, by the computing node, execution of the at least one of the one or more applications from the saved execution state loaded into the memory.
- Example 24 includes the subject matter of any of Examples 17-23, and wherein loading the saved execution state comprises loading a saved execution state based at least in part on the environment checkpointing data.
- Example 25 includes the subject matter of any of Examples 17-24, and further including executing, by the computing node, a distributed application using a main thread initiated by the computing node; wherein saving the checkpointing data comprises saving an execution state of the distributed application, and wherein the execution state is indicative of a virtual memory state of the distributed application.
- Example 26 includes the subject matter of any of Examples 17-25, and further including saving memory pages, stored in a memory of the computing node, corresponding to a first application of the one or more applications in response to a determination that the first application is lagging behind the main thread; and flushing memory pages, stored in the memory, corresponding to a second application of the one or more applications in response to a determination that the second application is working ahead of the main thread.
- Example 27 includes the subject matter of any of Examples 17-26, and wherein buffering the input/output (I/O) signals of the one or more paused applications comprises buffering memory access events.
- Example 28 includes the subject matter of any of Examples 17-27, and wherein buffering the input/output (I/O) signals of the one or more paused applications comprises buffering disk access events.
- Example 29 includes the subject matter of any of Examples 17-28, and wherein buffering the input/output (I/O) signals of the one or more paused applications comprises buffering network access events.
- Example 30 includes a computing node comprising a processor; and a memory having stored therein a plurality of instructions that when executed by the processor cause the computing node to perform the method of any of Examples 17-29.
- Example 31 includes one or more machine readable storage media comprising a plurality of instructions stored thereon that in response to being executed result in a computing node performing the method of any of Examples 17-29.
- Example 32 includes a method for administering an environment checkpointing event, the method comprising transmitting, by an orchestration node, a checkpoint initialization signal to each of a plurality of working computing nodes communicatively coupled to the orchestration node in response to an environment checkpoint initialization signal indicative of a checkpoint event; receiving, by the orchestration node, checkpointing data from each working computing node in response to the checkpoint initialization signal, wherein the checkpoint data includes an execution state of at least one application of a corresponding working computing node; storing, by a memory storage device of the orchestration node, the received checkpointing data; and transmitting, by the orchestration node, a checkpoint complete signal to each of the plurality of working computing nodes.
- Example 33 includes the subject matter of Example 32, and wherein transmitting the checkpoint complete signal to the plurality of working computing nodes comprises transmitting the checkpoint complete signal to the plurality of working computing nodes in response to a determination that the checkpointing data has been received from each of the plurality of working computing nodes.
- Example 34 includes the subject matter of any of Examples 32 and 33, and further including transmitting, by the orchestration node, the received checkpointing data from each of the plurality of working computing nodes to each of the plurality of working computing nodes communicatively coupled to and registered with the orchestration node.
- Example 35 includes a computing node comprising a processor; and a memory having stored therein a plurality of instructions that when executed by the processor cause the computing node to perform the method of any of Examples 32-34.
- Example 36 includes one or more machine readable storage media comprising a plurality of instructions stored thereon that in response to being executed result in a computing node performing the method of any of Examples 32-34.
- Example 37 includes a computing node for performing a checkpointing event, the computing node comprising means for receiving a checkpoint initialization signal from an orchestration node communicatively coupled to the computing node; means for pausing one or more applications presently executing on the computing node in response to receiving the checkpoint initialization signal; means for buffering input/output (I/O) signals of the one or more paused applications; means for saving checkpointing data to a memory storage device of the computing node, wherein the checkpointing data includes an execution state of each of the one or more applications; and means for transmitting the checkpointing data to the orchestration node.
- Example 38 includes the subject matter of Example 37, and further including means for locking context of the computing node to block any new data received by the computing node from being processed by the computing node in response to receiving the checkpoint initialization signal.
- Example 39 includes the subject matter of any of Examples 37 and 38, and further including means for receiving a checkpoint complete signal from the orchestration node; and means for resuming the one or more paused applications in response to receiving the checkpoint complete signal.
- Example 40 includes the subject matter of any of Examples 37-39, and wherein the means for resuming the one or more paused applications comprises means for (i) unlocking context of the computing node to allow any new data to be received by the computing node and (ii) releasing the input/output (I/O) signals of the one or more applications.
- Example 41 includes the subject matter of any of Examples 37-40, and further including means for registering with the orchestration node to provide an indication that the checkpointing event is to be initiated by the orchestration node.
- Example 42 includes the subject matter of any of Examples 37-41, and further including means for receiving environment checkpointing data from the orchestration node, wherein the environment checkpointing data includes execution state data of other computing nodes communicatively coupled to the orchestration node; and means for storing the environment checkpointing data in a local storage.
- Example 43 includes the subject matter of any of Examples 37-42, and further including means for receiving a checkpoint restore signal from the orchestration node; means for loading a saved execution state of at least one of the one or more applications into a memory of the computing node; and means for resuming execution of the at least one of the one or more applications from the saved execution state loaded into the memory.
- Example 44 includes the subject matter of any of Examples 37-43, and wherein the means for loading the saved execution state comprises means for loading a saved execution state based at least in part on the environment checkpointing data.
- Example 45 includes the subject matter of any of Examples 37-44, and further including means for executing a distributed application using a main thread initiated by the computing node; wherein the means for saving the checkpointing data comprises means for saving an execution state of the distributed application, and wherein the execution state is indicative of a virtual memory state of the distributed application.
- Example 46 includes the subject matter of any of Examples 37-45, and further including means for saving memory pages, stored in a memory of the computing node, corresponding to a first application of the one or more applications in response to a determination that the first application is lagging behind the main thread; and means for flushing memory pages, stored in the memory, corresponding to a second application of the one or more applications in response to a determination that the second application is working ahead of the main thread.
- Example 47 includes the subject matter of any of Examples 37-46, and wherein the means for buffering the input/output (I/O) signals of the one or more paused applications comprises means for buffering memory access events.
- Example 48 includes the subject matter of any of Examples 37-47, and wherein the means for buffering the input/output (I/O) signals of the one or more paused applications comprises means for buffering disk access events.
- Example 49 includes the subject matter of any of Examples 37-48, and wherein the means for buffering the input/output (I/O) signals of the one or more paused applications comprises means for buffering network access events.
- Example 50 includes an orchestration node for administering an environment checkpointing event, the orchestration node comprising means for transmitting a checkpoint initialization signal to each of a plurality of working computing nodes communicatively coupled to the orchestration node in response to an environment checkpoint initialization signal indicative of a checkpoint event; means for receiving checkpointing data from each working computing node in response to the checkpoint initialization signal, wherein the checkpoint data includes an execution state of at least one application of a corresponding working computing node; means for storing the received checkpointing data; and means for transmitting a checkpoint complete signal to each of the plurality of working computing nodes.
- Example 51 includes the subject matter of Example 50, and wherein the means for transmitting the checkpoint complete signal to the plurality of working computing nodes comprises means for transmitting the checkpoint complete signal to the plurality of working computing nodes in response to a determination that the checkpointing data has been received from each of the plurality of working computing nodes.
- Example 52 includes the subject matter of any of Examples 50 and 51, and further including means for transmitting the received checkpointing data from each of the plurality of working computing nodes to each of the plurality of working computing nodes communicatively coupled to and registered with the orchestration node.
Abstract
Technologies for environment checkpointing include an orchestration node communicatively coupled to one or more working computing nodes. The orchestration node is configured to administer an environment checkpointing event by transmitting a checkpoint initialization signal to each of the one or more working computing nodes that have been registered with the orchestration node. Each working computing node is configured to pause and buffer any presently executing applications, save checkpointing data (an execution state of each of the one or more applications) and transmit the checkpointing data to the orchestration node. Other embodiments are described and claimed.
Description
- The present application is a continuation application of U.S. application Ser. No. 14/748,650, entitled “TECHNOLOGIES FOR DATA CENTER ENVIRONMENT CHECKPOINTING,” which was filed on Jun. 24, 2015.
- Many large-scale computing environments such as high-performance computing (HPC) and cloud computing environments may incorporate distributed or multi-tier applications and workloads. In other words, more than one instance of a workload may be executing at the same time across multiple applications and/or computing devices (e.g., servers). Crashes or other errors occurring in the course of processing such distributed workloads may cause the loss of application state and thus may require large amounts of computational work to be repeated. Accordingly, crashes in large-scale computing environments may be quite costly and time-consuming.
- Some HPC and cloud computing environments support software-based application checkpointing. Typical application checkpointing solutions are purely software-based and allow the computing environment to store periodic snapshots (i.e., checkpoints) of the state of a running application, a virtual machine, or a workload in a non-distributed or single-tier computing environment. Based on the saved checkpoints, a suspended or interrupted application may be resumed or replayed starting from the state of a saved checkpoint, which may allow for quicker or less-expensive crash recovery. However, software checkpointing support may require the checkpointing software to be re-engineered for each supported application and/or operating system. Further, such software-based checkpointing solutions (e.g., hypervisors, virtual machine monitors, etc.) are typically dependent on various factors of the single-tier or non-distributed environment, such as the vendor, the operating system, the type of virtual machine, the application, etc.
- The concepts described herein are illustrated by way of example and not by way of limitation in the accompanying figures. For simplicity and clarity of illustration, elements illustrated in the figures are not necessarily drawn to scale. Where considered appropriate, reference labels have been repeated among the figures to indicate corresponding or analogous elements.
- FIG. 1 is a simplified block diagram of at least one embodiment of a system for supporting data center environment checkpointing that includes an orchestration node and working computing nodes;
- FIG. 2 is a simplified block diagram of at least one embodiment of a computing node of the system of FIG. 1;
- FIG. 3 is a simplified block diagram of at least one embodiment of an environment that may be established by the orchestration node of FIG. 1;
- FIG. 4 is a simplified block diagram of at least one embodiment of an environment that may be established by at least one of the additional computing nodes of FIG. 2;
- FIG. 5 is a simplified flow diagram of at least one embodiment of a method for initializing a distributed application that may be executed by one or more of the working computing nodes of FIG. 4;
- FIG. 6 is a simplified flow diagram of at least one embodiment of a method for administering an environment checkpointing event that may be executed by the orchestration node of FIG. 3;
- FIG. 7 is a simplified flow diagram of at least one embodiment of a method for performing a checkpointing event that may be executed by one or more of the working computing nodes of FIG. 4; and
- FIG. 8 is a simplified flow diagram of at least one embodiment of a method for performing an environment restore event that may be executed by one or more of the working computing nodes of FIG. 4.
- While the concepts of the present disclosure are susceptible to various modifications and alternative forms, specific embodiments thereof have been shown by way of example in the drawings and will be described herein in detail. It should be understood, however, that there is no intent to limit the concepts of the present disclosure to the particular forms disclosed, but on the contrary, the intention is to cover all modifications, equivalents, and alternatives consistent with the present disclosure and the appended claims.
- References in the specification to "one embodiment," "an embodiment," "an illustrative embodiment," etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may or may not necessarily include that particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to effect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described. Additionally, it should be appreciated that items included in a list in the form of "at least one of A, B, and C" can mean (A); (B); (C); (A and B); (A and C); (B and C); or (A, B, and C). Similarly, items listed in the form of "at least one of A, B, or C" can mean (A); (B); (C); (A and B); (A and C); (B and C); or (A, B, and C).
- The disclosed embodiments may be implemented, in some cases, in hardware, firmware, software, or any combination thereof. The disclosed embodiments may also be implemented as instructions carried by or stored on one or more transitory or non-transitory machine-readable (e.g., computer-readable) storage media, which may be read and executed by one or more processors. A machine-readable storage medium may be embodied as any storage device, mechanism, or other physical structure for storing or transmitting information in a form readable by a machine (e.g., a volatile or non-volatile memory, a media disc, or other media device).
- In the drawings, some structural or method features may be shown in specific arrangements and/or orderings. However, it should be appreciated that such specific arrangements and/or orderings may not be required. Rather, in some embodiments, such features may be arranged in a different manner and/or order than shown in the illustrative figures. Additionally, the inclusion of a structural or method feature in a particular figure is not meant to imply that such feature is required in all embodiments and, in some embodiments, may not be included or may be combined with other features.
- Referring now to FIG. 1, in an illustrative embodiment, a system 100 for data center environment checkpointing includes a plurality of computing nodes 102 communicatively coupled via a backplane management controller 112 in a compute environment 114. Each of the plurality of computing nodes 102 is capable of executing one or more applications, or services, and responding to checkpointing events. The illustrative computing nodes 102 include an orchestration node 104 for managing resources (e.g., central processing unit (CPU) resources, storage resources, network resources) and/or distributing workloads across working computing nodes 110 (e.g., illustratively, the computing nodes 106, 108), which are registered with the orchestration node 104. The illustrative working computing nodes 110 include a first computing node, which is designated as computing node (1) 106, and a second computing node, which is designated as computing node (N) 108 (i.e., the "Nth" computing node of the working computing nodes 110, wherein "N" is a positive integer and designates one or more additional computing nodes 110 that are registered with the orchestration node 104).
- Each of the plurality of computing nodes 102 is capable of executing one or more applications and includes hardware capable of supporting checkpointing (i.e., hardware-assisted checkpointing support). Hardware checkpointing support may allow for improved checkpointing performance, reliability, and scalability compared to software-only implementations. Additionally, because hardware checkpointing may be transparent to executing applications, checkpointing support may be provided for existing applications without requiring re-engineering (e.g., modifying code, recompiling code, etc.) of the underlying software.
- The orchestration node 104 is additionally configured to administer an environment checkpointing event. To do so, in use, the orchestration node 104 provides a checkpoint initialization signal, distributed via the backplane management controller 112, to the working computing nodes 110. Each of the working computing nodes 110 that receives the checkpoint initialization signal pauses the execution of local applications (i.e., workload processing processes, threads, virtual machines, etc.) presently running on the corresponding working computing node 110, atomically saves the states of the paused applications (i.e., application checkpointing data) using the hardware checkpoint support, and transmits the application checkpointing data back to the orchestration node 104. The orchestration node 104 then aggregates the application checkpointing data received from each of the working computing nodes 110 and, upon having received the application checkpointing data from all of the working computing nodes 110, provides a checkpoint complete signal to the working computing nodes 110 to indicate to the working computing nodes 110 that they may resume execution of the previously paused applications.
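- The round-trip described above can be summarized in software terms. The following minimal Python sketch models one checkpoint round under stated assumptions: the class and method names (WorkingNode, OrchestrationNode, on_checkpoint_init, and so on) are illustrative placeholders rather than interfaces defined by this disclosure, and the real signals would travel over the backplane management controller 112 rather than direct method calls.
```python
# In-process sketch of one environment checkpoint round (illustrative names only).
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class WorkingNode:
    node_id: str
    paused: bool = False
    state: Dict[str, Any] = field(default_factory=dict)

    def on_checkpoint_init(self) -> Dict[str, Any]:
        # Pause local applications and atomically capture their execution state.
        self.paused = True
        return {"node": self.node_id, "execution_state": dict(self.state)}

    def on_checkpoint_complete(self) -> None:
        # Resume the applications that were paused for the checkpoint.
        self.paused = False


@dataclass
class OrchestrationNode:
    nodes: List[WorkingNode]
    environment_checkpoint: Dict[str, Any] = field(default_factory=dict)

    def run_checkpoint_round(self) -> None:
        # 1) Signal every registered working node, 2) aggregate the returned
        # checkpointing data, 3) release the nodes once all data has arrived.
        for node in self.nodes:
            self.environment_checkpoint[node.node_id] = node.on_checkpoint_init()
        for node in self.nodes:
            node.on_checkpoint_complete()


if __name__ == "__main__":
    workers = [WorkingNode("node-1", state={"pc": 42}), WorkingNode("node-2")]
    orchestrator = OrchestrationNode(workers)
    orchestrator.run_checkpoint_round()
    print(orchestrator.environment_checkpoint)
```
In practice each step is backed by the hardware checkpoint support and I/O buffering described below; the sketch only captures the ordering of the signals.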
- While one of the computing nodes 102 is designated as the orchestration node 104, it should be appreciated that any of the working computing nodes 110 (i.e., illustratively, the computing node 106 or the computing node 108) of the compute environment 114 may be capable of performing as an orchestration node 104, such as in the event of a failure of the designated orchestration node 104. As such, any computing node 102 of the compute environment 114 may be designated as the "orchestration" node and is referred to as such in the following description.
- In some embodiments, the plurality of computing nodes 102 and the backplane management controller 112 (i.e., the compute environment 114) may be configured in a physical housing that facilitates the communication enabling connections between the computing nodes 102 and the backplane management controller 112. For example, the physical housing may be a rack in a rack-mounted configuration (i.e., the computing nodes 102 are rack-mounted servers), a blade server chassis in a blade server configuration (i.e., the computing nodes 102 are blade servers), or any other type of physical housing capable of facilitating the communication enabling connections between the computing nodes 102 and the backplane management controller 112. Accordingly, the compute environment 114 may additionally include various other components, such as power supplies, fans, etc., which are not illustrated herein for clarity of the description. It should be appreciated, however, that in some embodiments, the process and/or workload distribution may not be self-contained to just the computing nodes 102 on the rack or in the chassis, such as in a cross-rack orchestration or a cross-cloud orchestration, for example. In such embodiments, the compute environment 114 may encompass the various network devices and computing nodes 102 associated with the cross-rack orchestration or the cross-cloud orchestration.
- The backplane management controller 112 may be embodied as any type of circuitry and/or components capable of performing the functions described herein, such as an enclosure management controller (EMC), a baseboard management controller (BMC), a chassis management controller (CMC), or any type of backplane management controller capable of facilitating the backend connectivity and transmission of communications across the computing nodes 102, such as between the orchestration node 104 and the working computing nodes 110.
- The computing nodes 102 may be embodied as any type of computation or computer device capable of performing the functions described herein, including, without limitation, a computer, a multiprocessor system, a server, a rack-mounted server, a blade server, a smartphone, a tablet computer, a laptop computer, a notebook computer, a mobile computing device, a wearable computing device, a network appliance, a web appliance, a distributed computing system, a processor-based system, and/or a consumer electronic device. As shown in FIG. 2, one of the computing nodes 102 illustratively includes a processor 202, an input/output (I/O) subsystem 208, a memory 212, a data storage device 214, and communication circuitry 216. Of course, the computing node 102 may include other or additional components, such as those commonly found in a computer (e.g., various input/output devices), in other embodiments. Additionally, in some embodiments, one or more of the illustrative components may be incorporated in, or otherwise form a portion of, another component. For example, the memory 212, or portions thereof, may be incorporated in the processor 202 in some embodiments.
- The processor 202 may be embodied as any type of processor capable of performing the functions described herein. For example, the processor 202 may be embodied as a single or multi-core processor(s), digital signal processor, microcontroller, or other processor or processing/controlling circuit. The processor 202 illustratively includes hardware checkpoint support 204 and a hardware event monitor 206. The hardware checkpoint support 204 may be embodied as any hardware component, microcode, firmware, or other component of the processor 202 capable of saving the execution state (e.g., a virtual memory state) of a currently executing application. For example, the hardware checkpoint support 204 may be embodied as one or more dedicated processor instructions and associated memory management functions of the processor 202 that cause all or part of the virtual memory space of the current application to be saved to nonvolatile storage. In some embodiments, the hardware checkpoint support 204, or a portion thereof, may be embodied as firmware or software executable by the processor 202 or other component of the computing node 102. In such embodiments, the hardware checkpoint support 204 (or instructions thereof) may be stored in memory (e.g., the memory 212).
- The hardware event monitor 206 may be embodied as any hardware component, microcode, firmware, or other component of the processor 202 capable of notifying software executed by the processor 202 of system events occurring within the processor 202, such as memory access events, cache access events, and/or checkpointing events. For example, the hardware event monitor 206 may be embodied as one or more performance counters, performance monitoring units, cache monitoring units, or other hardware counters of the processor 202. In some embodiments, the computing node 102 may facilitate the orchestration of the checkpointing event through a main platform firmware, or pre-boot firmware, such as an extension of the Intel platform chipset or the platform Basic Input/Output System (BIOS) based on the Unified Extensible Firmware Interface ("UEFI") specification, which has several versions published by the Unified EFI Forum. In such embodiments, the BIOS may reside in the memory 212 and include instructions to initialize the computing node 102 during the boot process. In some embodiments, the hardware event monitor 206, or a portion thereof, may be embodied as firmware or software executable by the processor 202 or other component of the computing node 102. In such embodiments, the hardware event monitor 206 (or instructions thereof) may be stored in memory (e.g., the memory 212).
- The memory 212 may be embodied as any type of volatile or non-volatile memory or data storage capable of performing the functions described herein. In operation, the memory 212 may store various data and software used during operation of the computing node 102 such as operating systems, applications, programs, libraries, and drivers. The memory 212 is communicatively coupled to the processor 202 via the I/O subsystem 208, which may be embodied as circuitry and/or components to facilitate input/output operations with the processor 202, the memory 212, and other components of the computing node 102. For example, the I/O subsystem 208 may be embodied as, or otherwise include, memory controller hubs, input/output control hubs, firmware devices, communication links (i.e., point-to-point links, bus links, wires, cables, light guides, printed circuit board traces, etc.) and/or other components and subsystems to facilitate the input/output operations. The I/O subsystem 208 further includes an I/O buffering device 210. The I/O buffering device 210 may be embodied as any hardware component, microcode, firmware, or other component of the I/O subsystem 208 that is capable of buffering I/O signals during a checkpointing event and notifying software executed by the processor 202 of system I/O events occurring within the computing node 102, such as disk access events, memory access events, network access events, checkpointing events, or other system events. For example, the I/O buffering device 210 may be embodied as one or more bit identifiers, performance counters, performance monitoring units, or other hardware counters of the I/O subsystem 208. In some embodiments, the I/O subsystem 208 may form a portion of a system-on-a-chip (SoC) and be incorporated, along with the processor 202, the memory 212, and other components of the computing node 102, on a single integrated circuit chip.
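- As a rough software analogue of the I/O buffering behavior just described, the sketch below queues I/O events while a checkpoint is in progress and releases them afterward. The IOBuffer class and its methods are hypothetical illustrations; in the disclosed system the buffering is performed by the I/O buffering device 210 in hardware or firmware, not in application code.
```python
# Software analogue of buffering I/O (memory, disk, network) during a checkpoint.
from collections import deque
from typing import Callable, Deque, Tuple


class IOBuffer:
    def __init__(self) -> None:
        self._checkpoint_active = False
        self._pending: Deque[Tuple[Callable[..., None], tuple]] = deque()

    def begin_checkpoint(self) -> None:
        # Enter the checkpoint window: subsequent I/O is held back.
        self._checkpoint_active = True

    def submit(self, op: Callable[..., None], *args) -> None:
        # Buffer the operation during a checkpoint, otherwise perform it now.
        if self._checkpoint_active:
            self._pending.append((op, args))
        else:
            op(*args)

    def end_checkpoint(self) -> None:
        # Release buffered I/O in arrival order once the node may resume.
        self._checkpoint_active = False
        while self._pending:
            op, args = self._pending.popleft()
            op(*args)


if __name__ == "__main__":
    buf = IOBuffer()
    buf.begin_checkpoint()
    buf.submit(print, "disk write deferred until the checkpoint completes")
    buf.end_checkpoint()  # the buffered write is performed here
```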
- The data storage device 214 may be embodied as any type of device or devices configured for short-term or long-term storage of data such as, for example, memory devices and circuits, memory cards, hard disk drives, solid-state drives, or other data storage devices. In use, as described below, the data storage device 214 may store application checkpointing data such as saved execution states or other, similar data. The communication circuitry 216 of the computing node 102 may be embodied as any communication circuit, device, or collection thereof, capable of enabling communications between the computing node 102 and the orchestration node 104, and between the computing node 102 and remote devices over a network (not shown). The communication circuitry 216 may be configured to use any one or more communication technology (e.g., wired or wireless communications) and associated protocols (e.g., Ethernet, Bluetooth(R), Wi-Fi(R), WiMAX, etc.) to effect such communication.
- In some embodiments, the computing node 102 may also include a checkpoint cache 218. Similar to the data storage device 214, the checkpoint cache 218 may be embodied as any type of device or devices configured for short-term or long-term storage of data such as, for example, memory devices and circuits, memory cards, hard disk drives, solid-state drives, or other data storage devices. For example, in some embodiments, the checkpoint cache 218 may be embodied as a flash-based memory storage device for storing persistent information. The checkpoint cache 218 may store application checkpointing data such as saved execution states or other, similar data.
- In some embodiments, the computing node 102 may also include one or more peripheral devices 220. The peripheral devices 220 may include any number of additional input/output devices, interface devices, and/or other peripheral devices. For example, in some embodiments, the peripheral devices 220 may include a display, touch screen, graphics circuitry, keyboard, mouse, speaker system, and/or other input/output devices, interface devices, and/or peripheral devices.
- As described previously, each of the computing nodes 102 of FIG. 1 is capable of being configured as the orchestration node 104 or one of the working computing nodes 110 (i.e., illustratively, the computing nodes 106, 108). Accordingly, environments that may be established during operation of the computing nodes 102, as shown in FIGS. 3 and 4, are dependent on whether a particular computing node 102 is functioning as the orchestration node 104 or one of the working computing nodes 110. Referring now to FIG. 3, in an illustrative embodiment, the orchestration node 104 establishes an environment 300 during operation. The illustrative environment 300 includes a registration interface module 310, a distributed application coordination module 320, and an environment checkpoint administration module 330.
- Each of the modules, logic, and other components of the environment 300 may be embodied as hardware, software, firmware, or a combination thereof. For example, each of the modules, logic, and other components of the environment 300 may form a portion of, or otherwise be established by, the processor 202 or other hardware components of the orchestration node 104. As such, in some embodiments, one or more of the modules of the environment 300 may be embodied as a circuit or collection of electrical devices (e.g., a registration interface circuit, a distributed application coordination circuit, an environment checkpoint administration circuit, etc.). In some embodiments, the registration interface module 310, the distributed application coordination module 320, and/or the environment checkpoint administration module 330 may be embodied as one or more components of a virtualization framework of the orchestration node 104, such as a hypervisor or virtual machine monitor (VMM). It should be appreciated that the orchestration node 104 may include other components, sub-components, modules, sub-modules, and devices commonly found in a computing device, which are not illustrated in FIG. 3 for clarity of the description.
- In the illustrative environment 300, the orchestration node 104 additionally includes environment checkpointing data 302 and registration data 304, each of which may be accessed by one or more of the various modules and/or sub-modules of the orchestration node 104. In some embodiments, the environment checkpointing data 302 may include a hash table of the state of various connections and processes associated with each distributed application (i.e., master persistency). In some embodiments, the environment checkpointing data 302 may be transmitted by the orchestration node 104 to the working computing nodes 110 (i.e., environment persistency).
- The registration interface module 310 is configured to receive registration information from the working computing nodes 110 that includes identification information related to the working computing nodes 110, such as a computing node identifier, a process identifier, a virtual machine identifier, etc. The registration interface module 310 is further configured to register the working computing nodes 110 with the orchestration node 104 based on the received registration information. The registration information may include computing node identification data and application (see application 410 of FIG. 4) related data, such as thread or process identifiers (e.g., names, IDs, etc.). The registration interface module 310 may store the received registration information in the registration data 304.
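- One plausible way to organize the registration data 304 described above is sketched below; the record fields (process, virtual machine, and connection identifiers) follow the kinds of identification information listed in the preceding paragraph, but the exact schema and the class names are assumptions for illustration.
```python
# Illustrative layout for per-node registration records held by the orchestration side.
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Registration:
    node_id: str
    process_ids: List[str] = field(default_factory=list)
    vm_ids: List[str] = field(default_factory=list)
    connection_ids: List[str] = field(default_factory=list)


class RegistrationInterface:
    def __init__(self) -> None:
        self.registration_data: Dict[str, Registration] = {}

    def register(self, reg: Registration) -> None:
        # Record (or update) the identification information for a working node.
        self.registration_data[reg.node_id] = reg


if __name__ == "__main__":
    iface = RegistrationInterface()
    iface.register(Registration("node-1", process_ids=["worker-0"], vm_ids=["vm-3"]))
    print(iface.registration_data["node-1"])
```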
- The distributed application coordination module 320 is configured to coordinate (i.e., spawn, distribute, monitor, adjust, allocate resources, terminate, etc.) each application, hypervisor, or master thread of a distributed application to be performed at the working computing nodes 110. To do so, the distributed application coordination module 320 is configured to initiate applications, record dependencies, and generate child thread identifiers and/or connection identifiers based on objects registered by the working computing nodes 110 for the initiated applications, such as the registration information received and registered at the registration interface module 310. Additionally, the distributed application coordination module 320 may be further configured to coordinate (i.e., track) the various signals and events triggered by and/or received at the orchestration node 104.
- The environment checkpoint administration module 330 is configured to administer atomic checkpointing operations and track the checkpointing operations across the working computing nodes 110. As described previously, the computing nodes 102 may span a rack, a blade, a data center, or any number of cloud-provider environments. Accordingly, the environment checkpoint administration module 330 is configured to transmit a checkpoint initialization signal that includes time sync information to the working computing nodes 110 of the compute environment 114 that are presently registered with the orchestration node 104. The environment checkpoint administration module 330 is further configured to receive checkpointing information (e.g., application state information) from each other working computing node 110 registered with the orchestration node 104 upon completion of the checkpointing event at each other working computing node 110. Upon receipt, the environment checkpoint administration module 330 stores the checkpointing information, which may be aggregated with previously received checkpointing data from other computing nodes 102 of the compute environment 114. In some embodiments, the received checkpointing data may be stored in the environment checkpointing data 302. Accordingly, the environment checkpointing data 302 may include checkpointing information related to each other working computing node 110 of the compute environment 114 registered with the orchestration node 104, such as child process (thread) information, connection information, virtual memory contents, processor register state, processor flags, process tables, file descriptors, file handles, or other data structures relating to the current state of the application running on the working computing nodes 110 at the time that the checkpoint initialization signal was received at the working computing nodes 110. In some embodiments, those functions may be performed by one or more sub-modules, such as an environment checkpointing data management module 332.
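- The aggregation behavior described above might be modeled as follows: a table of checkpointing data keyed by working node, with the environment checkpoint considered complete once every registered node has reported. This is a simplified sketch under those assumptions, and the names are not taken from the disclosure.
```python
# Sketch of aggregating per-node checkpointing data into an environment checkpoint.
from typing import Any, Dict, Set


class EnvironmentCheckpoint:
    def __init__(self, registered_nodes: Set[str]) -> None:
        self.registered_nodes = set(registered_nodes)
        self.environment_checkpointing_data: Dict[str, Any] = {}

    def record(self, node_id: str, checkpointing_data: Any) -> None:
        # Store per-node state (child processes, connections, memory contents, ...).
        self.environment_checkpointing_data[node_id] = checkpointing_data

    def is_complete(self) -> bool:
        # Complete once every registered working node has reported its data.
        return self.registered_nodes <= set(self.environment_checkpointing_data)


if __name__ == "__main__":
    env = EnvironmentCheckpoint({"node-1", "node-2"})
    env.record("node-1", {"threads": 4})
    print(env.is_complete())   # False: node-2 has not reported yet
    env.record("node-2", {"threads": 2})
    print(env.is_complete())   # True
```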
- Additionally or alternatively, in some embodiments, the environment checkpoint administration module 330 may be further configured to administer a restore event based on the environment checkpointing data 302. In such embodiments, the environment checkpoint administration module 330 may be further configured to restore the execution state (e.g., a virtual memory state) of a distributed application by using the hardware checkpoint support 204 of the processor 202 to restore the execution state based on at least some of the environment checkpointing data 302. For example, the environment checkpoint administration module 330 may transmit a checkpoint restore signal to the working computing nodes 110 to indicate to each of the working computing nodes 110 to pause any presently executing applications and start execution of one or more applications based on the checkpoint restore signal and/or any additional environment checkpointing data. It should be appreciated that the presently executing applications paused as a result of the restore operation may or may not be the same applications.
- Referring now to FIG. 4, in an illustrative embodiment, each working computing node 110 (i.e., illustratively, the computing node 106 or the computing node 108) establishes an environment 400 during operation. The illustrative environment 400 includes an application 410 and a checkpoint management module 420. Each of the modules, logic, and other components of the environment 400 may be embodied as hardware, software, firmware, or a combination thereof. For example, each of the modules, logic, and other components of the environment 400 may form a portion of, or otherwise be established by, the processor 202 or other hardware components of the computing node 102. As such, in some embodiments, one or more of the modules of the environment 400 may be embodied as a circuit or collection of electrical devices (e.g., a checkpoint management circuit, etc.). In some embodiments, the checkpoint management module 420 may be embodied as one or more components of a virtualization framework of the working computing node 110, such as a hypervisor or virtual machine monitor (VMM). It should be appreciated that the computing node 102 may include other components, sub-components, modules, sub-modules, and devices commonly found in a computing device, which are not illustrated in FIG. 4 for clarity of the description.
- The application 410 may be embodied as any program, process, thread, task, or other executable component of the working computing node 110. For example, the application 410 may be embodied as a process, a thread, a native code application, a managed code application, a virtualized application, a virtual machine, or any other similar application. In some embodiments, the application 410 may be compiled to target the processor 202 specifically; that is, the application 410 may include code to access the hardware checkpoint support 204, such as via specialized processor instructions. During execution, the application 410 is initialized by a main thread, which maintains and modifies an execution state that may include, for example, a virtual memory state (i.e., virtual memory contents), processor register state, processor flags, process tables, file descriptors, file handles, or other data structures relating to the current state of the application 410. Although illustrated as a single application 410, it should be understood that the environment 400 may include one or more applications 410 executing contemporaneously. It should be appreciated that, in some embodiments, the application 410 may be a distributed application, such that at least a portion of the application processing is performed on a first working computing node 110 (e.g., the computing node 106) and at least another portion of the application processing is performed on another working computing node 110 (e.g., the computing node 108). It should be further appreciated that, in some embodiments, the application 410 may be a multi-tiered application, commonly referred to as an n-tier application, such that the multi-tiered application consists of more than one application developed and distributed among more than one layer (e.g., operational layer).
- The checkpoint management module 420 is configured to detect and handle occurrences of checkpointing events (e.g., any hardware or software event that triggers a checkpointing operation) received from an orchestration node 104 communicatively coupled to the working computing node 110. In response to detecting checkpointing events, the checkpoint management module 420 may call one or more hardware hooks (e.g., system calls, processor instructions, etc.) to cause the working computing node 110 to save a checkpoint. To do so, the checkpoint management module 420 is configured to lock the context of the working computing node 110 (i.e., pause further execution of presently executing applications 410 and block newly received data) and buffer the presently executing tasks using an I/O buffering mechanism to ensure that all I/O signals (e.g., memory, disk, network, etc.) are buffered until the working computing node 110 receives an indication that the computing node 102 may resume the paused applications 410. For example, the I/O buffering device 210 may buffer the I/O signals until the working computing node 110 provides an indication to the orchestration node 104 that the requested checkpointing event has completed (i.e., states saved and transmitted to the orchestration node 104) and receives an indication from the orchestration node 104 that the computing node 102 may resume operation, such as via the checkpoint complete signal. Additionally, the checkpoint management module 420 is configured to buffer and save memory pages and states behind the main thread, and flush memory pages and states ahead of the main thread.
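- The node-side sequence implied above (lock context, buffer I/O, save state atomically, report to the orchestration node, then release on the checkpoint complete signal) is sketched below. Every callable is a placeholder for a hardware hook or transport that the platform would supply; none of these names come from the disclosure.
```python
# Ordering of the node-side checkpoint steps, with placeholder callables.
from typing import Any, Callable, Dict, List


def handle_checkpoint_event(
    applications: List[Any],
    lock_context: Callable[[], None],
    buffer_io: Callable[[], None],
    save_state: Callable[[Any], Dict[str, Any]],
    send_to_orchestrator: Callable[[List[Dict[str, Any]]], None],
    wait_for_complete: Callable[[], None],
    release_io: Callable[[], None],
    unlock_context: Callable[[], None],
) -> None:
    lock_context()        # block newly received data from being processed
    buffer_io()           # memory/disk/network signals are held back
    checkpoint = [save_state(app) for app in applications]  # atomic per application
    send_to_orchestrator(checkpoint)
    wait_for_complete()   # checkpoint complete signal from the orchestration node
    release_io()
    unlock_context()


if __name__ == "__main__":
    handle_checkpoint_event(
        applications=["app-a"],
        lock_context=lambda: None,
        buffer_io=lambda: None,
        save_state=lambda app: {"app": app, "state": "..."},
        send_to_orchestrator=print,
        wait_for_complete=lambda: None,
        release_io=lambda: None,
        unlock_context=lambda: None,
    )
```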
- The checkpoint management module 420 is further configured to atomically save the execution state of each application 410 (i.e., the checkpointing information) that was being executed by the computing node 102 at the time the checkpointing event signal was received. In the illustrative environment 400, the computing node 102 additionally includes checkpointing data 402, which may be accessed by one or more of the various modules and/or sub-modules of the computing node 102. In some embodiments, the checkpointing information may be saved in the checkpointing data 402. Accordingly, the checkpointing data 402 may include checkpointing related data, such as virtual memory contents, processor register state, processor flags, process tables, file descriptors, file handles, or other data structures relating to the current state of the application 410 at the time of the checkpointing request.
- In some embodiments, the functions of the checkpoint management module 420 may be performed by one or more sub-modules, such as an application management module 422 to manage the pausing and restarting of each application 410, an atomicity management module 424 to atomically save the checkpointing information, and a checkpointing data management module 426 to interface with the checkpointing data 402 to store the retrieved checkpointing information and retrieve the checkpointing information for transmission to the orchestration node 104. Additionally, in some embodiments, the checkpoint management module 420 may be further configured to receive and store environment checkpointing data 302 that may be received from the orchestration node 104. In such embodiments, the computing node 102 can be configured to assume the role of the orchestration node 104 in the event the computing node 102 presently configured to function as the orchestration node crashes or otherwise fails. Accordingly, the checkpointing data 402 and/or the environment checkpointing data 302 may be stored in the checkpoint cache 218.
- Additionally or alternatively, in some embodiments, the checkpoint management module 420 may be further configured to manage a restore event based on the environment checkpointing data 302. In such embodiments, the checkpoint management module 420 may be configured to receive a checkpoint restore signal from the orchestration node 104 (e.g., from the environment checkpoint administration module 330 of the environment 300 (see FIG. 3)) that indicates to the checkpoint management module 420 to restore the execution state of a distributed application. In response, the checkpoint management module 420 may pause any presently executing applications and start execution of one or more applications based on the checkpoint restore signal and/or any additional environment checkpointing data. Accordingly, the checkpoint management module 420 may use the hardware checkpoint support 204 of the processor 202 to restore the execution state based on the checkpointing data 402 and/or the environment checkpointing data 302. It should be appreciated that the presently executing applications paused as a result of the restore operation may or may not be the same applications.
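- A hedged sketch of that restore path follows. The format of the saved state and the loader interface are assumptions made only for illustration; in the disclosed system the hardware checkpoint support 204 performs the actual state load into memory.
```python
# Illustrative restore handler: pause current work, load saved states, resume from them.
from typing import Any, Dict, List


def handle_restore_signal(
    running_apps: List[str],
    saved_states: Dict[str, Dict[str, Any]],
) -> Dict[str, Any]:
    # Pause whatever is currently running; these applications may or may not be
    # the same ones being restored.
    paused = list(running_apps)
    running_apps.clear()

    # Load each saved execution state and resume the corresponding application
    # from it (a hardware mechanism would perform the actual load).
    resumed = [{"application": name, "restored_from": state}
               for name, state in saved_states.items()]
    return {"paused": paused, "resumed": resumed}


if __name__ == "__main__":
    states = {"solver": {"virtual_memory": "...", "registers": "..."}}
    print(handle_restore_signal(["reporting-job"], states))
```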
- Referring now to FIG. 5, in use, a working computing node 110 (i.e., illustratively, the computing node 106 or the computing node 108) may execute a method 500 for initializing a distributed application. The method 500 begins at block 502, in which the working computing node 110 receives a request to perform a task, such as processing a workload, via a distributed application. At block 504, the working computing node 110 initializes the distributed application via an application management module (e.g., the application management module 422 of FIG. 4) to control the distributed application. Accordingly, in some embodiments, the application management module 422 may be embodied as a hypervisor, main process, or master thread to execute and manage one or more of the applications 410 of FIG. 4. As described above, the applications 410 may be embodied as any process, thread, managed code, or other task executed by the working computing node 110. In some embodiments, the applications 410 may be embodied as virtualized applications, for example as applications or operating systems executed by a hypervisor and performed in a virtual machine created by the distributed application coordinator. During execution, the applications 410 may perform calculations, update regions of the memory 212, or perform any other operations typical of a computer application.
- At block 506, the application management module 422 of the working computing node 110 spawns objects based on the task to be performed via the distributed application. For example, in some embodiments, one or more child processes may be spawned based on requirements of the distributed application at block 508. In such embodiments, each spawned child process may be run via a virtual machine instance or directly by the working computing node 110. Accordingly, in such embodiments, one or more virtual machines may need to be spawned based on the child processes spawned. Additionally or alternatively, in some embodiments, one or more connections may be spawned based on the requirements of the distributed application at block 510.
- At block 512, the working computing node 110 registers the one or more spawned objects with the orchestration node 104. To do so, at block 514 in some embodiments, the working computing node 110 may register each of the spawned child processes with the orchestration node 104. As described previously, in some embodiments, the spawned child processes may be running on either a virtual machine or directly by the working computing node 110. Accordingly, the virtual machine instances may also be registered with the orchestration node 104. Additionally or alternatively, at block 516 in some embodiments, such as those embodiments wherein the application is a distributed application, the working computing node 110 may register each of the spawned connections between applications with the orchestration node 104, such as for multi-tiered application embodiments. At block 518, the working computing node 110 receives identifiers from the orchestration node 104 for each of the objects registered with the orchestration node 104. For example, at block 520, the working computing node 110 receives a child process identifier for each child process registered with the orchestration node 104. Additionally or alternatively, in some embodiments, at block 522, receiving the identifiers includes receiving a connection identifier for each of the connections registered with the orchestration node 104.
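- The registration handshake of method 500 might look roughly like the following sketch; the identifier format and the register_objects call are hypothetical stand-ins for the orchestration node's registration interface, chosen only to show the round trip of spawned objects out and identifiers back.
```python
# Illustrative initialization handshake: spawn objects, register them, receive identifiers.
import itertools
from typing import Dict, List

_counter = itertools.count(1)


class Orchestrator:
    def register_objects(
        self, node_id: str, children: List[str], connections: List[str]
    ) -> Dict[str, List[str]]:
        # Assign a child process identifier and a connection identifier per object.
        return {
            "child_process_ids": [f"{node_id}/proc-{next(_counter)}" for _ in children],
            "connection_ids": [f"{node_id}/conn-{next(_counter)}" for _ in connections],
        }


def initialize_distributed_application(node_id: str, orchestrator: Orchestrator) -> Dict[str, List[str]]:
    children = ["worker-thread"]   # spawned from the task requirements (block 508)
    connections = ["tier-2-db"]    # connections to other tiers, if any (block 510)
    return orchestrator.register_objects(node_id, children, connections)


if __name__ == "__main__":
    print(initialize_distributed_application("node-1", Orchestrator()))
```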
FIG. 6 , in use, theorchestration node 104 may execute amethod 600 for performing an environment checkpointing event. Themethod 600 begins atblock 602, in which theorchestration node 104 determines whether an environment checkpoint initialization signal was received. An environment checkpoint initialization signal may include time sync information and may be embodied as any hardware or software event that triggers a checkpointing operation. Theorchestration node 104 may use any technique to monitor for environment checkpoint initialization signals, including polling for events, handling interrupts, registering callback functions or event listeners, or other techniques. The environment checkpoint initialization signal may be embodied as a hardware event such as an interrupt, a memory access, or an I/O operation; as a software event such as a modification of a data structure in memory; as a user-generated event such as an application programming interface (API) call, or as any other event. - If the environment checkpoint initialization signal was not received, the method loops back to block 602 to continue to monitor for the environment checkpoint initialization signal. If the environment checkpoint initialization signal was received, the
method 600 advances to block 604, wherein theorchestration node 104 transmits (e.g., via thebackplane management controller 112 ofFIG. 1 ) a checkpoint initialization signal to one or more of the working computing nodes 110 (e.g., illustratively, thecomputing node 106 or the computing node 108) registered with theorchestration node 104. In other words, theorchestration node 104 may only transmit the checkpoint initialization signal to a subset of the registeredworking computing nodes 110 at a time. For example, the checkpointing event may be directed to a particular application, which may be distributed across multiple virtual machines on a singleworking computing node 110 or across a number of workingcomputing nodes 110. Accordingly, a checkpoint initialization signal may only be sent to those computing node(s) that are processing the distributed application. - At
- At block 606, the orchestration node 104 determines whether checkpointing data has been received from one of the working computing nodes 110 (e.g., illustratively, the computing node 106 or the computing node 108). As described previously, the checkpointing data may include saved execution states or other, similar data. If checkpointing data has not been received, the method 600 loops back to block 606 to continue to monitor for checkpointing data. If checkpointing data has been received, the method 600 advances to block 608, wherein the orchestration node 104 stores the received checkpointing data. In some embodiments, the checkpointing data may be stored in the data storage device 214 or the checkpoint cache 218 of FIG. 2.
- At block 610, the orchestration node 104 determines whether all of the working computing nodes 110 to which the checkpoint initialization signals were transmitted are required to have completed checkpointing (i.e., to have returned their checkpointing data to the orchestration node 104) before proceeding. If so, the method 600 branches to block 616, which is described in further detail below; otherwise, the method 600 advances to block 612, wherein the orchestration node 104 transmits a checkpoint complete signal to the working computing node 110 from which the checkpointing data was received. In some embodiments, the method 600 may advance to block 614, wherein the orchestration node 104 transmits the environment checkpointing data (i.e., the checkpointing data for all of the working computing nodes 110) to the registered working computing nodes 110 such that, in the event the orchestration node 104 crashes, another working computing node 110 may assume the role of the orchestration node 104, thus avoiding a single point of failure. In some embodiments, the environment checkpointing data may include a hash table of the state of the various connections and processes associated with each distributed application for each working computing node 110 of the compute environment 114.
- At block 616, the orchestration node 104 determines whether the checkpointing data has been received from all of the working computing nodes 110 to which the checkpoint initialization signals were transmitted. If not, the method 600 loops back to block 602 to continue monitoring for checkpointing data received at the orchestration node 104; otherwise, the method 600 advances to block 618. At block 618, the orchestration node 104 transmits a checkpoint complete signal to all of the working computing nodes 110 to which the checkpoint initialization signals were transmitted. In some embodiments, the method 600 advances from block 618 to block 614 to transmit the environment checkpointing data to all of the registered working computing nodes 110.
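- As a concrete illustration of blocks 602 through 618, the following Python sketch models the orchestration flow with in-process queues standing in for the backplane management controller transport. It is a minimal sketch under those assumptions; the OrchestrationSketch class, the message tuples, and the wait_for_all flag are invented for illustration and are not part of this disclosure.

```python
# Minimal sketch of the orchestration flow of method 600 (blocks 602-618),
# assuming an in-process queue per node as the transport. The class name,
# message tuples, and the wait_for_all flag are illustrative assumptions.
import queue
from typing import Dict, List

class OrchestrationSketch:
    def __init__(self) -> None:
        self.registered_workers: Dict[str, queue.Queue] = {}  # node_id -> worker inbox
        self.inbox: queue.Queue = queue.Queue()                # (node_id, checkpointing data)
        self.environment_checkpoint: Dict[str, dict] = {}      # block 608: stored data

    def register_worker(self, node_id: str, worker_inbox: queue.Queue) -> None:
        self.registered_workers[node_id] = worker_inbox

    def run_checkpoint_event(self, target_nodes: List[str], wait_for_all: bool = True) -> None:
        # Block 604: send the checkpoint initialization signal only to the
        # nodes that are processing the application being checkpointed.
        for node_id in target_nodes:
            self.registered_workers[node_id].put(("CHECKPOINT_INIT", None))

        pending = set(target_nodes)
        while pending:
            # Block 606: wait for checkpointing data from any targeted node.
            node_id, data = self.inbox.get()
            self.environment_checkpoint[node_id] = data          # block 608
            pending.discard(node_id)
            if not wait_for_all:
                # Block 612: acknowledge each node as soon as its data arrives.
                self.registered_workers[node_id].put(("CHECKPOINT_COMPLETE", None))

        if wait_for_all:
            # Blocks 616-618: acknowledge every targeted node once all data is in.
            for node_id in target_nodes:
                self.registered_workers[node_id].put(("CHECKPOINT_COMPLETE", None))

        # Block 614: replicate the environment checkpointing data to every
        # registered worker so any of them could later assume the orchestrator role.
        for worker_inbox in self.registered_workers.values():
            worker_inbox.put(("ENVIRONMENT_CHECKPOINT", dict(self.environment_checkpoint)))
```

In this sketch, a working computing node would push a (node_id, checkpointing_data) tuple into inbox after saving its local checkpoint; the worker-side counterpart is sketched after the description of method 700 below.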
- Referring now to FIG. 7, in use, each working computing node 110 (i.e., illustratively, the computing node 106 or the computing node 108) may execute a method 700 for performing a checkpointing event. The method 700 begins at block 702, in which the working computing node 110 determines whether a checkpoint initialization signal was received. As described previously, the checkpoint initialization signal may include time sync information and may be embodied as any hardware or software event that triggers a checkpointing operation.
- The working computing node 110 may use any technique to monitor for checkpoint initialization signals, including polling for events, handling interrupts, registering callback functions or event listeners, or other techniques. For example, during initialization of the working computing node 110, the working computing node 110 may perform any initialization routines or other processes required to activate the hardware checkpoint support 204, as well as any required software initialization routines. In such embodiments, the working computing node 110 may initialize interrupt vectors, timers, or other system hooks used to invoke the hardware checkpoint support 204.
- If the checkpoint initialization signal was not received, the method 700 loops back to block 702 to continue to monitor for the checkpoint initialization signal. If the checkpoint initialization signal was received, the method 700 advances to block 704, wherein the working computing node 110 pauses the applications that are being executed at the time the checkpoint initialization signal was received. As described previously, during execution, the application (e.g., the application 410) is initialized by a main thread, which is responsible for performing the task associated with the application.
- At block 706, the working computing node 110 locks the context of the computing node. In other words, any new data received at the working computing node 110 is blocked. At block 708, the working computing node 110 buffers the paused applications. To do so, the working computing node 110 buffers and saves any memory pages, or states, that are lagging behind the main thread and flushes any memory pages, or states, that are ahead of the main thread.
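- One way to picture the buffering decision at block 708 is as a comparison of each paused application's dirty state against the main thread's progress. The sketch below assumes a logical timestamp for that comparison, which this disclosure does not prescribe; the names are illustrative only.

```python
# Hedged sketch of block 708: classify each application's pending memory
# state relative to the main thread's progress. The logical-clock comparison
# is an assumption; this disclosure does not prescribe how "lagging" or
# "ahead" is determined.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class PendingState:
    app_id: str
    logical_time: int                       # position of this app's dirty pages/state
    pages: List[bytes] = field(default_factory=list)

def buffer_paused_applications(pending: List[PendingState],
                               main_thread_time: int) -> Dict[str, List[PendingState]]:
    saved, flushed = [], []
    for state in pending:
        if state.logical_time <= main_thread_time:
            saved.append(state)     # lagging behind the main thread: buffer and save
        else:
            flushed.append(state)   # ahead of the main thread: flush so the checkpoint
                                    # stays consistent with the main thread's position
    return {"saved": saved, "flushed": flushed}
```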
- At block 710, the working computing node 110 saves the checkpointing data, which may include the saved execution states of the applications paused at block 704 and buffered at block 708, as well as any other data related to the state of the paused applications (e.g., the stack, the heap, the allocated pages, the process table, other parts of the memory 212, the processor 202 flags, states, or other processor 202 information, etc.). In some embodiments, software (e.g., the operating system) running on the working computing node 110 may execute a system hook (i.e., a hardware hook) to save the execution state of the applications that were executing at the time the checkpoint initialization signal was received. In such embodiments, the working computing node 110 may save the execution states of the applications using the hardware checkpoint support 204 of FIG. 2. The hardware hook may be embodied as any technique usable to invoke the hardware checkpoint support 204 of the processor 202. It should be appreciated that different software executing on the same working computing node 110 may execute different system hooks. In some embodiments, in block 712, the checkpointing data may be stored in a persistent storage device, such as the checkpoint cache 218 of FIG. 2.
- At block 714, the working computing node 110 transmits the saved checkpointing data to the orchestration node 104. To do so, the working computing node 110 may transmit the saved checkpointing data via the backplane management controller 112 of FIG. 1. At block 716, the working computing node 110 determines whether a checkpoint complete signal was received from the orchestration node 104 in response to having transmitted the saved checkpointing data. As described previously, in some embodiments, the orchestration node 104 may be configured to wait until all of the working computing nodes 110 that received the checkpoint initialization signal have responded with their respective checkpointing data before the orchestration node 104 transmits the checkpoint complete signal to the applicable working computing nodes 110. Accordingly, if the checkpoint complete signal has not been received at block 716, the working computing node 110 loops back to block 716 to continue to monitor whether the checkpoint complete signal has been received. If the checkpoint complete signal has been received at block 716, the method 700 advances to block 718, wherein the context locked at block 706 is unlocked (i.e., new data is again accepted by the working computing node 110).
- At block 720, the applications that were executing at the time the checkpoint initialization signal was received, and were subsequently paused and buffered, are released from the buffer. Upon release, the main thread continues processing the application. As described previously, in some embodiments, each working computing node 110 in the compute environment 114 may store a copy of the environment checkpointing data, which may include application state information for all of the working computing nodes 110 of the compute environment 114, such that any working computing node 110 may be capable of being configured as the orchestration node 104. For example, in the event that the working computing node 110 that functioned as the orchestration node 104 has crashed or is otherwise unavailable, any other working computing node 110 of the environment can assume the role of the orchestration node 104. In such embodiments, at block 722, the working computing node 110 determines whether environment checkpointing data was received. If not, the working computing node 110 loops back to block 722 to continue to monitor for the environment checkpointing data; otherwise, the working computing node 110 advances to block 724. At block 724, the working computing node 110 stores the environment checkpointing data. In some embodiments, at block 726, the working computing node 110 stores the environment checkpointing data to a persistent storage device (e.g., the checkpoint cache 218 of FIG. 2). As described previously, in some embodiments, the environment checkpointing data may include a hash table of the state of the various connections and processes associated with each distributed application for each working computing node 110 of the compute environment 114 to which the working computing node 110 is connected.
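- Taken together, blocks 702 through 726 amount to a pause, lock, save, transmit, wait, and release sequence on the working computing node. The Python sketch below pairs with the orchestration sketch above and uses the same in-process queues; the hardware checkpoint support 204 is replaced by a plain callable, and all names and message tuples are illustrative assumptions rather than interfaces defined by this disclosure.

```python
# Sketch of the working-node flow of method 700 (blocks 702-726), paired with
# the orchestration sketch above. The hardware checkpoint support is stood in
# for by a plain callable, and all names and message tuples are assumptions.
import queue
from typing import Callable, Dict, List

class WorkerCheckpointSketch:
    def __init__(self, node_id: str, orchestrator_inbox: queue.Queue,
                 save_execution_state: Callable[[List[str]], dict]) -> None:
        self.node_id = node_id
        self.inbox: queue.Queue = queue.Queue()            # receives orchestrator signals
        self.orchestrator_inbox = orchestrator_inbox
        self.save_execution_state = save_execution_state   # stand-in for the hardware hook
        self.running_apps: List[str] = []
        self.context_locked = False
        self.environment_checkpoint: Dict[str, dict] = {}

    def handle_checkpoint_init(self) -> None:
        """Invoked when a ("CHECKPOINT_INIT", None) message is dispatched to this node."""
        # Blocks 704-708: pause the running applications, lock the node's
        # context (reject new inbound data), and buffer the paused applications.
        paused = list(self.running_apps)
        self.context_locked = True

        # Blocks 710-714: save the execution states (here via the stand-in
        # callable) and ship the checkpointing data to the orchestration node.
        checkpoint = self.save_execution_state(paused)
        self.orchestrator_inbox.put((self.node_id, checkpoint))

        # Blocks 716-720: wait for the checkpoint complete signal, then unlock
        # the context and release the buffered applications.
        while True:
            kind, payload = self.inbox.get()
            if kind == "CHECKPOINT_COMPLETE":
                self.context_locked = False
                break

        # Blocks 722-726: keep a copy of the environment checkpointing data,
        # if broadcast, so this node could later assume the orchestrator role.
        kind, payload = self.inbox.get()
        if kind == "ENVIRONMENT_CHECKPOINT":
            self.environment_checkpoint = payload
```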
- Referring now to FIG. 8, in use, each working computing node 110 (i.e., the computing node 106 or the computing node 108) may execute a method 800 for performing an environment restore event. The method 800 begins at block 802, in which the working computing node 110 determines whether a checkpoint restore signal was received. As described previously, the checkpoint restore signal may include data indicative of the environment checkpointing data to be referenced (i.e., used) for the restore. If the checkpoint restore signal was not received, the method 800 loops back to block 802 to continue to monitor for the checkpoint restore signal. If the checkpoint restore signal was received, the method 800 advances to block 804, wherein the working computing node 110 pauses the applications that are being executed at the time the checkpoint restore signal was received.
- At block 806, the working computing node 110 executes a system hook to load the saved execution state of one or more requested applications (e.g., the application 410 of FIG. 4) into memory (e.g., the memory 212 of FIG. 2). Similar to saving the execution state, the system hook for loading the execution state may be embodied as any technique usable to invoke the hardware checkpoint support 204 of the processor 202. At block 808, the working computing node 110 loads the execution state of the requested applications to be restored into the memory 212 using the hardware checkpoint support 204. After loading the execution state of the requested applications to be restored into the memory 212, at block 810 the working computing node 110 resumes execution of the restored applications based on the saved execution state. After resuming the restored applications, the method 800 loops back to block 802 to continue monitoring for the presence of a checkpoint restore signal.
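- The restore path of method 800 is essentially the inverse of the save path: pause, load the saved execution state through a system hook, and resume. The sketch below assumes the environment checkpointing data is keyed by node and application, and stands in for the hardware checkpoint support with plain callables; those keys and parameters are illustrative assumptions rather than interfaces defined by this disclosure.

```python
# Hedged sketch of the restore flow of method 800 (blocks 802-810). The
# hardware checkpoint support is stood in for by plain callables; the keying
# of the environment checkpointing data by node and application is assumed.
from typing import Callable, Dict, List

def restore_applications(environment_checkpoint: Dict[str, dict],
                         node_id: str,
                         requested_apps: List[str],
                         pause_app: Callable[[str], None],
                         load_state: Callable[[str, dict], None],
                         resume_app: Callable[[str], None]) -> None:
    # Block 804: pause the applications currently executing on this node.
    for app in requested_apps:
        pause_app(app)

    # Blocks 806-808: invoke the system hook to load each saved execution
    # state from the referenced environment checkpointing data into memory.
    node_checkpoint = environment_checkpoint.get(node_id, {})
    for app in requested_apps:
        load_state(app, node_checkpoint.get(app, {}))

    # Block 810: resume execution from the restored execution state.
    for app in requested_apps:
        resume_app(app)
```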
- Illustrative examples of the technologies disclosed herein are provided below. An embodiment of the technologies may include any one or more, and any combination of, the examples described below.
- Example 1 includes a computing node for performing a checkpointing event, the computing node comprising a hardware event monitor to receive a checkpoint initialization signal from an orchestration node communicatively coupled to the computing node; a checkpoint management module to (i) pause one or more applications being presently executed on the computing node in response to having received the checkpoint initialization signal and (ii) buffer, by an input/output (I/O) buffering device, input/output (I/O) signals of the one or more paused applications; and a hardware checkpoint support to save checkpointing data to a memory storage device of the computing node, wherein the checkpointing data includes an execution state of each of the one or more applications, wherein the checkpoint management module is further to transmit the checkpointing data to the orchestration node.
- Example 2 includes the subject matter of Example 1, and wherein the checkpoint management module is further to lock context of the computing node to block any new data received by the computing node from being processed by the computing node in response to having received the checkpoint initialization signal.
- Example 3 includes the subject matter of any of Examples 1 and 2, and wherein the hardware event monitor is further to receive, by the hardware event monitor, a checkpoint complete signal from the orchestration node, and wherein the checkpoint management module is further to resume the one or more paused applications in response to having received the checkpoint complete signal.
- Example 4 includes the subject matter of any of Examples 1-3, and wherein to resume the one or more paused applications comprises to (i) unlock context of the computing node to allow any new data to be received by the computing node and (ii) release the input/output (I/O) signals of the one or more applications from the input/output (I/O) buffering device.
- Example 5 includes the subject matter of any of Examples 1-4, and wherein the checkpoint management module is further to register with the orchestration node, wherein to register includes to provide an indication that the checkpointing event is to be initiated by the orchestration node.
- Example 6 includes the subject matter of any of Examples 1-5, and wherein the checkpoint management module is further to (i) receive environment checkpointing data from the orchestration node, wherein the environment checkpointing data includes execution state data of other computing nodes communicatively coupled to the orchestration node, and (ii) store the environment checkpointing data in a local storage.
- Example 7 includes the subject matter of any of Examples 1-6, and wherein the checkpoint management module is further to receive a checkpoint restore signal from the orchestration node, wherein the hardware checkpoint support is further to load a saved execution state of at least one of the one or more applications into a memory of the computing node, and wherein the checkpoint management module is further to resume execution of the at least one of the one or more applications from the saved execution state loaded into the memory.
- Example 8 includes the subject matter of any of Examples 1-7, and wherein to load the saved execution state comprises to load a saved execution state based at least in part on the environment checkpointing data.
- Example 9 includes the subject matter of any of Examples 1-8, and wherein the checkpoint management module is further to execute a distributed application using a main thread initiated by the computing node, wherein to save the checkpointing data comprises to save an execution state of the distributed application, and wherein the execution state is indicative of a virtual memory state of the distributed application.
- Example 10 includes the subject matter of any of Examples 1-9, and wherein the checkpoint management module is further to (i) save memory pages that correspond to a first application of the one or more applications in a memory of the computing node in response to a determination that the first application is lagging behind the main thread and (ii) flush memory pages that correspond to a second application of the one or more applications in the memory in response to a determination that the second application is working ahead of the main thread.
- Example 11 includes the subject matter of any of Examples 1-10, and wherein to buffer the input/output (I/O) signals of the one or more paused applications comprises to buffer memory access events.
- Example 12 includes the subject matter of any of Examples 1-11, and wherein to buffer the input/output (I/O) signals of the one or more paused applications comprises to buffer disk access events.
- Example 13 includes the subject matter of any of Examples 1-12, and wherein to buffer the input/output (I/O) signals of the one or more paused applications comprises to buffer network access events.
- Example 14 includes an orchestration node for administering an environment checkpointing event, the orchestration node comprising an environment checkpoint administration module to (i) transmit a checkpoint initialization signal to each of a plurality of working computing nodes communicatively coupled to the orchestration node in response to an environment checkpoint initialization signal indicative of a checkpoint event, (ii) receive checkpointing data from each working computing node in response to the checkpoint initialization signal, wherein the checkpoint data includes an execution state of at least one application of a corresponding working computing node, (iii) store the received checkpointing data, and (iv) transmit a checkpoint complete signal to each of the plurality of working computing nodes.
- Example 15 includes the subject matter of Example 14, and wherein to transmit the checkpoint complete signal to the plurality of working computing nodes comprises to transmit the checkpoint complete signal to the plurality of working computing nodes in response to a determination that the checkpointing data has been received from each of the plurality of working computing nodes.
- Example 16 includes the subject matter of any of Examples 14 and 15, and wherein the environment checkpoint administration module is further to transmit the received checkpointing data from each of the plurality of working computing nodes to each of the plurality of working computing nodes communicatively coupled to and registered with the orchestration node.
- Example 17 includes a method for performing a checkpointing event, the method comprising receiving, by a hardware event monitor of a computing node, a checkpoint initialization signal from an orchestration node communicatively coupled to the computing node; pausing, by a processor of the computing node, one or more applications presently executing on the computing node in response to receiving the checkpoint initialization signal; buffering, by an input/output (I/O) buffering device of the computing node, input/output (I/O) signals of the one or more paused applications; saving, by a hardware checkpoint support of the computing node, checkpointing data to a memory storage device of the computing node, wherein the checkpointing data includes an execution state of each of the one or more applications; and transmitting, by the computing node, the checkpointing data to the orchestration node.
- Example 18 includes the subject matter of Example 17, and further including locking, by the computing node, context of the computing node to block any new data received by the computing node from being processed by the computing node in response to receiving the checkpoint initialization signal.
- Example 19 includes the subject matter of any of Examples 17 and 18, and further including receiving, by the hardware event monitor of the computing node, a checkpoint complete signal from the orchestration node; and resuming the one or more paused applications in response to receiving the checkpoint complete signal.
- Example 20 includes the subject matter of any of Examples 17-19, and wherein resuming the one or more paused applications comprises (i) unlocking context of the computing node to allow any new data to be received by the computing node and (ii) releasing the input/output (I/O) signals of the one or more applications from the input/output (I/O) buffering device of the computing node.
- Example 21 includes the subject matter of any of Examples 17-20, and further including registering, by the computing node, with the orchestration node to provide an indication that the checkpointing event is to be initiated by the orchestration node.
- Example 22 includes the subject matter of any of Examples 17-21, and further including receiving, by the computing node, environment checkpointing data from the orchestration node, wherein the environment checkpointing data includes execution state data of other computing nodes communicatively coupled to the orchestration node; and storing, by the computing node, the environment checkpointing data in a local storage.
- Example 23 includes the subject matter of any of Examples 17-22, and further including receiving, by the computing node, a checkpoint restore signal from the orchestration node; loading, by the hardware checkpoint support, a saved execution state of at least one of the one or more applications into a memory of the computing node; and resuming, by the computing node, execution of the at least one of the one or more applications from the saved execution state loaded into the memory.
- Example 24 includes the subject matter of any of Examples 17-23, and wherein loading the saved execution state comprises loading a saved execution state based at least in part on the environment checkpointing data.
- Example 25 includes the subject matter of any of Examples 17-24, and further including executing, by the computing node, a distributed application using a main thread initiated by the computing node; wherein saving the checkpointing data comprises saving an execution state of the distributed application, and wherein the execution state is indicative of a virtual memory state of the distributed application.
- Example 26 includes the subject matter of any of Examples 17-25, and further including saving memory pages, stored in a memory of the computing node, corresponding to a first application of the one or more applications in response to a determination that the first application is lagging behind the main thread; and flushing memory pages, stored in the memory, corresponding to a second application of the one or more applications in response to a determination that the second application is working ahead of the main thread.
- Example 27 includes the subject matter of any of Examples 17-26, and wherein buffering the input/output (I/O) signals of the one or more paused applications comprises buffering memory access events.
- Example 28 includes the subject matter of any of Examples 17-27, and wherein buffering the input/output (I/O) signals of the one or more paused applications comprises buffering disk access events.
- Example 29 includes the subject matter of any of Examples 17-28, and wherein buffering the input/output (I/O) signals of the one or more paused applications comprises buffering network access events.
- Example 30 includes a computing node comprising a processor; and a memory having stored therein a plurality of instructions that when executed by the processor cause the computing node to perform the method of any of Examples 17-29.
- Example 31 includes one or more machine readable storage media comprising a plurality of instructions stored thereon that in response to being executed result in a computing node performing the method of any of Examples 17-29.
- Example 32 includes a method for administering an environment checkpointing event, the method comprising transmitting, by an orchestration node, a checkpoint initialization signal to each of a plurality of working computing nodes communicatively coupled to the orchestration node in response to an environment checkpoint initialization signal indicative of a checkpoint event; receiving, by the orchestration node, checkpointing data from each working computing node in response to the checkpoint initialization signal, wherein the checkpoint data includes an execution state of at least one application of a corresponding working computing node; storing, by a memory storage device of the orchestration node, the received checkpointing data; and transmitting, by the orchestration node, a checkpoint complete signal to each of the plurality of working computing nodes.
- Example 33 includes the subject matter of Example 32, and wherein transmitting the checkpoint complete signal to the plurality of working computing nodes comprises transmitting the checkpoint complete signal to the plurality of working computing nodes in response to a determination that the checkpointing data has been received from each of the plurality of working computing nodes.
- Example 34 includes the subject matter of any of Examples 32 and 33, and further including transmitting, by the orchestration node, the received checkpointing data from each of the plurality of working computing nodes to each of the plurality of working computing nodes communicatively coupled to and registered with the orchestration node.
- Example 35 includes a computing node comprising a processor; and a memory having stored therein a plurality of instructions that when executed by the processor cause the computing node to perform the method of any of Examples 32-34.
- Example 36 includes one or more machine readable storage media comprising a plurality of instructions stored thereon that in response to being executed result in a computing node performing the method of any of Examples 32-34.
- Example 37 includes a computing node for performing a checkpointing event, the computing node comprising means for receiving a checkpoint initialization signal from an orchestration node communicatively coupled to the computing node; means for pausing one or more applications presently executing on the computing node in response to receiving the checkpoint initialization signal; means for buffering input/output (I/O) signals of the one or more paused applications; means for saving checkpointing data to a memory storage device of the computing node, wherein the checkpointing data includes an execution state of each of the one or more applications; and means for transmitting the checkpointing data to the orchestration node.
- Example 38 includes the subject matter of Example 37, and further including means for locking context of the computing node to block any new data received by the computing node from being processed by the computing node in response to receiving the checkpoint initialization signal.
- Example 39 includes the subject matter of any of Examples 37 and 38, and further including means for receiving a checkpoint complete signal from the orchestration node; and means for resuming the one or more paused applications in response to receiving the checkpoint complete signal.
- Example 40 includes the subject matter of any of Examples 37-39, and wherein the means for resuming the one or more paused applications comprises means for (i) unlocking context of the computing node to allow any new data to be received by the computing node and (ii) releasing the input/output (I/O) signals of the one or more applications.
- Example 41 includes the subject matter of any of Examples 37-40, and further including means for registering with the orchestration node to provide an indication that the checkpointing event is to be initiated by the orchestration node.
- Example 42 includes the subject matter of any of Examples 37-41, and further including means for receiving environment checkpointing data from the orchestration node, wherein the environment checkpointing data includes execution state data of other computing nodes communicatively coupled to the orchestration node; and means for storing the environment checkpointing data in a local storage.
- Example 43 includes the subject matter of any of Examples 37-42, and further including means for receiving a checkpoint restore signal from the orchestration node; means for loading a saved execution state of at least one of the one or more applications into a memory of the computing node; and means for resuming execution of the at least one of the one or more applications from the saved execution state loaded into the memory.
- Example 44 includes the subject matter of any of Examples 37-43, and wherein the means for loading the saved execution state comprises means for loading a saved execution state based at least in part on the environment checkpointing data.
- Example 45 includes the subject matter of any of Examples 37-44, and further including means for executing a distributed application using a main thread initiated by the computing node; wherein the means for saving the checkpointing data comprises means for saving an execution state of the distributed application, and wherein the execution state is indicative of a virtual memory state of the distributed application.
- Example 46 includes the subject matter of any of Examples 37-45, and further including means for saving memory pages, stored in a memory of the computing node, corresponding to a first application of the one or more applications in response to a determination that the first application is lagging behind the main thread; and means for flushing memory pages, stored in the memory, corresponding to a second application of the one or more applications in response to a determination that the second application is working ahead of the main thread.
- Example 47 includes the subject matter of any of Examples 37-46, and wherein the means for buffering the input/output (I/O) signals of the one or more paused applications comprises means for buffering memory access events.
- Example 48 includes the subject matter of any of Examples 37-47, and wherein the means for buffering the input/output (I/O) signals of the one or more paused applications comprises means for buffering disk access events.
- Example 49 includes the subject matter of any of Examples 37-48, and wherein the means for buffering the input/output (I/O) signals of the one or more paused applications comprises means for buffering network access events.
- Example 50 includes an orchestration node for administering an environment checkpointing event, the orchestration node comprising means for transmitting a checkpoint initialization signal to each of a plurality of working computing nodes communicatively coupled to the orchestration node in response to an environment checkpoint initialization signal indicative of a checkpoint event; means for receiving checkpointing data from each working computing node in response to the checkpoint initialization signal, wherein the checkpoint data includes an execution state of at least one application of a corresponding working computing node; means for storing the received checkpointing data; and means for transmitting a checkpoint complete signal to each of the plurality of working computing nodes.
- Example 51 includes the subject matter of Example 50, and wherein the means for transmitting the checkpoint complete signal to the plurality of working computing nodes comprises means for transmitting the checkpoint complete signal to the plurality of working computing nodes in response to a determination that the checkpointing data has been received from each of the plurality of working computing nodes.
- Example 52 includes the subject matter of any of Examples 50 and 51, and further including means for transmitting the received checkpointing data from each of the plurality of working computing nodes to each of the plurality of working computing nodes communicatively coupled to and registered with the orchestration node.
Claims (31)
1-25. (canceled)
26. A method for performing a checkpointing event, the method comprising:
receiving, by a hardware event monitor of a computing node, a checkpoint initialization signal from an orchestration node communicatively coupled to the computing node;
pausing, by a processor of the computing node, one or more applications presently executing on the computing node in response to receiving the checkpoint initialization signal;
buffering, by an input/output (I/O) buffering device of the computing node, input/output (I/O) signals of the one or more paused applications;
saving, by a hardware checkpoint support of the computing node, checkpointing data to a memory storage device of the computing node, wherein the checkpointing data includes an execution state of each of the one or more applications; and
transmitting, by the computing node, the checkpointing data to the orchestration node.
27. The method of claim 26 , further comprising locking, by the computing node, context of the computing node to block any new data received by the computing node from being processed by the computing node in response to receiving the checkpoint initialization signal.
28. The method of claim 27 , further comprising:
receiving, by the hardware event monitor of the computing node, a checkpoint complete signal from the orchestration node; and
resuming the one or more paused applications in response to receiving the checkpoint complete signal.
29. The method of claim 28 , wherein resuming the one or more paused applications comprises (i) unlocking context of the computing node to allow any new data to be received by the computing node and (ii) releasing the input/output (I/O) signals of the one or more applications from the input/output (I/O) buffering device of the computing node.
30. The method of claim 26 , further comprising registering, by the computing node, with the orchestration node to provide an indication that the checkpointing event is to be initiated by the orchestration node.
31. The method of claim 30 , further comprising:
receiving, by the computing node, environment checkpointing data from the orchestration node, wherein the environment checkpointing data includes execution state data of other computing nodes communicatively coupled to the orchestration node; and
storing, by the computing node, the environment checkpointing data in a local storage.
32. The method of claim 31 , further comprising:
receiving, by the computing node, a checkpoint restore signal from the orchestration node;
loading, by the hardware checkpoint support, a saved execution state of at least one of the one or more applications into a memory of the computing node; and
resuming, by the computing node, execution of the at least one of the one or more applications from the saved execution state loaded into the memory.
33. The method of claim 32 , wherein loading the saved execution state comprises loading a saved execution state based at least in part on the environment checkpointing data.
34. The method of claim 26 , further comprising:
executing, by the computing node, a distributed application using a main thread initiated by the computing node;
wherein saving the checkpointing data comprises saving an execution state of the distributed application, and wherein the execution state is indicative of a virtual memory state of the distributed application.
35. The method of claim 34 , further comprising:
saving memory pages, stored in a memory of the computing node, corresponding to a first application of the one or more applications in response to a determination that the first application is lagging behind the main thread; and
flushing memory pages, stored in the memory, corresponding to a second application of the one or more applications in response to a determination that the second application is working ahead of the main thread.
36. The method of claim 26 , wherein buffering the input/output (I/O) signals of the one or more paused applications comprises buffering memory access events.
37. The method of claim 26 , wherein buffering the input/output (I/O) signals of the one or more paused applications comprises buffering disk access events.
38. The method of claim 26 , wherein buffering the input/output (I/O) signals of the one or more paused applications comprises buffering network access events.
39. A method for administering an environment checkpointing event, the method comprising:
transmitting, by an orchestration node, a checkpoint initialization signal to each of a plurality of working computing nodes communicatively coupled to the orchestration node in response to an environment checkpoint initialization signal indicative of a checkpoint event;
receiving, by the orchestration node, checkpointing data from each working computing node in response to the checkpoint initialization signal, wherein the checkpoint data includes an execution state of at least one application of a corresponding working computing node;
storing, by a memory storage device of the orchestration node, the received checkpointing data;
and transmitting, by the orchestration node, a checkpoint complete signal to each of the plurality of working computing nodes.
40. The method of claim 39 , wherein transmitting the checkpoint complete signal to the plurality of working computing nodes comprises transmitting the checkpoint complete signal to the plurality of working computing nodes in response to a determination that the checkpointing data has been received from each of the plurality of working computing nodes.
41. The method of claim 39 , further comprising transmitting, by the orchestration node, the received checkpointing data from each of the plurality of working computing nodes to each of the plurality of working computing nodes communicatively coupled to and registered with the orchestration node.
42. A computing node comprising:
one or more processors; and
one or more memory devices having stored therein a plurality of instructions that, when executed by the one or more processors, cause the computing node to:
pause one or more applications presently executed on the computing node;
buffer, by an input/output (I/O) buffering device, input/output (I/O) signals of the one or more paused applications; and
transmit checkpointing data indicative of an execution state of each of the one or more applications to a remote node.
43. The computing node of claim 42 , wherein to pause the one or more applications comprises to pause the one or more applications in response to receipt of a checkpoint initialization signal received from the remote node, and wherein the plurality of instructions, when executed, further cause the computing node to:
lock context of the computing node to block any new data received by the computing node from being processed by the computing node in response to receipt of the checkpoint initialization signal.
44. The computing node of claim 43 , wherein the plurality of instructions, when executed by the one or more processors, further cause the computing node to:
receive a checkpoint complete signal from the remote node, and
resume the one or more paused applications in response to having received the checkpoint complete signal, wherein to resume the one or more paused applications comprises to (i) unlock context of the computing node to allow any new data to be received by the computing node and (ii) release the input/output (I/O) signals of the one or more applications from the input/output (I/O) buffering device.
45. The computing node of claim 42 , wherein the plurality of instructions, when executed by the one or more processors, further cause the computing node to:
save the checkpointing data to a memory storage device of the computing node, wherein the checkpointing data includes an execution state of each of the one or more applications; and
execute a distributed application using a main thread initiated by the computing node,
wherein to save the checkpointing data comprises to save an execution state of the distributed application, and wherein the execution state is indicative of a virtual memory state of the distributed application.
46. An orchestration node comprising:
one or more processors; and
one or more memory devices having stored therein a plurality of instructions that, when executed by the one or more processors, cause the orchestration node to:
transmit a checkpoint initialization signal to each of a plurality of working computing nodes communicatively coupled to the orchestration node in response to an environment checkpoint initialization signal indicative of a checkpoint event, and
transmit a checkpoint complete signal to each of the plurality of working computing nodes in response to a determination that checkpointing data has been received from each corresponding working computing node, wherein the checkpoint data includes an execution state of at least one application of a corresponding working computing node.
47. The orchestration node of claim 46 , wherein the plurality of instructions, when executed, further cause the orchestration node to:
receive checkpointing data from each working computing node in response to the checkpoint initialization signal, and
store the received checkpointing data.
48. The orchestration node of claim 46 , wherein the plurality of instructions, when executed, further cause the orchestration node to transmit the received checkpointing data from each of the plurality of working computing nodes to each of the plurality of working computing nodes communicatively coupled to and registered with the orchestration node.
49. One or more computer-readable storage media comprising a plurality of instructions stored thereon that, when executed by a computing node, cause the computing node to:
pause one or more applications presently executed on the computing node;
buffer, by an input/output (I/O) buffering device, input/output (I/O) signals of the one or more paused applications; and
transmit checkpointing data indicative of an execution state of each of the one or more applications to a remote node.
50. The one or more computer-readable storage media of claim 49 , wherein to pause the one or more applications comprises to pause the one or more applications in response to receipt of a checkpoint initialization signal received from the remote node, and wherein the plurality of instructions, when executed, further cause the computing node to:
lock context of the computing node to block any new data received by the computing node from being processed by the computing node in response to receipt of the checkpoint initialization signal.
51. The one or more computer-readable storage media of claim 50 , wherein the plurality of instructions, when executed by the computing node, further cause the computing node to:
receive a checkpoint complete signal from the remote node, and
resume the one or more paused applications in response to having received the checkpoint complete signal, wherein to resume the one or more paused applications comprises to (i) unlock context of the computing node to allow any new data to be received by the computing node and (ii) release the input/output (I/O) signals of the one or more applications from the input/output (I/O) buffering device.
52. The one or more computer-readable storage media of claim 49 , wherein the plurality of instructions, when executed by the computing node, further cause the computing node to:
save the checkpointing data to a memory storage device of the computing node, wherein the checkpointing data includes an execution state of each of the one or more applications; and
execute a distributed application using a main thread initiated by the computing node,
wherein to save the checkpointing data comprises to save an execution state of the distributed application, and wherein the execution state is indicative of a virtual memory state of the distributed application.
53. One or more computer-readable storage media comprising a plurality of instructions stored thereon that, when executed by an orchestration node, cause the orchestration node to:
transmit a checkpoint initialization signal to each of a plurality of working computing nodes communicatively coupled to the orchestration node in response to an environment checkpoint initialization signal indicative of a checkpoint event, and
transmit a checkpoint complete signal to each of the plurality of working computing nodes in response to a determination that checkpointing data has been received from each corresponding working computing node, wherein the checkpoint data includes an execution state of at least one application of a corresponding working computing node.
54. The one or more computer-readable storage media of claim 53 , wherein the plurality of instructions, when executed, further cause the orchestration node to:
receive checkpointing data from each working computing node in response to the checkpoint initialization signal, and
store the received checkpointing data.
55. The one or more computer-readable storage media of claim 53 , wherein the plurality of instructions, when executed, further cause the orchestration node to transmit the received checkpointing data from each of the plurality of working computing nodes to each of the plurality of working computing nodes communicatively coupled to and registered with the orchestration node.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/670,707 US20170357552A1 (en) | 2015-06-24 | 2017-08-07 | Technologies for data center environment checkpointing |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/748,650 US9727421B2 (en) | 2015-06-24 | 2015-06-24 | Technologies for data center environment checkpointing |
US15/670,707 US20170357552A1 (en) | 2015-06-24 | 2017-08-07 | Technologies for data center environment checkpointing |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US14/748,650 Continuation US9727421B2 (en) | 2015-06-24 | 2015-06-24 | Technologies for data center environment checkpointing |
Publications (1)
Publication Number | Publication Date |
---|---|
US20170357552A1 (en) | 2017-12-14 |
Family
ID=57586179
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US14/748,650 Active 2035-10-22 US9727421B2 (en) | 2015-06-24 | 2015-06-24 | Technologies for data center environment checkpointing |
US15/670,707 Abandoned US20170357552A1 (en) | 2015-06-24 | 2017-08-07 | Technologies for data center environment checkpointing |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US14/748,650 Active 2035-10-22 US9727421B2 (en) | 2015-06-24 | 2015-06-24 | Technologies for data center environment checkpointing |
Country Status (4)
Country | Link |
---|---|
US (2) | US9727421B2 (en) |
EP (1) | EP3314436A1 (en) |
CN (1) | CN107743618A (en) |
WO (1) | WO2016209471A1 (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200233677A2 (en) * | 2018-04-11 | 2020-07-23 | Smart Enterprises, Inc. | Dynamically-Updatable Deep Transactional Monitoring Systems and Methods |
US10776208B2 (en) * | 2018-07-18 | 2020-09-15 | EMC IP Holding Company LLC | Distributed memory checkpointing using storage class memory systems |
US10684666B2 (en) * | 2018-09-11 | 2020-06-16 | Dell Products L.P. | Startup orchestration of a chassis system |
US11641395B2 (en) * | 2019-07-31 | 2023-05-02 | Stratus Technologies Ireland Ltd. | Fault tolerant systems and methods incorporating a minimum checkpoint interval |
CN113076228B (en) * | 2020-01-03 | 2024-06-04 | 阿里巴巴集团控股有限公司 | Distributed system and management method and device thereof |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7581220B1 (en) * | 2005-11-22 | 2009-08-25 | Symantec Operating Corporation | System and method for modifying user memory from an arbitrary kernel state |
US20090222632A1 (en) * | 2008-02-29 | 2009-09-03 | Fujitsu Limited | Storage system controlling method, switch device and storage system |
US20120036106A1 (en) * | 2010-08-09 | 2012-02-09 | Symantec Corporation | Data Replication Techniques Using Incremental Checkpoints |
US8250033B1 (en) * | 2008-09-29 | 2012-08-21 | Emc Corporation | Replication of a data set using differential snapshots |
US20130054529A1 (en) * | 2011-08-24 | 2013-02-28 | Computer Associates Think, Inc. | Shadow copy bookmark generation |
US20140095821A1 (en) * | 2012-10-01 | 2014-04-03 | The Research Foundation For The State University Of New York | System and method for security and privacy aware virtual machine checkpointing |
Family Cites Families (24)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6401216B1 (en) * | 1998-10-29 | 2002-06-04 | International Business Machines Corporation | System of performing checkpoint/restart of a parallel program |
US6594779B1 (en) * | 1999-03-30 | 2003-07-15 | International Business Machines Corporation | Method, system and program products for managing the checkpointing/restarting of resources of a computing environment |
CN1251110C (en) * | 2002-12-31 | 2006-04-12 | 联想(北京)有限公司 | Method for node load information transfer and node survival detection in machine group |
US7191292B2 (en) * | 2004-06-04 | 2007-03-13 | Sun Microsystems, Inc. | Logging of level-two cache transactions into banks of the level-two cache for system rollback |
US7478278B2 (en) * | 2005-04-14 | 2009-01-13 | International Business Machines Corporation | Template based parallel checkpointing in a massively parallel computer system |
US7627728B1 (en) * | 2005-12-29 | 2009-12-01 | Symantec Operating Corporation | System and method for efficient generation of application snapshots |
DE602007000348D1 (en) * | 2006-09-19 | 2009-01-22 | Shelbourne Data Man Ltd | Data management system and method |
JP2008242744A (en) * | 2007-03-27 | 2008-10-09 | Hitachi Ltd | Management device and method for storage device performing recovery according to cdp |
US20080244544A1 (en) * | 2007-03-29 | 2008-10-02 | Naveen Neelakantam | Using hardware checkpoints to support software based speculation |
US7856421B2 (en) * | 2007-05-18 | 2010-12-21 | Oracle America, Inc. | Maintaining memory checkpoints across a cluster of computing nodes |
US8381032B2 (en) * | 2008-08-06 | 2013-02-19 | O'shantel Software L.L.C. | System-directed checkpointing implementation using a hypervisor layer |
CN201497981U (en) * | 2009-04-30 | 2010-06-02 | 升东网络科技发展(上海)有限公司 | Database failure automatic detecting and shifting system |
JP5313099B2 (en) * | 2009-09-25 | 2013-10-09 | 日立建機株式会社 | Machine abnormality monitoring device |
US10061464B2 (en) * | 2010-03-05 | 2018-08-28 | Oracle International Corporation | Distributed order orchestration system with rollback checkpoints for adjusting long running order management fulfillment processes |
US9507841B2 (en) * | 2011-06-16 | 2016-11-29 | Sap Se | Consistent backup of a distributed database system |
US9098439B2 (en) * | 2012-01-05 | 2015-08-04 | International Business Machines Corporation | Providing a fault tolerant system in a loosely-coupled cluster environment using application checkpoints and logs |
US8832037B2 (en) * | 2012-02-07 | 2014-09-09 | Zerto Ltd. | Adaptive quiesce for efficient cross-host consistent CDP checkpoints |
EP2859437A4 (en) * | 2012-06-08 | 2016-06-08 | Hewlett Packard Development Co | Checkpointing using fpga |
CN102915257B (en) * | 2012-09-28 | 2017-02-08 | 曙光信息产业(北京)有限公司 | TORQUE(tera-scale open-source resource and queue manager)-based parallel checkpoint execution method |
WO2015014394A1 (en) * | 2013-07-30 | 2015-02-05 | Nec Europe Ltd. | Method and system for checkpointing a global state of a distributed system |
US9652338B2 (en) * | 2013-12-30 | 2017-05-16 | Stratus Technologies Bermuda Ltd. | Dynamic checkpointing systems and methods |
ES2652262T3 (en) * | 2013-12-30 | 2018-02-01 | Stratus Technologies Bermuda Ltd. | Method of delaying checkpoints by inspecting network packets |
US9507668B2 (en) * | 2014-10-30 | 2016-11-29 | Netapp, Inc. | System and method for implementing a block-based backup restart |
US9411628B2 (en) * | 2014-11-13 | 2016-08-09 | Microsoft Technology Licensing, Llc | Virtual machine cluster backup in a multi-node environment |
- 2015-06-24 US US14/748,650 patent/US9727421B2/en active Active
- 2016-05-24 WO PCT/US2016/033934 patent/WO2016209471A1/en unknown
- 2016-05-24 CN CN201680036762.4A patent/CN107743618A/en active Pending
- 2016-05-24 EP EP16814933.4A patent/EP3314436A1/en not_active Withdrawn
- 2017-08-07 US US15/670,707 patent/US20170357552A1/en not_active Abandoned
Also Published As
Publication number | Publication date |
---|---|
US20160378611A1 (en) | 2016-12-29 |
WO2016209471A1 (en) | 2016-12-29 |
US9727421B2 (en) | 2017-08-08 |
EP3314436A1 (en) | 2018-05-02 |
CN107743618A (en) | 2018-02-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20170357552A1 (en) | | Technologies for data center environment checkpointing |
US11126420B2 (en) | Component firmware update from baseboard management controller | |
US9760408B2 (en) | Distributed I/O operations performed in a continuous computing fabric environment | |
US10146641B2 (en) | Hardware-assisted application checkpointing and restoring | |
US20190266003A1 (en) | Virtual machine recovery in shared memory architecture | |
US8347290B2 (en) | Monitoring spin locks in virtual machines in a computing system environment | |
US8904159B2 (en) | Methods and systems for enabling control to a hypervisor in a cloud computing environment | |
WO2021018267A1 (en) | Live migration method for virtual machine and communication device | |
US20150082309A1 (en) | System and method for providing redundancy for management controller | |
US9639486B2 (en) | Method of controlling virtualization software on a multicore processor | |
US9571584B2 (en) | Method for resuming process and information processing system | |
US11573815B2 (en) | Dynamic power management states for virtual machine migration | |
US9529656B2 (en) | Computer recovery method, computer system, and storage medium | |
US20190121656A1 (en) | Virtualization operations for directly assigned devices | |
US8499112B2 (en) | Storage control apparatus | |
US20200133701A1 (en) | Software service intervention in a computing system | |
US11726852B2 (en) | Hardware-assisted paravirtualized hardware watchdog | |
CN102609324A (en) | Method, device and system for restoring deadlock of virtual machine | |
US11243800B2 (en) | Efficient virtual machine memory monitoring with hyper-threading | |
EP4443291A1 (en) | Cluster management method and device, and computing system | |
US10152341B2 (en) | Hyper-threading based host-guest communication | |
US10394295B2 (en) | Streamlined physical restart of servers method and apparatus | |
Sartakov et al. | Temporality a NVRAM-based virtualization platform | |
Cheng et al. | Supporting software-defined HA clusters on OpenStack platform | |
US20220357976A1 (en) | Information processing apparatus, information processing method, and computer-readable recording medium storing information processing program |
Legal Events
Date | Code | Title | Description
---|---|---|---
| STPP | Information on status: patent application and granting procedure in general | Free format text: FINAL REJECTION MAILED
| STPP | Information on status: patent application and granting procedure in general | Free format text: ADVISORY ACTION MAILED
| AS | Assignment | Owner name: INTEL CORPORATION, CALIFORNIA; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: LJUBUNCIC, IGOR; GIRI, RAVI A.; REEL/FRAME: 050411/0913; Effective date: 20150604
| STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION