US20210377117A1 - Cluster deployment and management system - Google Patents
- Publication number
- US20210377117A1 (application US 15/929,859)
- Authority
- US
- United States
- Prior art keywords
- cluster
- switch device
- node devices
- management
- deployment
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/08—Configuration management of networks or network elements
- H04L41/0803—Configuration setting
- H04L41/084—Configuration by using pre-existing information, e.g. using templates or copying from other elements
- H04L41/0843—Configuration by using pre-existing information based on generic templates
- H04L41/0806—Configuration setting for initial configuration or provisioning, e.g. plug-and-play
- H04L41/0893—Assignment of logical groups to network elements
- H04L41/0894—Policy-based network configuration management
- H04L41/0895—Configuration of virtualised networks or elements, e.g. virtualised network function or OpenFlow elements
- H04L41/12—Discovery or management of network topologies
- H04L41/40—Arrangements for maintenance, administration or management of data switching networks using virtualisation of network functions or resources, e.g. SDN or NFV entities
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/2866—Architectures; Arrangements
- H04L67/30—Profiles
- H04L67/34—Network arrangements or protocols for supporting network services or applications involving the movement of software or configuration parameters
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
Definitions
- The present disclosure relates generally to information handling systems, and more particularly to deployment and lifecycle management of a cluster of information handling systems.
- An information handling system generally processes, compiles, stores, and/or communicates information or data for business, personal, or other purposes thereby allowing users to take advantage of the value of the information.
- Information handling systems may also vary regarding what information is handled, how the information is handled, how much information is processed, stored, or communicated, and how quickly and efficiently the information may be processed, stored, or communicated.
- The variations in information handling systems allow for information handling systems to be general or configured for a specific user or specific use such as financial transaction processing, airline reservations, enterprise data storage, or global communications.
- Information handling systems may include a variety of hardware and software components that may be configured to process, store, and communicate information and may include one or more computer systems, data storage systems, and networking systems.
- Cluster nodes in a cluster system may include a separate physical server device, a storage device, a networking device, an accelerator device, a Graphics Processing Unit (GPU), and/or a combination of those devices in a Hyper-Converged Infrastructure (HCI) system.
- HCI systems provide a software-defined Information Technology (IT) infrastructure that virtualizes elements of conventional "hardware-defined" systems in order to provide virtualized computing (e.g., via a hypervisor), a virtualized Storage Area Network (SAN) (e.g., software-defined storage) and, in some situations, virtualized networking (e.g., software-defined networking), any or all of which may be provided using commercial "off-the-shelf" server devices.
- Some cluster systems utilize a complex set of cluster nodes in order to run modern, cloud-native, micro-service-based applications (e.g., a container cluster system). These cluster systems may include cluster nodes that provide computational and storage environments for supporting cloud native applications, and each cluster node in the cluster system may require its own set of configuration parameters for performing corresponding processing functions.
- Each cluster node requires manual configuration in order to provision roles, route access, storage connections, application allocations, and/or other configuration parameters that would be apparent to one of skill in the art in possession of the present disclosure.
- Provisioning and management of the configuration parameters for all of the cluster nodes is complex, time-consuming, and prone to errors, and as the cluster system increases in size, the difficulty of configuring, managing, and maintaining the cluster system increases exponentially.
- Cluster systems may include a deployment server that is allocated to function as the deployment control point for each cluster node within the cluster system, with the deployment server deploying applications, services, and data to the other cluster nodes and providing lifecycle management to the cluster system during its operation.
- Lifecycle management may include operations such as updating firmware and embedded software in the cluster nodes, changing application and Basic Input/Output System (BIOS) settings, installing operating system patches, updates, and upgrades, maintaining run-time environment applications/software, loading a container management system and/or a virtual machine management system, and/or other lifecycle management operations known in the art.
- While the deployment server can provide for the deployment of applications and services to the cluster system, when the cluster node(s) that provide networking resources and connectivity are unavailable to the other cluster nodes, the deployment server is unable to complete the deployment operations and lacks connectivity to a management console. Furthermore, inclusion of the additional deployment server to perform deployment operations utilizes additional rack space and adds additional cost to the cluster system. Further still, the deployment server itself requires lifecycle management, resulting in a "chicken or egg" paradox, as the deployment server cannot manage its own lifecycle without disrupting the operation of the entire cluster system.
- An Information Handling System (IHS) includes a processing system, and a memory system that is coupled to the processing system and that includes instructions that, when executed by the processing system, cause the processing system to provide a cluster deployment and management engine that is configured to: discover each of a plurality of node devices in a cluster system; validate each of the plurality of node devices in the cluster system using a cluster profile; configure each of the plurality of node devices according to the cluster profile; and deploy one or more applications and data to at least one of the node devices included in the plurality of node devices.
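The discover → validate → configure → deploy sequence recited above can be sketched as a small engine operating on in-memory node records. This is a minimal illustrative sketch, not the patented implementation: every class, field, and profile-schema name below is an assumption introduced for the example.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class NodeDevice:
    # Hypothetical record for one discovered cluster node.
    address: str
    node_type: str                      # e.g. "server", "storage", "gpu"
    role: Optional[str] = None
    apps: list = field(default_factory=list)

class ClusterDeploymentEngine:
    """Sketch of the claimed discover/validate/configure/deploy flow."""

    def __init__(self, cluster_profile):
        # cluster_profile maps node types to roles, e.g.
        # {"roles": {"server": "compute", "storage": "data"}}
        self.profile = cluster_profile
        self.nodes = []

    def discover(self, reachable_devices):
        # Stand-in for ARP/DHCP/SNMP discovery: record each reachable device.
        self.nodes = [NodeDevice(d["address"], d["type"]) for d in reachable_devices]
        return self.nodes

    def validate(self):
        # Every discovered node type must be one the cluster profile allows.
        allowed = set(self.profile["roles"])
        return all(n.node_type in allowed for n in self.nodes)

    def configure(self):
        # Provision each node with the role the profile assigns to its type.
        for n in self.nodes:
            n.role = self.profile["roles"][n.node_type]

    def deploy(self, applications):
        # Push each application to every node whose role it targets.
        for app, target_role in applications.items():
            for n in self.nodes:
                if n.role == target_role:
                    n.apps.append(app)
```

A caller would construct the engine with a profile, discover nodes, validate, then configure and deploy in order; real discovery and deployment would of course talk to the network rather than to dicts.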
- FIG. 1 is a schematic view illustrating an embodiment of an Information Handling System (IHS).
- FIG. 2 is a schematic view illustrating an embodiment of a cluster deployment and management system.
- FIG. 3 is a schematic view illustrating a networking device that may be provided in the cluster deployment and management system of FIG. 2 .
- FIG. 4 is a flow chart illustrating an embodiment of a method for deploying and managing a cluster system.
- FIG. 5 is a flow chart illustrating an embodiment of a method for performing lifecycle management on a networking device that deploys and manages a cluster system according to the method of FIG. 4 .
- An information handling system may include any instrumentality or aggregate of instrumentalities operable to compute, calculate, determine, classify, process, transmit, receive, retrieve, originate, switch, store, display, communicate, manifest, detect, record, reproduce, handle, or utilize any form of information, intelligence, or data for business, scientific, control, or other purposes.
- An information handling system may be a personal computer (e.g., desktop or laptop), tablet computer, mobile device (e.g., personal digital assistant (PDA) or smart phone), server (e.g., blade server or rack server), a network storage device, or any other suitable device, and may vary in size, shape, performance, functionality, and price.
- The information handling system may include random access memory (RAM), one or more processing resources such as a central processing unit (CPU) or hardware or software control logic, read-only memory (ROM), and/or other types of nonvolatile memory. Additional components of the information handling system may include one or more disk drives, one or more network ports for communicating with external devices, as well as various input and output (I/O) devices, such as a keyboard, a mouse, a touchscreen, and/or a video display. The information handling system may also include one or more buses operable to transmit communications between the various hardware components.
- IHS 100 includes a processor 102 , which is connected to a bus 104 .
- Bus 104 serves as a connection between processor 102 and other components of IHS 100 .
- An input device 106 is coupled to processor 102 to provide input to processor 102 .
- Examples of input devices may include keyboards, touchscreens, pointing devices such as mice, trackballs, and trackpads, and/or a variety of other input devices known in the art.
- Programs and data are stored on a mass storage device 108 , which is coupled to processor 102 . Examples of mass storage devices may include hard disks, optical disks, magneto-optical disks, solid-state storage devices, and/or a variety of other mass storage devices known in the art.
- IHS 100 further includes a display 110 , which is coupled to processor 102 by a video controller 112 .
- A system memory 114 is coupled to processor 102 to provide the processor with fast storage to facilitate execution of computer programs by processor 102 .
- Examples of system memory may include random access memory (RAM) devices such as dynamic RAM (DRAM), synchronous DRAM (SDRAM), solid state memory devices, and/or a variety of other memory devices known in the art.
- A chassis 116 houses some or all of the components of IHS 100 . It should be understood that other buses and intermediate circuits can be deployed between the components described above and processor 102 to facilitate interconnection between the components and the processor 102 .
- The cluster deployment and management system 200 of the present disclosure may include one or more cluster systems such as the cluster system 202 illustrated in FIG. 2 .
- The cluster system 202 includes a plurality of node devices 202 a , 202 b , 202 c , and up to 202 d .
- Any or all of the node devices 202 a - 202 d may be provided by the IHS 100 discussed above with reference to FIG. 1 , and/or may include some or all of the components of the IHS 100 .
- The cluster system 202 may be provided by a Hyper-Converged Infrastructure (HCI) system, with each of the node devices 202 a - 202 d provided by storage-dense server devices.
- The node devices 202 a - 202 d may be provided by a server device (e.g., a server computing device), a networking device (e.g., a switch, a router, a gateway, etc.), an accelerator device, a Graphics Processing Unit (GPU) device, a storage device (e.g., an array of Solid-State Drives (SSDs), an array of Hard Disk Drives (HDDs), etc.), and/or any other computing device that one of skill in the art in possession of the present disclosure would recognize may provide a cluster node device that is distinct from other cluster node devices in a cluster system.
- Cluster systems and node devices provided in the cluster deployment and management system 200 may include any types of cluster systems and node devices while remaining within the scope of the present disclosure.
- A pair of networking devices 206 and 208 are coupled to each of the node devices 202 a - 202 d included in the cluster system 202 .
- The networking device 206 and/or the networking device 208 may be cluster node devices included in the cluster system 202 .
- Either or both of the networking devices 206 and 208 may be provided by the IHS 100 discussed above with reference to FIG. 1 , and/or may include some or all of the components of the IHS 100 .
- The networking devices 206 and 208 may be provided by Top Of Rack (TOR) switch devices, although other switch devices and/or networking devices may fall within the scope of the present disclosure as well. While a pair of networking devices 206 and 208 are illustrated, one of skill in the art in possession of the present disclosure will recognize that a single networking device may be provided in the cluster deployment and management system 200 , or that more than two networking devices may be provided in the cluster deployment and management system.
- The networking device 206 and the networking device 208 may be coupled to a network 210 (e.g., a Local Area Network (LAN), the Internet, combinations thereof, etc.).
- The illustrated embodiment of the cluster deployment and management system 200 provides an example of a "highly available" edge-based cluster system that utilizes a pair of redundant networking devices 206 and 208 , each of which may operate to ensure network connectivity for the cluster system 202 in the event of the failure or unavailability of the other networking device.
- The networking devices 206 and 208 may be associated with a data plane in which the networking devices 206 and 208 essentially operate as a single switch device. Further still, the networking processing systems (discussed below) in the networking devices 206 and 208 may perform a variety of switch fabric management functionality, as well as any other functionality that would be apparent to one of skill in the art in possession of the present disclosure. While a specific cluster deployment and management system 200 has been illustrated and described, one of skill in the art in possession of the present disclosure will recognize that the cluster deployment and management system of the present disclosure may include a variety of components and component configurations while remaining within the scope of the present disclosure as well.
- A networking device 300 may be the networking device 206 and/or the networking device 208 discussed above with reference to FIG. 2 .
- The networking device 300 may be the IHS 100 discussed above with reference to FIG. 1 , and/or may include some or all of the components of the IHS 100 .
- While the networking device 300 is described as being provided by a networking switch, the networking device 300 may be provided by a router, a gateway, and/or a variety of other networking devices that would be apparent to one of skill in the art in possession of the present disclosure.
- The networking device 300 includes a chassis 302 that houses the components of the networking device 300 , only some of which are illustrated in FIG. 3 .
- The chassis 302 may house a processing system (not illustrated, but which may be provided by the processor 102 discussed above with reference to FIG. 1 ) and a memory system (not illustrated, but which may be provided by the memory 114 discussed above with reference to FIG. 1 ) that includes instructions that, when executed by the processing system, cause the processing system to provide a networking engine 304 that is configured to perform the functionality of the networking engines and/or networking devices discussed below.
- The networking engine 304 includes an operating system 306 and a container runtime engine 308 that are configured to perform the functions of the networking engines, operating systems, container engines, and/or networking devices discussed below.
- The container runtime engine 308 may be provided by a container engine available from Docker®/Docker Swarm® (currently available at http://www.docker.com), Collinser®, Windows Server 2016 Containers, and/or other container APIs known in the art.
- The container 310 generated by the container runtime engine 308 may be provided by an isolated user-space virtualization instance that runs on top of the operating system 306 , and may be provisioned from a container image which specifies one or more prerequisites that a container requires to process a job for which the container is being provisioned.
- The container 310 may be configured with an agent such as a cluster deployment and management engine 310 a that is configured to perform the functions of the cluster deployment and management engines and/or the networking devices discussed below.
- The cluster deployment and management engine 310 a may be provided by a third party or may include third-party code.
- The use of a container 310 to deploy the cluster deployment and management engine 310 a keeps the operations of the cluster deployment and management engine 310 a separate from the instructions used by the networking engine 304 for networking operations, and makes it possible to update or replace the cluster deployment and management engine 310 a without impacting the networking operations of the networking device.
- While the cluster deployment and management engine 310 a is illustrated as being provided in a container environment (e.g., the container 310 ), one of skill in the art in possession of the present disclosure will recognize that the cluster deployment and management engine 310 a may be a module that is provided by the networking engine 304 , or may be provided via its own distinct operations that are separate from the networking engine 304 .
- The networking engine 304 may be provided by a networking processing system (e.g., a Networking Processing Unit (NPU)) in the networking device 300 that is configured to transmit data traffic between the network 210 and the node devices 202 a - 202 d in the cluster system 202 , discussed above with reference to FIG. 2 , using a variety of data traffic network transmission techniques that would be apparent to one of skill in the art in possession of the present disclosure.
- The operating system 306 , the container runtime engine 308 , and/or the cluster deployment and management engine 310 a may be provided by a central processing system (e.g., a Central Processing Unit (CPU)) in the networking device 300 that is configured to run applications for the networking device 300 .
- The chassis 302 may also house a storage device (not illustrated, but which may be the storage device 108 discussed above with reference to FIG. 1 ) that is coupled to the networking engine 304 (e.g., via a coupling between the storage device and the processing system) and that includes a networking database 312 that is configured to store the rules and/or any other data utilized by the networking engine 304 and/or the cluster deployment and management engine 310 a in order to provide the functionality discussed below.
- The networking database 312 includes a cluster profile repository 312 a that stores one or more cluster profiles, and a cluster service and data repository 312 b that stores cluster data, cluster micro-services, cluster applications, and/or any other information that may be used to perform a variety of deployment functionality that one of skill in the art in possession of the present disclosure would recognize enables the cluster system to service a workload.
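One plausible shape for the cluster profile repository described above is a keyed store of profiles, each recording the node roles, counts, and firmware baselines a cluster must satisfy. The schema, keys, and version-comparison logic below are assumptions for illustration only; the patent does not specify a profile format.

```python
# Hypothetical cluster profile repository: profiles keyed by cluster type.
PROFILE_REPOSITORY = {
    "hci-edge": {
        "min_nodes": 2,
        "required_types": {"server"},
        "firmware_baseline": {"bios": "2.1.0"},
    },
}

def lookup_profile(cluster_type):
    # Stands in for a read from the cluster profile repository 312a.
    profile = PROFILE_REPOSITORY.get(cluster_type)
    if profile is None:
        raise KeyError(f"no cluster profile for {cluster_type!r}")
    return profile

def meets_profile(inventory, profile):
    """Check discovered inventory against a profile.

    inventory: list of {"type": ..., "bios": ...} dicts produced by
    discovery. String comparison of versions is a deliberate
    simplification; real code would parse version numbers."""
    if len(inventory) < profile["min_nodes"]:
        return False
    types_present = {node["type"] for node in inventory}
    if not profile["required_types"] <= types_present:
        return False
    baseline = profile["firmware_baseline"]["bios"]
    return all(node["bios"] >= baseline for node in inventory)
```

Storing profiles as declarative data like this is what lets the same engine validate many different cluster configurations without code changes.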
- The chassis 302 also houses a communication system 314 that is coupled to the networking engine 304 and/or the cluster deployment and management engine 310 a (e.g., via a coupling between the communication system 314 and the processing system), and that may include a network interface controller (NIC), a programmable Smart NIC, a wireless communication subsystem, and/or other communication subsystems known in the art.
- While the cluster profile repository 312 a and the cluster service and data repository 312 b are illustrated as being stored in the networking database 312 that is housed in the chassis 302 , one of skill in the art in possession of the present disclosure will recognize that the cluster profile repository 312 a and/or the cluster service and data repository 312 b may be stored in a storage device that is located outside the chassis 302 and that is accessible to the networking engine 304 and/or the cluster deployment and management engine 310 a through a network (e.g., the network 210 of FIG. 2 ) via the communication system 314 .
- The storage device and communication system 314 may enable the networking engine 304 and/or the cluster deployment and management engine 310 a included in the networking device 300 to access the cluster profile repository 312 a and/or the cluster service and data repository 312 b without having to store the cluster profile repository 312 a and/or the cluster service and data repository 312 b directly on the networking device 300 .
- The networking device 300 may include other components that may be utilized to perform the functionality described below, as well as conventional networking device functionality (e.g., conventional network switch functionality), while remaining within the scope of the present disclosure.
- The systems and methods of the present disclosure may provide a cluster deployment and management application on a cluster node in a cluster system that is provided by a networking device such as a switch device, with the cluster deployment and management application operating to deploy applications, services, and data on other cluster nodes in the cluster system.
- The networking device may be preconfigured and may include a validated operating system, as well as networking connectivity resources for interconnecting the cluster nodes (e.g., servers, accelerators, storage, networking devices, and/or other devices included in the cluster system), and thus may be used and managed subsequent to being powered on.
- The cluster deployment and management application may be executed when the networking device is active and, in some embodiments, the cluster deployment and management application may be provided by a container that is activated or "spun up" on the preconfigured operating system running on the networking device. Upon activation, the cluster deployment and management application may begin cluster node discovery operations that gather inventory information associated with the cluster nodes included in the cluster system in order to determine a cluster configuration and to validate the cluster configuration against a cluster configuration profile. The cluster deployment and management application may then provision the cluster nodes with roles, states, and storage allocations that are specified in the cluster configuration profile, followed by its automatic deployment of the applications, services, and data that are required for the cluster system and cluster nodes to operate.
- A networking device may be provided in a cluster system as a control point for cluster deployment and management, eliminating the requirement of a separate server in the cluster system for the control operations. Furthermore, a boot sequence of the networking device may operate to update a container image prior to the initiation of the cluster deployment and management application in the networking switch, which allows the use of the container to perform lifecycle management on the networking device prior to the cluster deployment and management application gathering inventory information for the cluster nodes included in the cluster system.
- A second networking device may be included in the cluster system for redundancy purposes, and that second networking device may mirror a primary networking device, which allows the secondary networking device to provide the cluster deployment and management application to the cluster system if the primary networking device requires any lifecycle management during the operation of the cluster system, and eliminates the "chicken or egg" paradox discussed above that is present in conventional cluster deployment and management servers.
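The primary/secondary mirroring described above implies a simple election rule: the secondary switch hosts the deployment and management application whenever the primary is unhealthy or undergoing lifecycle management. The sketch below is an assumed illustration of that rule; the status fields are hypothetical, not taken from the patent.

```python
def active_deployment_switch(primary, secondary):
    """Pick which mirrored switch should host the cluster deployment
    and management engine right now.

    Each argument is a dict like:
      {"name": ..., "healthy": bool, "in_lifecycle_mgmt": bool}
    (hypothetical fields). The primary is preferred; the secondary
    takes over only while the primary is unavailable."""
    if primary["healthy"] and not primary["in_lifecycle_mgmt"]:
        return primary["name"]
    if secondary["healthy"] and not secondary["in_lifecycle_mgmt"]:
        return secondary["name"]
    raise RuntimeError("no switch available to host the engine")
```

Because the secondary mirrors the primary's repositories, handing the role over requires no data migration, which is what breaks the "chicken or egg" paradox.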
- The method 400 begins at block 402 where a first networking device that is coupled to one or more cluster nodes in a cluster system is initialized.
- The networking device 206 / 300 may initialize when power is provided to the networking device 206 / 300 .
- The networking device 206 / 300 may be preconfigured and may include a validated operating system and, during the initialization of the networking device 206 / 300 , a Basic Input/Output System (BIOS) (not illustrated) in the networking device 206 / 300 may perform a boot sequence.
- The boot sequence may update any container images, such as the container image for the container 310 that runs the cluster deployment and management engine 310 a , which as discussed above may be stored in the networking database 312 or accessible via the network 210 .
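The boot-time image update described above can be modeled as a digest comparison against a registry: pull the container image only when the locally cached copy is stale. Registry and digest handling here are simulated with plain dicts; a real switch would verify and pull over the network.

```python
def update_container_image(local_images, registry, image_name):
    """Refresh a cached container image during the boot sequence.

    local_images: {name: digest} cache on the switch (simulated).
    registry:     {name: digest} source of truth (simulated).
    Returns True when the cached image was updated, False when it
    was already current and the engine can start immediately."""
    latest = registry[image_name]
    if local_images.get(image_name) == latest:
        return False
    # Stands in for a pull-and-verify step against the registry.
    local_images[image_name] = latest
    return True
```

Doing this check before the engine container starts is what lets the boot sequence apply lifecycle updates to the engine itself without a separate management server.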
- The method 400 then proceeds to block 404 where the first networking device initializes a cluster deployment and management engine.
- The networking device 206 / 300 may initialize the cluster deployment and management engine 310 a .
- The container runtime engine 308 running on the operating system 306 may generate a container 310 that includes the cluster deployment and management engine 310 a from the container image stored in the networking database 312 .
- The cluster deployment and management engine 310 a may perform lifecycle management operations on the networking device 206 / 300 that may include any firmware updates, BIOS updates, operating system updates, and/or any other lifecycle management operations that would be apparent to one of skill in the art in possession of the present disclosure.
- The method 400 then proceeds to block 406 where the first networking device discovers each of the plurality of node devices in the cluster system in order to obtain cluster inventory information.
- The cluster deployment and management engine 310 a may perform cluster node discovery operations.
- The cluster deployment and management engine 310 a may utilize Address Resolution Protocol (ARP), Dynamic Host Configuration Protocol (DHCP), Simple Network Management Protocol (SNMP), UDP-based Data Transfer Protocol (UDT), and/or other discovery/communication protocols that would be apparent to one of skill in the art to discover the node devices 202 a - 202 d and/or the networking device 208 included in the cluster system 202 .
- the cluster deployment and management engine 310 a may simply query the networking engine 304 to enumerate devices that are attached to each of its ports.
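- The per-port discovery described above can be sketched as follows. This is a minimal illustrative sketch, not code from the disclosure: the `PortEntry` structure and `discover_attached_nodes` function are hypothetical names, and a real switch would populate the port table from its ARP/MAC tables or SNMP queries.

```python
# Hypothetical sketch of per-port cluster node discovery; all names here are
# illustrative assumptions, not part of the disclosure.
from dataclasses import dataclass


@dataclass
class PortEntry:
    port: str
    mac_address: str
    node_type: str  # e.g. "server", "storage", "gpu"


def discover_attached_nodes(port_table):
    """Group discovered MAC addresses by node type, one entry per switch port."""
    inventory = {}
    for entry in port_table:
        inventory.setdefault(entry.node_type, []).append(entry.mac_address)
    return inventory


# Example port table as the networking engine might report it.
ports = [
    PortEntry("eth1/1", "aa:bb:cc:00:00:01", "server"),
    PortEntry("eth1/2", "aa:bb:cc:00:00:02", "server"),
    PortEntry("eth1/3", "aa:bb:cc:00:00:03", "storage"),
]
inventory = discover_attached_nodes(ports)
```

The resulting mapping of node types to discovered devices is one possible shape for the cluster inventory information discussed below.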
- the cluster node discovery may be accomplished via an Intelligent Platform Management Interface (IPMI), a Remote Access Controller (RAC) (e.g., an Integrated Dell Remote Access Controller (iDRAC) or a Baseboard Management Controller (BMC)), and/or by introspection tools.
- the cluster deployment and management engine 310 a on the networking device 206 / 300 may discover the node devices 202 a - 202 d and/or the networking device 208 included in the cluster system 202 .
- the performance of the cluster node discovery operations may include the cluster deployment and management engine 310 a generating inventory information about the cluster system 202. That inventory information may include a node device type for each of the node devices 202 a-202 d and/or the networking device 208 (e.g., a server device, a networking device, a storage device, a GPU, an accelerator device, and/or other devices known in the art), the capabilities of each of the node devices 202 a-202 d and/or the networking device 208, a topology of the node devices 202 a-202 d and/or the networking device 208, the order in which Network Interface Controllers (NICs) are configured for remote booting of each server device, and/or any other node device information and cluster system information that would be apparent to one of skill in the art in possession of the present disclosure.
- during the performance of the node discovery operations, the node devices 202 a-202 d in the cluster system 202 may be introspected using a discovery protocol in order to enumerate the configuration of firmware and components in those node devices.
- the information that is gleaned from node device introspections may be used to determine any change of state that must be established to declare that the node device is ready for the next state transition operation to proceed.
- the method 400 then proceeds to block 408 where the first networking device determines whether the inventory information for the cluster system and a cluster profile indicate that the cluster system is valid.
- the cluster deployment and management engine 310 a may validate the inventory information for the cluster system 202 with a cluster profile that is stored in the cluster profile repository 312 a .
- the cluster deployment and management engine 310 a may compare the inventory information obtained in block 406 to inventory information stored in each cluster profile in the cluster profile repository 312 a .
- in order for the inventory information for the cluster system 202 to be validated, it must match (or substantially match by, for example, satisfying a predetermined condition of similarity with) the inventory information included in a cluster profile. If the inventory information for the cluster system 202 does not match the inventory information in any of the cluster profiles in the cluster profile repository 312 a , the cluster deployment and management engine 310 a may invalidate the cluster system 202 , and a notification may be sent by the cluster deployment and management engine 310 a to an administrator via the network 210 .
- the cluster deployment and management engine 310 a may select a cluster profile from the cluster profile repository 312 a that is the most similar to the inventory information of the cluster system 202 , or may build a cluster profile based on a master cluster profile stored in the cluster profile repository 312 a and convergence rules provided in that master cluster profile.
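- The validation and best-match selection described above can be sketched as follows. This is an illustrative sketch only: the similarity metric (fraction of matched profile entries) and the 0.8 threshold are assumptions standing in for whatever "predetermined condition of similarity" an implementation would define.

```python
# Hedged sketch of cluster profile validation; the scoring rule and threshold
# are assumptions, not values taken from the disclosure.
def similarity(inventory, profile):
    """Fraction of profile entries that the discovered inventory satisfies."""
    if not profile:
        return 0.0
    matched = sum(1 for key, value in profile.items() if inventory.get(key) == value)
    return matched / len(profile)


def select_profile(inventory, profiles, threshold=0.8):
    """Return the most similar profile if it satisfies the similarity
    condition; otherwise return None (cluster system is invalidated and an
    administrator may be notified)."""
    best = max(profiles, key=lambda p: similarity(inventory, p), default=None)
    if best is not None and similarity(inventory, best) >= threshold:
        return best
    return None


profiles = [
    {"servers": 4, "switches": 2},
    {"servers": 4, "switches": 2, "gpus": 1},
]
chosen = select_profile({"servers": 4, "switches": 2}, profiles)
```

An exact match scores 1.0 and is selected; an inventory that matches no profile closely enough yields None, corresponding to the invalidation path described above.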
- the method 400 then proceeds to block 410 where the first networking device configures each of the plurality of node devices included in the cluster system according to the cluster profile.
- the cluster deployment and management engine 310 a may retrieve a cluster configuration from the cluster profile that was used to validate the cluster system 202 in block 408 .
- the cluster configuration may include configurations for the node devices 202 a-202 d and/or the networking device 208 , and the cluster deployment and management engine 310 a may configure the node devices 202 a-202 d and/or the networking device 208 using the cluster configuration.
- the cluster deployment and management engine 310 a may assign roles and services to the node devices 202 a - 202 d defined in the cluster configuration.
- the cluster deployment and management engine 310 a may assign a switch device as a TOR switch, a leaf-spine switch, or as a core switch.
- the cluster deployment and management engine 310 a may assign a server device as a control plane device, as a compute node, as a storage node, or as a Hyper-Converged Infrastructure (HCI) node.
- a NIC may be assigned to function as a leaf switch or as a network connection for storage or a GPU.
- the node devices may further be assigned sub-functional roles as required during initial deployment of the cluster system 202 , during initialization of the cluster system 202 , and/or as part of a persistent or temporal role necessary for part or all of the service life of the cluster system 202 .
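- The role assignment described above can be sketched as a simple mapping from the cluster configuration. This is a hypothetical illustration: the node identifiers, role names, and the "compute" default are assumptions chosen to mirror the roles named in the text.

```python
# Illustrative sketch of assigning roles from a cluster configuration; the
# mapping structure and default role are assumptions.
def assign_roles(node_ids, role_map, default_role="compute"):
    """Map each discovered node device to the role defined in the cluster
    configuration (e.g., control-plane, compute, storage, or HCI node),
    falling back to an assumed default role for unlisted nodes."""
    return {node_id: role_map.get(node_id, default_role) for node_id in node_ids}


config_roles = {"202a": "control-plane", "202b": "storage"}
roles = assign_roles(["202a", "202b", "202c"], config_roles)
```

A real engine would additionally push the role-specific configuration (services, storage allocations, NIC boot order) to each node after this mapping is established.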
- the cluster deployment and management engine 310 a may allocate storage resources included in the node devices 202 a - 202 d in at least one of the node devices 202 a - 202 d , with the storage resources assigned based on applications and workloads that are to be run on the cluster system 202 .
- the cluster deployment and management engine 310 a may deploy a container infrastructure on at least one of the node devices 202 a - 202 d .
- container engines and/or virtual machine engines that are configured to provide containers and/or virtual machines, respectively, for the various applications that are to operate on the cluster system 202 may be deployed on the various node devices 202 a-202 d (e.g., the servers, GPUs, accelerators, and/or other devices).
- while specific cluster configurations are described, one of skill in the art in possession of the present disclosure will recognize that other cluster configurations may be applied to the cluster system 202 while remaining within the scope of the present disclosure as well.
- the method 400 then proceeds to block 412 where the first networking device deploys one or more applications and data to at least one of the node devices included in the plurality of node devices.
- the cluster deployment and management engine 310 a may deploy one or more applications and data to the node devices 202 a-202 d and/or the networking device 208 , and those applications and/or data may be obtained from the cluster service and data repository 312 b .
- the cluster deployment and management engine 310 a may access the cluster service and data repository 312 b to obtain micro-service functions, application functions, data for those micro-service functions and application functions, and/or any other data and applications that would be apparent to one of skill in the art in possession of the present disclosure.
- the networking device 206 / 300 may provide a control point for the node devices 202 a - 202 d when deploying applications, services, and/or data.
- the cluster service and data repository 312 b may be provided on the networking database 312 housed in the networking device 300 and/or connected to the networking engine 304 via a local connection and/or the network 210 .
- any virtual machine and/or container that hosts the applications and/or services may be deployed on the container infrastructure as well, and upon completion of block 412 , the cluster system 202 may be operational such that it is running the services and applications on the cluster system 202 .
- the method 400 then proceeds to block 414 where the first networking device performs lifecycle management operations on at least one of the node devices.
- the cluster deployment and management engine 310 a may perform any of a variety of lifecycle management operations on the node devices 202 a - 202 d and/or the networking device 208 .
- the cluster deployment and management engine 310 a on the networking device 206 / 300 may also perform lifecycle management operations upon itself.
- the cluster deployment and management engine 310 a on the networking device 206 may perform lifecycle management operations including, for example, the updating of firmware and embedded software on the node devices 202 a-202 d and/or the networking device 208 , the changing of application and Basic Input/Output System (BIOS) settings on the node devices 202 a-202 d and/or the networking device 208 , the installation of operating system patches, updates, and/or upgrades on the node devices 202 a-202 d and/or the networking device 208 , the maintenance of run-time environment applications/software on the node devices 202 a-202 d and/or the networking device 208 , the installation and loading of a container management system and/or a virtual machine management system on the cluster system 202 , and the configuration of the switch device for any overlay required by the clustering platform to be deployed on other node devices (e.g., setting up VLANs that the cluster management and deployment engine 310
- the method 500 begins at block 502 where the first networking device identifies a lifecycle management operation that is required for the first networking device.
- the cluster deployment and management engine 310 a on the networking device 206 / 300 may receive a lifecycle management operation for the networking device 206 / 300 .
- the networking device 206 / 300 may receive the lifecycle management operation via the network 210 .
- in conventional cluster systems where a server device included in the cluster system provides lifecycle management functions and cluster deployment, such lifecycle management operations would require that the server device restart or shut down, which in turn requires that the entire cluster system restart and be reconfigured.
- the server device may require a firmware update, reconfiguration of firmware or BIOS settings, redeployment of hosted operating system components, rebuilding of hosted application containers or components, and/or simple redeployment of services that the server device provides within the cluster framework.
- the method 500 proceeds to block 504 where the first networking device passes control of the cluster deployment and management to a second networking device.
- the cluster deployment and management engine 310 a on the networking device 206 / 300 may pass control to the cluster deployment and management engine 310 a on the networking device 208 / 300 .
- cluster systems such as the cluster system 202 often require redundant networking devices to maintain connectivity to a network such as the network 210 in the event that a primary networking device fails.
- the networking device 206 and the networking device 208 may perform election operations to elect the networking device 206 as a primary networking device such that the networking device 208 is designated as a secondary networking device.
- the election of the primary networking device may include the selection of the networking device 206 / 300 as the networking device to handle the cluster deployment and management engine 310 a .
- the selection of the networking device 206 / 300 may have been auto-negotiated between the networking device 206 and 208 using an intelligent algorithm that assures that only one of them will own this role for the duration of a deployment stream.
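- One way to guarantee that exactly one device owns the role, as described above, is a deterministic election rule that both devices evaluate independently. The following sketch is an assumption: the disclosure only requires that the negotiation yields a single owner for the duration of a deployment stream, and using the lowest device identifier is just one such rule.

```python
# Minimal sketch of deterministic primary election between switch devices;
# "lowest identifier wins" is an assumed tie-breaking rule, not taken from
# the disclosure.
def elect_primary(device_ids):
    """Every device runs the same rule over the same candidate set, so all
    devices independently agree on a single primary."""
    return min(device_ids)


primary = elect_primary(["switch-208", "switch-206"])
```

Because both switch devices compute the same function over the same inputs, no further coordination is needed to avoid two devices claiming the cluster deployment and management role simultaneously.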
- the networking devices 206 and 208 may be aggregated to form Link Aggregation Groups (LAGs), as well as virtualized as a virtual networking device that the other node devices 202 a-202 d included in the cluster system 202 recognize as a single networking device provided by the aggregated networking devices 206 and 208 .
- the aggregation of networking devices or the provisioning of redundant networking devices also requires synchronization of the networking devices such that, if the primary networking device fails or otherwise becomes unavailable, the secondary networking device can resume operations for the primary networking device without disruption to network connectivity and services.
- the networking device 206 and the networking device 208 may perform synchronization operations via their respective networking engines 304 , and those synchronization operations may cause the networking device 208 / 300 to deploy the cluster deployment and management engine 310 a as it is deployed on the networking device 206 / 300 such that the cluster deployment and management engine 310 a remains available should the networking device 206 / 300 become unavailable or require a lifecycle management operation.
- the cluster deployment and management engine 310 a on the networking device 206 / 300 may signal to the cluster deployment and management engine 310 a on the networking device 208 / 300 to take control of cluster deployment and management.
- the signal provided by the networking device 206 / 300 may include a notification sent to the networking device 208 / 300 , or a lack of signal (e.g., a lack of a heartbeat message) when the networking device 206 shuts down or otherwise becomes unavailable.
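- The "lack of a heartbeat message" takeover condition described above can be sketched as follows. The interval and missed-heartbeat limit are assumed policy values for illustration; the disclosure does not specify them.

```python
# Hypothetical heartbeat check run on the secondary switch device; the
# 1-second interval and 3-missed-heartbeat limit are assumptions.
def should_take_over(last_heartbeat, now, interval=1.0, missed_limit=3):
    """The secondary takes control of cluster deployment and management when
    the primary's heartbeat has been absent for more than `missed_limit`
    intervals (i.e., a lack-of-signal indication that the primary has shut
    down or otherwise become unavailable)."""
    return (now - last_heartbeat) > interval * missed_limit
```

An explicit notification from the primary (the other signal described above) would simply bypass this check and trigger the takeover directly.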
- the networking engine 304 on the networking device 208 / 300 may then operate to take over the primary networking device role for the networking engine 304 on the networking device 206 / 300 , and subsequently provide the network functionality for the cluster system 202 .
- the cluster deployment and management engine 310 a on the networking device 208 / 300 may take control of the lifecycle management and the cluster deployment for the cluster system 202 .
- the method 500 then proceeds to block 506 where the lifecycle management operations are performed on the first networking device.
- the lifecycle management operations may be performed on the networking device 206 / 300 .
- the cluster deployment and management engine 310 a on the networking device 208 / 300 may take control of the lifecycle management for the cluster system 202 that was previously managed by the cluster deployment and management engine 310 a on the networking device 206 / 300 , and the cluster deployment and management engine 310 a on the networking device 208 / 300 may then assist in the performance of the lifecycle management operations on the networking device 206 / 300 while the networking device 206 / 300 is being updated with any lifecycle management operations.
- the method 500 then proceeds to block 508 where the first networking device synchronizes cluster deployment and management engine data with the second networking device.
- the networking engine 304 on the networking device 206 / 300 may synchronize with the networking engine 304 on the networking device 208 / 300 after the lifecycle management operations are performed on the networking device 206 / 300 .
- the synchronization may include synchronizing cluster deployment and management engine data between the networking device 206 / 300 and the networking device 208 / 300 .
- the networking engine 304 on the networking device 208 / 300 may provide any cluster deployment and management engine data for the cluster deployment and management engine 310 a to the networking engine 304 on the networking device 206 / 300 so that the cluster deployment and management engine 310 a on the networking device 208 / 300 mirrors the cluster deployment and management engine 310 a on the networking device 206 / 300 .
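- A simple way to confirm that the two engines mirror each other, as described above, is to compare digests of their state. This is an illustrative assumption about the mechanism; the disclosure does not specify how mirroring is verified.

```python
# Illustrative mirroring check between primary and secondary engine data;
# hashing serialized state is an assumed verification mechanism.
import hashlib
import json


def state_digest(engine_data):
    """Stable digest of cluster deployment and management engine data
    (sort_keys makes the digest independent of dict ordering)."""
    payload = json.dumps(engine_data, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


def is_mirrored(primary_data, secondary_data):
    """True when the secondary's engine data mirrors the primary's."""
    return state_digest(primary_data) == state_digest(secondary_data)
```

After synchronization, matching digests indicate that control can safely revert (or remain with either device) without loss of cluster deployment and management state.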
- control of the cluster system 202 may revert to the cluster deployment and management engine 310 a on the networking device 206 / 300 once the networking device 206 / 300 has completed the lifecycle management operations.
- the networking device 206 / 300 may be designated as the secondary networking device while the networking device 208 / 300 remains the primary networking device and in control of the deployment and management of the cluster system 202 .
- the cluster deployment and management engine may be initialized on a container on the switch device, and may provide for the discovery of node devices in the cluster system, the validation of the discovered node devices, the configuration of the node devices including the assignment and deployment of roles, services, and allocation of storage to the roles and services, the deployment of applications on a container and/or virtual machine infrastructure, and/or a variety of lifecycle management operations known in the art.
- the cluster deployment and management engine may also configure and perform lifecycle management operations for the switch device prior to the cluster deployment and management application configuring the cluster system.
- the cluster system may also include a redundant switch device that synchronizes with the “primary” switch device that provides the cluster deployment and management application such that the redundant switch device may control networking functionality and cluster deployment and management functionality in the event lifecycle management operations are performed on the primary switch device.
- the systems and methods of the present disclosure eliminate a need for a separate server device that performs cluster deployment and management, as is required in conventional cluster systems.
- the systems and methods of the present disclosure allow for the performance of lifecycle management operations on a primary switch device on which the cluster deployment and management application is provided, resulting in the cluster system experiencing little to no downtime during primary switch device lifecycle management operations.
- network connectivity for the cluster system is provided when the switch device is initialized, which allows for remote management of the cluster deployment and management application when the cluster deployment and management application becomes available.
Description
- The present disclosure relates generally to information handling systems, and more particularly to deployment and lifecycle management of a cluster of information handling systems.
- As the value and use of information continues to increase, individuals and businesses seek additional ways to process and store information. One option available to users is information handling systems. An information handling system generally processes, compiles, stores, and/or communicates information or data for business, personal, or other purposes thereby allowing users to take advantage of the value of the information. Because technology and information handling needs and requirements vary between different users or applications, information handling systems may also vary regarding what information is handled, how the information is handled, how much information is processed, stored, or communicated, and how quickly and efficiently the information may be processed, stored, or communicated. The variations in information handling systems allow for information handling systems to be general or configured for a specific user or specific use such as financial transaction processing, airline reservations, enterprise data storage, or global communications. In addition, information handling systems may include a variety of hardware and software components that may be configured to process, store, and communicate information and may include one or more computer systems, data storage systems, and networking systems.
- Information handling systems are sometimes provided via cluster systems that include a plurality of cluster nodes. For example, cluster nodes in a cluster system may include a separate physical server device, a storage device, a networking device, an accelerator device, a Graphical Processing Unit (GPU), and/or the combination of those devices in a Hyper-Converged Infrastructure (HCI) system. As will be appreciated by one of skill in the art, HCI systems provide a software-defined Information Technology (IT) infrastructure that virtualizes elements of conventional “hardware-defined” systems in order to provide virtualized computing (e.g., via a hypervisor), a virtualized Storage Area Network (SAN) (e.g., software-defined storage) and, in some situations, virtualized networking (e.g., storage-defined networking), any or all of which may be provided using commercial “off-the-shelf” server devices.
- Some cluster systems utilize a complex set of cluster nodes in order to run modern, cloud-native, micro-service-based applications (e.g., a container cluster system). These cluster systems may include cluster nodes that provide computational and storage environments for supporting cloud native applications, and each cluster node in the cluster system may require its own set of configuration parameters for performing corresponding processing functions. Currently, each cluster node requires a manual configuration in order to provision roles, route access, storage connections, application allocations, and/or other configuration parameters that would be apparent to one of skill in the art in possession of the present disclosure. As such, provisioning and management of the configuration parameters for all the cluster nodes is complex, time consuming, and potentially prone to errors, and as the cluster system increases in size, the difficulty in configuring, managing, and maintaining the cluster system increases exponentially.
- Furthermore, after the cluster system and its cluster nodes are configured and operational, the deployment of applications and services such as, for example, containerized applications, introduces additional challenges in cluster systems where the alignment of compute resources, storage, and network connectivity is required to ensure the reliability and the performance of the applications and services. Conventional cluster systems may include a deployment server that is allocated to function as the deployment control point for each cluster node within the cluster system, with the deployment server deploying applications, services, and data to the other cluster nodes and providing lifecycle management to the cluster system during its operation. As would be appreciated by one of skill in the art, lifecycle management may include operations such as updating firmware and embedded software in the cluster nodes, changing application and Basic Input/Output System (BIOS) settings, installation of operating system patches, updates, and upgrades, maintenance of run-time environment applications/software, installation, loading of a container management system and/or a virtual machine management system, and/or other lifecycle management operations known in the art.
- However, while the deployment server can provide for the deployment of applications and services to the cluster system, when the cluster node(s) that provide networking resources and connectivity are unavailable to the other cluster nodes, the deployment server is unable to complete the deployment operations, and lacks connectivity to a management console. Furthermore, inclusion of the additional deployment server to perform deployment operations utilizes additional rack-space and adds additional cost to the cluster system. Further still, the deployment server itself requires lifecycle management, resulting in a “chicken or egg” paradox as the deployment server cannot manage its own lifecycle without disrupting the operation of the entire cluster system.
- Accordingly, it would be desirable to provide a cluster deployment and management system that addresses the issues discussed above.
- According to one embodiment, an Information Handling System (IHS) includes a processing system; and a memory system that is coupled to the processing system and that includes instructions that, when executed by the processing system, cause the processing system to provide a cluster deployment and management engine that is configured to: discover each of a plurality of node devices in a cluster system; validate each of the plurality of node devices in the cluster system using a cluster profile; configure each of the plurality of node devices according to the cluster profile; and deploy one or more applications and data to at least one of the node devices included in the plurality of node devices.
- FIG. 1 is a schematic view illustrating an embodiment of an Information Handling System (IHS).
- FIG. 2 is a schematic view illustrating an embodiment of a cluster deployment and management system.
- FIG. 3 is a schematic view illustrating a networking device that may be provided in the cluster deployment and management system of FIG. 2.
- FIG. 4 is a flow chart illustrating an embodiment of a method for deploying and managing a cluster system.
- FIG. 5 is a flow chart illustrating an embodiment of a method for performing lifecycle management on a networking device that deploys and manages a cluster system according to the method of FIG. 4.
- For purposes of this disclosure, an information handling system may include any instrumentality or aggregate of instrumentalities operable to compute, calculate, determine, classify, process, transmit, receive, retrieve, originate, switch, store, display, communicate, manifest, detect, record, reproduce, handle, or utilize any form of information, intelligence, or data for business, scientific, control, or other purposes. For example, an information handling system may be a personal computer (e.g., desktop or laptop), tablet computer, mobile device (e.g., personal digital assistant (PDA) or smart phone), server (e.g., blade server or rack server), a network storage device, or any other suitable device and may vary in size, shape, performance, functionality, and price. The information handling system may include random access memory (RAM), one or more processing resources such as a central processing unit (CPU) or hardware or software control logic, ROM, and/or other types of nonvolatile memory. Additional components of the information handling system may include one or more disk drives, one or more network ports for communicating with external devices as well as various input and output (I/O) devices, such as a keyboard, a mouse, touchscreen and/or a video display. The information handling system may also include one or more buses operable to transmit communications between the various hardware components.
- In one embodiment, IHS 100,
FIG. 1, includes a processor 102, which is connected to a bus 104. Bus 104 serves as a connection between processor 102 and other components of IHS 100. An input device 106 is coupled to processor 102 to provide input to processor 102. Examples of input devices may include keyboards, touchscreens, pointing devices such as mouses, trackballs, and trackpads, and/or a variety of other input devices known in the art. Programs and data are stored on a mass storage device 108, which is coupled to processor 102. Examples of mass storage devices may include hard discs, optical disks, magneto-optical discs, solid-state storage devices, and/or a variety of other mass storage devices known in the art. IHS 100 further includes a display 110, which is coupled to processor 102 by a video controller 112. A system memory 114 is coupled to processor 102 to provide the processor with fast storage to facilitate execution of computer programs by processor 102. Examples of system memory may include random access memory (RAM) devices such as dynamic RAM (DRAM), synchronous DRAM (SDRAM), solid state memory devices, and/or a variety of other memory devices known in the art. In an embodiment, a chassis 116 houses some or all of the components of IHS 100. It should be understood that other buses and intermediate circuits can be deployed between the components described above and processor 102 to facilitate interconnection between the components and the processor 102. - Referring now to
FIG. 2, an embodiment of a cluster deployment and management system 200 is illustrated. As will be appreciated by one of skill in the art in possession of the present disclosure, the cluster deployment and management system 200 of the present disclosure may include one or more cluster systems such as the cluster system 202 illustrated in FIG. 2. In the illustrated embodiment, the cluster system 202 includes a plurality of node devices 202 a, 202 b, 202 c, and 202 d, any or all of which may be provided by the IHS 100 discussed above with reference to FIG. 1, and/or may include some or all of the components of the IHS 100. In some examples, the cluster system 202 may be provided by Hyper-Converged Infrastructure (HCI) systems, with each of the node devices 202 a-202 d provided by storage-dense server devices. However, in other examples, the node devices 202 a-202 d may be provided by a server device (e.g., a server computing device), a networking device (e.g., a switch, a router, a gateway, etc.), an accelerator device, a Graphical Processing Unit (GPU) device, a storage device (e.g., an array of Solid-State Drives (SSDs), an array of Hard Disk Drives (HDDs), etc.), and/or any other computing device that one of skill in the art in possession of the present disclosure would recognize may provide a cluster node device that is distinct from other cluster node devices in a cluster system. However, one of skill in the art in possession of the present disclosure will recognize that cluster systems and node devices provided in the cluster deployment and management system 200 may include any types of cluster systems, devices, and/or applications that may be configured to operate similarly as discussed below. - In the illustrated embodiment, a pair of
networking devices 206 and 208 are coupled to the cluster system 202, and the networking device 206 and/or the networking device 208 may be cluster node devices included in the cluster system 202. In an embodiment, either or both of the networking devices 206 and 208 may be provided by the IHS 100 discussed above with reference to FIG. 1, and/or may include some or all of the components of the IHS 100. For example, the networking devices 206 and 208 may be provided by switch devices, although other networking devices will fall within the scope of the present disclosure as well. Furthermore, while a pair of networking devices is illustrated, a single networking device may be provided in the cluster deployment and management system 200, or more than two networking devices may be provided in the cluster deployment and management system. - As illustrated in
FIG. 2, the networking device 206 and the networking device 208 may be coupled to a network 210 (e.g., a Local Area Network (LAN), the Internet, combinations thereof, etc.). As will be appreciated by one of skill in the art in possession of the present disclosure, the illustrated embodiment of the cluster deployment and management system 200 provides an example of a "highly available" edge-based cluster system that utilizes a pair of redundant networking devices 206 and 208. However, while a specific cluster deployment and management system 200 has been illustrated and described, one of skill in the art in possession of the present disclosure will recognize that the cluster deployment and management system of the present disclosure may include a variety of components and component configurations while remaining within the scope of the present disclosure as well. - Referring now to
FIG. 3, an embodiment of a networking device 300 is illustrated that may be the networking device 206 and/or the networking device 208 discussed above with reference to FIG. 2. As such, the networking device 300 may be the IHS 100 discussed above with reference to FIG. 1, and/or may include some or all of the components of the IHS 100. As discussed above, while the networking device 300 is described as being provided by a networking switch, the networking device 300 may be provided by a router, a gateway, and/or a variety of networking devices that would be apparent to one of skill in the art in possession of the present disclosure. In the illustrated embodiment, the networking device 300 includes a chassis 302 that houses the components of the networking device 300, only some of which are illustrated in FIG. 3. For example, the chassis 302 may house a processing system (not illustrated, but which may be provided by the processor 102 discussed above with reference to FIG. 1) and a memory system (not illustrated, but which may be provided by the memory 114 discussed above with reference to FIG. 1) that includes instructions that, when executed by the processing system, cause the processing system to provide a networking engine 304 that is configured to perform the functionality of the networking engines and/or networking devices discussed below. - In the illustrated embodiment, the
networking engine 304 includes an operating system 306 and a container runtime engine 308 that are configured to perform the functions of the networking engines, operating systems, container engines, and/or networking devices discussed below. In the illustrated example, the container runtime engine 308 (e.g., a container engine available from Docker®/Docker Swarm® (currently available at http://www.docker.com), Rancher®, Windows Server 2016 Containers, and/or other container APIs known in the art) may have generated one or more containers (e.g., the container 310 illustrated in FIG. 3) for the operating system 306. For example, the container 310 generated by the container runtime engine 308 may be provided by isolated user-space virtualization instances that run on top of the operating system 306, and may be provisioned from a container image which specifies one or more prerequisites that a container requires to process a job for which the container is being provisioned. In an embodiment, the container 310 may be configured with an agent such as a cluster deployment and management engine 310 a that is configured to perform the functions of the cluster deployment and management engines and/or the networking devices discussed below. As would be appreciated by one of skill in the art, the cluster deployment and management engine 310 a may be provided by a third party or may include third-party code. The use of a container 310 to deploy the cluster deployment and management engine 310 a keeps the operations of the cluster deployment and management engine 310 a separate from the instructions used by the networking engine 304 for networking operations, and makes it possible to update or replace the cluster deployment and management engine 310 a without impacting the networking operations of the networking device.
However, while the cluster deployment and management engine 310 a is illustrated as being provided in a container environment (e.g., the container 310), one of skill in the art in possession of the present disclosure will recognize that the cluster deployment and management engine 310 a may be a module that is provided by the networking engine 304, or via its own distinct operations that are separate from the networking engine 304. - In a specific example, the
networking engine 304 may be provided by a networking processing system (e.g., a Networking Processing Unit (NPU)) in the networking device 300 that is configured to transmit data traffic between the network 210 and the node devices 202 a-202 d in the cluster system 202, discussed above with reference to FIG. 2, using a variety of data traffic network transmission techniques that would be apparent to one of skill in the art in possession of the present disclosure. In a specific example, the operating system 306, the container runtime engine 308, and/or the cluster deployment and management engine 310 a may be provided by a central processing system (e.g., a Central Processing Unit (CPU)) in the networking device 300 that is configured to run applications for the networking device 300. - The
chassis 302 may also house a storage device (not illustrated, but which may be the storage device 108 discussed above with reference to FIG. 1) that is coupled to the networking engine 304 (e.g., via a coupling between the storage device and the processing system) and that includes a networking database 312 that is configured to store the rules and/or any other data utilized by the networking engine 304 and/or the cluster deployment and management engine 310 a in order to provide the functionality discussed below. In an embodiment, the networking database 312 includes a cluster profile repository 312 a that stores one or more cluster profiles, and the networking database 312 includes a cluster service and data repository 312 b that stores cluster data, cluster micro-services, cluster applications, and/or any other information that may be used to perform a variety of deployment functionality that one of skill in the art in possession of the present disclosure would recognize enables the cluster system to service a workload. - The
chassis 302 also houses the communication system 314 that is coupled to the networking engine 304 and/or the cluster deployment and management engine 310 a (e.g., via a coupling between the communication system 314 and the processing system), and that may include a network interface controller (NIC), a programmable Smart NIC, a wireless communication subsystem, and/or other communication subsystems known in the art. While the cluster profile repository 312 a and the cluster service and data repository 312 b are illustrated as stored in the networking database 312 that is housed in the chassis 302, one of skill in the art in possession of the present disclosure will recognize that the cluster profile repository 312 a and/or the cluster service and data repository 312 b may be stored in a storage device that is located outside the chassis 302 and that is accessible to the networking engine 304 and/or the cluster deployment and management engine 310 a through a network (e.g., the network 210 of FIG. 2) via the communication system 314. As will be appreciated by one of skill in the art in possession of the present disclosure, the storage device and communication system 314 may enable the networking engine 304 and/or the cluster deployment and management engine 310 a included in the networking device 300 to access the cluster profile repository 312 a and/or the cluster service and data repository 312 b without having to store that cluster profile repository 312 a and/or the cluster service and data repository 312 b directly on the networking device 300. However, while specific components of the networking device 300 have been illustrated and described, one of skill in the art in possession of the present disclosure will recognize that other components may be included in the chassis 302 and utilized to perform the functionality described below, as well as conventional networking device functionality (e.g., conventional network switch functionality), while remaining within the scope of the present disclosure.
- Referring now to
FIG. 4, an embodiment of a method 400 for deploying and managing a cluster system is illustrated. As discussed below, the systems and methods of the present disclosure may provide a cluster deployment and management application on a cluster node in a cluster system that is provided by a networking device such as a switch device, with the cluster deployment and management application operating to deploy applications, services, and data on other cluster nodes in the cluster system. The networking device may be preconfigured and may include a validated operating system, as well as networking connectivity resources for interconnecting the cluster nodes (e.g., servers, accelerators, storage, networking devices, and/or other devices included in the cluster system), and thus may be used and managed subsequent to being powered on. The cluster deployment and management application may be executed when the networking device is active and, in some embodiments, the cluster deployment and management application may be provided by a container that is activated or "spun up" on the preconfigured operating system running on the networking device. Upon activation, the cluster deployment and management application may begin cluster node discovery operations that gather inventory information associated with the cluster nodes included in the cluster system in order to determine a cluster configuration and to validate the cluster configuration against a cluster configuration profile. The cluster deployment and management application may then provision the cluster nodes with roles, states, and storage allocations that are specified in the cluster configuration profile, followed by its automatic deployment of applications, services, and data that are required for the cluster system and cluster nodes to operate.
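By way of illustration only (and not as part of the disclosed embodiments), the discover-then-validate flow described above may be sketched as follows. The data shapes, the profile structure, the similarity rule, and all names below are assumptions made for this sketch:

```python
# Hypothetical sketch of the discovery/validation flow described above.
# Inventory and profile shapes, and the similarity rule, are assumptions.

def build_inventory(attached_devices):
    """Discovery step: count discovered node devices by device type."""
    inventory = {}
    for device in attached_devices:
        inventory[device["type"]] = inventory.get(device["type"], 0) + 1
    return inventory

def similarity(inventory, profile):
    """Fraction of a profile's device-type requirements the inventory meets."""
    required = profile["device_types"]
    met = sum(1 for t, n in required.items() if inventory.get(t, 0) >= n)
    return met / len(required)

def validate_cluster(inventory, profiles):
    """Validation step: a full match validates the cluster; otherwise the
    most similar profile is returned as a fallback candidate."""
    best = max(profiles, key=lambda p: similarity(inventory, p))
    return similarity(inventory, best) == 1.0, best

# Hypothetical discovered devices and cluster profiles.
devices = [
    {"name": "node-a", "type": "server"},
    {"name": "node-b", "type": "server"},
    {"name": "node-c", "type": "storage"},
]
profiles = [
    {"name": "edge-small", "device_types": {"server": 2, "storage": 1}},
    {"name": "edge-large", "device_types": {"server": 4, "storage": 2}},
]
inv = build_inventory(devices)
valid, profile = validate_cluster(inv, profiles)
```

In this sketch the three discovered devices satisfy the smaller profile, so the cluster validates against it; an inventory matching no profile would instead return the closest profile as a starting point, loosely mirroring the fallback behavior described in the detailed description.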
- As such, a networking device may be provided in a cluster system as a control point for cluster deployment and management to eliminate the requirement of a separate server in the cluster system for the control operations. Furthermore, a boot sequence of a networking device may operate to update a container image prior to the initiation of the cluster deployment and management application in the networking switch, which allows the use of the container to perform lifecycle management on the networking device prior to the cluster deployment and management application gathering inventory information for the cluster nodes included in the cluster system. Further still, a second networking device may be included in the cluster system for redundancy purposes, and that second networking device may mirror a primary networking device, which allows the secondary networking device to provide the cluster deployment and management application to the cluster system if the primary networking device requires any lifecycle management during the operation of the cluster system, and eliminates the “chicken or egg” paradox discussed above that is present in conventional cluster deployment and management servers.
- The
method 400 begins at block 402 where a first networking device that is coupled to one or more cluster nodes in a cluster system is initialized. In an embodiment, at block 402, the networking device 206/300 may initialize when power is provided to the networking device 206/300. In an embodiment, the networking device 206/300 may be preconfigured and may include a validated operating system and, during the initialization of the networking device 206/300, a Basic Input/Output System (BIOS) (not illustrated) in the networking device 206/300 may perform a boot sequence. In an embodiment, the boot sequence may update any container images, such as the container image for the container 310 that runs the cluster deployment and management engine 310 a, which as discussed above may be stored in the networking database 312 or accessible via the network 210. - The
method 400 then proceeds to block 404 where the first networking device initializes a cluster deployment and management engine. In an embodiment, at block 404, the networking device 206/300 may initialize the cluster deployment and management engine 310 a. For example, during runtime, the container runtime engine 308 running on the operating system 306 may generate a container 310 that includes the cluster deployment and management engine 310 a from the container image stored in the networking database 312. Following the initialization of the cluster deployment and management engine 310 a, the cluster deployment and management engine 310 a may perform lifecycle management operations on the networking device 206/300 that may include any firmware updates, BIOS updates, operating system updates, and/or any other lifecycle management operations that would be apparent to one of skill in the art in possession of the present disclosure. - The
method 400 then proceeds to block 406 where the first networking device discovers each of the plurality of node devices in the cluster system in order to obtain cluster inventory information. In an embodiment, at block 406, the cluster deployment and management engine 310 a may perform cluster node discovery operations. For example, the cluster deployment and management engine 310 a may utilize Address Resolution Protocol (ARP), Dynamic Host Configuration Protocol (DHCP), Simple Network Management Protocol (SNMP), User Datagram Protocol-based Data Transfer Protocol (UDT), and/or other discovery/communication protocols that would be apparent to one of skill in the art to discover the node devices 202 a-202 d and/or the networking device 208 included in the cluster system 202. In other examples, the cluster deployment and management engine 310 a may simply query the networking engine 304 to enumerate devices that are attached to each of its ports. In yet other examples, the cluster node discovery may be accomplished via an Intelligent Platform Management Interface (IPMI), a Remote Access Controller (RAC) (e.g., an Integrated Dell Remote Access Controller (iDRAC) or a Baseboard Management Controller (BMC)), and/or by introspection tools. As a result of the cluster node discovery operations, the cluster deployment and management engine 310 a on the networking device 206/300 may discover the node devices 202 a-202 d and/or the networking device 208 included in the cluster system 202. - In an embodiment, the performance of the cluster node discovery operations may include the cluster deployment and
management engine 310 a generating inventory information about the cluster system 202 that may include a node device type of each of the node devices 202 a-202 d and/or the networking device 208 (e.g., a server device, a networking device, a storage device, a GPU, an accelerator device, and/or other devices known in the art), the capabilities of each of the node devices 202 a-202 d and/or the networking device 208, a topology of the node devices 202 a-202 d and/or the networking device 208, the order of Network Interface Controllers (NICs) for remote booting of each server device, and/or any other node device information and cluster system information that would be apparent to one of skill in the art in possession of the present disclosure. In various embodiments, the node discovery operations may use introspection via a discovery protocol to enumerate the configuration of firmware and components in the node devices 202 a-202 d in the cluster system 202. The information that is gleaned from node device introspections may be used to determine any change of state that must be established to declare that the node device is ready for the next state transition operation to proceed. - The
method 400 then proceeds to block 408 where the first networking device determines whether the inventory information for the cluster system and a cluster profile indicate that the cluster system is valid. In an embodiment, at block 408, the cluster deployment and management engine 310 a may validate the inventory information for the cluster system 202 against a cluster profile that is stored in the cluster profile repository 312 a. For example, at block 408 the cluster deployment and management engine 310 a may compare the inventory information obtained in block 406 to inventory information stored in each cluster profile in the cluster profile repository 312 a. In an embodiment, in order for the inventory information for the cluster system 202 to be validated, the inventory information for the cluster system 202 must match (or substantially match by, for example, satisfying a predetermined condition of similarity with) the inventory information included in a cluster profile. If the inventory information for the cluster system 202 does not match the inventory information in any of the cluster profiles in the cluster profile repository 312 a, the cluster deployment and management engine 310 a may invalidate the cluster system 202, and a notification may be sent by the cluster deployment and management engine 310 a to an administrator via the network 210. However, in some examples in which a match does not occur, the cluster deployment and management engine 310 a may select a cluster profile from the cluster profile repository 312 a that is the most similar to the inventory information of the cluster system 202, or may build a cluster profile based on a master cluster profile stored in the cluster profile repository 312 a and convergence rules provided in that master cluster profile. - The
method 400 then proceeds to block 410 where the first networking device configures each of the plurality of node devices included in the cluster system according to the cluster profile. In an embodiment, at block 410, the cluster deployment and management engine 310 a may retrieve a cluster configuration from the cluster profile that was used to validate the cluster system 202 in block 408. For example, the cluster configuration may include configurations for the node devices 202 a-202 d and/or the networking device 208, and the cluster deployment and management engine 310 a may configure the node devices 202 a-202 d and/or the networking device 208 using the cluster configuration. In a specific example, the cluster deployment and management engine 310 a may assign roles and services to the node devices 202 a-202 d as defined in the cluster configuration. For example, the cluster deployment and management engine 310 a may assign a switch device as a TOR switch, a leaf-spine switch, or a core switch. In other examples, the cluster deployment and management engine 310 a may assign a server device as a control plane device, a compute node, a storage node, or a Host Controller Interface (HCI) node. In yet other examples, a NIC may be assigned to function as a leaf switch or as a network connection for storage or a GPU. In any of these roles, the node devices may further be assigned sub-functional roles as required during initial deployment of the cluster system 202, during initialization of the cluster system 202, and/or as part of a persistent or temporal role necessary for part or all of the service life of the cluster system 202. - In other specific examples of the cluster configuration, the cluster deployment and
management engine 310 a may allocate storage resources included in the node devices 202 a-202 d to at least one of the node devices 202 a-202 d, with the storage resources assigned based on the applications and workloads that are to be run on the cluster system 202. In various embodiments, subsequent to or during the configuration of each of the node devices 202 a-202 d and/or the networking device 208, the cluster deployment and management engine 310 a may deploy a container infrastructure on at least one of the node devices 202 a-202 d. For example, container engines and/or virtual machine engines that are configured to provide containers and/or virtual machines, respectively, for the various applications that are to operate on the cluster system 202 may be deployed on the various node devices 202 a-202 d (e.g., the servers, GPUs, accelerators, and/or other devices). However, while specific cluster configurations are described, one of skill in the art in possession of the present disclosure will recognize that other cluster configurations may be applied to the cluster system 202 while remaining within the scope of the present disclosure as well. - The
method 400 then proceeds to block 412 where the first networking device deploys one or more applications and data to at least one of the node devices included in the plurality of node devices. In an embodiment, at block 412, the cluster deployment and management engine 310 a may deploy one or more applications and data to the node devices 202 a-202 d and/or the networking device 208, and those applications and/or data may be obtained from the cluster service and data repository 312 b. In a specific example, the cluster deployment and management engine 310 a may access the cluster service and data repository 312 b to obtain micro-service functions, application functions, data for those micro-service functions and application functions, and/or any other data and applications that would be apparent to one of skill in the art in possession of the present disclosure. As such, the networking device 206/300 may provide a control point for the node devices 202 a-202 d when deploying applications, services, and/or data. As discussed above, the cluster service and data repository 312 b may be provided on the networking database 312 housed in the networking device 300 and/or connected to the networking engine 304 via a local connection and/or the network 210. Furthermore, during block 412, any virtual machine and/or container that hosts the applications and/or services may be deployed on the container infrastructure as well, and upon completion of block 412, the cluster system 202 may be operational such that it is running the services and applications on the cluster system 202. - The
method 400 then proceeds to block 414 where the first networking device performs lifecycle management operations on at least one of the node devices. In an embodiment, at block 414, the cluster deployment and management engine 310 a may perform any of a variety of lifecycle management operations on the node devices 202 a-202 d and/or the networking device 208. Furthermore, as discussed below with reference to the method 500 of FIG. 5, the cluster deployment and management engine 310 a on the networking device 206/300 may also perform lifecycle management operations upon itself. As discussed above, during operation of the cluster system 202, the cluster deployment and management engine 310 a on the networking device 206 may perform lifecycle management operations including, for example, the updating of firmware and embedded software on the node devices 202 a-202 d and/or the networking device 208, the changing of application and Basic Input/Output System (BIOS) settings on the node devices 202 a-202 d and/or the networking device 208, the installation of operating system patches, updates, and/or upgrades on the node devices 202 a-202 d and/or the networking device 208, the maintenance of run-time environment applications/software on the node devices 202 a-202 d and/or the networking device 208, the installation and loading of a container management system and/or a virtual machine management system on the cluster system 202, the configuration of the switch device for overlay networking as required for the clustering platform to be deployed on other node devices (e.g., setting up VLANs that the cluster deployment and management engine 310 a will use and one or more VLANs that the cluster deployment and management engine 310 a will assign to users of the cluster system 202), and/or other lifecycle management operations that would be apparent to one of skill in the art in possession of the present disclosure.
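Purely as an editorial illustration (not part of the disclosed embodiments), applying an ordered set of lifecycle management operations such as those listed above can be sketched as a simple dispatcher. The operation names and handler functions below are hypothetical; real operations would invoke device-specific management interfaces:

```python
# Hypothetical dispatcher for lifecycle-management operations. Operation
# names and handlers are illustrative stand-ins for firmware, BIOS, and
# operating-system update mechanisms.

def apply_lifecycle_ops(node, operations, handlers):
    """Apply each requested operation to a node in order; unknown
    operations are rejected rather than silently skipped."""
    applied = []
    for op in operations:
        if op not in handlers:
            raise KeyError(f"unsupported lifecycle operation: {op}")
        handlers[op](node)
        applied.append(op)
    return applied

# Illustrative handlers that record their effect on a node's state.
handlers = {
    "firmware-update": lambda n: n.update(firmware="2.1.0"),
    "bios-settings": lambda n: n.update(bios_profile="performance"),
    "os-patch": lambda n: n.update(os_patch_level=42),
}
node = {"name": "node-a", "firmware": "2.0.0"}
applied = apply_lifecycle_ops(node, ["firmware-update", "os-patch"], handlers)
```

Only the requested operations are applied, in the order given, which loosely mirrors a cluster deployment and management engine walking each node device through its required updates.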
In some embodiments, at block 414, the networking device 208 may receive the lifecycle management operations via the network 210 from a management terminal and/or from various third-party providers. - Referring now to
FIG. 5, an embodiment of a method 500 for performing lifecycle management on a first networking device is illustrated. The method 500 begins at block 502 where the first networking device identifies a lifecycle management operation that is required for the first networking device. In an embodiment, at block 502, the cluster deployment and management engine 310 a on the networking device 206/300 may receive a lifecycle management operation for the networking device 206/300. For example, the networking device 206/300 may receive the lifecycle management operation via the network 210. As would be recognized by one of skill in the art in possession of the present disclosure, in conventional cluster systems where a server device included in the cluster system 202 provides lifecycle management functions and cluster deployment, such lifecycle management operations would require that the server device restart or shut down, which in turn requires that the entire cluster system restart and be reconfigured. For example, the server device may require a firmware update, reconfiguration of firmware or BIOS settings, redeployment of hosted operating system components, rebuilding of hosted application containers or components, and/or simple redeployment of services that the server device provides within the cluster framework. - However, in the cluster system of the present disclosure, the
method 500 proceeds to block 504 where the first networking device passes control of the cluster deployment and management to a second networking device. In an embodiment, at block 504, the cluster deployment and management engine 310 a on the networking device 206/300 may pass control to the cluster deployment and management engine 310 a on the networking device 208/300. As discussed above, cluster systems such as the cluster system 202 often require redundant networking devices to maintain connectivity to a network such as the network 210 in the event that a primary networking device fails. As such, during operation, the networking device 206 and the networking device 208 may perform election operations to elect the networking device 206 as a primary networking device such that the networking device 208 is designated as a secondary networking device. The election of the primary networking device may include the selection of the networking device 206/300 as the networking device to handle the cluster deployment and management engine 310 a. However, the selection of the networking device 206/300 may have been auto-negotiated between the networking device 206 and the networking device 208. - The aggregation of networking devices or the provisioning of redundant networking devices also requires synchronization of the networking devices such that, if the primary networking device fails or otherwise becomes unavailable, the secondary networking device can resume operations for the primary networking device without disruption to network connectivity and services. As such, the
networking device 206 and the networking device 208 may perform synchronization operations via their respective networking engines 304, and those synchronization operations may cause the networking device 208/300 to deploy the cluster deployment and management engine 310 a as it is deployed on the networking device 206/300, such that the cluster deployment and management engine 310 a remains available should the networking device 206/300 become unavailable or require a lifecycle management operation. - In an embodiment, at
block 504, when the cluster deployment and management engine 310 a on the networking device 206/300 detects that a lifecycle management operation is required for the networking device 206/300, the cluster deployment and management engine 310 a on the networking device 206/300 may signal to the cluster deployment and management engine 310 a on the networking device 208/300 to take control of cluster deployment and management. For example, the signal provided by the networking device 206/300 may include a notification sent to the networking device 208/300, or a lack of a signal (e.g., a lack of a heartbeat message) when the networking device 206 shuts down or otherwise becomes unavailable. The networking engine 304 on the networking device 208/300 may then operate to take over the primary networking device role from the networking engine 304 on the networking device 206/300, and subsequently provide the network functionality for the cluster system 202. As such, the cluster deployment and management engine 310 a on the networking device 208/300 may take control of the lifecycle management and the cluster deployment for the cluster system 202. - The
method 500 then proceeds to block 506 where the lifecycle management operations are performed on the first networking device. In an embodiment, at block 506, the lifecycle management operations may be performed on the networking device 206/300. For example, the cluster deployment and management engine 310 a on the networking device 208/300 may take control of the lifecycle management for the cluster system 202 that was previously managed by the cluster deployment and management engine 310 a on the networking device 206/300, and the cluster deployment and management engine 310 a on the networking device 208/300 may then assist in the performance of the lifecycle management operations on the networking device 206/300 while the networking device 206/300 is being updated with any lifecycle management operations. - The
method 500 then proceeds to block 508 where the first networking device synchronizes cluster deployment and management engine data with the second networking device. In an embodiment, at block 508, the networking engine 304 on the networking device 206/300 may synchronize with the networking engine 304 on the networking device 208/300 after the lifecycle management operations are performed on the networking device 206/300. For example, the synchronization may include synchronizing cluster deployment and management engine data between the networking device 206/300 and the networking device 208/300. As such, the networking engine 304 on the networking device 208/300 may provide any cluster deployment and management engine data for the cluster deployment and management engine 310 a to the networking engine 304 on the networking device 206/300 so that the cluster deployment and management engine 310 a on the networking device 206/300 mirrors the cluster deployment and management engine 310 a on the networking device 208/300. In various embodiments, control of the cluster system 202 may revert to the cluster deployment and management engine 310 a on the networking device 206/300 once the networking device 206/300 has completed the lifecycle management operations. However, in other embodiments, the networking device 206/300 may be designated as the secondary networking device while the networking device 208/300 remains the primary networking device and in control of the deployment and management of the cluster system 202. - Thus, systems and methods have been described that provide a cluster deployment and management application on a switch device in a cluster system.
The cluster deployment and management engine may be initialized in a container on the switch device, and may provide for the discovery of node devices in the cluster system, the validation of the discovered node devices, the configuration of the node devices (including the assignment and deployment of roles and services, and the allocation of storage to those roles and services), the deployment of applications on a container and/or virtual machine infrastructure, and/or a variety of lifecycle management operations known in the art. The cluster deployment and management engine may also configure and perform lifecycle management operations for the switch device prior to the cluster deployment and management application configuring the cluster system. In many embodiments, the cluster system may also include a redundant switch device that synchronizes with the "primary" switch device that provides the cluster deployment and management application, such that the redundant switch device may control networking functionality and cluster deployment and management functionality in the event lifecycle management operations are performed on the primary switch device. As such, the systems and methods of the present disclosure eliminate the need for a separate server device that performs cluster deployment and management, as is required in conventional cluster systems. Furthermore, by passing control of the lifecycle management operations to a redundant switch device, the systems and methods of the present disclosure allow for the performance of lifecycle management operations on a primary switch device on which the cluster deployment and management application is provided, resulting in the cluster system experiencing little to no downtime during primary switch device lifecycle management operations.
Further still, by providing the cluster deployment and management application on the switch device, network connectivity for the cluster system is provided when the switch device is initialized, which allows for remote management of the cluster deployment and management application as soon as that application becomes available.
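As a final editorial illustration (not part of the disclosed embodiments), the role-assignment step summarized above, in which each discovered node device receives the role its cluster profile specifies, can be sketched as follows. The role names echo the examples in the detailed description; the data shapes are assumptions:

```python
# Illustrative sketch of role assignment from a cluster profile. Role
# names (TOR switch, compute node, storage node) echo the examples in the
# detailed description; the mapping and node shapes are assumptions.

def assign_roles(nodes, roles_by_type):
    """Map each node to the role its device type is given in the profile;
    a device type with no defined role is treated as a configuration error."""
    assignments = {}
    for node in nodes:
        role = roles_by_type.get(node["type"])
        if role is None:
            raise ValueError(f"no role defined for type {node['type']!r}")
        assignments[node["name"]] = role
    return assignments

roles_by_type = {"switch": "tor", "server": "compute", "storage": "storage-node"}
nodes = [
    {"name": "sw-2", "type": "switch"},
    {"name": "node-a", "type": "server"},
    {"name": "node-c", "type": "storage"},
]
assignments = assign_roles(nodes, roles_by_type)
```

The resulting assignment map is the kind of artifact a cluster deployment and management engine could then use when deploying services and allocating storage to each role.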
- Although illustrative embodiments have been shown and described, a wide range of modification, change and substitution is contemplated in the foregoing disclosure and in some instances, some features of the embodiments may be employed without a corresponding use of other features. Accordingly, it is appropriate that the appended claims be construed broadly and in a manner consistent with the scope of the embodiments disclosed herein.
Claims (20)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/929,859 US11201785B1 (en) | 2020-05-26 | 2020-05-26 | Cluster deployment and management system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/929,859 US11201785B1 (en) | 2020-05-26 | 2020-05-26 | Cluster deployment and management system |
Publications (2)
Publication Number | Publication Date |
---|---|
US20210377117A1 true US20210377117A1 (en) | 2021-12-02 |
US11201785B1 US11201785B1 (en) | 2021-12-14 |
Family
ID=78707008
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/929,859 Active US11201785B1 (en) | 2020-05-26 | 2020-05-26 | Cluster deployment and management system |
Country Status (1)
Country | Link |
---|---|
US (1) | US11201785B1 (en) |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP2391095A1 (en) * | 2010-05-31 | 2011-11-30 | Fluke Corporation | Automatic addressing scheme for 2 wire serial bus interface |
US9137111B2 (en) * | 2012-01-30 | 2015-09-15 | Microsoft Technology Licensing, Llc | Discovering, validating, and configuring hardware-inventory components |
US20140086100A1 (en) * | 2012-09-26 | 2014-03-27 | Avaya, Inc. | Multi-Chassis Cluster Synchronization Using Shortest Path Bridging (SPB) Service Instance Identifier (I-SID) Trees |
US10437510B2 (en) * | 2015-02-03 | 2019-10-08 | Netapp Inc. | Monitoring storage cluster elements |
US10379966B2 (en) * | 2017-11-15 | 2019-08-13 | Zscaler, Inc. | Systems and methods for service replication, validation, and recovery in cloud-based systems |
US11330087B2 (en) * | 2017-11-16 | 2022-05-10 | Intel Corporation | Distributed software-defined industrial systems |
CN109302483B (en) * | 2018-10-17 | 2021-02-02 | 网宿科技股份有限公司 | Application program management method and system |
2020
- 2020-05-26: US application US15/929,859 filed; patented as US11201785B1 (status: Active)
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20220052904A1 (en) * | 2020-08-11 | 2022-02-17 | F5 Networks, Inc. | Managing network ports in a virtualization environment |
US20220236902A1 (en) * | 2021-01-27 | 2022-07-28 | Samsung Electronics Co., Ltd. | Systems and methods for data transfer for computational storage devices |
US20230008011A1 (en) * | 2021-07-06 | 2023-01-12 | Vmware, Inc. | Cluster capacity management for hyper converged infrastructure updates |
US11595321B2 (en) * | 2021-07-06 | 2023-02-28 | Vmware, Inc. | Cluster capacity management for hyper converged infrastructure updates |
WO2023125482A1 (en) * | 2021-12-27 | 2023-07-06 | 华为技术有限公司 | Cluster management method and device, and computing system |
CN115766717A (en) * | 2022-11-02 | 2023-03-07 | 北京志凌海纳科技有限公司 | Automatic deployment method and device for super-fusion distributed system |
CN115941486A (en) * | 2022-11-04 | 2023-04-07 | 苏州浪潮智能科技有限公司 | Cluster management method, system, equipment and storage medium |
CN116760637A (en) * | 2023-08-16 | 2023-09-15 | 中国人民解放军军事科学院系统工程研究院 | High-safety command control system and method based on double-chain architecture |
Also Published As
Publication number | Publication date |
---|---|
US11201785B1 (en) | 2021-12-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11201785B1 (en) | Cluster deployment and management system | |
US11444765B2 (en) | Methods and apparatus to manage credentials in hyper-converged infrastructures | |
JP7391862B2 (en) | AUTOMATICALLY DEPLOYED INFORMATION TECHNOLOGY (IT) SYSTEMS AND METHODS | |
US10986174B1 (en) | Automatic discovery and configuration of server nodes | |
US10348574B2 (en) | Hardware management systems for disaggregated rack architectures in virtual server rack deployments | |
US10097620B2 (en) | Methods and apparatus to provision a workload in a virtual server rack deployment | |
WO2017162173A1 (en) | Method and device for establishing connection of cloud server cluster | |
US8892863B2 (en) | System and method for automated network configuration | |
US11392417B2 (en) | Ultraconverged systems having multiple availability zones | |
US20200106669A1 (en) | Computing node clusters supporting network segmentation | |
CN111198696B (en) | OpenStack large-scale deployment method and system based on bare computer server | |
US8995424B2 (en) | Network infrastructure provisioning with automated channel assignment | |
US11349721B2 (en) | Discovering switch port locations and internet protocol addresses of compute nodes | |
US11595837B2 (en) | Endpoint computing device multi-network slice remediation/productivity system | |
US20230229481A1 (en) | Provisioning dpu management operating systems | |
CN113918174A (en) | Bare metal server deployment method, deployment controller and server cluster | |
US11652786B2 (en) | Network fabric deployment system | |
US20230325203A1 (en) | Provisioning dpu management operating systems using host and dpu boot coordination | |
US11615006B2 (en) | Virtual network life cycle management | |
US20220215001A1 (en) | Replacing dedicated witness node in a stretched cluster with distributed management controllers | |
US11757711B1 (en) | Event notification mechanism for migrating management network of a cluster node of a hyper converged infrastructure (HCI) appliance | |
US11108641B2 (en) | Automatic switching fabric role determination system | |
US20220405171A1 (en) | Automated rollback in virtualized computing environments | |
WO2023141069A1 (en) | Provisioning dpu management operating systems | |
Partitions | ESXi Install |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
FEPP | Fee payment procedure |
Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
AS | Assignment |
Owner name: DELL PRODUCTS L.P., TEXAS
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:KANEVSKY, ARKADY;TERPSTRA, JOHN H.;SANDERS, MARK S.;AND OTHERS;REEL/FRAME:052850/0160
Effective date: 20200424
|
AS | Assignment |
Owner name: CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH, NORTH CAROLINA
Free format text: SECURITY AGREEMENT;ASSIGNORS:DELL PRODUCTS L.P.;EMC IP HOLDING COMPANY LLC;REEL/FRAME:053531/0108
Effective date: 20200818
|
AS | Assignment |
Owner name: THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT, TEXAS
Free format text: SECURITY INTEREST;ASSIGNORS:DELL PRODUCTS L.P.;EMC IP HOLDING COMPANY LLC;REEL/FRAME:053578/0183
Effective date: 20200817

Owner name: THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT, TEXAS
Free format text: SECURITY INTEREST;ASSIGNORS:DELL PRODUCTS L.P.;EMC IP HOLDING COMPANY LLC;REEL/FRAME:053574/0221
Effective date: 20200817

Owner name: THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT, TEXAS
Free format text: SECURITY INTEREST;ASSIGNORS:DELL PRODUCTS L.P.;EMC IP HOLDING COMPANY LLC;REEL/FRAME:053573/0535
Effective date: 20200817
|
AS | Assignment |
Owner name: EMC IP HOLDING COMPANY LLC, TEXAS
Free format text: RELEASE OF SECURITY INTEREST AT REEL 053531 FRAME 0108;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058001/0371
Effective date: 20211101

Owner name: DELL PRODUCTS L.P., TEXAS
Free format text: RELEASE OF SECURITY INTEREST AT REEL 053531 FRAME 0108;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058001/0371
Effective date: 20211101
|
STCF | Information on status: patent grant |
Free format text: PATENTED CASE |
|
AS | Assignment |
Owner name: EMC IP HOLDING COMPANY LLC, TEXAS
Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (053574/0221);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:060333/0001
Effective date: 20220329

Owner name: DELL PRODUCTS L.P., TEXAS
Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (053574/0221);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:060333/0001
Effective date: 20220329

Owner name: EMC IP HOLDING COMPANY LLC, TEXAS
Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (053578/0183);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:060332/0864
Effective date: 20220329

Owner name: DELL PRODUCTS L.P., TEXAS
Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (053578/0183);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:060332/0864
Effective date: 20220329

Owner name: EMC IP HOLDING COMPANY LLC, TEXAS
Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (053573/0535);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:060333/0106
Effective date: 20220329

Owner name: DELL PRODUCTS L.P., TEXAS
Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (053573/0535);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:060333/0106
Effective date: 20220329