CN116974696A - Online migration method and device for stateful Pod in Kubernetes cluster - Google Patents

Publication number
CN116974696A
Authority
CN
China
Prior art keywords
pod
migrated
target node
source node
container
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310954130.8A
Other languages
Chinese (zh)
Inventor
滕颖蕾
满毅
陈佳璇
马仕君
滕俊杰
王思康
钟腾
刘婧媛
张勇
金磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications
Priority to CN202310954130.8A
Publication of CN116974696A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/4557 Distribution of virtual machine instances; Migration and load balancing
    • G06F2009/45595 Network integration; Enabling network access in virtual machine instances

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a method and a device for online migration of a stateful Pod in a Kubernetes cluster. The method comprises the following steps: after obtaining the Pod to be migrated on the source node and the target node, modifying the configuration information of the Pod to be migrated to obtain modified configuration information; creating a Target Pod on the target node and initiating a migration request to a preset Kubelet of the source node, so that the target node obtains the memory image of the Pod to be migrated, transmitted by the source node, after the preset Kubelet of the source node performs a pre-migration operation on the Pod to be migrated; after the target node receives all the memory images of the Pod to be migrated, performing a concurrent shutdown Checkpoint on all containers of the Pod to be migrated on the source node to obtain a container state information image and container read-write layer files; and concurrently restoring and starting the containers of the Pod migrated to the target node, so that the target node obtains the migrated Pod. The invention aims to solve the problem that the existing Kubernetes platform cannot perform online migration of stateful Pods.

Description

Online migration method and device for stateful Pod in Kubernetes cluster
Technical Field
The invention relates to the technical field of software, in particular to a method and a device for online migration of stateful Pod in a Kubernetes cluster.
Background
Kubernetes is an open-source container orchestration platform for the automated deployment, scaling, and management of containerized applications. A Pod is the smallest unit that can be created or deployed in Kubernetes; each Pod contains one Pause container, and all containers in a Pod share the network and data volumes of the Pause container.
The Kubernetes platform has mature management capabilities for stateless applications, but the maintenance of stateful applications is still a weakness. When a physical node in a cluster goes down or needs maintenance, or when a Pod is evicted due to lack of resources, the Pod is usually restarted on another node and its original state is lost. This is quite unfavorable for long-running, stateful workloads such as HPC (High-Performance Computing) applications; in the worst case, hours or days of computed data are lost completely. It is therefore preferable to detect the problem and migrate these stateful applications before the physical node fails, but Kubernetes did not support this function until January 2023, when the Kubernetes community accepted a proposal for container checkpoints. The latest version of Kubernetes now supports an experimental version of the related function, but Kubernetes has not yet provided a proposal for container Restore. Meanwhile, many developers have carried out secondary development on Kubernetes to support their own Pod migration requirements. Furthermore, the removal of dockershim was announced with Kubernetes release 1.24 in 2022, and the present invention is mainly concerned with the integration of Kubernetes and Containerd, with Containerd running as the default container runtime.
The biggest disadvantages of the current official Kubernetes container Checkpoint migration scheme are that its degree of integration is quite low and its operation process is quite tedious; it is not truly online migration and has non-negligible downtime. In addition, the scheme only supports recovery of a Pod by CRI-O, and does not yet support mainstream container runtimes such as Docker and Containerd. Existing technical schemes also do not consider that the shutdown Checkpoint and recovery times of the containers in a multi-container Pod may be inconsistent, which easily causes multi-container collaboration to fail and the Pod migration to fail.
Disclosure of Invention
The invention provides an online migration method of stateful Pod in a Kubernetes cluster, which is used for solving the problem that the existing Kubernetes platform cannot realize online migration operation of the stateful Pod.
The invention provides an online migration method of stateful Pod in a Kubernetes cluster, which comprises the following steps:
after obtaining a Pod to be migrated on a source node and a target node set by a user for the Pod to be migrated, modifying the configuration information of the Pod to be migrated by using a preset migration controller to obtain modified configuration information of the Pod to be migrated;
creating a Target Pod on the target node based on the modified configuration information of the Pod to be migrated;
when the Target Pod is scheduled to the target node, initiating, by the target node, a migration request to a preset Kubelet of the source node, so that after the preset Kubelet of the source node performs a pre-migration operation on the Pod to be migrated, the target node obtains the memory image of the Pod to be migrated transmitted by the source node;
after the target node receives all the memory images of the Pod to be migrated, performing a concurrent shutdown Checkpoint on all containers of the Pod to be migrated on the source node, so that the target node obtains the container state information image and the container read-write layer files of the Pod to be migrated transmitted by the source node;
and concurrently restoring and starting the containers of the Pod migrated to the target node, based on the memory image, the container state information image, the container read-write layer files and the Target Pod obtained by the target node, so that the target node obtains the migrated Pod.
According to the online migration method of the stateful Pod in the Kubernetes cluster provided by the invention, the step of modifying the configuration information of the Pod to be migrated by using the preset migration controller, after obtaining the Pod to be migrated on the source node and the target node set by the user, comprises:
after obtaining the Pod to be migrated on the source node and the target node set by the user for the Pod to be migrated, performing Pod name modification, label addition, and deletion operations on the configuration information of the Pod to be migrated by using the preset migration controller, to obtain the modified configuration information of the Pod to be migrated.
According to the online migration method of the stateful Pod in the Kubernetes cluster provided by the invention, the step in which the target node initiates a migration request to the preset Kubelet of the source node when the Target Pod is scheduled to the target node, and obtains the memory image of the Pod to be migrated after the pre-migration operation, comprises:
when the Target Pod is scheduled to the target node, the preset Kubelet on the target node halts before the Start action once the Create action is finished;
after the Start action has been halted, the target node initiates a migration request to the migration HTTP API endpoint of the preset Kubelet on the source node, so that after receiving the request, the preset Kubelet on the source node calls the HandleTargetKubeletRequest method to send a SyncPodMigrate event to the work queue of the Pod to be migrated;
after the Worker of the Pod to be migrated detects the SyncPodMigrate event, the CheckpointPod method is called in the syncPod of the preset Kubelet on the source node, so that the preset Kubelet of the source node performs the pre-migration operation on the Pod to be migrated and the target node obtains the memory image of the Pod to be migrated transmitted by the source node.
According to the online migration method of the stateful Pod in the Kubernetes cluster provided by the invention, the step of calling the CheckpointPod method in the syncPod of the preset Kubelet on the source node after the Worker of the Pod to be migrated detects the SyncPodMigrate event, so that the source node transmits the memory image of the Pod to be migrated to the target node, comprises:
after the Worker of the Pod to be migrated detects the SyncPodMigrate event, calling the CheckpointPod method in the syncPod of the preset Kubelet on the source node, initializing an sftp client on the source node, and calling Runc to pre-migrate the memory images of all containers in the Pod to be migrated, so that the source node transmits the memory image of the Pod to be migrated to the target node.
According to the online migration method of the stateful Pod in the Kubernetes cluster provided by the invention, the step of performing the concurrent shutdown Checkpoint on all containers of the Pod to be migrated on the source node after the target node receives all the memory images, so that the target node obtains the container state information image and the container read-write layer files transmitted by the source node, comprises:
after the target node receives all the pre-migrated memory images, performing the concurrent shutdown Checkpoint on all containers of the Pod to be migrated on the source node by using the CheckpointPod method in the syncPod;
during the concurrent shutdown Checkpoint, calling the container runtime for each container of the Pod to be migrated through the CheckpointContainer method of the CRI, and saving the result for each container in its own directory, to obtain the storage directory used on the target node;
and transmitting the container state information image and the container read-write layer files from the source node to the storage directory of the target node for storage, so that the target node obtains the container state information image and the container read-write layer files of the Pod to be migrated transmitted by the source node.
According to the online migration method of the stateful Pod in the Kubernetes cluster provided by the invention, the step of concurrently restoring and starting the containers of the Pod migrated to the target node, based on the memory image, the container state information image, the container read-write layer files and the Target Pod obtained by the target node, comprises:
calling the RestoreContainer method of Containerd through the CRI, based on the pre-migrated memory image, the container state information image and the container read-write layer files, and restoring and starting the containers of the Pod migrated to the target node, so that the target node obtains the migrated Pod.
According to the online migration method of the stateful Pod in the Kubernetes cluster provided by the invention, after the RestoreContainer method of Containerd has been called through the CRI and the containers of the Pod migrated to the target node have been restored and started, the method further comprises:
restoring and cleaning up the environment of the migrated Pod by using the preset migration controller, to obtain the target node after environment restoration and cleanup.
According to the online migration method of the stateful Pod in the Kubernetes cluster provided by the invention, the pre-migration operation performed by the preset Kubelet on the Pod to be migrated comprises:
setting an upper limit on the number of pre-copies of the memory image of the Pod to be migrated and, starting from the second iteration, stopping further iterations and ending the pre-migration if the memory dirty pages are smaller than 2 MB or if the change rate of the memory dirty pages is less than 5% for 4 consecutive pre-copies;
when all containers have been pre-migrated, transmitting the pre-migrated memory image files to the migration work directory of the target node.
The invention also provides an online migration device of the stateful Pod in the Kubernetes cluster, which comprises:
the modification module is used for modifying, after obtaining the Pod to be migrated on the source node and the target node set by the user for the Pod to be migrated, the configuration information of the Pod to be migrated by using the preset migration controller, to obtain the modified configuration information of the Pod to be migrated;
the creating module is used for creating a Target Pod on the target node based on the modified configuration information of the Pod to be migrated;
the memory image migration module is used for initiating, by the target node, a migration request to the preset Kubelet of the source node when the Target Pod is scheduled to the target node, so that the target node obtains the memory image of the Pod to be migrated transmitted by the source node after the preset Kubelet of the source node performs the pre-migration operation on the Pod to be migrated;
the transmission module is used for performing the concurrent shutdown Checkpoint on all containers of the Pod to be migrated on the source node after the target node receives all the memory images of the Pod to be migrated, so that the target node obtains the container state information image and the container read-write layer files of the Pod to be migrated transmitted by the source node;
and the recovery module is used for concurrently restoring and starting the containers of the Pod migrated to the target node, based on the memory image, the container state information image, the container read-write layer files and the Target Pod obtained by the target node, so that the target node obtains the migrated Pod.
The invention also provides an electronic device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the online migration method of the stateful Pod in the Kubernetes cluster when executing the program.
According to the method and the device for online migration of the stateful Pod in the Kubernetes cluster provided by the invention, the configuration information of the Pod to be migrated on the source node is modified by using the preset migration controller to obtain the modified configuration information; the target node initiates a migration request to the preset Kubelet of the source node, so that the target node obtains the memory image of the Pod to be migrated transmitted by the source node; after the target node receives all the memory images of the Pod to be migrated, a concurrent shutdown Checkpoint is performed on all containers on the source node, so that the target node obtains the container state information image and the container read-write layer files of the Pod to be migrated; and the containers migrated to the target node are concurrently restored and started, so that the target node obtains the migrated Pod. Because both the shutdown Checkpoint and the restoration of the containers are performed concurrently, the information of the containers is kept synchronized to the greatest extent, which avoids the situation in which some containers have stopped while others are still running and their information falls out of sync. Online migration of stateful Pods in the Kubernetes cluster is thereby realized, and the time taken by the migration is shortened.
Drawings
In order to more clearly illustrate the invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic flow chart of an online migration method of stateful Pod in Kubernetes cluster provided by the invention.
Fig. 2 is a schematic structural diagram of an online migration device for stateful Pod in a Kubernetes cluster provided by the present invention.
Fig. 3 is a schematic structural diagram of an electronic device provided by the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The following describes an online migration method of stateful Pod in Kubernetes cluster according to the present invention with reference to fig. 1, which includes:
S1, after obtaining a Pod to be migrated on a source node and a target node set by a user for the Pod to be migrated, modifying the configuration information of the Pod to be migrated by using a preset migration controller to obtain the modified configuration information of the Pod to be migrated.
S2, creating a Target Pod on the target node based on the modified configuration information of the Pod to be migrated.
Specifically, the preset migration controller is developed based on Kubebuilder, with migration-specific functions and attributes added. The spec in the API configuration of the preset migration controller has a TargetNode field and a MigrationTrigger field; when the former is not null and the latter is set to true, the preset migration controller triggers migration. A Template field is also provided under the spec for configuring the basic information of the Pod. The main fields in the status are MigrationState, sourcePod and TargetPod: the first is used for dynamically monitoring the migration state, and the other two display the name of the Pod to be migrated and the name of the Pod to be restored. When first created, the preset migration controller immediately enters the CreatingSourcePod state and creates a Pod according to the configuration of the Template field; the name of the Pod is set to "<migration controller instance name>-Pod-0", which indicates the migration controller instance to which the Pod belongs and the number of times the Pod has been migrated. When the Pod enters the Running state, the preset migration controller sets its own state to Running and waits for migration to be triggered. If the TargetNode of the preset migration controller is configured as the name of the target node and the MigrationTrigger is configured as true, the migration controller immediately enters the Migrating state; entering this state marks the formal start of migration. First, the migration controller requests that a Pod with nearly the same configuration as the Pod to be migrated, hereinafter referred to as the Target Pod, be cloned on the target node; the Target Pod is not started immediately.
The name of the Target Pod is set by the migration controller to "<migration controller instance name>-Pod-1", and the number at the end of the name is incremented by 1 each time the Pod is migrated. Since the init containers in the Pod do not need to be restored, their configuration is deleted before the Target Pod is created. In addition, the migration controller adds a [CloneSourcePod: sourcePod.Name] label, which informs the preset Kubelet of the target node that the Pod was created in the migration process and that the migration branch must be entered to intercept the normal start flow. The Target Pod entering the Running state indicates that the migration is finished; the migration controller then enters the Migrated state, resets MigrationTrigger to false and TargetNode to null, and deletes both the [CloneSourcePod: sourcePod.Name] label of the Target Pod and the Pod to be migrated on the source node. If the Pod to be migrated has associated resource objects such as a Service, the migration controller rebinds them to the Target Pod. After the running environment has been fully restored, the migration controller enters the Running state and waits for the next migration to start. The purpose of the preset migration controller is to trigger migration and to control the whole migration life cycle.
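The controller life cycle described above can be read as a small state machine. The following is a minimal sketch in Python, not the controller's actual code (which the text says is built with Kubebuilder); the function name `next_state` and its keyword arguments are hypothetical, and only the four state names come from the description.

```python
from enum import Enum


class MigrationState(Enum):
    # State names taken from the description above.
    CREATING_SOURCE_POD = "CreatingSourcePod"
    RUNNING = "Running"
    MIGRATING = "Migrating"
    MIGRATED = "Migrated"


def next_state(state, *, pod_running=False, target_node=None,
               migration_trigger=False, target_pod_running=False,
               cleanup_done=False):
    """Return the state the controller would move to (hypothetical helper).

    The state is returned unchanged when no transition condition holds.
    """
    if state is MigrationState.CREATING_SOURCE_POD and pod_running:
        return MigrationState.RUNNING
    # Migration starts only when TargetNode is set AND MigrationTrigger is true.
    if state is MigrationState.RUNNING and target_node and migration_trigger:
        return MigrationState.MIGRATING
    # The Target Pod reaching Running marks the end of the migration.
    if state is MigrationState.MIGRATING and target_pod_running:
        return MigrationState.MIGRATED
    # After labels, the old Pod, and Services are cleaned up, wait for the next run.
    if state is MigrationState.MIGRATED and cleanup_done:
        return MigrationState.RUNNING
    return state
```

For example, `next_state(MigrationState.RUNNING, target_node="node-b", migration_trigger=True)` yields `MigrationState.MIGRATING`, while an empty `target_node` leaves the controller in `Running`.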
In this embodiment, when the TargetNode of the preset migration controller is configured as the target node name and the MigrationTrigger is configured as true, migration is triggered, the preset migration controller enters the Migrating state, and the configuration information of the Pod to be migrated is processed as follows: the Pod name is changed to "<migration controller instance name>-Pod-[number of completed migrations + 1]", the [CloneSourcePod: sourcePod.Name] label is added, and all init container configuration information is deleted, yielding the modified configuration information of the Pod to be migrated.
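The three modifications in this embodiment (rename, add label, drop init containers) can be illustrated on a Pod manifest held as a plain dictionary. This is a minimal sketch under stated assumptions: the function name `build_target_pod` is hypothetical, the name suffix is written in lowercase (`-pod-N`) because Kubernetes object names must be lowercase, and pinning the clone with `spec.nodeName` is only one possible way to place it on the target node; the text does not say which mechanism is used.

```python
import copy
import re

CLONE_LABEL = "CloneSourcePod"  # label key named in the description


def build_target_pod(source_pod: dict, target_node: str) -> dict:
    """Derive the Target Pod manifest from the Pod to be migrated (sketch)."""
    target = copy.deepcopy(source_pod)
    source_name = source_pod["metadata"]["name"]

    # Names follow "<controller instance>-pod-<migration count>".
    match = re.fullmatch(r"(.+-pod-)(\d+)", source_name)
    if match is None:
        raise ValueError(f"unexpected Pod name: {source_name!r}")
    prefix, count = match.group(1), int(match.group(2))
    target["metadata"]["name"] = f"{prefix}{count + 1}"

    # Mark the clone so the target Kubelet takes the migration branch.
    target["metadata"].setdefault("labels", {})[CLONE_LABEL] = source_name

    # Init containers are not restored, so their configuration is removed.
    target["spec"].pop("initContainers", None)

    # Assumption: the clone is pinned to the target node via nodeName.
    target["spec"]["nodeName"] = target_node
    return target
```

The source manifest is deep-copied, so the Pod to be migrated is left untouched until the controller deletes it after the migration.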
S3, when the Target Pod is scheduled to the target node, the target node initiates a migration request to the preset Kubelet of the source node, so that after the preset Kubelet of the source node performs a pre-migration operation on the Pod to be migrated, the target node obtains the memory image of the Pod to be migrated transmitted by the source node. The target node also obtains the other state files transmitted by the source node, such as the CPU state, CPU information, and the container read-write layer.
Specifically, the invention is based on Kubelet version v1.26.0, and a Migration module is newly added to the Kubelet to obtain the preset Kubelet. When the preset Kubelet starts, a MigrationManager in the Migration module is instantiated; this manager provides much of the information and many of the methods used in the migration process.
The preset Kubelet provides a work queue and a Worker for each Pod. The Worker loops to fetch tasks from the work queue and calls the syncPod method of the Kubelet, in which different processing, such as creating or updating a Pod, is carried out according to the task type. To handle migration work, the invention adds a check for the SyncPodMigrate event to the syncPod method: when a task of this type is found in the work queue, the preset CheckpointPod method is called on the source node. The method first starts pre-migration of all containers in the Pod by invoking the checkpoint command of Runc directly from the preset Kubelet through the CLI, with the --pre-dump parameter. After each pre-copy, the dumped memory image is saved to the "/var/lib/migration/<Pod UID>/<Container name>/<Pre-dump [pre-copy number]>" directory, and the method generates a soft link that points to the last pre-copy folder.
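As an illustration of the pre-copy step, the sketch below builds the Runc command line for one pre-dump and maintains the soft link to the newest pre-copy directory. It assumes the `runc checkpoint` flags `--pre-dump`, `--image-path` and `--parent-path`; the helper names, the `pre-dump<N>` directory naming, and the link name `latest` are hypothetical. The command is only constructed, not executed, and the parent path is given relative to the image directory, which is how CRIU-based tooling commonly expects it.

```python
import os


def predump_dir(root, pod_uid, container, iteration):
    """<root>/<Pod UID>/<Container name>/pre-dump<N> (naming is an assumption)."""
    return os.path.join(root, pod_uid, container, f"pre-dump{iteration}")


def runc_predump_argv(container_id, image_dir, parent_dir=None):
    """Build the argv for one incremental pre-dump; nothing is executed here."""
    argv = ["runc", "checkpoint", "--pre-dump", "--image-path", image_dir]
    if parent_dir is not None:
        # Point at the previous pre-copy so only newly dirtied pages are dumped.
        argv += ["--parent-path", os.path.relpath(parent_dir, image_dir)]
    argv.append(container_id)
    return argv


def update_latest_link(container_dir, image_dir, link_name="latest"):
    """Atomically repoint the soft link at the newest pre-copy directory."""
    link = os.path.join(container_dir, link_name)
    tmp = link + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(os.path.basename(image_dir), tmp)
    os.replace(tmp, link)  # rename over the old link in one step
    return link
```

A caller would pass the returned list to `subprocess.run` after each iteration and then call `update_latest_link`, so that the final full checkpoint can always find the most recent pre-copy.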
In addition, to reduce the waste of network resources and the migration time caused by redundant data transmission during pre-migration, the invention improves the traditional pre-migration policy: it senses the dirty rate of the container memory in real time and dynamically decides the number of incremental pre-copies. Specifically, the upper limit of the number of pre-copies is set to 15; starting from the second iteration, if the memory dirty pages are smaller than 2 MB, or if the change rate of the memory dirty pages is smaller than 5% for 4 consecutive pre-copies, the next iteration is not performed and the pre-migration ends. After all containers have been pre-migrated, the image files generated at this stage are transmitted to the migration work directory of the target node. When these steps are finished, all containers enter the shutdown Checkpoint stage concurrently. Entering the shutdown Checkpoint concurrently keeps the information of all containers in the Pod synchronized to the greatest extent, and avoids service abnormalities in the restored containers on the target node caused by information falling out of sync when some containers have stopped while others are still working. This stage calls the CheckpointContainer method of kubeGenericRuntimeManager for each container, in which the CheckpointContainer method of the CRI interface is called to communicate with the underlying Containerd; after Containerd has dumped the container state as an image, the state files are synchronized to the target node.
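The termination rule for pre-copying can be sketched as a pure function over the dirty-memory sizes observed so far. This is one reading of the rule, not the patented implementation: the function name `should_stop` is hypothetical, the change rate is taken as the relative difference between consecutive pre-copies, and "4 consecutive pre-copies" is interpreted as the last four pre-copies each changing by less than 5% relative to its predecessor.

```python
MAX_PRECOPIES = 15                  # upper limit stated in the text
MIN_DIRTY_BYTES = 2 * 1024 * 1024   # 2 MB threshold
MIN_CHANGE_RATE = 0.05              # 5% threshold
STABLE_ROUNDS = 4                   # consecutive pre-copies that must be stable


def should_stop(dirty_history):
    """Decide whether pre-copying should end.

    dirty_history[i] is the number of dirty bytes dumped by pre-copy i + 1.
    """
    rounds = len(dirty_history)
    if rounds >= MAX_PRECOPIES:
        return True
    if rounds < 2:
        return False  # the rule applies from the second iteration onward
    if dirty_history[-1] < MIN_DIRTY_BYTES:
        return True   # little memory is still changing
    if rounds < STABLE_ROUNDS + 1:
        return False  # not enough history to judge stability
    recent = dirty_history[-(STABLE_ROUNDS + 1):]
    rates = [abs(cur - prev) / prev if prev else 0.0
             for prev, cur in zip(recent, recent[1:])]
    # A flat dirty-page curve means further pre-copies would not help.
    return all(rate < MIN_CHANGE_RATE for rate in rates)
```

Stopping on a flat curve reflects the idea that once the dirty set stops shrinking, each additional pre-copy retransmits roughly the same pages without reducing the final downtime.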
In addition, because of the layered architecture of the container file system, only the read-write layer of a container changes while the container runs. To avoid transmitting unnecessary container image layers (also called read-only layers) during migration, this process also locates the read-write layer files of all containers in the Pod to be migrated by acquiring the Mount information of the OverlayFS on the source node, and packages and transmits them to the target node. When these steps are finished, the Pod to be migrated may already have entered an Error state because its containers have been stopped; the status manager of the preset Kubelet is then used to put the Pod into the Terminated state, where it waits to be deleted.
The invention also adds a migration HTTP API endpoint to the preset Kubelet, which is used for communication between the source node and the target node during migration. When the migration HTTP API endpoint of the source node receives a request, a dedicated method in the Migration module, HandleTargetKubeletRequest, processes the migration request.
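Locating the read-write layer from mount information can be illustrated as follows. The sketch parses text in the format of `/proc/mounts` and returns the `upperdir` option of every overlay mount, which is the writable layer in OverlayFS. The function name `overlay_upperdirs` is hypothetical, and mapping a mount point back to a specific container is left out because the text does not describe how that is done.

```python
def overlay_upperdirs(mounts_text):
    """Map each overlay mount point to its upperdir (the read-write layer).

    mounts_text uses the /proc/mounts layout:
    device mount_point fs_type options dump pass
    """
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] != "overlay":
            continue
        mount_point, options = fields[1], fields[3]
        for option in options.split(","):
            if option.startswith("upperdir="):
                result[mount_point] = option[len("upperdir="):]
    return result
```

On a real node the input would come from reading `/proc/mounts`; the resulting directories are what would be packaged and sent to the target node in place of the full image layers.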
In order to achieve that the normal Pod creation flow can be cut off after the Target Pod is dispatched to the Target node, the invention independently provides a createContainer method from the startContainer method of kubeGenericRuntimeManager, which only includes logic to create the Pod container, without starting the container. The startContainer method is normally called in step 7 of the SyncPod method of kubegenericruntimimemanager, and the invention additionally adds a pair [ clonesercpod: judging a sourcePod.Name label, if the label exists in the Pod, the Pod is represented as the Pod created in the migration process, then a createContainer method is transferred to, after all containers in the Pod are created, a preset Kubelet of a target node can initiate a migration request to a preset Kubelet migration HTTP API endpoint of a source node, and after the request is normally replied, a resetoreContainer method is concurrently called. The restoecontainer is obtained by deleting the logic of the Create container in the startContainer on the basis of the startContainer and replacing the code logic of the original Start container with the restoecontainer of CRI.
In this embodiment, after the preset migration controller clones, on the target node, a Target Pod identical to the Pod to be migrated according to the modified configuration information, the preset Kubelet of the target node detects the [CloneSourcePod: sourcePod.Name] label, cuts off the normal Pod creation flow after the Create action finishes, suspends the Start action, and then initiates a migration request to the migration HTTP API endpoint of the preset Kubelet of the source node. After the preset Kubelet of the source node receives the request, it calls the HandleTargetKubeletRequest method, which sends a "SyncPodMigrate" event to the work queue of the Pod to be migrated and then blocks. When the Worker detects the event, the CheckpointPod method is called in the syncPod of the Kubelet: it first initializes an sftp client for transferring files, then concurrently calls Runc to pre-migrate the memory images of all containers in the Pod. The image files of each pre-copy round are dumped into the directory "/var/lib/migration/<Pod UID>/<Container name>/<Pre-dump[pre-copy count]>", and a soft link to the previous pre-copy directory is generated automatically. After pre-migration of all containers has finished, the files are synchronized to the same directory on the target node, so that the target node obtains the memory image of the Pod to be migrated transmitted by the source node.
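The blocking hand-off between the request handler and the Pod's Worker can be illustrated with a small sketch. It assumes a per-Pod work queue and a completion signal; the class `PodWorker`, the function `handle_target_kubelet_request` and the timeout are hypothetical names and values, and only the "SyncPodMigrate" event name is taken from the description.

```python
import queue
import threading


class PodWorker:
    """Hypothetical per-Pod worker that drains a work queue in its own thread."""

    def __init__(self, checkpoint_pod):
        self.work_queue = queue.Queue()
        self._checkpoint_pod = checkpoint_pod
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            event, done = self.work_queue.get()
            if event == "SyncPodMigrate":
                try:
                    self._checkpoint_pod()  # pre-dump, checkpoint, file transfer
                finally:
                    done.set()              # release the blocked request handler


def handle_target_kubelet_request(worker, timeout=5.0):
    """Enqueue the migrate event and block until the worker has finished."""
    done = threading.Event()
    worker.work_queue.put(("SyncPodMigrate", done))
    return done.wait(timeout)  # True once the checkpoint work has completed


steps = []
worker = PodWorker(lambda: steps.append("checkpoint"))
replied = handle_target_kubelet_request(worker)
steps.append("reply")
print(replied, steps)
```

Because the handler returns only after the worker signals completion, the HTTP reply to the target node doubles as the notification that all checkpoint files are in place.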
And S4, after the target node receives all memory images of the Pod to be migrated, a shutdown Checkpoint is performed concurrently on all containers of the Pod to be migrated in the source node, so that the target node obtains the container state information image and the container read-write layer files of the Pod to be migrated transmitted by the source node.
In this embodiment, after pre-migration finishes, a shutdown Checkpoint is performed concurrently on all containers in the Pod to be migrated: the CheckpointContainer method of CRI is called, which in turn calls Containerd at runtime, so that the state information is stored under the "/var/lib/migration/<Pod UID>/<Container name>/fullCheck" directory and synchronized to the target node. At the same time, the directory of the container's last pre-copy, namely parentPath, is passed to Containerd through CRI, so that a soft link to the last pre-copy directory is generated automatically under the fullCheck directory, which ensures the integrity of the container's memory state. This stage also locates the read-write layer of each container by retrieving the OverlayFS mount information of the system and transmits it to the target node. The target node thus obtains the container state information image and the container read-write layer files of the Pod to be migrated transmitted by the source node.
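The directory layout and the soft link can be sketched as follows. In the described flow the link is generated automatically during the checkpoint; this example creates it by hand in a temporary directory purely to show the resulting structure. The link name `parent`, the directory name `pre-dump<N>` and the helper name are assumptions made for illustration.

```python
import os
import tempfile
from pathlib import Path


def make_full_checkpoint_dir(root, pod_uid, container, last_round):
    """Create <root>/<pod_uid>/<container>/fullCheck linked to the last pre-dump."""
    base = Path(root) / pod_uid / container
    last_pre_dump = base / f"pre-dump{last_round}"  # assumed naming scheme
    last_pre_dump.mkdir(parents=True, exist_ok=True)
    full = base / "fullCheck"
    full.mkdir(exist_ok=True)
    link = full / "parent"                          # assumed link name
    if not link.is_symlink():
        # A relative link stays valid after the tree is copied to the target node.
        link.symlink_to(os.path.join("..", last_pre_dump.name))
    return full


with tempfile.TemporaryDirectory() as root:
    full_dir = make_full_checkpoint_dir(root, "pod-uid", "app", 3)
    link_target = os.readlink(full_dir / "parent")
    print(link_target)
```

A relative link is used because the same directory tree is synchronized to the target node, where an absolute path into the source node's file system might not resolve.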
And S5, based on the memory mirror image, the container state information mirror image, the container read-write layer file and the Target Pod obtained by the Target node, the container migrated to the Pod of the Target node is concurrently restored and started, so that the Target node obtains the migrated Pod.
In this embodiment, CheckpointPod releases the block on HandleTargetKubeletRequest, and the migration HTTP request from the preset Kubelet of the target node is answered normally. The Pod to be migrated is set to the Terminated state by the status manager to wait for deletion. The preset Kubelet of the target node then calls restoreContainer, which calls the RestoreContainer method of Containerd through CRI, and the containers of the Pod migrated to the target node are restored and started, so that the target node obtains the migrated Pod.
According to this method, the preset migration controller modifies the configuration information of the Pod to be migrated on the source node to obtain the modified configuration information, and the target node initiates a migration request to the preset Kubelet of the source node so that the target node obtains the memory image of the Pod to be migrated transmitted by the source node. After the target node has received all memory images of the Pod to be migrated, a shutdown Checkpoint is performed concurrently on all containers of that Pod on the source node, so that the target node obtains the container state information image and the container read-write layer files transmitted by the source node; the containers of the Pod migrated to the target node are then concurrently restored and started, so that the target node obtains the migrated Pod. Because the shutdown Checkpoint and the restoration of the containers are each carried out concurrently during migration, the container information is kept synchronized to the greatest extent. This avoids service anomalies caused by unsynchronized container information in the restored Pod, which would arise if some containers had stopped while others were still working, and thereby ensures normal migration. At the same time, online migration of stateful Pods in the Kubernetes cluster is realized, and the migration downtime of stateful, memory-intensive applications in the Kubernetes cluster is shortened.
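The concurrent shutdown Checkpoint can be sketched with a thread pool that submits one checkpoint per container and waits for all of them before continuing. The function `checkpoint_all` and the `checkpoint_one` callable are hypothetical; a real implementation would call the runtime through CRI at that point.

```python
from concurrent.futures import ThreadPoolExecutor


def checkpoint_all(containers, checkpoint_one):
    """Checkpoint every container concurrently and wait for all of them.

    Returns a mapping of container name to checkpoint result. If any
    checkpoint fails, the exception is re-raised to the caller.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(containers))) as pool:
        futures = {name: pool.submit(checkpoint_one, name) for name in containers}
        return {name: future.result() for name, future in futures.items()}


# Hypothetical checkpoint function that just reports where the image would go.
images = checkpoint_all(
    ["app", "sidecar"],
    lambda name: f"/var/lib/migration/pod-uid/{name}/fullCheck",
)
print(images)
```

Submitting all checkpoints before waiting on any of them keeps the window in which some containers are stopped and others still run as short as the slowest single checkpoint.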
On the basis of the above embodiment, after obtaining the to-be-migrated Pod of the source node and the target node of the Pod to be migrated set by the user, modifying the configuration information in the to-be-migrated Pod of the source node by using a preset migration controller to obtain the configuration information in the modified to-be-migrated Pod, including:
after obtaining a to-be-migrated Pod of a source node and a target node of the to-be-migrated Pod set by a user, carrying out Pod name modification, tag addition and deletion on configuration information in the to-be-migrated Pod of the source node by utilizing a preset migration controller, and obtaining configuration information in the modified to-be-migrated Pod.
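A minimal sketch of such a configuration change is given below, operating on a Pod manifest held as a Python dictionary. The name suffix, the removal of server-populated fields, the removal of `initContainers`, and pinning the clone through `spec.nodeName` are all assumptions for illustration; the description only states that the name is modified, a label is added, and some configuration is deleted.

```python
import copy


def build_target_pod(source_pod, target_node, suffix="-migrated"):
    """Return a modified deep copy of source_pod intended for target_node."""
    target = copy.deepcopy(source_pod)       # never mutate the source manifest
    meta = target["metadata"]
    source_name = meta["name"]
    meta["name"] = source_name + suffix      # hypothetical naming rule
    meta.setdefault("labels", {})["CloneSourcePod"] = source_name
    # Assumed: fields populated by the API server are dropped before re-creation.
    for field in ("uid", "resourceVersion", "creationTimestamp"):
        meta.pop(field, None)
    target.pop("status", None)
    spec = target["spec"]
    spec.pop("initContainers", None)         # assumed: init containers removed
    spec["nodeName"] = target_node           # assumed way of pinning the node
    return target


source = {
    "metadata": {"name": "web-0", "uid": "123", "labels": {"app": "web"}},
    "spec": {
        "nodeName": "node-a",
        "initContainers": [{"name": "init"}],
        "containers": [{"name": "app"}],
    },
    "status": {"phase": "Running"},
}
print(build_target_pod(source, "node-b")["metadata"])
```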
On the basis of the above embodiment, when a Target Pod is scheduled to the Target node, the Target node initiates a migration request to a preset Kubelet of a source node, so that after the preset Kubelet of the source node performs a premigration operation on a Pod to be migrated, the Target node obtains a memory mirror image of the Pod to be migrated, which is transmitted by the source node, including:
when the Target Pod is dispatched to the Target node, the preset Kubelet in the Target node stops the Start action after the Create action is finished.
After the target node suspends the Start action, the target node initiates a migration request to the migration HTTP API endpoint of the preset Kubelet in the source node, so that after the preset Kubelet in the source node receives the request, the HandleTargetKubeletRequest method is called to send a SyncPodMigrate event to the work queue of the Pod to be migrated.
After the Worker of the Pod to be migrated detects the SyncPodMigrate event, the CheckpointPod method is called in the syncPod of the preset Kubelet in the source node, so that the preset Kubelet of the source node performs the premigration operation on the Pod to be migrated and the target node obtains the memory image of the Pod to be migrated transmitted by the source node.
On the basis of the above embodiment, when the Worker of the Pod to be migrated detects the SyncPodMigrate event, invoking a CheckpointPod method in the syncPod of the preset Kubelet in the source node, so that the source node transmits the memory image of the Pod to be migrated to the target node, including:
and after the Worker of the Pod to be migrated detects the SyncPodMigrate event, calling a CheckpointPod method in a syncPod of a preset Kubelet in the source node, initializing an sftp client in the source node, and calling Runc to pre-migrate the memory images of all containers in the Pod to be migrated, so that the source node transmits the memory image of the Pod to be migrated to the target node.
On the basis of the above embodiment, after the target node receives the memory images of all the to-be-migrated Pod, performing concurrent shutdown checkpoints on all the containers of the to-be-migrated Pod in the source node, so that the target node obtains the container state information image and the container read-write layer file of the to-be-migrated Pod transmitted by the source node, including:
After the target node receives all the premigrated memory images, a shutdown Checkpoint is performed concurrently on all containers of the Pod to be migrated in the source node by using the CheckpointPod method in the syncPod.
After the shutdown Checkpoint has been performed concurrently on all containers of the Pod to be migrated in the source node, Containerd is called by using the CheckpointContainer method of CRI to store the state of each container of the Pod to be migrated in its directory, obtaining the storage directory of the target node.
And transmitting the container state information mirror image and the container read-write layer file transmitted by the source node to the storage directory of the target node for storage based on the storage directory of the target node, so that the target node obtains the container state information mirror image and the container read-write layer file of the Pod to be migrated, which are transmitted by the source node.
In this embodiment, whenever the preset Kubelet calls checkpointContainer or restoreContainer, it calls the gRPC server of the underlying container runtime through CRI to complete the operation on the container. Based on Containerd v1.7.0, the invention refactors the original CheckpointContainer method under criService and adds a RestoreContainer method. In CheckpointContainer, the Checkpoint method of the task is called to checkpoint the container. Besides the basic parameters required for a checkpoint, the method takes an additional parameter, ParentPath, which stores the path of the container's last pre-copy. When Containerd calls Runc, the parameter is passed to Runc and then on to CRIU; when the checkpoint finishes, a soft link to the path of the container's last pre-copy is generated automatically in the working path, so that the memory information dumped in the premigration stage can be restored together with the rest when the container is restored. The RestoreContainer method restores the container state and starts the container. It reuses the StartContainer method of criService and additionally passes in the path of the container state information, which ensures that the original state of the container can be restored from the dumped image files when the container is started.
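To make the role of ParentPath concrete, the sketch below builds (but does not execute) a `runc checkpoint` command line. It is a simplification: Containerd drives Runc through its own Go bindings rather than a hand-built argument list, and the helper name is hypothetical. The flags `--image-path`, `--parent-path` and `--pre-dump` are believed to exist in the runc CLI, but they should be verified against the installed runc version, including whether `--parent-path` must be relative to the image directory.

```python
def runc_checkpoint_args(container_id, image_path, parent_path=None, pre_dump=False):
    """Build an argument list for `runc checkpoint` (illustrative only)."""
    args = ["runc", "checkpoint", "--image-path", image_path]
    if parent_path:
        # Directory of the previous pre-copy round, so only changed pages are dumped.
        args += ["--parent-path", parent_path]
    if pre_dump:
        # Memory-only dump that leaves the container running.
        args.append("--pre-dump")
    args.append(container_id)
    return args


# Second pre-copy round, then the final shutdown checkpoint (hypothetical paths).
base = "/var/lib/migration/pod-uid/app"
print(runc_checkpoint_args("app-id", f"{base}/pre-dump2", "../pre-dump1", pre_dump=True))
print(runc_checkpoint_args("app-id", f"{base}/fullCheck", "../pre-dump2"))
```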
Based on the above embodiment, based on the memory image, the container state information image, the container read-write layer file and the Target Pod obtained by the Target node, the container migrated to the Pod of the Target node is recovered and started, so that the Target node obtains the migrated Pod, including:
and calling a RestoreContainer method of Containerd by using CRI based on the premigration memory mirror image, the container state information mirror image and the container read-write layer file, and recovering and starting the container migrated to the Pod of the target node so that the target node obtains the migrated Pod.
Based on the above embodiment, after the step of calling the RestoreContainer method of Containerd by using CRI, based on the premigration memory image, the container state information image and the container read-write layer file, to restore and start the containers of the Pod migrated to the target node so that the target node obtains the migrated Pod, the method further includes:
and restoring and cleaning the environment of the migrated Pod by using a preset migration controller to obtain a target node after environment restoration and cleaning.
Specifically, once the Target Pod enters the Running state and the preset migration controller detects this, the state of the migration controller is set to Migrated and the preset migration controller starts to restore and clean the environment: migrationTrigger is restored to false, targetNode is set to empty, and the [CloneSourcePod: sourcePod.Name] label of the Target Pod and the Pod to be migrated on the source node are deleted. If the Pod to be migrated on the source node has resource objects such as a Service, the migration controller rebinds those resource objects to the Target Pod. After the whole environment has been restored, the migration controller enters the Running state, targetPod under status is set to empty, and sourcePod is set to the name of the Target Pod, which becomes the Pod to be migrated for the next migration and waits for it.
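The field resets performed during cleanup can be sketched on a dictionary that stands in for the migration resource. The split into `spec` and `status`, the `phase` key and the function name are assumptions; the field names migrationTrigger, targetNode, targetPod and sourcePod and the Migrated and Running states follow the description. Deleting the label and the old Pod, and rebinding Services, are left out because they require cluster API calls.

```python
def finish_migration(migration):
    """Reset a migration record after the Target Pod is running (illustrative)."""
    spec, status = migration["spec"], migration["status"]
    if status.get("phase") != "Migrated":
        raise ValueError("cleanup is only valid in the Migrated state")
    spec["migrationTrigger"] = False            # re-arm for the next migration
    spec["targetNode"] = ""
    status["sourcePod"] = status["targetPod"]   # target becomes the next source
    status["targetPod"] = ""
    status["phase"] = "Running"
    return migration


record = {
    "spec": {"migrationTrigger": True, "targetNode": "node-b"},
    "status": {"phase": "Migrated", "sourcePod": "web-0", "targetPod": "web-0-migrated"},
}
print(finish_migration(record))
```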
The online migration device of the stateful Pod in the Kubernetes cluster provided by the invention is described below, and the online migration device of the stateful Pod in the Kubernetes cluster described below and the online migration method of the stateful Pod in the Kubernetes cluster described above can be correspondingly referred to each other.
Referring to fig. 2, an online migration apparatus for stateful Pod in Kubernetes cluster includes a modification module 210, a creation module 220, a memory mirror migration module 230, a transmission module 240, and a recovery module 250.
The modification module 210 is configured to modify configuration information in the to-be-migrated Pod of the source node by using a preset migration controller after obtaining the to-be-migrated Pod of the source node and a target node to which the user sets the to-be-migrated Pod, so as to obtain the configuration information in the modified to-be-migrated Pod;
the creating module 220 is configured to create a Target Pod in the Target node based on the configuration information in the modified Pod to be migrated.
The memory mirror image migration module 230 is configured to, when a Target Pod is scheduled to the Target node, initiate a migration request to a preset Kubelet of a source node, so that after the preset Kubelet of the source node performs a premigration operation on a Pod to be migrated, the Target node obtains a memory mirror image of the Pod to be migrated, which is transmitted by the source node.
The transmission module 240 is configured to, after the target node receives all memory images of the Pod to be migrated, perform a shutdown Checkpoint concurrently on all containers of the Pod to be migrated in the source node, so that the target node obtains the container state information image and the container read-write layer file of the Pod to be migrated transmitted by the source node.
The restoration module 250 is configured to concurrently restore and start the containers of the Pod migrated to the Target node based on the memory image, the container state information image, the container read-write layer file, and the Target Pod obtained by the Target node, so that the Target node obtains the migrated Pod.
According to the invention, the modification module 210 obtains the modified configuration information of the Pod to be migrated, the creation module 220 creates the Target Pod in the Target node, the memory image migration module 230 obtains the memory image of the Pod to be migrated transmitted by the source node, the transmission module 240 obtains the container state information image and the container read-write layer file of the Pod to be migrated transmitted by the source node, and the restoration module 250 obtains the migrated Pod. The container information is thereby kept synchronized to the greatest extent, which avoids service anomalies caused by unsynchronized container information in the restored Pod when some containers have stopped while others are still working, and ensures normal migration. At the same time, the downtime of stateful Pod migration in the Kubernetes cluster is shortened, and online migration of stateful Pods in the Kubernetes cluster is realized.
The modification module 210 is specifically configured to, after obtaining the Pod to be migrated of the source node and the target node set by the user for the Pod to be migrated, use a preset migration controller to modify the Pod name, add a label, and delete all init container configuration in the configuration information of the Pod to be migrated of the source node.
The memory image migration module 230 is specifically configured to: when the Target Pod is dispatched to the Target node, the preset Kubelet in the Target node stops the Start action after the Create action is finished.
After the target node suspends the Start action, the target node initiates a migration request to the migration HTTP API endpoint of the preset Kubelet in the source node, so that after the preset Kubelet in the source node receives the request, the HandleTargetKubeletRequest method is called to send a SyncPodMigrate event to the work queue of the Pod to be migrated.
After the Worker of the Pod to be migrated detects the SyncPodMigrate event, the CheckpointPod method is called in the syncPod of the preset Kubelet in the source node, so that the preset Kubelet of the source node performs the premigration operation on the Pod to be migrated and the target node obtains the memory image of the Pod to be migrated transmitted by the source node.
Specifically, after the Worker of the Pod to be migrated detects the SyncPodMigrate event, a CheckpointPod method is called in a syncPod of a preset Kubelet in the source node, an sftp client in the source node is initialized, and Runc is called for pre-migration of memory images for all containers in the Pod to be migrated, so that the source node transmits the memory images of the Pod to be migrated to the target node.
The transmission module 240 is specifically configured to, after the target node receives all the premigrated memory images, perform a shutdown Checkpoint concurrently on all containers of the Pod to be migrated in the source node by using the CheckpointPod method in the syncPod.
After the shutdown Checkpoint has been performed concurrently on all containers of the Pod to be migrated in the source node, Containerd is called by using the CheckpointContainer method of CRI to store the state of each container of the Pod to be migrated in its directory, obtaining the storage directory of the target node;
and transmitting the container state information mirror image and the container read-write layer file transmitted by the source node to the storage directory of the target node for storage based on the storage directory of the target node, so that the target node obtains the container state information mirror image and the container read-write layer file of the Pod to be migrated, which are transmitted by the source node.
The restoration module 250 is specifically configured to call the RestoreContainer method of Containerd by using CRI, based on the premigrated memory image, the container state information image, and the container read-write layer file, to restore and start the containers of the Pod migrated to the target node, so that the target node obtains the migrated Pod.
After the step of calling the RestoreContainer method of Containerd by using CRI, based on the premigrated memory image, the container state information image and the container read-write layer file, to restore and start the containers of the Pod migrated to the target node so that the target node obtains the migrated Pod, the device further comprises a recovery cleaning module.
And the recovery cleaning module is used for recovering and cleaning the environment of the migrated Pod by utilizing a preset migration controller to obtain the target node after the environment recovery and cleaning.
Fig. 3 illustrates a physical schematic diagram of an electronic device, as shown in fig. 3, where the electronic device may include: processor 310, communication interface (Communications Interface) 320, memory 330 and communication bus 340, wherein processor 310, communication interface 320, memory 330 accomplish communication with each other through communication bus 340. Processor 310 may invoke logic instructions in memory 330 to perform an online migration method of stateful Pod in Kubernetes cluster, the method comprising:
S1, after obtaining a to-be-migrated Pod of a source node and a target node of the to-be-migrated Pod set by a user, modifying configuration information in the to-be-migrated Pod of the source node by using a preset migration controller to obtain the configuration information in the modified to-be-migrated Pod.
S2, creating a Target Pod in the Target node based on the configuration information in the modified Pod to be migrated.
And S3, when the Target Pod is scheduled to the Target node, the Target node initiates a migration request to a preset Kubelet of the source node, so that after the preset Kubelet of the source node performs premigration operation on the Pod to be migrated, the Target node obtains a memory mirror image of the Pod to be migrated, which is transmitted by the source node.
And S4, after the target node receives all memory images of the Pod to be migrated, a shutdown Checkpoint is performed concurrently on all containers of the Pod to be migrated in the source node, so that the target node obtains the container state information image and the container read-write layer files of the Pod to be migrated transmitted by the source node.
And S5, based on the memory mirror image, the container state information mirror image, the container read-write layer file and the Target Pod obtained by the Target node, the container migrated to the Pod of the Target node is concurrently restored and started, so that the Target node obtains the migrated Pod.
Further, the logic instructions in the memory 330 described above may be implemented in the form of software functional units and, when sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium and comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
In another aspect, the present invention also provides a computer program product, where the computer program product includes a computer program, where the computer program can be stored on a non-transitory computer readable storage medium, where the computer program when executed by a processor can perform an online migration method of a stateful Pod in a Kubernetes cluster provided by the methods above, where the method includes:
S1, after obtaining a to-be-migrated Pod of a source node and a target node of the to-be-migrated Pod set by a user, modifying configuration information in the to-be-migrated Pod of the source node by using a preset migration controller to obtain the configuration information in the modified to-be-migrated Pod.
S2, creating a Target Pod in the Target node based on the configuration information in the modified Pod to be migrated.
And S3, when the Target Pod is scheduled to the Target node, the Target node initiates a migration request to a preset Kubelet of the source node, so that after the preset Kubelet of the source node performs premigration operation on the Pod to be migrated, the Target node obtains a memory mirror image of the Pod to be migrated, which is transmitted by the source node.
And S4, after the target node receives all memory images of the Pod to be migrated, a shutdown Checkpoint is performed concurrently on all containers of the Pod to be migrated in the source node, so that the target node obtains the container state information image and the container read-write layer files of the Pod to be migrated transmitted by the source node.
And S5, based on the memory mirror image, the container state information mirror image, the container read-write layer file and the Target Pod obtained by the Target node, the container migrated to the Pod of the Target node is concurrently restored and started, so that the Target node obtains the migrated Pod.
In yet another aspect, the present invention further provides a non-transitory computer readable storage medium having stored thereon a computer program, which when executed by a processor, is implemented to perform the method for online migration of stateful Pod in Kubernetes cluster provided by the methods above, the method comprising:
S1, after obtaining a to-be-migrated Pod of a source node and a target node of the to-be-migrated Pod set by a user, modifying configuration information in the to-be-migrated Pod of the source node by using a preset migration controller to obtain the configuration information in the modified to-be-migrated Pod.
S2, creating a Target Pod in the Target node based on the configuration information in the modified Pod to be migrated.
And S3, when the Target Pod is scheduled to the Target node, the Target node initiates a migration request to a preset Kubelet of the source node, so that after the preset Kubelet of the source node performs premigration operation on the Pod to be migrated, the Target node obtains a memory mirror image of the Pod to be migrated, which is transmitted by the source node.
And S4, after the target node receives all memory images of the Pod to be migrated, a shutdown Checkpoint is performed concurrently on all containers of the Pod to be migrated in the source node, so that the target node obtains the container state information image and the container read-write layer files of the Pod to be migrated transmitted by the source node.
And S5, based on the memory mirror image, the container state information mirror image, the container read-write layer file and the Target Pod obtained by the Target node, the container migrated to the Pod of the Target node is concurrently restored and started, so that the Target node obtains the migrated Pod.
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the invention without creative effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. An online migration method for a stateful Pod in a Kubernetes cluster, comprising the steps of:
after obtaining a to-be-migrated Pod of a source node and a target node of the to-be-migrated Pod set by a user, modifying configuration information in the to-be-migrated Pod of the source node by using a preset migration controller to obtain configuration information in the modified to-be-migrated Pod;
creating a Target Pod in the Target node based on the configuration information in the modified Pod to be migrated;
when a Target Pod is scheduled to the Target node, the Target node initiates a migration request to a preset Kubelet of a source node, so that after the preset Kubelet of the source node performs premigration operation on the Pod to be migrated, the Target node obtains a memory mirror image of the Pod to be migrated, which is transmitted by the source node;
After the target node receives the memory mirror images of all the Pods to be migrated, performing concurrent shutdown on all containers of the Pods to be migrated in the source node to enable the target node to obtain the container state information mirror images and the container read-write layer files of the Pods to be migrated, which are transmitted by the source node;
and based on the memory mirror image, the container state information mirror image, the container read-write layer file and the Target Pod obtained by the Target node, the container migrated to the Pod of the Target node is concurrently restored and started, so that the Target node obtains the migrated Pod.
2. The online migration method of a stateful Pod in a Kubernetes cluster according to claim 1, wherein after obtaining a to-be-migrated Pod of a source node and a target node of a to-be-migrated Pod set by a user, modifying configuration information in the to-be-migrated Pod of the source node by using a preset migration controller to obtain configuration information in the modified to-be-migrated Pod, the method comprises:
after obtaining a to-be-migrated Pod of a source node and a target node of the to-be-migrated Pod set by a user, carrying out Pod name modification, tag addition and deletion on configuration information in the to-be-migrated Pod of the source node by utilizing a preset migration controller, and obtaining configuration information in the modified to-be-migrated Pod.
3. The online migration method for a stateful Pod in a Kubernetes cluster according to claim 1, wherein the step of, when the Target Pod is scheduled to the target node, the target node initiating a migration request to a preset Kubelet of the source node, so that after the preset Kubelet of the source node performs a pre-migration operation on the Pod to be migrated, the target node obtains the memory image of the Pod to be migrated transmitted by the source node, comprises:
when the Target Pod is scheduled to the target node, the preset Kubelet in the target node suspends the Start action after the Create action has finished;
after the target node suspends the Start action, the target node initiates a migration request to the migration HTTP API endpoint of the preset Kubelet in the source node, so that, on receiving the request, the preset Kubelet in the source node calls the HandleTargetKubeletrequest method to send a SyncPodMigrate event to the work queue of the Pod to be migrated;
after the Worker of the Pod to be migrated detects the SyncPodMigrate event, the CheckpointPod method is called in the syncPod of the preset Kubelet in the source node, so that the preset Kubelet of the source node performs the pre-migration operation on the Pod to be migrated and the target node obtains the memory image of the Pod to be migrated transmitted by the source node.
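The event flow in claim 3 can be pictured with a toy Python model: a request handler places a SyncPodMigrate event on the work queue of the Pod, and the Pod's worker reacts to that event by invoking a checkpoint callback. This is only a sketch of the control flow under simplifying assumptions. The class `PodWorker`, the function `handle_target_kubelet_request` and the callback are hypothetical stand-ins and do not reproduce the actual Kubelet code.

```python
import queue

SYNC_POD_MIGRATE = "SyncPodMigrate"


class PodWorker:
    """Toy per-Pod worker with its own event queue (hypothetical model)."""

    def __init__(self, pod_name, checkpoint_pod):
        self.pod_name = pod_name
        self.events = queue.Queue()
        self._checkpoint_pod = checkpoint_pod  # stand-in for CheckpointPod

    def drain(self):
        """Process all queued events and return the list of handled events."""
        handled = []
        while True:
            try:
                event = self.events.get_nowait()
            except queue.Empty:
                return handled
            if event == SYNC_POD_MIGRATE:
                self._checkpoint_pod(self.pod_name)
            handled.append(event)


def handle_target_kubelet_request(workers, pod_name):
    """Stand-in for the source-side handler of the migration request.

    Enqueues a SyncPodMigrate event for the named Pod; returns False if the
    source node has no worker for that Pod.
    """
    worker = workers.get(pod_name)
    if worker is None:
        return False
    worker.events.put(SYNC_POD_MIGRATE)
    return True


checkpointed = []
workers = {"redis-0": PodWorker("redis-0", checkpointed.append)}
accepted = handle_target_kubelet_request(workers, "redis-0")
handled = workers["redis-0"].drain()
```

Routing the request through the Pod's own queue means the checkpoint is triggered by the same worker that handles the Pod's other lifecycle events, rather than by the HTTP handler directly.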
4. The online migration method for a stateful Pod in a Kubernetes cluster according to claim 3, wherein the step of, after the Worker of the Pod to be migrated detects the SyncPodMigrate event, calling the CheckpointPod method in the syncPod of the preset Kubelet in the source node, so that the source node transmits the memory image of the Pod to be migrated to the target node, comprises:
after the Worker of the Pod to be migrated detects the SyncPodMigrate event, calling the CheckpointPod method in the syncPod of the preset Kubelet in the source node, initializing an sftp client in the source node, and calling Runc to pre-migrate the memory images of all containers in the Pod to be migrated, so that the source node transmits the memory image of the Pod to be migrated to the target node.
5. The online migration method for a stateful Pod in a Kubernetes cluster according to claim 1, wherein the step of, after the target node receives all memory images of the Pod to be migrated, performing a concurrent shutdown Checkpoint on all containers of the Pod to be migrated in the source node, so that the target node obtains the container state information image and the container read-write layer file of the Pod to be migrated transmitted by the source node, comprises:
after the target node receives all the pre-migrated memory images, performing a concurrent shutdown Checkpoint on all containers of the Pod to be migrated in the source node by using the CheckpointPod method in the syncPod;
after the concurrent shutdown Checkpoint of all containers of the Pod to be migrated in the source node, calling the CRI CheckpointContainer method on each container of the Pod to be migrated and storing the directory of each container, to obtain the storage directory of the target node;
and, based on the storage directory of the target node, transmitting the container state information image and the container read-write layer file from the source node to the storage directory of the target node for storage, so that the target node obtains the container state information image and the container read-write layer file of the Pod to be migrated transmitted by the source node.
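The concurrent per-container checkpoint in claim 5 can be sketched in Python with a thread pool. The function `checkpoint_container` is a hypothetical stand-in for the per-container checkpoint call and only computes the directory in which a container's checkpoint would be stored; the working directory path is likewise an assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import PurePosixPath


def checkpoint_container(container, work_dir):
    """Hypothetical stand-in for a per-container checkpoint call.

    A real implementation would dump the container state here; this sketch
    only returns the directory the checkpoint would be written to.
    """
    return str(PurePosixPath(work_dir) / container)


def checkpoint_pod(containers, work_dir):
    """Checkpoint all containers of a Pod concurrently.

    Returns a mapping from container name to its checkpoint directory.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(containers))) as pool:
        # pool.map preserves the input order of the containers.
        dirs = list(pool.map(lambda c: checkpoint_container(c, work_dir), containers))
    return dict(zip(containers, dirs))


# The path below is an illustrative assumption, not taken from the patent.
checkpoint_dirs = checkpoint_pod(["app", "sidecar"], "/var/lib/migration/redis-0")
```

Giving each container its own directory lets the checkpoints be written and transferred independently, which is what makes a concurrent shutdown and a later concurrent restore straightforward.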
6. The online migration method for a stateful Pod in a Kubernetes cluster according to claim 1, wherein the step of concurrently restoring and starting the containers of the Pod migrated to the target node based on the memory image, the container state information image, the container read-write layer file and the Target Pod obtained by the target node, so that the target node obtains the migrated Pod, comprises:
based on the pre-migrated memory image, the container state information image and the container read-write layer file, calling the RestoreContainer method of Containerd through CRI to restore and start the containers of the Pod migrated to the target node, so that the target node obtains the migrated Pod.
7. The online migration method for a stateful Pod in a Kubernetes cluster according to claim 1, wherein the step of calling the RestoreContainer method of Containerd through CRI to restore and start the containers of the Pod migrated to the target node based on the pre-migrated memory image, the container state information image and the container read-write layer file, so that the target node obtains the migrated Pod, further comprises:
restoring and cleaning up the environment of the migrated Pod by using the preset migration controller, to obtain the target node after environment restoration and cleanup.
8. The online migration method for a stateful Pod in a Kubernetes cluster according to claim 1, wherein the pre-migration operation performed by the preset Kubelet on the Pod to be migrated comprises:
setting an upper limit on the number of pre-copy iterations for the memory image of the Pod to be migrated and, starting from the second iteration, stopping further iterations and ending the pre-migration if the memory dirty pages are smaller than 2 MB or the change rate of the memory dirty pages over 4 consecutive pre-copies is less than 5%;
when all containers have been pre-migrated, transmitting the pre-migrated memory image files to the migration working directory of the target node.
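The stop condition for iterative pre-copy in claim 8 can be written as a small Python predicate. This is a sketch under stated assumptions: the claim gives the 2 MB and 5% thresholds but not the value of the iteration cap, so `max_iterations` is a caller-supplied parameter. The "change rate" is assumed here to be the relative change in dirty memory between two consecutive iterations, and "4 consecutive pre-copies" is read as four consecutive change rates below the threshold.

```python
DIRTY_BYTES_THRESHOLD = 2 * 1024 * 1024  # 2 MB, from the claim
CHANGE_RATE_THRESHOLD = 0.05             # 5%, from the claim
STABLE_ROUNDS = 4                        # consecutive rounds, from the claim


def should_stop_precopy(dirty_history, max_iterations):
    """Decide whether to stop iterating after the latest pre-copy round.

    dirty_history[i] is the dirty memory (in bytes) seen after iteration i+1.
    max_iterations is the iteration cap (its value is an assumption).
    """
    n = len(dirty_history)
    if n >= max_iterations:
        return True   # upper limit on pre-copy iterations reached
    if n < 2:
        return False  # the thresholds only apply from the second iteration
    if dirty_history[-1] < DIRTY_BYTES_THRESHOLD:
        return True   # remaining dirty memory is small enough
    if n < STABLE_ROUNDS + 1:
        return False  # not enough rounds to judge convergence
    recent = dirty_history[-(STABLE_ROUNDS + 1):]
    for prev, cur in zip(recent, recent[1:]):
        if prev == 0 or abs(cur - prev) / prev >= CHANGE_RATE_THRESHOLD:
            return False
    return True       # dirty memory has stopped shrinking meaningfully


MB = 1024 * 1024
converged = should_stop_precopy([100 * MB, 99 * MB, 98 * MB, 97 * MB, 96 * MB], 10)
```

The second threshold covers workloads whose dirty set never drops below 2 MB: once successive rounds stop making progress, further pre-copy only adds transfer cost without shortening the final downtime.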
9. An online migration apparatus for a stateful Pod in a Kubernetes cluster, comprising:
a modification module, configured to, after obtaining the Pod to be migrated on a source node and the target node set by a user for the Pod to be migrated, modify the configuration information of the Pod to be migrated on the source node by using a preset migration controller, to obtain the modified configuration information of the Pod to be migrated;
a creating module, configured to create a Target Pod in the target node based on the modified configuration information of the Pod to be migrated;
a memory image migration module, configured such that, when the Target Pod is scheduled to the target node, the target node initiates a migration request to a preset Kubelet of the source node, so that after the preset Kubelet of the source node performs a pre-migration operation on the Pod to be migrated, the target node obtains the memory image of the Pod to be migrated transmitted by the source node;
a transmission module, configured to, after the target node receives all memory images of the Pod to be migrated, perform a concurrent shutdown Checkpoint on all containers of the Pod to be migrated in the source node, so that the target node obtains the container state information image and the container read-write layer file of the Pod to be migrated transmitted by the source node;
and a recovery module, configured to concurrently restore and start the containers of the Pod migrated to the target node based on the memory image, the container state information image, the container read-write layer file and the Target Pod obtained by the target node, so that the target node obtains the migrated Pod.
10. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the online migration method for a stateful Pod in a Kubernetes cluster according to any one of claims 1 to 8.
CN202310954130.8A 2023-07-31 2023-07-31 Online migration method and device for stateful Pod in Kubernetes cluster Pending CN116974696A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310954130.8A CN116974696A (en) 2023-07-31 2023-07-31 Online migration method and device for stateful Pod in Kubernetes cluster

Publications (1)

Publication Number Publication Date
CN116974696A true CN116974696A (en) 2023-10-31

Family

ID=88470992

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310954130.8A Pending CN116974696A (en) 2023-07-31 2023-07-31 Online migration method and device for stateful Pod in Kubernetes cluster

Country Status (1)

Country Link
CN (1) CN116974696A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination