CN114281479A - Container management method and device - Google Patents


Info

Publication number
CN114281479A
Authority
CN
China
Prior art keywords
container
node
load
containers
nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111618279.6A
Other languages
Chinese (zh)
Inventor
林思君
孔冰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
WeBank Co Ltd
Original Assignee
WeBank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by WeBank Co Ltd filed Critical WeBank Co Ltd
Priority to CN202111618279.6A priority Critical patent/CN114281479A/en
Publication of CN114281479A publication Critical patent/CN114281479A/en
Pending legal-status Critical Current

Abstract

The embodiment of the invention provides a container management method and a device, which are suitable for a container cluster management system with a plurality of container nodes; the method comprises the following steps: the container scheduling node monitors the load of the incompressible resource of each container node; each container node is provided with at least one container through the container cluster management system; when the container scheduling node determines that the container nodes with loads reaching the load threshold exist, determining containers to be migrated according to the use states of all containers in the container nodes reaching the load threshold; and the container scheduling node sends a migration instruction to the container cluster management system, wherein the migration instruction is used for indicating the container cluster management system to migrate the container to be migrated. Compared with the prior art that only rescheduling is carried out, the method and the device can further balance resources among the nodes, consider the use condition of the container and improve the quality of service provided by the nodes.

Description

Container management method and device
Technical Field
The present application relates to the field of network technologies, and in particular, to a method and an apparatus for managing a container.
Background
In recent years, with the development of computer technology, more and more technologies are applied in the financial field, and the traditional financial industry is gradually changing to financial technology (Fintech), but higher requirements are also put on the technologies due to the requirements of the financial industry on safety and real-time performance. For example, due to some performance and resource utilization limitations of virtualization technologies, container technologies are developed to improve the efficiency of business processing. The container technique can effectively divide the resources of a single operating system into isolated groups so as to better balance conflicting resource usage requirements among the isolated groups.
In the prior art, a plurality of containers are arranged in a node for service processing, so that services processed in parallel do not interfere with one another and processing speed increases. To prevent resource waste caused by unbalanced resource allocation among nodes, a rescheduling mechanism is started once a node's load reaches a limit threshold: the containers in that node are rescheduled, that is, migrated to other nodes with low loads, to keep the load balanced among the nodes. Although this approach ensures load balance among the nodes to a certain extent, in a production environment a considerable load difference can still exist among the nodes. For example, the load of some nodes is only 20% while the load of others reaches 85%. Neither kind of node reaches the 90% limit threshold of the rescheduling mechanism, so no rescheduling occurs, yet the load is seriously skewed. The highly loaded nodes affect the provided service to some degree, and even though their load has not reached the limit threshold, they may run slowly or even become abnormal. In addition, although the rescheduling mechanism can reschedule containers to balance the load of each node, it may reschedule a container that is currently providing a service, which can cause that service to fail. In summary, the current rescheduling mechanism still cannot adequately solve the problems of node abnormality and service abnormality in a node cluster, which degrades the quality of service provided by the nodes.
Therefore, a container management method and apparatus for improving the quality of service provided by a node are needed.
Disclosure of Invention
The embodiment of the invention provides a container management method and device, which are used for improving the service quality provided by a node.
In a first aspect, an embodiment of the present invention provides a container management method, which is applicable to a container cluster management system having multiple container nodes; the method comprises the following steps:
the container scheduling node monitors the load of the incompressible resource of each container node; each container node is provided with at least one container through the container cluster management system;
when the container scheduling node determines that the container nodes with loads reaching the load threshold exist, determining containers to be migrated according to the use states of all containers in the container nodes reaching the load threshold;
and the container scheduling node sends a migration instruction to the container cluster management system, wherein the migration instruction is used for indicating the container cluster management system to migrate the container to be migrated.
In the above method, the container scheduling method is applied to a container cluster management system having a plurality of container nodes, where each container node includes at least one container. The container scheduling node monitors the load of each container node and the information related to its containers. The load refers to the load of an incompressible resource (e.g., memory, CPU, etc.), which is a resource whose exhaustion may overload the node and make it abnormal, whereas a compressible resource generally does not make the node abnormal. When the container scheduling node determines that the load of any one of the container nodes reaches a load threshold, it acquires the use state of each container in that container node, determines a container to be migrated according to those use states, and migrates the container to be migrated to another container node, that is, a container node other than the one whose load reached the load threshold. Therefore, after the incompressible resource in a node reaches the load threshold, containers that are not in use are migrated. This achieves container scheduling and balances resources among the nodes while keeping the services of the containers in the node running normally, so that the party being served does not perceive the migration. Compared with the prior art, which only performs rescheduling, the method can further balance resources among the nodes, takes the use condition of the containers into account, and improves the quality of service provided by the nodes.
In addition, with this method, after a node reaches the load threshold, the containers in the high-load node are migrated according to their use conditions. This reduces the load of the high-load node and the probability that it becomes abnormal, which in turn lowers the node abnormality rate of the cluster, reduces the number of manual interventions, and ensures the stability of the container cluster.
Optionally, when determining that there is a container node whose load reaches the load threshold, the container scheduling node sets the state of the container node that reaches the load threshold to be non-configurable.
In the above method, a container scheduling tool is installed in the container scheduling node. It only needs to be installed on the container scheduling node, which uses it to schedule each container node in the container cluster management system. Through the container cluster management system (K8S), the container scheduling tool configures the container node whose load reaches the load threshold into a non-configurable state in which no services are allocated to it. Therefore, during container rescheduling, the container cluster management system is prevented from allocating containers to that node, which ensures both the rescheduling and the normal operation of the containers in the node.
Optionally, the monitoring, by the container scheduling node, the load of the incompressible resource of each container node includes:
the container scheduling node monitors the load of the incompressible resource of each container node based on an open source monitoring system;
the method for acquiring the use state of each container in the container node comprises the following steps:
and the container scheduling node determines the use state of each container in the container node based on application performance monitoring (APM).
In the method, an open source monitoring system is arranged in the container node and is used for monitoring the load condition of the container node. When the load of the incompressible resource in the container node exceeds the load threshold, a notification message containing the identifier of the container node and the load data can be sent to the container scheduling node. The container node is also provided with application performance monitoring (APM). After receiving the notification message, the container scheduling node determines that the load of the container node exceeds the load threshold, and then, based on the APM, injects a monitoring agent into each process of the container node to obtain the use condition of each container in the container node. The use condition of a container can be obtained from three types of data: the online requests of the container show whether it exchanges data with other servers and therefore whether it provides services externally; the database tasks of the container show whether it has batch or timed tasks running against a database; and the cache tasks of the container show whether it has tasks being processed internally. Therefore, the use condition of the container can be obtained in a timely and accurate manner.
Optionally, determining a container to be migrated according to the use state of each container in the container node that reaches the load threshold includes:
determining containers meeting a heavy load configuration condition from the container nodes reaching the load threshold; the heavy load configuration condition is that the average memory usage of the container is greater than or equal to the product of the difference between the load threshold and the average utilization rate of the container nodes, and the total memory amount of the container node;
and determining containers to be migrated from the containers except the containers meeting the heavy load configuration condition.
In the above method, if a container with a heavy load configuration were migrated, the load of the node receiving it might immediately exceed the load threshold. Therefore, when determining the containers to be migrated, a container with a heavy load configuration is excluded regardless of whether it is in use.
Optionally, determining a container to be migrated from containers other than the container satisfying the heavy load configuration condition includes:
and determining containers in an idle state and/or containers with high resource utilization rate from the containers except the containers meeting the heavy load configuration condition as the containers to be migrated.
In the above method, the migration container is determined from the idle containers, ensuring that the migration container is not the container in use. Thus, the container migration is ensured not to influence the service provided by the container. The containers to be migrated are made to be containers with high resource utilization as much as possible. Therefore, the number of containers to be migrated by the high-load nodes is reduced, resources consumed by restarting the migrated containers are further reduced, and the resource cost of container scheduling is saved.
Optionally, the total load of the containers to be migrated is not less than the difference between the load of the container node reaching the load threshold and the average load of each container node;
the migration priority of the idle state container is higher than that of the container with high resource utilization rate.
In the above method, the migration priority of the idle state container is higher than the migration priority of the container with high resource utilization rate. In one example, the first containers with the highest utilization rate are selected from the idle containers, so that the sum of the loads of the first containers is greater than or equal to the difference between the real-time load of the container node reaching the load threshold and the average load of the cluster. Therefore, the nodes after the container is migrated can be in the load level of the average load of the cluster, the number of the migrated containers can be reduced as much as possible, and resources consumed by restarting the migrated containers in the container nodes receiving the migrated containers are saved.
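The selection rule in this example can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the patented implementation: the `Container` fields, the function name `pick_idle_containers`, and the use of fractions of node capacity as the load unit are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Container:
    name: str     # hypothetical identifier
    load: float   # load contributed to the node, as a fraction of node capacity
    idle: bool    # True when the container is not in use


def pick_idle_containers(containers, node_load, cluster_avg_load):
    """Pick idle containers, highest load first, until the migrated load
    covers the gap between the node's load and the cluster average."""
    target = node_load - cluster_avg_load
    idle = sorted((c for c in containers if c.idle),
                  key=lambda c: c.load, reverse=True)
    chosen, moved = [], 0.0
    for c in idle:
        if moved >= target:
            break
        chosen.append(c)
        moved += c.load
    # The second value reports whether the idle containers alone were enough.
    return chosen, moved >= target


containers = [
    Container("a", 0.10, True),
    Container("b", 0.05, True),
    Container("c", 0.20, False),  # in use, never selected here
    Container("d", 0.08, True),
]
chosen, covered = pick_idle_containers(containers, node_load=0.85,
                                       cluster_avg_load=0.70)
print([c.name for c in chosen], covered)
```

Sorting by load in descending order is what keeps the number of migrated containers small, which matches the stated goal of limiting restart cost on the receiving nodes.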
Optionally, the method further includes: the container scheduling node determines that each container in the container nodes reaching the load threshold meets a large-load configuration condition, and applies for capacity expansion to the container cluster management system;
and when determining that the container node in the abnormal state exists, the container scheduling node restarts the container node in the abnormal state.
In the above method, the container node in the abnormal state is restarted. Therefore, the container node in the abnormal state can be awakened, and if the container node is abnormal due to overlarge load, the container node can be rescheduled to be recovered to be normal, so that the operation and maintenance cost is saved, the manual intervention frequency is reduced, and the stability of the container cluster operation is ensured. If the container scheduling node determines that each container in the container nodes reaching the load threshold meets the heavy load configuration condition, the container scheduling node represents that each container node in the cluster is under the heavy load configuration condition, and capacity expansion is needed to ensure that the containers provide services normally.
In a second aspect, an embodiment of the present invention provides a container management apparatus, which is applied to a container cluster management system having a plurality of container nodes; the device includes:
the monitoring module is used for monitoring the load of the incompressible resource of each container node; each container node is provided with at least one container through the container cluster management system;
the processing module is used for determining a container to be migrated according to the use state of each container in the container nodes reaching the load threshold when the container nodes with the loads reaching the load threshold are determined to exist;
the processing module is further configured to send a migration instruction to the container cluster management system, where the migration instruction is used to instruct the container cluster management system to migrate the container to be migrated.
In a third aspect, an embodiment of the present application further provides a computing device, including: a memory for storing a program; a processor for calling the program stored in said memory and executing the method as described in the various possible designs of the first aspect according to the obtained program.
In a fourth aspect, embodiments of the present application further provide a computer-readable non-transitory storage medium including a computer-readable program which, when read and executed by a computer, causes the computer to perform the method as described in the various possible designs of the first aspect.
These and other implementations of the present application will be more readily understood from the following description of the embodiments.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings based on them without inventive effort.
Fig. 1 is a schematic diagram of a container management architecture according to an embodiment of the present invention;
fig. 2 is a schematic flow chart of a container management method according to an embodiment of the present invention;
fig. 3 is a schematic flow chart of a container management method according to an embodiment of the present invention;
fig. 4 is a schematic diagram of a container management apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail with reference to the accompanying drawings, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 shows a system architecture for managing containers according to an embodiment of the present invention. A container scheduling tool for container rescheduling is disposed in a container scheduling node 101. A container cluster management system 102 has multiple container nodes, and each container node in the container cluster management system 102 may be provided with K8s (a portable orchestration and management tool created for container services), an open source monitoring system (which may be Prometheus, an open source monitoring system suitable for a Kubernetes environment), and a full-link application performance monitoring system (APM, Application Performance Monitor, for which multiple open source projects currently exist, such as SkyWalking and Pinpoint). The full-link application performance monitoring system is mainly used for collecting request monitoring data of application containers. Each container node includes at least one container.
The container scheduling node 101 monitors the load information of each container node based on the open source monitoring system in each container node of the container cluster management system 102. If the load of an incompressible resource of a container node reaches the load threshold, the container scheduling node 101 sets that container node to a non-configurable state in which no services are allocated to it, determines the containers in that node that satisfy the heavy load configuration condition, and obtains the use state and utilization rate of each of the remaining containers. Specifically, in this embodiment, a monitoring agent process is injected in advance into the containers of each container node through the full-link application performance monitoring system. When the use state and utilization rate of each container need to be obtained, they are collected through the monitoring agent process, and the container to be migrated is determined according to them. The container scheduling node 101 then sends a migration instruction to K8S to instruct the container cluster management system 102 to migrate the container to be migrated. Therefore, compared with the prior art, in which the containers in the container nodes are rescheduled only by the rescheduling mechanism of the container cluster management system, this system architecture makes rescheduling more sensitive while ensuring that the containers continue to provide services normally, which improves the quality of service provided by the nodes.
Based on this, the embodiment of the present application provides a flow of a container management method, which is suitable for a container cluster management system having a plurality of container nodes; as shown in fig. 2, includes:
step 201, a container scheduling node monitors the load of the incompressible resource of each container node; each container node is provided with at least one container through the container cluster management system;
step 202, when determining that there is a container node with a load reaching a load threshold, the container scheduling node determines a container to be migrated according to the use state of each container in the container node with the load reaching the load threshold;
step 203, the container scheduling node sends a migration instruction to the container cluster management system, where the migration instruction is used to instruct the container cluster management system to migrate the container to be migrated.
In the above method, the container scheduling method is applied to a container cluster management system having a plurality of container nodes, where each container node includes at least one container. The container scheduling node monitors the load of each container node and the information related to its containers. In the embodiment of the present invention, an incompressible resource is a resource that cannot be reclaimed unless a container is killed, so its exhaustion can cause the container node to hang and affect the services of all containers on it. Therefore, when the container scheduling node determines that the load of any one of the container nodes reaches the load threshold, it acquires the use state of each container in that container node, determines a container to be migrated according to those use states, and migrates the container to be migrated to another container node, that is, a container node other than the one whose load reached the load threshold. In this way, after the incompressible resource in a node reaches the load threshold, containers that are not in use are migrated. This achieves container scheduling and balances resources among the nodes while keeping the services of the containers in the node running normally, so that the party being served does not perceive the migration. Compared with the prior art, which only performs rescheduling, the method can further balance resources among the nodes, takes the use condition of the containers into account, and improves the quality of service provided by the nodes.
In addition, by the method, after the nodes reach the load threshold, the containers in the high-load nodes are migrated according to the use conditions of the containers, so that the load of the high-load nodes is reduced, the abnormal probability of the high-load nodes is reduced, the node abnormal rate of the load cluster is further reduced, the reduction of the manual intervention times is realized, and the stability of the operation of the container cluster is ensured.
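Steps 201 to 203 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the dictionary shapes, and the layout of the migration instruction are hypothetical, and a real system would read loads from the monitoring system and send the instruction to the container cluster management system rather than return it.

```python
def find_overloaded_nodes(node_loads, load_threshold):
    """Step 201/202: return nodes whose incompressible-resource load
    has reached the load threshold."""
    return [node for node, load in node_loads.items() if load >= load_threshold]


def build_migration_instructions(node_loads, containers_by_node, load_threshold):
    """Step 202/203: for each overloaded node, choose containers that are
    not in use and describe them in a (hypothetical) migration instruction."""
    instructions = []
    for node in find_overloaded_nodes(node_loads, load_threshold):
        # containers_by_node maps a node name to (container_name, in_use) pairs.
        to_migrate = [name for name, in_use in containers_by_node.get(node, [])
                      if not in_use]
        if to_migrate:
            instructions.append({"source_node": node, "containers": to_migrate})
    return instructions


node_loads = {"node-a": 0.85, "node-b": 0.20}
containers_by_node = {
    "node-a": [("c1", True), ("c2", False)],
    "node-b": [("c3", False)],
}
result = build_migration_instructions(node_loads, containers_by_node, 0.80)
print(result)
```

The sketch deliberately omits the heavy load configuration check and the choice of target node, which the description treats separately.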
The embodiment of the application provides a container management method, wherein when the container scheduling node determines that there is a container node whose load reaches the load threshold, it sets the state of that container node to be non-configurable. That is to say, after the incompressible resource in a container node reaches the load threshold, the container node needs to be rescheduled. During this process, the container cluster management system must be prevented from continuing to allocate services to the container node. Otherwise, tasks might be allocated to a container that is about to be migrated, or containers might be re-created on the node and allocated tasks, which would disorder container scheduling and prevent the container to be migrated from migrating normally. The container node reaching the load threshold is therefore configured into a non-configurable state in which no services can be allocated to it, so that containers can only leave the node and not enter it, which ensures that container scheduling proceeds normally.
The embodiment of the application provides a container management method, wherein a container scheduling node monitors the load of incompressible resources of each container node, and the method comprises the following steps:
the container scheduling node monitors the load of the incompressible resource of each container node based on an open source monitoring system;
the method for acquiring the use state of each container in the container node comprises the following steps:
and the container scheduling node determines the use state of each container in the container node based on application performance monitoring (APM). That is, in the embodiment of the present invention, optionally, before step 201 is performed, the container management method includes the following steps:
based on the application performance monitoring APM, injecting the process of the monitoring agent into the container of each container node to monitor the use state of the container of each container node;
and when determining that there is a container node whose load reaches the load threshold, the container scheduling node triggers a use state acquisition instruction, so as to determine the use state of each container in the container node based on the application performance monitoring APM.
That is, the container node is provided with an open source monitoring system and an application performance monitoring APM. The open source monitoring system monitors the load indexes (such as CPU and memory) of the container node and is configured with a load threshold. If the load in the container node exceeds the load threshold, the identifier of the container node is reported to the container scheduling node, so that the container scheduling node determines the containers to be migrated according to the information of the containers in the container node. The application performance monitoring APM injects a monitoring agent (Agent) into a container in a manner transparent to the application, so that it can acquire data of the processes in the container, including request data for online requests, non-Web background transactions, databases, and various middleware. The use condition of the container is determined from this request data, and whether the container is a container to be migrated is determined based on its use condition. Illustratively, an online request is a request through which a container in the container node provides a service to an external server or client. If such data exists for the container, the container is providing a service externally and its use condition is "in use"; such a container is generally not taken as a container to be migrated, because migrating it would disrupt the service it provides externally. Non-Web background transactions are internal services of the container, such as transactions held in the container cache. Database data refers to the task data of batch or timed tasks that may exist between the container and a database; if such a container were taken as a container to be migrated, the batch or timed tasks between the container and the database would fail. Middleware requests may also be regarded as services inside the container.
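The way request data maps to a use state can be sketched as follows. This is only an illustration under stated assumptions: the `RequestData` fields, the three state labels, and the choice to group database and cache tasks together as internal work are hypothetical, and no real APM agent API is used.

```python
from dataclasses import dataclass


@dataclass
class RequestData:
    """Hypothetical summary of what a monitoring agent reports for one container."""
    online_requests: int = 0   # requests served to external servers or clients
    database_tasks: int = 0    # batch or timed tasks against a database
    cache_tasks: int = 0       # transactions being processed in the container cache


def use_state(data: RequestData) -> str:
    """Classify a container as serving externally, busy internally, or idle."""
    if data.online_requests > 0:
        return "serving_external"
    # Assumption: database and cache tasks are both treated as internal work.
    if data.database_tasks > 0 or data.cache_tasks > 0:
        return "busy_internal"
    return "idle"


print(use_state(RequestData(online_requests=12)))
print(use_state(RequestData(database_tasks=1)))
print(use_state(RequestData()))
```

Checking online requests first reflects the description's point that a container providing external service is generally not a migration candidate.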
Therefore, the use data of a container is captured from several aspects, so the use condition of the container can be obtained accurately, whether the container is in use can be judged, and the stability of container services is ensured. In addition, when the load of the container node is so large that many containers need to be migrated but few idle containers are available, the containers to be migrated can be determined according to the relative importance of internal and external services in the use information. In one example, the load value to be reallocated in the container node reaching the load threshold is a, the sum of the loads of the idle containers in the container node is b, and b is less than a. Because external services are more important than internal services, the containers that do not provide external services can be identified among the containers in use and taken as containers to be migrated. The method for determining the container to be migrated can be set flexibly as needed and is not particularly limited.
The embodiment of the application provides a container management method, in which determining a container to be migrated according to the use state of each container in the container node reaching the load threshold includes: determining containers meeting a heavy load configuration condition from the container nodes reaching the load threshold; and determining containers to be migrated from the containers other than those meeting the heavy load configuration condition. If a container meeting the heavy load configuration condition were migrated, the load of the node receiving it would immediately exceed the load threshold, so the scheduling would have no effect and resources would be wasted. Therefore, when determining the containers to be migrated, containers with a heavy load configuration are excluded, and the containers to be migrated are determined from the remaining containers. In this way, container scheduling achieves its cluster load balancing effect. The heavy load configuration condition may be: the average memory usage of the container is greater than or equal to the product of the difference between the load threshold and the average utilization rate of the container nodes, and the total memory amount of the container node.
In other words, the definition of the heavy load configuration is: the average container memory usage > (load threshold-average cluster utilization) × the total amount of container node memories (c _ memory _ avg >) (threshold _ v-g _ memory _ avg) × the node _ memories). In one example, the memory usage of a container is 8G, the load threshold is 80%, the average cluster utilization rate is 70%, and the total amount of memory of container nodes is 60G, in this case, 8> (80% -70%) 60) then the container is a heavy-load configuration container and cannot be used as a container to be migrated.
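The condition c_memory_avg >= (threshold_v - g_memory_avg) × node_memories can be checked with a short Python sketch. The function name `is_heavy_load` is hypothetical, and passing the two rates as integer percentages (to avoid floating-point rounding at the boundary) is a choice made for this illustration only.

```python
def is_heavy_load(c_memory_avg_gb, threshold_pct, g_memory_avg_pct, node_memories_gb):
    """Return True if a container meets the heavy-load configuration condition.

    c_memory_avg_gb   average memory usage of the container, in GB
    threshold_pct     load threshold, as a percentage (e.g. 80)
    g_memory_avg_pct  average cluster utilization, as a percentage (e.g. 70)
    node_memories_gb  total memory of the container node, in GB
    """
    # Memory headroom a receiving node has, on average, before it would
    # itself reach the load threshold.
    headroom_gb = (threshold_pct - g_memory_avg_pct) * node_memories_gb / 100
    return c_memory_avg_gb >= headroom_gb


# Values from the example: 8 GB container, 80% threshold, 70% average, 60 GB node.
print(is_heavy_load(8, 80, 70, 60))
```

With the example values the headroom is 6 GB, so the 8 GB container is classified as heavy-load and is excluded from migration.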
The embodiment of the application provides a method for determining containers to be migrated, in which determining the containers to be migrated from the containers other than those meeting the heavy-load configuration condition comprises: determining, from the containers other than those meeting the heavy-load configuration condition, containers in an idle state and/or containers with high resource utilization as the containers to be migrated. That is, the containers without a heavy-load configuration are obtained, and the idle containers among them are determined as containers to be migrated. Alternatively, the top few containers by resource utilization among them are determined as containers to be migrated, or the top few containers by resource utilization among the idle containers are determined as containers to be migrated.
The premise here is that the total load of the containers to be migrated is greater than or equal to the load that the container node needs to shed. Accordingly, if the load of the idle containers determined from these containers is smaller than the load that the container node needs to shed, containers to be migrated are further determined according to the usage condition of each container that is in use and does not have a heavy-load configuration. For example, a migration priority may be set according to the importance of the usage type or service item, so that containers whose usage type or service item is least affected by migration are taken as containers to be migrated. Alternatively, in the same situation, containers to be migrated are determined according to both the usage condition and the resource utilization of each container that is in use and does not have a heavy-load configuration. For example, migration priorities may be set for the usage type, the service item, and the resource utilization according to their importance, so that containers whose usage type and service item are least affected and whose resource utilization is high are taken as containers to be migrated.
In this way, container migration either does not affect the service provided by the containers or affects it as little as possible. (In the container rescheduling mechanism, scheduling according to the above method may require migrating containers that are in use. The scheduling scheme can therefore be set flexibly by weighing the impact of migration on the services the containers provide against the impact on those services of the node load reaching the load threshold; this is not specifically limited here.)
The embodiment of the application provides a method for determining containers to be migrated, in which the total load of the containers to be migrated is not less than the difference between the load of the container node reaching the load threshold and the average load of the container nodes, and the migration priority of idle containers is higher than that of containers with high resource utilization. That is, for a container node that reaches the load threshold, the containers to be migrated are scheduled away so that the load of that container node falls to the average load of the cluster.
In one example, the sum of the loads of the set of containers to be migrated = real-time load of the container node - cluster average load. If the current cluster average load is 70% and the real-time load of the container node is 85%, the load of the container node is expected to be reduced by 15%, to the cluster average load. In addition, to minimize the impact of scheduling on business services, idle containers with higher resource usage are preferred, which reduces both the impact on the services provided and the number of containers migrated. The priority rules for selecting containers to be migrated are: idle containers > containers in use (reduces the impact on service usage), and containers with high real-time resource usage > containers with low resource usage (reduces the number of containers migrated). Finally, once container scheduling for the container node is complete and the load of the container node has fallen below the load threshold, the configuration of the container node is changed back to the allocatable service state. The container node can continue to be observed for a period of time after it has been scheduled, and it can be rescheduled if it reaches the load threshold again within that period.
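The two priority rules and the load target described above can be sketched as a greedy selection in Python. The `Candidate` type, its fields, and `select_for_migration` are hypothetical names, and expressing each container's load as integer percentage points of node load is an assumption made for illustration; heavy-load containers are assumed to have been filtered out already.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical record for a container without a heavy-load configuration."""
    name: str
    load_pct: int   # contribution to the node load, in percentage points
    idle: bool      # True if the container is not in use


def select_for_migration(candidates, node_load_pct, cluster_avg_pct):
    """Greedy selection: idle before in-use, then higher load first,
    until the selected load covers (node load - cluster average load)."""
    target = node_load_pct - cluster_avg_pct
    ordered = sorted(candidates, key=lambda c: (not c.idle, -c.load_pct))
    selected, total = [], 0
    for c in ordered:
        if total >= target:
            break
        selected.append(c)
        total += c.load_pct
    # The second value reports whether the target was actually reached.
    return selected, total >= target
```

For a node at 85% with a cluster average of 70%, the target is 15 percentage points; idle containers are consumed first, and in-use containers are added only if the idle ones do not cover the target.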
The embodiment of the application provides a container management method, which further comprises the following steps:
if the container scheduling node determines that every container in the container node reaching the load threshold meets the heavy-load configuration condition, it applies to the container cluster management system for capacity expansion;
and when the container scheduling node determines that a container node in an abnormal state exists, it restarts the container node in the abnormal state. That is, if the containers in all container nodes of the cluster are heavy-load configuration containers, the container node cluster is short of resources, and node resources are added automatically to reduce the average utilization of the cluster. In addition, the load of a container node sometimes rises suddenly and the container cluster management system is slow to handle the high load, so that some basic services on a container node under sustained high load become unavailable (for example, the Docker service on which the containers depend). When the container scheduling node detects this situation, it determines that the container node is in an abnormal state; at that point some of the service containers on the node are also behaving abnormally. The container scheduling node automatically restarts the abnormal container node, so that the container node and the service containers on it return to normal within a short time. In this way, a container node in an abnormal state can be recovered, and if the abnormality was caused by excessive load, the containers on the node can be rescheduled so that it returns to normal, which saves operation and maintenance costs.
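The choice between restarting, expanding capacity, and migrating can be summarized in a small decision function. This is a sketch under assumptions: the function name, the string labels, and checking the abnormal state before the heavy-load check are all illustrative choices, since the source leaves the relative order of the restart and rescheduling flows open.

```python
def plan_node_action(node_abnormal, container_heavy_flags):
    """Return the action for one overloaded or abnormal container node.

    node_abnormal          True if basic services on the node are unavailable
    container_heavy_flags  one bool per container on the node; True means the
                           container meets the heavy-load configuration condition
    """
    if node_abnormal:
        return "restart"   # restart the abnormal container node
    if container_heavy_flags and all(container_heavy_flags):
        return "expand"    # nothing can be migrated: apply for capacity expansion
    return "migrate"       # at least one container can be rescheduled
```

An empty list of flags is treated as "migrate" here only so that the function never requests expansion for a node with no containers; the source does not cover that case.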
Based on the above method flow, an embodiment of the present application provides a flow of a container management method, as shown in fig. 3, including:
Step 301, monitor each container node in the container node cluster.
Step 302, determine whether an abnormal container node exists; if yes, execute step 303; if not, execute step 304.
Step 303, restart the abnormal container node.
Step 304, obtain, from the container node cluster, the container nodes whose load of incompressible resources reaches the load threshold.
Step 305, configure each container node that reached the load threshold to the non-allocatable service state (non-configurable).
Step 306, for each container node reaching the load threshold, determine the containers with a heavy-load configuration in that container node.
Step 307, obtain the use state and the resource utilization of each container, other than the containers with a heavy-load configuration, in the container node reaching the load threshold.
Step 308, determine the containers to be migrated according to the use states and the resource utilization of those containers.
Here, the idle containers are identified according to the use states of the containers. If the total load of the idle containers is greater than the load to be migrated from the container node reaching the load threshold, the idle containers are sorted by resource utilization in descending order, and those with the highest resource utilization are selected as containers to be migrated, such that their total load is greater than or equal to the load to be migrated from that container node. That is, the migration priority of idle containers is higher than that of containers with high resource utilization.
Step 309, migrating each container to be migrated according to the received migration instruction including each container to be migrated, and completing container scheduling.
Step 310, check whether the load of the container node that reached the load threshold has fallen below the load threshold; if yes, execute step 311; otherwise, return to step 307.
Step 311, configure the container node that reached the load threshold to the allocatable service state.
It should be noted that the order of the above steps is not the only possible one. For example, steps 302 and 303 form the restart flow, which may be executed in parallel or in series with the rescheduling flow of step 301 and steps 304 to 311; that is, the restart flow may also be executed after the rescheduling flow. This is not specifically limited here.
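The flow of steps 301 to 311 can be simulated on an in-memory node to make the control flow concrete. The following Python sketch is not the patented implementation: the dictionary layout, the names `node_load` and `run_flow`, the log strings, and the use of list removal as a stand-in for the migration instruction are all assumptions, and loads are integer percentage points of node load.

```python
def node_load(node):
    """Node load as the sum of its containers' loads (percentage points)."""
    return sum(c["load_pct"] for c in node["containers"])


def run_flow(node, threshold_pct, cluster_avg_pct):
    """Run one pass of the restart and rescheduling flows for a single node."""
    log = []
    if node["abnormal"]:                          # steps 302-303: restart flow
        node["abnormal"] = False
        log.append("restart")
    if node_load(node) < threshold_pct:           # step 304: not overloaded
        return log
    node["schedulable"] = False                   # step 305
    while node_load(node) >= threshold_pct:       # step 310 check
        # Step 306: exclude containers with a heavy-load configuration.
        movable = [c for c in node["containers"] if not c["heavy"]]
        if not movable:
            log.append("expand")                  # apply for capacity expansion
            break
        # Steps 307-308: idle first, then higher load first.
        movable.sort(key=lambda c: (not c["idle"], -c["load_pct"]))
        target = max(node_load(node) - cluster_avg_pct, 1)
        moved = 0
        for c in movable:
            if moved >= target:
                break
            node["containers"].remove(c)          # step 309 (stand-in for migration)
            log.append("migrate:" + c["name"])
            moved += c["load_pct"]
    else:
        node["schedulable"] = True                # step 311
    return log
```

The `while ... else` form marks the node allocatable again only when the loop ends because the load fell below the threshold; if the loop stops to request expansion, the node stays non-allocatable.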
Based on the same concept, an embodiment of the present invention provides a container management apparatus, which is suitable for a container cluster management system having a plurality of container nodes, and fig. 4 is a schematic diagram of the container management apparatus provided in the embodiment of the present application, as shown in fig. 4, including:
a monitoring module 401, configured to monitor a load of an incompressible resource of each container node; each container node is provided with at least one container through the container cluster management system;
a processing module 402, configured to determine, when it is determined that there is a container node whose load reaches a load threshold, a container to be migrated according to a usage state of each container in the container node that reaches the load threshold;
the processing module 402 is further configured to send a migration instruction to the container cluster management system, where the migration instruction is used to instruct the container cluster management system to migrate the container to be migrated.
Optionally, when determining that there is a container node whose load reaches the load threshold, the container scheduling node sets the state of the container node that reaches the load threshold to be non-configurable.
Optionally, the monitoring module 401 is specifically configured to monitor the load of the incompressible resource of each container node based on an open source monitoring system; the monitoring module 401 is further specifically configured to determine the use state of each container in the container node based on application performance monitoring (APM).
Optionally, the processing module 402 is specifically configured to determine, from the container node reaching the load threshold, the containers meeting a heavy-load configuration condition, where the heavy-load configuration condition is that the average utilization rate of the container nodes is greater than or equal to the product of (the difference between the load threshold and the average utilization rate of each container node) and the total memory amount of the container node; and to determine containers to be migrated from the containers other than those meeting the heavy-load configuration condition.
Optionally, the processing module 402 is specifically configured to determine, from the containers except the container satisfying the heavy load configuration condition, a container in an idle state and/or a container with a high resource usage rate as the container to be migrated.
Optionally, the total load of the containers to be migrated is not less than the difference between the load of the container node reaching the load threshold and the average load of each container node; the migration priority of the idle state container is higher than that of the container with high resource utilization rate.
Optionally, the processing module 402 is further configured to apply for capacity expansion to the container cluster management system if it is determined that each container in the container node that reaches the load threshold meets a heavy load configuration condition; and restarting the container node in the abnormal state when the container node in the abnormal state is determined to exist.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (10)

1. A container management method is characterized in that the method is suitable for a container cluster management system with a plurality of container nodes; the method comprises the following steps:
the container scheduling node monitors the load of the incompressible resource of each container node; each container node is provided with at least one container through the container cluster management system;
when the container scheduling node determines that the container nodes with loads reaching the load threshold exist, determining containers to be migrated according to the use states of all containers in the container nodes reaching the load threshold;
and the container scheduling node sends a migration instruction to the container cluster management system, wherein the migration instruction is used for indicating the container cluster management system to migrate the container to be migrated.
2. The method of claim 1, wherein the container scheduling node sets the status of the container node reaching a load threshold to non-configurable upon determining that there is a container node whose load reaches the load threshold.
3. The method of claim 1, wherein the container scheduling node monitoring the load of the incompressible resources of each container node comprises:
the container scheduling node monitors the load of the incompressible resource of each container node based on an open source monitoring system;
the method for acquiring the use state of each container in the container node comprises the following steps:
and the container scheduling node determines the use state of each container in the container node based on application performance monitoring (APM).
4. The method of claim 1, wherein determining the containers to be migrated based on the usage status of each container in the container nodes that reached the load threshold comprises:
determining containers meeting a heavy load configuration condition from the container nodes reaching the load threshold; the heavy load configuration condition is that the average utilization rate of the container nodes is greater than or equal to the product of the difference value of the load threshold value and the average utilization rate of each container node and the total memory amount of the container nodes;
and determining containers to be migrated from the containers except the containers meeting the heavy load configuration condition.
5. The method of claim 4, wherein determining the containers to be migrated from among the containers other than the containers satisfying the heavy load configuration condition comprises:
and determining containers in an idle state and/or containers with high resource utilization rate from the containers except the containers meeting the heavy load configuration condition as the containers to be migrated.
6. The method of claim 5, wherein the sum of the loads of the containers to be migrated is not less than the difference between the load of the container node that reached the load threshold and the average load of each container node;
the migration priority of the idle state container is higher than that of the container with high resource utilization rate.
7. The method as recited in claim 4, further comprising:
if the container scheduling node determines that each container in the container nodes reaching the load threshold meets the heavy load configuration condition, the container scheduling node applies for capacity expansion to the container cluster management system;
and when determining that the container node in the abnormal state exists, the container scheduling node restarts the container node in the abnormal state.
8. A container management apparatus, characterized in that the apparatus is suitable for a container cluster management system having a plurality of container nodes; the apparatus comprises:
the monitoring module is used for monitoring the load of the incompressible resource of each container node; each container node is provided with at least one container through the container cluster management system;
the processing module is used for determining a container to be migrated according to the use state of each container in the container nodes reaching the load threshold when the container nodes with the loads reaching the load threshold are determined to exist;
the processing module is further configured to send a migration instruction to the container cluster management system, where the migration instruction is used to instruct the container cluster management system to migrate the container to be migrated.
9. A computer-readable storage medium, characterized in that it stores a program which, when run on a computer, causes the computer to carry out the method of any one of claims 1 to 7.
10. A computer device, comprising:
a memory for storing a computer program;
a processor for calling a computer program stored in said memory to execute the method of any of claims 1 to 7 in accordance with the obtained program.
CN202111618279.6A 2021-12-27 2021-12-27 Container management method and device Pending CN114281479A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111618279.6A CN114281479A (en) 2021-12-27 2021-12-27 Container management method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111618279.6A CN114281479A (en) 2021-12-27 2021-12-27 Container management method and device

Publications (1)

Publication Number Publication Date
CN114281479A true CN114281479A (en) 2022-04-05

Family

ID=80876508

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111618279.6A Pending CN114281479A (en) 2021-12-27 2021-12-27 Container management method and device

Country Status (1)

Country Link
CN (1) CN114281479A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115118729A (en) * 2022-06-28 2022-09-27 中国电信股份有限公司 Container migration method, system and storage medium
CN115379019A (en) * 2022-08-19 2022-11-22 济南浪潮数据技术有限公司 Service scheduling method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
US8839243B2 (en) Remediating resource overload
CN112162865B (en) Scheduling method and device of server and server
EP3180695B1 (en) Systems and methods for auto-scaling a big data system
US11748154B2 (en) Computing node job assignment using multiple schedulers
EP2656215B1 (en) Scheduling and management in a personal datacenter
US7945913B2 (en) Method, system and computer program product for optimizing allocation of resources on partitions of a data processing system
US20150312167A1 (en) Maximizing server utilization within a datacenter
CN108132837B (en) Distributed cluster scheduling system and method
CN114281479A (en) Container management method and device
WO2016176060A1 (en) Balancing resources in distributed computing environments
CN111694633A (en) Cluster node load balancing method and device and computer storage medium
US20070038744A1 (en) Method, apparatus, and computer program product for enabling monitoring of a resource
KR20140122240A (en) Managing partitions in a scalable environment
KR20130136449A (en) Controlled automatic healing of data-center services
EP2726958B1 (en) Stochastic management of power consumption by computer systems
Talwar et al. An energy efficient agent aware proactive fault tolerance for preventing deterioration of virtual machines within cloud environment
CN112162839A (en) Task scheduling method and device, computer equipment and storage medium
CN110569124A (en) Task allocation method and device
US20210173699A1 (en) Decentralized resource scheduling
US8909666B2 (en) Data query system and constructing method thereof and corresponding data query method
CN112631756A (en) Distributed regulation and control method and device applied to space flight measurement and control software
CN111857990A (en) Method and system for enhancing YARN long type service scheduling
Wang et al. Remediating overload in over-subscribed computing environments
CN111158896A (en) Distributed process scheduling method and system
CN115480924A (en) Method and device for processing job data, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication