CN108965485B - Container resource management method and device and cloud platform - Google Patents

Publication number
CN108965485B
Authority
CN
China
Prior art keywords
container
node
expansion
group
elastic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811165919.0A
Other languages
Chinese (zh)
Other versions
CN108965485A (en)
Inventor
蔡志强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kingsoft Cloud Network Technology Co Ltd
Beijing Kingsoft Cloud Technology Co Ltd
Original Assignee
Beijing Kingsoft Cloud Network Technology Co Ltd
Beijing Kingsoft Cloud Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kingsoft Cloud Network Technology Co Ltd, Beijing Kingsoft Cloud Technology Co Ltd filed Critical Beijing Kingsoft Cloud Network Technology Co Ltd
Priority to CN201811165919.0A priority Critical patent/CN108965485B/en
Publication of CN108965485A publication Critical patent/CN108965485A/en
Application granted granted Critical
Publication of CN108965485B publication Critical patent/CN108965485B/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1095 Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a method and a device for managing container resources, and a cloud platform. The method comprises the following steps: when the container resource adjustment time corresponding to an elastic expansion group is reached, acquiring the correspondence between the expansion number of the elastic expansion group and time; determining, according to that correspondence, the expansion number of the elastic expansion group at the container resource adjustment time; determining the container operation nodes to be adjusted according to the determined expansion number and the number of containers on each container operation node; and adjusting the determined container operation nodes so that the number of containers in the elastic expansion group matches the determined expansion number. The method and the device automatically scale the container operation nodes in the elastic expansion group based on time, so that the number of containers in the group meets the service demand at the current time, while container operation node resources are released during idle periods to save cost. This improves the match between container resources and service demand and thereby the resource utilization of the cloud platform.

Description

Container resource management method and device and cloud platform
Technical Field
The invention relates to the technical field of cloud computing, in particular to a management method and device of container resources and a cloud platform.
Background
A container is a lightweight, portable, self-contained software packaging technique that allows applications to function in the same manner almost anywhere. Unlike conventional virtualization techniques, a container runs in a certain user space of an operating system, isolated from other processes of the operating system, and much smaller in size than a virtual machine. Starting the container does not require starting the entire operating system, so the container is faster to deploy and start, less expensive, and easier to migrate.
The number of containers required by a user's service running on a cloud platform differs between time periods. Adding containers at peak demand prevents server downtime caused by a sudden increase in traffic; reducing the number of containers when demand falls saves the container resources of the cloud platform and the rental cost of the user while keeping the service running stably. In existing cloud platforms, however, adjusting the container resources of a service requires manual intervention and manual deployment, which is inconvenient.
Disclosure of Invention
In view of this, the present invention provides a method and an apparatus for managing container resources, and a cloud platform, so as to automatically scale Node nodes or Pod nodes in an elastic scaling group, thereby improving the matching degree between the container resources and business requirements, and improving the resource utilization rate of the cloud platform.
In a first aspect, an embodiment of the present invention provides a method for managing container resources, where the method is applied to a Master node of a container management cluster, and the method includes: determining whether the container resource adjustment time corresponding to the elastic expansion group is reached; when it is determined that the container resource adjustment time is reached, acquiring the correspondence between the expansion number of the elastic expansion group and time; determining the expansion number of the elastic expansion group at the container resource adjustment time according to the correspondence between the expansion number and time, where the expansion number of the elastic expansion group indicates the number of containers; determining the container operation nodes to be adjusted according to the determined expansion number and the number of containers on each container operation node; and adjusting the determined container operation nodes so that the number of containers in the elastic expansion group matches the determined expansion number.
In a preferred embodiment of the present invention, the elastic expansion group is preset with an expansion number threshold; the expansion number threshold comprises a maximum expansion number and a minimum expansion number; the correspondence between the expansion number and time is set in the following manner: acquiring historical operating data of the user service within a preset time period; determining the correspondence between the traffic of the user service and time according to the historical operating data; and determining the correspondence between the expansion number and time according to the correspondence between the traffic and time and the expansion number threshold of the elastic expansion group.
In a preferred embodiment of the present invention, the elastic expansion group is further preset with an expected expansion number; the method further comprises the following steps: monitoring the number of containers of the elastic expansion group at a time other than the container resource adjustment time, and if the number of the containers exceeds the expansion number threshold of the elastic expansion group, determining the container operation nodes to be adjusted according to the expected expansion number and the number of the containers of each container operation node; and adjusting the determined container operation nodes to enable the number of the containers of the elastic expansion group to be matched with the expected expansion number.
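As a minimal sketch of the monitoring check just described, the following Python fragment returns how many containers would have to be added or removed to bring a group back to its expected expansion number. The class and field names are hypothetical, and reading "exceeds the expansion number threshold" as "falls outside the range from the minimum to the maximum expansion number" is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class ScalingGroup:
    # All field names are illustrative; they are not taken from the patent text.
    min_count: int       # minimum expansion number
    max_count: int       # maximum expansion number
    expected_count: int  # expected expansion number
    current_count: int   # containers currently running in the group


def correction_needed(group: ScalingGroup) -> int:
    """Return the signed number of containers to add (+) or remove (-).

    Zero means the current count lies within [min_count, max_count].
    """
    if group.min_count <= group.current_count <= group.max_count:
        return 0
    return group.expected_count - group.current_count
```

For instance, a group with limits 2 to 10, an expected number of 5 and 12 running containers yields -7, that is, seven containers to remove.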
In a preferred embodiment of the present invention, the step of determining the container operation node to be adjusted according to the determined expansion number and the number of containers in each container operation node includes: if the current container number of the elastic expansion group is smaller than the determined expansion number, calculating the difference value between the expansion number and the current container number of the elastic expansion group; searching container operation nodes with the number of operable containers matched with the difference value from the container operation nodes to be started; determining the searched container operation node as a container operation node to be adjusted; the step of adjusting the determined container operation node comprises: and adding the determined container operation node into the elastic expansion group.
In a preferred embodiment of the present invention, the step of determining the container operation node to be adjusted according to the determined expansion number and the number of containers on each container operation node includes: if the current number of containers in the elastic expansion group is larger than the determined expansion number, calculating the difference between the current number of containers in the elastic expansion group and the expansion number; searching, among the container operation nodes corresponding to the elastic expansion group, for a container operation node whose number of running containers matches the difference; and determining the found container operation node as the container operation node to be adjusted; the step of adjusting the determined container operation node comprises: removing the determined container operation node from the elastic expansion group.
In a preferred embodiment of the present invention, the method further includes: monitoring the running state of each container operation node in the elastic expansion group; if the running state of a container operation node is abnormal, terminating the running of the user service on the abnormally running container operation node; and starting a newly added container operation node to run the user service.
In a preferred embodiment of the present invention, the step of terminating the running of the user service on the abnormally running container operation node includes: recovering the service operation data of the abnormally running container operation node; and setting the running state of the abnormally running container operation node to stopped.
In a preferred embodiment of the present invention, the container operation node includes a Node node; the step of starting a newly added container operation node to run the user service includes: determining, through a cloud resource providing platform, a newly added Node node to be started; setting configuration information of the newly added Node node, the configuration information including at least the Node type, the type and capacity of the data storage disk, the configuration name, the password, and the load balancer that the elastic expansion group can bind; acquiring an image file of the user service; and deploying the image file to the newly added Node node and transferring the recovered service operation data to it, so that the newly added Node node continues to run the user service.
In a preferred embodiment of the present invention, the container operation node includes a Pod node; the step of starting a newly added container operation node to run the user service includes: copying an image file of the user service through the RC controller (Replication Controller) in the Master node; creating a newly added Pod node on a designated Node node; and running the copied image file on the newly added Pod node and transferring the recovered service operation data to it, so that the newly added Pod node continues to run the user service.
In a second aspect, an embodiment of the present invention provides an apparatus for managing container resources, where the apparatus is disposed at a Master node of a container management cluster, and the apparatus includes: a time determining module, configured to determine whether the container resource adjustment time corresponding to the elastic expansion group is reached; a correspondence acquisition module, configured to acquire the correspondence between the expansion number of the elastic expansion group and time when it is determined that the container resource adjustment time is reached; an expansion number determining module, configured to determine the expansion number of the elastic expansion group at the container resource adjustment time according to the correspondence between the expansion number and time, where the expansion number of the elastic expansion group indicates the number of containers; a container operation node determining module, configured to determine the container operation nodes to be adjusted according to the determined expansion number and the number of containers on each container operation node; and a first adjusting module, configured to adjust the determined container operation nodes so that the number of containers in the elastic expansion group matches the determined expansion number.
In a preferred embodiment of the present invention, the elastic expansion group is preset with an expansion number threshold; the expansion number threshold comprises a maximum expansion number and a minimum expansion number; the correspondence between the expansion number and time is set in the following manner: acquiring historical operating data of the user service within a preset time period; determining the correspondence between the traffic of the user service and time according to the historical operating data; and determining the correspondence between the expansion number and time according to the correspondence between the traffic and time and the expansion number threshold of the elastic expansion group.
In a preferred embodiment of the present invention, the elastic expansion group is further preset with an expected expansion number; the apparatus further includes: a first monitoring module, configured to monitor the number of containers in the elastic expansion group at times other than the container resource adjustment time and, if the number of containers exceeds the expansion number threshold of the elastic expansion group, to determine the container operation nodes to be adjusted according to the expected expansion number and the number of containers on each container operation node; and a second adjusting module, configured to adjust the determined container operation nodes so that the number of containers in the elastic expansion group matches the expected expansion number.
In a preferred embodiment of the present invention, the container operation node determining module is further configured to: if the current container number of the elastic expansion group is smaller than the determined expansion number, calculating the difference value between the expansion number and the current container number of the elastic expansion group; searching container operation nodes with the number of operable containers matched with the difference value from the container operation nodes to be started; determining the searched container operation node as a container operation node to be adjusted; a first adjustment module further configured to: and adding the determined container operation node into the elastic expansion group.
In a preferred embodiment of the present invention, the container operation node determining module is further configured to: if the current number of containers in the elastic expansion group is larger than the determined expansion number, calculate the difference between the current number of containers in the elastic expansion group and the expansion number; search, among the container operation nodes corresponding to the elastic expansion group, for a container operation node whose number of running containers matches the difference; and determine the found container operation node as the container operation node to be adjusted; the first adjusting module is further configured to: remove the determined container operation node from the elastic expansion group.
In a preferred embodiment of the present invention, the apparatus further comprises: a second monitoring module, configured to monitor the running state of each container operation node in the elastic expansion group; a termination module, configured to terminate the running of the user service on an abnormally running container operation node if the running state of that container operation node is abnormal; and a starting module, configured to start a newly added container operation node to run the user service.
In a preferred embodiment of the present invention, the termination module is further configured to: recover the service operation data of the abnormally running container operation node; and set the running state of the abnormally running container operation node to stopped.
In a preferred embodiment of the present invention, the container operation node includes a Node node, and the starting module is further configured to: determine, through a cloud resource providing platform, a newly added Node node to be started; set configuration information of the newly added Node node, the configuration information including at least the Node type, the type and capacity of the data storage disk, the configuration name, the password, and the load balancer that the elastic expansion group can bind; acquire an image file of the user service; and deploy the image file to the newly added Node node and transfer the recovered service operation data to it, so that the newly added Node node continues to run the user service.
In a preferred embodiment of the present invention, the container operation node includes a Pod node, and the starting module is further configured to: copy an image file of the user service through the RC controller (Replication Controller) in the Master node; create a newly added Pod node on a designated Node node; and run the copied image file on the newly added Pod node and transfer the recovered service operation data to it, so that the newly added Pod node continues to run the user service.
In a third aspect, an embodiment of the present invention provides a cloud platform, where a container management cluster runs on the cloud platform; the container management cluster comprises a Master node and elastic expansion groups; the Master node is connected with a plurality of elastic expansion groups, and the user services of users run in the elastic expansion groups; the apparatus of any one of claims 10-18 is disposed at the Master node.
The embodiment of the invention has the following beneficial effects:
The embodiment of the invention provides a method and a device for managing container resources, and a cloud platform. When the container resource adjustment time corresponding to an elastic expansion group is reached, the expansion number of the elastic expansion group at that time is determined according to the correspondence between the expansion number and time; the container operation nodes to be adjusted are then determined according to the expansion number and the number of containers on each container operation node, and those nodes are adjusted. The method automatically scales the container operation nodes in the elastic expansion group based on time, so that the number of containers in the group meets the service demand at the current time, while container operation node resources are released during idle periods to save cost. This improves the match between container resources and service demand and the resource utilization of the cloud platform.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the invention as set forth above.
In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
Fig. 1 is a schematic structural diagram of a container management cluster according to an embodiment of the present invention;
Fig. 2 is a flowchart of a method for managing container resources according to an embodiment of the present invention;
Fig. 3 is a flowchart of a method for setting the correspondence between the expansion number and time in a method for managing container resources according to an embodiment of the present invention;
Fig. 4 is a flowchart of another method for managing container resources according to an embodiment of the present invention;
Fig. 5 is a flowchart of another method for managing container resources according to an embodiment of the present invention;
Fig. 6 is a flowchart of another method for managing container resources according to an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a management apparatus for container resources according to an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of a cloud server according to an embodiment of the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
For ease of understanding, a container management cluster is first described below as an application scenario of the method for managing container resources in this embodiment. Referring to fig. 1, Master nodes are arranged in the container management cluster, and each Master node monitors the elastic expansion groups within a set range or number. Fig. 1 takes one Master node as an example; the Master node is connected with an elastic expansion group, and the elastic expansion group comprises a plurality of container operation nodes. A container operation node is either a Node node or a Pod node. All Node nodes are connected to the Master node, and a Node node may be a physical server or a virtual machine.
The Master node includes an API Server and an RC controller (Replication Controller). The API Server is the API interface for cluster management and the communication hub among all modules in the cluster; it can also authenticate and authorize user logins to provide security control for the cluster. To improve performance or achieve high availability, an application is replicated into several copies through the RC controller, and a Pod node is created for each copy; the RC controller keeps the number of Pod nodes actually running at any time equal to the specified number of copies. In other words, the RC controller is a replica abstraction of the Pod and handles the expansion and reduction of Pod capacity.
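The replica-keeping behaviour of the RC controller described above can be illustrated with a short Python sketch. It is not the actual controller code; the function name and the choice to delete the most recently listed Pods first are assumptions made only for illustration.

```python
from typing import List, Tuple


def reconcile(desired_replicas: int, running_pods: List[str]) -> Tuple[int, List[str]]:
    """Compare the desired replica count with the Pods that are running.

    Returns (number of Pods to create, names of Pods to delete).
    Deleting from the end of the list is an arbitrary illustrative policy.
    """
    surplus = len(running_pods) - desired_replicas
    if surplus > 0:
        # Too many Pods: mark the surplus for deletion.
        return 0, running_pods[-surplus:]
    # Too few (or exactly enough) Pods: create the shortfall.
    return -surplus, []
```

Calling the function repeatedly, for example on every change of Pod state, keeps the running count converging on the desired number of copies.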
The Node nodes are used to run Pod nodes; fig. 1 takes two Node nodes, Node A and Node B, as an example. Each Node node may run multiple Pod nodes. A Pod is the basic unit of deployment and scheduling and contains a set of containers and data volumes (also referred to as volumes). As described above, containers are used to run applications, while a data volume holds specific files or folders associated with one or more containers; the data volume is initialized when the container is created, and the container may use the files in the data volume at runtime. A Pod node may also be understood as an instance of an application; for example, a Web application usually consists of three components, a front end, a back end and a database, which may run in separate containers contained in the same Pod node.
Generally, an application contains many specific functions and services, and a Pod node is a place where the application is actually run and provides the specific services, and the services provided by all Pod nodes are combined together to form a complete application. The architecture of different applications is different, e.g., a web application may contain only one Pod node, and all users may access the web page running in the Pod node. For more complex applications, a code running environment needs to be provided for each user of the application, so that a Pod node needs to be allocated to each logged-in user, and a user-isolated code running environment is provided through the Pod node, wherein the Pod allocated to the user is only one component of the application.
The Service in the container management cluster is a routing-proxy abstraction over the Pod nodes and solves the problem of service discovery between Pod nodes. The running state of a Pod node usually changes dynamically, and a Pod node may be terminated or moved when machines are powered on or off or when capacity is expanded or reduced. An access end (which may be another Pod node in the cluster or a terminal outside the cluster) therefore cannot reach a given Pod node and obtain its service through a fixed IP address. The Service makes the dynamic changes of the Pod nodes transparent to the access end: the access end only needs to know the address of the Service, the Service acts as the proxy, and the Service finds the Pod nodes that meet the requirement through the labels of each Pod.
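How a Service might locate its Pod nodes through labels can be sketched as follows. This is a simplified illustration of equality-based label matching, with hypothetical Pod names and labels; it does not reproduce any real cluster API.

```python
from typing import Dict, List


def select_pods(pods: Dict[str, Dict[str, str]], selector: Dict[str, str]) -> List[str]:
    """Return the names of Pods whose labels contain every selector pair."""
    return [
        name
        for name, labels in pods.items()
        if all(labels.get(key) == value for key, value in selector.items())
    ]


# Hypothetical Pods and labels, for illustration only.
pods = {
    "web-1": {"app": "web", "tier": "front"},
    "db-1": {"app": "db"},
    "web-2": {"app": "web", "tier": "front"},
}
matching = select_pods(pods, {"app": "web"})
```

Because the access end addresses only the Service, the set returned here can change as Pod nodes come and go without the access end noticing.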
In the existing approach, adjusting the resources of a service requires manual intervention and manual deployment. This can allocate resources reasonably to each user service to a certain extent, but for time-sensitive user services the traffic changes markedly and frequently within a single day. Manual adjustment and deployment then involve a heavy workload and cannot guarantee that resources are adjusted in time, so the traffic of the user service and the container resources are poorly matched, and the problems of low resource utilization, or of insufficient resources affecting normal operation of the service, remain.
In order to avoid the above problem, embodiments of the present invention provide a method and an apparatus for managing container resources, and a cloud platform, and the technology may be applied to a cloud platform providing various cloud services. The following is a detailed description by way of example.
Firstly, referring to a flow chart of a management method for container resources shown in fig. 2, the method is applied to a Master node of a container management cluster, the Master node is generally connected with at least one elastic expansion group, and the elastic expansion group includes a plurality of container operation nodes; the container operation Node may be a Node or a Pod Node.
The method comprises the following steps:
Step S202, determining whether the container resource adjustment time corresponding to the elastic expansion group is reached;
In practical implementation, a time list may be set, where the time list includes the container resource adjustment times corresponding to the elastic expansion group; these times may be extracted from the correspondence between the expansion number of the elastic expansion group and time, and may also be set by the user. A preset timer can monitor in real time whether the current time is a container resource adjustment time corresponding to the elastic expansion group.
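One possible way to realise the timer check is to poll periodically and report every entry of the time list that has fallen due since the previous poll. The sketch below assumes such a polling timer and assumes that both polls fall on the same calendar day; the function name is hypothetical.

```python
from datetime import datetime, time
from typing import List


def due_adjustments(time_list: List[time], last_check: datetime, now: datetime) -> List[time]:
    """Return the adjustment times t with last_check < t <= now.

    Simplification: last_check and now are assumed to be on the same day.
    """
    return [t for t in time_list if last_check.time() < t <= now.time()]


# Illustrative time list with two daily adjustment times.
time_list = [time(8, 0), time(20, 0)]
hit = due_adjustments(time_list, datetime(2018, 10, 8, 7, 59, 30), datetime(2018, 10, 8, 8, 0, 30))
miss = due_adjustments(time_list, datetime(2018, 10, 8, 8, 0, 30), datetime(2018, 10, 8, 8, 1, 30))
```

Using a half-open window between consecutive polls means each adjustment time is reported exactly once, even if the timer does not fire at the precise second.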
Step S204, when it is determined that the container resource adjustment time is reached, acquiring the correspondence between the expansion number of the elastic expansion group and time; determining the expansion number of the elastic expansion group at the container resource adjustment time according to the correspondence between the expansion number and time; the expansion number of the elastic expansion group indicates the number of containers;
The correspondence can be set in advance according to the user service run by the elastic expansion group. In the initial operation stage of the elastic expansion group, the cloud platform generally allocates a fixed number of containers to the group according to the requirements of the user, and these containers are run by corresponding container operation nodes, so the container operation nodes corresponding to the elastic expansion group are relatively fixed. To adjust the number of containers in time, the number of containers required by the service in different time periods can be estimated from the historical operating data of the user service. For example, the historical operating data of some large enterprise websites shows that daytime traffic is significantly higher than evening traffic, whereas the historical operating data of some large shopping platforms shows that night-time traffic is significantly higher than daytime traffic. Therefore, to make full use of the resources of the cloud platform, an expansion number matched to the service needs to be set for each service and each time period. Preferably, the traffic obtained for the different time periods is arranged into a table or a formula, so that the cloud platform sets different expansion numbers for different time periods and resources are fully utilized.
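A table of this kind could be derived from historical traffic roughly as follows. The sketch assumes that each container can serve a fixed amount of traffic per hour (capacity_per_container, a hypothetical parameter not given in the text) and bounds the result by the minimum and maximum expansion numbers of the group.

```python
import math
from typing import Dict


def build_schedule(
    traffic_by_hour: Dict[int, float],
    capacity_per_container: float,
    min_count: int,
    max_count: int,
) -> Dict[int, int]:
    """Map each hour to an expansion number derived from historical traffic."""
    schedule = {}
    for hour, traffic in traffic_by_hour.items():
        # Containers needed to carry the observed traffic, rounded up.
        needed = math.ceil(traffic / capacity_per_container)
        # Keep the result within the expansion number threshold of the group.
        schedule[hour] = max(min_count, min(max_count, needed))
    return schedule


# Illustrative figures only: requests per hour at 03:00, 10:00 and 21:00.
schedule = build_schedule({3: 100, 10: 4500, 21: 20000}, 1000, 2, 10)
```

With these made-up figures the quiet hour is raised to the minimum of 2, the mid-morning hour needs 5 containers, and the evening peak is capped at the maximum of 10.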
The container resource adjustment time is usually a time period and can be determined according to the actual conditions of service operation, such as typical traffic peak and off-peak periods, promotional activity periods, and periods of concentrated logins and accesses; these time periods are estimated and an expansion number is set for each of them. Since the number of containers required in these time periods differs significantly from the normal case, adjusting the expansion number for the container resource adjustment time improves the match between the number of containers and the service.
In order to adjust the number of containers in time during the adjustment time of the container resources, a monitoring timer can be set; and when the time reaches the container resource adjusting time, automatically acquiring the corresponding expansion number from the corresponding relation between the expansion number and the time.
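The lookup from time to expansion number described above can be sketched as a small table search. This is only an illustration under stated assumptions: the `SCHEDULE` table, its time windows and its container counts are hypothetical values, not figures from the embodiment, and `expansion_number_at` is an illustrative name.

```python
from datetime import time
from typing import List, Optional, Tuple

# Hypothetical correspondence between time windows and expansion numbers.
# Each entry is (window start, window end, expansion number).
SCHEDULE: List[Tuple[time, time, int]] = [
    (time(9, 0), time(18, 0), 8),   # assumed daytime peak
    (time(18, 0), time(23, 0), 4),  # assumed evening period
]


def expansion_number_at(now: time,
                        schedule: List[Tuple[time, time, int]] = SCHEDULE
                        ) -> Optional[int]:
    """Return the expansion number of the window containing `now`.

    Returns None when `now` lies outside every container resource
    adjustment time, i.e. no time-based adjustment is due.
    """
    for start, end, count in schedule:
        if start <= now < end:  # windows are half-open: [start, end)
            return count
    return None
```

A monitoring timer would call such a function periodically and trigger step S206 only when the result is not None.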
Step S206, determining container operation nodes to be adjusted according to the determined expansion number and the container number of each container operation node;
as can be seen from the above, the container operation node may be a Node node or a Pod node. Because of the various hardware resource limits in the cluster, different container operation nodes can accommodate different numbers of containers, and the Master node adjusts the number of containers in the elastic expansion group by adjusting the container operation nodes. Therefore, after the expansion number corresponding to the current container resource adjustment time is determined, a suitable container operation node needs to be selected for adjustment, so that adjusting the selected node makes the number of containers in the elastic expansion group match the determined expansion number.
For example, if the determined expansion number is greater than the current number of containers in the elastic expansion group, a new container operation node generally needs to be obtained from the cloud resource providing platform, and the capacity of the new container operation node needs to match the number of containers to be added (i.e., the difference between the determined expansion number and the current number of containers in the elastic expansion group); specifically, the number of containers that the new container operation node can run needs to be equal to or slightly larger than the number of containers to be added. If the determined expansion number is smaller than the current number of containers in the elastic expansion group, a container operation node is usually deleted from the elastic expansion group, and the number of containers running normally in the deleted node needs to match the number of containers to be removed (i.e., the difference between the current number of containers in the elastic expansion group and the determined expansion number); specifically, the number of containers currently run by the deleted container operation node needs to be equal to or slightly less than the number of containers to be removed.
And S208, adjusting the determined container operation node to enable the container number of the elastic expansion group to be matched with the determined expansion number.
Specifically, when the container resource adjustment time is reached, the Master node compares the number of containers of the elastic expansion group with the expansion number corresponding to that time in the corresponding relationship. If the number of containers is higher than that expansion number, the Master node automatically recovers the surplus container operation nodes and transfers their operation data to the remaining container operation nodes in the elastic expansion group; if the number of containers is lower than that expansion number, the Master node creates a new container operation node and, after the files are installed and the configuration information is set, runs the user service on the new container operation node.
The embodiment of the invention provides a management method of container resources, which is characterized in that when the adjustment time of the container resources corresponding to an elastic expansion group is reached, the expansion number of the elastic expansion group in the adjustment time of the container resources is determined according to the corresponding relation between the expansion number and the time; and determining the container operation nodes to be adjusted according to the expansion number and the container number of each container operation node, and further adjusting the container operation nodes. The method automatically stretches the container operation nodes in the elastic stretching group based on time, so that the number of containers in the elastic stretching group meets the service requirement of the current time, and meanwhile, the container operation node resources are released in idle time to save cost, the matching degree of the container resources and the service requirement is improved, and the resource utilization rate of the cloud platform is improved.
In the above embodiment, the number of containers in the elastic expansion group is adjusted according to the preset corresponding relationship between the expansion number and the time, so this corresponding relationship largely determines how well the adjusted number of containers in the elastic expansion group matches the user service; therefore, this embodiment first describes how the correspondence between the expansion number and the time is set. To prevent an elastic expansion group from increasing or reducing its number of containers without limit, the elastic expansion group is usually preset with an expansion number threshold; the expansion number threshold includes a maximum expansion number and a minimum expansion number.
For example, the maximum expansion number in one elastic expansion group is 10 containers, and the minimum expansion number is 2 containers; of course, the determination may be specifically determined according to the actual situation of the service or the cloud platform. The Master node can monitor the number of containers in each elastic expansion group at regular time; if the number of containers created in one elastic expansion group is larger than the maximum expansion number, the Master node automatically reduces the number of container operation nodes in the expansion group; and if the number of the containers created in one elastic expansion group is less than the minimum expansion number, the Master node automatically increases the number of the container operation nodes in the expansion group.
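The threshold check above can be sketched as a function that returns how many containers must be added or removed. The source does not say what count the Master node restores; this sketch assumes it moves the group to the nearest bound, and the defaults of 2 and 10 simply reuse the example values. The name `threshold_correction` is hypothetical.

```python
def threshold_correction(container_count: int,
                         min_count: int = 2,
                         max_count: int = 10) -> int:
    """Return the signed change needed to bring a group back in range.

    Positive: containers to add. Negative: containers to remove.
    Zero: the count already lies within [min_count, max_count].
    Assumption: the group is moved to the nearest bound.
    """
    if min_count > max_count:
        raise ValueError("min_count must not exceed max_count")
    if container_count > max_count:
        return max_count - container_count  # negative: scale in
    if container_count < min_count:
        return min_count - container_count  # positive: scale out
    return 0
```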
Referring to fig. 3, a specific flow of the setting manner of the correspondence between the expansion number and the time is described as follows:
step S302, obtaining historical operation data of user service in a preset time period;
the historical operation data can be CPU utilization rate, memory utilization rate, network card traffic, listener traffic and the like. The CPU utilization rate is the percentage of CPU in the container operation node occupied by the container when it runs the program of the user service. A low CPU utilization rate indicates that the container operation node is under-utilized; a higher CPU utilization rate indicates that the node is better utilized, but an excessively high CPU utilization rate slows the program of the user service, reducing the performance and response speed of the service; in addition, an excessively high CPU utilization rate can overheat the CPU in the container operation node, and the service life of the CPU can be greatly shortened. The memory utilization rate is the percentage of memory occupied by the container operation node when running the program of the user service. As with the CPU utilization rate, a low memory utilization rate indicates that the node is under-utilized and a high rate indicates that it is well utilized, but an excessively high memory utilization rate also slows the program of the user service and degrades the performance of the service.
The network card traffic is the amount of uplink and downlink data of the container operation node; high network card traffic usually indicates that the user service is being accessed heavily. A listener can be arranged on the Master node or the container operation node and can be implemented as a software program; the listener can be used to monitor the traffic of a specified container operation node or of a specified data type, and the inbound and outbound listener traffic can indicate the amount of access to specific functional modules of the user service. The packet forwarding rate is generally used to measure the data throughput and data processing capability of the container operation node; a higher packet forwarding rate can also indicate a larger traffic volume of the user service.
Step S304, determining the corresponding relation between the service volume of the user service and the time according to the historical operation data;
according to the historical operating data of the user service, a time period with high CPU utilization rate, high memory utilization rate, high network card traffic or high listener traffic can be determined to be a period of high traffic volume; correspondingly, a time period with low CPU utilization rate, low memory utilization rate, low network card traffic or low listener traffic can be determined to be a period of low traffic volume.
Step S306, determining the corresponding relation between the expansion number and the time according to the corresponding relation between the traffic and the time and the expansion number threshold of the elastic expansion group.
For example, according to the obtained correspondence between the traffic volume and the time, the scaling number of the time with low traffic volume may be set as the minimum scaling number; setting the expansion number of the time with high traffic as the maximum expansion number, such as some special festivals; and at other times when the traffic is relatively average, the expansion number between the maximum expansion number and the minimum expansion number can be set in proportion.
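The proportional setting described above can be sketched as a linear interpolation between the minimum and maximum expansion numbers. The traffic bounds `low` and `high`, the function name, and the choice to round up (to avoid under-provisioning) are assumptions for illustration; the source only says the value is set in proportion.

```python
import math


def expansion_number_for(traffic: float, low: float, high: float,
                         min_count: int, max_count: int) -> int:
    """Map a traffic level to an expansion number.

    Traffic at or below `low` gets the minimum expansion number, traffic
    at or above `high` gets the maximum, and values in between are
    interpolated linearly and rounded up (an assumed rounding rule).
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    if traffic <= low:
        return min_count
    if traffic >= high:
        return max_count
    fraction = (traffic - low) / (high - low)
    return min_count + math.ceil(fraction * (max_count - min_count))
```

With thresholds of 2 and 10 containers, a traffic level halfway between `low` and `high` maps to 6 containers.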
In the above manner, the corresponding relationship between the service volume of the user service and the time can be obtained through the historical operation data of the user service, and further the corresponding relationship between the expansion number and the time can be obtained; when the elastic expansion group adjusts the number of the container operation nodes according to the corresponding relation between the expansion number and the time, the matching degree between the actual requirement of the user service and the number of the containers can be higher.
The embodiment of the invention also provides another container resource management method, which is realized on the basis of the embodiment; the embodiment mainly describes the monitoring process of the number of containers in the elastic telescopic group after the telescopic number is adjusted; the correspondence between the number of expansion and contraction and the time may only control the number of expansion and contraction in the container resource adjustment time, and in other time periods, the number of expansion and contraction in the elastic expansion group may be controlled by other factors, such as actual traffic or resource balance between the elastic expansion groups, and therefore, the number of containers in each elastic expansion group needs to be monitored in real time or at regular time.
The elastic expansion group in this embodiment is also preset with an expected expansion number; the expected expansion number is generally an expansion number estimated by an operator for the user service, and it is suitable for the traffic volume of the user service in most time periods. As shown in fig. 4, the method for managing container resources further includes:
step S402, monitoring the number of containers of the elastic expansion group at the time except the container resource adjusting time;
in actual implementation, the Master node may be used to monitor the number of container operation nodes of the elastic expansion and contraction group.
Step S404, judging whether the number of the containers exceeds the expansion number threshold value of the elastic expansion group; if yes, go to step S406; if not, step S402 is performed.
The above-mentioned expansion number threshold range may be composed of a maximum expansion number and a minimum expansion number, and if the number of containers in the elastic expansion group is smaller than the minimum expansion number or larger than the maximum expansion number, it may be determined that the expansion number threshold of the elastic expansion group is exceeded.
Step S406, determining container operation nodes to be adjusted according to the expected expansion number and the container number of each container operation node;
similar to the above embodiment, if the current number of containers in the elastic expansion group is smaller than the minimum expansion number, it is usually necessary to invoke a new container operation node from the cloud resource providing platform, and the capacity of the new container operation node needs to match the increased number of containers (i.e. the difference between the expected expansion number and the current number of containers in the elastic expansion group); specifically, the number of containers that can be operated by the newly added container operation node needs to be equal to or slightly larger than the increased number of containers.
If the current number of containers in the elastic expansion group is greater than the maximum expansion number, container operation nodes generally need to be deleted from the elastic expansion group, and the number of containers running normally in the deleted nodes needs to match the number of containers to be removed (i.e., the difference between the current number of containers in the elastic expansion group and the expected expansion number); specifically, the number of containers currently run by the deleted container operation node needs to be equal to or slightly less than the number of containers to be removed.
And step S408, adjusting the determined container operation nodes to enable the number of the containers of the elastic expansion group to be matched with the expected expansion number.
Specifically, when the current number of containers of the elastic expansion group is smaller than the minimum expansion number, the Master node may notify an elastic expansion group management node of the cloud platform to allocate more container operation nodes to the user service, so that the total number of containers of the elastic expansion group running the user service matches the expected expansion number. After the elastic expansion group management node allocates a new container operation node to the user service, it can feed allocation completion information back to the Master node; this information usually includes the access path, node identifier and similar details of the new container operation node, so that the Master node brings the new container operation node into its monitoring range.
By the method, the number of the container operation nodes in the elastic telescopic group can be always kept in a reasonable range, and the stability of the elastic telescopic group is improved while the number of the container operation nodes is ensured to meet user services.
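Steps S402 to S408 can be sketched as a single check run outside the container resource adjustment time. The function and parameter names are hypothetical; the behaviour follows the text, where a count outside the threshold range is brought back to the expected expansion number rather than to the nearest bound.

```python
def adjustment_outside_window(current: int, min_count: int,
                              max_count: int, expected: int) -> int:
    """Return the signed number of containers to add (+) or remove (-).

    While the current count stays within [min_count, max_count] nothing
    is changed (the flow returns to monitoring, step S402). Once the
    count leaves that range, the group is restored to `expected`.
    """
    if not (min_count <= expected <= max_count):
        raise ValueError("expected must lie within the threshold range")
    if min_count <= current <= max_count:
        return 0
    return expected - current
```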
The embodiment of the invention also provides another container resource management method, which is realized on the basis of the above embodiment; this embodiment further describes how the container operation nodes are adjusted when the current number of containers in the elastic expansion group is smaller than or larger than the determined expansion number. The process is shown in fig. 5 and includes the following steps:
step S502, determining whether the container resource adjusting time corresponding to the elastic expansion group is reached;
step S504, when it is determined that the container resource adjustment time has been reached, acquiring the corresponding relation between the expansion number and the time of the elastic expansion group; determining the expansion number of the elastic expansion group in the container resource adjustment time according to the corresponding relation between the expansion number and the time; the expansion number of the elastic expansion group indicates the number of containers;
step S506, judging the size relationship between the current container number of the elastic expansion group and the determined expansion number; if the current number of the containers of the elastic expansion group is less than the determined expansion number, executing step S508; if the current container number of the elastic expansion group is larger than the determined expansion number, executing step S514; if the current container number of the elastic expansion group is equal to the determined expansion number, ending;
it can be understood that if the current number of containers of the elastic scaling group is equal to the determined scaling number, which indicates that the current number of containers of the elastic scaling group matches the traffic corresponding to the container resource adjustment time, then there is no need to adjust the container operation node of the elastic scaling group.
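The three-way comparison of step S506 can be sketched as follows. The action labels `"scale_out"`, `"scale_in"` and `"none"` are illustrative names for the branches leading to steps S508, S514 and the end of the flow.

```python
from typing import Tuple


def decide_action(current: int, expansion_number: int) -> Tuple[str, int]:
    """Return (action, difference) for step S506.

    "scale_out": fewer containers than required (continue at S508).
    "scale_in":  more containers than required (continue at S514).
    "none":      counts already match, so no adjustment is needed.
    The difference is always non-negative.
    """
    if current < expansion_number:
        return "scale_out", expansion_number - current
    if current > expansion_number:
        return "scale_in", current - expansion_number
    return "none", 0
```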
Step S508, calculating the difference between the expansion number and the current container number of the elastic expansion group;
step S510, searching a container operation node with the number of operable containers matched with the difference value from the container operation nodes to be started; determining the searched container operation node as a container operation node to be adjusted;
the cloud resource providing platform is used for managing container operation nodes to be started in the cluster, and the container operation nodes usually have different configuration parameters such as CPU, memory and network flow; therefore, the number of containers that can be accommodated by different container operation nodes is also different; configuration parameters for each container operation node may be queried from the cloud resources provisioning platform. After the difference value between the expansion number and the current container number of the elastic expansion group is obtained, the configuration parameters of the matched container operation nodes can be obtained through conversion according to the container number corresponding to the difference value, the available container operation nodes are inquired from the cloud resource providing platform based on the configuration parameters, and then the container operation nodes to be adjusted are determined.
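The search in step S510 can be sketched as choosing, among the container operation nodes waiting to be started, the smallest one whose capacity still covers the difference. The `StandbyNode` record and the idea that capacity is already expressed as a container count are assumptions; in the text, capacity is derived from configuration parameters such as CPU and memory.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class StandbyNode:
    name: str
    capacity: int  # assumed: number of containers the node can run


def pick_node_to_add(nodes: Sequence[StandbyNode],
                     difference: int) -> Optional[StandbyNode]:
    """Pick the node whose capacity is equal to or slightly above `difference`.

    Returns None when no single standby node is large enough.
    """
    candidates = [n for n in nodes if n.capacity >= difference]
    # The smallest sufficient capacity wastes the fewest resources.
    return min(candidates, key=lambda n: n.capacity, default=None)
```

This sketch handles a single node only; covering the difference with several smaller nodes is a possible extension that the text does not describe.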
And S512, adding the determined container operation node into the elastic expansion group, and ending.
In actual implementation, the container operation node generally needs to be started first, and detailed configuration information, such as the container type, the type and capacity of the data volume, the name and the password, is set for it; the configuration information of the container operation node is then stored in the Master node of the elastic expansion group and added to the group's management list of container operation nodes. After the determined container operation node is added to the elastic expansion group, it is usually necessary to install on it the container package file of the user service corresponding to the elastic expansion group, so that the container operation node can run the container normally.
Step S514, calculating the difference value between the current container quantity and the expansion number of the elastic expansion group;
step S516, searching container operation nodes with the number of operated containers matched with the difference value from the container operation nodes corresponding to the elastic expansion group; determining the searched container operation node as a container operation node to be adjusted;
after the difference value between the current container number of the elastic telescopic group and the telescopic number is obtained, the Master node can search and obtain the container operation nodes matched with the container number corresponding to the difference value according to the container number of each container operation node in the elastic telescopic group in operation; the number of the containers currently operated by the matched container operation node needs to be equal to or slightly smaller than the number of the containers corresponding to the difference.
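The search in step S516 mirrors the scale-out case: among the nodes already in the group, pick the one running the most containers without exceeding the difference. The `RunningNode` record and the exclusion of empty nodes are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class RunningNode:
    name: str
    running: int  # number of containers currently run by the node


def pick_node_to_remove(nodes: Sequence[RunningNode],
                        difference: int) -> Optional[RunningNode]:
    """Pick the node whose container count is equal to or slightly below `difference`.

    Nodes running zero containers are skipped here because removing them
    would not reduce the container count (an assumption of this sketch).
    Returns None when every node runs more containers than `difference`.
    """
    candidates = [n for n in nodes if 0 < n.running <= difference]
    return max(candidates, key=lambda n: n.running, default=None)
```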
And step S518, removing the searched determined container operation node from the elastic expansion group.
In order to ensure that user services are not affected, before removing a container operation node, it is usually necessary to recycle a container operated in the container operation node, including an operation program of the container, stored data, and the like; and then notifying the cloud resource providing platform, and managing the container operation node by the cloud resource providing platform so as to be used for subsequent node scheduling.
In the above manner, when the container resource adjustment time corresponding to the elastic expansion group is reached, if the current container number of the elastic expansion group is not matched with the expansion number, the container operation node to be adjusted is determined according to the expansion number and the container number of each container operation node, and then the container operation node is adjusted. The method automatically stretches the container operation nodes in the elastic stretching group based on time, so that the number of containers in the elastic stretching group meets the service requirement of the current time, and meanwhile, the container operation node resources are released in idle time to save cost, the matching degree of the container resources and the service requirement is improved, and the resource utilization rate of the cloud platform is improved.
The embodiment of the invention also provides another container resource management method, which is realized on the basis of the embodiment; the embodiment mainly describes the monitoring process of the container operation node state in the elastic telescopic group after the telescopic number is adjusted, and the process of terminating and newly adding container operation nodes according to different types of container operation nodes; the process is shown in fig. 6, and includes the following steps:
step S602, monitoring the operation state of each container operation node in the elastic expansion group.
Step S604, judging whether the operation state of the container operation node is abnormal operation; if yes, go to step S606; if not, step S602 is performed.
In practice, when a container operation node fails, the Master node may receive a fault signal sent by that node or by the physical server running it; alternatively, if operation data of the container operation node, such as the CPU utilization rate, memory utilization rate, network card traffic or listener traffic, exceeds a preset threshold, the Master node may also determine that the node is operating abnormally.
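The judgment in step S604 can be sketched as a check that combines an explicit fault signal with per-metric thresholds. The metric names and the 90 percent limits in `THRESHOLDS` are hypothetical; the source only states that preset thresholds exist.

```python
from typing import Mapping

# Hypothetical preset thresholds, in percent; not taken from the source.
THRESHOLDS: Mapping[str, float] = {"cpu": 90.0, "memory": 90.0}


def is_abnormal(metrics: Mapping[str, float],
                fault_signal: bool = False,
                thresholds: Mapping[str, float] = THRESHOLDS) -> bool:
    """Return True when a container operation node should be treated as abnormal.

    A node is abnormal if a fault signal was received for it, or if any
    monitored metric exceeds its preset threshold. Metrics that were not
    reported are treated as 0 and therefore never trigger.
    """
    if fault_signal:
        return True
    return any(metrics.get(name, 0.0) > limit
               for name, limit in thresholds.items())
```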
Step S606, the service operation data of the container operation node which operates abnormally is recovered.
When the container operation node is abnormally operated, the abnormal container operation node is likely to cause user services to be affected, so the container operation node should be recovered and a new container operation node should be started to replace the abnormally operated container operation node. The operation data may include the operation program of the container operation node, data generated in the operation process, and the like.
In step S608, the operation state of the container operation node that operates abnormally is set to stop operating.
And setting the running state of the detected container running node which runs abnormally as a running stop state for manual repair.
Certainly, for some container operation nodes with special functions, including Node nodes or Pod nodes, a protection function can be preset, that is, even if a Master Node detects that the state of the container operation Node is abnormal, the container operation Node is not recycled; the traffic of the container operation node can be gradually reduced in other ways, such as a shunting traffic way, so that the container operation node is restored to a normal state.
Step S610, judging the type of the container operation node; if the container operation Node is a Node, executing step S612; if the container operation node is a Pod node, step S620 is performed.
Step S612, determining the newly added Node to be started through the cloud resource providing platform.
Since a Node is usually a physical server or a virtual machine, if a Node is to be newly added for running a user service, it is usually necessary to set configuration information for the Node, install an image file of an operating system, and the like, as follows.
Step S614, setting configuration information of the newly added Node; the configuration information at least comprises Node type, type and capacity of data storage disk, configuration name, password and bindable load balance of elastic expansion group.
The Node type can be a physical server or a virtual machine. After the Node type is determined, the name and password of the Node and the load balancing of the elastic expansion group to which the Node can be bound are set. Because the Node is a functional module that executes the user service, it needs to be bound to the load balancing of the elastic expansion group in order to expand the bandwidth of network devices and servers, increase throughput, strengthen network data processing capacity, and improve the flexibility and availability of the network.
Step S616, acquiring a mirror image file of the user service.
The image file is a single file which is made from a specific series of files according to a certain format, and the image file stores an operation program for operating the user service.
Step S618, deploy the mirror image file to the newly added Node, and transfer the recovered service operation data to the newly added Node, so that the newly added Node continues to operate the user service.
After the operation parameter setting is completed, the mirror image file can be installed in the container operation Node, after the installation is completed, the newly added Node can normally operate, the recovered service operation data is transferred to a container in the newly added Node, and the container can continue to operate the user service.
Because the Pod node is a basic deployment and scheduling unit, if a new Pod node is required to be used for operating the user service, only the Master node needs to copy the image file of the user service, and then the image file is operated on the new Pod node, and operations such as setting configuration information, installing an operating system and the like are not required.
And step S620, copying the image file of the user service through an RC controller in the Master node.
The RC controller can specify how many replicas of an application are needed, create a Pod node for each replica, and ensure that the number of Pod nodes actually running at any time is always equal to the specified number of replicas; that is, the RC controller is a replication abstraction of the Pod, which addresses the scaling out and scaling in of Pods.
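The replica-keeping behaviour described above can be illustrated with a small reconciliation function. This is a simplified simulation and not the API of any real container management system; the function name, the list of Pod names, and the rule of deleting the last entries first are assumptions.

```python
from typing import List, Tuple


def reconcile(desired_replicas: int,
              running_pods: List[str]) -> Tuple[int, List[str]]:
    """Return (pods_to_create, pods_to_delete) for one reconciliation pass.

    If fewer Pods run than desired, the shortfall is reported as a count
    to create. If more run than desired, the surplus Pods at the end of
    the list are selected for deletion (an arbitrary rule for this sketch).
    """
    if desired_replicas < 0:
        raise ValueError("desired_replicas must be non-negative")
    surplus = len(running_pods) - desired_replicas
    if surplus > 0:
        return 0, running_pods[-surplus:]
    return -surplus, []
```

Running such a pass repeatedly is what keeps the number of running Pod nodes equal to the specified number of replicas.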
In step S622, a new Pod Node is created on the designated Node.
This step may be accomplished by an RC controller in the Master node and a Service in the container management cluster, where the Service is a routing agent abstraction of the Pod node. The process of creating the newly added Pod nodes does not need to manually process the Pod nodes, and the RC controller and the Service can automatically manage the newly added Pod nodes only by presetting the expected quantity adjusting strategy.
Step S624, the copied image file is run on the newly added Pod node, and the recovered service running data is transferred to the newly added Pod node, so that the newly added Pod node continues to run the user service.
After a newly added Pod Node is created on a designated Node, the image file can be operated in the newly added Pod Node, after the operation, the newly added Pod Node can operate normally, and then the recovered service operation data is transferred to a container in the newly added Pod Node, and the container can continue to operate the user service.
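The branch in step S610 and the two replacement paths can be summarised in a short sketch that lists the steps for each node type. The enum and function names are hypothetical, and the step texts paraphrase steps S612 to S624.

```python
from enum import Enum
from typing import List


class NodeType(Enum):
    NODE = "Node"
    POD = "Pod"


def replacement_steps(node_type: NodeType) -> List[str]:
    """Return the ordered steps used to replace an abnormal node of the given type."""
    if node_type is NodeType.NODE:
        # A Node is a physical server or virtual machine, so it needs
        # configuration and image deployment before it can run the service.
        return [
            "S612: determine the new Node via the cloud resource providing platform",
            "S614: set the configuration information of the new Node",
            "S616: acquire the image file of the user service",
            "S618: deploy the image file and transfer the recovered data",
        ]
    # A Pod node only needs the image file copied and run.
    return [
        "S620: copy the image file through the RC controller",
        "S622: create the new Pod node on the designated Node",
        "S624: run the image file and transfer the recovered data",
    ]
```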
In the above manner, the Master node can monitor the running state of each container running node in the elastic telescopic group, and if the container running node is abnormal, the container running node is timely recovered and a newly added container running node is used, so that the stable running of the user service can be ensured, and the experience of the user is improved.
Corresponding to the foregoing method embodiment, an embodiment of the present invention further provides a management apparatus for container resources, as shown in fig. 7, where the management apparatus for container resources is disposed at a Master node of a container management cluster, and the management apparatus includes:
a time determining module 70, configured to determine whether a container resource adjustment time corresponding to the elastic expansion group is reached;
a corresponding relation obtaining module 71, configured to obtain the corresponding relation between the expansion number and the time of the elastic expansion group when it is determined that the container resource adjustment time is reached;
an expansion number determining module 72, configured to determine, according to the corresponding relation between the expansion number and the time, the expansion number of the elastic expansion group in the container resource adjustment time; the expansion number of the elastic expansion group indicates the number of containers;
a container operation node determining module 73, configured to determine a container operation node to be adjusted according to the determined expansion number and the number of containers in each container operation node;
a first adjusting module 74, configured to adjust the determined container operation node so that the number of containers of the elastic expansion group matches the determined expansion number.
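By way of a hedged example, the cooperation of modules 70 to 74 may be sketched as a single function that checks whether the current time is an adjustment time, looks up the expansion number, and reports how many containers are to be added or removed. The function name `planned_delta`, the use of `datetime.time` keys and the sample schedule are assumptions made for illustration only.

```python
from datetime import time
from typing import Dict, List, Optional


def planned_delta(now: time,
                  schedule: Dict[time, int],
                  containers_per_node: List[int]) -> Optional[int]:
    """Return how many containers to add (>0) or remove (<0), or None.

    None means `now` is not a container resource adjustment time.
    """
    if now not in schedule:              # time determining module 70
        return None
    target = schedule[now]               # modules 71 and 72: look up expansion number
    current = sum(containers_per_node)   # containers currently in the group
    return target - current              # input to modules 73 and 74


# Hypothetical correspondence between time and expansion number.
schedule = {time(8, 0): 6, time(20, 0): 2}
print(planned_delta(time(8, 0), schedule, [2, 2]))      # 2: scale out
print(planned_delta(time(20, 0), schedule, [2, 2, 2]))  # -4: scale in
print(planned_delta(time(12, 0), schedule, [2, 2]))     # None: no adjustment due
```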
Furthermore, an expansion number threshold is preset for the elastic expansion group; the expansion number threshold includes a maximum expansion number and a minimum expansion number. The correspondence between the expansion number and time is set as follows: acquiring historical operation data of the user service within a preset time period; determining the correspondence between the traffic volume of the user service and time according to the historical operation data; and determining the correspondence between the expansion number and time according to the correspondence between the traffic volume and time and the expansion number threshold of the elastic expansion group.
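One possible way to derive such a correspondence is sketched below. The embodiment does not specify how traffic volume is converted into an expansion number, so the averaging of historical samples per hour and the `per_container_capacity` parameter are assumptions introduced here; only the clamping to the minimum and maximum expansion numbers follows directly from the text.

```python
import math
from typing import Dict, List


def build_schedule(history: Dict[int, List[float]],
                   per_container_capacity: float,
                   min_count: int,
                   max_count: int) -> Dict[int, int]:
    """Map each hour of day to an expansion number clamped to the thresholds."""
    schedule = {}
    for hour, samples in history.items():
        mean_traffic = sum(samples) / len(samples)  # traffic volume vs. time
        needed = math.ceil(mean_traffic / per_container_capacity)
        schedule[hour] = max(min_count, min(max_count, needed))
    return schedule


# Hypothetical historical traffic samples, keyed by hour of day.
history = {3: [40.0, 60.0], 9: [900.0, 1100.0], 20: [5000.0, 7000.0]}
result = build_schedule(history, per_container_capacity=100.0,
                        min_count=2, max_count=20)
print(result)  # {3: 2, 9: 10, 20: 20}
```

The quiet hour is raised to the minimum expansion number and the peak hour is capped at the maximum expansion number.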
Furthermore, an expected expansion number is also preset for the elastic expansion group. The apparatus further includes: a first monitoring module, configured to monitor the number of containers of the elastic expansion group at times other than the container resource adjustment time and, if the number of containers exceeds the expansion number threshold of the elastic expansion group, determine the container operation nodes to be adjusted according to the expected expansion number and the number of containers of each container operation node; and a second adjusting module, configured to adjust the determined container operation nodes so that the number of containers of the elastic expansion group matches the expected expansion number.
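The check performed by the first monitoring module may be illustrated with the following minimal sketch. It assumes that "exceeds the expansion number threshold" means the container count lies outside the range from the minimum to the maximum expansion number; the function name `correction_target` is hypothetical.

```python
from typing import Optional


def correction_target(current: int, min_count: int, max_count: int,
                      expected: int) -> Optional[int]:
    """Return the expected expansion number if `current` is outside the thresholds."""
    if min_count <= current <= max_count:
        return None  # within thresholds: leave the group as it is
    return expected  # outside thresholds: steer the group back to the expected number


print(correction_target(25, 2, 20, 8))  # 8
print(correction_target(1, 2, 20, 8))   # 8
print(correction_target(10, 2, 20, 8))  # None
```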
Further, the container operation node determining module is further configured to: if the current number of containers of the elastic expansion group is smaller than the determined expansion number, calculate the difference between the expansion number and the current number of containers of the elastic expansion group; search the container operation nodes to be started for container operation nodes whose number of operable containers matches the difference; and determine the found container operation nodes as the container operation nodes to be adjusted. The first adjusting module is further configured to add the determined container operation nodes to the elastic expansion group.
Further, the container operation node determining module is further configured to: if the current number of containers of the elastic expansion group is larger than the determined expansion number, calculate the difference between the current number of containers of the elastic expansion group and the expansion number; search the container operation nodes corresponding to the elastic expansion group for container operation nodes whose number of running containers matches the difference; and determine the found container operation nodes as the container operation nodes to be adjusted. The first adjusting module is further configured to remove the determined container operation nodes from the elastic expansion group.
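The node selection in the two cases above may be sketched as follows. The embodiment only states that nodes whose container count "matches" the difference are searched for; the concrete strategy used here (a single exact match first, otherwise a largest-first greedy fill that never overshoots) is an assumption, and the names `pick_nodes` and `plan` are hypothetical.

```python
from typing import Dict, List, Tuple


def pick_nodes(candidates: Dict[str, int], difference: int) -> List[str]:
    """Pick nodes whose container counts add up to at most `difference`."""
    for name, count in candidates.items():
        if count == difference:  # a single node matches exactly
            return [name]
    chosen, remaining = [], difference
    for name, count in sorted(candidates.items(), key=lambda kv: -kv[1]):
        if 0 < count <= remaining:  # never overshoot the difference
            chosen.append(name)
            remaining -= count
    return chosen


def plan(current: int, target: int,
         in_group: Dict[str, int],
         standby: Dict[str, int]) -> Tuple[str, List[str]]:
    """Decide whether to add standby nodes or remove nodes from the group."""
    if current < target:
        return "add", pick_nodes(standby, target - current)
    if current > target:
        return "remove", pick_nodes(in_group, current - target)
    return "none", []


in_group = {"a": 3, "b": 2, "c": 1}       # containers running on each group node
standby = {"n1": 4, "n2": 2, "n3": 1}     # containers each standby node can run
print(plan(6, 9, in_group, standby))      # ('add', ['n2', 'n3'])
print(plan(6, 4, in_group, standby))      # ('remove', ['b'])
```

Because the greedy fill never overshoots, it may cover less than the full difference when no combination fits; a production scheduler would need a policy for that case.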
Further, the apparatus further includes: a second monitoring module, configured to monitor the operation state of each container operation node in the elastic expansion group; a termination module, configured to, if the operation state of a container operation node is abnormal, terminate the running of the user service on the abnormally operating container operation node; and a starting module, configured to start a newly added container operation node to run the user service.
Further, the termination module is further configured to: recover the service operation data of the abnormally operating container operation node; and set the operation state of the abnormally operating container operation node to stopped.
Further, the container operation node includes a Node, and the starting module is further configured to: determine, through a cloud resource providing platform, a newly added Node to be started; set configuration information of the newly added Node, the configuration information including at least the Node type, the type and capacity of the data storage disk, a configuration name, a password, and a load balancer that can be bound to the elastic expansion group; acquire an image file of the user service; and deploy the image file to the newly added Node and transfer the recovered service operation data to the newly added Node, so that the newly added Node continues to run the user service.
Further, the container operation node includes a Pod node, and the starting module is further configured to: copy an image file of the user service through an RC (Replication Controller) in the Master node; create a newly added Pod node on a designated Node; and run the copied image file on the newly added Pod node and transfer the recovered service operation data to the newly added Pod node, so that the newly added Pod node continues to run the user service.
An embodiment of the present invention provides an apparatus for managing container resources. When the container resource adjustment time corresponding to an elastic expansion group is reached, the apparatus determines the expansion number of the elastic expansion group at the container resource adjustment time according to the correspondence between the expansion number and time, determines the container operation nodes to be adjusted according to the expansion number and the number of containers of each container operation node, and then adjusts those container operation nodes. In this manner, the container operation nodes in the elastic expansion group are scaled automatically based on time, so that the number of containers in the elastic expansion group meets the service demand at the current time, while container operation node resources are released during idle periods to save cost. This improves the match between container resources and service demand and raises the resource utilization of the cloud platform.
Corresponding to the above embodiments, an embodiment of the present invention further provides a cloud platform on which a container management cluster runs; the container management cluster includes a Master node and elastic expansion groups; the Master node is connected with a plurality of elastic expansion groups, and user services of users run in the elastic expansion groups; the apparatus of any one of claims 10-18 is disposed at the Master node.
The cloud platform provided by the embodiment of the invention has the same technical characteristics as the management method of the container resource provided by the embodiment, so that the same technical problems can be solved, and the same technical effects can be achieved.
An embodiment of the present invention further provides a cloud server configured to run the above method for managing container resources. As shown in FIG. 8, the cloud server includes a memory and a processor, where the memory is configured to store one or more computer instructions, and the one or more computer instructions are executed by the processor to implement the above method for managing container resources.
Further, the cloud server shown in fig. 8 further includes a bus 102 and a communication interface 103, and the processor 101, the communication interface 103, and the memory 100 are connected through the bus 102.
The memory 100 may include a high-speed random access memory (RAM) and may further include a non-volatile memory, such as at least one disk memory. The communication connection between a network element of the system and at least one other network element is realized through at least one communication interface 103 (which may be wired or wireless), and the Internet, a wide area network, a local area network, a metropolitan area network, or the like may be used. The bus 102 may be an ISA bus, a PCI bus, an EISA bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one double-headed arrow is shown in FIG. 8, but this does not mean that there is only one bus or one type of bus.
The processor 101 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in the processor 101 or by instructions in the form of software. The processor 101 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The processor can implement or perform the methods, steps and logic blocks disclosed in the embodiments of the present invention. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present invention may be directly performed by a hardware decoding processor, or performed by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or a register. The storage medium is located in the memory 100, and the processor 101 reads the information in the memory 100 and completes the steps of the method of the foregoing embodiment in combination with its hardware.
Finally, it should be noted that the above embodiments are only specific embodiments of the present invention, used to illustrate its technical solutions rather than to limit them, and the protection scope of the present invention is not limited thereto. Although the present invention is described in detail with reference to the foregoing embodiments, those skilled in the art should understand that, within the technical scope disclosed herein, any person skilled in the art can still modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent substitutions for some of their technical features. Such modifications, changes or substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the embodiments of the present invention, and they shall be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (17)

1. A method for managing container resources, characterized in that the method is applied to a Master node of a container management cluster and comprises the following steps:
determining whether the container resource adjustment time corresponding to the elastic expansion group is reached;
when it is determined that the container resource adjustment time is reached, acquiring the correspondence between the expansion number of the elastic expansion group and time;
determining the expansion number of the elastic expansion group at the container resource adjustment time according to the correspondence between the expansion number and time; the expansion number indicates the number of containers of the elastic expansion group;
determining container operation nodes to be adjusted according to the determined expansion number and the container number of each container operation node;
adjusting the determined container operation nodes to enable the number of the containers of the elastic expansion group to be matched with the determined expansion number;
the elastic expansion group is preset with an expansion number threshold; the expansion number threshold comprises a maximum expansion number and a minimum expansion number; the elastic expansion group is also preset with expected expansion number; the expected expansion number is an expansion number estimated by a worker for a user service; the method further comprises the following steps:
monitoring the number of containers of the elastic expansion group at a time other than the container resource adjustment time, and if the number of containers exceeds the expansion number threshold of the elastic expansion group, determining the container operation nodes to be adjusted according to the expected expansion number and the number of containers of each container operation node;
adjusting the determined container operation node to match the number of containers of the elastic expansion group with the expected expansion number.
2. The method according to claim 1, wherein the correspondence between the expansion number and time is set by:
acquiring historical operating data of user services within a preset time period;
determining the corresponding relation between the service volume of the user service and the time according to the historical operating data;
and determining the corresponding relation between the expansion number and the time according to the corresponding relation between the traffic and the time and the expansion number threshold of the elastic expansion group.
3. The method of claim 1, wherein the step of determining the container operation nodes to be adjusted according to the determined expansion number and the number of containers of each container operation node comprises:
if the current number of containers of the elastic expansion group is smaller than the determined expansion number, calculating the difference between the expansion number and the current number of containers of the elastic expansion group; searching the container operation nodes to be started for container operation nodes whose number of operable containers matches the difference; determining the found container operation nodes as the container operation nodes to be adjusted;
the step of adjusting the determined container operation node comprises:
adding the determined container operation node to the elastic expansion group.
4. The method of claim 1, wherein the step of determining the container operation nodes to be adjusted according to the determined expansion number and the number of containers of each container operation node comprises:
if the current number of containers of the elastic expansion group is larger than the determined expansion number, calculating the difference between the current number of containers of the elastic expansion group and the expansion number; searching the container operation nodes corresponding to the elastic expansion group for container operation nodes whose number of running containers matches the difference; determining the found container operation nodes as the container operation nodes to be adjusted;
the step of adjusting the determined container operation node comprises: removing the determined container operation nodes from the elastic expansion group.
5. The method of claim 1, further comprising:
monitoring the operation state of each container operation node in the elastic expansion group;
if the operation state of a container operation node is abnormal, terminating the running of the user service on the abnormally operating container operation node;
and starting a newly added container operation node to run the user service.
6. The method of claim 5, wherein the step of terminating the running of the user service on the abnormally operating container operation node comprises:
recovering the service operation data of the abnormally operating container operation node;
and setting the operation state of the abnormally operating container operation node to stopped.
7. The method of claim 6, wherein the container operation node comprises a Node;
the step of starting a newly added container operation node to run the user service comprises the following steps:
determining, through a cloud resource providing platform, a newly added Node to be started;
setting configuration information of the newly added Node; the configuration information at least comprises the Node type, the type and capacity of the data storage disk, a configuration name, a password, and a load balancer that can be bound to the elastic expansion group;
acquiring an image file of the user service;
and deploying the image file to the newly added Node, and transferring the recovered service operation data to the newly added Node, so that the newly added Node continues to run the user service.
8. The method of claim 6, wherein the container operation node comprises a Pod node;
the step of starting a newly added container operation node to run the user service comprises the following steps:
copying an image file of the user service through an RC (Replication Controller) in the Master node;
creating a newly added Pod node on a designated Node;
and running the copied image file on the newly added Pod node, and transferring the recovered service operation data to the newly added Pod node, so that the newly added Pod node continues to run the user service.
9. An apparatus for managing container resources, characterized in that the apparatus is disposed at a Master node of a container management cluster and comprises:
a time determining module, configured to determine whether the container resource adjustment time corresponding to the elastic expansion group is reached;
a correspondence obtaining module, configured to obtain the correspondence between the expansion number of the elastic expansion group and time when it is determined that the container resource adjustment time is reached;
an expansion number determining module, configured to determine the expansion number of the elastic expansion group at the container resource adjustment time according to the correspondence between the expansion number and time; the expansion number indicates the number of containers of the elastic expansion group;
the container operation node determining module is used for determining the container operation nodes to be adjusted according to the determined expansion number and the container number of each container operation node;
the first adjusting module is used for adjusting the determined container operation nodes so as to enable the number of the containers of the elastic expansion group to be matched with the determined expansion number;
the elastic expansion group is preset with an expansion number threshold; the expansion number threshold comprises a maximum expansion number and a minimum expansion number; the elastic expansion group is also preset with expected expansion number; the expected expansion number is an expansion number estimated by a worker for a user service; the device further comprises:
the first monitoring module is used for monitoring the number of containers of the elastic expansion group at a time except the container resource adjusting time, and determining the container operation nodes to be adjusted according to the expected expansion number and the number of the containers of each container operation node if the number of the containers exceeds the expansion number threshold of the elastic expansion group;
and the second adjusting module is used for adjusting the determined container operation nodes so as to enable the number of the containers of the elastic expansion group to be matched with the expected expansion number.
10. The apparatus according to claim 9, wherein the correspondence between the expansion number and time is set by:
acquiring historical operating data of user services within a preset time period;
determining the corresponding relation between the service volume of the user service and the time according to the historical operating data;
and determining the corresponding relation between the expansion number and the time according to the corresponding relation between the traffic and the time and the expansion number threshold of the elastic expansion group.
11. The apparatus of claim 9, wherein the container operation node determining module is further configured to:
if the current number of containers of the elastic expansion group is smaller than the determined expansion number, calculating the difference between the expansion number and the current number of containers of the elastic expansion group; searching the container operation nodes to be started for container operation nodes whose number of operable containers matches the difference; determining the found container operation nodes as the container operation nodes to be adjusted;
the first adjusting module is further configured to: adding the determined container operation node to the elastic expansion group.
12. The apparatus of claim 9, wherein the container operation node determining module is further configured to:
if the current number of containers of the elastic expansion group is larger than the determined expansion number, calculating the difference between the current number of containers of the elastic expansion group and the expansion number; searching the container operation nodes corresponding to the elastic expansion group for container operation nodes whose number of running containers matches the difference; determining the found container operation nodes as the container operation nodes to be adjusted;
the first adjusting module is further configured to: remove the determined container operation nodes from the elastic expansion group.
13. The apparatus of claim 9, further comprising:
a second monitoring module, configured to monitor the operation state of each container operation node in the elastic expansion group;
a termination module, configured to, if the operation state of a container operation node is abnormal, terminate the running of the user service on the abnormally operating container operation node;
and a starting module, configured to start a newly added container operation node to run the user service.
14. The apparatus of claim 13, wherein the termination module is further configured to:
recover the service operation data of the abnormally operating container operation node;
and set the operation state of the abnormally operating container operation node to stopped.
15. The apparatus of claim 14, wherein the container operation node comprises a Node, and wherein the starting module is further configured to:
determine, through a cloud resource providing platform, a newly added Node to be started;
set configuration information of the newly added Node; the configuration information at least comprises the Node type, the type and capacity of the data storage disk, a configuration name, a password, and a load balancer that can be bound to the elastic expansion group;
acquire an image file of the user service;
and deploy the image file to the newly added Node, and transfer the recovered service operation data to the newly added Node, so that the newly added Node continues to run the user service.
16. The apparatus of claim 14, wherein the container operation node comprises a Pod node, and wherein the starting module is further configured to:
copy an image file of the user service through an RC (Replication Controller) in the Master node;
create a newly added Pod node on a designated Node;
and run the copied image file on the newly added Pod node, and transfer the recovered service operation data to the newly added Pod node, so that the newly added Pod node continues to run the user service.
17. A cloud platform, characterized in that a container management cluster runs on the cloud platform; the container management cluster comprises a Master node and elastic expansion groups; the Master node is connected with a plurality of elastic expansion groups, and user services of users run in the elastic expansion groups; the apparatus of any one of claims 9-16 is disposed on the Master node.
CN201811165919.0A 2018-09-30 2018-09-30 Container resource management method and device and cloud platform Active CN108965485B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811165919.0A CN108965485B (en) 2018-09-30 2018-09-30 Container resource management method and device and cloud platform

Publications (2)

Publication Number Publication Date
CN108965485A CN108965485A (en) 2018-12-07
CN108965485B true CN108965485B (en) 2021-10-15

Family

ID=64471415


Country Status (1)

Country Link
CN (1) CN108965485B (en)






Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant