CN115357336A - Online capacity expansion method and device of container group, terminal equipment and medium - Google Patents

Info

Publication number
CN115357336A
Authority
CN
China
Prior art keywords
container group
capacity expansion
job
subtask
service layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210941324.XA
Other languages
Chinese (zh)
Inventor
方超
田永江
朱鹏
杨维强
黄睿
张洪魁
张多子
莫淡先
孙腾腾
吴斯亮
刘海波
胡晓容
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Merchants Bank Co Ltd
Original Assignee
China Merchants Bank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Merchants Bank Co Ltd
Priority to CN202210941324.XA
Publication of CN115357336A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an online capacity expansion method and device for a container group, a terminal device, and a medium. The method comprises the following steps: when a capacity expansion instruction is received, performing a validity check on the instruction through a business service layer; if the check passes, acquiring a first object Cgroups generated based on the instruction; creating a subtask job corresponding to the first object Cgroups through a Job service layer; and modifying a first resource limit value in the first object through the subtask job to expand the container group. The capacity expansion efficiency of the container group and the running efficiency of the applications in it are thereby improved.

Description

Online capacity expansion method and device of container group, terminal equipment and medium
Technical Field
The present invention relates to the field of data processing, and in particular, to an online capacity expansion method and apparatus for a container group, a terminal device, and a medium.
Background
Kubernetes is a portable, extensible open-source platform. A container is a small, lightweight runtime environment packaged as a container image, and containers run inside a container group (Pod). Compared with a traditional virtual machine, Kubernetes is lightweight, and it is widely used for virtualized applications and services. Kubernetes includes the following modules. The etcd cluster module, a distributed storage system, is the data storage unit of the Kubernetes cluster and stores all data in the cluster that must be persisted. The kube-apiserver module is the only entry point through which the cluster accesses etcd; it provides HTTP REST interfaces for adding, deleting, querying, and watching the various resource objects in the Kubernetes cluster, and acts as the data bus of the whole system. The kube-controller module is the controller of the resource objects in the cluster and is responsible for managing and controlling the whole cluster: when a Node, or a container group running on a Node, fails, kube-controller discovers and handles the failure promptly so that the whole cluster stays in its desired working state. The container group (Pod) is the most basic operating unit of Kubernetes and represents a program running in the cluster; it encapsulates one or more closely related containers, and several containers can run in a single container group at the same time. As user demand grows, the resource demand of a container group grows as well; for example, more memory must be allocated.
Generally, a container group is expanded after a capacity expansion request is received: the resource limit value of the container group is modified in Kubernetes through the kubectl command or a client tool, and the modified container group configuration is then recorded in the distributed storage system. kube-controller in Kubernetes perceives the configuration change through the watch mechanism of kube-apiserver and makes the modified resource limits take effect by deleting the old container group and then creating a new one.
In this delete-and-recreate expansion mode, the whole container group must be restarted before the expanded resources take effect. If a stateful application is running in the container group, its connections may be dropped, causing a service interruption; if an online application is running in the container group, access may fail and be retried, causing an abnormal traffic surge; if an offline task is running in the container group, the data computed over the preceding hours may be lost entirely.
The above is only for the purpose of assisting understanding of the technical aspects of the present invention, and does not represent an admission that the above is prior art.
Disclosure of Invention
The invention mainly aims to provide a method, a device, a terminal device and a medium for online capacity expansion of a container group, and aims to solve the technical problem that an application running in the container group is abnormally interrupted due to a traditional capacity expansion mode of the container group, and improve the capacity expansion efficiency of the container group and the running efficiency of the application running in the container group.
In order to achieve the above object, the present invention provides an online capacity expansion method for a container group, where the online capacity expansion method for a container group includes the following steps:
when a capacity expansion instruction is received, performing a validity check on the capacity expansion instruction through a business service layer;
if the check passes, acquiring a first object Cgroups generated based on the capacity expansion instruction;
creating a subtask job corresponding to the first object Cgroups through a Job service layer;
and modifying a first resource limit value in the first object through the subtask job to expand the container group.
Optionally, the step of acquiring the first object Cgroups generated based on the capacity expansion instruction if the check passes includes:
if the check passes, acquiring a second object StatefulSet corresponding to the container group through the operation service layer;
and acquiring object information of the corresponding container group based on the second object, and creating the first object corresponding to the container group based on that object information.
Optionally, after the step of modifying the first resource limit value in the first object through the subtask job to expand the container group, the method further includes:
modifying, through the business service layer, a second resource limit value in the container group resource template corresponding to the second object, so that after its next abnormal restart the container group is generated from the resource template carrying the second resource limit value.
Optionally, after the step of modifying the first resource limit value in the first object through the subtask job to expand the container group, the method further includes:
acquiring a current version number field of the second object;
modifying the current version number field through the business service layer so as to update it to a latest version number field;
and deleting every version number field other than the latest version number field, so that after its next abnormal restart the container group is generated based on the second object corresponding to the latest version number field.
Optionally, the step of modifying the first resource limit value in the first object through the subtask job to expand the container group includes:
generating, through the Job service layer, the subtask job corresponding to the first object based on the first object;
and finding, through the subtask job and based on the container mounting function in the container group, the first object corresponding to the container group, and modifying the first resource limit value in the first object through the subtask job to expand the container group.
Optionally, the step of generating, through the Job service layer, the corresponding subtask job based on the first object further includes:
monitoring the creation of objects in a first object group through the operation service layer;
and when it is detected that the first object has been created in the first object group, acquiring node information of the container group corresponding to the first object, and generating the subtask job corresponding to the first object based on the node information.
Optionally, after the step of modifying, through the subtask job, the first resource limit value of the first object corresponding to the container group to expand the container group, the method further includes:
after the capacity expansion of the container group is completed, destroying the subtask job corresponding to the container group.
In addition, to achieve the above object, the present invention further provides an online capacity expansion device for a container group, where the online capacity expansion device for a container group includes:
a verification module, configured to perform a validity check on the capacity expansion instruction through the business service layer when the capacity expansion instruction is received;
a first creating module, configured to acquire a first object Cgroups generated based on the capacity expansion instruction if the check passes;
a second creating module, configured to create the subtask job corresponding to the first object Cgroups through the Job service layer;
and a modification module, configured to modify the resource limit value in the first object through the operation service layer so as to expand the container group.
In addition, in order to achieve the above object, the present invention further provides a terminal device, where the terminal device includes a memory, a processor, and an online capacity expansion program of a container group stored in the memory and operable on the processor, and the online capacity expansion program of the container group, when executed by the processor, implements the steps of the online capacity expansion method of the container group as described above.
In addition, to achieve the above object, the present invention further provides a computer-readable storage medium storing an online capacity expansion program of a container group; when the online capacity expansion program of the container group is executed by a processor, the steps of the online capacity expansion method of the container group described above are implemented.
The invention provides an online capacity expansion method and device for a container group, a terminal device, and a medium. When a capacity expansion instruction is received, a validity check is performed on the instruction through the business service layer; if the check passes, a first object Cgroups generated based on the instruction is acquired, a subtask job corresponding to the first object Cgroups is created through the Job service layer, and the first resource limit value in the first object is modified through the subtask job to expand the container group. Because the job subtask modifies the cgroup file of the Cgroups object at the container group level, the container group managed by the Cgroups object is expanded.
Drawings
Fig. 1 is a schematic diagram of functional modules of a terminal device to which an online capacity expansion device of a container group belongs;
Fig. 2 is a schematic flow chart of traditional Kubernetes-based capacity expansion of container group (Pod) resources;
Fig. 3 is a flowchart illustrating an exemplary embodiment of an online capacity expansion method for a container group according to the present application;
Fig. 4 is a flowchart illustrating another exemplary embodiment of an online capacity expansion method for a container group according to the present application;
Fig. 5 is a flowchart illustrating another exemplary embodiment of an online capacity expansion method for a container group according to the present application;
Fig. 6 is a flowchart illustrating another exemplary embodiment of an online capacity expansion method for a container group according to the present application;
Fig. 7 is a flowchart illustrating another exemplary embodiment of an online capacity expansion method for a container group according to the present application;
Fig. 8 is a schematic flow diagram of the online capacity expansion method of the container group according to the present application as applied to a MySQL database.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Background knowledge for the online capacity expansion method of the container group is as follows.
Referring to Fig. 2, Fig. 2 is a schematic flow chart of traditional Kubernetes-based capacity expansion of container group resources.
Step one: a capacity expansion request command is sent to kube-apiserver through kubectl, the management tool in Kubernetes.
Step two: kube-apiserver is the Kubernetes component that interacts directly with the distributed storage system. It authenticates the capacity expansion request against the kubeconfig configuration and, after authentication passes, stores the container group information recorded in the YAML file of the request into the distributed storage system.
Step three: a control program in Kubernetes discovers the update of the Pod information through the watch interface of kube-apiserver, integrates the topology on which the resource depends, and then sends the corresponding information, such as the location of the Node bound to the container group, to kube-apiserver, which writes it into the distributed storage system.
Step four: kube-scheduler is responsible for scheduling resources and schedules the container group onto a Node according to a preset scheduling policy. It learns through the watch interface of kube-apiserver that the container group is schedulable, assigns a Node to the container group by its scheduling algorithm, and passes the binding between the container group and the Node to kube-apiserver, which writes it into the distributed storage system.
Step five: kubelet is responsible for the container groups assigned to the Node on which it runs, including creating, modifying, monitoring, and deleting them. It obtains from kube-apiserver the information of the container group to be created, calls the relevant interfaces, such as CNI (Container Network Interface), CSI (Container Storage Interface), and CRI (Container Runtime Interface), to create the related resources, and finally completes the creation of the container group. Once the service process has started, the application in the container group is running successfully.
The main solution of the embodiments of the application is as follows: when a capacity expansion instruction is received, a validity check is performed on the instruction through the business service layer; if the check passes, a first object Cgroups generated based on the instruction is acquired; a subtask job corresponding to the first object Cgroups is created through the Job service layer; and a first resource limit value in the file corresponding to the first object is modified through the subtask job to expand the container group. Starting from the observation that the restart required by the traditional Pod expansion mode can abnormally interrupt the applications in the container group, the embodiments modify the first resource limit value at the Pod level through the job subtask and thereby expand the container group (Pod) managed by the Cgroups object. This improves both the capacity expansion efficiency of the container group and the running efficiency of the applications in it.
Specifically, referring to Fig. 1, Fig. 1 is a schematic diagram of functional modules of a terminal device to which an online capacity expansion device of a container group belongs. The online capacity expansion device of the container group may be a device, independent of the terminal device, that can perform validity checks, acquire objects, create subtasks, and modify resource limit values; it may be carried on the terminal device in the form of hardware or software. The terminal device may be an intelligent mobile terminal with a data processing function, such as a mobile phone or a tablet computer, or a fixed terminal device or server with a data processing function.
In this embodiment, the terminal device to which the online capacity expansion apparatus of the container group belongs at least includes an output module 110, a processor 120, a memory 130, and a communication module 140.
The memory 130 stores an operating system and an online capacity expansion program of the container group, and the online capacity expansion device of the container group can store information such as the capacity expansion instruction, the Cgroups object, the subtask job, and the resource limit value in the memory 130. The output module 110 may be a display screen or the like. The communication module 140 may include a Wi-Fi module, a mobile communication module, a Bluetooth module, and the like, and the device communicates with an external device or a server through the communication module 140.
When executed by the processor, the online capacity expansion program for the container group in the memory 130 implements the following steps:
when a capacity expansion instruction is received, performing a validity check on the capacity expansion instruction through the business service layer;
if the check passes, acquiring a first object Cgroups generated based on the capacity expansion instruction;
creating a subtask job corresponding to the first object Cgroups through the Job service layer;
and modifying the first resource limit value in the first object through the subtask job to expand the container group.
Further, when executed by the processor, the online capacity expansion program of the container group in the memory 130 also implements the following steps:
if the check passes, acquiring a second object StatefulSet corresponding to the container group through the operation service layer;
and acquiring object information of the corresponding container group based on the second object, and creating the first object corresponding to the container group based on that object information.
Further, when executed by the processor, the online capacity expansion program of the container group in the memory 130 also implements the following steps:
modifying, through the business service layer, a second resource limit value in the container group resource template corresponding to the second object, so that after its next abnormal restart the container group is generated from the resource template carrying the second resource limit value.
Further, when executed by the processor, the online capacity expansion program of the container group in the memory 130 also implements the following steps:
acquiring a current version number field of the second object;
modifying the current version number field through the business service layer so as to update it to a latest version number field;
and deleting every version number field other than the latest version number field, so that after its next abnormal restart the container group is generated based on the second object corresponding to the latest version number field.
Further, when executed by the processor, the online capacity expansion program of the container group in the memory 130 also implements the following steps:
generating, through the Job service layer, the subtask job corresponding to the first object based on the first object;
and finding, through the subtask job and based on the container mounting function in the container group, the first object corresponding to the container group, and modifying the first resource limit value in the first object through the subtask job to expand the container group.
Further, when executed by the processor, the online capacity expansion program of the container group in the memory 130 also implements the following steps:
monitoring the creation of objects in a first object group through the operation service layer;
and when it is detected that the first object has been created in the first object group, acquiring node information of the container group corresponding to the first object, and generating the subtask job corresponding to the first object based on the node information.
Further, when executed by the processor, the online capacity expansion program of the container group in the memory 130 also implements the following steps:
after the capacity expansion of the container group is completed, destroying the subtask job corresponding to the container group.
The invention provides an online capacity expansion method and device for a container group, a terminal device, and a medium. When a capacity expansion instruction is received, a validity check is performed on the instruction through the business service layer; if the check passes, a first object Cgroups generated based on the instruction is acquired, a subtask job corresponding to the first object Cgroups is created through the Job service layer, and a first resource limit value in the file corresponding to the first object is modified through the subtask job to expand the container group. Because the first resource limit value and the second resource limit value are modified together, a restart of the container group (Pod) caused by inconsistent resource limit values is avoided and the modified resource limit value takes effect in real time, which greatly improves the running efficiency and reliability of the applications in the container group.
Based on the above terminal device architecture but not limited to the above architecture, embodiments of the method of the present application are provided.
Referring to fig. 3, fig. 3 is a flowchart illustrating an exemplary embodiment of an online capacity expansion method for a container group according to the present application. The online capacity expansion method of the container group comprises the following steps:
Step S1001, when a capacity expansion instruction is received, performing a validity check on the capacity expansion instruction through the business service layer;
The embodiment of the application sets up an operation and maintenance server, Operation-service-system, and exposes its service externally over HTTP (Hypertext Transfer Protocol) to respond to and give feedback on operation and maintenance requests, and finally to complete the capacity expansion of the container group corresponding to each request. Operation-service-system comprises the following layers: the business service layer Operation-service-server, the operation service layer Cgroup-operator, and the Job service layer Job.
Specifically, kube-apiserver is the only entry point for resource operations and provides mechanisms such as API registration and discovery, authentication, authorization, and access control. When kube-apiserver receives a capacity expansion instruction sent by a user, the Operation-service-server business service layer performs a validity check on the corresponding capacity expansion request; specifically, it checks whether the fields of the request are legal, whether the remaining resources are sufficient for the request, and whether the request has timed out.
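The three checks named above (field legality, remaining resources, timeout) can be sketched as a small Python function. This is only an illustration under stated assumptions: the `ExpansionRequest` fields, the `validate_request` name, the millicore/byte units, and the 30-second default timeout are hypothetical and are not given in the patent.

```python
import time
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ExpansionRequest:
    """Hypothetical shape of a capacity expansion request."""
    pod_name: str          # Pod to expand
    cpu_millicores: int    # CPU limit wanted after expansion
    memory_bytes: int      # memory limit wanted after expansion
    submitted_at: float    # UNIX timestamp at which the request was sent


def validate_request(req: ExpansionRequest,
                     current_cpu: int, current_memory: int,
                     free_cpu: int, free_memory: int,
                     now: Optional[float] = None,
                     timeout_s: float = 30.0) -> List[str]:
    """Return a list of problems; an empty list means the check passes."""
    now = time.time() if now is None else now
    errors = []
    # 1. Field legality: a target must be named and must not shrink the Pod.
    if not req.pod_name:
        errors.append("pod_name is empty")
    if req.cpu_millicores < current_cpu or req.memory_bytes < current_memory:
        errors.append("target limits are below the current limits")
    # 2. Remaining resources: the increase must fit into what is still free.
    if req.cpu_millicores - current_cpu > free_cpu:
        errors.append("not enough free CPU on the node")
    if req.memory_bytes - current_memory > free_memory:
        errors.append("not enough free memory on the node")
    # 3. Timeout: stale requests are rejected.
    if now - req.submitted_at > timeout_s:
        errors.append("request timed out")
    return errors
```

Returning every problem found, rather than stopping at the first, lets the HTTP service report all reasons for a rejection in one response.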
Step S1002, if the check passes, acquiring a first object Cgroups generated based on the capacity expansion instruction;
Specifically, after the capacity expansion request passes the check, a Cgroups object is created through the HTTP service to represent the capacity expansion request event; the object has the same name as the Pod to be expanded and contains the data related to the expansion. The Cgroups object may be obtained by first obtaining the StatefulSet object corresponding to the request information and then obtaining the Cgroups object through the StatefulSet object. Because each Pod in a StatefulSet is assigned an integer ordinal, the corresponding StatefulSet object can be found from that ordinal. The current object information of all Pods controlled by the StatefulSet object is then obtained from the StatefulSet object, and the object with the same name as the Pod in that information is the Cgroups object, namely the first object.
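As an illustration of the ordinal-based lookup, Kubernetes names the Pods of a StatefulSet `<statefulset-name>-<ordinal>`, so the owning StatefulSet name can be recovered from the Pod name. The sketch below relies only on that naming convention; the function name is hypothetical, and a real implementation might read the Pod's owner references instead.

```python
import re
from typing import Tuple

# StatefulSet Pods are named "<statefulset-name>-<ordinal>".
_POD_NAME = re.compile(r"(?P<set>.+)-(?P<ordinal>\d+)")


def split_statefulset_pod_name(pod_name: str) -> Tuple[str, int]:
    """Split a StatefulSet Pod name into (StatefulSet name, ordinal)."""
    match = _POD_NAME.fullmatch(pod_name)
    if match is None:
        raise ValueError(f"{pod_name!r} does not end in '-<ordinal>'")
    return match.group("set"), int(match.group("ordinal"))


print(split_statefulset_pod_name("mysql-primary-2"))  # ('mysql-primary', 2)
```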
Step S1003, creating a subtask job corresponding to the first object Cgroups through the Job service layer;
Specifically, Cgroup-operator monitors the creation of Cgroups objects in real time through the watch mechanism of kube-apiserver. When it detects that a Cgroups object has been created, it fetches the Pod object with the same name as the created Cgroups object and obtains the information of the Node/Worker Node on which the Pod runs. On that Node, it creates a job subtask for the Cgroups object according to the expansion-related content in the Cgroups object; Cgroups objects and job subtasks correspond one to one.
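One way to picture the job subtask is as a Kubernetes `batch/v1` Job pinned to the Pod's node. The sketch below only builds the manifest as a Python dictionary. The Job fields used (`nodeName`, `restartPolicy`, `backoffLimit`, `ttlSecondsAfterFinished`, `hostPath`) are standard Kubernetes fields, but the `spec.targetLimits` layout of the Cgroups object, the container image, its arguments, and the `resize-` name prefix are assumptions made for illustration.

```python
def build_resize_job(cgroups_obj: dict, node_name: str) -> dict:
    """Build a batch/v1 Job manifest for one Cgroups object (one-to-one)."""
    name = cgroups_obj["metadata"]["name"]              # same name as the Pod
    namespace = cgroups_obj["metadata"].get("namespace", "default")
    limits = cgroups_obj["spec"]["targetLimits"]        # hypothetical field
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": f"resize-{name}", "namespace": namespace},
        "spec": {
            "backoffLimit": 0,                 # do not retry a failed resize
            "ttlSecondsAfterFinished": 60,     # clean the Job up once it is done
            "template": {"spec": {
                "nodeName": node_name,         # run on the same Node as the Pod
                "restartPolicy": "Never",
                "containers": [{
                    "name": "resize",
                    # Placeholder image and flags, not from the patent.
                    "image": "example.invalid/cgroup-resizer:latest",
                    "args": ["--pod", name,
                             "--cpu", limits["cpu"],
                             "--memory", limits["memory"]],
                    "securityContext": {"privileged": True},
                    "volumeMounts": [{"name": "cgroup",
                                      "mountPath": "/host/sys/fs/cgroup"}],
                }],
                "volumes": [{"name": "cgroup",
                             "hostPath": {"path": "/sys/fs/cgroup"}}],
            }},
        },
    }
```

Setting `nodeName` bypasses the scheduler, which matches the requirement that the subtask run on the Node where the Pod already lives.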
Step S1004, modifying the first resource limit value in the first object through the subtask job to expand the container group.
Specifically, the job subtask performs the modification of the cgroup file corresponding to the Cgroups object. The Cgroups object is a custom resource object created through the HTTP service based on the capacity expansion request. It records the current resource information of the Pod it manages, such as the resource request value or resource limit value, and the expansion information, such as the resource request value or resource limit value after expansion. The job subtask modifies the current resource limit value of the Pod, which expands the Pod in real time.
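The patent does not name the cgroup files that the job subtask edits. As a minimal sketch, assuming a cgroup v1 layout, the limits live in `memory.limit_in_bytes` and `cpu.cfs_quota_us`, and writing new values to those files changes the limits of the running processes without a restart. The function name and its parameters are hypothetical.

```python
from pathlib import Path

CFS_PERIOD_US = 100_000  # default CFS scheduling period in cgroup v1 (100 ms)


def apply_limits(cpu_cgroup: Path, memory_cgroup: Path,
                 cpu_millicores: int, memory_bytes: int) -> None:
    """Write new CPU and memory limits into a Pod's cgroup v1 directories."""
    # 1000 millicores correspond to one full period of CPU time per period.
    quota_us = cpu_millicores * CFS_PERIOD_US // 1000
    (memory_cgroup / "memory.limit_in_bytes").write_text(str(memory_bytes))
    (cpu_cgroup / "cpu.cfs_quota_us").write_text(str(quota_us))
```

On a real node the two directories would be the Pod's entries under the cpu and memory hierarchies of `/sys/fs/cgroup`. The Pod-level cgroup would need to be raised as well as the container-level one, because a parent cgroup's limit still caps its children.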
Exemplarily, referring to Fig. 8, Fig. 8 is a schematic flowchart of the online capacity expansion method for a container group as applied to a MySQL database. When a MySQL database runs in the container group, the Cgroups object is a parameter object related to resources in the MySQL database, such as innodb_buffer_pool_size.
Firstly, after receiving a capacity expansion request, the operation and maintenance server performs the check of step S1001 on the request and, after the check passes, creates a corresponding Cgroups1 object for the container group 1 to which the request refers.
After the Cgroups1 object is created, the operation service layer detects, through the watch mechanism of kube-apiserver, that a custom resource object, namely the Cgroups1 object, has been newly created. It then verifies and evaluates the feasibility of the Cgroups1 object and, once that passes, creates a job1 subtask for the Cgroups1 object; the job1 subtask runs on the same Node/Worker Node as Pod1. job1 then modifies parameters of the MySQL database such as innodb_buffer_pool_size. For example, if the MySQL database currently has 0.25 cpu and 64 MiB of memory and is changed to 0.5 cpu and 128 MiB of memory, an expansion by 0.25 cpu and 64 MiB of memory has been completed, where cpu is a resource unit and MiB (mebibyte) is a unit of bytes. The expansion is performed at the Pod level, and the modified, expanded MySQL resource configuration takes effect in real time, improving the performance of the MySQL database in real time.
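The arithmetic of this example can be checked with a short sketch that converts Kubernetes-style quantities into millicores and bytes. The helper names are hypothetical, and only the `m`, `Ki`, `Mi`, and `Gi` suffixes are handled.

```python
from decimal import Decimal

_BINARY_SUFFIXES = {"Ki": 2 ** 10, "Mi": 2 ** 20, "Gi": 2 ** 30}


def cpu_to_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as '0.25' or '250m' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(Decimal(quantity) * 1000)


def memory_to_bytes(quantity: str) -> int:
    """Convert a memory quantity such as '64Mi' to bytes."""
    for suffix, factor in _BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(Decimal(quantity[: -len(suffix)]) * factor)
    return int(quantity)  # plain number of bytes


# The MySQL example: 0.25 cpu / 64 MiB expanded to 0.5 cpu / 128 MiB.
cpu_delta = cpu_to_millicores("0.5") - cpu_to_millicores("0.25")
mem_delta = memory_to_bytes("128Mi") - memory_to_bytes("64Mi")
print(cpu_delta, mem_delta)  # 250 67108864
```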
If a Pod of a stateful service is expanded in the traditional Pod expansion mode, then, because such a service allows only one node to write at any given moment and requires real-time persistent storage, the data interruption caused by the Pod restart that traditional expansion triggers affects the stateful service, and the Recovery Time Objective (RTO), the time from interruption through detection to recovery, is relatively long. The present application instead expands the container group online: Pod resource expansion completes without restarting the Pod, which is of great value in practical database use. In this embodiment, Pod expansion is achieved by connecting the container's underlying resource management with the upper-layer database application, so resources are expanded online, in real time at the Pod object level. Normal operation of the application is unaffected and the RTO is 0, which greatly improves the running efficiency and reliability of actual services.
In this embodiment, through the above scheme, when a capacity expansion instruction is received, a validity check is performed on the instruction through the business service layer; if the check passes, a first object Cgroups generated based on the instruction is acquired, a subtask job corresponding to the first object Cgroups is created through the Job service layer, and the first resource limit value in the first object is modified through the subtask job to expand the container group.
Referring to fig. 4, fig. 4 is a flowchart illustrating another exemplary embodiment of an online capacity expansion method for a container group according to the present application.
In step S1001, the step of obtaining the first object Cgroups generated based on the capacity expansion instruction if the check passes includes:
Step A100, if the check passes, acquiring a second object StatefulSet corresponding to the container group through the operation service layer;
Specifically, after the validity check of the capacity expansion request passes, the operation service layer Cgroup-operator obtains the second object StatefulSet from Kubernetes according to the request information. A StatefulSet object is an object component for managing Pods in Kubernetes; one StatefulSet can manage multiple Pods, and what it manages includes the update status, life cycle, restart policy and the like of the corresponding Pods. The object information of the managed Pods can be found through the StatefulSet.
Step a200, based on the second object, obtaining object information of the corresponding container group, and based on the object information of the container group, creating a first object corresponding to the container group.
Specifically, according to the StatefulSet object obtained in step A100, the corresponding Pod object information can be obtained, and then, based on the Pod object information, a corresponding Cgroups object is created for each Pod through an http service. The name of the Cgroups object is the same as that of the Pod, and the Cgroups object contains the data related to the capacity expansion, for example, that 2 cpu are to be added, where cpu is the resource unit.
In this embodiment, through the above scheme, if the check passes, the second object StatefulSet corresponding to the container group is obtained through the operation service layer, the object information of the corresponding container group is obtained based on the second object, and the first object corresponding to the container group is created based on that object information. That is, the object information of the managed Pods is obtained through the association between the StatefulSet and the Pods it manages, so that the corresponding Cgroups objects are created based on the Pod object information.
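Steps A100 and A200 can be sketched as follows. This is a minimal illustration, not the implementation of the application: the StatefulSet is represented as a plain dictionary, the API group `example.com/v1` and the field `addCpu` are hypothetical, and the only Kubernetes behavior relied on is that the Pods of a StatefulSet are named with the StatefulSet name followed by an ordinal.

```python
def pod_names(statefulset: dict) -> list:
    """List the Pod names a StatefulSet manages: <name>-0 .. <name>-(N-1)."""
    name = statefulset["metadata"]["name"]
    replicas = statefulset["spec"]["replicas"]
    return [f"{name}-{ordinal}" for ordinal in range(replicas)]


def build_cgroups_objects(statefulset: dict, add_cpu: int) -> list:
    """Create one Cgroups object per Pod, named after the Pod (step A200)."""
    namespace = statefulset["metadata"].get("namespace", "default")
    objects = []
    for pod in pod_names(statefulset):
        objects.append({
            "apiVersion": "example.com/v1",   # hypothetical API group
            "kind": "Cgroups",
            "metadata": {"name": pod, "namespace": namespace},
            "spec": {"podName": pod, "addCpu": add_cpu},  # hypothetical fields
        })
    return objects


# A StatefulSet with three replicas, expanded by 2 cpu per Pod.
mysql_set = {
    "metadata": {"name": "mysql", "namespace": "db"},
    "spec": {"replicas": 3},
}
cgroups_objects = build_cgroups_objects(mysql_set, add_cpu=2)
created_names = [obj["metadata"]["name"] for obj in cgroups_objects]
```

In a real deployment the resulting objects would be submitted to the API server; the sketch stops at building them so that it stays self-contained.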
In step S1003, after the step of modifying, by the job service layer, the resource request value and the limit value in the object Cgroups to expand the Pod, the method further includes:
Step B100, modifying, through the business service layer, a second resource limit value corresponding to the resource template of the container group corresponding to the second object, so that the container group is generated from the container group resource template with the second resource limit value after the next abnormal restart or manual restart.
Specifically, the second object is a StatefulSet object. The StatefulSet creates each Pod according to the PVC (PersistentVolumeClaim) corresponding to that Pod. For a StatefulSet with N replicas, each Pod in the StatefulSet is assigned an integer ordinal from 0 to N-1 that is unique within the set; the ordinal is the index information by which the StatefulSet identifies the Pod. The StatefulSet also holds the resource request value of the Pod template, that is, the second resource limit value. The second resource limit value in the StatefulSet is modified, so that when the Pod is next restarted abnormally and the StatefulSet recreates it according to its corresponding PVC, the Pod is created and pulled up according to the modified second resource limit value.
In this embodiment, the second resource limit value corresponding to the resource template of the container group corresponding to the second object is modified through the business service layer, so that the container group is generated from the container group resource template with the second resource limit value after the next abnormal restart. That is, the resource size of the Pod template in the StatefulSet object is modified through the Operation-service-server, so that the Pods controlled by the StatefulSet are generated from the modified Pod template at the next abnormal restart, which improves the reliability of Pod operation.
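Step B100 can be sketched as an update of the Pod template held in the StatefulSet. The snippet below is an assumption-laden illustration: it works on a plain dictionary shaped like a StatefulSet manifest (`spec.template.spec.containers[].resources.limits`), and the helper name `with_template_limits` is hypothetical. How the updated object is sent to the cluster is left out.

```python
import copy


def with_template_limits(statefulset: dict, container: str, limits: dict) -> dict:
    """Return a copy of the StatefulSet whose Pod template carries new limits."""
    updated = copy.deepcopy(statefulset)  # leave the caller's object untouched
    for spec in updated["spec"]["template"]["spec"]["containers"]:
        if spec["name"] == container:
            resources = spec.setdefault("resources", {})
            resources.setdefault("limits", {}).update(limits)
            return updated
    raise KeyError(f"container {container!r} not found in Pod template")


original_set = {
    "metadata": {"name": "mysql"},
    "spec": {
        "replicas": 1,
        "template": {"spec": {"containers": [
            {"name": "mysql", "resources": {"limits": {"cpu": "250m", "memory": "64Mi"}}},
        ]}},
    },
}

# Second resource limit value after the expansion described in the text.
expanded_set = with_template_limits(
    original_set, "mysql", {"cpu": "500m", "memory": "128Mi"}
)
template_limits = expanded_set["spec"]["template"]["spec"]["containers"][0]["resources"]["limits"]
```

Returning a modified copy keeps the pre-expansion template available for comparison, which is a design choice of this sketch rather than something the application prescribes.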
Referring to fig. 5, fig. 5 is a schematic flowchart illustrating an online capacity expansion method of a container group according to another exemplary embodiment of the present application.
In step S1004, after the step of modifying, by the subtask, the first resource limit value in the first object to expand the Pod, the method further includes:
step C100, acquiring a current version number field of the second object;
Specifically, the second object is a StatefulSet object. The StatefulSet object includes a version number field corresponding to the Pod; version number fields correspond one-to-one with values of the first resource limit value, and a new version number field is generated each time the resource limit value is changed. The current version number field is obtained from the StatefulSet object.
Step C200, modifying the current version number field through the business service layer so as to update it to the latest version number field;
Specifically, after the first resource limit value is modified, the Operation-service-server updates the current version number field to the latest version number field, which corresponds to the expanded first resource limit value. When the container group Pod becomes abnormal and needs to be rebuilt, it is rebuilt according to the updated latest version number field, so the rebuilt Pod has the expanded resource limit. This guarantees that the container group Pod is pulled up again with the latest resource configuration and improves the operating reliability of the container group Pod.
Step C300, deleting the version number fields other than the latest version number field, so that the container group is generated based on the second object corresponding to the latest version number field after the container group is next restarted abnormally.
Specifically, the version numbers before the current latest version number are historical version numbers, which correspond to historical Pod resource configurations. To avoid pulling up the Pod according to a historical version number, the historical version numbers are deleted and only the latest version number field is kept. Then, when a Pod is later rebuilt after an abnormality, it is pulled up again according to the Pod configuration corresponding to the latest version number in the second object StatefulSet, which improves the reliability of Pod reconstruction.
In this embodiment, through the above scheme, the current version number field of the second object is obtained and modified through the business service layer so that it is updated to the latest version number field, and the version number fields other than the latest one are deleted. After the container group is next restarted abnormally, the container group is therefore generated based on the second object corresponding to the latest version number field, which improves the reliability of pulling up the Pod again when an abnormal rebuild occurs during Pod operation.
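The bookkeeping of steps C100 to C300 can be sketched with a small in-memory model. This is purely illustrative: the class `VersionedTemplate`, its integer version numbers and its dictionary of limits are hypothetical stand-ins, since the text does not say how the version number field is stored in the StatefulSet.

```python
from dataclasses import dataclass, field


@dataclass
class VersionedTemplate:
    """Hypothetical stand-in for the version number fields of the second object."""
    current: int = 1
    revisions: dict = field(default_factory=dict)  # version number -> limits

    def record_expansion(self, limits: dict) -> int:
        """C200: move to a new latest version; C300: drop all older versions."""
        latest = self.current + 1
        self.revisions[latest] = dict(limits)
        self.current = latest
        for version in [v for v in self.revisions if v != latest]:
            del self.revisions[version]
        return latest

    def limits_for_rebuild(self) -> dict:
        """Limits used when the Pod is pulled up again after an abnormal restart."""
        return self.revisions[self.current]


template = VersionedTemplate(current=1, revisions={1: {"cpu": 2}})
latest_version = template.record_expansion({"cpu": 3})  # expand from 2 cpu to 3 cpu
rebuild_limits = template.limits_for_rebuild()
remaining_versions = sorted(template.revisions)
```

Because only the latest version survives, a rebuild cannot fall back to a historical configuration, which is the property the three steps are meant to secure.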
Referring to fig. 6, fig. 6 is a flowchart illustrating another exemplary embodiment of an online capacity expansion method for a container group according to the present application.
Step S1003, modifying the resource request value and the limit value in the object Cgroups through the job service layer, so as to expand the capacity of the Pod, including:
step D100, generating the subtask job corresponding to the first object based on the first object through the operation service layer;
Specifically, the Cgroup-operator monitors the creation of Cgroups objects created based on the capacity expansion request. After it detects that a Cgroups object has been created, it creates a job subtask corresponding to that Cgroups object on the Node where the corresponding Pod runs; the job subtask specifies the Pod that needs to be expanded.
Step D200, finding the first object corresponding to the container group through the subtask job based on the container mounting function in the container group, and modifying the first resource limit value in the first object through the subtask to expand the container group.
Specifically, mounting relates to the fact that the Linux system represents everything in the form of files: besides regular files, processes, disks and the like are also abstracted as files, so that a developer can access most resources in the Linux system with a single set of APIs and development tools, which improves productivity. However, before any device can be used, it must be mounted on a directory under the root directory; that is, the directory structure of the device is attached to the Linux file directory tree whose root is the root directory.
Through the mounting function of the container, the subtask job finds, in the Linux system, the Cgroup file corresponding to the Cgroups object of the corresponding Pod, and modifies that Cgroup file, thereby expanding the Pod corresponding to the file. Specifically, the subtask job modifies the first resource limit value of the Pod in the Cgroup file corresponding to the Cgroups object, for example, changing an original allocation of 2 cpu to 3 cpu, where cpu is the resource unit, so that the capacity of the Pod is expanded.
In this embodiment, through the above scheme, the job service layer generates the subtask job corresponding to the first object based on the first object; the subtask job finds the first object corresponding to the container group based on the container mounting function in the container group and modifies the first resource limit value in the first object to expand the container group. That is, through the mounting function, the subtask job accurately modifies the Cgroup file to carry out Pod expansion, which ensures the accuracy of Pod expansion.
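What the subtask job does in step D200 can be sketched as a file write under a mounted directory. The snippet is a hedged illustration: it uses a temporary directory in place of the mounted cgroup hierarchy, the sub-path `kubepods/pod-demo` is a hypothetical layout, and the file name `cpu.cfs_quota_us` with a 100000-microsecond period per cpu is the cgroup v1 convention, which the text does not confirm.

```python
import tempfile
from pathlib import Path


def write_limit(mount_root: Path, pod_cgroup: str, filename: str, value: int) -> Path:
    """Overwrite one limit file of a Pod's cgroup under the mounted root."""
    target = mount_root / pod_cgroup / filename
    if not target.exists():
        # Refuse to create files: a missing file means the Pod was not found.
        raise FileNotFoundError(target)
    target.write_text(f"{value}\n")
    return target


# Demonstration against a temporary directory standing in for the mount point.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    pod_dir = root / "kubepods" / "pod-demo"      # hypothetical layout
    pod_dir.mkdir(parents=True)
    quota_file = pod_dir / "cpu.cfs_quota_us"
    quota_file.write_text("200000\n")             # 2 cpu before expansion

    write_limit(root, "kubepods/pod-demo", "cpu.cfs_quota_us", 300000)  # 3 cpu
    result = quota_file.read_text().strip()
```

Checking that the file already exists before writing mirrors the "find, then modify" order described above and avoids silently writing to a wrong path.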
Referring to fig. 7, fig. 7 is a flowchart illustrating another exemplary embodiment of an online capacity expansion method for a container group according to the present application.
In step D100, the step of generating the corresponding subtask based on the first object through the job service layer includes:
step E100, acquiring the creation condition of the first object group through the operation service layer;
Specifically, the operation service layer Cgroup-operator is a service in the operation and maintenance server that is customized and secondarily developed on the basis of Kubernetes for the Pod capacity expansion method of the present application. The Cgroup-operator can be used to monitor the creation of the first object, that is, the Cgroups object.
Step E200, when it is monitored that the first object is created in the first object group, acquiring node information of the container group corresponding to the first object, and generating the subtask job corresponding to the first object based on the node information.
Specifically, the object used for managing a Pod is custom-named Cgroups; these Cgroups objects are not the Cgroup of Linux. Each Cgroups object corresponds to one Cgroup file, and each Cgroup file corresponds to the resources of one Pod. The Cgroup file records the resource allocation of the corresponding Pod. CPU resource limits and requests are expressed in units of cpu; for example, if a Pod requests 8 cpu, that requested number is recorded in the Cgroups object information. The Cgroup file also records the isolation of the Pod: multiple Pods can exist on one Node, each Pod runs in isolation on the Node, and the isolation location information and the like are recorded in the Cgroup file.
The creation of Cgroups objects is monitored through the operation service layer. When a newly added Cgroups object is detected, a subtask job is generated for it. By modifying the resource configuration information in the Cgroup file corresponding to the Cgroups object, the subtask job modifies the resource allocation of the Pod corresponding to the Cgroups object, thereby achieving Pod capacity expansion.
In this embodiment, through the above scheme, the creation status of the first object group is obtained through the operation service layer; when it is monitored that the first object is created in the first object group, node information of the container group corresponding to the first object is obtained, and the subtask job corresponding to the first object is generated based on the node information. That is, the Cgroups object is the object that manages the Pod, a job subtask is generated for the Cgroups object, and the job subtask is associated with the Pod through this management relationship, so that the capacity expansion task can be executed more reliably.
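Steps E100 and E200 can be sketched as a handler over watch events. The following is an illustration under assumptions: events are plain dictionaries with a `type` of `ADDED`, `MODIFIED` or `DELETED` as delivered by a Kubernetes watch, the Job is pinned to the Pod's Node with the `nodeName` field, and the image name, Job name prefix and the `node_of_pod` lookup table are hypothetical.

```python
def job_for_cgroups(cgroups_obj: dict, pod_node: str) -> dict:
    """Build a Job manifest that runs on the same Node as the target Pod."""
    name = cgroups_obj["metadata"]["name"]  # same name as the Pod
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": f"expand-{name}"},          # hypothetical naming
        "spec": {"template": {"spec": {
            "nodeName": pod_node,                        # pin to the Pod's Node
            "restartPolicy": "Never",
            "containers": [{
                "name": "expand",
                "image": "example/cgroup-writer:latest",  # hypothetical image
                "args": [name],
            }],
        }}},
    }


def handle_events(events: list, node_of_pod: dict) -> list:
    """Create one Job per newly added Cgroups object; ignore other events."""
    jobs = []
    for event in events:
        if event["type"] != "ADDED":
            continue
        obj = event["object"]
        jobs.append(job_for_cgroups(obj, node_of_pod[obj["metadata"]["name"]]))
    return jobs


events = [
    {"type": "ADDED", "object": {"metadata": {"name": "mysql-0"}}},
    {"type": "MODIFIED", "object": {"metadata": {"name": "mysql-1"}}},
]
jobs = handle_events(events, {"mysql-0": "worker-1", "mysql-1": "worker-2"})
```

Only the `ADDED` event produces a Job here, matching the description that a subtask is generated when a Cgroups object is newly created.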
Step D200, after the step of modifying the first resource restriction value of the first object corresponding to the Pod through the subtask to expand the Pod, the method further includes:
Step F100, destroying the subtask corresponding to the container group after the capacity expansion of the container group is completed.
Specifically, the job subtask is the executor that modifies the first resource limit value in the Cgroup file corresponding to the Cgroups object to expand the Pod. The job and the Pod are on the same Node; after the task has been executed, the job is destroyed to release the resources it occupies and improve the operating efficiency of the Pod.
In this embodiment, through the above scheme, the subtask corresponding to the container group is destroyed after the capacity expansion of the container group is completed. That is, the job subtask serves as the executor of the Pod capacity expansion and is destroyed once the expansion is complete, which frees resources for the Pod and improves its operating efficiency.
In addition, an embodiment of the present application further provides an online capacity expansion device for a container group, where the online capacity expansion device for the container group includes:
the verification module is used for carrying out validity verification on the capacity expansion instruction through the business service layer when the capacity expansion instruction is received;
the first creating module is used for acquiring a first object Cgroups generated based on the capacity expansion instruction if the check passes;
a second creating module, configured to create the subtask JOB corresponding to the first object Cgroups through the JOB service layer JOB;
and the modification module is used for modifying the resource limit value in the first object through the operation service layer so as to expand the container group.
For the principle and implementation process of the online capacity expansion of the container group in this embodiment, please refer to the above embodiments, which are not described herein again.
In addition, an embodiment of the present application further provides a terminal device, where the terminal device includes a memory, a processor, and an online capacity expansion program of a container group that is stored in the memory and is executable on the processor, and the online capacity expansion program of the container group, when executed by the processor, implements the steps of the above-mentioned online capacity expansion method of the container group.
Since the online capacity expansion program of the present container group is executed by the processor, all technical solutions of all the foregoing embodiments are adopted, so that at least all the beneficial effects brought by all the technical solutions of all the foregoing embodiments are achieved, and details are not repeated herein.
In addition, an embodiment of the present application further provides a computer-readable storage medium, where an online capacity expansion program of a container group is stored on the computer-readable storage medium, and when executed by a processor, the online capacity expansion program of the container group implements the above-mentioned step of online capacity expansion of the container group.
Since the online capacity expansion program of the present container group is executed by the processor, all technical solutions of all the foregoing embodiments are adopted, so that at least all beneficial effects brought by all the technical solutions of all the foregoing embodiments are achieved, and details are not repeated herein.
In the prior art, capacity expansion deletes the old Pod and rebuilds a new Pod, so a restart is needed to update the Pod resource configuration to the expanded configuration. When the Pod runs a stateful application, the restart causes abnormal interruption of the application and a long RTO (Recovery Time Objective, the time from interruption to recovery), which gives users of the stateful application a poor experience.
With the online capacity expansion method of the container group of the present application, the resource configuration of the Pod is modified and applied in real time by modifying the first resource limit value in the Cgroup file corresponding to the resource management object of the target Pod, namely the Cgroups object. Subsequent Pods are generated with the expanded resource configuration by modifying the resource parameter of the Pod template, namely the second resource limit value. When the Pod is restarted abnormally, it is pulled up with the latest resource configuration because the current version number has been updated and the version numbers before it have been deleted. A Pod capacity expansion method that does not require restarting the Pod is thereby provided; the value this brings to application operation can be understood with reference to the example in step S1004.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or system comprising the element.
The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, a controlled terminal, or a network device) to execute the method of each embodiment of the present application.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. An online capacity expansion method for a container group, applied to an operation and maintenance server, wherein the operation and maintenance server comprises a business service layer Operation-service-server, an operation service layer Cgroup-operator and a job service layer JOB, and the online capacity expansion method for the container group comprises the following steps:
when a capacity expansion instruction is received, carrying out validity check on the capacity expansion instruction through the business service layer;
if the check passes, acquiring a first object Cgroups generated based on the capacity expansion instruction;
creating a subtask job corresponding to the first object Cgroups through the job service layer JOB;
and modifying the first resource limit value in the first object through the subtask job to expand the container group.
2. An online capacity expansion method for a container group according to claim 1, wherein the step of obtaining the first object Cgroups generated based on the capacity expansion instruction if the result of the check is passed comprises:
if the check passes, acquiring a second object StatefulSet corresponding to the container group through the operation service layer;
and acquiring the object information of the corresponding container group based on the second object, and creating a first object corresponding to the container group based on the object information of the container group.
3. An online capacity expansion method for a container group according to claim 2, wherein after the step of modifying the first resource limit value in the first object by the subtask job to expand the container group, the method further comprises:
and modifying a second resource limit value corresponding to the resource template of the container group corresponding to the second object through the business service layer, so that the container group is generated by the container group resource template corresponding to the second resource limit value after the next abnormal restart.
4. An online capacity expansion method for a container group according to claim 3, wherein after the step of modifying the first resource limit value in the first object by the subtask job to expand the container group, the method further comprises:
acquiring a current version number field of the second object;
modifying the current version number field through the business service layer so as to update the current version number field into a latest version number field;
and deleting the version number field except the latest version number field so as to generate the container group based on the second object corresponding to the latest version number field after the container group is abnormally restarted next time.
5. An online capacity expansion method for a container group according to claim 1, wherein the step of modifying the first resource limit value in the first object by the subtask job to expand the container group comprises:
generating the subtask job corresponding to the first object based on the first object through the job service layer;
and finding the first object corresponding to the container group through the subtask job based on a container mounting function in the container group, and modifying the first resource limit value in the first object through the subtask to expand the container group.
6. An online capacity expansion method for a container group according to claim 5, wherein the step of generating, by the job service layer, a corresponding subtask based on the first object further comprises:
acquiring the creation condition of a first object group through the operation service layer;
when it is monitored that the first object is created in the first object group, node information of the container group corresponding to the first object is obtained, and the subtask job corresponding to the first object is generated based on the node information.
7. An online capacity expansion method for a container group according to claim 5, wherein after the step of modifying the first resource limit value of the first object corresponding to the container group by the subtask to expand the container group, the method further comprises:
and after the capacity expansion of the container group is completed, destroying the subtasks corresponding to the container group.
8. An online capacity expansion device of a container group, comprising:
the verification module is used for carrying out validity verification on the capacity expansion instruction through the business service layer when the capacity expansion instruction is received;
the first creating module is used for acquiring a first object Cgroups generated based on the capacity expansion instruction if the check passes;
a second creating module, configured to create, through the JOB service layer JOB, the subtask JOB corresponding to the first object Cgroups;
and the modification module is used for modifying the first resource limit value in the first object through the subtask job so as to expand the container group.
9. A terminal device, characterized in that the terminal device comprises a memory, a processor and an online capacity expansion program of a container group stored on the memory and operable on the processor, and when executed by the processor, the online capacity expansion program of the container group implements the steps of the online capacity expansion method of the container group according to any one of claims 1 to 7.
10. A computer-readable storage medium, on which an online capacity expansion program of a container group is stored, which when executed by a processor, implements the steps of the online capacity expansion method of a container group according to any one of claims 1 to 7.
CN202210941324.XA 2022-08-04 2022-08-04 Online capacity expansion method and device of container group, terminal equipment and medium Pending CN115357336A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210941324.XA CN115357336A (en) 2022-08-04 2022-08-04 Online capacity expansion method and device of container group, terminal equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210941324.XA CN115357336A (en) 2022-08-04 2022-08-04 Online capacity expansion method and device of container group, terminal equipment and medium

Publications (1)

Publication Number Publication Date
CN115357336A true CN115357336A (en) 2022-11-18

Family

ID=84001442

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210941324.XA Pending CN115357336A (en) 2022-08-04 2022-08-04 Online capacity expansion method and device of container group, terminal equipment and medium

Country Status (1)

Country Link
CN (1) CN115357336A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116560804A (en) * 2023-07-10 2023-08-08 中国人民解放军国防科技大学 Method and apparatus for interoperating containers using multiple container images
CN116560804B (en) * 2023-07-10 2023-09-05 中国人民解放军国防科技大学 Method and apparatus for interoperating containers using multiple container images

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination