CN114185679A - Container resource scheduling method and device, computer equipment and storage medium - Google Patents

Container resource scheduling method and device, computer equipment and storage medium

Info

Publication number
CN114185679A
Authority
CN
China
Prior art keywords
container
resource
resources
running
dominant
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111534118.9A
Other languages
Chinese (zh)
Inventor
李华
温丽明
帅翡芍
郑洁锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC filed Critical Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202111534118.9A priority Critical patent/CN114185679A/en
Publication of CN114185679A publication Critical patent/CN114185679A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5016Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/18Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • G06F9/5072Grid computing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00Computing arrangements based on specific mathematical models
    • G06N7/01Probabilistic graphical models, e.g. probabilistic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • G06F2009/45595Network integration; Enabling network access in virtual machine instances

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Algebra (AREA)
  • Probability & Statistics with Applications (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Operations Research (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application relates to a container resource scheduling method and apparatus, a computer device, a storage medium and a computer program product, which can be used in the field of big data or in other fields. The method constructs a Markov chain directed graph whose state nodes are the multiple running resources of a container, obtains the transition probability matrix of the Markov chain directed graph from the historical state transition times between the state nodes, determines a second dominant running resource of the container from the transition probability matrix, the number of running resources in the container and the current dominant running resource, i.e. the resource used by the service that currently uses the most running resources in the container, and schedules the running resources in the container according to the second dominant running resource. Compared with the traditional approach of monitoring changes in the container's dominant running resource in real time and then performing a series of scheduling operations, this scheme predicts the dominant running resource in the container by constructing a Markov chain directed graph and using its transition probability matrix, thereby improving the efficiency of scheduling the resources in the container.

Description

Container resource scheduling method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of big data technologies, and in particular, to a method and an apparatus for scheduling container resources, a computer device, a storage medium, and a computer program product.
Background
With the rapid development of computer technology, cloud computing has become a primary mode of data processing, demand for cloud computing applications has grown rapidly, and traditional virtualization technology has exposed shortcomings in resource utilization efficiency, performance and the like. Current cloud computing application delivery platforms are therefore typically based on container technology. To ensure that the resources in a container meet the requirements of the service scenarios it is running, those resources need to be scheduled. At present, container resources are usually scheduled by monitoring the container's currently dominant running resource in real time and then performing a series of scheduling operations. However, monitoring changes in resource demand in real time and only then scheduling can mean that the currently dominant service cannot obtain the resources it needs in time.
Therefore, the current container resource scheduling approach suffers from low scheduling efficiency.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a container resource scheduling method, apparatus, computer device, computer readable storage medium and computer program product capable of improving scheduling efficiency.
In a first aspect, the present application provides a method for scheduling container resources, where the method includes:
acquiring multiple running resources of a container, and constructing a Markov chain directed graph from the multiple running resources, the Markov chain directed graph comprising a plurality of state nodes;
acquiring the historical state transition times between the plurality of state nodes, and obtaining a transition probability matrix of the Markov chain directed graph from the historical state transition times; and
determining a second dominant running resource of the container according to the transition probability matrix, the number of running resources in the container and the first dominant running resource, and scheduling the running resources in the container according to the second dominant running resource; the dominant running resource is the running resource used by the service that uses the most running resources in the container; the first dominant running resource is invoked earlier than the second dominant running resource.
In one embodiment, acquiring the multiple running resources of the container includes:
acquiring the processor resource, memory resource, network input/output resource and block input/output resource corresponding to the container as the multiple running resources.
In one embodiment, the method further includes:
receiving a start instruction for a container, and acquiring the average running-resource consumption per unit time of each started service in the container, to obtain a plurality of average running-resource consumption values; and
determining an initial dominant service according to the maximum of the average running-resource consumption values, and taking the running resource corresponding to the initial dominant service as the initial dominant running resource.
In one embodiment, obtaining the transition probability matrix of the Markov chain directed graph according to the historical state transition times includes:
acquiring the numbers of historical state transitions corresponding to transitions from a first state node to a second state node in the Markov chain directed graph, and acquiring the sum of these numbers;
for each number of historical state transitions, taking its ratio to the sum as the transition probability of the corresponding state transition; and
obtaining the transition probability matrix from the resulting transition probabilities.
In one embodiment, determining the second dominant running resource of the container according to the transition probability matrix, the number of running resources in the container and the first dominant running resource includes:
for each service running in the container, acquiring the ratio of each running resource used by the service to the resource upper limit of the corresponding running resource in the container, and obtaining the running-resource consumption value of the service according to these ratios;
taking the maximum of the running-resource consumption values of the services as the first dominant running resource of the container;
forming a Bayesian conditional probability formula according to the first dominant running resource corresponding to the first time step and the transition probability matrix, and determining a plurality of dominant running resources corresponding to the second time step, the first time step being earlier than the second time step; and
acquiring, among the dominant running resources corresponding to the second time step, the candidate dominant running resources that are greater than or equal to the first dominant running resource, and taking the maximum of the candidate dominant running resources as the second dominant running resource.
In one embodiment, after determining the second dominant running resource of the container, the method further includes:
detecting whether the remaining running resources in the container are greater than a preset running-resource value for the second dominant running resource; and
if not, re-determining the second dominant running resource of the container so that the second dominant running resource is smaller than the preset running-resource value of the remaining running resources.
In a second aspect, the present application provides an apparatus for scheduling container resources, the apparatus comprising:
a first acquisition module, configured to acquire multiple running resources of a container and construct a Markov chain directed graph from the multiple running resources, the Markov chain directed graph comprising a plurality of state nodes;
a second acquisition module, configured to acquire the historical state transition times between the plurality of state nodes and to obtain a transition probability matrix of the Markov chain directed graph according to the historical state transition times; and
a scheduling module, configured to determine a second dominant running resource of the container according to the transition probability matrix, the number of running resources in the container and the first dominant running resource, and to schedule the running resources in the container according to the second dominant running resource; the dominant running resource is the running resource used by the service that uses the most running resources in the container; the first dominant running resource is invoked earlier than the second dominant running resource.
In a third aspect, the present application provides a computer device comprising a memory storing a computer program and a processor implementing the steps of the method described above when the processor executes the computer program.
In a fourth aspect, the present application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method described above.
In a fifth aspect, the present application provides a computer program product comprising a computer program which, when executed by a processor, performs the steps of the method described above.
According to the container resource scheduling method, apparatus, computer device, storage medium and computer program product, a Markov chain directed graph is constructed with the container's multiple running resources as its state nodes, the transition probability matrix of the Markov chain directed graph is obtained from the historical state transition times between the state nodes, a second dominant running resource of the container is determined from the transition probability matrix, the number of running resources in the container and the dominant running resource used by the service that currently uses the most running resources in the container, and the running resources in the container are scheduled according to the second dominant running resource. Compared with the traditional approach of monitoring changes in the container's dominant running resource in real time and then performing a series of scheduling operations, this scheme predicts the dominant running resource in the container by constructing a Markov chain directed graph and using its transition probability matrix, thereby improving the efficiency of scheduling the resources in the container.
Drawings
FIG. 1 is a diagram of an application environment of a container resource scheduling method in one embodiment;
FIG. 2 is a flow diagram illustrating a method for scheduling container resources according to one embodiment;
FIG. 3 is a diagram illustrating an exemplary system for scheduling container resources;
FIG. 4 is a schematic structural diagram of a Markov chain directed graph in one embodiment;
FIG. 5 is a flowchart illustrating a method for scheduling container resources according to another embodiment;
FIG. 6 is a block diagram of an apparatus for scheduling container resources according to an embodiment;
FIG. 7 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The container resource scheduling method provided by the embodiment of the application can be applied to the application environment shown in fig. 1. The data storage system may store data that the server 104 needs to process. The data storage system may be integrated on the server 104, or may be located on the cloud or other network server. The server 104 may construct a markov chain directed graph based on a plurality of operating resources in the container, and obtain a corresponding transition probability matrix based on the historical state transition times of a plurality of state nodes in the directed graph, so that the server 104 predicts and schedules the dominant operating resources in the container based on the transition probability matrix. In some embodiments, a terminal 102 is also included. Wherein the terminal 102 communicates with the server 104 via a network. The server 104 may send the scheduling result to the terminal 102, so that the terminal 102 may display the scheduling result and the operating status of the server 104. The terminal 102 may be, but is not limited to, various personal computers, notebook computers, smart phones, and tablet computers. The server 104 may be implemented as a stand-alone server or as a server cluster comprised of multiple servers. It should be noted that the container resource scheduling method disclosed in the present disclosure may be used in the field of big data, and may also be used in any field other than the field of big data.
In one embodiment, as shown in fig. 2, a method for scheduling container resources is provided, which is described by taking the method as an example applied to the server in fig. 1, and includes the following steps:
step S202, obtaining various running resources of the container, and constructing a Markov chain directed graph according to the various running resources; the Markov chain directed graph comprises a plurality of state nodes.
A container is a product of cloud computing and can be understood as the next-generation successor to the typical virtual machine. Because containers are lightweight, quick to deploy, open source and elastically scalable, they are now widely used across the Internet industry, and remote-working patterns have further promoted the technology, for example through container-based cloud desktops. Docker is one kind of container: an open-source application container engine that lets developers package their applications and dependencies into a portable image, distribute that image to any popular Linux or Windows machine, and also achieve virtualization. Containers are fully sandboxed, with no interfaces between one another. A container may include a plurality of running resources used to keep the container operating, and the container must use these resources to support the execution of different services. Since different services require different running resources and the services running in the container may change over time, the container needs to schedule its own resources to meet the resource demand of the currently busiest service; that is, the resources in the container need to be scheduled.
Server 104 may first initialize a container and set an initial dominant running resource for it. The structure of the container may be as shown in fig. 3, which is a schematic structural diagram of a container resource scheduling system in one embodiment. The server 104 may place the container service according to the given initial dominant running resource, with the container service in a temporary steady state. Server 104 may employ a container cluster performance monitoring scheme based on cAdvisor and InfluxDB. cAdvisor is a tool open-sourced by Google for analyzing the resource usage and performance characteristics of running Docker containers. It runs as a daemon process that collects, aggregates, processes and exports information about running containers; each container keeps independent parameters, historical resource usage and complete resource usage data. InfluxDB is a database designed for high write loads, for storing large amounts of time-series data and for real-time analysis. The Google open-source tool cAdvisor is used to collect detailed monitoring metrics for the cluster in three dimensions — nodes, containers and services — and the collector layer is extended with database functionality. The distributed time-series database InfluxDB is adopted as the monitoring data storage module, and cluster data are collected, partitioned, archived, stored and linked by time slice. The dominant running resource is the resource used by the most heavily running service in the container, and it determines the main resource demand of the container service. In cloud computing, the dominant running resource of a particular container service also follows certain rules. For example, a container service that provides face recognition needs to pull a set of images containing computing functions from an image registry during a certain period; after the images are installed, they are used frequently during another period. Clearly, the dominant running resource attribute differs between downloading the images and running the face-recognition computation, so scheduling is needed, especially in a heterogeneous cluster environment. Scheduling means moving a container to another host node to run. One node can run multiple containers; in k8s, a container is usually contained in a pod, and the pod is the minimum unit of scheduling. In general, within a node each container shares the resources of the host node: because a container has no kernel of its own and no resource limit, it can use as many resources as its program requires.
When the frequently used services in the container change, the server 104 needs to perform resource scheduling on the container. The server 104 may determine the resources to be scheduled in advance by prediction: it may obtain multiple running resources of the container and construct a Markov chain directed graph from them. The Markov chain is a concept from probability theory and mathematical statistics for describing a discrete random process, and is useful in power systems, queueing theory, finance and signal processing. Mathematically, a Markov chain is defined over a set of random variables X and a state space A such that the probability of moving from the state at step t to the state at step t+1 (t can be understood as a time step) is independent of earlier states. This property is called the absence of after-effect, or memorylessness, and it is the most important property of a Markov process: once the state at some stage is known, the subsequent development of the process depends only on that state. For example, whether I eat lunch today depends only on whether I ate breakfast this morning, not on what happened yesterday or the day before. The Markov chain directed graph may include a plurality of state nodes, and each running resource may be a state node in the graph. For example, in one embodiment, acquiring the multiple running resources of a container includes: acquiring the processor resource, memory resource, network input/output resource and block input/output resource corresponding to the container as the multiple running resources. In this embodiment, the server 104 may take the processor resource, memory resource, network input/output resource and block input/output resource of the container as the multiple running resources. Each resource corresponds to a different processing device in the container: the processor resource may be the container's CPU, the memory resource the container's memory, the network input/output resource Net I/O, and the block input/output resource Block I/O.
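As an illustration of the four running resources described above, the following sketch groups them into one record per container; the field names are hypothetical and only stand in for whatever the monitoring layer actually reports.

```python
# A minimal sketch (hypothetical field names) of the four running-resource
# metrics collected for a container: processor, memory, network I/O and block I/O.
from dataclasses import dataclass

@dataclass
class ContainerResources:
    cpu_usage: float      # processor resource consumption
    mem_usage: float      # memory resource consumption
    net_io: float         # network input/output resource consumption
    block_io: float       # block input/output resource consumption

    def as_dict(self) -> dict:
        return {"cpu": self.cpu_usage, "mem": self.mem_usage,
                "net_io": self.net_io, "block_io": self.block_io}
```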
The server 104 may construct a Markov chain directed graph using the running resources described above, for example as shown in fig. 4, which is a schematic structural diagram of a Markov chain directed graph in one embodiment. According to the memorylessness of the Markov chain, the server 104 can show that the initial placement of the container, i.e. the initial setting of the dominant running resource, does not affect the state of the Markov chain at other time steps. The server 104 first takes the container's memory usage, CPU usage, Net I/O and Block I/O as the four discrete states of a Markov chain. Edges between two state points are bidirectional, and a state can also remain itself, e.g. state S1 staying in state S1. Starting from the initial dominant running resource, the server 104 uses the four discrete states of the container service's dominant running resource as four chain nodes, thereby converting the choice of cluster node for the container service's dominant running resource into a point-to-point problem on the directed graph. The evolution of the Markov chain can then be represented as a transition graph: each edge of the graph is assigned a transition probability, which represents the probability of moving from one state to another. In probability theory and mathematical statistics, the probability of the next state of a Markov chain depends only on the previous state. An applied example of a Markov chain is choosing the direction of movement of a target object in road traffic in some machine-learning algorithms: a person at intersection A may go on to intersection B, or to C, or D. This choice is independent of the intersections walked through earlier, because as long as the person is at intersection A, the choice is a transition state and one of the fixed intersections will surely be chosen to reach the destination.
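The directed graph described above can be sketched as follows, assuming the four resource states are simply labelled by name; every ordered pair of states, including self-loops such as S1 to S1, is an edge whose weight will later hold an observed transition count. This is a minimal illustration, not the application's own data structure.

```python
# Sketch of the four-state Markov chain directed graph: states are the four
# running resources, edges are all ordered state pairs (including self-loops),
# and each edge stores a historical transition count (initialized to zero).
STATES = ["cpu", "mem", "net_io", "block_io"]

def build_transition_graph(states):
    """Directed graph as an adjacency dict: (src, dst) edge -> historical count."""
    return {(src, dst): 0 for src in states for dst in states}

graph = build_transition_graph(STATES)
graph[("cpu", "mem")] += 1   # record one observed dominant-resource transition
```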
And step S204, acquiring the corresponding historical state transition times among the plurality of state nodes, and acquiring a transition probability matrix corresponding to the Markov chain directed graph according to the historical state transition times.
The server 104 may determine the transition probability matrix of the Markov chain directed graph based on the historical state transition times between the nodes of the graph. A transition probability is the probability that the state in the Markov chain directed graph changes from one node to another. The elements of the transition probability matrix are non-negative and the elements of each row sum to 1; every element is a probability describing the transition from one state to another under a given condition.
Step S206, determining a second dominant running resource of the container according to the transition probability matrix, the number of running resources in the container and the first dominant running resource, and scheduling the running resources in the container according to the second dominant running resource; the dominant running resource is the running resource used by the service that uses the most running resources in the container; the first dominant running resource is invoked earlier than the second dominant running resource.
After the server 104 obtains the transition probability matrix, it may determine the second dominant running resource of the container based on the transition probability matrix, the number of running resources in the container and the first dominant running resource of the container, so that the server 104 predicts how the dominant running resource of the container will change and schedules the running resources according to the predicted second dominant running resource. Here, the dominant running resource is the running resource used by the service that uses the most running resources in the container, and the first dominant running resource is invoked earlier than the second dominant running resource. For example, the first dominant running resource may be the running resource currently being invoked in the container, and the second dominant running resource the next dominant running resource that the container is preparing to invoke.
In the above container resource scheduling method, a Markov chain directed graph is constructed with the container's multiple running resources as its state nodes, the transition probability matrix of the Markov chain directed graph is obtained from the historical state transition times between the state nodes, a second dominant running resource of the container is determined from the transition probability matrix, the number of running resources in the container and the dominant running resource used by the service that currently uses the most running resources in the container, and the running resources in the container are scheduled according to the second dominant running resource. Compared with the traditional approach of monitoring changes in the container's dominant running resource in real time and then performing a series of scheduling operations, this scheme predicts the dominant running resource in the container by constructing a Markov chain directed graph and using its transition probability matrix, thereby improving the efficiency of scheduling the resources in the container.
In one embodiment, the method further includes: receiving a start instruction for a container, and acquiring the average running-resource consumption per unit time of each started service in the container, to obtain a plurality of average running-resource consumption values; and determining an initial dominant service according to the maximum of the average running-resource consumption values, and taking the running resource corresponding to the initial dominant service as the initial dominant running resource.
In this embodiment, the server 104 may perform an initial resource configuration when the container starts running. On receiving a start instruction for the container, the server 104 may obtain the average running-resource consumption per unit time of each started service in the container, yielding several average consumption values. The server 104 may determine the initial dominant service from the maximum of these averages and take the running resource used by that service as the initial dominant running resource; that is, when the container is first started, the server 104 takes the resource used by the service with the largest resource consumption as the dominant running resource. Specifically, the server 104 first sets the dominant running resource of the container service and records it in InfluxDB, and cAdvisor periodically writes the average resource consumption per unit time of each node, container and service of the cluster into a data table, including the consumption of memory, CPU, Net I/O and Block I/O. According to the configured dominant running resource of the container service, the server 104 directs the container or service load to the nodes with the most of that running resource, calls the netstat service to listen for conflicting ports, and filters the cluster node set.
With the embodiment, the server 104 may determine the initial dominant run resource by detecting the resource consumption value of the service when the container is started, so as to improve the efficiency of resource scheduling.
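A minimal sketch of this initialization step, assuming per-service average consumption values are already available from the monitoring layer; the service names, resource labels and numbers below are illustrative rather than values from the application.

```python
# Sketch: pick the initial dominant running resource at container start-up as
# the resource used by the service with the largest average consumption per
# unit time (service names and figures are made up).
def initial_dominant_resource(avg_consumption_per_service: dict) -> tuple:
    """avg_consumption_per_service: service -> (resource_type, average value)."""
    service, (resource, _) = max(avg_consumption_per_service.items(),
                                 key=lambda kv: kv[1][1])
    return service, resource

averages = {"face-recognition": ("cpu", 0.62),
            "image-pull":       ("net_io", 0.48)}
print(initial_dominant_resource(averages))  # ('face-recognition', 'cpu')
```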
In one embodiment, obtaining a transition probability matrix corresponding to a markov chain directed graph according to the historical state transition times includes: acquiring a plurality of historical state transition times corresponding to transition from a first state node to a second state node in a Markov chain directed graph, and acquiring the sum of the plurality of historical state transition times; acquiring the ratio of the historical state transition times to the sum of the historical state transition times as the transition probability corresponding to the state transition for each historical state transition time; and obtaining a transition probability matrix according to the plurality of transition probabilities.
In this embodiment, the server 104 may obtain a corresponding transition probability matrix based on the number of times of historical state transition in the markov chain directed graph. For example, the server 104 may obtain a plurality of historical state transition times corresponding to transitions from a first state node to a second state node in the markov chain directed graph. The first state node and the second state node may be two different or the same nodes in the directed graph, and the server 104 may obtain the number of transitions of state transition between any two nodes in the directed graph. The server 104 may also obtain the sum of the above multiple historical state transition times; for each historical state transition number, the server 104 may obtain a ratio of the historical state transition number to a sum of the historical state transition numbers as a transition probability corresponding to the state transition, so that the server 104 may obtain the state transition probability matrix according to the transition probabilities.
The server 104 may set the structure of the state transition probability matrix in advance. For example, the server 104 may model the container service's dominant running resource as a random sequence $\{X(t), t = 0, 1, 2, \ldots\}$ over a discrete state space $E$ such that, for any natural number $k$, any $m$ positive integers $t_1, t_2, \ldots, t_m$ $(0 \le t_1 < t_2 < \ldots < t_m)$, and any $i_1, i_2, \ldots, i_m, j \in E$,
$$P\{X(t_m + k) = j \mid X(t_1) = i_1, X(t_2) = i_2, \ldots, X(t_m) = i_m\} = P\{X(t_m + k) = j \mid X(t_m) = i_m\}.$$
In this formula, $t_m$ represents the current time, $t_1, t_2, \ldots, t_{m-1}$ the elapsed times, and $t_m + k$ a time $k$ steps in the future. Following the Markov property, the server 104 writes the $k$-step transition probability of the Markov chain at time $t$ as $P_{ij}(t, t+k)$:
$$P_{ij}(t, t+k) = P\{X(t_m + k) = j \mid X(t_m) = i_m\}, \quad k > 0.$$
The container service's dominant running resource attribute is a homogeneous Markov chain: the transition probability depends only on the departure state $i$, the number of transition steps $k$ and the arrival state $j$, and not on the time $t$. The $k$-step transition probability can therefore be written $P_{ij}(k)$, i.e.
$$P_{ij}(k) = P_{ij}(t, t+k) = P\{X(t+k) = j \mid X(t) = i\}, \quad k > 0,$$
where $P_{ij}(k) \ge 0$ and $\sum_{j \in E} P_{ij}(k) = 1$.
When $k = 1$, $P_{ij}(1)$ is the one-step transition probability, written $P_{ij}$. The one-step transition matrix $P(1)$ is composed of all one-step transition probabilities $P_{ij}$, and the matrix $P(t)$ composed of all $t$-step transition probabilities $P_{ij}(t)$ is the $t$-step transition probability matrix of the Markov chain.
The server 104 may also derive the functional relationship between transition probability matrices from the Chapman-Kolmogorov equation (also called the C-K equation), which is used to compute $n$-step transition probabilities: for a homogeneous Markov chain, the $n$-step transition probability matrix is the $n$-th power of the one-step transition probability matrix. From the Chapman-Kolmogorov equation the server 104 obtains
$$P_{ij}(t) = \sum_{r \in E} P_{ir}(1)\,P_{rj}(t-1),$$
and hence the recurrence relation $P(t) = P(1)P(t-1) = P(t-1)P(1)$, from which $P(t) = P(1)^t$. Further, writing the number of historical state transitions of the container service's dominant running resource attribute from state $i$ to state $j$ as $C_{ij}$, the transition probabilities are obtained as
$$P_{ij} = \frac{C_{ij}}{\sum_{j \in E} C_{ij}},$$
and, by the recurrence relation above, the $k$-step transition probability matrix is
$$P(k) = P(1)^k.$$
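The computation just described — normalizing the historical transition counts $C_{ij}$ row by row to obtain $P(1)$, then raising $P(1)$ to the $k$-th power for the $k$-step matrix — can be sketched as follows; the count matrix is illustrative.

```python
# Sketch of the transition-probability computation: each row of P(1) is the
# historical transition counts divided by the row sum, and the k-step matrix
# is P(1) raised to the k-th power (Chapman-Kolmogorov).
import numpy as np

def one_step_matrix(counts: np.ndarray) -> np.ndarray:
    row_sums = counts.sum(axis=1, keepdims=True)
    return counts / row_sums            # P_ij = C_ij / sum_j C_ij

def k_step_matrix(counts: np.ndarray, k: int) -> np.ndarray:
    return np.linalg.matrix_power(one_step_matrix(counts), k)   # P(k) = P(1)^k

C = np.array([[4, 2, 1, 1],             # illustrative historical transitions
              [1, 5, 2, 2],             # between the four resource states
              [2, 1, 3, 1],             # (cpu, mem, net_io, block_io)
              [1, 1, 1, 4]], dtype=float)
print(k_step_matrix(C, 2))
```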
through the embodiment, the server 104 can obtain the transition probability matrix based on a plurality of historical state transition times, so that the server 104 can predict the dominant operation resource transition in the container based on the transition probability matrix, and the efficiency of resource scheduling in the container is improved.
In one embodiment, determining a second dominant run resource corresponding to the container according to the transition probability matrix, the number of run resources in the container, and the first dominant run resource includes: aiming at each service operated in the container, acquiring the ratio of each operating resource corresponding to the service to the resource upper limit of various operating resources in the container, and acquiring the operating resource consumption value corresponding to the service according to the ratios; obtaining the maximum value of the running resource consumption values of a plurality of services as a first leading running resource of the container; forming a Bayesian conditional probability formula according to a first leading operation resource corresponding to the first time step and the transition probability matrix, and determining a plurality of leading operation resources corresponding to the second time step; the first time step is earlier than the second time step; and acquiring candidate leading operation resources which are larger than or equal to the first leading operation resources in the plurality of leading operation resources corresponding to the second time step, and taking the maximum value in the candidate leading operation resources as the second leading operation resources.
In this embodiment, the server 104 may determine the second dominant running resource of the container based on the transition probability matrix, the number of running resources in the container, and the first dominant running resource. For example, for each service running in the container, the server 104 may compute the ratio of each running resource used by the service to the corresponding resource upper limit in the container, and obtain the service's running-resource consumption value from these ratios. The server 104 may then take the maximum of the services' consumption values as the first dominant running resource of the container, and form a Bayesian conditional probability formula from the first dominant running resource at the first time step and the transition probability matrix, so as to determine the candidate dominant running resources at the second time step. The first time step is earlier than the second time step: the first time step may be the current time step and the second time step the next time step. A time step is the time interval of a load sub-step within a load step; in rate-independent analyses such as static and static nonlinear analysis, the time step does not reflect real time within a load step, but accumulated time steps reflect the ordering of the load sub-steps. The size of a time step can be chosen freely and only indicates what fraction of the whole calculation one step occupies; for example, for a load, the end time can be taken as 1, 2 or any other number, and then divided into smaller values, which are called time steps. Bayesian conditional probability expresses the relationship between two conditional probabilities, P(A|B) and P(B|A). For example, if someone is always seen doing good deeds, that person is probably a good person: these are two conditional probabilities, and the probability of the attribute is determined by how often the underlying events occur. The ratio of the two conditional probabilities is in fact the ratio of P(A) to P(B), i.e. the relationship between good people and good deeds.
Specifically, when the server 104 predicts the second dominant running resource of the container, it first traverses the resources of each node of the cluster. Suppose a container service $s_l$ ($s_l \in S$) comprises $n$ running containers, which are stored in a container set. A cluster node provides $y$ kinds of resources, and the resources consumed by these containers form a consumption set in which each attribute represents one class of resource on a single container; from this set, the resource consumption $R_{s_l}$ of container service $s_l$ on each node of the cluster can be obtained.
In the cluster, the resources of each working node are limited. The resource upper limit of a single node is the upper bound available per service, and the total cluster resource upper bound is $L = \{L_1, L_2, \ldots, L_y\}$, i.e. the resource upper limits of the various running resources in the container. Since $L$ is composed of the $L_i$, the consumption $R_{s_l}$ above can be expressed as a percentage of the node's total resources, i.e. as the ratio of the consumption of each resource type to its upper limit $L_i$.
From this, the server 104 can compute the node container service dominant running resource function $\mathrm{DOM}(s_l)$, which returns, for the current time step of the cluster, the dominant running resource demand type of container service $s_l$, namely the resource type whose consumption ratio is largest.
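A hedged sketch of the dominant-resource selection just described: each resource type's consumption is expressed as a fraction of its upper limit $L_i$, and the type with the largest fraction is returned as the service's dominant running resource. The limits and usage figures below are made up for illustration.

```python
# Sketch of DOM(s_l): express each resource type's consumption as a fraction
# of that resource's upper limit, and return the type with the largest fraction.
def dominant_resource(consumption: dict, upper_limit: dict) -> str:
    ratios = {r: consumption[r] / upper_limit[r] for r in upper_limit}
    return max(ratios, key=ratios.get)

limits = {"cpu": 8.0, "mem": 32.0, "net_io": 1000.0, "block_io": 500.0}
usage  = {"cpu": 2.0, "mem": 20.0, "net_io": 150.0,  "block_io": 40.0}
print(dominant_resource(usage, limits))   # 'mem' (largest share of its limit)
```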
the above container service dominates the k-step transition state of the operating resource attribute, and the server 104 may consider that the following function may be obtained through k-1 state transitions first according to the invalidity of the markov process and the bayesian conditional probability formula:
Figure BDA00034125044300001210
wherein server 104 predicts the transition step in a Markov chain of container service dominated run resource attributesThe number is 1, and the server 104 may convert the selection of the container service dominant run resource into a prediction that the resource attribute corresponding to the maximum element x (t) is used as the second dominant run resource.
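The one-step prediction can be sketched as follows, under the assumption that the current dominant resource is encoded as a one-hot state distribution $X(t-1)$: multiplying it by the one-step transition matrix gives $X(t)$, and the state with the largest element is taken as the predicted second dominant running resource. The matrix values are illustrative.

```python
# Sketch of the one-step prediction of the next dominant running resource.
import numpy as np

STATES = ["cpu", "mem", "net_io", "block_io"]

def predict_next_dominant(current: str, P: np.ndarray) -> str:
    x_prev = np.zeros(len(STATES))
    x_prev[STATES.index(current)] = 1.0   # X(t-1): current dominant resource
    x_next = x_prev @ P                   # X(t) = X(t-1) · P(1)
    return STATES[int(np.argmax(x_next))]

P = np.array([[0.50, 0.25, 0.15, 0.10],
              [0.10, 0.50, 0.20, 0.20],
              [0.30, 0.20, 0.40, 0.10],
              [0.20, 0.20, 0.10, 0.50]])
print(predict_next_dominant("cpu", P))    # 'cpu' under this toy matrix
```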
In addition, the initial state of the container service's dominant running resource attribute is a single attribute, and in subsequent scheduling the server 104 must satisfy not only the dominant running resource requirement of the previous state but also that of the next state. After the second dominant running resource of the container service is obtained, the container is scheduled; that is, the node resources chosen by the server 104 must satisfy both the dominant running resource at time $t-1$ and the dominant running resource attribute at time $t$.
Through the embodiment, the server 104 can predict the second dominant operation resource in the container based on the transition probability matrix and the bayesian conditional probability, so that the efficiency of resource scheduling in the container is improved.
In one embodiment, after determining the second dominant run resource corresponding to the container, the method further includes: detecting whether the residual operating resources in the container are larger than a preset operating resource value of the second leading operating resource; if not, re-determining the second dominant operating resource of the container so as to enable the second dominant operating resource to be smaller than the preset operating resource value of the residual operating resources.
In this embodiment, the server 104 may also perform dynamic monitoring and feedback on the resource demand in the container. After determining the second dominant running resource of the container and before scheduling, the server 104 may check whether the remaining running resources in the container exceed the preset running-resource value for the second dominant running resource, i.e. whether the remaining running resources meet the demand of the second dominant running resource. If so, the server 104 may proceed with scheduling; if not, the server 104 may re-determine the second dominant running resource of the container so that it is smaller than the preset running-resource value of the remaining running resources, i.e. the remaining running resources in the container must meet the demand of the second dominant running resource.
Specifically, under high cluster load the demand of the container service's dominant resource changes dynamically. The cluster scheduling framework configures running resources based on the initialization and predicts the next dominant resource, and the server 104 may monitor the performance of the application before and after the container service is scheduled to ensure high availability of the container service. High availability here means high availability of a service with reduced downtime; in a container cluster it can be understood as one or more containers being able to continue to provide a particular service. For example, the server 104 sets a cluster-node resource-exhaustion warning threshold, i.e. the preset running-resource value, combines the container service's dominant resource attribute with a container-node load conflict judgement, reschedules the container before the second time step arrives, and finally terminates the original container instance so that it releases its cluster resources.
Through the embodiment, the server 104 can monitor and feed back the resource scheduling process of the server 104 based on the preset resource early warning threshold, so that the safety and efficiency of resource scheduling in the container are improved.
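A sketch of this monitoring-feedback check, assuming a per-resource exhaustion-warning threshold (the preset running-resource value) and a list of fallback candidates; the thresholds, remaining amounts and fallback rule below are assumptions for illustration, not the application's exact policy.

```python
# Sketch: before scheduling, compare the remaining amount of the predicted
# dominant resource with a preset warning threshold; if insufficient,
# re-determine the dominant resource from the remaining candidates.
def check_or_redetermine(predicted: str, remaining: dict,
                         threshold: dict, candidates: list) -> str:
    if remaining[predicted] > threshold[predicted]:
        return predicted                       # enough headroom: keep prediction
    for alt in candidates:                     # otherwise fall back to the next
        if alt != predicted and remaining[alt] > threshold[alt]:
            return alt
    raise RuntimeError("no schedulable dominant resource on this node")

remaining = {"cpu": 0.5, "mem": 6.0, "net_io": 300.0, "block_io": 100.0}
threshold = {"cpu": 1.0, "mem": 4.0, "net_io": 100.0, "block_io": 50.0}
print(check_or_redetermine("cpu", remaining, threshold,
                           ["mem", "net_io", "block_io"]))   # 'mem'
```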
In one embodiment, as shown in fig. 5, fig. 5 is a flowchart illustrating a container resource scheduling method in another embodiment. In this embodiment, the server may first collect data, which includes placing initialization resources in the container and collecting metric data. The server 104 first sets the dominant resources of the container service and records them in InfluxDB. cAdvisor periodically writes the average resource consumption per unit time of each node, container and service of the cluster into a data table, including the consumption of memory, CPU, Net I/O and Block I/O. According to the configured dominant resources of the container service, the container or service load is directed to the nodes with the most of those resources, the netstat service is called to listen for conflicting ports, and the cluster node set is filtered. A specific implementation algorithm may be as follows:
Algorithm 1: initial placement
Input: candidate node working set W_opt; service set {MS} maintained by the master
Output: filtered W_id
Algorithm 1 describes the process of taking a task and assigning its containers to specific nodes. First, each management node maintains a known service set MS, records the dominant resources of the container, e.g. memory, CPU, network I/O and block I/O usage, and stores them in the InfluxDB of each node, and the working nodes currently in operation are selected as the candidate set of load nodes. When a new task arrives, the algorithm narrows down the candidate working set W_opt according to user-specified filters, then obtains the dominant resource T_dom of each container service and uses it to screen the cluster nodes for initial placement.
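Since the pseudocode itself is not reproduced here, the following is only a sketch, under assumed data shapes, of the filtering that Algorithm 1 describes: nodes with a conflicting listening port are dropped, and nodes with enough free capacity of the service's dominant resource T_dom are kept. Field names such as listening_ports and free are hypothetical.

```python
# Sketch of initial-placement filtering: skip nodes with a port conflict,
# keep nodes with enough free capacity of the service's dominant resource.
def filter_initial_placement(candidate_nodes: list, dominant: str,
                             demand: float, service_port: int) -> list:
    placed = []
    for node in candidate_nodes:
        if service_port in node["listening_ports"]:
            continue                                 # port conflict: skip node
        if node["free"][dominant] >= demand:         # enough dominant resource
            placed.append(node["id"])
    return placed

nodes = [{"id": "w1", "listening_ports": {8080}, "free": {"cpu": 4.0}},
         {"id": "w2", "listening_ports": set(),  "free": {"cpu": 1.0}},
         {"id": "w3", "listening_ports": set(),  "free": {"cpu": 6.0}}]
print(filter_initial_placement(nodes, "cpu", 2.0, 8080))   # ['w3']
```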
The server 104 may also obtain the transition probability matrix based on the historical state transition times, where an obtaining algorithm is as follows:
and 2, algorithm: k-step transition probability matrix acquisition
Inputting: history transition times matrix C ═ Cij}
And (3) outputting: 1 to k step transition probability matrix
Figure BDA0003412504430000151
The server 104 may also predict the next dominant resource attribute of the container service and schedule accordingly. For example, the server 104's prediction of the container's dominant resource is mapped into a queue NQueueSet, and the dominant resource at the previous time step into OQueueSet. The Resource Filter, according to the dominant resource attributes obtained for the second and first time steps, creates a container and closes the container of the specific service on the specific node through the driver component, and synchronously processes resource-exhaustion warnings from the message queues in the queue set MQueueSet. Specifically, the dominant resource attribute prediction and container scheduling implementation algorithm is as follows:
Algorithm 3: dominant resource attribute prediction and container scheduling
Input: 1-to-k-step transition probability matrices; total number of resource types n; current dominant resource type r_{k-1}
Output: the W_id with the most dominant resources
Algorithm 3 predicts the dominant resources of the container service and schedules the container. In this algorithm, the server 104 first performs an accumulated calculation over the computed 1-to-k-step matrices and filters out preset non-schedulable nodes. Second, it checks whether the service S_id is a global service; if so, it searches for a scheduling node according to the dominant resource attributes of the first time step and of the previous time step corresponding to it. The server 104 first adds to the candidate set all selectable nodes that satisfy the previous dominant resource attribute, then selects from the candidate set the node with the largest amount of the dominant resource attribute as the scheduling node and performs scheduling. When S_id is not a global service, the server 104 improves the availability of S_id by placing its containers on different working nodes, so the screening is performed on the node candidate set, and the other main steps are the same as when S_id is a global service.
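Again without the original pseudocode, the following sketch only illustrates the main flow Algorithm 3 describes for a single service: predict the next dominant resource from the one-step transition matrix, keep schedulable nodes that still satisfy the previous dominant resource, and pick the node with the most of the predicted dominant resource. Node fields and matrix values are assumptions.

```python
# Sketch of dominant-resource prediction followed by node selection.
import numpy as np

STATES = ["cpu", "mem", "net_io", "block_io"]

def schedule(P1: np.ndarray, current_dominant: str, nodes: list) -> str:
    x = np.zeros(len(STATES)); x[STATES.index(current_dominant)] = 1.0
    next_dominant = STATES[int(np.argmax(x @ P1))]        # predicted attribute
    candidates = [n for n in nodes                         # keep nodes that still
                  if n["schedulable"] and n["free"][current_dominant] > 0]
    best = max(candidates, key=lambda n: n["free"][next_dominant])
    return best["id"]

P1 = np.array([[0.30, 0.50, 0.10, 0.10],
               [0.20, 0.50, 0.20, 0.10],
               [0.30, 0.20, 0.40, 0.10],
               [0.20, 0.20, 0.10, 0.50]])
nodes = [{"id": "w1", "schedulable": True,
          "free": {"cpu": 2.0, "mem": 8.0, "net_io": 100.0, "block_io": 50.0}},
         {"id": "w2", "schedulable": True,
          "free": {"cpu": 4.0, "mem": 2.0, "net_io": 200.0, "block_io": 20.0}}]
print(schedule(P1, "cpu", nodes))   # 'w1' (most free memory, the predicted resource)
```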
Through this embodiment, the server 104 predicts the dominant running resource in the container by constructing a Markov chain directed graph and using the transition probability matrix, which improves the efficiency of resource scheduling in the container. The server 104 monitors the various metrics of the container cluster based on cAdvisor and InfluxDB; by constructing a directed graph of the container's dominant resource attributes, evaluating the transition probability of the container's next dominant resource attribute from the container's state history via the transition probability matrix, and scheduling the container while jointly considering node conditions and the current service's dominant resource, high availability of the container service can be ensured, application performance improved, and resource utilization increased.
It should be understood that, although the steps in the flowcharts of the embodiments described above are displayed in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise, the steps are not strictly ordered and may be performed in other orders. Moreover, at least some of the steps in these flowcharts may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and whose execution order is not necessarily sequential: they may be performed in turn or alternately with other steps, or with at least part of the sub-steps or stages of other steps.
Based on the same inventive concept, the embodiment of the present application further provides a container resource scheduling apparatus for implementing the above-mentioned container resource scheduling method. The implementation scheme for solving the problem provided by the device is similar to the implementation scheme described in the above method, so specific limitations in one or more embodiments of the container resource scheduling device provided below may refer to the limitations in the above container resource scheduling method, and details are not described here.
In one embodiment, as shown in fig. 6, there is provided a container resource scheduling apparatus, including: a first acquisition module 500, a second acquisition module 502, and a scheduling module 504, wherein:
A first obtaining module 500, configured to obtain multiple running resources of a container and to construct a Markov chain directed graph according to the multiple running resources; the Markov chain directed graph comprises a plurality of state nodes.
A second obtaining module 502, configured to obtain the historical state transition times corresponding to transitions among the plurality of state nodes, and to obtain a transition probability matrix corresponding to the Markov chain directed graph according to the historical state transition times.
A scheduling module 504, configured to determine a second dominant running resource corresponding to the container according to the transition probability matrix, the number of running resources in the container, and the first dominant running resource, and to schedule the running resources in the container according to the second dominant running resource; the dominant running resource is the running resource used by the service that consumes the most running resources in the container; the first dominant running resource is called earlier than the second dominant running resource.
In an embodiment, the first obtaining module 500 is specifically configured to obtain processor resources, memory resources, network input and output resources, and block input and output resources corresponding to the container as multiple operating resources.
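For illustration only, a minimal sketch of such a directed graph over these four running-resource types follows; the adjacency-dictionary representation and the resource labels are assumptions made for the example, not part of the claimed method.

    # Minimal sketch: a Markov chain directed graph whose state nodes are the
    # container's running-resource types. The representation is an assumption.
    RESOURCES = ["cpu", "memory", "net_io", "blk_io"]

    def build_markov_digraph(resources=RESOURCES):
        # Fully connected directed graph: every resource state may transition
        # to every other state (including itself); each edge weight starts at
        # zero and later accumulates the observed historical transition count.
        return {src: {dst: 0 for dst in resources} for src in resources}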
In one embodiment, the above apparatus further comprises: the initial configuration module is used for receiving a starting instruction of the container, acquiring an average value of the running resource consumption of the started service in the container in unit time, and acquiring a plurality of average values of the running resource consumption; and determining the initial leading service according to the maximum value in the average consumption values of the running resources, and taking the running resources corresponding to the initial leading service as the initial leading running resources.
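A minimal sketch of this start-up step follows, assuming per-service consumption samples are available; the sample format, and the choice of the leading service's most-consumed resource as the initial dominant running resource, are assumptions.

    def initial_dominant_resource(samples):
        # samples: {service: {resource: [consumption per unit time, ...]}}
        # (hypothetical input format). Average each started service's
        # consumption of each resource per unit time.
        averages = {
            svc: {r: sum(vals) / len(vals) for r, vals in usage.items()}
            for svc, usage in samples.items()
        }
        # The service with the largest average consumption value is the
        # initial leading service.
        leading = max(averages, key=lambda s: max(averages[s].values()))
        # Its most-consumed resource is taken as the initial dominant
        # (leading) running resource.
        return max(averages[leading], key=averages[leading].get)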
In an embodiment, the second obtaining module 502 is specifically configured to obtain a plurality of historical state transition times corresponding to transition from a first state node to a second state node in a markov chain directed graph, and obtain a sum of the plurality of historical state transition times; acquiring the ratio of the historical state transition times to the sum of the historical state transition times as the transition probability corresponding to the state transition for each historical state transition time; and obtaining a transition probability matrix according to the plurality of transition probabilities.
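A sketch of the frequency-based estimate described here; the nested-dictionary input format is an assumption.

    def transition_probability_matrix(counts):
        # counts: {src_state: {dst_state: historical transition count}}
        # Each row is normalised by the total number of transitions leaving
        # src_state, so every entry equals count / sum of that row.
        matrix = {}
        for src, row in counts.items():
            total = sum(row.values())
            matrix[src] = {dst: (c / total if total else 0.0)
                           for dst, c in row.items()}
        return matrix

For example, if the graph records 10 historical transitions leaving the processor state and 6 of them go to the memory state, the processor-to-memory transition probability becomes 6/10 = 0.6.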
In an embodiment, the scheduling module 504 is specifically configured to, for each service running in the container, obtain a ratio of each running resource corresponding to the service to a resource upper limit of multiple running resources in the container, and obtain a running resource consumption value corresponding to the service according to the multiple ratios; obtaining the maximum value of the running resource consumption values of a plurality of services as a first leading running resource of the container; forming a Bayesian conditional probability formula according to a first leading operation resource corresponding to the first time step and the transition probability matrix, and determining a plurality of leading operation resources corresponding to the second time step; the first time step is earlier than the second time step; and acquiring candidate leading operation resources which are larger than or equal to the first leading operation resources in the plurality of leading operation resources corresponding to the second time step, and taking the maximum value in the candidate leading operation resources as the second leading operation resources.
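A sketch of this two-part step under stated assumptions: the consumption value of a service is summarised here by the largest of its usage-to-limit ratios, and the candidate screening compares predicted transition probabilities; both choices are one plausible reading of the description above, not a definitive implementation.

    def first_dominant_resource(services, limits):
        # services: {service: {resource: current usage}}
        # limits:   {resource: upper limit of that resource in the container}
        def consumption(usage):
            # Consumption value of a service: its usage-to-limit ratios,
            # summarised here by their maximum (an assumption).
            return max(usage[r] / limits[r] for r in usage)
        leading = max(services, key=lambda s: consumption(services[s]))
        usage = services[leading]
        return max(usage, key=lambda r: usage[r] / limits[r])

    def second_dominant_resource(first_dominant, matrix):
        # Conditional probabilities of each next dominant resource given the
        # first dominant resource, i.e. one row of the transition matrix.
        predicted = matrix[first_dominant]
        # Candidate dominant resources: those whose predicted value is at
        # least that of the current dominant resource (assumed criterion).
        threshold = predicted.get(first_dominant, 0.0)
        candidates = {r: p for r, p in predicted.items() if p >= threshold}
        # The largest candidate is taken as the second dominant resource.
        return max(candidates, key=candidates.get) if candidates else first_dominant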
In one embodiment, the above apparatus further comprises: the monitoring module is used for detecting whether the residual operating resources in the container are larger than a preset operating resource value of the second leading operating resource; if not, re-determining the second dominant operating resource of the container so as to enable the second dominant operating resource to be smaller than the preset operating resource value of the residual operating resources.
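A short sketch of the check performed by this monitoring module; the preset running-resource value and the way the second dominant resource is re-determined are left as assumptions supplied by the caller.

    def ensure_schedulable(remaining, second_dominant, preset_value, redetermine):
        # remaining: {resource: amount still available in the container}
        # If the remaining amount of the predicted dominant resource does not
        # exceed the preset running-resource value, re-determine the second
        # dominant running resource via the caller-supplied redetermine().
        if remaining[second_dominant] > preset_value:
            return second_dominant
        return redetermine()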
The modules in the above container resource scheduling apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in, or independent of, a processor in the computer device in hardware form, or stored in the memory of the computer device in software form, so that the processor can call and execute the operations corresponding to the above modules.
In one embodiment, a computer device is provided, which may be a server, the internal structure of which may be as shown in fig. 7. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used for storing data such as container resources and transition probability matrixes. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a method of container resource scheduling.
Those skilled in the art will appreciate that the architecture shown in fig. 7 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computer devices to which the disclosed aspects apply; a particular computer device may include more or fewer components than those shown, or combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the container resource scheduling method described above when executing the computer program.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored, which, when being executed by a processor, implements the above-mentioned container resource scheduling method.
In an embodiment, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the container resource scheduling method described above.
It should be noted that, the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data for analysis, stored data, presented data, etc.) referred to in the present application are information and data authorized by the user or sufficiently authorized by each party.
It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments can be implemented by a computer program instructing the relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetic random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, and the like. Volatile memory can include random access memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM can take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM), among others. The databases referred to in the various embodiments provided herein may include at least one of relational and non-relational databases. Non-relational databases may include, but are not limited to, blockchain-based distributed databases, and the like. The processors referred to in the embodiments provided herein may be general-purpose processors, central processing units, graphics processors, digital signal processors, programmable logic devices, data processing logic devices based on quantum computing, and the like, without limitation.
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present application, and their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the present application. It should be noted that, for a person of ordinary skill in the art, several variations and modifications can be made without departing from the concept of the present application, and all of them fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the appended claims.

Claims (10)

1. A method for scheduling container resources, the method comprising:
acquiring various running resources of a container, and constructing a Markov chain directed graph according to the various running resources; the Markov chain directed graph comprises a plurality of state nodes;
acquiring corresponding historical state transition times among the plurality of state nodes, and acquiring a transition probability matrix corresponding to the Markov chain directed graph according to the historical state transition times;
determining a second leading operation resource corresponding to the container according to the transition probability matrix, the number of the operation resources in the container and the first leading operation resource, and scheduling the operation resources in the container according to the second leading operation resource; the dominant run resource characterizes a run resource used by a service with the most use of the run resource in the container; the calling time of the first dominant running resource is earlier than that of the second dominant running resource.
2. The method of claim 1, wherein obtaining a plurality of operating resources for a container comprises:
and acquiring processor resources, memory resources, network input and output resources and block input and output resources corresponding to the container as the multiple running resources.
3. The method of claim 1, further comprising:
receiving a starting instruction of a container, and acquiring an average value of the running resource consumption of the started service in the container in unit time to obtain a plurality of average values of the running resource consumption;
and determining an initial leading service according to the maximum value in the average consumption values of the running resources, and taking the running resources corresponding to the initial leading service as the initial leading running resources.
4. The method according to claim 1, wherein the obtaining of the transition probability matrix corresponding to the markov chain directed graph according to the historical state transition times comprises:
acquiring a plurality of historical state transition times corresponding to transition from a first state node to a second state node in a Markov chain directed graph, and acquiring the sum of the plurality of historical state transition times;
for each historical state transition frequency, acquiring the ratio of the historical state transition frequency to the sum of the multiple historical state transition frequencies, and taking the ratio as the transition probability corresponding to the state transition;
and obtaining the transition probability matrix according to the plurality of transition probabilities.
5. The method of claim 1, wherein determining a second dominant run resource corresponding to the container according to the transition probability matrix, the number of run resources in the container, and the first dominant run resource comprises:
for each service operated in the container, acquiring the ratio of each operating resource corresponding to the service to the resource upper limit of various operating resources in the container, and acquiring the operating resource consumption value corresponding to the service according to the ratios;
obtaining the maximum value of the running resource consumption values of the services, and using the maximum value as a first leading running resource of the container;
forming a Bayesian conditional probability formula according to a first leading operation resource corresponding to the first time step and the transition probability matrix, and determining a plurality of leading operation resources corresponding to the second time step; the first time step is earlier than the second time step;
and acquiring candidate leading operation resources which are larger than or equal to the first leading operation resources in the leading operation resources corresponding to the second time step, and taking the maximum value in the candidate leading operation resources as the second leading operation resources.
6. The method of claim 1, wherein after determining the second dominant run resource corresponding to the container, further comprising:
detecting whether the residual operating resources in the container are larger than a preset operating resource value of the second leading operating resource;
if not, re-determining a second dominant operating resource of the container so as to enable the second dominant operating resource to be smaller than the preset operating resource value of the residual operating resources.
7. An apparatus for scheduling container resources, the apparatus comprising:
the system comprises a first acquisition module, a second acquisition module and a third acquisition module, wherein the first acquisition module is used for acquiring various operation resources of a container and constructing a Markov chain directed graph according to the various operation resources; the Markov chain directed graph comprises a plurality of state nodes;
a second obtaining module, configured to obtain historical state transition times corresponding to the plurality of state nodes, and obtain a transition probability matrix corresponding to the markov chain directed graph according to the historical state transition times;
the scheduling module is used for determining a second leading operation resource corresponding to the container according to the transition probability matrix, the number of the operation resources in the container and the first leading operation resource, and scheduling the operation resources in the container according to the second leading operation resource; the dominant run resource characterizes a run resource used by a service with the most use of the run resource in the container; the calling time of the first dominant running resource is earlier than that of the second dominant running resource.
8. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 6.
9. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 6.
10. A computer program product comprising a computer program, characterized in that the computer program realizes the steps of the method of any one of claims 1 to 6 when executed by a processor.
CN202111534118.9A 2021-12-15 2021-12-15 Container resource scheduling method and device, computer equipment and storage medium Pending CN114185679A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111534118.9A CN114185679A (en) 2021-12-15 2021-12-15 Container resource scheduling method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111534118.9A CN114185679A (en) 2021-12-15 2021-12-15 Container resource scheduling method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114185679A true CN114185679A (en) 2022-03-15

Family

ID=80605157

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111534118.9A Pending CN114185679A (en) 2021-12-15 2021-12-15 Container resource scheduling method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114185679A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114780170A (en) * 2022-04-11 2022-07-22 远景智能国际私人投资有限公司 Container resource configuration method, device, equipment and storage medium
CN114780170B (en) * 2022-04-11 2023-07-21 远景智能国际私人投资有限公司 Container resource configuration method, device, equipment and storage medium
WO2024007426A1 (en) * 2022-07-06 2024-01-11 中电信数智科技有限公司 K8s-based method combining disaster recovery drill failure prediction and pod scheduling

Similar Documents

Publication Publication Date Title
Tantalaki et al. A review on big data real-time stream processing and its scheduling techniques
US8898267B2 (en) Modifying information lifecycle management rules in a distributed system
CN111414233A (en) Online model reasoning system
US11106560B2 (en) Adaptive thresholds for containers
Das et al. Performance optimization for edge-cloud serverless platforms via dynamic task placement
CN114185679A (en) Container resource scheduling method and device, computer equipment and storage medium
CN109697153A (en) Monitoring method, monitoring system and computer readable storage medium
US20180096253A1 (en) Rare event forecasting system and method
CN109614227A (en) Task resource concocting method, device, electronic equipment and computer-readable medium
US20230342220A1 (en) Autoscaling nodes of a stateful application based on role-based autoscaling policies
CN113228574A (en) Computing resource scheduling method, scheduler, internet of things system and computer readable medium
Ujjwal et al. An efficient framework for ensemble of natural disaster simulations as a service
WO2017113865A1 (en) Method and device for big data increment calculation
CN114791915B (en) Data aggregation method and device, computer equipment and storage medium
Xia et al. Data locality-aware big data query evaluation in distributed clouds
Abase et al. Locality sim: cloud simulator with data locality
Bertolli et al. Analyzing memory requirements for pervasive grid applications
Mendes et al. Decision-theoretic planning for cloud computing
US11392424B2 (en) Method and device for aiding decision-making for the allocation of computing means on a high performance computing infrastructure
AU2015101031A4 (en) System and a method for modelling the performance of information systems
CN114756352A (en) Method, device and medium for scheduling server computing resources
US11204962B2 (en) Explorable visual analytics system having reduced latency
Telenyk et al. On reliability modeling and evaluating in cloud services system
CN113342463B (en) Capacity adjustment method, device, equipment and medium of computer program module
Merlak et al. Resource allocation for hierarchical widget system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination