WO2022089281A1 - Container-based application management method and apparatus - Google Patents

Container-based application management method and apparatus

Info

Publication number: WO2022089281A1
Application number: PCT/CN2021/125158
Authority: WIPO (PCT)
Prior art keywords: application, memory, low, application instance, instance
Other languages: English (en), French (fr)
Inventors: 史明伟; 周新宇; 许晓斌; 聂诗超; 詹洲翔; 王川
Original Assignee: 阿里巴巴集团控股有限公司 (Alibaba Group Holding Limited)
Application filed by 阿里巴巴集团控股有限公司
Priority to EP21885008.9A (EP4239473A1)
Priority to US18/141,230 (US20230266814A1)

Classifications

    • G06F9/45558: Hypervisor-specific management and integration aspects (under G06F9/455, emulation; interpretation; software simulation, e.g. virtualisation; G06F9/45533, hypervisors; virtual machine monitors)
    • G06F2009/4557: Distribution of virtual machine instances; migration and load balancing
    • G06F2009/45583: Memory management, e.g. access or allocation
    • G06F1/3293: Power saving by switching to a less power-consuming processor, e.g. sub-CPU
    • G06F9/505: Allocation of resources to service a request, the resource being a machine, considering the load
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • The present disclosure relates to the field of serverless computing, and in particular to a container-based application management solution.
  • serverless computing is a model that allows application developers to focus on their core products without caring about the running status of the application's servers on-premises or in the cloud.
  • In automatic scaling scenarios in the serverless computing field, an application's elasticity time, that is, the time the application needs to achieve elastic scaling, is a very important metric.
  • Application instances need to be scaled elastically according to the application's real-time traffic. If an application's elasticity time is too long, the scheduling platform cannot scale instances out and in flexibly and quickly; alternatively, traffic must be monitored for a long time during scale-out and scale-in to ensure that scaling in has no impact on the service, wasting computing resources.
  • One solution relies on the application architecture and uses fork or clone technology to create new processes, accelerating the application's cold start.
  • Taking the Android mobile operating system as an example, application startup in that system commonly uses the Zygote framework, and other running instances are created by forking from the Zygote main process; the shared services an application depends on are loaded before the fork as far as possible so that applications start quickly.
  • the other is the one-to-one snapshot method.
  • One application instance corresponds to one snapshot, and one snapshot is used to restore one application instance.
  • This snapshot approach requires starting the application instance in advance, then making and storing a snapshot of the started instance; when the instance needs to be restored, the corresponding snapshot is restored. The snapshot management and storage costs of this solution are therefore high.
  • The snapshot creation process may become bound to specific machine characteristics, so subsequent snapshot restoration must still be performed on a machine with those characteristics and cannot be done in other environments. Such snapshots are therefore not portable.
  • The one-to-one snapshot approach also has a time-state problem. For example, some applications depend on the actual time, and making the snapshot depends on the time at that moment; restoring the snapshot also restores the time at which it was made. Application logic may thus be affected by time, causing execution logic errors.
  • A technical problem to be solved by the present disclosure is to provide a container-based application management method and apparatus that can scale application instances rapidly and elastically while reducing their cost.
  • According to a first aspect, a container-based application management method is provided, comprising: configuring a container-based serverless computing system so that an application instance is allowed to be in one of an online state and a low power consumption state at runtime, wherein the power consumption and/or resource consumption of an application instance in the low power consumption state is lower than in the online state; in response to a capacity reduction process being performed on an application, making at least one first application instance of the application in the online state enter the low power consumption state; and in response to a capacity expansion process being performed on the application, making at least one second application instance of the application in the low power consumption state enter the online state.
  • The step of making at least one first application instance of the application in the online state enter the low power consumption state includes: limiting the resource usage quota of the at least one first application instance; and reducing the resource configuration of the container where each of the at least one first application instance is located.
  • And/or, the step of making at least one second application instance of the application in the low power consumption state enter the online state includes: raising the resource configuration of the container where each of the at least one second application instance is located; and lifting the resource usage quota limit of the at least one second application instance.
  • The step of making at least one second application instance of the application in the low power consumption state enter the online state includes: preferentially selecting second application instances in containers on machines with relatively idle resources, within the node where the container is located, to enter the online state.
  • The method may further include: when an application instance is about to enter the online state and the resources on its node are insufficient, migrating the application instance to a relatively idle node based on the live migration capability of user-space checkpoint/restore (CRIU) technology; and/or, when there are multiple application instances in the low power consumption state on the same node, making checkpoint snapshots of one or more application instances based on CRIU, and then restoring one or more application instances on relatively idle nodes from the checkpoint snapshots.
  • The resource usage quota of an application instance is limited, or the limit lifted, based on the control group mechanism; and/or, when reducing a container's resource configuration, the resources released by the container where the application instance is located are returned to the scheduling system based on the in-place upgrade mechanism for container pod resources; and/or, when raising a container's resource configuration, resources are requested for the container from the scheduling system based on the in-place upgrade mechanism for container pod resources.
  • The step of making the at least one first application instance enter the low power consumption state includes: based on a CPU-sharing function, adding multiple first application instances to a low-power-specification CPU group to run, the application instances in the low-power-specification CPU group sharing the CPU; and/or the step of making the at least one second application instance enter the online state includes: making the second application instance exit the low-power-specification CPU group.
  • The step of making the at least one first application instance enter the low power consumption state includes: placing the memory space occupied by the at least one first application instance within the low-power-instance memory range in memory; and/or the step of making the at least one second application instance enter the online state includes: placing the memory space occupied by the at least one second application instance outside the low-power-instance memory range in memory.
  • The low-power-instance memory range is the range of memory outside the execution range of periodic memory management operations in memory.
  • the periodic memory management operation includes a memory garbage collection operation and/or a memory release operation that releases memory that has not been used within a predetermined period.
  • The method may further include: in response to a range adjustment instruction, adjusting the execution range and/or the size of the execution range; and/or, when a first application instance occupying memory space enters the low power consumption state, setting the execution range to exclude the memory space occupied by the first application instance; and/or, when a first application instance occupying memory space enters the online state, setting the execution range to include the memory space occupied by the first application instance.
  • The step of making the at least one first application instance enter the low power consumption state may include: closing some of the resources used by the first application instance or reducing their usage, retaining only part of the system resources used by the first application instance.
  • the method may further include: transferring memory data of the one or more application instances in the low power consumption state between the memory and the storage device.
  • The step of transferring the memory data of one or more low-power application instances between memory and the storage device includes: using the kernel-space memory swap function to transfer the memory data of one or more low-power application instances between memory and the storage device; and/or using the user-space memory swap function to transfer the memory data of one or more low-power application instances between memory and the storage device, transferring memory data from different containers to different storage devices or different pages of a storage device.
  • Using the user-space memory swap function, the memory data of multiple low-power application instances can be transferred simultaneously from memory to different storage devices or to different pages of a storage device; and/or the memory data of multiple low-power application instances can be transferred simultaneously into memory from different storage devices or from different pages of storage devices.
  • The method may further include: receiving a memory swap setting instruction for an application or an application instance, where the instruction indicates whether the application's instances, or the given application instance, should use the kernel-space memory swap function or the user-space memory swap function for memory data transfer; and, in response to the memory swap setting instruction, setting the memory swap function used for memory data transfer for the application's instances or for the application instance.
  • a plurality of super-performance cloud disk devices are used to construct memory swap storage devices with the same priority.
  • The step of transferring the memory data of one or more application instances in the low power consumption state between memory and the storage device includes: selecting the memory data of one or more low-power application instances based on a least-recently-used algorithm, transferring it out of memory, and persisting it to the storage device; and/or, in response to a swap-back instruction, a traffic request, or an instance deployment policy change, transferring the memory data of one or more low-power application instances on the storage device back into memory.
  • The step of making the at least one second application instance of the application in the low power consumption state enter the online state may further include: transferring the memory data of the second application instance from the storage device into the execution range of periodic memory management operations in memory.
  • In response to a capacity reduction instruction for the application, or in response to the application's traffic dropping below a first predetermined threshold, a capacity reduction process is performed on the application; and/or, in response to a capacity expansion instruction for the application, or in response to the application's traffic rising above a second predetermined threshold, a capacity expansion process is performed on the application.
  • According to a second aspect, a container-based application management apparatus is provided, deployed in a container-based serverless computing system that is configured to allow application instances to be in one of an online state and a low power consumption state at runtime, wherein the power consumption and/or resource consumption of an application instance in the low power consumption state is lower than in the online state.
  • The apparatus includes: a capacity reduction device that, in response to a capacity reduction process being performed on the application, makes at least one first application instance of the application in the online state enter the low power consumption state; and a capacity expansion device that, in response to a capacity expansion process being performed on the application, makes at least one second application instance of the application in the low power consumption state enter the online state.
  • The capacity reduction device may include: a quota limiting device for limiting the resource usage quota of at least one first application instance of a low-traffic application; and a configuration reduction device for reducing the resource configuration of the container in which each of the at least one first application instance is located.
  • The capacity expansion device may include: a configuration raising device for raising the resource configuration of the container in which each of the at least one second application instance is located; and a quota restoration device for lifting the resource usage quota limit of the at least one second application instance.
  • the apparatus may further include: a memory exchange storage device, which enables the memory data of one or more first application instances in the at least one first application instance to be transferred between the memory and the storage device.
  • A computing device is also provided, comprising: a processor; and a memory on which executable code is stored, and when the executable code is executed by the processor, the processor is caused to perform the method described in the first aspect above.
  • A non-transitory machine-readable storage medium is also provided, on which executable code is stored, and when the executable code is executed by a processor of an electronic device, the processor is caused to perform the method described in the first aspect above.
  • FIG. 1 is a schematic diagram of application elastic scaling according to an embodiment of the present disclosure
  • FIG. 2 is a schematic diagram of elastic expansion of an application according to an embodiment of the present disclosure
  • FIG. 3 is a schematic block diagram of an application management apparatus according to an embodiment of the present disclosure.
  • FIG. 4 is a schematic flowchart of an application management method according to an embodiment of the present disclosure.
  • FIG. 5 is a schematic flowchart of a scale-down stage of an application management method according to an embodiment of the present disclosure
  • FIG. 6 is a schematic flowchart of a capacity expansion stage of an application management method according to an embodiment of the present disclosure
  • FIG. 7 is a schematic diagram of low-power memory range management according to an embodiment of the present disclosure.
  • FIG. 8 is a schematic diagram of an example state transition according to an embodiment of the present disclosure.
  • FIG. 9 is a schematic diagram of the memory swap function of the system kernel state
  • FIG. 10 is a schematic diagram of a user-mode memory swap function according to the present disclosure.
  • FIG. 11 shows a schematic structural diagram of a computing device that can be used to implement the above application management method according to an embodiment of the present disclosure.
  • the "first application instance” is used to represent an application instance in the process of scaling down.
  • The "second application instance" is used to represent an application instance in the process of capacity expansion. As the working state of an application instance switches, the designations "first application instance" and "second application instance" can be interchanged; that is, after a first application instance has gone through the capacity reduction process and entered the low power consumption state, it can serve as a second application instance of the present disclosure when a capacity expansion process is to be performed, and vice versa.
  • An application instance in a low-power state may be referred to as a "low-power application instance”.
  • An application instance in the online state may be referred to as an "online application instance" or an "online instance".
  • an "application instance” may also be simply referred to as an "instance”.
  • an application management solution based on container implementation in a serverless computing (Serverless) scenario is provided.
  • Serverless computing elasticity means that, in the serverless computing model, the number of running application instances must be increased from one (or a few) to many, or reduced from many to fewer, to cope with changes in service request traffic. The increase is called capacity expansion, the decrease capacity reduction, and the ability to expand and shrink is called elasticity.
  • elasticity can be divided into vertical elasticity and horizontal elasticity.
  • Vertical elasticity refers to the expansion and contraction behavior that occurs on the same host
  • horizontal elasticity refers to expansion and contraction behavior across hosts.
  • the elasticity mentioned in this disclosure refers to the expansion and contraction of the application.
  • Changing an instance of an application from a low-power (low-traffic) state (also referred to as a "low-power running state", "low-power mode", or "low-power running mode") to a state that can accept full traffic (the full-traffic/full-power state, also referred to as the "online state") is capacity expansion; changing it from the traffic-accepting state to the standby state is capacity reduction.
  • the capacity expansion can also be changing the instance from a low power consumption state to a partial traffic/partial power consumption state, that is, a state where the traffic/power consumption is between the low power consumption/low traffic state and the full traffic/full power consumption state.
  • scaling down may also include transitioning an instance of a partial traffic/partial power state to a low power/low traffic state.
  • the power consumption and/or resource consumption of the application instance in the low power consumption state is lower than the power consumption and/or resource consumption in the online state.
  • a proportional value may be set such that the power consumption and/or resource consumption in the low power consumption state is equal to or lower than the set value or the corresponding proportional value.
  • the proportional value can be, for example, 10%, 5%, 2%, 1%, and so on.
  • different numerical values or different corresponding ratio values may be respectively set for power consumption or different types of resources.
  • In addition, part of the resources used by the application instance may be turned off, or their usage reduced, retaining only part of the system resources used by the application instance. For example, the thread pool can be closed or its thread count reduced while system resources such as file descriptors are kept open, as in the sketch below.
  • the application instance can still provide certain services, such as configuration push and so on.
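  • As a hedged illustration of this low-power toggle (not from the patent; the Go service shape and all names are invented), a runtime might drain its worker pool while keeping its listener, a retained file descriptor, open:

```go
// Hypothetical low-power toggle for a Go service: drain the worker pool
// but keep the listener open so the instance can still serve limited
// traffic such as configuration push.
package lowpower

import (
	"net"
	"sync"
)

type Server struct {
	ln   net.Listener  // deliberately kept open in the low-power state
	jobs chan net.Conn // pending connections handed to workers
	quit chan struct{}
	wg   sync.WaitGroup
}

// EnterLowPower stops all workers without closing s.ln, shrinking CPU
// and memory usage while external events can still be accepted.
func (s *Server) EnterLowPower() {
	close(s.quit)
	s.wg.Wait()
}

// GoOnline restores a full-sized worker pool for full traffic.
func (s *Server) GoOnline(workers int) {
	s.quit = make(chan struct{})
	for i := 0; i < workers; i++ {
		s.wg.Add(1)
		go func() {
			defer s.wg.Done()
			for {
				select {
				case conn := <-s.jobs:
					conn.Close() // stand-in for real request handling
				case <-s.quit:
					return
				}
			}
		}()
	}
}
```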
  • FIG. 1 and FIG. 2 schematically show the state changes of several instances of an application in the process of scaling down and scaling up. As an example, five instances are shown for one application. It should be understood that the number of instances corresponding to one application is not limited to this.
  • The figure shows that an online instance (full-traffic state) occupies "4c, 4g", that is, 4 CPU cores and 4 GB of memory, while a low-power instance (low-power state) occupies "0.05c, 500M", that is, 0.05 CPU cores and 500 MB of memory.
  • The resource amounts are not limited to these, and the resources occupied by each instance need not be identical.
  • the resources occupied by each instance can also be adjusted according to real-time traffic.
  • the types of resources are not limited to CPU and memory, and may also include other types of resources.
  • In the figures, a solid-line box represents the current state of an instance, a dotted-line box represents its previous state, and an arrow represents the direction of the state change.
  • FIG. 1 is a schematic diagram of application elastic scaling according to an embodiment of the present disclosure.
  • As shown in FIG. 1, a capacity reduction process may be performed so that one or more previously online application instances of the application enter the low power consumption state and become low-power instances.
  • FIG. 2 is a schematic diagram of elastic expansion of an application according to an embodiment of the present disclosure.
  • As shown in FIG. 2, a capacity expansion process may be performed so that one or more application instances originally in the low power consumption state enter the online state and become online instances.
  • the second predetermined threshold may be higher than or equal to the first predetermined threshold.
  • all low-power instances or some of the low-power instances corresponding to the application can also be converted to online instances according to commands or real-time traffic.
  • FIG. 3 is a schematic block diagram of an application management apparatus according to an embodiment of the present disclosure.
  • the application management apparatus 10 may include a capacity reduction apparatus 100 , a capacity expansion apparatus 200 , and a memory swap storage (swap) apparatus 300 .
  • the capacity reducing device 100 may include a quota limiting device 110 and a configuration reducing device 120
  • the capacity expansion device 200 may include a quota restoration device 220 and a configuration lifting device 210 .
  • FIG. 4 is a schematic flowchart of an application management method according to an embodiment of the present disclosure.
  • step S10 the serverless computing system implemented based on the container is configured so that the application instance is allowed to be in one of an online state and a low power consumption state when running.
  • In this way, the upper-layer runtime of the system knows whether an application instance is working in the low-power state, which avoids the large numbers of page-in/page-out operations that application behavior could otherwise trigger and that could render the system unusable.
  • In step S100, for example by means of the capacity reduction device 100, in response to a capacity reduction process being performed on the application, at least one first application instance of the application in the online state is made to enter the low power consumption state.
  • In step S200, for example by means of the capacity expansion device 200, in response to a capacity expansion process being performed on the application, at least one second application instance of the application in the low power consumption state is made to enter the online state.
  • The capacity reduction process here may be performed, for example, in response to a capacity reduction instruction for a low-traffic application, or in response to the application's traffic dropping below a first predetermined threshold and the application thereby becoming a low-traffic application. The two-threshold trigger is sketched below.
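  • As a hedged illustration (thresholds, names, and units are invented, not from the patent), the scale direction might be decided from real-time traffic as follows:

```go
// Hypothetical trigger logic for the two thresholds described above.
package scaling

// decide returns the scaling action for the application's current
// real-time traffic; low corresponds to the first predetermined
// threshold and high to the second (high >= low).
func decide(traffic, low, high float64) string {
	switch {
	case traffic < low:
		return "scale in: move online instances to the low-power state"
	case traffic > high:
		return "scale out: bring low-power instances online"
	default:
		return "hold"
	}
}
```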
  • FIG. 5 is a schematic flow chart of step S100 of a scaling-down phase of an application management method according to an embodiment of the present disclosure.
  • In step S110, the quota limiting device 110 may limit the resource usage quota of at least one first application instance, in the online state, of the application to be scaled in.
  • the resource usage quota of the application instance can be restricted or released based on the control group (Cgroups, Control Groups) mechanism.
  • the control group mechanism is a mechanism provided by the Linux kernel that can limit the resources used by a single process or a group of processes.
  • Cgroups defines a subsystem for each controllable resource, which enables fine-grained control of resources such as CPU and memory.
  • In cloud computing environments, most currently popular container technologies rely deeply on the resource limitation capabilities provided by Cgroups to control CPU, memory, and other resources. A minimal cgroup v2 sketch follows.
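  • As a hedged sketch (the cgroup v2 directory and the quota values are illustrative assumptions, not the patent's), an instance's quota can be limited by writing the kernel's control files:

```go
// Hypothetical helper: throttle one instance via cgroup v2 control files.
package cgroups

import (
	"fmt"
	"os"
	"path/filepath"
)

// limitInstance caps an instance at ~0.05 CPU cores and 500 MB of memory,
// matching the low-power figures shown in FIG. 1.
func limitInstance(cgroupDir string, pid int) error {
	// "5000 100000": 5 ms of CPU time per 100 ms period (~0.05 cores).
	if err := os.WriteFile(filepath.Join(cgroupDir, "cpu.max"),
		[]byte("5000 100000"), 0o644); err != nil {
		return err
	}
	// Hard memory cap of 500 MB.
	if err := os.WriteFile(filepath.Join(cgroupDir, "memory.max"),
		[]byte("500M"), 0o644); err != nil {
		return err
	}
	// Move the instance's process into the throttled cgroup.
	return os.WriteFile(filepath.Join(cgroupDir, "cgroup.procs"),
		[]byte(fmt.Sprint(pid)), 0o644)
}
```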
  • the at least one first application instance enters a low power consumption (running) state.
  • the present disclosure proposes the concept of implementing low-power operation for traditional applications.
  • the application is extremely inactive when there is no traffic (or low traffic). In this state, theoretically, the CPU consumption required by the application is very low, and the memory footprint (Memory Footprint) is very small.
  • This state in which the instance is running may be referred to as a low power state (low power operation).
  • Low-power operation is a state in which an application instance is running.
  • the application instance is in a live running state, but only provides limited services.
  • Compared with the full-power (full-traffic) running state, it occupies far fewer system resources while still being able to respond quickly to external system events.
  • Under certain events (e.g., in response to a capacity expansion instruction, or in response to the traffic of a low-traffic application rising above a second predetermined threshold), a running application instance can quickly recover from the low-power running state to the full-power running state, regaining full service capability and handling large numbers of external traffic requests.
  • In step S110, the CPU and memory specifications of a low-power application instance can be reduced to a very low level through the Cgroups mechanism.
  • a plurality of first application instances may be added to a CPU group with a low power consumption specification based on a CPU share function to run.
  • the application instances in the CPU group share the CPU.
  • In this way, the shared CPU can be flexibly allocated to the low-power instances, further reducing the CPU resources used in the low-power running mode, and the CPUs occupied by low-power instances can be isolated from the CPUs used by full-power instances.
  • Correspondingly, during capacity expansion the second application instance can be made to exit the low-power-specification CPU group, as in the sketch below.
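  • A hedged sketch of such a shared low-power CPU group, using cgroup v2 cpuset and weight files (paths and values are illustrative assumptions):

```go
// Hypothetical helper: a shared low-power CPU group (cgroup v2).
package cgroups

import (
	"fmt"
	"os"
	"path/filepath"
)

// joinLowPowerGroup pins every low-power instance to one core and gives
// the whole group a tiny scheduling weight, isolating it from the cores
// that full-power instances use.
func joinLowPowerGroup(groupDir string, pids []int) error {
	if err := os.WriteFile(filepath.Join(groupDir, "cpuset.cpus"),
		[]byte("0"), 0o644); err != nil { // all instances share core 0
		return err
	}
	if err := os.WriteFile(filepath.Join(groupDir, "cpu.weight"),
		[]byte("1"), 0o644); err != nil { // minimum share of that core
		return err
	}
	for _, pid := range pids {
		if err := os.WriteFile(filepath.Join(groupDir, "cgroup.procs"),
			[]byte(fmt.Sprint(pid)), 0o644); err != nil {
			return err
		}
	}
	return nil
}

// Scale-out is the reverse: write the pid back into the cgroup.procs
// file of the instance's original full-power group.
```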
  • the application management solution of the present disclosure requires that the application runtime (Runtime) supports a low power consumption operation mode.
  • Taking a Java application as an example, to enable the Java runtime to support low-power operation, the Java application is first brought into the low-power running state. In the low-power state, the garbage collection (GC) operation of the Java process acts on only a small part of the heap memory range; in other words, only local GC is performed, ensuring that the JVM's memory footprint is confined to a limited range in low-power mode.
  • For example, the Elastic Heap capability provided by AJDK can be used to support low-power operation of processes. In low-power mode, GC acts only on a heap area of limited size; Elastic Heap provides a low-power elastic heap size setting, which can typically be 1/10 of the normal heap range.
  • The above describes the periodic GC scans of the Java language. Other languages have analogous operations, such as a memory release operation that frees memory unused within a predetermined period, which likewise affect the running state of an application instance in the low-power state. These operations can therefore also be confined to a local memory range, with the memory space occupied by low-power application instances set outside the operations' execution range.
  • the set size of the execution range of the periodic memory management operation can be adjusted, or the start and end range of the execution range can be adjusted.
  • FIG. 7 is a schematic diagram of low-power memory range management according to an embodiment of the present disclosure.
  • the Java application memory 400 may include heap memory (heap area) 420 and non-heap memory (non-heap area) 410 .
  • the execution range 421 of the periodic memory management operation is limited, and the range 422 outside the execution range 421 is the low power consumption instance memory range. In the low power consumption instance memory range 422, the above-mentioned periodic memory management operations are not performed.
  • When first application instances enter the low power consumption state, the memory space they occupy may be placed within the low-power-instance memory range in memory.
  • the memory space occupied by these first application instances can be transferred from the limited execution range of periodic memory management operations to the low-power instance memory range outside the execution range.
  • A range adjustment instruction can also be issued to adjust the execution range: when a first application instance occupying memory space enters the low power consumption state, the execution range is set to exclude the occupied memory space; when the instance enters the online state, the execution range is set to include it.
  • In this way, the memory space occupied by an application instance can be switched between the above execution range and the above low-power-instance memory range as the instance's running state changes. A rough Go analogy is sketched below.
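  • The patent's mechanism is the AJDK Elastic Heap; as a loose, hedged analogy only (Go has no heap-range-restricted GC, so this sketch shrinks the footprint with standard runtime/debug knobs instead):

```go
// Loose Go analogy (NOT the patent's JVM Elastic Heap): shrink the
// runtime's memory-management footprint on entry to low power.
package lowpower

import "runtime/debug"

func enterLowPowerMemory() {
	// Collect as soon as the heap grows 10% over the live set, keeping
	// the resident heap close to the live data (a small footprint at
	// the cost of more frequent, but tiny, GC cycles).
	debug.SetGCPercent(10)
	// Return freed pages to the OS right away instead of waiting for
	// the background scavenger.
	debug.FreeOSMemory()
}

func goOnlineMemory() {
	debug.SetGCPercent(100) // restore the default GC cadence
}
```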
  • In addition, memory data of the instance that is not active in the low-power mode/state can be further swapped out to the swap area (a memory-data swap device, or storage device), for example by means of the operating system's least-recently-used (LRU) memory algorithm.
  • FIG. 8 is a schematic diagram of an example state transition according to an embodiment of the present disclosure.
  • the storage of memory data can be logically divided into three layers, namely L0, L1, and L2.
  • the L0 layer is the memory data of the online instance, which is within the execution range of the above-mentioned periodic memory management operation, and is outside the memory range of the above-mentioned low-power instance.
  • the L1 layer is the memory data of the low-power consumption (cache) instance, which is located within the memory range of the above-mentioned low-power consumption instance and outside the execution range of the above-mentioned periodic memory management operation.
  • the L2 layer is memory data that is swapped (persisted) to a memory swap device or storage device, such as on an external hard disk or cloud disk.
  • On the L2 layer, separate images can be formed for each container, that is, a container image set, to achieve data isolation between containers.
  • the memory data of the application instance can be switched between the L0 layer and the L1 layer as the running state changes.
  • memory data can be exchanged between the L1 layer and the L2 layer, eg, based on the LRU algorithm described above.
  • Swap is an important part of the memory subsystem in the current Linux operating system.
  • the Linux operating system uses the Swap capability to increase the virtual memory available to a node.
  • The operating system can swap data out of main memory and persist it to the memory swap device (swap device).
  • the system can allocate the acquired (released) main memory to other processes (instances) for use.
  • Swap can effectively avoid the occurrence of out-of-memory (OOM, Out of memory) errors in the system.
  • OOM out-of-memory
  • the memory data of at least one low-power consumption application instance can be transferred between the memory and the storage device through the memory exchange storage device 300 .
  • the memory data of one or more low-power application instances located in the low-power instance memory range of memory may be selected based on a least recently used algorithm, and transferred from memory and persisted to the storage device.
  • In response to a swap-back instruction, a traffic request, or an instance deployment policy change, the memory data of one or more low-power application instances located on the storage device is transferred back into the low-power-instance memory range of memory.
  • a high-speed network disk can be used as a swap device (memory data exchange device/storage device) for storing the swapped out memory.
  • The operating system's default swap capability/policy, that is, the memory swap function in the system kernel state (Kernel), may be used to perform the memory data swap.
  • FIG. 9 is a schematic diagram of the memory swap function in the system kernel state.
  • Low-power instances A1 and A2 of application A and low-power instances B1 and B2 of application B each have their own memory data.
  • The figure represents the data of different instances with different shading patterns.
  • the memory data of all low-power instances are exchanged and stored on the same memory exchange storage device, that is, share the same memory data exchange device/storage device.
  • As a result, the data ordering on the memory swap device is disrupted and becomes heavily randomized, which severely degrades read and write performance.
  • the present disclosure also allows the use of a user-space Swap capability through configuration, that is, a user-space memory swap function.
  • Through the user-space memory swap function, swap isolation at the container, process, and instance levels can be achieved. In addition, a formatted memory structure can be used to achieve strong read-ahead and fast swap-in/swap-out.
  • The user-space memory swap function can use file mapping and sequential reads and writes, so its read/write performance can match the sequential read/write performance of different storage media. This also reduces, to a certain extent, the dependence on swap storage devices with high-speed random-read capability.
  • FIG. 10 is a schematic diagram of a user-mode memory swap function according to the present disclosure.
  • memory data corresponding to different instances or from different containers can be exchanged and stored on different storage devices, or on different pages of storage devices, to achieve isolation of instance or container data.
  • the memory swap function of the system kernel state can be used to transfer the memory data of one or more low-power application instances between the memory and the storage device.
  • Relevant staff can configure whether to use the kernel-space memory swap function or the user-space memory swap function.
  • For example, a memory swap setting instruction for an application or an application instance, issued by relevant staff, can be received.
  • The memory swap setting instruction may be used to indicate whether the application's instances, or the given application instance, should use the kernel-space memory swap function or the user-space memory swap function for memory data transfer.
  • In response to the memory swap setting instruction, the memory swap function used for memory data transfer is set for the application's instances or for the application instance; a hypothetical configuration sketch follows.
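  • A hypothetical sketch of such a setting (all type, field, and constant names are invented for illustration):

```go
// Hypothetical representation of a memory swap setting instruction.
package swapcfg

type SwapMode string

const (
	KernelSwap SwapMode = "kernel" // default kernel-space swap path
	UserSwap   SwapMode = "user"   // isolated user-space swap files
)

type SwapSetting struct {
	App      string   // application name
	Instance string   // optional: one instance; empty means the whole app
	Mode     SwapMode // which swap function moves this memory data
}

// apply records the chosen mode; the swap layer consults the registry
// whenever it moves an instance's memory data.
func apply(s SwapSetting, registry map[string]SwapMode) {
	key := s.App
	if s.Instance != "" {
		key = s.App + "/" + s.Instance
	}
	registry[key] = s.Mode
}
```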
  • multiple super-performance cloud disk devices can be used to construct memory swap storage devices with the same priority.
  • After the low-power conversion processing of CPU and memory described above, the target application instance has entered the low-power running state, and its inactive memory data has further been swapped out to a low-cost external IO device (memory swap device/storage device).
  • the resources of the application instance may be further de-allocated, and the resources saved by the low-power running of the instance may be returned to the scheduling system for scheduling other high-priority tasks to realize the reuse of resources.
  • the configuration reducing device 120 may be used to reduce the resource configuration of the container where each of the at least one first application instance is located.
  • By upgrading the resource specification at the container pod (Pod) level in place, the resources released by the container are returned to the cluster, and the cluster can then schedule container pods (Pods) of other workloads onto the node (Node).
  • The resource configuration of the container can be reduced or raised based on the in-place upgrade mechanism for container resources of, for example, Kubernetes (K8s).
  • That is, when reducing the container's resource configuration in the capacity reduction stage, the resources released by the container where the application instance is located are returned to the scheduling system based on the in-place upgrade mechanism for container pod resources; when raising the container's resource configuration in the expansion stage, resources are requested for the container from the scheduling system based on the same mechanism.
  • resource management consists of two layers: the runtime layer inside the container and the K8s layer.
  • For example, suppose the system originally allocated 4C8G (4 CPU cores, 8 GB of memory) to an application instance.
  • The runtime is entitled to that much, but after entering the low-power state the application instance itself occupies only about 1/10 of those resources.
  • Without an in-place update, the scheduling layer still considers the application instance to occupy 4C8G, so the freed resources cannot be allocated to other application instances or containers.
  • the K8s level is mainly metadata maintained at the scheduling level.
  • In-place upgrade of container resources enables Pod resources to be updated in place without disrupting the Kubernetes container pod (Pod), reducing the resource configuration of the container where the current application instance is located. In this way, low-power operation of the application instance and quota adjustment of container platform resources are realized together.
  • A container pod (Pod) is the smallest unit that can be created and deployed in Kubernetes; it represents an application instance in a Kubernetes cluster, and its contents are always deployed together on the same node (Node).
  • a container pod (Pod) contains one or more containers, and also includes resources shared by containers such as storage and network.
  • A container pod (Pod) can support a variety of container environments, such as the currently popular Docker; a hedged client-go sketch of the in-place resize follows.
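  • A hedged sketch of the in-place downsizing using client-go, assuming a cluster with the InPlacePodVerticalScaling feature gate enabled (recent Kubernetes exposes it through a "resize" subresource); namespace, pod, and container names are invented:

```go
// Hypothetical client-go sketch; requires the InPlacePodVerticalScaling
// feature gate and a Kubernetes version exposing the "resize"
// subresource for pods.
package resize

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
)

func shrinkInPlace(cs *kubernetes.Clientset) error {
	// Lower the container to the low-power specification without
	// recreating the Pod, so the freed amount returns to the cluster.
	patch := []byte(`{"spec":{"containers":[{"name":"app",
	  "resources":{"requests":{"cpu":"50m","memory":"500Mi"},
	               "limits":{"cpu":"50m","memory":"500Mi"}}}]}}`)
	_, err := cs.CoreV1().Pods("default").Patch(
		context.TODO(), "app-instance-1",
		types.StrategicMergePatchType, patch,
		metav1.PatchOptions{}, "resize")
	return err
}
```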
  • The capacity reduction process according to the present disclosure has thus been described in detail: the resource usage quota of at least one first application instance of a low-traffic application is limited so that it enters the low power consumption (running) state, and the resource configuration of the container where each of these first application instances is located is further reduced, with the resources released by those containers made available for allocation to other containers or workloads.
  • The capacity expansion process here may be performed, for example, in response to a capacity expansion instruction for a low-traffic application, or in response to the traffic of the low-traffic application rising above a second predetermined threshold.
  • Low-power operating modes can support dynamic resource configuration updates.
  • In other words, a second application instance currently running in the low-power state can be quickly pulled up to gain the ability to accept full traffic.
  • FIG. 6 is a schematic flowchart of step S200 in the expansion stage of the application management method according to an embodiment of the present disclosure.
  • the configuration enhancing device 210 may be used to enhance the resource configuration of the respective containers of at least one second application instance in the low power consumption state of the application to be expanded.
  • the resource configuration of a container within a container pod can be upgraded based on an in-place upgrade mechanism for container resources.
  • Second application instances in containers on machines with relatively idle resources within the node where the container is located may be preferentially selected to enter the online state.
  • That is, capacity expansion preferentially selects second application instances on machines with relatively idle resources in the node where the container is located, and requests a resource upgrade from the scheduling system to the original specification, a set specification, or a specification determined from current traffic.
  • When the application to be scaled out has multiple application instances in the low-power state on different machines, the application instance on the machine with relatively idle resources is restored to the online state first: the resource configuration of its container is raised and its resource usage quota is restored.
  • For example, the local Cgroup restriction can be lifted so that the running application instance exits the low-power running mode (low-power state), that is, enters the online mode (online state).
  • the quota restoration device 220 may be used to release the resource usage quota restriction of at least one second application instance.
  • In addition, a low-power application instance can be migrated to a relatively idle machine, further alleviating the problem of resource contention.
  • When an application instance is about to enter the online state and the resources on its node are insufficient, the application instance can be migrated to a relatively idle node based on the live migration capability of user-space checkpoint/restore (CRIU) technology, as in the sketch below.
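  • A hedged sketch of driving such a migration with the CRIU command-line tool (only basic, documented criu options are used; how the image directory reaches the idle node is elided):

```go
// Hypothetical wrapper around the CRIU CLI (documented options only).
package migrate

import (
	"fmt"
	"os/exec"
)

// checkpoint dumps a low-power instance's process tree to imgDir;
// --leave-stopped keeps the original frozen until restore succeeds.
func checkpoint(pid int, imgDir string) error {
	return exec.Command("criu", "dump",
		"--tree", fmt.Sprint(pid),
		"--images-dir", imgDir,
		"--leave-stopped").Run()
}

// restore recreates the instance on the (relatively idle) target node
// after imgDir has been copied there.
func restore(imgDir string) error {
	return exec.Command("criu", "restore",
		"--images-dir", imgDir,
		"--restore-detached").Run()
}
```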
  • Correspondingly, the memory space occupied by the at least one second application instance may be moved out of the low-power-instance memory range in memory, that is, into the execution range of periodic memory management operations in memory.
  • Moreover, the memory data stored on the external IO device can be quickly swapped in through the memory swap storage device 300; that is, the memory data of the second application instance is transferred from the storage device into the execution range in memory.
  • However, the default swap-in behavior of the kernel-space memory swap function (swapping in from the storage device to memory) is a lazy mode, which is not suitable for fast scale-out scenarios.
  • a layer of concurrent swap-in logic can be implemented in the user-mode memory swap function.
  • the user-mode memory swap function can be used to simultaneously transfer the memory data of multiple low-power application instances from the memory to different storage devices or different pages of the storage device.
  • the user-mode memory swap function can also be used to simultaneously transfer the memory data of multiple low-power application instances from different storage devices or different pages of the storage device to the memory.
  • In this way, the one-shot swap-in speed of the memory swap can reach the IO upper limit, greatly improving capacity expansion performance.
  • In addition, the priority feature of the swap subsystem can be used: multiple ESSD (enhanced SSD, ultra-performance cloud disk) block devices are used to construct memory swap devices/storage devices of the same priority, realizing a RAID-like (Redundant Array of Independent Disks) capability for memory swap.
  • the IO throughput of memory data swapping can be greatly improved, and the extremely fast second-level elasticity can be achieved.
  • For example, the IO throughput can be increased from 350 MB/s to 960 MB/s. A hedged swap-priority sketch follows.
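  • A hedged sketch of activating several swap devices at the same priority so the kernel stripes pages across them (device paths are illustrative; the SWAP_FLAG_* values come from <linux/swap.h>):

```go
// Hypothetical sketch: equal-priority swap devices for striped swap IO.
package swapstripe

import "golang.org/x/sys/unix"

// Constants from <linux/swap.h>; x/sys/unix may not export them all.
const (
	swapFlagPrefer   = 0x8000 // SWAP_FLAG_PREFER
	swapFlagPrioMask = 0x7fff // SWAP_FLAG_PRIO_MASK
)

// enableStripedSwap activates every device at the SAME priority, so the
// kernel round-robins swapped pages across them, RAID-0 style.
func enableStripedSwap(devices []string) error {
	const prio = 100
	for _, dev := range devices {
		flags := swapFlagPrefer | (prio & swapFlagPrioMask)
		if err := unix.Swapon(dev, flags); err != nil {
			return err
		}
	}
	return nil
}

// e.g. enableStripedSwap([]string{"/dev/vdb", "/dev/vdc", "/dev/vdd"})
```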
  • the scheme of the present disclosure can utilize the coroutine feature of the Go language, combined with the Linux pagemap mapping structure, to simultaneously access multiple segments of memory data.
  • Thus, a method for fast memory swapping in serverless scenarios is provided. Compared with the lazy, access-driven loading of traditional operating systems, it can make full use of the maximum IO bandwidth to load memory data that was swapped out to external storage media back into memory and quickly provide services; a concurrency sketch follows.
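  • A hedged sketch of the concurrent swap-in idea (the per-segment file layout is an invented stand-in for the patent's formatted memory structure and pagemap use):

```go
// Hypothetical sketch: concurrent, sequential-read swap-in. Each file
// holds one formatted segment of an instance's swapped-out memory.
package swapin

import (
	"os"
	"sync"
)

func swapInConcurrently(segmentFiles []string) ([][]byte, error) {
	segments := make([][]byte, len(segmentFiles))
	errs := make([]error, len(segmentFiles))
	var wg sync.WaitGroup
	for i, f := range segmentFiles {
		wg.Add(1)
		go func(i int, f string) {
			defer wg.Done()
			// One goroutine per segment: segments that live on
			// different devices are read in parallel, so total
			// bandwidth approaches the aggregate IO limit.
			segments[i], errs[i] = os.ReadFile(f)
		}(i, f)
	}
	wg.Wait()
	for _, err := range errs {
		if err != nil {
			return nil, err
		}
	}
	return segments, nil
}
```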
  • the present disclosure proposes an application management solution, which can be used to achieve rapid elasticity of existing applications without any changes to the application architecture.
  • the present disclosure proposes a low-power consumption operation mode of an application instance in a cloud computing serverless scenario.
  • the cached live objects in the low-power state can take up very few system resources.
  • the online service application instance can be horizontally expanded within seconds, and resources can be applied for at startup.
  • a set of high-performance memory swap-out/swap-in solutions is provided by performing concurrency control in the user mode and using the Swap priority feature, which can realize second-level elastic expansion in serverless scenarios.
  • The relevant information may include the respective numbers of low-power and online application instances, the ratio between the numbers of instances in the two states, the duration of each state, the time it takes an application instance to go from the low-power state to the online state, the time it takes to scale the application out, the time it takes to scale it in, and so on.
  • the application capacity reduction and expansion scheme according to the application management method of the present disclosure can be applied to various application scenarios, especially the application scenarios where the traffic changes greatly over time.
  • At ordinary times, the traffic to be handled is relatively small, and instances need to be scaled in or out flexibly.
  • In application scenarios such as large-scale promotions at specific times or train ticket sales, traffic suddenly surges within a very short period to several times, or even dozens or hundreds of times, the usual level.
  • the application management solution of the present disclosure can well cope with the needs of application capacity reduction and expansion in these scenarios.
  • FIG. 11 shows a schematic structural diagram of a computing device that can be used to implement the above application management method according to an embodiment of the present disclosure.
  • computing device 1000 includes memory 1010 and processor 1020 .
  • the processor 1020 may be a multi-core processor, or may include multiple processors.
  • processor 1020 may include a general-purpose main processor and one or more special-purpose co-processors, such as a graphics processor (GPU), a digital signal processor (DSP), and the like.
  • the processor 1020 may be implemented using custom circuits, such as Application Specific Integrated Circuits (ASICs) or Field Programmable Gate Arrays (FPGAs).
  • Memory 1010 may include various types of storage units, such as system memory, read only memory (ROM), and persistent storage.
  • the ROM may store static data or instructions required by the processor 1020 or other modules of the computer.
  • Persistent storage devices may be readable and writable storage devices.
  • Permanent storage may be a non-volatile storage device that does not lose stored instructions and data even if the computer is powered off.
  • Some embodiments employ mass storage devices (e.g., magnetic or optical disks, flash memory) as the permanent storage device.
  • persistent storage may be a removable storage device (eg, a floppy disk, an optical drive).
  • System memory can be a readable and writable storage device or a volatile readable and writable storage device, such as dynamic random access memory.
  • System memory can store some or all of the instructions and data that the processor needs at runtime.
  • memory 1010 may include any combination of computer-readable storage media, including various types of semiconductor memory chips (DRAM, SRAM, SDRAM, flash memory, programmable read-only memory), and magnetic and/or optical disks may also be employed.
  • In addition, the memory 1010 may include a readable and/or writable removable storage device, such as a compact disc (CD), a read-only digital versatile disc (e.g., DVD-ROM, dual-layer DVD-ROM), a read-only Blu-ray disc, an ultra-density disc, a flash card (e.g., SD card, Mini SD card, Micro-SD card, etc.), a magnetic floppy disk, and the like.
  • Computer-readable storage media do not include carrier waves or transient electronic signals transmitted wirelessly or over wires.
  • Executable codes are stored in the memory 1010, and when the executable codes are processed by the processor 1020, the processor 1020 can be caused to execute the application management method mentioned above.
  • Furthermore, the method according to the present invention may also be implemented as a computer program or computer program product comprising computer program code instructions for carrying out the steps defined in the above-described method of the present invention.
  • Alternatively, the present invention may also be implemented as a non-transitory machine-readable storage medium (or computer-readable storage medium, or machine-readable storage medium) having executable code (or a computer program, or computer instruction code) stored thereon which, when executed by the processor of an electronic device (or computing device, server, etc.), causes the processor to perform the steps of the above-described method according to the present invention.
  • Each block in the flowcharts or block diagrams may represent a module, segment, or portion of code that contains one or more executable instructions for implementing the specified logical function(s).
  • the functions noted in the block may occur out of the order noted in the figures. For example, two blocks in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
  • Each block of the block diagrams and/or flowchart illustrations, and combinations of blocks therein, can be implemented by dedicated hardware-based systems that perform the specified functions or operations, or by a combination of dedicated hardware and computer instructions.

Abstract

A container-based application management method and apparatus are disclosed. A container-based serverless computing system is configured to allow an application instance to be in one of an online state and a low-power state at runtime. In response to scale-in processing being performed on an application, at least one first application instance of the application in the online state is caused to enter the low-power state; and in response to scale-out processing being performed on the application, at least one second application instance of the application in the low-power state is caused to enter the online state. Application instances can thereby be scaled elastically and rapidly while reducing instance cost.

Description

Container-based application management method and apparatus
This application claims priority to Chinese patent application No. 202011194163.X, entitled "Container-based application management method and apparatus" and filed on October 30, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of serverless computing, and in particular to a container-based application management scheme.
Background Art
In cloud computing scenarios, serverless computing is a model that lets application developers focus on their core product without worrying about the state of the servers running their applications, whether on premises or in the cloud.
In auto-scaling scenarios in the serverless computing field, an application's elasticity time, i.e., the time the application needs to scale elastically, is a very important metric.
Especially in cloud computing scenarios, application instances need to be scaled elastically according to the application's real-time traffic. If an application's elasticity time is too long, the scheduling platform cannot scale instances out and in flexibly and quickly. Alternatively, traffic monitoring for scale-out and scale-in takes a long time to ensure that scale-in does not affect the service, and computing resources are therefore wasted.
During scale-out, instances must start quickly. To achieve fast instance startup, current systems and environments mainly use the following two schemes.
One scheme relies on the application architecture, using fork or clone techniques to create new processes and thereby accelerate application cold start. Taking the Android mobile operating system as an example, application startup in that system commonly uses the Zygote framework: every other running instance is created by forking from the Zygote master process. The shared services an application depends on are loaded before the fork as far as possible, so that the application starts quickly.
Scale-out schemes based on the Zygote model depend heavily on that application architecture. For existing applications, the application architecture would need substantial adjustment; this scheme is therefore not applicable to existing applications.
Moreover, application cold start generally takes a long time and cannot meet the requirement of rapid scale-out.
The other scheme uses checkpoint/restore snapshot technology. A snapshot is taken after the application starts; when the application needs to scale out, the prepared snapshot is used to accelerate the startup process.
Snapshot schemes generally come in two flavors.
One is snapshot replication, in which only one snapshot is made for an application and all application instances are created from that snapshot.
With this replication approach, if the application holds persistent state information (for example UUID information) when the snapshot is taken, the multiple generated instances will have correctness and security problems.
The other is one-to-one snapshotting, in which each application instance corresponds to one snapshot and one snapshot is used to restore one application instance.
This approach requires starting the application instance in advance, then taking and storing a snapshot of the started instance; the corresponding snapshot is restored when the instance needs to be restored. Consequently, the snapshot management and storage costs of this scheme are high.
Moreover, the snapshot-taking process may become bound to characteristics of the specific machine, so subsequent snapshot restores must still occur on a machine with the bound characteristics and cannot be performed in other environments. The snapshots are therefore not portable.
In addition, one-to-one snapshotting also has a time-state problem. For example, for applications that depend on real time, the snapshot captures the time at which it was taken, and restoring the snapshot also restores that time; application logic affected by time may therefore execute incorrectly.
In other words, in the prior art, the elastic scale-out and scale-in capabilities of applications in the serverless computing field cannot yet meet the ever-growing demand for application elasticity.
Therefore, there remains a need for a scheme that can rapidly and elastically scale application instances, for example as application traffic changes.
Summary of the Invention
A technical problem to be solved by the present disclosure is to provide a container-based application management method and apparatus capable of rapidly and elastically scaling application instances while reducing instance cost.
According to a first aspect of the present disclosure, a container-based application management method is provided, comprising: configuring a container-based serverless computing system to allow an application instance to be in one of an online state and a low-power state at runtime, wherein the power consumption and/or resource consumption of an application instance in the low-power state is lower than the power consumption and/or resource consumption in the online state; in response to scale-in processing being performed on an application, causing at least one first application instance of the application in the online state to enter the low-power state; and in response to scale-out processing being performed on the application, causing at least one second application instance of the application in the low-power state to enter the online state.
Optionally, the step of causing at least one first application instance of the application in the online state to enter the low-power state comprises: limiting the resource usage quota of the at least one first application instance; and lowering the resource configuration of the container in which each of the at least one first application instance resides; and/or the step of causing at least one second application instance of the application in the low-power state to enter the online state comprises: raising the resource configuration of the container in which each of the at least one second application instance resides; and lifting the resource usage quota limit of the at least one second application instance.
Optionally, the step of causing at least one second application instance of the application in the low-power state to enter the online state comprises: preferentially selecting second application instances in containers on machines whose resources are relatively idle among the nodes where the containers reside to enter the online state.
Optionally, the method may further comprise: when an application instance is about to enter the online state but resources on its node are insufficient, migrating the application instance to a relatively idle node based on the live migration capability of Checkpoint/Restore In Userspace (CRIU) technology; and/or, when a single node holds multiple application instances in the low-power state, making checkpoint snapshots of one or more of the application instances based on CRIU technology and then restoring the one or more application instances on a relatively idle node from the checkpoint snapshots.
Optionally, the resource usage quota of an application instance is limited, or the limit lifted, based on a control groups mechanism; and/or, when lowering a container's resource configuration, the resources released by the container hosting the application instance are returned to the scheduling system based on an in-place pod resource upgrade mechanism; and/or, when raising a container's resource configuration, resources are requested from the scheduling system for the container based on the in-place pod resource upgrade mechanism.
Optionally, the step of causing the at least one first application instance to enter the low-power state comprises: based on a CPU-sharing capability, placing multiple first application instances into a low-power CPU group to run, the application instances in the low-power CPU group sharing CPUs; and/or the step of causing the at least one second application instance to enter the online state comprises: removing the second application instances from the low-power CPU group.
Optionally, the step of causing the at least one first application instance to enter the low-power state comprises: placing the memory space occupied by the at least one first application instance within a low-power-instance memory range in memory; and/or the step of causing the at least one second application instance to enter the online state comprises: placing the memory space occupied by the at least one second application instance outside the low-power-instance memory range in memory.
Optionally, the low-power-instance memory range is the range of memory outside the execution range of periodic memory management operations in memory.
Optionally, the periodic memory management operations include a garbage collection operation and/or a memory release operation that frees memory unused within a predetermined period.
Optionally, the method may further comprise: in response to a range adjustment instruction, adjusting the execution range and/or the size of the execution range; and/or, when a first application instance occupying memory space enters the low-power state, setting the execution range to exclude the memory space occupied by the first application instance; and/or, when a first application instance occupying memory space enters the online state, setting the execution range to include the memory space occupied by the first application instance.
Optionally, the step of causing the at least one first application instance to enter the low-power state may comprise: shutting down some of the resources used by the first application instance or reducing the usage of those resources, retaining only some of the system resources used by the first application instance.
Optionally, the method may further comprise: transferring the memory data of one or more application instances in the low-power state between memory and a storage device.
Optionally, the step of transferring the memory data of one or more low-power application instances between memory and a storage device comprises: using a kernel-mode memory swap capability to transfer the memory data of the one or more low-power application instances between memory and the storage device; and/or using a user-mode memory swap capability to transfer the memory data of the one or more low-power application instances between memory and the storage device, with memory data from different containers transferred to different storage devices or to different pages of a storage device.
Optionally, the user-mode memory swap capability is used to simultaneously transfer the memory data of multiple low-power application instances from memory to different storage devices or different pages of a storage device; and/or to simultaneously transfer the memory data of multiple low-power application instances from different storage devices or different pages of a storage device into memory.
Optionally, the method may further comprise: receiving a memory swap setting instruction for an application or application instance, the memory swap setting instruction indicating whether the kernel-mode or the user-mode memory swap capability is to be used to perform memory data transfer for the application's instances or for the given application instance; and, in response to the memory swap setting instruction, setting, for the application's instances or the given application instance, the memory swap capability used to perform memory data transfer.
Optionally, when the user-mode memory swap capability is used, multiple ultra-performance cloud disk devices are used to construct swap storage devices of equal priority.
Optionally, the step of transferring the memory data of the one or more application instances in the low-power state between memory and the storage device comprises: based on a least-recently-used algorithm, selecting the memory data of one or more application instances in the low-power state and transferring it from memory to the storage device for persistence; and/or, in response to a swap-back instruction, a traffic request, or a change in instance deployment policy, transferring the memory data of one or more application instances in the low-power state from the storage device back into memory.
Optionally, when the memory data of a second application instance has been transferred to the storage device, the step of causing at least one second application instance of the application in the low-power state to enter the online state further comprises: transferring the memory data of the second application instance from the storage device into the execution range of periodic memory management operations in memory.
Optionally, scale-in processing is performed on an application in response to a scale-in instruction for the application, or in response to the application's traffic dropping below a first predetermined threshold and becoming a low-traffic application; and/or scale-out processing is performed on the application in response to a scale-out instruction for the application, or in response to the application's traffic rising above a second predetermined threshold.
According to a second aspect of the present disclosure, a container-based application management apparatus is provided, deployed in a container-based serverless computing system, the serverless computing system being configured to allow an application instance to be in one of an online state and a low-power state at runtime, wherein the power consumption and/or resource consumption of an application instance in the low-power state is lower than the power consumption and/or resource consumption in the online state, the apparatus comprising: a scale-in device that, in response to scale-in processing being performed on an application, causes at least one first application instance of the application in the online state to enter the low-power state; and a scale-out device that, in response to scale-out processing being performed on the application, causes at least one second application instance of the application in the low-power state to enter the online state.
Optionally, the scale-in device may comprise: a quota limiting device for limiting the resource usage quota of at least one first application instance of a low-traffic application; and a configuration lowering device for lowering the resource configuration of the container in which each of the at least one first application instance resides.
Optionally, the scale-out device may comprise: a configuration raising device for raising the resource configuration of the container in which each of at least one second application instance among the at least one first application instance resides; and a quota restoring device for lifting the resource usage quota limit of the at least one second application instance.
Optionally, the apparatus may further comprise: a memory swap storage device that transfers the memory data of one or more first application instances among the at least one first application instance between memory and a storage device.
According to a third aspect of the present disclosure, a computing device is provided, comprising: a processor; and a memory having executable code stored thereon which, when executed by the processor, causes the processor to perform the method according to the first aspect.
According to a fourth aspect of the present disclosure, a non-transitory machine-readable storage medium is provided, having executable code stored thereon which, when executed by a processor of an electronic device, causes the processor to perform the method according to the first aspect.
A container-based application management method and apparatus capable of rapid elastic scaling of application instances are thereby achieved.
Brief Description of the Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent from the following more detailed description of exemplary embodiments of the present disclosure in conjunction with the accompanying drawings, in which the same reference numerals generally denote the same components.
Fig. 1 is a schematic diagram of elastic scale-in of an application according to an embodiment of the present disclosure;
Fig. 2 is a schematic diagram of elastic scale-out of an application according to an embodiment of the present disclosure;
Fig. 3 is a schematic block diagram of an application management apparatus according to an embodiment of the present disclosure;
Fig. 4 is a schematic flowchart of an application management method according to an embodiment of the present disclosure;
Fig. 5 is a schematic flowchart of the scale-in phase of the application management method according to an embodiment of the present disclosure;
Fig. 6 is a schematic flowchart of the scale-out phase of the application management method according to an embodiment of the present disclosure;
Fig. 7 is a schematic diagram of low-power memory range management according to an embodiment of the present disclosure;
Fig. 8 is a schematic diagram of instance state transitions according to an embodiment of the present disclosure;
Fig. 9 is a schematic diagram of the kernel-mode memory swap capability;
Fig. 10 is a schematic diagram of the user-mode memory swap capability according to the present disclosure;
Fig. 11 is a schematic structural diagram of a computing device that can be used to implement the above application management method according to an embodiment of the present disclosure.
Detailed Description
Preferred embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although the drawings show preferred embodiments of the present disclosure, it should be understood that the present disclosure can be implemented in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
In the context of the present disclosure, the terms "first" and "second" serve only to distinguish, and carry no meaning of order, precedence, or importance.
For ease of description, in the context of the present disclosure, "first application instance" denotes an application instance in the scale-in process, and "second application instance" denotes an application instance in the scale-out process. As an application instance switches working states, the two designations can convert into each other: after a first application instance has entered the low-power state through scale-in processing, it can also serve as a second application instance of the present disclosure when it is later scaled out again, and vice versa.
An application instance in the low-power state may be called a "low-power application instance". An application instance in the online state may be called an "online application instance".
In some contexts, "application instance" may be abbreviated as "instance".
According to the present disclosure, a container-based application management scheme for serverless computing scenarios is provided.
Serverless computing elasticity means that, in the serverless computing model, in order to cope with changes in service request traffic, the number of running application instances needs to grow from one (or a few) to many (or more), or shrink from many replicas to fewer. The growing and shrinking processes are called scale-out and scale-in respectively, and the capability to do both is called elasticity.
Depending on whether scaling crosses machines, elasticity can be divided into vertical elasticity and horizontal elasticity: vertical elasticity refers to scaling on the same host, while horizontal elasticity refers to scaling across hosts.
Elasticity in the present disclosure refers to the scaling of applications. An application instance changing from the low-power (low-traffic) state (also called "low-power running state", "low-power mode" or "low-power running mode") to a state that can accept full traffic (the full-traffic/full-power state, also called the "online state") is scale-out; changing from a traffic-accepting state to a standby state is scale-in. It should be understood that scale-out may also transform an instance from the low-power state into a partial-traffic/partial-power state, i.e., a state between the low-power/low-traffic state and the full-traffic/full-power state. Correspondingly, scale-in may also include transforming an instance in the partial-traffic/partial-power state into the low-power/low-traffic state.
In low-power mode, the runtime activity of an application instance can be restricted to reduce the instance's resource consumption.
The power consumption and/or resource consumption of an application instance in the low-power state is lower than the power consumption and/or resource consumption in the online state.
For example, a ratio can be set so that the power and/or resource consumption in the low-power state is at or below the set value or the corresponding ratio, e.g., 10%, 5%, 2%, or 1%.
Different values or different corresponding ratios can also be set separately for power consumption or for different types of resources.
When an application instance in the online state enters the low-power state, some of the resources it uses can be shut down or their usage reduced, retaining only some of the system resources it uses. For example, a thread pool can be shut down or its thread count reduced, while system resources such as file descriptors remain open.
Correspondingly, when an application instance in the low-power state enters the online state, the shut-down resources or their usage can be restored.
In addition, in low-power mode the memory footprint of an application instance drops sharply; the large amount of previously occupied memory is released and no longer used. This unused memory, including heap and non-heap memory, can be reclaimed and allocated to other application instances.
Unlike directly freezing an application instance, in the low-power state (mode) the application instance can still provide certain services, such as configuration push.
Figs. 1 and 2 schematically show the state changes of several instances of one application during scale-in and scale-out. As an example, five instances are shown for one application. It should be understood that the number of instances corresponding to one application is not limited to this.
In addition, the figures show that an online instance (full-traffic state) occupies "4c, 4g", i.e., 4 CPU cores and 4 GB of memory, while a low-power instance (low-power state) occupies "0.05c, 500M", i.e., 0.05 CPU cores and 500 MB of memory. It should be understood that the resource quantities are not limited to these, nor does every instance need to occupy the same amount of resources. Moreover, the resources occupied by each instance can be adjusted according to real-time traffic. The resource types are also not limited to CPU and memory and may include other types of resources.
In the figures, a solid box denotes an instance's current state, a dashed box its previous state, and an arrow the direction of the state change.
Fig. 1 is a schematic diagram of elastic scale-in of an application according to an embodiment of the present disclosure.
For example, in response to a scale-in instruction for a low-traffic application, or in response to an application's traffic dropping below the first predetermined threshold and becoming a low-traffic application, scale-in processing can be performed to move one or more previously online application instances into the low-power state, converting them into low-power instances.
During scale-in, all online instances of the application, or only some of them, can be converted into low-power instances according to the instruction or the real-time traffic.
Fig. 2 is a schematic diagram of elastic scale-out of an application according to an embodiment of the present disclosure.
For example, in response to a scale-out instruction for a low-traffic application, or in response to a low-traffic application's traffic rising above the second predetermined threshold, scale-out processing can be performed to move one or more previously low-power application instances into the online state, converting them into online instances. Here, the second predetermined threshold can be higher than or equal to the first threshold.
During scale-out, all low-power instances of the application, or only some of them, can likewise be converted into online instances according to the instruction or the real-time traffic.
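For illustration only (this sketch is not part of the original disclosure), the threshold-driven triggering described above can be pictured as a small control loop. The metric source, the threshold values, and the scaler functions below are assumed placeholders.

```go
// Schematic control loop for threshold-driven scale-in/scale-out.
package main

import "time"

const (
	firstThreshold  = 100.0 // scale in below this QPS (assumed value)
	secondThreshold = 500.0 // scale out above this QPS (assumed value)
)

func currentQPS() float64 { return 0 } // stub: read from a metrics system

func scaleIn()  {} // convert some online instances into low-power instances
func scaleOut() {} // convert some low-power instances into online instances

func main() {
	// Re-evaluate the application's real-time traffic periodically.
	for range time.Tick(10 * time.Second) {
		switch qps := currentQPS(); {
		case qps < firstThreshold:
			scaleIn()
		case qps > secondThreshold:
			scaleOut()
		}
	}
}
```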
The scale-in and scale-out processing according to the present disclosure is described in detail below with reference to Figs. 3 to 10.
Fig. 3 is a schematic block diagram of an application management apparatus according to an embodiment of the present disclosure.
As shown in Fig. 3, the application management apparatus 10 according to an embodiment of the present disclosure may include a scale-in device 100, a scale-out device 200 and a memory swap storage device 300.
The scale-in device 100 may include a quota limiting device 110 and a configuration lowering device 120.
The scale-out device 200 may include a quota restoring device 220 and a configuration raising device 210.
Fig. 4 is a schematic flowchart of an application management method according to an embodiment of the present disclosure.
As shown in Fig. 4, in step S10, a container-based serverless computing system is configured to allow an application instance to be in one of the online state and the low-power state at runtime.
By making the system support a low-power state/mode for the application instance runtime, the system's upper-layer runtime knows whether an application instance is working in the low-power state; this avoids application behavior causing large numbers of page-in/page-out operations, and thus avoids rendering the system unavailable.
In the scale-in phase, in step S100, for example by means of the scale-in device 100, in response to scale-in processing being performed on an application, at least one first application instance of the application in the online state is caused to enter the low-power state.
In the scale-out phase, in step S200, for example by means of the scale-out device 200, in response to scale-out processing being performed on the application, at least one second application instance of the application in the low-power state is caused to enter the online state.
The scale-in processing according to the present disclosure is described first. As noted above, the scale-in processing here can be performed, for example, in response to a scale-in instruction for a low-traffic application, or in response to an application's traffic dropping below the first predetermined threshold and becoming a low-traffic application.
[Scale-in]
Fig. 5 is a schematic flowchart of step S100 of the scale-in phase of the application management method according to an embodiment of the present disclosure.
As shown in Fig. 5, in step S110, for example by the quota limiting device 110, the resource usage quota of at least one first application instance in the online state of the application to be scaled in is limited.
Here, the resource usage quota of an application instance can be limited, or the limit lifted, based on the control groups (cgroups) mechanism.
The control groups mechanism is a mechanism provided by the Linux kernel that can limit the resources used by a single process or a group of processes. Cgroups defines a subsystem for each controllable resource, enabling fine-grained control of CPU, memory and other resources. In cloud computing environments, today's popular container technologies mostly rely deeply on the resource limiting capability provided by cgroups to control CPU, memory and other resources. A minimal sketch of this kind of quota adjustment is given below.
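The following Go sketch, added for illustration only, writes cgroup v2 interface files directly. The cgroup mount point, the group name, and the 0.05-CPU/500 MB versus 4-CPU/4 GB figures are assumptions (the figures match the "4c, 4g"/"0.05c, 500M" example used earlier in this text); a real system would act on the container's actual cgroup and needs the corresponding privileges.

```go
// Minimal sketch of quota limiting/lifting via cgroup v2,
// assumed mounted at /sys/fs/cgroup.
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

func writeCgroupFile(group, file, value string) error {
	p := filepath.Join("/sys/fs/cgroup", group, file)
	return os.WriteFile(p, []byte(value), 0o644)
}

// EnterLowPower throttles an instance's cgroup: 0.05 CPU
// (5 ms of runtime per 100 ms period) and a 500 MB memory cap.
func EnterLowPower(group string) error {
	if err := writeCgroupFile(group, "cpu.max", "5000 100000"); err != nil {
		return err
	}
	return writeCgroupFile(group, "memory.max", fmt.Sprint(500*1024*1024))
}

// EnterOnline lifts the limits back to the online specification
// (here 4 CPUs and 4 GB of memory).
func EnterOnline(group string) error {
	if err := writeCgroupFile(group, "cpu.max", "400000 100000"); err != nil {
		return err
	}
	return writeCgroupFile(group, "memory.max", fmt.Sprint(4*1024*1024*1024))
}

func main() {
	if err := EnterLowPower("app-instance-1"); err != nil {
		fmt.Fprintln(os.Stderr, "scale-in failed:", err)
	}
}
```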
Accordingly, the at least one first application instance enters the low-power (running) state.
The present disclosure proposes the concept of running traditional applications at low power. With no (or low) traffic, the application is in an extremely inactive state; in this state the application theoretically needs very little CPU and has a very small memory footprint. This running state of an instance may be called the low-power state (low-power running).
Low-power running is a state of a running application instance: the instance is alive and running but provides only limited services. Compared with the full-power (full-traffic) running state, the low-power running state occupies very few system resources and can respond quickly to external system events. According to the present disclosure, upon specific events (for example in response to a scale-out instruction, or in response to a low-traffic application's traffic rising above the second predetermined threshold), a running application instance can rapidly return from the low-power running state to the full-power running state, regaining full service capability to handle large volumes of traffic requests.
When an application instance enters low-power running at runtime, step S110 above can use the cgroups mechanism to lower the CPU and memory specifications of the low-power application instance to very low levels.
On the one hand, regarding the CPU, multiple first application instances can be placed, based on the CPU share capability, into a low-power CPU group to run; the application instances in that CPU group share CPUs.
This conveniently concentrates the CPU resources allocated to low-power instances onto the shared CPUs, further reducing the CPU resources used in low-power running mode, and also isolates the CPUs occupied by low-power instances from the CPUs used by full-power instances.
In addition, during subsequent scale-out processing, when a second application instance enters the online state, the second application instance can also be removed from the low-power CPU group, as sketched below.
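A minimal sketch of such a shared low-power CPU group follows, using the cgroup v1 cpuset controller for illustration; the group path, the choice of CPU 0, and the use of cgroup.procs to move processes are assumptions, not details fixed by the disclosure.

```go
// Sketch of a shared low-power CPU group via cgroup v1 cpuset.
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

const lowPowerGroup = "/sys/fs/cgroup/cpuset/lowpower"

// JoinLowPowerCPUGroup confines pid to the shared low-power CPU set:
// every instance in the group competes for the same single core.
func JoinLowPowerCPUGroup(pid int) error {
	if err := os.MkdirAll(lowPowerGroup, 0o755); err != nil {
		return err
	}
	// Pin the whole group to CPU 0 and memory node 0.
	if err := os.WriteFile(filepath.Join(lowPowerGroup, "cpuset.cpus"),
		[]byte("0"), 0o644); err != nil {
		return err
	}
	if err := os.WriteFile(filepath.Join(lowPowerGroup, "cpuset.mems"),
		[]byte("0"), 0o644); err != nil {
		return err
	}
	// Writing the pid to cgroup.procs moves the process into the group.
	return os.WriteFile(filepath.Join(lowPowerGroup, "cgroup.procs"),
		[]byte(fmt.Sprint(pid)), 0o644)
}
```

Taking an instance out of the group again (on scale-out) is the symmetric operation: write its pid to the cgroup.procs file of the target full-power group.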
On the other hand, regarding memory, the memory management scheme of the present disclosure for low-power instance management is described in detail below.
The application management scheme of the present disclosure requires the application runtime to support a low-power running mode.
Some traditional Java applications do not support low-power runtime operation because of how JVM (Java Virtual Machine) memory management behaves: even with no external traffic requests, the JVM allocates objects across the entire heap range and performs periodic garbage collection (GC). These behaviors touch large regions of the Java heap, and updates to object liveness information by GC also turn large amounts of memory into dirty state.
To make the Java runtime support low-power running, the Java application is first put into the low-power running state. In the low-power state, the garbage collection (GC) of the Java process acts only on a small portion of the heap. In other words, only local GC is performed, ensuring that the JVM's memory footprint stays within a bounded range in low-power mode.
The Elastic Heap capability provided by AJDK can be used to provide low-power running for the process. In low-power mode, GC can act only within a heap range of limited size; Elastic Heap provides a low-power elastic heap size setting, typically 1/10 of the normal heap range.
In addition, in Java-based systems, not only heap memory can be released; native memory can also be released.
The above describes periodic GC scanning in the Java language. Other languages have their own low-power-relevant operations, for example memory release operations that free memory unused within a predetermined period, which likewise affect the running state of application instances in the low-power state. These operations affecting the low-power running state of low-power application instances can therefore also be confined to a local memory range, while the memory occupied by low-power application instances is placed outside the execution range of those operations.
Besides the periodic garbage collection and memory release operations described here, the execution ranges of other periodic memory management operations that might wake a low-power instance, or affect its low-power running state, can also be restricted.
In response to a range adjustment instruction, the set size of the execution range of the periodic memory management operations can be adjusted, or the start and end of that execution range can be adjusted.
Fig. 7 is a schematic diagram of low-power memory range management according to an embodiment of the present disclosure.
Java application memory 400 may include heap memory (heap area) 420 and non-heap memory (non-heap area) 410.
Within the heap memory 420, the execution range 421 of periodic memory management operations is delimited; the range 422 outside the execution range 421 is the low-power-instance memory range. In the low-power-instance memory range 422, the above periodic memory management operations are not executed.
When at least one first application instance of an application enters the low-power state, the memory occupied by these first application instances can be placed within the low-power-instance memory range in memory.
For example, the memory occupied by these first application instances can be moved out of the delimited execution range of periodic memory management operations into the low-power-instance memory range outside it.
Alternatively, a range adjustment instruction can be issued to adjust the execution range: when a first application instance occupying memory enters the low-power state, the execution range is set to exclude the occupied memory; or, when a first application instance occupying memory enters the online state, the execution range is set to include the occupied memory.
In this way, without actually moving memory data, the memory occupied by an application instance can switch between the above execution range and the above low-power-instance memory range in accordance with changes in the instance's running state.
Besides switching between the execution range of online instances and the low-power-instance memory range, an instance's memory data can, for example with the help of the operating system's least-recently-used (LRU) memory algorithm, be further swapped out to the swap area (a memory data swap device, or storage device) when inactive in low-power mode/state.
Fig. 8 is a schematic diagram of instance state transitions according to an embodiment of the present disclosure.
As shown in Fig. 8, memory data storage can be logically divided into three tiers: L0, L1 and L2.
Tier L0 holds the memory data of online instances, inside the above execution range of periodic memory management operations and outside the above low-power-instance memory range.
Tier L1 holds the memory data of low-power (cached) instances, inside the above low-power-instance memory range and outside the above execution range of periodic memory management operations.
Tier L2 holds memory data swapped out (persisted) to a memory swap device or storage device, for example on an external hard disk or cloud disk. For example, an image can be formed for each container, i.e., a container image set, to isolate data between containers.
An application instance's memory data can switch between tiers L0 and L1 as its running state changes. Likewise, memory data can be exchanged between tiers L1 and L2, for example based on the LRU algorithm described above. Alternatively, as described below, during scale-out it can also move directly from tier L2 to tier L1.
Memory swap is an important component of the memory subsystem in current Linux operating systems. Linux uses the swap capability to increase the virtual memory available on a node. With swap, the operating system can swap data out of main memory and persist it on a swap device, so that the main memory thus obtained (released) can be allocated to other processes (instances). Meanwhile, when system memory is scarce, swap effectively prevents out-of-memory (OOM) errors. When those memory pages are accessed again, the operating system reads the page data from the swap device and loads it into main memory, ensuring that the process keeps running normally.
In this way, for example by means of the memory swap storage device 300, the memory data of at least one low-power application instance can be transferred between memory and a storage device.
Here, based on the least-recently-used algorithm, the memory data of one or more low-power application instances located in the low-power-instance memory range can be selected, transferred out of memory, and persisted to the storage device.
In response to a swap-back instruction, a traffic request, or a change in instance deployment policy, the memory data of one or more low-power application instances located on the storage device is transferred back into the low-power-instance memory range in memory. A high-speed network disk can be used as the swap device (memory data swap device/storage device) for storing the swapped-out memory.
In addition, the operating system's default swap capability/policy, i.e., the kernel-mode memory swap capability, can be used to perform the memory data swapping.
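One user-triggerable hook into this kernel-mode swap path, shown here purely for illustration, is madvise(MADV_PAGEOUT) (Linux 5.4 and later), which asks the kernel to reclaim the given pages immediately rather than waiting for LRU aging. The anonymous region below is a stand-in for a low-power instance's inactive memory.

```go
// Sketch: proactively push a cold memory region to the swap device.
package main

import (
	"log"

	"golang.org/x/sys/unix"
)

func main() {
	// Stand-in for a low-power instance's inactive memory region (64 MB).
	region, err := unix.Mmap(-1, 0, 64<<20,
		unix.PROT_READ|unix.PROT_WRITE,
		unix.MAP_PRIVATE|unix.MAP_ANONYMOUS)
	if err != nil {
		log.Fatal(err)
	}
	defer unix.Munmap(region)

	// Touch the pages so there is something to reclaim.
	for i := 0; i < len(region); i += 4096 {
		region[i] = 1
	}

	// Ask the kernel to page the region out to the swap device now.
	if err := unix.Madvise(region, unix.MADV_PAGEOUT); err != nil {
		log.Fatal("madvise(MADV_PAGEOUT): ", err)
	}
}
```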
However, with the current swap capability of Linux, the physical host shares the same swap storage space: the memory data swapped out by multiple application instances (from memory to the storage device) is stored on a storage device shared at the host level. Swap therefore does not support cgroup-level isolation.
Fig. 9 is a schematic diagram of the kernel-mode memory swap capability.
Low-power instances A1 and A2 of application A, and low-power instances B1 and B2 of application B, each have their own memory data. For ease of understanding, the figure uses different fill patterns for the data of different instances.
As shown in Fig. 9, the memory data of all low-power instances is swapped out to the same swap storage device, i.e., they share the same memory data swap device/storage device. With frequent swap-ins and swap-outs, for example, the layout on the swap device becomes scrambled; the randomization is so severe that read/write performance suffers badly.
The present disclosure therefore also allows a user-space swap capability, i.e., a user-mode memory swap capability, to be used through configuration. The user-mode memory swap capability enables swap isolation at the container, process and instance level. In addition, a formatted memory layout can be adopted to enable aggressive read-ahead and fast swap-out and swap-in.
The user-mode memory swap capability can use file mapping and sequential reads and writes, so its read/write performance can match the sequential read/write performance of different storage media. This also reduces, to a degree, the dependence on swap storage devices with fast random reads.
Fig. 10 is a schematic diagram of the user-mode memory swap capability according to the present disclosure.
Through user-mode settings, memory data corresponding to different instances, or from different containers, can be swapped out to different storage devices, or to different pages of a storage device, isolating instance or container data.
Thus, the kernel-mode memory swap capability can be used to transfer the memory data of one or more low-power application instances between memory and a storage device. Alternatively, the user-mode memory swap capability can be used for such transfers, with memory data from different containers transferred to different storage devices or to different pages of a storage device.
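The per-container isolation idea behind the user-mode swap capability can be sketched as follows; the file naming, the append-only page layout, and all identifiers are assumptions made only for illustration, not the disclosure's actual format.

```go
// Minimal sketch of per-container routing in a user-mode swap layer:
// each container gets its own backing file, and evicted pages are
// appended sequentially so later swap-in is one large sequential read.
package main

import (
	"os"
	"path/filepath"
)

type UserspaceSwap struct {
	dir   string
	files map[string]*os.File // containerID -> dedicated backing file
}

func NewUserspaceSwap(dir string) *UserspaceSwap {
	return &UserspaceSwap{dir: dir, files: make(map[string]*os.File)}
}

// SwapOut appends the evicted pages of one container to that
// container's own file, never mixing data across containers.
func (s *UserspaceSwap) SwapOut(containerID string, pages []byte) error {
	f, ok := s.files[containerID]
	if !ok {
		var err error
		f, err = os.OpenFile(filepath.Join(s.dir, containerID+".swap"),
			os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o600)
		if err != nil {
			return err
		}
		s.files[containerID] = f
	}
	_, err := f.Write(pages)
	return err
}
```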
For an application or application instance, the relevant operators can configure whether the kernel-mode or the user-mode memory swap capability is used.
A memory swap setting instruction for an application or application instance, issued for example by the relevant operator, can be received.
The memory swap setting instruction can indicate whether, for the application's instances or for the given instance, the kernel-mode or the user-mode memory swap capability is to be used to perform memory data transfer.
In this way, in response to the received memory swap setting instruction, the memory swap capability used for memory data transfer can be set for the application's instances or for the given instance.
When the user-mode memory swap capability is used, multiple ultra-performance cloud disk devices can be used to construct swap storage devices of equal priority.
After the low-power conversion for CPU and memory described above, the target application instance has entered the low-power running state. Inactive memory data has further been swapped out to low-cost external IO devices (memory data swap devices/storage devices).
At this point, the application instance's resources can be further down-configured, returning the resources saved by low-power running to the scheduling system for scheduling other high-priority tasks, achieving resource reuse.
Returning to Fig. 5, in step S120, for example by the configuration lowering device 120, the resource configuration of the container in which each of the at least one first application instance resides is lowered.
Resources released by a container are returned to the cluster by upgrading the pod-level resource specification in place, after which the cluster can schedule pods of other workloads onto that node.
Here, a container's resource configuration can be lowered or raised based on an in-place container resource upgrade mechanism such as that of Kubernetes (K8s). When lowering a container's resource configuration, the resources released by the container hosting the application instance are returned to the scheduling system based on the in-place pod resource upgrade mechanism. Correspondingly, when raising a container's resource configuration in the scale-out phase, resources are requested from the scheduling system for the container based on the in-place pod resource upgrade mechanism.
Resource management here involves two levels: the in-container runtime level and the K8s level. For example, suppose the system originally allocated 4C8G (4 cores, 8 GB of memory) to an application instance. The runtime used that much, but after entering the low-power state the instance itself occupies only 1/10 of the resources. Yet if these resources are not also updated at the K8s level, the scheduling layer still believes the instance occupies 4C8G, and the freed resources cannot be allocated to other application instances or containers. The K8s level here mainly refers to the metadata maintained by the scheduling layer.
In-place container resource upgrade can update a pod's resources in place without disturbing the Kubernetes pod, lowering the resource configuration of the container hosting the current application instance. Low-power running of application instances and quota adjustment of container platform resources can thus be achieved together.
A pod is the smallest unit that Kubernetes can create and deploy; it is one application instance in a Kubernetes cluster and is always deployed on a single node. A pod contains one or more containers, as well as resources shared by the containers, such as storage and networking. Pods can support multiple container environments, such as the currently popular Docker.
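As a stand-in for the in-place pod resource upgrade described here, recent upstream Kubernetes exposes a pod "resize" subresource behind the InPlacePodVerticalScaling feature gate (alpha since v1.27). The disclosure predates this, so the following client-go sketch is only an analogous illustration; the namespace, pod and container names are placeholders.

```go
// Sketch: shrink a container's requests/limits in place so the
// scheduler sees the freed 4C8G -> 0.05C/500M difference.
package main

import (
	"context"
	"log"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}

	patch := []byte(`{"spec":{"containers":[{"name":"app",
	  "resources":{"requests":{"cpu":"50m","memory":"500Mi"},
	               "limits":{"cpu":"50m","memory":"500Mi"}}}]}}`)

	// Patch the "resize" subresource instead of recreating the pod.
	_, err = client.CoreV1().Pods("default").Patch(
		context.TODO(), "app-pod", types.StrategicMergePatchType,
		patch, metav1.PatchOptions{}, "resize")
	if err != nil {
		log.Fatal(err)
	}
}
```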
This concludes the detailed description of the scale-in processing according to the present disclosure: the resource usage quota of at least one first application instance of a low-traffic application is limited so that it enters the low-power (running) state; the resource configuration of the containers hosting these first application instances is further lowered; and the resources released from those containers are allocated to other containers in the pods where the containers reside.
[Scale-out]
The scale-out processing according to the present disclosure is described next. As noted above, the scale-out processing here can be performed, for example, in response to a scale-out instruction for a low-traffic application, or in response to a low-traffic application's traffic rising above the second predetermined threshold.
After an application is scaled in, at least one of its application instances is in low-power running mode. The low-power running mode supports dynamic resource configuration updates.
For example, when there is a traffic request, or traffic grows and scale-out is needed, a second application instance currently in low-power running can be pulled up quickly so that it can handle the full traffic.
Fig. 6 is a schematic flowchart of step S200 of the scale-out phase of the application management method according to an embodiment of the present disclosure.
As shown in Fig. 6, in step S210, for example by the configuration raising device 210, the resource configuration of the container in which each of at least one second application instance in the low-power state of the application to be scaled out resides is raised.
As above, the resource configuration of containers within a pod can be raised based on the in-place container resource upgrade mechanism, for example.
Here, second application instances in containers on machines whose resources are relatively idle among the nodes where the containers reside can preferentially be brought into the online state.
That is, scale-out preferentially selects second application instances on machines with relatively idle resources among the container nodes, requesting resources from the scheduling system to upgrade the configuration to the original specification, a set specification, or a specification determined from the current traffic.
In other words, when the application to be scaled out has multiple application instances in the low-power state residing on different machines, the application instances on relatively idle machines are preferentially restored to the online state, with their resource usage quotas restored accordingly and the resource configuration of their containers raised.
Once the container resource upgrade is complete, the local cgroup restrictions can be lifted, taking the application instance runtime out of low-power running mode (the low-power state), i.e., into online mode (the online state).
That is, in step S220, for example by the quota restoring device 220, the resource usage quota limit of the at least one second application instance is lifted.
As above, the limit on an application instance's resource usage quota can likewise be lifted based on the control groups mechanism.
In addition, based on the live migration capability of Checkpoint/Restore In Userspace (CRIU) technology, a low-power application instance can be migrated to a relatively idle machine, further alleviating the resource-run problem.
Thus, to solve the problem of insufficient resources on the machine hosting an instance being scaled out, the low-power running mode can introduce the live migration capability (also called "hot migration" or "online migration") based on Checkpoint/Restore In Userspace (CRIU) technology, migrating low-power running instances to an idle target machine for scheduling. That is, low-power application instances can be migrated horizontally.
On the one hand, when an application instance is about to enter the online state but resources on its node are insufficient, the application instance can be migrated to a relatively idle node based on the live migration capability of CRIU.
On the other hand, when a single node holds multiple application instances in the low-power state, it can be predicted, for example, that when these low-power instances are later restored to online instances the node's resources will be insufficient, i.e., a resource run may occur. In that case, based on CRIU technology, checkpoint snapshots can be made of one or more of the application instances, and the one or more application instances can then be restored on a relatively idle node from the checkpoint snapshots. Here, for example, the memory occupied by the at least one second application instance can be placed outside the above low-power-instance memory range in memory, i.e., inside the above execution range of periodic memory management operations in memory.
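A hedged sketch of driving CRIU from a management agent is shown below. It shells out to the stock criu dump/restore commands; transporting the image directory to the idle target node (shared volume, copy, etc.) is elided, and the flag set shown is only one plausible combination.

```go
// Sketch: checkpoint a low-power instance and restore it elsewhere.
package main

import (
	"fmt"
	"os/exec"
)

// Checkpoint freezes pid and writes its state into imgDir.
func Checkpoint(pid int, imgDir string) error {
	cmd := exec.Command("criu", "dump",
		"-t", fmt.Sprint(pid),
		"--images-dir", imgDir,
		"--shell-job",       // assumed: the process is attached to a shell
		"--tcp-established") // carry live TCP connections along
	return cmd.Run()
}

// Restore recreates the process on the target node from imgDir.
func Restore(imgDir string) error {
	cmd := exec.Command("criu", "restore",
		"--images-dir", imgDir,
		"--shell-job",
		"--tcp-established")
	return cmd.Run()
}
```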
When a second application instance's memory data has already been transferred to the memory swap device/storage device/external IO device, the memory data stored on the external IO device can be swapped in quickly, for example by the memory swap storage device 300; that is, the second application instance's memory data is transferred from the storage device into the execution range in memory.
In addition, swap-in (from the storage device into memory) under the default kernel-mode memory swap capability is lazy, which is unsuitable for rapid ramp-up scenarios.
To meet the extremely fast swap-in requirements of serverless computing, a layer of concurrent swap-in logic can be implemented in the user-mode memory swap capability.
The user-mode memory swap capability can be used to simultaneously transfer the memory data of multiple low-power application instances from memory to different storage devices or to different pages of a storage device. Conversely, it can also simultaneously transfer the memory data of multiple low-power application instances from different storage devices or different pages of a storage device into memory.
This allows one-shot swap-in to reach the IO ceiling, greatly improving scale-out performance.
Furthermore, to accelerate swap-in and raise the IO throughput ceiling during swap-in, the underlying memory swap device/storage device can be constructed by exploiting the priority feature of the swap subsystem: multiple ESSD (Enhanced SSD, ultra-performance cloud disk) block devices are combined into a memory swap device/storage device of a single priority, achieving RAID-like (redundant array of independent disks) capability for memory swap. In practice, this greatly increases the IO throughput of swapping memory data in, achieving extremely fast, second-level elasticity.
For example, in one application example, replacing a single 1.2 TB ESSD device with three 400 GB ESSD devices raised IO throughput from 350 MB/s to 960 MB/s.
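The equal-priority construction can be expressed directly with swapon(2), as in the following illustrative sketch; the device paths are placeholders, and each device is assumed to have been prepared with mkswap beforehand. Giving all devices the same priority makes the kernel stripe pages across them round-robin.

```go
// Sketch: build an equal-priority swap array from several ESSD devices.
package main

import (
	"log"

	"golang.org/x/sys/unix"
)

func main() {
	devices := []string{"/dev/vdb", "/dev/vdc", "/dev/vdd"} // 3 x 400G ESSD
	const prio = 100

	for _, dev := range devices {
		flags := unix.SWAP_FLAG_PREFER |
			((prio << unix.SWAP_FLAG_PRIO_SHIFT) & unix.SWAP_FLAG_PRIO_MASK)
		if err := unix.Swapon(dev, flags); err != nil {
			log.Fatalf("swapon %s: %v", dev, err)
		}
	}
}
```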
Serverless emphasizes rapid creation of application instances. The scheme of the present disclosure can use the goroutine feature of the Go language, combined with the Linux pagemap mapping structure, to access multiple memory segments simultaneously. This provides a method for fast memory swap-in in serverless scenarios. Compared with the traditional operating system approach of lazy, access-triggered loading, it can make full use of the maximum IO bandwidth to load memory data swapped out to external storage media back into memory and provide service quickly.
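The concurrent swap-in idea can be sketched, for illustration only, as a parallel pre-fault over the swapped-out region: one goroutine per stripe touches its pages so that many swap reads are in flight at once. The region below is a stand-in for memory whose backing pages live on the swap device.

```go
// Sketch: force swapped-out pages back into memory concurrently.
package main

import (
	"runtime"
	"sync"
)

const pageSize = 4096

// PrefaultParallel touches every page of region with one goroutine per
// stripe, so the kernel issues many concurrent swap reads instead of
// faulting pages in lazily one at a time.
func PrefaultParallel(region []byte) {
	workers := runtime.NumCPU()
	stripe := (len(region)/workers + pageSize - 1) / pageSize * pageSize

	var wg sync.WaitGroup
	for off := 0; off < len(region); off += stripe {
		end := off + stripe
		if end > len(region) {
			end = len(region)
		}
		wg.Add(1)
		go func(chunk []byte) {
			defer wg.Done()
			var sink byte
			for i := 0; i < len(chunk); i += pageSize {
				sink += chunk[i] // read fault: page comes back from swap
			}
			_ = sink
		}(region[off:end])
	}
	wg.Wait()
}
```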
This describes the scale-out processing according to the present disclosure in detail: raising the resource configuration of the containers hosting the second application instances; and lifting the resource usage quota limits of the second application instances so that they exit the low-power running mode/low-power state.
This concludes the detailed description of the scale-in and scale-out processing in the container-based application management scheme according to the present disclosure.
The present disclosure thus proposes an application management scheme that can deliver rapid elasticity for existing applications without any change to the application architecture.
In cloud computing serverless scenarios, the present disclosure proposes a low-power running mode for application instances.
Combining the low-power running mode of application instances with the kernel-mode and user-mode swap capabilities of the operating system, and with, for example, the in-place container upgrade capability at the K8s level (the container resource return process), a set of methods for controlling low-power running of application instances from user mode is provided.
Live objects cached in the low-power state can occupy very few system resources, and when traffic requests reach the scale-out threshold, online serving application instances can be scaled out horizontally within seconds, with resources requested at startup.
In addition, in embodiments, by performing concurrency control in user mode and exploiting the swap priority feature, a high-performance memory swap-out/swap-in scheme is provided, enabling second-level elastic scale-out in serverless scenarios.
In addition, information relating to the elastic scale-in and scale-out of applications can be recorded, displayed or aggregated for evaluation and adjustment, with intervention where necessary. For example, the relevant information may include the respective numbers of low-power and online application instances, the ratio between the numbers of instances in the two states, the duration of each state, the time taken for an application instance to go from the low-power state to the online state, the time taken for application scale-out, the time taken for application scale-in, and so on.
The application scale-in/scale-out scheme of the application management method according to the present disclosure is applicable to many scenarios, especially those whose traffic varies greatly over time; in other words, scenarios that must handle enormous traffic during certain periods while the traffic at other times is much smaller, requiring elastic scale-in or scale-out. Examples include large promotional campaigns held during specific periods, and scenarios such as train ticket sales. In such scenarios, traffic suddenly bursts to several times, tens of times, or even hundreds of times the usual level within a very short period, and the application management scheme of the present disclosure copes well with the scaling needs of these scenarios.
Fig. 11 is a schematic structural diagram of a computing device that can be used to implement the above application management method according to an embodiment of the present disclosure.
Referring to Fig. 11, the computing device 1000 includes a memory 1010 and a processor 1020.
The processor 1020 may be a multi-core processor or may include multiple processors. In some embodiments, the processor 1020 may include a general-purpose main processor and one or more special coprocessors, such as a graphics processing unit (GPU) or a digital signal processor (DSP). In some embodiments, the processor 1020 may be implemented with custom circuitry, such as an application-specific integrated circuit (ASIC) or a field-programmable gate array (FPGA).
The memory 1010 may include various types of storage units, such as system memory, read-only memory (ROM), and permanent storage. The ROM may store static data or instructions needed by the processor 1020 or other modules of the computer. The permanent storage may be a readable and writable storage device, and may be a non-volatile storage device that does not lose stored instructions and data even when the computer is powered off. In some embodiments, a mass storage device (e.g., a magnetic or optical disk, or flash memory) is employed as the permanent storage. In other embodiments, the permanent storage may be a removable storage device (e.g., a floppy disk or an optical drive). The system memory may be a readable and writable storage device or a volatile readable and writable storage device, such as dynamic random access memory, and may store some or all of the instructions and data the processor needs at runtime. In addition, the memory 1010 may include any combination of computer-readable storage media, including various types of semiconductor memory chips (DRAM, SRAM, SDRAM, flash memory, programmable read-only memory); magnetic disks and/or optical disks may also be used. In some embodiments, the memory 1010 may include a readable and/or writable removable storage device, such as a compact disc (CD), a read-only digital versatile disc (e.g., DVD-ROM or dual-layer DVD-ROM), a read-only Blu-ray disc, an ultra-density disc, a flash card (e.g., SD card, mini SD card, Micro-SD card), a magnetic floppy disk, and so on. Computer-readable storage media do not contain carrier waves or transient electronic signals transmitted wirelessly or over wires.
Executable code is stored on the memory 1010; when the executable code is processed by the processor 1020, the processor 1020 can be caused to execute the application management method mentioned above.
The application management scheme according to the present invention has been described in detail above with reference to the accompanying drawings.
Furthermore, the method according to the present invention may also be implemented as a computer program or computer program product comprising computer program code instructions for performing the steps defined in the above method of the present invention.
Alternatively, the present invention may also be implemented as a non-transitory machine-readable storage medium (or computer-readable storage medium, or machine-readable storage medium) on which executable code (or a computer program, or computer instruction code) is stored; when the executable code (or computer program, or computer instruction code) is executed by a processor of an electronic device (or computing device, server, etc.), the processor is caused to perform the steps of the above method according to the present invention.
Those skilled in the art will also appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the disclosure herein may be implemented as electronic hardware, computer software, or combinations of both.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality and operation of possible implementations of systems and methods according to multiple embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two successive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented in dedicated hardware-based systems that perform the specified functions or operations, or in combinations of dedicated hardware and computer instructions.
The embodiments of the present invention have been described above; the foregoing description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, their practical application or improvements over technologies in the market, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (25)

  1. A container-based application management method, comprising:
    configuring a container-based serverless computing system to allow an application instance to be in one of an online state and a low-power state at runtime, wherein the power consumption and/or resource consumption of an application instance in the low-power state is lower than the power consumption and/or resource consumption in the online state;
    in response to scale-in processing being performed on an application, causing at least one first application instance of the application in the online state to enter the low-power state; and
    in response to scale-out processing being performed on the application, causing at least one second application instance of the application in the low-power state to enter the online state.
  2. The method according to claim 1, wherein
    the step of causing at least one first application instance of the application in the online state to enter the low-power state comprises:
    limiting the resource usage quota of the at least one first application instance; and
    lowering the resource configuration of the container in which each of the at least one first application instance resides,
    and/or the step of causing at least one second application instance of the application in the low-power state to enter the online state comprises:
    raising the resource configuration of the container in which each of the at least one second application instance resides; and
    lifting the resource usage quota limit of the at least one second application instance.
  3. The method according to claim 2, wherein the step of causing at least one second application instance of the application in the low-power state to enter the online state comprises:
    preferentially selecting second application instances in containers on machines whose resources are relatively idle among the nodes where the containers reside to enter the online state.
  4. The method according to claim 2, further comprising:
    when an application instance is about to enter the online state but resources on its node are insufficient, migrating the application instance to a relatively idle node based on the live migration capability of Checkpoint/Restore In Userspace (CRIU) technology; and/or
    when a single node holds multiple application instances in the low-power state, making checkpoint snapshots of one or more of the application instances based on CRIU technology, and then restoring the one or more application instances on a relatively idle node from the checkpoint snapshots.
  5. The method according to claim 2, wherein
    the resource usage quota of an application instance is limited, or the limit lifted, based on a control groups mechanism; and/or
    when lowering a container's resource configuration, the resources released by the container hosting the application instance are returned to the scheduling system based on an in-place pod resource upgrade mechanism; and/or
    when raising a container's resource configuration, resources are requested from the scheduling system for the container based on the in-place pod resource upgrade mechanism.
  6. The method according to claim 1, wherein
    the step of causing the at least one first application instance to enter the low-power state comprises: based on a CPU-sharing capability, placing multiple first application instances into a low-power CPU group to run, the application instances in the low-power CPU group sharing CPUs; and/or
    the step of causing the at least one second application instance to enter the online state comprises: removing the second application instances from the low-power CPU group.
  7. The method according to claim 1, wherein
    the step of causing the at least one first application instance to enter the low-power state comprises: placing the memory space occupied by the at least one first application instance within a low-power-instance memory range in memory; and/or
    the step of causing the at least one second application instance to enter the online state comprises: placing the memory space occupied by the at least one second application instance outside the low-power-instance memory range in memory.
  8. The method according to claim 7, wherein the low-power-instance memory range is the range of memory outside the execution range of periodic memory management operations in memory.
  9. The method according to claim 8, wherein the periodic memory management operations include a garbage collection operation and/or a memory release operation that frees memory unused within a predetermined period.
  10. The method according to claim 8, further comprising:
    in response to a range adjustment instruction, adjusting the execution range and/or the size of the execution range; and/or
    when a first application instance occupying memory space enters the low-power state, setting the execution range to exclude the memory space occupied by the first application instance; and/or
    when a first application instance occupying memory space enters the online state, setting the execution range to include the memory space occupied by the first application instance.
  11. The method according to claim 1, wherein the step of causing the at least one first application instance to enter the low-power state comprises:
    shutting down some of the resources used by the first application instance or reducing the usage of those resources, retaining only some of the system resources used by the first application instance.
  12. The method according to claim 1, further comprising:
    transferring the memory data of one or more application instances in the low-power state between memory and a storage device.
  13. The method according to claim 12, wherein the step of transferring the memory data of one or more low-power application instances between memory and a storage device comprises:
    using a kernel-mode memory swap capability to transfer the memory data of the one or more low-power application instances between memory and the storage device; and/or
    using a user-mode memory swap capability to transfer the memory data of the one or more low-power application instances between memory and the storage device, and transferring memory data from different containers to different storage devices or to different pages of a storage device.
  14. The method according to claim 13, wherein the user-mode memory swap capability is used to simultaneously transfer the memory data of multiple low-power application instances from memory to different storage devices or different pages of a storage device, and/or to simultaneously transfer the memory data of multiple low-power application instances from different storage devices or different pages of a storage device into memory.
  15. The method according to claim 13, further comprising:
    receiving a memory swap setting instruction for an application or application instance, the memory swap setting instruction indicating whether the kernel-mode or the user-mode memory swap capability is to be used to perform memory data transfer for the application's instances or for the given application instance;
    in response to the memory swap setting instruction, setting, for the application's instances or the given application instance, the memory swap capability used to perform memory data transfer.
  16. The method according to claim 13, wherein, when the user-mode memory swap capability is used, multiple ultra-performance cloud disk devices are used to construct swap storage devices of equal priority.
  17. The method according to claim 12, wherein the step of transferring the memory data of the one or more application instances in the low-power state between memory and the storage device comprises:
    based on a least-recently-used algorithm, selecting the memory data of one or more application instances in the low-power state and transferring it from memory to the storage device for persistence; and/or
    in response to a swap-back instruction, a traffic request, or a change in instance deployment policy, transferring the memory data of one or more application instances in the low-power state from the storage device back into memory.
  18. The method according to claim 12, wherein, when the memory data of the second application instance has been transferred to the storage device, the step of causing at least one second application instance of the application in the low-power state to enter the online state further comprises:
    transferring the memory data of the second application instance from the storage device into the execution range of periodic memory management operations in memory.
  19. The method according to claim 1, wherein
    scale-in processing is performed on an application in response to a scale-in instruction for the application, or in response to the application's traffic dropping below a first predetermined threshold and becoming a low-traffic application; and/or
    scale-out processing is performed on the application in response to a scale-out instruction for the application, or in response to the application's traffic rising above a second predetermined threshold.
  20. A container-based application management apparatus, deployed in a container-based serverless computing system, the serverless computing system being configured to allow an application instance to be in one of an online state and a low-power state at runtime, wherein the power consumption and/or resource consumption of an application instance in the low-power state is lower than the power consumption and/or resource consumption in the online state, the apparatus comprising:
    a scale-in device that, in response to scale-in processing being performed on an application, causes at least one first application instance of the application in the online state to enter the low-power state; and
    a scale-out device that, in response to scale-out processing being performed on the application, causes at least one second application instance of the application in the low-power state to enter the online state.
  21. The apparatus according to claim 20, wherein the scale-in device comprises:
    a quota limiting device for limiting the resource usage quota of at least one first application instance of a low-traffic application; and
    a configuration lowering device for lowering the resource configuration of the container in which each of the at least one first application instance resides.
  22. The apparatus according to claim 20, wherein the scale-out device comprises:
    a configuration raising device for raising the resource configuration of the container in which each of at least one second application instance among the at least one first application instance resides; and
    a quota restoring device for lifting the resource usage quota limit of the at least one second application instance.
  23. The apparatus according to any one of claims 20 to 22, further comprising:
    a memory swap storage device that transfers the memory data of one or more first application instances among the at least one first application instance between memory and a storage device.
  24. A computing device, comprising:
    a processor; and
    a memory having executable code stored thereon which, when executed by the processor, causes the processor to perform the method according to any one of claims 1 to 19.
  25. A non-transitory machine-readable storage medium having executable code stored thereon which, when executed by a processor of an electronic device, causes the processor to perform the method according to any one of claims 1 to 19.
PCT/CN2021/125158 2020-10-30 2021-10-21 Container-based application management method and apparatus WO2022089281A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP21885008.9A EP4239473A1 (en) 2020-10-30 2021-10-21 Container-based application management method and apparatus
US18/141,230 US20230266814A1 (en) 2020-10-30 2023-04-28 Container-based application management method and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011194163.X 2020-10-30
CN202011194163.XA 2020-10-30 2020-10-30 Container-based application management method and apparatus

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/141,230 Continuation US20230266814A1 (en) 2020-10-30 2023-04-28 Container-based application management method and apparatus

Publications (1)

Publication Number Publication Date
WO2022089281A1 true WO2022089281A1 (zh) 2022-05-05

Family

ID=77318371

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/125158 WO2022089281A1 (zh) 2020-10-30 2021-10-21 Container-based application management method and apparatus

Country Status (4)

Country Link
US (1) US20230266814A1 (zh)
EP (1) EP4239473A1 (zh)
CN (1) CN113296880A (zh)
WO (1) WO2022089281A1 (zh)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113296880A (zh) * 2020-10-30 2021-08-24 阿里巴巴集团控股有限公司 基于容器的应用管理方法和装置
CN114598706B (zh) * 2022-03-08 2023-05-16 中南大学 基于Serverless函数的存储系统弹性伸缩方法
CN114840450B (zh) * 2022-07-04 2022-11-18 荣耀终端有限公司 一种存储空间整理方法及电子设备
CN116010030A (zh) * 2022-12-30 2023-04-25 支付宝(杭州)信息技术有限公司 用于容器扩缩容的方法及装置
CN115858046B (zh) * 2023-02-28 2023-07-21 荣耀终端有限公司 一种预加载内存页的方法、电子设备及芯片系统
CN116319324B (zh) * 2023-05-23 2023-08-04 天津市亿人科技发展有限公司 一种基于sd-wan技术在arm芯片架构上的低功耗实现方法

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108076082A (zh) * 2016-11-09 2018-05-25 阿里巴巴集团控股有限公司 一种应用集群的扩容方法、装置和系统
CN108469982B (zh) * 2018-03-12 2021-03-26 华中科技大学 一种容器在线迁移方法
US10860444B2 (en) * 2018-07-30 2020-12-08 EMC IP Holding Company LLC Seamless mobility for kubernetes based stateful pods using moving target defense

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105245617A (zh) * 2015-10-27 2016-01-13 江苏电力信息技术有限公司 一种基于容器的服务器资源供给方法
EP3396543A1 (en) * 2017-04-26 2018-10-31 Nokia Solutions and Networks Oy Method to allocate/deallocate resources on a platform, platform and computer readable medium
CN110990119A (zh) * 2019-12-23 2020-04-10 中通服咨询设计研究院有限公司 一种基于容器技术提升Iaas云平台服务能力的方法
CN111611086A (zh) * 2020-05-28 2020-09-01 中国工商银行股份有限公司 信息处理方法、装置、电子设备和介质
CN111786904A (zh) * 2020-07-07 2020-10-16 上海道客网络科技有限公司 一种实现容器休眠与唤醒的系统及休眠与唤醒方法
CN113296880A (zh) * 2020-10-30 2021-08-24 阿里巴巴集团控股有限公司 基于容器的应用管理方法和装置

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023245366A1 (zh) * 2022-06-20 2023-12-28 北京小米移动软件有限公司 应用管理方法、装置、电子设备以及存储介质
CN115858087A (zh) * 2022-11-08 2023-03-28 瀚博创芯半导体(成都)有限公司 在云计算系统中部署多个进程的方法、装置、设备及介质
CN115858087B (zh) * 2022-11-08 2023-07-18 瀚博创芯半导体(成都)有限公司 在云计算系统中部署多个进程的方法、装置、设备及介质
CN116302339A (zh) * 2023-03-09 2023-06-23 上海道客网络科技有限公司 一种基于容器云平台的容器组原地扩缩容的方法和系统

Also Published As

Publication number Publication date
US20230266814A1 (en) 2023-08-24
CN113296880A (zh) 2021-08-24
EP4239473A1 (en) 2023-09-06

Similar Documents

Publication Publication Date Title
WO2022089281A1 (zh) 2022-05-05 Container-based application management method and apparatus
US9824011B2 (en) Method and apparatus for processing data and computer system
US10552337B2 (en) Memory management and device
CN110597451B (zh) 一种虚拟化缓存的实现方法及物理机
CN110795206B (zh) 用于促进集群级缓存和内存空间的系统和方法
CA2858109C (en) Working set swapping using a sequentially ordered swap file
US10282292B2 (en) Cluster-based migration in a multi-level memory hierarchy
JP6412244B2 (ja) 負荷に基づく動的統合
US10909072B2 (en) Key value store snapshot in a distributed memory object architecture
JP4902501B2 (ja) 電力制御方法、計算機システム、及びプログラム
US20160266923A1 (en) Information processing system and method for controlling information processing system
KR20120096489A (ko) 가상 스토리지 이주 방법, 가상 스토리지 이주 시스템 및 가상 머신 모니터
US20170344298A1 (en) Application aware memory resource management
WO2012069232A1 (en) Managing compressed memory using tiered interrupts
US9904639B2 (en) Interconnection fabric switching apparatus capable of dynamically allocating resources according to workload and method therefor
US20140089562A1 (en) Efficient i/o processing in storage system
CN115543530A (zh) 一种虚拟机迁移方法以及相关装置
JP6028415B2 (ja) 仮想サーバ環境のデータ移行制御装置、方法、システム
US10521371B2 (en) Cache system and associated method
CN106844417B (zh) 基于文件目录的热迁移方法及装置
CN114063894A (zh) 一种协程执行方法及装置
JP4792065B2 (ja) データ記憶方法
JP2005004282A (ja) ディスクアレイ装置、ディスクアレイ装置の管理方法及び管理プログラム
US10992751B1 (en) Selective storage of a dataset on a data storage device that is directly attached to a network switch
JP5382471B2 (ja) 電力制御方法、計算機システム、及びプログラム

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21885008

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2021885008

Country of ref document: EP

Effective date: 20230530