WO2021262054A1 - Method for controlling deployment of cached dependencies on one or more selected nodes in a distributed computing system - Google Patents
Method for controlling deployment of cached dependencies on one or more selected nodes in a distributed computing system
- Publication number
- WO2021262054A1 (PCT/SE2020/050665)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- nodes
- dependencies
- estimated
- time
- cached
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/52—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity ; Preventing unwanted data erasure; Buffer overflow
- G06F21/53—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity ; Preventing unwanted data erasure; Buffer overflow by executing in a restricted environment, e.g. sandbox or secure virtual machine
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/60—Software deployment
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/5055—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering software capabilities, i.e. software resources associated or available to the machine
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5083—Techniques for rebalancing the load in a distributed system
Definitions
- the present invention relates to a method for deploying cached dependencies.
- a method for deploying cached dependencies in a distributed computing system such as a cloud computing system.
- a central cloud is connected to a communications network and provides applications, services and tasks to devices coupled to and/or connected to the communications network.
- Examples of applications or tasks with strict latency requirements are control of autonomous vehicles and/or autonomous processing plants.
- Applications running in an autonomous vehicle may e.g. include adaptive cruise control, which monitors road signs and automatically controls the throttle and brakes of the vehicle.
- Typical properties of such applications are that the applications are relatively short-lived and use resources for a relatively short period of time.
- a challenge when controlling execution of such applications is that the time it takes to set up the runtime environment and/or schedule/identify processing resources in the distributed computing system may exceed the time it takes to run the application.
- the completion time of the application is dominated by the time it takes to initially launch the application and queuing for the application to be executed.
- the time to execute the application is relatively short in comparison to the time spent launching and queueing.
- the objects of the invention are achieved by a computer implemented method performed by a computer configured to control deployment of cached dependencies on one or more selected nodes in a distributed computing system, wherein the dependencies are software components upon which task software depends, wherein the computer selects the one or more nodes where the cached dependencies are to be deployed using an estimated completion time of the task software, wherein the completion time is estimated using an estimated launch time, an estimated scheduling time and an estimated execution time of the task software.
- the method comprises obtaining deployment input data, determining a number of nodes in the distributed computing system where the dependencies are to be deployed by evaluating the deployment input data, wherein the deployment input data is evaluated by at least comparing estimated launch times of nodes where the dependencies need to be cached to estimated scheduling times of nodes where the dependencies are already cached, selecting a set of nodes in the distributed computing system, wherein the selected set comprises a number of nodes corresponding to the determined number of nodes, and controlling deployment of dependencies on the selected set of nodes.
- the advantage of the first aspect is at least that completion times of tasks/applications are reduced.
- the objects of the invention are achieved by a computer configured to control deployment of cached dependencies on one or more selected nodes in a distributed computing system, wherein the dependencies are software components upon which task software depends, wherein the computer selects the one or more nodes where the cached dependencies are to be deployed using an estimated completion time of the task software, wherein the completion time is estimated using at least an estimated launch time, an estimated scheduling time and an estimated execution time of the task software, the computer further comprising processing circuitry, a memory comprising instructions executable by the processing circuitry, causing the processing circuitry to perform the method according to the first aspect.
- Fig. 1 illustrates a distributed computing system according to one or more embodiments of the present disclosure.
- Figs. 2A-B illustrate a dependency deployment scenario according to one or more embodiments of the present disclosure.
- Fig. 3 shows a flowchart of a computer implemented method according to one or more embodiments of the present disclosure.
- Fig. 4 shows a flowchart of a use case embodiment of a computer implemented method according to one or more embodiments of the present disclosure.
- Fig. 5 shows details of a computer according to one or more embodiments of the present disclosure.
- the term “application” is used to indicate a set of instructions forming a particular software and the term “task” to indicate instances of the application executing on processing circuitry, e.g. in a container instance.
- dependencies is used to denote all software components required to generate a runtime environment required by the application/task, such as libraries, data, and configuration files.
- deployment of cached dependencies denotes the act of providing software components typically from one node to another node, e.g. from a node comprising a dependency repository to a node on which a particular task/application may be scheduled.
- the present disclosure relates to deploying dependencies in a distributed computing system. In other words, software component deployment can be seen as a part of the field of software scheduling in a distributed computing system.
- distributed computing system denotes a system comprising a plurality of physically separate or virtualized computers where partial results or calculations are generated by different applications and/or services executing in different runtime environments, optionally different runtime environments on different nodes, e.g. a cloud computing network.
- application completion time is used interchangeably to denote the total time required to setup, schedule and execute a task until it has completed. I.e. until the execution of the task finishes.
- launch time is used to denote the time required to setup a runtime environment with all dependencies required by the task. This may e.g. involve obtaining or retrieving dependencies, such as data and container images/image files for the container instance/s to be started and to start all necessary containers instances for the task to run/execute.
- scheduling time is used to denote the time required to allocate processing resources to the task, e.g. queue time in a scheduling queue before execution of the task at a node in the distributed computing system.
- execution time is used to denote the time required to execute the task until it has completed. In other words, typically until the execution of the task finishes when all instructions have been processed.
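- As a minimal illustration of how the three terms above combine into the completion time (a sketch; the function and its example values are illustrative assumptions, not part of the disclosure):

```python
# Minimal sketch of the completion time decomposition described above:
# total time = launch time + scheduling time + execution time.

def completion_time(launch_s: float, scheduling_s: float, execution_s: float) -> float:
    """Total time from task submission until its execution finishes."""
    return launch_s + scheduling_s + execution_s

# Example: a short-lived edge task where launch and queueing dominate.
print(completion_time(launch_s=8.0, scheduling_s=15.0, execution_s=2.0))  # 25.0
```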
- the present disclosure relates, in particular, to distributed systems operating in virtualized environments, such as systems providing cloud computing services.
- edge cloud strategies are used to reduce latency in the system.
- Many applications or tasks at the edge cloud are mainly short-lived applications/tasks/functions with low latency requirements. They often follow the Function as a Service (FaaS) model, where functions are often run within a container instance, with startup/launch times typically varying from 100 milliseconds to a minute, depending on whether the parent container and its dependencies and libraries are already local or not.
- the typically layered file system of the containers allows a fine control over the container’s dependencies. In other words, knowledge of the dependencies or software components required by the runtime environment of the task, e.g. different software libraries.
- as the edge cloud applications/tasks are short-lived, they generally have a short completion time period.
- the dependency-aware schedulers try to reduce task launch time by placing the tasks at a node that has the maximum number of dependencies available. To reduce the launch time, each node would cache previously used dependencies, such as images, and schedule the task where all, or a subset of, task dependencies are available.
- Non-optimal use of resources: for a given workload, there is a high risk that some nodes become more popular than others. This will lead to over-utilization of some nodes and under-utilization of others, depending on what is cached on each node.
- the method executed by the computer constantly calculates a suitable, or at least near-optimal, tradeoff between the scheduling queue time and the dependency download time.
- the controller makes this tradeoff by deciding if it is beneficial to adapt the number of nodes where required dependencies are cached, e.g. by dropping cached dependencies or pre-fetching and caching required dependencies in additional nodes, or if it is beneficial to only consider nodes for scheduling where all the required dependencies are already cached.
- the computer should also plan how to distribute the dependencies over the nodes in the distributed computing system to reduce the risk of prolonged completion times, e.g. resulting from long scheduling queues or container instance interference. In other words, predicting a required distribution of dependencies for an upcoming workload of the distributed computing system, e.g. predict an expected task arrival pattern.
- the present disclosure solves this by providing a method and a controller/computer which monitors the incoming workload of the distributed computing system, decides which dependencies to cache and where to cache them, and then dynamically adjusts the cache size of nodes and the number of nodes that cache the dependencies. This is done to decrease the risk of non-optimal and skewed utilization of processing resources and to improve the scheduling time, e.g. by reducing the overall scheduling queue length, the scheduling queue length of popular nodes and/or the scheduling queue length of particular applications/tasks.
- nodes being popular in the sense that the cached dependencies of popular nodes are required by many tasks in the upcoming/future workload of the distributed computing system.
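- A hypothetical sketch of this monitoring-and-deployment loop is shown below; all names (node.status(), predict_arrivals and so on) are assumptions for illustration, not an API defined by the disclosure.

```python
# Hypothetical controller cycle: monitor, predict, decide, deploy.

def control_cycle(nodes, predict_arrivals, decide_placement, deploy):
    """One monitoring cycle of the dependencies controller."""
    # 1. Monitor node state: queue lengths, cached dependencies, cache sizes.
    state = {node.name: node.status() for node in nodes}
    # 2. Predict the upcoming workload (expected task arrival pattern).
    expected_tasks = predict_arrivals()
    # 3. Decide which dependencies to cache, on which and how many nodes,
    #    and what cache size each node should allocate.
    plan = decide_placement(state, expected_tasks)
    # 4. Deploy (pre-fetch) or drop cached dependencies according to the plan.
    deploy(plan)
```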
- Fig. 1 illustrates a distributed computing system 100 according to one or more embodiments of the present disclosure.
- the system 100 comprises one or more nodes N1-NM in a distributed computing system 100, each node comprising a local dependency cache LIS1-LIS3, e.g. a local library repository or cache.
- the nodes are communicatively coupled, optionally via a communications network (not shown).
- the distributed computing system 100 further comprises a repository node 120 configured to store dependencies required by tasks executing or to be executed in the system, e.g. in a dependency repository comprised in the repository node 120.
- the distributed computing system 100 further comprises a dependencies controller/computer 110 configured to control deployment of cached dependencies on one or more selected nodes in the distributed computing system 100.
- the controller 110 may optionally distribute its functionality over functional modules 1101-1103, such as a load prediction module 1101, a monitoring module 1102 and a control logic module 1103.
- the load prediction module 1101 is configured to predict load of the nodes N1-NM using a predicted task arrival pattern, e.g. obtained from an orchestrator node (not shown).
- the monitoring module 1102 is configured to monitor the nodes N1-NM, e.g. monitor status of cached libraries, status of scheduling queues and allocated cache sizes of nodes comprised in the distributed computing system 100.
- the control logic module 1103 is configured to control caching of dependencies in the nodes N1- NM, i.e. to control deployment of dependencies in the nodes N1-NM.
- Each of the nodes N1-NM may further comprise modules such as lookup proxy modules, container manager modules (not shown in the image).
- the container manager modules are typically configured to manage one or more container instances where tasks/applications can execute.
- the container manager modules typically obtain or retrieve dependencies, such as container images/image files for the container instance/s to be started, from the repository node 120, which is configured to store dependencies, such as images/image files, for runtime environments and/or container instances.
- the distributed computing system 100 may further comprise a container orchestration node (not shown) configured to control the different container managers to start all necessary containers instances for the task to run.
- Figs. 2A-B illustrate a dependency deployment scenario according to one or more embodiments of the present disclosure.
- in Fig. 2A, an initial state of a first node N1 is shown.
- the dependencies D1-D3 required by application App1 are already cached and readily available.
- application behavior of instances/tasks of App1 is monitored, e.g. incoming task workload and scheduling times of the tasks. Further, each node, e.g. N1, is monitored to determine properties of the node, e.g. scheduling queue length, number of dependencies cached, cache capacity and size of cached dependencies.
- an estimated scheduling time S1 of node N1 for App1, where the required dependencies D1-D3 are already cached, can then be generated, e.g. by multiplying the monitored scheduling queue length of node N1 with application App1's estimated execution time App1_exec_time.
- a launch time L2 of the application App1 for an additional node N2, where the dependencies D1-D3 need to be cached, is then estimated, e.g. by considering the monitored size of dependencies D1-D3 and the average bandwidth in the communication network communicatively coupling the second node N2 to a dependency repository node.
- the estimated scheduling time S1 of node N1 for instances of App1 is then compared to the estimated launch time L2 for an instance of the application App1 for the additional node N2.
- the first completion time T1 comprises a relatively short launch time and a relatively long scheduling time.
- the second completion time T2 comprises a relatively long launch time and a relatively short scheduling time.
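- The tradeoff of Figs. 2A-B can be sketched as below, assuming the scheduling time is approximated by queue length times execution time and the launch time by dependency size over bandwidth; the concrete numbers are invented for illustration.

```python
# Compare scheduling on N1 (dependencies cached, but queued) against
# launching on N2 (idle, but dependencies D1-D3 must first be downloaded).

def estimated_scheduling_time(queue_length: int, exec_time_s: float) -> float:
    """S1: tasks queued ahead of us, each taking roughly exec_time_s."""
    return queue_length * exec_time_s

def estimated_launch_time(dependency_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """L2: time to fetch the missing dependencies over the network."""
    return dependency_bytes / bandwidth_bytes_per_s

S1 = estimated_scheduling_time(queue_length=15, exec_time_s=2.0)         # 30 s
L2 = estimated_launch_time(dependency_bytes=300e6,
                           bandwidth_bytes_per_s=100e6 / 8)              # 24 s

if L2 < S1:
    print("Cache D1-D3 on the additional node N2")  # T2 < T1 as in Fig. 2B
else:
    print("Keep scheduling App1 on node N1 only")
```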
- Fig. 3 shows a flowchart of a computer implemented method according to one or more embodiments of the present disclosure.
- the computer implemented method is performed by a computer 110 configured to control deployment of cached dependencies on one or more selected nodes N1-NM in a distributed computing system 100.
- the dependencies are software components upon which the task software depends.
- the computer 110 selects the one or more nodes where the cached dependencies are to be deployed using an estimated completion time of the task software, where the completion time is estimated using an estimated launch time, an estimated scheduling time and an estimated execution time of the task software.
- the method comprises:
- Step 310 obtaining deployment input data.
- the step of obtaining deployment input data comprises: monitoring behavior of the task software by estimating a total offered workload of the distributed computing system as an expected pattern of arriving task software, retrieving a mapping indicative of dependencies on which the task software of the expected pattern of arriving task software depends, estimating a total scheduling time of already scheduled instances of the task software, monitoring each node of the distributed computing system by measuring scheduling queue length for the task software of each node, measuring a total scheduling queue length for all task software comprised in the expected pattern of arriving task software of each node, measuring number of cached dependencies of each node, measuring size of cached dependencies of each node, measuring cache resources of each node and predicting a number of instances of the task software to be started.
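- One possible shape of this deployment input data is sketched below; the class and field names are illustrative assumptions only, not defined by the disclosure.

```python
# Hypothetical container for the deployment input data enumerated above.
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class DeploymentInputData:
    expected_arrivals: Dict[str, int]          # predicted tasks per application
    dependency_map: Dict[str, List[str]]       # application -> required dependencies
    scheduled_time_s: Dict[str, float]         # estimated scheduling time of already scheduled instances
    queue_length: Dict[str, int]               # per-node scheduling queue length per task software
    total_queue_length: Dict[str, int]         # per-node queue length over all expected task software
    cached_dependencies: Dict[str, List[str]]  # per-node names of cached dependencies
    cached_size_bytes: Dict[str, int]          # per-node size of cached dependencies
    cache_capacity_bytes: Dict[str, int]       # per-node available cache resources
    predicted_instances: Dict[str, int]        # predicted number of instances to be started
```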
- Step 320 determining a number of nodes in the distributed computing system where the dependencies are to be deployed by evaluating the deployment input data.
- the deployment input data is evaluated by at least comparing estimated launch times of nodes where the dependencies need to be cached to estimated scheduling times of nodes where the dependencies are already cached.
- the number of nodes is determined according to:
- N_new = ((App1_queue_length + App1_estimated) * App1_exec_time) / (Tasks_per_node * App1_expected_completion_time) - N_app1_curr
- N_new is the determined number of nodes
- App1_queue_length is the average scheduling queue length over the selected nodes (giving an estimated scheduling time when multiplied by the execution time)
- App1_estimated is an estimated number of task software instances to be deployed
- App1_exec_time is the estimated execution time of the task software
- Tasks_per_node is the number of task instances that can be executed in parallel on a single node
- App1_expected_completion_time is the estimated completion time of the task software
- N_app1_curr is a current/previously determined number of nodes used to schedule task software.
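- The formula can be transcribed directly as below; the clamping and rounding of the result are assumptions, since the disclosure does not specify how fractional or negative values are handled.

```python
import math

def extra_nodes_needed(queue_length: float,
                       estimated_new_tasks: float,
                       exec_time_s: float,
                       tasks_per_node: int,
                       expected_completion_s: float,
                       current_nodes: int) -> int:
    """N_new from the formula above, rounded up and clamped at zero."""
    demand = (queue_length + estimated_new_tasks) * exec_time_s
    capacity_per_node = tasks_per_node * expected_completion_s
    return max(0, math.ceil(demand / capacity_per_node - current_nodes))
```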
- Step 330 selecting a set of nodes in the distributed computing system 100, where the selected set comprises a number of nodes corresponding to the determined number of nodes.
- selecting the set of nodes comprises: mapping required dependencies of the task software to dependencies that are already cached, for each node of the distributed computing system, identifying a first set of nodes in the distributed computing system where all the required dependencies are already cached, identifying a second set of nodes where additional dependencies need to be cached, and selecting the set of nodes by selecting the first set of nodes and selecting the second set of nodes.
- the second set of nodes may e.g. be identified by calculating a number of nodes as the difference between the determined number of nodes and the number of nodes in the first set of nodes. A corresponding calculated number of nodes may then be selected, e.g. randomly from the total set of nodes in the distributed computing system.
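- A sketch of this selection step, including the random choice of additional nodes mentioned above, is shown below; all names are illustrative assumptions.

```python
import random

def select_nodes(all_nodes, cached_deps, required_deps, n_determined):
    """Split the determined number of nodes into a first set (all required
    dependencies already cached) and a second set (must still cache them)."""
    first_set = [n for n in all_nodes
                 if set(required_deps) <= set(cached_deps.get(n, []))]
    # If more nodes have the dependencies cached than needed, the first set
    # is reduced to match the determined number of nodes.
    first_set = first_set[:n_determined]
    missing = n_determined - len(first_set)
    remaining = [n for n in all_nodes if n not in first_set]
    # The extra nodes are picked randomly, as one option mentioned above.
    second_set = random.sample(remaining, min(missing, len(remaining)))
    return first_set, second_set
```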
- the method may then reduce the number of nodes caching required dependencies to save cache capacity.
- the method further comprises reducing the number of nodes in the first set of nodes to match the determined number of nodes.
- the method may then increase the number of nodes caching required dependencies.
- the method further comprises: identifying dependencies that need to be cached for each of the second set of nodes, and controlling deployment of dependencies by caching the dependencies that need to be cached, for each of the second set of nodes.
- Step 340 controlling deployment of dependencies on the selected set of nodes.
- the dependencies comprise runtime environment images.
- a computer is provided and is configured to control deployment of cached dependencies on one or more selected nodes in a distributed computing system
- the dependencies are software components upon which task software depends
- the computer selects the one or more nodes where the cached dependencies are to be deployed using an estimated completion time of the task software, wherein the completion time is estimated using at least an estimated launch time, an estimated scheduling time and an estimated execution time of the task software
- the computer further comprising processing circuitry, a memory comprising instructions executable by the processing circuitry, causing the processing circuitry to perform any of the method steps of the method described herein.
- Fig. 4 shows a flowchart of a use case embodiment of a computer implemented method 400 according to one or more embodiments of the present disclosure.
- the disclosed method and computer monitor and keep track of the following information:
- the incoming workload to a computer configured to control deployment of cached dependencies on one or more selected nodes in a distributed computing system, e.g. the type of applications and their workload arriving to the computer. This can be used to predict a future task arrival pattern (and therefore future required dependencies, e.g. libraries, over time)
- the controller continuously monitors the system, predicts the incoming load per application and checks whether it needs to pre-fetch and download the application's dependencies on extra nodes to reduce the scheduling time. This is illustrated by the steps "Monitoring cycle starts", "Monitor the application behavior", "Monitor each node" and "Predict the number of incoming tasks for App1" in Fig. 4.
- the computer 110 also continuously checks and calculates the trade-off between the task launch time and the time to wait in the scheduling queue. If the cost of waiting in the scheduling queue is larger than the time it takes to add more nodes and download required dependencies, such as libraries, on extra nodes, the controller would add nodes and download or pre-fetch the images on the new/additional nodes. To avoid hectic changes, the computer 110 may use the gain factor C to calculate the trade-off.
- the C value may e.g. be configured by the system admin. The higher the C value, the greater the tendency to consolidate applications on fewer nodes, which may make sense if the cluster is already fully utilized; the lower the C value, the greater the flexibility to add new nodes, which makes sense where the workload is skewed.
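- One way such a gain factor C could enter the trade-off check is sketched below; the cost model and default value are assumptions, not prescribed by the disclosure.

```python
def should_add_nodes(queue_wait_s: float, download_time_s: float, c: float = 1.5) -> bool:
    """Add and pre-fetch on extra nodes only when the queueing cost clearly
    exceeds the cost of downloading the dependencies elsewhere."""
    return queue_wait_s > c * download_time_s

print(should_add_nodes(queue_wait_s=40.0, download_time_s=20.0, c=1.5))  # True
print(should_add_nodes(queue_wait_s=40.0, download_time_s=20.0, c=2.5))  # False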
- the controller also predicts the future incoming number of tasks. This can lead to more durable decisions.
- the number of extra nodes to serve each application is calculated as follows:
- N_new = ((App1_queue_length + App1_estimated) * App1_exec_time) / (Tasks_per_node * App1_expected_completion_time) - N_app1_curr
- N_new is the number of extra nodes required to cache the dependencies for App1, so that the application can be executed before its expected completion time.
- App1_queue_length is the average length of the queue over the nodes where the application is deployed (the assumption is that the scheduler load-balances the tasks over the nodes in an efficient manner).
- App1_estimated is the predicted number of new tasks, App1_exec_time is the application's estimated execution time, Tasks_per_node is the number of parallel tasks that can be packed on a single node, App1_expected_completion_time is the application's expected completion time, and N_app1_curr is the current number of nodes that already have the application's dependencies.
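- As a worked example with assumed values: if App1_queue_length = 20, App1_estimated = 30 new tasks, App1_exec_time = 2 s, Tasks_per_node = 4, App1_expected_completion_time = 5 s and N_app1_curr = 3, then N_new = ((20 + 30) * 2) / (4 * 5) - 3 = 100 / 20 - 3 = 2 extra nodes should be provisioned with App1's dependencies.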
- the controller tracks the most used container layers. It caches the minimum set of common layers to improve the hit rate while minimizing the local storage consumption.
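- One way to realize this layer selection is a greedy most-used-first heuristic, sketched below under the assumption that layer usage counts and sizes are known; the disclosure does not prescribe a specific algorithm.

```python
from collections import Counter

def layers_to_cache(task_layer_lists, layer_size_bytes, budget_bytes):
    """Pick the most frequently used container layers that fit the budget."""
    usage = Counter(layer for layers in task_layer_lists for layer in layers)
    chosen, used = [], 0
    for layer, _count in usage.most_common():
        size = layer_size_bytes[layer]
        if used + size <= budget_bytes:
            chosen.append(layer)
            used += size
    return chosen

# Example: two tasks share a base layer, which is therefore cached first.
print(layers_to_cache([["base", "libA"], ["base", "libB"]],
                      {"base": 50, "libA": 40, "libB": 40}, budget_bytes=100))
```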
- the controller looks at the total queue length for all applications per node, and the expected completion time.
- Fig. 5 shows details of a node/computer/computer device 110 according to one or more embodiments of the present disclosure.
- the repository node 120 and the optional orchestrator node comprise all or at least a part of the features of the computer device 110 described below.
- the computer device 110 may be in the form of a selection of any of a network node, a desktop computer, a server, a laptop, a mobile device, a smartphone, a tablet computer, a smart watch etc.
- the computer device 110 may comprise processing circuitry 1012.
- the computer device 110 may optionally comprise a communications interface 1004 for wired and/or wireless communication. The computer device 110 may further comprise at least one optional antenna (not shown in figure).
- the antenna may be coupled to a transceiver of the communications interface 1004 and is configured to transmit and/or emit and/or receive wireless signals, e.g. in a wireless communication system.
- the processing circuitry 1012 may be any of a selection of a processor and/or a central processing unit and/or processor modules and/or multiple processors configured to cooperate with each other.
- the computer device 110 may further comprise a memory 1015.
- the memory 1015 may contain instructions executable by the processing circuitry 1012, that when executed causes the processing circuitry 1012 to perform any of the methods and/or method steps described herein.
- the communications interface 1004, e.g. a wireless transceiver and/or a wired/wireless communications network adapter, is configured to send and/or receive data values or parameters as a signal between the processing circuitry 1012 and other external nodes
- the communications interface 1004 communicates directly between nodes or via a communications network.
- the computer device 110 may further comprise an input device 1017, configured to receive input or indications from a user and send a user-input signal indicative of the user input or indications to the processing circuitry 1012.
- the computer device 110 may further comprise a display 1018 configured to receive a display signal indicative of rendered objects, such as text or graphical user input objects, from the processing circuitry 1012 and to display the received signal as objects, such as text or graphical user input objects.
- the display 1018 is integrated with the user input device 1017 and is configured to receive a display signal indicative of rendered objects, such as text or graphical user input objects, from the processing circuitry 1012 and to display the received signal as objects, such as text or graphical user input objects, and/or configured to receive input or indications from a user and send a user-input signal indicative of the user input or indications to the processing circuitry 1012.
- the computer device 110 may further comprise one or more sensors 1019.
- the processing circuitry 1012 is communicatively coupled to the memory 1015 and/or the communications interface 1004 and/or the input device 1017 and/or the display 1018 and/or the one or more sensors 1019.
- the communications interface and/or transceiver 1004 communicates using wired and/or wireless communication techniques.
- the one or more memory 1015 may comprise a selection of RAM, a hard disk drive, a floppy disk drive, a magnetic tape drive, an optical disk drive, a CD or DVD drive (R or RW), or other removable or fixed media drive.
- the computer device 110 may further comprise and/or be coupled to one or more additional sensors (not shown) configured to receive and/or obtain and/or measure physical properties pertaining to the computer device or the environment of the computer device, and send one or more sensor signals indicative of the physical properties to the processing circuitry 1012.
- the processing circuitry 1012 may further comprise and/or be coupled to one or more additional sensors (not shown) configured to receive and/or obtain and/or measure physical properties pertaining to the computer device or the environment of the computer device, and send one or more sensor signals indicative of the physical properties to the processing circuitry 1012.
- a computer device comprises any suitable combination of hardware and/or software needed to perform the tasks, features, functions, and methods disclosed herein.
- although the components of the computer device are depicted as single boxes located within a larger box, or nested within multiple boxes, in practice a computer device may comprise multiple different physical components that make up a single illustrated component (e.g., memory 1015 may comprise multiple separate hard drives as well as multiple RAM modules).
- the computer device 110 may be composed of multiple physically separate components, which may each have their own respective components.
- the communications interface 1004 may also include multiple sets of various illustrated components for different wireless technologies, such as, for example, GSM, WCDMA, LTE, NR, Wi-Fi, or Bluetooth wireless technologies. These wireless technologies may be integrated into the same or different chip or set of chips and other components within the computer device 110.
- Processing circuitry 1012 is configured to perform any determining, calculating, or similar operations (e.g., certain obtaining operations) described herein as being provided by a computer device 110. These operations performed by processing circuitry 1012 may include processing information obtained by processing circuitry 1012 by, for example, converting the obtained information into other information, comparing the obtained information or converted information to information stored in the network node, and/or performing one or more operations based on the obtained information or converted information, and as a result of said processing making a determination.
- Processing circuitry 1012 may comprise a combination of one or more of a microprocessor, controller, microcontroller, central processing unit, digital signal processor, application-specific integrated circuit, field programmable gate array, or any other suitable computing device, resource, or combination of hardware, software and/or encoded logic operable to provide, either alone or in conjunction with other computer device 110 components, such as device readable medium, computer 110 functionality.
- processing circuitry 1012 may execute instructions stored in device readable medium 1015 or in memory within processing circuitry 1012. Such functionality may include providing any of the various wireless features, functions, or benefits discussed herein.
- processing circuitry 1012 may include a system on a chip.
- processing circuitry 1012 may include one or more of radio frequency, RF, transceiver circuitry and baseband processing circuitry.
- RF transceiver circuitry and baseband processing circuitry may be on separate chips or sets of chips, boards, or units, such as radio units and digital units.
- part or all of RF transceiver circuitry and baseband processing circuitry may be on the same chip or set of chips, boards, or units
- the functionality described as being provided by processing circuitry 1012 may be performed by the processing circuitry 1012 executing instructions stored on device readable medium 1015 or memory within processing circuitry 1012. In alternative embodiments, some or all the functionality may be provided by processing circuitry 1012 without executing instructions stored on a separate or discrete device readable medium, such as in a hard-wired manner. In any of those embodiments, whether executing instructions stored on a device readable storage medium or not, processing circuitry 1012 can be configured to perform the described functionality. The benefits provided by such functionality are not limited to processing circuitry 1012 alone or to other components of computer device 110, but are enjoyed by computer device 110 as a whole, and/or by end users.
- Device readable medium or memory 1015 may comprise any form of volatile or non-volatile computer readable memory including, without limitation, persistent storage, solid-state memory, remotely mounted memory, magnetic media, optical media, random access memory (RAM), read-only memory (ROM), mass storage media (for example, a hard disk), removable storage media (for example, a flash drive, a Compact Disk (CD) or a Digital Video Disk (DVD)), and/or any other volatile or non-volatile, non-transitory device readable and/or computer-executable memory devices that store information, data, and/or instructions that may be used by processing circuitry 1012.
- Device readable medium 1015 may store any suitable instructions, data or information, including a computer program, software, an application including one or more of logic, rules, code, tables, etc. and/or other instructions capable of being executed by processing circuitry 1012 and utilized by computer device 110.
- Device readable medium 1015 may be used to store any calculations made by processing circuitry 1012 and/or any data received via interface 1004.
- processing circuitry 1012 and device readable medium 1015 may be integrated.
- the communications interface 1004 is used in the wired or wireless communication of signaling and/or data between computer device 110 and other nodes.
- Interface 1004 may comprise port(s)/terminal(s) to send and receive data, for example to and from computer device 110 over a wired connection.
- Interface 1004 also includes radio front end circuitry that may be coupled to, or in certain embodiments a part of, an antenna. Radio front end circuitry may comprise filters and amplifiers. Radio front end circuitry may be connected to the antenna and/or processing circuitry 1012.
- Examples of a computer device 110 include, but are not limited to, an edge cloud node, a smart phone, a mobile phone, a cell phone, a voice over IP (VoIP) phone, a wireless local loop phone, a tablet computer, a desktop computer, a personal digital assistant (PDA), a wireless camera, a gaming console or device, a music storage device, a playback appliance, a wearable terminal device, a wireless endpoint, a mobile station, a tablet, a laptop, a laptop-embedded equipment (LEE), a laptop-mounted equipment (LME), a smart device, a wireless customer-premise equipment (CPE), a vehicle-mounted wireless terminal device, etc.
- the communication interface 1004 may encompass wired and/or wireless networks such as a local-area network (LAN), a wide-area network (WAN), a computer network, a wireless network, a telecommunications network, another like network or any combination thereof.
- the communication interface may be configured to include a receiver and a transmitter interface used to communicate with one or more other devices over a communication network according to one or more communication protocols, such as Ethernet, TCP/IP, SONET, ATM, optical, electrical, and the like.
- the transmitter and receiver interface may share circuit components, software, or firmware, or alternatively may be implemented separately.
- a computer node 110 is provided and is configured to perform any of the method steps described herein.
- a computer 110 configured to control deployment of cached dependencies on one or more selected nodes in a distributed computing system, wherein the dependencies are software components upon which task software depends, wherein the computer selects the one or more nodes where the cached dependencies are to be deployed using an estimated completion time of the task software, wherein the completion time is estimated using an estimated launch time, an estimated scheduling time and an estimated execution time of the task software, the computer 110 further comprising: processing circuitry 1012, a memory 1015 comprising instructions executable by the processing circuitry 1012, causing the processing circuitry 1012 to perform any of the method steps described herein.
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Security & Cryptography (AREA)
- Computer Hardware Design (AREA)
- Stored Programmes (AREA)
Abstract
The present disclosure relates to a computer implemented method performed by a computer configured to control deployment of cached dependencies on one or more selected nodes in a distributed computing system, wherein the dependencies are software components upon which task software depends, wherein the computer selects the one or more nodes where the cached dependencies are to be deployed using an estimated completion time of the task software, wherein the completion time is estimated using an estimated launch time, an estimated scheduling time and an estimated execution time of the task software, the method comprising obtaining deployment input data, determining a number of nodes in the distributed computing system where the dependencies are to be deployed by evaluating the deployment input data, wherein the deployment input data is evaluated by at least comparing estimated launch times of nodes where the dependencies need to be cached to estimated scheduling times of nodes where the dependencies are already cached, selecting a set of nodes in the distributed computing system, wherein the selected set comprises a number of nodes corresponding to the determined number of nodes, and controlling deployment of dependencies on the selected set of nodes.
Description
METHOD FOR CONTROLLING DEPLOYMENT OF CACHED DEPENDENCIES ON ONE OR MORE SELECTED NODES IN A DISTRIBUTED COMPUTING SYSTEM
TECHNICAL FIELD
The present invention relates to a method for deploying cached dependencies. In particular, a method for deploying cached dependencies in a distributed computing system, such as a cloud computing system.
BACKGROUND
Applications, tasks, or services are frequently provided using cloud technology. Commonly, a central cloud is connected to a communications network and provides applications, services and tasks to devices coupled to and/or connected to the communications network.
With the evolving technology and communication capacity, e.g. by the introduction of 5G networks, strict latency requirements of applications may make the central cloud approach troublesome. This has led to the introduction of concepts such as edge cloud computing, where applications, services and tasks are handled by processing resources at the edge of the cloud network.
Examples of applications or tasks with strict latency requirements are control of autonomous vehicles and/or autonomous processing plants. Applications running in an autonomous vehicle may e.g. include adaptive cruise control, which monitors road signs and automatically controls the throttle and brakes of the vehicle. Typical properties of such applications are that the applications are relatively short-lived and use resources for a relatively short period of time.
A challenge when controlling execution of such applications is that the time it takes to set up the runtime environment and/or schedule/identify processing resources in the distributed computing system may exceed the time it takes to run the application. In other words, the completion time of the application is dominated by the time it takes to initially launch the application and queuing for the application to be executed. The time to execute the application is relatively short in comparison to the time spent launching and queueing.
Conventional methods sometimes adopt an application/task scheduling policy, known as dependency scheduling, where applications are scheduled at a node that already hosts the application/task dependencies, including libraries, data and configuration files that need to be set up to execute the application/task. These methods try to schedule the tasks at a node that already has the tasks’ dependencies cached locally. This strategy speeds up the container instance start up time compared to the dependency agnostic schedulers.
These conventional methods have drawbacks including non-optimal use of resources and an increase in the scheduling time. For a given workload, there is a high risk that some nodes become more popular than the other nodes in the distributed computing system. This may lead to over-utilization of some nodes and under-utilization of other nodes, depending on which dependencies are cached on each node. There is also a risk that the applications/tasks are queued up in the processing queue of the popular nodes, and thus wait a relatively long time to be scheduled, when they could alternatively be scheduled on other nodes, where the dependencies are not available but can be downloaded. So, in such conventional dependency scheduling, although the application/task launch time is decreased by executing the application/task on a node that already has all the required dependencies cached, the total application/task completion time is not improved.
Thus, there is a need for an improved method for controlling deployment of cached dependencies in a distributed computing system, such as a cloud computing system.
SUMMARY OF THE INVENTION
The above described drawbacks are overcome by the subject matter described herein. Further advantageous implementation forms of the invention are described herein.
According to a first aspect of the invention the objects of the invention are achieved by a computer implemented method performed by a computer configured to control deployment of cached dependencies on one or more selected nodes in a distributed computing system, wherein the dependencies are software components upon which task software depends, wherein the computer selects the one or more nodes where the cached dependencies are to be deployed using an estimated completion time of the task software, wherein the completion time is estimated using an estimated launch time, an estimated scheduling time and an estimated execution time of the task software. The method comprises obtaining deployment input data, determining a number of nodes in the distributed computing system where the dependencies are to be deployed by evaluating the deployment input data, wherein the deployment input data is evaluated by at least comparing estimated launch times of nodes where the dependencies need to be cached to estimated scheduling times of nodes where the dependencies are already cached, selecting a set of nodes in the distributed computing system, wherein the selected set comprises a number of nodes corresponding to the determined number of nodes, and controlling deployment of dependencies on the selected set of nodes.
The advantage of the first aspect is at least that completion times of tasks/applications are reduced.
According to a second aspect of the invention the objects of the invention are achieved by a computer configured to control deployment of cached dependencies on one or more selected nodes in a distributed computing system, wherein the dependencies are software components upon which task software depends, wherein the computer selects the one or more nodes where the cached dependencies are to be deployed using an estimated completion time of the task software, wherein the completion time is estimated using at least an estimated launch time, an estimated scheduling time and an estimated execution time of the task software, the computer further comprising processing circuitry, a memory comprising instructions executable by the processing circuitry, causing the processing circuitry to perform the method according to the first aspect.
The scope of the invention is defined by the claims, which are incorporated into this section by reference. A more complete understanding of embodiments of the invention will be afforded to those skilled in the art, as well as a realization of additional advantages thereof, by a consideration of the following detailed description of one or more embodiments. Reference will be made to the appended sheets of drawings that will first be described briefly.
BRIEF DESCRIPTION OF THE DRAWINGS
A more complete understanding of embodiments of the invention will be afforded to those skilled in the art, as well as a realization of additional advantages thereof, by a consideration of the following detailed description of one or more embodiments. It should be appreciated that like reference numerals are used to identify like elements illustrated in one or more of the figures.
Fig. 1 illustrates a distributed computing system according to one or more embodiments of the present disclosure.
Figs. 2A-B illustrate a dependency deployment scenario according to one or more embodiments of the present disclosure.
Fig. 3 shows a flowchart of a computer implemented method according to one or more embodiments of the present disclosure.
Fig. 4 shows a flowchart of a use case embodiment of a computer implemented method according to one or more embodiments of the present disclosure.
Fig. 5 shows details of a computer according to one or more embodiments of the present disclosure.
DETAILED DESCRIPTION
Generally, all terms used herein are to be interpreted according to their ordinary meaning in the relevant technical field, unless a different meaning is clearly given and/or is implied from the context in which it is used. All references to a/an/the element, apparatus, component, means, step, etc. are to be interpreted openly as referring to at least one instance of the element, apparatus, component, means, step, etc., unless explicitly stated otherwise. The steps of any methods disclosed herein do not have to be performed in the exact order disclosed, unless a step is explicitly described as following or preceding another step and/or where it is implicit that a step must follow or precede another step. Any feature of any of the embodiments disclosed herein may be applied to any other embodiment, wherever appropriate. Likewise, any advantage of any of the embodiments may apply to any other embodiments, and vice versa. Other objectives, features and advantages of the enclosed embodiments will be apparent from the following description.
In the present disclosure, the terms “application”, “task” and “function” will be used interchangeably to denote software or instructions executable by processing circuitry, causing the processing circuitry to perform any of the method steps described herein.
In the present disclosure, occasionally the term “application” is used to indicate a set of instructions forming a particular software and the term “task” to indicate instances of the application executing on processing circuitry, e.g. in a container instance.
In the present disclosure, the term “dependencies” is used to denote all software components required to generate a runtime environment required by the application/task, such as libraries, data, and configuration files.
In the present disclosure the term “deployment of cached dependencies” denotes the act of providing software components typically from one node to another node, e.g. from a node comprising a dependency repository to a node on which a particular task/application may be scheduled. The present disclosure relates to deploying dependencies in a distributed computing system. In other words, software component deployment can be seen as a part of the field of software scheduling in a distributed computing system.
In the present disclosure the term “distributed computing system” denotes a system comprising a plurality of physically separate or virtualized computers where partial results or calculations are generated by different applications and/or services executing in different runtime environments, optionally different runtime environments on different nodes, e.g. a cloud computing network.
In the present disclosure, the terms “application completion time”, “task completion time” or simply “completion time” are used interchangeably to denote the total time required to setup, schedule and execute a task until it has completed. I.e. until the execution of the task finishes.
In the present disclosure, the term “launch time” is used to denote the time required to setup a runtime environment with all dependencies required by the task. This may e.g. involve obtaining or retrieving dependencies, such as data and container images/image files for the container instance/s to be started and to start all necessary containers instances for the task to run/execute.
In the present disclosure, the term “scheduling time” is used to denote the time required to allocate processing resources to the task, e.g. queue time in a scheduling queue before execution of the task at a node in the distributed computing system.
In the present disclosure, the term “execution time” is used to denote the time required to execute the task until it has completed. In other words, typically until the execution of the task finishes when all instructions have been processed.
The present disclosure relates, in particular, to distributed systems operating in virtualized environments, such as systems providing cloud computing services. In such systems, edge cloud strategies are used to reduce latency in the system. Many applications or tasks at the edge cloud are mainly short-lived applications/tasks/functions with low latency requirements. They often follow the Function as a Service (FaaS) model, where functions are often run within a container instance, with startup/launch times typically varying from 100 milliseconds to a minute, depending on whether the parent container and its dependencies and libraries are already local or not. The typically layered file system of the containers allows a fine control over the container’s dependencies. In other words, knowledge of the dependencies or software components required by the runtime environment of the task, e.g. different software libraries.
In this scenario, conventional task/application controllers, such as orchestrators, are adopting a scheduling policy, known as dependency scheduling, where functions are scheduled at a node that already hosts the tasks’ dependencies, including libraries, data and configuration files. These methods try to schedule the tasks at a node that already has the tasks’ dependencies locally. This strategy speeds up the container start up time compared to the dependency agnostic schedulers.
As the edge cloud applications/tasks are short-lived, they generally have a short completion time period. The task completion time may be defined as T = L + S + E, where L is the task launch time, S is the scheduling time, and E is the task execution time. The dependency-aware schedulers try to reduce task launch time by placing the tasks at a node that has the maximum number of dependencies available. To reduce the launch time, each node would cache previously used dependencies, such as images, and schedule the task where all, or a subset of, task dependencies are available. However, as mentioned in the background section, scheduling the tasks based on the locality of the dependencies can be problematic:
1- Non-optimal use of resources: For a given workload, there is a high risk that some nodes become more popular than the others. This will lead to over-utilization of some nodes, and under-utilization of the others, depending on what is cached on each node.
2- Increase in the scheduling time: There is a risk that the tasks are queued up in the popular nodes, and thus wait a longer time to be scheduled, while they could be scheduled on other, less popular, nodes where the dependencies are not available but can be downloaded. So, in this case of standard dependency-aware scheduling, although the task launch time is decreased, the total task completion time is not improved.
In addition, it is hard to determine how large the cache size should be on each node of the distributed computing system, and how long the different task dependencies should be cached, to increase the hit rates of successful scheduling operations in the legacy dependency scheduling.
Algorithms such as Time aware Least Recently Used (TLRU) algorithms try to partly solve some of these problems by setting a static timestamp to specify the cache relevance and valid lifetime, but they are mainly reactive, relying on the past rather than a prediction of the future workload, and they use static, fixed timestamps. Moreover, they are not designed to support dependency scheduling, so libraries would have to be cached regardless of what is supposed to be executed or the future workload of the distributed computing system.
In the present disclosure of an improved dependency scheduling regime, the method executed by the computer constantly calculates a suitable, or at least near-optimal, tradeoff between the scheduling queue time and the dependency download time. The controller makes this tradeoff by deciding if it is beneficial to adapt the number of nodes where required dependencies are cached, e.g. by dropping cached dependencies or pre-fetching and caching required dependencies in additional nodes, or if it is beneficial to only consider nodes for scheduling where all the required dependencies are already cached. The computer should also plan how to distribute the dependencies over the nodes in the distributed computing system to reduce the risk of prolonged completion times, e.g. resulting from long scheduling queues or container instance interference. In other words, predicting a required distribution of dependencies for an upcoming workload of the distributed computing system, e.g. predicting an expected task arrival pattern.
The present disclosure solves this by providing a method and a controller/computer which monitors the incoming workload of the distributed computing system, decides which dependencies to cache and where to cache them, and then dynamically adjusts the cache size of nodes and the number of nodes caching the dependencies. This is done to decrease the risk of non-optimal and skewed utilization of processing resources and to improve the scheduling time, e.g. by reducing the overall scheduling queue length, the scheduling queue length of popular nodes and/or the scheduling queue length of particular applications/tasks.
This can be achieved by adjusting, in advance, the number of nodes with pre-fetched dependencies for future workloads that are expected to have a durable incoming task stream. This can optionally or additionally be performed by calculating a trade-off between the dependency download time and the scheduling queue wait time, to advise the scheduler on a scheduling action/decision. This can optionally or additionally be performed by deciding which dependencies should be cached, and where, to increase the dependency scheduling hits whilst reducing the scheduling queues at each node of the distributed computing system.
The disclosure has at least the advantages of:
- decreasing unbalanced utilization among popular and unpopular nodes of the distributed computing system, the nodes being popular in the sense that the cached dependencies of popular nodes are required by many tasks in the upcoming/future workload of the distributed computing system,
- improving the task scheduling times by distributing the tasks among multiple nodes with dependencies available/cached locally, and
- improving the task completion times by reducing the task launch time, by pre-fetching/pre-caching dependencies required for a predicted workload of the distributed computing system.
Fig. 1 illustrates a distributed computing system 100 according to one or more embodiments of the present disclosure. The system 100 comprises one or more nodes N1-NM, each node comprising a local dependency cache LIS1-LIS3, e.g. a local library repository or cache. The nodes are communicatively coupled, optionally via a communications network (not shown).
The distributed computing system 100 further comprises a repository node 120 configured to store dependencies required by tasks executing or to be executed in the system, e.g. in a dependency repository comprised in the repository node 120.
The distributed computing system 100 further comprises a dependencies controller/computer 110 configured to control deployment of cached dependencies on one or more selected nodes in the distributed computing system 100. The controller 110 may optionally distribute its functionality over functional modules 1101-1103, such as a load prediction module 1101, a monitoring module 1102 and a control logic module 1103. The load prediction module 1101 is configured to predict load of the nodes N1-NM using a predicted task arrival pattern, e.g. obtained from an orchestrator node (not shown). The monitoring module 1102 is configured to monitor the nodes N1-NM, e.g. monitor status of cached libraries, status of scheduling queues and allocated cache sizes of nodes comprised in the distributed computing system 100. The control logic module 1103 is configured to control caching of dependencies in the nodes N1-NM, i.e. to control deployment of dependencies in the nodes N1-NM.
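A rough structural sketch of this modular split follows; the class names mirror the figure's reference numerals, while the method names and signatures are assumptions for illustration only:

```python
class LoadPredictionModule:
    """1101: predicts the load of nodes N1-NM from a predicted task arrival
    pattern, e.g. one obtained from an orchestrator node."""
    def predict_load(self, arrival_pattern):
        ...

class MonitoringModule:
    """1102: monitors cached libraries, scheduling queues and allocated
    cache sizes of the nodes in the system."""
    def monitor_nodes(self, nodes):
        ...

class ControlLogicModule:
    """1103: decides which dependencies to deploy (cache) on which nodes."""
    def control_caching(self, predicted_load, node_status):
        ...

class DependenciesController:
    """110: composes the three optional functional modules."""
    def __init__(self):
        self.load_prediction = LoadPredictionModule()
        self.monitoring = MonitoringModule()
        self.control_logic = ControlLogicModule()
```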
Each of the nodes N1-NM may further comprise modules such as lookup proxy modules and container manager modules (not shown in the figure). The container manager modules are typically configured to manage one or more container instances where tasks/applications can execute. The container manager modules typically obtain or retrieve dependencies, such as container images/image files for the container instance(s) to be started, from the repository node 120, which is configured to store dependencies, such as images/image files, for runtime environments and/or container instances.
The distributed computing system 100 may further comprise a container orchestration node (not shown) configured to control the different container managers to start all necessary container instances for the task to run.
Fig. 2A-B illustrates a dependency deployment scenario according to one or more embodiments of the present disclosure. In Fig. 2A an initial state of a first node N1 is shown. In the first node N1, the dependencies D1-D3 required by application App1 are already cached and readily available.
According to the present disclosure, application behavior of instances/tasks of App1 is monitored, e.g. the incoming task workload and the scheduling times of the tasks. Further, each node, e.g. N1, is monitored to determine properties of the node, e.g. scheduling queue length, number of dependencies cached, cache capacity and size of cached dependencies.
According to the present disclosure, for the particular application App1 having required dependencies D1-D3, a sum is further determined of: an average length of the scheduling queues over the nodes where instances of the application App1 are already deployed, and an estimated/predicted number, App1_estimated, of upcoming tasks/instances of App1. An estimated scheduling time S1 of node N1 for App1, where the required dependencies D1-D3 are already cached, can then be generated by multiplying this sum with application App1's estimated execution time App1_exec_time.
According to the present disclosure, a launch time L2 of the application App1 for an additional node N2, where the dependencies D1-D3 need to be cached, is then estimated, e.g. by considering the monitored size of the dependencies D1-D3 and the average bandwidth of the communication network communicatively coupling the second node N2 to a dependency repository node.
The estimated scheduling time S1 of node N1 for instances of App1 is then compared to the estimated launch time L2 for an instance of the application App1 for the additional node N2.
If it is determined that it is beneficial, i.e. that it saves time, to deploy the required dependencies D1-D3 on the additional node N2 and schedule App1 on the second node N2, then the disclosed method proceeds accordingly.
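A minimal sketch of these estimates and the resulting decision, using hypothetical numbers and helper names that are not prescribed by the disclosure, could read:

```python
def estimated_scheduling_time(avg_queue_length: float,
                              predicted_instances: float,
                              exec_time: float) -> float:
    # S1: (average queue length over nodes hosting App1 + predicted new
    # instances of App1) multiplied by App1's estimated execution time.
    return (avg_queue_length + predicted_instances) * exec_time

def estimated_launch_time(dependency_sizes_mb, bandwidth_mb_per_s: float) -> float:
    # L2: time to download D1-D3 to the additional node N2 over the
    # link to the dependency repository node.
    return sum(dependency_sizes_mb) / bandwidth_mb_per_s

s1 = estimated_scheduling_time(avg_queue_length=8, predicted_instances=4, exec_time=2.0)
l2 = estimated_launch_time(dependency_sizes_mb=[300, 120, 80], bandwidth_mb_per_s=50)

# Deploy D1-D3 on N2 only if downloading them there beats waiting in N1's queue.
if l2 < s1:
    print(f"pre-fetch on N2: launch {l2:.1f}s < queue wait {s1:.1f}s")
```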
As can be seen in Fig. 2A, if a new instance of App1 is scheduled to execute on the first node N1, the first completion time T1 comprises a relatively short launch time and a relatively long scheduling time.
As can be seen in Fig. 2B, if App1 is scheduled to execute on the second, additional node N2, the second completion time T2 comprises a relatively long launch time and a relatively short scheduling time.
However, the second completion time T2 is shorter than the first completion time T1; thus App1 can be executed more quickly on N2 and time can be saved. The execution times E1/E2 are considered to be more or less comparable on the first and second nodes N1/N2.
It is understood that the present disclosure can be extended to deploy the required dependencies to a plurality of additional nodes without departing from the concept presented herein.
It is further understood that if the estimated scheduling time of nodes where the required dependencies D1-D3 are already cached is below a threshold, it may be determined to release the cached dependencies D1-D3 from some of the nodes, thereby freeing cache resources for other applications having different required dependencies.
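A correspondingly minimal sketch of this release rule, where the threshold value and all names are assumptions, could be:

```python
def maybe_release(node_cache: dict, deps: set,
                  estimated_scheduling_time: float, threshold_s: float) -> None:
    # If queues on nodes that already cache D1-D3 are short enough, the
    # extra replicas are not earning their keep: free the cache space.
    if estimated_scheduling_time < threshold_s:
        for dep in deps:
            node_cache.pop(dep, None)
```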
Fig. 3 shows a flowchart of a computer implemented method according to one or more embodiments of the present disclosure. The computer implemented method is performed by a computer 110 configured to control deployment of cached dependencies on one or more selected nodes N1-NM in a distributed computing system 100. The dependencies are software components upon which the task software depend. The computer 110 selects the one or more nodes where the cached dependencies are to be deployed using an estimated completion time of the task software, where the completion time is estimated using an estimated launch time, an estimated scheduling time and an estimated execution time of the task software. The method comprises:
Step 310: obtaining deployment input data.
In one embodiment, the step of obtaining deployment input data comprises: monitoring behavior of the task software by estimating a total offered workload of the distributed computing system as an expected pattern of arriving task software; retrieving a mapping indicative of the dependencies on which the task software of the expected pattern of arriving task software depends; estimating a total scheduling time of already scheduled instances of the task software; monitoring each node of the distributed computing system by measuring the scheduling queue length for the task software of each node; measuring a total scheduling queue length for all task software comprised in the expected pattern of arriving task software of each node; measuring the number of cached dependencies of each node; measuring the size of cached dependencies of each node; measuring the cache resources of each node; and predicting a number of instances of the task software to be started. One possible record structure for this input data is sketched below.
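A minimal sketch of such a record, with field names invented for illustration (the disclosure prescribes the monitored quantities, not this structure):

```python
from dataclasses import dataclass

@dataclass
class DeploymentInputData:
    """Monitored and predicted quantities used to evaluate deployment."""
    expected_arrival_pattern: dict   # app id -> predicted task arrivals over time
    dependency_map: dict             # app id -> set of required dependency ids
    total_scheduling_time: float     # of already scheduled instances (seconds)
    app_queue_length: dict           # node id -> tasks of this app waiting
    total_queue_length: dict         # node id -> tasks waiting across all apps
    cached_dependencies: dict        # node id -> set of cached dependency ids
    cached_sizes_mb: dict            # node id -> size of each cached dependency
    cache_resources_mb: dict         # node id -> free cache capacity
    predicted_instances: int         # instances of the task software to be started
```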
Step 320: determining a number of nodes in the distributed computing system where the dependencies are to be deployed by evaluating the deployment input data. The deployment input data is evaluated by at least comparing estimated launch times of nodes where the dependencies need to be cached to estimated scheduling times of nodes where the dependencies are already cached.
In one embodiment, the number of nodes is determined according to:
N_new = ((App1_queue_length + App1_estimated) * App1_exec_time) / (Tasks_per_node * App1_expected_completion_time) - N_app1_curr

where N_new is the determined number of nodes, App1_queue_length is the average estimated scheduling time over the selected nodes, App1_estimated is an estimated number of task software instances to be deployed, App1_exec_time is the estimated execution time of the task software, Tasks_per_node is the number of task instances that can be executed in parallel on a single node, App1_expected_completion_time is the estimated completion time of the task software and N_app1_curr is a current/previously determined number of nodes used to schedule task software.
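Reading the formula with the grouping above (numerator in task-seconds, denominator in task-seconds a node can absorb within the deadline), a sketch follows; the rounding with `math.ceil` and the clamp to zero are implementation assumptions not stated in the disclosure:

```python
import math

def extra_nodes_needed(avg_queue_length: float,
                       estimated_new_tasks: float,
                       exec_time: float,
                       tasks_per_node: float,
                       expected_completion_time: float,
                       current_nodes: int) -> int:
    # (queued + predicted tasks) * execution time = total task-seconds to serve;
    # tasks_per_node * expected completion time = task-seconds one node can
    # absorb within the deadline. Their ratio is the number of nodes required.
    required = ((avg_queue_length + estimated_new_tasks) * exec_time
                / (tasks_per_node * expected_completion_time))
    return max(0, math.ceil(required) - current_nodes)

# E.g. (8 queued + 4 predicted) * 2 s / (3 tasks/node * 4 s) = 2 nodes; with
# 1 node already caching App1's dependencies, 1 extra node is needed.
print(extra_nodes_needed(8, 4, 2.0, 3, 4.0, 1))  # -> 1
```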
Step 330: selecting a set of nodes in the distributed computing system 100, where the selected set comprises a number of nodes corresponding to the determined number of nodes.
In one embodiment, selecting the set of nodes comprises: mapping required dependencies of the task software to dependencies that are already cached, for each node of the distributed computing system; identifying a first set of nodes in the distributed computing system where all the required dependencies are already cached; identifying a second set of nodes where additional dependencies need to be cached; and selecting the set of nodes by selecting the first set of nodes and selecting the second set of nodes.
The second set of nodes may e.g. be identified by calculating a number of nodes as the difference between the determined number of nodes and the number of nodes in the first set. A corresponding number of nodes may then be selected, e.g. randomly, from the total set of nodes in the distributed computing system.
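A sketch of this two-step selection, under the stated assumption that additional nodes are picked randomly (names illustrative):

```python
import random

def select_nodes(required_deps: set, cached: dict, n_target: int):
    """cached: node id -> set of dependency ids already cached on that node."""
    # First set: nodes that already hold every required dependency.
    first_set = [n for n, deps in cached.items() if required_deps <= deps]
    if len(first_set) >= n_target:
        return first_set[:n_target], []
    # Second set: remaining nodes, chosen randomly, where the missing
    # dependencies will have to be downloaded and cached.
    candidates = [n for n in cached if n not in first_set]
    second_set = random.sample(candidates,
                               min(n_target - len(first_set), len(candidates)))
    return first_set, second_set
```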
In some embodiments it may be determined that an upcoming workload does not justify keeping dependencies cached at the current number of selected nodes. The method may then reduce the number of nodes caching the required dependencies to save cache capacity.
In one embodiment, the determined number of nodes is less than a current/previously determined number of nodes, and the method further comprises reducing the number of nodes in the first set of nodes to match the determined number of nodes.
In some embodiments it may be determined that an upcoming workload justifies pre-fetching and/or deploying dependencies to be cached at additional selected nodes. The method may then increase the number of nodes caching the required dependencies.
In one embodiment, the determined number of nodes is greater than or equal to a current/previously determined number of nodes, and the method further comprises: identifying dependencies that need to be cached for each of the second set of nodes, and controlling deployment of dependencies by caching those dependencies for each of the second set of nodes.
Step 340: controlling deployment of dependencies on the selected set of nodes.
In one embodiment, the dependencies comprise runtime environment images.
In one embodiment, a computer is provided that is configured to control deployment of cached dependencies on one or more selected nodes in a distributed computing system, where the dependencies are software components upon which task software depend, wherein the computer selects the one or more nodes where the cached dependencies are to be deployed using an estimated completion time of the task software, wherein the completion time is estimated using at least an estimated launch time, an estimated scheduling time and an estimated execution time of the task software, the computer further comprising processing circuitry and a memory comprising instructions executable by the processing circuitry, causing the processing circuitry to perform any of the method steps of the method described herein.
Fig. 4 shows a flowchart of a use case embodiment of a computer implemented method 400 according to one or more embodiments of the present disclosure.
As presented herein, the disclosed method and computer monitor and keep track of the following information:
1- The incoming workload to a computer configured to control deployment of cached dependencies on one or more selected nodes in a distributed computing system, e.g. the types of applications and their workload arriving at the computer. This can be used to predict a future task arrival pattern (and therefore the future required dependencies, e.g. libraries, over time).
2- The dependencies, such as which images/layers are already cached and stored on each node.
3- Scheduling queue time per node (combined for all applications scheduled on each node)
4- Scheduling queue time for each application.
5- Available local storage used for dependencies, such as images, for caching on each node.

Using this information, the computer dynamically and continuously calculates and adjusts:
1- The number of nodes caching the dependencies, such as images (image layers), per application.
2- The distribution of dependencies, e.g. cached images, over different nodes to allow better distribution of the tasks over different nodes.
As can be seen in Fig. 4, the controller continuously monitors the system, predicts the incoming load per application and checks whether it needs to pre-fetch and download the application's dependencies on extra nodes to reduce the scheduling time. This is illustrated by the steps "Monitoring cycle starts", "Monitor the application behavior", "Monitor each node" and "Predict the number of incoming tasks for App1" in Fig. 4.
The computer 110 also continuously checks and calculates the trade-off between the task launch time and the time to wait in the scheduling queue. If the cost of waiting in the scheduling queue is larger than the time it takes to add more nodes and download the required dependencies, such as libraries, on extra nodes, the controller adds nodes and downloads or pre-fetches the images on the new/additional nodes. To avoid hectic changes, the computer 110 may use a gain factor C when calculating the trade-off. The C value may e.g. be configured by the system administrator. The higher the C value, the greater the tendency to consolidate applications on fewer nodes, which may make sense if the cluster is already fully utilized; the lower the C value, the higher the flexibility to add new nodes, which makes sense where the workload is skewed.
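Since the disclosure does not spell out the exact comparison, the sketch below assumes the gain factor C scales the download-time side of the inequality:

```python
def should_add_nodes(queue_wait_time: float,
                     dependency_download_time: float,
                     gain_factor_c: float) -> bool:
    # A high C biases the controller toward consolidating on existing nodes;
    # a low C makes it cheaper to add new nodes and pre-fetch dependencies.
    return queue_wait_time > gain_factor_c * dependency_download_time

# With C = 2, a 30 s queue wait only justifies pre-fetching on extra nodes
# if the download would take less than 15 s.
print(should_add_nodes(30.0, 12.0, 2.0))  # -> True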
Adding more nodes decreases the queue length on popular nodes, whilst also allowing the task/application to be scheduled faster, without waiting. To decide on the number of extra nodes on which to pre-fetch the images, the controller also predicts the future incoming number of tasks. This can lead to more durable decisions. The number of extra nodes to serve each application is calculated as follows:
N_new = ((App1_queue_length + App1_estimated) * App1_exec_time) / (Tasks_per_node * App1_expected_completion_time) - N_app1_curr

where N_new is the number of extra nodes required to cache the dependencies for App1, so that the application can be executed before its expected completion time. App1_queue_length is the average length of the queue over the nodes where the application is deployed (the assumption is that the scheduler load-balances the tasks over the nodes in an efficient manner). App1_estimated is the predicted number of new tasks, App1_exec_time is the application's estimated execution time, Tasks_per_node is the number of parallel tasks that can be packed on a single node, App1_expected_completion_time is the application's expected completion time, and N_app1_curr is the current number of nodes that already have the application's dependencies.
By caching popular dependencies, such as images, in additional nodes, the number of popular nodes can be increased and the load thereby distributed among more nodes, which leads to higher utilization of more nodes, shorter queue lengths and therefore shorter scheduling times.
In addition, the controller tracks the most used container layers. It caches the minimum set of common layers to improve the hit rate while minimizing local storage consumption.
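One plausible reading of caching the "minimum common layers" is taking the intersection of the layer sets of the most-used images; the sketch below makes that assumption explicit:

```python
def common_layers(most_used_images: list) -> set:
    """Each element is the set of layer ids of one frequently pulled image.
    The intersection is the minimal layer set shared by all of them, which
    improves the hit rate while consuming the least local storage."""
    if not most_used_images:
        return set()
    shared = set(most_used_images[0])
    for layers in most_used_images[1:]:
        shared &= layers
    return shared

# Example: three popular images share a base OS layer and a runtime layer.
print(common_layers([{"os", "rt", "appA"}, {"os", "rt", "appB"}, {"os", "rt", "appC"}]))
# -> {'os', 'rt'}
```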
When deciding where to cache, the controller looks at the total queue length for all applications per node, and the expected completion time.
Fig. 5 shows details of a node/computer/computer device 110 according to one or more embodiments of the present disclosure.
The repository node 120 and the optional orchestrator node comprise all or at least a part of the features of the computer device 110 described below.
The computer device 110 may be in the form of a selection of any of a network node, a desktop computer, a server, a laptop, a mobile device, a smartphone, a tablet computer, a smart watch etc. The computer device 110 may comprise processing circuitry 1012. The computer device 110 may optionally comprise a communications interface 1004 for wired and/or wireless communication. Further, the computer device 110 may further comprise at least one optional antenna (not shown in figure). The antenna may be coupled to a transceiver of the communications interface 1004 and is configured to transmit and/or emit and/or receive wireless signals, e.g. in a wireless communication system.
In one example, the processing circuitry 1012 may be any of a selection of a processor and/or a central processing unit and/or processor modules and/or multiple processors configured to cooperate with each other. Further, the computer device 110 may further comprise a memory 1015. The memory 1015 may contain instructions executable by the processing circuitry 1012, that when executed cause the processing circuitry 1012 to perform any of the methods and/or method steps described herein.
The communications interface 1004, e.g. the wireless transceiver and/or a wired/wireless communications network adapter, is configured to send and/or receive data values or parameters as a signal to or from the processing circuitry 1012 and to or from other external nodes. In an embodiment, the communications interface 1004 communicates directly between nodes or via a communications network.
In one or more embodiments the computer device 110 may further comprise an input device 1017, configured to receive input or indications from a user and send a user-input signal indicative of the user input or indications to the processing circuitry 1012.
In one or more embodiments the computer device 110 may further comprise a display 1018 configured to receive a display signal indicative of rendered objects, such as text or graphical user input objects, from the processing circuitry 1012 and to display the received signal as objects, such as text or graphical user input objects.
In one embodiment the display 1018 is integrated with the user input device 1017 and is configured to receive a display signal indicative of rendered objects, such as text or graphical user input objects, from the processing circuitry 1012 and to display the received signal as objects, such as text or graphical user input objects, and/or configured to receive input or indications from a user and send a user-input signal indicative of the user input or indications to the processing circuitry 1012.
In one or more embodiments the computer device 110 may further comprise one or more sensors 1019.
In embodiments, the processing circuitry 1012 is communicatively coupled to the memory 1015 and/or the communications interface 1004 and/or the input device 1017 and/or the display 1018 and/or the one or more sensors 1019.
In embodiments, the communications interface and/or transceiver 1004 communicates using wired and/or wireless communication techniques.
In embodiments, the one or more memory 1015 may comprise a selection of a RAM, a hard disk drive, a floppy disk drive, a magnetic tape drive, an optical disk drive, a CD or DVD drive (R or RW), or other removable or fixed media drive.
In a further embodiment, the computer device 110 may further comprise and/or be coupled to one or more additional sensors (not shown) configured to receive and/or obtain and/or measure physical properties pertaining to the computer device or the environment of the computer device, and send one or more sensor signals indicative of the physical properties to the processing circuitry 1012.
It is to be understood that a computer device comprises any suitable combination of hardware and/or software needed to perform the tasks, features, functions, and methods disclosed herein. Moreover, while the components of the computer device are depicted as single boxes located within a larger box, or nested within multiple boxes, in practice, a computer device may comprise multiple different physical components that make up a single illustrated component (e.g., memory 1015 may comprise multiple separate hard drives as well as multiple RAM modules).
Similarly, the computer device 110 may be composed of multiple physically separate components, which may each have their own respective components.
The communications interface 1004 may also include multiple sets of various illustrated components for different wireless technologies, such as, for example, GSM, WCDMA, LTE, NR, Wi-Fi, or Bluetooth wireless technologies. These wireless technologies may be integrated
into the same or different chip or set of chips and other components within the computer device 110.
Processing circuitry 1012 is configured to perform any determining, calculating, or similar operations (e.g., certain obtaining operations) described herein as being provided by a computer device 110. These operations performed by processing circuitry 1012 may include processing information obtained by processing circuitry 1012 by, for example, converting the obtained information into other information, comparing the obtained information or converted information to information stored in the network node, and/or performing one or more operations based on the obtained information or converted information, and as a result of said processing making a determination.
Processing circuitry 1012 may comprise a combination of one or more of a microprocessor, controller, microcontroller, central processing unit, digital signal processor, application-specific integrated circuit, field programmable gate array, or any other suitable computing device, resource, or combination of hardware, software and/or encoded logic operable to provide, either alone or in conjunction with other computer device 110 components, such as device readable medium, computer 110 functionality. For example, processing circuitry 1012 may execute instructions stored in device readable medium 1015 or in memory within processing circuitry 1012. Such functionality may include providing any of the various wireless features, functions, or benefits discussed herein. In some embodiments, processing circuitry 1012 may include a system on a chip.
In some embodiments, processing circuitry 1012 may include one or more of radio frequency, RF, transceiver circuitry and baseband processing circuitry. In some embodiments, RF transceiver circuitry and baseband processing circuitry may be on separate chips or sets of chips, boards, or units, such as radio units and digital units. In alternative embodiments, part or all of RF transceiver circuitry and baseband processing circuitry may be on the same chip or set of chips, boards, or units.
In certain embodiments, some or all the functionality described herein as being provided by a computer device 110 may be performed by the processing circuitry 1012 executing instructions stored on device readable medium 1015 or memory within processing circuitry 1012. In alternative embodiments, some or all the functionality may be provided by processing circuitry 1012 without executing instructions stored on a separate or discrete device readable medium, such as in a hard-wired manner. In any of those embodiments, whether executing instructions stored on a device readable storage medium or not, processing circuitry 1012 can be configured to perform the described functionality. The benefits provided by such functionality
are not limited to processing circuitry 1012 alone or to other components of computer device 110, but are enjoyed by computer device 110 as a whole, and/or by end users.
Device readable medium or memory 1015 may comprise any form of volatile or non-volatile computer readable memory including, without limitation, persistent storage, solid-state memory, remotely mounted memory, magnetic media, optical media, random access memory (RAM), read-only memory (ROM), mass storage media (for example, a hard disk), removable storage media (for example, a flash drive, a Compact Disk (CD) or a Digital Video Disk (DVD)), and/or any other volatile or non-volatile, non-transitory device readable and/or computer-executable memory devices that store information, data, and/or instructions that may be used by processing circuitry 1012. Device readable medium 1015 may store any suitable instructions, data or information, including a computer program, software, an application including one or more of logic, rules, code, tables, etc. and/or other instructions capable of being executed by processing circuitry 1012 and utilized by computer device 110. Device readable medium 1015 may be used to store any calculations made by processing circuitry 1012 and/or any data received via interface 1004. In some embodiments, processing circuitry 1012 and device readable medium 1015 may be integrated.
The communications interface 1004 is used in the wired or wireless communication of signaling and/or data between computer device 110 and other nodes. Interface 1004 may comprise port(s)/terminal(s) to send and receive data, for example to and from computer device 110 over a wired connection. Interface 1004 also includes radio front end circuitry that may be coupled to, or in certain embodiments a part of, an antenna. Radio front end circuitry may comprise filters and amplifiers. Radio front end circuitry may be connected to the antenna and/or processing circuitry 1012.
Examples of a computer device 110 include, but are not limited to, an edge cloud node, a smart phone, a mobile phone, a cell phone, a voice over IP (VoIP) phone, a wireless local loop phone, a tablet computer, a desktop computer, a personal digital assistant (PDA), a wireless camera, a gaming console or device, a music storage device, a playback appliance, a wearable terminal device, a wireless endpoint, a mobile station, a tablet, a laptop, a laptop-embedded equipment (LEE), a laptop-mounted equipment (LME), a smart device, a wireless customer-premise equipment (CPE), a vehicle-mounted wireless terminal device, etc.
The communications interface 1004 may encompass wired and/or wireless networks such as a local-area network (LAN), a wide-area network (WAN), a computer network, a wireless network, a telecommunications network, another like network or any combination thereof. The communications interface may be configured to include a receiver and a transmitter interface used to communicate with one or more other devices over a communication network according to one or more communication protocols, such as Ethernet, TCP/IP, SONET, ATM, optical, electrical, and the like. The transmitter and receiver interfaces may share circuit components, software, or firmware, or alternatively may be implemented separately.
In one embodiment, a computer node 110 is provided and is configured to perform any of the method steps described herein.
In embodiments, a computer 110 is provided configured to control deployment of cached dependencies on one or more selected nodes in a distributed computing system, wherein the dependencies are software components upon which task software depend, wherein the computer selects the one or more nodes where the cached dependencies are to be deployed using an estimated completion time of the task software, wherein the completion time is estimated using an estimated launch time, an estimated scheduling time and an estimated execution time of the task software, the computer 110 further comprising: processing circuitry 1012, a memory 1015 comprising instructions executable by the processing circuitry 1012, causing the processing circuitry 1012 to perform any of the method steps described herein.
Finally, it should be understood that the invention is not limited to the embodiments described above, but also relates to and incorporates all embodiments within the scope of the appended independent claims.
Claims
1. A computer implemented method performed by a computer configured to control deployment of cached dependencies on one or more selected nodes in a distributed computing system, wherein the dependencies are software components upon which task software depend, wherein the computer selects the one or more nodes where the cached dependencies are to be deployed using an estimated completion time of the task software, wherein the completion time is estimated using an estimated launch time, an estimated scheduling time and an estimated execution time of the task software, the method comprising: obtaining deployment input data, determining a number of nodes in the distributed computing system where the dependencies are to be deployed by evaluating the deployment input data, wherein the deployment input data is evaluated by at least comparing estimated launch times of nodes where the dependencies need to be cached to estimated scheduling times of nodes where the dependencies are already cached, selecting a set of nodes in the distributed computing system, wherein the selected set comprises a number of nodes corresponding to the determined number of nodes, and controlling deployment of dependencies on the selected set of nodes.
2. The method of claim 1, wherein selecting the set of nodes comprises: mapping required dependencies of the task software to dependencies that are already cached, for each node of the distributed computing system, identifying a first set of nodes in the distributed computing system where all the required dependencies are already cached, identifying a second set of nodes where additional dependencies need to be cached, and selecting the set of nodes by selecting the first set of nodes and selecting the second set of nodes.
3. The method according to claim 2, wherein the determined number of nodes is less than a current/previously determined number of nodes and the method further comprises reducing the number of nodes in the first set of nodes to match the determined number of nodes.
4. The method according to claim 2, wherein the determined number of nodes is greater than or equal to a current/previously determined number of nodes, wherein the method further comprises: identifying dependencies that need to be cached for each of the second set of nodes, and controlling deployment of dependencies by caching the dependencies that need to be cached, for each of the second set of nodes.
5. The method according to any of the preceding claims, wherein the determined number of nodes is determined according to:

N_new = ((App1_queue_length + App1_estimated) * App1_exec_time) / (Tasks_per_node * App1_expected_completion_time) - N_app1_curr

where N_new is the determined number of nodes, App1_queue_length is the average estimated scheduling time over the selected nodes, App1_estimated is an estimated number of task software instances to be deployed, App1_exec_time is the estimated execution time of the task software, Tasks_per_node is the number of task instances that can be executed in parallel on a single node, App1_expected_completion_time is the estimated completion time of the task software and N_app1_curr is a current/previously determined number of nodes used to schedule task software.
6. The method according to any of the preceding claims, wherein the deployment input data is indicative of a selection of any of: an expected pattern of arriving task software, dependencies on which the arriving task software depends, status of dependencies already deployed on the selected nodes, total current scheduling queue times of the nodes in the distributed computing system, current scheduling queue times of task software already deployed on the nodes in the distributed computing system, and available cache resources of each of the nodes in the distributed computing system.
7. The method according to any of the preceding claims, wherein the dependencies comprise runtime environment images.
8. A computer configured to control deployment of cached dependencies on one or more selected nodes in a distributed computing system, wherein the dependencies are software components upon which task software depend, wherein the computer selects the one or more nodes where the cached dependencies are to be deployed using an estimated completion time of the task software, wherein the completion time is estimated using at least an estimated launch time, an estimated scheduling time and an estimated execution time of the task software, the computer further comprising: processing circuitry, a memory comprising instructions executable by the processing circuitry, causing the processing circuitry to perform the method according to any of claims 1-7.