WO2021164857A1 - Dynamic resource dimensioning for service assurance - Google Patents

Dynamic resource dimensioning for service assurance

Info

Publication number
WO2021164857A1
WO2021164857A1 PCT/EP2020/054270 EP2020054270W WO2021164857A1
Authority
WO
WIPO (PCT)
Prior art keywords
values
resource allocation
resource
network
allocation values
Prior art date
Application number
PCT/EP2020/054270
Other languages
French (fr)
Inventor
Efthymios STATHAKIS
Arthur GUSMAO
Martha VLACHOU-KONCHYLAKI
Original Assignee
Telefonaktiebolaget Lm Ericsson (Publ)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Telefonaktiebolaget Lm Ericsson (Publ) filed Critical Telefonaktiebolaget Lm Ericsson (Publ)
Priority to PCT/EP2020/054270 priority Critical patent/WO2021164857A1/en
Publication of WO2021164857A1 publication Critical patent/WO2021164857A1/en

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14Network analysis or design
    • H04L41/147Network analysis or design for predicting network behaviour
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08Configuration management of networks or network elements
    • H04L41/0803Configuration setting
    • H04L41/0823Configuration setting characterised by the purposes of a change of settings, e.g. optimising configuration for enhancing reliability
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/50Network service management, e.g. ensuring proper service fulfilment according to agreements
    • H04L41/5003Managing SLA; Interaction between SLA and QoS
    • H04L41/5019Ensuring fulfilment of SLA
    • H04L41/5025Ensuring fulfilment of SLA by proactively reacting to service quality change, e.g. by reconfiguration after service quality degradation or upgrade
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00Arrangements for monitoring or testing data switching networks
    • H04L43/08Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0852Delays

Definitions

  • the present disclosure relates generally to system resource management, and more particularly, to management of system resources in communication networks.
  • SDNs Software Defined Networks
  • service management has become more flexible when the services are cloud-based.
  • IaaS Infrastructure as a Service
  • the resource utilization of the underlying infrastructure may be monitored in order to perform load balancing to ensure that the Virtual Machines (VMs) are not overloaded, or to ensure that elastic containerized applications have sufficient resources to execute their tasks.
  • This monitoring is fundamental to enabling IaaS providers to fulfill service-level agreements (SLAs), in which they are responsible for guaranteeing pre-established performance levels of the infrastructure, under penalty of a fine in case the expected performance is not met.
  • SLAs service-level agreements
  • VNFs virtual network functions
  • CNFs cloud network functions
  • MME virtualized mobility management entities
  • EPG evolved packet gateways
  • PCF policy control functions
  • a web server may have as a KPI the latency u_l for serving a request, while a computationally intensive service (such as, for example, an Artificial Intelligence backend) may have as a KPI the ratio r_succ of successfully handled requests. Therefore, service-specific SLAs may target application-related KPIs that are not necessarily the same as the KPIs of the underlying infrastructure. Also, these SLAs are typically stochastic in nature, because the KPI is monitored and aggregated over a certain time period, e.g., a week or a month.
  • An SLA may sometimes be expressed probabilistically or as a ratio.
  • a probabilistically expressed SLA may state that a KPI, such as latency, should not exceed some upper (or lower) bound b for α% of the time, e.g., Prob(KPI ≤ b) ≥ α.
  • an SLA may be expressed as a ratio, e.g., the ratio of tasks that are successfully completed is at least α, where typical values for α are 95%, 99% or 99.5%.
  • one or the other of these SLA frameworks may be more suitable.
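As an illustrative sketch (hypothetical values, not part of the disclosure), both SLA formulations reduce to simple empirical checks over monitored KPI samples:

```python
# Sketch: checking the two SLA formulations against observed KPI samples.
# All numbers below are illustrative, not taken from the disclosure.

def probabilistic_sla_met(kpi_samples, bound, alpha):
    """Empirical check of Prob(KPI <= bound) >= alpha."""
    within = sum(1 for k in kpi_samples if k <= bound)
    return within / len(kpi_samples) >= alpha

def ratio_sla_met(successes, total, alpha):
    """Ratio of successfully completed tasks is at least alpha."""
    return successes / total >= alpha

latencies_ms = [4, 6, 7, 9, 12, 5, 8, 9, 9, 7]   # monitored KPI samples
ok_prob = probabilistic_sla_met(latencies_ms, bound=10, alpha=0.95)  # 9/10 = 0.9, not met
ok_ratio = ratio_sla_met(successes=995, total=1000, alpha=0.99)      # 0.995, met
```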
  • a method for managing one or more of system resources for a system infrastructure that supports a network service in a service-based communication network includes receiving a forecast of a service load, s_l, of the network service for a future time epoch.
  • a search space of resource allocation values is searched for sets of resource allocation values that are predicted to meet a key performance indicator, KPI, metric of the network service.
  • the resource allocation values correspond to levels of system resources provided to the network service by the system infrastructure.
  • a set of resource allocation values is selected that meets a predetermined criterion for balancing resource utilization from among the sets of resource allocation values that meet the KPI metric of the network service, and the system infrastructure is configured to provide system resources having the selected set of resource allocation values to the network service during the future time epoch.
  • Selecting the set of resource allocation values that meets the predetermined criterion for balancing resource utilization may include selecting a set of resource allocation values that optimizes a function of the utilization of each of the resources in the set
  • searching the space of resource allocation values for sets of resource allocation values that are predicted to meet the KPI metric includes, for a system resource of the system infrastructure, generating a predicted range of resource utilization values that are required to meet the forecast of the service load s i of the network service; and identifying, from among the predicted ranges of resource utilization values, a plurality of sets of resource utilization values that meet the KPI metric of the network service.
  • the method further includes selecting a predetermined number of utilization values from the predicted range of resource utilization values; and combining selected ones of the predicted utilization values of the plurality of system resources to form the sets of predicted utilization values, wherein the sets of predicted utilization values form the search space, wherein identifying the plurality of sets of system resource utilization values that meet the KPI metric of the network service includes identifying the plurality of sets of system resource utilization values that meet the KPI metric of the network service from the search space.
  • configuring the system infrastructure to provide system resources having the selected set of resource allocation values to the network service includes transmitting the selected set of system resource utilization values to an actuation node that is configured to apply changes to the system infrastructure to provide the system resources having the selected set of system resource utilization values during the future time epoch.
  • searching the search space of resource allocation values for sets of resource allocation values that are predicted to meet the KPI metric includes, for a set of resource allocation values, generating a prediction of the KPI metric for the future epoch based on the forecasted system load s_l and the set of resource allocation values.
  • generating the predicted range of resource utilization values includes, for a first system resource, defining an interval [û/r_h, û/r_l] based on a high maximum resource utilization value r_h and a low maximum resource utilization value r_l, where 0 < r_l < r_h ≤ 1, for the first system resource and a predicted absolute resource value û for the first system resource, and selecting a plurality of values û^i from within the interval.
  • selecting the plurality of values û^i from within the interval includes selecting N equidistant points within the interval.
  • the method further includes, for a second system resource, defining a plurality of second intervals [û/r_h, û/r_l] for the second system resource and a predicted absolute resource value û for the second system resource, and selecting a plurality of values from within the second interval.
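The interval construction above can be sketched as follows. This is an illustrative stand-alone implementation under stated assumptions (hypothetical utilization bounds and a made-up predicted CPU need); allocating the lower endpoint of the interval targets the high utilization r_h, the upper endpoint the low utilization r_l:

```python
def candidate_values(u_hat, r_high, r_low, n_points):
    """N equidistant candidate allocations for one resource.

    The predicted absolute requirement u_hat is stretched into the
    interval [u_hat / r_high, u_hat / r_low]: allocating the lower
    endpoint would run the resource at utilization r_high, the upper
    endpoint at utilization r_low.
    """
    assert 0 < r_low < r_high <= 1
    lo, hi = u_hat / r_high, u_hat / r_low
    step = (hi - lo) / (n_points - 1)
    return [lo + k * step for k in range(n_points)]

# e.g. a predicted CPU need of 2.0 cores with target utilization 50-80%
points = candidate_values(u_hat=2.0, r_high=0.8, r_low=0.5, n_points=10)
```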
  • the network service includes a communication network, wherein the service load comprises a number of requests per unit time, and the KPI includes network latency.
  • the system infrastructure includes a distributed computing infrastructure, and the system resources comprise central processing unit, CPU, resources, memory resources and/or network resources.
  • the system resources include memory usage
  • selecting the set of resource allocation values that meets the predetermined criterion includes selecting the set of resource allocation values that maximizes a harmonic mean of expected resource utilization values.
  • selecting the set of resource allocation values that meets the predetermined criterion includes selecting a set of predicted resource utilization values that maximizes a harmonic mean of the predicted resource utilization values.
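A minimal sketch of the harmonic-mean criterion (illustrative utilization numbers, not from the disclosure) shows why it favors balanced utilization: a skewed set with the same arithmetic mean scores lower.

```python
def harmonic_mean(utilizations):
    """Harmonic mean of per-resource utilization values."""
    return len(utilizations) / sum(1.0 / u for u in utilizations)

# Same arithmetic mean (0.7), but the balanced set scores higher:
balanced = harmonic_mean([0.7, 0.7, 0.7])    # 0.7
skewed = harmonic_mean([0.99, 0.7, 0.41])    # roughly 0.62
```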
  • searching the search space of resource allocation values for sets of resource allocation values that are predicted to meet the KPI of the network service and selecting the set of resource allocation values that meets the predetermined criterion from among the sets of resource allocation values that meet the KPI metric of the network service includes generating predicted resource allocation values v_c, v_m, v_n of the system resources according to the formulas:
  • r_c, r_m and r_n are maximum resource utilizations
  • Some embodiments provide a computer program comprising instructions which when executed on a computer perform any of the foregoing methods.
  • Some embodiments provide a computer program product comprising a computer program, the computer program comprising instructions which when executed on a computer perform any of the foregoing methods.
  • Some embodiments provide a non-transitory computer readable medium storing instructions which when executed by a computer perform any of the foregoing methods.
  • Some embodiments provide a network node including processing circuitry configured to perform operations of receiving a forecast of a service load, s_l, of the network service for a future time epoch, and searching a search space of resource allocation values for sets of resource allocation values that are predicted to meet a key performance indicator, KPI, metric of the network service, wherein the resource allocation values correspond to levels of system resources provided to the network service by the system infrastructure.
  • the operations further include selecting a set of resource allocation values that meets a predetermined criterion for balancing resource utilization from among the sets of resource allocation values that meet the KPI metric of the network service, and configuring the system infrastructure to provide system resources having the selected set of resource allocation values to the network service during the future time epoch.
  • a system for managing one or more of system resources for a system infrastructure that supports a network service in a service-based communication network includes a first node that records service load data of a service load, s_l, of the network service, a second node that generates a forecast of the service load based on the recorded service load data, and a third node that receives the forecast of the service load for the future time epoch.
  • the third node searches a search space of resource allocation values for sets of resource allocation values that are predicted to meet a key performance indicator, KPI, metric of the network service, wherein the resource allocation values correspond to levels of system resources provided to the network service by the system infrastructure.
  • the third node selects a set of resource allocation values that meets a predetermined criterion for balancing resource utilization from among the sets of resource allocation values that meet the KPI metric of the network service, and configures the system infrastructure to provide system resources having the selected set of resource allocation values to the network service during the future time epoch.
  • Some embodiments use arbitrary supervised learning algorithms to model resources and key performance indicators. Given a probabilistic service level agreement (SLA) framework, some embodiments provide resource optimization that enhances resource utilization under SLA constraints.
  • SLA service level agreement
  • Some embodiments described herein may result in lower operational costs as a result of smart provisioning and/or improved overall network performance by releasing unused resources to services that may benefit from them.
  • Figures 1 and 2 illustrate various elements and workflows of a core network of a wireless communication system.
  • Figure 3 is a block diagram of a network node that may be configured to perform operations according to some embodiments.
  • Figure 4 illustrates functional aspects of some nodes of a system according to some embodiments.
  • Figure 5 is a graph that illustrates an example of a service load time series profile.
  • Figures 6 to 9 illustrate operations of systems/methods according to some embodiments.
  • VNF virtual network function
  • Some embodiments described herein provide systems and methods for dynamic dimensioning of services within designated resource limits.
  • some embodiments provide a framework for dynamic traffic-driven dimensioning that uses machine learning algorithms, of arbitrary complexity, to provide an efficient resource allocation that meets probabilistic SLAs.
  • Some embodiments use arbitrary supervised learning algorithms to model the resources and KPIs. Given a probabilistic SLA framework, some embodiments provide resource optimization that enhances resource utilization under SLA constraints.
  • Some embodiments described herein may result in lower operational costs as a result of smart provisioning and/or improved overall network performance by releasing unused resources to services that may benefit from them.
  • Figure 1 illustrates various elements and workflows of a core network of a wireless communication system 100 including a plurality of network nodes in which some embodiments described herein may be utilized.
  • the nodes may be associated with a function of the core network, such as a network data analytics function (NWDAF), a management data analytics function (MDAF), etc.
  • NWDAF network data analytics function
  • MDAF management data analytics function
  • An SLA may be defined that provides one or more performance requirements for the system 100.
  • a data stream consisting of service-related traffic may be captured by a network node N1, which creates a time-series of the captured data.
  • This time-series data is passed on to an artificial intelligence (AI) node N2 which generates a forecast of the service-related traffic for the next one or more time periods, where "time period" refers to an arbitrary amount of time, e.g., one minute, one hour, etc.
  • another AI node N3 uses the forecast data as an input, together with an available resource budget, to specify an allocation of computational resources to the service, within the admissible limits, so that the SLA is fulfilled.
  • the proposed allocation is fed as input to an actuation node N4, which allocates resources within the infrastructure in accordance with the specification provided by the AI node N3.
  • the operations of the AI node N2 are implemented as part of an NWDAF 230, while the operations of the AI node N3 and the actuation node N4 are implemented as part of an MDAF 220.
  • the operations of the network node N1 are implemented separately from the NWDAF 230 and MDAF 220.
  • the operations of the network node N1 are implemented as part of the NWDAF 230.
  • the AI node N3 may provide a risk of SLA breach along with the proposed allocation strategy to the actuation node N4.
  • the risk can be quantified, for instance, as the margin for crossing a KPI threshold. For example, if the latency should be less than 10 msec for 99% of the time and the desired resource allocation strategy is predicted to have latency less than 10 msec for 99.5% of the time, then the safety margin of 0.5% can also be fed as input to the actuator to enable further decision making strategies on top of the resource optimizer.
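The worked example above can be sketched directly; the numbers are those given in the text (required 99%, predicted 99.5%), and the margin computation is illustrative:

```python
# Sketch: quantifying SLA-breach risk as a safety margin on the KPI target.
alpha_required = 0.99    # SLA: latency < 10 ms for 99% of the time
alpha_predicted = 0.995  # predicted compliance for the proposed allocation

safety_margin = alpha_predicted - alpha_required  # 0.5% headroom
sla_at_risk = safety_margin < 0                   # margin < 0 means breach risk
```

The margin (here 0.5%) is what would be fed to the actuator node N4 to enable further decision-making strategies on top of the resource optimizer.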
  • Some embodiments described herein may advantageously provide automatic KPI-driven scaling of a service deployment to meet an SLA. Some embodiments may advantageously provide efficient construction of a search-space for resource allocation to achieve enhanced utilization. In particular, some embodiments may advantageously maximize or increase utilization while fulfilling an SLA. Some embodiments may advantageously reduce resource consumption by the service, resulting in lower operational expenses. Moreover, some embodiments may advantageously reduce the amount of tied resources that could otherwise be utilized by other services or network functions.
  • Figure 3 is a block diagram of a network node according to some embodiments.
  • Various embodiments provide a network node 300 that includes a processor circuit 306, a communication interface 320 coupled to the processor circuit, and a memory 308 coupled to the processor circuit.
  • the memory 308 includes machine-readable computer program instructions that, when executed by the processor circuit, cause the processor circuit to perform some of the operations depicted in Figures 6 to 9.
  • the network node 300 may be a core network node of a core network, such as a 5GC or EPC core network.
  • network node 300 includes a communication interface 320 (also referred to as a network interface) configured to provide communications with other nodes (e.g., with other base stations and/or core network nodes) of a communication network.
  • the network node 300 also includes a processor circuit 306 (also referred to as a processor) and a memory circuit 308 (also referred to as memory) coupled to the processor circuit 306.
  • the memory circuit 308 may include computer readable program code that when executed by the processor circuit 306 causes the processor circuit to perform operations according to embodiments disclosed herein. According to other embodiments, processor circuit 306 may be defined to include memory so that a separate memory circuit is not required.
  • operations of the network node 300 may be performed by processor 306 and/or communication interface 320.
  • the processor 306 may control the communication interface 320 to transmit communications through the communication interface 320 to one or more other network nodes and/or to receive communications through network interface from one or more other network nodes.
  • modules may be stored in memory 308, and these modules may provide instructions so that when instructions of a module are executed by processor 306, processor 306 performs respective operations (e.g., operations discussed herein with respect to example embodiments).
  • a structure similar to that of Figure 3 may be used to implement other network nodes.
  • network nodes discussed herein may be implemented as virtual network nodes.
  • a network node 300 (or radio access network (RAN) node 300) according to some embodiments includes a processor circuit 306 and a memory 308 coupled to the processor circuit, the memory including machine readable program instructions that, when executed by the processor circuit, cause the network node to perform operations described herein.
  • RAN radio access network
  • the first node N1 performs an operation of capturing a summary of records of the load/traffic of a service, such as a service operated in a core network of a wireless communication network.
  • the summary of records may correspond to raw data traffic at a highest possible resolution or to an aggregated version of the data traffic. In the latter case, the aggregation level of the records can be chosen based on performance targets and computational limitations.
  • network node N1 may provide a mobility management entity (MME) network function handling signaling traffic.
  • MME mobility management entity
  • Nodes N2, N3 and N4 can be implemented as different microservices that enhance existing 5G nodes.
  • Node N2 could be implemented, for example, as additional functionality to the NWDAF, while nodes N3 and N4 could be additional functionalities implemented in the MDAF.
  • node N1 provides node N2 with data relating to service traffic and load, node N2 trains a function, such as a machine learning function or deep learning function, to generate a forecast of a time series of the service traffic/load.
  • Node N3 uses the traffic/load prediction to optimize infrastructure resources for meeting the SLA requirements and node N4 pushes the infrastructure changes back to node N1.
  • the infrastructure supporting an MME application may be scaled by changing the amount of memory and/or CPU processing power that are allocated to the virtual machine (VM) on which the MME function is executed.
  • VM virtual machine
  • nodes N1 and N2 are implemented together inside the NWDAF, since the primary function of the NWDAF is to serve consumers with insights that augment and enhance packet core functionality, as well as assist with management of experience assurance.
  • node N1 collects data from various core nodes.
  • node N2 serves node N3 with a load/traffic prediction, and node N3 uses the load/traffic prediction to optimize infrastructure resources to ensure that the service meets the SLA requirements.
  • Node N4 pushes the change back to any core node affecting the SLA.
  • the functionality of nodes N2 and N3 is illustrated in more detail. As shown therein, node N2 performs monitoring and forecasting based on traffic load data. The forecasted value is thereafter used by node N3 as the basis for optimizing resource allocation of the infrastructure according to constraints to meet the required SLA.
  • the historical data recorded by node N1 is used to construct a time-series. For each new observation obtained, a record is added to the historical time-series.
  • a service load profile time series is shown in Figure 5.
  • part of the historical data e.g., the data that spans a given time window (similar to the shadowed part 502 in Figure 5)
  • a one-step or a multi-step ahead forecast of the future expected service load is generated, using its 90th percentile or some other statistical measure, for the next (one or more) aggregation periods.
  • the load can be expressed as a number of requests for a particular component or the number of bytes transmitted through the network.
  • the forecast stage described here corresponds to the second node (N2) of the system 100.
  • the second node N2 (Figures 1 and 2) receives as input the historical load data from node N1 and outputs a forecasted load at a desired future moment.
  • the load data may consist, for example, of a number of requests that arrive at the network service at a given moment in time or to a number of active users. In general, any kind of measurement that influences the service behavior can be used as the service load data.
  • any suitable forecasting algorithm can be used, from traditional time series forecasting methods, such as autoregressive integrated moving average (ARIMA) models, to recurrent neural networks (RNNs) or dilated convolutional neural networks (CNNs).
  • RNNs recurrent neural networks
  • CNNs dilated convolutional neural networks
  • the region 502 represents a time window of historical data to be considered in the forecast
  • the non-shaded region 504 represents the load that the service will observe in the future. Note that it is possible to use either all historical data to conduct a prediction or a designated part of it, such as, for example, the last two days of historical data.
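The windowing-and-forecast step can be sketched as below. The percentile-based forecaster is a deliberately trivial stand-in for the ARIMA/RNN/CNN models named above, and the 90th-percentile aggregation mirrors the statistical measure mentioned earlier; all load numbers are hypothetical:

```python
def percentile(samples, q):
    """Linear-interpolation percentile (as in common statistics packages)."""
    s = sorted(samples)
    rank = (len(s) - 1) * q / 100.0
    lo = int(rank)
    if lo + 1 >= len(s):
        return float(s[-1])
    return s[lo] + (rank - lo) * (s[lo + 1] - s[lo])

def forecast_load(history, window, steps=1, q=90):
    """One-step or multi-step load forecast from a trailing window.

    Each step predicts the q-th percentile of the most recent `window`
    observations, then rolls the window forward with the prediction.
    """
    recent = list(history[-window:])
    forecasts = []
    for _ in range(steps):
        f = percentile(recent, q)
        forecasts.append(f)
        recent = recent[1:] + [f]
    return forecasts

history = [80, 95, 110, 100, 120, 105, 90, 115]  # requests/sec per period
forecasts = forecast_load(history, window=4, steps=2)  # two-step-ahead forecast
```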
  • SL supervised machine learning
  • the input to the models f_c(), f_m() and f_n() is the forecasted service load s_l, as well as the maximum CPU r_c, memory r_m and network r_n resources that are allocated to the service.
  • the output of f_l(x) is a probability distribution, which is the final output of the third node N3.
  • Because the third node N3 uses a set of pre-trained machine learning models, it does not need to be placed in any specific physical location in the network. In practice, however, having it reside in the same physical location as the network service may help to decrease data transfer overhead between the node and a deployment database that provides the resource utilization data. Although not significant during the inference phase, this improvement may help in case the system includes mechanisms to trigger automatic re-learning of the machine learning models.
  • This order of modeling the KPIs is chosen assuming that the CPU usage u_c can be directly inferred from the load more accurately than memory or network usage, and that a combination of load and CPU r_c captures memory better than network usage. Hence, by choosing this order, the resources that are easier to predict and less dependent on other resources are modeled first, and then the more complex ones are modeled. Note, however, that any other order of resource modeling would be possible, depending on how accurate the model is for the given input combination. The motivation for this successive model order is to reduce or minimize the error propagation in the steps described below.
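The successive modeling order (load → CPU → memory → network) can be sketched with placeholder predictors. The linear functions and their coefficients below are made-up stand-ins for whatever supervised models f_c, f_m, f_n are actually trained; only the chaining structure reflects the text:

```python
# Hypothetical chained predictors: each resource model consumes the
# forecasted load plus the resources predicted before it, so resources
# that are easier to predict are modeled first and feed the harder ones.

def f_c(load):                  # CPU inferred from load alone
    return 0.02 * load          # cores per req/sec (made-up coefficient)

def f_m(load, cpu):             # memory from load and predicted CPU
    return 0.5 * load + 200 * cpu

def f_n(load, cpu, mem):        # network from everything upstream
    return 1.2 * load + 0.01 * mem

s_l = 100.0                     # forecasted load, req/sec
u_c = f_c(s_l)                  # predicted CPU requirement
u_m = f_m(s_l, u_c)             # predicted memory requirement
u_n = f_n(s_l, u_c, u_m)        # predicted network requirement
```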
  • the forecasted load s_l is used to predict the required amount of CPU û_c.
  • N² tuples (û_c^i, û_m^{i,j}) are provided.
  • the absolute required network resources û_n^{i,j} are computed, an interval [û_n^{i,j}/r_h, û_n^{i,j}/r_l] is created, and N equidistant points {û_n^{i,j,k}}, k = 1, …, N, are taken.
  • a latency distribution f_l(û_c^i, û_m^{i,j}, û_n^{i,j,k}, s_l) is generated.
  • the process checks to see if it satisfies the SLA, i.e., checks the condition Prob[û_l^{i,j,k} ≤ b] ≥ α.
  • the harmonic mean favors those resource allocations that lead to approximately the same utilization for all resources.
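Putting the steps above together, the search over candidate tuples can be sketched as follows. `sla_met` is a hypothetical stand-in for evaluating the predicted latency distribution against Prob(KPI ≤ b) ≥ α, and all candidate values are illustrative:

```python
import itertools

def harmonic_mean(values):
    return len(values) / sum(1.0 / v for v in values)

def choose_allocation(cpu_cands, mem_cands, net_cands,
                      u_c, u_m, u_n, sla_met):
    """Pick the SLA-feasible tuple with the most balanced utilization.

    u_c, u_m, u_n are predicted absolute requirements; expected
    utilization of a candidate allocation is requirement / allocation.
    sla_met(vc, vm, vn) stands in for the probabilistic SLA check.
    """
    best, best_score = None, -1.0
    for vc, vm, vn in itertools.product(cpu_cands, mem_cands, net_cands):
        if not sla_met(vc, vm, vn):
            continue  # tuple predicted to breach the SLA, skip it
        score = harmonic_mean([u_c / vc, u_m / vm, u_n / vn])
        if score > best_score:
            best, best_score = (vc, vm, vn), score
    return best

# Toy run: the (made-up) SLA check only requires at least 3.0 CPU cores,
# so the smallest feasible allocations win on utilization.
alloc = choose_allocation(
    cpu_cands=[2.5, 3.0, 4.0], mem_cands=[500, 600], net_cands=[150, 200],
    u_c=2.0, u_m=450.0, u_n=124.5,
    sla_met=lambda vc, vm, vn: vc >= 3.0)
```

Because every tuple is scored independently, the loop body parallelizes trivially, matching the observation below that tuple evaluation is an inexpensive, parallelizable computation.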
  • Step 1 Forecast load 100 req/sec
  • N = 10 points per interval
  • 10³ possible combinations are evaluated, covering a large part of the search space.
  • the evaluation of each tuple can be executed in parallel and is an inexpensive computation.
  • a naive approach would be to construct the search space for the resources semi-manually, e.g., by considering a set of CPU values in the range [r_c/2, r_c], a set of memory values in the range [r_m/2, r_m] and a set of network values in the range [r_n/2, r_n]. Then, it would be possible to test all possible combinations of these points, yet in that case utilization is not taken into account since all possible resources may be allocated.
  • a method for managing one or more of system resources for a system infrastructure that supports a network service in a service-based communication network includes receiving (602) a forecast of a service load, s_l, of the network service for a future time epoch; searching (604) a search space of resource allocation values for sets of resource allocation values that are predicted to meet a key performance indicator, KPI, metric of the network service, wherein the resource allocation values correspond to levels of system resources provided to the network service by the system infrastructure; selecting (606) a set of resource allocation values that meets a predetermined criterion for balancing resource utilization from among the sets of resource allocation values that meet the KPI metric of the network service; and configuring (608) the system infrastructure to provide system resources having the selected set of resource allocation values to the network service during the future time epoch.
  • Selecting (606) the set of resource allocation values that meets the predetermined criterion for balancing resource utilization may include selecting a set of resource allocation values that optimizes a function of the utilization of each of the resources in the set.
  • selecting (606) the set of resource allocation values that meets the predetermined criterion for balancing resource utilization may include selecting a set of resource allocation values that maximizes a function of the utilization of each of the resources in the set.
  • searching the space of resource allocation values for sets of resource allocation values that are predicted to meet the KPI metric includes, for a system resource of the system infrastructure, generating (702) a predicted range of resource utilization values that are required to meet the forecast of the service load s_l of the network service; and identifying (704), from among the predicted ranges of resource utilization values, a plurality of sets of resource utilization values that meet the KPI metric of the network service.
  • the method further includes selecting (802) a predetermined number of utilization values from the predicted range of resource utilization values; and combining (804) selected ones of the predicted utilization values of the plurality of system resources to form the sets of predicted utilization values, wherein the sets of predicted utilization values form the search space, wherein identifying the plurality of sets of system resource utilization values that meet the KPI metric of the network service includes identifying the plurality of sets of system resource utilization values that meet the KPI metric of the network service from the search space.
  • configuring the system infrastructure to provide system resources having the selected set of resource allocation values to the network service includes transmitting the selected set of system resource utilization values to an actuation node that is configured to apply changes to the system infrastructure to provide the system resources having the selected set of system resource utilization values during the future time epoch.
  • searching the search space of resource allocation values for sets of resource allocation values that are predicted to meet the KPI metric includes, for a set of resource allocation values, generating a prediction of the KPI metric for the future epoch based on the forecasted system load s_l and the set of resource allocation values.
  • generating the predicted range of resource utilization values includes, for a first system resource, defining an interval [û/r_h, û/r_l] based on a high maximum resource utilization value r_h and a low maximum resource utilization value r_l, where 0 < r_l < r_h ≤ 1, for the first system resource and a predicted absolute resource value û for the first system resource, and selecting a plurality of values û^i from within the interval.
  • selecting the plurality of values û^i from within the interval includes selecting N equidistant points within the interval.
  • the method further includes, for a second system resource, defining a second interval [û/rh, û/rl] for the second system resource and a predicted absolute resource value û for the second system resource, and selecting a plurality of values from within the second interval.
  • the network service includes a communication network, wherein the service load comprises a number of requests per unit time, and the KPI includes network latency.
  • the system infrastructure includes a distributed computing infrastructure, and the system resources comprise central processing unit, CPU, resources, memory resources and/or network resources.
  • the system resources include memory usage, and a predicted absolute memory usage requirement ûm is generated as a function of the forecast system load sl and a maximum CPU utilization rc as ûm = fm(sl, rc), where fm() is a regression model of the absolute memory usage.
  • selecting the set of resource allocation values that meets the predetermined criterion includes selecting the set of resource allocation values that maximizes a harmonic mean of expected resource utilization values.
  • selecting the set of resource allocation values that meets the predetermined criterion includes selecting a set of predicted resource utilization values that maximizes a harmonic mean of the predicted resource utilization values.
  • models can be used to evaluate the optimal solution x* with greater accuracy and perhaps adjust it. For instance, after solving the optimization problem, the solution x* can be evaluated using another model to check whether Prob[f(x*) ≤ b] ≥ α. If the constraint is not met, then the resources can be adjusted by incrementing them and rechecking the condition. A similar assessment and adjustment can be done for the resource utilization models.
  • the terms “comprise”, “comprising”, “comprises”, “include”, “including”, “includes”, “have”, “has”, “having”, or variants thereof are open-ended, and include one or more stated features, integers, elements, steps, components, or functions but do not preclude the presence or addition of one or more other features, integers, elements, steps, components, functions, or groups thereof.
  • the common abbreviation “e.g.” which derives from the Latin phrase “exempli gratia,” may be used to introduce or specify a general example or examples of a previously mentioned item, and is not intended to be limiting of such item.
  • Example embodiments are described herein with reference to block diagrams and/or flowchart illustrations of computer-implemented methods, apparatus (systems and/or devices) and/or computer program products. It is understood that a block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by computer program instructions that are performed by one or more computer circuits.
  • These computer program instructions may be provided to a processor circuit of a general purpose computer circuit, special purpose computer circuit, and/or other programmable data processing circuit to produce a machine, such that the instructions, which execute via the processor of the computer and/or other programmable data processing apparatus, transform and control transistors, values stored in memory locations, and other hardware components within such circuitry to implement the functions/acts specified in the block diagrams and/or flowchart block or blocks, and thereby create means (functionality) and/or structure for implementing the functions/acts specified in the block diagrams and/or flowchart block(s).
  • any appropriate steps, methods, features, functions, or benefits disclosed herein may be performed through one or more functional units or modules of one or more virtual apparatuses.
  • Each virtual apparatus may comprise a number of these functional units.
  • These functional units may be implemented via processing circuitry, which may include one or more microprocessors or microcontrollers, as well as other digital hardware, which may include digital signal processors (DSPs), special-purpose digital logic, and the like.
  • the processing circuitry may be configured to execute program code stored in memory, which may include one or several types of memory such as read-only memory (ROM), random-access memory (RAM), cache memory, flash memory devices, optical storage devices, etc.
  • Program code stored in memory includes program instructions for executing one or more telecommunications and/or data communications protocols as well as instructions for carrying out one or more of the techniques described herein.
  • the processing circuitry may be used to cause the respective functional unit to perform corresponding functions according to one or more embodiments.
  • the term unit may have conventional meaning in the field of electronics, electrical devices and/or electronic devices and may include, for example, electrical and/or electronic circuitry, devices, modules, processors, memories, logic, solid state and/or discrete devices, computer programs or instructions for carrying out respective tasks, procedures, computations, outputs, and/or displaying functions, and so on, such as those described herein.

Abstract

A method for managing system resources for a system infrastructure that supports a network service in a service-based communication network includes receiving a forecast of a service load (sl) of the network service for a future time epoch; searching a search space of resource allocation values for sets of resource allocation values that are predicted to meet a key performance indicator (KPI) metric of the network service; selecting a set of resource allocation values that meets a predetermined criterion for balancing resource utilization from among the sets of resource allocation values that meet the KPI metric of the network service; and configuring the system infrastructure to provide system resources having the selected set of resource allocation values to the network service during the future time epoch.

Description

DYNAMIC RESOURCE DIMENSIONING FOR SERVICE ASSURANCE
TECHNICAL FIELD
[0001] The present disclosure relates generally to system resource management, and more particularly, to management of system resources in communication networks.
BACKGROUND
[0002] Management of networks has benefited significantly from the introduction of Software Defined Networks (SDNs). Similarly, service management has become more flexible when the services are cloud-based. In cloud applications that utilize Infrastructure as a Service (laaS), the resource utilization of the underlying infrastructure may be monitored in order to perform load balancing to ensure that the Virtual Machines (VMs) are not overloaded, or to ensure that elastic containerized applications have sufficient resources to execute their tasks. This monitoring is fundamental to enabling laaS providers to fulfill service-level agreements (SLAs), in which they are responsible for guaranteeing pre-established performance levels of the infrastructure, under penalty of fine in case the expected performance is not met.
[0003] Infrastructure-related SLAs typically place requirements on hardware resources, such as VM uptime or operation within certain resource limits. However, virtual network functions (VNFs) and cloud network functions (CNFs), especially those that serve the packet core in 4G and 5G wireless communication networks, such as virtualized mobility management entities (MME) and evolved packet gateways (EPG) in a 4G evolved packet core (EPC) network or cloud native unified data management (UDM) functions or policy control functions (PCF) in a 5G core network, have their own key performance indicators (KPIs). For example, a web server may have as a KPI the latency for serving a request, while a computationally intensive service (such as, for example, an Artificial Intelligence backend) may have as a KPI the ratio rsucc of successfully handled requests. Therefore, service-specific SLAs may target application-related KPIs that are not necessarily the same as the KPIs of the underlying infrastructure. Also, these SLAs are typically stochastic in nature, because the KPI is monitored and aggregated over a certain time period, e.g., a week or a month.
[0004] A SLA may sometimes be expressed probabilistically or as a ratio. For example, a probabilistically expressed SLA may state that a KPI, such as latency, should not exceed some upper (or lower) bound b for α % of the time, e.g., Prob(KPI ≤ b) ≥ α. Alternatively, a SLA may be expressed as a ratio, e.g., the ratio of tasks that are successfully completed is at least α, where typical values for α are 95%, 99% or 99.5%. Depending on the nature of the KPI and depending on the context, one or the other of these SLA frameworks may be more suitable.
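For illustration, a probabilistic SLA of this form can be checked empirically over a window of recorded KPI samples. The sketch below is only an example; the sample values, bound and threshold are invented for the sketch and are not taken from the disclosure:

```python
# Sketch: empirical check of a probabilistic SLA, Prob(KPI <= b) >= alpha.
# The latency samples and parameter values below are illustrative assumptions.

def sla_met(kpi_samples, b, alpha):
    """Return True if the fraction of samples within bound b is at least alpha."""
    within = sum(1 for k in kpi_samples if k <= b)
    return within / len(kpi_samples) >= alpha

latency_ms = [4.2, 7.9, 5.1, 9.8, 6.3, 8.7, 5.5, 7.0, 6.1, 12.4]
print(sla_met(latency_ms, b=10.0, alpha=0.90))  # 9 of 10 samples <= 10 ms -> True
```

In a deployed system the same check would be aggregated over the SLA's monitoring period, e.g., a week or a month.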
SUMMARY
[0005] A method for managing one or more of system resources for a system infrastructure that supports a network service in a service-based communication network includes receiving a forecast of a service load, sl, of the network service for a future time epoch. A search space of resource allocation values is searched for sets of resource allocation values that are predicted to meet a key performance indicator, KPI, metric of the network service. The resource allocation values correspond to levels of system resources provided to the network service by the system infrastructure. A set of resource allocation values is selected that meets a predetermined criterion for balancing resource utilization from among the sets of resource allocation values that meet the KPI metric of the network service, and the system infrastructure is configured to provide system resources having the selected set of resource allocation values to the network service during the future time epoch. Selecting the set of resource allocation values that meets the predetermined criterion for balancing resource utilization may include selecting a set of resource allocation values that optimizes a function of the utilization of each of the resources in the set.
[0006] In some embodiments, searching the space of resource allocation values for sets of resource allocation values that are predicted to meet the KPI metric includes, for a system resource of the system infrastructure, generating a predicted range of resource utilization values that are required to meet the forecast of the service load sl of the network service; and identifying, from among the predicted ranges of resource utilization values, a plurality of sets of resource utilization values that meet the KPI metric of the network service.
[0007] In some embodiments, the method further includes selecting a predetermined number of utilization values from the predicted range of resource utilization values; and combining selected ones of the predicted utilization values of the plurality of system resources to form the sets of predicted utilization values, wherein the sets of predicted utilization values form the search space, wherein identifying the plurality of sets of system resource utilization values that meet the KPI metric of the network service includes identifying the plurality of sets of system resource utilization values that meet the KPI metric of the network service from the search space.
[0008] In some embodiments, configuring the system infrastructure to provide system resources having the selected set of resource allocation values to the network service includes transmitting the selected set of system resource utilization values to an actuation node that is configured to apply changes to the system infrastructure to provide the system resources having the selected set of system resource utilization values during the future time epoch.
[0009] In some embodiments, searching the search space of resource allocation values for sets of resource allocation values that are predicted to meet the KPI metric includes, for a set of resource allocation values, generating a prediction of the KPI metric for the future epoch based on the forecasted system load sl and the set of resource allocation values.
[0010] In some embodiments, generating the predicted range of resource utilization values includes, for a first system resource, defining an interval [û/rh, û/rl] based on a high maximum resource utilization value rh and a low maximum resource utilization value rl, where 0 ≤ rl < rh ≤ 1 for the first system resource and a predicted absolute resource value û for the first system resource, and selecting a plurality of values ûi from within the interval.
[0011] In some embodiments, selecting the plurality of values ûi from within the interval includes selecting N equidistant points within the interval.
[0012] In some embodiments, the method further includes, for a second system resource, defining a second interval [û/rh, û/rl] for the second system resource and a predicted absolute resource value û for the second system resource, and selecting a plurality of values from within the second interval.
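The interval construction of these embodiments can be sketched as follows. The function and variable names are assumptions made for the sketch: u_hat is the predicted absolute usage û, and r_low, r_high are the utilization bounds rl and rh. Allocating û/rh yields utilization rh (the tightest candidate), and allocating û/rl yields utilization rl (the loosest):

```python
def candidate_allocations(u_hat, r_low, r_high, n_points):
    """Generate N equidistant candidate allocation values in [u_hat/r_high, u_hat/r_low].

    Each candidate is an absolute allocation level; dividing the predicted
    usage u_hat by a target utilization gives the allocation achieving it.
    """
    lo, hi = u_hat / r_high, u_hat / r_low
    step = (hi - lo) / (n_points - 1)
    return [lo + i * step for i in range(n_points)]

# Example (invented numbers): predicted CPU usage of 2.3 cores, with
# utilization to be kept between 50% and 80%.
print(candidate_allocations(2.3, 0.5, 0.8, 4))
```

Repeating this per resource and taking the Cartesian product of the per-resource candidate lists yields the search space described in the embodiments above.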
[0013] In some embodiments, the network service includes a communication network, wherein the service load comprises a number of requests per unit time, and the KPI includes network latency.
[0014] In some embodiments, the system infrastructure includes a distributed computing infrastructure, and the system resources comprise central processing unit, CPU, resources, memory resources and/or network resources.
[0015] In some embodiments, the system resources include CPU utilization, and a predicted absolute CPU requirement ûc is generated as a function of the forecast system load sl as ûc = fc(sl), where fc() is a regression model of the absolute CPU usage.
[0016] In some embodiments, the system resources include memory usage, and a predicted absolute memory usage requirement ûm is generated as a function of the forecast system load sl and a maximum CPU utilization rc as ûm = fm(sl, rc), where fm() is a regression model of the absolute memory usage.
[0017] In some embodiments, the system resources include network usage, and a predicted absolute network usage requirement ûn is generated as a function of the forecast system load sl, the maximum CPU utilization rc, and a maximum memory usage rm as ûn = fn(sl, rc, rm), where fn() is a regression model of the absolute network usage.
[0018] In some embodiments, selecting the set of resource allocation values that meets the predetermined criterion includes selecting the set of resource allocation values that maximizes a harmonic mean of expected resource utilization values.
[0019] In some embodiments, selecting the set of resource allocation values that meets the predetermined criterion includes selecting a set of predicted resource utilization values that maximizes a harmonic mean of the predicted resource utilization values.
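The harmonic-mean selection criterion of these embodiments can be sketched minimally. Each candidate is taken here to be a tuple of predicted utilization values in (0, 1]; the numeric values are illustrative assumptions:

```python
def harmonic_mean(values):
    """Harmonic mean of strictly positive utilization values."""
    return len(values) / sum(1.0 / v for v in values)

def select_balanced(candidate_sets):
    """Pick the candidate set of predicted utilizations with the largest harmonic mean.

    The harmonic mean is dragged down by any single under-utilized resource,
    so maximizing it favors balanced allocations.
    """
    return max(candidate_sets, key=harmonic_mean)

# Example: both candidates have arithmetic mean 0.7, but the harmonic mean
# penalizes the candidate with one badly under-used resource.
print(select_balanced([(0.9, 0.9, 0.3), (0.7, 0.7, 0.7)]))  # (0.7, 0.7, 0.7)
```

This illustrates why the harmonic mean, rather than the arithmetic mean, is a natural balancing criterion here.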
[0020] In some embodiments, searching the search space of resource allocation values for sets of resource allocation values that are predicted to meet the KPI of the network service and selecting the set of resource allocation values that meets the predetermined criterion from among the sets of resource allocation values that meet the KPI metric of the network service includes generating predicted resource allocation values vc, vm, vn of the system resources according to the formulas:
vc = fc(sl, rc, rm, rn), vm = fm(sl, rc, rm, rn), vn = fn(sl, rc, rm, rn), where rc, rm and rn are maximum resource utilizations; constructing a linear model f = wTx with a Gaussian prior on w; obtaining a posterior distribution of w as w ~ N(μ, Σ); determining a probability Prob[wTx ≤ b] ≥ α, where b is a value of the KPI metric and α is a threshold; and maximizing fc(x), subject to fn(x) ∈ [rln, rhn], fm(x) ∈ [rlm, rhm] and b − μTx ≥ Φ-1(α)·||Σ1/2x||, where x = [sl, rc, rm, rn], rl is a lower maximum resource utilization and rh is an upper maximum resource utilization, with 0 ≤ rl ≤ rh ≤ 1.
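Under a Gaussian posterior w ~ N(μ, Σ), the quantity wTx is normally distributed with mean μTx and variance xTΣx, so the chance constraint Prob[wTx ≤ b] ≥ α reduces to the deterministic condition b − μTx ≥ Φ-1(α)·sqrt(xTΣx). A minimal sketch of this feasibility check follows, using only the Python standard library; the numeric values of x, μ, Σ, b and α are invented for illustration:

```python
from statistics import NormalDist

def chance_constraint_met(x, mu, Sigma, b, alpha):
    """Check Prob[w^T x <= b] >= alpha for w ~ N(mu, Sigma).

    Closed form: b - mu^T x >= Phi^{-1}(alpha) * sqrt(x^T Sigma x).
    """
    n = len(x)
    mean = sum(mu[i] * x[i] for i in range(n))
    var = sum(x[i] * Sigma[i][j] * x[j] for i in range(n) for j in range(n))
    return b - mean >= NormalDist().inv_cdf(alpha) * var ** 0.5

# Illustrative numbers (assumed, not taken from the disclosure):
x = [100.0, 0.7, 0.6, 0.5]                 # x = [sl, rc, rm, rn]
mu = [0.05, 1.0, 1.0, 1.0]                 # posterior mean of w
Sigma = [[1e-4 if i == j else 0.0 for j in range(4)] for i in range(4)]
print(chance_constraint_met(x, mu, Sigma, b=10.0, alpha=0.99))  # True
```

In the optimization described above, this check would act as a constraint while fc(x) is maximized over candidate vectors x.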
[0021] Some embodiments provide a computer program comprising instructions which when executed on a computer perform any of the foregoing methods.
[0022] Some embodiments provide a computer program product comprising a computer program, the computer program comprising instructions which when executed on a computer perform any of the foregoing methods.
[0023] Some embodiments provide a non-transitory computer readable medium storing instructions which when executed by a computer perform any of the foregoing methods.
[0024] Some embodiments provide a network node including processing circuitry configured to perform operations of receiving a forecast of a service load, sl, of the network service for a future time epoch, and searching a search space of resource allocation values for sets of resource allocation values that are predicted to meet a key performance indicator, KPI, metric of the network service, wherein the resource allocation values correspond to levels of system resources provided to the network service by the system infrastructure. The operations further include selecting a set of resource allocation values that meets a predetermined criterion for balancing resource utilization from among the sets of resource allocation values that meet the KPI metric of the network service, and configuring the system infrastructure to provide system resources having the selected set of resource allocation values to the network service during the future time epoch.
[0025] A system for managing one or more of system resources for a system infrastructure that supports a network service in a service-based communication network includes a first node that records service load data of a service load, sl, of the network service, a second node that generates a forecast of the service load based on the recorded service load data, and a third node that receives the forecast of the service load for the future time epoch. The third node searches a search space of resource allocation values for sets of resource allocation values that are predicted to meet a key performance indicator, KPI, metric of the network service, wherein the resource allocation values correspond to levels of system resources provided to the network service by the system infrastructure. The third node selects a set of resource allocation values that meets a predetermined criterion for balancing resource utilization from among the sets of resource allocation values that meet the KPI metric of the network service, and configures the system infrastructure to provide system resources having the selected set of resource allocation values to the network service during the future time epoch.
[0026] Some embodiments use arbitrary supervised learning algorithms to model resources and key performance indicators. Given a probabilistic service level agreement (SLA) framework, some embodiments provide resource optimization that enhances resource utilization under SLA constraints.
[0027] Some embodiments described herein may result in lower operational costs as a result of smart provisioning and/or improved overall network performance by releasing unused resources to services that may benefit from them.
BRIEF DESCRIPTION OF THE DRAWINGS
[0028] The accompanying drawings, which are included to provide a further understanding of the disclosure and are incorporated in and constitute a part of this application, illustrate certain non-limiting embodiments of inventive concepts. In the drawings:
[0029] Figures 1 and 2 illustrate various elements and workflows of a core network of a wireless communication system.
[0030] Figure 3 is a block diagram of a network node that may be configured to perform operations according to some embodiments.
[0031] Figure 4 illustrates functional aspects of some nodes of a system according to some embodiments.
[0032] Figure 5 is a graph that illustrates an example of a service load time series profile.
[0033] Figures 6 to 9 illustrate operations of systems/methods according to some embodiments.
DETAILED DESCRIPTION
[0034] Inventive concepts will now be described more fully hereinafter with reference to the accompanying drawings, in which examples of embodiments of inventive concepts are shown. Inventive concepts may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of present inventive concepts to those skilled in the art. It should also be noted that these embodiments are not mutually exclusive. Components from one embodiment may be tacitly assumed to be present/used in another embodiment.
[0035] The following description presents various embodiments of the disclosed subject matter. These embodiments are presented as teaching examples and are not to be construed as limiting the scope of the disclosed subject matter. For example, certain details of the described embodiments may be modified, omitted, or expanded upon without departing from the scope of the described subject matter.
[0036] Despite deployment infrastructure having native monitoring and load-balancing mechanisms, such built-in mechanisms target only system-related resources, and are unaware of the KPIs that are relevant for the service that runs on top of the infrastructure. Therefore, the mechanisms for monitoring service-related KPIs must be implemented on top of the existing architecture. Naive solutions, such as the assignment of the maximum allowable resources, have the disadvantage that they may lead to over-provisioning and/or increased operational costs. Another disadvantage is that allocated but unused resources can be harmful from a system point of view, since they cannot be flexibly utilized by other services that could potentially benefit from them. Therefore, to achieve efficient virtual network function (VNF) assurance, there is a need for mechanisms that forecast the service KPIs at least a few time steps ahead, and provide an adaptive resource configuration mechanism, in terms of memory, network, CPU, or other relevant resource allocation. The ultimate target is to enable the service to run seamlessly within the KPI bounds without under-utilizing the resources or operating within a risky regime, i.e., under conditions that are close to violating the SLA.
[0037] In the context of 5G, [1] proposed a dynamic resource scheduling mechanism for VNFs where they pre-allocate VMs based on the load prediction for the next period, yet the KPI assurance only covers the underlying infrastructure. One framework that takes user QoE into account, for cloud-native 5G network functions, is disclosed in [2] as a simple reactive strategy for up- or down-scaling the resource allocation to maintain a user QoE. However, the QoE is not strictly quantified as in the formal context described above.
[0038] Some embodiments described herein provide systems and methods for dynamic dimensioning of services within designated resource limits. In particular, some embodiments provide a framework for dynamic traffic-driven dimensioning that uses machine learning algorithms of arbitrary complexity to provide an efficient resource allocation that meets probabilistic SLAs.
[0039] Some embodiments use arbitrary supervised learning algorithms to model the resources and KPIs. Given a probabilistic SLA framework, some embodiments provide resource optimization that enhances resource utilization under SLA constraints.
[0040] Some embodiments described herein may result in lower operational costs as a result of smart provisioning and/or improved overall network performance by releasing unused resources to services that may benefit from them.
[0041] Figure 1 illustrates various elements and workflows of a core network of a wireless communication system 100 including a plurality of network nodes in which some embodiments described herein may be utilized. The nodes may be associated with a function of the core network, such as a network data analytics function (NWDAF), a management data analytics function (MDAF), etc. An SLA may be defined that provides one or more performance requirements for the system 100.
[0042] Referring to Figure 1, a data stream consisting of service-related traffic, e.g., a number of requests or network transmitted bytes, may be captured by a network node N1, which creates a time-series of the captured data. This time-series data is passed on to an artificial intelligence (Al) Node N2 which generates a forecast of the service-related traffic for the next one or more time periods, where "time period" refers to an arbitrary amount of time, e.g., one minute, one hour, etc.
[0043] Next, another Al node N3 uses the forecast data as an input, together with an available resource budget, to specify an allocation of computational resources to the service, within the admissible limits, so that the SLA is fulfilled. The proposed allocation is fed as input to an actuation node N4, which allocates resources within the infrastructure in accordance with the specification provided by the Al node N3.
[0044] In the example shown in Figure 1, the operations of the Al node N2 are implemented as part of an NWDAF 230, while the operations of the Al node N3 and the actuation node N4 are implemented as part of an MDAF 220. The operations of the network node N1 are implemented separately from the NWDAF 230 and MDAF 220. In contrast, in the example shown in Figure 2, the operations of the network node N1 are implemented as part of the NWDAF 230.
[0045] In some implementations, the Al node N3 may provide a risk for SLA breach along with the proposed allocation strategy to the actuation node N4. The risk can be quantified, for instance, as the margin for crossing a KPI threshold. For example, if the latency should be less than 10 msec for 99% of the time and the desired resource allocation strategy is predicted to have latency less than 10 msec for 99.5% of the time, then the safety margin of 0.5% can also be fed as input to the actuator to enable further decision making strategies on top of the resource optimizer.
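The safety margin in this example amounts to a simple subtraction of compliance levels; a hedged sketch (the function name and values are assumptions):

```python
def sla_safety_margin(predicted_compliance, required_compliance):
    """Margin by which a proposed allocation is predicted to exceed the SLA target.

    A positive margin can be fed to the actuation node for further decision
    making; a negative margin flags a predicted SLA breach.
    """
    return predicted_compliance - required_compliance

# The example from the text: latency bound met 99.5% of the time vs. a 99% target.
print(sla_safety_margin(0.995, 0.99))
```

The actuator could, for instance, use a large positive margin as license to trim the allocation further, and a small or negative margin as a signal to add headroom.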
[0046] Some embodiments described herein may advantageously provide automatic KPI-driven scaling of a service deployment to meet an SLA. Some embodiments may advantageously provide efficient construction of a search-space for resource allocation to achieve enhanced utilization. In particular, some embodiments may advantageously maximize or increase utilization while fulfilling an SLA. Some embodiments may advantageously reduce resource consumption by the service, resulting in lower operational expenses. Moreover, some embodiments may advantageously reduce the amount of tied resources that could otherwise be utilized by other services or network functions.
[0047] Figure 3 is a block diagram of a network node according to some embodiments. Various embodiments provide a network node 300 that includes a processor circuit 306, a communication interface 320 coupled to the processor circuit, and a memory 308 coupled to the processor circuit. The memory 308 includes machine-readable computer program instructions that, when executed by the processor circuit, cause the processor circuit to perform some of the operations depicted in Figures 6 to 9.
[0048] The network node 300 may be a core network node of a core network, such as a 5GC or EPC core network. As shown, network node 300 includes a communication interface 320 (also referred to as a network interface) configured to provide communications with other nodes (e.g., with other base stations and/or core network nodes) of a communication network. The network node 300 also includes a processor circuit 306 (also referred to as a processor) and a memory circuit 308 (also referred to as memory) coupled to the processor circuit 306. The memory circuit 308 may include computer readable program code that when executed by the processor circuit 306 causes the processor circuit to perform operations according to embodiments disclosed herein. According to other embodiments, processor circuit 306 may be defined to include memory so that a separate memory circuit is not required.
[0049] As discussed herein, operations of the network node 300 may be performed by processor 306 and/or communication interface 320. For example, the processor 306 may control the communication interface 320 to transmit communications through the communication interface 320 to one or more other network nodes and/or to receive communications through network interface from one or more other network nodes. Moreover, modules may be stored in memory 308, and these modules may provide instructions so that when instructions of a module are executed by processor 306, processor 306 performs respective operations (e.g., operations discussed herein with respect to example embodiments). In addition, a structure similar to that of Figure 3 may be used to implement other network nodes. Moreover, network nodes discussed herein may be implemented as virtual network nodes.
[0050] Accordingly, a network node 300 (or radio access network (RAN) node 300) according to some embodiments includes a processor circuit 306 and a memory 308 coupled to the processor circuit, the memory including machine readable program instructions that, when executed by the processor circuit, cause the network node to perform operations described herein.
[0051] Referring again to Figures 1 and 2, a set of nodes N1-N4 is illustrated. The first node N1 performs an operation of capturing a summary of records of the load/traffic of a service, such as a service operated in a core network of a wireless communication network. The summary of records may correspond to raw data traffic at a highest possible resolution or to an aggregated version of the data traffic. In the latter case, the aggregation level of the records can be chosen based on performance targets and computational limitations.
[0052] For example, network node N1 may provide a mobility management entity (MME) network function handling signaling traffic. Nodes N2, N3 and N4 can be implemented as different microservices that enhance existing 5G nodes. Node N2 could be implemented, for example, as additional functionality to the NWDAF, while nodes N3 and N4 could be additional functionalities implemented in the MDAF. In any implementation, node N1 provides node N2 with data relating to service traffic and load, and node N2 trains a function, such as a machine learning function or deep learning function, to generate a forecast of a time series of the service traffic/load. Node N3 uses the traffic/load prediction to optimize infrastructure resources for meeting the SLA requirements and node N4 pushes the infrastructure changes back to node N1. For example, in this example, the infrastructure supporting an MME application may be scaled by changing the amount of memory and/or CPU processing power that are allocated to the virtual machine (VM) on which the MME function is executed.
[0053] In another alternative shown in Figure 2, nodes N1 and N2 are implemented together inside the NWDAF, since the primary function of the NWDAF is to serve consumers with insights that augment and enhance packet core functionality, as well as to assist with management of experience assurance. In such a scenario, node N1 collects data from various core nodes. As in the system shown in Figure 1, node N2 serves node N3 with a load/traffic prediction, and node N3 uses the load/traffic prediction to optimize infrastructure resources to ensure that the service meets the SLA requirements. Node N4 pushes the change back to any core node affecting the SLA.
[0054] Referring to Figure 4, the functionality of nodes N2 and N3 is illustrated in more detail. As shown therein, node N2 performs monitoring and forecasting based on traffic load data. The forecasted value is thereafter used by node N3 as the basis for optimizing resource allocation of the infrastructure according to constraints to meet the required SLA.
[0055] The historical data recorded by node N1 is used to construct a time-series. For each new observation obtained, a record is added to the historical time-series. A service load profile time series is shown in Figure 5. Using part of the historical data, e.g., the data that spans a given time window (similar to the shadowed part 502 in Figure 5), a one-step or a multi-step ahead forecast of the future expected service load is generated, using its 90th percentile or some other statistical measure, for the next (one or more) aggregation periods. In this example, the load can be expressed as a number of requests for a particular component or the number of bytes transmitted through the network.
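The windowed percentile forecast described above can be sketched as follows; the nearest-rank percentile rule and the function name are illustrative choices for this sketch, not prescribed by the disclosure:

```python
import math

def forecast_load(history, window, q=0.90):
    """One-step-ahead forecast: the q-th percentile (nearest-rank) of the
    most recent `window` observations of the load time series."""
    recent = sorted(history[-window:])
    rank = max(0, math.ceil(q * len(recent)) - 1)
    return recent[rank]

# Request counts for the last ten aggregation periods (made-up data).
loads = [90, 100, 95, 110, 105, 120, 98, 102, 115, 108]
print(forecast_load(loads, window=10))
```

A multi-step forecast would simply repeat this over successive aggregation periods, or replace the percentile with any other statistical measure of the windowed data.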
[0056] The forecast stage described here corresponds to the second node (N2) of the system 100. Formally, the second node N2 (Figures 1 and 2) receives as input the historical load data from node N1 and outputs a forecasted load at a desired future moment. The load data may consist, for example, of a number of requests that arrive at the network service at a given moment in time or to a number of active users. In general, any kind of measurement that influences the service behavior can be used as the service load data.
[0057] For the forecast performed by node N2, any suitable forecasting algorithm can be used, from traditional time series forecasting methods, such as autoregressive integrated moving average (ARIMA) models, to recurrent neural networks (RNNs) or dilated convolutional neural networks (CNNs). In the time series shown in Figure 5, the region 502 represents a time window of historical data to be considered in the forecast, and the non-shaded region 504 represents the load that the service will observe in the future. Note that it is possible to use either all historical data to conduct a prediction or a designated part of it, such as, for example, the last two days of historical data.
[0058] Within node N3, some embodiments use supervised machine learning (SL) algorithms to build regression models for the system resources, e.g., fc(), fm() and fn() for absolute/raw CPU usage uc, memory usage um, and network usage un, respectively. However, the systems/methods may use fc(), fm() and fn() to model the CPU utilization vc = uc/rc, the memory utilization vm = um/rm, and the network utilization vn = un/rn, where the quantities rc, rm and rn denote the maximum allocation for the respective resource. For example, assume that a microservice is consuming 2.3 cores of CPU and 1 GB of memory, while the maximum allocated resources are 4 cores and 2 GB, respectively. Then, the above metrics for absolute usage are (uc, rc) = (2.3, 4) cores and (um, rm) = (1, 2) GB, and the respective utilizations are vc = 2.3/4 = 57.5% and vm = 1/2 = 50%.
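The utilization arithmetic from the example above can be stated as a minimal sketch (the helper name is hypothetical):

```python
def utilization(usage, max_allocated):
    """Utilization v = u / r: the fraction of the allocated maximum in use."""
    return usage / max_allocated

# Figures from the example in the text: 2.3 of 4 CPU cores, 1 of 2 GB memory.
v_c = utilization(2.3, 4)
v_m = utilization(1, 2)
print(f"vc = {v_c:.1%}, vm = {v_m:.1%}")  # vc = 57.5%, vm = 50.0%
```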
[0059] The input to the models fc(), fm() and fn() is the forecasted service load sl, as well as the maximum CPU rc, memory rm and network rn resources that are allocated to the service. A Bayesian probabilistic model fl(x) is provided, where x = [sl, rc, rm, rn], for the KPI of interest, which in this example is the latency ul. The output of fl(x) is a probability distribution, which is the final output of the third node N3.
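Because fl(x) outputs a distribution rather than a point estimate, the SLA condition Prob[latency <= b] >= a can be checked empirically from predictive samples. A sketch with a stand-in Gaussian predictive distribution (the interface and numbers are assumptions for illustration):

```python
import random

def sla_satisfied(latency_samples, b, a):
    """Empirical check of Prob[latency <= b] >= a from predictive samples."""
    prob = sum(1 for u in latency_samples if u <= b) / len(latency_samples)
    return prob >= a

random.seed(0)
# Stand-in for samples drawn from the Bayesian model fl(x): ~8 ms latency.
samples = [random.gauss(8.0, 1.0) for _ in range(10_000)]
print(sla_satisfied(samples, b=10.0, a=0.95))
```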
[0060] Since the third node N3 uses a set of pre-trained machine learning models, it does not need to be placed in any specific physical location in the network. In practice, however, having it reside in the same physical location as the network service may help to decrease data transfer overhead between the node and a deployment database that provides the resource utilization data. Although not significant during the inference phase, this improvement may help in case the system includes mechanisms to trigger automatic re-learning of the machine learning models.
[0061] Using the definitions introduced above, two approaches may be defined for optimizing resource usage and satisfying the SLA, described below, namely, sequential resource modeling and constrained optimization.
[0062] Sequential resource modeling
[0063] For sequential resource modeling, assume that the resource utilization, expressed as the fraction of resources used divided by the maximum allocated resources, should be in the interval [rl, rh], where 0 ≤ rl ≤ rh ≤ 1. For this description, these intervals are chosen here to be the same for all resources, but different bounds can be chosen per resource without loss of generality. In this approach, models are constructed for the absolute resource usage, i.e., the number of cores uc, the memory capacity um, and the network usage un, as follows:

uc = fc(sl)
um = fm(sl, rc)
un = fn(sl, rc, rm)
[0064] This order of modeling the KPIs is chosen assuming that the CPU usage uc can be directly inferred from the load sl more accurately than memory or network usage, and that a combination of load and CPU limit rc captures memory better than network usage. Hence, by choosing this order, the resources that are easier to predict and less dependent on other resources are modeled first, and then the more complex ones are modeled. Note, however, that any other order of resource modeling would be possible, depending on how accurate the model is for the given combination of inputs. The motivation for this successive model order is to reduce or minimize the error propagation in the steps described below.
[0065] The forecasted load sl is used to predict the required amount of CPU ûc. An interval [ûc/rh, ûc/rl] is created, and N equidistant points ûc^i, i = 1, ..., N, are defined in this interval. Each point ûc^i is a candidate for the CPU allocation limit rc. Hence, in the following analysis, ûc^i is used as a proxy for rc. This way, it is guaranteed that when the absolute CPU usage of the service is ûc, the utilization vc = ûc/ûc^i will lie in [rl, rh].
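This candidate-grid construction can be sketched as follows, assuming equidistant spacing as described (the function name is a hypothetical helper):

```python
def candidate_limits(u_hat, r_l, r_h, n):
    """N equidistant candidate allocation limits in [u_hat/r_h, u_hat/r_l],
    so that the utilization u_hat / candidate always falls in [r_l, r_h]."""
    lo, hi = u_hat / r_h, u_hat / r_l
    return [lo + i * (hi - lo) / (n - 1) for i in range(n)]

# ûc = 1 core with [rl, rh] = [0.5, 0.8] and N = 4 candidates:
points = candidate_limits(1.0, 0.5, 0.8, 4)
print(points)  # [1.25, 1.5, 1.75, 2.0]
```

Picking any of these candidates as the allocation limit keeps the resulting utilization of 1 core between 50% and 80%.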
[0066] For each point ûc^i, the absolute required memory ûm^i = fm(sl, ûc^i) is computed. Again, for each ûm^i, an interval [ûm^i/rh, ûm^i/rl] is created, and N equidistant points ûm^(i,j), j = 1, ..., N, are taken in this interval. Similarly to the previous step, each point ûm^(i,j) is a candidate for the memory allocation limit rm, and thus ûm^(i,j) is used as a proxy for rm.
[0067] After this step, N^2 tuples (ûc^i, ûm^(i,j)) are provided. Finally, for each pair (ûc^i, ûm^(i,j)), the absolute required network usage ûn^(i,j) = fn(sl, ûc^i, ûm^(i,j)) is computed, an interval [ûn^(i,j)/rh, ûn^(i,j)/rl] is created, and N equidistant points ûn^(i,j,k), k = 1, ..., N, are taken in this interval.
[0068] At the end of this process, the resulting N^3 tuples (ûc^i, ûm^(i,j), ûn^(i,j,k)) are evaluated using the function fl(x) to yield a latency distribution ûl^(i,j,k) = fl(ûc^i, ûm^(i,j), ûn^(i,j,k), sl). For each distribution ûl^(i,j,k), the process checks whether it satisfies the SLA, i.e., checks the condition Prob[ûl^(i,j,k) ≤ b] ≥ a. From all index tuples (i*, j*, k*) such that ûl^(i,j,k) satisfies the criterion, the process picks the one that maximizes the harmonic mean of the expected CPU utilization vc^i = ûc/ûc^i, the expected memory utilization vm^(i,j) = ûm^i/ûm^(i,j), and the expected network utilization vn^(i,j,k) = ûn^(i,j)/ûn^(i,j,k).
[0069] A criterion other than the harmonic mean can be chosen, but the harmonic mean favors those resource allocations that lead to approximately the same utilization for all resources.
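This preference for balanced utilizations can be seen numerically; the following sketch compares two candidate allocations whose utilizations have the same arithmetic mean:

```python
from statistics import harmonic_mean

# Both candidates average 60% utilization across CPU, memory and network,
# but the balanced one scores higher under the harmonic mean.
balanced = [0.6, 0.6, 0.6]   # (vc, vm, vn)
skewed = [0.9, 0.6, 0.3]
print(harmonic_mean(balanced), harmonic_mean(skewed))
```

The harmonic mean penalizes any single under-utilized resource, so the selected allocation tends not to strand capacity in one dimension.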
[0070] To illustrate the procedure, a simple example is provided here:

[0071] Initialization: [rl, rh] = [0.5, 0.8]

[0072] Step 1: Forecast a load of 100 req/sec.

[0073] Step 2: Predict the CPU usage to obtain ûc = fc(100) = 1 core needed to serve this load.

[0074] Create the interval [ûc/0.8, ûc/0.5] = [1.25, 2.0] and take N = 10 points ûc^i, e.g., 1.25, 1.5, ..., 2.0.

[0075] Step 3: For each ûc^i, get the required memory, e.g., ûm^1 = fm(100, 1.25) = 1, ..., ûm^10 = fm(100, 2.0) = 2.

[0076] For ûm^1, create the interval [1/0.8, 1/0.5] = [1.25, 2.0] and take N = 10 points ûm^(1,j), e.g., 1.25, 1.5, ..., 2.0.

[0077] For ûm^10, create the interval [2/0.8, 2/0.5] = [2.5, 4.0] and take N = 10 points ûm^(10,j), e.g., 2.5, 2.75, ..., 4.0.

[0078] Step 4: For each pair (ûc^i, ûm^(i,j)), compute the required network capacity, e.g., ûn^(1,1) = fn(100, 1.25, 1.25) = 100, ..., ûn^(10,10) = fn(100, 2.0, 4.0) = 200.

[0079] For ûn^(1,1), get the interval [100/0.8, 100/0.5] = [125, 200], then take N = 10 points ûn^(1,1,k), e.g., 125, 150, ..., 200.

[0080] For ûn^(10,10), get the interval [200/0.8, 200/0.5] = [250, 400], then take N = 10 points ûn^(10,10,k), e.g., 250, ..., 400.

[0081] Step 5: For each tuple (ûc^i, ûm^(i,j), ûn^(i,j,k)) and the load sl, get the latency distribution as follows:

[0082] ûl^(1,1,1) = fl(100, 1.25, 1.25, 100), ..., ûl^(10,10,10) = fl(100, 2.0, 4.0, 400)

[0083] Find the distributions ûl^(i,j,k) that meet the SLA Prob[ûl^(i,j,k) ≤ b] ≥ a.

[0084] For each such tuple (ûc^i, ûm^(i,j), ûn^(i,j,k)), compute the CPU utilization vc^i = ûc/ûc^i, the memory utilization vm^(i,j) = ûm^i/ûm^(i,j), and the network utilization vn^(i,j,k) = ûn^(i,j)/ûn^(i,j,k).

[0085] Return the tuple that yields the best harmonic mean of (vc^i, vm^(i,j), vn^(i,j,k)).
[0086] Note here that, for N = 10 points per interval, N^3 = 1,000 possible combinations are evaluated, covering a large part of the search space. The evaluation of the tuples can be executed in parallel and is an inexpensive computation.
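The sequential search of Steps 1-5 can be sketched end to end as follows. The regression models and the SLA check here are toy stand-ins chosen so the example runs; a real system would plug in the learned models fc, fm, fn and the Bayesian latency check described above:

```python
from statistics import harmonic_mean

# Toy stand-ins for the trained regression models (assumptions, not the
# disclosed models): 100 req/sec needs ~1 core, ~1 GB, ~100 units of network.
def f_c(load):
    return load / 100.0            # cores

def f_m(load, r_c):
    return load / 100.0            # GB (ignores r_c in this toy model)

def f_n(load, r_c, r_m):
    return load                    # network units

def meets_sla(load, r_c, r_m, r_n):
    # Placeholder for the probabilistic check Prob[latency <= b] >= a.
    return r_c >= 1.4 and r_m >= 1.4 and r_n >= 140.0

def grid(u_hat, r_l, r_h, n):
    """N equidistant candidate limits in [u_hat/r_h, u_hat/r_l]."""
    lo, hi = u_hat / r_h, u_hat / r_l
    return [lo + i * (hi - lo) / (n - 1) for i in range(n)]

def sequential_search(load, r_l=0.5, r_h=0.8, n=4):
    """Enumerate the n**3 candidate allocations and return the SLA-feasible
    one maximizing the harmonic mean of the expected utilizations."""
    u_c = f_c(load)
    best, best_score = None, -1.0
    for rc in grid(u_c, r_l, r_h, n):
        u_m = f_m(load, rc)
        for rm in grid(u_m, r_l, r_h, n):
            u_n = f_n(load, rc, rm)
            for rn in grid(u_n, r_l, r_h, n):
                if not meets_sla(load, rc, rm, rn):
                    continue
                score = harmonic_mean([u_c / rc, u_m / rm, u_n / rn])
                if score > best_score:
                    best, best_score = (rc, rm, rn), score
    return best

print(sequential_search(100.0))
```

With these stand-ins, the search returns the smallest SLA-feasible grid point in each dimension, which is also the most utilized one.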
[0087] It will be appreciated that a different number of points Nk may be chosen for each interval, and such points need not be equidistant. Essentially, this approach yields a reduced search space in the grid of all possible resource allocations by considering only those vectors that will eventually lead to good resource provisioning.
[0088] If one wishes to reduce complexity, a naive approach would be to construct the search space for the resources semi-manually, e.g., by considering a set of CPU values in the range [rc/2, rc], a set of memory values in the range [rm/2, rm] and a set of network values in the range [rn/2, rn]. Then, it would be possible to test all possible combinations of these points, yet in that case utilization is not taken into account, since all possible resources may be allocated.
[0089] One of the benefits of the approach described above in this section is that it may lead to an allocation (ûc, ûm, ûn) that is strictly lower than (rc, rm, rn), and thus the remaining resources can be temporarily released to other services until they are requested back.
[0090] It will be further appreciated that this approach is not constrained in the choice of functions, and that decision trees, neural networks or any other function that provides a good fit can be used.
[0091] Operations according to some embodiments are illustrated in Figures 6 to 8. Referring to Figure 6, a method for managing one or more system resources for a system infrastructure that supports a network service in a service-based communication network includes receiving (602) a forecast of a service load, sl, of the network service for a future time epoch; searching (604) a search space of resource allocation values for sets of resource allocation values that are predicted to meet a key performance indicator, KPI, metric of the network service, wherein the resource allocation values correspond to levels of system resources provided to the network service by the system infrastructure; selecting (606) a set of resource allocation values that meets a predetermined criterion for balancing resource utilization from among the sets of resource allocation values that meet the KPI metric of the network service; and configuring (608) the system infrastructure to provide system resources having the selected set of resource allocation values to the network service during the future time epoch. Selecting (606) the set of resource allocation values that meets the predetermined criterion for balancing resource utilization may include selecting a set of resource allocation values that optimizes a function of the utilization of each of the resources in the set. In particular, selecting (606) the set of resource allocation values that meets the predetermined criterion for balancing resource utilization may include selecting a set of resource allocation values that maximizes a function of the utilization of each of the resources in the set.
[0092] Referring to Figure 7, in some embodiments, searching the space of resource allocation values for sets of resource allocation values that are predicted to meet the KPI metric includes, for a system resource of the system infrastructure, generating (702) a predicted range of resource utilization values that are required to meet the forecast of the service load sl of the network service; and identifying (704), from among the predicted ranges of resource utilization values, a plurality of sets of resource utilization values that meet the KPI metric of the network service.
[0093] Referring to Figure 8, in some embodiments, the method further includes selecting (802) a predetermined number of utilization values from the predicted range of resource utilization values; and combining (804) selected ones of the predicted utilization values of the plurality of system resources to form the sets of predicted utilization values, wherein the sets of predicted utilization values form the search space, wherein identifying the plurality of sets of system resource utilization values that meet the KPI metric of the network service includes identifying the plurality of sets of system resource utilization values that meet the KPI metric of the network service from the search space.
[0094] In some embodiments, configuring the system infrastructure to provide system resources having the selected set of resource allocation values to the network service includes transmitting the selected set of system resource utilization values to an actuation node that is configured to apply changes to the system infrastructure to provide the system resources having the selected set of system resource utilization values during the future time epoch.
[0095] In some embodiments, searching the search space of resource allocation values for sets of resource allocation values that are predicted to meet the KPI metric includes, for a set of resource allocation values, generating a prediction ûl of the KPI metric for the future epoch based on the forecasted system load sl and the set of resource allocation values.
[0096] In some embodiments, generating the predicted range of resource utilization values includes, for a first system resource, defining an interval [û/rh, û/rl] based on a high maximum resource utilization value rh and a low maximum resource utilization value rl, where 0 ≤ rl ≤ rh ≤ 1, for the first system resource and a predicted absolute resource value û for the first system resource, and selecting a plurality of values û^i from within the interval.
[0097] In some embodiments, selecting the plurality of values û^i from within the interval includes selecting N equidistant points within the interval.
[0098] In some embodiments, the method further includes, for a second system resource, defining a plurality of second intervals [û/rh, û/rl] for the second system resource and a predicted absolute resource value û for the second system resource, and selecting a plurality of values from within the second intervals.
[0099] In some embodiments, the network service includes a communication network, wherein the service load comprises a number of requests per unit time, and the KPI includes network latency.
[0100] In some embodiments, the system infrastructure includes a distributed computing infrastructure, and the system resources comprise central processing unit, CPU, resources, memory resources and/or network resources.

[0101] In some embodiments, the system resources include CPU utilization, and a predicted absolute CPU requirement ûc is generated as a function of the forecast system load as ûc = fc(sl), where fc() is a regression model of the absolute CPU usage.
[0103] In some embodiments, the system resources include memory usage, and a predicted absolute memory usage requirement ûm is generated as a function of the forecast system load sl and a maximum CPU utilization rc as ûm = fm(sl, rc), where fm() is a regression model of the absolute memory usage.
[0104] In some embodiments, the system resources include network usage, and a predicted absolute network usage requirement ûn is generated as a function of the forecast system load sl, the maximum CPU utilization rc, and a maximum memory usage rm as ûn = fn(sl, rc, rm), where fn() is a regression model of the absolute network usage.
[0105] In some embodiments, selecting the set of resource allocation values that meets the predetermined criterion includes selecting the set of resource allocation values that maximizes a harmonic mean of expected resource utilization values.
[0106] In some embodiments, selecting the set of resource allocation values that meets the predetermined criterion includes selecting a set of predicted resource utilization values that maximizes a harmonic mean of the predicted resource utilization values.
[0107] Constrained optimization
[0108] For constrained optimization, assume again that the resource utilization should be in the interval [rl, rh], where 0 ≤ rl ≤ rh ≤ 1. Without loss of generality, it is assumed that linear models (quadratic models are also possible) of the resource utilization, i.e., of the quantities vc, vm and vn, can be constructed as follows:

vc = fc(x), vm = fm(x), vn = fn(x), where x = [sl, rc, rm, rn].

[0109] Also, choose a linear model for fl = w^T x with a Gaussian prior on w. After training, e.g., using Markov chain Monte Carlo (MCMC), the posterior distribution of w is obtained as w ~ N(μ, Σ). Then, conditioned on the input vector x = [sl, rc, rm, rn], a closed-form expression is obtained for Prob[w^T x ≤ b] ≥ a. Capitalizing on the closed-form expression for the SLA, the following optimization problem is solved:

maximize fc(x)
subject to fn(x) ∈ [rl, rh]
           fm(x) ∈ [rl, rh]
           Prob[w^T x ≤ b] ≥ a
[0110] This problem can be solved efficiently and yields the optimal/maximum CPU utilization and admissible memory and network utilizations such that the SLA is met. Of course, one can choose to maximize another resource utilization and place the CPU utilization under the constraints.
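The closed form used in the SLA constraint follows from Gaussian linearity: if w ~ N(μ, Σ), then w^T x is Gaussian with mean μ^T x and variance x^T Σ x, so Prob[w^T x ≤ b] = Φ((b − μ^T x) / sqrt(x^T Σ x)), where Φ is the standard normal CDF. A sketch of this check, with an entirely hypothetical posterior for illustration:

```python
import math

def prob_latency_le(b, x, mu, sigma):
    """Closed-form Prob[w^T x <= b] for w ~ N(mu, Sigma):
    w^T x is Gaussian with mean mu^T x and variance x^T Sigma x."""
    mean = sum(m * xi for m, xi in zip(mu, x))
    var = sum(x[i] * sigma[i][j] * x[j]
              for i in range(len(x)) for j in range(len(x)))
    z = (b - mean) / math.sqrt(var)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # standard normal CDF

# Hypothetical posterior over the weights for x = [sl, rc, rm, rn]:
mu = [0.01, -1.0, -0.5, -0.001]
sigma = [[1e-4 if i == j else 0.0 for j in range(4)] for i in range(4)]
x = [100.0, 2.0, 2.0, 200.0]
print(prob_latency_le(10.0, x, mu, sigma) >= 0.95)  # SLA check with a = 0.95
```

Because this probability is differentiable in x, it can be handed to a standard constrained solver alongside the linear utilization constraints.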
[0111] In some implementations, more sophisticated ML algorithms, such as decision trees and neural networks, can be used to build models for the resource utilizations vc, vm, vn, and the latency ul. These models can be used to evaluate the optimal solution x* with greater accuracy and perhaps adjust it. For instance, after solving the optimization problem, the solution x* can be evaluated using another model, f'l(), checking whether Prob[f'l(x*) ≤ b] ≥ a. If the constraint is not met, then the resources can be adjusted by incrementing them and rechecking the condition. A similar assessment and adjustment can be done for the resource utilization models.
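The increment-and-recheck adjustment can be sketched as a simple loop; the function name, step size and the SLA check interface are assumptions for illustration:

```python
def adjust_until_sla(x, check, step=0.1, max_iter=100):
    """Increment the resource allocations in x = (sl, rc, rm, rn) until the
    (hypothetical) SLA check passes, mirroring the post-hoc adjustment."""
    s_l, r_c, r_m, r_n = x
    for _ in range(max_iter):
        if check(s_l, r_c, r_m, r_n):
            return (s_l, r_c, r_m, r_n)
        r_c, r_m, r_n = r_c + step, r_m + step, r_n + step
    return None  # give up if the SLA cannot be met within the budget

# Toy check standing in for Prob[f'l(x*) <= b] >= a:
result = adjust_until_sla((100.0, 1.0, 1.0, 1.0),
                          lambda s, c, m, n: c >= 1.35)
print(result)
```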
[0112] Accordingly, referring to Figure 9, in some embodiments, searching the search space of resource allocation values for sets of resource allocation values that are predicted to meet the KPI of the network service, and selecting the set of resource allocation values that meets the predetermined criterion from among the sets of resource allocation values that meet the KPI metric of the network service, includes generating (902) predicted resource allocation values vc, vm, vn of the system resources according to the formulas vc = fc(x), vm = fm(x), vn = fn(x), where x = [sl, rc, rm, rn] and rc, rm and rn are maximum resource utilizations; constructing (904) a linear model for fl = w^T x with a Gaussian prior on w; obtaining (906) a posterior distribution of w as w ~ N(μ, Σ); determining (908) a probability Prob[w^T x ≤ b] ≥ a, where b is a value of the KPI metric and a is a threshold; and maximizing (910) fc(x), subject to fn(x) ∈ [rl, rh] and fm(x) ∈ [rl, rh], where rl is a lower maximum resource utilization and rh is an upper maximum resource utilization, with 0 ≤ rl ≤ rh ≤ 1.
[0113] Explanations for abbreviations from the above disclosure are provided below.
Abbreviation Explanation
3GPP 3rd Generation Partnership Project
5G 5th Generation
5GC 5G Core
ANN Artificial Neural Network
ARIMA Autoregressive Integrated Moving Average
CNF Cloud Network Functions
CNN Convolutional Neural Network
NWDAF Network Data Analytics Function
EPC Evolved Packet Core
EPG Evolved Packet Gateway
IaaS Infrastructure-as-a-Service
KPI Key Performance Indicators
MME Mobility Management Entity
NN Neural Network
PCF Policy Control Function
RNN Recurrent Neural Network
SL Supervised Learning
SLA Service-Level Agreements
SDN Software Defined Networks
UDM User Data Management
VNF Virtual Network Functions
VM Virtual Machine
[0114] References:
[1] A. Bilal, T. Tarik, A. Vajda, and B. Miloud, "Dynamic Cloud Resource Scheduling in Virtualized 5G Mobile Systems," in 2016 IEEE Global Communications Conference (GLOBECOM), 2016, pp. 1-6.

[2] S. Dutta, T. Taleb, and A. Ksentini, "QoE-aware elasticity support in cloud-native 5G systems," in 2016 IEEE International Conference on Communications (ICC), 2016, pp. 1-6.

[0115] Further definitions and embodiments are discussed below.
[0116] In the above-description of various embodiments of present inventive concepts, it is to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of present inventive concepts. Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which present inventive concepts belong. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of this specification and the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
[0117] When an element is referred to as being "connected", "coupled", "responsive", or variants thereof to another element, it can be directly connected, coupled, or responsive to the other element or intervening elements may be present. In contrast, when an element is referred to as being "directly connected", "directly coupled", "directly responsive", or variants thereof to another element, there are no intervening elements present. Like numbers refer to like elements throughout. Furthermore, "coupled", "connected", "responsive", or variants thereof as used herein may include wirelessly coupled, connected, or responsive. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. Well-known functions or constructions may not be described in detail for brevity and/or clarity. The term "and/or" includes any and all combinations of one or more of the associated listed items.
[0118] It will be understood that although the terms first, second, third, etc. may be used herein to describe various elements/operations, these elements/operations should not be limited by these terms. These terms are only used to distinguish one element/operation from another element/operation. Thus, a first element/operation in some embodiments could be termed a second element/operation in other embodiments without departing from the teachings of present inventive concepts. The same reference numerals or the same reference designators denote the same or similar elements throughout the specification.
[0119] As used herein, the terms "comprise", "comprising", "comprises", "include", "including", "includes", "have", "has", "having", or variants thereof are open-ended, and include one or more stated features, integers, elements, steps, components, or functions but do not preclude the presence or addition of one or more other features, integers, elements, steps, components, functions, or groups thereof. Furthermore, as used herein, the common abbreviation "e.g.", which derives from the Latin phrase "exempli gratia," may be used to introduce or specify a general example or examples of a previously mentioned item, and is not intended to be limiting of such item. The common abbreviation "i.e.", which derives from the Latin phrase "id est," may be used to specify a particular item from a more general recitation.

[0120] Example embodiments are described herein with reference to block diagrams and/or flowchart illustrations of computer-implemented methods, apparatus (systems and/or devices) and/or computer program products. It is understood that a block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by computer program instructions that are performed by one or more computer circuits.
These computer program instructions may be provided to a processor circuit of a general purpose computer circuit, special purpose computer circuit, and/or other programmable data processing circuit to produce a machine, such that the instructions, which execute via the processor of the computer and/or other programmable data processing apparatus, transform and control transistors, values stored in memory locations, and other hardware components within such circuitry to implement the functions/acts specified in the block diagrams and/or flowchart block or blocks, and thereby create means (functionality) and/or structure for implementing the functions/acts specified in the block diagrams and/or flowchart block(s).
[0121] These computer program instructions may also be stored in a tangible computer- readable medium that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable medium produce an article of manufacture including instructions which implement the functions/acts specified in the block diagrams and/or flowchart block or blocks. Accordingly, embodiments of present inventive concepts may be embodied in hardware and/or in software (including firmware, resident software, micro-code, etc.) that runs on a processor such as a digital signal processor, which may collectively be referred to as "circuitry," "a module" or variants thereof.
[0122] It should also be noted that in some alternate implementations, the functions/acts noted in the blocks may occur out of the order noted in the flowcharts. For example, two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Moreover, the functionality of a given block of the flowcharts and/or block diagrams may be separated into multiple blocks and/or the functionality of two or more blocks of the flowcharts and/or block diagrams may be at least partially integrated. Finally, other blocks may be added/inserted between the blocks that are illustrated, and/or blocks/operations may be omitted without departing from the scope of inventive concepts. Moreover, although some of the diagrams include arrows on communication paths to show a primary direction of communication, it is to be understood that communication may occur in the opposite direction to the depicted arrows.
[0123] Many variations and modifications can be made to the embodiments without substantially departing from the principles of the present inventive concepts. All such variations and modifications are intended to be included herein within the scope of present inventive concepts. Accordingly, the above disclosed subject matter is to be considered illustrative, and not restrictive, and the examples of embodiments are intended to cover all such modifications, enhancements, and other embodiments, which fall within the spirit and scope of present inventive concepts. Thus, to the maximum extent allowed by law, the scope of present inventive concepts are to be determined by the broadest permissible interpretation of the present disclosure including the examples of embodiments and their equivalents, and shall not be restricted or limited by the foregoing detailed description.
[0124] Additional explanation is provided below.

[0125] Generally, all terms used herein are to be interpreted according to their ordinary meaning in the relevant technical field, unless a different meaning is clearly given and/or is implied from the context in which it is used. All references to a/an/the element, apparatus, component, means, step, etc. are to be interpreted openly as referring to at least one instance of the element, apparatus, component, means, step, etc., unless explicitly stated otherwise.
The steps of any methods disclosed herein do not have to be performed in the exact order disclosed, unless a step is explicitly described as following or preceding another step and/or where it is implicit that a step must follow or precede another step. Any feature of any of the embodiments disclosed herein may be applied to any other embodiment, wherever appropriate. Likewise, any advantage of any of the embodiments may apply to any other embodiments, and vice versa. Other objectives, features and advantages of the enclosed embodiments will be apparent from the foregoing description.
[0126] Any appropriate steps, methods, features, functions, or benefits disclosed herein may be performed through one or more functional units or modules of one or more virtual apparatuses. Each virtual apparatus may comprise a number of these functional units. These functional units may be implemented via processing circuitry, which may include one or more microprocessors or microcontrollers, as well as other digital hardware, which may include digital signal processors (DSPs), special-purpose digital logic, and the like. The processing circuitry may be configured to execute program code stored in memory, which may include one or several types of memory such as read-only memory (ROM), random-access memory (RAM), cache memory, flash memory devices, optical storage devices, etc. Program code stored in memory includes program instructions for executing one or more telecommunications and/or data communications protocols as well as instructions for carrying out one or more of the techniques described herein. In some implementations, the processing circuitry may be used to cause the respective functional unit to perform corresponding functions according to one or more embodiments.
[0127] The term unit may have conventional meaning in the field of electronics, electrical devices and/or electronic devices and may include, for example, electrical and/or electronic circuitry, devices, modules, processors, memories, logic, solid state and/or discrete devices, computer programs or instructions for carrying out respective tasks, procedures, computations, outputs, and/or displaying functions, and so on, such as those that are described herein.

CLAIMS:
1. A method for managing one or more system resources for a system infrastructure that supports a network service in a service-based communication network, the method comprising: receiving (602) a forecast of a service load, sl, of the network service for a future time epoch; searching (604) a search space of resource allocation values for sets of resource allocation values that are predicted to meet a key performance indicator, KPI, metric of the network service, wherein the resource allocation values correspond to levels of system resources provided to the network service by the system infrastructure; selecting (606) a set of resource allocation values that meets a predetermined criterion for balancing resource utilization from among the sets of resource allocation values that meet the KPI metric of the network service; and configuring (608) the system infrastructure to provide system resources having the selected set of resource allocation values to the network service during the future time epoch.
2. The method of Claim 1, wherein searching the space of resource allocation values for sets of resource allocation values that are predicted to meet the KPI metric comprises: for each of a plurality of system resources of the system infrastructure, generating (702) a predicted range of resource utilization values that are required to meet the forecast of the service load sl of the network service; and identifying (704), from among the predicted ranges of resource utilization values, a plurality of sets of resource utilization values that meet the KPI metric of the network service.
3. The method of Claim 2, further comprising: selecting (802) a predetermined number of utilization values from the predicted range of resource utilization values; and combining (804) selected ones of the predicted utilization values of the plurality of system resources to form the sets of predicted utilization values, wherein the sets of predicted utilization values form the search space; wherein identifying the plurality of sets of system resource utilization values that meet the KPI metric of the network service comprises identifying the plurality of sets of system resource utilization values that meet the KPI metric of the network service from the search space.
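One way to realize the search space of Claims 2-3 is to sample a fixed number of candidate utilization values per resource and take all combinations; a minimal sketch, assuming hypothetical per-resource bounds (in practice the ranges would come from the forecast load and the fitted models):

```python
from itertools import product

def build_search_space(predicted_ranges, n_points=5):
    """Combine per-resource candidate utilization values into candidate sets.

    predicted_ranges: dict mapping resource name -> (low, high) predicted
    utilization bounds for the forecast service load.
    Returns a list of dicts, one per candidate resource-allocation set.
    """
    samples = {}
    for resource, (low, high) in predicted_ranges.items():
        step = (high - low) / (n_points - 1)
        samples[resource] = [low + i * step for i in range(n_points)]
    names = list(samples)
    return [dict(zip(names, combo))
            for combo in product(*(samples[n] for n in names))]

# Hypothetical predicted ranges for CPU, memory, and network utilization.
space = build_search_space({"cpu": (0.4, 0.8),
                            "memory": (0.3, 0.9),
                            "network": (0.2, 0.6)})
```

With 5 sample points per resource, the search space contains 5³ = 125 candidate sets, each of which is then tested against the KPI metric.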
4. The method of any previous Claim, wherein configuring the system infrastructure to provide system resources having the selected set of resource allocation values to the network service comprises: transmitting the selected set of system resource utilization values to an actuation node that is configured to apply changes to the system infrastructure to provide the system resources having the selected set of system resource utilization values during the future time epoch.
5. The method of any previous Claim, wherein searching the search space of resource allocation values for sets of resource allocation values that are predicted to meet the KPI metric comprises: for a set of resource allocation values, generating a prediction of the KPI metric for the future epoch based on the forecast system load sl and the set of resource allocation values.
6. The method of Claim 2, wherein generating the predicted range of resource utilization values, vc, vm, vn, comprises, for a first system resource, defining an interval [û/rh, û/rl] based on a high maximum resource utilization value rh and a low maximum resource utilization value rl, where 0 ≤ rl ≤ rh ≤ 1, for the first system resource and a predicted absolute resource value û for the first system resource, and selecting a plurality of values ûl from within the interval.
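A small numeric illustration of Claim 6, with hypothetical values: if the predicted absolute CPU requirement is û = 2.0 cores and the utilization bounds are rl = 0.5 and rh = 0.8, candidate allocations lie in [û/rh, û/rl] = [2.5, 4.0], and Claim 7's N equidistant points can be taken from that interval:

```python
u_hat, r_l, r_h = 2.0, 0.5, 0.8        # hypothetical predicted cores and utilization bounds
low, high = u_hat / r_h, u_hat / r_l   # interval [û/rh, û/rl] = [2.5, 4.0]
n = 5                                  # N equidistant points (Claim 7)
candidates = [low + i * (high - low) / (n - 1) for i in range(n)]
```

Dividing the predicted absolute usage by a target maximum utilization converts a usage forecast into an allocation: allocating more than û/rl wastes capacity, while allocating less than û/rh would push utilization above the high bound.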
7. The method of Claim 6, wherein selecting the plurality of values ûl from within the interval comprises selecting N equidistant points within the interval.
8. The method of Claim 6, further comprising, for a second system resource, defining a plurality of second intervals [û/rh, û/rl] for the second system resource and a predicted absolute resource value û for the second system resource, and selecting a plurality of values ûl,i from within the second intervals.
9. The method of any previous Claim, wherein the network service comprises a communication network, wherein the service load comprises a number of requests per unit time, and wherein the KPI comprises network latency.
10. The method of any previous Claim, wherein the system infrastructure comprises a distributed computing infrastructure, and wherein the system resources comprise central processing unit, CPU, resources, memory resources and/or network resources.
11. The method of Claim 10, wherein the system resources comprise CPU utilization, and wherein a predicted absolute CPU requirement ûc is generated as a function of the forecast system load as ûc = fc(sl), where fc() is a regression model of the absolute CPU usage.
12. The method of Claim 11, wherein the system resources comprise memory usage, and wherein a predicted absolute memory usage requirement ûm is generated as a function of the forecast system load sl and a maximum CPU utilization rc as ûm = fm(sl, rc), where fm() is a regression model of the absolute memory usage.
13. The method of Claim 12, wherein the system resources comprise network usage, and wherein a predicted absolute network usage requirement ûn is generated as a function of the forecast system load sl, the maximum CPU utilization rc, and a maximum memory usage rm as ûn = fn(sl, rc, rm), where fn() is a regression model of the absolute network usage.
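Claims 11-13 chain three regression models, each consuming the forecast load plus the utilization caps fixed so far. A minimal sketch with hypothetical linear models standing in for the fitted fc, fm, fn (the real models would be regressed on historical load/usage data):

```python
def f_c(s_l):
    # hypothetical fitted regression: absolute CPU usage vs. service load
    return 0.002 * s_l

def f_m(s_l, r_c):
    # hypothetical: memory usage grows with load and with tighter CPU caps
    return 0.01 * s_l + 0.5 / r_c

def f_n(s_l, r_c, r_m):
    # hypothetical: network usage depends on load and on CPU/memory caps
    return 0.005 * s_l + 0.1 / r_c + 0.1 / r_m

s_l = 1000.0                         # forecast requests per unit time
u_c = f_c(s_l)                       # predicted absolute CPU requirement
u_m = f_m(s_l, r_c=0.8)              # predicted absolute memory requirement
u_n = f_n(s_l, r_c=0.8, r_m=0.7)     # predicted absolute network requirement
```

The cascade matters: the memory prediction is conditioned on the chosen CPU cap, and the network prediction on both earlier caps, so each candidate (rc, rm, rn) combination yields its own (ûc, ûm, ûn) triple.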
14. The method of any previous Claim, wherein selecting the set of resource allocation values that meets the predetermined criterion comprises selecting the set of resource allocation values that maximizes a harmonic mean of expected resource utilization values.
15. The method of Claim 2, wherein selecting the set of resource allocation values that meets the predetermined criterion comprises selecting a set of predicted resource utilization values, vc, vm, vn, that maximizes a harmonic mean of the predicted resource utilization values.
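The balancing criterion of Claims 14-15 can be sketched as choosing, among the KPI-feasible candidates, the utilization triple with the highest harmonic mean; the harmonic mean rewards evenly loaded resources over lopsided ones. The candidate values below are illustrative:

```python
def harmonic_mean(values):
    # harmonic mean: n / sum of reciprocals; low for any near-idle resource
    return len(values) / sum(1.0 / v for v in values)

def select_balanced(feasible_sets):
    """feasible_sets: iterable of (vc, vm, vn) predicted utilization tuples
    that already satisfy the KPI metric (step 604)."""
    return max(feasible_sets, key=harmonic_mean)

best = select_balanced([(0.9, 0.2, 0.5),   # CPU hot, memory nearly idle
                        (0.6, 0.6, 0.6),   # evenly balanced
                        (0.8, 0.5, 0.4)])
```

Here the evenly balanced triple wins: its harmonic mean is 0.6, versus roughly 0.37 and 0.52 for the skewed candidates, reflecting that a single under-utilized resource drags the harmonic mean down.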
16. The method of Claim 1, wherein searching the search space of resource allocation values for sets of resource allocation values that are predicted to meet the KPI of the network service and selecting the set of resource allocation values that meets the predetermined criterion from among the sets of resource allocation values that meet the KPI metric of the network service comprises: generating (902) predicted resource allocation values vc, vm, vn of the system resources according to the formulas vc = fc(sl)/rc, vm = fm(sl, rc)/rm, vn = fn(sl, rc, rm)/rn, where rc, rm and rn are maximum resource utilizations; constructing (904) a linear model for l̂ = wTx with a Gaussian prior on w; obtaining (906) a posterior distribution of w as w ~ N(m, S); determining (908) a probability Prob[wTx ≤ b] ≥ α, where b is a value of the KPI metric and α is a threshold; and maximizing (910) fc(x), subject to fn(x) ∈ [rl,n, rh,n], fm(x) ∈ [rl,m, rh,m] and b − mTx ≥ Φ−1(α)√(xTSx), where x = [sl, rc, rm, rn], rl is a lower maximum resource utilization, rh is an upper maximum resource utilization, and 0 ≤ rl ≤ rh ≤ 1.
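The chance constraint in Claim 16 has a standard deterministic form: for a Gaussian posterior w ~ N(m, S), Prob[wᵀx ≤ b] ≥ α holds exactly when b − mᵀx ≥ Φ⁻¹(α)·√(xᵀSx). A self-contained check of that condition, with an illustrative posterior (all numbers hypothetical):

```python
from statistics import NormalDist

def chance_constraint_holds(m, S, x, b, alpha):
    """Check Prob[w^T x <= b] >= alpha for w ~ N(m, S).

    Deterministic equivalent: b - m^T x >= Phi^{-1}(alpha) * sqrt(x^T S x).
    m: posterior mean vector, S: posterior covariance matrix,
    x: feature vector [s_l, r_c, r_m, r_n], b: KPI bound, alpha: confidence.
    """
    mean = sum(mi * xi for mi, xi in zip(m, x))                   # m^T x
    var = sum(x[i] * S[i][j] * x[j]                               # x^T S x
              for i in range(len(x)) for j in range(len(x)))
    return b - mean >= NormalDist().inv_cdf(alpha) * var ** 0.5

# Hypothetical posterior over w and candidate allocation features.
ok = chance_constraint_holds(
    m=[0.001, 2.0, 1.0, 0.5],
    S=[[1e-6, 0, 0, 0],
       [0, 0.01, 0, 0],
       [0, 0, 0.01, 0],
       [0, 0, 0, 0.01]],
    x=[1000.0, 0.8, 0.7, 0.6],
    b=5.5,
    alpha=0.95,
)
```

In the maximization of step 910, this predicate would prune candidate x vectors whose posterior KPI distribution leaves more than 1 − α probability mass above the bound b.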
17. The method of any of Claims 1 to 16, wherein selecting (606) the set of resource allocation values that meets the predetermined criterion for balancing resource utilization comprises selecting a set of resource allocation values that optimizes a function of the utilization of each of the resources in the set.
18. A computer program comprising instructions which when executed on a computer perform any of the methods of Claims 1 to 17.
19. A computer program product comprising a computer program, the computer program comprising instructions which when executed on a computer perform any of the methods of Claims 1 to 17.
20. A non-transitory computer readable medium storing instructions which when executed by a computer perform any of the methods of Claims 1 to 17.
21. A network node (300) comprising: processing circuitry (306); and a memory coupled to the processing circuitry, wherein the memory comprises computer readable instructions that, when executed by the processing circuitry, cause the network node to perform operations comprising: receiving (602) a forecast of a service load, sl, of a network service for a future time epoch; searching (604) a search space of resource allocation values for sets of resource allocation values that are predicted to meet a key performance indicator, KPI, metric of the network service, wherein the resource allocation values correspond to levels of system resources provided to the network service by a system infrastructure; selecting (606) a set of resource allocation values that meets a predetermined criterion for balancing resource utilization from among the sets of resource allocation values that meet the KPI metric of the network service; and configuring (608) the system infrastructure to provide system resources having the selected set of resource allocation values to the network service during the future time epoch.
22. A system (100) for managing one or more system resources for a system infrastructure that supports a network service in a service-based communication network, the system comprising: a first node (N1) that records service load data of a service load, sl, of the network service; a second node (N2) that generates a forecast of the service load based on the recorded service load data; and a third node (N3) that receives (602) the forecast of the service load for a future time epoch, searches (604) a search space of resource allocation values for sets of resource allocation values that are predicted to meet a key performance indicator, KPI, metric of the network service, wherein the resource allocation values correspond to levels of system resources provided to the network service by the system infrastructure, selects (606) a set of resource allocation values that meets a predetermined criterion for balancing resource utilization from among the sets of resource allocation values that meet the KPI metric of the network service, and configures (608) the system infrastructure to provide system resources having the selected set of resource allocation values to the network service during the future time epoch.
23. The system of Claim 22, further comprising: a fourth node (N4) that applies changes to the system infrastructure based on the configured system resources.
24. The system of Claim 22 or 23, wherein the second node is part of a network data analytics function, NWDAF, of a core network of a wireless communication network, and wherein the third node is part of a management data analytics function, MDAF, of the core network.
25. The system of Claim 24, wherein the first node is part of the NWDAF of the core network.
PCT/EP2020/054270 2020-02-18 2020-02-18 Dynamic resource dimensioning for service assurance WO2021164857A1 (en)


Publications (1)

Publication Number Publication Date
WO2021164857A1 true WO2021164857A1 (en) 2021-08-26

Family

ID=69631592


Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023051940A1 (en) * 2021-10-01 2023-04-06 Telefonaktiebolaget Lm Ericsson (Publ) Methods and apparatus for quality of service analysis

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1806002A1 (en) * 2004-10-28 2007-07-11 Telecom Italia S.p.A. Method for managing resources in a platform for telecommunication service and/or network management, corresponding platform and computer program product therefor
WO2012166641A1 (en) * 2011-05-27 2012-12-06 Vpisystems Inc. Methods and systems for network traffic forecast and analysis
US20190158417A1 (en) * 2017-11-21 2019-05-23 International Business Machines Corporation Adaptive resource allocation operations based on historical data in a distributed computing environment
WO2020033424A1 (en) * 2018-08-06 2020-02-13 Intel Corporation Management data analytical kpis for 5g network traffic and resource


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
"3rd Generation Partnership Project; Technical Specification Group Services and System Aspects; Management and orchestration; Management Services for Communication Service Assurance; Requirements (Release 16)", 5 December 2019 (2019-12-05), XP051838099, retrieved from the Internet <URL:ftp.3gpp.org/tsg_sa/TSG_SA/TSGS_86/Docs/SP-191186.zip 28535-100.docx> [retrieved on 2019-12-05] *
A. Bilal, T. Tarik, A. Vajda, B. Miloud, "Dynamic Cloud Resource Scheduling in Virtualized 5G Mobile Systems", 2016 IEEE Global Communications Conference (GLOBECOM), 2016, pages 1-6, XP033058486, DOI: 10.1109/GLOCOM.2016.7841760
S. Dutta, T. Taleb, A. Ksentini, "QoE-aware elasticity support in cloud-native 5G systems", 2016 IEEE International Conference on Communications (ICC), 2016, pages 1-6, XP032922532, DOI: 10.1109/ICC.2016.7511377



Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20706218

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20706218

Country of ref document: EP

Kind code of ref document: A1