WO2021164857A1

WO2021164857A1 - Dynamic resource dimensioning for service assurance

Info

Publication number: WO2021164857A1
Application number: PCT/EP2020/054270
Authority: WO
Inventors: Efthymios STATHAKIS; Arthur GUSMAO; Martha VLACHOU-KONCHYLAKI
Original assignee: Telefonaktiebolaget Lm Ericsson (Publ)
Priority date: 2020-02-18
Filing date: 2020-02-18
Publication date: 2021-08-26

Abstract

A method for managing system resources for a system infrastructure that supports a network service in a service-based communication network includes receiving a forecast of a service load (sl) of the network service for a future time epoch; searching a search space of resource allocation values for sets of resource allocation values that are predicted to meet a key performance indicator (KPI) metric of the network service; selecting a set of resource allocation values that meets a predetermined criterion for balancing resource utilization from among the sets of resource allocation values that meet the KPI metric of the network service; and configuring the system infrastructure to provide system resources having the selected set of resource allocation values to the network service during the future time epoch.

Description

DYNAMIC RESOURCE DIMENSIONING FOR SERVICE ASSURANCE

TECHNICAL FIELD

[0001] The present disclosure relates generally to system resource management, and more particularly, to management of system resources in communication networks.

BACKGROUND

[0002] Management of networks has benefited significantly from the introduction of Software Defined Networks (SDNs). Similarly, service management has become more flexible when the services are cloud- based. In cloud applications that utilize Infrastructure as a Service (laaS), the resource utilization of the underlying infrastructure may be monitored in order to perform load balancing to ensure that the Virtual Machines (VMs) are not overloaded, or to ensure that elastic containerized applications have sufficient resources to execute their tasks. This monitoring is fundamental to enabling laaS providers to fulfill service-level agreements (SLAs), in which they are responsible for guaranteeing pre-established performance levels of the infrastructure, under penalty of fine in case the expected performance is not met.

[0003] Infrastructure-related SLAs typically place requirements on hardware resources, such as VM uptime or operation within certain resource limits. However, virtual network functions (VNFs) and cloud network functions (CNFs), especially those that serve the packet core in 4G and 5G wireless communication networks, such as virtualized mobility management entities (MME) and evolved packet gateways (EPG) in a 4G evolved packet core (EPC) network or cloud native unified data management (UDM) functions or policy control functions (PCF) in a 5G core network, have their own key performance indicators (KPIs). For example, a web server may have as a KPI the latency u_i for serving a request, while a computationally intensive service (such as, for example, an Artificial Intelligence backend) may have as a KPI the ratio r_succ of successfully handled requests. Therefore, service-specific SLAs may target application-related KPIs that are not necessarily the same as the KPIs of the underlying infrastructure. Also, these SLAs are typically stochastic in nature, because the KPI is monitored and aggregated over a certain time period, e.g., a week or a month.

[0004] A SLA may sometimes be expressed probabilistically or as a ratio. For example, a probabilistically expressed SLA may state that a KPI, such as latency, should not exceed some upper (or lower) bound b for α % of the time, e.g., Prob(KPI ≤ b) ≥ α. Alternatively, a SLA may be expressed as a ratio, e.g., the ratio of tasks that are successfully completed is at least α, where typical values for a are 95%, 99% or 99.5%. Depending on the nature of the KPI and depending on the context, one or the other of these SLA frameworks may be more suitable.

SUMMARY

[0005] A method for managing one or more of system resources for a system infrastructure that supports a network service in a service-based communication network includes receiving a forecast of a service load, s_i, of the network service for a future time epoch. A search space of resource allocation values is searched for sets of resource allocation values that are predicted to meet a key performance indicator, KPI, metric of the network service. The resource allocation values correspond to levels of system resources provided to the network service by the system infrastructure. A set of resource allocation values is selected that meets a predetermined criterion for balancing resource utilization from among the sets of resource allocation values that meet the KPI metric of the network service, and the system infrastructure is configured to provide system resources having the selected set of resource allocation values to the network service during the future time epoch. Selecting the set of resource allocation values that meets the predetermined criterion for balancing resource utilization may include selecting a set of resource allocation values that optimizes a function of the utilization of each of the resources in the set

[0006] In some embodiments, searching the space of resource allocation values for sets of resource allocation values that are predicted to meet the KPI metric includes, for a system resource of the system infrastructure, generating a predicted range of resource utilization values that are required to meet the forecast of the service load s_i of the network service; and identifying, from among the predicted ranges of resource utilization values, a plurality of sets of resource utilization values that meet the KPI metric of the network service.

[0007] In some embodiments, the method further includes selecting a predetermined number of utilization values from the predicted range of resource utilization values; and combining selected ones of the predicted utilization values of the plurality of system resources to form the sets of predicted utilization values, wherein the sets of predicted utilization values form the search space, wherein identifying the plurality of sets of system resource utilization values that meet the KPI metric of the network service includes identifying the plurality of sets of system resource utilization values that meet the KPI metric of the network service from the search space.

[0008] In some embodiments, configuring the system infrastructure to provide system resources having the selected set of resource allocation values to the network service includes transmitting the selected set of system resource utilization values to an actuation node that is configured to apply changes to the system infrastructure to provide the system resources having the selected set of system resource utilization values during the future time epoch. [0009] In some embodiments, searching the search space of resource allocation values for sets of resource allocation values that are predicted to meet the KPI metric includes, for a set of resource allocation values, generating a prediction U_j of the KPI metric for the future epoch based on the forecasted system load s_/, and the set of resource allocation values.

[0010] In some embodiments, generating the predicted range of resource utilization values includes, for a first system resource, defining an interval [û/ r_h, u/n] based on high maximum resource utilization value r_h and a low maximum resource utilization value n, where 0 ≤ n < r_h <

1 for the first system resource and a predicted absolute resource value û for the first system resource, and selecting a plurality of values û¹ from within the interval.

[0011] In some embodiments, selecting the plurality of values û¹ from within the interval includes selecting N equidistant points within the interval.

[0012] In some embodiments, the method further includes, for a second system resource, defining a plurality of second intervals [û/ r_h, û/ n] for the second system resource and a predicted absolute resource value û for the second system resource, and selecting a plurality of values from within the second interval.

[0013] In some embodiments, the network service includes a communication network, wherein the service load comprises a number of requests per unit time, and the KPI includes network latency.

[0014] In some embodiments, the system infrastructure includes a distributed computing infrastructure, and the system resources comprise central processing unit, CPU, resources, memory resources and/or network resources. [0015] In some embodiments, the system resources include CPU utilization, and a predicted absolute CPI requirement u_c is generated a function of the forecast system load as û_c = f_c(s_l), where f_c() is a regression model of the absolute CPU usage.

[0016] In some embodiments, the system resources include memory usage, and a predicted absolute memory usage requirement û_m is generated as a function of the forecast system load si and a maximum CPU utilization r_c as u_m = f_m(sl,r_c), where f_m() is a regression model of the absolute memory usage.

[0017] In some embodiments, the system resources include network usage, and a predicted absolute network usage requirement u_n is generated a function of the forecast system load Si, and the maximum CPU utilization r_c, and a maximum memory usage r_m as u_n = f_n(s_u r_c, r_m), where f_n() is a regression model of the absolute network usage.

[0018] In some embodiments, selecting the set of resource allocation values that meets the predetermined criterion includes selecting the set of resource allocation values that maximizes a harmonic mean of expected resource utilization values.

[0019] In some embodiments, selecting the set of resource allocation values that meets the predetermined criterion includes selecting a set of predicted resource utilization values that maximizes a harmonic mean of the predicted resource utilization values.

[0020] In some embodiments, searching the search space of resource allocation values for sets of resource allocation values that are predicted to meet the KPI of the network service and selecting the set of resource allocation values that meets the predetermined criterion from among the sets of resource allocation values that meet the KPI metric of the network service includes generating predicted resource allocation values v_c, v_m, v_n of the system resources according to the formulas:

Vc — f_c (S l_c, r_m, r_n), V_m — f_m(S_l, r_c, r_m, r_n), V_n — f_nC^_{b Vc}, T_m, G_h) where r_c, r_m and r_n are maximum resource utilizations; constructing a linear model for f_i = w^Tx with a Gaussian prior on w; obtaining a posterior distribution of w as w~K (μ, Σ); determining a probability Prob[w^Tx ≤ b] ≥ α where b is a value of the KPI metric and 0 is a threshold; and maximizing f_c(x), subject to f_n(x) e [G\G ], f_m(x ) e [r ¹, r™] and b - μ^Tx ≥ Φ^-1(α) Σ x| _, where x = [¾ r_c, r_m, r_n], r_l is a lower maximum resource utilization, r_h is an upper maximum resource utilization where 0 ≤ r_l ≤ r_h ≤ 1.

[0021] Some embodiments provide a computer program comprising instructions which when executed on a computer perform any of the foregoing methods.

[0022] Some embodiments provide a computer program product comprising computer program, the computer program comprising instructions which when executed on a computer perform any of the foregoing methods.

[0023] Some embodiments provide a non-transitory computer readable medium storing instructions which when executed by a computer perform any of the foregoing methods.

[0024] Some embodiments provide a network node including processing circuitry configured to perform operations of receiving a forecast of a service load, s_/, of the network service for a future time epoch, and searching a search space of resource allocation values for sets of resource allocation values that are predicted to meet a key performance indicator, KPI, metric of the network service, wherein the resource allocation values correspond to levels of system resources provided to the network service by the system infrastructure. The operations further include selecting a set of resource allocation values that meets a predetermined criterion for balancing resource utilization from among the sets of resource allocation values that meet the KPI metric of the network service, and configuring the system infrastructure to provide system resources having the selected set of resource allocation values to the network service during the future time epoch.

[0025] A system for managing one or more of system resources for a system infrastructure that supports a network service in a service-based communication network includes a first node that records service load data of a service load, si, of the network service, a second node that generates a forecast of the service load based on the recorded service load data, and a third node that receives the forecast of the service load for the future time epoch. The third node searches a search space of resource allocation values for sets of resource allocation values that are predicted to meet a key performance indicator, KPI, metric of the network service, wherein the resource allocation values correspond to levels of system resources provided to the network service by the system infrastructure. The third node selects a set of resource allocation values that meets a predetermined criterion for balancing resource utilization from among the sets of resource allocation values that meet the KPI metric of the network service, and configures the system infrastructure to provide system resources having the selected set of resource allocation values to the network service during the future time epoch.

[0026] Some embodiments use arbitrary supervised learning algorithms to model resources and key performance indicators. Given a probabilistic service level agreement (SLA_ framework, some embodiments provide resource optimization that enhances resource utilization under SLA constraints.

[0027] Some embodiments described herein may results in lower operational costs as a result of smart provisioning and/or improved overall network performance by releasing unused resources to services that may benefit from them.

BRIEF DESCRIPTION OF THE DRAWINGS

[0028] The accompanying drawings, which are included to provide a further understanding of the disclosure and are incorporated in a constitute a part of this application, illustrate certain non-limiting embodiments of inventive concepts. In the drawings:

[0029] Figures 1 and 2 illustrate various elements and workflows of a core network of a wireless communication system.

[0030] Figure 3 is a block diagram of a network node that may be configured to perform operations according to some embodiments.

[0031] Figure 4 illustrates functional aspects of some nodes of a system according to some embodiments.

[0032] Figure 5 is a graph that illustrates an example of a service load time series profile.

[0033] Figures 6 to 9 illustrate operations of systems/methods according to some embodiments.

DETAILED DESCRIPTION [0034] Inventive concepts will now be described more fully hereinafter with reference to the accompanying drawings, in which examples of embodiments of inventive concepts are shown. Inventive concepts may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of present inventive concepts to those skilled in the art. It should also be noted that these embodiments are not mutually exclusive. Components from one embodiment may be tacitly assumed to be present/used in another embodiment.

[0035] The following description presents various embodiments of the disclosed subject matter. These embodiments are presented as teaching examples and are not to be construed as limiting the scope of the disclosed subject matter. For example, certain details of the described embodiments may be modified, omitted, or expanded upon without departing from the scope of the described subject matter.

[0036] Despite deployment infrastructure having native monitoring and load-balancing mechanisms, such built-in mechanisms target only system-related resources, and are unaware of the KPIs that are relevant for the service that runs on top of the infrastructure. Therefore, the mechanisms for monitoring service-related KPIs must be implemented on top of the existing architecture. Naive solutions, such as the assignment of the maximum allowable resources, have the disadvantage that they may lead to over-provisioning and/or increased operational costs. Another reason is that allocated but unused resources can be harmful from a system point-of-view, since they cannot be flexibly utilized by other services that could potentially benefit from them. Therefore, to achieve efficient virtual network function (VNF) assurance, there is a need for mechanisms that forecast the service KPIs at least a few time steps ahead, and provide an adaptive resource configuration mechanism, in terms of memory, network, CPU, or other relevant resource allocation. The ultimate target is to enable the service to run seamlessly within the KPI bounds without under-utilizing the resources or operating within a risky regime, i.e., under conditions that are close to violating the SLA. [0037] In the context of 5G, [1] proposed a dynamic resource scheduling mechanism for VNFs where they pre-allocate VMs based on the load prediction for the next period, yet the KPI assurance only covers the underlying infrastructure. One framework that takes user QoE into account, for cloud-native 5G network functions, is disclosed in [2] as a simple reactive strategy for up- or down-scaling the resource allocation to maintain a user QoE. However, the QoE is not strictly quantified as in the formal context described above.

[0038] Some embodiments described herein provide systems and methods for dynamic dimensioning of services within designated resource limits. In particular, some embodiments provide a framework for dynamic traffic-driven dimensioning that uses machine learning algorithms, of arbitra ry complexity, to provide an efficient resource allocation that meets probabilistic SLAs.

[0039] Some embodiments use arbitrary supervised learning algorithms to model the resources and KPIs. Given a probabilistic SLA framework, some embodiments provide resource optimization that enhances resource utilization under SLA constraints.

[0040] Some embodiments described herein may results in lower operational costs as a result of smart provisioning and/or improved overall network performance by releasing unused resources to services that may benefit from them.

[0041] Figure 1 illustrates various elements and workflows of a core network of a wireless communication system 100 including a plurality of network nodes in which some embodiments described herein may be utilized. The nodes may be associated with a function of the core network, such as a network data analytics function (NWDAF), a management data analytics function (MDAF), etc. An SLA may be defined that provides one or more performance requirements for the system 100.

[0042] Referring to Figure 1, a data stream consisting of service-related traffic, e.g., a number of requests or network transmitted bytes, may be captured by a network node Nl, which creates a time-series of the captured data. This time-series data is passed on to an artificial intelligence (Al) Node N2 which generates a forecast of the service-related traffic for the next one or more time periods, where "time period" refers to an arbitra ry amount of time, e.g., one minute, one hour, etc. [0043] Next, another Al node N3 uses the forecast data as an input, together with an available resource budget, to specify an allocation of computational resources to the service, within the admissible limits, so that the SLA is fulfilled. The proposed allocation is fed as input to an actuation node N4, which allocates resources within the infrastructure in accordance with the specification provided by the Al node N3.

[0044] In the example shown in Figure 1, the operations of the Al node N2 are implemented as part of an NWDAF 230, while the operations of the Al node N3 and the actuation node N4 are implemented as part of an MDAF 220. The operations of the network node N1 are implemented separately from the NWDAF 230 and MDAF 220. In contrast, in the example shown in Figure 2, the operations of the network node N1 are implemented as part of the NWDAF 230.

[0045] In some implementations, the Al node N3 may provide a risk for SLA breach along with the proposed allocation strategy to the actuation node N4. The risk can be quantified, for instance, as the margin for crossing a KPI threshold. For example, if the latency should be less than 10 msec for 99% of the time and the desired resource allocation strategy is predicted to have latency less than 10 msec for 99.5% of the time, then the safety margin of 0.5% can also be fed as input to the actuator to enable further decision making strategies on top of the resource optimizer.

[0046] Some embodiments described herein may advantageously provide automatic KPI- d riven scaling of a service deployment to meet an SLA. Some embodiments may advantageously provide efficient construction of a search-space for resource allocation to achieve enhanced utilization. In particular, some embodiments may advantageously maximize or increase utilization while fulfilling an SLA. Some embodiments may advantageously reduce resource consumption by the service, resulting in lower operational expenses. Moreover, some embodiments may advantageously reduce the amount of tied resources that could otherwise be utilized by other services or network functions.

[0047] Figure 3 is a block diagram of a network node according to some embodiments. Various embodiments provide a network node 300 that includes a processor circuit 306 a communication interface 320 coupled to the processor circuit, and a memory 308 coupled to the processor circuit. The memory 308 includes machine-readable computer program instructions that, when executed by the processor circuit, cause the processor circuit to perform some of the operations depicted in Figures 6 to 9.

[0048] The network node 300 may be a core network node of a core network, such as a 5GC or EPC core network. As shown, network node 300 includes a communication interface 320 (also referred to as a network interface) configured to provide communications with other nodes (e.g., with other base stations and/or core network nodes) of a communication network. The network node 300 also includes a processor circuit 306 (also referred to as a processor) and a memory circuit 308 (also referred to as memory) coupled to the processor circuit 306. The memory circuit 308 may include computer readable program code that when executed by the processor circuit 306 causes the processor circuit to perform operations according to embodiments disclosed herein. According to other embodiments, processor circuit 306 may be defined to include memory so that a separate memory circuit is not required.

[0049] As discussed herein, operations of the network node 300 may be performed by processor 306 and/or communication interface 320. For example, the processor 306 may control the communication interface 320 to transmit communications through the communication interface 320 to one or more other network nodes and/or to receive communications through network interface from one or more other network nodes. Moreover, modules may be stored in memory 308, and these modules may provide instructions so that when instructions of a module are executed by processor 306, processor 306 performs respective operations (e.g., operations discussed herein with respect to example embodiments). In addition, a structure similar to that of Figure 3 may be used to implement other network nodes. Moreover, network nodes discussed herein may be implemented as virtual network nodes.

[0050] Accordingly, a network node 300 (or radio access network (RAN) node 300) according to some embodiments includes a processor circuit 306 and a memory 308 coupled to the processor circuit, the memory including machine readable program instructions that, when executed by the processor circuit, cause the network node to perform operations described herein.

[0051] Referring again to Figures 1 and 2, a set of nodes N1-N4 is illustrated. The first node N1 performs an operation of capturing a summary of records of the load/traffic of a service, such as a service operated in a core network of a wireless communication network. The summary of records may correspond to raw data traffic at a highest possible resolution or to an aggregated version of the data traffic. In the latter case, the aggregation level of the records can be chosen based on performance targets and computational limitations.

[0052] For example, network node N1 may provide a mobility management entity (MME) network function handling signaling traffic. Nodes N2, N3 and N4 can be implemented as different microservices that enhance existing 5G nodes. Node N2 could be implemented, for example, as additional functionality to the NWDAF, while nodes N3 and N4 could be additional functionalities implemented in the MDAF. In any implementation, node N1 provides node N2 with data relating to service traffic and load, node N2 trains a function, such as a machine learning function or deep learning function, to generate a forecast of a time series of the service traffic/load. Node N3 uses the traffic/load prediction to optimize infrastructure resources for meeting the SLA requirements and node N4 pushes the infrastructure changes back to node Nl. For example, in this example, the infrastructure supporting an MME application may be scaled by changing the amount of memory and/or CPU processing power that are allocated to the virtual machine (VM) on which the MME function is executed.

[0053] In another alternative shown in Figure 2, nodes Nl and N2 are implemented together inside the NWDAF, since the primary function of the NWDAF is to serve consumers with insights that augment and enhance packet core functionality, as well assist with management of experience assurance. In such a scenario, node Nl collects data from various core nodes. As in the system shown in Figure 1, node N2 serves node N3 with a load/traffic prediction, and node N3 uses the load/traffic prediction to optimize infrastructure resources to ensure that the service meets the SLA requirements. Node N4 pushes the change back to any core node affecting the SLA. [0054] Referring to Figure 4, the functionality of nodes N2 and N3 is illustrated in more detail. As shown therein, node N2 performs monitoring and forecasting based on traffic load data. The forecasted value is thereafter used by node N3 as the basis for optimizing resource allocation of the infrastructure according to constraints to meet the required SLA.

[0055] The historical data recorded by node N1 is used to construct a time-series. For each new observation obtained, a record is added to the historical time-series. A service load profile time series is shown in Figure 5. Using part of the historical data, e.g., the data that spans a given time window (similar to the shadowed part 502 in Figure 5), a one-step or a multi-step ahead forecast of the future expected service load is generated, using its 90th percentile or some other statistical measure, for the next (one or more) aggregation periods. In this example, the load can be expressed as a number of requests for a particular component or the number of bytes transmitted through the network.

[0056] The forecast stage described here corresponds to the second node (N2) of the system 100. Formally, the second node N2 (Figures 1 and 2) receives as input the historical load data from node N1 and outputs a forecasted load at a desired future moment. The load data may consist, for example, of a number of requests that arrive at the network service at a given moment in time or to a number of active users. In general, any kind of measurement that influences the service behavior can be used as the service load data.

[0057] For the forecast performed by node N2, any suitable forecasting algorithm can be used, from traditional time series forecasting methods, such as autoregressive integrated moving average (ARIMA) models, to recurrent neural networks (RNNs) or dilated convolutional neural networks (CNNs). In the times series shown in Figure 5, the region 502 represents a time window of historical data to be considered in the forecast, and the non- shaded region 504 represents the load that the service will observe in the future. Note that it is possible to use either all historical data to conduct a prediction or a designated part of it, such as, for example, the last two days of historical data.

[0058] Within node N3, some embodiments use supervised machine learning (SL) algorithms to build regression models for the system resources, e.g. f_c{), f_m() and f_n() for absolute/ raw CPU usage u_c, memory usage u_m, and network usage u_n, respectively. However, the systems/methods may use f_c(), f_m{) and f_n( ) to model CPU utilization v_c = u_c/u ^ax , memory utilization v_m = u_m/u™^ax, the network utilization v_n = u_n/u™^ax, where the quantities r_c, r_m and r_n denote the maximum allocation for the respective resource. For example, assume that a microservice is consuming 2.3 cores of CPU and 1GB of memory, while the maximum allocated resources are 4 cores and 2 GB, respectively. Then, the above metrics for absol ute usage are ( u_c , r_c, ) = (2.3, 4) cores, (it_m, r_m, ) = (1, 2) GB and the respective utilizations are v_c=2.3/4 = 57,5% and v_m=l/2 = 50%.

[0059] The input to the models f_c( j, f_m() and f_n() is the forecasted service load s_u as well as the maximum CPU r_c, memory r_m and network r_n resources that are allocated to the service. A Bayesian probabilistic model /_¾(x), is provided, where x = [s_j, r_c, r_m, r_n], for the KPI of interest that, in this example, is the latency u_t. The output of fi(x) is a probability distribution, which is the final output of the third node N3.

[0060] Since the third node N3 uses of a set of pre-trained machine learning models, it does not need to be placed in any specific physical location in the network. In practice, however, having it reside in the same physical location as the network service may help to decrease data transfer overhead between the node and a deployment database that provides the resource utilization data. Although not significant during the inference phase, this improvement may help in case the system includes mechanisms to trigger automatic re learning of the machine learning models.

[0061] Using the definitions introduced above, two approaches may be defined for optimizing resource usage and satisfying the SLA, described below, namely, sequential resource modeling and constrained optimization.

[0062] Sequential resource modeling

[0063] For sequential resource modeling, assume that the resource utilization, expressed as the fraction of resources used divided by the maximum allocated resources, should be in the interval

≤ r_h ≤ 1. For this description, these intervals are chosen here to be the same for all resources, but different bounds can be chosen per resource without loss of generality. In this approach, models are constructed for the absolute resource usage, i.e., the number of cores u_c, the memory capacity u_m, or the network usage u_n, as follows:

[0064] This order of modeling the KPIs is chosen assuming that the CPU usage u_c can be directly inferred from the load

more accurately than memory or network usage, and that a combination of load and CPU r_c captures memory better than network usage. Hence, by choosing this order, the resources that are easier to predict and less dependent on other resources are first modeled, and then the more complex ones are modeled. Note, however, that any other order of resource modeling would be possible, depending on how accurate the model is for the given combination input. The motivation for this successive model order is to reduce or minimize the error propagation in the steps described below.

[0065] The forecasted load s_t is used to predict the required amount of CPU û_c. An interval

[û_c/r_h, û_c/n] is created and N equidistant points

are defined in this interval. Each point û_c ^l is a candidate for the CPU allocation limit r_c. Hence, in the following analysis, û_c ^l is used as a proxy for r_c. This way, it is guaranteed that when the absolute CPU usage of the service is û_c, then the utilization v_c = u_c/û_c ^l will lie in [r_j,r_ft]

[0066] For each point û_c ^l, the absolute required memory û_m ^l = f_c(si,û_c ^l ) is computed. Again, for each û_m ^l , an interval [ûm/r_h, û_m ^l /r_j\ is created and N equidistant points are taken

in this interval. Similarly to the previous step, each point

is a candidate for the memory allocation limit r_m, thus

is used as a proxy for r_m.

[0067] After this step, N² tuples (û_c ^l,û^ are provided. Finally, for each pair (û_c ^l, û^l ), the absolute required network

are computed, an interval [ ] is created and N equidistant points {¾^J' }_k-1 are taken.

[0068] At the end of this process, the resulting N³ tuples

are evaluated using the function fi(x) to yield a latency distribution

= fa (û_c ^l, û^, ûn^l '^k, Si) . For each distribution û^'^k, the process checks to see if it satisfies the SLA, i.e., check the condition Prob[u_j' ^J'^fe ≤ b\ ³ a. For all index tuples (i^*,_/^*, fc^*) such that û\ ^{J ,k} satisfies the criterion and pick the one that maximizes the harmonic mean of the expected CPU utilization v_c ^l = û_c/û_c , the expected memory utilization

= û^/û^ and the expected network utilization

[0069] A different criterion can be chosen besides harmonic mean, but the harmonic mean favors those resource allocations that lead to approximately the same utilization for all resources.

[0070] To illustrate the procedure, a simple example is provided here:

[0071] Initialization: [r r_h] = [0.5, 0.8]

[0072] Step 1: Forecast load 100 req/sec

[0073] Step 2: Predict the CPU usage to obtain û_c = /_c(100) = 1 core needed to serve this load.

[0074] Create thee interval [— ,— ] = [1.25, 2.0] and take N = 10 points û_c ^l, e.g., 1.25, 1.5, ...,

L0.8 0.5 J

2.

[0075] Step 3: For each û_c ^l get required memory, e.g.,

= f_c{ 100, 1.25) = 1, . A⁰ = fc( 100, 2.0) = 2.

[0076] For û^, create interval [— ,— | = [1.25, 2.0] and take N = 10 points

L0.8 0.5J

e.g., 1.25,

1.5, 2...

[0077] For ύ^, create interval [— ,— 1 = [2.5, 4.0] and take N = 10 points

L0.8 0.5 J

e.g., 2.5,

2.75, ...,4.

[0078] Step 4: For each pair (û_c ^l ,û- ) compute the required network capacity,

= f_c( 100, 1.25, 1.25) = 100, ...,u '¹⁰ = f_c( 100, 2.0, 4.0) = 200.

[0079]

get interval ] = [125, 200], then take N = 10 points e.g., 125,

L 0.8 0.5 J

150, ..,200.

200 200

[0080] For M^'¹⁰, 6^et interval = [250,400] then take N = 10 points u 0. ^0,10,k, e.g., 8 0.5

250, ...,400.

[0081] Step 5: For each pair (û_c ^l , û^, Un '^k and the load s_t get latency distribution as follows: [0082] u '¹'¹ = fi( 100, 1.25, 1.25, 100), ...,u ⁰'¹⁰·¹⁰ = f_c{ 100, 2.0, 4.0, 400) [0083] Find the distributions û ^{J ,k} that meet the SLA Prob ^{J ,k} ≤ ή| > a

[0084] For each tuple,

, û_n ^{J ,k} compute the CPU utilization vf = û_c/ ¾^*, memory utilization

network utilization

[0085] Return the tuple that yields the best harmonic mean for

[0086] Note here that, for N = 10 points per interval, 10³ possible combinations are evaluated, covering a large part of the search space. The evaluation of the tuple can be executed in parallel and is an inexpensive computation.

[0087] It will be appreciated that a different number of points N_k may be chosen for each interval, and such points may not necessarily equidistant. Essentially, this approach yields a reduced search space in the grid of all possible resource allocations by considering only those vectors that will eventually lead to good resource provisioning.

[0088] If one wishes to reduce complexity, then a naive approach would be to construct the search space for the resources semi-manually, e.g., by considering a set of CPU values in the range [r_c/2,r_c\, a set of memory values in the range [r_m/2,r_m] and a set of network values in the range [r_n/2, r_n ]. Then, it would be possible to test all possible combinations of these points, yet in that case utilization is not taken into into account since all possible resources may be allocated.

[0089] One of the benefits of the approach described above in this section is that it may lead to an allocation (û_c, u_m, u_n) that is strictly lower than (r_c,r,r_n), and thus the remaining resources can be temporarily released to other services until they are asked back.

[0090] It will be further appreciated that this approach is not constrained in the choice of functions, and that decisions tree, neural networks or any other function that provides good fitting can be used.

[0091] Operations according to some embodiments are illustrated in Figures 6 to 8. Referring to Figure 6, A method for managing one or more of system resources for a system infrastructure that supports a network service in a service-based communication network includes receiving (602) a forecast of a service load, s_/, of the network service for a future time epoch; searching (604) a search space of resource allocation values for sets of resource allocation values that are predicted to meet a key performance indicator, KPI, metric of the network service, wherein the resource allocation values correspond to levels of system resources provided to the network service by the system infrastructure; selecting (606) a set of resource allocation values that meets a predetermined criterion for balancing resource utilization from among the sets of resource allocation values that meet the KPI metric of the network service; and configuring (608) the system infrastructure to provide system resources having the selected set of resource allocation values to the network service during the future time epoch. Selecting (606) the set of resource allocation values that meets the predetermined criterion for balancing resource utilization may include selecting a set of resource allocation values that optimizes a function of the utilization of each of the resources in the set. In particular, selecting (606) the set of resource allocation values that meets the predetermined criterion for balancing resource utilization may include selecting a set of resource allocation values that maximizes a function of the utilization of each of the resources in the set.

[0092] Referring to Figure 7, in some embodiments, searching the space of resource allocation values for sets of resource allocation values that are predicted to meet the KPI metric includes, for a system resource of the system infrastructure, generating (702) a predicted range of resource utilization values that are required to meet the forecast of the service load si of the network service; and identifying (704), from among the predicted ranges of resource utilization values, a plurality of sets of resource utilization values that meet the KPI metric of the network service.

[0093] Referring to Figure 8, in some embodiments, the method further includes selecting (802) a predetermined number of utilization values from the predicted range of resource utilization values; and combining (804) selected ones of the predicted utilization values of the plurality of system resources to form the sets of predicted utilization values, wherein the sets of predicted utilization values form the search space, wherein identifying the plurality of sets of system resource utilization values that meet the KPI metric of the network service includes identifying the plurality of sets of system resource utilization values that meet the KPI metric of the network service from the search space.

[0094] In some embodiments, configuring the system infrastructure to provide system resources having the selected set of resource allocation values to the network service includes transmitting the selected set of system resource utilization values to an actuation node that is configured to apply changes to the system infrastructure to provide the system resources having the selected set of system resource utilization values during the future time epoch.

[0095] In some embodiments, searching the search space of resource allocation values for sets of resource allocation values that are predicted to meet the KPI metric includes, for a set of resource allocation values, generating a prediction

of the KPI metric for the future epoch based on the forecasted system load s_/, and the set of resource allocation values.

[0096] In some embodiments, generating the predicted range of resource utilization values includes, for a first system resource, defining an interval [u/r_h, u/n] based on high maximum resource utilization value r_h and a low maximum resource utilization value n, where 0 ≤ n ≤ r_h ≤ 1 for the first system resource and a predicted absolute resource value û for the first system resource, and selecting a plurality of values û^l from within the interval.

[0097] In some embodiments, selecting the plurality of values û^l from within the interval includes selecting N equidistant points within the interval.

[0098] In some embodiments, the method further includes, for a second system resource, defining a plurality of second intervals [û/r _h, û/ n] for the second system resource and a predicted absolute resource value û for the second system resource, and selecting a plurality of values from within the second interval.

[0099] In some embodiments, the network service includes a communication network, wherein the service load comprises a number of requests per unit time, and the KPI includes network latency.

[0100] In some embodiments, the system infrastructure includes a distributed computing infrastructure, and the system resources comprise central processing unit, CPU, resources, memory resources and/or network resources. [0101] In some embodiments, the system resources include CPU utilization, and a predicted absolute CPI requirement û_c is generated a function of the forecast system load as û_c = fc(^sl), where f_c() is a regression model of the absolute CPU usage.

[0102]

[0103] In some embodiments, the system resources include memory usage, and a predicted absolute memory usage requirement û_m is generated as a function of the forecast system load s_i and a maximum CPU utilization r_c as û_m = f_m(^si_>r_c), where f_m() is a regression model of the absolute memory usage.

[0104] In some embodiments, the system resources include network usage, and a predicted absolute network usage requirement û_n is generated a function of the forecast system load s_i, and the maximum CPU utilization r_c, and a maximum memory usage r_m as û_n = fn(^sl ^rc ^rm)_, where f_n() is a regression model of the absolute network usage.

[0105] In some embodiments, selecting the set of resource allocation values that meets the predetermined criterion includes selecting the set of resource allocation values that maximizes a harmonic mean of expected resource utilization values.

[0106] In some embodiments, selecting the set of resource allocation values that meets the predetermined criterion includes selecting a set of predicted resource utilization values that maximizes a harmonic mean of the predicted resource utilization values.

[0107] Constrained optimization

[0108] For constrained optimization, assume again that the resource utilization should be in the interval [r_l r_h], where 0 ≤ r_l £ r_h ≤ 1. Without loss of generality, it is assumed that linear models (quadratic models are also possible) of the resource utilization, i.e., of the quantities v_c, v_m and v_n, can be constructed as follows:

[0109] Also, choose a linear model for fl = w^Tx with a Gaussian prior on w. After training, e.g., using Markov chain Monte Carlo (MCMC), the posterior distribution of w is obtained as w ~7\G(μ,S). Then, conditioned on the input vector x = [si,r_c,r_m,r_n], obtain a closed form solution for Prob [w^Tx £ b] ³ a. Capitalizing on the closed-form expression for the SLA, the following optimization problem is solved: maximize /_c(x) subject to /_n(x) e [r , r£] f_m(x) e [r,^m,r_¾ ^m]

[0110] This problem can be efficiently solved and gives us the optimal/maximum CPU utilization and admissible memory and network utilization such that the SLA is met. Of course, one can choose to maximize another resource utilization and place CPU utilization under the constraints.

[0111] In some implementations, one can implement more sophisticated ML algorithms to build models, such decision trees and neural networks, for the resource utilization v_c, v_m, v_n, and the latency u_t. These models can be used to evaluate the optimal solution x^* with greater accuracy and perhaps adjust it. For instance, after solving the optimization problem, the solution x^* can be evaluated using another model

and check if Probf j^x^*) £ b] ³ a. If the constraint is not met, then the resources can be adjusted by incrementing them and rechecking the condition. A similar assessment and adjustment can be done for the resource utilization models.

[0112] Accordingly, referring to Figure 9, in some embodiments, searching the search space of resource allocation values for sets of resource allocation values that are predicted to meet the KPI of the network service and selecting the set of resource allocation values that meets the predetermined criterion from among the sets of resource allocation values that meet the KPI metric of the network service includes generating (902) predicted resource allocation values v_c,v_m, v_n of the system resources according to the formulas:

where r_c, r_m and r_n are maximum resource utilizations; constructing (904) a linear model for fi = w^Tx with a Gaussian prior on w; obtaining (906) a posterior distribution of w as w~Jf (/!,Σ); determining (908) a probability Prob[w^r¾ £ b] ³ a where b is a value of the KPI metric and 0 is a threshold; and maximizing (910) f_c(x), subject to f_n(x) E [r \ r ], f_m(x) E

lower maximum resource utilization, r_h is an upper maximum resource utilization where 0 ≤ r_j ≤ r_h ≤ 1.

[0113] Explanations for abbreviations from the above disclosure are provided below.

Abbreviation Explanation

3GPP 3rd Generation Partnership Project

5G 5th Generation

5GC 5G Core

ANN Artificial Neural Network

ARIMA Autoregressive Integrated Moving Average

CNF Cloud Network Functions

CNN Convolutional Neural Network

DAF Network Data Analytics Function

EPC Evolved Packet Core

EPG Evolved Packet Gateway laaS Infrastructure-as-a-Service

KPI Key Performance Indicators

MME Mobile Management Entity

NN Neural Network

PCF Policy Control Function

RNN Recurrent Neural Network

SL Supervised Learning

SLA Service-Level Agreements

SDN Software Defined Networks

UDM User Data Management

VNF Virtual Network functions

VM Virtual Machine

[0114] References:

[1] A. Bilal, T. Tarik, A. Vajda, and B. Miloud, "Dynamic Cloud Resource Scheduling in Virtualized 5G Mobile Systems," in 2016 IEEE Global Communications Conference (GLOBECOM),

2016, pp. 1-6. [2] S. Dutta, T. Taleb, and A. Ksentini, "QoE-aware elasticity support in cloud-native 5G systems," in 2016 IEEE International Conference on Communications (ICC), 2016, pp. 1-6. [0115] Further definitions and embodiments are discussed below.

[0116] In the above-description of various embodiments of present inventive concepts, it is to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of present inventive concepts. Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which present inventive concepts belong. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of this specification and the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.

[0117] When an element is referred to as being "connected", "coupled", "responsive", or variants thereof to another element, it can be directly connected, coupled, or responsive to the other element or intervening elements may be present. In contrast, when an element is referred to as being "directly connected", "directly coupled", "directly responsive", or variants thereof to another element, there are no intervening elements present. Like numbers refer to like elements throughout. Furthermore, "coupled", "connected", "responsive", or variants thereof as used herein may include wirelessly coupled, connected, or responsive. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. Well-known functions or constructions may not be described in detail for brevity and/or clarity. The term "and/or" includes any and all combinations of one or more of the associated listed items.

[0118] It will be understood that although the terms first, second, third, etc. may be used herein to describe various elements/operations, these elements/operations should not be limited by these terms. These terms are only used to distinguish one element/operation from another element/operation. Thus, a first element/operation in some embodiments could be termed a second element/operation in other embodiments without departing from the teachings of present inventive concepts. The same reference numerals or the same reference designators denote the same or similar elements throughout the specification.

[0119] As used herein, the terms "comprise", "comprising", "comprises", "include", "including", "includes", "have", "has", "having", or variants thereof are open-ended, and include one or more stated features, integers, elements, steps, components, or functions but does not preclude the presence or addition of one or more other features, integers, elements, steps, components, functions, or groups thereof. Furthermore, as used herein, the common abbreviation "e.g.", which derives from the Latin phrase "exempli gratia," may be used to introduce or specify a general example or examples of a previously mentioned item, and is not intended to be limiting of such item. The common abbreviation "i.e.", which derives from the Latin phrase "id est," may be used to specify a particular item from a more general recitation. [0120] Example embodiments are described herein with reference to block diagrams and/or flowchart illustrations of computer-implemented methods, apparatus (systems and/or devices) and/or computer program products. It is understood that a block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by computer program instructions that are performed by one or more computer circuits. These computer program instructions may be provided to a processor circuit of a general purpose computer circuit, special purpose computer circuit, and/or other programmable data processing circuit to produce a machine, such that the instructions, which execute via the processor of the computer and/or other programmable data processing apparatus, transform and control transistors, values stored in memory locations, and other hardware components within such circuitry to implement the functions/acts specified in the block diagrams and/or flowchart block or blocks, and thereby create means (functionality) and/or structure for implementing the functions/acts specified in the block diagrams and/or flowchart block(s).

[0121] These computer program instructions may also be stored in a tangible computer- readable medium that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable medium produce an article of manufacture including instructions which implement the functions/acts specified in the block diagrams and/or flowchart block or blocks. Accordingly, embodiments of present inventive concepts may be embodied in hardware and/or in software (including firmware, resident software, micro-code, etc.) that runs on a processor such as a digital signal processor, which may collectively be referred to as "circuitry," "a module" or variants thereof.

[0122] It should also be noted that in some alternate implementations, the functions/acts noted in the blocks may occur out of the order noted in the flowcharts. For example, two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Moreover, the functionality of a given block of the flowcharts and/or block diagrams may be separated into multiple blocks and/or the functionality of two or more blocks of the flowcharts and/or block diagrams may be at least partially integrated. Finally, other blocks may be added/inserted between the blocks that are illustrated, and/or blocks/operations may be omitted without departing from the scope of inventive concepts. Moreover, although some of the diagrams include arrows on communication paths to show a primary direction of communication, it is to be understood that communication may occur in the opposite direction to the depicted arrows.

[0123] Many variations and modifications can be made to the embodiments without substantially departing from the principles of the present inventive concepts. All such variations and modifications are intended to be included herein within the scope of present inventive concepts. Accordingly, the above disclosed subject matter is to be considered illustrative, and not restrictive, and the examples of embodiments are intended to cover all such modifications, enhancements, and other embodiments, which fall within the spirit and scope of present inventive concepts. Thus, to the maximum extent allowed by law, the scope of present inventive concepts are to be determined by the broadest permissible interpretation of the present disclosure including the examples of embodiments and their equivalents, and shall not be restricted or limited by the foregoing detailed description.

[0124] Additional explanation is provided below. [0125] Generally, all terms used herein are to be interpreted according to their ordinary meaning in the relevant technical field, unless a different meaning is clearly given and/or is implied from the context in which it is used. All references to a/an/the element, apparatus, component, means, step, etc. are to be interpreted openly as referring to at least one instance of the element, apparatus, component, means, step, etc., unless explicitly stated otherwise.

The steps of any methods disclosed herein do not have to be performed in the exact order disclosed, unless a step is explicitly described as following or preceding another step and/or where it is implicit that a step must follow or precede another step. Any feature of any of the embodiments disclosed herein may be applied to any other embodiment, wherever appropriate. Likewise, any advantage of any of the embodiments may apply to any other embodiments, and vice versa. Other objectives, features and advantages of the enclosed embodiments will be apparent from the foregoing description.

[0126] Any appropriate steps, methods, features, functions, or benefits disclosed herein may be performed through one or more functional units or modules of one or more virtual apparatuses. Each virtual apparatus may comprise a number of these functional units. These functional units may be implemented via processing circuitry, which may include one or more microprocessor or microcontrollers, as well as other digital hardware, which may include digital signal processors (DSPs), special-purpose digital logic, and the like. The processing circuitry may be configured to execute program code stored in memory, which may include one or several types of memory such as read-only memory (ROM), random-access memory (RAM), cache memory, flash memory devices, optical storage devices, etc. Program code stored in memory includes program instructions for executing one or more telecommunications and/or data communications protocols as well as instructions for carrying out one or more of the techniques described herein. In some implementations, the processing circuitry may be used to cause the respective functional unit to perform corresponding functions according one or more embodiments.

[0127] The term unit may have conventional meaning in the field of electronics, electrical devices and/or electronic devices and may include, for example, electrical and/or electronic circuitry, devices, modules, processors, memories, logic solid state and/or discrete devices, computer programs or instructions for carrying out respective tasks, procedures, computations, outputs, and/or displaying functions, and so on, as such as those that are described herein.

Claims

CLAIMS:

1. A method for managing one or more of system resources for a system infrastructure that supports a network service in a service-based communication network, the method comprising: receiving (602) a forecast of a service load, s_/, of the network service for a future time epoch; searching (604) a search space of resource allocation values for sets of resource allocation values that are predicted to meet a key performance indicator, KPI, metric of the network service, wherein the resource allocation values correspond to levels of system resources provided to the network service by the system infrastructure; and selecting (606) a set of resource allocation values that meets a predetermined criterion for balancing resource utilization from among the sets of resource allocation values that meet the KPI metric of the network service; and configuring (608) the system infrastructure to provide system resources having the selected set of resource allocation values to the network service during the future time epoch.

2. The method of Claim 1, wherein searching the space of resource allocation values for sets of resource allocation values that are predicted to meet the KPI metric comprises: for a system resource of the system infrastructure, generating (702) a predicted range of resource utilization values that are required to meet the forecast of the service load s_/ of the network service; and identifying (704), from among the predicted ranges of resource utilization values, a plurality of sets of resource utilization values that meet the KPI metric of the network service.

3. The method of Claim 2, further comprising: selecting (802) a predetermined number of utilization values from the predicted range of resource utilization values; and combining (804) selected ones of the predicted utilization values of the plurality of system resources to form the sets of predicted utilization values, wherein the sets of predicted utilization values form the search space; wherein identifying the plurality of sets of system resource utilization values that meet the KPI metric of the network service comprises identifying the plurality of sets of system resource utilization values that meet the KPI metric of the network service from the search space.

4. The method of any previous Claim, wherein configuring the system infrastructure to provide system resources having the selected set of resource allocation values to the network service comprises: transmitting the selected set of system resource utilization values to an actuation node that is configured to apply changes to the system infrastructure to provide the system resources having the selected set of system resource utilization values during the future time epoch.

5. The method of any previous Claim, wherein searching the search space of resource allocation values for sets of resource allocation values that are predicted to meet the KPI metric comprises: for a set of resource allocation values, generating a prediction

6. The method of Claim 2, wherein generating the predicted range of resource utilization values, v_c, v_m, v_n, comprises, for a first system resource, defining an interval [u/r_h, fi/n] based on high maximum resource utilization value r_h and a low maximum resource utilization value n, where 0 ≤ n ≤ r_h ≤ 1 for the first system resource and a predicted absolute resource value û for the first system resource, and selecting a plurality of values û¹ from within the interval.

7. The method of Claim 6, wherein selecting the plurality of values û^l from within the interval comprises selecting N equidistant points within the interval.

8. The method of Claim 6, further comprising, for a second system resource, defining a plurality of second intervals [û/ r_h, w/n] for the second system resource and a predicted absolute resource value û for the second system resource, and selecting a plurality of values fi^l,i from within the second interval.

9. The method of any previous Claim, wherein the network service comprises a communication network, wherein the service load comprises a number of requests per unit time, and wherein the KPI comprises network latency.

10. The method of any previous Claim, wherein the system infrastructure comprises a distributed computing infrastructure, and wherein the system resources comprise central processing unit, CPU, resources, memory resources and/or network resources.

11. The method of Claim 10, wherein the system resources comprise CPU utilization, and wherein a predicted absolute CPI requirement û_c is generated a function of the forecast system load as û_c = f_c(si), where f_c() is a regression model of the absolute CPU usage.

12. The method of Claim 11, wherein the system resources comprise memory usage, and wherein a predicted absolute memory usage requirement û_m is generated as a function of the forecast system load si and a maximum CPU utilization r_c as û_m = /_m(s_å, r_c), where f_m() is a regression model of the absolute memory usage.

13. The method of Claim 12, wherein the system resources comprise network usage, and wherein a predicted absolute network usage requirement û_n is generated a function of the forecast system load si, and the maximum CPU utilization r_c, and a maximum memory usage r_m as û_n = /_n(s_j, r_c, r_m), where /_n is a regression model of the absolute network usage.

14. The method of any previous Claim, wherein selecting the set of resource allocation values that meets the predetermined criterion comprises selecting the set of resource allocation values that maximizes a harmonic mean of expected resource utilization values.

15. The method of Claim 2, wherein selecting the set of resource allocation values that meets the predetermined criterion comprises selecting a set of predicted resource utilization values, v_c, v_m, v_n, that maximizes a harmonic mean of the predicted resource utilization values.

16. The method of Claim 1, wherein searching the search space of resource allocation values for sets of resource allocation values that are predicted to meet the KPI of the network service and selecting the set of resource allocation values that meets the predetermined criterion from among the sets of resource allocation values that meet the KPI metric of the network service comprises: generating (902) predicted resource allocation values v_c, v_m,v_n of the system resources according to the formulas:

where r_c, r_m and r_n are maximum resource utilizations; constructing (904) a linear model for /) = w^Tx with a Gaussian prior on w obtaining (906) a posterior distribution of w as w~ (m,S); determining (908) a probability Prob [w^Tx £ b] ³ a where b is a value of the KPI metric and a is a threshold; and maximizing (910) f_c(x), subject to f_n(x) E [r , r ], f_m(x ) e [r^, r^] and b - m^tx ³

where x = [s_ur_c, r_m,r_n], r_t is a lower maximum resource utilization, r_h is an upper maximum resource utilization where 0 ≤ r_t ≤ r_h ≤ 1.

17. The method of any of Claims 1 to 16, wherein selecting (606) the set of resource allocation values that meets the predetermined criterion for balancing resource utilization comprises selecting a set of resource allocation values that optimizes a function of the utilization of each of the resources in the set.

18. A computer program comprising instructions which when executed on a computer perform any of the methods of Claims 1 to 17.

19. A computer program product comprising computer program, the computer program comprising instructions which when executed on a computer perform any of the methods of Claims 1 to 17.

20. A non-transitory computer readable medium storing instructions which when executed by a computer perform any of the methods of Claims 1 to 17.

21. A network node (300) comprising: processing circuitry (306); and a memory coupled to the processing circuitry, wherein the memory comprises computer readable instructions that, when executed by the processing circuitry, cause the network node to perform operations comprising: receiving (602) a forecast of a service load, s_/, of a network service for a future time epoch; searching (604) a search space of resource allocation values for sets of resource allocation values that are predicted to meet a key performance indicator, KPI, metric of the network service, wherein the resource allocation values correspond to levels of system resources provided to the network service by a system infrastructure; and selecting (606) a set of resource allocation values that meets a predetermined criterion for balancing resource utilization from among the sets of resource allocation values that meet the KPI metric of the network service; and configuring (608) the system infrastructure to provide system resources having the selected set of resource allocation values to the network service during the future time epoch.

22. A system (100) for managing one or more of system resources for a system infrastructure that supports a network service in a service-based communication network, the system comprising: a first node (Nl) that records service load data of a service load, s_/, of the network service; a second node (N2) that generates a forecast of the service load based on the recorded service load data; and a third node (N3) that receives (602) the forecast of the service load for the future time epoch, searches (604) a search space of resource allocation values for sets of resource allocation values that are predicted to meet a key performance indicator, KPI, metric of the network service, wherein the resource allocation values correspond to levels of system resources provided to the network service by the system infrastructure, selects (606) a set of resource allocation values that meets a predetermined criterion for balancing resource utilization from among the sets of resource allocation values that meet the KPI metric of the network service, and configures (608) the system infrastructure to provide system resources having the selected set of resource allocation values to the network service during the future time epoch.

23. The system of Claim 22, further comprising: a fourth node (N4) that applies changes to the system infrastructure based on the configured system resources.

24. The system of Claim 22 or 23, wherein the second node is part of a network data analytics function, NWDAF, of a core network of a wireless communication network, and wherein the third node is part of a management data analytics function, MDAF, of the core network.

25. The system of Claim 24, wherein the first node is part of the NWDAF of the core network.