WO2023147877A1 - Adaptive clustering of time series from geographic locations in a communication network - Google Patents

Info

Publication number
WO2023147877A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
time series
optimal
clusters
model
Application number
PCT/EP2022/052782
Other languages
English (en)
Inventor
Valentin Kulyk
Jalil TAGHIA
Selim ICKIN
Mats Folkesson
Jörgen GUSTAFSSON
Original Assignee
Telefonaktiebolaget Lm Ericsson (Publ)
Application filed by Telefonaktiebolaget Lm Ericsson (Publ)
Priority to PCT/EP2022/052782
Publication of WO2023147877A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/147 Network analysis or design for predicting network behaviour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/06 Generation of reports
    • H04L43/067 Generation of reports using time frame reporting
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W24/00 Supervisory, monitoring or testing arrangements
    • H04W24/02 Arrangements for optimising operational condition
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W24/00 Supervisory, monitoring or testing arrangements
    • H04W24/08 Testing, supervising or monitoring using real traffic
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H40/00 ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices
    • G16H40/20 ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices for the management or administration of healthcare resources or facilities, e.g. managing hospital staff or surgery rooms
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/142 Network analysis or design using statistical or mathematical methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/145 Network analysis or design involving simulating, designing, planning or modelling of a network

Definitions

  • the present disclosure relates generally to the field of analysis of time-series data collected from various geographic locations in a communication network, and more specifically to techniques for clustering data from a time series that has noisy, irregular, and/or missing values.
  • a time series is a sequence of data or information values, each of which has an associated time instance (e.g., when the data or information value was generated and/or collected).
  • the data or information can be anything measurable that depends on time in some way, such as prices, humidity, or number of people.
  • frequency is how often the data values of the data set are recorded. Frequency is also inversely related to the period (or duration) between successive data values.
  • Time series analysis includes techniques that attempt to understand or contextualize time series data, such as to make forecasts or predictions of future data (or events) using a model built from past time series data. To best facilitate such analysis, it is preferable that the time series consists of data values measured and/or recorded with a constant frequency or period.
  • Time series datasets can be collected from geographic locations, such as from nodes of a communication network located in one or more geographic areas (e.g., countries, regions, provinces, cities, etc.). For example, values of performance measurement counters can be collected from the various nodes at certain time intervals. Time series data collected in this manner can be used to analyze, predict, and/or understand user behavior patterns. Furthermore, such behavior patterns can be connected to and used for detection and/or prediction of behavior consequences, such as spread of infectious disease, admittance at hospitals, consumption of goods and/or services, etc.
  • time series clustering is an unsupervised data mining technique for organizing data points into groups (“clusters”) based on their similarity. Objectives include maximizing data similarity within clusters and minimizing data similarity across clusters.
  • Real world time series data - such as collected from a communication network - can be noisy and irregular.
  • a time series can have missing data values and even entire portions or parts in which all data values are missing.
  • there is a need for some preprocessing of real-world time series data, such as data imputation and/or denoising.
  • Data imputation involves replacing the missing values with values that are estimated and/or interpolated from other values that are present in the dataset.
  • Data imputation can introduce various biases since all imputation techniques necessarily rely on some assumption about the nature of the time series data. For example, data imputation may assume that the parts with all values missing preserve the properties already present in the available data.
  • time series data can grow over time, which increases the resources required for data storage and processing.
  • a growing dataset also creates uncertainty about what portion or amount of the dataset should be used for analysis, detection, and/or prediction.
  • the most suitable portion or amount of a time series can be dependent on time of prediction and/or prediction target of interest.
  • embodiments of the present disclosure address one or more of these and other problems, issues, and/or difficulties by providing techniques for adaptive clustering and cluster selection for a prediction target of interest in an unsupervised manner, without any assumptions on the timeseries properties and without the need for data imputation.
  • Some embodiments of the present disclosure include methods (e.g., procedures) for facilitating prediction of target information based on communications network performance management (PM) data.
  • These exemplary methods can include obtaining a time series of PM data representing performance of the communication network at a plurality of periodic time instances over a first duration. These exemplary methods can also include computing a plurality of frequency-domain representations of a corresponding plurality of different durations of the time series PM data, the different durations being less than or equal to the first duration. These exemplary methods can also include determining a corresponding plurality of clustering models for the plurality of frequency-domain representations. These exemplary methods can also include, based on target information and the time series PM data, determining an optimal one of the clustering models and an optimal number of clusters (Ns) of time series PM data associated with the optimal clustering model.
  • these exemplary methods can also include selecting one or more optimal combinations of clusters of the time series PM data associated with the optimal clustering model, for prediction of the target information.
  • the PM data is missing from a portion of the periodic time instances.
  • the clustering model for each frequency-domain representation is a hierarchical clustering (HC) model.
  • the HC model for each frequency-domain representation includes a plurality of levels, with each level associated with a different number of clusters.
  • determining the optimal clustering model and the optimal number of clusters (Ns) of time series PM data associated with the optimal clustering model includes the following operations for each particular HC model:
  • the determining operations can also include selecting the HC model having lowest calculated interaction information as the optimal HC model, with the optimal number of clusters (Ns) being the optimal number of clusters for the selected HC model.
  • the interaction information for each level of each HC model is calculated based on a joint entropy among the associated clusters and the target information.
  • selecting the one or more optimal combinations of clusters of the time series PM data associated with the optimal clustering model includes the following operations:
  • the plurality of probability metrics calculated for each global center combination include:
  • the mutual information for each global center is normalized by the entropy of the target information.
  • selecting the one or more optimal combinations of clusters of the time series PM data based on the plurality of probability metrics for the global center combinations includes the following operations:
  • selecting the one or more optimal combinations of clusters of the time series PM data based on the plurality of probability metrics for the global center combinations also includes selecting a second optimal combination of clusters that corresponds to a global center combination having a highest information gain.
  • the first and second optimal combinations of clusters are selected from among global centers having positive first interaction information.
  • the plurality of frequency-domain representations are computed based on a Lomb-Scargle periodogram. In some of these embodiments, computing the plurality of frequency-domain representations of the corresponding plurality of different durations of the time series PM data includes the following operations:
  • the time series PM data includes samples of a plurality of PM counters for each of a plurality of base stations at different locations in the communication network and for each of the plurality of periodic time instances over the first duration.
  • the time series PM data includes samples of key performance indicators (KPIs) for each of a plurality of network nodes or network functions (NFs) within the communication network and for each of the plurality of periodic time instances over the first duration.
  • Figures 1-3 illustrate various examples of time-domain signals and corresponding frequency-domain representations generated by a Fourier Transform (FT).
  • FIG. 4 shows a flow diagram of an adaptive hierarchical clustering (HC) technique according to various embodiments of the present disclosure.
  • Figure 5 shows a dendrogram that illustrates performance of certain HC embodiments disclosed herein.
  • Figure 6 shows a flow diagram of an exemplary optimal HC detector, according to various embodiments of the present disclosure.
  • Figure 7 shows a flow diagram of an exemplary optimal number of clusters detector, according to various embodiments of the present disclosure.
  • Figure 8 shows a flow diagram of an exemplary interaction information processor, according to various embodiments of the present disclosure.
  • Figure 9 shows a flow diagram of an exemplary periodogram generator, according to various embodiments of the present disclosure.
  • Figure 10 shows a flow diagram of an exemplary optimal cluster combination selector, according to various embodiments of the present disclosure.
  • Figures 11-12 show flow diagrams of exemplary usage scenarios for various embodiments of the present disclosure.
  • Figure 13 is a block diagram of an exemplary 5G network.
  • Figure 14 shows a flow diagram of another exemplary usage scenario for various embodiments of the present disclosure.
  • Figure 15 illustrates an exemplary method (e.g., procedure) performed by a computing apparatus, according to various embodiments of the present disclosure.
  • Figure 16 shows an exemplary communication network in which various embodiments of the present disclosure can be implemented.
  • Figure 17 shows an exemplary host computing system in which various embodiments of the present disclosure can be implemented.
  • Figure 18 shows an exemplary virtualization environment in which various embodiments of the present disclosure can be implemented.
  • Time series are important in real-world applications.
  • missing values are often found in time series.
  • the missing rate can reach 90%, which makes it difficult to utilize the data.
  • the missing values negatively impact downstream applications such as classification, regression, sequential data integration, forecasting, etc.
  • Some missing data approaches simplify the problem by discarding portions that include the missing values. If the samples with missing values differ systematically from the samples without missing data, this can bias the results from downstream applications.
  • Another approach is data imputation, which involves replacing the missing values with values that are estimated and/or interpolated from other values that are present in the dataset.
  • a variety of imputation approaches can be used that range from extremely simple to rather complex. These methods keep the full sample size, which can be advantageous for bias and precision; however, they can introduce other kinds of bias.
  • an easy imputation technique is to replace each missing value with the mean of the observed values for that variable.
  • this technique can severely distort the distribution for this variable, leading to complications such as underestimates of the standard deviation and distorting correlation relationships between variables.
  • Another imputation technique is to fit a regression to the observed values and then use that to predict the missing values.
  • the predicted values from the regression can be less variable than the observed values, e.g., a lower standard deviation.
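  • The distortion caused by mean imputation can be illustrated with a minimal sketch. The synthetic series, the 50% missing rate, and the helper name `mean_impute` are assumptions made for illustration only; they are not taken from the disclosure.

```python
import numpy as np

def mean_impute(values):
    """Replace NaNs with the mean of the observed (non-NaN) values."""
    values = np.asarray(values, dtype=float)
    filled = values.copy()
    filled[np.isnan(filled)] = np.nanmean(values)
    return filled

# Hypothetical data: a noisy series with roughly half of its samples missing.
rng = np.random.default_rng(0)
series = rng.normal(loc=10.0, scale=2.0, size=1000)
series[rng.random(series.size) < 0.5] = np.nan

observed_std = np.nanstd(series)            # spread of the observed samples only
imputed_std = np.std(mean_impute(series))   # spread after mean imputation
print(observed_std, imputed_std)
```

  • Every imputed sample sits exactly at the mean, so the standard deviation of the filled series is smaller than that of the observed samples, which is the underestimate described above.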
  • node can refer to any type of device or apparatus that can operate in and/or communicate via a wired or wireless network, including but not limited to access nodes (e.g., radio access nodes such as base stations), core network nodes, servers, gateways, user equipment (UEs), etc. Examples of various nodes are described below with reference to various figures.
  • embodiments of the present disclosure overcome these problems, issues, and/or difficulties by using a frequency-domain representation of the time series data.
  • An example frequency-domain representation is the Lomb-Scargle periodogram, which is an algorithm used to detect and characterize periodicity in unevenly sampled time series data, such as in astronomical data.
  • Lomb-Scargle representation facilitates avoiding imputation pre-processing of the time series data and its related disadvantages.
  • the Lomb-Scargle periodogram allows efficient computation of a Fourier-like power spectrum estimator from unevenly sampled data, resulting in an intuitive means of determining a period of oscillation.
  • a periodogram is based on a Fourier transform of a time-domain signal, specifically the squared amplitude that is often referred to as the power spectral density (PSD) or the power spectrum.
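  • P_S(f) can be computed as: $$P_S(f) = \frac{1}{N}\left|\sum_{n=1}^{N} g_n e^{-2\pi i f t_n}\right|^2 \quad (1)$$ where $g_n$ is the nth time-domain sample taken at time $t_n$ and $N$ is the number of samples.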
  • Real-world temporal measurements of a signal involve some finite span of time at some finite rate of sampling.
  • the resulting data can be described by a pointwise product of the true underlying continuous signal with a window function describing the observation. For example, a continuous signal measured over a finite duration is described by a rectangular window function spanning the duration of the observation, and a signal measured at regular intervals is described by a Dirac comb window function marking those measurement times.
  • Figure 1 shows an illustrative example of a Fourier transform using a rectangular window on a periodic signal.
  • the left column shows different time-domain signals and the right column shows their corresponding frequency-domain representations computed via Fourier Transform (FT).
  • the continuous-time periodic signal at the top is represented in the frequency domain as a series of delta functions at its composite frequencies.
  • the rectangular window is represented in the frequency domain as a sin(f)/f (or “sinc”) function.
  • the windowed periodic signal is represented in the frequency domain as the convolution of the delta functions at the top with the sinc(f) function in the middle.
  • FIG. 2 shows an illustrative example of a Gaussian signal sampled at a uniform interval or period.
  • the FT of the resulting sampled signal is the FT of the Gaussian signal (which is also Gaussian) convolved with the FT of the comb function (which is also a comb function).
  • Figure 3 shows an illustrative example of the effects of non-uniform sampling on the Gaussian window.
  • the window transform (middle) will generally not be a sequence of delta functions, because the symmetry present in the Dirac comb is broken by the uneven sampling such that the window FT will be much more “noisy.”
  • the average sampling rate in Figure 3 is the same as the uniform sampling rate in Figure 2.
  • irregular spacing of observations leads directly to nonstructured frequency peaks in the window transform.
  • This nonstructured window transform, when convolved with the FT of the true signal, results in an observed FT that reflects the irregular sampling.
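  • the Lomb-Scargle periodogram estimation is based on rewriting the periodogram representation in (1) as: $$P(f) = \frac{A^2}{2}\left(\sum_n g_n \cos(2\pi f[t_n-\tau])\right)^2 + \frac{B^2}{2}\left(\sum_n g_n \sin(2\pi f[t_n-\tau])\right)^2$$ where $A$, $B$, and $\tau$ are arbitrary functions of the frequency $f$ and observing times $\{t_n\}$ but not the observed values $\{g_n\}$.
  • Solutions for A and B that meet certain least-squares (LS) type criteria result in the following representation: $$P_{LS}(f) = \frac{1}{2}\left\{\frac{\left(\sum_n g_n \cos(2\pi f[t_n-\tau])\right)^2}{\sum_n \cos^2(2\pi f[t_n-\tau])} + \frac{\left(\sum_n g_n \sin(2\pi f[t_n-\tau])\right)^2}{\sum_n \sin^2(2\pi f[t_n-\tau])}\right\}$$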
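  • The least-squares representation can be evaluated directly on unevenly sampled data. The following is a minimal NumPy sketch, not the disclosed implementation. The function names, the test signal, and the frequency grid are assumptions for illustration, and the time offset uses the standard Lomb-Scargle choice tan(4πfτ) = Σ sin(4πf t_n) / Σ cos(4πf t_n), which is not spelled out in the text above.

```python
import numpy as np

def lomb_scargle(t, g, freqs):
    """Lomb-Scargle power of samples g taken at times t, at each frequency in freqs.

    freqs are ordinary (not angular) frequencies and must be non-zero.
    """
    t = np.asarray(t, dtype=float)
    g = np.asarray(g, dtype=float)
    power = np.empty(len(freqs))
    for k, f in enumerate(freqs):
        # Standard time offset tau that decouples the sine and cosine terms.
        tau = np.arctan2(np.sum(np.sin(4 * np.pi * f * t)),
                         np.sum(np.cos(4 * np.pi * f * t))) / (4 * np.pi * f)
        arg = 2 * np.pi * f * (t - tau)
        c, s = np.cos(arg), np.sin(arg)
        power[k] = 0.5 * (np.sum(g * c) ** 2 / np.sum(c ** 2)
                          + np.sum(g * s) ** 2 / np.sum(s ** 2))
    return power

def demo_peak_frequency(seed=0):
    """Hypothetical demo: irregularly sampled sinusoid at 0.1 cycles per time unit."""
    rng = np.random.default_rng(seed)
    t = np.sort(rng.uniform(0.0, 100.0, size=200))   # uneven sampling times
    g = np.sin(2 * np.pi * 0.1 * t)
    freqs = np.linspace(0.01, 0.5, 491)              # grid step of 0.001
    return freqs[np.argmax(lomb_scargle(t, g, freqs))]
```

  • Calling `demo_peak_frequency()` returns the grid frequency with the highest power, which for this noise-free signal lies at the true frequency even though the samples are unevenly spaced and no imputation was applied.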
  • Lomb-Scargle time series periodograms can be clustered in unsupervised manner using hierarchical clustering, e.g., based on Euclidean distance.
  • the optimal number of clusters can be estimated using a generalization of the mutual information for multiple distributions including the periodogram of the prediction target.
  • the generalization of the mutual information is also referred to as interaction information and is explained in more detail below.
  • the number of clusters reflects a relation between the obtained clusters and the prediction target. These operations are repeated on different sizes of the time series data, and the data size with the best cluster separation is chosen. A selection of the clusters related to the prediction target of interest is then performed, again using the interaction information, by testing the mutual information gain when the prediction target is added to different cluster combinations.
  • mutual information describes how much information a first random variable provides about a second random variable, e.g., the reduction in uncertainty about the second random variable given knowledge of the first random variable.
  • Mutual information is proportional to reduction in uncertainty, with zero mutual information indicating that the two random variables are independent.
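  • For discrete (or discretized) variables, mutual information can be computed from entropies as I(X;Y) = H(X) + H(Y) - H(X,Y). The sketch below is illustrative only; the function names are hypothetical, and the plug-in (frequency count) entropy estimator is an assumption rather than something the disclosure specifies. The normalized variant divides by the entropy of the target, as described for the prediction target elsewhere in this document.

```python
from collections import Counter

import numpy as np

def entropy(*columns):
    """Joint Shannon entropy (in bits) of one or more equally long discrete sequences."""
    counts = np.array(list(Counter(zip(*columns)).values()), dtype=float)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)

def normalized_mutual_information(x, y):
    """I(X;Y) divided by the entropy of the target y (0.0 if y is constant)."""
    h_y = entropy(y)
    return mutual_information(x, y) / h_y if h_y > 0 else 0.0
```

  • For example, `mutual_information([0, 0, 1, 1], [0, 1, 0, 1])` is zero because the two sequences are independent in this sample, while passing the same sequence twice returns its entropy.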
  • a selection of the cluster combination related to the prediction of the target of interest is performed using the interaction information.
  • An optimal number of clusters for a certain hierarchical model is not usually very large, nor is a list of all possible combinations of the clusters.
  • By testing interaction information for different combinations of clusters it is possible to detect the most suitable cluster combination explaining the current state of the prediction target (e.g., variable of interest).
  • the selected cluster combination can be used for the prediction.
  • embodiments provide as output the combination of clusters that can be used for subsequent analysis and modelling.
  • the Lomb-Scargle periodogram can be used to identify the dominant cyclical behavior in time series data, such as the dominant periods or frequencies.
  • One periodogram represents the frequency-domain characteristics of the entire timeseries.
  • wavelets can be used to generate both time- and frequency-domain representation of a time series.
  • Other techniques for conversion of a time series to frequency domain, time-frequency domain, or other time-domain representation are also possible in various embodiments.
  • Embodiments can provide various benefits and/or advantages when used with time series data having missing values.
  • One advantage is adaptivity to the time series data and prediction target of interest since embodiments do not rely on any assumptions on the time series data.
  • embodiments can be used on the real-time series data.
  • Embodiments produce connected time series clusters that describe a current state of a variable of interest and can be used as reliable inputs for root cause analysis and/or a prediction modelling pipeline.
  • FIG. 4 shows a flow diagram of an adaptive clustering technique according to various embodiments of the present disclosure. Certain aspects of the technique shown in Figure 4 are described below.
  • the time series data is processed in the frequency domain using Lomb-Scargle periodograms, which can accommodate unevenly sampled timeseries data without need for imputation.
  • Only the last part of the time series data is used for clustering analysis.
  • the last part of the time series data is segmented into data sets of different durations starting from the last timestamp in the data (block 410). For example, time series data sets with durations of 4 weeks, 8 weeks, 12 weeks, etc. can be created.
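  • The segmentation of block 410 can be sketched as a set of trailing windows that all end at the last timestamp. The function name, the use of days as the time unit, and the array layout are assumptions for illustration; missing samples are simply left as they are.

```python
import numpy as np

def trailing_segments(timestamps, values, durations):
    """Return {duration: (timestamps, values)} for windows ending at the last timestamp.

    timestamps and durations share one time unit (days in this sketch).
    """
    timestamps = np.asarray(timestamps, dtype=float)
    values = np.asarray(values, dtype=float)
    t_end = timestamps.max()
    segments = {}
    for d in durations:
        mask = timestamps > t_end - d      # keep only the last d time units
        segments[d] = (timestamps[mask], values[mask])
    return segments

# Hypothetical usage: 100 daily samples cut into 4-, 8- and 12-week windows.
days = np.arange(100.0)
samples = np.sin(2 * np.pi * days / 7.0)
windows = trailing_segments(days, samples, durations=[28, 56, 84])
```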
  • the data in these respective data sets are converted to periodogram sets by the periodogram generator (block 420).
  • the respective periodogram sets are clustered using agglomerative hierarchical clustering (HC, block 430).
  • the output of the agglomerative HC is a linkage matrix (also referred to as “HC model” or “dendrogram”) with distances and connections between all data in a data set.
  • the HC model facilitates selection of different numbers of clusters depending on distances between clustered items.
  • clustering involves dividing objects (or data sets) into clusters of data sets that are similar but also dissimilar to data sets belonging to other clusters.
  • Hierarchical clustering involves separating the data set into different groups of a hierarchy based on some measure of similarity.
  • Agglomerative hierarchical clustering is a bottom-up approach in which all data sets start as clusters and are merged based on distance (or similarity) between clusters until one large (or high-level) cluster is formed.
  • a dendrogram can be used to show a visualization of the cluster merging process.
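  • A minimal sketch of agglomerative HC using SciPy follows. The text specifies Euclidean distance as one option but does not name a linkage criterion, so the `average` linkage used here is an assumption, as are the function names and the synthetic input rows standing in for periodograms.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def build_hc_model(periodograms):
    """Agglomerative clustering of row vectors; returns a SciPy linkage matrix."""
    data = np.asarray(periodograms, dtype=float)
    return linkage(data, method="average", metric="euclidean")

def cut_into_clusters(hc_model, n_clusters):
    """Cut the hierarchy so that at most n_clusters flat clusters remain."""
    return fcluster(hc_model, t=n_clusters, criterion="maxclust")

# Hypothetical input: two well-separated groups of five 20-point "periodograms".
rng = np.random.default_rng(0)
rows = np.vstack([rng.normal(0.0, 0.1, size=(5, 20)),
                  rng.normal(10.0, 0.1, size=(5, 20))])
model = build_hc_model(rows)
labels = cut_into_clusters(model, 2)
```

  • The same linkage matrix can be cut at different levels, which is what allows different numbers of clusters to be evaluated without re-running the clustering.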
  • Clustering results are adapted to the prediction target by the Optimal HC Detector (block 440), which receives as input the HC models built in block 430, all time series data, and the prediction target (Y) time series. It identifies an optimal time series data size from the last part (i.e., end) of the data that is also relevant for the prediction target. This also indicates a point in time from which the processes causing the current state of the variable need to be analyzed and/or predicted. Also in block 440, the entire source time series data is analyzed against the prediction target (Y) in terms of multivariate interaction information. The outputs of block 440 are the HC model for the optimal time series data size and an optimal number of clusters for this HC model.
  • the Optimal Clusters Combination Selector detects the optimal combination of the clusters from the HC model delivered by the Optimal HC Detector (block 440). All possible cluster combinations including and excluding the prediction target are tested in terms of multivariate interaction information. The outputs of block 450 are the HC model and the optimal clusters combination relevant for further analysis or prediction of the prediction target.
  • Figure 5 shows a dendrogram that illustrates performance of certain embodiments of the HC technique disclosed herein.
  • Figure 5 shows distance between clusters on the vertical axis with specific distance values labelled on the graph.
  • the horizontal axis shows a sample index or cluster size.
  • MI_j = Int(C_j; Y) / H_Y, mutual information normalized by entropy of the prediction target;
  • E_mi = mean{MI_1, MI_2, ..., MI_n}, mean mutual information for combination of n clusters;
  • Figure 6 shows a flow diagram of an exemplary optimal HC detector, according to various embodiments of the present disclosure.
  • the exemplary optimal HC detector shown in Figure 6 can be used in block 440 of Figure 4.
  • the HC model having the lowest or minimum of the interaction information calculated in block 610 is selected as optimal.
  • the output of the Optimal HC clustering detector is the optimal model (HCs), optimal number of clusters for it (Ns), and data size used by it (S).
  • Figure 7 shows a flow diagram of an exemplary optimal number of clusters detector, according to various embodiments of the present disclosure.
  • the exemplary optimal number of clusters detector shown in Figure 7 can be used in block 610 of Figure 6.
  • a range of cluster numbers (N_1, N_2, N_3, ..., N_i) with increasing values is generated from the time series data and HC clustering model.
  • the Interaction Information Processor calculates the interaction information measures for the respective (N_1, N_2, N_3, ..., N_i) based on the prediction target Y.
  • the output is an ordered list of values {CYIM(N_1), CYIM(N_2), ..., CYIM(N_i)} corresponding to the order of the input (N_1, N_2, N_3, ..., N_i).
  • the first minimum in the ordered list is selected as the optimal number of clusters Ns.
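  • Selecting the first minimum can be sketched as a scan for the first local minimum of the ordered list. Interpreting "first minimum" as the first local minimum is an assumption, and the function names and example values are hypothetical.

```python
def first_minimum_index(values):
    """Index of the first local minimum in an ordered, non-empty list of numbers."""
    if not values:
        raise ValueError("values must not be empty")
    n = len(values)
    for i in range(n):
        lower_than_prev = i == 0 or values[i] < values[i - 1]
        not_above_next = i == n - 1 or values[i] <= values[i + 1]
        if lower_than_prev and not_above_next:
            return i
    return n - 1  # not reached: the global minimum always qualifies

def optimal_number_of_clusters(cluster_numbers, cyim_values):
    """Pick the cluster number whose interaction-information measure is the first minimum."""
    return cluster_numbers[first_minimum_index(cyim_values)]
```

  • With cluster numbers `[2, 3, 4, 5]` and measures `[0.5, 0.3, 0.4, 0.1]`, the function picks 3, the first point where the measure stops decreasing, rather than the global minimum at 5.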
  • Figure 8 shows a flow diagram of an exemplary interaction information processor, according to various embodiments of the present disclosure.
  • the exemplary interaction information processor shown in Figure 8 can be used in block 720 of Figure 7.
  • the interaction information processor calculates entropy-related measures for numbers of clusters obtained from the HC model. As input, it takes the HC model, the optimal number of clusters (Ns), the time series data, and the prediction target (Y). Given discrete random variables X_1, X_2, ..., X_n, interaction information is defined mathematically in terms of the joint entropies H(X_i : i ∈ T) of subsets T of the random variables. Thus, interaction information can be viewed as an alternating sum of joint entropies, with the sets of random variables used to compute the joint entropy in each term selected from the power set of available random variables.
  • Interaction Information can also be understood as a nonlinear generalization of correlation between any number of attributes.
  • a positive value of interaction information can indicate phenomena such as correlation or moderation, where one attribute affects the relationship of other attributes.
  • a negative value of interaction information can indicate mediation, where one or more of the attributes (at least partially) convey information already conveyed by other attributes.
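  • The alternating sum of joint entropies can be sketched as follows. The sign convention used here, I(X_1; ...; X_n) = -Σ_T (-1)^(n-|T|) H(X_i : i ∈ T) over non-empty subsets T, is an assumption: it is one common convention under which redundant variables give a negative value, and the exact formula of the disclosure is not reproduced in this text. Function names are hypothetical and a plug-in entropy estimate over discrete samples is assumed.

```python
from collections import Counter
from itertools import combinations
from math import log2

def joint_entropy(columns):
    """Joint Shannon entropy (bits) of equally long discrete sequences."""
    n_samples = len(columns[0])
    counts = Counter(zip(*columns))
    return -sum((c / n_samples) * log2(c / n_samples) for c in counts.values())

def interaction_information(variables):
    """Alternating sum of joint entropies over all non-empty subsets of variables."""
    n = len(variables)
    total = 0.0
    for size in range(1, n + 1):
        for subset in combinations(variables, size):
            total -= (-1) ** (n - size) * joint_entropy(subset)
    return total

# Hypothetical examples: z = x XOR y (synergy) versus three copies of one variable.
x = [0, 0, 1, 1]
y = [0, 1, 0, 1]
z = [0, 1, 1, 0]
synergy = interaction_information([x, y, z])      # positive under this convention
redundancy = interaction_information([y, y, y])   # negative under this convention
```

  • For two variables the expression reduces to ordinary mutual information, so the same function covers both the pairwise and the multivariate case.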
  • the time series data is separated into Ns clusters according to HC model.
  • C_j(t) = mean{X_1(t), X_2(t), ..., X_i(t)}.
  • the result is Ns time series {C_1, C_2, ..., C_Ns} representing the Ns clusters.
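  • A minimal sketch of computing one center time series per cluster as the pointwise mean of its member series. Missing samples are represented as NaN and skipped in the mean, which is an assumption about how gaps are handled; the function name and the array layout (one row per time series) are also hypothetical.

```python
import numpy as np

def global_centers(series, labels):
    """Return {cluster_id: center series}, the per-timestamp mean over cluster members.

    series: array of shape (n_series, n_times), NaN marks a missing sample.
    labels: cluster id of each row of series.
    """
    series = np.asarray(series, dtype=float)
    labels = np.asarray(labels)
    centers = {}
    for cluster_id in np.unique(labels):
        members = series[labels == cluster_id]
        observed = ~np.isnan(members)
        counts = observed.sum(axis=0)
        sums = np.where(observed, members, 0.0).sum(axis=0)
        # Timestamps with no observed member stay missing in the center.
        centers[int(cluster_id)] = np.where(counts > 0, sums / np.maximum(counts, 1), np.nan)
    return centers

# Hypothetical usage: two series in cluster 1 (with a shared gap), one in cluster 2.
centers = global_centers([[1.0, np.nan, 3.0],
                          [3.0, np.nan, 5.0],
                          [10.0, 10.0, 10.0]], labels=[1, 1, 2])
```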
  • the global centers time series {C_1, C_2, ..., C_Ns} and the prediction target time series are converted into periodograms by a periodogram generator, an example of which is described in more detail below.
  • the discretization procedure can use any approach that is appropriate for the properties or characteristics of the time series data.
  • CYIM = min{CYI_1, CYI_2, ..., CYI_i}, minimum of interaction information CYI_i calculated for the cluster combinations obtained for a certain N_i.
  • E_mj = mean{MI_1, MI_2, ..., MI_n}, mean mutual information for the j-th combination of n clusters.
  • Figure 9 shows a flow diagram of an exemplary periodogram generator, according to various embodiments of the present disclosure.
  • the exemplary periodogram generator shown in Figure 9 can be used in block 830 of Figure 8.
  • periodograms for time series in the dataset are calculated using the Lomb-Scargle technique discussed above, which is suitable for irregularly sampled time series and/or time series with missing values. As such, no data imputation is needed.
  • the periodograms are converted to a log-periodogram representation by taking natural logarithms.
  • a smoothing filter is applied to reduce high-frequency noise in the log-periodograms, which can be due to measurement noise. For example, a fifth-order Butterworth low-pass filter can be used, but a specific order can be selected based on the amount of noise that needs to be removed.
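  • The three steps of the periodogram generator can be sketched as below using only the Python standard library. The Lomb-Scargle part follows the classical formulation; the smoothing step uses a simple moving average purely as a stand-in for the Butterworth low-pass filter mentioned above, and all function names, the frequency grid, and the test signal are illustrative assumptions.

```python
import math


def lomb_scargle(times, values, angular_freqs):
    """Classical Lomb-Scargle power at each angular frequency (rad per time unit)."""
    mean = sum(values) / len(values)
    y = [v - mean for v in values]
    power = []
    for w in angular_freqs:
        # Time offset tau makes the sine and cosine terms orthogonal.
        tau = math.atan2(sum(math.sin(2 * w * t) for t in times),
                         sum(math.cos(2 * w * t) for t in times)) / (2 * w)
        c = [math.cos(w * (t - tau)) for t in times]
        s = [math.sin(w * (t - tau)) for t in times]
        yc = sum(a * b for a, b in zip(y, c))
        ys = sum(a * b for a, b in zip(y, s))
        cc = sum(a * a for a in c)
        ss = sum(a * a for a in s)
        p = (yc * yc / cc if cc > 1e-12 else 0.0) + (ys * ys / ss if ss > 1e-12 else 0.0)
        power.append(0.5 * p)
    return power


def log_periodogram(power, eps=1e-12):
    """Natural logarithm of the power; eps avoids log(0)."""
    return [math.log(p + eps) for p in power]


def moving_average(xs, width=3):
    """Stand-in smoother (not a Butterworth filter); output has the same length as xs."""
    half = width // 2
    out = []
    for i in range(len(xs)):
        window = xs[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out


# Irregular sampling: some time instances are missing and no imputation is done.
times = [t for t in range(40) if t % 7 != 3]
values = [math.sin(2 * math.pi * 0.1 * t) for t in times]
freqs = [k / 50 for k in range(1, 25)]  # cycles per sample
power = lomb_scargle(times, values, [2 * math.pi * f for f in freqs])
smoothed = moving_average(log_periodogram(power))
peak = freqs[power.index(max(power))]
```

  • For the noise-free test signal, the raw periodogram peaks at the true frequency of 0.1 cycles per sample even though several samples are missing.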
  • Figure 10 shows a flow diagram of an exemplary optimal cluster combination selector, according to various embodiments of the present disclosure.
  • the exemplary optimal cluster combination selector shown in Figure 10 can be used in block 450 of Figure 4.
  • the optimal cluster combination selector works with interaction information results that have positive information gain (IG) due to addition of the prediction target to a cluster combination.
  • Cluster combinations with negative interaction information without the prediction target are considered as having some shared information in terms of patterns in the time series data. These patterns are captured by the periodogram representation of the time series in the frequency domain, discussed above.
  • cluster combinations are sorted based on sign of IG, with cluster combinations having negative IG being discarded.
  • cluster combinations with positive IG are sorted or ordered based on the combined maximum of mean mutual information between the clusters of the combination (Emi) and the IG due to the prediction target. In some embodiments, the maximum of Emi has higher priority than the maximum of IG.
  • the top N (e.g., 10) of the sorted cluster combinations are selected.
  • the N cluster combinations are further sorted into two groups (S and D) having respectively negative and positive interaction information without the prediction target (CI).
  • the combinations in S having CI < 0 can be viewed as having some redundant information that is made less redundant by adding the prediction target Y.
  • the combinations in D having CI > 0 can be viewed as dissimilar clusters whose dissimilarity is increased by adding the prediction target.
  • the cluster combination Bs with the maximum of IG among the group S is selected as the optimal cluster combination.
  • the cluster combination Bd with the maximum of Emi (mean mutual information) among the group D is selected as an alternative optimal cluster combination.
  • two alternative optimal cluster combinations are provided for use by modelling and/or prediction algorithms, which can select between them according to various criteria.
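  • The selection steps above can be sketched as follows. This is a minimal illustration rather than a definitive implementation: the Combo record, its field names, and the example values are hypothetical, combinations with zero IG are dropped along with negative ones, and the sort gives Emi priority over IG as in the embodiments mentioned above.

```python
from dataclasses import dataclass


@dataclass
class Combo:
    clusters: tuple  # cluster indices in the combination
    ci: float        # interaction information without the prediction target
    ig: float        # information gain from adding the prediction target
    emi: float       # mean mutual information between the clusters


def select_optimal(combos, top_n=10):
    positive = [c for c in combos if c.ig > 0]  # discard non-positive IG
    ranked = sorted(positive, key=lambda c: (c.emi, c.ig), reverse=True)[:top_n]
    group_s = [c for c in ranked if c.ci < 0]   # shared (redundant) information
    group_d = [c for c in ranked if c.ci > 0]   # dissimilar clusters
    best_s = max(group_s, key=lambda c: c.ig, default=None)
    best_d = max(group_d, key=lambda c: c.emi, default=None)
    return best_s, best_d


combos = [
    Combo((0, 1), ci=-0.2, ig=0.5, emi=0.3),
    Combo((0, 2), ci=-0.1, ig=0.8, emi=0.1),
    Combo((1, 2), ci=0.4, ig=0.2, emi=0.6),
    Combo((2, 3), ci=0.3, ig=-0.1, emi=0.9),
]
best_s, best_d = select_optimal(combos)
```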
  • the cluster combinations Bs may be more appropriate for the Bayesian modelling using Hidden Markov Model (HMM) clustering.
  • Figure 11 shows a flow diagram for an exemplary usage scenario for various embodiments described above.
  • Figure 11 illustrates an extraction of spatiotemporal time series data that explains the prediction target, which is measured separate from the time series data.
  • the time series data is collected at different geographical locations (e.g., base stations, routers, network functions) within a wireless network (e.g., NG-RAN, 5GC).
  • a measurement of interest (the prediction target) relates to problems in the wireless network. Parts of the collected measurements that are not relevant for the prediction target should be excluded.
  • the measurement of interest can be used as data request input to the Spatial Explainer, which based on the request selects the relevant data from the time series “data lake” of measurements.
  • the Spatial Explainer selects a combination of clusters of time series data that explains the prediction target, such as by using techniques described above.
  • the Spatial Explainer provides relevant data that can be used by a forecast modelling pipeline and/or directly by a forecast model.
  • the forecast model can perform additional feature selection and tuning before training and generating a prediction within the desired forecast window.
  • Figure 12 shows a flow diagram for another exemplary usage scenario for various embodiments described above.
  • Figure 12 illustrates a prediction of patients admitted to a hospital in an epidemic or pandemic situation.
  • PM counters collected from base stations of the wireless network can be used for definition of thresholds for counts of active UEs in different geo locations.
  • Some exemplary PM counters include pmActiveUeDlSum, pmCellHoExeSuccLteInterF, pmCellHoExeSuccLteIntraF, pmActiveUeDlSum, pmActiveUeUlSum, pmActiveUeDlSum, and pmSessionTimeUe. These PM counters can also be combined in various functions, such as:
  • PM counters and/or functions can be combined to define a level of UE activity, e.g., pmActiveUeDlSum + HO.
  • the PM data is collected from base stations at different geo-locations, and the collected PM data time series are transformed into UE-activity time series data, which is input to the Spatial Explainer together with the admitted patient counts as the prediction target.
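  • The transformation of PM counter samples into a UE-activity value can be sketched as below. The expression for HO is not given here, so this sketch assumes that HO is the sum of the inter-frequency and intra-frequency handover success counters; the function name and the sample values are hypothetical.

```python
def ue_activity(sample):
    """sample: dict of PM counter name -> value for one base station and time instance."""
    # Assumption: HO is the sum of the inter- and intra-frequency handover success counters.
    ho = sample["pmCellHoExeSuccLteInterF"] + sample["pmCellHoExeSuccLteIntraF"]
    return sample["pmActiveUeDlSum"] + ho


sample = {
    "pmActiveUeDlSum": 120,
    "pmCellHoExeSuccLteInterF": 7,
    "pmCellHoExeSuccLteIntraF": 13,
}
activity = ue_activity(sample)
```

  • Applying the function to every base station and time instance yields one UE-activity time series per geo-location.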
  • the Spatial Explainer selects a combination of clusters of UE-activity time series data that explains the prediction target, such as by using techniques described above.
  • the selected clusters explaining the admitted patient counts are used by a forecast model to generate a prediction of admitted patients.
  • Embodiments of the present disclosure are also applicable to prediction of communication network key performance indicators (KPIs) for specific subsets of the network, such as per cell or per slice of network functionality, also referred to as “network slice”.
  • a network slice is a logical partition of a 5G network that provides specific network capabilities and characteristics, e.g., in support of a particular service.
  • a network slice instance is a set of network function (NF) instances and the required network resources (e.g., compute, storage, communication) that provide the capabilities and characteristics of the network slice.
  • the 5G System includes a Next-Generation Radio Access Network (NG-RAN) and a 5G Core Network (5GC).
  • the NG-RAN provides user equipment (UEs) with connectivity to the 5GC, e.g., via base stations such as gNBs or ng-eNBs described below.
  • the 5GC includes a variety of Network Functions (NFs) that provide a wide range of different functionalities such as session management, connection management, charging, authentication, etc.
  • Traditional peer-to-peer interfaces and protocols found in earlier-generation networks are modified and/or replaced by a Service Based Architecture (SBA) in which NFs provide one or more services to one or more service consumers.
  • the various services are self-contained functionalities that can be changed and modified in an isolated manner without affecting other services.
  • Figure 13 shows an exemplary reference architecture for a 5G network (1300) with service-based interfaces and various 3GPP-defined NFs, including the following:
  • Application Function (AF, with Naf interface)
  • An AF offers applications for which service is delivered in a different layer (i.e., transport layer) than the one in which the service has been requested (i.e., signaling layer), with the control of flow resources according to what has been negotiated with the network.
  • An AF communicates dynamic session information to PCF (via N5 interface), including description of media to be delivered by transport layer.
  • Policy Control Function (PCF, with Npcf interface) supports a unified policy framework to govern the network behavior, by providing PCC rules (e.g., on the treatment of each service data flow that is under PCC control) to the SMF via the N7 reference point.
  • PCF provides policy control decisions and flow based charging control, including service data flow detection, gating, QoS, and flow-based charging (except credit management) towards the SMF.
  • the PCF receives session and media related information from the AF and informs the AF of traffic (or user) plane events.
  • User Plane Function (UPF)
  • Session Management Function (SMF) interacts with the decoupled traffic (or user) plane, including creating, updating, and removing Protocol Data Unit (PDU) sessions and managing session context with the User Plane Function (UPF), e.g., for event reporting.
  • SMF performs data flow detection (based on filter definitions included in PCC rules), online and offline charging interactions, and policy enforcement.
  • Charging Function (CHF, with Nchf interface) is responsible for converged online charging and offline charging functionalities. It provides quota management (for online charging), re-authorization triggers, rating conditions, etc. and is notified about usage reports from the SMF. Quota management involves granting a specific number of units (e.g., bytes, seconds) for a service. CHF also interacts with billing systems.
  • Access and Mobility Management Function (AMF) terminates the RAN CP interface and handles all mobility and connection management of UEs (similar to MME in EPC).
  • AMFs communicate with UEs via the N1 reference point and with the RAN (e.g., NG-RAN) via the N2 reference point.
  • Network Exposure Function (NEF, with Nnef interface) acts as the entry point into the operator's network, by securely exposing to AFs the network capabilities and events provided by 3GPP NFs and by providing ways for the AF to securely provide information to the 3GPP network.
  • NEF provides a service that allows an AF to provision specific subscription data (e.g., expected UE behavior) for various UEs.
  • Network Repository Function (NRF)
  • Network Slice Selection Function (NSSF, with Nnssf interface) enables other NFs (e.g., AMF) to identify a network slice instance that is appropriate for a UE's desired service.
  • Authentication Server Function (AUSF), based in a user's home network (HPLMN)
  • Network Data Analytics Function (NWDAF, with Nnwdaf interface) provides network analytics information (e.g., statistical information of past events and/or predictive information) to other NFs on a network slice instance level.
  • the NWDAF can collect data from any 5GC NF.
  • Location Management Function (LMF, with Nlmf interface) supports various functions related to determination of UE locations, including location determination for a UE and obtaining any of the following: DL location measurements or a location estimate from the UE; UL location measurements from the NG-RAN; and non-UE associated assistance data from the NG-RAN.
  • 5GC control plane functions (e.g., AMF and SMF) can be implemented in a packet core controller (PCC), and user plane functions (e.g., UPF) in a packet core gateway (PCG).
  • the Unified Data Management (UDM) function supports generation of 3GPP authentication credentials, user identification handling, access authorization based on subscription data, and other subscriber-related functions. To provide this functionality, the UDM uses subscription data (including authentication data) stored in the 5GC unified data repository (UDR). In addition to the UDM, the UDR supports storage and retrieval of policy data by the PCF, as well as storage and retrieval of application data by NEF.
  • the NG-RAN can include one or more gNodeBs (gNBs) connected to the 5GC via one or more NG interfaces. More specifically, gNBs can be connected to one or more AMFs in the 5GC via respective NG-C interfaces and to one or more UPFs in the 5GC via respective NG-U interfaces. In addition, the gNBs can be connected to each other via one or more Xn interfaces.
  • the radio technology for the NG-RAN is often referred to as “New Radio” (NR). With respect to the NR interface to UEs, each of the gNBs can support frequency division duplexing (FDD), time division duplexing (TDD), or a combination thereof.
  • Each of the gNBs can serve a geographic coverage area including one or more cells and, in some cases, can also use various directional beams to provide coverage in the respective cells.
  • NG-RAN nodes such as gNBs can include a Central Unit (CU or gNB-CU) and one or more Distributed Units (DU or gNB-DU).
  • CUs are logical nodes that host higher-layer protocols and perform various gNB functions such as controlling the operation of DUs, which are decentralized logical nodes that host lower-layer protocols.
  • CUs and DUs can have different subsets of gNB functionality, depending on implementation.
  • Each CU and DU can include various circuitry needed to perform their respective functions, including processing circuitry, transceiver circuitry (e.g., for communication), and power supply circuitry.
  • a gNB-CU connects to one or more gNB-DUs over respective F1 logical interfaces.
  • a gNB-CU and connected gNB-DU(s) are only visible to other gNBs and the 5GC as a gNB, i.e., the F1 interface is not visible beyond the gNB-CU.
  • Figure 14 shows a flow diagram for another exemplary usage scenario for various embodiments described above.
  • Figure 14 illustrates a prediction of KPIs for specific subsets of a 5G network, such as per cell or per network slice.
  • KPI time series data needs to be predicted based on a large number of performance measurements (PM) or KPIs collected from the 5G network.
  • As an example, latency and throughput can be prediction targets in this scenario.
  • the time series PM data is collected from different network nodes or NFs within the 5G network.
  • for service assurance of a 5G slice, the data can be collected from gNBs, AMF and SMF in a PCC, and UPF in a PCG.
  • the prediction target KPI can be part of the collected PM data or measured independently (e.g., for E2E latency use case).
  • the collected time series PM data is first grouped by geo-location, with the grouped data being input to the Spatial Explainer together with the prediction target KPI.
  • the Spatial Explainer selects a combination of clusters of time series PM data that explains the prediction target, such as by using techniques described above.
  • the selected time series PM clusters explaining the prediction target are consumed by a forecast model to generate a prediction of the relevant KPIs.
  • Figure 15 depicts an exemplary method (e.g., procedure) for facilitating prediction of target information based on communication network performance management (PM) data, according to various embodiments of the present disclosure.
  • various features of the operations described below correspond to various embodiments described above.
  • although the exemplary method is illustrated in Figure 15 by specific blocks in a particular order, the operations corresponding to the blocks can be performed in a different order than shown and can be combined and/or divided into blocks having different functionality than shown.
  • Optional blocks and/or operations are indicated by dashed lines.
  • the exemplary method illustrated by Figure 15 can be performed by any appropriate computing apparatus that is configured to obtain the PM data and perform the operations, calculations, etc. comprising various embodiments of the exemplary method.
  • the computing apparatus can be a NF in the communication network (e.g., NWDAF), an application function (AF) associated with the communication network, or a cloud-based computing apparatus or system.
  • Other example computing apparatus are discussed below in relation to other figures.
  • the exemplary method can include the operations of block 1510, where the computing apparatus can obtain a time series of PM data representing performance of the communication network at a plurality of periodic time instances over a first duration.
  • the exemplary method can also include the operations of block 1520, where the computing apparatus can compute a plurality of frequency-domain representations of a corresponding plurality of different durations of the time series PM data, the different durations being less than or equal to the first duration.
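  • One way to obtain the different durations is to cut several windows out of the full time series before computing a frequency-domain representation for each. The sketch below assumes trailing windows that all end at the most recent sample and durations expressed in the same unit as the timestamps; both choices, and the function name, are assumptions made only for illustration.

```python
def trailing_windows(times, values, durations):
    """Return {duration: [(time, value), ...]} for windows ending at the last sample."""
    end = times[-1]
    windows = {}
    for d in durations:
        # Keep only samples that fall within the last d time units.
        windows[d] = [(t, v) for t, v in zip(times, values) if t > end - d]
    return windows


times = list(range(10))
values = [float(t) for t in times]
windows = trailing_windows(times, values, durations=[10, 5])
```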
  • the exemplary method can also include the operations of block 1530, where the computing apparatus can determine a corresponding plurality of clustering models for the plurality of frequency-domain representations.
  • the exemplary method can also include the operations of block 1540, where the computing apparatus can, based on target information and the time series PM data, determine an optimal one of the clustering models and an optimal number of clusters (Ns) of time series PM data associated with the optimal clustering model.
  • the exemplary method can also include the operations of block 1550, where the computing apparatus can select one or more optimal combinations of clusters of the time series PM data associated with the optimal clustering model, for prediction of the target information.
  • Figure 4 shows a high-level example of operations 1510-1550.
  • the PM data is missing from a portion of the periodic time instances.
  • the clustering model for each frequency-domain representation is a hierarchical clustering (HC) model.
  • the HC model for each frequency-domain representation includes a plurality of levels, with each level associated with a different number of clusters.
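  • A hierarchical clustering model whose levels correspond to different numbers of clusters can be sketched as follows. The linkage criterion and distance measure are not specified here, so this sketch assumes average linkage with Euclidean distance between feature vectors (e.g., smoothed log-periodograms); the function name and the example vectors are hypothetical.

```python
import math


def hierarchical_levels(vectors):
    """Average-linkage agglomerative clustering; returns {number_of_clusters: partition}."""
    clusters = [[i] for i in range(len(vectors))]
    levels = {len(clusters): [list(c) for c in clusters]}

    def linkage(a, b):
        # Mean pairwise Euclidean distance between members of two clusters.
        return sum(math.dist(vectors[i], vectors[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        pairs = [(linkage(clusters[p], clusters[q]), p, q)
                 for p in range(len(clusters)) for q in range(p + 1, len(clusters))]
        _, p, q = min(pairs)  # merge the two closest clusters
        merged = clusters[p] + clusters[q]
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)] + [merged]
        levels[len(clusters)] = [sorted(c) for c in clusters]
    return levels


vectors = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.3, 5.0]]
levels = hierarchical_levels(vectors)
```

  • Each key of the returned dictionary is one level of the model, so candidate numbers of clusters can be evaluated without re-running the clustering.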
  • Figure 5 shows an example of this arrangement.
  • determining the optimal clustering model and the optimal number of clusters (Ns) of time series PM data associated with the optimal clustering model in block 1540 includes the following operations for each particular HC model, labelled with corresponding sub-block numbers:
  • the determining operations of block 1540 also include the operations of sub-block 1543, where the computing apparatus selects the HC model having the lowest calculated interaction information as the optimal HC model, with the optimal number of clusters (Ns) being the optimal number of clusters for the selected HC model.
  • the interaction information for each level of each HC model is calculated (e.g., in sub-block 1541) based on a joint entropy among the associated clusters and the target information.
  • Figure 6 shows an example of the operations comprising block 1540.
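  • The selection of the optimal HC model and optimal number of clusters can be sketched as below. The per-model operations are not listed here, so this sketch assumes that, for each HC model, the level with the lowest interaction information gives that model's optimal number of clusters, and that the model with the lowest such value is then chosen. The model identifiers and numeric values are purely illustrative.

```python
def select_optimal_model(interaction_by_model):
    """interaction_by_model: {model_id: {number_of_clusters: interaction_information}}."""
    best = None
    for model_id, by_level in interaction_by_model.items():
        # Assumed per-model step: pick the level with the lowest interaction information.
        n_clusters, value = min(by_level.items(), key=lambda kv: kv[1])
        if best is None or value < best[2]:
            best = (model_id, n_clusters, value)
    return best


interaction_by_model = {
    "7d": {2: -0.1, 3: -0.4, 4: -0.2},
    "14d": {2: -0.3, 3: -0.6, 4: 0.1},
}
model_id, n_s, lowest = select_optimal_model(interaction_by_model)
```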
  • selecting the one or more optimal combinations of clusters of the time series PM data associated with the optimal clustering model in block 1550 includes the following operations, labelled with corresponding sub-block numbers:
  • Figure 8 shows an example of the operations comprising sub-blocks 1551-1554.
  • the plurality of probability metrics calculated for each global center combination include:
  • selecting the one or more optimal combinations of clusters of the time series PM data based on the plurality of probability metrics for the global center combinations in sub-block 1555 includes the following operations:
  • selecting the one or more optimal combinations of clusters of the time series PM data based on the plurality of probability metrics for the global center combinations in sub-block 1555 also includes selecting a second optimal combination of clusters that corresponds to a global center combination having a highest information gain.
  • the first and second optimal combinations of clusters are selected from among global centers having positive first interaction information.
  • Figure 10 shows an example of the operations of sub-block 1555.
  • the plurality of frequency-domain representations are computed based on a Lomb-Scargle periodogram.
  • computing the plurality of frequency-domain representations of the corresponding plurality of different durations of the time series PM data in block 1520 includes the following operations labelled with corresponding sub-block numbers:
  • Figure 9 shows an example of the operations of sub-blocks 1521-1523.
  • the target information is patients admitted to hospital and the time series PM data includes samples of a plurality of PM counters for each of a plurality of base stations at different locations in the communication network and for each of the plurality of periodic time instances over the first duration.
  • the plurality of PM counters can include any of the following: number of active users in uplink, number of active users in downlink, total number of handovers, and total duration of all UE sessions in an area during a time interval.
  • the plurality of PM counters can include any of the following: pmActiveUeDlSum, pmCellHoExeSuccLteInterF, pmCellHoExeSuccLteIntraF, pmActiveUeDlSum, pmActiveUeUlSum, pmActiveUeDlSum, and pmSessionTimeUe.
  • the target information is end-to-end (E2E) latency and/or E2E throughput for the communication network, and the time series PM data includes samples of key performance indicators (KPIs) for each of a plurality of network nodes or network functions (NFs) within the communication network and for each of the plurality of periodic time instances over the first duration.
  • obtaining the time series of PM data in block 1510 includes the operations of sub-block 1511, where the computing apparatus can group the time series PM data according to geo-location of the respective network nodes or NFs.
  • the frequency-domain representations are determined (e.g., in block 1520) based on the time series PM data grouped according to geo-location.
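  • Grouping the collected PM samples by geo-location can be sketched as follows. The record layout and the choice to aggregate co-located nodes by averaging are assumptions made only for illustration, as are the node and location names.

```python
from collections import defaultdict


def group_by_location(records):
    """records: iterable of (node_id, location, timestamp, value) tuples."""
    grouped = defaultdict(lambda: defaultdict(list))
    for node_id, location, timestamp, value in records:
        grouped[location][timestamp].append(value)
    # One series per location: mean over the nodes reporting at each time instance.
    return {loc: {t: sum(v) / len(v) for t, v in sorted(by_time.items())}
            for loc, by_time in grouped.items()}


records = [
    ("gnb1", "north", 0, 10.0),
    ("gnb2", "north", 0, 20.0),
    ("gnb1", "north", 1, 30.0),
    ("upf1", "south", 0, 5.0),
]
series_by_location = group_by_location(records)
```

  • Time instances with no reporting node are simply absent from a location's series, which suits a frequency-domain method that tolerates missing samples.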
  • Figure 16 shows an example of a communication system 1600 in accordance with some embodiments.
  • the communication system 1600 includes a telecommunication network 1602 that includes an access network 1604, such as a radio access network (RAN), and a core network 1606, which includes one or more core network nodes 1608.
  • the access network 1604 includes one or more access network nodes, such as network nodes 1610a and 1610b (one or more of which may be generally referred to as network nodes 1610), or any other similar 3GPP access node or non-3GPP access point.
  • the network nodes 1610 facilitate direct or indirect connection of user equipment (UE), such as by connecting UEs 1612a, 1612b, 1612c, and 1612d (one or more of which may be generally referred to as UEs 1612) to the core network 1606 over one or more wireless connections.
  • Example wireless communications over a wireless connection include transmitting and/or receiving wireless signals using electromagnetic waves, radio waves, infrared waves, and/or other types of signals suitable for conveying information without the use of wires, cables, or other material conductors.
  • the communication system 1600 may include any number of wired or wireless networks, network nodes, UEs, and/or any other components or systems that may facilitate or participate in the communication of data and/or signals whether via wired or wireless connections.
  • the communication system 1600 may include and/or interface with any type of communication, telecommunication, data, cellular, radio network, and/or other similar type of system.
  • the UEs 1612 may be any of a wide variety of communication devices, including wireless devices arranged, configured, and/or operable to communicate wirelessly with the network nodes 1610 and other communication devices.
  • the network nodes 1610 are arranged, capable, configured, and/or operable to communicate directly or indirectly with the UEs 1612 and/or with other network nodes or equipment in the telecommunication network 1602 to enable and/or provide network access, such as wireless network access, and/or to perform other functions, such as administration in the telecommunication network 1602.
  • the core network 1606 connects the network nodes 1610 to one or more hosts, such as host 1616. These connections may be direct or indirect via one or more intermediary networks or devices. In other examples, network nodes may be directly coupled to hosts.
  • the core network 1606 includes one or more core network nodes (e.g., core network node 1608) that are structured with hardware and software components. Features of these components may be substantially similar to those described with respect to the UEs, network nodes, and/or hosts, such that the descriptions thereof are generally applicable to the corresponding components of the core network node 1608.
  • Example core network nodes include functions of one or more of a Mobile Switching Center (MSC), Mobility Management Entity (MME), Home Subscriber Server (HSS), Access and Mobility Management Function (AMF), Session Management Function (SMF), Authentication Server Function (AUSF), Subscription Identifier De-concealing function (SIDF), Unified Data Management (UDM), Security Edge Protection Proxy (SEPP), Network Exposure Function (NEF), and/or a User Plane Function (UPF).
  • the host 1616 may be under the ownership or control of a service provider other than an operator or provider of the access network 1604 and/or the telecommunication network 1602, and may be operated by the service provider or on behalf of the service provider.
  • the host 1616 may host a variety of applications to provide one or more services. Examples of such applications include live and pre-recorded audio/video content, data collection services such as retrieving and compiling data on various ambient conditions detected by a plurality of UEs, analytics functionality, social media, functions for controlling or otherwise interacting with remote devices, functions for an alarm and surveillance center, or any other such function performed by a server.
  • host 1616 can perform operations corresponding to any of the exemplary methods or procedures described above in relation to Figures 4, 6-10, and 15.
  • host 1616 may be part of a cloud computing apparatus, system, or environment.
  • the communication system 1600 of Figure 16 enables connectivity between the UEs, network nodes, and hosts.
  • the communication system may be configured to operate according to predefined rules or procedures, such as specific standards that include, but are not limited to: Global System for Mobile Communications (GSM); Universal Mobile Telecommunications System (UMTS); Long Term Evolution (LTE), and/or other suitable 2G, 3G, 4G, 5G standards, or any applicable future generation standard (e.g., 6G); wireless local area network (WLAN) standards, such as the Institute of Electrical and Electronics Engineers (IEEE) 802.11 standards (WiFi); and/or any other appropriate wireless communication standard, such as the Worldwide Interoperability for Microwave Access (WiMax), Bluetooth, Z-Wave, Near Field Communication (NFC), ZigBee, LiFi, and/or any low-power wide-area network (LPWAN) standards such as LoRa and Sigfox.
  • the telecommunication network 1602 is a cellular network that implements 3GPP standardized features. Accordingly, the telecommunication network 1602 may support network slicing to provide different logical networks to different devices that are connected to the telecommunication network 1602. For example, the telecommunication network 1602 may provide Ultra Reliable Low Latency Communication (URLLC) services to some UEs, while providing Enhanced Mobile Broadband (eMBB) services to other UEs, and/or Massive Machine Type Communication (mMTC)/Massive IoT services to yet further UEs.
  • the UEs 1612 are configured to transmit and/or receive information without direct human interaction.
  • a UE may be designed to transmit information to the access network 1604 on a predetermined schedule, when triggered by an internal or external event, or in response to requests from the access network 1604.
  • a UE may be configured for operating in single- or multi-RAT or multi-standard mode.
  • a UE may operate with any one or combination of Wi-Fi, NR (New Radio) and LTE, i.e., being configured for multi-radio dual connectivity (MR-DC), such as E-UTRAN (Evolved-UMTS Terrestrial Radio Access Network) New Radio - Dual Connectivity (EN-DC).
  • the hub 1614 communicates with the access network 1604 to facilitate indirect communication between one or more UEs (e.g., UE 1612c and/or 1612d) and network nodes (e.g., network node 1610b).
  • the hub 1614 may be a controller, router, content source and analytics, or any of the other communication devices described herein regarding UEs.
  • the hub 1614 may be a broadband router enabling access to the core network 1606 for the UEs.
  • the hub 1614 may be a controller that sends commands or instructions to one or more actuators in the UEs.
  • Commands or instructions may be received from the UEs, network nodes 1610, or by executable code, script, process, or other instructions in the hub 1614.
  • the hub 1614 may be a data collector that acts as temporary storage for UE data and, in some embodiments, may perform analysis or other processing of the data.
  • the hub 1614 may be a content source. For example, for a UE that is a VR headset, display, loudspeaker or other media delivery device, the hub 1614 may retrieve VR assets, video, audio, or other media or data related to sensory information via a network node, which the hub 1614 then provides to the UE either directly, after performing local processing, and/or after adding additional local content.
  • the hub 1614 acts as a proxy server or orchestrator for the UEs, in particular if one or more of the UEs are low energy IoT devices.
  • the hub 1614 may have a constant/persistent or intermittent connection to the network node 1610b.
  • the hub 1614 may also allow for a different communication scheme and/or schedule between the hub 1614 and UEs (e.g., UE 1612c and/or 1612d), and between the hub 1614 and the core network 1606.
  • the hub 1614 is connected to the core network 1606 and/or one or more UEs via a wired connection.
  • the hub 1614 may be configured to connect to an M2M service provider over the access network 1604 and/or to another UE over a direct connection.
  • UEs may establish a wireless connection with the network nodes 1610 while still connected via the hub 1614 via a wired or wireless connection.
  • the hub 1614 may be a dedicated hub - that is, a hub whose primary function is to route communications to/from the UEs from/to the network node 1610b.
  • the hub 1614 may be a non-dedicated hub - that is, a device which is capable of operating to route communications between the UEs and network node 1610b, but which is additionally capable of operating as a communication start and/or end point for certain data channels.
  • Figure 17 is a block diagram of a host 1700, in accordance with various aspects described herein.
  • the host 1700 may be or comprise various combinations of hardware and/or software, including a standalone server, a blade server, a cloud-implemented server, a distributed server, a virtual machine, container, or processing resources in a server farm.
  • the host 1700 may provide one or more services to one or more UEs, to nodes or NFs of a communication network, and/or to a service provider.
  • the host may be part of a cloud computing apparatus, system, or environment.
  • the host 1700 includes processing circuitry 1702 that is operatively coupled via a bus 1704 to an input/output interface 1706, a network interface 1708, a power source 1710, and a memory 1712.
  • Other components may be included in other embodiments. Features of these components may be substantially similar to those described with respect to the devices of previous figures, such as Figures 15 and 16, such that the descriptions thereof are generally applicable to the corresponding components of host 1700.
  • the memory 1712 may include one or more computer programs including one or more host application programs 1714 and data 1716, which may include user data, e.g., data generated by a UE for the host 1700 or data generated by the host 1700 for a UE.
  • Embodiments of the host 1700 may utilize only a subset or all of the components shown.
  • the host application programs 1714 may be implemented in a container-based architecture.
  • the containerized host application programs may provide support for video codecs (e.g., Versatile Video Coding (VVC), High Efficiency Video Coding (HEVC), Advanced Video Coding (AVC), MPEG, VP9) and audio codecs (e.g., FLAC, Advanced Audio Coding (AAC), MPEG, G.711), including transcoding for multiple different classes, types, or implementations of UEs (e.g., handsets, desktop computers, wearable display systems, heads-up display systems).
  • the host application programs 1714 may also provide for user authentication and licensing checks and may periodically report health, routes, and content availability to a central node, such as a device in or on the edge of a core network.
  • the host 1700 may select and/or indicate a different host for over-the-top services for a UE.
  • the host application programs 1714 may support various protocols, such as the HTTP Live Streaming (HLS) protocol, Real-Time Messaging Protocol (RTMP), Real-Time Streaming Protocol (RTSP), Dynamic Adaptive Streaming over HTTP (MPEG-DASH), etc.
  • the containerized applications running in host 1700 can include one or more applications that include operations corresponding to any of the exemplary methods or procedures described above in relation to Figures 4, 6-10, and 15.
  • FIG. 18 is a block diagram illustrating a virtualization environment 1800 in which functions implemented by some embodiments may be virtualized.
  • virtualizing means creating virtual versions of apparatuses or devices, which may include virtualizing hardware platforms, storage devices and networking resources.
  • virtualization can be applied to any device described herein, or components thereof, and relates to an implementation in which at least a portion of the functionality is implemented as one or more virtual components.
  • Some or all of the functions described herein may be implemented as virtual components executed by one or more virtual machines (VMs) implemented in one or more virtual environments 1800 hosted by one or more hardware nodes, such as a hardware computing device that operates as a network node, UE, core network node, or host.
  • where the virtual node does not require radio connectivity (e.g., a core network node or host), the node may be entirely virtualized.
  • Applications 1802 (which may alternatively be called software instances, virtual appliances, network functions, virtual nodes, virtual network functions, etc.) are run in the virtualization environment 1800 to implement some of the features, functions, and/or benefits of some of the embodiments disclosed herein.
  • any of the exemplary methods or procedures described above in relation to Figures 4, 6-10, and 15 can be instantiated as an application 1802 running in virtualization environment 1800, such as in the form of an application function (AF) or a virtual network function (NF).
  • virtualization environment 1800 may be (or be part of) a cloud computing system or environment that hosts various applications, including but not limited to instantiations of the exemplary methods or procedures described herein.
  • Hardware 1804 includes processing circuitry, memory that stores software and/or instructions (designated 1805) executable by hardware processing circuitry, and/or other hardware devices as described herein, such as a network interface, input/output interface, and so forth.
  • Software may be executed by the processing circuitry to instantiate one or more virtualization layers 1806 (also referred to as hypervisors or virtual machine monitors (VMMs)), provide VMs 1808a and 1808b (one or more of which may be generally referred to as VMs 1808), and/or perform any of the functions, features and/or benefits described in relation to some embodiments described herein.
  • the virtualization layer 1806 may present a virtual operating platform that appears like networking hardware to the VMs 1808.
  • the VMs 1808 comprise virtual processing, virtual memory, virtual networking or interface and virtual storage, and may be run by a corresponding virtualization layer 1806.
  • Different embodiments of the instance of a virtual appliance 1802 may be implemented on one or more of VMs 1808, and the implementations may be made in different ways.
  • Virtualization of the hardware is in some contexts referred to as network function virtualization (NFV). NFV may be used to consolidate many network equipment types onto industry standard high volume server hardware, physical switches, and physical storage, which can be located in data centers, and customer premise equipment.
  • a VM 1808 may be a software implementation of a physical machine that runs programs as if they were executing on a physical, non-virtualized machine.
  • Each of the VMs 1808, and that part of hardware 1804 that executes that VM, be it hardware dedicated to that VM and/or hardware shared by that VM with others of the VMs, forms separate virtual network elements.
  • a virtual network function is responsible for handling specific network functions that run in one or more VMs 1808 on top of the hardware 1804 and corresponds to the application 1802.
  • Hardware 1804 may be implemented in a standalone network node with generic or specific components. Hardware 1804 may implement some functions via virtualization. Alternatively, hardware 1804 may be part of a larger cluster of hardware (e.g., in a data center or CPE) where many hardware nodes work together and are managed via management and orchestration 1810, which, among others, oversees lifecycle management of applications 1802.
  • hardware 1804 is coupled to one or more radio units that each include one or more transmitters and one or more receivers that may be coupled to one or more antennas. Radio units may communicate directly with other hardware nodes via one or more appropriate network interfaces and may be used in combination with the virtual components to provide a virtual node with radio capabilities, such as a radio access node or a base station.
  • some signaling can be provided with the use of a control system 1812 which may alternatively be used for communication between hardware nodes and radio units.
  • the term unit can have conventional meaning in the field of electronics, electrical devices and/or electronic devices and can include, for example, electrical and/or electronic circuitry, devices, modules, processors, memories, logic, solid state and/or discrete devices, computer programs or instructions for carrying out respective tasks, procedures, computations, outputs, and/or displaying functions, and so on, such as those that are described herein.
  • any appropriate steps, methods, features, functions, or benefits disclosed herein may be performed through one or more functional units or modules of one or more virtual apparatuses.
  • Each virtual apparatus may comprise a number of these functional units.
  • These functional units may be implemented via processing circuitry, which may include one or more microprocessors or microcontrollers, as well as other digital hardware, which may include Digital Signal Processors (DSPs), special-purpose digital logic, and the like.
  • the processing circuitry may be configured to execute program code stored in memory, which may include one or several types of memory such as Read Only Memory (ROM), Random Access Memory (RAM), cache memory, flash memory devices, optical storage devices, etc.
  • Program code stored in memory includes program instructions for executing one or more telecommunications and/or data communications protocols as well as instructions for carrying out one or more of the techniques described herein.
  • the processing circuitry may be used to cause the respective functional unit to perform corresponding functions according to one or more embodiments of the present disclosure.
  • device and/or apparatus can be represented by a semiconductor chip, a chipset, or a (hardware) module comprising such chip or chipset; this, however, does not exclude the possibility that a functionality of a device or apparatus, instead of being hardware implemented, be implemented as a software module such as a computer program or a computer program product comprising executable software code portions for execution or being run on a processor.
  • functionality of a device or apparatus can be implemented by any combination of hardware and software.
  • a device or apparatus can also be regarded as an assembly of multiple devices and/or apparatuses, whether functionally in cooperation with or independently of each other.
  • devices and apparatuses can be implemented in a distributed fashion throughout a system, so long as the functionality of the device or apparatus is preserved. Such and similar principles are considered as known to a skilled person.

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Tourism & Hospitality (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Physics & Mathematics (AREA)
  • Economics (AREA)
  • General Physics & Mathematics (AREA)
  • Marketing (AREA)
  • Development Economics (AREA)
  • Software Systems (AREA)
  • General Business, Economics & Management (AREA)
  • Computing Systems (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • Game Theory and Decision Science (AREA)
  • Medical Informatics (AREA)
  • Evolutionary Computation (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Data Mining & Analysis (AREA)
  • Operations Research (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Educational Administration (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Embodiments include methods for facilitating prediction of target information based on communication network performance management (PM) data. Such methods include obtaining time series PM data representing performance of the communication network at a plurality of periodic time instances over a first duration, and computing a plurality of frequency-domain representations of a corresponding plurality of different durations of the time series PM data. Such methods also include determining a corresponding plurality of clustering models for the plurality of frequency-domain representations and, based on target information and the time series PM data, determining an optimal clustering model among the clustering models and an optimal number of clusters (Ns) of time series PM data associated with the optimal clustering model. Such methods also include selecting one or more optimal combinations of clusters of the time series PM data associated with the optimal clustering model, for prediction of the target information.
PCT/EP2022/052782 2022-02-04 2022-02-04 Adaptive clustering of time series from geographic locations in a communication network WO2023147877A1 (fr)
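
The selection loop summarized in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation. The function names (`dft_magnitudes`, `kmeans`, `target_spread`, `select_model`), the use of k-means with farthest-point initialization, the scoring rule (within-cluster variance of the target plus a per-cluster `penalty`), and the one-target-value-per-location input layout are all assumptions made for illustration. The final step of the abstract, selecting optimal combinations of clusters, is not covered.

```python
import cmath
from statistics import pvariance


def dft_magnitudes(series):
    """Normalized DFT magnitudes of a real series (non-negative frequencies only)."""
    n = len(series)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(series))) / n
        for k in range(n // 2 + 1)
    ]


def sqdist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20):
    """Plain k-means; farthest-point initialization is an illustrative choice."""
    centers = [list(points[0])]
    while len(centers) < k:
        centers.append(list(max(points, key=lambda p: min(sqdist(p, c) for c in centers))))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sqdist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the previous center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def target_spread(labels, targets):
    """Size-weighted within-cluster variance of the target (hypothetical score)."""
    total = 0.0
    for c in set(labels):
        values = [t for lab, t in zip(labels, targets) if lab == c]
        total += len(values) * pvariance(values)
    return total / len(targets)


def select_model(series_by_location, targets, durations, cluster_counts, penalty=0.01):
    """Try every (duration, cluster count) pair and keep the lowest-scoring one.

    series_by_location: one PM time series per location (assumed layout).
    targets: one target value per location (assumed layout).
    """
    best = None
    for d in durations:
        # Frequency-domain representation of the most recent d samples.
        features = [dft_magnitudes(s[-d:]) for s in series_by_location]
        for k in cluster_counts:
            labels = kmeans(features, k)
            score = target_spread(labels, targets) + penalty * k
            if best is None or score < best[0]:
                best = (score, d, k, labels)
    return best
```

Calling `select_model(series, targets, durations=[8, 16], cluster_counts=[2, 3])` returns a tuple of the score, the chosen duration, the chosen number of clusters, and the per-location cluster labels. The `penalty` term only discourages the trivial solution in which more clusters always lower the within-cluster variance; any other model-selection criterion could be substituted.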

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/EP2022/052782 WO2023147877A1 (fr) 2022-02-04 2022-02-04 Adaptive clustering of time series from geographic locations in a communication network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/EP2022/052782 WO2023147877A1 (fr) 2022-02-04 2022-02-04 Adaptive clustering of time series from geographic locations in a communication network

Publications (1)

Publication Number Publication Date
WO2023147877A1 (fr) 2023-08-10

Family

ID=80628877

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2022/052782 WO2023147877A1 (fr) 2022-02-04 2022-02-04 Adaptive clustering of time series from geographic locations in a communication network

Country Status (1)

Country Link
WO (1) WO2023147877A1 (fr)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200294067A1 (en) * 2019-03-15 2020-09-17 Target Brands, Inc. Time series clustering analysis for forecasting demand
CN109767043B (zh) * 2019-01-17 2020-11-24 中南大学 Intelligent modeling and prediction method for electric load time series big data
US20210294818A1 (en) * 2020-03-19 2021-09-23 Cisco Technology, Inc. Extraction of prototypical trajectories for automatic classification of network kpi predictions

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
WARREN LIAO ET AL: "Clustering of time series data-a survey", PATTERN RECOGNITION, ELSEVIER, GB, vol. 38, no. 11, 1 November 2005 (2005-11-01), pages 1857 - 1874, XP027610890, ISSN: 0031-3203, [retrieved on 20051101] *
YANG YUN ET AL: "Adaptive Bi-Weighting Toward Automatic Initialization and Model Selection for HMM-Based Hybrid Meta-Clustering Ensembles", IEEE TRANSACTIONS ON CYBERNETICS, IEEE, PISCATAWAY, NJ, USA, vol. 49, no. 5, 1 May 2019 (2019-05-01), pages 1657 - 1668, XP011713108, ISSN: 2168-2267, [retrieved on 20190304], DOI: 10.1109/TCYB.2018.2809562 *

Similar Documents

Publication Publication Date Title
Nikravesh et al. Mobile network traffic prediction using MLP, MLPWD, and SVM
US20220141111A1 (en) Mobility network slice selection
US20210185676A1 (en) Dynamic carrier management and network slicing for internet of things (iot)
Pérez-Romero et al. Knowledge-based 5G radio access network planning and optimization
US10149238B2 (en) Facilitating intelligent radio access control
US9730098B2 (en) Knowledge discovery and data mining-assisted multi-radio access technology control
US20160261981A1 (en) Access to mobile location related information
US20150119020A1 (en) Facilitating adaptive key performance indicators in self-organizing networks
Rajagopal et al. FedSDM: Federated learning based smart decision making module for ECG data in IoT integrated Edge–Fog–Cloud computing environments
Samek et al. The convergence of machine learning and communications
CN112365366B (zh) Microgrid management method and system based on intelligent 5G slicing
WO2019206100A1 (fr) Method and apparatus for feature extraction programming
US20230325258A1 (en) Method and apparatus for autoscaling containers in a cloud-native core network
US20160345185A1 (en) Self organizing radio access network in a software defined networking environment
CN113379176A (zh) Telecommunication network abnormal data detection method, apparatus, device and readable storage medium
US11310125B2 (en) AI-enabled adaptive TCA thresholding for SLA assurance
KR20180130295A (ko) Apparatus and method for predicting failures in a communication network
CN111083710A (zh) Smart networking method for a 5G system
Zhohov et al. One step further: Tunable and explainable throughput prediction based on large-scale commercial networks
WO2023147877A1 (fr) Adaptive clustering of time series from geographic locations in a communication network
Jiewu et al. User traffic collection and prediction in cellular networks: Architecture, platform and case study
Shiomoto et al. A novel network traffic prediction method based on a Bayesian network model for establishing the relationship between traffic and population
Mastelic et al. Data velocity scaling via dynamic monitoring frequency on ultrascale infrastructures
US20210103830A1 (en) Machine learning based clustering and patterning system and method for network traffic data and its application
WO2023147871A1 (fr) Extraction of temporal patterns from data collected from a communication network

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22707624

Country of ref document: EP

Kind code of ref document: A1