WO2016182483A1

WO2016182483A1 - An arrangement and method performed therein for data analytics

Info

Publication number: WO2016182483A1
Application number: PCT/SE2015/050521
Authority: WO
Inventors: Subramanian Shivashankar; Karthikeyan Premkumar; Anand Varadarajan
Original assignee: Telefonaktiebolaget Lm Ericsson (Publ)
Priority date: 2015-05-11
Filing date: 2015-05-11
Publication date: 2016-11-17

Abstract

Embodiments herein relate to a method performed by an arrangement (10) for enabling a recommendation of a model to use in one or more analytics applications of a data source for a target domain. The arrangement (10): - extracts features from a first dataset based on a domain information, of the target domain, for one or more applications using the data source and/or other data sources, which features are application relevant features with history data for a sequence of Maximum Likelihood, ML, models; - models interactions of the one or more application, the data source and/or other data sources, and the sequence of ML models, by using the extracted features, and the history data of the sequence of ML models into an interaction model, and - provides the interaction model to a model selection process enabling a recommendation of a model to use for the one or more analytics applications.

Description

AN ARRANGEMENT AND METHOD PERFORMED THEREIN

FOR DATA ANALYTICS

TECHNICAL FIELD

Embodiments herein relate to an arrangement and a method performed therein. In particular, embodiments herein relate to enabling a recommendation of a model to use in one or more analytics applications of a data source that can be performed in a target domain.

BACKGROUND

In a typical communications network, communication devices, also known as mobile stations and/or user equipments (UEs), communicate via e.g. a Radio Access Network (RAN) to one or more core networks. The RAN covers a geographical area which is divided into cell areas, with each cell area being served by a Base Station (BS), e.g., a radio base station (RBS), which in some networks may also be called, for example, a "NodeB" or "eNodeB". A cell is a geographical area where radio coverage is provided by the radio base station at a base station site or an antenna site in case the antenna and the radio base station are not collocated. Each cell is identified by an identity within the local radio area, which is broadcast in the cell. Another identity identifying the cell uniquely in the whole mobile network is also broadcasted in the cell. One base station may have one or more cells. A cell may be downlink and/or uplink cell. The base stations communicate over the air interface operating on radio frequencies with the user equipments within range of the base stations.

A Universal Mobile Telecommunications System (UMTS) is a third generation mobile communication system, which evolved from the second generation (2G) Global System for Mobile Communications (GSM). The UMTS terrestrial radio access network (UTRAN) is essentially a RAN using wideband code division multiple access (WCDMA) and/or High Speed Packet Access (HSPA) for user equipments. In a forum known as the Third Generation Partnership Project (3GPP), telecommunications suppliers propose and agree upon standards for third generation networks and UTRAN specifically, and investigate enhanced data rate and radio capacity. In some versions of the RAN as e.g. in UMTS, several base stations may be connected, e.g., by landlines or microwave, to a controller node, such as a radio network controller (RNC) or a base station controller (BSC), which supervises and coordinates various activities of the plural base stations connected thereto. The RNCs are typically connected to one or more core networks.

Specifications for the Evolved Packet System (EPS) have been completed within the 3rd Generation Partnership Project (3GPP) and this work continues in the coming 3GPP releases. The EPS comprises the Evolved Universal Terrestrial Radio Access Network (E-UTRAN), also known as the Long Term Evolution (LTE) radio access, and the Evolved Packet Core (EPC), also known as System Architecture Evolution (SAE) core network. E-UTRAN/LTE is a variant of a 3GPP radio access technology wherein the radio base station nodes are directly connected to the EPC core network rather than to RNCs. In general, in E-UTRAN/LTE the functions of an RNC are distributed between the radio base station nodes, e.g. eNodeBs in LTE, and the core network. As such, the Radio Access Network (RAN) of an EPS has an essentially "flat" architecture comprising radio base station nodes without reporting to RNCs.

There exist common analytical problems across different parts of a telecommunication network that enables mobile telephony. Typical problems that can be considered are: in the core network, signaling/traffic load condition prediction is a key problem in re-routing / rebalancing of the traffic across Mobile Switching Centre (MSC) or Media Gateways (MGW) etc; and in e.g. a charging network, prediction of charging interrogation load condition is a challenge in Session Description Protocol (SDP) load balancing. Today these problems are solved individually by independent domain experts/researchers with the aid of analytical tools.

Analytics as a Service (AaaS) provides analytics on demand. Customers pay for the usage of various analytics solutions. For example, an operator can use a churn prediction model provided as part of AaaS, rather than building one themselves. A churn prediction model is a model that helps predicting which of e.g. a company's customers is going to leave a service or similar. Taking one step further, this churn prediction model may be seen as "Model as a Service". Here the users of AaaS may build their own business use-case, e.g. a churn prediction model, using Maximum Likelihood (ML) models that are provided from a central source. The ML models are models that select the set of values of the input parameters that maximizes the likelihood function. The ML models may be Naive Bayesian, Support Vector Machines (SVM), Decision Trees, etc. The ML models may also include pre-processing techniques such as Singular Value Decomposition (SVD), Principal Component Analysis (PCA), Random Projections, etc.

One of the critical challenges in analytics is to select a model. Today, application developers must be assisted when selecting a best model, e.g. an ML model, for the given dataset based on bias/variance [1]. Data scientists often pick an ML model based on characteristics of the data parameters or dataset that provide a desired level of accuracy. This knowledge of picking the right ML model for a given dataset for a specific type of problem can be realized as a Meta model to reduce the time and effort needed to choose an appropriate algorithm. A Meta model is a model of an actual model, i.e. a simplified model of the actual process of picking an ML model. This meta-model can be evolved as a semantic model for reasoning out the choice of a particular model selection.

Currently, researchers focus on building models, or ML models, for each application, such as churning or upselling, separately. With the growth of computational facilities provided by e.g. cloud networks and similar, an organization may leverage these facilities to cater to its customers/subscribers in various ways on demand.

Researchers have proposed a number of model selection processes for machine learning methods such as SVM, K-Means, and for specific ML applications such as document clustering [2, 3]. The implementations are mostly on tuning parameters for a particular ML model. Other generic methods, such as meta-learning and similar, project the data source to a meta feature space, and do recommendation of ML models based on history/benchmark data [4]. These present techniques are not optimal and do not always result in a recommendation of an optimal model when using different inputs.

SUMMARY

An object of embodiments herein is to provide a mechanism that improves recommendation of a model to use in one or more analytics applications.

The object is achieved by providing a method, performed by an arrangement in a communication network, for enabling a recommendation of a model to use in one or more analytics applications of a data source for a target domain. The arrangement extracts features from a first dataset based on a domain information, of the target domain, for one or more applications using the data source and/or other data sources, which features are application relevant features with history data for a sequence of Maximum Likelihood, ML, models. The arrangement models interactions of the one or more application, the data source and/or other data sources, and the sequence of ML models, by using the extracted features, and the history data of the sequence of ML models into an interaction model. Furthermore, the arrangement provides the interaction model to a model selection process enabling a recommendation of a model to use for the one or more analytics applications.

The object is achieved by providing an arrangement for enabling a recommendation of a model to use in one or more analytics applications of a data source for a target domain. The arrangement is configured to extract features from a first dataset based on a domain information, of the target domain, for one or more applications using the data source and/or other data sources, which features are application relevant features with history data for a sequence of Maximum Likelihood, ML, models. The arrangement is configured to model interactions of the one or more application, the data source and/or other data sources, and the sequence of ML models, by using the extracted features, and the history data of the sequence of ML models into an interaction model. The arrangement is further configured to provide the interaction model to a model selection process enabling a recommendation of a model to use for the one or more analytics applications.

Embodiments herein also provide a computer program, comprising instructions, which, when executed on at least one processor, cause the at least one processor to carry out the methods herein. Furthermore, embodiments herein provide a computer-readable storage medium having stored thereon a computer program, comprising instructions which, when being executed on at least one processor, cause the at least one processor to carry out the methods herein.

Embodiments herein model or characterize a 3-D space of information, i.e. space for application, data source and ML model, for a target domain, enabling a recommendation of a model selection for an analytics application. Embodiments herein result in a more optimal recommendation of a model to use in one or more analytics applications. The system can also be adapted to a number of domains in a similar fashion.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments will now be described in more detail in relation to the enclosed drawings, in which:

Figure 1 shows an overview depicting a communication network according to embodiments herein;

Figure 2 shows an overview depicting a communication network according to embodiments herein;

Figure 3 shows a schematic flowchart depicting a method performed by an arrangement according to embodiments herein; and

Figure 4 shows a block diagram depicting an arrangement according to embodiments herein.

DETAILED DESCRIPTION

Embodiments herein relate to communication networks in general. Fig. 1 is a schematic overview depicting a communication network 1. The communication network 1 comprises one or more RANs and one or more CNs. The RANs may use a number of different technologies, such as Wi-Fi, Long Term Evolution (LTE), LTE-Advanced, Wideband Code Division Multiple Access (WCDMA), Global System for Mobile communications/Enhanced Data rate for GSM Evolution (GSM/EDGE), Worldwide Interoperability for Microwave Access (WiMax), or Ultra Mobile Broadband (UMB), just to mention a few possible implementations.

In the communication network 1, an arrangement 10 is provided for enabling a recommendation of a model to use in one or more analytics applications of a data source for a target domain. The arrangement 10 may be comprised in a communication node, such as a computer node, a server, a core network node, a radio network node or similar. A data source may be Call Detail Records (CDR), other detail records (xDR), Deep-Dives (DD), Node Key Performance Indicators (KPI) etc. CDR data may comprise data records that contain information about each call that was processed by a CallManager, and Call Management Records (CMR), i.e. data records that contain Quality of Service (QoS) or diagnostic information about the call, also referred to as diagnostic records. The target domain may be a telecommunication domain, a social network domain or similar. Node health records such as KPIs are generated periodically in each network element. These records are mined for various applications such as alarm prediction and detection. This is applicable to the telecommunication domain, computing devices network domain, data centres and so on. Another example is mining customer service records or logs to identify frequent issues and solutions, e.g. quick fixes, to handle those. This is useful in any customer oriented software/system. In examples herein, the target domain is the telecommunication domain. Examples of the analytics application may be churn prediction, upselling recommendation, Mean Opinion Score (MoS), Quality of Experience (QoE) and similar.

As mentioned above, typical problems that can be considered in a communication network are: in the core network, signaling/traffic load condition prediction is a key problem in re-routing / rebalancing of the traffic across Mobile Switching Centre (MSC) or Media Gateways (MGW) etc; and in e.g. a charging network, prediction of charging interrogation load condition is a challenge in Session Description Protocol (SDP) load balancing. These problems may be considered analogous, and even though the KPIs that characterize the load conditions of these two problems are different, from an analytics perspective they can be solved using the same or similar algorithms, e.g. the same application may be used for different data sources. There can be other possible scenarios as well, as mentioned below. Embodiments herein provide an analytical solution that addresses these cross-domain analogous problems, which could save considerable time and effort compared with solving them in isolation.

The arrangement 10 may comprise a first part 11 configured to extract or collect features, e.g. meta features, that are built on top of usual features such as average speed per month, gender and occupation, of one or more data sources, e.g. CDR and xDR, based on a domain information, of the target domain, for one or more applications using the one or more data sources. The one or more application may be any application e.g. a churn prediction or a different application. The features are application relevant features with history data for a sequence of ML models. Using a centralized setup such as Multicast Source Discovery Protocol (MSDP) the usage/logs may be monitored, which usage/logs have history information on data sets used, applications built, models used and performances achieved. This history information is referred to as history data. The ML models may be Boosted trees, Decision Trees (DT), Random Forests (RF), Support Vector Machines (SVM), and Gaussian Processes (GP).

The arrangement 10 may further comprise a second part 12 being configured to perform a modelling operation. The second part 12 models interactions of the analytics application, the Data source and/or other data sources and the sequence of ML models, by using the extracted features, and the history data of the sequence of ML models into an interaction model.

The arrangement may further comprise a third part 13 being configured to perform a model selection process using the interaction model provided from the second part 12. From this selection process the third part 13 may recommend a model to use for the analytics application. It should be noted that the third part 13 may be comprised in the same communication node as the first part 11 and second part 12 or the third part 13 may be comprised in a communication node separated from the second part 12.

Embodiments herein model interactions between data sources, applications and ML models, and utilize the domain information and other available background information, i.e. history data. Embodiments herein deal with one of the important challenges needed to realize analytic applications as a service, and that is to recommend a model during development of an analytics application. The proposed method leverages history data and aids in model selection. The intuition is to extract features, e.g. meta-information, from each data source encountered before and also for a new data source with similar meta-information, and suggest a best model from past experience, thus choosing a model, e.g. an ML model, for a data source and application requirement.

Figure 2 is a block diagram depicting a solution according to embodiments herein. As an example, the arrangement 10 comprises a collecting part 21 gathering data regarding usage experiences in the past, also referred to as history data or information, from a history part 22. The collecting part 21 may further gather the features, e.g. data application specific meta-information, from a database 23 with domain information. The data regarding usage experiences in the past is taken into account when establishing/selecting the features at the collecting part 21. As stated above, the first part 11 extracts the features e.g. from the collecting part 21. Thus, the first part 11 stores the gathered features and model configurations, i.e., parameters used for ML model. The second part 12 of the arrangement 10 gathers the features and model configurations from the first part 11, and also usage experiences in the past from the history part 22. The second part 12 performs a modelling that takes the features of the application/s and data source/s from the first part 11 as well as the history data of the sequence of ML models from the history part. The outcome of the modelling, being an interaction model, is fed to the third part 13 of the arrangement 10. The third part 13 then performs a model selection based on the interaction model resulting in a recommendation of a model to use.

Embodiments herein may be implemented in a recommendation engine on a managed services delivery platform in a telecommunication domain, which recommendation engine stores multiple data sources, e.g. CDR and xDR, within the telecommunication domain and caters those for various use cases or analytics applications. It is also possible to access external data sources, such as social media, weather data, etc., using third party applications from the same platform.

For example, in a solution of the telecommunication domain, in the database 23 a list of features is maintained that would characterize each data source, either by using domain experts or by mining the knowledge repositories such as research articles, web, and so on. As an example, for call graph data, in the telecommunication domain, In-degree, Out-degree, Total degree, Eigen Vector Centrality, Page Rank, Between-ness Centrality, Closeness Centrality, Shapley value, usage pattern such as mean, median, standard deviation, location visited, origin, destination could be the list of features.
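As an illustration of the simplest call-graph features listed above, the following minimal sketch computes in-degree, out-degree and total degree per node. The function name `degree_features` and the representation of calls as (caller, callee) pairs are assumptions made for this example and are not taken from the embodiments.

```python
from collections import Counter


def degree_features(calls):
    """Return {node: (in_degree, out_degree, total_degree)} for a directed call graph.

    `calls` is an iterable of (caller, callee) pairs (a hypothetical input format).
    Repeated calls between the same pair are counted as one edge.
    """
    edges = set(calls)  # collapse repeated calls between the same pair
    out_deg = Counter(src for src, _ in edges)
    in_deg = Counter(dst for _, dst in edges)
    nodes = set(out_deg) | set(in_deg)
    # Counter returns 0 for nodes that never appear on one side of an edge
    return {n: (in_deg[n], out_deg[n], in_deg[n] + out_deg[n]) for n in nodes}


calls = [("a", "b"), ("a", "c"), ("b", "c"), ("a", "b")]
features = degree_features(calls)
```

The centrality measures and the Shapley value in the list would need a graph library or a dedicated implementation and are left out of this sketch.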

History data is logged on the history part 22. The history data is based on the performance of models that various users previously built for an application using the available data sources, which performance may be measured based on a target metric, e.g. root-mean-square error (RMSE), log likelihood, Sum of Squared Errors (SSE), receiver operating characteristics (ROC), etc. The history data are thus examples of usage experiences in the past. As an example, based on history data of using call graph data for users, the data source, application and ML models are monitored and the performance is measured and logged.
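A logged history entry can be pictured as a small record holding the data source, the application, the ML models tried and the measured metric. The sketch below is only an assumed layout: the class `HistoryRecord` and its field names are hypothetical, while SSE and RMSE follow their standard definitions.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class HistoryRecord:
    """One logged usage experience (field names are hypothetical)."""
    data_source: str
    application: str
    model_sequence: tuple
    metric: str
    value: float


def sse(y_true, y_pred):
    """Sum of Squared Errors between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred))


def rmse(y_true, y_pred):
    """Root-mean-square error: square root of the mean squared error."""
    return math.sqrt(sse(y_true, y_pred) / len(y_true))


y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 5.0, 4.0]
record = HistoryRecord(
    data_source="call_graph",
    application="influential_user_prediction",
    model_sequence=("SVD", "SVM"),
    metric="RMSE",
    value=rmse(y_true, y_pred),
)
```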

Since there will be many features available in the domain information of the database 23, the collecting part 21 may determine which ones are relevant for each analytics application. E.g. the collecting part 21 may map relevant features to different analytics applications based on the history data, e.g. performance measured, forming the collected features. For a given analytics application such as an influential user prediction, for each data source, ML model and performance measured, the application relevant features are obtained from the database 23 with regards to the performance measured for a same ML model, e.g. correlation based ranking. A majority voting process at the collecting part 21 may be performed to obtain a final list of application relevant features, e.g. based on a predefined cut-off, with history data for a number of ML models. For instance, the usage pattern such as mean, median, standard deviation, location visited, origin, destination may be determined not relevant to an influential user's classification application and may be excluded from the list of features. It should be noted that the cut-off may also be tuned empirically.
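The majority voting step can be sketched as follows, assuming that a per-model ranking of features (for instance by correlation with the measured performance) has already been computed. The function name `majority_vote`, the parameter `top_k` and the interpretation of the cut-off as a fraction of models are assumptions made for this example.

```python
from collections import Counter


def majority_vote(ranked_by_model, top_k, cutoff=0.5):
    """Keep features that appear in the top_k ranking of more than `cutoff` of the models.

    `ranked_by_model` maps an ML model name to its features, best first.
    """
    votes = Counter()
    for ranking in ranked_by_model.values():
        votes.update(set(ranking[:top_k]))  # one vote per model per feature
    n_models = len(ranked_by_model)
    return sorted(f for f, v in votes.items() if v / n_models > cutoff)


# Hypothetical rankings for an influential user prediction application
ranked_by_model = {
    "SVM": ["page_rank", "in_degree", "mean_usage"],
    "DT": ["in_degree", "page_rank", "location"],
    "RF": ["page_rank", "closeness", "in_degree"],
}
relevant = majority_vote(ranked_by_model, top_k=2)
```

With these toy rankings, usage-pattern features such as `mean_usage` and `location` never reach the top two for any model and are therefore excluded, mirroring the exclusion described in the text.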

The arrangement 10 may then, based on the final list of application relevant features with history data for the number of ML models, create a database (DB) for features and ML models, e.g. a DB comprising meta information and ML models measured from history data, a so called Features and ML models DB.

The second part 12 uses the features and the ML models from the first part 11 and also the history data logged on the history part 22, to model interactions between the features and ML models. As the features are collected based on interaction between the data source/s and the analytic application, the second part 12 models interactions between the analytic application, the data source and/or other data sources, and the ML model. The modelling may be referred to as Models, Data source and/or other data sources, and Application modeling. The second part 12 builds the interaction model wherein, for the data source, the extracted features may be clustered into a number of groups of clustered features. The second part 12 may generate a look up set comprising the sequence of ML models for each group of clustered features, wherein an ML model previously tried for a feature in the group, i.e. an ML model previously used for analysis, is tagged in the look up set. The second part 12 may further perform an automata process, wherein, for each tagged ML model, automata for accepting a prefix will be generated so there would be automata for each tagged model, the prefix being a sub-sequence of the sequence of ML models. The second part 12 may then merge automata for the tagged models with a same prefix, wherein the interaction model will be any of the tagged models with the same prefix.

The third part 13, for a new data source and application, then uses the interaction model built in the second part 12 to perform a recommendation of models/work-flow, i.e. the third part 13 performs a Model Selection. The third part 13 and the second part 12 may be comprised in a same communication node or in different communication nodes.

Embodiments herein may recommend e.g. an ML model to a user for a same application and a same data source as used previously. This may be useful when the data source and the application built are the same as used previously, for instance a churn prediction using CDR data. If there exists past experience with using CDR for churn prediction using various ML models for an operator X, this past experience may be used to recommend ML models for the same application, i.e. the churn prediction, using the same data source, i.e. the CDR, for an operator Y or the operator X for a different region, or any other context, in the future.

Embodiments herein may further recommend an ML model to a user for a same application and a different data source as used previously. If churn prediction is performed using network usage data, then 'common' attributes, i.e. features, may be used, which common attributes may be domain specific and statistical attributes, as well as similarity between applications, to recommend the ML model or models. For instance, if there are previous experiences with using network usage data for Upselling, being a different application than churning for different data sources, i.e. network usage data, then based on the fact that 'churning' and 'up-selling' are negatively correlated, an ML model may be recommended based on the historic data of an Upselling use case.

Embodiments herein may additionally recommend an ML model to a user for a different application and a same data source as used previously. It is similar to the above mentioned case, except the same set of attributes would be used e.g. for network usage data, and a subset of common attributes would not be used.

Embodiments herein may furthermore recommend an ML model to a user for a different application and a different data source as used previously. For example, if there are previous experiences of a preferred ML model using network usage data for Upselling, then the common attributes and relationship between various applications and various data sources that are mined from the history data may be used to recommend an ML model.

The method actions performed by the arrangement 10 for enabling a recommendation of a model to use in one or more analytics applications of a data source for a target domain according to some embodiments will now be described with reference to a flowchart depicted in Fig. 3. The actions do not have to be taken in the order stated below, but may be taken in any suitable order. Actions performed in some embodiments are marked with dashed boxes.

Action 301. The arrangement 10 or the second part 12 extracts features from a first dataset based on a domain information, of the target domain, for one or more applications using the data source and/or other data sources, which features are application relevant features with history data for a sequence of Maximum Likelihood, ML, models. The features may be extracted from: a maintained list of features that characterize a data source; or features generated statistically, resulting in a combined set of features, and wherein the combined set of features maximizes a performance for the one or more analytics applications of the data source and/or other data sources using history data.

For example, meta features may be extracted from the given dataset based on a target application. To build e.g. an influential users prediction model using call graphs, possible meta features, e.g. distributions, for each node can be as given below.

S. No   Measure
1       In-degree
2       Out-degree
3       Total degree
4       Eigen Vector Centrality
5       Page Rank
6       Between-ness Centrality
7       Closeness Centrality
8       Shapley value

Similarly, for an up-sell/recommendation task, the meta information may be the following derived from CDR, DDs and Customer Relationship Management (CRM) database.

This may be achieved by: (i) maintaining a list of meta features that would characterize a data source - this can be either using domain experts or by mining the knowledge repositories, e.g. research articles, web, and so on; (ii) generating meta features statistically - distributions of columns, Eigen values, latent space information and so on. From the combined set of meta features, a representative feature set may be generated by searching through a feature space that helps in maximizing the performance for applications using the history data. In other words, the meta features may capture application specific characteristics. The search of features may be achieved using a greedy search, local search or any other advanced techniques such as genetic algorithms. An objective of the search of features would be as follows:

$$\underset{\delta(A, d)}{\operatorname{argmax}} \sum_{M} \theta(\delta(A, d, M)) - \lambda \cdot D(\delta(A, d, M))$$

Here features 'δ' are selected for an application 'A', using data source 'd', such that the performance 'θ' is maximized across the sequences of ML models M, and there is a penalty on the number of features 'D' chosen. In order to specify the weightage to restrict the size of features, a Lagrange multiplier 'λ' may be introduced. To extract the features, the arrangement 10 searches through the feature space, for each performance metric Θ, and chooses the features that maximize the objective function. Note that the sum of performance across models ∑_M may be changed to any other operator such as median, mode, max, etc. Note that the objective function helps in selecting good features for each data source separately, which can later be used for ensemble or multi-source learning.
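A greedy search over this kind of objective can be sketched as below. This is a minimal illustration under stated assumptions, not the implementation of the embodiments: the performance oracle `toy_perf` and the per-feature contributions in `USEFUL` are invented for the example, the same feature set is evaluated for every ML model, and the penalty λ·D is applied once per candidate set rather than once per model (which only rescales λ).

```python
def objective(features, models, perf, lam):
    """Sum of performance over the models minus lam times the number of features."""
    return sum(perf(frozenset(features), m) for m in models) - lam * len(features)


def greedy_select(candidates, models, perf, lam):
    """Add one feature at a time while the penalized objective keeps improving."""
    selected = set()
    best = objective(selected, models, perf, lam)
    while True:
        gains = {f: objective(selected | {f}, models, perf, lam)
                 for f in candidates if f not in selected}
        if not gains:
            break
        f_best = max(sorted(gains), key=gains.get)  # sorted() makes ties deterministic
        if gains[f_best] <= best:
            break  # no remaining feature improves the objective
        selected.add(f_best)
        best = gains[f_best]
    return selected, best


# Hypothetical performance oracle: each useful feature adds a fixed amount.
USEFUL = {"in_degree": 0.30, "page_rank": 0.25, "mean_usage": 0.02}


def toy_perf(features, model):
    return sum(USEFUL.get(f, 0.0) for f in features)


chosen, score = greedy_select(
    ["in_degree", "page_rank", "mean_usage", "location"],
    ["SVM", "DT"],
    toy_perf,
    lam=0.1,
)
```

In practice the oracle would be a lookup into the logged history data rather than a formula, and replacing `sum` in `objective` with a median or max gives the operator variants mentioned in the text.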

Secondly, an important advantage with this setup is that a lot of domain information/expert systems based knowledge may be input and the domain information/expert systems based knowledge may then be filtered out based on the application and data source at hand. The domain information may be input from experts, publications, patents and other materials.

Action 302. The arrangement 10 models interactions of the one or more application, the data source and/or other data sources and the sequence of ML models, by using the extracted features and the history data of the sequence of ML models into an interaction model. This modelling may be a predictive model using a given sequence of ML models and performance. The arrangement 10 may e.g. cluster or group the extracted features into a number of groups of clustered features. In addition, the arrangement 10 may generate a look up set comprising the sequence of ML models for each group of clustered features, wherein an ML model previously tried for a feature in the group is tagged in the look up set. The arrangement 10 may perform an automata process, wherein, for each tagged ML model, automata for accepting a prefix will be generated so there would be automata for each tagged ML model. The prefix is a sub-sequence of the sequence of ML models. The arrangement 10 may merge automata for the tagged models with a same prefix, wherein the interaction model will be any of the tagged ML models with the same prefix.
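The grouping of extracted features can be illustrated with a small clustering sketch. The embodiments do not name a clustering algorithm, so plain k-means is used here purely as a stand-in; the deterministic initialization from the first k points and the two-dimensional meta-feature vectors are assumptions made to keep the example reproducible.

```python
def dist2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=50):
    """Group meta-feature vectors into k clusters (k-means, first-k initialization)."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assign every point to its nearest centroid
        new_labels = [min(range(k), key=lambda c: dist2(p, centroids[c])) for p in points]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        # Move each centroid to the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical meta-feature vectors, one per dataset of a data source
datasets = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9)]
labels, centroids = kmeans(datasets, k=2)
```

Each resulting group would then get its own look up set of tried ML model sequences.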

The arrangement 10 may thus perform a ML Models (M), Data Source (D) and Applications (A) modelling. In the previous action 301, the interactions between a data source and application across ML models are modelled for a certain performance metric Θ. In this action 302, the interactions between the features, e.g. meta-information, which features capture A and D interactions, and sequence of models (M) are modelled.

Here, the arrangement 10 assists developers with models to try during business process creation using the data available of the data source. This may be achieved by mining the history data of system usage. For each e.g. Telecom dataset, and for a particular application, e.g. churn prediction, characteristics C of the data source are captured, together with the sequence of data analysis models M that was tried and the performance metric Θ achieved with it. It would be a triplet <C, M, θ>. With this a predictive model may be built, to suggest a model M_t, after M_t-1 or before M_t+1, for instance Subset selection using entropy after discretization for a data source with characteristics Cx based on the history data. Example data characteristics with a call graph would be in/out/total degree distributions, number of triangles, connected components, communities, etc. With a new operator's call graph, a K closest match may be found from history data with regards to attributes used and the models may be suggested for a queried category in the same order as their performance. For example, to classify users as churners or not, and to use SVM as the classifier (M_t+1), SVD might have performed better than Random Projections (RP) for a similar call graph, so embodiments herein may suggest SVD above RP (M_t). Embodiments herein add more information with more usage, thus it is incremental in learning.

A simple heuristic to build a predictive model using the given ML model sequence and performance is given below.

Grouping datasets: For each data source, the datasets are grouped into K groups by clustering the extracted features.

Look-up Set Generation: For each clustered dataset, the sequence of ML models tried will be a feature in the sequence. In the sequence ABCX with the ML model X, if the performance increases with regard to a measure, e.g. accuracy, then the sequence ABCX will be added to a look-up set for accuracy, with a tag on X. Similarly, there will be a different look-up set for each measure, e.g. F1, Area Under Curve (AUC), Precision, and so on. If users did not estimate accuracy after trying each model, but only after trying two or more models, then the tagged ML model(s) may be of any arbitrary length (>=1).

Tagged Model Automata: In this action, for each tagged model(s) in the previous action, automata for "accepting" the prefix will be generated, so there would be automata for each tagged model(s). Given an input sequence of ML models, all the automata are used to match the input sequence and to choose the tagged ML models to suggest. Sequences of ML models occurring fewer times than a tunable threshold Δ will not be considered for automata generation, because they might not be statistically significant.

Merging Automata: Automata for tagged models with a common prefix may be merged together, and the final interaction model will be any of the tagged models. The ML models will be ranked based on the average increase in the performance measure achieved across observations. This can be used to rank the next possible ML model to try. E.g. when there are sequences ABCD, ABCDE and BCD, this process finds a closed sequence ABCDE, which covers all of the sequences ABCD, ABCDE and BCD, since they have the same prefix BCD. Automata is a technique that recognizes the available sequences of ML models. E.g. if ABC is given, embodiments herein may look up and suggest DE to follow.
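The look-up set, tagged model automata and merging steps above can be illustrated with a minimal Python sketch. It is not the implementation of the embodiments: it assumes that each history entry is a pair of a model sequence and the metric change observed after the last model, that exactly one model is tagged per entry, and that the merged automata can be represented as a single prefix tree. The names `build_lookup_set`, `build_automaton`, `suggest_next` and `min_count` (standing in for the threshold Δ) are hypothetical, and the history values are made up.

```python
from collections import defaultdict


def build_lookup_set(history, min_count=2):
    """Map each observed prefix to its tagged next models and their mean gain.

    history: iterable of (models, gain) pairs, where `models` is a tuple of
    model names and `gain` is the metric change seen after the last model.
    """
    gains = defaultdict(list)
    for models, gain in history:
        if len(models) >= 2 and gain > 0:           # keep only improving steps
            prefix, tagged = tuple(models[:-1]), models[-1]
            gains[(prefix, tagged)].append(gain)

    lookup = defaultdict(dict)
    for (prefix, tagged), observed in gains.items():
        if len(observed) >= min_count:              # assumed role of threshold delta
            lookup[prefix][tagged] = sum(observed) / len(observed)
    return dict(lookup)


def build_automaton(lookup):
    """Merge all prefixes into one prefix tree; nodes store tagged models."""
    root = {"children": {}, "tagged": {}}
    for prefix, tagged in lookup.items():
        node = root
        for model in prefix:
            node = node["children"].setdefault(
                model, {"children": {}, "tagged": {}})
        node["tagged"].update(tagged)
    return root


def suggest_next(automaton, tried):
    """Rank candidate next models; the longest accepted suffix of `tried` wins."""
    for start in range(len(tried)):
        node = automaton
        for model in tried[start:]:
            node = node["children"].get(model)
            if node is None:
                break
        else:
            if node["tagged"]:
                return sorted(node["tagged"], key=node["tagged"].get,
                              reverse=True)
    return []


# Made-up history for one cluster of datasets and one metric (e.g. accuracy).
history = [
    (("A", "B", "C", "D"), 0.04),
    (("A", "B", "C", "D"), 0.02),
    (("A", "B", "C", "E"), 0.01),
    (("A", "B", "C", "E"), 0.01),
    (("B", "C", "F"), 0.05),         # seen once only, dropped by the threshold
    (("A", "B", "C", "G"), -0.02),   # no improvement, never tagged
    (("A", "B", "C", "G"), -0.02),
]
automaton = build_automaton(build_lookup_set(history, min_count=2))
suggestions = suggest_next(automaton, ("A", "B", "C"))
```

In this sketch, D is suggested before E after the sequence ABC because its average gain is higher. One look-up set and one automaton would be built per metric and per group of datasets.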

A similar procedure may be repeated for all the metrics Θ. The method described above is one possible implementation, and may be replaced with an instance of a generic solution which models the co-occurrence matrix M*. For the matrix M*, history data is obtained for each data source's features and each metric Θ, which history data captures the various sequences of ML models M tried and the performances measured.

Action 303. The arrangement 10 further provides the interaction model to a model selection process enabling a recommendation of a model to use for the one or more analytics applications. E.g. the second part 12 may provide the interaction model to the third part 13.

Action 304. The arrangement 10 may further calibrate history data for each application, data source and ML model combination. In the history data, values measured for each application, data source and ML model combination, may be re-calibrated due to possible differences in a running setup. The possible differences may be number of folds used in cross-validation, training/test ratio used for evaluation, maximum number of iterations set, convergence threshold, initialization of parameters and so on. So, a factorization method may be used as a re-calibration of values. Here R is a matrix, where each row is a dataset and columns denote metrics recorded using different ML models. The factorization method works as follows

$R \approx P \times Q = \hat{R}$

This is solved by minimizing the following objective function

There are many methods, such as stochastic gradient descent, to solve the above objective function. This was verified on 30 binary classification datasets from a repository, and the benchmark/history was built using openml, where many data scientists upload their experimental results on various datasets and data sources. For a leave-one-out cross-validation of ranking models based on the top K similar experiments in the benchmark, results with calibration, denoted NMF Rho, and without calibration, denoted Non NMF Rho, are given below. The graph below shows that using only the closest historic experiment/benchmark entry gives the best performance. Rho is the Spearman rank correlation for model ranking. The models chosen for the evaluation were: Bayesian Net, Naïve Bayes, Logistic Regression, SVM, 10NN (KNN), KStar, JRip and J48 decision tree. This shows that a re-calibration procedure may be helpful to improve performance in a crowd-sourced setup.

[Graph: NMF vs Non_NMF, Rho versus K, with K ranging from 5 to 25]
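The re-calibration of action 304 can be sketched as a small factorization routine. The following is a minimal Python sketch, not the implementation of the embodiments: it assumes a squared reconstruction error with L2 regularization as the objective, stochastic gradient descent with projection onto non-negative values as the solver, and `None` as a marker for models that were never run on a dataset. The names `factorize`, `recalibrate`, `rank`, `lr` and `reg` are hypothetical, and the scores are made up.

```python
import random


def factorize(R, rank=1, epochs=1000, lr=0.05, reg=0.01, seed=0):
    """Approximate R by P x Q with non-negative factors.

    R is a list of rows (datasets); columns hold a metric per ML model.
    Entries set to None (model not tried on that dataset) are skipped.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(R), len(R[0])
    P = [[rng.random() for _ in range(rank)] for _ in range(n_rows)]
    Q = [[rng.random() for _ in range(n_cols)] for _ in range(rank)]
    observed = [(i, j) for i in range(n_rows) for j in range(n_cols)
                if R[i][j] is not None]
    for _ in range(epochs):
        rng.shuffle(observed)
        for i, j in observed:
            error = R[i][j] - sum(P[i][k] * Q[k][j] for k in range(rank))
            for k in range(rank):
                p, q = P[i][k], Q[k][j]
                # Gradient step, then projection onto non-negative values.
                P[i][k] = max(0.0, p + lr * (error * q - reg * p))
                Q[k][j] = max(0.0, q + lr * (error * p - reg * q))
    return P, Q


def recalibrate(R, **kwargs):
    """Return the smoothed matrix R_hat = P x Q."""
    P, Q = factorize(R, **kwargs)
    rank = len(Q)
    return [[sum(P[i][k] * Q[k][j] for k in range(rank))
             for j in range(len(Q[0]))] for i in range(len(P))]


# Made-up scores: rows differ only by a per-dataset scale (run-setup effect).
scales = [1.0, 0.95, 0.9, 0.85]
quality = [0.9, 0.8, 0.7]
R = [[s * q for q in quality] for s in scales]
R[3][2] = None                       # this model was never run on dataset 3
R_hat = recalibrate(R, rank=1)
```

The re-calibrated values in `R_hat` would then replace the raw history values before models are ranked. The rank of the factorization and the regularization weight are tuning choices that the description does not specify.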

Action 305. Furthermore, the arrangement 10 may recommend the model to use based on a ranking of the provided interaction model. The arrangement 10 may e.g. find a cluster of models to which the interaction model belongs; rank models of the cluster of models by a weighted sum of scores for each model in the cluster of models; and perform a final ranking based on an overall score assigned to each model of the cluster of models. The arrangement 10 may comprise the model selection process in the third part 13. As an example, based on the data source, application and metric, a model suggestion module or a sequence of models suggestion module will be triggered accordingly. The steps are given below:

• Find the cluster to which a dataset or data source belongs.

• Rank models by a weighted sum of scores for each model, $\sum_i S_i \cdot r_M(\theta)$, where $S_i$ denotes the membership to each cluster and $r_M(\theta)$ denotes the ranking of a model M based on metric θ.

• Final ranking is done based on the overall score assigned to each model, or to a sequence of ML models.

Contributions of embodiments herein include the following:

(i) Modelling the history data along three dimensions: the application, the data source and/or other data sources, and the sequence of ML models, while preserving the sequential information of the ML models used.

(ii) Enabling a model selection during application development, leveraging relationships within and between applications and models, and learning a system specific to a domain's sources, e.g. Telecom Probes.
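The ranking steps of action 305, i.e. a weighted sum of per-cluster scores followed by a final ranking, can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the cluster memberships S_i are assumed to be given as weights, and r_M(θ) is assumed to be a rank-based score where a higher value is better. The function name `rank_models`, the cluster labels and all numbers are hypothetical.

```python
def rank_models(memberships, cluster_scores):
    """Rank models by the weighted sum of their per-cluster scores.

    memberships:    {cluster: S_i}, membership of the dataset in each cluster
    cluster_scores: {cluster: {model: r_M(theta)}}, higher is better (assumed)
    """
    totals = {}
    for cluster, weight in memberships.items():
        for model, score in cluster_scores.get(cluster, {}).items():
            totals[model] = totals.get(model, 0.0) + weight * score
    # Final ranking by overall score; ties are broken by model name.
    return sorted(totals, key=lambda model: (-totals[model], model))


# Made-up memberships and rank scores for one metric.
memberships = {"c1": 0.7, "c2": 0.3}
cluster_scores = {
    "c1": {"SVM": 3, "J48": 2, "NaiveBayes": 1},
    "c2": {"SVM": 1, "J48": 3, "NaiveBayes": 2},
}
ranking = rank_models(memberships, cluster_scores)
```

With a hard cluster assignment, one membership would be 1 and the others 0, and the ranking would reduce to the ranking recorded for that single cluster.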

Hence, embodiments herein provide:

• For a data source, such as call graphs, session records, DPI or CRM data, features or 'meta-features' are extracted, which features are a combination of statistical and domain specific attributes.

• For each application, a local search for features is performed using the training data to find the useful set of attributes.

• Based on usage data, e.g. crowd sourced information, the data source, applications, models used and the metrics may be tracked or monitored.

• In order to overcome the possibility of having noise in the recorded data, a de-noising technique may be used to smoothen the crowd sourced information.

• For a given data source and application, one or more models are recommended based on the history data.
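The meta-feature extraction and the look-up of the closest history entry can be illustrated for a call graph. The following is a minimal Python sketch, not the implementation of the embodiments: it assumes the call graph is given as (caller, callee) pairs, picks a handful of the characteristics named in the description (degrees, triangles, connected components), and uses an unscaled Euclidean distance, which in practice would likely need feature normalisation. The names `call_graph_features` and `k_closest`, the feature keys and all data are hypothetical.

```python
import math


def call_graph_features(edges):
    """Compute a few meta-features from (caller, callee) pairs."""
    out_deg, in_deg, neighbours = {}, {}, {}
    for caller, callee in edges:
        out_deg[caller] = out_deg.get(caller, 0) + 1
        in_deg[callee] = in_deg.get(callee, 0) + 1
        neighbours.setdefault(caller, set()).add(callee)
        neighbours.setdefault(callee, set()).add(caller)
    nodes = list(neighbours)
    if not nodes:
        return {"mean_out_degree": 0.0, "max_in_degree": 0,
                "triangles": 0, "components": 0}

    # Count each undirected triangle once by ordering its nodes u < v < w.
    triangles = sum(1 for u in nodes for v in neighbours[u]
                    for w in neighbours[v]
                    if u < v < w and w in neighbours[u])

    # Count connected components with an iterative depth-first search.
    seen, components = set(), 0
    for start in nodes:
        if start in seen:
            continue
        components += 1
        stack = [start]
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(neighbours[node] - seen)

    return {"mean_out_degree": sum(out_deg.values()) / len(nodes),
            "max_in_degree": max(in_deg.values()),
            "triangles": triangles,
            "components": components}


def k_closest(query, history, k=1):
    """Return the k history entries whose features are closest to `query`."""
    def distance(features):
        return math.sqrt(sum((query[key] - features[key]) ** 2
                             for key in query))
    return sorted(history, key=lambda entry: distance(entry[0]))[:k]


# Made-up call graph of a new operator and two made-up history entries.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "e")]
features = call_graph_features(edges)
history = [
    ({"mean_out_degree": 0.8, "max_in_degree": 1,
      "triangles": 1, "components": 2}, ["SVD", "SVM"]),
    ({"mean_out_degree": 3.0, "max_in_degree": 10,
      "triangles": 40, "components": 1}, ["RP", "SVM"]),
]
matches = k_closest(features, history, k=1)
```

The model sequence stored with the closest entry, here SVD followed by SVM, would be the first candidate to suggest for the new data source.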

In simple terms, embodiments herein provide a generalized arrangement and method to recommend models to cater to many analytics applications that can be performed in the target domain.

Advantages of embodiments herein

• Faster application building/creating for operators and embodiments herein may be used by anyone, not necessarily by ML experts.

• Allow smaller operators to use a central arrangement enabling model selection for their applications instead of building and maintaining one themselves.

• This also provides an additional revenue source for operators, where this may be offered "as a service" deployed in their premises, and the functionalities may be shared with 3rd party vendors, like advertisement agencies, without moving the data out of the operators' premises.

• Model sequence selection for application developers.

• The platform will learn continuously with more data sources or datasets for model selection, a so-called intelligent assistance.

Figure 4 is a block diagram depicting the arrangement 10 for enabling a recommendation of a model to use in one or more analytics applications of a data source for a target domain according to embodiments herein. The arrangement 10 may comprise processing circuitry 401 configured to perform the methods herein.

The arrangement 10 is configured to extract features from a first dataset based on a domain information, of the target domain, for one or more applications using the data source and/or other data sources, which features are application relevant features with history data for a sequence of ML models. The arrangement 10 may comprise an extracting module 402 such as the first part 11. The extracting module 402 and/or the processing circuitry 401 may be configured to extract features from a first dataset based on a domain information, of the target domain, for one or more applications using the data source and/or other data sources, which features are application relevant features with history data for a sequence of ML models.

The arrangement 10 is further configured to model interactions of the one or more application, the data source and/or other data sources, and the sequence of ML models, by using the extracted features, and the history data of the sequence of ML models into an interaction model. The arrangement 10 may comprise a modelling module 403 such as the second part 12. The modelling module 403 and/or the processing circuitry 401 may be configured to model interactions of the one or more application, the data source and/or other data sources, and the sequence of ML models, by using the extracted features, and the history data of the sequence of ML models into an interaction model.

The arrangement 10 is additionally configured to provide the interaction model to a model selection process enabling a recommendation of a model to use for the one or more analytics applications. The arrangement 10 may comprise a providing module 404 such as the third part 13. The providing module 404 and/or the processing circuitry 401 may be configured to provide the interaction model to a model selection process enabling a recommendation of a model to use for the one or more analytics applications.

The arrangement 10, the modelling module 403 and/or the processing circuitry 401 may be configured, when the model is a predictive model using a given sequence of models and performance, to cluster the extracted features into a number of groups of clustered features. The arrangement 10, the modelling module 403 and/or the processing circuitry 401 may then be configured to generate a look up set comprising the sequence of ML models for each group of clustered features wherein an ML model previously tried for a feature in the group is tagged in the look up set. In addition, the arrangement 10, the modelling module 403 and/or the processing circuitry 401 may be configured to perform an automata process, wherein, for each tagged ML model, automata for accepting a prefix will be generated so there would be automata for each tagged ML model, wherein the prefix is a sub sequence of the sequence of ML models. The arrangement 10, the modelling module 403 and/or the processing circuitry 401 may further be configured to merge automata for the tagged models with a same prefix, wherein the interaction model will be any of the tagged ML models with the same prefix.

The arrangement 10, the extracting module 402 and/or the processing circuitry 401 may be configured to extract the features from a maintained list of features that characterize a data source; or features generated statistically, resulting in a combined set of features, and wherein the combined set of features maximizes a performance for the one or more analytics applications of the data source and/or other data sources using history data.

The arrangement 10 may further be configured to calibrate history data for each application, data source and ML model combination. The arrangement 10 may comprise a calibrating module 405. The calibrating module 405 and/or the processing circuitry 401 may be configured to calibrate history data for each application, data source and ML model combination.

The arrangement 10 may further be configured to recommend the model to use based on a ranking of the provided interaction model. The arrangement 10 may comprise a recommending module 406. The recommending module 406 and/or the processing circuitry 401 may be configured to recommend the model to use based on a ranking of the provided interaction model.

The arrangement 10, the recommending module 406 and/or the processing circuitry 401 may further be configured to recommend the model by finding a cluster of models to which the interaction model belongs; ranking models of the cluster of models by a weighted sum of scores for each model in the cluster of models; and performing a final ranking based on an overall score assigned to each model of the cluster of models.

The arrangement 10 further comprises a memory 407. The memory comprises one or more units to be used to store data on, such as sequences of ML models, interaction models, features, one or more data sources, applications, interactions of the features and the sequence of ML models, applications to perform the methods disclosed herein when being executed, and similar.

The methods according to the embodiments described herein for the arrangement 10 may be implemented by means of e.g. a computer program 408 or a computer program product, comprising instructions, i.e., software code portions, which, when executed on at least one processor, cause the at least one processor to carry out the actions described herein, as performed by the arrangement 10. The computer program 408 may be stored on a computer-readable storage medium 409, e.g. a disc or similar. The computer-readable storage medium 409, having stored thereon the computer program 408, may comprise the instructions which, when executed on at least one processor, cause the at least one processor to carry out the actions described herein, as performed by the arrangement 10. In some embodiments, the computer-readable storage medium may be a non-transitory computer-readable storage medium.

As will be readily understood by those familiar with communications design, functions, means or modules may be implemented using digital logic and/or one or more microcontrollers, microprocessors, or other digital hardware. In some embodiments, several or all of the various functions may be implemented together, such as in a single application-specific integrated circuit (ASIC), or in two or more separate devices with appropriate hardware and/or software interfaces between them. Several of the functions may be implemented on a processor shared with other functional components of a communication node, for example.

Alternatively, several of the functional elements of the processing means discussed may be provided through the use of dedicated hardware, while others are provided with hardware for executing software, in association with the appropriate software or firmware. Thus, the term "processor" or "controller" as used herein does not exclusively refer to hardware capable of executing software and may implicitly include, without limitation, digital signal processor (DSP) hardware, read-only memory (ROM) for storing software, random-access memory for storing software and/or program or application data, and non-volatile memory. Other hardware, conventional and/or custom, may also be included. Designers of communications receivers will appreciate the cost, performance, and maintenance tradeoffs inherent in these design choices.

Modifications and other embodiments of the disclosed embodiments will come to mind to one skilled in the art having the benefit of the teachings presented in the foregoing descriptions and the associated drawings. Therefore, it is to be understood that the embodiment(s) is/are not to be limited to the specific embodiments disclosed and that modifications and other embodiments are intended to be included within the scope of this disclosure. Although specific terms may be employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation.

References

[1] http://www.cs.cmu.edU/afs/cs/academic/class/10601 -f 10/lecture/lec16.pdf

[2] US6842751

[3] US6584456

[4] http://www.cs.bris.ac.uk/Research/MachineLearning/wekametal/

Claims

1. A method performed in an arrangement (10) for enabling a recommendation of a model to use in one or more analytics applications of a data source for a target domain; comprising

- extracting (301) features from a first dataset based on a domain information, of the target domain, for one or more applications using the data source and/or other data sources, which features are application relevant features with history data for a sequence of Maximum Likelihood, ML, models;

- modelling (302) interactions of the one or more application, the data source and/or other data sources, and the sequence of ML models, by using the extracted features, and the history data of the sequence of ML models into an interaction model, and

- providing (303) the interaction model to a model selection process enabling a recommendation of a model to use for the one or more analytics applications.

2. A method according to claim 1, wherein the modelling (302) is a predictive model using a given sequence of models and performance, comprising

- clustering the extracted features into a number of groups of clustered features;

- generating look up set comprising the sequence of ML models for each group of clustered features wherein a ML model previously tried for a feature in the group is tagged in the look up set;

- performing an automata process, wherein, for each tagged ML model, automata for accepting a prefix will be generated so there would be automata for each tagged ML model, wherein the prefix is a sub sequence of the sequence of ML models; and

- merging automata for the tagged models with a same prefix, wherein the interaction model will be any of the tagged ML models with the same prefix.

3. A method according to any of the claims 1-2, wherein the features are extracted from: a maintained list of features that characterize a data source; or features generated statistically, resulting in a combined set of features, and wherein the combined set of features maximizes a performance for the one or more analytics applications of the data source and/or other data sources using history data.

4. A method according to any of the claims 1-3, further comprising

- calibrating (304) history data for each application, data source and ML model combination.

5. A method according to any of the claims 1-4, further comprising

- recommending (305) the model to use based on a ranking of the provided interaction model.

6. A method according to claim 5; wherein the recommending (305) comprises:

- finding a cluster of models to which the interaction model belongs to;

- ranking models of the cluster of models by a weighted sum of score for each model in the cluster of models; and

- performing a final ranking based on an overall score assigned to each model of the cluster of models.

7. An arrangement (10) for enabling a recommendation of a model to use in one or more analytics applications of a data source for a target domain; being configured to:

extract features from a first dataset based on a domain information, of the target domain, for one or more applications using the data source and/or other data sources, which features are application relevant features with history data for a sequence of Maximum Likelihood, ML, models;

model interactions of the one or more application, the data source and/or other data sources, and the sequence of ML models, by using the extracted features, and the history data of the sequence of ML models into an interaction model, and

to provide the interaction model to a model selection process enabling a recommendation of a model to use for the one or more analytics applications.

8. An arrangement (10) according to claim 7, further configured to model being a predictive model using a given sequence of models and performance, and being configured to

cluster the extracted features into a number of groups of clustered features;

generate a look up set comprising the sequence of ML models for each group of clustered features wherein a ML model previously tried for a feature in the group is tagged in the look up set;

perform an automata process, wherein, for each tagged ML model, automata for accepting a prefix will be generated so there would be automata for each tagged ML model, wherein the prefix is a sub sequence of the sequence of ML models; and

merge automata for the tagged models with a same prefix, wherein the interaction model will be any of the tagged ML models with the same prefix.

9. An arrangement (10) according to any of the claims 7-8, being configured to extract the features from: a maintained list of features that characterize a data source; or features generated statistically, resulting in a combined set of features, and wherein the combined set of features maximizes a performance for the one or more analytics applications of the data source and/or other data sources using history data.

10. An arrangement (10) according to any of the claims 7-9, further being configured to calibrate history data for each application, data source and ML model combination.

11. An arrangement according to any of the claims 7-10, further configured to recommend the model to use based on a ranking of the provided interaction model.

12. An arrangement (10) according to claim 11; being configured to recommend the model by finding a cluster of models to which the interaction model belongs to; ranking models of the cluster of models by a weighted sum of score for each model in the cluster of models; and performing a final ranking based on an overall score assigned to each model of the cluster of models.

13. A computer program, comprising instructions, which, when executed on at least one processor, cause the at least one processor to carry out the method according to any of the claims 1-6.

14. A computer-readable storage medium having stored thereon a computer program, comprising instructions which, when being executed on at least one processor, cause the at least one processor to carry out the method according to any of the claims 1-6.