WO2022180421A1 - Methods for generating a machine learning composition - Google Patents

Methods for generating a machine learning composition

Info

Publication number
WO2022180421A1
Authority
WO
WIPO (PCT)
Prior art keywords
topology
composition
modules
current version
task
Prior art date
Application number
PCT/IB2021/051576
Other languages
French (fr)
Inventor
Jalil TAGHIA
Wenfeng HU
Carmen LEE
Original Assignee
Telefonaktiebolaget Lm Ericsson (Publ)
Priority date
Filing date
Publication date
Application filed by Telefonaktiebolaget Lm Ericsson (Publ) filed Critical Telefonaktiebolaget Lm Ericsson (Publ)
Priority to PCT/IB2021/051576 priority Critical patent/WO2022180421A1/en
Priority to EP21709124.8A priority patent/EP4298553A1/en
Publication of WO2022180421A1 publication Critical patent/WO2022180421A1/en


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/086 Learning methods using evolutionary algorithms, e.g. genetic algorithms or genetic programming

Definitions

  • the present disclosure relates to a method for generating a Machine Learning (ML) composition that is optimized to perform a composition task in a communication network.
  • the present disclosure also relates to a management node and to a computer program and a computer program product configured, when run on a computer, to carry out a method for generating an ML composition.
  • ML Machine Learning
  • DNN Deep Neural Networks
  • the building blocks can be combined to solve new tasks using combinatory optimization, as discussed for example in Kirsch, Louis, Julius Kunze, and David Barber, "Modular networks: Learning to decompose neural computation," in Advances in Neural Information Processing Systems, pp. 2408-2418, 2018. Although the explainable approaches discussed above offer insight into the inner mechanism of a model, they do not have "explainability" baked into the design: before the results are obtained, it remains largely unpredictable what the model will produce. A linear model offers explainability, but at the cost of compromised model performance, and linear models are not suitable for modelling real-world datasets that are highly nonlinear in nature.
  • Piece-wise linear models have been introduced as an approach towards improving model expressiveness while maintaining the interpretability.
  • a piece-wise linear model divides data into a number of regions, with each region modelled using a linear model.
  • the main limitation of such techniques is that the division of data into regions ignores the long-term temporal and spatial dependencies that may be present in the data.
  • Mixture of linear models is another family of techniques which seeks to improve the model expressiveness capabilities of linear models.
  • Such techniques can be seen as a data-driven variant of piece-wise linear models, in which data are divided into a number of regions, each of which is modelled by a linear model. As discussed above, such mixtures cannot capture the long-term dependencies in data. Both piece-wise linear models and mixtures of linear models perform poorly in comparison to nonlinear models such as DNNs, which are able to capture complex dynamics in data. Additionally, while offering some improvements in explainability compared with DNNs, these techniques lack the ease of interpretability that is offered by a simple linear model, owing to the need for various user-defined hyperparameters, including the number of regions and the number of mixture components.
  • It is an aim of the present disclosure to provide a method and a management node which at least partially address one or more of the challenges discussed above. It is a further aim of the present disclosure to provide a management node, computer readable medium and associated method which enable the generation of an ML composition that is optimized for performing a composition task, and that comprises a plurality of ML modules, each ML module trained to perform a module task that is specific to the ML module, and each ML module comprising at least one ML model.
  • Such a composition may offer a combination of explainability, though understanding of the components of the composition and their interconnections, and reliable task performance.
  • a computer implemented method for generating a Machine Learning (ML) composition that is optimized to perform a composition task in a communication network.
  • the ML composition comprises a plurality of interconnected ML modules, each ML module trained to perform a module task that is specific to the ML module, and an ML module comprises at least one ML model.
  • the method is performed by a management node and comprises obtaining a candidate set of ML modules, each ML module in the candidate set being a candidate for inclusion in the composition, and initiating a current version of a topology for the composition, the initiated current version topology comprising at least two ML modules from the candidate set and at least one connection between the ML modules.
  • the method further comprises repeating, until a termination condition is satisfied, the steps of identifying, from the candidate set of ML modules, a possible topology for the composition that includes the current version of the topology and minimizes a first loss function associated with the composition task, evolving the current version of the topology by adding at least one node or connection from the identified possible topology to the current version of the topology, and evaluating the current version of the topology using a second loss function associated with the composition task.
  • the ML composition optimized to perform the composition task comprises the ML modules and connections between ML modules present in the current version of the topology when the termination condition is satisfied.
  • a computer program product comprising a computer readable medium, the computer readable medium having computer readable code embodied therein, the computer readable code being configured such that, on execution by a suitable computer or processor, the computer or processor is caused to perform a method according to any one or more of aspects or examples of the present disclosure.
  • a management node for generating a Machine Learning (ML) composition that is optimized to perform a composition task in a communication network, wherein the ML composition comprises a plurality of interconnected ML modules, each ML module trained to perform a module task that is specific to the ML module, and wherein an ML module comprises at least one ML model.
  • ML Machine Learning
  • the management node comprises processing circuitry configured to cause the management node to obtain a candidate set of ML modules, each ML module in the candidate set being a candidate for inclusion in the composition and initiate a current version of a topology for the composition, the initiated current version topology comprising at least two ML modules from the candidate set and at least one connection between the ML modules.
  • the processing circuitry is further configured to cause the management node to repeat, until a termination condition is satisfied, the steps of identifying, from the candidate set of ML modules, a possible topology for the composition that includes the current version of the topology and minimizes a first loss function associated with the composition task, evolving the current version of the topology by adding at least one node or connection from the identified possible topology to the current version of the topology, and evaluating the current version of the topology using a second loss function associated with the composition task.
  • the ML composition optimized to perform the composition task comprises the ML modules and connections between ML modules present in the current version of the topology when the termination condition is satisfied.
  • a management node for generating a Machine Learning (ML) composition that is optimized to perform a composition task in a communication network, wherein the ML composition comprises a plurality of interconnected ML modules, each ML module trained to perform a module task that is specific to the ML module, and wherein an ML module comprises at least one ML model.
  • the management node comprises an ML module selector unit for obtaining a candidate set of ML modules, each ML module in the candidate set being a candidate for inclusion in the composition.
  • the management node further comprises an evolutionary topology learner unit, the evolutionary topology learner unit comprising an initiator unit for initiating a current version of a topology for the composition, the initiated current version topology comprising at least two ML modules from the candidate set and at least one connection between the ML modules, and a topology search unit, a topology grower unit and a loss calculator unit.
  • the topology search unit, topology grower unit and loss calculator unit are for repeating, until a termination condition is satisfied, the steps of identifying, by the topology search unit and from the candidate set of ML modules, a possible topology for the composition that includes the current version of the topology and minimizes a first loss function associated with the composition task, evolving, by the topology grower unit, the current version of the topology by adding at least one node or connection from the identified possible topology to the current version of the topology, and evaluating, by the loss calculator unit, the current version of the topology using a second loss function associated with the composition task.
  • the ML composition optimized to perform the composition task comprises the ML modules and connections between ML modules present in the current version of the topology when the termination condition is satisfied.
  • Examples of the present disclosure thus provide a method and a management node that approach the task of using Machine Learning for addressing a task through a modular design.
  • ML Modules are introduced that are specialized at solving specific tasks, and may comprise one or more ML models.
  • a composition comprising multiple ML modules is then learned, the composition being optimized to address a composition task.
  • the composition itself can then offer insight on how it obtains the solution to the task, through the knowledge of which specialized ML modules are included in the composition, and how they are interconnected.
  • the ML modules to include and their interconnections are learned in a data driven manner, but once established, the tasks at which these modules are specialized, and their interconnections, can offer conceptual insight into how the composition obtains a solution.
  • Figure 1 is a flow chart illustrating process steps in a computer implemented method for generating a Machine Learning (ML) composition that is optimized to perform a composition task in a communication network
  • Figures 2a to 2d show a flow chart illustrating process steps in another example of a computer implemented method for generating an ML composition that is optimized to perform a composition task in a communication network
  • Figure 3 is a block diagram illustrating functional units in a management node
  • Figure 4 is a block diagram illustrating functional units in another example of a management node
  • Figure 5 illustrates process flow through components of a management node
  • Figure 6 illustrates an example logical implementation architecture for methods according to the present disclosure
  • Figure 7 illustrates process flow for an Evolutionary Topology Learner unit
  • Figure 8 illustrates process flow for a Conditional Topology Grower unit
  • Figure 9 illustrates an example of a top-down search
  • Examples of the present disclosure propose to address the challenge of generating a Machine Learning solution to a composition task by considering the composition task as a collection of smaller tasks, identifying the relations between such tasks, and then combining ML modules specialized at solving the smaller tasks into an ML composition for solving the composition task.
  • the present disclosure thus proposes a modular design, according to which a plurality of ML modules may each be specialized at solving specific tasks.
  • Each ML module comprises at least one ML model, and may comprise a plurality of interconnected ML models, and each ML module has been optimized for performing its specific task.
  • the ML modules are reusable, and, once optimized for their specific task, may be combined and used together to solve a wide range of composition tasks that may be only weakly related to the specific task for which any one ML module was trained.
  • When used in a composition for solving a new composition task, ML modules do not need any additional training. Instead, individual ML modules may be managed and undergo periodic or scheduled retraining during their own lifecycle.
  • a set of potentially useful, or candidate, ML modules is first obtained from a dictionary or other source of available ML modules. A composition of these modules that is able to solve the composition task is then learned from data.
  • the learned composition comprising the ML modules retained from the candidate set and the interconnections between them, offers insight on the solution to the composition task that is provided by the composition.
  • the individual ML modules (together with the knowledge of the tasks for which they are specialized), and the interconnections between the ML modules, provide insight into how the composition as a whole addresses and solves the composition task.
  • the ML modules may comprise ML models which are themselves “explainable”, comprising for example linear models or other models having some aspect of “explainability”. In such circumstances, the ML composition offers an even greater degree of insight and explainability into how the composition task is solved.
  • FIG. 1 is a flow chart illustrating process steps in a computer implemented method 100 for generating a Machine Learning (ML) composition that is optimized to perform a composition task in a communication network.
  • the ML composition comprises a plurality of interconnected ML modules, each ML module trained to perform a module task that is specific to the ML module, and an ML module comprises at least one ML model.
  • the method is performed by a management node which may comprise a physical or virtual node, and may be implemented in a server apparatus and/or in a virtualized environment, for example in a cloud or fog deployment.
  • the management node may for example be implemented in a core network of the communication network, and may in some examples encompass multiple logical entities.
  • the method 100 comprises obtaining a candidate set of ML modules, each ML module in the candidate set being a candidate for inclusion in the composition.
  • each ML module may be subjected to retraining to optimize performance of its specific module task, and retraining of individual ML modules may be scheduled according to fulfillment of a retraining condition for the ML module.
  • a retraining condition may be associated with ML module performance of its specialized task, task performance of ML compositions in which the ML module is included, and/or with a periodic or other retraining schedule.
  • the method 100 comprises initiating a current version of a topology for the composition, the initiated current version topology comprising at least two ML modules from the candidate set and at least one connection between the ML modules.
  • the method 100 then comprises, in step 130, identifying from the candidate set of ML modules, a possible topology for the composition that includes the current version of the topology and minimizes a first loss function associated with the composition task.
  • the method comprises evolving the current version of the topology by adding at least one node or connection from the identified possible topology to the current version of the topology.
  • the method comprises evaluating the current version of the topology using a second loss function associated with the composition task.
  • the method 100 comprises repeating the identifying, evolving and evaluating steps 130, 140, 150 until a termination condition is satisfied.
  • the ML composition generated by the method 100 and which is optimized to perform the composition task comprises the ML modules and connections between ML modules present in the current version of the topology when the termination condition is satisfied.
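  • The identify/evolve/evaluate loop of steps 130, 140 and 150 can be sketched as follows. This is an illustrative skeleton only: the callables `initiate`, `identify`, `evolve`, `evaluate` and `termination` are hypothetical stand-ins for the corresponding operations described above, and are not named in the disclosure itself.

```python
# Hypothetical skeleton of the iterative loop in method 100 (steps 120-150).
# All callables passed in are illustrative stand-ins, not part of the
# disclosure; they represent the operations described in the text.

def generate_composition(candidates, initiate, identify, evolve,
                         evaluate, termination):
    """Grow a composition topology until the termination condition holds."""
    current = initiate(candidates)                # step 120: initial topology
    losses = []
    while True:
        possible = identify(candidates, current)  # step 130: possible topology
        current = evolve(current, possible)       # step 140: grow current version
        losses.append(evaluate(current))          # step 150: second loss function
        if termination(losses):                   # e.g. loss has plateaued
            return current
```

The returned value is the current version of the topology at the moment the termination condition is satisfied, matching the definition of the generated ML composition above.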
  • the composition task may be associated with a task training dataset comprising input values and corresponding output values, and the ML composition optimized to perform the task may be optimized to generate a mapping representing the relation between the inputs and outputs from the composition task training dataset.
  • the composition task may be a complex task, and may encompass one or more of the module tasks for which individual ML modules are specialized.
  • the composition task may be a task that is evolving with time, and/or may be related more or less closely to one or more of the module tasks for which individual ML modules are specialized.
  • an ML model is considered to comprise the output of a Machine Learning algorithm or process, wherein an ML process comprises instructions through which data may be used in a training procedure to generate a model artefact for performing a given task, or for representing a real world process or system.
  • An ML model is the model artefact that is created by such a training procedure, and which comprises the computational architecture that performs the task.
  • a topology is considered to comprise an arrangement of interconnected ML modules, the arrangement identifying both the ML modules present and the connections between them, wherein an identified connection may specify both directionality (output from ML module X provided as input to ML module Y) and importance weighting.
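  • A topology as defined above (the ML modules present, plus directed, importance-weighted connections between them) could be represented, for illustration only, as follows; the class and method names are our own and do not appear in the disclosure.

```python
from dataclasses import dataclass, field

# Illustrative representation of a topology: the set of ML modules present
# plus directed connections carrying an importance weight.
# The names Topology / add_connection are assumptions for this sketch.

@dataclass
class Topology:
    modules: set = field(default_factory=set)
    # (source_module, target_module) -> importance weight
    connections: dict = field(default_factory=dict)

    def add_connection(self, src, dst, weight=1.0):
        """Record that the output of `src` is provided as input to `dst`."""
        self.modules.update({src, dst})
        self.connections[(src, dst)] = weight
```

A connection key `(X, Y)` encodes the directionality described above (output from ML module X provided as input to ML module Y), and its value encodes the importance weighting.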
  • the method 100 differentiates between a current version of the topology for the ML composition, and various possible versions, with the current version evolving with each iteration to include at least one ML module or connection from the most recently identified possible version of the topology. As discussed in further detail below, the current version of the topology thus evolves “from the bottom up”, while the various possible versions may be searched “from the top down” with each iteration.
  • Each possible version of the topology for the ML composition is constrained to include the current version of the topology, and to be comprised of interconnected ML modules from the candidate set.
  • the final generated ML composition, comprising the ML modules and interconnections present in the current version of the topology when the termination condition is satisfied, offers by its nature insight into how the ML composition addresses its composition task. This insight is provided through the knowledge of what specialized ML modules have been retained in the ML composition, and through the knowledge of how such modules are interconnected. For example, knowing which specialized ML modules provide their output as input to which other specialized modules, as well as knowing the weighting of the importance of such connections, provides considerable insight into the methodology of the ML composition as a whole for addressing the composition task.
  • the method 100 learns the composition through an iterative process of continually generating a possible topology using a training data set, using the possible topology to evolve a current version topology, and evaluating the current version topology.
  • the composition is thus learned in a data driven manner in order to identify the most effective combination of ML modules for addressing the task.
  • the resulting composition offers conceptual insight into how the composition task will be solved, and this conceptual insight is available before the ML composition is used in an inference phase to actually address the composition task. No further training or evolution of the ML composition is required before it can be used, and there is no requirement for a period of online use before any insight or understanding of how the composition operates becomes available.
  • FIGS. 2a to 2d show flow charts illustrating process steps in another example of method 200 for generating a Machine Learning, ML, composition that is optimized to perform a composition task in a communication network.
  • the method 200 provides various examples of how the steps of the method 100 may be implemented and supplemented to achieve the above discussed and additional functionality.
  • the method 200 is performed by a management node, which may comprise a physical or virtual node, and may be implemented in a server apparatus and/or in a virtualized environment, for example in a cloud or fog deployment.
  • the management node may for example be implemented in a core network of the communication network, and may in some examples encompass multiple logical entities.
  • the management node performing the method 200 obtains a candidate set of ML modules, each ML module in the candidate set being a candidate for inclusion in the composition.
  • an ML module is trained to perform a module task that is specific to the ML module, and an ML module comprises at least one ML model.
  • each ML module may be associated with metadata comprising a specification of the input space of the ML module.
  • the management node may obtain the ML modules in the candidate set from available ML modules maintained by a life cycle management node. The life cycle management node may be co-located with the management node carrying out the method 200, or may be separately located.
  • management nodes may be appropriately instantiated to generate ML compositions for composition tasks relating to a Radio Access Network, Fronthaul network, Backhaul network and/or Core Network, with each management node obtaining ML modules for its respective candidate sets from a centralized life cycle management function which may be instantiated in the core network, for example as a virtualized network function.
  • Example sub steps for obtaining a candidate set of ML modules are illustrated in Figure 2a at steps 210a to 210c.
  • the management node may obtain the metadata of available ML modules.
  • the metadata of available ML modules may comprise, in addition to a specification of the input space for the ML module, at least one of an identification of the ML module task that the ML module is trained to perform, and/or the number of inputs from other ML modules that can be accepted.
  • ML module metadata may further comprise a measure of historical importance of the ML module, for example reflecting importance of the ML modules in ML compositions in which the ML module has previously been included, ML module retraining information, such as next retraining time or retraining schedule, and/or any other information relating to the functioning of the ML module.
  • the management node may select candidate ML modules for inclusion in the candidate set based on the obtained metadata.
  • step 210b may further comprise automatically including in the candidate set any ML modules which have been flagged as relevant to a particular composition task, either by the management node itself, by another node or by a human user. Such information may be included in metadata associated with the composition task.
  • the management node may obtain the selected ML modules, for example by requesting and receiving the selected ML modules from a life cycle management node or other storage, and/or by retrieving one or more of the selected ML modules from a storage accessible to or managed by the management node.
  • step 220 the management node initiates a current version of a topology for the ML composition, the initiated current version topology comprising at least two ML modules from the candidate set and at least one connection between the ML modules.
  • Sub steps which may be included in the process of initiating a current version of a topology for the ML composition are illustrated in Figure 2c.
  • the management node checks whether any ML modules are indicated as required for the composition task.
  • the composition task may be associated with metadata describing the composition task and specifying its input space, and the metadata may additionally indicate ML modules that are required for the composition task. In other examples, this information may be input by a human user or other node.
  • If at step 220a it is determined that no ML modules are required for the composition task, the management node randomly selects two ML modules from the candidate set in step 220d, and randomly selects, from among possible connections between the two selected ML modules, at least one connection for inclusion in the initiated current version of the topology at step 220e. If at step 220a it is determined that there is at least one specific ML module that is required for the composition task, then the management node, at step 220b, includes in the initiated current version topology any and all ML modules indicated in the composition task metadata (or elsewhere) as being required for the composition task. At step 220c, the management node checks whether at least two ML modules have been included.
  • If fewer than two ML modules have been included, the management node proceeds to step 220d to randomly select one other ML module for inclusion in the initiated current version of the topology. If at least two ML modules have already been included, owing to being flagged as required for the composition task, then the management node proceeds to step 220e, and randomly selects, from among possible connections between the ML modules, at least one connection for each ML module.
  • the restriction to possible connections between ML modules may for example refer to the input spaces of the ML modules, as well as the number of inputs from other ML modules that can be accepted, and/or specific identification of ML modules that can be accepted as inputs. As discussed above, such information may be included in metadata associated with the ML modules.
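  • The initiation logic of steps 220a to 220e can be sketched as follows, under the assumption that the "possible connections" permitted by the ML module metadata are supplied as a set of allowed (source, target) pairs; the function name and signature are illustrative only.

```python
import random

# Sketch of topology initiation (steps 220a-220e). We assume the metadata
# constraints have already been reduced to a set of allowed (src, dst)
# pairs; this representation is an assumption for the sketch.

def initiate_topology(candidates, allowed_pairs, required=(), seed=None):
    rng = random.Random(seed)
    modules = set(required)                     # step 220b: required modules
    while len(modules) < 2:                     # steps 220c/220d: top up to two
        modules.add(rng.choice(sorted(set(candidates) - modules)))
    # step 220e: pick at least one allowed connection among chosen modules
    options = [p for p in allowed_pairs
               if p[0] in modules and p[1] in modules]
    connections = {rng.choice(options)} if options else set()
    return modules, connections
```

In a fuller implementation the connection choice would also respect per-module limits on the number of accepted inputs, as noted above.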
  • the management node identifies, from the candidate set of ML modules, a possible topology for the composition that includes the current version of the topology and minimizes a first loss function associated with the composition task. As illustrated at step 230, this may comprise performing a top-down search, which may be a gradient based search, of possible topologies that include the current version of the topology.
  • the search space of possible topologies may include all combinations of any subset, up to and including the entire set, of candidate ML modules, which combinations respect the input spaces and limitations on number of input ML modules that may be specified in ML module metadata.
  • identifying a possible topology for the composition that includes the current version of the topology and minimizes a first loss function associated with the composition task may comprise identifying a topology in which ML modules of the candidate set are connected in a manner consistent with input and output spaces specified in their metadata.
  • the composition task may be associated with a dataset comprising input values and corresponding output values, and the first loss function associated with the composition task may comprise a loss function based on a difference between outputs from the composition task data set and outputs provided by the possible topology for the composition, given the same inputs from the composition task dataset.
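  • A minimal sketch of such a first loss function, assuming a mean squared error between the dataset outputs and the outputs of the candidate topology on the same inputs; the `predict` callable stands in for running the possible topology and is an assumption of this sketch.

```python
# Hypothetical first loss function: mean squared difference between the
# composition task dataset's outputs and the outputs the candidate
# topology produces for the same inputs. `predict` is an illustrative
# stand-in for evaluating the possible topology.

def composition_loss(predict, inputs, outputs):
    errors = [(predict(x) - y) ** 2 for x, y in zip(inputs, outputs)]
    return sum(errors) / len(errors)
```

Any differentiable loss of this form would also support the gradient based top-down search mentioned in step 230.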
  • the management node then evolves the current version of the topology by adding at least one node or connection from the identified possible topology to the current version of the topology.
  • An example of sub steps that may be performed in order to achieve the evolving of the current topology at step 240 is illustrated in Figure 2d.
  • In order to evolve the current version of the topology, the management node first generates possible first order mutations of the current version of the topology in step 241, which first order mutations are included in the identified possible topology.
  • a first order mutation comprises a single change in the topology, i.e. an addition of an ML module to the topology or an addition of a connection between ML modules present in the topology.
  • the limitation to mutations that are present in the identified possible topology restricts the new ML modules and new connections to only those ML modules and connections that are present in the identified possible topology.
  • the identified possible topology is restricted to including the current version of the topology, and so by definition, first order mutations of the current version of the topology, which mutations are present in the identified possible topology, are additions to the current version of the topology as opposed to removals of ML modules or connections.
  • the evolution of the current version of the topology to include at least one mutation may thus be envisaged as a “growing” of the current version of the topology, with at least one mutation being added at each iteration.
  • generating possible first order mutations of the current version of the topology may comprise identifying, in step 241a, ML modules that are not present in the current version of the topology but are included in the identified possible topology and have a connection to the current version of the topology, and identifying, in step 241b, connections between ML modules of the current version of the topology, which connections are present in the identified possible topology but are not present in the current version of the topology.
  • Each ML module and connection identified in steps 241a and 241b comprises a first order mutation of the current version of the topology.
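  • Steps 241a and 241b can be sketched as set operations over the two topologies. Topologies are represented here as (modules, connections) pairs with connections as (source, target) tuples; this representation is an illustrative assumption.

```python
# Sketch of first order mutation generation (steps 241a/241b).
# A topology is assumed to be a (modules, connections) pair, with
# connections given as directed (src, dst) tuples.

def first_order_mutations(current, possible):
    cur_mods, cur_conns = current
    pos_mods, pos_conns = possible
    # step 241a: modules absent from the current topology but connected
    # to it in the identified possible topology
    new_modules = {m for m in pos_mods - cur_mods
                   if any(m in pair and (set(pair) - {m}) <= cur_mods
                          for pair in pos_conns)}
    # step 241b: connections between current modules that exist only in
    # the identified possible topology
    new_connections = {c for c in pos_conns - cur_conns
                       if set(c) <= cur_mods}
    return new_modules, new_connections
```

Each element of either returned set is a single first order mutation, i.e. one addition to the current version of the topology.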
  • the management node determines whether the first order mutations should be divided into species.
  • This may comprise checking whether the number of first order mutations exceeds a threshold value. It will be appreciated that as the current version of the topology grows with iterations of steps 230 to 250, the number of first order mutations that can be identified will vary. If a large number of first order mutations have been identified then it may be useful or appropriate to divide the mutations into species, for example in order to ensure efficient computation and/or to improve performance.
  • the threshold value may be set by a human operator, or by the management node, to represent the number of first order mutations at which division into species affords advantages in efficiency, performance or some other criterion. In other examples, some other criterion for assessing the generated first order mutations may be used to determine whether division into species should be performed.
  • Division into species offers the advantage of protecting mutations that are not immediately advantageous when compared with other existing entities, but which may nonetheless offer value. Species division also provides ways of exploring alternative compositions which would otherwise have been discarded. For example, if a mutation allows a connection between two nodes that are not present or connected in any other mutation, then that may be considered to be a novel mutation and worthwhile to protect, even if other mutations appear to be more advantageous. Species division can protect novel mutations, increasing the likelihood that such mutations will be incorporated. If the management node determines at step 242 that the first order mutations should not be divided into species, the management node then, at step 243, generates combinations of the first order mutations, each combination comprising at least two mutations not present in any other combination.
  • the management node selects a single evolution combination from among the generated combinations. As illustrated in Figure 2d, this may comprise selecting the generated combination that minimizes a third loss function associated with the composition task to be the evolution combination in step 244a.
  • the third loss function associated with the composition task may comprise a loss function based on a difference between outputs from the composition task data set and outputs provided by the current version of the topology evolved to include a generated combination, given the same inputs from the composition task dataset.
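The selection of step 244a can be sketched as follows. The `evolve` and `predict` callables are hypothetical stand-ins for the topology machinery described in the text, and the squared loss is one possible choice of loss function; the text only requires a loss based on the difference between dataset outputs and topology outputs.

```python
# Hedged sketch of step 244a: among generated combinations of first order
# mutations, select the one whose evolved topology minimizes a loss against
# the composition task dataset.

def squared_loss(y_true, y_pred):
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

def select_evolution_combination(current, combinations, dataset, evolve, predict):
    """Return (best_combination, best_loss) over all candidate combinations."""
    x, y = dataset
    best, best_loss = None, float("inf")
    for combo in combinations:
        candidate = evolve(current, combo)              # apply the mutations
        loss = squared_loss(y, predict(candidate, x))   # third loss function
        if loss < best_loss:
            best, best_loss = combo, loss
    return best, best_loss

# Toy usage: a "topology" is just a scaling factor, evolved by adding the
# mutation values; the dataset outputs are twice the inputs.
best, loss = select_evolution_combination(
    0, [(1,), (2,), (3,)], ([1, 2], [2, 4]),
    evolve=lambda c, combo: c + sum(combo),
    predict=lambda cand, x: [cand * xi for xi in x])
```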
  • the management node then applies the selected evolution combination to the current version of the topology.
• if the management node determines that it should divide the first order mutations into species, then the management node does so at step 246. This may comprise for example clustering the first order mutations according to a clustering parameter.
• Structural features are one example of a clustering parameter that may be used by the management node at step 246, enabling a clustering that ensures each species comprises first order mutations that share similar structures. Clustering in this manner may help to identify the most promising mutation combination for application to the evolved current version of the topology.
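One way step 246 might cluster mutations by structural features is sketched below. The feature vector (mutation kind and number of new connections) and the greedy distance-threshold clustering are illustrative choices, not mandated by the text.

```python
# Illustrative sketch of step 246: cluster first order mutations into species
# by structural similarity, using a simple structural feature vector and a
# greedy distance-threshold clustering.

def structural_features(mutation):
    kind, _, links = mutation
    return (1 if kind == "module" else 0, len(links))

def cluster_into_species(mutations, threshold=0.5):
    species = []   # list of (representative_feature, members)
    for mut in mutations:
        f = structural_features(mut)
        for rep, members in species:
            # L1 distance between feature vectors
            if sum(abs(a - b) for a, b in zip(rep, f)) <= threshold:
                members.append(mut)
                break
        else:
            species.append((f, [mut]))
    return [members for _, members in species]

muts = [("module", 3, frozenset({(3, 2)})),
        ("module", 5, frozenset({(5, 1)})),
        ("connection", (2, 1), frozenset({(2, 1)}))]
species = cluster_into_species(muts)
```

With these features, the two module mutations share a structure and form one species, while the connection mutation forms its own, protecting it from direct competition as described above.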
  • the management node then, for each species, generates combinations of first order mutations in the species at step 247, each combination comprising at least two mutations not present in any other combination, and selects a species combination from among the generated combinations of the species at step 248.
  • selecting a species combination from among the generated combinations for a species may comprise selecting as the species combination the generated combination that minimizes the third loss function associated with the composition task.
• the management node selects, in step 249, a single evolution combination from among the selected species combinations. As previously, this may comprise selecting the species combination that minimizes a loss function associated with the composition task to be the evolution combination. Finally, the management node applies the selected evolution combination to the current version of the topology in step 245. Referring again to Figure 2b, following evolution of the current version of the topology in step 240, the management node then evaluates the current version of the topology using a second loss function associated with the composition task in step 250.
  • the second loss function associated with the composition task may comprise a loss function based on a difference between outputs from the composition task data set and outputs provided by the current version of the topology for the composition, given the same inputs from the composition task dataset.
  • the management node checks whether a termination condition is satisfied.
  • the termination condition may for example describe a minimum number of iterations, a threshold value for the second loss function, or any other criterion appropriate for assessing convergence to an acceptable solution.
  • the criterion may be set by a human user, learned on the basis of ML composition performance, or established in any other suitable manner.
• if the termination condition is not satisfied, the management node returns to step 230 to identify a new possible topology, evolve the current version of the topology in step 240 and then evaluate the new current version of the topology in step 250. If the termination condition is satisfied, the management node then sets the ML modules and connections between ML modules present in the current version of the topology at the time that the termination condition is satisfied to be the ML composition optimized to perform the composition task. It will be appreciated that, as discussed previously, no further training of the ML composition on the composition task data set is required.
• the ML composition resulting from the method 200, including its ML modules and their interconnections, which may include both directionality and importance weighting, is ready for use in solving the composition task.
• the retraining of individual ML modules included in the ML composition is managed independently of their use in any one ML composition, according to a retraining schedule that is particular to the ML module and is handled for example by a life cycle node as discussed above.
  • the methods 100 and 200 may be performed by a management node, and the present disclosure provides a management node that is adapted to perform any or all of the steps of the above discussed methods.
  • the management node may comprise a physical node such as a computing device, server etc., or may comprise a virtual node.
  • a virtual node may comprise any logical entity, such as a Virtualized Network Function (VNF) which may itself be running in a cloud, edge cloud or fog deployment.
  • the management node may for example comprise or be instantiated in any part of a communication network node such as a logical core network node, network management center, network operations center, Radio Access node etc. Any such communication network node may itself be divided between several logical and/or physical functions, and any one or more parts of the management node may be instantiated in one or more logical or physical functions of a communication network node.
  • Figure 3 is a block diagram illustrating an example management node 300 which may implement the method 100 and/or 200, as illustrated in Figures 1 to 2d, according to examples of the present disclosure, for example on receipt of suitable instructions from a computer program 350.
  • the management node 300 comprises a processor or processing circuitry 302, and may comprise a memory 304 and interfaces 306.
  • the processing circuitry 302 is operable to perform some or all of the steps of the method 100 and/or 200 as discussed above with reference to Figures 1 to 2d.
  • the memory 304 may contain instructions executable by the processing circuitry 302 such that the management node 300 is operable to perform some or all of the steps of the method 100 and/or 200, as illustrated in Figures 1 to 2d.
  • the instructions may also include instructions for executing one or more telecommunications and/or data communications protocols.
  • the instructions may be stored in the form of the computer program 350.
  • the processor or processing circuitry 302 may include one or more microprocessors or microcontrollers, as well as other digital hardware, which may include digital signal processors (DSPs), special-purpose digital logic, etc.
  • the processor or processing circuitry 302 may be implemented by any type of integrated circuit, such as an Application Specific Integrated Circuit (ASIC), Field Programmable Gate Array (FPGA) etc.
  • the memory 304 may include one or several types of memory suitable for the processor, such as read-only memory (ROM), random-access memory, cache memory, flash memory devices, optical storage devices, solid state disk, hard disk drive etc.
  • Figure 4 illustrates functional units in another example of management node 400 which may execute examples of the methods 100 and/or 200 of the present disclosure, for example according to computer readable instructions received from a computer program. It will be understood that the units illustrated in Figure 4 are functional units, and may be realized in any appropriate combination of hardware and/or software. The units may comprise one or more processors and may be integrated to any degree.
  • the management node 400 is for generating a Machine Learning (ML) composition that is optimized to perform a composition task in a communication network, wherein the ML composition comprises a plurality of interconnected ML modules, each ML module trained to perform a module task that is specific to the ML module, and wherein an ML module comprises at least one ML model.
  • the management node 400 comprises an ML module selector unit 402 for obtaining a candidate set of ML modules, each ML module in the candidate set being a candidate for inclusion in the composition.
  • the management node further comprises an evolutionary topology learner unit 404, the evolutionary topology learner unit 404 comprising an initiator unit 406 for initiating a current version of a topology for the composition, the initiated current version topology comprising at least two ML modules from the candidate set and at least one connection between the ML modules, and a topology search unit 408, a topology grower unit 410 and a loss calculator unit 412.
  • the topology search unit 408, topology grower unit 410 and loss calculator unit 412 are for repeating, until a termination condition is satisfied, the steps of identifying, by the topology search unit 408 and from the candidate set of ML modules, a possible topology for the composition that includes the current version of the topology and minimizes a first loss function associated with the composition task, evolving, by the topology grower unit 410, the current version of the topology by adding at least one node or connection from the identified possible topology to the current version of the topology, and evaluating, by the loss calculator unit 412, the current version of the topology using a second loss function associated with the composition task.
  • the ML composition optimized to perform the composition task comprises the ML modules and connections between ML modules present in the current version of the topology when the termination condition is satisfied.
  • the management node 400 may further comprise interfaces 414 which may be operable to facilitate communication with a life cycle node, and/or with other communication network nodes over suitable communication channels.
• Figures 1 to 2d discussed above provide an overview of methods which may be performed according to different examples of the present disclosure. These methods may be performed by a management node, as illustrated in Figures 3, 4 and 6. There now follows a detailed discussion of how different process steps illustrated in Figures 1 to 2d and discussed above may be implemented. The functionality and implementation detail described below is discussed with reference to the functional modules of Figures 4 and 6.
  • FIG. 5 illustrates process flow through components of the management node, as well as a life cycle management node, and introduces notation used below in detailed discussion of the function of the different management node units in implementing the methods 100, 200.
• the Module Selector unit 402 of the management node 400, in step 1, generates a list of candidate modules from a dictionary of available modules.
• the Module Selector unit takes as its input metadata of the composition task T* that the ML composition is to solve, as well as metadata for the ML modules.
  • the Module Selector unit 402 obtains the ML modules in the candidate list from a life cycle management node 500.
  • the Evolutionary Topology Learner unit 404 learns a topology for the ML composition, comprising modules from the candidate list, that will address the composition task.
• the Evolutionary Topology Learner unit 404 takes as input the ML modules of the candidate set and a dataset corresponding to the task T*.
• the output of the Evolutionary Topology Learner unit 404 is the learned topology Q*, which describes the ML modules from the candidate set that are to be included in the ML composition, and the interactions between these modules, in order to solve the composition task T*.
  • FIG. 6 illustrates an example logical implementation architecture 2000 comprising a management node 600, life cycle management node 500 and appropriate data interfaces 2050. Elements of the implementation architecture, and their operation, are discussed in greater detail below.
• Specialized ML Modules
• An ML Module 502 contains a set of machine learning models 506 that are fully trained and optimized for solving a specific module task. There are no restrictions on the architecture of the machine learning models 506 of an ML Module 502, although in some examples the ML models 506 may comprise linear models or other models fulfilling criteria to be considered as “explainable”.
• ML models that are considered to be explainable may provide an additional layer of insight into how an ML composition comprising ML modules that use such models approaches a given composition task.
  • “explainable” models are not required for the ML composition, as the individual ML modules (together with the knowledge of the tasks for which they are specialized), and the interconnections between the ML modules already provide insight into how the composition as a whole addresses and solves the composition task.
  • Each ML module is associated with metadata 504 that describes various aspects of the module’s functionality and its requirements.
  • the metadata includes the specific task that the module is specialized to perform, as well as the input space of the ML module and the accepted number of alien input modules, that is the number of other ML modules whose output can be accepted by the ML module as an input.
• the metadata may further include the historical importance weight of the ML module in ML compositions in which the ML module has been included, and/or a retraining schedule for the ML module. Other examples of metadata that may be included relating to ML modules may also be envisaged.
  • the ML modules 502 are maintained by the Life Cycle Management node 500, and are subjected to regular retraining scheduled by the Life Cycle Management node 500.
  • the Life Cycle Management node 500 may adapt the frequency of retraining of an ML module by taking into account its historical importance in ML compositions.
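The adaptation of retraining frequency to historical importance is not specified in detail above; the following is a minimal sketch under the assumption that importance is normalized to [0, 1] and that more important modules are retrained more often. The function name and scaling rule are illustrative.

```python
# Hedged sketch: a Life Cycle Management node shortening the retraining
# interval of an ML module in proportion to its historical importance in
# ML compositions (importance assumed normalized to [0, 1]).

def adapt_retraining_interval(base_interval_days, historical_importance,
                              min_interval_days=1):
    # An importance of 1.0 halves the base interval; 0.0 leaves it unchanged.
    scaled = base_interval_days * (1.0 - 0.5 * historical_importance)
    return max(min_interval_days, round(scaled))
```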
• when an ML Module is employed by the Topology Learner unit 604, it does not require further training and its learnable parameters are fixed. When included in an ML composition, an ML module simply serves as a function which takes a number of inputs and produces output responses.
• Mathematical Formulation
• Let J be a set that includes valid ML Modules. An ML Module is shown as M_i = {t_i, K_i, h_i, f_i, X_i}, where: t_i is the task at which the Module is specialized, K_i is the “joint number” which corresponds to the accepted number of alien input modules, h_i is the historical importance of the Module, f_i is the frequency of retraining indicating how often this Module needs retraining, and X_i is the feature space of the Module inputs (the feature attributes).
• n_i is the number of training samples, D_i is the dimensionality of the inputs X_i, and D'_i is the dimensionality of the output responses Y_i.
• the ML Module M_i with the joint number K_i introduced above contains a single marginal model and K_i joint models for solving task t_i.
  • possible models may include:
• the marginal model is a machine learning model which takes as input X_i and produces Y_i.
• the parameter set θ_i includes all the model parameters.
• a joint model is a machine learning model which takes as inputs both X_i and the pair of Y_j and Y_k, which are the outputs of the ML Modules M_j and M_k, respectively.
  • Such models may be linear models or other explainable models but also may be non-explainable models such as ANNs.
• Life Cycle Management
• The Life Cycle Management (LCM) node maintains the ML Modules, including their metadata, and initiates retraining of each ML Module according to the module’s retraining schedule.
• the Life Cycle Management node may in some examples comprise an overarching mechanism which also monitors and records the performance of the modules in ML compositions as well as when they are used for their specialized tasks.
  • the Module Selector unit 602 has access to the metadata of all ML Modules.
• the Module Selector unit 602 takes as input the metadata of a task T* and provides a list of viable modules that are potentially suitable for solving this task.
• the Module Selector Unit 602 may limit the selection to only those ML modules whose input space is a subset of (or equal to) the input space of the task T*.
• let M be the set of all Specialized Modules, and let a_k be the input feature space of the Specialized Module M_k. Let a* be the input feature space of the task T*.
• the Specialized Module M_k is a valid candidate if a_k ⊆ a*. The list of all valid candidates forms the candidate set. The Module Selector can also allow for incorporation of domain knowledge. For example, if certain ML Modules are required for a given composition task T*, these can be included by the Module Selector unit. Information about required ML modules may for example be specified in composition task metadata, or in any other suitable manner, for example by a human user, orchestrator node, use case manager for a particular composition task, etc.
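The candidate selection rule above can be sketched as a simple subset filter. The module names, the dict-of-feature-sets representation, and the `required` parameter (standing in for domain knowledge) are illustrative assumptions.

```python
# Sketch of the Module Selector rule: a module M_k is a valid candidate if
# its input feature space a_k is a subset of (or equal to) the input space
# of the composition task; modules required by domain knowledge are always
# included.

def select_candidates(modules, task_input_space, required=()):
    """`modules` maps module name -> set of input feature attributes."""
    task_space = set(task_input_space)
    return [name for name, space in modules.items()
            if set(space) <= task_space or name in required]

# Hypothetical modules and task input space, for illustration only.
modules = {"M1": {"sinr"}, "M2": {"sinr", "load"}, "M3": {"gps"}}
candidates = select_candidates(modules, {"sinr", "load"}, required=("M3",))
```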
  • Evolutionary Topology Learner Unit (Steps 120 – 150, 220 – 250)
• the Evolutionary Topology Learner unit 604 consists of the following units: Initiator unit 606, Topology Search unit 608, Topology Grower unit 610 and Loss Calculator unit 612.
• the initiator unit 606 may be a component part of the topology search unit 608, and the loss calculator unit 612 may be a component part of the topology grower unit 610. These units are illustrated separately for clarity, but it will be appreciated that the units are logical entities whose functions may be implemented in any suitable manner.
• the Topology Learner unit 604 takes as its inputs: the dataset associated with the task T*, the candidate ML Modules selected by the Module Selector unit 602, and a current version Q* of a topology for the ML composition. In a first iteration of the Topology Learner Unit, the current version Q* of the topology is initiated by the initiator unit 606 as discussed below.
• the Topology Learner unit 604 then evolves the topology Q* by iterating between conditional search for possible topologies and conditional growth or evolution of the current version of the topology.
  • the following process steps describe how the Evolutionary Topology Learner may carry out its function.
• Step 0 (steps 120, 220): Initiate a current version topology Q*.
  • the initial topology is a functional composition comprising two randomly selected ML Modules from the candidate set connected in a manner consistent with the limitations on input space, allowable inputs etc., specified in their metadata.
• Step 1 (steps 130, 230): Apply the Topology Search unit 608, given the current topology Q*, to produce a possible topology S*. Conditional Topology Search is described in greater detail below.
• this search step takes as input the dataset D* associated with the composition task T* and the candidate set M of ML modules, and searches, using the dataset D*, for a possible topology comprising ML modules from the candidate set to solve the task.
• Step 2 (steps 140, 240): Evolve the current topology by applying the Topology Grower unit 610. Conditional Topology Growth, or evolution, is also described in greater detail below.
• this evolution step takes as input the dataset D* associated with the composition task T* and the current topology Q*, and selects a combination of one or more first order mutations of the current version topology Q* to apply, so as to result in a new current version of the topology Q*.
• the evolution of the current version of the topology is constrained in that the first order mutation or mutations selected for application must be present in the most recently obtained possible version of the topology S*.
• Step 3 (steps 150, 250): Evaluate the loss l between Y* (the set of outputs in the task dataset D*) and its estimation Ŷ* produced by the evolved current version of the topology Q*: l = L(Y*, Ŷ*).
• Step 4 (steps 160, 260): Repeat steps 1-3 until convergence of l based on a user-defined criterion.
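Steps 0 to 4 above can be sketched as a single loop. Only the control flow is taken from the text; the `search`, `grow` and `loss` callables are placeholders for the Conditional Topology Search, Conditional Topology Growth and Loss Calculator units, and the iteration cap and tolerance stand in for the user-defined convergence criterion.

```python
# Hedged sketch of the Evolutionary Topology Learner loop (Steps 0-4).

def learn_topology(initial_topology, search, grow, loss, max_iters=100,
                   tolerance=1e-6):
    q = initial_topology                 # Step 0: initiated topology Q*
    for _ in range(max_iters):
        s = search(q)                    # Step 1: possible topology S*
        q = grow(q, s)                   # Step 2: evolve Q* within S*
        if loss(q) <= tolerance:         # Steps 3-4: evaluate l, check
            break                        # the user-defined criterion
    return q

# Toy usage: the "topology" is an integer that grows toward a target of 5,
# one unit per iteration, bounded by the searched value.
result = learn_topology(
    0,
    search=lambda q: q + 3,
    grow=lambda q, s: min(s, q + 1),
    loss=lambda q: abs(5 - q),
    tolerance=0)
```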
• Figure 7 illustrates process flow for the Evolutionary Topology Learner unit 604. Referring to Figure 7, Conditional Topology Search 608 takes as input the dataset D*, the modules M of the candidate set, and the current (initiated) version of the topology Q*.
• Conditional topology search 608 produces a possible topology S*, which is used by the Conditional Topology Grower 610, together with the dataset D* and the current version of the topology Q*, to generate an evolved current version of the topology Q*.
• This current version of the topology Q* is then evaluated by the loss calculator 612 using the dataset D* to produce a loss value l.
• Conditional Topology Search is the process of identifying, from the candidate set of ML modules, a possible topology for the composition that includes the current version of the topology and minimizes a loss function associated with the task (steps 130, 230).
  • the loss function may for example be the difference between outputs from the task dataset and predicted outputs from the possible topologies identified.
• the Conditional Topology Search unit takes as inputs: the dataset associated with the task T*, the ML Modules selected by the Module Selector component to form the candidate set, and the current topology Q*.
  • the search comprises, in the present implementation, a top-down (gradient-based) search in order to find a topology of the modules that minimizes the loss between outputs from the dataset and outputs produced by the possible topology.
• let S* be the possible topology identified by the Conditional Topology Search unit 608. The objective of the gradient-based search is to find the structure that minimizes the following loss: S* = argmin_{S ∈ 𝒮} L(Y*, Ŷ_S), where 𝒮 is the set of all possible topologies and Ŷ_S is the estimation, provided by the possible topology S, of Y*, the outputs in the dataset.
• the search for a possible topology is constrained by the fact that a possible topology must include the current version of the topology Q*, and must comprise only modules from the candidate set.
  • the size and complexity of the possible topology is otherwise only restricted by the input spaces and allowed inputs of alien ML modules for the ML modules in the candidate set, and the requirement to minimize the above loss function.
  • the possible topology may therefore comprise some or all of the modules in the candidate set.
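To illustrate these constraints, the following sketch uses a simple greedy search as a stand-in for the gradient-based search described above (the greedy strategy is a substitution made for brevity, not the patent's method): starting from the current version of the topology, it repeatedly adds the candidate-module extension that most reduces the loss, so the result always contains the current topology and only modules from the candidate set.

```python
# Illustrative constrained search: grow outward from the current topology,
# adding candidate modules only while the loss improves. `extensions` and
# `loss` are hypothetical helpers.

def greedy_possible_topology(current, candidates, extensions, loss):
    """`extensions(topology, module)` returns the topology extended with the
    module (or None if no valid connection exists); `loss` scores a topology."""
    topology = current
    remaining = set(candidates) - set(topology)
    improved = True
    while improved:
        improved = False
        scored = []
        for m in remaining:
            extended = extensions(topology, m)
            if extended is not None:
                scored.append((loss(extended), m, extended))
        if scored and min(scored)[0] < loss(topology):
            _, m, topology = min(scored)
            remaining.discard(m)
            improved = True
    return topology

# Toy usage: topologies are frozensets of module ids; the loss rewards
# topologies of size 4, so the search grows the current topology {1, 2}
# by two further candidate modules.
found = greedy_possible_topology(
    frozenset({1, 2}), {1, 2, 3, 4, 5},
    extensions=lambda t, m: t | {m},
    loss=lambda t: abs(4 - len(t)))
```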
  • Figure 9 illustrates an example of a top-down, gradient based search method.
  • the left of the Figure illustrates a candidate set of 12 ML modules, represented as nodes, and an initiated current version of a topology for an ML composition.
  • the initiated current version topology comprises ML modules 1 and 2, connected such that the output of ML module 1 is input to ML module 2.
  • the possible topology search produces a possible topology that is illustrated on the right of Figure 9.
  • the possible topology is required to include the current version topology, and may include up to all of the other ML modules in the candidate set.
  • the example possible topology comprises, in addition to interconnected ML Modules 1 and 2 that are present in the current version topology, ML modules 5, 6, 3, 11, and 8, as well as the illustrated interconnections between them.
  • Conditional Topology Growth, or evolution is the process of adding at least one ML module and/or connection between ML modules to the current version of the topology (steps 140, 240). The current version of the topology thus grows incrementally, with each addition being drawn from the latest identified possible topology, as demonstrated below.
• the Conditional Topology Grower unit 610 takes as inputs: the dataset associated with the task T*, the output of the Conditional Topology Search unit, that is the possible topology S*, and the current version of the topology Q* (as discussed above, the possible topology S* must include the current version of the topology Q*).
• the Conditional Topology Grower unit 610 uses a technique inspired by neuroevolution (as disclosed in K. O. Stanley, J. Clune, J. Lehman, and R. Miikkulainen, ‘Designing neural networks through neuroevolution’, Nat Mach Intell, vol. 1, no. 1, pp. 24–35, Jan. 2019).
• Step 241: Generate first-order ML module mutations and weight mutations of the current version topology by applying the Mutation Generator unit 622.
• Step 246: Divide the resulting mutations into a number of species by applying the Species Generator unit 626.
• Steps 247, 243: Generate offspring per species by applying the Offspring Generator unit 624 to the mutations within each species.
• Steps 249, 244: Across species, select the single best-fit offspring by applying the Offspring Selector unit 628 to the offspring representatives of each species.
  • the actions of each unit of the Conditional Topology Grower unit 610 are described below with reference to the process flow of Figure 8 and to Figures 10 to 16.
• Mutation Generator unit 622 (Step 241)
• The Mutation Generator unit takes as input the current version topology Q* and the possible topology S*. The Mutation Generator unit 622 then generates all the possible first order ML module mutations and weight mutations, given Q* and S*.
• the lower bound on the number of mutations is one and the upper bound depends on Q* and S*, as all of the mutations generated are required to be present in the possible topology S*.
  • An example of mutation generation is illustrated in Figure 10.
• a possible topology S*, including a current version topology Q*, is illustrated.
  • all of the possible first order weight and ML module mutations (referred to as node mutations) are illustrated.
  • a first order mutation comprises a single change in the current version topology, i.e. an addition of an ML module to the current version of the topology or an addition of a connection between ML modules already present in the current version topology.
  • the first order mutations are limited to be only those mutations that are present in the most recently identified possible topology.
• Species Generator unit 626 (Step 246)
• The Species Generator unit 626 clusters the mutations into a number of species, if this is advantageous given the number of mutations or some other characteristic of the mutations generated.
  • Various techniques for clustering of the mutations may be envisaged, including for example by measuring the structural similarities among mutations. In such an example, each species cluster would contain a number of mutations that share similar structures.
  • Species generation is illustrated in Figure 11, in which the mutations illustrated in Figure 10 are illustrated as clustered into two species: Species 1 and Species 2.
• Offspring Generator unit 624 (Steps 243, 247)
• The Offspring Generator unit 624 performs the task of cross over across arbitrary pairs of mutations belonging to the same species.
• the number of generated offspring is Combination(m, 2), which is the number of combinations of the m mutations in a species taken two at a time without repetition. In some examples, it may be desirable to combine across groups of three or more mutations.
  • Cross over combination and offspring generation are illustrated in Figures 12, 13 and 14.
• Figure 12 illustrates the three mutations of Species 1 from Figure 11. In the lower part of Figure 12, each mutation is expressed as a candidate for cross over, comprising a series of connections between the ML modules present in the mutations of the species.
• In Figure 13, the three offspring that can be generated by cross over of the three candidates in Figure 12 are illustrated.
  • the three offspring are the combinations of candidates (mutations) 1 and 2, candidates (mutations) 1 and 3, and candidates (mutations) 2 and 3.
  • Each offspring comprises all of the ML modules and connections in each of the candidates generating the offspring.
  • Figure 14 illustrates the offspring in graphical form.
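The offspring generation of steps 243 and 247 can be sketched as follows, representing each mutation candidate by its set of connections as in Figure 12; the union-of-connections crossover follows the description above that each offspring comprises all modules and connections of both parents.

```python
# Sketch of offspring generation within one species (steps 243, 247): each
# pair of mutations is crossed over, and the offspring carries the union of
# the parents' connection sets, giving Combination(m, 2) offspring for m
# mutations in the species.

from itertools import combinations

def generate_offspring(species_mutations):
    offspring = []
    for a, b in combinations(species_mutations, 2):
        offspring.append(frozenset(a) | frozenset(b))
    return offspring

# Three mutation candidates, each a set of connections (illustrative values).
species_1 = [{(5, 1)}, {(3, 2)}, {(2, 1)}]
pool = generate_offspring(species_1)
```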
  • Offspring Selector unit 628 (Steps 244, 248, 249)
• the Offspring Selector unit 628 takes as inputs the candidate offspring, the possible topology S* and the dataset D*.
  • the Offspring Selector unit 628 selects the single best-fit offspring from the pool of offspring from a given species.
• the best-fit offspring is the offspring that results in the lowest loss: Q* = argmin_{O_i} L(Y*, Ŷ_{O_i}), where O_i is a candidate offspring from the pool of offspring of a given species. Offspring selection is illustrated in Figures 15 and 16. Following selection of a single offspring for each species, the Offspring Selector unit 628 then applies the same procedure to select a single best offspring from among the representatives of each species, which selected offspring will be applied to the current version of the topology and evaluated. The management node will then either return to conditional topology search and continue iterating until a termination condition is satisfied, or, if the termination condition is satisfied, output the current version topology as the ML composition for solving the composition task.
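The selection rule above can be sketched with a squared loss. The `estimate` callable is a hypothetical stand-in for running the topology evolved with a candidate offspring on the dataset inputs.

```python
# Sketch of offspring selection: pick the candidate O_i whose estimation is
# closest to the dataset outputs Y* under a squared loss.

def select_best_offspring(offspring_pool, y_true, estimate):
    def loss(o):
        y_hat = estimate(o)
        return sum((a - b) ** 2 for a, b in zip(y_true, y_hat))
    return min(offspring_pool, key=loss)

# Toy usage: each "offspring" is a scaling factor; the dataset outputs are
# matched exactly by a factor of 2.
best = select_best_offspring([1, 2, 3], [4, 6], estimate=lambda o: [2 * o, 3 * o])
```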
• NPS prediction, for example, uses two ML modules: M4 (optimized for RAN throughput KPI degradation prediction) and M5 (optimized for transport layer latency prediction).
• Each use case, comprising a specific composition task, will be life cycle managed by a use case owner, who may decide to re-run examples of the method disclosed herein, for example on the basis of newly available data, newly available ML modules, retrained or updated ML modules, etc.
• a wide range of ML modules may be envisaged within a communication network, including for example:
  - Downlink Signal to interference and noise ratio (SINR) prediction in RAN
  - Uplink SINR prediction in RAN
  - Secondary carrier prediction in RAN
  - Link adaptation recommender in RAN
  • a wide range of composition tasks may be envisaged within a Core network, Backhaul network and RAN. Such tasks may be complex, and may evolve with time.
  • Example tasks may include communication network resource management in a Core network, and radio resource management in a RAN.
  • Examples of the present disclosure provide methods and a management node that approach the task of using ML models for addressing a task through a modular design.
  • ML Modules are introduced that are specialized at solving specific tasks, and may comprise one or more ML models.
  • Each ML module is life cycle managed and may be reused in any number of ML compositions for solving composition tasks, which may be complex tasks involving a large number of smaller tasks, and/or may be only very loosely connected or related to the task for which the ML module itself is specialized.
  • a set of candidate modules is selected, and a composition of these modules is learned from data, which composition can effectively solve the task.
  • the composition itself can then offer insight on how it obtains the solution to the task, through the knowledge of which specialized ML modules are included in the composition, and how they are interconnected.
  • the ML modules to include and their interconnections are learned in a data driven manner, but once established, the tasks at which these modules are specialized, and their interconnections, can offer conceptual insight into how the composition obtains a solution. Examples of the present disclosure thus offer an approach that facilitates interpretability of ML solutions and root cause analysis.
  • Methods according to the present disclosure also ensure flexibility and efficiency, allowing different ML modules to be used to the most beneficial effect for solving different tasks, with individual ML modules being life cycle managed by a small team or even a single domain expert, and ML compositions being life cycle managed by a use case owner.
  • the modular approach also offers the possibility of orchestrating distributed training without the exchange and aggregation of model weights. The reusability of the individual ML modules contributes to reducing power consumption.
  • the methods of the present disclosure may be implemented in hardware, or as software modules running on one or more processors. The methods may also be carried out according to the instructions of a computer program, and the present disclosure also provides a computer readable medium having stored thereon a program for carrying out any of the methods described herein.
  • a computer program embodying the disclosure may be stored on a computer readable medium, or it could, for example, be in the form of a signal such as a downloadable data signal provided from an Internet website, or it could be in any other form.
  • the word “comprising” does not exclude the presence of elements or steps other than those listed in a claim, “a” or “an” does not exclude a plurality, and a single processor or other unit may fulfil the functions of several units recited in the claims. Any reference signs in the claims shall not be construed so as to limit their scope.
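Several of the procedures summarized above lend themselves to short sketches. For example, the two-stage offspring selection described earlier in this section (first the best offspring per species, then the single best offspring across all species) might look as follows. This is an illustrative sketch only, not the claimed implementation; the function name and data shapes are assumptions.

```python
from typing import Callable, Dict, List, Tuple


def select_best_offspring(
    species_offspring: Dict[str, List[object]],
    loss_fn: Callable[[object], float],
) -> Tuple[str, object]:
    """Return the (species, offspring) pair that minimizes the loss.

    Stage 1 keeps the lowest-loss candidate within each species; stage 2
    applies the same criterion across the per-species winners.
    """
    # Stage 1: best offspring per species (skip species with an empty pool).
    per_species = {
        name: min(pool, key=loss_fn)
        for name, pool in species_offspring.items()
        if pool
    }
    # Stage 2: best offspring overall, by comparing the species winners.
    best_species = min(per_species, key=lambda n: loss_fn(per_species[n]))
    return best_species, per_species[best_species]
```

The selected offspring would then be applied to the current version of the topology and evaluated, as described above.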


Abstract

A computer implemented method (100) is disclosed for generating a Machine Learning (ML) composition that is optimized to perform a composition task in a communication network. The ML composition comprises a plurality of interconnected ML modules, each ML module trained to perform a module task that is specific to the ML module, and an ML module comprises at least one ML model. The method comprises obtaining a candidate set of ML modules (110) and initiating a current version of a topology for the composition (120). The method further comprises repeating, until a termination condition is satisfied, the steps of identifying, from the candidate set of ML modules, a possible topology for the composition that includes the current version of the topology and minimizes a first loss function (130), evolving the current version of the topology based on the identified possible topology (140), and evaluating the current version of the topology using a second loss function (150). The ML composition optimized to perform the composition task comprises the ML modules and connections between ML modules present in the current version of the topology when the termination condition is satisfied.

Description

Methods for generating a Machine Learning composition

Technical Field

The present disclosure relates to a method for generating a Machine Learning (ML) composition that is optimized to perform a composition task in a communication network. The present disclosure also relates to a management node, and to a computer program and a computer program product configured, when run on a computer, to carry out a method for generating an ML composition.

Background

Many Machine Learning (ML) technologies use data driven learning via “black-box” systems, with Deep Neural Networks (DNN) being a well-known example of such systems. Given enough training data and computation resources, DNNs are capable of solving complex tasks. However, the resulting black-box models are difficult to interpret, and are thus described as lacking “explainability”. Despite a growing literature on explaining neural networks, no consensus has yet been reached on how to explain a neural network’s decision, or how to evaluate an explanation. This is in part owing to a lack of interpretability in system design and in signal flow through the system components. One approach to the challenge of providing explainable AI is through visualization of layers in deep learning models. This type of technique aims to provide insights into what a neural network is doing by visualizing the output of individual layers. Another common approach to designing an explainable model is through piece-wise linear models or mixtures of linear models, as discussed in greater detail below. Such design can be facilitated by a so-called indirect encoding, according to which a small model can be used to encode a large model. This process searches for a meta structure in which the relation between the building blocks is modelled. The building blocks can be combined to solve new tasks using combinatory optimization, as discussed for example in Kirsch, Louis, Julius Kunze, and David Barber.
"Modular networks: Learning to decompose neural computation." In Advances in Neural Information Processing Systems, pp. 2408-2418, 2018. Although the explainable approaches discussed above offer insight into the inner mechanism of a model, they do not have “explainability” baked into the design: before obtaining the results, it remains largely unpredictable what the model will produce. A linear model offers explainability, but at the cost of compromised model performance. Linear models are not suitable for modelling real-world datasets that are highly nonlinear in nature. In addition, owing to their limited model expressiveness, they are not able to take advantage of significant quantities of data; that is, the availability of more data only marginally improves the performance of linear models. Piece-wise linear models have been introduced as an approach towards improving model expressiveness while maintaining interpretability. A piece-wise linear model divides data into a number of regions, with each region modelled using a linear model. The main limitation of such techniques is that the division of data into regions ignores the long-term temporal and spatial dependencies that may be present in the data. Mixture of linear models is another family of techniques which seeks to improve the model expressiveness capabilities of linear models. Such techniques can be seen as a data-driven variant of piece-wise linear models, in which data are divided into a number of regions, each of which is modelled by a linear model. As discussed above, such mixtures cannot capture the long-term dependencies in data. Both piece-wise linear models and mixtures of linear models perform poorly in comparison to nonlinear models such as DNNs, which are able to capture complex dynamics in data.
Additionally, while offering some improvements in explainability compared with DNNs, these techniques lack the ease of interpretability that is offered by a simple linear model, owing to the need for various user-defined hyperparameters, including the number of regions and the number of mixture components.

Summary

It is an aim of the present disclosure to provide a management node, a method performed by a management node, and a computer readable medium which at least partially address one or more of the challenges discussed above. It is a further aim of the present disclosure to provide a management node, computer readable medium and associated method which enable the generation of an ML composition that is optimized for performing a composition task, and that comprises a plurality of ML modules, each ML module trained to perform a module task that is specific to the ML module, and each ML module comprising at least one ML model. Such a composition may offer a combination of explainability, through understanding of the components of the composition and their interconnections, and reliable task performance. According to a first aspect of the present disclosure, there is provided a computer implemented method for generating a Machine Learning (ML) composition that is optimized to perform a composition task in a communication network. The ML composition comprises a plurality of interconnected ML modules, each ML module trained to perform a module task that is specific to the ML module, and an ML module comprises at least one ML model. The method is performed by a management node and comprises obtaining a candidate set of ML modules, each ML module in the candidate set being a candidate for inclusion in the composition, and initiating a current version of a topology for the composition, the initiated current version topology comprising at least two ML modules from the candidate set and at least one connection between the ML modules.
The method further comprises repeating, until a termination condition is satisfied, the steps of identifying, from the candidate set of ML modules, a possible topology for the composition that includes the current version of the topology and minimizes a first loss function associated with the composition task, evolving the current version of the topology by adding at least one node or connection from the identified possible topology to the current version of the topology, and evaluating the current version of the topology using a second loss function associated with the composition task. The ML composition optimized to perform the composition task comprises the ML modules and connections between ML modules present in the current version of the topology when the termination condition is satisfied. According to another aspect of the present disclosure, there is provided a computer program product comprising a computer readable medium, the computer readable medium having computer readable code embodied therein, the computer readable code being configured such that, on execution by a suitable computer or processor, the computer or processor is caused to perform a method according to any one or more of aspects or examples of the present disclosure. According to another aspect of the present disclosure, there is provided a management node for generating a Machine Learning (ML) composition that is optimized to perform a composition task in a communication network, wherein the ML composition comprises a plurality of interconnected ML modules, each ML module trained to perform a module task that is specific to the ML module, and wherein an ML module comprises at least one ML model. 
The management node comprises processing circuitry configured to cause the management node to obtain a candidate set of ML modules, each ML module in the candidate set being a candidate for inclusion in the composition and initiate a current version of a topology for the composition, the initiated current version topology comprising at least two ML modules from the candidate set and at least one connection between the ML modules. The processing circuitry is further configured to cause the management node to repeat, until a termination condition is satisfied, the steps of identifying, from the candidate set of ML modules, a possible topology for the composition that includes the current version of the topology and minimizes a first loss function associated with the composition task, evolving the current version of the topology by adding at least one node or connection from the identified possible topology to the current version of the topology, and evaluating the current version of the topology using a second loss function associated with the composition task. The ML composition optimized to perform the composition task comprises the ML modules and connections between ML modules present in the current version of the topology when the termination condition is satisfied. According to another aspect of the present disclosure, there is provided a management node for generating a Machine Learning (ML) composition that is optimized to perform a composition task in a communication network, wherein the ML composition comprises a plurality of interconnected ML modules, each ML module trained to perform a module task that is specific to the ML module, and wherein an ML module comprises at least one ML model. The management node comprises an ML module selector unit for obtaining a candidate set of ML modules, each ML module in the candidate set being a candidate for inclusion in the composition. 
The management node further comprises an evolutionary topology learner unit, the evolutionary topology learner unit comprising an initiator unit for initiating a current version of a topology for the composition, the initiated current version topology comprising at least two ML modules from the candidate set and at least one connection between the ML modules, and a topology search unit, a topology grower unit and a loss calculator unit. The topology search unit, topology grower unit and loss calculator unit are for repeating, until a termination condition is satisfied, the steps of identifying, by the topology search unit and from the candidate set of ML modules, a possible topology for the composition that includes the current version of the topology and minimizes a first loss function associated with the composition task, evolving, by the topology grower unit, the current version of the topology by adding at least one node or connection from the identified possible topology to the current version of the topology, and evaluating, by the loss calculator unit, the current version of the topology using a second loss function associated with the composition task. The ML composition optimized to perform the composition task comprises the ML modules and connections between ML modules present in the current version of the topology when the termination condition is satisfied. Examples of the present disclosure thus provide a method and a management node that approach the task of using Machine Learning for addressing a task through a modular design. ML Modules are introduced that are specialized at solving specific tasks, and may comprise one or more ML models. A composition comprising multiple ML modules is then learned, the composition being optimized to address a composition task. 
The composition itself can then offer insight on how it obtains the solution to the task, through the knowledge of which specialized ML modules are included in the composition, and how they are interconnected. The ML modules to include and their interconnections are learned in a data driven manner, but once established, the tasks at which these modules are specialized, and their interconnections, can offer conceptual insight into how the composition obtains a solution.

Brief Description of the Drawings

For a better understanding of the present disclosure, and to show more clearly how it may be carried into effect, reference will now be made, by way of example, to the following drawings, in which: Figure 1 is a flow chart illustrating process steps in a computer implemented method for generating a Machine Learning (ML) composition that is optimized to perform a composition task in a communication network; Figures 2a to 2d show a flow chart illustrating process steps in another example of a computer implemented method for generating an ML composition that is optimized to perform a composition task in a communication network; Figure 3 is a block diagram illustrating functional units in a management node; Figure 4 is a block diagram illustrating functional units in another example of a management node; Figure 5 illustrates process flow through components of a management node; Figure 6 illustrates an example logical implementation architecture for methods according to the present disclosure; Figure 7 illustrates process flow for an Evolutionary Topology Learner unit; Figure 8 illustrates process flow for a Conditional Topology Grower unit; Figure 9 illustrates an example of a top-down, gradient based search method; Figure 10 illustrates an example of mutation generation; Figure 11 illustrates species generation; Figures 12 to 14 illustrate cross over combination and offspring generation; Figures 15 and 16 illustrate offspring selection; and Figure 17 shows a table
illustrating example ML modules and use cases.

Detailed Description

Examples of the present disclosure propose to address the challenge of generating a Machine Learning solution to a composition task by considering the composition task as a collection of smaller tasks, identifying the relations between such tasks, and then combining ML modules specialized at solving the smaller tasks into an ML composition for solving the composition task. The present disclosure thus proposes a modular design, according to which a plurality of ML modules may each be specialized at solving specific tasks. Each ML module comprises at least one ML model, and may comprise a plurality of interconnected ML models, and each ML module has been optimized for performing its specific task. The ML modules are reusable, and, once optimized for their specific task, may be combined and used together to solve a wide range of composition tasks that may be only weakly related to the specific task for which any one ML module was trained. According to examples of the present disclosure, when used in a composition for solving a new composition task, ML modules do not need any additional training. Instead, individual ML modules may be managed and undergo periodic or scheduled retraining during their own lifecycle. According to examples of the present disclosure, in order to generate an ML composition for solving a new composition task, a set of potentially useful, or candidate, ML modules is first obtained from a dictionary or other source of available ML modules. A composition of these modules that is able to solve the composition task is then learned from data. The learned composition, comprising the ML modules retained from the candidate set and the interconnections between them, offers insight on the solution to the composition task that is provided by the composition.
The individual ML modules (together with the knowledge of the tasks for which they are specialized), and the interconnections between the ML modules, provide insight into how the composition as a whole addresses and solves the composition task. In some examples, the ML modules may comprise ML models which are themselves “explainable”, comprising for example linear models or other models having some aspect of “explainability”. In such circumstances, the ML composition offers an even greater degree of insight and explainability into how the composition task is solved. Figure 1 is a flow chart illustrating process steps in a computer implemented method 100 for generating a Machine Learning (ML) composition that is optimized to perform a composition task in a communication network. The ML composition comprises a plurality of interconnected ML modules, each ML module trained to perform a module task that is specific to the ML module, and an ML module comprises at least one ML model. The method is performed by a management node which may comprise a physical or virtual node, and may be implemented in a server apparatus and/or in a virtualized environment, for example in a cloud or fog deployment. The management node may for example be implemented in a core network of the communication network, and may in some examples encompass multiple logical entities. Referring to Figure 1, in a first step 110, the method 100 comprises obtaining a candidate set of ML modules, each ML module in the candidate set being a candidate for inclusion in the composition. As illustrated at 110a, each ML module may be subjected to retraining to optimize performance of its specific module task, and retraining of individual ML modules may be scheduled according to fulfillment of a retraining condition for the ML module. 
Such a retraining condition may be associated with ML module performance of its specialized task, task performance of ML compositions in which the ML module is included, and/or with a periodic or other retraining schedule. In step 120, the method 100 comprises initiating a current version of a topology for the composition, the initiated current version topology comprising at least two ML modules from the candidate set and at least one connection between the ML modules. The method 100 then comprises, in step 130, identifying from the candidate set of ML modules, a possible topology for the composition that includes the current version of the topology and minimizes a first loss function associated with the composition task. In step 140, the method comprises evolving the current version of the topology by adding at least one node or connection from the identified possible topology to the current version of the topology. In step 150, the method comprises evaluating the current version of the topology using a second loss function associated with the composition task. As illustrated at 160, the method 100 comprises repeating the identifying, evolving and evaluating steps 130, 140, 150 until a termination condition is satisfied. As illustrated at step 170, the ML composition generated by the method 100 and which is optimized to perform the composition task comprises the ML modules and connections between ML modules present in the current version of the topology when the termination condition is satisfied. According to examples of the present disclosure, and as discussed in further detail below, the composition task may be associated with a task training dataset comprising input values and corresponding output values, and the ML composition optimized to perform the task may be optimized to generate a mapping representing the relation between the inputs and outputs from the composition task training dataset.
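The iterative structure of steps 110 to 170 can be sketched as a simple loop. This is a hypothetical outline only, not the claimed implementation; the callable names are illustrative placeholders for the operations described above.

```python
def generate_composition(candidate_modules, init_topology, search_possible,
                         evolve, evaluate, terminated):
    """Sketch of method 100: initiate a topology, then iterate
    search / evolve / evaluate until a termination condition holds."""
    topology = init_topology(candidate_modules)                   # step 120
    while True:
        possible = search_possible(candidate_modules, topology)   # step 130
        topology = evolve(topology, possible)                     # step 140
        loss = evaluate(topology)                                 # step 150
        if terminated(topology, loss):                            # step 160
            return topology                                       # step 170
```

With trivial stand-in callables (a counter in place of a real topology), the loop terminates and returns the final state, mirroring the repeat-until-termination behaviour of steps 130 to 160.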
The composition task may be a complex task, and may encompass one or more of the module tasks for which individual ML modules are specialized. In other examples, the composition task may be a task that is evolving with time, and/or may be related more or less closely to one or more of the module tasks for which individual ML modules are specialized. For the purposes of the present disclosure, it will be appreciated that an ML model is considered to comprise the output of a Machine Learning algorithm or process, wherein an ML process comprises instructions through which data may be used in a training procedure to generate a model artefact for performing a given task, or for representing a real world process or system. An ML model is the model artefact that is created by such a training procedure, and which comprises the computational architecture that performs the task. Also for the purposes of the present disclosure, a topology is considered to comprise an arrangement of interconnected ML modules, the arrangement identifying both the ML modules present and the connections between them, wherein an identified connection may specify both directionality (output from ML module X provided as input to ML module Y) and importance weighting. It will be appreciated that the method 100 differentiates between a current version of the topology for the ML composition, and various possible versions, with the current version evolving with each iteration to include at least one ML module or connection from the most recently identified possible version of the topology. As discussed in further detail below, the current version of the topology thus evolves “from the bottom up”, while the various possible versions may be searched “from the top down” with each iteration. Each possible version of the topology for the ML composition is constrained to include the current version of the topology, and to be comprised of interconnected ML modules from the candidate set. 
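A topology as defined above (the ML modules present, plus directed, importance-weighted connections between them) can be represented with a minimal data structure. The following sketch is illustrative only; the class and method names are assumptions, not terms from the disclosure.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class Topology:
    """Arrangement of interconnected ML modules: which modules are present,
    and directed connections (source feeds target) with importance weights."""
    modules: Set[str] = field(default_factory=set)
    # (source, target) -> importance weight of the connection.
    connections: Dict[Tuple[str, str], float] = field(default_factory=dict)

    def add_connection(self, src: str, dst: str, weight: float = 1.0) -> None:
        if src not in self.modules or dst not in self.modules:
            raise ValueError("both modules must be present in the topology")
        self.connections[(src, dst)] = weight

    def contains(self, other: "Topology") -> bool:
        """True if this topology includes `other` as a sub-topology; each
        possible version is constrained to contain the current version."""
        return (other.modules <= self.modules
                and set(other.connections) <= set(self.connections))
```

The `contains` check captures the constraint that every possible version of the topology must include the current version, so the current version can only grow as the search iterates.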
As discussed above, the final generated ML composition, comprising the ML modules and interconnections present in the current version of the topology when the termination condition is satisfied, offers by its nature insight into how the ML composition addresses its composition task. This insight is provided through the knowledge of what specialized ML modules have been retained in the ML composition, and through the knowledge of how such modules are interconnected. For example, knowing which specialized ML modules provide their output as input to which other specialized modules, as well as knowing the weighting of the importance of such connections, provides considerable insight into the methodology of the ML composition as a whole for addressing the composition task. The method 100 learns the composition through an iterative process of continually generating a possible topology using a training data set, using the possible topology to evolve a current version topology, and evaluating the current version topology. The composition is thus learned in a data driven manner in order to identify the most effective combination of ML modules for addressing the task. However, despite being generated through a data driven process, the resulting composition offers conceptual insight into how the composition task will be solved, and this conceptual insight is available before the ML composition is used in an inference phase to actually address the composition task. No further training or evolution of the ML composition is required before it can be used, and there is no requirement for a period of online use before any insight or understanding of how the composition operates becomes available. Figures 2a to 2d show flow charts illustrating process steps in another example of method 200 for generating a Machine Learning, ML, composition that is optimized to perform a composition task in a communication network.
The method 200 provides various examples of how the steps of the method 100 may be implemented and supplemented to achieve the above discussed and additional functionality. As for the method 100, the method 200 is performed by a management node, which may comprise a physical or virtual node, and may be implemented in a server apparatus and/or in a virtualized environment, for example in a cloud or fog deployment. The management node may for example be implemented in a core network of the communication network, and may in some examples encompass multiple logical entities. Referring initially to Figure 2a, in a first step 210, the management node performing the method 200 obtains a candidate set of ML modules, each ML module in the candidate set being a candidate for inclusion in the composition. As discussed above, an ML module is trained to perform a module task that is specific to the ML module, and an ML module comprises at least one ML model. As illustrated at step 210, each ML module may be associated with metadata comprising a specification of the input space of the ML module. As illustrated at 210, the management node may obtain the ML modules in the candidate set from available ML modules maintained by a life cycle management node. The life cycle management node may be co-located with the management node carrying out the method 200, or may be separately located. For example, in some implementations, it may be advantageous to maintain a centralized life cycle management function that is separate from distributed management nodes for selection and learning of compositions for different composition tasks within the communication network. 
In this manner, management nodes may be appropriately instantiated to generate ML compositions for composition tasks relating to a Radio Access Network, Fronthaul network, Backhaul network and/or Core Network, with each management node obtaining ML modules for its respective candidate sets from a centralized life cycle management function which may be instantiated in the core network, for example as a virtualized network function. Example sub steps for obtaining a candidate set of ML modules are illustrated in Figure 2a at steps 210a to 210c. In step 210a, the management node may obtain the metadata of available ML modules. As illustrated at 210a, the metadata of available ML modules may comprise, in addition to a specification of the input space for the ML module, at least one of an identification of the ML module task that the ML module is trained to perform, and/or the number of inputs from other ML modules that can be accepted. In some examples, ML module metadata may further comprise a measure of historical importance of the ML module, for example reflecting importance of the ML modules in ML compositions in which the ML module has previously been included, ML module retraining information, such as next retraining time or retraining schedule, and/or any other information relating to the functioning of the ML module. In step 210b, the management node may select candidate ML modules for inclusion in the candidate set based on the obtained metadata. This may comprise for example selecting only ML modules for which the specified input space is a subset of the input space of the composition task. In some examples, step 210b may further comprise automatically including in the candidate set any ML modules which have been flagged as relevant to a particular composition task, either by the management node itself, by another node or by a human user. Such information may be included in metadata associated with the composition task.
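One possible reading of the metadata-based selection at step 210b (keep a module only if its specified input space is a subset of the composition task's input space) can be sketched as follows. The metadata keys are illustrative assumptions; the disclosure does not prescribe a concrete schema.

```python
def select_candidates(available_modules, task_input_space):
    """Keep only ML modules whose specified input space is a subset of
    the composition task's input space (sketch of step 210b)."""
    task_space = set(task_input_space)
    return [
        module for module in available_modules
        if set(module["input_space"]) <= task_space   # subset check
    ]
```

A module requiring an input the task does not provide (here, a hypothetical "weather" feature) would be excluded from the candidate set, while modules whose inputs are fully covered by the task's input space are retained.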
In step 210c, the management node may obtain the selected ML modules, for example by requesting and receiving the selected ML modules from a life cycle management node or other storage, and/or by retrieving one or more of the selected ML modules from a storage accessible to or managed by the management node. In step 220, the management node initiates a current version of a topology for the ML composition, the initiated current version topology comprising at least two ML modules from the candidate set and at least one connection between the ML modules. Sub steps which may be included in the process of initiating a current version of a topology for the ML composition are illustrated in Figure 2c. Referring now to Figure 2c, in a first sub step 220a, the management node checks whether any ML modules are indicated as required for the composition task. As mentioned above, the composition task may be associated with metadata describing the composition task and specifying its input space, and the metadata may additionally indicate ML modules that are required for the composition task. In other examples, this information may be input by a human user or other node. If no specific ML modules are required for the composition task, the management node randomly selects two ML modules from the candidate set in step 220d, and randomly selects, from among possible connections between the two selected ML modules, at least one connection for inclusion in the initiated current version of the topology at step 220e. If at step 220a it is determined that there is at least one specific ML module that is required for the composition task, then the management node, at step 220b, includes in the initiated current version topology any and all ML modules indicated in the composition task metadata (or elsewhere) as being required for the composition task. At step 220c, the management node checks whether at least two ML modules have been included.
If only one ML module has so far been included, then the management node proceeds to step 220d to randomly select one other ML module for inclusion in the initiated current version of the topology. If at least two ML modules have already been included, owing to being flagged as required for the composition task, then the management node proceeds to step 220e, and randomly selects, from among possible connections between the ML modules, at least one connection for each ML module. It will be appreciated that the restriction to possible connections between ML modules may for example refer to the input spaces of the ML modules, as well as the number of inputs from other ML modules that can be accepted, and/or specific identification of ML modules that can be accepted as inputs. As discussed above, such information may be included in metadata associated with the ML modules. Having initiated the current version of the topology for the ML composition, the management node proceeds to step 230, illustrated in Figure 2b. Referring to Figure 2b, in step 230, the management node identifies, from the candidate set of ML modules, a possible topology for the composition that includes the current version of the topology and minimizes a first loss function associated with the composition task. As illustrated at step 230, this may comprise performing a top-down search, which may be a gradient based search, of possible topologies that include the current version of the topology. The search space of possible topologies may include all combinations of any subset, up to and including the entire set, of candidate ML modules, which combinations respect the input spaces and limitations on number of input ML modules that may be specified in ML module metadata. 
Consequently, as illustrated at 230a, identifying a possible topology for the composition that includes the current version of the topology and minimizes a first loss function associated with the composition task may comprise identifying a topology in which ML modules of the candidate set are connected in a manner consistent with input and output spaces specified in their metadata. As discussed above, and illustrated at 230b, the composition task may be associated with a dataset comprising input values and corresponding output values, and the first loss function associated with the composition task may comprise a loss function based on a difference between outputs from the composition task data set and outputs provided by the possible topology for the composition, given the same inputs from the composition task dataset. In step 240, the management node then evolves the current version of the topology by adding at least one node or connection from the identified possible topology to the current version of the topology. An example of sub steps that may be performed in order to achieve the evolving of the current topology at step 240 is illustrated in Figure 2d. Referring to Figure 2d, in the illustrated example, in order to evolve the current version of the topology, the management node first generates possible first order mutations of the current version of the topology in step 241, which first order mutations are included in the identified possible topology. For the purposes of the present disclosure, a first order mutation comprises a single change in the topology, i.e. an addition of an ML module to the topology or an addition of a connection between ML modules present in the topology. The limitation to mutations that are present in the identified possible topology restricts the new ML modules and new connections to only those ML modules and connections that are present in the identified possible topology. 
As already discussed, the identified possible topology is restricted to including the current version of the topology, and so by definition, first order mutations of the current version of the topology, which mutations are present in the identified possible topology, are additions to the current version of the topology as opposed to removals of ML modules or connections. The evolution of the current version of the topology to include at least one mutation may thus be envisaged as a “growing” of the current version of the topology, with at least one mutation being added at each iteration. As illustrated in Figure 2d, generating possible first order mutations of the current version of the topology, which first order mutations are included in the identified possible topology, may comprise identifying, in step 241a, ML modules that are not present in the current version of the topology but are included in the identified possible topology and have a connection to the current version of the topology, and identifying, in step 241b, connections between ML modules of the current version of the topology, which connections are present in the identified possible topology but are not present in the current version of the topology. Each ML module and connection identified in steps 241a and 241b comprises a first order mutation of the current version of the topology. In step 242, the management node determines whether the first order mutations should be divided into species. This may comprise checking whether the number of first order mutations exceeds a threshold value. It will be appreciated that as the current version of the topology grows with iterations of steps 230 to 250, the number of first order mutations that can be identified will vary. If a large number of first order mutations have been identified then it may be useful or appropriate to divide the mutations into species, for example in order to ensure efficient computation and/or to improve performance. 
The threshold value may be set by a human operator, or by the management node, to represent the number of first order mutations at which division into species affords advantages in efficiency, performance or some other criterion. In other examples, some other criterion for assessing the generated first order mutations may be used to determine whether division into species should be performed. Division into species offers the advantage of protecting mutations that are not immediately advantageous when compared with other existing entities, but which may nonetheless offer value. Species division also provides ways of exploring alternative compositions which would otherwise have been discarded. For example, if a mutation allows a connection between two nodes that are not present or connected in any other mutation, then that may be considered to be a novel mutation and worthwhile to protect, even if other mutations appear to be more advantageous. Species division can protect novel mutations, increasing the likelihood that such mutations will be incorporated. If the management node determines at step 242 that the first order mutations should not be divided into species, the management node then, at step 243, generates combinations of the first order mutations, each combination comprising at least two mutations not present in any other combination. This may comprise, as illustrated at 243a, for groups of mutations, adding the mutations in the group such that the combination comprises each first order mutation of the group. In some examples, each group may comprise two first order mutations, or in some examples, groups of more than two first order mutations may be envisaged. In step 244, the management node selects a single evolution combination from among the generated combinations. 
As illustrated in Figure 2d, this may comprise selecting the generated combination that minimizes a third loss function associated with the composition task to be the evolution combination in step 244a. As illustrated at 244b, the third loss function associated with the composition task may comprise a loss function based on a difference between outputs from the composition task data set and outputs provided by the current version of the topology evolved to include a generated combination, given the same inputs from the composition task dataset. In step 245, the management node then applies the selected evolution combination to the current version of the topology. Returning to step 242, if the management node determines that it should divide the first order mutations into species, then the management node does so at step 246. This may comprise for example clustering the first order mutations according to a clustering parameter. Structural features are one example of a clustering parameter that may be used by the management node at step 246, enabling a clustering that ensures each species comprises first order mutations that share similar structures. Clustering in this manner may help to identify the most promising combination of mutations for application to the evolved current version of the topology. Following division into species, the management node then, for each species, generates combinations of first order mutations in the species at step 247, each combination comprising at least two mutations not present in any other combination, and selects a species combination from among the generated combinations of the species at step 248. As illustrated at step 248, selecting a species combination from among the generated combinations for a species may comprise selecting as the species combination the generated combination that minimizes the third loss function associated with the composition task. 
Having selected a species combination for each species, the management node then selects, in step 249, a single evolution combination from among the selected species combinations. As previously, this may comprise selecting the species combination that minimizes a loss function associated with the composition task to be the evolution combination. Finally, the management node applies the selected evolution combination to the current version of the topology in step 245. Referring again to Figure 2b, following evolution of the current version of the topology in step 240, the management node then evaluates the current version of the topology using a second loss function associated with the composition task in step 250. As illustrated at 250a, the second loss function associated with the composition task may comprise a loss function based on a difference between outputs from the composition task data set and outputs provided by the current version of the topology for the composition, given the same inputs from the composition task dataset. In step 255, the management node checks whether a termination condition is satisfied. The termination condition may for example describe a minimum number of iterations, a threshold value for the second loss function, or any other criterion appropriate for assessing convergence to an acceptable solution. The criterion may be set by a human user, learned on the basis of ML composition performance, or established in any other suitable manner. If the termination condition is not yet satisfied, then the management node returns to step 230 to identify a new possible topology, evolve the current version of the topology in step 240 and then evaluate the new current version of the topology in step 250. 
If the termination condition is satisfied, the management node then sets the ML modules and connections between ML modules present in the current version of the topology at the time that the termination condition is satisfied to be the ML composition optimized to perform the composition task. It will be appreciated that, as discussed previously, no further training of the ML composition on the composition task data set is required. The ML composition resulting from the method 200, including its ML modules and their interconnections, which may include both directionality and importance weighting, is ready for use in solving the composition task. The retraining of individual ML modules included in the ML composition is managed independently of their use in any one ML composition, according to a retraining schedule that is particular to the ML module and is handled for example by a life cycle node as discussed above. As discussed above, the methods 100 and 200 may be performed by a management node, and the present disclosure provides a management node that is adapted to perform any or all of the steps of the above discussed methods. The management node may comprise a physical node such as a computing device, server etc., or may comprise a virtual node. A virtual node may comprise any logical entity, such as a Virtualized Network Function (VNF) which may itself be running in a cloud, edge cloud or fog deployment. The management node may for example comprise or be instantiated in any part of a communication network node such as a logical core network node, network management center, network operations center, Radio Access node etc. Any such communication network node may itself be divided between several logical and/or physical functions, and any one or more parts of the management node may be instantiated in one or more logical or physical functions of a communication network node. 
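The iterative procedure of steps 230 to 255 described above can be sketched in outline as follows; the helper functions are toy stand-ins for the search, evolution and loss operations (none of them is part of any real API), and a loss threshold is used as the example termination condition:

```python
# Illustrative outline of the search-evolve-evaluate loop (steps 230 to 255),
# with toy stand-ins for each operation described in the text.
def initiate_topology(candidates):
    return candidates[:2]                      # step 220: start from two modules

def search_possible_topology(topology, candidates, dataset):
    return candidates                          # step 230: a superset of the topology

def evolve(topology, possible, dataset):
    extra = [m for m in possible if m not in topology]
    return topology + extra[:1] if extra else topology   # step 240: add one mutation

def loss(topology, dataset):
    return max(0.0, 1.0 - 0.2 * len(topology))           # toy second loss function

def generate_composition(candidates, dataset, loss_threshold=0.2, max_iters=50):
    topology = initiate_topology(candidates)
    for _ in range(max_iters):
        possible = search_possible_topology(topology, candidates, dataset)  # step 230
        topology = evolve(topology, possible, dataset)                      # step 240
        if loss(topology, dataset) <= loss_threshold:                       # steps 250/255
            break
    return topology  # the ML composition: modules plus their connections

composition = generate_composition(["M1", "M2", "M3", "M4", "M5"], dataset=None)
```

With the toy loss shown, the loop grows the topology one module per iteration until the loss threshold is met.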
Figure 3 is a block diagram illustrating an example management node 300 which may implement the method 100 and/or 200, as illustrated in Figures 1 to 2d, according to examples of the present disclosure, for example on receipt of suitable instructions from a computer program 350. Referring to Figure 3, the management node 300 comprises a processor or processing circuitry 302, and may comprise a memory 304 and interfaces 306. The processing circuitry 302 is operable to perform some or all of the steps of the method 100 and/or 200 as discussed above with reference to Figures 1 to 2d. The memory 304 may contain instructions executable by the processing circuitry 302 such that the management node 300 is operable to perform some or all of the steps of the method 100 and/or 200, as illustrated in Figures 1 to 2d. The instructions may also include instructions for executing one or more telecommunications and/or data communications protocols. The instructions may be stored in the form of the computer program 350. In some examples, the processor or processing circuitry 302 may include one or more microprocessors or microcontrollers, as well as other digital hardware, which may include digital signal processors (DSPs), special-purpose digital logic, etc. The processor or processing circuitry 302 may be implemented by any type of integrated circuit, such as an Application Specific Integrated Circuit (ASIC), Field Programmable Gate Array (FPGA) etc. The memory 304 may include one or several types of memory suitable for the processor, such as read-only memory (ROM), random-access memory, cache memory, flash memory devices, optical storage devices, solid state disk, hard disk drive etc. Figure 4 illustrates functional units in another example of management node 400 which may execute examples of the methods 100 and/or 200 of the present disclosure, for example according to computer readable instructions received from a computer program. 
It will be understood that the units illustrated in Figure 4 are functional units, and may be realized in any appropriate combination of hardware and/or software. The units may comprise one or more processors and may be integrated to any degree. Referring to Figure 4, the management node 400 is for generating a Machine Learning (ML) composition that is optimized to perform a composition task in a communication network, wherein the ML composition comprises a plurality of interconnected ML modules, each ML module trained to perform a module task that is specific to the ML module, and wherein an ML module comprises at least one ML model. The management node 400 comprises an ML module selector unit 402 for obtaining a candidate set of ML modules, each ML module in the candidate set being a candidate for inclusion in the composition. The management node further comprises an evolutionary topology learner unit 404, the evolutionary topology learner unit 404 comprising an initiator unit 406 for initiating a current version of a topology for the composition, the initiated current version topology comprising at least two ML modules from the candidate set and at least one connection between the ML modules, and a topology search unit 408, a topology grower unit 410 and a loss calculator unit 412. 
The topology search unit 408, topology grower unit 410 and loss calculator unit 412 are for repeating, until a termination condition is satisfied, the steps of identifying, by the topology search unit 408 and from the candidate set of ML modules, a possible topology for the composition that includes the current version of the topology and minimizes a first loss function associated with the composition task, evolving, by the topology grower unit 410, the current version of the topology by adding at least one node or connection from the identified possible topology to the current version of the topology, and evaluating, by the loss calculator unit 412, the current version of the topology using a second loss function associated with the composition task. The ML composition optimized to perform the composition task comprises the ML modules and connections between ML modules present in the current version of the topology when the termination condition is satisfied. The management node 400 may further comprise interfaces 414 which may be operable to facilitate communication with a life cycle node, and/or with other communication network nodes over suitable communication channels. Figures 1 to 2d discussed above provide an overview of methods which may be performed according to different examples of the present disclosure. These methods may be performed by a management node, as illustrated in Figures 3, 4 and 6. There now follows a detailed discussion of how different process steps illustrated in Figures 1 to 2d and discussed above may be implemented. The functionality and implementation detail described below is discussed with reference to the functional modules of Figures 4 and 6. However, it will be appreciated that the functionality described below may also be implemented in the node of Figure 3, performing examples of the methods 100 and/or 200, substantially as described above. 
Figure 5 illustrates process flow through components of the management node, as well as a life cycle management node, and introduces notation used below in detailed discussion of the function of the different management node units in implementing the methods 100, 200. Referring to Figure 5, in step 1, the Module Selector unit 402 of the management node 400 generates a list of candidate modules from a dictionary of available modules. The Module Selector unit takes as its input metadata of the composition task T* that the ML composition is to solve, as well as metadata for the ML modules. In step 2, the Module Selector unit 402 obtains the ML modules in the candidate list from a life cycle management node 500. In step 3, the Evolutionary Topology Learner unit 404 learns a topology for the ML composition, comprising modules from the candidate list, that will address the composition task. The Evolutionary Topology Learner unit 404 takes as input the ML modules of the candidate set and a dataset
D = {X, Y}
corresponding to the task T. The output of the Evolutionary Topology Learner unit 404 is the learned topology Q, which describes the ML modules from the candidate set that are to be included in the ML composition, and the interactions between these modules, in order to solve the composition task T. As discussed above, the interactions between ML modules may include both directionality of a connection and importance weighting. Figure 6 illustrates an example logical implementation architecture 2000 comprising a management node 600, life cycle management node 500 and appropriate data interfaces 2050. Elements of the implementation architecture, and their operation, are discussed in greater detail below.

Specialized ML Modules

An ML Module 502 contains a set of machine learning models 506 that are fully trained and optimized for solving a specific module task. There are no restrictions on the architecture of the machine learning models 506 of an ML Module 502, although in some examples the ML models 506 may comprise linear models or other models fulfilling criteria to be considered as “explainable”. Using at least some ML models that are considered to be explainable may provide an additional layer of insight into how an ML composition comprising such models approaches a given composition task. However, as discussed above, “explainable” models are not required for the ML composition, as the individual ML modules (together with the knowledge of the tasks for which they are specialized), and the interconnections between the ML modules already provide insight into how the composition as a whole addresses and solves the composition task. Each ML module is associated with metadata 504 that describes various aspects of the module’s functionality and its requirements. 
The metadata includes the specific task that the module is specialized to perform, as well as the input space of the ML module and the accepted number of alien input modules, that is the number of other ML modules whose output can be accepted by the ML module as an input. The metadata may further include a historical importance weight of the ML module in ML compositions in which the ML module has been included, and/or a retraining schedule for the ML module. Other examples of metadata that may be included relating to ML modules may also be envisaged. The ML modules 502 are maintained by the Life Cycle Management node 500, and are subjected to regular retraining scheduled by the Life Cycle Management node 500. The Life Cycle Management node 500 may adapt the frequency of retraining of an ML module by taking into account its historical importance in ML compositions. However, when an ML Module is employed by the Topology Learner unit 604, it does not require further training and its learnable parameters are fixed. When included in an ML composition, an ML module simply serves as a function which takes a number of inputs and produces output responses.

Mathematical Formulation

Let J be a set that includes valid ML Modules. An ML Module is shown as

Mi = (ti, Ki, wi, fi, Ai)

where: ti is the task at which the Module is specialized; Ki is the “joint number”, which corresponds to the accepted number of alien input modules; wi is the historical importance of the Module; fi is the frequency of retraining, indicating how often this Module needs retraining; and Ai is the feature space of the Module inputs (the feature attributes).
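A module record of this form might be represented in code as follows; this is a sketch only, with field names simply mirroring the metadata items listed above:

```python
from dataclasses import dataclass

# Sketch of an ML Module record Mi = (ti, Ki, wi, fi, Ai); the field names
# mirror the symbols of the formulation and are illustrative only.
@dataclass
class MLModule:
    task: str                 # ti: the task the Module is specialized for
    joint_number: int         # Ki: accepted number of alien input modules
    importance: float         # wi: historical importance of the Module
    retraining_freq: str      # fi: how often this Module needs retraining
    input_space: frozenset    # Ai: feature space of the Module inputs

m1 = MLModule("throughput_prediction", 2, 0.7, "weekly",
              frozenset({"rsrp", "sinr"}))
```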
Let

Di = {Xi, Yi}, Xi ∈ R^(ni × dxi), Yi ∈ R^(ni × dyi)

be the dataset associated with task Ti, where ni is the number of training samples, dxi is the dimensionality of the inputs Xi, and dyi is the dimensionality of the output responses Yi. The ML Module Mi with the joint number Ki introduced above contains a single marginal model and Ki joint models for solving task ti. In an example in which Ki = 2, possible models may include:

Yi = gi(Xi; Φi)
Yi = gi|j(Xi, Yj; Φi|j)
Yi = gi|j,k(Xi, Yj, Yk; Φi|j,k)

Here, gi is a machine learning model which takes as input Xi and produces Yi. The parameter set Φi includes all the model parameters. Additionally, gi|j is a machine learning model which takes as input both Xi and Yj, which is the output of the ML Module Mj. Similarly, gi|j,k is a machine learning model which takes as inputs both Xi and the pair of Yj and Yk, which are the outputs of the ML Modules Mj and Mk, respectively. It will be appreciated that gi, gi|j and gi|j,k are three different machine learning models
with possibly different architectures. Such models may be linear models or other explainable models but also may be non-explainable models such as ANNs.

Life Cycle Management

The Life Cycle Management (LCM) node maintains the ML Modules including their metadata, and initiates retraining of each ML Module according to the module’s retraining schedule fi.
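By way of illustration only, the one marginal and two joint models of a Module with joint number Ki = 2 might be sketched as follows; the linear forms, fixed parameter values and function names are toy stand-ins for arbitrary pretrained models:

```python
# Toy illustration of a Module with joint number Ki = 2: one marginal model
# gi and joint models gi|j and gi|j,k. The linear forms are placeholders for
# arbitrary pretrained ML models whose parameters Φ are fixed.
def g_marginal(x, phi=(0.5, -0.2)):
    # Yi from the inputs Xi alone
    return phi[0] * x[0] + phi[1] * x[1]

def g_joint_j(x, y_j, phi=(0.5, -0.2, 0.3)):
    # Yi from Xi together with Yj, the output of Module Mj
    return phi[0] * x[0] + phi[1] * x[1] + phi[2] * y_j

def g_joint_jk(x, y_j, y_k, phi=(0.5, -0.2, 0.3, 0.1)):
    # Yi from Xi together with Yj and Yk
    return phi[0] * x[0] + phi[1] * x[1] + phi[2] * y_j + phi[3] * y_k

x = (1.0, 2.0)
y = g_joint_jk(x, g_marginal(x), 0.5)
```

The example composes the marginal model of one Module into a joint model of another, which is exactly how modules interact once wired into a composition.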
The Life Cycle Management node may in some examples comprise an overarching mechanism which also monitors and records the performance of the modules in ML compositions as well as when being used for their specialized tasks.

Module Selector Unit (Steps 110, 210 of methods 100, 200)

The Module Selector unit 602 has access to the metadata of all ML Modules. The Module Selector unit 602 takes as the input the metadata of a task T and provides a list of viable modules that are potentially suitable for solving this task. In identifying suitable ML modules, the Module Selector unit 602 may limit the selection to only those ML modules whose input space is a subset of (or equal to) the input space of the task T. Formally, let J be the set of all Specialized Modules, and let Ak be the input feature space of the Specialized Module k. The Specialized Module Mk is a valid candidate if

Ak ⊆ AT

where AT is the input feature space of the task T. The list of all valid candidates is shown as

M = {Mk ∈ J : Ak ⊆ AT}.
The Module Selector can also allow for incorporation of domain knowledge. For example, if certain ML Modules are required for a given composition task T, these can be included by the Module Selector unit. Information about required ML modules may for example be specified in composition task metadata, or in any other suitable manner, for example by a human user, orchestrator node, use case manager for a particular composition task, etc.

Evolutionary Topology Learner Unit (Steps 120 – 150, 220 – 250)

The Evolutionary Topology Learner unit 604 consists of the following units: Initiator unit 606, Topology Search unit 608, Topology Grower unit 610 and Loss Calculator unit 612. In some examples, the initiator unit 606 may be a component part of the topology search unit 608, and the loss calculator unit 612 may be a component part of the topology grower unit 610. These units are illustrated separately for clarity, but it will be appreciated that the units are logical entities whose functions may be implemented in any suitable manner. The Topology Learner unit 604 takes as its inputs:
● the dataset D = {X, Y} associated with the task T,
● the candidate ML Modules M selected by the Module Selector unit 602, and
● a current version Q of a topology for the ML composition.
In a first iteration of the Topology Learner unit, the current version Q of the topology is initiated by the initiator unit 606 as discussed below. The Topology Learner unit 604 then evolves the topology Q by iterating between conditional search for possible topologies and conditional growth or evolution of the current version of the topology. The following process steps describe how the Evolutionary Topology Learner may carry out its function. Step 0 (steps 120, 220): Initiate a current version topology Q. In the absence of any prior knowledge, such as for example required ML modules specified in task metadata, the initial topology is a functional composition comprising two randomly selected ML Modules from the candidate set connected in a manner consistent with the limitations on input space, allowable inputs etc., specified in their metadata. Considering randomly selected modules Mi and Mj, where Mi, Mj ∈ M, the current version topology Q may be initiated such that:

Q: Ŷ = gj|i(X, gi(X))

where gi is the pretrained ML model of the module Mi and gj|i is the pretrained ML model of the module Mj, which takes the output of module Mi as an additional input. Step 1 (steps 130, 230): Apply the Topology Search unit 608 given the current topology Q and produce possible topology S:

S ← TopologySearch(D, M | Q)
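Step 0 might be sketched as follows; the set-based topology representation and the `initiate_topology` helper are illustrative only:

```python
import random

# Sketch of Step 0 (steps 120, 220): initiate the current version topology Q
# by randomly selecting two candidate Modules and one connection between them.
def initiate_topology(candidates, seed=None):
    rng = random.Random(seed)
    m_i, m_j = rng.sample(candidates, 2)
    # the single initial connection feeds the output of m_i into m_j,
    # i.e. m_j plays the role of g_{j|i} above
    return {"modules": {m_i, m_j}, "connections": {(m_i, m_j)}}

q = initiate_topology(["M1", "M2", "M3", "M4"], seed=7)
```

In practice the random selection would additionally be constrained by the input spaces and allowed alien inputs recorded in the module metadata.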
Conditional Topology Search is described in greater detail below. In brief, this search step takes as input the dataset D associated with the composition task T and the candidate set M of ML modules, and searches, using the data set D for a possible topology comprising ML modules from the candidate set to solve the task. The search is constrained in that the possible topology must include the current version of the topology Q. Step 2 (steps 140, 240): Evolve the current topology by applying the Topology Grower unit 610:
Q ← TopologyGrower(D, S | Q)
Conditional Topology Growth or evolution is also described in greater detail below. In brief, this evolution step takes as input the dataset D associated with the composition task T and the current topology Q, selects a combination of one or more first order mutations of the current version topology Q to apply so as to result in a new current version of the topology Q. The evolution of the current version of the topology is constrained in that the first order mutation or mutations selected for application must be present in the most recently obtained possible version of the topology S. Step 3 (steps 150, 250): Evaluate loss between Y (set of outputs in the task dataset D) and its estimation
Ŷ (produced by the evolved current version of the topology Q):

ℓ = L(Y, Ŷ)
Step 4 (steps 160, 260): Repeat steps 1-3 until convergence of ℓ based on a user-defined criterion. Figure 7 illustrates process flow for the Evolutionary Topology Learner unit 604. Referring to Figure 7, Conditional Topology Search 608 takes as input the dataset D, the modules M of the candidate set, and the current (initiated) version of the topology Q. Conditional topology search 608 produces a possible topology S, which is used by the Conditional Topology Grower 610, together with the dataset D and the current version of the topology Q, to generate an evolved current version of the topology Q. This current version of the topology Q is then evaluated by the loss calculator 612 using the dataset D to produce a loss value ℓ. As discussed above, Conditional Topology Search is the process of identifying, from the candidate set of ML modules, a possible topology for the composition that includes the current version of the topology and minimizes a loss function associated with the task (steps 130, 230). The loss function may for example be the difference between outputs from the task dataset and predicted outputs from the possible topologies identified. The Conditional Topology Search unit takes as inputs: ● the dataset
D = {X, Y} associated with the task T, ● the ML Modules M selected by the Module Selector component
to form the candidate set, and ● the current topology Q. The search comprises, in the present implementation, a top-down (gradient-based) search in order to find a topology of the modules that minimizes the loss between outputs from the dataset and outputs produced by the possible topology. Let S be the possible topology identified by the Conditional Topology Search unit 608, that is:
ŶS = S(X)

where ŶS is the estimation, provided by the possible topology, of Y, the outputs in the dataset. The objective of the gradient-based search is to find the structure that minimizes the following loss:

S = argmin over S′ ∈ 𝕊 with Q ⊆ S′ of L(Y, ŶS′)
where 𝕊 is the set of all possible topologies. As discussed above, the search for a possible topology is constrained by the fact that a possible topology must include the current version of the topology Q, and must comprise only modules from the candidate set. The size and complexity of the possible topology are otherwise only restricted by the input spaces and allowed inputs of alien ML modules for the ML modules in the candidate set, and the requirement to minimize the above loss function. The possible topology may therefore comprise some or all of the modules in the candidate set. At the beginning of learning, when the size of the current topology Q is small (with the current version being initiated with just two ML modules, or the minimum number of required modules if this number is greater than two), the search space can be very large. However, as the size of Q grows during learning, the search space becomes progressively smaller. Figure 9 illustrates an example of a top-down, gradient-based search method. Referring to Figure 9, the left of the Figure illustrates a candidate set of 12 ML modules, represented as nodes, and an initiated current version of a topology for an ML composition. The initiated current version topology comprises ML modules 1 and 2, connected such that the output of ML module 1 is input to ML module 2. The possible topology search produces a possible topology that is illustrated on the right of Figure 9. The possible topology is required to include the current version topology, and may include up to all of the other ML modules in the candidate set. As can be seen from Figure 9, the example possible topology comprises, in addition to interconnected ML Modules 1 and 2 that are present in the current version topology, ML modules 5, 6, 3, 11, and 8, as well as the illustrated interconnections between them. 
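A much-simplified stand-in for the Conditional Topology Search may be sketched as follows: it scores candidate topologies exhaustively rather than by a gradient-based search, but respects the same constraint that any possible topology must include the current version Q; the loss function, module names and helper are illustrative only:

```python
from itertools import combinations

# Simplified (exhaustive rather than gradient-based) sketch of conditional
# topology search: among topologies that contain the current version Q,
# return the one minimizing a caller-supplied loss.
def conditional_search(q_modules, candidates, loss_fn):
    rest = [m for m in candidates if m not in q_modules]
    best, best_loss = set(q_modules), loss_fn(set(q_modules))
    for r in range(1, len(rest) + 1):
        for extra in combinations(rest, r):
            s = set(q_modules) | set(extra)   # constraint: Q is a subset of S
            cur = loss_fn(s)
            if cur < best_loss:
                best, best_loss = s, cur
    return best

# Toy loss: pretend the ideal composition is {M1, M2, M5}.
target = {"M1", "M2", "M5"}
loss_fn = lambda s: len(s ^ target)   # symmetric difference as a toy loss
s = conditional_search({"M1", "M2"}, ["M1", "M2", "M3", "M4", "M5"], loss_fn)
```

An exhaustive scan is of course only tractable for tiny candidate sets; it is shown here purely to make the search constraint concrete.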
Conditional Topology Growth, or evolution, is the process of adding at least one ML module and/or connection between ML modules to the current version of the topology (steps 140, 240). The current version of the topology thus grows incrementally, with each addition being drawn from the latest identified possible topology, as demonstrated below. The Conditional Topology Grower unit 610 takes as inputs:
● the dataset D associated with the task T∗,
● the output of the Conditional Topology Search unit, that is the possible topology S, and
● the current version of the topology Q
(as discussed above, the possible topology S must include the current version of the topology Q). The Conditional Topology Grower unit 610 then uses a technique inspired by neuroevolution (as disclosed in K. O. Stanley, J. Clune, J. Lehman, and R. Miikkulainen, ‘Designing neural networks through neuroevolution’, Nat Mach Intell, vol. 1, no. 1, pp. 24–35, Jan. 2019, doi: 10.1038/s42256-018-0006-z) in order to evolve the current topology Q. This process may be performed by dedicated logical units within the Topology Grower unit 610 as follows and as illustrated in Figure 8:
1. (Step 241) Generate first-order ML module mutations and weight mutations of the current version topology by applying the Mutation Generator unit 622.
2. (Step 246) Divide the resulting mutations into a number of species by applying the Species Generator unit 626.
3. (Steps 247, 243) Generate offspring per species by applying the Offspring Generator unit 624 to the mutations within each species.
4. (Step 248) Per species, select the single best-fit offspring by applying the Offspring Selector unit 628.
5. (Steps 249, 244) Across species, select the single best-fit offspring by applying the Offspring Selector unit 628 to the offspring representatives of each species.
The actions of each unit of the Conditional Topology Grower unit 610 are described below with reference to the process flow of Figure 8 and to Figures 10 to 16.

Mutation Generator unit 622 (Step 241)

The Mutation Generator unit 622 takes as input the current version topology Q and the possible topology S. It then generates all of the possible first-order ML module mutations and weight mutations, given Q and S. The lower bound on the number of mutations is one, and the upper bound depends on Q and S, as all of the mutations generated are required to be present in the possible topology S. An example of mutation generation is illustrated in Figure 10.
In the upper part of Figure 10, a possible topology S, including a current version topology Q, is illustrated. In the lower part of Figure 10, all of the possible first order weight and ML module mutations (referred to as node mutations) are illustrated. A first order mutation comprises a single change in the current version topology, i.e. an addition of an ML module to the current version of the topology or an addition of a connection between ML modules already present in the current version topology. As discussed above, the first order mutations are limited to be only those mutations that are present in the most recently identified possible topology.

Species Generator unit 626 (Step 246)

The Species Generator unit 626 clusters the mutations into a number of species, if this is advantageous given the number of mutations or some other characteristic of the mutations generated. Various techniques for clustering of the mutations may be envisaged, including, for example, measuring the structural similarities among mutations. In such an example, each species cluster would contain a number of mutations that share similar structures. Species generation is illustrated in Figure 11, in which the mutations illustrated in Figure 10 are clustered into two species: Species 1 and Species 2.

Offspring Generator unit 624 (Steps 243, 247)

The Offspring Generator unit 624 performs the task of cross over across arbitrary pairs of mutations belonging to the same species. The number of generated offspring is Combination(m, 2), which is the combination of the m mutations in a species taken two at a time without repetition. In some examples, it may be desirable to combine across groups of three or more mutations. Cross over combination and offspring generation are illustrated in Figures 12, 13 and 14. Figure 12 illustrates the three mutations of Species 1 from Figure 11.
In the lower part of Figure 12, each mutation is expressed as a candidate for cross over, comprising a series of connections between the ML modules present in the mutations of the species. In Figure 13, the three offspring that can be generated by cross over of the three candidates in Figure 12 are illustrated. The three offspring are the combinations of candidates (mutations) 1 and 2, candidates (mutations) 1 and 3, and candidates (mutations) 2 and 3. Each offspring comprises all of the ML modules and connections in each of the candidates generating the offspring. Figure 14 illustrates the offspring in graphical form.

Offspring Selector unit 628 (Steps 244, 248, 249)

The Offspring Selector unit 628 takes as inputs the candidate offspring, the possible topology S and the dataset D. The Offspring Selector unit 628 then selects the single best-fit offspring from the pool of offspring from a given species. The best-fit offspring is the offspring that results in the lowest loss:
$$\hat{Q} = \arg\min_{O_i} \mathcal{L}\left(Y, \hat{Y}_{O_i}\right)$$
where O_i is a candidate offspring from the pool of offspring of a given species. Offspring selection is illustrated in Figures 15 and 16. Following selection of a single offspring for each species, the Offspring Selector unit 628 then applies the same procedure to select a single best offspring from among the representatives of each species; the selected offspring will be applied to the current version of the topology and evaluated. The management node will then either return to conditional topology search and continue iterating until a termination condition is satisfied, or, if the termination condition is satisfied, output the current version topology as the ML composition for solving the composition task.

It will be appreciated that methods according to the present disclosure may be applied to a wide range of use cases within a communication network. Table 1 below illustrates two example use cases in a communication network, including sleeping cell prediction and network performance score (NPS) prediction. NPS prediction, for example, uses two ML modules: M4 (optimized for RAN throughput KPI degradation prediction) and M5 (optimized for transport layer latency prediction). The modules (rows in the dashboard) will be life cycle managed by a life cycle management node, as instructed by an owner of the modules, which may comprise a small team or a single domain expert. Each use case, comprising a specific composition task, will be life cycle managed by a use case owner, who may decide to re-run examples of the method disclosed herein, for example on the basis of newly available data, newly available ML modules, retrained or updated ML modules, etc.
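The grower procedure described above (steps 241 to 249) can be sketched in simplified form, with topologies represented as sets of directed connections between numbered modules. The speciation here uses a simple round-robin split rather than structural-similarity clustering, and the fitness function is a hypothetical stand-in for the task loss:

```python
from itertools import combinations

def first_order_mutations(q_nodes, q_edges, s_edges):
    """Step 241: each single connection of the possible topology S that
    can be added to the current topology Q - either between two modules
    already in Q, or bringing in exactly one new module."""
    return [{e} for e in sorted(s_edges - q_edges)
            if e[0] in q_nodes or e[1] in q_nodes]

def grow(q_nodes, q_edges, s_edges, fitness, n_species=2):
    muts = first_order_mutations(q_nodes, q_edges, s_edges)
    # Step 246: divide the mutations into species (round-robin split here).
    species = [muts[i::n_species] for i in range(n_species) if muts[i::n_species]]
    best_per_species = []
    for group in species:
        # Steps 247, 243: cross over every pair, giving Combination(m, 2)
        # offspring, each the union of its two parents' connections.
        offspring = [a | b for a, b in combinations(group, 2)] or group
        # Step 248: single best-fit offspring within the species.
        best_per_species.append(min(offspring, key=fitness))
    # Steps 249, 244: single best-fit offspring across species.
    return min(best_per_species, key=fitness)

# Hypothetical example: Q holds modules 1 and 2; S also allows 5 and 6.
q_nodes, q_edges = {1, 2}, {(1, 2)}
s_edges = {(1, 2), (1, 5), (5, 2), (2, 6)}
target = {(1, 5), (5, 2)}                  # connections the task actually needs
fitness = lambda o: len(o ^ target)        # stand-in for the task loss
best = grow(q_nodes, q_edges, s_edges, fitness)
```

The selected offspring (here the union of the two mutations that bring in module 5 and its connections) would then be applied to the current version of the topology before the next evaluation.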
A wide range of ML modules may be envisaged within a communication network, including for example:
- Downlink Signal to Interference and Noise Ratio (SINR) prediction in RAN
- Uplink SINR prediction in RAN
- Secondary carrier prediction in RAN
- Link adaptation recommender in RAN

Similarly, a wide range of composition tasks may be envisaged within a Core network, Backhaul network and RAN. Such tasks may be complex, and may evolve with time. Example tasks may include communication network resource management in a Core network, and radio resource management in a RAN.

Examples of the present disclosure provide methods and a management node that approach the task of using ML models for addressing a task through a modular design. ML modules are introduced that are specialized at solving specific tasks, and may comprise one or more ML models. Each ML module is life cycle managed and may be reused in any number of ML compositions for solving composition tasks, which may be complex tasks involving a large number of smaller tasks, and/or may be only very loosely connected or related to the task for which the ML module itself is specialized. In order to identify an ML composition for a new task, a set of candidate modules is selected, and a composition of these modules is learned from data, which composition can effectively solve the task. The composition itself can then offer insight into how it obtains the solution to the task, through the knowledge of which specialized ML modules are included in the composition, and how they are interconnected. The ML modules to include and their interconnections are learned in a data driven manner, but once established, the tasks at which these modules are specialized, and their interconnections, can offer conceptual insight into how the composition obtains a solution. Examples of the present disclosure thus offer an approach that facilitates interpretability of ML solutions and root cause analysis.
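The overall flow summarized above — initiate a small composition, then repeatedly search, grow and evaluate until a termination condition is met — can be sketched as a minimal greedy loop over the same kind of toy function modules. All values here are hypothetical, and the real method grows the topology via the mutation and offspring machinery described earlier rather than one module at a time:

```python
# Hypothetical stand-ins: modules are functions; a topology is a chain.
modules = {1: lambda x: x + 1, 2: lambda x: 2 * x,
           3: lambda x: x - 3, 4: lambda x: x ** 2}
dataset = [(x, ((x + 1) * 2) ** 2) for x in range(5)]

def loss(chain):
    def predict(x):
        for m in chain:
            x = modules[m](x)
        return x
    return sum((y - predict(x)) ** 2 for x, y in dataset)

current = [1, 2]                 # step 120: initiate with two modules
while True:                      # steps 130-160: iterate until termination
    candidates = [m for m in modules if m not in current]
    grown = min(([*current, m] for m in candidates),
                key=loss, default=current)
    if not candidates or loss(grown) >= loss(current):
        break                    # termination: no addition improves the loss
    current = grown              # step 140: evolve the current topology
print(current)                   # step 170: the resulting ML composition
```

On this toy dataset the loop adds module 4 to the initiated modules 1 and 2 and then terminates, since the remaining candidate module no longer reduces the loss.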
Methods according to the present disclosure also ensure flexibility and efficiency, allowing different ML modules to be used to the most beneficial effect for solving different tasks, with individual ML modules being life cycle managed by a small team or even a single domain expert, and ML compositions being life cycle managed by a use case owner. The modular approach also offers the possibility of orchestrating distributed training without exchanging weights and weight aggregation. The reusability of the individual ML modules contributes to reducing power consumption.

The methods of the present disclosure may be implemented in hardware, or as software modules running on one or more processors. The methods may also be carried out according to the instructions of a computer program, and the present disclosure also provides a computer readable medium having stored thereon a program for carrying out any of the methods described herein. A computer program embodying the disclosure may be stored on a computer readable medium, or it could, for example, be in the form of a signal such as a downloadable data signal provided from an Internet website, or it could be in any other form.

It should be noted that the above-mentioned examples illustrate rather than limit the disclosure, and that those skilled in the art will be able to design many alternative embodiments without departing from the scope of the appended claims. The word “comprising” does not exclude the presence of elements or steps other than those listed in a claim, “a” or “an” does not exclude a plurality, and a single processor or other unit may fulfil the functions of several units recited in the claims. Any reference signs in the claims shall not be construed so as to limit their scope.

Claims

1. A computer implemented method (100) for generating a Machine Learning, ML, composition that is optimized to perform a composition task in a communication network, wherein the ML composition comprises a plurality of interconnected ML modules, each ML module trained to perform a module task that is specific to the ML module, and wherein an ML module comprises at least one ML model, the method, performed by a management node, comprising: obtaining a candidate set of ML modules, each ML module in the candidate set being a candidate for inclusion in the composition (110); initiating a current version of a topology for the composition, the initiated current version topology comprising at least two ML modules from the candidate set and at least one connection between the ML modules (120); repeating, until a termination condition is satisfied (160), the steps of: identifying, from the candidate set of ML modules, a possible topology for the composition that includes the current version of the topology and minimizes a first loss function associated with the composition task (130); evolving the current version of the topology by adding at least one node or connection from the identified possible topology to the current version of the topology (140); and evaluating the current version of the topology using a second loss function associated with the composition task (150); wherein the ML composition optimized to perform the composition task comprises the ML modules and connections between ML modules present in the current version of the topology when the termination condition is satisfied (170).
2. A method as claimed in claim 1, wherein each ML module is associated with metadata comprising a specification of the input space of the ML module (210).
3. A method as claimed in claim 1 or 2, wherein obtaining a candidate set of ML modules, each ML module in the candidate set being a candidate for inclusion in the composition, comprises: obtaining the ML modules in the candidate set from available ML modules maintained by a life cycle management node (210).
4. A method as claimed in claim 2 or 3, wherein obtaining a candidate set of ML modules, each ML module in the candidate set being a candidate for inclusion in the composition, comprises: obtaining the metadata of available ML modules (210a); selecting candidate ML modules for inclusion in the candidate set based on the obtained metadata (210b); and obtaining the selected ML modules (210c).
5. A method as claimed in claim 4, wherein selecting candidate ML modules for inclusion in the candidate set based on the obtained metadata comprises selecting only modules for which the specified input space is a subset of the input space of the composition task (210b).
6. A method as claimed in any one of the preceding claims, wherein the composition task is associated with metadata describing the composition task, and wherein initiating a current version of a topology for the composition, the initiated current version topology comprising at least two ML modules from the candidate set and at least one connection between the ML modules, comprises: including in the initiated current version topology any ML modules indicated in the composition task metadata as being required for the composition task (220b).
7. A method as claimed in any one of the preceding claims, wherein initiating a current version of a topology for the composition, the initiated current version topology comprising at least two ML modules from the candidate set and at least one connection between the ML modules, comprises: randomly selecting two ML modules from the candidate set (220d); and randomly selecting, from among possible connections between the two selected ML modules, at least one connection for inclusion in the initiated current version of the topology (220e).
8. A method as claimed in any one of claims 2 to 7, wherein identifying, from the candidate set of ML modules, a possible topology for the composition that includes the current version of the topology and minimizes a first loss function associated with the composition task comprises identifying a topology in which ML modules of the candidate set are connected in a manner consistent with input and output spaces specified in their metadata (230a).
9. A method as claimed in any one of the preceding claims, wherein the composition task is associated with a dataset comprising input values and corresponding output values, and wherein the first loss function associated with the composition task comprises a loss function based on a difference between outputs from the composition task data set and outputs provided by the possible topology for the composition, given the same inputs from the composition task dataset (230b).
10. A method as claimed in any one of the preceding claims, wherein identifying, from the candidate set of ML modules, a possible topology for the composition that includes the current version of the topology and minimizes a first loss function associated with the composition task comprises performing a top-down search of possible topologies including the current version of the topology (230).
11. A method as claimed in any one of the preceding claims, wherein evolving the current version of the topology by adding at least one node or connection from the identified possible topology to the current version of the topology comprises: generating possible first order mutations of the current version of the topology, which first order mutations are included in the identified possible topology (241); generating combinations of the first order mutations, each combination comprising at least two mutations not present in any other combination (243); selecting a single evolution combination from among the generated combinations (244); and applying the selected evolution combination to the current version of the topology (245).
12. A method as claimed in claim 11, wherein generating possible first order mutations of the current version of the topology, which first order mutations are included in the identified possible topology, comprises: identifying ML modules that are not present in the current version of the topology that is included in the identified possible topology, but have a connection to the current version of the topology (241a); identifying connections between ML modules of the current version of the topology, which connections are present in the identified possible topology but are not present in the current version of the topology (241b); wherein each identified ML module and connection comprises a first order mutation of the current version of the topology.
13. A method as claimed in claim 11 or 12, wherein generating combinations of first order mutations, each combination comprising at least two mutations not present in any other combination generating offspring, comprises, for groups of mutations, adding the mutations in the group such that the combination comprises each first order mutation of the group (243a).
14. A method as claimed in any one of claims 11 to 13, wherein selecting a single evolution combination from among the generated combinations comprises: selecting as the evolution combination the generated combination that minimizes a third loss function associated with the composition task (244a).
15. A method as claimed in claim 14, wherein the composition task is associated with a dataset comprising input values and corresponding output values, and wherein the third loss function associated with the composition task comprises a loss function based on a difference between outputs from the composition task data set and outputs provided by the current version of the topology evolved to include a generated combination, given the same inputs from the composition task dataset (244b).
16. A method as claimed in any one of claims 11 to 13, wherein evolving the current version of the topology by adding at least one node or connection from the identified possible topology to the current version of the topology further comprises: dividing the generated first order mutations into species (246); and wherein generating combinations of the first order mutations, each combination comprising two mutations not present in any other combination, and selecting a single evolution combination from among the generated combinations, comprises: for each species: generating combinations of first order mutations in the species, each combination comprising at least two mutations not present in any other combination (247); and selecting a species combination from among the generated combinations of the species (248); and selecting a single evolution combination from among the selected species combinations (249).
17. A method as claimed in claim 16, wherein dividing the generated first order mutations into species comprises: clustering the first order mutations according to a clustering parameter (246).
18. A method as claimed in claim 16 or 17, wherein, for each species, selecting a species combination from among the generated combinations comprises: selecting as the species combination the generated combination that minimizes a loss function associated with the composition task (248).
19. A method as claimed in any one of claims 16 to 18, wherein selecting a single evolution combination from among the selected species combinations comprises: selecting as the evolution combination the species combination that minimizes a loss function associated with the composition task (249).
20. A method as claimed in any one of the preceding claims, wherein the composition task is associated with a dataset comprising input values and corresponding output values, and wherein the second loss function associated with the composition task comprises a loss function based on a difference between outputs from the composition task data set and outputs provided by the current version of the topology for the composition, given the same inputs from the composition task dataset (250a).
21. A method as claimed in any one of the preceding claims, wherein each ML module is subjected to retraining to optimize performance of its specific module task, and wherein retraining of individual ML modules is scheduled according to fulfillment of a retraining condition for the ML module (110a).
22. A method as claimed in any one of claims 2 to 21, wherein ML module metadata further comprises at least one of (210a): identification of the ML module task that the ML module is trained to perform; the number of inputs from other ML modules that can be accepted.
23. A computer program product comprising a computer readable medium, the computer readable medium having computer readable code embodied therein, the computer readable code being configured such that, on execution by a suitable computer or processor, the computer or processor is caused to perform a method as claimed in any one of claims 1 to 22.
24. A management node (300) for generating a Machine Learning, ML, composition that is optimized to perform a composition task in a communication network, wherein the ML composition comprises a plurality of interconnected ML modules, each ML module trained to perform a module task that is specific to the ML module, and wherein an ML module comprises at least one ML model, the management node comprising processing circuitry (302) configured to cause the management node to: obtain a candidate set of ML modules, each ML module in the candidate set being a candidate for inclusion in the composition; initiate a current version of a topology for the composition, the initiated current version topology comprising at least two ML modules from the candidate set and at least one connection between the ML modules; repeat, until a termination condition is satisfied, the steps of: identifying, from the candidate set of ML modules, a possible topology for the composition that includes the current version of the topology and minimizes a first loss function associated with the composition task; evolving the current version of the topology by adding at least one node or connection from the identified possible topology to the current version of the topology; and evaluating the current version of the topology using a second loss function associated with the composition task; wherein the ML composition optimized to perform the composition task comprises the ML modules and connections between ML modules present in the current version of the topology when the termination condition is satisfied.
25. A management node as claimed in claim 24, wherein the processing circuitry is further configured to cause the management node to carry out the steps of any one or more of claims 2 to 22.
26. A management node (400) for generating a Machine Learning, ML, composition that is optimized to perform a composition task in a communication network, wherein the ML composition comprises a plurality of interconnected ML modules, each ML module trained to perform a module task that is specific to the ML module, and wherein an ML module comprises at least one ML model, the management node comprising: an ML module selector unit (402) for obtaining a candidate set of ML modules, each ML module in the candidate set being a candidate for inclusion in the composition; and an evolutionary topology learner unit (404); the evolutionary topology learner unit comprising: an initiator unit (406) for initiating a current version of a topology for the composition, the initiated current version topology comprising at least two ML modules from the candidate set and at least one connection between the ML modules; and a topology search unit (408), a topology grower unit (410), and a loss calculator unit (412) for repeating, until a termination condition is satisfied, the steps of: identifying, by the topology search unit and from the candidate set of ML modules, a possible topology for the composition that includes the current version of the topology and minimizes a first loss function associated with the composition task; evolving, by the topology grower unit, the current version of the topology by adding at least one node or connection from the identified possible topology to the current version of the topology; and evaluating, by the loss calculator unit, the current version of the topology using a second loss function associated with the composition task; wherein the ML composition optimized to perform the composition task comprises the ML modules and connections between ML modules present in the current version of the topology when the termination condition is satisfied.
27. A management node as claimed in claim 26, wherein each ML module is associated with metadata comprising a specification of the input space of the ML module.
28. A management node as claimed in claim 26 or 27, wherein the ML module selector unit (402) is for obtaining a candidate set of ML modules, each ML module in the candidate set being a candidate for inclusion in the composition, by: obtaining the ML modules in the candidate set from available ML modules maintained by a life cycle management node.
29. A management node as claimed in claim 27 or 28, wherein the ML module selector unit (402) is for obtaining a candidate set of ML modules, each ML module in the candidate set being a candidate for inclusion in the composition, by: obtaining the metadata of available ML modules; selecting candidate ML modules for inclusion in the candidate set based on the obtained metadata; and obtaining the selected ML modules.
30. A management node as claimed in claim 29, wherein selecting candidate ML modules for inclusion in the candidate set based on the obtained metadata comprises selecting only modules for which the specified input space is a subset of the input space of the composition task.
31. A management node as claimed in any one of claims 26 to 30, wherein the composition task is associated with metadata describing the composition task, and wherein the initiator unit (406) is for initiating a current version of a topology for the composition, the initiated current version topology comprising at least two ML modules from the candidate set and at least one connection between the ML modules, by: including in the initiated current version topology any ML modules indicated in the composition task metadata as being required for the composition task.
32. A management node as claimed in any one of claims 26 to 31, wherein the initiator unit (406) is for initiating a current version of a topology for the composition, the initiated current version topology comprising at least two ML modules from the candidate set and at least one connection between the ML modules, by: randomly selecting two ML modules from the candidate set; and randomly selecting, from among possible connections between the two selected ML modules, at least one connection for inclusion in the initiated current version of the topology.
33. A management node as claimed in any one of claims 27 to 32, wherein the topology search unit (408) is for identifying, from the candidate set of ML modules, a possible topology for the composition that includes the current version of the topology and minimizes a first loss function associated with the composition task by identifying a topology in which ML modules of the candidate set are connected in a manner consistent with input and output spaces specified in their metadata.
34. A management node as claimed in any one of claims 26 to 33, wherein the composition task is associated with a dataset comprising input values and corresponding output values, and wherein the first loss function associated with the composition task comprises a loss function based on a difference between outputs from the composition task data set and outputs provided by the possible topology for the composition, given the same inputs from the composition task dataset.
35. A management node as claimed in any one of claims 26 to 34, wherein the topology search unit (408) is for identifying, from the candidate set of ML modules, a possible topology for the composition that includes the current version of the topology and minimizes a first loss function associated with the composition task by performing a top-down search of possible topologies including the current version of the topology.
36. A management node as claimed in any one of claims 26 to 35, wherein the topology grower unit (410) for evolving the current version of the topology by adding at least one node or connection from the identified possible topology to the current version of the topology comprises: a mutation generator unit (622) for generating possible first order mutations of the current version of the topology, which first order mutations are included in the identified possible topology; an offspring generator unit (624) for generating combinations of the first order mutations, each combination comprising at least two mutations not present in any other combination; and an offspring selector unit (628) for selecting a single evolution combination from among the generated combinations, and applying the selected evolution combination to the current version of the topology.
37. A management node as claimed in claim 36, wherein the mutation generator unit (622) is for generating possible first order mutations of the current version of the topology, which first order mutations are included in the identified possible topology, by: identifying ML modules that are not present in the current version of the topology that is included in the identified possible topology, but have a connection to the current version of the topology; identifying connections between ML modules of the current version of the topology, which connections are present in the identified possible topology but are not present in the current version of the topology; wherein each identified ML module and connection comprises a first order mutation of the current version of the topology.
38. A management node as claimed in claim 36 or 37, wherein the offspring generator unit (624) is for generating combinations of first order mutations, each combination comprising at least two mutations not present in any other combination generating offspring, by, for groups of mutations, adding the mutations in the group such that the combination comprises each first order mutation of the group.
39. A management node as claimed in any one of claims 36 to 38, wherein the offspring selector unit (628) is for selecting a single evolution combination from among the generated combinations by: selecting as the evolution combination the generated combination that minimizes a third loss function associated with the composition task.
40. A management node as claimed in claim 39, wherein the composition task is associated with a dataset comprising input values and corresponding output values, and wherein the third loss function associated with the composition task comprises a loss function based on a difference between outputs from the composition task data set and outputs provided by the current version of the topology evolved to include a generated combination, given the same inputs from the composition task dataset.
41. A management node as claimed in any one of claims 36 to 38, wherein the topology grower unit (610) for evolving the current version of the topology by adding at least one node or connection from the identified possible topology to the current version of the topology further comprises: a species generator unit (626) for dividing the generated first order mutations into species; and wherein the offspring generator unit (624) and the offspring selector unit (628) are for generating combinations of the first order mutations, each combination comprising two mutations not present in any other combination, and selecting a single evolution combination from among the generated combinations, by: for each species: generating combinations of first order mutations in the species, each combination comprising at least two mutations not present in any other combination; and selecting a species combination from among the generated combinations of the species; and selecting a single evolution combination from among the selected species combinations.
42. A management node as claimed in claim 41, wherein the species generator unit (626) is for dividing the generated first order mutations into species by: clustering the first order mutations according to a clustering parameter.
43. A management node as claimed in claim 41 or 42, wherein the offspring selector unit (628) is for selecting, for each species, a species combination from among the generated combinations by: selecting as the species combination the generated combination that minimizes a loss function associated with the composition task.
44. A management node as claimed in any one of claims 41 to 43, wherein the offspring selector unit (628) is for selecting a single evolution combination from among the selected species combinations by: selecting as the evolution combination the species combination that minimizes a loss function associated with the composition task.
45. A management node as claimed in any one of claims 26 to 44, wherein the composition task is associated with a dataset comprising input values and corresponding output values, and wherein the second loss function associated with the composition task comprises a loss function based on a difference between outputs from the composition task data set and outputs provided by the current version of the topology for the composition, given the same inputs from the composition task dataset.
46. A management node as claimed in any one of claims 26 to 45, wherein each ML module is subjected to retraining to optimize performance of its specific module task, and wherein retraining of individual ML modules is scheduled according to fulfillment of a retraining condition for the ML module.
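Claim 46 leaves the retraining condition open; one plausible example of such a condition (validation loss drifting above a threshold, or staleness of the last training), with a plain dict standing in for the ML module's metadata, is:

```python
import time

def due_for_retraining(module, loss_threshold=0.1, max_age_s=7 * 24 * 3600, now=None):
    """Illustrative retraining condition: schedule retraining of a module
    when its validation loss on the module task exceeds a threshold, or
    when its last training is older than max_age_s seconds."""
    now = time.time() if now is None else now
    return (module["validation_loss"] > loss_threshold
            or (now - module["trained_at"]) > max_age_s)
```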
47. A management node as claimed in any one of claims 27 to 46, wherein ML module metadata further comprises at least one of: identification of the ML module task that the ML module is trained to perform; the number of inputs from other ML modules that can be accepted.
PCT/IB2021/051576 2021-02-25 2021-02-25 Methods for generating a machine learning composition WO2022180421A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/IB2021/051576 WO2022180421A1 (en) 2021-02-25 2021-02-25 Methods for generating a machine learning composition
EP21709124.8A EP4298553A1 (en) 2021-02-25 2021-02-25 Methods for generating a machine learning composition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/IB2021/051576 WO2022180421A1 (en) 2021-02-25 2021-02-25 Methods for generating a machine learning composition

Publications (1)

Publication Number Publication Date
WO2022180421A1 (en)

Family

ID=74844958

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2021/051576 WO2022180421A1 (en) 2021-02-25 2021-02-25 Methods for generating a machine learning composition

Country Status (2)

Country Link
EP (1) EP4298553A1 (en)
WO (1) WO2022180421A1 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA3005241A1 (en) * 2017-05-19 2018-11-19 Salesforce.Com, Inc. Domain specific language for generation of recurrent neural network architectures

Patent Citations (2)

Publication number Priority date Publication date Assignee Title
CA3005241A1 (en) * 2017-05-19 2018-11-19 Salesforce.Com, Inc. Domain specific language for generation of recurrent neural network architectures
US20180336453A1 (en) * 2017-05-19 2018-11-22 Salesforce.Com, Inc. Domain specific language for generation of recurrent neural network architectures

Non-Patent Citations (2)

Title
K. O. STANLEY, J. CLUNE, J. LEHMAN, R. MIIKKULAINEN: "Designing neural networks through neuroevolution", NAT MACH INTELL, vol. 1, no. 1, January 2019 (2019-01-01), pages 24 - 35
KIRSCH, LOUIS, JULIUS KUNZE, DAVID BARBER: "Modular networks: Learning to decompose neural computation", ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS, 2018, pages 2408 - 2418

Also Published As

Publication number Publication date
EP4298553A1 (en) 2024-01-03


Legal Events

Code Description

121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21709124; Country of ref document: EP; Kind code of ref document: A1)

WWE Wipo information: entry into national phase (Ref document number: 2021709124; Country of ref document: EP)

NENP Non-entry into the national phase (Ref country code: DE)

ENP Entry into the national phase (Ref document number: 2021709124; Country of ref document: EP; Effective date: 20230925)