WO2020183136A1 - Methods of deploying machine learning models - Google Patents

Methods of deploying machine learning models

Info

Publication number
WO2020183136A1
Authority
WO
WIPO (PCT)
Prior art keywords
application
machine learning
environment
learning model
endpoint
Prior art date
Application number
PCT/GB2020/050546
Other languages
French (fr)
Inventor
Andrew Gray
Original Assignee
Kortical Ltd
Priority date
Filing date
Publication date
Application filed by Kortical Ltd filed Critical Kortical Ltd
Publication of WO2020183136A1 publication Critical patent/WO2020183136A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning

Definitions

  • the present disclosure relates to methods of deploying machine learning models.
  • machine learning models are initially refined through a "training" process, during which the models are fed with training data.
  • the models run predictions on the basis of this training data and the outputs of the models are assessed.
  • some form of feedback mechanism is provided to provide feedback to the model indicating the accuracy (or lack thereof) of the model's predictions.
  • a model may be sufficiently trained that it is desired to deploy the model to a particular application directed to a specific purpose.
  • Deployment of the model in an application typically enables the application to utilise the model to generate predictions for whatever purpose the application is directed to.
  • an application may be directed to optimising raw materials usage, improving manufacturing, providing for more efficient maintenance or enabling better healthcare provision. These improvements may be obtained through predictions based on historical data for one or more parties.
  • the application may use one or more machine learning models deployed in it to produce these and other similar predictions.
  • testing takes place in a number of other runtime environments.
  • the application and/or model will be promoted through the various environments which typically become more production-like, in other words more similar to the production environment, with each round of promotion.
  • Early stage testing may be carried out, for example, in a testing environment such as an integration environment.
  • testing of the model in another testing environment such as an acceptance or user acceptance testing (UAT) environment may be carried out.
  • the model is ready for use by end-users, for example companies, organisations and members of the public as opposed to just developers and programmers, the model is provided in an application in an environment, such as a production environment, which is accessible to these end-users. End-users may thereby access the application and utilise the application to run predictions based on the model.
  • Access to the application is typically provided via an endpoint.
  • a new endpoint provides access to the updated application, and this new endpoint must therefore be provided to the client-side application or device, to enable the user to access the updated application.
  • where the endpoint is a URL, the URL will change as a result of the application update and the new URL must be communicated to client-side devices.
  • when code changes are made, the full client-side application and model typically need to be tested. This means the development and data-science teams become fully dependent on each other. In practice this frequently means that the data-science team, who are responsible for creating the model, has to wait for the development team to complete their updates and testing, which can lead to significant delays and inefficiencies.
  • Another significant drawback of existing methods is that, even within a single environment, it is similarly complex to change the model being used by an application within the environment.
  • an application running in a particular environment may have a first machine learning model deployed therein. It may be desired to instead deploy a second machine learning model to the application in place of the first, for example because it has been discovered during training that the second machine learning model provides superior predictions.
  • Changing the machine learning model being used by the application involves a similar process as was described above in relation to promoting a model across environments. In particular, changes to the application code are typically required and the altered code needs to be tested, compiled and redeployed.
  • the endpoint used to access the application also needs to be changed and these changes need to be communicated to client-side devices, often alongside a client-side update.
  • the data-science team, who are responsible for creating the model, has to wait for the development team to complete their updates and testing, leading to significant delays and inefficiencies.
  • a machine learning model can learn to approximate the relationship between a set of input observations and their observed outcomes.
  • the machine learning model can be used to make predictions of the outcome of new input observations, the actual outcomes of which are unknown.
  • a machine learning model (or machine learning solution) comprises a series of algorithms and transformations which can learn to approximate the relationship between a set of input observations and their observed outcomes.
  • once any elements associated with the model are updated, the model is deemed to be a new model.
  • elements may include, for example, the training data used to train the model, the parameters which the model is configured to run predictions based on and/or the model configuration ("config").
  • a computer-implemented method of deploying a machine learning model to an application within an environment, wherein an environment is associated with an endpoint providing user access to an application within the environment, and wherein a first machine learning model is deployed in the application.
  • the method comprises, while maintaining the association between the environment and the endpoint, deploying a second machine learning model to the application by disassociating the application from the first machine learning model and associating the application with the second machine learning model.
  • in this way, when a new machine learning model (for example the second machine learning model) is deployed, user-access to the application within the environment can still be provided via the same endpoint, even after the machine learning model deployed in the application has changed.
  • a change in behaviour by the users of the endpoint after the change in model deployment is not required, and no updates or communication need to be sent advising of changes of connection details or endpoint details. Users may still access the application as before, using the same endpoint. Further, no other updates, whether to the server-side application code, client-side application code, the models or the environment are required.
  • associating an application with a machine learning model can be considered as enabling the application to use the machine learning model for one or more purposes, for example to run predictions.
  • Associating an application with a machine learning model can comprise loading the machine learning model into memory accessible by the application.
  • Associating an application with a machine learning model can comprise configuring the application to call the machine learning model.
  • Other ways in which use of a machine learning model by an application can be enabled will be apparent and fall within the scope of the present disclosure, including where this mechanism is distributed among multiple computers and endpoints.
  • disassociating an application from a machine learning model can be considered as removing the ability of an application to use the machine learning model for one or more purposes, for example to run predictions.
  • Disassociating an application from a machine learning model can comprise removing the machine learning model from memory accessible by the application.
  • Disassociating an application from a machine learning model can comprise configuring the application to not call the machine learning model.
  • Other ways in which use of a machine learning model by an application can be disabled will be apparent and fall within the scope of the present disclosure, including where this mechanism is distributed among multiple computers and endpoints.
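
As an illustration of the association mechanism just described, the following minimal Python sketch models an application holding (or dropping) an in-memory reference to a model. The class and method names are hypothetical, not taken from the patent:

```python
class Model:
    """Stand-in for a trained machine learning model (hypothetical)."""
    def __init__(self, name):
        self.name = name

    def predict(self, features):
        # A real model would run inference here; this is a placeholder.
        return {"model": self.name, "prediction": sum(features)}


class Application:
    """An application that uses whichever model is currently associated."""
    def __init__(self):
        self._model = None  # no model loaded yet

    def associate(self, model):
        # Load the model into memory accessible by the application.
        self._model = model

    def disassociate(self):
        # Remove the model from memory accessible by the application.
        self._model = None

    def predict(self, features):
        if self._model is None:
            raise RuntimeError("no model is deployed in this application")
        return self._model.predict(features)


app = Application()
app.associate(Model("model-1"))   # first model deployed
print(app.predict([1, 2, 3]))
app.disassociate()                # swap: drop the first model...
app.associate(Model("model-2"))   # ...and deploy the second
print(app.predict([1, 2, 3]))
```

Note that swapping models touches only the application's model reference; nothing about how the application is reached changes.
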
  • Deploying a new machine learning model (for example the second machine learning model) to the application may therefore be independent of any other updates made to the code of the application and/or the model and/or the environment.
  • models deployed in an application can be swapped seamlessly and without the procedural burden which exists in existing model deployment methods. This can have significant benefits in terms of enabling efficient testing and swapping of models deployed in an application within a particular environment.
  • An endpoint which can be associated with an environment and provide user access to an application within the environment, can comprise an endpoint ID.
  • the endpoint ID can optionally be a URL.
  • the endpoint ID may be unique. Other types of endpoint and endpoint IDs will be apparent and fall within the scope of the present disclosure.
  • Maintaining the association between an environment and an endpoint can comprise maintaining the endpoint ID, which can be considered as not changing the endpoint ID.
  • where the endpoint ID is a URL, maintaining the association between an environment and an endpoint can comprise not changing the URL.
  • the application can in this case be accessed via the same endpoint ID (for example the same URL), even after a new machine learning model has been deployed in the application. This means that a new machine learning model can be deployed without any need to communicate a new endpoint to client devices.
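
Continuing the sketch above, the stability of the endpoint can be pictured as a mapping from endpoint ID (here a URL string) to the application, which is left untouched when models are swapped. The URL below is a made-up placeholder:

```python
# Hypothetical endpoint registry: endpoint ID -> application.
# This mapping is maintained across model swaps.
endpoints = {"https://example.com/predict": app}

def handle_request(url, features):
    # Clients keep using the same URL before and after a model swap.
    return endpoints[url].predict(features)

app.disassociate()
app.associate(Model("model-3"))  # model changes; URL mapping untouched
print(handle_request("https://example.com/predict", [4, 5, 6]))
```
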
  • the computer-implemented method may further comprise, while the first machine learning model is deployed in the application, checking storage or memory to determine which machine learning model should be deployed in the application and determining from the storage or memory that a new machine learning model (for example the second machine learning model) should be deployed to the application. Deploying the new machine learning model to the application can be based on this determination.
  • the storage or memory may comprise a database.
  • the storage or memory may comprise an electronic file system.
  • the storage or memory may comprise random access memory (RAM).
  • the model deployment can be easily and simply updated by updating the storage or memory.
  • a deployment list or table stored at said storage or memory may be updated.
  • the machine learning model swap can be effected, in other words a new machine learning model can be deployed to the application.
  • the machine learning model deployed in the application may therefore be determined simply by updating the relevant information in storage or memory, and without requiring any direct communication with the application itself.
  • Said checking storage or memory to determine which machine learning model should be deployed in the application can occur periodically, for example each time a certain pre-determined time interval has elapsed. Alternatively or additionally, said checking storage or memory to determine which machine learning model should be deployed in the application can occur each time the application uses the machine learning model, for example each time the application calls the machine learning model.
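
One possible shape for such a deployment record, sketched here with SQLite purely for illustration (the table and column names are assumptions, not from the patent):

```python
import sqlite3

# Deployment table: which model should be deployed in which application.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE deployments (application TEXT PRIMARY KEY, model TEXT)")
db.execute("INSERT INTO deployments VALUES ('credit-scoring', 'model-1')")

def model_for(application_name):
    # Check storage to determine which model should be deployed.
    row = db.execute(
        "SELECT model FROM deployments WHERE application = ?",
        (application_name,),
    ).fetchone()
    return row[0] if row else None

print(model_for("credit-scoring"))                  # -> model-1
db.execute("UPDATE deployments SET model = 'model-2' "
           "WHERE application = 'credit-scoring'")  # effect the swap
print(model_for("credit-scoring"))                  # -> model-2
```

Updating a single row is all that is needed to change which model an application should use; no communication with the application itself is required.
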
  • one or more machine learning models can be configured to receive data for a set of parameters and determine the number of parameters contained in the set of parameters. If the number of parameters contained in the set of parameters is equal to the number of parameters which the respective machine learning model is configured to use data for, the respective machine learning model can be configured to extract data for each of the parameters of the set of parameters from the received data. If the number of parameters contained in the set of parameters is greater than the number of parameters which the respective machine learning model is configured to use data for, the respective machine learning model can be configured to extract data for as many parameters as the respective machine learning model is configured to use data for from the received data and not extract data for the additional parameters. The machine learning model may be configured to ignore data for the additional parameters. In either case, the model may then run predictions based on the extracted data.
  • the versatility of the models can be improved because the models can utilise received data to run predictions even when the received data contains data for more parameters than the models are respectively configured to use. This marks a departure from machine learning models used in existing systems, which lack this functionality and instead produce an error if they receive data containing data for more parameters than they are configured to use.
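
A minimal sketch of this parameter-handling behaviour, assuming for illustration that parameters arrive as a name-to-value mapping and that each model declares the parameter names it uses (both assumptions, not from the patent):

```python
def extract_supported(received, supported_params):
    """Keep data for supported parameters; silently ignore any extras."""
    if len(received) < len(supported_params):
        # Fewer parameters than the model needs: an error results.
        raise ValueError("received data is missing required parameters")
    # Assumes the supported parameter names are present in the received data.
    return {name: received[name] for name in supported_params}

data = {"age": 41, "income": 52000, "tenure": 3, "region": "N", "extra": 1}
print(extract_supported(data, ["age", "income", "tenure"]))
# -> {'age': 41, 'income': 52000, 'tenure': 3}; 'region'/'extra' ignored
```
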
  • the computer-implemented method may further comprise, while maintaining the association between the environment and the endpoint, deploying a third machine learning model to the application by disassociating the application from the second machine learning model and associating the application with the third machine learning model, and so on.
  • a computer-implemented method of deploying a machine learning model to an application within an environment, wherein a first environment is associated with a first endpoint, the first endpoint providing user access to a first application within the first environment; wherein a second environment is associated with a second endpoint, the second endpoint providing user access to a second application within the second environment; and wherein a machine learning model is deployed in the first application within the first environment.
  • the method comprises, while maintaining the association between the second environment and the second endpoint, deploying the machine learning model to the second application within the second environment by associating the machine learning model with the second application within the second environment.
  • in this way, a machine learning model previously deployed in an application in another environment (for example the first environment) can be deployed to an application in a new environment (for example the second environment), and user-access to the application in the new environment can still be provided via the same endpoint, even after the machine learning model has been deployed therein.
  • a change in behaviour by the end-user after the new model deployment is not required, and no updates or communication need to be sent to client-side devices. Users may still access the application in question as before, using the same endpoint. Further, no other updates, whether to the server-side application code, client-side application code, model or any of the environments are required.
  • Deploying the machine learning model in an application in a new environment may therefore be independent of any other updates made to the code of the application and/or the model and/or the environment.
  • machine learning models can be redeployed across a plurality of environments seamlessly and without the procedural burden which exists in existing model deployment methods. This can have significant benefits in terms of enabling efficient testing and "promotion" of models, which typically involves sequentially deploying a model to applications in new environments during a development cycle.
  • the method may further comprise disassociating a machine learning model from a first application prior to associating the machine learning model with a second application.
  • An endpoint which is associated with an environment (for example the second environment) and provides user access to an application (for example the second application) within the environment, can comprise an endpoint ID.
  • the endpoint ID can optionally be a URL.
  • the endpoint ID is typically unique. Other types of endpoint and endpoint IDs will be apparent and fall within the scope of the present disclosure.
  • Maintaining the association between an environment and an endpoint can comprise maintaining the endpoint ID, which can be considered as not changing the endpoint ID.
  • where the endpoint ID is a URL, maintaining the association between the second environment and the second endpoint can comprise not changing the URL.
  • the second application can in this case be accessed via the same endpoint ID (for example the same URL), even after a new machine learning model has been deployed in the second application. It is not required to send updates to client-side devices, and no change in behaviour by end-users is needed.
  • the computer-implemented method may further comprise, while maintaining the association between the second environment and the second endpoint, deploying another machine learning model to the second application within the second environment by disassociating the machine learning model from the second application within the second environment and associating the other machine learning model with the second application within the second environment.
  • a third environment may be associated with a third endpoint, the third endpoint providing user access to a third application within the third environment.
  • the method may further comprise, while maintaining the association between the third environment and the third endpoint, deploying the machine learning model to the third application within the third environment by associating the machine learning model with the third application within the third environment.
  • the first environment can be an integration environment.
  • the second environment can be a user acceptance testing (UAT) environment.
  • the third environment can be a production environment.
  • a machine learning model may be deployed consecutively to the integration, UAT and production environments respectively, for example as part of a development or deployment cycle for the model.
  • Sequentially deploying a machine learning model to a number of environments (for example from the integration environment to the UAT environment and then to the production environment) can be considered as "promoting" the machine learning model across the respective environments.
  • Each environment may be associated with a respective endpoint, wherein each respective endpoint may provide access to a respective application within the respective environment.
  • a computer program comprising computer-executable instructions which, when executed by one or more computers, cause the one or more computers to perform any of the methods described herein.
  • a computer system comprising one or more computers having a processor and memory, wherein the memory comprises computer-executable instructions which, when executed, cause the one or more computers to perform any of the methods described herein.
  • Figure 1 shows a method of deploying a new machine learning model to an application within an environment;
  • Figure 2 shows a method of determining which machine learning model should be deployed in an application;
  • Figures 3a and 3b schematically show a change of machine learning model deployment brought about by the methods of Figures 1 and 2;
  • Figure 4 shows a method of promoting a machine learning model across a plurality of environments;
  • Figures 5a-5c schematically show a promotion of a machine learning model across a plurality of environments brought about by the method of Figure 4;
  • Figure 6 shows the components of a computer that can be used to implement the methods described herein.
  • With reference to Figure 1, a method of deploying a new machine learning model to an application within an environment is described.
  • Figure 2 is used to describe a method of determining which machine learning model should be deployed in an application.
  • With reference to Figures 3a and 3b, a change of model deployment brought about by the methods of Figures 1 and 2 is described in more detail.
  • With reference to Figure 4, a method of promoting a machine learning model across a plurality of environments is described.
  • With reference to Figures 5a-c, a promotion of a machine learning model brought about by the method of Figure 4 is described.
  • With reference to Figure 6, the components of a computer that can be used to implement the methods described herein are described.
  • the methods disclosed herein relate generally to machine learning models, which can be deployed in applications such that the applications may use the machine learning models deployed therein to run predictions.
  • the applications populate runtime environments. Access to a respective application is provided via an endpoint associated with the environment in which the application is provided. For example, a particular application may be deployed in an integration environment, which is a testing environment used during development testing of applications, for example applications which use machine learning models to run predictions.
  • Access is provided to an application via an endpoint, and the endpoint is associated with the environment in which the application is provided.
  • the endpoint is a URL, and access to the application is provided via the URL. Therefore, in an example, a particular URL associated with the integration environment is entered at a client device (either automatically or by a user, optionally in a client-side application), and the user thereby gains access to the application provided within the integration environment. The user may then use the application within the integration environment to run predictions, based on whichever machine learning model is currently deployed in the application. It will be apparent that there are innumerable types of application that can make use of machine learning models, and the methods of the present disclosure can be used for applications related to any field.
  • the methods described herein can be used by any party to generate insights based on data such as historical data.
  • applications utilising machine learning models are commonly used in the field of manufacturing to optimise resource and raw material allocation. Maintenance predictions can also be provided using machine learning models, to improve and facilitate efficient upkeep of devices and machinery. Machine learning models can also be used in the areas of healthcare, for example to optimise the provision of treatments. Other potential uses of machine learning models in the banking, retail and marketing sectors will also be apparent.
  • Machine learning models are typically refined through a training process, as is known in the art. At some point, it may be discovered that a certain machine learning model performs particularly well for a specific type of prediction, and is therefore well suited for a particular application directed to that type of prediction. In this case, it may be desired to deploy the machine learning model to the application, in place of the machine learning model currently deployed in the application. As described in more detail above, existing methods for achieving this change of machine learning model are extremely time consuming and inefficient, typically requiring substantial re-coding of both server-side and client-side code and requiring a change of endpoint which must be communicated to the client device. The present disclosure provides methods for overcoming these problems, as will now be described in relation to Figure 1.
  • Figure 1 shows a method of deploying a new machine learning model to an application.
  • a first machine learning model deployed in the application is replaced by a second machine learning model. This is achieved without requiring any re-coding of the application code, without any updates needing to be sent to the client-side device and without any change in the endpoint which provides access to the application being required.
  • an environment is associated with an endpoint, which in this example is a URL.
  • An application operable to use machine learning models to run predictions is deployed in the environment, and the URL provides user access to the application.
  • the URL may be entered at a client-device, and the client device may then utilise the application.
  • a first machine learning model is deployed in the application, meaning that the first machine learning model is associated with the application.
  • the application is therefore able to access the first machine learning model and run machine learning predictions using the first machine learning model. Accordingly, when a user accesses the application and uses it to produce predictions, these predictions are produced using the first machine learning model.
  • a second machine learning model may be used to run the predictions. This may happen, for example, if it is discovered during training of the second machine learning model that it performs better than the first machine learning model when producing predictions of a specific kind.
  • the application and first machine learning model are disassociated, which in this example means that the first machine learning model is removed from memory accessible by the application.
  • the association between the environment and the endpoint is maintained during this process, which in this example means that the URL used to access the application in the environment does not change.
  • the same URL can be entered at the client-side device to gain access to the application even after the first machine learning model and the application have been disassociated.
  • the new, second machine learning model is deployed in the environment, meaning that the second machine learning model is associated with the application.
  • the application is therefore able to access the second machine learning model and run machine learning predictions using the second machine learning model. Accordingly, when a user now accesses the application and uses it to produce predictions, these predictions are produced using the second machine learning model.
  • the association between the environment and the endpoint is again maintained at block 108, which in this example means that the URL used to access the application in the environment again does not change.
  • the same URL can be entered at the client-side device to gain access to the application even after the second machine learning model and the application have been associated.
  • the second machine learning model is deployed in the application in place of the first machine learning model, and so the application can run predictions using the second machine learning model.
  • This swap of models is achieved without any need for the code of the application to be altered.
  • deploying the second machine learning model to the application is independent of any other updates made to the code of the application. No updates need to be sent to the client-side device. No changes need to be made to the environment or either model. Additionally, because the association between the environment and the URL has been maintained, the user can still access the application via the same URL even after the change of model deployed in the application.
  • the disclosed method therefore provides a simpler and more efficient method of machine learning model deployment than is currently available.
  • Figure 2 shows an exemplary method for determining which machine learning model should be deployed in an application.
  • the method of Figure 2 may therefore be readily incorporated into any of the other methods disclosed herein, for example the method just described in relation to Figure 1.
  • a machine learning model (for example the first machine learning model described in relation to Figure 1) is deployed in an application, and an association between the model and application is thereby maintained.
  • the association between the application and model means that the model is loaded to memory accessible by the application for running predictions.
  • it is determined whether a threshold time period has elapsed. This time period may be pre-determined for a given application, and may be consistent or may change dynamically based on one or more criteria. Use of a consistent time threshold means that the check of which model should be deployed in the application occurs periodically.
  • the method returns to block 202 and the association between the application and the current machine learning model continues. If it is determined that a threshold time has elapsed, the method moves to block 205.
  • a check is made in storage or memory to determine which machine learning model should be deployed in the application.
  • the check may be performed by the application, the environment or another element of the computational system.
  • the storage or memory comprises a database, which may be external or internal to the environment.
  • checking the database comprises checking a table of model association, which lists applications and indicates which machine learning models should be deployed in each application.
  • the method returns to block 202 and the association is maintained, in other words the same model remains deployed in the application. For example, if the first machine learning model is deployed in the application and the model association table stored in the database indicates that the first machine learning model is indeed the model which should be deployed in the application, then the method returns to block 202 and the first machine learning model remains deployed in the application.
  • the method moves to block 208. This may occur, for example, if a change has been made in the model association table stored in the database, such that the table now indicates that a different machine learning model (for example a second machine learning model) should be deployed in the application.
  • the correct machine learning model is deployed to the application, such that the correct model (for example the second machine learning model) is associated with the application.
  • the second machine learning model can be loaded to memory accessible by the application for running predictions, in place of the first machine learning model.
  • the method then returns to block 202, and the newly deployed machine learning model remains deployed in the application until such time as the check made at block 206 indicates that a different machine learning model should be deployed in the application instead.
  • the method shown in Figure 2 involves periodic checking based on a threshold time elapsing, this is not essential and the check of block 206 can alternatively or additionally be performed at other times, for example each time an application is instructed to run predictions using a machine learning model or each time an application calls a machine learning model. It will be appreciated that the method of Figure 2 is in its entirety optional. In other words, checking storage or memory to determine which machine learning model should be deployed in a particular application is not required. Alternatively or additionally, deployment of a new machine learning model may be effected by sending a message to the application, environment or another element of the computational system providing instructions regarding which machine learning model is to be deployed in the application.
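
Drawing these pieces together, a hedged sketch of the Figure 2 loop might look as follows, reusing the Application/Model and model_for sketches above; the interval value and model store are assumptions for illustration:

```python
import time

CHECK_INTERVAL = 60.0  # assumed threshold time period, in seconds
model_store = {"model-1": Model("model-1"), "model-2": Model("model-2")}

def run_deployment_loop(app, application_name):
    # Would normally run in a background thread alongside the application.
    while True:
        time.sleep(CHECK_INTERVAL)             # threshold time period elapses
        desired = model_for(application_name)  # check the deployment table
        current = app._model.name if app._model else None  # sketch-only access
        if desired is not None and desired != current:
            app.disassociate()                   # drop the old model...
            app.associate(model_store[desired])  # ...and deploy the desired one
        # otherwise the existing association is simply maintained
```

Equally, as the passage above notes, the same check could be performed on each prediction call instead of on a timer, or the swap could be driven by an explicit message rather than polling.
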
  • Figures 1 and 2 may be more fully understood with reference to Figures 3a and 3b, which schematically show a machine learning model deployed in an application being swapped for a new machine learning model in accordance with the methods of Figures 1 and 2.
  • the change of deployment shown in Figures 3a and 3b may alternatively be brought about by the method of Figure 1 alone.
  • Figure 3a shows a first machine learning model, labelled Model 1, deployed in an application 312.
  • the application 312 is provided within an environment 302, which in this example is an integration environment.
  • a client device 330 is able to access the application 312 via an endpoint 332 associated with the integration environment 302.
  • the endpoint 332 is a URL.
  • the client device 330 can access the first application 312. This is represented by the dashed arrow between the client device 330 and the endpoint 332.
  • a training module 320 is provided and comprises a training environment, wherein machine learning models are trained to output predictions through repeated iterative training cycles.
  • Various methods for training machine learning models are known and will be apparent, and these will not be described in further detail here. Any appropriate training method can be used to train the machine learning models of the present disclosure.
  • the machine learning model is provided from the training module 320 to a database 308. This is represented by the dashed arrow between the training module 320 and the database 308.
  • Providing the machine learning model to the database 308 can in some examples comprise persisting or streaming the model to the database 308.
  • the database 308 can then provide the machine learning model to the application 312. This is represented by the dashed arrow between the database 308 and the first machine learning model deployed in the application 312 in Figure 3a.
  • machine learning models are provided to the application 312 by loading the machine learning models into memory accessible by the application 312. Once loaded into memory accessible by the application 312, a machine learning model is associated with the application 312 and is thereby deployed in the application 312. The machine learning model can then be used by the application 312 to run predictions.
  • training module 320 and database 308 are optional, and machine learning models can be provided to the application 312 from any suitable source. The models do not need to have been trained in training module 320 prior to being deployed to the application 312. Where training module 320 is used, machine learning models can be provided to the application directly from training module 320, in other words database 308 may be omitted.
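
As one illustration of persisting a trained model to the database and later loading it into memory accessible by an application, the following sketch serialises the model object with pickle; the schema and function names are assumptions, and pickle is just one possible serialisation mechanism:

```python
import pickle
import sqlite3

# Model store standing in for database 308 (schema is an assumption).
model_db = sqlite3.connect(":memory:")
model_db.execute("CREATE TABLE models (name TEXT PRIMARY KEY, blob BLOB)")

def persist_model(model):
    # Training module -> database: store the serialised model.
    model_db.execute("INSERT OR REPLACE INTO models VALUES (?, ?)",
                     (model.name, pickle.dumps(model)))

def load_model(name):
    # Database -> application: deserialise into memory the application can use.
    (blob,) = model_db.execute(
        "SELECT blob FROM models WHERE name = ?", (name,)).fetchone()
    return pickle.loads(blob)

persist_model(Model("model-2"))
app.associate(load_model("model-2"))  # model now deployed in the application
```
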
  • the second machine learning model can be deployed in the application in place of the first machine learning model, in the manner described in further detail in relation to Figures 3a and 3b.
  • the first machine learning model deployed in the application 312 is disassociated from the application 312, which in this example comprises removing the first machine learning model from memory accessible by the application 312.
  • a new, second, machine learning model is then deployed to the application 312 instead.
  • deploying the second machine learning model to the application 312 comprises loading, from database 308, the second machine learning model to memory accessible by the application 312 such that the application is associated with the second machine learning model.
  • the new machine learning model is then deployed in the application, and any future predictions run by the application 312 will be run based on the new machine learning model.
  • Promoting a model is herein intended to mean deploying a model deployed in an application in an environment to a different application in a different environment.
  • promoting a model will occur during a development cycle for the model, as described above.
  • a model may be tested in an application in one or more testing environments before being promoted, in other words deployed, to a more production-like environment where the application in which the model is deployed is accessible by end-users such as companies, organisations and members of the public as opposed to just developers and programmers.
  • a model may initially be deployed in an application in an integration environment. Once developers have tested the application and stability has been confirmed, the model may be promoted, in other words deployed to, an application in a more production-like environment, such as UAT. Here, more stringent tests may be carried out by a higher number of developers or test users. Finally, once sufficient tests have been carried out and stability of the model confirmed in UAT, the model may be promoted to a yet more production-like environment, such as the production environment itself, where the application in which the model is deployed is accessible to all users.
  • this promotion path is merely exemplary, and other development processes and deployment sequences can be used. Further, while promotion typically implies deployment of the model to applications in sequentially more production-like environments, the methods of the present disclosure are not limited to this. Sequential deployments are possible in any sequence of environments, whether these environments are more or less production-like (or neither).
  • a first environment is associated with a first endpoint.
  • the first endpoint provides user access to a first application within the first environment, in the same way as was described in detail in relation to Figures 1-3.
  • a second environment is associated with a second endpoint. The second endpoint similarly provides user access to a second application within the second environment.
  • a machine learning model is deployed in the first application within the first environment.
  • deployment of the machine learning model in the first application means that the machine learning model is loaded to memory accessible by the first application.
  • the machine learning model is thereby associated with the first application, and can be used by the first application to run predictions.
  • the machine learning model is deployed to the second application within the second environment.
  • deployment of the machine learning model to the second application again comprises loading the machine learning model to memory accessible by the second application.
  • the machine learning model is thereby associated with the second application, and can be used by the second application to run predictions.
  • the association between the second environment and the second endpoint is maintained throughout this process.
  • the second application may therefore be accessed via the same endpoint as before, even after the machine learning model has been deployed to the second application.
  • the second endpoint is a URL. Therefore, the second application may be accessed by entry of the same URL at a client device as before.
  • the machine learning model is deployed in the second application, and so the second application can run predictions using the machine learning model.
  • This promotion of the model is achieved without any need for the code of either the first or second application to be altered, in other words promoting the machine learning model to the second application is independent of any other updates made to the code of the second application or the first application. No updates need to be sent to the client-side device. No changes need to be made to either the first or second environment or the model. Additionally, because the association between the second environment and the second endpoint has been maintained, the user can still access the second application via the same endpoint even after the model is deployed therein. A simpler and more efficient method of promoting a machine learning model than is currently available is therefore disclosed.
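
Under the deployment-table sketch introduced earlier, promotion can be pictured as a pure data change: copying the model's name into the row for the application in the next environment, leaving endpoints and application code untouched. The environment and application names below are invented for illustration:

```python
# One row per environment's application (names are assumptions).
db.executemany("INSERT INTO deployments VALUES (?, ?)", [
    ("app-integration", "model-1"),
    ("app-uat", None),
    ("app-production", None),
])

def promote(model_name, target_application):
    # Deploy the model to the application in the next environment while
    # maintaining that environment's endpoint association.
    db.execute("UPDATE deployments SET model = ? WHERE application = ?",
               (model_name, target_application))

promote("model-1", "app-uat")         # integration -> UAT
promote("model-1", "app-production")  # UAT -> production
```
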
  • Figure 5a shows the client device 330, integration environment 302, database 308 and training module 320 of Figures 3a and 3b.
  • an application 312 is provided in the integration environment 302.
  • two further environments are provided, each with an application provided therein.
  • the application 312 in the integration environment is therefore now a first application of a plurality of applications, and the integration environment 302 is a first environment of a plurality of environments.
  • a UAT environment 304 is now also shown, within which is provided a second application 314.
  • a production environment 306 is now also shown, within which is provided a third application 316.
  • the dashed arrows in Figures 5a-c represent equivalent relationships as in Figures 3a and 3b.
  • a client device 330 is able to access each of the first 312, second 314 and third 316 applications respectively via respective first 332, second 334 and third 336 endpoints.
  • the first endpoint 332 is associated with the integration environment 302.
  • the second endpoint 334 is associated with the UAT environment 304.
  • the third endpoint 336 is associated with the production environment 306.
  • each endpoint is a URL.
  • the client device 330 can access each of the respective first, second and third applications.
  • Machine learning models can be provided to each application via training module 320 and database 308 in the same manner as described in relation to Figures 3a and 3b. As in the case of Figures 3a and 3b, it will be apparent that the use of training module 320 and database 308 is optional in the arrangement of Figures 5a-c.
  • a machine learning model is deployed in the first application 312, labelled "Model 1".
  • no machine learning models are deployed in the second 314 or third 316 applications.
  • machine learning models are provided to the respective applications by loading the machine learning models into memory accessible by the applications. Once loaded into memory accessible by one of the respective applications, a machine learning model is associated with that application and is thereby deployed in the application. Once deployed, a machine learning model can be used by the respective application to run predictions. In the arrangement of Figure 5a, a machine learning model is deployed to the first application 312, and so the first application 312 can use the machine learning model to run predictions.
  • Testing of the first application 312 and model deployed therein can be carried out in the integration environment 302 to determine whether they are stable. At some point, sufficient testing may have been carried out and it may be desired to promote the model to a more production-like environment.
  • Promotion of the machine learning model to the next environment, in this example the UAT environment 304, can be achieved in accordance with the method of Figure 4.
  • the machine learning model is deployed to the second application 314 in the UAT environment 304 by associating the second application 314 with the machine learning model, such that the machine learning model can be used by the second application 314 to run predictions.
  • This promotion of the machine learning model to the UAT environment is shown in Figure 5b, which shows the same computational system as Figure 5a, however now the machine learning model is deployed in the second application 314 in the UAT environment 304.
  • Promotion of the machine learning model to the next environment can be achieved in accordance with the method of Figure 4 in the same manner as promoting the machine learning model from the integration environment to the UAT environment.
  • the machine learning model is deployed to the third application 316 in the production environment 306 by associating the third application 316 with the machine learning model, such that the machine learning model can be used by the third application 316 to run predictions.
  • the promotion of the machine learning model to the production environment is shown in Figure 5c, which shows the same computational system as Figure 5b, however now the machine learning model is deployed in the third application 316 in the production environment 306.
  • promotion of a model to the production environment marks the final step in the development cycle of the model, and the model can now be accessed and used to run predictions by end-users such as members of the public.
  • As described in detail in relation to Figure 4, the association between each environment and its respective endpoint is maintained throughout the model promotion process, as can be seen from Figures 5a-c where each endpoint provides client access to the same application within the same environment throughout.
  • the promotion of the model across the plurality of environments also does not require any changes to be made to any of the applications, environments, client device 330 or endpoints.
  • For simplicity, the functionality of Figure 2 was not referenced in the description of Figures 4 and 5a-c. However, it will be apparent that the functionality of Figure 2 can readily be applied to the method of Figure 4. For example, deployment of a model to a new environment can result from checking of a deployment table in database 308.
  • Figure 6 shows a schematic and simplified representation of a computer apparatus 600 which can be used to perform the methods described herein, either alone or in combination with other computer apparatuses.
  • the computer apparatus 600 comprises various data processing resources such as a processor 602 coupled to a central bus structure. Also connected to the bus structure are further data processing resources such as memory 604.
  • a display adapter 606 connects a display device 608 to the bus structure.
  • One or more user-input device adapters 610 connect a user-input device 612, such as a keyboard and/or a mouse, to the bus structure.
  • One or more communications adapters 614 are also connected to the bus structure to provide connections to other computer systems 600 and other networks.
  • the processor 602 of computer system 600 executes a computer program comprising computer-executable instructions that may be stored in memory 604.
  • the computer-executable instructions may cause the computer system 600 to perform one or more of the methods described herein.
  • the results of the processing performed may be displayed to a user via the display adapter 606 and display device 608.
  • User inputs for controlling the operation of the computer system 600 may be received via the user-input device adapters 610 from the user-input devices 612.
  • one or more components of the computer system 600 shown in Figure 6 may be absent in certain cases.
  • one or more of the plurality of computer apparatuses 600 may have no need for display adapter 606 or display device 608. This may be the case, for example, for particular server-side computer apparatuses 600 which are used only for their processing capabilities and do not need to display information to users.
  • user input device adapter 610 and user input device 612 may not be required.
  • at a minimum, the computer apparatus 600 comprises processor 602 and memory 604.
  • each of the machine learning models is configured to run predictions based on data for a given number of parameters. The number of parameters which each model is configured to utilise can vary from model to model.
  • a first model may be configured to run predictions based on data for five parameters, while a second model may be configured to run predictions based on data for ten parameters.
  • the machine learning models of the present disclosure are configured, in one example, to extract data for however many parameters they are configured to use data for, even if the data they receive contains data for more parameters. This marks a departure from machine learning models used in existing systems, which lack this functionality and instead produce an error if they receive data containing data for more parameters than they are configured to use.
  • a first machine learning model in this example is configured to run predictions based on data for five parameters.
  • Upon receiving data, the model is configured to check how many parameters the received data contains data for. If it is determined that the data contains data for fewer than five parameters, an error results. If it is determined that the data contains data for precisely five parameters, then the model extracts data for each of the five parameters and runs predictions based on this data. If it is determined that the data contains data for more than five parameters, the model extracts data for five parameters and ignores, in other words does not extract or process, the data for the remaining parameters. This is in contrast to existing machine learning models, which will output an error if they receive data containing data for more parameters than they are configured to use for predictions.
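
Reusing the extract_supported sketch from earlier, a short worked example of this five-parameter behaviour (the parameter names are invented):

```python
FIVE_PARAMS = ["p1", "p2", "p3", "p4", "p5"]  # hypothetical parameter names

seven = {f"p{i}": i for i in range(1, 8)}     # data for seven parameters
print(extract_supported(seven, FIVE_PARAMS))  # extras p6, p7 are ignored

three = {f"p{i}": i for i in range(1, 4)}     # data for only three parameters
try:
    extract_supported(three, FIVE_PARAMS)
except ValueError as e:
    print("error:", e)                        # too few parameters -> error
```
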
  • Figure 2 can also be combined with the methods of Figures 4 and 5a-c, namely a check can be made, for example periodically, to determine which model(s) should be deployed in which application(s), and promotion of the model to an application in a new environment can be based on this check.
  • This information may be stored in storage or memory, for example a database.
  • an application can be considered to mean software operable to run machine learning models to output predictions.
  • an application comprises a Kubernetes container or "pod" encapsulating code which provides the predict and explain functionality of the application, wherein the pod uses the machine learning model to perform these operations.
  • API (application programming interface)
  • the methods described above describe promoting a machine learning model across two or three environments, for example integration, UAT and production environments.
  • models can be deployed in any number of environments and in any order, as desired.
  • the labels used for the environments are entirely optional and are merely given as illustrative examples.
  • the methods disclosed herein can be used regardless of how many environments, machine learning models or applications are utilised and the type of environments, machine learning models or applications are used.
  • a computer program product or computer readable medium may comprise or store the computer executable instructions.
  • the computer program product or computer readable medium may comprise a hard disk drive, a flash memory, a read-only memory (ROM), a CD, a DVD, a cache, a random-access memory (RAM) and/or any other storage media in which information is stored for any duration (e.g., for extended time periods, permanently, brief instances, for temporarily buffering, and/or for caching of the information).
  • a computer program may comprise the computer executable instructions.
  • the computer readable medium may be a tangible or non-transitory computer readable medium.
  • the term "computer readable” encompasses "machine readable”.

Abstract

A computer-implemented method of deploying a machine learning model to an application within an environment is disclosed. An environment is associated with an endpoint providing user access to an application within the environment, and a first machine learning model is deployed in the application. The method comprises, while maintaining the association between the environment and the endpoint, deploying a second machine learning model to the application by disassociating the application from the first machine learning model and associating the application with the second machine learning model.

Description

Methods of deploying machine learning models
Technical Field
The present disclosure relates to methods of deploying machine learning models.
Background
Advances in computational capabilities have in recent years led to a rapid increase in the use and applicability of artificial intelligence and machine learning in a wide array of technical fields. As a result, the use of machine learning models to run predictions which aid in solving a variety of problems is now widespread.
Typically, machine learning models are initially refined through a "training" process, during which the models are fed with training data. The models run predictions on the basis of this training data and the outputs of the models are assessed. Typically, some form of feedback mechanism is provided to provide feedback to the model indicating the accuracy (or lack thereof) of the model's predictions. By repeating training cycles and providing continuous or periodic feedback in this manner, the predictive performance of the model is iteratively improved.
At a certain point, a model may be sufficiently trained that it is desired to deploy the model to a particular application directed to a specific purpose. Deployment of the model in an application typically enables the application to utilise the model to generate predictions for whatever purpose the application is directed to. For example, an application may be directed to optimising raw materials usage, improving manufacturing, providing for more efficient maintenance or enabling better healthcare provision. These improvements may be obtained through predictions based on historical data for one or more parties. The application may use one or more machine learning models deployed in it to produce these and other similar predictions.
Typically, before an application using a particular model is made accessible to end-users in a production environment, various rounds of testing of the model are first carried out, to test stability and performance. Typically, such testing takes place in a number of other runtime environments. The application and/or model will be promoted through the various environments which typically become more production-like, in other words more similar to the production environment, with each round of promotion. Early stage testing may be carried out, for example, in a testing environment such as an integration environment. Once stability of the model in the integration environment has been established, testing of the model in another testing environment, such as an acceptance or user acceptance testing (UAT) environment may be carried out. Typically, only developers and programmers have access to these types of testing environments. Once the model is ready for use by end-users, for example companies, organisations and members of the public as opposed to just developers and programmers, the model is provided in an application in an environment, such as a production environment, which is accessible to these end-users. End-users may thereby access the application and utilise the application to run predictions based on the model.
One significant drawback of existing methods of model deployment is that for each promotion of the model (for example from the integration environment to the UAT environment) the calling client-side application typically requires new application code and/or configuration ("config") to be written and tested, or libraries to be updated and code recompiled, before the model can be provided to the application in the new environment. This is a resource intensive process.
Access to the application is typically provided via an endpoint. On updating the application code, a new endpoint provides access to the updated application, and this new endpoint must therefore be provided to the client-side application or device, to enable the user to access the updated application. For example, where the endpoint is a URL, the URL will change as a result of the application update and the new URL must be communicated to client-side devices. Furthermore, when code changes are made, the full client-side application and model typically need to be tested. This means the development and data-science teams become fully dependent on each other. In practice this frequently means that the data-science team, who are responsible for creating the model, has to wait for the development team to complete their updates and testing, which can lead to significant delays and inefficiencies.
It will therefore be appreciated that promoting a model from one environment to another using existing methods requires a significant amount of time and code development, as well as communication with client-side devices.
Another significant drawback of existing methods is that, even within a single environment, it is similarly complex to change the model being used by an application within the environment. For example, an application running in a particular environment (for example the integration environment) may have a first machine learning model deployed therein. It may be desired to instead deploy a second machine learning model to the application in place of the first, for example because it has been discovered during training that the second machine learning model provides superior predictions. Changing the machine learning model being used by the application involves a process similar to that described above in relation to promoting a model across environments. In particular, changes to the application code are typically required and the altered code needs to be tested, compiled and redeployed. The endpoint used to access the application also needs to be changed and these changes need to be communicated to client-side devices, often alongside a client-side update. Once again, the data-science team, who are responsible for creating the model, have to wait for the development team to complete their updates and testing, leading to significant delays and inefficiencies.
It will therefore be appreciated that changing the model being used by an application in a particular environment using existing methods also requires a significant amount of time and code development, as well as communication with client-side devices.
As can be seen, significant drawbacks exist in existing model deployment methods. It would be advantageous to provide systems and methods which address one or more of these problems, in isolation or in combination.
Overview
This overview introduces concepts that are described in more detail in the detailed description. It should not be used to identify essential features of the claimed subject matter, nor to limit the scope of the claimed subject matter.
A machine learning model (or machine learning solution) can learn to approximate the relationship between a set of input observations and their observed outcomes. The machine learning model (or machine learning solution) can be used to make predictions of the outcome of new input observations, the actual outcomes of which are unknown. Typically, a machine learning model (or machine learning solution) comprises a series of algorithms and transformations which can learn to approximate the relationship between a set of input observations and their observed outcomes.
Typically, once any elements associated with the model are updated, the model is deemed to be a new model. Such elements may include, for example, the training data used to train the model, the parameters which the model is configured to run predictions based on and/or the model configuration ("config").
According to one aspect of the present disclosure, there is provided a computer-implemented method of deploying a machine learning model to an application within an environment, wherein an environment is associated with an endpoint providing user access to an application within the environment, and wherein a first machine learning model is deployed in the application. The method comprises, while maintaining the association between the environment and the endpoint, deploying a second machine learning model to the application by disassociating the application with the first machine learning model and associating the application with the second machine learning model.
By maintaining the association between the environment and the endpoint while a new machine learning model (for example the second machine learning model) is deployed to the application, user access to the application within the environment can still be provided via the same endpoint, even after the machine learning model deployed in the application has changed. Furthermore, a change in behaviour by the users of the endpoint after the change in model deployment is not required, and no updates or communication need to be sent advising of changes of connection details or endpoint details. Users may still access the application as before, using the same endpoint. Further, no other updates, whether to the server-side application code, client-side application code, the models or the environment are required.
As used in the present disclosure, associating an application with a machine learning model can be considered as enabling the application to use the machine learning model for one or more purposes, for example to run predictions. Associating an application with a machine learning model can comprise loading the machine learning model into memory accessible by the application. Associating an application with a machine learning model can comprise configuring the application to call the machine learning model. Other ways in which use of a machine learning model by an application can be enabled will be apparent and fall within the scope of the present disclosure, including where this mechanism is distributed among multiple computers and endpoints.
As used in the present disclosure, disassociating an application with a machine learning model can be considered as removing the ability of an application to use the machine learning model for one or more purposes, for example to run predictions. Disassociating an application with a machine learning model can comprise removing the machine learning model from memory accessible by the application. Disassociating an application with a machine learning model can comprise configuring the application to not call the machine learning model. Other ways in which use of a machine learning model by an application can be disabled will be apparent and fall within the scope of the present disclosure, including where this mechanism is distributed among multiple computers and endpoints.

Deploying a new machine learning model (for example the second machine learning model) to the application may therefore be independent of any other updates made to the code of the application and/or the model and/or the environment. As a result, models deployed in an application can be swapped seamlessly and without the procedural burden which exists in existing model deployment methods. This can have significant benefits in terms of enabling efficient testing and swapping of models deployed in an application within a particular environment.
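Purely as an illustrative, non-limiting sketch, the associate/disassociate mechanics described above might be rendered in code as follows. The class and method names (Application, associate, disassociate, swap_model) are invented for illustration and do not form part of the disclosure; assigning and clearing the model reference stands in for loading the model into, and removing it from, memory accessible by the application.

    import threading

    class Application:
        """Hypothetical server-side application holding at most one
        machine learning model in memory at a time."""

        def __init__(self):
            self._model = None               # model currently associated, if any
            self._lock = threading.Lock()    # guards against mid-swap predictions

        def associate(self, model):
            # Loading a model into memory accessible by the application is
            # one way of associating the model with the application.
            with self._lock:
                self._model = model

        def disassociate(self):
            # Removing the model from accessible memory disassociates it.
            with self._lock:
                self._model = None

        def swap_model(self, new_model):
            # Deploy a second model by disassociating the first and
            # associating the second; the endpoint is never touched.
            with self._lock:
                self._model = new_model

        def predict(self, features):
            with self._lock:
                if self._model is None:
                    raise RuntimeError("no model is deployed in this application")
                return self._model.predict(features)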
An endpoint, which can be associated with an environment and provide user access to an application within the environment, can comprise an endpoint ID. The endpoint ID can optionally be a URL. The endpoint ID may be unique. Other types of endpoint and endpoint IDs will be apparent and fall within the scope of the present disclosure. Maintaining the association between an environment and an endpoint can comprise maintaining the endpoint ID, which can be considered as not changing the endpoint ID. For example, where the endpoint ID is a URL, maintaining the association between an environment and an endpoint can comprise not changing the URL. The application can in this case be accessed via the same endpoint ID (for example the same URL), even after a new machine learning model has been deployed in the application. This means that a new machine learning model can be deployed without any need to communicate a new endpoint to client devices.
The computer-implemented method may further comprise, while the first machine learning model is deployed in the application, checking storage or memory to determine which machine learning model should be deployed in the application and determining from the storage or memory that a new machine learning model (for example the second machine learning model) should be deployed to the application. Deploying the new machine learning model to the application can be based on this determination. The storage or memory may comprise a database. The storage or memory may comprise an electronic file system. The storage or memory may comprise random access memory (RAM).
By checking storage or memory to determine which machine learning model should be deployed in the application, the model deployment can be easily and simply updated by updating the storage or memory. For example, a deployment list or table stored at said storage or memory may be updated. When the updated list or table is next checked, the machine learning model swap can be effected, in other words a new machine learning model can be deployed to the application. The machine learning model deployed in the application may therefore be determined simply by updating the relevant information in storage or memory, and without requiring any direct communication with the application itself. Said checking storage or memory to determine which machine learning model should be deployed in the application can occur periodically, for example each time a certain pre-determined time interval has elapsed. Alternatively or additionally, said checking storage or memory to determine which machine learning model should be deployed in the application can occur each time the application uses the machine learning model, for example each time the application calls the machine learning model.
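As a further non-limiting sketch, the storage-based determination might be implemented with a simple model-association table; the SQLite store and the table and column names used here are assumptions for illustration only. Note that the swap is effected purely by updating storage, without any direct communication with the application itself.

    import sqlite3

    # Hypothetical model-association table: one row per application,
    # naming the model that should currently be deployed in it.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE model_association (app_id TEXT PRIMARY KEY, model_id TEXT)"
    )
    conn.execute("INSERT INTO model_association VALUES ('app-312', 'model-1')")
    conn.commit()

    def desired_model(app_id):
        # The application (or another element of the system) checks this
        # table to learn which model should be deployed in it.
        row = conn.execute(
            "SELECT model_id FROM model_association WHERE app_id = ?", (app_id,)
        ).fetchone()
        return row[0] if row else None

    # A model swap is effected simply by updating storage; no change to
    # application code, endpoint or environment is needed.
    conn.execute(
        "UPDATE model_association SET model_id = 'model-2' WHERE app_id = 'app-312'"
    )
    conn.commit()
    assert desired_model("app-312") == "model-2"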
In existing systems, machine learning models are configured to run predictions based on data for a specific number of parameters. In contrast, in the present disclosure one or more machine learning models can be configured to receive data for a set of parameters and determine the number of parameters contained in the set of parameters. If the number of parameters contained in the set of parameters is equal to the number of parameters which the respective machine learning model is configured to use data for, the respective machine learning model can be configured to extract data for each of the parameters of the set of parameters from the received data. If the number of parameters contained in the set of parameters is greater than the number of parameters which the respective machine learning model is configured to use data for, the respective machine learning model can be configured to extract data for as many parameters as the respective machine learning model is configured to use data for from the received data and not extract data for the additional parameters. The machine learning model may be configured to ignore data for the additional parameters. In either case, the model may then run predictions based on the extracted data.
By configuring one or more of the machine learning models to extract the required data in this manner, the versatility of the models can be improved because the models can utilise received data to run predictions even when the received data contains data for more parameters than the models are respectively configured to use. This marks a departure from machine learning models used in existing systems, which lack this functionality and instead produce an error if they receive data containing data for more parameters than they are configured to use.
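The extraction behaviour might be sketched as follows; matching parameters by name, rather than by position, is an assumption made for illustration, as the disclosure speaks only of numbers of parameters.

    def extract_parameters(received, expected_params):
        """Extract data for exactly the parameters a model is configured
        to use; `received` maps parameter names to values."""
        if len(received) < len(expected_params):
            # Data for fewer parameters than the model needs: cannot predict.
            raise ValueError("received data for too few parameters")
        # Data for the expected number of parameters, or more: take what
        # the model needs and ignore any additional parameters.
        return {name: received[name] for name in expected_params}

    # A model configured for five parameters accepts data for seven,
    # silently ignoring the two extras instead of raising an error.
    expected = ["p1", "p2", "p3", "p4", "p5"]
    received = {f"p{i}": float(i) for i in range(1, 8)}   # seven parameters
    print(extract_parameters(received, expected))         # data for p1..p5 only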
It will be appreciated that, while the foregoing describes a method using two machine learning models, in other words deploying a second machine learning model in place of a first machine learning model, this is merely exemplary and any number of machine learning models can be used and deployed. Accordingly, the computer-implemented method may further comprise, while maintaining the association between the environment and the endpoint, deploying a third machine learning model to the application by disassociating the application with the second machine learning model and associating the application with the third machine learning model, and so on.
According to another aspect of the present disclosure, there is provided a computer-implemented method of deploying a machine learning model to an application within an environment, wherein a first environment is associated with a first endpoint, the first endpoint providing user access to a first application within the first environment; wherein a second environment is associated with a second endpoint, the second endpoint providing user access to a second application within the second environment; and wherein a machine learning model is deployed in the first application within the first environment. The method comprises, while maintaining the association between the second environment and the second endpoint, deploying the machine learning model to the second application within the second environment by associating the machine learning model with the second application within the second environment.
By maintaining the association between a new environment (for example the second environment) and an endpoint while a machine learning model previously deployed in an application in another environment (for example the first environment) is deployed to an application in the new environment, user-access to the application in the new environment can still be provided via the same endpoint, even after the machine learning model has been deployed therein. Furthermore, a change in behaviour by the end-user after the new model deployment is not required, and no updates or communication need to be sent to client-side devices. Users may still access the application in question as before, using the same endpoint. Further, no other updates, whether to the server-side application code, client-side application code, model or any of the environments are required. Deploying the machine learning model in an application in a new environment may therefore be independent of any other updates made to the code of the application and/or the model and/or the environment. As a result, machine learning models can be redeployed across a plurality of environments seamlessly and without the procedural burden which exists in existing model deployment methods. This can have significant benefits in terms of enabling efficient testing and "promotion" of models, which typically involves sequentially deploying a model to applications in new environments during a development cycle.
The method may further comprise disassociating a machine learning model from a first application prior to associating the machine learning model with a second application.

An endpoint (for example the second endpoint), which is associated with an environment (for example the second environment) and provides user access to an application (for example the second application) within the environment, can comprise an endpoint ID. The endpoint ID can optionally be a URL. The endpoint ID is typically unique. Other types of endpoint and endpoint IDs will be apparent and fall within the scope of the present disclosure. Maintaining the association between an environment and an endpoint can comprise maintaining the endpoint ID, which can be considered as not changing the endpoint ID. For example, where the endpoint ID is a URL, maintaining the association between the second environment and the second endpoint can comprise not changing the URL. The second application can in this case be accessed via the same endpoint ID (for example the same URL), even after a new machine learning model has been deployed in the second application. It is not required to send updates to client-side devices, and no new endpoint needs to be communicated.
The computer-implemented method may further comprise, while maintaining the association between the second environment and the second endpoint, deploying another machine learning model to the second application within the second environment by disassociating the machine learning model with the second application within the second environment and associating the other machine learning model with the second application within the second environment. The various advantages of maintaining the association between an environment and an endpoint while changing the model deployed in an application in the environment have already been described in detail above.
A third environment may be associated with a third endpoint, the third endpoint providing user access to a third application within the third environment. The method may further comprise, while maintaining the association between the third environment and the third endpoint, deploying the machine learning model to the third application within the third environment by associating the machine learning model with the third application within the third environment. The first environment can be an integration environment. The second environment can be a user acceptance testing (UAT) environment. The third environment can be a production environment. A machine learning model may be deployed consecutively to the integration, UAT and production environments respectively, for example as part of a development or deployment cycle for the model. Sequentially deploying a machine learning model to a number of environments (for example from the integration environment to the UAT environment and then to the production environment) in this manner can be considered as "promoting" the machine learning model across the respective environments. It will be appreciated that, while the foregoing describes the use of one, two or three environments, this is merely exemplary and any number of environments can be provided. Each environment may be associated with a respective endpoint, wherein each respective endpoint may provide access to a respective application within the respective environment. The association between each environment and its respective endpoint may be maintained while a machine learning model deployed in the application within the respective environment is changed.
It will further be appreciated that, while the foregoing describes one or two machine learning models, this is merely exemplary and any number of machine learning models can be provided and deployed to any number of applications.
According to another aspect of the present disclosure, there is provided a computer program comprising computer-executable instructions which, when executed by one or more computers, cause the one or more computers to perform any of the methods described herein.
According to another aspect of the present disclosure, there is provided a computer system comprising one or more computers having a processor and memory, wherein the memory comprises computer-executable instructions which, when executed, cause the one or more computers to perform any of the methods described herein.
Brief Description of the Figures
Illustrative implementations of the present disclosure will now be described, by way of example only, with reference to the drawings. In the drawings:
Figure 1 shows a method of deploying a new machine learning model to an application within an environment;
Figure 2 shows a method of determining which machine learning model should be deployed in an application;
Figures 3a and 3b schematically show a change of machine learning model deployment brought about by the methods of Figures 1 and 2;
Figure 4 shows a method of promoting a machine learning model across a plurality of environments;

Figures 5a-5c schematically show a promotion of a machine learning model across a plurality of environments brought about by the method of Figure 4; and
Figure 6 shows the components of a computer that can be used to implement the methods described herein.
Throughout the description and the drawings, like reference numerals refer to like features.
Detailed description
This detailed description describes, with reference to Figure 1, a method of deploying a new machine learning model to an application within an environment. Figure 2 is used to describe a method of determining which machine learning model should be deployed in an application. With reference to Figures 3a and 3b a change of model deployment brought about by the methods of Figures 1 and 2 is described in more detail. With reference to Figure 4, a method of promoting a machine learning model across a plurality of environments is described. With reference to Figures 5a-c, a promotion of a machine learning model brought about by the method of Figure 4 is described. Finally, with reference to Figure 6, the components of a computer that can be used to implement the methods described herein are described.
The methods disclosed herein relate generally to machine learning models, which can be deployed in applications such that the applications may use the machine learning models deployed therein to run predictions. The applications populate runtime environments. Access to a respective application is provided via an endpoint associated with the environment in which the application is provided. For example, a particular application may be deployed in an integration environment, which is a testing environment used during development testing of applications, for example applications which use machine learning models to run predictions.
In the present example, the endpoint is a URL, and access to the application is provided via the URL. Therefore, in an example, a particular URL associated with the integration environment is entered at a client device (either automatically or by a user, optionally in a client-side application), and the user thereby gains access to the application provided within the integration environment. The user may then use the application within the integration environment to run predictions, based on whichever machine learning model is currently deployed in the application.

It will be apparent that there are innumerable types of application that can make use of machine learning models, and the methods of the present disclosure can be used for applications related to any field. The methods described herein can be used by any party to generate insights based on data such as historical data. Merely as an example, applications utilising machine learning models are commonly used in the field of manufacturing to optimise resource and raw material allocation. Maintenance predictions can also be provided using machine learning models, to improve and facilitate efficient upkeep of devices and machinery. Machine learning models can also be used in the area of healthcare, for example to optimise the provision of treatments. Other potential uses of machine learning models in the banking, retail and marketing sectors will also be apparent.
Machine learning models are typically refined through a training process, as is known in the art. At some point, it may be discovered that a certain machine learning model performs particularly well for a specific type of prediction, and is therefore well suited for a particular application directed to that type of prediction. In this case, it may be desired to deploy the machine learning model to the application, in place of the machine learning model currently deployed in the application. As described in more detail above, existing methods for achieving this change of machine learning model are extremely time consuming and inefficient, typically requiring substantial re-coding of both server-side and client-side code and requiring a change of endpoint which must be communicated to the client device. The present disclosure provides methods for overcoming these problems, as will now be described in relation to Figure 1.
Figure 1 shows a method of deploying a new machine learning model to an application. In particular, a first machine learning model deployed in the application is replaced by a second machine learning model. This is achieved without requiring any re-coding of the application code, without any updates needing to be sent to the client-side device and without any change in the endpoint which provides access to the application being required.
At block 102, an environment is associated with an endpoint, which in this example is a URL. An application operable to use machine learning models to run predictions is deployed in the environment, and the URL provides user access to the application. For example, the URL may be entered at a client-device, and the client device may then utilise the application.
At block 104, a first machine learning model is deployed in the application, meaning that the first machine learning model is associated with the application. In this example this means that the first machine learning model is loaded to memory accessible by the application. The application is therefore able to access the first machine learning model and run machine learning predictions using the first machine learning model. Accordingly, when a user accesses the application and uses it to produce predictions, these predictions are produced using the first machine learning model.
At some point it may be desired to instead use a second machine learning model to run the predictions. This may happen, for example, if it is discovered during training of the second machine learning model that it performs better than the first machine learning model when producing predictions of a specific kind.
As a result, at block 106, the application and first machine learning model are disassociated, which in this example means that the first machine learning model is removed from memory accessible by the application. The association between the environment and the endpoint is maintained during this process, which in this example means that the URL used to access the application in the environment does not change. As a result, the same URL can be entered at the client-side device to gain access to the application even after the first machine learning model and the application have been disassociated.
At block 108, the new, second machine learning model is deployed in the environment, meaning that the second machine learning model is associated with the application. Again, in this example this means that the second machine learning model is loaded to memory accessible by the application. The application is therefore able to access the second machine learning model and run machine learning predictions using the second machine learning model. Accordingly, when a user now accesses the application and uses it to produce predictions, these predictions are produced using the second machine learning model.
The association between the environment and the endpoint is again maintained at block 108, which in this example means that the URL used to access the application in the environment again does not change. As a result, the same URL can be entered at the client-side device to gain access to the application even after the second machine learning model and the application have been associated.
At the end of the process of Figure 1, the second machine learning model is deployed in the application in place of the first machine learning model, and so the application can run predictions using the second machine learning model. This swap of models is achieved without any need for the code of the application to be altered. In other words, deploying the second machine learning model to the application is independent of any other updates made to the code of the application. No updates need to be sent to the client-side device. No changes need to be made to the environment or either model. Additionally, because the association between the environment and the URL has been maintained, the user can still access the application via the same URL even after the change of model deployed in the application. The disclosed method therefore provides a simpler and more efficient method of machine learning model deployment than is currently available.
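To make the example of Figure 1 concrete, the following non-limiting sketch exposes an application at a fixed URL using the Flask web framework (itself an assumption, as the disclosure does not name one); the route, and therefore the endpoint communicated to client devices, never changes when the model behind it is swapped.

    from flask import Flask, jsonify, request

    app = Flask(__name__)

    # Mutable holder for whichever model is currently deployed; swapping
    # its contents changes the model without changing the route below.
    deployed = {"model": None}

    @app.route("/integration/predict", methods=["POST"])
    def predict():
        # This URL never changes across model swaps, so the client-side
        # device keeps using the same endpoint throughout.
        model = deployed["model"]
        if model is None:
            return jsonify(error="no model deployed"), 503
        return jsonify(prediction=model.predict(request.get_json()))

    def deploy(new_model):
        # Disassociate the old model and associate the new one; no other
        # application code is touched.
        deployed["model"] = new_model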
Figure 2 shows an exemplary method for determining which machine learning model should be deployed in an application. The method of Figure 2 may therefore be readily incorporated into any of the other methods disclosed herein, for example the method just described in relation to Figure 1.
At block 202, a machine learning model, for example the first machine learning model described in relation to Figure 1, is deployed in an application. An association between the model and application is thereby maintained. In this example, as in the case of Figure 1, the association between the application and model means that the model is loaded to memory accessible by the application for running predictions.
At block 204, it is determined whether a threshold time period has elapsed. This time period may be pre-determined for a given application, and may be consistent or may change dynamically based on one or more criteria. Use of a consistent time threshold means that the check of which model should be deployed in the application occurs periodically.
If it is determined that a threshold time has not yet elapsed, the method returns to block 202 and the association between the application and the current machine learning model continues. If it is determined that a threshold time has elapsed, the method moves to block 206.
At block 206, a check is made in storage or memory to determine which machine learning model should be deployed in the application. The check may be performed by the application, the environment or another element of the computational system. In this example the storage or memory comprises a database, which may be external or internal to the environment. In this example, checking the database comprises checking a table of model association, which lists applications and indicates which machine learning models should be deployed in each application.
If it is determined at block 206 that the current association for a particular application is correct, in other words that the correct machine learning model is deployed in the application, then the method returns to block 202 and the association is maintained, in other words the same model remains deployed in the application. For example, if the first machine learning model is deployed in the application and the model association table stored in the database indicates that the first machine learning model is indeed the model which should be deployed in the application, then the method returns to block 202 and the first machine learning model remains deployed in the application.
If, on the other hand, it is determined at block 206 that the current association is incorrect, in other words that the correct machine learning model is not deployed in the application, then the method moves to block 208. This may occur, for example, if a change has been made in the model association table stored in the database, such that the table now indicates that a different machine learning model (for example a second machine learning model) should be deployed in the application.
At block 208, the correct machine learning model is deployed to the application, such that the correct model (for example the second machine learning model), is associated with the application. For example, the second machine learning model can be loaded to memory accessible by the application for running predictions, in place of the first machine learning model.
The method then returns to block 202, and the newly deployed machine learning model remains deployed in the application until such time as the check made at block 206 indicates that a different machine learning model should be deployed in the application instead.
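Where the optional periodic check is used, the loop of Figure 2 might be sketched as follows; the registry and model_store interfaces are assumed helpers (for example a wrapper around the model-association table sketched earlier) rather than part of the disclosure.

    import time

    CHECK_INTERVAL_SECONDS = 60   # hypothetical threshold time period (block 204)

    def deployment_loop(application, registry, model_store, app_id):
        """Rough rendering of the loop of Figure 2 for one application."""
        deployed_id = None
        while True:
            time.sleep(CHECK_INTERVAL_SECONDS)           # wait for the threshold
            desired_id = registry.desired_model(app_id)  # check storage (block 206)
            if desired_id != deployed_id:                # association incorrect
                model = model_store.load(desired_id)     # fetch the correct model
                application.swap_model(model)            # deploy it (block 208)
                deployed_id = desired_id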
While the method shown in Figure 2 involves periodic checking based on a threshold time elapsing, this is not essential and the check of block 206 can alternatively or additionally be performed at other times, for example each time an application is instructed to run predictions using a machine learning model or each time an application calls a machine learning model. It will be appreciated that the method of Figure 2 is in its entirety optional. In other words, checking storage or memory to determine which machine learning model should be deployed in a particular application is not required. Alternatively or additionally, deployment of a new machine learning model may be effected by sending a message to the application, environment or another element of the computational system providing instructions regarding which machine learning model is to be deployed in the application. Alternatively, a new machine learning model may simply be manually loaded into memory accessible by the application, and the previous machine learning model can be removed from memory accessible by the application.

The functionality provided by the methods of Figures 1 and 2 may be more fully understood with reference to Figures 3a and 3b, which schematically show a machine learning model deployed in an application being swapped for a new machine learning model in accordance with the methods of Figures 1 and 2. As just described, it will be apparent that the functionality of Figure 2 may be omitted, in other words the functionality of Figures 3a and 3b may be provided by the method of Figure 1 alone.
Figure 3a shows a first machine learning model, labelled Model 1, deployed in an application 312. The application 312 is provided within an environment 302, which in this example is an integration environment.
A client device 330 is able to access the application 312 via an endpoint 332 associated with the integration environment 302. In this example the endpoint 332 is a URL. By entering the URL at the client device 330, the client device 330 can access the application 312. This is represented by the dashed arrow between the client device 330 and the endpoint 332.
A training module 320 is provided and comprises a training environment, wherein machine learning models are trained to output predictions through repeated iterative training cycles. Various methods for training machine learning models are known and will be apparent, and these will not be described in further detail here. Any appropriate training method can be used to train the machine learning models of the present disclosure. Once a machine learning model is sufficiently trained and/or it is desired to deploy the machine learning model to the application 312, the machine learning model is provided from the training module 320 to a database 308. This is represented by the dashed arrow between the training module 320 and the database 308. Providing the machine learning model to the database 308 can in some examples comprise persisting or streaming the model to the database 308.
The database 308 can then provide the machine learning model to the application 312. This is represented by the dashed arrow between the database 308 and the first machine learning model deployed in the application 312 in Figure 3a.
In this example, machine learning models are provided to the application 312 by loading the machine learning models into memory accessible by the application 312. Once loaded into memory accessible by the application 312, a machine learning model is associated with the application 312 and is thereby deployed in the application 312. The machine learning model can then be used by the application 312 to run predictions.
Other ways of loading the machine learning models into memory, such as streaming, may be used.
It will be apparent that use of training module 320 and database 308 is optional, and machine learning models can be provided to the application 312 from any suitable source. The models do not need to have been trained in training module 320 prior to being deployed to the application 312. Where training module 320 is used, machine learning models can be provided to the application directly from training module 320, in other words database 308 may be omitted.
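One possible, non-authoritative way of persisting a trained model to database 308 and later loading it into memory accessible by an application is sketched below, using Python's pickle serialisation purely as an example mechanism; the table layout is invented for illustration.

    import pickle
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE models (model_id TEXT PRIMARY KEY, payload BLOB)")

    def persist_model(model_id, model):
        # Training module 320 -> database 308: serialise the trained model
        # and store it under an identifier.
        conn.execute(
            "INSERT OR REPLACE INTO models VALUES (?, ?)",
            (model_id, pickle.dumps(model)),
        )
        conn.commit()

    def load_model(model_id):
        # Database 308 -> application 312: deserialise the model into
        # memory accessible by the application, thereby associating it.
        row = conn.execute(
            "SELECT payload FROM models WHERE model_id = ?", (model_id,)
        ).fetchone()
        return pickle.loads(row[0]) if row else None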
At some point, it may be desired to deploy a different machine learning model to the application 312. To achieve this, the methods described above in relation to Figures 1 and 2 may be used. In particular, a check is made to determine which machine learning model should be deployed in the application 312. This checking process was described in further detail in relation to Figure 2. Upon checking database 308, it may be determined from a model association table that a second machine learning model should be deployed in the application 312.
As a result, the second machine learning model can be deployed in the application in place of the first machine learning model, in the manner described in further detail in relation to Figure 1. The first machine learning model deployed in the application 312 is disassociated with the application 312, which in this example comprises removing the first machine learning model from memory accessible by the application 312. A new, second machine learning model is then deployed to the application 312 instead. In this example, deploying the second machine learning model to the application 312 comprises loading, from database 308, the second machine learning model to memory accessible by the application 312 such that the application is associated with the second machine learning model. The new machine learning model is then deployed in the application, and any future predictions run by the application 312 will be run based on the new machine learning model.
This change of model can be seen in Figure 3b, which shows the same computational system as Figure 3a, however now the second machine learning model, labelled as Model 2, is deployed in the application 312 in place of the original first machine learning model, Model 1. As described in detail in relation to Figure 1, the association between the integration environment 302 and endpoint 332 is maintained throughout the model switching process, as can be seen from Figure 3b where endpoint 332 still provides client access to the application 312 within the integration environment 302, as in Figure 3a. In this example, where endpoint 332 is a URL, this means that the same URL can be entered at client device 330 to access the application 312. The change in model (from Model 1 to Model 2) also does not require any change to the application 312, integration environment 302, client device 330 or endpoint 332.
As can be seen, therefore, a simple and efficient mechanism for seamlessly deploying a new model to an application is provided which overcomes many of the drawbacks of existing methods. This is particularly beneficial to end-users such as members of the public, companies and organisations who do not have coding expertise and do not want to wait for new application code to be written each time they want to use a new machine learning model. By radically simplifying the process of model redeployment, the ability to swap machine learning models is made more readily available. New models can be deployed and tested without having to wait for new application code to be rewritten, meaning that the data science aspects of machine learning, namely running machine learning models to generate data, can be separated from the coding aspects of machine learning relating to how the machine learning models are provided in application code.
Turning now to Figure 4, another method of deploying machine learning models is provided, in particular relating to promoting a model across a plurality of environments. Promoting a model is herein intended to mean deploying a model that is deployed in an application in one environment to a different application in a different environment. Typically, promoting a model will occur during a development cycle for the model, as described above. For example, a model may be tested in an application in one or more testing environments before being promoted, in other words deployed, to a more production-like environment where the application in which the model is deployed is accessible by end-users such as companies, organisations and members of the public as opposed to just developers and programmers.
Promotion from one environment to the next typically occurs once sufficient tests have been carried out and stability of the model has been confirmed in a particular environment. For example, a model may initially be deployed in an application in an integration environment. Once developers have tested the application and stability has been confirmed, the model may be promoted, in other words deployed, to an application in a more production-like environment, such as UAT. Here, more stringent tests may be carried out by a higher number of developers or test users. Finally, once sufficient tests have been carried out and stability of the model confirmed in UAT, the model may be promoted to a yet more production-like environment, such as the production environment itself, where the application in which the model is deployed is accessible to all users. It will be apparent that this promotion path is merely exemplary, and other development processes and deployment sequences can be used. Further, while promotion typically implies deployment of the model to applications in sequentially more production-like environments, the methods of the present disclosure are not limited to this. Sequential deployments are possible in any sequence of environments, whether these environments are more or less production-like (or neither).
At block 402 of Figure 4, a first environment is associated with a first endpoint. The first endpoint provides user access to a first application within the first environment, in the same way as was described in detail in relation to Figures 1-3. At block 404, a second environment is associated with a second endpoint. The second endpoint similarly provides user access to a second application within the second environment.
At block 406, a machine learning model is deployed in the first application within the first environment. As in the case of Figures 1-3, in this example deployment of the machine learning model in the first application means that the machine learning model is loaded to memory accessible by the first application. The machine learning model is thereby associated with the first application, and can be used by the first application to run predictions.
At some point, it may be desired to deploy the machine learning model to a new application within a new environment. For example, it may be desired to promote the machine learning model to a more production-like environment in order to test the model's stability and viability in a different environment and under different demands. Accordingly, at block 408, the machine learning model is deployed to the second application within the second environment. In this example deployment of the machine learning model to the second application again comprises loading the machine learning model to memory accessible by the second application. The machine learning model is thereby associated with the second application, and can be used by the second application to run predictions. The association between the second environment and the second endpoint is maintained throughout this process. The second application may therefore be accessed via the same endpoint as before, even after the machine learning model has been deployed to the second application. As in the case of Figures 1-3, in this example the second endpoint is a URL. Therefore, the second application may be accessed by entry of the same URL at a client device as before. At the end of the process of Figure 4, the machine learning model is deployed in the second application, and so the second application can run predictions using the machine learning model.
This promotion of the model is achieved without any need for the code of either the first or second application to be altered, in other words promoting the machine learning model to the second application is independent of any other updates made to the code of the second application or the first application. No updates need to be sent to the client-side device. No changes need to be made to either the first or second environment or the model. Additionally, because the association between the second environment and the second endpoint has been maintained, the user can still access the second application via the same endpoint even after the model is deployed therein. A simpler and more efficient method of promoting a machine learning model than is currently available is therefore disclosed.
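A minimal, non-limiting sketch of promotion under these assumptions follows; the environment names mirror the figures, while the endpoint URLs and helper names are invented for illustration. Note that deploying to an environment's application never touches that environment's endpoint.

    class Application:
        """Minimal stand-in for an application that can hold one model."""
        def __init__(self):
            self.model = None
        def associate(self, model):
            self.model = model
        def disassociate(self):
            self.model = None

    # One application per environment; each environment keeps its own,
    # unchanging endpoint.
    environments = {
        "integration": {"endpoint": "https://example.com/integration", "app": Application()},
        "uat":         {"endpoint": "https://example.com/uat",         "app": Application()},
        "production":  {"endpoint": "https://example.com/production",  "app": Application()},
    }

    def promote(model, target_env):
        # Deploying the model to the target application leaves the target
        # endpoint untouched, so no update is sent to client devices.
        environments[target_env]["app"].associate(model)

    # Example promotion path: integration -> UAT -> production.
    trained_model = object()   # placeholder for a real trained model
    promote(trained_model, "integration")
    promote(trained_model, "uat")
    promote(trained_model, "production")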
The functionality provided by the method of Figure 4 may be more fully understood with reference to Figures 5a-c, which schematically show a machine learning model being promoted across a plurality of environments in accordance with the method of Figure 4.
Figure 5a shows the client device 330, integration environment 302, database 308 and training module 320 of Figures 3a and 3b. As in Figures 3a and 3b, an application 312 is provided in the integration environment 302. In addition, two further environments are provided, each with an application provided therein. The application 312 in the integration environment is therefore now a first application of a plurality of applications, and the integration environment 302 is a first environment of a plurality of environments.
In addition to the integration environment, a UAT environment 304 is now also shown, within which is provided a second application 314. In addition, a production environment 306 is now also shown, within which is provided a third application 316. The dashed arrows in Figures 5a-c represent the same relationships as in Figures 3a and 3b.
A client device 330 is able to access each of the first 312, second 314 and third 316 applications respectively via respective first 332, second 334 and third 336 endpoints. The first endpoint 332 is associated with the integration environment 302. The second endpoint 334 is associated with the UAT environment 304. The third endpoint 336 is associated with the production environment 306. In this example each endpoint is a URL. By entering one of the first, second or third URLs at the client device 330, the client device 330 can access each of the respective first, second and third applications. Machine learning models can be provided to each application via training module 320 and database 308 in the same manner as described in relation to Figures 3a and 3b. As in the case of Figures 3a and 3b, it will be apparent that the use of training module 320 and database 308 is optional in the arrangement of Figures 5a-c.
In this example a machine learning model, labelled "Model 1", is deployed in the first application 312. No machine learning model is deployed in the second 314 or third 316 applications, as can be seen from Figure 5a.
As in the example of Figures 3a and 3b, machine learning models are provided to the respective applications by loading the machine learning models into memory accessible by the applications. Once loaded into memory accessible by one of the respective applications, a machine learning model is associated with that application and is thereby deployed in the application. Once deployed, a machine learning model can be used by the respective application to run predictions. In the arrangement of Figure 5a, a machine learning model is deployed to the first application 312, and so the first application 312 can use the machine learning model to run predictions.
Testing of the first application 312 and model deployed therein can be carried out in the integration environment 302 to determine whether they are stable. At some point, sufficient testing may have been carried out and it may be desired to promote the model to a more production-like environment, so that different and/or more intensive testing can be carried out.
Promotion of the machine learning model to the next environment, in this example the UAT environment 304, can be achieved in accordance with the method of Figure 4. In particular, the machine learning model is deployed to the second application 314 in the UAT environment 304 by associating the second application 314 with the machine learning model, such that the machine learning model can be used by the second application 314 to run predictions. This promotion of the machine learning model to the UAT environment is shown in Figure 5b, which shows the same computational system as Figure 5a, however now the machine learning model is deployed in the second application 314 in the UAT environment 304.
Now that the model is deployed in the second application 314 in the UAT environment 304, further testing of the second application 314 and model can begin to determine whether they are stable under these new conditions. At some point, sufficient testing may have been carried out and it may be desired to promote the model to a more production-like environment, for example an environment that is accessible to end-users rather than just developers.
Promotion of the machine learning model to the next environment, in this case the production environment 306 itself, can be achieved in accordance with the method of Figure 4 in the same manner as promoting the machine learning model from the integration environment to the UAT environment. In particular, the machine learning model is deployed to the third application 316 in the production environment 306 by associating the third application 316 with the machine learning model, such that the machine learning model can be used by the third application 316 to run predictions. The promotion of the machine learning model to the production environment is shown in Figure 5c, which shows the same computational system as Figure 5b, however now the machine learning model is deployed in the third application 316 in the production environment 306.
Typically, promotion of a model to the production environment marks the final step in the development cycle of the model, and the model can now be accessed and used to run predictions by end-users such as members of the public.
As described in detail in relation to Figure 4, the association between each environment and its respective endpoint is maintained throughout the model promotion process, as can be seen from Figures 5a-c where each endpoint provides client access to the same application within the same environment throughout. In the present example, where each endpoint is a URL, this means that the same URL can be entered at client device 330 to access each respective application at each stage of the development process. The promotion of the model across the plurality of environments also does not require any changes to be made to any of the applications, environments, client device 330 or endpoints.
As can be seen, therefore, a simple and efficient mechanism for seamlessly promoting a model across a plurality of environments is provided which overcomes many of the drawbacks of existing methods. This is particularly beneficial to end-users such as members of the public, companies and organisations who do not have coding expertise and do not want to wait for new application code to be written each time they want to promote a machine learning model. By radically simplifying the process of model redeployment, the ability to promote machine learning models is made more readily available. New models can be promoted and tested without having to wait for new application code to be rewritten, meaning that the data science aspects of machine learning, namely running machine learning models to generate data, can be separated from the coding aspects of machine learning relating to how the machine learning models are provided in application code.
For simplicity, the functionality of Figure 2 was not referenced in the description of Figures 4 and 5a-c. However, it will be apparent that the functionality of Figure 2 can readily be applied to the method of Figure 4. For example, deployment of a model to a new environment can result from checking of a deployment table in database 308.
Turning finally to Figure 6, this shows a schematic and simplified representation of a computer apparatus 600 which can be used to perform the methods described herein, either alone or in combination with other computer apparatuses.
The computer apparatus 600 comprises various data processing resources such as a processor 602 coupled to a central bus structure. Also connected to the bus structure are further data processing resources such as memory 604. A display adapter 606 connects a display device 608 to the bus structure. One or more user-input device adapters 610 connect a user-input device 612, such as a keyboard and/or a mouse to the bus structure. One or more communications adapters 614 are also connected to the bus structure to provide connections to other computer systems 600 and other networks.
In operation, the processor 602 of computer system 600 executes a computer program comprising computer-executable instructions that may be stored in memory 604. When executed, the computer-executable instructions may cause the computer system 600 to perform one or more of the methods described herein. The results of the processing performed may be displayed to a user via the display adapter 606 and display device 608. User inputs for controlling the operation of the computer system 600 may be received via the user-input device adapters 610 from the user-input devices 612.
It will be apparent that some features of computer system 600 shown in Figure 6 may be absent in certain cases. For example, one or more of the plurality of computer apparatuses 600 may have no need for display adapter 606 or display device 608. This may be the case, for example, for particular server-side computer apparatuses 600 which are used only for their processing capabilities and do not need to display information to users. Similarly, user input device adapter 610 and user input device 612 may not be required. In its simplest form, computer apparatus 600 comprises processor 602 and memory 604.

In the present example, each of the machine learning models is configured to run predictions based on data for a given number of parameters. The number of parameters which each model is configured to utilise can vary from model to model. For example, a first model may be configured to run predictions based on data for five parameters, while a second model may be configured to run predictions based on data for ten parameters. To avoid errors and inefficiencies, however, the machine learning models of the present disclosure are configured, in one example, to extract data for however many parameters they are configured to use data for, even if the data they receive contains data for more parameters. This marks a departure from machine learning models used in existing systems, which lack this functionality and instead produce an error if they receive data containing data for more parameters than they are configured to use.
For example, a first machine learning model in this example is configured to run predictions based on data for five parameters. Upon receiving data, the model is configured to check how many parameters the received data contains data for. If it is determined that the data contains data for fewer than five parameters, an error results. If it is determined that the data contains data for precisely five parameters, then the model extracts data for each of the five parameters and runs predictions based on this data. If it is determined that the data contains data for more than five parameters, the model extracts data for five parameters and ignores, in other words does not extract or process, the data for the remaining parameters. This is in contrast to existing machine learning models, which will output an error if they receive data containing data for more parameters than they are configured to use for predictions.
While the methods of Figures 1 and 4 have been discussed separately, it will be apparent that the methods are closely related and can be readily combined in any desired combination. For example, a machine learning model deployed in one of the plurality of environments referred to in Figures 4 and 5a-c can be swapped for a different machine learning model using the methods disclosed in relation to Figures 1 and 3a-3b. The claims relating to the various methods described may therefore also be combined in any suitable combination.
The methods of Figure 2 can also be combined with the methods of Figures 4 and 5a-c, namely a check can be made, for example periodically, to determine which model(s) should be deployed in which application(s), and promotion of the model to an application in a new environment can be based on this check. This information may be stored in storage or memory, for example database 308.

The term "application" as used herein can be considered to mean software operable to run machine learning models to output predictions. In one particular example, an application comprises a Kubernetes container or "pod" encapsulating code which provides the predict and explain functionality of the application, wherein the pod uses the machine learning model to perform these operations.
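A non-limiting sketch of the predict and explain wrappers such a pod might encapsulate is given below; the explain method on the model object is an assumed interface, as the disclosure does not specify one.

    class PredictExplainPod:
        """Sketch of the code a Kubernetes pod might encapsulate: thin
        predict/explain wrappers around whichever model is deployed."""

        def __init__(self, model):
            self.model = model   # the machine learning model the pod uses

        def predict(self, features):
            return self.model.predict(features)

        def explain(self, features):
            # Assumes the deployed model object offers an explanation
            # method; the disclosure does not specify its form.
            return self.model.explain(features)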
It will be apparent that the functionality disclosed herein may be provided through or facilitated by an application programming interface (API) which can enable a client device to access or call an application.
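For example, a client device might call such an application as follows; the endpoint URL and parameter names shown are purely illustrative assumptions. Because the association between the environment and the endpoint is maintained when models are swapped, client code of this kind continues to work unchanged after a new model is deployed.

```python
import requests

# Hypothetical endpoint URL and parameter names, for illustration only.
ENDPOINT = "https://example.com/api/v1/my-app/predict"

response = requests.post(ENDPOINT, json={"f1": 1.0, "f2": 2.0})
print(response.json())
```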
The methods described above involve swapping first and second machine learning models; however, it will be apparent that any number of machine learning models can be used and swapped as desired, and that the methods disclosed herein can be used regardless of how many machine learning models are utilised and swapped.
Similarly, the methods described above describe promoting a machine learning model across two or three environments, for example integration, UAT and production environments. However, it will be apparent that models can be deployed in any number of environments and in any order, as desired. The labels used for the environments (integration, UAT and production) are entirely optional and are merely given as illustrative examples. The methods disclosed herein can be used regardless of how many environments, machine learning models or applications are utilised, and regardless of the types of environments, machine learning models or applications used.
The described methods may be implemented using computer executable instructions. A computer program product or computer readable medium may comprise or store the computer executable instructions. The computer program product or computer readable medium may comprise a hard disk drive, a flash memory, a read-only memory (ROM), a CD, a DVD, a cache, a random-access memory (RAM) and/or any other storage media in which information is stored for any duration (e.g., for extended time periods, permanently, for brief instances, for temporarily buffering, and/or for caching of the information). A computer program may comprise the computer executable instructions. The computer readable medium may be a tangible or non-transitory computer readable medium. The term "computer readable" encompasses "machine readable".

The singular terms "a" and "an" should not be taken to mean "one and only one". Rather, they should be taken to mean "at least one" or "one or more" unless stated otherwise. The word "comprising" and its derivatives, including "comprises" and "comprise", include each of the stated features, but do not exclude the inclusion of one or more further features.
The above implementations have been described by way of example only, and the described implementations are to be considered in all respects only as illustrative and not restrictive. It will be appreciated that variations of the described implementations may be made without departing from the scope of the disclosure. It will also be apparent that there are many variations that have not been described, but that fall within the scope of the appended claims.

Claims

1. A computer-implemented method of deploying a machine learning model to an application within an environment,
wherein an environment is associated with an endpoint providing user access to an application within the environment, and
wherein a first machine learning model is deployed in the application,
the method comprising:
while maintaining the association between the environment and the endpoint:
deploying a second machine learning model to the application by disassociating the application from the first machine learning model and associating the application with the second machine learning model.
2. The computer-implemented method of claim 1, wherein deploying the second machine learning model to the application is independent of any other updates made to the code of the application.
3. The computer-implemented method of claim 1 or 2, wherein the endpoint comprises an endpoint ID, wherein the endpoint ID is optionally a URL.
4. The computer-implemented method of claim 3, wherein maintaining the association between the environment and the endpoint comprises maintaining the endpoint ID.
5. The computer-implemented method of any preceding claim, further comprising:
while the first machine learning model is deployed in the application:
checking storage or memory to determine which machine learning model should be deployed in the application; and
determining from the storage or memory that the second machine learning model should be deployed to the application,
wherein said deploying the second machine learning model to the application is based on this determination.
6. The computer-implemented method of claim 5, wherein said checking storage or memory to determine which machine learning model should be deployed in the application occurs periodically.
7. The computer-implemented method of any preceding claim, wherein one or more of the first and second machine learning models is configured to:
receive data for a set of parameters;
determine the number of parameters contained in the set of parameters;
if the number of parameters contained in the set of parameters is equal to the number of parameters which the respective machine learning model is configured to use data for:
extract data for each of the parameters of the set of parameters from the received data;
if the number of parameters contained in the set of parameters is greater than the number of parameters which the respective machine learning model is configured to use data for:
extract data for as many parameters as the respective machine learning model is configured to use data for from the received data; and
not extract data for the additional parameters.
8. A computer-implemented method of deploying a machine learning model to an application within an environment,
wherein a first environment is associated with a first endpoint, the first endpoint providing user access to a first application within the first environment;
wherein a second environment is associated with a second endpoint, the second endpoint providing user access to a second application within the second environment; and
wherein a machine learning model is deployed in the first application within the first environment;
the method comprising:
while maintaining the association between the second environment and the second endpoint:
deploying the machine learning model to the second application within the second environment by associating the machine learning model with the second application within the second environment.
9. The computer-implemented method of claim 8, wherein deploying the machine learning model to the second application is independent of any other updates made to code of the second application.
10. The computer-implemented method of claim 8 or 9, wherein the second endpoint comprises an endpoint ID, wherein the endpoint ID is optionally a URL.
11. The computer-implemented method of claim 10, wherein maintaining the association between the second environment and the second endpoint comprises maintaining the endpoint ID.
12. The computer-implemented method of any of claims 8 to 11, further comprising, while maintaining the association between the second environment and the second endpoint:
deploying another machine learning model to the second application within the second environment by disassociating the machine learning model from the second application within the second environment and associating the other machine learning model with the second application within the second environment.
13. The computer-implemented method of any of claims 8 to 12,
wherein a third environment is associated with a third endpoint, the third endpoint providing user access to a third application within the third environment, the method further comprising:
while maintaining the association between the third environment and the third endpoint:
deploying the machine learning model to the third application within the third environment by associating the machine learning model with the third application within the third environment.
14. The computer-implemented method of claim 13, wherein the first environment is an integration environment, the second environment is a user acceptance testing (UAT) environment and the third environment is a production environment.
15. A computer program comprising computer-executable instructions which, when executed by one or more computers, cause the one or more computers to perform the method of any preceding claim; or
a computer system comprising one or more computers having a processor and memory, wherein the memory comprises computer-executable instructions which, when executed, cause the one or more computers to perform the method of any preceding claim.
PCT/GB2020/050546 2019-03-08 2020-03-06 Methods of deploying machine learning models WO2020183136A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB1903172.3 2019-03-08
GBGB1903172.3A GB201903172D0 (en) 2019-03-08 2019-03-08 Methods of deploying machine learning models

Publications (1)

Publication Number Publication Date
WO2020183136A1 true WO2020183136A1 (en) 2020-09-17

Family

ID=66380390

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2020/050546 WO2020183136A1 (en) 2019-03-08 2020-03-06 Methods of deploying machine learning models

Country Status (2)

Country Link
GB (1) GB201903172D0 (en)
WO (1) WO2020183136A1 (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10146526B2 (en) * 2010-03-15 2018-12-04 Salesforce.Com, Inc. System, method and computer program product for deploying an update between environments of a multi-tenant on-demand database system
US20160148115A1 (en) * 2014-11-26 2016-05-26 Microsoft Technology Licensing Easy deployment of machine learning models

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
CHRISTOPHER OLSTON ET AL: "TensorFlow-Serving: Flexible, High-Performance ML Serving", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 17 December 2017 (2017-12-17), XP080846818 *
MATEI ZAHARIA ET AL: "Accelerating the Machine Learning Lifecycle with MLflow", 31 December 2018 (2018-12-31), XP055719502, Retrieved from the Internet <URL:http://sites.computer.org/debull/A18dec/p39.pdf> [retrieved on 20200731] *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114844785A (en) * 2021-02-01 2022-08-02 大唐移动通信设备有限公司 Model updating method, device and storage medium in communication system
CN114844785B (en) * 2021-02-01 2024-02-06 大唐移动通信设备有限公司 Model updating method, device and storage medium in communication system
CN114168177A (en) * 2022-02-10 2022-03-11 浙江大学 Personalized task processing method and device supporting mass mobile devices

Also Published As

Publication number Publication date
GB201903172D0 (en) 2019-04-24

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
Ref document number: 20711277
Country of ref document: EP
Kind code of ref document: A1
NENP Non-entry into the national phase
Ref country code: DE
122 Ep: pct application non-entry in european phase
Ref document number: 20711277
Country of ref document: EP
Kind code of ref document: A1