GB2611854A - A system and method for automating a tender process - Google Patents

A system and method for automating a tender process

Info

Publication number
GB2611854A
Authority
GB
United Kingdom
Prior art keywords
data
customer
computer system
usage
tender
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
GB2211382.3A
Other versions
GB202211382D0 (en)
Inventor
Llewellyn Loveday Colin
Sebastian Hill Warrick
Mary Cooper Jessica
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
North Swell Tech Ltd
Original Assignee
North Swell Tech Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by North Swell Tech Ltd filed Critical North Swell Tech Ltd
Publication of GB202211382D0
Publication of GB2611854A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201Market modelling; Market analysis; Collecting market data
    • G06Q30/0202Market predictions or forecasting for commercial activities
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/09Supervised learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0631Resource planning, allocation, distributing or scheduling for enterprises or organisations
    • G06Q10/06315Needs-based resource requirements planning or analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0633Workflow analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0637Strategic management or analysis, e.g. setting a goal or target of an organisation; Planning actions based on goals; Analysis or evaluation of effectiveness of goals
    • G06Q10/06375Prediction of business process outcome or impact based on a proposed change
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/06Buying, selling or leasing transactions
    • G06Q30/0601Electronic shopping [e-shopping]
    • G06Q30/0611Request for offers or quotes
    • G06Q50/60

Abstract

A computer system for monitoring system usage under a service contract comprises a customer data processing module for receiving customer usage information and a machine learning module with a trained data model which analyses the customer usage information and predicts future customer usage. An analysis module maps the predicted future customer usage onto the service contract and creates an invitation to tender based on the predicted future customer usage and any modifications which have been made to the contract as a result. This allows the customer to make better-informed decisions about which supplier to use, by using an automated system to match customer needs to the best supplier solution. Communication means may provide the created invitation to tender to one or more suppliers for response, and the system may select the optimum tender from a plurality of supplier responses to the invitation to tender.

Description

A System and Method for Automating a Tender Process
Introduction
The present invention relates to a system and method for automating a tender process and in particular to a system and method for automating a tender process by analysing service usage and supplier provided information.
Background
Businesses of all sizes use mobile devices for phone calls and mobile data. In a typical business, the mobile device contract represents a significant cost because of the high number of devices provided under the contract and the ever-increasing use of these devices. The market for providing mobile devices to businesses is dominated by a small number of service suppliers.
When purchasing a business mobile contract, a business (customer) will typically issue an invitation to tender (ITT) and seek responses from service suppliers. This requires the business to go through a process in which they assess and identify their current device usage and create an ITT based on the information they have gathered. Such a process is often done by a business's IT director and finance team and places a significant resource burden on both. The process is very time consuming and difficult because of the high volume of data involved. In addition, most businesses do not have the expertise and market knowledge to find the best suppliers.
Once the ITT has been created, responses to the ITT are made by a range of service suppliers. It is common for these responses to be set up such that they emphasise the features of the service offering that the service supplier wishes to promote, at the expense of the business's requirements. Given the variety of tender responses and formats, it is often difficult and complex for the business to compare ITT responses from different suppliers. Managing the entire process in-house requires the management and analysis of large volumes of data. It is common for the customer to be overwhelmed by data and to receive much additional information, as well as calls and meetings with service suppliers who are competing for their business. Therefore, the decision often ends up being a best guess, or the customer chooses the safe option of retaining the incumbent service supplier.
In some cases, a business will use an online comparison website such as Compare the Market™ or a business comparison website such as Billmonitor™ to assist the in-house IT and finance departments with this process. Comparison websites originate in the consumer market and do not have the business mobile expertise and data analytics to provide a bespoke result. In addition, they tend to focus on price rather than taking into account other key business factors.
An alternative is to outsource the entire tender and procurement process to a cost reduction consultant. Cost reduction consultants tend to be generalists rather than business mobile experts, so they are unlikely to have the expertise to secure the best deal, and they can be a costly option.
The solutions available at present require the manual processing of complex information, or present the illusion of a computer-generated solution which merely automates some aspects of the supplier selection process without reducing the complexity of the task, without providing the customer with any insight into the best solution for their business, without reducing the risk that the customer has made the wrong decision, and without giving the customer confidence that they have made the correct decision.
In addition, once a service has been selected, whilst the contract forms the basis of the cost the customer is likely to pay, cost certainty depends upon customer usage remaining within the agreed usage as defined in the contract. However, business use of telecommunications is very likely to change during the duration of a contract and this may lead to significant unanticipated cost over runs.
Summary of the Invention
It is an object of the invention to provide an improved system and method for automating a tender process so as to save client time and money.
In accordance with a first aspect of the invention there is provided a computer system for monitoring system usage under a service contract, the system comprising: a customer data processing module for receiving customer usage information; a machine learning module which comprises a trained data model which analyses the customer usage information and predicts future customer usage; and an analysis module which maps the predicted future customer usage onto the service contract and creates an invitation to tender based on predicted future customer usage and any modifications which have been made to the service contract based upon predicted future customer usage.
Preferably, the system further comprises communication means for providing the created invitation to tender to one or more suppliers who may wish to respond to the invitation to tender.
Preferably, the system further comprises communication means for providing the created invitation to tender to the customer.
Preferably, the system further comprises communication means for communication between the supplier and the customer.
Preferably, the system further comprises a customer control function which allows the customer to amend the created invitation to tender.
Preferably, the system selects the optimum tender response from a plurality of supplier responses to the tender document.
Preferably, the machine learning module comprises a neural network.
Preferably, the trained data model is trained using sample data which comprises historical usage information.
Preferably, after pre-processing, each sample consists of a single row, with one column per input feature and one per output feature.
Preferably, the sample data is received in different categories.
Preferably, the sample data is provided in three categories: bill analysis, bill analysis summary and calls.
Preferably, the sample data comprises usage data spanning a predetermined period to allow prediction of future usage.
Preferably, the sample data comprises usage data spanning the predetermined period for at least one category of sample data.
Preferably, the sample data comprises usage data spanning the predetermined period for each category.
Preferably, the sample data contains months 1 to 3 of usage data for each category per day to allow prediction of month 5 usage.
Preferably, the performance of the trained data model is evaluated by splitting the data into training data, validation data and testing data.
Preferably, the training data set is used to learn the network's weights.
Preferably, the validation data set is used to assess the trained data model's output after each training run to allow adjustment of hyperparameters and/or network topology in order to improve performance, without overfitting on the training set.
Preferably, the samples are partitioned into training, validation and test data sets.
Preferably, the samples are partitioned into training, validation and test data sets with a ratio of 70% training data, 15% validation data and 15% test data.
Preferably, the partitioned samples are stratified using the customer ID to ensure even customer representation across the datasets.
Preferably, the neural network comprises a convolutional layer, a pooling layer, a dense layer with ReLU activation and a dense output layer with ReLU activation.
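By way of illustration only, the stratified partitioning described above can be sketched as follows. The 70/15/15 ratio comes from the description; the sample structure, function name and random seed are assumptions for this sketch, which splits each customer's samples separately so that every customer is represented in all three datasets.

```python
import random

def stratified_split(samples, train=0.7, val=0.15, seed=0):
    """Partition samples ~70/15/15, splitting each customer's samples
    separately so every customer appears in all three datasets."""
    rng = random.Random(seed)
    by_customer = {}
    for s in samples:
        by_customer.setdefault(s["customer_id"], []).append(s)
    train_set, val_set, test_set = [], [], []
    for group in by_customer.values():
        rng.shuffle(group)
        n_train = round(len(group) * train)
        n_val = round(len(group) * val)
        train_set += group[:n_train]
        val_set += group[n_train:n_train + n_val]
        test_set += group[n_train + n_val:]
    return train_set, val_set, test_set

# Hypothetical input: 20 sample windows for each of two customers
samples = [{"customer_id": c, "x": i} for c in ("A", "B") for i in range(20)]
tr, va, te = stratified_split(samples)
```

With 20 samples per customer this yields 14 training, 3 validation and 3 test samples from each customer, so both customers are represented in every set.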
Preferably, the convolution layer has one filter for each of the features to be predicted.
Preferably the convolution layer has 22 filters.
Preferably, the convolution layer has a kernel size of 31.
Preferably, the convolution layer has a stride of 1.
Preferably, the convolution layer has a leaky ReLU activation.
Preferably the pooling layer comprises max pooling with a kernel size of 7 and a stride of 1, followed by a flattening operation.
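By way of illustration only, the preferred topology described above can be sketched in PyTorch. The filter count (22), convolution kernel size (31), pooling kernel size (7) and strides come from the description; the input sequence length (90 days), batch size and dense-layer width are assumptions for this sketch.

```python
import torch
import torch.nn as nn

# Assumed dimensions: 22 usage categories, 90 days of history per sample
N_FEATURES, SEQ_LEN = 22, 90

model = nn.Sequential(
    # Convolutional layer: one filter per predicted feature (22 filters),
    # kernel size 31, stride 1, leaky ReLU activation
    nn.Conv1d(N_FEATURES, 22, kernel_size=31, stride=1),
    nn.LeakyReLU(),
    # Max pooling with kernel size 7 and stride 1, then flattening
    nn.MaxPool1d(kernel_size=7, stride=1),
    nn.Flatten(),
    # Dense layer with ReLU, then a dense output layer with ReLU
    nn.Linear(22 * (SEQ_LEN - 31 + 1 - 7 + 1), 64),
    nn.ReLU(),
    nn.Linear(64, N_FEATURES),
    nn.ReLU(),
)

x = torch.randn(8, N_FEATURES, SEQ_LEN)  # a batch of 8 samples
y = model(x)                             # predicted usage per category
```

The flattened size follows from the usual 1D convolution arithmetic: 90 − 31 + 1 = 60 positions after the convolution, 60 − 7 + 1 = 54 after pooling, times 22 channels.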
Preferably, an error at output nodes of the neural network is specified by a loss function.
Preferably, the loss function is defined with reference to the weighted absolute error and the percentage error per category.
Preferably, the loss function is weighted to improve prediction in categories for which there is limited data available.
Preferably, the weighted loss function is combined with a per category percentage error to drive the model to predict equally well across all categories, rather than learning to predict the easier features at the expense of the more difficult ones.
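By way of illustration only, a loss of this general shape, combining a weighted absolute error with a per-category percentage error, might be sketched as follows. The description does not specify how the two terms are combined, so a simple sum is assumed; the category weights and the epsilon guard against division by zero are also assumptions.

```python
import torch

def combined_loss(pred, target, category_weights, eps=1.0):
    """Weighted absolute error (up-weighting sparse categories) plus a
    per-category percentage error, so the model is driven to predict
    equally well across all categories."""
    abs_err = (category_weights * (pred - target).abs()).mean()
    pct_err = ((pred - target).abs() / (target.abs() + eps)).mean()
    return abs_err + pct_err

# Hypothetical batch of 2 samples over 2 usage categories
pred = torch.tensor([[10.0, 2.0], [8.0, 1.0]])
target = torch.tensor([[12.0, 1.0], [8.0, 2.0]])
weights = torch.tensor([1.0, 3.0])  # up-weight the sparser second category
loss = combined_loss(pred, target, weights)
```

Increasing a category's weight makes errors in that category cost more, which is one way to improve prediction where limited data is available.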
Preferably, network weights are updated in such a way that minimises the loss function.
Preferably, the rate at which the output of the loss function changes as the weights change is calculated using backpropagation, an algorithm which calculates the partial derivative of the loss function with respect to each weight; the weight is then adjusted such that the gradient of the slope decreases.
Preferably, the amount by which the weight changes is specified by a learning rate and momentum, and whether it is increased or decreased depends on the direction of the slope.
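By way of illustration only, the weight update described above, a momentum term plus a step whose size is set by the learning rate and whose direction depends on the slope, can be sketched on a toy one-dimensional loss; the learning rate and momentum values here are assumptions.

```python
def sgd_momentum_step(w, grad, velocity, lr=0.05, momentum=0.9):
    """One weight update: the velocity accumulates past gradients, and
    the step moves the weight against the slope of the loss surface."""
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity

# Toy loss(w) = (w - 3)^2, whose gradient is 2 * (w - 3)
w, v = 0.0, 0.0
for _ in range(200):
    w, v = sgd_momentum_step(w, 2 * (w - 3), v)
```

After enough updates the weight settles at the minimum of the loss (w = 3), since the gradient, and hence the step, shrinks to zero there.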
Alternatively, the machine learning module comprises a multivariate multiple regression (MMR) model which is used to predict future data usage, in which multiple dependent variables y1, ..., ym are modelled using multiple inputs x1, ..., xn.
Preferably, the machine learning module comprises a neural network with a single linear layer.
Preferably, the neural network is trained using an adaptive optimiser to optimise weights b such that the error e between the predicted data and actual data is minimised.
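By way of illustration only, the MMR formulation as a neural network with a single linear layer trained by an adaptive optimiser can be sketched in PyTorch. Adam is assumed as the adaptive optimiser; the feature counts, learning rate and synthetic data are assumptions for this sketch.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic stand-in data: 21 input features, 3 predicted usage categories
n_in, n_out = 21, 3
X = torch.randn(256, n_in)
true_W = torch.randn(n_in, n_out)
Y = X @ true_W + 0.01 * torch.randn(256, n_out)

# MMR expressed as a single linear layer: y = x @ W + b
model = nn.Linear(n_in, n_out)
opt = torch.optim.Adam(model.parameters(), lr=0.05)  # adaptive optimiser

for _ in range(500):
    opt.zero_grad()
    loss = ((model(X) - Y) ** 2).mean()  # squared error e to be minimised
    loss.backward()
    opt.step()
```

Because the layer is linear, the trained weights approximate the regression coefficients directly, and the error falls to roughly the noise floor of the data.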
Preferably, the machine learning module: categorises customer data; and apportions the categorised customer data into overlapping samples for a predetermined time period to create a base dataset from which the training, validation and test data is built.
Preferably the customer data is usage data.
Preferably the predetermined time period is 70-110 days.
More preferably the predetermined time period is 90 days.
Preferably the MMR uses call and data usage records for a given customer for the predetermined time period to predict that customer's usage for a future time period, to allow the system to recommend the most suitable plan for that customer one month ahead of time.
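By way of illustration only, the apportioning of a customer's categorised daily usage records into overlapping samples over a predetermined period (90 days assumed, per the preferred embodiment) might be sketched as follows; the step size between windows is an assumption, as the description does not specify the overlap.

```python
def overlapping_samples(daily_usage, window=90, step=30):
    """Slice a per-day usage series into overlapping windows of a
    predetermined length to build the base dataset from which the
    training, validation and test data are drawn."""
    return [daily_usage[i:i + window]
            for i in range(0, len(daily_usage) - window + 1, step)]

series = list(range(180))          # 180 days of (dummy) usage values
windows = overlapping_samples(series)
```

With 180 days of data this produces four 90-day windows starting on days 0, 30, 60 and 90, each overlapping its neighbour by 60 days.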
Preferably, the system further comprises a computer system for selecting a service provider, the system comprising: a customer data processing module for receiving customer usage information; a machine learning module which comprises a trained data model which analyses the customer usage information and predicts future customer usage; a benchmarking module which compares customer usage information with benchmarking data; a tender database which formats client tender information and which creates an invitation to tender (ITT); a supplier database to which ITT responses are uploaded by one or more suppliers; a comparison module which compares the ITT responses, ranks the ITT responses in accordance with one or more criteria and provides ranking information to the customer; wherein the customer selects one of the ITT responses based on the ranking of the ITT responses.
Preferably, the system further comprises a computer system for monitoring system usage under a service contract, the system comprising: a customer data processing module for receiving customer usage information; a machine learning module which comprises a trained data model which analyses the customer usage information and predicts future customer usage; an analysis module which maps the predicted future customer usage onto the service contract and calculates the cost associated with the future usage and determines the excess cost of said future usage under the terms of the service contract to allow the customer to amend usage or amend the contract in response to the determination of excess cost.
Preferably, the customer data processing module reformats the customer data to allow the machine learning module to compare it with a suitable benchmark tariff to determine a savings calculation.
Preferably, the customer data processing module reformats the customer data to allow the benchmarking module to compare it with a suitable benchmark tariff to determine a savings calculation.
Preferably, the invitation to tender presents the customer with a series of graphical user interfaces which contain questions designed to extract information from the customer about their requirements in a suitable format.
Preferably, the suitable format is a standardised format which compels suppliers to provide information in a way which makes it readily comparable with information provided by alternative suppliers.
Preferably, the comparison module compares the ITT responses by calculating parameter values for parameters from categories of information defined in the tender document and ranking the tender based on the categories.
Preferably, a weighting is applied to the categories.
Preferably the weighting may be altered.
Preferably, a tool is provided to the customer to change the weightings.
Preferably, the ranking information is provided in a numerical or pictorial format.
Brief Description of the Drawings
The invention will now be described by way of example only with reference to the accompanying drawings in which:
figure 1 is a block diagram which illustrates an example of a system in accordance with the present invention;
figure 2 is a flow diagram which shows an example of the method of the present invention;
figure 3 is a block diagram which illustrates an example of a system in accordance with the present invention which is combined with a novel system for selecting a service supplier;
figure 4 is a block diagram which illustrates another example of a system in accordance with the present invention which is combined with a novel system for selecting a service supplier and a novel system for predicting usage;
figure 5 is an illustration of a kernel as applied to a layer's input data in a convolutional neural network (CNN);
figure 6 is an illustration of the process of feature extraction and classification in a CNN;
figure 7 is an illustration of the partitioning of data used to train a model in a CNN;
figure 8 is a topology of a CNN in accordance with the present invention;
figure 9 is a diagram which shows input variables, operations and an output placeholder;
figure 10 is a graph of loss v weight which shows an optimal point for optimal weight and loss;
figure 11 is a graph of average per customer percentage validation batch error in a range of categories;
figure 12 is a graph of loss over training run;
figure 13 is a graph of a sample dataset provided over a period of time;
figure 14 is an illustration of the MMR input and output data;
figure 15 is a graph which shows an example of the trained model predictions for customer data;
figure 16 is a graph which shows another example of the trained model predictions for customer data;
figure 17 is a graph which shows percentage category errors for a range of data categories for a token;
figure 18 is a graph which shows mean absolute error/category maximum error for a range of data categories for a token;
figure 19 is a schematic flow diagram which shows the process steps for the use of an example of a system in accordance with the present invention;
figure 20 is a schematic flow diagram which shows the process steps for a customer savings calculation in the example of the system of the present invention shown in figure 19;
figure 21 is a schematic flow diagram which shows the process steps for a customer creating a tender specification in the example of the system of the present invention shown in figure 19;
figure 22 is a schematic flow diagram which shows the process steps for a supplier creating a tender response in the example of the system of the present invention shown in figure 19;
figure 23 is a schematic flow diagram which shows the process steps for a customer selecting a supplier in the example of the system of the present invention shown in figure 19;
figure 24 is a screen shot from an example of a graphical user interface which shows the ranking of suppliers and their score in the example of the system of the present invention shown in figure 19;
figure 25 is a screen shot from an example of a graphical user interface which shows the side-by-side comparison of suppliers in the example of the system of the present invention shown in figure 19;
figure 26 is a screen shot from an example of a graphical user interface which shows a "health score" for several parameters contained in the tender response in the example of the system of the present invention shown in figure 19; and
figure 27 is a screen shot from an example of a graphical user interface which shows a slider tool which allows a customer to reprioritise parameters in the example of the system of the present invention shown in figure 19.
Detailed Description of the Drawings
The present invention comprises a software application which has been designed to automatically create an invitation to tender based upon predicted future use of a service.
The solution provided by the present invention had to address several technical problems which had limited the extent to which it was possible to automate the tender process. Typically, customers can provide only a limited amount of data: some customers have more historical data than others, and some have periods for which there is no data at all. In addition, different customers have different usage patterns depending on the size of the customer company, location, amount of travel undertaken by the employees, data use and talk time.
The present invention provides a process and method which includes a machine learning module that works for a wide range of company sizes with a wide range of usage, and therefore avoids the need to train separate systems for each customer company.
The system and method of the present invention provide accurate predictions via the machine learning module, which operates across a number of parameters/features/usage categories to allow it to predict usage taking into account factors such as the wide variation in sparsity of usage across different categories for different customers and the fact that different suppliers classify usage differently.
Figure 1 is a block diagram 2 which illustrates an example of a system in accordance with the present invention. In this example, the invention is shown as a stand-alone module which may be used by a client to analyse customer data from an existing contract to provide the basis for creating an invitation to tender. In this example, the client 4 provides ongoing usage data from an existing customer service contract to a customer data processor which formats the data for analysis by a prediction module.
The prediction module 8 uses a machine learning module to analyse the formatted customer data to create predicted future use data, which is then used by the ITT generator 10 to automatically create a tender document which may be provided to one or more potential suppliers.
Figure 2 is a flow diagram 12 which shows an example of the method of the present invention. Firstly, the customer obtains a contract 14. In an alternative, the customer may use the system and method of the present invention with usage data from an existing contract. The customer then uploads usage data to the system 16. The usage data may be uploaded in one or more batches; alternatively, the usage data may be automatically uploaded from billing information received by the customer from the supplier, or the supplier may provide the usage data directly to the system of the present invention. Once the system has received the data, it is categorised, then the categorised data is uploaded 18 and analysed in the machine learning module. The processed usage information is used to predict future usage, which is then imported into an ITT template 22. The completed ITT is uploaded onto the system 24, supplier responses are received 26, the system selects the optimum tender response 28 and the new contract is obtained 30.
A usage prediction is made by the machine learning module for a predetermined time period. The time period may be 3 months but is typically two months in advance of the latest data. The prediction provides the customer with information on whether their predicted usage matches their current contract terms.
Therefore, the system has a very efficient and useful way of monitoring and predicting usage in order to create an ITT which anticipates the customer's future usage requirements.
Figure 3 is a block diagram which illustrates an example of a system in accordance with the present invention.
Figure 3 shows a customer 3 who may access the system 1. The customer uploads usage information to a customer data processor 5 where the data is formatted and standardised. The customer data processor 5 receives two types of information: initial data 34, which is used to provide an estimated cost saving in conjunction with the benchmarking module 9, and ongoing data, which is processed for use in automatically generating an ITT. The standardised usage information is analysed by a machine learning module, referred to as the prediction module 7, which has an initial usage data prediction module 38 that analyses the customer usage information and predicts future customer usage.
Benchmarking module 9 compares the standardised usage information with benchmark data to create a savings value as an indication of the level of savings available to the customer. The benchmarking module may also compare the standardised usage information with a predicted future use value which has been calculated by the prediction module 7. In this example of the present invention, the benchmarking module may also be used to compare an initial usage data prediction 38, which provides a predicted future use value, with benchmarked tariff data.
During the life of a contract, and particularly in the months leading towards the end of the contract, the system provides an ongoing usage prediction 36 in the prediction module 7. The ongoing usage data prediction is provided to the ITT generator 11a which creates a tender document based upon the predicted future use it has calculated using the machine learning module.
Suppliers 19, 21, 23 may log in to the system and search for suitable ITTs or may specify the parameters which match the service they supply. Once they have found an ITT which they are interested in responding to, they complete a tender response online. The tender responses are stored in a supplier database 13 and then the comparison module 15 analyses and ranks the tender responses based on several predefined criteria. The customer 3 may modify the weighting attached to each of the criteria to meet a range of changing circumstances or simply according to preference. The customer makes a tender selection 17 which is communicated to the successful supplier 23.
Figure 4 is a block diagram which illustrates another example of a system in accordance with the present invention which is combined with a novel system for selecting a service supplier and a novel system for predicting usage.
The example of the present invention shown in figure 4 shows the same functionality as that of figure 3, with the exception that the customer receives data 40 from the ongoing usage data prediction module to allow the customer to monitor data usage during the contract.
As described above and in relation to the present invention, the system of figure 4 comprises a machine learning module which comprises a trained data model which analyses ongoing customer usage information and predicts future customer usage.
An analysis module called the ongoing usage data prediction module 36 maps the predicted future customer usage onto the service contract and calculates the cost associated with the future usage and determines the excess cost of said future usage under the terms of the service contract to allow the customer to amend usage or amend the contract in response to the determination of excess cost.
As described above in relation to figures 3 and 4, the system of the present invention uses machine learning to predict future call and data usage from ongoing and/or historical usage information. As described in the following example, this result is achieved by the creation of a convolutional neural network which achieved an average per category validation batch error of under 10% across all customers.
The system creates a machine learning/prediction module from historical usage information provided by a customer, which has been uploaded and which represents three months of call and data usage.
This information is the input to a machine learning algorithm which then predicts future data and call usage. For example, the historical usage information for three months, defined as months 1 to 3, can predict the sum of that customer's data and call usage for the period two months in the future (month 5). Once this prediction has been made, a recommendation as to the most suitable plan for that customer, which takes into account future usage, is made. The machine learning system has been designed to provide accurate and reliable results with a relatively small amount of data.
The machine learning model has been designed to operate as a single system which is capable of handling customer data with different usage patterns depending on, for example, the size of the company, location, and the amount of travel undertaken by the employees.
The machine learning model of the present invention is capable of generalising. In particular, it has been designed to accommodate a large number of features (usage categories) in proportion to the amount of data available to work with, a wide variation in the sparsity of usage across different categories, and variation in the way in which usage is classified by different providers; for example, each provider has different countries which fall into a certain usage category.
In a preferred embodiment of the present invention the machine learning module comprises an artificial neural network which is a computational model consisting of a number of layers through which data is passed. It has an input layer, some number of hidden layers, and an output layer. Each layer contains several nodes, which take input data and transform it using an activation function in order to introduce non-linearity into the network. Each of these nodes are connected to the nodes in the previous layers by links, which multiply the output of the previous node by a weight.
There is also a bias node in each layer, which is set to 1, and has weighted links to each node in the next layer. This bias essentially allows the network to shift the output of the activation function independent of the input data.
The neural network creates a model based upon sample data which is known as training data. The training data is used to optimize the weights during the training stage, such that a loss function (a measure of the difference between the actual prediction of the trained network and what we want to train it to predict) is minimised.
Figure 5 is an illustration 25 of a kernel as applied to a layer's input data in a convolutional neural network. It shows input data 27 assembled as a row of different features 29 each feature being presented as a time series in a column 31. Convolution enables location independent recognition of features in the data. This is done by constructing a kernel 33 (also sometimes called a filter or feature detector) which is a matrix containing the network weights that we aim to learn. This kernel 33 is applied to that layer's input data a number of times using a sliding window, covering the entirety of the input. The distance from the centre of one kernel application to the next kernel application is called the stride. For every application, the results of element wise multiplication over the two matrices (the inputs at the kernel location, and the kernel weights) are summed to produce a single element of the layer's output 35.
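The kernel application described above can be sketched in a few lines. This is a minimal one-dimensional illustration in plain NumPy, not the system's actual implementation; the example signal and kernel are hypothetical:

```python
import numpy as np

def conv1d(x, kernel, stride=1):
    """Apply a 1-D kernel to input x with a sliding window (no padding).

    Each output element is the sum of element-wise products between the
    kernel weights and the input values under the current window.
    """
    k = len(kernel)
    out_len = (len(x) - k) // stride + 1
    out = np.empty(out_len)
    for i in range(out_len):
        window = x[i * stride : i * stride + k]
        out[i] = np.sum(window * kernel)
    return out

# A kernel of [1, 0, -1] responds to local changes in the series
# (a simple edge detector), wherever in the input those changes occur.
signal = np.array([0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0])
print(conv1d(signal, np.array([1.0, 0.0, -1.0])))  # → [-1. -1.  0.  1.  1.]
```

Because the same kernel weights are applied at every window position, a feature is recognised regardless of where in the time series it appears, which is the location independence the passage above describes.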
Figure 6 is an illustration 41 of the process of feature extraction 43 and classification 45 in a convolutional neural network (CNN). In this example, the neural network is a convolutional neural network (CNN) which receives an input 47, contains convolutional layers in which each node is not connected to every other node in the previous layer and in which some weights are shared. The convolutional layers are interspersed with pooling layers 51 in which the matrix passed from the previous convolutional layer is down sampled and which are followed by a few dense layers which are fully connected layers 53 which map the features identified by the preceding operations to the desired output format.
The convolutional and pooling layers are collectively responsible for detecting high level features in the input data and the fully connected layers then use those features to output a class prediction. In this example, a one-dimensional convolution is used because the input data takes the form of a time series rather than a two-dimensional image.
Pooling layers are used to reduce the dimensions of the feature maps, which reduces the number of parameters to learn and the amount of computation performed in the network. The pooling layer summarizes the features present in a region of the feature map generated by a convolution layer.
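The down-sampling a pooling layer performs can be sketched as follows. This is a minimal max-pooling illustration in NumPy, not code from the system itself:

```python
import numpy as np

def max_pool1d(x, pool_size, stride):
    """Down-sample x by taking the maximum over each pooling window.

    Each window of `pool_size` elements is summarised by its largest
    value, so the output keeps the strongest feature responses while
    shrinking the length of the feature map.
    """
    out_len = (len(x) - pool_size) // stride + 1
    return np.array([x[i * stride : i * stride + pool_size].max()
                     for i in range(out_len)])

feature_map = np.array([1.0, 3.0, 2.0, 5.0, 4.0, 0.0])
print(max_pool1d(feature_map, pool_size=2, stride=2))  # → [3. 5. 4.]
```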
When training a machine learning system such as a neural network, it is important to avoid overfitting or underfitting to the training data. Underfitting is when a model fails to learn optimal weights, usually because it has an insufficiently complex topology, too few epochs, or a too small or noisy dataset. Underfitting can be addressed by a variety of techniques, including adjusting model topology, increasing the number of layers or nodes in each layer, procuring more data or augmenting the existing data.
Conversely, overfitting is when the model learns to perform very well on the data it is trained with but fails to work as well when previously unseen data is used. This can be due to an overly complex model which identifies noise in the training data as salient features, over-optimisation of the model parameters over a large number of epochs, or a biased, inadequate or unrepresentative training dataset. Overfitting can be addressed by stopping training before the error completely converges, simplifying the model, or procuring or generating better data.
In this example of the present invention, the training data comprises approximately two years' worth of historical data for 15 customers from three different providers. After pre-processing, each training sample consists of a single row, with one column per input feature and one per output feature.
The usage data is received in the form of three tables: bill analysis; bill analysis summary; and call categories. Bill_analysis maps various csv files containing customer records to the ID of that customer and their provider.
Bill analysis summary contains all call and data usage records from those files, with each row assigned the appropriate bill analysis ID according to the source of its data. Call categories simply maps the category ID to its name.
A selected summary with customer IDs is constructed by selecting the relevant columns from bill analysis summary and merging on the bill analysis ID and customer name to form a single table in preparation for building the datasets. From this, each customer's usage per category per day is aggregated to form a per day data table. For days where no information was present, 0.0 values are inserted for each category. This table was then apportioned into samples, each containing months 1 to 3 of usage data for each category per day (our model input) and the sum usage for each category for month 5 (what we want to learn to predict). This is the base dataset from which the training, validation and test data are constructed.
In this example, the model focuses on predicting usage rather than cost, since cost is likely to change over time and for different plans. In addition, cost may be calculated directly from usage as necessary when predictions are made. The available data allowed the generation of approximately 6,000 samples.
In order to evaluate the performance of the neural network objectively and identify overfit or underfit, the data was typically split into three distinct sets: training, validation and testing. Each set was created to be roughly representative of the dataset as a whole and for this reason the samples were randomly shuffled before partitioning.
The training set is used to learn the network's weights. The validation set is not used for training, but rather to assess the trained model's output after each training run to allow adjustment of hyperparameters and/or network topology in order to improve performance, without overfitting on the training set. The testing set is not used during this process, but only once a final, trained model has been developed. This avoids overfitting of hyperparameters and topology to the validation set and ensures that our evaluation metrics are based on classifying unknown data.
Figure 7 is a diagram 61 which shows the partitioning of data used to train a model in a CNN. The entire dataset 63 is partitioned into a training data 65, validation data 67 and test data sets with a ratio of 70% training, 15% validation and 15% test. The data was stratified using the customer ID to ensure even customer representation across the datasets. The training data 65 and validation data 67 is used to train the model, try different models and adjust hyperparameters. The test data 69 is only used when testing final models and parameters.
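The partitioning described above can be sketched as follows. This is a minimal illustration using hypothetical sample dictionaries with a `customer_id` field; the system's actual data structures are not specified:

```python
import random
from collections import defaultdict

def stratified_split(samples, train=0.70, val=0.15, seed=0):
    """Shuffle and partition samples into training/validation/test sets.

    Stratifies on customer_id so that each customer is represented
    proportionally in every partition (70% / 15% / 15% by default).
    """
    rng = random.Random(seed)
    by_customer = defaultdict(list)
    for s in samples:
        by_customer[s["customer_id"]].append(s)
    train_set, val_set, test_set = [], [], []
    for group in by_customer.values():
        rng.shuffle(group)  # shuffle before partitioning
        n = len(group)
        n_train, n_val = round(n * train), round(n * val)
        train_set += group[:n_train]
        val_set += group[n_train:n_train + n_val]
        test_set += group[n_train + n_val:]
    return train_set, val_set, test_set

# Hypothetical dataset: 100 samples each for two customers.
samples = [{"customer_id": c, "x": i} for c in ("A", "B") for i in range(100)]
tr, va, te = stratified_split(samples)
print(len(tr), len(va), len(te))  # → 140 30 30
```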
The design of the network requires the specification of a number of variables including the number of layers, the number of nodes at each layer, the activation functions used, how the nodes are linked together and any other transformations applied to the data. This is referred to as its topology and is illustrated in the diagram of figure 8, which shows an input matrix 73, output features 83 and a four layer neural network specified as follows:

Layer 1: Convolutional layer 75 with 22 filters (the number of features we wish to predict), a kernel size of 31, a stride of 1 and a leaky ReLU activation.

Layer 2: Max pooling layer 77 with a kernel size of 7 and a stride of 1, followed by a flattening operation.

Layer 3: Dense layer 79 with 24 nodes and a ReLU activation.

Layer 4: Dense layer 81 to output and a ReLU activation.

Hyperparameters refer to the variables which must be set when training the network. The optimal values for these were selected via heuristics and iterative grid search. These include:

* Learning Rate: how much the model weights are updated by at each batch. This, along with epsilon, is used for Adam Optimisation (a form of stochastic gradient descent which we use to train the model).
* Epsilon: A small constant for numerical stability which is added to the denominator when computing gradients for backpropagation -this is necessary to avoid division by zero errors.
* Maximum Number of Epochs: this was set at 1500.
* Stagnation Threshold: after how many epochs with no loss improvement to quit training.
* Batch Size: how many samples we pass the model before the error is calculated and the weights updated.
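The four-layer topology above can be sketched as a single forward pass in plain NumPy. The weights here are random stand-ins and the input is synthetic (90 days across 22 usage categories); the sketch only illustrates the shape each layer produces, not the trained model:

```python
import numpy as np

rng = np.random.default_rng(0)

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)

def conv1d(x, kernels):
    """x: (steps, channels); kernels: (n_filters, kernel_size, channels)."""
    n_filters, k, _ = kernels.shape
    steps = x.shape[0] - k + 1
    out = np.empty((steps, n_filters))
    for f in range(n_filters):
        for t in range(steps):
            out[t, f] = np.sum(x[t:t + k] * kernels[f])
    return out

def max_pool1d(x, k):
    """Down-sample along the time axis with a stride of 1."""
    steps = x.shape[0] - k + 1
    return np.array([x[t:t + k].max(axis=0) for t in range(steps)])

# Synthetic input: 90 days of usage across 22 categories.
x = rng.random((90, 22))
h = leaky_relu(conv1d(x, rng.standard_normal((22, 31, 22)) * 0.05))  # layer 1
h = max_pool1d(h, 7).reshape(-1)                                     # layer 2
h = np.maximum(0.0, h @ (rng.standard_normal((h.size, 24)) * 0.05))  # layer 3
y = np.maximum(0.0, h @ (rng.standard_normal((24, 22)) * 0.05))      # layer 4
print(y.shape)  # → (22,)
```

Tracing the shapes: the kernel of size 31 reduces 90 time steps to 60, the pooling window of 7 (stride 1) reduces 60 to 54, flattening gives a 54 × 22 = 1188-element vector, and the two dense layers map this to 24 and then to the 22 output features.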
How the error at the output nodes of a network is calculated is specified by the loss function, which takes Y, the actual output of the sample, and Ŷ, the model's prediction. How the loss function is defined depends upon the task at hand; in this case, it was found that a custom loss function taking into account the weighted absolute error and the percentage error per category was most effective. Due to imbalance in the proportion of minutes/bytes usage per category (e.g. UK data usage is frequently present in the daily data and can be up to 500,000 bytes per day, while directory calls are far rarer and range in length from 0 to just 14 minutes), it was necessary to weight the loss to ensure that the model didn't just learn to predict well on the categories for which there is a lot of data available. Combining this with the per category percentage error ensured that the model was driven to predict equally well across all categories, rather than learning to predict the easier features at the expense of the more difficult ones.
Combined weighted absolute and per category percentage loss:

L(Y, Ŷ) = Σⱼ [ |Yⱼ − Ŷⱼ| / ((Σᵢ Yᵢ,ⱼ) + 1) ] + Σⱼ [ (|Yⱼ − Ŷⱼ| / (|Yⱼ| + 1)) × 100 ]

Training and validation scripts are written using TensorFlow, a convenient Python framework for building systems to train and test machine learning models, and for running those systems in high-performance C++. It facilitates the creation of directed graphs to control the flow of data (in the form of multidimensional matrices, also called tensors) through operations.
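A loss combining a weighted absolute error with a per category percentage error might be implemented along the following lines. This is a sketch of the idea only; the exact weighting and constants used in the system are not fully specified, so the details here are assumptions:

```python
import numpy as np

def combined_loss(y_true, y_pred):
    """Combine absolute error, down-weighted by each category's total
    usage, with a per-category percentage error.

    The + 1.0 terms in the denominators guard against division by zero
    for categories with no usage. (A sketch of the idea; the exact
    weighting in the original system may differ.)

    y_true, y_pred: arrays of shape (batch, categories).
    """
    abs_err = np.abs(y_true - y_pred)
    # Absolute error scaled down for high-volume categories.
    weighted = abs_err / (y_true.sum(axis=0) + 1.0)
    # Percentage error, which matters equally for sparse categories.
    percentage = abs_err / (np.abs(y_true) + 1.0) * 100.0
    return weighted.sum() + percentage.sum()

# Hypothetical batch: a high-volume category and a sparse one.
y_true = np.array([[100.0, 0.0], [200.0, 2.0]])
y_pred = np.array([[110.0, 0.0], [190.0, 5.0]])
print(round(combined_loss(y_true, y_pred), 2))  # → 115.94
```

The percentage term dominates for the sparse second category, which is the effect described above: the model cannot reduce the loss simply by predicting the high-volume categories well.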
A placeholder is a variable that has data assigned to it at a later date. It allows for the creation of operations and the building of our computation graph without needing the data. Data is fed into the graph through these placeholders. TensorFlow also includes a tool called TensorBoard, a visualisation suite which is used to plot quantitative metrics and generate visualisations of the computational graph, as illustrated in the diagram 91 of figure 9 which shows a placeholder input 93, variable inputs 95, operations 97 and an output 99.
In order to update the network weights in such a way that minimises the loss function, the change in the output of the loss function as the weights change is determined. This is calculated using backpropagation, an algorithm which calculates the partial derivative (also known as the slope) of the loss function with respect to each weight. Each weight is then adjusted such that the gradient of the slope decreases, a technique more commonly known as gradient descent. The amount by which the weight changes is specified by the learning rate and the momentum, and whether it is increased or decreased depends on the direction of the slope.
Figure 10 shows a graph 101 which plots loss 103 against weight 105. Curve 107 shows the initial weight 113, weight updates 115, calculated gradients 111 and a minimum value 109. Adam (Adaptive Moment Estimation) Optimisation, a specialised form of gradient descent is used to compute the learning rate and momentum for each weight individually.
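Plain gradient descent (of which Adam is a specialised, per-weight adaptive form) can be illustrated with a one-dimensional loss. The loss function here is hypothetical and chosen for its known minimum:

```python
def gradient_descent(grad, w0, learning_rate=0.1, steps=100):
    """Repeatedly move the weight against the gradient of the loss,
    scaled by the learning rate, converging towards a minimum."""
    w = w0
    for _ in range(steps):
        w -= learning_rate * grad(w)
    return w

# Loss L(w) = (w - 3)^2 has gradient 2(w - 3) and its minimum at w = 3,
# analogous to the minimum value 109 on the curve of figure 10.
w_min = gradient_descent(lambda w: 2 * (w - 3), w0=0.0)
print(round(w_min, 4))  # → 3.0
```

Each step corresponds to one of the weight updates 115 in figure 10: the calculated gradient gives the direction, and the learning rate scales the size of the move.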
When running the training script, optional arguments may be specified from the command line. (See script or run python run.py -h for details.) If these are not specified, the script runs with default values selected to give the best performance. When the script is run, an exit handler is initiated which evaluates and saves information about the model and training performance upon exit. This is so that it's possible to kill a running script from the command line without losing any information -the model's parameters are logged and evaluation metrics calculated and saved to file.
A unique ID is generated for each training run, so that the same models training on the same data but running with different parameters can be distinguished for later evaluation. The data is read in from the database and standardised before being batched. Other variables are also initialised at this point to control how often the model is saved and how many times per epoch a summary is printed. TensorFlow variables and operations are also created, along with summary operations to log the loss and accuracy for later visualisation in TensorBoard.
When the session is run, each epoch and each batch is looped through. At each batch, a number of samples equal to the batch size are passed to the model and its predictions are calculated. These are compared to the known output, and the current loss is stored. The Adam optimiser then updates the network weights and the next training batch is generated.
The default value for the number of epochs is 10,000, but in practice training ends before that number is reached. At each epoch, we check whether the number of epochs in a row with no loss decrease has reached the stagnation threshold; if so, we reduce the learning rate. If there are twice the stagnation threshold epochs without the loss decreasing, or if the error decreases past the convergence threshold, training is stopped early. This is partly to avoid models overfitting, and partly to save time if a model or a certain set of hyperparameters is clearly not performing well.
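The stagnation and convergence logic described above might be expressed as a small policy function. The function below is an illustrative sketch under the stated rules, not the system's actual code:

```python
def should_stop(loss_history, stagnation_threshold, convergence_threshold):
    """Early-stopping policy for one epoch's end.

    Returns (stop, reduce_lr): reduce the learning rate after
    `stagnation_threshold` epochs with no loss improvement, stop after
    twice that many, and stop immediately once the loss falls below the
    convergence threshold.
    """
    best = min(loss_history)
    # epochs elapsed since the loss last improved
    since_best = len(loss_history) - 1 - loss_history.index(best)
    if loss_history[-1] <= convergence_threshold:
        return True, False
    if since_best >= 2 * stagnation_threshold:
        return True, False
    return False, since_best >= stagnation_threshold

print(should_stop([5.0, 4.0, 4.2, 4.1, 4.3], 2, 0.1))       # → (False, True)
print(should_stop([5.0, 4.0, 4.2, 4.1, 4.3, 4.4], 2, 0.1))  # → (True, False)
```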
Each time a training run is completed, the model's performance over the entire run is logged for viewing in TensorBoard, and its parameters are saved so that it can be loaded at a later date for testing or to make predictions. The training script is also saved. After each training run, the model's performance on both training and validation sets is logged along with its hyperparameters and saved to the database. (See training_validation_results, validation_results and test_results.) We primarily used the average percentage batch error per category on validation to rank potential models and hyperparameter performance. The best performing model as described above achieved an average per category training batch error of 6.05% and an average per category validation batch error of 8.76%.
A benchmark dataset consisting of uniformly distributed random noise produced an average per category training batch error of 46.79% and an average per category validation batch error of > 100%, confirming the good fit of our model to the data.
A random sample from the final batch of training predictions of the most successful model is shown below.
Random Training Sample:
Avg per category training batch error: 6.56%
Total training batch error: 4.00%

CATEGORY                     PRED    ACTUAL   ERROR   % BATCH ERR
calls_from_europe            221     163      -58     8.5
calls_from_row               0       0        0       9.9
calls_from_traveller         129     140      11      4.9
chargeable_calls             0       0        0       9.2
data_europe                  1027    1207     180     4.8
data_row                     459     734      275     8.6
data_traveller               0       0        0       5.2
data_uk                      85410   86970    1560    5.7
directory                    0       21       21      6.4
free_calls                   0       0        0       5
idd_uk_mins_to_apac          0       43       43      5.6
idd_uk_mins_to_europe        0       539      539     6.2
idd_uk_mins_to_row           0       7        7       6.7
idd_uk_mins_to_usa_canada    0       390      390     5.3
other                        0       108      108     6.3
picture_msg                  0       0        0       10.6
texts_while_roaming          0       0        0       4.5
uk_calls                     13618   13586    -32     5.7
uk_texts                     0       0        0       5.6
uk_to_international_texts    0       0        0       7

Training & Validation Results:
BATCH_SIZE: 48
TRAIN_BATCHES_PER_EPOCH: 87
NUM_OF_EPOCHS: 1502
TRAINING_TIME: 12 minutes
TRAIN_AVG_LOSS: 2306
TRAIN_TOTAL_BATCH_ERROR: 7.11%
TRAIN_AVG_CAT_BATCH_ERROR: 6.05%
EVAL_AVG_LOSS: 2306
EVAL_TOTAL_BATCH_ERROR: 6.05%
EVAL_AVG_CAT_BATCH_ERROR: 8.76%
LEARNING_RATE: 0.001 + decay
EPSILON: 0.000

Per category avg batch errors:
calls_from_europe              5.94%
calls_from_row                 9.98%
calls_from_traveller           4.83%
chargeable_calls               11.9%
data_europe                    5.07%
data_row                       6.75%
data_traveller                 6.67%
data_uk                        4.96%
directory                      6.36%
free_calls                     7.36%
idd_uk_mins_to_apac            7.59%
idd_uk_mins_to_europe          9.21%
idd_uk_mins_to_row             13.68%
idd_uk_mins_to_usa_and_canada  5.12%
other                          11.30%
picture_msg                    7.54%
texts_while_roaming            11.79%
uk_calls                       10.36%
uk_texts                       8.08%
uk_to_international_texts      11.43%

Per customer avg batch errors:
1.0    8.45%
2.0    6.79%
3.0    5.76%
4.0    10.2%
5.0    2.99%
6.0    14.6%
7.0    6.97%
8.0    4.59%
10.0   4.17%
11.0   3.51%
12.0   10.9%
14.0   5.84%
15.0   12.5%
16.0   6.23%

Figure 11 is a graph of average per customer percentage validation batch error in a range of categories and figure 12 is a graph of loss over training run as related to the above tables.
In another embodiment, an alternative machine learning algorithm was used to improve the quality of prediction of future call and data usage from historical information. In this example, the system used call and data usage records for a given customer for the past 90 days to predict that customer's usage for the next 60 days, allowing the system of the present invention to recommend the most suitable plan for that customer one month ahead of time. This example is illustrated in figures 13 to 18. In doing so, the machine learning algorithm was designed to work with a limited data set and variable amounts of data to address the technical problem that some customers have more historical data than others and some have periods for which no data existed. In addition, different customers have different usage patterns depending on the size of the company, location, amount of travel undertaken by the employees et cetera.
The machine learning algorithm was designed to create a generalised model to address the technical problem that some customers do not have sufficient data to train separate systems. It also had a relatively large number of features (usage categories) that must be predicted in proportion to the amount of data which is available to work with and the sparsity of usage across different categories is very varied.
In this example, approximately 18 months of historical usage data for five customers was used. Each was assigned an identification token. For each day, usage data was available for the following categories:

'iddUkMinsToEurope', 'iddUkMinsToRow', 'iddUkMinsToUsaCanada', 'iddUkMinsToApac', 'iddMinsTotal', 'dataUk', 'ukToInternationalTexts', 'pictureMessages', 'chargeableCalls', 'callsFromEurope', 'dataEurope', 'roamingDaysEU', 'roamingDaysROW', 'roamingDaysTraveller', 'directory', 'ukCalls', 'ukTexts', 'freeCalls', 'other', 'callsFromTraveller', 'dataTraveller', 'textsWhileRoaming', 'callsFromRow', 'dataRow', 'oobCost'

For each customer, this data was then apportioned into overlapping samples, each containing 90 days of usage data for each category per day (model input) and the following 60 days of usage data (usage prediction), to create a base dataset from which the training, validation and test data is built. In this example, usage rather than cost is predicted because cost is likely to change over time and for different plans. In addition, cost may be calculated directly from usage as necessary when predictions are made.
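The construction of overlapping samples can be sketched as follows. This is a pure-Python illustration in which the per-day usage records are stand-ins:

```python
def make_samples(daily_usage, input_days=90, target_days=60):
    """Slice one customer's per-day usage records into overlapping
    samples: `input_days` of model input followed immediately by
    `target_days` of prediction target, advancing one day at a time."""
    samples = []
    span = input_days + target_days
    for start in range(len(daily_usage) - span + 1):
        x = daily_usage[start : start + input_days]
        y = daily_usage[start + input_days : start + span]
        samples.append((x, y))
    return samples

# ~18 months (540 days) of usage yields 540 - 150 + 1 = 391 samples.
days = list(range(540))  # stand-in for per-day usage vectors
print(len(make_samples(days)))  # → 391
```

Each customer contributes one sample per possible start day, which is how a limited history still yields hundreds of training samples.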
Figure 13 is a graph 100 which shows data related to the customer tokens labelled 102, 104, 106, 108 and 110.

Per customer sample count:

Customer Token                          Number of Samples (90 + 60 days)
67d4ad8e-baa1-4140-bf9d-c53c0fde8f23    110
123b5671-f3be-4865-9312-f7c107086a4e    110
06caf371-a1b5-45ff-9612-bad3ece84c2b    300
33b64c7d-5d46-4e8f-874f-e4383b939b2f    300
b0391f16-ebd7-4ee4-9999-714f614425c1    -

This dataset presents challenges as the majority of customers have large periods for which data is unavailable. For this reason, the analysis herein will focus on customers b0391f16-ebd7-4ee4-9999-714f614425c1 and 06caf371-a1b5-45ff-9612-bad3ece84c2b, for which we have complete data.
In this example, multivariate multiple regression (MMR) was used in preference to larger models with higher data requirements. MMR is a statistical machine learning technique in which multiple dependent variables are modelled using multiple inputs x₁ to xₙ. It can also be understood as a neural network with a single linear layer.
Since the data is continuous, the mean square error (MSE) is used as the criterion:

ŷᵢ = b₀ + b₁x₁ + b₂x₂ + … + bₙxₙ

e = (1/n) Σᵢ₌₁ⁿ (yᵢ − ŷᵢ)²

Stochastic gradient descent (AdamW) is used to optimise the weights b such that the error e between the predicted data and the actual data is minimised. Figure 14 is an illustration 120 of the MMR input 122 and output 124 data and learned weights 126.
The MMR model was trained individually for each customer dataset for 1000 epochs, with a learning rate of 0.001. Each dataset was normalised between 0 and 1 prior to training. We also train a base model on all samples from all existing customers, which can be used to make predictions for new customers with little historical data available.
This training resulted in validation losses as follows:

Customer Token                          Validation Loss
b0391f16-ebd7-4ee4-9999-714f614425c1    0.004099701996892691
33b64c7d-5d46-4e8f-874f-e4383b939b2f    0.0030984063632786274
06caf371-a1b5-45ff-9612-bad3ece84c2b    0.009908866137266159
123b5671-f3be-4865-9312-f7c107086a4e    0.004331550560891628
67d4ad8e-baa1-4140-bf9d-c53c0fde8f23    0.0013555206824094057
BASE (all training samples)             0.00632152333855629

Figures 15 and 16 are graphs which show examples of the trained model predictions for customer data. Figure 15 shows a graph 130 with a numerical y axis 140 and a temporal (days) x axis 138, and trained model predictions: actual 132 (green), predicted 136 (purple), and error 134 (red) for customers b0391f16-ebd7-4ee4-9999-714f614425c1 and 06caf371-a1b5-45ff-9612-bad3ece84c2b.
Figure 16 shows a graph 150 with a numerical Y axis 160 and a temporal (days) x axis 158, trained model predictions: actual 152 (green), predicted 156 (purple), and error 154 (red) for customers b0391f16-ebd7-4ee4-9999-714f614425c1 and 06caf371-alb5-45ff-9612-bad3ece84c2b.
The performance of the model is quantified for each class c using the absolute error (prediction − actual usage) divided by the maximum actual usage value in that category, averaged over the total number of days d for which a prediction may be made:

Eᶜ = (1/d) Σᵢ₌₁ᵈ |pᵢ − aᵢ| / max(aᶜ)

For example, if actual usage for some category ranged between 0 and 10 units, and on average the prediction was incorrect by 3 units, a percentage error of 30% would be apparent.
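This metric might be computed as follows; a NumPy sketch in which the array shapes (days × categories) are an assumption:

```python
import numpy as np

def category_percentage_error(pred, actual):
    """Mean absolute error per category, divided by the maximum actual
    usage seen in that category, expressed as a percentage.

    pred, actual: arrays of shape (days, categories).
    """
    mae = np.abs(pred - actual).mean(axis=0)
    return mae / actual.max(axis=0) * 100.0

# The worked example above: usage ranging 0-10 units, predictions
# off by 3 units on average, giving a 30% error.
actual = np.array([[0.0], [10.0], [5.0], [9.0]])
pred = actual + 3.0
print(category_percentage_error(pred, actual))  # → [30.]
```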
Figure 17 is a graph which shows percentage category errors for a range of data categories for a token and figure 18 is a graph which shows mean absolute error/category maximum error for a range of data categories for a token.
As shown in figures 17 and 18, this embodiment of the present invention provides predictions which are generally very reliable, falling within 8% of the maximum usage for each category for both tokens.
Figures 19 to 27 presented below illustrate an example of the customer and service provider experience when using a system in accordance with the present invention in which the ongoing prediction system and method is combined with a system and method for automating the tendering process. Figure 19 is a schematic flow diagram 141 which shows the process steps for the use of an example of a system in accordance with the present invention. In this example, the system allows the user to compare their existing contract to a benchmark, manually create an ITT and receive responses, compare ongoing usage to identify situations where the usage has diverged from that anticipated in the contract, and to allow the automatic creation of an ITT. The customer may, upon receiving an automatically created tender, decide to use it in its entirety or as the basis of a tender document, or may decide to work on a manually created version. The manual process is described below with reference to figures 20 and 21.
The customer uploads their usage data 143 in a suitable format. The data may be collected automatically from the current service provider, or the client may upload the data from billing and usage information they have received from the service provider.
The platform takes the usage data, reformats it and compares it with a benchmark tariff to determine a savings calculation 145. At this part of the process, the system may use real current data or may use data which has been created as a prediction of future use to provide a benchmark comparison with future use as shown in figure 1.
In order to create an invitation to tender (ITT), the customer completes a questionnaire and the ITT is published on the platform, where it is viewable by service providers 147. Suppliers who view the tender may wish to complete a tender response 149. The tender responses are submitted and the system marks and ranks the tender responses and produces a recommendation based on the marks and rankings 151.
In addition, the system provides various ways in which a visual comparison may be made between different tender responses 153 in order to assist the customer in making a decision and to allow them to alter the scoring of the parameters defined in the ITT. Thereafter, the customer chooses a supplier 155 and contracts are negotiated and signed.
There is also the facility to provide automatic feedback to the suppliers who were unsuccessful 157.
Step 159 in the process is referred to as an early warning and it relates to the system and method of the present invention. The machine learning module predicts future customer usage, and an analysis module maps the predicted future customer usage onto the service contract, calculates the cost associated with the future usage and determines the excess cost of said future usage under the terms of the service contract, in order to warn the customer as to the suitability of the current contract as it is running and to allow the customer to amend usage or amend the contract in response to the determination of excess cost, as described in figure 4 above.
Step 160 is the process described with reference to figures 3 and 4 in which ongoing usage data is used to automatically generate an ITT.
Figure 20 is a schematic flow diagram 161 which shows the process steps for a customer savings calculation. The customer firstly goes to the URL 163 then enters a username/password or registers 165. The user then uploads usage information 167. This may include the details of the network currently used by the customer.
The system will ask the customer to provide login details for current supplier network billing engine or the customer can manually add billing information.
The customer then adds company details 169 and initiates the benchmarking process which can be almost immediate or may take several hours depending upon the customer's current network.
Once the results have been calculated, the customer will receive notification that they are available. Typically, the notification is via an email inviting the customer to see the minimum savings they will be able to achieve on the platform. This is based on the benchmark tariff the system has created based on customer requirements. The savings are based on the active lines on the account over the last 3 months.
Figure 21 is a schematic flow diagram 191 which shows the process steps for a customer creating a tender specification. The customer firstly goes to the URL 193 then enters a username/password or registers 195. The customer is then presented with a series of graphical user interfaces which contain questions with drop down boxes designed to extract information from the customer about their requirements, which the customer must complete to proceed. This may include expiry of current contract, preferred length of new contract, number of voice lines, number of data lines, usage growth, current spend, key post codes for signal, EU usage, non-EU usage and device/hardware requirements. The customer may add additional information which is relevant to the ITT in free form text boxes.
When completed the customer will finish the process, receive an acknowledgement and obtain a timeline for response 199.
The customer may check and edit the ITT prior to publication. The tender will close in a minimum of 2 weeks or 2 weeks after the month where the contract finishes.
Once published the customer cannot make any changes to the tender.
Figure 22 is a schematic flow diagram 201 which shows the process steps for a supplier creating a tender response. The supplier firstly goes to the URL 203 then enters a username/password or registers 205.
Once the supplier has accessed their account, they can search for tenders 207. The graphical user interface allows the supplier to view the tender 209 in order to access details of a customer's requirements, including important free form text for additional customer requirements. The next stage 211 is an eight step process to complete and submit a tender response. In doing so, the supplier must provide specific cost and service level information relating to tariff, OOB (out of bundle) usage, hardware fund, network, additional lines, signal and customer service. Thereafter the tender response is submitted on the system 213. The above information is submitted using templates for general tariff requirements for the various items such as monthly line discount, usage which is pre-categorised for ease of analysis, traveller days which correspond to the daily roaming allowances, and ROW, which is the actual amount of usage for countries which are outside of any packages offered by the networks.
Costs may be inputted based on the normal per usage cost. If packages are available to cover usage in a more cost effective way, there is a Bundle Wizard which allows a supplier to add the specific pack and apply it to the usage type. Bundles are applied and the total cost is detailed against each category; however, the total for the category is only counted once so there is no double counting for the OOB totals.
The cost of devices which are to be part of the hardware fund forms part of the hardware fund value and is not in addition to the total hardware fund provided. It is assumed that the hardware will be deducted from the total fund at cost. Postcodes selected by the customer are used to check signal, which is important if the customer has confirmed the possibility of a network change.
A summary of all the responses in the tender is provided for suppliers to review. If correct, suppliers can post their tender using the submit tender response button 213.
Figure 23 is a schematic flow diagram 221 which shows the process steps for a customer selecting a supplier. The customer firstly goes to the URL 223, then enters a username/password or registers 225. Thereafter the customer views the tender responses 227, selects and modifies tender metrics 229 and selects the most suitable tender 231.
A selection of the graphical user interfaces presented to the customer is provided in figures 24 to 27. The GUIs assist the customer in comparing the tender responses, reprioritising the parameters and ultimately selecting the best tender.
Figure 24 is a screen shot 241 from an example of a graphical user interface which shows the ranking of suppliers and their score 243. The categories 245 are (from left to right) average per line per month cost, monthly cost, total fixed and OOB cost, and contract saving. As can be seen, four responses are listed and ranked in order of score.
Figure 25 is a screen shot from an example of a graphical user interface which shows the side-by-side comparison of suppliers, which is continued in figure 26, which shows a "health score" for the parameters. The score 253 is shown at the top of the table for each tender response, with the categories 255 arranged in columns. A summary breakdown is provided 257, together with a health check: a colour code which goes from green, through light green, yellow and orange, to red, signifying the quality of the response in relation to the following parameters: value, tariff, customer service, signal and convenience, green being best and red being worst.
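The five-band health check could be modelled as a simple threshold mapping. The numeric scale (0 to 100) and the band boundaries below are assumptions for illustration; the patent specifies only the colour ordering from green (best) to red (worst).

```python
def health_colour(score):
    """Map a numeric health score (assumed 0-100 scale) to a colour band."""
    bands = [(80, "green"), (60, "light green"), (40, "yellow"),
             (20, "orange"), (0, "red")]
    for threshold, colour in bands:
        if score >= threshold:
            return colour
    return "red"  # scores below all thresholds fall into the worst band
```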
Figure 27 is a screen shot 271 from an example of a graphical user interface which shows a slider tool which allows a customer to reprioritise parameters. The parameters of tariff, value, customer service, network and convenience 273 each have a slider 275 with a maximum and minimum value represented by the line 277, which allows the customer to assign different weights (importances) to these parameters.
Reprioritisation can lead to a change in the ranking of the suppliers and result in a different outcome.
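The effect of reprioritisation on ranking can be illustrated with a weighted-sum score. The parameter names follow figure 27, but the per-parameter scores, the weight values and the scoring formula itself are invented for this sketch; the patent does not disclose the exact ranking calculation.

```python
def rank_responses(responses, weights):
    """Return (name, scores) pairs ordered by weighted score, best first."""
    def weighted(scores):
        return sum(weights[p] * scores[p] for p in weights)
    return sorted(responses, key=lambda r: weighted(r[1]), reverse=True)

# Hypothetical tender responses with 0-1 scores per parameter.
responses = [
    ("Supplier A", {"tariff": 0.9, "value": 0.6, "customer service": 0.5,
                    "network": 0.7, "convenience": 0.6}),
    ("Supplier B", {"tariff": 0.6, "value": 0.8, "customer service": 0.9,
                    "network": 0.5, "convenience": 0.7}),
]

# Sliding the tariff weight up ranks Supplier A first ...
tariff_first = rank_responses(responses, {
    "tariff": 1.0, "value": 0.2, "customer service": 0.2,
    "network": 0.2, "convenience": 0.2})
# ... while prioritising customer service instead ranks Supplier B first.
service_first = rank_responses(responses, {
    "tariff": 0.2, "value": 0.2, "customer service": 1.0,
    "network": 0.2, "convenience": 0.2})
```

This shows why moving a slider can change the outcome: the same responses produce a different order under different weightings.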
The present invention provides a system and method for selecting a service supplier which makes it easier for a customer to get the optimum business mobile contract.
In at least one example, the system of the present invention comprises an automated comparison platform for business mobile communications which seamlessly creates an ITT based on a customer's ongoing use, using a machine learning module to predict future use, and matches customer needs to the best supplier solution. The system automates the procurement process so that the customer gets a robust, repeatable and reliable process in which machine learning and advanced analytics are used to ensure accurate tenders, tailored proposals and side-by-side comparison, resulting in clear and informed business choices. The system aims to save a customer time and money. In addition, it gives a reliable and systematic basis for their purchase decision. Once the contract has been let, the system may monitor ongoing usage and advise the customer of divergences between use and contract terms.
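The post-contract monitoring step mentioned above can be sketched as a per-category comparison of usage against contracted allowances, flagging categories that diverge beyond a tolerance. The category names, figures and tolerance below are hypothetical, not from the patent.

```python
def find_divergences(usage, allowances, tolerance=0.10):
    """Return categories where usage exceeds its allowance by > tolerance."""
    flagged = {}
    for category, used in usage.items():
        allowed = allowances.get(category, 0)
        if allowed and (used - allowed) / allowed > tolerance:
            flagged[category] = used - allowed  # amount over the allowance
    return flagged

# Illustrative figures: data usage is 30% over its contracted allowance,
# so only that category is reported to the customer.
usage = {"data_gb": 130, "minutes": 900, "roaming_days": 4}
allowances = {"data_gb": 100, "minutes": 1000, "roaming_days": 5}
divergences = find_divergences(usage, allowances)
```

A report of this kind is what would let the customer amend their usage, or renegotiate the contract, before excess charges accumulate.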
It makes the process automated, streamlined and secure, helps a customer to find the best value, tailored solution and provides a smart comparison platform which expertly ranks, compares and recommends, giving the customer confidence in their chosen solution. From the initial automatic bill upload and analysis to the creation of the tender, the present invention cuts out unnecessary supplier interactions. It provides a customer with a free, competitive environment in which they can find the best supplier solution and realise significant cost savings. The comparison platform provides side-by-side comparison, with ranked attributes which can be weighted depending on customer requirements. In at least one aspect of the present invention, the system matches individual customer needs to the best service supplier selected from a number of alternatives.
This will help bring transparency to the tender process, cutting out the guesswork, enabling real, informed choice, and saving time, money and resources on both sides.
The description of the invention, including that which describes examples of the invention with reference to the drawings, may comprise a computer apparatus and/or processes performed in a computer apparatus. However, the invention also extends to computer programs, particularly computer programs stored on or in a carrier adapted to bring the invention into practice. The program may be in the form of source code, object code, or a code intermediate source and object code, such as in partially compiled form or in any other form suitable for use in the implementation of the method according to the invention. The carrier may comprise a storage medium such as a ROM, e.g. a CD-ROM, or a magnetic recording medium, e.g. a memory stick or hard disk. The carrier may be an electrical or optical signal which may be transmitted via an electrical or an optical cable or by radio or other means.
In the specification the terms "comprise, comprises, comprised and comprising" or any variation thereof and the terms "include, includes, included and including" or any variation thereof are considered to be totally interchangeable and they should all be afforded the widest possible interpretation and vice versa.
Improvements and modifications may be incorporated herein without deviating from the scope of the invention.

Claims (47)

  1. A computer system for monitoring system usage under a service contract, the system comprising: a customer data processing module for receiving customer usage information; a machine learning module which comprises a trained data model which analyses the customer usage information and predicts future customer usage; and an analysis module which maps the predicted future customer usage onto the service contract and creates an invitation to tender based on the predicted future customer usage and any modifications which have been made to the service contract based upon the predicted future customer usage.
  2. The computer system as claimed in claim 1 wherein the system further comprises communications means for providing the created invitation to tender to one or more suppliers who may wish to respond to the invitation to tender.
  3. The computer system as claimed in claim 1 or claim 2 wherein the system further comprises communications means for providing the created invitation to tender to the customer.
  4. The computer system as claimed in any preceding claim wherein the system further comprises communications means for communication between the supplier and the customer.
  5. The computer system as claimed in any preceding claim wherein the system further comprises a customer control function which allows the customer to amend the created invitation to tender.
  6. The computer system as claimed in any preceding claim wherein the system selects the optimum tender response from a plurality of supplier responses to the tender document.
  7. The computer system as claimed in any preceding claim wherein the machine learning module comprises a neural network.
  8. The computer system as claimed in any preceding claim wherein the trained data model is trained using sample data which comprises historical usage information.
  9. The computer system as claimed in claim 8 wherein, after a pre-processing step, the sample data consists of a single row, with one column per input feature and one per output feature.
  10. The computer system as claimed in claim 8 or claim 9 wherein the sample data is received in different categories.
  11. The computer system as claimed in any of claims 8 to 10 wherein the sample data is provided in three categories: bill analysis, bill analysis summary and calls.
  12. The computer system as claimed in any of claims 8 to 11 wherein the sample data comprises usage data spanning a predetermined period to allow prediction of future usage.
  13. The computer system as claimed in any of claims 8 to 12 wherein the sample data comprises usage data spanning the predetermined period for at least one category of sample data.
  14. The computer system as claimed in any of claims 8 to 13 wherein the sample data comprises usage data spanning the predetermined period for each category.
  15. The computer system as claimed in any of claims 8 to 14 wherein the sample data contains months 1 to 3 of usage data per day for each category to allow prediction of month 5 usage.
  16. The computer system as claimed in any preceding claim wherein the performance of the trained data model is evaluated by splitting the data into training data, validation data and testing data.
  17. The computer system as claimed in claim 16 wherein the training data set is used to learn the network's weights.
  18. The computer system as claimed in claim 16 wherein the validation data set is used to assess the trained data model's output after each training run to allow adjustment of hyperparameters and/or network topology in order to improve performance, without overfitting on the training set.
  19. The computer system as claimed in claim 8 wherein the sample data is partitioned into training data, validation data and test data sets.
  20. The computer system as claimed in claim 19 wherein the samples are partitioned into training data, validation data and test data sets with a ratio of 70% training data, 15% validation data and 15% test data.
  21. The computer system as claimed in claim 20 wherein the partitioned samples are stratified using the customer ID to ensure even customer representation across the datasets.
  22. The computer system as claimed in claim 7 wherein the neural network comprises a convolutional layer, a pooling layer, a dense layer with ReLU activation and a dense output layer with ReLU activation.
  23. The computer system as claimed in claim 22 wherein the convolutional layer has one filter for each of the features to be predicted.
  24. The computer system as claimed in claim 22 wherein the pooling layer comprises max pooling with a kernel size of 7 and a stride of 1, followed by a flattening operation.
  25. The computer system as claimed in claim 7 wherein an error at output nodes of the neural network is specified by a loss function.
  26. The computer system as claimed in claim 25 wherein the loss function is defined with reference to the weighted absolute error and the percentage error per category.
  27. The computer system as claimed in claim 25 or claim 26 wherein the loss function is a weighted loss function which is weighted to improve prediction in categories for which there is limited data available.
  28. The computer system as claimed in claim 27 wherein the weighted loss function is combined with a per-category percentage error to drive the model to predict equally well across all categories, rather than learning to predict the easier features at the expense of the more difficult ones.
  29. The computer system as claimed in claim 16 wherein network weights are updated in such a way as to minimise the loss function.
  30. The computer system as claimed in claim 25 wherein the change in the output of the loss function as the weights change is calculated using backpropagation, an algorithm which calculates the partial derivative of the loss function with respect to each weight; each weight is then adjusted such that the gradient of the slope decreases.
  31. The computer system as claimed in claim 25 wherein the amount by which a weight changes is specified by a learning rate and momentum, and whether it is increased or decreased depends on the direction of the slope.
  32. The computer system as claimed in claim 1 wherein the machine learning module comprises a multivariate multiple regression (MMR) model used to predict future data usage, in which multiple dependent variables Y1, ..., Yn are modelled using multiple inputs X1, ..., Xm.
  33. The computer system as claimed in claim 32 wherein the machine learning module comprises a neural network with a single linear layer.
  34. The computer system as claimed in claim 32 or 33 wherein the neural network is trained using an adaptive optimiser to optimise weights b such that the error e between the predicted data and the actual data is minimised.
  35. The computer system as claimed in any of claims 32 to 34 wherein the machine learning module: categorises customer data; and apportions the categorised customer data into overlapping samples for a predetermined time period to create a base dataset from which the training, validation and test data is built.
  36. The computer system as claimed in claim 35 wherein the customer data is usage data.
  37. The computer system as claimed in claim 35 wherein the predetermined time period is 70-110 days.
  38. The computer system as claimed in claim 32 wherein the predetermined time period is 90 days.
  39. The computer system as claimed in claim 32 wherein the MMR uses call and data usage records for a given customer for the predetermined time period to predict that customer's usage for a future time period, to allow the system to recommend the most suitable plan for that customer one month ahead of time.
  40. The computer system as claimed in any preceding claim wherein the system further comprises a computer system for selecting a service provider, the system comprising: a customer data processing module for receiving customer usage information; a machine learning module which comprises a trained data model which analyses the customer usage information and predicts future customer usage; a benchmarking module which compares customer usage information with benchmarking data; a tender database which formats client tender information and which creates an invitation to tender (ITT); a supplier database to which ITT responses are uploaded by one or more suppliers; and a comparison module which compares the ITT responses, ranks the ITT responses in accordance with one or more criteria and provides ranking information to the customer; wherein the customer selects one of the ITT responses based on the ranking of the ITT responses.
  41. The computer system as claimed in any preceding claim wherein the system further comprises a computer system for monitoring system usage under a service contract, the system comprising: a customer data processing module for receiving customer usage information; a machine learning module which comprises a trained data model which analyses the customer usage information and predicts future customer usage; and an analysis module which maps the predicted future customer usage onto the service contract, calculates the cost associated with the future usage and determines the excess cost of said future usage under the terms of the service contract, to allow the customer to amend usage or amend the contract in response to the determination of excess cost.
  42. The computer system as claimed in any preceding claim wherein the customer data processing module reformats the customer data to allow the machine learning module to compare it with a suitable benchmark tariff to determine a savings calculation.
  43. The computer system as claimed in any preceding claim wherein the customer data processing module reformats the customer data to allow the benchmarking module to compare it with a suitable benchmark tariff to determine a savings calculation.
  44. The computer system as claimed in any preceding claim wherein the invitation to tender presents the customer with a series of graphical user interfaces which contain questions designed to extract information from the customer about their requirements in a suitable format.
  45. The computer system as claimed in claim 43 wherein the suitable format is a standardised format which compels suppliers to provide information in a way which makes it readily comparable with information provided by alternative suppliers.
  46. The computer system as claimed in any preceding claim wherein the comparison module compares the ITT responses by calculating parameter values for parameters from categories of information defined in the tender document and ranking the tender based on the categories.
  47. The computer system as claimed in claim 45 wherein a weighting is applied to the categories.
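Claims 19 to 21 describe partitioning the sample data into 70% training, 15% validation and 15% test sets, stratified by customer ID. A minimal, pure-Python sketch of such a split is given below; the sample layout and field names are assumed for illustration only.

```python
import random

def stratified_split(samples, key=lambda s: s["customer_id"],
                     ratios=(0.70, 0.15, 0.15), seed=0):
    """Split samples per customer in the given ratios, so every customer
    is represented across the training, validation and test sets."""
    rng = random.Random(seed)
    by_customer = {}
    for s in samples:
        by_customer.setdefault(key(s), []).append(s)
    train, val, test = [], [], []
    for rows in by_customer.values():
        rng.shuffle(rows)  # shuffle within each customer's samples
        n = len(rows)
        n_train = round(n * ratios[0])
        n_val = round(n * ratios[1])
        train += rows[:n_train]
        val += rows[n_train:n_train + n_val]
        test += rows[n_train + n_val:]
    return train, val, test

# Hypothetical dataset: 20 samples for each of two customers.
samples = [{"customer_id": c, "usage": u}
           for c in ("A", "B") for u in range(20)]
train, val, test = stratified_split(samples)
```

Splitting per customer rather than over the pooled samples is what guarantees the even representation that claim 21 requires.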
GB2211382.3A 2021-08-11 2022-08-04 A system and method for automating a tender process Pending GB2611854A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB202111550 2021-08-11

Publications (2)

Publication Number Publication Date
GB202211382D0 GB202211382D0 (en) 2022-09-21
GB2611854A true GB2611854A (en) 2023-04-19

Family

ID=83270838

Family Applications (1)

Application Number Title Priority Date Filing Date
GB2211382.3A Pending GB2611854A (en) 2021-08-11 2022-08-04 A system and method for automating a tender process

Country Status (2)

Country Link
GB (1) GB2611854A (en)
WO (1) WO2023017246A1 (en)


Also Published As

Publication number Publication date
WO2023017246A1 (en) 2023-02-16
GB202211382D0 (en) 2022-09-21
