EP3937084A1 - Training a model to perform a task on medical data - Google Patents

Training a model to perform a task on medical data

Info

Publication number
EP3937084A1
Authority
EP
European Patent Office
Prior art keywords
model
training
local
data
parameter
Prior art date
Legal status
Withdrawn
Application number
EP20185311.6A
Other languages
German (de)
English (en)
Inventor
Ravindra Balasaheb Patil
Chaitanya Kulkarni
Dinesh Mysore Siddu
Maulik Yogeshbhai PANDYA
Current Assignee
Koninklijke Philips NV
Original Assignee
Koninklijke Philips NV
Priority date
Filing date
Publication date
Application filed by Koninklijke Philips NV filed Critical Koninklijke Philips NV
Priority to EP20185311.6A priority Critical patent/EP3937084A1/fr
Priority to EP21742110.6A priority patent/EP4179467A1/fr
Priority to JP2022578700A priority patent/JP2023533188A/ja
Priority to PCT/EP2021/068922 priority patent/WO2022008630A1/fr
Priority to CN202180049170.7A priority patent/CN115803751A/zh
Priority to US18/015,144 priority patent/US20230252305A1/en
Publication of EP3937084A1 publication Critical patent/EP3937084A1/fr
Withdrawn legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/098 Distributed learning, e.g. federated learning
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H30/00 ICT specially adapted for the handling or processing of medical images
    • G16H30/40 ICT specially adapted for the handling or processing of medical images for processing medical images, e.g. editing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/10 Machine learning using kernel methods, e.g. support vector machines [SVM]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/20 Ensemble learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Definitions

  • Embodiments herein relate to training a model using a distributed machine learning process.
  • Models can be trained using machine learning processes on large volumes of data from patients who have been treated previously. Models trained in this manner have the potential to be used to make predictions in many areas of medicine, such as image segmentation and diagnosis, amongst others. Such models may be used to better personalise healthcare.
  • One way of mitigating issues associated with moving sensitive patient data off-site is to train a model using a distributed machine learning process, such as the Federated Learning process described in the paper by Bonawitz et al. 2019 entitled "Towards Federated Learning at Scale: System Design".
  • Distributed learning enables models to be trained using data from different clinical sites without the data leaving the premises.
  • Fig. 1 shows a central server 102 in communication with a plurality of clinical sites, 104 to 112.
  • the central server co-ordinates training of a model using a distributed learning process using training data located at each of the clinical sites 104 to 112.
  • the central server holds a "global" or central copy of the model and may send 114 information about the global model, e.g. such as parameters enabling a local copy of the model to be created, to each clinical site.
  • Each clinical site may then create a local copy of the model and train its local copy on training data at the respective clinical site.
  • Each clinical site 104 to 112 may then send 116 an update to one or more parameters of the model to the central server.
  • the central server combines the updates, for example through averaging, from the respective clinical sites to update the global model. This allows a global model at a central server 102 to be trained, e.g. updated and improved, based on training data at a plurality of clinical sites 104 to 112, without the data having to leave the respective clinical site. It is an object of embodiments herein to improve on such processes for training models to perform a task on medical data using a distributed machine learning process.
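  • For illustration only (the application itself contains no code), the round-trip above can be sketched as follows. This is a minimal sketch of one round of plain federated averaging; the function names (`train_locally`, `run_round`) and the unweighted averaging rule are assumptions for illustration, not the method of the embodiments.

```python
import numpy as np

def train_locally(global_weights, site_data):
    """Placeholder for on-premise training at one clinical site 104-112.
    The training data in site_data never leaves the site; only the
    updated weights are returned (arrow 116 in Fig. 1)."""
    local_weights = global_weights.copy()
    # ... several epochs of training on site_data would happen here ...
    return local_weights

def run_round(global_weights, sites):
    """One round co-ordinated by the central server 102: send the global
    model out (arrow 114), collect the local updates, and average them."""
    local_updates = [train_locally(global_weights, s) for s in sites]
    return np.mean(local_updates, axis=0)  # plain, unweighted averaging
```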
  • a method of training a model to perform a task on medical data using a distributed machine learning process whereby a global model is updated based on training performed on local copies of the model at a plurality of clinical sites.
  • the method comprises: a) sending information to the plurality of clinical sites to enable each of the plurality of clinical sites to create a local copy of the model and train the respective local copy of the model on training data at the respective clinical site; b) receiving, from each of the plurality of clinical sites, i) a local update to a parameter in the model obtained by training the local copy of the model on the training data at the respective clinical site and ii) metadata related to a quality of the training performed at the respective clinical site; and c) updating the parameter in the global model, based on the received local updates to the parameter and the received metadata.
  • Metadata related to the quality of the training performed at each site may be used when combining the local updates into an update for the global model.
  • different local updates may be given different significances (e.g. through the use of weightings) dependent on the quality of the training performed at the respective clinical site.
  • This can result in improved training, resulting in improved models and thus improved clinical outcomes for clinical processes that use the models.
  • when the model is trained on data from different sites, there may be irregularities in the data, and this can lead to bias and model drift.
  • by taking the quality of the training into account, model drift may be avoided, leading to a better quality model.
  • a method at a clinical site for training a model to perform a task on medical data using a distributed machine learning process whereby a global model at a central server is updated based on training performed on a local copy of the model at the clinical site.
  • the method comprises: receiving information from a central server enabling a local copy of the model to be created and trained on training data at the clinical site; training a local copy of the model according to the information; and sending to the central server, i) an update to the model based on training of the local copy of the model on the training data at the clinical site and ii) metadata related to a quality of the training performed at the respective clinical site.
  • a model trained according to the first or second aspects to perform a task on medical data.
  • the apparatus comprises a memory comprising instruction data representing a set of instructions, and a processor configured to communicate with the memory and to execute the set of instructions.
  • the set of instructions when executed by the processor, cause the processor to: a) send information to the plurality of clinical sites to enable each of the plurality of clinical sites to create a local copy of the model and train the respective local copy of the model on training data at the respective clinical site; b) receive, from each of the plurality of clinical sites, i) a local update to a parameter in the model obtained by training the local copy of the model on the training data at the respective clinical site and ii) metadata related to a quality of the training performed at the respective clinical site; and c) update the parameter in the global model, based on the received local updates to the parameter and the received metadata.
  • a computer program product comprising a computer readable medium, the computer readable medium having computer readable code embodied therein, the computer readable code being configured such that, on execution by a suitable computer or processor, the computer or processor is caused to perform the method of the first and second aspects.
  • embodiments herein aim to improve methods for training clinical models to perform a task on medical data using distributed machine learning processes.
  • there is an apparatus 200 for use in training a model to perform a task on medical data using a distributed machine learning process, according to some embodiments herein.
  • the apparatus may form part of a computer apparatus or system e.g. such as a laptop, desktop computer or other computing device.
  • the apparatus 200 may form part of a distributed computing arrangement or the cloud.
  • the apparatus comprises a memory 204 comprising instruction data representing a set of instructions and a processor 202 (e.g. processing circuitry or logic) configured to communicate with the memory and to execute the set of instructions.
  • the set of instructions when executed by the processor, may cause the processor to perform any of the embodiments of the method 300 as described below.
  • Embodiments of the apparatus 200 may be for use in training a model to perform a task on medical data using a distributed machine learning process whereby a global model is updated based on training performed on local copies of the model at a plurality of clinical sites.
  • the set of instructions when executed by the processor 202, cause the processor to: a) send information to the plurality of clinical sites to enable each of the plurality of clinical sites to create a local copy of the model and train the respective local copy of the model on training data at the respective clinical site; b) receive, from each of the plurality of clinical sites, i) a local update to a parameter in the model obtained by training the local copy of the model on the training data at the respective clinical site and ii) metadata related to a quality of the training performed at the respective clinical site; and c) update the parameter in the global model, based on the received local updates to the parameter and the received metadata.
  • the processor 202 can comprise one or more processors, processing units, multi-core processors or modules that are configured or programmed to control the apparatus 200 in the manner described herein.
  • the processor 202 can comprise a plurality of software and/or hardware modules that are each configured to perform, or are for performing, individual or multiple steps of the method described herein.
  • the processor 202 may comprise a plurality of (for example, interoperated) processors, processing units, multi-core processors and/or modules configured for distributed processing. It will be appreciated by a person skilled in the art that such processors, processing units, multi-core processors and/or modules may be located in different locations and may perform different steps and/or different parts of a single step of the method described herein.
  • the memory 204 is configured to store program code that can be executed by the processor 202 to perform the method described herein.
  • one or more memories 204 may be external to (e.g. separate to or remote from) the apparatus 200.
  • one or more memories 204 may be part of another device.
  • Memory 204 can be used to store the global model, the received local updates, the received metadata and/or any other information or data received, calculated or determined by the processor 202 of the apparatus 200 or from any interfaces, memories or devices that are external to the apparatus 200.
  • the processor 202 may be configured to control the memory 204 to store the global model, the received local updates, the received metadata and/or any other information or data described herein.
  • the memory 204 may comprise a plurality of sub-memories, each sub-memory being capable of storing a piece of instruction data.
  • at least one sub-memory may store instruction data representing at least one instruction of the set of instructions, while at least one other sub-memory may store instruction data representing at least one other instruction of the set of instructions.
  • Fig. 2 only shows the components required to illustrate this aspect of the disclosure and, in a practical implementation, the apparatus 200 may comprise additional components to those shown.
  • the apparatus 200 may further comprise a display.
  • a display may comprise, for example, a computer screen, and/or a screen on a mobile phone or tablet.
  • the apparatus may further comprise a user input device, such as a keyboard, mouse or other input device that enables a user to interact with the apparatus, for example, to provide initial input parameters to be used in the method described herein.
  • the apparatus 200 may comprise a battery or other power supply for powering the apparatus 200 or means for connecting the apparatus 200 to a mains power supply.
  • a computer implemented method 300 for use in training a model to perform a task on (e.g. process) medical data using a distributed machine learning process whereby a global model is updated based on training performed on local copies of the model at a plurality of clinical sites.
  • Embodiments of the method 300 may be performed, for example by an apparatus such as the apparatus 200 described above.
  • the method 300 comprises: sending 302 information to the plurality of clinical sites to enable each of the plurality of clinical sites to create a local copy of the model and train the respective local copy of the model on training data at the respective clinical site.
  • the method 300 comprises receiving 304, from each of the plurality of clinical sites, i) a local update to a parameter in the model obtained by training the local copy of the model on the training data at the respective clinical site and ii) metadata related to a quality of the training performed at the respective clinical site.
  • the method comprises updating 306 the parameter in the global model, based on the received local updates to the parameter and the received metadata.
  • bias describes how well a model matches the training set. A model with high bias won't match the data set closely, while a model with low bias will match the data set very closely. Bias comes from models that are overly simple and fail to capture the trends present in the data set. Model drift can be classified into two broad categories. The first type is called 'concept drift', where the relationship between the model's inputs and outputs changes over time; the second is 'data drift', where the statistical properties of the input data change.
  • Metadata related to the quality of the training performed at each site may be used when combining the local updates into an update for the global model. In this way, different local updates may be given different significances (e.g. through the use of weightings) dependent on the quality of the training performed at the respective clinical site.
  • the model may comprise any type of model that can be trained using a machine learning process.
  • models include, but are not limited to neural networks, deep neural networks such as F-Nets, U-Nets and Convolutional Neural Networks, Random Forest models and Support Vector Machine (SVM) models.
  • machine learning can be used to find a predictive function for a given dataset; the dataset typically provides a mapping from given inputs to outputs.
  • the predictive function (or mapping function) is generated in a training phase, which involves providing example inputs and ground truth (e.g. correct) outputs to the model.
  • a test phase comprises predicting the output for a given input.
  • Applications of machine learning include, for example, curve fitting, facial recognition and spam filtering.
  • the model comprises a neural network model, such as a deep neural network model.
  • neural networks are a type of machine learning model that can be trained to predict a desired output for given input data.
  • Neural networks are trained by providing training data comprising example input data and the corresponding "correct" or ground truth outcome that is desired.
  • Neural networks comprise a plurality of layers of neurons, each neuron representing a mathematical operation that is applied to the input data. The output of each layer in the neural network is fed into the next layer to produce an output.
  • weights associated with the neurons are adjusted (e.g. using processes such as back propagation and/or gradient descent) until the optimal weightings are found that produce predictions for the training examples that reflect the corresponding ground truths.
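  • As a toy illustration of the weight-adjustment loop just described (not the network of the embodiments), the following fits a single linear layer by gradient descent on synthetic data; the learning rate, epoch count and data are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
X = rng.normal(size=(100, 3))            # example input data
y = X @ np.array([1.0, -2.0, 0.5])       # ground truth outputs
w = np.zeros(3)                          # weights to be learned

for epoch in range(200):
    predictions = X @ w
    gradient = 2 * X.T @ (predictions - y) / len(X)  # d(MSE)/dw
    w -= 0.1 * gradient                  # gradient descent step

print(w)  # converges towards the ground-truth weights [1.0, -2.0, 0.5]
```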
  • distributed learning processes were described above with respect to Fig. 1 and the detail therein will be understood to apply to embodiments of the apparatus 200 and the method 300.
  • distributed learning processes include, but are not limited to Federated Learning and Distributed Data Parallelism methods.
  • the apparatus 200 may comprise a server that co-ordinates the training performed by the servers at the plurality of clinical sites, in other words, a "central server".
  • the method 300 may be performed or initiated by a user, company or any other designer or orchestrator of the training process, e.g. using the apparatus 200.
  • the central server (e.g. an apparatus such as the apparatus 200) co-ordinates the training, and the plurality of clinical sites may comprise "workers" or nodes.
  • the central server may store and/or maintain (e.g. update) a global model.
  • the global model (or global copy of the model) comprises a master copy, or central copy of the model.
  • the global model represents the current "combined" outcome (e.g. of the local updates) of all of the training performed at the plurality of clinical sites.
  • a clinical site may comprise a hospital, a surgery, a clinic, and/or a datacentre or other computing site suitable for storing medical data originating from such a clinical site.
  • medical data may comprise any type of data that can be used, produced and/or obtained in a medical setting, including but not limited to: clinical diagnostic data, such as patient vital signs, or physiological parameters, medical images, medical files (e.g. such as patient records), and/or outputs of medical machines (e.g. operational or diagnostic data from medical equipment).
  • the model may take as input one or more of the types of medical data described above and perform a task on the medical data.
  • the task may comprise, for example, a classification task or a segmentation task.
  • the model may predict a classification for the medical data and/or provide an output classification.
  • the model may output, for example, a patient diagnosis based on the input medical data.
  • the model may output, for example, a segmentation of the medical image, a location of a feature of interest in the medical image, or a diagnosis based on the medical image.
  • the model may take different types of medical data as input and provide different types of outputs (e.g. perform different tasks) to the examples provided above.
  • the method 300 comprises: a) sending (302) information to the plurality of clinical sites to enable each of the plurality of clinical sites to create a local copy of the model and train the respective local copy of the model on training data at the respective clinical site.
  • the information may comprise model information, indicating the type of model, and/or values of parameters in the model.
  • the information may comprise parameters including, but not limited to, a number of layers in the neural network model, the input and output channels of the model, and values of the weights and biases in the neural network model.
  • the information sent in step a) is sufficient to enable each of the plurality of clinical sites to create a local copy of the model.
  • the information may further comprise instructions of how each clinical site is to train the model.
  • the information may indicate, for example, a number of epochs of training that is to be performed, a number of pieces of training data that should be used to train the model, the type of data that is to be used to train the model, etc.
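  • A hypothetical example of such information, serialised as a Python dictionary, is shown below; the field names and values are illustrative assumptions, not a format prescribed by the embodiments.

```python
# Hypothetical payload for step a); field names are illustrative only.
information = {
    "model": {
        "type": "neural_network",
        "num_layers": 5,                 # number of layers in the model
        "input_channels": 1,             # input channels of the model
        "output_channels": 2,            # output channels of the model
        "weights_and_biases": "<serialised parameter values>",
    },
    "training_instructions": {
        "epochs": 10,                    # epochs of training to perform
        "num_training_samples": 500,     # pieces of training data to use
        "data_type": "CT",               # type of data to train on
    },
}
```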
  • in step b) the method 300 comprises receiving (304), from each of the plurality of clinical sites, i) a local update to a parameter in the model obtained by training the local copy of the model on the training data at the respective clinical site and ii) metadata related to a quality of the training performed at the respective clinical site.
  • the local update to the parameter in the model may comprise an outcome of the training of the local copy of the model on the training data at the respective clinical site. For example, a change in a parameter of the model resulting from the training.
  • the parameter may comprise a weight or bias in the neural network, or a change that should be applied to a weight or bias in a neural network.
  • step b) may comprise receiving updated values w_i (or changes in values Δw_i) of one or more weights or biases in the neural network model.
  • the metadata is related to a quality of the training performed at the respective clinical site.
  • the metadata provides an indication of a performance of the respective local copy of the model after the training. For example, an indication of the accuracy of the local model at the respective clinical site.
  • the metadata provides an indication of a performance of the respective local copy of the model after the training, for one or more subsets of training data having a common characteristic that is expected to influence model error.
  • the characteristic may be expected to influence model error, for example, by making it easier (or conversely more difficult) for the model to perform the task on (e.g. classify/segment) the medical data.
  • the metadata may comprise an indication of the performance of the respective local model at classifying medical data with different quality levels, or different levels of completeness (e.g. full images compared to partial images).
  • the metadata may comprise medical statistics that can influence the training error.
  • the metadata may comprise statistics relating to features of the training data at the respective medical site that may influence the accuracy of the respective local model, for example, the number of training data samples of high quality compared to the number of training data samples of low quality.
  • the metadata provides an indication of a quality of the training data at the respective clinical site.
  • the metadata may provide an indication of a distribution of the training data at the clinical site between different output classifications of the model.
  • output classifications may comprise labels or categories output by the model.
  • the metadata may describe whether the training data is evenly distributed between different output classifications, or whether the training data is skewed towards particular classifications (e.g. with more training data associated with some labels compared to other labels).
  • each clinical site has different ratios of data in each class and the trainable data varies as distributed learning is performed.
  • the returned metadata may comprise the number of samples per class present in each node during weight updating. This may provide an indication of how balanced the training data is (e.g. between different classes) that was used to train the respective local model. Local updates resulting from more balanced training data sets may be given more weight compared to local updates resulting from less balanced training data sets.
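  • A sketch of how a clinical site might compute such class-balance metadata is given below; the normalised-entropy balance score is an illustrative choice, not a measure prescribed by the embodiments.

```python
from collections import Counter
import math

def class_balance_metadata(labels):
    """Return the number of samples per class, plus a balance score in
    [0, 1] (1.0 = perfectly even classes) based on normalised entropy.
    The score is an illustrative choice of balance measure."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    balance = entropy / math.log(k) if k > 1 else 0.0
    return {"samples_per_class": dict(counts), "balance": balance}

# e.g. class_balance_metadata(["lesion", "healthy", "healthy", "lesion"])
# -> {'samples_per_class': {'lesion': 2, 'healthy': 2}, 'balance': 1.0}
```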
  • in step c) the method 300 comprises updating (306) the parameter in the global model, based on the received local updates to the parameter and the received metadata.
  • the metadata is used to perform parameter merging at the central server. Therefore, the merged parameter may comprise a function of the parameters received from the clinical sites and the corresponding metadata.
  • merged parameter = f(metadata, parameters received from clinical sites). Mathematically, the function can be represented as follows: consider n clinical sites N1, N2, N3, ..., Nn with parameters W1, W2, W3, ..., Wn, where each of the clinical sites has a quality measure α1, α2, α3, ..., αn; each alpha value varies between 0 and 1 and is calculated from the metadata sent from the clinical sites to the central server.
  • step c) may comprise combining the local updates to the parameter to determine an update to the global model by weighting each local update according to the respective metadata such that local updates associated with metadata indicating high quality training outcomes have a higher weighting (e.g. a higher value of ⁇ as described above) compared to updates associated with metadata indicating low quality training outcomes. For example, generally, a local update associated with a more accurate local model may be given a higher weighting compared to a local update associated with a less accurate local model.
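  • One natural reading of this merging function is a quality-weighted average of the local parameters; the normalised weighted-average form below is an assumption consistent with the description, not a formula quoted from the application.

```python
import numpy as np

def merge_parameters(local_params, alphas):
    """Merge parameters W1..Wn from n clinical sites using quality
    measures alpha1..alphan in [0, 1] derived from the metadata.
    Sites with higher-quality training contribute more to the merge."""
    W = np.asarray(local_params, dtype=float)  # shape (n_sites, n_params)
    a = np.asarray(alphas, dtype=float)        # shape (n_sites,)
    return (a[:, None] * W).sum(axis=0) / a.sum()

# Two sites: the higher-quality site (alpha = 0.9) dominates the merge.
merged = merge_parameters([[1.0, 2.0], [3.0, 4.0]], alphas=[0.9, 0.3])
# -> array([1.5, 2.5])
```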
  • the medical data comprises computed tomography, CT, scan data.
  • the metadata may provide an indication of a performance of the respective local copy of the model at classifying CT images of different radiation dosage, e.g. the metadata may provide an indication of the performance of the model when classifying high dosage CT scans and/or (or compared to) low dosage CT scans.
  • the model may be expected to be able to classify CT images of high radiation dosage more accurately than CT images of low radiation dosage.
  • such metadata may be used to prioritise an update received from a first clinical site having a local model with higher performance on high dosage CT scans, compared to an update received from a second clinical site having a local model with lower performance on high dosage CT scans, e.g. even if the performance of the first model on low dosage CT scans is comparatively poor.
  • the metadata may describe the number of training data samples that are low dose or high dose for contrast enhancement. As noted above, if a model makes mistakes on low dose CT images, this error is given less weight compared to a model making a mistake on high dose CT (as the expectation is for the algorithm to perform very well on high dose CT images, a few mistakes on low dose CT images will be acceptable).
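  • A minimal sketch of such a dose-aware quality measure is shown below; the 0.8/0.2 weighting of high-dose against low-dose accuracy is purely an illustrative assumption.

```python
def dose_weighted_quality(high_dose_accuracy, low_dose_accuracy,
                          w_high=0.8, w_low=0.2):
    """Combine per-stratum accuracies into one quality measure (alpha)
    in [0, 1], penalising mistakes on high dose CT more heavily than
    mistakes on low dose CT.  The 0.8/0.2 split is illustrative."""
    return w_high * high_dose_accuracy + w_low * low_dose_accuracy

# A site that is strong on high dose scans scores well even if its low
# dose performance is comparatively poor:
# dose_weighted_quality(0.95, 0.60) -> 0.88
```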
  • the metadata may comprise an indication of the performance of the model when classifying training data of different completeness levels.
  • the model is trained to perform a segmentation of an anatomical feature in medical imaging data; and wherein the metadata comprises an indication of the performance of the model when segmenting full images of the anatomical feature and/or partial images of the anatomical feature.
  • such metadata may be used to prioritise an update received from a first clinical site having a local model with higher performance when segmenting full images of the anatomical feature, compared to an update received from a second clinical site having a local model with lower performance when segmenting full images of the anatomical feature, e.g. even if the performance of the first model when segmenting partial images of the anatomical feature is comparatively poor.
  • the metadata may comprise, for example, the following information:
  • a global model may be updated using metadata that provides a deeper insight into the quality of local updates determined by a plurality of clinical sites in a distributed learning scheme.
  • the methods herein thus provide a means to reduce model drift and to combat model bias and data heterogeneity.
  • the method 300 may be improved further by detecting whether the global model is drifting during the training process, based on analysis of the visualization output. For example, if the region of interest that is activated/considered by the model when determining a classification or label keeps varying, the associated drift can be ascertained.
  • the variation value may be computed based on benchmark training data that is fed through the global model at different time points during the training process (e.g. at time point t0, with the change in the variation obtained at t1).
  • the variation computation may be in terms of the coordinate value, or area under the bounding box of the region of interest that is activated/considered by the model.
  • model drift may be determined as follows:
  • the method 300 may comprise determining, for a test medical image, a first region of the test image used by the global model to perform the task on the test medical image.
  • the method may then further comprise, following steps a), b) and c): determining, for the test medical image, a second region of the test image used by the updated global model to perform the task on the test medical image, and comparing the first region of the test image to the second region of the test image to determine a measure of model drift.
  • the step of comparing may comprise, for example, comparing co-ordinates associated with the first and second regions (e.g. at the centre or edges of the regions or bounding boxes), or comparing the areas within the first and second regions and determining whether the regions have changed e.g. by a statistically significant amount, or greater than a threshold amount.
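  • The comparison might be implemented as below; the bounding-box representation and the example thresholds are assumptions for illustration (in practice the threshold may be dynamic, as the next item notes).

```python
def has_drifted(box_t0, box_t1, shift_threshold=10.0, area_threshold=0.2):
    """Compare the region (x_min, y_min, x_max, y_max) used by the model
    at two time points t0 and t1.  Drift is flagged if the region's
    centre moves by more than shift_threshold pixels or its area changes
    by more than area_threshold (20%).  Thresholds are illustrative."""
    def centre(b):
        return ((b[0] + b[2]) / 2.0, (b[1] + b[3]) / 2.0)

    def area(b):
        return (b[2] - b[0]) * (b[3] - b[1])

    (x0, y0), (x1, y1) = centre(box_t0), centre(box_t1)
    shift = ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5
    area_change = abs(area(box_t1) - area(box_t0)) / area(box_t0)
    return shift > shift_threshold or area_change > area_threshold

# Regions such as 406a and 406b of Fig. 4, say (40, 40, 80, 80) at t0
# moving to (60, 55, 110, 100) at t1, would be flagged as drift -> True
```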
  • the threshold may be dynamic, determined based on the current content, the models involved and the type of the model in question; hence it is not static for all applications/model types.
  • FIG. 4 shows an image of a liver 402 comprising a lesion 404.
  • a model is used to classify (e.g. locate) the lesion.
  • the model is trained according to the method 300 above. Preceding the steps a), b) and c), at a time t0, the model classifies the lesion based on the region of the image 406a. At a time t1, the model classifies the same lesion based on the region of the image 406b. The difference in the locations and sizes of the regions 406a and 406b may indicate that the model has drifted. Thus, by comparing and monitoring changes in the region between different training epochs/updates, drift of the model may be determined.
  • steps a), b) and c) may be repeated, e.g. to provide a sequence of training epochs.
  • steps a), b) and c) may be repeated periodically, or each time new training data becomes available at a clinical site.
  • the method may be augmented through the use of Active Learning.
  • the skilled person will be familiar with active learning but, in brief, active learning focusses training on training data that has previously been misclassified, or classified with a low probability of accuracy, by the model, thus effectively focussing the training on areas of weakness in the model.
  • the method may thus comprise repeating steps a), b) and c) for a subset of the training data at each respective clinical site that was classified by the model with a certainty below a threshold certainty level.
  • the certainty may be measured using confidence levels output by the model.
  • the certainty with which the model classified the data may be calculated using a measure of entropy. Measures of entropy may reflect an amount of information in a dataset: the higher the entropy, the higher the amount of information in the dataset. For example, if a dataset has high entropy, it has variety in its content.
  • an ambiguity zone may be defined comprising training data for which the classification is uncertain. Training data in such an ambiguity zone may be used in subsequent epochs of model training. It is noted that the ambiguity zone may be dynamic, and change between epochs as the (global) model improves.
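  • A sketch of selecting such an ambiguity zone from the model's output class probabilities is given below; the fixed entropy threshold is an illustrative assumption (as noted above, the zone may be dynamic between epochs).

```python
import numpy as np

def ambiguity_zone(class_probabilities, entropy_threshold=0.5):
    """Return indices of training samples whose predicted class
    distribution has high entropy, i.e. was classified with low
    certainty, for use in the next training epoch.  The threshold
    value is illustrative and could be adapted between epochs."""
    p = np.clip(np.asarray(class_probabilities, dtype=float), 1e-12, 1.0)
    entropy = -(p * np.log(p)).sum(axis=1)   # per-sample entropy
    return np.flatnonzero(entropy > entropy_threshold)

# Confident predictions such as [0.99, 0.01] (entropy ~ 0.06) are left
# out; ambiguous ones such as [0.5, 0.5] (entropy ~ 0.69) are selected.
print(ambiguity_zone([[0.99, 0.01], [0.5, 0.5]]))  # -> [1]
```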
  • the metadata and thus, the quality measure value ( ⁇ ) as described above may change for each training epoch.
  • FIG. 5 illustrates an apparatus 500 for use in a clinical site for training a model to perform a task on medical data using a distributed machine learning process, according to some embodiments herein.
  • the apparatus may form part of a computer apparatus or system e.g. such as a laptop, desktop computer or other computing device.
  • the apparatus 500 may form part of a distributed computing arrangement or the cloud.
  • the apparatus comprises a memory 504 comprising instruction data representing a set of instructions and a processor 502 (e.g. processing circuitry or logic) configured to communicate with the memory and to execute the set of instructions.
  • the set of instructions when executed by the processor, may cause the processor to perform any of the embodiments of the method 600 as described below.
  • Embodiments of the apparatus 500 may be for use in a clinical site for training a model to perform a task on medical data using a distributed machine learning process whereby a global model at a central server is updated based on training performed on a local copy of the model at the clinical site.
  • the set of instructions when executed by the processor, cause the processor to: receive information from a central server enabling a local copy of the model to be created and trained on training data at the clinical site; train a local copy of the model according to the information; and send to the central server, i) an update to the model based on training of the local copy of the model on the training data at the clinical site and ii) metadata related to a quality of the training performed at the respective clinical site.
  • the processor 502 can comprise one or more processors, processing units, multi-core processors or modules that are configured or programmed to control the apparatus 500 in the manner described herein.
  • the processor 502 can comprise a plurality of software and/or hardware modules that are each configured to perform, or are for performing, individual or multiple steps of the method described herein.
  • the processor 502 may comprise a plurality of (for example, interoperated) processors, processing units, multi-core processors and/or modules configured for distributed processing. It will be appreciated by a person skilled in the art that such processors, processing units, multi-core processors and/or modules may be located in different locations and may perform different steps and/or different parts of a single step of the method described herein.
  • the memory 504 is configured to store program code that can be executed by the processor 502 to perform the method described herein.
  • one or more memories 504 may be external to (i.e. separate to or remote from) the apparatus 500.
  • one or more memories 504 may be part of another device.
  • Memory 504 can be used to store the local copy of the model, the training data, the outputs of the training and/or any other information or data received, calculated or determined by the processor 502 of the apparatus 500 or from any interfaces, memories or devices that are external to the apparatus 500.
  • the processor 502 may be configured to control the memory 504 to store the local copy of the model, the training data, the outputs of the training and/or any other information or data produced by or used during the method 600 described below.
  • the memory 504 may comprise a plurality of sub-memories, each sub-memory being capable of storing a piece of instruction data.
  • at least one sub-memory may store instruction data representing at least one instruction of the set of instructions, while at least one other sub-memory may store instruction data representing at least one other instruction of the set of instructions.
  • Fig. 5 only shows the components required to illustrate this aspect of the disclosure and, in a practical implementation, the apparatus 500 may comprise additional components to those shown.
  • the apparatus 500 may further comprise a display.
  • a display may comprise, for example, a computer screen, and/or a screen on a mobile phone or tablet.
  • the apparatus may further comprise a user input, such as a keyboard, mouse or other input device that enables a user to interact with the apparatus, for example, to provide initial input parameters to be used in the method described herein.
  • the apparatus 500 may comprise a battery or other power supply for powering the apparatus 500 or means for connecting the apparatus 500 to a mains power supply.
  • turning to Fig. 6, there is a computer implemented method 600 for use in training a model to perform a task on medical data using a distributed machine learning process whereby a global model at a central server is updated based on training performed on a local copy of the model at the clinical site.
  • Embodiments of the method 600 may be performed, for example by an apparatus such as the apparatus 500 described above.
  • the method 600 comprises: receiving information from a central server enabling a local copy of the model to be created and trained on training data at the clinical site.
  • the method comprises training a local copy of the model according to the information.
  • the method comprises sending to the central server, i) an update to the model based on training of the local copy of the model on the training data at the clinical site and ii) metadata related to a quality of the training performed at the respective clinical site.
  • a clinical site may comprise a server (e.g. a "clinical server", such as the apparatus 500) or a datacentre associated with a hospital, a surgery, a clinic, or any other medical facility.
  • a clinical site may comprise, for example, a datacentre such as a Hospital Data Centre (HDC) or any other computing site suitable for storing medical data.
  • the information received in step 602 was described above with respect to Figs. 2 and 3 and the detail therein will be understood to apply equally to the apparatus 500 and the method 600.
  • the clinical site uses the information to create a local copy of the model and trains the local copy of the model using training data at the clinical site (e.g. according to the information received from the central server).
  • the clinical site obtains metadata related to a quality of the training performed at the respective clinical site on the local model, and in step 606, sends i) an update to the model based on training of the local copy of the model on the training data at the clinical site (e.g. the outcome of the training) and ii) the metadata to the central server.
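  • Steps 602 to 606 might look as follows at the clinical site, here using a toy linear model so that the sketch is self-contained; the model, the metadata fields and the function name are illustrative assumptions, not APIs defined by the application.

```python
import numpy as np

def client_round(global_weights, X_site, y_site, epochs=5, lr=0.1):
    """Sketch of steps 602-606: create a local copy of the (toy linear)
    model, train it on the on-premise data, and return i) the update to
    the model and ii) metadata related to the quality of the training."""
    w = global_weights.copy()                 # step 602: local copy
    for _ in range(epochs):                   # step 604: local training
        gradient = 2 * X_site.T @ (X_site @ w - y_site) / len(X_site)
        w -= lr * gradient
    update = w - global_weights               # i) local update
    mse = float(np.mean((X_site @ w - y_site) ** 2))
    metadata = {"num_samples": len(X_site),   # ii) quality metadata
                "training_mse": mse}
    return update, metadata                   # sent in step 606
```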
  • the metadata was described in detail above with respect to the apparatus 200 and the method 300 and the details therein will be understood to apply equally to the apparatus 500 and the method 600.
  • Fig. 7 illustrates a method of training a model using a distributed learning process according to some embodiments herein.
  • Fig. 7 shows a researcher or other user on a computer or server 700, a central server 702 and a plurality of clinical sites (or nodes) 704; only one clinical site 704 is shown in Fig. 7 for clarity.
  • the model comprises a neural network. The method is as follows.
  • there is also provided a use of a model trained according to any of the methods or apparatus described herein (e.g. the methods 300, 600 or 700, or the apparatus 200 or 500) to perform a task on medical data.
  • the use may be performed in addition to, or separately from, the methods described herein.
  • Examples of use include but are not limited to, for example, segmenting an image (such as CT scan of a liver) using a model trained according to any of the methods herein; classifying (e.g. diagnosing or making some other classification of) medical records using a model trained according to any of the methods herein.
  • FIG. 8 shows an output segmentation of a liver 802 produced by a model trained using a traditional distributed learning process compared to a segmentation of a liver 804 output by a model trained using the methods 300 and 600 described above.
  • a computer program product comprising a computer readable medium, the computer readable medium having computer readable code embodied therein, the computer readable code being configured such that, on execution by a suitable computer or processor, the computer or processor is caused to perform the method or methods described herein.
  • the disclosure also applies to computer programs, particularly computer programs on or in a carrier, adapted to put embodiments into practice.
  • the program may be in the form of a source code, an object code, a code intermediate source and an object code such as in a partially compiled form, or in any other form suitable for use in the implementation of the method according to the embodiments described herein.
  • a program code implementing the functionality of the method or system may be sub-divided into one or more sub-routines.
  • the sub-routines may be stored together in one executable file to form a self-contained program.
  • Such an executable file may comprise computer-executable instructions, for example, processor instructions and/or interpreter instructions (e.g. Java interpreter instructions).
  • one or more or all of the sub-routines may be stored in at least one external library file and linked with a main program either statically or dynamically, e.g. at run-time.
  • the main program contains at least one call to at least one of the sub-routines.
  • the sub-routines may also comprise function calls to each other.
  • the carrier of a computer program may be any entity or device capable of carrying the program.
  • the carrier may include a data storage, such as a ROM, for example, a CD ROM or a semiconductor ROM, or a magnetic recording medium, for example, a hard disk.
  • the carrier may be a transmissible carrier such as an electric or optical signal, which may be conveyed via electric or optical cable or by radio or other means.
  • the carrier may be constituted by such a cable or other device or means.
  • the carrier may be an integrated circuit in which the program is embedded, the integrated circuit being adapted to perform, or used in the performance of, the relevant method.
  • a computer program may be stored or distributed on a suitable medium, such as an optical storage medium or a solid-state medium supplied together with or as part of other hardware, but may also be distributed in other forms, such as via the Internet or other wired or wireless telecommunication systems. Any reference signs in the claims should not be construed as limiting the scope.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Public Health (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Primary Health Care (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Epidemiology (AREA)
  • Radiology & Medical Imaging (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Image Analysis (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)
  • Medical Treatment And Welfare Office Work (AREA)
EP20185311.6A 2020-07-10 2020-07-10 Training a model to perform a task on medical data Withdrawn EP3937084A1 (fr)

Priority Applications (6)

Application Number Priority Date Filing Date Title
EP20185311.6A EP3937084A1 (fr) 2020-07-10 2020-07-10 Training a model to perform a task on medical data
EP21742110.6A EP4179467A1 (fr) 2020-07-10 2021-07-08 Training a model to perform a task on medical data
JP2022578700A JP2023533188A (ja) 2020-07-10 2021-07-08 Training a model to perform a task on medical data
PCT/EP2021/068922 WO2022008630A1 (fr) 2020-07-10 2021-07-08 Training a model to perform a task on medical data
CN202180049170.7A CN115803751A (zh) 2020-07-10 2021-07-08 Training a model to perform a task on medical data
US18/015,144 US20230252305A1 (en) 2020-07-10 2021-07-08 Training a model to perform a task on medical data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
EP20185311.6A EP3937084A1 (fr) 2020-07-10 2020-07-10 Training a model to perform a task on medical data

Publications (1)

Publication Number Publication Date
EP3937084A1 true EP3937084A1 (fr) 2022-01-12

Family

ID=71575212

Family Applications (2)

Application Number Title Priority Date Filing Date
EP20185311.6A Withdrawn EP3937084A1 (fr) 2020-07-10 2020-07-10 Formation d'un modèle pour effectuer une tâche sur des données médicales
EP21742110.6A Pending EP4179467A1 (fr) 2020-07-10 2021-07-08 Entraînement d'un modèle pour effectuer une tâche sur des données médicales

Family Applications After (1)

Application Number Title Priority Date Filing Date
EP21742110.6A Pending EP4179467A1 (fr) 2020-07-10 2021-07-08 Entraînement d'un modèle pour effectuer une tâche sur des données médicales

Country Status (5)

Country Link
US (1) US20230252305A1 (fr)
EP (2) EP3937084A1 (fr)
JP (1) JP2023533188A (fr)
CN (1) CN115803751A (fr)
WO (1) WO2022008630A1 (fr)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115222945B (zh) * 2022-09-15 2022-12-06 深圳市软盟技术服务有限公司 Deep semantic segmentation network training method based on multi-scale adaptive curriculum learning
WO2024071845A1 (fr) * 2022-09-28 2024-04-04 주식회사 메디컬에이아이 Method, program and device for building a medical artificial intelligence model

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3528179A1 (fr) * 2018-02-15 2019-08-21 Koninklijke Philips N.V. Training a neural network

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3528179A1 (fr) * 2018-02-15 2019-08-21 Koninklijke Philips N.V. Training a neural network

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Bonawitz et al., "Towards Federated Learning at Scale: System Design", 2019
Seyyedali Hosseinalipour et al., "From Federated Learning to Fog Learning: Towards Large-Scale Distributed Machine Learning in Heterogeneous Wireless Networks", arXiv.org, Cornell University Library, 7 June 2020, XP081692346 *
Wikipedia, "Federated learning", 28 June 2020, retrieved on 12 November 2020 from <URL:https://en.wikipedia.org/w/index.php?title=Federated_learning&oldid=964899708>, XP055749944 *

Also Published As

Publication number Publication date
CN115803751A (zh) 2023-03-14
JP2023533188A (ja) 2023-08-02
WO2022008630A1 (fr) 2022-01-13
EP4179467A1 (fr) 2023-05-17
US20230252305A1 (en) 2023-08-10


Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION HAS BEEN PUBLISHED

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

B565 Issuance of search results under rule 164(2) epc

Effective date: 20201221

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20220713