WO2023200732A1 - Systems and methods for predicting slide-level class labels for a whole-slide image - Google Patents

Systems and methods for predicting slide-level class labels for a whole-slide image

Info

Publication number
WO2023200732A1
Authority
WO
WIPO (PCT)
Prior art keywords
patches
bag
image
layer
trained
Prior art date
Application number
PCT/US2023/018074
Other languages
English (en)
Inventor
James Pao
Original Assignee
Foundation Medicine, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Foundation Medicine, Inc.
Publication of WO2023200732A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G06N 20/10 Machine learning using kernel methods, e.g. support vector machines [SVM]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G06N 20/20 Ensemble learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/0499 Feedforward networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/09 Supervised learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 5/00 Computing arrangements using knowledge-based models
    • G06N 5/01 Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/0002 Inspection of images, e.g. flaw detection
    • G06T 7/0012 Biomedical image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/60 Type of objects
    • G06V 20/69 Microscopic objects, e.g. biological cells or cellular parts
    • G06V 20/695 Preprocessing, e.g. image segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/60 Type of objects
    • G06V 20/69 Microscopic objects, e.g. biological cells or cellular parts
    • G06V 20/698 Matching; Classification
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H 30/00 ICT specially adapted for the handling or processing of medical images
    • G16H 30/40 ICT specially adapted for the handling or processing of medical images for processing medical images, e.g. editing
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H 50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H 50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/044 Recurrent networks, e.g. Hopfield networks
    • G06N 3/0442 Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/0455 Auto-encoder networks; Encoder-decoder networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/0475 Generative networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/088 Non-supervised learning, e.g. competitive learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/092 Reinforcement learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 5/00 Computing arrangements using knowledge-based models
    • G06N 5/02 Knowledge representation; Symbolic representation
    • G06N 5/022 Knowledge engineering; Knowledge acquisition
    • G06N 5/025 Extracting rules from data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 5/00 Computing arrangements using knowledge-based models
    • G06N 5/04 Inference or reasoning models
    • G06N 5/045 Explanation of inference; Explainable artificial intelligence [XAI]; Interpretable artificial intelligence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 7/00 Computing arrangements based on specific mathematical models
    • G06N 7/01 Probabilistic graphical models, e.g. probabilistic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20021 Dividing image into blocks, subimages or windows
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30004 Biomedical image processing
    • G06T 2207/30024 Cell structures in vitro; Tissue sections in vitro

Definitions

  • This application relates generally to whole-slide images, and, more particularly, to predicting slide-level class labels for whole-slide images.
  • A MIL machine-learning model trains on inputs of sets of image instances (referred to as a “bag” of patches of pixels) as opposed to the individual image instances themselves.
  • a bag of patches of pixels may be structured and pre-processed in a manner in which ground truth class labels are assigned to the bags of patches of pixels for the purposes of training the MIL model.
  • A histopathology image, such as a hematoxylin and eosin (H&E) slide, may include a tissue sample that may be further sequenced and analyzed.
  • one or more biomarkers or gene alterations may be determined as being associated with the whole tissue sample as opposed to, for example, any specific tissue cells within the tissue sample.
  • the MIL model may be particularly suitable for analyzing and classifying bags of patches of pixels in histopathology images, which may often include very large and high-resolution images.
  • Embodiments of the present disclosure are directed toward one or more computing devices, methods, and non-transitory computer-readable media that may generate inference-phase-specific batch normalization parameters for a machine-learning model trained to predict a slide-level class label describing one or more gene alterations or other biomarkers based on a singular whole-slide histopathology image.
  • A multiple-instance learning convolutional neural network (MILCNN) may include, for example, one or more convolutional layers, one or more pooling or max-pooling layers, one or more fully-connected layers, one or more batch normalization layers, and one or more rectified linear units (ReLUs).
  • The one or more batch normalization layers may utilize one or more sets of mini-batch normalization parameters (e.g., a mini-mean parameter and a mini-variance parameter) to normalize features in a feature map at each layer of the MILCNN.
  • each batch of data is normalized by subtracting the batch mean and dividing by the square root of the batch variance.
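  • By way of illustration only (this sketch is not part of the original disclosure), the normalization just described can be written in a few lines of Python; the function name batch_normalize and its defaults are hypothetical:

      import numpy as np

      def batch_normalize(x, gamma=1.0, beta=0.0, eps=1e-5):
          # x: array of shape (batch, features). Normalize each feature by
          # the batch's own mean and the square root of its variance, then
          # apply the learned scaling (gamma) and offset (beta) parameters.
          # eps guards against division by zero.
          mean = x.mean(axis=0)
          var = x.var(axis=0)
          x_hat = (x - mean) / np.sqrt(var + eps)
          return gamma * x_hat + beta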
  • at least one bag of patches of pixels of a singular whole-slide histopathology image may be inputted to the MILCNN and the MILCNN may predict a slide-level label (e.g., predict a slide-level label based on the entire bag of patches of pixels as opposed to the individual patches of pixels constituting the bag).
  • the MILCNN may generate and utilize inference-phase-specific batch normalization parameters, such that the one or more batch normalization layers may normalize features in a feature map at each feature layer of the MILCNN utilizing the inference-phase-specific batch normalization parameters as opposed to utilizing, for example, a running mean and variance calculated based on the one or more sets of mini-batch normalization parameters (e.g., a mini-mean parameter and a mini-variance parameter) learned during the training phase of the MILCNN.
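  • As an illustrative, non-authoritative sketch of this idea in PyTorch: keeping the batch normalization layers in training mode at inference makes them normalize with the statistics of the current bag rather than the running estimates accumulated during training. The helper name predict_with_bag_statistics is hypothetical; the disclosure describes the behavior, not this exact API:

      import torch
      import torch.nn as nn

      def predict_with_bag_statistics(model, bag):
          # bag: tensor of shape (num_patches, C, H, W), all patches drawn
          # from one singular whole-slide image.
          model.eval()  # standard inference mode for the non-BN layers
          for m in model.modules():
              if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
                  m.train()                      # normalize with per-bag statistics
                  m.track_running_stats = False  # leave the running buffers untouched
          with torch.no_grad():
              return model(bag)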
  • The trained MILCNN may better predict slide-level class labels describing one or more gene alterations or other biomarkers (e.g., an epidermal growth factor receptor (EGFR) gene alteration, an anaplastic lymphoma kinase (ALK) gene alteration, an ROS-1 gene alteration, a tumor gene mutation burden (TMB), a neurotrophic tyrosine receptor kinase 3 (NTRK3) gene alteration, a fibroblast growth factor receptor 2 (FGFR2) gene alteration, a mesenchymal-epithelial transition (MET) gene alteration, a phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit alpha (PIK3CA) gene alteration, alterations of neurotrophic tyrosine receptor kinase (NTRK) genes 1/2/3, and so forth) based on a singular whole-slide histopathology image.
  • the trained MILCNN may include batch normalization parameters that are appropriately “fitted” to the data, e.g., inference data or input data, which includes bags of patches of pixels all corresponding to a singular whole-slide histopathology image (e.g., as opposed to being overly “fitted” to only the most recent images inputted to the MIL model as would otherwise be the case utilizing training-phase-determined running mean and running variance batch normalization parameters).
  • one or more computing devices, methods, and non-transitory computer-readable media may segment an image into a plurality of patches.
  • the image includes only one whole-slide image (WSI).
  • the image may include an image of a tissue sample, and each patch of the plurality of patches may include a plurality of pixels corresponding to one or more regions of the image.
  • the image may include a histological stain image, a fluorescence in situ hybridization (FISH) image, an immunofluorescence (IF) image, or a hematoxylin and eosin (H&E) image.
  • the one or more computing devices may then group the plurality of patches into at least one bag of patches and input the at least one bag of patches into a machine-learning model trained to generate a prediction of an image class label based on the at least one bag of patches.
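  • As a purely illustrative sketch of the segment-and-group step (the helper names are hypothetical, and the patch and bag sizes are example values consistent with this disclosure):

      import numpy as np

      def segment_into_patches(image, patch_size=256):
          # Tile an (H, W, C) image array into non-overlapping square patches,
          # each corresponding to one region of the image.
          h, w, _ = image.shape
          return [image[y:y + patch_size, x:x + patch_size]
                  for y in range(0, h - patch_size + 1, patch_size)
                  for x in range(0, w - patch_size + 1, patch_size)]

      def group_into_bag(patches, bag_size=32, rng=None):
          # Randomly sample a subset of patches (e.g., 30-35) into one bag.
          rng = np.random.default_rng() if rng is None else rng
          idx = rng.choice(len(patches), size=bag_size, replace=False)
          return np.stack([patches[i] for i in idx])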
  • the machine-learning model may include a first layer trained to generate one or more feature maps based on the at least one bag of patches, a second layer trained to normalize the one or more feature maps utilizing a set of batch normalization parameters determined from the at least one bag of patches to generate one or more normalized feature maps, and a third layer trained to generate the prediction of the image class label based at least in part on the one or more normalized feature maps.
  • the machine-learning model may include one or more convolutional neural networks (CNNs).
  • the machine-learning model may include one or more fully connected neural networks (FCNNs).
  • the machine-learning model may include a multiple-instance learning (MIL) machine-learning model.
  • the set of batch normalization parameters may include a mean and a variance determined from the at least one bag of patches. In some embodiments, the set of batch normalization parameters corresponds to only the at least one second bag of patches.
  • the one or more computing devices, methods, and non-transitory computer-readable media may receive a training image, segment the training image into a second plurality of patches, group the second plurality of patches into at least one second bag of patches, and input the at least one second bag of patches into the machine-learning model to generate a prediction of a second image class label based on the at least one second bag of patches.
  • the first layer of the machine-learning model may be trained to generate one or more second feature maps based on the at least one second bag of patches.
  • the second layer of the machine-learning model may be trained to normalize the one or more second feature maps utilizing a set of mini-batch normalization parameters determined from the at least one second bag of patches to generate one or more second normalized feature maps.
  • the third layer of the machine-learning model may be trained to generate the prediction of the second image class label for the training image based at least in part on the one or more second normalized feature maps.
  • the first layer may include one or more convolutional layers
  • the second layer may include one or more batch normalization layers
  • the third layer may include an output layer.
  • the one or more batch normalization layers may be trained to compute at least one of a running mean, a running variance, a gamma parameter (i.e., a scaling parameter), and a beta parameter (i.e., an offset parameter) of each of a plurality of sets of mini-batch normalization parameters during the training phase of the machine-learning model.
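  • A minimal sketch of this training-phase bookkeeping, assuming standard batch-normalization conventions (the class name and its momentum default are illustrative, not the patent's implementation):

      import numpy as np

      class TrainingBatchNorm:
          def __init__(self, num_features, momentum=0.1, eps=1e-5):
              self.gamma = np.ones(num_features)   # learned scaling parameter
              self.beta = np.zeros(num_features)   # learned offset parameter
              self.running_mean = np.zeros(num_features)
              self.running_var = np.ones(num_features)
              self.momentum, self.eps = momentum, eps

          def forward_train(self, x):
              # Normalize with this mini-batch's (bag's) own statistics...
              mu, var = x.mean(axis=0), x.var(axis=0)
              # ...while accumulating running estimates for conventional inference.
              self.running_mean = (1 - self.momentum) * self.running_mean + self.momentum * mu
              self.running_var = (1 - self.momentum) * self.running_var + self.momentum * var
              return self.gamma * (x - mu) / np.sqrt(var + self.eps) + self.beta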
  • one or more of a mean and a variance, as determined for the at least one second bag of patches, may likewise be determined for each additional bag of patches.
  • the set of mini-batch normalization parameters may include a mini-batch mean and a mini-batch variance.
  • segmenting the training image into at least one second bag of patches may include randomly sampling one or more patches of pixels of the at least one second bag of patches.
  • the image class label may include an indication of a genetic biomarker of a tissue sample captured in the image.
  • the genetic biomarker of the tissue sample may include an epidermal growth factor receptor (EGFR) gene alteration, an anaplastic lymphoma kinase (ALK) gene alteration, an ROS-1 gene alteration, a tumor gene mutation burden (TMB), a neurotrophic tyrosine receptor kinase 3 (NTRK3) gene alteration, a fibroblast growth factor receptor 2 (FGFR2) gene alteration, a mesenchymal-epithelial transition (MET) gene alteration, a phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit alpha (PIK3CA) gene alteration, or an alteration of neurotrophic tyrosine receptor kinase (NTRK) genes 1/2/3.
  • the one or more computing devices, methods, and non-transitory computer-readable media may then generate a report based on the prediction of the image class label indicating the genetic biomarker of the tissue sample.
  • the one or more computing devices, methods, and non-transitory computer-readable media may cause one or more electronic devices to display the report, in which the one or more electronic devices include a human-machine interface (HMI) associated with a pathologist.
  • FIG. 1 illustrates an exemplary workflow diagram of a training phase for training a machine-learning model to predict a slide-level class label describing one or more gene alterations or other biomarkers based on a singular whole-slide histopathology image, according to some embodiments.
  • FIG. 2 illustrates an exemplary workflow diagram of an inference phase for utilizing a machine-learning model trained to predict a slide-level class label describing one or more gene alterations or other biomarkers based on a singular whole-slide histopathology image, according to some embodiments.
  • FIG. 3A illustrates an exemplary training phase, and FIG. 3B illustrates an exemplary inference phase, of a multiple-instance learning (MIL) convolutional neural network (CNN) model trained to predict a slide-level class label describing one or more gene alterations or other biomarkers based on a singular whole-slide histopathology image, according to some embodiments.
  • FIG. 4 illustrates a flow diagram of an exemplary method for utilizing a machine-learning model trained to predict a slide-level class label describing one or more gene alterations or other biomarkers based on a singular whole-slide histopathology image, according to some embodiments.
  • FIG. 5 illustrates a flow diagram of an exemplary method for training a machine-learning model to predict a slide-level class label describing one or more gene alterations or other biomarkers based on a singular whole-slide histopathology image, according to some embodiments.
  • FIG. 6 illustrates an example computing system, according to some embodiments.
  • FIG. 7 illustrates a diagram of an example artificial intelligence (AI) architecture included as part of the example computing system of FIG. 6, according to some embodiments.
  • FIG. 8A shows a comparison of several exemplary deep learning models for predicting EGFR status from H&E images, with the MIL model being significantly better than comparator two-stage patch models and a weakly-supervised patch prediction model, according to some embodiments.
  • FIG. 8B shows the cross-validated receiver operating characteristic (ROC) curve for the MIL model, according to some embodiments.
  • FIG. 9A shows attention weights for EGFR prediction, separated by predicted tissue morphology, with all patches from 100 high-confidence bags (50 from EGFR-mutant slides and 50 from wild-type slides), according to some embodiments. For both mutant and wild-type slides, the tumor patches received the most attention from the MIL model.
  • FIG. 9B shows the median attention weight per tissue-morphology group for the 100 slides, according to some embodiments. Both mutant and wild-type slides had an appreciable median attention weight for tumor patches. Mutant slides had a higher distribution of median patch attention for tumor and stroma patches than wild-type slides, whereas wild-type slides had a higher distribution of median patch attention for immune, normal, and necrosis patches.
  • FIG. 9C shows the maximum attention weight per tissue-morphology group for each of the 100 slides, according to some embodiments.
  • FIG. 9D shows EGFR tumor positive (TP) attention weights from a bag of 250 patches, according to some embodiments.
  • I-V are acinar-predominant pattern and hobnail cytology, with low peritumoral and intratumoral immune fractions ranging from 0.1 to 0.2.
  • IV has a low presence of necrotic tissue. VI was predicted as stroma by the tissue-morphology model, and pathologists confirmed this patch as fibrosis.
  • FIG. 9E shows EGFR tumor negative (TN) attention weights from a bag of 250 patches, according to some embodiments.
  • I-II are acinar/lepidic pattern with hobnail cytology and intratumoral lymphoid aggregates.
  • III-VI were predicted as tumor or immune foci by the tissue-morphology model. Pathologists confirmed a high peritumoral and intratumoral immune fraction, ranging from 0.2 to 0.7, for these patches. Inflammation was noticeably present as well (IV).
  • FIG. 10B shows the minor architectural pattern of high-attention patches, with strong enrichment for mutant-status prediction for lepidic and micropapillary patterns, for 49 pathologist-reviewed bags, according to some embodiments.
  • FIG. 10C shows cytology for high-attention patches, determined by patch mode, with enrichment of mutant predictions for hobnail and columnar types and enrichment of wild-type predictions for mucinous and sarcomatoid types, for 49 pathologist-reviewed bags, according to some embodiments.
  • FIG. 10D shows non-neoplastic qualities present in high-attention patches, as determined by patch mode, for 49 pathologist-reviewed bags, according to some embodiments.

DESCRIPTION OF EXAMPLE EMBODIMENTS
  • A MIL machine-learning model trains on inputs of sets of image instances (referred to as a “bag” of patches of pixels) as opposed to the individual image instances themselves.
  • a bag of patches of pixels may be structured and pre-processed in a manner in which class labels are assigned to the bags of patches of pixels for the purposes of training the MIL model.
  • A histopathology image, such as a hematoxylin and eosin (H&E) slide, may include a tissue sample, which may be further sequenced and analyzed.
  • one or more biomarkers or gene alterations may be determined as being associated with the whole tissue sample as opposed to, for example, any specific tissue cells comprising the tissue sample.
  • The MIL model may be particularly suitable for analyzing and classifying bags of patches of pixels in histopathology images, which may often include very large and high-resolution images.
  • Because bags of patches of pixels are each taken from the same histopathology image, traditional machine-learning techniques of tracking batch statistics as the model trains on batches of varying images may be unsuitable and may diminish the overall prediction performance of the model. It may therefore be useful to provide techniques to improve MIL models for predicting slide-level class labels for a singular whole-slide image.
  • A multiple-instance learning convolutional neural network (MILCNN) may include, for example, one or more convolutional layers, one or more pooling or max-pooling layers, one or more fully-connected layers, one or more batch normalization layers, and one or more rectified linear units (ReLUs).
  • The one or more batch normalization layers may utilize one or more sets of mini-batch normalization parameters (e.g., a mini-mean parameter and a mini-variance parameter) to normalize features in a feature map at each layer of the MILCNN.
  • each batch of data is normalized by subtracting the batch mean and dividing by the square root of the batch variance.
  • at least one bag of patches of pixels of a singular whole-slide histopathology image may be inputted to the MILCNN and the MILCNN may predict a slide-level label (e.g., predict a slide-level label based on the entire bag of patches of pixels as opposed to the individual patches of pixels constituting the bag).
  • the MILCNN may generate and utilize inference-phase-specific batch normalization parameters, such that the one or more batch normalization layers may normalize features in a feature map at each feature layer of the MILCNN utilizing the inference-phase-specific batch normalization parameters as opposed to utilizing, for example, a running mean and variance calculated based on the one or more sets of mini-batch normalization parameters (e.g., a mini-mean parameter and a mini-variance parameter) learned during the training phase of the MILCNN.
  • The trained MILCNN may better predict slide-level class labels describing one or more gene alterations or other biomarkers (e.g., an epidermal growth factor receptor (EGFR) gene alteration, an anaplastic lymphoma kinase (ALK) gene alteration, an ROS-1 gene alteration, a tumor gene mutation burden (TMB), a neurotrophic tyrosine receptor kinase 3 (NTRK3) gene alteration, a fibroblast growth factor receptor 2 (FGFR2) gene alteration, a mesenchymal-epithelial transition (MET) gene alteration, a phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit alpha (PIK3CA) gene alteration, alterations of neurotrophic tyrosine receptor kinase (NTRK) genes 1/2/3, and so forth) based on a singular whole-slide histopathology image.
  • the trained MILCNN may include batch normalization parameters that are appropriately “fitted” to the data, which includes bags of patches of pixels all corresponding to a singular whole-slide histopathology image (e.g., as opposed to being overly “fitted” to only the most recent images inputted to the MIL model as would otherwise be the case utilizing training-phase-determined running mean and running variance batch normalization parameters).
  • FIG. 1 illustrates a workflow diagram 100 of a training phase for training a machine-learning model to predict a slide-level class label describing one or more gene alterations or other biomarkers based on a singular whole-slide histopathology image, in accordance with the disclosed embodiments.
  • the workflow diagram 100 may be performed by a MIL neural network pipeline 102.
  • the MIL neural network pipeline 102 may be based on a residual neural network (ResNet) image-classification network or a deep ResNet image-classification network (e.g., ResNet-18, ResNet-34, ResNet-50, ResNet-101, ResNet-152) trained on a dataset based on natural (e.g., non-medical) images, such as the ImageNet dataset (e.g., a publicly available labeled high-resolution image database).
  • a data set of images 104 may be accessed.
  • the data set of images 104 may include, for example, any of various whole-slide images (WSIs), such as fluorescence in situ hybridization (FISH) images, immunofluorescence (IF) images, hematoxylin and eosin (H&E) images, immunohistochemistry (IHC) images, imaging mass cytometry (IMC) images, and so forth.
  • the data set of images 104 may include a set of histopathology images (e.g., 1,000 or more histopathology images), which may each include very large and high-resolution images (e.g., 1.5K X 2K pixels, 2K X 4K pixels, 6K X 8K pixels, 7.5K X 10K pixels, 9K X 12K pixels, 15K X 20K pixels, 20K X 24K pixels, 20K X 30K pixels, 24K X 30K pixels).
  • the data set of images 104 is not limited to histopathology images, and may include any large and/or high-resolution images.
  • the MIL neural network pipeline 102 may be trained on a singular WSI 106 (per training instance or per training step) selected from the data set of images 104. Specifically, while previous techniques of training neural networks are based on batches of training images (e.g., 30-35 images per batch) each being independent of each other, in accordance with the presently-disclosed embodiments, the MIL neural network pipeline 102 may be trained on bags of patches of pixels all sampled from the same singular WSI 106.
  • the MIL neural network pipeline 102 may further include segmenting the singular WSI 106 into a complete set of patches of pixels 108, which may each include different regions of pixels of the singular WSI 106 clustered into a respective patch.
  • In certain embodiments, the MIL neural network pipeline 102 may further include grouping the complete set of patches of pixels 108 into bags of patches of pixels 110.
  • the MIL neural network pipeline 102 may include randomly sampling one or more subsets (e.g., 30-35 patches of pixels) of the complete set of patches of pixels 108 to be grouped or clustered into the bags of patches of pixels 110 for inputting into a MIL convolutional neural network (MILCNN) model 112 for training.
  • The MILCNN model 112 may include, for example, any multiple-instance learning neural network machine-learning model that may be trained to predict slide-level class labels based on the bags of patches of pixels 110, in which each bag of the bags of patches of pixels 110 includes the same training class label.
  • At least one bag of the bags of patches of pixels 110 may be inputted to the MILCNN model 112 to train the MILCNN model 112 to generate a prediction of a slide-level label 114 (e.g., generate a prediction of a slide-level label based on the entire at least one bag of the bags of patches of pixels 110 as opposed to the individual patches of pixels 108 constituting the bags of patches of pixels 110).
  • the prediction of a slide-level label 114 may include, for example, a prediction of a slide-level class label describing one or more gene alterations or other biomarkers that may be included in the singular WSI 106.
  • the workflow diagram 100 may be performed iteratively for each of the bags of patches of pixels 110 until the MILCNN model 112 is sufficiently trained (e.g., correctly generating the prediction of the slide-level label 114 with a probability greater than 0.8 or greater than 0.9).
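  • An illustrative sketch of this iteration, hedged: the helper names, the loss choice, and the exact stopping rule are assumptions for exposition; the disclosure itself specifies only the probability threshold:

      import torch
      import torch.nn.functional as F

      def train_on_slide(model, optimizer, bags, slide_label, threshold=0.9):
          # bags: iterable of (num_patches, C, H, W) tensors, all sampled from
          # the same singular WSI; slide_label: torch.tensor([class_index]).
          model.train()
          for bag in bags:
              optimizer.zero_grad()
              logits = model(bag)  # model aggregates the bag into one slide-level prediction
              loss = F.cross_entropy(logits, slide_label)
              loss.backward()
              optimizer.step()
              prob = logits.softmax(dim=-1)[0, slide_label.item()].item()
              if prob > threshold:  # e.g., greater than 0.8 or 0.9
                  break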
  • FIG. 2 illustrates a workflow diagram 200 of an inference phase for utilizing a machine-learning model trained to predict a slide-level class label describing one or more gene alterations or other biomarkers based on a singular whole-slide histopathology image, in accordance with the disclosed embodiments.
  • the slide-level class label is not limited thereto and may be any label relevant to the image utilized to train the machine-learning model and to predict the slide-level class label.
  • the workflow diagram 200 may be performed by a MIL neural network pipeline 202 (e.g., corresponding to the MIL neural network pipeline 102 having been trained as discussed above with respect to FIG. 1).
  • an input WSI 204 may be accessed.
  • the input WSI 204 may include, for example, a fluorescence in situ hybridization (FISH) input image, an immunofluorescence (IF) input image, a hematoxylin and eosin (H&E) input image, an immunohistochemistry (IHC) input image, an imaging mass cytometry (IMC) input image, and so forth.
  • the MIL neural network pipeline 202 may access a singular input WSI 204 (e.g., a real-world, clinical WSI).
  • In traditional approaches, each batch normalization layer in the network calculates its corresponding batch normalization parameters (e.g., mini-batch normalization parameters) during the training phase of the MILCNN model 112, and then, during the inference phase of the MILCNN model 210 (e.g., corresponding to the MIL neural network pipeline 102 having been trained as discussed above with respect to FIG. 1), batch normalization is performed based on a running mean and variance calculated based on the batch normalization parameters (e.g., mini-batch normalization parameters) determined during the training phase of the MILCNN model 112. Because the bags of patches of pixels 208 are all drawn from the same singular input WSI 204, this approach may be unsuitable and may diminish the overall prediction performance of the MILCNN model 210, for example.
  • the running mean and variance of all of the bags of patches of pixels 110 may be calculated to be utilized during the inference phase of the MILCNN model 210 (e.g., corresponding to the MIL neural network pipeline 102 having been trained as discussed above with respect to FIG. 1).
  • the MIL neural network pipeline 202 may further include segmenting the input WSI 204 into a complete set of patches of pixels 206, which may each include different regions of pixels of the input WSI 204 clustered into a respective patch. In certain embodiments, the MIL neural network pipeline 202 may further include grouping the complete set of patches of pixels 206 into bags of patches of pixels 208.
  • the MIL neural network pipeline 202 may include randomly sampling one or more subsets (e.g., 30-35 patches of pixels) of the complete set of patches of pixels 206 to be grouped or clustered into the bags of patches of pixels 208 for inputting into the MILCNN model 210 (e.g., corresponding to the MIL neural network pipeline 102 having been trained as discussed above with respect to FIG. 1).
  • Accordingly, as further illustrated with respect to FIGS. 3A and 3B, the MILCNN model 210 may generate and utilize inference-phase-specific batch normalization parameters, such that one or more batch normalization layers included as part of the MILCNN model 210 may normalize features in a feature map at each layer of the MILCNN model 210 utilizing the inference-phase-specific batch normalization parameters (e.g., as opposed to utilizing, for example, a running mean and variance calculated based on the set of mini-batch normalization parameters determined during the training phase of the MILCNN model 112).
  • At least one bag of the bags of patches of pixels 208 may be inputted to the MILCNN model 210 (e.g., corresponding to the MIL neural network pipeline 102 having been trained as discussed above with respect to FIG. 1) to generate a prediction of a single image-level label 212 (e.g., a slide-level label) (e.g., generate a prediction of a slide-level label based on the entire at least one bag of the bags of patches of pixels 208 as opposed to the individual patches of pixels 206 constituting the bags of patches of pixels 208).
  • the prediction of the slide-level label 212 may include, for example, a prediction of a slide-level class label describing one or more gene alterations or other biomarkers that may be included in the input WSI 204.
  • the slide-level label 212 may describe one or more gene alterations or other biomarkers including, for example, an epidermal growth factor receptor (EGFR) gene alteration, an anaplastic lymphoma kinase (ALK) gene alteration, an ROS-1 gene alteration, a tumor gene mutation burden (TMB), a neurotrophic tyrosine receptor kinase 3 (NTRK3) gene alteration, a fibroblast growth factor receptor 2 (FGFR2) gene alteration, a mesenchymal-epithelial transition (MET) gene alteration, a phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit alpha (PIK3CA) gene alteration, and so forth.
  • FIGS. 3A and 3B illustrate a training phase and an inference phase of a multiple-instance learning (MIL) convolutional neural network (CNN) model 300A, 300B trained to predict a slide-level class label describing one or more gene alterations or other biomarkers based on a singular whole-slide histopathology image, in accordance with the disclosed embodiments.
  • the MILCNN models 300A, 300B may include, for example, a ResNet image-classification network or a deep ResNet image-classification network (e.g., ResNet-18, ResNet-34, ResNet-50, ResNet-101, ResNet-152) trained and utilized as discussed above with respect to the MILCNN model 112 of FIG. 1 and the MILCNN model 210 of FIG. 2, respectively.
  • the MILCNN models 300A, 300B may represent only one example embodiment of a neural network architecture that may be trained and utilized to predict a slide-level class label describing one or more gene alterations or other biomarkers based on a singular whole-slide histopathology image in accordance with the presently disclosed embodiments.
  • the MILCNN models 300A, 300B may include, for example, any number of convolutional layers, pooling layers or max-pooling layers, fully-connected layers, batch normalization layers, or output layers (e.g., ReLUs, artificial neural network (ANN) classifiers, and so forth) that may be scaled up or scaled down based on the implementation and application.
  • the MILCNN model 300A may receive an input 302A including a training bag of patches of pixels.
  • the input 302A including the training bag of patches of pixels may be inputted into a first N X N convolutional layer 304A (e.g., a 3 X 3 convolutional layer, a 7 X 7 convolutional layer, or a 10 X 10 convolutional layer).
  • the first N X N convolutional layer 304A may include, for example, one or more convolutional filters or kernels that may be utilized to extract and generate a feature map based on the input 302A including the training bag of patches of pixels.
  • one or more weighting layers may be included as part of the first N X N convolutional layer 304A (e.g., included as part of the one or more fully-connected layers) that may be utilized to generate and iteratively update one or more gamma parameters (i.e., scaling parameters) and beta parameters (i.e., offset parameters) that may be associated with weighting and biasing the MILCNN model 300A.
  • the feature map outputted by the first N X N convolutional layer 304A may then be inputted to a first batch normalization layer 306A.
  • the first batch normalization layer 306A may normalize the feature map utilizing a first set of mini-batch parameters 308A (e.g., a mini-batch mean value and a mini-batch variance value).
  • the first batch normalization layer 306A may then output the normalized feature map to a first output layer 310A (e.g., first ReLU layer or first activation function layer), which may be utilized to generate a first probability or probability distribution based on the normalized feature map.
  • the first probability or probability distribution outputted by the first output layer 310A may be inputted to a second N X N convolutional layer 312A (e.g., a 3 X 3 convolutional layer, a 7 X 7 convolutional layer, or a 10 X 10 convolutional layer).
  • the second N X N convolutional layer 312A may then generate a second feature map based on the input 302A including the training bag of patches of pixels and the first probability or probability distribution outputted by the first output layer 310A (e.g., first ReLU layer or first activation function layer).
  • the second feature map outputted by the second N X N convolutional layer 312A may then be inputted to a second batch normalization layer 314A.
  • the second batch normalization layer 314A may normalize the second feature map utilizing a second set of mini-batch parameters 316A.
  • the second set of mini-batch parameters 316A may include, for example, batch normalization statistics that correspond to the current training bag of patches of pixels inputted to the first N X N convolutional layer 304A and the second feature layer of the MILCNN model 300A.
  • the second batch normalization layer 314A may then output the second normalized feature map to a second output layer 318A (e.g., second ReLU layer or second activation function layer), which may be utilized to generate a second probability or probability distribution based on the second normalized feature map.
  • the second probability or probability distribution outputted by the second output layer 318A (e.g., second ReLU layer or second activation function layer) and the input 302A including the training bag of patches of pixels may then be summed (e.g., via a summer 320A) and outputted to a third output layer 322A (e.g., third ReLU layer or third activation function layer).
  • the third output layer 322A may then generate a final prediction of a slide-level class label describing one or more gene alterations or other biomarkers that may be included in a WSI from which the input 302A including the training bag of patches of pixels was sampled.
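  • The block just described maps naturally onto a standard ResNet basic block. The following PyTorch sketch is an illustrative reading of FIG. 3A only (the channel count and kernel size are assumptions), not the patent's exact architecture:

      import torch.nn as nn

      class ResidualBlock(nn.Module):
          # conv -> batch norm -> ReLU (304A/306A/310A), then
          # conv -> batch norm (312A/314A), skip-connection sum (320A),
          # and a final ReLU (322A).
          def __init__(self, channels, kernel_size=3):
              super().__init__()
              pad = kernel_size // 2
              self.conv1 = nn.Conv2d(channels, channels, kernel_size, padding=pad)
              self.bn1 = nn.BatchNorm2d(channels)
              self.conv2 = nn.Conv2d(channels, channels, kernel_size, padding=pad)
              self.bn2 = nn.BatchNorm2d(channels)
              self.relu = nn.ReLU()

          def forward(self, x):
              out = self.relu(self.bn1(self.conv1(x)))
              out = self.bn2(self.conv2(out))
              return self.relu(out + x)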
  • Turning to the inference phase of FIG. 3B, the MILCNN model 300B may receive an input 302B including an input bag of patches of pixels, which may be inputted into a first N X N convolutional layer 304B. In certain embodiments, one or more weighting layers may be included as part of the first N X N convolutional layer 304B (e.g., included as part of the one or more fully-connected layers) that may be utilized to generate and iteratively update one or more gamma parameters (i.e., scaling parameters) and beta parameters (i.e., offset parameters) that may be associated with weighting and biasing the MILCNN model 300B.
  • the feature map outputted by the first N X N convolutional layer 304B may then be inputted to a first batch normalization layer 306B.
  • the first batch normalization layer 306B may then generate a first set of inference-phase-specific batch normalization parameters 308B (e.g., an inference-phase-specific mean value and an inference-phase-specific variance value).
  • the first set of inference-phase-specific batch normalization parameters 308B may then be utilized to normalize the feature map outputted by the first N X N convolutional layer 304B.
  • the first set of inference-phase-specific batch normalization parameters 308B may include, for example, batch normalization statistics that correspond to the current input bag of patches of pixels inputted to the first N X N convolutional layer 304B and the first feature layer of the MILCNN model 300B.
  • the first batch normalization layer 306B may then output the normalized feature map to a first output layer 310B (e.g., first ReLU layer or first activation function layer), which may be utilized to generate a first probability or probability distribution based on the normalized feature map outputted by the first batch normalization layer 306B.
  • the first N X N convolutional layer 304B, the first batch normalization layer 306B, and the first output layer 310B may collectively represent a first feature layer of the MILCNN model 300B tasked with learning one or more first specific image features of the input 302B including an input bag of patches of pixels (e.g., the first feature layer of the MILCNN model 300B may predict edges, the second feature layer of the MILCNN model 300B may predict contours, the third feature layer of the MILCNN model 300B may predict surfaces, and so forth).
  • the first probability or probability distribution outputted by the first output layer 310B may be inputted to a second N X N convolutional layer 312B (e.g., a 3 X 3 convolutional layer, a 7 X 7 convolutional layer, 10 X 10 convolutional layer).
  • the second N X N convolutional layer 312B may then generate a second feature map based on the input 302B including an input bag of patches of pixels and the first probability or probability distribution outputted by the first output layer 310B (e.g., first ReLU layer or first activation function layer).
  • the second feature map outputted by the second N X N convolutional layer 312B may then be inputted to a second batch normalization layer 314B.
  • the second batch normalization layer 314B may then generate a second set of inference-phase-specific batch normalization parameters 316B (e.g., an inference-phase-specific mean value and an inference-phase-specific variance value).
  • the second set of inference-phase-specific batch normalization parameters 316B may then be utilized to normalize the second feature map outputted by the second N X N convolutional layer 312B.
  • the second batch normalization layer 314B may then output the second normalized feature map to a second output layer 318B (e.g., second ReLU layer or second activation function layer), which may be utilized to generate a second probability or probability distribution based on the second normalized feature map.
  • the second probability or probability distribution outputted by the second output layer 318B (e.g., second ReLU layer or second activation function layer) and the input 302B including the input bag of patches of pixels may then be summed (e.g., via a summer 320B) and outputted to a third output layer 322B (e.g., third ReLU layer or third activation function layer). The third output layer 322B may then generate a final prediction of a slide-level class label describing one or more gene alterations or other biomarkers that may be included in the input WSI from which the input 302B was sampled.
  • FIG. 4 illustrates a flow diagram of a method 400 for utilizing a machine-learning model trained to predict a slide-level class label describing one or more gene alterations or other biomarkers based on a singular whole-slide histopathology image, in accordance with the disclosed embodiments.
  • the method 400 may be performed utilizing one or more processing devices (e.g., the computing device(s) and artificial intelligence architecture to be discussed below with respect to FIGS. 6 and 7), which may include hardware (e.g., a general purpose processor, a graphics processing unit (GPU), an application-specific integrated circuit (ASIC), a system-on-chip (SoC), a microcontroller, a field-programmable gate array (FPGA), a central processing unit (CPU), an application processor (AP), a visual processing unit (VPU), a neural processing unit (NPU), a neural decision processor (NDP), a deep learning processor (DLP), a tensor processing unit (TPU), a neuromorphic processing unit (NPU), or any other processing device(s) that may be suitable for processing various omics data and making one or more decisions based thereon), software (e.g., instructions running/executing on one or more processors), firmware (e.g., microcode), or some combination thereof.
  • the method 400 may begin at block 402 with one or more processing devices segmenting an image into a plurality of patches.
  • the method 400 may then continue at block 404 with one or more processing devices grouping the plurality of patches into at least one bag of patches.
  • the method 400 may then conclude at block 406 with one or more processing devices inputting the at least one bag of patches into a machine-learning model trained to generate a prediction of an image class label based on the at least one bag of patches and utilizing a set of batch normalization parameters determined from the at least one bag of patches.
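  • Putting blocks 402-406 together, a hedged end-to-end sketch that composes the hypothetical helpers sketched earlier in this document (none of these names come from the disclosure itself):

      import torch

      def predict_slide_label(image, model, patch_size=256, bag_size=32):
          patches = segment_into_patches(image, patch_size)        # block 402
          bag = group_into_bag(patches, bag_size)                  # block 404
          bag = torch.as_tensor(bag).permute(0, 3, 1, 2).float()   # (N, C, H, W)
          return predict_with_bag_statistics(model, bag)           # block 406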
  • FIG. 5 illustrates a flow diagram of a method 500 for training a machine-learning model to predict an image label (e.g., a slide-level class label) describing one or more gene alterations or other biomarkers based on a singular image (e.g., a whole-slide histopathology image), in accordance with the disclosed embodiments.
  • the method 500 may be performed utilizing one or more processing devices (e.g., the computing device(s) and artificial intelligence architecture to be discussed below with respect to FIGS. 6 and 7), which may include hardware (e.g., a general purpose processor, a graphics processing unit (GPU), an application-specific integrated circuit (ASIC), a system-on-chip (SoC), a microcontroller, a field-programmable gate array (FPGA), a central processing unit (CPU), an application processor (AP), a visual processing unit (VPU), a neural processing unit (NPU), a neural decision processor (NDP), a deep learning processor (DLP), a tensor processing unit (TPU), a neuromorphic processing unit (NPU), or any other processing device(s) that may be suitable for processing various omics data and making one or more decisions based thereon), software (e.g., instructions running/executing on one or more processors), firmware (e.g., microcode), or some combination thereof.
  • the method 500 may begin at block 502 with one or more processing devices receiving a training image. The method 500 may then continue at block 504 with one or more processing devices segmenting the training image into a second plurality of patches of pixels. The method 500 may then continue at block 506 with one or more processing devices grouping the plurality of patches of pixels into at least one second bag of patches. The method 500 may then conclude at block 508 with one or more processing devices inputting the at least one second bag of patches into a machine-learning model to generate a prediction of an image class label based on the at least one second bag of patches and utilizing a set of mini-batch normalization parameters determined from the at least one second bag of patches.
  • the present embodiments are directed toward one or more computing devices, methods, and non-transitory computer-readable media that may generate inference-phase-specific batch normalization parameters for a machine-learning model trained to predict a label (e.g., a slide-level class) describing a portion of the image (e.g., one or more gene alterations or other biomarkers) based on a singular image (e.g., a singular whole-slide histopathology image).
  • a multiple-instance learning convolutional neural network may include, for example, one or more convolutional layers, one or more pooling or max-pooling layers, one or more fully-connected layers, one or more batch normalization layers, and one or more rectified linear units (ReLUs).
  • the one or more batch normalization layers may utilize one or more sets of mini-batch normalization parameters (e.g., a mini-mean parameter and a mini-variance parameter) to normalize features in a feature map at each layer of the MILCNN.
  • each batch of data is normalized by subtracting the batch mean and dividing by the square root of the batch variance.
  • at least one bag of patches of pixels of a singular image (e.g., a singular whole-slide histopathology image) may be inputted to the MILCNN, and the MILCNN may predict an image-level label (e.g., predict a slide-level label based on the entire bag of patches of pixels as opposed to the individual patches of pixels constituting the bag).
  • the MILCNN may generate and utilize inference-phase-specific batch normalization parameters, such that the one or more batch normalization layers may normalize features in a feature map at each feature layer of the MILCNN utilizing the inference-phase-specific batch normalization parameters as opposed to utilizing, for example, a running mean and variance calculated based on the one or more sets of mini-batch normalization parameters (e.g., a mini-mean parameter and a mini-variance parameter) learned during the training phase of the MILCNN.
  • the trained MILCNN may better predict slide-level class labels describing one or more gene alterations or other biomarkers (e.g., an epidermal growth factor receptor (EGFR) gene alteration, an anaplastic lymphoma kinase (ALK) gene alteration, an ROS-1 gene alteration, a tumor gene mutation burden (TMB), a neurotrophic tyrosine receptor kinase 3 (NTRK3) gene alteration, a fibroblast growth factor receptor 2 (FGFR2) gene alteration, a mesenchymal-epithelial transition (MET) gene alteration, or a phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit alpha (PIK3CA) gene alteration) based on a singular whole-slide histopathology image.
  • the trained MILCNN may include batch normalization parameters that are appropriately “fitted” to the data, which includes bags of patches of pixels all corresponding to a singular whole-slide histopathology image (e.g., as opposed to being overly “fitted” to only the most recent images inputted to the MIL model as would otherwise be the case utilizing training-phase-determined running mean and running variance batch normalization parameters).
  • FIG. 6 illustrates an example of one or more computing device(s) 600 that may be utilized to generate inference-phase-specific batch normalization parameters for a machine-learning model trained to predict a slide-level class label describing one or more gene alterations or other biomarkers based on a singular whole-slide histopathology image, in accordance with the disclosed embodiments.
  • the one or more computing device(s) 600 may perform one or more steps of one or more methods described or illustrated herein.
  • the one or more computing device(s) 600 provide functionality described or illustrated herein.
  • software running on the one or more computing device(s) 600 performs one or more steps of one or more methods described or illustrated herein, or provides functionality described or illustrated herein. Certain embodiments include one or more portions of the one or more computing device(s) 600.
  • This disclosure contemplates any suitable number of computing device(s) 600.
  • This disclosure contemplates one or more computing device(s) 600 taking any suitable physical form.
  • one or more computing device(s) 600 may be an embedded computer system, a system-on-chip (SOC), a single-board computer system (SBC) (e.g., a computer-on-module (COM) or system-on-module (SOM)), a desktop computer system, a laptop or notebook computer system, an interactive kiosk, a mainframe, a mesh of computer systems, a mobile telephone, a personal digital assistant (PDA), a server, a tablet computer system, an augmented/virtual reality device, or a combination of two or more of these.
  • the one or more computing device(s) 600 may be unitary or distributed; span multiple locations; span multiple machines; span multiple data centers; or reside in a cloud, which may include one or more cloud components in one or more networks.
  • the one or more computing device(s) 600 may perform without substantial spatial or temporal limitation one or more steps of one or more methods described or illustrated herein.
  • the one or more computing device(s) 600 may perform, in real-time or in batch mode, one or more steps of one or more methods described or illustrated herein.
  • the one or more computing device(s) 600 may perform, at different times or at different locations, one or more steps of one or more methods described or illustrated herein, where appropriate.
  • the one or more computing device(s) 600 includes a processor 602, memory 604, database 606, an input/output (I/O) interface 608, a communication interface 610, and a bus 612.
  • processor 602 includes hardware for executing instructions, such as those making up a computer program.
  • processor 602 may retrieve (or fetch) the instructions from an internal register, an internal cache, memory 604, or database 606; decode and execute them; and then write one or more results to an internal register, an internal cache, memory 604, or database 606.
  • processor 602 may include one or more internal caches for data, instructions, or addresses. This disclosure contemplates processor 602 including any suitable number of any suitable internal caches, where appropriate.
  • processor 602 may include one or more instruction caches, one or more data caches, and one or more translation lookaside buffers (TLBs). Instructions in the instruction caches may be copies of instructions in memory 604 or database 606, and the instruction caches may speed up retrieval of those instructions by processor 602.
  • Data in the data caches may be copies of data in memory 604 or database 606 for instructions executing at processor 602 to operate on; the results of previous instructions executed at processor 602 for access by subsequent instructions executing at processor 602 or for writing to memory 604 or database 606; or other suitable data.
  • the data caches may speed up read or write operations by processor 602.
  • the TLBs may speed up virtual-address translation for processor 602.
  • processor 602 may include one or more internal registers for data, instructions, or addresses. This disclosure contemplates processor 602 including any suitable number of any suitable internal registers, where appropriate. Where appropriate, processor 602 may include one or more arithmetic logic units (ALUs); be a multicore processor; or include one or more processors 602.
  • memory 604 includes main memory for storing instructions for processor 602 to execute or data for processor 602 to operate on.
  • the one or more computing device(s) 600 may load instructions from database 606 or another source (such as, for example, another one or more computing device(s) 600) to memory 604.
  • Processor 602 may then load the instructions from memory 604 to an internal register or internal cache.
  • processor 602 may retrieve the instructions from the internal register or internal cache and decode them.
  • processor 602 may write one or more results (which may be intermediate or final results) to the internal register or internal cache.
  • Processor 602 may then write one or more of those results to memory 604.
  • processor 602 executes only instructions in one or more internal registers, internal caches, or memory 604 (as opposed to database 606 or elsewhere) and operates only on data in one or more internal registers, internal caches, or memory 604 (as opposed to database 606 or elsewhere).
  • One or more memory buses (which may each include an address bus and a data bus) may couple processor 602 to memory 604.
  • Bus 612 may include one or more memory buses, as described below.
  • one or more memory management units reside between processor 602 and memory 604 and facilitate accesses to memory 604 requested by processor 602.
  • memory 604 includes random access memory (RAM). This RAM may be volatile memory, where appropriate.
  • this RAM may be dynamic RAM (DRAM) or static RAM (SRAM). Moreover, where appropriate, this RAM may be single-ported or multi-ported RAM. This disclosure contemplates any suitable RAM.
  • Memory 604 may include one or more memory devices 604, where appropriate. Although this disclosure describes and illustrates particular memory, this disclosure contemplates any suitable memory.
  • database 606 includes mass storage for data or instructions.
  • database 606 may include a hard disk drive (HDD), a floppy disk drive, flash memory, an optical disc, a magneto-optical disc, magnetic tape, or a Universal Serial Bus (USB) drive, or a combination of two or more of these.
  • Database 606 may include removable or non-removable (or fixed) media, where appropriate.
  • Database 606 may be internal or external to the one or more computing device(s) 600, where appropriate.
  • database 606 is non-volatile, solid-state memory.
  • database 606 includes read-only memory (ROM).
  • this ROM may be mask-programmed ROM, programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), electrically alterable ROM (EAROM), flash memory, or a combination of two or more of these.
  • This disclosure contemplates database 606 taking any suitable physical form.
  • Database 606 may include one or more storage control units facilitating communication between processor 602 and database 606, where appropriate.
  • database 606 may include one or more databases 606.
  • I/O interface 608 includes hardware, software, or both, providing one or more interfaces for communication between the one or more computing device(s) 600 and one or more I/O devices.
  • the one or more computing device(s) 600 may include one or more of these I/O devices, where appropriate.
  • One or more of these I/O devices may enable communication between a person and the one or more computing device(s) 600.
  • an I/O device may include a keyboard, keypad, microphone, monitor, mouse, printer, scanner, speaker, still camera, stylus, tablet, touch screen, trackball, video camera, another suitable I/O device, or a combination of two or more of these.
  • An I/O device may include one or more sensors.
  • I/O interface 608 may include one or more device or software drivers enabling processor 602 to drive one or more of these I/O devices.
  • I/O interface 608 may include one or more I/O interfaces 608, where appropriate.
  • communication interface 610 includes hardware, software, or both providing one or more interfaces for communication (such as, for example, packet-based communication) between the one or more computing device(s) 600 and one or more other computing device(s) 600 or one or more networks.
  • communication interface 610 may include a network interface controller (NIC) or network adapter for communicating with an Ethernet or other wire-based network or a wireless NIC (WNIC) or wireless adapter for communicating with a wireless network, such as a WI-FI network.
  • the one or more computing device(s) 600 may communicate with an ad hoc network, a personal area network (PAN), a local area network (LAN), a wide area network (WAN), a metropolitan area network (MAN), one or more portions of the Internet, or a combination of two or more of these.
  • One or more portions of one or more of these networks may be wired or wireless.
  • the one or more computing device(s) 600 may communicate with a wireless PAN (WPAN) (such as, for example, a BLUETOOTH WPAN), a WI-FI network, a WI-MAX network, a cellular telephone network (such as, for example, a Global System for Mobile Communications (GSM) network), other suitable wireless network, or a combination of two or more of these.
  • the one or more computing device(s) 600 may include any suitable communication interface 610 for any of these networks, where appropriate.
  • Communication interface 610 may include one or more communication interfaces 610, where appropriate.
  • bus 612 includes hardware, software, or both coupling components of the one or more computing device(s) 600 to each other.
  • bus 612 may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a front-side bus (FSB), a HYPERTRANSPORT (HT) interconnect, an Industry Standard Architecture (ISA) bus, an INFINIBAND interconnect, a low-pin-count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a serial advanced technology attachment (SATA) bus, a Video Electronics Standards Association local (VLB) bus, another suitable bus, or a combination of two or more of these.
  • Bus 612 may include one or more buses 612, where appropriate. Although this disclosure describes and illustrates a particular bus, this disclosure contemplates any suitable bus or interconnect.
  • a computer-readable non-transitory storage medium or media may include one or more semiconductor-based or other integrated circuits (ICs) (such as, for example, field-programmable gate arrays (FPGAs) or application-specific ICs (ASICs)), hard disk drives (HDDs), hybrid hard drives (HHDs), optical discs, optical disc drives (ODDs), magneto-optical discs, magneto-optical drives, floppy diskettes, floppy disk drives (FDDs), magnetic tapes, solid-state drives (SSDs), RAM-drives, SECURE DIGITAL cards or drives, any other suitable computer-readable non-transitory storage media, or any suitable combination of two or more of these, where appropriate.
  • FIG. 7 illustrates a diagram 700 of an example artificial intelligence (AI) architecture 702 (which may be included as part of the one or more computing device(s) 600 as discussed above with respect to FIG. 6) that may be utilized to generate inference-phase-specific batch normalization parameters for a machine-learning model trained to predict a slide-level class label describing one or more gene alterations or other biomarkers based on a singular whole-slide histopathology image, in accordance with the disclosed embodiments.
  • the AI architecture 702 may be implemented utilizing, for example, one or more processing devices that may include hardware (e.g., a general purpose processor, a graphic processing unit (GPU), an application-specific integrated circuit (ASIC), a system-on-chip (SoC), a microcontroller, a field-programmable gate array (FPGA), a central processing unit (CPU), an application processor (AP), a visual processing unit (VPU), a neural processing unit (NPU), a neural decision processor (NDP), a deep learning processor (DLP), a tensor processing unit (TPU), a neuromorphic processing unit (NPU), and/or other processing device(s) that may be suitable for processing various molecular data and making one or more decisions based thereon), software (e.g., instructions running/executing on one or more processing devices), firmware (e.g., microcode), or some combination thereof.
  • the AI architecture 702 may include machine learning (ML) algorithms and functions 704, natural language processing (NLP) algorithms and functions 706, expert systems 708, computer-based vision algorithms and functions 710, speech recognition algorithms and functions 712, planning algorithms and functions 714, and robotics algorithms and functions 716.
  • the ML algorithms and functions 704 may include any statistics-based algorithms that may be suitable for finding patterns across large amounts of data (e.g., “Big Data” such as genomics data, proteomics data, metabolomics data, metagenomics data, transcriptomics data, or other omics data).
  • the ML algorithms and functions 704 may include deep learning algorithms 718, supervised learning algorithms 720, and unsupervised learning algorithms 722.
  • the deep learning algorithms 718 may include any artificial neural networks (ANNs) that may be utilized to learn deep levels of representations and abstractions from large amounts of data.
  • the deep learning algorithms 718 may include ANNs, such as a perceptron, a multilayer perceptron (MLP), an autoencoder (AE), a convolutional neural network (CNN), a recurrent neural network (RNN), long short-term memory (LSTM), a gated recurrent unit (GRU), a restricted Boltzmann machine (RBM), a deep belief network (DBN), a bidirectional recurrent deep neural network (BRDNN), a generative adversarial network (GAN), deep Q-networks, a neural autoregressive distribution estimation (NADE), an adversarial network (AN), attentional models (AM), a spiking neural network (SNN), deep reinforcement learning, and so forth.
  • the supervised learning algorithms 720 may include any algorithms that may be utilized to apply, for example, what has been learned in the past to new data using labeled examples for predicting future events. For example, starting from the analysis of a known data set, the supervised learning algorithms 720 may produce an inferred function to make predictions about the output values. The supervised learning algorithms 720 may also compare their outputs with the correct and intended outputs and find errors in order to modify the supervised learning algorithms 720 accordingly.
  • the unsupervised learning algorithms 722 may include any algorithms that may be applied, for example, when the data used to train the unsupervised learning algorithms 722 are neither classified nor labeled. For example, the unsupervised learning algorithms 722 may study and analyze how systems may infer a function to describe a hidden structure from unlabeled data.
  • the NLP algorithms and functions 706 may include any algorithms or functions that may be suitable for automatically manipulating natural language, such as speech and/or text.
  • the NLP algorithms and functions 706 may include content extraction algorithms or functions 724, classification algorithms or functions 726, machine translation algorithms or functions 728, question answering (QA) algorithms or functions 730, and text generation algorithms or functions 732.
  • the content extraction algorithms or functions 724 may include a means for extracting text or images from electronic documents (e.g., webpages, text editor documents, and so forth) to be utilized, for example, in other applications.
  • the classification algorithms or functions 726 may include any algorithms that may utilize a supervised learning model (e.g., logistic regression, naive Bayes, stochastic gradient descent (SGD), k-nearest neighbors, decision trees, random forests, support vector machine (SVM), and so forth) to learn from the data input to the supervised learning model and to make new observations or classifications based thereon.
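For instance, a minimal scikit-learn sketch of one such supervised classifier; the dataset here is synthetic and purely illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic labeled examples standing in for real input data.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Learn from the labeled data, then classify new observations.
clf = SVC(kernel="rbf").fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
```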
  • the machine translation algorithms or functions 728 may include any algorithms or functions that may be suitable for automatically converting source text in one language, for example, into text in another language.
  • the QA algorithms or functions 730 may include any algorithms or functions that may be suitable for automatically answering questions posed by humans in, for example, a natural language, such as that performed by voice-controlled personal assistant devices.
  • the text generation algorithms or functions 732 may include any algorithms or functions that may be suitable for automatically generating natural language texts.
  • the expert systems 708 may include any algorithms or functions that may be suitable for simulating the judgment and behavior of a human or an organization that has expert knowledge and experience in a particular field (e.g., stock trading, medicine, sports statistics, and so forth).
  • the computer-based vision algorithms and functions 710 may include any algorithms or functions that may be suitable for automatically extracting information from images (e.g., photo images, video images).
  • the computer-based vision algorithms and functions 710 may include image recognition algorithms 734 and machine vision algorithms 736.
  • the image recognition algorithms 734 may include any algorithms that may be suitable for automatically identifying and/or classifying objects, places, people, and so forth that may be included in, for example, one or more image frames or other displayed data.
  • the machine vision algorithms 736 may include any algorithms that may be suitable for allowing computers to “see”, for example, by relying on image sensors or cameras with specialized optics to acquire images for processing, analyzing, and/or measuring various data characteristics for decision-making purposes.
  • the speech recognition algorithms and functions 712 may include any algorithms or functions that may be suitable for recognizing and translating spoken language into text, such as through automatic speech recognition (ASR), computer speech recognition, speech-to-text (STT) 738, or text-to-speech (TTS) 740, in order for the computing device to communicate via speech with one or more users, for example.
  • the planning algorithms and functions 714 may include any algorithms or functions that may be suitable for generating a sequence of actions, in which each action may include its own set of preconditions to be satisfied before performing the action. Examples of AI planning may include classical planning, reduction to other problems, temporal planning, probabilistic planning, preference-based planning, conditional planning, and so forth.
  • the robotics algorithms and functions 716 may include any algorithms, functions, or systems that may enable one or more devices to replicate human behavior through, for example, motions, gestures, performance tasks, decision-making, emotions, and so forth.
  • the methods described herein may further be used to characterize a cancer in a subject, for example as having a positive biomarker status.
  • the method may be used to characterize the cancer as positive for a mutation or alteration in a genetic biomarker.
  • the genetic biomarker may be, for example, an epidermal growth factor receptor (EGFR) gene alteration, an anaplastic lymphoma kinase (ALK) gene alteration, an ROS-1 gene alteration, a tumor gene mutation burden (TMB), a neurotrophic tyrosine receptor kinase 3 (NTRK3) gene alteration, a fibroblast growth factor receptor 2 (FGFR2) gene alteration, a mesenchymal-epithelial transition (MET) gene alteration, a phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit alpha (PIK3CA) gene alteration, or one or more neurotrophic tyrosine receptor kinase (NTRK) genes 1/2/3.
  • a method of treating a subject with cancer, comprising characterizing the cancer of the subject as positive for the genetic alteration according to the method described herein, and administering to the subject an effective therapy.
  • the effective therapy may be, for example, a poly (ADP-ribose) polymerase inhibitor (PARPi), a platinum compound, a kinase inhibitor, chemotherapy, radiation therapy, a targeted therapy (e.g., immunotherapy), surgery, or any combination thereof.
  • “About” and “approximately” shall generally mean an acceptable degree of error for the quantity measured given the nature or precision of the measurements. Exemplary degrees of error are within 20 percent (%), typically, within 10%, and more typically, within 5% of a given value or range of values.
  • the terms “comprising” (and any form or variant of comprising, such as “comprise” and “comprises”), “having” (and any form or variant of having, such as “have” and “has”), “including” (and any form or variant of including, such as “includes” and “include”), or “containing” (and any form or variant of containing, such as “contains” and “contain”), are inclusive or open-ended and do not exclude additional, un-recited additives, components, integers, elements, or method steps.
  • the terms “individual,” “patient,” or “subject” are used interchangeably and refer to any single animal, e.g., a mammal (including such non-human animals as, for example, dogs, cats, horses, rabbits, zoo animals, cows, pigs, sheep, and non-human primates) for which treatment is desired.
  • the individual, patient, or subject herein is a human.
  • “cancer” and “tumor” are used interchangeably herein. These terms refer to the presence of cells possessing characteristics typical of cancer-causing cells, such as uncontrolled proliferation, immortality, metastatic potential, rapid growth and proliferation rate, and certain characteristic morphological features. Cancer cells are often in the form of a tumor, but such cells can exist alone within an animal, or can be a non-tumorigenic cancer cell, such as a leukemia cell. These terms include a solid tumor, a soft tissue tumor, or a metastatic lesion. As used herein, the term “cancer” includes premalignant, as well as malignant cancers.
  • treatment refers to clinical intervention (e.g., administration of an anti-cancer agent or anticancer therapy) in an attempt to alter the natural course of the individual being treated, and can be performed either for prophylaxis or during the course of clinical pathology.
  • Desirable effects of treatment include, but are not limited to, preventing occurrence or recurrence of disease, alleviation of symptoms, diminishment of any direct or indirect pathological consequences of the disease, preventing metastasis, decreasing the rate of disease progression, amelioration or palliation of the disease state, and remission or improved prognosis.
  • references in the appended claims to an apparatus or system or a component of an apparatus or system being adapted to, arranged to, capable of, configured to, enabled to, operable to, or operative to perform a particular function encompasses that apparatus, system, component, whether or not it or that particular function is activated, turned on, or unlocked, as long as that apparatus, system, or component is so adapted, arranged, capable, configured, enabled, operable, or operative. Additionally, although this disclosure describes or illustrates certain embodiments as providing particular advantages, certain embodiments may provide none, some, or all of these advantages.
  • Embodiment 1 A method, comprising: segmenting, by one or more processors, an image into a plurality of patches; grouping, by the one or more processors, the plurality of patches into at least one bag of patches; inputting, by the one or more processors, the at least one bag of patches into a machine-learning model trained to generate a prediction of an image class label based on the at least one bag of patches, the machine-learning model including: a first layer trained to generate one or more feature maps based on the at least one bag of patches; a second layer trained to normalize the one or more feature maps utilizing a set of batch normalization parameters determined from the at least one bag of patches to generate one or more normalized feature maps; and a third layer trained to generate the prediction of the image class label based at least in part on the one or more normalized feature maps; and outputting, by the one or more processors, the prediction of the image class label.
  • Embodiment 2 The method of embodiment 1, wherein the image comprises only one whole-slide image (WSI).
  • Embodiment 3 The method of any one of embodiments 1 or 2, further comprising receiving, by the one or more processors, the image, wherein the image comprises an image of a tissue sample.
  • Embodiment 4 The method of any one of embodiments 1-3, wherein each patch of the plurality of patches comprises a plurality of pixels corresponding to one or more regions of the image.
  • Embodiment 5. The method of any one of embodiments 1-4, wherein the image comprises a histological stain image, a fluorescence in situ hybridization (FISH) image, an immunofluorescence (IF) image, or a hematoxylin and eosin (H&E) image.
  • Embodiment 6 The method of any one of embodiments 1-5, wherein: the first layer comprises one or more convolutional layers; the second layer comprises one or more batch normalization layers; and the third layer comprises an output layer.
  • Embodiment 7 The method of any one of embodiments 1-6, wherein the machine-learning model further comprises a pooling layer and a fully connected layer.
  • Embodiment 8 The method of any one of embodiments 1-7, wherein the machine-learning model comprises one or more convolutional neural networks (CNNs).
  • Embodiment 9 The method of any one of embodiments 1-8, wherein the machine-learning model comprises a multiple-instance learning (MIL) machine-learning model.
  • Embodiment 10 The method of any one of embodiments 1-9, wherein the machine-learning model comprises a multiple-instance learning convolutional neural network (MILCNN) machine-learning model.
  • Embodiment 11 The method of any one of embodiments 1-10, wherein the set of batch normalization parameters comprises a mean and a variance determined from the at least one bag of patches.
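For concreteness, a small sketch of taking such a mean and variance directly from a bag's feature maps (standard per-channel batch-norm statistics; the (bag_size, C, H, W) shape convention is an assumption):

```python
import torch

def bag_normalization_stats(feature_maps, eps=1e-5):
    """Per-channel mean/variance over a bag of feature maps (bag_size, C, H, W)."""
    mean = feature_maps.mean(dim=(0, 2, 3))                  # (C,)
    var = feature_maps.var(dim=(0, 2, 3), unbiased=False)    # (C,)
    normalized = (feature_maps - mean[None, :, None, None]) / torch.sqrt(
        var[None, :, None, None] + eps)
    return mean, var, normalized
```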
  • Embodiment 12 The method of any one of embodiments 1-11, wherein the set of batch normalization parameters corresponds to only the at least one bag of patches.
  • Embodiment 13 The method of any one of embodiments 1-12, wherein the machine-learning model was trained by: receiving, by the one or more processors, a training image; segmenting, by the one or more processors, the training image into a second plurality of patches; grouping, by the one or more processors, the second plurality of patches into at least one second bag of patches; and inputting, by the one or more processors, the at least one second bag of patches into the machine-learning model to generate a prediction of a second image class label based on the at least one second bag of patches; wherein: the first layer is trained to generate one or more second feature maps based on the at least one second bag of patches; the second layer is trained to normalize the one or more second feature maps utilizing a set of mini-batch normalization parameters determined from the at least one second bag of patches to generate one or more second normalized feature maps; and the third layer is trained to generate the prediction of the second image class label for the training image based at least in part on the one or more second normalized feature maps.
  • Embodiment 14 The method of any one of embodiments 1-13, wherein each patch of the second plurality of patches comprises a plurality of pixels corresponding to one or more regions of the training image.
  • Embodiment 15 The method of any one of embodiments 1-14, wherein: the first layer comprises one or more convolutional layers; the second layer comprises one or more batch normalization layers; and the third layer comprises an output layer.
  • Embodiment 16 The method of any one of embodiments 1-15, wherein the one or more batch normalization layers are trained to compute at least one of a running mean, a running variance, a gamma parameter, and a beta parameter of each of a plurality of sets of mini-batch normalization parameters during a training phase of the machine-learning model.
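The running statistics named here are conventionally accumulated across training mini-batches with a momentum term \(\alpha\); the update rule below is the standard formulation, stated as an assumption since the disclosure does not spell it out:

\[ \mu_{\text{run}} \leftarrow (1-\alpha)\,\mu_{\text{run}} + \alpha\,\mu_B, \qquad \sigma^2_{\text{run}} \leftarrow (1-\alpha)\,\sigma^2_{\text{run}} + \alpha\,\sigma^2_B. \]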
  • Embodiment 17 The method of any one of embodiments 1-16, wherein, in response to the machine-learning model being trained, one or more of the gamma parameter and the beta parameter are fixed.
  • Embodiment 18 The method of any one of embodiments 1-17, wherein in response to the machine-learning model being trained, one or more of a mean and a variance determined for the at least one second bag of patches is configured to be determined for each additional bag of patches from the at least one second bag of patches.
  • Embodiment 19 The method of any one of embodiments 1-18, wherein the set of mini-batch normalization parameters comprises a mini-batch mean and a mini-batch variance.
  • Embodiment 20 The method of any one of embodiments 1-19, wherein segmenting the training image into at least one second bag of patches comprises randomly sampling one or more patches of pixels of the at least one second bag of patches.
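A minimal sketch of this random sampling, assuming the patches are stacked along the first tensor axis and that each patch lands in at most one bag:

```python
import torch

def sample_bags(patches, bag_size=64, num_bags=4):
    """Randomly sample patches of pixels into bags of patches."""
    idx = torch.randperm(patches.shape[0])
    return [patches[idx[i * bag_size:(i + 1) * bag_size]]
            for i in range(num_bags)]
```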
  • Embodiment 21 The method of any one of embodiments 1-20, wherein the image class label comprises an indication of a genetic biomarker of a tissue sample captured in the image.
  • Embodiment 22 The method of any one of embodiments 1-21, wherein the genetic biomarker of the tissue sample comprises an epidermal growth factor receptor (EGFR) gene alteration, an anaplastic lymphoma kinase (ALK) gene alteration, an ROS-1 gene alteration, a tumor gene mutation burden (TMB), a neurotrophic tyrosine receptor kinase 3 (NTRK3) gene alteration, a fibroblast growth factor receptor 2 (FGFR2) gene alteration, a mesenchymal-epithelial transition (MET) gene alteration, a phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit alpha (PIK3CA) gene alteration, or one or more neurotrophic tyrosine receptor kinase (NTRK) genes 1/2/3.
  • Embodiment 23 The method of any one of embodiments 1-22, further comprising generating a report based on the prediction of the image class label indicating the genetic biomarker of the tissue sample.
  • Embodiment 24 The method of any one of embodiments 1-23, further comprising causing one or more electronic devices to display the report.
  • Embodiment 25 The method of any one of embodiments 1-24, wherein causing the one or more electronic devices to display the report comprises causing a human machine interface (HMI) associated with a pathologist to display the report.
  • Embodiment 26 A system including one or more computing devices, comprising: one or more non-transitory computer-readable storage media including instructions; and one or more processors coupled to the one or more storage media, the one or more processors configured to execute the instructions to: segment an image into a plurality of patches; group the plurality of patches into at least one bag of patches; and input the at least one bag of patches into a machine-learning model trained to generate a prediction of an image class label based on the at least one bag of patches, the machine-learning model including: a first layer trained to generate one or more feature maps based on the at least one bag of patches; a second layer trained to normalize the one or more feature maps utilizing a set of batch normalization parameters determined from the at least one bag of patches to generate one or more normalized feature maps; and a third layer trained to generate the prediction of the image class label based at least in part on the one or more normalized feature maps; and output the prediction of the image class label.
  • Embodiment 27 The system of embodiment 26, wherein the image comprises only one whole-slide image (WSI).
  • Embodiment 28 The system of embodiment 26 or 27, wherein the instructions further comprise instructions to receive the image, wherein the image comprises an image of a tissue sample.
  • Embodiment 29 The system of any one of embodiments 26-28, wherein each patch of the plurality of patches comprises a plurality of pixels corresponding to one or more regions of the image.
  • Embodiment 30 The system of any one of embodiments 26-29, wherein the image comprises a histological stain image, a fluorescence in situ hybridization (FISH) image, an immunofluorescence (IF) image, or a hematoxylin and eosin (H&E) image.
  • Embodiment 31 The system of any one of embodiments 26-30, wherein: the first layer comprises one or more convolutional layers; the second layer comprises one or more batch normalization layers; and the third layer comprises an output layer.
  • Embodiment 32 The system of embodiment 31, wherein the machine-learning model further comprises a pooling layer and a fully-connected layer.
  • Embodiment 33 The system of any one of embodiments 26-32, wherein the machine-learning model comprises one or more convolutional neural networks (CNNs).
  • Embodiment 34 The system of any one of embodiments 26-33, wherein the machine-learning model comprises a multiple-instance learning (MIL) machine-learning model.
  • Embodiment 35 The system of any one of embodiments 26-34, wherein the machine-learning model comprises a multiple-instance learning convolutional neural network (MILCNN) machine-learning model.
  • Embodiment 36 The system of any one of embodiments 26-35, wherein the set of batch normalization parameters comprises a mean and a variance determined from the at least one bag of patches.
  • Embodiment 37 The system of any one of embodiments 26-36, wherein the set of batch normalization parameters corresponds to only the at least one bag of patches.
  • Embodiment 38 The system of any one of embodiments 26-37, wherein the machine-learning model was trained by: receiving a training image; segmenting the training image into a second plurality of patches; grouping the second plurality of patches into at least one second bag of patches; and inputting the at least one second bag of patches into the machine-learning model to generate a prediction of a second image class label based on the at least one second bag of patches; wherein: the first layer is trained to generate one or more second feature maps based on the at least one second bag of patches; the second layer is trained to normalize the one or more second feature maps utilizing a set of mini-batch normalization parameters determined from the at least one second bag of patches to generate one or more second normalized feature maps; and the third layer is trained to generate the prediction of the second image class label for the training image based at least in part on the one or more second normalized feature maps.
  • Embodiment 39 The system of embodiment 38, wherein each patch of the second plurality of patches comprises a plurality of pixels corresponding to one or more regions of the training image.
  • Embodiment 40 The system of embodiment 38 or 39, wherein: the first layer comprises one or more convolutional layers; the second layer comprises one or more batch normalization layers; and the third layer comprises an output layer.
  • Embodiment 41 The system of embodiment 40, wherein the one or more batch normalization layers are trained to compute at least one of a running mean, a running variance, a gamma parameter, and a beta parameter of each of a plurality of sets of mini-batch normalization parameters during a training phase of the machine-learning model.
  • Embodiment 42 The system of embodiment 41, wherein, in response to the machine-learning model being trained, one or more of the gamma parameter and the beta parameter are fixed.
  • Embodiment 43 The system of any one of embodiments 38-42, wherein in response to the machine-learning model being trained, one or more of a mean and a variance determined for the at least one second bag of patches is configured to be determined for each additional bag of patches from the at least one second bag of patches.
  • Embodiment 44 The system of any one of embodiments 38-43, wherein the set of mini-batch normalization parameters comprises a mini-batch mean and a mini-batch variance.
  • Embodiment 45 The system of any one of embodiments 38-44, wherein the instructions to segment the training image into at least one second bag of patches further comprise instructions to randomly sample one or more patches of pixels of the at least one second bag of patches.
  • Embodiment 46 The system of any one of embodiments 26-45, wherein the image class label comprises an indication of a genetic biomarker of a tissue sample captured in the image.
  • Embodiment 47 The system of embodiment 46, wherein the genetic biomarker of the tissue sample comprises an epidermal growth factor receptor (EGFR) gene alteration, an anaplastic lymphoma kinase (ALK) gene alteration, an ROS-1 gene alteration, a tumor gene mutation burden (TMB), a neurotrophic tyrosine receptor kinase 3 (NTRK3) gene alteration, a fibroblast growth factor receptor 2 (FGFR2) gene alteration, a mesenchymal-epithelial transition (MET) gene alteration, a phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit alpha (PIK3CA) gene alteration, or one or more neurotrophic tyrosine receptor kinase (NTRK) genes 1/2/3.
  • Embodiment 48 The system of embodiment 46 or 47, wherein the instructions further comprise instructions to generate a report based on the prediction of the image class label indicating the genetic biomarker of the tissue sample.
  • Embodiment 49 The system of embodiment 48, wherein the instructions further comprise instructions to cause one or more electronic devices to display the report.
  • Embodiment 50 The system of embodiment 49, wherein the instructions to cause the one or more electronic devices to display the report further comprise instructions to cause a human machine interface (HMI) associated with a pathologist to display the report.
  • Embodiment 51 A non-transitory computer-readable medium comprising instructions that, when executed by one or more processors of one or more computing devices, cause the one or more processors to: segment, by the one or more processors, an image into a plurality of patches; group, by the one or more processors, the plurality of patches into at least one bag of patches; input, by the one or more processors, the at least one bag of patches into a machine-learning model trained to generate a prediction of an image class label based on the at least one bag of patches, the machine-learning model including: a first layer trained to generate one or more feature maps based on the at least one bag of patches; a second layer trained to normalize the one or more feature maps utilizing a set of batch normalization parameters determined from the at least one bag of patches to generate one or more normalized feature maps; and a third layer trained to generate the prediction of the image class label based at least in part on the one or more normalized feature maps; and output, by the one or more processors, the prediction of the image class label.
  • Embodiment 52 The non-transitory computer-readable medium of embodiment 51, wherein the image comprises only one whole-slide image (WSI).
  • Embodiment 53 The non-transitory computer-readable medium of embodiment 51 or 52, wherein the instructions further comprise instructions to receive the image, wherein the image comprises an image of a tissue sample.
  • Embodiment 54 The non-transitory computer-readable medium of any one of embodiments 51-53, wherein each patch of the plurality of patches comprises a plurality of pixels corresponding to one or more regions of the image.
  • Embodiment 55 The non-transitory computer-readable medium of any one of embodiments 51-54, wherein the image comprises a histological stain image, a fluorescence in situ hybridization (FISH) image, an immunofluorescence (IF) image, or a hematoxylin and eosin (H&E) image.
  • Embodiment 56 The non-transitory computer-readable medium of any one of embodiments 51-55, wherein: the first layer comprises one or more convolutional layers; the second layer comprises one or more batch normalization layers; and the third layer comprises an output layer.
  • Embodiment 57 The non-transitory computer-readable medium of any one of embodiments 51-56, wherein the machine-learning model further comprises a pooling layer and a fully connected layer.
  • Embodiment 58 The non-transitory computer-readable medium of any one of embodiments 51-57, wherein the machine-learning model comprises one or more convolutional neural networks (CNNs).
  • Embodiment 59 The non-transitory computer-readable medium of any one of embodiments 51-58, wherein the machine-learning model comprises a multiple-instance learning (MIL) machine-learning model.
  • Embodiment 60 The non-transitory computer-readable medium of any one of embodiments 51-59, wherein the machine-learning model comprises a multiple-instance learning convolutional neural network (MILCNN) machine-learning model.
  • Embodiment 61 The non-transitory computer-readable medium of any one of embodiments 51-60, wherein the set of batch normalization parameters comprises a mean and a variance determined from the at least one bag of patches.
  • Embodiment 62 The non-transitory computer-readable medium of any one of embodiments 51-61, wherein the set of batch normalization parameters corresponds to only the at least one bag of patches.
  • Embodiment 63 The non-transitory computer-readable medium of any one of embodiments 51-62, wherein the machine-learning model was trained by: receiving a training image; segmenting the training image into a second plurality of patches; grouping the second plurality of patches into at least one second bag of patches; and inputting the at least one second bag of patches into the machine-learning model to generate a prediction of a second image class label based on the at least one second bag of patches; wherein: the first layer is trained to generate one or more second feature maps based on the at least one second bag of patches; the second layer is trained to normalize the one or more second feature maps utilizing a set of mini-batch normalization parameters determined from the at least one second bag of patches to generate one or more second normalized feature maps; and the third layer is trained to generate the prediction of the second image class label for the training image based at least in part on the one or more second normalized feature maps.
  • Embodiment 64 The non-transitory computer-readable medium of embodiment 63, wherein each patch of the second plurality of patches comprises a plurality of pixels corresponding to one or more regions of the training image.
  • Embodiment 65 The non-transitory computer-readable medium of embodiment 63 or 64, wherein: the first layer comprises one or more convolutional layers; the second layer comprises one or more batch normalization layers; and the third layer comprises an output layer.
  • Embodiment 66 The non-transitory computer-readable medium of embodiment 65, wherein the one or more batch normalization layers are trained to compute at least one of a running mean, a running variance, a gamma parameter, and a beta parameter of each of a plurality of sets of mini-batch normalization parameters during a training phase of the machine-learning model.
  • Embodiment 67 The non-transitory computer-readable medium of embodiment 66, wherein, in response to the machine-learning model being trained, one or more of the gamma parameter and the beta parameter are fixed.
  • Embodiment 68 The non-transitory computer-readable medium of any one of embodiments 65-67, wherein in response to the machine-learning model being trained, one or more of a mean and a variance determined for the at least one second bag of patches is configured to be determined for each additional bag of patches from the at least one second bag of patches.
  • Embodiment 69 The non-transitory computer-readable medium of any one of embodiments 65-68, wherein the set of mini-batch normalization parameters comprises a mini-batch mean and a mini-batch variance.
  • Embodiment 70 The non-transitory computer-readable medium of any one of embodiments 65-69, wherein the instructions to segment the training image into at least one second bag of patches further comprise instructions to randomly sample one or more patches of pixels of the at least one second bag of patches.
  • Embodiment 71 The non-transitory computer-readable medium of any one of embodiments 51-70, wherein the image class label comprises an indication of a genetic biomarker of a tissue sample captured in the image.
  • Embodiment 72 The non-transitory computer-readable medium of embodiment 71, wherein the genetic biomarker of the tissue sample comprises an epidermal growth factor receptor (EGFR) gene alteration, an anaplastic lymphoma kinase (ALK) gene alteration, an ROS-1 gene alteration, a tumor gene mutation burden (TMB), a neurotrophic tyrosine receptor kinase 3 (NTRK3) gene alteration, a fibroblast growth factor receptor 2 (FGFR2) gene alteration, a mesenchymal-epithelial transition (MET) gene alteration, a phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit alpha (PIK3CA) gene alteration, or one or more neurotrophic tyrosine receptor kinase (NTRK) genes 1/2/3.
  • Embodiment 73 The non-transitory computer-readable medium of embodiment 71 or 72, wherein the instructions further comprise instructions to generate a report based on the prediction of the image class label indicating the genetic biomarker of the tissue sample.
  • Embodiment 74 The non-transitory computer-readable medium of embodiment 73, wherein the instructions further comprise instructions to cause one or more electronic devices to display the report.
  • Embodiment 75 The non-transitory computer-readable medium of embodiment 74, wherein the instructions to cause the one or more electronic devices to display the report further comprise instructions to cause a human machine interface (HMI) associated with a pathologist to display the report.
  • Embodiment 76 A method, comprising: receiving, by one or more processors, a training image; segmenting, by the one or more processors, the training image into a plurality of patches; grouping, by the one or more processors, the plurality of patches into at least one bag of patches; training a first layer to generate one or more feature maps based on the at least one bag of patches; training a second layer to normalize the one or more feature maps utilizing a set of mini-batch normalization parameters determined from the at least one bag of patches to generate one or more normalized feature maps; and training a third layer to generate a prediction of an image class label for the training image based at least in part on the one or more normalized feature maps.
  • Embodiment 77 The method of embodiment 76, wherein the training image is a single image.
  • Embodiment 78 The method of embodiment 76 or 77, wherein each patch of the plurality of patches comprises a plurality of pixels corresponding to one or more regions of the training image.
  • Embodiment 79 The method of any one of embodiments 76-78, wherein: the first layer comprises one or more convolutional layers; the second layer comprises one or more batch normalization layers; and the third layer comprises an output layer.
  • Embodiment 80 The method of embodiment 79, wherein the one or more batch normalization layers are trained to compute at least one of a running mean, a running variance, a gamma parameter, and a beta parameter of each of a plurality of sets of mini-batch normalization parameters during a training phase.
  • Embodiment 81 The method of embodiment 80, wherein the set of mini-batch normalization parameters comprises a mini-batch mean and a mini-batch variance.
  • Embodiment 82 The method of any one of embodiments 76-81, wherein segmenting the training image into at least one bag of patches comprises randomly sampling one or more patches of pixels of the at least one bag of patches.
  • Embodiment 83 A method, comprising: receiving, by one or more processors, an image of a tissue sample; segmenting, by the one or more processors, the image into a plurality of bags of patches, wherein each patch of the plurality of bags of patches comprises a plurality of pixels corresponding to one or more regions of the tissue sample; inputting, by the one or more processors, at least one bag of patches of the plurality of bags of patches into a machine-learning model trained to generate a prediction of an image class label indicating a genetic biomarker of the tissue sample based on the at least one bag of patches, the machine-learning model including: a first layer trained to generate one or more feature maps based on the at least one bag of patches; a second layer trained to normalize the one or more feature maps utilizing a set of batch normalization parameters determined from the at least one bag of patches to generate one or more normalized feature maps; and a third layer trained to generate the prediction of the image class label indicating a genetic biomarker of the tissue sample based at least in part on the one or more normalized feature maps.
  • Embodiment 84 The method of embodiment 83, wherein the image of the tissue sample comprises only one whole-slide image (WSI) of one or more cancer tissue samples.
  • Embodiment 85 The method of embodiment 83 or 84, wherein the image of the tissue sample comprises a histological stain image of the one or more cancer tissue samples, a fluorescence in situ hybridization (FISH) image of the one or more cancer tissue samples, an immunofluorescence (IF) image of the one or more cancer tissue samples, or a hematoxylin and eosin (H&E) image of the one or more cancer tissue samples.
  • Embodiment 86 The method of any one of embodiments 83-85, wherein: the first layer comprises one or more convolutional layers; the second layer comprises one or more batch normalization layers; and the third layer comprises an output layer.
  • Embodiment 87 The method of embodiment 86, wherein the machine-learning model further comprises a pooling layer and a fully connected layer.
  • Embodiment 88 The method of any one of embodiments 83-87, wherein the set of batch normalization parameters comprises a mean and a variance determined from the at least one bag of patches.
  • Embodiment 89 The method of any one of embodiments 83-88, wherein the set of batch normalization parameters corresponds to only the at least one bag of patches.
  • Embodiment 90 The method of any one of embodiments 83-89, wherein the machine-learning model was trained by: receiving, by the one or more processors, a training image; segmenting, by the one or more processors, the training image into a second plurality of patches; grouping, by the one or more processors, the second plurality of patches into at least one second bag of patches; and inputting, by the one or more processors, the at least one second bag of patches into the machine-learning model to generate a prediction of a second image class label based on the at least one second bag of patches; wherein: the first layer is trained to generate one or more second feature maps based on the at least one second bag of patches; the second layer is trained to normalize the one or more second feature maps utilizing a set of mini-batch normalization parameters determined from the at least one second bag of patches to generate one or more second normalized feature maps; and the third layer is trained to generate the prediction of the second image class label for the training image based at least in part on the one or more second normalized feature maps.
  • Embodiment 91 The method of embodiment 90, wherein each patch of the second plurality of patches comprises a plurality of pixels corresponding to one or more regions of the training image.
  • Embodiment 92 The method of embodiment 90 or 91, wherein: the first layer comprises one or more convolutional layers; the second layer comprises one or more batch normalization layers; and the third layer comprises an output layer.
  • Embodiment 93 The method of embodiment 92, wherein the one or more batch normalization layers are trained to compute at least one of a running mean, a running variance, a gamma parameter, and a beta parameter of each of a plurality of sets of mini-batch normalization parameters during a training phase of the machine-learning model.
  • Embodiment 94 The method of embodiment 93, wherein, in response to the machine-learning model being trained, one or more of the gamma parameter and the beta parameter are fixed.
  • Embodiment 95 The method of any one of embodiments 90-94, wherein, in response to the machine-learning model being trained, one or more of a mean and a variance is configured to be determined separately for each additional bag of patches, distinct from the at least one second bag of patches.
  • Embodiment 96 The method of any one of embodiments 90-95, wherein the set of mini-batch normalization parameters comprises a mini-batch mean and a mini-batch variance. (A code sketch of this bag-level normalization follows these embodiments.)
  • Embodiment 97 The method of any one of embodiments 90-96, wherein segmenting the training image into at least one second bag of patches comprises randomly sampling one or more patches of pixels of the at least one second bag of patches.
  • Embodiment 98 The method of any one of embodiments 90-97, further comprising generating a report based on the prediction of the image class label indicating the genetic biomarker of the tissue sample.
  • Embodiment 99 The method of embodiment 98, further comprising causing one or more electronic devices to display the report.
  • Embodiment 100 The method of embodiment 99, wherein causing the one or more electronic devices to display the report comprises causing a human machine interface (HMI) associated with a pathologist to display the report.
  • Embodiment 101 The method of any one of embodiments 76-100, wherein the machine-learning model comprises one or more convolutional neural networks (CNNs).
  • Embodiment 102 The method of any one of embodiments 76-101, wherein the machine-learning model comprises a multiple-instance learning (MIL) machine-learning model.
  • Embodiment 103 The method of any one of embodiments 76-102, wherein the machine-learning model comprises a multiple-instance learning convolutional neural network (MILCNN) machine-learning model.
  • Embodiment 104 The method of any one of embodiments 76-103, wherein the image class label comprises an indication of a genetic biomarker of a tissue sample captured in the image.
  • Embodiment 105 The method of embodiment 104, wherein the genetic biomarker of the tissue sample comprises an epidermal growth factor receptor (EGFR) gene alteration, an anaplastic lymphoma kinase (ALK) gene alteration, a ROS-1 gene alteration, a tumor gene mutation burden (TMB), a neurotrophic tyrosine receptor kinase 3 (NTRK3) gene alteration, a fibroblast growth factor receptor 2 (FGFR2) gene alteration, a mesenchymal-epithelial transition (MET) gene alteration, a phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit alpha (PIK3CA) gene alteration, or an alteration in one or more of neurotrophic tyrosine receptor kinase (NTRK) genes 1/2/3.
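The embodiments above describe a convolutional model whose batch normalization statistics are taken from the bag of patches presented at inference (see the note at embodiment 96). The following is a minimal TensorFlow sketch of that behavior; the layer sizes, patch dimensions, and the mean-pooling of patch scores into a bag prediction are illustrative assumptions, not the claimed implementation.

```python
import tensorflow as tf

# Minimal sketch: a conv stack whose BatchNormalization layer can normalize
# feature maps using statistics computed from the current bag of patches.
def build_model():
    inputs = tf.keras.Input(shape=(224, 224, 3))                  # one patch per example
    x = tf.keras.layers.Conv2D(32, 3, padding="same")(inputs)     # first layer: feature maps
    x = tf.keras.layers.BatchNormalization()(x)                   # second layer: normalization
    x = tf.keras.layers.ReLU()(x)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)               # pooling layer
    x = tf.keras.layers.Dense(64, activation="relu")(x)           # fully connected layer
    outputs = tf.keras.layers.Dense(1, activation="sigmoid")(x)   # output layer
    return tf.keras.Model(inputs, outputs)

model = build_model()

# At inference, calling the model with training=True makes BatchNormalization
# compute the mean and variance from the bag itself (the batch dimension),
# while the learned gamma and beta parameters stay fixed.
bag_of_patches = tf.random.uniform((40, 224, 224, 3))   # hypothetical bag of 40 patches
patch_scores = model(bag_of_patches, training=True)
bag_prediction = tf.reduce_mean(patch_scores)            # one naive way to pool patch scores
```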
  • The attention-based EGFR algorithm achieved an area under the curve (AUC) of 0.870, a negative predictive value (NPV) of 0.954, and a positive predictive value (PPV) of 0.410 in a validation cohort reflecting the 15% prevalence of EGFR mutations in lung adenocarcinoma.
  • The attention model outperformed a heuristic-based model, used as a comparison, that focused exclusively on tumor regions. Although the attention model also extracts signal primarily from tumor morphology, it extracts additional signal from non-tumor tissue regions. Further analysis of high-attention regions by pathologists showed associations of predicted EGFR negativity with solid growth patterns and higher peritumoral immune presence.
  • This algorithm highlights the potential of this process to provide instantaneous rule-out screening for biomarker alterations and may help prioritize the use of scarce tissue for biomarker testing.
  • Although this Example was trained to call EGFR statuses, similar models may be trained to call the mutational status of different genes, such as ALK, ROS1, FGFR2, MET, PIK3CA, NTRK1, NTRK2, or NTRK3.
  • Attention-based multiple-instance learning can predict EGFR mutational status in advanced metastatic lung adenocarcinoma samples directly from H&E images with state-of-the-art performance on real-world datasets, where many samples have less than 50% tumor content.
  • Using tissue morphology classification models and pathologist review, it is shown that although tumor regions contain the most signal for EGFR, the attention-based model also considers relevant outlier instances from other tissue types, such as immune or stromal features, when predicting EGFR mutational status.
  • Using association rules mining, a process is demonstrated wherein morphology models and pathologist expertise can be leveraged to biologically verify end-to-end biomarker predictions by evaluating associated feature combinations, allowing for better model interpretation when supporting clinical decisions.
  • Model Architecture: The attention-based multiple-instance learning model was built using ResNet50 without the top layer and with an added global average pooling layer to serve as a trainable feature extractor. Following the feature extractor was an attention mechanism including two fully-connected layers (512-dimensional, 256-dimensional) to reduce the embedding dimensionality. The reduced embeddings were then passed to a 256-dimensional fully-connected layer followed by another 1-dimensional fully-connected layer. The output is then transposed, and all patches within a multiple-instance bag are passed through a softmax activation, which fractionally weights the attention for each patch within the bag. The reduced embeddings are then weighted using the softmax attention weights to generate the slide-level weighted embedding. A final fully-connected layer processes the slide-level weighted embedding and uses the sigmoid activation to predict the specimen-level EGFR status.
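The following TensorFlow/Keras sketch approximates the architecture described above. The ResNet50 backbone with global average pooling and the layer widths (512, 256, 256, 1) follow the text; the ReLU and tanh activations, the 224×224 patch size, and the use of ImageNet pretraining are assumptions made for illustration.

```python
import tensorflow as tf

def build_attention_mil(patch_shape=(224, 224, 3)):
    # Trainable feature extractor: ResNet50 without the top layer plus a
    # global average pooling layer.
    backbone = tf.keras.applications.ResNet50(include_top=False,
                                              input_shape=patch_shape,
                                              pooling="avg")

    patches = tf.keras.Input(shape=patch_shape)       # a bag enters as the batch: (n, H, W, C)
    embeddings = backbone(patches)                    # (n, 2048)

    # Two fully-connected layers (512-d, 256-d) reduce the embedding dimensionality.
    reduced = tf.keras.layers.Dense(512, activation="relu")(embeddings)
    reduced = tf.keras.layers.Dense(256, activation="relu")(reduced)

    # Attention scoring: a 256-d fully-connected layer, then a 1-d layer.
    scores = tf.keras.layers.Dense(256, activation="tanh")(reduced)
    scores = tf.keras.layers.Dense(1)(scores)         # (n, 1)

    # Transpose and softmax across the bag so each patch gets a fractional weight.
    attention = tf.nn.softmax(tf.transpose(scores))   # (1, n)

    # Slide-level weighted embedding, then a sigmoid prediction of EGFR status.
    slide_embedding = tf.matmul(attention, reduced)   # (1, 256)
    output = tf.keras.layers.Dense(1, activation="sigmoid")(slide_embedding)
    return tf.keras.Model(patches, output)
```

Calling the resulting model on a tensor of shape (num_patches, 224, 224, 3) treats the whole bag as one batch and returns a single specimen-level probability.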
  • The EGFR mutation prediction task is a binary classification task where, given a bag of patches (the bag as a whole has a label, but the individual patches do not), a prediction was made from an H&E image as to whether a gene mutation is present within a specimen.
  • The loss optimized is the binary cross-entropy loss:

  L = -( y log(g(z)) + (1 - y) log(1 - g(z)) )

  where y is the target value for the input sample and g(z) is the model's predicted probability.
  • The batch loss is aggregated across each input sample within the batch by either summing or averaging the losses, and gradient descent is performed to update the model parameters.
  • The MIL models were trained for 200 epochs, with 40 patches per bag during each training pass, using the TensorFlow [33] framework. The Adam [34] optimizer was used with a learning rate of 1e-5.
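A hedged sketch of one training step under these settings follows; the model is assumed to map a bag of patches to a single sigmoid probability g(z), as in the architecture sketch above, and the names are illustrative.

```python
import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy()
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-5)  # learning rate stated above

def train_step(model, bag, label):
    with tf.GradientTape() as tape:
        y_hat = model(bag, training=True)              # g(z): predicted probability for the bag
        loss = bce(tf.reshape(label, (1, 1)), y_hat)   # -(y log g(z) + (1 - y) log(1 - g(z)))
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```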
  • Pathologists scored each patch for a set of numerical variables and then further reviewed each patch for categorical characteristics.
  • The numerical variables were tumor nuclei fraction, necrosis fraction, peritumoral immune fraction, and intratumoral immune fraction.
  • Tumor nuclei fraction was determined as the fraction of tumor nuclei relative to all nuclei present within a patch.
  • Necrosis fraction was determined as the fraction of the patch area containing necrotic tissue.
  • The peritumoral immune fraction was determined as the fraction of tumor edges that had noticeable immune cell response, such as lymphocytes aggregating at or within the tumor boundary.
  • The intratumoral immune fraction was determined as the fraction of tumor tissue within a patch that had noticeable immune infiltration, such as lymphocytes dispersed throughout a tumor mass or nest.
  • Bags of patches were randomly sampled from each slide during training and the entire bag was given the specimen-level EGFR status as the label. Through the attention mechanism, the model learned without human guidance how to weigh different patches within each bag when predicting for specimen-level mutational status.
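A minimal sketch of that sampling step, assuming patches are pre-extracted per slide and using the 40-patch bag size mentioned above (names are illustrative):

```python
import random

# Draw a random bag of patches from one slide; the whole bag inherits the
# specimen-level EGFR status as its label.
def sample_bag(slide_patches, egfr_status, bag_size=40):
    bag = random.sample(slide_patches, bag_size)  # sampling without replacement
    return bag, egfr_status
```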
  • The models also achieved an NPV of 0.954 ± 0.024 and a PPV of 0.41 ± 0.081 at a binary classification threshold of 0.5.
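For reference, the reported AUC, NPV, and PPV can be computed from specimen-level labels and predicted probabilities as in this illustrative sketch (function and variable names are assumptions):

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

def auc_npv_ppv(y_true, y_prob, threshold=0.5):
    auc = roc_auc_score(y_true, y_prob)
    y_pred = (np.asarray(y_prob) >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    npv = tn / (tn + fn)  # negative predictive value
    ppv = tp / (tp + fp)  # positive predictive value
    return auc, npv, ppv
```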
  • In FIGS. 9D and 9E, an EGFR true positive (TP) exemplar is presented.
  • Patches I-V had a predominant acinar pattern and hobnail cytology, with low peritumoral and intratumoral immune fractions, ranging from 0.1 to 0.2.
  • Patch IV had a low presence of necrotic tissue. Patch VI was predicted as stroma by the tissue-morphology model, and pathologists confirmed this patch was fibrosis.
  • Each bag’s overall characteristics were summarized by determining the mode of each category for the bag’s reviewed patches.
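A small sketch of that summarization, assuming each reviewed patch is represented as a dict of categorical values (a hypothetical data shape, not one stated in the text):

```python
from collections import Counter

# Reduce each categorical variable to its mode across the bag's reviewed
# patches, e.g. [{"architecture": "acinar", "cytology": "hobnail"}, ...].
def bag_modes(patch_reviews):
    summary = {}
    for category in patch_reviews[0]:
        values = [review[category] for review in patch_reviews]
        summary[category] = Counter(values).most_common(1)[0][0]
    return summary
```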
  • Bags that were predominantly lepidic or papillary were predicted EGFR mutant five times more often than EGFR wild-type (FIG. 10A).
  • Bags that predominantly possessed the solid architecture were predicted as EGFR wild-type seven times more often than mutant.
  • When the predominant architecture was mucinous, it was twice as likely that the bag would be predicted as EGFR wild-type. There was no strong enrichment (ratio < 2.0) in prediction status of either type for predominantly acinar bags.
  • The highest-lift item-sets for predicted EGFR mutated status included {fibrosis, lepidic minor architectural pattern, hobnail cytology} and {fibrosis, acinar predominant architectural pattern, hobnail cytology}, both with a lift of 1.92.
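For context, the lift of an item-set with respect to a predicted status compares the observed co-occurrence to what independence would predict; a lift of 1.92 means the combination appears 1.92 times more often than chance. A hedged sketch, assuming each bag is a dict with a `features` set and a `prediction` label:

```python
# lift(X -> Y) = P(X and Y) / (P(X) * P(Y))
def lift(bags, itemset, predicted_status):
    n = len(bags)
    has_items = [itemset <= bag["features"] for bag in bags]  # set containment
    has_status = [bag["prediction"] == predicted_status for bag in bags]
    p_x = sum(has_items) / n
    p_y = sum(has_status) / n
    p_xy = sum(x and y for x, y in zip(has_items, has_status)) / n
    return p_xy / (p_x * p_y)
```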
  • The EGFR prediction algorithm recapitulates several known morphological and cytological associations with EGFR status, and these features can be tested on a per-sample basis by analyzing highly attended regions manually or via tissue morphology/cytology classification algorithms.
  • Attention-based models do not require expensive manual annotation or guidance to train.
  • Biological verification of attention-based end-to-end models can be performed by combining assessment approaches such as morphological profiling, item-set analysis, and pathology review, potentially increasing accuracy in a clinical setting.
  • Fibrosis is present alongside the tumor-related features. This inclusion of fibrosis is less expected than the inclusion of tumor features but may also suggest interesting interactions within the tumor microenvironment (TME). Many studies now suggest that stroma and stromal elements may play far more than a passive role within TMEs and may have direct effects on tumorigenesis. The inclusion of fibrosis as a relevant feature may indicate the ability of machine learning models to recognize, without human guidance, patterns involving tissue regions that may be orthogonal to tumor-specific features.
  • The machine learning model, enabled with self-directed intuition, can predict EGFR mutational status from morphologically-diverse real-world tissue specimens without human intervention.
  • The ability to rely upon machine intuition to extract meaningful features could enable low-effort signal-searching experiments at scale, as well as provide a means to investigate machine-discovered patterns within the phenotype that may be biologically informative.
  • These results show not only that models intended to assist in clinical decision-making recapitulate expected results, such as finding tumor regions most predictive for genomic alteration signal, but also that such models may be capable of determining patterns and interactions within phenotypic features in ways that elevate performance beyond methods relying solely upon human intuition.
  • These screening algorithms could provide rapid genomic insights regarding a patient specimen, which can then be checked by a combination of more interpretable models as well as pathologist visual examination. Any low-confidence predictions or samples flagged by pathologists could then be selected for further genomic testing.

Abstract

A method implemented by one or more processors includes segmenting an image into a plurality of patches, grouping the plurality of patches into at least one bag of patches, and inputting the at least one bag of patches into a machine-learning model trained to generate a prediction of an image class label based on the at least one bag of patches. The machine-learning model includes a first layer trained to generate one or more feature maps based on the at least one bag of patches, a second layer trained to normalize the one or more feature maps using a set of batch normalization parameters determined from the at least one bag of patches to generate one or more normalized feature maps, and a third layer trained to generate the prediction of the image class label based, at least in part, on the one or more normalized feature maps.
PCT/US2023/018074 2022-04-11 2023-04-10 Systems and methods for predicting slide-level class labels for a whole slide image WO2023200732A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202263329730P 2022-04-11 2022-04-11
US63/329,730 2022-04-11

Publications (1)

Publication Number Publication Date
WO2023200732A1 true WO2023200732A1 (fr) 2023-10-19

Family

ID=88330151

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2023/018074 WO2023200732A1 (fr) Systems and methods for predicting slide-level class labels for a whole slide image

Country Status (1)

Country Link
WO (1) WO2023200732A1 (fr)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220101519A1 (en) * 2018-05-14 2022-03-31 Tempus Labs, Inc. Determining Biomarkers from Histopathology Slide Images
CN109376830B (zh) * 2018-10-17 2022-01-11 BOE Technology Group Co., Ltd. Two-dimensional code generation method and apparatus
US20220076411A1 (en) * 2019-05-29 2022-03-10 Leica Biosystems Imaging Inc. Neural network based identification of areas of interest in digital pathology images
US20210366577A1 (en) * 2020-05-22 2021-11-25 Insitro, Inc. Predicting disease outcomes using machine learned models
CN112101451A (zh) * 2020-09-14 2020-12-18 Beijing Union University Breast cancer histopathological type classification method based on generative adversarial network screening of image patches

Similar Documents

Publication Publication Date Title
US11416716B2 (en) System and method for automatic assessment of cancer
US20230029915A1 (en) Multimodal machine learning based clinical predictor
US11562585B2 (en) Systems and methods for image preprocessing
Taylor et al. Automated detection of moderate and large pneumothorax on frontal chest X-rays using deep convolutional neural networks: A retrospective study
Doan et al. SONNET: A self-guided ordinal regression neural network for segmentation and classification of nuclei in large-scale multi-tissue histology images
EP4052118A1 (fr) Automatic reduction of instruction sets for machine learning programs
US20220036971A1 (en) Method and system for predicting response to immune anticancer drugs
US20230306598A1 (en) Systems and methods for mesothelioma feature detection and enhanced prognosis or response to treatment
CA3200491A1 (fr) Systems and methods for assessing PET radiology images
US10665347B2 (en) Methods for predicting prognosis
WO2023014789A1 (fr) System and method for pathology image analysis using a trained neural network and active learning framework
EP4138091A1 (fr) Cancer prognosis
Hema et al. Region-based segmentation and classification for ovarian cancer detection using convolution neural network
Arya et al. Proposal of svm utility kernel for breast cancer survival estimation
Qattous et al. PaCMAP-embedded convolutional neural network for multi-omics data integration
WO2023200732A1 (fr) Systems and methods for predicting slide-level class labels for a whole slide image
Yang et al. Automated facial recognition for Noonan syndrome using novel deep convolutional neural network with additive angular margin loss
Sarkar et al. Breast Cancer Subtypes Classification with Hybrid Machine Learning Model
EP3975059A1 (fr) Dynamic selection of neural networks for detecting predetermined features
WO2023042184A1 (fr) Machine learning to predict cancer genotype and treatment response using digital histopathology images
Singh et al. STRAMPN: Histopathological image dataset for ovarian cancer detection incorporating AI-based methods
Jia et al. DCCAFN: deep convolution cascade attention fusion network based on imaging genomics for prediction survival analysis of lung cancer
US20220130065A1 (en) Method for analyzing thickness of cortical region
US20220237777A1 (en) Method for measuring lesion of medical image
Hao Biologically interpretable, integrative deep learning for cancer survival analysis

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 23788812

Country of ref document: EP

Kind code of ref document: A1