US20160203597A1 - Methods for constructing association maps of imaging data and biological data - Google Patents

Methods for constructing association maps of imaging data and biological data

Info

Publication number
US20160203597A1
Authority
US
United States
Prior art keywords
imaging
features
data
imaging features
association map
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US14/757,779
Inventor
Howard Yuan-Hao Chang
Eran Segal
Michael David Kuo
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to US14/757,779
Publication of US20160203597A1
Abandoned

Classifications

    • FMECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
    • F23COMBUSTION APPARATUS; COMBUSTION PROCESSES
    • F23QIGNITION; EXTINGUISHING-DEVICES
    • F23Q3/00Igniters using electrically-produced sparks
    • F23Q3/004Using semiconductor elements
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61BDIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00Measuring for diagnostic purposes; Identification of persons
    • A61B5/48Other medical applications
    • A61B5/4842Monitoring progression or stage of a disease
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61BDIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00Measuring for diagnostic purposes; Identification of persons
    • A61B5/72Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B5/7271Specific aspects of physiological measurement analysis
    • G06K9/46
    • G06K9/66
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0012Biomedical image inspection
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30096Tumor; Lesion
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16BBIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B25/00ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression

Definitions

  • the subject matter described herein relates to methods for predicting disease risk, prognosis, and best treatment regimens in clinical subjects.
  • the methods involve evaluating a subject's non-invasively obtained imaging features in view of an association map that correlates imaging features with biological data.
  • a method of constructing an association map between imaging features and biological data comprising:
  • the features from a plurality of images of a subject are associated with a disease.
  • the identifying comprises identifying one or more imaging features based on frequency of the one or more features in the plurality of images.
  • the identifying comprises identifying one or more imaging features based on its independence from other features.
  • the identifying comprises identifying one or more imaging features from images obtained using an imaging technique selected from the group consisting of computerized tomography imaging, magnetic resonance imaging (MRI), positron emission tomography (PET), ultrasonography (US), optical imaging, infrared imaging, and x-ray radiography.
  • the imaging technique comprises the use of an imaging agent or image-enhancing agent.
  • the applying comprises applying a module networks algorithm.
  • the applying comprises applying an algorithm that applies an iterative Bayesian probabilistic procedure that identifies combinations of imaging features that relate to the biological data.
  • the applying comprises applying an algorithm to gene expression data.
  • the gene expression data is from a DNA microarray assay. In some embodiments, the gene expression data is from a cDNA microarray assay. In some embodiments, the gene expression data is from an RNA microarray assay.
  • the applying comprises applying an algorithm to protein expression data.
  • the evaluating the statistical significance of the association map comprises evaluating by comparison of the map with permuted data sets.
  • the evaluating the statistical significance of the association map comprises evaluating by testing the prediction using an independent biological data set, independent images, or both.
  • a method for predicting a gene or protein expression level in a biological sample comprising:
  • the method further comprises, based on the predicting, providing a treatment prognosis of said patient based on the presence and/or absence of certain imaging features.
  • the providing comprises providing a prediction of a patient's response to a drug. In some embodiments, the providing comprises providing a prediction of a patient's probable survival. In particular embodiments, the probable survival is disease free survival.
  • the providing comprises providing a likelihood of disease recurrence.
  • the providing comprises providing a likelihood of metastasis.
  • FIGS. 1A-1C are computerized tomography (CT) images of distinct features in human hepatocellular carcinomas (HCC), the features referred to as internal arteries ( FIG. 1A ), hypodense halo ( FIG. 1B ), and texture heterogeneity ( FIG. 1C );
  • FIG. 1D illustrates a strategy for constructing an association map between imaging features and gene expression
  • FIG. 2A shows an overview of an association map of imaging features and global gene expression, where each column is a sample; each row is a module. For each module, a decision tree of imaging features is associated with variation in the expression level of module genes. Knowledge of the imaging features thus allows an approximate reconstruction of the gene expression pattern.
  • FIG. 2B is a graph showing the cumulative fraction of gene expression variation across the full complement of gene activities that is predicted by the number of imaging features in the model.
  • FIG. 2C shows a matrix of modules, associated imaging features, and their enriched gene ontology annotations. Only modules and annotations with significant enrichment (false discovery rate <0.05 after accounting for multiple hypothesis testing) are shown.
  • FIGS. 3A-3C show molecular portraits of HCC from imaging features, where modules associated with HCC proliferation ( FIG. 3A ), liver synthetic function ( FIG. 3B ), and extracellular matrix remodeling ( FIG. 3C ) are shown; each column is a tumor sample; each row is a gene. Imaging features specifying each module are outlined on top; expression pattern of genes within the module as distinguished by imaging features are shown on bottom.
  • FIGS. 4A-4B show that imaging features predict venous invasion and survival, where a two-feature decision tree associated with a gene expression signature of venous invasion is shown to predict histologic venous invasion ( FIG. 4A ), and Kaplan-Meier survival curves of HCC patients with and without “internal arteries” imaging feature are shown in FIG. 4B .
  • FIG. 5 is a Table showing examples of image features.
  • a method wherein an image or one or more imaging features is correlated to an association map of imaging features and biological data.
  • the method finds use in various fields, including medical diagnostics and therapeutics.
  • the methods have use in clinical subject/patient disease screening, diagnosis, characterization, and treatment selection.
  • the method is based on correlating biological data with associated imaging data, to construct a bidirectional association map, as will be illustrated below in Example 1.
  • the biological data for construction of the association map can be obtained from a database or generated from patient biological samples. Databases of polynucleotide and protein expression data are well known. Such gene expression data can also be obtained, for example, using a DNA microarray that surveys the expression levels of thousands of genes simultaneously. For example, a 21-gene assay, termed Oncotype Dx, is a commercially available DNA microarray to determine prognosis and predict response of primary breast tumors to chemotherapy. A 70-gene signature known as Mammaprint is known for use in determining an adjuvant chemotherapeutic regimen in primary breast cancer. Gene expression signatures have also been identified to predict prognosis or therapeutic response in lung cancer, leukemia, and prostate cancer.
  • Gene expression data can be for any tissue source, such as cancerous tissue, tissue associated with a malignant or benign growth, infected tissue, inflamed tissue, and the like.
  • Gene expression data may relate to expression levels, splicing patterns, gene copy number, chromosomal alterations (e.g., deletions, amplifications, inversions, and translocations), single nucleotide polymorphisms, and the like.
  • Gene expression data include epigenetic data, e.g., relating to DNA methylation and histone modifications (e.g. acetylation, methylation, and ubiquitination).
  • Gene expression data may be based on analyses of DNA, cDNA, mRNA, snRNA, iRNA, or other nucleic acids.
  • Biological data includes data based on protein-based analyses, including tissue protein expression profiles of different tissues (e.g. cancer, infected, inflamed, etc). Particular examples include biological data from Serial Analysis of Gene Expression (SAGE), nuclear magnetic resonance, protein-interaction screens, chromatin immunoprecipitation-chips, isotope coded affinity tagging, activity based reagents, gel or chromatographic separation, RNAi screens, tissue arrays, or mass spectrometry, in which a large number of genes, proteins, or metabolites are measured in a single experiment or assay. Biological data also include data from serological tests, EKGs, EEGs, urinalysis, and other clinical and forensic analyses.
  • the method combines the association map with imaging data.
  • imaging data can be obtained from a wide variety of sources, including but not limited to magnetic resonance imaging (MRI), positron emission tomography (PET), computerized tomography (CT), ultrasonography (US), optical imaging, infrared imaging, and x-ray radiography.
  • Imaging can be coupled with drugs or compounds, contrast agents or other agents or stimuli, or medical devices to elicit additional information from the imaging. Images are obtained using these modalities applied to a tissue sample, a lesion, an organism imaged in whole or in part.
  • the method of constructing an association map comprises providing a plurality of images of, for example, a tissue or a whole or part of an organism, such as a human subject, and biological data that has some relation to the images.
  • images of a solid tumor would preferably be accompanied by biological data based on the imaged solid tumor or on a like solid tumor. That is, images of tumors in the thyroid or images of infected tissue on a limb would have corresponding biological data from thyroid tumors or infected limb tissue, respectively.
  • the image and the biological data derive from the same tissue or organism; however, a population of images and a population of biological data need not have a one-to-one correspondence.
  • An exemplary association map relating to human hepatocellular carcinoma is constructed by inspecting the imaging data and identifying distinctive features in the image. Examples of distinctive image features (or traits) for human hepatocellular carcinomas are shown in FIGS. 1A-1C , where computerized tomography (CT) images of features referred to as internal arteries ( FIG. 1A ), hypodense halo ( FIG. 1B ), and texture heterogeneity ( FIG. 1C ) were identified. As will be illustrated below (Example 1), the image or images may be scrutinized to extract certain features or traits that inform gene expression. Such features include observations related to morphology, composition, structure, and/or physiology.
  • tissue necrosis
  • tissue heterogeneity
  • tumor margin score
  • internal septa
  • enhancement pattern
  • internal arteries
  • hypodense halo
  • wash-out
  • wash-in
  • texture heterogeneity
  • capsule infiltration
  • other imaging features familiar to artisans.
  • imaging features are associated with a unique image, imaging study, examination, subject or population, all of which are data relating to the image.
  • image data independently or in combination define elements or components of the image, or the composite imaging appearance itself, which are included in the biological data used to construct an association map.
  • the method of constructing an association map further includes one or both of (i) using an algorithm to identify relationships between one or more imaging features and the biological data and/or (ii) evaluating the statistical significance of the association map.
  • a module networks algorithm is suitable for use (Segal, E. et al., Nat. Genet., 34:166-176 (2003)), wherein the algorithm identifies groups of genes, termed modules, which demonstrate coherent variation in expression across multiple samples.
  • This algorithm further applies an iterative Bayesian probabilistic analysis to identify combinations of imaging features that can predict the expression levels of gene modules.
  • Bayesian probabilistic analysis refers broadly to a genus of related models and their derivatives. Multiple regression analysis and other analyses are known in the art. Classification algorithms such as neural networks, support vector machines, decision trees, Markov networks, and their derivatives may be applied. An exemplary analysis involves application of the Cox proportional hazards model. Other algorithms that can identify multi-way relationships may also be used.
  • association map is applicable to, and predictive for, images and/or biological data that was not used in the construction of the map.
  • Such statistical analysis thereby provides a means to validate the association map as being generally applicable (i.e., generalizable) to other images and biological data.
  • a feature of some embodiments of the present method is confirmation of the statistical significance and predictive value of the association map.
  • Statistical significance can be evaluated in several ways, for example, by comparing the actual/observed association map with theoretical maps derived from modified/permuted data sets, e.g., where the imaging features and biological data have been scrambled. Observation of the same image feature-biological data association at equal frequency using such scrambled data strongly suggests that the image feature or gene module is noisy and non-specific.
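  • The comparison with permuted data sets can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of the application: the test statistic (absolute difference in mean expression between samples with and without a binary imaging feature), the function names, and the toy data are all hypothetical.

```python
import random
from statistics import mean

def mean_difference(feature, expression):
    """Absolute difference in mean expression between feature-present and feature-absent samples."""
    present = [e for f, e in zip(feature, expression) if f]
    absent = [e for f, e in zip(feature, expression) if not f]
    return abs(mean(present) - mean(absent))

def permutation_p_value(feature, expression, n_permutations=2000, seed=0):
    """Fraction of scrambled data sets whose association is at least as strong as the observed one."""
    rng = random.Random(seed)
    observed = mean_difference(feature, expression)
    shuffled = list(feature)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # scramble which samples carry the imaging feature
        if mean_difference(shuffled, expression) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)

# Hypothetical data: one binary imaging feature and one module's mean expression per sample.
feature = [1, 1, 1, 1, 0, 0, 0, 0]
expression = [2.1, 1.9, 2.4, 2.2, 0.3, 0.1, -0.2, 0.4]
p = permutation_p_value(feature, expression)
```

    A small p-value here means the scrambled data rarely reproduce the observed association, which is the opposite of the noisy, non-specific case described above.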
  • Another approach is cross-validation, of which leave-one-out analysis is one form.
  • half, ten percent, or a single individual can be left out as the test, and the procedure is iterated until each individual subject in the data set has been used both as the test and for training.
  • Such iterative learning procedures may be a component of the module network algorithm, described above.
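  • A leave-one-out version of this iteration might look like the sketch below. The predictor is deliberately trivial (the mean expression of training samples that share the held-out sample's binary feature value) and stands in for the module network; the function names and data are assumptions made for illustration.

```python
from statistics import mean

def fit_group_means(features, expression):
    """Learn the mean expression for feature-present (1) and feature-absent (0) samples."""
    groups = {0: [], 1: []}
    for f, e in zip(features, expression):
        groups[f].append(e)
    overall = mean(expression)
    # Fall back to the overall mean if a group is empty in the training fold.
    return {k: (mean(v) if v else overall) for k, v in groups.items()}

def leave_one_out_error(features, expression):
    """Hold out each sample once, train on the rest, and average the squared test errors."""
    squared_errors = []
    for i in range(len(features)):
        train_f = features[:i] + features[i + 1:]
        train_e = expression[:i] + expression[i + 1:]
        model = fit_group_means(train_f, train_e)
        squared_errors.append((model[features[i]] - expression[i]) ** 2)
    return mean(squared_errors)

# Hypothetical data: binary imaging feature and module expression for six samples.
features = [1, 1, 1, 0, 0, 0]
expression = [2.0, 2.2, 1.8, 0.1, -0.1, 0.0]
loo_mse = leave_one_out_error(features, expression)

# Baseline: error of always predicting the overall mean, ignoring the imaging feature.
overall_mean = mean(expression)
baseline = mean([(e - overall_mean) ** 2 for e in expression])
```

    Comparing `loo_mse` with `baseline` indicates whether the imaging feature carries predictive information on samples not used for training.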
  • the most robust method for confirming statistical significance and predictive value is to test the association map against a completely independent set of subjects. Because the association map has not been trained on the new set of patients, the ability of the map to predict the outcomes in the test set provides strong evidence that the association map is generalizable—meaning that the map can be used to give diagnostic and prognostic information on most, if not all, future subjects.
  • An approach to constructing an association map is illustrated in Example 1 using imaging features on three-phase contrast-enhanced CT and gene expression patterns of 28 human hepatocellular carcinomas (HCC).
  • the association map is used to guide treatment or provide a diagnosis of a subject.
  • an image of a tumor in a subject, such as a brain, breast, lung, or prostate tumor, can be viewed in light of the association map to inform the clinician of the gene or protein expression of the patient.
  • Knowledge of the gene or protein expression profile, i.e., molecular based information, about the patient informs the clinician about a patient's likely response to a drug, probability of relapse, survival rate, disease free survival, and the like.
  • Such information will guide the treatment regimen, including the drug selection, dose, dosing regimen, and whether additional treatments should be considered, such as radiotherapy or tumor resection.
  • a noninvasive image of a patient informs the clinician of molecular information useful in guiding treatment.
  • the methods can also be used for preventative medicine, in which case the biological data, with indeterminate image data, may suggest further imaging to be performed on a subject, e.g., to watch for likely diseases or conditions. This situation would arise, for example, when a subject was at risk for a disease, based on genetic data, lifestyle data, and laboratory tests but the presence of the disease could not be definitively shown by imaging or other methods.
  • Association maps are also suited for use in predicting subject outcome. Gene expression data or sequence variation patterns that predict treatment response to particular therapies are reported in the medical literature. For example, subjects with breast cancer that express particular cell surface receptors, such as HER2, are more responsive to certain chemotherapeutic agents than subjects that do not express those cell surface receptors. Thus, an image of a tumor or other diseased tissue in a subject, viewed in light of an association map, can be used to predict response to a selected treatment.
  • association maps can be constructed from images and biological data generated or gathered solely for this purpose, or another particular purpose. For example, images of patients that were not responsive to a particular drug and biological data from the subjects can be used to build an association map.
  • An association map between imaging and biological data can also be used to design a targeted therapeutic treatment regimen for a patient, providing a personalized care program. Based on an image of a tumor viewed in light of an association map for that tumor type, information about the gene and/or protein expression of the tumor can be determined. Understanding the tumor cell surface receptors permits selection of targeting agents, such as antibody fragments or other agents that have binding specificity for particular cell surface receptors, that can guide or direct a drug to the tumor cell.
  • the targeting agent can be attached directly to the drug, or attached to a carrier for the drug, such as a liposome.
  • Imaging features/traits. One hundred thirty eight (138) distinct imaging features that were present in at least one tumor sample were defined and were scored across all tumor samples. Features were selected a priori based on intrinsic radiological interest (e.g., internal arteries and hypodense halos). Features were also filtered based on their frequency and prominence in the data, inter-observer agreement, and independence from other features based on Pearson correlation (cut off value of 0.9). Thirty-two (32) imaging features were used as input in the Bayesian model, and 28 of 32 were found to be informative of gene expression ( FIG. 5 ).
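  • The independence filter based on Pearson correlation can be illustrated with a short sketch. The greedy keep-first strategy, the use of the absolute value of the correlation, the feature names, and the scores are assumptions; the text above states only that a cut off value of 0.9 was used.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length score vectors (non-zero variance assumed)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def filter_redundant(features, cutoff=0.9):
    """Greedily keep a feature only if |r| with every already-kept feature is below the cutoff."""
    kept = []
    for name, scores in features.items():
        if all(abs(pearson(scores, features[k])) < cutoff for k in kept):
            kept.append(name)
    return kept

# Hypothetical scores of three features across six tumor samples.
features = {
    "internal_arteries": [0, 1, 1, 0, 1, 0],
    "internal_arteries_dup": [0, 2, 2, 0, 2, 0],  # perfectly correlated with the first feature
    "hypodense_halo": [1, 0, 1, 0, 0, 1],
}
kept = filter_redundant(features)
```

    Because the filter is greedy, the result depends on the order in which features are visited; ordering by radiological interest or inter-observer agreement would be one way to decide which member of a correlated pair survives.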
  • Microarray data. Gene expression profiles of imaged HCCs were downloaded from Stanford Microarray Database, which is available via the Stanford website. Data from array elements that had hybridization signal over background by 1.5 fold in both Cy5 and Cy3 channels and were present in 70% of samples were centered by mean across samples. Data from replicate probes representing the same gene (as determined by Locuslink ID) were averaged. 6732 genes met these criteria for data quality and were used for subsequent analysis.
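  • The filtering, centering, and replicate-averaging steps can be sketched as below. The data layout (a list of gene ID and value pairs, with `None` marking a spot that failed the 1.5-fold signal filter), the function name, and the numbers are assumptions made for illustration only.

```python
from collections import defaultdict

def preprocess(probes, min_present=0.7):
    """probes: list of (gene_id, values); values hold None where the spot failed the signal filter."""
    per_gene = defaultdict(list)
    for gene_id, values in probes:
        measured = [v for v in values if v is not None]
        if len(measured) / len(values) < min_present:
            continue  # measured in too few samples
        center = sum(measured) / len(measured)
        # Center each probe by its mean across samples, leaving gaps as None.
        per_gene[gene_id].append([None if v is None else v - center for v in values])
    result = {}
    for gene_id, rows in per_gene.items():
        averaged = []
        for column in zip(*rows):  # one column per sample
            present = [v for v in column if v is not None]
            averaged.append(sum(present) / len(present) if present else None)
        result[gene_id] = averaged  # replicate probes averaged per sample
    return result

probes = [
    ("gene_a", [1.0, 2.0, 3.0, None]),   # 75% present: kept
    ("gene_a", [2.0, 4.0, 6.0, 4.0]),    # replicate probe for gene_a
    ("gene_b", [1.0, None, None, 2.0]),  # 50% present: dropped
]
profiles = preprocess(probes)
```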
  • Module network. A module network procedure previously developed was applied (Segal, E. et al., Nat. Genet., 34:166-176 (2003)) to construct an association map between imaging features and gene expression profiles.
  • the module network procedure takes as input gene expression data and a set of potential regulatory inputs, and attempts to partition the expression data into distinct and mutually exclusive modules, such that the expression of the genes assigned to each module can be well predicted by a small decision tree of regulatory inputs.
  • the regulatory inputs were set to be the real-valued imaging features and were applied to the expression data described above.
  • the 116 imaging networks can be interactively searched (Segal et al. (2007) Nat. Biotechnol. 25:675-80).
  • Mapping venous invasion genes to imaging features.
  • seven (7) modules that were significantly enriched for these genes were identified using the hypergeometric distribution as described above.
  • the associated imaging feature trees of the 7 modules were analyzed (Table, below), and two features, internal arteries and halos, were found to be overrepresented among the top splits.
  • To determine the consensus threshold for applying these features for this purpose, the p-value weighted average of the splits from the 7 image feature trees was calculated. The consensus thresholds were used for the imaging feature decision tree of FIG. 4A .
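  • A p-value weighted average of split thresholds could be computed as in the sketch below. The weighting scheme is not specified in the text; using -log10(p) as the weight, so that more significant splits count more, is an assumption, as are the function name and the example numbers.

```python
from math import log10

def consensus_threshold(splits):
    """splits: list of (threshold, p_value) pairs for one imaging feature across trees."""
    # Assumed weighting: -log10(p), so smaller p-values receive larger weights.
    weights = [-log10(p) for _, p in splits]
    weighted_sum = sum(w * t for w, (t, _) in zip(weights, splits))
    return weighted_sum / sum(weights)

# Hypothetical splits on one feature taken from three module trees.
splits = [(0.5, 1e-4), (0.7, 1e-2), (0.6, 1e-2)]
threshold = consensus_threshold(splits)
```

    With these numbers the weights are 4, 2, and 2, so the consensus is (4 x 0.5 + 2 x 0.7 + 2 x 0.6) / 8 = 0.575.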
  • Construction of Association Map.
  • a three-step strategy was used to create an “association map” between imaging features and gene expression patterns. More particularly, an association map between imaging features on three-phase contrast-enhanced CT and gene expression patterns of 28 human hepatocellular carcinomas (HCC; Chen, X. et al., Mol. Biol. Cell, 13:1929-1939 (2002)) was constructed, as shown in FIG. 1D .
  • 138 distinctive imaging features present in one or more HCCs were defined and quantified.
  • Thirty two imaging features were judged most promising by these criteria and used for subsequent analysis ( FIG. 5 ). For instance, and with reference to FIGS. 1A-1C , channels of radio-dense signal within certain tumors on the arterial phase of the CT scan were noted, and this feature was termed “internal arteries”.
  • a module networks algorithm (Segal, E. et al., Nat. Genet., 34:166-176 (2003)) was adopted to systematically search for associations between expression levels of 6732 well-measured genes determined by microarray analysis (Chen, X. et al., Mol Biol Cell, 13:1929-1939 (2002)) and combinations of imaging features.
  • the algorithm identifies groups of genes, termed modules, which demonstrate coherent variation in expression across multiple samples.
  • the algorithm further applies an iterative Bayesian probabilistic procedure to identify combinations of imaging features that can predict the expression levels of gene modules. An end result is identification of specific networks of imaging features that predict the expression level of gene modules. Each network of imaging features predicts the expression level of one gene module.
  • the association map of imaging features and gene expression revealed that a surprisingly large fraction of the gene expression program can be reconstructed from a small number of imaging features, as seen in FIGS. 2A-2B.
  • the expression variation in 6732 genes was captured by 116 gene modules, each of which was associated with specific combinations of imaging features. For each module, presence or absence of combinations of imaging features predicted the aggregate expression level of genes within the module ( FIG. 2A ).
  • the combinations of relevant imaging features are depicted in decision trees: each split in the tree is specified by variation of an imaging feature; each terminal leaf in the tree is a cluster of samples that share similar expression pattern of module genes.
  • the association map allowed one to predict the relative expression level of a gene (by mapping to a module) in a given HCC sample (by mapping to a cluster).
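  • The two mappings described above (gene to module, sample to cluster) can be sketched with a small decision tree. The tree structure, thresholds, leaf values, and the module name are hypothetical and chosen only to show the lookup; they are not taken from the association map of Example 1.

```python
def predict(node, sample):
    """Walk a decision tree of imaging-feature splits down to a leaf value."""
    while "leaf" not in node:
        branch = "high" if sample[node["feature"]] > node["threshold"] else "low"
        node = node[branch]
    return node["leaf"]

# Hypothetical two-split tree for one gene module.
# Each leaf is a cluster of samples; its value is the mean expression of module genes.
module_tree = {
    "feature": "internal_arteries", "threshold": 0.5,
    "high": {"leaf": 1.2},
    "low": {
        "feature": "hypodense_halo", "threshold": 0.5,
        "high": {"leaf": -0.8},
        "low": {"leaf": 0.1},
    },
}

gene_to_module = {"PCNA": "proliferation"}  # illustrative module assignment
module_trees = {"proliferation": module_tree}

def predict_gene(gene, sample):
    """Map the gene to its module, then map the sample to a cluster via the module's tree."""
    return predict(module_trees[gene_to_module[gene]], sample)

sample = {"internal_arteries": 0.0, "hypodense_halo": 1.0}
level = predict_gene("PCNA", sample)
```

    Every gene in a module receives the same predicted relative level for a given sample, which is why the prediction is described as approximate.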
  • the hierarchical combination of only 28 imaging features was sufficient to predict the variation of all 116 gene modules. As shown in FIG. 2B , only nine features were sufficient to predict the expression patterns of 50% of the full complement of gene activities, and the prediction plateaus to above 80% of the full complement of gene activities with more than 23 features. For each gene, the number of features needed to predict its variation was on average three and no more than four in any instance.
  • the association of imaging features and gene expression was highly significant by several independent statistical criteria. Specification of the entire module network involved 355 splits based on imaging features. The average gene expression levels between the two sides of each split were significantly different in 299 of 355 splits (p<0.05 after applying the conservative Bonferroni correction), accounting for 5282 of 6732 input genes (78.5%).
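  • The Bonferroni correction multiplies each p-value by the number of tests before comparing it with the significance level. A minimal sketch follows; the p-values are hypothetical, while the gene counts in the last line are those reported above.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag p-values that remain below alpha after multiplying by the number of tests."""
    m = len(p_values)
    return [min(p * m, 1.0) < alpha for p in p_values]

# Hypothetical p-values for four splits.
p_values = [0.0001, 0.004, 0.02, 0.3]
flags = bonferroni_significant(p_values)

# Fraction of input genes covered by significant splits, from the counts in the text.
fraction_covered = 5282 / 6732
```

    With 355 splits, as in the module network above, a raw p-value would need to fall below 0.05 / 355 (about 0.00014) to pass the same test.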
  • association map imaging features predictive of expression level of specific genes are directly revealed, and the potential physiologic significance of many imaging features can be inferred from their associated genes.
  • the distribution of genes into modules defined by imaging features was not random, but was highly enriched for specific and diverse biological functions and processes.
  • Comparison of gene membership in modules versus the published Gene Ontology annotation revealed significant overlaps, as shown in FIG. 2C , allowing many key physiologic properties of tumors to be gleaned from CT images.
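  • Overlap between a module and an annotation category is commonly scored with the upper tail of the hypergeometric distribution, which is also the distribution named elsewhere in this description for the venous invasion genes. The sketch below uses the 6732-gene background reported above; the annotation size, module size, and overlap are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(population, annotated, module_size, overlap):
    """P(X >= overlap) when module_size genes are drawn without replacement from the population."""
    total = comb(population, module_size)
    upper = min(annotated, module_size)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, module_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Hypothetical: 200 annotated genes among 6732, a module of 50 genes, 10 of them annotated.
p = hypergeom_enrichment_p(population=6732, annotated=200, module_size=50, overlap=10)
```

    Because many modules and annotations are tested, such p-values would still need a multiple-testing adjustment, such as the false discovery rate control mentioned for FIG. 2C.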
  • three image features predicted the expression level of module 697 that is highly enriched in genes involved in cell proliferation, including PCNA, cyclin A, MCM5, MCM6, and geminin, as shown in FIG. 3A .
  • expression level of VEGF an important driver of tumor angiogenesis and target of the approved chemotherapy drug bevacizumab (Kerr, D. J., Nat. Clin. Pract. Oncol., 1:39-43 (2004)), co-varies with these cell cycle genes and is predicted by the same imaging features, as seen in FIG. 3A .
  • the association provides a method for non-invasively delineating a molecularly distinct subset of tumors for a targeted therapeutic strategy.
  • liver synthetic function of HCC patients is an important guide of disease severity (Thomas, M. B. et al., J. Clin. Oncol., 23:8093-8108 (2005)), and this information is evident in module 595, which details the expression level of albumin, pyruvate kinase, transferrin receptor 2, as well as revealing clotting function (thrombin, factor V, factor X), and detoxification activity (GSTO1, CYP27A1, epoxide hydroxylase), as seen in FIG. 3B .
  • the imaging feature “Tumor Margin Score, Minimum” denotes tumors that show an ill-defined transition zone between tumor and surrounding liver tissue. It was found that the presence of this feature was associated with elevated expression of a group of genes associated with extracellular matrix remodeling, such as MMP2, MMP7, COL3A1, COL6A2, and thrombospondin 1 and thrombospondin 2, as seen in FIG. 3C .
  • MMP2 G. et al., Int. J. Cancer, 97:425-431 (2002)
  • Qin L. X. et al., World J.
  • thrombospondin Qin, L. X. et al., World J. Gastroenterol., 8:385-392 (2002); Poon, R. T. et al., Clin. Cancer Res., 10:4150-4157 (2004)
  • increase tumor invasiveness into surrounding stroma which may lead to the poor demarcation of tumor margins on CT imaging.
  • the association map also enables systematic mapping of a predetermined group of genes to their corresponding imaging features.
  • Expression variation in a group of 91 genes that was associated with microscopic venous invasion has been identified (Chen, X. et al., Mol. Biol Cell, 13:1929-1939 (2002)), and is a well-established sign of poor prognosis (Thomas, M. B. et al., J. Clin. Oncol., 23:8093-8108 (2005)) that is extremely difficult to predict using conventional imaging methods in the absence of gross venous invasion.
  • association map can identify novel imaging features corresponding to gene expression signatures and provide useful information to guide clinical decision making.
  • the global gene expression profiles of liver cancer are embodied in their imaging features.
  • the systematic association between imaging features and gene expression allowed useful inference from both directions: on one hand, the association map identified biological processes, based on specific gene expression programs, which underlie specific imaging features.
  • the association map enabled the use of imaging features to reconstruct the global gene expression programs of cancer, thereby creating a noninvasive “molecular portrait” of the tumor ( FIGS. 3A-3C ).
  • the utility of this approach was shown by identifying and validating a two-feature predictor of venous invasion in HCC ( FIG. 4 ).
  • the “Internal Artery” feature that emerged from this analysis was a significant predictor of survival in two independent groups of patients.
  • liver cancer as an exemplary disease illustrates the robustness of the method.
  • Canonical association maps constructed from large representative series of tumors will enable routine noninvasive diagnosis of genetically heterogeneous tumors, reveal their prognosis, and allow serial profiling of tumors during therapy. This type of imaging based molecular profiling permits personalized medicine.

Abstract

A method for constructing an association map between imaging features and biological data is described. The method comprises combining one or more image features relating to a clinical subject with biological data and using an algorithm to make predictions based on the features and data.

Description

    STATEMENT REGARDING GOVERNMENT INTEREST
  • This work was supported in part by grant number 1 K08 AR050007 from the National Institutes of Health. The U.S. Government has certain rights in the invention.
  • TECHNICAL FIELD
  • The subject matter described herein relates to methods for predicting disease risk, prognosis, and best treatment regimens in clinical subjects. The methods involve evaluating a subject's non-invasively obtained imaging features in view of an association map that correlates imaging features with biological data.
  • BACKGROUND
  • Scientists and clinicians routinely use non-invasive imaging to detail the physical and structural composition of living matter. Assessing the genetic and biochemical makeup of living tissue through non-invasive imaging is a desirable goal of current research. Recent developments in genomic and proteomic methods have enabled molecular profiling of biological specimens by simultaneously revealing the expression level of thousands of genes and proteins. For example, gene expression patterns of cancer can reveal its etiology, prognosis, and therapeutic potential (Chung, C. H. et al., Nat. Genet., 32 Suppl.:533-540 (2002); Segal, E. et al., Nat. Genet., 37 Suppl.:S38-45 (2005); Chen, X. et al., Mol. Biol. Cell, 13:1929-1939 (2002)).
  • Current methods of molecular profiling often require invasive surgeries for tissue procurement and specialized equipment, thus limiting their routine use. In some cases, current profiling methods provide a single snapshot in time because they are destructive by nature, in that cells must be disintegrated to extract nucleic acids or proteins for analysis. Another barrier to widespread use of molecular profiling is that human tissues exhibit diverse distinctive features on noninvasive radiographic imaging, many of which currently have no known significance. Because imaging features of tissues reflect the dynamic and physiologic interplay of parenchymal cells, blood vessels, and stroma, it would be desirable if imaging features could be used to predict specific gene expression patterns in human diseases.
  • The foregoing examples of the related art and limitations related therewith are intended to be illustrative and not exclusive. Other limitations of the related art will become apparent to those of skill in the art upon a reading of the specification and a study of the drawings.
  • BRIEF SUMMARY
  • The following aspects and embodiments thereof described and illustrated below are meant to be exemplary and illustrative, not limiting in scope.
  • In one aspect, a method of constructing an association map between imaging features and biological data is provided, comprising:
      • identifying one or more imaging features from a plurality of images of a subject;
      • applying an algorithm to identify relationships between the one or more imaging features and biological data relating to the subject, wherein the identified relationships are used to construct an association map between the one or more imaging features and the biological data;
      • evaluating the statistical significance of the association map to test its predictive value.
  • In some embodiments, the features from a plurality of images of a subject are associated with a disease.
  • In some embodiments, the identifying comprises identifying one or more imaging features based on frequency of the one or more features in the plurality of images.
  • In some embodiments, the identifying comprises identifying one or more imaging features based on their independence from other features.
  • In some embodiments, the identifying comprises identifying one or more imaging features from images obtained using an imaging technique selected from the group consisting of computerized tomography imaging, magnetic resonance imaging (MRI), positron emission tomography (PET), ultrasonography (US), optical imaging, infrared imaging, and x-ray radiography. In particular embodiments, the imaging technique comprises the use of an imaging agent or image-enhancing agent.
  • In some embodiments, the applying comprises applying a module networks algorithm.
  • In some embodiments, the applying comprises applying an algorithm that applies an iterative Bayesian probabilistic procedure that identifies combinations of imaging features that relate to the biological data.
  • In some embodiments, the applying comprises applying an algorithm to gene expression data.
  • In some embodiments, the gene expression data is from a DNA microarray assay. In some embodiments, the gene expression data is from a cDNA microarray assay. In some embodiments, the gene expression data is from an RNA microarray assay.
  • In some embodiments, the applying comprises applying an algorithm to protein expression data.
  • In some embodiments, the evaluating the statistical significance of the association map comprises evaluating by comparison of the map with permuted data sets.
  • In some embodiments, the evaluating the statistical significance of the association map comprises evaluating by testing the prediction using an independent biological data set, independent images, or both.
  • In a related aspect, a method for predicting a gene or protein expression level in a biological sample is provided, comprising:
      • providing an image of the biological sample,
      • comparing the image to an association map as above to predict a gene or protein expression of the biological sample.
  • In some embodiments, the method further comprises, based on the predicting, providing a treatment prognosis of said patient based on the presence and/or absence of certain imaging features.
  • In some embodiments, the providing comprises providing a prediction of a patient's response to a drug. In some embodiments, the providing comprises providing a prediction of a patient's probable survival. In particular embodiments, the probable survival is disease free survival.
  • In some embodiments, the providing comprises providing a likelihood of disease recurrence.
  • In some embodiments, the providing comprises providing a likelihood of metastasis.
  • In another aspect, an association map constructed using the above method is provided.
  • In addition to the exemplary aspects and embodiments described above, further aspects and embodiments will become apparent by reference to the drawings and by study of the following descriptions.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIGS. 1A-1C are computerized tomography (CT) images of distinct features in human hepatocellular carcinomas (HCC), the features referred to as internal arteries (FIG. 1A), hypodense halo (FIG. 1B), and texture heterogeneity (FIG. 1C);
  • FIG. 1D illustrates a strategy for constructing an association map between imaging features and gene expression;
  • FIG. 2A shows an overview of an association map of imaging features and global gene expression, where each column is a sample; each row is a module. For each module, a decision tree of imaging features is associated with variation in the expression level of module genes. Knowledge of the imaging features thus allows an approximate reconstruction of the gene expression pattern.
  • FIG. 2B is a graph showing the cumulative fraction of gene expression variation across the full complement of gene activities that is predicted by the number of imaging features in the model.
  • FIG. 2C shows a matrix of modules, associated imaging features, and their enriched gene ontology annotations. Only modules and annotations with significant enrichment (false discovery rate <0.05 after accounting for multiple hypothesis testing) are shown.
  • FIGS. 3A-3C show molecular portraits of HCC from imaging features, where modules associated with HCC proliferation (FIG. 3A), liver synthetic function (FIG. 3B), and extracellular matrix remodeling (FIG. 3C) are shown; each column is a tumor sample; each row is a gene. Imaging features specifying each module are outlined on top; expression pattern of genes within the module as distinguished by imaging features are shown on bottom.
  • FIGS. 4A-4B show that imaging features predict venous invasion and survival, where a two-feature decision tree associated with a gene expression signature of venous invasion is shown to predict histologic venous invasion (FIG. 4A), and Kaplan-Meier survival curves of HCC patients with and without “internal arteries” imaging feature are shown in FIG. 4B.
  • FIG. 5 is a Table showing examples of image features.
  • DETAILED DESCRIPTION
  • In one aspect, a method is provided wherein an image or one or more imaging features is correlated to an association map of imaging features and biological data. The method finds use in various fields, including medical diagnostics and therapeutics. The methods have use in clinical subject/patient disease screening, diagnosis, characterization, and treatment selection.
  • The method is based on correlating biological data with associated imaging data, to construct a bidirectional association map, as will be illustrated below in Example 1. The biological data for construction of the association map can be obtained from a database or generated from patient biological samples. Databases of polynucleotide and protein expression data are well known. Such gene expression data can also be obtained, for example, using a DNA microarray that surveys the expression levels of thousands of genes simultaneously. For example, a 21-gene assay, termed Oncotype Dx, is a commercially available DNA microarray to determine prognosis and predict response of primary breast tumors to chemotherapy. A 70-gene signature known as Mammaprint is known for use in determining an adjuvant chemotherapeutic regimen in primary breast cancer. Gene expression signatures have also been identified to predict prognosis or therapeutic response in lung cancer, leukemia, and prostate cancer.
  • Data from any or all of these sources, preexisting or generated for the purpose of building an association map, are examples of biological data suitable for use in the method described herein. It will be appreciated that the gene expression data can be for any tissue source, such as cancerous tissue, tissue associated with a malignant or benign growth, infected tissue, inflamed tissue, and the like. Gene expression data may relate to expression levels, splicing patterns, gene copy number, chromosomal alterations (e.g., deletions, amplifications, inversions, and translocations), single nucleotide polymorphisms, and the like. Gene expression data include epigenetic data, e.g., relating to DNA methylation and histone modifications (e.g. acetylation, methylation, and ubiquitination). Gene expression data may be based on analyses of DNA, cDNA, mRNA, snRNA, iRNA, or other nucleic acids.
  • Biological data includes data based on protein-based analyses, including tissue protein expression profiles of different tissues (e.g., cancerous, infected, inflamed, etc.). Also contemplated are biological data from Serial Analysis of Gene Expression (SAGE), nuclear magnetic resonance, protein-interaction screens, chromatin immunoprecipitation-chips, isotope coded affinity tagging, activity based reagents, gel or chromatographic separation, RNAi screens, tissue arrays, or mass spectrometry, in which a large number of genes, proteins, or metabolites are measured in a single experiment or assay. Biological data also include data from serological tests, EKGs, EEGs, urinalysis, and other clinical and forensic analyses.
  • As noted above, the method combines the association map with imaging data. Such imaging data can be obtained from a wide variety of sources, including but not limited to magnetic resonance imaging (MRI), positron emission tomography (PET), computerized tomography (CT), ultrasonography (US), optical imaging, infrared imaging, and x-ray radiography. Imaging can be coupled with drugs or compounds, contrast agents or other agents or stimuli, or medical devices to elicit additional information from the imaging. Images are obtained using these modalities applied to a tissue sample, a lesion, or an organism imaged in whole or in part.
  • In a general embodiment, the method of constructing an association map comprises providing a plurality of images of, for example, a tissue or a whole or part of an organism, such as a human subject, and biological data that has some relation to the images. For example, images of a solid tumor would preferably be accompanied by biological data based on the imaged solid tumor or on a like solid tumor. That is, images of tumors in the thyroid or images of infected tissue on a limb would have corresponding biological data from thyroid tumors or infected limb tissue, respectively. In a preferred embodiment, the image and the biological data derive from the same tissue or organism; however, a population of images and a population of biological data need not have a one-to-one correspondence.
  • An exemplary association map relating to human hepatocellular carcinoma is constructed by inspecting the imaging data and identifying distinctive features in the image. Examples of distinctive image features (or traits) for human hepatocellular carcinomas are shown in FIGS. 1A-1C, where computerized tomography (CT) images of features referred to as internal arteries (FIG. 1A), hypodense halo (FIG. 1B), and texture heterogeneity (FIG. 1C) were identified. As will be illustrated below (Example 1), the image or images may be scrutinized to extract certain features that inform gene expression. Such features include observations related to morphology, composition, structure, and/or physiology. Examples of distinct features that inform gene expression analyses include tissue necrosis, tissue heterogeneity, tumor margin score, internal septa, enhancement pattern, internal arteries, hypodense halo, wash-out, wash-in, texture heterogeneity, capsule, infiltration, and other imaging features familiar to artisans.
  • Such imaging features (and representative data) are associated with a unique image, imaging study, examination, subject or population, all of which are data relating to the image. Such image data independently or in combination define elements or components of the image, or the composite imaging appearance itself, which are included in the biological data used to construct an association map.
  • It will be appreciated that a single imaging feature may be sufficient to add value to an association map; however, more (and more detailed) features/data are generally preferred.
  • In some embodiments, the method of constructing an association map further includes one or both of (i) using an algorithm to identify relationships between one or more imaging features and the biological data and/or (ii) evaluating the statistical significance of the association map.
  • With respect to (i), algorithms that identify relationships between the imaging features and the biological data are known in the art, and such identified relationships form the basis for constructing an association map between such imaging features and biological data. For example, a module network algorithm is suitable for use (Segal, E. et al., Nat. Genet., 34:166-176 (2003)) wherein the algorithm identifies groups of genes, termed modules, which demonstrate coherent variation in expression across multiple samples. This algorithm further applies an iterative Bayesian probabilistic analysis to identify combinations of imaging features that can predict the expression levels of gene modules.
  • As used herein, Bayesian probabilistic analysis refers broadly to a genus of related models and their derivatives. Multiple regression analysis and other analyses are known in the art. Classification algorithms such as neural networks, support vector machines, decision trees, Markov networks, and their derivatives may be applied. An exemplary analysis involves application of the Cox proportional hazard model. Other algorithms that can identify multi-way relationships may also be used.
  • With respect to (ii), evaluating the statistical significance of the association map ensures that the map is applicable to, and predictive for, images and/or biological data that was not used in the construction of the map. Such statistical analysis thereby provides a means to validate the association map as being generally applicable (i.e., generalizable) to other images and biological data.
  • For example, when two large biological data sets are compared, many apparent associations will occur by chance alone. These spurious associations are not useful, and in fact interfere with the identification of significant (i.e., “real” or “actual”) associations that have predictive value. Thus, a feature of some embodiments of the present method is confirmation of the statistical significance and predictive value of the association map.
  • Statistical significance can be evaluated in several ways, for example, by comparing the actual/observed association map with theoretical maps derived from modified/permuted data sets, e.g., where the imaging features and biological data have been scrambled. Observation of the same image feature-biological data association at equal frequency using such scrambled data strongly suggests that the image feature or gene module is noisy and non-specific.
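  • By way of non-limiting illustration, the comparison against scrambled data can be sketched as a label-permutation test. The sketch below is not the procedure of Example 1: the test statistic (a difference in mean expression between samples with and without a feature), the function names, and the toy values are all assumptions introduced for illustration.

```python
import random

def mean_difference(labels, values):
    """Mean of values where the feature is present minus mean where absent."""
    present = [v for l, v in zip(labels, values) if l]
    absent = [v for l, v in zip(labels, values) if not l]
    return sum(present) / len(present) - sum(absent) / len(absent)

def permutation_p_value(labels, values, n_perm=2000, seed=0):
    """Fraction of scrambled data sets whose association is at least as
    strong as the observed one (with the customary +1 correction)."""
    rng = random.Random(seed)
    observed = abs(mean_difference(labels, values))
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # scramble which sample carries the feature
        if abs(mean_difference(shuffled, values)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical toy data: presence of one imaging feature and the mean
# expression of one gene module across eight samples.
feature_present = [0, 0, 0, 0, 1, 1, 1, 1]
module_expression = [-1.2, -0.8, -1.0, -0.9, 1.1, 0.9, 1.0, 0.8]
p_value = permutation_p_value(feature_present, module_expression)
```

  • A small p-value indicates that the observed association is rarely matched by scrambled data; an association reproduced at similar frequency after scrambling would be treated as non-specific.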
  • In addition, statistical significance and predictive value can be evaluated by cross-validation, of which leave-one-out analysis is one form. This means that an association map is constructed on some fraction of the subject biological data or image features, and the resulting map is used to predict the outcome in the remaining subjects not used to “train” the algorithm. In practice, half, ten percent, or a single individual can be left out as the test, and the procedure is iterated until each individual subject in the data set has been used both as the test and for training. Such iterative learning procedures may be a component of the module network algorithm, described above.
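  • By way of non-limiting illustration, the case in which a single individual is left out at each iteration can be sketched as follows. The predictor used here (the mean expression of training samples sharing the held-out sample's binary feature value) is a deliberately simple stand-in chosen for illustration, and the data are hypothetical.

```python
def leave_one_out_predictions(features, expression):
    """For each sample, predict its expression from the mean of the other
    samples that share its (binary) imaging feature value."""
    predictions = []
    for i, f in enumerate(features):
        train = [e for j, (g, e) in enumerate(zip(features, expression))
                 if j != i and g == f]
        if not train:  # no training sample shares this feature value
            train = [e for j, e in enumerate(expression) if j != i]
        predictions.append(sum(train) / len(train))
    return predictions

# Hypothetical toy data: six samples, one binary imaging feature.
features = [0, 0, 0, 1, 1, 1]
expression = [-1.0, -0.8, -1.2, 0.9, 1.1, 1.0]

preds = leave_one_out_predictions(features, expression)
# Mean squared error over held-out samples; no sample is ever used to
# predict itself.
mse = sum((p - e) ** 2 for p, e in zip(preds, expression)) / len(expression)
```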
  • Finally, the most robust method for confirming statistical significance and predictive value is to test the association map against a completely independent set of subjects. Because the association map has not been trained on the new set of patients, the ability of the map to predict the outcomes in the test set provides strong evidence that the association map is generalizable—meaning that the map can be used to give diagnostic and prognostic information on most, if not all, future subjects.
  • An approach to constructing an association map is illustrated in Example 1 using imaging features on three-phase contrast-enhanced CT and gene expression patterns of 28 human hepatocellular carcinomas (HCC). As will become apparent, global gene expression patterns of human cancers are encoded in their dynamic imaging features. In order to relate gene expression to imaging, distinctive features from qualitative imaging were identified, and coherent patterns of variation from gene expression profiles were defined.
  • In another aspect, methods for using an association map constructed as described above, and as exemplified in Example 1, are provided. In one embodiment, the association map is used to guide treatment or provide a diagnosis of a subject. For example, an image of a tumor in a subject, such as a brain, breast, lung, or prostate tumor, can be viewed in light of the association map to inform the clinician of the gene or protein expression of the patient. Knowledge of the gene or protein expression profile, i.e., molecular based information, about the patient informs the clinician about a patient's likely response to a drug, probability of relapse, survival rate, disease free survival, and the like. Such information will guide the treatment regimen, including the drug selection, dose, dosing regimen, and whether additional treatments should be considered, such as radiotherapy or tumor resection. Thus, a noninvasive image of a patient informs the clinician of molecular information useful in guiding treatment.
  • While the methods have been exemplified mainly using disease conditions, the methods can also be used for preventative medicine, in which case the biological data, with indeterminate image data, may suggest further imaging to be performed on a subject, e.g., to watch for likely diseases or conditions. This situation would arise, for example, when a subject was at risk for a disease, based on genetic data, lifestyle data, and laboratory tests but the presence of the disease could not be definitively shown by imaging or other methods.
  • Association maps are also suited for use in predicting subject outcome. Gene expression data or sequence variation patterns that predict treatment response to particular therapies are reported in the medical literature. For example, subjects with breast cancer that express particular cell surface receptors, such as HER2, are more responsive to certain chemotherapeutic agents than subjects that do not express those cell surface receptors. Thus, an image of a tumor or other diseased tissue in a subject, viewed in light of an association map, can be used to predict response to a selected treatment.
  • It will also be appreciated that association maps can be constructed from images and biological data generated or gathered solely for this purpose, or another particular purpose. For example, images of patients that were not responsive to a particular drug and biological data from the subjects can be used to build an association map.
  • An association map between imaging and biological data can also be used to design a targeted therapeutic treatment regimen for a patient, providing a personalized care program. Based on an image of a tumor viewed in light of an association map for that tumor type, information about the gene and/or protein expression of the tumor can be determined. Understanding the tumor cell surface receptors permits selection of targeting agents, such as antibody fragments or other agents that have binding specificity for particular cell surface receptors, that can guide or direct a drug to the tumor cell. The targeting agent can be attached directly to the drug, or attached to a carrier for the drug, such as a liposome.
  • It will be appreciated that the method described herein can be accompanied, if desired, by additional clinical information for a patient, such as a
  • III. EXAMPLES
  • The following examples are illustrative in nature and are in no way intended to be limiting.
  • Materials and Methods
  • Imaging features/traits. One hundred thirty eight (138) distinct imaging features that were present in at least one tumor sample were defined and were scored across all tumor samples. Features were selected a priori based on intrinsic radiological interest (e.g., internal arteries and hypodense halos). Features were also filtered based on their frequency and prominence in the data, inter-observer agreement and independence from other features based on Pearson correlation (cut off value of 0.9). Thirty-two (32) imaging features were used as input in the Bayesian model, and 28 of 32 were found to be informative of gene expression (FIG. 5).
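  • By way of non-limiting illustration, the independence filter based on Pearson correlation with a cutoff of 0.9 can be sketched as follows. The greedy keep-first ordering, the function names, and the toy scores are assumptions; the source states only the correlation measure and the cutoff.

```python
def pearson(x, y):
    """Pearson correlation of two equal-length score vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:  # a constant feature carries no correlation
        return 0.0
    return sxy / (sxx * syy) ** 0.5

def filter_redundant(features, cutoff=0.9):
    """Keep a feature only if |r| with every already-kept feature is below
    the cutoff (greedy, in insertion order; the ordering is an assumption)."""
    kept = []
    for name, scores in features.items():
        if all(abs(pearson(scores, features[k])) < cutoff for k in kept):
            kept.append(name)
    return kept

# Hypothetical scores for three features across six tumors.
features = {
    "internal_arteries_density": [0, 1, 2, 3, 1, 0],
    "internal_arteries_copy": [0, 1, 2, 3, 1, 0],  # redundant duplicate
    "hypodense_halo": [1, 0, 0, 1, 0, 1],
}
kept = filter_redundant(features)
```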
  • Microarray data. Gene expression profiles of imaged HCCs were downloaded from Stanford Microarray Database, which is available via the Stanford website. Data from array elements that had hybridization signal over background by 1.5 fold in both Cy5 and Cy3 channels and were present in 70% of samples were centered by mean across samples. Data from replicate probes representing the same gene (as determined by LocusLink ID) were averaged. 6732 genes met these criteria for data quality and were used for subsequent analysis.
  • Module network. A previously developed module network procedure (Segal, E. et al., Nat. Genet., 34:166-176 (2003)) was applied to construct an association map between imaging features and gene expression profiles. The module network procedure takes as input gene expression data and a set of potential regulatory inputs, and attempts to partition the expression data into distinct and mutually exclusive modules, such that the expression of the genes assigned to each module can be well predicted by a small decision tree of regulatory inputs. The regulatory inputs were set to be the real-valued imaging features and were applied to the expression data described above. The 116 imaging networks can be interactively searched (Segal et al. (2007) Nat. Biotechnol. 25:675-80).
  • Module enrichment in Gene Ontology annotations. Significance of overlap between genes in modules and gene ontology annotations was calculated by comparison to the degree of overlap expected by chance alone using the hypergeometric distribution. Multiple hypothesis testing was accounted for by calculating a false discovery rate and presenting results with FDR<0.05.
  • Mapping venous invasion genes to imaging features. To find imaging features that correspond to the set of 91 genes associated with venous invasion, seven (7) modules that were significantly enriched for these genes were identified using the hypergeometric distribution as described above. The associated imaging feature trees of the 7 modules were analyzed (Table, below), and two features, internal arteries and halos, were found to be overrepresented among the top splits. To identify the consensus threshold of applying these features for this purpose, the p-value weighted average of the splits from the 7 image feature trees was calculated. The consensus thresholds were used for the imaging feature decision tree of FIG. 4A.
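  • By way of non-limiting illustration, the probe filtering, mean centering, and replicate averaging described above can be sketched as follows. The input layout (a list of gene identifier and value pairs, with None marking a measurement that failed the signal filter), the function name, and the toy values are assumptions.

```python
from collections import defaultdict

def preprocess(probes, min_present=0.7):
    """probes: list of (gene_id, values); None marks a measurement that
    failed the signal-over-background filter.
    Returns {gene_id: mean-centered values averaged over replicate probes}."""
    centered = defaultdict(list)
    for gene_id, values in probes:
        present = [v for v in values if v is not None]
        if len(present) / len(values) < min_present:
            continue  # probe measured in too few samples
        mean = sum(present) / len(present)
        centered[gene_id].append(
            [None if v is None else v - mean for v in values])
    genes = {}
    for gene_id, rows in centered.items():
        averaged = []
        for column in zip(*rows):  # one column per sample
            vals = [v for v in column if v is not None]
            averaged.append(sum(vals) / len(vals) if vals else None)
        genes[gene_id] = averaged
    return genes

# Hypothetical probes across four samples.
probes = [
    ("GENE_A", [1.0, 2.0, 3.0, 2.0]),
    ("GENE_A", [3.0, 4.0, 5.0, 4.0]),    # replicate probe for GENE_A
    ("GENE_B", [0.5, None, None, 1.5]),  # present in only 50% of samples
]
genes = preprocess(probes)
```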
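  • By way of non-limiting illustration, a single split of a decision tree that predicts module expression from one real-valued imaging feature can be sketched as follows. The module network procedure scores splits within a Bayesian framework; this sketch substitutes a plain least-squares criterion, and the function name and data are hypothetical.

```python
def best_split(feature_values, module_expression):
    """Return (threshold, sse) for the split on one imaging feature that
    minimizes the summed squared error of the two resulting leaves."""
    def sse(vals):
        if not vals:
            return 0.0
        m = sum(vals) / len(vals)
        return sum((v - m) ** 2 for v in vals)

    best = None
    levels = sorted(set(feature_values))
    for lo, hi in zip(levels, levels[1:]):
        threshold = (lo + hi) / 2  # midpoint between adjacent feature levels
        left = [e for f, e in zip(feature_values, module_expression)
                if f <= threshold]
        right = [e for f, e in zip(feature_values, module_expression)
                 if f > threshold]
        score = sse(left) + sse(right)
        if best is None or score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical feature scores and mean module expression for six samples.
internal_arteries = [0.0, 0.0, 1.0, 2.0, 3.0, 3.0]
module_mean = [-1.0, -1.2, -0.9, 1.0, 1.1, 0.9]
threshold, error = best_split(internal_arteries, module_mean)
```

  • Applying the same search recursively within each leaf would yield a small tree of the kind described above; the recursion and any stopping rule are omitted here.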
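  • By way of non-limiting illustration, the enrichment calculation can be sketched as an upper-tail hypergeometric probability followed by a false discovery rate adjustment. The source does not name the false discovery rate procedure; the Benjamini-Hochberg adjustment used below is an assumption, as are the function names and the example counts other than the 6732-gene total.

```python
from math import comb

def hypergeom_p(overlap, module_size, annotated, total):
    """P(X >= overlap) when module_size genes are drawn without replacement
    from `total` genes, of which `annotated` carry the annotation."""
    upper = min(module_size, annotated)
    favorable = sum(
        comb(annotated, k) * comb(total - annotated, module_size - k)
        for k in range(overlap, upper + 1)
    )
    return favorable / comb(total, module_size)

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical module of 50 genes, 10 of which carry an annotation held by
# 100 of the 6732 genes analyzed.
p_enriched = hypergeom_p(10, 50, 100, 6732)
q_values = benjamini_hochberg([0.001, 0.04, 0.03, 0.5])
significant = [q < 0.05 for q in q_values]
```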
  • TABLE
    Venous Invasion Module Analysis

    Imaging Trait                        Node Level   Frequency   Module
    Internal Arteries, Density           1            4           595, 720, 651, 773
    Hypodense Halo                       1            2           479, 556
    Tumor-Liver Difference, Minimum      1            1           697
    Tumor Margin Score, Maximum          2            2           720, 773
    Attenuation Heterogeneity, Maximum   2            2           595, 697
    Internal Arteries, Rank              2            1           556
    Internal Septa                       2            1           651
    Tumor Margin Score, Minimum          2            1           479
    Tumor Margin Score, Minimum          3            3           773, 556, 720
    Wash-out, Maximum                    3            1           651
    Necrosis, Density                    3            1           595
    Tumor Margin Score, Maximum          3            1           697
    Attenuation Heterogeneity, Maximum   4            1           651

    The position (node level) of each imaging feature/trait used to construct the decision trees used to predict the 7 venous invasion modules and their frequency of occurrence at this node level are displayed. Internal Arteries, followed by Hypodense Halos, are over-represented in the imaging networks occupying the top node level and frequency and were thus used to construct the venous invasion predictor.
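  • By way of non-limiting illustration, a p-value weighted average of split thresholds can be sketched as follows. The source does not state how the p-values enter the weights; weighting each split by the negative base-10 logarithm of its p-value is an assumption, and the thresholds and p-values shown are hypothetical.

```python
from math import log10

def consensus_threshold(splits):
    """splits: list of (threshold, p_value) for one imaging feature, one
    entry per module tree. Each split is weighted by -log10(p), so more
    significant splits count more (an assumed weighting scheme)."""
    weights = [-log10(p) for _, p in splits]
    weighted_sum = sum(w * t for w, (t, _) in zip(weights, splits))
    return weighted_sum / sum(weights)

# Hypothetical thresholds for one feature taken from three module trees.
splits = [(0.4, 1e-4), (0.6, 1e-2), (0.5, 1e-2)]
consensus = consensus_threshold(splits)
```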
  • Clinical data analysis. Microscopic venous invasion status on histologic analysis was available for 30 patients in the training set and 32 patients in the test set. Within each data set, patients were partitioned into two groups based on the two-feature decision tree (“internal arteries” and “hypodense halos” on CT scan, FIG. 4A). Significance of association between the two imaging-feature groups and histologic venous invasion was calculated using two-by-two contingency tables and the chi-square test. Overall survival data were available for 23 patients in the training set and 32 patients in the test set; only patients with clear surgical margin after HCC resection were used in this analysis. Within each data set, patients were partitioned based on the presence or absence of the “internal arteries” feature on CT scan, and survival analysis by the method of Kaplan and Meier for the two groups of patients was implemented in Winstat (R. Fitch Software, Bad Krozingen, DE).
  • Construction of Association Map. In this example, a three-step strategy was used to create an “association map” between imaging features and gene expression patterns. More particularly, an association map between imaging features on three-phase contrast-enhanced CT and gene expression patterns of 28 human hepatocellular carcinomas (HCC; Chen, X. et al., Mol. Biol. Cell, 13:1929-1939 (2002)) was constructed, as shown in FIG. 1D. In the analysis, 138 distinctive imaging features present in one or more HCCs were defined and quantified. To identify informative features, features were filtered based on their frequency and prominence in the data, inter-observer agreement between two radiologists, and independence from other features as determined by Pearson correlation among the features (cutoff r=0.9). Thirty-two imaging features were judged most promising by these criteria and used for subsequent analysis (FIG. 5). For instance, and with reference to FIGS. 1A-1C, channels of radio-dense signal within certain tumors on the arterial phase of the CT scan were noted, and this feature was termed “internal arteries”.
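  • By way of non-limiting illustration, the two statistics named above can be sketched as follows: a Pearson chi-square test on a two-by-two contingency table (one degree of freedom, no continuity correction, which is an assumption) and a Kaplan-Meier survival estimate. The counts and survival times are hypothetical and are not the data of this example.

```python
from math import erfc, sqrt

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the table [[a, b], [c, d]].
    Returns (statistic, p_value) for one degree of freedom."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    return stat, erfc(sqrt(stat / 2))

def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.
    events[i] is True for a death and False for a censored observation."""
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored subjects leave the risk set
    return curve

# Hypothetical table: rows are imaging-predicted groups, columns are
# histologic venous invasion present / absent.
stat, p_value = chi_square_2x2(12, 3, 4, 11)

# Hypothetical follow-up times (months) for one group of five patients.
curve = kaplan_meier([5, 8, 8, 12, 20], [True, True, False, True, False])
```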
  • Next, a module networks algorithm (Segal, E. et al., Nat. Genet., 34:166-176 (2003)) was adopted to systematically search for associations between expression levels of 6732 well-measured genes determined by microarray analysis (Chen, X. et al., Mol Biol Cell, 13:1929-1939 (2002)) and combinations of imaging features. The algorithm identifies groups of genes, termed modules, which demonstrate coherent variation in expression across multiple samples. The algorithm further applies an iterative Bayesian probabilistic procedure to identify combinations of imaging features that can predict the expression levels of gene modules. An end result is identification of specific networks of imaging features that predict the expression level of gene modules. Each network of imaging features predicts the expression level of one gene module.
  • Next, statistical significance of the association map was validated by comparison with permuted data sets, and also by testing the prediction of the association map in an independent set of tumors.
  • The association map of imaging features and gene expression revealed that a surprisingly large fraction of the gene expression program can be reconstructed from a small number of imaging features, as seen in FIGS. 2A-2B. The expression variation in 6732 genes was captured by 116 gene modules, each of which was associated with specific combinations of imaging features. For each module, presence or absence of combinations of imaging features predicted the aggregate expression level of genes within the module (FIG. 2A). The combinations of relevant imaging features are depicted in decision trees: each split in the tree is specified by variation of an imaging feature; each terminal leaf in the tree is a cluster of samples that share similar expression pattern of module genes. Thus, the association map allowed one to predict the relative expression level of a gene (by mapping to a module) in a given HCC sample (by mapping to a cluster).
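  • By way of non-limiting illustration, the two mappings described above (gene to module, and sample to terminal leaf) can be sketched as follows. Every identifier, threshold, and leaf value below is hypothetical and does not reproduce any module or tree of the association map.

```python
def predict_module_expression(tree, sample):
    """Walk a decision tree of imaging features down to a terminal leaf and
    return the expression level stored there."""
    node = tree
    while "leaf" not in node:
        value = sample[node["feature"]]
        node = node["high"] if value > node["threshold"] else node["low"]
    return node["leaf"]

# Hypothetical association map: each gene maps to one module, and each
# module has one decision tree of imaging features.
gene_to_module = {"GENE_X": "M1", "GENE_Y": "M2"}
module_trees = {
    "M1": {"feature": "feature_a", "threshold": 0.5,
           "low": {"leaf": -0.8},
           "high": {"feature": "feature_b", "threshold": 0.5,
                    "low": {"leaf": 1.2},
                    "high": {"leaf": 0.3}}},
    "M2": {"feature": "feature_c", "threshold": 0.5,
           "low": {"leaf": 0.6},
           "high": {"leaf": -0.9}},
}

def predict_gene(gene, sample):
    """Predicted relative expression of a gene in an imaged sample."""
    return predict_module_expression(module_trees[gene_to_module[gene]], sample)

# Hypothetical imaging feature scores for one tumor.
sample = {"feature_a": 1.0, "feature_b": 0.0, "feature_c": 1.0}
prediction_x = predict_gene("GENE_X", sample)
prediction_y = predict_gene("GENE_Y", sample)
```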
  • The hierarchical combination of only 28 imaging features was sufficient to predict the variation of all 116 gene modules. As shown in FIG. 2B, only nine features were sufficient to predict the expression patterns of 50% of the full complement of gene activities, and the prediction plateaus to above 80% of the full complement of gene activities with more than 23 features. For each gene, the number of features needed to predict its variation was on average three and no more than four in any instance. The association of imaging features and gene expression was highly significant by several independent statistical criteria. Specification of the entire module network involved 355 splits based on imaging features. The average gene expression levels between two sides of each split was significantly different in 299 of 355 splits (p<0.05 after applying the conservative Bonferroni correction), accounting for 5282 of 6732 input genes (78.5%). Comparison of the observed association map of imaging features and gene expression with maps derived from data sets with permuted sample labels confirmed that the predictive power of imaging features for expression patterns was highly unlikely due to chance alone. The log-likelihood was −18 per microarray, compared to only −23±0.1 expected by chance (10 permutations; p<10−50). Thus, the variation in gene expression is densely encoded by a small number of imaging features. Once discovered, such “coding” image features can be quickly used to translate visual images into the underlying gene expression.
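  • By way of non-limiting illustration, the counts reported above can be checked with a few lines of arithmetic. The per-split significance threshold assumes that the Bonferroni correction was applied across all 355 splits, which the source does not state explicitly.

```python
n_splits, n_significant = 355, 299
n_genes, n_covered = 6732, 5282
alpha = 0.05

# Bonferroni: each individual test is held to alpha divided by the number
# of tests (assumed here to be the 355 splits).
per_split_alpha = alpha / n_splits

frac_splits = n_significant / n_splits  # share of splits found significant
frac_genes = n_covered / n_genes        # share of input genes accounted for

print(f"{per_split_alpha:.2e}  {frac_splits:.1%}  {frac_genes:.1%}")
```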
  • Using the association map, imaging features predictive of the expression level of specific genes are directly revealed, and the potential physiologic significance of many imaging features can be inferred from their associated genes. The distribution of genes into modules defined by imaging features was not random, but was highly enriched for specific and diverse biological functions and processes. Comparison of gene membership in modules versus the published Gene Ontology annotation (Ashburner, M. et al., Nat. Genet., 25:25-29 (2000)) revealed significant overlaps, as shown in FIG. 2C, allowing many key physiologic properties of tumors to be gleaned from CT images. For example, three image features predicted the expression level of module 697, which is highly enriched in genes involved in cell proliferation, including PCNA, cyclin A, MCM5, MCM6, and geminin, as shown in FIG. 3A. In addition, the expression level of VEGF, an important driver of tumor angiogenesis and target of the approved chemotherapy drug bevacizumab (Kerr, D. J., Nat. Clin. Pract. Oncol., 1:39-43 (2004)), co-varies with these cell cycle genes and is predicted by the same imaging features, as seen in FIG. 3A.
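One common way to score the overlap between a module and an annotation category is a one-sided hypergeometric test. The text does not say which test was used, so the following Python sketch is an assumption offered for illustration only. The function name and the example counts (other than the 6732 input genes) are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population: int, annotated: int,
                           module_size: int, overlap: int) -> float:
    """P(X >= overlap) when module_size genes are drawn without replacement
    from a population in which `annotated` genes carry the annotation."""
    upper = min(annotated, module_size)
    total = comb(population, module_size)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, module_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical example: 6732 input genes (from the text); the annotation size,
# module size, and overlap below are made up for illustration.
p_value = hypergeom_enrichment_p(population=6732, annotated=200,
                                 module_size=50, overlap=12)
```

Because many modules and many annotation categories are compared, p-values obtained this way would normally also be corrected for multiple testing.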
  • Thus, in one embodiment, the association provides a method for non-invasively delineating a molecularly distinct subset of tumors for a targeted therapeutic strategy. For example, the liver synthetic function of HCC patients is an important guide of disease severity (Thomas, M. B. et al., J. Clin. Oncol., 23:8093-8108 (2005)), and this information is evident in module 595, which details the expression level of albumin, pyruvate kinase, transferrin receptor 2, as well as revealing clotting function (thrombin, factor V, factor X), and detoxification activity (GSTO1, CYP27A1, epoxide hydroxylase), as seen in FIG. 3B.
  • It will also be appreciated that the identity of genes in a module can reveal the physiologic basis of an imaging feature. The imaging feature “Tumor Margin Score, Minimum” denotes tumors that show an ill-defined transition zone between tumor and surrounding liver tissue. It was found that the presence of this feature was associated with elevated expression of a group of genes associated with extracellular matrix remodeling, such as MMP2, MMP7, COL3A1, COL6A2, thrombospondin 1, and thrombospondin 2, as seen in FIG. 3C. Several of these genes, notably MMP2 (Giannelli, G. et al., Int. J. Cancer, 97:425-431 (2002); Qin, L. X. et al., World J. Gastroenterol., 8:385-392 (2002)) and thrombospondin (Qin, L. X. et al., World J. Gastroenterol., 8:385-392 (2002); Poon, R. T. et al., Clin. Cancer Res., 10:4150-4157 (2004)), are known to increase tumor invasiveness into surrounding stroma, which may lead to the poor demarcation of tumor margins on CT imaging.
  • The association map also enables systematic mapping of a predetermined group of genes to their corresponding imaging features. A group of 91 genes whose expression variation is associated with microscopic venous invasion has been identified (Chen, X. et al., Mol. Biol. Cell, 13:1929-1939 (2002)). Microscopic venous invasion is a well-established sign of poor prognosis (Thomas, M. B. et al., J. Clin. Oncol., 23:8093-8108 (2005)) that is extremely difficult to predict using conventional imaging methods in the absence of gross venous invasion. Here, the 91 genes in the “venous invasion signature” were enriched in 7 modules and associated with two predominant imaging features: the presence of “Internal Arteries” and the absence of “Hypodense Halos”, as seen in FIG. 4A and FIG. 5. Therefore, it was evaluated whether this pair of imaging features, as observed during the pre-operative CT scan, predicted the occurrence of microscopic venous invasion on histologic analysis. In 30 patients with HCC, tumors with this combination of imaging features had a twelve-fold increased risk of microscopic venous invasion (p=0.004).
  • The predictive value of the two-feature predictor of venous invasion was validated in an independent set of 32 patients who were not used for training the association map (FIG. 4A, p=0.03). The presence of the feature “Internal Arteries” in the pre-operative CT scan of HCCs was a significant univariate predictor of overall survival in both groups of patients, as seen in FIG. 4B. Thus, the association map can identify novel imaging features corresponding to gene expression signatures and provide useful information to guide clinical decision making.
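The two-feature predictor and the notion of a fold increase in risk can be sketched in Python as follows. The function names are hypothetical, and the counts in the example are toy values chosen for easy arithmetic. They are not the patient data from the study, which are not tabulated in the text.

```python
def predicts_venous_invasion(internal_arteries: bool, hypodense_halos: bool) -> bool:
    """Two-feature rule: 'Internal Arteries' present and 'Hypodense Halos' absent."""
    return internal_arteries and not hypodense_halos


def relative_risk(events_with: int, total_with: int,
                  events_without: int, total_without: int) -> float:
    """Risk in tumors showing the feature pair divided by risk in tumors lacking it."""
    risk_with = events_with / total_with
    risk_without = events_without / total_without
    return risk_with / risk_without


# Toy counts only (NOT the study data): 4 of 5 tumors with the feature pair
# show invasion, versus 1 of 5 tumors without it.
toy_rr = relative_risk(events_with=4, total_with=5,
                       events_without=1, total_without=5)
```

A significance value such as the reported p=0.004 would come from a separate test on the full 2×2 table; the text does not name the test, so none is assumed here.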
  • In summary, the global gene expression profiles of liver cancer are embodied in their imaging features. The systematic association between imaging features and gene expression allowed useful inference in both directions. On one hand, the association map identified biological processes, based on specific gene expression programs, which underlie specific imaging features. On the other hand, the association map enabled the use of imaging features to reconstruct the global gene expression programs of cancer, thereby creating a noninvasive “molecular portrait” of the tumor (FIGS. 3A-3C). The utility of this approach was shown by identifying and validating a two-feature predictor of venous invasion in HCC (FIG. 4). Moreover, the “Internal Arteries” feature that emerged from this analysis was a significant predictor of survival in two independent groups of patients. These results demonstrate that existing imaging technology may be used to reconstruct the molecular anatomy of disease, such as cancer, in a noninvasive fashion. The examples and data set forth herein, using liver cancer as an exemplary disease, illustrate the robustness of the method. Canonical association maps constructed from large representative series of tumors will enable routine noninvasive diagnosis of genetically heterogeneous tumors, reveal their prognosis, and allow serial profiling of tumors during therapy. This type of imaging-based molecular profiling permits personalized medicine.
  • While a number of exemplary aspects and embodiments have been discussed above, those of skill in the art will recognize certain modifications, permutations, additions and sub-combinations thereof. It is therefore intended that the following appended claims and claims hereafter introduced are interpreted to include all such modifications, permutations, additions and sub-combinations as are within their true spirit and scope.

Claims (6)

1.-24. (canceled)
25. A method of assessing the disease state of a tumor comprising:
a. identifying one or more tumor imaging features from images from a plurality of subjects,
b. applying a module networks algorithm to identify relationships between the one or more tumor imaging features and biological data relating to the images;
c. constructing, based on the identified relationships, an association map between the one or more tumor imaging features and the biological data; and
d. determining the diseased state of a tumor by visually inspecting the association map.
26. The method of claim 1 wherein the tumor imaging features are associated with a disease.
27. The method of claim 1, wherein the identifying comprises identifying one or more tumor imaging features based on frequency of the one or more features in the images.
28. The method of claim 1, wherein the identifying comprises identifying one or more tumor imaging features based on its independence from other features.
29. The method of claim 1, wherein said identifying comprises identifying one or more tumor imaging features from images obtained using an imaging technique.
US14/757,779 2006-10-31 2015-12-23 Methods for constructing association maps of imaging data and biological data Abandoned US20160203597A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US14/757,779 US20160203597A1 (en) 2006-10-31 2015-12-23 Methods for constructing association maps of imaging data and biological data

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US85638606P 2006-10-31 2006-10-31
PCT/US2007/022973 WO2008054768A2 (en) 2006-10-31 2007-10-30 Methods for constructing association maps of imaging data and biological data
US44789010A 2010-03-15 2010-03-15
US14/757,779 US20160203597A1 (en) 2006-10-31 2015-12-23 Methods for constructing association maps of imaging data and biological data

Related Parent Applications (2)

Application Number Title Priority Date Filing Date
US12/447,890 Continuation US20100189318A1 (en) 2006-10-31 2007-10-30 Methods for constructing association maps of imaging data and biological data
PCT/US2007/022973 Continuation WO2008054768A2 (en) 2006-10-31 2007-10-30 Methods for constructing association maps of imaging data and biological data

Publications (1)

Publication Number Publication Date
US20160203597A1 true US20160203597A1 (en) 2016-07-14

Family

ID=39344895

Family Applications (3)

Application Number Title Priority Date Filing Date
US12/516,750 Active 2027-12-17 US8179657B2 (en) 2006-10-31 2006-12-13 Emission analyzer
US12/447,890 Abandoned US20100189318A1 (en) 2006-10-31 2007-10-30 Methods for constructing association maps of imaging data and biological data
US14/757,779 Abandoned US20160203597A1 (en) 2006-10-31 2015-12-23 Methods for constructing association maps of imaging data and biological data

Country Status (2)

Country Link
US (3) US8179657B2 (en)
WO (1) WO2008054768A2 (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160260224A1 (en) * 2013-10-28 2016-09-08 London Health Sciences Centre Research Inc. Method and apparatus for analyzing three-dimensional image data of a target region of a subject
US11273283B2 (en) 2017-12-31 2022-03-15 Neuroenhancement Lab, LLC Method and apparatus for neuroenhancement to enhance emotional response
US11364361B2 (en) 2018-04-20 2022-06-21 Neuroenhancement Lab, LLC System and method for inducing sleep by transplanting mental states
US11452839B2 (en) 2018-09-14 2022-09-27 Neuroenhancement Lab, LLC System and method of improving sleep
US11717686B2 (en) 2017-12-04 2023-08-08 Neuroenhancement Lab, LLC Method and apparatus for neuroenhancement to facilitate learning and performance
US11723579B2 (en) 2017-09-19 2023-08-15 Neuroenhancement Lab, LLC Method and apparatus for neuroenhancement
US11786694B2 (en) 2019-05-24 2023-10-17 NeuroLight, Inc. Device, method, and app for facilitating sleep

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009156897A1 (en) * 2008-06-23 2009-12-30 Koninklijke Philips Electronics N.V. Medical system for retrieval of medical information
US8824769B2 (en) 2009-10-16 2014-09-02 General Electric Company Process and system for analyzing the expression of biomarkers in a cell
US8320655B2 (en) 2009-10-16 2012-11-27 General Electric Company Process and system for analyzing the expression of biomarkers in cells
JP5569086B2 (en) * 2010-03-25 2014-08-13 株式会社島津製作所 Optical emission spectrometer
US8732181B2 (en) * 2010-11-04 2014-05-20 Litera Technology Llc Systems and methods for the comparison of annotations within files
US10025782B2 (en) 2013-06-18 2018-07-17 Litera Corporation Systems and methods for multiple document version collaboration and management
US10253744B2 (en) * 2013-11-25 2019-04-09 Nxp Usa, Inc. Flyback switching mode power supply with voltage control and a method thereof
CN104021316B (en) * 2014-06-27 2017-04-05 中国科学院自动化研究所 Based on the method that the matrix decomposition that gene space merges predicts new indication to old medicine
US11468049B2 (en) * 2016-06-19 2022-10-11 Data.World, Inc. Data ingestion to generate layered dataset interrelations to form a system of networked collaborative datasets
JP6730887B2 (en) * 2016-09-02 2020-07-29 株式会社Soken Ignition device
US11356024B2 (en) * 2018-12-06 2022-06-07 Unison Industries, Llc Ignition exciter assembly and method for charging a tank capacitor for an ignition exciter

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020146371A1 (en) * 2000-10-18 2002-10-10 Li King Chuen Methods for development and use of diagnostic and therapeutic agents
US20030049701A1 (en) * 2000-09-29 2003-03-13 Muraca Patrick J. Oncology tissue microarrays

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS60114746A (en) 1983-11-25 1985-06-21 Shimadzu Corp Spark discharge circuit for emission spectrochemical analysis
JPH08184564A (en) 1994-12-28 1996-07-16 Kawasaki Steel Corp Emission spectrochemical analytical method
JP2000009645A (en) 1998-06-26 2000-01-14 Horiba Ltd Emission spectroscopic analyzer
JP2003052173A (en) 2001-08-06 2003-02-21 Canon Inc Flyback-type voltage step-up circuit of capacitor
JP3893290B2 (en) * 2002-01-09 2007-03-14 キヤノン株式会社 Capacitor charger and camera strobe charger
US20040086873A1 (en) * 2002-10-31 2004-05-06 Johnson Peter C. System and method of generating and storing correlated hyperquantified tissue structure and biomolecular expression datasets
US6836159B2 (en) * 2003-03-06 2004-12-28 General Electric Company Integrated high-voltage switching circuit for ultrasound transducer array
JP3708529B2 (en) * 2003-03-18 2005-10-19 Smk株式会社 Constant voltage output control method and constant voltage output control device for switching power supply circuit
JP2004333323A (en) 2003-05-08 2004-11-25 Shimadzu Corp Emission spectrophotometer
JP3733966B2 (en) 2004-01-08 2006-01-11 Jfeスチール株式会社 Emission spectroscopic method
US20060269476A1 (en) * 2005-05-31 2006-11-30 Kuo Michael D Method for integrating large scale biological data with imaging

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Segal et al, Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data, June 2003, Nature Genetics Vol. 34 No. 2, pp. 166-175 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160260224A1 (en) * 2013-10-28 2016-09-08 London Health Sciences Centre Research Inc. Method and apparatus for analyzing three-dimensional image data of a target region of a subject
US10198825B2 (en) * 2013-10-28 2019-02-05 London Health Sciences Centre Research Inc. Method and apparatus for analyzing three-dimensional image data of a target region of a subject
US11723579B2 (en) 2017-09-19 2023-08-15 Neuroenhancement Lab, LLC Method and apparatus for neuroenhancement
US11717686B2 (en) 2017-12-04 2023-08-08 Neuroenhancement Lab, LLC Method and apparatus for neuroenhancement to facilitate learning and performance
US11273283B2 (en) 2017-12-31 2022-03-15 Neuroenhancement Lab, LLC Method and apparatus for neuroenhancement to enhance emotional response
US11318277B2 (en) 2017-12-31 2022-05-03 Neuroenhancement Lab, LLC Method and apparatus for neuroenhancement to enhance emotional response
US11478603B2 (en) 2017-12-31 2022-10-25 Neuroenhancement Lab, LLC Method and apparatus for neuroenhancement to enhance emotional response
US11364361B2 (en) 2018-04-20 2022-06-21 Neuroenhancement Lab, LLC System and method for inducing sleep by transplanting mental states
US11452839B2 (en) 2018-09-14 2022-09-27 Neuroenhancement Lab, LLC System and method of improving sleep
US11786694B2 (en) 2019-05-24 2023-10-17 NeuroLight, Inc. Device, method, and app for facilitating sleep

Also Published As

Publication number Publication date
WO2008054768A9 (en) 2008-07-17
US20100189318A1 (en) 2010-07-29
WO2008054768A3 (en) 2008-08-28
WO2008054768A2 (en) 2008-05-08
US8179657B2 (en) 2012-05-15
US20100073844A1 (en) 2010-03-25

Similar Documents

Publication Publication Date Title
US20160203597A1 (en) Methods for constructing association maps of imaging data and biological data
Yu et al. Effect of adjuvant paclitaxel and carboplatin on survival in women with triple-negative breast cancer: a phase 3 randomized clinical trial
Zhang et al. Computed tomography-based radiomics model for discriminating the risk stratification of gastrointestinal stromal tumors
Rutman et al. Radiogenomics: creating a link between molecular diagnostics and diagnostic imaging
US20120015843A1 (en) Gene and gene expressed protein targets depicting biomarker patterns and signature sets by tumor type
Simon Genomic biomarkers in predictive medicine. An interim analysis
Zhang et al. Early response evaluation using primary tumor and nodal imaging features to predict progression-free survival of locally advanced non-small cell lung cancer
Schiller et al. PSMA-PET/CT–based lymph node atlas for prostate cancer patients recurring after primary treatment: Clinical implications for salvage radiation therapy
Abrahams et al. The history of personalized medicine
Gao et al. Comparison of prognostic indices in NSCLC patients with brain metastases after radiosurgery
Van Laar An online gene expression assay for determining adjuvant therapy eligibility in patients with stage 2 or 3 colon cancer
Beukinga et al. Addition of HER2 and CD44 to 18 F-FDG PET–based clinico-radiomic models enhances prediction of neoadjuvant chemoradiotherapy response in esophageal cancer
Grigoroiu et al. Gene-expression profiling in non-small cell lung cancer with invasion of mediastinal lymph nodes for prognosis evaluation
Yin et al. Research trends of artificial intelligence in pancreatic cancer: a bibliometric analysis
Caputo et al. Comprehensive genome profiling by next generation sequencing of circulating tumor DNA in solid tumors: a single academic institution experience
Huang et al. Lactate dehydrogenase kinetics predict chemotherapy response in recurrent metastatic nasopharyngeal carcinoma
Choi et al. Correlation between 18 F-fluorodeoxyglucose uptake and epidermal growth factor receptor mutations in advanced lung cancer
Jain et al. Predictive genomic tools in disease stratification and targeted prevention: a recent update in personalized therapy advancements
Ji et al. Stage-specific PET radiomic prediction model for the histological subtype classification of non-small-cell lung cancer
Mokbel et al. The Impact of EndoPredict Clinical Score on Chemotherapy Recommendations in Women with Invasive ER+/HER2− Breast Cancer Stratified as Having Moderate or Poor Prognosis by Nottingham Prognostic Index
Duan et al. Detection and independent validation of model-based quantitative transcriptional regulation relationships altered in lung cancers
Eun et al. Identification of novel biomarkers for prediction of neurological prognosis following cardiac arrest
Hartmann et al. Imaging genomics: data fusion in uncovering disease heritability
Song et al. A 18FDG PET/CT-based volume parameter is a predictor of overall survival in patients with local advanced gastric cancer
Hobbs et al. Prognostic/predictive markers in systemic therapy resistance and metastasis in breast cancer

Legal Events

Date Code Title Description
STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION