WO2019036043A2

WO2019036043A2 - A method to generate a cocktail of personalized cancer vaccines from tumor-derived genetic alterations for the treatment of cancer

Info

Publication number: WO2019036043A2
Application number: PCT/US2018/000334
Authority: WO
Inventors: Amitabha Chaudhuri; Papia CHAKRABORTY; Ravi Gupta; Priyanka SHAH; Vasumathi KODE; Sreedhar SANTHOSH; Kayla Renee LEE; Xiaoshan SHI, (shirley); Malini MANOHARAN; Nitin MANDLOI; Rohit Gupta
Original assignee: Medgenome Inc.
Priority date: 2017-08-16
Filing date: 2018-08-16
Publication date: 2019-02-21
Also published as: WO2019036043A3

Abstract

The invention provides methods of selecting cancer vaccines from genetically altered proteins expressed by mammalian cancer cells and tissues.

Description

A Method To Generate A Cocktail Of Personalized Cancer Vaccines From Tumor-Derived Genetic Alterations For The Treatment Of Cancer

Background of Invention

The use of immune checkpoint inhibitors to treat cancer patients has reached a new milestone through their ability to produce long-term survival in a subset of treated patients. However, a large proportion of cancer patients fail to respond to immune checkpoint inhibitor therapy, and of those who respond to the therapy, only 8-10% experience survival beyond 10 years. The lack of long-term response could be overcome by combining cancer vaccines with immune checkpoint inhibitors. The cancer vaccines will induce the expansion of tumor antigen-specific T cells and the checkpoint inhibitors will prevent these T cells from becoming dysfunctional. Cancer vaccines are derived from tumor-specific immunogenic peptides that are produced by intracellular proteolytic processing of mutated proteins. These tumor-derived mutated peptides are presented on the surface of antigen presenting cells in complex with class I or class II HLA proteins to generate tumor antigen-specific T cells. These T cells recognize and eliminate tumor cells presenting the tumor antigens.

Tumors accumulate a large number of somatic mutations during cancer development, and only a small subset of these are recognized by T cells; in other words, only a small subset are immunogenic. Therefore, identifying the immunogenic peptides that will engage T cells productively to generate an antitumor T cell response requires accurate modeling of the steps involved: expression of the mutated gene, generation of the peptide by intracellular processing, entry of the peptide into the endoplasmic reticulum through specific transporters, binding to HLA, presentation of the HLA-bound peptide on the cell surface, binding to the TCR, and the effect of the peptide on the clonal amplification of T cells and their functional phenotype.

Current pipelines select immunogenic peptides by taking into consideration the expression of the mutated genes, the likelihood that the mutated proteins will be processed to generate the peptides of interest, and their entry into the endoplasmic compartment, where they will bind to HLA and be presented on the cell surface. However, these steps only predict whether a peptide will be presented on the surface of the cells, which is required but not sufficient for TCR binding. Binding of the HLA-bound peptide to T cells will result in T cell activation, and the peptide will then be considered immunogenic. Therefore, current methods of selecting peptides to formulate a cancer vaccine cocktail will include a mixture of immunogenic and non-immunogenic peptides, which will decrease the efficacy of the vaccine and reduce its impact on the disease. To improve the predictive power of our neoepitope prioritization pipeline and to circumvent the selection biases inherent in using HLA binding as a proxy for immunogenicity prediction, we created the cancer vaccine prediction tool OncoPeptVAC, which includes an additional analytical step for predicting whether an HLA-bound peptide will engage the T cell receptor (TCR). The TCR-binding algorithm alone predicts with high accuracy whether an HLA-peptide complex will bind TCR; however, it does not assess whether the binding of the peptide to the TCR will result in an anti-tumor response. The tumor-killing property of a T cell requires that it display an activated phenotype characterized by the production of proinflammatory cytokines. To assess whether the predicted immunogenic peptide induces tumor-killing properties in T cells, we have performed an ex vivo T cell activation assay (Wolfl and Greenberg 2014) with TCR repertoire analysis and single cell transcriptomics of T cells. The collective data are then fed to an algorithm to create a rank-ordered list of peptides that will become part of the vaccine cocktail.

Summary of the Invention

The invention relates to the discovery of highly specific and sensitive methods for predicting, identifying and/or validating immunogenic peptides for use in, e.g., cancer therapy. Further provided are compositions comprising the immunogenic peptides of the invention, kits, formulations containing the compositions/immunogenic peptides of the invention, and therapeutic methods for treating disease.

In one aspect, the invention provides methods for selecting a cancer vaccine from genetically altered protein(s) expressed by a mammalian cancer cell and/or tissue. In one embodiment, the method comprises identifying neo-epitopes in mutant cancer peptides from the genetically altered protein(s) which is from the mammalian cancer cell and/or tissue. The method further comprises calculating probability of TCR binding of the neo-epitope(s) of (a) to generate a T-cell response, thereby identifying a T-cell activating neo-epitope(s) from the genetically altered protein. Additionally, the method comprises selecting one or more mutant cancer peptide(s) so identified above having the highest probability or a probability above a threshold setting that can modulate the immune response of a mammal when challenged with the mutant cancer peptide(s), thereby selecting a cancer vaccine; wherein the cancer vaccine comprises one or more mutant cancer peptides derived from the genetically altered protein(s) and wherein the mammalian subject expresses the genetically altered protein(s) and expresses an HLA or MHC molecule that binds the mutant cancer peptide(s).
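The selection step in this aspect can be pictured with a short Python sketch. It is only an illustration under stated assumptions: the `Candidate` record, the peptide identifiers, the example probabilities and the default threshold of 0.5 are hypothetical, and the text above leaves the actual threshold setting open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    peptide_id: str         # hypothetical identifier for a mutant cancer peptide
    tcr_probability: float  # predicted probability of TCR binding, in [0, 1]


def select_candidates(candidates, threshold=0.5, top_n=None):
    """Keep peptides whose TCR-binding probability exceeds the threshold.

    Results are ordered from highest to lowest probability; if top_n is
    given, only the top_n highest-probability peptides are returned.
    The threshold value is an assumption made for this sketch.
    """
    kept = sorted(
        (c for c in candidates if c.tcr_probability > threshold),
        key=lambda c: c.tcr_probability,
        reverse=True,
    )
    return kept if top_n is None else kept[:top_n]


# Toy input: identifiers and probabilities are invented for illustration.
pool = [
    Candidate("pep_A", 0.82),
    Candidate("pep_B", 0.31),
    Candidate("pep_C", 0.67),
]
selected = select_candidates(pool, threshold=0.5)
```

Passing `top_n=1` corresponds to the "highest probability" alternative, while leaving it unset corresponds to the "probability above a threshold setting" alternative.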

In another aspect, the invention also provides methods for identification of a T-cell epitope (neoepitope) for cancer immunotherapy. In one embodiment, the method comprises obtaining peripheral blood mononuclear cells (PBMCs) from a subject. Then, CD14+ CD16+ monocytes are isolated from PBMCs. Additionally, the CD14+ CD16+ monocytes are contacted with a DC maturation cytokine cocktail comprising GM-CSF, IL4 and IFN so as to differentiate the CD14+ CD16+ monocytes to dendritic cells (DCs). Furthermore, naive CD8+ T cells are isolated from PBMCs. After that, the DCs are contacted with a peptide from a protein overexpressed in a cancer cell or a genetically altered protein described by any of the methods above. Moreover, the DCs described above are co-cultured with the isolated naive CD8+ T cells described above in a culture medium comprising the DC maturation cytokine cocktail. The method further comprises supplementing the medium described with a second cytokine cocktail, and contacting the co-culture described above with additional peptide-pulsed autologous PBMCs or DCs so as to re-stimulate the T cells. Optionally, the method comprises treating the cultured cells with an inhibitor of cellular transport prior to analysis of marker(s) of activated CD8+ T cells, and quantifying the amount of marker(s) of activated CD8+ T cells, wherein presence of the marker(s) of activated CD8+ T cells above the control level obtained from a co-culture with no peptide challenge or challenge with a peptide known not to stimulate naive CD8+ T cells indicates T-cell recognition of the peptide presented by antigen presenting cells as a T-cell epitope.

The invention further provides methods for identifying CD8+ T cell clones for adoptive T cell therapy for a subject. In an embodiment of the invention, the method comprises identifying an immunogenic peptide derived from an overexpressed or genetically altered protein from the subject in need by the method described above. The method then comprises contacting the immunogenic peptide identified above with isolated antigen presenting cells or dendritic cells from the subject in need or from an allogenic subject. Then, the cells obtained above are co-cultured with isolated naive CD8+ T cells from the subject in need or from an allogenic subject. Moreover, the method comprises detecting presence of marker(s) for activated CD8+ T cells. The method further comprises culturing activated CD8+ T cells so as to obtain a clonal population using a CD3/CD28 stimulus or an allogeneic stimulus using irradiated or mitomycin treated PBMC or lymphoblastic cells.

The invention also provides methods for identifying a T cell receptor (TCR) recognizing an immunogenic peptide for therapeutic use by engineering T cells against cancer. The method comprises identifying an immunogenic peptide derived from an overexpressed or genetically altered protein from a cancer cell by the method described above. The method then comprises contacting the immunogenic peptide identified above with isolated antigen presenting cells or dendritic cells from an autologous subject or from an allogenic subject. Moreover, the cells obtained above are co-cultured with isolated naive CD8+ T cells from the autologous subject or from an allogenic subject, so as to activate the CD8+ T cells. The method further comprises expanding clonal populations of T cells using a CD3/CD28 stimulus or an allogeneic stimulus using irradiated or mitomycin treated PBMC or lymphoblastic cells. Furthermore, the method comprises determining the nucleic acid or protein sequence of the T cell receptor from the activated CD8+ T cells, thereby identifying the T cell receptor (TCR) recognizing an immunogenic peptide for therapeutic use by engineering T cells against cancer.

Additionally, the invention provides methods of selecting neoepitopes from genetically altered proteins expressed by human cancer cells and/or tissues. In one embodiment, the method comprises calculating the probability of HLA binding with optimal processing sites from a library of mutant cancer peptides. Additionally, the method comprises calculating the probability of TCR binding to generate a T-cell response. The method then comprises selecting the mutant cancer peptides having the highest probability or a probability above a threshold setting so calculated from above that can modulate the immune response of a human, when challenged with the mutant cancer peptide; wherein each selected mutant cancer peptide serves as or comprises a neoepitope.

Further, in one aspect, the invention provides methods of selecting a cancer vaccine comprising one or more validated immunogenic peptides to treat a tumor in a subject. In one embodiment, the method comprises obtaining a tumor sample from the subject. The method additionally comprises identifying one or more mutations in expressed genetic material and/or one or more alterations in level of expressed genetic material associated with the tumor. Then, the method comprises predicting immunogenicity of said mutations and/or alteration in level of expressed genetic material associated with the tumor comprising a TCR-binding algorithm. The TCR-binding algorithm comprises peptide(s) of a pre-defined length comprising one or more mutations and/or one or more alterations in level of expressed genetic material associated with the tumor, and selecting and matching features associated with an amino acid at each position of the peptide with selected predefined features for each position of peptides recognized by TCR associated with either CD8+ T-cell or CD4+ T-cell, so as to obtain predictive ability of the peptide(s) to interact with the TCR. Moreover, the method comprises validating predicted immunogenic peptide(s) obtained above in a CD4⁺ and/or CD8⁺ T-cell activation assay, so as to ensure ability of the peptide(s) to activate CD4⁺ and/or CD8⁺ T-cell. Furthermore, the method comprises selecting validated immunogenic peptide(s) that elicit a specific T-cell response. The specific T-cell response comprises monoclonal or polyclonal expansion of T cells. The T-cell response also comprises expression of CD4+ T helper cell markers and/or CD8+ T cell cytolytic markers. The T-cell response further comprises sustainability of active T cells.

In one aspect, the method may also comprise use of an algorithm based on positive prediction of the validated immunogenic peptide to be bound by TCR, HLA or MHC binding affinity of the validated immunogenic peptide, quality of proteasomal processing of the validated immunogenic peptide derived from mutant protein, quality of TAP transporter binding of the validated immunogenic peptide derived from mutant protein, positive result in a T cell activation assay, magnitude of T cell activation, monoclonal and polyclonal T-cell amplification response, functional competence of T cells by expression of T-helper markers or CTL markers, lack of anergic and/or exhaustion markers for T cells, and/or a combination thereof.

In yet a further embodiment of the invention, the method comprises use of an algorithm comprising: frequency of occurrence of mutant allele for one or more genetically altered protein associated with the tumor in a population; HLA or MHC binding affinity of the validated immunogenic peptide; quality of proteasomal processing of the validated immunogenic peptide derived from mutant protein; quality of TAP transporter binding of the validated immunogenic peptide derived from mutant protein; magnitude of T cell activation; monoclonal and polyclonal T-cell amplification response; functional competence of T cells by expression of T-helper markers or CTL markers; and/or a combination thereof.
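One way to picture how the listed factors could be combined into a rank order is sketched below in Python. This is not the algorithm of the invention: the text does not state how the factors are weighted, so the equal weights, the assumption that every factor is pre-scaled to the range 0 to 1, the feature names and the peptide identifiers are all hypothetical.

```python
# Factor names paraphrase the list above; each is assumed to be pre-scaled to [0, 1].
FEATURES = (
    "allele_frequency",
    "hla_binding",
    "proteasomal_processing",
    "tap_binding",
    "activation_magnitude",
    "clonal_amplification",
    "functional_competence",
)


def composite_score(scores, weights=None):
    """Weighted mean of the per-factor scores (equal weights by default)."""
    if weights is None:
        weights = {name: 1.0 for name in FEATURES}
    total = sum(weights[name] for name in FEATURES)
    return sum(weights[name] * scores[name] for name in FEATURES) / total


def rank_peptides(table, weights=None):
    """Return peptide identifiers ordered from best to worst composite score."""
    return sorted(
        table,
        key=lambda pid: composite_score(table[pid], weights),
        reverse=True,
    )


# Toy table: identifiers and values are invented for illustration.
table = {
    "pep_A": {name: 0.8 for name in FEATURES},
    "pep_B": {name: 0.4 for name in FEATURES},
}
ranking = rank_peptides(table)
```

Supplying a `weights` dictionary lets a single factor, for example the magnitude of T cell activation, dominate the ordering without changing the rest of the sketch.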

The invention further provides methods for obtaining a minimal gene expression signature associated with a specific immune cell type and/or subtype that distinguishes the specific immune cell type and/or subtype from other immune cell types and/or subtypes. In one embodiment, the method comprises: (a) obtaining a plurality of samples from a plurality of subjects (one or more sample from one or more subject); (b) determining gene expression of the specific immune cell type and/or subtype from the samples; (c) determining gene expression of other immune cell types and/or subtypes from the samples; (d) comparing the gene expression of (b) with (c) so as to identify for each immune cell type and/or subtype, the highest gene expression within each immune cell type and/or subtype but having greatest variance in gene expression between different immune cell types and/or subtypes; (e) selecting genes so identified in (d) with low plasticity of expression so as to reflect consistent gene expression or lowest variance in gene expression within each immune cell type and/or subtype; (f) validating utility of the selected genes from (e) for ability to discriminate cognate immune cell type and/or subtype from non-cognate immune cell type, and validating gene expression signature as a minimal gene expression signature consisting of a minimal set of genes with greatest difference in differentiating cognate from non-cognate immune cell type and/or subtypes; and (g) optionally, changing composition of the selected genes in (f) following discovery of an improved smaller subset of selected genes selected from (f) during validation in (f).
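Steps (d) and (e) above can be illustrated with a small Python sketch. It is a simplified stand-in rather than the claimed procedure: the fold-change cutoff, the coefficient-of-variation cutoff, the signature size, and the gene and cell-type names are assumptions chosen for the example, and validation steps (f) and (g) are not shown.

```python
from statistics import mean, pstdev


def select_signature(expr, target, min_fold=2.0, max_cv=0.25, size=2):
    """Pick signature genes for one cell type.

    expr maps cell type -> gene -> list of expression values across samples.
    A gene is kept when its mean expression in the target cell type is at
    least min_fold times its highest mean in any other cell type (large
    between-type difference) and its coefficient of variation within the
    target type is at most max_cv (consistent, low-plasticity expression).
    All cutoffs are assumptions for this sketch.
    """
    picks = []
    for gene, values in expr[target].items():
        target_mean = mean(values)
        if target_mean <= 0:
            continue
        other_max = max(mean(expr[ct][gene]) for ct in expr if ct != target)
        cv = pstdev(values) / target_mean
        if target_mean >= min_fold * other_max and cv <= max_cv:
            picks.append((target_mean - other_max, gene))
    picks.sort(reverse=True)  # largest separation first
    return [gene for _, gene in picks[:size]]


# Toy expression values, invented for illustration.
expr = {
    "cell_type_1": {
        "gene_1": [100.0, 110.0, 90.0],   # high and consistent: kept
        "gene_2": [500.0, 510.0, 490.0],  # equally high elsewhere: dropped
        "gene_3": [80.0, 10.0, 150.0],    # high but variable: dropped
    },
    "cell_type_2": {
        "gene_1": [2.0, 3.0, 1.0],
        "gene_2": [480.0, 520.0, 500.0],
        "gene_3": [5.0, 4.0, 6.0],
    },
}
signature = select_signature(expr, "cell_type_1")
```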

The invention provides a method for identifying a cancer patient most likely to be responsive to immune checkpoint inhibitor therapy. In one embodiment, the method comprises obtaining a tumor sample from the cancer patient. The method comprises determining gene expression for a set of genes of the isolated tumor sample. Additionally, the method comprises applying a minimal gene expression signature associated with CD8+ T-cell so as to determine a threshold presence of CD8+ T-cell. The method comprises determining functional state of the CD8+ T-cell by analyzing one or more markers associated with anergy and exhaustion of CD8+ T-cell, wherein the marker is selected from the group consisting of CTLA-4, LAG3 and TIM3 or a combination thereof. The method additionally comprises finding presence or upregulation of CTLA-4, LAG3 and/or TIM3 being indicative of anergic and exhausted CD8+ T-cell and a tumor infiltrated by dysfunctional CD8+ T-cell which is responsive to immune checkpoint blockade.
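The decision logic of this embodiment can be written as a short Python sketch. The numeric thresholds and the shape of the inputs are assumptions made for illustration; the text does not specify how the CD8+ presence threshold or marker upregulation is quantified.

```python
EXHAUSTION_MARKERS = ("CTLA-4", "LAG3", "TIM3")


def likely_responder(cd8_score, marker_expression, cd8_threshold, marker_cutoff):
    """Flag a tumor as a candidate for immune checkpoint blockade.

    The tumor must first show CD8+ T-cell presence at or above the threshold
    (cd8_score is the signature-derived score); it is then flagged when at
    least one anergy/exhaustion marker is expressed above marker_cutoff.
    Both cutoffs are hypothetical parameters of this sketch.
    """
    if cd8_score < cd8_threshold:
        return False
    return any(
        marker_expression.get(marker, 0.0) > marker_cutoff
        for marker in EXHAUSTION_MARKERS
    )


# Toy inputs, invented for illustration.
infiltrated_exhausted = likely_responder(
    cd8_score=0.9,
    marker_expression={"CTLA-4": 5.2, "LAG3": 0.4, "TIM3": 0.1},
    cd8_threshold=0.5,
    marker_cutoff=1.0,
)
```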

The invention also provides a method for identifying immunogenic features of a tumor microenvironment. In one embodiment, the method comprises: (a) obtaining a tumor tissue sample from a subject; (b) determining gene expression of the isolated tumor tissue so as to obtain gene expression data; (c) deconvolving gene expression data of (b) by applying gene expression signatures associated with specific immune cell types and/or subtypes, so as to obtain immune scores for the immune cell types and/or subtypes with gene expression signatures used in deconvolving gene expression data; (d) optionally, determining one or more functional marker of immune cells so as to assess functional status of immune cell infiltrate; and (e) comparing the immune score for each specific immune cell type and/or subtype with the immune score for other immune cell types and/or subtypes, and optionally, functional status of immune cells, so as to identify specific immune cell types and/or subtypes as immune infiltrates enriched or deficient in the tumor tissue, and optionally, functional status of the specific immune cell types and/or subtypes of immune cell infiltrate.
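Steps (c) and (e) can be sketched in Python as follows. The scoring rule used here, the mean expression of each signature's genes, and the comparison against the average score across cell types are simplifying assumptions standing in for the deconvolution step; gene names, cell-type names and values are hypothetical.

```python
from statistics import mean


def immune_scores(expression, signatures):
    """Score each cell type as the mean expression of its signature genes.

    expression maps gene -> expression value in the tumor sample;
    signatures maps cell type -> list of signature genes.
    Genes missing from the sample count as zero.
    """
    return {
        cell: mean(expression.get(gene, 0.0) for gene in genes)
        for cell, genes in signatures.items()
    }


def classify_infiltrate(scores):
    """Label each cell type relative to the average score of all cell types."""
    baseline = mean(scores.values())
    return {
        cell: "enriched" if score > baseline else "deficient"
        for cell, score in scores.items()
    }


# Toy sample and signatures, invented for illustration.
expression = {"g1": 8.0, "g2": 6.0, "g3": 1.0, "g4": 3.0}
signatures = {"cell_A": ["g1", "g2"], "cell_B": ["g3", "g4"]}
scores = immune_scores(expression, signatures)
labels = classify_infiltrate(scores)
```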

The invention provides a method for assessing prognosis of a subject afflicted with a tumor or cancer and predicting response to a cancer drug by the subject. In one embodiment, the method comprises: (a) identifying a subject afflicted by a particular type or subtype of tumor; (b) obtaining a tumor sample from the subject; (c) identifying immunogenic features of a tumor microenvironment in a tumor sample from the subject; (d) comparing the immunogenic features so obtained with the immunogenic features associated with a good or favourable tumor or cancer prognosis and/or associated with a bad or unfavourable tumor or cancer prognosis, so as to assess prognosis of a subject afflicted with a tumor or cancer; and (e) comparing the immunogenic features so obtained with the immunogenic features associated with a good or favourable response to a cancer drug and/or associated with a bad or unfavourable response to a cancer drug, so as to predict response to a cancer drug by the subject.

Brief Description of the Figures

Figure 1a-f. Steps to identify and prioritize cancer vaccine candidates from tissue samples.

Figure 2. HLA-binding features in a peptide fail to discriminate immunogenic from non-immunogenic peptides. (a) The distribution of HLA-binding affinity of immunogenic and non-immunogenic 9-mer peptides. The HLA binding score was generated by NetMHCcons. The 500nM binding score is shown in black. A score <=500nM is widely used to define immunogenic peptides. (b) The sensitivity and specificity of the method if <=500nM is used as a cut-off to define immunogenic peptides. More than 70% of the non-immunogenic peptides have binding affinity <=500nM and 25% of the immunogenic peptides have binding affinity >500nM. (c) Distribution of hydrophobicity score at each position of the immunogenic and non-immunogenic 9-mer peptides. At positions #1 and #7 the hydrophobicity score is higher (p-value < 0.01) for the immunogenic peptides as compared to non-immunogenic peptides. (d) Overall hydrophobicity score of the 9-mer immunogenic and non-immunogenic peptides. The immunogenic peptides show a higher overall hydrophobicity score (p-value = 0.022) as compared to non-immunogenic peptides. (e-f) The enrichment of amino acids at different positions in the immunogenic and non-immunogenic peptides. A similar pattern of amino acids is seen in the two groups of peptides.
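The cut-off evaluation described for panel (b) can be reproduced in miniature with the Python sketch below. The affinity values are toy numbers chosen only to mirror the proportions quoted in the caption (25% of immunogenic peptides above 500nM, 70% of non-immunogenic peptides at or below 500nM); they are not data from the figure.

```python
def sensitivity_specificity(affinities_nm, is_immunogenic, cutoff_nm=500.0):
    """Evaluate 'binding affinity <= cutoff' as a predictor of immunogenicity.

    Returns (sensitivity, specificity): the fraction of immunogenic peptides
    called immunogenic, and the fraction of non-immunogenic peptides called
    non-immunogenic.
    """
    tp = fn = tn = fp = 0
    for affinity, label in zip(affinities_nm, is_immunogenic):
        predicted = affinity <= cutoff_nm
        if label and predicted:
            tp += 1
        elif label:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Toy data: 4 immunogenic peptides (1 above 500 nM) followed by
# 10 non-immunogenic peptides (7 at or below 500 nM).
affinities = [50, 200, 400, 900] + [20, 60, 100, 150, 250, 300, 450, 800, 1200, 5000]
labels = [True] * 4 + [False] * 10
sensitivity, specificity = sensitivity_specificity(affinities, labels)
```

With these proportions the cut-off recovers three quarters of the immunogenic peptides but rejects only about 30% of the non-immunogenic ones, which is the weakness the caption points to.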

Figure 3. Schematic of the workflow for feature construction and selection of TCR-binding peptides

Figure 4a-c. Schematic of the workflow for feature construction and selection of TCR-binding peptides. (a) The filtering process to select HLA-binding 9-mer peptides from the IEDB database for developing the IPepPredicT program. The ambiguous peptides (reported as both immunogenic and non-immunogenic in different assays) were removed. Immunogenic peptides were selected by their ability to activate CD8 T cells in a biological assay and by the availability of HLA 4-digit information for all the selected immunogenic peptides. (b) Immunogenic and non-immunogenic peptides restricted to HLA for which 4-digit information is available in IEDB. Of the 1,170 peptides that passed the criteria given in (a), 423 peptides are restricted to HLA-A*02:01 and were used for model building. (c) A schematic of the methodology used for the development of the immunogenic peptide prediction program. Physicochemical (AAIndex, peplib), peptide processing and HLA-binding properties of the peptides were used, which generated 12,093 features for each peptide. The dataset was subsampled and 500 training instances were generated with a balanced number (~100 in number) of immunogenic and non-immunogenic peptides. A feature reduction step was performed to reduce the total number of features and avoid overtraining by discarding correlated features. A decision tree-based classifier was used on the reduced features and the predictions from all classifiers were aggregated to generate an ensemble voting score for each peptide. Peptides with score > 0.5 were labeled as immunogenic.

Fig. 5a-g. Performance evaluation of the classifier. (a) Sensitivity and specificity distribution of 500 classifiers before and after feature selection. The median sensitivity and specificity of the classifiers are 0.596 and 0.620 respectively. (b) Prediction score of the ensemble of 500 classifiers without feature selection. (c) The ROC curve obtained from the ensemble classifier generated from 500 models. The ensemble classifier result is slightly better than random prediction. (d) Sensitivity and specificity distribution of 433 classifiers with feature reduction. There are two groups of classifiers. The group1 classifiers behave like the classifiers with no feature reduction, with median sensitivity and specificity of 0.55 and 0.60 respectively. The group2 classifiers have higher sensitivity (0.65) and specificity (0.85). (e) Prediction score of the ensemble of 433 classifiers with feature selection. (f) Prediction score of the ensemble of the 45 best performing classifiers (group2) with feature selection. (g) The ROC curves for both ensemble classifiers are shown.
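The subsample, train and vote scheme described for Figure 4c can be illustrated with a compact, standard-library Python sketch. It is a deliberate simplification and not the IPepPredicT program: each "classifier" here is a one-level decision tree (a stump), the feature reduction step is omitted, and the toy feature rows, ensemble size and sample sizes are invented for the example.

```python
import random


def train_stump(rows, labels):
    """Fit a one-level decision tree: the single feature/threshold split
    (in either direction) with the fewest training errors."""
    best = None
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            for sign in (1, -1):
                errors = sum(
                    (sign * (r[j] - t) >= 0) != bool(y)
                    for r, y in zip(rows, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, j, t, sign)
    _, j, t, sign = best
    return lambda r: sign * (r[j] - t) >= 0


def balanced_sample(rows, labels, per_class, rng):
    """Draw the same number of immunogenic and non-immunogenic examples."""
    pos = [i for i, y in enumerate(labels) if y]
    neg = [i for i, y in enumerate(labels) if not y]
    idx = rng.sample(pos, per_class) + rng.sample(neg, per_class)
    return [rows[i] for i in idx], [labels[i] for i in idx]


def train_ensemble(rows, labels, n_models, per_class, seed=0):
    """Train one stump per balanced subsample."""
    rng = random.Random(seed)
    return [
        train_stump(*balanced_sample(rows, labels, per_class, rng))
        for _ in range(n_models)
    ]


def vote_score(ensemble, row):
    """Fraction of classifiers voting 'immunogenic'; > 0.5 means immunogenic."""
    return sum(model(row) for model in ensemble) / len(ensemble)


# Toy data: feature 0 separates the classes, feature 1 is uninformative.
rows = [(float(i), 1.0) for i in range(10)]
labels = [i >= 5 for i in range(10)]
ensemble = train_ensemble(rows, labels, n_models=25, per_class=3)
```

A peptide would be labeled immunogenic when `vote_score(ensemble, row) > 0.5`, matching the score cut-off stated in the caption.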

Fig. 6a-j. Selected peptide features and HLA-peptide-TCR complex crystal structure analysis. (a) Heatmap showing the selected features identified by the superior-performing classifiers. The most frequent feature types include helix/turn, hydrophobicity, and non-bonding interactions. Position-specific enrichment of different features is detected. Among the nine residues in the peptide, residues at positions 2, 6 and 8 are most important. Detection of position-specific enrichment of individual feature types: (b) hydrogen bonding interaction frequency between the peptide and MHC class I chain, (c) hydrogen bonding interaction frequency between the peptide and TCR alpha chain, (d) hydrogen bonding interaction frequency between the peptide and TCR beta chain, (e) non-bonding interactions (mainly hydrophobic) of the peptide with MHC class I chain, (f) non-bonding interactions (mainly hydrophobic) of the peptide with TCR alpha chain, (g) non-bonding interactions (mainly hydrophobic) of the peptide with TCR beta chain, (h) Ramachandran plot for each position of 9-mer peptides presented by HLA-A02:01 alleles using available crystallographic TCR-pMHC complex structures, (i) superimposition of 21 HLA-A02:01 binding 9-mer peptides extracted from PDB complexes, (j) 9-mer peptide (shown in purple) and MHC molecule. The hydrophobic residue in the MHC molecule is shown in green. A bend in the peptide at position 3 is discernible in the TCR-pMHC complex structure.

Figure 7a-c. Flow cytometry analysis to test the immunogenicity of predicted peptides in a T cell activation assay. 9-mer peptides generated from genetically altered proteins derived from tumor cells are tested in a T cell activation assay in three different formats: (A) peptide added to peripheral blood mononuclear cells (PBMCs); (B) peptide added to purified dendritic cell - CD8 T cell co-culture assay; (C) peptide expressed as a minigene in a purified dendritic cell - CD8 T cell co-culture assay. Production of IFN-γ by CD8 T cells in the presence of the mutant peptide is compared with wild-type peptide to select a peptide as immunogenic.

Figure 8. Top clonally amplified population of T cells in the presence of different peptides. Peptide 1 (Pep 1) and Peptide 2 (Pep 2) show specific expansion of single TCR clones (monoclonal response) while peptide 3 and 4 (pep 3 and 4) display clonal expansion of multiple (4-5) clones (polyclonal response). Only clones with frequencies above 5% are shown. Underlined values on the top of each bar represent the frequency of the clone in the control sample.

Figure 9. Schematic showing the workflow of the 10x Genomics single cell TCR sequencing platform that can be overlaid with single cell transcriptomic analysis. Cells from the peptide-induced CD8⁺ T cell activation assay are processed for a single cell sequencing experiment. Single gel beads containing barcoded oligonucleotides are encapsulated into nanoliter-sized GEMs using the 10x Genomics GemCode platform. Lysis of barcoded cells followed by reverse transcription of RNAs from single cells is performed inside each GEM. After cDNA synthesis, the samples can be processed for gene expression and TCR alpha-beta paired sequencing on Illumina HiSeq 2500 or MiSeq platforms. Using the unique barcodes, the TCR sequence data can be coupled to gene expression.

Figure 10. 10x Genomics single cell analysis of clonally amplified T cells and expression of phenotypic markers of cytolytic T cells (CTLs). A. Clonally amplified population of CD8 T cells (dark blue dots) in the background of non-amplified T cells (light blue dots). B. Expression of cytolytic markers on clonally amplified T cells.

Figure 11. Creation and validation of minimal gene expression signature profile (MGESP) for eight different immune cells. A. Workflow for creating and validating MGESPs. B. Validation of MGESPs on RNA-seq data. One immune cell type is represented in each panel with the signature of the given cell type applied to all the immune cell types. The highest score corresponds to the cognate immune cell-type. C. Visualization of immune cell-types using MGESPs on two-dimensional coordinates from the t-distributed stochastic neighbor embedding (t-SNE) algorithm. D. Hierarchical clustering of immune cell-types on RNA-seq data from pure immune cells. E. Segregation of immune cells by MGESPs from single-cell RNA-seq data. F. Comparison of MGESPs with other published signatures on FACS data. Signatures were applied to FACS sorted immune cells and shown as a correlation plot.

Figure 12. Use of CD8⁺ T cell signature to stratify tumors. Cancers in which >25% of tumors have a positive CD8 T cell infiltration score are classified as high (example: melanoma (SKCM)). Cancers in which between 5% and 25% of tumors have a positive CD8 T cell infiltration score are classified as medium (example: head and neck squamous cell carcinoma (HNSCC)). Cancers in which <5% of tumors have a positive CD8 T cell score are classified as low (example: prostate cancer (PRAD)).
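The three-way stratification described in this caption amounts to a simple rule, sketched here in Python. Two details are assumptions of the sketch because the caption does not state them: a "positive" infiltration score is taken to mean a score greater than zero, and cancers falling exactly on the 5% or 25% boundary are placed in the medium class.

```python
def infiltration_class(scores, positive_cutoff=0.0):
    """Classify a cancer type from the CD8 T cell scores of its tumors.

    scores holds one signature-derived score per tumor. The class depends on
    the fraction of tumors with a positive score: above 25% is high, below
    5% is low, anything in between is medium.
    """
    fraction = sum(score > positive_cutoff for score in scores) / len(scores)
    if fraction > 0.25:
        return "high"
    if fraction < 0.05:
        return "low"
    return "medium"


# Toy cohorts, invented for illustration.
high_example = infiltration_class([1.0] * 3 + [-1.0] * 7)    # 30% positive
medium_example = infiltration_class([1.0] * 1 + [-1.0] * 9)  # 10% positive
low_example = infiltration_class([-1.0] * 50)                # 0% positive
```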

Figure 13a-d. Comprehensive analysis of the immune landscape of 9640 tumors across 33 cancers using MGESPs. A. Workflow to identify cancers with the highest infiltration of a given immune cell-type (left panel). MGESP-derived score for each immune cell-type was calculated for each of the tumors in the data set and arranged into quartiles. The number of samples in each quartile was used to create the heatmap (right panel). The color represents the proportion of tumor samples belonging to each cancer present in the quartile. Red and white indicate higher and lower numbers of tumor samples in a given quartile, respectively. B. Co-infiltration of immune cells in Q1 tumors. Tumors belonging to Q1 for each of the cell-types were analyzed for the co-infiltration of other immune cells and expressed as a correlation plot. Each vertical column represents the correlation of immune scores of a given cell-type with other immune cells. C. Infiltration of immune cells is dependent on the expression of chemoattractant genes specific to each immune cell-type. Dependence is shown as a correlation plot of normalized expression of chemoattractant genes and MGESP scores for each of the eight immune cell-types across all cancers. D. Enrichment of specific immune cells in tumors carrying mutations in a subset of oncogenes and tumor suppressor genes. Bubble plot shows significant correlation (p-value <= 0.005). Each colored bubble represents a specific immune cell, and the size of the bubble represents the number of tumors carrying a specific mutation and infiltrated by a specific immune cell-type.

Figure 14a-c. The relationship between infiltration of immune cells in tumors and their effect on patient survival across cancers. A. Correlation between infiltration of different immune cells and patient survival. For each cancer, survival benefit between top and bottom 20% tumors infiltrated by specific immune cells was compared. Size of the bubble shows sample number, red and white indicate good and poor prognosis, respectively, and significant associations (p-value <0.05) are shown. B. Effect of combined infiltration of two cell-types on patient survival represented as Kaplan-Meier plots for selected cancers. KIRC (CD4⁺ + Neutrophil) and SARC (CD8⁺ + Monocyte) showing good survival and LGG (Treg + Monocyte) showing poor survival. C. Changes in immune infiltrate in early and late stage tumors from different cancers. The immune scores differing significantly between cancer stages for a given cell-type are represented by the pie plot (p-value < 0.05).

Figure 15a-f. Cluster Analysis of 9120 tumors according to their immune profile. A. The 42-gene expression signature representing eight different immune cell-types was applied to cluster tumors according to their immune landscape. The four major clusters are shown in different colors, and a heatmap representing the profile of immune infiltrate for each cluster is shown below the dendrogram. B. The bar plot shows the percentage of tumors from each cancer present in different clusters (color of each cluster is shown in the bar plot). C. Epithelial, stromal and immune content of tumors present in different clusters. D. Immune cell content of tumors present in each cluster. E. Distribution of non-synonymous mutations in each cluster. F. Immune cell content of MSI⁺ (104) and MSI⁻ (6250) tumors.

Figure 16a-h. Analysis of factors affecting prognosis in the CD8⁺ T cell^hi cluster (cluster4). A. Survival plot of cases present in cluster1 and cluster4. B. Survival plot of cases present in cluster2 and cluster4. C. Epithelial, stromal and immune content in tumors present in alive and deceased groups from cluster4 (left panel). Immune landscape of tumors in alive and deceased groups from cluster4 (right panel). D. Inflammatory and immunosuppressive features of tumors present in the alive and the deceased groups. E. Presence of cytolytic cells in tumors belonging to the alive and deceased groups. F. Correlation of expression of anergic and exhaustion markers with CD8⁺ T cell infiltrate in the alive and dead groups. G. Genes upregulated in the TCR signaling pathway in the alive subjects of cluster4. H. Schematic of TCR signaling complex showing genes upregulated in alive cases (†) compared to the dead.

Figure 17. Schematic showing the immune microenvironment of tumors that experience long-term survival benefit (alive) over tumors that fail to show benefit (dead). The analysis is restricted to tumors that have high infiltration of CD8⁺ T cells. Tumors that experience long-term survival benefit are infiltrated by functional CD8⁺ T cells characterized by higher expression of CTL markers, and higher expression of TCR signaling genes.

Figure 18. Schematic showing the immune microenvironment of tumors that experience long-term survival benefit (alive) over tumors that fail to show benefit (dead). The analysis is restricted to tumors that have high infiltration of CD8⁺ T cells. Tumors that experience long-term survival benefit are infiltrated by functional CD8⁺ T cells characterized by higher expression of CTL markers, and higher expression of TCR signaling genes.

Figure 19. Expression of TCR signaling genes predicts response to Ipilimumab therapy in melanoma patients. The expression score is significantly higher in PR/CR patients when treated with Ipilimumab but not in naive patients.

List of Tables

Table 1. List of all class I HLA proteins used for peptide binding analysis.

Table 2. List of features selected from the Ensemble model that separated immunogenic from non-immunogenic peptides.

Table 3. Data used for rank ordering immunogenic peptides.

Table 4. List of immunogenic peptides from frequently occurring mutations in cancer.

Table 5. Summary of data generated from head and neck cancer tumor and paired normal sample.

Table 6. Preprocessing, alignment and coverage summary of exome sequencing data.

Table 7A. Summary of variants detected in the sample.

Table 7B. Classification of protein-altering variants.

Table 8. Pre-processing and alignment summary of RNA sequence data.

Table 9. HLA class I alleles present in the sample.

Table 10. Expression of HLA class I genes in the sample.

Table 11. Rank ordered list of immunogenic peptides from the mutations in head and neck cancer sample.

Table 12. Key steps in the CD8+ T-cell activation assay and critical parameters to monitor.

Table 13. QC parameters for delivering a sensitive assay.

Table 14. Steps to identify and validate cancer vaccine candidate in vitro.

Table 15. List of 125 genes from which a subset is used to create immune cell type- and sub-type-specific gene signature.

Table 16. Enriched pathways in CD8 T cell-infiltrated tumors.

Table 17. Impact of co-infiltration of two immune cell types on survival on TCGA data.

Table 18. List of inflammation and immune suppression markers. List of genes used as a signature for obtaining the inflammation and suppression scores.

Detailed Description Of The Invention:

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as is commonly understood by one of ordinary skill in the art to which this invention belongs. All patents, applications, published applications and other publications referred to herein are incorporated by reference in their entirety.

As used in the description of the invention and the appended claims, the singular forms "a", "an" and "the" are used interchangeably and intended to include the plural forms as well and fall within each meaning, unless the context clearly indicates otherwise. Also, as used herein, "and/or" refers to and encompasses any and all possible combinations of one or more of the listed items, as well as the lack of combinations when interpreted in the alternative ("or").

As used herein, "one or more" is intended to mean "at least one" or all of the listed elements.

Except where noted otherwise, capitalized and non-capitalized forms of all terms fall within each meaning.

Unless otherwise indicated, it is to be understood that all numbers expressing quantities, ratios, and numerical properties of ingredients, reaction conditions, and so forth used in the specification and claims are contemplated to be able to be modified in all instances by the term "about." As used herein, the term "about" when used before a numerical designation, e.g., temperature, time, amount, concentration, and such other, including a range, indicates approximations which may vary by (+) or (-) 10%, 5% or 1%.

As used herein, the term "substantially free" includes being free of a given substance or cell type or nearly free of that substance or cell type, e.g. having less than about 1% of the given substance or cell type.

As used herein, the term "feature" or "pre-defined feature" refers to physicochemical properties of amino acids that favour binding to T cell receptor or are used to characterize each amino acid of a peptide interacting with TCR. The physicochemical properties may include hydrophobic, helix/turn motif, polar, non-polar, β-sheet structure motif, charge of main chain, charge of side chain, solvent accessibility of an amino acid, spatial flexibility of the main chain and spatial flexibility of side chain of an amino acid.

As used herein, "physicochemical features of amino acids" refer to the functional groups present in amino acids which define interactions between amino acids and their chemical properties as defined in Amino Acid Index. As used herein, "an amino acid index" or "AAindex1 section of Amino Acid Index database or its equivalent" refers to a list of physicochemical properties of each of the 20 naturally occurring amino acids (Kawashima and Kanehisa 2000).

By way of example, a TCR-binding score of greater than 0.5 from a range of 0 to 1 may predict the ability of the peptide(s) to interact with the TCR.

As used herein, the term "cancer vaccine cocktail" refers to a mixture of immunogenic peptides that will induce an anti-tumor T cell response. A therapeutic vaccine may be administered during or after onset of a cancer. A prophylactic treatment vaccine may be administered prior to onset of the disease such as a cancer and is intended to prevent, inhibit or delay onset of the disease.

As used herein, the term "validated immunogenic peptide" refers to a peptide that activates CD8 T cells in an ex vivo T cell activation assay when added from outside as a synthetic peptide or expressed as a minigene in antigen-presenting cells. Additionally, a peptide that activates CD8 T cells in an ex vivo T cell activation assay when added to the culture medium may be a synthetic peptide or may be expressed as a minigene. Moreover, the peptide that activates CD8 T cells must interact with the TCR expressed by the T cells and be bound by the TCR. The peptide is bound by MHC or HLA and is presented by antigen-presenting cells, including dendritic cells. The peptide comprises amino acid sequences that support MHC or HLA binding as well as TCR binding. TCR binding may be predicted using an in silico-based method to identify a potential immunogenic peptide (predicted immunogenic peptide), whose immunogenicity may be validated using an ex vivo T cell activation assay, or by administering into mammals including humans (validated immunogenic peptide). The peptide may include additional amino acids outside of the MHC/HLA- and TCR-binding regions, such as protease cleavage sites or sequences.

As used herein, "immune checkpoint inhibitors" refers to agents that block immune checkpoints. Immune checkpoints are inhibitory pathways present in immune cells important for maintaining self-tolerance and controlling the degree of an immune response. Blocking these pathways may lead to reduced modulation of immune cells, or increased activation of immune cells.

The vaccines or peptides of the invention (including e.g., the immunogenic peptides of Table 4) may be administered in the form of a pharmaceutical composition comprising the active ingredient in a pharmaceutically acceptable dosage form. Depending upon the type of disease and patient to be treated, as well as the route of administration, the compositions may be administered at varying doses. Administration may be by methods including, but not limited to, intratumoral delivery, peritumoral delivery, intraperitoneal delivery, intrathecal delivery, intramuscular injection, subcutaneous injection, intravenous delivery, nasal spray and other mucosal delivery (e.g. transmucosal delivery), intra-arterial delivery, intraventricular delivery, intrasternal delivery, intracranial delivery, intradermal injection, electroincorporation (e.g., with electroporation), oncolytic viruses, ultrasound, jet injector, and topical patches.

Formulations suitable for administration include aqueous and non-aqueous sterile injection solutions which may contain anti-oxidants, buffers, bacteriostats and solutes which render the formulation isotonic with the blood of the intended recipient, and aqueous and non-aqueous sterile suspensions which may include suspending agents and thickening agents. The formulations may be presented in unit-dose or multi-dose containers, for example sealed ampoules and vials, and may be stored in a freeze-dried (lyophilised) condition requiring only the addition of the sterile liquid carrier, for example water for injections, immediately prior to use. Extemporaneous injection solutions and suspensions may be prepared from sterile powders, granules and tablets of the kind previously described.

When a vaccine or peptide of the invention described herein is being given to a subject, a skilled artisan would understand that the dosage depends on several factors, including, but not limited to, the subject's weight, disease and progression thereof or tumor size or tumor progression. With respect to duration and frequency of treatment, it is typical for skilled clinicians to monitor subjects in order to determine whether the treatment is providing therapeutic benefit, and to determine whether to increase or decrease dosage, increase or decrease administration frequency, discontinue treatment, resume or make other alterations to the treatment regimen.

In an embodiment, a non-limiting example of an administration protocol useful for the invention comprises multiple administrations of the vaccine or peptide of the invention during an initial period (such as, for example, a six-week period, with, for example, administration every two weeks). Furthermore, an administration protocol may also include multiple administrations of the vaccine or peptide of the invention at first administration (such as at multiple sites within a tumor at first administration of the vaccine).

By "effective amount" as used herein with respect to a vaccine or peptide of the invention, is meant an amount of the vaccine or peptide of the invention, administered to a subject that results in an immune response by the mammal so as to inhibit the disease such as cancer. Further, an effective amount may include any amount which, as compared to a corresponding subject who has not received such amount, results in improved treatment, healing, prevention, or amelioration of a disease, disorder, or side effect, or a decrease in the rate of advancement of a disease or disorder. The term also includes within its scope amounts effective to enhance normal physiological function.

As used herein, "inhibiting a tumor" may be measured in any way as is known and accepted in the art, including complete regression of the tumor(s) (complete response); reduction in size or volume of the tumor(s) or even a slowing in a previously observed growth of a tumor(s), e.g., at least about a 10-30% decrease in the sum of the longest diameter (LD) of a tumor, taking as reference the baseline sum LD (partial response); mixed response (regression or stabilization of some tumors but not others); or no apparent growth or progression of tumor(s) or neither sufficient shrinkage to qualify for partial response nor sufficient increase to qualify for progressive disease, taking as reference the smallest sum LD since the treatment started (stable disease).

Tumor or cancer status may also be assessed by sampling for the number, concentration or density of tumor or cancer cells, alone or with respect to a reference. Tumor or cancer status may also be assessed through the use of surrogate marker(s), such as Her-2 in breast cancer or PSA in prostate cancer.

As used herein, the term "mutant allele for one or more genetically altered protein associated with the tumor in a population" refers to a mutation present in the DNA of tumor cells that encodes a genetically altered protein.

As used herein, "treating" means using a therapy to ameliorate a disease or disorder or one or more of the biological manifestations of the disease or disorder; to directly or indirectly interfere with (a) one or more points in the biological cascade that leads to, or is responsible for, the disease or disorder or (b) one or more of the biological manifestations of the disease or disorder; to alleviate one or more of the symptoms, effects or side effects associated with the disease or disorder or one or more of the symptoms or disorder or treatment thereof; or to slow the progression of the disease or disorder or one or more of the biological manifestations of the disease or disorder. Treatment includes eliciting a clinically significant response. Treatment may also include improving quality of life for a subject afflicted with the disease or disorder (e.g., a subject afflicted with a cancer may receive a lower dose of an anti-cancer drug that causes side-effects when the subject is immunized with a composition of the invention described herein). Throughout the specification, compositions of the invention and methods for the use thereof are provided and are chosen to provide suitable treatment for subjects in need thereof.

In some embodiments, treatment with a composition of the invention described herein induces and/or sustains an immune response in a subject. Immune responses include innate immune response, adaptive immune response, or both. Innate immune response may be mediated by neutrophils, macrophages, natural killer cells (NK cells), and/or dendritic cells. Adaptive immune response includes humoral responses (i.e., the production of antibodies), cellular responses (i.e., proliferation and stimulation of T-lymphocytes), or both. Measurement of activation and duration of cellular response may be by any known methods including, for example, cytotoxic T-lymphocyte (CTL) assays. Humoral responses may also be measured by known methods including isolation and quantitation of antibody titers specific to the compositions of the invention (e.g., vaccines) such as IgG or IgM antibody fractions.

In some embodiments, the methods of treatment (e.g., immunotherapy) described herein are used as a stand-alone therapy without combining with any other therapy.

In other embodiments, the methods of treatment (e.g., immunotherapy) described herein provide adjunct therapy to other therapies, e.g., cancer therapy, prescribed for a subject. For example, the methods of treatment (e.g., immunotherapy) described herein may be administered in combination with radiotherapy, chemotherapy, gene therapy or surgery. The combination is such that the method of treatment (e.g., immunotherapy) described herein may be administered prior to, with or following adjunct therapy. In accordance with the invention, the effect of anti-disease or disorder treatment (e.g., a cancer treatment) may be assessed by monitoring the patient, e.g., by measuring and comparing survival time or time to disease progression (disease-free survival). Any assessment of response may be compared to individuals who did not receive the treatment or were treated with a placebo, or to individuals who received an alternative treatment.

As used herein, "preventing" is understood to refer to the prophylactic administration of a drug to substantially diminish the likelihood or severity of a condition or biological manifestation thereof, or to delay the onset of such condition or biological manifestation. One skilled in the art will appreciate that prevention is not an absolute term. Prophylactic therapy is appropriate, for example, when a subject is considered at high risk for developing a particular disease or disorder (e.g., cancer), such as when a subject has a strong family history of a disease or disorder or when a subject has been exposed to e.g., a disease-causing agent, e.g., a carcinogen.

By way of example, HLA or MHC binding affinity may be in the range of about 0.1 nM - 1000 nM.

As used herein, "quality of proteasomal processing" depends on the proteasomal processing sites flanking the peptide that will generate the correct HLA-binding peptide. As used herein, "quality of TAP transporter binding" is high when a peptide has a high affinity TAP-binding score, which will enable the peptide to be transported into the endoplasmic reticulum as determined using the NetCTLpan algorithm (Peters, Bulik et al. 2003).

By way of example, "positive in T cell activation assay" refers to either a peptide that induces about >0.1% of CD8 T cells to express IFN-γ in an ex vivo CD8 T cell activation assay, or a peptide that induces clonal amplification of CD8 T cells where about top ten amplified clones have a cumulative amplification frequency of about >20%.
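The two example criteria above can be sketched as a simple check. This is a minimal illustration only: the function name, argument names and default cutoffs are assumptions chosen to mirror the example values in the definition (about >0.1% IFN-γ-positive CD8 T cells, or a cumulative frequency of about >20% for about the top ten amplified clones), and frequencies are assumed to be expressed in percent.

```python
def is_positive_in_assay(ifng_percent=None, clone_frequencies=None,
                         ifng_cutoff=0.1, top_n=10, cumulative_cutoff=20.0):
    """Return True if either example criterion for a positive assay is met.

    ifng_percent: percent of CD8 T cells expressing IFN-gamma (hypothetical input).
    clone_frequencies: amplification frequency, in percent, of each T cell clone.
    """
    # Criterion 1: fraction of IFN-gamma-expressing CD8 T cells above the cutoff.
    if ifng_percent is not None and ifng_percent > ifng_cutoff:
        return True
    # Criterion 2: cumulative frequency of the top-N amplified clones above the cutoff.
    if clone_frequencies:
        top = sorted(clone_frequencies, reverse=True)[:top_n]
        return sum(top) > cumulative_cutoff
    return False
```

Either criterion alone is treated as sufficient, following the "either ... or" wording of the definition.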

By way of example, "T cell activation" refers to either a greater than about 2-fold amplification of CD8 T cells expressing IFN-γ by the mutant peptide (also referred to herein as a peptide variant) compared to wild type, and/or a greater than about 2-fold expression of IFN-γ by CD8 T cells by the mutant peptide compared to wild-type.

By way of example, a T-cell amplification response is monoclonal if one clone is about >20% of total clones and the rest are below about 2% frequency. A T-cell amplification is polyclonal if more than one clone is present at about >5%. As used herein, functional competence of T cells may be determined by the expression of T-helper markers or CTL markers such as IFN-γ, TNF-α, Granzyme A, Granzyme B, Perforin, Granulysin and/or PDCD1.
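As an illustration of the monoclonal and polyclonal examples above, the following sketch classifies a list of clone frequencies. The function name, the "indeterminate" label for responses that match neither example, and the use of percent units are assumptions made for this sketch.

```python
def classify_clonality(clone_frequencies, dominant_cutoff=20.0,
                       background_cutoff=2.0, polyclonal_cutoff=5.0):
    """Classify a T-cell amplification response from clone frequencies (percent)."""
    freqs = sorted(clone_frequencies, reverse=True)
    # Polyclonal: more than one clone present above about 5%.
    if sum(1 for f in freqs if f > polyclonal_cutoff) > 1:
        return "polyclonal"
    # Monoclonal: one clone above about 20% and all remaining clones below about 2%.
    if freqs and freqs[0] > dominant_cutoff and all(
            f < background_cutoff for f in freqs[1:]):
        return "monoclonal"
    # Neither example definition applies (label is an assumption of this sketch).
    return "indeterminate"
```

For instance, frequencies of 25%, 1% and 0.5% would be labeled monoclonal, while 12%, 8% and 6% would be labeled polyclonal.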

By way of example, "lack of anergic and/or exhaustion markers for T cells" refers to markers such as HAVCR2, LAG3, TIGIT, CCL3, CCL4, RBPMS, ZBED2 and/or PIP5K1B.

By way of example, "moderate expression" or "low level of expression" in expressing cells may refer to equal to or lower than about 10 fragments per kilobase per million (FPKM).

As used in this application, "cancer-specific mutant peptide" refers to a peptide that comprises at least one mutated amino acid present in the cancer tissue and absent in the normal tissue, including for example immunogenic peptide, validated immunogenic peptide, predicted immunogenic peptide and peptide variant. The "cancer immunogenic peptide or tumor immunogenic peptide" may refer to a predicted immunogenic peptide or validated immunogenic peptide that comprises at least one mutated amino acid that is present in the cancer tissue and absent in the normal tissue and is capable of binding TCR and evoking a T cell response in the individual. The immunogenic peptides of the invention which are selected by the methods of the invention may be synthesized or expressed to be part of a larger polypeptide tumor vaccine. Alternatively, the nucleic acid encoding the immunogenic peptide of the invention may be used as part of a larger tumor vaccine.

Cancer or tumor immunogenic peptides can arise from i) proteins altered in amino acid sequence in which one or more amino acids are altered, which may be arranged in a sequence or distributed randomly across the length of the protein; ii) proteins translated from fusion genes; iii) proteins produced from splice variants or from mutations in splicing sites, which results in the introduction of intronic region or part of an intronic region in frame with the protein coding sequence or exclusion of part or whole exon(s) resulting in an altered protein with new sequence at the site of the lost exonic region; iv) proteins produced from insertions and/or deletions of nucleotides that cause frameshift in the protein coding sequence resulting in the introduction of one or more amino acids absent in the normal protein (Turajlic, Litchfield et al. 2017); or v) proteins arising from loss of stop codons (stop loss) that adds additional amino acids at the end of the protein (Romero Arenas, Fowler et al. 2014).

By way of a preferred example, an "immunogenic peptide" refers to a mutant peptide capable of transducing a signal in CD4⁺ and/or CD8⁺ T cells. Merely by way of example, an "immunogenic peptide used as a vaccine" in this application refers to a longer peptide of length ranging from about >11-mer up to about 50-mer containing within the longer peptide the minimal sequence of the immunogenic peptide. A "variant coding sequence" in this application refers to a nucleic acid sequence (DNA or RNA) from a cancer sample containing one or more variant nucleotides compared to the sequence in the reference normal sample. The sequence variation results in a change in the amino acid sequence of the protein encoded by the nucleic acid sequence. The "expressed variant coding sequence" in this application refers to a nucleic acid sequence derived from RNA expressed in the tumor or cancer tissue of the individual. A nucleic acid sequence "encoding" a peptide refers to a sequence of DNA or RNA containing the coding sequence of the peptide.

The "conceptual translation or in silico translation of the coding sequences" refers to translation of the coding sequence of a nucleic acid to amino acid sequence based on a codon table specifying amino acids, so as to obtain peptide or protein with a defined amino acid sequence. A computer and software may be used to perform the "conceptual translation or in silico translation of the coding sequences."
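A minimal sketch of such an in silico translation is shown below, assuming the standard genetic code and an in-frame coding sequence. The function name `translate`, the use of "X" for unrecognized codons and the `stop_at_stop` option are assumptions of this sketch and are not taken from the specification.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(coding_sequence, stop_at_stop=True):
    """Translate an in-frame DNA or RNA coding sequence to a protein string."""
    seq = coding_sequence.upper().replace("U", "T")  # accept RNA input
    protein = []
    # Walk the sequence codon by codon, ignoring any trailing partial codon.
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")  # "X" marks an unrecognized codon
        if aa == "*" and stop_at_stop:
            break
        protein.append(aa)
    return "".join(protein)
```

Translating the same coding sequence with and without a variant nucleotide shows the resulting amino acid change, for example `translate("ATGGTGGAG")` versus `translate("ATGGAGGAG")`.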

The "genetically altered protein(s) expressed by the mammalian tumor cell or the mammalian tumor tissue" refers to altered or mutated protein(s) reflective of changes in the genetic material present in the mammalian tumor cell or tissue.

The "class I HLA or equivalent" is class I MHC molecules of human or any other mammalian species.

The "HLA-binding neoepitope" in the context of class I HLA molecules refers to a peptide sequence of 8-11 amino acids in length in which one or more amino acids are mutated, which can bind or is predicted to bind to specific class I HLA molecules. The "HLA-binding epitope" in the context of class I HLA molecules refers to peptides containing mutated or non-mutated amino acids. For example, the HLA may be a class I HLA molecule.

The "MHC-binding neo-epitope" in the context of class I MHC molecules refers to a peptide sequence of 8-11 amino acids in length in which one or more amino acids are mutated, which can bind or is predicted to bind to specific class I MHC molecules. The "MHC-binding epitope" in the context of class I MHC molecules refers to peptides containing mutated or non-mutated amino acids.

The "HLA-binding neo-epitope" in the context of class II HLA molecules refers to a peptide sequence of 13-21 amino acids in length in which one or more amino acids are mutated, which can bind or is predicted to bind to specific class II HLA molecules. The "HLA-binding epitope" in the context of class II HLA molecules refers to peptides containing mutated or non-mutated amino acids.

The "MHC-binding neo-epitope" in the context of class II MHC molecules refers to a peptide sequence of 13-21 amino acids in length in which one or more amino acids are mutated, which can bind or is predicted to bind to specific class II MHC molecules. The "MHC-binding epitope" in the context of class II MHC molecules refers to peptides containing mutated or non-mutated amino acids.
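The length ranges in the class I and class II definitions above suggest how candidate neo-epitopes might be enumerated from a mutated protein: every window of an allowed length that spans the mutated residue is a candidate. The sketch below illustrates that idea only. The function name, the 0-based `mutated_index` convention and the restriction to a single mutated position are assumptions, and the sketch does not predict HLA or MHC binding.

```python
def neoepitope_candidates(protein, mutated_index, lengths=range(8, 12)):
    """List all peptide windows of the given lengths that span the mutated residue.

    protein: mutated protein sequence (one-letter amino acid codes).
    mutated_index: 0-based position of the mutated residue.
    lengths: allowed peptide lengths; 8-11 for class I, range(13, 22) for class II.
    """
    if not 0 <= mutated_index < len(protein):
        raise ValueError("mutated_index is outside the protein sequence")
    candidates = []
    for k in lengths:
        # The window must start early enough to include the mutation and
        # late enough to stay inside the protein.
        first = max(0, mutated_index - k + 1)
        last = min(mutated_index, len(protein) - k)
        for start in range(first, last + 1):
            candidates.append(protein[start:start + k])
    return candidates
```

Calling the function with `lengths=range(13, 22)` would enumerate class II-length candidates in the same way.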

"T-cell neo-epitopes" refers to a peptide in which one or more amino acids are mutated, which can bind or is predicted to bind to T-cell receptor of CD8+ T-cell or CD4+ T-cell. HLA refers to human protein and MHC refer to mouse protein. Both peform the same function of presenting peptides to T cells.

An "immunogenic peptide" is a "HLA/MHC-binding neoepitope" or "HLA/MHC-binding epitope". However, not all HLA/MHC-binding neoepitopes or HLA/MHC-binding epitopes are "immunogenic peptides". The "peptide precursor" may be a protein present in the cancer tissue that contains the peptide of interest. Multiple "peptide precursors" can contain the peptide of interest.

A "disease tissue" in this application refers to tumor or cancer tissue from human or mice. A "tumor" or "neoplasm" is an abnormal growth of tissue whether benign or malignant.

A "cancer" may be a malignant tumor or malignant neoplasm. Cancer refers to any one of cancer, tumor growth, cancer of the colon, breast, bone, brain and others (e.g., osteosarcoma, neuroblastoma, colon adenocarcinoma), chronic myelogenous leukemia (CML), acute myeloid leukemia (AML), acute promyelocytic leukemia (APL), cardiac cancer (e.g., sarcoma, myxoma, rhabdomyoma, fibroma, lipoma and teratoma); lung cancer (e.g., bronchogenic carcinoma, alveolar carcinoma, bronchial adenoma, sarcoma, lymphoma, chondromatous hamartoma, mesothelioma); various gastrointestinal cancers (e.g., cancers of esophagus, stomach, pancreas, small bowel, and large bowel); genitourinary tract cancer (e.g., kidney, bladder and urethra, prostate, testis); liver cancer (e.g., hepatoma, cholangiocarcinoma, hepatoblastoma, angiosarcoma, hepatocellular adenoma, hemangioma); bone cancer (e.g., osteogenic sarcoma, fibrosarcoma, malignant fibrous histiocytoma, chondrosarcoma, Ewing's sarcoma, malignant lymphoma, multiple myeloma, malignant giant cell tumor chordoma, osteochondroma, benign chondroma, chondroblastoma, chondromyxofibroma, osteoid osteoma and giant cell tumors); cancers of the nervous system (e.g., of the skull, meninges, brain, and spinal cord); gynecological cancers (e.g., uterus, cervix, ovaries, vulva, vagina); hematologic cancer (e.g., cancers relating to blood, Hodgkin's disease, non-Hodgkin's lymphoma); skin cancer (e.g., malignant melanoma, basal cell carcinoma, squamous cell carcinoma, Kaposi's sarcoma, moles dysplastic nevi, lipoma, angioma, dermatofibroma, keloids, psoriasis); and cancers of the adrenal glands (e.g., neuroblastoma).

Examples of tumors include colorectal cancer, osteosarcoma, non-small cell lung cancer, breast cancer, ovarian cancer, glial cancer, solid tumors, metastatic tumor, acute lymphoblastic leukemia, acute myelogenous leukemia, adrenocortical carcinoma, Kaposi sarcoma, lymphoma, anal cancer, astrocytomas, basal cell carcinoma, bile duct cancer, bladder cancer, bone cancer, brain tumor, breast cancer, bronchial tumor, cervical cancer, chronic lymphocytic leukemia, chronic myelogenous leukemia, chronic myeloproliferative disorders, colon cancer, colorectal cancers, ductal carcinoma in situ, endometrial cancer, esophageal cancer, eye cancer, intraocular, retinoblastoma, metastatic melanoma, gallbladder cancer, gastric cancer, gastrointestinal carcinoid tumor, gastrointestinal stromal tumors, glioblastoma, glioma, hairy cell leukemia, head and neck cancer, hepatocellular carcinoma, hepatoma, Hodgkin lymphoma, hypopharyngeal cancer, Langerhans cell histiocytosis, laryngeal cancer, lip and oral cavity cancer, liver cancer, lobular carcinoma in situ, lung cancer, non-small cell lung cancer, small cell lung cancer, lymphoma, AIDS-related lymphoma, Burkitt lymphoma, non-Hodgkin lymphoma, cutaneous T-cell lymphoma, melanoma, squamous neck cancer, mouth cancer, multiple myeloma, myelodysplastic syndromes, myelodysplastic/myeloproliferative neoplasms, nasal cavity and paranasal sinus cancer, nasopharyngeal cancer, neuroblastoma, oral cavity cancer, oropharyngeal cancer, osteosarcoma, ovarian cancer, pancreatic carcinoma, papillary carcinomas, parathyroid cancer, pharyngeal cancer, pheochromocytoma, pineal parenchymal tumors, pineoblastoma, pituitary tumor, pleuropulmonary blastoma, primary central nervous system lymphoma, prostate cancer, rectal cancer, renal cell cancer, salivary gland cancer, sarcoma, Ewing sarcoma, soft tissue sarcoma, squamous cell carcinoma, Sezary syndrome, skin cancer, Merkel cell carcinoma, testicular cancer, throat cancer, thymoma, thymic carcinoma, 
thyroid cancer, urethral cancer, endometrial cancer, uterine cancer, uterine sarcoma, vaginal cancer, vulvar cancer, Waldenstrom macroglobulinemia, and Wilms tumor. In one embodiment, the tumor is a glioma. In one embodiment, the tumor is a tumor other than a glioma.

For example, an inhibition of growth of a cancer cell means that the rate of growth of a cancer cell that has been treated with a peptide of the invention is about 5-fold, 10-fold, 20-fold, 30-fold, 40-fold, 50-fold, 100-fold, or more, less than that of a cancer cell that has not been treated with a peptide of the invention. As used herein, "inhibition" as it refers to the rate of growth of a cancer cell that has been treated with a peptide of the invention also means that the rate is about 90%, 80%, 70%, 60%, 50%, 40%, 30%, 25%, 20%, 15%, 10%, 5% or less, lower than the rate of growth of a cancer cell that has not been treated with a peptide of the invention.

An inhibition of growth of a cancer cell also means that the number or growth of cancer cells that have been treated with a peptide of the invention is about 5-fold, 10-fold, 20-fold, 30-fold, 40-fold, 50-fold, 100-fold, or more, less than the number or growth of cancer cells that have not been treated with a peptide of the invention. By way of example, "inhibition" as it refers to the rate of growth of a cancer cell also means that the number or growth of cancer cells that have been treated with a peptide of the invention may be about 90%, 80%, 70%, 60%, 50%, 40%, 30%, 25%, 20%, 15%, 10%, 5% or less, lower than the growth or number of cancer cells that have not been treated with a peptide of the invention.

As used herein, "cancer" may be used interchangeably with "tumor," and vice versa, except when expressly or inherently prohibited. Similarly, "MHC" may be used interchangeably with "HLA," and vice versa, except when expressly or inherently prohibited.

The term "unmutated or wild-type peptide" refers to a peptide derived from normal or healthy cells or tissue. Normal or healthy cells or tissue are free of disease, and in the context of the invention, free of tumor/cancer tissue or cells. Unlike cancer-specific mutant peptide, tumor peptide variant(s) or cancer peptide variant(s), which are mutant or altered peptides specific to cancer or tumor cells or tissues and not present in non-tumor/cancer cells or tissue, the "unmutated or wild-type peptide" may be present in cancer or tumor cells or tissue.

As used herein, the terms "comprising" or "comprises" are intended to mean that the compositions and methods include the recited elements, but not excluding others. "Consisting essentially of" when used to define compositions and methods, shall mean excluding other elements of any essential significance to the combination for the stated purpose. Thus, a composition consisting essentially of the elements as defined herein would not exclude other materials or steps that do not materially affect the basic and novel characteristic(s) of the present disclosure. "Consisting of" shall mean excluding more than trace elements of other ingredients and substantial method steps. Embodiments defined by each of these transition terms are within the scope of the present disclosure.

Methods of the Invention

Methods for Selection of a cancer vaccine from genetically altered protein(s) expressed by a mammalian cancer cell and/or tissue and validation methods

The invention provides methods of validating peptide variant(s) as an immunogenic peptide. In one embodiment, the method comprises (A) selecting a peptide variant(s) predicted to be an immunogenic peptide comprising the steps of 1) obtaining a sample from a subject with a tumor; 2) identifying genetically altered protein(s) expressed by a mammalian tumor cell or a mammalian tumor tissue in the sample from nucleic acid sequence(s) encoding the genetically altered protein(s); and 3) producing peptide fragment(s) comprising at least one mutated amino acid from the genetically altered protein(s) so identified in step A.2, so as to obtain one or more peptide variant(s) associated with the mammalian tumor cell or the mammalian tumor tissue. The method further comprises 4) selecting the peptide variant(s) from step A.3 predicted to bind T-cell receptor (TCR). In a specific embodiment, this involves i) selecting the peptide variant(s)-of a pre-defined length; ii) characterizing the peptide variant(s) in silico by selecting and matching features associated with an amino acid at each position of the peptide with selected pre-defined features for each position of peptides recognized by TCR associated with CD8+ T-cell, so as to obtain predictive ability of the peptide variant(s) to interact with the TCR; and iii) selecting the peptide variant(s) in step 4.ii based on the predicted ability of the peptide variant(s) to interact with the TCR, so as to be an immunogenic peptide that may or can serve as a mammalian tumor immunogenic peptide(s). Merely by way of example, the selected pre-defined features may comprise any, combination or all of hydrophobic, helix/turn motif, polar, non-polar, β -sheet structure motif, charge of main chain, charge of side chain, solvent accessibility of an amino acid, spatial flexibility of the main chain and spatial flexibility of side chain of an amino acid. Additional examples of selection predefined features may be used and are described in infra. 
Further, the method may additionally comprise the step of B) validating one or more immunogenic peptide(s) of step A above comprising the step of 1) determining whether the peptide variant(s) so selected is positive in an ex vivo CD8+ T-cell activation assay, and selecting the peptide variant(s) which is positive in a CD8+ T-cell activation assay so as to ensure ability of the peptide(s) to activate CD8+ T-cells, thereby validating the peptide variant(s) as an immunogenic peptide (B.1).
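Merely by way of illustration, step A.3 (producing peptide fragments that each comprise the mutated amino acid) may be sketched as a sliding window over the altered protein sequence. The function name `mutant_windows`, the 0-based position convention and the default length of 9 residues are assumptions made for this sketch and are not prescribed by the method.

```python
def mutant_windows(protein: str, mut_pos: int, length: int = 9) -> list[str]:
    """Return every window of `length` residues that covers index `mut_pos`.

    `mut_pos` is the 0-based index of the mutated residue in `protein`
    (hypothetical convention chosen for this sketch).
    """
    if not 0 <= mut_pos < len(protein):
        raise ValueError("mut_pos lies outside the protein sequence")
    # Leftmost window start that still contains the mutated residue.
    first = max(0, mut_pos - length + 1)
    # Rightmost window start that contains it and fits inside the protein.
    last = min(mut_pos, len(protein) - length)
    return [protein[s:s + length] for s in range(first, last + 1)]


if __name__ == "__main__":
    # Arbitrary illustrative sequence, not taken from any table of the invention.
    for pep in mutant_windows("MKTAYIAKQRQISFVKSHFSRQ", mut_pos=10, length=9):
        print(pep)
```

Each returned fragment can then be passed to the TCR-binding selection of step A.4.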

Additionally, in another aspect, the invention provides methods of selecting one or more validated immunogenic peptides for a cancer vaccine cocktail. In one embodiment, the method comprises obtaining one or more validated immunogenic peptides by the validation method above, wherein in step B, the validation method further comprises the steps of 1) quantitating the magnitude of CD8+ T-cell activation of the peptide variant(s) in step B.1, wherein peptide variant(s) generating about >2-fold expression of CD8+ T-cell activation marker IFN-γ and/or about two-fold expansion of CD8+ T-cells expressing IFN-γ compared to wild-type peptide or no-peptide control are selected; 2) determining monoclonal and polyclonal CD8+ T-cell amplification response in an ex vivo CD8+ T-cell activation assay to the peptide variant(s), such that the monoclonal and polyclonal CD8+ T-cell expansion is directed or skewed towards a polyclonal expansion of CD8+ T cells, such that a single peptide variant which activates two or more peptide variant-specific CD8+ T-cell clones is selected; 3) determining functional competence of CD8+ T-cells by quantitating expression of CTL markers in CD8+ T-cells in response to the peptide variant(s) in step B.1 of the validation method. The CTL markers may comprise expression of about four or more of, e.g., IFN-γ, IL-2, TNF-α, CD69, Perforin, Granzyme A, Granzyme B, Granulysin, Fas-L and CD107a (or all of these CTL markers). The method provides selecting peptide variant(s) which express about four or more CTL markers. Further, the method comprises determining the anergic/exhaustion phenotype of CD8+ T-cells expanded in response to the peptide variant(s) and selecting peptide variant(s) inducing low or no expression of anergic and/or exhaustion markers in the expanded population of CD8+ T-cells.
Merely by way of example, the anergic and/or exhaustion markers may include any, all or a combination of CTLA-4, PD-1, Eomes, CD160, TIGIT, ENTPD1, MYO7A, PHLDA1, LAG-3, 2B4, BTLA, TIM3, VISTA and CD96. The combination of these steps permits selecting one or more validated immunogenic peptides for the cancer vaccine cocktail. In a preferred embodiment, the activation markers are IFN-γ, TNF-α and IL-2.
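The selection criteria above may be combined, by way of a non-limiting sketch, into a single filter. The record type `AssayResult`, its field names, and the cut-off used for "low or no" exhaustion marker expression (here, zero markers detected) are hypothetical choices made only for illustration; the fold-change comparison uses "at least 2-fold" as one reading of "about >2-fold".

```python
from dataclasses import dataclass, field


@dataclass
class AssayResult:
    """Hypothetical per-peptide summary of the ex vivo CD8+ T-cell assay."""
    peptide: str
    ifng_fold_change: float   # IFN-gamma vs wild-type or no-peptide control
    n_clones: int             # distinct peptide-specific CD8+ T-cell clones
    ctl_markers: set[str] = field(default_factory=set)
    exhaustion_markers: set[str] = field(default_factory=set)


def passes_selection(r: AssayResult, min_fold: float = 2.0, min_clones: int = 2,
                     min_ctl: int = 4, max_exhaustion: int = 0) -> bool:
    """Apply the four criteria: activation, polyclonality, CTL markers, exhaustion."""
    return (r.ifng_fold_change >= min_fold
            and r.n_clones >= min_clones
            and len(r.ctl_markers) >= min_ctl
            and len(r.exhaustion_markers) <= max_exhaustion)


if __name__ == "__main__":
    good = AssayResult("PEPTIDE_A", 3.1, 3, {"IFN-g", "IL-2", "TNF-a", "CD107a"})
    poor = AssayResult("PEPTIDE_B", 3.5, 1, {"IFN-g", "IL-2"}, {"PD-1", "LAG-3"})
    print([r.peptide for r in (good, poor) if passes_selection(r)])
```

In practice the thresholds would be set per assay; the defaults merely mirror the approximate values recited above.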

In an embodiment of the invention, the positive prediction of the validated immunogenic peptide to be bound by TCR in step (A)(4) of the validation method above comprises a TCR-binding algorithm. For example, the TCR-binding algorithm may comprise peptide(s) of a pre-defined length comprising one or more mutations and/or one or more alterations in level of expressed genetic material associated with the tumor; and selecting and matching features associated with an amino acid at each position of the peptide with selected pre-defined features for each position of peptides recognized by TCR associated with CD8+ T-cell, so as to obtain predictive ability of the peptide(s) to interact with the TCR; wherein the features comprise physicochemical features of amino acids and wherein the physicochemical features are selected from an amino acid index and wherein the amino acid index is the AAindex1 section of the Amino Acid Index database or its equivalent.
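The position-wise matching described above may be illustrated, under stated assumptions, with a single physicochemical property. The sketch below uses the Kyte-Doolittle hydropathy scale as the amino acid index, builds a per-position mean profile from reference peptides, and scores a candidate by its negative mean absolute deviation from that profile. The scoring rule, the function names and the reference peptides are assumptions for illustration only; the source does not specify how features are matched, and a practical algorithm would use many indices.

```python
# Kyte-Doolittle hydropathy values, used here as one example amino acid index.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode(peptide: str) -> list[float]:
    """Map each position of the peptide to its index value."""
    return [HYDROPATHY[aa] for aa in peptide]


def position_profile(reference_peptides: list[str]) -> list[float]:
    """Per-position mean index value over equal-length reference peptides."""
    encoded = [encode(p) for p in reference_peptides]
    if len({len(e) for e in encoded}) != 1:
        raise ValueError("reference peptides must share one pre-defined length")
    return [sum(col) / len(col) for col in zip(*encoded)]


def match_score(peptide: str, profile: list[float]) -> float:
    """Negative mean absolute deviation from the profile (0.0 is a perfect match)."""
    values = encode(peptide)
    if len(values) != len(profile):
        raise ValueError("peptide length differs from profile length")
    return -sum(abs(v - p) for v, p in zip(values, profile)) / len(profile)


if __name__ == "__main__":
    # Hypothetical reference peptides standing in for TCR-recognized peptides.
    profile = position_profile(["LLFGYPVYV", "LLFGYAVYV"])
    for candidate in ("LLFGYPVYV", "KKKDDEEKK"):
        print(candidate, round(match_score(candidate, profile), 3))
```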

The invention provides methods for selecting a cancer vaccine from genetically altered protein(s) expressed by a mammalian cancer cell and/or tissue. The method comprises identifying neo-epitopes in mutant cancer peptides from the genetically altered protein(s) which is from the mammalian cancer cell and/or tissue. The method further comprises calculating probability of TCR binding of the neo-epitope(s) of (a) to generate a T-cell response, thereby identifying a T-cell activating neo-epitope(s) from the genetically altered protein. Additionally, the method comprises selecting one or more mutant cancer peptide(s) so identified above having the highest probability or a probability above a threshold setting that can modulate the immune response of a mammal when challenged with the mutant cancer peptide(s), thereby selecting a cancer vaccine; wherein the cancer vaccine comprises one or more mutant cancer peptides derived from the genetically altered protein(s) and wherein the mammalian subject expresses the genetically altered protein(s) and expresses an HLA or MHC molecule that binds the mutant cancer peptide(s). The invention further provides a cancer vaccine selected by the method of the invention. The invention additionally provides a method of treating a cancer comprising administering one or more of the cancer vaccine(s) selected by the method of the invention into a subject in need. Examples of the tumor may include, but are not limited to, a stomach tumor, a colorectal tumor, a colon tumor, a breast tumor, an ovarian tumor, a prostate tumor, a lung tumor, a kidney tumor, a gastric tumor, a testicular tumor, a head and neck tumor, a pancreatic tumor, a brain tumor, a melanoma, a lymphoma, and/or a leukemia.

In accordance with the practice of the invention, the cancer may be a stomach cancer, a bone cancer, a cervical thyroid cancer, a colorectal cancer, a colon cancer, a breast cancer, an ovarian cancer, a prostate cancer, a liver cancer, a lung cancer, a kidney cancer, a gastric cancer, a testicular cancer, a head and neck cancer, a skin cancer, a pancreatic cancer, a brain cancer, a melanoma, a lymphoma or a leukemia. Examples of the colorectal cancer include, but are not limited to, familial adenomatous polyposis (FAP) and/or Lynch Syndrome.

In a further embodiment of the invention where the method comprises the step of identifying neo-epitopes from the mammalian cancer cell and/or tissue, the method further comprises sequencing a nucleic acid sample of the subject's tumor and of a non-tumor (normal) sample of the subject, and identifying about 4-20 sequences comprising tumor-specific non-silent mutations not present in the non-tumor (normal) sample. The method additionally comprises producing about 4-20 subject-specific peptides encoded by said 4-20 sequences comprising tumor-specific non-silent mutations not present in the non-tumor (normal) sample. Further, the method comprises measuring binding of said produced subject-specific peptides to an HLA protein of said subject, wherein each of said subject-specific peptides has a different tumor neo-epitope that is an epitope specific to the tumor of the subject, from the neo-epitopes identified in tumor specific mutations, wherein each neo-epitope is an expression product of a tumor-specific non-silent mutation not present in the non-tumor (normal) sample and each neo-epitope binds to an HLA or MHC protein of the subject.

In one embodiment of the method, the subject-specific immunogenic composition comprises a subject-specific peptide about 8 to 50 amino acids in length. In another embodiment, the subject-specific immunogenic composition comprises a subject-specific peptide that binds to the HLA or MHC protein of the subject with an IC50 less than about 500 nM.
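By way of a non-limiting sketch, the two embodiments above (peptide length of about 8 to 50 amino acids and IC50 below about 500 nM) may be applied together as a filter over measured or predicted binding values. Combining them in one step, the function name `filter_binders` and the example IC50 values are assumptions for illustration; the values are invented and are not measurements.

```python
def filter_binders(ic50_nm_by_peptide: dict[str, float], min_len: int = 8,
                   max_len: int = 50, ic50_cutoff_nm: float = 500.0) -> list[str]:
    """Keep peptides within the length range whose IC50 is below the cut-off.

    Returned peptides are ordered from strongest binder (lowest IC50) to weakest.
    """
    kept = [
        (ic50, pep)
        for pep, ic50 in ic50_nm_by_peptide.items()
        if min_len <= len(pep) <= max_len and ic50 < ic50_cutoff_nm
    ]
    return [pep for _, pep in sorted(kept)]


if __name__ == "__main__":
    # Invented example values (nM) for hypothetical peptides.
    example = {"LLFGYPVYV": 12.0, "KKKDDEEKK": 8200.0, "GILGFVFTL": 45.0, "AAAA": 5.0}
    print(filter_binders(example))
```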

In accordance with the invention, the mutant cancer peptide may be any one, one or more, two or more, five or more, ten or more, twenty or more, fifty or more, or one hundred or more of the peptides in any of Table 4.

In one embodiment of the method, the cancer vaccine is a subject specific cancer vaccine. In another embodiment of the invention, the cancer vaccine is an intra-species cancer vaccine.

In an additional embodiment of the method, the cancer vaccines selected by any of the methods above that can modulate the immune response of a mammal comprise a mutant cancer peptide, wherein multiple mammalian subjects carry the same mutation as present in the mutant cancer peptide and express the same HLA molecule that binds the mutant cancer peptide. In one embodiment of the method, the genetically altered protein may be a mutant protein that is present in tumor cells but not in healthy cells. In a further embodiment of the method, the genetically altered protein may be overexpressed in tumor cells but not in healthy cells. In another embodiment of the method, the genetically altered protein may be a mutant protein that is present in tumor and healthy cells and is overexpressed in tumor cells but not in healthy cells. In one embodiment of the method, the cancer may be Breast, Lung, Head & Neck, Skin, Ovary, Pancreatic, Liver, Brain, Prostate, Cervical Thyroid, Bone or Stomach. In one embodiment of the method, the measuring of binding of the subject-specific peptides to the HLA or MHC protein comprises measuring binding of the subject-specific peptides to a class I HLA or class I MHC protein of the subject.

In another embodiment of the method, modulating the immune response of a mammal comprises a mutant cancer peptide eliciting a T-cell response. In a further embodiment, the method further comprises isolating the T-cell from the subject.

The invention further provides methods for preparing a subject-specific immunogenic composition. The method comprises selecting a cancer vaccine from genetically altered protein(s) expressed by a mammalian cancer cell and/or tissue by the method of the invention.

In another embodiment of the method, the composition further comprises at least one adjuvant. In one embodiment of the method, said subject-specific peptides have one or more amino acid mutations and bind to HLA or MHC proteins of the subject with an IC50 less than about 500 nM.

In another embodiment of the method, said subject-specific peptides comprise any of the following: a peptide that is encoded by a non-synonymous mutation leading to a different amino acid substitution in comparison with a protein of the non-tumor sample; or a peptide that is encoded by a read-through mutation in which a stop codon is modified or deleted, leading to translation of a longer protein in comparison with a protein of the non-tumor sample and having a novel tumor-specific sequence at the C-terminus; or a peptide that is encoded by an RNA derived from a splice site mutation that leads to the inclusion of an intron or part of an intron, or alternatively, exclusion of an exon or part of an exon in the mature mRNA and thus has a unique tumor-specific protein sequence; or a peptide representing a chromosomal rearrangement that gives rise to a chimeric protein with tumor-specific sequences at the junction of two proteins of the non-tumor sample and thus represents a gene fusion; or a peptide that is encoded by an insertion or a deletion of coding sequences resulting in a unique tumor-specific protein sequence; or a peptide that is encoded by an mRNA with a frameshift mutation resulting in a unique tumor-specific protein sequence.

Methods for Identification of a T-cell epitope (neoepitope) for cancer immunotherapy

The invention also provides methods for identification of a T-cell epitope (neoepitope) (a T cell activating peptide) for cancer immunotherapy. The method comprises obtaining peripheral blood mononuclear cells (PBMCs) from a subject. Then, CD14+ CD16+ monocytes are isolated from PBMCs. Additionally, the CD14+ CD16+ monocytes are contacted with a DC maturation cytokine cocktail comprising GM-CSF, IL4 and IFN so as to differentiate the CD14+ CD16+ monocytes to dendritic cells (DCs). Furthermore, naive CD8+ T cells are isolated from PBMCs. After that, the DCs are contacted with a peptide from a protein overexpressed in a cancer cell or a genetically altered protein described by any of the methods above. Moreover, the DCs described above are co-cultured with the isolated naive CD8+ T cells described above in a culture medium comprising the DC maturation cytokine cocktail. The method further comprises supplementing the medium described with a second cytokine cocktail, and contacting the co-culture described above with additional peptide-pulsed autologous PBMCs or DCs so as to re-stimulate the T cells. Optionally, the method comprises treating the cultured cells with an inhibitor of cellular transport prior to analysis of marker(s) of activated CD8+ T cells, and quantifying the amount of marker(s) of activated CD8+ T cells, wherein presence of the marker(s) of activated CD8+ T cells above a control level obtained from a co-culture with no peptide challenge or challenge with a peptide known not to stimulate naive CD8+ T cells indicates T-cell recognition of the peptide presented by antigen presenting cells as a T-cell epitope. The invention further provides a T-cell epitope for cancer immunotherapy as identified by the method of the invention.

Methods for Classification of immunogenicity of a peptide to be used as a vaccine

The invention further provides methods for classifying immunogenicity of a peptide to be used as a vaccine comprising the method above, wherein the marker(s) of activated CD8+ T cells are IFN-γ and/or TNF-α and wherein the amount of the IFN-γ and/or TNF-α is used to classify the immunogenicity of a peptide. In one embodiment of the method, an immunogenic peptide elicits production of IFN-γ and/or TNF-α by CD8+ T cells. In another embodiment, a more immunogenic peptide elicits greater total production of IFN-γ and/or TNF-α. In yet another embodiment, an immunogenic peptide promotes T cell expansion. In a further embodiment, the T cell expansion may be either monoclonal or polyclonal. In one embodiment of the method, PBMCs obtained from a subject described by any of the methods above are stored frozen and thaw with greater than 70% viability before use in subsequent steps. In another embodiment, CD14+, CD16+ monocytes are greater than 15% to less than or equal to 30% and CD8+ T cells are greater than 7% to less than or equal to 12% of total PBMCs obtained from the subject as described above. In yet another embodiment, DCs that are differentiated from the CD14+ CD16+ monocytes as described above predominantly express the CD11c cell surface marker over CD14+ and CD16+ cell surface markers. In yet another embodiment, greater than 40% of the CD14+ and CD16+ monocytes differentiate into CD11c+ dendritic cells. In another embodiment, the isolated naive CD8+ T cells as described above comprise greater than 90% CD8+ T cells and are depleted of natural killer (NK) and memory T cells. In a further embodiment, the isolated naive CD8+ T cells comprise less than 10% PBMCs having in total cells with any of CD56, CD57 or CD45RO cell surface marker. In another embodiment, the isolated naive CD8+ T cells lack CD56, CD57 and CD45RO cell surface markers. In one embodiment of the method, the peptide as described above is used at a concentration of about 1-10 micromolar.
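The cell-population thresholds recited in the embodiments above may be gathered, purely as an illustrative sketch, into a quality-control check. Applying all of them together is an assumption of this example (the source recites them as separate embodiments), and the metric names are hypothetical labels, not terms of the method.

```python
# Each hypothetical metric name maps to a predicate encoding a recited threshold.
QC_CHECKS = {
    "post_thaw_viability_pct": lambda v: v > 70,       # > 70% viable after thaw
    "monocyte_pct_of_pbmc": lambda v: 15 < v <= 30,    # CD14+ CD16+ monocytes
    "cd8_pct_of_pbmc": lambda v: 7 < v <= 12,          # CD8+ T cells
    "cd11c_dc_yield_pct": lambda v: v > 40,            # monocytes -> CD11c+ DCs
    "naive_cd8_purity_pct": lambda v: v > 90,          # purity after isolation
    "cd56_cd57_cd45ro_pct": lambda v: v < 10,          # NK / memory contaminants
}


def qc_failures(metrics: dict[str, float]) -> list[str]:
    """Return the names of metrics that are missing or fail their threshold."""
    return [
        name for name, ok in QC_CHECKS.items()
        if name not in metrics or not ok(metrics[name])
    ]


if __name__ == "__main__":
    sample = {
        "post_thaw_viability_pct": 82.0, "monocyte_pct_of_pbmc": 18.5,
        "cd8_pct_of_pbmc": 9.0, "cd11c_dc_yield_pct": 55.0,
        "naive_cd8_purity_pct": 94.0, "cd56_cd57_cd45ro_pct": 3.0,
    }
    print(qc_failures(sample) or "all checks passed")
```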

In another embodiment of the method, the sequence of the peptide used to contact the DCs or PBMCs as described above is identified in silico based on sequence analysis of proteins or protein coding regions in cancer and normal cells, followed by conceptual fragmentation of the proteins or putative proteins, docking or binding to an MHC class I or HLA class I complex, and docking or binding a T cell receptor onto a peptide-MHC class I or a peptide-HLA class I complex. In an additional embodiment, CD14+ and CD16+ monocytes and CD8+ T cells are isolated using a magnetic separation method. In a further embodiment, differentiation of the CD14+ and CD16+ monocytes to dendritic cells (DCs) in a DC maturation cytokine cocktail as described above is for 4 days. In one embodiment, the isolated naive CD8+ T cells as described above are maintained in a culture medium comprising IL-7 overnight before mixing with DCs. In one embodiment, the co-culture comprising the DCs and the isolated naive CD8+ T cells lasts about 10 days. In one embodiment, when supplementing the medium with a second cytokine cocktail, the supplementing is every 2 days. In a further embodiment, the second cytokine cocktail comprises IL-7 and IL-15.

In one embodiment, contacting the co-culture supplemented with the second cytokine cocktail with additional peptide-pulsed autologous PBMCs or DCs occurs on about day 10 from start of the co-culture for a duration of about 48 hours. In one embodiment, the inhibitor of cellular transport is brefeldin or equivalent. In one embodiment, treating the cultured cells with an inhibitor of cellular transport prior to analysis of marker(s) of activated CD8+ T cells occurs on about day 12 from start of the co-culture for a duration of about 24 or 48 hours. In one embodiment, examples of the marker(s) of activated CD8+ T cells include, but are not limited to, IFN-γ, CD69, CD62L, CCR7, CD45RO, CD45RA, CD137, IL2 (Interleukin 2), TNF-α (Tumor necrosis factor) and MIP1-β (Macrophage Inflammatory Protein 1 beta). In one embodiment, CD8+ T cells are additionally positive for CD3+ cell surface marker as described in the method above. In a further embodiment, CD8+, CD3+ T cells are quantified for the level of IFN-γ and/or TNF-α activated CD8+ T-cell markers.
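The timing recited in the embodiments above may be laid out, as a rough sketch only, as a day-by-day schedule relative to the start of the co-culture (day 0). The placement of the first supplementation on day 2, the absence of supplementation on the re-stimulation day, and the function name `coculture_schedule` are assumptions made for illustration; the source states only the approximate intervals.

```python
def coculture_schedule(coculture_days: int = 10, supplement_every: int = 2,
                       dc_diff_days: int = 4,
                       restim_hours: int = 48) -> list[tuple[int, str]]:
    """Return (day, event) pairs, with day 0 being the start of the co-culture."""
    events = [
        (-dc_diff_days, "start monocyte differentiation in DC maturation cocktail"),
        (-1, "rest isolated naive CD8+ T cells overnight in medium with IL-7"),
        (0, "start co-culture of peptide-contacted DCs with naive CD8+ T cells"),
    ]
    # Assumed: supplementation begins one interval after co-culture start.
    for day in range(supplement_every, coculture_days, supplement_every):
        events.append((day, "supplement medium with IL-7 and IL-15"))
    events.append((coculture_days,
                   "re-stimulate with peptide-pulsed autologous PBMCs or DCs"))
    events.append((coculture_days + restim_hours // 24,
                   "add transport inhibitor, then analyze activation markers"))
    return sorted(events)


if __name__ == "__main__":
    for day, event in coculture_schedule():
        print(f"day {day:>3}: {event}")
```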

In one embodiment, the method additionally comprises a positive control peptide or a collection of positive control peptides. In a further embodiment, the positive control peptide or a collection of positive control peptides are HLA class I-restricted T-cell epitopes. In one embodiment, the peptide is a mutant peptide from a mutant protein in a cancer cell.

Methods for Identification of CD8+ T cell clones for adoptive T cell therapy

The invention provides methods for identifying CD8+ T cell clones for adoptive T cell therapy for a subject. The method comprises identifying an immunogenic peptide derived from an overexpressed or genetically altered protein from the subject in need by the method described above. The method then comprises contacting the immunogenic peptide identified above with isolated antigen presenting cells or dendritic cells from the subject in need or from an allogeneic subject. Then, the cells obtained above are co-cultured with isolated naive CD8+ T cells from the subject in need or from an allogeneic subject. Moreover, the method comprises detecting presence of marker(s) for activated CD8+ T cells. The method further comprises culturing activated CD8+ T cells so as to obtain a clonal population using a CD3/CD28 stimulus or an allogeneic stimulus using irradiated or mitomycin-treated PBMC or lymphoblastic cells. The invention also provides CD8+ T cell clones identified by the method of the invention.

Methods for Identification of T cell receptor (TCR) recognizing an immunogenic peptide for therapeutic use

The invention provides methods for identifying a T cell receptor (TCR) recognizing an immunogenic peptide for therapeutic use by engineering T cells against cancer. The method comprises identifying an immunogenic peptide derived from an overexpressed or genetically altered protein from a cancer cell by the method described above. The method then comprises contacting the immunogenic peptide identified above with isolated antigen presenting cells or dendritic cells from an autologous subject or from an allogeneic subject. Moreover, the cells obtained above are co-cultured with isolated naive CD8+ T cells from the autologous subject or from an allogeneic subject, so as to activate the CD8+ T cells. The method further comprises expanding clonal populations of T cells using a CD3/CD28 stimulus or an allogeneic stimulus using irradiated or mitomycin-treated PBMC or lymphoblastic cells. Furthermore, the method comprises determining the nucleic acid or protein sequence of the T cell receptor from the activated CD8+ T cells, thereby identifying the T cell receptor (TCR) recognizing an immunogenic peptide for therapeutic use by engineering T cells against cancer.

Methods for Selection of neoepitopes from genetically altered proteins expressed by human cancer cells and/or tissues

The invention provides methods of selecting neoepitopes from genetically altered proteins expressed by human cancer cells and/or tissues. The method comprises calculating the probability of HLA binding with optimal processing sites from a library of mutant cancer peptides.

Additionally, the method comprises calculating the probability of TCR binding to generate a T-cell response. The method then comprises selecting the mutant cancer peptides having the highest probability or a probability above a threshold setting so calculated from above that can modulate the immune response of a human, when challenged with the mutant cancer peptide; wherein, each selected mutant cancer peptide serves as or comprises a neoepitope.
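As a minimal sketch of the selection step above, each candidate peptide may carry a probability of HLA binding and a probability of TCR binding, and peptides are selected either above a threshold or by highest score. Multiplying the two probabilities into one score is an assumption of this example (the source does not state how the two are combined), and the function name and example values are hypothetical.

```python
from typing import Optional


def select_neoepitopes(candidates: dict[str, tuple[float, float]],
                       threshold: Optional[float] = None,
                       top_n: int = 1) -> list[str]:
    """Select peptides by combined score, best first.

    `candidates` maps peptide -> (p_hla_binding, p_tcr_binding).
    If `threshold` is given, return every peptide scoring above it;
    otherwise return the `top_n` highest-scoring peptides.
    """
    # Assumed combination rule: product of the two probabilities.
    scored = sorted(
        ((p_hla * p_tcr, pep) for pep, (p_hla, p_tcr) in candidates.items()),
        reverse=True,
    )
    if threshold is not None:
        return [pep for score, pep in scored if score > threshold]
    return [pep for _, pep in scored[:top_n]]


if __name__ == "__main__":
    # Invented probabilities for hypothetical peptides.
    library = {
        "LLFGYPVYV": (0.9, 0.8),
        "GILGFVFTL": (0.5, 0.5),
        "KKKDDEEKK": (0.2, 0.9),
    }
    print(select_neoepitopes(library, top_n=1))
    print(select_neoepitopes(library, threshold=0.2))
```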

In one embodiment, the mutant cancer peptide(s) is any one or more, two or more, five or more, ten or more, twenty or more, fifty or more, or one hundred or more of the peptides in any of Table 4. In another embodiment, the mutant cancer peptides are any one hundred or fewer of the peptides in any of Table 4.

Method for Selection of a cancer vaccine comprising one or more validated immunogenic peptides to treat a tumor

The invention provides methods of selecting a cancer vaccine comprising one or more validated immunogenic peptides to treat a tumor in a subject. The method comprises obtaining a tumor sample from the subject. The method additionally comprises identifying one or more mutations in expressed genetic material and/or one or more alterations in level of expressed genetic material associated with the tumor. Then, the method comprises predicting immunogenicity of said mutations and/or alteration in level of expressed genetic material associated with the tumor comprising a TCR-binding algorithm. The TCR-binding algorithm comprises peptide(s) of a pre-defined length comprising one or more mutations and/or one or more alterations in level of expressed genetic material associated with the tumor, and selecting and matching features associated with an amino acid at each position of the peptide with selected pre-defined features for each position of peptides recognized by TCR associated with either CD8+ T-cell or CD4+ T-cell, so as to obtain predictive ability of the peptide(s) to interact with the TCR. Moreover, the method comprises validating predicted immunogenic peptide(s) obtained above in a CD4+ and/or CD8+ T-cell activation assay, so as to ensure ability of the peptide(s) to activate CD4+ and/or CD8+ T-cell. Furthermore, the method comprises selecting validated immunogenic peptide(s) that elicit a specific T-cell response. The specific T-cell response comprises monoclonal or polyclonal expansion of T cells. The T-cell response also comprises expression of CD4+ T helper cell markers and/or CD8+ T cell cytolytic markers. The T-cell response further comprises sustainability of active T cells.

In accordance with the practice of the invention, the features associated with an amino acid at each position of the peptide may be physicochemical and/or biological properties of the amino acid. For example, each physicochemical and/or biological property of an amino acid may be assigned a numerical value within the context of other numerical values assigned to other amino acids.
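One way to assign each amino acid a value "within the context of" the values of the other amino acids is to standardize an index across residues, sketched here as a z-score. Standardization by z-score is an assumption chosen for illustration (the source does not name a normalization), and the toy index values are invented.

```python
from statistics import mean, pstdev


def standardize_index(raw_index: dict[str, float]) -> dict[str, float]:
    """Convert raw per-residue values to z-scores across all residues supplied.

    A value of 0.0 means the residue sits at the average of the index;
    positive and negative values lie above and below it, in units of the
    population standard deviation.
    """
    values = list(raw_index.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("index has no variation across residues")
    return {aa: (v - mu) / sigma for aa, v in raw_index.items()}


if __name__ == "__main__":
    # Invented toy values for three residues; a real index covers all twenty.
    toy_index = {"A": 1.0, "G": 2.0, "L": 3.0}
    for aa, z in standardize_index(toy_index).items():
        print(aa, round(z, 3))
```

Standardized values from different indices share a common scale, which makes position-wise comparison across heterogeneous properties more straightforward.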

Suitable examples of pre-defined features in accordance with the invention, include, but are not limited to, one of more of alpha-CH chemical shifts, hydrophobicity index (1), signal sequence helical potential, membrane-buried preference parameters, conformational parameter of inner helix, conformational parameter of beta-structure, conformational parameter of beta-turn, average flexibility indices, residue volume, information value for accessibility - average fraction 35%, information value for accessibility - average fraction 23%, retention coefficient in TFA, retention coefficient in HFBA, transfer free energy to surface, apparent partial specific volume, alpha-NH chemical shifts, alpha-CH chemical shifts, spin-spin coupling constants 3JHalpha-NH, normalized frequency of alpha-helix, normalized frequency of extended structure, steric parameter, polarizability parameter, free energy of solution in water - kcal/mole, Chou-Fasman parameter of the coil conformation, a parameter defined from the residuals obtained from the best correlation of the Chou-Fasman parameter of beta-sheet, number of atoms in the side chain labelled 1+1 , number of atoms in the side chain labelled 2+1 , number of atoms in the side chain labelled 3+1 , number of bonds in the longest chain, a parameter of charge transfer capability, a parameter of charge transfer donor capability, average volume of buried residue, residue accessible surface area in tripeptide, residue accessible surface area in folded protein, proportion of residues 95% buried, proportion of residues 100% buried, normalized frequency of beta-turn - 1 , normalized frequency of alpha-helix, normalized frequency of beta-sheet, normalized frequency of beta-turn - 2, normalized frequency of N-terminal helix, normalized frequency of C-terminal helix, normalized frequency of N-terminal non helical region, normalized frequency of C-terminal non helical region, normalized frequency of N-terminal beta-sheet, normalized frequency of 
C-terminal beta-sheet, normalized frequency of N- terminal non beta region, normalized frequency of C-terminal non beta region, frequency of the 1 st residue in turn, frequency of the 2nd residue in turn, frequency of the 3rd residue in turn, frequency of the 4th residue in turn, normalized frequency of the 2nd and 3rd residues in turn, normalized hydrophobicity scales for alpha-proteins, normalized hydrophobicity scales for beta-proteins, normalized hydrophobicity scales for alpha+beta-proteins, normalized hydrophobicity scales for alpha/beta-proteins, normalized average hydrophobicity scales, partial specific volume, normalized frequency of middle helix, normalized frequency of beta-sheet, normalized frequency of turn, size, amino acid composition, relative mutability, membrane preference for cytochrome b: MPH89, average membrane preference: AMP07, consensus normalized hydrophobicity scale, solvation free energy, atom-based hydrophobic moment, direction of hydrophobic moment, molecular weight, melting point, optical rotation, pK-N, pK-C, hydrophobic parameter pi, graph shape index, smoothed upsilon steric parameter, normalized van der Waals volume, STERIMOL length of the side chain, STERIMOL minimum width of the side chain, STERIMOL maximum width of the side chain, N.M.R. 
chemical shift of alpha-carbon, localized electrical effect, number of hydrogen bond donors, number of full nonbonding orbitals, positive charge, negative charge, pK-a(RCOOH), helix-coil equilibrium constant, helix initiation parameter at position i-1 , helix initiation parameter (at position i, i+1 , and i+2), helix termination parameter (at position j-2, j-1 , and j), helix termination parameter at position j+1 , partition coefficient, alpha-helix indices, alpha-helix indices for alpha-proteins, alpha-helix indices for beta-proteins, alpha-helix indices for alpha/beta-proteins, beta-strand indices, beta-strand indices for beta-proteins, beta-strand indices for alpha/beta-proteins, aperiodic indices, aperiodic indices for alpha-proteins, aperiodic indices for beta-proteins, aperiodic indices for alpha/beta-proteins, hydrophobicity factor, residue volume, composition, polarity, volume, partition energy, hydration number, hydrophilicity value, heat capacity, absolute entropy, entropy of formation, normalized relative frequency of alpha-helix, normalized relative frequency of extended structure, normalized relative frequency of bend, normalized relative frequency of bend R, normalized relative frequency of bend S, normalized relative frequency of helix end, normalized relative frequency of double bend, normalized relative frequency of coil, average accessible surface area, percentage of buried residues, percentage of exposed residues, ratio of buried and accessible molar fractions, transfer free energy, hydrophobicity (1), pK (-COOH), relative frequency of occurrence, relative mutability, amino acid distribution, sequence frequency, average relative probability of helix, average relative probability of beta-sheet, average relative probability of inner helix, average relative probability of inner beta-sheet, flexibility parameter for no rigid neighbors, flexibility parameter for one rigid neighbor, flexibility parameter for two rigid neighbors, Kerr- constant 
increments, net charge, side chain interaction parameter (1), side chain interaction parameter (2), fraction of site occupied by water, side chain volume, hydropathy index, transfer free energy, CHP/water, hydrophobic parameter, distance between C-alpha and centroid of side chain, side chain angle theta(AAR), side chain torsion angle phi(AAAR), radius of gyration of side chain, van der Waals parameter R0, van der Waals parameter epsilon, normalized frequency of alpha-helix with weights, Normalized frequency of beta-sheet with weights, normalized frequency of reverse turn with weights, normalized frequency of alpha-helix (unweighted), normalized frequency of beta-sheet (unweighted), normalized frequency of reverse turn (unweighted), frequency of occurrence in beta-bends, conformational preference for all beta-strands, conformational preference for parallel beta-strands, conformational preference for antiparallel beta-strands, average surrounding hydrophobicity, normalized frequency of alpha-helix, normalized frequency of extended structure, normalized frequency of zeta R, normalized frequency of left-handed alpha- helix, normalized frequency of zeta L, normalized frequency of alpha region, refractivity, retention coefficient in HPLC (pH7.4), retention coefficient in HPLC (pH2.1), retention coefficient in NaC104, retention coefficient in NaH2P04, average reduced distance for C-alpha, average reduced distance for side chain, average side chain orientation angle, effective partition energy, normalized frequency of alpha-helix, normalized frequency of beta-structure, normalized frequency of coil, AA composition of total proteins, SD of AA composition of total proteins, AA composition of mt- proteins, normalized composition of mt-proteins, A A composition of mt-proteins from animal, normalized composition from animal, AA composition of mt-proteins from fungi and plant, normalized composition from fungi and plant, AA composition of membrane proteins, normalized 
composition of membrane proteins, transmembrane regions of non-mt-proteins, transmembrane regions of mt-proteins, ratio of average and computed composition, AA composition of CYT of single-spanning proteins, AA composition of CYT2 of single-spanning proteins, AA composition of EXT of single-spanning proteins, AA composition of EXT2 of single-spanning proteins, AA composition of MEM of single-spanning proteins, AA composition of CYT of multi-spanning proteins, AA composition of EXT of multi-spanning proteins, AA composition of MEM of multi- spanning proteins, 8 A contact number, 14 A contact number, transfer energy, organic solvent/water, average non-bonded energy per atom, short and medium range non-bonded energy per atom, long range non-bonded energy per atom, average non-bonded energy per residue, short and medium range non-bonded energy per residue, optimized beta-structure-coil equilibrium constant, optimized propensity to form reverse turn, optimized transfer energy parameter, optimized average non-bonded energy per atom, optimized side chain interaction parameter, normalized frequency of alpha-helix from LG, normalized frequency of alpha-helix from CF, normalized frequency of beta-sheet from LG, normalized frequency of beta-sheet from CF, normalized frequency of turn from LG, normalized frequency of turn from CF, normalized frequency of alpha- helix in all-alpha class, normalized frequency of alpha-helix in alpha+beta class, normalized frequency of alpha-helix in alpha/beta class, normalized frequency of beta-sheet in all-beta class, normalized frequency of beta-sheet in alpha+beta class, normalized frequency of beta-sheet in alpha/beta class, normalized frequency of turn in all-alpha class, normalized frequency of turn in all-beta class, normalized frequency of turn in alpha+beta class, normalized frequency of turn in alpha/beta class, HPLC parameter, partition coefficient, surrounding hydrophobicity in folded form, average gain in surrounding 
hydrophobicity, average gain ratio in surrounding hydrophobicity, surrounding hydrophobicity in alpha-helix, surrounding hydrophobicity in beta-sheet, surrounding hydrophobicity in turn, accessibility reduction ratio, average number of surrounding residues, intercept in regression analysis, slope in regression analysis x 1.0E1, correlation coefficient in regression analysis, hydrophobicity (2), relative frequency in alpha-helix, relative frequency in beta-sheet, relative frequency in reverse-turn, helix-coil equilibrium constant, beta-coil equilibrium constant, weights for alpha-helix at the window position of -6, weights for alpha-helix at the window position of -5, weights for alpha-helix at the window position of -4, weights for alpha-helix at the window position of -3, weights for alpha-helix at the window position of -2, weights for alpha-helix at the window position of -1, weights for alpha-helix at the window position of 0, weights for alpha-helix at the window position of 1, weights for alpha-helix at the window position of 2, weights for alpha-helix at the window position of 3, weights for alpha-helix at the window position of 4, weights for alpha-helix at the window position of 5, weights for alpha-helix at the window position of 6, weights for beta-sheet at the window position of -6, weights for beta-sheet at the window position of -5, weights for beta-sheet at the window position of -4, weights for beta-sheet at the window position of -3, weights for beta-sheet at the window position of -2, weights for beta-sheet at the window position of -1, weights for beta-sheet at the window position of 0, weights for beta-sheet at the window position of 1, weights for beta-sheet at the window position of 2, weights for beta-sheet at the window position of 3, weights for beta-sheet at the window position of 4, weights for beta-sheet at the window position of 5, weights for beta-sheet at the window position of 6, weights for coil at the window position of -6, 
weights for coil at the window position of -5, weights for coil at the window position of -4, weights for coil at the window position of -3, weights for coil at the window position of -2, weights for coil at the window position of -1, weights for coil at the window position of 0, weights for coil at the window position of 1, weights for coil at the window position of 2, weights for coil at the window position of 3, weights for coil at the window position of 4, weights for coil at the window position of 5, weights for coil at the window position of 6, average reduced distance for C-alpha, average reduced distance for side chain, side chain orientational preference, average relative fractional occurrence in A0(i), average relative fractional occurrence in AR(i), average relative fractional occurrence in AL(i), average relative fractional occurrence in EL(i), average relative fractional occurrence in E0(i), average relative fractional occurrence in ER(i), average relative fractional occurrence in A0(i-1), average relative fractional occurrence in AR(i-1), average relative fractional occurrence in AL(i-1), average relative fractional occurrence in EL(i-1), average relative fractional occurrence in E0(i-1), value of theta(i), value of theta(i-1), transfer free energy from chx to wat, transfer free energy from oct to wat, transfer free energy from vap to chx, transfer free energy from chx to oct, transfer free energy from vap to oct, accessible surface area, energy transfer from out to in (95%buried), mean polarity, relative preference value at N", relative preference value at N', relative preference value at N-cap, relative preference value at N1, relative preference value at N2, relative preference value at N3, relative preference value at N4, relative preference value at N5, relative preference value at Mid, relative preference value at C5, relative preference value at C4, relative preference value at C3, relative preference value at C2, relative preference value at 
C1, relative preference value at C-cap, relative preference value at C', relative preference value at C", information measure for alpha-helix, information measure for N-terminal helix, information measure for middle helix, information measure for C-terminal helix, information measure for extended, information measure for pleated-sheet, information measure for extended without H-bond, information measure for turn, information measure for N-terminal turn, information measure for middle turn, information measure for C-terminal turn, information measure for coil, information measure for loop, hydration free energy, mean area buried on transfer, mean fractional area loss, side chain hydropathy - uncorrected for solvation, side chain hydropathy - corrected for solvation, loss of side chain hydropathy by helix formation, transfer free energy, principal component I, principal component II, principal component III, principal component IV, Zimm-Bragg parameter s at 20 C, Zimm-Bragg parameter sigma x 1.0E4, optimal matching hydrophobicity, normalized frequency of alpha-helix, normalized frequency of isolated helix, normalized frequency of extended structure, normalized frequency of chain reversal R, normalized frequency of chain reversal S, normalized frequency of chain reversal D, normalized frequency of left-handed helix, normalized frequency of zeta R, normalized frequency of coil, normalized frequency of chain reversal, relative population of conformational state A, relative population of conformational state C, relative population of conformational state E, electron-ion interaction potential, bitterness, transfer free energy to lipophilic phase, average interactions per side chain atom, RF value in high salt chromatography, propensity to be buried inside, free energy change of epsilon(i) to epsilon(ex), free energy change of alpha(Ri) to alpha(Rh), free energy change of epsilon(i) to alpha(Rh), polar requirement, hydration potential, principal property value z1, 
principal property value z2, principal property value z3, unfolding Gibbs energy in water (pH7.0), unfolding Gibbs energy in water (pH9.0), activation Gibbs energy of unfolding (pH7.0), activation Gibbs energy of unfolding (pH9.0), dependence of partition coefficient on ionic strength, hydrophobicity (3), bulkiness, polarity, isoelectric point, RF rank, normalized positional residue frequency at helix termini N4', normalized positional residue frequency at helix termini N'", normalized positional residue frequency at helix termini N", normalized positional residue frequency at helix termini N', normalized positional residue frequency at helix termini Nc, normalized positional residue frequency at helix termini N1, normalized positional residue frequency at helix termini N2, normalized positional residue frequency at helix termini N3, normalized positional residue frequency at helix termini N4, normalized positional residue frequency at helix termini N5, normalized positional residue frequency at helix termini C5, normalized positional residue frequency at helix termini C4, normalized positional residue frequency at helix termini C3, normalized positional residue frequency at helix termini C2, normalized positional residue frequency at helix termini C1, normalized positional residue frequency at helix termini Cc, normalized positional residue frequency at helix termini C', normalized positional residue frequency at helix termini C", normalized positional residue frequency at helix termini C'", normalized positional residue frequency at helix termini C4', Delta G values for the peptides extrapolated to 0 M urea, helix formation parameters (delta G), normalized flexibility parameters (B-values) - average, normalized flexibility parameters (B-values) for each residue surrounded by none rigid neighbors, normalized flexibility parameters (B-values) for each residue surrounded by one rigid neighbors, normalized flexibility parameters, free energy in alpha-helical 
conformation, free energy in alpha-helical region, free energy in beta-strand conformation, free energy in beta-strand region, free energy in beta-strand region, free energies of transfer of AcWl-X-LL peptides from bilayer interface to water, thermodynamic beta sheet propensity, turn propensity scale for transmembrane helices, alpha helix propensity of position 44 in T4 lysozyme, p-Values of mesophilic proteins based on the distributions of B values, p-Values of thermophilic proteins based on the distributions of B values, distribution of amino acid residues in the 18 non-redundant families of thermophilic proteins, distribution of amino acid residues in the 18 non-redundant families of mesophilic proteins, distribution of amino acid residues in the alpha-helices in thermophilic proteins, distribution of amino acid residues in the alpha-helices in mesophilic proteins, side-chain contribution to protein stability (kJ/mol), propensity of amino acids within pi-helices, hydropathy scale based on self-information values in the two-state model (5% accessibility), hydropathy scale based on self-information values in the two-state model (9% accessibility), hydropathy scale based on self-information values in the two-state model (16% accessibility), hydropathy scale based on self-information values in the two-state model (20% accessibility), hydropathy scale based on self-information values in the two-state model (25% accessibility), hydropathy scale based on self-information values in the two-state model (36% accessibility), hydropathy scale based on self-information values in the two-state model (50% accessibility), averaged turn propensities in a transmembrane helix, alpha-helix propensity derived from designed sequences, beta-sheet propensity derived from designed sequences, composition of amino acids in extracellular proteins (percent), composition of amino acids in anchored proteins (percent), composition of amino acids in membrane proteins (percent), 
composition of amino acids in intracellular proteins (percent), composition of amino acids in nuclear proteins (percent), surface composition of amino acids in intracellular proteins of thermophiles (percent), surface composition of amino acids in intracellular proteins of mesophiles (percent), surface composition of amino acids in extracellular proteins of mesophiles (percent), surface composition of amino acids in nuclear proteins (percent), interior composition of amino acids in intracellular proteins of thermophiles (percent), interior composition of amino acids in intracellular proteins of mesophiles (percent), interior composition of amino acids in extracellular proteins of mesophiles (percent), interior composition of amino acids in nuclear proteins (percent), entire chain composition of amino acids in intracellular proteins of thermophiles (percent), entire chain composition of amino acids in intracellular proteins of mesophiles (percent), entire chain composition of amino acids in extracellular proteins of mesophiles (percent), entire chain composition of amino acids in nuclear proteins (percent), screening coefficients gamma (local), screening coefficients gamma (non-local), slopes tripeptide - FDPB VFF neutral, slopes tripeptides - LD VFF neutral, slopes tripeptide - FDPB VFF noside, slopes tripeptide FDPB VFF all, slopes tripeptide FDPB PARSE neutral, slopes dekapeptide - FDPB VFF neutral, slopes proteins - FDPB VFF neutral, side-chain conformation by gaussian evolutionary method, amphiphilicity index, volumes including the crystallographic waters using the ProtOr, volumes not including the crystallographic waters using the ProtOr, electron-ion interaction potential values, hydrophobicity scales, hydrophobicity coefficient in RP-HPLC - C18 with 0.1%TFA/MeCN/H2O, hydrophobicity coefficient in RP-HPLC - C8 with 0.1%TFA/MeCN/H2O, hydrophobicity coefficient in RP-HPLC - C4 with 0.1%TFA/MeCN/H2O, hydrophobicity coefficient in RP-HPLC - C18 with 
0.1%TFA/2-PrOH/MeCN/H2O, hydrophilicity scale, retention coefficient at pH 2, modified Kyte-Doolittle hydrophobicity scale, interactivity scale obtained from the contact matrix, interactivity scale obtained by maximizing the mean of correlation coefficient over single-domain globular proteins, interactivity scale obtained by maximizing the mean of correlation coefficient over pairs of sequences sharing the TIM barrel fold, linker propensity index, knowledge-based membrane-propensity scale from 1D_Helix in MPtopo databases, knowledge-based membrane-propensity scale from 3D_Helix in MPtopo databases, linker propensity from all dataset, linker propensity from 1-linker dataset, linker propensity from 2-linker dataset, linker propensity from 3-linker dataset, linker propensity from small dataset, linker propensity from medium dataset, linker propensity from long dataset, linker propensity from helical, linker propensity from non-helical (annotated by DSSP) dataset, stability scale from the knowledge-based atom-atom potential, relative stability scale extracted from mutation experiments, buriability, linker index, mean volumes of residues buried in protein interiors, average volumes of residues, hydrostatic pressure asymmetry index - PAI, hydrophobicity index (2), average internal preferences, hydrophobicity-related index, apparent partition energies calculated from Wertz-Scheraga index, apparent partition energies calculated from Robson-Osguthorpe index, apparent partition energies calculated from Janin index, apparent partition energies calculated from Chothia index, hydropathies of amino acid side chains - neutral form, hydropathies of amino acid side chains - pi-values in pH 7.0, weights from the IFH scale, hydrophobicity index 3.0 pH, scaled side chain hydrophobicity values, hydrophobicity scale from native protein structures, NNEIG index, SWEIG index, PRIFT index, PRILS index, ALTFT index, ALTLS index, TOTFT index, TOTLS index, relative partition energies derived 
by the Bethe approximation, optimized relative partition energies - method A, optimized relative partition energies - method B, optimized relative partition energies - method C, optimized relative partition energies - method D, hydrophobicity index (3) and hydrophobicity index (4) and combinations thereof. In a preferred embodiment, pre-defined features comprise any one or more of polar, non-polar, hydrophobic, helix/turn motif, β-sheet structure motif, charge of main chain, charge of side chain, solvent accessibility of an amino acid, spatial flexibility of the main chain and spatial flexibility of side chain of an amino acid. In one preferred embodiment of the invention, the peptide variant(s) with a pre-defined length is 9 amino acids long and pre-defined features comprise any one or more of polar, non-polar, hydrophobic, helix/turn motif, β-sheet structure motif, charge of main chain, charge of side chain, solvent accessibility of an amino acid, spatial flexibility of the main chain and spatial flexibility of side chain of an amino acid. In one embodiment of the invention, the pre-defined features comprise hydrophobic and helix/turn motif.

In another preferred embodiment of the invention, the peptide variant(s) with a pre-defined length and pre-defined features comprise at least hydrophobic and helix/turn motif. For example, the peptide variant(s) with a pre-defined length may be 9 amino acids long and pre-defined features comprise hydrophobic and helix/turn motif.

In one embodiment of the method, obtaining a tumor sample described above comprises any of a biopsied material from the tumor, a biological fluid derived from a subject afflicted with a tumor, a stool, a skin cell, a genetic material obtained from a subject afflicted with a tumor and a genetic material derived from a tumor. In a further embodiment, the biopsied material is any of tumor cell(s), a tumor tissue and a tumor organ. In another embodiment of the method, the biological fluid is any of blood, plasma, saliva, secretion, sweat, semen and urine.

In one embodiment, the tumor is benign or malignant. In a further embodiment, the malignant tumor is cancer.

In another embodiment, the genetic material is DNA, RNA or a combination thereof. In a further embodiment, the DNA is genomic DNA, chromosomal DNA, mitochondrial DNA, extrachromosomal DNA, viral DNA or a combination thereof. In yet a further embodiment, the RNA is cellular RNA, viral RNA, mRNA, mtRNA or a combination thereof. In one embodiment, the expressed genetic material is a protein. In a further embodiment, the protein is a mutant protein, a viral protein, an over-expressed protein or a combination thereof. In yet another embodiment, the mutant protein comprises an amino acid substitution, an amino acid deletion, an amino acid insertion or a combination thereof. Further, in one embodiment, the mutant protein is expressed at a level similar or comparable to wild-type protein level. In another embodiment, the mutant protein is not over-expressed when compared to wild-type protein level.

In one embodiment, the viral protein or over-expressed protein is an oncoprotein.

In another embodiment, the over-expressed protein is a mutant protein. In a further embodiment, the mutant protein is expressed at a level higher than wild-type protein level.

In one embodiment, the protein prior to over-expression is not immunogenic in vivo or in the subject.

In another embodiment, the over-expressed protein prior to over-expression is not immunogenic in vivo or in the subject. In yet another embodiment, one or more peptides derived from the over-expressed protein are predicted to be immunogenic as described above. In a further embodiment, one or more peptides derived from the over-expressed protein and predicted to be immunogenic are validated to be immunogenic by the method described above. In one embodiment, the pre-defined length is or comprises a peptide length bound by HLA class I or MHC class I protein. In a further embodiment, the peptide length bound by HLA class I or MHC class I protein is any of 8-mer, 9-mer, and 10-mer peptide.

In another embodiment, the pre-defined length is or comprises a peptide length bound by HLA class II or MHC class II protein. In a further embodiment, the peptide length bound by HLA class II or MHC class II protein is any of 14-mer, 15-mer, 16-mer and 17-mer peptide. In another embodiment, the pre-defined length is any of 8, 9, 10, 11, 12, 13, 14, 15, 16 and 17 amino acids long. In another embodiment, the pre-defined length is 9 amino acids long. In a further embodiment, the 9-amino acid long peptide is bound by HLA class I or MHC class I protein. In one embodiment, the features comprise physicochemical features of amino acids. In a further embodiment, the physicochemical features are selected from an amino acid index. In another embodiment, the physicochemical features comprise features from an amino acid index. In one further embodiment, the amino acid index is AAindex section of Amino Acid Index database or its equivalent. In another embodiment, the AAindex section of Amino Acid Index Database is AAindex1. In a further embodiment, the AAindex1 is version 9.2, which comprises 566 amino acid indices. In one embodiment, the features further comprise one or more PepLib descriptor(s). In another embodiment, the features further comprise a peptide-processing feature.
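By way of illustration only, the generation of peptide variants of a pre-defined length around a mutated residue may be sketched as follows. The function name `mutant_windows`, the 0-based position convention and the example sequence are assumptions made for this sketch and are not part of the disclosed method; only the class I and class II length sets are taken from the embodiments above.

```python
# Illustrative sketch: enumerate every peptide of a pre-defined length
# that contains a given mutated residue. Names are hypothetical.

CLASS_I_LENGTHS = (8, 9, 10)          # lengths named above for HLA/MHC class I
CLASS_II_LENGTHS = (14, 15, 16, 17)   # lengths named above for HLA/MHC class II


def mutant_windows(protein: str, mut_pos: int, length: int = 9) -> list[str]:
    """Return all windows of `length` residues covering 0-based index `mut_pos`."""
    if not 0 <= mut_pos < len(protein):
        raise ValueError("mut_pos is outside the protein sequence")
    first = max(0, mut_pos - length + 1)          # earliest start still covering mut_pos
    last = min(mut_pos, len(protein) - length)    # latest start that fits in the protein
    return [protein[s:s + length] for s in range(first, last + 1)]


if __name__ == "__main__":
    seq = "ACDEFGHIKLMNPQRSTVWY"   # toy sequence, not a real protein
    for pep in mutant_windows(seq, mut_pos=10, length=9):
        print(pep)
```

A mutation far from either terminus yields as many candidate peptides as the chosen length (nine 9-mers here); fewer are produced near the ends of the protein.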

In one embodiment of the method, selecting and matching features associated with an amino acid at each position of the peptide comprises numerical values associated with amino acid physicochemical properties and rules that specify a range of parameters that define an identity of each amino acid at each position of the peptide.
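A minimal sketch of such position-wise numerical encoding and range rules is given below. It assumes two example indices: the published Kyte-Doolittle hydropathy scale and a deliberately simplified side-chain charge table introduced here only for illustration. The function names, the rule format and the example ranges are hypothetical; the actual features and ranges used by the method are those selected as described elsewhere in this specification.

```python
# Illustrative sketch: encode each peptide position with amino acid index
# values and test the peptide against per-position range rules.

KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Simplified charge table (assumption for this sketch): K/R = +1, D/E = -1.
SIMPLE_CHARGE = {aa: 0.0 for aa in KYTE_DOOLITTLE}
SIMPLE_CHARGE.update({"K": 1.0, "R": 1.0, "D": -1.0, "E": -1.0})

# Each index assigns one numerical value to each of the 20 amino acids.
INDICES = {"hydropathy": KYTE_DOOLITTLE, "charge": SIMPLE_CHARGE}


def encode(peptide, indices=INDICES):
    """Return one feature vector per position, in the order of `indices`."""
    return [[indices[name][aa] for name in indices] for aa in peptide]


def satisfies(peptide, rules, indices=INDICES):
    """Check rules of the form {(position, index_name): (low, high)}."""
    return all(
        low <= indices[name][peptide[pos]] <= high
        for (pos, name), (low, high) in rules.items()
    )


if __name__ == "__main__":
    # Hypothetical rules: hydrophobic residue at position 1, no negative charge at position 9.
    rules = {(0, "hydropathy"): (1.0, 5.0), (8, "charge"): (0.0, 1.0)}
    print(encode("LLDVPTAAK")[0])
    print(satisfies("LLDVPTAAK", rules))
```

With 109 indices in place of the two shown, the same encoding would produce a 9 x 109 matrix for a 9-mer.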

In a further embodiment, the features comprise any of a relative preference value at N2 (Richardson-Richardson), weights for alpha-helix at the window position of 0 (Qian-Sejnowski), activation Gibbs energy of unfolding, a combination of surface area and partial charge, relative population of conformational state A (Vasquez et al.), information measure for turn (Robson-Suzuki), amino acid composition of CYT of multi-spanning proteins (Nakashima-Nishikawa), weights for coil at the window position of 6 (Qian-Sejnowski), weights for coil at the window position of 5 (Qian-Sejnowski), number of atoms in the side chain labelled 1+1 (Charton-Charton), activation Gibbs energy of unfolding, amphiphilicity index (Mitaku et al.), a combination of surface area and partial charge, and average weighted atomic number or degree based on atomic number in the graph (Karkbara-Knisley) or a combination thereof.

In one embodiment, the selected pre-defined features comprise hydrophobic and helix/turn motif. In a further embodiment, the pre-defined features further comprise one or more of polar, non-polar, β-sheet structure motif, charge of main chain, charge of side chain, solvent accessibility of an amino acid, spatial flexibility of the main chain and spatial flexibility of side chain of an amino acid. In another embodiment of the method, the selected pre-defined features for each position of peptides recognized by TCR comprise a combination of numerical indices representing physicochemical and biochemical properties of amino acids and pairs of amino acids. In one embodiment, the features additionally comprise physicochemical and biochemical properties of amino acids and pairs of amino acids. In another embodiment, the selected pre-defined features additionally comprise PepLib descriptors. In a further embodiment, the selected pre-defined features further comprise a peptide-processing feature.

In an embodiment, the selected pre-defined features for each position of peptides recognized by TCR comprise features selected from AAindex1 or equivalent. Further, the features selected from AAindex1 or equivalent may be 60 or more features, 140 or fewer features, between 60 and 140 features, or comprise 109 features. In one embodiment, the selected pre-defined features for each position of peptides recognized by TCR comprise features selected from PepLib descriptors or equivalent. In a further embodiment, the features selected from PepLib descriptors or equivalent may be 40 or fewer features, 5 or more features, between 5 and 40 features, or comprise 16 or 24 features. In another embodiment, the method further comprises 433 or fewer features selected from AAindex1 or equivalent, 40 or fewer features selected from PepLib descriptors or equivalent, and one or more peptide-processing feature(s) or a combination thereof.

T-cell activation assay

In one embodiment of the method where immunogenic peptides are validated as described above, a T-cell activation assay of the invention comprises contacting an antigen presenting cell with the predicted immunogenic peptide in vitro wherein the antigen presenting cell expresses HLA or MHC protein restricted in binding to the predicted immunogenic peptide described above. Then, the T-cell activation assay further comprises co-culturing the peptide-pulsed antigen presenting cells with naive CD4+ or CD8+ T-cell free in a standard dendritic cell cocktail. Moreover, the T-cell activation assay may additionally comprise supplementing the co-culture media with a fresh cytokine cocktail comprising IL-7 and IL-15. Further, the T-cell activation assay may comprise re-stimulating T-cells with peptide-pulsed antigen presenting cells or peptide-pulsed PBMCs. In another embodiment of the invention, the T-cell activation assay may further comprise contacting cells described above with a cell transport inhibitor. Examples of the cell transport inhibitor include, but are not limited to, brefeldin A and monensin or a combination thereof. In another embodiment, the cell transport inhibitor is or comprises brefeldin A. In yet another embodiment, the cells may be in contact with a cell transport inhibitor for a period of about 6 hrs.

In accordance with the invention, the antigen presenting cell is a dendritic cell, a B cell or any cell with an antigen-presenting function. In one embodiment, the dendritic cell may be obtained by isolating CD14+ and CD16+ monocytes from peripheral blood mononuclear cells (PBMCs) differentiated in vitro using a cytokine cocktail comprising GMCSF, IL4 and IFN-γ for 4 days. Further, in another embodiment, the dendritic cell may express CD11c. In a yet further embodiment, the antigen-presenting cell(s) may express CD11c, CD83 and/or CD86.

In specific embodiments, greater than 40% of isolated CD14+/CD16+ monocytes may differentiate into CD11c dendritic cells. In another embodiment, the peripheral blood mononuclear cells (PBMCs) may be obtained from a healthy subject or an individual with cancer. In a further embodiment, the naive CD8+/CD4+ T-cell may be isolated from PBMCs obtained from a healthy subject or an individual with cancer. In a further embodiment, the healthy subject is a healthy human subject. In another embodiment, the individual with cancer is a human subject (child or adult).

In a specific embodiment, the PBMCs may comprise greater than about 15% but less than or equal to about 30% CD14+ and CD16+ monocytes and greater than 7% but less than or equal to 12% CD8+ T-cells.

In an embodiment, the isolated CD8+ or CD4+ T-cells may be greater than about 90% pure and deficient in cells with CD56, CD57 and CD45RO markers. In another embodiment, CD45RO+ memory T cells may comprise less than 10% of total CD8+ T cells. In another embodiment of the invention, the co-culture of the peptide-pulsed antigen presenting cells with naive CD4+ or CD8+ T-cell free as described above may be devoid or free of memory T-cells. In another embodiment, the co-culture medium as described above may be supplemented with IL-7 and IL-15, e.g., every 2 days.

For example, in one embodiment, the PBMCs may be obtained from blood of a healthy subject or an individual with cancer and frozen prior to use in the T-cell activation assay. In another embodiment, the PBMC and T cell viability after thawing frozen PBMCs may be greater than 70%. In yet another embodiment, the standard dendritic cell cocktail may contain or comprise GMCSF, IL4 and IFN-γ. In a specific embodiment, the cytokine cocktail comprising IL-7 and IL-15 used to supplement the co-culture media is replaced every 2 days. In another embodiment, re-stimulating T cells as described above is for an additional 24 or 48 hrs. In a further embodiment, the antigen presenting cell, PBMC and T cell may be autologous. In another embodiment, the method further comprises analyzing intracellular expression of IFN-γ, TNF-α, GZMB, IL2 and/or CD69. In an embodiment of the invention, determining the ability of the peptide(s) to activate CD4+ and/or CD8+ T-cells comprises analyzing expression of CD4+ and/or CD8+ T-cell markers. Examples of the markers of activated CD4+ T-cells may include, but are not limited to, IFN-γ, IL-2, TNF-α, LT-α, CXCL12, STAT1, STAT4 and T-bet, and/or a combination thereof. Examples of the markers of activated CD8+ T-cells may include, but are not limited to, one or more of IFN-γ, IL-2, TNF-α, CD69, Perforin, Granzyme A, Granzyme B, Granulysin, Fas L and CD107a.
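The numerical ranges recited in the preceding embodiments (monocyte and CD8+ T-cell content of PBMCs, purity of isolated T-cells, CD45RO memory content and post-thaw viability) could, for example, be checked programmatically before an assay is run. The following is a hedged sketch only: the class `DonorSample`, its field names and the function `qc_failures` are hypothetical, and the "about" qualifiers of the embodiments are represented here as hard cut-offs purely for simplicity.

```python
# Illustrative sketch: screen a donor sample against the ranges recited
# in the embodiments above. All names are hypothetical.
from dataclasses import dataclass


@dataclass
class DonorSample:
    monocyte_pct: float             # CD14+ and CD16+ monocytes, % of PBMCs
    cd8_pct: float                  # CD8+ T-cells, % of PBMCs
    tcell_purity_pct: float         # purity of isolated CD4+ or CD8+ T-cells
    memory_pct_of_cd8: float        # CD45RO memory cells, % of total CD8+ T cells
    post_thaw_viability_pct: float  # viability after thawing frozen PBMCs


def qc_failures(s: DonorSample) -> list[str]:
    """Return the names of the criteria the sample does not meet."""
    checks = {
        "monocytes >15% and <=30%": 15.0 < s.monocyte_pct <= 30.0,
        "CD8+ T-cells >7% and <=12%": 7.0 < s.cd8_pct <= 12.0,
        "T-cell purity >90%": s.tcell_purity_pct > 90.0,
        "memory T cells <10% of CD8+": s.memory_pct_of_cd8 < 10.0,
        "post-thaw viability >70%": s.post_thaw_viability_pct > 70.0,
    }
    return [name for name, passed in checks.items() if not passed]


if __name__ == "__main__":
    sample = DonorSample(22.0, 9.5, 94.0, 4.0, 81.0)  # made-up values
    print(qc_failures(sample) or "all criteria met")
```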

In one embodiment of the method, monoclonal or polyclonal expansion of T cells as described above comprises clonotype identification and/or TCR repertoire analysis. In a further embodiment the clonotype identification and/or TCR repertoire analysis comprises determination of expanded T-cell population nucleic acid sequence. In another embodiment of the invention the nucleic acid sequence is determined for genomic DNA or RNA transcripts for T-cell receptors. Also, in an embodiment of the invention, CD8+ T cell in step (e)(ii) may express any of IFN-γ, IL-2, TNF-α, CD69, Perforin, Granzyme A, Granzyme B, Granulysin, Fas L, CD107a or a combination thereof. Accordingly, in one embodiment, sustainability of active T cells as described above may be marked by a lack of anergy and/or exhaustion T cell markers or continued expression of effector cytokines of CD4+ T helper cells or CD8+ cytolytic T cells. Examples of the anergy and/or exhaustion T cell markers for CD8+ T cells may include, but are not limited to, any one or more of CTLA-4, PD-1, Eomes, CD160, TIGIT, ENTPD1, MYO7A, PHLDA1, LAG-3, 2B4, BTLA, TIM3, VISTA and CD96.
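As a non-limiting illustration of clonotype identification from sequenced T-cell receptor transcripts, the sketch below counts identical CDR3 sequences and labels the expansion as monoclonal or polyclonal. The use of CDR3 amino acid strings as the clonotype key, the `min_fraction` cut-off and the example sequences are assumptions of this sketch; a count of two or more expanded clonotypes is treated as polyclonal.

```python
# Illustrative sketch: classify T-cell expansion from TCR sequence reads.
# Function names, the frequency cut-off and the reads are hypothetical.
from collections import Counter


def expanded_clonotypes(cdr3_reads, min_fraction=0.05):
    """Map each clonotype to its read fraction, keeping those at or above the cut-off."""
    counts = Counter(cdr3_reads)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {seq: n / total for seq, n in counts.items() if n / total >= min_fraction}


def classify_expansion(cdr3_reads, min_fraction=0.05):
    """Return 'none', 'monoclonal' or 'polyclonal' for the expanded population."""
    n_clones = len(expanded_clonotypes(cdr3_reads, min_fraction))
    if n_clones == 0:
        return "none"
    return "monoclonal" if n_clones == 1 else "polyclonal"


if __name__ == "__main__":
    reads = ["CASSLGQAYEQYF"] * 6 + ["CASSPGTGGNEQFF"] * 3 + ["CASSQDRGYGYTF"]
    print(classify_expansion(reads, min_fraction=0.2))
```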

Examples of the effector cytokines may include, but are not limited to, IFN-γ, IL-2 and TNF-α, and/or a combination thereof. Also, in an embodiment of the invention, the selected validated immunogenic peptide(s) as described above comprise polyclonal expansion of T cells with 2 or more vaccine specific CD4+ or CD8+ T cell clones. In some specific embodiments, the vaccine specific CD4+ T cell clones possess T-helper function. In another embodiment, the vaccine specific CD8+ T cell clones possess cytolytic activity. Examples of CD8+ T-cell cytolytic markers may include, but are not limited to, IFN-γ, IL-2, TNF-α, CD69, Perforin, Granzyme A, Granzyme B, Granulysin, Fas L and CD107a, and/or a combination thereof. The invention provides a cocktail of cancer vaccines selected by the method of selecting a cancer vaccine comprising one or more validated immunogenic peptides to treat a tumor in a subject. The invention also provides a cocktail of cancer vaccines selected by the method of selecting a cancer vaccine from genetically altered protein(s) expressed by a mammalian cancer cell and/or tissue as described above. In one embodiment, each numerical index consists of 20 numerical values corresponding to about 20 amino acids with each amino acid assigned a numerical value. The selected pre-defined features may comprise about 10, 20, 30, 40, 50, 60 or more numerical indices. The selected pre-defined features may comprise less than about 120, 140, 160 or 200 numerical indices.

Methods for Selection of a cancer vaccine cocktail comprising one or more validated immunogenic peptide to treat a tumor

The invention also provides methods of selecting a cancer vaccine cocktail comprising one or more validated immunogenic peptide to treat a tumor in a subject. In one embodiment, the method comprises use of an algorithm based on frequency of occurrence of mutant allele for one or more genetically altered protein associated with the tumor in a population.

The method may also comprise use of an algorithm based on: positive prediction of the validated immunogenic peptide to be bound by TCR, HLA or MHC binding affinity of the validated immunogenic peptide, quality of proteasomal processing of the validated immunogenic peptide derived from mutant protein, quality of TAP transporter binding of the validated immunogenic peptide derived from mutant protein, positive result in T cell activation assay, magnitude of T cell activation, monoclonal and polyclonal T-cell amplification response, functional competence of T cells by expression of T-helper markers or CTL markers, lack of anergic and/or exhaustion markers for T cells, and/or a combination thereof.

In one embodiment of the method described above, the positive prediction of the validated immunogenic peptide to be bound by TCR comprises a TCR-binding algorithm. The TCR-binding algorithm comprises peptide(s) of a pre-defined length comprising one or more mutations and/or one or more alterations in level of expressed genetic material associated with the tumor.

Additionally, the TCR-binding algorithm also comprises selecting and matching features associated with an amino acid at each position of the peptide with selected pre-defined features for each position of peptides recognized by TCR associated with either CD8+ T-cell or CD4+ T-cell, so as to obtain predictive ability of the peptide(s) to interact with the TCR. The features described above comprise physicochemical features of amino acids, and the physicochemical features are selected from an amino acid index, such as AAindex1 section of Amino Acid Index database or its equivalent.

In yet a further embodiment of the invention, the method comprises use of an algorithm comprising: frequency of occurrence of mutant allele for one or more genetically altered protein associated with the tumor in a population; HLA or MHC binding affinity of the validated immunogenic peptide; quality of proteasomal processing of the validated immunogenic peptide derived from mutant protein; quality of TAP transporter binding of the validated immunogenic peptide derived from mutant protein; magnitude of T cell activation; monoclonal and polyclonal T-cell amplification response; functional competence of T cells by expression of T-helper markers or CTL markers; and/or a combination thereof. In a specific embodiment, the method further comprises lack of anergic and/or exhaustion markers for T cells.

In the method of selecting a cancer vaccine cocktail comprising one or more validated immunogenic peptide to treat a tumor in a subject comprising use of an algorithm, the frequency of occurrence of mutant allele for one or more genetically altered protein associated with the tumor in a population is based on exome and/or transcriptome data. In some other embodiments of the method, the frequency of occurrence of mutant allele for one or more genetically altered protein associated with the tumor in a population is based on proteomic data.

In preferred embodiments of the method of selecting a cancer vaccine cocktail comprising one or more validated immunogenic peptide to treat a tumor in a subject comprising use of an algorithm, the magnitude of T cell activation comprises determining percent of antigen-specific T cells producing activation markers. In one embodiment of the invention, the method further comprises determining magnitude of activation marker expressed or produced by the percent of antigen-specific T cells producing activation markers. In another embodiment of the invention, the magnitude of T cell activation favourable toward a peptide's inclusion in the cocktail comprises a greater percent of antigen-specific T cells producing activation markers but at a moderate or low level of expression in expressing cells.

The invention further provides for a method wherein the antigen-specific T cells are or comprise CD4+ or CD8+ T cells binding the validated immunogenic peptide. In one embodiment, the antigen-specific T cells producing activation markers are activated CD4+ T-cells producing markers selected from the group consisting of IFN-γ, IL-2, TNF-α, LT-α, CXCL12, STAT1, STAT4 and T-bet and a combination thereof. In yet another further embodiment, the antigen-specific T cells producing activation markers are activated CD8+ T-cells producing markers selected from the group consisting of IFN-γ, IL-2, TNF-α, CD69, Perforin, Granzyme A, Granzyme B, Granulysin, Fas L and CD107a and a combination thereof. Also, in an embodiment of the invention, the activation markers are selected from the group consisting of IFN-γ, TNF-α and a combination thereof.

In some embodiments of the method, the monoclonal and polyclonal T-cell amplification response is directed or skewed toward polyclonal expansion of T cells with 2 or more vaccine specific T cell clones. The invention provides for a method wherein the functional competence of T cells by expression of T-helper markers comprises expression of IFN-γ, IL-2, TNF-α, LT-α, CXCL12, STAT1, STAT4 and T-bet or a combination thereof. In other embodiments of the method, the functional competence of T cells by expression of CTL markers comprises expression of IFN-γ, IL-2, TNF-α, CD69, Perforin, Granzyme B, Granulysin, Fas L and CD107a or a combination thereof. In other embodiments of the method, the functional competence of T cells by expression of CTL markers comprises expression of IFN-γ, TNF-α or a combination thereof.

Examples of the anergic and/or exhaustion markers for T cells may include, but are not limited to, CTLA-4, PD-1, Eomes, CD160, TIGIT, ENTPD1, MYO7A, PHLDA1, LAG-3, 2B4, BTLA, TIM3, VISTA and CD96, and/or a combination thereof. In some embodiments of the method, the algorithm is favourable or skewed toward selection of validated immunogenic peptides with characteristics comprising: a) a polyclonal T cell amplification response; b) a greater percent of antigen-specific T cells producing activation markers; c) a moderate or low expression of activation markers by expressing T cells; d) free or deficient in anergic and/or exhaustion markers for T cells.
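One way to picture the selection skew in characteristics a) to d) is a simple ranking over assay results. The sketch below is illustrative only: the record fields, the cutoff for "moderate or low" marker expression, and the tie-breaking rule are all hypothetical, and only the threshold of 2 or more clones for a polyclonal response comes from the text.

```python
from dataclasses import dataclass

# Hypothetical cutoff on a 0..1 scale separating "moderate or low"
# per-cell marker expression from "high" expression.
HIGH_EXPRESSION_CUTOFF = 0.66


@dataclass
class PeptideAssay:
    name: str
    n_clones: int              # expanded vaccine-specific T cell clones
    pct_activated: float       # % antigen-specific T cells producing markers
    marker_level: float        # per-cell marker expression, scaled 0..1
    exhaustion_positive: bool  # anergic/exhaustion markers detected


def criteria_met(p):
    """Count how many of criteria a), c) and d) a peptide satisfies."""
    return sum([
        p.n_clones >= 2,                          # a) polyclonal response
        p.marker_level < HIGH_EXPRESSION_CUTOFF,  # c) moderate/low expression
        not p.exhaustion_positive,                # d) no exhaustion markers
    ])


def rank_peptides(peptides):
    """Order peptides by criteria met, then by criterion b), the percent
    of antigen-specific T cells producing activation markers."""
    return sorted(
        peptides,
        key=lambda p: (criteria_met(p), p.pct_activated),
        reverse=True,
    )


candidates = [
    PeptideAssay("pep_A", 5, 40.0, 0.3, False),
    PeptideAssay("pep_B", 1, 60.0, 0.9, True),
    PeptideAssay("pep_C", 3, 55.0, 0.4, False),
]
print([p.name for p in rank_peptides(candidates)])
```

A weighted score would serve equally well; the lexicographic key is used here only because the text gives no relative weights.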

In some embodiments of the method, the selected cancer vaccine produces an immunogenic response comprising a polyclonal T cell amplification of 5 or more T cell clones.

In a preferred embodiment of the invention, the T cell clones are activated CD8+ T cells.

According to embodiments of the invention, where the features comprise one or more PepLib descriptor(s), the PepLib descriptors are described in a PepLib descriptor package in R platform for analysis of a peptide sequence library. In a further embodiment, the peptide sequence library comprises TCR binding and non-TCR binding peptides.

In specific embodiments, the peptide sequence library consists of or comprises at least about 116 non-TCR binding peptides and at least about 307 TCR-binding peptides. In a further embodiment, the peptide sequence library is analyzed to obtain physicochemical descriptors.

Accordingly, in one embodiment, where the features comprise one or more PepLib descriptor(s), the features selected from PepLib descriptors or equivalent are or comprise no less than 16 descriptors out of 56 descriptors. In one embodiment, the features selected from PepLib descriptors or equivalent are or comprise no less than 24 descriptors out of 56 descriptors. In yet a further embodiment, the features selected from PepLib descriptors or equivalent are or comprise no more than 40 descriptors out of 56 descriptors. In one embodiment, where 15 features are selected from PepLib descriptors or equivalent, the 15 features obtained comprise any of the features from (Table 2).

Method for obtaining Minimal Gene Expression Signature

The invention provides a method for obtaining a minimal gene expression signature associated with a specific immune cell type and/or subtype that distinguishes the specific immune cell type and/or subtype from other immune cell types and/or subtypes. The method comprises: (a) obtaining a plurality of samples from a plurality of subjects (one or more sample from one or more subject); (b) determining gene expression of the specific immune cell type and/or subtype from the samples; (c) determining gene expression of other immune cell types and/or subtypes from the samples; (d) comparing the gene expression of (b) with (c) so as to identify for each immune cell type and/or subtype, the highest gene expression within each immune cell type and/or subtype but having greatest variance in gene expression between different immune cell types and/or subtypes; (e) selecting genes so identified in (d) with low plasticity of expression so as to reflect consistent gene expression or lowest variance in gene expression within each immune cell type and/or subtype; (f) validating utility of the selected genes from (e) for ability to discriminate cognate immune cell type and/or subtype from non-cognate immune cell type, and validating gene expression signature as a minimal gene expression signature consisting of a minimal set of genes with greatest difference in differentiating cognate from non-cognate immune cell type and/or subtypes; and (g) optionally, changing composition of the selected genes in (f) following discovery of an improved smaller subset of selected genes selected from (f) during validation in (f).

The method further comprises removing genes not showing significant expression in transcriptome data from isolated pure immune cells. In some embodiments, the method comprises removing genes lacking functional role in cognate immune cell type and/or subtype for which the immune signature is intended. In one embodiment of the invention, step (d) as described above comprises an average rank score of gene expression in a given cell type and/or subtypes. In another embodiment, step (e) as described above comprises a marker evaluation score for assessing gene expression across samples under different experimental conditions.
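The average rank score mentioned for step (d) is not defined in this section. One plausible reading, sketched here as an assumption, is to rank genes by expression within each sample of a cell type and then average each gene's rank across samples. The function name and input layout are hypothetical.

```python
from statistics import mean


def average_rank_scores(samples):
    """Compute a per-gene average rank for one cell type.

    samples: list of {gene: expression} dicts, one per sample, all
    covering the same genes. Rank 1 is the highest expression in a
    sample. Ties keep the dict's insertion order (a simplification).
    """
    ranks = {gene: [] for gene in samples[0]}
    for sample in samples:
        ordered = sorted(sample, key=sample.get, reverse=True)
        for rank, gene in enumerate(ordered, start=1):
            ranks[gene].append(rank)
    return {gene: mean(r) for gene, r in ranks.items()}


# Toy expression values for three genes in two samples of one cell type.
b_cell_samples = [
    {"BLK": 10.0, "FCER2": 5.0, "CD14": 1.0},
    {"BLK": 8.0, "FCER2": 9.0, "CD14": 2.0},
]
print(average_rank_scores(b_cell_samples))
```

Genes with a low average rank in the cognate cell type and a high average rank elsewhere would be candidates for the signature; a rank-based score has the practical advantage of being insensitive to between-sample scaling.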

Validation methods of the selected genes and validation methods of gene expression signature as a minimal gene expression signature

In some embodiments, validating utility of the selected genes and validating gene expression signature as a minimal gene expression signature in step (f) as described above comprise (a) computing a series of immune scores following removal of none, one gene or multiple genes from the genes selected in step (e) on RNA transcripts isolated from cognate and non-cognate immune cells; and (b) comparing the immune scores so obtained, so as to identify the set or subset of genes yielding greatest difference or greatest average difference between immune scores of cognate and non-cognate immune cells. The method further comprises, upon finding greatest difference or greatest average difference between immune scores of cognate and non-cognate cells to belong to a subset of genes, iteratively repeating above two steps (a)-(b) by replacing the genes selected in step (e) with the smaller subset of genes obtained following comparison in step (b) above until a smallest subset of genes is identified upon which removal of any one gene from a gene expression signature results in loss of the greatest difference or greatest average difference between immune score of cognate cells and non-cognate cells, and designating said identified subset of genes as the minimal gene expression signature associated with a specific immune cell type and/or subtype that distinguishes the specific immune cell type and/or subtype from other immune cell types and/or subtypes. The method alternatively comprises, upon not finding any subset of genes obtained from the selected genes in step (e) to produce a greater difference or greater average difference between immune score of cognate and non-cognate cells, designating said selected genes in step (e) as the minimal gene expression signature associated with a specific immune cell type and/or subtype that distinguishes the specific immune cell type and/or subtype from other immune cell types and/or subtypes.
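The iterative validation described above resembles a backward elimination, which can be sketched as follows. Two simplifications are assumed and are not from the source: the immune score is taken to be the mean expression of the signature genes (the text elsewhere describes a normalized GSEA-based score), and only single-gene removals are tried at each round. The `min_size` floor of 2 reflects the statement that a signature has 2 or more genes.

```python
from statistics import mean


def immune_score(sample, genes):
    """Toy immune score: mean expression of the signature genes.
    (Stand-in for a normalized GSEA-based score; an assumption.)"""
    return mean(sample[g] for g in genes)


def separation(genes, cognate, non_cognate):
    """Average immune-score difference between cognate and non-cognate cells."""
    return (mean(immune_score(s, genes) for s in cognate)
            - mean(immune_score(s, genes) for s in non_cognate))


def minimal_signature(genes, cognate, non_cognate, min_size=2):
    """Drop one gene at a time while doing so increases the separation."""
    current = list(genes)
    while len(current) > min_size:
        base = separation(current, cognate, non_cognate)
        # Score every leave-one-out subset of the current gene set.
        subsets = [[g for g in current if g != drop] for drop in current]
        best = max(subsets, key=lambda s: separation(s, cognate, non_cognate))
        if separation(best, cognate, non_cognate) <= base:
            break  # no smaller subset separates better; keep current set
        current = best
    return current


cognate = [{"A": 10.0, "B": 8.0, "C": 1.0}]
non_cognate = [{"A": 1.0, "B": 2.0, "C": 1.0}]
print(minimal_signature(["A", "B", "C"], cognate, non_cognate))
```

With the toy mean-expression score the loop would keep shrinking toward the single most discriminating gene, which is why the size floor matters here; an enrichment-based score behaves differently because it rewards coordinated expression of several genes.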

Also, in an embodiment of the invention, validation is performed using isolated RNA transcripts from immune cells. Isolated RNA transcripts are analyzed by RNA sequence determination analysis and/or microarray analysis. In a further embodiment of the invention, RNA transcripts from the selected genes are used to obtain a cell type and/or subtype immune score. In one embodiment, the immune score is a normalized gene set enrichment analysis (GSEA)-based cell type- and/or subtype-specific immune score. In another embodiment, the normalized GSEA-based cell type- and/or subtype-specific immune score is determined for cognate and/or non-cognate cell types and/or subtypes. In a further embodiment of the invention, the immune score is significantly higher for cognate immune cell type and/or subtype than non-cognate immune cell type and/or subtype.

In accordance with the practice of the invention, the minimal gene expression signature is a profile of a minimal gene expression signature associated with a cell type and/or subtype of interest or a profile of gene expression for a minimal set of genes which may be used to distinguish, identify and/or quantify the cell type and/or subtype from other cell types and/or subtypes. In some embodiments, the cell type and/or subtype of interest is an immune cell type and/or sub-type.

Examples of the minimal gene expression signature include, but are not limited to, expression profile of any 2 to 125 genes selected from ALOX15, ACAP1, ANK3, ANKRD55, ANXA3, APOC1, ARRB1, BACE2, BLK, C17orf96, C1orf54, CCL14, CCL13, CCL15, CCL17, CCL18, CCL19, CCL23, CCR2, CCR7, CCR8, CD14, CD15/FUT4, CD1A, CD1B, CD1E, CD33, CD34, CD36, CD45, CD66b/CEACAM8, CD86, CD8A, CD8B, CLCN4, CMTM2, CTSW, CXCL10, CXCL11, CXCL9, CXCR1, CXorf57, CYP27B1, CYP4F3, EBF1, EGR2, EPHA1, ETV3, FABP4, FANK1, FCER2, FCRL2, FCRLA, FLJ13197, FLVCR2, FOXP3, FPR1, FUT4, FZD2, GAL3ST4, GALR1, GPR97, HESX1, HLA-DQA1, HRH1, HS3ST2, HSD11B1, IFI27, IL15RA, IL1R2, IL7R, ITGAM, KCNJ15, KIT, KYNU, LRRC32, MAOA, MARCO, M-CSFR/CSF1R, MEF2C, MGAM, MME, MMP12, MMP9, MRC1, MS4A6A, MSC, NID1, NLRP3, NPL, NRG1, OLIG1, PALLD, PDL1, PI3, PID1, PLA1A, PNOC, PPP1R14A, PROK2, PSMA2, PTGDR, QPCT, RENBP, RGPD1, RTKN2, S100A9, S1PR3, SERGEF, SH2D1B, SLC31A2, SLC38A6, SLC47A1, TIE2/TEK, TIM3, TNFRSF10B, TSHZ2, VCAN, VILL, VISTA, VNN3, WNT5A, WNT7A, ZNF204P and ZNF324.

In one embodiment, the minimal gene expression signature consists of expression profile of 125 or fewer genes selected from the set of genes as shown in Table 15 or described in the expression profile above. In one further embodiment, the minimal gene expression signature consists of expression of fewer than 50 genes as shown in Table 15 or described in the expression profile above. Further, the minimal gene expression signature may consist of expression of fewer than 45 genes as shown in Table 15 or described in the expression profile above. The minimal gene expression signature consists of expression profile of 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124 or 125 genes selected from the set of genes as shown in Table 15 or described in the expression profile above.

As used herein, the minimal gene expression signature consists of expression profile of 2 or more genes selected from the set of genes as shown in Table 15 or described in the expression profile above. In one embodiment of the invention, the minimal gene expression signature consists of expression of at least 40 genes and fewer than 50 genes as shown in Table 15 or described in the expression profile above. Further, the minimal gene expression signature may consist of expression of at least 40 genes and fewer than 45 genes as shown in Table 15 or described in the expression profile above. Further yet, the minimal gene expression signature may consist of expression of 42 genes as shown in Table 15 or described in the expression profile above.

The expression of 42 genes consists of or comprises a combined expression of genes making up a minimal gene expression signature for distinguishing B-cell, CD4+ T-cell, CD8+ T-cell, Treg cell, monocyte, macrophage, natural killer (NK) cell and neutrophil.

The minimal gene expression signature further distinguishes myeloid-derived suppressor cell (MDSC) and dendritic cell. In one embodiment of the invention, the minimal gene expression signature further distinguishes macrophage M1 and M2 sub-types and granulocytic myeloid-derived suppressor cell (G-MDSC) subtype and monocytic myeloid-derived suppressor cell (M-MDSC) subtype. In some embodiments of the invention, the minimal gene expression signature may be associated with any 2, 3, 4, 5, 6, 7, or 8 out of the 8 immune cell types selected from the group consisting of B-cell, CD4+ T-cell, CD8+ T-cell, Treg cell, monocyte, macrophage, natural killer (NK) cell and neutrophil.

As used herein, the minimal gene expression signature may be based on expression of at least 2, 3, 4, 5, 6, or 7 genes selected from the group as shown in Table 15 or any 4 of the 8 immune cell types B-cell, CD4+ T-cell, CD8+ T-cell, Treg cell, monocyte, macrophage, natural killer (NK) cell and neutrophil as described above.

In some embodiments of the method, the minimal gene expression signature may be based on expression of about 2 to 25, 2 to 21, 2 to 17, 2 to 13, 2 to 9, or 2 to 5 genes selected from the group as shown in Table 15 or any 2, 3, 4, 5, 6, 7, or 8 of the 8 immune cell types B-cell, CD4+ T-cell, CD8+ T-cell, Treg cell, monocyte, macrophage, natural killer (NK) cell and neutrophil as described above.

Examples of the immune cell types include, but are not limited to, CD4+ T-cell, CD8+ T-cell, Treg cell, monocyte, macrophage, natural killer (NK) cell and neutrophil and a combination thereof.

The minimal gene expression signature distinguishes one subtype of immune cell from a different subtype. In one embodiment of the invention, the minimal gene expression signature consists of two or more minimal gene expression signatures for two or more immune cell types and/or subtypes. In another embodiment of the invention, the minimal gene expression signature consists of 8 minimal gene expression signatures for distinguishing 8 different immune cell types.

In accordance with the invention, the minimal gene expression signature for each immune cell type may be used to further distinguish its immune cell subtype. In one embodiment, the minimal gene expression signature is used to distinguish an adaptive immune cell from an innate immune cell. In a further embodiment, the minimal gene expression signature is used to distinguish an adaptive immune cell from a different adaptive immune cell. In another embodiment, the minimal gene expression signature is used to distinguish an innate immune cell from a different innate immune cell.

Examples of the adaptive immune cell or the different adaptive immune cell include, but are not limited to, B-cell, CD4+ T-cell, CD8+ T-cell and Treg cell. Examples of the innate immune cell or the different innate immune cell include, but are not limited to, monocyte, macrophage, myeloid-derived suppressor cell (MDSC), natural killer (NK) cell, dendritic cell and neutrophil.

In accordance with the practice of the invention, the minimal gene expression signature associated with a specific immune cell type comprises a minimal gene expression signature associated with a specific immune cell subtype. In one embodiment, the minimal gene expression signature associated with a specific immune cell subtype distinguishes the specific immune cell subtype from a related immune cell subtype.

Examples of the immune cell subtype for a macrophage include, but are not limited to, an M1 macrophage subtype and an M2 macrophage subtype.

Examples of the immune cell subtype for a myeloid-derived suppressor cell (MDSC) include, but are not limited to, a granulocytic MDSC subtype and a monocytic MDSC subtype.

In one embodiment of the invention, the minimal gene expression signature is used to distinguish immune cell types and subtypes, wherein the immune cell types and subtypes are or comprise B-cell, CD4+ T-cell, CD8+ T-cell, Treg cell, monocyte, M1 macrophage subtype, M2 macrophage subtype, granulocytic MDSC subtype, monocytic MDSC subtype, natural killer (NK) cell, dendritic cell and neutrophil. Additionally, in an embodiment of the invention, as described above, the minimal gene expression signature is a combination of several minimal gene expression signatures so as to distinguish multiple cell types and/or subtypes. Examples of the immune cell types distinguished by the minimal gene expression signature used to distinguish 10 out of 10 immune cell types include, but are not limited to, B-cell, CD4+ T-cell, CD8+ T-cell, Treg cell, monocyte, macrophage, myeloid-derived suppressor cell (MDSC), natural killer (NK) cell, dendritic cell and neutrophil. In another embodiment, the minimal gene expression signature is used to further distinguish macrophage subtypes, wherein the macrophage subtypes are or comprise M1 macrophage subtype and M2 macrophage subtype. In yet another embodiment, the minimal gene expression signature is used to further distinguish myeloid-derived suppressor cell (MDSC) subtypes, wherein the MDSC subtypes are or comprise granulocytic MDSC subtype and monocytic MDSC subtype.

Methods of ranking

The invention additionally provides a method for ranking relative amount of specific immune cell type/subtype infiltrate in a tumor sample of a subject. The method comprises isolating the tumor from the subject, so as to obtain a tumor sample. The method further comprises determining gene expression for a set of genes that permit discriminating different immune cell types and/or subtypes as infiltrates in the tumor sample. In one particular embodiment, the set of genes consists of or comprises a combination of the genes as provided in Table 15. In another particular embodiment, the immune cell types consist of or comprise B-cell, CD4+ T-cell, CD8+ T-cell, Treg cell, monocyte, macrophage, natural killer (NK) cell and neutrophil or a combination thereof. The method also comprises obtaining a minimal gene expression signature by the method described above and applying minimal gene expression signature associated with specific immune cell types and/or subtypes so as to obtain an immune score associated with each specific immune cell type and/or subtype for the tumor sample. The method further comprises comparing the immune scores so obtained for each immune cell type and/or subtype such that a higher immune score for a specific immune cell type and/or subtype signifies a greater relative amount of that particular immune cell type/subtype infiltrate over a lower immune score of a different immune cell type and/or subtype analyzed for the same tumor.

In accordance with the practice of the invention, the method provides for ranking relative amount of specific immune cell type/subtype infiltrate between two or more tumor samples obtained from one or more subject. In one embodiment, the method comprises: (a) obtaining two or more tumor samples from one or more subject; (b) determining gene expression for a set of genes that permit discriminating different immune cell types and/or subtypes as infiltrates in the tumor samples; (c) obtaining a minimal gene expression signature by the method described above and applying minimal gene expression signature associated with specific immune cell types and/or subtypes so as to obtain an immune score associated with each specific immune cell type and/or subtype for each tumor sample; and (d) comparing immune scores for each immune cell type and/or subtype so obtained in (c) such that a higher immune score for a specific immune cell type and/or subtype for one tumor sample signifies a greater relative amount of that particular immune cell type/subtype infiltrate in the tumor sample over a tumor sample with a lower immune score for the same immune cell type and/or subtype analyzed.
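Both ranking methods above reduce to ordering immune scores, either across cell types within one sample or across samples for one cell type. A minimal sketch, assuming the immune scores have already been computed and are held in nested dictionaries (the sample names, score values, and function names are hypothetical):

```python
def rank_cell_types(sample_scores):
    """Rank cell types within one tumor sample, highest immune score first.

    sample_scores: {cell_type: immune_score} for a single sample.
    """
    return sorted(sample_scores, key=sample_scores.get, reverse=True)


def rank_samples(all_scores, cell_type):
    """Rank tumor samples by their immune score for one cell type,
    highest first.

    all_scores: {sample_id: {cell_type: immune_score}}.
    """
    return sorted(all_scores, key=lambda s: all_scores[s][cell_type],
                  reverse=True)


# Hypothetical immune scores for two tumor samples.
scores = {
    "tumor_1": {"B-cell": 0.2, "CD8+ T-cell": 0.7, "neutrophil": 0.4},
    "tumor_2": {"B-cell": 0.5, "CD8+ T-cell": 0.3, "neutrophil": 0.1},
}
print(rank_cell_types(scores["tumor_1"]))
print(rank_samples(scores, "B-cell"))
```

Comparing scores across cell types within a sample presumes the scores are normalized to a common scale, which is consistent with the normalized immune score described earlier.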

In one particular embodiment, the set of genes in (b) consist or comprise a combination of the genes as provided in Table 15, and wherein, the immune cell types and/or subtypes consist of or comprise B-cell, CD4+ T-cell, CD8+ T-cell, Treg cell, monocyte, macrophage, natural killer (NK) cell and neutrophil or a combination thereof.

In accordance with the invention, the method provides for quantifying amount of specific immune cell type/subtype infiltrate in a tumor of a subject. In one embodiment, the method comprises (a) isolating the tumor from the subject; (b) determining gene expression for a set of genes that permit discriminating different immune cell types and/or subtypes as infiltrates in the tumor from the subject; (c) obtaining a minimal gene expression signature by the method described above and applying minimal gene expression signature associated with specific immune cell types and/or subtypes so as to obtain an immune score associated with each specific immune cell type and/or subtype; and (d) comparing immune scores from (c) against reference curves so as to obtain amount of a specific immune cell type/subtype infiltrate in the tumor of a subject.
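Step (d) of the quantification method compares an immune score against a reference curve. The form of the curve is not given here; the sketch below assumes a piecewise-linear curve built from calibration points and reads off the amount by interpolation. The calibration values and the unit of "amount" are hypothetical.

```python
from bisect import bisect_left


def amount_from_reference(score, curve):
    """Interpolate an infiltrate amount from a reference curve.

    curve: list of (immune_score, amount) points sorted by immune_score
    with distinct scores. Scores outside the calibrated range raise
    ValueError rather than being extrapolated.
    """
    xs = [x for x, _ in curve]
    if not xs[0] <= score <= xs[-1]:
        raise ValueError("immune score outside the reference curve range")
    i = bisect_left(xs, score)
    if xs[i] == score:
        return curve[i][1]
    (x0, y0), (x1, y1) = curve[i - 1], curve[i]
    # Linear interpolation between the two neighbouring calibration points.
    return y0 + (y1 - y0) * (score - x0) / (x1 - x0)


# Hypothetical calibration: immune score versus percent of cells in a mixture.
reference = [(0.0, 0.0), (1.0, 10.0), (2.0, 40.0)]
print(amount_from_reference(1.5, reference))
```

Refusing to extrapolate is a deliberate choice in this sketch, since a reference curve says nothing about scores beyond its calibrated range.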

In some embodiments of the method, the immune cell types and/or subtypes additionally comprise myeloid-derived suppressor cell (MDSC), dendritic cell, macrophage M1 and M2 sub-types, granulocytic myeloid-derived suppressor cell (G-MDSC) subtype and monocytic myeloid-derived suppressor cell (M-MDSC) subtype.

In several embodiments of the method, the tumor or tumor sample(s) is any of adrenocortical carcinoma (ACC), bladder urothelial carcinoma (BLCA), breast invasive carcinoma (BRCA), cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC), cholangiocarcinoma (CHOL), colon adenocarcinoma (COAD), lymphoid neoplasm diffuse large B-cell lymphoma (DLBC), esophageal carcinoma (ESCA), glioblastoma multiforme (GBM), head and neck squamous cell carcinoma (HNSC), kidney chromophobe (KICH), kidney renal clear cell carcinoma (KIRC), kidney renal papillary cell carcinoma (KIRP), acute myeloid leukemia (LAML), chronic myelogenous leukemia (LCML), brain lower grade glioma (LGG), liver hepatocellular carcinoma (LIHC), lung adenocarcinoma (LUAD), lung squamous cell carcinoma (LUSC), mesothelioma (MESO), ovarian serous cystadenocarcinoma (OV), pancreatic adenocarcinoma (PAAD), pheochromocytoma and paraganglioma (PCPG), prostate adenocarcinoma (PRAD), rectum adenocarcinoma (READ), sarcoma (SARC), skin cutaneous melanoma (SKCM), stomach adenocarcinoma (STAD), testicular germ cell tumors (TGCT), thyroid carcinoma (THCA), thymoma (THYM), uterine corpus endometrial carcinoma (UCEC), uterine carcinosarcoma (UCS) or uveal melanoma (UVM) or a combination thereof.

Identification methods of Immune Cell Type/Subtype Infiltrate Preferentially Associated with a Type and/or Subtype of Tumor

The invention provides a method for identifying immune cell type/subtype infiltrate preferentially associated with a type and/or subtype of tumor among a collection of tumors. In one embodiment, the method comprises: (a) isolating the tumor from the subject; (b) determining gene expression for a set of genes that permit discriminating different immune cell types and/or subtypes as infiltrates in the tumor from the subject; (c) obtaining a minimal gene expression signature by the method described above and applying minimal gene expression signature associated with specific immune cell types and/or subtypes so as to obtain an immune score associated with each specific immune cell type and/or subtype; (d) repeating steps (a) to (c) for other tumors and/or subjects; (e) comparing immune scores so obtained for each immune cell type and/or subtype for the collection of tumors so as to obtain rank order of tumors based on the immune scores for each immune cell type and/or subtype; (f) stratifying the rank ordered tumors based on immune scores for each immune cell type and/or subtype of step (e); (g) determining percentage or fraction of a tumor type and/or subtype within each stratified group in step (f); (h) repeating steps (e) to (g) for each immune cell type and/or subtype; and (i) identifying tumor type and/or subtype overrepresented in one or more stratified group at the highest end of the immune score for each immune cell type, so as to identify immune cell type infiltrate preferentially associated with a type or subtype of tumor among a collection of tumors. In one particular embodiment, the set of genes in step (b) consists of or comprises a combination of the genes as provided in Table 15. In a further particular embodiment, the immune cell types and/or subtypes may consist of or comprise any of B-cell, CD4+ T-cell, CD8+ T-cell, Treg cell, monocyte, macrophage, natural killer (NK) cell, neutrophil, myeloid-derived suppressor cell (MDSC), dendritic cell, macrophage M1 and M2 sub-types, granulocytic myeloid-derived suppressor cell (G-MDSC) subtype and monocytic myeloid-derived suppressor cell (M-MDSC) subtype, a combination thereof or all.

Identification methods of Immune Cell Type

The invention provides a method for identifying immune cell type infiltrate absent or deficient from a type and/or subtype of tumor among a collection of tumors. In one embodiment, the method comprises: (a) isolating the tumor from the subject; (b) determining gene expression for a set of genes that permit discriminating different immune cell types and/or subtypes as infiltrates in the tumor from the subject; (c) obtaining a minimal gene expression signature by the method described above and applying minimal gene expression signature associated with specific immune cell types and/or subtypes so as to obtain an immune score associated with each specific immune cell type and/or subtype; (d) repeating steps (a) to (c) for other tumors and/or subjects; and (e) comparing immune scores so obtained for each immune cell type and/or subtype for the collection of tumors so as to obtain rank order of tumors based on the immune scores for each immune cell type and/or subtype. In one embodiment, the comparison involves (i) stratifying the rank ordered tumors based on immune scores for each immune cell type and/or subtype of step (e); (ii) determining percentage or fraction of a tumor type and/or subtype within each stratified group in step (f); (iii) repeating steps (e) to (g) for each immune cell type and/or subtype; and (iv) identifying tumor type and/or subtype underrepresented in one or more stratified group at the highest end of the immune score and/or overrepresented in one or more stratified group at the lowest end of the immune score for each immune cell type and/or subtype, so as to identify immune cell type/subtype infiltrate absent or deficient from a type and/or subtype of tumor among a collection of tumors. In another particular embodiment, the set of genes in step (b) may consist or comprise a combination of the genes as provided in Table 15. 
In a yet further embodiment, the immune cell types and/or subtypes may consist of or comprise any of B-cell, CD4+ T-cell, CD8+ T-cell, Treg cell, monocyte, macrophage, natural killer (NK) cell, neutrophil, myeloid-derived suppressor cell (MDSC), dendritic cell, macrophage M1 and M2 sub-types, granulocytic myeloid-derived suppressor cell (G-MDSC) subtype and monocytic myeloid-derived suppressor cell (M-MDSC) subtype, a combination thereof or all.
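The stratification steps shared by the two identification methods above (rank order, split into groups, tumor-type fraction per group, over- or underrepresentation at the extremes) can be sketched as follows. The enrichment ratio used to express over- and underrepresentation is an illustrative assumption, as are the function names; the source does not specify a statistic.

```python
from collections import Counter


def stratify(tumors, n_groups=4):
    """Split tumors into n_groups of about equal size, highest score first.

    tumors: list of (tumor_type, immune_score) for one immune cell type.
    """
    ordered = sorted(tumors, key=lambda t: t[1], reverse=True)
    size, extra = divmod(len(ordered), n_groups)
    groups, start = [], 0
    for i in range(n_groups):
        end = start + size + (1 if i < extra else 0)
        groups.append(ordered[start:end])
        start = end
    return groups


def type_fractions(group):
    """Fraction of each tumor type within one group."""
    counts = Counter(tumor_type for tumor_type, _ in group)
    return {t: c / len(group) for t, c in counts.items()}


def enrichment(tumors, group_index=0, n_groups=4):
    """Ratio of each tumor type's fraction in one stratified group to its
    fraction in the whole collection. Above 1 suggests overrepresentation,
    below 1 underrepresentation (hypothetical measure)."""
    overall = type_fractions(tumors)
    in_group = type_fractions(stratify(tumors, n_groups)[group_index])
    return {t: in_group.get(t, 0.0) / overall[t] for t in overall}


# Toy collection: melanomas score high and glioblastomas low for this cell type.
collection = [("SKCM", s) for s in (8.0, 7.0, 6.0, 5.0)] + \
             [("GBM", s) for s in (4.0, 3.0, 2.0, 1.0)]
print(enrichment(collection, group_index=0))   # highest-score group
print(enrichment(collection, group_index=3))   # lowest-score group
```

Repeating the call for each immune cell type's score column gives the per-cell-type picture; a formal test such as a hypergeometric test could replace the plain ratio where a significance level is needed.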

In some embodiments of the invention, the immune score for a particular immune cell infiltrate positively correlates with expression of chemoattractant gene or collection of chemoattractant genes for the immune cell infiltrate, wherein examples of the immune cell infiltrate include, but are not limited to, B-cell, CD8+ T-cell, Treg cell, monocyte, macrophage, natural killer (NK) cell, neutrophil, myeloid-derived suppressor cell (MDSC), dendritic cell, macrophage M1 and M2 sub-types, granulocytic myeloid-derived suppressor cell (G-MDSC) subtype and monocytic myeloid-derived suppressor cell (M-MDSC) subtype and a combination thereof.

Examples of chemoattractant gene or collection of chemoattractant genes for B-cell infiltrate may include, but are not limited to, CXCL12, CXCL13, CCL19, CCL21, CCL25, CCL20 and CCL3 and a combination thereof. Other examples of chemoattractant gene or collection of chemoattractant genes for CD8+ T-cell infiltrate may include, but are not limited to, CXCL9, CXCL10, CXCL11, CCL5, MIP3, CCL3 and CCL4 and a combination thereof. Additionally, examples of chemoattractant gene or collection of chemoattractant genes for Treg cell infiltrate may include, but are not limited to, CCL20, CCL19, CCL21, CCL3, CCL4, CCL5, CCL17, CCL1, CCL22 and CCL28 and a combination thereof. Yet another set of examples of chemoattractant gene or collection of chemoattractant genes for monocyte infiltrate may include, but are not limited to, MIF, IL8, CCL2, CCL8, CCL7, CCL13, CCL12, CX3CL1 and CCL7 and a combination thereof. Further examples of chemoattractant gene or collection of chemoattractant genes for macrophage infiltrate may include, but are not limited to, CCL20, CXCL14 and CCL4 and a combination thereof. Further still, additional examples of chemoattractant gene or collection of chemoattractant genes for natural killer (NK) cell infiltrate may include, but are not limited to, CCL20, CCL8, CCL7, CCL13, CCL4, CXCL12, CCL5, IL8, CXCR3, CXCL9, CXCL10, CX3CL1 and IFNG and a combination thereof. Further examples of chemoattractant gene or collection of chemoattractant genes for neutrophil infiltrate may include, but are not limited to, CXCL2, CCL3, IL8, CCL4, CXCL9, CXCL10, CXCL11, CCL17, CXCL1 and CXCL5 and a combination thereof. Additional examples of chemoattractant gene or collection of chemoattractant genes for CD4+ T-cell infiltrate may include, but are not limited to, MIP3, CXCL11, CXCL10, CXCL9, CCL3 and CCL5 and a combination thereof.

In several of the methods of the invention, the immune score for a particular immune cell infiltrate negatively correlates with expression of a chemoattractant gene or collection of chemoattractant genes for the immune cell infiltrate, wherein the immune cell infiltrate is CD4+ T-cell. In a further embodiment, the correlation is a correlation between a chemoattractant score and an immune score.
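The correlation between a chemoattractant score and an immune score can be illustrated with a short sketch. The scoring rule (mean expression of the listed chemoattractant genes), the choice of Pearson correlation, and all expression and immune score values below are assumptions made for illustration only; this passage does not prescribe them.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def chemoattractant_score(expression, genes):
    """Mean expression of the chemoattractant genes (assumed scoring rule)."""
    return mean([expression[g] for g in genes])


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Three of the CD8+ T-cell chemoattractant genes named in the text.
CD8_GENES = ["CXCL9", "CXCL10", "CXCL11"]

# Hypothetical tumors: expression values and immune scores are made up.
tumors = [
    {"CXCL9": 1.0, "CXCL10": 2.0, "CXCL11": 3.0, "immune_score": 0.2},
    {"CXCL9": 3.0, "CXCL10": 4.0, "CXCL11": 5.0, "immune_score": 0.4},
    {"CXCL9": 5.0, "CXCL10": 6.0, "CXCL11": 7.0, "immune_score": 0.5},
    {"CXCL9": 7.0, "CXCL10": 8.0, "CXCL11": 9.0, "immune_score": 0.9},
]

chemo_scores = [chemoattractant_score(t, CD8_GENES) for t in tumors]
immune_scores = [t["immune_score"] for t in tumors]
r = pearson(chemo_scores, immune_scores)
print("positive correlation" if r > 0 else "negative or no correlation")
```

A coefficient above zero corresponds to the positive correlation described for most infiltrates, and a coefficient below zero to the negative correlation described for the CD4+ T-cell infiltrate.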

Using a similar strategy, in the method for identifying an immune cell type infiltrate absent or deficient from a type and/or subtype of tumor among a collection of tumors as described above, the rank-ordered tumors in step (f) are stratified into quantiles or, alternatively, divided into groups of about equal proportion. In one embodiment, the quantiles or groups are two or more. In another embodiment of the invention, each quantile or group consists of or comprises about half the total number of tumors in the collection or fewer. In a further embodiment of the invention, each quantile or group consists of or comprises about one fourth of the total number of tumors in the collection. Examples of the quantiles may include, but are not limited to, percentiles, ventiles, hexadeciles, duodeciles, deciles, octiles, septiles, sextiles, quintiles, quartiles and tertiles. In one embodiment, the quartiles or groups number four. Further, each quartile or group consists of or comprises 2410 tumors, wherein the tumors in the collection number 9640. Further, in the method of the invention, the immune cell type/subtype infiltrate preferentially associated with a type and/or subtype of tumor among a collection of tumors is underrepresented in one or more stratified groups at the lowest end of the immune score for the immune cell type and/or subtype. In another embodiment of the method, the tumor type and/or subtype identified as overrepresented and/or underrepresented in one or more stratified groups in step (i) is overrepresented and/or underrepresented by a factor of 2, 3, 4, 5, 6, 7, 8, 9, 10 or more. In further embodiments of the method, the overrepresentation and/or underrepresentation in step (i) is by a factor of 2-4. In other embodiments of the method, the overrepresentation and/or underrepresentation in step (i) is by a factor of 2-4 for the first quartile.
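The stratification and the representation factor described above can be sketched as follows. The function names, the toy collection of eight tumors of hypothetical types "A" and "B", and the definition of the factor as the observed fraction in a group divided by the fraction in the whole collection are illustrative assumptions, not the definitions used in the specification.

```python
def stratify(tumors, n_groups=4):
    """Rank tumors by immune score (highest first) and split into equal groups."""
    ranked = sorted(tumors, key=lambda t: t["immune_score"], reverse=True)
    size = len(ranked) // n_groups
    groups = [ranked[i * size:(i + 1) * size] for i in range(n_groups)]
    # Any remainder from uneven division is placed in the lowest-score group.
    groups[-1].extend(ranked[n_groups * size:])
    return groups


def representation_factor(group, tumors, tumor_type):
    """Fraction of tumor_type in the group relative to its fraction overall."""
    observed = sum(t["type"] == tumor_type for t in group) / len(group)
    expected = sum(t["type"] == tumor_type for t in tumors) / len(tumors)
    return observed / expected


# Hypothetical collection: tumor types "A" and "B" with made-up immune scores.
tumors = [
    {"type": "A", "immune_score": 0.9}, {"type": "A", "immune_score": 0.8},
    {"type": "A", "immune_score": 0.7}, {"type": "B", "immune_score": 0.6},
    {"type": "B", "immune_score": 0.5}, {"type": "A", "immune_score": 0.4},
    {"type": "B", "immune_score": 0.2}, {"type": "B", "immune_score": 0.1},
]

groups = stratify(tumors, n_groups=4)
factors = [representation_factor(g, tumors, "A") for g in groups]
print(factors)

# Group size for the collection described in the text: 9640 tumors in quartiles.
print(9640 // 4)
```

In this toy collection, type "A" is overrepresented by a factor of 2 in the group with the highest immune scores and is absent from the group with the lowest scores.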

In one embodiment, the immune cell type/subtype infiltrate preferentially associated with a type and/or subtype of tumor among a collection of tumors is overrepresented in the quantile or stratified group with the highest immune score and underrepresented in the quantile or stratified group with the lowest immune score. In a further embodiment, intermediate quantiles or stratified groups comprise decreasing representation with decreasing immune scores or range of immune scores.

Using a similar strategy of the method of the invention, the immune cell type/subtype infiltrate absent or deficient from a type and/or subtype of tumor among a collection of tumors is underrepresented in the quantile or stratified group with the highest immune score and overrepresented in the quantile or stratified group with the lowest immune score. In a further embodiment, intermediate quantiles or stratified groups comprise increasing representation with decreasing immune scores or range of immune scores.
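The two patterns described above (representation falling with immune score for a preferentially associated infiltrate, and rising for an absent or deficient one) can be checked with a small helper. This is a minimal sketch under stated assumptions: the function name is hypothetical, the input is a list of representation factors ordered from the highest-score group to the lowest, and ties between adjacent groups are tolerated, which is a relaxation chosen here for illustration.

```python
def classify_trend(factors):
    """Classify representation factors ordered from highest to lowest immune score."""
    pairs = list(zip(factors, factors[1:]))
    # Representation falls as the immune score falls.
    if factors[0] > factors[-1] and all(a >= b for a, b in pairs):
        return "preferentially associated"
    # Representation rises as the immune score falls.
    if factors[0] < factors[-1] and all(a <= b for a, b in pairs):
        return "absent or deficient"
    return "no clear pattern"


# Hypothetical representation factors for three tumor types across quartiles.
print(classify_trend([2.0, 1.0, 1.0, 0.0]))
print(classify_trend([0.0, 0.5, 1.5, 2.0]))
print(classify_trend([1.0, 1.0, 1.0, 1.0]))
```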

In an embodiment of the method, one or more immune cell type/subtype infiltrate is preferentially associated with a type and/or subtype of tumor or tumors among a collection of tumors. In another embodiment of the method, examples of the type and/or subtype of tumor enriched in B-cell infiltration include, but are not limited to, diffuse large B-cell lymphoma (DLBCL), kidney renal clear cell carcinoma (KIRC), sarcoma (SARC), skin cutaneous melanoma (SKCM) and uveal melanoma (UVM) (Figure 13A). In yet another embodiment, examples of the type and/or subtype of tumor enriched in CD4+ T-cell infiltration include, but are not limited to, esophageal carcinoma (ESCA) and prostate adenocarcinoma (PRAD) (Figure 13A). Some examples of the type and/or subtype of tumor enriched in CD8+ T-cell infiltration may include, but are not limited to, diffuse large B-cell lymphoma (DLBCL), acute myeloid leukemia (LAML) and thymoma (THYM) (Figure 13A). Other examples of the type and/or subtype of tumor enriched in Treg-cell infiltration include, but are not limited to, breast invasive carcinoma (BRCA), diffuse large B-cell lymphoma (DLBCL), pancreatic adenocarcinoma (PAAD), stomach adenocarcinoma (STAD), testicular germ cell tumors (TGCT) and thymoma (THYM) (Figure 13A). Examples of the type and/or subtype of tumor enriched in monocyte infiltration include, but are not limited to, glioblastoma multiforme (GBM), kidney renal clear cell carcinoma (KIRC), low-grade glioma (LGG) and sarcoma (SARC) (Figure 13A). Further examples of the type and/or subtype of tumor enriched in macrophage infiltration include, but are not limited to, adrenocortical carcinoma (ACC), kidney renal clear cell carcinoma (KIRC), kidney renal papillary cell carcinoma (KIRP) and liver hepatocellular carcinoma (LIHC) (Figure 13A).

Examples of the type and/or subtype of tumor enriched in natural killer (NK)-cell infiltration may include, but are not limited to, breast invasive carcinoma (BRCA), colon adenocarcinoma (COAD), skin cutaneous melanoma (SKCM), kidney chromophobe (KICH), kidney renal clear cell carcinoma (KIRC) and acute myeloid leukemia (LAML) (Figure 13A). In a further embodiment, the tumor enriched in natural killer (NK)-cell infiltration additionally comprises granzyme-A (GZMA) and perforin (PRF1) expression or overexpression. Examples of the type and/or subtype of tumor enriched in neutrophil infiltration include, but are not limited to, cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC), esophageal carcinoma (ESCA), head and neck squamous cell carcinoma (HNSC), kidney renal clear cell carcinoma (KIRC), lung squamous cell carcinoma (LUSC) and pancreatic adenocarcinoma (PAAD) (Figure 13A).

In one embodiment of the method, the immune cell type/subtype infiltrate preferentially associated with a type and/or subtype of tumor among a collection of tumors is as shown in Figure 13A. In another embodiment of the invention, the type and/or subtype of tumor is preferentially associated with one or more immune cell type/subtype infiltrate (Figure 13A).

In an additional embodiment of the invention, one or more immune cell type/subtype infiltrate is absent or deficient from a type and/or subtype of tumor or tumors among a collection of tumors. Examples of the type and/or subtype of tumor deficient in B-cell infiltrate among a collection of tumors include, but are not limited to, cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC), brain lower grade glioma (LGG), liver hepatocellular carcinoma (LIHC), pheochromocytoma and paraganglioma (PCPG), prostate adenocarcinoma (PRAD) and rectum adenocarcinoma (READ) (Figure 13A). Other examples of the type and/or subtype of tumor deficient in CD4+ T-cell infiltrate among a collection of tumors include, but are not limited to, colon adenocarcinoma (COAD), lymphoid neoplasm diffuse large B-cell lymphoma (DLBC), glioblastoma multiforme (GBM), acute myeloid leukemia (LAML), mesothelioma (MESO), rectum adenocarcinoma (READ), sarcoma (SARC), skin cutaneous melanoma (SKCM), testicular germ cell tumors (TGCT) and uveal melanoma (UVM) (Figure 13A). Some examples of the type and/or subtype of tumor deficient in CD8+ T-cell infiltrate among a collection of tumors include, but are not limited to, glioblastoma multiforme (GBM), kidney chromophobe (KICH), brain lower grade glioma (LGG), pheochromocytoma and paraganglioma (PCPG) and prostate adenocarcinoma (PRAD) (Figure 13A).

Further examples of the type and/or subtype of tumor deficient in Treg cell infiltrate among a collection of tumors include, but are not limited to, glioblastoma multiforme (GBM), kidney chromophobe (KICH), kidney renal papillary cell carcinoma (KIRP), acute myeloid leukemia (LAML), brain lower grade glioma (LGG), liver hepatocellular carcinoma (LIHC), pheochromocytoma and paraganglioma (PCPG), prostate adenocarcinoma (PRAD) and uveal melanoma (UVM) (Figure 13A). Other examples of the type and/or subtype of tumor deficient in monocyte infiltrate among a collection of tumors include, but are not limited to, cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC), lymphoid neoplasm diffuse large B-cell lymphoma (DLBC), kidney renal papillary cell carcinoma (KIRP), liver hepatocellular carcinoma (LIHC), ovarian serous cystadenocarcinoma (OV), prostate adenocarcinoma (PRAD), rectum adenocarcinoma (READ), thyroid carcinoma (THCA), thymoma (THYM), uterine corpus endometrial carcinoma (UCEC) and uveal melanoma (UVM) (Figure 13A).

Additional examples of the type and/or subtype of tumor deficient in macrophage infiltrate among a collection of tumors include, but are not limited to, acute myeloid leukemia (LAML), ovarian serous cystadenocarcinoma (OV), prostate adenocarcinoma (PRAD), rectum adenocarcinoma (READ) and uveal melanoma (UVM) (Figure 13A). Further examples of the type and/or subtype of tumor deficient in natural killer (NK) cell infiltrate among a collection of tumors include, but are not limited to, glioblastoma multiforme (GBM), brain lower grade glioma (LGG), pheochromocytoma and paraganglioma (PCPG), prostate adenocarcinoma (PRAD), uterine carcinosarcoma (UCS) and uveal melanoma (UVM) (Figure 13A). Other examples of the type and/or subtype of tumor deficient in neutrophil infiltrate among a collection of tumors include, but are not limited to, adrenocortical carcinoma (ACC), breast invasive carcinoma (BRCA), lymphoid neoplasm diffuse large B-cell lymphoma (DLBC), glioblastoma multiforme (GBM), brain lower grade glioma (LGG), ovarian serous cystadenocarcinoma (OV), pheochromocytoma and paraganglioma (PCPG), prostate adenocarcinoma (PRAD), sarcoma (SARC), skin cutaneous melanoma (SKCM), testicular germ cell tumors (TGCT), thyroid carcinoma (THCA), thymoma (THYM), uterine carcinosarcoma (UCS) and uveal melanoma (UVM) (Figure 13A).

In one embodiment of the invention, the immune cell type infiltrate absent or deficient from a type and/or subtype of tumor among a collection of tumors is as shown in Figure 13A. In another embodiment of the invention, the type and/or subtype of tumor is absent or deficient in one or more immune cell type/subtype infiltrate (Figure 13A). In several of the methods of the invention, one or more immune cell type/subtype infiltrate is preferentially associated with a type and/or subtype of tumor or tumors among a collection of tumors and is absent or deficient from a different type and/or subtype of tumor or tumors among a collection of tumors. In a further embodiment, the immune cell type/subtype infiltrate is as shown in Figure 13A. As described above, the type and/or subtype of tumor is preferentially infiltrated by one or more immune cell type/subtype and excludes or is deficient in other immune cell type/subtypes. In another embodiment, the type and/or subtype of tumor that is preferentially infiltrated by one or more immune cell type/subtype and excludes or is deficient in other immune cell type/subtypes is as shown in Figure 13A.

In some embodiments of the method, examples of the type and/or subtype of tumor or tumors include, but are not limited to, adrenocortical carcinoma (ACC), bladder urothelial carcinoma (BLCA), breast invasive carcinoma (BRCA), cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC), cholangiocarcinoma (CHOL), colon adenocarcinoma (COAD), lymphoid neoplasm diffuse large B-cell lymphoma (DLBC), esophageal carcinoma (ESCA), glioblastoma multiforme (GBM), head and neck squamous cell carcinoma (HNSC), kidney chromophobe (KICH), kidney renal clear cell carcinoma (KIRC), kidney renal papillary cell carcinoma (KIRP), acute myeloid leukemia (LAML), chronic myelogenous leukemia (LCML), brain lower grade glioma (LGG), liver hepatocellular carcinoma (LIHC), lung adenocarcinoma (LUAD), lung squamous cell carcinoma (LUSC), mesothelioma (MESO), ovarian serous cystadenocarcinoma (OV), pancreatic adenocarcinoma (PAAD), pheochromocytoma and paraganglioma (PCPG), prostate adenocarcinoma (PRAD), rectum adenocarcinoma (READ), sarcoma (SARC), skin cutaneous melanoma (SKCM), stomach adenocarcinoma (STAD), testicular germ cell tumors (TGCT), thyroid carcinoma (THCA), thymoma (THYM), uterine corpus endometrial carcinoma (UCEC), uterine carcinosarcoma (UCS) or uveal melanoma (UVM) or a combination thereof (Figure 13A).

In other embodiments of the invention, the collection of tumors consists of or comprises adrenocortical carcinoma (ACC), bladder urothelial carcinoma (BLCA), breast invasive carcinoma (BRCA), cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC), cholangiocarcinoma (CHOL), colon adenocarcinoma (COAD), lymphoid neoplasm diffuse large B-cell lymphoma (DLBC), esophageal carcinoma (ESCA), glioblastoma multiforme (GBM), head and neck squamous cell carcinoma (HNSC), kidney chromophobe (KICH), kidney renal clear cell carcinoma (KIRC), kidney renal papillary cell carcinoma (KIRP), acute myeloid leukemia (LAML), chronic myelogenous leukemia (LCML), brain lower grade glioma (LGG), liver hepatocellular carcinoma (LIHC), lung adenocarcinoma (LUAD), lung squamous cell carcinoma (LUSC), mesothelioma (MESO), ovarian serous cystadenocarcinoma (OV), pancreatic adenocarcinoma (PAAD), pheochromocytoma and paraganglioma (PCPG), prostate adenocarcinoma (PRAD), rectum adenocarcinoma (READ), sarcoma (SARC), skin cutaneous melanoma (SKCM), stomach adenocarcinoma (STAD), testicular germ cell tumors (TGCT), thyroid carcinoma (THCA), thymoma (THYM), uterine corpus endometrial carcinoma (UCEC), uterine carcinosarcoma (UCS) and uveal melanoma (UVM) or a combination thereof (Figure 13A).

Identification methods of Characteristic Immune Cell Type/Subtype

The invention provides a method for identifying characteristic immune cell type/subtype infiltrates for a type and/or subtype of tumor among a collection of tumors. In one embodiment, the method comprises identifying none, one or more immune cell type infiltrate preferentially associated with a type and/or subtype of tumor among a collection of tumors by the method described above. In another embodiment, the method comprises identifying none, one or more immune cell type infiltrate absent or deficient from a type or subtype of tumor among a collection of tumors by the method described above, so as to identify characteristic immune cell type infiltrates for a type and/or subtype of tumor among a collection of tumors.

Additionally, the method comprises identifying characteristic immune cell type infiltrates for a type and/or subtype of tumor among a collection of tumors.
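Combining the "preferentially associated" and "absent or deficient" results into a characteristic infiltrate for one tumor type can be sketched as a simple lookup. The dictionaries below hold only a small subset of the associations listed earlier in this description, and the function name and data layout are assumptions made for illustration.

```python
# Subset of the enriched associations listed in the text (Figure 13A).
ENRICHED = {
    "macrophage": {"ACC", "KIRC", "KIRP", "LIHC"},
    "monocyte": {"GBM", "KIRC", "LGG", "SARC"},
}

# Subset of the deficient associations listed in the text (Figure 13A).
DEFICIENT = {
    "neutrophil": {"ACC", "BRCA", "GBM", "LGG"},
    "Treg cell": {"GBM", "KICH", "KIRP", "LGG"},
}


def characteristic_profile(tumor_type, enriched, deficient):
    """Collect the cell types enriched in and deficient from one tumor type."""
    return {
        "enriched": sorted(c for c, types in enriched.items() if tumor_type in types),
        "deficient": sorted(c for c, types in deficient.items() if tumor_type in types),
    }


for tumor_type in ("ACC", "GBM", "STAD"):
    print(tumor_type, characteristic_profile(tumor_type, ENRICHED, DEFICIENT))
```

A tumor type with no entry in either table, such as STAD in this reduced example, yields two empty lists, which corresponds to the "none" case of "none, one or more" in the method.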

As described above, the type and/or subtype of tumor is enriched in one or more type and/or subtype of immune cell infiltrate, deficient in one or more type and/or subtype of immune cell infiltrate or a combination thereof. Further, the type and/or subtype of tumor that is enriched in one or more type and/or subtype of immune cell infiltrate, deficient in one or more type and/or subtype of immune cell infiltrate or a combination thereof is as shown in Figure 13A. In several embodiments of the method, the type of immune cell is an innate immune cell. Further, examples of the innate immune cell include, but are not limited to, monocyte, macrophage, macrophage M1 subtype, macrophage M2 sub-type, myeloid-derived suppressor cell (MDSC), granulocytic myeloid-derived suppressor cell (G-MDSC) subtype, monocytic myeloid-derived suppressor cell (M-MDSC) subtype, natural killer (NK) cell, dendritic cell and neutrophil and a combination thereof. In further embodiments of the method, the type of immune cell is an adaptive immune cell. Examples of the adaptive immune cell include, but are not limited to, CD8+ T-cell, CD4+ T-cell, Treg cell and B-cell and a combination thereof.

In one embodiment, one or more immune cell type/subtype infiltrate is preferentially associated with a type and/or subtype of tumor or tumors among a collection of tumors and one or more immune cell type/subtype infiltrate is absent or deficient from a type and/or subtype of tumor or tumors among a collection of tumors.

Examples of the type and/or subtype of tumor include, but are not limited to, adrenocortical carcinoma (ACC), bladder urothelial carcinoma (BLCA), breast invasive carcinoma (BRCA), cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC), cholangiocarcinoma (CHOL), colon adenocarcinoma (COAD), lymphoid neoplasm diffuse large B-cell lymphoma (DLBC), esophageal carcinoma (ESCA), glioblastoma multiforme (GBM), head and neck squamous cell carcinoma (HNSC), kidney chromophobe (KICH), kidney renal clear cell carcinoma (KIRC), kidney renal papillary cell carcinoma (KIRP), acute myeloid leukemia (LAML), chronic myelogenous leukemia (LCML), brain lower grade glioma (LGG), liver hepatocellular carcinoma (LIHC), lung adenocarcinoma (LUAD), lung squamous cell carcinoma (LUSC), mesothelioma (MESO), ovarian serous cystadenocarcinoma (OV), pancreatic adenocarcinoma (PAAD), pheochromocytoma and paraganglioma (PCPG), prostate adenocarcinoma (PRAD), rectum adenocarcinoma (READ), sarcoma (SARC), skin cutaneous melanoma (SKCM), stomach adenocarcinoma (STAD), testicular germ cell tumors (TGCT), thyroid carcinoma (THCA), thymoma (THYM), uterine corpus endometrial carcinoma (UCEC), uterine carcinosarcoma (UCS) or uveal melanoma (UVM) or a combination thereof.

Other examples of the collection of tumors consist or comprise adrenocortical carcinoma (ACC), bladder urothelial carcinoma (BLCA), breast invasive carcinoma (BRCA), cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC), cholangiocarcinoma (CHOL), colon adenocarcinoma (COAD), lymphoid neoplasm diffuse large B-cell lymphoma (DLBC), esophageal carcinoma (ESCA), glioblastoma multiforme (GBM), head and neck squamous cell carcinoma (HNSC), kidney chromophobe (KICH), kidney renal clear cell carcinoma (KIRC), kidney renal papillary cell carcinoma (KIRP), acute myeloid leukemia (LAML), chronic myelogenous leukemia (LCML), brain lower grade glioma (LGG), liver hepatocellular carcinoma (LIHC), lung adenocarcinoma (LUAD), lung squamous cell carcinoma (LUSC), mesothelioma (MESO), ovarian serous cystadenocarcinoma (OV), pancreatic adenocarcinoma (PAAD), pheochromocytoma and paraganglioma (PCPG), prostate adenocarcinoma (PRAD), rectum adenocarcinoma (READ), sarcoma (SARC), skin cutaneous melanoma (SKCM), stomach adenocarcinoma (STAD), testicular germ cell tumors (TGCT), thyroid carcinoma (THCA), thymoma (THYM), uterine corpus endometrial carcinoma (UCEC), uterine carcinosarcoma (UCS) and uveal melanoma (UVM) or a combination thereof.

Merely by way of example, the characteristic immune cell type/subtype infiltrate for adrenocortical carcinoma (ACC) is or comprises an abundance or overrepresentation of macrophages and absence or deficiency of neutrophils in a collection of tumor types and/or subtypes.

In some embodiments, the characteristic immune cell type/subtype infiltrate for breast invasive carcinoma (BRCA) is or comprises an abundance or overrepresentation of natural killer (NK) cells and Treg cells and absence or deficiency of neutrophils in a collection of tumor types and/or subtypes. In another embodiment, the characteristic immune cell type/subtype infiltrate for cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC) is or comprises an abundance or overrepresentation of neutrophils and absence or deficiency of B-cells and monocytes in a collection of tumor types and/or subtypes. In some instances, the characteristic immune cell type/subtype infiltrate for colon adenocarcinoma (COAD) is or comprises an abundance or overrepresentation of natural killer (NK) cells and absence or deficiency of CD4+ T-cells in a collection of tumor types and/or subtypes.

In another instance, the characteristic immune cell type/subtype infiltrate for lymphoid neoplasm diffuse large B-cell lymphoma (DLBC) is or comprises an abundance or overrepresentation of B-cells, CD8+ T-cells and Treg cells and absence or deficiency of CD4+ T-cells, monocytes and neutrophils in a collection of tumor types and/or subtypes. The characteristic immune cell type/subtype infiltrate for esophageal carcinoma (ESCA) may also be or comprise an abundance or overrepresentation of CD4+ T-cells and neutrophils in a collection of tumor types and/or subtypes.

In some embodiments of the invention, the characteristic immune cell type/subtype infiltrate for glioblastoma multiforme (GBM) is or comprises an abundance or overrepresentation of monocytes and absence or deficiency of CD4+ T-cells, CD8+ T-cells, Treg cells, natural killer (NK) cells, and neutrophils in a collection of tumor types and/or subtypes.

In another embodiment, the characteristic immune cell type/subtype infiltrate for kidney chromophobe (KICH) is or comprises an abundance or overrepresentation of natural killer (NK) cells and absence or deficiency of CD8+ T-cells and Treg cells in a collection of tumor types and/or subtypes. Alternatively, the characteristic immune cell type infiltrate for kidney renal clear cell carcinoma (KIRC) is or comprises an abundance or overrepresentation of B-cells, monocytes, macrophages, natural killer (NK) cells and neutrophils in a collection of tumor types and/or subtypes. The characteristic immune cell type infiltrate for kidney renal papillary cell carcinoma (KIRP) may be or comprise an abundance or overrepresentation of macrophages and absence or deficiency of Treg cells and monocytes in a collection of tumor types and/or subtypes. In some embodiments of the invention, the characteristic immune cell type infiltrate for acute myeloid leukemia (LAML) is or comprises an abundance or overrepresentation of CD8+ T-cells and natural killer (NK) cells and absence or deficiency of CD4+ T-cells, Treg cells and macrophages in a collection of tumor types and/or subtypes.

Examples include the characteristic immune cell type infiltrate for brain lower grade glioma (LGG) that is or comprises an abundance or overrepresentation of monocytes and an absence or deficiency of B-cells, CD8+ T-cells, Treg cells, natural killer (NK) cells and neutrophils in a collection of tumor types and/or subtypes.

In another example, the characteristic immune cell type infiltrate for liver hepatocellular carcinoma (LIHC) is or comprises an abundance or overrepresentation of macrophages and absence or deficiency of B-cells, Treg cells and monocytes in a collection of tumor types and/or subtypes. The characteristic immune cell type infiltrate for ovarian serous cystadenocarcinoma (OV) is or comprises an absence or deficiency of monocytes, macrophages and neutrophils in a collection of tumor types and/or subtypes.

In some instances, the characteristic immune cell type infiltrate for pancreatic adenocarcinoma (PAAD) is or comprises an abundance or overrepresentation of Treg cells and neutrophils in a collection of tumor types and/or subtypes. Merely by way of example, the characteristic immune cell type infiltrate for pheochromocytoma and paraganglioma (PCPG) is or comprises an absence or deficiency of B-cells, CD8+ T-cells, Treg cells, natural killer (NK) cells and neutrophils in a collection of tumor types and/or subtypes. In this instance, the characteristic immune cell type infiltrate for prostate adenocarcinoma (PRAD) is or comprises an abundance or overrepresentation of CD4+ T-cells and absence or deficiency of B-cells, CD8+ T-cells, Treg cells, monocytes, macrophages, natural killer (NK) cells and neutrophils in a collection of tumor types and/or subtypes. In this example, the characteristic immune cell type infiltrate for rectum adenocarcinoma (READ) is or comprises an absence or deficiency of B-cells, CD4+ T-cells, monocytes and macrophages in a collection of tumor types and/or subtypes.

In another example, the characteristic immune cell type infiltrate for sarcoma (SARC) is or comprises an abundance or overrepresentation of B-cells and monocytes and absence or deficiency of CD4+ T-cells and neutrophils in a collection of tumor types and/or subtypes.

In one embodiment of the invention, the characteristic immune cell type infiltrate for skin cutaneous melanoma (SKCM) is or comprises an abundance or overrepresentation of B-cells and natural killer (NK) cells and absence or deficiency of CD4+ T-cells and neutrophils in a collection of tumor types and/or subtypes. In another embodiment of the invention, the characteristic immune cell type infiltrate for testicular germ cell tumors (TGCT) is or comprises an abundance or overrepresentation of Treg cells and absence or deficiency of CD4+ T-cells and neutrophils in a collection of tumor types and/or subtypes. Here, for example, the characteristic immune cell type infiltrate for thyroid carcinoma (THCA) is or comprises an absence or deficiency of monocytes and neutrophils in a collection of tumor types and/or subtypes.

Additionally, the characteristic immune cell type infiltrate for thymoma (THYM) is or comprises an abundance or overrepresentation of CD8+ T-cells and Treg cells and absence or deficiency of monocytes and neutrophils in a collection of tumor types and/or subtypes. Merely by way of example, the characteristic immune cell type infiltrate for uterine carcinosarcoma (UCS) is or comprises an absence or deficiency of natural killer (NK) cells and neutrophils in a collection of tumor types and/or subtypes. Also, the characteristic immune cell type infiltrate for uveal melanoma (UVM) is or comprises an abundance or overrepresentation of B-cells and absence or deficiency of CD4+ T-cells, Treg cells, monocytes, macrophages, natural killer (NK) cells and neutrophils in a collection of tumor types and/or subtypes.

The invention provides a method for identifying a cancer patient most likely to be responsive to immune checkpoint inhibitor therapy. In one embodiment, the method comprises obtaining a tumor sample from the cancer patient. The method comprises determining gene expression for a set of genes of the isolated tumor sample. Additionally, the method comprises applying a minimal gene expression signature associated with CD8+ T-cell so as to determine a threshold presence of CD8+ T-cell.

The method comprises determining the functional state of the CD8+ T-cell by analyzing one or more markers associated with anergy and exhaustion of CD8+ T-cell, wherein the marker is selected from the group consisting of CTLA-4, LAG3 and TIM3 or a combination thereof. The method additionally comprises finding presence or upregulation of CTLA-4, LAG3 and/or TIM3, which is indicative of anergic and exhausted CD8+ T-cell and of a tumor infiltrated by dysfunctional CD8+ T-cell which is responsive to immune checkpoint blockade. In this embodiment, the set of genes consists of or comprises the genes or combination of genes as provided in Table 15. In a further embodiment of the invention, the immune checkpoint therapy comprises use of anti-cytotoxic T lymphocyte antigen 4 (CTLA-4) antibody, anti-programmed death 1 (PD-1) monoclonal antibody, anti-CD137 antibody, anti-IDO-1 antibody, an antibody against PD-1, an antibody against PDL1, an antibody against PDL2, an antibody against B7-H3, an antibody against B7-H4, an antibody against LAG3, an antibody against KIR, an antibody against TIM3, an antibody against TIGIT, an antibody against BTLA, an antibody against CD160, an antibody against A2aR, and/or an antibody against a VISTA protein(s).
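The two-stage decision described above (a threshold presence of CD8+ T-cells, followed by presence or upregulation of CTLA-4, LAG3 and/or TIM3) can be sketched as follows. The signature genes `CD8A` and `CD8B` are placeholders and are not the Table 15 gene set; the mean-expression scoring rule, both thresholds and all expression values are likewise assumptions made for illustration.

```python
# Exhaustion/anergy markers named in the text.
EXHAUSTION_MARKERS = ("CTLA-4", "LAG3", "TIM3")


def signature_score(expression, signature_genes):
    """Mean expression of the signature genes (assumed scoring rule)."""
    return sum(expression[g] for g in signature_genes) / len(signature_genes)


def likely_responsive(expression, signature_genes, cd8_threshold, marker_threshold):
    """True if CD8+ T-cells are present above threshold and look exhausted."""
    if signature_score(expression, signature_genes) < cd8_threshold:
        return False  # too few CD8+ T-cells infiltrating the tumor
    # Any one upregulated marker is taken as evidence of dysfunction.
    return any(expression.get(m, 0.0) >= marker_threshold for m in EXHAUSTION_MARKERS)


# Placeholder signature and hypothetical expression values.
signature = ["CD8A", "CD8B"]
sample = {"CD8A": 6.0, "CD8B": 4.0, "CTLA-4": 5.5, "LAG3": 1.0, "TIM3": 0.5}

print(likely_responsive(sample, signature, cd8_threshold=3.0, marker_threshold=4.0))
```

In practice the thresholds would have to be calibrated against reference samples; the values used here only demonstrate the control flow.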

The invention also provides a method for identifying immunogenic features of a tumor microenvironment. In one embodiment, the method comprises (a) obtaining a tumor tissue sample from a subject; (b) determining gene expression of the isolated tumor tissue so as to obtain gene expression data; (c) deconvolving the gene expression data of (b) by applying gene expression signatures associated with specific immune cell types and/or subtypes, so as to obtain immune scores for the immune cell types and/or subtypes whose gene expression signatures were used in the deconvolution; (d) optionally, determining one or more functional markers of immune cells so as to assess the functional status of the immune cell infiltrate; and (e) comparing the immune score for each specific immune cell type and/or subtype with the immune score for other immune cell types and/or subtypes, and optionally, the functional status of immune cells, so as to identify specific immune cell types and/or subtypes as immune infiltrates enriched or deficient in the tumor tissue, and optionally, the functional status of the specific immune cell types and/or subtypes of immune cell infiltrate.
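Steps (c) and (e) above can be sketched in a few lines. This is a simplified stand-in rather than the deconvolution actually used: the immune score is taken as the mean expression of each signature, the comparison in step (e) is made against the median score across cell types, and the signature genes and expression values are illustrative placeholders.

```python
from statistics import median


def immune_scores(expression, signatures):
    """Step (c), simplified: one score per cell type from its signature genes."""
    return {
        cell: sum(expression[g] for g in genes) / len(genes)
        for cell, genes in signatures.items()
    }


def compare_scores(scores):
    """Step (e), simplified: flag cell types above or below the median score."""
    mid = median(scores.values())
    return {
        "enriched": sorted(c for c, s in scores.items() if s > mid),
        "deficient": sorted(c for c, s in scores.items() if s < mid),
    }


# Placeholder signatures; not the minimal signatures of the specification.
signatures = {
    "B-cell": ["CD19", "MS4A1"],
    "CD8+ T-cell": ["CD8A", "CD8B"],
    "NK cell": ["NCAM1", "KLRD1"],
}

# Hypothetical expression data for one tumor tissue sample (step (b)).
expression = {
    "CD19": 6.0, "MS4A1": 4.0,
    "CD8A": 2.0, "CD8B": 2.0,
    "NCAM1": 0.5, "KLRD1": 0.5,
}

scores = immune_scores(expression, signatures)
print(scores)
print(compare_scores(scores))
```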

Merely by way of example, the gene expression may be determined from RNA transcripts isolated from the sample. In a further example, the gene expression is determined by sequencing, hybridization or micro-array analysis of RNA transcripts or cDNA obtained from RNA transcripts.

In one embodiment of the method, the gene expression signatures associated with specific immune cell types and/or subtypes are obtained from examining expression of specific gene sets expressed more selectively in one immune cell type and/or subtype than in others. In a further embodiment, the gene expression signatures associated with specific immune cell types and/or subtypes are a collection of minimal gene expression signatures obtained by any of the methods described above. The tumor tissue sample is classified as originating from an epithelial, stromal or immune cell. In one embodiment of the invention, the tumor tissue sample is analysed for its epithelial and stromal content. Examples of the immune cells include, but are not limited to, B-cell, CD4+ T-cell, CD8+ T-cell, Treg cell, monocyte, macrophage, natural killer (NK) cell, neutrophil, myeloid-derived suppressor cell (MDSC), dendritic cell, macrophage M1 and M2 sub-types, granulocytic myeloid-derived suppressor cell (G-MDSC) subtype and monocytic myeloid-derived suppressor cell (M-MDSC) subtype or a combination thereof.

In one instance, the immune cells are CD8+ T-cells. In a further embodiment, the CD8+ T-cells are analyzed for one or more functional markers associated with anergy and exhaustion of CD8+ T-cell. Further, examples of the one or more functional markers associated with anergy and exhaustion of CD8+ T-cell include, but are not limited to, PD1, TNF-α, IFN-γ, IL2, granzyme A and B, CTLA-4, LAG3 and TIM3 or a combination thereof.

In another instance, identifying immunogenic features of a tumor microenvironment additionally comprises assessing chemoattractant gene expression. Moreover, in a further embodiment, the chemoattractant gene expression is expression of chemoattractant genes for immune cells.

Here, examples of the chemoattractant genes include, but are not limited to, CCL1, CCL2, CCL3, CCL4, CCL5, CCL7, CCL8, CCL12, CCL13, CCL17, CCL19, CCL20, CCL21, CCL22, CCL25, CCL28, CXCL1, CXCL5, CXCL9, CXCL10, CXCL11, CXCL12, CXCL13, CXCL14, CXCR3, CX3CL1, IL8, IFNG, MIF, and MIP3, or a combination thereof.

In one embodiment, identifying immunogenic features of a tumor microenvironment additionally comprises assessing tumor-associated genetic changes. In a further embodiment of the invention, the tumor-associated genetic changes are or comprise mutation in chromosomal DNA, changes to microsatellite repeats, instability of microsatellite repeats, addition of foreign genetic material or presence of extrachromosomal DNA or a combination thereof. In yet a further embodiment, mutation in chromosomal DNA is either in coding or non-coding region. In another embodiment, the foreign genetic material is introduced genetic material or genetic material of a virus. In another embodiment, the extrachromosomal DNA is produced by gene amplification of portion of a host chromosome or viral replication. In another embodiment of the invention, tumor-associated genetic change is expressed. In another embodiment of the invention, tumor-associated genetic change may be detected in a RNA transcript. In a further embodiment of the invention, tumor-associated genetic change is in a protein participating in a regulatory pathway, a protein participating in a signal transduction pathway, a protein participating in protein turnover, a protein participating in metabolic pathway, a cell cycle regulatory protein, a protein participating in cell turnover, a cytokine, a chemokine, cell adhesion molecule, a cell surface receptor, microsatellite repeats and/or a miRNA.

For example, tumor-associated genetic change is a mutation in an oncogene and/or a tumor suppressor gene. In a further embodiment, examples of the oncogene include, but are not limited to, EGFR, H-RAS, N-RAS, PIK3CA, RNF43, KRAS, IDH1, FGFR3 and BRAF. In another further embodiment, examples of the tumor suppressor gene include, but are not limited to, TP53, DOCK3, BMPR2, CHEK2, TP53INP1 and ACVR2A. Also, in one example, the tumor-associated genetic change is as shown in Figure 13D. In another embodiment of the invention, the immunogenic feature is as provided in Figure 16 F-H.

The invention provides a method for determining tumor grade based on immunogenic features of a tumor microenvironment. In one embodiment, the method comprises (a) determining the immunogenic features of a tumor microenvironment by the method described above; (b) comparing the immunogenic features so determined in step (a) to a reference comprising immunogenic features determined for different tumor grades for the same type and/or subtype of tumor; and (c) finding the immunogenic features with the closest match so as to be able to determine tumor grade; thereby determining tumor grade based on immunogenic features of a tumor microenvironment (Figure 14C).
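By way of a non-limiting illustration, the comparison and closest-match steps (b) and (c) may be sketched as a nearest-reference lookup. The sketch below is an assumption for illustration only: the description does not specify a distance metric, so Euclidean distance is used here, and the function name `closest_reference`, the feature names and all numeric values are hypothetical.

```python
import math


def closest_reference(sample, references):
    """Return the label of the reference profile nearest to `sample`.

    `sample` maps a feature name to a score. `references` maps a label
    (here, a hypothetical tumor grade) to a profile of the same shape.
    Euclidean distance over shared features is an assumed metric; the
    description only requires "the closest match".
    """
    def distance(a, b):
        shared = sorted(set(a) & set(b))
        if not shared:
            raise ValueError("profiles share no features")
        return math.sqrt(sum((a[f] - b[f]) ** 2 for f in shared))

    return min(references, key=lambda label: distance(sample, references[label]))


# Hypothetical reference immune scores for one tumor type, by grade.
reference_by_grade = {
    "grade_1": {"CD8_T": 0.30, "Treg": 0.05, "M2_macrophage": 0.10},
    "grade_2": {"CD8_T": 0.20, "Treg": 0.10, "M2_macrophage": 0.20},
    "grade_3": {"CD8_T": 0.05, "Treg": 0.25, "M2_macrophage": 0.40},
}

# Hypothetical immune scores determined for a new tumor sample.
sample = {"CD8_T": 0.08, "Treg": 0.22, "M2_macrophage": 0.35}

print(closest_reference(sample, reference_by_grade))
```

The same lookup could, under the same assumptions, be applied to references stratified by survival or by drug response rather than by grade.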

The invention also provides a method for predicting likelihood of survival of a subject with cancer based on immunogenic features of a tumor microenvironment. In one embodiment, the method comprises (a) determining the immunogenic features of a tumor microenvironment by the method described above; (b) comparing the immunogenic features so determined in step (a) to a reference comprising immunogenic features for the same type and/or subtype of tumor stratified by percent survival or likelihood of survival, or alternatively, a reference comprising immunogenic features for the same type and/or subtype of tumor classified as being associated with live patients due to remission or stable disease or dead patients due to succumbing to cancer; and (c) finding the immunogenic features with the closest match so as to be able to predict likelihood of survival of a subject with cancer; thereby predicting likelihood of survival of a subject with cancer based on immunogenic features of a tumor microenvironment (Figure 14 A-B).

The invention provides a method for predicting response to one or more cancer drugs or a combination of cancer drugs in a subject based on immunogenic features of a tumor microenvironment. In one embodiment, the method comprises: (a) determining the immunogenic features of a tumor microenvironment by the method described above; (b) comparing the immunogenic features so determined in step (a) to a reference comprising immunogenic features for the same type and/or subtype of tumor stratified by percent response to one or more cancer drugs or a combination of cancer drugs; and (c) finding the immunogenic features with the closest match so as to be able to predict response to one or more cancer drugs or a combination of cancer drugs; thereby predicting response to one or more cancer drugs or a combination of cancer drugs in a subject based on immunogenic features of a tumor microenvironment.

Examples of the cancer drug include, but are not limited to, ABVD

(doxorubicin/bleomycin/vinblastine/dacarbazine combination), AC (Adriamycin/cyclophosphamide combination), ACE (Adriamycin/cyclophosphamide/etoposide combination), doxorubicin

(Adriamycin), vinblastine, dacarbazine (DTIC), etoposide (Eposin, Etopophos or Vepesid), abiraterone (Zytiga), nab-paclitaxel (Abraxane), Abstral, actinomycin D, Dactinomycin

(Cosmegen), Actiq, Afatinib (Giotrif), everolimus (Afinitor), aflibercept (Zaltrap), imiquimod cream (Aldara), aldesleukin (IL-2, Proleukin or interleukin 2), alemtuzumab (MabCampath), melphalan (Alkeran), amsacrine (amsidine, m-AMSA), anastrozole (Arimidex), cytarabine (Ara C, cytosine arabinoside), disodium pamidronate (Aredia), exemestane (Aromasin), arsenic trioxide (Trisenox, ATO), asparaginase (Crisantaspase, Erwinase), axitinib (Inlyta), azacitidine (Vidaza), BEACOPP

(bleomycin/etoposide/doxorubicin/cyclophosphamide/vincristine/procarbazine/prednisolone combination), BEAM (carmustine (BiCNU)/etoposide/cytarabine (Ara-C, cytosine

arabinoside)/melphalan combination), procarbazine, prednisolone, bendamustine (Levact), bevacizumab (Avastin), bexarotene (Targretin), bicalutamide (Casodex), bleomycin, BEP (bleomycin/etoposide/platinum (cisplatin) combination), bortezomib (Velcade), bosutinib (Bosulif), brentuximab (Adcetris), ibuprofen (Brufen, Nurofen), buserelin (Suprefact), busulfan (Myleran, Busilvex), CAPOX (CAPE-OX, XELOX; oxaliplatin and capecitabine combination), CAV

(cyclophosphamide/doxorubicin (Adriamycin)/vincristine combination), CAVE

(cyclophosphamide/doxorubicin (Adriamycin)/vincristine/etoposide combination), lomustine (CCNU), CHOP (cyclophosphamide/doxorubicin hydrochloride (Adriamycin)/vincristine

(Oncovin)/prednisolone combination), CMF (cyclophosphamide/methotrexate/fluorouracil (5FU) combination), CMV (cisplatin/methotrexate/vinblastine combination), CTD

(cyclophosphamide/thalidomide/dexamethasone combination), CVP (cyclophosphamide/vincristine (Oncovin)/prednisolone combination), cabazitaxel (Jevtana), cabozantinib (Cometriq, Cabometyx), liposomal doxorubicin (Caelyx, Myocet, Doxil), paracetamol (Panadol, Anadin, Calpol), irinotecan (Campto), capecitabine (Xeloda), vandetanib (Caprelsa), Carbo MV

(carboplatin/methotrexate/vinblastine combination), PC (CarboTaxol; paclitaxel and carboplatin combination), carboplatin, carboplatin and etoposide combination, carmustine (BCNU, Gliadel), celecoxib (Celebrex), ceritinib (Zykadia), daunorubicin (Cerubidin), cetuximab (Erbitux), ChlVPP (chlorambucil/vinblastine/procarbazine/prednisolone combination), chlorambucil (Leukeran), cisplatin, cisplatin and teysuno combination, CX (cisplatin and capecitabine (Xeloda) combination), PEI (cisplatin, etoposide and ifosfamide combination), cisplatin/fluorouracil (5-FU)/trastuzumab combination, cladribine (Leustat, LITAK), sodium clodronate (Bonefos, Clasteon), clofarabine (Evoltra), co-codamol (Kapake, Solpadol, Tylex), cabozantinib (Cometriq, Cabometyx),

Dactinomycin (actinomycin D, Cosmegen), crizotinib (Xalkori), cyclophosphamide, cyproterone acetate (Cyprostat), DHAP (dexamethasone/high dose cytarabine/cisplatin combination), dacarbazine (DTIC), dabrafenib (Tafinlar), decitabine (Dacogen), dasatinib (Sprycel), de Gramont (fluorouracil (5FU)/folinic acid combination), triptorelin (Decapeptyl SR, Gonapeptyl Depot), degarelix (Firmagon), denosumab (Prolia, Xgeva), dexamethasone, prednisolone,

methylprednisolone, diamorphine, docetaxel (Taxotere), TPF (docetaxel

(Taxotere)/cisplatin/fluorouracil combination), Doxifos (dox-ifos; doxorubicin and ifosfamide combination), flutamide (Drogenil, Eulexin), fentanyl (Durogesic, Effentora, Instanyl), E-CMF (Epi-CMF; epirubicin/cyclophosphamide/methotrexate/fluorouracil combination), EC (epirubicin and cyclophosphamide combination), ECF (epirubicin, cisplatin and fluorouracil (5FU)

combination), EOF (epirubicin, oxaliplatin and fluorouracil (5FU) combination), EOX (epirubicin, oxaliplatin and capecitabine combination), EP (etoposide and cisplatin combination), ESHAP (etoposide, methylprednisolone, cytarabine and cisplatin combination), fluorouracil (5FU; Efudix), vindesine (Eldisine), oxaliplatin (Eloxatin), enzalutamide (Xtandi), epirubicin (Pharmorubicin), ECarboX (epirubicin (Pharmorubicin), carboplatin (Paraplatin) and capecitabine (Xeloda) combination), ECX (epirubicin (Pharmorubicin)/cisplatin/capecitabine (Xeloda) combination), etoposide (Eposin, Etopophos, Vepesid), cetuximab (Erbitux), eribulin (Halaven), erlotinib

(Tarceva), estramustine (Estracyt), ELF (etoposide/leucovorin (folinic acid, FA, calcium

folinate)/fluorouracil (5FU) combination), everolimus (Afinitor), clofarabine (Evoltra), exemestane (Aromasin), FAD (fludarabine/doxorubicin (Adriamycin)/dexamethasone combination), FC (fludarabine (Fludara)/cyclophosphamide combination), FCR (fludarabine, cyclophosphamide and rituximab combination), FEC (fluorouracil (5FU)/epirubicin/cyclophosphamide combination), FEC-T (fluorouracil (5FU)/epirubicin/cyclophosphamide/docetaxel (Taxotere) combination), FMD (fludarabine (Fludara)/mitoxantrone (Onkotrone)/dexamethasone combination), FOLFIRINOX (folinic acid (leucovorin, calcium folinate, FA)/fluorouracil (5FU)/irinotecan/oxaliplatin

combination), fulvestrant (Faslodex), letrozole (Femara), degarelix (Firmagon), fludarabine

(Fludara), fluorouracil (5FU), FOLFIRI (folinic acid, fluorouracil and irinotecan combination), FOLFOX (Folinic acid, fluorouracil and oxaliplatin combination), fulvestrant (Faslodex), granulocyte colony stimulating factor (G-CSF), lenograstim (Granocyte), filgrastim (Neupogen, Zarzio, Nivestim, Ratiograstim), long acting (pegylated) filgrastim (pegfilgrastim, Neulasta), long acting (pegylated) lipegfilgrastim (Longquex), gefitinib (Iressa), GemCarbo (gemcitabine and carboplatin combination), GemTaxol (Gemcitabine (Gemzar) and paclitaxel (Taxol) combination), gemcitabine (Gemzar), GemCap (gemcitabine and capecitabine combination), GC (gemcitabine and cisplatin combination), imatinib (Glivec), triptorelin (Decapeptyl SR, Gonapeptyl Depot), goserelin (Zoladex, Novgos), eribulin (Halaven), trastuzumab (Herceptin), topotecan (Hycamtin, Potactasol), hydroxycarbamide (Hydrea), hydroxyurea, I-DEX (Z-DEX; idarubicin (Zavedos) and

dexamethasone combination), ICE (ifosfamide, carboplatin and etoposide (Vepesid, Etopophos, Eposin) combination), aldesleukin (IL-2, Proleukin or interleukin 2), IPE (VIP; PEI; cisplatin, etoposide and ifosfamide combination), ibandronic acid (Bondronat), ibritumomab (Zevalin), ponatinib (Iclusig), idarubicin (Zavedos), idelalisib (Zydelig), ifosfamide (Mitoxana),

pomalidomide (Imnovid), interferon (intron A), ipilimumab (Yervoy), XELIRI (irinotecan and capecitabine combination), vinflunine (Javlor), trastuzumab emtansine ( adcyla), pembrolizumab (Keytruda), tioguanine (thioguanine, 6-TG, 6-tioguanine; Lanvis), lapatinib (Tyverb), lenalidomide (Revlimid), letrozole (Femara), leuprorelin (Prostap, Lutrate), olaparib (Lynparza), mitotane (Lysodren), MIC (mitomycin, ifosfamide and cisplatin combination), MM (mitoxantrone

(Mitozantrone, Onkotrone) and methotrexate (Maxtrex) combination), MMM (mitoxantrone (Mitozantrone), mitomycin C and methotrexate combination), morphine (Morphgesic SR, MXL, Zomorph, MST, MST Continus, Sevredol, Oramorph), MVAC (methotrexate, vinblastine, doxorubicin (Adriamycin) and cisplatin combination), MVP (mitomycin, vinblastine and cisplatin combination), rituximab (Mabthera), methotrexate (Maxtrex), medroxyprogesterone acetate (Provera), megestrol acetate (Megace), MPT (melphalan, prednisolone and thalidomide combination), mifamurtide (Mepact), mitomycin C (Mitomycin-C Kyowa), mitoxantrone

(Mitozantrone, Onkotrone), vinorelbine (Navelbine), nelarabine (Atriance), sorafenib (Nexavar), nilotinib (Tasigna), nintedanib (Vargatef), pentostatin (Nipent), nivolumab (Opdivo), ofatumumab (Arzerra), olaparib (Lynparza), vincristine (Oncovin), oxaliplatin (Eloxatin), XELOX (oxaliplatin (Eloxatin) and capecitabine (Xeloda) combination), PAD (bortezomib (Velcade), doxorubicin (Adriamycin) and dexamethasone combination), PCV (procarbazine, lomustine (CCNU) and vincristine combination), EP (etoposide (Vepesid, Eposin, Etopophos) and cisplatin combination), PMitCEBO (prednisolone, mitoxantrone, cyclophosphamide, etoposide, bleomycin and vincristine (Oncovin) combination), POMB/ACE (cisplatin, vincristine (Oncovin), methotrexate, bleomycin, Actinomycin (Dactinomycin), cyclophosphamide and etoposide (Eposin, Etopophos, Vepesid) combination), paclitaxel (Taxol), panitumumab (Vectibix), pazopanib (Votrient), pemetrexed (Alimta), pemetrexed (Alimta) and carboplatin combination, pemetrexed (Alimta) and cisplatin combination, pertuzumab (Perjeta), pixantrone (Pixuvri), mercaptopurine (Purinethol; Xaluprine), R-CHOP (rituximab (Mabthera), cyclophosphamide, doxorubicin hydrochloride, vincristine (Oncovin) and prednisolone combination), R-CVP (rituximab (Mabthera), cyclophosphamide, vincristine (Oncovin) and prednisolone combination), R-DHAP (rituximab (Mabthera), dexamethasone, cytarabine and cisplatin combination), R-ESHAP (rituximab (Mabthera), etoposide, methylprednisolone, cytarabine and cisplatin combination), R-GCVP (rituximab (Mabthera), gemcitabine, cyclophosphamide, vincristine and prednisolone combination), RICE (rituximab (Mabthera), ifosfamide, carboplatin and etoposide combination), raloxifene, raltitrexed (Tomudex), regorafenib (Stivarga), Stanford V (doxorubicin, vinblastine, mechlorethamine (mustine or nitrogen mustard), vincristine, bleomycin, etoposide and prednisone (or prednisolone) combination), streptozocin (Zanosar), sunitinib (Sutent), TAC (docetaxel 
(Taxotere), doxorubicin (Adriamycin) and cyclophosphamide combination), TIP (paclitaxel, ifosfamide and cisplatin combination), tamoxifen, TC (docetaxel (Taxotere) and cyclophosphamide combination), temozolomide (Temodal), temsirolimus (Torisel), thiotepa (Tepadina), trabectedin (Yondelis), treosulfan, tretinoin (Vesanoid, ATRA), VIDE (vincristine, ifosfamide, doxorubicin and etoposide combination), VeIP (vinblastine, ifosfamide and cisplatin combination), vinblastine (Velbe), vemurafenib (Zelboraf), vincristine (Oncovin), VAC (vincristine, actinomycin D (dactinomycin) and cyclophosphamide combination), VAI (vincristine, actinomycin and ifosfamide combination), VAD (vincristine, Adriamycin (doxorubicin) and dexamethasone combination), vismodegib (Erivedge), XELOX (oxaliplatin (Eloxatin) and capecitabine (Xeloda) combination) and zoledronic acid (Zometa), or a combination thereof.

In one embodiment of the method, the cancer drug is a cancer immunotherapy drug. In a further embodiment, examples of the cancer immunotherapy drug include, but are not limited to, cyclosporine, dexamethasone, tacrolimus, infliximab, mycophenolate mofetil (MMF), methotrexate, sirolimus, antithymocyte globulin (ATG), pentostatin, anti-cytotoxic T lymphocyte antigen 4 (CTLA-4) antibody, anti-CD137 antibody, anti-IDO-1 antibody, anti-programmed death 1 (PD-1) monoclonal antibody, an antibody against PD-1, an antibody against PDL1, an antibody against PDL2, an antibody against B7-H3, an antibody against B7-H4, an antibody against LAG3, an antibody against KIR, an antibody against TIM3, an antibody against TIGIT, an antibody against BTLA, an antibody against CD160, an antibody against A2aR, and an antibody against a VISTA protein(s).

microenvironment in a tumor sample from the subject; (d) comparing the immunogenic features so obtained with the immunogenic features associated with a good or favourable tumor or cancer prognosis and/or associated with a bad or unfavourable tumor or cancer prognosis, so as to assess prognosis of a subject afflicted with a tumor or cancer; and (e) comparing the immunogenic features so obtained with the immunogenic features associated with a good or favourable response to a cancer drug and/or associated with a bad or unfavourable response to a cancer drug, so as to predict response to a cancer drug by the subject.

In one particular embodiment, the immunogenic features associated with a good or favourable tumor or cancer prognosis and/or associated with a bad or unfavourable tumor or cancer prognosis are determined on one or more group of subjects with good or favourable outcome to tumor or cancer and/or one or more group of subjects with bad or unfavourable outcome to tumor or cancer, respectively (Figure 14 A-B and Table 17).

In another particular embodiment, the immunogenic features associated with a good or favourable response to a cancer drug and/or associated with a bad or unfavourable response to a cancer drug are determined on one or more groups of subjects with good or favourable response to a cancer drug and/or one or more groups of subjects with bad or unfavourable response to a cancer drug, respectively.

Examples of the tumor may include, but are not limited to, a tumor or cancer of the brain, head, eye, bladder, neck, mouth, nose, throat, thymus, lymph node, blood, lung, esophagus, trachea, stomach, intestine, colon, rectum, pancreas, liver, kidney, bone, skin, breast, arm, hand, chest, abdomen, leg, foot, genital, testes, ovary, uterus, cervix, urethra, and/or prostate.

Examples of the tumor include, but are not limited to, adrenocortical carcinoma (ACC), bladder urothelial carcinoma (BLCA), breast invasive carcinoma (BRCA), cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC), cholangiocarcinoma (CHOL), colon adenocarcinoma (COAD), lymphoid neoplasm diffuse large B-cell lymphoma (DLBC), esophageal carcinoma (ESCA), glioblastoma multiforme (GBM), head and neck squamous cell carcinoma (HNSC), kidney chromophobe (KICH), kidney renal clear cell carcinoma (KIRC), kidney renal papillary cell carcinoma (KIRP), acute myeloid leukemia (LAML), chronic myelogenous leukemia (LCML), brain lower grade glioma (LGG), liver hepatocellular carcinoma (LIHC), lung adenocarcinoma (LUAD), lung squamous cell carcinoma (LUSC), mesothelioma (MESO), ovarian serous cystadenocarcinoma (OV), pancreatic adenocarcinoma (PAAD), pheochromocytoma and paraganglioma (PCPG), prostate adenocarcinoma (PRAD), rectum adenocarcinoma (READ), sarcoma (SARC), skin cutaneous melanoma (SKCM), stomach adenocarcinoma (STAD), testicular germ cell tumors (TGCT), thyroid carcinoma (THCA), thymoma (THYM), uterine corpus endometrial carcinoma (UCEC), uterine carcinosarcoma (UCS) or uveal melanoma (UVM) or a combination thereof.

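Examples of the cancer drug may include, but are not limited to, ABVD (doxorubicin/bleomycin/vinblastine/dacarbazine combination), AC (Adriamycin/cyclophosphamide combination), ACE (Adriamycin/cyclophosphamide/etoposide combination), doxorubicin (Adriamycin), vinblastine, dacarbazine (DTIC), etoposide (Eposin, Etopophos or Vepesid), abiraterone (Zytiga), nab-paclitaxel (Abraxane), Abstral, actinomycin D, Dactinomycin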

(Cosmegen), Actiq, Afatinib (Giotrif), everolimus (Afinitor), aflibercept (Zaltrap), imiquimod cream (Aldara), aldesleukin (IL-2, Proleukin or interleukin 2), alemtuzumab (MabCampath), melphalan (Alkeran), amsacrine (amsidine, m-AMSA), anastrozole (Arimidex), cytarabine (Ara C, cytosine arabinoside), disodium pamidronate (Aredia), exemestane (Aromasin), arsenic trioxide (Trisenox, ATO), asparaginase (Crisantaspase, Erwinase), axitinib (Inlyta), azacitidine (Vidaza), BEACOPP (bleomycin/etoposide/doxorubicin/cyclophosphamide/vincristine/procarbazine/prednisolone combination), BEAM (carmustine (BiCNU)/etoposide/cytarabine (Ara-C, cytosine arabinoside)/melphalan combination), procarbazine, prednisolone, bendamustine (Levact), bevacizumab (Avastin), bexarotene (Targretin), bicalutamide (Casodex), bleomycin, BEP

(bleomycin/etoposide/platinum (cisplatin) combination), bortezomib (Velcade), bosutinib (Bosulif), brentuximab (Adcetris), ibuprofen (Brufen, Nurofen), buserelin (Suprefact), busulfan (Myleran, Busilvex), CAPOX (CAPE-OX, XELOX; oxaliplatin and capecitabine combination), CAV

(cyclophosphamide/doxorubicin (Adriamycin)/vincristine combination), CAVE

(cyclophosphamide/doxorubicin (Adriamycin)/vincristine/etoposide combination), lomustine (CCNU), CHOP (cyclophosphamide/doxorubicin hydrochloride (Adriamycin)/vincristine

(Oncovin)/prednisolone combination), CMF (cyclophosphamide/methotrexate/fluorouracil (5FU) combination), CMV (cisplatin/methotrexate/vinblastine combination), CTD (cyclophosphamide/thalidomide/dexamethasone combination), CVP (cyclophosphamide/vincristine (Oncovin)/prednisolone combination), cabazitaxel (Jevtana), cabozantinib (Cometriq, Cabometyx), liposomal doxorubicin (Caelyx, Myocet, Doxil), paracetamol (Panadol, Anadin, Calpol), irinotecan (Campto), capecitabine (Xeloda), vandetanib (Caprelsa), Carbo MV

methylprednisolone, diamorphine, docetaxel (Taxotere), TPF (docetaxel

(Taxotere)/cisplatin/fluorouracil combination), Doxifos (dox-ifos; doxorubicin and ifosfamide combination), flutamide (Drogenil, Eulexin), fentanyl (Durogesic, Effentora, Instanyl), E-CMF (Epi-CMF; epirubicin/cyclophosphamide/methotrexate/fluorouracil combination), EC (epirubicin and cyclophosphamide combination), ECF (epirubicin, cisplatin and fluorouracil (5FU)

(Tarceva), estramustine (Estracyt), ELF (etoposide/leucovorin (folinic acid, FA, calcium folinate)/fluorouracil (5FU) combination), everolimus (Afinitor), clofarabine (Evoltra), exemestane (Aromasin), FAD (fludarabine/doxorubicin (Adriamycin)/dexamethasone combination), FC

(fludarabine (Fludara)/cyclophosphamide combination), FCR (fludarabine, cyclophosphamide and rituximab combination), FEC (fluorouracil (5FU)/epirubicin/cyclophosphamide combination), FEC-T (fluorouracil (5FU)/epirubicin/cyclophosphamide/docetaxel (Taxotere) combination), FMD (fludarabine (Fludara)/mitoxantrone (Onkotrone)/dexamethasone combination), FOLFIRINOX (folinic acid (leucovorin, calcium folinate, FA)/fluorouracil (5FU)/irinotecan/oxaliplatin

combination), fulvestrant (Faslodex), letrozole (Femara), degarelix (Firmagon), fludarabine (Fludara), fluorouracil (5FU), FOLFIRI (folinic acid, fluorouracil and irinotecan combination), FOLFOX (Folinic acid, fluorouracil and oxaliplatin combination), fulvestrant (Faslodex), granulocyte colony stimulating factor (G-CSF), lenograstim (Granocyte), filgrastim (Neupogen, Zarzio, Nivestim, Ratiograstim), long acting (pegylated) filgrastim (pegfilgrastim, Neulasta), long acting (pegylated) lipegfilgrastim (Longquex), gefitinib (Iressa), GemCarbo (gemcitabine and carboplatin combination), GemTaxol (Gemcitabine (Gemzar) and paclitaxel (Taxol) combination), gemcitabine (Gemzar), GemCap (gemcitabine and capecitabine combination), GC (gemcitabine and cisplatin combination), imatinib (Glivec), triptorelin (Decapeptyl SR, Gonapeptyl Depot), goserelin (Zoladex, Novgos), eribulin (Halaven), trastuzumab (Herceptin), topotecan (Hycamtin, Potactasol), hydroxycarbamide (Hydrea), hydroxyurea, I-DEX (Z-DEX; idarubicin (Zavedos) and

pomalidomide (Imnovid), interferon (intron A), ipilimumab (Yervoy), XELIRI (irinotecan and capecitabine combination), vinflunine (Javlor), trastuzumab emtansine (Kadcyla), pembrolizumab (Keytruda), tioguanine (thioguanine, 6-TG, 6-tioguanine; Lanvis), lapatinib (Tyverb), lenalidomide (Revlimid), letrozole (Femara), leuprorelin (Prostap, Lutrate), olaparib (Lynparza), mitotane (Lysodren), MIC (mitomycin, ifosfamide and cisplatin combination), MM (mitoxantrone

(Mitozantrone, Onkotrone) and methotrexate (Maxtrex) combination), MMM (mitoxantrone (Mitozantrone), mitomycin C and methotrexate combination), morphine (Morphgesic SR, MXL, Zomorph, MST, MST Continus, Sevredol, Oramorph), MVAC (methotrexate, vinblastine, doxorubicin (Adriamycin) and cisplatin combination), MVP (mitomycin, vinblastine and cisplatin combination), rituximab (Mabthera), methotrexate (Maxtrex), medroxyprogesterone acetate (Provera), megestrol acetate (Megace), MPT (melphalan, prednisolone and thalidomide

combination), mifamurtide (Mepact), mitomycin C (Mitomycin-C Kyowa), mitoxantrone

(Mitozantrone, Onkotrone), vinorelbine (Navelbine), nelarabine (Atriance), sorafenib (Nexavar), nilotinib (Tasigna), nintedanib (Vargatef), pentostatin (Nipent), nivolumab (Opdivo), ofatumumab (Arzerra), olaparib (Lynparza), vincristine (Oncovin), oxaliplatin (Eloxatin), XELOX (oxaliplatin (Eloxatin) and capecitabine (Xeloda) combination), PAD (bortezomib (Velcade), doxorubicin (Adriamycin) and dexamethasone combination), PCV (procarbazine, lomustine (CCNU) and vincristine combination), EP (etoposide (Vepesid, Eposin, Etopophos) and cisplatin combination), PMitCEBO (prednisolone, mitoxantrone, cyclophosphamide, etoposide, bleomycin and vincristine (Oncovin) combination), POMB/ACE (cisplatin, vincristine (Oncovin), methotrexate, bleomycin, Actinomycin (Dactinomycin), cyclophosphamide and etoposide (Eposin, Etopophos, Vepesid) combination), paclitaxel (Taxol), panitumumab (Vectibix), pazopanib (Votrient), pemetrexed (Alimta), pemetrexed (Alimta) and carboplatin combination, pemetrexed (Alimta) and cisplatin combination, pertuzumab (Perjeta), pixantrone (Pixuvri), mercaptopurine (Purinethol; Xaluprine), R-CHOP (rituximab (Mabthera), cyclophosphamide, doxorubicin hydrochloride, vincristine (Oncovin) and prednisolone combination), R-CVP (rituximab (Mabthera), cyclophosphamide, vincristine (Oncovin) and prednisolone combination), R-DHAP (rituximab (Mabthera),

dexamethasone, cytarabine and cisplatin combination), R-ESHAP (rituximab (Mabthera), etoposide, methylprednisolone, cytarabine and cisplatin combination), R-GCVP (rituximab (Mabthera), gemcitabine, cyclophosphamide, vincristine and prednisolone combination), RICE (rituximab (Mabthera), ifosfamide, carboplatin and etoposide combination), raloxifene, raltitrexed (Tomudex), regorafenib (Stivarga), Stanford V (doxorubicin, vinblastine, mechlorethamine (mustine or nitrogen mustard), vincristine, bleomycin, etoposide and prednisone (or prednisolone) combination), streptozocin (Zanosar), sunitinib (Sutent), TAC (docetaxel (Taxotere), doxorubicin (Adriamycin) and cyclophosphamide combination), TIP (paclitaxel, ifosfamide and cisplatin combination), tamoxifen, TC (docetaxel (Taxotere) and cyclophosphamide combination), temozolomide (Temodal), temsirolimus (Torisel), thiotepa (Tepadina), trabectedin (Yondelis), treosulfan, tretinoin (Vesanoid, ATRA), VIDE (vincristine, ifosfamide, doxorubicin and etoposide combination), VeIP (vinblastine, ifosfamide and cisplatin combination), vinblastine (Velbe), vemurafenib (Zelboraf), vincristine (Oncovin), VAC (vincristine, actinomycin D (dactinomycin) and cyclophosphamide combination), VAI (vincristine, actinomycin and ifosfamide combination), VAD (vincristine, Adriamycin (doxorubicin) and dexamethasone combination), vismodegib (Erivedge), XELOX (oxaliplatin (Eloxatin) and capecitabine (Xeloda) combination) and zoledronic acid (Zometa), and/or a combination thereof.

In one embodiment, the cancer drug is a cancer immunotherapy drug. Examples of the cancer immunotherapy drug may include, but are not limited to, cyclosporine, dexamethasone, tacrolimus, infliximab, mycophenolate mofetil (MMF), methotrexate, sirolimus, antithymocyte globulin (ATG), pentostatin, anti-cytotoxic T lymphocyte antigen 4 (CTLA-4) antibody, anti-CD137 antibody, anti-IDO-1 antibody, anti-programmed death 1 (PD-1) monoclonal antibody, an antibody against PD-1, an antibody against PDL1, an antibody against PDL2, an antibody against IDO-1, an antibody against CD137, an antibody against B7-H3, an antibody against B7-H4, an antibody against LAG3, an antibody against KIR, an antibody against TIM3, an antibody against TIGIT, an antibody against BTLA, an antibody against CD160, an antibody against A2aR, an antibody against a VISTA protein(s), and/or a combination thereof.

In a preferred embodiment of the invention, the subject is a mammal. Examples of the mammal may include, but are not limited to, human, mouse, rat, monkey, chimpanzee, cow, pig, horse, rabbit, mink, guinea pig, and/or hamster. For example, the tumor sample may be a tissue comprising tumor cells.

Additionally, identifying immunogenic features of a tumor microenvironment comprises: (a) analyzing gene expression data sets of pure immune cells and selecting genes that satisfy three criteria so as to establish a gene signature for a particular immune cell type and/or subtype: (i) stable expression of the gene in a given immune cell type or subtypes within the particular immune cell type; (ii) significantly higher level of expression of the gene in the immune cell type or subtypes within that particular cell type of interest than in other immune cells; (b) converting cell type- and subtype-specific gene expression signatures to an immune score, which can be used to stratify tissue samples as a quantitative measure of immunogenic features; (c) generating a range of scores to distinguish tumors as containing no infiltration of an immune cell type or subtype, or as follows: (i) low infiltration of immune cell type or subtype (about <5% infiltration); (ii) medium infiltration of immune cell type or subtype (about >5% to ≤25%); and (iii) high infiltration of immune cell type or subtype (about >25%) (Figure 12).
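By way of a non-limiting illustration, the gene selection of step (a) and the infiltration ranges of step (c) may be sketched as follows. The sketch covers only the two selection criteria listed above. The coefficient-of-variation cutoff (`max_cv`), the fold-change cutoff (`min_fold`) and the function names are hypothetical and are not taken from the description; only the approximate 5% and 25% boundaries come from the text, and the assignment of a value of exactly 5% to the medium range is an assumption.

```python
from statistics import mean, pstdev


def is_signature_gene(target_expr, other_expr, max_cv=0.25, min_fold=4.0):
    """Apply two selection criteria to one gene.

    target_expr: expression values across samples of the cell type of interest.
    other_expr:  expression values across samples of other immune cells.
    max_cv and min_fold are assumed cutoffs, chosen for illustration only.
    """
    target_mean = mean(target_expr)
    if target_mean <= 0:
        return False
    # Criterion (i): stable expression, measured here as low relative spread.
    cv = pstdev(target_expr) / target_mean
    # Criterion (ii): markedly higher expression than in other immune cells.
    fold = target_mean / max(mean(other_expr), 1e-9)
    return cv <= max_cv and fold >= min_fold


def infiltration_level(fraction):
    """Map an estimated infiltrating fraction (0.0 to 1.0) to a range label."""
    if fraction <= 0:
        return "none"
    if fraction < 0.05:
        return "low"
    if fraction <= 0.25:  # exactly 5% is placed here by assumption
        return "medium"
    return "high"


# Hypothetical expression values for one candidate gene.
print(is_signature_gene([10, 11, 9, 10], [1, 2, 1.5, 1]))
print([infiltration_level(f) for f in (0.0, 0.03, 0.10, 0.40)])
```

How an immune score is converted into an estimated infiltrating fraction is not specified here, so the second function takes the fraction as given.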

In another instance, the immune signature-derived scores are applied on human tumor gene expression data to predict prognosis comprising: (a) generating gene expression signatures by the method described above and applying the gene expression signatures on tumors to separate them into clusters or groups; (b) characterizing immune cell infiltration profile for each cluster based on their immune scores; (c) selecting alive and dead individuals of each cluster and identifying signatures associated with prognosis; (d) assessing prognosis based on immune infiltrate composition of the tumor matching closely with either alive or dead so that signatures associated with good or poor prognosis are closely linked to the immune infiltrate composition of the tumor. In a further embodiment, the signatures associated with good and bad prognosis are closely linked to the function of infiltrating immune cells.
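By way of a non-limiting illustration, steps (c) and (d) may be sketched by averaging immune scores over the alive and dead individuals of one cluster and assigning a new tumor to the nearer average. Averaging and squared Euclidean distance are assumptions made for this sketch, not methods stated in the description, and the function names and all values are hypothetical.

```python
from collections import defaultdict


def outcome_centroids(samples):
    """Average immune scores per outcome within one cluster.

    samples: list of (outcome, {cell_type: immune_score}) pairs, where
    outcome is "alive" or "dead". Each individual is assumed to report
    the same cell types.
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for outcome, scores in samples:
        counts[outcome] += 1
        for cell_type, value in scores.items():
            sums[outcome][cell_type] += value
    return {
        outcome: {c: total / counts[outcome] for c, total in per_cell.items()}
        for outcome, per_cell in sums.items()
    }


def assess_prognosis(scores, centroids):
    """Label a tumor "good" if its profile is nearer the alive centroid."""
    def sq_dist(a, b):
        return sum((a[k] - b.get(k, 0.0)) ** 2 for k in a)

    nearest = min(centroids, key=lambda o: sq_dist(scores, centroids[o]))
    return "good" if nearest == "alive" else "poor"


# Hypothetical immune scores for individuals in one cluster.
cluster = [
    ("alive", {"CD8_T": 0.30, "Treg": 0.10}),
    ("alive", {"CD8_T": 0.40, "Treg": 0.05}),
    ("dead", {"CD8_T": 0.05, "Treg": 0.30}),
    ("dead", {"CD8_T": 0.10, "Treg": 0.20}),
]
centroids = outcome_centroids(cluster)
print(assess_prognosis({"CD8_T": 0.30, "Treg": 0.10}, centroids))
```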

The invention provides a biomarker consisting of or comprising MGESPs (minimal gene expression signature profiles) obtained by any of the methods described above and anergic and exhausted CD8+ T cells, wherein said biomarker is indicative of the therapeutic efficacy of a checkpoint inhibitor drug or a cancer drug that stimulates an immune response.

The invention also provides a kit for determining the efficacy of a cancer therapy, comprising one or more biomarkers described above, and written instructions for use of the kit for determining the efficacy of a cancer therapy. In one embodiment, the one or more biomarkers are detected using mass spectrometry, immunoassay, microarray, nucleic acid sequencing or PCR.

The type of tumor may be defined by the tissue origin of the tumor. Examples of the type of tumor may include, but are not limited to, brain, head, eye, bladder, neck, mouth, nose, throat, thymus, lymph node, blood, lung, esophagus, trachea, stomach, intestine, colon, rectum, pancreas, liver, kidney, bone, skin, breast, arm, hand, chest, abdomen, leg, foot, genital, testes, ovary, uterus, cervix, urethra, prostate, and/or a combination thereof.

In a further embodiment, the type and/or subtype of tumor is further defined by epithelial or stromal origin. In another example, the type and/or subtype of tumor is defined by epithelial and/or stromal content or epithelial/stromal ratio. In another embodiment, the type and/or subtype of tumor is defined by chromosomal mutation or tumor-associated genetic changes.

The subtype of tumor may be type of tumor further defined by grade of tumor. In some embodiments of the method, the subtype of tumor is type of tumor further defined by gene expression profile of the tumor. In other embodiments of the method, the subtype of tumor is type of tumor further defined by tumor prognosis. In additional embodiments of the method, the subtype of tumor is type of tumor further defined by tumor immune cell infiltrate. The subtype of tumor may also be type of tumor further defined by predicted response to a cancer drug or combination of cancer drugs. In some examples, the subtype of tumor is type of tumor further defined by tumor prognosis and predicted response to a cancer drug or combination of cancer drugs. In other examples, the subtype of tumor is type of tumor further defined by prior exposure to a cancer drug or a combination of cancer drugs.

The invention additionally provides methods for determining relative immunogenicity of a mutant peptide compared to its wild-type counterpart. In an embodiment of the invention, the method comprises: (a) obtaining peripheral blood mononuclear cells (PBMCs) from a healthy human subject expressing a specific HLA capable of presenting the mutant peptide and/or wild-type peptide; (b) isolating CD14+ monocytes from the PBMCs in step (a) and culturing in differentiation medium so as to obtain dendritic cells (DCs); (c) introducing a nucleic acid comprising a sequence encoding a mutant peptide or wild-type peptide comprising internal native protease cleavage site(s) or external artificial protease cleavage site(s) flanking the peptide into a dendritic cell, so as to permit expression of the mutant peptide or wild-type peptide comprising internal native protease cleavage site(s) or artificial protease cleavage site(s); (d) co-culturing the dendritic cell comprising the nucleic acid sequence of step (c) with isolated naive CD8+ T-cell from the PBMCs in step (a); (e) contacting the co-culture of step (d) with additional autologous PBMCs of the subject in which the PBMCs additionally comprise nucleic acid sequence comprising a sequence encoding a mutant peptide or wild-type peptide comprising internal native protease cleavage site(s) or external artificial protease cleavage site(s) flanking the peptide and wherein the PBMCs express the mutant peptide or wild-type peptide comprising internal native protease cleavage site(s) or external artificial protease cleavage site(s) flanking the peptide; (f) measuring amount of effector cytokine produced by CD8+ cytotoxic T-cell for the two different co-cultures in step (e); and (g) determining relative amount of effector cytokine produced by each co-culture in step (f) so as to determine the relative immunogenicity of a mutant peptide over its wild-type counterpart, wherein the greater amount of effector cytokine indicates greater immunogenicity. In one embodiment of the invention, the mutant peptide consists of one amino acid change from wild-type peptide. In some embodiments, the mutant peptide comprises one or more, or two or more amino acid changes from wild-type peptide. Examples of the amino acid change may include, but are not limited to, an amino acid substitution, an insertion, and/or a deletion.

In some embodiments of the method, the amino acid change may be a result of a frame-shift mutation in coding sequence, or a result of a translocation event resulting in a fusion of a coding sequence with a second coding sequence. Further, the fusion of a coding sequence with a second coding sequence may be in frame or out of frame. Further in one embodiment of the method, the amino acid change is or comprises a mutation in a stop codon resulting in a novel peptide sequence. Examples of the mutant or wild-type peptide length may include, but are not limited to, at least about 7, 9, 9 or more, 25, and/or 25 or more amino acids long. Further examples of the mutant or wild-type peptide length may include, but are not limited to, about 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34 and/or 35. Other examples of the mutant or wild-type peptide length may include, but are not limited to, less than or equal to about 50 amino acids, between about 7 and 50 amino acids, and/or between about 8-10 amino acids.

In one embodiment, the mutant peptide and its wild-type counterpart have the same number of amino acids. In another embodiment, the peripheral blood mononuclear cells (PBMCs) comprise CD14+ monocytes, dendritic cells (DCs) and naive CD8+ T-cells. In another embodiment of the invention, isolating CD14+ monocytes from the PBMCs comprises magnetic separation of CD14+ monocytes.

In an additional embodiment of the invention, the differentiation medium in step (b) described above comprises a cytokine cocktail. Examples of the cytokine included in the cocktail may include, but are not limited to, GMCSF, IL4 and/or IFN-γ. In one particular embodiment, the cytokine cocktail comprises GMCSF, IL4 and IFN-γ.

In a further embodiment, isolated CD14+ monocytes from the PBMCs in step (a) as described above are cultured in differentiation medium for 4 days in step (b). The nucleic acid in the method described above may be DNA or RNA. Examples of the internal native protease cleavage site(s) may include, but are not limited to, immunoproteasome cleavage site, ERAAP/ERAP1 endoplasmic reticulum aminopeptidase cleavage site, and/or ERAP2 endoplasmic reticulum aminopeptidase cleavage site. Examples of the external artificial protease cleavage site(s) flanking the peptide may include, but are not limited to, furin cleavage site. In another embodiment, the furin cleavage site is or comprises an R-X-(K/R)-R or R-X-X-R amino acid sequence.
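The two furin motifs named above can be located in a candidate sequence with a regular expression. The following is a minimal sketch only: the function name `find_furin_sites` and the example sequence in the usage note are hypothetical, and X is taken to mean any single residue letter.

```python
import re

# R-X-(K/R)-R is a special case of R-X-X-R, so the general motif finds both.
# A lookahead is used so that overlapping motifs are not missed.
FURIN_GENERAL = re.compile(r"(?=(R[A-Z]{2}R))")
FURIN_STRICT = re.compile(r"R[A-Z][KR]R")


def find_furin_sites(sequence):
    """Return (start, motif, is_strict) for each furin-like motif (0-based start)."""
    sites = []
    for match in FURIN_GENERAL.finditer(sequence.upper()):
        motif = match.group(1)
        is_strict = FURIN_STRICT.fullmatch(motif) is not None
        sites.append((match.start(), motif, is_strict))
    return sites
```

For a made-up sequence such as `"AAARVKRGGGRAARTT"`, the function reports `RVKR` as a strict R-X-(K/R)-R site and `RAAR` as matching only the looser R-X-X-R pattern.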

In a further embodiment of the method, the nucleic acid additionally comprises one or more sequences required for expression and/or transport of the mutant or wild-type peptide so as to direct expressed and processed peptide to a cellular compartment shared with an HLA molecule.

Examples of one or more sequences required for expression may include, but are not limited to, an enhancer, a promoter, an intron, a splice site donor, a splice site acceptor, a transcriptional terminator, a polyadenylation signal, a ribosome binding site, a translational initiation codon and/or a stop codon. Examples of one or more sequences required for transport of the mutant or wild-type peptide may include, but are not limited to, endosomal targeting sequence, endoplasmic reticulum targeting sequence, and/or Golgi localization sequence.

In an additional embodiment of the method, the nucleic acid may be an expression vector comprising a minigene for expression of a polypeptide comprising one or more copies of a mutant peptide or a wild-type peptide. In a further embodiment, the expression vector may comprise DNA or RNA. Further still, the DNA may be transfected or electroporated into a cell. In one particular embodiment, the cell may be a PBMC, a CD14+ monocyte, a dendritic cell or an antigen presenting cell (APC). In a further particular embodiment, the expression vector is a viral vector or a virus-associated vector. Moreover, the viral vector or the virus-associated vector may be used to infect a cell, so as to allow expression of the mutant or wild-type peptide. In another embodiment, the minigene comprises one or more copies of a mutant or wild-type peptide sequence comprising internal protease cleavage site(s) or external protease cleavage site(s). In some embodiments of the invention, the internal protease cleavage site(s) is/are native protease cleavage site(s). Examples of the native protease cleavage site(s) may include, but are not limited to, ERAAP/ERAP1 endoplasmic reticulum aminopeptidase cleavage site, and/or ERAP2 endoplasmic reticulum aminopeptidase cleavage site. In a further embodiment, the external protease cleavage site(s) are artificial protease cleavage site(s). Further still, the artificial protease cleavage site(s) may be a furin cleavage site. In one particular embodiment, the furin cleavage site is R-X-(K/R)-R or R-X-X-R. In another embodiment, the artificial protease cleavage site(s) comprises R-X-(K/R)-R or R-X-X-R. In some instances, the artificial protease cleavage site(s) are cleaved by proteases selected from the group consisting of furin.

In another embodiment of the invention, one or more copies of a mutant or wild-type peptide sequence comprises one or more native protease cleavage site(s) within the peptide. In a further embodiment, the native protease cleavage site(s) is an immunoproteasome cleavage site. Examples of the one or more copies of a mutant or wild-type peptide length may include, but are not limited to, greater than or equal to about a 15-mer peptide, less than or equal to about a 50-mer peptide, between about 15 to 50 amino acids, and/or equal to a 25-mer peptide. In accordance with the practice of the invention, the native protease may be an immunoproteasome.

In one embodiment of the method, one or more copies of a mutant or wild-type peptide sequence is flanked by artificial protease cleavage site(s). In a further embodiment, cleavage at the artificial protease cleavage site produces individual peptide(s) of defined length. Further still, the individual peptide(s) of defined length comprises a mutant or wild-type peptide. In a particular embodiment, the individual peptide(s) of defined length additionally comprises part or all of the artificial protease cleavage site.

Examples of the individual peptide(s) of defined length may include, but are not limited to, about 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 and 30 or more amino acids in length. In a particular embodiment, the individual peptide(s) of defined length is a 13-mer peptide.

In one embodiment, the minigene produces a polypeptide with a single copy of a mutant or wild-type peptide. In another embodiment, the minigene produces a polypeptide with two or more copies of mutant or wild-type peptide. In a further embodiment, the polypeptide comprises 4 copies of a mutant or wild-type peptide. Further still, each copy of the mutant or wild-type peptide may be cleaved internally or externally at its flank by a protease so as to produce individual peptide or portion thereof. In a particular embodiment, cleavage internally is at a native cleavage site.

Moreover, the native cleavage site may be an immunoproteasome cleavage site.

Examples of the copy of internally cleaved mutant or wild-type peptide length may include, but are not limited to, longer than 10 or more; longer than 15 or more; 20 or more; 25 or more; 30 or more; 35 or more; 40 or more; 45 or more; and/or 50 or more amino acids prior to cleavage. Other examples of the copy of internally cleaved mutant or wild-type peptide length may include, but are not limited to, shorter than 100 or less; 90 or less; 80 or less; 70 or less; 60 or less; 50 or less; 40 or less; and/or 30 or less amino acids prior to cleavage. Additional examples of the copy of internally cleaved mutant or wild-type peptide length may be between 10 to 100; between 20 to 80; between 20 to 60; between 20 to 40; and/or 25 amino acids in length.

In a particular embodiment, the mutant or wild-type peptide is a 25-mer comprising a native protease cleavage site. In another embodiment, the mutant or wild-type peptide is a 25-mer comprising one or more native protease cleavage site.

In one embodiment, the mutant or wild-type peptide is a 9-mer flanked by one or more artificial protease cleavage site(s). In a further embodiment, the one or more artificial protease cleavage site may be or may comprise a furin cleavage site and/or R-X-(K/R)-R or R-X-X-R.

In some embodiments, the polypeptide additionally comprises a subcellular targeting or localization sequence. Examples of the subcellular targeting or localization sequence may include, but are not limited to, endosomal targeting sequence, endoplasmic reticulum targeting sequence, or Golgi localization sequence. In another embodiment, the subcellular targeting or localization sequence is an endosome targeting sequence. The subcellular targeting or localization sequence may be located at the amino terminus of the polypeptide, the carboxyl terminus of the polypeptide, or both.

In some embodiments, the minigene may comprise nucleic acid sequence for a single copy, and/or multiple copies of a 25-mer peptide comprising native protease cleavage sites. In a further embodiment, the multiple copies of a 25-mer peptide are 4 copies. In a further embodiment, the minigene comprises nucleic acid sequence for multiple copies of a 9-mer peptide separable by protease cleavage between expressed peptide copies. In another embodiment, the protease cleavage produces peptides of greater than 9 amino acids in length, comprising the 9-mer peptide separable by protease cleavage between expressed peptide copies and flanking amino acid sequence(s).
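As an illustration of how multiple 9-mer copies separated by furin sites could yield peptides longer than 9 amino acids, the sketch below concatenates copies of a peptide, each followed by a linker, and splits the product after every R-X-(K/R)-R motif. This is not the construct design used in the invention: the linker `RVKR`, the placement of one site after each copy, and the 9-mer `GILGFVFTL` in the usage note are placeholder assumptions. Cleavage is modeled as occurring immediately after the final arginine of the motif.

```python
import re

FURIN_LINKER = "RVKR"  # hypothetical linker matching R-X-(K/R)-R


def build_polypeptide(peptide, copies=4, linker=FURIN_LINKER):
    """Concatenate `copies` of the peptide, each followed by a furin linker."""
    return (peptide + linker) * copies


def cleave_after_furin(polypeptide):
    """Split the polypeptide immediately after each R-X-(K/R)-R motif."""
    fragments = re.split(r"(?<=R[A-Z][KR]R)", polypeptide)
    return [fragment for fragment in fragments if fragment]
```

With a 9-mer and a 4-residue linker, each fragment is a 13-mer that carries the 9-mer plus the retained cleavage-site residues. A peptide that itself contains a furin motif would also be split internally, so such peptides would need a different linker strategy.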

In a further embodiment, the minigene additionally comprises a subcellular targeting or localization sequence. Further still, the subcellular targeting or localization sequence may be an endosome targeting sequence. In yet a further embodiment, the endosome targeting sequence may be at the amino terminus.

In one embodiment, step (c) of the method described above, introducing a nucleic acid into a dendritic cell, may comprise electroporation of the nucleic acid into a mature dendritic cell. In another embodiment, step (d), co-culturing the dendritic cell with isolated naive CD8+ T-cell from the PBMCs in step (a), may comprise a period of about 10 days. Further, the isolated naive CD8+ T-cell from the PBMCs in step (a) is obtained by a magnetic separation method. Further still, co-culturing comprises a culture medium supplemented with a cytokine cocktail. In yet another embodiment, the culture medium is supplemented with a fresh cytokine cocktail every 2 days.

Further, the cytokine cocktail comprises IL-7 and IL-15.

In some embodiments, the PBMCs that additionally comprise nucleic acid sequence comprising a sequence encoding a mutant peptide or wild-type peptide comprising internal native protease cleavage site(s) or external artificial protease cleavage site(s) flanking the peptide may be obtained by electroporating with nucleic acid comprising said nucleic acid sequence. In another embodiment, the step (e) of contacting the co-culture may comprise a period of about 48 hours. In a further embodiment, the step (f) of measuring the amount of effector cytokine produced by CD8+ cytotoxic T-cell is performed after cells in the co-culture are incubated with a cell transport blocker or inhibitor. Further, the cell transport blocker or inhibitor is selected from the group consisting of brefeldin A, monensin and combination thereof. Further yet, the cell transport blocker or inhibitor may be or may comprise brefeldin A. In a further embodiment, incubation with a cell transport blocker or inhibitor may comprise a period of about 6 hours. In accordance with the practice of the invention, the effector cytokine produced by CD8+ T-cell may include any of IFN-γ, TNF and LT-α. Further, the effector cytokine produced by CD8+ T-cell is IFN-γ. In another embodiment, PBMCs and CD8+ T-cells present in the PBMCs from a subject may be stored frozen in step (a) and thawed with an efficiency of greater than about 70% viability before use in subsequent steps. In yet another embodiment, the CD14+ monocytes may be CD14+, CD16+ monocytes, comprising CD14+ cell surface marker and CD16+ cell surface marker.

By way of non-limiting example, the CD14+, CD16+ monocytes may be greater than 15% to less than or equal to 30%, and CD8+ T cells may be greater than about 7% to less than or equal to 12%, of total PBMCs of step (a). Also, DCs of step (b) may predominantly express the CD11c cell surface marker over the CD14+ and CD16+ cell surface markers. In some instances, greater than 40% of the CD14+ and CD16+ monocytes may differentiate into CD11c+ dendritic cells. In other instances, the isolated naive CD8+ T cells in step (d) may comprise greater than 90% CD8+ T cells and are depleted of natural killer (NK) and memory T cells. Further, the isolated naive CD8+ T cells may comprise, in total, less than 10% PBMCs bearing any of the CD56, CD57 or CD45RO cell surface markers.

By way of example, the isolated naive CD8+ T cells may lack CD56, CD57 and CD45RO cell surface markers. In other embodiments, the viability of dendritic cells after electroporation of nucleic acid may be greater than or equal to about 50%.
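The example ranges given above for cell populations and viability can be collected into a simple acceptance check. This is a sketch under stated assumptions: the dictionary keys and the function name `qc_report` are hypothetical, and the numeric limits are the example values from the text, not fixed requirements.

```python
def qc_report(m):
    """Return a dict of pass/fail flags for the example cell-preparation ranges.

    `m` maps hypothetical measurement names to percentages (0-100).
    """
    return {
        # PBMCs thawed with greater than about 70% viability
        "thaw_viability": m["thaw_viability_pct"] > 70,
        # CD14+, CD16+ monocytes: >15% to <=30% of total PBMCs
        "monocyte_fraction": 15 < m["monocyte_pct_of_pbmc"] <= 30,
        # CD8+ T cells: >7% to <=12% of total PBMCs
        "cd8_fraction": 7 < m["cd8_pct_of_pbmc"] <= 12,
        # >40% of monocytes differentiate into CD11c+ dendritic cells
        "dc_differentiation": m["cd11c_pct_of_monocytes"] > 40,
        # isolated naive CD8+ T cells: >90% CD8+
        "cd8_purity": m["cd8_purity_pct"] > 90,
        # <10% cells carrying CD56, CD57 or CD45RO
        "nk_memory_depletion": m["cd56_cd57_cd45ro_pct"] < 10,
        # dendritic cell viability after electroporation: >=50%
        "dc_viability": m["dc_viability_after_ep_pct"] >= 50,
    }


def passes_qc(m):
    """True only if every individual check passes."""
    return all(qc_report(m).values())
```

Returning the per-check flags rather than a single boolean makes it easy to see which population fell outside its example range.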

Examples of a cancer vaccine include, but are not limited to, the cancer vaccine selected by any of the methods of the invention. For example, the cancer vaccine may be a mutant peptide which may have greater immunogenicity than the wild-type peptide. In another example, the vaccine may be (1) a nucleic acid comprising a nucleic acid encoding a mutant peptide, or (2) a nucleic acid additionally comprising a nucleic acid encoding a protease cleavage site. In another example, the cancer vaccine may be administered intravenously, subcutaneously, intradermally, intraperitoneally, or intramuscularly to a subject. In an additional example, the cancer vaccine may be administered at the site of the tumor, or intratumorally. Further, the cancer vaccine may be administered using a microprojectile. Further still, the microprojectile may comprise a nanoparticle, or a gold particle. The cancer vaccine may also be administered using a viral vector, or a viral infection. Other examples of administration may include adoptive cell transfer of an antigen presenting cell comprising a nucleic acid encoding a mutant peptide. The nucleic acid may additionally encode protease cleavage site(s).

The invention provides a method of preparing a subject-specific immunogenic composition comprising selecting a cancer vaccine from genetically altered protein(s) expressed by a mammalian cancer cell and/or tissue by the method of the invention as described above, thereby preparing the subject-specific immunogenic composition.

The invention further provides a method of selecting an immunogenic mutant peptide by the method of the invention described above, wherein the selected immunogenic peptide produces a greater amount of effector cytokine for the mutant peptide than wild-type counterpart. Producing a greater amount of effector cytokine may be at least 2-fold, 3-fold, 4-fold, 5-fold, and/or 10-fold higher for the mutant peptide than its wild-type counterpart.

In another embodiment, producing a greater amount of effector cytokine may be between 2-fold to 10-fold, and/or between 5-fold to 100-fold higher for the mutant peptide than its wild-type counterpart.
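The fold-change comparison described above reduces to a ratio of the effector cytokine readout for the mutant peptide over that for its wild-type counterpart. A minimal sketch follows; the function names are hypothetical, and the readout is assumed to be any positive quantity measured identically in both co-cultures (for example, percentage of IFN-γ-positive CD8+ T cells).

```python
def fold_change(mutant_signal, wild_type_signal):
    """Ratio of mutant to wild-type effector cytokine signal."""
    if wild_type_signal <= 0:
        raise ValueError("wild-type signal must be positive to form a ratio")
    return mutant_signal / wild_type_signal


def is_more_immunogenic(mutant_signal, wild_type_signal, min_fold=2.0):
    """True if the mutant signal is at least `min_fold` times the wild-type signal."""
    return fold_change(mutant_signal, wild_type_signal) >= min_fold
```

The `min_fold` parameter can be set to any of the thresholds listed above (2, 3, 4, 5 or 10).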

FORMULATIONS AND COMPOSITIONS OF THE INVENTION

The vaccines or peptides of the invention may be provided in a composition comprising a pharmaceutically acceptable excipient, and may be in various formulations. As is well known in the art, a pharmaceutically acceptable excipient is a relatively inert substance that facilitates administration of a pharmacologically effective substance. For example, an excipient can give form or consistency, or act as a diluent. Suitable excipients include but are not limited to stabilizing agents, wetting and emulsifying agents, salts for varying osmolarity, encapsulating agents, buffers, and skin penetration enhancers. Excipients as well as formulations for parenteral and nonparenteral drug delivery are set forth in Remington's Pharmaceutical Sciences 19th Ed. Mack Publishing (1995).

Generally, these compositions are formulated for administration by injection or inhalation, e.g., intraperitoneally, intravenously, subcutaneously, intramuscularly, etc. Accordingly, these compositions are preferably combined with pharmaceutically acceptable vehicles such as saline, Ringer's solution, dextrose solution, and the like. The particular dosage regimen, i.e., dose, timing and repetition, will depend on the particular individual and that individual's medical history.

The invention also provides formulations comprising a subject-specific immunogenic composition prepared by any of the methods of the invention. In one embodiment of the formulation, the formulation comprises preparing the composition for administering in conjunction with at least one adjuvant, wherein the adjuvant is administered separately. In another embodiment of the formulation, the formulation comprises preparing the composition for administering in conjunction with at least one adjuvant, wherein preparing comprises including the adjuvant in the subject-specific immunogenic composition. In yet another embodiment, the formulation comprises preparing the composition for administering in conjunction with at least one carrier, wherein the preparation comprises including the carrier in the subject-specific immunogenic composition. In an additional embodiment, the formulation comprises preparing the composition for administering in conjunction with another anti-cancer therapeutic agent. In yet another embodiment, the formulation comprises preparing the composition for administering in conjunction with an immunostimulatory agent. In yet another embodiment, the formulation comprises preparing the composition for administering in conjunction with at least one adjuvant. In one embodiment of the formulation, the composition comprises at least one adjuvant, which is administered separately or concurrently therewith. In another embodiment, the composition is administered in conjunction with another anti-cancer therapeutic agent. In a further embodiment of the method, the measuring binding of said produced subject-specific peptides to an HLA or MHC protein comprises in vitro testing of peptide binding to HLA or MHC protein.

KITS

According to another aspect of the invention, kits are provided. Kits according to the invention include package(s) comprising vaccines or compositions of the invention. The phrase "package" means any vessel containing peptides or compositions presented herein. In preferred embodiments, the package can be a box or wrapping. Packaging materials for use in packaging pharmaceutical products are well known to those of skill in the art. Examples of pharmaceutical packaging materials include, but are not limited to, blister packs, bottles, tubes, inhalers, pumps, bags, vials, containers, syringes, and any packaging material suitable for a selected formulation and intended mode of administration and treatment. The kit can also contain items that are not contained within the package but are attached to the outside of the package, for example, pipettes.

Kits may optionally contain instructions for administering peptides or compositions of the present invention to a subject having a condition in need of treatment. Kits may also comprise instructions for approved uses of compounds herein by regulatory agencies, such as the United States Food and Drug Administration. Kits may optionally contain labeling or product inserts for the present compounds. The package(s) and/or any product insert(s) may themselves be approved by regulatory agencies. The kits can include vaccines in a solid phase or in a liquid phase (such as buffers provided) in a package. The kits also can include buffers for preparing solutions for conducting the methods, and pipettes for transferring liquids from one container to another. The kit may optionally also contain one or more other agents for use in combination therapies as described herein. In certain embodiments, the package(s) is a container for intravenous administration. In other embodiments vaccines are provided in the form of a liposome.

The following examples serve to illustrate the present invention. These examples are in no way intended to limit the scope of the invention.

EXAMPLES

Example 1. Predicting immunogenic peptides from human cancer

A brief description of the steps shown in Figure 1.

Steps 1 and 2 involve the use of MedGenome's next generation sequencing pipeline to identify genetic alterations at the DNA and RNA level.

Step 3 involves standard bioinformatic processing of next generation sequencing data to identify cancer-specific genetic alterations at the DNA and RNA level.

Steps 4-6 use MedGenome's variant calling pipeline to identify all variants and select those that pass the quality control metrics (passed variants). A passed variant is identified based on:

1. Alignment

2. Read depth

3. Allele depth

4. Overall quality of the variant
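The four quality-control criteria listed above can be pictured as a conjunction of threshold checks on each called variant. The source does not state the thresholds used by the pipeline, so every default value below is a placeholder assumption, as are the field and function names; the sketch only illustrates the shape of such a filter.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    mapping_quality: float   # alignment quality of supporting reads
    read_depth: int          # total reads covering the position
    alt_allele_depth: int    # reads supporting the variant allele
    variant_quality: float   # overall quality score of the call


def is_passed_variant(v, min_mapq=30.0, min_depth=20,
                      min_alt_depth=5, min_quality=50.0):
    """Apply all four criteria; the default thresholds are illustrative only."""
    return (
        v.mapping_quality >= min_mapq
        and v.read_depth >= min_depth
        and v.alt_allele_depth >= min_alt_depth
        and v.variant_quality >= min_quality
    )
```

Keeping the thresholds as keyword arguments makes it straightforward to substitute the values a given pipeline actually applies.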

Variants are classified as single nucleotide variants (SNVs), meaning a change in one nucleotide that results in a change in amino acid at the protein level. Variants in which one or multiple (non-triplet) nucleotides are inserted or deleted result in frame-shifted proteins and are also identified.

Step 7 applies further selection by considering variants that are expressed in the tissue using the transcript data from RNA sequencing. The RNA sequence data is analyzed using MedGenome's RNA analysis pipeline to identify expressed variants, splice variants, overexpressed genes and fusion genes. The pipeline defines expression as >1 FPKM (fragments per kilobase of transcript per million mapped reads).
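For reference, FPKM is conventionally computed by normalizing the fragment count of a transcript by its length in kilobases and by the total number of mapped fragments in millions. The sketch below shows that standard formula together with the >1 FPKM expression rule; the function names are hypothetical, and the internals of the RNA analysis pipeline are not described in the text.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    FPKM = fragments * 1e9 / (length in bp * total mapped fragments)
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total mapped fragments must be positive")
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


def is_expressed(fpkm_value, threshold=1.0):
    """Expression rule from the text: strictly greater than 1 FPKM."""
    return fpkm_value > threshold
```

For instance, 200 fragments on a 2,000 bp transcript in a library of 50 million mapped fragments gives 2 FPKM, which counts as expressed.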

Step 8 compiles a list of all the expressed variants that will result in the generation of altered proteins. These altered proteins are likely absent in normal tissues and are cancer specific. In this step the overexpressed proteins are also considered as contributing to neo-epitopes because these proteins are not expressed at a high level in normal tissues and upon overexpression may be recognized by the immune system as foreign. Examples of overexpressed proteins in cancer that are recognized by the immune system include prostate specific antigen (PSA), melanoma specific antigen NY-ESO-1 and MAGE and carcinoembryonic antigen (CEA) in colon cancer.

1. Level of overexpression: >5-fold is considered as overexpressed for neo-epitope analysis.

2. A variant is considered expressed if it has a value > 1 FPKM.

3. Fusion genes are identified when regions from two different genes are fused to each other, and are present as part of a transcript. The fusion gene is considered expressed if the fusion region has a value > 1 FPKM.
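The three Step 8 criteria can be expressed as a single selection rule over candidate records. In this sketch the record layout (`kind`, `fpkm`, `tumor_fpkm`, `normal_fpkm`) is hypothetical, and the fold-change is assumed to be measured against expression in normal tissue, which the text implies but does not define precisely.

```python
def is_neoepitope_source(candidate):
    """Apply the Step 8 thresholds to one candidate record (a dict)."""
    kind = candidate["kind"]
    if kind == "overexpressed":
        # >5-fold over normal; written as a product to avoid dividing by zero
        return candidate["tumor_fpkm"] > 5 * candidate["normal_fpkm"]
    if kind in ("variant", "fusion"):
        # variants and fusion regions must exceed 1 FPKM
        return candidate["fpkm"] > 1
    raise ValueError(f"unknown candidate kind: {kind!r}")


def compile_candidates(candidates):
    """Keep only the candidates that satisfy their respective threshold."""
    return [c for c in candidates if is_neoepitope_source(c)]
```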

Step 9 generates peptides used in in silico HLA binding analysis in Step 10. Class I HLA binds 8-10 mer peptides and Class II HLA binds 14-17 mer peptides. Our algorithm generates two sets of peptides for each mutation, one containing the non-mutated (wild-type) amino acid and the other corresponding to the mutant amino acid. The length of the peptide can vary from 8-mer to 17-mer. The algorithm automatically generates two sets of peptide libraries in which the wild-type or the mutant amino acid occupy each of the positions across the length of the peptide. For example, if a peptide is 8-mer long, the algorithm generates 8 wild-type peptides and 8 mutant peptides for in silico binding analysis.
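The sliding-window construction described in Step 9 can be sketched as follows for a single amino acid substitution. The function name, the 0-based position convention, and the decision to skip windows that would run past the ends of the protein are assumptions; frame-shift and fusion variants would need different handling.

```python
def peptide_windows(wild_type_protein, position, mutant_residue, length):
    """Return (wild_type_peptide, mutant_peptide) pairs of the given length.

    The altered residue (0-based `position`) occupies each index of the
    peptide in turn; windows extending beyond the protein are skipped.
    """
    mutant_protein = (
        wild_type_protein[:position]
        + mutant_residue
        + wild_type_protein[position + 1:]
    )
    pairs = []
    for offset in range(length):  # offset = index of the altered residue in the peptide
        start = position - offset
        end = start + length
        if start < 0 or end > len(wild_type_protein):
            continue
        pairs.append((wild_type_protein[start:end], mutant_protein[start:end]))
    return pairs
```

For a substitution well inside the protein and `length=8`, this returns 8 wild-type and 8 mutant peptides, matching the example in the text. Repeating the call for each length from 8 to 17 would give the full library.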

Step 10 determines the binding affinity of both the wild-type and the mutant peptides with Class I HLA molecules. A list of all class I HLAs used for binding analysis is given in Table 1. The binding analysis is performed using commercial algorithm(s). Mutant peptides with a lower binding score are generally considered strong binders to the HLA molecule. After binding prediction, three groups of peptides are selected:

1. High affinity binding peptides - ≤500 nM

2. Medium affinity binding peptides - >500 nM to ≤1000 nM

3. Low affinity binding peptides - >1000 nM
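The three affinity groups above map directly onto a small classification function. The function name is hypothetical, and the input is assumed to be the predicted binding affinity in nM returned by whichever binding algorithm is used.

```python
def affinity_group(predicted_affinity_nm):
    """Assign a peptide to the high, medium or low affinity group."""
    if predicted_affinity_nm <= 500:
        return "high"
    if predicted_affinity_nm <= 1000:
        return "medium"
    return "low"
```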

Table 1. List of all class I HLA proteins used for peptide binding analysis

Step 11 screens the peptides for optimal processing. We use commercial algorithm(s) to identify proteasomal and immunoproteasomal processing sites around the peptide, with the objective of prioritizing peptides in which the processing sites are optimally located, such that upon processing, the correct size peptide is produced. This step is important because the class I and class II HLA molecules bind peptides of a particular length. Class I HLA binds peptides from 8-11 mer and Class II HLA binds peptides that are 14-17 mer. We have devised our own scoring method that takes into account the presence of processing sites at the N- and C-terminal ends of the peptide. When both sites are optimally located, a maximum score of 20 is given. The score decreases as the processing sites are shifted away from the optimal location. A score >10 (on a scale of 0-20) is used to select peptides for the next step. Peptides that are scored higher than 10 by either the proteasomal or the immunoproteasomal cleavage analysis are selected.

Step 12 calculates the transporter (TAP) binding affinity of the peptides using a commercial algorithm. In order for the peptide to bind the HLA molecule, the peptide needs to be transported from the cytosol to the endoplasmic reticulum. In this step we perform the analysis to identify whether the peptide is delivered to the HLA molecule by TAP. Any peptide exhibiting a TAP-binding score of <0.5 is selected for the final step of prioritization.
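The processing and TAP criteria of Steps 11 and 12 can be chained into a simple filter. This sketch assumes the scores have already been computed by the respective algorithms and attached to each peptide record; the field names are hypothetical, and how the scores themselves are derived is outside its scope.

```python
def passes_processing(peptide):
    """Step 11 rule: score >10 by either proteasomal or immunoproteasomal analysis."""
    return max(peptide["proteasome_score"], peptide["immunoproteasome_score"]) > 10


def passes_tap(peptide):
    """Step 12 rule: TAP-binding score below 0.5."""
    return peptide["tap_score"] < 0.5


def select_for_prioritization(peptides):
    """Keep peptides that satisfy both the processing and the TAP criteria."""
    return [p for p in peptides if passes_processing(p) and passes_tap(p)]
```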

Step 13 uses a novel algorithm that we have developed to identify peptides that have a higher likelihood of eliciting a T-cell response. Peptides interact with TCR only if they are bound to the HLA molecule. The TCR interaction depends on the conformation of the peptide, the availability of amino acids that make contacts with the residues on the TCR, and the type of interactions that are made between residues on the peptide and the residues on the TCR. Our new method integrates information from sequence and structure of the peptides to model the TCR interaction and has been tested on gold standard datasets.

Predicting immunogenic peptides by their ability to bind TCRs

The prediction of TCR-binding peptides involves four different steps (Figure 3): 1. Data set creation; 2. Feature creation; 3. Classification model; 4. Study of features. A brief description of each step follows.

1. Dataset creation: In this step, we first collected peptides and their immunogenicity status from the IEDB database. We then processed the peptides to obtain a clean dataset for the model building exercise. Further, we generated several training and test instances for model building and performance evaluation.

2. Feature creation: In this step, various amino acid features and HLA binding and peptide processing related features are generated for the peptides.

3. Classification model: In this step, a classification model is generated using the feature matrix. This step involves feature selection, identification of the classification method, and scoring of the peptides.

4. Study of features: The important features are studied in detail, and their correlation with peptide structure/interactions in crystal structures is also studied in this step.

Data preparation

The sequence, assay, HLA type, publication id (PMID), and immunogenicity information of the peptides was downloaded from the IEDB database (Release 24-11-2016). The database contains immunogenicity status for 2,521 unique human 9-mer peptides. The peptides are first categorized into self and foreign peptides. Peptides generated by the human body are known as self, while those that do not originate in the human body are called non-self or foreign peptides. Of the total peptides, ~85% belong to the foreign peptide category. The peptides are also classified based on the assay that was performed to check their immunogenicity. Although there are several assay types, we have broadly grouped them into biological and non-biological types. The majority of the peptides (~90%) are assayed by a biological type. Before using these peptides, we apply the following filters to focus on peptides with unambiguous assay results and for which the information we require is complete.

• Biological assay filter: Peptides classified as immunogenic or non-immunogenic using one of the biological assays are taken forward for analysis.

• Prediction by assays: Many peptides are classified as both immunogenic and non-immunogenic by one or more different assays. These peptides were removed from our analysis.

• 4-digit HLA information: Only peptides for which 4-digit HLA type information is available are considered for further analysis. Of the total peptides, 4-digit HLA information was available for 1,075 peptides (Figure 5B).

Overall, we obtained 1,075 peptides with unambiguous immunogenicity status and complete 4-digit HLA information. The classification model was built using 307 immunogenic peptides and 116 non-immunogenic peptides, all of which bind HLA-A*02:01.
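By way of illustration only, the three filters described above could be sketched in Python as follows. The record layout, the function name `filter_peptides`, the assay group label `"biological"` and the decision to assess ambiguity per peptide/HLA pair are assumptions made for this sketch; they are not taken from the IEDB export format or from the pipeline itself.

```python
import re
from collections import defaultdict

# Assumed pattern for a 4-digit class I HLA name, e.g. "HLA-A*02:01".
FOUR_DIGIT_HLA = re.compile(r"^HLA-[A-C]\*\d{2}:\d{2}$")


def filter_peptides(records):
    """Apply the three filters to hypothetical IEDB-like records.

    records: iterable of (peptide, assay_group, is_immunogenic, hla) tuples.
    Returns {(peptide, hla): is_immunogenic} for unambiguous peptides only.
    """
    calls = defaultdict(set)
    for peptide, assay_group, is_immunogenic, hla in records:
        # Filter 1: keep only results from biological assays.
        if assay_group != "biological":
            continue
        # Filter 3: require 4-digit HLA information.
        if not FOUR_DIGIT_HLA.match(hla):
            continue
        calls[(peptide, hla)].add(is_immunogenic)
    # Filter 2: drop peptides reported as both immunogenic and non-immunogenic.
    return {
        key: next(iter(labels))
        for key, labels in calls.items()
        if len(labels) == 1
    }
```

Keying on the peptide/HLA pair means that a peptide is only treated as ambiguous when conflicting results exist for the same allele; a stricter variant could key on the peptide alone.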

Currently, binding affinity of the peptide is considered the main criterion for selecting immunogenic peptides. In general, a predicted binding affinity of <= 500 nM from standard programs such as NetMHCcons (Karosiene, Lundegaard et al. 2012) is taken as the cutoff to define immunogenic peptides. The distribution of binding affinity for the HLA-A*02:01 peptides is shown in Figure 4. If we use <= 500 nM as the cutoff to define immunogenic peptides, the sensitivity is about 74.5% whereas the specificity is only about 27.6%. Figure 2 demonstrates that HLA binding does not predict immunogenic peptides, because both non-immunogenic and immunogenic peptides can bind HLA with high affinity.

Feature construction and selection

In order to generate features that discriminate TCR-binding peptides from non-binders, we analyzed the physicochemical composition of the amino acids and their positional biases in the 9-mer peptides that interact with the TCR when bound to the HLA molecule. We analyzed 58 crystal structures of TCR-HLA-peptide complexes to identify the binding interactions at each position of the 9-mer peptide with the HLA on the one hand and the TCR on the other. A summary of the feature types is provided below:

I. Physicochemical features: An amino acid is an organic molecule with an amino group (-NH2) and a carboxyl group (-COOH). We obtained the physicochemical features from the following two sources.

• AAindex: AAindex is a database that contains numerical representations of various physicochemical and biochemical properties of amino acids and pairs of amino acids. We used AAindex1 for our feature creation. Most of the defined indices belong to 4 major clusters: (i) α-helix and turn propensities, (ii) β-strand propensity, (iii) hydrophobicity and (iv) physicochemical properties. A total of 566 different AAindex1 scales were obtained from this database (May 18, 2017). We use the following strategy to generate features.

Overall, we generated 11,300 features from AAindex.

• PepLib: PepLib is an R package that can be used to calculate descriptors for each amino acid of a given peptide sequence. These descriptors include counts of groups (polar, acidic, basic, aromatic, etc.), molecular weight, number of rotatable bonds and charge-based partial surface area descriptors. There are 53 variables calculated for each amino acid in the peptide sequence. Some of these descriptors are based on permutations of descriptors calculated on a single amino acid. In addition to the descriptors calculated for each amino acid, PepLib also provides values at the sequence level. Sequence-level calculation involves three types of descriptors: 1. mean, 2. variance and 3. autocorrelation function of the descriptors for each sequence.

II. HLA binding feature: Predicted HLA binding affinity is the most important peptide feature currently used by the community to identify candidate T cell epitopes. A binding affinity of <= 500 nM is routinely used as a threshold for peptide selection. We generated the NetMHCcons binding affinity score as one of the features for each peptide. NetMHCcons is a consensus method that combines three state-of-the-art MHC-peptide binding prediction methods (NetMHC, NetMHCpan and PickPocket). NetMHCcons uses artificial neural network-based methods trained on data from various MHC alleles, together with position-specific scoring matrices, and reports results as IC50 values (Karosiene, Lundegaard et al. 2012).

III. Peptide processing features:

• NetChop: Peptide cleavage is an important step that ensures the peptide is generated for transport and subsequent presentation by the HLA molecule. We used the IEDB NetChop 3.1 program (Nielsen, Lundegaard et al. 2005) to identify the cleavage sites. NetChop is a neural network-based method for predicting cleavage sites of the human proteasome. We generate two different features for each peptide: (a) C-term, which is trained on a database of publicly available MHC class I ligands using the C-terminal cleavage site of each ligand, and (b) 20S, which is trained on in vitro degradation data.

• TAP processing: TAP processing includes a neural network-based estimate of the ability of the TAP transporter proteins to transport cleaved peptides into the endoplasmic reticulum. The neural network is trained on in vitro experiments characterizing the sequence specificity of TAP transport. In total, six TAP-based features were generated for each of the peptides.

Overall, from the 307 immunogenic and 116 non-immunogenic peptides that bind HLA-A*02:01, we generated 12,094 total features (Figure 3).

Classification model

We performed the following steps to generate the classification model for predicting the immunogenicity of the peptides, as shown in Figure 4.

• Creation of training and test set instances: Because our dataset of immunogenic and non-immunogenic peptides is unbalanced (3:1), we first generated 500 different instances of the complete dataset, each with a balanced number of immunogenic and non-immunogenic peptides. Each balanced dataset consists of ~100 immunogenic and non-immunogenic peptides. The balanced datasets are generated to avoid overfitting of the classification model to either the immunogenic or the non-immunogenic peptide class.

Feature selection: We generated classification models using all 12,094 features for the 500 training/test instances. An ensemble classifier is generated by combining the results from all classifier instances, with equal weight given to each classifier instance. If > 50% of the classifiers predict a peptide as immunogenic, the prediction of the ensemble classifier is taken as immunogenic; otherwise the prediction is taken as non-immunogenic. The sensitivity and specificity of the J4.8 classifier for the 500 instances are shown in Figure 5F. The ROC curve of the ensemble classifier is shown in Figure 5G. The ROC curve is generated by changing the cutoff/threshold of the ensemble classifier for predicting a peptide as immunogenic or non-immunogenic.

Feature reduction: As a next step, we performed feature reduction for each of the 500 instances using the CfsSubsetEval method available in the Weka machine learning toolkit (Hall 1999). This method evaluates the worth of a subset of attributes by considering the individual predictive ability of each feature along with the degree of redundancy between them. During feature selection, some of the training instances failed to converge; hence, we were left with 433 training instances. A median of 45 features was selected for each training instance. Overall, 3680 features were selected when all 433 training instances were included. Of these, 60% (2219) were part of 2 or more training instances. Using the reduced 433 training instances, a new classification model was built.

Performance evaluation of classifier instances: A J4.8 classifier was trained on the reduced feature set of each training instance. We first created an ensemble classifier by combining the predictions from all 433 classifier instances. A sensitivity/specificity plot using the 3680 features clearly separates the classifier instances into two groups (Figure 5D-F). The Group-2 classifier instances have higher sensitivity and specificity than the Group-1 classifier instances (Figure 5F). We used a voting-based approach to classify each peptide sequence as immunogenic or non-immunogenic. For an input peptide, if > 50% of the classifiers predict it as immunogenic, the peptide is classified as immunogenic; otherwise it is classified as non-immunogenic (score >0.5 predicted as immunogenic). The ROC curve of the 433 classifier instances (Ensemble classifier2) performs better than that of the 500 classifier instances (Ensemble classifier1) (compare Figures 5C and G).
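As a non-limiting illustration, the balanced resampling and the equal-weight voting rule described above might be sketched as follows. The function names, the fixed random seed and the use of sampling without replacement within each instance are assumptions of this sketch; the training of the individual J4.8 classifiers is not shown.

```python
import random


def balanced_instances(positives, negatives, n_instances=500, per_class=100, seed=0):
    """Draw balanced training/test instances by undersampling each class.

    Each instance holds the same number of immunogenic (positive) and
    non-immunogenic (negative) peptides, sampled without replacement.
    """
    rng = random.Random(seed)
    k = min(per_class, len(positives), len(negatives))
    return [
        (rng.sample(positives, k), rng.sample(negatives, k))
        for _ in range(n_instances)
    ]


def ensemble_score(votes):
    """Fraction of classifier instances voting 'immunogenic' (True)."""
    return sum(votes) / len(votes)


def ensemble_predict(votes, threshold=0.5):
    """Equal-weight vote: immunogenic only if the score exceeds the threshold."""
    return ensemble_score(votes) > threshold
```

Sweeping `threshold` between 0 and 1 and recording the resulting sensitivity and specificity would trace out an ROC curve of the kind referred to above.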

In the next step, we selected classifier instances that showed >= 75% sensitivity and >= 80% specificity on an unseen dataset. We found 45 such classifier instances. An ensemble classifier was created using these 45 classifiers. The ROC curve of the 45 classifier instances (Ensemble classifier3) is shown in Figure 5D-G.
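The selection of classifier instances by sensitivity and specificity can be illustrated with the following minimal sketch. The data structures (a dictionary mapping a hypothetical instance identifier to its predictions on the unseen peptides) and the function names are assumptions for illustration; only the two thresholds come from the text above.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for boolean labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


def select_instances(predictions, y_true, min_sens=0.75, min_spec=0.80):
    """Keep classifier instances meeting both thresholds on unseen data.

    predictions: {instance_id: list of boolean predictions}, aligned to y_true.
    """
    selected = []
    for instance_id, y_pred in predictions.items():
        sens, spec = sensitivity_specificity(y_true, y_pred)
        if sens >= min_sens and spec >= min_spec:
            selected.append(instance_id)
    return selected
```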

Performance evaluation of the three ensemble classifiers on the unseen dataset is summarized below.

Ensemble1 classifier provides sensitivity and specificity of about 59.61% and 62.07%, respectively. Ensemble2 classifier provides sensitivity and specificity of about 71.66% and 92.24%, respectively.

Ensemble3 classifier provides sensitivity and specificity of about 90.23% and 99.14%, respectively, which is significantly higher than that obtained from the HLA binding affinity of the peptides.

The frequently occurring features at each position of the 9-mer peptide were computed from the Ensemble3 classifier and are shown in Figure 6A-J. Data generated from the pipeline is used to rank order the list of immunogenic peptides according to the parameters shown in Table 3. Scores from each step are aggregated to create the rank-ordered list of peptides, in which peptides higher on the list are more likely to induce a CD8⁺ T cell response than peptides lower in rank.

Table 2. List of features selected from the Ensemble model that separated immunogenic from non-immunogenic peptides

Rules for predicting immunogenicity based on the features of amino acids at each of the 9 positions of the 9-mer peptide. The rules specify the range of parameters that define the identity of each amino acid at each position of the 9-mer peptide

The rank ordering of immunogenic peptides was performed using the data shown in Table 3.

Table 3. Data used for rank ordering immunogenic peptides

We analyzed somatic mutations that occur frequently in cancer and applied the steps in Table 1 to identify immunogenic peptides. From a total of 2.3 million unique mutations, we predicted 1948 immunogenic peptides covering single nucleotide variants (SNVs), insertions, deletions and gene fusions (Table 4).

Table 4. List of immunogenic peptides from frequently occurring mutations in cancer

Example 2

This example demonstrates an exemplary methodology for predicting immunogenic peptides from a human Head and Neck cancer tissue sample, following the steps shown in Figure 1 and described in detail in the previous section, Example 1, "Predicting immunogenic peptides from human cancer".

Exome sequencing

Exome sequencing was performed for the tumor and normal samples. Exome capture was performed using the Agilent SureSelect Human All Exon V5 kit. RNA sequencing (RNA-seq) was performed on total RNA extracted after ribo-depletion of tumor sample RNA. All paired-end sequencing was performed on the Illumina HiSeq 2500 platform. The total data obtained for the exome-seq and RNA-seq samples exceeds 12 Gb, and more than 90% of the data exceed Q30 (shown in Table 5).

The exome-seq data is first pre-processed to remove low-quality reads/bases and adapter sequences. The pre-processed reads are then aligned to the human reference genome (hg19) using the BWA program with default parameters. We then apply GATK best practices: duplicate reads are removed using Picard tools, and the alignments are re-aligned and re-calibrated using GATK to prepare the file for somatic mutation identification (Table 6). Somatic mutations in the samples are identified using the Strelka program. After this, only quality-passed, on-target mutations are processed further. A total of 222 mutations were identified in this sample. Of these, 210 are SNPs and 12 are indels (Table 7A). Of the total coding mutations, 106 are of the missense type (Table 7B).

RNA sequencing

The RNA-seq data is first pre-processed to remove low-quality reads/bases, adapter sequences and unwanted sequences such as ribosomal RNA, tRNAs and repeat sequences. The pre-processed reads are then aligned to the human reference transcriptome and genome using the STAR aligner (Table 8). Gene expression is then quantified using the Cufflinks program.

HLA-typing

The RNA-seq data is then used for HLA typing (Sidney, Peters et al. 2008, Greenbaum, Sidney et al. 2011). We used the Seq2HLA program for HLA typing from RNA-seq. The class I HLA alleles identified for this sample are provided in Table 9. The expression of the HLA genes is provided in Table 10. The read depth of the mutant allele in RNA-seq is then calculated. Of the total mutations, we found 62 mutations with read support >= 1 in RNA-seq. These mutations are termed expressed mutations. The 62 mutations generated 578 unique 9-mer peptides.
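For illustration, the selection of expressed mutations and the enumeration of mutant 9-mer peptides could be sketched as below for the simple case of a single amino acid substitution. The function names, the `rna_alt_reads` key and the 0-based position convention are assumptions of this sketch; indels, frameshifts and fusions, which yield additional peptides, are not handled.

```python
def expressed_mutations(mutations, min_reads=1):
    """Keep mutations whose mutant allele has RNA-seq read support >= min_reads.

    mutations: list of dicts with a hypothetical 'rna_alt_reads' key.
    """
    return [m for m in mutations if m["rna_alt_reads"] >= min_reads]


def mutant_kmers(protein, position, alt, k=9):
    """Return the unique k-mers that span a substituted residue.

    protein: wild-type protein sequence; position: 0-based index of the
    substituted residue; alt: the mutant amino acid.
    """
    mutated = protein[:position] + alt + protein[position + 1:]
    first_start = max(0, position - k + 1)
    last_start = min(position, len(mutated) - k)
    kmers = []
    for start in range(first_start, last_start + 1):
        kmer = mutated[start:start + k]
        if kmer not in kmers:
            kmers.append(kmer)
    return kmers
```

A substitution far from either end of the protein yields up to nine overlapping 9-mers, one for each position the mutant residue can occupy within the window.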

Immunogenic peptide identification

The peptides derived from the expressed mutations were scored for TCR binding, followed by HLA binding prediction, then TAP prediction and finally proteasomal processing. The immunogenic peptides were further ranked based on the expression level of genes and variants, affinity of HLA binding, sensitivity to proteasomal processing and binding to the transporter. We applied the ranking method to 220 unique immunogenic peptides from this Head and Neck cancer sample. The ranked peptides, along with HLA information, are provided in Table 11.

Missense - Genetic alteration that results in a different amino acid.

Frameshift - Genetic alteration that changes the reading frame. This typically results in a string of amino acid substitutions before a stop codon is encountered.

InFrame - Genetic alteration that results in either deletion or insertion of one or more amino acids.

Table 9. HLA class I alleles present in the sample

Cell-based T cell activation assay to validate immunogenic peptides

Predicted peptides are validated for their immunogenicity by testing them in a T cell activation assay using three separate methods as described below:

(i) PBMC-based assay with external addition of peptides

(ii) DC-TC assay with external addition of peptides

(iii) Minigene-based DC-TC assay - peptides expressed as minigenes in DCs

For T cell activation assay using any of the three methods, key parameters of PBMCs and purified CD8⁺ T cells and dendritic cells need to be monitored as shown in Table 12.

The sensitivity of the assay is also determined by a variety of factors as outlined in Table 13.

Description of the three assay formats for the CD8 T cell activation assay

PBMC-based assay with external addition of peptides

PBMCs expressing HLA specific to the mutant peptides were collected from patients and/or healthy donors and rested overnight at 37°C (day 0). On day 1, the rested cells were plated with culture media and stimulated with 10 µM peptides along with a cytokine cocktail containing IL-2 and IL-15. On days 4, 10 and 17, 50% of the old media was replaced with an equal volume of fresh culture media containing IL-2 and IL-15. PBMCs were restimulated with 10 µM peptides on days 7, 14 and 21 by adding the peptides to the media. On day 22, peptide-stimulated cells were incubated in brefeldin (to block cellular transport) for 6 hours, and intracellular expression of IFN-γ in CD8⁺ T cells was analyzed using flow cytometry (Figure 7A).

Dendritic cell-T cell co-culture assay with external addition of peptides

Purified CD8⁺ T cells and monocytes were obtained from peripheral blood mononuclear cells (PBMCs) of healthy human donors and/or patient samples using a magnetic separation method. CD14⁺/CD16⁺ double-positive monocytes were differentiated into dendritic cells (DCs) using a cytokine cocktail containing GMCSF, IL4 and IFN-γ for 4 days and pulsed with peptides. Purified naïve CD8⁺ T cells were co-cultured with peptide-pulsed mature DCs for a 10-day period. On day 10, the DC-TC co-cultures were re-stimulated with peptide-pulsed autologous PBMCs for an additional 48 hours. At 24h and 48h, cells were processed and intracellular expression of IFN-γ in CD8⁺ T cells was quantitated by flow cytometry (Figure 7B).

Minigene-based DC-TC assay (peptides expressed as minigenes in DCs)

The methods for purifying DCs and CD8⁺ T cells were identical to those described for the DC-TC co-culture assay. Immature DCs derived from monocytes were transfected with minigenes harboring multiple copies of peptides separated by synthetic protease cleavage sites and an endosomal targeting sequence. The conditions of the co-culture assay were identical to those described for the DC-TC co-culture assay, with the following modification: instead of adding peptide-pulsed PBMCs, the DC-TC co-culture is pulsed with minigene-transfected PBMCs on day 10. IFN-γ expression in CD8⁺ T cells was quantitated by flow cytometry (Figure 7C).

Immunogenic peptides identified using the CD8⁺ T cell activation assay

The CD8⁺ T cell activation assay was used to screen peptides derived from frequently occurring cancer mutations predicted using the OncoPeptVAC pipeline (Table 4).

Example 4

A method is presented to select optimum vaccine candidates from cancer mutations using the steps described in Examples 1-3 and the additional steps described in Example 4. The invention comprises the following steps, shown in Table 14.

1. Tumor mutation detection: Tumor mutations are detected by next-generation sequencing using the standard bioinformatics and data analysis pipelines described in Examples 1-2.

2. Selecting vaccine candidates

Vaccine candidates will be selected using the OncoPeptVAC pipeline. The pipeline uses HLA class I-binding peptides and performs the following steps automatically, as described in Examples 1-2.

1. Select all tumor-specific genetic alterations that change the protein coding sequence of a transcript (SNVs, indels, splice variants and gene fusions);

2. Determine expression of non-mutated and mutated alleles in the tumor expression data;

3. Convert mutated sequences into 8-10-mer and 14-17-mer peptides. Generate corresponding non-mutated sequence;

4. Perform HLA typing from the exome and RNA-sequencing data of the sample;

5. Determine TCR binding for the non-mutated and mutated sequence generated in step-3;

6. Determine HLA-binding affinity of the peptides with the HLA expressed in the sample from step-4;

7. Peptides are analyzed for proteasomal and immunoproteasomal processing;

8. Peptides are analyzed for peptide transporter binding;

9. Each non-mutated and mutated peptide is scored for its expression (step-2) and for steps 5-8, and a composite score for each peptide is calculated; and

10. Immunogenic peptides or vaccine candidates are selected on the basis of their composite score.

3. Validating the immunogenicity of peptides by OncoACT (T cell activation assay)

Peptides prioritized in step-10 above are tested for immunogenicity by performing an ex vivo T cell activation assay. Peptides are tested in a purified dendritic cell - T cell co-culture assay, either by adding the peptides externally or by expressing the peptides as minigenes in dendritic cells. In a second version of the assay, peptides or minigenes are tested on whole PBMCs without prior purification of cell types.

Mutated neoantigens are scored positive if they induce antigen-specific T cells producing IFN-γ (Figure 7A-C). The magnitude of the response is indicated by the proportion of antigen-specific T cells produced during the assay.

4. Detection of antigen specific T cells

The enormous diversity of human T cells is represented by 10⁹-10¹⁰ unique T cell receptor (TCR) alpha and beta chain pairs, expressed as a unique combination in each T cell. Three key factors are considered in our vaccine design with respect to the clonal identity of the T cells.

a) Vaccine candidates must induce clonal expansion of T cells. The magnitude and diversity of T cell amplification is an important parameter used by the algorithm to select the right candidate. The clonal amplification of T cells is determined by TCR repertoire sequencing using commercially available platforms. Figure 8 shows an example in which different antigens induced the amplification of single T cell clones or multiple T cell clones. The clones amplified were not present in control samples (Figure 8). The numbers on top of the bar plots show clonal amplification of T cells in the presence of control and test antigen (Figure 8).

b) The clonally amplified T cells must possess the phenotype of cytolytic T cells, which means that they should express a gene expression signature associated with the cytolytic phenotype. Single T cell transcriptome analysis is performed to determine the CTL phenotype of individual clones.

c) Clonally amplified T cells must be functionally active. Activated T cells express markers of anergy and exhaustion if they receive strong antigen-specific stimulation or if the duration of the response is prolonged. Therefore, assays to determine the functional state of T cells following antigen stimulation will be performed.
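As a purely illustrative sketch of factor (a), clonotypes that are amplified in an antigen-stimulated sample but absent from the control could be identified from TCR repertoire data as follows. Representing each T cell by a clonotype label (for example a CDR3 sequence string), the function name and the `min_fraction` cutoff are hypothetical choices for this sketch and are not parameters of the described algorithm.

```python
from collections import Counter


def expanded_clones(control_clonotypes, test_clonotypes, min_fraction=0.01):
    """Find clonotypes amplified with the test antigen but absent in control.

    Both inputs are lists with one clonotype label per sequenced T cell.
    Returns {clonotype: fraction of the test repertoire}.
    """
    control = set(control_clonotypes)
    counts = Counter(test_clonotypes)
    total = sum(counts.values())
    return {
        clone: n / total
        for clone, n in counts.items()
        if clone not in control and n / total >= min_fraction
    }
```

The number of returned clonotypes gives a rough indication of whether an antigen induced a single dominant clone or several clones.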

5. T cell phenotype analysis

An effective cancer vaccine should elicit T cell-mediated clearance of the tumor by generating tumor-specific cytotoxic CD8 T cells of sufficient magnitude and functionality. In vitro studies have shown that vaccines that induce clonal CD8 T cell expansion and activation coupled to production of cytolytic enzymes are able to lyse tumor cells. Also, vaccines that generate a sufficient memory T cell response are able to maintain long-term efficacy. Figure 9 shows the workflow for assessing the functional phenotype of individual T cells using the 10X Genomics platform.

Gene expression analysis coupled to single-cell TCR clonal amplification facilitates identification of the phenotype of clonally amplified T cells. As an example, Figure 10A-B shows three amplified populations of T cells with distinct phenotypes. Whereas clonal populations 1 and 3 produced IFN-γ, clonal population 2 did not (Figure 10A). In addition, whereas clonal population 1 had a higher level of amplification, clonal population 3 showed higher expression of CTL markers, suggesting that amplification of T cells is not always linked to their phenotype (Figure 10B).

6. Selection of vaccines

Vaccine candidates for the cocktail will be selected on the basis of their scores at each of the five steps. The bioinformatic algorithm will assign a different weight to each step and compute an aggregated score for each peptide. The criteria comprise:

1. Immunogenicity of the mutated antigen;

2. Probability that the mutated antigen will be presented on the surface of cells;

3. Mutated antigen's ability to generate clonal amplification of T cells;

4. Mutated antigen's ability to induce CTL phenotype on the clonally amplified T cells; and

5. Mutated antigen's ability to keep T cells in a functionally active state.
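The weighted aggregation over the five criteria listed above can be illustrated with the following minimal sketch. The criterion keys, the normalization by total weight, the assumption that each per-criterion score lies between 0 and 1, and the example weights are placeholders chosen for illustration only; the actual weights assigned by the algorithm are not stated here.

```python
# Hypothetical keys for the five selection criteria, in the order listed above.
CRITERIA = (
    "immunogenicity",
    "presentation",
    "clonal_amplification",
    "ctl_phenotype",
    "functional_state",
)


def aggregate_score(scores, weights):
    """Weighted mean of the per-criterion scores for one peptide."""
    total_weight = sum(weights[c] for c in CRITERIA)
    return sum(weights[c] * scores[c] for c in CRITERIA) / total_weight


def rank_candidates(candidates, weights):
    """Return peptide identifiers sorted from highest to lowest aggregate score.

    candidates: {peptide: {criterion: score}}.
    """
    return sorted(
        candidates,
        key=lambda peptide: aggregate_score(candidates[peptide], weights),
        reverse=True,
    )
```

For example, calling `rank_candidates` with equal weights for all five criteria reduces the aggregate to a simple average of the five scores.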

Analysis of tumor microenvironment for selecting patients who will respond to vaccine therapy

For analysing the tumor microenvironment, we used cell type-specific gene expression signatures covering many different immune cell types. The magnitude of infiltration of each immune cell type in the tumor microenvironment was estimated from whole transcriptome data using these signatures. The creation and validation of the signatures are described in the next section.

Creating cell type-specific gene signatures

Unique gene expression signatures defining a specific immune cell type were identified by analyzing large microarray data sets (restricted to the Affymetrix Human Genome U133 Plus 2.0 platform) of pure immune cell types with the workflow shown in Figure 11A. Genes showing low plasticity and high specificity of expression (Wang, Yang et al. 2015) were included as part of the gene signature for a given cell type. Table 15 shows the genes selected for building cell type-specific signatures. The cell type-specific signatures were applied to RNA-seq data and, based on the expression of the genes present in the signature, a score was generated using Single Sample Gene Set Enrichment Analysis (ssGSEA) (Subramanian, Tamayo et al. 2005). Signatures were validated using gene expression data from pure immune cells in the Gene Expression Omnibus (GEO database) (Clough and Barrett 2016) (Fig 11B-D), single cell transcriptome data (Tirosh, Izar et al. 2016) (Fig 11E) and flow cytometry data (Fig 11F) (Bhattacharya, Andorf et al. 2014).

Example 5: Utility of the gene expression signatures in selecting tumors with high CD8⁺ T cell infiltration

RNA-seq Level 3 data was obtained from the TCGA Data Portal (Chandran, Medvedeva et al. 2016) for 33 cancers (see detailed description of the invention). Expression values for the genes defined in our signature were obtained for all the tumor samples in each of these cancers. The immune infiltration scores for the cancers were obtained using the approach described above. The cancers were clustered as high, medium or low based on CD8⁺ T cell infiltration. Cancers in which >25% of tumors show a positive CD8 T cell infiltration score are classified as high (H). Cancers with ≤25% and >5% infiltration of CD8 T cells are classified as medium (M). Cancers showing <5% infiltration of CD8 T cells are classified as low (L). As an example, Skin Cutaneous Melanoma [SKCM] is classified as high (H), Head and Neck squamous cell carcinoma [HNSC] is classified as medium (M) and Prostate adenocarcinoma [PRAD] is classified as low (L) CD8 infiltration (Figure 12).
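The H/M/L grouping can be expressed as a small function, shown here only as a sketch. It assumes that "positive" means a score greater than zero and that the input is the list of per-tumor CD8 infiltration scores for one cancer type; the function name is hypothetical. The text does not say how a cancer with exactly 5% positive tumors is classified, so assigning that boundary case to the low group is an assumption.

```python
def cd8_infiltration_class(scores):
    """Classify one cancer type as 'H', 'M' or 'L' from per-tumor CD8 scores."""
    positive_fraction = sum(1 for s in scores if s > 0) / len(scores)
    if positive_fraction > 0.25:
        return "H"  # more than 25% of tumors have a positive score
    if positive_fraction > 0.05:
        return "M"  # more than 5% and at most 25%
    return "L"  # assumption: exactly 5% is treated as low
```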

Differential expression analysis of individual genes in high and low CD8 infiltration samples identifies pathways associated with high presence of CD8 T cells in the tumor

Differential expression analysis of individual genes was carried out using the DESeq Bioconductor package. For each cancer type, samples falling in the first and last quartiles of the immune infiltration scores were categorized into two groups, high-infiltrated and low-infiltrated samples. Raw counts were extracted from these samples, and DESeq was employed to find the differentially expressed genes between the two groups using an FDR cutoff of P-value <0.01 and a fold change of FOLD >= 2 for upregulated genes and FOLD <= 2 for downregulated genes. Genes that were commonly upregulated or downregulated in all three representative cancers used in the analysis were taken forward.
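The grouping of samples that precedes the differential expression step might look like the following sketch, which splits the samples of one cancer type into the lowest and highest quarter by infiltration score. The function name and the rank-based definition of a quartile (integer division of the sample count by four) are assumptions; the DESeq analysis itself is performed in R and is not reproduced here.

```python
def quartile_groups(scores):
    """Split samples into low- and high-infiltrated groups by score rank.

    scores: {sample_id: infiltration score} for a single cancer type.
    Returns (low_group, high_group), each holding one quarter of the samples.
    """
    ranked = sorted(scores, key=scores.get)  # ascending by score
    n = len(ranked) // 4
    low_group = ranked[:n]
    high_group = ranked[len(ranked) - n:]
    return low_group, high_group
```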

To gain a biological understanding of those gene sets statistically significantly associated with samples that are highly infiltrated by CD8⁺ T cells, we carried out pathway enrichment analysis using Reactome (Fabregat, Jupe et al. 2017). We used the genes that were identified as commonly upregulated or downregulated across the three cancers for identifying the pathways. An FDR of 0.01 was used for filtering the significant pathways. Significant pathways are shown in Table 16.

Example-6. Immune landscape of human cancers and impact of co-infiltration of immune cells on survival

We leveraged the whole transcriptome data from 9640 tumors across 33 cancers to identify cancer-type-specific infiltration of immune cells using a simple workflow described in Figure 13A (left panel). First, the 9640 tumors were scored for infiltration of each immune cell type, and the tumors were sorted and stratified into quartiles by immune cell type-specific scores (2410 tumors in each quartile). Next, we analyzed what fraction of tumors from each cancer was present in each quartile, which is shown as a heatmap in Figure 13A (right panel). Using this approach, we were able to identify cancers in which infiltration of certain immune cell types was favored over others. As an example, the pattern of B cell infiltration indicates that diffuse large B-cell lymphoma (DLBCL), kidney renal clear cell carcinoma (KIRC), sarcoma (SARC), skin cutaneous melanoma (SKCM) and uveal melanoma (UVM) have 80% or more tumors in Q1 with a high B cell infiltration score (Figure 13A, B-cell column, deep red squares in Q1). Tumors lacking a specific cell type are represented as a white square in Q1, indicating <5% of tumors having a high infiltration score, and a deep red square in Q4, indicating >80% of tumors having low infiltration scores. For example, low-grade glioma (LGG) lacks most adaptive immune cells and neutrophils of the innate compartment and has moderate infiltration of CD4⁺ T cells and macrophages, but is highly infiltrated by monocytes (Figure 13A, right panel, arrow) (Hambardzumyan, Gutmann et al. 2016). High infiltration of monocytes was also observed in GBM, KIRC, LGG and SARC cancers. The pattern of CD8⁺ T cell infiltration identified DLBCL, acute myeloid leukemia (LAML) and thymoma (THYM) as the highest infiltrated tumors. These tumors, however, have low CD4 infiltration, indicating an inverse relationship between the two cell types (Figure 13A). Among these three tumors, both DLBCL and THYM have significant Treg cells in the tumor microenvironment, which is likely to limit T cell-mediated tissue inflammation.

Co-infiltration of multiple immune cells determines response to therapy and survival. Therefore, we investigated the co-infiltration of multiple immune cells within the same tumor using our gene expression signature. We selected cancers in which >80% of tumors had high infiltration of a specific immune cell type (deep red Q1 boxes in Figure 13A) and analyzed the infiltration of other immune cells in the same tumor using a correlation plot, as shown in Figure 13B. Data across all tumors indicate that monocyte infiltration is anti-correlated with most immune cell types of the adaptive compartment. Neutrophil infiltration, on the other hand, shows weak correlation with CD4, CD8, and Treg cells. CD8⁺ T cell infiltrated tumors also contain NK cells but lack CD4⁺ T cells (Figure 13B).
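The pan-cancer quartile stratification described earlier in this example, in which all tumors are ranked by the score for one immune cell type and the fraction of each cancer falling in each quartile is then computed, could be sketched as below. The input layout and function name are assumptions of this sketch; Q1 is taken to hold the highest-scoring tumors, consistent with the description of the heatmap.

```python
from collections import Counter, defaultdict


def quartile_fractions(tumors):
    """Fraction of each cancer's tumors in pan-cancer quartiles Q1..Q4.

    tumors: list of (cancer_type, score) for one immune cell type.
    Returns {cancer_type: [fraction in Q1, Q2, Q3, Q4]}, Q1 = highest scores.
    """
    ranked = sorted(tumors, key=lambda t: t[1], reverse=True)
    n = len(ranked)
    counts = defaultdict(Counter)
    for rank, (cancer, _score) in enumerate(ranked):
        quartile = rank * 4 // n  # 0 for Q1 ... 3 for Q4
        counts[cancer][quartile] += 1
    return {
        cancer: [c[q] / sum(c.values()) for q in range(4)]
        for cancer, c in counts.items()
    }
```

With 9640 tumors this assigns 2410 tumors to each quartile, and a cancer whose Q1 fraction is 0.8 or higher would correspond to a deep red Q1 square in the heatmap.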

To examine the mechanism of selective recruitment of immune cell subsets in specific tumors, we first considered the higher expression of chemoattractant proteins that mobilize immune cells to sites of inflammation. We analyzed co-expression of chemoattractant genes specific to each immune cell type and their infiltration score across all 33 cancers. As shown in Figure 13C, chemoattractant gene expression was positively correlated with immune cell infiltration across all cancers.

Next, we examined the possibility that mutations in oncogenes and tumor suppressor genes could impact immune cell infiltration by directly regulating the expression of chemoattractant genes, by other mechanisms related to changes in the tumor stroma, or by directly impeding the migratory behavior of immune cells. Several recent studies have uncovered a relationship between somatic mutations and their impact on tumor-associated immune infiltration (Spranger and Gajewski 2016). We selected tumors across different cancers that were enriched or depleted for different immune cells and analyzed their mutational landscape for 61 cancer census genes. We identified mutations in known and novel pathways that are associated with the enrichment or depletion of specific immune cells in different cancers (Figure 13D). Significantly, loss-of-function mutations in the RNF43 and DOCK8 genes were associated with higher infiltration of CD8 T cells in colon adenocarcinoma (COAD) (Figure 13D). A snapshot of other oncogenic mutations is also shown in Figure 13D. The immune cell type-specific gene expression signatures enabled us to perform a comprehensive analysis of the immune infiltrate in the tumors and to discover novel genetic alterations associated with enrichment of specific immune cell types.

Example-7. Prognostic impact of tumor-infiltrated immune cells in different cancers

Cancer-related inflammation is considered the seventh hallmark of cancer (Tesniere, Zitvogel et al. 2006, Colotta, Allavena et al. 2009) and high tumor infiltrating leukocytes (TIL) is often correlated with increased progression-free survival (PFS) and overall survival (OS) in several solid tumors such as breast, colorectal, ovary and other cancers (Adams, Levine et al. 2009, Gooden, de Bock et al. 2011, Huh, Lee et al. 2012, Mao, Qu et al. 2016). Both targeted studies and large-scale genomic studies have revealed that different cancers benefit from infiltration of different immune cell types. For example, CD8⁺ T cells, activated macrophages (M1-type) and NK cells are associated with good survival, whereas myeloid-derived suppressor cells (MDSCs), Treg cells and alternatively activated macrophages (M2-type) are associated with poor survival (Aran, Lasry et al. 2016, Charoentong, Finotello et al. 2017).

Few studies, however, have investigated how different immune cell types cooperate or act against each other to impact survival (Colotta, Allavena et al. 2009, Varn, Wang et al. 2017). We interrogated 9640 tumors across all cancers in the TCGA data using cell-type specific MGESPs and rank-ordered tumors based on their immune cell content. The bottom and the top 20% of the samples, having the lowest and the highest immune cell infiltration scores, were used for the survival analysis. We selected 23 cancers that had at least 25 samples in the high and the low group and used their outcome data to analyze benefit or lack of benefit from immune cell infiltration. In accordance with other published studies, CD8⁺ T cell infiltration was associated with improved survival in seven of the 23 cancers, whereas monocyte/macrophage infiltration was associated with poor survival in seven of the 23 cancers (Figure 14A). Also, our analysis uncovered that both the lack of survival benefit and enhanced benefit were affected by the infiltration of multiple immune cells acting in combinations that differed between cancers (Figure 14A). As an example, kidney renal carcinoma (KIRC) benefitted from the infiltration of CD4⁺ T cells and neutrophils, whereas in sarcomas (SARC) the survival benefit was observed from both CD8⁺ T cells and monocytes. Conversely, low-grade glioma showed poor survival from the infiltration of Treg cells and monocytes. Monocytes and neutrophils are associated with poor outcome. Therefore, we wanted to test whether the survival benefit seen with these cell types was contributed by the co-infiltration of CD8⁺ or CD4⁺ T cells in the same tumor. As shown in Figure 14B and Table 17, the combined benefit of co-infiltration by CD8⁺ T cells + neutrophils in KIRC, or CD4⁺ T cells + monocytes in SARC, exceeded the survival benefit from individual cell types. Similarly, other novel associations between co-infiltration and survival are shown in multiple cancers (Table 17).
These observations strengthen the notion that cross-talk between different immune cell types is an important factor determining prognosis.
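The sample selection used for the survival analysis (bottom and top 20% by infiltration score, restricted to cancers with at least 25 samples in each group) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the cohort labels and the synthetic scores are hypothetical, and the survival comparison itself is not shown.

```python
def split_extremes(scores, fraction=0.20):
    """Return (low_ids, high_ids): bottom and top `fraction` of samples by score."""
    ranked = sorted(scores, key=scores.get)  # sample ids, lowest score first
    n = round(len(ranked) * fraction)
    return ranked[:n], ranked[len(ranked) - n:]

def eligible_cancers(scores_by_cancer, min_group_size=25, fraction=0.20):
    """Keep cancers with at least `min_group_size` samples in both extreme groups."""
    kept = {}
    for cancer, scores in scores_by_cancer.items():
        low, high = split_extremes(scores, fraction)
        if len(low) >= min_group_size and len(high) >= min_group_size:
            kept[cancer] = {"low": low, "high": high}
    return kept

# Hypothetical cohorts: sample id -> infiltration score (synthetic values).
cohorts = {
    "CANCER_A": {f"A{i}": i / 200 for i in range(200)},  # 40 samples per group
    "CANCER_B": {f"B{i}": i / 50 for i in range(50)},    # 10 samples per group
}
groups = eligible_cancers(cohorts)
```

Here only CANCER_A passes the 25-sample threshold, so only its low and high groups would be carried forward to an outcome comparison.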

Cancer staging is yet another important tumor phenotype used for predicting prognosis. Therefore, we assessed whether the immune cell composition altered from being protective to permissive with the progression of cancer. Our analysis indicated that in many cancers, such as COAD, SKCM, THCA and uterine corpus endometrial carcinoma (UCEC) there was a progressive decrease in CD8⁺ T cell infiltration with increased disease stage (Figure 14C). Conversely, monocyte infiltration increased with stage in many cancers, indicating adverse impact on survival. Taken together, the results of these analyses suggest that immune cell infiltration is a good prognostic marker for survival, and certain combinations of immune cells in the tumor microenvironment produce survival benefit in certain cancer types, but not in others.

Example-8. Immunogenomic features determining prognosis

It is well recognized that a strong pro-inflammatory tumor microenvironment characterized by the presence of CD8⁺ T cells, NK cells, and M1-type macrophages is strongly correlated with a long-term survival benefit, whereas an immune suppressive microenvironment characterized by Treg cells, MDSCs and alternatively activated macrophages (M2-type) predicts poor survival (Aran, Lasry et al. 2016, Chen and Mellman 2017). With the introduction of checkpoint inhibitors to treat cancer, there has been a renewed interest in defining the immunogenic state of a tumor to predict whether the tumor will be rejected by the host immune response, or escape an immune attack when treated with checkpoint blockade inhibitors. Therefore, we sought to determine how combinations of immune cells interact with each other to affect patient outcome in multiple cancers. Rather than separate tumors based on the infiltration of one or a few cell types, as has been done in a few studies (Becht, Giraldo et al. 2016, Li, Severson et al. 2016, Varn, Wang et al. 2017), we applied the 42-gene signature covering eight immune cell types on 9120 tumors and clustered them based on their combined immune infiltrate composition. The tumors clustered into four major clusters determined by their relative content of eight different immune cell types (Figure 15A). The distribution of cancers in each of the clusters is shown in Figure 15B. Three cancers, uveal melanoma (UVM), low-grade glioma (LGG) and glioblastoma (GBM), were enriched in cluster-1 (Figure 15B). Cluster-3 is exclusively composed of acute myeloid leukemia (LAML, 170 of 173 tumors) and clusters-2 and 4 are composed of tumors from many different cancers (Figure 15B). As expected from the cancer types present in each cluster, cluster-1 and 3 have poor epithelial content, whereas cluster-2 and 4 were enriched in epithelial tumors (Figure 15C).
The stromal content of cluster-2 and 4 was significantly higher than that of cluster-1 and 3, and the immune content of cluster-3 and 4 was significantly higher than that of clusters 1 and 2 (Figure 15C). Analysis of the immune infiltrate composition of all the clusters revealed interesting patterns of immune cell infiltration (Figure 15D). Cluster-1 and 2 had poor CD8⁺ T cells and NK cells but were enriched for macrophages and monocytes. Cluster-2 had significantly higher CD4⁺ T cells compared to all other clusters. Cluster-3 and 4 were characterized by high CD8⁺ T cells compared to cluster 1 and 2 (Figure 15D). Tumors in cluster-2 and 4 were enriched for Treg cells, which correlated with their higher CD4⁺ and CD8⁺ T cell content. As expected, cluster-3, consisting exclusively of LAML samples, had significantly lower macrophage content than all other clusters (Figure 15D).
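Grouping tumors by their combined immune infiltrate composition can be illustrated with a small clustering sketch. The source does not name the clustering method, so k-means is used here purely as an assumption; the deterministic initialization, the two-dimensional score vectors and the choice of two clusters are likewise illustrative (the analysis above used eight cell types and obtained four clusters).

```python
from math import dist

def kmeans(points, k, iterations=20):
    """Plain k-means on tuples of per-cell-type infiltration scores."""
    # Deterministic start: the first k points serve as initial centroids.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each tumor joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical tumors described by (CD8 score, monocyte score).
tumors = [(0.9, 0.1), (0.1, 0.9), (0.8, 0.2), (0.2, 0.8)]
labels, centroids = kmeans(tumors, k=2)
```

With this input the CD8-rich tumors and the monocyte-rich tumors fall into separate clusters; with real data, the number of clusters and the distance measure would need to be chosen and validated.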

Tumor mutational burden has been shown to sensitize tumors to cancer immunotherapy drugs through CD8⁺ T cell-mediated tumor killing (Rizvi, Hellmann et al. 2015, Nghiem, Bhatia et al. 2016, Le, Durham et al. 2017). Accordingly, we observed slightly higher mutational burden in cluster-4 tumors correlating with higher CD8⁺ T cell infiltration (Figure 15E). Next, we examined the distribution of MSI⁺ tumors in the four clusters and detected ~75% of all MSI⁺ tumors in cluster-2 (102 out of 135 tumors). Interestingly, cluster-4 with highest CD8⁺ T cells had very few MSI⁺ tumors (19 out of 1552).
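The proportions quoted above follow directly from the counts given in the text. The short check below assumes that the 1552 figure is the size of cluster-4, as stated later for that cluster; the variable names are illustrative.

```python
# Counts taken from the text above.
msi_total = 135          # all MSI-positive tumors
msi_in_cluster2 = 102    # MSI-positive tumors found in cluster-2
cluster4_size = 1552     # tumors in cluster-4
msi_in_cluster4 = 19     # MSI-positive tumors found in cluster-4

share_of_msi_in_cluster2 = 100 * msi_in_cluster2 / msi_total
msi_rate_in_cluster4 = 100 * msi_in_cluster4 / cluster4_size

print(f"{share_of_msi_in_cluster2:.1f}% of MSI-positive tumors are in cluster-2")
print(f"{msi_rate_in_cluster4:.1f}% of cluster-4 tumors are MSI-positive")
```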

The landscape of immune infiltrates in MSI⁺ tumors indicated that they have higher infiltration of CD8⁺ T cells, Treg cells, and NK cells compared to MSI⁻ tumors, correlating with their sensitivity to checkpoint blockade inhibitors (Le, Uram et al. 2015, Le, Durham et al. 2017) (Figure 15F).

Immune-mediated mechanisms affecting prognosis

Given that each cluster contains a unique composition of immune cells in their tumor microenvironment, we next investigated whether there were any differences in survival between the clusters. We first compared the survival of individuals belonging to cluster-1 and cluster-2 with cluster-4 to assess the effect of high CD8⁺ T cell infiltration on survival. Cluster-4 tumors had slightly better prognosis compared to cluster-1 or cluster-2, confirming that infiltration of CD8⁺ T cells correlates with better prognosis (Figure 16A-B). Next, we investigated molecular features associated with good vs. poor prognosis in cluster-4 tumors rich in CD8⁺ T cells. We were particularly interested in uncovering the mechanism of poor prognosis in high CD8⁺ T cell-infiltrated tumors.

Of the 1552 cases in cluster-4, 1200 were alive, and the remaining were deceased. The composition of the tumor microenvironment with respect to epithelial and stromal content indicated that tumors associated with the deceased group had lower epithelial and higher stromal content (Figure 16C, left panel) and lower CD8⁺ T cell infiltrate but was high in monocytes (Figure 16C, right panel). We also observed lack of any significant difference in the expression of inflammation markers, and the expression of immunosuppressive factors in the deceased compared to the alive group (Figure 16D) (Table 18). Next, we determined whether the parainflammation score (Aran, Lasry et al. 2016) in the deceased group was higher, given that the tumors had high infiltration of monocytes. We failed to observe a significant difference in the parainflammation score between the two groups (Figure 16E) suggesting that the lack of survival benefit from high CD8⁺ T cell infiltration is likely to be associated with other aspects of CD8⁺ T cell biology.

IFNG IFNGR1 IFNGR2 IFNK IFNW1 NOD1 NOD2

Therefore, we assessed the functional state of CD8 T cells between the deceased and the alive groups by examining the expression of anergic and exhaustion markers on T cells (Chappert and Schwartz 2010, Crespo, Sun et al. 2013). Significantly, whereas in both alive and deceased groups, CD8⁺ T cells expressed the activation marker PD-1, the deceased group was specifically enriched in CD8⁺ T cells expressing anergic and exhaustion markers - CTLA-4, LAG3, and TIM3, indicating dysfunctional CD8⁺ T cells in the tumor microenvironment of the deceased group (Figure 16F). The dysfunctional T cells in this group showed reduced expression of CTL markers (Figure 16G), further confirming their weak anti-tumor response. Many studies have demonstrated that the cytolytic function of anergic and exhausted CD8⁺ T cells can sometimes be reversed by anti-PD-1 antibody to potentiate their anti-tumor activity (Kumar, Yu et al. 2017). Therefore, accurate detection of the functional state of CD8⁺ T cells in the tumor may prove to be an important biomarker of patient selection for checkpoint blockade therapy.

Having identified that the tumor microenvironment of the deceased group is enriched in dysfunctional CD8⁺ T cells, we next examined factors that may have contributed to the anergic/exhausted phenotype. We compiled genes differentially expressed between the tumors belonging to the alive or the deceased groups and mapped the differentially expressed genes into pathways using REACTOME (Table 16). Pathway mapping revealed significant upregulation, in the alive group, of a core network of genes functioning in T cell receptor (TCR) signaling. Remarkably, all the upregulated genes encode proteins of the TCR complex and proximal kinases that transduce signaling following TCR activation (Figure 16G). Taken together, the dysfunctional state of CD8⁺ T cells in the deceased group was linked to the reduced expression of TCR signaling genes (Figure 17), and high expression of genes associated with the anergic and exhausted phenotype. Figure 18 summarizes the utility of the gene expression signature in predicting long-term survival of a cancer patient based on the immune microenvironment of the tumor.

Example-8. TCR signaling genes predict response to Ipilimumab therapy in melanoma

The upregulated TCR signaling genes associated with long-term survival as described in Example-7 were tested on a dataset of patient response to Ipilimumab. The RNA-seq data available in the public domain (GSE91061) contains data from 65 melanoma patients of which 36 were treated with anti-CTLA-4 checkpoint inhibitor (Ipilimumab) (Riaz, Havel et al. 2017). We applied our gene signature correlating with long-term survival on Ipilimumab-treated patient data and observed that patients who had a response to Ipilimumab (partial or complete response - PR/CR) had a higher score than patients who had stable disease (SD) or progressive disease (PD) (Figure 19). In fact, the score increased with response - PR/CR > SD > PD. Therefore, the signature of long-term survival has further clinical utility in predicting response to checkpoint inhibitors such as Ipilimumab.
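The comparison of signature scores across response groups can be sketched as follows. The text does not specify how the score is computed, so the sketch assumes the score is the mean expression of the signature genes in a sample; the gene names are placeholders rather than the actual TCR signaling genes, and the expression values are synthetic.

```python
from statistics import mean

def signature_score(expression, signature_genes):
    """Average expression of the signature genes present in one sample."""
    values = [expression[g] for g in signature_genes if g in expression]
    return mean(values) if values else float("nan")

def mean_score_by_response(samples, signature_genes):
    """samples: list of (response_label, expression dict) -> label -> mean score."""
    by_label = {}
    for label, expression in samples:
        score = signature_score(expression, signature_genes)
        by_label.setdefault(label, []).append(score)
    return {label: mean(scores) for label, scores in by_label.items()}

# Placeholder gene names and synthetic expression values, for illustration only.
SIGNATURE = ["GENE_A", "GENE_B", "GENE_C"]
samples = [
    ("PR/CR", {"GENE_A": 8.0, "GENE_B": 7.0, "GENE_C": 9.0}),
    ("SD", {"GENE_A": 6.0, "GENE_B": 5.0, "GENE_C": 7.0}),
    ("PD", {"GENE_A": 3.0, "GENE_B": 4.0, "GENE_C": 2.0}),
    ("PD", {"GENE_A": 4.0, "GENE_B": 3.0, "GENE_C": 2.0}),
]
group_means = mean_score_by_response(samples, SIGNATURE)
```

The synthetic values are chosen so that the group means follow the ordering PR/CR > SD > PD; with real data the ordering is an empirical result, not a property of the scoring function.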

References

Adams, S. F., D. A. Levine, M. G. Cadungog, R. Hammond, A. Facciabene, N. Olvera, S. C. Rubin, J. Boyd, P. A. Gimotty and G. Coukos (2009). "Intraepithelial T cells and tumor proliferation: impact on the benefit from surgical cytoreduction in advanced serous ovarian cancer." Cancer 115(13): 2891-2902.

Aran, D., A. Lasry, A. Zinger, M. Biton, E. Pikarsky, A. Hellman, A. J. Butte and Y. Ben-Neriah (2016). "Widespread parainflammation in human cancer." Genome Biol 17(1): 145.

Becht, E., N. A. Giraldo, L. Lacroix, B. Buttard, N. Elarouci, F. Petitprez, J. Selves, P. Laurent-Puig, C. Sautes-Fridman, W. H. Fridman and A. de Reynies (2016). "Estimating the population abundance of tissue-infiltrating immune and stromal cell populations using gene expression." Genome Biol 17(1): 218.

Bhattacharya, S., S. Andorf, L. Gomes, P. Dunn, H. Schaefer, J. Pontius, P. Berger, V. Desborough, T. Smith, J. Campbell, E. Thomson, R. Monteiro, P. Guimaraes, B. Walters, J. Wiser and A. J. Butte (2014). "ImmPort: disseminating data to the public for the future of immunology." Immunol Res 58(2-3): 234-239.

Carreras, J., Y. Y. Kikuti, S. Bea, M. Miyaoka, S. Hiraiwa, H. Ikoma, R. Nagao, S. Tomita, D. Martin-Garcia, I. Salaverria, A. Sato, A. Ichiki, G. Roncador, J. F. Garcia, K. Ando, E. Campo and N. Nakamura (2017). "Clinicopathological characteristics and genomic profile of primary sinonasal tract diffuse large B cell lymphoma (DLBCL) reveals gain at 1q31 and RGS1 encoding protein; high RGS1 immunohistochemical expression associates with poor overall survival in DLBCL not otherwise specified (NOS)." Histopathology 70(4): 595-621.

Chandran, U. R., O. P. Medvedeva, M. M. Barmada, P. D. Blood, A. Chakka, S. Luthra, A. Ferreira, K. F. Wong, A. V. Lee, Z. Zhang, R. Budden, J. R. Scott, A. Berndt, J. M. Berg and R. S. Jacobson (2016). "TCGA Expedition: A Data Acquisition and Management System for TCGA Data." PLoS One 11(10): e0165395.

Chappert, P. and R. H. Schwartz (2010). "Induction of T cell anergy: integration of environmental cues and infectious tolerance." Curr Opin Immunol 22(5): 552-559.

Charoentong, P., F. Finotello, M. Angelova, C. Mayer, M. Efremova, D. Rieder, H. Hackl and Z. Trajanoski (2017). "Pan-cancer Immunogenomic Analyses Reveal Genotype-Immunophenotype Relationships and Predictors of Response to Checkpoint Blockade." Cell Rep 18(1): 248-262.

Chen, D. S. and I. Mellman (2017). "Elements of cancer immunity and the cancer-immune set point." Nature 541(7637): 321-330.

Clough, E. and T. Barrett (2016). "The Gene Expression Omnibus Database." Methods Mol Biol 1418: 93-110.

Colotta, F., P. Allavena, A. Sica, C. Garlanda and A. Mantovani (2009). "Cancer-related inflammation, the seventh hallmark of cancer: links to genetic instability." Carcinogenesis 30(7): 1073-1081.

Crespo, J., H. Sun, T. H. Welling, Z. Tian and W. Zou (2013). "T cell anergy, exhaustion, senescence, and stemness in the tumor microenvironment." Curr Opin Immunol 25(2): 214-221.

Fabregat, A., S. Jupe, L. Matthews, K. Sidiropoulos, M. Gillespie, P. Garapati, R. Haw, B. Jassal, F. Korninger, B. May, M. Milacic, C. D. Roca, K. Rothfels, C. Sevilla, V. Shamovsky, S. Shorser, T. Varusai, G. Viteri, J. Weiser, G. Wu, L. Stein, H. Hermjakob and P. D'Eustachio (2017). "The Reactome Pathway Knowledgebase." Nucleic Acids Res.

Gooden, M. J., G. H. de Bock, N. Leffers, T. Daemen and H. W. Nijman (2011). "The prognostic influence of tumour-infiltrating lymphocytes in cancer: a systematic review with meta-analysis." Br J Cancer 105(1): 93-103.

Greenbaum, J., J. Sidney, J. Chung, C. Brander, B. Peters and A. Sette (2011). "Functional classification of class II human leukocyte antigen (HLA) molecules reveals seven different supertypes and a surprising degree of repertoire sharing across supertypes." Immunogenetics 63(6): 325-335.

Hall, M. A. (1999). "Correlation-based Feature Selection for Machine Learning."

Hambardzumyan, D., D. H. Gutmann and H. Kettenmann (2016). "The role of microglia and macrophages in glioma maintenance and progression." Nat Neurosci 19(1): 20-27.

Huh, J. W., J. H. Lee and H. R. Kim (2012). "Prognostic significance of tumor-infiltrating lymphocytes for patients with colorectal cancer." Arch Surg 147(4): 366-372.

Karosiene, E., C. Lundegaard, O. Lund and M. Nielsen (2012). "NetMHCcons: a consensus method for the major histocompatibility complex class I predictions." Immunogenetics 64(3): 177-186.

Kawashima, S. and M. Kanehisa (2000). "AAindex: amino acid index database." Nucleic Acids Res 28(1): 374.

Kumar, R., F. Yu, Y. H. Zhen, B. Li, J. Wang, Y. Yang, H. X. Ge, P. S. Hu and J. Xiu (2017). "PD-1 blockade restores impaired function of ex vivo expanded CD8+ T cells and enhances apoptosis in mismatch repair deficient EpCAM+PD-L1+ cancer cells." Onco Targets Ther 10: 3453-3465.

Le, D. T., J. N. Durham, K. N. Smith, H. Wang, B. R. Bartlett, L. K. Aulakh, S. Lu, H. Kemberling, C. Wilt, B. S. Luber, F. Wong, N. S. Azad, A. A. Rucki, D. Laheru, R. Donehower, A. Zaheer, G. A. Fisher, T. S. Crocenzi, J. J. Lee, T. F. Greten, A. G. Duffy, K. K. Ciombor, A. D. Eyring, B. H. Lam, A. Joe, S. P. Kang, M. Holdhoff, L. Danilova, L. Cope, C. Meyer, S. Zhou, R. M. Goldberg, D. K. Armstrong, K. M. Bever, A. N. Fader, J. Taube, F. Housseau, D. Spetzler, N. Xiao, D. M. Pardoll, N. Papadopoulos, K. W. Kinzler, J. R. Eshleman, B. Vogelstein, R. A. Anders and L. A. Diaz, Jr. (2017). "Mismatch repair deficiency predicts response of solid tumors to PD-1 blockade." Science 357(6349): 409-413.

Le, D. T., J. N. Uram, H. Wang, B. R. Bartlett, H. Kemberling, A. D. Eyring, A. D. Skora, B. S. Luber, N. S. Azad, D. Laheru, B. Biedrzycki, R. C. Donehower, A. Zaheer, G. A. Fisher, T. S. Crocenzi, J. J. Lee, S. M. Duffy, R. M. Goldberg, A. de la Chapelle, M. Koshiji, F. Bhaijee, T. Huebner, R. H. Hruban, L. D. Wood, N. Cuka, D. M. Pardoll, N. Papadopoulos, K. W. Kinzler, S. Zhou, T. C. Cornish, J. M. Taube, R. A. Anders, J. R. Eshleman, B. Vogelstein and L. A. Diaz, Jr. (2015). "PD-1 Blockade in Tumors with Mismatch-Repair Deficiency." N Engl J Med 372(26): 2509-2520.

Li, B., E. Severson, J. C. Pignon, H. Zhao, T. Li, J. Novak, P. Jiang, H. Shen, J. C. Aster, S. Rodig, S. Signoretti, J. S. Liu and X. S. Liu (2016). "Comprehensive analyses of tumor immunity: implications for cancer immunotherapy." Genome Biol 17(1): 174.

Mao, Y., Q. Qu, X. Chen, O. Huang, J. Wu and K. Shen (2016). "The Prognostic Value of Tumor-Infiltrating Lymphocytes in Breast Cancer: A Systematic Review and Meta-Analysis." PLoS One 11(4): e0152500.

Nghiem, P. T., S. Bhatia, E. J. Lipson, R. R. Kudchadkar, N. J. Miller, L. Annamalai, S. Berry, E. K. Chartash, A. Daud, S. P. Fling, P. A. Friedlander, H. M. Kluger, H. E. Kohrt, L. Lundgren, K. Margolin, A. Mitchell, T. Olencki, D. M. Pardoll, S. A. Reddy, E. M. Shantha, W. H. Sharfman, E. Sharon, L. R. Shemanski, M. M. Shinohara, J. C. Sunshine, J. M. Taube, J. A. Thompson, S. M. Townson, J. H. Yearley, S. L. Topalian and M. A. Cheever (2016). "PD-1 Blockade with Pembrolizumab in Advanced Merkel-Cell Carcinoma." N Engl J Med 374(26): 2542-2552.

Nielsen, M., C. Lundegaard, O. Lund and C. Kesmir (2005). "The role of the proteasome in generating cytotoxic T-cell epitopes: insights obtained from improved predictions of proteasomal cleavage." Immunogenetics 57(1-2): 33-41.

Peters, B., S. Bulik, R. Tampe, P. M. Van Endert and H. G. Holzhutter (2003). "Identifying MHC class I epitopes by predicting the TAP transport efficiency of epitope precursors." J Immunol 171(4): 1741-1749.

Riaz, N., J. J. Havel, V. Makarov, A. Desrichard, W. J. Urba, J. S. Sims, F. S. Hodi, S. Martin-Algarra, R. Mandal, W. H. Sharfman, S. Bhatia, W. J. Hwu, T. F. Gajewski, C. L. Slingluff, Jr., D. Chowell, S. M. Kendall, H. Chang, R. Shah, F. Kuo, L. G. T. Morris, J. W. Sidhom, J. P. Schneck, C. E. Horak, N. Weinhold and T. A. Chan (2017). "Tumor and Microenvironment Evolution during Immunotherapy with Nivolumab." Cell 171(4): 934-949 e915.

Rizvi, N. A., M. D. Hellmann, A. Snyder, P. Kvistborg, V. Makarov, J. J. Havel, W. Lee, J. Yuan, P. Wong, T. S. Ho, M. L. Miller, N. Rekhtman, A. L. Moreira, F. Ibrahim, C. Bruggeman, B. Gasmi, R. Zappasodi, Y. Maeda, C. Sander, E. B. Garon, T. Merghoub, J. D. Wolchok, T. N. Schumacher and T. A. Chan (2015). "Cancer immunology. Mutational landscape determines sensitivity to PD-1 blockade in non-small cell lung cancer." Science 348(6230): 124-128.

Romero Arenas, M. A., R. G. Fowler, F. A. San Lucas, J. Shen, T. A. Rich, E. G. Grubbs, J. E. Lee, P. Scheet, N. D. Perrier and H. Zhao (2014). "Preliminary whole-exome sequencing reveals mutations that imply common tumorigenicity pathways in multiple endocrine neoplasia type 1 patients." Surgery 156(6): 1351-1357; discussion 1357-1358.

Rooney, M. S., S. A. Shukla, C. J. Wu, G. Getz and N. Hacohen (2015). "Molecular and genetic properties of tumors associated with local immune cytolytic activity." Cell 160(1-2): 48-61.

Sidney, J., B. Peters, N. Frahm, C. Brander and A. Sette (2008). "HLA class I supertypes: a revised and updated classification." BMC Immunol 9: 1.

Spranger, S. and T. F. Gajewski (2016). "Tumor-intrinsic oncogene pathways mediating immune avoidance." Oncoimmunology 5(3): e1086862.

Subramanian, A., P. Tamayo, V. K. Mootha, S. Mukherjee, B. L. Ebert, M. A. Gillette, A. Paulovich, S. L. Pomeroy, T. R. Golub, E. S. Lander and J. P. Mesirov (2005). "Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles." Proc Natl Acad Sci U S A 102(43): 15545-15550.

Tesniere, A., L. Zitvogel and G. Kroemer (2006). "The immune system: taming and unleashing cancer." Discov Med 6(36): 211-216.

Tirosh, I., B. Izar, S. M. Prakadan, M. H. Wadsworth, 2nd, D. Treacy, J. J. Trombetta, A. Rotem, C. Rodman, C. Lian, G. Murphy, M. Fallahi-Sichani, K. Dutton-Regester, J. R. Lin, O. Cohen, P. Shah, D. Lu, A. S. Genshaft, T. K. Hughes, C. G. Ziegler, S. W. Kazer, A. Gaillard, K. E. Kolb, A. C. Villani, C. M. Johannessen, A. Y. Andreev, E. M. Van Allen, M. Bertagnolli, P. K. Sorger, R. J. Sullivan, K. T. Flaherty, D. T. Frederick, J. Jane-Valbuena, C. H. Yoon, O. Rozenblatt-Rosen, A. K. Shalek, A. Regev and L. A. Garraway (2016). "Dissecting the multicellular ecosystem of metastatic melanoma by single-cell RNA-seq." Science 352(6282): 189-196.

Turajlic, S., K. Litchfield, H. Xu, R. Rosenthal, N. McGranahan, J. L. Reading, Y. N. S. Wong, A. Rowan, N. Kanu, M. Al Bakir, T. Chambers, R. Salgado, P. Savas, S. Loi, N. J. Birkbak, L. Sansregret, M. Gore, J. Larkin, S. A. Quezada and C. Swanton (2017). "Insertion-and-deletion-derived tumour-specific neoantigens and the immunogenic phenotype: a pan-cancer analysis." Lancet Oncol 18(8): 1009-1021.

Varn, F. S., Y. Wang, D. W. Mullins, S. Fiering and C. Cheng (2017). "Systematic Pan-Cancer Analysis Reveals Immune Cell Interactions in the Tumor Microenvironment." Cancer Res 77(6): 1271-1282.

Wang, P., Y. Yang, W. Han and D. Ma (2015). "ImmuSort, a database on gene plasticity and electronic sorting for immune cells." Sci Rep 5: 10370.

Weissferdt, A., J. Fujimoto, N. Kalhor, J. Rodriguez, R. Bassett, Wistuba, II and C. A. Moran (2017). "Expression of PD-1 and PD-L1 in thymic epithelial neoplasms." Mod Pathol 30(6): 826-833.

Wolfl, M. and P. D. Greenberg (2014). "Antigen-specific activation and cytokine-facilitated expansion of naive, human CD8+ T cells." Nat Protoc 9(4): 950-966.

Claims

What is claimed is:

1. A method of validating peptide variant(s) as an immunogenic peptide comprising:

A. selecting a peptide variant(s) predicted to be an immunogenic peptide comprising the steps of

1) obtaining a sample from a subject with a tumor;

2) identifying genetically altered protein(s) expressed by a mammalian tumor cell or a mammalian tumor tissue in the sample from nucleic acid sequence(s) encoding the genetically altered protein(s);

3) producing peptide fragment(s) comprising at least one mutated amino acid from the genetically altered protein(s) so identified in step A.2, so as to obtain one or more peptide variant(s) associated with the mammalian tumor cell or the mammalian tumor tissue,

4) selecting the peptide variant(s) from step A.3 predicted to bind T-cell receptor (TCR) comprising: i) selecting the peptide variant(s) of a pre-defined length; ii) characterizing the peptide variant(s) in silico by selecting and matching features associated with an amino acid at each position of the peptide with selected pre-defined features for each position of peptides recognized by TCR associated with CD8+ T-cell, so as to obtain predictive ability of the peptide variant(s) to interact with the TCR; wherein the selected pre-defined features comprise hydrophobic, helix/turn motif, polar, non-polar, β-sheet structure motif, charge of main chain, charge of side chain, solvent accessibility of an amino acid, spatial flexibility of the main chain and spatial flexibility of side chain of an amino acid; iii) selecting the peptide variant(s) in step 4.ii based on predicted ability of the peptide variant(s) to interact with the TCR, so as to be an immunogenic peptide that may or can serve as a mammalian tumor immunogenic peptide(s); and

B. validating one or more immunogenic peptide(s) of step A comprising the step of

1) determining whether the peptide variant(s) so selected is positive in an ex vivo CD8+ T-cell activation assay, and selecting the peptide variant(s) which is positive in a CD8+ T-cell activation assay so as to ensure ability of the peptide(s) to activate CD8+ T-cells, thereby validating the peptide variant(s) as an immunogenic peptide.

2. A method of selecting one or more validated immunogenic peptides for a cancer vaccine cocktail comprising one or more validated immunogenic peptide by the method of claim 1, and in step B of claim 1 further comprising the steps:

A. quantitating the magnitude of CD8+ T-cell activation of the peptide variant(s) in step B.1, wherein peptide variant(s) generating about >2-fold expression of CD8+ T-cell activation marker IFN-γ and/or about two fold expansion of CD8+ T-cell expressing IFN-γ compared to wild-type peptide or no-peptide control are selected;

B. determining monoclonal and polyclonal CD8+ T-cell amplification response in an ex vivo CD8+ T-cell activation assay to the peptide variant(s), such that the monoclonal and polyclonal CD8+ T-cell expansion is directed or skewed towards a polyclonal expansion of CD8+ T cells, such that a single peptide variant which activates two or more peptide variant-specific CD8+ T-cell clones is selected;

C. determining functional competence of CD8+ T-cells by quantitating expression of CTL markers in CD8+ T-cells in response to the peptide variant(s) in step B.1 such that the CTL markers comprise expression of four or more of IFN-γ, IL-2, TNF-α, CD69, Perforin, Granzyme A, Granzyme B, Granulysin, Fas-L and CD107a or a combination thereof or all and selecting peptide variant(s) which express four or more CTL markers; and

D. determining the anergic/exhaustion phenotype of CD8+ T-cells expanded in response to the peptide variant(s) and selecting peptide variant(s) inducing low or no expression of anergic and/or exhaustion markers in the expanded population of CD8+ T-cells wherein the anergic and/or exhaustion markers includes any, all or a combination of CTLA-4, PD-1, Eomes, CD160, TIGIT, ENTPD1, MYO7A, PHLDA1, LAG-3, 2B4, BTLA, TIM3, VISTA and CD96; thereby, selecting one or more validated immunogenic peptides for the cancer vaccine cocktail.

3. The method of claim 1, wherein the positive prediction of the validated immunogenic peptide to be bound by TCR in step (A)(4) comprises a TCR-binding algorithm, wherein the TCR-binding algorithm comprises:

A. peptide(s) of a pre-defined length comprising one or more mutations and/or one or more alterations in level of expressed genetic material associated with the tumor; and

B. selecting and matching features associated with an amino acid at each position of the peptide with selected pre-defined features for each position of peptides recognized by TCR associated with either CD8+ T-cell, so as to obtain predictive ability of the peptide(s) to interact with the TCR; wherein the features comprise physicochemical features of amino acids and wherein the physicochemical features are selected from an amino acid index and wherein the amino acid index is AAindex1 section of Amino Acid Index database or its equivalent.

4. The method of claim 2, wherein the magnitude of T cell activation further comprises determining percent of antigen-specific T cells producing activation markers.

5. The method of claim 4, further comprising determining magnitude of activation marker expressed or produced by the percent of antigen-specific T cells producing activation markers.

6. The method of claim 5, wherein the magnitude of T cell activation favourable toward a peptide's inclusion in the cocktail comprises a greater percent of antigen-specific T cells producing activation markers but at a moderate or low level of expression in expressing cells.

7. The method of claim 4, wherein antigen-specific T cells are or comprise CD8+ T cells binding the validated immunogenic peptide.

8. The method of claim 2, wherein the antigen-specific T cells producing activation markers are activated CD8+ T-cells producing markers selected from the group consisting of IFN-γ, IL-2, TNF-α, LT-α, CXCL12, STAT1, STAT4 and T-bet and a combination thereof.

9. The method of claim 2, wherein the antigen-specific T cells producing activation markers are activated CD8+ T-cells producing markers selected from the group consisting of IFN-γ, IL-2, TNF-α, CD69, Perforin, Granzyme A, Granzyme B, Granulysin, Fas L and CD107a and a combination thereof.

10. The method of claim 2, wherein the activation markers are selected from the group consisting of IFN-γ, TNF-α and a combination thereof.

11. The method of claim 2, wherein the monoclonal and polyclonal T-cell amplification response is directed or skewed toward polyclonal expansion of T cells with 2 or more vaccine specific T cell clones.

12. The method of claim 2, wherein the activation markers are IFN-γ, TNF-α and IL-2.

13. The method of claim 2, wherein the functional competence of T cells by expression of CTL markers comprises expression of IFN-γ, IL-2, TNF-α, CD69, Perforin, Granzyme A, Granzyme B, Granulysin, Fas L and CD107a or a combination thereof.

14. The method of claim 2, wherein the functional competence of T cells by expression of CTL markers comprises expression of IFN-γ, TNF-α or a combination thereof.

15. The method of claim 2, wherein the anergic and/or exhaustion markers for T cells are selected from the group consisting of CTLA-4, PD-1, Eomes, CD160, TIGIT, ENTPD1, MYO7A, PHLDA1, LAG-3, 2B4, BTLA, TIM3, VISTA and CD96 and a combination thereof.

16. The method of claim 3, wherein the algorithm is favourable or skewed toward selection of validated immunogenic peptide for peptides with the characteristics comprising:

A. a polyclonal T cell amplification response;

B. a greater percent of antigen-specific T cells producing activation markers;

C. a moderate or low expression of activation markers by expressing T cells;

D. free or deficient in anergic and/or exhaustion markers for T cells.

17. The method of claim 2, wherein the selected cancer vaccine produces an immunogenic response comprising a polyclonal T cell amplification of 2 or more T cell clones.

18. The method of claim 17, wherein the T cell clones are activated CD8+ T cells.

19. The method of claim 2 comprising the steps 2 to 5 for selecting one or more peptide variant(s) to be included in the cancer vaccine cocktail.

20. The method of claim 20 comprising the steps 2 to 5 for selecting two or more peptide variant(s) to be included in the cancer vaccine cocktail.

21. The method of claim 20 comprising the steps 2 to 5 for selecting five or more peptide variant(s) to be included in the cancer vaccine cocktail.

22. The method of claim 20 comprising the steps 2 to 5 for selecting ten or more peptide variant(s) to be included in the cancer vaccine cocktail.

23. The method of claim 204 comprising the steps 2 to 5 for selecting thirty or fewer peptide variant(s) to be included in the cancer vaccine cocktail.

24. The method of claim 20 comprising the steps 2 to 5 for selecting twenty or fewer peptide variant(s) to be included in the cancer vaccine cocktail.

25. The method of claim 20 comprising the steps 2 to 5 for selecting five to thirty peptide variant(s) to be included in the cancer vaccine cocktail.

26. The method of claim 20 comprising the steps 2 to 5 for selecting five peptide variant(s) to be included in the cancer vaccine cocktail.

27. The method of claim 20 comprising the steps 2 to 5 for selecting ten peptide variant(s) to be included in the cancer vaccine cocktail.

28. The method of claim 20 comprising the steps 2 to 5 for selecting fifteen peptide variant(s) to be included in the cancer vaccine cocktail.

29. A method of selecting a cancer vaccine from genetically altered protein(s) expressed by a mammalian cancer cell and/or tissue which comprises:

A. identifying neo-epitopes in mutant cancer peptides from the genetically altered protein(s) which is from the mammalian cancer cell and/or tissue;

B. calculating probability of TCR binding of the neo-epitope(s) of (a) to generate a T-cell response, thereby identifying a T-cell activating neo-epitope(s) from the genetically altered protein;

C. selecting one or more mutant cancer peptide(s) so identified from (b) having the highest probability or a probability above a threshold setting that can modulate the immune response of a mammal when challenged with the mutant cancer peptide(s), thereby selecting a cancer vaccine; wherein the cancer vaccine comprises one or more mutant cancer peptides derived from the genetically altered protein(s) and wherein the mammalian subject expresses the genetically altered protein(s) and expresses an HLA or MHC molecule that binds the mutant cancer peptide(s).
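Step C of claim 29 can be illustrated with a minimal sketch, assuming the TCR-binding probabilities of step B are already available as a mapping from peptide to probability. The function name `select_peptides`, the peptide labels, and the probability values are hypothetical and are not taken from the specification.

```python
from typing import Dict, List, Optional


def select_peptides(tcr_binding_prob: Dict[str, float],
                    threshold: Optional[float] = None,
                    top_n: int = 1) -> List[str]:
    """Pick peptides by TCR-binding probability.

    If a threshold is given, return every peptide whose probability is
    above it; otherwise return the top_n highest-probability peptides.
    """
    # Rank peptides from highest to lowest probability.
    ranked = sorted(tcr_binding_prob.items(), key=lambda kv: kv[1], reverse=True)
    if threshold is not None:
        return [pep for pep, p in ranked if p > threshold]
    return [pep for pep, _ in ranked[:top_n]]


# Hypothetical peptide labels and probabilities, for illustration only.
example = {"PEP_A": 0.91, "PEP_B": 0.42, "PEP_C": 0.77}
above_threshold = select_peptides(example, threshold=0.7)  # ["PEP_A", "PEP_C"]
highest_only = select_peptides(example)                    # ["PEP_A"]
```

The two modes correspond to the two alternatives recited in step C: the highest probability, or a probability above a threshold setting.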

30. A cancer vaccine selected by the method of claim 29.

31. The method of claim 29, wherein the mutant cancer peptide is a peptide from Table 4.

32. A method of preparing a subject-specific immunogenic composition comprising selecting a cancer vaccine from genetically altered protein(s) expressed by a mammalian cancer cell and/or tissue by the method of claim 1, thereby preparing the subject-specific immunogenic composition.

33. The method of claim 1, wherein the validated immunogenic peptides have one or more amino acid mutations and bind to HLA or MHC proteins of the subject with an IC50 less than about 1000 nM.

34. A formulation comprising a validated immunogenic peptide prepared by the method of claim 1 or 2.

35. A method of treating a cancer comprising administering one or more of the cancer vaccine cocktail of claim 2 into a subject in need thereof thereby treating the cancer.

36. The method of claim 35, wherein the cancer is a stomach cancer, a colorectal cancer, a colon cancer, a breast cancer, an ovarian cancer, a prostate cancer, a lung cancer, a kidney cancer, a gastric cancer, a testicular cancer, a head and neck cancer, a pancreatic cancer, a brain cancer, a melanoma, a lymphoma or a leukemia.

37. The method of claim 36, wherein the colorectal cancer is a familial cancer selected from the group consisting of familial adenomatous polyposis (FAP) and Lynch Syndrome.

38. A method for obtaining a minimal gene expression signature associated with a specific immune cell type and/or subtype that distinguishes the specific immune cell type and/or subtype from other immune cell types and/or subtypes comprising:

A. obtaining a plurality of samples from a plurality of subjects (one or more sample from one or more subject);

B. determining gene expression of the specific immune cell type and/or subtype from the samples;

C. determining gene expression of other immune cell types and/or subtypes from the samples;

D. comparing the gene expression of (b) with (c) so as to identify for each immune cell type and/or subtype, the highest gene expression within each immune cell type and/or subtype but having greatest variance in gene expression between different immune cell types and/or subtypes;

E. selecting genes so identified in (d) with low plasticity of expression so as to reflect consistent gene expression or lowest variance in gene expression within each immune cell type and/or subtype; and

F. validating utility of the selected genes from (e) for ability to discriminate cognate immune cell type and/or subtype from non-cognate immune cell type, and validating gene expression signature as a minimal gene expression signature consisting of a minimal set of genes with greatest difference in differentiating cognate from non-cognate immune cell type and/or subtypes; and

G. optionally, changing composition of the selected genes in (f) following discovery of an improved smaller subset of selected genes selected from (f) during validation in (f);

thereby, obtaining a minimal gene expression signature associated with a specific immune cell type and/or subtype that distinguishes the specific immune cell type and/or subtype from other immune cell types and/or subtypes.
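Steps D and E of claim 38 can be sketched as a simple filter, assuming expression values are organized as cell type → gene → per-sample values, that every cell type reports the same genes, and that at least two cell types are present. The coefficient-of-variation cutoff `max_cv`, the signature size `top_k`, the function name, and all numbers below are hypothetical choices made for illustration; the claim does not prescribe a particular statistic.

```python
from statistics import mean, pstdev
from typing import Dict, List

# cell type -> gene -> expression values across samples
Expression = Dict[str, Dict[str, List[float]]]


def candidate_signature(expr: Expression, target: str,
                        max_cv: float = 0.25, top_k: int = 2) -> List[str]:
    """Rank genes that are highest in `target` and stable within it."""
    scored = []
    for gene, values in expr[target].items():
        target_mean = mean(values)
        if target_mean <= 0:
            continue
        # Within-type variability (step E): coefficient of variation.
        cv = pstdev(values) / target_mean
        # Between-type separation (step D): margin over the best other type.
        other_means = [mean(expr[ct][gene]) for ct in expr if ct != target]
        margin = target_mean - max(other_means)
        if margin > 0 and cv <= max_cv:
            scored.append((margin, gene))
    scored.sort(reverse=True)
    return [gene for _, gene in scored[:top_k]]


# Hypothetical expression values, for illustration only.
expr = {
    "CD8_T": {"CD8A": [9.0, 9.5, 9.2], "CD14": [1.0, 1.2, 0.9], "GENE_X": [8.0, 2.0, 5.0]},
    "Monocyte": {"CD8A": [1.0, 1.1, 0.8], "CD14": [9.8, 10.1, 9.9], "GENE_X": [1.0, 1.2, 0.9]},
}
cd8_genes = candidate_signature(expr, "CD8_T")        # ["CD8A"]
monocyte_genes = candidate_signature(expr, "Monocyte")  # ["CD14"]
```

In this toy input, `GENE_X` is higher in the CD8 samples than in monocytes but varies widely between samples, so the stability filter rejects it. The validation of steps F and G is not covered by this sketch.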

39. The method of claim 38, wherein the minimal gene expression signature consists of expression profile of 2 to 125 genes selected from the group consisting of ALOX15, ACAP1, ANK3, ANKRD55, ANXA3, APOC1, ARRB1, BACE2, BLK, C17orf96, C1orf54, CCL14, CCL13, CCL15, CCL17, CCL18, CCL19, CCL23, CCR2, CCR7, CCR8, CD14, CD15/FUT4, CD1A, CD1B, CD1E, CD33, CD34, CD36, CD45, CD66b/CEACAM8, CD86, CD8A, CD8B, CLCN4, CMTM2, CTSW, CXCL10, CXCL11, CXCL9, CXCR1, CXorf57, CYP27B1, CYP4F3, EBF1, EGR2, EPHA1, ETV3, FABP4, FANK1, FCER2, FCRL2, FCRLA, FLJ13197, FLVCR2, FOXP3, FPR1, FUT4, FZD2, GAL3ST4, GALR1, GPR97, HESX1, HLA-DQA1, HRH1, HS3ST2, HSD11B1, IFI27, IL15RA, IL1R2, IL7R, ITGAM, KCNJ15, KIT, KYNU, LRRC32, MAOA, MARCO, M-CSFR/CSF1R, MEF2C, MGAM, MME, MMP12, MMP9, MRC1, MS4A6A, MSC, NID1, NLRP3, NPL, NRG1, OLIG1, PALLD, PDL1, PI3, PID1, PLA1A, PNOC, PPP1R14A, PROK2, PSMA2, PTGDR, QPCT, RENBP, RGPD1, RTKN2, S100A9, S1PR3, SERGEF, SH2D1B, SLC31A2, SLC38A6, SLC47A1, TIE2/TEK, TIM3, TNFRSF10B, TSHZ2, VCAN, VILL, VISTA, VNN3, WNT5A, WNT7A, ZNF204P and ZNF324.

40. A method for identifying immune cell type/subtype infiltrate preferentially associated with a type and/or subtype of tumor among a collection of tumors comprising:

A. isolating the tumor from the subject;

B. determining gene expression for a set of genes that permit discriminating different immune cell types and/or subtypes as infiltrates in the tumor from the subject;

C. obtaining a minimal gene expression signature by the method of claim 38 and applying minimal gene expression signature associated with specific immune cell types and/or subtypes so as to obtain an immune score associated with each specific immune cell type and/or subtype;

D. repeating steps (a) to (c) for other tumors and/or subjects; and

E. comparing immune scores so obtained for each immune cell type and/or subtype for the collection of tumors so as to obtain rank order of tumors based on the immune scores for each immune cell type and/or subtype;

F. stratifying the rank ordered tumors based on immune scores for each immune cell type and/or subtype of step (e);

G. determining percentage or fraction of a tumor type and/or subtype within each stratified group in step (f);

H. repeating steps (e) to (g) for each immune cell type and/or subtype;

I. identifying tumor type and/or subtype overrepresented in one or more stratified group at the highest end of the immune score for each immune cell type, so as to identify immune cell type infiltrate preferentially associated with a type or subtype of tumor among a collection of tumors,

wherein the set of genes in (b) consists of or comprises a combination of the genes as provided in Table 15, and

wherein the immune cell types and/or subtypes consist of or comprise B-cell, CD4+ T-cell, CD8+ T-cell, Treg cell, monocyte, macrophage, natural killer (NK) cell, neutrophil, myeloid-derived suppressor cell (MDSC), dendritic cell, macrophage M1 and M2 sub-types, granulocytic myeloid-derived suppressor cell (G-MDSC) subtype and monocytic myeloid-derived suppressor cell (M-MDSC) subtype or a combination thereof,

thereby, identifying immune cell type/subtype infiltrate preferentially associated with a type and/or subtype of tumor among a collection of tumors.
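Steps E to I of claim 40 can be sketched for a single immune cell type as follows, assuming each tumor already carries an immune score from step C. The equal-sized strata, the enrichment ratio used to express overrepresentation, the function name, and the scores are assumptions made for illustration; the tumor-type labels are used only as example labels.

```python
from collections import Counter
from typing import Dict, List, Tuple


def top_stratum_enrichment(tumors: List[Tuple[str, float]],
                           n_strata: int = 4) -> Dict[str, float]:
    """For each tumor type, return its fraction in the top-scoring stratum
    divided by its fraction in the whole collection (values above 1 suggest
    overrepresentation at the high end of the immune score)."""
    # Step E: rank tumors from highest to lowest immune score.
    ranked = sorted(tumors, key=lambda t: t[1], reverse=True)
    # Step F: stratify; only the top stratum is examined in this sketch.
    size = max(1, len(ranked) // n_strata)
    top = ranked[:size]
    # Step G: fraction of each tumor type overall and within the top stratum.
    overall = Counter(tumor_type for tumor_type, _ in ranked)
    in_top = Counter(tumor_type for tumor_type, _ in top)
    return {
        tumor_type: (in_top[tumor_type] / len(top)) / (overall[tumor_type] / len(ranked))
        for tumor_type in overall
    }


# Hypothetical B-cell immune scores, for illustration only.
tumors = [
    ("DLBCL", 0.9), ("DLBCL", 0.8), ("UVM", 0.4), ("UVM", 0.3),
    ("KIRC", 0.5), ("KIRC", 0.2), ("SARC", 0.1), ("SARC", 0.6),
]
enrichment = top_stratum_enrichment(tumors)
# enrichment["DLBCL"] is 4.0; the other tumor types score 0.0
```

Step H would repeat this calculation for each immune cell type and/or subtype. Examining the lowest stratum in the same way gives the deficiency analysis of the companion claim.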

41. A method for identifying immune cell type infiltrate absent or deficient from a type and/or subtype of tumor among a collection of tumors comprising:

A. isolating the tumor from the subject;

C. obtaining a minimal gene expression signature by the method of claim 38 and

D. repeating steps (a) to (c) for other tumors and/or subjects; and

E. comparing immune scores so obtained for each immune cell type and/or subtype for the collection of tumors so as to obtain rank order of tumors based on the immune scores for each immune cell type and/or subtype;

stratified group in step (f);

H. repeating steps (e) to (g) for each immune cell type and/or subtype;

I. identifying tumor type and/or subtype underrepresented in one or more stratified group at the highest end of the immune score and/or overrepresented in one or more stratified group at the lowest end of the immune score for each immune cell type and/or subtype, so as to identify immune cell type/subtype infiltrate absent or deficient from a type and/or subtype of tumor among a collection of tumors,

wherein the set of genes in (b) consists of or comprises a combination of the genes as provided in Table 15, and

wherein the immune cell types and/or subtypes consist of or comprise B-cell, CD4+ T-cell, CD8+ T-cell, Treg cell, monocyte, macrophage, natural killer (NK) cell, neutrophil, myeloid-derived suppressor cell (MDSC), dendritic cell, macrophage M1 and M2 sub-types, granulocytic myeloid-derived suppressor cell (G-MDSC) subtype and monocytic myeloid-derived suppressor cell (M-MDSC) subtype or a combination thereof,

thereby, identifying immune cell type infiltrate absent or deficient from a type and/or subtype of tumor among a collection of tumors.

42. A method for identifying characteristic immune cell type/subtype infiltrates for a type and/or subtype of tumor among a collection of tumors comprising:

A. identifying none, one or more immune cell type infiltrate preferentially associated with a type and/or subtype of tumor among a collection of tumors by the method of claim 71; and

B. identifying none, one or more immune cell type infiltrate absent or deficient from a type or subtype of tumor among a collection of tumors by the method of claim 72, so as to identify characteristic immune cell type infiltrates for a type and/or subtype of tumor among a collection of tumors, thereby, identifying characteristic immune cell type infiltrates for a type and/or subtype of tumor among a collection of tumors.

43. The method of claim 40, 41 or 42, wherein the type and/or subtype of tumor enriched in B-cell infiltration is selected from the group consisting of diffuse large B-cell lymphoma (DLBCL), kidney renal clear cell carcinoma (KIRC), sarcoma (SARC), skin cutaneous melanoma (SKCM) and uveal melanoma (UVM).

44. A method for identifying a cancer patient most likely to be responsive to immune checkpoint inhibitor therapy comprising:

A. obtaining a tumor sample from the cancer patient;

B. determining gene expression for a set of genes of the isolated tumor sample;

C. applying minimal gene expression signature associated with CD8+ T-cell so as to determine a threshold presence of CD8+ T-cell;

D. determining functional state of the CD8+ T-cell by analyzing one or more marker associated with anergy and exhaustion of CD8+ T-cell, wherein the marker is selected from the group consisting of CTLA-4, LAG3 and TIM3 or a combination thereof; and

E. finding presence or upregulation of CTLA-4, LAG3 and/or TIM3 being indicative of anergic and exhausted CD8+ T-cell and a tumor infiltrated by dysfunctional CD8+ T-cell which is responsive to immune checkpoint blockade,

thereby, identifying the cancer patient most likely to be responsive to immune checkpoint inhibitor therapy.
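The decision logic of steps C to E can be sketched as a two-stage test, assuming a CD8+ T-cell immune score and per-marker expression values have already been computed. The threshold values, the function name, and the sample values are hypothetical; the claim does not specify how the thresholds are set.

```python
from typing import Dict

# Markers of anergy and exhaustion recited in step D.
EXHAUSTION_MARKERS = ("CTLA-4", "LAG3", "TIM3")


def likely_responder(cd8_score: float,
                     marker_expr: Dict[str, float],
                     cd8_threshold: float,
                     marker_threshold: float) -> bool:
    """Return True when CD8+ T-cells are present above a threshold (step C)
    and at least one exhaustion marker is present or upregulated (steps D, E)."""
    if cd8_score < cd8_threshold:
        return False
    return any(marker_expr.get(marker, 0.0) >= marker_threshold
               for marker in EXHAUSTION_MARKERS)


# Hypothetical values, for illustration only.
infiltrated_exhausted = likely_responder(
    0.8, {"CTLA-4": 5.2, "LAG3": 0.4, "TIM3": 0.1},
    cd8_threshold=0.5, marker_threshold=2.0)  # True
poorly_infiltrated = likely_responder(
    0.2, {"CTLA-4": 5.2}, cd8_threshold=0.5, marker_threshold=2.0)  # False
```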

45. The method of claim 44, wherein the set of genes consists of or comprises the genes or combination of genes as provided in Table 15.

46. The method of claim 44, wherein the immune checkpoint therapy comprises use of anti-cytotoxic T lymphocyte antigen 4 (CTLA-4) antibody, anti-programmed death 1 (PD-1) monoclonal antibody, anti-CD137 antibody, anti-IDO-1 antibody, an antibody against PD-1, an antibody against PDL1, an antibody against PDL2, an antibody against B7-H3, an antibody against B7-H4, an antibody against LAG3, an antibody against KIR, an antibody against TIM3, an antibody against TIGIT, an antibody against BTLA, an antibody against CD160, an antibody against A2aR, and/or an antibody against a VISTA protein(s).

47. A method of identifying immunogenic features of a tumor microenvironment which comprises:

A. obtaining a tumor tissue sample from a subject;

B. determining gene expression of the isolated tumor tissue so as to obtain gene expression data;

C. deconvolving gene expression data of (b) by applying gene expression signatures associated with specific immune cell types and/or subtypes, so as to obtain immune scores for the immune cell types and/or subtypes with gene expression signatures used in deconvolving gene expression data;

D. optionally, determining one or more functional marker of immune cells so as to assess functional status of immune cell infiltrate; and

E. comparing the immune score for each specific immune cell type and/or subtype with the immune score for other immune cell types and/or subtypes, and optionally, functional status of immune cells, so as to identify specific immune cell types and/or subtypes as immune infiltrates enriched or deficient in the tumor tissue, and optionally, functional status of the specific immune cell types and/or subtypes of immune cell infiltrate;

thereby, identifying immunogenic features of the tumor microenvironment.
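Steps C and E can be sketched under a strong simplifying assumption: the immune score of a cell type is taken here as the mean expression of its signature genes in the tumor sample. This averaging rule is a stand-in chosen for illustration and is not stated in the claim; the gene-to-cell-type assignments, the function name, and the expression values are likewise hypothetical.

```python
from statistics import mean
from typing import Dict, List


def immune_scores(expression: Dict[str, float],
                  signatures: Dict[str, List[str]]) -> Dict[str, float]:
    """Score each immune cell type as the mean expression of its signature
    genes; genes missing from the sample count as 0.0."""
    return {
        cell_type: mean(expression.get(gene, 0.0) for gene in genes)
        for cell_type, genes in signatures.items()
    }


# Hypothetical signatures and tumor expression values, for illustration only.
signatures = {"CD8+ T-cell": ["CD8A", "CD8B"], "monocyte": ["CD14", "CD33"]}
tumor_expression = {"CD8A": 6.0, "CD8B": 4.0, "CD14": 1.0, "CD33": 2.0}

scores = immune_scores(tumor_expression, signatures)
# scores == {"CD8+ T-cell": 5.0, "monocyte": 1.5}

# Step E: compare scores across cell types to find the most enriched infiltrate.
most_enriched = max(scores, key=scores.get)  # "CD8+ T-cell"
```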

48. The method of claim 47, wherein the gene expression is determined from RNA transcripts isolated from the sample.

49. A method for determining tumor grade based on immunogenic features of a tumor microenvironment comprising:

A. determining the immunogenic features of a tumor microenvironment by the method of claim 47;

B. comparing the immunogenic features so determined in step (a) to a reference comprising immunogenic features determined for different tumor grades for same type and/or subtype of tumor; and

C. finding the immunogenic features with the closest match so as to be able to determine tumor grade;

thereby, determining tumor grade based on immunogenic features of a tumor microenvironment.
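The closest-match step C can be sketched as a nearest-reference lookup, assuming the immunogenic features of step A are expressed as a vector of immune scores per cell type. Euclidean distance is an assumption made for illustration, since the claim does not name a similarity measure; the function name, grade labels, and score values are hypothetical.

```python
import math
from typing import Dict


def closest_reference(sample: Dict[str, float],
                      references: Dict[str, Dict[str, float]]) -> str:
    """Return the label of the reference profile nearest to the sample."""

    def distance(a: Dict[str, float], b: Dict[str, float]) -> float:
        # Features missing from either profile are treated as 0.0.
        keys = set(a) | set(b)
        return math.sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))

    return min(references, key=lambda label: distance(sample, references[label]))


# Hypothetical reference profiles per tumor grade, for illustration only.
references = {
    "grade I": {"CD8+ T-cell": 0.8, "Treg cell": 0.1},
    "grade III": {"CD8+ T-cell": 0.2, "Treg cell": 0.7},
}
sample = {"CD8+ T-cell": 0.3, "Treg cell": 0.6}
predicted_grade = closest_reference(sample, references)  # "grade III"
```

The same lookup applies when the references are stratified by another outcome, such as survival or drug response, instead of tumor grade.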

50. A method for predicting likelihood of survival of a subject with cancer based on immunogenic features of a tumor microenvironment comprising:

B. comparing the immunogenic features so determined in step (a) to a reference comprising immunogenic features for same type and/or subtype of tumor stratified by percent survival or likelihood of survival, or alternatively, a reference comprising immunogenic features for same type and/or subtype of tumor classified as being associated with live patients due to remission or stable disease or dead patients due to succumbing to cancer; and

C. finding the immunogenic features with the closest match so as to be able to predict likelihood of survival of a subject with cancer;

thereby, predicting likelihood of survival of a subject with cancer based on immunogenic features of a tumor microenvironment.

51. A method for predicting response to one or more cancer drug or a combination of cancer drugs in a subject based on immunogenic features of a tumor microenvironment comprising:

B. comparing the immunogenic features so determined in step (a) to a reference comprising immunogenic features for same type and/or subtype of tumor stratified by percent response to one or more cancer drug or a combination of cancer drugs; and

C. finding the immunogenic features with the closest match so as to be able to predict response to one or more cancer drug or a combination of cancer drugs;

thereby, predicting response to one or more cancer drug or a combination of cancer drugs in a subject based on immunogenic features of a tumor microenvironment.

52. The method of claim 51, wherein the cancer drug is selected from the group consisting of ABVD (doxorubicin/bleomycin/vinblastine/dacarbazine combination), AC (Adriamycin/cyclophosphamide combination), ACE (Adriamycin/cyclophosphamide/etoposide combination), doxorubicin (Adriamycin), vinblastine, dacarbazine (DTIC), etoposide (Eposin, Etopophos or Vepesid), abiraterone (Zytiga), nab-paclitaxel (Abraxane), Abstral, actinomycin D, Dactinomycin (Cosmegen), Actiq, Afatinib (Giotrif), everolimus (Afinitor), aflibercept (Zaltrap), imiquimod cream (Aldara), aldesleukin (IL-2, Proleukin or interleukin 2), alemtuzumab (MabCampath), melphalan (Alkeran), amsacrine (amsidine, m-AMSA), anastrozole (Arimidex), cytarabine (Ara C, cytosine arabinoside), disodium pamidronate (Aredia), exemestane (Aromasin), arsenic trioxide (Trisenox, ATO), asparaginase (Crisantaspase, Erwinase), axitinib (Inlyta), azacitidine (Vidaza), BEACOPP (bleomycin/etoposide/doxorubicin/cyclophosphamide/vincristine/procarbazine/prednisolone combination), BEAM (carmustine (BiCNU)/etoposide/cytarabine (Ara-C, cytosine arabinoside)/melphalan combination), procarbazine, prednisolone, bendamustine (Levact), bevacizumab (Avastin), bexarotene (Targretin), bicalutamide (Casodex), bleomycin, BEP (bleomycin/etoposide/platinum (cisplatin) combination), bortezomib (Velcade), bosutinib (Bosulif), brentuximab (Adcetris), ibuprofen (Brufen, Nurofen), buserelin (Suprefact), busulfan (Myleran, Busilvex), CAPOX (CAPE-OX, XELOX; oxaliplatin and capecitabine combination), CAV (cyclophosphamide/doxorubicin (Adriamycin)/vincristine combination), CAVE (cyclophosphamide/doxorubicin (Adriamycin)/vincristine/etoposide combination), lomustine (CCNU), CHOP (cyclophosphamide/doxorubicin hydrochloride (Adriamycin)/vincristine (Oncovin)/prednisolone combination), CMF (cyclophosphamide/methotrexate/fluorouracil (5FU) combination), CMV (cisplatin/methotrexate/vinblastine combination), CTD (cyclophosphamide/thalidomide/dexamethasone combination), CVP (cyclophosphamide/vincristine (Oncovin)/prednisolone combination), cabazitaxel (Jevtana), cabozantinib (Cometriq, Cabometyx), liposomal doxorubicin (Caelyx, Myocet, Doxil), paracetamol (Panadol, Anadin, Calpol), irinotecan (Campto), capecitabine (Xeloda), vandetanib (Caprelsa), Carbo MV (carboplatin/methotrexate/vinblastine combination), PC (CarboTaxol; paclitaxel and carboplatin combination), carboplatin, carboplatin and etoposide combination, carmustine (BCNU, Gliadel), celecoxib (Celebrex), ceritinib (Zykadia), daunorubicin (Cerubidin), cetuximab (Erbitux), ChlVPP (chlorambucil/vinblastine/procarbazine/prednisolone combination), chlorambucil (Leukeran), cisplatin, cisplatin and teysuno combination, CX (cisplatin and capecitabine (Xeloda) combination), PEI (cisplatin, etoposide and ifosfamide combination), cisplatin/fluorouracil (5-FU)/trastuzumab combination, cladribine (Leustat, LITAK), sodium clodronate (Bonefos, Clasteon), clofarabine (Evoltra), co-codamol (Kapake, Solpadol, Tylex), cabozantinib (Cometriq, Cabometyx), Dactinomycin (actinomycin D, Cosmegen), crizotinib (Xalkori), cyclophosphamide, cyproterone acetate (Cyprostat), DHAP (dexamethasone/high dose cytarabine/cisplatin combination), dacarbazine (DTIC), dabrafenib (Tafinlar), decitabine (Dacogen), dasatinib (Sprycel), de Gramont (fluorouracil (5FU)/folinic acid combination), triptorelin (Decapeptyl SR, Gonapeptyl Depot), degarelix (Firmagon), denosumab (Prolia, Xgeva), dexamethasone, prednisolone, methylprednisolone, diamorphine, docetaxel (Taxotere), TPF (docetaxel (Taxotere)/cisplatin/fluorouracil combination), Doxifos (dox-ifos; doxorubicin and ifosfamide combination), flutamide (Drogenil, Eulexin), fentanyl (Durogesic, Effentora, Instanyl), E-CMF (Epi-CMF; epirubicin/cyclophosphamide/methotrexate/fluorouracil combination), EC (epirubicin and cyclophosphamide combination), ECF (epirubicin, cisplatin and fluorouracil (5FU) combination), EOF (epirubicin, oxaliplatin and fluorouracil (5FU) combination), EOX (epirubicin, oxaliplatin and capecitabine combination), EP (etoposide and cisplatin combination), ESHAP (etoposide, methylprednisolone, cytarabine and cisplatin combination), fluorouracil (5FU; Efudix), vindesine (Eldisine), oxaliplatin (Eloxatin), enzalutamide (Xtandi), epirubicin (Pharmorubicin), ECarboX (epirubicin (Pharmorubicin), carboplatin (Paraplatin) and capecitabine (Xeloda) combination), ECX (epirubicin (Pharmorubicin)/cisplatin/capecitabine (Xeloda) combination), etoposide (Eposin, Etopophos, Vepesid), cetuximab (Erbitux), eribulin (Halaven), erlotinib (Tarceva), estramustine (Estracyt), ELF (etoposide/leucovorin (folinic acid, FA, calcium folinate)/fluorouracil (5FU) combination), everolimus (Afinitor), clofarabine (Evoltra), exemestane (Aromasin), FAD (fludarabine/doxorubicin (Adriamycin)/dexamethasone combination), FC (fludarabine (Fludara)/cyclophosphamide combination), FCR (fludarabine, cyclophosphamide and rituximab combination), FEC (fluorouracil (5FU)/epirubicin/cyclophosphamide combination), FEC-T (fluorouracil (5FU)/epirubicin/cyclophosphamide/docetaxel (Taxotere) combination), FMD (fludarabine (Fludara)/mitoxantrone (Onkotrone)/dexamethasone combination), FOLFIRINOX (folinic acid (leucovorin, calcium folinate, FA)/fluorouracil (5FU)/irinotecan/oxaliplatin combination), fulvestrant (Faslodex), letrozole (Femara), degarelix (Firmagon), fludarabine (Fludara), fluorouracil (5FU), FOLFIRI (folinic acid, fluorouracil and irinotecan combination), FOLFOX (Folinic acid, fluorouracil and oxaliplatin combination), fulvestrant (Faslodex), granulocyte colony stimulating factor (G-CSF), lenograstim (Granocyte), filgrastim (Neupogen, Zarzio, Nivestim, Ratiograstim), long acting (pegylated) filgrastim (pegfilgrastim, Neulasta), long acting (pegylated) lipegfilgrastim (Longquex), gefitinib (Iressa), GemCarbo (gemcitabine and carboplatin combination), GemTaxol (Gemcitabine (Gemzar) and paclitaxel (Taxol) combination), gemcitabine (Gemzar), GemCap (gemcitabine and capecitabine combination), GC (gemcitabine and cisplatin combination), imatinib (Glivec), triptorelin (Decapeptyl SR, Gonapeptyl Depot), goserelin (Zoladex, Novgos), eribulin (Halaven), trastuzumab (Herceptin), topotecan (Hycamtin, Potactasol), hydroxycarbamide (Hydrea), hydroxyurea, I-DEX (Z-DEX; idarubicin (Zavedos) and dexamethasone combination), ICE (ifosfamide, carboplatin and etoposide (Vepesid, Etopophos, Eposin) combination), aldesleukin (IL-2, Proleukin or interleukin 2), IPE (VIP; PEI; cisplatin, etoposide and ifosfamide combination), ibandronic acid (Bondronat), ibritumomab (Zevalin), ponatinib (Iclusig), idarubicin (Zavedos), idelalisib (Zydelig), ifosfamide (Mitoxana), pomalidomide (Imnovid), interferon (Intron A), ipilimumab (Yervoy), XELIRI (irinotecan and capecitabine combination), vinflunine (Javlor), trastuzumab emtansine (Kadcyla), pembrolizumab (Keytruda), tioguanine (thioguanine, 6-TG, 6-tioguanine; Lanvis), lapatinib (Tyverb), lenalidomide (Revlimid), letrozole (Femara), leuprorelin (Prostap, Lutrate), olaparib (Lynparza), mitotane (Lysodren), MIC (mitomycin, ifosfamide and cisplatin combination), MM (mitoxantrone (Mitozantrone, Onkotrone) and methotrexate (Maxtrex) combination), MMM (mitoxantrone (Mitozantrone), mitomycin C and methotrexate combination), morphine (Morphgesic SR, MXL, Zomorph, MST, MST Continus, Sevredol, Oramorph), MVAC (methotrexate, vinblastine, doxorubicin (Adriamycin) and cisplatin combination), MVP (mitomycin, vinblastine and cisplatin combination), rituximab (Mabthera), methotrexate (Maxtrex), medroxyprogesterone acetate (Provera), megestrol acetate (Megace), MPT (melphalan, prednisolone and thalidomide combination), mifamurtide (Mepact), mitomycin C (Mitomycin-C Kyowa), mitoxantrone (Mitozantrone, Onkotrone), vinorelbine (Navelbine), nelarabine (Atriance), sorafenib (Nexavar), nilotinib (Tasigna), nintedanib (Vargatef), pentostatin (Nipent), nivolumab (Opdivo), ofatumumab (Arzerra), olaparib (Lynparza), vincristine (Oncovin), oxaliplatin (Eloxatin), XELOX (oxaliplatin (Eloxatin) and capecitabine (Xeloda) combination), PAD (bortezomib (Velcade), doxorubicin (Adriamycin) and dexamethasone combination), PCV (procarbazine, lomustine (CCNU) and vincristine combination), EP (etoposide (Vepesid, Eposin, Etopophos) and cisplatin combination), PMitCEBO (prednisolone, mitoxantrone, cyclophosphamide, etoposide, bleomycin and vincristine (Oncovin) combination), POMB/ACE (cisplatin, vincristine (Oncovin), methotrexate, bleomycin, Actinomycin (Dactinomycin), cyclophosphamide and etoposide (Eposin, Etopophos, Vepesid) combination), paclitaxel (Taxol), panitumumab (Vectibix), pazopanib (Votrient), pemetrexed (Alimta), pemetrexed (Alimta) and carboplatin combination, pemetrexed (Alimta) and cisplatin combination, pertuzumab (Perjeta), pixantrone (Pixuvri), mercaptopurine (Purinethol; Xaluprine), R-CHOP (rituximab (Mabthera), cyclophosphamide, doxorubicin hydrochloride, vincristine (Oncovin) and prednisolone combination), R-CVP (rituximab (Mabthera), cyclophosphamide, vincristine (Oncovin) and prednisolone combination), R-DHAP (rituximab (Mabthera), dexamethasone, cytarabine and cisplatin combination), R-ESHAP (rituximab (Mabthera), etoposide, methylprednisolone, cytarabine and cisplatin combination), R-GCVP (rituximab (Mabthera), gemcitabine, cyclophosphamide, vincristine and prednisolone combination), RICE (rituximab (Mabthera), ifosfamide, carboplatin and etoposide combination), raloxifene, raltitrexed (Tomudex), regorafenib (Stivarga), Stanford V (doxorubicin, vinblastine, mechlorethamine (mustine or nitrogen mustard), vincristine, bleomycin, etoposide and prednisone (or prednisolone) combination), streptozocin (Zanosar), sunitinib (Sutent), TAC (docetaxel (Taxotere), doxorubicin (Adriamycin) and cyclophosphamide combination), TIP (paclitaxel, ifosfamide and cisplatin combination), tamoxifen, TC (docetaxel (Taxotere) and cyclophosphamide combination), temozolomide (Temodal), temsirolimus (Torisel), thiotepa (Tepadina), trabectedin (Yondelis), treosulfan, tretinoin (Vesanoid, ATRA), VIDE (vincristine, ifosfamide, doxorubicin and etoposide combination), VeIP (vinblastine, ifosfamide and cisplatin combination), vinblastine (Velbe), vemurafenib (Zelboraf), vincristine (Oncovin), VAC (vincristine, actinomycin D (dactinomycin) and cyclophosphamide combination), VAI (vincristine, actinomycin and ifosfamide combination), VAD (vincristine, Adriamycin (doxorubicin) and dexamethasone combination), vismodegib (Erivedge), ZELOX (oxaliplatin (Eloxatin) and capecitabine (Xeloda) combination) and zoledronic acid (Zometa) and combination thereof.

53. A method for assessing prognosis of a subject afflicted with a tumor or cancer and predicting response to a cancer drug by the subject comprising:

A. identifying a subject afflicted by a particular type or subtype of tumor;

B. obtaining a tumor sample from the subject;

C. identifying immunogenic features of a tumor microenvironment in a tumor sample from the subject;

D. comparing the immunogenic features so obtained with the immunogenic features associated with a good or favourable tumor or cancer prognosis and/or associated with a bad or unfavourable tumor or cancer prognosis, so as to assess prognosis of a subject afflicted with a tumor or cancer; and

E. comparing the immunogenic features so obtained with the immunogenic features associated with a good or favourable response to a cancer drug and/or associated with a bad or unfavourable response to a cancer drug, so as to predict response to a cancer drug by the subject;

wherein the immunogenic features associated with a good or favourable tumor or cancer prognosis and/or associated with a bad or unfavourable tumor or cancer prognosis are determined on one or more group of subjects with good or favourable outcome to tumor or cancer and/or one or more group of subjects with bad or unfavourable outcome to tumor or cancer, respectively; and,

wherein the immunogenic features associated with a good or favourable response to a cancer drug and/or associated with a bad or unfavourable response to a cancer drug are determined on one or more group of subjects with good or favourable response to a cancer drug and/or one or more group of subjects with bad or unfavourable response to a cancer drug, respectively;

thereby, assessing prognosis of a subject afflicted with a tumor or cancer and predicting response to a cancer drug by the subject.

54. The method of claim 53, wherein the tumor is a tumor or cancer of the brain, head, eye, bladder, neck, mouth, nose, throat, thymus, lymph node, blood, lung, esophagus, trachea, stomach, intestine, colon, rectum, pancreas, liver, kidney, bone, skin, breast, arm, hand, chest, abdomen, leg, foot, genital, testes, ovary, uterus, cervix, urethra and prostate.

55. The method of claim 50, 51 or 53, where the subject is a mammal.

56. The method of claim 55, wherein the mammal is selected from the group consisting of human, mouse, rat, monkey, chimpanzee, cow, pig, horse, rabbit, cow, mink, guinea pig and hamster.

57. The method of claim 53, wherein identifying immunogenic features of a tumor microenvironment comprises:

A. analyzing gene expression data sets of pure immune cells and selecting genes that satisfy three criteria so as to establish a gene signature for a particular immune cell type and/or subtype:

B. stable expression of the gene in a given immune cell type or subtypes within the particular immune cell type;

C. significantly higher level of expression of the gene in the immune cell type or subtypes within that particular cell type of interest than in other immune cells;

D. converting cell type- and subtype-specific gene expression signatures to an immune score, which can be used to stratify tissue samples as a quantitative measure of immunogenic features;

E. generating a range of scores to distinguish tumors as containing no infiltration of immune cell type or subtype as follows;

F. Low infiltration of immune cell-type or subtype;

G. Medium infiltration of immune cell-type or subtype and

H. High infiltration of immune cell-type or subtype.
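Steps E to H amount to binning an immune score into four infiltration levels. A minimal sketch follows, assuming three cutoffs on a normalized score; the cutoff values, the function name, and the convention that a score equal to a cutoff falls into the higher level are hypothetical, since the claim does not give numeric ranges.

```python
from bisect import bisect_right
from typing import Sequence

# Levels recited in steps E to H, from lowest to highest.
LEVELS = ("none", "low", "medium", "high")


def infiltration_level(score: float,
                       cutoffs: Sequence[float] = (0.1, 0.4, 0.7)) -> str:
    """Map an immune score to an infiltration level.

    `cutoffs` must be sorted in ascending order and contain exactly three
    values, one boundary between each pair of adjacent levels.
    """
    return LEVELS[bisect_right(cutoffs, score)]


# Hypothetical scores, for illustration only.
levels = [infiltration_level(s) for s in (0.05, 0.1, 0.5, 0.9)]
# levels == ["none", "low", "medium", "high"]
```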

58. The method of claim 53, wherein the immune signature-derived scores are applied on human tumor gene expression data to predict prognosis comprising:

A. generating gene expression signatures by the method of claim 38 and applying the gene expression signatures on tumors to separate them into clusters or groups;

B. characterizing immune cell infiltration profile for each cluster based on their immune scores;

C. selecting alive and dead individuals of each cluster and identifying signatures associated with prognosis;

D. assessing prognosis based on immune infiltrate composition of the tumor matching closely with either alive or dead so that signatures associated with good or poor prognosis are closely linked to the immune infiltrate composition of the tumor.

59. The method of claim 58, wherein the signatures associated with good and bad prognosis are closely linked to the function of infiltrating immune cells.

60. A biomarker consisting of or comprising a MGES obtained by the method of claim 38 and anergic and exhausted CD8+ T cells, wherein said biomarker is indicative of the therapeutic efficacy of a checkpoint inhibitor drug or a cancer drug that stimulates an immune response.

61. A kit for determining the efficacy of a cancer therapy, comprising one or more biomarkers of claim 60, and written instructions for use of the kit for determining the efficacy of a cancer therapy.

62. The kit of claim 61, wherein one or more biomarkers is detected