NZ714059B2

NZ714059B2 - Predicting immunogenicity of T cell epitopes

Info

Publication number: NZ714059B2
Application number: NZ714059A
Authority: NZ
Inventors: Sebastian Boegel; John Christopher Castle; Martin Lower; Ugur Sahin; Arbel David Tadmor
Original assignee: Biontech Rna Pharmaceuticals Gmbh; Tron Translationale Onkologie An Der Universitätsmedizin Der Johannes Gutenberg Universität Mainz Gemeinnützige Gmbh
Priority date: 2013-05-10
Filing date: 2014-05-07
Publication date: 2021-11-30

Abstract

The present invention relates to methods for predicting T cell epitopes. In particular, the present invention relates to methods for predicting whether modifications in peptides or polypeptides such as tumor-associated neoantigens are immunogenic or not. The methods of the invention are useful, in particular, for the provision of vaccines which are specific for a patient's tumor and thus, in the context of personalized cancer vaccines. In a particular embodiment the invention is a method of making a vaccine by predicting immunogenicity of modified peptides using three scores/criteria related to (1) the binding of non-modified peptides to MHC molecules (Mwt), (2) the binding of modified peptides to MHC molecules (Mmut) and (3) a T score based on the chemical and physical similarities between the non-modified and modified amino acids as an indicator for TCR binding.

Description

(12) Granted patent specification (19) NZ (11) 714059 (13) B2
(47) Publication date: 2021.12.24
(54) PREDICTING IMMUNOGENICITY OF T CELL EPITOPES
(51) International Patent Classification(s): A61K 39/00; G06F 19/18
(22) Filing date: 2014.05.07
(23) Complete specification filing date: 2014.05.07
(30) International Priority Data: EP 2013/001400, 2013.05.10
(73) Owner(s): TRON - Translationale Onkologie an der Universitätsmedizin der Johannes Gutenberg-Universität Mainz gemeinnützige GmbH; BioNTech RNA Pharmaceuticals GmbH
(74) Contact: FB Rice Pty Ltd
(72) Inventor(s): SAHIN, Ugur; LÖWER, Martin; TADMOR, Arbel David; CASTLE, John Christopher; BOEGEL, Sebastian
(86) International Application No.:
(87) International Publication number: WO/2014/180569

PREDICTING IMMUNOGENICITY OF T CELL EPITOPES

TECHNICAL FIELD OF THE INVENTION

The present invention relates to methods for predicting T cell epitopes. In particular, the present invention relates to methods for predicting whether modifications in peptides or polypeptides such as tumor-associated neoantigens are immunogenic or not. The methods of the invention are useful, in particular, for the provision of vaccines which are specific for a patient's tumor and, thus, in the context of personalized cancer vaccines.

BACKGROUND OF THE INVENTION

Personalized cancer vaccines are therapeutic vaccines custom tailored to target tumor-specific mutations that are unique to a given patient. Such a treatment offers great hope for cancer patients as it does not harm healthy cells and has the potential to provide life-long remission. Yet not every mutation expressed by the tumor can be used as a target for a vaccine. In fact, most cancer somatic mutations will not lead to an immune response when vaccinated against (J. C. Castle et al., Exploiting the mutanome for tumor vaccination. Cancer Research 72, 1081 (2012)). Since tumors can encode as many as 100,000 somatic mutations (M. R. Stratton, Science Signalling 331, 1553 (2011)) whereas vaccines target only a handful of epitopes, it is evident that a critical goal of cancer immunotherapy is to identify which mutations are likely to be immunogenic.

From a biological perspective, in order for a somatic mutation to generate an immune response several criteria need to be satisfied: the allele containing the mutation should be expressed by the cell, the mutation should be in a protein coding region and nonsynonymous, the translated protein should be cleaved by the proteasome and an epitope containing the mutation should be presented by the MHC complex, the presented epitope should be recognized by a T cell receptor (TCR) and, finally, the TCR-pMHC complex should launch a signaling cascade that activates the T cell (S. Whelan, N. Goldman, Molecular biology and evolution 18, 691 (2001)). Thus far no algorithm has been put forth that is capable of predicting with a high degree of certainty which mutations are likely to fulfill all these criteria. In the present report we consider several factors that may contribute to immunogenicity, compare these factors against experimental data and propose a simple model for identifying immunogenic mutations.

MHC binding prediction: State of the art

Over 20 years ago, it was established that there are positions in an MHC binding peptide which contribute more to the binding capability than others (e.g., A. Sette et al., Proceedings of the National Academy of Sciences 86, 3296 (1989)). The identification and description of those anchor positions enabled finding patterns of MHC binding peptides and thus were the basis for developing prediction methods. In recent years significant developments in the field of in silico models of the Antigen Processing Machinery were achieved. The two pioneering approaches, which were developed in the late 1990's, BIMAS (K. C. Parker, M. A. Bednarek, J. E. Coligan, The Journal of Immunology 152, 163 (1994)) and SYFPEITHI (H.-G. Rammensee, J. Bachmann, N. P. N. Emmerich, O. A. Bachor, S. Stevanović, Immunogenetics 50, 213 (1999)), were based on the knowledge of anchor positions and on the derived allele-specific motifs. As more and more experimental MHC peptide-binding data became available, more tools have been developed using a wide variety of statistical and computational techniques (see Fig. 1 for an overview). The so-called matrix-based methods use position-specific scoring matrices to determine if a peptide sequence matches the binding motif of a particular MHC allele. Another class of MHC binding prediction methods uses machine learning techniques, such as Artificial Neural Networks or Support Vector Machines (see Fig. 1). The performance of these algorithms strongly depends on the quantity and quality of the available training dataset for each allele model (e.g. "HLA-A*02:01", "H2-Db" etc.) to "learn" underlying patterns/features which have prediction capability for binding. Recently, structure-based approaches are emerging, which circumvent the bottleneck of having a large training set, as they solely rely on peptide-MHC crystal structures and scoring functions (e.g. different energy functions) to predict peptide-MHC interactions by, e.g., energy minimization (see Fig. 1). However, the accuracy of those approaches is still far behind the sequence-based methods. Benchmarking studies show that the artificial neural network based tool NetMHC (C. Lundegaard et al., Nucleic Acids Research 36, W509 (2008)) and the matrix based algorithm SMM (B. Peters, A. Sette, BMC Bioinformatics 6, 132 (2005)) perform best on the tested evaluation data (B. Peters, A. Sette, BMC Bioinformatics 6, 132 (2005); H. H. Lin, S. Ray, S. Tongchusak, E. L. Reinherz, V. Brusic, BMC Immunology 9, 8 (2008)). Both approaches are integrated in the so-called IEDB consensus methods, available at the Immune Epitope Database (Y. Kim et al., Nucleic Acids Research 40, W525 (2012)).

Modeling peptide-MHC II binding is far more complex than for MHC I, as MHC II molecules possess a binding groove with open ends at either side, allowing binding of peptides of different lengths. Whereas peptide binding to MHC I is restricted mainly to 8-12 amino acids, this length can differ dramatically for MHC II peptides (9-30 amino acids). A recent benchmarking study shows that the available MHC II prediction methods offer a limited accuracy compared to MHC I prediction (H. H. Lin, S. Ray, S. Tongchusak, E. L. Reinherz, V. Brusic, BMC Immunology 9, 8 (2008)).

The first large-scale and systematic use of those algorithms to find T cell epitopes was undertaken by Moutaftsi et al. (M. Moutaftsi et al., Nature Biotechnology 24, 817 (2006)), where different tools were combined to predict epitope candidates of vaccinia virus infected C57BL/6 mice. They extracted splenocytes and measured CD8+ T cell responses against the top 1% of the predicted peptides. They identified 49 (out of 2256) peptides that induced a T cell response. Since then many studies have been published using various MHC binding prediction tools to search for T cell epitopes as candidates for a vaccine, mainly for pathogens, e.g., Leishmania major (C. Herrera-Najera, R. Piña-Aguilar, F. Xacur-Garcia, M. J. Ramirez-Sierra, E. Dumonteil, Proteomics 9, 1293 (2009)). However, to use solely MHC I binding prediction tools for prediction of immunogenicity is misleading, as those tools are trained to predict whether a given peptide has the potential to bind to a given MHC allele. The rationale of using MHC binding predictions for predicting immunogenicity is the assumption that peptides binding with high affinity to a respective MHC allele are more likely to be immunogenic (A. Sette et al., The Journal of Immunology 153, 5586 (1994)). However, there are numerous studies indicating that low MHC binding affinity can also result in high immunogenicity (M. C. Feltkamp, M. P. Vierboom, W. M. Kast, C. J. Melief, Molecular Immunology 31, 1391 (1994)) and that peptide-MHC stability might be a better predictor for immunogenicity than peptide affinity (M. Harndahl et al., European Journal of Immunology 42, 1405 (2012)). For that reason, immunogenicity prediction has not been very accurate so far, which is mirrored in the low success rates for predicting immunogenicity. Nevertheless, peptide binding is a necessary but not sufficient condition of T cell epitope recognition, and efficient prediction can dramatically reduce the number of peptides to be tested experimentally. It is clear that the development of a model that predicts immunogenicity also needs to take into account the recognition by the T cell receptor (TCR) as well as central tolerance, i.e., the negative and positive selection of T cells during development in the thymus.

There is a need for a predictive model which is capable of modeling all the aspects mentioned above to accurately predict immunogenicity of an epitope, rather than only binding.

DESCRIPTION OF INVENTION

SUMMARY OF THE INVENTION

In one aspect, the present invention relates to a method for predicting immunogenic amino acid modifications, the method comprising the steps:
a) ascertaining a score for binding of a modified peptide to one or more MHC molecules,
b) ascertaining a score for binding of the non-modified peptide to one or more MHC molecules, and/or
c) ascertaining a score for binding of the modified peptide when present in a MHC-peptide complex to one or more T cell receptors.

In one embodiment, the modified peptide comprises a fragment of a modified protein, said fragment comprising the modification(s) present in the protein. In one embodiment, the non-modified peptide or protein has the germline amino acid at the position(s) corresponding to the position(s) of the modification(s) in the modified peptide or protein.

In one embodiment, the non-modified peptide or protein and modified peptide or protein are identical but for the modification(s). Preferably, the non-modified peptide or protein and modified peptide or protein have the same length and/or sequence (except for the modification(s)).

In one embodiment, the non-modified peptide and modified peptide are 8 to 15, preferably 8 to 12 amino acids in length.

In one embodiment, the one or more MHC molecules comprise different MHC molecule types, in particular different MHC alleles. In one embodiment, the one or more MHC molecules are MHC class I molecules and/or MHC class II molecules. In one embodiment, the one or more MHC molecules comprise a set of MHC alleles such as a set of MHC alleles of an individual or a subset thereof.

In one embodiment, the score for binding to one or more MHC molecules is ascertained by a process comprising a sequence comparison with a database of MHC-binding motifs.

In one embodiment, step a) comprises ascertaining whether said score satisfies a pre-determined threshold for binding to one or more MHC molecules and/or step b) comprises ascertaining whether said score satisfies a pre-determined threshold for binding to one or more MHC molecules. In one embodiment, the threshold applied in step a) is different to the threshold applied in step b). In one embodiment, the pre-determined threshold for binding to one or more MHC molecules reflects a probability for binding to one or more MHC molecules.

In one embodiment, the one or more T cell receptors comprise a set of T cell receptors such as a set of T cell receptors of an individual or a subset thereof. In one embodiment, step c) comprises assuming that said set of T cell receptors does not include T cell receptors which bind to the non-modified peptide when present in a MHC-peptide complex and/or does not include T cell receptors which bind to the non-modified peptide when present in a MHC-peptide complex with high affinity.

In one embodiment, step c) comprises ascertaining a score for the chemical and physical similarities between the non-modified and modified amino acids. In one embodiment, step c) comprises ascertaining whether said score satisfies a pre-determined threshold for the chemical and physical similarities between amino acids. In one embodiment, said pre-determined threshold for the chemical and physical similarities between amino acids reflects a probability for amino acids being chemically and physically similar. In one embodiment, the score for the chemical and physical similarities is ascertained on the basis of the probability of amino acids being interchanged in nature. In one embodiment, the more frequently amino acids are interchanged in nature the more similar the amino acids are considered and vice versa. In one embodiment, the chemical and physical similarities are determined using evolutionary based log-odds matrices.
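The similarity score of step c) can be illustrated with a small lookup into a symmetric log-odds table. The following is only a minimal sketch: the table `TOY_LOG_ODDS`, its values, the function names and the default threshold of 0 are hypothetical and chosen for illustration; they are not taken from the specification or from any published substitution matrix.

```python
# Toy symmetric log-odds values for amino acid exchanges.
# HYPOTHETICAL numbers for illustration only, not a published matrix.
TOY_LOG_ODDS = {
    frozenset("IL"): 2,   # frequently interchanged -> considered similar
    frozenset("DE"): 2,
    frozenset("KR"): 2,
    frozenset("DL"): -4,  # rarely interchanged -> considered dissimilar
    frozenset("GW"): -2,
}


def similarity_score(wt_aa, mut_aa, matrix=TOY_LOG_ODDS):
    """Return the log-odds score for exchanging wt_aa with mut_aa."""
    key = frozenset((wt_aa, mut_aa))
    if key not in matrix:
        raise KeyError(f"no log-odds entry for {wt_aa}/{mut_aa}")
    return matrix[key]


def is_dissimilar(wt_aa, mut_aa, threshold=0, matrix=TOY_LOG_ODDS):
    """Treat a pair as dissimilar when its score is below the threshold.

    The threshold value is an assumption; the text only states that a
    pre-determined threshold is applied.
    """
    return similarity_score(wt_aa, mut_aa, matrix) < threshold


print(similarity_score("L", "I"))
print(is_dissimilar("D", "L"))
```

In practice the toy table would be replaced by a full evolutionary log-odds matrix covering all amino acid pairs.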

In one embodiment, if the non-modified peptide has a score for binding to one or more MHC molecules satisfying a threshold indicating binding to one or more MHC molecules and the modified peptide has a score for binding to one or more MHC molecules satisfying a threshold indicating binding to one or more MHC molecules, the modification or modified peptide is predicted as immunogenic if the non-modified and modified amino acids have a score for the chemical and physical similarities satisfying a threshold indicating chemical and physical dissimilarity.

In one embodiment, if the non-modified peptide binds to one or more MHC molecules or has a probability for binding to one or more MHC molecules and the modified peptide binds to one or more MHC molecules or has a probability for binding to one or more MHC molecules, the modification or modified peptide is predicted as immunogenic if the non-modified and modified amino acids are chemically and physically dissimilar or have a probability of being chemically and physically dissimilar.

In one embodiment, the modification is not in an anchor position for binding to one or more MHC molecules.

In one embodiment, if the non-modified peptide has a score for binding to one or more MHC molecules satisfying a threshold indicating no binding to one or more MHC molecules and the modified peptide has a score for binding to one or more MHC molecules satisfying a threshold indicating binding to one or more MHC molecules, the modification or modified peptide is predicted as immunogenic.

In one embodiment, if the non-modified peptide does not bind to one or more MHC molecules or has a probability for not binding to one or more MHC molecules and the modified peptide binds to one or more MHC molecules or has a probability for binding to one or more MHC molecules, the modification or modified peptide is predicted as immunogenic.

In one embodiment, the modification is in an anchor position for binding to one or more MHC molecules.
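The two prediction scenarios described above can be sketched as a single decision function. This is a hedged illustration, not the claimed method: the names `Thresholds` and `predict_immunogenic`, the numeric defaults, and the score orientation (higher M score means binding, lower T score means more dissimilar) are all assumptions. The case in which the modified peptide does not bind is not covered by the embodiments above and is treated here as "not predicted".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    # All defaults are HYPOTHETICAL; the thresholds for the modified and
    # non-modified peptide may differ, as the text allows.
    mut_binding: float = 0.5   # M_mut at or above this -> modified peptide binds
    wt_binding: float = 0.5    # M_wt at or above this -> non-modified peptide binds
    similarity: float = 0.0    # T score below this -> amino acids are dissimilar


def predict_immunogenic(m_mut, m_wt, t_score, th=Thresholds()):
    """Combine the three scores into a yes/no prediction."""
    mut_binds = m_mut >= th.mut_binding
    wt_binds = m_wt >= th.wt_binding
    dissimilar = t_score < th.similarity

    if not mut_binds:
        # Assumption: without MHC binding of the modified peptide,
        # no immunogenicity is predicted.
        return False
    if not wt_binds:
        # Only the modified peptide binds: predicted as immunogenic.
        return True
    # Both peptides bind: immunogenic only if the exchanged
    # amino acids are chemically and physically dissimilar.
    return dissimilar


print(predict_immunogenic(m_mut=0.9, m_wt=0.1, t_score=2.0))
print(predict_immunogenic(m_mut=0.9, m_wt=0.9, t_score=-3.0))
```

Keeping the thresholds in one small object makes it easy to apply different cut-offs to the modified and non-modified peptide.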

In one embodiment, the method of the invention comprises performing step a) on two or more different modified peptides, said two or more different modified peptides comprising the same modification(s). In one embodiment, the two or more different modified peptides comprising the same modification(s) comprise different fragments of a modified protein, said different fragments comprising the same modification(s) present in the protein. In one embodiment, the two or more different modified peptides comprising the same modification(s) comprise all potential MHC binding fragments of a modified protein, said fragments comprising the same modification(s) present in the protein. In one embodiment, the method of the invention further comprises selecting (the) modified peptide(s) from the two or more different modified peptides comprising the same modification(s) having a probability or having the highest probability for binding to one or more MHC molecules. In one embodiment, the two or more different modified peptides comprising the same modification(s) differ in length and/or position of the modification(s).
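Enumerating the different fragments that carry the same modification can be sketched as a sliding window over the modified protein. The sketch below assumes a single substituted residue at a known 0-based index and the preferred length range of 8 to 12 amino acids; the function names and the `score_fn` callback (standing in for any MHC binding predictor) are hypothetical.

```python
def candidate_peptides(protein, mut_index, min_len=8, max_len=12):
    """List (peptide, position_of_modification_in_peptide) for every window
    of the given lengths that covers the modified residue (0-based index)."""
    candidates = []
    for length in range(min_len, max_len + 1):
        first_start = max(0, mut_index - length + 1)
        last_start = min(mut_index, len(protein) - length)
        for start in range(first_start, last_start + 1):
            candidates.append((protein[start:start + length], mut_index - start))
    return candidates


def best_binder(candidates, score_fn):
    """Pick the candidate with the highest score.

    score_fn is a placeholder for an MHC binding predictor; it is assumed
    to return larger values for more likely binders.
    """
    return max(candidates, key=lambda cand: score_fn(cand[0]))


protein = "ACDEFGHIKLMNPQRSTVWY"   # toy sequence, modified residue 'M' at index 10
windows = candidate_peptides(protein, mut_index=10, min_len=8, max_len=9)
print(len(windows))
print(windows[0])
```

The candidates differ in length and in the position of the modification within the peptide, which is what allows an anchor or non-anchor placement to be compared.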

In one embodiment, the method of the invention comprises performing step a) and optionally one or both of steps b) and c) on two or more different modified peptides. In one embodiment, said two or more different modified peptides comprise the same modification(s) and/or comprise different modifications. In one embodiment, the different modifications are present in the same and/or in different proteins. The set of two or more different modified peptides used in step a) and optionally one or both of steps b) and c) may be the same or different. In one embodiment, the set of two or more different modified peptides used in step b) and/or step c) is a subset of the set of two or more different modified peptides used in step a). Preferably, said subset includes the peptide(s) scoring best in step a).

In one embodiment, the method of the invention comprises comparing the scores of two or more of said different modified peptides. In one embodiment, the method of the invention comprises ranking two or more of said different modified peptides. In one embodiment, a score for binding of the modified peptide to one or more MHC molecules is weighted higher than a score for binding of the modified peptide when present in a MHC-peptide complex to one or more T cell receptors, preferably a score for the chemical and physical similarities between the non-modified and modified amino acids, and a score for binding of the modified peptide when present in a MHC-peptide complex to one or more T cell receptors, preferably a score for the chemical and physical similarities between the non-modified and modified amino acids, is weighted higher than a score for binding of the non-modified peptide to one or more MHC molecules.
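One way to picture the weighting order (modified-peptide MHC score above the T score, T score above the non-modified-peptide MHC score) is a weighted sum used as a sort key. This is only one possible reading: the weighted-sum form, the weight values, the dictionary keys and the assumption that every component has already been rescaled so that larger values favour predicted immunogenicity are all hypothetical.

```python
def rank_candidates(candidates, w_mut=3.0, w_t=2.0, w_wt=1.0):
    """Rank candidates by a weighted sum of three pre-scaled components.

    Each candidate is a dict with keys 'm_mut', 't' and 'm_wt'
    (HYPOTHETICAL names). The components are assumed to be rescaled so
    that larger is more favourable; how to rescale is not specified here.
    """
    if not (w_mut > w_t > w_wt):
        raise ValueError("weights must satisfy w_mut > w_t > w_wt")

    def combined(cand):
        return w_mut * cand["m_mut"] + w_t * cand["t"] + w_wt * cand["m_wt"]

    return sorted(candidates, key=combined, reverse=True)


peptides = [
    {"name": "p1", "m_mut": 0.9, "t": 0.2, "m_wt": 0.5},
    {"name": "p2", "m_mut": 0.5, "t": 0.9, "m_wt": 0.9},
    {"name": "p3", "m_mut": 0.1, "t": 0.1, "m_wt": 0.1},
]
for cand in rank_candidates(peptides):
    print(cand["name"])
```

A strict lexicographic ordering (sort by the MHC score first, break ties with the T score) would be an equally valid reading of "weighted higher".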

In one embodiment, the method of the invention further comprises identifying non-synonymous mutations in one or more protein-coding regions.

In one embodiment, modifications are identified according to the invention by partially or completely sequencing the genome or transcriptome of one or more cells such as one or more cancer cells and optionally one or more non-cancerous cells and identifying mutations in one or more protein-coding regions.

In one embodiment, said mutations are somatic mutations. In one embodiment, said mutations are cancer mutations.
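Identifying non-synonymous mutations can be illustrated by translating matched coding sequences from non-cancerous and cancer cells codon by codon. The sketch below is a simplification under stated assumptions: both sequences are in-frame, of equal length and differ only by substitutions; real pipelines work from aligned sequencing reads and variant calls. The function name is hypothetical; the codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def nonsynonymous_changes(normal_cds, tumor_cds):
    """Return (codon_index, germline_aa, mutated_aa) for every codon whose
    encoded amino acid differs between the two coding sequences."""
    if len(normal_cds) != len(tumor_cds) or len(normal_cds) % 3 != 0:
        raise ValueError("sequences must be in-frame and of equal length")
    changes = []
    for i in range(0, len(normal_cds), 3):
        wt_aa = CODON_TABLE[normal_cds[i:i + 3]]
        mut_aa = CODON_TABLE[tumor_cds[i:i + 3]]
        if wt_aa != mut_aa:
            changes.append((i // 3, wt_aa, mut_aa))
    return changes


# GAA -> GAT changes Glu to Asp; CTG -> CTC is silent (both Leu).
print(nonsynonymous_changes("ATGGAACTG", "ATGGATCTC"))
```

Synonymous substitutions are skipped because they leave the encoded peptide, and therefore the candidate epitope, unchanged.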

In one embodiment, the method of the invention is used in the manufacture of a vaccine. In one embodiment, the vaccine is derived from (a) modification(s) or (a) modified peptide(s) predicted as immunogenic by the methods of the invention.

In a further aspect, the present invention provides a method for providing a vaccine comprising the step: identifying (a) modification(s) or (a) modified peptide(s) predicted as immunogenic by the methods of the invention.

In one embodiment, the method further comprises the step: providing a vaccine comprising a peptide or polypeptide comprising the modification(s) or modified peptide(s) predicted as immunogenic, or a nucleic acid encoding the peptide or polypeptide.

In a further aspect, the present invention provides a vaccine which is obtainable using the methods according to the invention. Preferred embodiments of such vaccines are described herein.

A vaccine provided according to the invention may comprise a pharmaceutically acceptable carrier and may optionally comprise one or more adjuvants, stabilizers etc. The vaccine may be in the form of a therapeutic or prophylactic vaccine.

Another aspect relates to a method for inducing an immune response in a patient, comprising administering to the patient a vaccine provided according to the invention.

Another aspect relates to a method of treating a cancer patient comprising the steps: (a) providing a vaccine by the methods according to the invention; and (b) administering said vaccine to the patient.

Another aspect relates to a method of treating a cancer patient comprising administering the vaccine according to the invention to the patient.

In further aspects, the invention provides the vaccines described herein for use in the methods of treatment described herein, in particular for use in treating or preventing cancer.

The treatments of cancer described herein can be combined with surgical resection and/or radiation and/or traditional chemotherapy.

Other features and advantages of the instant invention will be apparent from the following detailed description and claims.

DETAILED DESCRIPTION OF THE INVENTION

Although the present invention is described in detail below, it is to be understood that this invention is not limited to the particular methodologies, protocols and reagents described herein as these may vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to limit the scope of the present invention which will be limited only by the appended claims. Unless defined otherwise, all technical and scientific terms used herein have the same meanings as commonly understood by one of ordinary skill in the art.

In the following, the elements of the present invention will be described. These elements are listed with specific embodiments. However, it should be understood that they may be combined in any manner and in any number to create additional embodiments. The variously described examples and preferred embodiments should not be construed to limit the present invention to only the explicitly described embodiments. This description should be understood to support and encompass embodiments which combine the explicitly described embodiments with any number of the disclosed and/or preferred elements. Furthermore, any permutations and combinations of all described elements in this application should be considered disclosed by the description of the present application unless the context indicates otherwise.

Preferably, the terms used herein are defined as described in "A multilingual glossary of biotechnological terms: (IUPAC Recommendations)", H.G.W. Leuenberger, B. Nagel, and H. Kölbl, Eds. (1995) Helvetica Chimica Acta, CH-4010 Basel, Switzerland.

The practice of the present invention will employ, unless otherwise indicated, conventional methods of biochemistry, cell biology, immunology, and recombinant DNA techniques which are explained in the literature in the field (cf., e.g., Molecular Cloning: A Laboratory Manual, 2nd Edition, J. Sambrook et al. eds., Cold Spring Harbor Laboratory Press, Cold Spring Harbor 1989).

Throughout this specification and the claims which follow, unless the context requires otherwise, the word "comprise", and variations such as "comprises" and "comprising", will be understood to imply the inclusion of a stated member, integer or step or group of members, integers or steps but not the exclusion of any other member, integer or step or group of members, integers or steps although in some embodiments such other member, integer or step or group of members, integers or steps may be excluded, i.e. the subject-matter consists in the inclusion of a stated member, integer or step or group of members, integers or steps. The terms "a" and "an" and "the" and similar reference used in the context of describing the invention (especially in the context of the claims) are to be construed to cover both the singular and the plural, unless otherwise indicated herein or clearly contradicted by context. Recitation of ranges of values herein is merely intended to serve as a shorthand method of referring individually to each separate value falling within the range. Unless otherwise indicated herein, each individual value is incorporated into the specification as if it were individually recited herein.

All methods described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g., "such as"), provided herein is intended merely to better illustrate the invention and does not pose a limitation on the scope of the invention otherwise claimed. No language in the specification should be construed as indicating any non-claimed element essential to the practice of the invention.

Several documents are cited throughout the text of this specification. Each of the documents cited herein (including all patents, patent applications, scientific publications, manufacturer's specifications, instructions, etc.), whether supra or infra, are hereby incorporated by reference in their entirety. Nothing herein is to be construed as an admission that the invention is not entitled to antedate such disclosure by virtue of prior invention.

According to the present invention, the term "peptide" refers to substances comprising two or more, preferably 3 or more, preferably 4 or more, preferably 6 or more, preferably 8 or more, preferably 10 or more, preferably 13 or more, preferably 16 or more, preferably 21 or more and up to preferably 8, 10, 20, 30, 40 or 50, in particular 100 amino acids joined covalently by peptide bonds. The term "polypeptide" or "protein" refers to large peptides, preferably to peptides with more than 100 amino acid residues, but in general the terms "peptide", "polypeptide" and "protein" are synonyms and are used interchangeably herein.

According to the invention, the term "modification" with respect to peptides, polypeptides or proteins relates to a sequence change in a peptide, polypeptide or protein compared to a parental sequence such as the sequence of a wildtype peptide, polypeptide or protein. The term includes amino acid insertion variants, amino acid addition variants, amino acid deletion variants and amino acid substitution variants, preferably amino acid substitution variants. All these sequence changes according to the invention may potentially create new epitopes.

Amino acid insertion variants comprise insertions of single or two or more amino acids in a particular amino acid sequence.

Amino acid addition variants comprise amino- and/or carboxy-terminal fusions of one or more amino acids, such as 1, 2, 3, 4 or 5, or more amino acids.

Amino acid deletion variants are characterized by the removal of one or more amino acids from the sequence, such as by removal of 1, 2, 3, 4 or 5, or more amino acids.

Amino acid substitution variants are characterized by at least one residue in the sequence being removed and another residue being inserted in its place.

According to the invention, a modification or modified peptide used for testing in the methods of the invention may be derived from a protein comprising a modification.

The term "derived" means according to the invention that a particular entity, in particular a particular peptide sequence, is present in the object from which it is derived. In the case of amino acid sequences, especially particular sequence regions, "derived" in particular means that the relevant amino acid sequence is derived from an amino acid sequence in which it is present.

A protein comprising a modification from which a modification or modified peptide used for testing in the methods of the invention may be derived may be a neoantigen.

According to the invention, the term "neoantigen" relates to a peptide or protein including one or more amino acid modifications compared to the parental peptide or protein. For example, the neoantigen may be a tumor-associated neoantigen, wherein the term "tumor-associated neoantigen" includes a peptide or protein including amino acid modifications due to tumor-specific mutations.

According to the invention, the term "tumor-specific mutation" or "cancer-specific mutation" relates to a somatic mutation that is present in the nucleic acid of a tumor or cancer cell but absent in the nucleic acid of a corresponding normal, i.e. non-tumorous or non-cancerous, cell.

The terms "tumor-specific mutation" and "tumor mutation" and the terms "cancer-specific mutation" and "cancer mutation" are used interchangeably herein.

The term "immune response" refers to an integrated bodily response to a target such as an antigen and preferably refers to a cellular immune response or a cellular as well as a humoral immune response. The immune response may be protective/preventive/prophylactic and/or therapeutic.

"Inducing an immune response" may mean that there was no immune response before induction, but it may also mean that there was a certain level of immune response before induction and after induction said immune response is enhanced. Thus, "inducing an immune response" also includes "enhancing an immune response". Preferably, after inducing an immune response in a subject, said subject is protected from developing a disease such as a cancer disease or the disease condition is ameliorated by inducing an immune response. For example, an immune response against a tumor-expressed antigen may be induced in a patient having a cancer disease or in a subject being at risk of developing a cancer disease. Inducing an immune response in this case may mean that the disease condition of the subject is ameliorated, that the subject does not develop metastases, or that the subject being at risk of developing a cancer disease does not develop a cancer disease.

The terms "cellular immune response" and "cellular response" or similar terms refer to an immune response directed to cells characterized by presentation of an antigen with class I or class II MHC involving T cells or T-lymphocytes which act as either "helpers" or "killers". The helper T cells (also termed CD4+ T cells) play a central role by regulating the immune response and the killer cells (also termed cytotoxic T cells, cytolytic T cells, CD8+ T cells or CTLs) kill diseased cells such as cancer cells, preventing the production of more diseased cells. In preferred embodiments, the present invention involves the stimulation of an anti-tumor CTL response against tumor cells expressing one or more tumor-expressed antigens and preferably presenting such tumor-expressed antigens with class I MHC.

An "antigen" according to the invention covers any substance, preferably a peptide or protein, that is a target of and/or induces an immune response such as a specific reaction with antibodies or T-lymphocytes (T cells). Preferably, an antigen comprises at least one epitope such as a T cell epitope. Preferably, an antigen in the context of the present invention is a molecule which, optionally after processing, induces an immune reaction, which is preferably specific for the antigen (including cells expressing the antigen). The antigen or a T cell epitope thereof is preferably presented by a cell, preferably by an antigen presenting cell which includes a diseased cell, in particular a cancer cell, in the context of MHC molecules, which results in an immune response against the antigen (including cells expressing the antigen).

In one embodiment, an antigen is a tumor antigen (also termed tumor-expressed antigen), i.e., a part of a tumor cell such as a protein or peptide expressed in a tumor cell which may be derived from the cytoplasm, the cell surface or the cell nucleus, in particular those which primarily occur intracellularly or as surface antigens of tumor cells. For example, tumor antigens include the carcinoembryonal antigen, α1-fetoprotein, isoferritin, and fetal sulphoglycoprotein, α2-H-ferroprotein and γ-fetoprotein. According to the present invention, a tumor antigen preferably comprises any antigen which is expressed in and optionally characteristic with respect to type and/or expression level for tumors or cancers as well as for tumor or cancer cells, i.e., a tumor-associated antigen. In one embodiment, the term "tumor-associated antigen" relates to proteins that are under normal conditions specifically expressed in a limited number of tissues and/or organs or in specific developmental stages, for example, the tumor-associated antigens may be under normal conditions specifically expressed in stomach tissue, preferably in gastric mucosa, in reproductive organs, e.g., in testis, in trophoblastic tissue, e.g., in placenta, or in germ line cells, and are expressed or aberrantly expressed in one or more tumor or cancer tissues. In this context, "a limited number" preferably means not more than 3, more preferably not more than 2. The tumor antigens in the context of the present invention include, for example, differentiation antigens, preferably cell type specific differentiation antigens, i.e., proteins that are under normal conditions specifically expressed in a certain cell type at a certain differentiation stage, cancer/testis antigens, i.e., proteins that are under normal conditions specifically expressed in testis and sometimes in placenta, and germ line specific antigens.

Preferably, the tumor antigen or the aberrant expression of the tumor antigen identifies cancer cells. In the context of the present invention, the tumor antigen that is expressed by a cancer cell in a subject, e.g., a patient suffering from a cancer disease, is preferably a self-protein in said subject. In preferred embodiments, the tumor antigen in the context of the present invention is expressed under normal conditions specifically in a tissue or organ that is non-essential, i.e., tissues or organs which when damaged by the immune system do not lead to death of the subject, or in organs or structures of the body which are not or only hardly accessible by the immune system.

According to the invention, the terms "tumor antigen", "tumor-expressed antigen", "cancer antigen" and "cancer-expressed antigen" are equivalents and are used interchangeably herein.

The term "immunogenicity" relates to the relative effectivity to induce an immune response that is preferably associated with therapeutic treatments, such as treatments against cancers. As used herein, the term "immunogenic" relates to the property of having immunogenicity. For example, the term "immunogenic modification" when used in the context of a peptide, polypeptide or protein relates to the effectivity of said peptide, polypeptide or protein to induce an immune response that is caused by and/or directed against said modification. Preferably, the non-modified peptide, polypeptide or protein does not induce an immune response, induces a different immune response or induces a different level, preferably a lower level, of immune response.

The terms "major histocompatibility complex" and the abbreviation "MHC" include MHC class I and MHC class II molecules and relate to a complex of genes which occurs in all vertebrates. MHC proteins or molecules are important for signaling between lymphocytes and antigen presenting cells or diseased cells in immune reactions, wherein the MHC proteins or molecules bind peptides and present them for recognition by T cell receptors. The proteins encoded by the MHC are expressed on the surface of cells, and display both self antigens (peptide fragments from the cell itself) and non-self antigens (e.g., fragments of invading microorganisms) to a T cell.

The MHC region is divided into three subgroups, class I, class II, and class III. MHC class I proteins contain an α-chain and β2-microglobulin (not part of the MHC, encoded by chromosome 15). They present antigen fragments to cytotoxic T cells. On most immune system cells, specifically on antigen-presenting cells, MHC class II proteins contain α- and β-chains and they present antigen fragments to T-helper cells. The MHC class III region encodes for other immune components, such as complement components and some that encode cytokines.

The MHC is both polygenic (there are several MHC class I and MHC class II genes) and polymorphic (there are multiple alleles of each gene).

As used herein, the term "haplotype" refers to the HLA alleles found on one chromosome and the proteins encoded thereby. Haplotype may also refer to the allele present at any one locus within the MHC. Each class of MHC is represented by several loci: e.g., HLA-A (Human Leukocyte Antigen-A), HLA-B, HLA-C, HLA-E, HLA-F, HLA-G, HLA-H, HLA-J, HLA-K, HLA-L, HLA-P and HLA-V for class I and HLA-DRA, HLA-DRB1-9, HLA-DQA1, HLA-DQB1, HLA-DPA1, HLA-DPB1, HLA-DMA, HLA-DMB, HLA-DOA, and HLA-DOB for class II. The terms "HLA allele" and "MHC allele" are used interchangeably herein.

The MHCs exhibit extreme polymorphism: within the human population there are, at each genetic locus, a great number of haplotypes comprising distinct alleles. Different polymorphic MHC alleles, of both class I and class II, have different peptide specificities: each allele encodes proteins that bind peptides exhibiting particular sequence patterns.

In one preferred embodiment of all aspects of the invention an MHC molecule is an HLA molecule.

In the context of the present invention, the term "MHC binding peptide" includes MHC class I and/or class II binding peptides or peptides that can be processed to produce MHC class I and/or class II binding peptides. In the case of class I MHC/peptide complexes, the binding peptides are typically 8-12, preferably 8-10 amino acids long although longer or shorter peptides may be effective. In the case of class II MHC/peptide complexes, the binding peptides are typically 9-30, preferably 10-25 amino acids long and are in particular 13-18 amino acids long, whereas longer and shorter peptides may be effective.

If a peptide is to be presented directly, i.e., without processing, in particular without cleavage, it has a length which is suitable for binding to an MHC molecule, in particular a class I MHC molecule, and preferably is 7-30 amino acids in length such as 7-20 amino acids in length, more preferably 7-12 amino acids in length, more preferably 8-11 amino acids in length, in particular 9 or 10 amino acids in length.

If a peptide is part of a larger entity comprising additional sequences, e.g. of a vaccine sequence or polypeptide, and is to be presented following processing, in particular following cleavage, the peptide produced by processing has a length which is suitable for binding to an MHC molecule, in particular a class I MHC molecule, and preferably is 7-30 amino acids in length such as 7-20 amino acids in length, more preferably 7-12 amino acids in length, more preferably 8-11 amino acids in length, in particular 9 or 10 amino acids in length. Preferably, the sequence of the peptide which is to be presented following processing is derived from the amino acid sequence of an antigen or polypeptide used for vaccination, i.e., its sequence substantially corresponds and is preferably completely identical to a fragment of the antigen or polypeptide.

Thus, an MHC binding peptide in one embodiment comprises a sequence which substantially corresponds and is preferably completely identical to a fragment of an antigen.

The term "epitope" refers to an antigenic determinant in a molecule such as an antigen, i.e., to a part in or fragment of the molecule that is recognized by the immune system, for example, that is recognized by a T cell, in particular when presented in the context of MHC molecules. An epitope of a protein such as a tumor antigen preferably comprises a continuous or discontinuous portion of said protein and is preferably between 5 and 100, preferably between 5 and 50, more preferably between 8 and 30, most preferably between 10 and 25 amino acids in length, for example, the epitope may be preferably 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 amino acids in length. It is particularly preferred that the epitope in the context of the present invention is a T cell epitope.

According to the invention an epitope may bind to MHC molecules such as MHC molecules on the surface of a cell and thus, may be a "MHC binding peptide".

As used herein the term "neo-epitope" refers to an epitope that is not present in a reference such as a normal non-cancerous or germline cell but is found in cancer cells. This includes, in particular, situations wherein in a normal non-cancerous or germline cell a corresponding epitope is found, however, due to one or more mutations in a cancer cell the sequence of the epitope is changed so as to result in the neo-epitope.

As used herein, the term "T cell epitope" refers to a peptide which binds to a MHC molecule in a configuration recognized by a T cell receptor. Typically, T cell epitopes are presented on the surface of an antigen-presenting cell.

As used herein, the term "predicting T cell epitopes" refers to a prediction whether a peptide will bind to a MHC molecule and will be recognized by a T cell receptor. The term "predicting T cell epitopes" is essentially synonymous with the phrase "predicting whether a peptide is immunogenic".

According to the invention, a T cell epitope may be present in a vaccine as a part of a larger entity such as a vaccine sequence and/or a polypeptide comprising more than one T cell epitope.

The presented peptide or T cell epitope is produced following suitable processing.

T cell epitopes may be modified at one or more residues that are not essential for TCR recognition or for binding to MHC. Such modified T cell epitopes may be considered immunologically equivalent.

Preferably a T cell epitope when presented by MHC and recognized by a T cell receptor is able to induce, in the presence of appropriate co-stimulatory signals, clonal expansion of the T cell carrying the T cell receptor specifically recognizing the peptide/MHC-complex. Preferably, a T cell epitope comprises an amino acid sequence substantially corresponding to the amino acid sequence of a fragment of an antigen. Preferably, said fragment of an antigen is an MHC class I and/or class II presented peptide.

A T cell epitope according to the invention preferably relates to a portion or fragment of an antigen which is capable of stimulating an immune response, preferably a cellular response against the antigen or cells characterized by expression of the antigen and preferably by presentation of the antigen such as diseased cells, in particular cancer cells. Preferably, a T cell epitope is capable of stimulating a cellular response against a cell characterized by presentation of an antigen with class I MHC and preferably is capable of stimulating an antigen-responsive cytotoxic T-lymphocyte (CTL).

"Antigen processing" or "processing" refers to the degradation of a peptide, polypeptide or protein into procession products, which are fragments of said peptide, polypeptide or protein (e.g., the degradation of a polypeptide into peptides) and the association of one or more of these fragments (e.g., via binding) with MHC molecules for presentation by cells, preferably antigen presenting cells, to specific T cells.

"Antigen presenting cells" (APC) are cells which present peptide fragments of protein antigens in association with MHC molecules on their cell surface. Some APCs may activate antigen specific T cells.

Professional antigen-presenting cells are very efficient at internalizing antigen, either by phagocytosis or by receptor-mediated endocytosis, and then displaying a fragment of the antigen, bound to a class II MHC molecule, on their membrane. The T cell recognizes and interacts with the antigen-class II MHC molecule complex on the membrane of the antigen-presenting cell. An additional co-stimulatory signal is then produced by the antigen-presenting cell, leading to activation of the T cell. The expression of co-stimulatory molecules is a defining feature of professional antigen-presenting cells.

The main types of professional antigen-presenting cells are dendritic cells, which have the broadest range of antigen presentation, and are probably the most important antigen-presenting cells, macrophages, B-cells, and certain activated epithelial cells.

Dendritic cells (DCs) are leukocyte populations that present antigens captured in peripheral tissues to T cells via both MHC class II and I antigen presentation pathways. It is well known that dendritic cells are potent inducers of immune responses and the activation of these cells is a critical step for the induction of antitumoral immunity. Dendritic cells are conveniently categorized as "immature" and "mature" cells, which can be used as a simple way to discriminate between two well characterized phenotypes. However, this nomenclature should not be construed to exclude all possible intermediate stages of differentiation.

Immature dendritic cells are characterized as antigen presenting cells with a high capacity for antigen uptake and processing, which correlates with the high expression of Fcγ receptor and mannose receptor. The mature phenotype is typically characterized by a lower expression of these markers, but a high expression of cell surface molecules responsible for T cell activation such as class I and class II MHC, adhesion molecules (e.g. CD54 and CD11) and costimulatory molecules (e.g., CD40, CD80, CD86 and 4-1BB).

Dendritic cell maturation is referred to as the status of dendritic cell activation at which such antigen-presenting dendritic cells lead to T cell priming, while presentation by immature dendritic cells results in tolerance. Dendritic cell maturation is chiefly caused by biomolecules with microbial features detected by innate receptors (bacterial DNA, viral RNA, endotoxin, etc.), pro-inflammatory cytokines (TNF, IL-1, IFNs), ligation of CD40 on the dendritic cell surface by CD40L, and substances released from cells undergoing stressful cell death. The dendritic cells can be derived by culturing bone marrow cells in vitro with cytokines, such as granulocyte-macrophage colony-stimulating factor (GM-CSF) and tumor necrosis factor alpha.

Non-professional antigen-presenting cells do not constitutively express the MHC class II proteins required for interaction with naive T cells; these are expressed only upon stimulation of the non-professional antigen-presenting cells by certain cytokines such as IFNγ.

Antigen presenting cells can be loaded with MHC class I presented peptides by transducing the cells with nucleic acid, preferably RNA, encoding a peptide or polypeptide comprising the peptide to be presented, e.g. a nucleic acid encoding an antigen or polypeptide used for vaccination.

In some embodiments, a pharmaceutical composition or vaccine comprising a nucleic acid delivery vehicle that targets a dendritic or other antigen presenting cell may be administered to a patient, resulting in transfection that occurs in vivo. In vivo transfection of dendritic cells, for example, may generally be performed using any methods known in the art, such as those described in WO 47, or the gene gun approach described by Mahvi et al., Immunology and Cell Biology 75: 456-460, 1997.

According to the invention, the term "antigen presenting cell" also includes target cells.

"Target cell" shall mean a cell which is a target for an immune response such as a cellular immune response. Target cells include cells that present an antigen, i.e. a peptide fragment derived from an antigen, and include any undesirable cell such as a cancer cell. In preferred embodiments, the target cell is a cell expressing an antigen as described herein and preferably presenting said antigen with class I MHC.

The term "portion" refers to a fraction. With respect to a particular structure such as an amino acid sequence or protein the term "portion" thereof may designate a continuous or a discontinuous fraction of said structure. Preferably, a portion of an amino acid sequence comprises at least 1%, at least 5%, at least 10%, at least 20%, at least 30%, preferably at least 40%, preferably at least 50%, more preferably at least 60%, more preferably at least 70%, even more preferably at least 80%, and most preferably at least 90% of the amino acids of said amino acid sequence. Preferably, if the portion is a discontinuous fraction said discontinuous fraction is composed of 2, 3, 4, 5, 6, 7, 8, or more parts of a structure, each part being a continuous element of the structure. For example, a discontinuous fraction of an amino acid sequence may be composed of 2, 3, 4, 5, 6, 7, 8, or more, preferably not more than 4 parts of said amino acid sequence, wherein each part preferably comprises at least 5 continuous amino acids, at least 10 continuous amino acids, preferably at least 20 continuous amino acids, preferably at least 30 continuous amino acids of the amino acid sequence.

The terms "part" and "fragment" are used interchangeably herein and refer to a continuous element. For example, a part of a structure such as an amino acid sequence or protein refers to a continuous element of said structure. A portion, a part or a fragment of a structure preferably comprises one or more functional properties of said structure. For example, a portion, a part or a fragment of an epitope, peptide or protein is preferably immunologically equivalent to the epitope, peptide or protein it is derived from. In the context of the present invention, a "part" of a structure such as an amino acid sequence preferably comprises, preferably consists of at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 92%, at least 94%, at least 96%, at least 98%, at least 99% of the entire structure or amino acid sequence.

The term "immunoreactive cell" in the context of the present invention relates to a cell which exerts effector functions during an immune reaction. An "immunoreactive cell" preferably is capable of binding an antigen or a cell characterized by presentation of an antigen or a peptide fragment thereof (e.g. a T cell epitope) and mediating an immune response. For example, such cells secrete cytokines and/or chemokines, secrete antibodies, recognize cancerous cells, and optionally eliminate such cells. For example, immunoreactive cells comprise T cells (cytotoxic T cells, helper T cells, tumor infiltrating T cells), B cells, natural killer cells, neutrophils, macrophages, and dendritic cells. Preferably, in the context of the present invention, "immunoreactive cells" are T cells, preferably CD4+ and/or CD8+ T cells.

Preferably, an "immunoreactive cell" recognizes an antigen or a peptide fragment thereof with some degree of specificity, in particular if presented in the context of MHC molecules such as on the surface of antigen presenting cells or diseased cells such as cancer cells. Preferably, said recognition enables the cell that recognizes an antigen or a peptide fragment thereof to be responsive or reactive. If the cell is a helper T cell (CD4+ T cell) bearing receptors that recognize an antigen or a peptide fragment thereof in the context of MHC class II molecules such responsiveness or reactivity may involve the release of cytokines and/or the activation of CD8+ lymphocytes (CTLs) and/or B-cells. If the cell is a CTL such responsiveness or reactivity may involve the elimination of cells presented in the context of MHC class I molecules, i.e., cells characterized by presentation of an antigen with class I MHC, for example, via apoptosis or perforin-mediated cell lysis. According to the invention, CTL responsiveness may include sustained calcium flux, cell division, production of cytokines such as IFN-γ and TNF-α, up-regulation of activation markers such as CD44 and CD69, and specific cytolytic killing of antigen expressing target cells. CTL responsiveness may also be determined using an artificial reporter that accurately indicates CTL responsiveness. Such CTL that recognize an antigen or an antigen fragment and are responsive or reactive are also termed "antigen-responsive CTL" herein. If the cell is a B cell such responsiveness may involve the release of immunoglobulins.

The terms "T cell" and "T lymphocyte" are used interchangeably herein and include T helper cells (CD4+ T cells) and cytotoxic T cells (CTLs, CD8+ T cells) which comprise cytolytic T cells.

T cells belong to a group of white blood cells known as lymphocytes, and play a central role in cell-mediated immunity. They can be distinguished from other lymphocyte types, such as B cells and natural killer cells, by the presence of a special receptor on their cell surface called T cell receptor (TCR). The thymus is the principal organ responsible for the maturation of T cells. Several different subsets of T cells have been discovered, each with a distinct function.

T helper cells assist other white blood cells in immunologic processes, including maturation of B cells into plasma cells and activation of cytotoxic T cells and macrophages, among other functions. These cells are also known as CD4+ T cells because they express the CD4 protein on their surface. Helper T cells become activated when they are presented with peptide antigens by MHC class II molecules that are expressed on the surface of antigen presenting cells (APCs). Once activated, they divide rapidly and secrete small proteins called cytokines that regulate or assist in the active immune response.

Cytotoxic T cells destroy virally infected cells and tumor cells, and are also implicated in transplant rejection. These cells are also known as CD8+ T cells since they express the CD8 glycoprotein at their surface. These cells recognize their targets by binding to antigen associated with MHC class I, which is present on the surface of nearly every cell of the body.

A majority of T cells have a T cell receptor (TCR) existing as a complex of several proteins. The actual T cell receptor is composed of two separate peptide chains, which are produced from the independent T cell receptor alpha and beta (TCRα and TCRβ) genes and are called α- and β-TCR chains. γδ T cells (gamma delta T cells) represent a small subset of T cells that possess a distinct T cell receptor (TCR) on their surface. However, in γδ T cells, the TCR is made up of one γ-chain and one δ-chain. This group of T cells is much less common (2% of total T cells) than the αβ T cells.

The first signal in activation of T cells is provided by binding of the T cell receptor to a short peptide presented by the MHC on another cell. This ensures that only a T cell with a TCR specific to that peptide is activated. The partner cell is usually an antigen presenting cell such as a professional antigen presenting cell, usually a dendritic cell in the case of naive responses, although B cells and macrophages can be important APCs.

According to the present invention, a molecule is capable of binding to a target if it has a significant affinity for said predetermined target and binds to said predetermined target in standard assays. "Affinity" or "binding affinity" is often measured by equilibrium dissociation constant (KD). A molecule is not (substantially) capable of binding to a target if it has no significant affinity for said target and does not bind significantly to said target in standard assays.

Cytotoxic T lymphocytes may be generated in vivo by incorporation of an antigen or a peptide fragment thereof into antigen-presenting cells in vivo. The antigen or a peptide fragment thereof may be represented as protein, as DNA (e.g. within a vector) or as RNA. The antigen may be processed to produce a peptide partner for the MHC molecule, while a fragment thereof may be presented without the need for further processing. The latter is the case in particular, if these can bind to MHC molecules. In general, administration to a patient by intradermal injection is possible. However, injection may also be carried out intranodally into a lymph node (Maloy et al., Proc Natl Acad Sci USA 98:3299-303). The resulting cells present the complex of interest and are recognized by autologous cytotoxic T lymphocytes which then propagate.

Specific activation of CD4+ or CD8+ T cells may be detected in a variety of ways. Methods for detecting specific T cell activation include detecting the proliferation of T cells, the production of cytokines (e.g., lymphokines), or the generation of cytolytic activity. For CD4+ T cells, a preferred method for detecting specific T cell activation is the detection of the proliferation of T cells. For CD8+ T cells, a preferred method for detecting specific T cell activation is the detection of the generation of cytolytic activity.

By "cell characterized by presentation of an antigen" or "cell presenting an antigen" or similar expressions is meant a cell such as a diseased cell, e.g. a cancer cell, or an antigen presenting cell presenting the antigen it expresses or a fragment derived from said antigen, e.g. by processing of the antigen, in the context of MHC molecules, in particular MHC class I molecules. Similarly, the term "disease characterized by presentation of an antigen" denotes a disease involving cells characterized by presentation of an antigen, in particular with class I MHC. Presentation of an antigen by a cell may be effected by transfecting the cell with a nucleic acid such as RNA encoding the antigen.

By "fragment of an antigen which is presented" or similar expressions is meant that the fragment can be presented by MHC class I or class II, preferably MHC class I, e.g. when added directly to antigen presenting cells. In one embodiment, the fragment is a fragment which is naturally presented by cells expressing an antigen.

The term "immunologically equivalent" means that the immunologically equivalent molecule such as the immunologically equivalent amino acid sequence exhibits the same or essentially the same immunological properties and/or exerts the same or essentially the same immunological effects, e.g., with respect to the type of the immunological effect such as induction of a humoral and/or cellular immune response, the strength and/or duration of the induced immune reaction, or the specificity of the induced immune reaction. In the context of the present invention, the term "immunologically equivalent" is preferably used with respect to the immunological effects or properties of a peptide used for immunization. For example, an amino acid sequence is immunologically equivalent to a reference amino acid sequence if said amino acid sequence when exposed to the immune system of a subject induces an immune reaction having a specificity of reacting with the reference amino acid sequence.

The term "immune effector functions" in the context of the present invention includes any functions mediated by components of the immune system that result, for example, in the killing of tumor cells, or in the inhibition of tumor growth and/or inhibition of tumor development, including inhibition of tumor dissemination and metastasis. Preferably, the immune effector functions in the context of the present invention are T cell mediated effector functions. Such functions comprise in the case of a helper T cell (CD4+ T cell) the recognition of an antigen or an antigen fragment in the context of MHC class II molecules by T cell receptors, the release of cytokines and/or the activation of CD8+ lymphocytes (CTLs) and/or B-cells, and in the case of CTL the recognition of an antigen or an antigen fragment in the context of MHC class I molecules by T cell receptors, the elimination of cells presented in the context of MHC class I molecules, i.e., cells characterized by presentation of an antigen with class I MHC, for example, via apoptosis or perforin-mediated cell lysis, production of cytokines such as IFN-γ and TNF-α, and specific cytolytic killing of antigen expressing target cells.

According to the invention, the term "score" relates to a result, usually expressed numerically, of a test or examination. Terms such as "score better" or "score best" relate to a better result or the best result of a test or examination.

Terms such as "predict", "predicting" or "prediction" relate to the determination of a likelihood.

According to the invention, ascertaining a score for binding of a peptide to one or more MHC molecules includes determining the likelihood of binding of a peptide to one or more MHC molecules.

A score for binding of a peptide to one or more MHC molecules may be ascertained by using any peptide:MHC binding predictive tools. For example, the immune epitope database analysis resource (IEDB-AR: http://tools.iedb.org) may be used.

Predictions are usually made against a set of MHC molecules such as a set of different MHC alleles such as all possible MHC alleles or a set or subset of MHC alleles found in a patient preferably having the modification(s) the immunogenicity of which is to be determined according to the invention.

According to the invention, ascertaining a score for binding of a modified peptide when present in a MHC-peptide complex to one or more T cell receptors includes determining the likelihood of binding of a peptide when present in a complex with an MHC molecule to T cell receptors.

Predictions may be made against one T cell receptor such as a T cell receptor found in a patient or preferably against a set of T cell receptors such as an unknown set of different T cell receptors or a set or subset of T cell receptors found in a patient preferably having the modification(s) the immunogenicity of which is to be determined according to the invention.

Furthermore, predictions are usually made against a set of MHC molecules such as a set of different MHC alleles such as all possible MHC alleles or a set or subset of MHC alleles found in a patient preferably having the modification(s) the immunogenicity of which is to be determined according to the invention.

A score for binding of a modified peptide when present in a MHC-peptide complex to one or more T cell receptors may be ascertained by estimating the effect of the modification on the binding of a T cell receptor-peptide-MHC complex given an (unknown) T cell receptor repertoire. The score for binding of a modified peptide when present in a MHC-peptide complex to one or more T cell receptors may generally be defined as a kind of a proxy for the recognition of a given peptide-MHC molecule by a matching T cell receptor.

The score for binding of a modified peptide when present in a MHC-peptide complex to one or more T cell receptors may be ascertained by ascertaining the physico-chemical differences between the modified and the non-modified amino acid. For example, substitution matrices may be used. Such matrices describe the rate at which one amino acid in a sequence changes to other amino acid states over time.

For example, log-odds matrices such as evolutionary based log-odds matrices may be used: a substitution with a low log odds score has a better chance of finding a matching T cell receptor from the pool of (unknown) T cell receptor molecules than a substitution with a high log odds score (due to negative selection of T cell receptors matching non-modified peptides). However, there are other ways of ascertaining this score, for example, considering the position of the mutation in the peptide (some positions may have a lower impact on binding than others), taking into account the nearest neighbors of the substituted amino acid (which could impact the secondary structure of the substituted amino acid), taking into account the entire peptide sequence, taking into account the complete structural information of the peptide in the MHC molecule, and so on. Ascertaining the score could also involve determination of a T cell receptor repertoire (such as the T cell receptor repertoire of a patient or a subset thereof), e.g. via NGS, and performing docking simulations of T cell receptor-peptide-MHC complexes.
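The log-odds approach can be illustrated with a minimal Python sketch. It assumes a single amino acid substitution between two peptides of equal length. The matrix excerpt `TOY_LOG_ODDS`, its values, and the function names `substitution_score` and `t_score` are hypothetical and serve only as illustration; in practice a complete evolutionary based log-odds matrix would be used.

```python
# Hypothetical excerpt of a symmetric log-odds substitution matrix.
# The values are illustrative placeholders, not data from the specification.
TOY_LOG_ODDS = {
    ("A", "A"): 4,
    ("I", "L"): 2,
    ("D", "E"): 2,
    ("G", "W"): -2,
}


def substitution_score(wild_type_aa, mutant_aa, matrix=TOY_LOG_ODDS):
    """Look up the log-odds score of a substitution in a symmetric matrix."""
    for key in ((wild_type_aa, mutant_aa), (mutant_aa, wild_type_aa)):
        if key in matrix:
            return matrix[key]
    raise KeyError(f"no matrix entry for {wild_type_aa}->{mutant_aa}")


def t_score(wild_type_peptide, mutant_peptide, matrix=TOY_LOG_ODDS):
    """Score the single substituted residue between two equal-length peptides."""
    if len(wild_type_peptide) != len(mutant_peptide):
        raise ValueError("peptides must have equal length")
    # Collect the (non-modified, modified) residue pairs that differ.
    diffs = [(w, m) for w, m in zip(wild_type_peptide, mutant_peptide) if w != m]
    if len(diffs) != 1:
        raise ValueError("expected exactly one substituted position")
    return substitution_score(*diffs[0], matrix=matrix)
```

Under the reading given above, a substitution with a lower score (here the hypothetical G to W entry) would be considered more likely to find a matching T cell receptor than a conservative substitution with a higher score (here L to I).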

The present invention also may comprise performing the method of the invention on different peptides comprising the same modification(s) and/or different modifications.

The term "different peptides comprising the same modification(s)" in one embodiment relates to peptides comprising or consisting of different fragments of a modified protein, said different fragments comprising the same modification(s) present in the protein but differing in length and/or position of the modification(s). If a protein has a modification at position x, two or more fragments of said protein each comprising a different sequence window of said protein covering said position x are considered different peptides comprising the same modification(s).
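As an illustration of sequence windows covering a modified position x, the following Python sketch enumerates all fragments of given lengths that contain that position. The function name `windows_covering`, the 0-based indexing of x, and the default lengths of 8 to 11 amino acids are assumptions made for this example only.

```python
def windows_covering(protein, x, lengths=(8, 9, 10, 11)):
    """Return all fragments of the given lengths that contain 0-based position x."""
    if not 0 <= x < len(protein):
        raise ValueError("position x lies outside the protein sequence")
    peptides = []
    for k in lengths:
        # The earliest window ends at x; the latest window starts at x.
        first_start = max(0, x - k + 1)
        last_start = min(x, len(protein) - k)
        for start in range(first_start, last_start + 1):
            peptides.append(protein[start:start + k])
    return peptides
```

For a modification in the interior of a protein, each length k yields up to k windows, each placing the modified residue at a different position within the peptide.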

The term "different peptides comprising different modifications" in one embodiment relates to peptides either of the same and/or differing lengths comprising different modifications of either of the same and/or different proteins. If a protein has modifications at positions x and y, two fragments of said protein each comprising a sequence window of said protein covering either position x or position y are considered different peptides comprising different modifications.

The present invention also may comprise breaking of protein sequences having modifications the immunogenicity of which is to be determined according to the invention into appropriate peptide lengths for MHC binding and ascertaining scores for binding to one or more MHC molecules of different modified peptides comprising the same and/or different modifications of either the same and/or different proteins. Outputs may be ranked and may consist of a list of peptides and their predicted scores, indicating their likelihood of binding.
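Breaking a modified protein sequence into sequence windows that all cover the modified position can be sketched as follows. The default peptide lengths of 8 to 11 residues are an assumption of this sketch (the text only speaks of "appropriate peptide lengths for MHC binding"), and the function name `windows_covering` is hypothetical.

```python
def windows_covering(protein, pos, lengths=(8, 9, 10, 11)):
    """Yield (start, peptide) for every window of the given lengths that
    contains the 0-based position `pos` of the modified residue.

    The default lengths are an illustrative assumption, not a value
    prescribed by the source text.
    """
    for k in lengths:
        first = max(0, pos - k + 1)          # leftmost window still covering pos
        last = min(pos, len(protein) - k)    # rightmost window that fits
        for start in range(first, last + 1):
            yield start, protein[start:start + k]
```

Each yielded peptide is one of the "different peptides comprising the same modification(s)" in the sense defined above, and can be passed on to whatever MHC binding predictor is used.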

The step of ascertaining a score for binding of the non-modified peptide to one or more MHC molecules and/or the step of ascertaining a score for binding of the modified peptide when present in a MHC-peptide complex to one or more T cell receptors may subsequently be performed with all different modified peptides comprising the same and/or different modifications, a subset thereof, e.g. those modified peptides comprising the same and/or different modifications scoring best for binding to one or more MHC molecules, or only with the one modified peptide scoring best for binding to one or more MHC molecules.

Following said further steps, the results may be ranked and may consist of a list of peptides and their predicted scores, indicating their likelihood of being immunogenic.

Preferably, in such ranking, a score for binding of the modified peptide to one or more MHC molecules is weighted higher than a score for binding of the modified peptide when present in a MHC-peptide complex to one or more T cell receptors, preferably a score for the chemical and physical similarities between the non-modified and modified amino acids, and a score for binding of the modified peptide when present in a MHC-peptide complex to one or more T cell receptors, preferably a score for the chemical and physical similarities between the non-modified and modified amino acids, is weighted higher than a score for binding of the non-modified peptide to one or more MHC molecules.
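One simple way to realise this weighting order is a lexicographic sort in which the MHC binding score of the modified peptide is compared first, the T cell receptor (similarity) score second and the MHC binding score of the non-modified peptide last. The sketch below is only one possible reading of "weighted higher"; it assumes that each score has already been oriented so that a smaller value is more favourable, and the names `Candidate` and `rank` are hypothetical.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    peptide: str
    m_mut: float    # score for MHC binding of the modified peptide
    t_score: float  # similarity score between non-modified and modified residue
    m_wt: float     # score for MHC binding of the non-modified peptide


def rank(candidates):
    """Sort candidates with priority m_mut > t_score > m_wt.

    Assumption of this sketch: smaller values are more favourable
    for all three scores.
    """
    return sorted(candidates, key=lambda c: (c.m_mut, c.t_score, c.m_wt))
```

A weighted sum with decreasing coefficients would be another way to express the same preference; the lexicographic form is used here because it needs no choice of numerical weights.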

The amino acid modifications the immunogenicity of which is to be determined according to the present invention may result from mutations in the nucleic acid of a cell. Such mutations may be identified by known sequencing techniques.

In one embodiment, the mutations are cancer specific somatic mutations in a tumor specimen of a cancer patient which may be determined by identifying sequence differences between the genome, exome and/or transcriptome of a tumor specimen and the genome, exome and/or transcriptome of a non-tumorigenous specimen.
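At its simplest, identifying sequence differences between a tumor specimen and a non-tumorigenous specimen amounts to a set difference over variant calls. The sketch below assumes that variants have already been called against a common coordinate system; the `Variant` record and the function `somatic_candidates` are hypothetical names introduced for illustration, and real pipelines apply additional filtering that is not shown.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def somatic_candidates(tumor_variants, normal_variants):
    """Return variants seen in the tumor specimen but not in the
    non-tumorigenous specimen, sorted by chromosome and position."""
    normal = set(normal_variants)
    return sorted(v for v in set(tumor_variants) if v not in normal)
```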

According to the invention a tumor specimen relates to any sample such as a bodily sample derived from a patient containing or being expected of containing tumor or cancer cells. The bodily sample may be any tissue sample such as blood, a tissue sample obtained from the primary tumor or from tumor metastases or any other sample containing tumor or cancer cells.

Preferably, a bodily sample is blood and cancer specific somatic mutations or sequence differences are determined in one or more circulating tumor cells (CTCs) contained in the blood.

In another embodiment, a tumor specimen relates to one or more isolated tumor or cancer cells such as circulating tumor cells (CTCs) or a sample containing one or more isolated tumor or cancer cells such as circulating tumor cells (CTCs).

A non-tumorigenous specimen relates to any sample such as a bodily sample derived from a patient or another individual which preferably is of the same species as the patient, preferably a healthy individual not containing or not being expected of containing tumor or cancer cells. The bodily sample may be any tissue sample such as blood or a sample from a non-tumorigenous tissue.

The invention may involve the determination of the cancer mutation signature of a patient. The term "cancer mutation signature" may refer to all cancer mutations present in one or more cancer cells of a patient or it may refer to only a portion of the cancer mutations present in one or more cancer cells of a patient. Accordingly, the present invention may involve the identification of all cancer specific mutations present in one or more cancer cells of a patient or it may involve the identification of only a portion of the cancer specific mutations present in one or more cancer cells of a patient. Generally, the methods of the invention provide for the identification of a number of mutations which provides a sufficient number of modifications or modified peptides to be included in the methods of the invention.

Preferably, the mutations identified according to the present invention are non-synonymous mutations, preferably non-synonymous mutations of proteins expressed in a tumor or cancer cell.

In one embodiment, cancer specific somatic mutations or sequence differences are determined in the genome, preferably the entire genome, of a tumor specimen. Thus, the invention may comprise identifying the cancer mutation signature of the genome, preferably the entire genome of one or more cancer cells. In one embodiment, the step of identifying cancer specific somatic mutations in a tumor specimen of a cancer patient comprises identifying the genome-wide cancer mutation profile.

In one embodiment, cancer specific somatic mutations or sequence differences are determined in the exome, preferably the entire exome, of a tumor specimen. Thus, the invention may comprise identifying the cancer mutation signature of the exome, preferably the entire exome of one or more cancer cells. In one embodiment, the step of identifying cancer specific somatic mutations in a tumor specimen of a cancer patient comprises identifying the exome-wide cancer mutation profile.

In one embodiment, cancer specific somatic mutations or sequence differences are determined in the transcriptome, preferably the entire transcriptome, of a tumor specimen. Thus, the invention may comprise identifying the cancer mutation signature of the transcriptome, preferably the entire transcriptome of one or more cancer cells. In one embodiment, the step of identifying cancer specific somatic mutations in a tumor specimen of a cancer patient comprises identifying the transcriptome-wide cancer mutation profile.

In one embodiment, the step of identifying cancer specific somatic mutations or identifying sequence differences comprises single cell sequencing of one or more, preferably 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or even more cancer cells. Thus, the invention may comprise identifying a cancer mutation signature of said one or more cancer cells. In one embodiment, the cancer cells are circulating tumor cells. The cancer cells such as the circulating tumor cells may be isolated prior to single cell sequencing.

In one embodiment, the step of identifying cancer specific somatic mutations or identifying sequence differences involves using next generation sequencing (NGS).

In one embodiment, the step of identifying cancer specific somatic mutations or identifying sequence differences comprises sequencing genomic DNA and/or RNA of the tumor specimen.

To reveal cancer specific somatic mutations or sequence differences the sequence information obtained from the tumor specimen is preferably compared with a reference such as sequence information obtained from sequencing nucleic acid such as DNA or RNA of normal non-cancerous cells such as germline cells which may either be obtained from the patient or a different individual. In one embodiment, normal genomic germline DNA is obtained from peripheral blood mononuclear cells (PBMCs).

The term "genome" relates to the total amount of genetic information in the chromosomes of an organism or a cell.

The term "exome" refers to part of the genome of an organism formed by exons, which are coding portions of expressed genes. The exome provides the genetic blueprint used in the synthesis of proteins and other functional gene products. It is the most functionally relevant part of the genome and, therefore, it is most likely to contribute to the phenotype of an organism. The exome of the human genome is estimated to comprise 1.5% of the total genome (Ng, PC et al., PLoS Gen., 4(8): 1-15, 2008).

The term "transcriptome" relates to the set of all RNA molecules, including mRNA, rRNA, tRNA, and other non-coding RNA produced in one cell or a population of cells. In context of the present invention the transcriptome means the set of all RNA molecules produced in one cell, a population of cells, preferably a population of cancer cells, or all cells of a given individual at a certain time point.

A "nucleic acid" is according to the invention preferably deoxyribonucleic acid (DNA) or ribonucleic acid (RNA), more preferably RNA, most preferably in vitro transcribed RNA (IVT RNA) or synthetic RNA. Nucleic acids include according to the invention genomic DNA, cDNA, mRNA, recombinantly produced and chemically synthesized molecules. According to the invention, a nucleic acid may be present as a single-stranded or double-stranded and linear or covalently circularly closed molecule. A nucleic acid can, according to the invention, be isolated.

The term "isolated nucleic acid" means, according to the invention, that the nucleic acid (i) was amplified in vitro, for example via polymerase chain reaction (PCR), (ii) was produced recombinantly by cloning, (iii) was purified, for example, by cleavage and separation by gel electrophoresis, or (iv) was synthesized, for example, by chemical synthesis. A nucleic acid can be employed for introduction into, i.e. transfection of, cells, in particular, in the form of RNA which can be prepared by in vitro transcription from a DNA template. The RNA can moreover be modified before application by stabilizing sequences, capping, and polyadenylation.

The term "genetic material" refers to isolated nucleic acid, either DNA or RNA, a section of a double helix, a section of a chromosome, or an organism's or cell's entire genome, in particular its exome or transcriptome.

The term "mutation" refers to a change of or difference in the nucleic acid sequence (nucleotide substitution, addition or deletion) compared to a reference. A "somatic mutation" can occur in any of the cells of the body except the germ cells (sperm and egg) and therefore is not passed on to children. These alterations can (but do not always) cause cancer or other diseases.

Preferably a mutation is a non-synonymous mutation. The term "non-synonymous mutation" refers to a mutation, preferably a nucleotide substitution, which does result in an amino acid change such as an amino acid substitution in the translation product.

According to the invention, the term "mutation" includes point mutations, indels, fusions, chromothripsis and RNA edits.

According to the invention, the term "indel" describes a special mutation class, defined as a mutation resulting in a colocalized insertion and deletion and a net gain or loss in nucleotides. In coding regions of the genome, unless the length of an indel is a multiple of 3, they produce a frameshift mutation. Indels can be contrasted with a point mutation; where an indel inserts and deletes nucleotides from a sequence, a point mutation is a form of substitution that replaces one of the nucleotides.
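The multiple-of-3 rule for indels in coding regions is straightforward to express in code. The sketch below assumes the indel lies entirely within a coding region and is described by its reference and alternative alleles; the function names are hypothetical.

```python
def net_length_change(ref_allele, alt_allele):
    """Net gain (positive) or loss (negative) in nucleotides."""
    return len(alt_allele) - len(ref_allele)


def causes_frameshift(ref_allele, alt_allele):
    """True if the net length change is not a multiple of 3.

    Assumes the indel lies entirely within a coding region.
    """
    return net_length_change(ref_allele, alt_allele) % 3 != 0
```

For example, deleting one nucleotide shifts the reading frame, while deleting a full codon removes one amino acid without shifting it.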

Fusions can generate hybrid genes formed from two previously separate genes. It can occur as the result of a translocation, interstitial deletion, or chromosomal inversion. Often, fusion genes are oncogenes. Oncogenic fusion genes may lead to a gene product with a new or different function from the two fusion partners. Alternatively, a proto-oncogene is fused to a strong promoter, and thereby the oncogenic function is set to function by an upregulation caused by the strong promoter of the upstream fusion partner. Oncogenic fusion transcripts may also be caused by trans-splicing or read-through events.

According to the invention, the term "chromothripsis" refers to a genetic phenomenon by which specific regions of the genome are shattered and then stitched together via a single devastating event.

According to the invention, the term "RNA edit" or "RNA editing" refers to molecular processes in which the information content in an RNA molecule is altered through a chemical change in the base makeup. RNA editing includes nucleoside modifications such as cytidine (C) to uridine (U) and adenosine (A) to inosine (I) deaminations, as well as non-templated nucleotide additions and insertions. RNA editing in mRNAs effectively alters the amino acid sequence of the encoded protein so that it differs from that predicted by the genomic DNA sequence.

The term "cancer mutation signature" refers to a set of mutations which are present in cancer cells when compared to non-cancerous reference cells.

According to the invention, a "reference" may be used to correlate and compare the results obtained in the methods of the invention from a tumor specimen. Typically the "reference" may be obtained on the basis of one or more normal specimens, in particular specimens which are not affected by a cancer disease, either obtained from a patient or one or more different individuals, preferably healthy individuals, in particular individuals of the same species. A "reference" can be determined empirically by testing a sufficiently large number of normal specimens.

Any suitable sequencing method can be used according to the invention for determining mutations, Next Generation Sequencing (NGS) technologies being preferred. Third Generation Sequencing methods might substitute for the NGS technology in the future to speed up the sequencing step of the method. For clarification purposes: the terms "Next Generation Sequencing" or "NGS" in the context of the present invention mean all novel high throughput sequencing technologies which, in contrast to the "conventional" sequencing methodology known as Sanger chemistry, read nucleic acid templates randomly in parallel along the entire genome by breaking the entire genome into small pieces. Such NGS technologies (also known as massively parallel sequencing technologies) are able to deliver nucleic acid sequence information of a whole genome, exome, transcriptome (all transcribed sequences of a genome) or methylome (all methylated sequences of a genome) in very short time periods, e.g. within 1-2 weeks, preferably within 1-7 days or most preferably within less than 24 hours and allow, in principle, single cell sequencing approaches. Multiple NGS platforms which are commercially available or which are mentioned in the literature can be used in the context of the present invention, e.g. those described in detail in Zhang et al. 2011: The impact of next-generation sequencing on genomics. J. Genet Genomics 38 (3), 95-109; or in Voelkerding et al. 2009: Next generation sequencing: From basic research to diagnostics. Clinical chemistry 55, 641-658. Non-limiting examples of such NGS technologies/platforms are

1) The sequencing-by-synthesis technology known as pyrosequencing implemented e.g. in the GS-FLX 454 Genome Sequencer™ of Roche-associated company 454 Life Sciences (Branford, Connecticut), first described in Ronaghi et al. 1998: A sequencing method based on real-time pyrophosphate. Science 281 (5375), 363-365. This technology uses an emulsion PCR in which single-stranded DNA binding beads are encapsulated by vigorous vortexing into aqueous micelles containing PCR reactants surrounded by oil for emulsion PCR amplification. During the pyrosequencing process, light emitted from phosphate molecules during nucleotide incorporation is recorded as the polymerase synthesizes the DNA strand.

2) The sequencing-by-synthesis approaches developed by Solexa (now part of Illumina Inc., San Diego, California) which is based on reversible dye-terminators and implemented e.g. in the Illumina/Solexa Genome Analyzer™ and in the Illumina HiSeq 2000 Genome Analyzer™. In this technology, all four nucleotides are added simultaneously into oligo-primed cluster fragments in flow-cell channels along with DNA polymerase. Bridge amplification extends cluster strands with all four fluorescently labeled nucleotides for sequencing.

3) Sequencing-by-ligation approaches, e.g. implemented in the SOLiD™ platform of Applied Biosystems (now Life Technologies Corporation, Carlsbad, California). In this technology, a pool of all possible oligonucleotides of a fixed length are labeled according to the sequenced position. Oligonucleotides are annealed and ligated; the preferential ligation by DNA ligase for matching sequences results in a signal informative of the nucleotide at that position. Before sequencing, the DNA is amplified by emulsion PCR. The resulting beads, each containing only copies of the same DNA molecule, are deposited on a glass slide. As a second example, the Polonator™ G.007 platform of Dover Systems (Salem, New Hampshire) also employs a sequencing-by-ligation approach by using a randomly arrayed, bead-based, emulsion PCR to amplify DNA fragments for parallel sequencing.

4) Single-molecule sequencing technologies such as e.g. implemented in the PacBio RS system of Pacific Biosciences (Menlo Park, California) or in the HeliScope™ platform of Helicos Biosciences (Cambridge, Massachusetts). The distinct characteristic of this technology is its ability to sequence single DNA or RNA molecules without amplification, defined as Single-Molecule Real Time (SMRT) DNA sequencing. For example, HeliScope uses a highly sensitive fluorescence detection system to directly detect each nucleotide as it is synthesized. A similar approach based on fluorescence resonance energy transfer (FRET) has been developed from Visigen Biotechnology (Houston, Texas). Other fluorescence-based single-molecule techniques are from U.S. Genomics (GeneEngine™) and Genovoxx (AnyGene™).

5) Nano-technologies for single-molecule sequencing in which various nanostructures are used which are e.g. arranged on a chip to monitor the movement of a polymerase molecule on a single strand during replication. Non-limiting examples for approaches based on nano-technologies are the GridON™ platform of Oxford Nanopore Technologies (Oxford, UK), the hybridization-assisted nano-pore sequencing (HANS™) platforms developed by Nabsys (Providence, Rhode Island), and the proprietary ligase-based DNA sequencing platform with DNA nanoball (DNB) technology called combinatorial probe-anchor ligation (cPAL™).

6) Electron microscopy based technologies for single-molecule sequencing, e.g. those developed by LightSpeed Genomics (Sunnyvale, California) and Halcyon Molecular (Redwood City, California).

7) Ion semiconductor sequencing which is based on the detection of hydrogen ions that are released during the polymerisation of DNA. For example, Ion Torrent Systems (San Francisco, California) uses a high-density array of micro-machined wells to perform this biochemical process in a massively parallel way. Each well holds a different DNA template. Beneath the wells is an ion-sensitive layer and beneath that a proprietary Ion sensor.

Preferably, DNA and RNA preparations serve as starting material for NGS. Such nucleic acids can be easily obtained from samples such as biological material, e.g. from fresh, flash-frozen or formalin-fixed paraffin embedded tumor tissues (FFPE) or from freshly isolated cells or from CTCs which are present in the peripheral blood of patients. Normal non-mutated genomic DNA or RNA can be extracted from normal, somatic tissue, however germline cells are preferred in the context of the present invention. Germline DNA or RNA may be extracted from peripheral blood mononuclear cells (PBMCs) in patients with non-hematological malignancies. Although nucleic acids extracted from FFPE tissues or freshly isolated single cells are highly fragmented, they are suitable for NGS applications.

Several targeted NGS methods for exome sequencing are described in the literature (for review see e.g. Teer and Mullikin 2010: Human Mol Genet 19 (2), R145-51), all of which can be used in conjunction with the present invention. Many of these methods (described e.g. as genome capture, genome partitioning, genome enrichment etc.) use hybridization techniques and include array-based (e.g. Hodges et al. 2007: Nat. Genet. 39, 1522-1527) and liquid-based (e.g. Choi et al. 2009: Proc. Natl. Acad. Sci USA 106, 19096-19101) hybridization approaches. Commercial kits for DNA sample preparation and subsequent exome capture are also available: for example, Illumina Inc. (San Diego, California) offers the TruSeq™ DNA Sample Preparation Kit and the Exome Enrichment Kit TruSeq™ Exome Enrichment Kit.

In order to reduce the number of false positive findings in detecting cancer specific somatic mutations or sequence differences when comparing e.g. the sequence of a tumor sample to the sequence of a reference sample such as the sequence of a germ line sample it is preferred to determine the sequence in replicates of one or both of these sample types. Thus, it is preferred that the sequence of a reference sample such as the sequence of a germ line sample is determined twice, three times or more. Alternatively or additionally, the sequence of a tumor sample is determined twice, three times or more. It may also be possible to determine the sequence of a reference sample such as the sequence of a germ line sample and/or the sequence of a tumor sample more than once by determining at least once the sequence in genomic DNA and determining at least once the sequence in RNA of said reference sample and/or of said tumor sample. For example, by determining the variations between replicates of a reference sample such as a germ line sample the expected rate of false positive (FDR) somatic mutations as a statistical quantity can be estimated. Technical repeats of a sample should generate identical results and any detected mutation in this "same vs. same comparison" is a false positive. In particular, to determine the false discovery rate for somatic mutation detection in a tumor sample relative to a reference sample, a technical repeat of the reference sample can be used as a reference to estimate the number of false positives. Furthermore, various quality related metrics (e.g. coverage or SNP quality) may be combined into a single quality score using a machine learning approach. For a given somatic variation all other variations with an exceeding quality score may be counted, which enables a ranking of all variations in a dataset.
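The "same vs. same" idea can be illustrated with a very small calculation. The sketch below is a simplification under stated assumptions: it treats the number of calls in the replicate comparison as the expected number of false positives in the tumor comparison and ignores any normalisation for differences in coverage between the comparisons. The function names are hypothetical.

```python
def estimate_fdr(same_vs_same_calls, tumor_vs_reference_calls):
    """Rough false discovery rate estimate.

    same_vs_same_calls: mutations called when a technical repeat of the
        reference sample is compared against the reference sample
        (all of these are false positives by construction).
    tumor_vs_reference_calls: mutations called in tumor vs. reference.

    Simplifying assumption: no coverage normalisation is applied.
    """
    if tumor_vs_reference_calls == 0:
        return 0.0
    return min(1.0, same_vs_same_calls / tumor_vs_reference_calls)


def count_exceeding(quality_scores, score):
    """Count variations whose quality score exceeds the given score."""
    return sum(1 for q in quality_scores if q > score)
```

Applying `count_exceeding` to both comparisons at the quality score of a given variation would give a per-variation estimate, which is one way to obtain the ranking mentioned in the text.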

In the context of the present invention, the term "RNA" relates to a molecule which comprises at least one ribonucleotide residue and preferably being entirely or substantially composed of ribonucleotide residues. "Ribonucleotide" relates to a nucleotide with a hydroxyl group at the 2'-position of a β-D-ribofuranosyl group. The term "RNA" comprises double-stranded RNA, single-stranded RNA, isolated RNA such as partially or completely purified RNA, essentially pure RNA, synthetic RNA, and recombinantly generated RNA such as modified RNA which differs from naturally occurring RNA by addition, deletion, substitution and/or alteration of one or more nucleotides. Such alterations can include addition of non-nucleotide material, such as to the end(s) of a RNA or internally, for example at one or more nucleotides of the RNA. Nucleotides in RNA molecules can also comprise non-standard nucleotides, such as non-naturally occurring nucleotides or chemically synthesized nucleotides or deoxynucleotides. These altered RNAs can be referred to as analogs of naturally-occurring RNA.

According to the present invention, the term "RNA" includes and preferably relates to "mRNA".

The term "mRNA" means "messenger-RNA" and relates to a "transcript" which is generated by using a DNA template and encodes a peptide or polypeptide. Typically, an mRNA comprises a 5'-UTR, a protein coding region, and a 3'-UTR. mRNA only possesses limited half-life in cells and in vitro. In the context of the present invention, mRNA may be generated by in vitro transcription from a DNA template. The in vitro transcription methodology is known to the skilled person. For example, there is a variety of in vitro transcription kits commercially available.

According to the invention, the stability and translation efficiency of RNA may be modified as required. For example, RNA may be stabilized and its translation increased by one or more modifications having a stabilizing effect and/or increasing translation efficiency of RNA. Such modifications are described, for example, in incorporated herein by reference. In order to increase expression of the RNA used according to the present invention, it may be modified within the coding region, i.e. the sequence encoding the expressed peptide or protein, preferably without altering the sequence of the expressed peptide or protein, so as to increase the GC-content to increase mRNA stability and to perform a codon optimization and, thus, enhance translation in cells.
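Increasing the GC-content of a coding region without altering the encoded peptide can be sketched by swapping each codon for a synonymous codon with the highest G+C count. This is an illustrative simplification: it uses the standard genetic code, ignores organism-specific codon usage (so it is not a full codon optimization), and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated with base order U, C, A, G.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def _codons(rna):
    if len(rna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [rna[i:i + 3] for i in range(0, len(rna), 3)]


def translate(rna):
    """Translate an RNA coding sequence ('*' marks a stop codon)."""
    return "".join(CODON_TABLE[c] for c in _codons(rna))


def gc_content(rna):
    """Fraction of G and C nucleotides in the sequence."""
    return sum(b in "GC" for b in rna) / len(rna)


def gc_enrich(rna):
    """Replace each codon by a synonymous codon with maximal G+C count."""
    out = []
    for codon in _codons(rna):
        aa = CODON_TABLE[codon]
        synonyms = [c for c, a in CODON_TABLE.items() if a == aa]
        out.append(max(synonyms, key=lambda c: sum(b in "GC" for b in c)))
    return "".join(out)
```

The encoded peptide is unchanged by construction, because every replacement codon is drawn from the codons that translate to the same amino acid.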

The term "modification" in the context of the RNA used in the present invention includes any modification of an RNA which is not naturally present in said RNA.

In one embodiment of the invention, the RNA used according to the invention does not have uncapped 5'-triphosphates. Removal of such uncapped 5'-triphosphates can be achieved by treating RNA with a phosphatase.

The RNA according to the invention may have modified ribonucleotides in order to increase its stability and/or decrease cytotoxicity. For example, in one embodiment, in the RNA used according to the invention 5-methylcytidine is substituted partially or completely, preferably completely, for cytidine. Alternatively or additionally, in one embodiment, in the RNA used according to the invention pseudouridine is substituted partially or completely, preferably completely, for uridine.

In one embodiment, the term "modification" relates to providing an RNA with a 5'-cap or 5'-cap analog. The term "5'-cap" refers to a cap structure found on the 5'-end of an mRNA molecule and generally consists of a guanosine nucleotide connected to the mRNA via an unusual 5' to 5' triphosphate linkage. In one embodiment, this guanosine is methylated at the 7-position. The term "conventional 5'-cap" refers to a naturally occurring RNA 5'-cap, preferably to the 7-methylguanosine cap (m7G). In the context of the present invention, the term "5'-cap" includes a 5'-cap analog that resembles the RNA cap structure and is modified to possess the ability to stabilize RNA and/or enhance translation of RNA if attached thereto, preferably in vivo and/or in a cell.

Providing an RNA with a 5'-cap or 5'-cap analog may be achieved by in vitro transcription of a DNA template in presence of said 5'-cap or 5'-cap analog, wherein said 5'-cap is co-transcriptionally incorporated into the generated RNA strand, or the RNA may be generated, for example, by in vitro transcription, and the 5'-cap may be attached to the RNA post-transcriptionally using capping enzymes, for example, capping enzymes of vaccinia virus.

The RNA may comprise further modifications. For example, a further modification of the RNA used in the present invention may be an extension or truncation of the naturally occurring poly(A) tail or an alteration of the 5'- or 3'-untranslated regions (UTR) such as introduction of a UTR which is not related to the coding region of said RNA, for example, the exchange of the existing 3'-UTR with or the insertion of one or more, preferably two copies of a 3'-UTR derived from a globin gene, such as alpha2-globin, alpha1-globin, beta-globin, preferably beta-globin, more preferably human beta-globin.

RNA having an unmasked poly-A sequence is translated more efficiently than RNA having a masked poly-A sequence. The term "poly(A) tail" or "poly-A sequence" relates to a sequence of adenyl (A) residues which typically is located on the 3'-end of a RNA molecule and "unmasked poly-A sequence" means that the poly-A sequence at the 3' end of an RNA molecule ends with an A of the poly-A sequence and is not followed by nucleotides other than A located at the 3' end, i.e. downstream, of the poly-A sequence. Furthermore, a long poly-A sequence of about 120 base pairs results in an optimal transcript stability and translation efficiency of RNA.

Therefore, in order to increase stability and/or expression of the RNA used according to the present invention, it may be modified so as to be present in conjunction with a poly-A sequence, preferably having a length of 10 to 500, more preferably 30 to 300, even more preferably 65 to 200 and especially 100 to 150 adenosine residues. In an especially preferred embodiment the poly-A sequence has a length of approximately 120 adenosine residues. To further increase stability and/or expression of the RNA used according to the invention, the poly-A sequence can be unmasked.
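Whether a given RNA sequence ends in an unmasked poly-A sequence, and whether its length falls in one of the ranges named above, can be checked with a short helper. This is a minimal sketch with hypothetical function names; it simply counts the run of A residues at the 3' end and therefore cannot distinguish tail residues from an A-rich end of the 3'-UTR.

```python
def poly_a_tail_length(rna):
    """Length of the uninterrupted run of A residues at the 3' end."""
    return len(rna) - len(rna.rstrip("A"))


def has_unmasked_poly_a(rna, min_length=10):
    """True if the sequence ends with at least `min_length` A residues,
    i.e. the poly-A run is not followed by any other nucleotide."""
    return poly_a_tail_length(rna) >= min_length


def in_range(length, low=100, high=150):
    """Check a tail length against a range (default: 100 to 150)."""
    return low <= length <= high
```

A masked construct, in which the poly-A sequence is followed by further nucleotides, yields a trailing run of length zero and fails the check.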

In addition, incorporation of a 3'-non translated region (UTR) into the 3'-non translated region of an RNA molecule can result in an enhancement in translation efficiency. A synergistic effect may be achieved by incorporating two or more of such 3'-non translated regions. The 3'-non translated regions may be autologous or heterologous to the RNA into which they are introduced. In one particular embodiment the 3'-non translated region is derived from the human β-globin gene.

A combination of the above described modifications, i.e. incorporation of a poly-A sequence, unmasking of a poly-A sequence and incorporation of one or more 3'-non translated regions, has a synergistic influence on the stability of RNA and increase in translation efficiency.

The term "stability" of RNA relates to the "half-life" of RNA. "Half-life" relates to the period of time which is needed to eliminate half of the activity, amount, or number of molecules. In the context of the present invention, the half-life of an RNA is indicative for the stability of said RNA. The half-life of RNA may influence the "duration of expression" of the RNA. It can be expected that RNA having a long half-life will be expressed for an extended time period.
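The definition of half-life can be made concrete with a one-line calculation. The sketch assumes simple first-order (exponential) decay, which is an assumption of this example and is not stated in the text; the function name is hypothetical.

```python
def fraction_remaining(elapsed, half_life):
    """Fraction of RNA molecules left after `elapsed` time units,
    assuming first-order decay (an assumption of this sketch)."""
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    return 0.5 ** (elapsed / half_life)
```

After one half-life half of the molecules remain and after two half-lives a quarter, so an RNA with a longer half-life retains a larger fraction at any given time point.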

Of course, if according to the present invention it is desired to decrease stability and/or translation efficiency of RNA, it is possible to modify RNA so as to interfere with the function of elements as described above increasing the stability and/or translation efficiency of RNA.

The term "expression" is used according to the invention in its most general meaning and comprises the production of RNA and/or peptides, polypeptides or proteins, e.g. by transcription and/or translation. With respect to RNA, the term "expression" or "translation" relates in particular to the production of peptides, polypeptides or proteins. It also comprises partial expression of nucleic acids. Moreover, expression can be transient or stable.

According to the invention, the term expression also includes an "aberrant expression" or "abnormal expression". "Aberrant expression" or "abnormal expression" means according to the invention that expression is altered, preferably increased, compared to a reference, e.g. a state in a subject not having a disease associated with aberrant or abnormal expression of a certain protein, e.g., a tumor antigen. An increase in expression refers to an increase by at least 10%, in particular at least 20%, at least 50% or at least 100%, or more. In one embodiment, expression is only found in a diseased tissue, while expression in a healthy tissue is repressed.

The term "specifically expressed" means that a protein is essentially only expressed in a specific tissue or organ. For example, a tumor antigen specifically expressed in gastric mucosa means that said protein is primarily expressed in gastric mucosa and is not expressed in other tissues or is not expressed to a significant extent in other tissue or organ types. Thus, a protein that is exclusively expressed in cells of the gastric mucosa and to a significantly lesser extent in any other tissue, such as testis, is specifically expressed in cells of the gastric mucosa. In some embodiments, a tumor antigen may also be specifically expressed under normal conditions in more than one tissue type or organ, such as in 2 or 3 tissue types or organs, but preferably in not more than 3 different tissue or organ types. In this case, the tumor antigen is then specifically expressed in these organs. For example, if a tumor antigen is expressed under normal conditions preferably to an approximately equal extent in lung and stomach, said tumor antigen is specifically expressed in lung and stomach.

In the context of the present invention, the term "transcription" relates to a process, wherein the genetic code in a DNA sequence is transcribed into RNA. Subsequently, the RNA may be translated into protein. According to the present invention, the term "transcription" comprises "in vitro transcription", wherein the term "in vitro transcription" relates to a process wherein RNA, in particular mRNA, is in vitro synthesized in a cell-free system, preferably using appropriate cell extracts. Preferably, cloning vectors are applied for the generation of transcripts. These cloning vectors are generally designated as transcription vectors and are according to the present invention encompassed by the term "vector". According to the present invention, the RNA used in the present invention preferably is in vitro transcribed RNA (IVT-RNA) and may be obtained by in vitro transcription of an appropriate DNA template. The promoter for controlling transcription can be any promoter for any RNA polymerase. Particular examples of RNA polymerases are the T7, T3, and SP6 RNA polymerases. Preferably, the in vitro transcription according to the invention is controlled by a T7 or SP6 promoter. A DNA template for in vitro transcription may be obtained by cloning of a nucleic acid, in particular cDNA, and introducing it into an appropriate vector for in vitro transcription. The cDNA may be obtained by reverse transcription of RNA.

The term "translation" according to the invention relates to the process in the ribosomes of a cell by which a strand of messenger RNA directs the assembly of a sequence of amino acids to make a peptide, polypeptide or protein.

Expression control sequences or regulatory sequences, which according to the invention may be linked functionally with a nucleic acid, can be homologous or heterologous with respect to the nucleic acid. A coding sequence and a regulatory sequence are linked together "functionally" if they are bound together covalently, so that the transcription or translation of the coding sequence is under the control or under the influence of the regulatory sequence. If the coding sequence is to be translated into a functional protein, with functional linkage of a regulatory sequence with the coding sequence, induction of the regulatory sequence leads to a transcription of the coding sequence, without causing a reading frame shift in the coding sequence or inability of the coding sequence to be translated into the desired protein or peptide.

The term "expression control sequence" or "regulatory sequence" comprises, according to the invention, promoters, ribosome-binding sequences and other control elements, which control the transcription of a nucleic acid or the translation of the derived RNA. In certain embodiments of the invention, the regulatory sequences can be controlled. The precise structure of regulatory sequences can vary depending on the species or depending on the cell type, but generally comprises 5’-untranscribed and 5’- and 3’-untranslated sequences, which are involved in the initiation of transcription or translation, such as TATA-box, capping-sequence, CAAT-sequence and the like. In particular, 5’-untranscribed regulatory sequences comprise a promoter region that includes a promoter sequence for transcriptional control of the functionally bound gene.

Regulatory sequences can also comprise enhancer sequences or upstream activator sequences. Preferably, according to the invention, RNA to be expressed in a cell is introduced into said cell.

In one embodiment of the methods according to the invention, the RNA that is to be introduced into a cell is obtained by in vitro transcription of an appropriate DNA template.

According to the invention, terms such as "RNA capable of expressing" and "RNA encoding" are used interchangeably herein and with respect to a particular peptide or polypeptide mean that the RNA, if present in the appropriate environment, preferably within a cell, can be expressed to produce said peptide or polypeptide. Preferably, RNA according to the invention is able to interact with the cellular translation machinery to provide the peptide or polypeptide it is capable of expressing.

Terms such as "transferring", "introducing" or "transfecting" are used interchangeably herein and relate to the introduction of nucleic acids, in particular exogenous or heterologous nucleic acids, in particular RNA into a cell. According to the present invention, the cell can form part of an organ, a tissue and/or an organism. According to the present invention, the administration of a nucleic acid is either achieved as naked nucleic acid or in combination with an administration reagent. Preferably, administration of nucleic acids is in the form of naked nucleic acids. Preferably, the RNA is administered in combination with stabilizing substances such as RNase inhibitors. The present invention also envisions the repeated introduction of nucleic acids into cells to allow sustained expression for extended time periods.

Cells can be transfected with any carriers with which RNA can be associated, e.g. by forming complexes with the RNA or forming vesicles in which the RNA is enclosed or encapsulated, resulting in increased stability of the RNA compared to naked RNA. Carriers useful according to the invention include, for example, lipid-containing carriers such as cationic lipids, liposomes, in particular cationic liposomes, and micelles, and nanoparticles. Cationic lipids may form complexes with negatively charged nucleic acids. Any cationic lipid may be used according to the invention.

Preferably, the introduction of RNA which encodes a peptide or polypeptide into a cell, in particular into a cell present in vivo, results in expression of said peptide or polypeptide in the cell. In particular embodiments, the targeting of the nucleic acids to particular cells is preferred.

In such embodiments, a carrier which is applied for the administration of the nucleic acid to a cell (for example, a retrovirus or a liposome), exhibits a targeting molecule. For example, a molecule such as an antibody which is specific for a surface membrane protein on the target cell or a ligand for a receptor on the target cell may be incorporated into the nucleic acid carrier or may be bound thereto. In case the nucleic acid is administered by liposomes, proteins which bind to a surface membrane protein which is associated with endocytosis may be incorporated into the liposome formulation in order to enable targeting and/or uptake. Such proteins encompass capsid proteins or fragments thereof which are specific for a particular cell type, antibodies against proteins which are internalized, proteins which target an intracellular location etc.

The term "cell" or "host cell" preferably is an intact cell, i.e. a cell with an intact membrane that has not released its normal intracellular components such as enzymes, organelles, or genetic material. An intact cell preferably is a viable cell, i.e. a living cell capable of carrying out its normal metabolic functions. Preferably said term relates according to the invention to any cell which can be transformed or transfected with an exogenous nucleic acid. The term "cell" includes according to the invention prokaryotic cells (e.g., E. coli) or eukaryotic cells (e.g., dendritic cells, B cells, CHO cells, COS cells, K562 cells, HEK293 cells, HELA cells, yeast cells, and insect cells). The exogenous nucleic acid may be found inside the cell (i) freely dispersed as such, (ii) incorporated in a recombinant vector, or (iii) integrated into the host cell genome or mitochondrial DNA. Mammalian cells are particularly preferred, such as cells from humans, mice, hamsters, pigs, goats, and primates. The cells may be derived from a large number of tissue types and include primary cells and cell lines. Specific examples include keratinocytes, peripheral blood leukocytes, bone marrow stem cells, and embryonic stem cells. In further embodiments, the cell is an antigen-presenting cell, in particular a dendritic cell, a monocyte, or macrophage.

A cell which comprises a nucleic acid molecule preferably expresses the peptide or polypeptide encoded by the nucleic acid.

The term "clonal expansion" refers to a process wherein a specific entity is multiplied. In the context of the present invention, the term is preferably used in the context of an immunological response in which lymphocytes are stimulated by an antigen, proliferate, and the specific lymphocyte recognizing said antigen is amplified. Preferably, clonal expansion leads to differentiation of the lymphocytes.

Terms such as "reducing" or "inhibiting" relate to the ability to cause an overall decrease, preferably of 5% or greater, 10% or greater, 20% or greater, more preferably of 50% or greater, and most preferably of 75% or greater, in the level. The term "inhibit" or similar phrases includes a complete or essentially complete inhibition, i.e. a reduction to zero or essentially to zero.

Terms such as "increasing", "enhancing", "promoting" or "prolonging" preferably relate to an increase, enhancement, promotion or prolongation by about at least 10%, preferably at least 20%, preferably at least 30%, preferably at least 40%, preferably at least 50%, preferably at least 80%, preferably at least 100%, preferably at least 200% and in particular at least 300%. These terms may also relate to an increase, enhancement, promotion or prolongation from zero or a non-measurable or non-detectable level to a level of more than zero or a level which is measurable or detectable.

The present invention provides vaccines such as cancer vaccines designed on the basis of amino acid modifications or modified peptides predicted as being immunogenic by the methods of the present invention.

According to the invention, the term "vaccine" relates to a pharmaceutical preparation (pharmaceutical composition) or product that upon administration induces an immune response, in particular a cellular immune response, which recognizes and attacks a pathogen or a diseased cell such as a cancer cell. A vaccine may be used for the prevention or treatment of a disease.

The term "personalized cancer vaccine" or "individualized cancer vaccine" concerns a particular cancer patient and means that a cancer vaccine is adapted to the needs or special circumstances of an individual cancer patient.

In one embodiment, a vaccine provided according to the invention may comprise a peptide or polypeptide comprising one or more amino acid modifications or one or more modified peptides predicted as being immunogenic by the methods of the invention or a nucleic acid, preferably RNA, encoding said peptide or polypeptide.

The cancer vaccines provided according to the invention when administered to a patient provide one or more T cell epitopes suitable for stimulating, priming and/or expanding T cells specific for the patient's tumor. The T cells are preferably directed against cells expressing antigens from which the T cell epitopes are derived. Thus, the vaccines described herein are preferably capable of inducing or promoting a cellular response, preferably cytotoxic T cell activity, against a cancer disease characterized by presentation of one or more tumor-associated neoantigens with class I MHC. Since a vaccine provided according to the present invention will target cancer specific mutations it will be specific for the patient's tumor.

A vaccine provided according to the invention relates to a vaccine which when administered to a patient preferably provides one or more T cell epitopes, such as 2 or more, 5 or more, 10 or more, 15 or more, 20 or more, 25 or more, 30 or more and preferably up to 60, up to 55, up to 50, up to 45, up to 40, up to 35 or up to 30 T cell epitopes, incorporating amino acid modifications or modified peptides predicted as being immunogenic by the methods of the invention. Such T cell epitopes are also termed "neo-epitopes" herein. Presentation of these epitopes by cells of a patient, in particular antigen presenting cells, preferably results in T cells targeting the epitopes when bound to MHC and thus, the patient's tumor, preferably the primary tumor as well as tumor metastases, expressing antigens from which the T cell epitopes are derived and presenting the same epitopes on the surface of the tumor cells.

The methods of the invention may comprise the further step of determining the usability of the identified amino acid modifications or modified peptides for cancer vaccination. Thus further steps can involve one or more of the following: (i) assessing whether the modifications are located in known or predicted MHC presented epitopes, (ii) in vitro and/or in silico testing whether the modifications are located in MHC presented epitopes, e.g. testing whether the modifications are part of peptide sequences which are processed into and/or presented as MHC presented epitopes, and (iii) in vitro testing whether the envisaged modified epitopes, in particular when present in their natural sequence context, e.g. when flanked by amino acid sequences also flanking said epitopes in the naturally occurring protein, and when expressed in antigen presenting cells are able to stimulate T cells such as T cells of the patient having the desired specificity. Such flanking sequences each may comprise 3 or more, 5 or more, 10 or more, 15 or more, 20 or more and preferably up to 50, up to 45, up to 40, up to 35 or up to 30 amino acids and may flank the epitope sequence N-terminally and/or C-terminally.

Modified peptides determined according to the invention may be ranked for their usability as epitopes for cancer vaccination. Thus, in one aspect, the method of the invention comprises a manual or computer-based analytical process in which the identified modified peptides are analyzed and selected for their usability in the respective vaccine to be provided. In a preferred embodiment, said analytical process is a computational algorithm-based process. Preferably, said analytical process comprises determining and/or ranking epitopes according to a prediction of their capacity of being immunogenic.
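
As an illustration only, the computer-based ranking step described above can be sketched as a sort over candidate modified peptides. The `Candidate` type, the `score` field and the convention that a higher score means a higher predicted capacity of being immunogenic are assumptions made for this sketch; they are not taken from the specification, and the peptide strings are placeholders.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Candidate:
    """A modified peptide with a hypothetical predicted-immunogenicity score."""
    peptide: str   # placeholder sequence, not real data
    score: float   # assumption: higher means more likely immunogenic


def rank_candidates(candidates: List[Candidate],
                    top_n: Optional[int] = None) -> List[Candidate]:
    """Return candidates sorted from highest to lowest score.

    If top_n is given, only the top_n best-ranked candidates are returned.
    """
    ranked = sorted(candidates, key=lambda c: c.score, reverse=True)
    return ranked if top_n is None else ranked[:top_n]


if __name__ == "__main__":
    pool = [Candidate("AAA", 0.2), Candidate("BBB", 0.9), Candidate("CCC", 0.5)]
    for cand in rank_candidates(pool, top_n=2):
        print(cand.peptide, cand.score)
```

In practice the score would come from whichever prediction the method uses; the sketch only shows the ranking and selection around it.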

The neo-epitopes identified according to the invention and provided by a vaccine of the invention are preferably present in the form of a polypeptide comprising said neo-epitopes such as a polyepitopic polypeptide or a nucleic acid, in particular RNA, encoding said polypeptide.

Furthermore, the neo-epitopes may be present in the polypeptide in the form of a vaccine sequence, i.e. present in their natural sequence context, e.g. flanked by amino acid sequences also flanking said epitopes in the naturally occurring protein. Such flanking sequences each may comprise 5 or more, 10 or more, 15 or more, 20 or more and preferably up to 50, up to 45, up to 40, up to 35 or up to 30 amino acids and may flank the epitope sequence N-terminally and/or C-terminally. Thus, a vaccine sequence may comprise 20 or more, 25 or more, 30 or more, 35 or more, 40 or more and preferably up to 50, up to 45, up to 40, up to 35 or up to 30 amino acids. In one embodiment, the neo-epitopes and/or vaccine sequences are lined up in the polypeptide head-to-tail.
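
The notion of a vaccine sequence, an epitope together with its naturally occurring flanking residues, can be sketched as a simple window extraction. The function name, the 0-based half-open coordinates and the default flank length of 10 are assumptions chosen for this example; the alphabet string used below is a stand-in for a protein sequence.

```python
def vaccine_sequence(protein: str, epi_start: int, epi_end: int,
                     n_flank: int = 10, c_flank: int = 10) -> str:
    """Return the epitope protein[epi_start:epi_end] with its natural flanks.

    Coordinates are 0-based and half-open. Up to n_flank residues are kept
    N-terminally and up to c_flank residues C-terminally; flanks are
    truncated at the ends of the protein.
    """
    if not 0 <= epi_start < epi_end <= len(protein):
        raise ValueError("epitope coordinates fall outside the protein")
    if n_flank < 0 or c_flank < 0:
        raise ValueError("flank lengths must be non-negative")
    start = max(0, epi_start - n_flank)
    end = min(len(protein), epi_end + c_flank)
    return protein[start:end]


if __name__ == "__main__":
    toy_protein = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # placeholder, not a real protein
    print(vaccine_sequence(toy_protein, 10, 13, n_flank=5, c_flank=5))
```

Truncating the flanks at the protein termini is one possible design choice; an implementation could equally reject epitopes that lie too close to an end.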

In one embodiment, the neo-epitopes and/or vaccine sequences are spaced by linkers, in particular neutral linkers. The term "linker" according to the invention relates to a peptide added between two peptide domains such as epitopes or vaccine sequences to connect said peptide domains. There is no particular limitation regarding the linker sequence. However, it is preferred that the linker sequence reduces steric hindrance between the two peptide domains, is well translated, and supports or allows processing of the epitopes. Furthermore, the linker should have no or only little immunogenic sequence elements. Linkers preferably should not create non-endogenous neo-epitopes like those generated from the junction suture between adjacent neo-epitopes, which might generate unwanted immune reactions. Therefore, the polyepitopic vaccine should preferably contain linker sequences which are able to reduce the number of unwanted MHC binding junction epitopes. Hoyt et al. (EMBO J. 25(8), 1720-9, 2006) and Zhang et al. (J. Biol. Chem. 279(10), 8635-41, 2004) have shown that glycine-rich sequences impair proteasomal processing and thus the use of glycine rich linker sequences acts to minimize the number of linker-contained peptides that can be processed by the proteasome. Furthermore, glycine was observed to inhibit a strong binding in MHC binding groove positions (Abastado et al., J. Immunol. 151(7), 3569-75, 1993). Schlessinger et al. (Proteins, 61(1), 115-26, 2005) had found that amino acids glycine and serine included in an amino acid sequence result in a more flexible protein that is more efficiently translated and processed by the proteasome, enabling better access to the encoded neo-epitopes. The linker each may comprise 3 or more, 6 or more, 9 or more, 10 or more, 15 or more, 20 or more and preferably up to 50, up to 45, up to 40, up to 35 or up to 30 amino acids. Preferably the linker is enriched in glycine and/or serine amino acids. Preferably, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 95% of the amino acids of the linker are glycine and/or serine. In one preferred embodiment, a linker is substantially composed of the amino acids glycine and serine. In one embodiment, the linker comprises the amino acid sequence (GGS)a(GSS)b(GGG)c(SSG)d(GSG)e wherein a, b, c, d and e is independently a number selected from 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 and wherein a + b + c + d + e are different from 0 and preferably are 2 or more, 3 or more, 4 or more or 5 or more. In one embodiment, the linker comprises a sequence as described herein including the linker sequences described in the examples such as the sequence GGSGGGGSG.

In one particularly preferred embodiment, a polypeptide incorporating one or more neo-epitopes such as a polyepitopic polypeptide according to the present invention is administered to a patient in the form of a nucleic acid, preferably RNA such as in vitro transcribed or synthetic RNA, which may be expressed in cells of a patient such as antigen presenting cells to produce the polypeptide. The present invention also envisions the administration of one or more multiepitopic polypeptides which for the purpose of the present invention are comprised by the term "polyepitopic polypeptide", preferably in the form of a nucleic acid, preferably RNA such as in vitro transcribed or synthetic RNA, which may be expressed in cells of a patient such as antigen presenting cells to produce the one or more polypeptides. In the case of an administration of more than one multiepitopic polypeptide the neo-epitopes provided by the different multiepitopic polypeptides may be different or partially overlapping. Once present in cells of a patient such as antigen presenting cells the polypeptide according to the invention is processed to produce the neo-epitopes identified according to the invention.

Administration of a vaccine provided according to the invention may provide MHC class II-presented epitopes that are capable of eliciting a CD4+ helper T cell response against cells expressing antigens from which the MHC presented epitopes are derived. Alternatively or additionally, administration of a vaccine provided according to the invention may provide MHC class I-presented epitopes that are capable of eliciting a CD8+ T cell response against cells expressing antigens from which the MHC presented epitopes are derived. Furthermore, administration of a vaccine provided according to the invention may provide one or more neo-epitopes (including known neo-epitopes and neo-epitopes identified according to the invention) as well as one or more epitopes not containing cancer specific somatic mutations but being expressed by cancer cells and preferably inducing an immune response against cancer cells, preferably a cancer specific immune response. In one embodiment, administration of a vaccine provided according to the invention provides neo-epitopes that are MHC class II-presented epitopes and/or are capable of eliciting a CD4+ helper T cell response against cells expressing antigens from which the MHC presented epitopes are derived as well as epitopes not containing cancer-specific somatic mutations that are MHC class I-presented epitopes and/or are capable of eliciting a CD8+ T cell response against cells expressing antigens from which the MHC presented epitopes are derived. In one embodiment, the epitopes not containing cancer-specific somatic mutations are derived from a tumor antigen. In one embodiment, the neo-epitopes and epitopes not containing cancer-specific somatic mutations have a synergistic effect in the treatment of cancer. Preferably, a vaccine provided according to the invention is useful for polyepitopic stimulation of cytotoxic and/or helper T cell responses.
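
The glycine/serine linker pattern described above lends itself to a short worked example. The sketch below builds a linker from the five repeat counts, checks its glycine/serine content, and joins vaccine sequences head-to-tail with the linker in between. The function names are hypothetical and the input sequences in the demo are placeholders; only the repeat units, the 0 to 20 range and the requirement that the counts do not all equal zero are taken from the text.

```python
from typing import Iterable

# Repeat units in the order given in the text: (GGS)a(GSS)b(GGG)c(SSG)d(GSG)e
REPEAT_UNITS = ("GGS", "GSS", "GGG", "SSG", "GSG")


def build_linker(a: int = 0, b: int = 0, c: int = 0, d: int = 0, e: int = 0) -> str:
    """Build (GGS)a(GSS)b(GGG)c(SSG)d(GSG)e for counts in 0..20, not all zero."""
    counts = (a, b, c, d, e)
    if any(not 0 <= n <= 20 for n in counts):
        raise ValueError("each repeat count must be between 0 and 20")
    if sum(counts) == 0:
        raise ValueError("a + b + c + d + e must be different from 0")
    return "".join(unit * n for unit, n in zip(REPEAT_UNITS, counts))


def gly_ser_fraction(linker: str) -> float:
    """Fraction of residues in a non-empty linker that are glycine or serine."""
    if not linker:
        raise ValueError("linker must not be empty")
    return sum(residue in "GS" for residue in linker) / len(linker)


def join_head_to_tail(sequences: Iterable[str], linker: str) -> str:
    """Concatenate vaccine sequences head-to-tail, spaced by the linker."""
    return linker.join(sequences)


if __name__ == "__main__":
    linker = build_linker(a=1, c=1, e=1)   # GGS + GGG + GSG
    print(linker, gly_ser_fraction(linker))
    print(join_head_to_tail(["AAAA", "CCCC"], linker))  # placeholder sequences
```

With a = c = e = 1 the construction yields GGSGGGGSG, the example linker sequence named in the text.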

The vaccine provided according to the invention may be a recombinant vaccine.

The term "recombinant" in the context of the present invention means "made through genetic engineering". Preferably, a "recombinant entity" such as a recombinant polypeptide in the context of the present invention is not occurring naturally, and preferably is a result of a combination of entities such as amino acid or nucleic acid sequences which are not combined in nature. For example, a recombinant polypeptide in the context of the present invention may contain several amino acid sequences such as neo-epitopes or vaccine sequences derived from different proteins or different portions of the same protein fused together, e.g., by peptide bonds or appropriate linkers.

The term "naturally occurring" as used herein refers to the fact that an object can be found in nature. For example, a peptide or nucleic acid that is present in an organism (including viruses) and can be isolated from a source in nature and which has not been intentionally modified by man in the laboratory is naturally occurring.

Agents, compositions and methods described herein can be used to treat a subject with a disease, e.g., a disease characterized by the presence of diseased cells expressing an antigen and presenting a fragment thereof. Particularly preferred diseases are cancer diseases. Agents, compositions and methods described herein may also be used for immunization or vaccination to prevent a disease described herein.

According to the invention, the term "disease" refers to any pathological state, including cancer diseases, in particular those forms of cancer diseases described herein.

The term "normal" refers to the healthy state or the conditions in a healthy subject or tissue, i.e., non-pathological conditions, wherein "healthy" preferably means non-cancerous.

"Disease involving cells expressing an antigen" means according to the invention that expression of the antigen in cells of a diseased tissue or organ is detected. Expression in cells of a diseased tissue or organ may be increased compared to the state in a healthy tissue or organ. An increase refers to an increase by at least 10%, in particular at least 20%, at least 50%, at least 100%, at least 200%, at least 500%, at least 1000%, at least 10000% or even more. In one embodiment, expression is only found in a diseased tissue, while expression in a healthy tissue is repressed.

According to the invention, diseases involving or being associated with cells expressing an antigen include cancer diseases.

According to the invention, the term "tumor" or "tumor disease" refers to an abnormal growth of cells (called neoplastic cells, tumorigenous cells or tumor cells) preferably forming a swelling or lesion. By "tumor cell" is meant an abnormal cell that grows by a rapid, uncontrolled cellular proliferation and continues to grow after the stimuli that initiated the new growth cease. Tumors show partial or complete lack of structural organization and functional coordination with the normal tissue, and usually form a distinct mass of tissue, which may be either benign, pre-malignant or malignant.

Cancer (medical term: malignant neoplasm) is a class of diseases in which a group of cells display uncontrolled growth (division beyond the normal limits), invasion (intrusion on and destruction of adjacent tissues), and sometimes metastasis (spread to other locations in the body via lymph or blood). These three malignant properties of cancers differentiate them from benign tumors, which are self-limited, and do not invade or metastasize. Most cancers form a tumor but some, like leukemia, do not. Malignancy, malignant neoplasm, and malignant tumor are essentially synonymous with cancer.

Neoplasm is an abnormal mass of tissue as a result of neoplasia. Neoplasia (new growth in Greek) is the abnormal proliferation of cells. The growth of the cells exceeds, and is uncoordinated with that of the normal tissues around it. The growth persists in the same excessive manner even after cessation of the stimuli. It usually causes a lump or tumor. Neoplasms may be benign, pre-malignant or malignant.

"Growth of a tumor" or "tumor growth" according to the invention relates to the tendency of a tumor to increase its size and/or to the tendency of tumor cells to proliferate.

For purposes of the present invention, the terms "cancer" and "cancer disease" are used interchangeably with the terms "tumor" and "tumor disease".

Cancers are classified by the type of cell that resembles the tumor and, therefore, the tissue presumed to be the origin of the tumor. These are the histology and the location, respectively.

The term "cancer" according to the invention comprises carcinomas, adenocarcinomas, blastomas, leukemias, seminomas, melanomas, teratomas, lymphomas, neuroblastomas, gliomas, rectal cancer, endometrial cancer, kidney cancer, adrenal cancer, thyroid cancer, blood cancer, skin cancer, cancer of the brain, cervical cancer, intestinal cancer, liver cancer, colon cancer, stomach cancer, intestine cancer, head and neck cancer, gastrointestinal cancer, lymph node cancer, esophagus cancer, colorectal cancer, pancreas cancer, ear, nose and throat (ENT) cancer, breast cancer, prostate cancer, cancer of the uterus, ovarian cancer and lung cancer and the metastases thereof. Examples thereof are lung carcinomas, mamma carcinomas, prostate carcinomas, colon carcinomas, renal cell carcinomas, cervical carcinomas, or metastases of the cancer types or tumors described above. The term cancer according to the invention also comprises cancer metastases and relapse of cancer.

By "metastasis" is meant the spread of cancer cells from its original site to another part of the body. The formation of metastasis is a very complex process and depends on detachment of malignant cells from the primary tumor, invasion of the extracellular matrix, penetration of the endothelial basement membranes to enter the body cavity and vessels, and then, after being transported by the blood, infiltration of target organs. Finally, the growth of a new tumor, i.e. a secondary tumor or metastatic tumor, at the target site depends on angiogenesis. Tumor metastasis often occurs even after the removal of the primary tumor because tumor cells or components may remain and develop metastatic potential. In one embodiment, the term "metastasis" according to the invention relates to "distant metastasis" which relates to a metastasis which is remote from the primary tumor and the regional lymph node system.

The cells of a secondary or metastatic tumor are like those in the original tumor. This means, for example, that, if ovarian cancer metastasizes to the liver, the secondary tumor is made up of abnormal ovarian cells, not of abnormal liver cells. The tumor in the liver is then called metastatic ovarian cancer, not liver cancer.

The term "circulating tumor cells" or "CTCs" relates to cells that have detached from a primary tumor or tumor metastases and circulate in the bloodstream. CTCs may constitute seeds for subsequent growth of additional tumors (metastasis) in different tissues. Circulating tumor cells are found in frequencies in the order of 1-10 CTC per mL of whole blood in patients with metastatic disease. Several research methods have been developed and described in the art to isolate CTCs, e.g. techniques which make use of the fact that epithelial cells commonly express the cell adhesion protein EpCAM, which is absent in normal blood cells. Immunomagnetic bead-based capture involves treating blood specimens with antibody to EpCAM that has been conjugated with magnetic particles, followed by separation of tagged cells in a magnetic field. Isolated cells are then stained with antibody to another epithelial marker, cytokeratin, as well as a common leukocyte marker CD45, so as to distinguish rare CTCs from contaminating white blood cells. This robust and semi-automated approach identifies CTCs with an average yield of approximately 1 CTC/mL and a purity of 0.1% (Allard et al., 2004: Clin Cancer Res 10, 6897-6904). A second method for isolating CTCs uses a microfluidic-based CTC capture device which involves flowing whole blood through a chamber embedded with 80,000 microposts that have been rendered functional by coating with antibody to EpCAM.

CTCs are then stained with secondary antibodies against either cytokeratin or tissue specific markers, such as PSA in prostate cancer or HER2 in breast cancer and are visualized by automated scanning of microposts in multiple planes along three dimensional coordinates. CTC-chips are able to identify cytokeratin-positive circulating tumor cells in patients with a median yield of 50 cells/ml and purity ranging from 1-80% (Nagrath et al., 2007: Nature 450, 1235-1239). Another possibility for isolating CTCs is using the CellSearch™ Circulating Tumor Cell (CTC) Test from Veridex, LLC (Raritan, NJ) which captures, identifies, and counts CTCs in a tube of blood. The CellSearch™ system is a U.S. Food and Drug Administration (FDA) approved methodology for enumeration of CTC in whole blood which is based on a combination of immunomagnetic labeling and automated digital microscopy. There are other methods for isolating CTCs described in the literature all of which can be used in conjunction with the present invention.

A relapse or recurrence occurs when a person is affected again by a condition that affected them in the past. For example, if a patient has suffered from a tumor disease, has received a successful treatment of said disease and again develops said disease said newly developed disease may be considered as relapse or recurrence. However, according to the invention, a relapse or recurrence of a tumor disease may but does not necessarily occur at the site of the original tumor disease.

Thus, for example, if a patient has suffered from ovarian tumor and has received a successful treatment a relapse or recurrence may be the occurrence of an ovarian tumor or the occurrence of a tumor at a site different to ovary. A relapse or recurrence of a tumor also includes situations wherein a tumor occurs at a site different to the site of the original tumor as well as at the site of the original tumor. Preferably, the original tumor for which the patient has received a treatment is a primary tumor and the tumor at a site different to the site of the original tumor is a secondary or metastatic tumor.

By "treat" is meant to administer a compound or composition as described herein to a subject in order to prevent or eliminate a disease, including reducing the size of a tumor or the number of tumors in a subject; arrest or slow a disease in a subject; inhibit or slow the development of a new disease in a subject; decrease the frequency or severity of symptoms and/or recurrences in a subject who currently has or who previously has had a disease; and/or prolong, i.e. increase the lifespan of the subject. In particular, the term "treatment of a disease" includes curing, shortening the duration, ameliorating, preventing, slowing down or inhibiting progression or worsening, or preventing or delaying the onset of a disease or the symptoms thereof.

By "being at risk" is meant a subject, i.e. a patient, that is identified as having a higher than normal chance of developing a disease, in particular cancer, compared to the general population.

In addition, a subject who has had, or who currently has, a disease, in particular cancer, is a subject who has an increased risk for developing a disease, as such a subject may continue to develop a disease. Subjects who currently have, or who have had, a cancer also have an increased risk for cancer metastases.

The term "immunotherapy" relates to a treatment involving activation of a specific immune reaction. In the context of the present invention, terms such as "protect", "prevent", "prophylactic", "preventive", or "protective" relate to the prevention or treatment or both of the occurrence and/or the propagation of a disease in a subject and, in particular, to minimizing the chance that a subject will develop a disease or to delaying the development of a disease. For example, a person at risk for a tumor, as described above, would be a candidate for therapy to prevent a tumor.

A prophylactic administration of an immunotherapy, for example, a prophylactic administration of a vaccine of the invention, preferably protects the recipient from the development of a disease.

A therapeutic administration of an immunotherapy, for example, a therapeutic administration of a vaccine of the invention, may lead to the inhibition of the progress/growth of the disease. This comprises the deceleration of the progress/growth of the disease, in particular a disruption of the progression of the disease, which preferably leads to elimination of the disease. Immunotherapy may be performed using any of a variety of techniques, in which agents provided herein function to remove diseased cells from a patient. Such removal may take place as a result of enhancing or inducing an immune response in a patient specific for an antigen or a cell expressing an antigen.

Within certain embodiments, immunotherapy may be active immunotherapy, in which treatment relies on the in vivo stimulation of the endogenous host immune system to react against diseased cells with the administration of immune response-modifying agents (such as polypeptides and nucleic acids as provided herein).

The agents and compositions provided herein may be used alone or in combination with conventional therapeutic regimens such as surgery, irradiation, chemotherapy and/or bone marrow transplantation (autologous, syngeneic, allogeneic or unrelated).

The term "immunization" or "vaccination" describes the process of treating a subject with the purpose of inducing an immune response for therapeutic or prophylactic reasons.

The term "in vivo" relates to the situation in a subject.

The terms "subject", "individual", "organism" or "patient" are used interchangeably and relate to vertebrates, preferably mammals. For example, mammals in the context of the present invention are humans, non-human primates, domesticated animals such as dogs, cats, sheep, cattle, goats, pigs, horses etc., laboratory animals such as mice, rats, rabbits, guinea pigs, etc. as well as animals in captivity such as animals of zoos. The term "animal" as used herein also includes humans. The term "subject" may also include a patient, i.e., an animal, preferably a human having a disease, preferably a disease as described herein.

The term "autologous" is used to describe anything that is derived from the same subject. For example, "autologous transplant" refers to a transplant of tissue or organs derived from the same subject. Such procedures are advantageous because they overcome the immunological barrier which otherwise results in rejection.

The term "heterologous" is used to describe something consisting of multiple different elements. As an example, the transfer of one individual's bone marrow into a different individual constitutes a heterologous transplant. A heterologous gene is a gene derived from a source other than the subject.

As part of the composition for an immunization or a vaccination, preferably one or more agents as described herein are administered together with one or more adjuvants for inducing an immune response or for increasing an immune response. The term "adjuvant" relates to compounds which prolong or enhance or accelerate an immune response. The composition of the present invention preferably exerts its effect without addition of adjuvants. Still, the composition of the present application may contain any known adjuvant. Adjuvants comprise a heterogeneous group of compounds such as oil emulsions (e.g., Freund's adjuvants), mineral compounds (such as alum), bacterial products (such as Bordetella pertussis toxin), liposomes, and immune-stimulating complexes. Examples for adjuvants are monophosphoryl-lipid-A (MPL SmithKline Beecham), saponins such as QS21 (SmithKline Beecham), DQS21 (SmithKline Beecham; WO 96/33739), QS7, QS17, QS18, and QS-L1 (So et al., 1997, Mol. Cells 7: 178-186), incomplete Freund's adjuvants, complete Freund's adjuvants, vitamin E, montanide, alum, CpG oligonucleotides (Krieg et al., 1995, Nature 374: 546-9), and various water-in-oil emulsions which are prepared from biologically degradable oils such as squalene and/or tocopherol.

Other substances which stimulate an immune response of the patient may also be administered. It is possible, for example, to use cytokines in a vaccination, owing to their regulatory properties on lymphocytes. Such cytokines comprise, for example, interleukin-12 (IL-12) which was shown to increase the protective actions of vaccines (cf. Science 268:1432-1434, 1995), GM-CSF and IL-18.

There are a number of compounds which enhance an immune response and which therefore may be used in a vaccination. Said compounds comprise co-stimulating molecules provided in the form of proteins or nucleic acids such as B7-1 and B7-2 (CD80 and CD86, respectively).

According to the invention, a bodily sample may be a tissue sample, including body fluids, and/or a cellular sample. Such bodily samples may be obtained in the conventional manner such as by tissue biopsy, including punch biopsy, and by taking blood, bronchial aspirate, sputum, urine, feces or other body fluids. According to the invention, the term "sample" also includes processed samples such as fractions or isolates of biological samples, e.g. nucleic acid or cell isolates.

The agents such as vaccines and compositions described herein may be administered via any conventional route, including by injection or infusion. The administration may be carried out, for example, orally, intravenously, intraperitoneally, intramuscularly, subcutaneously or transdermally. In one embodiment, administration is carried out intranodally such as by injection into a lymph node. Other forms of administration envision the in vitro transfection of antigen presenting cells such as dendritic cells with nucleic acids described herein followed by administration of the antigen presenting cells.

The agents described herein are administered in effective amounts. An "effective amount" refers to the amount which achieves a desired reaction or a desired effect alone or together with further doses. In the case of treatment of a particular disease or of a particular condition, the desired reaction preferably relates to inhibition of the course of the disease. This comprises slowing down the progress of the disease and, in particular, interrupting or reversing the progress of the disease. The desired reaction in a treatment of a disease or of a condition may also be delay of the onset or a prevention of the onset of said disease or said condition.

An effective amount of an agent described herein will depend on the condition to be treated, the severeness of the disease, the individual parameters of the patient, including age, physiological condition, size and weight, the duration of treatment, the type of an accompanying therapy (if present), the specific route of administration and similar factors. Accordingly, the doses administered of the agents described herein may depend on various of such parameters. In the case that a reaction in a patient is insufficient with an initial dose, higher doses (or effectively higher doses achieved by a different, more localized route of administration) may be used.

The pharmaceutical compositions described herein are preferably sterile and contain an effective amount of the therapeutically active substance to generate the desired reaction or the desired effect.

The pharmaceutical compositions described herein are generally administered in pharmaceutically compatible amounts and in pharmaceutically compatible preparation. The term "pharmaceutically compatible" refers to a nontoxic material which does not interact with the action of the active component of the pharmaceutical composition. Preparations of this kind may usually contain salts, buffer substances, preservatives, carriers, supplementing immunity-enhancing substances such as adjuvants, e.g. CpG oligonucleotides, cytokines, chemokines, saponin, GM-CSF and/or RNA and, where appropriate, other therapeutically active compounds.

When used in medicine, the salts should be pharmaceutically compatible. However, salts which are not pharmaceutically compatible may be used for preparing pharmaceutically compatible salts and are included in the invention. Pharmacologically and pharmaceutically compatible salts of this kind comprise in a non-limiting way those prepared from the following acids: hydrochloric, hydrobromic, sulfuric, nitric, phosphoric, maleic, acetic, salicylic, citric, formic, malonic, succinic acids, and the like. Pharmaceutically compatible salts may also be prepared as alkali metal salts or alkaline earth metal salts, such as sodium salts, potassium salts or calcium salts.

A pharmaceutical composition described herein may comprise a pharmaceutically compatible carrier. The term "carrier" refers to an organic or inorganic component, of a natural or synthetic nature, in which the active component is combined in order to facilitate application. According to the invention, the term "pharmaceutically compatible carrier" includes one or more compatible solid or liquid fillers, diluents or encapsulating substances, which are suitable for administration to a patient. The components of the pharmaceutical composition of the invention are usually such that no interaction occurs which substantially impairs the desired pharmaceutical efficacy.

The pharmaceutical compositions described herein may contain suitable buffer substances such as acetic acid in a salt, citric acid in a salt, boric acid in a salt and phosphoric acid in a salt.

The pharmaceutical compositions may, where appropriate, also contain suitable preservatives such as benzalkonium chloride, chlorobutanol, paraben and thimerosal.

The pharmaceutical compositions are usually provided in a uniform dosage form and may be prepared in a manner known per se. Pharmaceutical compositions of the invention may be in the form of capsules, tablets, lozenges, solutions, suspensions, syrups, elixirs or in the form of an emulsion, for example.

Compositions suitable for parenteral administration usually comprise a sterile aqueous or nonaqueous preparation of the active compound, which is preferably isotonic to the blood of the recipient. Examples of compatible carriers and solvents are Ringer solution and isotonic sodium chloride solution. In addition, usually sterile, fixed oils are used as solution or suspension medium.

The present invention is described in detail by the figures and examples below, which are used only for illustration purposes and are not meant to be limiting. Owing to the description and the examples, further embodiments which are likewise included in the invention are accessible to the skilled worker.

Figure 1. MHC binding prediction overview.

Figure 2. Analysis of immunogenicity as a function of the Mmut score for 50 prioritized B16F10 mutations and 82 prioritized CT26.WT mutations (132 mutations in total), of which 30 were immunogenic. Vaccinations were performed with RNA. For B16F10 immunogenicity was assayed by challenging BMDCs with RNA and measuring the immune response of splenocytes with ELISPOT and FACS. For CT26.WT immunogenicity was assayed by challenging BMDCs with RNA and peptides separately and measuring the immune response of splenocytes with ELISPOT; a mutation was considered immunogenic if either peptide or RNA registered as an immune response. A Cumulative distribution of immunogenic mutations as a function of the Mmut score. The graph shows the total number of mutations below a given Mmut score (red), of these, the number of mutations that were immunogenic (blue), and the percent of immunogenic mutations from the total (black). B Histogram of percent of immunogenic mutations per Mmut bin for the following ranges: ≤0.3, (0.3, 1], >1. Errors shown are standard errors.

Figure 3. Analysis of B16F10 and CT26.WT immunogenicity as a function of Mmut. Cumulative distribution of immunogenic mutations as a function of the Mmut score for B16 (A) and CT26 (C). Histogram of percent of immunogenic mutations per Mmut bin for the following ranges: [0.1, 0.3], (0.3, 1], (1, ∞) for B16 (B) and CT26 (D). Figures A and B are based on analysis of 50 B16F10 prioritized mutations, of which 12 were immunogenic. Figures C and D are based on analysis of 82 CT26.WT prioritized mutations, of which 30 were immunogenic. Errors are standard errors. For more details see legend of Fig. 2.

Figure 4. Models of immunogenicity and control hypotheses. Class I immunogenicity, denoted by HA, makes the assumption that both the WT and MUT epitopes are presented by the cells, and that the mutation sufficiently altered the physico-chemical properties of the amino acid so that the immune system registers this change and generates an immune response (denoted by the lightning bolt). The HN hypothesis, serving as a control for HA, is simply the inverted HA hypothesis, namely, that the mutation did not significantly alter the physico-chemical properties of the amino acid and therefore has a lower likelihood of being "detected" by the immune system and generating an immune response. In class II immunogenicity (HB ∪ HC) the WT epitope is not presented but the MUT epitope is presented. HB and HC are distinguished by high (T > τ) versus low (T ≤ τ) T scores, respectively. Note that for α* = α, the HBCl model for immunogenicity (Mmut ≤ β) is a composite of all four groups: HBCl = ∪[HA, HB, HC, HN].

Figure 5. Hypothesized relation of the T score to immunogenicity. According to the class I immunogenicity model, during T cell development TCRs that bound strongly to the wild type epitope were deleted. Extant TCRs should exhibit only weak or no binding affinity to the wild type epitope (A). Epitopes that contain an amino acid substitution that has a high T score have similar physico-chemical properties to the wild type amino acid and therefore will likely have little impact on the binding affinity to extant TCRs (B). Epitopes that contain an amino acid substitution with a low T score have a greater chance to increase the binding affinity to extant TCRs and therefore a greater likelihood to be immunogenic (C). In this schematic illustration, color coding is used to pair T cells with a matching peptide. [...]/yellow mutations represent mutations with high T scores (similar to the WT), whereas blue/purple mutations represent mutations with low T scores (significant physico-chemical difference compared to the WT).

Figure 6. Cumulative distribution of immunogenic mutations as a function of Mmut. A Comparison of the percent of immunogenic mutations that satisfy the baseline control hypothesis HBCl: {Mmut ≤ β} with the percent of immunogenic mutations that satisfy the partial hypothesis HAC1: HBCl ∩ {T ≤ τ}, the partial hypothesis HAC2: HBCl ∩ {Mwt ≤ α} and the full hypothesis HA: HBCl ∩ {Mwt ≤ α} ∩ {T ≤ τ}, for α = 1, τ = 1. B Comparison of the percent of immunogenic mutations that satisfy the baseline control hypothesis HBCl: {Mmut ≤ β} with the percent of immunogenic mutations that satisfy the inverse partial hypotheses: HBCl ∩ {T > τ} and HBCl ∩ {Mwt > α}. The analyses in A and B are based on the pooled B16F10 and CT26.WT datasets, comprising 132 mutations, of which 30 were immunogenic. Each data point in the graphs is based on ≥4 mutations.

Figure 7. Cumulative distribution of immunogenic mutations as a function of the Mmut score. Comparison of the percent of immunogenic mutations that satisfy the baseline control hypothesis HBCl: {Mmut ≤ β} with the percent of immunogenic mutations that satisfy the partial hypothesis HAC1: HBCl ∩ {T ≤ τ}, the partial hypothesis HAC2: HBCl ∩ {Mwt ≤ α} and the full hypothesis HA: HBCl ∩ {Mwt ≤ α} ∩ {T ≤ τ}, given α = 1, τ = 1, for B16 (A) and CT26 (C). Comparison of the percent of immunogenic mutations that satisfy the baseline control hypothesis HBCl: {Mmut ≤ β} with the percent of immunogenic mutations that satisfy the inverse partial hypotheses: HBCl ∩ {T > τ} and HBCl ∩ {Mwt > α} for B16 (B) and CT26 (D). Figures A and B are based on analysis of the 50 B16F10 prioritized mutations, of which 12 were immunogenic. Figures C and D are based on the 82 CT26.WT prioritized mutations, of which 30 were immunogenic. Each data point in the graphs is based on ≥4 mutations.

Figure 8. Controlling for WT immunogenicity. To check whether omitting MUT+/WT+ solutions had an impact on these findings we excluded from the dataset 9 MUT+/WT+ mutations and 2 mutations for which the WT has not been measured, leaving in total 121 mutations (43 B16 and 78 CT26) of which 19 were MUT+/WT- (5 for B16 and 14 for CT26). We again found the same trends as in the complete dataset, namely, highly non-linear response as a function of the Mmut score, superiority of the HA hypothesis over the partial hypotheses, and inferiority of inverted hypotheses compared to the baseline control HBCl. A Cumulative distribution of immunogenicity as a function of the Mmut score. B Histogram of percent of immunogenic mutations per Mmut bin. C Comparison of the percent of immunogenic mutations that satisfy the baseline control hypothesis HBCl with HAC1, HAC2 and HA. D Comparison of the percent of immunogenic mutations that satisfy the baseline control hypothesis HBCl with the inverse hypotheses. See Fig. 5 legend for additional details.

Figure 9. Fraction of immunogenic mutations as a function of RPKM. A Red: all 50 B16 mutations and 82 CT26 mutations with no filtering (132 mutations total). Blue: mutations passing the HA hypothesis with α = 1, β = 0.5, τ = 1. B Percent of immunogenic mutations in different RPKM ranges with no filtering. RPKM bins are: 1=(0,1], 2=(1,5], 3=(5,50], 4=(50,∞). C Percent of immunogenic mutations for different RPKM ranges under the HA hypothesis with α = 1, β = 0.5, τ = 1. RPKM bins are: 1=(0,1], 2=(1,∞). Errors are S.E.

Figure 10. Anchor and non-anchor position mutated class II immunogenic epitopes. Anchor position motifs were determined using SYFPEITHI.

Figure 11. Proposed models for immunogenic tumor-associated epitopes.

Figure 12. Example of a method for weighing rank position of mutations. For each mutation the rank position in the list of ranked mutations can be further weighed by the number of solutions for which the combination of HLA types for the patient, possible window lengths for the HLA type and mutation position within the epitope resulted in a solution with low Mmut, or resulted in a HA and/or HB∪HC classification. Since all solutions per mutation potentially can be presented in parallel, this weighing factor may be an important contributor to the rank position of the mutation.

Figure 13. Example of scatter plot of all epitope solutions for mutation chr14_52837882 from CT26, showing Mmut and ΔM = Mmut − Mwt.

EXAMPLES

The techniques and methods used herein are described herein or carried out in a manner known per se and as described, for example, in Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Edition (1989) Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. All methods including the use of kits and reagents are carried out according to the manufacturers' information unless specifically indicated.

Example 1: Establishing a model for predicting immunogenicity of T cell epitopes

Previously we explored the immunogenicity of 50 somatic mutations identified in the B16F10 murine melanoma cell line (J. C. Castle et al., Exploiting the mutanome for tumor vaccination. Cancer Research 72, 1081 (2012)). These 50 mutations were selected from a pool of 563 expressed nonsynonymous somatic mutations primarily to maximize MHC class I expression (J. C. Castle et al., Exploiting the mutanome for tumor vaccination. Cancer Research 72, 1081 (2012)) (see also Example 2). For each mutation we predicted the minimal epitope, i.e., the epitope scoring the lowest MHC class I consensus score (Y. Kim et al., Nucleic Acids Research 40, W525 (2012)) (denoted here as Mmut) when searching the space of all possible MHC class I alleles, potential epitope lengths and sequence windows (where to position the mutation) (J. C. Castle et al., Exploiting the mutanome for tumor vaccination. Cancer Research 72, 1081 (2012)). Assessing the immunogenicity of these mutations using RNA vaccination followed by peptide readout (see Example 2) confirmed earlier findings using peptide vaccination (J. C. Castle et al., Exploiting the mutanome for tumor vaccination. Cancer Research 72, 1081 (2012)), and showed that only 12 out of 50 mutations (24%) were immunogenic (Table 1), with MUT+/WT- sequences comprising only 10% of all mutations tested.

Table 1. Number of immunogenic mutations after RNA vaccination of B16F10 and CT26.WT murine strains.

[Table body not recoverable from the source; only the entry "14*" is legible.]

*Two CT26 MUT+ mutations were excluded from this table because their WT reactivity has not been measured yet. In total there were 18 MUT+ mutations out of 82 CT26 mutations measured thus far, resulting in 22% success rate.
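The search for the minimal epitope described above (all alleles, epitope lengths and windows that contain the mutated residue) can be sketched as follows. This is only an illustrative sketch: the length range 8 to 11, the `Solution` type and the `score_fn` callable are assumptions introduced here, and `score_fn` stands in for whatever MHC class I consensus scoring tool is actually used (lower score taken to mean stronger predicted binding).

```python
from typing import Callable, Iterable, Iterator, NamedTuple


class Solution(NamedTuple):
    """One candidate epitope: allele, peptide window and its score (hypothetical type)."""
    allele: str
    peptide: str
    score: float


def candidate_windows(sequence: str, mut_index: int,
                      lengths: Iterable[int] = (8, 9, 10, 11)) -> Iterator[str]:
    """Yield every window of the given lengths that contains the mutated residue.

    The default lengths are an assumption, not taken from the text.
    """
    for length in lengths:
        first = max(0, mut_index - length + 1)
        last = min(mut_index, len(sequence) - length)
        for start in range(first, last + 1):
            yield sequence[start:start + length]


def minimal_epitope(sequence: str, mut_index: int, alleles: Iterable[str],
                    score_fn: Callable[[str, str], float],
                    lengths: Iterable[int] = (8, 9, 10, 11)) -> Solution:
    """Return the allele/window combination with the lowest score (Mmut)."""
    lengths = tuple(lengths)
    return min(
        (Solution(allele, peptide, score_fn(allele, peptide))
         for allele in alleles
         for peptide in candidate_windows(sequence, mut_index, lengths)),
        key=lambda s: s.score,
    )


def toy_score(allele: str, peptide: str) -> float:
    """Toy scorer for demonstration only; a real predictor would replace it."""
    return len(peptide) + (0.0 if allele == "H-2-Kb" else 1.0)


best = minimal_epitope("ACDEFGHIKLMNPQRSTVWY", 10, ["H-2-Kb", "H-2-Db"], toy_score)
print(best)
```

Passing the scorer as a callable keeps the enumeration independent of any particular prediction tool.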

The results of the B16F10 murine test case demonstrate that naively selecting expressed nonsynonymous mutations with low Mmut scores (≤3.9) yields rather low success rates for predicting immunogenicity. Hence a better understanding of the mechanisms driving immunogenicity is required if personalized vaccines targeting tumor-specific neoantigens are to become effective therapies. In an effort to uncover additional variables that contribute to immunogenicity we explored the immunogenicity of expressed nonsynonymous somatic mutations identified in a colorectal murine cell line, CT26.WT. In total, 96 mutations were selected based on their Mmut scores (low vs. high), mean RPKM (low vs. high), and cellular localization (intra- vs. extracellular), and tested for immunogenicity using RNA vaccination with both peptide and RNA readout (see Example 2 for further details). Together with the B16F10 cell line, our dataset comprised 132 epitopes, whose immunogenicity was measured ex vivo on murine splenocytes.

The MHC consensus score. To investigate the dependence of immunogenicity on Mmut, we plotted the cumulative percent of immunogenic mutations as a function of Mmut, that is, the percent of mutations with an Mmut score smaller than a given threshold (denoted by β) that were immunogenic. An analysis of the pooled B16 and CT26 datasets spanning a total of 132 mutations reveals a highly nonlinear dependence of the immunogenicity success rate on Mmut (Fig. 2A). Fig. 2A shows that immunogenic mutations are enriched for extremely low Mmut scores (≤~0.2). For Mmut ≤0.1 the percent of immunogenic mutations peaks at ~60%, and quickly decays as Mmut increases, dropping below ~25% for Mmut ≥2. The percent of immunogenic mutations with Mmut ≤0.3 versus >0.3 was 44.4% compared with 17.1%, a statistically significant difference (P value = 0.004, Fisher's exact test, one tailed). A histogram of the percent of immunogenic mutations for three Mmut bins: ≤0.3, (0.3, 1] and >1 shows that the percent of immunogenic mutations drops as Mmut increases (Fig. 2B). The differences between the success rate of the lowest bin (Mmut ≤0.3), 44.4%, and both the central bin, 20.7%, and the highest bin (Mmut >1), 15.8%, in Fig. 2B were statistically significant (P values = 0.05 and 0.004, respectively, Fisher's exact test, one tailed), indicating that for Mmut >~0.3 the success rate drops in a statistically significant manner. A similar trend in the success rate is also observed when analyzing B16 and CT26 mutanomes separately (Fig. 3).
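The one-tailed Fisher's exact test used for the comparison of Mmut ≤0.3 versus >0.3 can be sketched with the standard library alone. The counts below (12 of 27 versus 18 of 105) are not stated in the text; they are an assumption inferred here from the reported percentages (44.4% and 17.1%) and the totals of 132 mutations and 30 immunogenic ones, so the resulting P value is illustrative only.

```python
from math import comb


def fisher_one_tailed(a: int, b: int, c: int, d: int) -> float:
    """One-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null, i.e. the probability of
    seeing at least `a` positives in the first row given fixed margins.
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(col1, k) * comb(total - col1, row1 - k)
               for k in range(a, upper + 1))
    return tail / comb(total, row1)


# Assumed counts (inferred, not given in the text):
# Mmut <= 0.3: 12 immunogenic, 15 non-immunogenic (12/27 = 44.4%)
# Mmut >  0.3: 18 immunogenic, 87 non-immunogenic (18/105 = 17.1%)
p_value = fisher_one_tailed(12, 15, 18, 87)
print(f"one-tailed P = {p_value:.4f}")
```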

Thus far our criteria for selecting mutations focused on presentation, and we have seen that restricting the MHC binding score of the mutated epitope allows prediction of immunogenic epitopes with up to 60% precision. Presentation, however, is a necessary but not sufficient condition to induce immunogenicity. By identifying additional criteria for TCR recognition we may be able to further improve the precision of our prediction. We hypothesized two mutually exclusive mechanisms for driving immunogenicity, which we refer to as the class I and class II immunogenicity models.

Class I immunogenicity. In order for the TCR repertoire to recognize a mutated epitope and generate an immune response we hypothesized that three conditions must be satisfied (HA in Fig. 4): (i) the wild type epitope, at some point during the development of the organism, was presented to the immune system leading to deletion of matching TCRs via strong TCR/pMHC binding, (ii) the mutated epitope is presented, and (iii) the physico-chemical properties of the mutated amino acid are sufficiently "different" from the wild type amino acid (by some metric that we shall define below) so that the TCR repertoire is able to "detect" or "register" this substitution. Conditions (i) and (ii) ensure that the immune system is actually exposed to the change, i.e., the mutation. Condition (iii) requires that the mutation significantly change the physico-chemical character of the wild type amino acid so that the binding affinity of the mutated epitope to extant (undeleted) TCRs potentially increases, thereby turning on the signaling cascade that leads to an immune response (Fig. 5).

The TCR recognition score. The class I immunogenicity model requires a metric to estimate the physico-chemical difference between two amino acids. It is well known in molecular evolution that amino acids that interchange frequently are likely to have chemical and physical similarities whereas amino acids that interchange rarely are likely to have different physico-chemical properties. The likelihood for a given substitution to occur in nature compared with the likelihood for this substitution to occur by chance is measured by log-odds matrices. The patterns observed in log-odds matrices imposed by natural selection "reflect the similarity of the functions of the amino acid residues in their weak interactions with one another in the three dimensional conformation of proteins" (M. O. Dayhoff, R. M. Schwartz, B. C. Orcutt, A model for evolutionary change. M. O. Dayhoff, ed. Atlas of protein sequence and structure Vol. 5, 345 (1978)). We therefore used evolutionary based log-odds matrices, which we refer to here as "T scores" to reflect TCR recognition, as effective scoring matrices for cancer associated amino acid substitutions. Substitutions with positive T scores (i.e., log-odds) are likely to occur in nature, and hence correspond to two amino acids that have similar physico-chemical properties. The class I model predicts that substitutions with positive T scores would have a lower likelihood of being immunogenic. Conversely, substitutions with negative T scores reflect substitutions that are unlikely to occur in nature and hence correspond to two amino acids that have significantly different physico-chemical properties. According to our model, such substitutions would have a greater chance of being immunogenic. We compared different methods of estimating log-odds matrices and found results to be largely robust to the exact method chosen. The maximum likelihood (ML) based estimation approach known as WAG (S. Whelan, N. Goldman, Molecular Biology and Evolution 18, 691 (2001)) using a PAM (point accepted mutation) distance of 250 appeared to separate predicted immunogenic from non-immunogenic mutations best, and therefore we present results with this matrix (see Example 2 for further details).

Class II immunogenicity. In the class II model for immunogenicity we hypothesize that a mutation is likely to be immunogenic if the immune system has never before seen the wild type epitope, and is therefore challenged by the mutated epitope. Therefore in order for a mutation to be immunogenic in this model we hypothesized that two conditions must be satisfied: (i) the wild type epitope was never presented to the immune system, (ii) the mutated peptide is presented. These conditions can be co-satisfied if, for example, the mutation hits an anchor position thereby changing a "nonbinder" epitope into a "binder". Formally, class II immunogenicity can be separated into two sub-hypotheses: high T scores (HB in Fig. 4) and low T scores (HC in Fig. 4). However, since the assumption is that the wild type epitope is not presented, the nature of the amino acid substitution is not expected to have an impact on TCR recognition and we shall therefore equate class II immunogenicity with the united hypothesis: HB ∪ HC.

Testing class I immunogenicity. The conditions of class I immunogenicity (HA in Fig. 4) can be restated mathematically as follows: we require that the wild type epitope is presented (Mwt ≤ α), the mutated epitope is presented (Mmut ≤ β), and the amino acid substitution is non-trivial (T ≤ τ), where Mwt is defined as the MHC consensus score of the mutated epitope (same HLA allele and window length) replacing the mutated amino acid with the wild type amino acid, and T denotes the T score. Since all three conditions are necessary, we expect that the precision of the HA classifier will be higher compared to a classifier based on Mmut alone (HBCl in Fig. 4) or compared to the partial hypotheses: HBCl ∩ {Mwt ≤ α} and HBCl ∩ {T ≤ τ}. We therefore calculated the percent of immunogenic mutations (number of true positives divided by the sum of true positives and false positives) as a function of β for HBCl, for the partial hypothesis HAC1: HBCl ∩ {T ≤ τ} and for the partial hypothesis HAC2: HBCl ∩ {Mwt ≤ α}. We found that a conservative threshold for τ in the range of ≈0.5 to 1 performed best (the range of the WAG250 matrix is from -5.1 (F↔G substitution) to +5.4 (F↔Y substitution)). We also found that α can be restricted conservatively compared to β, setting α ≈ 1. Fig. 6A indeed shows that, when considering the pooled mutanome of B16 and CT26, classifiers based on HAC1 and HAC2 attained greater precision than the baseline control hypothesis HBCl. Moreover, a classifier based on the complete hypothesis HA attained greater precision than the partial hypotheses HAC1 and HAC2, thereby demonstrating an additive effect. The same conclusions hold when analyzing the B16 and CT26 datasets separately (Fig. 7).
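The three threshold conditions above translate directly into a small rule-based classifier. The following is a minimal sketch, assuming that lower consensus scores mean presentation and using α = 1, β = 0.5 and τ = 1 as defaults (values quoted in the text for one of the comparisons). The `Thresholds` type, the function name and the class labels are hypothetical names introduced here, and the example scores are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical container for the three thresholds."""
    alpha: float = 1.0  # upper bound on Mwt (wild type epitope presented)
    beta: float = 0.5   # upper bound on Mmut (mutated epitope presented)
    tau: float = 1.0    # upper bound on the T score (non-trivial substitution)


def classify(m_mut: float, m_wt: float, t_score: float,
             th: Thresholds = Thresholds()) -> str:
    """Assign a mutation to one of the hypothesis classes of Fig. 4."""
    if m_mut > th.beta:
        # Fails the baseline control HBCl: mutated epitope not presented.
        return "outside HBCl"
    if m_wt > th.alpha:
        # Wild type not presented: class II immunogenicity (HB or HC).
        return "HB_or_HC"
    # Both epitopes presented: the T score separates HA from its inverse HN.
    return "HA" if t_score <= th.tau else "HN"


# Invented example scores, for illustration only.
examples = {
    "dissimilar substitution": classify(0.2, 0.4, -1.5),
    "similar substitution": classify(0.2, 0.4, 2.0),
    "wild type not presented": classify(0.2, 3.0, -1.5),
    "mutant not presented": classify(2.0, 0.4, -1.5),
}
print(examples)
```

The partial hypotheses HAC1 and HAC2 correspond to dropping the Mwt check or the T score check, respectively.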

Since the conditions Mwt ≤ α and T ≤ τ are postulated to be necessary conditions for immunogenicity, one would expect that a classifier based on either the condition HBCl ∩ {T > τ} or the condition HBCl ∩ {Mwt > α} (i.e., negating the secondary condition) would perform worse than HBCl. Indeed we found that this is the case for B16 and CT26 when analyzed together (Fig. 6B) or separately (Fig. 7). Therefore we conclude that the B16 and CT26 datasets support, both together and separately, the HA hypothesis. Omitting mutations where the WT RNA also showed reactivity did not affect these conclusions (Fig. 8).

Controlling for the HA hypothesis. Although mutations with high T scores may still be immunogenic, a hypothesis that enriches for such mutations should statistically enrich for non-immunogenic mutations. Therefore if we compare the HA hypothesis (HBCl ∩ {Mwt ≤ α} ∩ {T ≤ τ}) with its inverse, HBCl ∩ {Mwt ≤ α} ∩ {T > τ} (HN in Fig. 4), we should observe a statistically significant reduction of immunogenic mutations. Table 2 indeed shows that for Mmut ≤ β = 0.5, Mwt ≤ α = 1, and T ≤ τ = 1, HA outperforms HN, with a success rate of 52.5% (n=21) compared to 21.4% (n=14; P=0.068, one tailed Fisher's exact test).

Table 2. Percent of immunogenic mutations under various hypotheses, based on the B16 and CT26 pooled datasets comprising 133 mutations.

Columns: Hypothesis; hypothesis parameters (Mmut threshold (β), Mwt threshold (α), T score threshold (τ)); % of immunogenic mutations. Legible entries: 22/83 (26.5%); 18/56 (32.1%); 6/10 (60%). [Remaining table body not recoverable from the source.]

HA also performs better than the baseline control HBCl, which achieves 41.2% (n=35). As we decrease β the difference between the success rates of HA and HN becomes larger, since the more stringent the condition on β, the more false positives are removed from the HA group. For example, for β = 0.25 the success rate of the HA group was 67% (n=14) compared to a success rate of 17% (n=6) for group HN (P=0.066, one tailed Fisher's exact test); see Table 3.

Table 3. Ranked list of 133 measured B16F10 and CT26.WT mutations that satisfy the basic control hypothesis HBCl (Mmut ≤ 0.25), broken down into the three disjoint hypothesis classes: HA hypothesis for immunogenic mutations (Mwt ≤ 0.8, T ≤ 0.5), HN/inverse HA hypothesis predicting non-immunogenic mutations (Mwt ≤ 0.8, T > 0.5), and HB∪HC hypothesis for immunogenic mutations (Mwt > 0.8). HA and HB∪HC candidates are proposed to be ranked based on the relative importance of distinguishing variables. For HA the proposed ranking variables are the T score, Mmut and Mwt; for HB∪HC the proposed order is: Mmut (descending) → Mwt (ascending). Errors are standard errors.

Columns: Sample; Mut; Response; RNA response (MUT); RNA response (WT); Symbol (Ingenuity); MHC I allele; epitope (MUT); epitope (WT); Mmut; Mwt; Mean expression; T score (WAG250). [Column alignment of the rows was lost in extraction; the legible row contents are kept in their extracted order.]

Class I immunogenicity (HA): 67±14% success rate
VALHM 19.6 yes no PBK H-2-Db AAVILRDALHM no no Nphp3 H-2-Dd GGPGSEKSL GGPGSGKSL LALPNNYCDF 20.0 B16 37 no no DPF2 H-2-Db LALPNNYCDV H-2-Db SHLNNDVWQI SHLNNDFWQI 21.7 PLOD2 B16 25 CD4 yes yes YYMRDVTAI 5.5 CT26 37 CD4 yes no th35 H-2-Kd YYMRDVIAI lYLQPAQAQM 29.5 CT26 26 CD8 H-2-Kd TYLQPAQAQM yes no E2f8 QRLGFTYL B16 21 CD4 yes no ATP11A H-2-Db QSLGFTYL EYWASRALDS EYWASRALGS CT26 13 HZ-QS H-2-Kd GYLQFAYEGC GYLQFAYEGR 5. 1 125.6 yes yes ACTN4 H-2-Kb VTFQAFIDVMS VTFQAFIDFMS no Slc4132 H-2-Kd PYLTALDDLL PYLTALGDLL CT26 15 no Agxt2l2 ADAI AGGLFVADEI

Class II immunogenicity (HB∪HC): 0% success rate
VGlNFLQSYQ VGINSLQSVQ TRPARDGTF GTF EPQlDMDDM CT26 40 no no pr449 EPQIAMDDM

HN: 17±15% success rate
FAT1 H-2-Db IAMQN'ITQL lAlQNTTQL 18.8 no no AIYYHASRAI 51.1 no no TM9SF3 H-2-Kb AIYHHASRAI (1.7 39 CD8 yes no Als2 H-2-Kd SYlALVDKNl SYLALVDKNI CT26 VQF 15.5 CT26 2 no no Snap47 H-2-Dd VIPILEMQF no no HZ-QB H-2-Kd GYLQFAYDGR GYLQFAYEGR CT26 17 VYLNLLLK FT VYLNLFLKl-T CT26 38 no

An Example of additional weighing factors that may further improve immunogenicity ranking is given in Example 3.

More generally the list of mutations that satisfy HBCl (Mmut ≤ β) can be classified into the three categories: HA, HN, and HB∪HC (Table 3), where HA enriches for immunogenic mutations and HN enriches for non-immunogenic mutations. In the case of B16 and CT26, all three candidates in the HB∪HC group were non-immunogenic, contrary to our expectation. However, if a more realistic threshold α* for Mwt is chosen such that α* >> α, then there would be no predictions that could be tested for HB∪HC.

Maximal precision of immunogenicity classifiers. According to Table 1 the average success rate for predicting immunogenicity in the combined B16 and CT26 datasets was 22.7% (=30/132). By applying the most stringent threshold on the Mmut score (β = 0.1), the precision of an immunogenicity classifier increases to 60% (=6/10; HABC in Table 2). By combining HABC with either the Mwt ≤ α condition or the T ≤ τ condition (α = 1, τ = 1), precision is increased to 66.7%. The HA-based classifier, which combines both criteria, increases the precision to 75% (=6/8) (Table 2).

B16 epitope MUT33. The HA-class epitope that was ranked the highest by all evolutionary models (except the PAM matrix) in the pooled B16/CT26 dataset was B16's MUT33 (see Table 3). Further analysis revealed that MUT33 indeed invoked an MHC class I restricted CD8+ response and exhibited ex vivo immunogenicity against the minimal predicted epitope (data not shown).

Role of gene expression. Plotting the fraction of immunogenic mutations (no. of immunogenic mutations with RPKM values below a given threshold over the total no. of immunogenic mutations) as a function of RPKM values for B16 and CT26 indicates that this ratio somewhat stagnates at very low RPKM values (Fig. 9A). This effect is observed whether the HA criterion is applied or not. Plotting the percent immunogenic mutations for different RPKM bins (Fig. 9B and C) suggests that RPKM values ≤ ~1 have a somewhat lower success rate (both with and without applying the HA filtering hypothesis). Although suggestive, it should be noted that these results are within the range of error.
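The cumulative fraction described above can be sketched in a few lines of Python. This is only an illustration of the stated definition (immunogenic mutations with RPKM below a threshold, divided by all immunogenic mutations); the function name and the RPKM values are hypothetical and are not taken from the B16 or CT26 data.

```python
def fraction_below(immunogenic_rpkm, threshold):
    """Fraction of immunogenic mutations whose RPKM lies below `threshold`."""
    if not immunogenic_rpkm:
        raise ValueError("need at least one immunogenic mutation")
    below = sum(1 for value in immunogenic_rpkm if value < threshold)
    return below / len(immunogenic_rpkm)


# Hypothetical RPKM values of four immunogenic mutations (illustration only).
example_rpkm = [0.5, 2.0, 8.0, 30.0]

# One point of the curve per RPKM threshold.
curve = [(t, fraction_below(example_rpkm, t)) for t in (1.0, 10.0, 100.0)]
```

Plotting `curve` against the threshold would give a curve of the kind described for Fig. 9A.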

Survey of published CD8+ epitopes. We were next interested to see if published T cell-defined tumor antigens with single amino acid substitutions generating CD8+ restricted responses fulfilled our models for immunogenicity. Of the 17 epitopes that were published (P. Van der Bruggen, V. Stroobant, N. Vigneron, B. Van den Eynde, Cancer Immun. http://www.cancerimmunity.org/peptide/, 2013) (Table 4), five satisfied the criteria for HA (α=0.7, β=0.2, τ=0.5), four satisfied the criteria for HC∪HB (α=2.2, β=0.4), and two satisfied the Hn criterion (α=0.6, β=0.3, τ=1.7).

Table 4. Published epitopes with single amino acid substitution generating CD8+ responses. See Example 2 for list of references. Anchor position mutations in the HB∪HC group are highlighted in red.

Table 4 body (garbled in the source; only legible entries are kept). Header thresholds for the score columns: ≤0.2, ≤0.7, ≤0.5, followed by the Hypothesis column. Legible entries: SIRT2, KIFSEVTLK (← MZ7-MEL); SNRPD1 (← MZ7-MEL); EFTUD2, KILDAVVAQK, KILDAVVAQE (← MZ7-MEL); YVDFREYEYD (← MZ7-MEL). Footnote: T score is based on the WAG250 log-odds matrix.

Thus, the HA and HC∪HB hypotheses together accounted for roughly 50% of the published epitopes. Interestingly, 3 out of the 4 published epitopes that satisfied the HC∪HB criterion (red boxes in Table 4) had an Mwt score that was larger than 10 due to anchor position mutations (Fig.). Since the requirement for the HC∪HB hypothesis is that the probability that any cell presents the wild type epitope during the development of the organism is kept negligibly small, it is expected that the threshold for Mwt should be kept high, i.e., α' >> α. Indeed, when increasing α from 0.8 to >3 the false positives for B16/CT26 in Table 3 disappear. Therefore a more realistic threshold for Mwt under the HC∪HB hypothesis may be somewhere between 3 and 10.

The MZ7-MEL cell line. To test the ability of our immunogenicity models to predict immunogenic epitopes in a human tumor model setting, we explored the MZ7-MEL cell line, established in 1988 from a splenic metastasis of a patient with malignant melanoma (V. Lennerz et al., Proceedings of the National Academy of Sciences of the United States of America 102, 16013 (2005)). Screening of a cDNA library from MZ7-MEL cells with autologous tumor-reactive T cells revealed at least five neoantigens capable of generating CD8+ responses (V. Lennerz et al., Proceedings of the National Academy of Sciences of the United States of America 102, 16013 (2005)). This constitutes the largest set of CD8+ neoantigens derived from a patient to date. Applying our immunogenicity models to these epitopes we found that three neoantigens were classified as HA epitopes, and one neoantigen, an anchor position mutation, was classified as an HB∪HC epitope (arrows in Table 4, and Fig. 10). Thus, four of the five epitopes could be explained by our immunogenicity models.

To test our ability to predict these epitopes de novo in the MZ7-MEL cell line we sequenced the exome of the MZ7-MEL cell line (see Methods). In total 743 expressed nonsynonymous mutations were identified. All five mutations previously identified by Lennerz et al. (V. Lennerz et al., Proceedings of the National Academy of Sciences of the United States of America 102, 16013 (2005)) were found. We then calculated for each mutation the T score, Mmut and Mwt, reporting also the HLA allele and epitope that resulted in the minimal MHC consensus score for the given mutation. Mutations were classified into one of three groups, HA, HB∪HC and Hn, using the thresholds α = 0.8, β = 0.2, τ = 0.5 (plus the condition RPKM > 0.2), and then ranked based on their potential to be immunogenic, as explained in Table 3. We found that out of 743 mutations, 32 mutations satisfied the HA criteria (Table 5), 12 satisfied the HB∪HC criterion (Table 6) and 15 satisfied the Hn criterion.
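The three-way classification can be illustrated with a minimal Python sketch. It assumes the reading of the hypotheses given with Table 3: all classes require Mmut ≤ β; HA additionally requires Mwt ≤ α and T ≤ τ; Hn requires Mwt ≤ α and T > τ; HB∪HC requires Mwt > α. The class and function names are hypothetical, the default thresholds follow the values listed for Table 8 (α = 0.8, β = 0.2, τ = 0.5) plus RPKM > 0.2, and the T score used for EFTUD2 is a placeholder because the source does not list it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Mutation:
    gene: str
    m_mut: float    # MHC consensus score of the mutated epitope
    m_wt: float     # MHC consensus score of the wild-type epitope
    t_score: float  # log-odds score of the amino acid substitution
    rpkm: float     # mean expression


def classify(m: Mutation, alpha=0.8, beta=0.2, tau=0.5,
             min_rpkm=0.2) -> Optional[str]:
    """Return 'HA', 'Hn', 'HB_or_HC', or None if the mutation is filtered out."""
    if m.rpkm <= min_rpkm or m.m_mut > beta:
        return None                      # not expressed or mutant not presented
    if m.m_wt > alpha:
        return "HB_or_HC"                # wild type not presented
    return "HA" if m.t_score <= tau else "Hn"


# SIRT2 values as listed in Table 5; EFTUD2 scores as listed in Table 6
# (its T score is not given in the source, 0.0 is a placeholder).
sirt2 = Mutation("SIRT2", m_mut=0.1, m_wt=0.1, t_score=-2.1, rpkm=15.7)
eftud2 = Mutation("EFTUD2", m_mut=0.15, m_wt=10.2, t_score=0.0, rpkm=22.0)
labels = [classify(sirt2), classify(eftud2)]
```

Raising `alpha` to 5 in the call corresponds to the stricter HB∪HC threshold discussed for Table 7.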

Table 5. HA-classified MZ7-MEL cell mutations. 32 of the 743 expressed nonsynonymous mutations in MZ7-MEL were classified as HA-immunogenic using the thresholds: α = 0.8, β = 0.2 and τ = 0.5. In addition RPKM was required to exceed 0.2. Rank is based on an Mmut (descending) → T score (descending) sorting scheme. Immunogenic neoantigens identified by Lennerz et al. are highlighted in yellow.

Rank. Gene M mu, Tscore" M w. Mean EXP 1 DPHZ 0.1 -2.9 0.1 10.7 ADHFEl 0.1 -2.9 0.1 2.7 2 DDX41 0.1 -2.1 0.1 24.8 SIRTZ 0.1 -2.1 0.1 15.7 €MZ7-MEL 3 PRICZBS 0.1 -1.0 0.4 4 CSTF3 0.1 ‘0.5 0.1 11.2 ETFDH 0.1 -0.5 0.1 10.8 MEDlZ 0.1 ‘0.1 0.1 21.9 SNRPDI 0.1 -0.1 0.1 18.0 {-MZ7-MEL MLLT5 0.1 -0.1 0.2 14.7 AFAPl 0.1 -0.1 0.2 5.1 6 1 0.1 0.1 0.1 41.9 7 DHX30 0.1 0.3 0.1 34.4 ALK 0.1 0.3 0.1 0.4 CHMP4B 0.1 0.3 0.7 52.3 8 HADHB 0.1 0.5 0.1 60.6 SUPTGH 0.1 0.5 0.1 25.4 C120I’f35 0.1 0.5 0.1 3.1 ZDHHCS 0.1 0.5 0.4 27.6 9 WlPFl 0.15 -2.1 0.15 37.2 ZNF740 0.15 -2.1 0.5 9.7 MLL 0.15 0.5 0.3 3.7 11 15 0.2 -2.1 0.1 6.3 CHDS 0.2 -2.1 0.2 10.3 12 DDXZS 0.2 -1.7 0.2 7.2 13 MAPKllPlL 0.2 ~1.0 0.2 12.1 «mums.

TRAK2 0.2 -0.1 0.5 21.8 16 KLHL13 0.2 0.1 0.2 3.4 17 FOSL2 0.2 0.3 0.2 9.4 18 UTRN 0.2 0.5 0.15 6.3
*Rank is based on Mmut and the T score. T score is based on the WAG250 log-odds matrix.

Table 6. HB∪HC-classified MZ7-MEL cell mutations. 12 of the 743 expressed nonsynonymous mutations in MZ7-MEL were classified as HB∪HC-immunogenic using the thresholds: α = 0.8, β = 0.2 and RPKM > 0.2. Rank is based on an Mmut (descending) → Mwt (ascending) sorting scheme. Immunogenic neoantigens identified by Lennerz et al. are highlighted in yellow.

Rank* Gene Mmut Mwt Mean exp
NF1 0.1 1.4 6.8
MESP2 0.1 1.3 0.3
EFTUD2 (SNRP116) 0.15 10.2 22.0 ← MZ7-MEL
SEC31A 0.15 2.55 33.3
ZNF335 0.2 18.35 8.3
CPEB1 0.2 4.2 6.0
UBAC2 0.2 2.8
ZNF557
TLK2
*Rank is based on Mmut and Mwt.

Of the 32 mutations classified as HA, the three HA-class mutations identified by Lennerz et al. (SIRT2, SNRPD1 and RBAF600) were ranked in 2nd, 4th and 13th positions out of 18 rank-classes, using an Mmut → T score ranking scheme (see Table 3). Of the 12 mutations classified as [...] higher (more realistic) threshold for Mwt was employed (e.g., α = 5) then the fourth Lennerz et al. mutation is ranked in the 1st position (together with just one additional anchor position mutation; see Table 7). Finally, the four Lennerz et al. mutations were predicted to have the correct HLA allele, epitope length and mutation position as reported by the authors.

Table 7. HB∪HC-classified MZ7-MEL cell mutations. 2 of the 743 expressed nonsynonymous mutations in MZ7-MEL were classified as HB∪HC-immunogenic using the thresholds: α = 5, β = 0.2 and RPKM > 0.2. Rank is based on an Mmut (descending) → Mwt (ascending) sorting scheme. Immunogenic neoantigens identified by Lennerz et al. are highlighted in yellow.

Rank* Gene Mmut Mwt Mean exp
1 EFTUD2 (SNRP116) 0.15 10.2 22.0 ← MZ7-MEL
2 ZNF335 0.2 18.35 8.3
*Rank is based on Mmut and Mwt.

Conclusions

The analysis of the B16 and CT26 datasets supports a model where immunogenicity is conferred if three conditions are satisfied: the wild type peptide is presented, the mutated peptide is presented, and the amino acid substitution has a sufficiently low log-odds score (Fig. 11A). This model for immunogenicity, which we refer to as class I immunogenicity, is further supported in the human melanoma cell line model, MZ7-MEL. The MZ7-MEL model and published CD8+ restricted neoantigens support a second model, which we refer to as class II immunogenicity, in which the wild type epitope is not presented, but a substitution (e.g., in an anchor position) leads to a significant increase in the MHC consensus score (>5 to 10), resulting in a novel, never-before-seen epitope (Fig. 11B). This framework for predicting immunogenicity is captured with a three-variable classification scheme (Mmut, Mwt, T score). Using this classification scheme we were able to reduce the MZ7-MEL 743 mutations to a list of 34 mutations, with 3 of the 3 Lennerz et al. epitopes ranking in the top 5 classes.

Table 7 demonstrates that class II immunogenic mutations are rare. Out of 743 mutations only 2 were classified as class II immunogenic (using a realistic threshold for Mwt) compared with roughly 30 class I immunogenic mutations. A paucity of HB∪HC-class mutations was also observed in the mouse melanoma models (Table 8). This observation underscores the importance of class I immunogenic mutations for personalized vaccines, which are expected to be the predominant type of mutations found in patient tumors that can be used for vaccination. At the same time, the fact that one of the five epitopes found by Lennerz et al. was class II immunogenic may indicate that class II immunogenic mutations are more potent or somehow selected by the immune system.

Table 8. Number of candidate HA and HB∪HC mutations in different tumor models (α = 0.8, α' = 5, β = 0.2, τ = 0.5).
Hypothesis Strain

Example 2: Materials and methods

The materials and methods used in Example 1 are described below:

Animals
C57BL/6J and Balb/cJ mice (CRL) were kept in accordance with federal and state policies on animal research at the University of Mainz.

Cells for melanoma and colorectal murine tumor model
B16F10 melanoma cell line (Product: ATCC CRL-6475, Lot Number: 58078645) and CT26.WT colon carcinoma cell line (Product: ATCC CRL-2638, Lot Number: 54) were purchased in 2010 from the American Type Culture Collection. Early (3rd, 4th) passages of cells were used for sequencing experiments. Cells were routinely tested for Mycoplasma. Re-authentication of cells has not been performed since receipt. MZ7-MEL cell line (established January 1988) and an autologous Epstein-Barr virus-transformed B cell line were obtained from Dr. Thomas Wölfel (Department of Medicine, Hematology Oncology, Johannes Gutenberg University).

Synthetic peptides
Peptides were purchased from Jerini Peptide Technologies (Berlin, Germany) or synthesized from the TRON peptide facility. Synthetic peptides were 27 amino acids long with the mutated (MUT) or wild-type (WT) amino acid on position 14.

Immunization of mice
Age-matched female C57BL/6 or Balb/c mice were injected intravenously with 20 µg in vitro transcribed mRNA formulated with 20 µl Lipofectamine™ RNAiMAX (Invitrogen) in PBS in a total injection volume of 200 µl (3 mice per group). The mice were immunized on day 0, 3, 7, 14 and 18. Twenty-three days after the initial injection mice were sacrificed and splenocytes were isolated for immunological testing (see ELISPOT assay). Sequences representing one (Monoepitope) or two mutations (Biepitope) were constructed using the sequence of 27 amino acids (aa) with the mutation on position 14 and cloned into the pST1-2BgUTR-A120 backbone (S. Holtkamp et al., Blood 108, 4009 (2006)). In vitro transcription from this template and purification were previously described (S. Kreiter et al., Cancer Immunology, Immunotherapy 56, 1577 (2007)).

Enzyme-linked immunospot assay
Enzyme-linked immunospot (ELISPOT) assay (S. Kreiter et al., Cancer Research 70, 9031) and generation of syngeneic bone marrow derived dendritic cells (BMDCs) as stimulators were previously described (M. B. Lutz et al., J. Immunol. Methods 223, 77 (1999)). For the B16F10 model BMDCs were peptide pulsed (6 µg/ml) with the indicated mutation, the corresponding wild-type or with control peptide (VSV-NP). For the CT26 model, in addition to the restimulation with peptides, BMDCs were transfected with the corresponding in vitro transcribed mRNA and used for restimulation as well. For the assay, 5 x 10^4 BMDCs were coincubated with 5 x 10^5 freshly isolated splenocytes in a microtiter plate coated with anti-IFN-γ antibody (10 µg/mL, clone AN18; Mabtech). After 18 hours at 37°C, cytokine secretion was detected with an anti-IFN-γ antibody (clone R4-6A2; Mabtech). Spot numbers were counted and analyzed with the ImmunoSpot® S5 Versa ELISPOT Analyzer, the ImmunoCapture™ Image Acquisition software and the ImmunoSpot® Analysis software Version 5. Statistical analysis was done by Student's t-test and Mann-Whitney test (non-parametric test). Responses were considered significant with a p-value < 0.05.

Intracellular cytokine assay
Aliquots of the splenocytes prepared for the ELISPOT assay were subjected to analysis of cytokine production by intracellular flow cytometry. To this end 2 x 10^6 splenocytes per sample were plated in culture medium (RPMI + 10% FCS) supplemented with the Golgi inhibitor Brefeldin A (10 µg/mL) in a 96-well plate. Cells from each animal were restimulated for 5 h at 37°C with 2 x 10^5 peptide pulsed or transfected BMDCs. After incubation the cells were washed with PBS, resuspended in 50 µl PBS and extracellularly stained with the following anti-mouse antibodies for 20 min at 4°C: anti-CD4 FITC, anti-CD8 APC-Cy7 (BD Pharmingen). After incubation the cells were washed with PBS and subsequently resuspended in 100 µL Cytofix/Cytoperm (BD Bioscience) solution for 20 min at 4°C for permeabilization of the outer membrane. After permeabilization the cells were washed with Perm/Wash-Buffer (BD Bioscience), resuspended in 50 µL/sample in Perm/Wash-Buffer and intracellularly stained with the following anti-mouse antibodies for 30 min at 4°C: anti-IFN-γ PE, anti-TNF-α PE-Cy7, anti-IL2 APC (BD Pharmingen). After washing with Perm/Wash-Buffer the cells were resuspended in PBS containing 1% paraformaldehyde for flow cytometry analysis. The samples were analyzed using a BD FACSCanto™ II cytometer and FlowJo (Version 7.6.3).

Next generation sequencing Nucleic acid extraction: DNA and RNA from bulk cells and DNA from mouse s were extracted using Qiagen DNeasy Blood and Tissue kit (for DNA) and Qiagen RNeasy Micro kit (for RNA).

DNA exome sequencing: Exome capture for B16F10, C57BL/6J and CT26.WT and DNA re-sequencing for Balb/cJ were performed in triplicates as previously described (J. C. Castle et al., Exploiting the mutanome for tumor vaccination. Cancer Research 72, 1081 (2012)). Exome capture for MZ7-MEL/EBV-B DNA re-sequencing was performed in duplicates using Agilent SureSelect XT Human All Exon 50 Mb solution-based capture assay, designed to capture all protein coding regions. 3 µg purified genomic DNA (gDNA) was fragmented to 0 bp using a Covaris S2 ultrasound device. Fragments were end repaired, 5' phosphorylated and 3' adenylated according to the manufacturer's instructions. Agilent indexing specific paired-end adapters were ligated to the gDNA fragments using a 10:1 molar ratio of adapter to gDNA. 4 cycle pre-capture amplification was done using Agilent's InPE 1.0 and SureSelect indexing pre-capture PCR primers and Herculase II polymerase. 500 ng of adapter ligated, PCR enriched gDNA fragments were hybridized to Agilent's exome capture baits for 24 hrs at 65 °C. Hybridized gDNA/RNA bait complexes were removed using streptavidin coated magnetic beads, washed, and the RNA baits cleaved off during elution in SureSelect elution buffer. The eluted gDNA fragments were PCR amplified post-capture for 10 cycles using SureSelect Indexing Post-Capture PCR and index PCR primers and Herculase II polymerase. All cleanups were done with 1.8x volume of Agencourt AMPure XP magnetic beads. All quality controls were done using Invitrogen's Qubit HS assay and fragment size was determined using Agilent's 2100 Bioanalyzer HS DNA assay.

Exome enriched gDNA libraries were clustered on the cBot using Truseq SR cluster kit v2.5 using 7 pM library and 1x100 bps were sequenced on the Illumina HiSeq2000 using Truseq SBS kits.

RNA gene expression profiling (RNA-Seq): Barcoded mRNA-seq cDNA libraries were prepared in replicate from 5 µg of total RNA (modified Illumina mRNA-seq protocol using NEB reagents). mRNA was isolated using Seramag Oligo(dT) magnetic beads (Thermo Scientific) and fragmented using divalent cations and heat. Resulting fragments (160-220 bp) were converted to cDNA using random primers and SuperScript II (Invitrogen) followed by second strand synthesis using DNA polymerase I and RNaseH. cDNA was end repaired, 5' phosphorylated and 3' adenylated according to NEB RNA library kit instructions. 3' single T-overhang Illumina multiplex specific adapters were ligated with T4 DNA ligase using a 10:1 molar ratio of adapter to cDNA insert. cDNA libraries were purified and size selected at 300 bp (E-Gel 2% SizeSelect gel, Invitrogen). Enrichment, adding of Illumina six base index and flow cell specific sequences was done by PCR using Phusion DNA polymerase and Illumina specific PCR primers. All cleanups up to this step were done with 1.8x volume of Agencourt AMPure XP magnetic beads. All quality controls were done using Invitrogen's Qubit HS assay and fragment size was determined using Agilent's 2100 Bioanalyzer HS DNA assay. Barcoded RNA-Seq libraries were clustered and 50 bps were sequenced as described above.

NGS data analysis, gene expression: The output sequence reads from RNA samples were preprocessed according to the Illumina standard protocol, including filtering for low quality reads. Sequence reads were aligned to the mm9 (A. T. Chinwalla et al., Nature 420, 520 (2002)) or hg18 (F. Collins, E. Lander, J. Rogers, R. Waterston, I. Conso, Nature 431, 931 (2004)) reference genomic sequence with bowtie (version 0.12.5) (B. Langmead, C. Trapnell, M. Pop, S. L. Salzberg, Genome Biol 10, R25 (2009)). For genome alignments two mismatches were allowed and only the best alignment ("-v2 --best") reported; for transcript alignments default parameters were used. Reads not alignable to the genomic sequence were aligned to a database of all possible exon-exon junction sequences of the UCSC known genes (F. Hsu et al., Bioinformatics 22, 1036 (2006)). Expression values were determined by intersecting read coordinates with those of RefSeq transcripts, counting overlapping exon and junction reads, and normalizing to RPKM expression units (Reads which map per Kilobase of exon model per million mapped reads) (A. Mortazavi, B. A. Williams, K. McCue, L. Schaeffer, B. Wold, Nature methods 5, 621 (2008)).
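The RPKM normalization named above can be written out as a short Python sketch. It follows the standard definition (reads per kilobase of exon model per million mapped reads); the function name and the example numbers are hypothetical and are not taken from the datasets described here.

```python
def rpkm(read_count, exon_length_bp, total_mapped_reads):
    """Reads per kilobase of exon model per million mapped reads."""
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    # reads / (length in kb) / (library size in millions)
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


# Hypothetical example: 500 reads on a 2 kb transcript in a 25 M read library.
example_value = rpkm(500, 2000, 25_000_000)
```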

NGS data analysis, somatic mutation discovery: Somatic mutations were identified as previously described (J. C. Castle et al., Exploiting the mutanome for tumor vaccination. Cancer Research 72, 1081 (2012)). Sequence reads were aligned to the mm9 or hg18 reference genome using bwa (default options, version 0.5.8c) (H. Li, R. Durbin, Bioinformatics 25, 1754). Ambiguous reads mapping to multiple locations of the genome were removed. Mutations were identified using a consensus of two software programs: samtools (version 0.1.8) (H. Li, Bioinformatics 27, 1157 (2011)) and SomaticSniper (A. McKenna et al., Genome Research 20, 1297 (2010)). For B16F10 and C57BL/6J, also GATK was included (A. McKenna et al., Genome Research 20, 1297 (2010)). Potential somatic variations identified in all respective replicates were assigned a "false discovery rate" (FDR) confidence value (M. Löwer et al., PLoS computational biology 8, e1002714 (2012)) (CT26 and MZ7-MEL only).

Mutation selection and validation
The criteria for selecting the 50 B16F10 mutations for immunogenicity testing were previously described (J. C. Castle et al., Exploiting the mutanome for tumor vaccination. Cancer Research 72, 1081 (2012)). These criteria for the mutations included: (i) presence in all three B16F10 replicates and absence from all C57BL/6 replicates, (ii) occur in a RefSeq transcript, (iii) cause nonsynonymous change, (iv) occurrence in B16F10-expressed genes (median RPKM across replicates >10, exon expression > 0) and (v) for each mutation the Mmut score (see below) was required to be < 5. Of the 59 remaining mutations, the product of the percentile ranks of MHC class I score, MHC class II score and transcript expression was formed, and the first 50 mutations (0.1 ≤ Mmut ≤ 3.9) were selected for confirmation by PCR (see (J. C. Castle et al., Exploiting the mutanome for tumor vaccination. Cancer Research 72, 1081 (2012)) for further details). The criteria for the 96 CT26.WT mutations selected for immunogenicity testing were further refined and included the following: (i) presence in all three CT26.WT replicates and absence from all three Balb/cJ replicates, (ii) FDR ≤ 0.05, (iii) occur in a UCSC known gene transcript, (iv) cause nonsynonymous change, (v) not present in dbSNP database, (vi) not in a genomic repeat region. From the remaining 493 mutations, eight 12-member groups were defined according to three features: Mmut score (lowest - .9] versus highest - [3.9-20.3]), compartment of the protein (extra-cellular, intra-cellular), and gene expression (below versus above the median of 7.1 RPKM), selecting mutations according to a greedy algorithm, and adjusting thresholds accordingly. 94 of the resultant 96 mutations were confirmed by PCR followed by Sanger sequencing.

The criteria for selecting MZ7-MEL mutations for analysis included: (i) presence in two MZ7-MEL replicates and absence from two autologous EBV-B replicates, followed by steps (ii) to (vi) described above for CT26.WT. Applying steps (i)-(vi) reduced the initial list of ~8000 mutations to 743.

MHC binding prediction and calculation of the Mmut score
MHC binding predictions are performed using the IEDB analysis resource Consensus tool (http://tools.immuneepitope.org/analyze/html/mhc_binding.html) (Y. Kim et al., Nucleic Acids Research 40, W525 (2012)), which combines the best performing prediction methods based on benchmarking studies (H. H. Lin, S. Ray, S. Tongchusak, E. L. Reinherz, V. Brusic, BMC immunology 9, 8 (2008); B. Peters et al., PLoS computational biology 2, e65 (2006)) from ANN (C. Lundegaard et al., Nucleic Acids Research 36, W509 (2008); M. Nielsen et al., Protein Science 12, 1007 (2009)), SMM (B. Peters, A. Sette, BMC bioinformatics 6, 132 (2005)) and for some allele models also comblib (J. Sidney et al., Immunome Research 4, 2 (2008)). The consensus approach combines the prediction scores of all tools by generating a percentile rank, which reflects the binding prediction scores of the given peptide against peptide scores of five million random peptides from SWISSPROT.

For each mutation we calculated the predicted MHC consensus scores for all possible (i) sequence windows (where to position the mutation), (ii) epitope lengths and (iii) possible murine MHC class I alleles. The minimum of all MHC consensus scores was defined to be the Mmut score.
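The scan over windows, epitope lengths and alleles can be sketched as follows. This is a minimal illustration, not the pipeline used here: `predict` stands in for a call to an MHC consensus predictor and is a hypothetical callable, the 27-mer and the allele names are placeholders, and the length range 8 to 11 is an assumption chosen only for the example.

```python
def windows_containing(seq, mut_index, lengths):
    """Yield every window of the given lengths that covers position mut_index."""
    for length in lengths:
        first = max(0, mut_index - length + 1)
        last = min(mut_index, len(seq) - length)
        for start in range(first, last + 1):
            yield seq[start:start + length]


def m_score(seq, mut_index, alleles, lengths, predict):
    """Minimum consensus score over all windows, lengths and alleles.

    Returns (score, allele, peptide) for the best-scoring combination.
    """
    return min(
        (predict(allele, peptide), allele, peptide)
        for peptide in windows_containing(seq, mut_index, lengths)
        for allele in alleles
    )


# Hypothetical 27-mer with the mutated residue at position 14 (index 13).
SEQ_27 = "ACDEFGHIKLMNPQRSTVWYACDEFGH"


def toy_predict(allele, peptide):
    # Stand-in for a real predictor: one combination is scored as a binder.
    return 0.1 if (allele, peptide) == ("H-2-Kd", "LMNPQRSTV") else 5.0


best = m_score(SEQ_27, 13, ["H-2-Kd", "H-2-Dd"], range(8, 12), toy_predict)
```

Applying the same function to the wild-type 27-mer would give the corresponding wild-type score under the same assumptions.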

Calculation of log-odds matrices and the T score
Log-odds matrices can be estimated from sequence alignment comparisons of large protein databases. The early log-odds matrices were based on pairwise comparison of sequences (BLOSUM62 (S. Kreiter et al., Cancer Immunology, Immunotherapy 56, 1577 (2007))) and the maximum parsimony (MP) estimation method (e.g., PAM250 (M. O. Dayhoff, R. M. Schwartz, B. C. Orcutt, A model for evolutionary change. MO Dayhoff, ed. Atlas of protein sequence and structure Vol.5, 345 (1978)), JTT250 (S. Q. Le, O. Gascuel, Molecular biology and evolution 25, 1307 (2008)), and the Gonnet matrix (C. C. Dang, V. Lefort, V. S. Le, Q. S. Le, O. Gascuel, Bioinformatics 27, 2758 (2011))). More recently, maximum likelihood (ML) based methods were developed (e.g., VT160 (P. G. Higgs, T. K. Attwood, Bioinformatics and molecular evolution (Wiley-Blackwell, 2009)), WAG (S. Whelan, N. Goldman, Molecular biology and evolution 18, 691) and LG (V. Lennerz et al., Proceedings of the National Academy of Sciences of the United States of America 102, 16013 (2005))). Since ML is not limited to comparison of only closely related sequences, as is the case with MP based approaches, this estimation approach is expected to be the most accurate.

Calculation of the log-odds matrix has been described in detail elsewhere (C. C. Dang, V. Lefort, V. S. Le, Q. S. Le, O. Gascuel, Bioinformatics 27, 2758 (2011)). Briefly, the standard model for amino acid substitution assumes a Markovian, time-continuous, time-reversible model represented by a 20x20 rate matrix $Q$, where $Q_{ij}$ ($i \neq j$) is the number of substitutions from amino acid $i$ to $j$ per unit of time, and where diagonal elements are chosen to satisfy $Q_{ii} = -\sum_{j \neq i} Q_{ij}$. $Q$ can be decomposed such that $Q_{ij} = S_{ij}\,\pi_j$ for $i \neq j$, where $S_{ij}$ is a symmetric exchangeability matrix, and $\pi_i$ is the probability to observe amino acid $i$ (C. C. Dang, V. Lefort, V. S. Le, Q. S. Le, O. Gascuel, Bioinformatics 27, 2758 (2011)). Finally, $Q$ is normalized such that

$$-\sum_i \pi_i Q_{ii} = 1,$$

so that a time unit $t = 1.0$ corresponds to 1.0 expected substitution per site, or one "accepted point mutation" per site, denoted by a PAM distance of 100 (M. O. Dayhoff, R. M. Schwartz, B. C. Orcutt, A model for evolutionary change. MO Dayhoff, ed. Atlas of protein sequence and structure Vol.5, 345 (1978); S. Q. Le, O. Gascuel, Molecular biology and evolution 25, 1307 (2008); C. C. Dang, V. Lefort, V. S. Le, Q. S. Le, O. Gascuel, Bioinformatics 27, 2758 (2011)). The probability for amino acid $i$ to be replaced by amino acid $j$ after time $t$, $\Pr(i \to j \mid t) = P_{ij}(t)$, is given by the 20x20 probability matrix $P(t) = e^{tQ}$ (with the notation denoting matrix exponentiation). The log-odds matrix calculated for time $t$ is given by the 20x20 matrix $T_{ij}(t)$:

$$T_{ij} = 10 \log_{10} \frac{P_{ij}(t)}{\pi_j} \qquad (2)$$

(M. O. Dayhoff, R. M. Schwartz, B. C. Orcutt, A model for evolutionary change. MO Dayhoff, ed. Atlas of protein sequence and structure Vol.5, 345 (1978)). A time-reversible model means that $\pi_i P_{ij}(t) = \pi_j P_{ji}(t)$, and therefore $T_{ij}$ is symmetric (P. G. Higgs, T. K. Attwood, Bioinformatics and molecular evolution (Wiley-Blackwell, 2009)).
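The construction above (rate matrix from exchangeabilities and frequencies, normalization to one expected substitution per site, matrix exponentiation, log-odds) can be traced on a toy example. The following pure-Python sketch uses a hypothetical three-letter alphabet with made-up exchangeabilities and frequencies; it is not the WAG model, and the simple scaling-and-squaring exponential is an implementation choice made only to keep the example self-contained.

```python
import math


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expm(m, squarings=10, terms=12):
    """Matrix exponential by scaling and squaring with a Taylor series."""
    n = len(m)
    scaled = [[x / 2 ** squarings for x in row] for row in m]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in matmul(term, scaled)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result


def rate_matrix(S, pi):
    """Q_ij = S_ij * pi_j off the diagonal, rows sum to zero, normalized rate 1."""
    n = len(pi)
    Q = [[S[i][j] * pi[j] if i != j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        Q[i][i] = -sum(Q[i])
    rate = -sum(pi[i] * Q[i][i] for i in range(n))
    return [[q / rate for q in row] for row in Q]


def transition_matrix(S, pi, pam):
    t = pam / 100.0  # a PAM distance of 100 corresponds to t = 1.0
    return expm([[q * t for q in row] for row in rate_matrix(S, pi)])


def log_odds(S, pi, pam):
    """T_ij = 10 * log10(P_ij(t) / pi_j)."""
    P = transition_matrix(S, pi, pam)
    n = len(pi)
    return [[10 * math.log10(P[i][j] / pi[j]) for j in range(n)] for i in range(n)]


# Hypothetical 3-state model (illustration only).
S_TOY = [[0.0, 1.0, 2.0], [1.0, 0.0, 0.5], [2.0, 0.5, 0.0]]
PI_TOY = [0.5, 0.3, 0.2]
T_TOY = log_odds(S_TOY, PI_TOY, 250)
```

For a real 20x20 model the same steps would apply with published exchangeabilities and frequencies, typically using a numerical library for the matrix exponential.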

The T score for the substitution i ↔ j is defined here as $T_{ij}$, and depends on the evolutionary model and the time t. We explored various models and PAM distances for the T score, including PAM, BLOSUM62, JTT, VT160, Gonnet, WAG, WAG*, and LG (see references above). The figures in this report were generated using a T score based on the WAG model and a PAM distance of 250. Such a large PAM distance means that there is substantial chance for the amino acid to change (P. G. Higgs, T. K. Attwood, Bioinformatics and molecular evolution (Wiley-Blackwell, 2009)), and is useful in detecting distant relationships between sequences where residues may not be identical but the physico-chemical properties of the amino acids are conserved (M. O. Dayhoff, R. M. Schwartz, B. C. Orcutt, A model for evolutionary change. MO Dayhoff, ed. Atlas of protein sequence and structure Vol.5, 345 (1978); P. G. Higgs, T. K. Attwood, Bioinformatics and molecular evolution (Wiley-Blackwell, 2009)).

Using a t-distribution test statistic we compared the mean T scores of immunogenic versus non-immunogenic epitopes from Table 3 for the WAG matrix using various PAM scores (1, 25, 50, 100, 150, 200, and 250). Analysis of the test statistic showed that the P value decreased monotonically with PAM distance, implying that a PAM distance of 250 was the optimal solution, as would be anticipated (data not shown). The classification into HA and Hn was the same for all matrices except for the PAM matrix, which is the least accurate of all evolutionary models. Of all evolutionary models, the WAG250 model resulted in the maximum separation between HA and Hn epitopes in Table 3, measuring separation with the test statistic: [max T score(HA) - min T score(Hn)]/σ(T score(HA), T score(Hn)) (data not shown). The same test statistic was also maximal for a PAM distance of 250 compared to smaller distances.

Published CD8+ epitopes
CD8+ epitopes with single mutated amino acids were collected from the list of tumor antigens resulting from mutations published by the Cancer Immunity Journal (P. Van der Bruggen, V. Stroobant, N. Vigneron, B. Van den Eynde, Cancer Immun. http://www.cancerimmunity.org/peptide/, 2013) (http://cancerimmunity.org/peptide/mutations/). HLA alleles were taken either from the published table or from the original paper if the latter was more precise. References listed in Table 4 are the following: (1) Lennerz et al. PNAS 102 (44), pp. 16013-16018 (2005); (2) Karanikas et al. Cancer Res 61 (9), pp. 3718-3724; (3) Sensi et al. Cancer Res 65 (2), pp. 632-640 (2005); (4) Linard et al. J. Immunol 168 (9), pp. 4802-4808 (2002); (5) Zorn et al. Eur. J. Immunol 29 (2), pp. 592-601 (1999); (6) Graf et al. Blood 109 (7), pp. 2985-2988 (2007); (7) Robbins et al. J. Exp. Med 183 (3), pp. 1185-1192 (1996); (8) Vigneron et al. Cancer Immun 2, pp. 9 (2002); (9) Echchakir et al. Cancer Res 61 (10), pp. 4078-4083 (2001); (10) Hogan et al. Cancer Res 58 (22), pp. 150 (1998); (11) Ito et al. Int. J. Cancer 120 (12), pp. 2618-2624 (2007); (12) Wölfel et al. Science 269 (5228), pp. 1281-1284 (1995); (13) Gjertsen et al. Int. J. Cancer 72 (5), pp. 784-790 (1997).

Example 3: Example of a scheme for weighing mutation scores to improve prioritization of immunogenic mutations

RNA that is injected into the cell, once translated and cleaved into short peptides, can be presented on different HLA types within the cell. Therefore it stands to reason that the more HLA types that are predicted to have a low MHC consensus (or similar) score, the more likely a given mutation will be immunogenic, since it can potentially be displayed on more than one HLA type in parallel. Thus, weighing mutations by the number of HLA types for which the mutation is classified as HA and/or HB∪HC, or even weighing each mutation simply by the number of HLA types that have a low Mmut score, may improve immunogenicity ranking. In the most general solution, when we inject a 27mer RNA or peptide into the cell, there is not just the freedom to select the HLA type, but also the length of the peptide and the position of the mutation within this peptide. Therefore, one can scan all possible HLA types, all possible window lengths and all possible positions for the mutation within the window and calculate the number of solutions (per given mutation) that are classified as HA and/or HB∪HC (Fig. 12). This may be an important weighing factor for mutation prioritization to select the most efficacious epitopes for vaccination.

An example of a scatter plot of all these solutions as a function of Mmut and ΔM = Mmut - Mwt is shown in Fig. 13.
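The counting scheme of Example 3 can be sketched as below. This is an illustrative reading, not a definitive implementation: `predict` is a hypothetical stand-in for an MHC consensus predictor, the sequences and allele names are placeholders, and the thresholds α = 0.8, β = 0.2, τ = 0.5 are taken from the values listed for Table 8. Each (allele, window length, window position) combination counts as one solution.

```python
def count_solutions(wt_seq, mut_seq, mut_index, alleles, lengths, predict,
                    t_score, alpha=0.8, beta=0.2, tau=0.5):
    """Count solutions classified as HA or HB_or_HC for one mutation."""
    counts = {"HA": 0, "HB_or_HC": 0}
    for length in lengths:
        first = max(0, mut_index - length + 1)
        last = min(mut_index, len(mut_seq) - length)
        for start in range(first, last + 1):
            for allele in alleles:
                m_mut = predict(allele, mut_seq[start:start + length])
                if m_mut > beta:
                    continue                  # mutated peptide not presented
                m_wt = predict(allele, wt_seq[start:start + length])
                if m_wt > alpha:
                    counts["HB_or_HC"] += 1   # wild type not presented
                elif t_score <= tau:
                    counts["HA"] += 1         # both presented, low T score
    return counts


# Hypothetical 27-mers differing only at position 14 (index 13): Q -> L.
WT_27 = "ACDEFGHIKLMNPQRSTVWYACDEFGH"
MUT_27 = WT_27[:13] + "L" + WT_27[14:]


def toy_predict(allele, peptide):
    # Stand-in predictor: allele "A" binds peptides ending in L or V.
    return 0.1 if allele == "A" and peptide[-1] in "LV" else 10.0


weights = count_solutions(WT_27, MUT_27, 13, ["A", "B"], (8, 9, 10),
                          toy_predict, t_score=-1.0)
```

The resulting counts could then serve as a per-mutation weight when ranking candidates.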

Claims

1. A method for producing a vaccine, the method comprising the steps:
a) ascertaining a score (Mwt) for binding of a non-modified peptide to one or more MHC molecules,
b) ascertaining a score (Mmut) for binding of a modified peptide to one or more MHC molecules, wherein the modified peptide comprises the amino acid sequence of the non-modified peptide with an amino acid substituted at a position corresponding to the same relative position in the non-modified peptide,
c) ascertaining a T score for binding of the modified peptide when present in a MHC-peptide complex to one or more T cell receptors, wherein the T score is based on the chemical and physical similarities between the substituted amino acid in the modified peptide and the amino acid at the corresponding position in the non-modified peptide, wherein the score for the chemical and physical similarities is ascertained on the basis of the probability of amino acids being interchanged in naturally occurring amino acid sequence;
d) predicting the modified peptide to be immunogenic if (i) the Mwt meets a threshold indicating binding to the one or more MHC molecules, (ii) the Mmut meets a threshold indicating binding to the one or more MHC molecules; and (iii) the T score meets a threshold indicating binding of the modified peptide when present in a MHC-peptide complex to one or more T cell receptors, and
e) producing a vaccine comprising a peptide comprising the amino acid sequence of the modified peptide predicted to be immunogenic in d) or a nucleic acid encoding the peptide comprising the amino acid of the peptide predicted to be immunogenic.

2. The method of claim 1, wherein the modified peptide comprises a fragment of a modified protein, said fragment comprising the modification(s) present in the protein.

3. The method of claim 1 or 2, wherein the non-modified peptide has a germline cell amino acid at the position(s) corresponding to the position(s) of the modification(s) in the modified peptide.

4. The method of any one of claims 1 to 3, wherein the non-modified peptide and modified peptide are identical but for the modification(s).

5. The method of any one of claims 1 to 4, wherein the non-modified peptide and modified peptide are 8 to 15 amino acids in length.

6. The method of any one of claims 1 to 5, wherein the one or more MHC molecules comprise different MHC molecule types, in particular different MHC alleles.

7. The method of any one of claims 1 to 6, wherein the one or more MHC molecules are MHC class I molecules and/or MHC class II molecules.

8. The method of any one of claims 1 to 7, wherein the score for binding to one or more MHC molecules is ascertained by a process comprising a sequence comparison with a database of MHC-binding motifs.

9. The method of any one of claims 1 to 8, wherein the threshold applied in step a) is different to the threshold applied in step b).

10. The method of any one of claims 1 to 9, wherein the threshold for binding to one or more MHC molecules reflects a probability for binding to one or more MHC molecules.

11. The method of any one of claims 1 to 10, wherein the chemical and physical similarities are determined using evolutionary based log-odds matrices.

12. The method of any one of claims 1 to 11, wherein the modification is not in an anchor position for binding to one or more MHC molecules.

13. The method of any one of claims 1 to 11, wherein the modification is in an anchor position for binding to one or more MHC molecules.

14. The method of any one of claims 1 to 13 which comprises performing step b) on two or more different modified peptides comprising different substituted amino acids.

15. The method of claim 14, wherein the different substituted amino acids are present in different proteins.

16. The method of claim 15, wherein the different substituted amino acids are present in the same protein.

17. The method of any one of claims 14 to 16, which comprises comparing the scores of two or more of said different modified peptides.

18. The method of claim 17, wherein a score for binding of the modified peptide to one or more MHC molecules is weighted higher than a score for binding of the modified peptide when present in a MHC-peptide complex to one or more T cell receptors.

19. The method of claim 17, wherein a score for binding of the modified peptide when present in a MHC-peptide complex to one or more T cell receptors is weighted higher than a score for binding of the non-modified peptide to one or more MHC molecules.

20. The method of any one of claims 1 to 19, further comprising identifying nonsynonymous mutations in one or more protein-coding regions.

21. The method of any one of claims 1 to 20, wherein modifications are identified by partially or completely sequencing the genome or transcriptome of one or more cells and identifying mutations in one or more protein-coding regions.

22. The method of claim 21, wherein the one or more cells comprise one or more cancer cells.

23. The method of claim 21, further comprising partially or completely sequencing the genome or transcriptome of one or more non-cancerous cells.

24. The method of any one of claims 20 to 23, wherein said mutations are somatic mutations.

25. The method of any one of claims 20 to 23, wherein said mutations are cancer mutations.

26. A method for producing a vaccine, the method comprising the steps:
a) ascertaining a score (Mwt) for binding of a non-modified peptide to one or more MHC molecules,
b) ascertaining a score (Mmut) for binding of a modified peptide to one or more MHC molecules, wherein the modified peptide comprises the amino acid sequence of the non-modified peptide with an amino acid substituted at a position corresponding to the same relative position in the non-modified peptide,
c) ascertaining a T score for binding of the modified peptide when present in a MHC-peptide complex to one or more T cell receptors, wherein the T score is based on the chemical and physical similarities between the substituted amino acid in the modified peptide and the amino acid at the corresponding position in the non-modified peptide, wherein the score for the chemical and physical similarities is ascertained on the basis of the probability of amino acids being interchanged in naturally occurring amino acid sequence;
d) predicting the modified peptide to be immunogenic if (i) the Mwt meets a threshold indicating binding to the one or more MHC molecules, (ii) the Mmut meets a threshold indicating binding to the one or more MHC molecules; and (iii) the T score meets a threshold indicating binding of the modified peptide when present in a MHC-peptide complex to one or more T cell receptors, and
e) producing a vaccine comprising a peptide comprising the amino acid sequence of the modified peptide predicted to be immunogenic in d) or a nucleic acid encoding the peptide comprising the amino acid of the peptide predicted to be immunogenic,
wherein step b) is performed on two or more different modified peptides, said two or more different modified peptides comprising the same substituted amino acid, and wherein the method further comprises selecting a modified peptide, from the two or more different modified peptides comprising the same substituted amino acid, having a high probability or having the highest probability of binding to one or more MHC molecules.

27. The method of claim 26, wherein the two or more different modified peptides comprising the same substituted amino acid comprise different fragments of a modified protein comprising the same substituted amino acid.

28. The method of claim 26 or 27, wherein the two or more different modified peptides comprising the same substituted amino acid comprise all potential MHC binding fragments of a modified protein comprising the same substituted amino acid.

29. The method of any one of claims 26 to 28, wherein the two or more different modified peptides comprising the same modification(s) differ in length and/or position of the modification(s).

30. A method for producing a vaccine, the method comprising combining a modified peptide predicted to be immunogenic, or a nucleic acid encoding the modified peptide predicted to be immunogenic, with one or more pharmaceutically acceptable excipients, wherein the modified peptide was previously predicted to be immunogenic by a method that comprised:
a) ascertaining a score (Mwt) for binding of a non-modified peptide to one or more MHC molecules,
b) ascertaining a score (Mmut) for binding of a modified peptide to one or more MHC molecules, wherein the modified peptide comprises the amino acid sequence of the non-modified peptide with an amino acid substituted at a position corresponding to the same relative position in the non-modified peptide,
c) ascertaining a T score for binding of the modified peptide when present in a MHC-peptide complex to one or more T cell receptors, wherein the T score is based on the chemical and physical similarities between the substituted amino acid in the modified peptide and the amino acid at the corresponding position in the non-modified peptide, wherein the score for the chemical and physical similarities is ascertained on the basis of the probability of amino acids being interchanged in naturally occurring amino acid sequence; and
d) predicting the modified peptide to be immunogenic if (i) the Mwt meets a threshold indicating binding to the one or more MHC molecules, (ii) the Mmut meets a threshold indicating binding to the one or more MHC molecules; and (iii) the T score meets a threshold indicating binding of the modified peptide when present in a MHC-peptide complex to one or more T cell receptors.