WO2017151768A1 - Data processing and classification for determining an effectiveness score for an immunotherapy
- Publication number
- WO2017151768A1 (application PCT/US2017/020200)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- gene
- group
- genes
- immunotherapy
- classifier
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
- C12Q1/6886—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/20—Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
-
- C—CHEMISTRY; METALLURGY
- C40—COMBINATORIAL TECHNOLOGY
- C40B—COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
- C40B40/00—Libraries per se, e.g. arrays, mixtures
- C40B40/04—Libraries containing only organic compounds
- C40B40/06—Libraries containing nucleotides or polynucleotides, or derivatives thereof
- C40B40/08—Libraries containing RNA or DNA which encodes proteins, e.g. gene libraries
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/106—Pharmacogenomics, i.e. genetic variability in individual responses to drugs and drug metabolism
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/158—Expression markers
Definitions
- the disclosure relates to data processing methods, computer-readable hardware storage devices, and systems for correlating data corresponding to levels of biomarkers with immunotherapy effectiveness.
- a classifier maps input data to a category, by determining whether the input data classifies with a first category as opposed to another category.
- there are various types of classifiers, including linear discriminant classifiers, logistic regression classifiers, support vector machine classifiers, nearest neighbor classifiers, ensemble classifiers, and so forth.
- the present disclosure relates to a computer-implemented method for processing data in one or more data processing devices to determine an effectiveness score for predicting response to an immunotherapy.
- the disclosure relates to a computer-implemented method for processing data in one or more data processing devices to determine an effectiveness score for an immunotherapy for a test subject.
- the method includes: inputting, into a classifier, data representing one or more values for a classifier parameter that represents a gene-specific level of mRNA transcribed from a gene of a set of genes in a sample of blood collected from a test subject who was not treated with the immunotherapy prior to collecting the sample, with the input data specifying a gene-specific level of mRNA transcribed from each gene of the set of genes in the sample of blood of the test subject.
- the set of genes can be any set of genes that can be used to determine an effectiveness score for an immunotherapy for a test subject who was not treated with the immunotherapy as described in this disclosure, with the classifier being for determining an effectiveness score indicating whether the gene-specific levels of mRNA transcribed from each gene in the set of genes classifies with (A) a set of responder levels, the set of responder levels being a set of gene-specific levels of mRNA transcribed from each gene of the set of genes in blood samples collected from a first group of individuals prior to treatment with the immunotherapy, wherein each individual of the first group was, after collection of his or her blood sample, treated with and responded to the immunotherapy; as opposed to classifying with (B) a set of non-responder levels, the set of non-responder levels being a set of gene-specific levels of mRNA transcribed from each gene of the set of genes in blood samples collected from a second group of individuals prior to treatment with the immunotherapy, wherein each individual of the second group was, after collection of his or her blood
- the input data include data having one or more records that each have one or more values for the parameter representing the level of transcribed mRNA.
- the methods of determining the effectiveness score for the immunotherapy for the test subject include: determining, by the one or more data processing devices based on application of the classifier to the input data having the one or more records, the effectiveness score for the immunotherapy for the test subject.
- the disclosure also relates to one or more machine-readable hardware storage devices for processing data to determine an effectiveness score for an immunotherapy for a test subject by storing instructions that are executable by one or more data processing devices to perform operations.
- the operations include: inputting, into a classifier, data representing one or more values for a classifier parameter that represents a gene-specific level of mRNA transcribed from a gene of a set of genes in a sample of blood collected from a test subject who was not treated with the immunotherapy prior to collecting the sample, with the input data specifying a gene-specific level of mRNA transcribed from each gene of the set of genes in the sample of blood of the test subject.
- the set of genes can be any set of genes that can be used to determine an effectiveness score for an immunotherapy for a test subject who was not treated with the immunotherapy as described in this disclosure, with the classifier being for determining an effectiveness score indicating whether the gene-specific levels of mRNA transcribed from each gene in the set of genes classifies with (A) a set of responder levels, the set of responder levels being a set of gene-specific levels of mRNA transcribed from each gene of the set of genes in blood samples collected from a first group of individuals prior to treatment with the immunotherapy, wherein each individual of the first group was, after collection of his or her blood sample, treated with and responded to the immunotherapy; as opposed to classifying with (B) a set of non-responder levels, the set of non-responder levels being a set of gene-specific levels of mRNA transcribed from each gene of the set of genes in blood samples collected from a second group of individuals prior to treatment with the immunotherapy, wherein each individual of the second group was, after collection of his or her blood
- the disclosure also relates to a system having one or more data processing devices; and one or more machine-readable hardware storage devices for processing data to determine an effectiveness score for an immunotherapy for a test subject by storing instructions that are executable by the one or more data processing devices to perform operations.
- the operations include: inputting, into a classifier, data representing one or more values for a classifier parameter that represents a gene-specific level of mRNA transcribed from a gene of a set of genes in a sample of blood collected from a test subject who was not treated with the immunotherapy prior to collecting the sample, with the input data specifying a gene-specific level of mRNA transcribed from each gene of the set of genes in the sample of blood of the test subject.
- the set of genes can be any set of genes that can be used to determine an effectiveness score for an immunotherapy for a test subject who was not treated with the immunotherapy as described in this disclosure, with the classifier being for determining an effectiveness score indicating whether the gene-specific levels of mRNA transcribed from each gene in the set of genes classifies with (A) a set of responder levels, the set of responder levels being a set of gene-specific levels of mRNA transcribed from each gene of the set of genes in blood samples collected from a first group of individuals prior to treatment with the immunotherapy, wherein each individual of the first group was, after collection of his or her blood sample, treated with and responded to the immunotherapy; as opposed to classifying with (B) a set of non-responder levels, the set of non-responder levels being a set of gene-specific levels of mRNA transcribed from each gene of the set of genes in blood samples collected from a second group of individuals prior to treatment with the immunotherapy, wherein each individual of the second group was, after collection of his or her blood
- Y is an effectiveness score indicating a probability that the set of test levels classifies with the set of responder levels, as opposed to the set of non-responder levels
- X_i is a level of mRNA transcribed from an ith gene of the set of genes in blood of the test subject
- β_i is a logistic regression equation coefficient for the ith gene
- α is a logistic regression equation constant that can be zero
- β_i and α are the result of applying logistic regression analysis to the set of responder levels and the set of non-responder levels.
- in some embodiments, β_i is a standardized logistic regression coefficient for the ith gene.
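The definitions above describe a logistic regression score, but the equation itself is not reproduced in this extract. The following is a minimal sketch that assumes the standard logistic form, Y = 1 / (1 + exp(-(constant + sum of coefficient_i * level_i))). The function name, gene names, coefficient values, and mRNA levels are hypothetical placeholders, not values from the disclosure.

```python
import math

def effectiveness_score(levels, coefficients, constant=0.0):
    """Return Y, a score in (0, 1).

    Assumes the standard logistic form:
    Y = 1 / (1 + exp(-(constant + sum_i coefficient_i * level_i))).
    """
    missing = set(coefficients) - set(levels)
    if missing:
        raise ValueError(f"missing mRNA levels for: {sorted(missing)}")
    # Linear predictor: constant plus the coefficient-weighted mRNA levels.
    linear = constant + sum(coefficients[g] * levels[g] for g in coefficients)
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical two-gene classifier and test-subject levels (illustration only).
coefficients = {"GENE_1": 0.8, "GENE_2": -0.5}
levels = {"GENE_1": 2.0, "GENE_2": 1.0}
score = effectiveness_score(levels, coefficients, constant=-0.1)
print(round(score, 4))
```

A score above 0.5 would mean the test levels sit closer to the responder levels than to the non-responder levels under this assumed form. A constant of zero, which the definitions allow, simply drops the intercept term.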
- the disclosure relates to a set of genes that can be used to determine an effectiveness score for an immunotherapy for a test subject who was not treated with the immunotherapy prior to collecting the sample.
- the set of genes includes at least two genes listed in Table 2, and in some embodiments the entire set is limited to genes selected from those listed in Table 2.
- a set of genes useful in the present methods will include from 3 to 168 genes, e.g., 3 to 79, 3 to 55, 3 to 20, 3 to 18, 3 to 10, or 3 to 6.
- the set of genes includes any one or more of the combinations listed below as (a), (b), (c), (d), (e) and (f):
- BAX BCL2-Associated X Protein
- the set of genes further includes at least one, at least two, or at least three genes selected from Group B. In some embodiments, the set of genes further includes at least one, at least two, or at least three genes selected from Group C.
- the set of genes includes at least four, at least five, or at least six genes selected from Groups A-C, including at least one or at least two genes from each of Groups A-C; and at least one, at least two, or at least three genes selected from Group D.
- a ratio defined as the total number of genes that are selected from Group D divided by the total number of genes that are selected from a group of genes consisting of BAX, LARGE and genes in Groups A-D is 0.34 or less.
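The ratio constraint above can be checked mechanically. The following is a minimal sketch with hypothetical group memberships: the contents of Groups A-D are not reproduced in this extract, so the placeholder gene names are invented for illustration. Only BAX and LARGE are named in the text.

```python
def group_d_ratio(selected, group_d, reference_pool):
    """Count of selected Group D genes divided by the count of selected
    genes that belong to the reference pool (BAX, LARGE and Groups A-D)."""
    in_pool = {g for g in selected if g in reference_pool}
    if not in_pool:
        raise ValueError("no selected gene belongs to the reference pool")
    in_d = in_pool & set(group_d)
    return len(in_d) / len(in_pool)

# Placeholder memberships: hypothetical, for illustration only.
group_a = {"A_GENE_1", "A_GENE_2"}
group_b = {"B_GENE_1", "B_GENE_2"}
group_c = {"C_GENE_1", "C_GENE_2"}
group_d = {"D_GENE_1", "D_GENE_2"}
reference_pool = {"BAX", "LARGE"} | group_a | group_b | group_c | group_d

selected = ["LARGE", "A_GENE_1", "B_GENE_1", "C_GENE_1", "D_GENE_1"]
ratio = group_d_ratio(selected, group_d, reference_pool)
print(ratio, ratio <= 0.34)
```

With one Group D gene among five pooled genes, the ratio stays under the 0.34 limit. Adding more Group D genes without adding genes from the other groups would eventually violate it.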
- the set of genes includes at least six, at least seven, or at least eight genes selected from Groups A-C, including at least one or at least two genes from each of Groups A-C; and any one or more of the following combinations:
- the ratio defined as the total number of genes that are selected from Groups E1-E5 divided by the total number of genes that are selected from Groups A, B, C, E1, E2, E3, E4, and E5 is 0.25 or less.
- the set of genes includes at least six, at least seven, or at least eight genes selected from Groups A-D, including at least one or at least two genes from each of Groups A-D; and any one or more of the following combinations:
- the ratio defined as the total number of genes that are selected from Groups E1-E5, divided by the total number of genes that are selected from Groups A, B, C, D, E1, E2, E3, E4, and E5 is 0.25 or less.
- the gene list in Group A can be substituted with the gene list in Group A1.
- the gene list in Group B can be substituted with the gene list in Group B1.
- the gene list in Group C can be substituted with the gene list in Group C1.
- the gene list in Group D can be substituted with the gene list in Group D1.
- a combination of genes from Groups A1 and B1 can be substituted for a combination of genes from Groups A and B, wherever the latter combination is mentioned (whether alone or in combination with other genes, such as genes from other Groups); and a combination of genes from Groups A1, B1, and C1 can be substituted for a combination of genes from Groups A, B, and C, wherever the latter combination is mentioned (whether alone or in combination with other genes, such as genes from other Groups).
- the set of genes includes any one or more of the combinations listed below as (a), (b), (c), (d), (e) and (f):
- the set of genes further includes at least one, at least two, or at least three genes selected from Group B1. In some embodiments, the set of genes further includes at least one, at least two, or at least three genes selected from Group C1.
- the set of genes includes LARGE and at least one (e.g., at least two, at least three, at least four, at least five, at least six, at least seven, at least eight, at least nine, at least ten, at least eleven, at least twelve, or at least thirteen) of the genes selected from the group of ADAM17, CDK2, CDKN2A, DPP4, ERBB2, HLADRA, ICOS, ITGA4, MYC, NAB2, NRAS, RHOC, TGFB1, and TIMP1.
- the set of genes includes all of ADAM17, CDK2, CDKN2A, DPP4, ERBB2, HLADRA, ICOS, ITGA4, LARGE, MYC, NAB2, NRAS, RHOC, TGFB1, and TIMP1.
- the set of genes includes at least four, at least five, or at least six genes selected from Groups A1, B1, and C1, including at least one or at least two genes from each of Groups A1, B1, and C1; and at least one, at least two, or at least three genes selected from Group D1.
- a ratio defined as the total number of genes that are selected from Group D1 divided by the total number of genes that are selected from a group of genes consisting of BAX, LARGE and genes in Groups A1, B1, C1, and D1 is 0.34 or less.
- the test subject, the group of individuals who responded to the immunotherapy, and the group of individuals who did not respond to the immunotherapy are all chemotherapy treatment naive, and the set of genes further includes at least one, at least two, or at least three genes selected from Group F2.
- the set of genes includes any one or more of the combinations listed below as (a), (b), (c) and (d):
- the set of genes further includes at least one, at least two, or at least three genes selected from Group B. In some embodiments, the set of genes further includes at least one, at least two, or at least three genes selected from Group C.
- the set of genes includes any of the combinations listed below as (a), (b), (c), (d), (e), (f), (g), (h), (i), (j), and (k):
- CDK2 Cyclin-Dependent Kinase 2
- the set of genes further includes at least one gene, at least two genes, or at least three genes selected from AXIN2, BAD, CD26, CD97, CDKN1B, CXCR3, FOXP3, GZMA, ICOS, IL18BP, MMP9, NFKB1, PLA2G7, PTPRC, TGFB1, TLR9, TNFRSF13B, TNFRSF1B, TNFSF6, TOSO, and TXNRD1.
- the set of genes includes CDK2 and TIMP1. In some embodiments, the set of genes includes CDK2, TIMP1 and NFKB1. In some embodiments, the set of genes includes CDK2, TIMP1, NFKB1, and TXNRD1. In some embodiments, the set of genes includes CDK2 and at least one other gene. In some embodiments, the set of genes includes CDK2 and at least one (e.g., at least two, at least three, at least four, at least five, or at least six) of the genes selected from the group of CCR9, EGR1, ELANE, HSPA1A, ICAM1, MMP9, and TIMP1. In some embodiments, the set of genes includes all of CDK2, CCR9, EGR1, ELANE, HSPA1A, ICAM1, MMP9, and TIMP1.
- the set of genes includes LARGE and at least one other gene.
- the set of genes includes LARGE and at least one (e.g., at least two, at least three, at least four, at least five, at least six, at least seven, at least eight, at least nine, at least ten, at least eleven, at least twelve, at least thirteen, or at least fourteen) of the genes selected from the group of ADAM17, CD19, CDK2, CDKN2A, DPP4, ERBB2, HLADRA, ICOS, ITGA4, MYC, NAB2, NRAS, RHOC, TGFB1, and TIMP1.
- the set of genes includes all of LARGE, ADAM17, CD19, CDK2, CDKN2A, DPP4, ERBB2, HLADRA, ICOS, ITGA4, MYC, NAB2, NRAS, RHOC, TGFB1, and TIMP1.
- one gene in the set of genes is LARGE.
- the logistic regression equation coefficient for LARGE in the classifier is negative.
- the logistic regression equation coefficient for LARGE in the classifier has a value between -1.0 and -0.5, between -0.9 and -0.4, or between -0.8 and -0.5.
- the standardized logistic regression equation coefficient for LARGE in the classifier is negative.
- the standardized logistic regression equation coefficient for LARGE in the classifier has a value between -3.0 and -1.5, between -3.0 and -2.0, or between -2.5 and -2.0.
- the set of genes includes CD19, CD8A, and LARGE
- the logistic regression equation coefficient for CD19 has a value between +0.0380 and +0.1139
- the logistic regression equation coefficient for CD8A has a value between -0.3551 and -0.1184
- the logistic regression equation coefficient for LARGE has a value between -0.9858 and -0.3286
- the logistic regression equation constant has a value between +0.3590 and +1.077.
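To illustrate how the ranges listed above constrain a three-gene classifier, the sketch below checks candidate parameters against those ranges and then evaluates a score. It assumes the standard logistic form, since the equation is not reproduced in this extract. The candidate values are the midpoints of the listed ranges, chosen only for illustration rather than taken from the disclosure, and the mRNA levels are hypothetical.

```python
import math

# Ranges as listed for the CD19 / CD8A / LARGE classifier: (low, high).
RANGES = {
    "CD19": (0.0380, 0.1139),
    "CD8A": (-0.3551, -0.1184),
    "LARGE": (-0.9858, -0.3286),
    "constant": (0.3590, 1.077),
}

def within_ranges(params, ranges=RANGES):
    """True if every parameter lies inside its listed range."""
    return all(ranges[k][0] <= params[k] <= ranges[k][1] for k in ranges)

# Midpoint of each range: an illustrative choice, not fitted values.
params = {k: (low + high) / 2 for k, (low, high) in RANGES.items()}

# Hypothetical mRNA levels for one test subject.
levels = {"CD19": 1.0, "CD8A": 1.0, "LARGE": 1.0}

# Assumed standard logistic form applied to the linear predictor.
linear = params["constant"] + sum(params[g] * levels[g] for g in levels)
score = 1.0 / (1.0 + math.exp(-linear))
print(within_ranges(params), round(score, 3))
```

Because the LARGE and CD8A coefficients are negative throughout their ranges, higher levels of either gene lower the score under this assumed form, while higher CD19 raises it.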
- the logistic regression equation coefficient for ADAM17 has a value between -1.4040 and -0.4680
- the logistic regression equation coefficient for CD19 has a value between -0.1730 and -0.0577
- the logistic regression equation coefficient for CDK2 has a value between +0.3263 and +0.9788
- the logistic regression equation coefficient for CDKN2A has a value between -0.8549 and -0.2850
- the logistic regression equation coefficient for DPP4 has a value between +0.2052 and +0.6156
- the logistic regression equation coefficient for ERBB2 has a value between -0.3309 and -0.1103
- the logistic regression equation coefficient for HLADRA has a value between +0.9714 and +2.9141
- the logistic regression equation coefficient for ICOS has a value between +0.1199 and +0.3596
- the logistic regression equation coefficient for ITGA4 has a value between -1.8089 and -0.6030
- the logistic regression equation coefficient for LARGE has a value between -1.3736 and -
- the logistic regression equation coefficient for ADAM17 has a value between -2.1390 and -0.7130
- the logistic regression equation coefficient for CCR3 has a value between +0.1833 and +0.5498
- the logistic regression equation coefficient for CD19 has a value between -0.7077 and -0.2359
- the logistic regression equation coefficient for CD8A has a value between -0.0045 and -0.0015
- the logistic regression equation coefficient for CDKN2A has a value between -0.5580 and -0.1860
- the logistic regression equation coefficient for ERBB2 has a value between -0.3914 and -0.1305
- the logistic regression equation coefficient for HLADRA has a value between +0.9247 and +2.7741
- the logistic regression equation coefficient for IL2RA has a value between +0.2626 and +0.7878
- the logistic regression equation coefficient for ITGA4 has a value between -2.2017 and -0.7339
- the logistic regression equation coefficient for LCK has a value between -0.6215
- the logistic regression equation coefficient for ADAM17 has a value between -1.3124 and -0.4375
- the logistic regression equation coefficient for BAD has a value between -1.0023 and -0.3341
- the logistic regression equation coefficient for CCR3 has a value between +0.1353 and +0.4059
- the logistic regression equation coefficient for CD19 has a value between -0.5181 and -0.1727
- the logistic regression equation coefficient for CD86 has a value between +0.0255 and +0.0765
- the logistic regression equation coefficient for CD8A has a value between -0.1344 and -0.0448
- the logistic regression equation coefficient for CDKN2A has a value between -0.4808 and -0.1603
- the logistic regression equation coefficient for ERBB2 has a value between -0.2820 and -0.0940
- the logistic regression equation coefficient for HLADRA has a value between +0.5916 and +1.7747
- the logistic regression equation coefficient for IL23A has a value between +0.2074
- one gene in the set of genes is CDK2.
- the logistic regression equation coefficient for CDK2 in the classifier has a value between -2.0 and +2.0, e.g., a value between -1.0 and +1.0.
- the logistic regression equation coefficient for CDK2 has a value between -1.1820 and -0.3940
- the logistic regression equation coefficient for NFKB1 has a value between -0.5757 and -0.1919
- the logistic regression equation coefficient for TIMP1 has a value between +0.6245 and +1.8734
- the logistic regression equation coefficient for TXNRD1 has a value between -0.7029 and -0.2343
- the constant has a value between +4.7504 and +14.2512.
- the standardized coefficient for CDK2 has a value between -1.8426 and -0.6142
- the standardized coefficient for NFKB1 has a value between -0.8834 and -0.2945
- the standardized coefficient for TIMP1 has a value between +1.0630 and +3.1890
- the standardized coefficient for TXNRD1 has a value between -1.1051 and -0.3684.
- the disclosure also relates to a computer-implemented method for processing data in one or more data processing devices to determine an effectiveness score for an immunotherapy for a test subject, the method including: inputting, into a classifier, data representing one or more values for a classifier parameter that represents a gene-specific level of mRNA transcribed from a gene of a set of genes in a sample of blood collected from a test subject who was treated with the immunotherapy prior to collecting the sample, with the input data specifying a gene-specific level of mRNA transcribed from each gene of the set of genes in the sample of blood of the test subject.
- the set of genes can be any set of genes that can be used to determine an effectiveness score for an immunotherapy for a test subject who was treated with the immunotherapy as described in this disclosure, with the classifier being for determining an effectiveness score indicating whether the gene-specific levels of mRNA transcribed from each gene in the set of genes classifies with (A) a set of responder levels, the set of responder levels being a set of gene-specific levels of mRNA transcribed from each gene of the set of genes in blood samples collected from a first group of individuals who were treated with the immunotherapy prior to collecting the sample, wherein each individual of the first group responded to the immunotherapy; as opposed to classifying with (B) a set of non-responder levels, the set of non-responder levels being a set of gene-specific levels of mRNA transcribed from each gene of the set of genes in blood samples collected from a second group of individuals who were treated with the immunotherapy prior to collecting the sample, wherein each individual of the second group did not respond to the immunotherapy; for each of one or more
- the input data include data having one or more records that each have one or more values for the parameter representing the level of transcribed mRNA.
- the methods of determining the effectiveness score for the immunotherapy for the test subject include: determining, by the one or more data processing devices based on application of the classifier to the input data having the one or more records, the effectiveness score for the immunotherapy for the test subject.
- the disclosure also relates to one or more machine-readable hardware storage devices for processing data to determine an effectiveness score of an immunotherapy for a test subject by storing instructions that are executable by one or more data processing devices to perform operations.
- the operations include: inputting, into a classifier, data representing one or more values for a classifier parameter that represents a gene-specific level of mRNA transcribed from a gene of a set of genes in a sample of blood collected from a test subject who was treated with the immunotherapy prior to collecting the sample, with the input data specifying a gene-specific level of mRNA transcribed from each gene of the set of genes in the sample of blood of the test subject.
- the set of genes can be any set of genes that can be used to determine an effectiveness score for an immunotherapy for a test subject who was treated with the immunotherapy as described in this disclosure, with the classifier being for determining an effectiveness score indicating whether the gene-specific levels of mRNA transcribed from each gene in the set of genes classifies with (A) a set of responder levels, the set of responder levels being a set of gene-specific levels of mRNA transcribed from each gene of the set of genes in blood samples collected from a first group of individuals who were treated with the immunotherapy prior to collecting the sample, wherein each individual of the first group responded to the immunotherapy; as opposed to classifying with (B) a set of non-responder levels, the set of non-responder levels being a set of gene-specific levels of mRNA transcribed from each gene of the set of genes in blood samples collected from a second group of individuals who were treated with the immunotherapy prior to collecting the sample, wherein each individual of the second group did not respond to the immunotherapy; for each of one or more
- the disclosure also relates to a system having one or more data processing devices and one or more machine-readable hardware storage devices for processing data to determine an effectiveness score of an immunotherapy for a test subject by storing instructions that are executable by the one or more data processing devices to perform operations.
- the operations include inputting, into a classifier, data representing one or more values for a classifier parameter that represents a gene-specific level of mRNA transcribed from a gene of a set of genes in a sample of blood collected from a test subject who was treated with the
- the set of genes can be any set of genes that can be used to determine an effectiveness score for an immunotherapy for a test subject who was treated with the immunotherapy as described in this disclosure, with the classifier being for determining an effectiveness score indicating whether the gene-specific levels of mRNA transcribed from each gene in the set of genes classifies with (A) a set of responder levels, the set of responder levels being a set of gene-specific levels of mRNA transcribed from each gene of the set of genes in blood samples collected from a first group of individuals who were treated with the immunotherapy prior to collecting the sample, wherein each individual of the first group responded to the immunotherapy; as opposed to classifying with (B) a set of non-responder levels, the set of non-responder levels being a set of gene-specific levels of mRNA transcribed from each gene of the set of genes in blood
- Y is an effectiveness score indicating a probability that the set of test levels classifies with the set of responder levels, as opposed to the set of non-responder levels
- X_i is a level of mRNA transcribed from an ith gene of the set of genes in blood of the test subject
- β_i is a logistic regression equation coefficient for the ith gene
- α is a logistic regression equation constant that can be zero
- β_i and α are the result of applying logistic regression analysis to the set of responder levels and the set of non-responder levels.
- in some embodiments, β_i is a standardized logistic regression coefficient for the ith gene.
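The statement that the coefficients and the constant result from applying logistic regression analysis to the responder and non-responder levels can be illustrated with a small fit. The following is a minimal sketch that assumes plain batch gradient descent on the log-loss, since this extract does not specify a fitting procedure. The single-gene data set is synthetic and invented for illustration, and the names `fit_logistic` and `score` are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(samples, labels, lr=0.1, epochs=5000):
    """Fit a constant and per-gene coefficients by batch gradient descent.

    samples: list of lists of mRNA levels (one inner list per individual)
    labels:  1 for responder, 0 for non-responder
    """
    n = len(samples)
    n_genes = len(samples[0])
    const = 0.0
    coefs = [0.0] * n_genes
    for _ in range(epochs):
        grad_const = 0.0
        grad_coefs = [0.0] * n_genes
        for x, y in zip(samples, labels):
            # Prediction error for this individual under current parameters.
            err = sigmoid(const + sum(b * xi for b, xi in zip(coefs, x))) - y
            grad_const += err
            for i, xi in enumerate(x):
                grad_coefs[i] += err * xi
        # Step against the mean gradient of the log-loss.
        const -= lr * grad_const / n
        coefs = [b - lr * g / n for b, g in zip(coefs, grad_coefs)]
    return const, coefs

# Synthetic single-gene levels (hypothetical); the two groups overlap.
responder_levels = [[2.0], [2.5], [3.0], [1.5]]
non_responder_levels = [[1.0], [1.2], [0.8], [2.2]]
samples = responder_levels + non_responder_levels
labels = [1] * len(responder_levels) + [0] * len(non_responder_levels)

constant, coefficients = fit_logistic(samples, labels)

def score(levels):
    """Effectiveness score for one subject under the fitted parameters."""
    return sigmoid(constant + sum(b * x for b, x in zip(coefficients, levels)))

print(round(score([3.0]), 3), round(score([0.8]), 3))
```

In practice an established statistics package would typically be used instead of a hand-written loop. The sketch only shows how the two reference groups determine the parameters that are later applied to a test subject.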
- the disclosure also relates to a set of genes that can be used to determine an effectiveness score for an immunotherapy for a test subject who was treated with the immunotherapy.
- the set of genes includes at least two genes listed in Table 2, and in some embodiments the entire set is limited to genes selected from those listed in Table 2.
- a set of genes useful in the present methods will include from 3 to 168 genes, e.g., 3 to 140, 3 to 94, 3 to 28, 3 to 20, 3 to 10, or 3 to 6.
- the set of genes includes any one or more of the combinations listed below as (a) and (b): (a) a combination of at least two, at least three, or at least four genes selected from Group J,
- Group J and at least one, at least two, or at least three genes selected from Group K.
- the set of genes further includes at least one, at least two, or at least three genes selected from Group K.
- the set of genes further includes at least one, at least two, or at least three genes selected from Group L.
- the set of genes includes at least three, at least four, or at least five genes selected from Groups J-L, including at least one or at least two genes from each of Groups J-L; and at least one, at least two, or at least three genes selected from Group M.
- the ratio defined as the total number of genes that are selected from Group M divided by the total number of genes that are selected from Groups J-M is 0.34 or less.
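The Group M ratio limit above can be checked mechanically. The sketch below is illustrative only: the group memberships are hypothetical placeholders (the actual gene lists for Groups J, K, L, and M are given elsewhere in the disclosure), and the helper name is an assumption.

```python
def group_m_ratio(selected, groups):
    """Fraction of the selected Group J-M genes that come from Group M."""
    in_m = [g for g in selected if g in groups["M"]]
    in_jklm = [g for g in selected
               if any(g in groups[name] for name in ("J", "K", "L", "M"))]
    if not in_jklm:
        raise ValueError("no selected gene belongs to Groups J-M")
    return len(in_m) / len(in_jklm)

# Placeholder memberships, NOT the real gene lists.
groups = {
    "J": {"J_1", "J_2"},
    "K": {"K_1"},
    "L": {"L_1"},
    "M": {"M_1", "M_2"},
}
selected = ["J_1", "J_2", "K_1", "L_1", "M_1"]

ratio = group_m_ratio(selected, groups)   # 1 of 5 genes is from Group M
meets_limit = ratio <= 0.34
print(ratio, meets_limit)
```

The same pattern applies to the Group N1-N5 ratio limits by changing the numerator group names and the 0.25 bound.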
- the set of genes includes at least six, at least seven, or at least eight genes selected from Groups J-L, including at least one or at least two genes from each of Groups J-L; and any one or more of the following combinations:
- the ratio defined as the total number of genes that are selected from Groups N1-N5 divided by the total number of genes that are selected from Groups J, K, L, N1, N2, N3, N4, and N5 is 0.25 or less.
- the set of genes includes at least six, at least seven, or at least eight genes selected from Groups J-M, including at least one or at least two genes from each of Groups J-M; and any one or more of the following combinations:
- the ratio defined as the total number of genes that are selected from Groups N1-N5, divided by the total number of genes that are selected from Groups J, K, L, M, N1, N2, N3, N4, and N5 is 0.25 or less.
- the set of genes includes any of the combinations listed below as (a), (b) and (c):
- the gene list in Group J can be substituted by the gene list in Group J1.
- the gene list in Group K can be substituted by the gene list in Group K1.
- the gene list in Group M can be substituted by the gene list in Group M1.
- a combination of genes from Groups J1 and K1 can be substituted for a combination of genes from Groups J and K, wherever the latter combination is mentioned (whether alone or in combination with other genes, such as genes from other Groups); and a combination of genes from Groups J1, K1, and M1 can be substituted for a combination of genes from Groups J, K, and M, wherever the latter combination is mentioned (whether alone or in combination with other genes, such as genes from other Groups).
- the set of genes includes any one or more of the combinations listed below as (a) and (b):
- Group J1 and at least one, at least two, or at least three genes selected from Group K1.
- the set of genes further includes at least one, at least two, or at least three genes selected from Group K1.
- the set of genes further includes at least one, at least two, or at least three genes selected from Group L.
- the set of genes includes at least three, at least four, or at least five genes selected from Groups J1, K1, and L, including at least one or at least two genes from each of Groups J1, K1, and L; and at least one, at least two, or at least three genes selected from Group M1.
- a ratio defined as the total number of genes that are selected from Group M1 divided by the total number of genes that are selected from Groups J1, K1, L, and M1 is 0.34 or less.
- the set of genes includes at least six genes, at least seven, or at least eight genes selected from Groups J1, K1, and L, including at least one or at least two genes from each of Groups J1, K1, and L; and any one or more of the following
- the ratio defined as the total number of genes that are selected from Groups N1-N5 divided by the total number of genes that are selected from Groups J1, K1, L, N1, N2, N3, N4, and N5 is 0.25 or less.
- the set of genes includes at least six, at least seven, or at least eight genes selected from J1, K1, L, and M1, including at least one or at least two genes from each of J1, K1, L, and M1; and any one or more of the following combinations:
- the ratio defined as the total number of genes that are selected from Groups N1-N5, divided by the total number of genes that are selected from Groups J1, K1, L, M1, N1, N2, N3, N4, and N5 is 0.25 or less.
- the set of genes includes at least six, at least seven, or at least eight genes selected from J1, K1, and M1, including at least one or at least two genes from each of J1, K1, and M1; and any one or more of the following combinations:
- a ratio defined as the total number of genes that are selected from Groups N1-N5, divided by the total number of genes that are selected from Groups J1, K1, M1, N1, N2, N3, N4, and N5 is 0.25 or less.
- the set of genes includes CDK2 and TIMP1.
- the set of genes includes CDK2, TIMP1 and NFKB1.
- the set of genes includes CDK2, TIMP1, NFKB1, and TXNRD1.
- the test subject, the group of individuals who responded to the immunotherapy, and the group of individuals who did not respond to the immunotherapy are all chemotherapy treatment naive, and the set of genes further includes at least one, at least two, or at least three genes selected from Group P2.
- the test subject, the group of individuals who responded to the immunotherapy, and the group of individuals who did not respond to the immunotherapy are all chemotherapy treatment naive, and the set of genes includes at least two, at least three, or at least four genes selected from Group Q.
- one gene in the set of genes is CCR9.
- the logistic regression equation coefficient for CCR9 in the classifier is negative.
- the logistic regression equation coefficient for CCR9 in the classifier has a value between -0.45 to +0.06, between -0.45 to -0.20, or between -0.45 to -0.20.
- βi is a standardized logistic regression coefficient for the ith gene.
- the standardized logistic regression equation coefficient for CCR9 in the classifier is negative.
- the standardized logistic regression equation coefficient for CCR9 in the classifier has a value between -1.42 to +0.20, between -2.0 to -1.0, or between -1.5 to -1.0.
- the logistic regression equation coefficient for CCR9 has a value between -0.5558 and -0.1853
- the logistic regression equation coefficient for EGR1 has a value between +0.4386 and +1.3157
- the logistic regression equation coefficient for ELANE has a value between +0.0784 and +0.2351
- the logistic regression equation coefficient for MMP9 has a value between +0.0647 and +0.1940.
- the logistic regression equation coefficient for ANLN has a value between -1.5108 and -0.5036, the logistic regression equation coefficient for CARD12 has a value between +0.1489 and +0.4467, the logistic regression equation coefficient for CCR9 has a value between -0.2115 and -0.0705, the logistic regression equation coefficient for CD86 has a value between +0.1111 and +0.3332, the logistic regression equation coefficient for EGR1 has a value between +0.7947 and +2.3840, the logistic regression equation coefficient for ELANE has a value between +0.1471 and +0.4412, the logistic regression equation coefficient for HSPA1A has a value between -0.5627 and -0.1876, the logistic regression equation coefficient for ICAM1 has a value between -0.5582 and -0.1861, the logistic regression equation coefficient for IL1R1 has a value between -0.7392 and -0.2464, the logistic regression equation coefficient for MMP9 has a value between +0.0631 and +0.1893, the logistic regression equation coefficient for MNDA
- the logistic regression equation coefficient for ANLN has a value between -1.0182 and -0.3394, the logistic regression equation coefficient for CARD12 has a value between +0.1423 and +0.4268, the logistic regression equation coefficient for CCR9 has a value between -0.2789 and -0.0930, the logistic regression equation coefficient for CD86 has a value between -0.3192 and -0.1064, the logistic regression equation coefficient for EGR1 has a value between +0.6505 and +1.9514, and the logistic regression equation coefficient for LARGE has a value between -0.4365 and -0.1455.
- the logistic regression equation coefficient for CDK2 has a value between -0.6218 and -0.2073
- the logistic regression equation coefficient for NFKB1 has a value between -1.0664 and -0.3555
- the logistic regression equation coefficient for TIMP1 has a value between +0.6544 and +1.9632
- the logistic regression equation coefficient for TXNRD1 has a value between -1.0532 and -0.3511
- the constant has a value between +5.2603 and +15.7808.
- the standardized coefficient for CDK2 has a value between -0.9821 and -0.3274, the standardized coefficient for NFKB1 has a value between -1.5623 and -0.5208, the standardized coefficient for TIMP1 has a value between +1.1087 and +3.3260, and the standardized coefficient for TXNRD1 has a value between -1.4286 and -0.4762.
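As an illustration, the unstandardized ranges listed above for CDK2, NFKB1, TIMP1, TXNRD1, and the constant can be encoded as a simple range check. This sketch assumes those five ranges belong to one classifier embodiment and treats "between" as inclusive; both are assumptions. The candidate values are arbitrary numbers chosen inside the ranges, not values from the disclosure.

```python
# Ranges copied from the text above (unstandardized coefficients and constant).
RANGES = {
    "CDK2": (-0.6218, -0.2073),
    "NFKB1": (-1.0664, -0.3555),
    "TIMP1": (0.6544, 1.9632),
    "TXNRD1": (-1.0532, -0.3511),
    "constant": (5.2603, 15.7808),
}

def out_of_range(params, ranges=RANGES):
    """Return sorted names of parameters that are missing or outside range."""
    return sorted(
        name for name, (low, high) in ranges.items()
        if name not in params or not (low <= params[name] <= high)
    )

# Hypothetical candidate classifier; values are illustrative only.
candidate = {"CDK2": -0.41, "NFKB1": -0.71, "TIMP1": 1.31,
             "TXNRD1": -0.70, "constant": 10.5}
print(out_of_range(candidate))                      # no violations
print(out_of_range(dict(candidate, TIMP1=-0.2)))    # TIMP1 has the wrong sign
```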
- the disclosure also relates to a method of administering immunotherapy to a patient in need of treatment for cancer, where the method includes:
- immunotherapy for the patient is greater than a first reference threshold; (b) based on the determination of (a), recommending that the patient be treated with the immunotherapy;
- the set of genes includes any one or more of the combinations listed below as (a) and (b):
- the second classifier being for determining an effectiveness score indicating whether the gene-specific levels of mRNA transcribed from each gene in the set of genes classifies with (A) a set of responder levels, the set of responder levels being a set of gene-specific levels of mRNA transcribed from each gene of the set of genes in blood samples collected from a first group of individuals who were treated with the immunotherapy prior to collecting the sample, wherein each individual of the first group responded to the immunotherapy; as opposed to classifying with (B) a set of non-responder levels, the set of non-responder levels being a set of gene-specific levels of mRNA transcribed from each gene of the set of genes in blood samples collected from a second group of individuals who were treated with the immunotherapy prior to collecting the sample, wherein each individual of the second group did not respond to the immunotherapy;
- the set of genes in (e) includes at least one, at least two, or at least three genes selected from Group K.
- the set of genes in (e) includes at least one, at least two, or at least three genes selected from Group L.
- the second classifier has a sensitivity greater than 90% and a specificity greater than 50%.
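Sensitivity and specificity targets such as these can be evaluated from scored subjects with known outcomes. The sketch below is a generic calculation, not the disclosure's evaluation procedure; the scores, outcome labels, and the 0.5 threshold are hypothetical.

```python
def sensitivity_specificity(scores, responded, threshold):
    """Sensitivity and specificity when score > threshold predicts response.

    Both responders and non-responders must be present in `responded`.
    """
    pairs = list(zip(scores, responded))
    tp = sum(1 for s, r in pairs if r and s > threshold)
    fn = sum(1 for s, r in pairs if r and s <= threshold)
    tn = sum(1 for s, r in pairs if not r and s <= threshold)
    fp = sum(1 for s, r in pairs if not r and s > threshold)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical effectiveness scores and observed outcomes.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
responded = [True, True, True, False, False, False]

sens, spec = sensitivity_specificity(scores, responded, threshold=0.5)
meets_targets = sens > 0.90 and spec > 0.50
print(sens, spec, meets_targets)
```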
- the logistic regression equation coefficient for CHPT1 has a value between +0.0533 and +0.1598, the logistic regression equation coefficient for EGR1 has a value between +0.6064 and +1.8191, the logistic regression equation coefficient for HSPA1A has a value between -0.8847 and -0.2949, the logistic regression equation coefficient for ICAM1 has a value between -0.8372 and -0.2791, the logistic regression equation coefficient for MMP9 has a value between +0.2296 and +0.6888.
- the logistic regression equation coefficient for EGR1 has a value between +0.6182 and +1.8546
- the logistic regression equation coefficient for HSPA1A has a value between -1.0622 and -0.3541
- the logistic regression equation coefficient for ICAM1 has a value between -0.6630 and -0.2210
- the logistic regression equation coefficient for MMP9 has a value between +0.2427 and +0.7281.
- the disclosure also relates to a method of administering an immunotherapy to a patient in need of treatment for cancer, the method including:
- the set of genes can be any set of genes that can be used to determine an effectiveness score for an immunotherapy for a patient who was not treated with the immunotherapy, as described in this disclosure;
- Step (3) based on the probability determined in Step (2), recommending that the patient be treated with the immunotherapy.
- the disclosure also relates to a method of administering an immunotherapy to a patient in need of treatment for cancer, the method including:
- Step (3) recommending that the patient continue to be treated with the immunotherapy.
- the disclosure also relates to a method of administering immunotherapy to a patient in need of treatment for cancer, the method comprising:
- the sample of blood is collected from the patient within 90 days after administration of the immunotherapy has begun.
- the probability is greater than a reference threshold.
- the immunotherapy is an anti-CTLA4 immunotherapy, such as tremelimumab.
- the cancer is melanoma, prostate cancer or bladder cancer.
- the prostate cancer is castration-resistant prostate cancer.
- the cancer is colon cancer or metastatic colon cancer.
- the test subject has melanoma, prostate cancer, castration-resistant prostate cancer, colon cancer or metastatic colon cancer.
- a “gene” refers to a locus (or segment) of DNA that is transcribed into a functional RNA product or encodes a functional protein or peptide product.
- “a set of” refers to two or more, for example, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more.
- a “blood sample” or “sample of blood” refers to whole blood, serum-reduced whole blood, lysed blood (erythrocyte-depleted blood), centrifuged lysed blood (serum-depleted, erythrocyte-depleted blood), serum-depleted whole blood or peripheral blood leukocytes (PBLs), globin-reduced RNA from blood, or any possible fraction of blood as would be understood by a person skilled in the art.
- Immunotherapy refers to a type of cancer treatment designed to alter the body's natural defenses to fight the cancer. Immunotherapy can induce, enhance, or suppress an immune response. Immunotherapy can be, for example, an interferon, an interleukin, or an antibody that targets receptors or ligands that are involved in the immune system.
- Current antibody immunotherapies include, but are not limited to, alemtuzumab, ipilimumab, ofatumumab, nivolumab, pembrolizumab, rituximab, and so forth. Antibody immunotherapies are described in detail in Creelan, Benjamin C., “Update on immune checkpoint inhibitors in lung cancer,” Cancer Control 21.1 (2014): 80-89, which is incorporated by reference in its entirety.
- mRNA refers to an RNA complementary to the exons of a gene.
- An mRNA sequence includes a protein coding region or part of the coding region, and also may include 5′ and 3′ untranslated regions (UTR).
- “responding to an immunotherapy” or “response to immunotherapy” or “immunotherapy response” means that an immunotherapy is effective in treating a cancer in a subject. Efficacy is usually measured by the clinical response of the patient who has been or is being treated with a drug. A drug is considered to be effective if it achieves desired clinical results: for example, by reducing or preventing the progression and/or severity of one or more cancer pathologies, preventing the development, recurrence, or onset of cancers, or enhancing or improving the prophylactic or therapeutic effect(s) of another therapy.
- art-accepted criteria include the Immune-Related Response Criteria (irRC) (Hoos, Axel, and Cedrik M.
- “patient,” “individual,” or “subject” each refers to a mammal, which in some embodiments is a human.
- the “level” of an RNA means a measurable quantity (either absolute or relative quantity) of a given nucleic acid.
- the quantity can be determined by various means, for example, by microarray, quantitative polymerase chain reaction (QPCR), or sequencing.
- a “primer” refers to an oligonucleotide that is capable of acting as a point of initiation of DNA or RNA synthesis complementary to a strand of nucleic acid, when placed under conditions in which synthesis of a primer extension product complementary to the nucleic acid strand is induced, i.e., in the presence of mononucleotides and an inducing agent such as a DNA polymerase and at a suitable temperature and pH.
- the primer may be single-stranded and is sufficiently long to prime the synthesis of the desired extension product in the presence of the inducing agent.
- cancer refers to cells having the capacity for autonomous growth within a subject (e.g., a mouse, a human). Examples of such cells include cells having an abnormal state or condition characterized by rapidly proliferating cell growth. Cancer further includes cancerous growths, e.g., tumors, oncogenic processes, metastatic tissues, and malignantly transformed cells, tissues, or organs, irrespective of histopathologic type or stage of invasiveness.
- Cancer further includes malignancies of the various organ systems, such as skin, respiratory, cardiovascular, renal, reproductive, hematological, neurological, hepatic, gastrointestinal, and endocrine systems; as well as adenocarcinomas which include malignancies such as most colon cancers, renal-cell carcinoma, prostate cancer and/or testicular tumors, non-small cell carcinoma of the lung, cancer of the small intestine, and cancer of the esophagus.
- Cancer that is “naturally arising” includes any cancer that is not experimentally induced by implantation of cancer cells into a subject, and includes, for example, spontaneously arising cancer, cancer caused by exposure of a patient to a carcinogen(s), cancer resulting from insertion of a transgenic oncogene or knockout of a tumor suppressor gene, and cancer caused by infections, e.g., viral infections.
- the presently described methods can determine the effectiveness score for immunotherapy in subjects with various cancers, including cancers of the skin (e.g., melanoma, unresectable melanoma or metastatic melanoma), stomach, colon, rectum, mouth/pharynx, esophagus, larynx, liver, pancreas, lung, breast, cervix uteri, corpus uteri, ovary, prostate (e.g., castration-resistant prostate cancer), testis, bladder, bone, kidney, head, neck, brain/central nervous system, and throat, and also Hodgkin's disease, non-Hodgkin's lymphoma, sarcomas, choriocarcinoma, lymphoma, neuroblastoma (e.g., pediatric neuroblastoma), chronic lymphocytic leukemia, and squamous non-small cell lung cancer, among others.
- melanoma refers to a type of skin cancer which develops from melanocytes, the skin cells in the epidermis that produce the skin pigment melanin.
- melanoma includes Stage I, Stage II, Stage III and Stage IV melanoma, as determined by the American Joint Committee on Cancer (AJCC) (6th Edition), non-melanotic melanoma, nodular melanoma, acral lentiginous melanoma, and lentigo maligna.
- Active melanoma is a type of melanoma in which subjects have clinical evidence of disease.
- “Inactive melanoma” includes melanoma in which subjects have no clinical evidence of disease.
- prostate cancer refers to cancer in the prostate gland.
- Castration-resistant prostate cancer is a subcategory of prostate cancer that is not responsive to castration treatment (reduction of available androgen/testosterone/DHT by chemical or surgical means).
- colon cancer refers to cancer in the colon or rectum.
- a “biomarker” refers to a measurable indicator of some biological state or condition, for example, the level of transcribed mRNA or the amount of protein.
- the term “data” in relation to biomarkers generally refers to data reflective of the absolute and/or relative abundance (level) of a biomarker in a sample, for example, the level of one or more particular transcribed mRNAs, or the amount of one or more particular proteins.
- a “dataset” in relation to biomarkers refers to a set of data representing the absolute and/or relative abundance (level) of one biomarker or a panel of two or more biomarkers in a group of subjects.
- a “mathematical model” refers to a description of a system using mathematical concepts and language. The process of developing a mathematical model is termed mathematical modeling or model construction.
- the term “classifier” refers to a mathematical model with appropriate parameters that can determine whether a test subject classifies with a first group of subjects (e.g., a group of subjects that responds to an immunotherapy) as opposed to another group of subjects (e.g., a group of subjects that do not respond to an immunotherapy).
- a classifier can generate a score (e.g., effectiveness score) by applying the classifier to data obtained from a test subject.
- the classifier maps a test subject to a category by comparing the score generated by the classifier to a reference threshold.
- the score is indicative of the probability that the test subject classifies with a first group of subjects as opposed to another group of subjects.
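The mapping from score to category described above amounts to a single comparison. In this minimal sketch the threshold value of 0.5 and the category labels are hypothetical; the disclosure refers only to a reference threshold.

```python
def classify(score, reference_threshold=0.5):
    """Map an effectiveness score to a category by threshold comparison.

    The 0.5 default and the label strings are illustrative assumptions.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score is expected to be a probability in [0, 1]")
    if score > reference_threshold:
        return "classifies with responders"
    return "classifies with non-responders"

print(classify(0.82))
print(classify(0.31))
```

Raising the threshold trades sensitivity for specificity, so the reference threshold is typically chosen with the intended clinical use in mind.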
- “treatment naive” is used to describe a subject who has not undergone chemotherapy treatment for cancer.
- a treatment-naive patient may have undergone a non-chemotherapy treatment such as surgery or radiation, and still be considered treatment-naive for present purposes.
- “Random selection” or “randomly selected” refers to a method of selecting items (often called units) from a group of items or a population randomly.
- the probability of choosing a specific item is the proportion of those items in the population. For example, the probability of randomly selecting one particular gene out of a group of 10 genes is 0.1.
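The arithmetic in this example can be reproduced directly. The sketch below uses placeholder gene names and a fixed seed (both assumptions) to show uniform random selection from a group of 10 genes.

```python
import random
from fractions import Fraction

# Ten placeholder gene names; not genes from the disclosure.
genes = [f"GENE_{i}" for i in range(1, 11)]

# Probability of drawing one particular gene under uniform selection.
p_single = Fraction(1, len(genes))

# A seeded generator makes the illustrative draw reproducible.
rng = random.Random(0)
pick = rng.choice(genes)

print(float(p_single), pick)
```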
- FIG. 1 is a schematic diagram showing a system for processing and classifying data to determine an effectiveness score for immunotherapy in a subject.
- FIG. 2 is a flow diagram of a process for processing and classifying data to determine an effectiveness score for immunotherapy in a subject.
- FIG. 3A is a graph showing a receiver operating characteristic (ROC) curve of one exemplary pre-treatment classifier (the classifier shown in Table 38, No. 1).
- FIG. 3B is a graph showing the results of applying the same pre-treatment classifier used in FIG. 3A to all subjects in a training dataset.
- the X axis represents correlated components.
- the Y axis represents the effectiveness score (Y value).
- Data points labeled with black squares represent responders, as determined by Response Evaluation Criteria in Solid Tumors (RECIST).
- Data points labeled with white circles represent non-responders, as determined by RECIST.
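For readers unfamiliar with ROC curves, the following sketch shows how such a curve and its area can be computed from effectiveness scores and known outcomes. It is a generic calculation with hypothetical data, not the procedure or data used to produce the figures.

```python
def roc_points(scores, responded):
    """Return (false positive rate, true positive rate) pairs.

    Each distinct score is used as a threshold; score >= threshold
    predicts response. Both outcome classes must be present.
    """
    pos = sum(responded)
    neg = len(responded) - pos
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, r in zip(scores, responded) if r and s >= t)
        fp = sum(1 for s, r in zip(scores, responded) if not r and s >= t)
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical scores and RECIST-style outcomes (True = responder).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
responded = [True, True, False, True, False, False]

area = auc(roc_points(scores, responded))
print(round(area, 4))
```

An area of 1.0 would mean every responder scored above every non-responder; 0.5 corresponds to chance-level ranking.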
- FIG. 4A is a graph showing a ROC curve of one exemplary post-treatment classifier (the classifier shown in Table 39, No. 1).
- FIG. 4B is a graph showing the results of applying the same post-treatment classifier used in FIG. 4A to all subjects in a training dataset.
- the X axis represents correlated components.
- the Y axis represents the effectiveness score (Y value).
- Data points labeled with black squares represent responders, as determined by Response Evaluation Criteria in Solid Tumors (RECIST).
- Data points labeled with white circles represent non-responders, as determined by RECIST.
- This disclosure relates to a computer-implemented method for processing data to determine an effectiveness score for immunotherapy in a subject.
- a data processing system consistent with this disclosure applies classifiers to data corresponding to levels of transcribed mRNAs of a set of genes.
- system 10 classifies groups of data via binding data to parameters and applying a classifier to the input data, and outputs a score.
- the score is indicative of the probability of immunotherapy response.
- System 10 includes client device 12, data processing system 18, data repository 20, network 16 and wireless device 14.
- Data processing system 18 retrieves, from data repository 20, data 21 representing one or more values for a classifier parameter that represents a gene-specific level of transcribed mRNA from a gene of a set of genes in a sample of blood of a test subject, as described in further detail below.
- Data processing system 18 inputs the retrieved data into a classifier, e.g., into classifier data processing program 30.
- classifier data processing program 30 is programmed to execute a data classifier.
- There are various types of data classifiers, including, e.g., linear discriminant classifiers, support vector machine classifiers, nearest neighbor classifiers, ensemble classifiers, and so forth.
- classifier data processing program 30 is configured to execute a classifier in accordance with the below equation:
- Y is a value (e.g., effectiveness score) indicating whether the set of test expression levels for a given subject should classify with the set of responder levels, as opposed to the set of non-responder levels.
- Xi is a level of mRNA transcribed from an ith gene of the set of genes in blood of the test subject.
- βi is a logistic regression equation coefficient for the ith gene.
- α is a logistic regression equation constant that can be zero.
- βi and α are the result of applying logistic regression analysis to the set of responder levels and the set of non-responder levels.
- Xi represents a classifier parameter.
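The equation itself did not survive in this text. As a hedged reconstruction, one standard logistic regression form that is consistent with the definitions above is shown below, writing $\beta_i$ for the coefficient of the ith gene, $\alpha$ for the constant, and $n$ for the number of genes in the set. The symbol choices and the exact form are assumptions, not a quotation of the original equation.

```latex
Y = \frac{1}{1 + e^{-\left(\alpha + \sum_{i=1}^{n} \beta_i X_i\right)}}
\qquad\Longleftrightarrow\qquad
\ln\!\left(\frac{Y}{1 - Y}\right) = \alpha + \sum_{i=1}^{n} \beta_i X_i
```

Under this form, $Y$ lies strictly between 0 and 1, and $Y = 0.5$ exactly when $\alpha + \sum_i \beta_i X_i = 0$.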
- Data processing system 18 binds to classifier parameter Xi one or more values representing a gene-specific level of transcribed mRNA from that gene, as specified in retrieved data 21.
- Data processing system 18 binds values of the data to the classifier parameter by modifying a database record such that a value of the parameter is set to be the value of data 21 (or a portion thereof).
- Data 21 includes a plurality of data records that each have one or more values for the parameter Xi representing the level of transcribed mRNA, and in some embodiments, some parameters of the classifier (e.g., values for logistic regression equation coefficients and logistic regression equation constants).
- data processing system 18 applies classifier data processing program 30 to each of the records by applying classifier data processing program 30 to the bound values for the parameter Xi. Based on application of classifier data processing program 30 to the bound values (e.g., as specified in data 21 or in records in data 21), data processing system 18 determines an effectiveness score for immunotherapy and outputs, e.g., to client device 12 via network 16 and/or wireless device 14, data indicative of the determined effectiveness score for immunotherapy. Data processing system 18 generates data for a graphical user interface that, when rendered on a display device of client device 12, displays a visual representation of the output.
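The binding step described above, in which a database record is modified so that a parameter takes the measured value, can be sketched with an in-memory SQLite database. The table layout, column names, gene names, and numbers are hypothetical, and the logistic form of the classifier is an assumption.

```python
import math
import sqlite3

# Hypothetical schema: one row per classifier parameter (gene).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE classifier_params ("
    "gene TEXT PRIMARY KEY, coefficient REAL, level REAL)"
)
conn.executemany(
    "INSERT INTO classifier_params (gene, coefficient) VALUES (?, ?)",
    [("GENE_A", 1.2), ("GENE_B", -0.7)],
)

def bind_levels(conn, record):
    """Bind measured levels to parameters by updating the stored records."""
    for gene, level in record.items():
        conn.execute(
            "UPDATE classifier_params SET level = ? WHERE gene = ?",
            (level, gene),
        )

def apply_classifier(conn, constant=0.0):
    """Apply an assumed logistic classifier to the bound values."""
    rows = conn.execute(
        "SELECT coefficient, level FROM classifier_params"
    ).fetchall()
    if any(level is None for _, level in rows):
        raise ValueError("at least one parameter has no bound value")
    z = constant + sum(coef * level for coef, level in rows)
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical test-subject record of gene-specific mRNA levels.
bind_levels(conn, {"GENE_A": 2.0, "GENE_B": 1.0})
score = apply_classifier(conn)
print(round(score, 4))
```

Keeping the unbound state explicit (a NULL level) lets the application step refuse to score a record that is missing a measurement.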
- data processing system 18 generates the classifier by applying the mathematical model to a dataset to determine parameters of a classifier (e.g., values for logistic regression equation coefficients and logistic regression equation constants).
- the values for these parameters can be stored in data repository 20 or memory 22.
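Generating classifier parameters from a dataset can be sketched as fitting a logistic regression to responder and non-responder levels. The pure-Python gradient descent below is a simplified stand-in for whatever logistic regression analysis is actually used; the learning rate, epoch count, small L2 penalty (added so the toy, perfectly separable data yields finite coefficients), and the data themselves are all assumptions.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(X, y, lr=0.1, epochs=2000, l2=0.01):
    """Fit coefficients and a constant by batch gradient descent."""
    n, n_features = len(X), len(X[0])
    coeffs = [0.0] * n_features
    constant = 0.0
    for _ in range(epochs):
        grad_coeffs = [0.0] * n_features
        grad_constant = 0.0
        for xi, yi in zip(X, y):
            z = constant + sum(c * v for c, v in zip(coeffs, xi))
            err = sigmoid(z) - yi
            grad_constant += err
            for j, v in enumerate(xi):
                grad_coeffs[j] += err * v
        constant -= lr * grad_constant / n
        coeffs = [c - lr * (g / n + l2 * c)
                  for c, g in zip(coeffs, grad_coeffs)]
    return coeffs, constant

# Hypothetical levels for two genes; label 1 = responder, 0 = non-responder.
X = [[2.0, 0.5], [1.8, 0.7], [2.2, 0.4],
     [0.6, 1.9], [0.4, 2.1], [0.7, 1.6]]
y = [1, 1, 1, 0, 0, 0]

coeffs, constant = fit_logistic(X, y)
predictions = [sigmoid(constant + sum(c * v for c, v in zip(coeffs, xi)))
               for xi in X]
print(coeffs, constant)
```

The fitted coefficients and constant are the values that would then be stored for later use by the classifier.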
- Client device 12 can be any sort of computing device capable of taking input from a user and communicating over network 16 with data processing system 18 and/or with other client devices.
- Client device 12 can be a mobile device, a desktop computer, a laptop, a cell phone, a personal digital assistant (PDA), a server, an embedded computing system, and so forth.
- Data processing system 18 can be a variety of computing devices capable of receiving data and running one or more services.
- data processing system 18 can include a server, a distributed computing system, a desktop computer, a laptop, a cell phone, a rack-mounted server, and the like.
- Data processing system 18 can be a single server or a group of servers that are at a same position or at different positions (i.e., locations).
- Data processing system 18 and client device 12 can run programs having a client-server relationship to each other. Although distinct modules are shown in the figures, in some embodiments, client and server programs can run on the same device.
- Data processing system 18 can receive data from wireless devices 14 and/or client device 12 through input/output (I/O) interface 24, and from data repository 20.
- Data repository 20 can store a variety of data values for classifier data processing program 30.
- the classifier data processing program (which may also be referred to as a program, software, a software application, a script, or code) can be written in any form of programming language, including compiled or interpreted languages, or declarative or procedural languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
- the classifier data processing program may, but need not, correspond to a file in a file system.
- the program can be stored in a portion of a file that holds other programs or information (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub programs, or portions of code).
- the classifier data processing program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.
- data repository 20 stores data 21 indicative of the gene-specific levels of mRNA, for example, the gene-specific levels of mRNA transcribed from each gene in the set of genes for a group of individuals who responded to the immunotherapy, a group of individuals who did not respond to the immunotherapy, and/or a test subject.
- data repository 20 stores parameters of a classifier, for example, coefficients and constants of a logistic regression equation.
- I/O interface 24 can be a type of interface capable of receiving data over a network, including, e.g., an Ethernet interface, a wireless networking interface, a fiber-optic networking interface, a modem, and so forth.
- Data processing system 18 also includes a processing device 28.
- a“processing device” encompasses all kinds of apparatus, devices, and machines for processing information, including by way of example a programmable processor, a computer, or multiple processors or computers.
- the apparatus can include special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit) or RISC (reduced instruction set circuit).
- the apparatus can also include, in addition to hardware, code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, an information base management system, an operating system, or a combination of one or more of them.
- Data processing system 18 also includes memory 22 and a bus system 26. Bus system 26, including, for example, a data bus and a motherboard, can be used to establish and to control data communication between the components of data processing system 18.
- Processing device 28 can include one or more microprocessors. Generally, processing device 28 can include an appropriate processor and/or logic that is capable of receiving and storing data, and of communicating over a network (not shown).
- Memory 22 can include a hard drive and a random access memory storage device, including, e.g., a dynamic random access memory, or other types of non-transitory machine-readable storage devices.
- Memory 22 stores classifier data processing program 30 that is executable by processing device 28.
- These computer programs may include a data engine (not shown) for implementing the operations and/or the techniques described herein.
- the data engine can be implemented in software running on a computer device, hardware or a combination of software and hardware.
- data processing system 18 performs process 100 to output information indicative of an effectiveness score for immunotherapy response.
- data processing system 18 inputs (102), into a classifier, data representing one or more values for a classifier parameter.
- the data can come from wireless devices 14, client device 12, and/or data repository 20.
- Data processing system 18 binds (104) one or more values representing a gene-specific level of transcribed mRNA to the classifier parameter.
- Data processing system 18 applies (106) the classifier to bound values for the parameter, and determines (108) the effectiveness score for immunotherapy response.
- Data processing system 18 outputs (110), by the one or more data processing devices 28, information indicative of the effectiveness score for immunotherapy response.
- the output may be transmitted to a display device, e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor, or transmitted to client device 12, or wireless device 14 through network 16.
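The steps of process 100 (input, bind, apply, determine, output) can be sketched as follows. This is a minimal illustration that assumes the classifier is a logistic regression, which the description names only as one example of a usable model. The function name `effectiveness_score`, the coefficient values, and the expression values are hypothetical and are not taken from the patent.

```python
import math

def effectiveness_score(expression, coefficients, intercept):
    """Apply a logistic classifier to bound gene-expression values."""
    # Bind (104): every classifier parameter (gene) needs a measured mRNA level.
    missing = [gene for gene in coefficients if gene not in expression]
    if missing:
        raise KeyError(f"no expression value bound for: {missing}")
    # Apply (106): linear combination of the bound values.
    z = intercept + sum(coefficients[g] * expression[g] for g in coefficients)
    # Determine (108): the logistic transform yields a score between 0 and 1.
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients and measurements, for illustration only.
coefs = {"BAX": 0.8, "LARGE": -0.5}
sample = {"BAX": 1.0, "LARGE": 2.0}
score = effectiveness_score(sample, coefs, intercept=0.2)
# Output (110): report information indicative of the score.
print(f"effectiveness score: {score:.3f}")
```

In this sketch a score closer to 1 would indicate a higher estimated probability of response; how a real classifier scales or thresholds the score is not specified here.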
- checkpoint inhibitors are proteins that control the immune system, preventing it from attacking normal tissue and thereby preventing autoimmune diseases.
- these checkpoint inhibitors can also allow cancer cells to escape immune system surveillance, leading to tumor proliferation. Creelan, Benjamin C. “Update on immune checkpoint inhibitors in lung cancer.” Cancer Control 21.1 (2014): 80-89.
- these recently approved cancer immunotherapies, including Yervoy® (ipilimumab) from Bristol-Myers Squibb and Keytruda® (pembrolizumab) from Merck, stimulate the immune system to “take the brakes off,” which helps the immune system recognize and attack cancer cells more effectively.
- Yervoy® and Keytruda® have proven to be effective in about 20% and 37%, respectively, of malignant melanoma patients treated in clinical trials.
- a subject can include an individual who has already been diagnosed as having cancer or a condition related to cancer. Diagnosis of cancer can be made by lab tests and imaging techniques, for example, X-rays, CT scans, MRIs, PET and PET/CTs, ultrasound, and LDH testing, and biopsy, including shave, punch, incisional, and excisional biopsy.
- a subject has skin cancer (including e.g., melanoma), prostate cancer (including, e.g., castration-resistant prostate cancer), bladder cancer, lung cancer, breast cancer, colorectal cancer, pancreatic cancer, gastric cancer, liver cancer, head and neck cancer, ovarian cancer, esophageal cancer, renal cell carcinoma, myeloma, lymphoma, leukemia, etc.
- Diagnosis of skin cancer can often include a dermatoscopic exam and/or a visual examination of the skin looking for common features of cancerous skin lesions, including but not limited to bumps, shiny translucent, pearly, or red nodules, a sore that continuously heals and re-opens, a crusted or scaly area of the skin with a red inflamed base that resembles a growing tumor, a non-healing ulcer, crusted-over patch of skin, new moles, changes in the size, shape, or color of an existing mole, the spread of pigmentation beyond the border of a mole or mark, oozing or bleeding from a mole, and a mole that feels itchy, hard, lumpy, swollen, or tender to the touch.
- a subject can be someone who is suffering from any of various stages of skin cancer, e.g., Stage 1 through Stage 4 melanoma.
- Stage 1 indicates that no lymph nodes or lymph ducts include cancer cells (i.e., there are no “positive” lymph nodes) and there is no sign of cancer spread.
- the primary melanoma is less than 2.0 mm thick, or less than 1.0 mm thick and is ulcerated, i.e., the covering layer of the skin over the tumor is broken.
- Stage 2 melanomas also have no sign of spread or positive lymph nodes.
- Stage 2 melanomas are over 2.0 mm thick or over 1.0 mm thick and ulcerated.
- In Stage 3 melanomas, there are positive lymph nodes, but no sign of the cancer having spread anywhere else in the body.
- Stage 4 melanomas have spread elsewhere in the body, away from the primary site.
- the subject has been previously treated with a surgical procedure for removing skin cancer (e.g., melanoma), including, but not limited to, any one or combination of the following treatments: cryosurgery, i.e., the process of freezing with liquid nitrogen; curettage and electrodesiccation, i.e., the scraping of the lesion and destruction of any remaining malignant cells with an electric current; and removal of a lesion layer-by-layer down to normal margins (Mohs surgery).
- the subject has previously been treated with any one or more therapeutic treatments for melanoma, alone or in combination with a surgical procedure for removing skin cancer.
- Therapeutic treatments for melanoma include, but are not limited to, chemotherapy, immunotherapy, monoclonal antibody therapy, gene therapy, adoptive T-cell therapy, and vaccine therapy.
- the individual from whom a sample is obtained is a test subject for whom it is unknown whether the subject will respond to an immunotherapy.
- tremelimumab was administered by IV infusion once every 90 days for up to four cycles to patients.
- The mechanism of action of tremelimumab involves stimulation of an immune response, and there is an expected lag period before an effective immune response is initiated.
- tumor responses can be assessed every 90 days (one cycle) in patients treated with tremelimumab. There can be a planned assessment of tumor response at 6 months to determine progression-free survival (PFS) rate at this time point.
- Prostate cancer is cancer of the prostate gland.
- the cancer cells may spread from the prostate gland to other parts of the body, particularly the bones and lymph nodes.
- Prostate cancer may initially cause no symptoms. In later stages, it can lead to difficulty urinating, blood in the urine, or pain in the pelvis or back, or when urinating.
- Treatment of prostate cancer may involve surgery (e.g., radical prostatectomy), radiation therapy including brachytherapy (prostate brachytherapy) and external beam radiation therapy, high-intensity focused ultrasound (HIFU), chemotherapy (e.g., oral chemotherapy such as with Temozolomide/TMZ), cryosurgery, hormonal therapy, or some combination.
- Prostate cancer initially responsive to hormone therapy can become resistant to hormone therapy after one to three years and resume growth.
- the term “castration-resistant” means that the prostate cancer is no longer responsive to castration treatment (reduction of available androgen/testosterone/DHT by chemical or surgical means). Many conventional treatments for castration-resistant prostate cancer (CRPC) are considered palliative and not shown to be effective.
- Detection and diagnosis of prostate cancer can be accomplished by methods that include, e.g., a prostate-specific antigen (PSA) test, digital rectal exam (DRE), ultrasound, and biopsy.
- a blood sample is drawn from a subject and analyzed for PSA. If a higher than normal level is found, it may be an indication of prostate infection, inflammation, enlargement or cancer.
- a gloved, lubricated finger is inserted into a subject’s rectum to examine the texture, shape or size of the prostate gland.
- PSA testing combined with DRE helps identify prostate cancers at an early stage.
- transrectal ultrasound can be employed to further evaluate the prostate, or a prostate biopsy procedure can be performed to collect a tissue sample from the prostate gland.
- Prostate biopsy is often done using a thin needle that is inserted into the prostate to collect tissue. The tissue sample is analyzed in a lab to determine whether cancer cells are present.
- the presently claimed methods can be used for a subject who has prostate cancer, such as CRPC.
- the subject has previously been treated with any one or more therapeutic treatments for prostate cancer, alone or in combination with a surgical procedure for removing the tumor.
- Therapeutic treatments for prostate cancer are known in the art and include, but are not limited to, chemotherapy, immunotherapy, monoclonal antibody therapy, gene therapy, adoptive T-cell therapy, and vaccine therapy.
- the individual from whom a sample is obtained is a test subject for whom it is unknown whether the subject will respond to an immunotherapy.
Immunotherapy response
- the actual immunotherapy response in a given cancer patient can be determined by standard immune response criteria.
- standard immune response criteria include the Immune-Related Response Criteria (irRC), the World Health Organization (WHO) Criteria, and the Response Evaluation Criteria in Solid Tumors (RECIST).
- the irRC is an alternate set of response criteria that is described in Wolchok et al., “Guidelines for the evaluation of immune therapy activity in solid tumors: immune-related response criteria.” Clinical Cancer Research 15.23 (2009): 7412-7420.
Response Evaluation Criteria in Solid Tumors (RECIST) Quick Reference
- Measurable disease - the presence of at least one measurable lesion. If the measurable disease is restricted to a solitary lesion, its neoplastic nature should be confirmed by cytology/histology.
- Measurable lesions - lesions that can be accurately measured in at least one dimension with longest diameter ≥20 mm using conventional techniques or ≥10 mm with spiral CT scan.
- Non-measurable lesions - all other lesions, including small lesions (longest diameter <20 mm with conventional techniques or <10 mm with spiral CT scan), i.e., bone lesions, leptomeningeal disease, ascites, pleural/pericardial effusion, inflammatory breast disease, lymphangitis cutis/pulmonis, cystic lesions, and also abdominal masses that are not confirmed and followed by imaging techniques.
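The size rule for measurable lesions can be expressed as a small check. This sketch assumes the thresholds are "at least 20 mm" with conventional techniques and "at least 10 mm" with spiral CT, and it covers only the size criterion. Lesion types listed as non-measurable regardless of size (for example bone lesions or ascites) are not modeled. The function name `is_measurable_by_size` is hypothetical.

```python
def is_measurable_by_size(longest_diameter_mm, spiral_ct=False):
    """Return True if a lesion meets the size threshold for measurability."""
    # Assumed thresholds: 20 mm for conventional techniques, 10 mm for spiral CT.
    threshold_mm = 10.0 if spiral_ct else 20.0
    return longest_diameter_mm >= threshold_mm

# A 15 mm lesion is below the conventional threshold but above the spiral CT one.
print(is_measurable_by_size(15.0))
print(is_measurable_by_size(15.0, spiral_ct=True))
```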
- Clinical lesions will only be considered measurable when they are superficial (e.g., skin nodules and palpable lymph nodes). For the case of skin lesions, documentation by color photography, including a ruler to estimate the size of the lesion, is recommended.
- CT and MRI are the best currently available and reproducible methods to measure target lesions selected for response assessment.
- Conventional CT and MRI should be performed with cuts of 10 mm or less in slice thickness contiguously.
- Spiral CT should be performed using a 5 mm contiguous reconstruction algorithm. This applies to tumors of the chest, abdomen and pelvis. Head and neck tumors and those of extremities usually require specific protocols.
- US ultrasound
- Cytology and histology can be used to differentiate between PR and CR in rare cases (e.g., after treatment to differentiate between residual benign lesions and residual malignant lesions in tumor types such as germ cell tumors).
- Target lesions should be selected on the basis of their size (lesions with the longest diameter) and their suitability for accurate repeated measurements (either by imaging techniques or clinically).
- a sum of the longest diameter (LD) for all target lesions will be calculated and reported as the baseline sum LD.
- the baseline sum LD will be used as reference by which to characterize the objective tumor response.
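As a small worked illustration, the baseline sum LD is the sum of the longest diameters of all target lesions. The percent-change helper below is an assumption about how the baseline might be used as a reference; the passage above does not give response thresholds, so none are encoded. The lesion measurements and function names are hypothetical.

```python
def baseline_sum_ld(target_lesion_diameters_mm):
    """Sum the longest diameter (LD) of every target lesion at baseline."""
    if not target_lesion_diameters_mm:
        raise ValueError("at least one target lesion is required")
    return sum(target_lesion_diameters_mm)

def percent_change_from_baseline(current_sum_mm, baseline_sum_mm):
    """Relative change of a follow-up sum LD against the baseline sum LD."""
    return 100.0 * (current_sum_mm - baseline_sum_mm) / baseline_sum_mm

# Hypothetical measurements in millimetres.
baseline = baseline_sum_ld([25.0, 30.0, 45.0])
change = percent_change_from_baseline(70.0, baseline)
print(baseline, change)
```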
- the best overall response is the best response recorded from the start of the treatment until disease progression/recurrence (taking as reference for PD the smallest measurements recorded since the treatment started).
- the patient's best response assignment will depend on the achievement of both measurement and confirmation criteria.
- the main goal of confirmation of objective response is to avoid overestimating the response rate observed. In cases where confirmation of response is not feasible, it should be made clear when reporting the outcome of such studies that the responses are not confirmed.
- the duration of overall response is measured from the time measurement criteria are met for CR or PR (whichever status is recorded first) until the first date that recurrence or PD is objectively documented, taking as reference for PD the smallest measurements recorded since the treatment started.
- Samples for use in the techniques described herein include any of various types of biological molecules, cells and/or tissues that can be isolated and/or derived from a subject.
- the sample can be isolated and/or derived from any fluid, cell or tissue.
- the sample can also be one isolated and/or derived from any fluid and/or tissue that predominantly comprises blood cells.
- the sample that is isolated and/or derived from a subject can be assayed for gene expression products.
- the sample is a fluid sample, a lymph sample, a lymph tissue sample or a blood sample.
- the sample is isolated and/or derived from peripheral blood.
- the sample may be isolated and/or derived from alternative sources, including from any one of various types of lymphoid tissue.
- a sample of blood is obtained from an individual according to methods well known in the art.
- a drop of blood is collected from a simple pin prick made in the skin of an individual.
- Blood may be drawn from an individual from any part of the body (e.g., a finger, a hand, a wrist, an arm, a leg, a foot, an ankle, stomach, and neck) using techniques known to one of skill in the art, such as phlebotomy.
- samples isolated and/or derived from blood include samples of whole blood, serum-reduced whole blood, serum-depleted blood, and serum-depleted and erythrocyte depleted blood.
- whole blood collected from an individual is fractionated (i.e., separated into components) before isolating products of the biomarkers from the sample.
- blood is serum-depleted (or serum-reduced).
- the blood is plasma-depleted (or plasma-reduced).
- blood is erythrocyte-depleted or erythrocyte-reduced.
- erythrocyte reduction is performed by preferentially lysing the red blood cells.
- erythrocyte depletion or reduction is performed by lysing the red blood cells and further fractionating the remaining cells.
- erythrocyte depletion or reduction is performed but the remaining cells are not further fractionated.
- blood cells are separated from whole blood collected from an individual using other techniques known in the art. For example, blood collected from an individual can be subjected to Ficoll-Hypaque™ (Pharmacia) gradient centrifugation. Such centrifugation may separate various types of cells in a blood sample. In particular, Ficoll-Hypaque™ gradient centrifugation is useful to isolate peripheral blood leukocytes (PBLs).
- the level of expression of a biomarker can be determined by any means known in the art.
- the quantity of RNA can be determined by various means, for example, by microarray (e.g., RNA microarray, cDNA microarray), quantitative polymerase chain reaction (qPCR), or sequencing technology (e.g., RNA-seq).
- a level of expression (when referring to RNA) is stated as a number of PCR cycles required to reach a threshold amount of RNA or DNA, e.g., 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, or 60 cycles.
- the level of expression when referring to RNA can also refer to a measurable quantity of a given nucleic acid as determined relative to the amount of total RNA, or cDNA used in QRT-PCR, in which the amount of total RNA used is, for example, 100 ng, 50 ng, 25 ng, 10 ng, 5 ng, 1.25 ng, 0.05 ng, 0.3 ng, 0.1 ng, 0.09 ng, 0.08 ng, 0.07 ng, 0.06 ng, or 0.05 ng.
- the level of expression of a nucleic acid can be determined by any methods known in the art.
- the level of expression is measured by hybridization analysis using nucleic acids corresponding to RNA isolated from one or more individuals, according to methods well known in the art.
- the label used can be a luminescent label, an enzymatic label, a radioactive label, a chemical label or a physical label.
- target and/or probe nucleic acids are labeled with a fluorescent molecule.
- the level of expression when referring to RNA can also refer to a measurable quantity of a given nucleic acid as determined relative to the amount of total RNA or cDNA used in microarray hybridizations.
- the amount of total RNA is 10 μg, 5 μg, 2.5 μg, 2 μg, 1 μg, 0.5 μg, 0.1 μg, 0.05 μg, 0.01 μg, 0.005 μg, 0.001 μg or the like.
- the level of expression when referring to RNA can also refer to the number of mapped reads identified by RNA-Seq. The reads can be further normalized, e.g., by the total number of mapped reads, so that expression levels are expressed as Fragments Per Kilobase of transcript per Million mapped reads (FPKM).
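The FPKM normalization mentioned above can be written out directly: fragments mapped to a transcript are divided by the transcript length in kilobases and by the total mapped fragments in millions. The sketch below uses that standard definition; the function name and the example counts are hypothetical.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of transcript per Million mapped reads."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total mapped fragments must be positive")
    # Equivalent to fragments / (length_bp / 1e3) / (total_mapped / 1e6).
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)

# Hypothetical example: 500 fragments on a 2,000 bp transcript, 10 million mapped.
value = fpkm(500, 2000, 10_000_000)
print(value)
```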
- RNA is obtained from the nucleic acid mix using a filter-based RNA isolation system from Ambion (RNAqueous™, Phenol-free Total RNA Isolation Kit, Catalog #1912, version 9908; Austin, Tex.) and the PAXgene™ Blood RNA System (from Pre-Analytix).
- The detailed method is described in pp. 55-104 of RNA Methodologies, A laboratory guide for isolation and characterization, 2nd edition, 1998, Robert E. Farrell, Jr., Ed., Academic Press.
- RNA is prepared using known commercial kits for isolating RNA (including isolating total RNA or mRNA, and the like) such as oligo dT based purification methods, Qiagen® RNA isolation methods, LeukoLOCK™ Total RNA Isolation System, MagMAX™-96 Blood Technology from Ambion, Promega® polyA mRNA isolation system, and the like.
- the level of transcribed mRNA can be quantified by quantitative real-time PCR (QRT-PCR), for example, with ABI Prism® 7900 Sequence Detector, Applied Biosystem Prism® instrument, Cepheid SmartCycler® instruments, Cepheid GeneXpert® instruments or the Roche LightCycler® 480 Real-Time PCR System.
- Table 1 lists genes for Groups A-G from which a set of genes can be selected.
- Table 3 lists genes for Groups J-Q. (There are no Groups H, I, or O.) The short name, full name, and aliases for each of these genes are listed in Table 2.
- a pre-treatment model can be used to determine the effectiveness score for immunotherapy response in a subject that has not received any immunotherapy.
- These models can involve different sets of genes. The model is applied to a training dataset, generating relevant parameters for a classifier, thus creating a pre-treatment classifier.
- a sample is collected from the subject before the subject receives the immunotherapy treatment.
- the sample is tested in accordance with a pre-treatment classifier, such as one of the disclosed pre-treatment classifiers, and the subject’s immunotherapy effectiveness score (e.g., the probability that the subject will respond to the immunotherapy, or a value indicative of the probability that the subject will respond to the immunotherapy) is calculated. Based on that score, a physician can determine whether to treat the subject with the immunotherapy, or instead to seek another type of therapy more likely to work in the subject.
- A “Core model” is a mathematical model that includes a core model gene set.
- Various types of mathematical models may be used as the core model, including, e.g., a regression model in the form of logistic regression, principal component analysis, linear discriminant analysis, correlated component analysis, etc.
- the gene set for the Core models includes genes that are selected from one or more of the following combinations (a)–(f):
- BAX BCL2-Associated X Protein
- a combination of LARGE and at least one gene selected from ADAM17, CCR3, CD86, HLADRA, IL23A, IL2RA, LTA, MIF, MYC, RHOC, S100A4, TNF, and TP53,
- additional genes can be added into the gene set for model construction.
- the gene set can be used in connection with a mathematical model, for example, logistic regression, to construct a Core model.
- the Core model can then be applied to a training dataset, generating appropriate classifier parameters, thus creating a “Core classifier.”
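Applying a model to a training dataset to generate classifier parameters can be illustrated with a small logistic regression fit. This is a sketch only: it assumes logistic regression (named in the text as one example), uses plain gradient descent, and trains on invented expression values for two genes. The function names, learning rate, epoch count, and data are all hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and intercept by batch gradient descent on log-loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_score(x, w, b):
    """Effectiveness score for one sample under the fitted parameters."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))

# Invented training data: columns are expression levels of two genes,
# labels are 1 for responders and 0 for non-responders.
X_train = [[2.0, 0.5], [1.8, 0.7], [0.4, 1.9], [0.6, 2.1]]
y_train = [1, 1, 0, 0]
weights, intercept = fit_logistic(X_train, y_train)
print(predict_score([1.9, 0.6], weights, intercept))
```

The fitted `weights` and `intercept` play the role of the classifier parameters; a production workflow would more likely rely on an established statistics library and on validation data, which this sketch omits.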
- A “Core + Group C” model is a mathematical model that includes a Core + Group C model gene set. Any gene or any combination of genes in Group C can be added to the gene set used in a Core model to build a Core + Group C model. Therefore, a Core + Group C model may involve 3 to 48 genes.
- the gene set then can be used in connection with a mathematical model, for example, logistic regression, to construct a Core + Group C model.
- the Core + Group C model can then be applied to a training dataset, generating appropriate classifier parameters, thus creating a “Core + Group C classifier.”
- A “Core + Group CD” model is a mathematical model that includes a Core + Group CD model gene set. One or more Group D genes can be added to Core + Group C models that have at least 3 genes, to arrive at the Core + Group CD model gene set.
- the gene set then can be used in connection with a mathematical model, for example, logistic regression, to construct a Core + Group CD model.
- the Core + Group CD model can then be applied to a training dataset, generating appropriate classifier parameters, thus creating a “Core + Group CD classifier.”
- the total number of genes that are selected from Group D divided by the total number of genes that are selected from the group of genes consisting of BAX, LARGE and genes in Groups A-D is 0.34 or less.
- a single Group D gene is added to a three-gene Core + Group C model to make a four-gene Core + Group CD model in which the Group D gene represents 25% of the total genes.
- two Group D genes can be added to a five-gene Core + Group C model to make a seven-gene Core + Group CD model in which the two Group D genes represent 28.6% of the total genes.
- all 21 Group D genes can be added to a gene set that includes LARGE, BAX, and all genes from Groups A, B, and C, to make a 69-gene Core + Group CD model in which the Group D genes represent 30.4% of the total genes.
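The Group D proportion rule and the three worked examples above can be checked with a few lines of arithmetic. The function names are hypothetical; the 0.34 limit and the gene counts come from the text.

```python
def group_d_fraction(n_group_d, n_total):
    """Fraction of the model's genes that come from Group D."""
    if n_total <= 0 or not 0 <= n_group_d <= n_total:
        raise ValueError("counts must satisfy 0 <= n_group_d <= n_total, n_total > 0")
    return n_group_d / n_total

def within_group_d_limit(n_group_d, n_total, limit=0.34):
    """True if the Group D fraction is at or below the stated limit."""
    return group_d_fraction(n_group_d, n_total) <= limit

# The three examples from the text: 1 of 4, 2 of 7, and 21 of 69 genes.
for n_d, n_total in [(1, 4), (2, 7), (21, 69)]:
    pct = 100.0 * group_d_fraction(n_d, n_total)
    print(f"{n_d}/{n_total} = {pct:.1f}% ok={within_group_d_limit(n_d, n_total)}")
```

By contrast, adding two Group D genes to a three-gene Core + Group C set would give 2/5 = 40%, which exceeds the limit.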
- A “Core + Group CE” model has at least six genes that are selected from Groups A-C (including at least one gene from each of Groups A-C), plus any one or more of the following alternatives a)–e):
- the total number of genes that are selected from Groups E1-E5, divided by the total number of genes that are selected from Groups A, B, C, E1, E2, E3, E4, and E5, is 0.25 or less.
- the gene set is used in connection with a mathematical model, for example, logistic regression, to construct a Core + Group CE model.
- the Core + Group CE model can then be applied to a training dataset, generating appropriate classifier parameters, thus creating a “Core + Group CE classifier.”
- A “Core + Group CDE” model has at least six genes that are selected from Groups A-D (including at least one gene from Groups A-D), plus any one or more of the following alternatives a)–e):
- the total number of genes that are selected from Groups E1-E5, divided by the total number of genes that are selected from Groups A, B, C, D, E1, E2, E3, E4, and E5, is 0.25 or less.
- the gene set is used in connection with a mathematical model, for example, logistic regression, to construct a Core + Group CDE model.
- the Core + Group CDE model can then be applied to a training dataset, generating appropriate classifier parameters, thus creating a “Core + Group CDE classifier.”
- A “Core + Group F2” model is a mathematical model that includes a Core + Group F2 model gene set. Group F2 has six genes. Any gene or any combination of genes in Group F2 can be added to the gene set used in any Core model. Therefore, a Core + Group F2 model involves at least 3 genes.
- the Core + Group F2 model gene set then can be used in connection with a mathematical model, for example, logistic regression, to construct a Core + Group F2 model.
- the Core + Group F2 model can be used to determine the effectiveness score for an immunotherapy in a subject who is chemotherapy treatment naive.
- the Core + Group F2 model can then be applied to a training dataset, generating appropriate classifier parameters, thus creating a “Core + Group F2 classifier.”
- A “Core + Group CF2” model is a mathematical model that includes a Core + Group CF2 model gene set. Any gene or any combination of genes in Group F2 can be added to the gene set used in a Core + Group C model. The resulting gene set then can be used in connection with a mathematical model, for example, logistic regression, to construct a Core + Group CF2 model.
- the Core + Group CF2 model can be used to determine the effectiveness score for an immunotherapy in a subject who is chemotherapy treatment naive.
- the Core + Group CF2 model can then be applied to a training dataset, generating appropriate classifier parameters, thus creating a “Core + Group CF2 classifier.”
- Group G includes all genes in Group A and Group F1. Therefore, Group G has 30 genes.
- A “Group G” model includes genes that are selected from one or more of the following combinations (a)–(d):
- the gene set then can be used in connection with a mathematical model, for example, logistic regression, to construct a Group G model.
- the Group G model can be used to determine the effectiveness score for an immunotherapy in a subject who is chemotherapy treatment naive.
- one or more additional genes listed in Group A and/or Group B can be included in the gene set for model construction.
- a Group G model can further include one or more genes selected from Group B, and one or more genes selected from Group C.
- the Group G model can then be applied to a training dataset, generating appropriate classifier parameters, thus creating a “Group G classifier.”
- A “pretreatment CDK2 model” is a mathematical model that includes a pretreatment model gene set that includes Cyclin-Dependent Kinase 2 (“CDK2”).
- the gene set for the pretreatment CDK2 models includes any of the combinations listed below as (a), (b), (c), (d), (e), (f), (g), (h), (i), (j), and (k), with or without other genes:
- the gene set can further include at least one gene, at least two genes, or at least three genes selected from AXIN2, BAD, CD26, CD97, CDKN1B, CXCR3, FOXP3, GZMA, ICOS, IL18BP, MMP9, NFKB1, PLA2G7, PTPRC, TGFB1, TLR9, TNFRSF13B, TNFRSF1B, TNFSF6, TOSO, and TXNRD1.
- In some embodiments, the gene set includes CDK2 and TIMP1. In some embodiments, the gene set includes CDK2, TIMP1, and NFKB1. In some embodiments, the gene set includes CDK2, TIMP1, NFKB1, and TXNRD1.
- the gene set can be used in connection with a mathematical model, for example, logistic regression, to construct a pretreatment CDK2 model.
- the pretreatment CDK2 model can then be applied to a training dataset, generating appropriate classifier parameters, thus creating a “pretreatment CDK2 classifier.”
- the gene set for Core(1) models includes genes that are selected from one or more of the following combinations (a)–(f):
- Group A1 has CDK2 and all genes in Group A.
- Group B1 has 22 genes (Table 69), including all genes in Group B. After the genes are selected from one or more of the combinations (a)–(f) described above, additional genes (such as one or more genes that are listed in Group A1 and/or Group B1) can be added into the gene set for model construction.
- the gene set can be used in connection with a mathematical model, for example, logistic regression, to construct Core(1) models. These models can then be applied to a training dataset, generating appropriate classifier parameters, thus creating a “Core(1) classifier.”
- A “Core(1) + Group C1” model is a mathematical model that includes a Core(1) + Group C1 model gene set. Any gene or any combination of genes in Group C1 can be added to the gene set used in a Core(1) model to build a Core(1) + Group C1 model. The gene set then can be used in connection with a mathematical model, for example, logistic regression, to construct a Core(1) + Group C1 model. The Core(1) + Group C1 model can then be applied to a training dataset, generating appropriate classifier parameters, thus creating a “Core(1) + Group C1 classifier.”
- A “Core(1) + Group C1D1” model is a mathematical model that includes a Core(1) + Group C1D1 model gene set.
- One or more Group D1 genes can be added to a Core(1) + Group C1 model gene set that has at least 3 genes, to arrive at the Core(1) + Group C1D1 model gene set.
- the gene set then can be used in connection with a mathematical model, for example, logistic regression, to construct a Core(1) + Group C1D1 model.
- the Core(1) + Group C1D1 model can then be applied to a training dataset, generating appropriate classifier parameters, thus creating a “Core(1) + Group C1D1 classifier.”
- the total number of genes that are selected from Group D1 divided by the total number of genes that are selected from the group of genes consisting of BAX, LARGE and genes in Groups A1, B1, C1, and D1 is 0.34 or less.
- a post-treatment model can be used to determine the effectiveness score for immunotherapy response in a subject who has received an immunotherapy.
- the post-treatment models can offer an early determination regarding whether the immunotherapy will eventually be effective. If the immunotherapy treatment is determined not likely to be effective for a subject who is undergoing the treatment, a recommendation to terminate the immunotherapy treatment for that subject can be made. If appropriate, the subject can then be started on a different type of treatment, such as a different immunotherapy, chemotherapy, radiation, or surgery.
- These post-treatment models also involve different sets of genes. In general, samples for post-treatment models are collected from the subject after the immunotherapy treatment has begun, e.g., about 1 to 6 months after the treatment has begun.
- samples are collected at about 30 days, 60 days, or 90 days after the treatment has begun. They may be collected just once after treatment has begun, or two or more times during the course of treatment, to track efficacy of the treatment over time. Those samples can be used to produce a training dataset for generating post-treatment models and post-treatment classifiers.
- a post-treatment model is applied to such a training dataset, generating relevant parameters for a classifier, thus creating a post-treatment classifier.
- Some exemplary post-treatment models are listed below. The lists of genes for Groups J-Q are shown in Table 3.
- A “Group J/K” model is a mathematical model that includes a Group J/K model gene set.
- Group J (shown in Table 3) has five genes.
- Group K has 15 genes.
- the Group J/K model gene set includes genes that are selected from one or both of the following combinations (a) and (b):
- additional genes such as one or more additional genes from Group J and/or Group K
- additional genes can be added into the gene set for model construction.
- the gene set can be used in connection with a mathematical model, for example, logistic regression, to construct a Group J/K model.
- the Group J/K model can then be applied to a training dataset, generating appropriate classifier parameters, thus creating a “Group J/K classifier.”
- A “Group J/K + L” model is a mathematical model that includes a Group J/K + L model gene set. Any gene or any combination of genes in Group L can be added to the gene set used in a Group J/K model to build a Group J/K + L model. The gene set then can be used in connection with a mathematical model, for example, logistic regression, to construct a Group J/K + L model. The Group J/K+L model can then be applied to a training dataset, generating appropriate classifier parameters, thus creating a “Group J/K+L classifier.”
- (3) Group J/K + LM
- A “Group J/K + LM” model is a mathematical model that includes a Group J/K + LM model gene set.
- Group M genes can be added to Group J/K + L models that have at least 3 genes, to arrive at the Group J/K + LM model gene set.
- the gene set then can be used in connection with a mathematical model, for example, logistic regression, to construct a Group J/K + LM model.
- the Group J/K+LM model can then be applied to a training dataset, generating appropriate classifier parameters, thus creating a “Group J/K+LM classifier.”
- the total number of genes selected from Group M divided by the total number of genes selected from Groups J-M equals 0.34 or less.
- A “Group J/K + LN” model has at least six genes that are selected from Groups J-L (including at least one gene from each of Groups J-L), plus any one or more of the following alternatives a)–e):
- the total number of genes selected from Groups N1-N5, divided by the total number of genes selected from Groups J, K, L, N1, N2, N3, N4, and N5, equals 0.25 or less.
- the gene set is used in connection with a mathematical model, for example, logistic regression, to construct a Group J/K + LN model.
- the Group J/K+LN model can then be applied to a training dataset, generating appropriate classifier parameters, thus creating a “Group J/K+LN classifier.”
- A “Group J/K + LMN” model has at least six genes selected from Groups J-M (including at least one gene from each of Groups J-M), plus any one or more of the following alternatives a)–e):
- the total number of genes selected from Groups N1-N5, divided by the total number of genes selected from Groups J, K, L, M, N1, N2, N3, N4, and N5, is 0.25 or less.
- the gene set is used in connection with a mathematical model, for example, logistic regression, to construct a Group J/K + LMN model.
- the Group J/K+LMN model can then be applied to a training dataset, generating appropriate classifier parameters, thus creating a “Group J/K+LMN classifier.”
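The counting rules for a Group J/K + LMN gene set can be sketched in Python as follows. This is only an illustration: the gene names are placeholders (the actual group memberships are listed in Table 3), the function name is hypothetical, the groups are assumed not to share genes, and the alternatives a)–e) are not reproduced here, so they are not checked.

```python
def check_jk_lmn_gene_set(selected, groups, max_n_fraction=0.25, min_core_genes=6):
    """Return True if `selected` satisfies the Group J/K + LMN counting rules.

    selected: set of gene names chosen for the model.
    groups:   dict mapping a group name ("J", "K", ..., "N5") to a set of gene names.
    Assumes the groups do not share genes.
    """
    core_names = ["J", "K", "L", "M"]
    n_names = ["N1", "N2", "N3", "N4", "N5"]
    core_selected = {g for name in core_names for g in groups[name] if g in selected}
    n_selected = {g for name in n_names for g in groups[name] if g in selected}
    # At least one selected gene from each of Groups J, K, L, and M.
    one_from_each = all(groups[name] & selected for name in core_names)
    total = len(core_selected) + len(n_selected)
    # Genes from Groups N1-N5 divided by genes from Groups J, K, L, M, N1-N5.
    n_fraction = len(n_selected) / total if total else 0.0
    return (len(core_selected) >= min_core_genes
            and one_from_each
            and n_fraction <= max_n_fraction)


# Placeholder gene names, for illustration only.
groups = {
    "J": {"j1", "j2"}, "K": {"k1", "k2"}, "L": {"l1"}, "M": {"m1"},
    "N1": {"n1"}, "N2": {"n2"}, "N3": {"n3"}, "N4": set(), "N5": set(),
}
core = {"j1", "j2", "k1", "k2", "l1", "m1"}
ok = check_jk_lmn_gene_set(core | {"n1", "n2"}, groups)                 # 2 / 8 = 0.25
too_many_n = check_jk_lmn_gene_set(core | {"n1", "n2", "n3"}, groups)   # 3 / 9 > 0.25
```

The same pattern applies to the 0.34 limit on Group M genes by changing the group lists and the limit.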
- A “Group J/K + P2” model is a mathematical model that includes a Group J/K + P2 model gene set. Any gene or any combination of genes in Group P2 can be added to the gene set used in any Group J/K model. Therefore, a Group J/K + P2 model involves at least 3 genes. The Group J/K + P2 model gene set then can be used in connection with a
- the Group J/K + P2 model can be used to determine the effectiveness score for an immunotherapy in a subject who is chemotherapy treatment naive.
- the Group J/K+P2 model can then be applied to a training dataset, generating appropriate classifier parameters, thus creating a “Group J/K+P2 classifier.”
- A “Group J/K + LP2” model is a mathematical model that includes a Group J/K + LP2 model gene set. Any gene or any combination of genes in Group P2 can be added to the gene set used in a Group J/K + L model. The resulting gene set then can be used in connection with a mathematical model, for example, logistic regression, to construct a Group J/K + LP2 model.
- the Group J/K + LP2 model can be used to determine the effectiveness score for an immunotherapy in a subject who is chemotherapy treatment naive.
- the Group J/K+LP2 model can then be applied to a training dataset, generating appropriate classifier parameters, thus creating a “Group J/K+LP2 classifier.”
- Group Q includes LARGE, all genes in Group J, and all genes in Group P1.
- a Group Q model includes at least two genes selected from Group Q.
- the gene set can be used in connection with a mathematical model, for example, logistic regression, to construct a Group Q model.
- the Group Q model can be used to determine the effectiveness score for an immunotherapy in a subject who is chemotherapy treatment naive.
- the Group Q model can then be applied to a training dataset, generating appropriate classifier parameters, thus creating a “Group Q classifier.”
- (9) Post-treatment CDK2 model/classifier
- A “post-treatment CDK2 model” is a mathematical model that includes a post-treatment model gene set that includes at least CDK2.
- the gene set for the post-treatment CDK2 model includes any of the combinations listed below as (a), (b) and (c):
- the gene set includes CDK2 and TIMP1.
- the gene set includes CDK2, TIMP1 and NFKB1. In some embodiments, the gene set includes CDK2, TIMP1, NFKB1, and TXNRD1.
- the gene set can be used in connection with a mathematical model, for example, logistic regression, to construct a post-treatment CDK2 model.
- the post-treatment CDK2 model can then be applied to a training dataset, generating appropriate classifier parameters, thus creating a “post-treatment CDK2 classifier.”
- A “Group J1/K1” model is a mathematical model that includes a Group J1/K1 model gene set.
- the Group J1/K1 model gene set includes genes that are selected from one or both of the following combinations (a) and (b):
- Group J1 includes all genes in Group J and CDK2. Several genes from Group M have been moved to Group K, creating Group K1 and Group M1. Thus, Group K1 has 38 genes, including all genes in Group K, plus several genes from Group M. After the genes are selected from one or both of the combinations (a) and (b) described immediately above, additional genes (such as one or more additional genes from Group J1 and/or Group K1) can be added into the gene set for model construction.
- the gene set can be used in connection with a mathematical model, for example, logistic regression, to construct a Group J1/K1 model.
- the Group J1/K1 model can then be applied to a training dataset, generating appropriate classifier parameters, thus creating a “Group J1/K1 classifier.”
- Group J1/K1 + L
- A “Group J1/K1 + L” model is a mathematical model that includes a Group J1/K1 + L model gene set. Any gene or any combination of genes in Group L can be added to the gene set used in a Group J1/K1 model to build a Group J1/K1 + L model. The gene set then can be used in connection with a mathematical model, for example, logistic regression, to construct a Group J1/K1 + L model. The Group J1/K1+L model can then be applied to a training dataset, generating appropriate classifier parameters, thus creating a “Group J1/K1+L classifier.”
- A “Group J1/K1 + LM1” model is a mathematical model that includes a Group J1/K1 + LM1 model gene set.
- One or more Group M1 genes can be added to a Group J1/K1 + L model gene set that has at least 3 genes, to arrive at the Group J1/K1 + LM1 model gene set.
- the gene set then can be used in connection with a mathematical model, for example, logistic regression, to construct a Group J1/K1 + LM1 model.
- the Group J1/K1+LM1 model can then be applied to a training dataset, generating appropriate classifier parameters, thus creating a “Group J1/K1+LM1 classifier.”
- the total number of genes selected from Group M1 divided by the total number of genes selected from Groups J1, K1, L, and M1 equals 0.34 or less.
- A “Group J1/K1 + LN” model has at least six genes that are selected from Groups J1, K1, and L (including at least one gene from each of Groups J1, K1, and L), plus any one or more of the following alternatives a)–e):
- the total number of genes selected from Groups N1-N5, divided by the total number of genes selected from Groups J1, K1, L, N1, N2, N3, N4, and N5, equals 0.25 or less.
- the gene set is used in connection with a mathematical model, for example, logistic regression, to construct a Group J1/K1 + LN model.
- the Group J1/K1+LN model can then be applied to a training dataset, generating appropriate classifier parameters, thus creating a “Group J1/K1+LN classifier.”
- Group J1/K1 + LM1N
- A “Group J1/K1 + LM1N” model has at least six genes selected from Groups J1, K1, L, and M1 (including at least one gene from each of Groups J1, K1, L, and M1), plus any one or more of the following alternatives a)–e):
- the total number of genes selected from Groups N1-N5, divided by the total number of genes selected from Groups J1, K1, L, M1, N1, N2, N3, N4, and N5, is 0.25 or less.
- the gene set is used in connection with a mathematical model, for example, logistic regression, to construct a Group J1/K1 + LM1N model.
- the Group J1/K1+LM1N model can then be applied to a training dataset, generating appropriate classifier parameters, thus creating a “Group J1/K1+LM1N classifier.”
- A “Group J1/K1 + M1N” model has at least six genes selected from Groups J1, K1, and M1 (including at least one gene from each of Groups J1, K1, and M1), plus any one or more of the following alternatives a)–e):
- the total number of genes selected from Groups N1-N5, divided by the total number of genes selected from Groups J1, K1, M1, N1, N2, N3, N4, and N5, is 0.25 or less.
- the gene set is used in connection with a mathematical model, for example, logistic regression, to construct a Group J1/K1 + M1N model.
- the Group J1/K1+M1N model can then be applied to a training dataset, generating appropriate classifier parameters, thus creating a “Group J1/K1+M1N classifier.”
- Combo model
- a pre-treatment model can be used in combination with a post-treatment model to create a combination (“Combo”) model.
- the pre-treatment model can be any pre-treatment model described above, e.g. a Core model, a Core + Group C model, a Core + Group CD model, a Core + Group CE model, a Core + Group CDE model, a Core + Group F2 model, a Core + Group CF2 model, a Group G model, etc.
- the pre-treatment model can then be applied to a training dataset generated from samples taken from subjects prior to treatment, generating appropriate pre-treatment classifier parameters, thus creating a pre-treatment classifier.
- a post-treatment model used in a Combo model can be any post-treatment model described above, e.g. a Group J/K model, a Group J/K +L model, a Group J/K +LM model, etc., but the coefficients and constant, in some instances, will be different from what is described above for post-treatment classifiers.
- the coefficients and the constant of a Combo post-treatment classifier are obtained by applying the Combo post-treatment model to a subset of the training data, where the subset includes all subjects who were predicted by a pre-treatment model to be responders, and excludes any subjects predicted by the pre-treatment model to be non-responders.
- the subset used to obtain the coefficients and constant for a Combo post-treatment classifier is made up of those predicted to be responders, and is not limited to those who turn out to be actual responders.
- the Combo post-treatment classifiers are obtained by applying the Combo post-treatment model to this subset, generating Combo post-treatment classifier parameters.
- the Combo post-treatment classifier has a relatively high sensitivity, e.g., in some of these models, the sensitivity can be higher than 80%, 90% or 95%.
- the Combo post-treatment classifiers have at least 90% sensitivity with a corresponding specificity of greater than 50%.
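The selection of the training subset for a Combo post-treatment classifier can be sketched as below. The record layout, the field names, the toy pre-treatment classifier, and the threshold of 0.5 are all assumptions made for illustration. The point shown is that the subset is defined by the pre-treatment prediction, not by the actual outcome.

```python
def combo_training_subset(subjects, pre_classifier, reference_threshold):
    """Keep every subject the pre-treatment classifier predicts to be a responder.

    Actual response status is deliberately ignored: predicted responders who
    later turn out to be non-responders stay in the subset.
    """
    return [s for s in subjects
            if pre_classifier(s["pre_levels"]) > reference_threshold]


def toy_pre_classifier(levels):
    # Hypothetical stand-in for a trained pre-treatment classifier.
    return levels["GENE_A"]


subjects = [
    {"id": 1, "pre_levels": {"GENE_A": 0.9}, "responded": True},
    {"id": 2, "pre_levels": {"GENE_A": 0.8}, "responded": False},
    {"id": 3, "pre_levels": {"GENE_A": 0.2}, "responded": True},
    {"id": 4, "pre_levels": {"GENE_A": 0.1}, "responded": False},
]
subset = combo_training_subset(subjects, toy_pre_classifier, 0.5)
subset_ids = [s["id"] for s in subset]
```

Here subject 2 remains in the subset even though that subject did not respond, and subject 3 is excluded even though that subject did. The post-treatment model would then be fitted to the post-treatment samples of the subjects in `subset` only.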
- the Combo post-treatment classifier can be used to monitor the patient over time to see whether the immunotherapy is still working. Samples would be taken periodically during treatment, e.g., once per week, per two weeks, per month, per two months, per three months, or per six months, and assessed using the Combo post-treatment classifiers. If the calculations show that the immunotherapy is no longer working, the therapy can be halted. If appropriate, the patient can then be switched to another type of therapy. In some embodiments, multiple pre-treatment classifiers and multiple Combo post-treatment classifiers can be used in the Combo model to improve accuracy.
- a pre-treatment classifier is applied to a sample from a subject (e.g., a cancer patient) who has not yet been treated by an immunotherapy. Using that pre-treatment classifier, the subject is classified as a (predicted) responder or as a (predicted) non-responder. A subject who is classified as a non-responder will not receive the immunotherapy but may be treated by some other, more appropriate technique.
- a subject who is classified as a responder will receive the immunotherapy.
- another sample will be taken from that subject and a Combo post-treatment classifier will be applied to that sample. This allows the caregiver to determine whether the treated subject classifies with responders under the Combo post-treatment model and so should continue to be treated with the
- a method that employs the Combo classifier may be carried out as follows.
- a subject is identified as having a cancer or other condition potentially treatable with an immunotherapy.
- a blood sample is obtained from the subject.
- a pre-treatment classifier (e.g., a Core classifier) is applied to the gene expression levels in this sample to determine the subject’s immunotherapy “effectiveness score.” If the effectiveness score is greater than a reference threshold, a recommendation is made to treat the subject with the immunotherapy, and the immunotherapy will be administered to the subject.
- Subjects whose effectiveness scores are below the reference threshold are predicted by the model to be non-responders, and so are excluded from the immunotherapy treatment.
- the result of applying the classifier (the “result” being, e.g., the value of the dependent variable (Y) in a logistic regression) to the subject’s gene expression data will be used to determine the probability of immunotherapy response. If the probability is greater than a reference threshold, the subject’s caregiver will recommend the subject be treated with the immunotherapy, and the immunotherapy will be administered to the subject.
- a second blood sample is obtained after the immunotherapy has begun.
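- a Combo post-treatment classifier (e.g., a Group J/K classifier, or a Group J/K+L classifier, such as one of the models described in Table 41) is applied to the gene expression levels in the second sample to determine an effectiveness score.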
- the effectiveness score is indicative of the probability that the subject will eventually respond, or will continue to respond, to the ongoing immunotherapy. If the effectiveness score is greater than a reference threshold, the subject will continue to be treated with the immunotherapy. If the effectiveness score is less than or equal to the reference threshold, the immunotherapy treatment will be terminated, or at least a recommendation to terminate the immunotherapy treatment will be made.
- the effectiveness score will be used to determine the probability of immunotherapy response. If the probability is greater than a reference threshold, the subject’s caregiver will recommend the subject continue to be treated with the immunotherapy.
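The two-stage decision flow of a method employing the Combo classifier can be summarized in a short sketch. The function name, the returned strings, and the default thresholds of 0.5 are hypothetical. The text specifies "less than or equal to" only for the post-treatment step; treating a pre-treatment score exactly equal to the threshold as a predicted non-responder is an assumption.

```python
def combo_recommendation(pre_score, post_score=None,
                         pre_threshold=0.5, post_threshold=0.5):
    """Map effectiveness scores to a recommendation.

    pre_score:  score from the pre-treatment classifier (first blood sample).
    post_score: score from the Combo post-treatment classifier (second blood
                sample), or None if treatment has not yet been assessed.
    """
    if pre_score <= pre_threshold:
        # Predicted non-responder: excluded from the immunotherapy.
        return "do not start immunotherapy"
    if post_score is None:
        return "start immunotherapy"
    if post_score > post_threshold:
        return "continue immunotherapy"
    # Score less than or equal to the reference threshold.
    return "recommend terminating immunotherapy"
```

For repeated monitoring, the same function can be called again with each new post-treatment score.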
- the sensitivity and the specificity of the classifier depend on the selected reference threshold (or the cut-off point). The higher the reference threshold, the lower the sensitivity and the higher the specificity.
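The trade-off between sensitivity and specificity as the reference threshold moves can be illustrated with a small sketch. The scores and labels are made-up values (1 = responder, 0 = non-responder), not data from the disclosure.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Classify score > threshold as 'responder' and return (sensitivity, specificity)."""
    tp = fn = tn = fp = 0
    for score, label in zip(scores, labels):
        predicted_responder = score > threshold
        if label == 1 and predicted_responder:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted_responder:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Illustrative values only.
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

low = sensitivity_specificity(scores, labels, 0.35)   # (1.0, 0.75)
mid = sensitivity_specificity(scores, labels, 0.65)   # (0.5, 0.75)
high = sensitivity_specificity(scores, labels, 0.75)  # (0.5, 1.0)
```

Raising the threshold never increases sensitivity and never decreases specificity, which is the relationship stated above.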
- the reference threshold can be optimized for the sensitivity, the specificity, or the percentage of correct predictions.
- Possible combinations making up a set of genes
- a set of genes includes any combination of genes that are selected based on the presently disclosed rules for model construction. For instance, the number of possible combinations of a subset of n genes selected from a group of m genes is described using the general formula: $\binom{m}{n} = \frac{m!}{n!\,(m-n)!}$
- the number of possible combinations of at least n genes selected from a group of m genes is calculated by using the general formula: $\sum_{k=n}^{m} \frac{m!}{k!\,(m-k)!}$
- the number of possible combinations of at least 1 gene selected from a group of 4 genes is $\binom{4}{1} + \binom{4}{2} + \binom{4}{3} + \binom{4}{4} = 4 + 6 + 4 + 1 = 15$.
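These counts can be checked with the standard binomial coefficient. The helper name below is hypothetical; `math.comb` is part of the Python standard library (Python 3.8 and later).

```python
import math


def combinations_at_least(m, n):
    """Number of ways to choose at least n genes from a group of m genes."""
    return sum(math.comb(m, k) for k in range(n, m + 1))


exactly_two_of_four = math.comb(4, 2)              # 6
at_least_one_of_four = combinations_at_least(4, 1)  # 4 + 6 + 4 + 1 = 15
```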
- classifiers are generated via data processing system 18 by applying one or more mathematical models to data representative of the level of transcribed mRNAs of selected genes across a population encompassing both subjects who do not respond to an immunotherapy and subjects who do respond to the same immunotherapy.
- the mathematical model is logistic regression, as described herein.
- data processing system 18 generates the classifier by applying the mathematical model with a set of genes to the training dataset to determine values for logistic regression equation coefficients and logistic regression equation constants.
- the training data set includes data representing one or more levels of mRNA of a training population (including individuals who responded to a particular immunotherapy and individuals who didn’t respond to the immunotherapy).
- data processing system 18 generates and trains a classifier for each gene set (e.g., the Core gene set, the Core + Group C gene set, the Core + Group CD gene set, and so forth).
- the classifier is the resultant mathematical model including the values for logistic regression equation coefficients and logistic regression equation constants.
- data processing system 18 applies one or more of these generated classifiers to data specifying level of mRNA expression in one or more test subjects to determine the effectiveness score for
- the set of genes is selected based on a rule disclosed herein.
- additional genes in the gene set can be selected based on the p value as a measure of the likelihood that the transcribed mRNA of the individual gene can distinguish between the two phenotypic trait subgroups (i.e., responder vs. non-responder).
- additional genes are chosen to test in combination by input into a model wherein the p value of each gene is less than 0.5, less than 0.2, less than 0.1, less than 0.05, less than 0.01, less than 0.005, less than 0.001, less than 0.0005, less than 0.0001, less than 0.00005, less than 0.00001, less than 0.000005, less than 0.000001, etc.
- Classifiers can be used alone or in combination with each other to create a formula for determining the effectiveness score for an immunotherapy in a subject.
- One or more selected classifiers can be used to generate a formula. It is not necessary that the method used to generate the data for creating the formulas be the same method used to generate data from the test subject.
- the individuals of the training population used to derive the model or the classifier are different from the individuals of a population used to test the model or the classifier.
- this allows a person skilled in the art to characterize an individual whose phenotypic trait characterization is unknown, for example, to determine the effectiveness score for an immunotherapy in the individual, or the probability that the individual will respond to an immunotherapy.
- the data that are input into the mathematical model can be any data that are representative of the expression levels of transcribed mRNA.
- Mathematical models useful in accordance with the disclosure include those using both supervised and unsupervised learning techniques.
- the mathematical model chosen uses supervised learning in conjunction with a training population to evaluate each possible combination of transcribed mRNAs.
- Various mathematical models can be used, for example, a regression model, a logistic regression model, a neural network, a clustering model, principal component analysis, correlated component analysis, nearest neighbor classifier analysis, linear discriminant analysis, quadratic discriminant analysis, a support vector machine, a decision tree, a genetic algorithm, classifier optimization using bagging, classifier optimization using boosting, classifier optimization using the Random Subspace Method, a projection pursuit, and genetic programming and weighted voting, etc.
- Applying a mathematical model to the data will generate one or more classifiers.
- multiple classifiers are created that are satisfactory for the given purpose (e.g., all have sufficient AUC and/or sensitivity and/or specificity).
- a formula is generated that utilizes more than one classifier.
- a formula can be generated that utilizes classifiers in series. Other possible combinations and weightings of classifiers would be understood and are encompassed herein.
- a classifier can be evaluated for its ability to properly characterize each individual of a population (e.g., a training population or a validation population) using methods known to a person of ordinary skill in the art. Various statistical criteria can be used, for example, area under the curve (AUC), percentage of correct predictions, sensitivity, and/or specificity.
- the classifier is evaluated by cross-validation, leave-one-out cross-validation (LOOCV), n-fold cross-validation, and jackknife analysis.
- each classifier is evaluated for its ability to properly characterize those individuals of a population not used to generate the classifier (a “test population”).
- the method used to evaluate the classifier for its ability to properly characterize each individual of the training population is a method that evaluates the classifier's sensitivity (true positive fraction) and 1-specificity (false positive fraction).
- the method used to test the classifier is a Receiver Operating Characteristic (ROC), which provides several parameters to evaluate both the sensitivity and the specificity of the result of the equation generated.
- a perfect ROC area score of 1.0 is indicative of both 100% sensitivity and 100% specificity.
- classifiers are selected on the basis of the evaluation score.
- the evaluation scoring system used is a receiver operating characteristic (ROC) curve score determined by the area under the ROC curve.
- classifiers with scores of greater than 0.95, 0.9, 0.85, 0.8, 0.7, 0.65, 0.6, 0.55, or 0.5 are chosen.
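One way to compute the area under the ROC curve without plotting it is the pairwise (rank-based) formulation: the AUC equals the fraction of responder/non-responder pairs in which the responder receives the higher score, with ties counted as one half. The sketch below uses this standard identity; the scores and labels are illustrative values, not data from the disclosure.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (1 = responder, 0 = non-responder)."""
    responders = [s for s, y in zip(scores, labels) if y == 1]
    non_responders = [s for s, y in zip(scores, labels) if y == 0]
    if not responders or not non_responders:
        raise ValueError("both classes must be present")
    wins = 0.0
    for r in responders:
        for n in non_responders:
            if r > n:
                wins += 1.0
            elif r == n:
                wins += 0.5
    return wins / (len(responders) * len(non_responders))


# Illustrative values only.
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
auc = roc_auc(scores, labels)                       # 14 of 16 pairs ordered correctly
perfect = roc_auc([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])
```

A classifier that separates the two groups completely scores 1.0, and a classifier no better than chance scores about 0.5.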
- a sensitivity threshold can be set, and classifiers ranked on the basis of the specificity are chosen.
- classifiers with a cutoff for specificity of greater than 0.95, 0.9, 0.85, 0.8, 0.7, 0.65, 0.6, 0.55, 0.5, or 0.45 can be chosen.
- the specificity threshold can be set, and classifiers ranked on the basis of sensitivity (e.g., greater than 0.95, 0.9, 0.85, 0.8, 0.7, 0.65, 0.6, 0.55, 0.5, or 0.45) can be chosen.
- only the top ten ranking classifiers, the top twenty ranking classifiers, or the top one hundred ranking classifiers are selected.
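Selecting classifiers by setting a sensitivity threshold and ranking by specificity can be sketched as follows. The record layout, the classifier names, and the performance figures are hypothetical and serve only to show the mechanics.

```python
def select_classifiers(results, min_sensitivity=0.9, top_n=10):
    """Keep classifiers meeting the sensitivity threshold, ranked by specificity."""
    eligible = [r for r in results if r["sensitivity"] >= min_sensitivity]
    ranked = sorted(eligible, key=lambda r: r["specificity"], reverse=True)
    return ranked[:top_n]


# Hypothetical evaluation results.
results = [
    {"name": "classifier_1", "sensitivity": 0.95, "specificity": 0.52},
    {"name": "classifier_2", "sensitivity": 0.91, "specificity": 0.61},
    {"name": "classifier_3", "sensitivity": 0.80, "specificity": 0.90},
    {"name": "classifier_4", "sensitivity": 0.93, "specificity": 0.48},
]
top_two = [r["name"] for r in select_classifiers(results, 0.9, top_n=2)]
```

Swapping the two keys gives the converse procedure, in which a specificity threshold is set and classifiers are ranked by sensitivity.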
- the ROC curve can be calculated by various statistical tools, including, but not limited to, Statistical Analysis System (SAS®), CORExpress® statistical analysis software, and a web based calculator for ROC curves provided by Department of Radiology and Radiological Science, Johns Hopkins University School of Medicine, at a webpage located at World Wide Web (rad.jhmi.edu/jeng/javarad/roc/JROCFITi.html).
- the utility of the combinations and classifiers determined by a mathematical model will depend upon some characteristics (e.g., race, age group, gender, medical history) of the population used to generate the data for input into the model.
- the reference or training population includes between 50 and 100 subjects. In another embodiment, the reference population includes between 100 and 500 subjects. In still other embodiments, the reference population includes two or more populations each including between 50 and 100, 100 and 500, between 500 and 1000, or more than 1000 subjects.
- the reference population includes two or more subpopulations. In one embodiment, the phenotypic trait characteristics of the subpopulations are similar but for the phenotypic trait that is under investigation, for example, response to an immunotherapy. In some embodiments, the subpopulations are of roughly equivalent numbers. The present methods do not require using data from every member of a population, but instead may rely on data from a subset of a population in question.
- For a reference population used to provide input into a mathematical model to identify those biomarkers that are useful in determining the effectiveness score for an immunotherapy in a subject or the probability that a subject responds to an immunotherapy, the reference population includes individuals who respond to the immunotherapy (the first subpopulation) and individuals who do not respond to the immunotherapy (the second subpopulation).
- a test population, which comprises individuals who respond to an immunotherapy and individuals who do not respond to an immunotherapy, is used to evaluate a classifier for its ability to properly characterize each individual.
- Data for input into the mathematical models are data representative of the respective levels of products of a set of genes.
- the data are the measure that represents a gene-specific level of transcribed RNA from a gene of a set of genes.
- the RNA includes, but is not limited to, mRNA, all of the spliced variants of the mRNA, and unspliced transcript.
- all of the RNA products are expressed in blood.
- the data are the measure that represents a gene-specific level of protein.
- the level of proteins can be determined by any techniques that are known in the art, for example, protein mass spectrometry and enzyme-linked immunosorbent assay (ELISA).
- the mathematical model can be applied to a dataset to generate a classifier.
- the dataset need not comprise data for each biomarker product of each individual.
- the “dataset,” in the context of a dataset to be applied to a classifier, can include data representing levels of each biomarker for each individual.
- the data set includes data representing levels of each biomarker for fewer than all of the individuals (e.g., 99%, 95%, 90%, 85%, 80%, 75%, 70% or fewer) and can still be useful for purposes of generating a classifier.
- a mathematical model has the form:
- V is a value indicating the effectiveness score for an immunotherapy response.
- the effectiveness score is indicative of the probability that a test subject responds to an immunotherapy.
- Xi represents the level of mRNA transcribed from an ith gene of the set of genes in a sample from the test subject.
- βi is a coefficient for f(Xi), which is a variable that corresponds to the level of mRNA transcribed from the ith gene.
- f(x) may be a function for normalization or standardization.
- the formula may include additional parameters to account for age, sex, and race category.
- V is a value indicating the effectiveness score for an immunotherapy response. In some embodiments, V is an actual probability (a number varying between 0 and 1). In other embodiments, V is a value from which a probability can be derived.
- the mathematical model is a regression model, for example, a logistic regression model or a linear regression model.
- the regression model can be used to test various sets of genes.
- the classifiers generated can be used to analyze expression data from a test subject and to provide a result indicative of a quantitative measure of the test subject, for example, the predicted effectiveness of an immunotherapy.
- the dependent variable indicates a quantitative measure of a biological feature (e.g., effectiveness of an immunotherapy).
- the dependent variable Y depends on k explanatory variables (the measured characteristic values for the k select genes, e.g., the level of transcribed mRNA from subjects in the first and second subgroups), plus an error term that encompasses various unspecified omitted factors.
- the parameter β1 gauges the effect of the first explanatory variable X1 on the dependent variable Y.
- β2 gives the effect of the explanatory variable X2 on Y.
- a logistic regression model is a non-linear transformation of the linear regression.
- the logistic regression model is often referred to as the “logit” model and can be expressed as $\ln\left(\frac{p}{1-p}\right) = \alpha + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k + \varepsilon$
- p is the probability that the event Y occurs
- α and ε can be folded into a single constant, and expressed as α.
- α is used, and ε is omitted.
- The “logistic” distribution is an S-shaped distribution function. The logit distribution constrains the estimated probabilities (p) to lie between 0 and 1.
- the logistic regression model is expressed as $Y = \alpha + \sum_{i} \beta_i X_i$
- Y is a value (e.g., an effectiveness score) indicating whether the set of test expression levels for a given subject should classify with the set of responder levels, as opposed to the set of non-responder levels.
- the probability that the set of test levels classifies with the set of responder levels, as opposed to the set of non-responder levels, can be derived from Y. The higher the score, the higher the probability that the set of test levels classifies with the set of responder levels.
- Xi is a level of mRNA transcribed from an ith gene of the set of genes in blood of the test subject, βi is a logistic regression equation coefficient for the ith gene, α is a logistic regression equation constant that can be zero, and βi and α are the result of applying logistic regression analysis to the set of responder levels and the set of non-responder levels.
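Applying a trained logistic regression classifier to a test subject can be sketched as follows: the score Y is the constant plus the coefficient-weighted sum of expression levels, and a probability is derived from Y with the logistic function. The gene names, coefficients, constant, and expression levels are placeholders chosen for illustration, not values from the tables of the disclosure.

```python
import math


def effectiveness_score(levels, coefficients, constant=0.0):
    """Y = constant + sum over genes of (coefficient * expression level)."""
    return constant + sum(coefficients[gene] * levels[gene] for gene in coefficients)


def response_probability(y):
    """Logistic function: maps the score Y to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-y))


# Placeholder classifier parameters and test-subject levels.
coefficients = {"GENE_A": 2.0, "GENE_B": -1.0}
constant = 0.5
levels = {"GENE_A": 1.0, "GENE_B": 2.5}

y = effectiveness_score(levels, coefficients, constant)  # 0.5 + 2.0 - 2.5
p = response_probability(y)
```

Because the logistic function is increasing, a higher Y always corresponds to a higher probability of classifying with the responders.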
- the logistic regression model is fit by maximum likelihood estimation (MLE).
- a likelihood is a conditional probability (e.g., P(Y
- the likelihood function (L) measures the probability of observing the particular set of dependent variable values (Y1, Y2, ... , Yn) that occur in the sample data set. In some embodiments, it is written as the product of the probability of observing Y1, Y2, ... , Yn: $L = P(Y_1) \times P(Y_2) \times \dots \times P(Y_n)$
- MLE involves finding the coefficients (α, β1, β2, ...) that make the log of the likelihood function (LL < 0) as large as possible or -2 times the log of the likelihood function (-2LL) as small as possible.
- some initial estimates of the parameters α, β1, β2, and so forth are made. Then, the likelihood of the data given these parameter estimates is computed. The parameter estimates are improved, and the likelihood of the data is recalculated.
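The iterative estimation described above can be sketched with plain gradient ascent on the log-likelihood. This is one simple way to improve the estimates step by step; statistical packages typically use other optimizers, and the learning rate, iteration count, and toy data here are assumptions for illustration.

```python
import math


def _probability(alpha, betas, row):
    z = alpha + sum(b * x for b, x in zip(betas, row))
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood(alpha, betas, X, y):
    """Log of the likelihood (LL); always negative for probabilities strictly between 0 and 1."""
    ll = 0.0
    for row, label in zip(X, y):
        p = _probability(alpha, betas, row)
        ll += math.log(p) if label == 1 else math.log(1.0 - p)
    return ll


def fit_logistic(X, y, learning_rate=0.1, iterations=500):
    """Start from zero estimates and repeatedly move them uphill on LL."""
    alpha = 0.0
    betas = [0.0] * len(X[0])
    for _ in range(iterations):
        grad_alpha = 0.0
        grad_betas = [0.0] * len(betas)
        for row, label in zip(X, y):
            error = label - _probability(alpha, betas, row)
            grad_alpha += error
            for i, x in enumerate(row):
                grad_betas[i] += error * x
        alpha += learning_rate * grad_alpha / len(X)
        betas = [b + learning_rate * g / len(X) for b, g in zip(betas, grad_betas)]
    return alpha, betas


# Toy data: one gene, overlapping responder (1) and non-responder (0) levels.
X = [[0.2], [0.5], [0.7], [0.9], [1.1], [1.4], [1.6], [2.0]]
y = [0, 0, 1, 0, 1, 0, 1, 1]

ll_initial = log_likelihood(0.0, [0.0], X, y)
alpha, betas = fit_logistic(X, y)
ll_fitted = log_likelihood(alpha, betas, X, y)
```

After fitting, `ll_fitted` is larger (closer to zero) than `ll_initial`, so -2LL is smaller, in line with the criterion stated above.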
- the classifier can be readily applied to a test subject to obtain Y.
- explanatory variables are standardized before fitting into the model.
- Standardized coefficients (or beta coefficients) are the estimates resulting from a regression analysis that have been standardized so that the variances of dependent and explanatory variables are 1. Therefore, standardized coefficients represent how many standard deviations a dependent variable will change, per standard deviation increase in the explanatory variable.
- in a simple linear regression with a single explanatory variable, the absolute value of the standardized coefficient equals the absolute value of the correlation coefficient. Standardization of the coefficient is usually performed to identify which of the explanatory variables have a greater effect on the dependent variable in a multiple regression analysis.
- variables are standardized before fitting into a logistic regression model.
- explanatory variables are standardized, and in some other embodiments, only dependent variables are standardized. Further, in some embodiments, both explanatory variables and dependent variables are standardized.
- the standardized regression coefficient equals the corresponding unstandardized coefficient multiplied by the ratio std(Xi)/std(Y), where “std” denotes standard deviation.
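The conversion above is a one-line calculation. The sketch below assumes sample standard deviations (the ratio is the same if population standard deviations are used for both variables); the function name and the toy numbers are illustrative only.

```python
import statistics

def standardized_coefficient(unstandardized, x_values, y_values):
    """Multiply the raw coefficient by std(X_i) / std(Y)."""
    return unstandardized * statistics.stdev(x_values) / statistics.stdev(y_values)

# Toy data in which Y is exactly 2 * X, so the unstandardized slope is 2.
x_values = [1.0, 2.0, 3.0, 4.0]
y_values = [2.0, 4.0, 6.0, 8.0]
beta_std = standardized_coefficient(2.0, x_values, y_values)
```

In this single-variable toy case the standardized coefficient comes out at 1, matching the perfect correlation between the two variables.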
- classifiers useful to determine whether a subject responds to an immunotherapy.
- classifiers including, e.g., clustering, principal component analysis, nearest neighbor classifier analysis, linear discriminant analysis, neural networks, and support vector machines.
- Rounding and Range
- Rounding refers to a mathematical operation that replaces a value by another value that is approximately equal but has a shorter, simpler, or more explicit representation.
- the most common type of rounding is to round to an integer; or, more generally, to an integer multiple of some increment, for example, tenths, hundredths, or five tenths.
- the increment m depends on the magnitude of the number to be rounded (or of the rounded result).
- the increment m is normally a finite fraction in a number system that is used to represent the numbers. For example, in the decimal number system, m is an integer times a power of 10, such as 1×10^-3 or 25×10^-2.
- the experimentally-derived value provided in the examples and tables of the present disclosure for each coefficient or constant has n significant digits after the decimal point. Each value can be rounded to n-1 or n-2 or n-3 significant digits.
- a number shown with n significant digits after the decimal point is intended to provide literal support for the same number that is rounded to a number with fewer significant digits after the decimal point (e.g., n-1, n-2, n-3).
- the number “-0.7709” (with four significant digits after the decimal point) is intended to provide full literal support for expressing the same number as -0.771, -0.77, -0.8, or -1.
- the number “0.1132” is intended to provide full literal support for expressing the same number as 0.113, 0.11, 0.1, or 0.
- the present disclosure thus includes not only each of the precise values explicitly set out in the tables and examples, and each of the corresponding rounded-off values as discussed above, but also a range of values around each of those explicitly disclosed values or rounded-off values, where the range can be any of +/- 50%, +/- 30%, +/- 25%, +/- 20%, +/- 10%, or +/- 5%.
- a coefficient listed in a table as “-0.2932” could have any of the following ranges:
- -0.2932 +/- 5% (corresponding to a range of -0.3079 to -0.2785), and each of its rounded-off values (-0.293, -0.29, and -0.3) could also have a range of any of +/- 50%, +/- 30%, +/- 25%, +/- 20%, +/- 10%, or +/- 5%.
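The rounding and range arithmetic described above can be reproduced with Python's `decimal` module. This is a sketch under one stated assumption: ties are rounded half up, since the source does not specify a tie-breaking rule. The helper names are invented for the example.

```python
from decimal import Decimal, ROUND_HALF_UP

def rounded_forms(value_text):
    """Return the value rounded to n-1, n-2, ..., 0 digits after the decimal point."""
    value = Decimal(value_text)
    places = -value.as_tuple().exponent  # digits after the decimal point
    forms = []
    for digits in range(places - 1, -1, -1):
        quantum = Decimal(1).scaleb(-digits)  # 0.001, 0.01, 0.1, 1 for digits = 3..0
        forms.append(value.quantize(quantum, rounding=ROUND_HALF_UP))
    return forms

def percent_range(value_text, percent):
    """Return (low, high) for value +/- percent of its magnitude."""
    value = Decimal(value_text)
    delta = abs(value) * Decimal(percent) / Decimal(100)
    return value - delta, value + delta

forms = [str(f) for f in rounded_forms("-0.7709")]  # ['-0.771', '-0.77', '-0.8', '-1']
low, high = percent_range("-0.2932", 5)             # -0.307860 and -0.278540 before rounding
```

Working from the text form of each number keeps trailing digits exact, which matters here because the disclosure counts digits after the decimal point.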
- each coefficient or constant in each model can be increased or decreased by an appropriate amount and still remain useful in the present methods
- the value for each coefficient and constant listed in any of the tables also explicitly constitutes a disclosure for a value that is reasonably close to the explicitly disclosed value.
- a constant listed in a table as “+13.5029” is deemed to be a disclosure not only of +13.5029 per se (and a disclosure of rounded-off versions of that number, including +13.503, +13.50, +13.5, and +14), but also a disclosure of “about +13.5029”, “about +13.503”,
- RNAs transcribed from a set of genes can be determined by using a kit and following instructions that would typically be provided with the kit.
- a kit can include materials and reagents required for obtaining an appropriate blood sample from a subject, or for measuring the levels of particular transcribed RNAs.
- a kit includes primers appropriate for the transcribed RNAs.
- a kit is designed to determine the amounts of particular proteins present in a sample.
- the amount of a protein can be determined by any techniques that are known in the art, for example, protein mass spectrometry and enzyme-linked immunosorbent assay (ELISA).
- the kit includes materials, reagents, and instructions required for measuring the amount of protein products of a particular set of genes, for example, an antibody or antibody fragment that targets each protein product of interest.
- a kit may further include one or more reagents for various purposes, such as: (1) reagents for purifying RNA from blood; (2) primers for transcribed mRNA; (3) dNTPs and/or rNTPs (either premixed or separate), optionally with one or more uniquely labeled dNTPs and/or rNTPs (e.g., biotinylated or Cy3 or Cy5 tagged dNTPs); (4) post-synthesis labeling reagents, such as chemically active derivatives of fluorescent dyes; (5) enzymes, such as reverse transcriptases, DNA polymerases, and the like; (6) various buffer mediums, e.g., hybridization and washing buffers; (7) labeled probe purification reagents and components, e.g., spin columns; (8) protein purification reagents; and/or (9) signal generation and detection reagents, e.g., streptavidin-alkaline phosphatase
- kits are Quantitative PCR (QPCR) kits, with reagents and instructions suitable for QPCR.
- the kits are nucleic acid arrays or protein arrays and instructions for using the arrays.
- kits for measuring an RNA product of a gene include materials and reagents that are necessary for measuring the expression of the RNA product.
- a microarray or a QPCR kit may contain only those reagents and materials that are necessary for measuring the levels of RNA products of a set of genes that are disclosed in the present disclosure.
- the kits can include materials and reagents for RNA products that are not discussed in the present disclosure.
- the kit will be packaged with instructions or will indicate a website where instructions can be accessed.
- the kits generally include probes attached or localized to a support surface.
- the probes may be labeled with a detectable label.
- the probes are specific for the 5’ region, the 3’ region, the internal coding region, an exon(s), an intron(s), an exon junction(s), or an exon-intron junction(s), of an RNA product(s).
- the microarray kits may include instructions for performing the assay and methods for interpreting and analyzing the data resulting from the performance of the assay.
- the kits may also include hybridization reagents and/or reagents necessary for detecting a signal when a probe hybridizes to a target nucleic acid sequence.
- the materials and reagents for the microarray kits are in one or more packages.
- the kits generally include pre-selected primers specific for RNA products (e.g., an exon(s), an intron(s), an exon junction(s), and an exon-intron junction(s)).
- the QPCR kits may also include enzymes suitable for reverse transcribing and/or amplifying nucleic acids (e.g., polymerases such as Taq, reverse transcriptase etc.), and deoxynucleotides and buffers needed for the reaction mixture for reverse transcription and amplification.
- the probes may or may not be labeled with a detectable label (e.g., a fluorescent label). In some embodiments, when contemplating multiplexing, the probes are labeled with a different detectable label (e.g.
- kits may include different containers suitable for each individual reagent, enzyme, primer and probe.
- the QPCR kits may include instructions for performing the assay and methods for interpreting and analyzing the data resulting from the performance of the assay. The instructions for analyzing the data will typically be provided on a machine-readable medium programmed in accordance with the presently disclosed analytical methods.
- the kit can include, for example: (1) a first antibody (which may or may not be attached to a support) which binds to protein of interest (e.g., protein products of a set of genes); and, optionally, (2) a second, different antibody which binds to either the protein, or the first antibody and is conjugated to a detectable label (e.g., a fluorescent label, a radioactive isotope or an enzyme).
- the antibody-based kits may also include beads for conducting an immunoprecipitation. Each component of the antibody-based kits is generally in its own suitable package. Thus, these kits generally include different packages suitable for each antibody.
- the antibody-based kits may include instructions for performing the assay and methods for interpreting and analyzing the data resulting from the performance of the assay.
- the instructions for analyzing the data will typically be provided on a machine-readable medium programmed in accordance with the presently disclosed analytical methods.
- the therapy targets any one of CD52, CTLA4, CD20, programmed cell death 1 (PD-1) receptor, or programmed death-ligand 1 (PD-L1; also known as CD274).
- the immunotherapy treatment could be an
- an antibody immunotherapy e.g., alemtuzumab, ipilimumab, ofatumumab, nivolumab, pembrolizumab, or rituximab.
- alemtuzumab targets CD52
- ipilimumab targets CTLA4, ofatumumab and rituximab target CD20
- nivolumab and pembrolizumab target programmed cell death 1 (PD-1) receptor
- the antibody immunotherapy is an anti-CTLA4 antibody, for example, ipilimumab (Yervoy®) or tremelimumab.
- a subject for an immunotherapy can be used to determine the treatment regimen for the subject.
- effectiveness score e.g., the Y value in a logistic regression
- the subject will be treated with the immunotherapy.
- the effectiveness score is less than a reference threshold, the subject will not be treated with the immunotherapy, or the immunotherapy treatment will be discontinued if it has already begun.
- the appropriate reference threshold for each classifier can be different, and can be optimized for various statistical measures (e.g., sensitivity, specificity, percentage of correct predictions).
- the reference threshold is determined by experiments or in a clinical trial.
- the probability of an immunotherapy response for a subject is derived from the effectiveness score.
- a pre-determined threshold for example, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9
- the subject will be treated with the immunotherapy.
- the probability of an immunotherapy response is less than a pre-determined threshold, for example, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, the subject will not be treated with the immunotherapy, or the immunotherapy treatment will be discontinued if it has already begun.
- the pre-determined threshold for treatment is 0.4, 0.5, or 0.6
- the pre-determined threshold for not treating, or for discontinuation of treatment is 0.4, 0.5, or 0.6
- the pre-determined threshold is determined by experiments or in a clinical trial.
- the sensitivity and the specificity of the method depend on the reference threshold (or the cut-off point). When the reference threshold is raised, the sensitivity will decrease, but the specificity will increase. In some embodiments, the reference threshold can be optimized for the sensitivity, the specificity, or the percentage of correct predictions.
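The trade-off between sensitivity and specificity can be made concrete with a short sketch. It assumes that the probability of response is obtained from the effectiveness score Y through the standard logistic transform p = 1 / (1 + e^(-Y)); the source says only that the probability is derived from the score. It also assumes that a probability equal to the threshold counts as "treat". The scores and labels are made up.

```python
import math

def response_probability(score):
    """Assumed logistic transform of the effectiveness score Y."""
    return 1.0 / (1.0 + math.exp(-score))

def sensitivity_specificity(probabilities, labels, threshold):
    """labels: 1 = responder, 0 = non-responder."""
    tp = fn = tn = fp = 0
    for p, label in zip(probabilities, labels):
        predicted_responder = p >= threshold
        if label == 1:
            tp += predicted_responder
            fn += not predicted_responder
        else:
            fp += predicted_responder
            tn += not predicted_responder
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical effectiveness scores for eight subjects.
scores = [-2.0, -0.5, 0.3, 1.2, -1.0, 0.8, 2.5, -0.2]
labels = [0, 0, 1, 1, 0, 0, 1, 1]
probabilities = [response_probability(s) for s in scores]

results = {t: sensitivity_specificity(probabilities, labels, t)
           for t in (0.4, 0.5, 0.6, 0.7)}
```

On this toy data, raising the threshold from 0.4 to 0.7 lowers sensitivity from 1.0 to 0.5 while specificity rises from 0.75 to 1.0.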
- EXAMPLE 1 Melanoma Patient Population
- NCBI National Center for Biotechnology Information
- GEO Gene Expression Omnibus
- a measurable disease includes at least one lesion that meets the following criteria: a sole lesion that can be accurately measured in at least one dimension, lesion on a CT scan has a longest diameter ≥ 2.0 cm using conventional techniques or ≥ 1.0 cm with spiral CT scan.
- a skin lesion has longest diameter at least 1.0 cm, clinically detected lesions are superficial (e.g., skin nodules), and the longest diameter is ≥ 2.0 cm, palpable lymph nodes ≥ 2.0 cm should be demonstrable by CT scan. If the measurable disease is restricted to a solitary lesion, its neoplastic nature is confirmed by cytology or histology.
- Tumor lesions that are situated in a previously irradiated area will be considered measurable if progression is documented following completion of radiation therapy.
- Non-measurable disease includes lesions that fail to meet the above criteria for measurability.
- Adequate bone marrow, hepatic, and renal function determined within 14 days prior to randomization defined as: a) Absolute neutrophil count ≥ 1.5 × 10^9 cells/L; b) Platelets ≥ 100 × 10^9/L; c) Hemoglobin ≥ 10 g/dL; d) Aspartate and alanine aminotransferases (AST, ALT) ≤ 2.5 × Upper Limit of Normal (ULN), or ≤ 5 × ULN, if documented liver metastases are present; e) Total serum bilirubin ≤ 1.5 × ULN (except patients with documented Gilbert's syndrome); and f) Serum creatinine ≤ 2.0 mg/dL or calculated creatinine clearance ≥ 60 mL/min;
- Females of childbearing potential have had a negative serum or urine pregnancy test within 14 days prior to randomization; females who underwent surgical sterilization or who were postmenopausal for at least 2 years were not considered to be of childbearing potential;
- corticosteroids or other immunosuppressive medication e.g., methotrexate, rapamycin
- Tumor responses were assessed every 90 days (one cycle) in patients treated with tremelimumab, every 42 days (two cycles) in patients treated with DTIC, and every 56 days (two cycles) in patients treated with temozolomide. In both study arms, there was a planned assessment of tumor response at 6 months to determine PFS rate at this time point. Tumor data assessed by investigators were reviewed by the sponsor to ensure compliance with RECIST criteria. Patients were evaluated for toxicity at every scheduled visit, and any toxicities were assessed according to the National Cancer Institute Common Terminology Criteria for Adverse Events, version 3.0. A detailed description of this clinical trial can be found in, e.g., Ribas, Antoni, et al. "Phase III randomized clinical trial comparing
- Responders
- A total of 28 melanoma patients had lived beyond six months when an expert panel of radiologists reviewed their cases one year after the first blood draw, and were classified as responders. Two had died between six months and the time of the radiologist report.
- Received prior treatment including at least one systemic therapy for treatment of
- metastatic disease premature systemic regimen for the treatment of metastatic melanoma included IL-2, dacarbazine and/or temozolomide or interferon-α; patients have received at least one cycle at full dose;
- AST, ALT aminotransferases
- Chronic autoimmune disease e.g., Addison's disease, multiple sclerosis, Graves disease, Hashimoto's thyroiditis, inflammatory bowel disease, psoriasis, rheumatoid arthritis, systemic lupus erythematosus, hypophysitis, etc.; active vitiligo or a history of vitiligo will not be a basis for exclusion;
- Brain metastases radiological documentation of absence of brain metastases at screening was required for patients (note that a history of treated brain metastases was acceptable); 13) History of other malignancies, except for adequately treated basal cell carcinoma or squamous cell skin cancer or carcinoma of cervix, unless the patient was disease-free for at least 5 years; and
- Pregnancy or breast-feeding female patients are surgically sterile or are postmenopausal for two years, or have agreed to use effective contraception during the period of treatment and 12 months after. Female patients with reproductive potential have had a negative pregnancy test (serum/urine) within 72 hours prior to enrollment.
- An anti-CTLA4 treatment (tremelimumab) was administered intravenously at a dose of 15 mg/kg every 90 days in patients with previously treated advanced melanoma. Patients were allowed to receive up to 4 doses of tremelimumab in a 12-month period. Tumor data were reviewed under the RECIST guidelines. A total of 150 patients in the 1008 patient population, including 43 responders and 107 non-responders, provided blood samples collected both prior to treatment (i.e., “pre-treatment”) and 30 days following the start of treatment (i.e., “post-treatment”).
- EXAMPLE 2 Sample Preparation
- RNA samples were mixed with RNase inhibitor, MultiScribe™ reverse transcriptase and reaction reagents (including reverse transcriptase buffer, dNTPs, and random hexamers, etc.). Samples were incubated at room temperature for 10 minutes, and then at 37 °C for 1 hour. Samples were further incubated at 90 °C for 10 minutes. Samples were then spun in a microcentrifuge. PCR quality control was run on samples using 18S RNA and β-actin mRNA.
- Quantitative PCR on the Applied Biosystems Prism® Instrument
- Primer/Probe Mix for each gene of interest and for the 18S endogenous control were prepared. TaqMan™ Universal PCR Master Mix was added to the Primer/Probe Mix. cDNA was diluted. The amount of cDNA was adjusted to give Ct values between 10 and 18, typically between 12 and 16. Primer/Probe mix and cDNA stock solution were pipetted into the appropriate wells of an Applied Biosystems 384-Well Optical Reaction Plate. The plate was sealed with Applied Biosystems Optical Caps, or optical-clear film, and then analyzed on the ABI Prism® 7900 Sequence Detector.
- Results
- Quantitative PCR was performed on the ABI Prism® 7900 Sequence Detector system to determine the amount of RNA corresponding to specific genes in these samples.
- target gene measurements may be beyond the detection limit of the particular platform instrument used to detect and quantify constituents of a target gene.
- the detection limit was reset and the “undetermined” constituents were “flagged.”
- target gene FAM measurements that were beyond the detection limit of the instrument (>40 cycles) were reported as “undetermined.” Detection Limit Reset was performed when at least 1 of 3 target gene FAM CT replicates was not detected after 40 cycles.
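The reporting rule above can be expressed as a small helper. This sketch covers only the flagging logic; what the Detection Limit Reset does to the values afterwards is not described here, so it is not modeled. Representing a missing reading as `None` and the function name are assumptions.

```python
MAX_CYCLES = 40  # detection limit stated in the text

def classify_replicates(ct_values):
    """Report each replicate, and say whether a detection limit reset is triggered.

    ct_values: three Ct readings; None stands for "no signal detected".
    """
    reported = [
        "undetermined" if ct is None or ct > MAX_CYCLES else ct
        for ct in ct_values
    ]
    # A reset is triggered when at least one of the replicates was not detected.
    needs_reset = any(r == "undetermined" for r in reported)
    return reported, needs_reset

reported, needs_reset = classify_replicates([31.2, None, 30.9])
clean_reported, clean_reset = classify_replicates([25.0, 25.1, 24.9])
```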
- Samples were typically run on a 384 well PCR plate in replicates of three wells for each target gene (assay). A sample was divided into aliquots. For each aliquot, the concentration of each constituent target gene was measured in a separate well of the 384 well plate. With each assay conducted in triplicate, an average coefficient of variation (in accordance with (standard deviation/average)*100) of less than 2 percent was found among the normalized ΔCt measurements for each assay. In this embodiment, normalized quantitation of the target mRNA was determined by the difference in threshold cycles between the internal control (e.g., an endogenous marker such as 18S rRNA, or an exogenous marker) and the gene of interest.
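The normalization and the coefficient of variation can be sketched as follows. Several choices here are assumptions rather than details from the source: ΔCt is taken as target Ct minus control Ct (the text states only that it is the difference between the two), control and target readings are paired well by well, and the sample standard deviation is used. The Ct values are invented.

```python
import statistics

def delta_ct(target_ct, control_ct):
    """Normalized value; sign convention (target minus control) is assumed."""
    return target_ct - control_ct

def coefficient_of_variation(values):
    """(standard deviation / average) * 100, using the sample standard deviation."""
    return statistics.stdev(values) / statistics.mean(values) * 100.0

# Hypothetical triplicate readings for one assay.
target_cts = [24.3, 24.1, 24.4]
control_cts = [14.1, 14.0, 14.2]  # e.g., 18S rRNA internal control

delta_cts = [delta_ct(t, c) for t, c in zip(target_cts, control_cts)]
cv_percent = coefficient_of_variation(delta_cts)
```

For these made-up readings the coefficient of variation is well under the 2 percent figure quoted above.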
- the internal control e.g., an endogenous marker such as 18S rRNA, or an exogenous marker
- the gene set for each Core model includes genes that are selected from one or more of the following combinations (a)–(f):
- LARGE Like-Glycosyltransferase
- BAX BCL2-Associated X Protein
- LARGE and at least one gene selected from ADAM17, CCR3, CD86, HLADRA, IL23A, IL2RA, LTA, MIF, MYC, RHOC, S100A4, TNF, and TP53,
- the genes are selected from one or more of the combinations (a)–(f) described above, one or more additional genes listed in Group A and/or Group B can be included in the gene set for model construction.
- the levels of transcribed mRNA corresponding to the genes in the gene set were used as explanatory variables in logistic regression to construct a Core model.
- the Core model was then applied to the “1009” dataset to create each Core classifier. Logistic regressions were first performed in the “1009” training dataset to determine the parameters for each Core classifier. Each resulting Core classifier was then tested in the “1008” validation dataset.
- Table 4 lists all two-gene Core classifiers that involve LARGE and one gene selected from Group A, providing coefficients, logistic regression equation constant, and two AUCs for each classifier.
- the AUC for these two-gene classifiers in the training dataset is greater than 0.60.
- Each classifier was then tested in the “1008” validation dataset.
- the AUC in the “1008” validation dataset is also greater than 0.60.
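The two AUCs reported for each classifier (training and validation) can be computed directly from effectiveness scores and known outcomes. The sketch below uses the rank-based definition of AUC: the fraction of responder/non-responder pairs in which the responder has the higher score, with ties counted as one half. The scores and labels are invented and do not come from the “1009” or “1008” datasets.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (1 = responder)."""
    responders = [s for s, l in zip(scores, labels) if l == 1]
    non_responders = [s for s, l in zip(scores, labels) if l == 0]
    wins = 0.0
    for r in responders:
        for n in non_responders:
            if r > n:
                wins += 1.0
            elif r == n:
                wins += 0.5
    return wins / (len(responders) * len(non_responders))

# Hypothetical effectiveness scores produced by one fitted classifier.
training_scores = [-2.0, -0.5, 0.3, 1.2, -1.0, 0.8, 2.5, -0.2]
training_labels = [0, 0, 1, 1, 0, 0, 1, 1]
validation_scores = [0.9, -0.4, 0.1, -1.5, 0.6]
validation_labels = [1, 1, 0, 0, 0]

training_auc = auc(training_scores, training_labels)        # 14 of 16 pairs
validation_auc = auc(validation_scores, validation_labels)  # 4 of 6 pairs
passes_both = training_auc > 0.60 and validation_auc > 0.60
```

The coefficients are fixed after training, so the validation AUC is computed on scores from the same equation applied to subjects the fit never saw.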
- Table 4 also provides standardized coefficients for each gene.
- Table 5 lists several Core classifiers that involve LARGE and at least two genes from Group A, providing coefficients, logistic regression equation constant, and two AUCs for each of the classifiers. Table 5 also provides standardized coefficients for each gene.
- Table 6 lists all two-gene Core classifiers that involve BAX and one gene selected from Group A, providing coefficients, logistic regression equation constant, and two AUCs for each classifier. The AUC for these classifiers in the training dataset is greater than 0.60. Each classifier was then validated in the “1008” validation dataset. Table 6 also provides standardized coefficients for each gene.
- Table 7 lists several Core classifiers that include BAX and at least two genes from Group A, providing coefficients, logistic regression equation constant, two AUCs, and standardized coefficients for each classifier.
- Table 8 lists all two-gene Core classifiers in which both genes are selected from Group A, providing coefficients, logistic regression equation constant, two AUCs, and standardized coefficients for each classifier.
- the AUC for these two-gene classifiers in the training dataset is greater than 0.60.
- Each classifier was then validated in the “1008” validation dataset.
- Table 9 lists several Core classifiers that involve three or more genes selected from Group A, providing coefficients, logistic regression equation constant, two AUCs and standardized coefficients for each classifier.
- Table 10 lists all two-gene Core classifiers that involve LARGE and one gene selected from ADAM17, CCR3, CD86, HLADRA, IL23A, IL2RA, LTA, MIF, MYC, RHOC, S100A4, TNF, and TP53, providing coefficients, logistic regression equation constant, two AUCs, and standardized coefficients for each classifier.
- the AUC for each of these classifiers in the training dataset is greater than 0.60.
- Each classifier was then tested in the “1008” validation dataset.
- the AUCs in the “1008” validation dataset are also greater than 0.60.
- Table 11 lists several Core classifiers that involve LARGE and at least two genes from Group B, providing coefficients, logistic regression equation constant, two AUCs, and standardized coefficients for each classifier.
- Table 12 lists all two-gene Core classifiers that involve BAX and one gene selected from ADAM17, CCR3, CD86, HLADRA, IL23A, IL2RA, LTA, MIF, MYC, RHOC, S100A4, TNF, and TP53, providing coefficients, logistic regression equation constant, two AUCs, and standardized coefficients for each classifier.
- the AUC for each of these classifiers in the training dataset is greater than 0.60.
- Each classifier was then validated in the “1008” validation dataset.
- Table 13 lists several Core classifiers that involve BAX and at least two genes from Group B, providing coefficients, logistic regression equation constant, two AUCs and standardized coefficients for each classifier.
- Table 14 lists all two-gene Core classifiers that involve one gene from Group A and one gene from Group B, providing coefficients, logistic regression equation constant, two AUCs and standardized coefficients for each classifier.
- the AUC for each of these two-gene classifiers in the training dataset is greater than 0.60.
- Each classifier was then validated in the “1008” validation dataset.
- Table 15 lists several Core classifiers that involve at least three genes from Groups A and B, with at least one of those genes being from Group A and at least one from Group B. The Table also provides coefficients, logistic regression equation constant, two AUCs and standardized coefficients for each classifier.
- Table 16 lists several Core classifiers that involve LARGE, at least one gene from Group A, and at least one gene from Group B, providing coefficients, logistic regression equation constant, two AUCs, and standardized coefficients for each classifier.
- Table 17 lists examples of classifiers that involve BAX, one gene selected from Group A, and at least one gene selected from Group B.
- Classifier Nos.1-2 include only BAX and Group A/B genes (so are Core classifiers), while Classifier Nos.3-5 also include genes from other Groups.
- the Table also provides coefficients, logistic regression equation constant, two AUCs and standardized coefficients for the classifiers.
- EXAMPLE 4 Core + Group C classifiers
- Group C has twenty-two genes. Any gene or any combination of genes in Group C can be added to the set of genes used in a Core model. Therefore, a Core + Group C model can have 3 to 48 genes.
- the levels of transcribed mRNA corresponding to the genes in the gene set were used as explanatory variables in logistic regression to construct a Core + Group C model.
- the Core + Group C model was then applied to the “1009” dataset to create a Core + Group C classifier.
- Logistic regressions were first performed in the “1009” training dataset to determine the parameters for the classifier.
- the resulting Core + Group C classifier was then tested in the “1008” validation dataset.
- Table 18 lists several Core + Group C classifiers, providing coefficients, logistic regression equation constant, two AUCs, and standardized coefficients for each classifier.
- Table 19 lists several Core + Group C classifiers that also include LARGE, providing coefficients, logistic regression equation constant, two AUCs, and standardized coefficients for each classifier.
- EXAMPLE 5 Core + Group CD classifiers
- One or more Group D genes were added to Core + Group C models that have at least three genes.
- the quotient of the total number of genes that are selected from Group D divided by the total number of genes that are selected from the group of genes consisting of BAX, LARGE and genes in Groups A-D is 0.34 or less.
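The composition constraint above is a simple ratio check. In the sketch below, only BAX and LARGE are real gene names taken from the text; the identifiers such as "A1" or "D2" are hypothetical placeholders standing in for members of Groups A to D, whose membership is not listed here.

```python
def group_quotient(selected_genes, numerator_group, denominator_pool):
    """Genes from numerator_group divided by genes from denominator_pool."""
    counted = [g for g in selected_genes if g in denominator_pool]
    if not counted:
        raise ValueError("no selected gene belongs to the denominator pool")
    in_numerator = [g for g in counted if g in numerator_group]
    return len(in_numerator) / len(counted)

# Placeholder group membership (hypothetical names, for illustration only).
GROUP_D = {"D1", "D2", "D3"}
POOL = {"BAX", "LARGE", "A1", "A2", "B1", "C1", "C2"} | GROUP_D

acceptable = ["BAX", "A1", "B1", "C1", "D1", "D2"]       # 2 of 6 from Group D
too_many_d = ["BAX", "A1", "B1", "C1", "D1", "D2", "D3"]  # 3 of 7 from Group D

acceptable_ok = group_quotient(acceptable, GROUP_D, POOL) <= 0.34
too_many_ok = group_quotient(too_many_d, GROUP_D, POOL) <= 0.34
```

The same helper could be reused for the 0.25 limit on Group E1-E5 genes by passing a different numerator group and pool.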
- the levels of transcribed mRNA corresponding to the genes in the gene set were then used as explanatory variables in logistic regression to construct a Core + Group CD model.
- the Core + Group CD model was then applied to the “1009” dataset to create a Core + Group CD classifier. Logistic regressions were first performed in the “1009” training dataset to determine the parameters for the classifier. The resulting Core + Group CD classifier was then tested in the “1008” validation dataset.
- Table 20 lists several Core + Group CD classifiers, providing coefficients, logistic regression equation constant, two AUCs, and standardized coefficients for each classifier.
- Table 17 lists one Core + Group CD classifier that involves BAX. See Classifier No. 3.
- EXAMPLE 6 Core + Group CE classifiers
- Core + Group CE models have at least six genes that are selected from Groups A-C, including at least one gene from each of Groups A-C, plus any one or more of the following combinations of Group E1-E5 genes:
- the quotient of the total number of genes selected from Groups E1-E5, divided by the total number of genes selected from all of Groups A, B, C, E1, E2, E3, E4, and E5, may be 0.25 or less.
- the levels of transcribed mRNA corresponding to the genes in the gene set were used as explanatory variables in logistic regression to construct a Core + Group CE model.
- the Core + Group CE model was then applied to the “1009” dataset to create a Core + Group CE classifier. Logistic regressions were first performed in the “1009” training dataset to determine the parameters for the classifier. The resulting Core + Group CE classifier was then tested in the “1008” validation dataset.
- Table 21 lists examples of Core + Group CE classifiers, providing coefficients, logistic regression equation constant, two AUCs, and standardized coefficients for each classifier.
- EXAMPLE 7 Core + Group CDE classifiers
- a Core + Group CDE model has at least six genes selected from Groups A-D, including at least one gene from each of Groups A-D; and also any one or more of the following alternatives a)–e):
- the quotient of the total number of genes that are selected from Groups E1-E5, divided by the total number of genes that are selected from Groups A, B, C, D, E1, E2, E3, E4, and E5, may be 0.25 or less.
- the levels of transcribed mRNA corresponding to the genes in the gene set were used as explanatory variables in logistic regression to construct a Core + Group CDE model.
- the Core + Group CDE model was then applied to the “1009” dataset to create a Core + Group CDE classifier. Logistic regressions were first performed in the “1009” training dataset to determine the parameters for the classifier. The resulting Core + Group CDE classifier was then tested in the “1008” validation dataset.
- Table 22 lists several representative Core + Group CDE classifiers, providing coefficients, logistic regression equation constants, two AUCs, and standardized coefficients for each classifier.
- EXAMPLE 8 Core + Group F2 classifiers
- a Core + Group F2 model may involve 3 to 32 genes.
- the levels of transcribed mRNA corresponding to the genes in the gene set were used as explanatory variables in logistic regression to construct a Core + Group F2 model.
- the Core + Group F2 model is used to determine the effectiveness score of an immunotherapy response in a subject who is chemotherapy treatment naive (such as the subjects in the “1009” dataset).
- the Core + Group F2 model was then applied to the “1009” dataset to create a Core + Group F2 classifier. Logistic regressions were performed in the “1009” dataset to determine the parameters for the classifier.
- Table 23 lists examples of Core + Group F2 classifiers, providing coefficients, logistic regression equation constant, and the AUC for each classifier.
- EXAMPLE 9 Core + Group CF2 classifiers
- Any gene or any combination of genes in Group F2 can be added to the gene set used in a Core + Group C model.
- the levels of transcribed mRNA corresponding to the genes in the gene set were used as explanatory variables in logistic regression to construct a Core + Group CF2 model.
- the Core + Group CF2 model is used to determine the effectiveness score of an immunotherapy response in a subject who is chemotherapy treatment naive (such as the subjects in the “1009” dataset).
- the Core + Group CF2 model was then applied to the “1009” dataset to create a Core + Group CF2 classifier. Logistic regressions were performed in the “1009” dataset to determine the parameters for the classifier.
- Table 24 lists examples of Core + Group CF2 classifiers, including coefficients, logistic regression equation constant, AUC, and standardized coefficients for each classifier.
- EXAMPLE 10 Group G classifiers
- the gene set for a Group G model includes genes selected from one or more of the following combinations (a)–(d):
- a Group G model further included one or more genes selected from Group C.
- the levels of transcribed mRNA corresponding to the genes in the gene set were used as explanatory variables in logistic regression to construct a Group G model.
- the Group G model is used to determine the effectiveness score of an immunotherapy in a subject who is chemotherapy treatment naive (such as the subjects in the “1009” dataset).
- the Group G model was then applied to the “1009” dataset to create a Group G classifier. Logistic regressions were performed in the “1009” dataset to determine the parameters for the classifier.
- Table 25 lists examples of Group G classifiers, providing coefficients, logistic regression equation constant, AUC, and standardized coefficients for each classifier.
- EXAMPLE 11 Group J/K classifiers
- the Group J/K model gene set includes genes that are selected from one or both of the following combinations (a) and (b):
- the levels of transcribed mRNA corresponding to the genes in the gene set were used as explanatory variables in logistic regression to construct a Group J/K model.
- the Group J/K model was then applied to the “1009” dataset to create a Group J/K classifier.
- Logistic regressions were first performed in the “1009” training dataset to determine the parameters for the classifier.
- the resulting Group J/K classifier with optimized coefficients was then tested in the “1008” validation dataset.
- Table 26 lists several Group J/K classifiers that have 2 genes from Group J, providing coefficients, logistic regression equation constant, two AUCs, and standardized coefficients for each classifier.
- Table 27 lists several Group J/K classifiers that involve three or more genes from Group J, providing coefficients, logistic regression equation constant, two AUCs, and standardized coefficients for each classifier.
- Table 28 lists several Group J/K classifiers that have at least one gene from Group J and at least one gene from Group K, providing coefficients, logistic regression equation constant, two AUCs, and standardized coefficients for each classifier.
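The "two AUCs" reported for each classifier are one AUC in the training dataset and one in the validation dataset. A minimal sketch of how such an AUC can be computed from effectiveness scores is given below; it uses the rank-based definition (the probability that a randomly chosen responder scores higher than a randomly chosen non-responder). The function name and the toy scores are assumptions for illustration only.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.
    labels: 1 = responder, 0 = non-responder. Ties count as 0.5."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute an AUC")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical scores: the same fixed classifier applied to two datasets.
train_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])       # training set
validation_auc = auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0])  # validation set
```

A validation AUC lower than the training AUC, as in this toy case, is the pattern the tables report for most classifiers.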
- EXAMPLE 12 Group J/K + L classifiers
- Any gene or any combination of genes in Group L can be added to the gene set used in a Group J/K model to build a Group J/K + L model.
- The levels of transcribed mRNA corresponding to the genes in the gene set were used as explanatory variables in logistic regression to construct a Group J/K + L model.
- The Group J/K + L model was then applied to the “1009” dataset to create a Group J/K + L classifier.
- Logistic regressions were first performed in the “1009” training dataset to determine the parameters for the classifier.
- The resulting Group J/K + L classifier was then tested in the “1008” validation dataset.
- Table 29 lists several Group J/K + L classifiers that involve LARGE, providing coefficients, logistic regression equation constant, two AUCs, and standardized coefficients for each classifier.
- Table 30 lists several Group J/K + L classifiers, providing coefficients, logistic regression equation constant, two AUCs, and standardized coefficients for each classifier.
- Table 31 lists one Group J/K + L classifier (Classifier No.1), providing coefficients, logistic regression equation constant, two AUCs, and standardized coefficients for that classifier.
- EXAMPLE 13 Group J/K + LM classifiers
- One or more Group M genes can be added to Group J/K + L models that have at least 3 genes, to arrive at the Group J/K + LM model gene set.
- The total number of genes that are selected from Group M divided by the total number of genes that are selected from the group of genes consisting of Groups J-M is 0.34 or less.
- The levels of transcribed mRNA corresponding to the genes in the gene set were used as explanatory variables in logistic regression to construct a Group J/K + LM model.
- The Group J/K + LM model was then applied to the “1009” dataset to create a Group J/K + LM classifier.
- Logistic regressions were first performed in the “1009” training dataset to determine the parameters for the classifier.
- The resulting Group J/K + LM classifier was then tested in the “1008” validation dataset.
- Table 32 lists several Group J/K + LM classifiers, providing coefficients, logistic regression equation constant, two AUCs, and standardized coefficients for each classifier.
- Table 31 lists one Group J/K + LM classifier that involves LARGE, providing coefficients, logistic regression equation constant, two AUCs, and standardized coefficients for that classifier. See Classifier No.2 of Table 31.
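The composition rule for a Group J/K + LM gene set (Group M genes may make up at most 0.34 of the genes drawn from Groups J-M) can be checked mechanically. The sketch below is illustrative only: the gene names and their group assignments are placeholders, not the actual group memberships, and it assumes each gene belongs to exactly one group.

```python
def group_fraction(gene_set, numerator_groups, denominator_groups, membership):
    """Fraction of genes from numerator_groups among genes from denominator_groups."""
    num = sum(1 for g in gene_set if membership.get(g) in numerator_groups)
    den = sum(1 for g in gene_set if membership.get(g) in denominator_groups)
    if den == 0:
        raise ValueError("gene set contains no genes from the denominator groups")
    return num / den

def satisfies_group_m_rule(gene_set, membership, limit=0.34):
    # Rule from the text: (# Group M genes) / (# genes from Groups J-M) <= 0.34.
    return group_fraction(gene_set, {"M"}, {"J", "K", "L", "M"}, membership) <= limit

# Placeholder membership table (hypothetical gene names and groups).
membership = {"J_GENE_1": "J", "K_GENE_1": "K", "L_GENE_1": "L",
              "M_GENE_1": "M", "M_GENE_2": "M"}

ok_set = ["J_GENE_1", "K_GENE_1", "L_GENE_1", "M_GENE_1"]               # 1/4 = 0.25
bad_set = ["J_GENE_1", "K_GENE_1", "L_GENE_1", "M_GENE_1", "M_GENE_2"]  # 2/5 = 0.40
```

The same helper covers the 0.25 limit stated for Groups N1-N5 in the later examples by changing the numerator groups, denominator groups, and limit.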
- EXAMPLE 14 Group J/K + LN classifiers
- A “Group J/K + LN” model has at least six genes that are selected from Groups J-L (including at least one gene from each of Groups J-L), plus any one or more of the following alternatives a)–e):
- The total number of genes that are selected from Groups N1-N5, divided by the total number of genes that are selected from Groups J, K, L, N1, N2, N3, N4, and N5, is 0.25 or less.
- The levels of transcribed mRNA corresponding to the genes in the gene set were used as explanatory variables in logistic regression to construct a Group J/K + LN model.
- The Group J/K + LN model was then applied to the “1009” dataset to create a Group J/K + LN classifier.
- Logistic regressions were first performed in the “1009” training dataset to determine the parameters for the classifier.
- The resulting Group J/K + LN classifier was then tested in the “1008” validation dataset.
- Table 33 lists several Group J/K + LN classifiers, providing coefficients, logistic regression equation constant, two AUCs, and standardized coefficients for each classifier.
- Table 31 lists one Group J/K + LN classifier that involves LARGE, providing coefficients, logistic regression equation constant, two AUCs, and standardized coefficients for that classifier. See Classifier No.3 of Table 31.
- EXAMPLE 15 Group J/K + LMN classifiers
- A “Group J/K + LMN” model has at least six genes that are selected from Groups J-M (including at least one gene from each of Groups J-M), plus any one or more of the following alternatives a)–e):
- The total number of genes that are selected from Groups N1-N5, divided by the total number of genes that are selected from Groups J, K, L, M, N1, N2, N3, N4, and N5, is 0.25 or less.
- The levels of transcribed mRNA corresponding to the genes in the gene set were used as explanatory variables in logistic regression to construct a Group J/K + LMN model.
- The Group J/K + LMN model was then applied to the “1009” dataset to create a Group J/K + LMN classifier.
- Logistic regressions were first performed in the “1009” training dataset to determine the parameters for the classifier.
- The resulting Group J/K + LMN classifier was then tested in the “1008” validation dataset.
- Table 34 lists several Group J/K + LMN classifiers, providing coefficients, logistic regression equation constant, two AUCs, and standardized coefficients for each classifier.
- Table 31 lists one Group J/K + LMN classifier involving LARGE, providing coefficients, logistic regression equation constant, two AUCs, and standardized coefficients for that classifier. See Classifier No.4 of Table 31.
- EXAMPLE 16 Group J/K + P2 classifiers
- Any gene or any combination of genes in Group P2 can be added to the gene set used in any Group J/K model.
- The levels of transcribed mRNA corresponding to the genes in the gene set were used as explanatory variables in logistic regression to construct a Group J/K + P2 model.
- The Group J/K + P2 model was then applied to the “1009” dataset to create a Group J/K + P2 classifier.
- Logistic regressions were performed in the “1009” dataset to determine the parameters for the classifier.
- The Group J/K + P2 classifier is used to determine the effectiveness score of an immunotherapy in a subject who is chemotherapy treatment naive (similar to the subjects in the “1009” dataset).
- Table 35 lists several Group J/K + P2 classifiers, providing coefficients, logistic regression equation constant, AUC, and standardized coefficients for each classifier.
- EXAMPLE 17 Group J/K + LP2 classifiers
- Any gene or any combination of genes in Group P2 can be added to the gene set used in a Group J/K + L model, to create a Group J/K + LP2 model.
- The levels of transcribed mRNA corresponding to the genes in the gene set were used as explanatory variables in logistic regression to construct a Group J/K + LP2 model.
- The Group J/K + LP2 model was then applied to the “1009” dataset to create a Group J/K + LP2 classifier. Logistic regressions were performed in the “1009” dataset to determine the parameters for the classifier.
- The Group J/K + LP2 model is used to determine the effectiveness score of an immunotherapy in a subject who is chemotherapy treatment naive (similar to the subjects in the “1009” dataset).
- Table 36 lists several Group J/K + LP2 classifiers, providing coefficients, logistic regression equation constant, AUC, and standardized coefficients for each classifier.
- EXAMPLE 18 Group Q classifiers
- Group Q includes LARGE, all genes in Group J, and all genes in Group P1.
- A Group Q model includes at least two genes selected from Group Q.
- The levels of transcribed mRNA corresponding to the genes in the gene set were used as explanatory variables in logistic regression to construct a Group Q model.
- The Group Q model was then applied to the “1009” dataset to create a Group Q classifier. Logistic regressions were performed in the “1009” dataset to determine the parameters for the classifier.
- the Group Q model is used to determine the effectiveness score of an
- Table 37 lists several Group Q classifiers, providing coefficients, logistic regression equation constant, AUC, and standardized coefficients for each classifier.
- EXAMPLE 19 Determining a subject’s effectiveness score for an immunotherapy using a pre-treatment classifier
- Table 38 provides a list of genes, their coefficients, a logistic regression equation constant, and two AUCs for one exemplary pre-treatment Core + CD classifier. (FIG. 3A shows the ROC curve for this particular classifier.) To illustrate how one of the pre-treatment classifiers would be used in practice to determine a given cancer patient’s effectiveness score, Table 38 also provides the RNA level (expressed as delta cycle thermal units) for each of the listed genes in a blood sample of one of the subjects included in the “1009” dataset (Patient ID 10931003), and the product of multiplying the listed coefficient for each gene by the delta cycle thermal units listed for that gene.
- RNA level expressed as delta cycle thermal units
- The effectiveness score (Y) produced with this classifier is -0.226047.
- A caregiver can compare the effectiveness score against a reference threshold. If the effectiveness score is greater than the reference threshold, the subject’s caregiver will recommend the subject be treated with the immunotherapy.
- FIG. 3B is a graph plotting the results of applying the above classifier to the pre-treatment RNA expression data of all subjects in the training dataset.
- The X axis of this graph represents correlated components. Correlated components can be calculated by various statistical tools, e.g., CORExpress® Correlated Component Regression Software.
- The Y axis of the graph represents the effectiveness score (Y value).
- Y pre-treatment effectiveness score
- Any patient with a pre-treatment Y value below -2.09 in this model would be predicted to be a non-responder.
- A patient with a pre-treatment Y value above -2.09 in this model would be predicted to be a responder, or at least would not be predicted to be a non-responder.
- As FIG. 3B shows, not all subjects with a pre-treatment Y value above -2.09 turned out to be actual responders.
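The scoring step in this example can be sketched as follows: the effectiveness score Y is the regression constant plus the sum of each gene's coefficient multiplied by its measured RNA level, and Y is then compared with the reference threshold (-2.09 for this model). The gene names, coefficients, and RNA levels in the sketch are hypothetical placeholders, not the Table 38 values, and the handling of a score exactly equal to the threshold is an assumption because the text does not specify it.

```python
def effectiveness_score(constant, coefficients, rna_levels):
    """Y = constant + sum(coefficient_g * RNA level_g) over the classifier's genes."""
    return constant + sum(coef * rna_levels[gene]
                          for gene, coef in coefficients.items())

def predict(y, threshold=-2.09):
    # Above the threshold: predicted responder (or at least not predicted
    # to be a non-responder). A score equal to the threshold is treated
    # as non-responder here, which is an assumption.
    return "predicted responder" if y > threshold else "predicted non-responder"

# Hypothetical two-gene classifier and measurements (placeholders).
constant = -1.0
coefficients = {"GENE_A": 0.5, "GENE_B": -0.25}
rna_levels = {"GENE_A": 2.0, "GENE_B": 4.0}

y_example = effectiveness_score(constant, coefficients, rna_levels)

# The score reported in the text for Patient ID 10931003:
patient_call = predict(-0.226047)
```

With the score reported for the example patient (-0.226047), the call is "predicted responder", since that value lies above -2.09.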
- EXAMPLE 20 Determining a subject’s effectiveness score for an immunotherapy using a post-treatment classifier
- Table 39 provides coefficients, logistic regression equation constant, and two AUCs for one exemplary post-treatment Group J/K + L classifier. (FIG. 4A shows the ROC curve for this classifier.)
- Table 38 provides the RNA level (expressed as delta cycle thermal units) for each of the listed genes in a post-treatment blood sample (i.e., a blood sample collected after treatment had begun) from the same “1009” dataset subject (Patient ID 10931003) described in Example 19 above, and provides the product of multiplying the listed coefficient for each gene by the delta cycle thermal units listed for that gene.
- A caregiver can compare the patient’s post-treatment effectiveness score against a reference threshold. If the effectiveness score is greater than the reference threshold, the subject’s caregiver will recommend the subject continue to be treated with the immunotherapy. If the effectiveness score is less than the reference threshold, the subject’s caregiver will recommend the subject’s treatment with the immunotherapy be terminated. The patient can then optionally be given some other appropriate medications or procedures.
- FIG. 4B is a graph plotting the results of applying the post-treatment classifier of Table 38 to the data of each subject in the “1009” dataset.
- The X axis of this graph represents correlated components.
- The Y axis of the graph represents the effectiveness score (Y value).
- Table 40 shows two exemplary pre-treatment classifiers, providing coefficients, logistic regression equation constant, two AUCs, and standardized coefficients. In these classifiers, the AUC in the training dataset is greater than 0.8, and the AUC for validation is around 0.6.
- EXAMPLE 22 Exemplary Combo classifier
- A Combo classifier is used to further refine the prediction for a patient who initially classified with responders when tested with a pre-treatment classifier.
- A Combo classifier utilizes two levels of classifiers, as follows. Initially, a pre-treatment classifier is used to predict whether a given patient should begin treatment with an immunotherapy. If a given patient classifies with responders (has a Y value above the reference threshold), the patient is treated with the immunotherapy and then a second or subsequent blood sample is obtained and data from that second/subsequent blood sample are analyzed with a Combo post-treatment classifier to predict whether that patient is likely eventually to respond, or to continue to respond, to the immunotherapy. This combination of classifiers, described in more detail below, is termed a “Combo classifier”.
- A pre-treatment blood sample is collected from the subject.
- The sample is processed as needed to permit quantification of gene expression levels of selected genes.
- A pre-treatment classifier is applied to the gene expression levels in the sample to determine the effectiveness score of the immunotherapy for the subject. If the effectiveness score is low (i.e., the subject is predicted to be a non-responder), the subject would be excluded from the immunotherapy treatment as unlikely to benefit from it. The subject’s caregiver would then determine what if any alternate treatment would be more likely to benefit the subject.
- A subject who is predicted to be likely to respond to the immunotherapy (or, put another way, is predicted as unlikely to be a non-responder) is treated with the immunotherapy.
- A second blood sample is collected from the subject about 30 days after the treatment has begun.
- A Combo post-treatment classifier is then used to determine the patient’s effectiveness score.
- The effectiveness score indicates the likelihood that the subject will eventually respond, or will continue to respond, to the ongoing immunotherapy.
- A series of blood samples can be taken at various points in time throughout the therapy, to assess on an ongoing basis (using an appropriate Combo post-treatment classifier) whether the subject is responding appropriately to the immunotherapy. If at any point during the therapy, the subject’s gene expression data classifies the subject with non-responders rather than with responders, the caregiver will presumably elect to adjust or end the immunotherapy.
- Table 41 lists some exemplary Combo post-treatment classifiers that can be used in the Combo classifier. Table 41 also provides coefficients, logistic regression equation constant, two AUCs and standardized coefficients for each classifier. These Combo post treatment classifiers with appropriate cut-off points were found to have at least 90% sensitivity and at least 50% specificity. The sensitivity of these classifiers is between 92.3% and 96.4% and the specificity is between 56% and 60%.
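The two-level Combo workflow described in this example can be summarized as a small decision function. This is a sketch under stated assumptions: the function name, the returned strings, and the threshold values used in the usage lines are illustrative and are not taken from Tables 40 or 41.

```python
def combo_decision(pre_score, pre_threshold, post_score=None, post_threshold=None):
    """Two-stage decision: a pre-treatment classifier gates entry to therapy,
    and a Combo post-treatment classifier is applied only to subjects who
    were not excluded at the first stage."""
    if pre_score <= pre_threshold:
        return "exclude from immunotherapy"
    if post_score is None:
        # Stage 1 passed but no follow-up sample has been scored yet.
        return "start immunotherapy and collect a follow-up sample"
    if post_threshold is None:
        raise ValueError("post_threshold is required when post_score is given")
    if post_score > post_threshold:
        return "continue immunotherapy"
    return "consider adjusting or ending immunotherapy"

# Illustrative thresholds only (assumed values).
stage1_only = combo_decision(pre_score=-0.2, pre_threshold=-2.09)
continuing = combo_decision(-0.2, -2.09, post_score=1.0, post_threshold=0.0)
stopping = combo_decision(-0.2, -2.09, post_score=-1.0, post_threshold=0.0)
excluded = combo_decision(-3.0, -2.09)
```

Because the second stage can be re-run on each later blood sample, the same function can be called repeatedly over the course of therapy with updated post-treatment scores.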
- Two Combo classifiers (described below) were tested as examples.
- A pre-treatment model was first applied to the training dataset (1009 dataset) to generate a pre-treatment classifier.
- A subset of subjects was created by excluding all subjects who were classified as non-responders.
- A Combo post treatment model was applied to post-treatment samples from that subset of subjects, creating a Combo post treatment classifier to predict which individuals of that subset would turn out to be responders.
- The same pre-treatment classifier and Combo post treatment classifier were applied to the validation dataset (1008 dataset) for validation.
- Combo Classifier 1 is an exemplary Combo classifier.
- Combo Classifier 1 involves one of the pre-treatment classifiers described in Table 40 (Classifier No.1 of Table 40) and one of the Combo post treatment classifiers described in Table 41 (Classifier No.2 of Table 41 with corresponding parameters).
- Table 42 shows a comparison of actual response to the immunotherapy and predicted response as determined by the pre-treatment classifier.
- Table 43 shows sensitivity and specificity of this pre-treatment classifier. In this example, 131 melanoma patients are classified as non-responders by the pre-treatment classifier, and in fact proved to be non-responders when given an anti-CTLA4 antibody treatment.
- Combo post treatment classifiers in Table 41 were then applied to the RNA expression data from post treatment samples taken from the subset of 79 subjects from the “1009” trial that would not have been excluded by the results of the pre-treatment classifier. Applying this Table 41 Combo post treatment classifier to that post-treatment data generates the results of Combo Classifier 1 shown in Table 44.
- Table 44 shows a comparison of actual response to the immunotherapy and predicted response as determined by Combo Classifier 1.
- Table 45 shows sensitivity and specificity of this Combo classifier.
- The Combo classifier identified 18 subjects that were not identified as non-responsive by the pre-treatment classifier, but were identified as non-responsive by the Combo post treatment classifier. Thus, a total of 149 melanoma patients (131 + 18) were identified as non-responders by the Combo classifier. Exclusion of 149 patients from a currently marketed anti-CTLA4 antibody treatment (priced at about $125,000 per patient per year) that they otherwise would have been given (with ultimately no benefit to them) could save the health care system over $18.7 million per year. It would also mean that these patients would not waste time on a treatment that will not help them, and can be given another treatment more likely to help.
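The patient count and the cost estimate in the preceding paragraph follow from simple arithmetic, sketched below. The per-patient price is the approximate figure quoted in the text, so the product is only an estimate; at exactly $125,000 per patient it comes to $18,625,000, and the slightly higher figure quoted in the text presumably reflects the approximate nature of the price.

```python
# Counts taken from the text for Combo Classifier 1.
excluded_pre_treatment = 131   # classified as non-responders before treatment
excluded_post_treatment = 18   # additionally classified as non-responders after treatment
price_per_patient_per_year = 125_000  # approximate price in US dollars

total_excluded = excluded_pre_treatment + excluded_post_treatment
estimated_annual_saving = total_excluded * price_per_patient_per_year

print(total_excluded)            # 149
print(estimated_annual_saving)   # 18625000
```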
- Table 46 shows a comparison of actual response and predicted response as determined by the Combo classifier in the 1008 dataset.
- The cut-off point for the Combo post treatment classifier was optimized for higher sensitivity.
- Table 47 shows sensitivity and specificity.
- Table 48 is another example of applying the same Combo classifier to the 1008 dataset, but with a different cutoff point used to separate predicted responders from predicted non-responders.
- The cut-off point for the Combo post treatment classifier was optimized for higher percentage of correct prediction.
- Table 49 shows the sensitivity and specificity of this Combo classifier in the 1008 dataset using this alternate cutoff point.
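The two cut-off strategies mentioned in this example (one favoring sensitivity, one favoring the percentage of correct predictions) can be illustrated with a small sketch. The selection procedure, function names, and toy scores below are assumptions for illustration; the text does not describe how the cut-off points were actually optimized.

```python
def confusion(scores, labels, cutoff):
    """Counts (tp, fp, tn, fn); a subject is a predicted responder if score > cutoff."""
    tp = fp = tn = fn = 0
    for s, l in zip(scores, labels):
        predicted = s > cutoff
        if l == 1 and predicted:
            tp += 1
        elif l == 1:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def sensitivity(scores, labels, cutoff):
    tp, _, _, fn = confusion(scores, labels, cutoff)
    return tp / (tp + fn)

def specificity(scores, labels, cutoff):
    _, fp, tn, _ = confusion(scores, labels, cutoff)
    return tn / (tn + fp)

def accuracy(scores, labels, cutoff):
    tp, fp, tn, fn = confusion(scores, labels, cutoff)
    return (tp + tn) / (tp + fp + tn + fn)

def cutoff_for_accuracy(scores, labels):
    # Candidate cut-offs are the observed scores themselves (an assumption).
    return max(sorted(set(scores)), key=lambda c: accuracy(scores, labels, c))

def cutoff_for_sensitivity(scores, labels, min_sensitivity=0.9):
    ok = [c for c in sorted(set(scores))
          if sensitivity(scores, labels, c) >= min_sensitivity]
    if not ok:
        raise ValueError("no candidate cut-off reaches the requested sensitivity")
    # Among cut-offs that are sensitive enough, keep the most specific one.
    return max(ok, key=lambda c: specificity(scores, labels, c))

# Hypothetical scores and outcomes (1 = responder, 0 = non-responder).
scores = [-3.0, -2.5, -2.0, -1.5, -1.0, -0.5, 0.0, 0.5]
labels = [0, 0, 1, 0, 0, 1, 1, 1]
acc_cut = cutoff_for_accuracy(scores, labels)
sens_cut = cutoff_for_sensitivity(scores, labels, 0.9)
```

On this toy data the sensitivity-oriented cut-off is lower than the accuracy-oriented one, so it keeps every responder at the cost of more false positives, which mirrors the trade-off between Tables 46-47 and Tables 48-49.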
- EXAMPLE 23 Exemplary Combo classifier
- Combo Classifier 2 is another exemplary Combo classifier.
- Combo Classifier 2 involves one of the pre-treatment classifiers described in Table 40 (Classifier No.2 of Table 40) and one of the post treatment classifiers described in Table 41 (Classifier No.2 of Table 41).
- Table 50 shows a comparison of actual response to the immunotherapy and predicted response as determined by the pre-treatment classifier.
- Table 51 shows sensitivity and specificity of this pre-treatment classifier. In this example, 126 melanoma patients are classified as non-responders by the pre-treatment classifier, and 125 of those patients proved to be non-responders when given an anti-CTLA4 antibody treatment.
- Combo post treatment classifiers in Table 41 were then applied to the RNA expression data from post treatment samples taken from the 84 subjects from the “1009” trial that would not have been excluded by the results of the pre-treatment classifier of Table 50. Applying this Combo post treatment classifier to that post-treatment data generates results of Combo Classifier 2 shown in Table 52.
- Table 52 shows a comparison of actual response to the immunotherapy and predicted response as determined by Combo Classifier 2.
- Table 53 shows sensitivity and specificity of this Combo classifier.
- The Combo classifier identified 16 subjects that were not identified as non-responsive by the pre-treatment classifier, but were identified as non-responsive by the Combo post treatment classifier. Thus, a total of 142 melanoma patients (126 + 16) were identified as non-responders by the Combo classifier.
- The first dataset (“CRPC Dataset 1”) includes 62 patients with castration-resistant prostate cancer, with or without the presence of radiographic metastases and on various treatment regimens, enrolled at the Dana-Farber Cancer Institute from August 2006 to June 2008 on a genitourinary oncology clinical database and biorepository protocol.
- Whole-blood samples were prospectively collected in PAXgene® Blood RNA tubes (PreAnalytiX, Hombrechtikon, Switzerland) from patients in this cohort. One patient was removed because of poor data quality.
- The second dataset (“CRPC Dataset 2”) includes 140 patients with castration-resistant prostate cancer, with or without the presence of radiographic metastases and on various treatment regimens, enrolled on a genitourinary oncology clinical database and biorepository protocol at Memorial Sloan-Kettering Cancer Center (New York, NY, USA). Banked PAXgene® Blood RNA tubes were collected between August 2006 and February 2009. Two patients had poor quality RNA, leaving 138 patient samples for validation. Clinical data, including survival outcomes, were gathered. A detailed description of these two datasets can be found in Ross, Robert W., et al. “A whole-blood RNA transcript-based prognostic model in men with castration-resistant prostate cancer: a prospective study,” The Lancet Oncology 13.11: 1105-1113 (2012). The contents of this publication are incorporated by reference herein.
- RNA integrity number [RIN] ⁇ 6.3
- First-strand complementary DNA was synthesized from random hexamer-primed RNA templates using TaqMan® reverse transcription reagents (Applied Biosystems, Division of Life Technologies Corporation, Carlsbad, CA, USA). Quantitative PCR analysis was performed for the 18S rRNA content of newly synthesized complementary DNA, with the 7900HT fast real-time PCR System (Applied Biosystems, Foster City, CA, USA) as a quality check of the first-strand synthesis reaction.
- Target-gene amplification was done in a quantitative PCR reaction with TaqMan® Universal PCR Master Mix (Applied Biosystems, Division of Life Technologies Corporation, Carlsbad, CA, USA) and Precision Profiles (Source MDx, Boulder, CO, USA). Individual target-gene amplification was multiplexed with the eukaryotic 18S rRNA endogenous control and run in triplicate in a 384-well format on the 7900HT fast real-time PCR system. For quality control, all replicate cycle threshold (CT) values (both target gene and endogenous control) were independently checked and automatically filtered by rule. Normalized CT values (ΔCT) for each amplified target gene replicate were calculated. Resulting triplicate ΔCT values for individual target genes were averaged, yielding a final ΔCT value.
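The normalization and averaging step can be sketched as follows. The sketch assumes the conventional definition of a normalized CT value, namely the target gene's CT minus the CT of the multiplexed 18S rRNA endogenous control in the same replicate; the text does not spell out the direction of the subtraction or the quality-control filter rules, so both are left as assumptions here. The CT values are made up for illustration.

```python
from statistics import mean

def final_delta_ct(target_ct, control_ct):
    """Per-replicate dCT = CT(target) - CT(18S control), then averaged
    over the replicates (triplicates in the protocol described above)."""
    if len(target_ct) != len(control_ct):
        raise ValueError("each target replicate needs a matching control replicate")
    if not target_ct:
        raise ValueError("at least one replicate is required")
    return mean(t - c for t, c in zip(target_ct, control_ct))

# Hypothetical triplicate CT readings for one target gene and its 18S control.
target_replicates = [24.0, 24.5, 23.5]
control_replicates = [10.0, 10.5, 9.5]
delta_ct = final_delta_ct(target_replicates, control_replicates)
```

The resulting per-gene value is the quantity that a classifier multiplies by the gene's coefficient when computing an effectiveness score.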
- EXAMPLE 25 Core classifiers and Core + Group C classifiers in CRPC patients
- Core classifiers and Core + Group C classifiers derived from the 1009 (training) dataset as described above were applied to one or both of the CRPC datasets.
- Some exemplary Core classifiers are listed in Tables 4-17.
- Some exemplary Core + Group C classifiers are listed in Tables 18 and 19.
- A surrogate endpoint was used to evaluate the relevance of the Core and Core + Group C classifiers in the CRPC datasets.
- Each patient in the CRPC datasets was identified as either a “survivor” or a “non-survivor.”
- For CRPC Dataset 2, the “survivor” or “non-survivor” status was determined based on whether the patient was alive or dead 1 year after the blood sample was collected from the patient for the prostate cancer study at Memorial Sloan-Kettering Cancer Center.
- The classifiers produced a high AUC in a CRPC dataset.
- The AUC for Classifier No. 5 in Table 56 is 0.6479 in the 1009 training dataset, 0.6073 in the 1008 validation dataset, and 0.6732 in CRPC Dataset 1.
- Group J/K classifiers derived from the 1009 (training) dataset as described above were applied to one or both of the CRPC datasets. Some exemplary Group J/K classifiers that were applied in this manner are listed in Tables 26-28. As the CRPC patients of the two CRPC datasets were not treated with an immunotherapy during the periods of the CRPC studies, no immunotherapy response information was available for those patients. Thus, survival was used as a surrogate endpoint to evaluate the relevance of the Group J/K classifiers in the CRPC datasets. In the CRPC datasets, each patient was identified as either a “survivor” or a “non-survivor,” as described above.
- Tables 62 and 63 list several examples of pretreatment CDK2 classifiers, providing coefficients, logistic regression equation constant, AUCs, and standardized coefficients for each classifier as applied to the two melanoma datasets (1009 and 1008).
- The classifiers produced a high AUC in a CRPC dataset.
- The AUC for Classifier No. 2 in Table 63 is 0.6949 in the 1009 training dataset, 0.6371 in the 1008 validation dataset, 0.6005 in CRPC Dataset 2, and 0.7226 in CRPC Dataset 1.
- This classifier has 95.1% sensitivity and 55.0% specificity in CRPC Dataset 1.
- EXAMPLE 28 Post-treatment CDK2 classifiers
- The genes in the gene set for the post-treatment CDK2 models were selected based on the presently disclosed rules for model construction.
- The levels of transcribed mRNA corresponding to the genes in the gene set were used as explanatory variables in logistic regression to construct each post-treatment CDK2 model.
- The post-treatment CDK2 model was then applied to the “1009” dataset to create a post-treatment CDK2 classifier. Logistic regressions were first performed in the “1009” training dataset to determine the parameters for the classifier. The resulting post-treatment CDK2 classifier was then tested in the “1008” validation dataset.
- Tables 65 and 66 list several examples of post-treatment CDK2 classifiers, providing coefficients, logistic regression equation constant, AUCs, and standardized coefficients for each classifier in the two melanoma datasets (1009 and 1008). All of these classifiers include CDK2 and at least one gene from Group J or K. Some include one or more genes from both Groups J and K, and some include at least one selected from Group L and/or Group P2. The Table 65 and 66 classifiers were also applied to one or both CRPC datasets. Survival as a surrogate endpoint was used to evaluate the relevance of post-treatment CDK2 classifiers in the CRPC datasets.
- The classifiers produced a high AUC in one or both CRPC datasets.
- The AUC for Classifier No. 4 in Table 66 is 0.7057 in the 1009 training dataset, 0.6294 in the 1008 validation dataset, 0.5780 in CRPC Dataset 2, and 0.8007 in CRPC Dataset 1.
- This classifier has 90.2% sensitivity and 60.0% specificity in CRPC Dataset 1.
- Table 68 shows another example of a post-treatment CDK2 classifier, providing coefficients, logistic regression equation constant, AUCs, and standardized coefficients in the two melanoma datasets (1009 and 1008).
- This classifier includes the following genes:
- The effectiveness score for the immunotherapy can be calculated by the formula below:
- Effectiveness Score = -7.03578 - (0.1614 × CCR9) - (0.8716 × CDK2) + (1.2606 × EGR1) + (0.1878 × ELANE) - (0.4208 × HSPA1A) - (0.2937 × ICAM1) + (0.1444 × MMP9) + (0.3805 × TIMP1)
- The AUC for this classifier is 0.7853 in the 1009 training dataset and 0.6125 in the 1008 validation dataset.
- The negative predictive value (NPV), which is the proportion of negative results that are true negatives, was 99.7% in the 1009 training dataset and 92.9% in the 1008 validation dataset.
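The formula for this post-treatment CDK2 classifier and the NPV definition can be written out as a short sketch. The coefficients are copied from the formula in the text, reading its leading term as a regression constant of -7.03578 (an interpretation, since the printed formula lacks an equals sign). The input RNA levels and the confusion counts in the usage lines are hypothetical.

```python
CONSTANT = -7.03578  # leading term of the printed formula, read as the constant
COEFFICIENTS = {
    "CCR9": -0.1614, "CDK2": -0.8716, "EGR1": 1.2606, "ELANE": 0.1878,
    "HSPA1A": -0.4208, "ICAM1": -0.2937, "MMP9": 0.1444, "TIMP1": 0.3805,
}

def effectiveness_score(rna_levels):
    """Constant plus the sum of coefficient * RNA level over the eight genes."""
    missing = set(COEFFICIENTS) - set(rna_levels)
    if missing:
        raise KeyError(f"missing RNA levels for: {sorted(missing)}")
    return CONSTANT + sum(coef * rna_levels[gene]
                          for gene, coef in COEFFICIENTS.items())

def negative_predictive_value(true_negatives, false_negatives):
    """Proportion of negative (non-responder) calls that are true negatives."""
    return true_negatives / (true_negatives + false_negatives)

# Hypothetical inputs: every gene measured at an RNA level of 1.0.
score_all_ones = effectiveness_score({gene: 1.0 for gene in COEFFICIENTS})
# Hypothetical counts: 9 correct non-responder calls, 1 missed responder.
npv_example = negative_predictive_value(9, 1)
```

A high NPV means that a non-responder call is rarely wrong, which is the property that matters when the score is used to withhold or stop a therapy.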
- EXAMPLE 29 Predicting immunotherapy response for a prostate cancer patient by Core, Core + Group C, and pretreatment CDK2 classifiers that are obtained from Melanoma Dataset 1009
- It is believed that the pre-treatment classifiers (e.g., Core classifiers, Core + Group C classifiers, and pretreatment CDK2 classifiers) obtained from Melanoma Dataset 1009 can be used to predict either survival or immunotherapy response, or both, in cancer patients, regardless of the type of cancer or the type of immunotherapy (e.g., anti-CTLA-4 therapeutic antibodies).
- The discovery that these classifiers predict not only immunotherapy response in melanoma but also survival in CRPC (a type of cancer very different from melanoma) supports this belief in the broadly applicable predictive capability of the classifiers in all types of cancer.
- One possible explanation is that the classifiers represent expression levels of particular immune system genes that are needed for a robust immune response against cancer, and so identify patients whose immune systems are already primed to fight the cancer and are particularly amenable to a boost in effectiveness provided by administration of immunotherapy. There may be other theories that better explain the phenomenon. Ultimately, the present methods do not require an understanding of the theory behind how the classifiers work, and are not to be construed as limited by any theory in that regard.
- A classifier listed in any of Tables 4-19 or 62-63 is applied to the data representing gene-specific levels of transcribed mRNA in the whole-blood samples of a patient who has castration-resistant prostate cancer but has not yet been treated with an immunotherapy.
- A caregiver can compare the effectiveness score against a reference threshold. If the effectiveness score is greater than the reference threshold, the subject’s caregiver will recommend the subject be treated with the immunotherapy.
- EXAMPLE 30 Predicting immunotherapy response for a prostate cancer patient by Group J/K and post-treatment CDK2 classifiers that are obtained from Melanoma Dataset 1009
- The post-treatment classifiers can be used to predict immunotherapy response for a prostate cancer patient who has been treated with an immunotherapy (e.g., an anti-CTLA-4 checkpoint inhibitor), but whose response (or lack of response) to the immunotherapy has yet to be determined. Since a detectable response to an immunotherapy can be delayed for several months after the start of this type of treatment, the present methods may be useful for identifying the patients who are likely ultimately to respond if the therapy is continued, vs. the patients who are not likely to respond even if therapy is continued. Early information that the therapy is not likely to succeed permits the caregiver to switch the patient to an alternate treatment more likely to be helpful.
- an immunotherapy e.g., an anti-CTLA-4 checkpoint inhibitor
- the immunotherapy e.g., an anti-CTLA-4 checkpoint inhibitor
- All Core classifiers are also Core(1) classifiers. They are shown in Tables 4–17. Additional representative Core(1) classifiers are shown in Tables 70-72.
- Table 70 lists Core(1) classifiers that involve BAX and one gene selected from Group B1.
- Table 71 lists Core(1) classifiers that involve LARGE and one gene selected from Group B1.
- Table 72 lists Core(1) classifiers that involve one gene selected from
- The AUC for Classifier 4 in Table 73 is 0.8619 in the training dataset and 0.6789 in the validation dataset.
- The gene set of this classifier includes the following genes: ADAM17, CDK2, CDKN2A, DPP4, ERBB2, HLADRA, ICOS, ITGA4, LARGE, MYC, NAB2, NRAS, RHOC, TGFB1, and TIMP1.
- EXAMPLE 33 Group J1/K1 classifiers
- All Group J/K + L classifiers are also Group J1/K1 + L classifiers. They are shown in Tables 29-31. Table 75 lists additional representative Group J1/K1 + L classifiers, providing coefficients, logistic regression equation constant, two AUCs, and standardized coefficients for each classifier.
- Classifier 4 in Table 75 involves CD28, CTLA4, EGR1, ICOS, IFNG, and LCK, and the AUC for this classifier is 0.7718 in the 1009 training dataset, and 0.5975 in the 1008 validation dataset.
- EXAMPLE 35 Group J1/K1 + LM1 classifiers
- Group J1/K1 + LM1 classifiers were constructed by the methods disclosed herein. The levels of transcribed mRNA in blood samples collected after subjects had begun treatment were used in the analyses. As the gene list of Group J1 includes all genes in Group J, the gene list of Group K1 includes all genes in Group K, and the gene list of Group M1 overlaps with the gene list of Group M, many Group J/K + LM classifiers are also Group J1/K1 + LM1 classifiers. They are shown in Tables 31 and 32. Table 76 lists additional representative Group J1/K1 + LM1 classifiers, providing coefficients, logistic regression equation constant, two AUCs, and standardized coefficients for each classifier.
- EXAMPLE 36 Group J1/K1 + LN classifiers
- all Group J/K + LN classifiers are also Group J1/K1 + LN classifiers. They are shown in Tables 31 and 33. Table 77 lists additional representative Group J1/K1 + LN classifiers (Classifiers 1, 2, 3, 10, 12, and 13), providing coefficients, logistic regression equation constant, two AUCs, and standardized coefficients for each classifier.
- Classifier 1 in Table 77, which involves genes CCR5, CD28, CTLA4, EGR1, ICOS, IL7R, ITGAL, and TNFRSF13B, has an AUC of 0.7790 in the 1009 training dataset and 0.5851 in the 1008 validation dataset.
- Classifier 10 in Table 77, which involves genes CCR5, CD28, CTLA4, EGR1, ICOS, IL2RA, IL7R, ITGAL, and TNFRSF13B, has an AUC of 0.7810 in the 1009 training dataset and 0.5811 in the 1008 validation dataset.
- Classifier 12 in Table 77, which involves genes CCR5, CD28, CTLA4, EGR1, ICOS, IL2RA, IL7R, ITGAL, PTEN, and TNFRSF13B, has an AUC of 0.7920 in the 1009 training dataset and 0.5738 in the 1008 validation dataset.
- Classifier 13 in Table 77, which involves genes CCR5, CD28, CTLA4, ELA2, ICOS, IL2RA, IL7R, ITGAL, PTEN, and TNFRSF13B, has an AUC of 0.7590 in the 1009 training dataset and 0.5418 in the 1008 validation dataset.
- EXAMPLE 37 Group J1/K1 + LM1N classifiers
- Group J1/K1 + LM1N classifiers were constructed by the methods disclosed herein. The levels of transcribed mRNA in blood samples collected after subjects had begun treatment were used in the analyses. As the gene list of Group J1 includes all genes in Group J, the gene list of Group K1 includes all genes in Group K, and the gene list of Group M1 overlaps with the gene list of Group M, many Group J/K + LMN classifiers are also Group J1/K1 + LM1N classifiers. They are shown in Tables 31 and 34. Table 77 shows additional representative Group J1/K1 + LM1N classifiers (Classifiers 4, 5, 6, 7, and 11), providing coefficients, logistic regression equation constant, two AUCs, and standardized coefficients for each classifier.
- Classifier 11 in Table 77, which involves genes CCR5, CCR7, CD28, CTLA4, EGR1, ICOS, IL2RA, IL7R, ITGAL, LCK, NFATC1, and TNFRSF13B, has an AUC of 0.7865 in the 1009 training dataset and 0.5850 in the 1008 validation dataset.
- EXAMPLE 38 Group J1/K1 + M1N classifiers
- Table 77 shows two representative Group J1/K1 + M1N classifiers (Classifiers 8 and 9), providing coefficients, logistic regression equation constant, two AUCs, and standardized coefficients for each classifier.
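The tables report standardized coefficients alongside the raw ones, which makes the relative weight of each gene comparable regardless of measurement scale. This passage does not state how the standardization was done, so the sketch below assumes one common convention: multiplying each raw coefficient by the sample standard deviation of that gene's expression in the training data. The function name, the gene subset, and all numeric values are hypothetical.

```python
import statistics


def standardized_coefficients(raw_coefficients, training_expression):
    """Scale each raw coefficient by the sample SD of its predictor.

    ASSUMPTION: standardized = raw * SD(x). Other conventions exist
    (for example, also dividing by the SD of the outcome or the logit).
    raw_coefficients:    dict gene -> raw logistic regression coefficient.
    training_expression: dict gene -> list of training-set expression values.
    """
    result = {}
    for gene, beta in raw_coefficients.items():
        sd = statistics.stdev(training_expression[gene])
        result[gene] = beta * sd
    return result


# Hypothetical raw coefficients and training measurements, for illustration.
raw = {"CCR5": 2.0, "CD28": -0.5}
training = {
    "CCR5": [1.0, 2.0, 3.0],   # sample SD = 1.0
    "CD28": [2.0, 6.0, 10.0],  # sample SD = 4.0
}
std_coefs = standardized_coefficients(raw, training)
```

Under this convention, a gene with a small raw coefficient but a wide spread of expression values can carry more weight than a gene with a large raw coefficient and little variation.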
- implementations of the subject matter and the functional operations described in this specification can be implemented in digital electronic circuitry, in tangibly-embodied computer software or firmware, in computer hardware, including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them.
- Implementations of the subject matter described in this specification can be implemented as one or more computer programs, i.e., one or more modules of computer program instructions encoded on a tangible program carrier for execution by, or to control the operation of, a processing device.
- the program instructions can be encoded on a propagated signal that is an artificially generated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal that is generated to encode information for transmission to suitable receiver apparatus for execution by a processing device.
- a machine-readable medium can be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of them.
- various methods and formulae are implemented in the form of computer program instructions and executed by a processing device.
- Suitable programming languages for expressing the program instructions include, but are not limited to, C, C++, Java, Python, SQL, Perl, Tcl/Tk, JavaScript, ADA, OCaml, Haskell, Scala, and statistical analysis software such as SAS, R, MATLAB, SPSS, CORExpress® statistical analysis software, and Stata.
- Various aspects of the methods may be written in different computing languages from one another, and the various aspects are caused to communicate with one another by appropriate system-level-tools available on a given system.
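As one illustration of components written in different languages communicating through system-level tools, a driver program can launch another program as a child process and exchange data over standard input and output. In the sketch below, the child is a second Python interpreter only so that the example is self-contained; in practice it could be an R script, a compiled C program, or any other executable. The names `CHILD_PROGRAM` and `run_component`, and the JSON message layout, are assumptions made for illustration.

```python
import json
import subprocess
import sys

# Stand-in for a component written in another language: it reads JSON on
# stdin and writes a JSON summary on stdout. (Hypothetical component.)
CHILD_PROGRAM = (
    "import sys, json; "
    "d = json.load(sys.stdin); "
    "print(json.dumps({'n_genes': len(d['expression'])}))"
)


def run_component(payload):
    """Send payload to the child process as JSON and parse its JSON reply."""
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_PROGRAM],
        input=json.dumps(payload),
        capture_output=True,
        text=True,
        check=True,  # raise if the child exits with a non-zero status
    )
    return json.loads(completed.stdout)


# Hypothetical expression values passed between the two components.
result = run_component({"expression": {"CD28": 1.2, "ICOS": 0.7}})
```

Pipes with a text interchange format are only one option; files, sockets, or a shared database would serve the same purpose.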
- the processes and logic flows described in this specification can be performed by one or more programmable computers executing one or more computer programs to perform functions by operating on input information and generating output.
- the processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit) or RISC.
- Computers suitable for the execution of a computer program include, by way of example, general or special purpose microprocessors or both, or any other kind of central processing unit.
- a central processing unit will receive instructions and information from a read only memory or a random access memory or both.
- the essential elements of a computer are a central processing unit for performing or executing instructions and one or more memory devices for storing instructions and information.
- a computer will also include, or be operatively coupled to receive information from or transfer information to, or both, one or more mass storage devices for storing information, e.g., magnetic disks, magneto-optical disks, or optical disks.
- a computer need not have such devices.
- a computer can be embedded in another device, e.g., a mobile telephone, a smartphone or a tablet, a touchscreen device or surface, a personal digital assistant (PDA), a mobile audio or video player, a game console, a Global Positioning System (GPS) receiver, or a portable storage device (e.g., a universal serial bus (USB) flash drive), to name just a few.
- Computer readable media suitable for storing computer program instructions and information include various forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto-optical disks; and CD-ROM and (Blu-ray) DVD-ROM disks.
- the processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
- implementations of the subject matter described in this specification can be implemented on a computer having a display device, e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor, for displaying information to the user and a keyboard and a pointing device, e.g., a mouse or a trackball, by which the user can provide input to the computer.
- Other kinds of devices can be used to provide for interaction with a user as well.
- a computer can interact with a user by sending documents to and receiving documents from a device that is used by the user; for example, by sending web pages to a web browser on a user’s client device in response to requests received from the web browser.
- Implementations of the subject matter described in this specification can be implemented in a computing system that includes a back end component, e.g., as an information server, or that includes a middleware component, e.g., an application server, or that includes a front end component, e.g., a client computer having a graphical user interface or a Web browser through which a user can interact with an implementation of the subject matter described in this specification, or any combination of one or more such back end, middleware, or front end components.
- the components of the system can be interconnected by any form or medium of digital information communication, e.g., a communication network. Examples of communication networks include a local area network (“LAN”) and a wide area network (“WAN”), e.g., the Internet.
- the computing system can include clients and servers.
- a client and server are generally remote from each other and typically interact through a communication network.
- the relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
- the server can be in the cloud via cloud computing services.
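The client-server arrangement described above can be sketched with the Python standard library alone. In this hedged example, a small HTTP server returns a score as JSON and a client requests it over a loopback connection; the handler name `ScoreHandler`, the `/score` path, and the fixed placeholder score are all hypothetical, and a real deployment would compute the score from submitted expression data and would need authentication and transport security.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class ScoreHandler(BaseHTTPRequestHandler):
    """Answers every GET request with a placeholder score as JSON."""

    def do_GET(self):
        body = json.dumps({"score": 0.5}).encode("utf-8")  # placeholder value
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_score():
    """Start a local server, query it once as a client, then shut it down."""
    server = HTTPServer(("127.0.0.1", 0), ScoreHandler)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", "/score")
        reply = json.loads(conn.getresponse().read().decode("utf-8"))
        conn.close()
        return reply
    finally:
        server.shutdown()
        server.server_close()


response = fetch_score()
```

Here client and server run in one process for convenience; the same code works unchanged when the server runs on a remote or cloud-hosted machine and the client connects to its address instead of the loopback interface.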
Abstract
The present invention relates to a computer-implemented method for processing data in one or more data processing devices to determine an efficacy score for an immunotherapy.
Applications Claiming Priority (10)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201662302678P | 2016-03-02 | 2016-03-02 | |
US62/302,678 | 2016-03-02 | ||
US201662338922P | 2016-05-19 | 2016-05-19 | |
US62/338,922 | 2016-05-19 | ||
US201662409118P | 2016-10-17 | 2016-10-17 | |
US62/409,118 | 2016-10-17 | ||
US201662431190P | 2016-12-07 | 2016-12-07 | |
US62/431,190 | 2016-12-07 | ||
US201762450466P | 2017-01-25 | 2017-01-25 | |
US62/450,466 | 2017-01-25 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2017151768A1 (fr) | 2017-09-08 |
Family
ID=59743181
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/US2017/020200 WO2017151768A1 (fr) | 2016-03-02 | 2017-03-01 | Data processing and classification for determining an efficacy score of an immunotherapy |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2017151768A1 (fr) |
- 2017-03-01: WO application PCT/US2017/020200 filed (WO2017151768A1, active application filing)
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100196889A1 (en) * | 2006-11-13 | 2010-08-05 | Bankaitis-Davis Danute M | Gene Expression Profiling for Identification, Monitoring and Treatment of Colorectal Cancer |
US20100048414A1 (en) * | 2008-05-09 | 2010-02-25 | The Regents Of The University Of California A California Corporation | Novel methods for predicting and treating tumors resistant to drug, immunotherapy, and radiation |
US20110070582A1 (en) * | 2008-11-03 | 2011-03-24 | Source Precision Medicine, Inc. d/b/d Source MDX | Gene Expression Profiling for Predicting the Response to Immunotherapy and/or the Survivability of Melanoma Subjects |
WO2015120382A1 (fr) * | 2014-02-07 | 2015-08-13 | The Johns Hopkins University | Predicting response to treatment with epigenetic drugs |
Non-Patent Citations (4)
Title |
---|
FAN ET AL.: "Engagement of the ICOS pathway markedly enhances efficacy of CTLA-4 blockade in cancer immunotherapy", J EXP MED, vol. 211, 31 March 2014 (2014-03-31), pages 715 - 725, XP055414943 * |
SAENGER ET AL.: "Blood mRNA Expression Profiling Predicts Survival in Patients Treated with Tremelimumab", CLIN CANCER RES, vol. 20, 15 June 2014 (2014-06-15), pages 3310 - 3318, XP055414939 * |
SAXENA ET AL.: "Mcl-1 and Bcl-2/Bax Ratio Are Associated With Treatment Response but Not With Rai Stage in B- Cell Chronic Lymphocytic Leukemia", AMERICAN JOURNAL OF HEMATOLOGY, vol. 75, 1 January 2004 (2004-01-01), pages 22 - 33, XP055223048 * |
TANDON ET AL.: "EphrinA1-EphA2 interaction-mediated apoptosis and Flt3L-induced immunotherapy inhibits tumor growth in a breast cancer mouse model", J GENE MED, vol. 14, 1 February 2012 (2012-02-01), pages 77 - 89, XP055414945 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 17760718 Country of ref document: EP Kind code of ref document: A1 |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 17760718 Country of ref document: EP Kind code of ref document: A1 |