WO1998030722A9 - Gene expression monitoring for identifying gene function - Google Patents
Gene expression monitoring for identifying gene function
- Publication number
- WO1998030722A9 (PCT/US1998/001206)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- genes
- gene
- target
- nucleic acid
- expression
- Prior art date
Links
- 230000014509 gene expression Effects 0.000 title claims abstract description 285
- 230000001105 regulatory Effects 0.000 claims abstract description 138
- 230000035772 mutation Effects 0.000 claims abstract description 77
- 239000000523 sample Substances 0.000 claims description 170
- 150000007523 nucleic acids Chemical class 0.000 claims description 165
- 210000004027 cells Anatomy 0.000 claims description 155
- 108020004707 nucleic acids Proteins 0.000 claims description 149
- 102100019730 TP53 Human genes 0.000 claims description 145
- 101710026335 TP53 Proteins 0.000 claims description 145
- 238000009396 hybridization Methods 0.000 claims description 137
- 210000000481 Breast Anatomy 0.000 claims description 59
- 229920000272 Oligonucleotide Polymers 0.000 claims description 59
- 229920000160 (ribonucleotides)n+m Polymers 0.000 claims description 58
- 230000035897 transcription Effects 0.000 claims description 56
- 108020004999 Messenger RNA Proteins 0.000 claims description 52
- 229920002106 messenger RNA Polymers 0.000 claims description 51
- 230000002068 genetic Effects 0.000 claims description 50
- 108020004711 Nucleic Acid Probes Proteins 0.000 claims description 39
- 239000002853 nucleic acid probe Substances 0.000 claims description 39
- 230000000295 complement Effects 0.000 claims description 33
- 125000003729 nucleotide group Chemical group 0.000 claims description 29
- 239000002773 nucleotide Substances 0.000 claims description 28
- 239000012472 biological sample Substances 0.000 claims description 27
- 229920002676 Complementary DNA Polymers 0.000 claims description 26
- 239000002299 complementary DNA Substances 0.000 claims description 26
- 230000000694 effects Effects 0.000 claims description 26
- 102000004169 proteins and genes Human genes 0.000 claims description 26
- 108090000623 proteins and genes Proteins 0.000 claims description 26
- 229920001850 Nucleic acid sequence Polymers 0.000 claims description 25
- 206010006187 Breast cancer Diseases 0.000 claims description 24
- 239000007787 solid Substances 0.000 claims description 18
- 230000000875 corresponding Effects 0.000 claims description 17
- 238000000034 method Methods 0.000 claims description 15
- -1 p21waf1 Proteins 0.000 claims description 15
- 230000004075 alteration Effects 0.000 claims description 13
- 238000003499 nucleic acid array Methods 0.000 claims description 12
- 102000002431 Cyclin G Human genes 0.000 claims description 11
- 108010068188 Cyclin G Proteins 0.000 claims description 11
- 102100007880 PCNA Human genes 0.000 claims description 11
- 108010009063 Proliferating Cell Nuclear Antigen Proteins 0.000 claims description 11
- 102000002938 thrombospondin family Human genes 0.000 claims description 11
- 108060008245 thrombospondin family Proteins 0.000 claims description 11
- 238000011144 upstream manufacturing Methods 0.000 claims description 11
- 230000001364 causal effect Effects 0.000 claims description 10
- 102100015262 MYC Human genes 0.000 claims description 7
- 230000001965 increased Effects 0.000 claims description 7
- 108020000948 Antisense Oligonucleotides Proteins 0.000 claims description 6
- 102000011422 GADD45 protein Human genes 0.000 claims description 6
- 108091007852 GADD45 protein Proteins 0.000 claims description 6
- 239000000074 antisense oligonucleotide Substances 0.000 claims description 6
- 239000000126 substance Substances 0.000 claims description 6
- 102100002974 CDKN1A Human genes 0.000 claims description 5
- 108020005544 Antisense RNA Proteins 0.000 claims description 4
- 206010028980 Neoplasm Diseases 0.000 claims description 4
- 229920002847 antisense RNA Polymers 0.000 claims description 4
- 230000000903 blocking Effects 0.000 claims description 4
- 239000003184 complementary RNA Substances 0.000 claims description 4
- 210000004881 tumor cells Anatomy 0.000 claims description 4
- 108091006542 SLC12A9 Proteins 0.000 claims description 3
- 150000001875 compounds Chemical class 0.000 claims description 3
- 230000003247 decreasing Effects 0.000 claims description 3
- 238000002703 mutagenesis Methods 0.000 claims description 3
- 231100000350 mutagenesis Toxicity 0.000 claims description 3
- 229940088597 Hormone Drugs 0.000 claims description 2
- 239000005556 hormone Substances 0.000 claims description 2
- 230000002779 inactivation Effects 0.000 claims description 2
- 230000001575 pathological Effects 0.000 claims description 2
- 230000001613 neoplastic Effects 0.000 claims 4
- 208000000409 Breast Neoplasms Diseases 0.000 claims 2
- 239000002246 antineoplastic agent Substances 0.000 claims 2
- 102000009178 bcl-2-Associated X Protein Human genes 0.000 claims 2
- 108010048571 bcl-2-Associated X Protein Proteins 0.000 claims 2
- 230000004777 loss-of-function mutation Effects 0.000 claims 2
- 238000004166 bioassay Methods 0.000 claims 1
- 238000002825 functional assay Methods 0.000 claims 1
- 238000000520 microinjection Methods 0.000 claims 1
- 230000004936 stimulating Effects 0.000 claims 1
- 239000000203 mixture Substances 0.000 abstract description 16
- 229920003013 deoxyribonucleic acid Polymers 0.000 description 42
- 238000004458 analytical method Methods 0.000 description 38
- 239000000047 product Substances 0.000 description 38
- 230000002103 transcriptional Effects 0.000 description 31
- 210000001519 tissues Anatomy 0.000 description 30
- 239000000758 substrate Substances 0.000 description 29
- 238000001514 detection method Methods 0.000 description 26
- 230000003321 amplification Effects 0.000 description 25
- 238000010606 normalization Methods 0.000 description 25
- 238000003199 nucleic acid amplification method Methods 0.000 description 25
- 230000027455 binding Effects 0.000 description 24
- 238000003786 synthesis reaction Methods 0.000 description 21
- 238000006243 chemical reaction Methods 0.000 description 20
- 230000003211 malignant Effects 0.000 description 19
- 238000003752 polymerase chain reaction Methods 0.000 description 19
- 235000018102 proteins Nutrition 0.000 description 19
- 230000002194 synthesizing Effects 0.000 description 19
- 230000015572 biosynthetic process Effects 0.000 description 18
- 230000033228 biological regulation Effects 0.000 description 17
- 230000037361 pathway Effects 0.000 description 17
- 108020005187 Oligonucleotide Probes Proteins 0.000 description 15
- 239000002751 oligonucleotide probe Substances 0.000 description 15
- 238000002360 preparation method Methods 0.000 description 14
- 230000000692 anti-sense Effects 0.000 description 12
- 238000000338 in vitro Methods 0.000 description 12
- 201000011510 cancer Diseases 0.000 description 11
- 239000000178 monomer Substances 0.000 description 11
- 102000027656 receptor tyrosine kinases Human genes 0.000 description 11
- 108091007921 receptor tyrosine kinases Proteins 0.000 description 11
- 229920001776 Mature messenger RNA Polymers 0.000 description 10
- 230000001413 cellular Effects 0.000 description 10
- 239000003153 chemical reaction reagent Substances 0.000 description 10
- 230000011664 signaling Effects 0.000 description 10
- 239000011521 glass Substances 0.000 description 9
- 239000000543 intermediate Substances 0.000 description 9
- 238000011160 research Methods 0.000 description 9
- 230000004913 activation Effects 0.000 description 8
- 238000003260 fluorescence intensity Methods 0.000 description 8
- 230000002401 inhibitory effect Effects 0.000 description 8
- 238000005259 measurement Methods 0.000 description 8
- 230000004568 DNA-binding Effects 0.000 description 7
- 102000027760 ERBB2 Human genes 0.000 description 7
- 201000008275 breast carcinoma Diseases 0.000 description 7
- 238000002474 experimental method Methods 0.000 description 7
- 239000003112 inhibitor Substances 0.000 description 7
- 230000002829 reduced Effects 0.000 description 7
- 239000011780 sodium chloride Substances 0.000 description 7
- 102000009193 Caveolins Human genes 0.000 description 6
- 108050000084 Caveolins Proteins 0.000 description 6
- 108010066668 ErbB-2 Receptor Proteins 0.000 description 6
- 229920000665 Exon Polymers 0.000 description 6
- 108020004532 RAS Proteins 0.000 description 6
- 229920001785 Response element Polymers 0.000 description 6
- 230000015556 catabolic process Effects 0.000 description 6
- 230000004059 degradation Effects 0.000 description 6
- 238000006731 degradation reaction Methods 0.000 description 6
- 201000010099 disease Diseases 0.000 description 6
- 230000003993 interaction Effects 0.000 description 6
- 238000002372 labelling Methods 0.000 description 6
- 238000002966 oligonucleotide array Methods 0.000 description 6
- 229920000642 polymer Polymers 0.000 description 6
- 238000011002 quantification Methods 0.000 description 6
- 108020001180 rasD Proteins 0.000 description 6
- 230000022983 regulation of cell cycle Effects 0.000 description 6
- 230000019491 signal transduction Effects 0.000 description 6
- 238000010186 staining Methods 0.000 description 6
- 238000007619 statistical method Methods 0.000 description 6
- 238000006467 substitution reaction Methods 0.000 description 6
- 238000005406 washing Methods 0.000 description 6
- 101710042656 BQ2027_MB1231C Proteins 0.000 description 5
- 210000004323 Caveolae Anatomy 0.000 description 5
- 239000003298 DNA probe Substances 0.000 description 5
- 102000001301 EGF receptors Human genes 0.000 description 5
- 108060006698 EGF receptors Proteins 0.000 description 5
- 229920002760 Expressed sequence tag Polymers 0.000 description 5
- 102100006425 GAPDH Human genes 0.000 description 5
- 101710008404 GAPDH Proteins 0.000 description 5
- 238000010357 RNA editing Methods 0.000 description 5
- 230000026279 RNA modification Effects 0.000 description 5
- 235000001014 amino acid Nutrition 0.000 description 5
- XKRFYHLGVUSROY-UHFFFAOYSA-N argon Chemical compound [Ar] XKRFYHLGVUSROY-UHFFFAOYSA-N 0.000 description 5
- 230000002596 correlated Effects 0.000 description 5
- 238000007405 data analysis Methods 0.000 description 5
- 239000007850 fluorescent dye Substances 0.000 description 5
- 238000004519 manufacturing process Methods 0.000 description 5
- 230000022532 regulation of transcription, DNA-dependent Effects 0.000 description 5
- 150000003839 salts Chemical class 0.000 description 5
- 230000001629 suppression Effects 0.000 description 5
- UCSJYZPVAKXKNQ-HZYVHMACSA-N 1-[(1S,2R,3R,4S,5R,6R)-3-carbamimidamido-6-{[(2R,3R,4R,5S)-3-{[(2S,3S,4S,5R,6S)-4,5-dihydroxy-6-(hydroxymethyl)-3-(methylamino)oxan-2-yl]oxy}-4-formyl-4-hydroxy-5-methyloxolan-2-yl]oxy}-2,4,5-trihydroxycyclohexyl]guanidine Chemical compound CN[C@H]1[C@H](O)[C@@H](O)[C@H](CO)O[C@H]1O[C@@H]1[C@](C=O)(O)[C@H](C)O[C@H]1O[C@@H]1[C@@H](NC(N)=N)[C@H](O)[C@@H](NC(N)=N)[C@H](O)[C@H]1O UCSJYZPVAKXKNQ-HZYVHMACSA-N 0.000 description 4
- 101700033661 ACTB Proteins 0.000 description 4
- 102100011550 ACTB Human genes 0.000 description 4
- 101710032514 ACTI Proteins 0.000 description 4
- 102000027776 ERBB3 Human genes 0.000 description 4
- 101700041204 ERBB3 Proteins 0.000 description 4
- 229920001914 Ribonucleotide Polymers 0.000 description 4
- 238000007792 addition Methods 0.000 description 4
- 150000001413 amino acids Chemical class 0.000 description 4
- YBJHBAHKTGYVGT-ZKWXMUAHSA-N biotin Chemical compound N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 YBJHBAHKTGYVGT-ZKWXMUAHSA-N 0.000 description 4
- 229960002685 biotin Drugs 0.000 description 4
- 235000020958 biotin Nutrition 0.000 description 4
- 239000011616 biotin Substances 0.000 description 4
- 230000022131 cell cycle Effects 0.000 description 4
- 230000001808 coupling Effects 0.000 description 4
- 238000010168 coupling process Methods 0.000 description 4
- 238000005859 coupling reaction Methods 0.000 description 4
- 230000001809 detectable Effects 0.000 description 4
- 238000011161 development Methods 0.000 description 4
- 230000018109 developmental process Effects 0.000 description 4
- 230000001973 epigenetic Effects 0.000 description 4
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 description 4
- 230000005284 excitation Effects 0.000 description 4
- GNBHRKFJIUUOQI-UHFFFAOYSA-N fluorescein Chemical compound O1C(=O)C2=CC=CC=C2C21C1=CC=C(O)C=C1OC1=CC(O)=CC=C21 GNBHRKFJIUUOQI-UHFFFAOYSA-N 0.000 description 4
- ZHNUHDYFZUAESO-UHFFFAOYSA-N formamide Chemical compound NC=O ZHNUHDYFZUAESO-UHFFFAOYSA-N 0.000 description 4
- 239000003102 growth factor Substances 0.000 description 4
- 238000005734 heterodimerization reaction Methods 0.000 description 4
- NOESYZHRGYRDHS-UHFFFAOYSA-N insulin Chemical compound N1C(=O)C(NC(=O)C(CCC(N)=O)NC(=O)C(CCC(O)=O)NC(=O)C(C(C)C)NC(=O)C(NC(=O)CN)C(C)CC)CSSCC(C(NC(CO)C(=O)NC(CC(C)C)C(=O)NC(CC=2C=CC(O)=CC=2)C(=O)NC(CCC(N)=O)C(=O)NC(CC(C)C)C(=O)NC(CCC(O)=O)C(=O)NC(CC(N)=O)C(=O)NC(CC=2C=CC(O)=CC=2)C(=O)NC(CSSCC(NC(=O)C(C(C)C)NC(=O)C(CC(C)C)NC(=O)C(CC=2C=CC(O)=CC=2)NC(=O)C(CC(C)C)NC(=O)C(C)NC(=O)C(CCC(O)=O)NC(=O)C(C(C)C)NC(=O)C(CC(C)C)NC(=O)C(CC=2NC=NC=2)NC(=O)C(CO)NC(=O)CNC2=O)C(=O)NCC(=O)NC(CCC(O)=O)C(=O)NC(CCCNC(N)=N)C(=O)NCC(=O)NC(CC=3C=CC=CC=3)C(=O)NC(CC=3C=CC=CC=3)C(=O)NC(CC=3C=CC(O)=CC=3)C(=O)NC(C(C)O)C(=O)N3C(CCC3)C(=O)NC(CCCCN)C(=O)NC(C)C(O)=O)C(=O)NC(CC(N)=O)C(O)=O)=O)NC(=O)C(C(C)CC)NC(=O)C(CO)NC(=O)C(C(C)O)NC(=O)C1CSSCC2NC(=O)C(CC(C)C)NC(=O)C(NC(=O)C(CCC(N)=O)NC(=O)C(CC(N)=O)NC(=O)C(NC(=O)C(N)CC=1C=CC=CC=1)C(C)C)CC1=CN=CN1 NOESYZHRGYRDHS-UHFFFAOYSA-N 0.000 description 4
- 238000011005 laboratory method Methods 0.000 description 4
- 230000002018 overexpression Effects 0.000 description 4
- 230000014493 regulation of gene expression Effects 0.000 description 4
- 239000002336 ribonucleotide Substances 0.000 description 4
- 125000002652 ribonucleotide group Chemical group 0.000 description 4
- 230000003827 upregulation Effects 0.000 description 4
- 206010059512 Apoptosis Diseases 0.000 description 3
- 101700011961 DPOM Proteins 0.000 description 3
- 208000006402 Ductal Carcinoma Diseases 0.000 description 3
- 101700025368 ERBB2 Proteins 0.000 description 3
- 102000004190 Enzymes Human genes 0.000 description 3
- 108090000790 Enzymes Proteins 0.000 description 3
- 210000000981 Epithelium Anatomy 0.000 description 3
- 108091006011 G proteins Proteins 0.000 description 3
- 108091000058 GTP-Binding Proteins Proteins 0.000 description 3
- 102000030007 GTP-Binding Proteins Human genes 0.000 description 3
- 229920002459 Intron Polymers 0.000 description 3
- 101700067074 MAPK Proteins 0.000 description 3
- 101710041325 MAPKAPK2 Proteins 0.000 description 3
- 101710029649 MDV043 Proteins 0.000 description 3
- 101700080605 NUC1 Proteins 0.000 description 3
- 101700061424 POLB Proteins 0.000 description 3
- 101700054624 RF1 Proteins 0.000 description 3
- 230000006907 apoptotic process Effects 0.000 description 3
- 229910052786 argon Inorganic materials 0.000 description 3
- 230000001580 bacterial Effects 0.000 description 3
- 108010003152 bacteriophage T7 RNA polymerase Proteins 0.000 description 3
- 239000011324 bead Substances 0.000 description 3
- 238000004422 calculation algorithm Methods 0.000 description 3
- 239000003795 chemical substances by application Substances 0.000 description 3
- 238000007621 cluster analysis Methods 0.000 description 3
- 238000010192 crystallographic characterization Methods 0.000 description 3
- 230000001419 dependent Effects 0.000 description 3
- 238000005516 engineering process Methods 0.000 description 3
- 239000012530 fluid Substances 0.000 description 3
- 238000002073 fluorescence micrograph Methods 0.000 description 3
- PCHJSUWPFVWCPO-UHFFFAOYSA-N gold Chemical compound [Au] PCHJSUWPFVWCPO-UHFFFAOYSA-N 0.000 description 3
- 230000012010 growth Effects 0.000 description 3
- 229910052739 hydrogen Inorganic materials 0.000 description 3
- 239000001257 hydrogen Substances 0.000 description 3
- 238000005286 illumination Methods 0.000 description 3
- 230000001900 immune effect Effects 0.000 description 3
- 238000003364 immunohistochemistry Methods 0.000 description 3
- 238000002955 isolation Methods 0.000 description 3
- 239000000463 material Substances 0.000 description 3
- 239000012528 membrane Substances 0.000 description 3
- 101700006494 nucA Proteins 0.000 description 3
- 238000005457 optimization Methods 0.000 description 3
- 230000002093 peripheral Effects 0.000 description 3
- 238000000206 photolithography Methods 0.000 description 3
- 238000004445 quantitative analysis Methods 0.000 description 3
- 101710024887 rl Proteins 0.000 description 3
- 239000007790 solid phase Substances 0.000 description 3
- 101700045897 spk-1 Proteins 0.000 description 3
- 230000001131 transforming Effects 0.000 description 3
- PCDQPRRSZKQHHS-XVFCMESISA-N ({[({[(2R,3S,4R,5R)-5-(4-amino-2-oxo-1,2-dihydropyrimidin-1-yl)-3,4-dihydroxyoxolan-2-yl]methoxy}(hydroxy)phosphoryl)oxy](hydroxy)phosphoryl}oxy)phosphonic acid Chemical compound O=C1N=C(N)C=CN1[C@H]1[C@H](O)[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 PCDQPRRSZKQHHS-XVFCMESISA-N 0.000 description 2
- 101710027066 ALB Proteins 0.000 description 2
- 102100001085 APOB Human genes 0.000 description 2
- 101700065507 APOB Proteins 0.000 description 2
- 102000002260 Alkaline Phosphatase Human genes 0.000 description 2
- 108020004774 Alkaline Phosphatase Proteins 0.000 description 2
- 239000000592 Artificial Cell Substances 0.000 description 2
- 102100013894 BCL2 Human genes 0.000 description 2
- 108060000885 BCL2 Proteins 0.000 description 2
- 101710023465 BIO4 Proteins 0.000 description 2
- 210000004369 Blood Anatomy 0.000 description 2
- IXIBAKNTJSCKJM-BUBXBXGNSA-N Bovine insulin Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@H]1CSSC[C@H]2C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@H](C(=O)N[C@H](C(N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC=3C=CC(O)=CC=3)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC=3C=CC(O)=CC=3)C(=O)N[C@@H](CSSC[C@H](NC(=O)[C@H](C(C)C)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC=3C=CC(O)=CC=3)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](C)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC=3NC=NC=3)NC(=O)[C@H](CO)NC(=O)CNC1=O)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O)=O)CSSC[C@@H](C(N2)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](NC(=O)CN)[C@@H](C)CC)C(C)C)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CC=1C=CC=CC=1)C(C)C)C1=CN=CN1 IXIBAKNTJSCKJM-BUBXBXGNSA-N 0.000 description 2
- 101710022308 CDKN1A Proteins 0.000 description 2
- DNKYDHSONDSTNJ-XJVRLEFXSA-N CHEMBL1910953 Chemical compound C([C@@H](C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(=O)NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)NCC(=O)NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](C(C)C)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(N)=O)NC(=O)[C@@H](NC(=O)[C@@H](NC(=O)[C@H](CS)NC(=O)[C@@H](NC(=O)[C@H](C)NC(=O)[C@@H](NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CS)NC(=O)[C@H](C)N)[C@@H](C)O)[C@@H](C)O)C(C)C)[C@@H](C)O)C1=CN=CN1 DNKYDHSONDSTNJ-XJVRLEFXSA-N 0.000 description 2
- 108090000932 Calcitonin gene-related peptide Proteins 0.000 description 2
- 102000004414 Calcitonin gene-related peptide Human genes 0.000 description 2
- 210000001736 Capillaries Anatomy 0.000 description 2
- 208000005623 Carcinogenesis Diseases 0.000 description 2
- 210000003483 Chromatin Anatomy 0.000 description 2
- 108010077544 Chromatin Proteins 0.000 description 2
- 229920001405 Coding region Polymers 0.000 description 2
- 108010092799 EC 2.7.7.49 Proteins 0.000 description 2
- 102100016662 ERBB2 Human genes 0.000 description 2
- 102000033147 ERVK-25 Human genes 0.000 description 2
- 229940088598 Enzyme Drugs 0.000 description 2
- 241000206602 Eukaryota Species 0.000 description 2
- 108090000045 G-protein coupled receptors Proteins 0.000 description 2
- 102000003688 G-protein coupled receptors Human genes 0.000 description 2
- FDGQSTZJBFJUBT-UHFFFAOYSA-N Hypoxanthine Chemical compound O=C1NC=NC2=C1NC=N2 FDGQSTZJBFJUBT-UHFFFAOYSA-N 0.000 description 2
- 102000004877 Insulin Human genes 0.000 description 2
- 108090001061 Insulin Proteins 0.000 description 2
- 108010002350 Interleukin-2 Proteins 0.000 description 2
- 108020004391 Introns Proteins 0.000 description 2
- 108091007472 MAP kinase family Proteins 0.000 description 2
- 241000699660 Mus musculus Species 0.000 description 2
- 229920002332 Noncoding DNA Polymers 0.000 description 2
- 210000004940 Nucleus Anatomy 0.000 description 2
- 206010033128 Ovarian cancer Diseases 0.000 description 2
- 229940049954 Penicillin Drugs 0.000 description 2
- JGSARLDLIJGVTE-MBNYWOFBSA-N Penicillin G Chemical compound N([C@H]1[C@H]2SC([C@@H](N2C1=O)C(O)=O)(C)C)C(=O)CC1=CC=CC=C1 JGSARLDLIJGVTE-MBNYWOFBSA-N 0.000 description 2
- LLKYUHGUYSLMPA-UHFFFAOYSA-N Phosphoramidite Chemical compound NP([O-])[O-] LLKYUHGUYSLMPA-UHFFFAOYSA-N 0.000 description 2
- 239000004952 Polyamide Substances 0.000 description 2
- 102000001253 Protein Kinases Human genes 0.000 description 2
- 102000006382 Ribonucleases Human genes 0.000 description 2
- 108010083644 Ribonucleases Proteins 0.000 description 2
- 108050003452 SH2 domains Proteins 0.000 description 2
- 102000014400 SH2 domains Human genes 0.000 description 2
- 229960005322 Streptomycin Drugs 0.000 description 2
- 229920004890 Triton X-100 Polymers 0.000 description 2
- 102000015098 Tumor Suppressor Protein p53 Human genes 0.000 description 2
- 108010078814 Tumor Suppressor Protein p53 Proteins 0.000 description 2
- PGAVKCOVUIYSFO-XVFCMESISA-N Uridine triphosphate Chemical compound O[C@@H]1[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O[C@H]1N1C(=O)NC(=O)C=C1 PGAVKCOVUIYSFO-XVFCMESISA-N 0.000 description 2
- 108090001123 antibodies Proteins 0.000 description 2
- 102000004965 antibodies Human genes 0.000 description 2
- 229960000070 antineoplastic Monoclonal antibodies Drugs 0.000 description 2
- 229910052785 arsenic Inorganic materials 0.000 description 2
- 229960000626 benzylpenicillin Drugs 0.000 description 2
- 101710038807 bio2 Proteins 0.000 description 2
- 101700001148 bioB Proteins 0.000 description 2
- 101700001402 bioC Proteins 0.000 description 2
- 101700053308 bioH Proteins 0.000 description 2
- 239000008280 blood Substances 0.000 description 2
- 201000009030 carcinoma Diseases 0.000 description 2
- 238000004113 cell culture Methods 0.000 description 2
- HEDRZPFGACZZDS-UHFFFAOYSA-N chloroform Chemical compound ClC(Cl)Cl HEDRZPFGACZZDS-UHFFFAOYSA-N 0.000 description 2
- 238000007374 clinical diagnostic method Methods 0.000 description 2
- 238000010367 cloning Methods 0.000 description 2
- 230000003828 downregulation Effects 0.000 description 2
- 238000007876 drug discovery Methods 0.000 description 2
- 239000003623 enhancer Substances 0.000 description 2
- 230000002708 enhancing Effects 0.000 description 2
- 239000012091 fetal bovine serum Substances 0.000 description 2
- 239000012467 final product Substances 0.000 description 2
- 125000000524 functional group Chemical group 0.000 description 2
- 238000010438 heat treatment Methods 0.000 description 2
- 229940094991 herring sperm DNA Drugs 0.000 description 2
- 102000034443 heterotrimeric G proteins Human genes 0.000 description 2
- 108091006077 heterotrimeric G proteins Proteins 0.000 description 2
- 230000001976 improved Effects 0.000 description 2
- 239000000411 inducer Substances 0.000 description 2
- 230000001939 inductive effect Effects 0.000 description 2
- 201000010985 invasive ductal carcinoma Diseases 0.000 description 2
- 238000007834 ligase chain reaction Methods 0.000 description 2
- 238000007403 mPCR Methods 0.000 description 2
- 239000002609 media Substances 0.000 description 2
- 230000001404 mediated Effects 0.000 description 2
- 238000002844 melting Methods 0.000 description 2
- 239000003226 mitogen Substances 0.000 description 2
- 229960000060 monoclonal antibodies Drugs 0.000 description 2
- 108010045030 monoclonal antibodies Proteins 0.000 description 2
- 102000005614 monoclonal antibodies Human genes 0.000 description 2
- 231100000707 mutagenic chemical Toxicity 0.000 description 2
- 230000001537 neural Effects 0.000 description 2
- 239000002777 nucleoside Substances 0.000 description 2
- 231100000590 oncogenic Toxicity 0.000 description 2
- 230000002246 oncogenic Effects 0.000 description 2
- 102000025475 oncoproteins Human genes 0.000 description 2
- 108091008124 oncoproteins Proteins 0.000 description 2
- 230000003287 optical Effects 0.000 description 2
- 230000008506 pathogenesis Effects 0.000 description 2
- 150000008300 phosphoramidites Chemical class 0.000 description 2
- 229920002120 photoresistant polymer Polymers 0.000 description 2
- 230000001402 polyadenylating Effects 0.000 description 2
- 229920002647 polyamide Polymers 0.000 description 2
- 239000011528 polyamide (building material) Substances 0.000 description 2
- 229920000023 polynucleotide Polymers 0.000 description 2
- 239000002157 polynucleotide Substances 0.000 description 2
- 238000001556 precipitation Methods 0.000 description 2
- 102000004196 processed proteins & peptides Human genes 0.000 description 2
- 108090000765 processed proteins & peptides Proteins 0.000 description 2
- 230000020978 protein processing Effects 0.000 description 2
- 238000003753 real-time PCR Methods 0.000 description 2
- 238000010839 reverse transcription Methods 0.000 description 2
- 238000003757 reverse transcription PCR Methods 0.000 description 2
- 238000003196 serial analysis of gene expression Methods 0.000 description 2
- FAPWRFPIFSIZLT-UHFFFAOYSA-M sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 2
- 231100000588 tumorigenic Toxicity 0.000 description 2
- 230000000381 tumorigenic Effects 0.000 description 2
- 229950010342 uridine triphosphate Drugs 0.000 description 2
- WRGQSWVCFNIUNZ-GDCKJWNLSA-N 1-oleoyl-sn-glycerol 3-phosphate Chemical compound CCCCCCCC\C=C/CCCCCCCC(=O)OC[C@@H](O)COP(O)(O)=O WRGQSWVCFNIUNZ-GDCKJWNLSA-N 0.000 description 1
- MSSXOMSJDRHRMC-UHFFFAOYSA-N 2,6-Diaminopurine Chemical compound NC1=NC(N)=C2NC=NC2=N1 MSSXOMSJDRHRMC-UHFFFAOYSA-N 0.000 description 1
- LENZDBCJOHFCAS-UHFFFAOYSA-N 2-amino-2-(hydroxymethyl)propane-1,3-diol Chemical compound OCC(N)(CO)CO LENZDBCJOHFCAS-UHFFFAOYSA-N 0.000 description 1
- NOIRDLRUNWIUMX-UHFFFAOYSA-N 2-amino-3,7-dihydropurin-6-one;6-amino-1H-pyrimidin-2-one Chemical compound NC=1C=CNC(=O)N=1.O=C1NC(N)=NC2=C1NC=N2 NOIRDLRUNWIUMX-UHFFFAOYSA-N 0.000 description 1
- JRYMOPZHXMVHTA-DAGMQNCNSA-N 2-amino-7-[(2R,3R,4S,5R)-3,4-dihydroxy-5-(hydroxymethyl)oxolan-2-yl]-1H-pyrrolo[2,3-d]pyrimidin-4-one Chemical compound C1=CC=2C(=O)NC(N)=NC=2N1[C@@H]1O[C@H](CO)[C@@H](O)[C@H]1O JRYMOPZHXMVHTA-DAGMQNCNSA-N 0.000 description 1
- OBULAGGRIVAQEG-DFGXMLLCSA-N 5-[(3aS,4S,6aR)-2-oxo-1,3,3a,4,6,6a-hexahydrothieno[3,4-d]imidazol-4-yl]pentanoic acid;[[(2R,3S,4R,5R)-5-(2,4-dioxopyrimidin-1-yl)-3,4-dihydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl] phosphono hydrogen phosphate Chemical compound N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21.O[C@@H]1[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O[C@H]1N1C(=O)NC(=O)C=C1 OBULAGGRIVAQEG-DFGXMLLCSA-N 0.000 description 1
- 102100001249 ALB Human genes 0.000 description 1
- 102000007592 Apolipoproteins Human genes 0.000 description 1
- 108010071619 Apolipoproteins Proteins 0.000 description 1
- 210000003567 Ascitic Fluid Anatomy 0.000 description 1
- 241000894006 Bacteria Species 0.000 description 1
- 108060006878 Biotin synthases Proteins 0.000 description 1
- 210000000601 Blood Cells Anatomy 0.000 description 1
- 210000004556 Brain Anatomy 0.000 description 1
- 108060001064 Calcitonin Proteins 0.000 description 1
- 229960004015 Calcitonin Drugs 0.000 description 1
- 102400000113 Calcitonin Human genes 0.000 description 1
- 102000003727 Caveolin-1 Human genes 0.000 description 1
- 108090000026 Caveolin-1 Proteins 0.000 description 1
- 102000003692 Caveolin-2 Human genes 0.000 description 1
- 108090000032 Caveolin-2 Proteins 0.000 description 1
- 210000000170 Cell Membrane Anatomy 0.000 description 1
- 210000003855 Cell Nucleus Anatomy 0.000 description 1
- 102000005853 Clathrin Human genes 0.000 description 1
- 108010019874 Clathrin Proteins 0.000 description 1
- 230000036881 Clu Effects 0.000 description 1
- 108020004705 Codon Proteins 0.000 description 1
- 108010051219 Cre recombinase Proteins 0.000 description 1
- 102000031025 DNA-Binding Proteins Human genes 0.000 description 1
- 108091000102 DNA-Binding Proteins Proteins 0.000 description 1
- 108090000626 DNA-directed RNA polymerases Proteins 0.000 description 1
- 102000004163 DNA-directed RNA polymerases Human genes 0.000 description 1
- 101710028159 DNTT Proteins 0.000 description 1
- 102100002445 DNTT Human genes 0.000 description 1
- 102000001039 Dystrophin Human genes 0.000 description 1
- 108010069091 Dystrophin Proteins 0.000 description 1
- 101700033006 EGF Proteins 0.000 description 1
- 102100010813 EGF Human genes 0.000 description 1
- 238000002965 ELISA Methods 0.000 description 1
- 241000701867 Enterobacteria phage T7 Species 0.000 description 1
- 229940116977 Epidermal Growth Factor Drugs 0.000 description 1
- 108010073043 ErbB Receptors Proteins 0.000 description 1
- 241000588724 Escherichia coli Species 0.000 description 1
- 241000701959 Escherichia virus Lambda Species 0.000 description 1
- PLUBXMRUUVWRLT-UHFFFAOYSA-N Ethyl methane sulfonate Chemical compound CCOS(C)(=O)=O PLUBXMRUUVWRLT-UHFFFAOYSA-N 0.000 description 1
- 101700032527 GRAP Proteins 0.000 description 1
- 101700046691 GRB2 Proteins 0.000 description 1
- 108010001498 Galectin 1 Proteins 0.000 description 1
- 206010064571 Gene mutation Diseases 0.000 description 1
- 102000018899 Glutamate Receptors Human genes 0.000 description 1
- 108010027915 Glutamate Receptors Proteins 0.000 description 1
- 229960002989 Glutamic Acid Drugs 0.000 description 1
- 102000004144 Green Fluorescent Proteins Human genes 0.000 description 1
- 108010043121 Green Fluorescent Proteins Proteins 0.000 description 1
- UYTPUPDQBNUYGX-UHFFFAOYSA-N Guanine Chemical group O=C1NC(N)=NC2=C1N=CN2 UYTPUPDQBNUYGX-UHFFFAOYSA-N 0.000 description 1
- 102000016285 Guanine Nucleotide Exchange Factors Human genes 0.000 description 1
- 108010067218 Guanine Nucleotide Exchange Factors Proteins 0.000 description 1
- 101710033925 HRAS Proteins 0.000 description 1
- 102100009283 HRAS Human genes 0.000 description 1
- 230000036499 Half live Effects 0.000 description 1
- 108010001336 Horseradish Peroxidase Proteins 0.000 description 1
- 229940045644 Human calcitonin Drugs 0.000 description 1
- 210000003016 Hypothalamus Anatomy 0.000 description 1
- UGQMRVRMYYASKQ-KMPDEGCQSA-N Inosine Natural products O[C@H]1[C@H](O)[C@@H](CO)O[C@@H]1N1C(N=CNC2=O)=C2N=C1 UGQMRVRMYYASKQ-KMPDEGCQSA-N 0.000 description 1
- UGQMRVRMYYASKQ-KQYNXXCUSA-N Inosine Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C2=NC=NC(O)=C2N=C1 UGQMRVRMYYASKQ-KQYNXXCUSA-N 0.000 description 1
- 102400000022 Insulin-Like Growth Factor II Human genes 0.000 description 1
- 108090001117 Insulin-Like Growth Factor II Proteins 0.000 description 1
- 210000000936 Intestines Anatomy 0.000 description 1
- WHUUTDBJXJRKMK-VKHMYHEASA-N L-glutamic acid Chemical compound OC(=O)[C@@H](N)CCC(O)=O WHUUTDBJXJRKMK-VKHMYHEASA-N 0.000 description 1
- ZDXPYRJPNDTMRX-VKHMYHEASA-N L-glutamine Chemical compound OC(=O)[C@@H](N)CCC(N)=O ZDXPYRJPNDTMRX-VKHMYHEASA-N 0.000 description 1
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 1
- 102100009633 LGALS1 Human genes 0.000 description 1
- 102000002297 Laminin Receptors Human genes 0.000 description 1
- 108010000851 Laminin Receptors Proteins 0.000 description 1
- 241000272168 Laridae Species 0.000 description 1
- 229920000126 Latex Polymers 0.000 description 1
- 210000004185 Liver Anatomy 0.000 description 1
- 239000004472 Lysine Substances 0.000 description 1
- 102000038027 MAP kinase family Human genes 0.000 description 1
- 102000004331 Mitogen-Activated Protein Kinases Human genes 0.000 description 1
- 108090000823 Mitogen-Activated Protein Kinases Proteins 0.000 description 1
- 210000003205 Muscles Anatomy 0.000 description 1
- 102000008300 Mutant Proteins Human genes 0.000 description 1
- 108010021466 Mutant Proteins Proteins 0.000 description 1
- 101710034230 NR2F1 Proteins 0.000 description 1
- 102100016102 NTRK1 Human genes 0.000 description 1
- 101700043017 NTRK1 Proteins 0.000 description 1
- 238000000636 Northern blotting Methods 0.000 description 1
- 101710016786 P/C Proteins 0.000 description 1
- 101710039569 POLM Proteins 0.000 description 1
- 102000035443 Peptidases Human genes 0.000 description 1
- 108091005771 Peptidases Proteins 0.000 description 1
- 102000007982 Phosphoproteins Human genes 0.000 description 1
- 108091000081 Phosphotransferases Proteins 0.000 description 1
- 210000004910 Pleural fluid Anatomy 0.000 description 1
- 229920000795 Polyadenylation Polymers 0.000 description 1
- 239000004743 Polypropylene Substances 0.000 description 1
- 239000004793 Polystyrene Substances 0.000 description 1
- SCVFZCLFOSHCOH-UHFFFAOYSA-M Potassium acetate Chemical compound [K+].CC([O-])=O SCVFZCLFOSHCOH-UHFFFAOYSA-M 0.000 description 1
- 206010036790 Productive cough Diseases 0.000 description 1
- 108060006633 Protein Kinases Proteins 0.000 description 1
- 240000007072 Prunus domestica Species 0.000 description 1
- 101710037934 QRSL1 Proteins 0.000 description 1
- 229920000320 RNA (poly(A)) Polymers 0.000 description 1
- 108020004412 RNA 3' Polyadenylation Signals Proteins 0.000 description 1
- 108020005093 RNA Precursors Proteins 0.000 description 1
- 239000007759 RPMI Media 1640 Substances 0.000 description 1
- 241000700159 Rattus Species 0.000 description 1
- PYWVYCXTNDRMGF-UHFFFAOYSA-N Rhodamine B Chemical compound [Cl-].C=12C=CC(=[N+](CC)CC)C=C2OC2=CC(N(CC)CC)=CC=C2C=1C1=CC=CC=C1C(O)=O PYWVYCXTNDRMGF-UHFFFAOYSA-N 0.000 description 1
- 229920000972 Sense strand Polymers 0.000 description 1
- 108010071390 Serum Albumin Proteins 0.000 description 1
- 108020004682 Single-Stranded DNA Proteins 0.000 description 1
- 210000003802 Sputum Anatomy 0.000 description 1
- 108010090804 Streptavidin Proteins 0.000 description 1
- 102100008904 TFRC Human genes 0.000 description 1
- 108010006785 Taq Polymerase Proteins 0.000 description 1
- OKIZCWYLBDKLSU-UHFFFAOYSA-M Tetramethylammonium chloride Chemical compound [Cl-].C[N+](C)(C)C OKIZCWYLBDKLSU-UHFFFAOYSA-M 0.000 description 1
- MPLHNVLQVRSVEE-UHFFFAOYSA-N Texas Red Chemical compound [O-]S(=O)(=O)C1=CC(S(Cl)(=O)=O)=CC=C1C(C1=CC=2CCCN3CCCC(C=23)=C1O1)=C2C1=C(CCC1)C3=[N+]1CCCC3=C2 MPLHNVLQVRSVEE-UHFFFAOYSA-N 0.000 description 1
- 210000001685 Thyroid Gland Anatomy 0.000 description 1
- 102000002142 Trans-Activators Human genes 0.000 description 1
- 108010040939 Trans-Activators Proteins 0.000 description 1
- 108010033576 Transferrin Receptors Proteins 0.000 description 1
- SHGAZHPCJJPHSC-NWVFGJFESA-N Tretinoin Chemical compound OC(=O)/C=C(\C)/C=C/C=C(C)C=CC1=C(C)CCCC1(C)C SHGAZHPCJJPHSC-NWVFGJFESA-N 0.000 description 1
- 229960001727 Tretinoin Drugs 0.000 description 1
- 210000002700 Urine Anatomy 0.000 description 1
- 101700021643 VP4A Proteins 0.000 description 1
- 108010067390 Viral Proteins Proteins 0.000 description 1
- 102000016350 Viral Proteins Human genes 0.000 description 1
- 208000001756 Virus Disease Diseases 0.000 description 1
- 208000008383 Wilms Tumor Diseases 0.000 description 1
- NBIIXXVUZAFLBC-UHFFFAOYSA-K [O-]P([O-])([O-])=O Chemical compound [O-]P([O-])([O-])=O NBIIXXVUZAFLBC-UHFFFAOYSA-K 0.000 description 1
- 230000001594 aberrant Effects 0.000 description 1
- 230000002159 abnormal effect Effects 0.000 description 1
- 239000002253 acid Substances 0.000 description 1
- 230000003213 activating Effects 0.000 description 1
- GFFGJBXGBJISGV-UHFFFAOYSA-N adenyl group Chemical group N1=CN=C2N=CNC2=C1N GFFGJBXGBJISGV-UHFFFAOYSA-N 0.000 description 1
- 102000019633 alpha-2 adrenergic receptor family Human genes 0.000 description 1
- 108020004101 alpha-2 adrenergic receptor family Proteins 0.000 description 1
- 125000000539 amino acid group Chemical group 0.000 description 1
- 125000003277 amino group Chemical group 0.000 description 1
- 230000003322 aneuploid Effects 0.000 description 1
- 230000002457 bidirectional Effects 0.000 description 1
- 230000031018 biological processes and functions Effects 0.000 description 1
- 238000001574 biopsy Methods 0.000 description 1
- 239000002981 blocking agent Substances 0.000 description 1
- 238000006664 bond formation reaction Methods 0.000 description 1
- 238000010805 cDNA synthesis kit Methods 0.000 description 1
- BBBFJLBPOGFECG-VJVYQDLKSA-N calcitonin Chemical compound N([C@H](C(=O)N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC=1NC=NC=1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CO)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N1[C@@H](CCC1)C(N)=O)C(C)C)C(=O)[C@@H]1CSSC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N1 BBBFJLBPOGFECG-VJVYQDLKSA-N 0.000 description 1
- 238000004364 calculation method Methods 0.000 description 1
- 230000036952 cancer formation Effects 0.000 description 1
- XFIOKOXROGCUQX-UHFFFAOYSA-O carbamimidoylazanium;chloroform;phenol Chemical compound NC([NH3+])=N.ClC(Cl)Cl.OC1=CC=CC=C1 XFIOKOXROGCUQX-UHFFFAOYSA-O 0.000 description 1
- 231100000504 carcinogenesis Toxicity 0.000 description 1
- 230000030833 cell death Effects 0.000 description 1
- 230000024245 cell differentiation Effects 0.000 description 1
- 229920002092 cellular RNA Polymers 0.000 description 1
- 230000003196 chaotropic Effects 0.000 description 1
- 239000007795 chemical reaction product Substances 0.000 description 1
- 230000002759 chromosomal Effects 0.000 description 1
- 239000011248 coating agent Substances 0.000 description 1
- 238000000576 coating method Methods 0.000 description 1
- 238000004440 column chromatography Methods 0.000 description 1
- 238000004624 confocal microscopy Methods 0.000 description 1
- 239000000470 constituent Substances 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 239000000356 contaminant Substances 0.000 description 1
- 210000004748 cultured cells Anatomy 0.000 description 1
- 230000001351 cycling Effects 0.000 description 1
- 230000002498 deadly Effects 0.000 description 1
- 239000007857 degradation product Substances 0.000 description 1
- 239000005547 deoxyribonucleotide Substances 0.000 description 1
- 125000002637 deoxyribonucleotide group Chemical group 0.000 description 1
- 238000000151 deposition Methods 0.000 description 1
- 230000002074 deregulated Effects 0.000 description 1
- 230000003831 deregulation Effects 0.000 description 1
- 230000000368 destabilizing Effects 0.000 description 1
- 239000003599 detergent Substances 0.000 description 1
- 239000005546 dideoxynucleotide Substances 0.000 description 1
- 230000003292 diminished Effects 0.000 description 1
- 239000003596 drug target Substances 0.000 description 1
- 230000004064 dysfunction Effects 0.000 description 1
- KCXVZYZYPLLWCC-UHFFFAOYSA-N edta Chemical compound OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O KCXVZYZYPLLWCC-UHFFFAOYSA-N 0.000 description 1
- 230000000464 effect on transcription Effects 0.000 description 1
- 238000001962 electrophoresis Methods 0.000 description 1
- 230000002255 enzymatic Effects 0.000 description 1
- 210000002919 epithelial cells Anatomy 0.000 description 1
- 239000003797 essential amino acid Substances 0.000 description 1
- 235000020776 essential amino acid Nutrition 0.000 description 1
- 125000001495 ethyl group Chemical group [H]C([H])([H])C([H])([H])* 0.000 description 1
- 210000003527 eukaryotic cell Anatomy 0.000 description 1
- 238000010195 expression analysis Methods 0.000 description 1
- 238000000605 extraction Methods 0.000 description 1
- 238000000556 factor analysis Methods 0.000 description 1
- 210000002950 fibroblast Anatomy 0.000 description 1
- WSFSSNUMVMOOMR-UHFFFAOYSA-N formaldehyde Chemical compound O=C WSFSSNUMVMOOMR-UHFFFAOYSA-N 0.000 description 1
- 238000006062 fragmentation reaction Methods 0.000 description 1
- 230000005714 functional activity Effects 0.000 description 1
- 238000001415 gene therapy Methods 0.000 description 1
- 102000034448 gene-regulatory proteins Human genes 0.000 description 1
- 108091006088 gene-regulatory proteins Proteins 0.000 description 1
- 238000010353 genetic engineering Methods 0.000 description 1
- 239000003365 glass fiber Substances 0.000 description 1
- 239000003862 glucocorticoid Substances 0.000 description 1
- 235000013922 glutamic acid Nutrition 0.000 description 1
- 239000004220 glutamic acid Substances 0.000 description 1
- 229910052737 gold Inorganic materials 0.000 description 1
- 239000010931 gold Substances 0.000 description 1
- 239000005090 green fluorescent protein Substances 0.000 description 1
- 239000000122 growth hormone Substances 0.000 description 1
- 230000036541 health Effects 0.000 description 1
- 230000003284 homeostatic Effects 0.000 description 1
- 101500011263 human Calcitonin Proteins 0.000 description 1
- 230000002209 hydrophobic Effects 0.000 description 1
- 125000002887 hydroxy group Chemical group [H]O* 0.000 description 1
- 230000000984 immunochemical Effects 0.000 description 1
- 238000003365 immunocytochemistry Methods 0.000 description 1
- 230000000415 inactivating Effects 0.000 description 1
- 230000000977 initiatory Effects 0.000 description 1
- 229960003786 inosine Drugs 0.000 description 1
- 238000007689 inspection Methods 0.000 description 1
- 238000005304 joining Methods 0.000 description 1
- 101710017890 large T Proteins 0.000 description 1
- 239000004816 latex Substances 0.000 description 1
- 239000003446 ligand Substances 0.000 description 1
- 230000000670 limiting Effects 0.000 description 1
- 230000004807 localization Effects 0.000 description 1
- 239000006166 lysate Substances 0.000 description 1
- FYYHWMGAXLPEAU-UHFFFAOYSA-N magnesium Chemical compound [Mg] FYYHWMGAXLPEAU-UHFFFAOYSA-N 0.000 description 1
- 239000011777 magnesium Substances 0.000 description 1
- 229910052749 magnesium Inorganic materials 0.000 description 1
- 230000036210 malignancy Effects 0.000 description 1
- 239000003550 marker Substances 0.000 description 1
- 230000000873 masking Effects 0.000 description 1
- 238000004949 mass spectrometry Methods 0.000 description 1
- 238000002493 microarray Methods 0.000 description 1
- 238000000386 microscopy Methods 0.000 description 1
- 230000002297 mitogenic Effects 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 238000006011 modification reaction Methods 0.000 description 1
- 230000000051 modifying Effects 0.000 description 1
- 238000010369 molecular cloning Methods 0.000 description 1
- 238000000329 molecular dynamics simulation Methods 0.000 description 1
- 230000004784 molecular pathogenesis Effects 0.000 description 1
- 230000003990 molecular pathway Effects 0.000 description 1
- 239000003068 molecular probe Substances 0.000 description 1
- 238000000491 multivariate analysis Methods 0.000 description 1
- 230000003505 mutagenic Effects 0.000 description 1
- 239000003471 mutagenic agent Substances 0.000 description 1
- 230000003227 neuromodulating Effects 0.000 description 1
- 229910052757 nitrogen Inorganic materials 0.000 description 1
- 238000009828 non-uniform distribution Methods 0.000 description 1
- 238000001613 nuclear run-on assay Methods 0.000 description 1
- 238000007899 nucleic acid hybridization Methods 0.000 description 1
- 125000003835 nucleoside group Chemical group 0.000 description 1
- 230000036961 partial Effects 0.000 description 1
- 230000001717 pathogenic Effects 0.000 description 1
- 238000002823 phage display Methods 0.000 description 1
- 239000012071 phase Substances 0.000 description 1
- ISWSIDIOOBJBQZ-UHFFFAOYSA-N phenol Chemical compound OC1=CC=CC=C1 ISWSIDIOOBJBQZ-UHFFFAOYSA-N 0.000 description 1
- 238000002205 phenol-chloroform extraction Methods 0.000 description 1
- 239000010452 phosphate Substances 0.000 description 1
- 150000004713 phosphodiesters Chemical class 0.000 description 1
- 238000006303 photolysis reaction Methods 0.000 description 1
- 230000015843 photosynthesis, light reaction Effects 0.000 description 1
- 239000004033 plastic Substances 0.000 description 1
- 108091008117 polyclonal antibodies Proteins 0.000 description 1
- 238000006116 polymerization reaction Methods 0.000 description 1
- 229920001155 polypropylene Polymers 0.000 description 1
- 229920002223 polystyrene Polymers 0.000 description 1
- 230000023603 positive regulation of transcription initiation, DNA-dependent Effects 0.000 description 1
- 230000001124 posttranscriptional Effects 0.000 description 1
- 238000004321 preservation Methods 0.000 description 1
- 230000035755 proliferation Effects 0.000 description 1
- 230000000644 propagated Effects 0.000 description 1
- 235000019833 protease Nutrition 0.000 description 1
- 125000006239 protecting group Chemical group 0.000 description 1
- 239000011253 protective coating Substances 0.000 description 1
- 238000000746 purification Methods 0.000 description 1
- 238000004451 qualitative analysis Methods 0.000 description 1
- 230000002285 radioactive Effects 0.000 description 1
- 108010014186 ras Proteins Proteins 0.000 description 1
- 102000016914 ras Proteins Human genes 0.000 description 1
- 239000000376 reactant Substances 0.000 description 1
- 230000025915 regulation of apoptotic process Effects 0.000 description 1
- 230000025053 regulation of cell proliferation Effects 0.000 description 1
- 230000000754 repressing Effects 0.000 description 1
- 230000001718 repressive Effects 0.000 description 1
- 229930002330 retinoic acid Natural products 0.000 description 1
- 238000004805 robotic Methods 0.000 description 1
- 239000004065 semiconductor Substances 0.000 description 1
- 230000035945 sensitivity Effects 0.000 description 1
- BLRPTPMANUNPDV-UHFFFAOYSA-N silane Chemical compound [SiH4] BLRPTPMANUNPDV-UHFFFAOYSA-N 0.000 description 1
- 229910000077 silane Inorganic materials 0.000 description 1
- 229910001415 sodium ion Inorganic materials 0.000 description 1
- 229910000162 sodium phosphate Inorganic materials 0.000 description 1
- 238000010532 solid phase synthesis reaction Methods 0.000 description 1
- 239000002904 solvent Substances 0.000 description 1
- 241000894007 species Species 0.000 description 1
- 239000007858 starting material Substances 0.000 description 1
- 230000004083 survival Effects 0.000 description 1
- 238000010189 synthetic method Methods 0.000 description 1
- 230000001225 therapeutic Effects 0.000 description 1
- 230000035916 transactivation Effects 0.000 description 1
- 238000001890 transfection Methods 0.000 description 1
- PIEPQKCYPFFYMG-UHFFFAOYSA-N tris acetate Chemical compound CC(O)=O.OCC(N)(CO)CO PIEPQKCYPFFYMG-UHFFFAOYSA-N 0.000 description 1
- 230000001228 trophic Effects 0.000 description 1
- 238000000539 two dimensional gel electrophoresis Methods 0.000 description 1
- 241001515965 unidentified phage Species 0.000 description 1
- 230000003612 virological Effects 0.000 description 1
- 230000000007 visual effect Effects 0.000 description 1
- 238000001262 western blot Methods 0.000 description 1
- 238000009736 wetting Methods 0.000 description 1
- HCHKCACWOHOZIP-UHFFFAOYSA-N zinc Chemical compound [Zn] HCHKCACWOHOZIP-UHFFFAOYSA-N 0.000 description 1
- 229910052725 zinc Inorganic materials 0.000 description 1
- 239000011701 zinc Substances 0.000 description 1
Definitions
- Gene expression is also associated with pathogenesis.
- the lack of sufficient expression of functional tumor suppressor genes and/or the overexpression of oncogenes/proto-oncogenes could lead to tumorigenesis (Marshall, Cell, 64: 313-326 (1991); Weinberg, Science, 254: 1138-1146 (1991), incorporated herein by reference for all purposes).
- changes in the expression levels of particular genes, e.g., oncogenes or tumor suppressors
- This invention provides methods, compositions, and apparatus for studying the complex regulatory relationships among genes.
- this invention provides methods, compositions, and apparatus for detecting mutations of upstream regulatory genes by monitoring the expression of down-stream genes.
- gene expression monitoring is used to determine certain functions of a gene by identifying its down-stream regulated genes. Similar embodiments use gene expression to discern the effect of specific mutations of upstream genes. Gene expression is also used to identify upstream regulatory genes in some embodiments.
- gene expression monitoring is used to decipher the complex regulatory relationship among genes.
- the expression of more than 10 genes, preferably more than 100 genes, more preferably more than 1,000 genes, and most preferably more than 5,000 genes is monitored in a large number of samples of cells.
- each of the samples has an expression pattern different from that of other samples.
- a plurality of independent samples are assayed.
- the expression data can be analyzed to understand the complex relationships among genes.
- the expression data are analyzed to develop a map describing such complex relationships.
- antisense oligonucleotides or antisense genes are used to block the expression of specific genes.
- homozygous knock-out techniques are used to specifically suppress the expression of genes.
- transfection of regulatory genes is used to alter the expression profile of a cell.
- antisense oligonucleotides of random sequence are introduced to cells to block the expression of genes.
- expression data are analyzed to generate cluster maps indicating a correlation among genes.
- cluster maps are then analyzed using statistical methods to generate a map consisting of regulatory pathways describing the complex relationship among the genes. Many statistical methods are suitable for building such maps.
- the LISREL method is particularly useful in such application.
- the structure of the map is refined as more data become available. Thus, the map is dynamic and updated automatically as new data sets are entered.
- Such a gene network map has a wide variety of applications, such as in the fields of diagnostics, drug discovery, gene therapy, and biological research. For example, an investigator interested in a particular gene may consult such a map to find putative upstream and down-stream genes with statistical confidence. The investigator can then focus further research on those genes.
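The correlation step behind a cluster map can be sketched as follows. This is a minimal illustration only: it uses plain Pearson correlation rather than the LISREL method named above, and the gene names, the sample values, and the `threshold` parameter are all hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant profile carries no correlation information
    return cov / (sx * sy)

def correlated_pairs(profiles, threshold=0.9):
    """Return gene pairs whose expression across samples is strongly
    correlated (positively or negatively). `threshold` is an assumed cutoff."""
    genes = sorted(profiles)
    pairs = []
    for i, g1 in enumerate(genes):
        for g2 in genes[i + 1:]:
            r = pearson(profiles[g1], profiles[g2])
            if abs(r) >= threshold:
                pairs.append((g1, g2, round(r, 3)))
    return pairs

# Hypothetical expression levels of four genes across four samples.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 1.0, 3.0],
}
pairs = correlated_pairs(profiles)
```

Pairs that pass the cutoff are only candidates for a regulatory link; correlation alone does not say which gene is upstream.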
- gene expression monitoring is used to detect potential malfunction of regulatory genes.
- the expression of a subset of genes of interest in a diseased tissue is analyzed to obtain a diseased expression pattern.
- the subset contains at least one or more than 5, 10, 20, 25, 50, 75, 100, 150, 200, 250, 300, 400, 500, 750, 1,000, 1,250, 1,500, 3,000, 4,550, or 6,000 genes of all the known genes.
- the expression of the same genes in a normal tissue can also be similarly analyzed to generate a normal gene expression pattern. A difference in the expression of genes indicates abnormal regulation in the diseased tissue.
- a data filter is used to identify those genes whose expression is significantly altered. By using a data filter, only those genes whose expression is enhanced or reduced in the diseased tissue more than, e.g., 3, 5, or 10-fold are identified as altered.
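A data filter of this kind might look like the sketch below. The function name, the `floor` parameter (used here to avoid dividing by near-zero signals), and the sample values are assumptions made for illustration; only the fold thresholds come from the text.

```python
def fold_change_filter(normal, diseased, min_fold=3.0, floor=1.0):
    """Keep only genes whose expression is enhanced or reduced more than
    `min_fold` in the diseased pattern relative to the normal pattern.

    `floor` clamps very low signals so that ratios stay finite
    (an assumption; the source does not say how low signals are handled).
    """
    altered = {}
    for gene in sorted(normal.keys() & diseased.keys()):
        n = max(normal[gene], floor)
        d = max(diseased[gene], floor)
        if d / n > min_fold:
            altered[gene] = ("enhanced", d / n)
        elif n / d > min_fold:
            altered[gene] = ("reduced", n / d)
    return altered

# Hypothetical expression levels in arbitrary units.
normal = {"g1": 100, "g2": 100, "g3": 100, "g4": 50}
diseased = {"g1": 1200, "g2": 150, "g3": 8, "g4": 50}
altered_10x = fold_change_filter(normal, diseased, min_fold=10)
```

Raising `min_fold` from 3 to 10 trades sensitivity for a shorter, higher-confidence list of altered genes.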
- the upstream regulatory gene of the altered gene is indicated as a candidate malfunctioning gene.
- an upstream gene is identified as a candidate malfunctioning gene only if the expression of two or more of its down-stream genes is affected.
- the candidate malfunctioning gene is then sequenced to check whether a mutation is present, or the malfunction is due to epigenetic or nongenetic effectors.
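The selection of candidate malfunctioning genes can be sketched as a lookup against a regulatory map. The map below, the regulator and gene names, and the `min_affected` parameter are hypothetical; the default of 2 reflects the "two or more down-stream genes" condition above.

```python
def candidate_regulators(network, altered_genes, min_affected=2):
    """Return upstream genes with at least `min_affected` altered
    down-stream genes, together with the affected targets.

    `network` maps each upstream regulatory gene to its known
    down-stream genes (a hypothetical structure for illustration).
    """
    altered = set(altered_genes)
    candidates = {}
    for regulator, targets in network.items():
        hits = sorted(set(targets) & altered)
        if len(hits) >= min_affected:
            candidates[regulator] = hits
    return candidates

# Hypothetical regulatory map and set of altered genes.
network = {
    "regX": ["g1", "g2", "g3"],
    "regY": ["g3", "g4"],
    "regZ": ["g5"],
}
altered = {"g1", "g3", "g5"}
candidates = candidate_regulators(network, altered)
```

Each candidate returned this way would then be sequenced, as described above, to distinguish a mutation from epigenetic or nongenetic causes.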
- a mutation may not be present in the genome, and yet the product of the regulatory gene appears to be malfunctioning.
- p53 can be functionally rather than genetically inactivated by binding to viral proteins, such as E1B and large T antigen. By assaying for the ability of a regulatory protein to activate or repress other genes' expression, both genetic and phenotypic inactivation can be assessed.
- the function of a particular mutation in a regulatory gene can be determined by gene expression monitoring.
- the expression profiles of cells containing the specific mutation and control cells lacking the mutation are compared to determine whether the mutation affects the expression of down-stream genes.
- the function of a particular gene may be determined.
- the expression of a large number of genes is monitored in biological samples that express the target gene to produce a control expression profile.
- the expression of the target gene is then suppressed to produce a target expression profile.
- p53 activated and repressed genes are monitored to detect loss of wild-type p53 function.
- gene expression monitoring is used to detect the in-cell function of p53.
- loss of function of a nucleic acid encoding a regulatory molecule in a test cell can be determined.
- a first nucleic acid molecule encoding a regulatory molecule is selected for analysis.
- a set of second nucleic acid molecules whose expression is induced or repressed by the regulatory molecule in normal cells is compiled or selected.
- a transcription indicator of a test cell is hybridized to a set of nucleic acid probes. The transcription indicator is selected from the group consisting of mRNA, cDNA and cRNA.
- Each member of the set of nucleic acid probes comprises a portion of a nucleic acid molecule which is a member of the set of second nucleic acid molecules which are induced or repressed by the selected regulatory molecule.
- the amount of transcription indicator which hybridizes to each of said set of nucleic acid probes is determined.
- a test cell is identified as having lost function of the regulatory molecule if (1) hybridization of the transcription indicator of the test cell to a probe which comprises a portion of a nucleic acid which is induced by the regulatory molecule is lower than hybridization using a transcription indicator from a normal cell, or (2) hybridization of the transcription indicator of the test cell to a probe which comprises a portion of a nucleic acid which is repressed by the regulatory molecule is higher than hybridization using a transcription indicator from a normal cell.
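The decision rule in the last step can be sketched as a comparison of hybridization signals between the test cell and a normal cell. The gene names, signal values, and the `min_fold` margin are hypothetical; with the default `min_fold=1.0` the check reduces to the plain "lower than" and "higher than" comparisons stated above.

```python
def loss_of_function_evidence(test, normal, induced, repressed, min_fold=1.0):
    """Collect probes that indicate loss of regulatory function.

    `test` and `normal` map probe (gene) names to hybridization signals.
    `induced` and `repressed` list the genes the regulatory molecule
    induces or represses in normal cells. `min_fold` is an assumed margin.
    """
    evidence = []
    for gene in induced:
        # (1) an induced target hybridizes less than in the normal cell
        if test[gene] * min_fold < normal[gene]:
            evidence.append((gene, "induced target lower than normal"))
    for gene in repressed:
        # (2) a repressed target hybridizes more than in the normal cell
        if test[gene] > normal[gene] * min_fold:
            evidence.append((gene, "repressed target higher than normal"))
    return evidence

# Hypothetical signals for two induced targets and one repressed target.
normal_cell = {"indA": 500, "indB": 300, "repA": 50}
test_cell = {"indA": 40, "indB": 310, "repA": 400}
evidence = loss_of_function_evidence(
    test_cell, normal_cell, induced=["indA", "indB"], repressed=["repA"]
)
lost_function = bool(evidence)
```

Setting `min_fold` above 1.0 would require a larger difference before a probe counts as evidence, which may be useful when signals are noisy.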
- Figure 1 shows a hypothetical genetic network.
- Figure 2 shows a schematic of one embodiment for interrogating the genetic network.
- Figure 3 shows a schematic of one embodiment for expression monitoring for gene function identification.
- Figure 4 shows a schematic of one embodiment for expression monitoring for mutation function identification.
- Figure 5 shows the result of GeneChip® sequence analysis of p53 genes in normal and malignant breast epithelium cells.
- Figure 6 shows an expression profile of normal and malignant breast epithelium cells.
- Figure 7 shows a schematic of one embodiment for expression monitoring for mutation function identification.
- Figure 8 shows the result of GeneChip® mutation detection of the p53 gene in two malignant breast epithelium cell lines.
- FIGs 9A and 9B illustrate fluorescence images of oligonucleotide arrays monitoring 1,650 genes in parallel (1 of a set of 4 arrays covering 6,600 genes).
- In Fig. 9A, representative hybridization patterns of fluorescently labeled cRNA from normal (HT-125) and malignant breast (BT-474) cells are shown. The images were obtained after hybridization of arrays with fragmented, biotin-labeled cRNA and subsequent staining with a phycoerythrin-streptavidin conjugate. Bright rows indicate messages present at high levels. Low level messages (1-10 copies/cell) are unambiguously detected based on quantitative analysis of PM/MM intensity patterns.
- a magnified view of a portion of the array highlighting examples of altered gene expression between BT-474 and HT-125 is shown.
- induced (>10-fold change in hybridization intensity) genes are shown
- unchanged (<2-fold change in hybridization intensity) genes are shown
- in area 3, repressed (>10-fold change in hybridization intensity) genes are shown.
- Fig. 9B illustrates zoom-in images of genes 1, 2, & 3 in (A) as 20 probe pairs of perfect-matched (PM) and single base mis-matched (MM) oligonucleotide probe cells.
- Figure 10 illustrates expression profiles of a subset of genes from normal versus malignant breast cells. Average perfect match-mismatch (PM-MM) intensity differences (normalized to β-Actin and GAPDH signals) were plotted for the genes highlighted in Figure 9A that demonstrated greater than a 2-fold difference in hybridization signals between HT-125 and BT-474. Values for signals off scale are indicated.
- PM-MM perfect match-mismatch
- Figure 11 illustrates p53 sequence analysis and mutation detection by hybridization.
- In Figure 11A, an image of the p53 genotyping array hybridized to 1,490 bp of the BT-474 breast carcinoma p53 gene (left) is shown.
- In each column are 4 identical probes with an A, C, G or T substituted at a central position.
- the hybridized target sequence is identified, based on mismatch detection, from left to right as the complement of the substitution base with the brightest signal.
- The G-A transition seen in BT-474 is accompanied by a loss of signal at flanking positions as these probes now have a single-base mismatch to the target distinct from the query position.
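The base-calling rule described above can be sketched as follows: each column holds four probes that differ only in the substituted central base, and the target base is the complement of the brightest probe's substitution base. The intensity values are hypothetical, and tie handling (the first maximum wins) is an assumption.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def call_sequence(columns):
    """Call the target sequence, left to right, from per-column intensities.

    Each column is a dict mapping the probe's substitution base
    (A, C, G, or T) to its hybridization intensity.
    """
    calls = []
    for intensities in columns:
        brightest = max(intensities, key=intensities.get)
        calls.append(COMPLEMENT[brightest])
    return "".join(calls)

# Hypothetical intensities for four adjacent columns of the array.
columns = [
    {"A": 120, "C": 90, "G": 1500, "T": 80},    # brightest G -> target C
    {"A": 60, "C": 70, "G": 85, "T": 1300},     # brightest T -> target A
    {"A": 1400, "C": 95, "G": 110, "T": 75},    # brightest A -> target T
    {"A": 100, "C": 1250, "G": 90, "T": 105},   # brightest C -> target G
]
called = call_sequence(columns)
```

A practical caller would also flag columns where no probe clearly dominates, such as the flanking positions that lose signal next to a mutation.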
- Fig. 11B (top): comparison of wild-type reference (black) and BT-474 p53 gene (red) hybridization intensity patterns from sense (above) and anti-sense (below) strands.
- the area shown demonstrates the "footprint" and detection of a single-base difference between the samples (vertical green line).
- GeneChip data analysis output is shown (bottom) that unambiguously identifies a G-A base change at nucleotide 1,279 of p53 in BT-474 resulting in a glutamic acid to lysine amino acid change in exon 8 (DNA binding domain).
- the upper portion of output displays the p53 wild-type reference sequence. Aligned outputs of wild-type p53 control and BT-474 samples are shown.
- Bind(s) substantially refers to complementary hybridization between a probe nucleic acid and a target nucleic acid and embraces minor mismatches that can be accommodated by reducing the stringency of the hybridization media to achieve the desired detection of the target polynucleotide sequence.
- Background refers to hybridization signals resulting from non-specific binding, or other interactions, between the labeled target nucleic acids and components of the oligonucleotide array (e.g., the oligonucleotide probes, control probes, the array substrate, etc.). Background signals may also be produced by intrinsic fluorescence of the array components themselves.
- a single background signal can be calculated for the entire array, or a different background signal may be calculated for each target nucleic acid.
- background is calculated as the average hybridization signal intensity for the lowest 5% to 10% of the probes in the array, or, where a different background signal is calculated for each target gene, for the lowest 5% to 10% of the probes for each gene.
- background may be calculated as the average hybridization signal intensity produced by hybridization to probes that are not complementary to any sequence found in the sample (e.g., probes directed to nucleic acids of the opposite sense or to genes not found in the sample, such as bacterial genes where the sample is mammalian nucleic acids).
- Background can also be calculated as the average signal intensity produced by regions of the array that lack any probes at all.
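The first of these background estimates can be sketched as below. The function name, the `fraction` parameter, and the intensity values are illustrative assumptions; the same function could be applied to the whole array or to the probes of a single gene.

```python
def background_lowest_fraction(intensities, fraction=0.05):
    """Average signal of the lowest `fraction` of probes.

    `fraction` would typically be between 0.05 and 0.10, per the text.
    At least one probe is always included.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ordered = sorted(intensities)
    count = max(1, int(len(ordered) * fraction))
    lowest = ordered[:count]
    return sum(lowest) / len(lowest)

# Hypothetical probe intensities: 10, 20, ..., 200 (20 probes).
intensities = list(range(10, 210, 10))
background = background_lowest_fraction(intensities, fraction=0.10)
corrected = [max(0, value - background) for value in intensities]
```

The other two estimates (non-complementary control probes, or probe-free regions) reduce to a plain average over the relevant intensities.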
- Cis-acting is used here to refer to the regulation of gene expression by a DNA subsequence in the same DNA molecule as the target gene. Cis-acting can be exerted either by the binding of trans-acting transcriptional factors or by long range control.
- complexity is used here according to the standard meaning of this term as established by Britten et al., Methods Enzymol. 29:363 (1974). See also Cantor and Schimmel, Biophysical Chemistry: Part III, at 1228-1230, for further explanation of nucleic acid complexity.
- Hybridizing specifically to refers to the binding, duplexing, or hybridizing of a molecule substantially to or only to a particular nucleotide sequence or sequences under stringent conditions when that sequence is present in a complex mixture (e.g., total cellular) DNA or RNA.
- Introns: noncoding DNA sequences which separate neighboring coding regions. During gene transcription, introns, like exons, are transcribed into RNA but are subsequently removed by RNA splicing.
- Massive Parallel Screening refers to the simultaneous screening of at least about 100, preferably about 1000, more preferably about 10,000 and most preferably about 1,000,000 different nucleic acid hybridizations.
- mismatch control refers to a probe whose sequence is deliberately selected not to be perfectly complementary to a particular target sequence. For each mismatch (MM) control in a high-density array there typically exists a corresponding perfect match (PM) probe that is perfectly complementary to the same particular target sequence.
- the mismatch may comprise one or more bases. While the mismatch(s) may be located anywhere in the mismatch probe, terminal mismatches are less desirable as a terminal mismatch is less likely to prevent hybridization of the target sequence. In a particularly preferred embodiment, the mismatch is located at or near the center of the probe such that the mismatch is most likely to destabilize the duplex with the target sequence under the test hybridization conditions.
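Deriving a mismatch control from a perfect match probe can be sketched as a single substitution at the central position. Which base replaces the original is not specified in the text; swapping it for its complement is an assumption, as is the example 25-mer.

```python
# Assumed substitution scheme: replace the base with its complement.
SUBSTITUTE = {"A": "T", "T": "A", "C": "G", "G": "C"}

def mismatch_probe(pm_probe, position=None):
    """Return a mismatch (MM) probe that differs from the perfect match
    (PM) probe at one position, by default the central base."""
    if position is None:
        position = len(pm_probe) // 2
    replacement = SUBSTITUTE[pm_probe[position]]
    return pm_probe[:position] + replacement + pm_probe[position + 1:]

# Hypothetical 25-mer perfect match probe.
pm = "ACGT" * 6 + "A"
mm = mismatch_probe(pm)
differences = [i for i, (a, b) in enumerate(zip(pm, mm)) if a != b]
```

Placing the substitution at the center, rather than at a terminus, follows the preference stated above because a central mismatch is most likely to destabilize the duplex.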
- mRNA or transcript refers to transcripts of a gene.
- Transcripts are RNA including, for example, mature messenger RNA ready for translation and products of various stages of transcript processing. Transcript processing may include splicing, editing and degradation.
- "nucleic acid" or "nucleic acid molecule" refers to a deoxyribonucleotide or ribonucleotide polymer in either single- or double-stranded form, and unless otherwise limited, would encompass analogs of natural nucleotides that can function in a similar manner as naturally occurring nucleotides.
- An oligonucleotide is a single-stranded nucleic acid of 2 to n bases, where n may be greater than 500 to 1000.
- Nucleic acids may be cloned or synthesized using any technique known in the art. They may also include non-naturally occurring nucleotide analogs, such as those which are modified to improve hybridization, and peptide nucleic acids.
- Nucleic acid encoding a regulatory molecule: the regulatory molecule may be DNA, RNA or protein. Thus, for example, DNA sites which bind protein or other nucleic acid molecules are included within the class of regulatory molecules encoded by a nucleic acid.
- Perfect match probe refers to a probe that has a sequence that is perfectly complementary to a particular target sequence. The test probe is typically perfectly complementary to a portion (subsequence) of the target sequence.
- the perfect match (PM) probe can be a "test probe", a "normalization control" probe, an expression level control probe and the like. A perfect match control or perfect match probe is, however, distinguished from a "mismatch control" or "mismatch probe."
- a "probe" is defined as a nucleic acid, capable of binding to a target nucleic acid of complementary sequence through one or more types of chemical bonds, usually through complementary base pairing, usually through hydrogen bond formation.
- a probe may include natural (i.e. A, G, U, C, or T) or modified bases (7-deazaguanosine, inosine, etc.).
- the bases in probes may be joined by a linkage other than a phosphodiester bond, so long as it does not interfere with hybridization.
- probes may be peptide nucleic acids in which the constituent bases are joined by peptide bonds rather than phosphodiester linkages.
- Target nucleic acid refers to a nucleic acid (often derived from a biological sample), to which the probe is designed to specifically hybridize. It is either the presence or absence of the target nucleic acid that is to be detected, or the amount of the target nucleic acid that is to be quantified.
- the target nucleic acid has a sequence that is complementary to the nucleic acid sequence of the corresponding probe directed to the target.
- target nucleic acid may refer to the specific subsequence of a larger nucleic acid to which the probe is directed or to the overall sequence (e.g., gene or mRNA) whose expression level it is desired to detect. The difference in usage will be apparent from context.
- Trans-acting refers to regulation of gene expression by a product that is encoded by a gene at a remote location, usually as a result of binding to a cis-element.
- Stringent conditions refers to conditions under which a probe will hybridize to its target subsequence, but with only insubstantial hybridization to other sequences or to other sequences such that the difference may be identified. Stringent conditions are sequence-dependent and will be different in different circumstances. Longer sequences hybridize specifically at higher temperatures. Generally, stringent conditions are selected to be about 5 °C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH.
- Tm thermal melting point
- Subsequence refers to a sequence of nucleic acids that comprise a part of a longer sequence of nucleic acids.
- the Tm is the temperature, under defined ionic strength, pH, and nucleic acid concentration, at which 50% of the probes complementary to the target sequence hybridize to the target sequence at equilibrium. (As the target sequences are generally present in excess, at Tm, 50% of the probes are occupied at equilibrium.)
- stringent conditions will be those in which the salt concentration is at least about 0.01 to 1.0 M Na ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30 °C for short probes (e.g., 10 to 50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide.
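As a rough illustration of choosing a hybridization temperature about 5 °C below Tm, the sketch below estimates Tm with the Wallace rule (2 °C per A/T base, 4 °C per G/C base). The Wallace rule is an approximation swapped in here for illustration only; the text does not say how Tm is to be determined, and the rule is a coarse estimate intended for short oligonucleotides. The function names are hypothetical.

```python
def wallace_tm(probe: str) -> float:
    """Approximate Tm in degrees C using the Wallace rule (assumption, short oligos only)."""
    probe = probe.upper()
    at = probe.count("A") + probe.count("T")
    gc = probe.count("G") + probe.count("C")
    return 2.0 * at + 4.0 * gc

def stringent_temperature(probe: str, offset: float = 5.0) -> float:
    """Temperature about `offset` degrees C below the estimated Tm."""
    return wallace_tm(probe) - offset

probe = "ACGTACGTACGTACGT"  # illustrative 16-mer with 8 A/T and 8 G/C bases
print(wallace_tm(probe), stringent_temperature(probe))
```

In practice Tm also depends on ionic strength, pH and nucleic acid concentration, as noted above, none of which this simple rule takes into account.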
- Quantifying when used in the context of quantifying transcription levels of a gene can refer to absolute or to relative quantification. Absolute quantification may be accomplished by inclusion of known concentration(s) of one or more target nucleic acids (e.g. control nucleic acids such as Bio B or known amounts of the target nucleic acids themselves) and referencing the hybridization intensity of unknowns with the known target nucleic acids (e.g. through generation of a standard curve). Alternatively, relative quantification can be accomplished by comparison of hybridization signals between two or more genes, or between two or more treatments to quantify the changes in hybridization intensity and, by implication, transcription level.
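The two modes of quantification can be sketched as follows. This example assumes a linear relationship between hybridization intensity and target concentration over the calibrated range, which the text does not state; the function names and all numbers are illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def absolute_amount(intensity, slope, intercept):
    """Read an unknown concentration off the standard curve."""
    return (intensity - intercept) / slope

def relative_change(intensity_a, intensity_b):
    """Ratio of hybridization intensities between two conditions."""
    return intensity_b / intensity_a

# Standard curve from spiked control targets of known concentration (made-up values).
known_concentrations = [1.0, 2.0, 4.0, 8.0]
control_intensities = [110.0, 210.0, 410.0, 810.0]
slope, intercept = fit_line(known_concentrations, control_intensities)

print(absolute_amount(510.0, slope, intercept))  # absolute quantification
print(relative_change(200.0, 600.0))             # relative quantification
```

Absolute quantification needs the spiked standards; relative quantification needs only two intensities measured under comparable conditions.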
- Sequence identity: The "percentage of sequence identity" or "sequence identity" is determined by comparing two optimally aligned sequences or subsequences over a comparison window or span, wherein the portion of the polynucleotide sequence in the comparison window may optionally comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical subunit (e.g. nucleic acid base or amino acid residue) occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity. Percentage sequence identity when calculated using the programs GAP or BESTFIT (see below) is calculated using default gap weights.
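The percentage calculation described above can be written directly as code. This sketch assumes the two sequences have already been optimally aligned (gaps shown as "-") and that the comparison window is the full alignment; the function name and example sequences are illustrative.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Matched positions divided by total positions in the window, times 100."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    # A position counts as matched only if both sequences carry the same subunit.
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap)
    return 100.0 * matches / len(aligned_a)

# Ten aligned positions, one gap and one substitution: 8 matches out of 10.
print(percent_identity("ACGT-ACGTA", "ACGTTACCTA"))
```

Producing the optimal alignment itself is a separate step, performed by an alignment algorithm rather than by this function.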
- Optimal alignment of sequences for comparison may be conducted by the local homology algorithm of Smith and Waterman, Adv. Appl. Math. 2: 482 (1981), by the homology alignment algorithm of Needleman and Wunsch J. Mol. Biol. 48: 443 (1970), by the search for similarity method of Pearson and Lipman, Proc. Natl. Acad. Sci.
- Up-stream or down-stream gene: If the expression of a first gene is regulated by a second gene, the second gene is called an "up-stream gene" for the first gene and the first gene is the "down-stream" gene of the second gene.
- the regulation of the first gene by the second gene could be through trans-activation.
- the first gene encodes a transcriptional factor that controls the expression of the second gene.
- the regulation can also be exerted by cis-acting.
- the first gene is in the proximity of the second gene and exerts a positional effect on the expression of the second gene. In this case, the first gene does not have to be expressed in order to have an influence on the second gene.
- This invention provides methods, compositions and apparatus for interrogating the genetic network and for studying normal and abnormal functions for specific genes.
- the methods involve quantifying the level of expression of a large number of genes.
- a high density oligonucleotide array is used to hybridize with a target nucleic acid sample to detect the expression level of a large number of genes, preferably more than 10, more preferably more than 100, and most preferably more than 1000 genes.
- a variety of nucleic acid samples are prepared according to the methods of the invention to represent many states of the genetic network. By comparing the expression levels of those samples, regulatory relationships among genes can be determined with a certain statistical confidence.
- a dynamic map can be constructed based upon expression data.
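One simple way to look for candidate regulatory relationships across samples is to correlate expression profiles gene by gene. The sketch below uses the Pearson correlation coefficient as the statistic; that choice, the 0.9 threshold, and the gene names and values are assumptions for illustration, since the text does not name a specific statistical method.

```python
from itertools import combinations
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length expression profiles."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("profile has zero variance")
    return sxy / sqrt(sxx * syy)

def correlated_pairs(expression, threshold=0.9):
    """Return gene pairs whose profiles correlate with |r| >= threshold."""
    pairs = []
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if abs(r) >= threshold:
            pairs.append((g1, g2, r))
    return pairs

# Expression levels of three hypothetical genes across four samples (network states).
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 3.9, 6.2, 8.0],
    "geneC": [5.0, 1.0, 4.0, 2.0],
}
pairs = correlated_pairs(expression)
print(pairs)
```

Correlation alone does not give the direction of regulation; it only flags gene pairs worth examining further.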
- Such a genetic network map is extremely useful for drug discovery. For example, if a gene of interest is found to be associated with a particular disease, a list of potential up-stream regulatory genes can be found using such a genetic network map. Research efforts can then be concentrated on the potential up-stream genes as drug targets. Similarly, if a gene mutation causes a disease, it may affect genes that are both related and unrelated to the pathogenesis of the disease. The relationships can be explored to find the pathogenic genes. In such embodiments, the association between a disease state and the expression of a large number of genes is determined, and the genes whose expression is altered in the diseased tissue are identified. The upstream genes that regulate the altered genes are indicated as functionally altered or potentially mutated.
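Listing the potential up-stream regulators of a gene of interest amounts to a graph traversal over the network map. The sketch below assumes the map is stored as a dictionary from each regulator to the set of genes it regulates; that representation, the function name and the gene names are hypothetical.

```python
from collections import deque

def upstream_genes(regulates, target):
    """Return every gene that regulates `target` directly or through intermediates."""
    # Invert the map: for each gene, which genes regulate it?
    regulators_of = {}
    for regulator, targets in regulates.items():
        for gene in targets:
            regulators_of.setdefault(gene, set()).add(regulator)

    found = set()
    queue = deque([target])
    while queue:
        gene = queue.popleft()
        for regulator in regulators_of.get(gene, ()):
            if regulator not in found and regulator != target:
                found.add(regulator)
                queue.append(regulator)
    return found

# Toy network: TF2 regulates TF1, which regulates geneX and geneY; TF3 regulates geneZ.
network = {
    "TF1": {"geneX", "geneY"},
    "TF2": {"TF1"},
    "TF3": {"geneZ"},
}
print(sorted(upstream_genes(network, "geneX")))
```

The returned set corresponds to the list of candidate up-stream genes on which research efforts could then be concentrated.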
- the regulatory function of a particular gene can be identified by monitoring a large number of genes.
- the expression of a gene of interest is suppressed by applying antisense oligonucleotides. The expression of a large number of genes is monitored to provide an expression pattern. The expression of the gene of interest is then restored and the expression of a large number of genes is similarly monitored to provide another expression pattern. By comparing the expression patterns, the regulatory function of the gene of interest can be deduced.
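The comparison of the two expression patterns can be sketched as a per-gene fold-change filter. The two-fold threshold, the function name and the expression values are assumptions chosen for illustration; the text does not prescribe a particular comparison rule.

```python
def changed_genes(suppressed, restored, min_fold=2.0):
    """Genes whose level changes at least `min_fold` between the two patterns."""
    changes = {}
    for gene in suppressed.keys() & restored.keys():
        before, after = suppressed[gene], restored[gene]
        if before <= 0 or after <= 0:
            continue  # skip genes without a usable signal in either pattern
        fold = after / before
        if fold >= min_fold or fold <= 1.0 / min_fold:
            changes[gene] = fold
    return changes

# Hybridization signals with the gene of interest suppressed, then restored (made up).
pattern_suppressed = {"g1": 100.0, "g2": 50.0, "g3": 200.0}
pattern_restored = {"g1": 400.0, "g2": 55.0, "g3": 50.0}

result = changed_genes(pattern_suppressed, pattern_restored)
print(result)
```

Under these assumptions, genes with a fold value above 1 rise when the gene of interest is restored, and genes with a fold value below 1 fall.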
- Activity of a gene is reflected by the activity of its product(s): the proteins or other molecules encoded by the gene. Those product molecules perform biological functions. Directly measuring the activity of a gene product is, however, often difficult for certain genes. Instead, the immunological activities or the amount of the final product(s) or its peptide processing intermediates are determined as a measurement of the gene activity. More frequently, the amount or activity of intermediates, such as transcripts, RNA processing intermediates, or mature mRNAs are detected as a measurement of gene activity.
- the form and function of the final product(s) of a gene is unknown.
- the activity of a gene is measured conveniently by the amount or activity of transcript(s), RNA processing intermediate(s), mature mRNA(s) or its protein product(s) or functional activity of its protein product(s).
- any methods that measure the activity of a gene are useful for at least some embodiments of this invention.
- traditional Northern blotting and hybridization, nuclease protection, RT-PCR and differential display have been used for detecting gene activity.
- Those methods are useful for some embodiments of the invention.
- this invention is most useful in conjunction with methods for detecting the expression of a large number of genes.
- High density arrays are particularly useful for monitoring the expression control at the transcriptional, RNA processing and degradation level.
- the fabrication and application of high density arrays in gene expression monitoring have been disclosed previously in, for example, WO 97/10365, WO 92/10588, U.S. Application Ser. No. 08/772,376 filed December 23, 1996; serial number 08/529,115 filed on September 15, 1995; serial number 08/168,904 filed December 15, 1993; serial number 07/624,114 filed on December 6, 1990, serial number 07/362,901 filed June 7, 1990, all incorporated herein for all purposes by reference.
- high density oligonucleotide arrays are synthesized using methods such as the Very Large Scale Immobilized Polymer Synthesis (VLSIPS) disclosed in U.S. Pat. No. 5,445,934 incorporated herein for all purposes by reference.
- Each oligonucleotide occupies a known location on a substrate.
- a nucleic acid target sample is hybridized with a high density array of oligonucleotides and then the amount of target nucleic acids hybridized to each probe in the array is quantified.
- One preferred quantifying method is to use a confocal microscope and fluorescent labels.
- the GeneChip® system (Affymetrix, Santa Clara, CA) is particularly suitable for quantifying the hybridization; however, it will be apparent to those of skill in the art that any similar systems or other effectively equivalent detection methods can also be used.
- High density arrays are suitable for quantifying small variations in expression levels of a gene in the presence of a large population of heterogeneous nucleic acids.
- Such high density arrays can be fabricated either by de novo synthesis on a substrate or by spotting or transporting nucleic acid sequences onto specific locations of substrate.
- Nucleic acids are purified and/or isolated from biological materials, such as a bacterial plasmid containing a cloned segment of sequence of interest. Suitable nucleic acids are also produced by amplification of templates. As a nonlimiting illustration, polymerase chain reaction, and/or in vitro transcription, are suitable nucleic acid amplification methods.
- Oligonucleotide arrays are particularly preferred for this invention. Oligonucleotide arrays have numerous advantages, as opposed to other methods, such as efficiency of production, reduced intra- and inter array variability, increased information content and high signal-to-noise ratio.
- Preferred high density arrays for gene function identification and genetic network mapping comprise greater than about 100, preferably greater than about 1000, more preferably greater than about 16,000 and most preferably greater than 65,000 or 250,000 or even greater than about 1,000,000 different oligonucleotide probes, preferably in less than 1 cm² of surface area.
- the oligonucleotide probes range from about 5 to about 50 or about 500 nucleotides, more preferably from about 10 to about 40 nucleotides and most preferably from about 15 to about 40 nucleotides in length.
- the first level of regulation is at the level of transcription, i.e., by varying the frequency with which a gene is transcribed into nascent pre-mRNA by a RNA polymerase.
- the regulation of transcription is one of the most important steps in the control of gene expression because transcription constitutes the input of the mRNA pool. It is generally known in the art that transcriptional regulation can be achieved through various means.
- transcription can be controlled by a) cis-acting transcriptional control sequences and transcriptional factors; b) different gene products from a single transcription unit; c) epigenetic mechanisms; and d) long range control of genetic expression by chromatin structure.
- the current invention provides methods for detecting the transcriptional regulation of individual genes at all of these levels of control.
- a) cis-acting transcriptional control sequences, transcriptional factors and measurement of transcription rate
- One level of transcriptional control is through the binding of transcriptional factors to the cis-acting transcriptional control sequences.
- Promoters are a class of cis-acting elements usually located immediately up-stream (often within 200 bp) of the transcriptional start sites. Promoters (TATA box, CCAAT box, GC box, etc.) are often recognized by ubiquitous transcriptional factors.
- promoters may be involved in the control of tissue-specific expression through the binding of tissue specific transcriptional factors.
- cis-acting elements are the response elements (REs).
- Those elements are typically found in genes whose expression is responsive to the presence of signaling molecules such as growth factors, hormones, and secondary messengers.
- Such elements include, but are not limited to, cAMP REs, retinoic acid REs, growth factor REs, and glucocorticoid REs.
- Enhancers and repressors are yet another class of the cis-acting elements. Those elements have a positive or negative effect on transcription and their functions are generally independent of their orientation in the gene.
- Transcriptional factors are proteins that recognize and bind cis-acting transcriptional elements. Often, but not always, those factors contain two domains, a DNA-binding domain and an activation domain.
- the DNA-binding domain typically contains the leucine zipper motif, the helix-loop-helix motif, helix-turn-helix motif, and/or the zinc finger motif.
- Transcriptional factors are encoded by their own genes. Therefore, the expression level of transcriptional factors may affect the expression of other genes. Those trans-acting factors are an integral part of the genetic network.
- the expression of transcriptional factors is monitored by the use of a high density array.
- the expression of transcriptional factors is monitored at the protein level by the use of two dimensional gel electrophoresis, mass spectrometry or immunological methods.
- nuclei are isolated from cells of interest. Isolated nuclei are incubated with labeled nucleotides for a period of time. Transcripts are then hybridized with probes. In some preferred embodiments, transcripts are quantified with a high density nucleic acid array.
- b) different gene products from a single transcription unit
- a transcriptional unit is a continuous segment of DNA that is transcribed into RNA
- bacteria can continuously transcribe several contiguous genes to make polycistronic mRNAs.
- the contiguous genes are from the same transcriptional unit. It is well known in the art that higher organisms also use several mechanisms to make a variety of different gene products from a single transcriptional unit.
- alternative promoters: Many genes are known to have several alternative promoters, the use of each promoter resulting in one particular transcript.
- the use of alternative promoters is frequently employed to regulate tissue specific gene expression.
- the human dystrophin gene has at least seven promoters. The most 5' upstream promoter is used to transcribe a brain specific transcript; a promoter 100 kb down-stream from the first promoter is used to transcribe a muscle specific transcript and a promoter 100 kb downstream of the second promoter is used to transcribe a Purkinje cell specific transcript.
- the use of alternative promoters is part of the gene network control mechanism. In several embodiments of the invention, the use of alternative promoters can be monitored and mapped to resolve its regulatory relationship among genes.
- a high density oligonucleotide array is used to monitor the use of specific promoters by measuring the amount of transcripts resulting from each of the alternative promoters.
- probes are designed to be specific for each of the exons that are alternatively used.
- a high density oligonucleotide array is particularly useful for this purpose because of the flexibility of probe design.
- other methods such as DNA arrays, RT-PCR, differential display, optical oligonucleotide sensors, can also be used to monitor the alternative use of promoters.
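Measuring promoter usage from exon-specific probes can be sketched as follows. The example assumes each probe is assigned to exactly one promoter-specific exon and that the mean probe signal is an adequate summary per promoter; the probe identifiers, promoter labels and signal values are hypothetical.

```python
def promoter_usage(probe_signals, probe_to_promoter):
    """Fraction of total transcript signal attributed to each alternative promoter."""
    totals, counts = {}, {}
    for probe_id, signal in probe_signals.items():
        promoter = probe_to_promoter[probe_id]
        totals[promoter] = totals.get(promoter, 0.0) + signal
        counts[promoter] = counts.get(promoter, 0) + 1

    # Average the probes that report on the same promoter-specific exon.
    means = {p: totals[p] / counts[p] for p in totals}
    grand_total = sum(means.values())
    return {p: m / grand_total for p, m in means.items()}

probe_signals = {"p1": 300.0, "p2": 500.0, "p3": 100.0, "p4": 100.0}
probe_to_promoter = {"p1": "brain", "p2": "brain", "p3": "muscle", "p4": "muscle"}

usage = promoter_usage(probe_signals, probe_to_promoter)
print(usage)
```

Comparing these fractions across tissues or treatments gives a simple readout of which promoter is in use under each condition.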
- RNA splicing is the most common method of RNA processing. Nascent pre-mRNAs are cut and pasted by specialized apparatus called spliceosomes. Some non-coding regions transcribed from the intron regions are excised. Exons are linked to form a contiguous coding region ready for translation.
- a single type of nascent pre-mRNA is used to generate multiple types of mature RNA by a process called alternative splicing in which exons are alternatively used to form different mature mRNAs which code for different proteins.
- the human Calcitonin gene (CALC) is spliced as calcitonin, a circulating Ca²⁺ homeostatic hormone, in the thyroid; and as calcitonin gene-related peptide (CGRP), a neuromodulatory and trophic factor, in the hypothalamus (See, Hodges and Bernstein, 1994, Adv. Genet., 31, 207-281).
- This diversity of gene product is achieved by a combination of alternative splicing and alternative adenylation. Regulation of the alternative splicing and adenylation is a part of the genetic network. In some embodiments of the invention, alternative splicing is monitored. Many methods are suitable for detecting alternative splicing and adenylation. High density oligonucleotide arrays are particularly suitable for this purpose because of their design flexibility. Oligonucleotide probes against specific sequence diversity can be readily synthesized and used to detect the level of each of the sequences produced by alternative splicing and adenylation.
- RNA editing is another form of post-transcriptional processing.
- certain genes such as the Wilms' tumor susceptibility gene (WT1), apolipoprotein (APOB) gene, and glutamate receptor gene undergo C->U or U->C substitution editing events (See, Scott, 1995, Cell, 81, 833-836).
- the human APOB gene encodes a 4536 amino acid product.
- the same gene encodes a 2152 amino acid product.
- the smaller product is due to the addition of a stop codon during RNA editing.
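How a single C->U substitution can shorten the encoded product is easy to show on a toy sequence. The mRNA below is invented for illustration and is not the APOB sequence; the codon table is deliberately partial and covers only the codons used here.

```python
# Partial codon table (assumption: only the codons appearing in the toy mRNA).
CODON_TABLE = {"AUG": "M", "GCU": "A", "CAA": "Q", "GGA": "G", "UAA": "*"}

def edit_c_to_u(mrna: str, position: int) -> str:
    """Apply a C->U substitution editing event at a 0-based position."""
    if mrna[position] != "C":
        raise ValueError("no C at the requested position")
    return mrna[:position] + "U" + mrna[position + 1:]

def translate(mrna: str) -> str:
    """Translate codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)

unedited = "AUGGCUCAAGGAUAA"          # AUG GCU CAA GGA UAA
edited = edit_c_to_u(unedited, 6)     # CAA (Gln) becomes UAA (stop)

print(translate(unedited), translate(edited))
```

Because the edited and unedited transcripts differ in sequence, probes specific to each form could in principle distinguish them on an array.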
- genomic imprinting can be monitored by measuring the level of transcripts of specific sequence.
- Long range control of gene expression provides additional means for one gene to interact with another in expression. Competition for enhancers or silencers, position effects, chromatin domains and X-inactivation are mechanisms for a region of DNA or a gene to exert its control over the expression of other genes without expressing itself.
- Long range control of gene expression may increase the complexity of analyzing correlated gene expression data. For example, expression of certain genes may be correlated because of their proximity, not because they are under the control of the expressed product of a common gene. Knowledge regarding the position of genes is useful in analyzing such data.
- a hidden variable, i.e., a variable that is not measurable by using expression monitoring
- Long range control effects can be inferred by a consistent positional effect, i.e., a correlation between expression and proximity of genes.
- Direct measurement of long range control over gene expression can be carried out by combining gene expression monitoring and traditional methods for identifying such control. Traditional methods are described in, for example, Strachan and Read, Human Molecular Genetics, 1996, incorporated herein by reference for all purposes.
- the current invention is based upon the regulatory relationships among genes. It is well known in the art that gene expression is regulated at transcription, RNA processing, RNA degradation, translation, and protein processing levels. Some of the specific preferred embodiments of the invention monitor expression control at the transcription, RNA processing and degradation level. Those embodiments are described in detail to illustrate the methods of the invention. It would be apparent to those of ordinary skill in the art that monitoring translation and protein processing can be similarly used for the current invention.
- antibodies are used to detect the amount of protein products using procedures such as Western blotting and immunocytochemistry. Other immunological methods can also be used. Traditional polyclonal or monoclonal antibodies are useful.
- Genetic engineering methods such as the phage display technology described, for example, in Strachan and Read, Human Molecular Genetics, 1996, incorporated previously for all purposes by reference, are particularly preferred in some embodiments to obtain a large number of antibodies for monitoring the expression of a large number of genes.
- nucleic acid array methods for monitoring gene expression are disclosed and discussed in detail in PCT Application WO 92/10588 (published on June 25, 1992), all incorporated herein by reference for all purposes.
- those methods of monitoring gene expression involve (a) providing a pool of target nucleic acids comprising RNA transcript(s) of one or more target gene(s), or nucleic acids derived from the RNA transcript(s); (b) hybridizing the nucleic acid sample to a high density array of probes and (c) detecting the hybridized nucleic acids and calculating a relative and/or absolute expression (transcription, RNA processing or degradation) level.
- nucleic acid samples containing target nucleic acid sequences that reflect the transcripts of interest may contain transcripts of interest.
- suitable nucleic acid samples may contain nucleic acids derived from the transcripts of interest.
- a nucleic acid derived from a transcript refers to a nucleic acid for whose synthesis the mRNA transcript or a subsequence thereof has ultimately served as a template.
- a cDNA reverse transcribed from a transcript, an RNA transcribed from that cDNA, a DNA amplified from the cDNA, an RNA transcribed from the amplified DNA, etc. are all derived from the transcript and detection of such derived products is indicative of the presence and/or abundance of the original transcript in a sample.
- suitable samples include, but are not limited to, transcripts of the gene or genes, cDNA reverse transcribed from the transcript, cRNA transcribed from the cDNA, DNA amplified from the genes, RNA transcribed from amplified DNA, and the like.
- Transcripts may include, but are not limited to, pre-mRNA nascent transcript(s), transcript processing intermediates, mature mRNA(s) and degradation products. It is not necessary to monitor all types of transcripts to practice this invention. For example, one may choose to practice the invention to measure the mature mRNA levels only.
- such a sample is a homogenate of cells or tissues or other biological samples.
- such a sample is a total RNA preparation of a biological sample.
- a nucleic acid sample is the total mRNA isolated from a biological sample.
- the total mRNA prepared with most methods includes not only the mature mRNA, but also the RNA processing intermediates and nascent pre-mRNA transcripts.
- total mRNA purified with a poly(dT) column contains RNA molecules with poly(A) tails. Those poly(A)+ RNA molecules could be mature mRNA, RNA processing intermediates, nascent transcripts or degradation intermediates.
- Biological samples may be of any biological tissue or fluid or cells from any organism. Frequently the sample will be a "clinical sample" which is a sample derived from a patient. Clinical samples provide a rich source of information regarding the various states of genetic network or gene expression. Some embodiments of the invention are employed to detect mutations and to identify the phenotype of mutations. Such embodiments have extensive applications in clinical diagnostics and clinical studies. Typical clinical samples include, but are not limited to, sputum, blood, blood cells (e.g., white cells), tissue or fine needle biopsy samples, urine, peritoneal fluid, and pleural fluid, or cells therefrom. Biological samples may also include sections of tissues, such as frozen sections or formalin fixed sections taken for histological purposes. Another typical source of biological samples is cell cultures, where gene expression states can be manipulated to explore the relationship among genes. In one aspect of the invention, methods are provided to generate biological samples reflecting a wide variety of states of the genetic network.
- RNase is inhibited or destroyed by heat treatment followed by proteinase treatment.
- the total RNA is isolated from a given sample using, for example, an acid guanidinium-phenol-chloroform extraction method and poly(A)+ mRNA is isolated by oligo(dT) column chromatography or by using (dT) on magnetic beads (see, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual (2nd ed.), Vols. 1-3, Cold Spring Harbor Laboratory, (1989), or Current Protocols in Molecular Biology, F. Ausubel et al., ed., Greene Publishing and Wiley-Interscience, New York (1987)).
- Quantitative amplification involves simultaneously co-amplifying a known quantity of a control sequence using the same primers. This provides an internal standard that may be used to calibrate the PCR reaction.
- the high density array may then include probes specific to the internal standard for quantification of the amplified nucleic acid.
- One preferred internal standard is a synthetic AW106 cRNA.
- the AW106 cRNA is combined with RNA isolated from the sample according to standard techniques known to those of skill in the art.
- the RNA is then reverse transcribed using a reverse transcriptase to provide copy DNA.
- the cDNA sequences are then amplified (e.g., by PCR) using labeled primers.
- the amplification products are separated, typically by electrophoresis, and the amount of radioactivity (proportional to the amount of amplified product) is determined.
- the amount of mRNA in the sample is then calculated by comparison with the signal produced by the known AW106 RNA standard.
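The final comparison against the internal standard reduces to a proportion. The sketch below assumes the sample and the co-amplified standard amplify with the same efficiency and that signal is proportional to product, which is the premise of using an internal standard but is not guaranteed in practice; the function name, units and numbers are illustrative.

```python
def estimate_mrna_amount(sample_signal, standard_signal, standard_amount):
    """Scale the known standard amount by the ratio of sample to standard signal."""
    if standard_signal <= 0:
        raise ValueError("standard signal must be positive")
    return standard_amount * sample_signal / standard_signal

# Made-up readings: the standard was spiked at 2.0 units and gave a signal of 500.
amount = estimate_mrna_amount(sample_signal=1500.0,
                              standard_signal=500.0,
                              standard_amount=2.0)
print(amount)
```

The result is expressed in the same units as the amount of standard added to the sample.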
- Detailed protocols for quantitative PCR are provided in PCR Protocols, A Guide to Methods and Applications, Innis et al., Academic Press, Inc. N.Y., (1990).
- PCR polymerase chain reaction
- LCR ligase chain reaction
- RT-PCR typically incorporates preliminary steps to isolate total RNA or mRNA for subsequent use as an amplification template.
- a one-tube mRNA capture method may be used to prepare poly(A)+ RNA samples suitable for immediate RT-PCR in the same tube (Boehringer Mannheim). The captured mRNA can be directly subjected to RT-PCR by adding a reverse transcription mix and, subsequently, a PCR mix.
- the sample mRNA is reverse transcribed with a reverse transcriptase and a primer consisting of oligo(dT) and a sequence encoding the phage T7 promoter to provide single stranded DNA template.
- the second DNA strand is polymerized using a DNA polymerase.
- T7 RNA polymerase is added and RNA is transcribed from the cDNA template. Successive rounds of transcription from each single cDNA template result in amplified RNA.
- Methods of in vitro polymerization are well known to those of skill in the art (see, e.g., Sambrook, supra.) and this particular method is described in detail by Van Gelder et al., Proc.
- the oligonucleotide probes provided in the array are chosen to be complementary to subsequences of the antisense nucleic acids.
- the target nucleic acid pool is a pool of sense nucleic acids
- the oligonucleotide probes are selected to be complementary to subsequences of the sense nucleic acids.
- the probes may be of either sense as the target nucleic acids include both sense and antisense strands.
- the protocols cited above include methods of generating pools of either sense or antisense nucleic acids. Indeed, one approach can be used to generate either sense or antisense nucleic acids as desired.
- the cDNA can be directionally cloned into a vector (e.g., Stratagene's pBluescript II KS (+) phagemid) such that it is flanked by the T3 and T7 promoters. In vitro transcription with the T3 polymerase will produce RNA of one sense (the sense depending on the orientation of the insert), while in vitro transcription with the T7 polymerase will produce RNA having the opposite sense.
- Other suitable cloning systems include phage lambda vectors designed for Cre-loxP plasmid subcloning (see e.g., Palazzolo et al., Gene, 88: 25-36 (1990)).
- the high density array will typically include a number of probes that specifically hybridize to the sequences of interest.
- the array will include one or more control probes.
- test probes could be oligonucleotides that range from about 5 to about 45 or 5 to about 500 nucleotides, more preferably from about 10 to about 40 nucleotides and most preferably from about 15 to about 40 nucleotides in length. In other particularly preferred embodiments the probes are 20 or 25 nucleotides in length. In other preferred embodiments, test probes are double or single strand DNA sequences. DNA sequences are isolated or cloned from natural sources or amplified from natural sources using natural nucleic acids as templates. These probes have sequences complementary to particular subsequences of the genes whose expression they are designed to detect. Thus, the test probes are capable of specifically hybridizing to the target nucleic acid they are to detect.
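Selecting test probes that are complementary to subsequences of a target can be sketched as a sliding window over the target sequence. The 25-nucleotide length follows one of the preferred lengths mentioned above; the step size, the function names and the target sequence are assumptions for illustration, and no filtering for cross-hybridization or secondary structure is attempted.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def tile_probes(target: str, length: int = 25, step: int = 1):
    """Probes complementary to every `length`-nt window of the target."""
    last_start = len(target) - length
    return [reverse_complement(target[i:i + length])
            for i in range(0, last_start + 1, step)]

target = "ATGGCTAGCTAGGATCCGATCGTACGATCG"  # illustrative 30-nt target
probes = tile_probes(target)
print(len(probes), probes[0])
```

Whether probes should match the sense or the antisense strand depends on which strand the labeled target pool represents, as discussed above.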
- the high density array can contain a number of control probes.
- the control probes fall into three categories referred to herein as 1) normalization controls; 2) expression level controls; and 3) mismatch controls.
- Normalization controls are oligonucleotide or other nucleic acid probes that are complementary to labeled reference oligonucleotides or other nucleic acid sequences that are added to the nucleic acid sample.
- the signals obtained from the normalization controls after hybridization provide a control for variations in hybridization conditions, label intensity, "reading" efficiency and other factors that may cause the signal of a perfect hybridization to vary between arrays.
- signals (e.g., fluorescence intensity) read from all other probes in the array are divided by the signal (e.g., fluorescence intensity) from the control probes thereby normalizing the measurements.
- Virtually any probe may serve as a normalization control.
- Preferred normalization probes are selected to reflect the average length of the other probes present in the array; however, they can be selected to cover a range of lengths.
- the normalization control(s) can also be selected to reflect the (average) base composition of the other probes in the array; however, in a preferred embodiment, only one or a few normalization probes are used and they are selected such that they hybridize well (i.e., no secondary structure) and do not match any target-specific probes.
- Expression level controls are probes that hybridize specifically with constitutively expressed genes in the biological sample. Virtually any constitutively expressed gene provides a suitable target for expression level controls. Typically expression level control probes have sequences complementary to subsequences of constitutively expressed "housekeeping genes" including, but not limited to, the β-actin gene, the transferrin receptor gene, the GAPDH gene, and the like.
- Mismatch controls may also be provided for the probes to the target genes, for expression level controls or for normalization controls.
- Mismatch controls are oligonucleotide probes or other nucleic acid probes identical to their corresponding test or control probes except for the presence of one or more mismatched bases.
- a mismatched base is a base selected so that it is not complementary to the corresponding base in the target sequence to which the probe would otherwise specifically hybridize.
- One or more mismatches are selected such that under appropriate hybridization conditions (e.g. stringent conditions) the test or control probe would be expected to hybridize with its target sequence, but the mismatch probe would not hybridize (or would hybridize to a significantly lesser extent).
- Preferred mismatch probes contain a central mismatch.
- a corresponding mismatch probe will have the identical sequence except for a single base mismatch (e.g., substituting a G, a C or a T for an A) at any of positions 6 through 14 (the central mismatch).
- Mismatch probes thus provide a control for non-specific binding or cross-hybridization to a nucleic acid in the sample other than the target to which the probe is directed. Mismatch probes thus indicate whether a hybridization is specific or not. For example, if the target is present the perfect match probes should be consistently brighter than the mismatch probes. In addition, if all central mismatches are present, the mismatch probes can be used to detect a mutation. The difference in intensity between the perfect match and the mismatch probe (I(PM)-I(MM)) provides a good measure of the concentration of the hybridized material.
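The relationship between a perfect match probe and its central mismatch control can be sketched as follows. This is a minimal illustration, not the method of the disclosure: the function names are hypothetical, and the choice of substituting the complementary base at the central position is an assumption made so that the new base cannot pair with the target at that position.

```python
from typing import Optional

# Assumption: the mismatch base is the complement of the original probe base,
# so it cannot pair with the target base at that position.
SWAP = {"A": "T", "T": "A", "G": "C", "C": "G"}


def mismatch_probe(perfect_match: str, position: Optional[int] = None) -> str:
    """Return the probe with a single substituted base (default: central base)."""
    seq = perfect_match.upper()
    if position is None:
        position = len(seq) // 2  # 0-based index of the central base
    return seq[:position] + SWAP[seq[position]] + seq[position + 1:]


def specific_signal(i_pm: float, i_mm: float) -> float:
    """I(PM) - I(MM): the intensity difference attributed to specific binding."""
    return i_pm - i_mm
```

For a 20-mer, the default position is the eleventh base, which lies within positions 6 through 14.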
- the high density array may also include sample preparation/amplification control probes. These are probes that are complementary to subsequences of control genes selected because they do not normally occur in the nucleic acids of the particular biological sample being assayed. Suitable sample preparation/amplification control probes include, for example, probes to bacterial genes (e.g., Bio B) where the sample in question is a biological sample from a eukaryote.
- RNA sample is then spiked with a known amount of the nucleic acid to which the sample preparation amplification control probe is directed before processing. Quantification of the hybridization of the sample preparation/amplification control probe then provides a measure of alteration in the abundance of the nucleic acids caused by processing steps (e.g. PCR, reverse transcription, in vitro transcription, etc.).
- oligonucleotide probes in the high density array are selected to bind specifically to the nucleic acid target to which they are directed with minimal non-specific binding or cross-hybridization under the particular hybridization conditions utilized.
- Because the high density arrays of this invention can contain in excess of 1,000,000 different probes, it is possible to provide every probe of a characteristic length that binds to a particular nucleic acid sequence.
- the high density array can contain every possible 20-mer sequence complementary to an IL-2 mRNA.
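Enumerating every probe of a given length that is complementary to an mRNA amounts to sliding a window along the transcript and taking the reverse complement of each window. The sketch below assumes DNA probes and uses hypothetical function names; it is an illustration only.

```python
# Complement table for building DNA probes against an RNA (or DNA) target.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the DNA strand that pairs with the given sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def tile_probes(mrna: str, k: int = 20) -> list:
    """Return one probe per k-long window of the mRNA, in 5'-to-3' window order."""
    return [reverse_complement(mrna[i:i + k]) for i in range(len(mrna) - k + 1)]
```

A transcript of length n yields n - k + 1 probes, so a 1,000-base mRNA gives 981 candidate 20-mers before any are discarded for poor specificity or hybridization efficiency.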
- probes directed to these subsequences are expected to cross-hybridize with occurrences of their complementary sequence in other regions of the sample genome.
- other probes simply may not hybridize effectively under the hybridization conditions (e.g., due to secondary structure, or interactions with the substrate or other probes).
- the probes that show such poor specificity or hybridization efficiency are identified and may not be included either in the high density array itself (e.g., during fabrication of the array) or in the post-hybridization data analysis.
- expression monitoring arrays are used to identify the presence and expression (transcription) level of genes which are several hundred base pairs long. For most applications it would be useful to identify the presence, absence, or expression level of several thousand to one hundred thousand genes. Because the number of oligonucleotides per array is limited in a preferred embodiment, it is desired to include only a limited set of probes specific to each gene whose expression is to be detected.
- probes as short as 15, 20, or 25 nucleotides are sufficient to hybridize to a subsequence of a gene and, for most genes, there is a set of probes that performs well across a wide range of target nucleic acid concentrations. In a preferred embodiment, it is desirable to choose a preferred or "optimum" subset of probes for each gene before synthesizing the high density array.
- oligonucleotide analogue array can be synthesized on a solid substrate by a variety of methods, including, but not limited to, light-directed chemical coupling, and mechanically directed coupling. See Pirrung et al, U.S. Patent No. 5,143,854 (see also PCT Application No. WO 90/15070) and Fodor et al, PCT Publication Nos. WO 92/10092 and WO 93/09668 and US Ser. No.
- a glass surface is derivatized with a silane reagent containing a functional group, e.g., a hydroxyl or amine group blocked by a photolabile protecting group.
- Photolysis through a photolithographic mask is used selectively to expose functional groups which are then ready to react with incoming 5'-photoprotected nucleoside phosphoramidites.
- the phosphoramidites react only with those sites which are illuminated (and thus exposed by removal of the photolabile blocking group).
- the phosphoramidites only add to those areas selectively exposed from the preceding step. These steps are repeated until the desired array of sequences has been synthesized on the solid surface. Combinatorial synthesis of different oligonucleotide analogues at different locations on the array is determined by the pattern of illumination during synthesis and the order of addition of coupling reagents.
- Peptide nucleic acids are commercially available from, e.g., Biosearch, Inc. (Bedford, MA) which comprise a polyamide backbone and the bases found in naturally occurring nucleosides. Peptide nucleic acids are capable of binding to nucleic acids with high specificity, and are considered "oligonucleotide analogues" for purposes of this disclosure.
- a typical "flow channel" method applied to the compounds and libraries of the present invention can generally be described as follows. Diverse polymer sequences are synthesized at selected regions of a substrate or solid support by forming flow channels on a surface of the substrate through which appropriate reagents flow or in which appropriate reagents are placed. For example, assume a monomer "A" is to be bound to the substrate in a first group of selected regions. If necessary, all or part of the surface of the substrate in all or a part of the selected regions is activated for binding by, for example, flowing appropriate reagents through all or some of the channels, or by washing the entire substrate with appropriate reagents.
- a reagent having the monomer A flows through or is placed in all or some of the channel(s).
- the channels provide fluid contact to the first selected regions, thereby binding the monomer A on the substrate directly or indirectly (via a spacer) in the first selected regions.
- a monomer B is coupled to second selected regions, some of which may be included among the first selected regions.
- the second selected regions will be in fluid contact with a second flow channel(s) through translation, rotation, or replacement of the channel block on the surface of the substrate; through opening or closing a selected valve; or through deposition of a layer of chemical or photoresist.
- a step is performed for activating at least the second regions.
- the monomer B is flowed through or placed in the second flow channel(s), binding monomer B at the second selected locations.
- the resulting sequences bound to the substrate at this stage of processing will be, for example, A, B, and AB. The process is repeated to form a vast array of sequences of desired length at known locations on the substrate.
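The two coupling steps above can be modeled as successive updates to a map from region to bound sequence. This is a toy sketch; the region names and the `couple` helper are hypothetical and stand in for the physical flow channels.

```python
def couple(substrate: dict, monomer: str, regions: list) -> dict:
    """Append a monomer to the sequence bound at each selected region."""
    for region in regions:
        substrate[region] = substrate.get(region, "") + monomer
    return substrate


# First selected regions receive monomer A; second selected regions, which
# overlap the first at r2, receive monomer B.
substrate = {}
couple(substrate, "A", ["r1", "r2"])
couple(substrate, "B", ["r2", "r3"])
```

After both steps the substrate carries A at r1, AB at r2, and B at r3, matching the sequences A, B, and AB named above.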
- monomer A can be flowed through some of the channels, monomer B can be flowed through other channels, a monomer C can be flowed through still other channels, etc.
- many or all of the reaction regions are reacted with a monomer before the channel block must be moved or the substrate must be washed and/or reactivated.
- the number of washing and activation steps can be minimized.
- a protective coating such as a hydrophilic or hydrophobic coating (depending upon the nature of the solvent) is utilized over portions of the substrate to be protected, sometimes in combination with materials that facilitate wetting by the reactant solution in other regions. In this manner, the flowing solutions are further prevented from passing outside of their designated flow paths.
- High density nucleic acid arrays can be fabricated by depositing presynthesized or natural nucleic acids in predefined positions. Synthesized or natural nucleic acids are deposited on specific locations of a substrate by light directed targeting and oligonucleotide directed targeting. Nucleic acids can also be directed to specific locations in much the same manner as the flow channel methods. For example, a nucleic acid A can be delivered to and coupled with a first group of reaction regions which have been appropriately activated. Thereafter, a nucleic acid B can be delivered to and reacted with a second group of activated reaction regions. Nucleic acids are deposited in selected regions.
- a dispenser that moves from region to region to deposit nucleic acids in specific spots
- Typical dispensers include a micropipette or capillary pin to deliver nucleic acid to the substrate and a robotic system to control the position of the micropipette with respect to the substrate.
- the dispenser includes a series of tubes, a manifold, an array of pipettes or capillary pins, or the like so that various reagents can be delivered to the reaction regions simultaneously.
- Nucleic acid hybridization simply involves contacting a probe and target nucleic acid under conditions where the probe and its complementary target can form stable hybrid duplexes through complementary base pairing. The nucleic acids that do not form hybrid duplexes are then washed away, leaving the hybridized nucleic acids to be detected, typically through detection of an attached detectable label.
- nucleic acids are denatured by increasing the temperature or decreasing the salt concentration of the buffer containing the nucleic acids
- low stringency conditions (e.g., low temperature and/or high salt)
- hybrid duplexes (e.g., DNA:DNA, RNA:RNA, or RNA:DNA)
- specificity of hybridization is reduced at lower stringency
- higher stringency (e.g., higher temperature or lower salt)
- hybridization conditions may be selected to provide any degree of stringency.
- hybridization is performed at low stringency, in this case in 6X SSPE-T at 37°C (0.005% Triton X-100), to ensure hybridization, and then subsequent washes are performed at higher stringency (e.g., 1X SSPE-T at 37°C) to eliminate mismatched hybrid duplexes. Successive washes may be performed at increasingly higher stringency (e.g., down to as low as 0.25X SSPE-T at 37°C to 50°C) until a desired level of hybridization specificity is obtained. Stringency can also be increased by addition of agents such as formamide. Hybridization specificity may be evaluated by comparison of hybridization to the test probes with hybridization to the various controls that can be present (e.g., expression level control, normalization control, mismatch controls, etc.).
- the wash is performed at the highest stringency that produces consistent results and that provides a signal intensity greater than approximately 10% of the background intensity.
- the hybridized array may be washed at successively higher stringency solutions and read between each wash. Analysis of the data sets thus produced will reveal a wash stringency above which the hybridization pattern is not appreciably altered and which provides adequate signal for the particular oligonucleotide probes of interest.
- background signal is reduced by the use of a detergent (e.g., C-TAB) or a blocking reagent (e.g., sperm DNA, cot-1 DNA, etc.) during the hybridization to reduce non-specific binding.
- the hybridization is performed in the presence of about 0.5 mg/ml DNA (e.g., herring sperm DNA).
- the use of blocking agents in hybridization is well known to those of skill in the art (see, e.g., Chapter 8 in P. Tijssen, supra.)
- The stability of duplexes formed between RNAs or DNAs is generally in the order RNA:RNA > RNA:DNA > DNA:DNA, in solution.
- Long probes have better duplex stability with a target, but poorer mismatch discrimination than shorter probes (mismatch discrimination refers to the measured hybridization signal ratio between a perfect match probe and a single base mismatch probe).
- Shorter probes (e.g., 8-mers) discriminate mismatches very well, but the overall duplex stability is low.
- thermal stability (Tm)
- A-T duplexes have a lower Tm than guanine-cytosine (G-C) duplexes, due in part to the fact that the A-T duplexes have 2 hydrogen bonds per base pair, while the G-C duplexes have 3 hydrogen bonds per base pair.
- oligonucleotide arrays in which there is a non-uniform distribution of bases, it is not generally possible to optimize hybridization for each oligonucleotide probe simultaneously.
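The effect of base composition on Tm can be illustrated with the Wallace rule, a common rule of thumb for short oligonucleotides that is not taken from this disclosure: roughly 2°C per A or T and 4°C per G or C. The sketch below is only an approximation, and the function name is hypothetical.

```python
def wallace_tm(probe: str) -> int:
    """Rough Tm estimate in degrees C for a short oligonucleotide (Wallace rule).

    Assumption: this rule of thumb ignores salt, probe concentration, and
    nearest-neighbor effects, so it is useful only for comparing probes.
    """
    seq = probe.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc
```

Under this estimate an all-A-T 8-mer and an all-G-C 8-mer differ by 16°C, which is why a single hybridization temperature cannot be optimal for every probe on an array with a non-uniform base distribution.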
- TMACl salt (tetramethylammonium chloride)
- Altered duplex stability conferred by using oligonucleotide analogue probes can be ascertained by following, e.g., fluorescence signal intensity of oligonucleotide analogue arrays hybridized with a target oligonucleotide over time.
- the data allow optimization of specific hybridization conditions at, e.g., room temperature (for simplified diagnostic applications in the future).
- Another way of verifying altered duplex stability is by following the signal intensity generated upon hybridization with time. Previous experiments using DNA targets and DNA chips have shown that signal intensity increases with time, and that the more stable duplexes generate higher signal intensities faster than less stable duplexes. The signals reach a plateau or "saturate" after a certain amount of time due to all of the binding sites becoming occupied. These data allow for optimization of hybridization, and determination of the best conditions at a specified temperature.
- the hybridized nucleic acids are detected by detecting one or more labels attached to the sample nucleic acids.
- the labels may be incorporated by any of a number of means well known to those of skill in the art. However, in a preferred embodiment, the label is simultaneously incorporated during the amplification step in the preparation of the sample nucleic acids.
- polymerase chain reaction (PCR)
- transcription amplification as described above, using a labeled nucleotide (e.g. fluorescein-labeled UTP and/or CTP) incorporates a label into the transcribed nucleic acids.
- a label may be added directly to the original nucleic acid sample (e.g., mRNA, polyA mRNA, cDNA, etc.) or to the amplification product after the amplification is completed
- Means of attaching labels to nucleic acids are well known to those of skill in the art and include, for example nick translation or end-labeling (e.g. with a labeled RNA) by kinasing of the nucleic acid and subsequent attachment (ligation) of a nucleic acid linker joining the sample nucleic acid to a label (e.g., a fluorophore)
- Detectable labels suitable for use in the present invention include any composition detectable by spectroscopic, photochemical, biochemical, immunochemical, electrical, optical or chemical means
- Useful labels in the present invention include biotin for staining with labeled streptavidin conjugate, magnetic beads (e.g., Dynabeads™), fluorescent dyes (e.g., fluorescein, texas red, rhodamine, green fluorescent protein, and the like), radiolabels (e.g., ³H, ¹²⁵I, ³⁵S, ¹⁴C, or ³²P), enzymes (e.g., horse radish peroxidase, alkaline phosphatase and others commonly used in an ELISA), and colorimetric labels such as colloidal gold or colored glass or plastic (e.g., polystyrene, polypropylene, latex, etc.) beads
- Patents teaching the use of such labels include U S Patent Nos 3,817,837, 3,850,752, 3,9
- radiolabels may be detected using photographic film or scintillation counters
- fluorescent markers may be detected using a photodetector to detect emitted light
- Enzymatic labels are typically detected by providing the enzyme with a substrate and detecting the reaction product produced by the action of the enzyme on the substrate, and colorimetric labels are detected by simply visualizing the colored label
- colloidal gold label that can be detected by measuring scattered light
- the label may be added to the target (sample) nucleic acid(s) prior to, or after the hybridization
- direct labels are detectable labels that are directly attached to or incorporated into the target (sample) nucleic acid prior to hybridization
- indirect labels are joined to the hybrid duplex after hybridization
- the indirect label is attached to a binding moiety that has been attached to the target nucleic acid prior to the hybridization
- the target nucleic acid may be biotinylated before the hybridization. After hybridization, an avidin-conjugated fluorophore will bind the biotin-bearing hybrid duplexes, providing a label that is easily detected.
- Fluorescent labels are preferred and easily added during an in vitro transcription reaction.
- fluorescein labeled UTP and CTP are incorporated into the RNA produced in an in vitro transcription reaction as described above.
- Means of detecting labeled target (sample) nucleic acids hybridized to the probes of the high density array are known to those of skill in the art. Thus, for example, where a colorimetric label is used, simple visualization of the label is sufficient. Where a radioactive labeled probe is used, detection of the radiation (e.g. with photographic film or a solid state detector) is sufficient.
- the target nucleic acids are labeled with a fluorescent label and the localization of the label on the probe array is accomplished with fluorescent microscopy.
- the hybridized array is excited with a light source at the excitation wavelength of the particular fluorescent label and the resulting fluorescence at the emission wavelength is detected.
- the excitation light source is a laser appropriate for the excitation of the fluorescent label.
- the confocal microscope may be automated with a computer-controlled stage to automatically scan the entire high density array.
- the microscope may be equipped with a phototransducer (e.g., a photomultiplier, a solid state array, a CCD camera, etc.) attached to an automated data acquisition system to automatically record the fluorescence signal produced by hybridization to each oligonucleotide probe on the array.
- a phototransducer e.g., a photomultiplier, a solid state array, a CCD camera, etc.
- Such automated systems are described at length in U.S. Patent No. 5,143,854, PCT Application WO 92/10092, and copending U.S. Application Ser. No. 08/195,889 filed on February 10, 1994.
- Use of laser illumination in conjunction with automated confocal microscopy for signal detection permits detection at a resolution of better than about 100 μm, more preferably better than about 50 μm, and most preferably better than about 25 μm.
- hybridization signals will vary in strength with efficiency of hybridization, the amount of label on the sample nucleic acid and the amount of the particular nucleic acid in the sample.
- nucleic acids present at very low levels (e.g., <1 pM)
- a threshold intensity value may be selected below which a signal is not counted, because it is essentially indistinguishable from background.
- a lower threshold is chosen. Conversely, where only high expression levels are to be evaluated a higher threshold level is selected. In a preferred embodiment, a suitable threshold is about 10% above that of the average background signal.
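A threshold set about 10% above the average background can be applied as in the sketch below. The function names and the dictionary layout (probe identifier mapped to intensity) are assumptions made for illustration.

```python
def detection_threshold(background: list, margin: float = 0.10) -> float:
    """Return the mean background intensity raised by the given margin."""
    mean_background = sum(background) / len(background)
    return mean_background * (1.0 + margin)


def above_background(signals: dict, background: list, margin: float = 0.10) -> dict:
    """Keep only the probes whose signal exceeds the threshold."""
    threshold = detection_threshold(background, margin)
    return {probe: s for probe, s in signals.items() if s > threshold}
```

Raising `margin` corresponds to evaluating only high expression levels; lowering it admits weaker signals at the cost of more background being counted.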
- the hybridization array is provided with normalization controls.
- These normalization controls are probes complementary to control sequences added in a known concentration to the sample. Where the overall hybridization conditions are poor, the normalization controls will show a smaller signal reflecting reduced hybridization. Conversely, where hybridization conditions are good, the normalization controls will provide a higher signal reflecting the improved hybridization. Normalization of the signal derived from other probes in the array to the normalization controls thus provides a control for variations in hybridization conditions. Typically, normalization is accomplished by dividing the measured signal from the other probes in the array by the average signal produced by the normalization controls.
- Normalization may also include correction for variations due to sample preparation and amplification. Such normalization may be accomplished by dividing the measured signal by the average signal from the sample preparation/amplification control probes (e.g., the Bio B probes). The resulting values may be multiplied by a constant value to scale the results.
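The normalization described above, division by the average control signal followed by optional scaling, can be sketched as follows. The function name and data layout are hypothetical; the same function serves for either the normalization controls or the sample preparation/amplification controls.

```python
def normalize(signals: dict, control_signals: list, scale: float = 1.0) -> dict:
    """Divide each probe signal by the mean control signal, then scale."""
    mean_control = sum(control_signals) / len(control_signals)
    return {probe: scale * s / mean_control for probe, s in signals.items()}
```

Because every probe on an array is divided by the same control average, ratios between probes on one array are unchanged, while values from different arrays become comparable.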
- the high density array can include mismatch controls.
- the difference in hybridization signal intensity between the target-specific probe and its corresponding mismatch control is a measure of the discrimination of the target-specific probe.
- the signal of the mismatch probe is subtracted from the signal from its corresponding test probe to provide a measure of the signal due to specific binding of the test probe.
- the concentration of a particular sequence can then be determined by measuring the signal intensity of each of the probes that bind specifically to that gene and normalizing to the normalization controls. Where the signal from the probes is greater than the mismatch, the mismatch is subtracted. Where the mismatch intensity is equal to or greater than its corresponding test probe, the signal is ignored.
- the expression level of a particular gene can then be scored by the number of positive signals (either absolute or above a threshold value), the intensity of the positive signals (either absolute or above a selected threshold value), or a combination of both metrics (e.g., a weighted average).
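The scoring rule above can be sketched for one gene as follows: subtract the mismatch where the test probe is brighter, ignore the pair otherwise, then report the number of positive pairs and their mean difference. The function name and the (PM, MM) tuple layout are assumptions.

```python
def score_gene(pairs: list) -> tuple:
    """Score one gene from its (perfect match, mismatch) intensity pairs.

    Pairs whose mismatch intensity is equal to or greater than the perfect
    match intensity are ignored, as described in the text.
    """
    differences = [pm - mm for pm, mm in pairs if pm > mm]
    n_positive = len(differences)
    mean_difference = sum(differences) / n_positive if differences else 0.0
    return n_positive, mean_difference
```

The two returned values correspond to the two metrics named above; how they are combined (for example, as a weighted average) is left open here.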
- a computer system is used to compare the hybridization intensities of the perfect match and mismatch probes of each pair. If the gene is expressed, the hybridization intensity (or affinity) of a perfect match probe of a pair should be recognizably higher than the corresponding mismatch probe. Generally, if the hybridization intensities of a pair of probes are substantially the same, it may indicate the gene is not expressed. However, the determination is not based on a single pair of probes; the determination of whether a gene is expressed is based on an analysis of many pairs of probes.
- After the system compares the hybridization intensity of the perfect match and mismatch probes, the system indicates expression of the gene. As an example, the system may indicate to a user that the gene is either present (expressed), marginal or absent (unexpressed). Specific procedures for data analysis are disclosed in U.S. Application 08/772,376, previously incorporated for all purposes.
- RNA processing and RNA editing are all accomplished by proteins which are coded by their own genes.
- DNA sequences can exert long range control over the expression of other genes by positional effects. Therefore, the expression of genes is often regulated by the expression of other genes.
- Those regulatory genes are called up-stream genes, relative to the regulated or down-stream genes. In a simple regulatory pathway:
- where A, B, C, D are genes; ++ up-regulates; -- down-regulates
- Gene A is an up-stream gene of gene B and B is an up-stream gene of C.
- the network is frequently looped and inter-connected.
- the expression of a gene is regulated by its own product as either a positive or negative feedback (Figure 1 illustrates a hypothetical gene network).
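A regulatory network of this kind can be represented as a signed directed graph, with +1 for up-regulation and -1 for down-regulation. The edges below are hypothetical and are not taken from Figure 1; the traversal tracks visited genes so that feedback loops do not cause it to run forever.

```python
def upstream_genes(network: dict, gene: str) -> set:
    """Return every gene that directly or indirectly regulates `gene`.

    `network` maps (regulator, target) pairs to +1 (up) or -1 (down).
    """
    found = set()
    stack = [gene]
    while stack:
        current = stack.pop()
        for regulator, target in network:
            if target == current and regulator not in found:
                found.add(regulator)
                stack.append(regulator)
    return found


# Hypothetical linear pathway A -> B -> C -> D, plus a looped variant
# in which D feeds back on A.
linear = {("A", "B"): +1, ("B", "C"): -1, ("C", "D"): +1}
looped = dict(linear)
looped[("D", "A")] = -1
```

In the linear pathway, A and B are up-stream of C. In the looped variant every gene, including A itself, is up-stream of A, which illustrates why up-stream and down-stream are relative terms in an inter-connected network.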
- One aspect of this invention is to provide a systematic approach to understand the regulatory relationship among genes in the entire genetic network in many or all species. This approach is premised, in part, on the development of methods for massive parallel monitoring of gene expression. The current invention is also premised, in part, upon the availability of biological samples reflecting the genetic network in various stages and states. By observing the change of gene expression, the regulatory relationship can be defined with various data processing methods. Because of the complexity of the genetic network, those of skill in the art would appreciate the importance of obtaining a large number of biological samples reflecting various stages of the genetic network. Cultured cells or tissue samples reflecting various physiological, developmental, or pathological states are useful because they reflect independent states of the genetic network. One aspect of the invention provides methods to generate a large number of additional biological samples reflecting a massive number of independent states of the genetic network.
- a cell line is treated with mutagens, radiation, virus infection, or transcription vectors. Treated cells are then cloned and propagated to produce a sufficient amount of mRNA. Each clone reflects an independent state of gene expression.
- more sophisticated methods of genetic mutation are used for systematically knocking out genes. Methods such as the Random Homozygous Knock-out are particularly preferred because of their efficiency in obtaining homozygous knock-out cells.
- Exposure to mutagenic chemicals such as ethyl nitrosourea and ethyl methanesulfonate, or to a high dose of X-rays, can be used to produce a large number of mutant cells.
- mutagenesis is essentially random, even though some mutants may not survive and thus will not be represented in the final clones.
- Cloned mutant cells will have different profiles of gene expression.
- Oligonucleotides with sequence complementary to a mRNA sequence can be introduced into cells to block the translation of the mRNA, thus blocking the function of the gene encoding the mRNA.
- the use of oligonucleotides to block gene expression is preferred in some embodiments because of the simplicity of its procedure.
- the use of oligonucleotides to block gene expression is described, for example, in Strachan and Read, Human Molecular Genetics, 1996, previously incorporated by reference for all purposes.
- an antisense minigene is constructed by cloning a DNA sequence complementary to the mRNA targeted.
- the DNA sequence is under the control of a promoter sequence at one end and enclosed with a polyadenylation sequence at the other end. Transcription from such a vector produces an antisense RNA which blocks the function of the targeted mRNA.
- Directional, correlational, and causation models of gene regulation can be built based upon the level of expression of different genes.
- models are built by incorporating expression data and current knowledge about the regulation of specific genes.
- Path analysis can be used for decomposing relations among variables and for testing causal models for genetic networks. However, path analysis is generally limited by its assumptions, such as variables measured without error, no correlation among residuals, and unidirectional causal flow.
- Linear Structural Relationship Analysis (LISREL)
- One aspect of the invention provides methods for detecting regulatory functions of a gene by identifying whether or not the target gene regulates the expression of other genes.
- Figure 3 illustrates one embodiment of such a method.
- the gene of interest is mutated or its functions repressed by other means (2).
- a variety of methods can be used to specifically suppress the expression of a target gene, including the use of antisense oligonucleotides and antisense genes.
- the gene of interest is introduced to a cell that lacks the expression of the gene of interest and the expression of a large number of genes is monitored to detect alterations in the expression pattern. Cell lines are preferred for studying the regulatory function of a target gene in some embodiments because of low cost of maintenance and construction.
- the expression of a large number of genes preferably more than 10, more preferably more than 100, and most preferably more than 1000 genes, is monitored to detect significant changes in the pattern of expression (1,2).
- the change in expression is then analyzed to detect the specific genes that are potentially regulated by the gene of interest (3).
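The comparison step can be sketched as a fold-change filter over two expression profiles, one from the unperturbed sample and one from the sample in which the gene of interest is suppressed. The two-fold cutoff and all names below are hypothetical choices for illustration; the disclosure does not fix a particular statistic.

```python
import math


def changed_genes(wild_type: dict, suppressed: dict, min_fold: float = 2.0) -> dict:
    """Return log2(suppressed / wild type) for genes changing by >= min_fold.

    Genes missing from either profile, or with non-positive signal, are skipped.
    """
    cutoff = math.log2(min_fold)
    changed = {}
    for gene in wild_type.keys() & suppressed.keys():
        wt, sup = wild_type[gene], suppressed[gene]
        if wt <= 0 or sup <= 0:
            continue
        log_ratio = math.log2(sup / wt)
        if abs(log_ratio) >= cutoff:
            changed[gene] = log_ratio
    return changed
```

A negative value marks a gene whose expression fell when the gene of interest was suppressed (a candidate for up-regulation by it); a positive value marks a candidate for down-regulation.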
- the expression level of down-stream genes is often not directly correlated with the expression of up-stream genes.
- an up-stream gene encoding a transcriptional factor that regulates a down-stream gene may be expressed at a constant rate.
- the regulation of down-stream gene activity is through changes in the transcriptional factor activity by binding to a signal molecule, not through changes in the amount of the transcriptional factor. It would be difficult to interpret the regulatory relationship solely based upon the correlation between the activity of two genes.
- the target gene is completely suppressed for a certain period of time to deplete any reserve of gene products.
- FIG. 4 illustrates one such embodiment.
- Nucleic acid samples from a wild-type biological sample (1) and from a mutant (2) are analyzed to obtain wild-type and mutant expression profiles of several down-stream genes. A change in the expression of a down-stream gene may indicate that the mutation is not silent (7).
- The function of a p53 mutation is determined by monitoring the expression of the p53 up-regulated gadd45, cyclin G, p21waf1, Bax, IGF-BP3 and Thrombospondin genes and the p53 down-regulated c-myc and PCNA genes.
- the inclusion of more regulated genes improves the quality and reliability of the analysis.
- Mutations of the p53 gene are the most commonly found abnormality in human cancer (Vogelstein, 1990, A deadly inheritance. Nature 348:681-2). A recent compilation and analysis of screening data indicated that 37% of the 2567 cancers contained mutations in the p53 gene (Greenblatt et al., 1994, Mutations in the p53 tumor suppressor gene: clues to cancer etiology and molecular pathogenesis. Cancer Res. 54:4855-78).
- IHC Immunohistochemistry
- oligonucleotide arrays that include more than 6,500 human gene sequences derived from the GenBank (http://www.ncbi.nlm.nih.gov) and dbEST databases were generated. These arrays were used to monitor and compare the expression of >6,500 genes in parallel from normal and malignant breast tissue cell lines.
- RNA for array hybridization experiments was derived from the malignant breast cell line BT-474 and normal breast tissue from primary cell line HT-125.
- BT-474 was isolated from a solid, invasive ductal carcinoma of the breast and is tumorigenic in athymic nude mice.
- HT-125 was obtained from a cell line derived from normal breast tissue peripheral to an infiltrating ductal carcinoma.
- p53 genotyping array Affymetrix, Santa Clara, CA
- Figure 5 shows the results of the genotypic analysis demonstrating that there was a G to A base change resulting in a E to K amino acid change at position 285 in exon 8, the p53 DNA binding domain.
- ds cDNA double stranded cDNA
- IVT in vitro transcription
- The labeled cRNA was then fragmented in the presence of heat and Mg2+ and hybridized to the oligonucleotide arrays in the presence of label control targets used for array normalization and message quantitation. After washing and staining with streptavidin-phycoerythrin conjugate, hybridization patterns were visualized using an argon laser scanning confocal microscope (Affymetrix, Santa Clara, CA) and the fluorescence intensity images processed and quantitated by GeneSeq software (Affymetrix, Santa Clara, CA).
- Bax | Inducer of apoptosis (bcl-2 associated protein) | 40±3 | undetected
- IGF-BP3 | Insulin growth factor pathway inhibitor | 4,000±400 | undetected
- p21 WAF1/CIP1 | Cyclin-CDK and DNA replication inhibitor | 350±30 | undetected
- Figure 6 shows that the expression of a number of genes is altered in normal and malignant cell lines.
- Table 1 summarizes the expression of several p53 downstream genes.
- The expression of the p53 activated targets gadd45, cyclin G, p21waf1, Bax, IGF-BP3 and Thrombospondin is lost in BT-474.
- a coincident gain of expression was seen of the p53 repressed targets c-myc and PCNA.
- These expression patterns in BT-474 indicated a loss of wild-type p53 function. Therefore, the G to A mutation in the p53 gene impairs the regulatory function of the p53 gene.
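- The reasoning above (activated targets lost, repressed targets gained, therefore loss of regulator function) can be sketched as a simple consistency score. This is a minimal illustration and not the method of the description: the function name, the 2-fold threshold and the detection floor are hypothetical. The Bax, IGF-BP3 and p21 normal-cell values follow the Table 1 fragments in this section; the c-myc and PCNA values are invented placeholders.

```python
def call_regulator_function(normal, tumor, activated, repressed, fold=2.0, floor=1.0):
    """Return the fraction of down-stream genes whose change matches loss of regulator function.

    normal, tumor: dicts mapping gene name -> expression level (arbitrary units).
    activated: genes the regulator normally turns on (expected to drop on loss).
    repressed: genes the regulator normally turns off (expected to rise on loss).
    fold, floor: hypothetical threshold and detection floor, not from the source.
    """
    def level(profile, gene):
        # Treat undetected (None) or very low signals as the detection floor.
        value = profile.get(gene)
        return floor if value is None else max(value, floor)

    consistent = 0
    for gene in activated:
        if level(normal, gene) / level(tumor, gene) >= fold:
            consistent += 1
    for gene in repressed:
        if level(tumor, gene) / level(normal, gene) >= fold:
            consistent += 1
    return consistent / (len(activated) + len(repressed))

# c-myc and PCNA numbers are invented for illustration only.
normal = {"Bax": 40, "IGF-BP3": 4000, "p21": 350, "c-myc": 20, "PCNA": 50}
tumor = {"Bax": None, "IGF-BP3": None, "p21": None, "c-myc": 200, "PCNA": 400}
score = call_regulator_function(
    normal, tumor,
    activated=["Bax", "IGF-BP3", "p21"],
    repressed=["c-myc", "PCNA"],
)
unchanged = call_regulator_function(
    normal, normal,
    activated=["Bax", "IGF-BP3", "p21"],
    repressed=["c-myc", "PCNA"],
)
```

A score near 1 means every monitored target moved in the direction expected for loss of function; including more regulated genes makes such a score less sensitive to any single noisy measurement.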
- V. Mutation Detection by Gene Expression Monitoring
- gene expression monitoring is used to detect potential malfunction of regulatory genes, such as a mutation in the coding or regulatory regions.
- the expression of a subset of genes of interest in a diseased tissue is analyzed to obtain a diseased expression pattern (2).
- the subset contains at least one gene, preferably more than 5 genes, preferably more than 100 genes, more preferably more than 1,000 genes, and most preferably more than 6,000 genes or all the known genes.
- the expression of the same genes in a normal tissue can also be similarly analyzed to generate a normal gene expression pattern (1).
- Differences in the expression of genes indicate abnormal regulation of the changed genes in the diseased tissue (3).
- a filter is used to identify those genes whose expression is significantly altered.
- nucleic acid sequence analysis methods can be used for detecting sequence changes.
- high density oligonucleotide arrays are used to detect the sequence changes.
- One advantage of using oligonucleotide arrays is that the sequence interrogation can be performed in conjunction with gene expression monitoring in a single chip.
- One aspect of the invention provides a method for detecting mutations in the p53 gene which affect function.
- RNA for array hybridization experiments was derived from the malignant breast cell lines MDA468 and MDA231 and normal breast tissue from primary cell line HT-125.
- HT-125 was obtained from a cell line derived from normal breast tissue peripheral to an infiltrating ductal carcinoma.
- The normal and malignant cells were harvested and lysed, and poly A+ RNA was isolated and used as template for double stranded cDNA (ds cDNA) synthesis using an oligo dT primer containing a T7 promoter sequence at its 5' end.
- ds cDNA product then served as template in an in vitro transcription (IVT) reaction using T7 polymerase and biotinylated ribonucleotides.
- The labeled cRNA was then fragmented in the presence of heat and Mg2+ and hybridized to the oligonucleotide arrays in the presence of label control targets used for array normalization and message quantitation. After washing and staining with streptavidin-phycoerythrin conjugate, hybridization patterns were visualized using an argon laser scanning confocal microscope (Affymetrix, Santa Clara, CA) and the fluorescence intensity images processed and quantitated by GeneSeq software (Affymetrix, Santa Clara, CA).
- Bax | Inducer of apoptosis (bcl-2 associated protein) | 40±5 | undetected | undetected
- IGF-BP3 | Insulin growth factor pathway inhibitor | 4,000±400 | undetected | undetected
- Example 3: Identification of differentially expressed genes in breast tissues
- An understanding of the molecular basis of disease requires the ability to detect genetic variation across a large number of genes and to correlate genetic factors with the resulting cellular consequences.
- the use of high density oligonucleotide (nucleic acid) arrays provided genotyping of candidate genes as well as the characterization of the relative abundance of mRNAs identified herein. Information from the human genome project, Merck EST sequencing effort, or any other source of genetic sequence information may be used to design and fabricate such oligonucleotide arrays for the highly parallel analysis of mRNA levels. DNA arrays containing probes that are complementary to 6,600 human ESTs were used in the particular experiments outlined herein to identify such messenger RNAs.
- the expressed genes identified herein will find application in a wide array of uses. Included among such uses are diagnostic uses, prognostic uses, therapeutic uses, and forensic uses.
- The particular arrays designed herein utilized semiconductor based photolithography and solid phase chemical synthesis to directly synthesize independently specified DNA probes on derivatized glass at a density of 10^7 oligonucleotide molecules per 50 µm² synthesis region, as discussed in Lockhart, D.J., Dong, H., Byrne, M.C., Follettie, M.T., Gallo, M.V., Chee, M.S., Mittmann, M., Wang, C., Kobayashi, M., Horton, H. & Brown, E.L. Expression monitoring by hybridization to high-density oligonucleotide arrays. Nature Biotechnology 13, 1675-1680 (1996), incorporated herein by reference.
- oligonucleotide arrays were generated with probes selected from 6,600 EST gene clusters derived from the dbEST public database, as described in Boguski, M.S., Lowe, T.M. & Tolstoshev, CM. dbEST-database for 'expressed sequence tags'. Nature Genetics 4, 332-333 (1993), incorporated herein by reference.
- arrays are complementary to 3,200 human full length GenBank genes, and 3,400 ESTs that demonstrate strong homology to other eukaryotic genes in the SwissProt protein sequence database.
- Particular arrays herein contained collections of 20 probe pairs for each of the 6,600 messages being monitored. Each probe pair is composed of a 25-mer oligonucleotide that is perfectly complementary to a region of sequence from a specific message, and a sister probe that is identical except for a single base substitution in a central position.
- This combination of perfect and mismatched probes serves as an internal control for hybridization specificity and allows for sensitive quantitation in the presence of cross-hybridization backgrounds. Probes were selected on the basis of uniqueness and hybridization specificity. The aim was to choose probes that would yield the best discrimination between perfect match and single base mismatch hybridization events.
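- The probe pair design described above can be sketched as follows. The description states only that the sister probe carries a single base substitution in a central position; swapping the central base for its complement, as done here, is an assumption made for illustration, and the example sequence is invented.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def make_probe_pair(perfect_match):
    """Return (PM, MM) where MM differs from PM at the central base only."""
    if len(perfect_match) != 25:
        raise ValueError("expected a 25-mer")
    center = len(perfect_match) // 2           # index 12, i.e. the 13th base
    swapped = COMPLEMENT[perfect_match[center]]  # assumed substitution rule
    mismatch = perfect_match[:center] + swapped + perfect_match[center + 1:]
    return perfect_match, mismatch

# Invented 25-mer used only to exercise the function.
pm, mm = make_probe_pair("ACGT" * 6 + "A")
differences = [i for i in range(25) if pm[i] != mm[i]]
```

Because the two probes differ at one central base, signal that appears on both is more likely to be cross-hybridization than specific binding.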
- RNAs for array hybridization experiments were derived from the malignant breast cell line BT-474 and normal primary breast tissue cell line HT-125.
- BT-474 was isolated from a solid, invasive ductal carcinoma of the breast and is tumorigenic in athymic nude mice, as described in Lasfargues, E.Y., Coutinho, W.G. & Redfield, E.S. Isolation of two human tumor epithelial cell lines from solid breast carcinomas. Journal of the National Cancer Institute 4, 967-978 (1978), incorporated herein by reference.
- HT-125 was obtained from normal breast tissue peripheral to an infiltrating ductal carcinoma
- Hs578T aneuploid mammary epithelial
- Hs578Bst diploid myoepithelial
- mRNA was isolated from normal and malignant cells and converted into double stranded cDNA (ds cDNA) using an oligo dT primer containing a T7 promoter sequence at its 5'-end (5).
- ds cDNA product (with T7 polymerase promoter sequence incorporated) served as template in an in vitro transcription (IVT) reaction using T7 polymerase and biotinylated ribonucleotides.
- the biotinylated cRNA was then fragmented by heating and hybridized to the oligonucleotide arrays. After washing and then staining with streptavidin-phycoerythrin conjugate, hybridization patterns were visualized using an argon ion laser scanning confocal microscope. The fluorescence intensity images were processed and quantitated by GeneChip data analysis software.
- Figure 9 A shows representative hybridization patterns of total message from normal and malignant breast cells to sets of 20 probe pairs from 1,650 gene sequences (one array of a set of 4 encompassing 6,600 human genes). Clear examples of unchanged and altered patterns of gene expression can be observed by visual comparison of the fluorescence intensities of probe sets from these two samples.
- the quantitative analysis of hybridization patterns is based on the assumption that for a specific mRNA the perfect-matched (PM) probes will hybridize more strongly on average than their mis-matched (MM) partners (Fig. 9B).
- the average difference in intensity between PM and MM hybridization signals is computed together with the average of the logarithm of the PM/MM ratios for each probe set. These values are then used to determine the relative copy number of a detected message.
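- The two summary statistics named above can be sketched for a single probe set. This is a minimal illustration: the base of the logarithm is not stated in the description (base 10 is assumed here), the function name is hypothetical, and the intensities are invented.

```python
import math

def probe_set_summary(pm, mm):
    """Summarize one probe set from paired PM and MM intensities.

    Returns (average PM - MM difference, average of log10(PM / MM)).
    """
    if len(pm) != len(mm) or not pm:
        raise ValueError("PM and MM must be non-empty and the same length")
    avg_diff = sum(p - m for p, m in zip(pm, mm)) / len(pm)
    avg_log_ratio = sum(math.log10(p / m) for p, m in zip(pm, mm)) / len(pm)
    return avg_diff, avg_log_ratio

# Hypothetical intensities for four probe pairs (the arrays described use 20 pairs).
pm = [1000.0, 2000.0, 500.0, 100.0]
mm = [100.0, 200.0, 50.0, 100.0]
avg_diff, avg_log_ratio = probe_set_summary(pm, mm)
```

The average difference scales with abundance, while the average log ratio indicates how consistently the perfect-match probes outperform their mismatch partners.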
- Biotinylated control cRNAs: E. coli biotin synthetase genes bioB, bioC, bioD and bacteriophage P1 Cre recombinase
- Spiking experiments were performed to investigate the absolute hybridization intensity range between multiple RNAs at known concentrations. When 32 individual cRNAs were spiked at levels ranging from copy numbers of 1:100,000 to 1:30,000 in the background of total cellular mRNA, absolute hybridization intensities were within a 2-fold range for all targets tested.
- The added biotinylated control oligonucleotide, together with endogenous cellular RNA messages (e.g., β-actin and glyceraldehyde 3-phosphate dehydrogenase (GAPDH)), allowed for normalization of experimental variation and array hybridization.
- Intensities from the spiked standards in the presence of total cellular target demonstrated sensitivities as high as 1:100,000, corresponding to a few copies per cell.
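- The step from a relative abundance of 1:100,000 to "a few copies per cell" depends on how many mRNA molecules a cell contains, which the description does not state. The sketch below assumes a round figure of 300,000 mRNAs per cell purely for illustration.

```python
def copies_per_cell(relative_abundance, mrnas_per_cell=300_000):
    """Convert a relative abundance (e.g. 1/100_000) to transcripts per cell.

    mrnas_per_cell is an assumed round figure, not a value from the source.
    """
    return relative_abundance * mrnas_per_cell

detection_limit = copies_per_cell(1 / 100_000)
```

Under that assumption the detection limit works out to about three transcripts per cell; a different assumed mRNA content scales the result proportionally.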
- Genes that are repressed and induced/activated may provide a particularly good starting point to decipher the molecular pathways involved in programs of tumorigenesis.
- To identify the genes falling into each of these two categories we sorted the normal and malignant message populations to identify those genes that demonstrated a 10-fold or greater difference in message intensities. This analysis revealed 168 genes repressed and 137 genes activated in BT-474 when compared to HT-125, as shown in Table 3 (Figure 11). 260 of the messages displaying differential expression corresponded to GenBank human full length genes, and 45 to ESTs with homologies to other eukaryotic or viral genes.
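- A 10-fold filter of this kind can be sketched as below. The description does not say how undetected messages were handled; the intensity floor used here to avoid division by zero is an assumption, the function name is hypothetical, and the intensities are invented (the gene names are borrowed from this section).

```python
def classify_by_fold_change(normal, tumor, fold=10.0, floor=20.0):
    """Split genes into repressed and activated lists by fold change.

    normal, tumor: dicts mapping gene -> message intensity.
    floor: hypothetical minimum intensity so undetected messages
           do not cause division by zero (not specified in the source).
    """
    repressed, activated = [], []
    for gene in sorted(set(normal) & set(tumor)):
        n = max(normal[gene], floor)
        t = max(tumor[gene], floor)
        if n / t >= fold:
            repressed.append(gene)    # lower in tumor by at least `fold`
        elif t / n >= fold:
            activated.append(gene)    # higher in tumor by at least `fold`
    return repressed, activated

# Invented intensities for illustration only.
normal = {"caveolin": 3000, "Her2/neu": 50, "GAPDH": 5000, "IGF-BP3": 4000}
tumor = {"caveolin": 100, "Her2/neu": 2500, "GAPDH": 4800, "IGF-BP3": 0}
repressed, activated = classify_by_fold_change(normal, tumor)
```

Genes such as the GAPDH placeholder, whose intensity barely changes, fall into neither list.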
- Her2/neu protooncogene also known as c-erbB-2
- BT-474 tumor cells have previously been demonstrated to overexpress Her2/neu (Styles, J.M., Harrison, S., Gusterson, B.A. & Dean, C.J. Rat monoclonal antibodies to the external domain of the product of the c-erbB-2 proto-oncogene, International Journal of Cancer 2, 320-324 (1990)), which belongs to the epidermal growth factor receptor family of receptor tyrosine kinases (RTKs), as discussed in Coussens, L., Yang-Feng, T.L., Liao, Y.C., Chen, E., Gray, A., McGrath, J., Seeburg, P.H., Libermann, T.A., Schlessinger, J., Francke, U., et al. Tyrosine kinase receptor with extensive homology to EGF receptor shares chromosomal location with neu oncogene, Science 4730, 1132-1139 (1985).
- The oncogenic activation of RTKs is commonly achieved by overexpression, resulting in the ability to dimerize in the absence of ligand, as discussed in Earp, H.S., Dawson, T.L., Li, X. & Yu, H. Heterodimerization and functional interaction between EGF receptor family members: a new signaling paradigm with implications for breast cancer research, Breast Cancer Research and Treatment 1, 115-132 (1995). Specifically, overexpression of Her2/neu is observed in 20-30% of all human breast cancers and ovarian cancers, as discussed in Earp et al. (1995), supra; King, C.R., Kraus, M.H. & Aaronson, S.A.
- Elevated expression of the related RTK family member Her3 has also been implicated in the development and progression of human malignancies including breast cancer, as discussed in Kraus, M.H., Issing, W., Miki, T., Popescu, N.C. & Aaronson, S.A. ERBB3, a third member of the ERBB/epidermal growth factor receptor family: evidence for overexpression in a subset of human mammary tumors, Proceedings of the National Academy of Sciences of the United States of America 23, 9193-9197 (1989), and Lemoine, N.R., Barnes, D.M., Hollywood, D.P., Hughes, C.M., Smith, P., Dublin, E., Prigent, S.A., Gullick, W.J. & Hurst, H.C. Expression of the ERBB3 gene product in breast cancer, British Journal of Cancer 6, 1116-1121 (1992).
- GRB-7 is an SH2 domain protein and component of RTK signal transduction pathways that is overexpressed and found in tight complex with Her2/neu in certain breast cancers, as discussed in Stein, D., Wu, J., Fuqua, S.A., Roonprapunt, C., Yajnik, V., D'Eustachio, P., Moskow, J.J., Buchberg, A.M., Osborne, C.K. & Margolis, B. The SH2 domain protein GRB-7 is co-amplified, overexpressed and in tight complex with HER2 in breast cancer, EMBO J 13, 1331-1340 (1994).
- Caveolins are integral membrane proteins and principal components of caveolae (non-clathrin-coated invaginations of the plasma membrane), as discussed in Lisanti, M.P., Tang, Z., Scherer, P.E., Kubler, E., Koleske, A.J. & Sargiacomo, M. Caveolae, transmembrane signalling and cellular transformation, Molecular Membrane Biology 1, 121-124 (1995).
- GPCR G protein-coupled receptor
- Ras proteins have been established as critical intermediates between upstream RTKs (21,22) and GPCRs (23,24), and downstream signaling components involved in cellular transformation (including mitogen-activated protein kinases (MAPKs)); see Van Biesen, T., Hawes, B.E., Luttrell, D.K., Krueger, K.M., Touhara, K., Porfiri, E., Sakaue, M., Luttrell, L.M. & Lefkowitz, R.J. Receptor-tyrosine-kinase- and G beta gamma-mediated MAP kinase activation by a common signalling pathway, Nature 6543, 781-784 (1995); Li, N., Batzer, A., Daly, R., Yajnik, V., Skolnik, E., Chardin, P., Bar-Sagi, D., Margolis, B. & Schlessinger, J. Guanine-nucleotide-releasing factor hSos1 bind
- Lysophosphatidic acid stimulates mitogen-activated protein kinase activation via a G-protein-coupled pathway requiring p21ras and p74raf-1. Journal of Biological Chemistry 28, 20717-20720 (1993); and Alblas, J., Van Corven, E.J., Hordijk, P.L., Milligan, G. & Moolenaar, W.H. Gi-mediated activation of the p21ras-mitogen-activated protein kinase pathway by alpha 2-adrenergic receptors expressed in fibroblasts. Journal of Biological Chemistry 30, 22235-22238 (1993).
- Although Ras mutations are seen in less than 5% of breast cancers, a large body of evidence implicates deregulation of the Ras pathway in breast carcinomas (Clark, G.J. & Der, C.J. Aberrant function of the Ras signal transduction pathway in human breast cancer, Breast Cancer Research and Treatment 1, 133-144 (1995)).
- the concurrent up-regulation of RTKs and down-regulation of caveolins in BT-474 strongly indicate a convergence of multiple upstream mitogenic signaling events on the Ras pathway in this breast carcinoma.
- Our analysis also revealed up-regulation of Ras, Raf, Mek and ERK (Table 1), which together highlight a deregulated Ras/MAPK pathway; see Marshall, M.S.
- p53 is the most commonly mutated gene associated with neoplasia and mutations are found in over 50% of all human cancers.
- the p53 gene product is a nuclear phosphoprotein that functions in cell-cycle regulation and the preservation of genetic integrity (reviewed in Levine, A.J. p53, the cellular gatekeeper for growth and division. Cell 3, 323-331 (1997)). It possesses numerous biochemical properties necessary to carry out these functions, including sequence-specific DNA binding activity, transcriptional activation and transcriptional repression.
- Genomic p53 was resequenced in BT-474.
- The strategy for rapid, simultaneous analysis of large amounts of genetic information using high-density oligonucleotide arrays has been described in Chee, M., Yang, R., Hubbell, E., Berno, A., Huang, X.C., Stern, D., Winkler, J., Lockhart, D.J., Morris, M.S. & Fodor, S.P. Accessing genetic information with high-density DNA arrays, Science 5287, 610-614 (1996).
- the DNA array used in this study allowed for simultaneous analysis of both sense and anti-sense sequence of p53 coding exons 2-11, including 10 base pairs of intronic flanking sequence (to identify splice donor-acceptor mutations), as well as allele specific probes for over 300 characterized hotspot p53 mutations and every possible single base deletion (Dee et al., manuscript in prep.)
- the re-sequence analysis portion of the DNA array consisted of a set of 4 identical 20-mer oligonucleotides complementary to p53 wild-type sequence, except that an A,C,G or T was substituted into each probe at a centrally localized position.
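- The four-probe tiling described above can be sketched as follows. For readability the probes are written in the sense of the reference (real probes would be the complementary strand), the interrogated base is placed at index 10 of each 20-mer as an assumed convention, and the reference sequence and intensities are invented.

```python
BASES = "ACGT"

def tiling_probe_set(reference, position, length=20):
    """Return {base: probe} for the four probes interrogating `position`.

    Each probe equals the reference window except that the interrogated
    position (index length // 2 of the probe, an assumed convention)
    carries one of A, C, G or T.
    """
    start = position - length // 2
    if start < 0 or start + length > len(reference):
        raise ValueError("position too close to the end of the reference")
    window = reference[start:start + length]
    offset = position - start
    return {b: window[:offset] + b + window[offset + 1:] for b in BASES}

def call_base(intensities):
    """Call the base whose probe gave the strongest hybridization signal."""
    return max(intensities, key=intensities.get)

reference = "ACGTTGCA" * 3 + "ACGT"        # invented 28-base reference
probes = tiling_probe_set(reference, position=13)   # reference base here is 'G'
# Invented intensities: the 'A' probe is brightest although the reference base is 'G'.
called = call_base({"A": 950.0, "C": 60.0, "G": 80.0, "T": 55.0})
```

A called base that differs from the reference base at that position is a candidate substitution, which would then be confirmed as described in the text.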
- Analysis of BT-474 versus HT-125 p53 genomic DNA using the p53 genotyping array revealed a single base substitution of G to A in exon 8 (DNA binding domain), resulting in an amino acid change at position 285 from E to K (Fig. 3B).
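- The link between a single G to A substitution and an E to K change can be checked against the standard genetic code. The description does not state which glutamate codon occurs at position 285, so the sketch below shows that either glutamate codon (GAA or GAG) becomes a lysine codon when its first base changes from G to A.

```python
# Minimal codon table covering only the codons used below (standard genetic code).
CODON_TO_AA = {"GAA": "E", "GAG": "E", "AAA": "K", "AAG": "K"}

def substitute(codon, index, base):
    """Return the codon with `base` placed at `index` (0-based)."""
    return codon[:index] + base + codon[index + 1:]

# Both glutamate codons become lysine codons after a first-position G to A change.
changes = {codon: CODON_TO_AA[substitute(codon, 0, "A")] for codon in ("GAA", "GAG")}
# A third-position G to A change in GAG would instead be silent.
silent = CODON_TO_AA[substitute("GAG", 2, "A")]
```

This is why the position of the substitution within the codon matters: the same base change at the third position would leave the protein unchanged.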
- the hybridization signal difference centered about the mutation identified by the footprint analysis (see Fig. 3B, top panel), and subsequent base calling of a single genotype (see Fig. 3B, bottom panel) in BT-474 by the GeneChip software indicated that this carcinoma had a loss of heterozygosity at the p53 locus (confirmed by dideoxy sequence analysis).
- BT-474, MDA468 and MDA231 cells were maintained in RPMI-1640 (Gibco/BRL) containing 10% Fetal Bovine Serum, 10 µg/ml bovine insulin, 2 mM glutamine, 100 units/ml Penicillin and 100 µg/ml Streptomycin.
- HT-125 cells were maintained in Modified Dulbecco's Medium (Gibco/BRL) with 10% Fetal Bovine Serum, 30 ng/ml epidermal growth factor, 10 µg/ml bovine insulin, 10 µM non-essential amino acids, 100 units/ml Penicillin and 100 µg/ml Streptomycin.
- Cell lines were kept at 37°C, 5% CO2 and split 1 to 3 at approximately 60-70% confluency.
- RNA preparation and labeling for gene expression monitoring.
- Poly A+ RNA was isolated from cells using an Oligotex Direct mRNA Kit (Qiagen) following both standard and batch protocols according to the manufacturer's instructions. 0.5-1 µg of mRNA was then converted into ds cDNA using a Superscript Choice System cDNA Synthesis Kit (Gibco/BRL) and an oligo dT primer incorporating a T7 RNA polymerase promoter sequence on its 5'-end. The resultant ds cDNA was purified by one step of phenol/chloroform extraction using Phase Lock Gel (5 Prime to 3 Prime) followed by EtOH precipitation.
- The ds cDNA product then served as template in an in vitro transcription labeling reaction using T7 RNA polymerase (Ambion T7 Megascript Transcription Kit), with 1.875 mM biotin-CTP and 1.875 mM biotin-UTP, for a final concentration of 7.5 mM each NTP.
- the total labeled cRNA transcripts were purified by Chromaspin-100 columns (Clontech), followed by ProCipitate treatment (Affinity Binding) and EtOH precipitation to remove unincorporated nucleotides and protein contaminants.
- The fragmented samples were brought up to a final volume of 200 µl with hybridization buffer (0.9 M NaCl, 60 mM NaH2PO4, 6 mM EDTA and 0.005% Triton X-100, pH 7.6 (6xSSPE-T)) containing 0.1 ng/ml Herring Sperm DNA, 50 pM biotin-labeled control oligo (5'-
- Arrays were first washed in 0.5X SSPE-T at 40°C for 15 min with rotation (60 rpm), then incubated with 2 µg/ml of phycoerythrin-streptavidin conjugate (Molecular Probes) in 6xSSPE-T containing 1 mg/ml of acetylated bovine serum albumin at 40°C for 10 min. Prior to scanning, the arrays were washed at room temperature with 6xSSPE-T for 5 cycles (2 drains-fills/cycle) in the fluidics station.
- The hybridized, stained arrays were scanned using an argon-ion laser GeneChip scanner 50 (Molecular Dynamics) with a resolution setting of 7.5 µm/pixel (~45 pixels/probe cell) and a wavelength detection setting of 560 nm. Fluorescence imaging and quantitative analysis of hybridization patterns and intensities were performed using GeneSeq Analysis Software and GEprocess (Affymetrix) gene expression data analysis programs.
- p53 PCR and labeling for re-sequence analysis by array hybridization.
- The p53 gene was genotyped by amplifying coding exons 2-11 in a 100 µl multiplex PCR reaction using 100 ng of genomic DNA extracted from cells using a QIAmp Blood Kit (Qiagen).
- PCR Buffer II (Perkin-Elmer) was used at 1X along with 2.5 mM MgCl2, 200 µM of each dNTP and 10 units of Taq Polymerase Gold (Perkin-Elmer).
- the multiplex PCR was performed using 10 exon-specific primers (Table 4) with the following cycling conditions: 1 cycle at 94°C (5 min), 50 cycles of 94°C (30 sec), 60°C (30 sec) and 72°C (30 sec), followed by 1 cycle at 72°C (7 min).
- The fragmented PCR products were then labeled in a 100 µl reaction using 10 µM fluorescein-N6-ddATP (Dupont-NEN) and 25 units of terminal transferase (Boehringer Mannheim) in 200 µM K-Cacodylate, 25 mM Tris-HCl (pH 6.6), 0.25 mg/ml BSA and 2.5 mM CoCl2.
- The labeling reaction was incubated at 37°C for 45 min and heat-inactivated at 99°C for 5 min.
- p53 re-sequence analysis array hybridization and scanning.
- The fragmented, labeled PCR reaction was hybridized to the p53 re-sequence analysis array in 6xSSPE-T containing 2 mg/ml BSA and 1.67 nM fluorescein-labeled control oligo (5'-CTGAACGGTAGCATCTTGAC-3') at 45°C for 30 min.
- the array was then washed with 3X SSPE-T at 35 °C for 4 cycles (10 drains-fills/cycle) in the GeneChip Fluidics Station (RELA).
- The hybridized p53 array was scanned using an argon-ion laser scanner (Hewlett-Packard) with a resolution setting of 6.0 µm/pixel (~70 pixels/probe cell) and a wavelength detection setting of 530 nm.
- a fluorescence image was created, intensity information analyzed and nucleotide determination made by GeneChip Analysis Software (Affymetrix). Footprint analysis was done using Ulysses Analysis Software (Affymetrix) essentially as described.
- Genotyping through-put capabilities.
- Conventional gel-based dideoxynucleotide sequencing can genotype approximately twelve p53 genomes a day (10 hr) assuming an average read of 400 nucleotides per gel, run twice a day.
- the through-put of the GeneChip p53 system for a single person, using one fluidics station and scanner (40 min hyb/wash and 6 min scan time) is approximately 6 arrays per hour, or sixty p53 genomes fully genotyped in a 10 hour period.
- The probes for the human 6,600 gene arrays were selected from the 600 bases at the 3'-end of sequences chosen from the dbEST database. Probes for inclusion on the arrays were identified based on criteria of uniqueness and hybridization characteristics. Uniqueness was assessed by comparing potential probes with all genes that were considered for inclusion on the arrays. If any potential probe matched 22 out of 25 nucleotides of another sequence, that probe was discarded. Selection of probes for hybridization characteristics was done by using heuristic rules and a neural net developed from previous expression experiments.
- the heuristics for the 6,600 gene arrays were as follows: 1) total number of As or Ts less than 13; 2) total number of Cs or Gs less than 11; 3) number of As or Ts in a window of 8 less than 7; 4) number of Cs or Gs in a window of 8 less than 6; 5) palindrome score less than 9 (the palindrome score is a measure of probe self-complementarity).
- the neural net was used to prune out probes that it identified as poor hybridizers or promiscuous cross hybridizers as described in detail elsewhere. Finally, any probes requiring more than 70 synthesis steps to include on the arrays were rejected to minimize synthesis time and cost.
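- The composition heuristics (rules 1 to 4) can be sketched as a simple filter. Two interpretive assumptions are made here: "As or Ts" and "Cs or Gs" are read as limits on each base separately, because a combined reading (A+T below 13 and C+G below 11) could never be satisfied by a 25-mer; and the windowed rules are read the same way for consistency. The palindrome score and the neural net are not defined in this description and are therefore omitted. Function names and test sequences are hypothetical.

```python
def max_in_window(probe, bases, window=8):
    """Largest count of any single base in `bases` within a sliding window."""
    best = 0
    for start in range(len(probe) - window + 1):
        segment = probe[start:start + window]
        best = max(best, max(segment.count(b) for b in bases))
    return best

def passes_composition_rules(probe):
    """Apply rules 1-4 (per-base reading, an assumption) to a 25-mer probe."""
    return (
        all(probe.count(b) < 13 for b in "AT")     # rule 1: fewer than 13 As, fewer than 13 Ts
        and all(probe.count(b) < 11 for b in "CG") # rule 2: fewer than 11 Cs, fewer than 11 Gs
        and max_in_window(probe, "AT") < 7         # rule 3: fewer than 7 in any window of 8
        and max_in_window(probe, "CG") < 6         # rule 4: fewer than 6 in any window of 8
    )

balanced = "ACGT" * 6 + "A"          # invented 25-mer with mixed composition
a_rich = "A" * 13 + "CGT" * 4        # invented 25-mer with 13 As
results = (passes_composition_rules(balanced), passes_composition_rules(a_rich))
```

Probes that pass such a filter would still need the uniqueness check and the remaining selection steps described in the text.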
- Data from tables 2 & 3 include expression results from an array designed to identify alternatively spliced forms of targets.
- This array surveys 250 genes from functional categories including oncogene, tumor suppressor, DNA mismatch repair and apoptosis gene products. Probe pairs for this design were chosen such that each exon for a given message was represented on the array. In this way, specific loss of signal from a sub-set of probes corresponding to a particular exon of a message would indicate a splice variant form.
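- The exon-level logic described above can be sketched as follows. The thresholds, exon labels and signal values are hypothetical and chosen only to illustrate how loss of signal from one exon, with the rest of the message still detected, points to a candidate splice variant.

```python
def missing_exons(exon_signal, detect=100.0, min_present=0.5):
    """Return exons lacking signal when the message is otherwise detected.

    exon_signal: dict mapping exon label -> mean probe signal for that exon.
    detect, min_present: hypothetical thresholds, not taken from the source.
    """
    present = [e for e, s in exon_signal.items() if s >= detect]
    absent = [e for e, s in exon_signal.items() if s < detect]
    # Only report a candidate splice variant if most exons are still detected;
    # otherwise the whole message is simply not expressed.
    if len(present) / len(exon_signal) < min_present:
        return []
    return absent

variant = missing_exons({"exon1": 900.0, "exon2": 850.0, "exon3": 30.0, "exon4": 920.0})
silent_gene = missing_exons({"exon1": 20.0, "exon2": 15.0, "exon3": 30.0, "exon4": 10.0})
```

The second call shows why the check on overall detection matters: a message that is absent altogether should not be reported as a splice variant.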
- the present invention provides greatly improved methods, compositions, and apparatus for identifying gene function and for studying the regulatory relationship among genes. It is to be understood that the above description is intended to be illustrative and not restrictive. Many variations of the invention will be apparent to those of skill in the art upon reviewing the above description. By way of example, the invention has been described primarily with reference to the use of a high density oligonucleotide array, but it will be readily recognized by those of skill in the art that other nucleic acid arrays, other methods of measuring transcript levels and gene expression monitoring at the protein level could be used. The scope of the invention should, therefore, be determined not with reference to the above description, but should instead be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled.
Abstract
The invention relates to methods, compositions and apparatus for mapping regulatory relationships among genes by massively parallel monitoring of gene expression. In some embodiments, mutations in up-stream regulatory genes are detected by monitoring changes in the expression of down-stream genes. Similarly, the function of a specific mutation in an up-stream gene is determined by monitoring the expression of the down-stream gene. In addition, the regulatory function of a target gene can be determined by monitoring the expression of a large number of down-stream genes. The invention also relates to specific embodiments for detecting homozygous and heterozygous functional mutations of the p53 gene and for determining the function of p53 mutations.
Priority Applications (7)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU60356/98A AU6035698A (en) | 1997-01-13 | 1998-01-12 | Expression monitoring for gene function identification |
EP98903639A EP0973939A4 (fr) | 1997-01-13 | 1998-01-12 | Expression monitoring for gene function identification |
JP53129498A JP2001508303A (ja) | 1997-01-13 | 1998-01-12 | Expression monitoring for gene function identification |
US09/086,285 US6303301B1 (en) | 1997-01-13 | 1998-05-29 | Expression monitoring for gene function identification |
PCT/US1998/013949 WO1999002682A1 (fr) | 1997-07-09 | 1998-07-09 | p53 mutants found in tumors |
US09/836,278 US6733969B2 (en) | 1997-01-13 | 2001-04-18 | Expression monitoring for gene function identification |
US10/660,607 US20040058376A1 (en) | 1997-01-13 | 2003-09-12 | Expression monitoring for gene function identification |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US3532797P | 1997-01-13 | 1997-01-13 | |
US60/035,327 | 1997-01-13 | ||
US4962797P | 1997-06-13 | 1997-06-13 | |
US60/049,627 | 1997-06-13 |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US34130299A Continuation | 1997-01-13 | 1999-09-03 |
Related Child Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US09/086,285 Continuation-In-Part US6303301B1 (en) | 1997-01-13 | 1998-05-29 | Expression monitoring for gene function identification |
US09/911,856 A-371-Of-International US20020039739A1 (en) | 1997-01-13 | 2001-07-25 | Expression monitoring for gene function identification |
Publications (2)
Publication Number | Publication Date |
---|---|
WO1998030722A1 WO1998030722A1 (fr) | 1998-07-16 |
WO1998030722A9 true WO1998030722A9 (fr) | 1998-12-10 |
Family
ID=26711995
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US1998/001206 WO1998030722A1 (fr) | 1997-01-13 | 1998-01-12 | Expression monitoring for gene function identification |
Country Status (4)
Country | Link |
---|---|
EP (1) | EP0973939A4 (fr) |
JP (1) | JP2001508303A (fr) |
AU (1) | AU6035698A (fr) |
WO (1) | WO1998030722A1 (fr) |
Families Citing this family (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO1999002682A1 (fr) * | 1997-07-09 | 1999-01-21 | Affymetrix, Inc. | p53 mutants found in tumors |
US6838283B2 (en) | 1998-09-29 | 2005-01-04 | Isis Pharmaceuticals Inc. | Antisense modulation of survivin expression |
US6340565B1 (en) * | 1998-11-03 | 2002-01-22 | Affymetrix, Inc. | Determining signal transduction pathways |
US6258536B1 (en) * | 1998-12-01 | 2001-07-10 | Jonathan Oliner | Expression monitoring of downstream genes in the BRCA1 pathway |
JP3443039B2 (ja) * | 1999-06-21 | 2003-09-02 | 科学技術振興事業団 | Network estimation method and apparatus |
AU2001234455A1 (en) | 2000-01-14 | 2001-07-24 | Integriderm, L.L.C. | Informative nucleic acid arrays and methods for making same |
GB2373500B (en) * | 2000-02-04 | 2004-12-15 | Aeomica Inc | Methods and apparatus for predicting, confirming, and displaying functional information derived from genomic sequence |
EP1342782A4 (fr) * | 2000-11-13 | 2006-04-12 | Japan Science & Tech Agency | Gene network prediction method, prediction system, and recording medium |
EP1275734A1 (fr) * | 2001-07-11 | 2003-01-15 | Roche Diagnostics GmbH | Method for random cDNA synthesis and amplification |
EP1275738A1 (fr) * | 2001-07-11 | 2003-01-15 | Roche Diagnostics GmbH | Process for random cDNA synthesis and amplification |
EP1908851A3 (fr) * | 2001-09-19 | 2008-06-25 | Intergenetics Incorporated | Genetic analysis for stratification of cancer risk |
US20040002071A1 (en) * | 2001-09-19 | 2004-01-01 | Intergenetics, Inc. | Genetic analysis for stratification of cancer risk |
US20040229225A1 (en) * | 2003-05-16 | 2004-11-18 | Jose Remacle | Determination of a general three-dimensional status of a cell by multiple gene expression analysis on micro-arrays |
ATE389035T1 (de) * | 2005-09-13 | 2008-03-15 | Eppendorf Array Tech Sa | Method for detecting homologous sequences differing by one base on a microarray |
CA2733672C (fr) * | 2007-08-16 | 2018-09-11 | The Royal Institution For The Advancement Of Learning/Mcgill University | Tumor cell-derived microvesicles |
US20100255514A1 (en) | 2007-08-16 | 2010-10-07 | The Royal Institution For The Advancement Of Learning/Mcgill University | Tumor cell-derived microvesicles |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO1995019369A1 (fr) * | 1994-01-14 | 1995-07-20 | Vanderbilt University | Method for detecting and treating breast cancer |
1998
- 1998-01-12 JP JP53129498A patent/JP2001508303A/ja not_active Ceased
- 1998-01-12 WO PCT/US1998/001206 patent/WO1998030722A1/fr active Application Filing
- 1998-01-12 EP EP98903639A patent/EP0973939A4/fr not_active Ceased
- 1998-01-12 AU AU60356/98A patent/AU6035698A/en not_active Abandoned
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US6303301B1 (en) | Expression monitoring for gene function identification | |
US7122373B1 (en) | Human genes and gene expression products V | |
WO1998030722A9 (fr) | Expression monitoring for gene function identification | |
EP0973939A1 (fr) | Expression monitoring for gene function identification | |
US6340565B1 (en) | Determining signal transduction pathways | |
CA2848369A1 (fr) | Chimeric gene of the KIF5B gene and the RET gene, and method for determining the efficacy of a cancer treatment targeting the chimeric gene | |
US20050214824A1 (en) | Methods for monitoring the expression of alternatively spliced genes | |
US5945522A (en) | Prostate cancer gene | |
JP2008524986A (ja) | Genetic alterations useful for predicting the response of malignant tumors to taxane-based drug therapy | |
EP1006181A2 (fr) | Expression monitoring of downstream genes in the BRCA1 pathway | |
CN102317470A (zh) | Genetic variants contributing to prostate cancer risk | |
WO2002014500A2 (fr) | Human genes and gene expression products | |
US20030219768A1 (en) | Lung cancer therapeutics and diagnostics | |
US20150197816A1 (en) | Methods for the detection, visualization and high resolution physical mapping of genomic rearrangements in breast and ovarian cancer genes and loci brca1 and brca2 using genomic morse code in conjunction with molecular combing | |
US20060281126A1 (en) | Methods for monitoring the expression of alternatively spliced genes | |
WO2006124022A1 (fr) | Microarray gene expression profiling in renal cell carcinoma subtypes | |
US20020039739A1 (en) | Expression monitoring for gene function identification | |
US20030165931A1 (en) | Qualitative differential screening | |
JP2003528630A (ja) | Human genes and expression products | |
US20040161760A1 (en) | Method of molecular diagnosis of chronic myelogenous leukemia | |
US20100015620A1 (en) | Cancer-linked genes as biomarkers to monitor response to impdh inhibitors | |
CN108048562B (zh) | A DNA chip for SNPs in genome-wide non-coding regions in European populations | |
WO2003007801A2 (fr) | Diagnosis and treatment of vascular disease | |
Kipps | Genomic complexity in chronic lymphocytic leukemia | |
Mack et al. | Deciphering molecular circuitry using high-density DNA arrays |