MXPA96003077A - Comparative analysis of the transcription of the - Google Patents
Comparative analysis of the transcription of the
- Publication number
- MXPA96003077A (also listed as MXPA/A/1996/003077A and MX9603077A)
- Authority
- MX
- Mexico
- Prior art keywords
- sequences
- gene
- transcription
- library
- transcripts
- Prior art date
Links
- 230000035897 transcription Effects 0.000 title claims abstract description 106
- 238000010835 comparative analysis Methods 0.000 title description 14
- 229920002676 Complementary DNA Polymers 0.000 claims abstract description 78
- 238000004458 analytical method Methods 0.000 claims abstract description 73
- 239000002299 complementary DNA Substances 0.000 claims abstract description 72
- 230000000875 corresponding Effects 0.000 claims abstract description 22
- 229920000160 (ribonucleotides)n+m Polymers 0.000 claims abstract description 19
- 238000010191 image analysis Methods 0.000 claims abstract description 13
- 201000010099 disease Diseases 0.000 claims abstract description 11
- 102000004169 proteins and genes Human genes 0.000 claims description 53
- 108090000623 proteins and genes Proteins 0.000 claims description 53
- 108020004999 Messenger RNA Proteins 0.000 claims description 40
- 229920002106 messenger RNA Polymers 0.000 claims description 40
- 238000000034 method Methods 0.000 claims description 39
- 241000282414 Homo sapiens Species 0.000 claims description 38
- 210000001519 tissues Anatomy 0.000 claims description 36
- 229920001850 Nucleic acid sequence Polymers 0.000 claims description 14
- 239000000203 mixture Substances 0.000 claims description 10
- 210000004369 Blood Anatomy 0.000 claims description 4
- 239000008280 blood Substances 0.000 claims description 4
- 238000004519 manufacturing process Methods 0.000 claims description 3
- 241000282412 Homo Species 0.000 claims description 2
- 238000001574 biopsy Methods 0.000 claims description 2
- 210000003296 Saliva Anatomy 0.000 claims 1
- 210000002700 Urine Anatomy 0.000 claims 1
- 238000002405 diagnostic procedure Methods 0.000 claims 1
- 210000004027 cells Anatomy 0.000 description 128
- 101700038382 MARK1 Proteins 0.000 description 48
- 102100000538 MARK1 Human genes 0.000 description 48
- 229920003013 deoxyribonucleic acid Polymers 0.000 description 38
- 210000002540 Macrophages Anatomy 0.000 description 21
- 230000014509 gene expression Effects 0.000 description 21
- 241000282485 Vulpes vulpes Species 0.000 description 19
- 239000002253 acid Substances 0.000 description 16
- 238000007906 compression Methods 0.000 description 16
- 238000009396 hybridization Methods 0.000 description 15
- 241000894007 species Species 0.000 description 15
- 239000003814 drug Substances 0.000 description 14
- 239000000047 product Substances 0.000 description 14
- 238000011160 research Methods 0.000 description 14
- 229940079593 drugs Drugs 0.000 description 13
- 210000001616 Monocytes Anatomy 0.000 description 12
- 150000007513 acids Chemical class 0.000 description 12
- 239000000523 sample Substances 0.000 description 11
- 238000004590 computer program Methods 0.000 description 10
- 201000011510 cancer Diseases 0.000 description 9
- 239000003795 chemical substances by application Substances 0.000 description 9
- 238000009826 distribution Methods 0.000 description 9
- 208000006454 Hepatitis Diseases 0.000 description 8
- 210000004185 Liver Anatomy 0.000 description 8
- 241001465754 Metazoa Species 0.000 description 8
- 241000700159 Rattus Species 0.000 description 8
- 230000000208 anti-hepatitis Effects 0.000 description 8
- 231100000283 hepatitis Toxicity 0.000 description 8
- 101700046291 PACK Proteins 0.000 description 7
- 230000004913 activation Effects 0.000 description 7
- 230000003247 decreasing Effects 0.000 description 7
- 239000002158 endotoxin Substances 0.000 description 7
- 102000004190 Enzymes Human genes 0.000 description 6
- 108090000790 Enzymes Proteins 0.000 description 6
- 101700039639 PTN Proteins 0.000 description 6
- 238000010276 construction Methods 0.000 description 6
- 238000010586 diagram Methods 0.000 description 6
- 150000007523 nucleic acids Chemical class 0.000 description 6
- 238000000746 purification Methods 0.000 description 6
- 230000001105 regulatory Effects 0.000 description 6
- 210000000138 Mast Cells Anatomy 0.000 description 5
- 206010028980 Neoplasm Diseases 0.000 description 5
- 238000000636 Northern blotting Methods 0.000 description 5
- 102100009534 TNF Human genes 0.000 description 5
- 230000027455 binding Effects 0.000 description 5
- 230000000694 effects Effects 0.000 description 5
- 238000003780 insertion Methods 0.000 description 5
- 150000002500 ions Chemical class 0.000 description 5
- 108020004707 nucleic acids Proteins 0.000 description 5
- PHEDXBVPIONUQT-RGYGYFBISA-N 12-O-Tetradecanoylphorbol-13-acetate Chemical compound C([C@]1(O)C(=O)C(C)=C[C@H]1[C@@]1(O)[C@H](C)[C@H]2OC(=O)CCCCCCCCCCCCC)C(CO)=C[C@H]1[C@H]1[C@]2(OC(C)=O)C1(C)C PHEDXBVPIONUQT-RGYGYFBISA-N 0.000 description 4
- 229940110715 ENZYMES FOR TREATMENT OF WOUNDS AND ULCERS Drugs 0.000 description 4
- 230000036740 Metabolism Effects 0.000 description 4
- 101710040537 TNF Proteins 0.000 description 4
- 241000700605 Viruses Species 0.000 description 4
- 102000024070 binding proteins Human genes 0.000 description 4
- 108091007650 binding proteins Proteins 0.000 description 4
- UXVMQQNJUSDDNG-UHFFFAOYSA-L cacl2 Chemical compound [Cl-].[Cl-].[Ca+2] UXVMQQNJUSDDNG-UHFFFAOYSA-L 0.000 description 4
- 238000004422 calculation algorithm Methods 0.000 description 4
- 230000001413 cellular Effects 0.000 description 4
- 238000003745 diagnosis Methods 0.000 description 4
- 238000005516 engineering process Methods 0.000 description 4
- 230000002068 genetic Effects 0.000 description 4
- 230000004060 metabolic process Effects 0.000 description 4
- 230000035786 metabolism Effects 0.000 description 4
- 230000036961 partial Effects 0.000 description 4
- 230000004044 response Effects 0.000 description 4
- 231100000419 toxicity Toxicity 0.000 description 4
- 230000001988 toxicity Effects 0.000 description 4
- 241000894006 Bacteria Species 0.000 description 3
- 208000005623 Carcinogenesis Diseases 0.000 description 3
- 210000000349 Chromosomes Anatomy 0.000 description 3
- 238000001712 DNA sequencing Methods 0.000 description 3
- 210000002889 Endothelial Cells Anatomy 0.000 description 3
- 229940114721 Enzymes FOR DISORDERS OF THE MUSCULO-SKELETAL SYSTEM Drugs 0.000 description 3
- 229940093738 Enzymes for ALIMENTARY TRACT AND METABOLISM Drugs 0.000 description 3
- 102000014150 Interferons Human genes 0.000 description 3
- 108010050904 Interferons Proteins 0.000 description 3
- 210000004698 Lymphocytes Anatomy 0.000 description 3
- 229920000272 Oligonucleotide Polymers 0.000 description 3
- 108020004412 RNA 3' Polyadenylation Signals Proteins 0.000 description 3
- 241000580858 Simian-Human immunodeficiency virus Species 0.000 description 3
- 239000002246 antineoplastic agent Substances 0.000 description 3
- 229940019336 antithrombotic Enzymes Drugs 0.000 description 3
- 230000001580 bacterial Effects 0.000 description 3
- 230000015572 biosynthetic process Effects 0.000 description 3
- 238000010192 crystallographic characterization Methods 0.000 description 3
- 238000009795 derivation Methods 0.000 description 3
- 230000004069 differentiation Effects 0.000 description 3
- 239000000499 gel Substances 0.000 description 3
- PEDCQBHIVMGVHV-UHFFFAOYSA-N glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 description 3
- 229940020899 hematological Enzymes Drugs 0.000 description 3
- 229940079322 interferon Drugs 0.000 description 3
- 238000009114 investigational therapy Methods 0.000 description 3
- 238000002955 isolation Methods 0.000 description 3
- CPLXHLVBOLITMK-UHFFFAOYSA-N magnesium oxide Chemical compound [Mg]=O CPLXHLVBOLITMK-UHFFFAOYSA-N 0.000 description 3
- 239000002245 particle Substances 0.000 description 3
- 229940083249 peripheral vasodilators Enzymes Drugs 0.000 description 3
- 102000020437 poly(A) binding proteins Human genes 0.000 description 3
- 108091022174 poly(A) binding proteins Proteins 0.000 description 3
- 238000002360 preparation method Methods 0.000 description 3
- 239000011347 resin Substances 0.000 description 3
- 229920005989 resin Polymers 0.000 description 3
- 235000010384 tocopherol Nutrition 0.000 description 3
- 231100000027 toxicology Toxicity 0.000 description 3
- 235000019731 tricalcium phosphate Nutrition 0.000 description 3
- 230000003612 virological Effects 0.000 description 3
- 206010000880 Acute myeloid leukaemia Diseases 0.000 description 2
- 241000321096 Adenoides Species 0.000 description 2
- 210000002534 Adenoids Anatomy 0.000 description 2
- 108009000447 Amino Acid metabolism Proteins 0.000 description 2
- 229920001276 Ammonium polyphosphate Polymers 0.000 description 2
- 101700063952 BIT2 Proteins 0.000 description 2
- 102000020504 Collagenase family Human genes 0.000 description 2
- 108060005980 Collagenase family Proteins 0.000 description 2
- 210000000805 Cytoplasm Anatomy 0.000 description 2
- 239000003298 DNA probe Substances 0.000 description 2
- 241000196324 Embryophyta Species 0.000 description 2
- 229940088598 Enzyme Drugs 0.000 description 2
- 241000588724 Escherichia coli Species 0.000 description 2
- 241000233866 Fungi Species 0.000 description 2
- 108091000058 GTP-Binding Proteins Proteins 0.000 description 2
- 102100005413 GTPBP4 Human genes 0.000 description 2
- XKMLYUALXHKNFT-UUOKFMHZSA-N Guanosine-5'-triphosphate Chemical compound C1=2NC(N)=NC(=O)C=2N=CN1[C@@H]1O[C@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)[C@@H](O)[C@H]1O XKMLYUALXHKNFT-UUOKFMHZSA-N 0.000 description 2
- 102000008070 Interferon-gamma Human genes 0.000 description 2
- 108010074328 Interferon-gamma Proteins 0.000 description 2
- 206010024324 Leukaemias Diseases 0.000 description 2
- 210000004072 Lung Anatomy 0.000 description 2
- 206010025323 Lymphomas Diseases 0.000 description 2
- 241001237731 Microtia elva Species 0.000 description 2
- 101710016676 NUP145 Proteins 0.000 description 2
- 210000004940 Nucleus Anatomy 0.000 description 2
- 102000004264 Osteopontin Human genes 0.000 description 2
- 108010081689 Osteopontin Proteins 0.000 description 2
- 206010025310 Other lymphomas Diseases 0.000 description 2
- 101700016958 PCNT Proteins 0.000 description 2
- 102100017532 PCNT Human genes 0.000 description 2
- 101700061766 PLAC Proteins 0.000 description 2
- 101710032707 PTRH2 Proteins 0.000 description 2
- 102100017351 PTRH2 Human genes 0.000 description 2
- 235000010240 Paullinia pinnata Nutrition 0.000 description 2
- 241001119522 Paullinia pinnata Species 0.000 description 2
- 108091005771 Peptidases Proteins 0.000 description 2
- 206010057249 Phagocytosis Diseases 0.000 description 2
- 102000030951 Phosphotransferases Human genes 0.000 description 2
- 108091000081 Phosphotransferases Proteins 0.000 description 2
- 239000004365 Protease Substances 0.000 description 2
- 108010078762 Protein Precursors Proteins 0.000 description 2
- 102000014961 Protein Precursors Human genes 0.000 description 2
- 210000000952 Spleen Anatomy 0.000 description 2
- 102000019197 Superoxide Dismutase Human genes 0.000 description 2
- 108010012715 Superoxide Dismutase Proteins 0.000 description 2
- 210000001744 T-Lymphocytes Anatomy 0.000 description 2
- 230000003213 activating Effects 0.000 description 2
- 230000032683 aging Effects 0.000 description 2
- 230000037354 amino acid metabolism Effects 0.000 description 2
- 230000010056 antibody-dependent cellular cytotoxicity Effects 0.000 description 2
- 238000000376 autoradiography Methods 0.000 description 2
- 230000004071 biological effect Effects 0.000 description 2
- OKTJSMMVPCPJKN-UHFFFAOYSA-N carbon Chemical compound [C] OKTJSMMVPCPJKN-UHFFFAOYSA-N 0.000 description 2
- 230000003915 cell function Effects 0.000 description 2
- 238000006243 chemical reaction Methods 0.000 description 2
- VNFPBHJOKIVQEB-UHFFFAOYSA-N clotrimazole Chemical compound ClC1=CC=CC=C1C(N1C=NC=C1)(C=1C=CC=CC=1)C1=CC=CC=C1 VNFPBHJOKIVQEB-UHFFFAOYSA-N 0.000 description 2
- 229960002424 collagenase Drugs 0.000 description 2
- 210000004748 cultured cells Anatomy 0.000 description 2
- 238000001514 detection method Methods 0.000 description 2
- 238000011161 development Methods 0.000 description 2
- 230000018109 developmental process Effects 0.000 description 2
- 238000007877 drug screening Methods 0.000 description 2
- 239000012530 fluid Substances 0.000 description 2
- 238000005194 fractionation Methods 0.000 description 2
- 238000003306 harvesting Methods 0.000 description 2
- NTYJJOPFIAHURM-UHFFFAOYSA-N histamine Chemical compound NCCC1=CN=CN1 NTYJJOPFIAHURM-UHFFFAOYSA-N 0.000 description 2
- 230000000409 histolytic Effects 0.000 description 2
- 238000003384 imaging method Methods 0.000 description 2
- 238000000338 in vitro Methods 0.000 description 2
- 230000002757 inflammatory Effects 0.000 description 2
- 239000003112 inhibitor Substances 0.000 description 2
- 230000002401 inhibitory effect Effects 0.000 description 2
- 229960003130 interferon gamma Drugs 0.000 description 2
- 230000000670 limiting Effects 0.000 description 2
- 150000002632 lipids Chemical class 0.000 description 2
- 230000002934 lysing Effects 0.000 description 2
- 239000003550 marker Substances 0.000 description 2
- 239000000463 material Substances 0.000 description 2
- 239000011159 matrix material Substances 0.000 description 2
- 239000012528 membrane Substances 0.000 description 2
- 238000002156 mixing Methods 0.000 description 2
- 210000000056 organs Anatomy 0.000 description 2
- 230000008782 phagocytosis Effects 0.000 description 2
- 230000000144 pharmacologic effect Effects 0.000 description 2
- 229920000406 phosphotungstic acid polymer Polymers 0.000 description 2
- 238000003752 polymerase chain reaction Methods 0.000 description 2
- 229920003245 polyoctenamer Polymers 0.000 description 2
- OZAIFHULBGXAKX-UHFFFAOYSA-N precursor Substances N#CC(C)(C)N=NC(C)(C)C#N OZAIFHULBGXAKX-UHFFFAOYSA-N 0.000 description 2
- 238000004445 quantitative analysis Methods 0.000 description 2
- 108020003175 receptors Proteins 0.000 description 2
- 102000005962 receptors Human genes 0.000 description 2
- 238000005070 sampling Methods 0.000 description 2
- 238000005204 segregation Methods 0.000 description 2
- PUZPDOWCWNUUKD-UHFFFAOYSA-M sodium fluoride Chemical compound [F-].[Na+] PUZPDOWCWNUUKD-UHFFFAOYSA-M 0.000 description 2
- 238000003786 synthesis reaction Methods 0.000 description 2
- 230000002194 synthesizing Effects 0.000 description 2
- -1 tennisposide Chemical compound 0.000 description 2
- QGVLYPPODPLXMB-UBTYZVCOSA-N (1aR,1bS,4aR,7aS,7bS,8R,9R,9aS)-4a,7b,9,9a-tetrahydroxy-3-(hydroxymethyl)-1,1,6,8-tetramethyl-1,1a,1b,4,4a,7a,7b,8,9,9a-decahydro-5H-cyclopropa[3,4]benzo[1,2-e]azulen-5-one Chemical compound C1=C(CO)C[C@]2(O)C(=O)C(C)=C[C@H]2[C@@]2(O)[C@H](C)[C@@H](O)[C@@]3(O)C(C)(C)[C@H]3[C@@H]21 QGVLYPPODPLXMB-UBTYZVCOSA-N 0.000 description 1
- PMATZTZNYRCHOR-CGLBZJNRSA-N (3S,6S,9S,12R,15S,18S,21S,24S,30S,33S)-30-ethyl-33-[(E,1R,2R)-1-hydroxy-2-methylhex-4-enyl]-1,4,7,10,12,15,19,25,28-nonamethyl-6,9,18,24-tetrakis(2-methylpropyl)-3,21-di(propan-2-yl)-1,4,7,10,13,16,19,22,25,28,31-undecazacyclotritriacontane-2,5,8,11,14,17 Chemical compound CC[C@@H]1NC(=O)[C@H]([C@H](O)[C@H](C)C\C=C\C)N(C)C(=O)[C@H](C(C)C)N(C)C(=O)[C@H](CC(C)C)N(C)C(=O)[C@H](CC(C)C)N(C)C(=O)[C@@H](C)NC(=O)[C@H](C)NC(=O)[C@H](CC(C)C)N(C)C(=O)[C@H](C(C)C)NC(=O)[C@H](CC(C)C)N(C)C(=O)CN(C)C1=O PMATZTZNYRCHOR-CGLBZJNRSA-N 0.000 description 1
- IPVFGAYTKQKGBM-BYPJNBLXSA-N 1-[(2R,3S,4R,5R)-3-fluoro-4-hydroxy-5-(hydroxymethyl)oxolan-2-yl]-5-iodopyrimidine-2,4-dione Chemical compound F[C@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C(=O)NC(=O)C(I)=C1 IPVFGAYTKQKGBM-BYPJNBLXSA-N 0.000 description 1
- UGPMCIBIHRSCBV-XNBOLLIBSA-N 77591-33-4 Chemical compound N([C@@H](CC(O)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CC(C)C)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O)C(=O)[C@@H]1CCCN1C(=O)[C@H](CCCCN)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CO)NC(C)=O UGPMCIBIHRSCBV-XNBOLLIBSA-N 0.000 description 1
- 101700033661 ACTB Proteins 0.000 description 1
- 102100011550 ACTB Human genes 0.000 description 1
- 101710025911 ACTG1 Proteins 0.000 description 1
- 101700037218 ADCY1 Proteins 0.000 description 1
- 102100002263 ADCY1 Human genes 0.000 description 1
- 102100017390 ANKHD1 Human genes 0.000 description 1
- 101710013001 ANKHD1 Proteins 0.000 description 1
- 101700007619 AURKA Proteins 0.000 description 1
- 102100010552 AURKA Human genes 0.000 description 1
- 206010000871 Acute monocytic leukaemia Diseases 0.000 description 1
- 102000005130 Adenylosuccinate synthetase Human genes 0.000 description 1
- 241000014654 Adna Species 0.000 description 1
- 241000269328 Amphibia Species 0.000 description 1
- AVKUERGKIZMTKX-NJBDSQKTSA-N Ampicillin Chemical compound C1([C@@H](N)C(=O)N[C@H]2[C@H]3SC([C@@H](N3C2=O)C(O)=O)(C)C)=CC=CC=C1 AVKUERGKIZMTKX-NJBDSQKTSA-N 0.000 description 1
- 206010003210 Arteriosclerosis Diseases 0.000 description 1
- 210000003719 B-Lymphocytes Anatomy 0.000 description 1
- 241000349774 Bikinia letestui Species 0.000 description 1
- 210000001772 Blood Platelets Anatomy 0.000 description 1
- 241000283690 Bos taurus Species 0.000 description 1
- 210000004556 Brain Anatomy 0.000 description 1
- 210000004958 Brain cells Anatomy 0.000 description 1
- 101700055715 CCF1 Proteins 0.000 description 1
- 102100005858 CCNA2 Human genes 0.000 description 1
- 102100008186 CD83 Human genes 0.000 description 1
- 101700013105 CD83 Proteins 0.000 description 1
- 102100016531 CD9 Human genes 0.000 description 1
- 102100006400 CSF2 Human genes 0.000 description 1
- 229960003669 Carbenicillin Drugs 0.000 description 1
- FPPNZSSZRUTDAP-UWFZAAFLSA-N Carbenicillin Chemical compound N([C@H]1[C@H]2SC([C@@H](N2C1=O)C(O)=O)(C)C)C(=O)C(C(O)=O)C1=CC=CC=C1 FPPNZSSZRUTDAP-UWFZAAFLSA-N 0.000 description 1
- 102000004172 Cathepsin L Human genes 0.000 description 1
- 108090000624 Cathepsin L Proteins 0.000 description 1
- 102000000844 Cell Surface Receptors Human genes 0.000 description 1
- 108010001857 Cell Surface Receptors Proteins 0.000 description 1
- 241000282693 Cercopithecidae Species 0.000 description 1
- 102000001327 Chemokine CCL5 Human genes 0.000 description 1
- 108010055166 Chemokine CCL5 Proteins 0.000 description 1
- 102000015612 Complement 3b Receptors Human genes 0.000 description 1
- 108010024114 Complement 3b Receptors Proteins 0.000 description 1
- 108010062580 Concanavalin A Proteins 0.000 description 1
- 241000699800 Cricetinae Species 0.000 description 1
- 108010068192 Cyclin A Proteins 0.000 description 1
- 102000016736 Cyclins Human genes 0.000 description 1
- 108050006400 Cyclins Proteins 0.000 description 1
- 108010037462 Cyclooxygenase 2 Proteins 0.000 description 1
- 108010036949 Cyclosporine Proteins 0.000 description 1
- 108010017826 DNA Polymerase I Proteins 0.000 description 1
- 102000004594 DNA Polymerase I Human genes 0.000 description 1
- 108020003215 DNA Probes Proteins 0.000 description 1
- 101700011961 DPOM Proteins 0.000 description 1
- 241000347344 Dactylopteridae Species 0.000 description 1
- 102000019460 EC 4.6.1.1 Human genes 0.000 description 1
- 108060000200 EC 4.6.1.1 Proteins 0.000 description 1
- 102000008745 EC 6.1.1.- Human genes 0.000 description 1
- 108030004302 EC 6.1.1.- Proteins 0.000 description 1
- 108010056443 EC 6.3.4.4 Proteins 0.000 description 1
- 101710038747 EEF1A1 Proteins 0.000 description 1
- 101710005269 EEF1B2 Proteins 0.000 description 1
- 102100017665 EEF1B2 Human genes 0.000 description 1
- 102000033147 ERVK-25 Human genes 0.000 description 1
- 101700054838 ETS2 Proteins 0.000 description 1
- 102100016241 ETS2 Human genes 0.000 description 1
- 102000002045 Endothelin Human genes 0.000 description 1
- 108050009340 Endothelin Proteins 0.000 description 1
- ZUBDGKVDJUIMQQ-UBFCDGJISA-N Endothelin-1 Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(O)=O)NC(=O)[C@H]1NC(=O)[C@H](CC=2C=CC=CC=2)NC(=O)[C@@H](CC=2C=CC(O)=CC=2)NC(=O)[C@H](C(C)C)NC(=O)[C@H]2CSSC[C@@H](C(N[C@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@H](CC(C)C)C(=O)N[C@@H](CCSC)C(=O)N[C@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N2)=O)NC(=O)[C@@H](CO)NC(=O)[C@H](N)CSSC1)C1=CNC=N1 ZUBDGKVDJUIMQQ-UBFCDGJISA-N 0.000 description 1
- 210000003743 Erythrocytes Anatomy 0.000 description 1
- CYTYCFOTNPOANT-UHFFFAOYSA-N Ethylene tetrachloride Chemical compound ClC(Cl)=C(Cl)Cl CYTYCFOTNPOANT-UHFFFAOYSA-N 0.000 description 1
- VJJPUSNTGOMMGY-MRVIYFEKSA-N Etoposide Chemical compound COC1=C(O)C(OC)=CC([C@@H]2C3=CC=4OCOC=4C=C3[C@@H](O[C@H]3[C@@H]([C@@H](O)[C@@H]4O[C@H](C)OC[C@H]4O3)O)[C@@H]3[C@@H]2C(OC3)=O)=C1 VJJPUSNTGOMMGY-MRVIYFEKSA-N 0.000 description 1
- 229960005420 Etoposide Drugs 0.000 description 1
- 229920002760 Expressed sequence tag Polymers 0.000 description 1
- 102100006564 FAM107B Human genes 0.000 description 1
- 101710039027 FAM107B Proteins 0.000 description 1
- 101700016781 FER1 Proteins 0.000 description 1
- 101700077668 FTH1 Proteins 0.000 description 1
- 108010074864 Factor XI Proteins 0.000 description 1
- 102000009109 Fc receptors Human genes 0.000 description 1
- 108010087819 Fc receptors Proteins 0.000 description 1
- 108050000784 Ferritin Proteins 0.000 description 1
- 102000008857 Ferritin Human genes 0.000 description 1
- 238000008416 Ferritin Methods 0.000 description 1
- 102000016359 Fibronectins Human genes 0.000 description 1
- 108010067306 Fibronectins Proteins 0.000 description 1
- 102000002464 Galactosidases Human genes 0.000 description 1
- 108010093031 Galactosidases Proteins 0.000 description 1
- 241000287828 Gallus gallus Species 0.000 description 1
- 108010017213 Granulocyte-Macrophage Colony-Stimulating Factor Proteins 0.000 description 1
- 101710021379 H2AZ1 Proteins 0.000 description 1
- 102100019004 H2AZ1 Human genes 0.000 description 1
- 101710017819 H2AZ2 Proteins 0.000 description 1
- 102100019126 HBB Human genes 0.000 description 1
- 102100017445 HSPB1 Human genes 0.000 description 1
- 101700062887 HSPB1 Proteins 0.000 description 1
- 101700013735 HSPB3 Proteins 0.000 description 1
- 101710015954 HVA1 Proteins 0.000 description 1
- 241001190717 Hea Species 0.000 description 1
- 108091005902 Hemoglobin subunit beta Proteins 0.000 description 1
- 108020004996 Heterogeneous Nuclear RNA Proteins 0.000 description 1
- 102000017286 Histone H2A Human genes 0.000 description 1
- 108050005231 Histone H2A Proteins 0.000 description 1
- 208000008025 Hordeolum Diseases 0.000 description 1
- 229940088597 Hormone Drugs 0.000 description 1
- 210000003016 Hypothalamus Anatomy 0.000 description 1
- 101700078680 ICMT Proteins 0.000 description 1
- 102000009438 IgE Receptors Human genes 0.000 description 1
- 108010073816 IgE Receptors Proteins 0.000 description 1
- 206010061218 Inflammation Diseases 0.000 description 1
- 108010002352 Interleukin-1 Proteins 0.000 description 1
- 108090000978 Interleukin-4 Proteins 0.000 description 1
- 102000004388 Interleukin-4 Human genes 0.000 description 1
- 229940028885 Interleukin-4 Drugs 0.000 description 1
- 108090001007 Interleukin-8 Proteins 0.000 description 1
- 108010063738 Interleukins Proteins 0.000 description 1
- 102000015696 Interleukins Human genes 0.000 description 1
- 210000004020 Intracellular Membrane Anatomy 0.000 description 1
- 241000152160 Ira Species 0.000 description 1
- 101700038758 KIFC1 Proteins 0.000 description 1
- 102100006550 KIFC1 Human genes 0.000 description 1
- 101700014355 KINUC Proteins 0.000 description 1
- 102000003855 L-lactate dehydrogenases Human genes 0.000 description 1
- 108091000084 L-lactate dehydrogenases Proteins 0.000 description 1
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 1
- 125000000510 L-tryptophano group Chemical group [H]C1=C([H])C([H])=C2N([H])C([H])=C(C([H])([H])[C@@]([H])(C(O[H])=O)N([H])[*])C2=C1[H] 0.000 description 1
- 101700065814 LEA2 Proteins 0.000 description 1
- 101700021338 LEC Proteins 0.000 description 1
- 101700077545 LECC Proteins 0.000 description 1
- 101700028499 LECG Proteins 0.000 description 1
- 101700063913 LECT Proteins 0.000 description 1
- 229920000126 Latex Polymers 0.000 description 1
- 208000009721 Leukemia, Monocytic, Acute Diseases 0.000 description 1
- 208000007046 Leukemia, Myeloid, Acute Diseases 0.000 description 1
- 241000408529 Libra Species 0.000 description 1
- 210000000088 Lip Anatomy 0.000 description 1
- 206010024855 Loss of consciousness Diseases 0.000 description 1
- 101710029649 MDV043 Proteins 0.000 description 1
- 210000003470 Mitochondria Anatomy 0.000 description 1
- 229960004857 Mitomycin Drugs 0.000 description 1
- 102000016943 Muramidase Human genes 0.000 description 1
- 108010014251 Muramidase Proteins 0.000 description 1
- 210000002464 Muscle, Smooth, Vascular Anatomy 0.000 description 1
- NWIBSHFKIJFRCO-WUDYKRTCSA-N Mytomycin Chemical compound C1N2C(C(C(C)=C(N)C3=O)=O)=C3[C@@H](COC(N)=O)[C@@]2(OC)[C@@H]2[C@H]1N2 NWIBSHFKIJFRCO-WUDYKRTCSA-N 0.000 description 1
- 210000004279 Orbit Anatomy 0.000 description 1
- 241000283973 Oryctolagus cuniculus Species 0.000 description 1
- 101710034340 Os04g0173800 Proteins 0.000 description 1
- 241000237502 Ostreidae Species 0.000 description 1
- 210000003101 Oviducts Anatomy 0.000 description 1
- 241000283898 Ovis Species 0.000 description 1
- 108090000854 Oxidoreductases Proteins 0.000 description 1
- 102000004316 Oxidoreductases Human genes 0.000 description 1
- 101710016419 PBC1 Proteins 0.000 description 1
- 101700050350 PER33 Proteins 0.000 description 1
- 101700061424 POLB Proteins 0.000 description 1
- 102100000052 PPARGC1B Human genes 0.000 description 1
- 101710027641 PPARGC1B Proteins 0.000 description 1
- 101700033676 PRCT Proteins 0.000 description 1
- 101700011927 PRK1 Proteins 0.000 description 1
- 102100009930 PRNT Human genes 0.000 description 1
- 101700059760 PRNT Proteins 0.000 description 1
- 101700009419 PTEN Proteins 0.000 description 1
- 102100015381 PTGS2 Human genes 0.000 description 1
- 210000000496 Pancreas Anatomy 0.000 description 1
- 102000035443 Peptidases Human genes 0.000 description 1
- QGVLYPPODPLXMB-MEQJGUAMSA-N Phorbol Natural products O=C1C(C)=C[C@H]2[C@]3(O)[C@H](C)[C@H](O)[C@@]4(O)C(C)(C)[C@@H]4[C@@H]3C=C(CO)C[C@]12O QGVLYPPODPLXMB-MEQJGUAMSA-N 0.000 description 1
- 210000002826 Placenta Anatomy 0.000 description 1
- 208000002151 Pleural Effusion Diseases 0.000 description 1
- 229920000582 Polyisocyanurate Polymers 0.000 description 1
- 229940082622 Prostaglandin cardiac therapy preparations Drugs 0.000 description 1
- 229940077717 Prostaglandin drugs for peptic ulcer and gastro-oesophageal reflux disease (GORD) Drugs 0.000 description 1
- 101700080571 RAB1A Proteins 0.000 description 1
- 102100003118 RAB1A Human genes 0.000 description 1
- 210000003324 RBC Anatomy 0.000 description 1
- 101700054624 RF1 Proteins 0.000 description 1
- 102000009572 RNA Polymerase II Human genes 0.000 description 1
- 108010009460 RNA Polymerase II Proteins 0.000 description 1
- 102100018107 RPL7 Human genes 0.000 description 1
- 101710027129 RPL7 Proteins 0.000 description 1
- 102100017329 RPS20 Human genes 0.000 description 1
- 102100011252 RPS25 Human genes 0.000 description 1
- 101710008325 RPS25 Proteins 0.000 description 1
- 102100012492 RPS8 Human genes 0.000 description 1
- 108020004511 Recombinant DNA Proteins 0.000 description 1
- 229920000970 Repeated sequence (DNA) Polymers 0.000 description 1
- 210000003705 Ribosomes Anatomy 0.000 description 1
- 101700054751 SD17 Proteins 0.000 description 1
- 108060007534 SIGB Proteins 0.000 description 1
- 102100006165 SRP9 Human genes 0.000 description 1
- 108060007880 SRP9 Proteins 0.000 description 1
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 1
- 241000426682 Salinispora Species 0.000 description 1
- WTGQALLALWYDJH-WYHSTMEOSA-N Scopolamine hydrobromide Chemical compound Br.C1([C@@H](CO)C(=O)OC2C[C@@H]3N([C@H](C2)[C@@H]2[C@H]3O2)C)=CC=CC=C1 WTGQALLALWYDJH-WYHSTMEOSA-N 0.000 description 1
- 102000013598 Signal recognition particle protein Human genes 0.000 description 1
- 108010051611 Signal recognition particle protein Proteins 0.000 description 1
- 210000002356 Skeleton Anatomy 0.000 description 1
- 210000002784 Stomach Anatomy 0.000 description 1
- 241000282898 Sus scrofa Species 0.000 description 1
- 102100018596 TEP1 Human genes 0.000 description 1
- 101700004103 TEP1 Proteins 0.000 description 1
- 102100008904 TFRC Human genes 0.000 description 1
- 102100011047 TMSB4X Human genes 0.000 description 1
- 108060001606 TRM1 Proteins 0.000 description 1
- 102100017477 TUBB Human genes 0.000 description 1
- 101710025662 TUBB Proteins 0.000 description 1
- NKANXQFJJICGDU-QPLCGJKRSA-N Tamoxifen Chemical compound C=1C=CC=CC=1C(/CC)=C(C=1C=CC(OCCN(C)C)=CC=1)/C1=CC=CC=C1 NKANXQFJJICGDU-QPLCGJKRSA-N 0.000 description 1
- 229960001603 Tamoxifen Drugs 0.000 description 1
- 108010006785 Taq Polymerase Proteins 0.000 description 1
- 108010046075 Thymosin Proteins 0.000 description 1
- 102000007501 Thymosin Human genes 0.000 description 1
- 231100000765 Toxin Toxicity 0.000 description 1
- 108010033576 Transferrin Receptors Proteins 0.000 description 1
- 239000007983 Tris buffer Substances 0.000 description 1
- 108060005989 Tryptase family Proteins 0.000 description 1
- 102000001400 Tryptase family Human genes 0.000 description 1
- 108010001801 Tumor Necrosis Factor-alpha Proteins 0.000 description 1
- 206010054094 Tumour necrosis Diseases 0.000 description 1
- 241001584775 Tunga penetrans Species 0.000 description 1
- 101710007061 UAC1 Proteins 0.000 description 1
- 108010075202 UDPglucose 4-Epimerase Proteins 0.000 description 1
- 108090000848 Ubiquitin Proteins 0.000 description 1
- 102400000757 Ubiquitin Human genes 0.000 description 1
- 210000003606 Umbilical Veins Anatomy 0.000 description 1
- 102000006943 Uracil-DNA Glycosidase Human genes 0.000 description 1
- 108010072685 Uracil-DNA Glycosidase Proteins 0.000 description 1
- 101700071792 VA3 Proteins 0.000 description 1
- 101700068531 VA5 Proteins 0.000 description 1
- 101700009998 VA52 Proteins 0.000 description 1
- 101700066103 VAL5 Proteins 0.000 description 1
- 210000003934 Vacuoles Anatomy 0.000 description 1
- 229960003048 Vinblastine Drugs 0.000 description 1
- HOFQVRTUGATRFI-XQKSVPLYSA-N Vinblastine Chemical compound C([C@@H](C[C@]1(C(=O)OC)C=2C(=CC3=C([C@]45[C@H]([C@@]([C@H](OC(C)=O)[C@]6(CC)C=CCN([C@H]56)CC4)(O)C(=O)OC)N3C)C=2)OC)C[C@@](C2)(O)CC)N2CCC2=C1N=C1[C]2C=CC=C1 HOFQVRTUGATRFI-XQKSVPLYSA-N 0.000 description 1
- JXLYSJRDGCGARV-WWYNWVTFSA-N Vinblastine Natural products O=C(O[C@H]1[C@](O)(C(=O)OC)[C@@H]2N(C)c3c(cc(c(OC)c3)[C@]3(C(=O)OC)c4[nH]c5c(c4CCN4C[C@](O)(CC)C[C@H](C3)C4)cccc5)[C@@]32[C@H]2[C@@]1(CC)C=CCN2CC3)C JXLYSJRDGCGARV-WWYNWVTFSA-N 0.000 description 1
- 229960004528 Vincristine Drugs 0.000 description 1
- 101700052871 YPT1 Proteins 0.000 description 1
- 101700078629 YWP1 Proteins 0.000 description 1
- 241000607479 Yersinia pestis Species 0.000 description 1
- 210000001789 adipocyte Anatomy 0.000 description 1
- 230000001919 adrenal Effects 0.000 description 1
- 231100000494 adverse effect Toxicity 0.000 description 1
- 229920002892 amber Polymers 0.000 description 1
- 150000001413 amino acids Chemical group 0.000 description 1
- 108010033244 aminocaproate esterase Proteins 0.000 description 1
- 229960000723 ampicillin Drugs 0.000 description 1
- 230000003321 amplification Effects 0.000 description 1
- 230000019552 anatomical structure morphogenesis Effects 0.000 description 1
- 238000010171 animal model Methods 0.000 description 1
- 239000003957 anion exchange resin Substances 0.000 description 1
- 230000003042 antagnostic Effects 0.000 description 1
- 239000005557 antagonist Substances 0.000 description 1
- 108090001123 antibodies Proteins 0.000 description 1
- 102000004965 antibodies Human genes 0.000 description 1
- 101700018145 ark1 Proteins 0.000 description 1
- 230000000712 assembly Effects 0.000 description 1
- 244000052616 bacterial pathogens Species 0.000 description 1
- 102000006635 beta-Lactamases Human genes 0.000 description 1
- 108020004256 beta-Lactamases Proteins 0.000 description 1
- 238000009534 blood test Methods 0.000 description 1
- 108010054191 butyrylesterase Proteins 0.000 description 1
- 238000010804 cDNA synthesis Methods 0.000 description 1
- 238000005251 capillar electrophoresis Methods 0.000 description 1
- 238000001818 capillary gel electrophoresis Methods 0.000 description 1
- 210000000748 cardiovascular system Anatomy 0.000 description 1
- 230000015556 catabolic process Effects 0.000 description 1
- 239000003054 catalyst Substances 0.000 description 1
- 230000021164 cell adhesion Effects 0.000 description 1
- 238000004113 cell culture Methods 0.000 description 1
- 230000033077 cellular process Effects 0.000 description 1
- 238000005119 centrifugation Methods 0.000 description 1
- 239000003153 chemical reaction reagent Substances 0.000 description 1
- 108010004215 chloroacetate esterase Proteins 0.000 description 1
- 229960001265 ciclosporin Drugs 0.000 description 1
- 229960004316 cisplatin Drugs 0.000 description 1
- 238000010224 classification analysis Methods 0.000 description 1
- 238000007374 clinical diagnostic method Methods 0.000 description 1
- 230000000295 complement Effects 0.000 description 1
- 150000001875 compounds Chemical class 0.000 description 1
- 230000000093 cytochemical Effects 0.000 description 1
- 230000002380 cytological Effects 0.000 description 1
- 230000009089 cytolysis Effects 0.000 description 1
- 230000001086 cytosolic Effects 0.000 description 1
- 230000004059 degradation Effects 0.000 description 1
- 238000006731 degradation reaction Methods 0.000 description 1
- 230000001419 dependent Effects 0.000 description 1
- 238000001784 detoxification Methods 0.000 description 1
- 229940000406 drug candidates Drugs 0.000 description 1
- 238000009509 drug development Methods 0.000 description 1
- 239000003596 drug target Substances 0.000 description 1
- 238000001962 electrophoresis Methods 0.000 description 1
- 230000002255 enzymatic Effects 0.000 description 1
- 230000002327 eosinophilic Effects 0.000 description 1
- 210000000267 erythroid cells Anatomy 0.000 description 1
- 239000003777 experimental drug Substances 0.000 description 1
- 238000000605 extraction Methods 0.000 description 1
- 230000001605 fetal Effects 0.000 description 1
- 210000002950 fibroblast Anatomy 0.000 description 1
- 230000037320 fibronectin Effects 0.000 description 1
- 239000012467 final product Substances 0.000 description 1
- 108060002971 flz Proteins 0.000 description 1
- 238000005755 formation reaction Methods 0.000 description 1
- 238000010230 functional analysis Methods 0.000 description 1
- 201000003928 fungal infectious disease Diseases 0.000 description 1
- 108020001507 fusion proteins Proteins 0.000 description 1
- 102000037240 fusion proteins Human genes 0.000 description 1
- 238000001502 gel electrophoresis Methods 0.000 description 1
- 230000002518 glial Effects 0.000 description 1
- 239000008187 granular material Substances 0.000 description 1
- 230000035876 healing Effects 0.000 description 1
- 231100000304 hepatotoxicity Toxicity 0.000 description 1
- 229960001340 histamine Drugs 0.000 description 1
- 239000005556 hormone Substances 0.000 description 1
- 230000002209 hydrophobic Effects 0.000 description 1
- 230000036039 immunity Effects 0.000 description 1
- 239000000367 immunologic factor Substances 0.000 description 1
- 230000004054 inflammatory process Effects 0.000 description 1
- 229910052500 inorganic mineral Inorganic materials 0.000 description 1
- 230000003834 intracellular Effects 0.000 description 1
- 230000004068 intracellular signaling Effects 0.000 description 1
- 230000001788 irregular Effects 0.000 description 1
- KFZMGEQAYNKOFK-UHFFFAOYSA-N iso-propanol Chemical compound CC(C)O KFZMGEQAYNKOFK-UHFFFAOYSA-N 0.000 description 1
- 125000001261 isocyanato group Chemical group *N=C=O 0.000 description 1
- 101700023298 katG Proteins 0.000 description 1
- 238000009533 lab test Methods 0.000 description 1
- 239000004816 latex Substances 0.000 description 1
- 101700036391 lecA Proteins 0.000 description 1
- 239000002523 lectin Substances 0.000 description 1
- 150000002617 leukotrienes Chemical class 0.000 description 1
- 239000003446 ligand Substances 0.000 description 1
- 230000037356 lipid metabolism Effects 0.000 description 1
- WHXSMMKQMYFTQS-UHFFFAOYSA-N lithium Chemical compound [Li] WHXSMMKQMYFTQS-UHFFFAOYSA-N 0.000 description 1
- 229910052744 lithium Inorganic materials 0.000 description 1
- 201000009673 liver disease Diseases 0.000 description 1
- 229960000274 lysozyme Drugs 0.000 description 1
- 239000004325 lysozyme Substances 0.000 description 1
- 235000010335 lysozyme Nutrition 0.000 description 1
- 230000003211 malignant Effects 0.000 description 1
- 238000004949 mass spectrometry Methods 0.000 description 1
- 101700001016 mbhA Proteins 0.000 description 1
- 238000005259 measurement Methods 0.000 description 1
- 230000001404 mediated Effects 0.000 description 1
- 230000003278 mimic Effects 0.000 description 1
- 239000011707 mineral Substances 0.000 description 1
- 235000010755 mineral Nutrition 0.000 description 1
- 230000002438 mitochondrial Effects 0.000 description 1
- 230000011278 mitosis Effects 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 238000006011 modification reaction Methods 0.000 description 1
- 238000001823 molecular biology technique Methods 0.000 description 1
- 238000010369 molecular cloning Methods 0.000 description 1
- 230000000877 morphologic Effects 0.000 description 1
- 230000004899 motility Effects 0.000 description 1
- 108010066052 multidrug resistance-associated protein 1 Proteins 0.000 description 1
- 230000035772 mutation Effects 0.000 description 1
- XIVJNRXPRQKFRZ-UHFFFAOYSA-N naphthalen-1-yl butanoate Chemical compound C1=CC=C2C(OC(=O)CCC)=CC=CC2=C1 XIVJNRXPRQKFRZ-UHFFFAOYSA-N 0.000 description 1
- 239000002547 new drug Substances 0.000 description 1
- 108010076457 nonmuscle myosin type IIB heavy chain Proteins 0.000 description 1
- 238000003199 nucleic acid amplification method Methods 0.000 description 1
- 239000002773 nucleotide Substances 0.000 description 1
- 125000003729 nucleotide group Chemical group 0.000 description 1
- 235000016709 nutrition Nutrition 0.000 description 1
- 230000035764 nutrition Effects 0.000 description 1
- 101700050775 oct-1 Proteins 0.000 description 1
- 210000004789 organ systems Anatomy 0.000 description 1
- 230000001590 oxidative Effects 0.000 description 1
- 230000010627 oxidative phosphorylation Effects 0.000 description 1
- 229940094443 oxytocics Prostaglandins Drugs 0.000 description 1
- 235000020636 oyster Nutrition 0.000 description 1
- 230000000242 pagocytic Effects 0.000 description 1
- 239000011886 peripheral blood Substances 0.000 description 1
- 210000001539 phagocyte Anatomy 0.000 description 1
- 230000000090 phagocyte Effects 0.000 description 1
- 229930002200 phorbol Natural products 0.000 description 1
- 230000000704 physical effect Effects 0.000 description 1
- 230000035479 physiological effects, processes and functions Effects 0.000 description 1
- 230000004977 physiological function Effects 0.000 description 1
- 230000035790 physiological processes and functions Effects 0.000 description 1
- 239000000902 placebo Substances 0.000 description 1
- 239000004033 plastic Substances 0.000 description 1
- 229920003023 plastic Polymers 0.000 description 1
- 101700049409 pnt Proteins 0.000 description 1
- 229930001140 podophyllotoxin Natural products 0.000 description 1
- 229920000642 polymer Polymers 0.000 description 1
- 229920001184 polypeptide Polymers 0.000 description 1
- 230000003389 potentiating Effects 0.000 description 1
- 125000002924 primary amino group Chemical group [H]N([H])* 0.000 description 1
- 238000007639 printing Methods 0.000 description 1
- 230000035755 proliferation Effects 0.000 description 1
- 150000003180 prostaglandins Chemical class 0.000 description 1
- 230000020978 protein processing Effects 0.000 description 1
- 230000003161 proteinsynthetic Effects 0.000 description 1
- 230000002829 reduced Effects 0.000 description 1
- 230000003252 repetitive Effects 0.000 description 1
- 108010092942 ribosomal protein S20 Proteins 0.000 description 1
- 108010033800 ribosomal protein S8 Proteins 0.000 description 1
- 101710007020 rut Proteins 0.000 description 1
- 238000010187 selection method Methods 0.000 description 1
- 230000001568 sexual Effects 0.000 description 1
- 231100000486 side effect Toxicity 0.000 description 1
- 238000002741 site-directed mutagenesis Methods 0.000 description 1
- 239000011775 sodium fluoride Substances 0.000 description 1
- 235000013024 sodium fluoride Nutrition 0.000 description 1
- 239000000243 solution Substances 0.000 description 1
- 238000003860 storage Methods 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
- 239000006228 supernatant Substances 0.000 description 1
- 230000001225 therapeutic Effects 0.000 description 1
- 108010079996 thymosin beta(4) Proteins 0.000 description 1
- 230000002588 toxic Effects 0.000 description 1
- 231100000331 toxic Toxicity 0.000 description 1
- 230000002110 toxicologic Effects 0.000 description 1
- 231100000041 toxicology testing Toxicity 0.000 description 1
- 239000003053 toxin Substances 0.000 description 1
- 108020003112 toxins Proteins 0.000 description 1
- 230000014723 transformation of host cell by virus Effects 0.000 description 1
- 230000001131 transforming Effects 0.000 description 1
- LXZZYRPGZAFOLE-UHFFFAOYSA-L transplatin Chemical compound [H][N]([H])([H])[Pt](Cl)(Cl)[N]([H])([H])[H] LXZZYRPGZAFOLE-UHFFFAOYSA-L 0.000 description 1
- 239000000717 tumor promoter Substances 0.000 description 1
- 241001515965 unidentified phage Species 0.000 description 1
- 238000010200 validation analysis Methods 0.000 description 1
- 230000002792 vascular Effects 0.000 description 1
- OGWKCGZFUXNPDA-XQKSVPLYSA-N vincristine Chemical compound C([N@]1C[C@@H](C[C@]2(C(=O)OC)C=3C(=CC4=C([C@]56[C@H]([C@@]([C@H](OC(C)=O)[C@]7(CC)C=CCN([C@H]67)CC5)(O)C(=O)OC)N4C=O)C=3)OC)C[C@@](C1)(O)CC)CC1=C2NC2=CC=CC=C12 OGWKCGZFUXNPDA-XQKSVPLYSA-N 0.000 description 1
- 238000009333 weeding Methods 0.000 description 1
- 238000001262 western blot Methods 0.000 description 1
Abstract
A method and system for quantifying the relative abundance of gene transcripts of a biological specimen. One embodiment of the method generates high throughput analysis of specific sequences of multiple RNAs or their corresponding cDNAs (gene transcription image analysis). Another embodiment of the method produces an image analysis of gene transcription through the use of high throughput analysis of cDNA sequences. In addition, projection of gene transcription images can be used to detect or diagnose a particular condition, disease or biological condition that correlates with the relative abundance of gene transcripts in a given cell or cell population. The invention provides a method for comparing the transcription image analysis of genes from two or more different biological specimens in order to distinguish between the two specimens and to identify one or more genes that are differentially expressed between the two specimens.
Description
COMPARATIVE ANALYSIS OF GENETIC TRANSCRIPTION
1. FIELD OF THE INVENTION The present invention pertains to the field of molecular biology and computer science; more particularly, the present invention describes methods for analyzing gene transcripts and diagnosing the genetic expression of cells and tissues.
2. BACKGROUND OF THE INVENTION Until very recently, molecular biology proceeded one gene at a time. Scientists observed the physical changes of the cell, isolated mixtures from the cell or its medium, purified proteins, sequenced the proteins, and from this constructed probes to search for the corresponding gene. Recently, several countries have launched massive projects to sequence the billions of bases in the human genome. These projects typically begin by dividing the genome into large portions of chromosomes and then determining the sequences of these pieces, which are then analyzed for identity with known proteins or portions of them, known as motifs. Unfortunately, most genomic DNA does not encode proteins, and although it is postulated that it has some effect on the ability of the cell to make protein, its importance in medical applications is currently not understood.
A third methodology involves sequencing only the transcripts that encode the cellular machinery actively involved in making protein, namely mRNA.
The advantage is that the cell has already edited out all the noncoding DNA, and it is relatively easy to identify the protein-coding portion of the RNA. The usefulness of this approach was not immediately obvious to genome researchers. In fact, when cDNA sequencing was initially proposed, the method was strongly denounced by those in charge of genome sequencing. For example, the leader of the United States Human Genome Project dismissed cDNA sequencing as not valuable and refused to approve funding for such projects.
In this description, we show methods for analyzing DNA, including cDNA libraries. Based on our analysis and research, we see each individual gene product as a "pixel" of information, which relates to the expression of that, and only that, gene. We show herein methods by which individual "pixels" of gene expression information can be combined into a single gene transcription "image", in which each of the individual genes can be visualized simultaneously and the relationships between the gene pixels can be easily visualized and understood.
We also show a new method that we call electronic subtraction. Electronic subtraction allows the gene researcher to turn a single image into a moving picture, one that describes the temporality or dynamics of gene expression at the level of a cell or a whole tissue. It is this sense of "movement" of the cellular machinery on the scale of a cell or organ that constitutes the new invention herein. This constitutes a new view of the physiology of the living cell, one that holds great promise for revealing new therapeutic and diagnostic approaches in medicine.
We show another method, which we call the "electronic Northern", that tracks the expression of a single gene across many types of cells and tissues.
Nucleic acids (DNA and RNA) carry the hereditary information within their sequence and are therefore the primary molecules of life. Nucleic acids are found in all living organisms, including bacteria, fungi, viruses, plants and animals. It is of interest to determine the relative abundance of different discrete nucleic acids in different cells, tissues and organisms over time under different conditions, treatments and regimens.
All dividing cells in the human body contain the same set of 23 pairs of chromosomes.
It is estimated that these autosomal and sex chromosomes encode approximately 100,000 genes. It is believed that the differences between cell types reflect the differential expression of these approximately 100,000 genes. Fundamental questions of biology could be answered by understanding which genes are transcribed and knowing the abundance of the transcripts in different cells.
Previously, techniques were available only to analyze a few known genes at a time, by standard molecular biology techniques such as polymerase chain reaction (PCR), Northern blot analysis, or other types of DNA probe analysis such as in situ hybridization. Each of these methods allows one to analyze the transcription of only known genes and/or a small number of genes at a time. Nucl. Acids Res. 19, 7097-7104 (1991); Nucl. Acids Res. 18, 4833-42 (1990); Nucl. Acids Res. 18, 2789-92 (1989); European J. Neuroscience 2, 1063-1973 (1990); Analytical Biochem. 187, 364-73 (1990); Genet. Anal. Techn. Appl. 7, 64-70 (1990); GATA 8(4), 129-33 (1991); Proc. Natl. Acad. Sci. USA 85, 1696-1700 (1988); Nucl. Acids Res. 19, 1954 (1991); Proc. Natl. Acad. Sci. USA 88, 1943-47 (1991); Nucl. Acids Res. 19, 6123-27 (1991); Proc. Natl. Acad. Sci. USA 85, 5738-42 (1988); Nucl. Acids Res. 16, 10937 (1988).
Studies of the number and types of genes whose transcription is induced or otherwise regulated during cellular processes such as activation, differentiation, aging, viral transformation, morphogenesis, and mitosis have been pursued for many years, using a variety of methodologies. One of the oldest methods was to isolate and analyze the levels of proteins in a cell, tissue, organ system, or even organism before and after the process of interest. One method for analyzing multiple proteins in a sample is 2-dimensional gel electrophoresis, in which the proteins can, in principle, be identified and quantified as individual bands, and finally reduced to a discrete signal. Currently, 2-dimensional analysis resolves only approximately 15 percent of proteins. In order to positively analyze those bands that are resolved, each band must be excised from the membrane and subjected to protein sequence analysis using Edman degradation. Unfortunately, most of the bands were present in quantities too small to obtain a reliable sequence, and many of those bands contained more than one discrete protein. An additional difficulty is that many of the proteins were blocked at the amino terminus, further complicating the sequencing process.
Analyzing differentiation at the level of gene transcription has overcome many of these disadvantages and drawbacks, since the power of recombinant DNA technology allows the amplification of signals that contain very small amounts of material. The most common method, called "hybridization subtraction", involves isolating mRNA from the biological specimen before (B) and after (A) the developmental process of interest, transcribing one set of mRNA into cDNA, subtracting specimen B from specimen A (mRNA from cDNA) by hybridization, and constructing a cDNA library from the non-hybridizing mRNA fraction. Many different groups have used this strategy successfully, and a variety of procedures have been published and improved using the same basic scheme. Nucl. Acids Res. 19, 7097-7104 (1991); Nucl. Acids Res. 18, 4833-42 (1990); Nucl. Acids Res. 18, 2789-92 (1989); European J. Neuroscience 2, 1063-1973 (1990); Analytical Biochem. 187, 364-73 (1990); Genet. Anal. Techn. Appl. 7, 64-70 (1990); GATA 8(4), 129-33 (1991); Proc. Natl. Acad. Sci. USA 85, 1696-1700 (1988); Nucl. Acids Res. 19, 1954 (1991); Proc. Natl. Acad. Sci. USA 88, 1943-47 (1991); Nucl. Acids Res. 19, 6123-27 (1991); Proc. Natl. Acad. Sci. USA 85, 5738-42 (1988); Nucl. Acids Res. 16, 10937 (1988).
Although each of these techniques has particular strengths and weaknesses, these methods still have some limitations and undesirable aspects. First, the time and effort required to build these libraries is quite large. Typically, a trained molecular biologist might expect the construction and characterization of such a library to require 3 to 6 months, depending on the level of skill, experience, and luck. Second, subtraction libraries are typically inferior to libraries constructed using standard methodology. A typical conventional cDNA library should have a clone complexity of at least 10^6 clones and an average insert size of 1-3 kb. In contrast, subtraction libraries can have complexities of 10^2 or 10^3 and average insert sizes of 0.2 kb. Therefore, there may be a significant loss of clones and sequence information associated with those libraries. Third, this approach allows the researcher to capture only the genes induced in specimen A relative to specimen B, not vice versa, nor does it easily allow comparison with a third specimen of interest (C). Fourth, this approach requires large quantities (hundreds of micrograms) of "driver" mRNA (specimen B), which significantly limits the number and type of subtractions that are possible, since many tissues and cells are very difficult to obtain in large quantities.
Fifth, the resolution of the subtraction depends on the physical properties of DNA:DNA or RNA:DNA hybridization. The ability of a given sequence to find a hybridization match depends on its unique CoT value. The CoT value is a function of the number of copies (concentration) of the particular sequence, multiplied by the hybridization time. It follows that for abundant sequences, hybridization events will occur very rapidly (low CoT value), while rare sequences will form duplexes only at very high CoT values. The CoT values that allow these rare sequences to form duplexes, and therefore to be selected effectively, are difficult to achieve in a convenient time frame. Therefore, hybridization subtraction is simply not a useful technique with which to study the relative levels of rare mRNA species. Sixth, this problem is further complicated by the fact that duplex formation also depends on the nucleotide base composition of a given sequence. Sequences rich in G + C form stronger duplexes than those with a high content of A + T. Therefore, the former sequences will tend to be selectively removed by hybridization subtraction. Seventh, hybridization may occur between non-exact matches. When this happens, the expression of a homologous gene can "mask" the expression of a gene of interest, artificially biasing the results for that particular gene.
Matsubara and Okubo proposed using partial cDNA sequences to establish gene expression profiles that could be used in functional analysis of the human genome. Matsubara and Okubo warn against using random priming, because it creates multiple unique DNA fragments from individual mRNAs and can thus bias the analysis of the number of particular mRNAs per library. They sequenced randomly selected members of a 3'-directed cDNA library and established the frequency of appearance of the various ESTs. They proposed comparing EST lists from various cell types in order to classify genes.
Genes expressed in many types of cells were termed housekeeping genes, and those expressed in certain cells were termed cell-specific genes, even in the absence of the complete sequence of the gene or the biological activity of the gene product.
The present invention avoids the drawbacks of the prior art by providing a method for quantifying the relative abundance of multiple gene transcripts in a given biological specimen through high-throughput, sequence-specific analysis of individual RNAs and/or their corresponding cDNAs. The present invention offers many advantages over current protein discovery methods, which attempt to isolate individual proteins based on biological effects. The method of the present invention provides detailed diagnostic comparisons of cell profiles that reveal numerous changes in the expression of individual transcripts.
The present invention offers many advantages over current subtraction methods, including the analysis of more complex libraries (10^6 to 10^7 clones compared with 10^3 clones), which permits the identification of low-abundance messages and makes it possible to identify messages that either increase or decrease in abundance. These large libraries are very routine to make, in contrast to the libraries of previous methods. Further, homologs can easily be distinguished with the method of the present invention. This method is very convenient because it organizes a large amount of data into a digestible, understandable format.
The most significant differences are highlighted by electronic subtraction. In-depth analyses are made more convenient.
The present invention provides many advantages over previous methods of electronic cDNA analysis. The method is particularly powerful when more than 100, and preferably more than 1000, gene transcripts are analyzed. In such a case, new low-frequency transcripts are discovered and tissues are typed. High-resolution analysis of gene expression can be used directly as a diagnostic profile or to identify disease-specific genes for the development of more classical diagnostic approaches. This process is defined as frequency analysis of gene transcription. The resulting quantitative analysis of gene transcripts is defined as comparative analysis of gene transcripts.
3. SUMMARY OF THE INVENTION The invention is a method for analyzing a specimen containing gene transcripts, comprising the steps of (a) producing a library of biological sequences; (b) generating a set of transcript sequences, wherein each of the transcript sequences in said set is indicative of a different one of the biological sequences of the library; (c) processing the transcript sequences in a programmed computer (in which a database of reference transcript sequences indicative of reference sequences is stored) to generate an identified sequence value for each of the transcript sequences, wherein each identified sequence value is indicative of a sequence annotation and a degree of match between one of the biological sequences of the library and at least one of the reference sequences; and (d) processing each identified sequence value to generate final data values indicative of the number of times each identified sequence value is present in the library.
The invention also includes a method for comparing two specimens containing gene transcripts. The first specimen is processed as described above. The second specimen is used to produce a second library of biological sequences, which is used to generate a second set of transcript sequences, wherein each of the transcript sequences in the second set is indicative of one of the biological sequences of the second library. Then the second set of transcript sequences is processed in a programmed computer to generate a second set of identified sequence values, namely the additional identified sequence values, each of which is indicative of a sequence annotation and includes a degree of match between one of the biological sequences of the second library and at least one of the reference sequences. The additional identified sequence values are processed to generate additional final data values [...] representative of clones transfected with DNA. Each clone in the population is identified by a sequence-specific method that identifies the gene from which the unique mRNA was transcribed. The number of times each gene is identified with a clone is determined in order to evaluate the abundance of gene transcripts. The genes and their abundances are listed in order of abundance to produce a gene transcription image.
In a further embodiment, the relative abundance of the gene transcripts in one cell type or tissue is compared with the relative abundance of the gene transcripts in a second cell type or tissue in order to identify differences and similarities.
In another embodiment, the method includes a system for analyzing a library of biological sequences that includes an element for receiving a set of transcript sequences, wherein each of the transcript sequences is indicative of a different one of the biological sequences of the library; and an element for processing the transcript sequences in a computer system in which a database of reference transcript sequences indicative of reference sequences is stored, wherein the computer is programmed with software to generate an identified sequence value for each of the transcript sequences, wherein each identified sequence value is indicative of a sequence annotation and the degree of match between one of the biological sequences of the library and at least one of the reference sequences, and to process each identified sequence value to generate final data values indicative of the number of times each identified sequence value is present in the library.
In essence, the invention is a method and system for quantifying the relative abundance of gene transcripts in a biological specimen. The invention provides a method for comparing the gene transcription images of two or more different biological specimens in order to distinguish between the specimens and to identify one or more genes that are differentially expressed between them. Thus, this gene transcription image and its comparison can be used as a diagnostic. One embodiment of the method generates a high-throughput, sequence-specific analysis of multiple RNAs or their corresponding cDNAs: a gene transcription image. Another embodiment of the method produces gene transcription image analysis through the use of high-throughput cDNA sequence analysis. In addition, two or more gene transcription images can be compared and used to detect or diagnose a particular condition, disease, or biological state that correlates with the relative abundance of gene transcripts in a given cell or cell population.
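The counting and ranking described in this summary (step (d) and the listing of genes in order of abundance) can be illustrated with a minimal sketch. It assumes that the matching step has already reduced each clone to a single identifier, such as a reference locus name. The function name `transcript_image` and the sample identifiers are hypothetical and are not taken from the program disclosed in this document.

```python
from collections import Counter


def transcript_image(identified_values):
    """Rank identified sequence values by how often they occur in a library.

    identified_values: one identifier per sequenced clone (assumed input).
    Returns a list of (identifier, count, fraction_of_library) tuples,
    most abundant first; ties are broken alphabetically for determinism.
    """
    counts = Counter(identified_values)
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(name, n, n / total) for name, n in ranked]


# Hypothetical library of six clones matched to three reference loci.
clones = ["LOCUS_A", "LOCUS_B", "LOCUS_A", "LOCUS_C", "LOCUS_A", "LOCUS_B"]

for name, n, fraction in transcript_image(clones):
    print(f"{name}\t{n}\t{fraction:.3f}")
```

Reporting the fraction alongside the raw count is a design choice of this sketch: it makes libraries of different sizes comparable, which the comparison of two specimens requires.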
4. Description of the Tables and Drawings
4.1. Tables
Table 1 presents a detailed explanation of the letter codes used in Tables 2-5.
Table 2 lists the hundred most common gene transcripts. It is a partial list of isolates from the HUVEC cDNA library prepared and sequenced as described below. The left-hand column gives the order of abundance of the sequences in this table. The next column, entitled "number", is the clone number of the first HUVEC sequence identification reference that matches the sequence in the "entry" column. Isolates that were not sequenced well are not present in Table 2. The next column, entitled "N", indicates the total number of cDNAs that have the same degree of match with the reference transcript sequence in the "entry" column. The column entitled "entry" gives the NIH GENBANK locus name, which corresponds to the sequence numbers of the library. The "s" column indicates, in a few cases, the species of the reference sequence. The code for the "s" column is given in Table 1. The column entitled "descriptor" provides a plain English explanation of the identity of the sequence corresponding to the NIH GENBANK locus name in the "entry" column.
Table 3 is a comparison of the 15 most abundant gene transcripts in normal monocytes and activated macrophage cells.
Table 4 is a detailed summary of the library subtraction analysis comparing the THP-1 and human macrophage cDNA sequences. Table 4 uses the same codes as Table 2. Additional columns are "bgfreq" (abundance number in the subtraction library), "rfend" (abundance number in the target library) and "quotient" (the target abundance number divided by the subtraction abundance number). As is clear from a careful reading of the table, when the abundance number in the subtraction library is "0", the target abundance number is divided by 0.05. This yields a result (which division by 0 cannot) and distinguishes that result from the quotients obtained when the subtraction abundance number is 1.
Table 5 is the computer program, written in source code, for generating gene transcript subtraction profiles.
Table 6 is a partial list of the database entries used in the electronic Northern blot analysis provided by the present invention.
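The "quotient" column described for Table 4 can be illustrated with a minimal sketch. This is not the program of Table 5; the function name, the constant name and the example counts are hypothetical and chosen only for illustration. The only rule taken from the text is that a subtraction abundance of 0 is replaced by 0.05 before dividing.

```python
# Minimal sketch of the Table 4 "quotient" rule (names are hypothetical).
ZERO_SUBSTITUTE = 0.05  # divisor used in place of a subtraction abundance of 0


def quotient(target_abundance, subtraction_abundance):
    """Target abundance divided by subtraction abundance.

    A subtraction abundance of 0 is replaced by 0.05 so that a result
    exists and stays distinguishable from the case where the subtraction
    abundance is 1.
    """
    if subtraction_abundance == 0:
        divisor = ZERO_SUBSTITUTE
    else:
        divisor = subtraction_abundance
    return target_abundance / divisor


# Illustrative counts only: a transcript seen 3 times in the target library.
absent_in_subtraction = quotient(3, 0)  # divided by 0.05
once_in_subtraction = quotient(3, 1)    # divided by 1
print(absent_in_subtraction, once_in_subtraction)
```

With this rule a transcript absent from the subtraction library scores twenty times higher than one that appears there once, so the two cases cannot be confused when the table is sorted by quotient.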
4.2. Brief Description of the Drawings
Figure 1 is a diagram summarizing the data collected and stored regarding the library construction portion of sequence preparation and analysis.
Figure 2 is a diagram representing the sequence of operations performed by the "abundance sort" software in a class of preferred embodiments of the method of the invention.
Figure 4 is a more detailed block diagram of the bioinformatics process, from a new sequence (one that has been sequenced but not yet identified) to the printing of the transcript image analysis and the provision of database subscriptions.
5. DETAILED DESCRIPTION OF THE INVENTION
The present invention provides a method for comparing the relative abundance of gene transcripts in different biological specimens by using high-throughput, sequence-specific analysis of the individual RNAs or their corresponding cDNAs (or, alternatively, of data representing other biological sequences). This process is referred to herein as gene transcript imaging. The quantitative analysis of relative abundance for a set of gene transcripts is referred to herein as "gene transcript image analysis" or "gene transcript frequency analysis."
The present invention makes it possible to obtain a profile of gene transcription in any given population of cells or tissue from any type of organism. The invention can be applied to obtain a profile of a specimen consisting of a single cell (or clones of a single cell), of many cells, or of tissue more complex than a single cell and containing multiple cell types, such as the liver.
The invention has significant advantages in the fields of diagnostics, toxicology and pharmacology, to name a few. A highly sophisticated diagnostic test can be performed on an ill patient whose diagnosis has not been made. A biological specimen consisting of the patient's fluids or tissues is obtained, and the gene transcripts are isolated and expanded to the amount necessary to determine their identity. Optionally, the gene transcripts can be converted to cDNA. A sampling of the gene transcripts is subjected to sequence-specific analysis and quantified. These gene transcript sequence abundances are compared against the sequence abundances of a reference database that includes normal data sets for diseased and healthy patients. The patient has the disease(s) with which the patient's data set correlates most closely.
For example, gene transcript frequency analysis can be used to differentiate normal cells or tissues from diseased cells or tissues, just as it highlights the differences between normal monocytes and activated macrophages in Table 3.
In toxicology, a fundamental question is which tests are most effective in predicting or detecting a toxic effect. Gene transcript imaging provides very detailed information about the cell and tissue environment, some of which would not be obvious in conventional, less detailed methods of analysis. The gene transcript image is a more powerful method for predicting the toxicity and efficacy of drugs.
Similar benefits accrue from the use of this tool in pharmacology. The gene transcript image can be used selectively to examine the protein categories that are expected to be affected, for example, enzymes that detoxify toxins.
In an alternative embodiment, comparative gene transcript frequency analysis is used to differentiate between cancer cells that respond to anticancer agents and those that do not. Examples of anticancer agents are tamoxifen, vincristine, vinblastine, podophyllotoxins, etoposide, teniposide, cisplatin, biological response modifiers such as interferon, IL-2 and GM-CSF, enzymes, hormones and the like. This method also provides a means for classifying gene transcripts by functional category. In the case of cancer cells, transcription factors and other essential regulatory molecules are very important categories to analyze across different libraries.
In yet another embodiment, comparative gene transcript frequency analysis is used to differentiate between control liver cells and liver cells isolated from patients treated with experimental drugs such as FIAU, in order to distinguish between the pathology caused by the underlying disease and that caused by the drug.
In yet another embodiment, comparative gene transcript frequency analysis is used to differentiate between brain tissue from patients treated with lithium and from untreated patients.
In a further embodiment, comparative gene transcript frequency analysis is used to differentiate between cells treated with cyclosporin and FK506 and normal cells.
In a further embodiment, comparative gene transcript frequency analysis is used to differentiate between virally infected human cells (including HIV-infected cells) and uninfected human cells. Gene transcript frequency analysis is also used for the rapid survey of gene transcripts in HIV-resistant, HIV-infected and HIV-sensitive cells. Comparison of gene transcript abundance will indicate the success of treatment and/or new avenues to study.
In a further embodiment, comparative gene transcript frequency analysis is used to differentiate between bronchial lavage fluids from healthy patients and from patients with a variety of conditions.
In a further embodiment, comparative gene transcript frequency analysis is used to differentiate between mutant and wild-type species of plant, microbe and animal cells. In addition, the transcript abundance program is adapted to allow the scientist to evaluate the transcription of one gene in many different tissues. These comparisons could identify deletion mutants that do not produce a gene product and point mutants that produce a less abundant or otherwise different message. Such mutations can affect basic biochemical and pharmacological processes, such as mineral nutrition and metabolism, and can be isolated by means known to those skilled in the art. Thus, crops with improved yields, pest resistance and other traits can be developed.
In a further embodiment, comparative gene transcript frequency analysis is used for an interspecies comparison that would allow the selection of better pharmacological animal models. In this embodiment, humans and other animals (such as a mouse), or their cultured cells, are treated with a specific test agent. The sequence-specific abundance of each cDNA population is determined.
If the animal test system is a good model, the homologous genes in the animal cDNA population should change expression in a manner similar to those in human cells. If side effects are detected with the drug, a detailed transcript abundance analysis is performed to survey the changes in gene transcription. The models are then evaluated by comparing the basic physiological changes.
In a further embodiment, comparative gene transcript frequency analysis is used in the clinical setting to give a very detailed gene transcript profile of a patient's cells or tissue (e.g., a blood sample). In particular, gene transcript frequency analysis is used to give a high-resolution gene expression profile of a disease state or condition.
In the preferred embodiment, the method uses high-throughput cDNA sequencing to identify specific transcripts of interest. The generated cDNA and deduced amino acid sequences are then compared extensively with GENBANK and other sequence data banks as described below. The method offers many advantages over current protein discovery by two-dimensional gel methods, which attempt to identify the individual proteins involved in a particular biological effect. Here, detailed comparisons of the profiles of activated and inactive cells reveal numerous changes in the expression of individual transcripts. After it is determined whether the sequence is an "exact" match, a similar match or a non-match, the sequence is entered into a database. Next, the numbers of cDNA copies corresponding to each gene are tabulated. Although this can be done slowly and laboriously, if at all, by hand from a printout of all entries, a computer program is a useful and rapid way to tabulate this information. The numbers of cDNA copies (optionally divided by the total number of sequences in the data set) provide an image of the relative abundance of transcripts for each corresponding gene. The list of represented genes can then be sorted by abundance in the cDNA population. A multitude of additional types of comparisons or dimensions are possible and are exemplified below.
An alternative method of producing a gene transcript image includes the steps of obtaining a mixture of test mRNA and providing a representative array of unique probes whose sequences are complementary to at least some of the test mRNAs. Next, a fixed amount of the test mRNA is added to the arrayed probes. The test mRNA is incubated with the probes for a time sufficient to allow hybridization of the test mRNA and the probes. The mRNA-probe hybrids are detected and their amounts determined. Hybrids are identified by their position in the probe array. The amounts of all hybrids are summed to give a population number. Each hybrid amount is then divided by the population number to provide a set of relative abundance data called a gene transcript image analysis.
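The normalization at the end of the hybridization method above (sum the hybrid amounts into a population number, then divide each amount by it) can be sketched as follows. This is only an illustration under stated assumptions: the function name, the dictionary layout keyed by probe position and the example signal values are hypothetical and do not come from the patent.

```python
# Minimal sketch: relative abundance from detected mRNA-probe hybrid amounts.
# Keys stand for positions in the probe array; values are measured amounts.

def relative_abundance(hybrid_amounts):
    """Divide each hybrid amount by the population number (the sum of all)."""
    population_number = sum(hybrid_amounts.values())
    if population_number == 0:
        raise ValueError("no hybrids detected; population number is zero")
    return {
        position: amount / population_number
        for position, amount in hybrid_amounts.items()
    }


# Hypothetical amounts for three array positions.
example = {"probe_01": 50.0, "probe_02": 30.0, "probe_03": 20.0}
image = relative_abundance(example)
for position, fraction in sorted(image.items(), key=lambda kv: -kv[1]):
    print(position, fraction)
```

Because every value is a fraction of the same population number, images built from different amounts of starting material can be compared directly.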
6. Examples
The following examples are provided to illustrate the subject of the invention. They are provided by way of illustration and are not intended to limit the invention.
6.1. Origin of tissues and cell lines
For analysis with the computer program claimed herein, biological sequences can be obtained from virtually any source. The most popular are tissues obtained from the human body. Tissues can be obtained from any organ of the body, from a donor of any age, with any abnormality, or from any immortalized cell line. Immortal cell lines may be preferred in some cases because of their cell-type purity; other tissue samples invariably include mixed cell types. A special technique is available to take a single cell (for example, a brain cell) and harness the cellular machinery to grow enough cDNA for sequencing, using the techniques and analyses described herein (cf. U.S. Patent Nos. 5, 021, 35 and 5,168,038, which are incorporated by reference). The examples given herein used the following immortalized cell lines: U-937 monocyte-like cells, THP-1 macrophage-like cells, induced vascular endothelial cells (HUVEC cells) and HMC-1 mast-like cells.
The U-937 cell line is a human histiocytic lymphoma cell line with monocyte characteristics, established from malignant cells obtained from the pleural effusion of a patient with diffuse histiocytic lymphoma (Sundstrom, C. and Nilsson, K. (1976) Int. J. Cancer 17: 565). U-937 is one of only a few human cell lines with the morphology, cytochemistry, surface receptors and monocyte-like characteristics of histiocytic cells. These cells can be induced to terminal monocytic differentiation and will express new cell surface molecules when activated with supernatants from human mixed lymphocyte cultures. Upon this type of in vitro activation, the cells undergo morphological and functional changes, including increased antibody-dependent cellular cytotoxicity (ADCC) against erythroid cells and tumor target cells (one of the main functions of macrophages). Activation of U-937 cells with phorbol 12-myristate 13-acetate (PMA) in vitro stimulates the production of many compounds, including prostaglandins, leukotrienes and platelet-activating factor (PAF), which are potent inflammatory mediators. Thus, U-937 is a cell line that is well suited for the identification and isolation of gene transcripts associated with normal monocytes.
The HUVEC cell line is a normal, homogeneous, well-characterized, early-passage endothelial cell culture from human umbilical vein (Cell Systems Corp., 12815 NE 124th Street, Kirkland, WA 98034). Only gene transcripts from induced or treated HUVEC cells were sequenced. One batch of 1 x 10^8 cells was treated for 5 hours with 1 U/ml rIL-1b and 100 ng/ml E. coli lipopolysaccharide (LPS) endotoxin before harvesting. A separate batch of 2 x 10 cells was treated at confluence with 4 U/ml TNF and 2 U/ml interferon-gamma (IFN-gamma) before harvesting.
THP-1 is a human leukemic cell line with distinct monocytic characteristics. This cell line was derived from the blood of a 1-year-old child with acute monocytic leukemia (Tsuchiya, S. et al. (1980) Int. J. Cancer: 171-76). The following cytological and cytochemical criteria were used to determine the monocytic nature of the cell line: 1) the presence of alpha-naphthyl butyrate esterase activity that could be inhibited by sodium fluoride; 2) the production of lysozyme; 3) the phagocytosis of latex particles and sensitized SRBC (sheep red blood cells); and 4) the ability of mitomycin C-treated THP-1 cells to activate T lymphocytes following ConA (concanavalin A) treatment.
Morphologically, the cytoplasm contained small azurophilic granules and the nucleus was indented and irregularly shaped with deep folds. The cell line has Fc and C3b receptors, which probably function in phagocytosis. THP-1 cells treated with the tumor promoter 12-O-tetradecanoyl-phorbol-13-acetate (TPA) stop proliferating and differentiate into macrophage-like cells that mimic native monocyte-derived macrophages in many respects. Morphologically, as the cells change shape, the nucleus becomes more irregular and additional phagocytic vacuoles appear in the cytoplasm. Differentiated THP-1 cells also exhibit increased adherence to tissue culture plastic.
HMC-1 cells (a human mast cell line) were established from the peripheral blood of a Mayo Clinic patient with mast cell leukemia (Leukemia Res. (1988) 12: 345-55). The cultured cells looked similar to immature cloned murine mast cells, contained histamine, and stained positively for chloroacetate esterase, amino caproate esterase, eosinophil major basic protein (MBP) and tryptase. HMC-1 cells have, however, lost the ability to synthesize normal IgE receptors. HMC-1 cells also possess a 10;16 translocation, which was present in the cells initially collected from the patient by leukapheresis and is not a culture artifact. Thus, HMC-1 cells are a good model for mast cells.
6.2. Construction of cDNA libraries
For comparisons between libraries, the libraries should be prepared in similar ways. Some parameters appear to be particularly important to control. One such parameter is the method of isolating mRNA. It is important to use the same conditions to remove DNA and heterogeneous nuclear RNA from the libraries being compared. The size fractionation of cDNA must be carefully controlled. Preferably, the same vector should be used to prepare the libraries to be compared. At the least, the same type of vector (for example, a unidirectional vector) should be used to ensure a valid comparison. A unidirectional vector may be preferred in order to analyze the product more easily.
It is preferred to prime only with a unidirectional oligo dT primer in order to obtain a single clone per mRNA transcript when obtaining cDNA. However, it is recognized that using a mixture of oligo dT and random primers can also be advantageous, because such a mixture results in greater sequence diversity when gene discovery is also a goal. Similar effects can be obtained with DR2 (Clontech) and HXLOX (U.S. Biochemical) and also with vectors from Invitrogen and Novagen. These vectors have two requirements. First, there must be primer sites for commercially available primers such as the T3 and M13 reverse primers. Second, the vector must accept inserts of up to 10 kb.
It is also important that the clone sample be drawn at random and that a significant population of clones be used. Data have been generated with 5000 clones; however, if very rare genes are to be obtained and/or their relative abundance determined, as many as 100,000 clones from a single library may need to be sampled. The size fractionation of cDNA should also be carefully controlled. Alternatively, plaques can be selected rather than clones.
Besides the Uni-ZAP™ vector system by Stratagene described below, it is now believed that other unidirectional vectors can also be used in a similar manner.
For example, it is believed that these vectors include, but are not limited to, DR2 (Clontech) and HXLOX (U.S. Biochemical).
Preferably, the details of library construction (as shown in Figure 1) are collected and stored in a database for later retrieval in relation to the sequences being compared. Figure 1 shows important information regarding the library collaborator or the cell or cDNA supplier, pretreatment, biological source, culture, RNA preparation, and cDNA construction. Similarly detailed information about the other steps benefits the in-depth analysis of sequences and libraries.
The RNA must be harvested from cell and tissue samples, and the cDNA libraries are subsequently constructed. cDNA libraries can be constructed according to techniques known in the art. (See, for example, Maniatis, T. et al. (1982) Molecular Cloning, Cold Spring Harbor Laboratory, New York). cDNA libraries can also be purchased. The U-937 cDNA library (catalog No. 937207) was obtained from Stratagene, Inc., 11099 N. Torrey Pines Rd., La Jolla, CA 92037.
The THP-1 cDNA library was custom-constructed by Stratagene from THP-1 cells cultured for 48 hours with 100 nM TPA and for 4 hours with 1 μg/ml LPS. The HMC-1 human mast cell cDNA library was also custom-constructed by Stratagene from cultured HMC-1 cells. The HUVEC cDNA library was custom-constructed by Stratagene from two batches of induced HUVEC cells that were processed separately.
Essentially, all libraries were prepared in the same way. First, poly(A+) RNA (mRNA) was purified. For U-937 and HMC-1 RNA, cDNA synthesis was primed with oligo dT only. For THP-1 and HUVEC RNA, cDNA synthesis was primed separately with both oligo dT and random hexamers, and the two cDNA libraries were treated separately. Synthetic adapter oligonucleotides were ligated onto the cDNA ends, enabling insertion into the Uni-ZAP™ vector system (Stratagene), which allows high-efficiency unidirectional (sense orientation) lambda library construction and the convenience of a plasmid system with blue-white color selection to detect clones with cDNA inserts. Finally, the two libraries were combined into a single library by mixing equal numbers of bacteriophage.
Libraries can be screened with either DNA probes or antibody probes, and the pBluescript® phagemid (Stratagene) can be readily excised in vivo. The phagemid allows the use of a plasmid system for easy insert characterization, sequencing, site-directed mutagenesis, the creation of unidirectional deletions and the expression of fusion proteins. The phage particles of the custom-constructed library were infected into the E. coli host strain XL1-Blue® (Stratagene), which has a high transformation efficiency, increasing the probability of obtaining rare, under-represented clones in the cDNA library.
6.3. Isolation of cDNA clones
The phagemid forms of the individual cDNA clones were obtained by an in vivo excision process, in which the host bacterial strain was co-infected with both the lambda library phage and an f1 helper phage. Proteins derived from both the library-containing phage and the helper phage nicked the lambda DNA, initiated new DNA synthesis from defined sequences on the lambda target DNA, and created a smaller, single-stranded circular phagemid DNA molecule that included all the DNA sequences of the pBluescript® plasmid and the cDNA insert. The phagemid DNA was secreted from the cells and purified, then used to re-infect fresh host cells, where the double-stranded phagemid DNA was produced. Because the phagemid carries the gene for beta-lactamase, the newly transformed bacteria are selected on medium containing ampicillin.
The phagemid DNA was purified using the Magic Minipreps™ DNA Purification System (Promega catalog #A7100; Promega Corp., 2800 Woods Hollow Rd., Madison, WI 53711). This small-scale process provides a simple and reliable method for lysing the bacterial cells and rapidly isolating purified phagemid DNA using a suitable DNA-binding resin. The DNA was eluted from the purification resin already prepared for DNA sequencing and other analytical manipulations.
The phagemid DNA was also purified using the QIAGEN® QIAwell-8 Plasmid Purification System (QIAGEN Inc., 9259 Eton Ave., Chatsworth, CA 91311). This product line provides a convenient, reliable and fast high-throughput method for lysing bacterial cells and isolating highly purified phagemid DNA, using QIAGEN anion-exchange resin particles with EMPORE™ membrane technology from 3M in a multiwell format. The DNA was eluted from the purification resin already prepared for DNA sequencing and other analytical manipulations.
An alternative method for the purification of phagemid has recently become available. It uses the Miniprep Kit (Catalog No. 77468, available from Advanced Genetic Technologies Corp., 19212 Orbit Drive, Gaithersburg, Maryland). This kit is in the 96-well format and provides enough reagents for 960 purifications. Each kit is supplied with a recommended protocol, which has been followed except for the following changes. First, the 96 wells are each filled with only 1 ml of sterile terrific broth with carbenicillin at 25 mg/L and 0.4% glycerol. After the wells are inoculated, the bacteria are cultured for 24 hours and lysed with 60 μl of lysis buffer. A centrifugation step (2900 rpm for 5 minutes) is performed before the contents of the block are added to the primary filter plate. The optional step of adding isopropanol to the TRIS buffer is not routinely performed. After the last step in the protocol, the samples are transferred to a 96-well Beckman block for storage.
Another new DNA purification system is the WIZARD™ product line, which is available from Promega (catalog No. A7071) and can be adapted to the 96-well format.
6.4. Sequencing of cDNA Clones
The cDNA inserts from random isolates of the U-937 and THP-1 libraries were partially sequenced. Methods for DNA sequencing are well known in the art. Conventional enzymatic methods employ the Klenow fragment of DNA polymerase, Sequenase™ or Taq polymerase to extend DNA strands from an oligonucleotide primer annealed to the DNA template of interest. Methods have been developed for the use of both single-stranded and double-stranded templates. The chain-termination reaction products are usually electrophoresed on urea-acrylamide gels and detected either by autoradiography (for radionuclide-labeled precursors) or by fluorescence (for fluorescently labeled precursors). Recent advances in mechanized reaction preparation, sequencing and analysis using the fluorescent detection method have allowed an expansion in the number of sequences that can be determined per day (for example, with the Applied Biosystems 373 and 377 DNA sequencers and the Catalyst 800). Currently, with the system as described, read lengths range from 250 to 400 bases and are clone dependent. Read length also varies with the length of time the gel is run. In general, shorter runs tend to truncate the sequence. A minimum of only about 25 to 50 bases is necessary to establish the identification and degree of homology of the sequence. Any sequence-specific method may be used, including but not limited to hybridization, mass spectroscopy, capillary electrophoresis and 505 gel electrophoresis.
6.5. Homology searching of cDNA clones and deduced protein (and subsequent steps)
Using the nucleotide sequences derived from the cDNA clones as query sequences (sequences of the Sequence Listing), databases containing previously identified sequences are searched for areas of homology (similarity). Examples of such databases include Genbank and EMBL. We next describe examples of two homology search algorithms that can be used, and then describe the subsequent computer-implemented steps to be performed in accordance with preferred embodiments of the invention.
In the following description of the computer-implemented steps of the invention, the word "library" denotes a set (or population) of nucleic acid sequences of a biological specimen. A "library" may consist of cDNA sequences, RNA sequences, or the like, which characterize a biological specimen. The biological specimen may consist of cells of a single human cell type (or may be any of the other types of specimens mentioned above). We contemplate that the sequences in a library have been determined so as to accurately represent or characterize a biological specimen (for example, they can consist of cDNA sequences representative of RNA clones taken from a single human cell).
In the following description of the computer-implemented steps of the invention, the expression "database" denotes a set of stored data that represents a collection of sequences, which in turn represent a collection of biological reference materials. For example, a database may consist of data representing many stored cDNA sequences that are, in turn, representative of human cells infected with various viruses, human cells of various ages, cells of different mammalian species, and so on.
In the preferred embodiments, the invention employs a computer programmed with software (to be described) to perform the following steps:
(a) processing data indicative of the cDNA sequences of a library (generated as a result of high-throughput cDNA sequencing or another method) to determine whether each sequence in the library matches a DNA sequence of a reference database of DNA sequences (and if so, identifying the reference database entry that matches the sequence and indicating the degree of match between the reference sequence and the library sequence), and assigning an identified sequence value, based on the sequence annotation and the degree of match, to each of the sequences in the library;
(b) for some or all of the entries in the database, tabulating the number of matching identified sequence values in the library (although this can be done by hand from a printout of all the entries, we prefer to carry out this step using the computer software described later), thereby generating a set of final data values or "abundance numbers"; and
(c) if the libraries are of different sizes, dividing each abundance number by the total number of sequences in the library, to obtain a relative abundance number for each identified sequence value (i.e., a relative abundance of each gene transcript).
The list of identified sequence values (or the genes corresponding to them) can be sorted by abundance in the cDNA population. A multitude of additional types of comparisons or dimensions is possible. For example (to be described later in greater detail), steps (a) and (b) may be repeated for two different libraries (sometimes referred to as a "target" library and a "subtraction" library).
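Steps (b) and (c) above, together with the sort by abundance, can be sketched in a few lines. This is a minimal illustration, not the software of the invention: it assumes step (a) has already produced one identified sequence value per library sequence, represented here as a plain string, and the function names and locus labels such as "LOCUS_A" are hypothetical.

```python
# Minimal sketch of steps (b) and (c): tabulate, normalize, sort.
from collections import Counter


def abundance_numbers(identified_values):
    """Step (b): count how often each identified sequence value occurs."""
    return Counter(identified_values)


def relative_abundances(counts):
    """Step (c): divide each abundance number by the library size."""
    library_size = sum(counts.values())
    if library_size == 0:
        raise ValueError("empty library")
    return {value: n / library_size for value, n in counts.items()}


def sorted_by_abundance(counts):
    """Most abundant first; ties are broken alphabetically for stable output."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical output of step (a) for a six-clone library.
library = ["LOCUS_A", "LOCUS_B", "LOCUS_A", "LOCUS_C", "LOCUS_A", "LOCUS_B"]
counts = abundance_numbers(library)
for value, n in sorted_by_abundance(counts):
    print(value, n, relative_abundances(counts)[value])
```

Running the same functions on a second library gives the two sets of abundance numbers needed for a target-versus-subtraction comparison.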
Then, for each identified sequence value (or gene transcript), a "quotient" value is obtained
S dividing the abundance number (for that identified sequence value) for the target library, enter the abundance number (for that value of - sequence
identified) for the subtraction library. In fact, the subtraction must take place in multiple libraries. It is possible to add the transcripts of several libraries (for example, three) and then divide them among another set of multiple transcripts
libraries (again, for example, three). The notation for this operation can be abbreviated as (A + B + C) / (D + E + F), where each uppercase letter indicates a complete library. Optionally, the abundance numbers of the transcripts in the summed libraries can be divided by the total sample size before the subtraction. Unlike standard hybridization technology, which allows only a single subtraction of two libraries, once a set or library of transcript sequences has been processed and stored in the computer, any number of subtractions can be performed on the library. For example, using this method, quotient values can be obtained by dividing the relative abundance values in a first library by the corresponding values in a second library, and vice versa.

In variations of step (a), the library consists of nucleotide sequences derived from cDNA clones. Examples of databases that can be searched for areas of homology (similarity) in step (a) include the commercially available databases known as GenBank (NIH), EMBL (European Molecular Biology Laboratory, Germany), and GENESEQ (Intelligenetics, Mountain View, California). A homology search algorithm that can be used to implement step (a) is the algorithm described in the work of D.J. Lipman and W.R. Pearson, entitled "Rapid and Sensitive Protein Similarity Searches," Science, 227: 1435 (1985). In this algorithm, the homologous regions are found in two steps. In the first step, the most homologous regions are determined by calculating a matching score using a homology score table. The "Ktup" parameter is used in this step to set the minimum window size to be shifted when comparing two sequences. Ktup also sets the number of bases that must match in order to extract the region of greatest homology between the sequences. In this step, no insertions or deletions are applied, and the homology is reported as an initial (INIT) value. In the second step, the homologous regions are aligned to obtain the maximum matching score by inserting a gap in order to add a probable deleted portion. The matching score obtained in the first step is recalculated using the homology score table and the insertion score table to give an optimized (OPT) value in the final output. DNA homologies between two sequences can be examined graphically using the Harr method of constructing dot matrix homology plots (Needleman, S.B. and Wunsch, C.D., J. Mol. Biol. 48: 443 (1970)). This method produces a two-dimensional plot that can be useful for distinguishing regions of homology from regions of repetition.
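The first (ungapped) step of the k-tuple search described above can be illustrated with a toy sketch. This is not the Lipman-Pearson implementation: it simply counts exact k-tuple hits per diagonal instead of using a homology score table, and the function name `best_diagonal` and its parameters are hypothetical.

```python
from collections import defaultdict


def best_diagonal(query: str, subject: str, ktup: int = 4):
    """Return (diagonal, hits) for the offset sharing the most exact k-tuples.

    The diagonal is subject_position - query_position. No insertions or
    deletions are considered, mirroring the first search step only.
    """
    # Index every k-tuple of the query by its start positions.
    index = defaultdict(list)
    for i in range(len(query) - ktup + 1):
        index[query[i:i + ktup]].append(i)

    # Count exact k-tuple hits on each diagonal of the comparison matrix.
    hits = defaultdict(int)
    for j in range(len(subject) - ktup + 1):
        for i in index.get(subject[j:j + ktup], ()):
            hits[j - i] += 1

    if not hits:
        return None, 0
    # Prefer the diagonal with the most hits; break ties toward offset 0.
    diagonal = max(hits, key=lambda d: (hits[d], -abs(d)))
    return diagonal, hits[diagonal]
```

A larger `ktup` makes the scan faster but less sensitive, because fewer short matches qualify as hits.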
However, in a class of preferred embodiments, step (a) is implemented by processing the library data with the commercially available computer program known as the INHERIT 670 Sequence Analysis System, available from Applied Biosystems Inc. (Foster City, California), including the software known as the Factura software (also available from Applied Biosystems Inc.). The Factura program preprocesses each library sequence to "edit out" portions of it that do not appear to be of interest, such as the vector used to prepare the library. Additional sequences that can be edited out or masked (ignored by the search tools) include, but are not limited to, the poly-A tail and the repetitive GAG and CCC sequences. A low-end search program can be written to mask such "low-information" sequences, or programs such as BLAST can ignore the low-information sequences.

In the algorithm implemented by the INHERIT 670 Sequence Analysis System, the Pattern Specification Language (developed by TRW Inc.) is used to determine regions of homology. "There are three parameters that determine how the INHERIT analysis runs sequence comparisons: window size, window offset and error tolerance. Window size specifies the length of the segments into which the query sequence is subdivided. Window offset specifies where the next segment [to be compared] begins, counting from the beginning of the previous segment. Error tolerance specifies the total number of insertions, deletions and/or substitutions that are tolerated over the specified word length. Error tolerance may be set to any integer between 0 and 6. The default settings are window tolerance = 20, window offset = 10 and error tolerance = 3." INHERIT Analysis Users Manual, pp. 2-15, Version 1.0, Applied Biosystems, Inc., October 1991. Using a combination of these three parameters, a database (such as a DNA database) can be searched for sequences containing regions of homology, and the appropriate sequences are scored with an initial value. Subsequently, these homologous regions are examined using dot matrix homology plots to distinguish regions of homology from regions of repetition. Smith-Waterman alignments can be used to display the results of the homology search. The INHERIT software can be run on a Sun computer system running the UNIX operating system.

Search alternatives to INHERIT include the BLAST program, GCG (available from Genetics Computer Group, WI) and the Dasher program (Temple Smith, Boston University, Boston, MA). Nucleotide sequences can be searched against GenBank, EMBL, custom databases such as GENESEQ (available from Intelligenetics, Mountain View, CA), or other gene databases. In addition, we have searched some sequences against our own in-house database.

In the preferred embodiments, the transcript sequences are analyzed by the INHERIT software for the best match with a reference gene transcript, in order to assign a sequence identifier and a degree of homology, which together constitute the identified sequence value. These values are entered into, and subsequently processed by, a Macintosh personal computer (available from Apple) programmed with an "abundance classification and subtraction analysis" computer program (to be described below). Before the abundance classification and subtraction analysis program (also referred to as the "abundance classification" program) is run, the identified sequences from the cDNA clones are assigned a value (according to the parameters given above) for the degree of match, according to the following categories: "exact" match (regions with a high degree of identity), homologous human match, homologous non-human match (regions of high similarity present in species other than human), or no match (no significant regions of homology with previously identified nucleotide sequences stored in the database).
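The three comparison parameters quoted above from the INHERIT manual can be illustrated with a minimal sketch. It reads the quoted default of 20 as the window size (an assumption), and it counts substitutions only, whereas the manual's error tolerance also covers insertions and deletions. The function names `windows` and `within_tolerance` are hypothetical and are not part of INHERIT.

```python
def windows(query: str, size: int = 20, offset: int = 10):
    """Yield (start, segment) for each full-length window of the query.

    `size` is the segment length; `offset` is the distance from the start
    of one segment to the start of the next.
    """
    for start in range(0, len(query) - size + 1, offset):
        yield start, query[start:start + size]


def within_tolerance(segment: str, candidate: str, error_tolerance: int = 3) -> bool:
    """True if two equal-length segments differ at no more than
    `error_tolerance` positions (substitutions only, a simplification)."""
    if len(segment) != len(candidate):
        return False
    mismatches = sum(a != b for a, b in zip(segment, candidate))
    return mismatches <= error_tolerance
```

With an offset smaller than the window size, consecutive segments overlap, so a region of homology that straddles one segment boundary still falls wholly inside a neighboring segment.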
Alternatively, the degree of match may be a numerical value, as described below.

Again with reference to the step of identifying matches between the reference sequences and the database entries, protein and peptide sequences can be deduced from the nucleic acid sequences. Using the deduced polypeptide sequence, match identification can be performed in a manner analogous to that done with the cDNA sequences. A protein sequence is used as a query sequence and compared with the previously identified sequences contained in a database such as Swiss-Prot, PIR and the NBRF Protein database to find homologous proteins. These proteins are scored for homology using a homology score table (Orcutt, B.C. and Dayhoff, M.O., Scoring Matrices, PIR Report MAT-0285 (February 1985)), resulting in an INIT score. The homologous regions are aligned to obtain the maximum matching scores by inserting a gap that adds a probable deleted portion. The matching score is recalculated using the homology score table and the insertion score table, resulting in an optimized (OPT) score. Even in the absence of knowledge of the proper reading frame of an isolated sequence, the protein homology search described above can be carried out by searching in all 3 reading frames.

The homologies of peptide and protein sequences can also be ascertained using the INHERIT 670 Sequence Analysis System, in a manner analogous to that used for DNA sequence homologies. The Pattern Specification Language and parameter windows are used to search protein databases for sequences containing regions of homology, which are scored with an initial value. Subsequent display in a dot matrix homology plot shows regions of homology versus regions of repetition. Additional search tools available for use on pattern search databases include PLsearch Blocks (available from Henikoff & Henikoff, University of Washington, Seattle), Dasher and GCG. Pattern search databases include, but are not limited to, Protein Blocks (available from Henikoff & Henikoff, University of Washington, Seattle), Brookhaven Protein (available from Brookhaven National Laboratory, Brookhaven, MA), PROSITE (available from Amos Bairoch, University of Geneva, Switzerland), ProDom (available from Temple Smith, Boston University) and PROTEIN MOTIF FINGERPRINT (available from University of Leeds, United Kingdom).

The ABI Assembler application software, part of the INHERIT DNA analysis system (available from Applied Biosystems, Inc., Foster City, CA), can be used to create and manage sequence assembly projects by assembling data from selected sequence fragments into a longer sequence. The Assembler software combines two advanced computing technologies that maximize the ability to assemble sequenced DNA fragments into Assemblies, a special grouping of data in which the relationships between sequences are shown by graphical overlap, alignment and statistical views. The process is based on the Meyers-Kececioglu model of fragment assembly (INHERIT™ Assembler User's Manual, Applied Biosystems, Inc., Foster City, CA), and uses graph theory as the foundation of a very rigorous multiple sequence alignment engine for assembling DNA sequence fragments. Other assembly programs that can be used include MEGALIGN (available from Roger Staden, Cambridge, England).

Next, with reference to Figure 2, we describe in more detail the "abundance classification" program that implements "step (b)" mentioned above, tabulating the number of library sequences that match each database entry (the "abundance number" for each database entry). Figure 2 is a flow diagram of a preferred embodiment of the abundance classification program. A source code listing of this embodiment of the abundance classification program is set forth in Table 5. In the implementation of Table 5, the abundance classification program is written in the FoxBASE programming language, commercially available from Microsoft Corporation. Although FoxBASE was the program chosen for the first iteration of this technology, it should not be considered limiting. Many other programming languages can also be used, Sybase being a particularly desirable alternative, as will be obvious to those skilled in the art.
The subroutine names specified in Figure 2 correspond to the subroutines listed in Table 5. With reference again to Figure 2, the "identified sequences" are transcript sequences representing each sequence of the library and a corresponding identification of the database entry (if any) that it matches. In other words, the "identified sequences" are transcript sequences that represent the output of "step (a)" described above.

Figure 3 is a block diagram of a system for implementing the invention. The system of Figure 3 includes library generation unit 2, which generates a library and outputs a stream of transcript sequences indicative of the biological sequences comprising the library. Programmed processor 4 receives the data stream output from unit 2 and processes this data according to "step (a)" described above to generate the identified sequences. Processor 4 may be a processor programmed with the commercially available computer program known as the INHERIT 670 Sequence Analysis System and the commercially available computer program known as the Factura program (both available from Applied Biosystems Inc.), and with the UNIX operating system.

Still with reference to Figure 3, the identified sequences are loaded into processor 6, which is programmed with the abundance classification program. Processor 6 generates the final transcript sequences indicated in both Figure 2 and Figure 3. Figure 4 shows a more detailed block diagram of a planned relational computing system, including the various search techniques that can be implemented, together with the databases to be queried.
With reference to Figure 2, the abundance classification program first performs an operation known as "Tempnum" on the identified sequences, to discard all identified sequences except those that match database entries of selected types. For example, the Tempnum process can select identified sequences that represent matches of the following types with database entries (see the definitions above): "exact" matches, "homologous" human matches, "other species" matches (representing genes present in species other than human), "no" matches (no significant regions of homology with database entries representing previously identified nucleotide sequences), "I" matches (Incyte, for DNA sequences not previously known), or "X" matches (matching ESTs in the reference database). This eliminates the U, S, M, V, A, R and D sequences (see the definitions in Table 1).

The identified sequence values selected during the "Tempnum" process then undergo a further selection (weeding out) operation known as "Tempred". This operation can, for example, discard all identified sequence values that represent matches with selected database entries.
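The selection performed by "Tempnum" amounts to filtering records by their one-letter designation code. The following is a minimal sketch only, not the FoxBASE listing of Table 5; the record layout (a dict with a `designation` key) and the name `tempnum` used as a Python function are assumptions made for illustration.

```python
# Designation codes kept in the example from the text: Exact, Homologous,
# Other species, No match, Incyte clone, EST match.
KEPT_DESIGNATIONS = frozenset("EHONIX")


def tempnum(identified_sequences):
    """Keep only records whose designation code is one of the selected types.

    Records with codes U, S, M, V, A, R or D are discarded as a consequence.
    """
    return [
        record
        for record in identified_sequences
        if record["designation"] in KEPT_DESIGNATIONS
    ]
```

Keeping an explicit allow-list, rather than a list of codes to discard, means any unexpected code is dropped instead of silently passing through.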
The identified sequence values selected during the "Tempred" process are then sorted by library during the "Tempdesig" operation. It is contemplated that the "identified sequences" can represent sequences from a single library, or from two or more libraries.

Consider first the case in which the identified sequence values represent sequences from a single library. In this case, all the identified sequence values determined during "Tempred" are sorted in the "Templib" operation, further sorted in the "Libsort" operation, and finally sorted once more in the "Temptarsort" operation.

Consider next the case in which the transcript sequences produced during the "Tempred" operation represent sequences from two libraries (which we will call the "target" library and the "subtraction" library). For example, the target library may consist of cDNA sequences from clones of a diseased cell, while the subtraction library may consist of cDNA sequences from clones of the diseased cell after treatment by exposure to a drug. As another example, the target library may consist of cDNA sequences from clones of a cell type of a young human, while the subtraction library may consist of sequences from clones of the same cell type of the same human at different ages. In this case, the "Tempdesig" operation routes all the transcript sequences that represent the target library for processing according to "Templib" (and then "Libsort" and "Temptarsort"), and routes all the transcript sequences that represent the subtraction library for processing according to "Tempsub" (and then "Subsort" and "Tempsubsort"). For example, the consecutive sort operations "Templib," "Libsort," and "Temptarsort" sort the identified sequences of the target library in decreasing order of abundance number (to generate a list of decreasing abundance numbers, each abundance number corresponding to a database entry, or several lists of decreasing abundance numbers, with the abundance numbers in each list corresponding to database entries of a selected type), with redundancies removed from each sorted list. The consecutive sort operations "Tempsub," "Subsort," and "Tempsubsort" sort the identified sequences of the subtraction library in decreasing order of abundance number (to generate a list of decreasing abundance numbers, each abundance number corresponding to a database entry, or several lists of decreasing abundance numbers, with the abundance numbers in each list corresponding to database entries of a selected type), with redundancies removed from each sorted list.

The transcript sequences output from the "Temptarsort" operation typically represent sorted lists from which a histogram can be generated, in which position along one axis (e.g., horizontal) indicates the abundance number (of the target library sequences), and position along another axis (e.g., vertical) indicates the identified sequence value (e.g., human or non-human gene type). Similarly, the transcript sequences output from the "Tempsubsort" operation typically represent sorted lists from which a histogram can be generated, in which position along one axis (e.g., horizontal) indicates the abundance number (of the subtraction library sequences), and position along another axis (e.g., vertical) indicates the identified sequence value (e.g., human or non-human gene type).

The transcript sequences (sorted lists) output from the Tempsubsort and Temptarsort sort operations are combined during the operation identified as "Cruncher". The "Cruncher" process identifies pairs of corresponding target and subtraction abundance numbers (both representing the same identified sequence value), divides one by the other to generate a "quotient" value for each pair of corresponding abundance numbers, and then sorts the quotient values in order of decreasing quotient value. The data output from the "Cruncher" operation (the final transcript sequences in Figure 2) is typically a sorted list from which a histogram can be generated, in which position along one axis indicates the size of the quotient of the abundance numbers (for the corresponding identified sequence values of the target and subtraction libraries) and position along another axis indicates the identified sequence value (e.g., gene type).

Preferably, before obtaining a quotient between the two library abundance values, the Cruncher operation also divides each quotient value by the total number of sequences in one or both of the target and subtraction libraries. The resulting lists of "relative" quotient values generated by the Cruncher operation are useful for many medical, scientific and industrial applications. Also preferably, the output of the Cruncher operation is a set of lists, each list representing a sequence of decreasing quotient values for a different selected subset (e.g., protein family) of the database entries.
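The pairing, division and sorting attributed to "Cruncher" can be sketched as follows. This is a hedged illustration, not the Table 5 source: the dict-based inputs, the optional library totals, and the choice to normalize each abundance number by its own library total before dividing are assumptions. Entries are assumed to have abundance numbers of at least 1, so no division by zero arises.

```python
def cruncher(target, subtraction, target_total=None, subtraction_total=None):
    """Pair abundance numbers by database entry and return quotients, largest first.

    `target` and `subtraction` map a database entry to its abundance number.
    Only entries present in both libraries are paired. When library totals are
    given, each abundance number is first divided by its library's total, so
    the result is a quotient of relative abundances.
    """
    rows = []
    for entry in target.keys() & subtraction.keys():
        t = target[entry] / target_total if target_total else target[entry]
        s = subtraction[entry] / subtraction_total if subtraction_total else subtraction[entry]
        rows.append((entry, t / s))
    # Decreasing quotient; ties are broken by entry name for a stable listing.
    rows.sort(key=lambda row: (-row[1], row[0]))
    return rows
```

Normalizing by library size matters when the two libraries were sampled to different depths; without it, a library with twice as many clones would appear to over-express every gene.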
In one example, the abundance classification program of the invention tabulates the numbers of mRNA transcripts for a library corresponding to each gene identified in a database. These numbers are divided by the total number of clones sampled. The results of the division reflect the relative abundance of the mRNA transcripts in the cell type or tissue from which they were obtained. Obtaining this final data set is referred to herein as "gene transcript image analysis". The resulting subtracted data show exactly which proteins and genes are up-regulated and down-regulated, in very fine detail.

6.6. HUVEC cDNA library

Table 2 is an abundance table listing the transcripts of various genes in an induced HUVEC library. The transcripts are listed in descending order of abundance. This computerized sorting simplifies analysis of the tissue and speeds the identification of significant new proteins that are specific to this cell type. This type of endothelial cell lines the tissues of the cardiovascular system, and the more that is known about its composition, particularly in response to activation, the greater the choice of protein targets available for treating disorders of this tissue, such as the highly prevalent arteriosclerosis.
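An abundance table of the kind described for Table 2 can be sketched in a few lines. The input format (one database entry name per matched clone) and the function name `abundance_table` are assumptions for illustration; the gene names in the usage example are placeholders, not data from Table 2.

```python
from collections import Counter


def abundance_table(matched_entries):
    """Tabulate clones per database entry and their relative abundance.

    `matched_entries` holds one entry name per sequenced clone. Returns rows
    of (entry, abundance number, abundance / total clones), in descending
    order of abundance.
    """
    counts = Counter(matched_entries)
    total = len(matched_entries)
    rows = [(entry, n, n / total) for entry, n in counts.items()]
    rows.sort(key=lambda row: (-row[1], row[0]))
    return rows


if __name__ == "__main__":
    clones = ["geneA", "geneA", "geneB", "geneA", "geneB", "geneC"]
    for entry, number, fraction in abundance_table(clones):
        print(f"{entry}\t{number}\t{fraction:.3f}")
```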
6.7. Monocyte cell and mast cell cDNA libraries

Tables 3 and 4 show truncated comparisons of two libraries. In Tables 3 and 4, the "normal monocytes" are the HMC-1 cells, and the "activated macrophages" are the THP-1 cells pretreated with PMA and activated with LPS. Table 3 lists, in descending order of abundance, the most abundant gene transcripts for both cell types. With only 15 gene transcripts from each cell type, this table permits a quick qualitative comparison of the most common transcripts. This abundance sort, with its convenient side-by-side display, provides an immediately useful research tool.
In this example, this research tool discloses that 1) only one of the top 15 activated macrophage transcripts is found among the top 15 normal gene transcripts (poly-A binding protein); and 2) a new gene transcript (not previously reported in other databases) is relatively highly represented in activated macrophages but is not similarly prominent in normal macrophages. Such a research tool provides researchers with a shortcut to new proteins, such as receptors, cell-surface molecules and intracellular signaling molecules, which can serve as drug targets in commercial drug screening programs. Such a tool can save considerable time compared with a trial-and-error program aimed at identifying important proteins in and around cells, because those proteins that carry out everyday cellular functions and are represented as steady-state RNA are quickly eliminated from further characterization. This illustrates how gene transcript profiles change with altered cell function. Those skilled in the art know that the biochemical composition of cells also changes with other functional changes such as cancer, including the various stages of cancer, and exposure to toxicity. A gene transcript subtraction profile such as that in Table 3 is useful as a first screening tool for such gene expression and protein studies.
6.8. Subtraction analysis of normal monocyte cell and activated monocyte cell cDNA libraries

Once the cDNA data were in the computer, the computer program described in Table 5 was used to obtain ratios of all the gene transcripts in the two libraries described in Example 6.7, and the gene transcripts were sorted by the descending values of their ratios. If a gene transcript is not represented in one library, its abundance is unknown but appears to be less than 1. As an approximation, and in order to obtain a ratio (which would not be possible if the unrepresented gene were given an abundance of zero), genes that are represented in only one of the two libraries are assigned an abundance of 1/2. Using 1/2 for unrepresented clones increases the relative importance of "turned-on" and "turned-off" genes, whose products would be drug candidates. The resulting printout is called a subtraction table and is an extremely valuable screening method, as shown by the following data.

Table 4 is a subtraction table, in which the normal monocyte library was electronically "subtracted" from the activated macrophage library. This table highlights most effectively the changes in abundance of gene transcripts upon activation of macrophages. Even among the first 20 gene transcripts listed, there are several unknown gene transcripts. Thus, electronic subtraction is a useful tool for helping researchers identify much more quickly the biochemical changes between two cell types. Such a tool can save universities and pharmaceutical companies, which spend billions of dollars on research, valuable time and laboratory resources at the early discovery stage, and can speed up the drug development cycle, which in turn permits researchers to set up drug screening programs much sooner. Thus, this research tool provides a way to bring new drugs to the public faster and more economically.

Also, a subtraction table can be obtained for the diagnosis of patients.
A sample from an individual patient (such as monocytes obtained from a biopsy or a blood sample) can be compared with the data provided herein to diagnose conditions associated with macrophage activation.

Table 4 uncovers many new gene transcripts (labeled Incyte clones). Note that many genes are turned on in the activated macrophage (i.e., the monocyte had a 0 in the bgfreq column). This screening method is superior to other screening techniques, such as the Western blot, which are unable to uncover such a multitude of discrete new gene transcripts.

The subtraction screening technique has also uncovered a large number of cancer gene transcripts (rho oncogene, ETS2, rab-2 ras, YPT1-related, and acute myeloid leukemia mRNA) in activated macrophages. These transcripts can be attributed to the use of immortalized cell lines and are inherently interesting for that reason. This screening technique offers a detailed picture of up-regulated transcripts, including oncogenes, which helps explain why anticancer drugs interfere with the patient's immunity mediated by activated macrophages. Armed with the knowledge gained from this screening method, those skilled in the art can set up more targeted, more effective drug screening programs to identify drugs that are differentially effective against 1) both the relevant cancers and the activated macrophage conditions with the same gene transcript profile; 2) cancer alone; and 3) activated macrophage conditions alone. The smooth muscle senescence protein (22 kd) was up-regulated in the activated macrophage, indicating that it is a candidate to block in order to control inflammation.
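The subtraction table of this example, including the substitution of 1/2 for genes represented in only one library, can be sketched as below. The function name, the dict inputs and the row layout are assumptions for illustration and do not reproduce the Table 5 program; the `background` counts play the role of the column the text calls bgfreq.

```python
def subtraction_table(target, background, missing=0.5):
    """Build a subtraction table of target/background abundance ratios.

    `target` and `background` map a gene to its abundance number in each
    library. A gene absent from one library (count 0) is given the abundance
    `missing` (1/2 by default) so that a ratio can still be formed. Returns
    rows of (gene, target count, background count, ratio), largest ratio first.
    """
    rows = []
    for gene in sorted(set(target) | set(background)):
        t_count = target.get(gene, 0)
        b_count = background.get(gene, 0)
        # Replace a zero count by the placeholder abundance before dividing.
        ratio = (t_count or missing) / (b_count or missing)
        rows.append((gene, t_count, b_count, ratio))
    # Stable sort: genes with equal ratios stay in alphabetical order.
    rows.sort(key=lambda row: -row[3])
    return rows
```

Genes that are "turned on" (background count 0) land at the top of the table with a ratio of twice their target count, and "turned-off" genes land at the bottom, which is the emphasis the text describes.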
6.9. Subtraction analysis of normal liver cell and hepatitis-infected liver cell libraries

In this example, rats are exposed to hepatitis virus and maintained in the colony until they show definite signs of hepatitis. Of the rats diagnosed with hepatitis, half are treated with a new anti-hepatitis agent. Liver samples are obtained from all rats before exposure to the hepatitis virus and at the end of treatment with the anti-hepatitis agent or of no treatment. In addition, liver samples can be obtained from rats with hepatitis just before treatment with the anti-hepatitis agent.

The liver tissue is treated as described in Examples 6.2 and 6.3 to obtain mRNA and subsequently to sequence the cDNA. The cDNA from each sample is processed and analyzed for abundance with the computer program of Table 5. The resulting gene transcript images of the cDNA provide detailed pictures of the baseline (control) for each animal and of the infected and/or treated states of the animals. The cDNA data for a group of samples can be combined into a group summary gene transcript profile for all control samples, all samples from infected rats and all samples from rats treated with the anti-hepatitis agent.

Subtractions are performed between the appropriate individual libraries and the pooled libraries. For individual animals, control and post-study samples can be subtracted. Also, if samples are obtained before and after treatment with the anti-hepatitis agent, the data from individual animals and from treatment groups can be subtracted. In addition, the data for all control samples can be combined and averaged. The control average can be subtracted from the averages of both the post-study treated (anti-hepatitis agent) and the post-study untreated cDNA samples.
If pre- and post-treatment samples are available, the pre- and post-treatment samples can be compared individually (or averaged electronically) and subtracted.

These subtraction tables are used in two general ways. First, the differences are analyzed for gene transcripts that are associated with continuing hepatic deterioration or with healing. The subtraction tables are tools for isolating the effects of the drug treatment from the underlying pathology of hepatitis. Because hepatitis affects many parameters, additional liver toxicity has been difficult to detect with blood tests for the usual enzymes alone. The gene transcript profile and subtraction provide the much more complex biochemical picture that researchers have needed to analyze such difficult problems.

Second, the subtraction tables provide a tool for identifying clinical markers: individual proteins or other biochemical determinants that are used to predict and/or evaluate a clinical endpoint, such as disease, improvement due to the drug, and even additional pathology due to the drug. The subtraction tables specifically highlight the genes that are turned on or off. Thus, the subtraction tables provide a first screen for a set of gene transcript candidates for use as clinical markers. Subsequently, electronic subtractions of additional cell and tissue libraries reveal which of the potential markers are in fact found in different cell and tissue libraries. Candidate gene transcripts found in additional libraries are removed from the set of potential clinical markers. Other relevant samples that are known to lack, and that are known to have, the relevant condition are then compared to validate the selection of the clinical marker. In this method, the particular physiological function of the protein transcript need not be determined to qualify the gene transcript as a clinical marker.
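The marker-screening step described above (removing candidates that also turn up in additional cell and tissue libraries) is essentially a set difference. The sketch below is illustrative only; the function name `screen_markers` and the representation of each library as a collection of gene transcript names are assumptions.

```python
def screen_markers(candidates, other_libraries):
    """Return candidate markers not found in any of the additional libraries.

    `candidates` is an iterable of gene transcript names from the first
    subtraction screen; `other_libraries` is an iterable of collections, each
    holding the transcript names observed in one additional library.
    """
    seen_elsewhere = set().union(*other_libraries)
    return sorted(set(candidates) - seen_elsewhere)
```

For example, `screen_markers(["g1", "g2", "g3"], [{"g2"}, {"g3", "g4"}])` leaves only `"g1"` as a potential marker specific to the tissue of interest.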
6.10. Electronic Northern blot

A limitation of electronic subtraction is that it is difficult to compare more than one pair of images at a time. Once particular gene products are identified as relevant for further study (via electronic subtraction or other methods), it is useful to study the expression of single genes in a multitude of different tissues. In the laboratory, the "Northern" blot hybridization technique is used for this purpose. In this technique, a single cDNA, or a probe corresponding thereto, is labeled and then hybridized against a blot containing RNA samples prepared from a multitude of tissues or cell types. Upon autoradiography, the expression pattern of that particular gene, one gene at a time, can be quantified in all the included samples.

In contrast, another embodiment of this invention is the computerized form of this process, referred to herein as the "electronic Northern blot." In this variation, the expression of a single gene is queried against a multitude of prepared and sequenced libraries present in the database. In this way, the expression pattern of any single candidate gene can be examined instantaneously and effortlessly. More candidate genes can thus be scanned, leading to more frequent and fruitful relevant discoveries. The computer program included as Table 5 includes a program to perform this function, and Table 6 is a partial list of the database entries used in the electronic Northern blot analysis.
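An electronic Northern blot query of this kind can be sketched as a lookup of one gene across many stored libraries. The data layout (library name mapped to a pair of per-gene counts and total clones) and the function name are assumptions for illustration, not the program of Table 5.

```python
def electronic_northern(gene, libraries):
    """Report one gene's abundance in every library, highest expression first.

    `libraries` maps a library name to (counts, total_clones), where `counts`
    maps gene names to abundance numbers. Returns rows of
    (library, abundance number, relative abundance).
    """
    profile = []
    for name, (counts, total_clones) in libraries.items():
        number = counts.get(gene, 0)
        relative = number / total_clones if total_clones else 0.0
        profile.append((name, number, relative))
    # Sort by relative abundance so libraries of different sizes are comparable.
    profile.sort(key=lambda row: (-row[2], row[0]))
    return profile
```

Because every library is already tabulated, adding a new candidate gene costs only another lookup, whereas each laboratory Northern blot requires a newly labeled probe.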
6.11. Phase I clinical trials

Based on the establishment of safety and efficacy in the above animal tests, Phase I clinical trials are undertaken. Normal patients undergo the usual preliminary clinical laboratory tests. In addition, appropriate specimens are taken and subjected to gene transcript analysis. Additional specimens are taken from the patients at predetermined intervals during the trial. The specimens are subjected to gene transcript analysis, as described above. In addition, the gene transcript changes noted in the earlier rat toxicity study are carefully evaluated as clinical markers in the patients followed. Changes in the gene transcript analyses are evaluated as indicators of toxicity by correlation with clinical signs and symptoms and other laboratory results. In addition, subtraction is performed on specimens from individual patients and on averaged patient specimens. The subtraction analysis highlights any toxicological change in the treated patients. This is a highly refined determinant of toxicity. The subtraction method also flags clinical markers. Further subgroups can be analyzed by subtraction analysis, including, for example, 1) segregation by occurrence and type of adverse effect; and 2) segregation by dose.

6.12. Gene transcript image analysis in clinical studies

A gene transcript image analysis (or multiple gene transcript image analyses) is a useful tool in other clinical studies. For example, the differences between gene transcript image analyses before and after treatment can be assessed for patients on drug and placebo treatments. This method also effectively screens for clinical markers to follow in the clinical use of the drug.
6.13. Comparative gene transcript analysis between species

The subtraction method can be used to screen cDNA libraries from diverse sources. For example, the same cell types from different species can be compared by gene transcript analysis to screen for specific differences, such as in detoxification enzyme systems. Such tests help in the selection and validation of an animal model for testing drugs intended for human or animal use. When the comparison between animals of different species is shown in columns for each species, we refer to this as an interspecies comparison, or zoo blot.

Embodiments of this invention may employ databases such as those written using the FoxBASE programming language, commercially available from Microsoft Corporation. Other embodiments of the invention employ other databases, such as a random peptide database, a polymer database, an oligomer database, or an oligonucleotide database of the type described in U.S. Patent No. 5,270,170, issued December 14, 1993 to Cull et al., PCT International Application Publication No. WO 9322684, published November 11, 1993, PCT International Application Publication No. WO 9306121, published April 1, 1993, or PCT International Application Publication No. WO 9119818, published December 26, 1991. These four references (the texts of which are incorporated herein by reference) include description that may be applied in implementing those other embodiments of the present invention.

All references cited in the foregoing text are expressly incorporated herein by reference.

Various modifications and variations of the described method and system of the invention will be apparent to those skilled in the art without departing from the scope and spirit of the invention. Although the invention has been described in connection with specific preferred embodiments, it should be understood that the invention as claimed should not be unduly limited to those specific embodiments.
TABLE 1

Designations (D)
  E = Exact
  H = Homologous
  O = Other species
  N = No match
  D = Non-coding gene
  U = Unreadable
  R = Repetitive DNA
  A = Poly-A only
  V = Vector only
  S = Skip
  I = Incyte clone match
  X = EST match

Distribution (F)
  C = Non-specific
  P = Cell/tissue specific
  U = Unknown

Species (S)
  H = Human
  A = Monkey
  P = Pig
  D = Dog
  V = Bovine
  B = Rabbit
  R = Rat
  M = Mouse
  H = Hamster
  C = Chicken
  F = Amphibian
  I = Invertebrate
  Z = Protozoan
  G = Fungus

Library (L)
  U = U937
  M = HMC
  T = THP-1
  H = HUVEC
  S = Spleen
  L = Lung
  Y = T & B cell
  A = Adenoid

Location (Z)
  N = Nuclear
  C = Cytoplasmic
  K = Cytoskeleton
  E = Cell surface
  Z = Intracellular membrane
  M = Mitochondrial
  S = Secreted
  U = Unknown
  X = Other

Status (I)
  0 = No current interest
  1 = Do primary analysis
  2 = Primary analysis done
  3 = Full-length sequence
  4 = Secondary analysis
  5 = Tissue northern
  6 = Obtain full length

Function (R)
  T = Translation
  L = Protein processing
  R = Ribosomal protein
  O = Oncogene
  G = GTP-binding ptn
  V = Viral element
  Y = Kinase/phosphatase
  A = Tumor-related antigen
  I = Binding proteins
  D = NA-binding/transcription
  B = Surface molecule/receptor
  C = Ca++ binding protein
  S = Ligands/effectors
  H = Stress response protein
  E = Enzyme
  F = Ferroprotein
  P = Protease/inhibitor
  Z = Oxidative phosphorylation
  Q = Sugar metabolism
  M = Amino acid metabolism
  N = Nucleic acid metabolism
  W = Lipid metabolism
  K = Structural
  X = Other
  U = Unknown
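The one-letter codes of Table 1 act as a compact annotation scheme for each clone record. As a minimal sketch of how such codes could be expanded into readable labels, the Python fragment below uses two small lookup tables copied from a subset of Table 1. The function name `decode`, the record layout, and the D and R values given to the sample clone are illustrative assumptions, not part of the described database.

```python
# Subset of the Table 1 codes; the dictionaries are an illustrative
# assumption about how the codes could be stored, not the FoxBASE schema.
DESIGNATION = {
    "E": "Exact",
    "H": "Homologous",
    "O": "Other species",
    "N": "No match",
    "I": "Incyte clone match",
    "X": "EST match",
}
FUNCTION = {
    "T": "Translation",
    "R": "Ribosomal protein",
    "O": "Oncogene",
    "Y": "Kinase/phosphatase",
    "U": "Unknown",
}


def decode(record):
    """Expand the one-letter D and R codes of a clone record into labels."""
    return {
        "number": record["number"],
        "designation": DESIGNATION.get(record["D"], "unrecognized code"),
        "function": FUNCTION.get(record["R"], "unrecognized code"),
    }


# Hypothetical record: the D and R values are assumed for illustration.
example = decode({"number": 15365, "D": "E", "R": "R"})
print(example)
```

Keeping the codes in plain dictionaries means an unrecognized letter is reported rather than silently dropped, which helps when a record was keyed in by hand.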
TABLE 2

Clone numbers 15000 through 20000
Libraries: HUVEC
Sorted by ABUNDANCE
Total clones analyzed: 5000
319 genes, for a total of 1713 clones

#    number N   entry      s  descriptor
1    15365  67  HSRPL41       Riboptn L41
2    15004  65  NCY015004     INCYTE 015004
3    15638  63  NCY015638     INCYTE 015638
4    15390  50  NCY015390     INCYTE 015390
5    15193  47  HSFIB1        Fibronectin
6    15220  47  RRRPL9        Riboptn L9
7    15280  47  NCY015280     INCYTE 015280
8    15583  33  M62060        EST HHCH09 (IGR)
9    15662  31  HSACTCGR      Actin, gamma
10   15026  29  NCY015026     INCYTE 015026
11   15279  24  HSEF1AR       Elf 1-alpha
12   15027  23  NCY015027     INCYTE 015027
13   15033  20  NCY015033     INCYTE 015033
14   15198  20  NCY015198     INCYTE 015198
15   15809  20  HSCOLL1       Collagenase
16   15221  19  NCY015221     INCYTE 015221
17   15263  19  NCY015263     INCYTE 015263
18   15290  19  NCY015290     INCYTE 015290
19   15350  18  NCY015350     INCYTE 015350
20   15030  17  NCY015030     INCYTE 015030
21   15234  17  NCY015234     INCYTE 015234
22   15459  16  NCY015459     INCYTE 015459
23   15353  15  NCY015353     INCYTE 015353
24   15378  15  S76965        Ptn kinase inhib
25   15255  14  HUMTHYB4      Thymosin beta-4
26   15401  14  HSLIPCR       Lipocortin I
27   15425  14  HSPOLYAB      Poly-A bp
28   18212  14  HUMTHYMA      Thymosin, alpha
29   18216  14  HSMRP1        Motility relat ptn; MRP-1; CD-9
30   15189  13  HS18D         Interferon induc ptn 1-8D
31   15031  12  HUMFKBP       FK506 bp
32   15306  12  HSH2AZ        Histone H2A
33   15621  12  HUM EC        Lectin, B-galbp, 14kDa
34   15789  11  NCY015789     INCYTE 015789
35   16578  11  HSRPS11       Riboptn S11
36   16632  11  M61984        EST HHCA13 (IGR)
37   18314  11  NCY018314     INCYTE 018314
38   15367  10  NCY015367     INCYTE 015367
39   15415  10  HSIFNIN1      Interferon induc mRNA
40   15633  10  HSLDHAR       Lactate dehydrogenase
41   15813  10  CHKNMHCB   C  Myosin heavy chain B
42   18210  10  NCY018210     INCYTE 018210
43   18233  10  HSRPII140     RNA polymerase II
44   18996  10  NCY018996     INCYTE 018996
45   15088  9   HUMFERL       Ferritin, light chain
46   15714  9   NCY015714     INCYTE 015714
47   15720  9   NCY015720     INCYTE 015720
48   15863  9   NCY015863     INCYTE 015863
49   16121  9   HSET          Endothelin
50   18252  9   NCY018252     INCYTE 018252
51   15351  8   HUMALBP       Lipid bp, adipocyte
52   15370  8   NCY015370     INCYTE 015370
53   15670  8   BTCIASI    V  NADH-ubiq oxidoreductase
54   15795  8   NCY015795     INCYTE 015795
55   16245  8   NCY016245     INCYTE 016245
56   18262  8   NCY018262     INCYTE 018262
57   18321  8   HSRPL17       Riboptn L17
58   15126  7   XLRPL1BR   F  Riboptn L1b
59   15133  7   HSAC07        Actin, beta
60   15245  7   NCY015245     INCYTE 015245
61   15288  7   NCY015288     INCYTE 015288
62   15294  7   HSGAPDR       G-3-PD
63   15442  7   HUMLAMB       Laminin receptor, 54kDa
64   15485  7   HSNGMRNA      Uracil DNA glycosylase
65   16646  7   NCY016646     INCYTE 016646
66   18003  7   HUMPAIA       Plsmnogen activ gene
67   15032  6   HUMUB         Ubiquitin
68   15267  6   HSRPS8        Riboptn S8
69   15295  6   NCY015295     INCYTE 015295
70   15458  6   RNRPS10R   R  Riboptn S10
71   15832  6   RSGALEM    R  UDP-galactose epimerase
72   15928  6   HUMAPOJ       Apolipoptn J
73   16598  6   HUMTBBM40     Tubulin, beta
74   18218  6   NCY018218     INCYTE 018218
75   18499  6   HSP27         Hydrophobic ptn p27
76   18963  6   NCY018963     INCYTE 018963
77   18997  6   NCY018997     INCYTE 018997
78   15432  5   HSAGALAR      Galactosidase A, alpha
79   15475  5   NCY015475     INCYTE 015475
80   15721  5   NCY015721     INCYTE 015721
81   15865  5   NCY015865     INCYTE 015865
82   16270  5   NCY016270     INCYTE 016270
83   16886  5   NCY016886     INCYTE 016886
84   18500  5   NCY018500     INCYTE 018500
85   18503  5   NCY018503     INCYTE 018503
86   19672  5   RRRPL34    R  Riboptn L34
87   15086  4   XLRPL1AR   F  Riboptn L1a
88   15113  4   HUMIFNWRS     tRNA synthetase, trp
89   15242  4   NCY015242     INCYTE 015242
90   15249  4   NCY015249     INCYTE 015249
91   15377  4   NCY015377     INCYTE 015377
92   15407  4   NCY015407     INCYTE 015407
93   15473  4   NCY015473     INCYTE 015473
94   15588  4   HSRPS12       Riboptn S12
95   15684  4   HSEF1G        Elf 1-gamma
96   15782  4   NCY015782     INCYTE 015782
97   15916  4   HSRPS18       Riboptn S18
98   15930  4   NCY015930     INCYTE 015930
99   16108  4   NCY016108     INCYTE 016108
100  16133  4   NCY016133     INCYTE 016133

TABLE 3

NORMAL MONOCYTE VERSUS ACTIVATED MACROPHAGE
TOP 15 MOST ABUNDANT GENES

     NORMAL                                ACTIVATED
1    Elongation factor-1 alpha             Interleukin-1 beta
2    Ribosomal phosphoprotein              Macrophage inflammatory protein
3    Ribosomal protein S8 homolog          Interleukin-8
4    Beta-globin                           Lymphocyte activation gene
5    Ferritin H chain                      Elongation factor-1 alpha
6    Ribosomal protein L7                  Beta actin
7    Nucleoplasmin                         RANTES T-cell-specific protein
8    Ribosomal protein S20 homolog         Poly-A binding protein
9    Transferrin receptor                  Osteopontin; nephropontin
10   Poly-A binding protein                Tumor necrosis factor-alpha
11   Translationally controlled tumor ptn  INCYTE clone 011050
12   Ribosomal protein S25                 Cu/Zn superoxide dismutase
13   Signal recognition particle SRP9      Adenylate cyclase (yeast homolog)
14   Histone H2A.Z                         NGF-related B-cell activation molecule
15   Ribosomal protein Ke-3                Protease nexin-1, glial-derived
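The abundance listings in Tables 2 and 3 amount to collapsing all clones that share a database entry and ranking the entries by clone count. The Python sketch below illustrates that idea under simplifying assumptions: the `abundance_table` helper and the six sample clones are hypothetical, and the FoxBASE program of Table 5 instead sorts a database file and merges adjacent duplicate records.

```python
from collections import Counter


def abundance_table(clones):
    """Collapse clones sharing an entry and rank entries by clone count.

    `clones` is an iterable of (clone_number, entry) pairs.
    """
    counts = Counter(entry for _number, entry in clones)
    # Most abundant first; ties are broken alphabetically by entry name.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical mini-library of six sequenced clones.
clones = [
    (1, "HSRPL41"),
    (2, "HSFIB1"),
    (3, "HSRPL41"),
    (4, "HSRPL41"),
    (5, "HSFIB1"),
    (6, "HSET"),
]
ranked = abundance_table(clones)
for entry, count in ranked:
    print(entry, count)
```

Counting with a dictionary avoids the sort-then-scan pass, at the cost of holding every distinct entry in memory, which is unlikely to matter for libraries of a few thousand clones.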
TABLE 4

Libraries: THP-1
Subtracting: HMC
Sorted by ABUNDANCE
Total clones analyzed: 7375
1057 genes, for a total of 2151 clones

number entry      s  descriptor                   bgfreq  rfend     ratio
10022  HUMIL1        IL 1-beta                         0    131   262.000
10036  HSMDNCF       IL-8                              0    119   238.000
10089  HSLAG1CDN     Lymphocyte activ gene             0     71   142.000
10060  HUMTCSM       RANTES                            0     23    46.000
10003  HUMMIP1A      MIP-1                             3    121    40.333
10689  HSOP          Osteopontin                       0     20    40.000
11050  NCY011050     INCYTE 011050                     0     17    34.000
10937  HSTNFR        TNF-alpha                         0     17    34.000
10176  HSSOD         Superoxide dismutase              0     14    28.000
10886  HSCDW40       B-cell activ, NGF-relat           0     10    20.000
10186  HUMAPR        Early resp PMA-induc              0      9    18.000
10967  HUMGDN        PN-1, glial-deriv                 0      9    18.000
11353  NCY011353     INCYTE 011353                     0      8    16.000
10298  NCY010298     INCYTE 010298                     0      7    14.000
10215  HUM4COLA      Collagenase, type IV              0      6    12.000
10276  NCY010276     INCYTE 010276                     0      6    12.000
10488  NCY010488     INCYTE 010488                     0      6    12.000
11138  NCY011138     INCYTE 011138                     0      6    12.000
10037  HUMCAPPRO     Adenylate cyclase                 1     10    10.000
10840  HUMADCY       Adenylate cyclase                 0      5    10.000
10672  HSCD44E       Cell adhesion glptn               0      5    10.000
12837  HUMCYCLOX     Cyclooxygenase-2                  0      5    10.000
10001  NCY010001     INCYTE 010001                     0      5    10.000
10005  NCY010005     INCYTE 010005                     0      5    10.000
10294  NCY010294     INCYTE 010294                     0      5    10.000
10297  NCY010297     INCYTE 010297                     0      5    10.000
10403  NCY010403     INCYTE 010403                     0      5    10.000
       NCY010699     INCYTE 010699                     0      5    10.000
       NCY010966     INCYTE 010966                     0      5    10.000
       NCY012092     INCYTE 012092                     0      5    10.000
12549  HSRHOB        Oncogene rho                      0      5    10.000
10691  HUMARF1BA     ADP-ribosylation fctr             0      4     8.000
12106  HSADSS        Adenylosuccinate synthetase       0      4     8.000
10194  HSCATHL       Cathepsin L                       0      4     8.000
10479  CLMCYCA       Cyclin A                          0      4     8.000
10031  NCY010031     INCYTE 010031                     0      4     8.000
10203  NCY010203     INCYTE 010203                     0      4     8.000
10288  NCY010288     INCYTE 010288                     0      4     8.000
10372  NCY010372     INCYTE 010372                     0      4     8.000
10471  NCY010471     INCYTE 010471                     0      4     8.000
10484  NCY010484     INCYTE 010484                     0      4     8.000
10859  NCY010859     INCYTE 010859                     0      4     8.000
10890  NCY010890     INCYTE 010890                     0      4     8.000
11511  NCY011511     INCYTE 011511                     0      4     8.000
11868  NCY011868     INCYTE 011868                     0      4     8.000
12820  NCY012820     INCYTE 012820                     0      4     8.000
10133  HSI1RAP       IL-1 antagonist                   0      4     8.000
10516  HUMP2A        Phosphatase, regul 2A             0      4     8.000
11063  HUMB94        TNF-induced response              0      4     8.000
11140  HSHB15RNA     HB15 gene; new Ig                 0      3     6.000
10788  NCY001713     INCYTE 001713                     0      3     6.000
10033  NCY010033     INCYTE 010033                     0      3     6.000
10035  NCY010035     INCYTE 010035                     0      3     6.000
10084  NCY010084     INCYTE 010084                     0      3     6.000
10236  NCY010236     INCYTE 010236                     0      3     6.000
10383  NCY010383     INCYTE 010383                     0      3     6.000
10450  NCY010450     INCYTE 010450                     0      3     6.000
10470  NCY010470     INCYTE 010470                     0      3     6.000
10504  NCY010504     INCYTE 010504                     0      3     6.000
10507  NCY010507     INCYTE 010507                     0      3     6.000
10598  NCY010598     INCYTE 010598                     0      3     6.000
10779  NCY010779     INCYTE 010779                     0      3     6.000
10909  NCY010909     INCYTE 010909                     0      3     6.000
10976  NCY010976     INCYTE 010976                     0      3     6.000
10985  NCY010985     INCYTE 010985                     0      3     6.000
11052  NCY011052     INCYTE 011052                     0      3     6.000
11068  NCY011068     INCYTE 011068                     0      3     6.000
11134  NCY011134     INCYTE 011134                     0      3     6.000
11136  NCY011136     INCYTE 011136                     0      3     6.000
11191  NCY011191     INCYTE 011191                     0      3     6.000
11219  NCY011219     INCYTE 011219                     0      3     6.000
11386  NCY011386     INCYTE 011386                     0      3     6.000
11403  NCY011403     INCYTE 011403                     0      3     6.000
11460  NCY011460     INCYTE 011460                     0      3     6.000
11618  NCY011618     INCYTE 011618                     0      3     6.000
11686  NCY011686     INCYTE 011686                     0      3     6.000
12021  NCY012021     INCYTE 012021                     0      3     6.000
12025  NCY012025     INCYTE 012025                     0      3     6.000
12320  NCY012320     INCYTE 012320                     0      3     6.000
12330  NCY012330     INCYTE 012330                     0      3     6.000
12853  NCY012853     INCYTE 012853                     0      3     6.000
14386  NCY014386     INCYTE 014386                     0      3     6.000
14391  NCY014391     INCYTE 014391                     0      3     6.000
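The ratio column of Table 4 is consistent with dividing the target-library count (rfend) by the background count (bgfreq), with a nominal value of 1/2 standing in for the denominator when the transcript is absent from the background library: 131 / 0.5 = 262 for IL 1-beta and 121 / 3 = 40.333 for MIP-1. The Python sketch below reproduces that arithmetic. The function names and dictionary inputs are illustrative assumptions, and the tie-breaking order is one reading of the sort step in the Table 5 listing rather than a confirmed detail.

```python
def subtraction_ratio(rfend, bgfreq):
    """Ratio of target to background counts.

    A background count of zero is replaced by 0.5 so that the ratio
    stays finite (assumed from the values printed in Table 4).
    """
    denominator = bgfreq if bgfreq > 0 else 0.5
    return rfend / denominator


def subtract(target, background):
    """Return (entry, bgfreq, rfend, ratio) rows, highest ratio first."""
    rows = []
    for entry, rfend in target.items():
        bgfreq = background.get(entry, 0)
        ratio = round(subtraction_ratio(rfend, bgfreq), 3)
        rows.append((entry, bgfreq, rfend, ratio))
    # Descending ratio, then descending background count, then entry name.
    rows.sort(key=lambda row: (-row[3], -row[1], row[0]))
    return rows


# Counts taken from the first rows of Table 4 (THP-1 minus HMC).
target = {"HUMIL1": 131, "HUMMIP1A": 121, "HSMDNCF": 119}
background = {"HUMMIP1A": 3}
rows = subtract(target, background)
for row in rows:
    print(row)
```

Substituting 0.5 for a zero count ranks transcripts seen only in the target library above those merely enriched in it, while still ordering them by how often they were observed.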
TABLE 5

* Master menu for SUBTRACTION output
SET TALK OFF
SET SAFETY OFF
SET EXACT ON
SET TYPEAHEAD TO 0
CLEAR
SET DEVICE TO SCREEN
USS - "SnHxtGuysFo? BAS? + Mac: fox files sClonea .dbí" qo TOP 'STORE NUMBER TO HOTIATE with BOTGOM STCRB NUMBER TO'TERMIN? TB STORE' 'TO Targecl STORE' 'TO Tárgßt2 STORE' 'TO Targee3 STORE.' • TO objectl STORE '• TO 0bject2 STORE' 'TO Objssct3 STORE 0 TO A AL' • 'STORE 0 TO EMATCH STORE 0 TO HMATCH STORE 0 TO CMATCH STORE 0 TO IMATCH STORE 0 TO JTF STCR? 1 TO BAIL EO WHILE .T. '* • program. i • Subtraction 2.f t '»Data .10 / 11/94. ... * Version, i Fo? BASE + / Kac, zevÍBÍc 1.10 * Nstea ... . \ Fsppafc file SubtraCt'iO? 3. * SCREEN 1 TYPE 0 HEADIN3 '9cxe «n 1 * AT 40, * 2 SIZS 286,492 PIXELS FCNT' G« neva \ 9 COLOR 0,0,0, PIXSLS 75,120 TO 178,241 STY23871 COLOR 0,0, -1,24610 , -1,8947 3 PIXELS 27, 134 SAY 'Subtxac ± on Menu * STYLE 65536 FOOT "Gßpßv»', 274 COLOR 0,0, -1, -1, -1, -1 ß PIXELS 117,126 G? T EMATCH STYLE 65536 FOOT 'Chicago' 12 PICIUR? 'C * Exact' SIZE15; 62 'CO' PIXELS 135.125 GET HMATCH STYLS 65536 FONT 'Chicago * .12.PICTURB' VC Homologous' SIZE .15,1
9 PIXELS 153,126 GET OM? TCH STYLE 55536 FOOT 'Chicago', 12 PICTURE "Í« C Other epc 'SIZE 15.84
8 PIX? LS 90,152 SAY "Matehesi'.8T? LE 65535 FONT« G «nev * M2 COLOR 0,0, tl, -l, -1, -1 ß PIXELS 171,126 G? T match STYLE 65536 FONT • Chicago ', 12 PICTURS "ß * C Tncy e 'SIZE 15.55 CO
@ PIXELS 252,137 GET start STYLE 0 FCNT 'Geneva', 12 SIZE 15.70 COLOR 0,0, -1, -1, -1, -1
(I PIXELS 252,236 GET TERMINATE STYLE 0 FOMT 'Gßneva.', 12 SIZE 15,70 COLOR 0,0, -1, -1, -1, -1 ß PIXELS 252,35 BAY "laclu or clones'" STYLS 65536 FCNT 'Gßn' goes * M2 COLOR 0,0 -1, -1, -1, -1
Q FIXELS- 252,2156AY "-> 'STYLE 65536 FOOT' Genßva ', 14 COLOR 0,0, -1, -1, -1, -1. @ PIXELS 198,126 GET PTF ETYLE S5536 CNT' Chidago ', 12 PICTURE "ß'CPrilJt CO file 'SIZE 15', 9 ß 'PIXELS 90.9 TO 181,109 STYLE 3T71 COLOR 0,0, -1, -25500, -1, -1 ß PIXELS 90.288 TO'181.397 STYL23871 COLOR 0,0, -1, -25600, -1, -1 ß PIXELS di.296 SñY "Backgrcund: * ST? LE 65536 FCNT * G« n «va.s, 27a COLOR 0,0, -1, -1, -1, -1 ß FIX L L? 45.135 GBT ANAL STYLE 65536 FCOT 'Chicago', .12 PICTURE '«' R OvfralljFuacticn * S1ZE 4 ß PIXELS 81,26 SAY 'Targßtj' ST? LE 65536 FONT" Gßnßva ', 270 COLOR 0, 0, -1, -1, -1, -1 * PKCSL6108,20 GET targetl STYLE 0 K3OT "Genéva" -, 9 SIZE 12.79 COLOR 0,0, íl, -1, -1, -1 • ß PIXELS 135,20 GET carg «t2 STYLE 0 PCNT * Ceneva", 9 8IZE 12.79 COLOR 0,0, -1, -1, -1, -1 .ß PIXELS 162,20 GET targett3 STYLS 0 FCOT "0« n «V *" "9 5I2J.12,79 COLOR 0,0, -1, -1, -1, -1 ß PIXELS 108,299 GET objectl STYLE 0 FCNT 'G« to «va', 9 SIZE 12.79-COLOR 0, 0, -1, -1, -1, -1 ß? IXELS 135,299 GET object2 STYLE 0. FOTT 'G nßva', 9 SIZE 12.79 COLOR 0,0, -1, -1, -1, -1 8 PIXELS 162,299 G? T? Bject3 STYLE 0 FCNT "Geneva", 9 SIZE 12.79 COLOR 0, 0, -1, -1 , -1, -1 • »PIXELS 27 €, 324, GET Bail STYLS 65536 FCNT 'Chicago', 12 PICTURE * ß * R BunjS &il out" B1ZS 4112 • * EOFs Subtractian. . fmt READ 11? Bail-2 CL2AR CLOSS DATABASES
USE "SmartGu; FoxBASB + / Mac? Fcx fi ßi: clones.dbf" .SET SAEGGT ON SCREEN.1 OFF HEIURN ENDT7 STORE VAL (5YS (2)) TO STARTIÍIE STORE OTPERjTargetl.) TO Targetl STORE UPFER (Targ * t2) TO Targßt2 ^ STORE UPPE (Targßt3) TO Target3 STORE UPPE (Objectl) TO Objectl STORE UPPER (objectS) T0Object2 STORE U? PER (Objšct3) TO Objšct3 clear SET TALX Cíí GAP s TEB? TE-INTTIATE + I GO INITIATE
COPY NEXT GAP FIELDS NUMBER,LIBRARY,D,F,Z,R,ENTRY,S,DESCRIPTOR,START,RFEND,I TO TEMPNUM
USE TEMPNUM
COUNT TO TOT
COPY TO TEMPRED FOR D='E'.OR.D='O'.OR.D='H'.OR.D='N'.OR.D='I'
USE TEMPRED
IF EaiatchiO .AND. Kmacch-Q .AND. Onatch-O .AND. HíATCH »0 COPY TO TE4PDESIG COPY STRUCTJRS TO TEMPDESIG USE TEMPDESIG
IF Ematch=1
  APPEND FROM TEMPNUM FOR D='E'
ENDIF
IF Hmatch=1
  APPEND FROM TEMPNUM FOR D='H'
ENDIF
IF Omatch=1
  APPEND FROM TEMPNUM FOR D='O'
ENDIF
IF Imatch=1
  APPEND FROM TEMPNUM FOR D='I'.OR.D='X'.OR.D='N'
ENDIF
COUNT TO STARTOT
COPY STRUCTURE TO TEMPLIB
USE TEMPLIB
APPEND FROM TEMPDESIG FOR library=UPPER(target1)
IF target2<>' '
  APPEND FROM TEMPDESIG FOR library=UPPER(target2)
ENDIF
IF target3<>' '
  APPEND FROM TEMPDESIG FOR library=UPPER(target3)
ENDIF
COUNT TO ANALTOT
USE TEMPDESIG
COPY STRUCTURE TO TEMPSUB
USE TEMPSUB? PTEND FRCM TEKPDBS3Q FOR lihrn? And «OTPHl. { Cbjectl) i and tar «t2o '' APPEND FROM TEMPDESIG FOR. lihraxyiOTPER (0bjtCt2) EHDXF
IF tasGAcSo '•? PPEHD FBOH TBtFDSSXS FOR library »U? PER (Ctajtct3) COTOT TO eCBTRACTOT SBT TALK CfT * CCKFHESSICN SUBROOTINE A? 'O0MPRESSIN3' OUERY LIBRARY 'USE TEMPLTB ifl SO? T CN' ETTKY.NUMBHR TO IBSORT. USE LI23CRT CCUNT TC IDGENB
REHACE ALL RFEND ITH i MARX1 - 1 SW2 «0 DO V7HILB S 2-0 ROLL IF NARK1 > »IDGENB PACK
COUNT TO AUNI UE "SW2-1 LOOP
GO MARK1 DUP * 1 STORE ENTRY TO TESTA
STORE D TO DESIGA. ew »o - • 'DO WHILE SW« 0 TEST SKIP 8T0RE ENIRY TO TESTS STORE D TO ÜESIGB
GF TESTA - TESTB .AÍGD.DESIGAMDESIGB DELBTS
DUP «DUP + 1 LOOP
ENDIF GO'MRRKl REPLAC2 RFEND WITH CU? 1 - MASKl + DOT '•
ENDDO.TEST LOOP
ENDDO ROLL
SORT OH P? F D, NUMBER TO TEMP? AR90RT. USE TEUPIARSORT * REPL? CE ALL START WTIH RP1WD / XD3ENE * 10000
CCONT TO TEKPTARCO * CCKPRÉSSICM SUBROUT? NB B? 'CCtíFRESSXNß TARGET LLBRARY' USE .TEMPSUB SORT OR EWTRY.HUMBER TO 'SUBSOKr USE SUBSOWT
COUNT TO '¡' "ffTR K RSPLAC5 ALL RTZND KTTH 1 MNa - i 8W2-0 DO WHILB SN2« 0 ROLL 17 M? HCL> «SUBGEN? PACX
CCCKT 10 BUNIQUB sm * l LOOP E? DXJ GO HAMO, - DUP - 1 STORE,? OTKY TO TESTA STORE OR TO D? SIG? SW - O EO «HILE SW« 0 TEST «ap STORE ENIRY TO. TESTS STORE D TO D? SIGB
IF TESTA * TESTS.AND.DS5IGA »DBSISB DÉJETE 8 0 DU? - DUP + 1 LOOP
ZNDIF GO MARX1 REPLACE RFEND WITH EUP
? MAMÜ MARX1 + IXJP SW-Í LOOP
ENDDO TEST L8P: ENDDO ROLL
SORT CN RF3ND / D, UMBER TO TEMPSUBSORT -USÉ TEMPSUBSORT * WEPLACE ALL START ITH RFEND / IEGENE * 10000 COUNT TO T? MPSUECO ****** # »** ** • *» »» • * »*» » »» »****« «**** t» »******* * i * A ******************* ++ ** *** •• ** «» * «* • *** '**' *
* FUSION ROUTINE
? 'SUBTRACTING LIBRARIES'
USE SUBTRACTION
COPY STRUCTURE TO CRUNCHER
SELECT 2
USE TEMPSUBSORT
SELECT 1
USE CRUNCHER
APPEND FROM TEMPTARSORT
COUNT TO BAILOUT
MARK = 0
DO WHILE .T.
  SELECT 1
  MARK = MARK + 1
  IF MARK > BAILOUT
    EXIT
  ENDIF
  GO MARK
  STORE ENTRY TO SCANNER
  SELECT 2
  LOCATE FOR ENTRY = SCANNER
  IF FOUND()
    STORE RFEND TO BIT1
    STORE RFEND TO BIT2
  ELSE
    STORE 1/2 TO BIT1
    STORE 0 TO BIT2
  ENDIF
  SELECT 1
  REPLACE BGFREQ WITH BIT2
  REPLACE ACTUAL WITH BIT1
  LOOP
ENDDO
SELECT 1
REPLACE ALL RATIO WITH RFEND/ACTUAL
SORT ON RATIO/D,BGFREQ/D,DESCRIPTOR TO FINAL
***************************************************************
SET DEVICE TO PRINT
SET PRINT ON
EJECT
SET ALTÉRNATE TO 'A enoíd .Patent Figures: Subtracfcion. txt * SBT ALTÉRNATE CN ENECASE
STORE VAL(SYS(2)) TO FINTIME
IF FINTIME < STARTIME
  STORE FINTIME + 86400 TO FINTIME
ENDIF
STORE GINTIME - STAGSG? ME .TO OCMPSEC STORE CCMPSEC / 60 TO COMPMIN '*** + **** + ********' * »** SET MARGI TO 10 ßl, l EAY" Library Your tractiop Analygis "STYLE 65536 FONT 'Genßva', 274 COLOR 0,0,0, -1, -1, - 7 i * 7 • t? date ()
tch = 0 .AND. Cpatch =? ' .AND. IMATCKsO
IF Hraatchsl 7? 'Human,'? NDGF • IF Cmatehßl 7? 'Othßr ep.' ENDJF
IF Imatch-1 7.7 'XNCYTE' ENDIF • IF ANA sl? 'Sorted by ABUNDANCE1-? NDIF. XF ANAL-2 7 'Arranged by FUNCTICN' ENDIF 7 'Total alones represented:' ?? STR. { TOT, 5.0)? 'Total -clones analyaedi' ?? STR (STARTOT, 5.0) 7"Total, csiryutation- time!. STR (CCM? IN (5,2) 7? 'Min? Taa"???' D M deáignation £ »distribution z» locatic r »Function t» rpecißi i = inte? "*****" ***** '*** + «**» * «» *** »» 0 HSADIN3"Screen 1" AT 40.2 SIZE 286,492 PDCSLS FWT 'Geneva', 9 COLOR 0,0,0, ?? STR (AUNIQUE, 4,0) '7?' Genes, for a total of '• .7? STR (ANALTOT, .4,0)' 7 'clones' 7. • SCSEEN 1 TYP? 0 HEADING "Scrßen 1" AT 40, 2 SlZE 286, 492 PIXELS FOOT' G «nßva", 7 COLOR 0, 0, 0, lißt OFF fields nunber, DrF, Z, Ri E? 7p «, S, DESCRI? TOR, BGFR £ Q, RFEND, RAT10, I SET PRIOT 'OFF' CLOSE DATABASES, • USE" SpartGuy: FOXBAS? + / Mac fox files and clones, obb * ANAL CÁB. «« 2 • * start / function SET PRIOT 'ON SET HEADE83 CN SCRJEEN 1 TYPE 0 KEADING' Screen l 'AT 40, 2 SIZE 286, 492 PIXELS' FONT 'Helvetica *, 268 COLOR 0
7 '• •? 'BINDINa FROTSINS'? SCREEN '1 TYPE 0 KEADING' Screen 1"AT 40 ', 2 SlZE 286,492 PIXELS FCIOT' Helvétic ', 265 COLOR 0
7 'surfaca molßcules and recepteni'. •
SCREEN 1 TYPE 0 HEADING 'Screen 1 »AT 40.2 SIZ2286,492 PIX? LS FC« T "G * nev **, 7 COLOR 0,0,0, LI-AT OFF number, D (' F; ZrR , EOTR, S, r3_SC ^? TO, S. ^^ FOR R-'B '• SCREEN 1 TYPE 0 HEADING "Screen" l'? T 40.2 SlZE 236,4.92 PIXELS .FONT .'Helvética ", 265 COLOR 0 7 'Calcum-binding proteins!' SCREEN 1 TYPE 0 KEADING «aereen 1 'AT 40,2'siZE 286,492 PIXELS FCNT' Geneva \ 7 COLOR 0,0,0, ÜSt OFF fields nupiber, D, F, Z , R, ENTRY, SlDESCRIPTOR, BsFREQ, M, END, RA'p:?, I FOR R = 'C SORBEN 1 TYPE 0 KEADIN3' Screen 1"AT 40.2 SIZE.286,492 PIXELS FCNT 'Helvetica", 265 COLOR 0 'Liganda' and affßctorai! SCREEN 1 TYPE 0 EEtoCNS "Sereen 1" AT 40.2 SlZE 286,492 PDCELS FCNT "Geneva", 7 COLOR 0,0,0, list OFF fields purnber, D ^ F, Z, R, ENIRY , S, DESCRIPTOR, BGFREQ, IiyEND, RATIO, I FOR R «'S' SCREEN 1 TYPE 0 HEADT-33" Screen 1 * AT 40.2 SlZE 286,492 PIXELS FCRT 'Helvetica ", 265 COLOR 0
7 'úther binding proteinei' SCKEEN 1 TYPE 0 H? ADINS 'Screen 1' AT'40,2 SlZE 286,492 PIXELS FCNT * C 'nßva *, 7 COLOR 0,0,0, list OFF fißldJ'pupi »ßr, DfF, Z, R, EOT7lY, S, D- ^ CRIPT0R, B3FRSQ, Ira €), RATI0, I FOR Ra'I '• 7. • • SCREEN 1 TYPE 0 HEADING "Scrßen 1" AT 40, 2 SlZE 286, 492 PIXELS FONT 'Helvetica', 268 COLOR 0
7 'CNCOGENES' 7 eCREEN 1 TYPE 0 HEADT- * »'Screen 1' AT 40.2 SlZE 286,492 PIXELS FONT 'Helvetica", 265 COLOR 0
7 'General oaeogenßsj', "SCREEN 1 TYPE 0, HEADING" Scrßen 1"AT-40,2 SZE 286,492 PIXELS -FONT * Geneva« ', 7 COLOR 0,0,0, list OFF fieid? E ^^ D) F (2,, E ^, 9fD ^ OT O fBsFR? A, HFpiD.R? TIOfX FOR £ = '©' SCREEN 1 TYPE 0 KEADING "Scrßen i 'AT 40,2 SlZE 286,492 PIXELS FONT' Helvetica *, 265 COLOR 0 7 'GTP-binding protein * i' SCREEH 1 TYPE 0 KEADING "Scrßßp 1" AT 40,2 SZE 286,492 PIXELS FONT 'Geneva ", 7 COLOR 0,0,0, li» t OFP fißlj nurnber, D, F, Z , R, ENTRY, S, DESCRIPTOR, BSFREQ, RFEND, RATIO, I FOR R »'0' SCRE? N 1 TYPE OR HSADIN3« Scrßen 1 * AT 40,2 SlZE 286,492 F? X? LS F0NT 'Helvetica *, 265 COLOR 0
7 'Viral ßlßnßptEi' "_ _ SCREEN 1 YFE 0 HEADING" Screen 1 * AT 40.2 SlZE 286,492 PIXELS FONT • JsS &rHr, 7 COLOR "? ?, 0, liet OFF fields number, D, _F, Z, R, IN, S, DESaUFT0R, BGFREQ, RFEND (RATIO, I FOR R-'V SCREEN 1 TYPE 0 HEADIN3"Scrßen 1" AT 40.2 SlZE 286 ', 492 FIX? LS FONT "Helvetica *, 255 COLOR 0 7-' Xinases and P osp atasegi '• SCH? IN 1 TYPE 0 H? ADING" Sareen 1 * AT 10.2 SlZE 286,492 PIXSLS FONT "Genev? I' , 7 COLOR 0,0,0, list OFF-fißlds nuinber, D, F, Z, R,! RY, S, DESCRIPTOR, BGFREQ, RFEND, RATIO, I FOR Ra'Y 'SCREEN 1-TYPE 0 HEADING "Scrßen 1 * AT 40.2 SlZE 286,492 PIXELS FCNT "Helvetica", 265 COLOR 0
? "Tumor-related antigensJ 'SCRE? N 1 TYPE 0 HEADING" Screen 1"AT 40.2 SlZE 286,492 PIXSLS FCNT" G * neva ", 7 COLOR 0,0,0, list OFF-fißlds number, D, F, Z , R, ENTRY, S, DESCRIPTOR, BGFREQ, RFEND, RATIO, I FOR R * 'A' 7. SCREEN 1 TYPE 0 HEADTtvG «Screen 1" AT 40.2 SIZ? 286,492 PIXELS FONT "Helvetica", 268 COLO 0
? '' PROTEIN SYNTHETIC KACHINERY PROTEINS! . ? , SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZ3286.492 PIXELS FONT 'Helvetica *, 265 COLOR 0
? 'Transcription and Nuclà © sic Acid-binding profeeinaj' SCREEN 1 TYP? 0 H? ADING "Scrßen 1" AT 40,2 SIZ? 286,492 PIXELS FONT "G"? "Va", 7 COLOR 0,0,0, list OFF faithful n «nber, D, F, Z, R, SNI ^ Y, S, DESCRIPTOR, BGFREQ; RFEND, RATIO, I FOR R = 'D' SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40.2 SlZE 286,492 PIXELS FONT -Helvética ", 265 COLOR.0 7 'Translation:' * 'SCREEN 1 TYPE 0 HEADING" Screen 1"AT 40,2 SlZE 286,492 PIXELS FCNT 'Geneva', 7 COLOR 0,0,0, list OFF fields nuiBfcer, D, F, Z, R> ENRRY, S, DESCRIPTOR, EGFREQ, RFEND, RATIO, I FOR R »'T' SCREEN 1 TYFE? ' HEADING "Screen 1" AT'40,2 SIZE 286,492 PIXELS FCNT 'Helvetica *, 265 COLOR 0?' Ribasaual proteins: '• SCRE? N 1 TY7E 0 HEADTNG "Screen 1 * AT 40.2 SSZE 286,492 PIXELS FQNT * G * n «vn ', 7 COLOR 0.0,0, liatt OFF fields nuiri3 € r, D, F, Z, R, ?? TRY, S < DESCRIPTOR, BGFREQ, RFEN ?, RATIO, I FOR Rß'R 'SCREEN 1 TYPE 0 KEADING * Scrß «n 1 * AT 40,2 SlZE' 286,492 PIXELS FONT" Helvetica ',' 265 COLOR 0 7 'Protein proceseing:', SCREEN 1 TYPE 0 KEADING "Scxeßn 1" AT 40,2 SlZE 286,432 PIXELS FONT «G '« neva ", 7 COLOR 0,0,0, list OFF Cielda nup ± ter, D, F, Z, R, E2írRY, S , DESCRIPTOR, BGF9SQ, RFEND, RATIO, I FOR R-TL '7'. . . SCKEEN 1 TYPE 0 HEADIOT "Seseen 1 * AT 40..2 S ZE 286, 492 PIXELS. FONT" "Hßlvßtica ', 268 COLOR 0' 7 7 'IN ES" 7 SCREEN 1 TYPE 0 HEADIN5 * Sczeen 1' AT 40, 2 SlZE 286,492 PIXELS F8T "Helvétic", 265 COLOR 0 • • 'Ferroprotßinst' SCREEN 1 TYPB 0 HEADING 'Screen 1"AT 40,2 SlZE 286,492 PIXELS FONT" Gßneva ", 7 COLOR 0,0,0, liat OFF fields ni ? ibr, D, F, Z, R, 2NTRY, S, DESCRIPTCR, SSTlEQ, RFEND, RATIO, I FOR Rs'F 'SCREEN 1 TYPE 0 HEADING "Screen-1" AT 40.2 SISE 235,492 PIXELS FONT "Helvetica" , 265 COLOR 0 7 'Proseases and inhibi ors!' SCREEN 1 TYPE 0 HEADING "Scrßen 1« AT 40,2 SlZE 286,492 PIXELS 70NT 'Geneva', 7 COLOR 0,0,0, liat OFF fields nup? ^ R, D , F, Z, R, EOTSY, S, DESCRIBE, 8sFREQ, RrEND, R? 
TIO (I FOR R 'P' SCREEN 1 TYPB 0 HEADING "Scrßen 1" AT 40,2 SlZE 286,492 PIXELS TONT "Helv tica", 265 COLOR 0 7 'Oxidative phoep orylatianj.' .. SCREEN 1 TYPE 0 HEADINQ "Scr" in 1"AT 40.2 SlZE 286,492 PIXELS FQNT" Geneva "^ COLOR 0,0,0, list OFF fißlds nupiber.DFZiR.ia ? IRY.S.DESCRIPIOR.aGFREQ. BFED.RATIO,! FOR R-c'Z "SCREEN 1 TYP? 0 HEADING '' Scrßen 1"AT 40,2 S2ZB 286,492 PIXELS FONT" Helvetica ", 265 COLOR 0 7 'Sugax metatollsnu' • SCKEEN 1 TYPE 0 HEADING" Screen 1"AT 40.2 SlZE 236,492 PIXELS FONT" Ceneva ", 7 COLOR 0,0,0, liat OFF fields number, D, F, Z, R, EOTRY, S, DESCRIPTOR, BGFREQ, RFEND RATIO, I FOR R * 'Q' eCRE? N "l TYPE 0 HEADIKG '.¡ believe 1 'AT 40,2 SlZE 286,492 PIXELS FONT "Helvetica", 265 COLOR 0
7 'Aaiino acid pßtabolisn:' '• SCRE? N 1 TYPE 0 HEADING "Screen 1" AT 40.2 SIZ? 286,492 PIXELS FCCST "Geneva", 7 COLOR 0,0,0;
list OFP fields nupbßr, D, F, Z, R, ENIRY, S, DESCR? PTOR, BG7REQ, RFEND, R? TIOrI FOR R = 'M' SCREEN 1 TYPE 0. HEADXNG "Screen 1 'AT 40, 2 SIZB 286 , 492 PIX? LS FCNT "jdf fief 'S? S CO OR 0? 'Nucleic "acid metabolism:' • SCREEN l .TYPE 0 H? ADIN3" Screen '1"AT 40, 2 SlZE 286, 492 PIXELS FCNT" Geneva ", 7 COLOR 0, 0, 0,' fW liat, OFF 'fields nupbßr, D, F, Z, R, ENTRY, S, DESCRIPTOR, BGFREQ, RFEND, RATIO, I FOR R- 'N' 'SCREEN'1 TYPE 0 HE? DING "Screen 1" AT 40.2 SlZE 286,492 PIXELS' FCNT "Helvetica", 26S COLOR 0 7 'Lip'id mßtaboütn:' SCREEN 1 TYPE 0 HSADING "Screen 1" AT 40,2 SIZ? 286,492 PIXELS rCW "Geneva", 7 COLOR 0,0,0, liat OFF fields nupber , D, F, Z, R, ENTRY, S, I3ESCRIPR0R, BGFREQ, RFEND, RATIO, I FOR R »'W BCREEN 1 TYPE 0 HEADIN3" Scrßen' 1"AT 40.2 SlZE 286,492 PIXELS FONT" Helvetica ", 265 COLOR 0
7 'Ot ßr ßnzyroea:' SCKEEN 1 TYPE 0 HEADING "Screen 1" AT 40.2 SlZE 286,492 PIXELS FONT "Genßva", 7 COLOR 0,0,0, liat OFF fiields cÚ > ! r, D, Z,, E7IRY, 9, DESCRIPTOR, BGFREQ, FEND, RATTO / S FOR R = 'E' 7. . • •. • -. - • 'SCRE? N 1 TYPE 0 H? ADING "Screen 1' AT 40.2 S2ZE 286,492 PIXZLS FCNT • Helvetica", 268 COLOR 0
? '? • MISCFT.TANEOUS CATEGORIES '? SCRE? N 1 TYPE 0 HEADING "Screßn 1" AT 40,2 SlZE 286,492 PIXELS FCNT "Helvetica", 265 COLOR 0
? 'Stress responder:' 'SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZ? 286,492 PIXELS FONT 'Gen «va", 7 COLOR 0,0,0, liat OFF fißlds pwaber, D, FvZ, R, ENTRY, S, DESCRIPTOR, BGFREQ, RF? ND, RATIO, I FOR R *' H 'SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Helvetica", 265 COLO '0?' STRUCTURAL: '• SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZ3286,492 PIXELS FCNT "Gßn < ? va ", 7 COLOR 0,0,0, list OFF fields nunber, D, F, Z, R, ENIRY, S, IffiSCRIPTOR, BGFREQ, RFEriD, RATIO, I; FOR R = 'K' SCREEN 1 TYP? 0 H? ADING "Screen 1" AT 40Í2 SIZ? 286,492 PIXELS FONT "Helvetica", 255 COLOR? • Other clones ¡"SCREEN 1 TYP? 0 SEADHa "Screen 1" AT 40.2 SlZE 286,492 PIXELS 'FONT "" Gene-va ", 7. COLOR 0,0,0 list OFF fields ni2d ^ r, D, F, Z, R, ENrRY, S, iESCRIPTOR , BGFREQ, RFEND, RAT10, I FOR R = 'X' SCREEN 1 TYP? 0 HEADING "Serien 1 * AT 40.2 SlZE 286,492 PIXELS FCNT" Helvetica ", 265 COLOR 0? 'Clones? £ jmjswn fupctions' SCREEN 1 T? FE 0 K? ADKG = 3eraert 1"AT 40.2 SISE 286,432 PIXSLS 3TCNT" Cer. «Va *, 7 COLOR 0,0,0, list OFF fißlds nur? Mr, D,?, Z (R, ENIRY, S, DESCRIPTOR, BGFREQ, RÍ? ND, RATIO, I FOR R «'U' ENDCASE
DO "Test print.prg"
SET PRINT OFF
SET DEVICE TO SCREEN
CLOSE DATABASES
ERASE TEMPLSB.DBF ERASE TEMPW? T.DBF ERASE TEMPDESId.DB? SET KARGIN TO 0 CLSAR
LOOP
ENDDO
• Northern (ainala), version 11-25-94 eloße dataJases SET TALK OFF
SET PRINT OFF 'SET EXACT OFF ß - CLEAR' STOSE. ' 'TO Eobject STORE' 'TO Dobject STORE 0 TO Nup ?. STORE 0 'TO zog STORE 1 TO Bail DO WHILE, T. . * Prograra. i Northern (single) .fine * Data 8/8/34 • * Veraion-. s .Fos BASE + / liac, 'revision 1.10 * Notes. ...: .Fopnat file Northern (ßingle) * • • 'SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40.2 SlZE 286,493 PIXELS FCMT "Geneva", 12 COLO' 0,0,0 ß PIXELS 15,81 TO 46,397 STYLE 28447 COLOR 0,0, -1, -25600, -1, -1 ß PIXELS 89,79 TO 192,422 STYLB 28447 COLOR 0,0,0, -25600, -1, -1 ß PIXELS 115,98 SAY " Eptry ti • STYLE 65536 F8T "Genßva", 12 COLOR 0,0,0, -1, -1, -1 S PIXELS 115,173 GET Eobjeat STYLE 0 FCNT 'Geneva ", 12 SIZS 15,142 COLOR 0,0,0, -1 , -1, -1 8 PIXELS 145.89 SAY • Deaeription "STYLE 65536 FONT." Genov ', 12 COLOR 0,0,0, -1, -1, -1. ß PIXELS 145,173 GBT Dobject STYLE 0 FOOT 'Geneva', 12 SI2E 15,241 COLOR 0,0,0, -1, -1, -1 ß PIX? LS 35,89 SAY "Single Northern search aereen" STYLB 65536 FCNT * G « neva ", 274 COLOR 0,0, - ß PIXELS 220,162 GET Bail STYLE 65536 FONT 'Chicago", 12 PICTORE "ß * R I continued, Bail out' SlZE
9 PIXELS 175.98 SAY "Clone #?" STYLS 55535 FONT 'Gßpeva. »., 12 COLOR 0,0,0, -1, -1.-1 9- PIXELS 175,173 GET Numb STYLE 0 FONT * G« neva \ 12 Slze 15,70 COLOR 0,0,0 , -1, -1, -1 ß PIXELS 80,152 SAY "Enter * p and CNE of the following" STYLE 65535 FONT "Geneva ', 12 COLOR -1', * * 'ECF: Northern (single). Fmt KEADTF Eail = 2 CLEAR • screen 1 off 'RETURN ENDTP
USB »S? NartGuy¡F0x3ASE + /? Ac: Fc? files iLookup. bf "SET TAL 'ON
dbf '
SNDGF
BRCW5E STORE Entry TO Searchval 'CLOS? DATAHASES
ERASE '"Loo3? .- p entry.dbf" ENDIF • ZF-Dobject' • SET 2XACT OFF SET SAFETY OFF eaRT ON descriptor TO "LcoJaip descriptor bf" SET SAFETY On USE "LooJaip descriptor bf 'LCCATE FOR UPPER (TR M (descriptor)) = UPPER (TRIM (Dobjeet)) • ur. NOT.FOUND U CL? AR LOOP BNDTJ? BROWSE
STORE Entry TO Searchval CLOSB DATABASBS, ERASE 'Lookup descriptor. dbf "# SET EXACT ON IN? TF • F NuwboO USE • Smartsuy! FoxBASE + rMac: Fox files .clones, bf" GO Nüpk ERCW3E .STORE Entry TO Sßarchval E? DIF OLEAR? 'Northern analyali ffor e? Cry' 7? Sear'chval 7 •. ? 'Encer Y to proceed' WATT to o? - CLEAR
IF UPPER (OK) or'Y 'screen 1 off RETURN
ENDIF * COWPRESSICN 'SUBROÜTINE FOR LIBRA? Y.dbf 7' Coifreaaing the Librarles file no .--. . ' USE "SpartGuy: FoxBASE + / Mac: Fox filters .libraxißß .dbf"
SET SAFETY GFF - SORT ON library .TO "CsppreBaed librarles. Bf • * FOR entered> 0 'SET SAFETY ON
USE "Cappreased librarles", dbf • DELST3 FOR ßnteredse'O PACK
COUNT TO TOT
MARK1 * 1 S¡W2 »0. DO WHILE S 2 »0 ROLL • 33? MAR ^ l > »TOT • PACK SW2» = 1 'LOOP EMDIF
GO MARX1. 'STORE libraxy TO TESTA' SKEP STORE Library TO TESTB IF TESTA m TESTB DELET? E2ÜIF MARK1 »MARTl + l OOP 'ENDDO ROLL * Northern ßnálysiß CLEAR 7' Doing thß northern now. , • SET TALK OK
USE * epvartGu? Fox3ASE * / Mac «Fox f iles curtains. bf * - COPY TO "Hite bf '", FOR ent? y * ßearehval SET SAFETY CN * MASTER ANALYSIS 3; VERSION 12-5-94 * Master menu for analyaio output CLOS? DATABAS? 3 SET TALK OFF SET SAFETY OF? CLEAR
SET D? VTC? TO SCRE3N SET DEFAULT TO "SmartGuy: Fc ?? BASE + / Mac: fox filesiOutput programsi" USE "SmartGuyjFoxBASE - (- / Mac: fox iles¡Clones. Bf" GO TOP
STORE NUMB? R TO INITIAT3 GO BOTTO
STORE NUMBER TO TERMINATE
STORE 0 TO ENTIRE
STORE 0 TO XMATCH
STORE 0 TO PRINTON
STORE 0 TO PTF
DO WHILE .T.
* Program.: Master analysis.fmt
* Date....: 12/9/94
* Version.: FoxBASE+/Mac, revision 1.10
* Notes...: Format file Master analysis
*
SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Geneva",9 COLOR 0,0,0,
@ PIXELS 39,255 TO 277,430 STYLE 28447 COLOR 0,0,-1,-25600,-1,-1
@ PIXELS 75,120 TO 178,241 STYLE 3871 COLOR 0,0,-1,-25600,-1,-1
@ PIXELS 27,98 SAY "Customized Output Menu" STYLE 65536 FONT "Geneva",274 COLOR 0,0,-1,-1,-1
@ PIXELS 45,54 GET conden STYLE 65536 FONT "Chicago",12 PICTURE "@*C Condensed format" SIZE
@ PIXELS 54,261 GET anal STYLE 65536 FONT "Chicago",12 PICTURE "@*RV Sort/number;Sort/entry;
@ PIXELS 117,126 GET EMATCH STYLE 65536 FONT "Chicago",12 PICTURE "@*C Exact" SIZE 15,62 CO
@ PIXELS 135,126 GET HMATCH STYLE 65536 FONT "Chicago",12 PICTURE "@*C Homologous" SIZE 15,1
@ PIXELS 153,125 GET OMATCH STYLE 65536 FONT "Chicago",12 PICTURE "@*C Other spc" SIZE 15,84
@ PIXELS 90,152 SAY "Matches:" STYLE 65536 FONT "Geneva",268 COLOR 0,0,-1,-1,-1,-1
@ PIXELS 53,54 GET PRINTON STYLE 65536 FONT "Chicago",12 PICTURE "@*C Include clone listing"
@ PIXELS 171,126 GET Imatch STYLE 65536 FONT "Chicago",12 PICTURE "@*C Incyte" SIZE 15,65 CO
@ PIXELS 252,146 GET initiate STYLE 0 FONT "Geneva",12 SIZE 15,70 COLOR 0,0,-1,-1,-1,-1
@ PIXELS 270,146 GET terminate STYLE 0 FONT "Geneva",12 SIZE 15,70 COLOR 0,0,-1,-1,-1,-1
@ PIXELS 234,134 SAY "Include clones" STYLE 65536 FONT "Geneva",12 COLOR 0,0,-1,-1,-1,-1
@ PIXELS 270,125 SAY "- &" STYLE 65536 FONT "Geneva",14 COLOR 0,0,-1,-1,-1,-1
@ PIXELS 198,126 GET PTF STYLE 65536 FONT "Chicago",12 PICTURE "@*C Print to file" SIZE 15,9
@ PIXELS 189,0 TO 257,120 STYLE 3871 COLOR 0,0,-1,-25600,-1,-1
@ PIXELS 209,8 SAY "Library selection" STYLE 65536 FONT "Geneva",266 COLOR 0,0,-1,-1,-1,-1
@ PIXELS 227,18 GET ENTIRE STYLE 65536 FONT "Chicago",12 PICTURE "@*RV All;Selected" SIZE 16
*
* EOF: Master analysis.fmt
READ
IF ANAL=9
CLEAR
CLOSE DATABASES
ERASE TEMPMASTER.DBF
USE "SmartGuy:FoxBASE+/Mac:fox files:clones.dbf"
SET SAFETY ON
SCREEN 1 OFF
RETURN
ENDIF
clear
? INITIATE
? TERMINATE
? CONDEN
? ANAL
? ematch
? Hmatch
? Omatch
? IMATCH
SET TALK ON
IF ENTIRE=2
USE "Unique libraries.dbf"
REPLACE ALL i WITH ' '
BROWSE FIELDS i,libname,library,total,entered AT 0,0
ENDIF
USE "SmartGuy:FoxBASE+/Mac:fox files:clones.dbf"
* COPY TO TEMPNUM FOR NUMBER>=INITIATE.AND.NUMBER<=TERMINATE
* USE TEMPNUM
COPY STRUCTURE TO TEMPLIB
USE TEMPLIB
IF ENTIRE=1
APPEND FROM "SmartGuy:FoxBASE+/Mac:fox files:Clones.dbf"
ENDIF
IF ENTIRE=2
USE "Unique libraries.dbf"
COPY TO SELECTED FOR UPPER(i)='Y'
USE SELECTED
STORE RECCOUNT() TO STOPIT
MARK=1
DO WHILE .T.
IF MARK > STOPIT
CLEAR
EXIT
ENDIF
USE SELECTED
GO MARK
STORE library TO THISONE
? 'COPYING '
?? THISONE
USE TEMPLIB
APPEND FROM "SmartGuy:FoxBASE+/Mac:fox files:Clones.dbf" FOR library=THISONE
STORE MARK+1 TO MARK
LOOP
ENDDO
ENDIF
USE "SmartGuy:FoxBASE+/Mac:fox files:clones.dbf"
COUNT TO STARTOT
COPY STRUCTURE TO TEMPDESIG
USE TEMPDESIG
IF Ematch=0 .AND. Hmatch=0 .AND. Omatch=0 .AND. IMATCH=0
APPEND FROM TEMPLIB
ENDIF
IF Ematch=1
APPEND FROM TEMPLIB FOR D='E'
ENDIF
IF Hmatch=1
APPEND FROM TEMPLIB FOR D='H'
ENDIF
IF Omatch=1
APPEND FROM TEMPLIB FOR D='O'
ENDIF
IF Imatch=1
APPEND FROM TEMPLIB FOR D='I'.OR.D='X'.OR.D='N'
ENDIF
IF Xmatch=1
APPEND FROM TEMPLIB FOR D='X'
ENDIF
set talk off
DO CASE
CASE PTF=0
SET DEVICE TO PRINT
SET PRINT ON
EJECT
CASE PTF=1
SET ALTERNATE TO "Total function sort.txt"
SET ALTERNATE TO "H and O function sort.txt"
SET ALTERNATE TO "Shear Stress HUVEC 2:Abundance sort.txt"
* SET ALTERNATE TO "Shear Stress HUVEC 2:Abundance con.txt"
* SET ALTERNATE TO "Shear Stress HUVEC 2:Function sort.txt"
* SET ALTERNATE TO "Shear Stress HUVEC 2:Distribution sort.txt"
* SET ALTERNATE TO "Shear Stress HUVEC 1:Clone list.txt"
SET ALTERNATE TO "Shear Stress HUVEC 2:Location sort.txt"
SET ALTERNATE ON
ENDCASE
?
? date()
?? '  '
?? TIME()
? 'Clone numbers '
?? STR(INITIATE,6,0)
?? ' through '
?? STR(TERMINATE,6,0)
? 'Libraries: '
IF ENTIRE=1
? 'All Libraries'
ENDIF
IF ENTIRE=2
MARK=1
DO WHILE .T.
IF MARK > STOPIT
EXIT
ENDIF
USE SELECTED
GO MARK
? '  '
?? TRIM(libname)
STORE MARK+1 TO MARK
LOOP
ENDDO
ENDIF
? 'Designations: '
IF Ematch=0 .AND. Hmatch=0 .AND. Omatch=0 .AND. IMATCH=0
?? 'All'
ENDIF
IF Ematch=1
?? 'Exact,'
ENDIF
IF Hmatch=1
?? 'Human,'
ENDIF
IF Omatch=1
?? 'Other sp.'
ENDIF
IF Imatch=1
?? 'INCYTE'
ENDIF
IF Xmatch=1
?? < SS7 '
ENDIF
IF CONDEN=1
? 'Condensed format analysis'
ENDIF
IF ANAL=1
? 'Sorted by NUMBER'
ENDIF
IF ANAL=2
? 'Sorted by ENTRY'
ENDIF
IF ANAL=3
? 'Arranged by ABUNDANCE'
ENDIF
IF ANAL=4
? 'Sorted by INTEREST'
ENDIF
IF ANAL=5
? 'Arranged by LOCATION'
ENDIF
IF ANAL=6
? 'Arranged by DISTRIBUTION'
ENDIF
IF ANAL=7
? 'Arranged by FUNCTION'
ENDIF
? 'Total clones represented: '
?? STR(STARTOT,6,0)
? 'Total clones analyzed: '
?? STR(ANALTOT,6,0)
?
? 'l=library d=designation f=distribution z=location r=function c=cer
? '*****************************************************************'
USE TEMPDESIG
SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Geneva",7 COLOR 0,0,0,
DO CASE
CASE ANAL=1
* sort/number
SET HEADING ON
IF CONDEN=1
SORT TO TEMP1 ON ENTRY,NUMBER
DO "COMPRESSION number.PRG"
ELSE
SORT TO TEMP1 ON NUMBER
USE TEMP1
list off fields number,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR
* list off fields number,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,RFEND,
CLOSE DATABASES
ERASE TEMP1.DBF
ENDIF
CASE ANAL=2
* sort/DESCRIPTOR
SET HEADING ON
* SORT TO TEMP1 ON DESCRIPTOR,ENTRY,NUMBER/S for D='E'.OR.D='H'.OR.D='O'.OR.D='X'.OR.D='I'
* SORT TO TEMP1 ON ENTRY,DESCRIPTOR,NUMBER/S for D='E'.OR.D='H'.OR.D='O'.OR.D='X'.OR.D='I'
SORT TO TEMP1 ON ENTRY,START/S for D='E'.OR.D='H'
IF CONDEN=1
DO "COMPRESSION entry.PRG"
ELSE
USE TEMP1
list off fields number,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,RFEND,INIT,I
CLOSE DATABASES
ERASE TEMP1.DBF
ENDIF
CASE ANAL=3
* sort by abundance
SET HEADING ON
SORT TO TEMP1 ON ENTRY,NUMBER for D='E'.OR.D='H'.OR.D='O'.OR.D='X'.OR.D='I'
DO "COMPRESSION abundance.PRG"
CASE ANAL=4
* sort/interest
SET HEADING ON
IF CONDEN=1
SORT TO TEMP1 ON ENTRY,NUMBER FOR I>0
DO "COMPRESSION interest.PRG"
ELSE
SORT ON I/D,ENTRY TO TEMP1 FOR I>1
USE TEMP1
list off fields number,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,RFEND,INIT,I
CLOSE DATABASES
ERASE TEMP1.DBF
ENDIF
CASE ANAL=5
* arrange/location
SET HEADING ON
STORE 4 TO AMPLIFIER
? 'Nuclear:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMEN
IF CONDEN=1
DO "Compression location.prg"
ELSE
DO "Normal subroutine 1"
L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMEN
L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMEN
? 'Cell surface:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMEN
IF CONDEN=1
DO "Compression location.prg"
ELSE
DO "Normal subroutine 1"
ENDIF
? 'Intracellular membrane:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMEN
IF CONDEN=1
DO "Compression location.prg"
ELSE
DO "Normal subroutine 1"
ENDIF
? 'Mitochondrial:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMEN
IF CONDEN=1
DO "Compression location.prg"
ELSE
DO "Normal subroutine 1"
ENDIF
? 'Secreted:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMEN
IF CONDEN=1
DO "Compression location.prg"
ELSE
DO "Normal subroutine 1"
ENDIF
? 'Other:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMEN
IF CONDEN=1
DO "Compression location.prg"
ELSE
DO "Normal subroutine 1"
RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMEN
SET DEVICE TO PRINTER
SET PRINTER ON
EJECT
DO "Output heading.prg"
USE "Analysis location.dbf"
DO "Create bargraph.prg"
SET HEADING OFF
? 'FUNCTIONAL CLASS TOTAL UNIQUE NEW % TOTAL'
LIST OFF FIELDS Z,NAME,CLONES,GENES,NEW,PERCENT,GRAPH
CLOSE DATABASES
ERASE TEMP2.DBF
SET HEADING ON
* USE "SmartGuy:FoxBASE+/Mac:fox files:TEMPMASTER.dbf"
ENDIF
CASE ANAL=6
* arrange/distribution
SET HEADING ON
STORE 3 TO AMPLIFIER
? 'Cell/tissue specific distribution:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMEN
IF CONDEN=1
DO "Compression distrib.prg"
ELSE
DO "Normal subroutine 1"
ENDIF
? 'Non-specific distribution:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMEN
IF CONDEN=1
DO "Compression distrib.prg"
ELSE
DO "Normal subroutine 1"
ENDIF
? 'Unknown distribution:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMEN
IF CONDEN=1
DO "Compression distrib.prg"
ELSE
DO "Normal subroutine 1"
ENDIF
IF CONDEN=1
SET DEVICE TO PRINTER
SET PRINTER ON
EJECT
DO "Output heading.prg"
USE "Analysis distribution.dbf"
DO "Create bargraph.prg"
SET HEADING OFF
? 'FUNCTIONAL CLASS TOTAL UNIQUE % TOTAL'
?
LIST OFF FIELDS P,NAME,CLONES,GENES,PERCENT,GRAPH
CLOSE DATABASES
ERASE TEMP2.DBF
SET HEADING ON
* USE "SmartGuy:FoxBASE+/Mac:fox files:TEMPMASTER.dbf"
ENDIF
CASE ANAL=7
* arrange/function
SET HEADING ON
STORE 10 TO AMPLIFIER
? 'BINDING PROTEINS'
?
? 'Surface molecules and receptors:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMEN
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
ENDIF
? 'Calcium-binding proteins:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMEN
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
ENDIF
? 'Ligands and effectors:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMEN
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
ENDIF
? 'Other binding proteins:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMEN
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
ENDIF
* EJECT
? 'ONCOGENES'
?
? 'General oncogenes:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMEN
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
ENDIF
? 'GTP-binding proteins:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMEN
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
ENDIF
? 'Viral elements:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMEN
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
ENDIF
? 'Kinases and Phosphatases:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMEN
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
ENDIF
? 'Tumor-related antigens:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMEN
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
ENDIF
? 'PROTEIN SYNTHETIC MACHINERY PROTEINS'
? 'Transcription and Nucleic Acid-binding proteins:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMEN
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
ENDIF
? 'Translation:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMEN
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
ENDIF
? 'Ribosomal proteins:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMEN
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
ENDIF
? 'Protein processing:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMEN
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
ENDIF
* EJECT
? 'ENZYMES'
? 'Ferroproteins:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMEN
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
ENDIF
? 'Proteases and inhibitors:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMEN
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
ENDIF
? 'Oxidative phosphorylation:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMEN
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
ENDIF
? 'Sugar metabolism:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMEN
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
ENDIF
? 'Amino acid metabolism:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMEN
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
ENDIF
? 'Nucleic acid metabolism:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMEN
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
ENDIF
? 'Lipid metabolism:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMEN
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMEN
* EJECT
? 'MISCELLANEOUS CATEGORIES'
?
? 'Stress response:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMEN
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
ENDIF
? 'Structural:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMEN
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
ENDIF
? 'Other clones:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMEN
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
ENDIF
? 'Clones of unknown function:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMEN
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
ENDIF
IF CONDEN=1
EJECT
* SET DEVICE TO PRINTER
* SET PRINT ON
DO "Output heading.prg"
USE "Analysis function.dbf"
DO "Create bargraph.prg"
SET HEADING OFF
SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Geneva",12 COLOR 0,0,0
?
? '                  TOTAL   TOTAL  NEW    DIST
? 'FUNCTIONAL CLASS  CLONES  GENES  GENES  FUNCTIONAL CLASS'
?
* LIST OFF FIELDS P,NAME,CLONES,GENES,NEW,PERCENT,GRAPH,COMPANY
LIST OFF FIELDS P,NAME,CLONES,GENES,NEW,PERCENT,GRAPH
CLOSE DATABASES
ERASE TEMP2.DBF
SET HEADING ON
* USE "SmartGuy:FoxBASE+/Mac:fox files:TEMPMASTER.dbf"
ENDIF
CASE ANAL=8
DO "Subgroup summary 3.prg"
ENDCASE
DO "Test print.prg"
SET PRINT OFF
SET DEVICE TO SCREEN
CLOSE DATABASES
* ERASE TEMPLIB.DBF
* ERASE TEMPNUM.DBF
* ERASE TEMPDESIG.DBF
* ERASE SELECTED.DBF
CLEAR
LOOP
ENDDO
* COMPRESSION SUBROUTINE FOR ANALYSIS PROGRAMS
USE TEMP1
COUNT TO TOT
REPLACE ALL RFEND WITH 1
MARK1 = 1
SW2=0
DO WHILE SW2=0 ROLL
IF MARK1 >= TOT
PACK
COUNT TO UNIQUE
COUNT TO NEWGENES FOR D='H'.OR.D='O'
SW2=1
LOOP
ENDIF
GO MARK1
DUP = 1
STORE ENTRY TO TESTA
SW = 0
DO WHILE SW=0 TEST
SKIP
STORE ENTRY TO TESTB
IF TESTA = TESTB
DELETE
DUP = DUP+1
LOOP
ENDIF
GO MARK1
REPLACE RFEND WITH DUP
MARK1 = MARK1+DUP
SW=1
LOOP
ENDDO TEST
LOOP
ENDDO ROLL
GO TOP
STORE Z TO LOC
USE "Analysis location.dbf"
LOCATE FOR Z=LOC
REPLACE CLONES WITH TOT
REPLACE GENES WITH UNIQUE
REPLACE NEW WITH NEWGENES
USE TEMP1
SORT ON RFEND/D TO TEMP2
USE TEMP2
?? STR(UNIQUE,5,0)
?? ' genes, for a total of '
?? STR(TOT,5,0)
?? ' clones'
? '  V Coincidence'
list off fields number,RFEND,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I
* SET PRINT OFF
CLOSE DATABASES
ERASE TEMP1.DBF
ERASE TEMP2.DBF
USE TEMPDESIG

* COMPRESSION SUBROUTINE FOR ANALYSIS PROGRAMS
USE TEMP1
COUNT TO TOT
REPLACE ALL RFEND WITH 1
MARK1 = 1
SW2=0
DO WHILE SW2=0 ROLL
IF MARK1 >= TOT
PACK
COUNT TO UNIQUE
SW2=1
LOOP
ENDIF
GO MARK1
DUP = 1
STORE ENTRY TO TESTA
SW = 0
DO WHILE SW=0 TEST
SKIP
STORE ENTRY TO TESTB
IF TESTA = TESTB
DELETE
DUP = DUP+1
LOOP
ENDIF
GO MARK1
REPLACE RFEND WITH DUP
MARK1 = MARK1+DUP
SW=1
LOOP
ENDDO TEST
LOOP
ENDDO ROLL
* BROWSE
COUNT TO P3 FOR I=3
IF P3>0
? STR(P3,3,0)
?? ' genes with priority = 3 (Full insert sequence:)'
list off fields number,RFEND,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT for I=3
?
ENDIF
COUNT TO P2 FOR I=2
IF P2>0
? STR(P2,3,0)
?? ' genes with priority = 2 (Primary analysis complete:)'
list off fields number,RFEND,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT for I=2
?
ENDIF
COUNT TO P1 FOR I=1
IF P1>0
? STR(P1,3,0)
?? ' genes with priority = 1 (Primary analysis needed:)'
list off fields number,RFEND,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT for I=1
ENDIF
* SET PRINT OFF
CLOSE DATABASES
ERASE TEMP1.DBF
ERASE TEMP2.DBF
USE "SmartGuy:FoxBASE+/Mac:fox files:clones.dbf"

* COMPRESSION SUBROUTINE FOR ANALYSIS PROGRAMS
USE TEMP1
COUNT TO TOT
REPLACE ALL RFEND WITH 1
MARK1 = 1
SW2=0
DO WHILE SW2=0 ROLL
IF MARK1 >= TOT
PACK
COUNT TO UNIQUE
SW2=1
LOOP
ENDIF
GO MARK1
DUP = 1
STORE ENTRY TO TESTA
SW = 0
DO WHILE SW=0 TEST
SKIP
STORE ENTRY TO TESTB
IF TESTA = TESTB
DELETE
DUP = DUP+1
LOOP
ENDIF
GO MARK1
REPLACE RFEND WITH DUP
MARK1 = MARK1+DUP
SW=1
LOOP
ENDDO TEST
LOOP
ENDDO ROLL
* BROWSE
USE TEMP2
?? ' genes, for a total of '
? '  V Coincidence'
list off fields number,RFEND,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I
* SET PRINT OFF
CLOSE DATABASES
ERASE TEMP1.DBF
ERASE TEMP2.DBF
USE "SmartGuy:FoxBASE+/Mac:fox files:clones.dbf"

* COMPRESSION SUBROUTINE FOR ANALYSIS PROGRAMS
USE TEMP1
COUNT TO TOT
REPLACE ALL RFEND WITH 1
MARK1 = 1
SW2=0
DO WHILE SW2=0 ROLL
IF MARK1 >= TOT
PACK
COUNT TO UNIQUE
COUNT TO NEWGENES FOR D='H'.OR.D='O'
SW2=1
LOOP
ENDIF
GO MARK1
DUP = 1
STORE ENTRY TO TESTA
SW = 0
DO WHILE SW=0 TEST
SKIP
STORE ENTRY TO TESTB
IF TESTA = TESTB
DELETE
DUP = DUP+1
LOOP
ENDIF
GO MARK1
REPLACE RFEND WITH DUP
MARK1 = MARK1+DUP
SW=1
LOOP
ENDDO TEST
LOOP
ENDDO ROLL
REPLACE GENES WITH UNIQUE
REPLACE NEW WITH NEWGENES
USE TEMP1
SORT ON RFEND/D TO TEMP2
USE TEMP2
SET HEADING ON
?? STR(UNIQUE,5,0)
?? ' genes, for a total of '
?? STR(TOT,5,0)
?? ' clones'
* ? '  V Coincidence'
* list off fields number,RFEND,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I
* SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Geneva",12 COLOR 0,0,
list off fields RFEND,S,DESCRIPTOR
* SET PRINT OFF
CLOSE DATABASES
ERASE TEMP1.DBF
ERASE TEMP2.DBF
USE TEMPDESIG

* COMPRESSION SUBROUTINE FOR ANALYSIS PROGRAMS
USE TEMP1
COUNT TO TOT
REPLACE ALL RFEND WITH 1
MARK1 = 1
SW2=0
DO WHILE SW2=0 ROLL
IF MARK1 >= TOT
PACK
COUNT TO UNIQUE
SW2=1
LOOP
ENDIF
GO MARK1
DUP = 1
STORE ENTRY TO TESTA
SW = 0
DO WHILE SW=0 TEST
SKIP
STORE ENTRY TO TESTB
IF TESTA = TESTB
DELETE
DUP = DUP+1
LOOP
ENDIF
GO MARK1
REPLACE RFEND WITH DUP
MARK1 = MARK1+DUP
SW=1
LOOP
ENDDO TEST
LOOP
ENDDO ROLL
GO TOP
STORE F TO DIST
USE "Analysis distribution.dbf"
LOCATE FOR P=DIST
REPLACE CLONES WITH TOT
REPLACE GENES WITH UNIQUE
USE TEMP1
sort on rfend/d to TEMP2
USE TEMP2
?? STR(UNIQUE,5,0)
?? ' genes, for a total of '
?? STR(TOT,5,0)
?? ' clones'
? '  V Coincidence'
list off fields number,RFEND,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR
* SET PRINT OFF
CLOSE DATABASES
ERASE TEMP1.DBF
ERASE TEMP2.DBF
USE TEMPDESIG

* COMPRESSION SUBROUTINE FOR ANALYSIS PROGRAMS
USE TEMP1
COUNT TO TOT
REPLACE ALL RFEND WITH 1
MARK1 = 1
SW2=0
DO WHILE SW2=0 ROLL
IF MARK1 >= TOT
PACK
COUNT TO UNIQUE
SW2=1
LOOP
ENDIF
GO MARK1
DUP = 1
STORE ENTRY TO TESTA
SW = 0
DO WHILE SW=0 TEST
SKIP
STORE ENTRY TO TESTB
IF TESTA = TESTB
DELETE
DUP = DUP+1
LOOP
ENDIF
GO MARK1
REPLACE RFEND WITH DUP
MARK1 = MARK1+DUP
SW=1
LOOP
ENDDO TEST
LOOP
ENDDO ROLL
GO TOP
USE TEMP1
?? STR(UNIQUE,5,0)
?? ' genes, for a total of '
?? STR(TOT,5,0)
?? ' clones'
? '  V Coincidence'
list off fields number,RFEND,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I
* SET PRINT OFF
CLOSE DATABASES
ERASE TEMP1.DBF
USE TEMPDESIG

* COMPRESSION SUBROUTINE FOR ANALYSIS PROGRAMS
USE "SmartGuy:FoxBASE+/Mac:fox files:Clones.dbf"
COPY TO TEMP1 FOR
USE TEMP1
COUNT TO IDGENE FOR D='E'.OR.D='O'.OR.D='H'.OR.D='N'.OR.D='R'.OR.D='A'
DELETE FOR D='N'.OR.D='D'.OR.D='A'.OR.D='U'.OR.D='S'.OR.D='M'.OR.D='R'.OR.D='V'
PACK
COUNT TO TOT
REPLACE ALL RFEND WITH 1
MARK1 = 1
SW2=0
DO WHILE SW2=0 ROLL
IF MARK1 >= TOT
PACK
COUNT TO UNIQUE
SW2=1
LOOP
ENDIF
GO MARK1
DUP = 1
STORE ENTRY TO TESTA
SW = 0
DO WHILE SW=0 TEST
SKIP
STORE ENTRY TO TESTB
IF TESTA = TESTB
DELETE
DUP = DUP+1
LOOP
ENDIF
GO MARK1
REPLACE RFEND WITH DUP
MARK1 = MARK1+DUP
SW=1
LOOP
ENDDO TEST
LOOP
ENDDO ROLL
* SET PRINTER ON
SORT ON RFEND/D,NUMBER TO TEMP2
USE TEMP2
REPLACE ALL START WITH RFEND/IDGENE*10000
?? STR(UNIQUE,5,0)
?? ' genes, for a total of '
?? STR(TOT,5,0)
?? ' clones'
? 'Coincidence V V Clones/10000'
set heading off
?
SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Geneva",7 COLOR 0,0,0,
list fields number,RFEND,START,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,INIT,I
* SET PRINT OFF
CLOSE DATABASES
ERASE TEMP1.DBF
ERASE TEMP2.DBF
USE "SmartGuy:FoxBASE+/Mac:fox files:clones.dbf"

* COMPRESSION SUBROUTINE FOR ANALYSIS PROGRAMS
USE TEMP1
COUNT TO IDGENE FOR D='E'.OR.D='O'.OR.D='H'.OR.D='N'.OR.D='R'.OR.D='A'
DELETE FOR D='N'.OR.D='D'.OR.D='A'.OR.D='U'.OR.D='S'.OR.D='M'.OR.D='R'.OR.D='V'
PACK
COUNT TO TOT
REPLACE ALL RFEND WITH 1
MARK1 = 1
SW2=0
DO WHILE SW2=0 ROLL
IF MARK1 >= TOT
PACK
COUNT TO UNIQUE
SW2=1
LOOP
ENDIF
GO MARK1
DUP = 1
STORE ENTRY TO TESTA
SW = 0
DO WHILE SW=0 TEST
SKIP
STORE ENTRY TO TESTB
IF TESTA = TESTB
DELETE
DUP = DUP+1
LOOP
ENDIF
GO MARK1
REPLACE RFEND WITH DUP
MARK1 = MARK1+DUP
SW=1
LOOP
ENDDO TEST
LOOP
ENDDO ROLL
BROWSE
* SET PRINTER ON
SORT ON RFEND/D,NUMBER TO TEMP2
USE TEMP2
REPLACE ALL START WITH RFEND/IDGENE*10000
?? STR(UNIQUE,5,0)
?? ' genes, for a total of '
?? STR(TOT,5,0)
?? ' clones'
? 'Coincidence V V Clones/10000'
set heading off
SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Geneva",7 COLOR 0,0,0,
list fields number,RFEND,START,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,INIT,I
* SET PRINT OFF
CLOSE DATABASES
ERASE TEMP1.DBF
ERASE TEMP2.DBF
USE "SmartGuy:FoxBASE+/Mac:fox files:clones.dbf"

USE TEMP1
COUNT TO TOT
?? 'Total of '
?? STR(TOT,4,0)
?? ' clones'
?
* list off fields number,L,D,F,Z,R,C,ENTRY,DESCRIPTOR,LENGTH,RFEND,INIT,I
list off fields number,L,D,F,Z,R,C,ENTRY,DESCRIPTOR
CLOSE DATABASES
ERASE TEMP1.DBF
USE TEMPDESIG
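
The compression subroutines listed above all share one idea: clone records are sorted by ENTRY, runs of records with the same ENTRY are collapsed into a single record whose RFEND field holds the number of duplicates, and the survivors are sorted by descending RFEND. The abundance variant also stores RFEND/IDGENE*10000 in START. The following Python sketch restates that logic under stated assumptions; it is not part of the patent listing. The function names `compress` and `abundance_per_10000` and the dictionary keys are hypothetical stand-ins for the database fields.

```python
from collections import Counter


def compress(clones):
    """Collapse clones sharing an ENTRY into one row per gene.

    Each clone is a dict with at least "number" and "entry" keys
    (hypothetical stand-ins for the NUMBER and ENTRY fields).
    The returned rows carry an "rfend" key holding the duplicate count,
    mirroring REPLACE RFEND WITH DUP in the listing.
    """
    # The listing sorts on ENTRY, NUMBER before compressing, so the
    # lowest-numbered clone of each entry is the one that survives.
    ordered = sorted(clones, key=lambda c: (c["entry"], c["number"]))
    counts = Counter(c["entry"] for c in ordered)
    representative = {}
    for clone in ordered:
        representative.setdefault(clone["entry"], clone)
    rows = [dict(representative[e], rfend=n) for e, n in counts.items()]
    # SORT ON RFEND/D, NUMBER: most abundant first, ties by clone number.
    rows.sort(key=lambda r: (-r["rfend"], r["number"]))
    return rows


def abundance_per_10000(rows, idgene):
    """Add a "start" key equal to RFEND / IDGENE * 10000.

    idgene is the number of identified clones, counted by the caller.
    """
    return [dict(r, start=r["rfend"] / idgene * 10000) for r in rows]


if __name__ == "__main__":
    sample = [
        {"number": 1, "entry": "A"},
        {"number": 2, "entry": "B"},
        {"number": 3, "entry": "A"},
        {"number": 4, "entry": "A"},
    ]
    for row in abundance_per_10000(compress(sample), idgene=4):
        print(row["entry"], row["rfend"], row["start"])
```

A dictionary-based count replaces the record-by-record DELETE and PACK loop of the listing; the result is the same table of unique entries with their duplicate counts.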

* Lifeseq menu; version 8-7-94
SET TALK OFF
set device to screen
CLEAR
USE "SmartGuy:FoxBASE+/Mac:fox files:clones.dbf"
STORE LUPDATE() TO Update
GO BOTTOM
STORE RECNO() TO cloneno
STORE 6 TO Chooser
DO WHILE .T.
* Program.: Lifeseq menu.fmt
* Date....: 1/11/95
* Version.: FoxBASE+/Mac, revision 1.10
* Notes...: Format file Lifeseq menu
*
SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Geneva",268 COLOR 0,0,
@ PIXELS 18,126 TO 77,365 STYLE 28479 COLOR 32767,-25600,-1,-16223,-16721,-15725
@ PIXELS 110,29 TO 188,217 STYLE 3871 COLOR 0,0,-1,-25600,-1,-1
@ PIXELS 45,161 SAY "LIFESEQ" STYLE 65536 FONT "Geneva",536 COLOR 0,0,-1,-1,7135,5884
@ PIXELS 36,269 SAY "TM" STYLE 65536 FONT "Geneva",12 COLOR 0,0,-1,-1,7135,5884
@ PIXELS 63,143 SAY "Molecular Biology Desktop" STYLE 65536 FONT "Helvetica",18 COLOR 0,0,0,
@ PIXELS 90,252 TO 251,467 STYLE 28447 COLOR 0,0,-1,-25600,-1,-1
@ PIXELS 117,270 GET Chooser STYLE 65536 FONT "Chicago",12 PICTURE "@*RV Transcript profiles
@ PIXELS 135,128 SAY Update STYLE 0 FONT "Geneva",12 SIZE 15,79 COLOR 0,0,0,-25600,-1,-1
@ PIXELS 171,128 SAY cloneno STYLE 0 FONT "Geneva",12 SIZE 15,79 COLOR 0,0,0,-25600,-1,-1
@ PIXELS 135,44 SAY "Last update:" STYLE 65536 FONT "Geneva",12 COLOR 0,0,-1,-1,-1,-1
@ PIXELS 171,44 SAY "Total clones:" STYLE 65536 FONT "Geneva",12 COLOR 0,0,-1,-1,-1,-1
@ PIXELS 45,296 SAY "v1.30" STYLE 65536 FONT "Geneva",782 COLOR 0,0,-1,-1,-1,-1
*
* EOF: Lifeseq menu.fmt
READ
DO CASE
CASE Chooser=1
DO "SmartGuy:FoxBASE+/Mac:fox files:Output programs:Master analysis 3.prg"
files:Output programs:Subtraction 2.prg"
files:Output programs:Northern (single).prg"
files:Output programs:See individual clone.prg"
files:Libraries:Output programs:Menu.prg"
CLEAR
SCREEN 1 OFF
RETURN
ENDCASE
LOOP
ENDDO

@ 1,0 SAY "Database Subset Analysis" STYLE 65536 FONT "Geneva",274 COLOR 0,0,0,-1,-1,-1
?
?
?
? date()
?? '  '
?? TIME()
? 'Clone numbers '
?? STR(INITIATE,6,0)
?? ' through '
?? STR(TERMINATE,6,0)
? 'Libraries: '
IF ENTIRE=1
? 'All libraries'
ENDIF
IF ENTIRE=2
MARK=1
DO WHILE .T.
IF MARK > STOPIT
EXIT
ENDIF
USE SELECTED
GO MARK
? '  '
?? TRIM(libname)
STORE MARK+1 TO MARK
LOOP
ENDDO
ENDIF
? 'Designations: '
IF Ematch=0 .AND. Hmatch=0 .AND. Omatch=0
?? 'All'
ENDIF
IF Ematch=1
?? 'Exact,'
ENDIF
IF Hmatch=1
?? 'Human,'
ENDIF
IF Omatch=1
?? 'Other sp.'
ENDIF
IF CONDEN=1
? 'Condensed format analysis'
ENDIF
IF ANAL=1
? 'Sorted by NUMBER'
ENDIF
IF ANAL=2
? 'Sorted by ENTRY'
ENDIF
IF ANAL=3
? 'Arranged by ABUNDANCE'
ENDIF
IF ANAL=4
? 'Sorted by INTEREST'
ENDIF
IF ANAL=5
? 'Arranged by LOCATION'
ENDIF
IF ANAL=6
? 'Arranged by DISTRIBUTION'
ENDIF
IF ANAL=7
? 'Arranged by FUNCTION'
ENDIF
? 'Total clones represented: '
?? STR(STARTOT,6,0)
? 'Total clones analyzed: '
?? STR(ANALTOT,6,0)

USE TEMP1
COUNT TO TOT
?? 'Total of '
?? STR(TOT,4,0)
?? ' clones'
?
* list off fields number,L,D,F,Z,R,C,ENTRY,DESCRIPTOR,LENGTH,RFEND,INIT,I
list off fields number,L,D,F,Z,R,C,ENTRY,DESCRIPTOR
CLOSE DATABASES
ERASE TEMP1.DBF
USE TEMPDESIG

* Northern (single), version 11-25-94
close databases
SET TALK OFF
SET PRINT OFF
SET EXACT OFF
CLEAR
STORE ' ' TO Eobject
STORE ' ' TO Dobject
STORE 0 TO Numb
STORE 0 TO Zog
STORE 1 TO Bail
DO WHILE .T.
* Program.: Northern (single).fmt
* Date....: 3/8/94
* Version.: FoxBASE+/Mac, revision 1.10
* Notes...: Format file Northern (single)
*
SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Geneva",12 COLOR 0,0,0
@ PIXELS 15,81 TO 46,397 STYLE 28447 COLOR 0,0,-1,-25600,-1,-1
@ PIXELS 89,79 TO 192,422 STYLE 28447 COLOR 0,0,-1,-25600,-1,-1
@ PIXELS 115,98 SAY "Entry #:" STYLE 65536 FONT "Geneva",12 COLOR 0,0,0,-1,-1,-1
@ PIXELS 115,173 GET Eobject STYLE 0 FONT "Geneva",12 SIZE 15,142 COLOR 0,0,0,-1,-1,-1
@ PIXELS 145,89 SAY "Description" STYLE 65536 FONT "Geneva",12 COLOR 0,0,0,-1,-1,-1
@ PIXELS 145,173 GET Dobject STYLE 0 FONT "Geneva",12 SIZE 15,241 COLOR 0,0,0,-1,-1,-1
@ PIXELS 35,89 SAY "Single Northern search screen" STYLE 65536 FONT "Geneva",274 COLOR 0,0,-
@ PIXELS 220,162 GET Bail STYLE 65536 FONT "Chicago",12 PICTURE "@*R Continue;Bail out" SIZE
@ PIXELS 175,98 SAY "Clone #:" STYLE 65536 FONT "Geneva",12 COLOR 0,0,0,-1,-1,-1
@ PIXELS 175,173 GET Numb STYLE 0 FONT "Geneva",12 SIZE 15,70 COLOR 0,0,0,-1,-1,-1
@ PIXELS 80,152 SAY "Enter any ONE of the following:" STYLE 65536 FONT "Geneva",12 COLOR -1,
*
* EOF: Northern (single).fmt
READ
IF Bail=2
CLEAR
screen 1 off
files:Lookup.dbf"
IF Eobject<>' '
STORE UPPER(Eobject) to Eobject
SET SAFETY OFF
SORT ON Entry TO "Lookup entry.dbf"
SET SAFETY ON
USE "Lookup entry.dbf"
LOCATE FOR Look=Eobject
IF .NOT.FOUND()
CLEAR
LOOP
ENDIF
BROWSE
STORE Entry TO Searchval
CLOSE DATABASES
ERASE "Lookup entry.dbf"
ENDIF
IF Dobject<>' '
SET EXACT OFF
SET SAFETY OFF
SORT ON descriptor TO "Lookup descriptor.dbf"
SET SAFETY ON
USE "Lookup descriptor.dbf"
LOCATE FOR UPPER(TRIM(descriptor))=UPPER(TRIM(Dobject))
IF .NOT.FOUND()
CLEAR
LOOP
ENDIF
BROWSE
STORE Entry TO Searchval
CLOSE DATABASES
ERASE "Lookup descriptor.dbf"
SET EXACT ON
ENDIF
IF Numb<>0
USE "SmartGuy:FoxBASE+/Mac:Fox files:clones.dbf"
GO Numb
BROWSE
STORE Entry TO Searchval
ENDIF
CLEAR
? 'Northern analysis for entry '
?? Searchval
?
? 'Enter Y to proceed'
WAIT to OK
CLEAR
IF UPPER(OK)<>'Y'
screen 1 off
RETURN
ENDIF
is.dbf"
PACK
SW2=1
LOOP
ENDIF
GO MARK1
STORE library TO TESTA
SKIP
STORE library TO TESTB
IF TESTA = TESTB
DELETE
ENDIF
MARK1 = MARK1+1
LOOP
ENDDO ROLL
* Northern analysis
CLEAR
? 'Doing the northern now...'
SET TALK ON
USE "SmartGuy:FoxBASE+/Mac:Fox files:clones.dbf"
SET SAFETY OFF
COPY TO "Hits.dbf" FOR entry=searchval
SET SAFETY ON
CLOSE DATABASES
SELECT 1
USE "Compressed libraries.dbf"
STORE RECCOUNT() TO Entries
SELECT 2
USE "Hits.dbf"
Mark=1
DO WHILE .T.
SELECT 1
IF Mark>Entries
EXIT
ENDIF
GO MARK
STORE library TO Jigger
SELECT 2
COUNT TO Zog FOR library=Jigger
SELECT 1
REPLACE hits with Zog
Mark=Mark+1
LOOP
ENDDO
SELECT 1
BROWSE FIELDS LIBRARY,LIBNAME,ENTERED,HITS AT 0,0
CLEAR
? 'Enter Y to print:'
WAIT TO PRINSET
IF UPPER(PRINSET)='Y'
SET PRINT ON
CLEAR
EJECT
SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Geneva",14 COLOR 0,0,0
? 'DATABASE ENTRIES MATCHING ENTRY '
?? Searchval
? DATE()
?
SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Geneva",7 COLOR 0,0,0,
LIST OFF FIELDS library,libname,entered,hits
?
SELECT 2
LIST OFF FIELDS NUMBER,LIBRARY,D,S,F,Z,R,ENTRY,DESCRIPTOR,RFSTART,START,RFEND
SET TALK OFF
SET PRINT OFF
ENDIF
CLOSE DATABASES
SET TALK OFF
CLEAR
DO "Test print.prg"
RETURN

library    libname
ADENINB01  Inflamed adenoid
ADRENOR01  Adrenal gland (r)
ADRENOT01  Adrenal gland (T)
AMLBNOT01  AML blast cells (T)
BMARNOT01  Bone marrow
BMARNOT02  Bone marrow (T)
CARDNOT01  Cardiac muscle (T)
CHAONOT01  Chin. hamster ovary
CORNNOT01  Corneal stroma
FIBRAGT01  Fibroblast, AT 5
FIBRAGT02  Fibroblast, AT 30
FIBRANT01  Fibroblast AT
FIBRNGT01  Fibroblast, uv 5
FIBRNGT02  Fibroblast, uv 30
FIBRNOT01  Fibroblast
FIBRNOT02  Normal fibroblast
HMC1NOT01  Mast cell line HMC-1
HUVELPB01  HUVEC IFN,TNF,LPS
HUVENOB01  HUVEC control
HUVESTB01  HUVEC shear stress
HYPONOB01  Hypothalamus
KIDNNOT01  Kidney (T)
LIVRNOT01  Liver (T)
LUNGNOT01  Lung (T)
MUSCNOT01  Skeletal muscle (T)
OVIDNOB01  Oviduct
PANCNOT01  Pancreas, normal
PITUNOR01  Pituitary (r)
PITUNOT01  Pituitary (T)
PLACNOB01  Placenta
SINTNOT02  Small intestine (T)
SPLNFET01  Spleen/liver, fetal
SPLNNOT02  Spleen (T)
STOMNOT01  Stomach
SYNORAB01  Rheum. synovium
TBLYNOT01  T + B lymphoblast
TESTNOT01  Testis (T)
THP1NOB01  THP-1 control
THP1PEB01  THP phorbol
THP1PLB01  THP-1 phorbol LPS
U937NOT01  U937, monocytic leu

number  library    d  s  f  z  r  entry    descriptor              rfstart  start  rfend
2304    U937NOT01  E  H  C  C  T  HUMEF1B  Elongation factor 1-beta  0        0      773
3240    HMC1NOT01  E  H  C  C  T  HUMEF1B  Elongation factor 1-beta  0        370    773
3259    HMC1NOT01  E  H  C  C  T  HUMEF1B  Elongation factor 1-beta  0        371    773
«93     HMC1NOT01  E  H  C  C  T  HUMEF1B  Elongation factor 1-beta  0        470    773
3959    HMC1NOT01  E  H  C  C  T  HUMEF1B  Elongation factor 1-beta  0        327    773
9139    HMC1NOT01  E  H  C  C  T  HUMEF1B  Elongation factor 1-beta  0        375    773
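
The Northern (single) program above amounts to an electronic northern: it selects every clone whose ENTRY equals the search value and then counts those hits per library. The Python sketch below illustrates that counting step only. The function name `electronic_northern` and the dictionary keys are hypothetical, and the sample rows are a small illustrative subset of the output shown above, not additional data.

```python
from collections import Counter


def electronic_northern(clones, libraries, entry):
    """Count, per library, the clones whose entry matches the search value.

    clones    -- list of dicts with "number", "library" and "entry" keys
    libraries -- library codes to report, in the order they should be listed
    entry     -- database entry name to search for
    Returns a list of (library, hits) pairs, including libraries with 0 hits.
    """
    # Equivalent of COPY TO "Hits.dbf" FOR entry=searchval
    hits = [c for c in clones if c["entry"] == entry]
    # Equivalent of COUNT TO Zog FOR library=Jigger, done for every library
    per_library = Counter(c["library"] for c in hits)
    return [(lib, per_library.get(lib, 0)) for lib in libraries]


if __name__ == "__main__":
    sample_clones = [
        {"number": 2304, "library": "U937NOT01", "entry": "HUMEF1B"},
        {"number": 3240, "library": "HMC1NOT01", "entry": "HUMEF1B"},
        {"number": 3259, "library": "HMC1NOT01", "entry": "HUMEF1B"},
        {"number": 1, "library": "LUNGNOT01", "entry": "OTHER"},  # made-up non-matching clone
    ]
    libs = ["HMC1NOT01", "U937NOT01", "LUNGNOT01"]
    for lib, n in electronic_northern(sample_clones, libs, "HUMEF1B"):
        print(lib, n)
```

Reporting libraries with zero hits keeps the table comparable across searches, which matches the listing's behavior of writing a hit count into every record of the compressed library file.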
Claims (16)
1. A method for analyzing a specimen containing gene transcripts, the method comprising the steps of: (a) producing a library of biological sequences; (b) generating a set of transcript sequences, wherein each of the transcript sequences in the set is indicative of a different one of the biological sequences of the library; (c) processing the transcript sequences in a programmed computer in which a database of reference biological sequences is stored, to generate an identified sequence value for each of the transcript sequences, wherein each identified sequence value is indicative of a sequence annotation and a degree of match between one of the transcript sequences and at least one of the reference transcript sequences; and (d) processing each of the identified sequence values to generate final data values indicative of the number of times each identified sequence value is present in the library.
2. The method of claim 1, wherein step (a) includes the steps of: obtaining a mixture of mRNA; making cDNA copies of the mRNA; and isolating a representative population of clones transfected with the cDNA and producing from them the library of biological sequences.
3. The method of claim 1, wherein the biological sequences are cDNA sequences.
4. The method of claim 1, wherein the biological sequences are RNA sequences.
5. The method of claim 1, wherein the biological sequences are protein sequences.
6. The method of claim 1, wherein a first value of the degree of match is indicative of an exact match, and a second value of the degree of match is indicative of a non-exact match.
7. A method of comparing two specimens containing gene transcripts, said method comprising: (a) analyzing a first specimen according to the method of claim 1; (b) producing a second library of biological sequences; (c) generating a second set of transcript sequences, wherein each of the transcript sequences in the second set is indicative of one of the biological sequences of the second library; (d) processing the second set of transcript sequences in the programmed computer to generate a second set of identified sequence values, known as additional identified sequence values, wherein each of the additional identified sequence values is indicative of a sequence annotation and a degree of match between one of the biological sequences of the second library and at least one of the reference sequences; (e) processing each additional identified sequence value to generate additional final data values indicative of the number of times each identified sequence value is present in the second library; and (f) processing the final data values from the first specimen and the additional identified sequence values from the second specimen to generate ratios of transcript sequences, each of the ratio values being indicative of differences in the numbers of gene transcripts between the two specimens.
8. A method for quantifying the relative abundance of mRNA in a biological specimen, said method comprising the steps of: (a) isolating a population of mRNA transcripts from the biological specimen; (b) identifying the genes from which the mRNA was transcribed by a sequence-specific method; (c) determining numbers of mRNA transcripts corresponding to each of the genes; and (d) using the mRNA transcript numbers to determine the relative abundance of mRNA transcripts within the population of mRNA transcripts.
9. A diagnostic method comprising producing a gene transcript image, said method comprising the steps of: (a) isolating a population of mRNA transcripts from the biological specimen; (b) identifying the genes from which the mRNA was transcribed by a sequence-specific method; (c) determining numbers of mRNA transcripts corresponding to each of the genes; and (d) using the mRNA transcript numbers to determine the relative abundance of mRNA transcripts within the population of mRNA transcripts, wherein the data determining the relative abundance values of mRNA transcripts is the gene transcript image of the biological specimen.
10. The method of claim 9, further comprising: (e) providing a set of standard normal and diseased gene transcript images; and (f) comparing the gene transcript image of the biological specimen with the gene transcript images of step (e) to identify at least one of the standard gene transcript images that most closely approximates the gene transcript image of the biological specimen.
11. The method of claim 9, wherein the biological specimen is biopsy tissue, saliva, blood or urine.
12. A method for producing a gene transcript image, the method comprising the steps of: (a) obtaining a mixture of mRNA; (b) making cDNA copies of the mRNA; (c) inserting the cDNA into a suitable vector and using that vector to transfect cells of a suitable host strain, which are plated out and allowed to grow into clones, each clone representing a unique mRNA; (d) isolating a representative population of recombinant clones; (e) identifying amplified cDNAs from each clone in the population by a sequence-specific method that identifies the gene from which the unique mRNA was transcribed; (f) determining a number of times that each gene is represented within the population of clones as an indication of relative abundance; and (g) listing the genes and their relative abundances in order of abundance, thereby producing the gene transcript image.
13. The method of claim 12, further including the step of diagnosing disease by: repeating steps (a) through (g) on biological specimens from a random sample of normal and diseased humans, encompassing a variety of diseases, to produce reference sets of normal and diseased gene transcript images; obtaining a test specimen from a human, and producing a test gene transcript image by performing steps (a) through (g) on that test specimen; comparing the test gene transcript image with the reference sets of gene transcript images; and identifying at least one of the reference gene transcript images that closely approximates the test gene transcript image.
14. A computer system for analyzing a library of biological sequences, the system including: an element for receiving a set of transcript sequences, wherein each of the transcript sequences is indicative of a different one of the biological sequences of the library; and an element for processing the transcript sequences in the computer system, in which a database of reference transcript sequences is stored, wherein the computer is programmed with software to generate an identified sequence value for each of the transcript sequences, wherein each identified sequence value is present in the library.
15. The system of claim 14, further including: an element for generating a library, to produce the library of biological sequences and generate the set of transcript sequences from that library.
16. The system of claim 15, wherein the element for generating the library includes: an element for obtaining a mixture of mRNA; an element for making cDNA copies of the mRNA; an element for inserting the cDNA copies into cells and allowing the cells to grow into clones; and an element for isolating a representative population of the clones and producing from them the library of biological sequences.
SUMMARY
A method and system for quantifying the relative abundance of gene transcripts in a biological specimen. One embodiment of the method generates high-throughput sequence-specific analysis of multiple RNAs or their corresponding cDNAs (gene transcript image analysis). Another embodiment of the method produces a gene transcript image analysis by using high-throughput cDNA sequence analysis. In addition, gene transcript imaging can be used to detect or diagnose a particular biological state, disease, or condition that correlates with the relative abundance of gene transcripts in a given cell or population of cells. The invention provides a method for comparing the gene transcript image analyses of two or more different biological specimens in order to distinguish between the specimens and identify one or more genes that are differentially expressed between them.
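The counting and comparison steps described above can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes each sequenced clone has already been reduced to a single gene annotation string, and the function names (`transcript_image`, `abundance_ratios`), the gene labels, and the choice to report an infinite ratio for a gene absent from the second library are all hypothetical choices made for illustration.

```python
import math
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def transcript_image(annotations: Iterable[str]) -> List[Tuple[str, int, float]]:
    """Return (gene, clone count, percent of library), most abundant first."""
    counts = Counter(annotations)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library contains no identified sequences")
    # Sort by descending count, then by name so ties are ordered deterministically.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(gene, n, 100.0 * n / total) for gene, n in ranked]


def abundance_ratios(first: Iterable[str], second: Iterable[str]) -> Dict[str, float]:
    """Ratio of relative abundance (first library / second library) per gene."""
    a, b = Counter(first), Counter(second)
    total_a, total_b = sum(a.values()), sum(b.values())
    if total_a == 0 or total_b == 0:
        raise ValueError("both libraries must contain identified sequences")
    ratios: Dict[str, float] = {}
    for gene in set(a) | set(b):
        if b[gene] == 0:
            # Assumption of this sketch: a gene missing from the second
            # library is reported as an infinite ratio.
            ratios[gene] = math.inf
        else:
            ratios[gene] = (a[gene] / total_a) / (b[gene] / total_b)
    return ratios


# Hypothetical annotations, one per sequenced clone.
normal = ["actin", "actin", "ferritin", "actin"]
diseased = ["actin", "ferritin", "ferritin", "ferritin"]

for gene, count, percent in transcript_image(normal):
    print(f"{gene}\t{count}\t{percent:.1f}%")

print(abundance_ratios(normal, diseased))
```

Ratios are computed on relative abundance rather than on raw counts so that libraries of different sizes can be compared; a ratio far from 1 flags a gene as a candidate for differential expression between the two specimens.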
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US08187530 | 1994-01-27 | ||
US08282955 | 1994-07-29 |
Publications (1)
Publication Number | Publication Date |
---|---|
MXPA96003077A (en) | 2000-01-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US6114114A (en) | Comparative gene transcript analysis | |
US5840484A (en) | Comparative gene transcript analysis | |
WO1995020681A9 (en) | Comparative gene transcript analysis | |
Zhang et al. | The functional landscape of mouse gene expression | |
US6309834B1 (en) | Method for simultaneous identification of differentially expressed mRNAs and measurement of relative concentrations | |
EP2556185B1 (en) | Gene-expression profiling with reduced numbers of transcript measurements | |
US10619195B2 (en) | Gene-expression profiling with reduced numbers of transcript measurements | |
MXPA00012758A (en) | METHOD FOR SIMULTANEOUS IDENTIFICATION OF DIFFERENTIALLY EXPRESSED mRNAs AND MEASUREMENT OF RELATIVE CONCENTRATIONS. | |
Lee et al. | Application of transcriptional and biological network analyses in mouse germ-cell transcriptomes | |
MXPA96003077A (en) | Comparative analysis of the transcription of the | |
US20040023231A1 (en) | System for identifying and analyzing expression of are-containing genes | |
JP2002528095A (en) | Methods for improving the detection and classification of gene expression patterns using co-regulated gene sets | |
EP4376022A1 (en) | Clinical therapeutic drug prediction and recommendation system and method for evaluating efficacy of second-generation hormone drug in treatment of prostate cancer | |
US20240136013A1 (en) | Quantification of rna mutation expression | |
CN109517825B (en) | FOXC1 gene mutant and application thereof | |
CN114921560A (en) | Noninvasive biomarkers for hepatic fibrosis and liver cancer | |
CN114591980A (en) | CARS gene mutant and application thereof | |
CN114921559A (en) | Application of GPNMB gene as noninvasive biomarker in preparation of products for diagnosing hepatic fibrosis and liver cancer diseases | |
CN115961021A (en) | Methods for detecting, diagnosing and treating COVID-19 | |
Ma | Genome-wide Analysis of Human Peripheral Leukocyte Gene Expression | |
Esfahani et al. | Identification of potential diagnostic microRNAs and lncRNAs in breast carcinoma: Integrated high-throughput bioinformatics investigation | |
Matsubara et al. | Recent progress in human molecular biology and expression profiling of active genes in the body | |
Gersten | Systems analysis of model organisms in the study of human disease phenotypes | |
US20020152196A1 (en) | cDNA database and biochip for analysis of hematopoietic tissue | |
HUT75550A (en) | Comparative gene transcript analysis |