NZ700647B2

NZ700647B2 - Interrogatory cell-based assays and uses thereof

Info

Publication number: NZ700647B2
Application number: NZ700647A
Authority: NZ
Inventors: Min Du; Niven Rajin Narain; Rangaprasad Sarangarajan; Vivek K Vishnudas; Tony Walshe
Original assignee: Berg Llc
Priority date: 2012-04-02
Filing date: 2012-09-07
Publication date: 2016-11-29

Abstract

The methods and systems described herein employ data-driven techniques to build Bayesian networks of causal relationships for biological networks, thereby identifying modulators of a biological system or process (e.g., a disease condition, such as cancer).

Description

INTERROGATORY CELL-BASED ASSAYS AND USES THEREOF

Related Applications

This application claims priority to Provisional Patent Application Serial No. 61/619326, filed on April 2, 2012; Provisional Patent Application Serial No. 61/668617, filed on July 6, 2012; Provisional Patent Application Serial No. 61/620305, filed on April 4, 2012; Provisional Patent Application Serial No. 61/665631, filed on June 28, 2012; Provisional Patent Application Serial No. 596, filed on August 1, 2012; and Provisional Patent Application Serial No. 590, filed on August 1, 2012, the entire contents of each of which are expressly incorporated herein by reference.

Background of the Invention

New drug development has been enhanced greatly since the elucidation of the structure of DNA in 1953 by James Watson and Francis Crick, pioneers of what we refer to today as molecular biology. The tools and products of molecular biology allow for rapid, detailed, and precise measurement of gene regulation at both the DNA and RNA level. The next three decades following the paradigm-shifting discovery would see the genesis of knock-out animal models, key enzyme-linked reactions, and novel understanding of disease mechanisms and pathophysiology from the aforementioned platforms. In spring 2000, when Craig Venter and Francis Collins announced the initial sequencing of the human genome, the scientific world entered a new wave of medicine.

The mapping of the genome immediately inspired hopes of, for example, being able to control disease even before it was initiated, of using gene therapy to reverse the degenerative brain processes that cause Alzheimer’s or Parkinson’s Disease, and of a construct that could be introduced to a tumor site and cause eradication of disease while restoring the normal tissue architecture and physiology. Others took controversial twists and proposed the notion of creating desired offspring with respect to eye or hair color, etc. Ten years later, however, we are still waiting with no particular path in sight for sustained success of gene therapy, or even elementary control of the genetic process.

Thus, one apparent reality is that genetics, at least independent of supporting constructs, does not drive the end-point of physiology. Indeed, many processes such as post-transcriptional modifications, mutations, single-nucleotide polymorphisms (SNPs), and translational modifications could alter the providence of a gene and/or its encoded complementary protein, and thereby contribute to the disease process.

Summary of the Invention

The information age and creation of the internet have allowed for an information overload, while also facilitating international collaboration and critique. Ironically, the aforementioned realities may also be the cause of the scientific community overlooking a few simple points, including that communication of signal cascades and cross-talk within and between cells and/or tissues allows for homeostasis and messaging for corrective mechanisms to occur when something goes awry.

A case in point relates to cardiovascular disease (CVD), which remains the leading cause of death in the United States and much of the developed world, accounting for 1 of every 2.8 deaths in the U.S. alone. In addition, CVD serves as an underlying pathology that contributes to associated complications such as Chronic Kidney Disease (~19 million US cases) and chronic fatigue syndrome, and is a key factor in metabolic syndrome. Significant advances in technology related to diagnostics, minimally invasive surgical techniques, drug eluting stents and effective clinical surveillance have contributed to an unparalleled period of growth in the field of interventional cardiology, and have allowed for more effective management of CVD. However, disease etiology related to CVD and associated co-morbidities such as diabetes and peripheral vascular disease is yet to be fully elucidated.

New approaches to explore the mechanisms and pathways involved in a biological process, such as the etiology of disease conditions (e.g., CVD), and to identify key regulatory pathways and/or target molecules (e.g., “druggable targets”) and/or markers for better disease diagnosis, management, and/or treatment, are still lacking.

The invention described herein is based, at least in part, on a novel, collaborative utilization of network biology, genomic, proteomic, metabolomic, transcriptomic, and bioinformatics tools and methodologies, which, when combined, may be used to study any biological system of interest, such as selected disease conditions including cancer, diabetes, obesity, cardiovascular disease, and angiogenesis, using a systems biology approach. In a first step, cellular modeling systems are developed to probe various biological systems, such as a disease process, comprising disease-related cells subjected to various disease-relevant environment stimuli (e.g., hyperglycemia, hypoxia, immuno-stress, and lipid peroxidation, cell density, angiogenic agonists and antagonists). In some embodiments, the cellular modeling system involves cellular cross-talk mechanisms between various interacting cell types (such as aortic smooth muscle cells (HASMC), proximal tubule kidney cells (HK-2), aortic endothelial cells (HAEC), and dermal fibroblasts (HDFa)). High throughput biological readouts from the cell model system are obtained by using a combination of techniques, including, for example, cutting edge mass spectrometry (LC/MSMS), flow cytometry, cell-based assays, and functional assays. The high throughput biological readouts are then subjected to a bioinformatic analysis to study congruent data trends by in vitro, in vivo, and in silico modeling. The resulting matrices allow for cross-related data mining where linear and non-linear regression analyses were developed to reach conclusive pressure points (or “hubs”). These “hubs,” as presented herein, are candidates for drug discovery. In particular, these hubs represent potential drug targets and/or disease markers.
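
By way of non-limiting illustration, the notion of a “hub” can be sketched in a few lines of Python. The sketch below assumes, purely for illustration, that relationships have already been reduced to a list of directed edges and that a hub is any node whose connection count meets a chosen threshold. The function name `find_hubs`, the `min_degree` parameter, and the node labels are hypothetical and do not appear in the specification; the regression analysis described above is not reproduced here.

```python
from collections import Counter

def find_hubs(edges, min_degree=3):
    """Return nodes whose total degree (incoming + outgoing) is >= min_degree.

    `edges` is an iterable of (source, target) pairs. The degree threshold is
    an illustrative assumption, not a value taken from the specification.
    """
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    # Sort by descending degree, then alphabetically for a stable order.
    return sorted(
        (node for node, d in degree.items() if d >= min_degree),
        key=lambda node: (-degree[node], node),
    )

# Hypothetical edge list; the labels are placeholders, not real findings.
example_edges = [
    ("protein_A", "protein_B"),
    ("protein_A", "protein_C"),
    ("protein_A", "atp_level"),
    ("protein_B", "protein_C"),
    ("protein_E", "protein_A"),
]
hubs = find_hubs(example_edges)
```

In this toy input only `protein_A` has four connections, so it is the single node reported as a hub at the default threshold.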

The molecular signatures of the differentials allow for insight into the mechanisms that dictate the alterations in the tissue microenvironment that lead to disease onset and progression. Taken together, the combination of the aforementioned technology platforms with strategic cellular modeling allows for robust intelligence that can be employed to further establish disease understanding while creating biomarker libraries and drug candidates that may clinically augment standard of care.

Moreover, this approach is not only useful for disease diagnosis or intervention, but also has general applicability to virtually all pathological or non-pathological conditions in biological systems, such as biological systems where two or more cell systems interact. For example, this approach is useful for obtaining insight into the mechanisms associated with or causal for drug toxicity. The invention therefore provides a framework for an interrogative biological assessment that can be generally applied in a broad spectrum of settings.

A significant feature of the platform of the invention is that the AI-based system is based on the data sets obtained from the cell model system, without resorting to or taking into consideration any existing knowledge in the art, such as known biological relationships (i.e., no data points are artificial), concerning the biological process.

Accordingly, the resulting statistical models generated from the platform are unbiased.

Another significant feature of the platform of the invention and its components, e.g., the cell model systems and data sets obtained therefrom, is that it allows for continual building on the cell models over time (e.g., by the introduction of new cells and/or conditions), such that an initial, “first generation” consensus causal relationship network generated from a cell model for a biological system or process can evolve along with the evolution of the cell model itself to a multiple generation causal relationship network (and delta or delta-delta networks obtained therefrom). In this way, the cell models, the data sets from the cell models, and the causal relationship networks generated from the cell models by using the Platform Technology methods can constantly evolve and build upon previous knowledge obtained from the Platform Technology.

The invention provides methods for identifying a modulator of a biological system, the methods comprising: establishing a model for the biological system, using cells associated with the biological system, to represent a characteristic aspect of the biological system; obtaining a first data set from the model, wherein the first data set represents global proteomic changes in the cells associated with the biological system; obtaining a second data set from the model, wherein the second data set represents one or more functional activities or cellular responses of the cells associated with the biological system, wherein said one or more functional activities or cellular responses of the cells comprises global enzymatic activity and/or an effect of the global enzyme activity on the enzyme metabolites or substrates in the cells associated with the biological system; generating a consensus causal relationship network among the global proteomic changes and the one or more functional activities or cellular responses based solely on the first and second data sets using a programmed computing device, wherein the generation of the consensus causal relationship network is not based on any known biological relationships other than the first and second data sets; and identifying, from the consensus causal relationship network, a causal relationship unique in the biological system, wherein at least one enzyme associated with the unique causal relationship is identified as a modulator of the biological system.
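
As a hedged, non-limiting sketch of the generating step, a consensus network may be pictured as the set of directed edges that recur across an ensemble of candidate networks, each learned from the first and second data sets alone. This passage does not state how consensus is formed; the frequency rule, the function name `consensus_network`, the `min_fraction` parameter, and all node labels below are assumptions introduced for illustration. The network-learning step itself is not shown; the ensemble is taken as given.

```python
from collections import Counter

def consensus_network(ensemble, min_fraction=0.5):
    """Keep directed edges present in at least `min_fraction` of the networks.

    `ensemble` is a list of networks, each an iterable of (source, target)
    edges. Returns a dict mapping each retained edge to the fraction of
    networks that contain it. The frequency rule is an assumption.
    """
    if not ensemble:
        return {}
    # set(network) guards against an edge being listed twice in one network.
    counts = Counter(edge for network in ensemble for edge in set(network))
    total = len(ensemble)
    return {
        edge: count / total
        for edge, count in counts.items()
        if count / total >= min_fraction
    }

# Hypothetical ensemble of four candidate networks (placeholder labels).
example_ensemble = [
    {("kinase_K", "phospho_P"), ("protein_A", "atp_level")},
    {("kinase_K", "phospho_P")},
    {("kinase_K", "phospho_P"), ("protein_B", "atp_level")},
    {("kinase_K", "phospho_P"), ("protein_A", "atp_level")},
]
consensus = consensus_network(example_ensemble)
```

Note that the threshold here acts on edges of already-learned networks; it is not a cut-off applied to the input data sets.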

The invention also provides a method for identifying a modulator of a biological system, the method comprising: establishing a model for the biological system using cells associated with the biological system to represent a characteristic aspect of the biological system; obtaining a first data set from the model, wherein the first data set represents global proteomic changes in the cells associated with the biological system; obtaining a second data set from the model, wherein the second data set represents one or more functional activities or cellular responses of the cells associated with the biological system, wherein said one or more functional activities or cellular responses of the cells comprises global enzymatic activity and/or an effect of the global enzyme activity on the enzyme metabolites or substrates in the cells associated with the biological system; generating a causal relationship network among the global proteomic changes and the one or more functional activities or cellular responses based solely on the first and second data sets using a programmed computing device, wherein the causal relationship network is a Bayesian network of causal relationships including quantitative probabilistic directional information regarding relationships among the global proteomic changes and the one or more functional activities or cellular responses; and identifying, from the causal relationship network, a causal relationship unique in the biological system, wherein at least one enzyme associated with the unique causal relationship is identified as a modulator of the biological system.
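
To make “quantitative probabilistic directional information” concrete, the following minimal sketch treats a single directed edge of a Bayesian network with two binary nodes, a parent (for example, a kinase activity being high) and a child (for example, a functional readout being altered). All probabilities, names, and the two-state simplification are illustrative assumptions and are not values from the specification.

```python
def marginal_child(p_parent, p_child_given_parent):
    """P(child=True), summing over both states of the binary parent.

    `p_child_given_parent` maps the parent state (True/False) to
    P(child=True | parent state).
    """
    return (
        p_parent * p_child_given_parent[True]
        + (1.0 - p_parent) * p_child_given_parent[False]
    )

def posterior_parent(p_parent, p_child_given_parent):
    """P(parent=True | child=True) by Bayes' rule."""
    evidence = marginal_child(p_parent, p_child_given_parent)
    return p_parent * p_child_given_parent[True] / evidence

# Hypothetical numbers for one edge: parent -> child.
p_kinase_high = 0.3
p_readout_given_kinase = {True: 0.8, False: 0.1}

p_readout = marginal_child(p_kinase_high, p_readout_given_kinase)
p_kinase_given_readout = posterior_parent(p_kinase_high, p_readout_given_kinase)
```

With these assumed numbers, the child is altered with probability 0.31, and observing the altered child raises the probability that the parent is high from 0.3 to 0.24/0.31.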

In certain embodiments, the first data set is a single proteomic data set. In certain embodiments, the second data set represents a single functional activity or cellular response of the cells associated with the biological system. In certain embodiments, the first data set further represents lipidomic data characterizing the cells associated with the biological system. In certain embodiments, the consensus causal relationship network is generated among the global proteomic changes, lipidomic data, and the one or more functional activities or cellular responses of the cells, wherein said one or more functional activities or cellular responses of the cells comprises global enzymatic activity. In certain embodiments, the first data set further represents one or more of lipidomic, metabolomic, transcriptomic, genomic and SNP data characterizing the cells associated with the biological system. In certain embodiments, the first data set further represents two or more of lipidomic, metabolomic, transcriptomic, genomic and SNP data characterizing the cells associated with the biological system. In certain embodiments, the consensus causal relationship network is generated among the global proteomic changes, the one or more of lipidomic, metabolomic, transcriptomic, genomic, and SNP data, and the one or more functional activities or cellular responses of the cells, wherein said one or more functional activities or cellular responses of the cells comprises global enzymatic activity and/or the effect of the global enzymatic activity on at least one enzyme metabolite or substrate.

In certain embodiments, the global enzyme activity comprises global kinase activity. In certain embodiments, the effect of the global enzyme activity on the enzyme metabolites or substrates comprises the phospho proteome of the cells.

In certain embodiments, the second data set representing one or more functional activities or cellular responses of the cell further comprises one or more of bioenergetics, cell proliferation, apoptosis, organellar function, cell migration, tube formation, chemotaxis, extracellular matrix degradation, sprouting, and a genotype-phenotype associate actualized by functional models selected from ATP, ROS, OXPHOS, and Seahorse assays. In certain embodiments, the consensus causal relationship network is generated among the global proteomic changes, the one or more of lipidomic, metabolomic, transcriptomic, genomic, and SNP data, and the one or more functional activities or cellular responses of the cells, wherein said one or more functional activities or cellular responses of the cells comprises global enzymatic activity and/or the effect of the global enzymatic activity on at least one enzyme metabolite or substrate and further comprises one or more of bioenergetics, cell proliferation, apoptosis, organellar function, cell migration, tube formation, chemotaxis, extracellular matrix degradation, sprouting, and a genotype-phenotype associate actualized by functional models selected from ATP, ROS, OXPHOS, and Seahorse assays.

In certain embodiments of the invention, the model of the biological system comprises an in vitro culture of cells associated with the biological system. In certain embodiments of the invention, the model of the biological system optionally further comprises a matching in vitro culture of control cells.

In certain embodiments of the invention, in the model of the biological system, the in vitro culture of the cells is subject to an environmental perturbation, and the in vitro culture of the matching control cells is identical cells not subject to the environmental perturbation. In certain embodiments, in the model of the biological system, the environmental perturbation comprises one or more of contact with a bioactive agent, a change in culture condition, introduction of a genetic modification / mutation, and introduction of a vehicle that causes a genetic modification / mutation. In certain embodiments, in the model of the biological system, the environmental perturbation comprises contacting the cells with an enzymatic activity inhibitor. In certain embodiments, in the model of the biological system, the enzymatic activity inhibitor is a kinase inhibitor. In certain embodiments, the environmental perturbation comprises contacting the cells with CoQ10. In certain embodiments, the environmental perturbation further comprises contacting the cells with CoQ10.

In certain embodiments of the invention, the generating step is carried out by an artificial intelligence (AI)-based informatics platform. In certain embodiments, the AI-based informatics platform receives all data input from the first and second data sets without applying a statistical cut-off point. In certain embodiments of the invention, the consensus causal relationship network established in the generating step is further refined to a simulation causal relationship network, before the identifying step, by in silico simulation based on input data, to provide a confidence level of prediction for one or more causal relationships within the consensus causal relationship network.
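
One way to picture the in silico refinement, offered only as a sketch under stated assumptions, is to apply the same simulated change to a parent node in every member of an ensemble of fitted models and to read the spread of the predicted child responses as an indication of confidence. The linear edge model, the function name `simulate_perturbation`, and the example weights are hypothetical; the specification does not state in this passage how the confidence level is computed.

```python
from statistics import mean, stdev

def simulate_perturbation(edge_weights, delta):
    """Predict the child's change for a parent change of `delta`.

    `edge_weights` holds one assumed linear weight per ensemble member for
    the same parent -> child edge (at least two members are required).
    Returns (mean prediction, sample standard deviation); a smaller spread
    is read here as higher confidence in the predicted relationship.
    """
    predictions = [weight * delta for weight in edge_weights]
    return mean(predictions), stdev(predictions)

# Hypothetical weights for one edge across four ensemble members.
example_weights = [0.9, 1.1, 1.0, 1.0]
predicted_change, spread = simulate_perturbation(example_weights, delta=2.0)
```

A tight spread across ensemble members suggests that the simulated relationship is stable with respect to the input data, whereas a wide spread suggests it is poorly supported.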

In certain embodiments of the invention, the unique causal relationship is identified as part of a differential causal relationship network that is uniquely present in cells associated with the biological system, and absent in the matching control cells.
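
The idea of a differential network can be sketched as a set difference over directed edges: relationships found in the network for the cells of interest but not in the network for the matching control cells. The function name `differential_network` and the node labels are hypothetical, and treating each network as a plain set of edges is a simplifying assumption.

```python
def differential_network(system_edges, control_edges):
    """Return edges present in the system network and absent from the control.

    Both arguments are iterables of (source, target) pairs. Direction
    matters: ("a", "b") and ("b", "a") are treated as different edges.
    """
    return set(system_edges) - set(control_edges)

# Hypothetical consensus networks (placeholder labels only).
system_network = {
    ("kinase_K", "phospho_P"),
    ("protein_A", "atp_level"),
    ("lipid_L", "protein_A"),
}
control_network = {
    ("protein_A", "atp_level"),
    ("protein_C", "atp_level"),
}
unique_edges = differential_network(system_network, control_network)
```

Here the shared edge is discarded and the two edges found only in the system network remain as candidate unique causal relationships.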

In certain embodiments of the invention, the unique causal relationship identified is a relationship between at least one pair selected from the group consisting of expression of a gene and level of a lipid; expression of a gene and level of a transcript; expression of a gene and level of a metabolite; expression of a first gene and expression of a second gene; expression of a gene and presence of a SNP; expression of a gene and a functional activity; level of a lipid and level of a transcript; level of a lipid and level of a metabolite; level of a first lipid and level of a second lipid; level of a lipid and presence of a SNP; level of a lipid and a functional activity; level of a first transcript and level of a second transcript; level of a transcript and level of a metabolite; level of a transcript and presence of a SNP; level of a first transcript and level of a functional activity; level of a first metabolite and level of a second metabolite; level of a metabolite and presence of a SNP; level of a metabolite and a functional activity; presence of a first SNP and presence of a second SNP; and presence of a SNP and a functional activity. In certain embodiments, the unique causal relationship identified is a relationship between at least a level of a lipid, expression of a gene, and one or more functional activities wherein the functional activity is a kinase activity.

The invention provides methods for identifying a modulator of a disease process, the method comprising: establishing a model for the disease process, using disease related cells, to represent a characteristic aspect of the disease process; obtaining a first data set from the model, wherein the first data set represents global proteomic changes in the disease related cells; obtaining a second data set from the model, wherein the second data set represents one or more functional activities or cellular responses of the cells associated with the biological system, wherein said one or more functional activities or cellular responses of the cells comprises global enzyme activity and/or an effect of the global enzyme activity on the enzyme metabolites or substrates in the disease related cells; generating a consensus causal relationship network among the global proteomic changes and the one or more functional activities or cellular responses of the cells based solely on the first and second data sets using a programmed computing device, wherein the generation of the consensus causal relationship network is not based on any known biological relationships other than the first and second data sets; and identifying, from the consensus causal relationship network, a causal relationship unique in the disease process, wherein at least one enzyme associated with the unique causal relationship is identified as a modulator of the disease process.

In certain embodiments, the first data set is a single proteomic data set. In certain embodiments, the second data set represents a single functional activity or cellular response of the cells associated with the biological system. In certain embodiments, the first data set further represents lipidomic data characterizing the cells associated with the biological system. In certain embodiments, the consensus causal relationship network is generated among the global proteomic changes, lipidomic data, and the one or more functional activities or cellular responses of the cells, wherein said one or more functional activities or cellular responses of the cells comprises global enzymatic activity. In certain embodiments, the first data set further represents one or more of lipidomic, metabolomic, transcriptomic, genomic and SNP data characterizing the cells associated with the biological system. In certain embodiments, the first data set further represents two or more of lipidomic, metabolomic, transcriptomic, genomic and SNP data characterizing the cells associated with the biological system. In certain embodiments, the consensus causal relationship network is generated among the global proteomic changes, the one or more of lipidomic, metabolomic, transcriptomic, genomic and SNP data, and the one or more functional activities or cellular responses of the cells, wherein said one or more functional activities or cellular responses of the cells comprises global enzymatic activity and/or the effect of the global enzymatic activity on at least one enzyme metabolite or substrate.

In certain embodiments of the invention, the global enzyme activity comprises global kinase activity, and wherein the effect of the global enzyme activity on the enzyme metabolites or substrates comprises the phospho proteome of the cells. In certain embodiments, the second data set representing one or more functional activities or cellular responses of the cell further comprises one or more of bioenergetics, cell proliferation, apoptosis, organellar function, cell migration, tube formation, chemotaxis, extracellular matrix degradation, sprouting, and a genotype-phenotype associate actualized by functional models selected from ATP, ROS, OXPHOS, and Seahorse assays. In certain embodiments, the consensus causal relationship network is generated among the global proteomic changes, the one or more of lipidomic, metabolomic, transcriptomic, genomic and SNP data, and the one or more functional activities or cellular responses of the cells, wherein said one or more functional activities or cellular responses of the cells comprises one or more of bioenergetics, cell proliferation, apoptosis, organellar function, cell migration, tube formation, chemotaxis, extracellular matrix degradation, sprouting, and a genotype-phenotype associate actualized by functional models selected from ATP, ROS, OXPHOS, and Seahorse assays.

In certain embodiments of the invention, the disease process is cancer, diabetes, obesity, cardiovascular disease, age related macular degeneration, diabetic retinopathy, or inflammatory disease. In certain embodiments, the disease process comprises angiogenesis. In certain embodiments, the disease process comprises hepatocellular carcinoma, lung cancer, breast cancer, prostate cancer, melanoma, carcinoma, sarcoma, lymphoma, leukemia, squamous cell carcinoma, colorectal cancer, pancreatic cancer, thyroid cancer, endometrial cancer, bladder cancer, kidney cancer, a solid tumor, leukemia, non-Hodgkin lymphoma, or a drug-resistant cancer.

In certain embodiments of the invention, the disease model comprises an in vitro culture of disease cells, optionally further comprising a matching in vitro culture of control or normal cells. In certain embodiments, the in vitro culture of the disease cells is subject to an environmental perturbation, and the in vitro culture of the matching control cells is identical disease cells not subject to the environmental perturbation. In certain embodiments, the environmental perturbation comprises one or more of contact with a bioactive agent, a change in culture condition, introduction of a genetic modification / mutation, and introduction of a vehicle that causes a genetic modification / mutation. In certain embodiments, the environmental perturbation comprises contacting the cells with an enzymatic activity inhibitor. In certain embodiments, the enzymatic activity inhibitor is a kinase inhibitor. In certain embodiments, the environmental perturbation further comprises contacting the cells with CoQ10. In certain embodiments, the environmental perturbation comprises contacting the cells with CoQ10.

In certain embodiments, the characteristic aspect of the disease process comprises a hypoxia condition, a hyperglycemic condition, a lactic acid rich culture condition, or combinations thereof. In certain embodiments, the generating step is carried out by an artificial intelligence (AI)-based informatics platform. In certain embodiments, the AI-based informatics platform receives all data input from the first and second data sets without applying a statistical cut-off point.

In certain embodiments, the consensus causal relationship network established in the generating step is further refined to a simulation causal relationship network, before the identifying step, by in silico simulation based on input data, to provide a confidence level of prediction for one or more causal relationships within the consensus causal relationship network. In certain embodiments, the unique causal relationship is identified as part of a differential causal relationship network that is uniquely present in the model of disease cells, and absent in the matching control cells. In certain embodiments, the unique causal relationship is identified as part of a differential causal relationship network that is uniquely present in cells subject to environmental perturbation, and absent in the matching control cells.

The invention provides methods for identifying modulators of a biological system, the methods comprising: establishing a model for the biological system, using cells associated with the biological system, to represent a characteristic aspect of the biological system; obtaining a first data set from the model, wherein the first data set represents global proteomic changes in the cells and one or more of lipidomic, metabolomic, transcriptomic, genomic, and SNP data characterizing the cells associated with the biological system; obtaining a second data set from the model, wherein the second data set represents one or more functional activities or cellular responses of the cells associated with the biological system, wherein said one or more functional activities or cellular responses of the cells comprises global kinase activity and an effect of the global kinase activity on the kinase metabolites or substrates in the cells associated with the biological system; generating a consensus causal relationship network among the global proteomic changes, the one or more of lipidomic, metabolomic, transcriptomic, genomic, and SNP data, and the one or more functional activities or cellular responses based solely on the first and second data sets using a programmed computing device, wherein the generation of the consensus causal relationship network is not based on any known biological relationships other than the first and second data sets; and identifying, from the consensus causal relationship network, a causal relationship unique in the biological system, wherein at least one kinase associated with the unique causal relationship is identified as a modulator of the biological system.

The invention provides methods for treating, alleviating a symptom of, inhibiting progression of, preventing, diagnosing, or prognosing a disease in a mammalian subject, the methods comprising: administering to the mammal in need thereof a therapeutically effective amount of a pharmaceutical composition comprising a biologically active substance that affects the modulator identified by any of the methods provided herein, thereby treating, alleviating a symptom of, inhibiting progression of, preventing, diagnosing, or prognosing the disease.

The invention provides methods of diagnosing or prognosing a disease in a mammalian subject, the method comprising: determining an expression or activity level, in a biological sample obtained from the subject, of one or more modulators identified by any of the methods provided herein; comparing the level in the subject with the level of expression or activity of the one or more modulators in a control sample, wherein a difference between the level in the subject and the level of expression or activity of the one or more modulators in the control sample is an indication that the subject is afflicted with a disease, or predisposed to developing a disease, or responding favorably to a therapy for a disease, thereby diagnosing or prognosing the disease in the mammalian subject.
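
The comparing step can be illustrated, without limitation, by testing whether a subject's measured level departs from the levels seen in control samples. The passage does not define what counts as a “difference”; the z-score rule, the threshold of 2.0, the function name `differs_from_control`, and the numbers below are assumptions made only for this sketch.

```python
from statistics import mean, stdev

def differs_from_control(subject_level, control_levels, z_threshold=2.0):
    """Return True if the subject's level deviates from the control levels.

    Uses an absolute z-score against the control mean and sample standard
    deviation (requires at least two control values). Both the statistic
    and the threshold are illustrative assumptions.
    """
    control_mean = mean(control_levels)
    control_sd = stdev(control_levels)
    if control_sd == 0:
        # All controls identical: any departure counts as a difference.
        return subject_level != control_mean
    return abs(subject_level - control_mean) / control_sd >= z_threshold

# Hypothetical expression levels in arbitrary units.
example_controls = [10.0, 11.0, 9.0, 10.0]
flagged = differs_from_control(13.0, example_controls)
not_flagged = differs_from_control(10.5, example_controls)
```

In practice the choice of statistic and threshold would depend on the assay and the number of control samples available.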

The invention provides methods of identifying a therapeutic compound for treating, alleviating a symptom of, inhibiting progression of, or preventing a disease in a mammalian subject, the methods comprising: contacting a biological sample from a mammalian subject with a test compound; determining the level of expression, in the biological sample, of one or more modulators identified by any of the methods provided herein; comparing the level of expression of the one or more modulators in the biological sample with a control sample not contacted by the test compound; and selecting the test compound that modulates the level of expression of the one or more modulators in the biological sample, thereby identifying a therapeutic compound for treating, alleviating a symptom of, inhibiting progression of, or preventing a disease in a mammalian subject.
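
The selecting step can likewise be sketched as a simple screen: each test compound's effect on a modulator's expression is expressed as a fold change relative to the untreated control, and compounds that shift expression up or down by more than a chosen factor are retained. The fold-change criterion, the 1.5-fold default, the function name `select_modulating_compounds`, and the compound labels are hypothetical and are not drawn from the specification.

```python
def select_modulating_compounds(treated_levels, control_level, min_fold_change=1.5):
    """Return compounds whose treated expression differs from the control.

    `treated_levels` maps a compound label to the modulator's expression
    level after treatment; `control_level` is the positive level in the
    untreated control. A compound is kept if the fold change is at least
    `min_fold_change` upward or downward.
    """
    if control_level <= 0:
        raise ValueError("control_level must be positive")
    selected = []
    for compound, level in treated_levels.items():
        fold = level / control_level
        if fold >= min_fold_change or fold <= 1.0 / min_fold_change:
            selected.append(compound)
    return sorted(selected)

# Hypothetical screen results in arbitrary expression units.
example_treated = {"compound_1": 30.0, "compound_2": 21.0, "compound_3": 8.0}
selected_compounds = select_modulating_compounds(example_treated, control_level=20.0)
```

In this toy screen `compound_1` raises expression 1.5-fold and `compound_3` lowers it to 0.4-fold, so both are selected, while `compound_2` is not.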

The invention provides methods for treating, alleviating a symptom of, inhibiting progression of, or preventing a disease in a mammalian subject, the methods comprising: administering to the mammal in need thereof a therapeutically effective amount of a pharmaceutical composition comprising the therapeutic compound identified using any of the methods provided herein, thereby treating, alleviating a symptom of, inhibiting progression of, or preventing the disease.

The invention provides methods for treating, alleviating a symptom of, inhibiting progression of, or preventing a disease in a mammalian subject, the methods comprising: administering to the mammal in need thereof a therapeutically effective amount of a pharmaceutical composition comprising a biologically active substance that affects expression or activity of any one or more of TCOF1, TOP2A, CAMK2A, CDK1, CLTCL1, EIF4G1, ENO1, FBL, GSK3B, HDLBP, HIST1H2BA, HMGB2, HNRNPK, PGK1, PGK2, RAB7A, RPL17, RPL28, RPS5, RPS6, SLTM, TMED4, TNRC6A, TUBB, and UBE2I, thereby treating, alleviating a symptom of, inhibiting progression of, or preventing the disease. In certain embodiments, the disease is hepatocellular carcinoma.

The invention provides methods of diagnosing or prognosing diseases in a mammalian subject, the methods comprising: determining an expression or activity level, in a biological sample obtained from the subject, of any one or more proteins of TCOF1, TOP2A, CAMK2A, CDK1, CLTCL1, EIF4G1, ENO1, FBL, GSK3B, HDLBP, HIST1H2BA, HMGB2, HNRNPK, PGK1, PGK2, RAB7A, RPL17, RPL28, RPS5, RPS6, SLTM, TMED4, TNRC6A, TUBB, and UBE2I; and comparing the level in the subject with the level of expression or activity of the one or more proteins in a control sample, wherein a difference between the level in the subject and the level of expression or activity of the one or more proteins in the control sample is an indication that the subject is afflicted with a disease, or predisposed to developing a disease, or responding favorably to a therapy for a disease, thereby diagnosing or prognosing the disease in the mammalian subject. In certain embodiments, the disease is hepatocellular carcinoma.

The invention provides methods of identifying therapeutic compounds for treating, alleviating a symptom of, inhibiting progression of, or preventing a disease in a mammalian subject, the methods comprising: contacting a biological sample from a mammalian subject with a test compound; determining the level of expression, in the biological sample, of any one or more proteins of TCOF1, TOP2A, CAMK2A, CDK1, CLTCL1, EIF4G1, ENO1, FBL, GSK3B, LDHA, MAP4, MAPK1, MARCKS, NME1, NME2, PGK1, PGK2, RAB7A, RPL17, RPL28, RPS5, RPS6, SLTM, TMED4, TNRC6A, TUBB, and UBE2I; comparing the level of expression of the one or more proteins in the biological sample with a control sample not contacted by the test compound; and selecting the test compound that modulates the level of expression of the one or more proteins in the biological sample, thereby identifying a therapeutic compound for treating, alleviating a symptom of, inhibiting progression of, or preventing a disease in a mammalian subject. In certain embodiments, the disease is hepatocellular carcinoma.

The invention provides methods for treating, alleviating a symptom of, inhibiting progression of, or preventing a disease in a mammalian subject, the methods comprising: administering to the mammal in need thereof a therapeutically effective amount of a pharmaceutical composition comprising the therapeutic compound identified by any of the methods provided herein, thereby treating, alleviating a symptom of, inhibiting progression of, or preventing the disease.

The invention provides methods for identifying a modulator of angiogenesis, said methods comprising: (1) establishing a model for angiogenesis, using cells associated with angiogenesis, to represent a characteristic aspect of angiogenesis; (2) obtaining a first data set from the model for angiogenesis, wherein the first data set represents one or more of genomic data, lipidomic data, proteomic data, metabolomic data, transcriptomic data, and single nucleotide polymorphism (SNP) data characterizing the cells associated with angiogenesis; (3) obtaining a second data set from the model for angiogenesis, wherein the second data set represents one or more functional activities or cellular responses of the cells associated with angiogenesis; (4) generating a consensus causal relationship network among the one or more of genomic data, lipidomic data, proteomic data, metabolic data, transcriptomic data, and single nucleotide polymorphism (SNP) data characterizing the cells associated with angiogenesis, and the one or more functional activities or cellular responses of the cells associated with angiogenesis based solely on the first data set and the second data set using a programmed computing device, wherein the generation of the consensus causal relationship network is not based on any known biological relationships other than the first data set and the second data set; (5) identifying, from the consensus causal relationship network, a causal relationship unique in angiogenesis, wherein a gene, lipid, protein, metabolite, transcript, or SNP associated with the unique causal relationship is identified as a modulator of angiogenesis.

The invention provides methods for identifying a modulator of angiogenesis, said methods comprising: (1) establishing a model for angiogenesis, using cells associated with angiogenesis, to represent a characteristic aspect of angiogenesis; (2) obtaining a first data set from the model for angiogenesis, wherein the first data set represents lipidomic data; (3) obtaining a second data set from the model for angiogenesis, wherein the second data set represents one or more functional activities or cellular responses of the cells associated with angiogenesis; (4) generating a consensus causal relationship network among the lipidomic data and the functional activity or cellular response based solely on the first data set and the second data set using a programmed computing device, wherein the generation of the consensus causal relationship network is not based on any known biological relationships other than the first data set and the second data set; (5) identifying, from the consensus causal relationship network, a causal relationship unique in angiogenesis, wherein a lipid associated with the unique causal relationship is identified as a modulator of angiogenesis.

In certain embodiments, the second data set representing one or more functional activities or cellular responses of the cells associated with angiogenesis comprises global enzymatic activity and an effect of the global enzymatic activity on the enzyme metabolites or substrates in the cells associated with angiogenesis.

The invention provides methods for identifying modulators of angiogenesis, said methods comprising: (1) establishing a model for angiogenesis, using cells associated with angiogenesis, to represent a characteristic aspect of angiogenesis; (2) obtaining a first data set from the model for angiogenesis, wherein the first data set represents one or more of genomic data, lipidomic data, proteomic data, metabolic data, transcriptomic data, and single nucleotide polymorphism (SNP) data characterizing the cells associated with angiogenesis; (3) obtaining a second data set from the model for angiogenesis, wherein the second data set represents one or more functional activities or cellular responses of the cells associated with angiogenesis, wherein the one or more functional activities or cellular responses comprises global enzymatic activity and/or an effect of the global enzymatic activity on the enzyme metabolites or substrates in the cells associated with angiogenesis; (4) generating a consensus causal relationship network among the one or more of genomic data, lipidomic data, proteomic data, metabolic data, transcriptomic data, and single nucleotide polymorphism (SNP) data characterizing the cells associated with angiogenesis and the one or more functional activities or cellular responses of the cells associated with angiogenesis based solely on the first data set and the second data set using a programmed computing device, wherein the generation of the consensus causal relationship network is not based on any known biological relationships other than the first data set and the second data set; (5) identifying, from the consensus causal relationship network, a causal relationship unique in angiogenesis, wherein an enzyme associated with the unique causal relationship is identified as a modulator of angiogenesis.

In certain embodiments of the invention, the global enzyme activity comprises global kinase activity, and an effect of the global enzymatic activity on the enzyme metabolites or substrates in the cells associated with angiogenesis comprises the phosphoproteome of the cell. In certain embodiments, the global enzyme activity comprises global protease activity.

In certain embodiments of the invention, the modulator stimulates or promotes angiogenesis. In certain embodiments of the invention, the modulator inhibits angiogenesis.

In certain embodiments, the model for angiogenesis comprising cells associated with angiogenesis is selected from the group consisting of an in vitro cell culture angiogenesis model, rat aorta microvessel model, newborn mouse retina model, chick chorioallantoic membrane (CAM) model, corneal angiogenic growth factor pocket model, subcutaneous sponge angiogenic growth factor implantation model, MATRIGEL® angiogenic growth factor implantation model, and tumor implantation model; and wherein the model of angiogenesis optionally further comprises a matching control model of angiogenesis comprising control cells. In certain embodiments, the in vitro culture angiogenesis model is selected from the group consisting of MATRIGEL® tube formation assay, migration assay, Boyden chamber assay, and scratch assay.

In certain embodiments, the cells associated with angiogenesis in the in vitro culture model are human endothelial vessel cells. In certain embodiments, the angiogenic growth factor in the corneal angiogenic growth factor pocket model, subcutaneous sponge angiogenic growth factor implantation model, or MATRIGEL® angiogenic growth factor implantation model is selected from the group consisting of FGF-2 and VEGF.

In certain embodiments of the invention, the cells in the model of angiogenesis are subject to an environmental perturbation, and the cells in the matching model of angiogenesis are identical cells not subject to the environmental perturbation. In certain embodiments, the environmental perturbation comprises one or more of a contact with an agent, a change in culture condition, an introduced genetic modification or mutation, a vehicle that causes a genetic modification or mutation, and induction of [...]. In certain embodiments, the agent is a pro-angiogenic agent or an anti-angiogenic agent. In certain embodiments, the pro-angiogenic agent is selected from the group consisting of FGF-2 and VEGF. In certain embodiments, the anti-angiogenic agent is selected from the group consisting of VEGF inhibitors, integrin antagonists, angiostatin, endostatin, tumstatin, Avastin, sorafenib, sunitinib, pazopanib, and everolimus, soluble VEGF-receptor, angiopoietin 2, thrombospondin 1, thrombospondin 2, vasostatin, calreticulin, prothrombin (kringle domain-2), antithrombin III fragment, vascular endothelial growth inhibitor (VEGI), Secreted Protein Acidic and Rich in Cysteine (SPARC) and a SPARC peptide corresponding to the follistatin domain of the protein (FS-E), and coenzyme Q10.

In any of the embodiments, the agent is an enzymatic activity inhibitor. In any of the embodiments, the agent is a kinase activity inhibitor.

In any of the embodiments of the invention, the first data set comprises protein and/or mRNA expression levels of a plurality of genes in the genomic data set. In certain embodiments of the invention, the first data set comprises two or more of genomic data, lipidomic data, proteomic data, metabolic data, transcriptomic data, and single nucleotide polymorphism (SNP) data. In certain embodiments of the invention, the first data set comprises three or more of genomic data, lipidomic data, proteomic data, metabolic data, transcriptomic data, and single nucleotide polymorphism (SNP) data.

In any of the embodiments of the invention, the second data set representing one or more functional activities or cellular responses of the cells associated with angiogenesis comprises one or more of bioenergetics, cell proliferation, apoptosis, organellar function, cell migration, tube formation, enzyme activity, chemotaxis, extracellular matrix degradation, sprouting, and a genotype-phenotype association actualized by functional models selected from ATP, ROS, OXPHOS, and Seahorse assays.

In any of the embodiments of the invention, the first data set can be a single data set such as one of genomic data, lipidomic data, proteomic data, metabolic data, transcriptomic data, and single nucleotide polymorphism (SNP) data. In any of the embodiments, the first data set can be two data sets. In any of the embodiments, the first data set can be three data sets. In any of the embodiments, the first data set can be four data sets. In any of the embodiments, the first data set can be five data sets. In any of the embodiments, the first data set can be six data sets.

In any of the embodiments of the invention, the second data set is a single data set, such as one of the one or more functional activities or cellular responses of the cells associated with angiogenesis comprising one or more of bioenergetics, cell proliferation, apoptosis, organellar function, cell migration, tube formation, enzyme activity, chemotaxis, extracellular matrix degradation, sprouting, and a genotype-phenotype association actualized by functional models selected from ATP, ROS, OXPHOS, and Seahorse assay data. In any of the embodiments, the second data set can be two data sets.

In any of the embodiments, the second data set can be three data sets. In certain embodiments, the second data set can be four data sets. In any of the embodiments, the second data set can be five data sets. In any of the embodiments, the second data set can be six data sets. In any of the embodiments, the second data set can be seven data sets. In any of the embodiments, the second data set can be eight data sets. In any of the embodiments, the second data set can be nine data sets. In certain embodiments, the second data set can be ten data sets.

In any of the embodiments of the invention, the enzyme activity can be a kinase activity. In any of the embodiments of the invention, the enzyme activity can be a protease activity.

In certain embodiments of the invention, step (4) is carried out by an artificial intelligence (AI)-based informatics platform. In certain embodiments, the AI-based informatics platform comprises REFS(TM). In certain embodiments, the AI-based informatics platform receives all data input from the first data set and the second data set without applying a statistical cut-off point. In certain embodiments, the consensus causal relationship network established in step (4) is further refined to a simulation causal relationship network, before step (5), by in silico simulation based on input data, to provide a confidence level of prediction for one or more causal relationships within the consensus causal relationship network.
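As a rough illustration of how a consensus network with a per-relationship confidence level might be derived, the sketch below assumes that several candidate causal networks have already been learned from the first and second data sets (for example, from resampled data) and keeps the relationships that recur in a chosen fraction of them. The function name `consensus_network`, the `min_fraction` parameter, and the node labels are hypothetical; the text above does not specify how REFS(TM) forms its consensus, so this is not a description of that platform.

```python
from collections import Counter

def consensus_network(networks, min_fraction=0.5):
    """Keep directed edges that appear in at least min_fraction of the networks.

    networks: list of sets of (cause, effect) tuples, one set per candidate network.
    Returns a dict mapping each retained edge to the fraction of networks
    that contain it (used here as a simple stand-in for a confidence level).
    """
    if not networks:
        raise ValueError("at least one candidate network is required")
    counts = Counter(edge for net in networks for edge in set(net))
    total = len(networks)
    return {edge: n / total for edge, n in counts.items() if n / total >= min_fraction}

# Hypothetical candidate networks; the labels are illustrative only.
candidates = [
    {("LIPID_A", "GENE_X"), ("GENE_X", "TUBE_FORMATION")},
    {("LIPID_A", "GENE_X")},
    {("LIPID_A", "GENE_X"), ("GENE_Y", "GENE_X")},
]
consensus = consensus_network(candidates, min_fraction=0.6)
print(consensus)
```

Edge frequency across an ensemble is only one possible confidence measure; it is used here because it needs no assumptions about the underlying network-learning method.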

In certain embodiments of the invention, the unique causal relationship is identified as part of a differential causal relationship network that is uniquely present in cells, and absent in the matching control cells.
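The idea of a differential network can be sketched as a set difference, assuming each network is represented as a set of directed (cause, effect) pairs. The function name `differential_network` and all node labels below are hypothetical and chosen only for illustration.

```python
def differential_network(model_edges, control_edges):
    """Return causal relationships present in the model cells' network
    but absent from the matching control cells' network."""
    return set(model_edges) - set(control_edges)

# Hypothetical networks; each edge is a (cause, effect) pair.
model = {
    ("LIPID_A", "GENE_X"),
    ("GENE_X", "TUBE_FORMATION"),
    ("GENE_Y", "GENE_Z"),
}
control = {("GENE_Y", "GENE_Z")}

unique_relationships = differential_network(model, control)
print(sorted(unique_relationships))
```

Because the edges are ordered pairs, a relationship whose direction is reversed in the control network is treated as distinct and would also be reported as unique.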

In the invention, the unique causal relationship identified is a relationship between at least one pair selected from the group consisting of expression of a gene and level of a lipid; expression of a gene and level of a transcript; expression of a gene and level of a metabolite; expression of a first gene and expression of a second gene; expression of a gene and presence of a SNP; expression of a gene and a functional activity; level of a lipid and level of a transcript; level of a lipid and level of a metabolite; level of a first lipid and level of a second lipid; level of a lipid and presence of a SNP; level of a lipid and a functional activity; level of a first transcript and level of a second transcript; level of a transcript and level of a metabolite; level of a transcript and presence of a SNP; level of a first transcript and level of a functional activity; level of a first metabolite and level of a second metabolite; level of a metabolite and presence of a SNP; level of a metabolite and a functional activity; presence of a first SNP and presence of a second SNP; and presence of a SNP and a functional activity.

In certain embodiments, the functional activity is selected from the group consisting of bioenergetics, cell proliferation, apoptosis, organellar function, cell migration, tube formation, enzyme activity, chemotaxis, extracellular matrix degradation, and sprouting, and a genotype-phenotype association actualized by functional models selected from ATP, ROS, OXPHOS, and Seahorse assays. In certain embodiments, the functional activity is kinase activity. In certain embodiments, the functional activity is protease activity.

In certain embodiments of the invention, the unique causal relationship identified is a relationship between at least a level of a lipid, expression of a gene, and one or more functional activities, wherein the functional activity is a kinase activity.

In the invention, the methods can further comprise validating the identified unique causal relationship in angiogenesis.

The invention provides methods for providing a model for angiogenesis for use in a platform method, comprising: establishing a model for angiogenesis, using cells associated with angiogenesis, to represent a characteristic aspect of angiogenesis, wherein the model for angiogenesis is useful for generating data sets used in the platform method; thereby providing a model for angiogenesis for use in a platform method.

The invention provides methods for obtaining a first data set and second data set from a model for angiogenesis for use in a platform method, comprising: (1) obtaining a first data set from the model for angiogenesis for use in a platform method, wherein the model for angiogenesis comprises cells associated with angiogenesis, and wherein the first data set represents one or more of genomic data, lipidomic data, proteomic data, metabolic data, transcriptomic data, and single nucleotide polymorphism (SNP) data characterizing the cells associated with angiogenesis; (2) obtaining a second data set from the model for angiogenesis for use in the platform method, wherein the second data set represents one or more functional activities or cellular responses of the cells associated with angiogenesis; thereby obtaining a first data set and second data set from the model for angiogenesis for use in a platform method.

The invention provides methods for identifying a modulator of angiogenesis, said method comprising: (1) generating a consensus causal relationship network among a first data set and second data set obtained from a model for angiogenesis, wherein the model comprises cells associated with angiogenesis, and wherein the first data set represents one or more of genomic data, lipidomic data, proteomic data, metabolic data, transcriptomic data, and single nucleotide polymorphism (SNP) data characterizing the cells associated with angiogenesis; and the second data set represents one or more functional activities or cellular responses of the cells associated with angiogenesis, using a programmed computing device, wherein the generation of the consensus causal relationship network is not based on any known biological relationships other than the first data set and the second data set; (2) identifying, from the consensus causal relationship network, a causal relationship unique in angiogenesis, wherein at least one of a gene, a lipid, a protein, a metabolite, a transcript, or a SNP associated with the unique causal relationship is identified as a modulator of angiogenesis; thereby identifying a modulator of angiogenesis.

The invention provides methods for identifying a modulator of angiogenesis, said method comprising: (1) providing a consensus causal relationship network generated from a model for angiogenesis; (2) identifying, from the consensus causal relationship network, a causal relationship unique in angiogenesis, wherein at least one of a gene, a lipid, a protein, a metabolite, a transcript, or a SNP associated with the unique causal relationship is identified as a modulator of angiogenesis; thereby identifying a modulator of angiogenesis.

In certain embodiments, the consensus causal relationship network is generated among a first data set and second data set obtained from the model for angiogenesis, wherein the model comprises cells associated with angiogenesis, and wherein the first data set represents one or more of genomic data, lipidomic data, proteomic data, metabolic data, transcriptomic data, and single nucleotide polymorphism (SNP) data characterizing the cells associated with angiogenesis; and the second data set represents one or more functional activities or cellular responses of the cells associated with angiogenesis, using a programmed computing device, wherein the generation of the consensus causal relationship network is not based on any known biological relationships other than the first data set and the second data set.

In certain embodiments, the model for angiogenesis is selected from the group consisting of in vitro cell culture angiogenesis model, rat aorta microvessel model, newborn mouse retina model, chick chorioallantoic membrane (CAM) model, corneal angiogenic growth factor pocket model, subcutaneous sponge angiogenic growth factor implantation model, MATRIGEL® angiogenic growth factor implantation model, and tumor implantation model; and wherein the model of angiogenesis optionally further comprises a matching control model of angiogenesis comprising control cells.

In certain embodiments, the first data set comprises lipidomics data. In certain embodiments, the first data set comprises only lipidomics data.

In certain embodiments, the second data set represents one or more functional activities or cellular responses of the cells associated with angiogenesis comprising global enzymatic activity, and an effect of the global enzymatic activity on the enzyme metabolites or substrates in the cells associated with angiogenesis.

In certain embodiments, the second data set comprises kinase activity or protease activity. In certain embodiments, the second data set comprises only kinase activity or protease activity.

In certain embodiments, the second data set representing one or more functional activities or cellular responses of the cells associated with angiogenesis comprises one or more of bioenergetics profiling, cell proliferation, apoptosis, organellar function, cell migration, tube formation, kinase activity, and protease activity; and a genotype-phenotype association actualized by functional models selected from ATP, ROS, OXPHOS, and Seahorse assays.

In certain embodiments of the invention, the angiogenesis is related to a disease state.

The invention provides methods for modulating angiogenesis in a mammalian subject, the methods comprising: administering to the mammal in need thereof a therapeutically effective amount of a pharmaceutical composition comprising a biologically active substance that affects the modulator identified by any one of the methods provided herein, thereby modulating angiogenesis.

The invention provides methods of detecting modulated angiogenesis in a mammalian subject, the methods comprising: determining a level, activity, or presence, in a biological sample obtained from the subject, of one or more modulators identified by any one of the methods provided herein; and comparing the level, activity, or presence in the subject with the level, activity, or presence of the one or more modulators in a control sample, wherein a difference between the level, activity, or presence in the subject and the level, activity, or presence of the one or more modulators in the control sample is an indication that angiogenesis is modulated in the mammalian subject.

The invention provides methods of identifying a therapeutic compound for modulating angiogenesis in a mammalian subject, the methods comprising: contacting a biological sample from a mammalian subject with a test compound; determining the level of expression, in the biological sample, of one or more modulators identified by any one of the methods provided herein; comparing the level, activity, or presence of the one or more modulators in the biological sample with a control sample not contacted by the test compound; and selecting the test compound that modulates the level, activity, or presence of the one or more modulators in the biological sample, thereby identifying a therapeutic compound for modulating angiogenesis in a mammalian subject.

The invention provides methods for modulating angiogenesis in a mammalian subject, the methods comprising: administering to the mammal in need thereof a therapeutically effective amount of a pharmaceutical composition comprising the therapeutic compound identified by any of the methods provided herein, thereby treating, alleviating a symptom of, inhibiting progression of, preventing, diagnosing, or prognosing the disease.

In certain embodiments, the “environmental perturbation”, also referred to herein as “external stimulus component”, is a therapeutic agent. In certain embodiments, the external stimulus component is a small molecule (e.g., a small molecule of no more than [...] kDa, 4 kDa, 3 kDa, 2 kDa, 1 kDa, 500 Dalton, or 250 Dalton). In certain embodiments, the external stimulus component is a biologic. In certain embodiments, the external stimulus component is a chemical. In certain embodiments, the external stimulus component is endogenous or exogenous to cells. In certain embodiments, the external stimulus component is a MIM or epishifter. In certain embodiments, the external stimulus component is a stress factor for the cell system, such as hypoxia, hyperglycemia, hyperlipidemia, hyperinsulinemia, and/or lactic acid rich conditions.

In certain embodiments, the external stimulus component may include a therapeutic agent or a candidate therapeutic agent for treating a disease condition, including chemotherapeutic agents, protein-based biological drugs, antibodies, fusion proteins, small molecule drugs, lipids, polysaccharides, nucleic acids, etc.

In certain embodiments, the external stimulus component may be one or more stress factors, such as those typically encountered in vivo under the various disease conditions, including hypoxia, hyperglycemic conditions, acidic environment (that may be mimicked by lactic acid treatment), etc.

In other embodiments, the external stimulus component may include one or more MIMs and/or epishifters, as defined herein below. Exemplary MIMs include Coenzyme Q10 (also referred to herein as CoQ10) and compounds in the Vitamin B family, or nucleosides, mononucleotides or dinucleotides that comprise a compound in the Vitamin B family. In certain embodiments, the external stimulus is not CoQ10. In certain embodiments, the external stimulus is not Vitamin B or a compound in the Vitamin B family.

In making cellular output measurements (such as protein expression, lipid level), either absolute amount (e.g., expression or total amount) or relative level (e.g., relative expression level or amount) may be used. In one embodiment, absolute amounts (e.g., expression or total amounts) are used. In one embodiment, relative levels or amounts (e.g., relative expression levels or amounts) are used. For example, to determine the relative protein expression level of a cell system, the amount of any given protein in the cell system, with or without the external stimulus to the cell system, may be compared to a suitable control cell line or mixture of cell lines (such as all cells used in the same experiment) and given a fold-increase or fold-decrease value. The skilled person will appreciate that absolute amounts or relative amounts can be employed in any cellular output measurement, such as gene and/or RNA transcription level, level of lipid, or any functional output, e.g., level of apoptosis, level of toxicity, or ECAR or OCR as described herein. A pre-determined threshold level for a fold-increase (e.g., at least 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2, 2.5, 3, 3.5, 4, 4.5, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 75 or 100 or more fold increase) or fold-decrease (e.g., at least a decrease to 0.9, 0.8, 0.75, 0.7, 0.6, 0.5, 0.45, 0.4, 0.35, 0.3, 0.25, 0.2, 0.15, 0.1 or 0.05 fold, or a decrease to 90%, 80%, 75%, 70%, 65%, 60%, 55%, 50%, 45%, 40%, 35%, 30%, 25%, 20%, 15%, 10% or 5% or less) may be used to select significant differentials, and the cellular output data for the significant differentials may then be included in the data sets (e.g., first and second data sets) utilized in the platform technology methods of the invention. All values presented in the foregoing list can also be the upper or lower limit of ranges, e.g., between 1.5 and 5 fold, 5 and 10 fold, 2 and 5 fold, or between 0.9 and 0.7, 0.9 and 0.5, or 0.7 and 0.3 fold, which are intended to be a part of this invention.

Throughout the present application, all values presented in a list, e. g., such as those above, can also be the upper or lower limit of ranges that are intended to be a part of this invention.
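The fold-change selection described above can be sketched as follows, assuming each cellular output is recorded as a pair of amounts (with stimulus, control). The function name `select_significant_differentials`, the default thresholds of 1.5 fold and 0.5 fold (both taken from the listed values), and the protein labels are hypothetical choices for illustration, not values prescribed by the text.

```python
def select_significant_differentials(levels, min_fold_increase=1.5, max_fold_decrease=0.5):
    """Select outputs whose fold change versus control meets a threshold.

    levels: dict mapping an output name to (treated_amount, control_amount).
    Returns a dict mapping each selected name to its fold change
    (treated / control). Outputs with a non-positive control amount are
    skipped because the fold change is undefined for them.
    """
    selected = {}
    for name, (treated, control) in levels.items():
        if control <= 0:
            continue
        fold = treated / control
        if fold >= min_fold_increase or fold <= max_fold_decrease:
            selected[name] = fold
    return selected

# Hypothetical measurements: (amount with stimulus, amount in control).
measurements = {
    "PROTEIN_1": (30.0, 10.0),  # 3-fold increase, selected
    "PROTEIN_2": (11.0, 10.0),  # 1.1-fold, not selected
    "PROTEIN_3": (4.0, 10.0),   # decrease to 0.4 fold, selected
}
print(select_significant_differentials(measurements))
```

Tightening or loosening the two thresholds changes how many differentials enter the first and second data sets.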

In one embodiment of the methods of the invention, not every observed causal relationship in a causal relationship network may be of biological significance. With respect to any given biological system for which the subject interrogative biological assessment is applied, some (or maybe all) of the causal relationships (and the genes associated therewith) may be “determinative” with respect to the specific biological problem at issue, e.g., either responsible for causing a disease condition (a potential target for therapeutic intervention) or a biomarker for the disease condition (a potential diagnostic or prognostic factor). In one embodiment, an observed causal relationship unique in the biological system is determinative with respect to the specific biological problem at issue. In one embodiment, not every observed causal relationship unique in the biological system is determinative with respect to the specific problem at issue.

Such determinative causal relationships may be selected by an end user of the subject method, or they may be selected by a bioinformatics software program, such as REFS, a DAVID-enabled comparative pathway analysis program, or the KEGG pathway analysis program. In certain embodiments, more than one bioinformatics software program is used, and consensus results from two or more bioinformatics software programs are preferred.

As used herein, “differentials” of cellular outputs include differences (e.g., increased or decreased levels) in any one or more parameters of the cellular outputs. In certain embodiments, the differentials are each independently selected from the group consisting of differentials in mRNA transcription, protein expression, protein activity, metabolite / intermediate level, and/or ligand-target interaction. For example, in terms of protein expression level, differentials between two cellular outputs, such as the outputs associated with a cell system before and after the treatment by an external stimulus component, can be measured and quantitated by using art-recognized technologies, such as mass-spectrometry based assays (e.g., iTRAQ, 2D-LC-MSMS, etc.).

In one embodiment, the cell model for a biological system comprises a cellular cross-talking system, wherein a first cell system having a first cellular environment with an external stimulus component generates a first modified cellular environment; such that a cross-talking cell system is established by exposing a second cell system having a second cellular environment to the first modified cellular environment.

In one embodiment, at least one significant cellular cross-talking differential from the cross-talking cell system is generated; and at least one determinative cellular cross-talking differential is identified such that an interrogative biological assessment occurs. In certain embodiments, the at least one significant cellular cross-talking differential is a plurality of differentials.

In certain embodiments, the at least one determinative cellular cross-talking differential is selected by the end user. Alternatively, in another embodiment, the at least one determinative cellular cross-talking differential is selected by a bioinformatics software program (such as, e.g., REFS, KEGG pathway analysis or DAVID-enabled comparative pathway analysis) based on the quantitative proteomics data.

In certain embodiments, the method further comprises generating a significant cellular output differential for the first cell system.

In certain embodiments, the differentials are each independently selected from the group consisting of differentials in mRNA transcription, protein expression, protein activity, metabolite / intermediate level, and/or ligand-target interaction.

In certain embodiments, the first cell system and the second cell system are independently selected from: a homogeneous population of primary cells, a cancer cell line, or a normal cell line.

In certain embodiments, the first modified cellular environment comprises factors secreted by the first cell system into the first cellular environment, as a result of contacting the first cell system with the external stimulus component. The factors may comprise secreted proteins or other signaling molecules. In certain embodiments, the first modified cellular environment is substantially free of the original external stimulus component.

In certain embodiments, the cross-talking cell system comprises a transwell having an insert compartment and a well compartment separated by a membrane. For example, the first cell system may grow in the insert compartment (or the well compartment), and the second cell system may grow in the well compartment (or the insert compartment).

In certain embodiments, the cross-talking cell system comprises a first culture for growing the first cell system, and a second culture for growing the second cell system. In this case, the first modified cellular environment may be a conditioned medium from the first cell system.

In certain embodiments, the first cellular environment and the second cellular environment can be identical. In certain embodiments, the first cellular environment and the second cellular environment can be different.

In certain embodiments, the cross-talking cell system comprises a co-culture of the first cell system and the second cell system.

The methods of the invention may be used for, or applied to, any number of “interrogative biological assessments.” Application of the methods of the invention to an interrogative biological assessment allows for the identification of one or more modulators of a biological system or determinative cellular process “drivers” of a biological system or process.

The methods of the invention may be used to carry out a broad range of interrogative biological assessments. In certain embodiments, the interrogative biological assessment is the diagnosis of a disease state. In certain embodiments, the interrogative biological assessment is the determination of the efficacy of a drug. In certain embodiments, the interrogative biological assessment is the determination of the toxicity of a drug. In certain embodiments, the interrogative biological assessment is the staging of a disease state. In certain embodiments, the interrogative biological assessment identifies targets for anti-aging cosmetics.

As used herein, an “interrogative biological assessment” may include the identification of one or more modulators of a biological , e. g., determinative cellular process “drivers,” (6. g., an increase or decrease in activity of a biological pathway, or key members of the pathway, or key regulators to members of the pathway) associated with the environmental perturbation or external stimulus component, or a unique causal onship unique in a biological system or process. It may further include additional steps designed to test or verify whether the fied determinative cellular process drivers are necessary and/or sufficient for the downstream events associated with the environmental perturbation or external stimulus component, ing in vivo animal models and/or in vitro tissue culture experiments.

In certain embodiments, the interrogative biological assessment is the diagnosis or staging of a disease state, wherein the identified modulators of a biological system, e.g., determinative cellular process drivers (e.g., cross-talk differentials or causal relationships unique in a biological system or process) represent either disease markers or therapeutic targets that can be subject to therapeutic intervention. The subject interrogative biological assessment is suitable for any disease condition in general, but may be found particularly useful in areas such as oncology / cancer biology, diabetes, obesity, cardiovascular disease, and neurological conditions (especially neurodegenerative diseases, such as, without limitation, Alzheimer’s disease, Parkinson’s disease, Huntington’s disease, amyotrophic lateral sclerosis (ALS), and aging related neurodegeneration), and conditions associated with angiogenesis.

In certain embodiments, the interrogative biological assessment is the determination of the efficacy of a drug, wherein the identified modulators of a biological system, e.g., determinative cellular process driver (e.g., cross-talk differentials or causal relationships unique in a biological system or process) may be the hallmarks of a successful drug, and may in turn be used to identify additional agents, such as MIMs or epishifters, for treating the same disease condition.

In certain embodiments, the interrogative biological assessment is the identification of drug targets for preventing or treating infection, wherein the identified determinative cellular process driver (e.g., cellular cross-talk differentials or causal relationships unique in a biological system or process) may be markers/indicators or key biological molecules causative of the infective state, and may in turn be used to identify anti-infective agents.

In certain embodiments, the interrogative biological assessment is the assessment of a molecular effect of an agent, e.g., a drug, on a given disease profile, wherein the identified modulators of a biological system, e.g., determinative cellular process driver (e.g., cellular cross-talk differentials or causal relationships unique in a biological system or process) may be an increase or decrease in activity of one or more biological pathways, or key members of the pathway(s), or key regulators to members of the pathway(s), and may in turn be used, e.g., to predict the therapeutic efficacy of the agent for the given disease.

In certain embodiments, the interrogative biological assessment is the assessment of the toxicological profile of an agent, e.g., a drug, on a cell, tissue, organ or organism, wherein the identified modulators of a biological system, e.g., determinative cellular process driver (e.g., cellular cross-talk differentials or causal relationships unique in a biological system or process) may be indicators of toxicity, e.g., cytotoxicity, and may in turn be used to predict or identify the toxicological profile of the agent. In one embodiment, the identified modulators of a biological system, e.g., determinative cellular process driver (e.g., cellular cross-talk differentials or causal relationships unique in a biological system or process) is an indicator of cardiotoxicity of a drug or drug candidate, and may in turn be used to predict or identify the cardiotoxicological profile of the drug or drug candidate.

In certain embodiments, the interrogative biological assessment is the identification of drug targets for preventing or treating a disease or disorder caused by biological weapons, such as disease-causing protozoa, fungi, bacteria, protists, viruses, or toxins, wherein the identified modulators of a biological system, e.g., determinative cellular process driver (e.g., cellular cross-talk differentials or causal relationships unique in a biological system or process) may be markers/indicators or key biological molecules causative of said disease or disorder, and may in turn be used to identify biodefense agents.

In certain embodiments, the interrogative biological assessment is the identification of targets for anti-aging agents, such as anti-aging cosmetics, wherein the identified modulators of a biological system, e.g., determinative cellular process driver (e.g., cellular cross-talk differentials or causal relationships unique in a biological system or process) may be markers or indicators of the aging process, particularly the aging process in skin, and may in turn be used to identify anti-aging agents.

In one exemplary cell model for aging that is used in the methods of the invention to identify targets for anti-aging cosmetics, the cell model comprises an aging epithelial cell that is, for example, treated with UV light (an environmental perturbation or external stimulus component), and/or neonatal cells, which are also optionally treated with UV light. In one embodiment, a cell model for aging comprises a cellular cross-talk system. In one exemplary two-cell cross-talk system established to identify targets for anti-aging cosmetics, an aging epithelial cell (first cell system) may be treated with UV light (an external stimulus component), and changes, e.g., proteomic changes and/or functional changes, in a neonatal cell (second cell system) resulting from contacting the neonatal cells with conditioned medium of the treated aging epithelial cell may be measured, e.g., proteome changes may be measured using conventional quantitative mass spectrometry, or a causal relationship unique in aging may be identified from a causal relationship network generated from the data.

In another aspect, the invention provides a kit for conducting an interrogative biological assessment using a discovery Platform Technology, comprising one or more reagents for detecting the presence of, and/or for quantitating the amount of, an analyte that is the subject of a causal relationship network generated from the methods of the invention. In one embodiment, said analyte is the subject of a unique causal relationship in the biological system, e.g., a gene associated with a unique causal relationship in the biological system. In certain embodiments, the analyte is a protein, and the reagents comprise an antibody against the protein, a label for the protein, and/or one or more agents for preparing the protein for high throughput analysis (e.g., mass spectrometry based sequencing).

In yet another aspect, the technology provides a method for treating, alleviating a symptom of, inhibiting progression of, preventing, diagnosing, or prognosing a disease in a mammalian subject. The method includes administering to the mammal in need thereof a therapeutically effective amount of a pharmaceutical composition comprising a biologically active substance that affects expression or activity of any one or more of TCOF1, TOP2A, CAMK2A, CDK1, CLTCL1, EIF4G1, ENO1, FBL, GSK3B, HDLBP, MAPK1, MARCKS, NME1, NME2, PGK1, PGK2, RAB7A, RPL17, RPL28, RPS5, RPS6, SLTM, TMED4, TNRC6A, TUBB, and UBE2I, thereby treating, alleviating a symptom of, inhibiting progression of, preventing, diagnosing, or prognosing the disease. In some embodiments, the disease is a cancer, for example hepatocellular carcinoma. In various embodiments, the method can use 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, or 34 of the kinases. In one embodiment, the composition increases expression and/or activity of one or more of the kinases. In another embodiment, the composition decreases expression and/or activity of one or more of the kinases.

In still yet another aspect, the technology provides a method of diagnosing a disease in a mammalian subject. The method includes (i) determining an expression or activity level, in a biological sample obtained from the subject, of any one or more of TCOF1, TOP2A, CAMK2A, CDK1, CLTCL1, EIF4G1, ENO1, FBL, GSK3B, HDLBP, MAPK1, MARCKS, NME1, NME2, PGK1, PGK2, RAB7A, RPL17, RPL28, RPS5, RPS6, SLTM, TMED4, TNRC6A, TUBB, and UBE2I, and (ii) comparing the level in the subject with the level of expression or activity of the one or more proteins in a control sample, wherein a difference between the level in the subject and the level of expression or activity of the one or more proteins in the control sample is an indication that the subject is afflicted with a disease, or predisposed to developing a disease, or responding favorably to a therapy for a disease, thereby diagnosing the disease in the mammalian subject. In some embodiments, the disease is a cancer, for example hepatocellular carcinoma. In various embodiments, the method can use 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, or 34 of the kinases. In one embodiment, the difference is an increase in expression and/or activity of one or more of the kinases. In another embodiment, the difference is a decrease in expression and/or activity of one or more of the kinases.

In yet another aspect, the technology provides a method of identifying a therapeutic compound for treating, alleviating a symptom of, inhibiting progression of, preventing, diagnosing, or prognosing a disease in a mammalian subject. The method includes (i) contacting a biological sample from a mammalian subject with a test compound, (ii) determining the level of expression, in the biological sample, of any one or more of TCOF1, TOP2A, CAMK2A, CDK1, CLTCL1, EIF4G1, ENO1, FBL, GSK3B, LDHA, MAP4, MAPK1, MARCKS, NME1, NME2, PGK1, PGK2, RAB7A, RPL17, RPL28, RPS5, RPS6, SLTM, TMED4, TNRC6A, TUBB, and UBE2I, (iii) comparing the level of expression of the one or more proteins in the biological sample with a control sample not contacted by the test compound, and (iv) selecting the test compound that modulates the level of expression of the one or more proteins in the biological sample, thereby identifying a therapeutic compound for treating, alleviating a symptom of, inhibiting progression of, preventing, diagnosing, or prognosing a disease in a mammalian subject. In some embodiments, the disease is a cancer, for example hepatocellular carcinoma. In various embodiments, the method can use 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, or 34 of the kinases. In one embodiment, the compound increases expression and/or activity of one or more of the kinases. In another embodiment, the compound decreases expression and/or activity of one or more of the kinases.

In still yet another aspect, the technology provides a method for treating, alleviating a symptom of, inhibiting progression of, preventing, diagnosing, or prognosing a disease in a mammalian subject. The method comprises administering to the mammal in need thereof a therapeutically effective amount of a pharmaceutical composition comprising the therapeutic compound identified by the aspect above (i.e., utilizing any one or more of TCOF1, TOP2A, CAMK2A, CDK1, CLTCL1, EIF4G1, ENO1, FBL, GSK3B, MAP2K2, LDHA, MAP4, MAPK1, MARCKS, NME1, NME2, PGK1, PGK2, RAB7A, RPL17, RPL28, RPS5, RPS6, SLTM, TMED4, TNRC6A, TUBB, and UBE2I), thereby treating, alleviating a symptom of, inhibiting progression of, preventing, diagnosing, or prognosing the disease. In some embodiments, the disease is a cancer, for example hepatocellular carcinoma. In various embodiments, the method can use 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, or 34 of the kinases.

It should be understood that all embodiments described herein, including those described only in examples, are parts of the general description of the invention, and can be combined with any other embodiments of the invention unless explicitly disclaimed or inapplicable.

Brief Description of the Drawings

Various embodiments of the present disclosure will be described herein below with reference to the figures, wherein:

Figure 1: Illustration of approach to identify therapeutics.

Figure 2: Illustration of systems biology of cancer and consequence of integrated multi-physiological interactive output regulation.

Figure 3: Illustration of systematic interrogation of biological relevance using MIMS.

Figure 4: Illustration of modeling cancer network to enable interrogative biological query.

Figure 5: Illustration of the interrogative biology platform technology.

Figure 6: Illustration of technologies employed in the platform technology.

Figure 7: Schematic representation of the components of the platform including data collection, data integration, and data mining.

Figure 8: Schematic representation of the systematic interrogation using MIMS and collection of response data from the “omics” cascade.

Figure 9: Sketch of the components employed to build the in vitro models representing normal and diabetic states.

Figure 10: Schematic representation of the informatics platform REFS™ used to generate causal networks of the protein as they relate to disease pathophysiology.

Figure 11: Schematic representation of the approach towards generation of differential network in diabetic versus normal states and diabetic nodes that are restored to normal states by treatment with MIMS.

Figure 12: A representative differential network in diabetic versus normal states.

Figure 13: A schematic representation of a node and associated edges of interest (Node1 in the center). The cellular functionality associated with each edge is represented.

Figure 14: High level flow chart of an exemplary method, in accordance with some embodiments.

Figures 15A-15D: High level schematic illustration of the components and process for an AI-based informatics system that may be used with exemplary embodiments.

Figure 16: Flow chart of process in AI-based informatics system that may be used with some exemplary embodiments.

Figure 17: Schematically depicts an exemplary computing environment suitable for practicing exemplary embodiments taught herein.

Figure 18: Illustration of case study design described in Example 1.

Figure 19: Effect of CoQ10 treatments on downstream nodes.

Figure 20: CoQ10 treatment decreases expression of LDHA in cancer cell line HepG2.

Figure 21: Exemplary protein interaction consensus network at 70% fragment frequency based on data from Paca2, HepG2 and THLE2 cell lines.

Figure 22: Proteins responsive to LDHA expression simulation in two cancer cell lines were identified using the platform technology.

Figure 23: Ingenuity Pathway® analysis of the LDHA-PARK7 network identifies TP53 as an upstream hub.

Figure 24: Effect of CoQ10 treatment on TP53 expression levels in SKMEL28 cancer cell line.

Figure 25: Activation of TP53 associated with altered expression of BCL-2 proteins effectuating apoptosis in SKMEL28 cancer cell line and effect of CoQ10 treatment on Bcl-2, Bax and Caspase3 expression levels in SKMEL28.

Figure 26: Illustration of the mathematical approach towards generation of delta-delta networks.

Figure 27: Cancer-Healthy differential (delta-delta) network that drives ECAR and OCR. Each driver has differential effects on the end point as represented by the thickness of the edge. The thickness of the edge in Cytoscape represents the strength of the fold change.

Figure 28: Mapping PARK7 and associated nodes from the interrogative platform technology outputs using IPA: The gray shapes include all the nodes associated with PARK7 from the interrogative biology outputs that were imported into IPA. The unfilled shapes (with names) are new connections incorporated by IPA to create a complete map.

Figure 29: The interrogative platform technology of the invention, demonstrating novel associations of nodes associated with PARK7. Edges shown in dashed lines are connections between two nodes in the simulations that have intermediate nodes, but do not have intermediate nodes in IPA. Edges shown in dotted lines are connections between two nodes in the simulations that have intermediate nodes, but have different intermediate nodes in IPA.

Figure 30: Illustration of the mathematical approach towards generation of delta-delta networks. Compare unique edges from NG in the NG∩HG delta network with unique edges of HGT1 in the HG∩HGT1 delta network. Edges in the intersection of NG and HGT1 are HG edges that are restored to NG with T1.

Figure 31: Delta-delta network of diabetic edges restored to normal with Coenzyme Q10 treatment superimposed on the NG∩HG delta network.

Figure 32: Delta-delta network of hyperlipidemic edges restored to normal with Coenzyme Q10 treatment superimposed on the normal lipidemia ∩ hyperlipidemia delta network.

Figure 33: A schematic representing the altered fate of fatty acid in disease and drug treatment. A balance between utilization of free fatty acid (FFA) for generation of ATP and membrane remodeling in response to disruption of membrane biology has been implicated in drug induced cardiotoxicity.

Figure 34: A schematic representing experimental design and modeling parameters used to study drug induced toxicity in diabetic cardiomyocytes.

Figure 35: Dysregulation of transcriptional network and expression of human mitochondrial energy metabolism genes in diabetic cardiomyocytes by drug treatment (T): rescue molecule (R) normalizes gene expression.

Figure 36: A. Drug treatment (T) induced expression of GPAT1 and TAZ in mitochondria from cardiomyocytes conditioned in hyperglycemia. In combination with the rescue molecule (T+R) the levels of GPAT1 and TAZ were normalized. B. Synthesis of TAG from G3P.

Figure 37: A. Drug treatment (T) decreases mitochondrial OCR (oxygen consumption rate) in cardiomyocytes conditioned in hyperglycemia. The rescue molecule (T+R) normalizes OCR. B. Drug treatment (T) represses mitochondrial ATP synthesis in cardiomyocytes conditioned in hyperglycemia.

Figure 38: GO Annotation of proteins down regulated by drug treatment. Proteins involved in mitochondrial energy metabolism were down regulated with drug treatment.

Figure 39: Illustration of the mathematical approach towards generation of delta networks. Compare unique edges from T versus UT, both models being in a diabetic environment.

Figure 40: A schematic representing potential protein hubs and networks that drive pathophysiology of drug induced toxicity.

Figure 41 illustrates a method for identifying a modulator of a biological system or disease process.

Figure 42 illustrates a significant decrease in ENO1 activity, not protein expression, in HepG2 treated with Sorafenib.

Figure 43 illustrates a significant decrease in PGK1 activity and not protein expression in HepG2 treated with Sorafenib.

Figure 44 illustrates a significant decrease in LDHA activity in HepG2 treated with Sorafenib.

Figure 45 illustrates a causal molecular interaction network that can be produced by analyzing the dataset using the AI based REFS™.

Figure 46 illustrates how integration of multiomics data employing Bayesian network inference algorithms can lead to improved understanding of signaling pathways in hepatocellular carcinoma. Yellow squares represent post transcriptional modification (Phospho) data, blue triangles represent activity based (Kinase) data, and green circles represent proteomics data.

Figure 47 illustrates how autoregulation and reverse feed back regulation in hepatocellular carcinoma signaling pathways can be inferred by the Platform. Squares represent post transcriptional modification (Phospho) data (grey/dark = Kinase, yellow/light = No Kinase Activity), squares represent activity based (Kinase) + Proteomics data (grey/dark = Kinase, yellow/light = No Kinase Activity).

Figures 48-51 illustrate examples of causal association in signaling pathways inferred by the Platform. Kinase isoforms are indicated on representative squares and circles, with causal associations indicated by connectors.

Figures 52A-B show human umbilical vein endothelial cells (HUVECs) grown in (A) confluent or (B) subconfluent cultures and treated for 24 hours with a range of concentrations of CoQ10 as indicated. Confluent cells closely resemble ‘normal’ cells, whereas sub-confluent cells more closely represent the angiogenic phenotype of proliferating cells. In confluent cultures, addition of increasing concentrations of CoQ10 led to closer association, elongation and alignment of ECs. 5000 µM led to a subtle increase in rounded cells.

Figures 53A-C show confluent (A) and subconfluent (B) cultures of HUVEC cells that were treated for 24 hours with 100 or 1500 µM CoQ10 and assayed for propidium iodide positive apoptotic cells. CoQ10 was protective to ECs treated at confluence, whereas sub-confluent cells were sensitive to CoQ10 and displayed increased apoptosis at 1500 µM CoQ10. (C) Representative histograms of sub-confluent control ECs (left), 100 µM CoQ10 (middle) and 1500 µM CoQ10 (right).

Figures 54A-C show subconfluent cultures of HUVEC cells that were treated for 72 hours with 100 or 1500 µM CoQ10 and assayed for both cell numbers (A) and proliferation (B) using a propidium iodide incorporation assay (detects G2/M phase DNA). High concentrations of CoQ10 led to a significant decrease in cell numbers and had a dose-dependent effect on EC proliferation. Representative histograms of cell proliferation gating for cells in the G2/M phase of the cell cycle [control ECs (left), 100 µM CoQ10 (middle) and 1500 µM CoQ10 (right)] are shown in (C).

Figure 55 shows HUVEC cells that were grown to confluence and tested for migration using the ‘scratch’ assay. 100 or 1500 µM CoQ10 was applied at the time of scratching and closure of the cleared area was monitored over 48 hours. 100 µM CoQ10 delayed endothelial closure compared to control. Addition of 1500 µM CoQ10 prevented closure, even up to 48 hours (data not shown).

Figure 56 shows endothelial cells growing in 3-D matrigel form tubes over time. Differential effects of 100 µM and 1500 µM CoQ10 on tube formation were observed. Impaired cell to cell association and breakdown of early tube structure was significant at 1500 µM CoQ10. Images shown were taken at 72 hours.

Figures 57A-B show endothelial cells grown in subconfluent and confluent cultures in the presence or absence of CoQ10 under both normal and hypoxic conditions. Generation of nitric oxide (NO) (A) and reactive oxygen species (ROS) (B) in response to CoQ10 and hypoxia were assessed.

Figures 58A-D show endothelial cells grown in subconfluent or confluent cultures in the presence or absence of CoQ10 to assess mitochondrial oxygen consumption under the indicated growth conditions. Assessment of Total OCR (A); Mitochondrial OCR (B); ATP production (C); ECAR (D) are shown.

Figures 59A-C show results from the interrogative biology platform used to identify key biological functional nodes through modulating endothelial cell function by CoQ10. These nodes are represented by a full multi-omic network (A), a hub of a protein enriched network (B), and a hub of a kinase, lipidomic, and functional endpoint network (C). Figures 59B and 59C are exploded portions of Figure 59A.

Detailed Description of the Invention

I. Overview

Exemplary embodiments of the present invention incorporate methods that may be performed using an interrogative biology platform (“the Platform”) that is a tool for understanding a wide variety of biological processes, such as disease pathophysiology or angiogenesis, and the key molecular drivers underlying such biological processes, including factors that enable a disease process. Some exemplary embodiments include systems that may incorporate at least a portion of, or all of, the Platform. Some exemplary methods may employ at least some of, or all of the Platform. Goals and objectives of some exemplary embodiments involving the platform are generally outlined below for illustrative purposes:

i) to create specific molecular signatures as drivers of critical components of the biological process (e.g., disease process, angiogenesis) as they relate to the overall biological process;

ii) to generate molecular signatures or differential maps pertaining to the biological process, which may help to identify differential molecular signatures that distinguish one biological state (e.g., a disease state, angiogenic state) versus a different biological stage (e.g., a normal state), and develop understanding of signatures or molecular entities as they arbitrate mechanisms of change between the two biological states (e.g., from normal to disease state or angiogenic state); and,

iii) to investigate the role of “hubs” of molecular activity as potential intervention targets for external control of the biological process (e.g., to use the hub as a potential therapeutic target or target for the modulation of angiogenesis), or as potential bio-markers for the biological process in question (e.g., disease specific biomarkers and angiogenic specific markers, in prognostic and/or theranostics uses).

Some exemplary methods involving the Platform may include one or more of the following features:

1) modeling the biological process (e.g., disease process, angiogenic process) and/or components of the biological process (e.g., disease physiology and pathophysiology, physiology of angiogenesis) in one or more models, preferably in vitro models or laboratory models (e.g., CAM models, corneal pocket models, MATRIGEL®), using cells associated with the biological process. For example, the cells may be human derived cells which normally participate in the biological process in question. The model may include various cellular cues / conditions / perturbations that are specific to the biological process (e.g., disease, angiogenesis). Ideally, the model represents various (disease, angiogenesis) states and flux components, instead of a static assessment of the biological (disease, angiogenesis) condition.

2) profiling mRNA and/or protein signatures using any art-recognized means, for example, quantitative polymerase chain reaction (qPCR) and proteomics analysis tools such as Mass Spectrometry (MS). Such mRNA and protein data sets represent biological reaction to environment / perturbation. Where applicable and possible, lipidomics, metabolomics, and transcriptomics data may also be integrated as supplemental or alternative measures for the biological process in question. SNP analysis is another component that may be used at times in the process. It may be helpful for investigating, for example, whether the SNP or a specific mutation has any effect on the biological process. These variables may be used to describe the biological process, either as a static “snapshot,” or as a representation of a dynamic process.

3) assaying for one or more cellular responses to cues and perturbations, including but not limited to bioenergetics profiling, cell proliferation, apoptosis, and organellar function. True genotype-phenotype association is actualized by employment of functional models, such as ATP, ROS, OXPHOS, Seahorse assays, caspase assays, migration assays, chemotaxis assays, tube formation assays, etc. Such cellular responses represent the reaction of the cells in the biological process (or models thereof) in response to the corresponding state(s) of the mRNA / protein expression, and any other related states in 2) above.

4) integrating functional assay data thus obtained in 3) with proteomics and other data obtained in 2), and determining protein associations as driven by causality, by employing artificial intelligence based (AI-based) informatics system or platform. Such an AI-based system is based on, and preferably based only on, the data sets obtained in 2) and/or 3), without resorting to existing knowledge concerning the biological process. Preferably, no data points are statistically or artificially cut-off. Instead, all obtained data is fed into the AI-system for determining protein associations. One goal or output of the integration process is one or more differential networks (otherwise may be referred to herein as “delta networks,” or, in some cases, “delta-delta networks” as the case may be) between the different biological states (e.g., disease vs. normal state).

5) profiling the outputs from the AI-based informatics platform to explore each hub of activity as a potential therapeutic target and/or biomarker. Such profiling can be done entirely in silico based on the obtained data sets, without resorting to any actual wet-lab experiments.

6) validating hub of activity by employing molecular and cellular techniques. Such post-informatic validation of output with wet-lab cell-based experiments may be optional, but they help to create a full-circle of interrogation.

Any or all of the approaches outlined above may be used in any specific application concerning any biological process, depending, at least in part, on the nature of the specific application. That is, one or more approaches outlined above may be omitted or modified, and one or more additional approaches may be employed, depending on specific application.
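The differential (“delta”) and “delta-delta” networks mentioned above can be pictured as set operations on network edges. The following is a minimal sketch under stated assumptions, not the Platform's implementation: each network is reduced to a plain set of directed (source, target) edges, the state labels NG, HG and HGT1 are borrowed from the description of Figure 30, and the function names and toy node names are hypothetical.

```python
# Illustrative sketch only: networks are modeled as plain sets of directed edges.
# Function names and toy node names are assumptions, not part of the disclosure.

def unique_edges(network, reference):
    """Delta network: edges present in `network` but absent from `reference`."""
    return network - reference


def restored_edges(normal, disease, treated):
    """Delta-delta network: edges unique to the normal state (vs. disease)
    that are also unique to the treated state (vs. disease)."""
    return unique_edges(normal, disease) & unique_edges(treated, disease)


# Toy edge sets with hypothetical node names.
ng = {("A", "B"), ("B", "C"), ("C", "D")}      # normal state
hg = {("A", "B"), ("C", "E")}                  # disease state
hgt1 = {("A", "B"), ("B", "C"), ("C", "E")}    # disease state plus treatment T1

delta_ng = unique_edges(ng, hg)                # edges lost in the disease state
delta_delta = restored_edges(ng, hg, hgt1)     # edges restored by treatment T1

print(sorted(delta_ng))
print(sorted(delta_delta))
```

In this toy case the edge from B to C is missing in the disease state and reappears under treatment, so it is the only member of the delta-delta set. Real networks would also carry edge weights or probabilities, which this sketch ignores.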

Various schematics illustrating the platform are provided. In particular, an illustration of an exemplary approach to identify therapeutics using the platform is depicted in Figure 1. An illustration of systems biology of cancer and the consequence of integrated multi-physiological interactive output regulation is depicted in Figure 2.

An illustration of a systematic interrogation of biological relevance using MIMS is depicted in Figure 3. An illustration of modeling a cancer network to enable an interrogative biological query is depicted in Figure 4.

Illustrations of the interrogative biology platform and technologies employed in the platform are depicted in Figures 5 and 6. A schematic representation of the components of the platform including data collection, data integration, and data mining is depicted in Figure 7. A schematic representation of a systematic interrogation using MIMS and collection of response data from the “omics” cascade is depicted in Figure 8.

Figure 14 is a high level flow chart of an exemplary method 10, in which components of an exemplary system that may be used to perform the exemplary method are indicated. Initially, a model (e.g., an in vitro model) is established for a biological process (e.g., a disease process) and/or components of the biological process (e.g., disease physiology and pathophysiology) using cells normally associated with the biological process (step 12). For example, the cells may be human-derived cells that normally participate in the biological process (e.g., disease). The cell model may include various cellular cues, conditions, and/or perturbations that are specific to the biological process (e.g., disease). Ideally, the cell model represents various (disease) states and flux components of the biological process (e.g., disease), instead of a static assessment of the biological process. The comparison cell model may include control cells or normal (e.g., non-diseased) cells. Additional description of the cell models appears below in sections III.A and IV.

A first data set is obtained from the cell model for the biological process, which includes information representing expression levels of a plurality of genes (e. g., mRNA and/or protein signatures) (step 16) using any known process or system (e.g., quantitative polymerase chain reaction (qPCR) and proteomics analysis tools such as Mass Spectrometry (MS)).

A third data set is obtained from the comparison cell model for the biological process (step 18). The third data set includes information representing expression levels of a plurality of genes in the comparison cells from the comparison cell model.

In certain embodiments of the methods of the invention, these first and third data sets are collectively referred to herein as a “first data set” that represents expression levels of a plurality of genes in the cells (all cells including comparison cells) associated with the biological system.

The first data set and third data set may be obtained from one or more mRNA and/or Protein Signature Analysis System(s). The mRNA and protein data in the first and third data sets may represent biological reactions to environment and/or perturbation. Where applicable and possible, lipidomics, metabolomics, and transcriptomics data may also be integrated as supplemental or alternative measures for the biological process. SNP analysis is another component that may be used at times in the process. It may be helpful for investigating, for example, whether a single-nucleotide polymorphism (SNP) or a specific mutation has any effect on the biological process. The data variables may be used to describe the biological process, either as a static “snapshot,” or as a representation of a dynamic process. Additional description regarding obtaining information representing expression levels of a plurality of genes in cells appears below in section III.B.

A second data set is obtained from the cell model for the biological process, which includes information representing a functional activity or response of cells (step ). Similarly, a fourth data set is obtained from the comparison cell model for the biological process, which includes information representing a functional activity or response of the comparison cells (step 22).

In certain embodiments of the methods of the invention, these second and fourth data sets are collectively referred to herein as a “second data set” that represents a functional activity or a cellular response of the cells (all cells including comparison cells) associated with the biological system.

One or more functional assay systems may be used to obtain information regarding the functional activity or response of cells or of comparison cells. The information regarding functional cellular responses to cues and perturbations may include, but is not limited to, bioenergetics profiling, cell proliferation, apoptosis, and organellar function. Functional models for processes and pathways (e.g., adenosine triphosphate (ATP), reactive oxygen species (ROS), oxidative phosphorylation (OXPHOS), Seahorse assays, caspase assay, migration assay, chemotaxis assay, tube formation assay, etc.) may be employed to obtain true genotype-phenotype association.

The functional activity or cellular responses represent the reaction of the cells in the biological process (or models thereof) in response to the corresponding state(s) of the mRNA / protein expression, and any other related applied conditions or perturbations.

Additional information regarding obtaining information representing functional activity or response of cells is provided below in section III.B.

The method also includes generating computer-implemented models of the biological processes in the cells and in the control cells. For example, one or more (e.g., an ensemble of) Bayesian networks of causal relationships between the expression level of the plurality of genes and the functional activity or cellular response may be generated for the cell model (the “generated cell model networks”) from the first data set and the second data set (step 24). The generated cell model networks, individually or collectively, include quantitative probabilistic directional information regarding relationships. The generated cell model networks are not based on known biological relationships between gene expression and/or functional activity or cellular response, other than information from the first data set and second data set. The one or more generated cell model networks may collectively be referred to as a consensus cell model network.

One or more (e.g., an ensemble of) Bayesian networks of causal relationships between the expression level of the plurality of genes and the functional activity or cellular response may be generated for the comparison cell model (the “generated comparison cell model networks”) from the first data set and the second data set (step 26). The generated comparison cell model networks, individually or collectively, include quantitative probabilistic directional information regarding relationships. The generated cell networks are not based on known biological relationships between gene expression and/or functional activity or cellular response, other than the information in the first data set and the second data set. The one or more generated comparison model networks may collectively be referred to as a consensus cell model network.

The generated cell model networks and the generated comparison cell model networks may be created using an artificial intelligence based (AI-based) informatics platform. Further details regarding the creation of the generated cell model networks, the creation of the generated comparison cell model networks and the AI-based informatics system appear below in section III.C and in the description of Figures 2A-3.

It should be noted that many different AI-based platforms or systems may be employed to generate the Bayesian networks of causal relationships including quantitative probabilistic directional information. Although certain examples described herein employ one specific commercially available system, i.e., REFS™ (Reverse Engineering/Forward Simulation) from GNS (Cambridge, MA), embodiments are not so limited. AI-based systems or platforms suitable to implement some embodiments employ mathematical algorithms to establish causal relationships among the input variables (e.g., the first and second data sets), based only on the input data without taking into consideration prior existing knowledge about any potential, established, and/or verified biological relationships.

For example, the REFS™ AI-based informatics platform utilizes experimentally derived raw (original) or minimally processed input biological data (e.g., genetic, genomic, epigenetic, proteomic, metabolomic, and clinical data), and rapidly performs trillions of calculations to determine how molecules interact with one another in a complete system. The REFS™ AI-based informatics platform performs a reverse engineering process aimed at creating an in silico computer-implemented cell model (e.g., generated cell model networks), based on the input data, that quantitatively represents the underlying biological system. Further, hypotheses about the underlying biological system can be developed and rapidly simulated based on the computer-implemented cell model, in order to obtain predictions, accompanied by associated confidence levels, regarding the hypotheses.

With this approach, biological systems are represented by quantitative computer-implemented cell models in which “interventions” are simulated to learn detailed mechanisms of the biological system (e.g., disease), effective intervention strategies, and/or clinical biomarkers that determine which patients will respond to a given treatment regimen. Conventional bioinformatics and statistical approaches, as well as approaches based on the modeling of known biology, are typically unable to provide these types of insights.

After the generated cell model networks and the generated comparison cell model networks are created, they are compared. One or more causal relationships present in at least some of the generated cell model networks, and absent from, or having at least one significantly different parameter in, the generated comparison cell model networks are identified (step 28). Such a comparison may result in the creation of a differential network. The comparison, identification, and/or differential (delta) network creation may be conducted using a differential network creation module, which is described in further detail below in section III.D and with respect to the description of Figure 26.
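
The presence/absence part of the comparison described above can be illustrated with a minimal sketch. It assumes each network is represented as a set of directed (cause, effect) edges; the function names and the two frequency thresholds are hypothetical choices made for illustration and are not taken from the specification. The sketch does not cover relationships that differ only in a parameter value.

```python
from collections import Counter

def edge_frequencies(ensemble):
    """Fraction of networks in the ensemble that contain each directed edge."""
    counts = Counter(edge for network in ensemble for edge in set(network))
    return {edge: n / len(ensemble) for edge, n in counts.items()}

def differential_network(case_ensemble, comparison_ensemble,
                         min_case_freq=0.5, max_comparison_freq=0.1):
    """Edges common in the case ensemble and (nearly) absent from the comparison.

    The thresholds are illustrative assumptions, not values from the source.
    """
    case = edge_frequencies(case_ensemble)
    comparison = edge_frequencies(comparison_ensemble)
    return {edge: freq for edge, freq in case.items()
            if freq >= min_case_freq
            and comparison.get(edge, 0.0) <= max_comparison_freq}

# Toy ensembles: each network is a set of (cause, effect) pairs.
disease = [{("A", "B"), ("B", "C")}, {("A", "B")}, {("A", "B"), ("C", "D")}]
normal = [{("B", "C")}, {("B", "C"), ("C", "D")}, {("B", "C")}]

delta = differential_network(disease, normal)
print(delta)
```

In this toy input only the A-to-B relationship is frequent in the disease ensemble and absent from the normal ensemble, so it is the only edge retained.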

In some embodiments, input data sets are from one cell type and one comparison cell type, which creates an ensemble of cell model networks based on the one cell type and another ensemble of comparison cell model networks based on the one comparison control cell type. A differential may be performed between the ensemble of networks of the one cell type and the ensemble of networks of the comparison cell type(s).

In other embodiments, input data sets are from multiple cell types (e.g., two or more cancer cell types, or two or more cell types in different angiogenic states, e.g., induced by different pro-angiogenic stimuli) and multiple comparison cell types (e.g., two or more normal, non-cancerous cell types, or two or more non-angiogenic and angiogenic cell types). An ensemble of cell model networks may be generated for each cell type and each comparison cell type individually, and/or data from the multiple cell types and the multiple comparison cell types may be combined into respective composite data sets.

The composite data sets produce an ensemble of networks corresponding to the multiple cell types (composite data) and another ensemble of networks corresponding to the multiple comparison cell types (comparison composite data). A differential may be performed on the ensemble of networks for the composite data as compared to the ensemble of networks for the comparison composite data.

In some embodiments, a differential may be performed between two different differential networks. This output may be referred to as a delta network, and is described below with respect to Figure 26.

Quantitative relationship information may be identified for each relationship in the generated cell model networks (step 30). Similarly, quantitative relationship information for each relationship in the generated comparison cell model networks may be identified (step 32). The quantitative information regarding the relationship may include a direction indicating causality, a measure of the statistical uncertainty regarding the relationship (e.g., an Area Under the Curve (AUC) statistical measurement), and/or an expression of the quantitative magnitude of the strength of the relationship (e.g., a fold). The various relationships in the generated cell model networks may be profiled using the quantitative relationship information to explore each hub of activity in the networks as a potential therapeutic target and/or biomarker. Such profiling can be done entirely in silico based on the results from the generated cell model networks, without resorting to any actual wet-lab experiments.
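
As an illustration of how the quantitative relationship information described above might be held and used to profile hubs, the following is a minimal sketch. The `Relationship` record, the `rank_hubs` helper, the AUC cutoff, and the node names are all hypothetical; ranking hubs by the number of retained relationships is one simple choice, not a method stated in the specification.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Relationship:
    source: str   # causal parent (direction indicating causality)
    target: str   # causal child
    auc: float    # statistical measure attached to the relationship (e.g., AUC)
    fold: float   # quantitative magnitude of the relationship (e.g., a fold)

def rank_hubs(relationships, min_auc=0.7):
    """Rank nodes by how many retained relationships touch them.

    min_auc is an illustrative cutoff, not a value from the source.
    """
    degree = defaultdict(int)
    for rel in relationships:
        if rel.auc >= min_auc:
            degree[rel.source] += 1
            degree[rel.target] += 1
    # Highest degree first; ties broken alphabetically for reproducibility.
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))

edges = [
    Relationship("P1", "P2", auc=0.90, fold=2.0),
    Relationship("P1", "ATP", auc=0.80, fold=1.5),
    Relationship("P3", "P1", auc=0.75, fold=0.5),
    Relationship("P3", "ROS", auc=0.60, fold=3.0),  # dropped by the cutoff
]

hubs = rank_hubs(edges)
print(hubs)
```

Here node P1 participates in three retained relationships and would be the first hub examined as a candidate target or biomarker.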

In some embodiments, a hub of activity in the networks may be validated by employing molecular and cellular techniques. Such post-informatic validation of output with wet-lab cell-based experiments need not be performed, but it may help to create a full circle of interrogation.

Figure 15 schematically depicts a simplified high-level representation of the functionality of an exemplary AI-based informatics system (e.g., REFS™ AI-based informatics system) and interactions between the AI-based system and other elements or portions of an interrogative biology platform (“the Platform”). In Figure 15A, various data sets obtained from a model for a biological process (e.g., a disease model), such as drug dosage, treatment dosage, protein expression, mRNA expression, and any of many associated functional measures (such as OCR, ECAR) are fed into an AI-based system. As shown in Figure 15B, from the input data sets, the AI-based system creates a library of “network fragments” that includes variables (proteins, lipids and metabolites) that drive molecular mechanisms in the biological process (e.g., disease), in a process referred to as Bayesian Fragment Enumeration.

In Figure 15C, the AI-based system selects a subset of the network fragments in the library and constructs an initial trial network from the fragments. The AI-based system also selects a different subset of the network fragments in the library to construct another initial trial network. Eventually an ensemble of initial trial networks is created (e.g., 1000 networks) from different subsets of network fragments in the library. This process may be termed parallel ensemble sampling. Each trial network in the ensemble is evolved or optimized by adding, subtracting and/or substituting additional network fragments from the library. If additional data is obtained, the additional data may be incorporated into the network fragments in the library and may be incorporated into the ensemble of trial networks through the evolution of each trial network. After completion of the optimization/evolution process, the ensemble of trial networks may be described as the generated cell model networks.
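
The sampling-then-evolution idea above can be sketched with a toy example. This is not the REFS™ algorithm: it assumes each fragment already carries a numeric score, uses a made-up objective (total score minus a per-fragment penalty), ignores structural constraints such as acyclicity, and accepts only improving moves (simple hill climbing). All names and numbers are hypothetical.

```python
import random

def score(network, fragment_scores, penalty=0.5):
    """Toy objective: summed fragment scores minus a complexity penalty."""
    return sum(fragment_scores[f] for f in network) - penalty * len(network)

def evolve(network, fragment_scores, rng, steps=200):
    """Hill-climb by adding, removing, or substituting fragments."""
    library = list(fragment_scores)
    current = set(network)
    for _ in range(steps):
        candidate = set(current)
        move = rng.choice(("add", "remove", "substitute"))
        if move in ("remove", "substitute") and candidate:
            candidate.remove(rng.choice(sorted(candidate)))
        if move in ("add", "substitute"):
            candidate.add(rng.choice(library))
        # Keep the change only if the toy objective improves.
        if score(candidate, fragment_scores) > score(current, fragment_scores):
            current = candidate
    return current

def build_ensemble(fragment_scores, n_networks=10, initial_size=3, seed=0):
    """Start each trial network from a different random subset of the library."""
    rng = random.Random(seed)
    library = list(fragment_scores)
    ensemble = []
    for _ in range(n_networks):
        trial = set(rng.sample(library, initial_size))
        ensemble.append(evolve(trial, fragment_scores, rng))
    return ensemble

# Hypothetical fragment library: (cause, effect) -> score.
fragment_scores = {
    ("A", "B"): 2.0, ("B", "C"): 1.0, ("C", "D"): 0.2,
    ("A", "D"): 0.1, ("D", "E"): 0.9,
}
ensemble = build_ensemble(fragment_scores)
print(len(ensemble), "trial networks evolved")
```

Because only improving moves are accepted, each evolved network scores at least as well as its starting subset; a practical system would use a statistically grounded score and a less greedy search.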

As shown in Figure 15D, the ensemble of generated cell model networks may be used to simulate the behavior of the biological system. The simulation may be used to predict behavior of the biological system to changes in conditions, which may be experimentally verified using wet-lab cell-based, or animal-based, experiments. Also, quantitative parameters of relationships in the generated cell model networks may be extracted using the simulation functionality by applying simulated perturbations to each node individually while observing the effects on the other nodes in the generated cell model networks. Further detail is provided below in section III.C.

The automated reverse engineering process of the AI-based informatics system, which is depicted in Figures 2A-2D, creates an ensemble of generated cell model networks that is an unbiased and systematic computer-based model of the cells.

The reverse engineering determines the probabilistic directional network connections between the molecular measurements in the data, and the phenotypic outcomes of interest. The variation in the molecular measurements enables learning of the probabilistic cause and effect relationships between these entities and changes in endpoints. The machine learning nature of the platform also enables cross training and predictions based on a data set that is constantly evolving.

The network connections between the molecular measurements in the data are “probabilistic,” partly because the connection may be based on correlations between the observed data sets “learned” by the computer algorithm. For example, if the expression level of protein X and that of protein Y are positively or negatively correlated, based on statistical analysis of the data set, a causal relationship may be assigned to establish a network connection between proteins X and Y. The reliability of such a putative causal relationship may be further defined by a likelihood of the connection, which can be measured by p-value (e.g., p < 0.1, 0.05, 0.01, etc.).
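
The protein X / protein Y example above can be made concrete with a small sketch that computes a Pearson correlation and a permutation-based p-value using only the Python standard library. The expression values, function names, number of permutations, and the 0.05 cutoff are illustrative assumptions. Note that a correlation test alone only flags a candidate connection; it does not by itself establish the direction of causality.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided p-value: how often shuffled data correlate as strongly."""
    rng = random.Random(seed)
    observed = abs(pearson(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(x, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical expression levels of two proteins across eight samples.
protein_x = [1.0, 2.1, 2.9, 4.2, 5.1, 5.8, 7.2, 8.0]
protein_y = [2.0, 4.1, 6.2, 8.1, 9.8, 12.1, 14.0, 16.2]

r = pearson(protein_x, protein_y)
p = permutation_p_value(protein_x, protein_y)
candidate_edge = p < 0.05  # illustrative significance cutoff
print(f"r = {r:.3f}, candidate connection: {candidate_edge}")
```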

The network connections between the molecular measurements in the data are “directional,” partly because the network connections between the molecular measurements, as determined by the reverse-engineering process, reflect the cause and effect of the relationship between the connected genes / proteins, such that raising the expression level of one protein may cause the expression level of the other to rise or fall, depending on whether the connection is stimulatory or inhibitory.

The network connections between the molecular measurements in the data are “quantitative,” partly because the network connections between the molecular measurements, as determined by the process, may be simulated in silico, based on the existing data set and the probabilistic measures associated therewith. For example, in the established network connections between the molecular measurements, it may be possible to theoretically increase or decrease (e.g., by 1, 2, 3, 5, 10, 20, 30, -fold or more) the expression level of a given protein (or a “node” in the network), and quantitatively simulate its effects on other connected proteins in the network.
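
A minimal sketch of such an in silico fold-change simulation follows. It assumes, purely for illustration, an acyclic network whose edges carry signed weights (positive for stimulatory, negative for inhibitory) and a log-linear response in which a child's log2 fold change is the weighted sum of its parents' log2 fold changes. The function name, the weights, and the response model are hypothetical and are not taken from the specification. It uses `graphlib`, which requires Python 3.9 or later.

```python
import math
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def simulate_fold_change(edges, node, fold):
    """Propagate a fold change applied to `node` through a weighted DAG.

    edges maps (parent, child) -> signed weight. The perturbed node is
    clamped to the requested fold change regardless of its own parents.
    """
    parents = {}
    nodes = {node}
    for (src, dst), weight in edges.items():
        parents.setdefault(dst, {})[src] = weight
        nodes.update((src, dst))
    # static_order() yields every node after all of its parents.
    order = TopologicalSorter(
        {n: set(parents.get(n, {})) for n in nodes}
    ).static_order()
    log_change = {}
    for n in order:
        if n == node:
            log_change[n] = math.log2(fold)
        else:
            log_change[n] = sum(w * log_change[p]
                                for p, w in parents.get(n, {}).items())
    return {n: 2 ** value for n, value in log_change.items()}

# X stimulates Y; Y inhibits Z (weights are made up for the example).
edges = {("X", "Y"): 1.0, ("Y", "Z"): -0.5}
result = simulate_fold_change(edges, "X", fold=4)
print(result)
```

With these toy weights, a 4-fold increase in X yields a 4-fold increase in Y and a 0.5-fold level of Z.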

The network connections between the molecular measurements in the data are “unbiased,” at least partly because no data points are statistically or artificially cut off, and partly because the network connections are based on input data alone, without referring to pre-existing knowledge about the biological process in question.

The network connections between the molecular measurements in the data are “systemic” (and unbiased), partly because all potential connections among all input variables have been systemically explored, for example, in a pair-wise fashion. The reliance on computing power to execute such systemic probing exponentially increases as the number of input variables increases.
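
To give a sense of how quickly the search space grows with the number of input variables, the sketch below counts two quantities: the number of unordered variable pairs, and the number of distinct directed acyclic graphs on n labeled nodes (computed with Robinson's recurrence, a standard combinatorial result). Counting directed acyclic graphs is offered here only as an illustrative proxy for the space of candidate network structures; the specification does not state how its search space is measured.

```python
from functools import lru_cache
from math import comb

def pairwise_candidates(n):
    """Number of unordered pairs among n input variables."""
    return comb(n, 2)

@lru_cache(maxsize=None)
def labeled_dags(n):
    """Number of directed acyclic graphs on n labeled nodes (Robinson's recurrence)."""
    if n == 0:
        return 1
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * labeled_dags(n - k)
        for k in range(1, n + 1)
    )

for n in (2, 3, 4, 10):
    print(n, pairwise_candidates(n), labeled_dags(n))
```

The pair count grows only quadratically, while the number of possible directed acyclic structures grows far faster, which is why exhaustive structure search becomes computationally demanding as variables are added.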

In general, an ensemble of ~1,000 networks is usually sufficient to predict probabilistic causal quantitative relationships among all of the measured entities. The ensemble of networks captures uncertainty in the data and enables the calculation of confidence metrics for each model prediction. Predictions are generated using the ensemble of networks together, where differences in the predictions from individual networks in the ensemble represent the degree of uncertainty in the prediction. This feature enables the assignment of confidence metrics for predictions of clinical response generated from the model.
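
The way an ensemble yields both a prediction and a measure of its uncertainty can be sketched as follows. Each "network" is stood in for by a simple callable, and the spread is summarized with a sample standard deviation; both are assumptions made for this illustration, and the names and weights are hypothetical.

```python
from statistics import mean, stdev

def ensemble_prediction(models, inputs):
    """Return the ensemble mean and the spread of the individual predictions."""
    predictions = [model(inputs) for model in models]
    return mean(predictions), stdev(predictions)

# Three stand-in models that disagree slightly about the effect of protein_X.
models = [lambda x, w=w: w * x["protein_X"] for w in (0.8, 1.0, 1.2)]

estimate, spread = ensemble_prediction(models, {"protein_X": 2.0})
print(f"prediction = {estimate:.2f} +/- {spread:.2f}")
```

A small spread indicates that the networks in the ensemble agree, which would correspond to a higher-confidence prediction; a large spread flags a prediction that the data do not pin down.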

Once the models are reverse-engineered, further simulation queries may be conducted on the ensemble of models to determine key molecular drivers for the biological process in question, such as a disease condition.

A sketch of the components employed to build exemplary in vitro models representing normal and diabetic states is depicted in Figure 9. A schematic representation of an exemplary informatics platform, REFS™, used to generate causal networks of the proteins as they relate to disease pathophysiology is depicted in Figure 10. A schematic representation of an exemplary approach towards generation of a differential network in diabetic versus normal states, and of diabetic nodes that are restored to normal states by treatment with MIMs, is depicted in Figure 11. A representative differential network in diabetic versus normal states is depicted in Figure 12. A schematic representation of a node and associated edges of interest (Node1 in the center) and the cellular functionality associated with each edge is depicted in Figure 13.

The invention having been generally described above, the sections below provide more detailed description for various aspects or elements of the general invention, in conjunction with one or more specific biological systems that can be analyzed using the methods herein. It should be noted, however, that the specific biological systems used for illustration purposes below are not limiting. To the contrary, it is intended that other distinct biological systems, including any alternatives, modifications, and equivalents thereof, may be analyzed similarly using the subject Platform technology.

II. Definitions

As used herein, certain terms intended to be specifically defined, but not already defined in other sections of the specification, are defined herein.

The articles “a” and “an” are used herein to refer to one or to more than one (i.e., to at least one) of the grammatical object of the article. By way of example, “an element” means one element or more than one element.

The term “including” is used herein to mean, and is used interchangeably with, the phrase “including but not limited to.” The term “or” is used herein to mean, and is used interchangeably with, the term “and/or,” unless context clearly indicates otherwise.

The term “such as” is used herein to mean, and is used interchangeably with, the phrase “such as but not limited to.”

“Metabolic pathway” refers to a sequence of enzyme-mediated reactions that transform one compound to another and provide intermediates and energy for cellular functions. The metabolic pathway can be linear or cyclic or branched.

“Metabolic state” refers to the molecular content of a particular cellular, multicellular or tissue environment at a given point in time as measured by various chemical and biological indicators as they relate to a state of health or disease.

“Angiogenesis” refers to the physiological process involving the growth of new blood vessels from pre-existing vessels. Angiogenesis includes at least the proliferation of vascular endothelial cells, the migration of vascular endothelial cells typically in response to chemotactic agents, the degradation of extracellular matrix typically by matrix metalloprotease production, matrix metalloproteinase production, tube formation, vessel lumen formation, vessel sprouting, adhesion molecule expression, typically integrin expression, and differentiation. Depending on the culture system (e.g., one dimensional vs. three dimensional) and the cell type, various aspects of angiogenesis can be observed in cells grown in vitro as well as in vivo. Angiogenic cells or cells exhibiting at least one characteristic of an angiogenic cell exhibit 1, 2, 3, 4, 5, 6, 7, 8, 9, or more characteristics set forth above. Modulators of angiogenesis increase or decrease at least one of the characteristics provided above. Angiogenesis is distinct from vasculogenesis, which is the spontaneous formation of blood vessels, and from intussusception, which is the term for the formation of new blood vessels by the splitting of existing ones.

The term “microarray” refers to an array of distinct polynucleotides, oligonucleotides, polypeptides (e.g., antibodies) or peptides synthesized on a substrate, such as paper, nylon or other type of membrane, filter, chip, glass slide, or any other suitable solid support.

The terms “disorders” and “diseases” are used inclusively and refer to any deviation from the normal structure or function of any part, organ or system of the body (or any combination thereof). A specific disease is manifested by characteristic symptoms and signs, including biological, chemical and physical changes, and is often associated with a variety of other factors including, but not limited to, demographic, environmental, employment, genetic and medically historical factors. Certain characteristic signs, symptoms, and related factors can be quantitated through a variety of methods to yield important diagnostic information.

The term “expression” includes the process by which a polypeptide is produced from polynucleotides, such as DNA. The process may involve the transcription of a gene into mRNA and the translation of this mRNA into a polypeptide. Depending on the context in which it is used, “expression” may refer to the production of RNA, protein or both.

The terms “level of expression of a gene” or “gene expression level” refer to the level of mRNA, as well as pre-mRNA nascent transcript(s), transcript processing intermediates, mature mRNA(s) and degradation products, or the level of protein, encoded by the gene in the cell.

The term “modulation” refers to upregulation (i.e., activation or stimulation), downregulation (i.e., inhibition or suppression) of a response, or the two in combination or apart. A “modulator” is a compound or molecule that modulates, and may be, e.g., an agonist, antagonist, activator, stimulator, suppressor, or inhibitor.

The phrase “affects the modulator” is understood as altering the expression of, altering the level of, or altering the activity of the modulator.

The term “Trolamine,” as used herein, refers to Trolamine NF, Triethanolamine, TEALAN®, TEAlan 99%, Triethanolamine, 99%, Triethanolamine, NF or Triethanolamine, 99%, NF. These terms may be used interchangeably herein.

The term “genome” refers to the entirety of a biological entity’s (cell, tissue, organ, system, organism) genetic information. It is encoded either in DNA or RNA (in certain viruses, for example). The genome includes both the genes and the non-coding sequences of the DNA.

The term “proteome” refers to the entire set of proteins expressed by a genome, a cell, a tissue, or an organism at a given time. More specifically, it may refer to the entire set of expressed proteins in a given type of cells or an organism at a given time under defined conditions. Proteome may include protein variants due to, for example, alternative splicing of genes and/or post-translational modifications (such as glycosylation or phosphorylation).

The term “transcriptome” refers to the entire set of transcribed RNA molecules, including mRNA, rRNA, tRNA, microRNA, dicer substrate RNAs, and other non-coding RNA produced in one or a population of cells at a given time. The term can be applied to the total set of transcripts in a given organism, or to the specific subset of transcripts present in a particular cell type. Unlike the genome, which is roughly fixed for a given cell line (excluding mutations), the transcriptome can vary with external environmental conditions. Because it includes all mRNA transcripts in the cell, the transcriptome reflects the genes that are being actively expressed at any given time, with the exception of mRNA degradation phenomena such as transcriptional attenuation.

The study of transcriptomics, also referred to as expression profiling, examines the expression level of mRNAs in a given cell population, often using high-throughput techniques based on DNA microarray technology.

The term “metabolome” refers to the complete set of small-molecule metabolites (such as metabolic intermediates, hormones and other signaling molecules, and secondary metabolites) to be found within a biological sample, such as a single organism, at a given time under a given condition. The metabolome is dynamic, and may change from second to second.

The term “lipidome” refers to the complete set of lipids to be found within a biological sample, such as a single organism, at a given time under a given condition.

The lipidome is dynamic, and may change from second to second.

The term “interactome” refers to the whole set of molecular interactions in a biological system under study (e.g., cells). It can be displayed as a directed graph. Molecular interactions can occur between molecules belonging to different biochemical families (proteins, nucleic acids, lipids, carbohydrates, etc.) and also within a given family. When spoken of in terms of proteomics, interactome refers to the protein-protein interaction network (PPI), or protein interaction network (PIN). Another extensively studied type of interactome is the protein-DNA interactome (the network formed by transcription factors (and DNA or chromatin regulatory proteins) and their target genes).

The term “cellular output” includes a collection of parameters, preferably measurable parameters, relating to cellular status, including (without limiting): level of transcription for one or more genes (e.g., measurable by RT-PCR, qPCR, microarray, etc.), level of expression for one or more proteins (e.g., measurable by mass spectrometry or Western blot), absolute activity (e.g., measurable as substrate conversion rates) or relative activity (e.g., measurable as a % value compared to maximum activity) of one or more enzymes or proteins, level of one or more metabolites or intermediates, level of oxidative phosphorylation (e.g., measurable by Oxygen Consumption Rate or OCR), level of glycolysis (e.g., measurable by Extracellular Acidification Rate or ECAR), extent of ligand-target binding or interaction, activity of cellular secreted molecules, etc. The cellular output may include data for a pre-determined number of target genes or proteins, etc., or may include a global assessment for all detectable genes or proteins. For example, mass spectrometry may be used to identify and/or quantitate all detectable proteins expressed in a given sample or cell population, without prior knowledge as to whether any specific protein may be expressed in the sample or cell population.

As used herein, a “cell system” includes a population of homogeneous or heterogeneous cells. The cells within the system may be growing in vivo, under the natural or physiological environment, or may be growing in vitro in, for example, controlled tissue culture environments. The cells within the system may be relatively homogeneous (e.g., no less than 70%, 80%, 90%, 95%, 99%, 99.5%, 99.9% homogeneous), or may contain two or more cell types, such as cell types usually found to grow in close proximity in vivo, or cell types that may interact with one another in vivo through, e.g., paracrine or other long distance inter-cellular communication. The cells within the cell system may be derived from established cell lines, including cancer cell lines, immortal cell lines, or normal cell lines, or may be primary cells or cells freshly isolated from live tissues or organs.

Cells in the cell system are typically in contact with a “cellular environment” that may provide nutrients, gases (oxygen or CO2, etc.), chemicals, or proteinaceous / non-proteinaceous stimulants that may define the conditions that affect cellular behavior.

The cellular environment may be a chemical media with defined chemical components and/or less well-defined tissue extracts or serum components, and may include a specific pH, CO2 content, pressure, and temperature under which the cells grow. Alternatively, the cellular environment may be the natural or physiological environment found in vivo for the specific cell system.

In certain embodiments, a cell environment comprises conditions that simulate an aspect of a biological system or process, e.g., simulate a disease state, process, or environment. Such culture conditions include, for example, hyperglycemia, hypoxia, or lactic-rich conditions. Numerous other such conditions are described herein.

In certain embodiments, a cellular environment for a specific cell system also includes certain cell surface features of the cell system, such as the types of receptors or ligands on the cell surface and their respective activities, the structure of carbohydrate or lipid molecules, membrane polarity or fluidity, status of clustering of certain membrane proteins, etc. These cell surface features may affect the function of nearby cells, such as cells belonging to a different cell system. In certain other embodiments, however, the cellular environment of a cell system does not include cell surface features of the cell system.

The cellular environment may be altered to become a “modified cellular environment.” Alterations may include changes (e.g., increase or decrease) in any one or more components found in the cellular environment, including addition of one or more “external stimulus components” to the cellular environment. The environmental perturbation or external stimulus component may be endogenous to the cellular environment (e.g., the cellular environment contains some levels of the stimulant, and more of the same is added to increase its level), or may be exogenous to the cellular environment (e.g., the stimulant is largely absent from the cellular environment prior to the alteration). The cellular environment may further be altered by secondary changes resulting from adding the external stimulus component, since the external stimulus component may change the cellular output of the cell system, including molecules secreted into the cellular environment by the cell system.

As used herein, “external stimulus component”, also referred to herein as “environmental perturbation”, includes any external physical and/or chemical stimulus that may affect cellular function. This may include any large or small organic or inorganic molecules, natural or synthetic chemicals, temperature shift, pH change, radiation, light (UVA, UVB, etc.), microwave, sonic wave, electrical current, modulated or unmodulated magnetic fields, etc.

The term “Multidimensional Intracellular Molecule (MIM)” refers to an isolated version or synthetically produced version of an endogenous molecule that is naturally produced by the body and/or is present in at least one cell of a human. A MIM is capable of entering a cell, and the entry into the cell includes complete or partial entry into the cell as long as the biologically active portion of the molecule wholly enters the cell. MIMs are capable of inducing a signal transduction and/or gene expression mechanism within a cell. MIMs are multidimensional because the molecules have both a therapeutic and a carrier, e.g., drug delivery, effect. MIMs also are multidimensional because the molecules act one way in a disease state and a different way in a normal state. For example, in the case of CoQ-10, administration of CoQ-10 to a melanoma cell in the presence of VEGF leads to a decreased level of Bcl2 which, in turn, leads to a decreased oncogenic potential for the melanoma cell. In contrast, in a normal fibroblast, co-administration of CoQ-10 and VEGF has no effect on the levels of Bcl2.

In one embodiment, a MIM is also an epi-shifter. In another embodiment, a MIM is not an epi-shifter. In another embodiment, a MIM is characterized by one or more of the foregoing functions. In another embodiment, a MIM is characterized by two or more of the foregoing functions. In a further embodiment, a MIM is characterized by three or more of the foregoing functions. In yet another embodiment, a MIM is characterized by all of the foregoing functions. The skilled artisan will appreciate that a MIM of the invention is also intended to encompass a mixture of two or more endogenous molecules, wherein the mixture is characterized by one or more of the foregoing functions. The endogenous molecules in the mixture are present at a ratio such that the mixture functions as a MIM.

MIMs can be lipid based or non-lipid based molecules. Examples of MIMs include, but are not limited to, CoQ10, acetyl Co-A, palmityl Co-A, L-carnitine, and amino acids such as, for example, tyrosine, phenylalanine, and cysteine. In one embodiment, the MIM is a small molecule. In one embodiment of the invention, the MIM is not CoQ10. MIMs can be routinely identified by one of skill in the art using any of the assays described in detail herein. MIMs are described in further detail in US 12/777,902 (US 2011-0110914), the entire contents of which are expressly incorporated herein by reference.

As used herein, an “epimetabolic shifter” (epi-shifter) is a molecule that modulates the metabolic shift from a healthy (or normal) state to a disease state and vice versa, thereby maintaining or reestablishing cellular, tissue, organ, system and/or host health in a human. Epi-shifters are capable of effectuating normalization in a tissue microenvironment. For example, an epi-shifter includes any molecule which is capable, when added to or depleted from a cell, of affecting the microenvironment (e.g., the metabolic state) of a cell. The skilled artisan will appreciate that an epi-shifter of the invention is also intended to encompass a mixture of two or more molecules, wherein the mixture is characterized by one or more of the foregoing functions. The molecules in the mixture are present at a ratio such that the mixture functions as an epi-shifter.

Examples of epi-shifters include, but are not limited to, CoQ10; vitamin D3; ECM components such as fibronectin; immunomodulators, such as TNFα or any of the interleukins, e.g., IL-5, IL-12, IL-23; angiogenic factors; and apoptotic factors.

In one embodiment, the epi-shifter also is a MIM. In one embodiment, the epi-shifter is not CoQ10. Epi-shifters can be routinely identified by one of skill in the art using any of the assays described in detail herein. Epi-shifters are described in further detail in US 12/777,902 (US 2011-0110914), the entire contents of which are expressly incorporated herein by reference.

Other terms not explicitly defined in the instant application have the meaning that would have been understood by one of ordinary skill in the art.

III. Exemplary Steps and Components of the Platform Technology

For illustration purposes only, the following steps of the subject Platform Technology may be described herein below as an exemplary utility for integrating data obtained from a custom built cancer model, and for identifying novel proteins / pathways driving the pathogenesis of cancer. Relational maps resulting from this analysis provide cancer treatment targets, as well as diagnostic / prognostic markers associated with cancer. However, the subject Platform Technology has general applicability for any biological system or process, and is not limited to any particular cancer or other specific disease models.

In addition, although the description below is presented in some portions as discrete steps, this is for illustration purposes and simplicity only, and does not imply such a rigid order and/or demarcation of steps. Moreover, the steps of the invention may be performed separately, and the invention provided herein is intended to encompass each of the individual steps separately, as well as combinations of one or more steps (e.g., any one, two, three, four, five, six or all seven steps) of the subject Platform Technology, which may be carried out independently of the remaining steps.

The invention also is intended to include all aspects of the Platform Technology as separate components and embodiments of the invention. For example, the generated data sets are intended to be embodiments of the invention. As further examples, the generated causal relationship networks, generated consensus causal relationship networks, and/or generated simulated causal relationship networks, are also intended to be embodiments of the invention. The causal relationships identified as being unique in the biological system are intended to be embodiments of the invention. Further, the custom built models for a particular biological system are also intended to be embodiments of the invention. For example, custom built models for a disease state or process, such as, e.g., models for angiogenesis, cell models for cancer, obesity/diabetes/cardiovascular disease, or a custom built model for toxicity (e.g., cardiotoxicity) of a drug, are also intended to be embodiments of the invention.

A. Custom Model Building

The first step in the Platform Technology is the establishment of a model for a biological system or process.

1. Angiogenesis models

Both in vitro and in vivo models of angiogenesis are known. For example, an in vitro model using human umbilical cord vascular endothelial cells (HUVECs) is provided in detail in the Examples. Briefly, when HUVECs are grown in sub-confluent cultures, they exhibit characteristics of angiogenic cells. When HUVECs are grown in confluent cultures, they do not exhibit characteristics of angiogenic cells. Most steps in the angiogenic cascade can be analyzed in vitro, including endothelial cell proliferation, migration and differentiation. The proliferation studies are based on cell counting, thymidine incorporation, or immunohistochemical staining for cell proliferation (by measurement of PCNA) or cell death (by terminal deoxynucleotidyl transferase-mediated dUTP nick end labeling, or TUNEL assay). Chemotaxis can be examined in a Boyden chamber, which consists of an upper and lower well separated by a membrane filter. Chemotactic solutions are placed in the lower well, cells are added to the top well, and after a period of incubation the cells that have migrated toward the chemotactic stimulus are counted on the lower surface of the membrane. Cell migration can also be studied using the “scratch” assay provided in the Examples below. Differentiation can be induced in vitro by culturing endothelial cells in different ECM components, including two- and three-dimensional fibrin clots, collagen gels and matrigel. Microvessels have also been shown to grow from rings of rat aorta embedded in a three-dimensional fibrin gel. Matrix metalloprotease expression can be assayed by zymogen assay.

Retinal vasculature is not fully formed in mice at the time of birth. Vascular growth and angiogenesis have been studied in detail in this model. Staged retina can be used to analyze angiogenesis as a normal biological process.

The chick chorioallantoic membrane (CAM) assay is well known in the art. The early chick embryo lacks a mature immune system and is therefore used to study tumor-induced angiogenesis. Tissue grafts are placed on the CAM through a window made in the eggshell. This causes a typical radial rearrangement of vessels towards, and a clear increase of vessels around, the graft within four days after implantation. Blood vessels entering the graft are counted under a stereomicroscope. To assess the anti-angiogenic or angiogenic activity of test substances, the compounds are either prepared in slow release polymer pellets, absorbed by gelatin sponges or air-dried on plastic discs and then implanted onto the CAM. Several variants of the CAM assay, including culturing of shell-less embryos in Petri dishes, and different quantification methods (i.e. measuring the rate of basement membrane biosynthesis using radio-labeled proline, counting the number of vessels under a microscope or image analysis) have been described.

The cornea presents an in vivo avascular site. Therefore, any vessels penetrating from the limbus into the corneal stroma can be identified as newly formed. To induce an angiogenic response, slow release polymer pellets [i.e. polyhydroxyethyl-methacrylate (hydron) or ethylene-vinyl acetate copolymer (ELVAX)], containing an angiogenic substance (i.e. FGF-2 or VEGF) are implanted in "pockets" created in the corneal stroma of a rabbit. Also, a wide variety of tissues, cells, cell extracts and conditioned media have been examined for their effect on angiogenesis in the cornea.

The vascular response can be quantified by computer image analysis after perfusion of the cornea with India ink. Cornea can be harvested and analyzed using the platform methods provided herein.

MATRIGEL® is a matrix of a mouse basement membrane neoplasm known as Engelbreth-Holm-Swarm murine sarcoma. It is a complex mixture of basement membrane proteins including laminin, collagen type IV, heparan sulfate, fibrin and growth factors, including EGF, TGF-b, PDGF and IGF-1. It was originally developed to study endothelial cell differentiation in vitro. However, MATRIGEL® containing FGF-2 can be injected subcutaneously in mice. MATRIGEL® is liquid at 4°C but forms a solid gel at 37°C that traps the growth factor to allow its slow release. Typically, after 10 days, the MATRIGEL® plugs are removed and angiogenesis is quantified histologically or morphometrically in plug sections. MATRIGEL® plugs can be harvested and analyzed using the platform methods provided herein.

2. In vitro disease models

An example of a biological system or process is cancer. Like any other complicated biological process or system, cancer is a complicated pathological condition characterized by multiple unique aspects. For example, due to their high growth rate, many cancer cells are adapted to grow in hypoxic conditions, and have up-regulated glycolysis and reduced oxidative phosphorylation metabolic pathways. As a result, cancer cells may react differently to an environmental perturbation, such as treatment by a potential drug, as compared to the reaction by a normal cell in response to the same treatment. Thus, it would be of interest to decipher cancer’s unique responses to drug treatment as compared to the responses of normal cells. To this end, a custom cancer model may be established to simulate the environment of a cancer cell, e.g., within a tumor in vivo, by creating cell culture conditions closely approximating the conditions of a cancer cell in a tumor in vivo, or to mimic various aspects of cancer growth, by isolating different growth conditions of the cancer cells.

One such cancer “environment”, or growth stress condition, is hypoxia, a condition typically found within a solid tumor. Hypoxia can be induced in cells using art-recognized methods. For example, hypoxia can be induced by placing cell systems in a Modular Incubator Chamber (MIC-101, Billups-Rothenberg Inc., Del Mar, CA), which can be flooded with an industrial gas mix containing 5% CO2, 2% O2 and 93% nitrogen. Effects can be measured after a pre-determined period, e.g., at 24 hours after hypoxia treatment, with and without additional external stimulus components (e.g., CoQ10 at 0, 50, or 100 µM).

Likewise, lactic acid treatment of cells mimics a cellular environment where glycolysis activity is high, as exists in the tumor environment in vivo. Lactic acid induced stress can be investigated at a final lactic acid concentration of about 12.5 mM at a pre-determined time, e.g., at 24 hours, with or without additional external stimulus components (e.g., CoQ10 at 0, 50, or 100 µM).

Hyperglycemia is normally a condition found in diabetes; however, hyperglycemia also to some extent mimics one aspect of cancer growth because many cancer cells rely on glucose as their primary source of energy. Exposing subject cells to a typical hyperglycemic condition may include adding 10% culture grade glucose to suitable media, such that the final concentration of glucose in the media is about 22 mM.

Individual conditions reflecting different aspects of cancer growth may be investigated separately in the custom built cancer model, and/or may be combined together. In one embodiment, combinations of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50 or more conditions reflecting or simulating different aspects of cancer growth / conditions are investigated in the custom built cancer model. In one embodiment, individual conditions and, in addition, combinations of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50 or more of the conditions reflecting or simulating different aspects of cancer growth / conditions are investigated in the custom built cancer model. All values presented in the foregoing list can also be the upper or lower limit of ranges that are intended to be a part of this invention, e.g., between 1 and 5, 1 and 10, 1 and 20, 1 and 30, 2 and 5, 2 and 10, 5 and 10, 1 and 20, 5 and 20, 10 and 20, 10 and 25, 10 and 30 or 10 and 50 different conditions.

Listed herein below are a few exemplary combinations of conditions that can be used to treat cells. Other combinations can be readily formulated depending on the specific interrogative biological assessment that is being conducted.

1. Media only
2. 50 µM CTL Coenzyme Q10 (CoQ10)
3. 100 µM CTL Coenzyme Q10
4. 12.5 mM Lactic Acid
5. 12.5 mM Lactic Acid + 50 µM CTL Coenzyme Q10
6. 12.5 mM Lactic Acid + 100 µM CTL Coenzyme Q10
7. Hypoxia
8. Hypoxia + 50 µM CTL Coenzyme Q10
9. Hypoxia + 100 µM CTL Coenzyme Q10
10. Hypoxia + 12.5 mM Lactic Acid
11. Hypoxia + 12.5 mM Lactic Acid + 50 µM CTL Coenzyme Q10
12. Hypoxia + 12.5 mM Lactic Acid + 100 µM CTL Coenzyme Q10
13. Media + 22 mM Glucose
14. 50 µM CTL Coenzyme Q10 + 22 mM Glucose
15. 100 µM CTL Coenzyme Q10 + 22 mM Glucose
16. 12.5 mM Lactic Acid + 22 mM Glucose
17. 12.5 mM Lactic Acid + 22 mM Glucose + 50 µM CTL Coenzyme Q10
18. 12.5 mM Lactic Acid + 22 mM Glucose + 100 µM CTL Coenzyme Q10
19. Hypoxia + 22 mM Glucose
20. Hypoxia + 22 mM Glucose + 50 µM CTL Coenzyme Q10
21. Hypoxia + 22 mM Glucose + 100 µM CTL Coenzyme Q10
22. Hypoxia + 12.5 mM Lactic Acid + 22 mM Glucose
23. Hypoxia + 12.5 mM Lactic Acid + 22 mM Glucose + 50 µM CTL Coenzyme Q10
24. Hypoxia + 12.5 mM Lactic Acid + 22 mM Glucose + 100 µM CTL Coenzyme Q10

As a control, one or more normal cell lines (e.g., THLE2 and HDFa) are cultured under similar conditions in order to identify cancer unique proteins or pathways (see below). The control may be the comparison cell model described above.
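The 24 exemplary combinations above amount to a full factorial design over four factors: glucose (none or 22 mM), oxygen (normal or hypoxia), lactic acid (none or 12.5 mM) and Coenzyme Q10 dose (0, 50 or 100 µM). The following Python sketch is illustrative only and is not part of the described platform; the function name, the label wording (e.g., "uM" for µM, and "22 mM Glucose" rather than "Media + 22 mM Glucose") and the factor ordering are assumptions chosen so that the output follows the order of the list.

```python
from itertools import product

# Factor levels taken from the exemplary list; names are hypothetical.
GLUCOSE_LEVELS = (False, True)      # False = base media, True = 22 mM glucose
HYPOXIA_LEVELS = (False, True)
LACTIC_ACID_LEVELS = (False, True)  # True = 12.5 mM lactic acid
COQ10_DOSES_UM = (0, 50, 100)


def build_conditions():
    """Return one label per combination of the four factors (2 x 2 x 2 x 3 = 24)."""
    labels = []
    # The last factor varies fastest, which reproduces the order of the list.
    for glucose, hypoxia, lactic, coq10 in product(
        GLUCOSE_LEVELS, HYPOXIA_LEVELS, LACTIC_ACID_LEVELS, COQ10_DOSES_UM
    ):
        parts = []
        if hypoxia:
            parts.append("Hypoxia")
        if lactic:
            parts.append("12.5 mM Lactic Acid")
        if glucose:
            parts.append("22 mM Glucose")
        if coq10:
            parts.append(f"{coq10} uM Coenzyme Q10")
        labels.append(" + ".join(parts) if parts else "Media only")
    return labels


conditions = build_conditions()
```

Adding a further stress condition or dose level is then a matter of extending one tuple, which is one way to picture how additional combinations "can be readily formulated".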

Multiple cancer cells of the same or different origin (for example, cancer lines PaCa2, HepG2, PC3 and MCF7), as opposed to a single cancer cell type, may be included in the cancer model. In certain situations, cross talk or ECS experiments between different cancer cells (e.g., HepG2 and PaCa2) may be conducted for several inter-related purposes.

In some embodiments that involve cross talk, experiments conducted on the cell models are designed to determine modulation of cellular state or function of one cell system or population (e.g., Hepatocarcinoma cell HepG2) by another cell system or population (e.g., Pancreatic cancer PaCa2) under defined treatment conditions (e.g., hyperglycemia, hypoxia (ischemia)). According to a typical setting, a first cell system / population is contacted by an external stimulus component, such as a candidate molecule (e.g., a small drug molecule, a protein) or a candidate condition (e.g., hypoxia, high glucose environment). In response, the first cell system / population changes its transcriptome, proteome, metabolome, and/or interactome, leading to changes that can be readily detected both inside and outside the cell. For example, changes in transcriptome can be measured by the transcription level of a plurality of target mRNAs; changes in proteome can be measured by the expression level of a plurality of target proteins; and changes in metabolome can be measured by the level of a plurality of target metabolites by assays designed specifically for given metabolites. Alternatively, the above referenced changes in metabolome and/or proteome, at least with respect to certain secreted metabolites or proteins, can also be measured by their effects on the second cell system / population, including the modulation of the transcriptome, proteome, metabolome, and interactome of the second cell system / population.

Therefore, the experiments can be used to identify the effects of the molecule(s) of interest secreted by the first cell system / population on a second cell system / population under different treatment conditions. The experiments can also be used to identify any proteins that are modulated as a result of signaling from the first cell system (in response to the external stimulus component treatment) to another cell system, by, for example, differential screening of proteomics. The same experimental setting can also be adapted for a reverse setting, such that reciprocal effects between the two cell systems can also be assessed. In general, for this type of experiment, the choice of cell line pairs is largely based on factors such as origin, disease state and cellular function.

Although two-cell systems are typically involved in this type of experimental setting, similar experiments can also be designed for more than two cell systems by, for example, immobilizing each distinct cell system on a separate solid support.

Once the custom model is built, one or more “perturbations” may be applied to the system, such as genetic variation from patient to patient, or with / without treatment by certain drugs or pro-drugs. See Figure 15D. The effects of such perturbations to the system, including the effect on disease related cancer cells, and disease related normal control cells, can be measured using various art-recognized or proprietary means, as described in section III.B below.

In an exemplary experiment, cancer lines PaCa2, HepG2, PC3 and MCF7, and normal cell lines THLE2 and HDFa, are conditioned in each of hyperglycemia, hypoxia, and lactic acid-rich conditions, as well as in all combinations of two or three of these conditions, and in addition with or without an environmental perturbation, specifically treatment by Coenzyme Q10.

The custom built cell model may be established and used throughout the steps of the Platform Technology of the invention to ultimately identify a causal relationship unique in the biological system, by carrying out the steps described herein. It will be understood by the skilled artisan, however, that a custom built cell model that is used to generate an initial, “first generation” consensus causal relationship network for a biological process can continually evolve or expand over time, e.g., by the introduction of additional cancer or normal cell lines and/or additional cancer conditions. Additional data from the evolved cell model, i.e., data from the newly added portion(s) of the cell model, can be collected. The new data collected from an expanded or evolved cell model, i.e., from newly added portion(s) of the cell model, can then be introduced to the data sets previously used to generate the “first generation” consensus causal relationship network in order to generate a more robust “second generation” consensus causal relationship network. New causal relationships unique to the biological system can then be identified from the “second generation” consensus causal relationship network. In this way, the evolution of the cell model provides an evolution of the consensus causal relationship networks, thereby providing new and/or more reliable insights into the modulators of the biological system.

Additional examples of custom built cell models are described in detail herein.

B. Data Collection

In general, two types of data may be collected from any custom built model system. One type of data (e.g., the first set of data, the third set of data) usually relates to the level of certain macromolecules, such as DNA, RNA, protein, lipid, etc. An exemplary data set in this category is proteomic data (e.g., qualitative and quantitative data concerning the expression of all or substantially all measurable proteins from a sample). The other type of data is generally functional data (e.g., the second set of data, the fourth set of data) that reflects the phenotypic changes resulting from the changes in the first type of data.

With respect to the first type of data, in some example embodiments, quantitative polymerase chain reaction (qPCR) and proteomics are performed to profile changes in cellular mRNA and protein expression. Total RNA can be isolated using a commercial RNA isolation kit. Following cDNA synthesis, specific commercially available qPCR arrays (e.g., those from SA Biosciences) for disease areas or cellular processes such as angiogenesis, apoptosis, and diabetes may be employed to profile a predetermined set of genes by following the manufacturer’s instructions. For example, the Biorad 4 amplification system can be used for all transcriptional profiling experiments. Following data collection (Ct), the final fold change over control can be determined using the δCt method as outlined in the manufacturer’s protocol. Proteomic sample analysis can be performed as described in subsequent sections.
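As an illustration of how a fold change over control can be derived from Ct values, the sketch below uses the widely used 2^(-ΔΔCt) calculation. This is an assumption: the manufacturer's protocol referred to above may differ in detail (for example, in the choice of reference genes or in efficiency corrections), and the function and argument names are hypothetical.

```python
def fold_change(ct_target_treated, ct_ref_treated, ct_target_control, ct_ref_control):
    """Fold change of a target gene over control by the 2^(-ddCt) calculation.

    Assumes roughly 100% amplification efficiency (one doubling per cycle)
    and a reference (housekeeping) gene measured in the same samples.
    """
    # Normalize the target gene to the reference gene within each sample.
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    # Compare the treated sample with the control sample.
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target crosses threshold two cycles earlier
# (relative to the reference gene) in the treated sample than in the control.
example = fold_change(22.0, 18.0, 24.0, 18.0)
```

A value above 1 indicates higher expression in the treated sample than in the control; a value below 1 indicates lower expression.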

The subject method may employ large-scale high-throughput quantitative proteomic analysis of hundreds of samples of similar character, and provides the data necessary for identifying the cellular output differentials.

There are numerous art-recognized technologies suitable for this purpose. An exemplary technique, iTRAQ analysis in combination with mass spectrometry, is briefly described below.

The quantitative proteomics approach is based on stable isotope labeling with the 8-plex iTRAQ reagent and 2D-LC MALDI MS/MS for peptide identification and quantification. Quantification with this technique is relative: peptides and proteins are assigned abundance ratios relative to a reference sample. Common reference samples in multiple iTRAQ experiments facilitate the comparison of samples across multiple iTRAQ experiments.

For example, to implement this analysis scheme, six primary samples and two control pool samples can be combined into one 8-plex iTRAQ mix according to the manufacturer’s suggestions. This mixture of eight samples then can be fractionated by two-dimensional liquid chromatography (strong cation exchange (SCX) in the first dimension, and reversed-phase HPLC in the second dimension), and then can be subjected to mass spectrometric analysis.
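The relative nature of this quantification can be pictured with a short Python sketch that converts the reporter-ion intensities of one peptide into abundance ratios against the pooled control channels. It is illustrative only: the channel assignment (pools in channels 119 and 121), the averaging of the two pool channels and the function name are assumptions, not details taken from the described procedure.

```python
# Nominal reporter-ion masses of the 8-plex iTRAQ reagent.
ITRAQ_8PLEX_CHANNELS = (113, 114, 115, 116, 117, 118, 119, 121)


def ratios_to_reference(intensities, reference_channels=(119, 121)):
    """Return abundance ratios of the primary samples relative to the pooled controls.

    `intensities` maps each reporter channel to its measured intensity.
    The reference is the mean intensity of the pooled control channels
    (a hypothetical choice made for this sketch).
    """
    reference_values = [intensities[ch] for ch in reference_channels]
    reference = sum(reference_values) / len(reference_values)
    if reference <= 0:
        raise ValueError("reference intensity must be positive")
    return {
        ch: intensities[ch] / reference
        for ch in ITRAQ_8PLEX_CHANNELS
        if ch not in reference_channels
    }


# Hypothetical intensities for a single peptide.
peptide = {113: 200.0, 114: 100.0, 115: 50.0, 116: 100.0,
           117: 400.0, 118: 100.0, 119: 90.0, 121: 110.0}
ratios = ratios_to_reference(peptide)
```

Because every 8-plex mix carries the same pooled reference, ratios computed this way can be compared across separate iTRAQ experiments.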

A brief overview of exemplary laboratory procedures that can be employed is provided herein.

Protein extraction: Cells can be lysed with 8 M urea lysis buffer with protease inhibitors (Thermo Scientific Halt Protease inhibitor EDTA-free) and incubated on ice for minutes with vortexing for 5 seconds every 10 minutes. Lysis can be completed by sonication in 5 second pulses. Cell lysates can be centrifuged at 14000 x g for 15 minutes (4 °C) to remove cellular debris. A Bradford assay can be performed to determine the protein concentration. 100 µg of protein from each sample can be reduced (10 mM dithiothreitol (DTT), 55 °C, 1 h), alkylated (25 mM iodoacetamide, room temperature, 30 minutes) and digested with trypsin (1:25 w/w, 200 mM triethylammonium bicarbonate (TEAB), 37 °C, 16 h).

Secretome sample preparation: 1) In one embodiment, the cells can be cultured in serum free medium: Conditioned media can be concentrated by freeze dryer, reduced (10 mM dithiothreitol (DTT), 55 °C, 1 h), alkylated (25 mM iodoacetamide, room temperature, 30 minutes), and then desalted by acetone precipitation. Equal amounts of protein from the concentrated conditioned media can be digested with trypsin (1:25 w/w, 200 mM triethylammonium bicarbonate (TEAB), 37 °C, 16 h).

2) In one embodiment, the cells can be cultured in serum containing medium: The volume of the medium can be reduced using 3k MWCO Vivaspin columns (GE Healthcare Life Sciences), and the sample then can be reconstituted with 1x PBS (Invitrogen). Serum albumin can be depleted from all samples using an AlbuVoid column (Biotech Support Group, LLC) following the manufacturer’s instructions, with modifications of buffer-exchange to optimize for condition medium application.

iTRAQ 8 Plex Labeling: An aliquot from each tryptic digest in each experimental set can be pooled together to create the pooled control sample. Equal aliquots from each sample and the pooled control sample can be labeled by iTRAQ 8 Plex reagents according to the manufacturer’s protocols (AB Sciex). The reactions can be combined, vacuumed to dryness, re-suspended by adding 0.1% formic acid, and analyzed by LC-MS/MS.

2D-NanoLC-MS/MS: All labeled peptide mixtures can be separated by online 2D-nanoLC and analysed by electrospray tandem mass spectrometry. The experiments can be carried out on an Eksigent 2D NanoLC Ultra system connected to an LTQ Orbitrap Velos mass spectrometer equipped with a nanoelectrospray ion source (Thermo Electron, Bremen, Germany).

The peptide mixtures can be injected into a 5 cm SCX column (300 µm ID, 5 µm, PolySULFOETHYL Aspartamide column from PolyLC, Columbia, MD) with a flow of 4 µL/min and eluted in 10 ion exchange elution segments into a C18 trap column (2.5 cm, 100 µm ID, 5 µm, 300 Å ProteoPep II from New Objective, Woburn, MA) and washed for 5 min with H2O/0.1% FA. The separation then can be further carried out at 300 nL/min using a gradient of 2-45% B (H2O/0.1% FA (solvent A) and ACN/0.1% FA (solvent B)) for 120 minutes on a 15 cm fused silica column (75 µm ID, 5 µm, 300 Å ProteoPep II from New Objective, Woburn, MA).

Full scan MS spectra (m/z 300-2000) can be acquired in the Orbitrap with a resolution of 30,000. The most intense ions (up to 10) can be sequentially isolated for fragmentation using High energy C-trap Dissociation (HCD) and dynamically excluded for 30 seconds. HCD can be conducted with an isolation width of 1.2 Da. The resulting fragment ions can be scanned in the Orbitrap with a resolution of 7500. The LTQ Orbitrap Velos can be controlled by Xcalibur 2.1 with foundation 1.0.1.

Peptides/proteins identification and quantification: Peptides and proteins can be identified by automated database searching using Proteome Discoverer software (Thermo Electron) with the Mascot search engine against the SwissProt database. Search parameters can include 10 ppm for MS tolerance, 0.02 Da for MS2 tolerance, and full trypsin digestion allowing for up to 2 missed cleavages. Carbamidomethylation (C) can be set as the fixed modification. Oxidation (M), TMT6, and deamidation (NQ) can be set as dynamic modifications. Peptide and protein identifications can be filtered with the Mascot Significant Threshold (p<0.05). The filters can allow a 99% confidence level of protein identification (1% FDR).

The Proteome Discoverer software can apply correction factors on the reporter ions, and can reject all quantitation values if not all quantitation channels are present. Relative protein quantitation can be achieved by normalization at the mean intensity.

With respect to the second type of data, in some exemplary embodiments, bioenergetics profiling of cancer and normal models may employ the Seahorse™ XF24 analyzer to enable the understanding of glycolysis and oxidative phosphorylation components.

Specifically, cells can be plated on Seahorse culture plates at optimal densities. These cells can be plated in 100 µl of media or treatment and left in a 37°C incubator with 5% CO2. Two hours later, when the cells are adhered to the 24 well plate, an additional 150 µl of either media or treatment solution can be added and the plates can be left in the incubator overnight. This two step seeding procedure allows for even distribution of cells in the culture plate. Seahorse cartridges that contain the oxygen and pH sensor can be hydrated overnight in the calibrating fluid in a non-CO2 incubator at 37°C. Three mitochondrial drugs are typically loaded onto three ports in the cartridge. Oligomycin, a complex V (ATP synthase) inhibitor; FCCP, an uncoupler; and rotenone, a complex I inhibitor, can be loaded into ports A, B and C, respectively, of the cartridge. All stock drugs can be prepared at a 10x concentration in an unbuffered DMEM media.

The cartridges can be first incubated with the mitochondrial compounds in a non-CO2 incubator for about 15 minutes prior to the assay. Seahorse culture plates can be washed in DMEM based unbuffered media that contains glucose at a concentration found in the normal growth media. The cells can be layered with 630 µl of the unbuffered media and can be equilibrated in a non-CO2 incubator before placing in the Seahorse instrument with a precalibrated cartridge. The instrument can be run for three to four loops with a mix, wait and measure cycle to get a baseline, before injection of drugs through the port is initiated. There can be two loops before the next drug is introduced.

OCR (Oxygen Consumption Rate) and ECAR (Extracellular Acidification Rate) can be recorded by the electrodes in a 7 µl chamber, which can be created with the cartridge pushing against the Seahorse culture plate.

C. Data Integration and in silico Model Generation

Once relevant data sets have been obtained, integration of data sets and generation of computer-implemented statistical models may be performed using an AI-based informatics system or platform (e.g., the REFS™ platform). For example, an exemplary AI-based system may produce simulation-based networks of protein associations as key drivers of metabolic end points (ECAR/OCR). See Figure 15. Some background details regarding the REFS™ system may be found in Xing et al., “Causal Modeling Using Network Ensemble Simulations of Genetic and Gene Expression Data Predicts Genes Involved in Rheumatoid Arthritis,” PLoS Computational Biology, vol. 7, issue 3, 1-19 (March 2011) (e100105) and U.S. Patent 7,512,497 to Periwal, the entire contents of each of which is expressly incorporated herein by reference in its entirety. In essence, as described earlier, the REFS™ system is an AI-based system that employs mathematical algorithms to establish causal relationships among the input variables (e.g., protein expression levels, mRNA expression levels, and the corresponding functional data, such as the OCR / ECAR values measured on Seahorse culture plates).

This process is based on the input data alone, without taking into consideration prior existing knowledge about any potential, established, and/or verified biological relationships.

In particular, a significant advantage of the platform of the invention is that the AI-based system is based on the data sets obtained from the cell model, without resorting to or taking into consideration any existing knowledge in the art concerning the biological process. Further, preferably, no data points are statistically or artificially cut off and, instead, all obtained data is fed into the AI-system for determining protein associations. Accordingly, the resulting statistical models generated from the platform are unbiased, since they do not take into consideration any known biological relationships.

Specifically, data from the proteomics and ECAR/OCR can be input into the AI-based information system, which builds statistical models based on data associations, as described above. Simulation-based networks of protein associations are then derived for each disease versus normal scenario, including treatments and conditions, using the following methods.

A detailed description of an exemplary process for building the generated (e.g., optimized or evolved) networks appears below with respect to Figure 16. As described above, data from the proteomics and functional cell data is input into the AI-based system (step 210). The input data, which may be raw data or minimally processed data, is pre-processed, which may include normalization (e.g., using a quantile function or internal standards) (step 212). The pre-processing may also include imputing missing data values (e.g., by using the K-nearest neighbor (K-NN) algorithm) (step 212).
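The two pre-processing operations named above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the platform's implementation: the function names, the data layout (columns are samples for normalization, rows are measured variables for imputation), the distance measure and the default of two neighbors are all hypothetical choices.

```python
import math


def quantile_normalize(columns):
    """Give every sample (column) the same distribution of values.

    Each value is replaced by the mean, across samples, of the values that
    share its rank. Ties are broken by position, a simplification.
    """
    n = len(columns[0])
    sorted_columns = [sorted(col) for col in columns]
    rank_means = [sum(sc[r] for sc in sorted_columns) / len(columns) for r in range(n)]
    normalized = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, index in enumerate(order):
            out[index] = rank_means[rank]
        normalized.append(out)
    return normalized


def knn_impute(rows, k=2):
    """Fill missing values (None) with the mean of the k most similar rows."""

    def distance(a, b):
        # Root-mean-square difference over the positions observed in both rows.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    imputed = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            donors = [
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            ]
            donors = sorted((d for d in donors if d[0] != math.inf), key=lambda d: d[0])
            nearest = donors[:k]
            if nearest:  # leave the gap if no comparable row exists
                imputed[i][j] = sum(v for _, v in nearest) / len(nearest)
    return imputed
```

Whether imputation precedes or follows normalization is not specified in the text; the two functions are therefore kept independent here.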

The pre-processed data is used to construct a network fragment library (step 214). The network fragments define quantitative, continuous relationships among all possible small sets (e.g., 2-3 member sets or 2-4 member sets) of measured variables (input data). The relationships between the variables in a fragment may be linear, logistic, multinomial, dominant or recessive homozygous, etc. The relationship in each fragment is assigned a Bayesian probabilistic score that reflects how likely the candidate relationship is given the input data, and also penalizes the relationship for its mathematical complexity. By scoring all of the possible pairwise and three-way relationships (and in some embodiments also four-way relationships) inferred from the input data, the most likely fragments in the library can be identified (the likely fragments). Quantitative parameters of the relationship are also computed based on the input data and stored for each fragment. Various model types may be used in fragment enumeration including but not limited to linear regression, logistic regression, Analysis of Variance (ANOVA) models, Analysis of Covariance (ANCOVA) models, non-linear/polynomial regression models and even non-parametric regression. The prior assumptions on model parameters may assume Gull distributions or Bayesian Information Criterion (BIC) penalties related to the number of parameters used in the model. In a network inference process, each network in an ensemble of initial trial networks is constructed from a subset of fragments in the fragment library. Each initial trial network in the ensemble of initial trial networks is constructed with a different subset of the fragments from the fragment library (step 216).

An overview of the mathematical representations underlying the Bayesian networks and network fragments, which is based on Xing et al., “Causal Modeling Using Network Ensemble Simulations of Genetic and Gene Expression Data Predicts Genes Involved in Rheumatoid Arthritis,” PLoS Computational Biology, vol. 7, issue 3, 1-19 (March 2011) (e100105), is presented below.

A multivariate system with random variables $X = X_1, \ldots, X_n$ may be characterized by a multivariate probability distribution function $P(X_1, \ldots, X_n; \Theta)$ that includes a large number of parameters $\Theta$. The multivariate probability distribution function may be factorized and represented by a product of local conditional probability distributions:

$$P(X_1, \ldots, X_n; \Theta) = \prod_{i=1}^{n} P_i\left(X_i \mid Y_{j1}, \ldots, Y_{jK_i}; \Theta_i\right)$$

in which each variable $X_i$ is independent from its non-descendent variables given its $K_i$ parent variables, which are $Y_{j1}, \ldots, Y_{jK_i}$. After factorization, each local probability distribution has its own parameters $\Theta_i$.

The multivariate probability distribution function may be factorized in different ways with each particular factorization and corresponding parameters being a distinct probabilistic model. Each particular factorization (model) can be represented by a Directed Acyclic Graph (DAG) having a vertex for each variable $X_i$ and directed edges between vertices representing dependences between variables in the local conditional distributions $P_i(X_i \mid Y_{j1}, \ldots, Y_{jK_i})$. Subgraphs of a DAG, each including a vertex and associated directed edges, are network fragments.
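
To make the factorization concrete, the sketch below evaluates the joint distribution of a hypothetical three-variable DAG (X1 -> X3 <- X2, all variables binary) as a product of local conditional distributions. The graph and every probability value are invented for illustration.

```python
import itertools

# Hypothetical local distributions for the DAG X1 -> X3 <- X2 (binary variables).
P_X1 = {0: 0.7, 1: 0.3}                 # X1 has no parents
P_X2 = {0: 0.6, 1: 0.4}                 # X2 has no parents
P_X3_IS_1_GIVEN = {                     # P(X3 = 1 | X1, X2): two parents
    (0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.6, (1, 1): 0.9,
}

def joint(x1, x2, x3):
    """Joint probability as the product of the three local conditionals."""
    p3 = P_X3_IS_1_GIVEN[(x1, x2)]
    return P_X1[x1] * P_X2[x2] * (p3 if x3 == 1 else 1.0 - p3)

def total_probability():
    """Sum of the joint over all eight states; a valid factorization gives 1."""
    return sum(joint(*state) for state in itertools.product((0, 1), repeat=3))
```

In this example the vertex X3 together with its two incoming edges corresponds to one three-member network fragment.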

A model is evolved or optimized by determining the most likely factorization and the most likely parameters given the input data. This may be described as “learning a Bayesian network,” or, in other words, given a training set of input data, finding a network that best matches the input data. This is accomplished by using a scoring function that evaluates each network with respect to the input data.

A Bayesian framework is used to determine the likelihood of a factorization given the input data. Bayes' Law states that the posterior probability, $P(M \mid D)$, of a model $M$ given data $D$ is proportional to the probability of the data given the model assumptions, $P(D \mid M)$, multiplied by the prior probability of the model, $P(M)$, assuming that the probability of the data, $P(D)$, is constant across models. This is expressed in the following equation:

$$P(M \mid D) = \frac{P(D \mid M)\, P(M)}{P(D)}$$

The probability of the data assuming the model is the integral of the data likelihood over the prior distribution of parameters:

$$P(D \mid M) = \int P(D \mid M(\Theta))\, P(\Theta \mid M)\, d\Theta$$

Assuming all models are equally likely (i.e., that $P(M)$ is a constant), the posterior probability of model $M$ given the data $D$ may be factored into the product of integrals over parameters for each local network fragment $M_i$ as follows:

$$P(M \mid D) \propto \prod_{i=1}^{n} \int P_i(X_i \mid Y_{j1}, \ldots, Y_{jK_i}; \Theta_i)\, d\Theta_i$$

Note that in the equation above, a leading constant term has been omitted. In some embodiments, a Bayesian Information Criterion (BIC), which takes a negative logarithm of the posterior probability of the model $P(M \mid D)$, may be used to “score” each model as follows:

$$S_{tot}(M) = -\log P(M \mid D) = \sum_{i=1}^{n} S(M_i)$$

where the total score $S_{tot}$ for a model $M$ is a sum of the local scores $S(M_i)$ for each local network fragment. The BIC further gives an expression for determining a score for each individual network fragment:

$$S(M_i) \approx S_{BIC}(M_i) = S_{MLE}(M_i) + \frac{K(M_i)}{2} \log N$$

where $K(M_i)$ is the number of fitting parameters in model $M_i$ and $N$ is the number of samples (data points). $S_{MLE}(M_i)$ is the negative logarithm of the likelihood function for a network fragment, which may be calculated from the functional relationships used for each network fragment. For a BIC score, the lower the score, the more likely a model fits the input data.
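
As a worked illustration of the fragment score, assume (for illustration only) that a fragment relates a child variable to $K_i$ parents by linear regression with Gaussian noise, one of the model types that may be used in fragment enumeration. Writing $\hat{\sigma}_i^{2} = \mathrm{RSS}_i / N$ for the maximum-likelihood residual variance, the terms of the BIC expression become:

```latex
\begin{align*}
S_{MLE}(M_i) &= \frac{N}{2}\left[\log\left(2\pi\hat{\sigma}_i^{2}\right) + 1\right], \\
K(M_i) &= K_i + 2 \quad \text{($K_i$ slopes, one intercept, one variance)}, \\
S_{BIC}(M_i) &= \frac{N}{2}\left[\log\left(2\pi\hat{\sigma}_i^{2}\right) + 1\right]
               + \frac{K_i + 2}{2}\,\log N .
\end{align*}
```

Under these assumptions, adding one parent changes the score by $\frac{N}{2}\log(\hat{\sigma}_{new}^{2}/\hat{\sigma}_{old}^{2}) + \frac{1}{2}\log N$, so the larger fragment is preferred only if $\hat{\sigma}_{new}^{2}/\hat{\sigma}_{old}^{2} < N^{-1/N}$. For $N = 100$ this ratio is about 0.955, i.e., the additional parent must reduce the residual variance by roughly 4.5% to offset the complexity penalty.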

The ensemble of trial networks is globally optimized, which may be described as optimizing or evolving the networks (step 218). For example, the trial networks may be evolved and optimized according to a Metropolis Monte Carlo Sampling algorithm.

Simulated annealing may be used to optimize or evolve each trial network in the ensemble through local transformations. In an example simulated annealing process, each trial network is changed by adding a network fragment from the library, by deleting a network fragment from the trial network, by substituting a network fragment or by otherwise changing network topology, and then a new score for the network is calculated. Generally speaking, if the score improves, the change is kept and if the score worsens the change is rejected. A “temperature” parameter allows some local changes which worsen the score to be kept, which aids the optimization process in avoiding some local minima. The “temperature” parameter is decreased over time to allow the optimization/evolution process to converge.
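
The add/delete/substitute moves and the temperature schedule described above may be sketched as follows. This is a simplified illustration, not the described system: a network is represented as a set of fragment identifiers, its score is assumed to be the sum of precomputed local fragment scores (lower is better), acyclicity checks are omitted, and the geometric cooling schedule and the function name `anneal` are assumptions.

```python
import math
import random

def anneal(fragment_scores, n_steps=5000, t_start=5.0, t_end=0.01, seed=0):
    """Simulated annealing over sets of fragments; returns (best_set, best_score)."""
    rng = random.Random(seed)
    ids = sorted(fragment_scores)

    def total(network):
        # Total score is the sum of local fragment scores.
        return sum(fragment_scores[f] for f in network)

    current = set()
    cur_score = total(current)
    best, best_score = set(current), cur_score

    for step in range(n_steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        temp = t_start * (t_end / t_start) ** (step / max(n_steps - 1, 1))
        candidate = set(current)
        absent = [f for f in ids if f not in candidate]
        move = rng.choice(("add", "delete", "substitute"))
        if move == "add" and absent:
            candidate.add(rng.choice(absent))
        elif move == "delete" and candidate:
            candidate.remove(rng.choice(sorted(candidate)))
        elif move == "substitute" and candidate and absent:
            candidate.remove(rng.choice(sorted(candidate)))
            candidate.add(rng.choice(absent))
        else:
            continue  # move not applicable to the current network

        delta = total(candidate) - cur_score
        # Metropolis rule: always keep improvements; keep a worsening
        # change with probability exp(-delta / temp).
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, cur_score = candidate, cur_score + delta
            if cur_score < best_score:
                best, best_score = set(current), cur_score
    return best, best_score
```

At high temperature most worsening changes are kept, which lets the search leave local minima; as the temperature falls the acceptance rule approaches the strict "keep only improvements" behavior.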

All or part of the network inference process may be conducted in parallel for the different trial networks. Each network may be optimized in parallel on a separate processor and/or on a separate computing device. In some embodiments, the optimization process may be conducted on a supercomputer incorporating hundreds to thousands of processors which operate in parallel. Information may be shared among the optimization processes conducted on parallel processors.

The optimization process may include a network filter that drops any networks from the ensemble that fail to meet a threshold standard for overall score. The dropped network may be replaced by a new initial network. Further, any networks that are not “scale free” may be dropped from the ensemble. After the ensemble of networks has been optimized or evolved, the result may be termed an ensemble of generated cell model networks, which may be collectively referred to as the generated consensus network.

D. Simulation to Extract Quantitative Relationship Information and for Prediction

Simulation may be used to extract quantitative parameter information regarding each relationship in the generated cell model networks (step 220). For example, the simulation for quantitative information extraction may involve perturbing (increasing or decreasing) each node in the network by 10 fold and calculating the posterior distributions for the other nodes (e.g., proteins) in the models. The endpoints are compared by t-test with the assumption of 100 samples per group and the 0.01 significance cut-off. The t-test statistic is the median of 100 t-tests. Through use of this simulation technique, an AUC (area under the curve) representing the strength of prediction and fold change representing the in silico magnitude of a node driving an end point are generated for each relationship in the ensemble of networks.
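
The perturbation procedure may be illustrated with the sketch below. It is a toy stand-in under stated assumptions: a single parent-child edge is modelled by a hypothetical linear-Gaussian relationship rather than by posterior sampling from a network ensemble, the t-test is Welch's form, the 0.01 cut-off is approximated by a critical value of 2.6 (about 198 degrees of freedom), and the AUC is taken to be the probability that a perturbed sample exceeds a baseline sample. None of these choices is stated in the text above.

```python
import numpy as np

T_CRIT_001 = 2.6  # approximate two-sided 0.01 critical value near 198 d.f. (assumption)

def simulate_child(parent_level, slope, intercept, noise_sd, n, rng):
    """Draw child-node samples from a hypothetical linear-Gaussian edge."""
    return intercept + slope * parent_level + rng.normal(0.0, noise_sd, size=n)

def welch_t(a, b):
    """Welch t statistic for the difference in means of a and b."""
    se = np.sqrt(a.var(ddof=1) / a.size + b.var(ddof=1) / b.size)
    return float((a.mean() - b.mean()) / se)

def auc(a, b):
    """Probability that a sample from a exceeds one from b (ties count half)."""
    greater = (a[:, None] > b[None, :]).mean()
    ties = (a[:, None] == b[None, :]).mean()
    return float(greater + 0.5 * ties)

def quantify_edge(parent_baseline, slope, intercept=1.0, noise_sd=1.0,
                  fold=10.0, n_samples=100, n_repeats=100, seed=0):
    """Perturb the parent by `fold` and summarize the effect on the child."""
    rng = np.random.default_rng(seed)
    t_stats, aucs, folds = [], [], []
    for _ in range(n_repeats):                      # 100 t-tests, 100 samples per group
        base = simulate_child(parent_baseline, slope, intercept, noise_sd, n_samples, rng)
        pert = simulate_child(parent_baseline * fold, slope, intercept, noise_sd, n_samples, rng)
        t_stats.append(welch_t(pert, base))
        aucs.append(auc(pert, base))
        folds.append(pert.mean() / base.mean())     # assumes positive endpoint means
    t_median = float(np.median(t_stats))            # reported statistic is the median
    return {
        "t_stat": t_median,
        "significant": abs(t_median) > T_CRIT_001,
        "auc": float(np.median(aucs)),
        "fold_change": float(np.median(folds)),
    }
```

For instance, `quantify_edge(parent_baseline=2.0, slope=0.5)` describes an edge whose child responds to the parent, while `slope=0.0` describes an edge with no effect.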

A relationship quantification module of a local computer system may be employed to direct the AI-based system to perform the perturbations and to extract the AUC information and fold information. The extracted quantitative information may include fold change and AUC for each edge connecting a parent node to a child node.

In some embodiments, a custom-built R program may be used to extract the quantitative information.

In some embodiments, the ensemble of generated cell model networks can be used through simulation to predict responses to changes in conditions, which may be later verified through wet-lab cell-based, or animal-based, experiments.

The output of the AI-based system may be quantitative relationship parameters and/or other simulation predictions (step 222).

E. Generation of Differential (Delta) Networks

A differential network creation module may be used to generate differential (delta) networks between generated cell model networks and generated comparison cell model networks. As described above, in some embodiments, the differential network compares all of the quantitative parameters of the relationships in the generated cell model networks and the generated comparison cell model network. The quantitative parameters for each relationship in the differential network are based on the comparison.

In some embodiments, a differential may be performed between various differential networks, which may be termed a delta-delta network. An example of a delta-delta network is described below with respect to Figure 26 in the Examples section. The differential network creation module may be a program or script written in PERL.
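
One possible reading of the delta and delta-delta operations is sketched below. It assumes that each network is summarized as a mapping from a directed edge (parent, child) to a quantitative parameter such as fold change, and that the comparison keeps the relationships present in one network but absent from the other. Both the representation and the comparison rule are illustrative assumptions, and the function names are hypothetical.

```python
def delta_network(network, comparison):
    """Edges (with their parameters) found in `network` but not in `comparison`."""
    return {edge: value for edge, value in network.items() if edge not in comparison}

def delta_delta_network(net_a, ref_a, net_b, ref_b):
    """Differential between two differential networks."""
    return delta_network(delta_network(net_a, ref_a), delta_network(net_b, ref_b))
```

For example, with `disease = {("A", "B"): 2.0, ("B", "C"): 1.5}` and `normal = {("A", "B"): 1.8}`, `delta_network(disease, normal)` returns `{("B", "C"): 1.5}`, the relationship unique to the first network.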

F. Visualization of Networks

The relationship values for the ensemble of networks and for the differential networks may be visualized using a network visualization program (e.g., the Cytoscape open source platform for complex network analysis and visualization from the Cytoscape consortium). In the visual depictions of the networks, the thickness of each edge (e.g., each line connecting the proteins) represents the strength of fold change. The edges are also directional, indicating causality, and each edge has an associated prediction confidence level.

G. Exemplary Computer System

Figure 17 schematically depicts an exemplary computer system/environment that may be employed in some embodiments for communicating with the AI-based informatics system, for generating differential networks, for visualizing networks, for saving and storing data, and/or for interacting with a user. As explained above, calculations for an AI-based informatics system may be performed on a separate supercomputer with hundreds or thousands of parallel processors that interacts, directly or indirectly, with the exemplary computer system. The environment includes a computing device 100 with associated peripheral devices. Computing device 100 is programmable to implement executable code 150 for performing various methods, or portions of methods, taught herein. Computing device 100 includes a storage device 116, such as a hard-drive, CD-ROM, or other non-transitory computer readable media.

Storage device 116 may store an operating system 118 and other related software.

Computing device 100 may further include memory 106. Memory 106 may comprise a computer system memory or random access memory, such as DRAM, SRAM, EDO RAM, etc. Memory 106 may comprise other types of memory as well, or combinations thereof. Computing device 100 may store, in storage device 116 and/or memory 106, instructions for implementing and processing each portion of the executable code 150.

The executable code 150 may include code for communicating with the AI-based informatics system 190, for generating differential networks (e.g., a differential network creation module), for extracting quantitative relationship information from the AI-based informatics system (e.g., a relationship quantification module) and for visualizing networks (e.g., Cytoscape).

In some embodiments, the computing device 100 may communicate directly or indirectly with the AI-based informatics system 190 (e.g., a system for executing REFS).

For example, the computing device 100 may communicate with the AI-based informatics system 190 by transferring data files (e.g., data frames) to the AI-based informatics system 190 through a network. Further, the computing device 100 may have executable code 150 that provides an interface and instructions to the AI-based informatics system 190.

In some embodiments, the computing device 100 may communicate directly or indirectly with one or more experimental systems 180 that provide data for the input data set. Experimental systems 180 for generating data may include systems for mass spectrometry based proteomics, microarray gene expression, qPCR gene expression, mass spectrometry based metabolomics, mass spectrometry based lipidomics, SNP microarrays, a panel of functional assays, and other in-vitro biology platforms and technologies.

Computing device 100 also includes processor 102, and may include one or more additional processor(s) 102’, for executing software stored in the memory 106 and other programs for controlling system hardware, peripheral devices and/or peripheral hardware. Processor 102 and processor(s) 102’ each can be a single core processor or multiple core (104 and 104’) processor. Virtualization may be employed in computing device 100 so that infrastructure and resources in the computing device can be shared dynamically. Virtualized processors may also be used with executable code 150 and other software in storage device 116. A virtual machine 114 may be provided to handle a process running on multiple processors so that the process appears to be using only one computing resource rather than multiple. Multiple virtual machines can also be used with one processor.

A user may interact with computing device 100 through a visual display device 122, such as a computer monitor, which may display a user interface 124 or any other interface. The user interface 124 of the display device 122 may be used to display raw data, visual representations of networks, etc. The visual display device 122 may also display other aspects or elements of exemplary embodiments (e.g., an icon for storage device 116). Computing device 100 may include other I/O devices such as a keyboard or a multi-point touch interface (e.g., a touchscreen) 108 and a pointing device 110 (e.g., a mouse, trackball and/or trackpad) for receiving input from a user. The keyboard 108 and the pointing device 110 may be connected to the visual display device 122 and/or to the computing device 100 via a wired and/or a wireless connection.

Computing device 100 may include a network interface 112 to interface with a network device 126 via a Local Area Network (LAN), Wide Area Network (WAN) or the Internet through a variety of connections including, but not limited to, standard telephone lines, LAN or WAN links (e.g., 802.11, T1, T3, 56kb, X.25), broadband connections (e.g., ISDN, Frame Relay, ATM), wireless connections, controller area network (CAN), or some combination of any or all of the above. The network interface 112 may comprise a built-in network adapter, network interface card, PCMCIA network card, card bus network adapter, wireless network adapter, USB network adapter, modem or any other device suitable for enabling computing device 100 to interface with any type of network capable of communication and performing the operations described herein.

Moreover, computing device 100 may be any computer system such as a workstation, desktop computer, server, laptop, handheld computer or other form of computing or telecommunications device that is capable of communication and that has sufficient processor power and memory capacity to perform the operations described herein.

Computing device 100 can be running any operating system 118 such as any of the versions of the MICROSOFT WINDOWS operating systems, the different releases of the Unix and Linux operating systems, any version of the MACOS for Macintosh computers, any embedded operating system, any real-time operating system, any open source operating system, any proprietary operating system, any operating systems for mobile computing devices, or any other operating system capable of running on the computing device and performing the operations described herein. The operating system may be running in native mode or emulated mode.

IV. Models for a Biological System and Uses Therefor

A. Establishing a Model for a Biological System

Virtually all biological systems or processes involve complicated interactions among different cell types and/or organ systems. Perturbation of critical functions in one cell type or organ may lead to secondary effects on other interacting cell types and organs, and such downstream changes may in turn feed back to the initial changes and cause further complications. Therefore, it is beneficial to dissect a given biological system or process into its components, such as interactions between pairs of cell types or organs, and systematically probe the interactions between these components in order to gain a more complete, global view of the biological system or process.

Accordingly, the present invention provides cell models for biological systems.

To this end, Applicants have built cell models for several exemplary biological systems which have been employed in the subject discovery Platform Technology. Applicants have conducted experiments with the cell models using the subject discovery Platform Technology to generate consensus causal relationship networks, including causal relationships unique in the biological system, and thereby identify “modulators” or critical molecular “drivers” important for the particular biological systems or processes.

One significant advantage of the Platform Technology and its components, e.g., the custom built cell models and data sets obtained from the cell models, is that an initial, “first generation” consensus causal relationship network generated for a biological system or process can continually evolve or expand over time, e.g., by the introduction of additional cell lines/types and/or additional conditions. Additional data from the evolved cell model, i.e., data from the newly added portion(s) of the cell model, can be collected. The new data collected from an expanded or evolved cell model, i.e., from newly added portion(s) of the cell model, can then be introduced to the data sets previously used to generate the “first generation” consensus causal relationship network in order to generate a more robust “second generation” consensus causal relationship network. New causal relationships unique to the biological system can then be identified from the “second generation” consensus causal relationship network. In this way, the evolution of the cell model provides an evolution of the consensus causal relationship networks, thereby providing new and/or more reliable insights into the modulators of the biological system. In this way, the cell models, the data sets from the cell models, and the causal relationship networks generated from the cell models by using the Platform Technology methods can constantly evolve and build upon previous knowledge obtained from the Platform Technology.

Accordingly, the invention provides consensus causal relationship networks generated from the cell models employed in the Platform Technology. These consensus causal relationship networks may be first generation consensus causal relationship networks, or may be multiple generation consensus causal relationship networks, e.g., 2nd, 3rd, 4th, 5th, 6th, 7th, 8th, 9th, 10th, 11th, 12th, 13th, 14th, 15th, 16th, 17th, 18th, 19th, 20th or greater generation consensus causal relationship networks. Further, the invention provides simulated consensus causal relationship networks generated from the cell models employed in the Platform Technology. These simulated consensus causal relationship networks may be first generation simulated consensus causal relationship networks, or may be multiple generation simulated consensus causal relationship networks, e.g., 2nd, 3rd, 4th, 5th, 6th, 7th, 8th, 9th, 10th, 11th, 12th, 13th, 14th, 15th, 16th, 17th, 18th, 19th, 20th or greater simulated generation consensus causal relationship networks.

The invention further provides delta networks and delta-delta networks generated from any of the consensus causal relationship networks of the invention.

A custom built cell model for a biological system or process comprises one or more cells associated with the biological system. The model for a biological system/process may be established to simulate an environment of the biological system, e.g., environment of a cancer cell in vivo, by creating conditions (e.g., cell culture conditions) that mimic a characteristic aspect of the biological system or process.

Multiple cells of the same or different origin, as opposed to a single cell type, may be included in the cell model. In one embodiment, at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 50 or more different cell lines or cell types are included in the cell model. In one embodiment, the cells are all of the same type, e.g., all breast cancer cells or plant cells, but are different established cell lines, e.g., different established cell lines of breast cancer cells or plant cells. All values presented in the foregoing list can also be the upper or lower limit of ranges, that are intended to be a part of this invention, e.g., between 1 and 5, 1 and 10, 2 and 5, or 5 and 15 different cell lines or cell types.

Examples of cell types that may be included in the cell models of the invention include, without limitation, human cells, animal cells, mammalian cells, plant cells, yeast, bacteria, or . In one embodiment, cells of the cell model can include diseased cells, such as cancer cells or bacterially or virally infected cells. In one embodiment, cells of the cell model can include disease-associated cells, such as cells involved in diabetes, obesity or cardiovascular disease state, e.g., aortic smooth muscle cells or hepatocytes. The skilled person would recognize those cells that are involved in or associated with a particular biological state/process, e.g., disease state/process, and any such cells may be included in a cell model of the invention.

Cell models of the invention may include one or more “control cells.” In one embodiment, a control cell may be an untreated or unperturbed cell. In another embodiment, a “control cell” may be a normal, e.g., non-diseased, cell. In one embodiment, at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 45, 50 or more different control cells are included in the cell model. All values presented in the foregoing list can also be the upper or lower limit of ranges, that are intended to be a part of this invention, e.g., between 1 and 5, 1 and 10, 2 and 5, or 5 and 15 different control cell lines or control cell types. In one embodiment, the control cells are all of the same type but are different established cell lines of that cell type. In one embodiment, as a control, one or more normal, e.g., non-diseased, cell lines are cultured under similar conditions, and/or are exposed to the same perturbation, as the primary cells of the cell model in order to identify proteins or pathways unique to the biological state or process.

A custom cell model of the invention may also comprise conditions that mimic a characteristic aspect of the biological state or process. For example, cell culture conditions may be selected that closely approximate the conditions of a cancer cell in a tumor environment in vivo, or of an aortic smooth muscle cell of a patient suffering from cardiovascular disease. In some instances, the conditions are stress conditions. Various conditions / stressors may be employed in the cell models of the invention. In one embodiment, these stressors / conditions may constitute the “perturbation”, e.g., external stimulus, for the cell systems. One exemplary stress condition is hypoxia, a condition typically found, for example, within solid tumors.

Hypoxia can be induced using art-recognized methods. For example, hypoxia can be induced by placing cell systems in a Modular Incubator Chamber (MIC-101, Billups-Rothenberg Inc., Del Mar, CA), which can be flooded with an industrial gas mix containing 5% CO2, 2% O2 and 93% nitrogen. Effects can be measured after a pre-determined period, e.g., at 24 hours after hypoxia treatment, with and without additional external stimulus components (e.g., CoQ10 at 0, 50, or 100 µM). Likewise, lactic acid treatment mimics a cellular environment where glycolysis activity is high. Lactic acid induced stress can be investigated at a final lactic acid concentration of about 12.5 mM at a pre-determined time, e.g., at 24 hours, with or without additional external stimulus components (e.g., CoQ10 at 0, 50, or 100 µM). Hyperglycemia is a condition found in diabetes as well as in . A typical hyperglycemic condition that can be used to treat the subject cells includes 10% culture grade glucose added to suitable media to bring up the final concentration of glucose in the media to about 22 mM. Hyperlipidemia is a condition found, for example, in obesity and cardiovascular disease. The hyperlipidemic conditions can be provided by culturing cells in media containing 0.15 mM sodium palmitate. Hyperinsulinemia is a condition found, for example, in diabetes. The hyperinsulinemic conditions may be induced by culturing the cells in media containing 1000 nM insulin.

Individual conditions may be investigated separately in the custom built cell models of the invention, and/or may be combined together. In one embodiment, a combination of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50 or more conditions reflecting or simulating different characteristic aspects of the biological system are investigated in the custom built cell model. In one embodiment, individual conditions and, in addition, combinations of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40, 45, 50 or more of the conditions reflecting or simulating different characteristic aspects of the biological system are investigated in the custom built cell model. All values presented in the foregoing list can also be the upper or lower limit of ranges, that are intended to be a part of this invention, e.g., between 1 and 5, 1 and 10, 1 and 20, 1 and 30, 2 and 5, 2 and 10, 5 and 10, 1 and 20, 5 and 20, 10 and 20, 10 and 25, and 30 or 10 and 50 different conditions.

Once the custom cell model is built, one or more “perturbations” may be applied to the system, such as genetic variation from patient to patient, or with / without treatment by certain drugs or pro-drugs. See Figure 15D. The effects of such perturbations to the cell model system can be measured using various art-recognized or proprietary means, as described in section III.B below.

The custom built cell model may be exposed to a perturbation, e.g., an “environmental perturbation” or “external stimulus component”. The “environmental perturbation” or “external stimulus component” may be endogenous to the cellular environment (e.g., the cellular environment contains some levels of the stimulant, and more of the same is added to increase its level), or may be exogenous to the cellular environment (e.g., the stimulant/perturbation is largely absent from the cellular environment prior to the stimulation). The cellular environment may further be altered by secondary changes resulting from adding the environmental perturbation or external stimulus component, since the external stimulus component may change the cellular output of the cell system, including molecules secreted into the cellular environment by the cell system. The environmental perturbation or external stimulus component may include any external physical and/or chemical stimulus that may affect cellular function.

This may include any large or small organic or inorganic molecules, natural or synthetic chemicals, temperature shift, pH change, radiation, light (UVA, UVB, etc.), microwave, sonic wave, electrical current, modulated or unmodulated magnetic fields, etc. The environmental perturbation or external stimulus component may also include an introduced genetic modification or mutation or a vehicle (e.g., vector) that causes a genetic modification / mutation.

(i) Cross-talk cell systems

In certain situations, where interaction between two or more cell systems is desired to be investigated, a “cross-talking cell system” may be formed by, for example, bringing the modified cellular environment of a first cell system into contact with a second cell system to affect the cellular output of the second cell system.

As used herein, “cross-talk cell system” comprises two or more cell systems, in which the cellular environment of at least one cell system comes into contact with a second cell system, such that at least one cellular output in the second cell system is changed or affected. In certain embodiments, the cell systems within the cross-talk cell system may be in direct contact with one another. In other embodiments, none of the cell systems are in direct contact with one another.

For example, in certain embodiments, the cross-talk cell system may be in the form of a transwell, in which a first cell system is growing in an insert and a second cell system is growing in a corresponding well compartment. The two cell systems may be in contact with the same or different media, and may exchange some or all of the media components. External stimulus component added to one cell system may be substantially absorbed by one cell system and/or degraded before it has a chance to diffuse to the other cell system. Alternatively, the external stimulus component may eventually approach or reach an equilibrium within the two cell systems.

In certain embodiments, the cross-talk cell system may adopt the form of separately cultured cell systems, where each cell system may have its own medium and/or culture conditions (temperature, CO2 content, pH, etc.), or similar or identical culture conditions. The two cell systems may come into contact by, for example, taking the conditioned medium from one cell system and bringing it into contact with another cell system. Direct cell-cell contacts between the two cell systems can also be effected if desired. For example, the cells of the two cell systems may be co-cultured at any point if desired, and the co-cultured cell systems can later be separated by, for example, FACS sorting when cells in at least one cell system have a sortable marker or label (such as a stably expressed fluorescent marker protein GFP).

Similarly, in certain embodiments, the cross-talk cell system may simply be a co-culture. Selective treatment of cells in one cell system can be effected by first treating the cells in that cell system, before culturing the treated cells in co-culture with cells in another cell system. The co-culture cross-talk cell system setting may be helpful when it is desired to study, for example, effects on a second cell system caused by cell surface changes in a first cell system, after stimulation of the first cell system by an external stimulus component.

The cross-talk cell system of the invention is particularly suitable for exploring the effect of certain pre-determined external stimulus components on the cellular output of one or both cell systems. The primary effect of such a stimulus on the first cell system (with which the stimulus directly contacts) may be determined by comparing cellular outputs (e.g., protein expression level) before and after the first cell system’s contact with the external stimulus, which, as used herein, may be referred to as “(significant) cellular output differentials.” The secondary effect of such a stimulus on the second cell system, which is mediated through the modified cellular environment of the first cell system (such as its secretome), can also be similarly measured. There, a comparison in, for example, proteome of the second cell system can be made between the proteome of the second cell system with the external stimulus treatment on the first cell system, and the proteome of the second cell system without the external stimulus treatment on the first cell system. Any significant changes observed (in proteome or any other cellular outputs of interest) may be referred to as a “significant cellular cross-talk differential.” In making cellular output measurements (such as protein expression), either absolute expression amount or relative expression level may be used. For example, to determine the relative protein expression level of a second cell system, the amount of any given protein in the second cell system, with or without the external stimulus to the first cell system, may be compared to a suitable control cell line and mixture of cell lines and given a fold-increase or fold-decrease value. A pre-determined threshold level for such fold-increase (e.g., at least 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2, 2.5, 3, 3.5, 4, 4.5, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 75 or 100 or more fold increase) or fold-decrease (e.g., at least a decrease to 0.95, 0.9, 0.8, 0.75, 0.7, 0.6, 0.5, 0.45, 0.4, 0.35, 0.3, 0.25, 0.2, 0.15, 0.1 or 0.05 fold, or 90%, 80%, 75%, 70%, 65%, 60%, 55%, 50%, 45%, 40%, 35%, 30%, 25%, 20%, 15%, 10% or 5% or less) may be used to select significant cellular cross-talk differentials. All values presented in the foregoing list can also be the upper or lower limit of ranges, e.g., between 1.5 and 5 fold, between 2 and 10 fold, between 1 and 2 fold, or between 0.9 and 0.7 fold, that are intended to be a part of this invention.
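
The threshold-based selection of differentials may be sketched as follows. The function name, the dictionary representation of protein amounts, and the default thresholds (1.5-fold increase, decrease to 0.75 fold, both taken from the lists above) are illustrative assumptions.

```python
def select_differentials(treated, control, up=1.5, down=0.75):
    """Return {protein: fold value} for proteins passing either threshold.

    `treated` and `control` map protein names to measured amounts.
    Proteins missing from the control, or with a non-positive control
    amount, are skipped because no fold value can be computed.
    """
    selected = {}
    for protein, amount in treated.items():
        reference = control.get(protein)
        if reference is None or reference <= 0:
            continue
        fold = amount / reference
        if fold >= up or fold <= down:   # fold-increase or fold-decrease
            selected[protein] = fold
    return selected
```

For example, with hypothetical amounts `treated = {"P1": 300.0, "P2": 100.0, "P3": 40.0}` and a control of 100.0 for each protein, the call returns `{"P1": 3.0, "P3": 0.4}`; P2 is unchanged and is not selected.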

Throughout the present application, all values presented in a list, e.g., such as those above, can also be the upper or lower limit of ranges that are intended to be a part of this invention.

To illustrate, in one exemplary two-cell system established to imitate aspects of a vascular disease model, a heart smooth muscle cell line (first cell system) may be treated with a hypoxia condition (an external stimulus component), and proteome changes in a kidney cell line (second cell system) resulting from contacting the kidney cells with conditioned medium of the heart smooth muscle may be measured using conventional quantitative mass spectrometry. Significant cellular cross-talking differentials in these kidney cells may be determined, based on comparison with a proper control (e.g., similarly cultured kidney cells contacted with conditioned medium from similarly cultured heart smooth muscle cells not treated with hypoxia conditions).

Not every observed significant cellular cross-talking differential may be of biological significance. With respect to any given biological system for which the subject interrogative biological assessment is applied, some (or maybe all) of the significant cellular cross-talking differentials may be “determinative” with respect to the specific biological system at issue, e.g., either responsible for causing a disease condition (a potential target for therapeutic intervention) or being a biomarker for the disease condition (a potential diagnostic or prognostic factor).

Such determinative cross-talking differentials may be selected by an end user of the subject method, or they may be selected by a bioinformatics software program, such as a DAVID-enabled comparative pathway analysis program, or the KEGG pathway analysis program. In certain embodiments, more than one bioinformatics software program is used, and consensus results from two or more bioinformatics software programs are preferred.
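As a minimal, non-limiting sketch of how consensus results from two or more programs might be derived, the selections returned by each program can be treated as sets and retained only when enough programs agree. The tool labels and protein identifiers below are hypothetical placeholders; no actual DAVID or KEGG interface is used or implied.

```python
from collections import Counter
from typing import Dict, Set


def consensus(calls: Dict[str, Set[str]], min_tools: int = 2) -> Set[str]:
    """Return the items selected by at least `min_tools` of the programs."""
    counts = Counter(item for items in calls.values() for item in items)
    return {item for item, n in counts.items() if n >= min_tools}


# Hypothetical selections reported by two different pathway analysis programs.
calls = {
    "tool_a": {"P1", "P2", "P3"},
    "tool_b": {"P2", "P3", "P4"},
}

selected = consensus(calls)
print(sorted(selected))
```

With two programs and `min_tools=2`, this reduces to a plain set intersection; with more programs, the same function expresses a majority-style rule.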

As used herein, “differentials” of cellular outputs include differences (e.g., increased or decreased levels) in any one or more parameters of the cellular outputs. For example, in terms of protein expression level, differentials between two cellular outputs, such as the outputs associated with a cell system before and after the treatment by an external stimulus component, can be measured and quantitated by using art-recognized technologies, such as mass-spectrometry based assays (e.g., iTRAQ, 2D-LC-MSMS, etc.).

(ii) Cancer Specific Models

An example of a biological system or process is cancer. Like any other complicated biological process or system, cancer is a complicated pathological condition characterized by multiple unique aspects. For example, due to their high growth rate, many cancer cells are adapted to grow in hypoxia conditions, and have up-regulated glycolysis and reduced oxidative phosphorylation metabolic pathways. As a result, cancer cells may react differently to an environmental perturbation, such as treatment by a potential drug, as compared to the reaction by a normal cell in response to the same treatment. Thus, it would be of interest to decipher cancer’s unique responses to drug treatment as compared to the responses of normal cells. To this end, a custom cancer model may be established to simulate the environment of a cancer cell, e.g., within a tumor in vivo, by choosing appropriate cancer cell lines and creating cell culture conditions that mimic a characteristic aspect of the disease state or process. For example, cell culture conditions may be selected that closely approximate the conditions of a cancer cell in a tumor in vivo, or that mimic various aspects of cancer growth, by isolating different growth conditions of the cancer cells.

Multiple cancer cells of the same or different origin (for example, cancer lines PaCa2, HepG2, PC3 and MCF7), as opposed to a single cancer cell type, may be included in the cancer model. In one embodiment, at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 45, 50 or more different cancer cell lines or cancer cell types are included in the cancer model. All values presented in the foregoing list can also be the upper or lower limit of ranges that are intended to be a part of this invention, e.g., between 1 and 5, 1 and 10, 2 and 5, or 5 and 15 different cancer cell lines or cell types.

In one embodiment, the cancer cells are all of the same type, e.g., all breast cancer cells, but are different established cell lines, e.g., different established cell lines of breast cancer.

Examples of cancer cell types that may be included in the cancer model include, without limitation, lung cancer, breast cancer, prostate cancer, melanoma, squamous cell carcinoma, colorectal cancer, pancreatic cancer, thyroid cancer, endometrial cancer, bladder cancer, kidney cancer, solid tumor, leukemia, and non-Hodgkin lymphoma. In one embodiment, a drug-resistant cancer cell may be included in the cancer model. Specific examples of cell lines that may be included in a cancer model include, without limitation, PaCa2, HepG2, PC3 and MCF7 cells. Numerous cancer cell lines are known in the art, and any such cancer cell line may be included in a cancer model of the invention.

Cell models of the invention may include one or more “control cells.” In one embodiment, a control cell may be an untreated or unperturbed cancer cell. In another embodiment, a “control cell” may be a normal, non-cancerous cell. Any one of numerous normal, non-cancerous cell lines may be included in the cell model. In one embodiment, the normal cells are one or more of THLE2 and HDFa cells. In one embodiment, at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 45, 50 or more different normal cell types are included in the cancer model.

All values presented in the foregoing list can also be the upper or lower limit of ranges that are intended to be a part of this invention, e.g., between 1 and 5, 1 and 10, 2 and 5, or 5 and 15 different normal cell lines or cell types. In one embodiment, the normal cells are all of the same type, e.g., all healthy epithelial or breast cells, but are different established cell lines, e.g., different established cell lines of epithelial or breast cells. In one embodiment, as a control, one or more normal non-cancerous cell lines (e.g., THLE2 and HDFa) are cultured under similar conditions, and/or are exposed to the same perturbation, as the cancer cells of the cell model in order to identify cancer-unique proteins or pathways.

A custom cancer model may also comprise cell culture conditions that mimic a characteristic aspect of the cancerous state or process. For example, cell culture conditions may be selected that closely approximate the conditions of a cancer cell in a tumor environment in vivo, or that mimic various aspects of cancer growth, by isolating different growth conditions of the cancer cells. In some instances the cell culture conditions are stress conditions.

One such cancer “environment”, or stress condition, is hypoxia, a condition typically found within a solid tumor. Hypoxia can be induced in cells using art-recognized methods. For example, hypoxia can be induced by placing cell systems in a Modular Incubator Chamber (MIC-101, Billups-Rothenberg Inc., Del Mar, CA), which can be flooded with an industrial gas mix containing 5% CO2, 2% O2 and 93% nitrogen.

Effects can be measured after a pre-determined period, e.g., at 24 hours after hypoxia treatment, with and without additional external stimulus components (e.g., CoQ10 at 0, 50, or 100 µM).

Likewise, lactic acid treatment of cells mimics a cellular environment where glycolysis activity is high, as exists in the tumor environment in vivo. Lactic acid induced stress can be investigated at a final lactic acid concentration of about 12.5 mM at a pre-determined time, e.g., at 24 hours, with or without additional external stimulus components (e.g., CoQ10 at 0, 50, or 100 µM).

Hyperglycemia is normally a condition found in diabetes; however, hyperglycemia also to some extent mimics one aspect of cancer growth because many cancer cells rely on glucose as their primary source of energy. Exposing subject cells to a typical hyperglycemic condition may include adding 10% culture grade glucose to suitable media, such that the final concentration of glucose in the media is about 22 mM.
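By way of a purely illustrative calculation, the volume of glucose stock needed to reach a final concentration of about 22 mM can be estimated from a simple mass balance. The sketch below assumes that “10%” denotes a 10% (w/v) stock solution, and uses a hypothetical basal glucose concentration of 5.5 mM and a hypothetical 10 mL volume of medium; none of these assumptions is stated in the text, and the function names are illustrative only.

```python
GLUCOSE_MOLAR_MASS_G_PER_MOL = 180.16  # D-glucose, C6H12O6


def stock_mM_from_percent_wv(percent_wv: float) -> float:
    """Convert a % (w/v) glucose stock (g per 100 mL) to millimolar."""
    grams_per_litre = percent_wv * 10.0
    return grams_per_litre / GLUCOSE_MOLAR_MASS_G_PER_MOL * 1000.0


def stock_volume_mL(medium_mL: float, basal_mM: float, target_mM: float, stock_mM: float) -> float:
    """Volume of stock to add so the mixture reaches target_mM (mass balance)."""
    if not basal_mM <= target_mM < stock_mM:
        raise ValueError("target must lie between the basal and stock concentrations")
    return medium_mL * (target_mM - basal_mM) / (stock_mM - target_mM)


def final_mM(medium_mL: float, basal_mM: float, stock_mL: float, stock_mM: float) -> float:
    """Glucose concentration after mixing medium and stock."""
    return (medium_mL * basal_mM + stock_mL * stock_mM) / (medium_mL + stock_mL)


stock = stock_mM_from_percent_wv(10.0)             # assumed 10% (w/v) stock
volume = stock_volume_mL(10.0, 5.5, 22.0, stock)   # hypothetical 10 mL medium at 5.5 mM
print(f"stock = {stock:.1f} mM, add {volume:.3f} mL per 10 mL medium")
```

The formula accounts for the dilution caused by the added stock volume; if the basal medium already contains a different glucose level, only the `basal_mM` argument changes.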

Individual conditions reflecting different aspects of cancer growth may be investigated separately in the custom built cancer model, and/or may be combined together. In one embodiment, combinations of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50 or more conditions reflecting or simulating different aspects of cancer growth / conditions are investigated in the custom built cancer model. In one embodiment, individual conditions and, in addition, combinations of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50 or more of the conditions reflecting or simulating different aspects of cancer growth / conditions are investigated in the custom built cancer model. All values presented in the foregoing list can also be the upper or lower limit of ranges that are intended to be a part of this invention, e.g., between 1 and 5, 1 and 10, 1 and 20, 1 and 30, 2 and 5, 2 and 10, 5 and 10, 1 and 20, 5 and 20, 10 and 20, 10 and 25, 10 and 30 or 10 and 50 different conditions.
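As a non-limiting sketch, individual conditions and their combinations can be enumerated programmatically and crossed with the dose levels of an optional external stimulus component. The example below uses the three stress conditions described above (hypoxia, 12.5 mM lactic acid, 22 mM glucose) and CoQ10 at 0, 50, or 100 µM; the label strings and function names are assumptions of this example.

```python
from itertools import combinations, product
from typing import List, Sequence, Tuple

STRESSORS = ("Hypoxia", "12.5 mM Lactic Acid", "22 mM Glucose")
COQ10_UM = (0, 50, 100)  # 0 means no Coenzyme Q10 added


def stress_backgrounds(stressors: Sequence[str]) -> List[Tuple[str, ...]]:
    """Media only (empty tuple) plus every combination of one or more stressors."""
    backgrounds: List[Tuple[str, ...]] = [()]
    for k in range(1, len(stressors) + 1):
        backgrounds.extend(combinations(stressors, k))
    return backgrounds


def condition_grid(stressors: Sequence[str] = STRESSORS,
                   doses: Sequence[int] = COQ10_UM) -> List[str]:
    """Cross every stress background with every perturbation dose."""
    grid = []
    for background, dose in product(stress_backgrounds(stressors), doses):
        parts = list(background)
        if dose:
            parts.append(f"{dose} uM Coenzyme Q10")
        grid.append(" + ".join(parts) if parts else "Media only")
    return grid


grid = condition_grid()
for number, label in enumerate(grid, start=1):
    print(f"{number}. {label}")
```

Three stressors give eight backgrounds (including media only), and three dose levels bring the total to 24 conditions; adding a fourth stressor would double the number of backgrounds.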

Once the custom cell model is built, one or more “perturbations” may be applied to the system, such as genetic variation from patient to patient, or with / without treatment by certain drugs or pro-drugs. See Figure 15D. The effects of such perturbations to the system, including the effect on disease related cancer cells, and disease related normal control cells, can be measured using various art-recognized or proprietary means, as described in section III.B below.

In an exemplary experiment, cancer lines PaCa2, HepG2, PC3 and MCF7, and normal cell lines THLE2 and HDFa, are conditioned in each of hyperglycemia, hypoxia, and lactic acid-rich conditions, as well as in all combinations of two or three of these conditions, and in addition with or without an environmental perturbation, specifically treatment by Coenzyme Q10. Listed herein below are such exemplary combinations of conditions, with or without a perturbation, Coenzyme Q10 treatment, that can be used to treat the cancer cells and/or control (e.g., normal) cells of the cancer cell model. Other combinations can be readily formulated depending on the specific interrogative biological assessment that is being conducted.

1. Media only
2. 50 µM CTL Coenzyme Q10
3. 100 µM CTL Coenzyme Q10
4. 12.5 mM Lactic Acid
5. 12.5 mM Lactic Acid + 50 µM CTL Coenzyme Q10
6. 12.5 mM Lactic Acid + 100 µM CTL Coenzyme Q10
7. Hypoxia
8. Hypoxia + 50 µM CTL Coenzyme Q10
9. Hypoxia + 100 µM CTL Coenzyme Q10
10. Hypoxia + 12.5 mM Lactic Acid
11. Hypoxia + 12.5 mM Lactic Acid + 50 µM CTL Coenzyme Q10
12. Hypoxia + 12.5 mM Lactic Acid + 100 µM CTL Coenzyme Q10
13. Media + 22 mM Glucose
14. 50 µM CTL Coenzyme Q10 + 22 mM Glucose
15. 100 µM CTL Coenzyme Q10 + 22 mM Glucose
16. 12.5 mM Lactic Acid + 22 mM Glucose
17. 12.5 mM Lactic Acid + 22 mM Glucose + 50 µM CTL Coenzyme Q10
18. 12.5 mM Lactic Acid + 22 mM Glucose + 100 µM CTL Coenzyme Q10
19. Hypoxia + 22 mM Glucose
20. Hypoxia + 22 mM Glucose + 50 µM CTL Coenzyme Q10
21. Hypoxia + 22 mM Glucose + 100 µM CTL Coenzyme Q10
22. Hypoxia + 12.5 mM Lactic Acid + 22 mM Glucose
23. Hypoxia + 12.5 mM Lactic Acid + 22 mM Glucose + 50 µM CTL Coenzyme Q10
24. Hypoxia + 12.5 mM Lactic Acid + 22 mM Glucose + 100 µM CTL Coenzyme Q10

In certain situations, cross talk or ECS experiments between different cancer cells (e.g., HepG2 and PaCa2) may be conducted for several inter-related purposes. In some embodiments that involve cross talk, experiments conducted on the cell models are designed to determine modulation of cellular state or function of one cell system or population (e.g., Hepatocarcinoma cell HepG2) by another cell system or population (e.g., Pancreatic cancer PaCa2) under defined treatment conditions (e.g., hyperglycemia, hypoxia (ischemia)). According to a typical setting, a first cell system / population is contacted by an external stimulus component, such as a candidate molecule (e.g., a small drug molecule, a protein) or a candidate condition (e.g., hypoxia, high glucose environment). In response, the first cell system / population changes its transcriptome, proteome, metabolome, and/or interactome, leading to changes that can be readily measured both inside and outside the cell. For example, changes in transcriptome can be measured by the transcription level of a plurality of target mRNAs; changes in proteome can be measured by the expression level of a plurality of target proteins; and changes in metabolome can be measured by the level of a plurality of target metabolites by assays designed specifically for given metabolites. Alternatively, the above referenced changes in metabolome and/or proteome, at least with respect to certain secreted metabolites or proteins, can also be measured by their effects on the second cell system / population, including the modulation of the transcriptome, proteome, metabolome, and interactome of the second cell system / population. Therefore, the experiments can be used to identify the effects of the molecule(s) of interest secreted by the first cell system / population on a second cell system / population under different treatment conditions.

The experiments can also be used to identify any proteins that are modulated as a result of signaling from the first cell system (in response to the external stimulus component treatment) to another cell system, by, for example, differential screening of proteomics.

The same experimental setting can also be adapted for a reverse setting, such that reciprocal effects between the two cell systems can also be assessed. In general, for this type of experiment, the choice of cell line pairs is largely based on factors such as origin, disease state and cellular function.
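By way of non-limiting illustration, the forward and reverse settings for a set of cell systems can be laid out by enumerating ordered pairs, so that each cell system serves once as the stimulated (first) system and once as the responding (second) system under each treatment condition. The record type, field names, and function name below are hypothetical and introduced only for this sketch.

```python
from itertools import permutations, product
from typing import List, NamedTuple, Sequence


class CrossTalkRun(NamedTuple):
    sender: str    # first cell system, contacted by the external stimulus component
    receiver: str  # second cell system, contacted with the sender's conditioned medium
    stimulus: str  # treatment condition applied to the sender


def cross_talk_design(cell_systems: Sequence[str], stimuli: Sequence[str]) -> List[CrossTalkRun]:
    """Enumerate every ordered (sender, receiver) pair under every stimulus."""
    return [
        CrossTalkRun(sender, receiver, stimulus)
        for (sender, receiver), stimulus in product(permutations(cell_systems, 2), stimuli)
    ]


runs = cross_talk_design(["HepG2", "PaCa2"], ["hyperglycemia", "hypoxia"])
for run in runs:
    print(f"{run.sender} -> {run.receiver} under {run.stimulus}")
```

Because ordered pairs are used, two cell systems and two conditions yield four runs, covering both directions of signaling.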

The custom built cancer model may be established and used throughout the steps of the Platform Technology of the invention to ultimately identify a causal relationship unique in the biological system, by carrying out the steps described herein. It will be understood by the skilled artisan, however, that a custom built cancer model that is used to generate an initial, “first generation” consensus causal relationship network can continually evolve or expand over time, e.g., by the introduction of additional cancer or normal cell lines and/or additional cancer conditions. Additional data from the evolved cancer model, i.e., data from the newly added portion(s) of the cancer model, can be collected. The new data collected from an expanded or evolved cancer model, i.e., from newly added portion(s) of the cancer model, can then be introduced to the data sets previously used to generate the “first generation” consensus causal relationship network in order to generate a more robust “second generation” consensus causal relationship network. New causal relationships unique to the cancer state (or unique to the response of the cancer state to a perturbation) can then be identified from the “second generation” consensus causal relationship network. In this way, the evolution of the cancer model provides an evolution of the consensus causal relationship networks, thereby providing new and/or more reliable insights into the determinative drivers (or modulators) of the cancer state.

(iii) Diabetes/Obesity/Cardiovascular Disease Cell Models

Other examples of a biological system or process are diabetes, obesity and cardiovascular disease. As with cancer, the related disease states of diabetes, obesity and cardiovascular disease are complicated pathological conditions characterized by multiple unique aspects. It would be of interest to identify the proteins/pathways driving the pathogenesis of diabetes/obesity/cardiovascular disease. It would also be of interest to decipher the unique response of cells associated with diabetes/obesity/cardiovascular disease to drug treatment as compared to the responses of normal cells. To this end, a custom diabetes/obesity/cardiovascular model may be established to simulate an environment experienced by disease-relevant cells, by choosing appropriate cell lines and creating cell culture conditions that mimic a characteristic aspect of the disease state or process. For example, cell culture conditions may be selected that closely approximate hyperglycemia, hyperlipidemia, hyperinsulinemia, hypoxia or lactic-acid rich conditions.

Any cells relevant to diabetes/obesity/cardiovascular disease may be included in the diabetes/obesity/cardiovascular disease model. Examples of cells relevant to diabetes/obesity/cardiovascular disease include, for example, adipocytes, myotubes, hepatocytes, aortic smooth muscle cells (HASMC) and proximal tubular cells (e.g., HK2). Multiple cell types of the same or different origin, as opposed to a single cell type, may be included in the diabetes/obesity/cardiovascular disease model. In one embodiment, at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 45, 50 or more different cell types are included in the diabetes/obesity/cardiovascular disease model. All values presented in the foregoing list can also be the upper or lower limit of ranges that are intended to be a part of this invention, e.g., between 1 and 5, 1 and 10, 2 and 5, or 5 and 15 different cell types.

In one embodiment, the cells are all of the same type, e.g., all adipocytes, but are different established cell lines, e.g., different established adipocyte cell lines. Numerous other cell types that are involved in the diabetes/obesity/cardiovascular disease state are known in the art, and any such cells may be included in a diabetes/obesity/cardiovascular disease model of the invention.

Diabetes/obesity/cardiovascular disease cell models of the invention may include one or more “control cells.” In one embodiment, a control cell may be an untreated or unperturbed disease-relevant cell, e.g., a cell that is not exposed to a hyperlipidemic or hyperinsulinemic condition. In another embodiment, a “control cell” may be a non-disease relevant cell, such as an epithelial cell. Any one of numerous non-disease relevant cells may be included in the cell model. In one embodiment, at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 45, 50 or more different non-disease relevant cell types are included in the cell model. All values presented in the foregoing list can also be the upper or lower limit of ranges that are intended to be a part of this invention, e.g., between 1 and 5, 1 and 10, 2 and 5, or 5 and 15 different non-disease relevant cell lines or cell types. In one embodiment, the non-disease relevant cells are all of the same type, e.g., all healthy epithelial or breast cells, but are different established cell lines, e.g., different established cell lines of epithelial or breast cells. In one embodiment, as a control, one or more non-disease relevant cell lines are cultured under similar conditions, and/or are exposed to the same perturbation, as the disease relevant cells of the cell model in order to identify proteins or pathways unique to diabetes/obesity/cardiovascular disease.

A custom diabetes/obesity/cardiovascular disease model may also comprise cell culture conditions that mimic a characteristic aspect of (represent the pathophysiology of) the diabetes/obesity/cardiovascular disease state or process. For example, cell culture conditions may be selected that closely approximate the conditions of a cell relevant to diabetes/obesity/cardiovascular disease in its environment in vivo, or that mimic various aspects of diabetes/obesity/cardiovascular disease. In some instances the cell culture conditions are stress conditions.

Exemplary conditions that represent the pathophysiology of diabetes/obesity/cardiovascular disease include, for example, any one or more of hypoxia, lactic acid rich conditions, hyperglycemia, hyperlipidemia and hyperinsulinemia. Hypoxia can be induced in cells using art-recognized methods. For example, hypoxia can be induced by placing cell systems in a Modular Incubator Chamber (MIC-101, Billups-Rothenberg Inc., Del Mar, CA), which can be flooded with an industrial gas mix containing 5% CO2, 2% O2 and 93% nitrogen. Effects can be measured after a pre-determined period, e.g., at 24 hours after hypoxia treatment, with and without additional external stimulus components (e.g., CoQ10 at 0, 50, or 100 µM).

Likewise, lactic acid treatment of cells mimics a cellular environment where glycolysis activity is high. Lactic acid induced stress can be investigated at a final lactic acid concentration of about 12.5 mM at a pre-determined time, e.g., at 24 hours, with or without additional external stimulus components (e.g., CoQ10 at 0, 50, or 100 µM).

Hyperglycemia is a condition found in diabetes. Exposing subject cells to a typical hyperglycemic condition may include adding 10% culture grade glucose to suitable media, such that the final concentration of glucose in the media is about 22 mM. Hyperlipidemia is a condition found in obesity and cardiovascular disease. The hyperlipidemic conditions can be provided by culturing cells in media containing 0.15 mM sodium palmitate. Hyperinsulinemia is a condition found in diabetes. The hyperinsulinemic conditions may be induced by culturing the cells in media containing 1000 nM insulin.

Additional conditions that represent the pathophysiology of diabetes/obesity/cardiovascular disease include, for example, any one or more of inflammation, endoplasmic reticulum stress, mitochondrial stress and peroxisomal stress. Methods for creating an inflammatory-like condition in cells are known in the art. For example, an inflammatory condition may be simulated by culturing cells in the presence of TNF-alpha and/or IL-6. Methods for creating conditions simulating endoplasmic reticulum stress are also known in the art. For example, a condition simulating endoplasmic reticulum stress may be created by culturing cells in the presence of thapsigargin and/or tunicamycin. Methods for creating conditions simulating mitochondrial stress are also known in the art. For example, a condition simulating mitochondrial stress may be created by culturing cells in the presence of cin and/or galactose. Methods for creating conditions simulating peroxisomal stress are also known in the art. For example, a condition simulating peroxisomal stress may be created by culturing cells in the presence of abscisic acid.

Individual conditions reflecting different aspects of diabetes/obesity/cardiovascular disease may be investigated separately in the custom built diabetes/obesity/cardiovascular disease model, and/or may be combined together.

In one embodiment, combinations of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50 or more conditions reflecting or simulating different aspects of diabetes/obesity/cardiovascular disease are investigated in the custom built diabetes/obesity/cardiovascular disease model. In one embodiment, individual conditions and, in addition, combinations of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50 or more of the conditions reflecting or simulating different aspects of diabetes/obesity/cardiovascular disease are investigated in the custom built diabetes/obesity/cardiovascular disease model. All values presented in the foregoing list can also be the upper or lower limit of ranges that are intended to be a part of this invention, e.g., between 1 and 5, 1 and 10, 1 and 20, 1 and 30, 2 and 5, 2 and 10, 5 and 10, 1 and 20, 5 and 20, 10 and 20, 10 and 25, 10 and 30 or 10 and 50 different conditions.

Once the custom cell model is built, one or more “perturbations” may be applied to the system, such as genetic variation from patient to patient, or with / without treatment by certain drugs or pro-drugs. See Figure 15D. The effects of such perturbations to the system, including the effect on diabetes/obesity/cardiovascular disease related cells, can be measured using various art-recognized or proprietary means, as described in section III.B below.

In an exemplary experiment, each of adipocytes, myotubes, hepatocytes, aortic smooth muscle cells (HASMC) and proximal tubular cells (HK2), are conditioned in each of hyperglycemia, hypoxia, hyperlipidemia, hyperinsulinemia, and lactic acid-rich conditions, as well as in all combinations of two, three, four and all five conditions, and in addition with or without an environmental perturbation, specifically treatment by Coenzyme Q10. In addition to exemplary combinations of conditions described above in the context of the cancer model, listed herein below are some additional exemplary combinations of conditions, with or without a perturbation, e.g., Coenzyme Q10 treatment, which can be used to treat the diabetes/obesity/cardiovascular disease relevant cells (and/or control cells) of the diabetes/obesity/cardiovascular disease cell model.

These are merely intended to be exemplary, and the skilled artisan will appreciate that any individual and/or combination of the above-mentioned conditions that represent the pathophysiology of diabetes/obesity/cardiovascular disease may be employed in the cell model to produce output data sets. Other combinations can be readily formulated depending on the specific interrogative biological assessment that is being conducted.

1. Media only
2. 50 µM CTL Coenzyme Q10
3. 100 µM CTL Coenzyme Q10
4. 0.15 mM sodium palmitate
5. 0.15 mM sodium palmitate + 50 µM CTL Coenzyme Q10
6. 0.15 mM sodium palmitate + 100 µM CTL Coenzyme Q10
7. 1000 nM insulin
8. 1000 nM insulin + 50 µM CTL Coenzyme Q10
9. 1000 nM insulin + 100 µM CTL Coenzyme Q10
10. 1000 nM insulin + 0.15 mM sodium palmitate
11. 1000 nM insulin + 0.15 mM sodium palmitate + 50 µM CTL Coenzyme Q10
12. 1000 nM insulin + 0.15 mM sodium palmitate + 100 µM CTL Coenzyme Q10

In certain situations, cross talk or ECS experiments between different disease-relevant cells (e.g., HASMC and HK2 cells, or liver cells and adipocytes) may be conducted for several inter-related purposes. In some embodiments that involve cross talk, experiments conducted on the cell models are designed to determine modulation of cellular state or function of one cell system or population (e.g., liver cells) by another cell system or population (e.g., adipocytes) under defined treatment conditions (e.g., hyperglycemia, hypoxia, hyperlipidemia, hyperinsulinemia). According to a typical setting, a first cell system / population is contacted by an external stimulus component, such as a candidate molecule (e.g., a small drug molecule, a protein) or a candidate condition (e.g., hypoxia, high glucose environment). In response, the first cell system / population changes its transcriptome, proteome, metabolome, and/or interactome, leading to changes that can be readily measured both inside and outside the cell. For example, changes in transcriptome can be measured by the transcription level of a plurality of target mRNAs; changes in proteome can be measured by the expression level of a plurality of target proteins; and changes in metabolome can be measured by the level of a plurality of target metabolites by assays designed specifically for given metabolites. Alternatively, the above referenced changes in metabolome and/or proteome, at least with respect to certain secreted metabolites or proteins, can also be measured by their effects on the second cell system / population, including the modulation of the transcriptome, proteome, metabolome, and interactome of the second cell system / population. Therefore, the experiments can be used to identify the effects of the molecule(s) of interest secreted by the first cell system / population on a second cell system / population under different treatment conditions. The experiments can also be used to identify any proteins that are modulated as a result of signaling from the first cell system (in response to the external stimulus component treatment) to another cell system, by, for example, differential screening of proteomics. The same experimental setting can also be adapted for a reverse setting, such that reciprocal effects between the two cell systems can also be assessed. In general, for this type of experiment, the choice of cell line pairs is largely based on factors such as origin, disease state and cellular function.

The custom built diabetes/obesity/cardiovascular disease model may be established and used throughout the steps of the Platform Technology of the invention to ultimately identify a causal relationship unique to the diabetes/obesity/cardiovascular disease state, by carrying out the steps described herein. It will be understood by the skilled artisan, however, that just as with a cancer model, a custom built diabetes/obesity/cardiovascular disease model that is used to generate an initial, “first generation” consensus causal relationship network can continually evolve or expand over time, e.g., by the introduction of additional disease-relevant cell lines and/or additional disease-relevant conditions. Additional data from the evolved diabetes/obesity/cardiovascular disease model, i.e., data from the newly added portion(s) of the model, can be collected. The new data collected from an expanded or evolved model, i.e., from newly added portion(s) of the model, can then be introduced to the data sets previously used to generate the “first generation” consensus causal relationship network in order to generate a more robust “second generation” consensus causal relationship network. New causal relationships unique to the diabetes/obesity/cardiovascular disease state (or unique to the response of the diabetes/obesity/cardiovascular disease state to a perturbation) can then be identified from the “second generation” consensus causal relationship network. In this way, the evolution of the diabetes/obesity/cardiovascular disease model provides an evolution of the consensus causal relationship networks, thereby providing new and/or more reliable insights into the determinative drivers (or modulators) of the diabetes/obesity/cardiovascular disease state.
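The bookkeeping behind such an evolving model can be sketched, in a non-limiting way, as pooling the newly collected data with the earlier data sets and then comparing the relationships of the two network generations. The sketch below does not perform any network inference; the edge sets are hypothetical stand-ins for the output of whatever inference step is used, and all names are assumptions of this example.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]  # (cause, effect)
Record = Dict[str, float]


def pool_data(first_generation: List[Record], newly_added: List[Record]) -> List[Record]:
    """Combine earlier data sets with data from newly added portions of the model."""
    return list(first_generation) + list(newly_added)


def new_relationships(first_gen_edges: Set[Edge], second_gen_edges: Set[Edge]) -> Set[Edge]:
    """Relationships present in the second generation network but not in the first."""
    return second_gen_edges - first_gen_edges


# Hypothetical measurements (one record per sample).
first_data = [{"A": 1.0, "B": 2.1}, {"A": 0.8, "B": 1.7}]
added_data = [{"A": 1.2, "B": 2.4}]
pooled = pool_data(first_data, added_data)

# Hypothetical edge sets standing in for inferred consensus networks.
first_gen = {("A", "B"), ("B", "C")}
second_gen = {("A", "B"), ("B", "C"), ("C", "D")}

print(len(pooled), sorted(new_relationships(first_gen, second_gen)))
```

The same set difference taken in the opposite direction would list relationships that no longer appear once the additional data are included.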

B. Use of Cell Models for Interrogative Biological Assessments

The methods and cell models provided in the present invention may be used for, or applied to, any number of “interrogative biological assessments.” Use of the methods of the invention for an interrogative biological assessment facilitates the identification of “modulators” or determinative cellular process “drivers” of a biological system.

As used herein, an “interrogative biological assessment” may include the identification of one or more modulators of a biological system, e.g., determinative cellular process “drivers,” (e.g., an increase or decrease in activity of a biological pathway, or key members of the pathway, or key regulators to members of the pathway) associated with the environmental perturbation or external stimulus component, or a causal relationship unique in a biological system or process. It may further include additional steps designed to test or verify whether the identified determinative cellular process drivers are necessary and/or sufficient for the downstream events associated with the environmental perturbation or external stimulus component, including in vivo animal models and/or in vitro tissue culture experiments.

In certain embodiments, the interrogative biological assessment is the diagnosis or staging of a disease state, wherein the identified modulators of a biological system, e.g., determinative cellular process drivers (e.g., cross-talk differentials or causal relationships unique in a biological system or process) represent either disease markers or therapeutic targets that can be subject to therapeutic intervention. The subject interrogative biological assessment is suitable for any disease condition in theory, but may be found particularly useful in areas such as oncology / cancer biology, diabetes, obesity, cardiovascular disease, and neurological conditions (especially neurodegenerative diseases, such as, without limitation, Alzheimer’s disease, Parkinson’s disease, Huntington’s disease, Amyotrophic lateral sclerosis (ALS), and aging related neurodegeneration).

In certain embodiments, the interrogative biological assessment is the determination of the efficacy of a drug, wherein the identified modulators of a biological system, e.g., determinative cellular process driver (e.g., cross-talk differentials or causal relationships unique in a biological system or process) may be the hallmarks of a successful drug, and may in turn be used to identify additional agents, such as MIMs or epishifters, for treating the same disease condition.

In certain embodiments, the interrogative biological assessment is the identification of drug targets for preventing or treating infection (e.g., bacterial or viral infection), wherein the identified determinative cellular process driver (e.g., cellular cross-talk differentials or causal relationships unique in a biological system or process) may be markers/indicators or key biological molecules causative of the infective state, and may in turn be used to identify anti-infective agents.

In certain embodiments, the interrogative biological assessment is the assessment of a molecular effect of an agent, e.g., a drug, on a given disease profile, wherein the identified modulators of a biological system, e.g., determinative cellular process driver (e.g., cellular cross-talk differentials or causal relationships unique in a biological system or process) may be an increase or decrease in activity of one or more biological pathways, or key members of the pathway(s), or key regulators to members of the pathway(s), and may in turn be used, e.g., to predict the therapeutic efficacy of the agent for the given disease.

In certain embodiments, the interrogative biological assessment is the assessment of the toxicological profile of an agent, e.g., a drug, on a cell, tissue, organ or organism, wherein the identified modulators of a biological system, e.g., determinative cellular process driver (e.g., cellular cross-talk differentials or causal relationships unique in a biological system or process) may be indicators of toxicity, e.g., cytotoxicity, and may in turn be used to predict or identify the toxicological profile of the agent. In one embodiment, the identified modulator of a biological system, e.g., determinative cellular process driver (e.g., cellular cross-talk differentials or causal relationships unique in a biological system or process) is an indicator of cardiotoxicity of a drug or drug candidate, and may in turn be used to predict or identify the cardiotoxicological profile of the drug or drug candidate.

In certain embodiments, the ogative biological assessment is the identification of drug targets for preventing or treating a disease or disorder caused by biological s, such as e-causing protozoa, fungi, bacteria, protests, viruses, or toxins, wherein the identified modulators of a biological system, e. g., determinative cellular process driver (e.g., cellular cross-talk differentials or causal relationships unique in a biological system or process) may be markers/indicators or key biological molecules causative of said disease or er, and may in turn be used to identify biodefense .

In certain ments, the interrogative biological assessment is the identification of targets for ging , such as anti-aging cosmetics, wherein the identified modulators of a biological , e. g., determinative cellular process driver (e. g., cellular cross-talk differentials or causal relationships unique in a biological system or process) may be markers or indicators of the aging process, particularly the aging process in skin, and may in turn be used to identify anti-aging agents.

In one exemplary cell model for aging that is used in the methods of the ion to identify targets for anti-aging cosmetics, the cell model comprises an aging epithelial cell that is, for example, treated with UV light (an environmental perturbation or external stimulus component), and/or neonatal cells, which are also optionally treated with UV light. In one embodiment, a cell model for aging comprises a cellular cross- talk system. In one exemplary ll talk system established to identify targets for anti-aging cosmetics, an aging epithelial cell (first cell system) may be d with UV light (an external stimulus component), and changes, e. g., proteomic changes and/or functional changes, in a neonatal cell (second cell ) resulting from contacting the neonatal cells with conditioned medium of the treated aging epithelial cell may be measured, e. g., proteome changes may be measured using conventional quantitative mass spectrometry, or a causal relationship unique in aging may be identified from a causal relationship network generated from the data.

V. Proteomic Sample Analysis

In certain embodiments, the subject method employs large-scale high-throughput quantitative proteomic analysis of hundreds of samples of similar character, and provides the data necessary for identifying the cellular output differentials.

To provide reference samples for relative quantification with the iTRAQ technique, multiple QC pools are created. Two separate QC pools, consisting of aliquots of each sample, are generated from the Cell #1 and Cell #2 samples; these samples are denoted as QCS1 and QCS2, and QCP1 and QCP2, for supernatants and pellets, respectively. In order to allow for protein concentration comparison across the two cell lines, cell pellet aliquots from the QC pools described above are combined in equal volumes to generate reference samples (QCP).

The quantitative proteomics approach is based on stable isotope labeling with the 8-plex iTRAQ reagent and 2D-LC MALDI MS/MS for peptide identification and quantification. Quantification with this technique is relative: peptides and proteins are assigned abundance ratios relative to a reference sample. Common reference samples in multiple iTRAQ experiments facilitate the comparison of samples across multiple iTRAQ experiments.

To implement this analysis scheme, six primary samples and two control pool samples are combined into one 8-plex iTRAQ mix, with the control pool samples labeled with 113 and 117 reagents according to the manufacturer’s suggestions. This mixture of eight samples is then fractionated by two-dimensional liquid chromatography: strong cation exchange (SCX) in the first dimension, and reversed-phase HPLC in the second dimension. The HPLC eluent is directly fractionated onto MALDI plates, and the plates are analyzed on an MDS SCIEX/AB 4800 MALDI TOF/TOF mass spectrometer.

In the absence of additional information, it is assumed that the most important changes in protein expression are those within the same cell types under different treatment conditions. For this reason, primary samples from Cell #1 and Cell #2 are analyzed in separate iTRAQ mixes. To facilitate comparison of protein expression in Cell #1 vs. Cell #2 samples, universal QCP samples are analyzed in the available “iTRAQ slots” not occupied by primary or cell line specific QC samples (QC1 and QC2).

A brief overview of the laboratory procedures employed is provided herein.

A. Protein Extraction From Cell Supernatant Samples

For cell supernatant samples (CSN), proteins from the culture medium are present in a large excess over proteins secreted by the cultured cells. In an attempt to reduce this background, upfront abundant protein depletion is implemented. As specific affinity columns are not available for bovine or horse serum proteins, an anti-human IgY14 column is used. While the antibodies are directed against human proteins, the broad specificity provided by the polyclonal nature of the antibodies is anticipated to accomplish depletion of both bovine and equine proteins present in the cell culture media that is used.

A 200-µL aliquot of the CSN QC material is loaded on a 10-mL IgY14 depletion column before the start of the study to determine the total protein concentration (bicinchoninic acid (BCA) assay) in the flow-through material. The loading volume is then selected to achieve a depleted fraction containing approximately 40 µg total protein.

B. Protein Extraction From Cell Pellets

An aliquot of Cell #1 and Cell #2 is lysed in the “standard” lysis buffer used for the analysis of tissue samples at BGM, and total protein content is determined by the BCA assay. Having established the protein content of these representative cell lysates, all cell pellet samples (including QC samples described in Section 1.1) are processed to cell lysates. Lysate amounts of approximately 40 µg of total protein are carried forward in the processing workflow.

C. Sample Preparation for Mass Spectrometry

Sample preparation follows standard operating procedures and consists of the following:

• Reduction and alkylation of proteins
• Protein clean-up on reversed-phase column (cell pellets only)
• Digestion with trypsin
• iTRAQ labeling
• Strong cation exchange chromatography: collection of six fractions (Agilent 1200 system)
• HPLC fractionation and spotting to MALDI plates (Dionex Ultimate3000/Probot system)

D. MALDI MS and MS/MS

HPLC-MS generally employs online ESI MS/MS strategies. BG Medicine uses an off-line LC-MALDI MS/MS platform that results in better concordance of observed protein sets across the primary samples without the need to inject the same sample multiple times. Following first-pass data collection across all iTRAQ mixes, since the peptide fractions are retained on the MALDI target plates, the samples can be analyzed a second time using a targeted MS/MS acquisition pattern derived from knowledge gained during the first acquisition. In this manner, maximum observation frequency for all of the identified proteins is accomplished (ideally, every protein should be measured in every iTRAQ mix).

E. Data Processing

The data processing within the BGM Proteomics workflow can be separated into those procedures, such as preliminary peptide identification and quantification, that are completed for each iTRAQ mix individually (Section 1.5.1) and those processes (Section 1.5.2), such as final assignment of peptides to proteins and final quantification of proteins, which are not completed until data acquisition is completed for the project.

The main data processing steps within the BGM Proteomics workflow are:

• Peptide identification using the Mascot (Matrix Science) database search engine
• Automated in-house validation of Mascot IDs
• Quantification of peptides and preliminary quantification of proteins
• Expert curation of final dataset
• Final assignment of peptides from each mix into a common set of proteins using the automated PVT tool
• Outlier elimination and final quantification of proteins

(i) Data Processing of Individual iTRAQ Mixes

As each iTRAQ mix is processed through the workflow, the MS/MS spectra are analyzed using proprietary BGM software tools for peptide and protein identifications, as well as initial assessment of quantification information. Based on the results of this preliminary analysis, the quality of the workflow for each primary sample in the mix is judged against a set of BGM performance metrics. If a given sample (or mix) does not pass the specified minimal performance metrics, and additional material is available, that sample is reprocessed in its entirety, and it is data from this second implementation of the workflow that is incorporated in the final dataset.

(ii) Peptide Identification

MS/MS spectra are searched against the Uniprot protein sequence database containing human, bovine, and horse sequences augmented by common contaminant sequences such as porcine trypsin. The details of the Mascot search parameters, including the complete list of modifications, are given in Table 3.

Table 3: Mascot Search Parameters

Precursor mass tolerance: 100 ppm
Fragment mass tolerance: 0.4 Da
Variable modifications: N-term iTRAQ8; Lysine iTRAQ8; Cys carbamidomethyl; Pyro-Glu (N-term); Pyro-Carbamidomethyl Cys (N-term); Deamidation (N only); Oxidation (M)
Enzyme specificity: Fully tryptic
Number of missed tryptic sites allowed: 2
Peptide rank considered: 1

After the Mascot search is complete, an auto-validation procedure is used to promote (i.e., validate) specific Mascot peptide matches. Differentiation between valid and invalid matches is based on the attained Mascot score relative to the expected Mascot score and the difference between the Rank 1 and Rank 2 peptide Mascot scores. The criteria required for validation are somewhat relaxed if the peptide is one of several matched to a single protein in the iTRAQ mix or if the peptide is present in a catalogue of previously validated peptides.

(iii) Peptide and Protein Quantification

The set of validated peptides for each mix is utilized to calculate preliminary protein quantification metrics for each mix. Peptide ratios are calculated by dividing the peak area from the iTRAQ label (i.e., m/z 114, 115, 116, 118, 119, or 121) for each validated peptide by the best representation of the peak area of the reference pool (QC1 or QC2). This peak area is the average of the 113 and 117 peaks, provided both samples pass QC acceptance criteria. Preliminary protein ratios are determined by calculating the median ratio of all “useful” validated peptides matching to that protein. “Useful” peptides are fully iTRAQ labeled (all N-termini are labeled with either Lysine or PyroGlu) and fully Cysteine labeled (i.e., all Cys residues are alkylated with Carbamidomethyl or N-terminal Pyro-cmc).
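The ratio arithmetic described in this subsection can be illustrated with a minimal Python sketch. It is not the proprietary BGM software; the function names, the dictionary layout, and the fallback to a single reference channel when only one passes QC are assumptions made for illustration only.

```python
from statistics import median

REFERENCE_CHANNELS = (113, 117)                    # control pool labels
SAMPLE_CHANNELS = (114, 115, 116, 118, 119, 121)   # primary sample labels


def reference_area(peak_areas, passed_qc):
    """Average the reference-pool peak areas that passed QC acceptance.

    Assumption: if only one reference channel passed QC, that channel
    alone is used as the "best representation" of the reference pool.
    """
    usable = [peak_areas[ch] for ch in REFERENCE_CHANNELS if passed_qc.get(ch, False)]
    if not usable:
        raise ValueError("no reference channel passed QC")
    return sum(usable) / len(usable)


def peptide_ratios(peak_areas, passed_qc):
    """Divide each sample-channel peak area by the reference peak area."""
    ref = reference_area(peak_areas, passed_qc)
    return {ch: peak_areas[ch] / ref for ch in SAMPLE_CHANNELS}


def preliminary_protein_ratio(peptides, channel):
    """Median ratio over the 'useful' validated peptides of one protein."""
    ratios = [p["ratios"][channel] for p in peptides if p["useful"]]
    return median(ratios) if ratios else None
```

Here each peptide is represented as a dictionary with a `ratios` mapping and a boolean `useful` flag (hypothetical field names); peptides that are not fully labeled are simply excluded from the median.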
(iv) Post-acquisition Processing

Once all passes of MS/MS data acquisition are complete for every mix in the project, the data is collated using the three steps discussed below, which are aimed at enabling the results from each primary sample to be simply and meaningfully compared to those of another.

(v) Global Assignment of Peptide Sequences to Proteins

Final assignment of peptide sequences to protein accession numbers is carried out through the proprietary Protein Validation Tool (PVT). The PVT procedure determines the best, minimum non-redundant protein set to describe the entire collection of peptides identified in the project. This is an automated procedure that has been optimized to handle data from a homogeneous taxonomy.

Protein assignments for the supernatant experiments are manually curated in order to deal with the complexities of mixed taxonomies in the database. Since the automated paradigm is not valid for cell cultures grown in bovine and horse serum supplemented media, extensive manual curation is necessary to minimize the ambiguity of the source of any given protein.

(vi) Normalization of Peptide Ratios

The peptide ratios for each sample are normalized based on the method of Vandesompele et al., Genome Biology, 2002, 3(7), research0034.1-11. This procedure is applied to the cell pellet measurements only. For the supernatant samples, quantitative data are not normalized, considering that the largest contribution to peptide identifications comes from the media.

(vii) Final Calculation of Protein Ratios

A standard statistical outlier elimination procedure is used to remove outliers from around each protein median ratio, beyond the 1.96 σ level in the log-transformed data set. Following this elimination process, the final set of protein ratios is (re-)calculated.
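As a rough illustration of the final calculation step, the following Python sketch removes peptide ratios lying more than 1.96 standard deviations from the protein median in log space and then recomputes the median. The choice of log base 2, the use of the sample standard deviation, a single elimination pass, and the function name are all assumptions; the text does not specify them.

```python
import math
from statistics import median, stdev


def final_protein_ratio(ratios, z_cutoff=1.96):
    """Return (final_ratio, kept_ratios) after log-space outlier removal.

    Outliers are peptide ratios whose log2 value lies more than
    z_cutoff standard deviations away from the median log2 ratio.
    """
    logs = [math.log2(r) for r in ratios]
    if len(logs) < 3:
        # Too few peptides to estimate a spread; keep everything.
        return 2 ** median(logs), list(ratios)
    center = median(logs)
    spread = stdev(logs)
    kept = [r for r, x in zip(ratios, logs)
            if spread == 0 or abs(x - center) <= z_cutoff * spread]
    # Recalculate the protein ratio from the retained peptides only.
    return 2 ** median(math.log2(r) for r in kept), kept
```

A single extreme peptide ratio is dropped before the median is recalculated, while the tightly clustered ratios are retained.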

VI. Markers of the Invention and Uses Thereof

The present invention is based, at least in part, on the identification of novel biomarkers that are associated with a biological system, such as a disease process, or response of a biological system to a perturbation, such as a therapeutic agent.

In particular, the invention relates to markers (hereinafter “markers” or “markers of the invention”), which are described in the examples. The invention provides nucleic acids and proteins that are encoded by or correspond to the markers (hereinafter “marker nucleic acids” and “marker proteins,” respectively). These markers are particularly useful in diagnosing disease states; prognosing disease states; developing drug targets for various disease states; screening for the presence of toxicity, preferably drug-induced toxicity, e.g., cardiotoxicity; identifying an agent that causes or is at risk for causing toxicity; identifying an agent that can reduce or prevent drug-induced toxicity; alleviating, reducing or preventing drug-induced cardiotoxicity; and identifying markers predictive of drug-induced cardiotoxicity.

A "marker" is a gene whose altered level of expression in a tissue or cell from its expression level in normal or healthy tissue or cell is associated with a disease state such as cancer, diabetes, obesity, cardiovascular disease, or a toxicity state, such as a drug-induced toxicity, e.g., cardiotoxicity. A “marker nucleic acid” is a nucleic acid (e.g., mRNA, cDNA) encoded by or corresponding to a marker of the invention. Such marker nucleic acids include DNA (e.g., cDNA) comprising the entire or a partial sequence of any of the genes that are markers of the invention or the complement of such a sequence.

Such sequences are known to one of skill in the art and can be found, for example, on the NIH government pubmed website. The marker nucleic acids also include RNA comprising the entire or a partial sequence of any of the gene markers of the invention or the complement of such a sequence, wherein all thymidine residues are replaced with uridine residues. A “marker protein” is a protein encoded by or corresponding to a marker of the invention. A marker protein comprises the entire or a partial sequence of any of the marker proteins of the invention. Such sequences are known to one of skill in the art and can be found, for example, on the NIH government pubmed website.

The terms “protein” and “polypeptide” are used interchangeably.

A "disease state or toxic state associated" body fluid is a fluid which, when in the body of a patient, contacts or passes through sarcoma cells or into which cells or proteins shed from sarcoma cells are capable of passing. Exemplary disease state or toxic state associated body fluids include blood fluids (e.g., whole blood, blood serum, blood having platelets removed therefrom), and are described in more detail below. Disease state or toxic state associated body fluids include, but are not limited to, whole blood, blood having platelets removed therefrom, lymph, prostatic fluid, urine and semen.

The "normal" level of expression of a marker is the level of expression of the marker in cells of a human subject or patient not afflicted with a disease state or a toxicity state.

An “over-expression” or “higher level of expression” of a marker refers to an expression level in a test sample that is greater than the standard error of the assay employed to assess expression, and is preferably at least twice, and more preferably three, four, five, six, seven, eight, nine or ten times the expression level of the marker in a control sample (e.g., sample from a healthy subject not having the marker-associated disease state or toxicity state, e.g., cancer, diabetes, obesity, cardiovascular disease, and cardiotoxicity) and preferably, the average expression level of the marker in several control samples.

A “lower level of expression” of a marker refers to an expression level in a test sample that is at least twice, and more preferably three, four, five, six, seven, eight, nine or ten times lower than the expression level of the marker in a control sample (e.g., sample from a healthy subject not having the marker-associated disease state or toxicity state, e.g., cancer, diabetes, obesity, cardiovascular disease, and cardiotoxicity) and preferably, the average expression level of the marker in several control samples.
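The two definitions above reduce to a fold-change comparison against the average of several control samples. A minimal Python sketch follows; the function name, the default two-fold threshold, and the "unchanged" label are illustrative assumptions, and the check against the standard error of the assay is omitted.

```python
from statistics import mean


def classify_expression(test_level, control_levels, fold=2.0):
    """Label a test-sample expression level relative to control samples.

    'higher'    : at least `fold` times the average control level
    'lower'     : at least `fold` times lower than the average control level
    'unchanged' : anything in between (label chosen for illustration)
    """
    control = mean(control_levels)
    if control <= 0 or test_level < 0:
        raise ValueError("expression levels must be positive")
    if test_level >= fold * control:
        return "higher"
    if test_level <= control / fold:
        return "lower"
    return "unchanged"
```

Setting `fold` to 3, 4, and so on up to 10 corresponds to the progressively more stringent thresholds listed in the definitions.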

A "transcribed polynucleotide" or “nucleotide transcript” is a polynucleotide (e.g., an mRNA, hnRNA, a cDNA, or an analog of such RNA or cDNA) which is complementary to or homologous with all or a portion of a mature mRNA made by transcription of a marker of the invention and normal post-transcriptional processing (e.g., splicing), if any, of the RNA transcript, and reverse transcription of the RNA transcript.

"Complementary" refers to the broad concept of sequence complementarity between regions of two nucleic acid strands or between two regions of the same nucleic acid strand. It is known that an adenine residue of a first nucleic acid region is capable of forming specific hydrogen bonds ("base pairing") with a residue of a second nucleic acid region which is antiparallel to the first region if the residue is thymine or uracil.

Similarly, it is known that a cytosine residue of a first nucleic acid strand is capable of base pairing with a residue of a second nucleic acid strand which is antiparallel to the first strand if the residue is guanine. A first region of a nucleic acid is complementary to a second region of the same or a different nucleic acid if, when the two regions are arranged in an antiparallel fashion, at least one nucleotide residue of the first region is capable of base pairing with a residue of the second region. Preferably, the first region comprises a first portion and the second region comprises a second portion, whereby, when the first and second portions are arranged in an antiparallel fashion, at least about 50%, and preferably at least about 75%, at least about 90%, or at least about 95% of the nucleotide residues of the first portion are capable of base pairing with nucleotide residues in the second portion. More preferably, all nucleotide residues of the first portion are capable of base pairing with nucleotide residues in the second portion.

"Homologous" as used herein, refers to nucleotide sequence similarity between two regions of the same nucleic acid strand or between regions of two different nucleic acid strands. When a nucleotide residue position in both regions is occupied by the same nucleotide residue, then the regions are homologous at that position. A first region is homologous to a second region if at least one nucleotide residue position of each region is occupied by the same residue. Homology between two regions is expressed in terms of the proportion of nucleotide residue positions of the two regions that are occupied by the same nucleotide residue. By way of example, a region having the nucleotide sequence 5'-ATTGCC-3' and a region having the nucleotide sequence 5'-TATGGC-3' share 50% homology. Preferably, the first region comprises a first portion and the second region comprises a second portion, whereby, at least about 50%, and preferably at least about 75%, at least about 90%, or at least about 95% of the nucleotide residue positions of each of the portions are occupied by the same nucleotide residue.

More preferably, all nucleotide residue positions of each of the portions are occupied by the same nucleotide residue.
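The percentages used in the definitions of complementarity and homology can be computed position by position. The following Python sketch is one way to do so for ungapped regions of equal length; the function names are hypothetical, and the first example sequence in the usage note, 5'-ATTGCC-3', is taken here as the partner of 5'-TATGGC-3'.

```python
# Watson-Crick pairs named in the definitions above (A with T or U, C with G).
BASE_PAIRS = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def percent_homology(region_a, region_b):
    """Percentage of positions occupied by the same residue in both regions."""
    if len(region_a) != len(region_b):
        raise ValueError("regions must have equal length")
    same = sum(x == y for x, y in zip(region_a, region_b))
    return 100.0 * same / len(region_a)


def percent_complementarity(first, second):
    """Percentage of residues of `first` that base pair with `second`.

    Both regions are written 5'->3'; reversing `second` arranges the
    two regions in an antiparallel fashion before comparing positions.
    """
    if len(first) != len(second):
        raise ValueError("regions must have equal length")
    paired = sum((x, y) in BASE_PAIRS for x, y in zip(first, reversed(second)))
    return 100.0 * paired / len(first)
```

For example, `percent_homology("ATTGCC", "TATGGC")` gives 50.0 because three of the six positions match, and `percent_complementarity("ATTGCC", "GGCAAT")` gives 100.0 because the second region is the reverse complement of the first.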

“Proteins of the invention” encompass marker proteins and their fragments; variant marker proteins and their fragments; peptides and polypeptides comprising an at least 15 amino acid segment of a marker or variant marker protein; and fusion proteins comprising a marker or variant marker protein, or an at least 15 amino acid segment of a marker or variant marker protein.

The invention further provides antibodies, antibody derivatives and antibody fragments which specifically bind with the marker proteins and fragments of the marker proteins of the present invention. Unless otherwise specified herein, the terms “antibody” and “antibodies” broadly encompass naturally-occurring forms of antibodies (e.g., IgG, IgA, IgM, IgE) and recombinant antibodies such as single-chain antibodies, chimeric and humanized antibodies and multi-specific antibodies, as well as fragments and derivatives of all of the foregoing, which fragments and derivatives have at least an antigenic binding site. Antibody derivatives may comprise a protein or chemical moiety conjugated to an antibody.

In certain embodiments, the markers of the invention include one or more genes (or proteins) selected from the group consisting of HSPA8, FLNB, PARK7, HSPA1A/HSPA1B, ST13, TUBB3, MIF, KARS, NARS, LGALS1, DDX17, EIF5A, HSPA5, DHX9, HNRNPC, CKAP4, HSPA9, PARP1, HADHA, PHB2, ATP5A1, CANX, GRP78, GRP75, TIMP1, PTX3, HSP76, PDIA4, PDIA1, CA2D1, GPAT1 and TAZ. In some embodiments, the markers are a combination of at least two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, fourteen, fifteen, sixteen, seventeen, eighteen, nineteen, twenty, twenty-five, thirty, or more of the foregoing genes (or proteins). All values presented in the foregoing list can also be the upper or lower limit of ranges that are intended to be a part of this invention, e.g., between 1 and 5, 1 and 10, 1 and 20, 1 and 30, 2 and 5, 2 and 10, 5 and 10, 1 and 20, 5 and 20, 10 and 20, 10 and 25, 10 and 30 of the foregoing genes (or proteins).

In one embodiment, the markers of the invention are genes or proteins associated with or involved in cancer. Such genes or proteins involved in cancer include, for example, HSPA8, FLNB, PARK7, HSPA1A/HSPA1B, ST13, TUBB3, MIF, KARS, NARS, LGALS1, DDX17, EIF5A, HSPA5, DHX9, HNRNPC, CKAP4, HSPA9, PARP1, HADHA, PHB2, ATP5A1, and/or CANX. In some embodiments, the markers of the invention are a combination of at least two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, fourteen, fifteen, sixteen, seventeen, eighteen, nineteen, twenty or more of the foregoing genes (or proteins). All values presented in the foregoing list can also be the upper or lower limit of ranges that are intended to be a part of this invention, e.g., between 1 and 5, 1 and 10, 1 and 20, 1 and 30, 2 and 5, 2 and 10, 5 and 10, 1 and 20, 5 and 20, 10 and 20, 10 and 25, 10 and 30 of the foregoing genes (or proteins).

In one embodiment, the markers of the invention are genes or proteins associated with or involved in drug-induced toxicity. Such genes or proteins involved in drug-induced toxicity include, for example, GRP78, GRP75, TIMP1, PTX3, HSP76, PDIA4, PDIA1, CA2D1, GPAT1 and/or TAZ. In some embodiments, the markers of the invention are a combination of at least two, three, four, five, six, seven, eight, nine, ten of the foregoing genes (or proteins). All values presented in the foregoing list can also be the upper or lower limit of ranges that are intended to be a part of this invention, e.g., between 1 and 5, 1 and 10, 1 and 20, 1 and 30, 2 and 5, 2 and 10, 5 and 10, 1 and 20, 5 and 20, 10 and 20, 10 and 25, 10 and 30 of the foregoing genes (or proteins).

A. Cardiotoxicity Associated Markers

The present invention is based, at least in part, on the identification of novel biomarkers that are associated with drug-induced cardiotoxicity. The invention is further based, at least in part, on the discovery that Coenzyme Q10 is capable of reducing or preventing drug-induced cardiotoxicity. Accordingly, the invention provides methods for identifying an agent that causes or is at risk for causing toxicity. In one embodiment, the agent is a drug or drug candidate. In one embodiment, the toxicity is drug-induced toxicity, e.g., cardiotoxicity.

In one embodiment, the agent is a drug or drug candidate for treating diabetes, obesity or a cardiovascular disorder. In these methods, the amount of one or more biomarkers/proteins in a pair of samples (a first sample not subjected to the drug treatment, and a second sample subjected to the drug treatment) is assessed. A modulation in the level of expression of the one or more biomarkers in the second sample as compared to the first sample is an indication that the drug causes or is at risk for causing drug-induced toxicity, e.g., cardiotoxicity. In one embodiment, the one or more biomarkers is selected from the group consisting of GRP78, GRP75, TIMP1, PTX3, HSP76, PDIA4, PDIA1, CA2D1, GPAT1 and TAZ. The methods of the present invention can be practiced in conjunction with any other method used by the skilled practitioner to identify a drug at risk for causing drug-induced cardiotoxicity.

Accordingly, in one aspect, the invention provides a method for identifying a drug that causes or is at risk for causing drug-induced toxicity (e.g., cardiotoxicity), comprising: comparing (i) the level of expression of one or more biomarkers present in a first cell sample obtained prior to the treatment with the drug; with (ii) the level of expression of the one or more biomarkers present in a second cell sample obtained following the treatment with the drug; wherein the one or more biomarkers is selected from the group consisting of GRP78, GRP75, TIMP1, PTX3, HSP76, PDIA4, PDIA1, CA2D1, GPAT1 and TAZ; wherein a modulation in the level of expression of the one or more biomarkers in the second sample as compared to the first sample is an indication that the drug causes or is at risk for causing drug-induced toxicity (e.g., cardiotoxicity).

In one embodiment, the drug-induced toxicity is drug-induced cardiotoxicity. In one embodiment, the cells are cells of the cardiovascular system, e.g., cardiomyocytes.

In one embodiment, the cells are diabetic cardiomyocytes. In one embodiment, the drug is a drug or candidate drug for treating diabetes, obesity or cardiovascular disease.

In one embodiment, a modulation (e.g., an increase or a decrease) in the level of expression of one, two, three, four, five, six, seven, eight, nine or all ten of the biomarkers selected from the group consisting of GRP78, GRP75, TIMP1, PTX3, HSP76, PDIA4, PDIA1, CA2D1, GPAT1 and TAZ in the second sample as compared to the first sample is an indication that the drug causes or is at risk for causing drug-induced toxicity.

Methods for identifying an agent that can reduce or prevent drug-induced toxicity are also provided by the invention. In one embodiment, the drug-induced toxicity is cardiotoxicity. In one embodiment, the drug is a drug or drug candidate for treating diabetes, obesity or a cardiovascular disorder. In these methods, the amount of one or more biomarkers in three samples (a first sample not subjected to the drug treatment, a second sample subjected to the drug treatment, and a third sample subjected both to the drug treatment and the agent) is assessed. Approximately the same level of expression of the one or more biomarkers in the third sample as compared to the first sample is an indication that the agent can reduce or prevent drug-induced toxicity, e.g., drug-induced cardiotoxicity. In one embodiment, the one or more biomarkers is selected from the group consisting of GRP78, GRP75, TIMP1, PTX3, HSP76, PDIA4, PDIA1, CA2D1, GPAT1 and TAZ.

Using the methods described herein, a variety of molecules, particularly including molecules sufficiently small to be able to cross the cell membrane, may be screened in order to identify molecules which modulate, e.g., increase or decrease the expression and/or activity of a marker of the invention. Compounds so identified can be provided to a subject in order to reduce, alleviate or prevent drug-induced toxicity in the subject.

Accordingly, in another aspect, the invention provides a method for identifying an agent that can reduce or prevent drug-induced toxicity comprising: (i) determining the level of expression of one or more biomarkers present in a first cell sample obtained prior to the treatment with a toxicity inducing drug; (ii) determining the level of expression of the one or more biomarkers present in a second cell sample obtained following the treatment with the toxicity inducing drug; (iii) determining the level of expression of the one or more biomarkers present in a third cell sample obtained following the treatment with the toxicity inducing drug and the agent; and (iv) comparing the level of expression of the one or more biomarkers present in the third sample with the first sample; wherein the one or more biomarkers is selected from the group consisting of GRP78, GRP75, TIMP1, PTX3, HSP76, PDIA4, PDIA1, CA2D1, GPAT1 and TAZ; and wherein about the same level of expression of the one or more biomarkers in the third sample as compared to the first sample is an indication that the agent can reduce or prevent drug-induced toxicity.
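The three-sample comparison in steps (i) to (iv) can be sketched in Python as follows. The text does not quantify "modulation" or "about the same level", so the two-fold modulation threshold, the 20% tolerance, the function name, and the input layout are purely illustrative assumptions.

```python
def markers_restored_by_agent(levels, fold=2.0, tolerance=0.2):
    """Return biomarkers modulated by the drug and restored by the agent.

    `levels` maps a biomarker name to a tuple of expression levels:
    (first sample: untreated, second: drug only, third: drug plus agent).
    """
    restored = []
    for marker, (first, second, third) in levels.items():
        # Drug-induced modulation: an increase or decrease versus untreated.
        modulated = second >= fold * first or second <= first / fold
        # "About the same" as untreated, within an assumed relative tolerance.
        about_same = abs(third - first) <= tolerance * first
        if modulated and about_same:
            restored.append(marker)
    return restored
```

A non-empty result would be read, under these assumed thresholds, as an indication that the agent can reduce or prevent the drug-induced change for the listed biomarkers.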

In one embodiment, the cells are diabetic cardiomyocytes. In one embodiment, the drug is a drug or candidate drug for treating diabetes, obesity or cardiovascular disease.

In one embodiment, about the same level of expression of one, two, three, four, five, six, seven, eight, nine or all ten of the biomarkers selected from the group consisting of GRP78, GRP75, TIMP1, PTX3, HSP76, PDIA4, PDIA1, CA2D1, GPAT1 and TAZ in the third sample as compared to the first sample is an indication that the agent can reduce or prevent drug-induced toxicity.

The invention further provides methods for alleviating, reducing or preventing drug-induced cardiotoxicity in a subject in need thereof, comprising administering to a subject (e.g., a mammal, a human, or a non-human animal) an agent identified by the screening methods provided herein, thereby reducing or preventing drug-induced toxicity in the subject. In one embodiment, the agent is administered to a subject that has already been treated with a cardiotoxicity-inducing drug. In one embodiment, the agent is administered to a subject at the same time as treatment of the subject with a cardiotoxicity-inducing drug. In one embodiment, the agent is administered to a subject prior to treatment of the subject with a toxicity-inducing drug.

The invention further provides methods for alleviating, reducing or preventing drug-induced cardiotoxicity in a subject in need thereof, comprising administering Coenzyme Q10 to the subject (e.g., a mammal, a human, or a non-human animal), thereby reducing or preventing drug-induced cardiotoxicity in the subject. In one embodiment, the Coenzyme Q10 is administered to a subject that has already been treated with a cardiotoxicity-inducing drug. In one embodiment, the Coenzyme Q10 is administered to a subject at the same time as treatment of the subject with a cardiotoxicity-inducing drug. In one embodiment, the Coenzyme Q10 is administered to a subject prior to treatment of the subject with a cardiotoxicity-inducing drug. In one embodiment, the drug-induced cardiotoxicity is associated with modulation of expression of one, two, three, four, five, six, seven, eight, nine or all ten of the biomarkers selected from the group consisting of GRP78, GRP75, TIMP1, PTX3, HSP76, PDIA4, PDIA1, CA2D1, GPAT1 and TAZ. All values presented in the foregoing list can also be the upper or lower limit of ranges that are intended to be a part of this invention, e.g., between 1 and 5, 1 and 10, 2 and 5, 2 and 10, or 5 and 10 of the foregoing genes (or proteins).

The invention further provides biomarkers (e.g., genes and/or proteins) that are useful as predictive markers for cardiotoxicity, e.g., drug-induced cardiotoxicity. These biomarkers include GRP78, GRP75, TIMP1, PTX3, HSP76, PDIA4, PDIA1, CA2D1, GPAT1 and TAZ. The ordinary skilled artisan would, however, be able to identify additional biomarkers predictive of drug-induced cardiotoxicity by employing the methods described herein, e.g., by carrying out the methods described in Example 3 but by using a different drug known to induce cardiotoxicity. Exemplary drug-induced cardiotoxicity biomarkers of the invention are further described below.

GRP78 and GRP75 are also referred to as glucose response proteins. These proteins are associated with arcoplasmic reticulum stress (ER ) of cardiomyocytes. SERCA, or sarcoendoplasmic reticulum calcium ATPase, regulates Ca2+ homeostatsis in c cells. Any tion of these ATPase can lead to cardiac dysfunction and heart failure. Based upon the data provided herein, GRP75 and GRP78 and the edges around them are novel predictors of drug induced cardiotoxicity.

TIMP1, also referred to as TIMP metalloprotease inhibitor 1, is involved with remodeling of extracellular matrix in association with MMPs. TIMP1 expression is correlated with fibrosis of the heart, and hypoxia of vascular endothelial cells also induces TIMP1 expression. Based upon the data provided herein, TIMP1 is a novel predictor of drug-induced cardiotoxicity.

PTX3, also referred to as Pentraxin 3, belongs to the family of C Reactive Proteins (CRP) and is a good marker of an inflammatory condition of the heart. However, plasma PTX3 could also be representative of systemic inflammatory response due to sepsis or other medical conditions. Based upon the data provided herein, PTX3 may be a novel marker of cardiac function or cardiotoxicity. Additionally, the edges associated with PTX3 in the network could form a novel panel of biomarkers.

HSP76, also referred to as HSPA6, is only known to be expressed in endothelial cells and B lymphocytes. There is no known role for this protein in cardiac function. Based upon the data provided herein, HSP76 may be a novel predictor of drug-induced cardiotoxicity.

PDIA4 and PDIA1, also referred to as protein disulphide isomerase family A proteins, are associated with ER stress response, like GRPs. There is no known role for these proteins in cardiac function. Based upon the data provided herein, these proteins may be novel predictors of drug-induced cardiotoxicity.

CA2D1 is also referred to as calcium channel, voltage-dependent, alpha 2/delta subunit. The alpha-2/delta subunit of voltage-dependent calcium channel regulates calcium current density and activation/inactivation kinetics of the calcium channel. CA2D1 plays an important role in excitation-contraction coupling in the heart. There is no known role for this protein in cardiac function. Based upon the data provided herein, CA2D1 is a novel predictor of drug-induced cardiotoxicity.

GPAT1 is one of four known glycerol-3-phosphate acyltransferase isoforms, and is located on the mitochondrial outer membrane, allowing reciprocal regulation with carnitine palmitoyltransferase-1. GPAT1 is upregulated transcriptionally by insulin and SREBP-1c and downregulated acutely by AMP-activated protein kinase, consistent with a role in triacylglycerol synthesis. Based upon the data provided herein, GPAT1 is a novel predictor of drug-induced cardiotoxicity.

TAZ, also referred to as Tafazzin, is highly expressed in cardiac and skeletal muscle. TAZ is involved in the metabolism of cardiolipin and functions as a phospholipid-lysophospholipid transacylase. Tafazzin is responsible for remodeling of the phospholipid cardiolipin (CL), the signature lipid of the mitochondrial inner membrane. Based upon the data provided herein, TAZ is a novel predictor of drug-induced cardiotoxicity.

B. Cancer Associated Markers

The present invention is based, at least in part, on the identification of novel biomarkers that are associated with cancer. Such markers associated with cancer include, for example, HSPA8, FLNB, PARK7, HSPA1A/HSPA1B, ST13, TUBB3, MIF, KARS, NARS, LGALS1, DDX17, EIF5A, HSPA5, DHX9, HNRNPC, CKAP4, HSPA9, PARP1, HADHA, PHB2, ATP5A1, and/or CANX. In some embodiments, the markers of the invention are a combination of at least two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, fourteen, fifteen, sixteen, seventeen, eighteen, nineteen, twenty or more of the foregoing markers.

Accordingly, the invention provides methods for identifying an agent that causes or is at risk for causing cancer. In one embodiment, the agent is a drug or drug candidate. In these methods, the amount of one or more biomarkers/proteins in a pair of samples (a first sample not subject to the drug treatment, and a second sample subjected to the drug treatment) is assessed. A modulation in the level of expression of the one or more biomarkers in the second sample as compared to the first sample is an indication that the drug causes or is at risk for causing cancer. In one embodiment, the one or more biomarkers is selected from the group consisting of HSPA8, FLNB, PARK7, HSPA1A/HSPA1B, ST13, TUBB3, MIF, KARS, NARS, LGALS1, DDX17, EIF5A, HSPA5, DHX9, HNRNPC, CKAP4, HSPA9, PARP1, HADHA, PHB2, ATP5A1, and CANX. The methods of the present invention can be practiced in conjunction with any other method used by the skilled practitioner to identify a drug at risk for causing the cancer.
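
The paired-sample comparison described above can be illustrated with a minimal sketch. It is not the method of the specification: the function name, the log2 fold-change cutoff, and the expression values are all hypothetical and chosen only for illustration; the text itself does not state how large a change counts as a "modulation".

```python
import math


def modulated_markers(first_sample, second_sample, min_abs_log2_fc=0.58):
    """Return {marker: log2 fold change} for markers whose level differs
    between the untreated (first) and treated (second) sample.

    The 0.58 cutoff (about a 1.5-fold change) is an assumed value.
    """
    flagged = {}
    for marker, baseline in first_sample.items():
        treated = second_sample.get(marker)
        # Skip markers that are missing or non-positive in either sample.
        if treated is None or baseline <= 0 or treated <= 0:
            continue
        log2_fc = math.log2(treated / baseline)
        if abs(log2_fc) >= min_abs_log2_fc:
            flagged[marker] = log2_fc
    return flagged


# Hypothetical expression levels (arbitrary units), for illustration only.
untreated = {"HSPA5": 100.0, "MIF": 50.0, "CANX": 80.0}
treated = {"HSPA5": 200.0, "MIF": 52.0, "CANX": 20.0}
result = modulated_markers(untreated, treated)
```

With these made-up values, HSPA5 (two-fold increase) and CANX (four-fold decrease) are flagged as modulated, while MIF is not.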

In one aspect, the invention provides methods for assessing the efficacy of a therapy for treating a cancer in a subject, the method comprising: comparing the level of expression of one or more markers present in a first sample obtained from the subject prior to administering at least a portion of the treatment regimen to the subject, wherein the one or more markers is selected from the group consisting of HSPA8, FLNB, PARK7, HSPA1A/HSPA1B, ST13, TUBB3, MIF, KARS, NARS, LGALS1, DDX17, EIF5A, HSPA5, DHX9, HNRNPC, CKAP4, HSPA9, PARP1, HADHA, PHB2, ATP5A1, and CANX; and the level of expression of the one or more markers present in a second sample obtained from the subject following administration of at least a portion of the treatment regimen, wherein a modulation in the level of expression of the one or more markers in the second sample as compared to the first sample is an indication that the therapy is efficacious for treating the cancer in the subject.

In one embodiment, the sample comprises a fluid obtained from the subject. In one embodiment, the fluid is selected from the group consisting of blood fluids, vomit, saliva, lymph, cystic fluid, urine, fluids collected by bronchial lavage, fluids collected by peritoneal rinsing, and gynecological fluids. In one embodiment, the sample is a blood sample or a component thereof.

In another embodiment, the sample comprises a tissue or component thereof obtained from the subject. In one embodiment, the tissue is selected from the group consisting of bone, connective tissue, cartilage, lung, liver, kidney, muscle tissue, heart, pancreas, and skin.

In one embodiment, the subject is a human.

In one embodiment, the level of expression of the one or more markers in the biological sample is determined by assaying a transcribed polynucleotide or a portion thereof in the sample. In one embodiment, assaying the transcribed polynucleotide comprises amplifying the transcribed polynucleotide.

In one embodiment, the level of expression of the marker in the subject sample is determined by assaying a protein or a portion thereof in the sample. In one embodiment, the protein is assayed using a reagent which specifically binds with the protein.

In one embodiment, the level of expression of the one or more markers in the sample is determined using a technique selected from the group consisting of polymerase chain reaction (PCR) amplification reaction, reverse-transcriptase PCR analysis, single-strand conformation polymorphism analysis (SSCP), mismatch cleavage detection, heteroduplex analysis, Southern blot analysis, Northern blot analysis, Western blot analysis, in situ hybridization, array analysis, deoxyribonucleic acid sequencing, restriction fragment length polymorphism analysis, and combinations or sub-combinations thereof, of said sample.

In one embodiment, the level of expression of the marker in the sample is determined using a technique selected from the group consisting of immunohistochemistry, immunocytochemistry, flow cytometry, ELISA and mass spectrometry.

In one embodiment, the level of expression of a plurality of markers is determined.

In one embodiment, the subject is being treated with a therapy selected from the group consisting of an environmental influencer compound, surgery, radiation, hormone therapy, antibody therapy, therapy with growth factors, cytokines, chemotherapy, and allogenic stem cell therapy. In one embodiment, the environmental influencer compound is a Coenzyme Q10 molecule.

The invention further provides methods of assessing whether a subject is afflicted with a cancer, the method comprising: determining the level of expression of one or more markers present in a biological sample obtained from the subject, wherein the one or more markers is selected from the group consisting of HSPA8, FLNB, PARK7, HSPA1A/HSPA1B, ST13, TUBB3, MIF, KARS, NARS, LGALS1, DDX17, EIF5A, HSPA5, DHX9, HNRNPC, CKAP4, HSPA9, PARP1, HADHA, PHB2, ATP5A1, and CANX; and comparing the level of expression of the one or more markers present in the biological sample obtained from the subject with the level of expression of the one or more markers present in a control sample, wherein a modulation in the level of expression of the one or more markers in the biological sample obtained from the subject relative to the level of expression of the one or more markers in the control sample is an indication that the subject is afflicted with cancer, thereby assessing whether the subject is afflicted with the cancer.

In one embodiment, the sample comprises a fluid obtained from the subject. In one embodiment, the fluid is selected from the group consisting of blood fluids, vomit, saliva, lymph, cystic fluid, urine, fluids collected by bronchial lavage, fluids collected by peritoneal rinsing, and gynecological fluids. In one embodiment, the sample is a blood sample or a component thereof.

In another embodiment, the sample comprises a tissue or component thereof obtained from the subject. In one embodiment, the tissue is selected from the group consisting of bone, connective tissue, cartilage, lung, liver, kidney, muscle tissue, heart, pancreas, and skin.

In one embodiment, the subject is a human.

In one embodiment, the level of expression of the one or more markers in the biological sample is determined by assaying a transcribed polynucleotide or a portion thereof in the sample. In one embodiment, assaying the transcribed polynucleotide comprises amplifying the transcribed polynucleotide.

In one embodiment, the level of expression of the marker in the subject sample is determined by assaying a protein or a portion thereof in the sample. In one embodiment, the protein is assayed using a reagent which specifically binds with the protein.

In one embodiment, the level of expression of the one or more markers in the sample is determined using a technique selected from the group consisting of polymerase chain reaction (PCR) amplification reaction, reverse-transcriptase PCR analysis, single-strand conformation polymorphism analysis (SSCP), mismatch cleavage detection, heteroduplex analysis, Southern blot analysis, Northern blot analysis, Western blot analysis, in situ hybridization, array analysis, deoxyribonucleic acid sequencing, restriction fragment length polymorphism analysis, and combinations or sub-combinations thereof, of said sample.

In one embodiment, the level of expression of the marker in the sample is determined using a technique selected from the group consisting of immunohistochemistry, immunocytochemistry, flow cytometry, ELISA and mass spectrometry.

In one embodiment, the subject is being treated with a therapy selected from the group consisting of an environmental influencer compound, surgery, radiation, hormone therapy, antibody therapy, therapy with growth factors, cytokines, chemotherapy, and allogenic stem cell therapy. In one embodiment, the environmental influencer compound is a Coenzyme Q10 molecule.

The invention further provides methods of prognosing whether a subject is predisposed to developing a cancer, the method comprising: determining the level of expression of one or more markers present in a biological sample obtained from the subject, wherein the one or more markers is selected from the group consisting of HSPA8, FLNB, PARK7, HSPA1A/HSPA1B, ST13, TUBB3, MIF, KARS, NARS, LGALS1, DDX17, EIF5A, HSPA5, DHX9, HNRNPC, CKAP4, HSPA9, PARP1, HADHA, PHB2, ATP5A1, and CANX; and comparing the level of expression of the one or more markers present in the biological sample obtained from the subject with the level of expression of the one or more markers present in a control sample, wherein a modulation in the level of expression of the one or more markers in the biological sample obtained from the subject relative to the level of expression of the one or more markers in the control sample is an indication that the subject is predisposed to developing cancer, thereby prognosing whether the subject is predisposed to developing the cancer.

In one embodiment, the sample comprises a fluid obtained from the subject. In one embodiment, the fluid is selected from the group consisting of blood fluids, vomit, saliva, lymph, cystic fluid, urine, fluids collected by bronchial lavage, fluids collected by peritoneal rinsing, and gynecological fluids. In one embodiment, the sample is a blood sample or a component thereof.

In another embodiment, the sample comprises a tissue or component thereof obtained from the subject. In one embodiment, the tissue is selected from the group consisting of bone, connective tissue, cartilage, lung, liver, kidney, muscle tissue, heart, pancreas, and skin.

In one embodiment, the subject is a human.

In one embodiment, the level of expression of the one or more markers in the biological sample is determined by assaying a transcribed polynucleotide or a portion thereof in the sample. In one embodiment, assaying the transcribed polynucleotide comprises amplifying the transcribed polynucleotide.

In one embodiment, the level of expression of the one or more markers in the sample is determined using a technique selected from the group consisting of polymerase chain reaction (PCR) amplification reaction, reverse-transcriptase PCR analysis, single-strand conformation polymorphism analysis (SSCP), mismatch cleavage detection, heteroduplex analysis, Southern blot analysis, Northern blot analysis, Western blot analysis, in situ hybridization, array analysis, deoxyribonucleic acid sequencing, restriction fragment length polymorphism analysis, and combinations or sub-combinations thereof, of said sample.

In one embodiment, the level of expression of the marker in the sample is determined using a technique selected from the group consisting of immunohistochemistry, immunocytochemistry, flow cytometry, ELISA and mass spectrometry.

In one embodiment, the subject is being treated with a therapy selected from the group consisting of an environmental influencer compound, surgery, radiation, hormone therapy, antibody therapy, therapy with growth factors, cytokines, chemotherapy, and allogenic stem cell therapy. In one embodiment, the environmental influencer compound is a Coenzyme Q10 molecule.

The invention further provides methods of prognosing the recurrence of a cancer in a subject, the method comprising: determining the level of expression of one or more markers present in a biological sample obtained from the subject, wherein the one or more markers is selected from the group consisting of HSPA8, FLNB, PARK7, HSPA1A/HSPA1B, ST13, TUBB3, MIF, KARS, NARS, LGALS1, DDX17, EIF5A, HSPA5, DHX9, HNRNPC, CKAP4, HSPA9, PARP1, HADHA, PHB2, ATP5A1, and CANX; and comparing the level of expression of the one or more markers present in the biological sample obtained from the subject with the level of expression of the one or more markers present in a control sample, wherein a modulation in the level of expression of the one or more markers in the biological sample obtained from the subject relative to the level of expression of the one or more markers in the control sample is an indication of the recurrence of cancer, thereby prognosing the recurrence of the cancer in the subject.

In one embodiment, the sample comprises a fluid obtained from the subject. In one embodiment, the fluid is selected from the group consisting of blood fluids, vomit, saliva, lymph, cystic fluid, urine, fluids collected by bronchial lavage, fluids collected by peritoneal rinsing, and gynecological fluids. In one embodiment, the sample is a blood sample or a component thereof.

In another embodiment, the sample comprises a tissue or component thereof obtained from the subject. In one embodiment, the tissue is selected from the group consisting of bone, connective tissue, cartilage, lung, liver, kidney, muscle tissue, heart, pancreas, and skin.

In one embodiment, the subject is a human.

In one embodiment, the level of expression of the one or more markers in the biological sample is determined by assaying a transcribed polynucleotide or a portion thereof in the sample. In one embodiment, assaying the transcribed polynucleotide comprises amplifying the transcribed polynucleotide.

In one embodiment, the level of expression of the marker in the subject sample is determined by assaying a protein or a portion thereof in the sample. In one embodiment, the protein is assayed using a reagent which specifically binds with the protein.

In one embodiment, the level of expression of the one or more markers in the sample is determined using a technique selected from the group consisting of polymerase chain reaction (PCR) amplification reaction, reverse-transcriptase PCR analysis, single-strand conformation polymorphism analysis (SSCP), mismatch cleavage detection, heteroduplex analysis, Southern blot analysis, Northern blot analysis, Western blot analysis, in situ hybridization, array analysis, deoxyribonucleic acid sequencing, restriction fragment length polymorphism analysis, and combinations or sub-combinations thereof, of said sample.

In one embodiment, the level of expression of the marker in the sample is determined using a technique selected from the group consisting of immunohistochemistry, immunocytochemistry, flow cytometry, ELISA and mass spectrometry.

In one embodiment, the level of expression of a plurality of markers is determined.

In one embodiment, the subject is being treated with a therapy selected from the group consisting of an environmental influencer compound, surgery, radiation, hormone therapy, antibody therapy, therapy with growth factors, cytokines, chemotherapy, and allogenic stem cell therapy. In one embodiment, the environmental influencer compound is a Coenzyme Q10 molecule.

The invention further provides methods of prognosing the survival of a subject with a cancer, the method comprising: determining the level of expression of one or more markers present in a biological sample obtained from the subject, wherein the one or more markers is selected from the group consisting of HSPA8, FLNB, PARK7, HSPA1A/HSPA1B, ST13, TUBB3, MIF, KARS, NARS, LGALS1, DDX17, EIF5A, HSPA5, DHX9, HNRNPC, CKAP4, HSPA9, PARP1, HADHA, PHB2, ATP5A1, and CANX; and comparing the level of expression of the one or more markers present in the biological sample obtained from the subject with the level of expression of the one or more markers present in a control sample, wherein a modulation in the level of expression of the one or more markers in the biological sample obtained from the subject relative to the level of expression of the one or more markers in the control sample is an indication of survival of the subject, thereby prognosing survival of the subject with the cancer.

In one embodiment, the sample comprises a fluid obtained from the subject. In one embodiment, the fluid is selected from the group consisting of blood fluids, vomit, saliva, lymph, cystic fluid, urine, fluids collected by bronchial lavage, fluids collected by peritoneal rinsing, and gynecological fluids. In one embodiment, the sample is a blood sample or a component thereof.

In another embodiment, the sample comprises a tissue or component thereof obtained from the subject. In one embodiment, the tissue is selected from the group consisting of bone, connective tissue, cartilage, lung, liver, kidney, muscle tissue, heart, pancreas, and skin.

In one embodiment, the subject is a human.

In one embodiment, the level of expression of the one or more markers in the biological sample is determined by assaying a transcribed polynucleotide or a portion thereof in the sample. In one embodiment, assaying the transcribed polynucleotide comprises amplifying the transcribed polynucleotide.

In one embodiment, the level of expression of the marker in the subject sample is determined by assaying a protein or a portion thereof in the sample. In one embodiment, the protein is assayed using a reagent which specifically binds with the protein.

In one embodiment, the level of expression of the one or more markers in the sample is determined using a technique selected from the group consisting of polymerase chain reaction (PCR) amplification reaction, reverse-transcriptase PCR analysis, single-strand conformation polymorphism analysis (SSCP), mismatch cleavage detection, heteroduplex analysis, Southern blot analysis, Northern blot analysis, Western blot analysis, in situ hybridization, array analysis, deoxyribonucleic acid sequencing, restriction fragment length polymorphism analysis, and combinations or sub-combinations thereof, of said sample.

In one embodiment, the subject is being treated with a therapy selected from the group consisting of an environmental influencer compound, surgery, radiation, hormone therapy, antibody therapy, therapy with growth factors, cytokines, chemotherapy, and allogenic stem cell therapy. In one embodiment, the environmental influencer compound is a Coenzyme Q10 molecule.

The invention further provides methods of monitoring the progression of a cancer in a subject, the method comprising: comparing the level of expression of one or more markers present in a first sample obtained from the subject prior to administering at least a portion of a treatment regimen to the subject and the level of expression of the one or more markers present in a second sample obtained from the subject following administration of at least a portion of the treatment regimen, wherein the one or more markers is selected from the group consisting of HSPA8, FLNB, PARK7, HSPA1A/HSPA1B, ST13, TUBB3, MIF, KARS, NARS, LGALS1, DDX17, EIF5A, HSPA5, DHX9, HNRNPC, CKAP4, HSPA9, PARP1, HADHA, PHB2, ATP5A1, and CANX, thereby monitoring the progression of the cancer in the subject.

In one embodiment, the sample comprises a fluid obtained from the subject. In one embodiment, the fluid is selected from the group consisting of blood fluids, vomit, saliva, lymph, cystic fluid, urine, fluids collected by bronchial lavage, fluids collected by peritoneal rinsing, and gynecological fluids. In one embodiment, the sample is a blood sample or a component thereof.

In another embodiment, the sample comprises a tissue or component thereof obtained from the subject. In one embodiment, the tissue is selected from the group consisting of bone, connective tissue, cartilage, lung, liver, kidney, muscle tissue, heart, pancreas, and skin.

In one embodiment, the subject is a human.

In one embodiment, the level of expression of the one or more markers in the biological sample is determined by assaying a transcribed polynucleotide or a portion thereof in the sample. In one embodiment, assaying the transcribed polynucleotide comprises amplifying the transcribed polynucleotide.

In one embodiment, the level of expression of the marker in the subject sample is determined by assaying a protein or a portion thereof in the sample. In one embodiment, the protein is assayed using a reagent which specifically binds with the protein.

In one embodiment, the level of expression of the one or more markers in the sample is determined using a technique selected from the group consisting of polymerase chain reaction (PCR) amplification reaction, reverse-transcriptase PCR analysis, single-strand conformation polymorphism analysis (SSCP), mismatch cleavage detection, heteroduplex analysis, Southern blot analysis, Northern blot analysis, Western blot analysis, in situ hybridization, array analysis, deoxyribonucleic acid sequencing, restriction fragment length polymorphism analysis, and combinations or sub-combinations thereof, of said sample.

In one embodiment, the level of expression of the marker in the sample is determined using a technique selected from the group consisting of immunohistochemistry, immunocytochemistry, flow cytometry, ELISA and mass spectrometry.

In one embodiment, the level of expression of a plurality of markers is determined.

In one embodiment, the subject is being treated with a therapy selected from the group consisting of an environmental influencer compound, surgery, radiation, hormone therapy, antibody therapy, therapy with growth factors, cytokines, chemotherapy, and allogenic stem cell therapy. In one embodiment, the environmental influencer compound is a Coenzyme Q10 molecule.

The invention further provides methods of identifying a compound for treating a cancer in a subject, the method comprising: obtaining a biological sample from the subject; contacting the biological sample with a test compound; determining the level of expression of one or more markers present in the biological sample obtained from the subject, wherein the one or more markers is selected from the group consisting of HSPA8, FLNB, PARK7, HSPA1A/HSPA1B, ST13, TUBB3, MIF, KARS, NARS, LGALS1, DDX17, EIF5A, HSPA5, DHX9, HNRNPC, CKAP4, HSPA9, PARP1, HADHA, PHB2, ATP5A1, and CANX with a positive fold change and/or with a negative fold change; comparing the level of expression of the one or more markers in the biological sample with an appropriate control; and selecting a test compound that decreases the level of expression of the one or more markers with a negative fold change present in the biological sample and/or increases the level of expression of the one or more markers with a positive fold change present in the biological sample, thereby identifying a compound for treating the cancer in a subject.
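
The selection step recited above can be sketched as follows. This is an illustrative sketch only: the function names, the sign convention (+1 for a marker with a positive fold change, -1 for a negative one), the compound labels, and the expression values are hypothetical, and "and/or" is read here as "at least one marker moves in the recited direction".

```python
def compound_matches(marker_sign, control, treated):
    """Return True if at least one marker moves in the direction recited in
    the text: markers with a negative fold change decrease, or markers with
    a positive fold change increase, relative to the control."""
    for marker, sign in marker_sign.items():
        if marker not in control or marker not in treated:
            continue
        delta = treated[marker] - control[marker]
        if (sign < 0 and delta < 0) or (sign > 0 and delta > 0):
            return True
    return False


def select_compounds(marker_sign, control, treated_by_compound):
    """Return the names of test compounds that satisfy compound_matches."""
    return [
        name
        for name, treated in treated_by_compound.items()
        if compound_matches(marker_sign, control, treated)
    ]


# Hypothetical inputs, for illustration only.
marker_sign = {"HSPA5": +1, "PARP1": -1}
control = {"HSPA5": 10.0, "PARP1": 10.0}
treated_by_compound = {
    "compound_A": {"HSPA5": 15.0, "PARP1": 10.0},  # HSPA5 rises: selected
    "compound_B": {"HSPA5": 8.0, "PARP1": 12.0},   # both move the other way
}
selected = select_compounds(marker_sign, control, treated_by_compound)
```

A stricter reading that requires every marker to move in the recited direction would replace the early `return True` with an all-markers check.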

In one embodiment, the sample comprises a fluid obtained from the subject. In one embodiment, the fluid is selected from the group consisting of blood fluids, vomit, saliva, lymph, cystic fluid, urine, fluids collected by bronchial lavage, fluids collected by peritoneal rinsing, and gynecological fluids. In one embodiment, the sample is a blood sample or a component thereof.

In another embodiment, the sample comprises a tissue or component thereof obtained from the subject. In one embodiment, the tissue is selected from the group consisting of bone, connective tissue, cartilage, lung, liver, kidney, muscle tissue, heart, pancreas, and skin.

In one embodiment, the subject is a human.

In one embodiment, the level of expression of the one or more markers in the biological sample is determined by assaying a transcribed polynucleotide or a portion thereof in the sample. In one embodiment, assaying the transcribed polynucleotide comprises amplifying the transcribed polynucleotide.

In one embodiment, the level of expression of the marker in the subject sample is determined by assaying a protein or a portion thereof in the sample. In one embodiment, the protein is assayed using a reagent which specifically binds with the protein.

In one embodiment, the level of expression of the one or more markers in the sample is determined using a technique selected from the group consisting of polymerase chain reaction (PCR) amplification reaction, reverse-transcriptase PCR analysis, single-strand conformation polymorphism analysis (SSCP), mismatch cleavage detection, heteroduplex analysis, Southern blot analysis, Northern blot analysis, Western blot analysis, in situ hybridization, array analysis, deoxyribonucleic acid sequencing, restriction fragment length polymorphism analysis, and combinations or sub-combinations thereof, of said sample.

In one embodiment, the level of expression of the marker in the sample is determined using a technique selected from the group consisting of immunohistochemistry, immunocytochemistry, flow cytometry, ELISA and mass spectrometry.

In one embodiment, the subject is being treated with a therapy selected from the group consisting of an environmental influencer compound, surgery, radiation, hormone therapy, antibody therapy, therapy with growth factors, cytokines, chemotherapy, and allogenic stem cell therapy. In one embodiment, the environmental influencer compound is a Coenzyme Q10 molecule.

The invention further provides a kit for assessing the efficacy of a therapy for treating a cancer, the kit comprising reagents for determining the level of expression of at least one marker selected from the group consisting of HSPA8, FLNB, PARK7, HSPA1A/HSPA1B, ST13, TUBB3, MIF, KARS, NARS, LGALS1, DDX17, EIF5A, HSPA5, DHX9, HNRNPC, CKAP4, HSPA9, PARP1, HADHA, PHB2, ATP5A1, and CANX and instructions for use of the kit to assess the efficacy of the therapy for treating the cancer.

The invention further provides a kit for assessing whether a subject is afflicted with a cancer, the kit comprising reagents for determining the level of expression of at least one marker selected from the group consisting of HSPA8, FLNB, PARK7, HSPA1A/HSPA1B, ST13, TUBB3, MIF, KARS, NARS, LGALS1, DDX17, EIF5A, HSPA5, DHX9, HNRNPC, CKAP4, HSPA9, PARP1, HADHA, PHB2, ATP5A1, and CANX and instructions for use of the kit to assess whether the subject is afflicted with the cancer.

The invention further provides a kit for prognosing whether a subject is predisposed to developing a cancer, the kit comprising reagents for determining the level of expression of at least one marker selected from the group consisting of HSPA8, FLNB, PARK7, HSPA1A/HSPA1B, ST13, TUBB3, MIF, KARS, NARS, LGALS1, DDX17, EIF5A, HSPA5, DHX9, HNRNPC, CKAP4, HSPA9, PARP1, HADHA, PHB2, ATP5A1, and CANX and instructions for use of the kit to prognose whether the subject is predisposed to developing the cancer.

The invention further provides a kit for prognosing the recurrence of a cancer in a subject, the kit comprising reagents for assessing the level of expression of at least one marker selected from the group consisting of HSPA8, FLNB, PARK7, HSPA1A/HSPA1B, ST13, TUBB3, MIF, KARS, NARS, LGALS1, DDX17, EIF5A, HSPA5, DHX9, HNRNPC, CKAP4, HSPA9, PARP1, HADHA, PHB2, ATP5A1, and CANX and instructions for use of the kit to prognose the recurrence of the cancer.

The invention further provides a kit for prognosing the recurrence of a cancer, the kit comprising reagents for determining the level of expression of at least one marker selected from the group consisting of HSPA8, FLNB, PARK7, HSPA1A/HSPA1B, ST13, TUBB3, MIF, KARS, NARS, LGALS1, DDX17, EIF5A, HSPA5, DHX9, HNRNPC, CKAP4, HSPA9, PARP1, HADHA, PHB2, ATP5A1, and CANX and instructions for use of the kit to prognose the recurrence of the cancer.

The invention further provides a kit for prognosing the survival of a subject with a cancer, the kit comprising reagents for determining the level of expression of at least one marker selected from the group consisting of HSPA8, FLNB, PARK7, HSPA1A/HSPA1B, ST13, TUBB3, MIF, KARS, NARS, LGALS1, DDX17, EIF5A, HSPA5, DHX9, HNRNPC, CKAP4, HSPA9, PARP1, HADHA, PHB2, ATP5A1, and CANX and instructions for use of the kit to prognose the survival of the subject with the cancer.

The invention further provides a kit for monitoring the progression of a cancer in a subject, the kit comprising reagents for determining the level of expression of at least one marker selected from the group consisting of HSPA8, FLNB, PARK7, HSPA1A/HSPA1B, ST13, TUBB3, MIF, KARS, NARS, LGALS1, DDX17, EIF5A, HSPA5, DHX9, HNRNPC, CKAP4, HSPA9, PARP1, HADHA, PHB2, ATP5A1, and CANX and instructions for use of the kit to prognose the progression of the cancer in a subject.

The kits of the invention may further comprise means for obtaining a biological sample from a subject, a control sample, and/or an environmental influencer. The means for determining the level of expression of at least one marker may comprise means for assaying a transcribed polynucleotide or a portion thereof in the sample and/or means for assaying a protein or a portion thereof in the sample.

In one embodiment, the kits comprise reagents for determining the level of expression of a plurality of markers.

Various aspects of the invention are described in further detail in the following subsections.

C. Isolated Nucleic Acid Molecules

One aspect of the invention pertains to isolated nucleic acid molecules, including nucleic acids which encode a marker protein or a portion thereof. Isolated nucleic acids of the invention also include nucleic acid molecules sufficient for use as hybridization probes to identify marker nucleic acid molecules, and fragments of marker nucleic acid molecules, e.g., those suitable for use as PCR primers for the amplification or mutation of marker nucleic acid molecules. As used herein, the term "nucleic acid molecule" is intended to include DNA molecules (e.g., cDNA or genomic DNA) and RNA molecules (e.g., mRNA) and analogs of the DNA or RNA generated using nucleotide analogs. The nucleic acid molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA.

An "isolated" nucleic acid molecule is one which is separated from other nucleic acid molecules which are present in the natural source of the nucleic acid molecule. In one embodiment, an "isolated" nucleic acid molecule is free of sequences (preferably protein-encoding sequences) which naturally flank the nucleic acid (i.e., sequences located at the 5' and 3' ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. For example, in various embodiments, the isolated nucleic acid molecule can contain less than about 5 kB, 4 kB, 3 kB, 2 kB, 1 kB, 0.5 kB or 0.1 kB of nucleotide sequences which naturally flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived. In another embodiment, an "isolated" nucleic acid molecule, such as a cDNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized. A nucleic acid molecule that is substantially free of cellular material includes preparations having less than about 30%, 20%, 10%, or 5% of heterologous nucleic acid (also referred to herein as a "contaminating nucleic acid").

A nucleic acid molecule of the present invention can be isolated using standard molecular biology techniques and the sequence information in the database records described herein. Using all or a portion of such nucleic acid sequences, nucleic acid molecules of the invention can be isolated using standard hybridization and cloning techniques (e.g., as described in Sambrook et al., ed., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989).

A nucleic acid molecule of the invention can be amplified using cDNA, mRNA, or genomic DNA as a template and appropriate oligonucleotide primers according to standard PCR amplification techniques. The nucleic acid so amplified can be cloned into an appropriate vector and characterized by DNA sequence analysis. Furthermore, nucleotides corresponding to all or a portion of a nucleic acid molecule of the invention can be prepared by standard synthetic techniques, e.g., using an automated DNA synthesizer.

In another preferred embodiment, an isolated nucleic acid molecule of the invention comprises a nucleic acid molecule which has a nucleotide sequence complementary to the nucleotide sequence of a marker nucleic acid or to the nucleotide sequence of a nucleic acid encoding a marker protein. A nucleic acid molecule which is complementary to a given nucleotide sequence is one which is sufficiently complementary to the given nucleotide sequence that it can hybridize to the given nucleotide sequence, thereby forming a stable duplex.
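As a minimal sketch of the complementarity described above, the following Python snippet computes the fully complementary strand of a DNA sequence and tests two strands for perfect complementarity. The function names and the example sequence are hypothetical and chosen only for illustration. The text requires only sufficient complementarity to form a stable duplex, so the perfect-match test shown here is a stricter, simplified assumption.

```python
# Watson-Crick pairing table for DNA (illustrative; not taken from the specification).
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the strand that pairs with `seq`, written 5' to 3'."""
    return seq.translate(_DNA_COMPLEMENT)[::-1]


def is_fully_complementary(strand_a: str, strand_b: str) -> bool:
    """True when strand_b (5'->3') pairs with every base of strand_a (5'->3').

    Assumption: only perfect Watson-Crick complementarity is tested; partial
    complementarity that still yields a stable duplex is not modeled.
    """
    return reverse_complement(strand_a).upper() == strand_b.upper()


# Hypothetical 10-nucleotide marker fragment and its complementary probe.
marker_fragment = "ATGGCCAAGT"
probe = reverse_complement(marker_fragment)
```

A probe designed this way is the simplest case; real probe design would also consider length, melting temperature, and secondary structure, none of which is modeled here.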

Moreover, a nucleic acid molecule of the invention can comprise only a portion of a nucleic acid sequence, wherein the full length nucleic acid sequence comprises a marker nucleic acid or which encodes a marker protein. Such nucleic acids can be used, for example, as a probe or primer. The probe/primer typically is used as one or more substantially purified oligonucleotides. The oligonucleotide typically comprises a region of nucleotide sequence that hybridizes under stringent conditions to at least about 7, preferably about 15, more preferably about 25, 50, 75, 100, 125, 150, 175, 200, 250, 300, 350, or 400 or more consecutive nucleotides of a nucleic acid of the invention.

Probes based on the sequence of a nucleic acid molecule of the invention can be used to detect transcripts or genomic sequences corresponding to one or more markers of the invention. The probe comprises a label group attached thereto, e.g., a radioisotope, a fluorescent compound, an enzyme, or an enzyme co-factor. Such probes can be used as part of a diagnostic test kit for identifying cells or tissues which mis-express the protein, such as by measuring levels of a nucleic acid molecule encoding the protein in a sample of cells from a subject, e.g., detecting mRNA levels or determining whether a gene encoding the protein has been mutated or deleted.

The invention further encompasses nucleic acid molecules that differ, due to degeneracy of the genetic code, from the nucleotide sequence of nucleic acids encoding a marker protein, and thus encode the same protein.
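The degeneracy referred to above can be illustrated with a short Python sketch that translates two different coding sequences using the standard genetic code and checks whether they encode the same protein. The helper names and the example sequences are hypothetical. The sketch assumes the standard nuclear code and an open reading frame that starts at the first nucleotide.

```python
from itertools import product

# Standard genetic code, laid out in the conventional T, C, A, G order.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(coding_sequence: str) -> str:
    """Translate a DNA coding sequence, stopping at the first stop codon."""
    seq = coding_sequence.upper()
    protein = []
    # Ignore any trailing partial codon.
    for i in range(0, len(seq) - len(seq) % 3, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def encode_same_protein(seq_a: str, seq_b: str) -> bool:
    """True when two nucleotide sequences translate to the same protein."""
    return translate(seq_a) == translate(seq_b)


# Two hypothetical sequences that differ at three positions but both encode Met-Ala-Lys.
variant_1 = "ATGGCTAAATAA"
variant_2 = "ATGGCCAAGTGA"
```

Here `encode_same_protein(variant_1, variant_2)` evaluates to true because GCT/GCC both encode alanine and AAA/AAG both encode lysine.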

It will be appreciated by those skilled in the art that DNA sequence polymorphisms that lead to changes in the amino acid sequence can exist within a population (e.g., the human population). Such genetic polymorphisms can exist among individuals within a population due to natural allelic variation. An allele is one of a group of genes which occur alternatively at a given genetic locus. In addition, it will be appreciated that DNA polymorphisms that affect RNA expression levels can also exist that may affect the overall expression level of that gene (e.g., by affecting regulation or degradation).

As used herein, the phrase "allelic variant" refers to a nucleotide sequence which occurs at a given locus or to a polypeptide encoded by the nucleotide sequence.

As used herein, the terms "gene" and "recombinant gene" refer to nucleic acid molecules comprising an open reading frame encoding a polypeptide corresponding to a marker of the invention. Such natural allelic variations can typically result in 1-5% variance in the nucleotide sequence of a given gene. Alternative alleles can be identified by sequencing the gene of interest in a number of different individuals. This can be readily carried out by using hybridization probes to identify the same genetic locus in a variety of individuals. Any and all such nucleotide variations and resulting amino acid polymorphisms or variations that are the result of natural allelic variation and that do not alter the functional activity are intended to be within the scope of the invention.

In another embodiment, an isolated nucleic acid molecule of the invention is at least 7, 15, 20, 25, 30, 40, 60, 80, 100, 150, 200, 250, 300, 350, 400, 450, 550, 650, 700, 800, 900, 1000, 1200, 1400, 1600, 1800, 2000, 2200, 2400, 2600, 2800, 3000, 3500, 4000, 4500, or more nucleotides in length and hybridizes under stringent conditions to a marker nucleic acid or to a nucleic acid encoding a marker protein. As used herein, the term "hybridizes under stringent conditions" is intended to describe conditions for hybridization and washing under which nucleotide sequences at least 60% (65%, 70%, preferably 75%) identical to each other typically remain hybridized to each other. Such stringent conditions are known to those skilled in the art and can be found in sections 6.3.1-6.3.6 of Current Protocols in Molecular Biology, John Wiley & Sons, NY (1989). A preferred, non-limiting example of stringent hybridization conditions is hybridization in 6X sodium chloride/sodium citrate (SSC) at about 45°C, followed by one or more washes in 0.2X SSC, 0.1% SDS at C.

In addition to naturally-occurring allelic variants of a nucleic acid molecule of the invention that can exist in the population, the skilled artisan will further appreciate that sequence changes can be introduced by mutation, thereby leading to changes in the amino acid sequence of the encoded protein, without altering the biological activity of the protein encoded thereby. For example, one can make nucleotide substitutions leading to amino acid substitutions at "non-essential" amino acid residues. A "non-essential" amino acid residue is a residue that can be altered from the wild-type sequence without altering the biological activity, whereas an "essential" amino acid residue is required for biological activity. For example, amino acid residues that are not conserved or only semi-conserved among homologs of various species may be non-essential for activity and thus would be likely targets for alteration. Alternatively, amino acid residues that are conserved among the homologs of various species (e.g., murine and human) may be essential for activity and thus would not be likely targets for alteration.

Accordingly, another aspect of the invention pertains to nucleic acid molecules encoding a variant marker protein that contain changes in amino acid residues that are not essential for activity. Such variant marker proteins differ in amino acid sequence from the naturally-occurring marker proteins, yet retain biological activity. In one embodiment, such a variant marker protein has an amino acid sequence that is at least about 40% identical, 50%, 60%, 70%, 80%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identical to the amino acid sequence of a marker protein.

An isolated nucleic acid molecule encoding a variant marker protein can be created by introducing one or more nucleotide substitutions, additions or deletions into the nucleotide sequence of marker nucleic acids, such that one or more amino acid residue substitutions, additions, or deletions are introduced into the encoded protein.

Mutations can be introduced by standard techniques, such as site-directed mutagenesis and PCR-mediated mutagenesis. Preferably, conservative amino acid substitutions are made at one or more predicted non-essential amino acid residues. A "conservative amino acid substitution" is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), non-polar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Alternatively, mutations can be introduced randomly along all or part of the coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for biological activity to identify mutants that retain activity. Following mutagenesis, the encoded protein can be expressed recombinantly and the activity of the protein can be determined.
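The side-chain families listed above lend themselves to a simple lookup. The following Python sketch, offered only as an illustration, treats a substitution as conservative when the wild-type and replacement residues share at least one family. The one-letter codes, the dictionary name, and the rule that membership in any shared family suffices (some residues, such as threonine and histidine, appear in two families) are assumptions of this sketch rather than statements from the specification.

```python
# Side-chain families as listed in the text, using one-letter amino acid codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),               # lysine, arginine, histidine
    "acidic": set("DE"),               # aspartic acid, glutamic acid
    "uncharged_polar": set("GNQSTYC"), # glycine ... cysteine
    "non_polar": set("AVLIPFMW"),      # alanine ... tryptophan
    "beta_branched": set("TVI"),       # threonine, valine, isoleucine
    "aromatic": set("YFWH"),           # tyrosine, phenylalanine, tryptophan, histidine
}


def shared_families(wild_type: str, variant: str) -> list:
    """Return the names of the families that contain both residues."""
    wt, var = wild_type.upper(), variant.upper()
    return [
        name
        for name, members in SIDE_CHAIN_FAMILIES.items()
        if wt in members and var in members
    ]


def is_conservative(wild_type: str, variant: str) -> bool:
    """True when the two residues share at least one side-chain family.

    Assumption: sharing any one family is treated as sufficient.
    """
    return bool(shared_families(wild_type, variant))
```

For example, lysine to arginine (`is_conservative("K", "R")`) is classified as conservative because both are basic, whereas aspartic acid to lysine is not.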

The present invention encompasses antisense nucleic acid molecules, i.e., molecules which are complementary to a sense nucleic acid of the invention, e.g., complementary to the coding strand of a double-stranded marker cDNA molecule or complementary to a marker mRNA sequence. Accordingly, an antisense nucleic acid of the invention can hydrogen bond to (i.e., anneal with) a sense nucleic acid of the invention. The antisense nucleic acid can be complementary to an entire coding strand, or to only a portion thereof, e.g., all or part of the protein coding region (or open reading frame). An antisense nucleic acid molecule can also be antisense to all or part of a non-coding region of the coding strand of a nucleotide sequence encoding a marker protein.

The non-coding regions ("5' and 3' untranslated regions") are the 5' and 3' sequences which flank the coding region and are not translated into amino acids.

An antisense oligonucleotide can be, for example, about 5, 10, 15, 20, 25, 30, 35, 40, 45, or 50 or more nucleotides in length. An antisense nucleic acid of the invention can be constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. Examples of modified nucleotides which can be used to generate the antisense nucleic acid include 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine.

Alternatively, the antisense nucleic acid can be produced biologically using an expression vector into which a nucleic acid has been sub-cloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following subsection).

As used herein, a “nucleic acid” inhibitor is any nucleic acid based inhibitor that causes a decrease in the expression of the target by hybridizing with at least a portion of the RNA transcript from the target gene to result in a decrease in the expression of the target. Nucleic acid inhibitors include, for example, single stranded nucleic acid molecules, e.g., antisense nucleic acids, and double stranded nucleic acids such as siRNA, shRNA, dsiRNA (see, e.g., US Patent publication 20070104688). As used herein, double stranded nucleic acid molecules are designed to be double stranded over at least 12, preferably at least 15 nucleotides. Double stranded nucleic acid molecules can be a single nucleic acid strand designed to hybridize to itself, e.g., an shRNA. It is understood that a nucleic acid inhibitor of target can be administered as an isolated nucleic acid. Alternatively, the nucleic acid inhibitor can be administered as an expression construct to produce the inhibitor in the cell. In certain embodiments, the nucleic acid inhibitor includes one or more chemical modifications to improve the activity and/or stability of the nucleic acid inhibitor. Such modifications are well known in the art. The specific modifications to be used will depend, for example, on the type of nucleic acid inhibitor.

Antisense nucleic acid therapeutic agents are single stranded nucleic acid therapeutics, typically about 16 to 30 nucleotides in length, and are complementary to a target nucleic acid sequence in the target cell, either in culture or in an organism.

Patents directed to antisense nucleic acids, chemical modifications, and therapeutic uses are provided, for example, in U.S. Patent No. 031, related to chemically modified RNA-containing therapeutic compounds, and U.S. Patent No. 6,107,094, related to methods of using these compounds as therapeutic agents. U.S. Patent No. 7,432,250 is related to methods of treating patients by administering single-stranded chemically modified RNA-like compounds, and U.S. Patent No. 7,432,249 is related to pharmaceutical compositions containing single-stranded chemically modified RNA-like compounds. U.S. Patent No. 7,629,321 is related to methods of cleaving target mRNA using a single-stranded oligonucleotide having a plurality of RNA nucleosides and at least one chemical modification. Each of the patents listed in this paragraph is incorporated herein by reference.

In many embodiments, the duplex region is 15-30 nucleotide pairs in length. In some embodiments, the duplex region is 17-23 nucleotide pairs in length, 17-25 nucleotide pairs in length, 23-27 nucleotide pairs in length, 19-21 nucleotide pairs in length, or 21-23 nucleotide pairs in length.

In certain embodiments, each strand has 15-30 nucleotides.

The RNAi agents that can be used in the methods of the invention include agents with chemical modifications as disclosed, for example, in U.S. Provisional Application No. 61/561,710, filed on November 18, 2011, International Application No. , filed on September 15, 2010, and PCT Publication WO 2009/073809, the entire contents of each of which are incorporated herein by reference.

An “RNAi agent,” “double stranded RNAi agent,” double-stranded RNA (dsRNA) molecule, also referred to as “dsRNA agent,” “dsRNA”, “siRNA”, “iRNA agent,” as used interchangeably herein, refers to a complex of ribonucleic acid molecules, having a duplex structure comprising two anti-parallel and substantially complementary, as defined below, nucleic acid strands. As used herein, an RNAi agent can also include dsiRNA (see, e.g., US Patent publication 20070104688, incorporated herein by reference). In general, the majority of nucleotides of each strand are ribonucleotides, but as described herein, each or both strands can also include one or more non-ribonucleotides, e.g., a deoxyribonucleotide and/or a modified nucleotide. In addition, as used in this specification, an “RNAi agent” may include ribonucleotides with chemical modifications; an RNAi agent may include substantial modifications at multiple nucleotides. Such modifications may include all types of modifications disclosed herein or known in the art. Any such modifications, as used in a siRNA type molecule, are encompassed by “RNAi agent” for the purposes of this specification and claims.

The two strands forming the duplex structure may be different portions of one larger RNA molecule, or they may be separate RNA molecules. Where the two strands are part of one larger molecule, and therefore are connected by an uninterrupted chain of nucleotides between the 3’-end of one strand and the 5’-end of the respective other strand forming the duplex structure, the connecting RNA chain is referred to as a “hairpin loop.” Where the two strands are connected covalently by means other than an uninterrupted chain of nucleotides between the 3’-end of one strand and the 5’-end of the respective other strand forming the duplex structure, the connecting structure is referred to as a “linker.” The RNA strands may have the same or a different number of nucleotides. The maximum number of base pairs is the number of nucleotides in the shortest strand of the dsRNA minus any overhangs that are present in the duplex. In addition to the duplex structure, an RNAi agent may comprise one or more nucleotide overhangs. The term “siRNA” is also used herein to refer to an RNAi agent as described above.
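The rule for the maximum number of base pairs can be written as a one-line calculation. The Python sketch below is illustrative only. The function and parameter names are hypothetical, and it assumes that the overhang count passed in is the number of unpaired overhanging nucleotides located on the shortest strand.

```python
def max_base_pairs(sense_length: int, antisense_length: int,
                   overhang_on_shortest: int = 0) -> int:
    """Maximum number of base pairs in a two-stranded duplex.

    Computed as the nucleotide count of the shortest strand minus the
    overhanging (unpaired) nucleotides on that strand.
    Assumption: `overhang_on_shortest` counts only overhang nucleotides
    that belong to the shortest strand.
    """
    shortest = min(sense_length, antisense_length)
    if not 0 <= overhang_on_shortest <= shortest:
        raise ValueError("overhang must be between 0 and the strand length")
    return shortest - overhang_on_shortest


# Hypothetical example: two 21-nucleotide strands, each with a
# 2-nucleotide overhang, give at most 19 base pairs.
duplex_pairs = max_base_pairs(21, 21, overhang_on_shortest=2)
```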

In another aspect, the agent is a single-stranded antisense RNA molecule. An antisense RNA molecule is complementary to a sequence within the target mRNA.

Antisense RNA can inhibit translation in a stoichiometric manner by base pairing to the mRNA and physically obstructing the translation machinery, see Dias, N. et al., (2002) Mol Cancer Ther 1:347-355. The antisense RNA molecule may have about 15-30 nucleotides that are complementary to the target mRNA. For example, the antisense RNA molecule may have a sequence of at least 15, 16, 17, 18, 19, 20 or more contiguous nucleotides from one of the antisense sequences of Table 1.

The term “antisense strand” refers to the strand of a double stranded RNAi agent which includes a region that is substantially complementary to a target sequence. As used herein, the term “region complementary to part of an mRNA encoding” a protein of interest refers to a region on the antisense strand that is substantially complementary to part of a target mRNA sequence encoding the protein. Where the region of complementarity is not fully complementary to the target sequence, the mismatches are most tolerated in the terminal regions and, if present, are generally in a terminal region or regions, e.g., within 6, 5, 4, 3, or 2 nucleotides of the 5’ and/or 3’ terminus.

The term “sense strand,” as used herein, refers to the strand of a dsRNA that includes a region that is substantially complementary to a region of the antisense strand.

In various embodiments, the nucleic acid molecules of the invention can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic acids can be modified to generate peptide nucleic acids (see Hyrup et al., 1996, Bioorganic & Medicinal Chemistry 4(1): 5-23). As used herein, the terms "peptide nucleic acids" or "PNAs" refer to nucleic acid mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a peptide backbone and only the four natural nucleobases are retained. The neutral backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996), supra; Perry-O'Keefe et al. (1996) Proc. Natl. Acad. Sci. USA 93:14670-675.

PNAs can be used in therapeutic and diagnostic applications. For example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of gene expression by, e.g., inducing transcription or translation arrest or inhibiting replication. PNAs can also be used, e.g., in the analysis of single base pair mutations in a gene by, e.g., PNA-directed PCR clamping; as artificial restriction enzymes when used in combination with other enzymes, e.g., S1 nucleases (Hyrup (1996), supra); or as probes or primers for DNA sequence and hybridization (Hyrup, 1996, supra; Perry-O'Keefe et al., 1996, Proc. Natl. Acad. Sci. USA 93:14670-675).

In another embodiment, PNAs can be modified, e.g., to enhance their stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug delivery known in the art. For example, PNA-DNA chimeras can be generated which can combine the advantageous properties of PNA and DNA. Such chimeras allow DNA recognition enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA portion while the PNA portion would provide high binding affinity and specificity.

PNA-DNA chimeras can be linked using linkers of appropriate lengths selected in terms of base stacking, number of bonds between the nucleobases, and orientation (Hyrup, 1996, supra). The synthesis of PNA-DNA chimeras can be performed as described in Hyrup (1996), supra, and Finn et al. (1996) Nucleic Acids Res. 24(17):3357-63. For example, a DNA chain can be synthesized on a solid support using standard phosphoramidite coupling chemistry and modified nucleoside analogs. Compounds such as 5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite can be used as a link between the PNA and the 5' end of DNA (Mag et al., 1989, Nucleic Acids Res. 17:5973-88). PNA monomers are then coupled in a step-wise manner to produce a chimeric molecule with a 5' PNA segment and a 3' DNA segment (Finn et al., 1996, Nucleic Acids Res. 24(17):3357-63). Alternatively, chimeric molecules can be synthesized with a 5' DNA segment and a 3' PNA segment (Peterser et al., 1975, Bioorganic Med. Chem. Lett. 5:1119-11124).

In other embodiments, the oligonucleotide can include other appended groups such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. USA 86:6553-6556; Lemaitre et al., 1987, Proc. Natl. Acad. Sci. USA 84:648-652; PCT Publication No. WO 88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO 89/10134). In addition, oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et al., 1988, Bio/Techniques 6:958-976) or intercalating agents (see, e.g., Zon, 1988, Pharm. Res. 5:539-549). To this end, the oligonucleotide can be conjugated to another molecule, e.g., a peptide, hybridization triggered cross-linking agent, transport agent, hybridization-triggered cleavage agent, etc.

The invention also includes molecular beacon nucleic acids having at least one region which is complementary to a nucleic acid of the invention, such that the molecular beacon is useful for quantitating the presence of the nucleic acid of the invention in a sample. A "molecular beacon" nucleic acid is a nucleic acid comprising a pair of complementary regions and having a fluorophore and a fluorescent quencher associated therewith. The fluorophore and quencher are associated with different portions of the nucleic acid in such an orientation that when the complementary regions are annealed with one another, fluorescence of the fluorophore is quenched by the quencher. When the complementary regions of the nucleic acid are not annealed with one another, fluorescence of the fluorophore is quenched to a lesser degree. Molecular beacon nucleic acids are described, for example, in U.S. Patent 5,876,930.

D. Isolated Proteins and Antibodies

One aspect of the invention pertains to isolated marker proteins and biologically active portions thereof, as well as polypeptide fragments suitable for use as immunogens to raise antibodies directed against a marker protein or a fragment thereof. In one embodiment, the native marker protein can be isolated from cells or tissue sources by an appropriate purification scheme using standard protein purification techniques. In another embodiment, a protein or peptide comprising the whole or a segment of the marker protein is produced by recombinant DNA techniques. Alternative to recombinant expression, such protein or peptide can be synthesized chemically using standard peptide synthesis techniques.

An "isolated" or "purified" protein or biologically active portion thereof is substantially free of cellular material or other contaminating proteins from the cell or tissue source from which the protein is derived, or substantially free of chemical precursors or other chemicals when chemically synthesized. The language "substantially free of cellular material" includes preparations of protein in which the protein is separated from cellular components of the cells from which it is isolated or recombinantly produced. Thus, protein that is substantially free of cellular material includes preparations of protein having less than about 30%, 20%, 10%, or 5% (by dry weight) of heterologous protein (also referred to herein as a "contaminating protein").

When the protein or biologically active portion thereof is recombinantly produced, it is also preferably substantially free of culture medium, i.e., culture medium represents less than about 20%, 10%, or 5% of the volume of the protein preparation. When the protein is produced by chemical synthesis, it is preferably substantially free of chemical precursors or other chemicals, i.e., it is separated from chemical precursors or other chemicals which are involved in the synthesis of the protein. Accordingly, such preparations of the protein have less than about 30%, 20%, 10%, or 5% (by dry weight) of chemical precursors or compounds other than the polypeptide of interest.

Biologically active portions of a marker protein include polypeptides comprising amino acid sequences sufficiently identical to or derived from the amino acid sequence of the marker protein, which include fewer amino acids than the full length protein, and exhibit at least one activity of the corresponding full-length protein. Typically, biologically active portions comprise a domain or motif with at least one activity of the corresponding full-length protein. A biologically active portion of a marker protein of the invention can be a polypeptide which is, for example, 10, 25, 50, 100 or more amino acids in length. Moreover, other biologically active portions, in which other regions of the marker protein are deleted, can be prepared by recombinant techniques and evaluated for one or more of the functional activities of the native form of the marker protein.

Preferred marker proteins are encoded by nucleotide sequences comprising the sequences encoding any of the genes described in the examples. Other useful proteins are substantially identical (e.g., at least about 40%, preferably 50%, 60%, 70%, 80%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99%) to one of these sequences and retain the functional activity of the corresponding naturally-occurring marker protein yet differ in amino acid sequence due to natural allelic variation or mutagenesis.

To determine the percent identity of two amino acid sequences or of two nucleic acids, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in the sequence of a first amino acid or nucleic acid sequence for optimal alignment with a second amino or nucleic acid sequence). The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position. Preferably, the percent identity between the two sequences is calculated using a global alignment. Alternatively, the percent identity between the two sequences is calculated using a local alignment. The percent identity between the two sequences is a function of the number of identical positions shared by the sequences (i.e., % identity = # of identical positions / total # of positions (e.g., overlapping positions) x 100). In one embodiment the two sequences are the same length. In another embodiment, the two sequences are not the same length.
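The percent identity formula above can be sketched in a few lines of Python. The example assumes the two sequences have already been aligned, with any gaps written as "-", and takes the total number of positions to be the number of alignment columns; that choice of denominator, along with the function name and sample sequences, is an assumption made for illustration. Only exact matches between residues are counted as identical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    % identity = (# of identical positions / total # of positions) x 100,
    where the total is taken here as the number of alignment columns
    (an assumption; other denominators are also in use).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    total = len(aligned_a)
    if total == 0:
        raise ValueError("alignment is empty")
    identical = sum(
        1
        for residue_a, residue_b in zip(aligned_a.upper(), aligned_b.upper())
        if residue_a == residue_b and residue_a != gap
    )
    return 100.0 * identical / total


# Hypothetical six-column alignment with one gap and one mismatch:
# four identical columns out of six.
identity = percent_identity("ACGT-A", "ACCTTA")
```

In this sketch a gapped column never counts as identical, so introducing gaps lowers the computed value unless the alignment gains matches elsewhere.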

The determination of percent identity between two sequences can be accomplished using a mathematical algorithm. A preferred, non-limiting example of a mathematical algorithm utilized for the comparison of two sequences is the algorithm of Karlin and Altschul (1990) Proc. Natl. Acad. Sci. USA 87:2264-2268, modified as in Karlin and Altschul (1993) Proc. Natl. Acad. Sci. USA 90:5873-5877. Such an algorithm is incorporated into the BLASTN and BLASTX programs of Altschul, et al. (1990) J. Mol. Biol. 215:403-410. BLAST nucleotide searches can be performed with the BLASTN program, score = 100, wordlength = 12 to obtain nucleotide sequences homologous to nucleic acid molecules of the invention. BLAST protein searches can be performed with the BLASTP program, score = 50, wordlength = 3 to obtain amino acid sequences homologous to protein molecules of the invention. To obtain gapped alignments for comparison purposes, a newer version of the BLAST algorithm called Gapped BLAST can be utilized as described in Altschul et al. (1997) Nucleic Acids Res. 25:3389-3402, which is able to perform gapped local alignments for the programs BLASTN, BLASTP and BLASTX. Alternatively, PSI-Blast can be used to perform an iterated search which detects distant relationships between molecules. When utilizing BLAST, Gapped BLAST, and PSI-Blast programs, the default parameters of the respective programs (e.g., BLASTX and BLASTN) can be used. See http://www.ncbi.nlm.nih.gov. Another preferred, non-limiting example of a mathematical algorithm utilized for the comparison of sequences is the algorithm of Myers and Miller, (1988) CABIOS 4:11-17. Such an algorithm is incorporated into the ALIGN program (version 2.0) which is part of the GCG sequence alignment software package. When utilizing the ALIGN program for comparing amino acid sequences, a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4 can be used.
Yet another useful algorithm for identifying regions of local sequence similarity and alignment is the FASTA algorithm as described in Pearson and Lipman (1988) Proc. Natl. Acad. Sci. USA 85:2444-2448. When using the FASTA algorithm for comparing nucleotide or amino acid sequences, a PAM120 weight residue table can, for example, be used with a k-tuple value of 2.

The percent identity between two sequences can be determined using techniques similar to those described above, with or without allowing gaps. In calculating percent identity, only exact matches are counted.

The invention also provides chimeric or fusion proteins comprising a marker protein or a segment thereof. As used herein, a "chimeric protein" or "fusion protein" comprises all or part (preferably a biologically active part) of a marker protein operably linked to a heterologous polypeptide (i.e., a polypeptide other than the marker protein).

Within the fusion protein, the term "operably linked" is intended to indicate that the marker protein or segment thereof and the heterologous polypeptide are fused in-frame to each other. The heterologous polypeptide can be fused to the amino-terminus or the carboxyl-terminus of the marker protein or segment.

One useful fusion protein is a GST fusion protein in which a marker protein or segment is fused to the carboxyl terminus of GST sequences. Such fusion proteins can facilitate the purification of a recombinant polypeptide of the invention.

In another embodiment, the fusion protein contains a heterologous signal sequence at its amino terminus. For example, the native signal sequence of a marker protein can be removed and replaced with a signal sequence from another protein. For example, the gp67 secretory sequence of the baculovirus envelope protein can be used as a heterologous signal sequence (Ausubel et al., ed., Current Protocols in Molecular Biology, John Wiley & Sons, NY, 1992). Other examples of eukaryotic heterologous signal sequences include the secretory sequences of melittin and human placental alkaline phosphatase (Stratagene; La Jolla, California). In yet another example, useful prokaryotic heterologous signal sequences include the phoA secretory signal (Sambrook et al., supra) and the protein A secretory signal (Pharmacia Biotech; Piscataway, New Jersey).

In yet another embodiment, the fusion protein is an immunoglobulin fusion protein in which all or part of a marker protein is fused to sequences derived from a member of the immunoglobulin protein family. The immunoglobulin fusion proteins of the invention can be incorporated into pharmaceutical compositions and administered to a subject to inhibit an interaction between a ligand (soluble or membrane-bound) and a protein on the surface of a cell (receptor), to thereby suppress signal transduction in vivo.

The immunoglobulin fusion protein can be used to affect the bioavailability of a cognate ligand of a marker protein. Inhibition of ligand/receptor interaction can be useful therapeutically, both for treating proliferative and differentiative disorders and for modulating (e.g., promoting or inhibiting) cell survival. Moreover, the immunoglobulin fusion proteins of the invention can be used as immunogens to produce antibodies directed against a marker protein in a subject, to purify ligands, and in screening assays to identify molecules which inhibit the interaction of the marker protein with ligands.

Chimeric and fusion proteins of the invention can be produced by standard recombinant DNA techniques. In another embodiment, the fusion gene can be synthesized by conventional techniques including automated DNA synthesizers.

Alternatively, PCR amplification of gene fragments can be carried out using anchor primers which give rise to complementary overhangs between two consecutive gene fragments which can subsequently be annealed and re-amplified to generate a chimeric gene sequence (see, e.g., Ausubel et al., supra). Moreover, many expression vectors are commercially available that already encode a fusion moiety (e.g., a GST polypeptide).

A nucleic acid encoding a polypeptide of the invention can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the polypeptide of the invention.

A signal sequence can be used to facilitate secretion and isolation of marker proteins. Signal sequences are typically characterized by a core of hydrophobic amino acids which are generally cleaved from the mature protein during secretion in one or more cleavage events. Such signal peptides contain processing sites that allow cleavage of the signal sequence from the mature proteins as they pass through the secretory pathway. Thus, the invention pertains to marker proteins, fusion proteins or segments thereof having a signal sequence, as well as to such proteins from which the signal sequence has been proteolytically cleaved (i.e., the cleavage products). In one embodiment, a nucleic acid sequence encoding a signal sequence can be operably linked in an expression vector to a protein of interest, such as a marker protein or a segment thereof. The signal sequence directs secretion of the protein, such as from a eukaryotic host into which the expression vector is transformed, and the signal sequence is subsequently or concurrently cleaved. The protein can then be readily purified from the extracellular medium by art recognized methods. Alternatively, the signal sequence can be linked to the protein of interest using a sequence which facilitates purification, such as with a GST domain.

The present invention also pertains to variants of the marker proteins. Such variants have an altered amino acid sequence which can function as either agonists (mimetics) or as antagonists. Variants can be generated by mutagenesis, e. g., discrete point mutation or truncation. An agonist can retain substantially the same, or a subset, of the biological activities of the naturally occurring form of the protein. An antagonist of a protein can inhibit one or more of the activities of the naturally occurring form of the protein by, for example, competitively binding to a downstream or upstream member of a cellular signaling cascade which includes the protein of interest. Thus, specific biological effects can be elicited by treatment with a variant of limited function.

Treatment of a subject with a variant having a subset of the biological activities of the naturally occurring form of the protein can have fewer side effects in a subject relative to treatment with the naturally occurring form of the protein.

Variants of a marker protein which function as either agonists (mimetics) or as antagonists can be identified by screening combinatorial libraries of mutants, e.g., truncation mutants, of the protein of the invention for agonist or antagonist activity. In one embodiment, a variegated library of variants is generated by combinatorial mutagenesis at the nucleic acid level and is encoded by a variegated gene library. A variegated library of variants can be produced by, for example, enzymatically ligating a mixture of synthetic oligonucleotides into gene sequences such that a degenerate set of potential protein sequences is expressible as individual polypeptides, or alternatively, as a set of larger fusion proteins (e.g., for phage display). There are a variety of methods which can be used to produce libraries of potential variants of the marker proteins from a degenerate oligonucleotide sequence. Methods for synthesizing degenerate oligonucleotides are known in the art (see, e.g., Narang, 1983, Tetrahedron 39:3; Itakura et al., 1984, Annu. Rev. Biochem. ; Itakura et al., 1984, Science 198:1056; Ike et al., 1983 Nucleic Acid Res. ).
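To make the idea of a degenerate oligonucleotide concrete, the sketch below enumerates every concrete sequence that a degenerate oligonucleotide written in standard IUPAC nucleotide codes can represent. It is offered only as an illustration; the function name is hypothetical and the specification does not prescribe any particular software.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}

def expand_degenerate(oligo):
    """Return all concrete sequences encoded by a degenerate oligonucleotide."""
    # One pool of allowed bases per position; KeyError on an unknown code.
    pools = [IUPAC[base] for base in oligo.upper()]
    # Cartesian product over positions gives every member of the library.
    return ["".join(bases) for bases in product(*pools)]
```

For example, `expand_degenerate("NNK")` yields the 32 codons of an NNK position, which illustrates how quickly library size grows with each degenerate position.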

In addition, libraries of segments of a marker protein can be used to generate a variegated population of polypeptides for screening and subsequent selection of variant marker proteins or segments thereof. For example, a library of coding sequence fragments can be generated by treating a double stranded PCR fragment of the coding sequence of interest with a nuclease under conditions wherein nicking occurs only about once per molecule, denaturing the double stranded DNA, renaturing the DNA to form double stranded DNA which can include sense/antisense pairs from different nicked products, removing single stranded portions from reformed duplexes by treatment with S1 nuclease, and ligating the resulting fragment library into an expression vector. By this method, an expression library can be derived which encodes amino terminal and internal fragments of various sizes of the protein of interest.

Several techniques are known in the art for screening gene products of combinatorial libraries made by point mutations or truncation, and for screening cDNA libraries for gene products having a selected property. The most widely used techniques, which are amenable to high through-put analysis, for screening large gene libraries typically include cloning the gene library into replicable expression vectors, transforming appropriate cells with the resulting library of vectors, and expressing the combinatorial genes under conditions in which detection of a desired activity facilitates isolation of the vector encoding the gene whose product was detected. Recursive ensemble mutagenesis (REM), a technique which enhances the frequency of functional mutants in the libraries, can be used in combination with the screening assays to identify variants of a protein of the invention (Arkin and Yourvan, 1992, Proc. Natl. Acad. Sci. USA 89:7811-7815; Delgrave et al., 1993, Protein Engineering 6(3):327-331).

Another aspect of the invention pertains to antibodies directed against a protein of the invention. In preferred embodiments, the antibodies specifically bind a marker protein or a fragment thereof. The terms "antibody" and "antibodies" as used interchangeably herein refer to immunoglobulin molecules as well as fragments and derivatives thereof that comprise an immunologically active portion of an immunoglobulin molecule (i.e., such a portion contains an antigen binding site which specifically binds an antigen, such as a marker protein, e.g., an epitope of a marker protein). An antibody which specifically binds to a protein of the invention is an antibody which binds the protein, but does not substantially bind other molecules in a sample, e.g., a biological sample, which naturally contains the protein. Examples of an immunologically active portion of an immunoglobulin molecule include, but are not limited to, single-chain antibodies (scAb), F(ab) and F(ab')2 fragments.

An isolated protein of the invention or a fragment thereof can be used as an immunogen to generate antibodies. The full-length protein can be used or, alternatively, the invention provides antigenic peptide fragments for use as immunogens. The antigenic peptide of a protein of the invention comprises at least 8 (preferably 10, 15, 20, or 30 or more) amino acid residues of the amino acid sequence of one of the proteins of the invention, and encompasses at least one epitope of the protein such that an antibody raised against the peptide forms a specific immune complex with the protein. Preferred epitopes encompassed by the antigenic peptide are regions that are located on the surface of the protein, e.g., hydrophilic regions. Hydrophobicity sequence analysis, hydrophilicity sequence analysis, or similar analyses can be used to identify hydrophilic regions. In preferred embodiments, an isolated marker protein or fragment thereof is used as an immunogen.
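One common form of hydrophobicity sequence analysis is a sliding-window average over a per-residue scale. The sketch below uses the Kyte-Doolittle hydropathy scale, in which more negative values indicate more hydrophilic residues. The choice of scale, the window length of 7, and the function names are assumptions made for illustration; the specification does not name a particular scale or window.

```python
# Kyte-Doolittle hydropathy values (negative = hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def window_hydropathy(seq, window=7):
    """Mean hydropathy of every window of `window` residues along `seq`."""
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    return [
        sum(KYTE_DOOLITTLE[res] for res in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]

def most_hydrophilic_window(seq, window=7):
    """Return (start index, peptide) of the window with the lowest mean score."""
    scores = window_hydropathy(seq, window)
    start = min(range(len(scores)), key=scores.__getitem__)
    return start, seq.upper()[start:start + window]
```

A low-scoring window is only a candidate surface region; it would still need to satisfy the length and epitope requirements stated above before being used as an antigenic peptide.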

An immunogen typically is used to prepare antibodies by immunizing a suitable (i.e., immunocompetent) subject such as a rabbit, goat, mouse, or other mammal or vertebrate. An appropriate immunogenic preparation can contain, for example, recombinantly-expressed or chemically-synthesized protein or peptide. The preparation can further include an adjuvant, such as Freund's complete or incomplete adjuvant, or a similar immunostimulatory agent. Preferred immunogen compositions are those that contain no other human proteins such as, for example, immunogen compositions made using a non-human host cell for recombinant expression of a protein of the invention. In such a manner, the resulting antibody compositions have reduced or no binding of human proteins other than a protein of the invention.

The invention provides polyclonal and monoclonal antibodies. The term "monoclonal antibody" or "monoclonal antibody composition", as used herein, refers to a population of antibody molecules that contain only one species of an antigen binding site capable of immunoreacting with a particular epitope. Preferred polyclonal and monoclonal antibody compositions are ones that have been selected for antibodies directed against a protein of the invention. Particularly preferred polyclonal and monoclonal antibody preparations are ones that contain only antibodies directed against a marker protein or fragment thereof.

Polyclonal antibodies can be prepared by immunizing a suitable subject with a protein of the invention as an immunogen. The antibody titer in the immunized subject can be monitored over time by standard techniques, such as with an enzyme linked immunosorbent assay (ELISA) using immobilized polypeptide. At an appropriate time after immunization, e.g., when the specific antibody titers are highest, antibody-producing cells can be obtained from the subject and used to prepare monoclonal antibodies (mAb) by standard techniques, such as the hybridoma technique originally described by Kohler and Milstein (1975) Nature 256:495-497, the human B cell hybridoma technique (see Kozbor et al., 1983, Immunol. Today 4:72), the EBV-hybridoma technique (see Cole et al., pp. 77-96 In Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., 1985) or trioma techniques. The technology for producing hybridomas is well known (see generally Current Protocols in Immunology, Coligan et al. ed., John Wiley & Sons, New York, 1994). Hybridoma cells producing a monoclonal antibody of the invention are detected by screening the hybridoma culture supernatants for antibodies that bind the polypeptide of interest, e.g., using a standard ELISA assay.

Alternative to preparing monoclonal antibody-secreting hybridomas, a monoclonal antibody directed against a protein of the invention can be identified and isolated by screening a recombinant combinatorial immunoglobulin library (e.g., an antibody phage display library) with the polypeptide of interest. Kits for generating and screening phage display libraries are commercially available (e.g., the Pharmacia Recombinant Phage Antibody System, Catalog No. 2701; and the Stratagene SurfZAP Phage Display Kit, Catalog No. ). Additionally, examples of methods and reagents particularly amenable for use in generating and screening an antibody display library can be found in, for example, U.S. Patent No. 5,223,409; PCT Publication No. WO 92/18619; PCT Publication No. WO 91/17271; PCT Publication No. WO 92/20791; PCT Publication No. WO 92/15679; PCT Publication No. WO 93/01288; PCT Publication No. WO 92/01047; PCT Publication No. WO 92/09690; PCT Publication No. WO 90/02809; Fuchs et al. (1991) Bio/Technology 9:1370-1372; Hay et al. (1992) Hum. Antibod. Hybridomas 3:81-85; Huse et al. (1989) Science 246:1275-1281; Griffiths et al. (1993) EMBO J. 12:725-734.

The invention also provides recombinant antibodies that specifically bind a protein of the invention. In preferred embodiments, the recombinant antibodies specifically bind a marker protein or fragment thereof. Recombinant antibodies include, but are not limited to, chimeric and humanized monoclonal antibodies, comprising both human and non-human portions, single-chain antibodies and multi-specific antibodies. A chimeric antibody is a molecule in which different portions are derived from different animal species, such as those having a variable region derived from a murine mAb and a human immunoglobulin constant region. (See, e.g., Cabilly et al., U.S. Patent No. 4,816,567; and Boss et al., U.S. Patent No. 4,816,397, which are incorporated herein by reference in their entirety.) Single-chain antibodies have an antigen binding site and consist of a single polypeptide. They can be produced by techniques known in the art, for example using methods described in Ladner et al., U.S. Pat. No. 4,946,778 (which is incorporated herein by reference in its entirety); Bird et al., (1988) Science 242:423-426; Whitlow et al., (1991) Methods in Enzymology 2:1-9; Whitlow et al., (1991) Methods in Enzymology 297-105; and Huston et al., (1991) Methods in Enzymology Molecular Design and Modeling: Concepts and Applications 203:46-88. Multi-specific antibodies are antibody molecules having at least two antigen-binding sites that specifically bind different antigens. Such molecules can be produced by techniques known in the art, for example using methods described in Segal, U.S. Patent No. 4,676,980 (the disclosure of which is incorporated herein by reference in its entirety); Holliger et al., (1993) Proc. Natl. Acad. Sci. USA 90:6444-6448; Whitlow et al., (1994) Protein Eng. 7:1017-1026 and U.S. Pat. No. 6,121,424.

Humanized antibodies are antibody molecules from non-human species having one or more complementarity determining regions (CDRs) from the non-human species and a framework region from a human immunoglobulin molecule. (See, e.g., Queen, U.S. Patent No. 5,585,089, which is incorporated herein by reference in its entirety.) Humanized monoclonal antibodies can be produced by recombinant DNA techniques known in the art, for example using methods described in PCT Publication No. WO 71; European Patent Application 184,187; European Patent Application 171,496; European Patent Application 173,494; PCT Publication No. WO 86/01533; U.S. Patent No. 4,816,567; European Patent Application 125,023; Better et al. (1988) Science 240:1041-1043; Liu et al. (1987) Proc. Natl. Acad. Sci. USA 84:3439-3443; Liu et al. (1987) J. Immunol. 139:3521-3526; Sun et al. (1987) Proc. Natl. Acad. Sci. USA 84:214-218; Nishimura et al. (1987) Cancer Res. 47:999-1005; Wood et al. (1985) Nature 314:446-449; Shaw et al. (1988) J. Natl. Cancer Inst. 80:1553-1559; Morrison (1985) Science 02-1207; Oi et al. (1986) Bio/Techniques 4:214; U.S. Patent 5,225,539; Jones et al. (1986) Nature 321:522-525; Verhoeyen et al. (1988) Science 239:1534; and Beidler et al. (1988) J. Immunol. 141:4053-4060.

More particularly, humanized antibodies can be produced, for example, using transgenic mice which are incapable of expressing endogenous immunoglobulin heavy and light chain genes, but which can express human heavy and light chain genes. The transgenic mice are immunized in the normal fashion with a selected antigen, e.g., all or a portion of a polypeptide corresponding to a marker of the invention. Monoclonal antibodies directed against the antigen can be obtained using conventional hybridoma technology. The human immunoglobulin transgenes harbored by the transgenic mice rearrange during B cell differentiation, and subsequently undergo class switching and somatic mutation. Thus, using such a technique, it is possible to produce therapeutically useful IgG, IgA and IgE antibodies. For an overview of this technology for producing human antibodies, see Lonberg and Huszar (1995) Int. Rev. Immunol. 13:65-93. For a detailed discussion of this technology for producing human antibodies and human monoclonal antibodies and protocols for producing such antibodies, see, e.g., U.S. Patent 126; U.S. Patent 5,633,425; U.S. Patent 825; U.S. Patent 5,661,016; and U.S. Patent 5,545,806. In addition, companies such as Abgenix, Inc. (Fremont, CA), can be engaged to provide human antibodies directed against a selected antigen using technology similar to that described above.

Completely human antibodies which recognize a selected epitope can be generated using a technique referred to as "guided selection." In this approach a selected non-human monoclonal antibody, e.g., a murine antibody, is used to guide the selection of a completely human antibody recognizing the same epitope (Jespers et al., 1994, Bio/technology 12:899-903).

The antibodies of the invention can be isolated after production (e.g., from the blood or serum of the subject) or synthesis and further purified by well-known techniques. For example, IgG antibodies can be purified using protein A chromatography. Antibodies specific for a protein of the invention can be selected (e.g., partially purified) or purified by, e.g., affinity chromatography. For example, a recombinantly expressed and purified (or partially purified) protein of the invention is produced as described herein, and covalently or non-covalently coupled to a solid support such as, for example, a chromatography column. The column can then be used to affinity purify antibodies specific for the proteins of the invention from a sample containing antibodies directed against a large number of different epitopes, thereby generating a substantially purified antibody composition, i.e., one that is substantially free of contaminating antibodies. By a substantially purified antibody composition is meant, in this context, that the antibody sample contains at most only 30% (by dry weight) of contaminating antibodies directed against epitopes other than those of the desired protein of the invention, and preferably at most 20%, yet more preferably at most 10%, and most preferably at most 5% (by dry weight) of the sample is contaminating antibodies. A purified antibody composition means that at least 99% of the antibodies in the composition are directed against the desired protein of the invention.
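The purity thresholds in the preceding paragraph can be summarized as a simple lookup. The sketch below is illustrative only: the function name and tier labels are hypothetical, and it simplifies by treating the dry-weight percentages and the 99% figure as a single "percent contaminating antibodies" value.

```python
def purity_tier(contaminant_pct):
    """Map percent contaminating antibodies to the tiers stated in the text.

    Assumption: one percentage is used for all tiers, although the text
    states the 30/20/10/5% limits by dry weight and the 99% figure as a
    fraction of the antibodies in the composition.
    """
    if not 0 <= contaminant_pct <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if contaminant_pct <= 1:
        return "purified"  # at least 99% directed against the desired protein
    if contaminant_pct <= 5:
        return "substantially purified (most preferred)"
    if contaminant_pct <= 10:
        return "substantially purified (more preferred)"
    if contaminant_pct <= 20:
        return "substantially purified (preferred)"
    if contaminant_pct <= 30:
        return "substantially purified"
    return "not substantially purified"
```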

In a preferred embodiment, the substantially purified antibodies of the invention may specifically bind to a signal peptide, a secreted sequence, an extracellular domain, a transmembrane or a cytoplasmic domain or cytoplasmic membrane of a protein of the invention. In a particularly preferred embodiment, the substantially purified antibodies of the invention specifically bind to a secreted sequence or an extracellular domain of the amino acid sequences of a protein of the invention. In a more preferred embodiment, the substantially purified antibodies of the invention specifically bind to a secreted sequence or an extracellular domain of the amino acid sequences of a marker protein.

An antibody directed against a protein of the invention can be used to isolate the protein by standard techniques, such as affinity chromatography or immunoprecipitation.

Moreover, such an antibody can be used to detect the marker protein or fragment thereof (e.g., in a cellular lysate or cell supernatant) in order to evaluate the level and pattern of expression of the marker. The antibodies can also be used diagnostically to monitor protein levels in tissues or body fluids (e.g., in a disease state or toxicity state associated body fluid) as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen. Detection can be facilitated by the use of an antibody derivative, which comprises an antibody of the invention coupled to a detectable substance. Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin; and examples of suitable radioactive material include ¹²⁵I, ¹³¹I, ³⁵S or ³H.

Antibodies of the invention may also be used as therapeutic agents in treating cancers. In a preferred embodiment, completely human antibodies of the invention are used for therapeutic treatment of human cancer patients, particularly those having a cancer. In another preferred embodiment, antibodies that bind specifically to a marker protein or fragment thereof are used for therapeutic treatment. Further, such therapeutic antibody may be an antibody derivative or immunotoxin comprising an antibody conjugated to a therapeutic moiety such as a cytotoxin, a therapeutic agent or a radioactive metal ion. A cytotoxin or cytotoxic agent includes any agent that is detrimental to cells. Examples include taxol, cytochalasin B, gramicidin D, ethidium bromide, emetine, mitomycin, etoposide, tenoposide, vincristine, vinblastine, colchicin, doxorubicin, daunorubicin, dihydroxy anthracin dione, mitoxantrone, mithramycin, actinomycin D, 1-dehydrotestosterone, glucocorticoids, procaine, tetracaine, lidocaine, propranolol, and puromycin and analogs or homologs thereof. Therapeutic agents include, but are not limited to, antimetabolites (e.g., methotrexate, 6-mercaptopurine, 6-thioguanine, cytarabine, 5-fluorouracil, decarbazine), alkylating agents (e.g., mechlorethamine, thioepa, chlorambucil, melphalan, carmustine (BSNU) and lomustine (CCNU), cyclophosphamide, busulfan, dibromomannitol, streptozotocin, mitomycin C, and cis-dichlorodiamine platinum (II) (DDP) cisplatin), anthracyclines (e.g., daunorubicin (formerly daunomycin) and doxorubicin), antibiotics (e.g., dactinomycin (formerly actinomycin), bleomycin, mithramycin, and anthramycin (AMC)), and anti-mitotic agents (e.g., vincristine and vinblastine).

The conjugated antibodies of the invention can be used for modifying a given biological response, for the drug moiety is not to be construed as limited to classical chemical therapeutic agents. For example, the drug moiety may be a protein or polypeptide possessing a desired biological activity. Such proteins may include, for example, a toxin such as ribosome-inhibiting protein (see Better et al., U.S. Patent No. 6,146,631, the disclosure of which is incorporated herein in its entirety), abrin, ricin A, pseudomonas exotoxin, or diphtheria toxin; a protein such as tumor necrosis factor, α-interferon, β-interferon, nerve growth factor, platelet derived growth factor, tissue plasminogen activator; or biological response modifiers such as, for example, lymphokines, interleukin-1 ("IL-1"), interleukin-2 ("IL-2"), interleukin-6 ("IL-6"), granulocyte macrophage colony stimulating factor ("GM-CSF"), granulocyte colony stimulating factor ("G-CSF"), or other growth factors.

Techniques for conjugating such therapeutic moiety to antibodies are well known, see, e.g., Arnon et al., "Monoclonal Antibodies For Immunotargeting Of Drugs In Cancer Therapy", in Monoclonal Antibodies And Cancer Therapy, Reisfeld et al. (eds.), pp. 243-56 (Alan R. Liss, Inc. 1985); Hellstrom et al., "Antibodies For Drug Delivery", in Controlled Drug Delivery (2nd Ed.), Robinson et al. (eds.), pp. 623-53 (Marcel Dekker, Inc. 1987); Thorpe, "Antibody Carriers Of Cytotoxic Agents In Cancer Therapy: A Review", in Monoclonal Antibodies '84: Biological And Clinical Applications, Pinchera et al. (eds.), pp. 475-506 (1985); "Analysis, Results, And Future Prospective Of The Therapeutic Use Of Radiolabeled Antibody In Cancer Therapy", in Monoclonal Antibodies For Cancer Detection And Therapy, Baldwin et al. (eds.), pp. 303-16 (Academic Press 1985), and Thorpe et al., "The Preparation And Cytotoxic Properties Of Antibody-Toxin Conjugates", Immunol. Rev., 62:119-58.

Accordingly, in one aspect, the invention provides substantially purified antibodies, antibody fragments and derivatives, all of which specifically bind to a protein of the invention and preferably, a marker protein. In various embodiments, the substantially purified antibodies of the invention, or fragments or derivatives thereof, can be human, non-human, chimeric and/or humanized antibodies. In another aspect, the invention provides non-human antibodies, antibody fragments and derivatives, all of which specifically bind to a protein of the invention and preferably, a marker protein.

Such non-human antibodies can be goat, mouse, sheep, horse, chicken, rabbit, or rat antibodies. Alternatively, the non-human antibodies of the invention can be chimeric and/or humanized antibodies. In addition, the non-human antibodies of the invention can be polyclonal antibodies or monoclonal antibodies. In still a further aspect, the invention provides monoclonal antibodies, antibody fragments and derivatives, all of which specifically bind to a protein of the invention and preferably, a marker protein.

The monoclonal antibodies can be human, humanized, chimeric and/or non-human antibodies.

The invention also provides a kit containing an antibody of the invention conjugated to a detectable substance, and instructions for use. Still another aspect of the invention is a pharmaceutical composition comprising an antibody of the invention. In one embodiment, the pharmaceutical composition comprises an antibody of the invention and a pharmaceutically acceptable carrier.

E. Predictive Medicine The present invention pertains to the field of predictive medicine in which diagnostic assays, prognostic assays, pharmacogenomics, and monitoring clinical trials are used for prognostic (predictive) purposes to thereby treat an individual prophylactically. Accordingly, one aspect of the present invention relates to diagnostic assays for determining the level of expression of one or more marker proteins or nucleic acids, in order to determine whether an individual is at risk of developing certain disease or drug-induced toxicity. Such assays can be used for prognostic or predictive purposes to thereby prophylactically treat an individual prior to the onset of the disorder.

Yet another aspect of the invention pertains to monitoring the influence of agents (e.g., drugs or other compounds administered either to inhibit or to treat or prevent a disorder or drug-induced toxicity, i.e., in order to understand any drug-induced toxic effects that such treatment may have) on the expression or activity of a marker of the invention in clinical trials. These and other agents are described in further detail in the following sections.

F. Diagnostic Assays An exemplary method for detecting the presence or absence of a marker protein or nucleic acid in a biological sample involves obtaining a biological sample (e.g., toxicity-associated body fluid or tissue sample) from a test subject and contacting the biological sample with a compound or an agent capable of detecting the polypeptide or nucleic acid (e.g., mRNA, genomic DNA, or cDNA). The detection methods of the invention can thus be used to detect mRNA, protein, cDNA, or genomic DNA, for example, in a biological sample in vitro as well as in vivo. For example, in vitro techniques for detection of mRNA include Northern hybridizations and in situ hybridizations. In vitro techniques for detection of a marker protein include enzyme linked immunosorbent assays (ELISAs), Western blots, immunoprecipitations and immunofluorescence. In vitro techniques for detection of genomic DNA include Southern hybridizations. In vivo techniques for detection of mRNA include polymerase chain reaction (PCR), Northern hybridizations and in situ hybridizations. Furthermore, in vivo techniques for detection of a marker protein include introducing into a subject a labeled antibody directed against the protein or fragment thereof. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques.

A general principle of such diagnostic and prognostic assays involves preparing a sample or reaction mixture that may contain a marker, and a probe, under appropriate conditions and for a time sufficient to allow the marker and probe to interact and bind, thus forming a complex that can be removed and/or detected in the reaction mixture.

These assays can be conducted in a variety of ways.

For example, one method to conduct such an assay would involve anchoring the marker or probe onto a solid phase support, also referred to as a substrate, and detecting target marker/probe complexes anchored on the solid phase at the end of the reaction. In one embodiment of such a method, a sample from a subject, which is to be assayed for presence and/or concentration of marker, can be anchored onto a carrier or solid phase support. In another embodiment, the reverse situation is possible, in which the probe can be anchored to a solid phase and a sample from a subject can be allowed to react as an unanchored component of the assay.

There are many established methods for anchoring assay components to a solid phase. These include, without limitation, marker or probe molecules which are immobilized through conjugation of biotin and streptavidin. Such biotinylated assay components can be prepared from biotin-NHS (N-hydroxy-succinimide) using techniques known in the art (e.g., biotinylation kit, Pierce Chemicals, Rockford, IL), and immobilized in the wells of streptavidin-coated 96 well plates (Pierce Chemical). In certain embodiments, the surfaces with immobilized assay components can be prepared in advance and stored.

Other suitable carriers or solid phase supports for such assays include any material capable of binding the class of molecule to which the marker or probe belongs.

Well-known supports or carriers include, but are not limited to, glass, polystyrene, nylon, polypropylene, polyethylene, dextran, amylases, natural and modified celluloses, polyacrylamides, gabbros, and magnetite.

In order to conduct assays with the above mentioned approaches, the non-immobilized component is added to the solid phase upon which the second component is anchored. After the reaction is complete, uncomplexed components may be removed (e.g., by washing) under conditions such that any complexes formed will remain immobilized upon the solid phase. The detection of marker/probe complexes anchored to the solid phase can be accomplished in a number of methods outlined herein.

In a preferred embodiment, the probe, when it is the unanchored assay component, can be labeled for the purpose of detection and readout of the assay, either directly or indirectly, with detectable labels discussed herein and which are well-known to one skilled in the art.

It is also possible to directly detect marker/probe complex formation without further manipulation or labeling of either component (marker or probe), for example by utilizing the technique of fluorescence energy transfer (see, for example, Lakowicz et al., U.S. Patent No. 5,631,169; Stavrianopoulos, et al., U.S. Patent No. 4,868,103). A fluorophore label on the first, ‘donor’ molecule is selected such that, upon excitation with incident light of appropriate wavelength, its emitted fluorescent energy will be absorbed by a fluorescent label on a second ‘acceptor’ molecule, which in turn is able to fluoresce due to the absorbed energy. Alternately, the ‘donor’ protein molecule may simply utilize the natural fluorescent energy of tryptophan residues. Labels are chosen that emit different wavelengths of light, such that the ‘acceptor’ molecule label may be differentiated from that of the ‘donor’. Since the efficiency of energy transfer between the labels is related to the distance separating the molecules, spatial relationships between the molecules can be assessed. In a situation in which binding occurs between the molecules, the fluorescent emission of the ‘acceptor’ molecule label in the assay should be maximal. An FET binding event can be conveniently measured through standard fluorometric detection means well known in the art (e.g., using a fluorimeter).
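The distance dependence of energy transfer is commonly modeled with the Förster relation, E = 1 / (1 + (r/R0)^6), where r is the donor-acceptor separation and R0 is the Förster radius at which transfer is 50% efficient. This relation is not stated in the present description; the following is a minimal sketch, in which the function name `fret_efficiency` and the 5 nm default for `r0_nm` are illustrative assumptions.

```python
def fret_efficiency(distance_nm: float, r0_nm: float = 5.0) -> float:
    """Forster energy-transfer efficiency for one donor-acceptor pair.

    r0_nm is the Forster radius (distance giving 50% transfer). The 5 nm
    default is an illustrative assumption, not a value from the text.
    """
    if distance_nm < 0 or r0_nm <= 0:
        raise ValueError("distance must be >= 0 and r0 must be > 0")
    return 1.0 / (1.0 + (distance_nm / r0_nm) ** 6)


if __name__ == "__main__":
    # Efficiency falls steeply as the labels move apart.
    for d in (2.0, 5.0, 10.0):
        print(f"{d:4.1f} nm -> E = {fret_efficiency(d):.3f}")
```

Because efficiency falls off with the sixth power of distance, a bound complex (labels close together) and unbound components (labels far apart) would be expected to give clearly different acceptor emission.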

In another embodiment, determination of the ability of a probe to recognize a marker can be accomplished without labeling either assay component (probe or marker) by utilizing a technology such as real-time Biomolecular Interaction Analysis (BIA) (see, e.g., Sjolander, S. and Urbaniczky, C., 1991, Anal. Chem. 8-2345 and Szabo et al., 1995, Curr. Opin. Struct. Biol. 5:699-705). As used herein, “BIA” or “surface plasmon resonance” is a technology for studying biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore). Changes in the mass at the binding surface (indicative of a binding event) result in alterations of the refractive index of light near the surface (the optical phenomenon of surface plasmon resonance (SPR)), resulting in a detectable signal which can be used as an indication of real-time reactions between biological molecules.

Alternatively, in another embodiment, analogous diagnostic and prognostic assays can be conducted with marker and probe as solutes in a liquid phase. In such an assay, the complexed marker and probe are separated from uncomplexed components by any of a number of standard techniques, including but not limited to: differential centrifugation, chromatography, electrophoresis and immunoprecipitation. In differential centrifugation, marker/probe complexes may be separated from uncomplexed assay components through a series of centrifugal steps, due to the different sedimentation equilibria of complexes based on their different sizes and densities (see, for example, Rivas, G., and , AR, 1993, Trends Biochem Sci. 18(8):284-7).

Standard chromatographic techniques may also be utilized to separate complexed molecules from uncomplexed ones. For example, gel filtration chromatography separates molecules based on size, and through the utilization of an appropriate gel filtration resin in a column format, for example, the relatively larger complex may be separated from the relatively smaller uncomplexed components. Similarly, the relatively different charge properties of the marker/probe complex as compared to the uncomplexed components may be exploited to differentiate the complex from uncomplexed components, for example through the utilization of ion-exchange chromatography resins. Such resins and chromatographic techniques are well known to one skilled in the art (see, e.g., Heegaard, NH, 1998, J. Mol. Recognit. Winter 11(1-6):141-8; Hage, D.S., and Tweed, S.A. J Chromatogr B Biomed Sci Appl 1997 Oct;699(1-2):499-525). Gel electrophoresis may also be employed to separate complexed assay components from unbound components (see, e.g., Ausubel et al., ed., Current Protocols in Molecular Biology, John Wiley & Sons, New York, 1987-1999). In this technique, protein or nucleic acid complexes are separated based on size or charge, for example. In order to maintain the binding interaction during the electrophoretic process, non-denaturing gel matrix materials and conditions in the absence of reducing agent are typically preferred. Appropriate conditions for the particular assay and components thereof will be well known to one skilled in the art.

In a particular embodiment, the level of marker mRNA can be determined both by in situ and by in vitro formats in a biological sample using methods known in the art.

The term "biological sample" is intended to include tissues, cells, biological fluids and isolates thereof, isolated from a subject, as well as tissues, cells and fluids present within a subject. Many expression detection methods use isolated RNA. For in vitro methods, any RNA isolation technique that does not select against the isolation of mRNA can be utilized for the purification of RNA from cells (see, e.g., Ausubel et al., ed., Current Protocols in Molecular Biology, John Wiley & Sons, New York 1987-1999).

Additionally, large numbers of tissue samples can readily be processed using techniques well known to those of skill in the art, such as, for example, the single-step RNA isolation process of Chomczynski (1989, U.S. Patent No. 155).

The isolated mRNA can be used in hybridization or amplification assays that include, but are not limited to, Southern or Northern analyses, polymerase chain reaction analyses and probe arrays. One preferred diagnostic method for the detection of mRNA levels involves contacting the isolated mRNA with a nucleic acid molecule (probe) that can hybridize to the mRNA encoded by the gene being detected. The nucleic acid probe can be, for example, a full-length cDNA, or a portion thereof, such as an oligonucleotide of at least 7, 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to a mRNA or genomic DNA encoding a marker of the present invention. Other suitable probes for use in the diagnostic assays of the invention are described herein. Hybridization of an mRNA with the probe indicates that the marker in question is being expressed.

In one format, the mRNA is immobilized on a solid surface and contacted with a probe, for example by running the isolated mRNA on an agarose gel and transferring the mRNA from the gel to a membrane, such as nitrocellulose. In an alternative format, the probe(s) are immobilized on a solid surface and the mRNA is contacted with the probe(s), for example, in an Affymetrix gene chip array. A skilled artisan can readily adapt known mRNA detection methods for use in detecting the level of mRNA encoded by the markers of the present invention.

An alternative method for determining the level of mRNA marker in a sample involves the process of nucleic acid amplification, e.g., by RT-PCR (the experimental embodiment set forth in Mullis, 1987, U.S. Patent No. 4,683,202), ligase chain reaction (Barany, 1991, Proc. Natl. Acad. Sci. USA, 88:189-193), self sustained sequence replication (Guatelli et al., 1990, Proc. Natl. Acad. Sci. USA 4-1878), transcriptional amplification system (Kwoh et al., 1989, Proc. Natl. Acad. Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi et al., 1988, Bio/Technology 6:1197), rolling circle replication (Lizardi et al., U.S. Patent No. 5,854,033) or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques well known to those of skill in the art. These detection schemes are especially useful for the detection of nucleic acid molecules if such molecules are present in very low numbers. As used herein, amplification primers are defined as being a pair of nucleic acid molecules that can anneal to 5’ or 3’ regions of a gene (plus and minus strands, respectively, or vice-versa) and contain a short region in between. In general, amplification primers are from about 10 to 30 nucleotides in length and flank a region from about 50 to 200 nucleotides in length. Under appropriate conditions and with appropriate reagents, such primers permit the amplification of a nucleic acid molecule comprising the nucleotide sequence flanked by the primers.
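The size ranges given for amplification primers (about 10 to 30 nucleotides) and for the flanked region (about 50 to 200 nucleotides) can be screened programmatically. The following is a minimal sketch only: the function name `check_primer_pair` is hypothetical, the "about" ranges are treated as hard bounds for simplicity, and no melting-temperature or specificity checks are attempted.

```python
def check_primer_pair(forward: str, reverse: str, flanked_len: int) -> list[str]:
    """Return a list of problems; an empty list means the pair fits the ranges.

    Bounds (10-30 nt primers, 50-200 nt flanked region) follow the general
    ranges in the text, applied here as hard limits by assumption.
    """
    problems = []
    for name, primer in (("forward", forward), ("reverse", reverse)):
        # Only unambiguous DNA bases are accepted in this sketch.
        if not set(primer.upper()) <= set("ACGT"):
            problems.append(f"{name} primer contains non-ACGT characters")
        if not 10 <= len(primer) <= 30:
            problems.append(f"{name} primer length {len(primer)} outside 10-30 nt")
    if not 50 <= flanked_len <= 200:
        problems.append(f"flanked region {flanked_len} outside 50-200 nt")
    return problems


if __name__ == "__main__":
    # Made-up sequences for illustration only.
    print(check_primer_pair("ACGTACGTACGTACGTACGT", "TTGGCCAATTGGCCAATTGG", 120))
    print(check_primer_pair("ACGT", "TTGGCCAATTGGCCAATTGG", 500))
```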

For in situ methods, mRNA does not need to be isolated from the cells prior to detection. In such methods, a cell or tissue sample is prepared/processed using known histological methods. The sample is then immobilized on a support, typically a glass slide, and then contacted with a probe that can hybridize to mRNA that encodes the marker.

As an alternative to making determinations based on the absolute expression level of the marker, determinations may be based on the normalized expression level of the marker. Expression levels are normalized by correcting the absolute expression level of a marker by comparing its expression to the expression of a gene that is not a marker, e.g., a housekeeping gene that is constitutively expressed. Suitable genes for normalization include housekeeping genes such as the actin gene, or epithelial cell-specific genes. This normalization allows the comparison of the expression level in one sample, e.g., a patient sample, to another sample, e.g., a non-disease or non-toxic sample, or between samples from different sources.
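One simple way to carry out such a correction is to express the marker signal as a ratio to the housekeeping-gene signal measured in the same sample. The text does not prescribe a formula, so the ratio used below, the function name `normalize_expression`, and the numeric values are assumptions made for illustration.

```python
def normalize_expression(marker_level: float, reference_level: float) -> float:
    """Normalize a marker's absolute level to a reference (e.g., actin) level.

    Uses a plain ratio, which is an assumed choice; both values must come
    from the same sample and be on the same measurement scale.
    """
    if reference_level <= 0:
        raise ValueError("reference gene level must be positive")
    return marker_level / reference_level


if __name__ == "__main__":
    # Hypothetical readings: (marker, actin) for each sample.
    patient = normalize_expression(120.0, 60.0)
    control = normalize_expression(50.0, 50.0)
    print(f"patient={patient:.2f} control={control:.2f}")
```

Normalized values from different samples or sources can then be compared directly, since differences in total input material are divided out.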

Alternatively, the expression level can be provided as a relative expression level.

To determine a relative expression level of a marker, the level of expression of the marker is determined for 10 or more samples of normal versus disease or toxic cell isolates, preferably 50 or more samples, prior to the determination of the expression level for the sample in question. The mean expression level of each of the genes assayed in the larger number of samples is determined and this is used as a baseline expression level for the marker. The expression level of the marker determined for the test sample (absolute level of expression) is then divided by the mean expression value obtained for that marker. This provides a relative expression level.
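The calculation described above (test-sample level divided by the mean level across a baseline panel of at least 10 samples) can be sketched as follows. The function name `relative_expression` and the example values are hypothetical.

```python
from statistics import mean


def relative_expression(test_level: float,
                        baseline_levels: list[float],
                        min_samples: int = 10) -> float:
    """Divide the test sample's level by the mean of the baseline panel.

    min_samples defaults to 10, the minimum panel size given in the text
    (50 or more is stated as preferred).
    """
    if len(baseline_levels) < min_samples:
        raise ValueError(f"need at least {min_samples} baseline samples")
    baseline = mean(baseline_levels)
    if baseline == 0:
        raise ValueError("baseline mean is zero; ratio is undefined")
    return test_level / baseline


if __name__ == "__main__":
    panel = [10.0] * 5 + [30.0] * 5   # hypothetical baseline panel, mean 20
    print(relative_expression(50.0, panel))
```

As more baseline data accumulate, the same function can simply be called with the enlarged panel to obtain a revised relative value.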

Preferably, the samples used in the baseline determination will be from non-disease or non-toxic cells. The choice of the cell source is dependent on the use of the relative expression level. Using expression found in normal tissues as a mean expression score aids in validating whether the marker assayed is disease or toxicity specific (versus normal cells). In addition, as more data is accumulated, the mean expression value can be revised, providing improved relative expression values based on accumulated data.

Expression data from disease cells or toxic cells provides a means for grading the severity of the disease or toxic state.

In another embodiment of the present invention, a marker protein is detected. A preferred agent for detecting marker protein of the invention is an antibody capable of binding to such a protein or a fragment thereof, preferably an antibody with a detectable label. Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment or derivative thereof (e.g., Fab or F(ab')2) can be used. The term "labeled", with regard to the probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with another reagent that is directly labeled. Examples of indirect labeling include detection of a primary antibody using a fluorescently labeled secondary antibody and end-labeling of a DNA probe with biotin such that it can be detected with fluorescently labeled streptavidin.

Proteins from cells can be isolated using techniques that are well known to those of skill in the art. The protein isolation methods employed can, for example, be such as those described in Harlow and Lane (Harlow and Lane, 1988, Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York).

A variety of formats can be employed to determine whether a sample contains a protein that binds to a given antibody. Examples of such formats include, but are not limited to, enzyme immunoassay (EIA), radioimmunoassay (RIA), Western blot analysis and enzyme-linked immunosorbent assay (ELISA). A skilled artisan can readily adapt known protein/antibody detection methods for use in determining whether cells express a marker of the present invention.

In one format, antibodies, or antibody fragments or derivatives, can be used in methods such as Western blots or immunofluorescence techniques to detect the expressed proteins. In such uses, it is generally preferable to immobilize either the antibody or proteins on a solid support. Suitable solid phase supports or carriers include any support capable of binding an antigen or an antibody. Well-known supports or carriers include glass, polystyrene, polypropylene, polyethylene, dextran, nylon, amylases, natural and modified celluloses, polyacrylamides, gabbros, and magnetite.

One skilled in the art will know many other suitable carriers for binding antibody or antigen, and will be able to adapt such support for use with the present invention. For example, protein isolated from disease or toxic cells can be run on a polyacrylamide gel electrophoresis and immobilized onto a solid phase support such as nitrocellulose. The support can then be washed with suitable buffers followed by treatment with the detectably labeled antibody. The solid phase support can then be washed with the buffer a second time to remove unbound antibody. The amount of bound label on the solid support can then be detected by conventional means.

The invention also encompasses kits for detecting the presence of a marker protein or nucleic acid in a biological sample. Such kits can be used to determine if a subject is suffering from or is at increased risk of developing certain diseases or drug-induced toxicity. For example, the kit can comprise a labeled compound or agent capable of detecting a marker protein or nucleic acid in a biological sample and means for determining the amount of the protein or mRNA in the sample (e.g., an antibody which binds the protein or a fragment thereof, or an oligonucleotide probe which binds to DNA or mRNA encoding the protein). Kits can also include instructions for interpreting the results obtained using the kit.

For antibody-based kits, the kit can comprise, for example: (1) a first antibody (e.g., attached to a solid support) which binds to a marker protein; and, optionally, (2) a second, different antibody which binds to either the protein or the first antibody and is conjugated to a detectable label.

For oligonucleotide-based kits, the kit can comprise, for example: (1) an oligonucleotide, e.g., a detectably labeled oligonucleotide, which hybridizes to a nucleic acid sequence encoding a marker protein or (2) a pair of primers useful for amplifying a marker nucleic acid molecule. The kit can also comprise, e.g., a buffering agent, a preservative, or a protein stabilizing agent. The kit can further comprise components necessary for detecting the detectable label (e.g., an enzyme or a substrate). The kit can also contain a control sample or a series of control samples which can be assayed and compared to the test sample. Each component of the kit can be enclosed within an individual container and all of the various containers can be within a single package, along with instructions for interpreting the results of the assays performed using the kit.

G. Pharmacogenomics The markers of the invention are also useful as pharmacogenomic markers. As used herein, a “pharmacogenomic marker” is an objective biochemical marker whose expression level correlates with a specific clinical drug response or susceptibility in a patient (see, e.g., McLeod et al. (1999) Eur. J. Cancer 35(12): 1650-1652). The presence or quantity of the pharmacogenomic marker expression is related to the predicted response of the patient and more particularly the patient’s diseased or toxic cells to therapy with a specific drug or class of drugs. By assessing the presence or quantity of the expression of one or more pharmacogenomic markers in a patient, a drug therapy which is most appropriate for the patient, or which is predicted to have a greater degree of success, may be selected. For example, based on the presence or quantity of RNA or protein encoded by specific tumor markers in a patient, a drug or course of treatment may be selected that is optimized for the treatment of the specific tumor likely to be present in the patient. The use of pharmacogenomic markers therefore permits selecting or designing the most appropriate treatment for each cancer patient without trying different drugs or regimes.

Another aspect of pharmacogenomics deals with genetic conditions that alter the way the body acts on drugs. These pharmacogenetic conditions can occur either as rare defects or as polymorphisms. For example, glucose-6-phosphate dehydrogenase (G6PD) deficiency is a common inherited enzymopathy in which the main clinical complication is hemolysis after ingestion of oxidant drugs (anti-malarials, sulfonamides, analgesics, nitrofurans) and consumption of fava beans.

As an illustrative embodiment, the activity of drug metabolizing enzymes is a major determinant of both the intensity and duration of drug action. The discovery of genetic polymorphisms of drug metabolizing enzymes (e.g., N-acetyltransferase 2 (NAT 2) and cytochrome P450 enzymes CYP2D6 and CYP2C19) has provided an explanation as to why some patients do not obtain the expected drug effects or show exaggerated drug response and serious toxicity after taking the standard and safe dose of a drug.

These polymorphisms are expressed in two phenotypes in the population, the extensive metabolizer (EM) and poor metabolizer (PM). The prevalence of PM is different among different populations. For example, the gene coding for CYP2D6 is highly polymorphic and several mutations have been identified in PM, which all lead to the absence of functional CYP2D6. Poor metabolizers of CYP2D6 and CYP2C19 quite frequently experience exaggerated drug response and side effects when they receive standard doses.

If a metabolite is the active therapeutic moiety, a PM will show no therapeutic response, as demonstrated for the analgesic effect of codeine mediated by its CYP2D6-formed metabolite morphine. The other extreme are the so called ultra-rapid metabolizers who do not respond to standard doses. Recently, the molecular basis of ultra-rapid metabolism has been identified to be due to CYP2D6 gene amplification.

Thus, the level of expression of a marker of the invention in an individual can be determined to thereby select appropriate agent(s) for therapeutic or prophylactic treatment of the individual. In addition, pharmacogenetic studies can be used to apply genotyping of polymorphic alleles encoding drug-metabolizing enzymes to the identification of an individual's drug responsiveness phenotype. This knowledge, when applied to dosing or drug selection, can avoid adverse reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency when treating a subject with a modulator of expression of a marker of the invention.

H. Monitoring Clinical Trials Monitoring the influence of agents (e.g., drug compounds) on the level of expression of a marker of the invention can be applied not only in basic drug screening, but also in clinical trials. For example, the effectiveness of an agent to affect marker expression can be monitored in clinical trials of subjects receiving treatment for certain diseases, such as cancer, diabetes, obesity, cardiovascular disease, and cardiotoxicity, or drug-induced toxicity. In a preferred embodiment, the present invention provides a method for monitoring the effectiveness of treatment of a subject with an agent (e.g., an agonist, antagonist, peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate) comprising the steps of (i) obtaining a pre-administration sample from a subject prior to administration of the agent; (ii) detecting the level of expression of one or more selected markers of the invention in the pre-administration sample; (iii) obtaining one or more post-administration samples from the subject; (iv) detecting the level of expression of the marker(s) in the post-administration samples; (v) comparing the level of expression of the marker(s) in the pre-administration sample with the level of expression of the marker(s) in the post-administration sample or samples; and (vi) altering the administration of the agent to the subject accordingly. For example, increased expression of the marker gene(s) during the course of treatment may indicate ineffective dosage and the desirability of increasing the dosage. Conversely, decreased expression of the marker gene(s) may indicate efficacious treatment and no need to change dosage.
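Steps (v) and (vi) amount to comparing pre- and post-administration expression levels and acting on the direction of change. The following is a minimal sketch of that comparison for the example case in which increased marker expression indicates ineffective dosage. The function name `suggest_dosage_action`, the use of the mean of the post-administration samples, and the 10% tolerance band are all assumptions made for illustration; for a marker whose expression is expected to rise with effective treatment, the interpretation would be reversed.

```python
from statistics import mean


def suggest_dosage_action(pre_level: float,
                          post_levels: list[float],
                          tolerance: float = 0.10) -> str:
    """Compare pre- vs post-administration marker expression.

    tolerance is an assumed dead band (fractional change) below which the
    change is treated as inconclusive; it is not specified in the text.
    """
    if pre_level <= 0 or not post_levels:
        raise ValueError("need a positive pre level and at least one post sample")
    change = (mean(post_levels) - pre_level) / pre_level
    if change > tolerance:
        return "increased expression: consider increasing dosage"
    if change < -tolerance:
        return "decreased expression: treatment appears efficacious"
    return "no clear change: continue monitoring"


if __name__ == "__main__":
    # Hypothetical expression readings in arbitrary units.
    print(suggest_dosage_action(100.0, [130.0, 150.0]))
    print(suggest_dosage_action(100.0, [60.0, 80.0]))
```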

H. Arrays The invention also includes an array comprising a marker of the present invention. The array can be used to assay expression of one or more genes in the array.

In one embodiment, the array can be used to assay gene expression in a tissue to ascertain tissue specificity of genes in the array. In this manner, up to about 7600 genes can be simultaneously assayed for expression. This allows a profile to be developed showing a battery of genes specifically expressed in one or more tissues.

In addition to such qualitative determination, the invention allows the quantitation of gene expression. Thus, not only tissue specificity, but also the level of expression of a battery of genes in the tissue is ascertainable. Thus, genes can be grouped on the basis of their tissue expression per se and level of expression in that tissue. This is useful, for example, in ascertaining the relationship of gene expression between or among tissues. Thus, one tissue can be perturbed and the effect on gene expression in a second tissue can be determined. In this context, the effect of one cell type on another cell type in response to a biological stimulus can be determined. Such a determination is useful, for example, to know the effect of cell-cell interaction at the level of gene expression. If an agent is administered therapeutically to treat one cell type but has an undesirable effect on another cell type, the invention provides an assay to determine the molecular basis of the undesirable effect and thus provides the opportunity to co-administer a counteracting agent or otherwise treat the undesired effect. Similarly, even within a single cell type, undesirable biological effects can be determined at the molecular level. Thus, the effects of an agent on expression of other than the target gene can be ascertained and counteracted.

In another embodiment, the array can be used to monitor the time course of expression of one or more genes in the array. This can occur in various biological contexts, as disclosed herein, for example development of drug-induced toxicity, progression of drug-induced toxicity, and processes, such as cellular transformation associated with drug-induced toxicity.

The array is also useful for ascertaining the effect of the expression of a gene on the expression of other genes in the same cell or in different cells. This provides, for example, for a selection of alternate molecular targets for therapeutic intervention if the ultimate or downstream target cannot be regulated.

The array is also useful for ascertaining differential expression patterns of one or more genes in normal and abnormal cells. This provides a battery of genes that could serve as a molecular target for diagnosis or therapeutic intervention.
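A differential expression pattern between normal and abnormal cells is often summarized as a log2 fold change per gene. The text does not name a statistic, so the log2 ratio, the two-fold threshold, the function name `differential_genes`, and the gene labels below are assumptions for illustration only.

```python
import math


def differential_genes(normal: dict[str, float],
                       abnormal: dict[str, float],
                       min_abs_log2_fc: float = 1.0) -> dict[str, float]:
    """Return {gene: log2(abnormal/normal)} for genes passing the threshold.

    The default threshold of 1.0 (a two-fold change) is an assumed cutoff.
    Genes missing from either profile, or with non-positive levels, are skipped.
    """
    hits = {}
    for gene in normal.keys() & abnormal.keys():
        n, a = normal[gene], abnormal[gene]
        if n <= 0 or a <= 0:
            continue
        log2_fc = math.log2(a / n)
        if abs(log2_fc) >= min_abs_log2_fc:
            hits[gene] = log2_fc
    return hits


if __name__ == "__main__":
    # Hypothetical array intensities for placeholder gene labels.
    normal = {"GENE_A": 10.0, "GENE_B": 10.0, "GENE_C": 10.0}
    abnormal = {"GENE_A": 40.0, "GENE_B": 11.0, "GENE_C": 2.5}
    print(differential_genes(normal, abnormal))
```

Genes returned by such a screen would be candidates only; replicate measurements and an appropriate statistical test would be needed before treating any of them as a diagnostic or therapeutic target.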

VII. Methods for Obtaining Samples Samples useful in the methods of the invention include any tissue, cell, biopsy, or bodily fluid sample that expresses a marker of the invention. In one embodiment, a sample may be a tissue, a cell, whole blood, serum, plasma, buccal scrape, saliva, cerebrospinal fluid, urine, stool, or bronchoalveolar lavage. In preferred embodiments, the tissue sample is a disease state or toxicity state sample. In more preferred embodiments, the tissue sample is a cancer sample, a diabetes sample, an obesity sample, a cardiovascular sample or a drug-induced toxicity sample.

Body samples may be obtained from a subject by a variety of techniques known in the art including, for example, by the use of a biopsy or by scraping or swabbing an area or by using a needle to aspirate bodily fluids. Methods for collecting various body samples are well known in the art.

Tissue samples suitable for detecting and quantitating a marker of the invention may be fresh, frozen, or fixed according to methods known to one of skill in the art.

Suitable tissue samples are preferably sectioned and placed on a microscope slide for further analyses. Alternatively, solid samples, i.e., tissue samples, may be solubilized and/or homogenized and subsequently analyzed as soluble extracts.

In one embodiment, a freshly obtained biopsy sample is frozen using, for example, liquid nitrogen or difluorodichloromethane. The frozen sample is mounted for sectioning using, for example, OCT, and serially sectioned in a cryostat. The serial sections are collected on a glass microscope slide. For immunohistochemical staining the slides may be coated with, for example, chrome-alum, gelatine or poly-L-lysine to ensure that the sections stick to the slides. In another embodiment, samples are fixed and embedded prior to sectioning. For example, a tissue sample may be fixed in, for example, formalin, serially dehydrated and embedded in, for example, paraffin.

Once the sample is obtained, any method known in the art to be suitable for detecting and quantitating a marker of the invention may be used (either at the nucleic acid or at the protein level). Such methods are well known in the art and include but are not limited to western blots, northern blots, southern blots, immunohistochemistry, ELISA, e.g., amplified ELISA, immunoprecipitation, immunofluorescence, flow cytometry, immunocytochemistry, mass spectrometric analyses, e.g., MALDI-TOF and SELDI-TOF, nucleic acid hybridization techniques, nucleic acid reverse transcription methods, and nucleic acid amplification methods. In particular embodiments, the expression of a marker of the invention is detected on a protein level using, for example, antibodies that specifically bind these proteins.

Samples may need to be modified in order to make a marker of the invention accessible to antibody binding. In a particular aspect of the immunocytochemistry or immunohistochemistry methods, slides may be transferred to a pretreatment buffer and optionally heated to increase antigen accessibility. Heating of the sample in the pretreatment buffer rapidly disrupts the lipid bi-layer of the cells and makes the antigens (may be the case in fresh specimens, but not typically what occurs in fixed specimens) more accessible for antibody binding. The terms "pretreatment buffer" and "preparation buffer" are used interchangeably herein to refer to a buffer that is used to prepare cytology or histology samples for immunostaining, particularly by increasing the accessibility of a marker of the invention for antibody binding. The pretreatment buffer may comprise a pH-specific salt solution, a polymer, a detergent, or a nonionic or anionic surfactant such as, for example, an ethyloxylated anionic or nonionic surfactant, an alkanoate or an alkoxylate or even blends of these surfactants or even the use of a bile salt. The pretreatment buffer may, for example, be a solution of 0.1% to 1% of deoxycholic acid, sodium salt, or a solution of sodium laureth-13-carboxylate (e.g., Sandopan LS) or an ethoxylated anionic complex. In some embodiments, the pretreatment buffer may also be used as a slide storage buffer.

Any method for making marker proteins of the invention more accessible for antibody binding may be used in the practice of the invention, including the antigen retrieval methods known in the art. See, for example, Bibbo, et al. (2002) Acta. Cytol. 46:25-29; Saqi, et al. (2003) Diagn. Cytopathol. 27:365-370; Bibbo, et al. (2003) Anal. Quant. Cytol. Histol. 25:8-11, the entire contents of each of which are incorporated herein by reference.

Following pretreatment to increase marker protein accessibility, samples may be blocked using an appropriate blocking agent, e.g., a peroxidase blocking reagent such as hydrogen peroxide. In some embodiments, the samples may be blocked using a protein blocking reagent to prevent non-specific binding of the antibody. The protein blocking reagent may comprise, for example, purified casein. An antibody, particularly a monoclonal or polyclonal antibody that specifically binds to a marker of the invention, is then incubated with the sample. One of skill in the art will appreciate that a more accurate prognosis or diagnosis may be obtained in some cases by detecting multiple epitopes on a marker protein of the invention in a patient sample. Therefore, in particular embodiments, at least two antibodies directed to different epitopes of a marker of the invention are used. Where more than one antibody is used, these antibodies may be added to a single sample sequentially as individual antibody reagents or simultaneously as an antibody cocktail. Alternatively, each individual antibody may be added to a separate sample from the same patient, and the resulting data pooled. Techniques for detecting antibody binding are well known in the art. Antibody binding to a marker of the invention may be detected through the use of chemical reagents that generate a detectable signal that corresponds to the level of antibody binding and, accordingly, to the level of marker protein expression. In one of the immunohistochemistry or immunocytochemistry methods of the invention, antibody binding is detected through the use of a secondary antibody that is conjugated to a labeled polymer. Examples of labeled polymers include but are not limited to polymer-enzyme conjugates. The enzymes in these complexes are typically used to catalyze the deposition of a chromogen at the antigen-antibody binding site, thereby resulting in cell staining that corresponds to expression level of the biomarker of interest. Enzymes of particular interest include, but are not limited to, horseradish peroxidase (HRP) and alkaline phosphatase (AP).

In one particular immunohistochemistry or immunocytochemistry method of the invention, antibody binding to a marker of the invention is detected through the use of an HRP-labeled polymer that is conjugated to a secondary antibody. Antibody binding can also be detected through the use of a species-specific probe reagent, which binds to monoclonal or polyclonal antibodies, and a polymer conjugated to HRP, which binds to the species-specific probe reagent. Slides are stained for antibody binding using any chromogen, e.g., the chromogen 3,3'-diaminobenzidine (DAB), and then counterstained with hematoxylin and, optionally, a bluing agent such as ammonium hydroxide or TBS/Tween-20. Other suitable chromogens include, for example, 3-amino-9-ethylcarbazole (AEC). In some aspects of the invention, slides are reviewed microscopically by a cytotechnologist and/or a pathologist to assess cell staining, e.g., fluorescent staining (i.e., marker expression). Alternatively, samples may be reviewed via automated microscopy or by personnel with the assistance of computer software that facilitates the identification of positive staining cells.

Detection of antibody binding can be facilitated by coupling the anti-marker antibodies to a detectable substance. Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin; and examples of suitable radioactive material include 125I, 131I, 35S, 14C, or 3H.

In one embodiment of the invention, frozen samples are prepared as described above and subsequently stained with antibodies against a marker of the invention diluted to an appropriate concentration using, for example, Tris-buffered saline (TBS). Primary antibodies can be detected by incubating the slides in biotinylated anti-immunoglobulin. This signal can optionally be amplified and visualized using diaminobenzidine precipitation of the antigen. Furthermore, slides can be optionally counterstained with, for example, hematoxylin, to visualize the cells.

In another embodiment, fixed and embedded samples are stained with antibodies against a marker of the invention and counterstained as described above for frozen sections. In addition, samples may be optionally treated with agents to amplify the signal in order to visualize antibody staining. For example, a peroxidase-catalyzed deposition of biotinyl-tyramide, which in turn is reacted with peroxidase-conjugated streptavidin (Catalyzed Signal Amplification (CSA) System, DAKO, Carpinteria, CA) may be used.

Tissue-based assays (i.e., histochemistry) are the preferred methods of detecting and quantitating a marker of the invention. In one embodiment, the presence or absence of a marker of the invention may be determined by immunohistochemistry.

In one embodiment, the immunohistochemical analysis uses low concentrations of an anti-marker antibody such that cells lacking the marker do not stain. In another embodiment, the presence or absence of a marker of the invention is determined using an immunohistochemical method that uses high concentrations of an anti-marker antibody such that cells lacking the marker protein stain heavily. Cells that do not stain contain either mutated marker and fail to produce antigenically recognizable marker protein, or are cells in which the pathways that regulate marker levels are dysregulated, resulting in steady state expression of negligible marker protein.

One of skill in the art will recognize that the concentration of a particular antibody used to practice the methods of the invention will vary depending on such factors as time for binding, level of specificity of the antibody for a marker of the invention, and method of sample preparation. Moreover, when multiple antibodies are used, the required concentration may be affected by the order in which the antibodies are applied to the sample, e.g., simultaneously as a cocktail or sequentially as individual antibody reagents. Furthermore, the detection chemistry used to visualize antibody binding to a marker of the invention must also be optimized to produce the desired signal to noise ratio.

In one embodiment of the invention, proteomic methods, e.g., mass spectrometry, are used for detecting and quantitating the marker proteins of the invention. For example, matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) or surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF MS), which involves the application of a biological sample, such as serum, to a protein-binding chip (Wright, G.L., Jr., et al. (2002) Expert Rev Mol Diagn 2:549; Li, J., et al. (2002) Clin Chem 48:1296; Laronga, C., et al. (2003) Dis Markers ; Petricoin, EMF, et al. (2002) 359:572; Adam, B.L., et al. (2002) Cancer Res 62:3609; Tolson, J., et al. (2004) Lab Invest 84:845; Xiao, Z., et al. (2001) Cancer Res 61:6029), can be used to detect and quantitate the PY-Shc and/or p66-Shc proteins. Mass spectrometric methods are described in, for example, U.S. Patent Nos. 5,622,824, 5,605,798 and 5,547,835, the entire contents of each of which are incorporated herein by reference.

In other embodiments, the expression of a marker of the invention is detected at the nucleic acid level. Nucleic acid-based techniques for assessing expression are well known in the art and include, for example, determining the level of marker mRNA in a sample from a subject. Many expression detection methods use isolated RNA. Any RNA isolation technique that does not select against the isolation of mRNA can be utilized for the purification of RNA from cells that express a marker of the invention (see, e.g., Ausubel et al., ed., (1987-1999) Current Protocols in Molecular Biology (John Wiley & Sons, New York)). Additionally, large numbers of tissue samples can readily be processed using techniques well known to those of skill in the art, such as, for example, the single-step RNA isolation process of Chomczynski (1989, U.S. Pat. No. 155).

The term "probe" refers to any molecule that is capable of selectively binding to a marker of the invention, for example, a nucleotide transcript and/or protein. Probes can be synthesized by one of skill in the art, or derived from appropriate biological preparations. Probes may be specifically designed to be labeled. Examples of molecules that can be utilized as probes include, but are not limited to, RNA, DNA, proteins, antibodies, and organic molecules.

Isolated mRNA can be used in hybridization or amplification assays that include, but are not limited to, Southern or Northern analyses, polymerase chain reaction analyses and probe arrays. One method for the detection of mRNA levels involves contacting the isolated mRNA with a nucleic acid molecule (probe) that can hybridize to the marker mRNA. The nucleic acid probe can be, for example, a full-length cDNA, or a portion thereof, such as an oligonucleotide of at least 7, 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to marker genomic DNA.

In one embodiment, the mRNA is immobilized on a solid surface and contacted with a probe, for example by running the isolated mRNA on an agarose gel and transferring the mRNA from the gel to a membrane, such as nitrocellulose. In an alternative embodiment, the probe(s) are immobilized on a solid surface and the mRNA is contacted with the probe(s), for example, in an Affymetrix gene chip array. A skilled artisan can readily adapt known mRNA detection methods for use in detecting the level of marker mRNA.

An alternative method for determining the level of marker mRNA in a sample involves the process of nucleic acid amplification, e.g., by RT-PCR (the experimental embodiment set forth in Mullis, 1987, U.S. Pat. No. 4,683,202), ligase chain reaction (Barany (1991) Proc. Natl. Acad. Sci. USA 88:189-193), self sustained sequence replication (Guatelli et al. (1990) Proc. Natl. Acad. Sci. USA 87:1874-1878), transcriptional amplification system (Kwoh et al. (1989) Proc. Natl. Acad. Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi et al. (1988) Bio/Technology 6:1197), rolling circle replication (Lizardi et al., U.S. Pat. No. 5,854,033) or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques well known to those of skill in the art. These detection schemes are especially useful for the detection of nucleic acid molecules if such molecules are present in very low numbers. In particular aspects of the invention, marker expression is assessed by quantitative fluorogenic RT-PCR (i.e., the TaqMan™ System). Such methods typically utilize pairs of oligonucleotide primers that are specific for a marker of the invention.

Methods for designing oligonucleotide primers specific for a known sequence are well known in the art.

The expression levels of a marker of the invention may be monitored using a membrane blot (such as used in hybridization analysis such as Northern, Southern, dot, and the like), or microwells, sample tubes, gels, beads or fibers (or any solid support comprising bound nucleic acids). See U.S. Pat. Nos. 5,770,722, 219, 5,744,305, ,677,195 and 5,445,934, which are incorporated herein by reference. The detection of marker expression may also comprise using nucleic acid probes in solution.

In one embodiment of the invention, microarrays are used to detect the expression of a marker of the invention. Microarrays are particularly well suited for this purpose because of the reproducibility between different experiments. DNA microarrays provide one method for the simultaneous measurement of the expression levels of large numbers of genes. Each array consists of a reproducible pattern of capture probes attached to a solid support. Labeled RNA or DNA is hybridized to complementary probes on the array and then detected by laser scanning. Hybridization intensities for each probe on the array are determined and converted to a quantitative value representing relative gene expression levels. See U.S. Pat. Nos. 6,040,138, 5,800,992, 6,020,135, 6,033,860, and 6,344,316, which are incorporated herein by reference.

High-density oligonucleotide arrays are particularly useful for determining the gene expression profile for a large number of RNAs in a sample.

The amounts of marker, and/or a mathematical relationship of the amounts of a marker of the invention may be used to calculate the risk of recurrence of a disease state, e.g., cancer, diabetes, obesity, cardiovascular disease, or a toxicity state, e.g., a drug-induced toxicity or cardiotoxicity, in a subject being treated for a disease state or toxicity state, the survival of a subject being treated for a disease state or a toxicity state, whether a disease state or toxicity state is aggressive, the efficacy of a treatment regimen for treating a disease state or toxicity state, and the like, using the methods of the invention, which may include methods of regression analysis known to one of skill in the art. For example, suitable regression models include, but are not limited to, CART (e.g., Hill, T., and Lewicki, P. (2006) “STATISTICS Methods and Applications” StatSoft, Tulsa, OK), Cox (e.g., www.evidence-based-medicine.co.uk), exponential, normal and log normal (e.g., www.obgyn.cam.ac.uk/mrg/statsbook/stsurvan.html), logistic (e.g., www.en.wikipedia.org/wiki/Logistic_regression), parametric, non-parametric, semi-parametric (e.g., www.socserv.mcmaster.ca/jfox/Books/Companion), linear (e.g., www.en.wikipedia.org/wiki/Linear_regression), or additive (e.g., www.en.wikipedia.org/wiki/Generalized_additive_model).

In one embodiment, a regression analysis includes the amounts of marker. In another embodiment, a regression analysis includes a marker mathematical relationship.

In yet another embodiment, a regression analysis of the amounts of marker, and/or a marker mathematical relationship may include additional clinical and/or molecular co-variates. Such clinical co-variates include, but are not limited to, nodal status, tumor stage, tumor grade, tumor size, treatment regimen, e.g., chemotherapy and/or radiation therapy, clinical outcome (e.g., relapse, disease-specific survival, therapy failure), and/or clinical outcome as a function of time after diagnosis, time after initiation of therapy, and/or time after completion of treatment.
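By way of illustration only, a logistic regression that combines a marker amount with one clinical co-variate might be sketched as follows. The function names, the gradient-descent fitting routine, and the choice of tumor size as the co-variate are assumptions made for this sketch; the text above does not prescribe a particular fitting procedure, and any of the listed regression models could be substituted.

```python
import numpy as np

def fit_logistic(features, outcomes, learning_rate=0.1, steps=5000):
    """Fit logistic regression weights (intercept first) by gradient descent.

    features: (n_subjects, n_features) array, e.g. marker amount and tumor size.
    outcomes: (n_subjects,) array of 0/1 labels, e.g. 1 = recurrence observed.
    """
    x = np.column_stack([np.ones(len(features)), features])
    weights = np.zeros(x.shape[1])
    for _ in range(steps):
        predicted = 1.0 / (1.0 + np.exp(-x @ weights))
        # Gradient of the mean negative log-likelihood.
        gradient = x.T @ (predicted - outcomes) / len(outcomes)
        weights -= learning_rate * gradient
    return weights

def predict_risk(weights, features):
    """Return the modeled probability of the outcome for each subject."""
    x = np.column_stack([np.ones(len(features)), features])
    return 1.0 / (1.0 + np.exp(-x @ weights))
```

A fitted model of this kind would be applied by calling `predict_risk(weights, new_features)` on the marker amounts and co-variates of a new subject; the resulting probability is only as reliable as the training data and the model assumptions.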

VIII. Kits

The invention also provides compositions and kits for prognosing a disease state, e.g., cancer, diabetes, obesity, cardiovascular disease, or a toxicity state, e.g., a drug-induced toxicity or cardiotoxicity, recurrence of a disease state or a toxicity state, or survival of a subject being treated for a disease state or a toxicity state. These kits include one or more of the following: a detectable antibody that specifically binds to a marker of the invention, reagents for obtaining and/or preparing subject tissue samples for staining, and instructions for use.

The kits of the invention may optionally comprise additional components useful for performing the methods of the invention. By way of example, the kits may comprise fluids (e.g., SSC buffer) suitable for annealing complementary nucleic acids or for binding an antibody with a protein with which it specifically binds, one or more sample compartments, an instructional material which describes performance of a method of the invention and tissue specific controls/standards.

IX. Screening Assays

Markers of the invention include, but are not limited to, the genes and/or proteins listed herein. Based on the results of experiments described by Applicants herein, the key proteins modulated in a disease state or a toxicity state are associated with or can be classified into different pathways or groups of molecules, including cytoskeletal components, transcription factors, apoptotic response, pentose phosphate pathway, biosynthetic pathway, oxidative stress (pro-oxidant), membrane alterations, and oxidative phosphorylation metabolism. Accordingly, in one embodiment of the invention, a marker may include one or more genes (or proteins) selected from the group consisting of HSPAS, FLNB, PARK7, HSPA1A/HSPA1B, ST13, TUBB3, MIF, MRS, NARS, LGALS1, DDX17, EIF5A, HSPAS, DHX9, HNRNPC, CKAP4, HSPA9, PARP1, HADHA, PHB2, ATP5A1, CANX, GRP78, GRP75, TIMP1, PTX3, HSP76, PDIA4, PDIA1, CA2D1, GPAT1 and TAZ. In one embodiment, a marker may include one or more genes (or proteins) selected from the group consisting of GRP78, GRP75, TIMP1, PTX3, HSP76, PDIA4, PDIA1, CA2D1, GPAT1 and TAZ. In some embodiments, the markers are a combination of at least two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, fourteen, fifteen, sixteen, seventeen, eighteen, nineteen, twenty, twenty-five, thirty, or more of the foregoing genes (or proteins).

Screening assays useful for identifying modulators of identified markers are described below.

The invention also provides methods (also referred to herein as "screening assays") for identifying modulators, i.e., candidate or test compounds or agents (e.g., proteins, peptides, peptidomimetics, peptoids, small molecules or other drugs), which are useful for treating or preventing a disease state or a toxicity state by modulating the expression and/or activity of a marker of the invention. Such assays typically comprise a reaction between a marker of the invention and one or more assay components. The other components may be either the test compound itself, or a combination of test compounds and a natural binding partner of a marker of the invention. Compounds identified via assays such as those described herein may be useful, for example, for modulating, e.g., inhibiting, ameliorating, treating, or preventing aggressiveness of a disease state or toxicity state.

The test compounds used in the screening assays of the present invention may be obtained from any available source, including systematic libraries of natural and/or synthetic compounds. Test compounds may also be obtained by any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; peptoid libraries (libraries of molecules having the functionalities of peptides, but with a novel, non-peptide backbone which are resistant to enzymatic degradation but which nevertheless remain bioactive; see, e.g., Zuckermann et al., 1994, J. Med. Chem. 37:2678-85); spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the 'one-bead one-compound' library method; and synthetic library methods using affinity chromatography selection. The biological library and peptoid library approaches are limited to peptide libraries, while the other four approaches are applicable to peptide, non-peptide oligomer or small molecule libraries of compounds (Lam, 1997, Anticancer Drug Des. 12:145).

Examples of methods for the synthesis of molecular libraries can be found in the art, for example in: DeWitt et al. (1993) Proc. Natl. Acad. Sci. USA 90:6909; Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et al. (1993) Science 261:1303; Carrell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2061; and in Gallop et al. (1994) J. Med. Chem. 37:1233.

Libraries of compounds may be presented in solution (e.g., Houghten, 1992, Biotechniques 13:412-421), or on beads (Lam, 1991, Nature 354:82-84), chips (Fodor, 1993, Nature 5-556), bacteria and/or spores (Ladner, USP 5,223,409), plasmids (Cull et al., 1992, Proc Natl Acad Sci USA 89:1865-1869) or on phage (Scott and Smith, 1990, Science 6-390; Devlin, 1990, Science 249:404-406; Cwirla et al., 1990, Proc. Natl. Acad. Sci. 87:6378-6382; Felici, 1991, J. Mol. Biol. 222:301-310; Ladner, supra.).

The screening methods of the invention comprise contacting a disease state cell or a toxicity state cell with a test compound and determining the ability of the test compound to modulate the expression and/or activity of a marker of the invention in the cell. The expression and/or activity of a marker of the invention can be determined as described herein.

In another embodiment, the invention provides assays for screening candidate or test compounds which are substrates of a marker of the invention or biologically active portions thereof. In yet another embodiment, the invention provides assays for screening candidate or test compounds which bind to a marker of the invention or biologically active portions thereof. Determining the ability of the test compound to directly bind to a marker can be accomplished, for example, by coupling the compound with a radioisotope or enzymatic label such that binding of the compound to the marker can be determined by detecting the labeled marker compound in a complex. For example, compounds (e.g., marker substrates) can be labeled with 131I, 125I, 35S, 14C, or 3H, either directly or indirectly, and the radioisotope detected by direct counting of radioemission or by scintillation counting. Alternatively, assay components can be enzymatically labeled with, for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by determination of conversion of an appropriate substrate to product.

This invention further pertains to novel agents identified by the above-described screening assays. Accordingly, it is within the scope of this invention to further use an agent identified as described herein in an appropriate animal model. For example, an agent capable of modulating the expression and/or activity of a marker of the invention identified as described herein can be used in an animal model to determine the efficacy, toxicity, or side effects of treatment with such an agent. Alternatively, an agent identified as described herein can be used in an animal model to determine the mechanism of action of such an agent. Furthermore, this invention pertains to uses of novel agents identified by the above-described screening assays for treatment as described above.

Exemplification of the Invention

EXAMPLE 1: Employing Platform Technology to Build a Cancer Consensus and Simulation Networks

In this example, the platform technology described in detail above was employed to integrate data obtained from a custom built in vitro cancer model, and thereby identify novel proteins/pathways driving the pathogenesis of cancer. Relational maps resulting from this analysis have provided cancer treatment targets, as well as diagnostic/prognostic markers associated with cancer.

The study design is depicted in Figure 18. Briefly, two cancer cell lines (PaCa2, HepG2) and one normal cell line (THLE2) were subjected to one of seven conditions simulating an environment experienced by cancer cells in vivo. Specifically, cells were exposed to hyperglycemic condition, hypoxia condition, lactic acid condition, hyperglycemic + hypoxia combination condition, hyperglycemic + lactic acid combination condition, hypoxia + lactic acid combination condition, or hyperglycemic + hypoxia + lactic acid combination condition. The different conditions were created as follows:

--Hyperglycemic condition was created by culturing the cells in media containing 22 mM glucose.

--Hypoxia condition was induced by placing the cells in a Modular Incubator Chamber (MIC-101, Billups-Rothenberg Inc. Del Mar, CA), which was flooded with an industrial gas mix containing 5% CO2, 2% O2 and 93% nitrogen.

--Lactic acid condition was created by culturing the cells in media containing 12.5 mM lactic acid.

--Hyperglycemic + hypoxia combination condition was created by culturing the cells in media containing 22 mM glucose and the cells were placed in a Modular Incubator Chamber flooded with an industrial gas mix containing 5% CO2, 2% O2 and 93% nitrogen.

--Hyperglycemic + lactic acid combination condition was created by culturing the cells in media containing 22 mM glucose and 12.5 mM lactic acid.

--Hypoxia + lactic acid combination condition was created by culturing the cells in media containing 12.5 mM lactic acid and the cells were placed in a Modular Incubator Chamber flooded with an industrial gas mix containing 5% CO2, 2% O2 and 93% nitrogen.

--Hyperglycemic + hypoxia + lactic acid combination condition was created by culturing the cells in media containing 22 mM glucose and 12.5 mM lactic acid, and the cells were placed in a Modular Incubator Chamber flooded with an industrial gas mix containing 5% CO2, 2% O2 and 93% nitrogen.

The cell model comprising the above-mentioned cells, wherein the cells were exposed to each condition described above, was additionally interrogated by exposing the cells to an environmental perturbation by treating with Coenzyme Q10. Specifically, the cells were treated with Coenzyme Q10 at 0, 50 µM, or 100 µM.

Cell samples as well as media samples for each cell line with each condition and each Coenzyme Q10 treatment were collected at various times following treatment, including after 24 hours and 48 hours of treatment.
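The factorial layout described above (three cell lines, seven stress conditions built from three stressors, three Coenzyme Q10 doses, and the two named harvest times) can be enumerated with a short script. This is a minimal sketch: the variable names and tuple layout are illustrative assumptions, and only the 24-hour and 48-hour time points named in the text are included, although the text indicates other collection times may have been used.

```python
from itertools import combinations, product

# Values taken from the example; names and structure are illustrative assumptions.
CELL_LINES = ["PaCa2", "HepG2", "THLE2"]
STRESSORS = ["hyperglycemia", "hypoxia", "lactic acid"]
COQ10_UM = [0, 50, 100]      # Coenzyme Q10 dose in micromolar
TIMEPOINTS_H = [24, 48]      # harvest times in hours

def stress_conditions(stressors):
    """Return every non-empty combination of stressors (7 for 3 stressors)."""
    return [
        combo
        for size in range(1, len(stressors) + 1)
        for combo in combinations(stressors, size)
    ]

def design_matrix():
    """Enumerate every (cell line, condition, dose, time point) sample."""
    return list(
        product(CELL_LINES, stress_conditions(STRESSORS), COQ10_UM, TIMEPOINTS_H)
    )

if __name__ == "__main__":
    print(len(stress_conditions(STRESSORS)), "conditions")
    print(len(design_matrix()), "cell/media sample groups under these assumptions")
```

Enumerating the design this way makes it easy to check that every combination has a matching proteomics and bioenergetics record before the data are merged.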

In addition, cross talk experiments between two different cancer cells, PaCa2 and HepG2 cells, were carried out in which PaCa2 and HepG2 cells were co-cultured. This co-culturing approach is referred to as an extracellular secretome (ECS) experiment.

The first cell system (PaCa2) was first seeded in the inserts of the wells of a transwell type growth chamber. Six well plates were used to enable better statistical analysis. At the time of seeding with the first cell system in the inserts, the inserts were placed in a separate 6-well plate. The second cell system (HepG2) was seeded on the primary tray.

The insert tray containing the first cell system and the primary tray containing the second cell system were incubated at 37°C overnight. Each of the cell systems was grown in its respective cell-specific media (wherein alternatively, each of the cell systems could be grown in a medium adapted to support the growth of both cell types). On the second day, the pre-determined treatment was given by media exchange. Specifically, the inserts containing the first cell system were placed into the primary tray containing the second cell system. The tray was then incubated for a pre-determined time period, e.g., 24 hours or 48 hours. Duplicate wells were set up with the same conditions, and cells were pooled to yield sufficient material for 2D analysis. The media (1 ml aliquot), the cells from the inserts and the cells from the wells of the primary tray were harvested as separate samples. The experiments were conducted in triplicate in order to provide better statistical analysis power.

Cross-talk experiments were also conducted by “media swap” experiments.

Specifically, a cultured media or “secretome” from the first cell system (PaCa2) was collected after 24 hrs or 48 hrs following perturbation or conditioning as described above and then added to the second cell system (HepG2) for 24-48 hrs. The final cultured media or “secretome” from the second cell system was then collected. All final secretomes were subjected to proteomic analysis.

iProfiling of changes in total cellular protein expression by quantitative proteomics was performed for cell and media samples collected for each cell line at each condition and with each “environmental perturbation”, i.e., Coenzyme Q10 treatment, using the techniques described above in the detailed description. iProfiling of changes in total cellular protein expression by quantitative proteomics was similarly performed for cell and media samples collected for each co-cultured cell line at each condition with each treatment.

Further, bioenergetics profiles of the cancer cells, normal cells and cells in cross-talk experiments exposed to each condition, with or without Coenzyme Q10 perturbation, were generated by employing the Seahorse analyzer essentially as recommended by the manufacturer. OCR (oxygen consumption rate) and ECAR (extracellular acidification rate) were recorded by the electrodes in a 7 µl chamber created with the cartridge pushing against the Seahorse culture plate.

Proteomics data collected for each cell line (including cells in cross-talk experiments) at each condition and with each perturbation, and bioenergetics profiling data collected for each cell line at each condition and with each perturbation, were all inputted and processed by the REFS™ system. Raw data for PaCa2, HepG2, THLE2 and cross-talk experiments were then combined using a standardized nomenclature.

Genes with more than 15% of the proteomics data missing were filtered out. A data imputation strategy was developed. For example, a within-replicates error model was used to impute data from experimental conditions with replicates. A K-NN algorithm based on 10 neighbors was used to impute data with no replicates. Different REFS™ models were built for the three biological systems together, for just the PaCa2 system, or for just the HepG2 system linked to the phenotypic data.
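The missing-data filter and the K-NN imputation step can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in the example: the function names are hypothetical, rows are taken to be genes/proteins and columns to be samples, and the distance measure (root-mean-square difference over jointly observed columns) is an assumption, since the text specifies only the 15% threshold and the use of 10 neighbors. The within-replicates error model is not sketched here.

```python
import numpy as np

def filter_missing(data, max_missing=0.15):
    """Keep rows whose fraction of missing (NaN) values is at most max_missing."""
    missing_fraction = np.isnan(data).mean(axis=1)
    keep = missing_fraction <= max_missing
    return data[keep], keep

def knn_impute(data, k=10):
    """Fill each NaN with the mean of that column over the k nearest rows.

    Distance between two rows is the root-mean-square difference over the
    columns observed in both rows; rows sharing no observed column are skipped.
    """
    filled = data.copy()
    for i in np.flatnonzero(np.isnan(data).any(axis=1)):
        row = data[i]
        shared = ~np.isnan(row) & ~np.isnan(data)        # shape (n_rows, n_cols)
        diff = np.where(shared, data - row, 0.0)
        counts = shared.sum(axis=1)
        with np.errstate(divide="ignore", invalid="ignore"):
            dist = np.sqrt((diff ** 2).sum(axis=1) / counts)
        dist[i] = np.inf                                  # exclude the row itself
        dist[counts == 0] = np.inf                        # no overlap, unusable
        order = np.argsort(dist)
        for j in np.flatnonzero(np.isnan(row)):
            donors = [
                r for r in order
                if np.isfinite(dist[r]) and not np.isnan(data[r, j])
            ][:k]
            if donors:
                filled[i, j] = data[donors, j].mean()
    return filled
```

In use, `filter_missing` would be applied first and `knn_impute` to the retained rows; a value stays missing only when no other row with an observed value in that column overlaps the row being imputed.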

The area under the curve and fold changes for each edge connecting a parent node to a child node in the simulation networks were extracted by a custom-built program using the R programming language, where the R programming language is an open source software environment for statistical computing and graphics.

Output from the R program was inputted into Cytoscape, an open source program, to generate a visual representation of the consensus network.

Among all the models built, an exemplary protein interaction REFS consensus network at 70% fragment frequency is shown in Figure 21.
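One way to read a frequency cutoff of this kind is that an edge is retained in the consensus network only if it appears in at least 70% of an ensemble of candidate networks. The sketch below illustrates that reading. It is an assumption for illustration: the exact definition of "fragment frequency" inside the REFS™ system is not given here, and the function name and the edge representation as (parent, child) tuples are hypothetical.

```python
from collections import Counter

def consensus_edges(ensemble, min_frequency=0.70):
    """Return edges present in at least min_frequency of the ensemble networks.

    ensemble: iterable of networks, each an iterable of (parent, child) edges.
    Returns a dict mapping each retained edge to its observed frequency.
    """
    networks = [set(network) for network in ensemble]
    if not networks:
        return {}
    counts = Counter(edge for network in networks for edge in network)
    total = len(networks)
    return {
        edge: count / total
        for edge, count in counts.items()
        if count / total >= min_frequency
    }
```

The returned frequencies could then be exported as edge attributes for visualization, for example in Cytoscape.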

Each node in the consensus network shown in Figure 21 was simulated by increasing or decreasing expression of LDHA by 4-fold to generate a simulation network using REFS™, as described in detail above in the detailed description.

The effect of simulated LDHA expression change on PARK7 and proteins in nodes associated with PARK7 at high level in the exemplary consensus network shown in Figure 21 was investigated. Proteins responsive to the LDHA simulation in two cancer cell lines, i.e., PaCa2 and HepG2, were identified using REFS™ (see Figure 22). The numbers represent particular protein expression level fold changes.

To validate the protein connections identified using the above method, markers identified to be in immediate proximity to LDHA in the simulation network were inputted to IPA, a software program that utilizes neural networks to determine molecular linkage between experimental outputs to networks based on previously published literature. Output of the IPA program is shown in Figure 23, wherein the markers in grey shapes were identified to be in immediate proximity to LDHA in the simulation network generated by the platform and the markers in unfilled shapes are connections identified by IPA based on known knowledge in previously published literature.

Markers identified in the output from the Interrogative Biology platform technology (shown in Figure 21), i.e., DHX9, HNRNPC, CKAP4, HSPA9, PARP1, HADHA, PHB2, ATP5A1 and CANX, were observed to be connected to well-known cancer markers such as TP53 and PARK7 within the IPA generated network (shown in Figure 23). The fact that the factors identified by the use of the Interrogative Biology platform share connectivity with known factors published in the scientific literature validated the accuracy of the network created by the use of the Interrogative Biology platform. In addition, the network association within the LDHA sub-network created by the use of the Interrogative Biology platform outputs demonstrated the presence of directional influence of each factor, in contrast to the IPA network wherein the linkage between molecular entities does not provide functional directionality between the interacting nodes. Thus, by employing an unbiased approach to data generation, integration and reverse engineering to create a computational model followed by simulation and differential network analysis, the Interrogative Biology discovery platform enables the understanding of hitherto unknown mechanisms in cancer pathophysiology that are in congruence with well-established scientific understandings of disease pathophysiology.

Figure 19 shows the effect of CoQ10 treatment on downstream nodes (pubmed protein accession numbers are listed in Figure 19) based on the protein expression data from iProfiling. Protein accession number P00338 is LDHA. Wet lab validation of proteomics data was performed for LDHA expression in HepG2 cells (see Figure 20).

As shown in Figure 20, LDHA expression levels were decreased when HepG2 cells were treated with 50 µM CoQ10 or 100 µM CoQ10 for 24 or 48 hours.

For the well-known cancer markers TP53, Bcl-2, Bax and Caspase3, wet lab validation of the effects of CoQ10 treatment on these markers’ expression levels in SKMEL 28 cells was performed (see Figure 24 and Figure 25).

EXAMPLE 2: Employing Platform Technology to Build a Cancer Delta-Delta Network

In this example, the platform technology described in detail above was employed to integrate data obtained from a custom built in vitro cancer model, and thereby identify novel proteins/pathways driving the pathogenesis of cancer. Relational maps resulting from this analysis have provided cancer treatment targets, as well as diagnostic/prognostic markers associated with cancer.

Briefly, four cancer lines (PaCa2, HepG2, PC3 and MCF7) and two normal cell lines (THLE2 and HDFa) were subjected to various conditions simulating an environment experienced by cancer cells in vivo. Specifically, cells were exposed separately to each of hyperglycemic conditions, hypoxic conditions and treatment with lactic acid. For example, a hyperglycemic condition was created by culturing the cells in media containing 22 mM glucose. A hypoxic condition was induced by placing the cells in a Modular Incubator Chamber (MIC-101, Billups-Rothenberg Inc. Del Mar, CA), which was flooded with an industrial gas mix containing 5% CO2, 2% O2 and 93% nitrogen. For lactic acid treatment, each cell line was treated with 0 or 12.5 mM lactic acid. In addition to exposing the cells to each of the three foregoing conditions separately, cells were also exposed to combinations of two or all three of the conditions (i.e., hyperglycemic and hypoxic conditions; hyperglycemic condition and lactic acid; hypoxic condition and lactic acid; and, hyperglycemic and hypoxic conditions and lactic acid).

The cell model comprising the above-mentioned cells, wherein each type of cell was exposed to each condition described above, was additionally interrogated by exposing the cells to an environmental perturbation by treating with Coenzyme Q10. Specifically, the cells were treated with Coenzyme Q10 at 0, 50 µM or 100 µM.

Cell samples, as well as media samples containing the secretome from the cells, for each cell line exposed to each condition (or combination of conditions), with and without Coenzyme Q10 treatment, were collected at various times following treatment, including after 24 hours and 48 hours of treatment.

In addition, cross talk experiments between two different cancer cells, PaCa2 and HepG2 cells, were carried out in which PaCa2 and HepG2 cells were co-cultured. This co-culturing approach is referred to as an extracellular secretome (ECS) experiment.

The first cell system (PaCa2) was seeded in the inserts of the wells of a transwell type growth chamber. Six well plates were generally used in order to enable better statistical analysis. At the time of seeding of the first cell system in the inserts, the inserts were placed in a separate 6-well plate. The second cell system (HepG2) was seeded in the primary tray. The 6-well plate containing the inserts, which contained the first cell system, and the primary tray containing the second cell system were incubated at 37°C overnight. Each of the cell systems was grown in its respective cell-specific media (wherein alternatively, each of the cell systems could be grown in a medium adapted to support the growth of both cell types). On the second day, the pre-determined treatment was given by media exchange. Specifically, the inserts containing the first cell system and the first cell system’s respective media were placed into the primary tray containing the second cell system and the second cell system’s respective media. In all cases of co-culture, however, co-cultured cells had been exposed to the same “cancer condition” (e.g., hyperglycemia, hypoxia, lactic acid, or combinations thereof), albeit separately, during the first day prior to co-culturing. That is, the first cell system in the inserts and the second cell system in the trays were exposed to the same condition before being moved to a “coculture” arrangement. The tray was then incubated for a pre-determined time period, e.g., 24 hours or 48 hours. Duplicate wells were set up with the same conditions, and cells were pooled to yield sufficient material for subsequent proteomic analysis. The media containing the secretome (1 ml aliquot), the cells from the inserts and the cells from the wells of the primary tray were harvested as separate samples. The experiments were conducted in triplicate in order to provide better statistical power.

Cross-talk experiments were also conducted by “media swap” experiments.

Specifically, a cultured media or “secretome” from the first cell system (PaCa2) was collected after 24 hrs or 48 hrs following perturbation and/or conditioning and then added to the second cell system for 24-48 hrs. The final cultured media or “secretome” from the second cell system was then collected. All final secretomes were subjected to proteomic analysis.

Following the exposure of the cell system to the “cancer conditions” described above, the perturbation (i.e., Coenzyme Q10 treatment), and/or the conditions produced in the secretome of a paired cell from a co-culture experiment, the response of the cells was then analyzed by analysis of various readouts from the cell system. The readouts included proteomic data, specifically intracellular protein expression as well as proteins secreted into cell culture media, and functional data, specifically cellular bioenergetics. iProfiling of changes in total cellular protein expression by quantitative proteomics was performed for cell and media samples collected for each cell line (normal and cancer cell lines) exposed to each condition (or combination of conditions), with or without the “environmental perturbation”, i.e., Coenzyme Q10 treatment, using the techniques described above in the detailed description.

Further, bioenergetics profiling of each cell line (normal and cancer cell lines) exposed to each condition (or combination of conditions), with or without the “environmental perturbation”, i.e., Coenzyme Q10 treatment, was generated by employing the Seahorse analyzer essentially as recommended by the manufacturer.

Oxygen consumption rate (OCR) and Extracellular Acidification Rate (ECAR) were recorded by the electrodes in a 7 µl chamber created with the cartridge pushing against the Seahorse culture plate.

Proteomics data collected for each cell line at each condition(s) and with/without each perturbation, and bioenergetics profiling data collected for each cell line at each condition(s) and with/without each perturbation, were then processed by the REFS™ system. A “composite cancer perturbed network” was generated from combined data obtained from all of the cancer cell lines, each having been exposed to each specific condition (and combination of conditions), and further exposed to perturbation (CoQ10).

A “composite cancer unperturbed network” was generated from combined data obtained from all of the cancer cell lines, each having been exposed to each specific condition (and combination of conditions), without perturbation (without CoQ10). Similarly, a “composite normal perturbed network” was generated from combined data obtained from all of the normal cell lines, each having been exposed to each specific condition (and combination of conditions), and additionally exposed to perturbation (CoQ10). A “composite normal unperturbed network” was generated from combined data obtained from all of the normal cell lines, each having been exposed to each specific condition (and combination of conditions), without perturbation (without CoQ10).

Next, “simulation composite networks” (also referred to herein as “simulation networks”) were generated for each of the four composite networks described above using REFS™. To accomplish this, each node in the given consensus composite network was simulated (by increasing or decreasing by 10-fold) to generate simulation networks using REFS™, as described in detail above in the detailed description.

The area under the curve and fold changes for each edge connecting a parent node to a child node in the simulation networks were extracted by a custom-built program using the R programming language, where the R programming language is an open source software environment for statistical computing and graphics.
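The per-edge summary can be illustrated with a minimal sketch. It is written in Python rather than R purely for illustration, and it is not the custom-built program referred to above. It assumes, hypothetically, that for one parent-to-child edge the simulation yields a predicted child level at each simulated fold setting of the parent (0.1x, 1x and 10x). The function names, the log10 x-axis and the trapezoid rule are all illustrative assumptions.

```python
import math

def trapezoid_auc(xs, ys):
    """Area under the piecewise-linear curve through the points (xs, ys)."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]))

def summarize_edge(parent_folds, child_levels):
    """Summarize one parent->child edge of a simulation network.

    parent_folds : simulated fold settings of the parent, ascending, including 1.0
    child_levels : predicted child level at each setting (hypothetical values)
    """
    # On a log10 axis, each 10-fold step of the parent becomes a unit step.
    xs = [math.log10(f) for f in parent_folds]
    baseline = child_levels[parent_folds.index(1.0)]
    return {
        "auc": trapezoid_auc(xs, child_levels),
        "fold_change_down": child_levels[0] / baseline,  # parent decreased
        "fold_change_up": child_levels[-1] / baseline,   # parent increased
    }

# Toy edge: the child level doubles for each 10-fold increase of the parent.
edge_summary = summarize_edge([0.1, 1.0, 10.0], [2.0, 4.0, 8.0])
```

Under these assumptions, an edge whose child barely responds to the parent yields fold changes near 1, which gives one plausible basis for comparing edges across networks.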

Finally, delta networks were generated, where the delta networks represent the differential between two simulation composite networks. The delta networks were generated from the simulation composite networks. To generate a cancer vs. normal differential network in response to Coenzyme Q10 (delta-delta network), consecutive comparison steps were performed as illustrated in Figure 26, by a custom built program using the PERL programming language.

First, cancer untreated (T0) and cancer treated (T1) networks were compared using the R program, and the unique cancer treated T1 networks were separated (see the crescent shape in dark grey in Figure 26). This represents the Cancer T1 ∩ (intersection) Cancer T0 “delta” network. Protein interactions/associations within this delta network can be viewed as representing the unique cancer response to Coenzyme Q10 treatment.

Similarly, normal untreated (T0) and normal treated (T1) networks were compared using the R program, and the unique normal treated T1 networks were separated (see the crescent shape in light grey in Figure 26). This represents the Normal T1 ∩ Normal T0 “delta” network. Protein interactions/associations within this delta network can be viewed as representing the unique normal cell response to Coenzyme Q10 treatment.

Finally, unique cancer T1 networks (see the crescent shape in dark grey in Figure 26) and unique normal T1 networks (see the crescent shape in light grey in Figure 26) were compared using the R program, and networks that are unique to cancer alone, and not present in normal cells, in response to Coenzyme Q10 were generated (see Figure 26). This collection of protein interactions/associations represents the unique pathways within cancer cells that are not present in normal cells upon Coenzyme Q10 treatment. This collection of protein interactions/associations is called a “delta-delta network,” since it is a differential map produced from a comparison of a differential map from cancer cells and a differential map from normal control cells.
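The three comparison steps can be sketched with plain set operations. This is an illustration in Python, not the PERL or R programs referred to in the text. It assumes each network is reduced to a set of directed (parent, child) edges and that "unique" means present in the treated network but absent from the untreated one. The node names are hypothetical placeholders.

```python
def unique_edges(treated, untreated):
    """Edges present in the treated network but absent from the untreated one."""
    return set(treated) - set(untreated)

def delta_delta(cancer_t0, cancer_t1, normal_t0, normal_t1):
    """Edges unique to the cancer response that are not part of the normal response."""
    cancer_delta = unique_edges(cancer_t1, cancer_t0)   # dark grey crescent
    normal_delta = unique_edges(normal_t1, normal_t0)   # light grey crescent
    return cancer_delta - normal_delta

# Toy networks with placeholder node names (not data from the experiments).
cancer_t0 = {("P1", "P2")}
cancer_t1 = {("P1", "P2"), ("P3", "P4"), ("P5", "P6")}
normal_t0 = {("P1", "P2")}
normal_t1 = {("P1", "P2"), ("P5", "P6")}

dd_network = delta_delta(cancer_t0, cancer_t1, normal_t0, normal_t1)
```

In this toy case the edge ("P5", "P6") appears in both treated networks, so it is removed and only ("P3", "P4") remains. A real comparison would probably also have to decide how to treat edges that share endpoints but differ in sign or strength, which this sketch ignores.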

Outputs from the PERL and R programs were input into Cytoscape, an open source program, to generate a visual representation of the delta-delta network.

The delta-delta networks identified using the method described herein are highly useful for identifying targets for cancer treatment. For example, according to the delta-delta network presented in Figure 27, Protein A inhibits OCR3 (a measurement for oxidative phosphorylation) and enhances ECAR3 (a measurement for glycolysis).

Since this interaction is unique in cancer cells (because the delta-delta network has subtracted any interactions that are commonly present in normal cells upon Coenzyme Q10 treatment), inhibiting the expression of protein A is expected to reduce glycolysis-based energy metabolism, which is a hallmark of the cancer metabolic pathway, and shift the cells towards an oxidative phosphorylation-based energy metabolism, which is a phenotype more closely associated with normal cells. Thus, a combination therapy using Coenzyme Q10 and protein A inhibitor is expected to be effective to treat cancer, at least in part by shifting the energy metabolism profile of the cancer cell to that which resembles a normal cell.

The advantage of the Interrogative Biology platform technology of the invention is further illustrated by the use of a substantive example wherein a sub-network derived from causal networks was compared to a molecular network using IPA, a software program that utilizes neural networks to determine molecular linkage between experimental outputs and networks based on previously published literature. The causal subnetwork containing PARK7 generated using the Interrogative Biology platform (shown in Figure 29) is used as a substantive example. All molecular signatures of the PARK7 network from the Interrogative Biology platform were incorporated into IPA to generate a network based on known/existing literature evidence. The network output from the Interrogative Biology platform and that generated by the use of IPA were then compared.

Six markers identified by the output from the Interrogative Biology platform technology (shown in Figure 29), i.e. A, B, C, X, Y and Z in Figures 27-29, were observed to be connected to TP53 within the IPA generated network (Figure 28).

Among the six markers, A, B and C have been reported in the literature to be associated with cancer, as well as HSPA1A/HSPA1B. X, Y and Z were identified as “hubs” or key drivers of the cancer state, and are therefore identified as novel cancer markers. Further, MIF1 and KARS were also identified as “hubs” or key drivers of the cancer state, and are therefore identified as novel cancer markers. The fact that the factors identified by the use of the Interrogative Biology platform share connectivity with known factors published in the scientific literature validated the accuracy of the network created by the use of the Interrogative Biology platform. In addition, the network association within the PARK7 sub-network created by the use of the Interrogative Biology platform outputs (shown in Figure 29) demonstrated the presence of directional influence of each factor, in contrast to the IPA network (shown in Figure 28) wherein the linkage between molecular entities does not provide functional directionality between the interacting nodes. Furthermore, outputs from the Interrogative Biology platform (shown as dotted lines in Figure 29) demonstrated the association of these components leading to a potential mechanism through PARK7. Protein C, Protein A and other nodes of PARK7 were observed to be key drivers of cancer metabolism (Figure 27).

As evidenced by the present example, by employing an unbiased approach to data generation, integration and reverse engineering to create a computational model followed by simulation and differential network analysis, the Interrogative Biology discovery platform enables the understanding of hitherto unknown mechanisms in cancer pathophysiology that are in congruence with well-established scientific understandings of disease pathophysiology.

EXAMPLE 3: Employing Platform Technology to Build a Diabetes/Obesity/Cardiovascular Disease Delta-Delta Network

In this example, the platform technology described in detail above in the detailed description was employed to integrate data obtained from a custom built diabetes/obesity/cardiovascular disease (CVD) model, and to identify novel proteins/pathways driving the pathogenesis of diabetes/obesity/CVD. Relational maps resulting from this analysis have provided diabetes/obesity/CVD treatment targets, as well as diagnostic/prognostic markers associated with diabetes/obesity/CVD.

Five primary human cell lines, namely adipocytes, myotubes, hepatocytes, aortic smooth muscle cells (HASMC), and proximal tubular cells (HK2) were subject to one of five conditions simulating an environment experienced by these disease-relevant cells in vivo. Specifically, each of the five cell lines was exposed separately to each of the following conditions: hyperglycemic conditions, hyperlipidemic conditions, hyperinsulinemic conditions, hypoxic conditions and exposure to lactic acid. The hyperglycemic condition was induced by culturing cells in media containing 22 mM glucose. The hyperlipidemic condition was induced by culturing the cells in media containing 0.15 mM sodium palmitate. The hyperinsulinemic condition was induced by culturing the cells in media containing 1000 nM insulin. The hypoxic condition was induced by placing the cells in a Modular Incubator Chamber (MIC-101, Billups-Rothenberg Inc. Del Mar, CA), which was flooded with an industrial gas mix containing 5% CO2, 2% O2 and 93% nitrogen. Each cell line was also treated with 0 or 12.5 mM lactic acid.

In addition, cross talk experiments between two different pairs of cells, HASMC (cell system 1) and HK2 cells (cell system 2) or liver cells (cell system 1) and adipocytes (cell system 2) were carried out in which the paired cells were co-cultured. This co-culturing approach is referred to as an extracellular secretome (ECS) experiment. The first cell system (e.g., HASMC) was first seeded in the inserts of the wells of a transwell type growth chamber. Six-well plates were used to enable better statistical analysis. At the time of seeding with the first cell system in the inserts, the inserts were placed in a separate 6-well plate. The second cell system (e.g., HK2) was seeded on the primary tray. The insert tray containing the first cell system and the primary tray containing the second cell system were incubated at 37°C overnight. Each of the cell systems was grown in its respective cell specific media (wherein alternatively, each of the cell systems could be grown in a medium adapted to support the growth of both cell types). On the second day, the pre-determined treatment was given by media exchange. Specifically, the inserts containing the first cell system were placed into the primary tray containing the second cell system. The tray was then incubated for a pre-determined time period, e.g., 24 hours or 48 hours. Duplicate wells were set up with the same conditions, and cells were pooled to yield sufficient material for 2D analysis. The media (1 ml aliquot), the cells from the inserts and the cells from the wells of the primary tray were harvested as separate samples. The experiments were conducted in triplicate in order to provide better statistical analysis power.

Cross-talk experiments were also conducted by “media swap” experiments.

Specifically, a cultured media or “secretome” from the first cell system, HASMC, was collected after 24 hrs or 48 hrs following perturbation or conditioning and then added to the second cell system, adipocytes, for 24-48 hrs. The final cultured media or “secretome” from the second cell system was then collected. All final secretomes were subjected to proteomic analysis.

The cell model comprising the above-mentioned cells, wherein the cells were exposed to each condition described above, was additionally “interrogated” by exposing the cells to an “environmental perturbation” by treating with Coenzyme Q10.

Specifically, the cells were treated with Coenzyme Q10 at 0, 50 µM, or 100 µM.

Cell samples for each cell line, condition and Coenzyme Q10 treatment were collected at various times following treatment, including after 24 hours and 48 hours of treatment. For certain cells and under certain conditions, media samples were also collected and analyzed. iProfiling of changes in total cellular protein expression by quantitative proteomics was performed for cell and media samples collected for each cell line at each condition and with each “environmental perturbation”, i.e., Coenzyme Q10 treatment, using the techniques described above in the detailed description.

Proteomics data collected for each cell line listed above at each condition and with each perturbation, and bioenergetics profiling data collected for each cell line at each condition and with each perturbation, were then processed by the REFS™ system.

A composite perturbed network was generated from combined data obtained from all the cell lines for one specific condition (e.g., hyperglycemia) exposed to perturbation (CoQ10). A composite unperturbed network was generated from combined data obtained from all of the cell lines for the same one specific condition (e.g., hyperglycemia), without perturbation (without CoQ10). Similarly, a composite perturbed network was generated from combined data obtained from all of the cell lines for a second, control condition (e.g., normal glycemia) exposed to perturbation (CoQ10).

A composite unperturbed network was generated from combined data obtained from all of the cell lines for the same second, control condition (e.g., normal glycemia), without perturbation (without CoQ10).

Each node in the consensus composite networks described above was simulated (by increasing or decreasing by 10-fold) to generate simulation networks using REFS™, as described in detail above in the detailed description.

The area under the curve and fold changes for each edge connecting a parent node to a child node in the simulation networks were extracted by a custom-built program using the R programming language, where the R programming language is an open source software environment for statistical computing and graphics.

Delta networks were generated from the simulated composite networks. To generate a Diabetes/Obesity/Cardiovascular disease condition vs. normal condition differential network in response to Coenzyme Q10 (delta-delta network), steps of comparison were performed as illustrated in Figure 30, by a custom built program using the PERL programming language. Specifically, as shown in Figure 30, Treatment T1 refers to Coenzyme Q10 treatment and NG and HG refer to normal and hyperglycemia as conditions. Unique edges from NG in the NG∩HG delta network were compared with unique edges of HGT1 in the HG∩HGT1 delta network. Edges in the intersection of NG and HGT1 are HG edges that are restored to NG with T1. HG edges restored to NG with T1 were superimposed on the NG∩HG delta network (shown in darker colored circles in Figure 31). Specifically, a simulated composite map of normal glycemia (NG) condition and a simulated composite map of hyperglycemia (HG) condition were compared using a custom-made Perl program to generate unique edges of the normal glycemia condition.

A simulated composite map of hyperglycemia condition without Coenzyme Q10 treatment (HG) and a simulated map of hyperglycemia condition with Coenzyme Q10 treatment (HGT1) were compared using a custom-made Perl program to generate unique edges of the hyperglycemia condition with Coenzyme Q10 treatment (HGT1). Edges in the intersection of the unique edges from normal glycemia condition (NG) and the unique edges from hyperglycemia condition with Coenzyme Q10 treatment (HGT1) were identified using the Perl program. These edges represent factors/networks that are restored to normal glycemia condition from hyperglycemia condition by the treatment of Coenzyme Q10. The delta-delta network of hyperglycemic edges restored to normal with Coenzyme Q10 treatment was superimposed on the normal glycemia ∩ hyperglycemia delta network. A sample of the superimposed networks is shown in Figure 31. Figure 31 is an exemplary diabetes/obesity/cardiovascular disease condition vs. normal condition differential network in response to Coenzyme Q10 (delta-delta network). Darker colored circles in Figure 31 are identified edges which were restored to a normal glycemia condition from a hyperglycemia condition by the treatment of Coenzyme Q10. Lighter colored circles in Figure 31 are identified unique normal glycemia edges.
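The restored-edge comparison can be sketched as follows. This is an illustration in Python, not the custom-made Perl program described above. It assumes each simulated composite map is reduced to a set of directed (parent, child) edges, and the labels "restored" and "NG only" are hypothetical stand-ins for the darker and lighter circles of the figure.

```python
def classify_ng_edges(ng, hg, hgt1):
    """Label each edge unique to normal glycemia (NG) relative to hyperglycemia (HG).

    ng, hg, hgt1 : iterables of (parent, child) edges for the NG, HG and
                   Coenzyme Q10-treated HG (HGT1) simulated composite maps
    """
    ng, hg, hgt1 = set(ng), set(hg), set(hgt1)
    unique_ng = ng - hg        # edges lost in hyperglycemia
    unique_hgt1 = hgt1 - hg    # edges appearing only after treatment
    restored = unique_ng & unique_hgt1
    return {edge: ("restored" if edge in restored else "NG only")
            for edge in sorted(unique_ng)}

# Toy maps with placeholder node names (not data from the experiments).
ng_map = {("N1", "N2"), ("N3", "N4"), ("N5", "N6")}
hg_map = {("N5", "N6")}
hgt1_map = {("N5", "N6"), ("N1", "N2"), ("N7", "N8")}

labels = classify_ng_edges(ng_map, hg_map, hgt1_map)
```

In this toy case ("N1", "N2") is absent in hyperglycemia but reappears with treatment, so it is labeled as restored, while ("N3", "N4") remains unique to normal glycemia.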

Similarly to the experiments described above for hyperglycemia vs. normal glycemic condition, a simulated composite network of hyperlipidemia condition (combining data from all diabetes/obesity/cardiovascular-related cells described above) without Coenzyme Q10 treatment and a simulated composite network of hyperlipidemia condition (combining data from all diabetes/obesity/cardiovascular-related cells, described above) with Coenzyme Q10 treatment were compared using the Perl program to generate unique edges of the hyperlipidemia condition with Coenzyme Q10 treatment.

Edges in the intersection of the unique edges from normal lipidemia condition and the unique edges from hyperlipidemic condition with Coenzyme Q10 treatment were identified using the Perl program. These edges represent factors/networks that are restored to a normal lipidemia condition from a hyperlipidemia condition by the treatment of Coenzyme Q10. A delta-delta network of hyperlipidemic edges restored to normal with Coenzyme Q10 treatment was superimposed on the normal lipidemia ∩ hyperlipidemia delta network. A sample of the superimposed networks is shown in Figure 32. Darker colored circles in Figure 32 are identified edges which were restored to a normal lipidemia condition from a hyperlipidemia condition by the treatment of Coenzyme Q10. Lighter colored circles in Figure 32 are identified unique normal lipidemia edges. FASN was identified as one important factor of a signaling pathway which modulates Coenzyme Q10’s effect of restoring hyperlipidemia to a normal lipidemia condition.

Fatty acid synthesis enzymes such as fatty acid synthase (FASN) have been implicated in almost all aspects of human metabolic alterations such as obesity, insulin resistance or dyslipidemia. FASN inhibitors have been proposed as lead molecules for treatment of obesity, although the molecular mechanisms are unknown (Mobbs et al 2002). Cerulenin and the synthetic compound C75, both FASN inhibitors, have been shown to reduce food intake and effectuate weight loss (Loftus et al 2000).

The fact that FASN was identified by the platform technology described herein as one important factor in the signaling pathway which modulates Coenzyme Q10’s effect of restoring a diabetic to a normal state, as shown in Figure 32, validated the accuracy of this delta-delta network. Therefore, other novel factors identified in this delta-delta network will be potential therapeutic factors or drug targets for further investigation.

EXAMPLE 4: Employing Platform Technology to Build Models of Drug Induced Cardiotoxicity

In this example, the platform technology described in detail above in the detailed description was employed to integrate data obtained from a custom built cardiotoxicity model, and to identify novel proteins/pathways driving the pathogenesis/toxicity of drugs. Relational maps resulting from this analysis have provided toxicity biomarkers.

In the healthy heart, contractile function depends on a balance of fatty acid and carbohydrate oxidation. Chronic imbalance in uptake, utilization, organellar biogenesis and secretion in non-adipose tissue (heart and liver) is thought to be at the center of mitochondrial damage and dysfunction and a key player in drug induced cardiotoxicity.

Here Applicants describe a systems approach combining protein and lipid signatures with functional end point assays specifically looking at cellular bioenergetics and mitochondrial membrane function. In vitro models comprising diabetic and normal cardiomyocytes supplemented with excessive fatty acid and hyperglycemia were treated with a panel of drugs to create signatures and potential mechanisms of toxicity.

Applicants demonstrated the varied effects of drugs in destabilizing the mitochondria by disrupting the energy metabolism component at various levels including (i) dysregulation of transcriptional networks that control expression of mitochondrial energy metabolism genes; (ii) induction of GPAT1 and tafazzin in diabetic cardiomyocytes, thereby initiating de novo phospholipid synthesis and remodeling in the mitochondrial membrane; and (iii) altered fate of fatty acid in diabetic cardiomyocytes, influencing uptake, fatty acid oxidation and ATP synthesis. Further, Applicants combined the power of wet lab biology and an AI based data mining platform to generate causal networks based on Bayesian models. Networks of proteins and lipids that are causal for loss of normal cell function were used to discern mechanisms of drug induced toxicity from cellular protective mechanisms. This novel approach will serve as a powerful new tool to understand mechanisms of toxicity while allowing for development of safer therapeutics that correct an altered phenotype.

Human cardiomyocytes were subject to conditions simulating a diabetic environment experienced by the disease-relevant cells in vivo. Specifically, the cells were exposed to hyperglycemic conditions and hyperlipidemia conditions. The hyperglycemic condition was induced by culturing cells in media containing 22 mM glucose. The hyperlipidemia condition was induced by culturing the cells in media containing 1 mM L-carnitine, 0.7 mM oleic acid and 0.7 mM ic acid.

The cell model comprising the above-mentioned cells, wherein the cells were exposed to each condition described above, was additionally “interrogated” by exposing the cells to an “environmental perturbation” by treating with a diabetic drug (T) which is known to cause cardiotoxicity, a rescue molecule (R) or both the diabetic drug and the rescue molecule (T+R). Specifically, the cells were treated with diabetic drug; or treated with rescue molecule Coenzyme Q10 at 0, 50 µM, or 100 µM; or treated with both of the diabetic drug and the rescue molecule Coenzyme Q10.

Cell samples from each condition with each perturbation treatment were collected at various times following treatment, including after 6 hours of treatment. For certain conditions, media samples were also collected and analyzed. iProfiling of changes in total cellular protein expression by quantitative proteomics was performed for cell and media samples collected for each condition and with each “environmental perturbation”, i.e., diabetic drug treatment, Coenzyme Q10 treatment or both, using the techniques described above in the detailed description.

Transcriptional profiling experiments were carried out using the Bio-Rad CFX384 amplification system. Following data collection (Ct), the final fold change over control was determined using the δCt method as outlined in the manufacturer’s protocol.
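A Ct-based fold change over control is commonly computed with the comparative Ct calculation, sketched below in Python. The source names only the Ct-based method of the manufacturer’s protocol, so the normalization to a reference gene and the assumption of perfect doubling per cycle (100% amplification efficiency) are assumptions of this sketch, and the Ct values are made-up illustrative numbers.

```python
def fold_change_over_control(ct_target_treated, ct_ref_treated,
                             ct_target_control, ct_ref_control):
    """Comparative Ct fold change, assuming perfect doubling per cycle.

    Each sample's target-gene Ct is first normalized to a reference gene
    (delta Ct); the treated delta Ct is then compared with the control
    delta Ct (delta-delta Ct).
    """
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** (-delta_delta_ct)

# Illustrative Ct values only: the target crosses threshold two cycles
# earlier in the treated sample, with an unchanged reference gene.
fc = fold_change_over_control(22.0, 18.0, 24.0, 18.0)
```

A fold change above 1 indicates higher expression than control and a value below 1 indicates lower expression. If amplification efficiency departs from 100%, an efficiency-corrected base would replace the constant 2.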

Lipidomics experiments were carried out using mass spectrometry. Functional assays such as oxygen consumption rate (OCR) were measured by employing the Seahorse analyzer essentially as recommended by the manufacturer. OCR was recorded by the electrodes in a 7 µl chamber created with the cartridge pushing against the Seahorse culture plate.

As shown in Figure 35, transcriptional network and expression of human mitochondrial energy metabolism genes in diabetic cardiomyocytes (cardiomyocytes conditioned in hyperglycemia and hyperlipidemia) were compared between perturbed and unperturbed treatments. Specifically, data of transcriptional network and expression of human mitochondrial energy metabolism genes were compared between diabetic cardiomyocytes treated with diabetic drug (T) and untreated diabetic cardiomyocyte samples (UT). Data of transcriptional network and expression of human mitochondrial energy metabolism genes were compared between diabetic cardiomyocytes treated with both diabetic drug and rescue molecule Coenzyme Q10 (T+R) and untreated diabetic cardiomyocyte samples (UT). Compared to data from untreated diabetic cardiomyocytes, the expression and transcription of certain genes were altered when diabetic cardiomyocytes were treated with diabetic drug. Rescue molecule Coenzyme Q10 was demonstrated to reverse the toxic effect of diabetic drug and normalize gene expression and transcription.

As shown in Figure 36A, cardiomyocytes were cultured either in normoglycemia (NG) or hyperglycemia (HG) condition and treated with either diabetic drug alone (T) or with both diabetic drug and rescue molecule Coenzyme Q10 (T+R). Protein expression levels of GPAT1 and TAZ for each condition and each treatment were tested with western blotting. Both GPAT1 and TAZ were upregulated in hyperglycemia conditioned and diabetic drug treated cardiomyocytes. When hyperglycemia conditioned cardiomyocytes were treated with both diabetic drug and rescue molecule Coenzyme Q10, the upregulated protein expression levels of GPAT1 and TAZ were normalized.

As shown in Figure 37A, mitochondrial oxygen consumption rate (%) experiments were carried out for hyperglycemia conditioned cardiomyocytes samples.

Hyperglycemia conditioned cardiomyocytes were either untreated (UT), treated with diabetic drug T1 which is known to cause cardiotoxicity, treated with diabetic drug T2 which is known to cause cardiotoxicity, treated with both diabetic drug T1 and rescue molecule Coenzyme Q10 (T1+R), or treated with both diabetic drug T2 and rescue molecule Coenzyme Q10 (T2+R). Compared to untreated control samples, mitochondrial OCR was decreased when hyperglycemia conditioned cardiomyocytes were treated with diabetic drug T1 or T2. However, mitochondrial OCR was normalized when hyperglycemia conditioned cardiomyocytes were treated with both diabetic drug and rescue molecule Coenzyme Q10 (T1+R, or T2+R).

As shown in Figure 37B, mitochondrial ATP synthesis experiments were carried out for hyperglycemia conditioned cardiomyocyte samples. Hyperglycemia conditioned cardiomyocytes were either untreated (UT), treated with a diabetic drug (T), or treated with both diabetic drug and rescue molecule Coenzyme Q10 (T+R).

Compared to untreated control samples, mitochondrial ATP synthesis was repressed when hyperglycemia conditioned cardiomyocytes were treated with diabetic drug (T).

As shown in Figure 38, based on the collected proteomic data, proteins down regulated by drug treatment were annotated with GO terms. Proteins involved in mitochondrial energy metabolism were down regulated when hyperglycemia conditioned cardiomyocytes were treated with a diabetic drug which is known to cause cardiotoxicity.

Proteomics, lipidomics, transcriptional profiling, functional assays, and western blotting data collected for each condition and with each perturbation, were then processed by the REFS™ system. Composite perturbed networks were generated from combined data obtained from one specific condition (e.g., hyperglycemia, or hyperlipidemia) exposed to each perturbation (e.g., diabetic drug, CoQ10, or both).

Composite unperturbed networks were generated from combined data obtained from the same one specific condition (e.g., hyperglycemia, or hyperlipidemia), without perturbation (untreated). Similarly, composite perturbed networks were generated from combined data obtained for a second, control condition (e.g., normal glycemia) exposed to each perturbation (e.g., diabetic drug, CoQ10, or both). Composite unperturbed networks were generated from combined data obtained from the same second, control condition (e.g., normal glycemia), without perturbation (untreated).

Each node in the consensus composite networks described above was simulated (by increasing or decreasing by 10-fold) to generate simulation networks using REFS™, as described in detail above in the detailed description.

The area under the curve and fold changes for each edge connecting a parent node to a child node in the simulation networks were extracted by a custom-built program using the R programming language, where the R programming language is an open source software environment for statistical computing and graphics.

Delta networks were generated from the simulated composite networks. To generate a drug induced toxicity condition vs. normal condition differential network in response to the diabetic drug (delta network), steps of comparison were performed as illustrated in Figure 39, by a custom built program using the PERL programming language.

Specifically, as shown in Figure 39, UT refers to protein expression networks of untreated control cardiomyocytes in hyperglycemia condition. Treatment T refers to protein expression networks of diabetic drug treated cardiomyocytes in hyperglycemia condition. Unique edges from T in the UT∩T delta network are presented in Figure 40. Specifically, a simulated composite map of untreated cardiomyocytes in hyperglycemia condition and a simulated composite map of diabetic drug treated cardiomyocytes in hyperglycemia condition were compared using a custom-made Perl program to generate unique edges of the diabetic drug treated cardiomyocytes in hyperglycemia condition. Outputs from the PERL and R programs were input into Cytoscape, an open source program, to generate a visual representation of the delta network. As shown in Figure 40, the network represents delta networks that are driven by the diabetic drug versus untreated in cardiomyocytes/cardiotox models in hyperglycemia condition.

From the drug induced toxicity condition vs. normal condition differential network shown in Figure 40, proteins were identified which drive pathophysiology of drug induced cardiotoxicity, such as GRP78, GRP75, TIMP1, PTX3, HSP76, PDIA4, PDIA1, CA2D1. These proteins can function as biomarkers for identification of other cardiotoxicity inducing drugs. These proteins can also function as biomarkers for identification of agents which can alleviate cardiotoxicity.

The experiments described in this Example demonstrate that perturbed membrane biology and altered fate of free fatty acid in diabetic cardiomyocytes exposed to drug treatment represent the centerpiece of drug induced toxicity. Data integration and network biology have allowed for an enhanced understanding of cardiotoxicity, and identification of novel biomarkers predictive for cardiotoxicity.

EXAMPLE 5: Employing Platform Technology to Implement Multi Proteomics Models for Elucidating Enzymatic Activity.

In general, the platform technology described in Examples 1-4 above can be adapted to implement further methods for identifying a modulator of a biological system or disease process. The methods employ a model for the biological system, using cells associated with the biological system, to represent a characteristic aspect of the biological system. The model is used to obtain at least three levels of data, namely (i) a first data set representing global enzyme activity in the cells associated with the biological system, (ii) a second data set representing an effect of the global enzyme activity on the enzyme metabolites or substrates in the cells associated with the biological system, and (iii) a third data set representing global proteomic changes in the cells associated with the biological system. The data is used to generate a consensus causal relationship network among the global enzyme activity, the effect of the global enzyme activity, and the global proteomic changes. The consensus causal relationship network is based solely on the first, second, and third data sets using a programmed computing device (i.e., not based on any other known biological relationship). The consensus causal relationship network is then used to identify a causal relationship unique to the biological system, where at least one gene or protein associated with the unique causal relationship is identified as a modulator of the biological system or disease process.

In this example, the platform technology was adapted to implement multi proteomics techniques for measuring enzyme activity and the direct effects of that activity on the proteome, thereby providing a system that can be used to understand causal relationships between enzymes and their metabolites/substrates in the context of global changes in the cellular proteome. Such techniques can provide valuable insight because, as demonstrated in this example, enzyme activity can be orthogonal to enzyme expression (e.g., activity downregulated and expression upregulated). Relational maps resulting from such an analysis can provide disease treatment targets, as well as diagnostic/prognostic markers associated with disease. Such targets and markers can provide for therapeutic compositions and methods. Techniques for establishing models, obtaining data sets, generating consensus causal relationship networks, and identifying causal relationships unique to the biological system are discussed in the summary, detailed description, and examples above. Further techniques for establishing models and obtaining data sets representing global enzyme activity and the effect of the global enzyme activity on the enzyme metabolites or substrates are provided below.

A illustrates a method for identifying a modulator of a biological system or disease process, which employs multi proteomic techniques for elucidating enzyme (e.g., kinase) activity. First, a model is established in accordance with the platform technology wherein cell lines are subjected to conditions simulating a disease and interrogated by exposure to an environmental perturbation (e.g., exposure to Sorafenib in the specific example of hepatocellular carcinoma provided below). A control is provided for comparison. Second, enzyme activity and its downstream effects are tracked in the context of global proteomic changes by analyzing (i) global enzymatic activity, (ii) the specific effect of the enzymatic activity on the proteome (e.g., the metabolites/substrates of the enzymatic activity), and (iii) the global effect on the cellular proteome. Third, the datasets are analyzed in accordance with the platform technology to identify modulators of interest. For example, a cancer model can be interrogated by a known anti-cancer drug kinase inhibitor; the effects of this perturbation to the system on the global kinase activity can be analyzed, along with the resulting effects on the phospho proteome and whole proteome; and the dataset can be analyzed by the AI based REFS™ system.

In this example, hepatocellular carcinoma (HCC) was selected to provide an illustrative implementation of the platform technology. HCC is one of the leading causes of cancer-related death worldwide, ranked as the third most fatal cancer after lung and stomach carcinomas. The diverse etiology, high morbidity/mortality, lack of diagnostic markers for early diagnosis and the highly variable clinical course of HCC have hindered advances in diagnosis and treatment. After years of studying HCC, the understanding of the molecular mechanisms operational in HCC remains incomplete.

The genomic, transcriptomic, and comparative proteomic analyses have yielded some important insights for HCC research. However, many studies focused on a single aspect of the cellular changes associated with HCC, hindering the full understanding of biological systems in their true complexity and dynamics.

This illustrative example combines the power of (i) cell biology, (ii) integrated proteomics platforms and an informatics platform that generates causal protein networks to delineate the role of post-translational modification, e.g., phosphorylation, and enzymes that participate in such mechanisms, e.g., kinases, in the pathophysiology of HCC. In particular, this approach incorporates activity based proteomics employing ATP binding domain enrichment probes and phospho-proteome mapping of total proteins in HCC cellular models. The kinase inhibitor Sorafenib, a first line chemotherapeutic agent for advanced HCC patients, was used to probe the role of global kinase activity and protein phosphorylation changes associated with this treatment. The HepG2 (ATCC Accession No. HB-8065) cell line was selected to model HCC cells and the THLE2 (ATCC Accession No. CRL-10149) cell line was selected to model normal hepatic cells.

B illustrates a method for global enzyme (e.g., kinase) enrichment profiling. First, a cell lysate including the targeted enzyme (e.g., kinase) is prepared. The second step is probe binding (e.g., an ATP probe in the case of kinase). Then, the enzyme is digested and bound fragments are captured. These fragments can be analyzed (e.g., by LC-MS/MS) and the corresponding protein thus identified (e.g., via a database search of the LC-MS/MS data).

THERMO SCIENTIFIC© PIERCE® Kinase Enrichment Kits and ACTIVX© probes (instructions available from THERMO SCIENTIFIC© and PIERCE® Biotechnology, www.thermoscientific.com/pierce) were used for global enzyme activity analysis. Briefly, these and similar kits enable selective labeling and enrichment of ATPases including kinases, chaperones and metabolic enzymes. ATP and ADP probes are generally nucleotide derivatives, which covalently modify the active site of enzymes with conserved lysine residues in the nucleotide-binding site. For example, the structure of desthiobiotin-ATP and -ADP consists of a modified biotin attached to the nucleotide by a labile acyl-phosphate bond. Depending on the position of the lysine within the enzyme active site, either desthiobiotin-ATP or -ADP can be preferred for labeling specific ATPases.

Both desthiobiotin-ATP and -ADP can selectively enrich, identify and profile target enzyme classes in samples or assess the specificity and affinity of enzyme inhibitors. Many ATPases and other nucleotide-binding proteins bind nucleotides or inhibitors even when they are enzymatically inactive; these reagents bind both inactive and active enzymes in a complex sample. Preincubation of samples with small-molecule inhibitors that compete for active-site probes can be used to determine inhibitor binding affinity and target specificity.

Assessment of active-site labeling can be accomplished by either Western blot or mass spectrometry (MS). For the Western blot workflow, desthiobiotin-labeled proteins are enriched for SDS-PAGE analysis and subsequent detection with specific antibodies.

For the MS workflow, desthiobiotin-labeled proteins are reduced, alkylated and enzymatically digested to peptides. Only the desthiobiotin-labeled, active-site peptides are enriched for analysis by LC-MS/MS. Both workflows can be used for determining inhibitor target binding, but only the MS workflow can identify global inhibitor targets and off-targets.

THERMO SCIENTIFIC© PIERCE® TiO2 Phosphopeptide Enrichment and Clean-up Kit (instructions available from THERMO SCIENTIFIC© and PIERCE® Biotechnology, www.thermoscientific.com/pierce) was used for the phospho proteome analysis. Briefly, these and similar kits can enable efficient isolation of phosphorylated peptides from complex and fractionated protein digests for analysis by mass spectrometry (MS). Spherical porous titanium dioxide (TiO2) combined with optimized buffers provides enhanced enrichment and identification of phosphopeptides with minimal nonspecific binding. The spin-column format is fast and easy to use and can enrich up to 100 ug of phosphopeptides from 300-1000 ug of digested protein sample.

The kit's optimized protocol, buffer components and graphite spin columns result in a high yield of clean phosphopeptide samples ready for MS analysis.

Phosphorylation is a protein modification essential to biological functions such as cell signaling, growth, differentiation and division, and programmed cell death. However, phosphopeptides have high hydrophilicity and are low in abundance, resulting in poor chromatography, ionization and fragmentation. Phosphopeptide enrichment is therefore essential to successful MS analysis. The phosphopeptide enrichment and clean-up kit can be compatible with lysis, reduction, alkylation, digestion and graphite spin columns to provide a complete workflow for phosphopeptide enrichment and identification.

Comparative proteomics, phospho proteome and enzyme activity data are integrated into the AI based REFS™ informatics platform. Causal networks of protein interaction, specifically from a functional standpoint, namely kinase/enzyme activity and potential targets that kinases can phosphorylate, are then generated. In addition, using cellular functional readouts, enzymes/kinases that modulate phosphorylation of targets and mechanistically drive pathophysiological cellular behavior are determined. The illustrative implementation described herein facilitates global characterization of cellular responses, insights into mechanisms of chemosensitivity and potential targets/biomarkers for clinical management of HCC.

Materials and Methods

The cells were cultured according to the following protocol.

Day 1: HepG2/Hep3B: seed 06 cells in T-75 culture flasks; 7.4x10^6 cells in T-175 culture flasks; or 6 cells in T-225 culture flasks. THLE-2: seed 1.3x10^6 cells in T-75 culture flasks.

Day 2: 16-24 hours later, at 50-70% confluence, add treatment. Control: DMSO at final concentration of 0.01%. EGF: 500 ng/mL in 10 mM acetic acid. Sorafenib: 1 uM at 0.1% volume in DMSO.

Day 3: 24 hours after treatment, harvest cells by trypsinization. Wash pellets 2X with PBS before freezing.

The global enzyme activity analysis was conducted according to the following protocol.

Cell Lysis: Fresh-made Lysis buffer: 5 M urea, 50 mM Tris-HCl pH 8.4, 0.1% SDS, 1% Protease Inhibitor Cocktail, 1% Phosphatase Inhibitor Cocktail
1) Pellet cells in 1.5-2 mL Eppendorf microtubes by centrifuging at 2000 g for 5 minutes and remove supernatant.
2) Wash cells by resuspending pellet in PBS. Repeat wash once more.
3) Add an appropriate amount of lysis buffer to each sample and vortex.
4) Incubate on ice for 10 minutes with periodic mixing.
5) Sonicate each sample until lysis is complete.
6) Centrifuge at top speed for 15 minutes.
7) Transfer lysate (supernatant) to new tube.

Lysis Buffer-Exchange: Used Pierce's pre-made Reaction Buffer. Reaction buffer: 20 mM HEPES pH 7.4, 150 mM NaCl, 0.1% TritonX-100
1) Twist off Zeba Spin Desalting Column's bottom closure and loosen cap.
2) Put in 15 mL conical tube.
3) Centrifuge column at room temperature at 1000 g for 2 minutes to remove storage solution.
4) Add 3 mL of Reaction Buffer to column. Centrifuge at 1000 g for 2 minutes to remove buffer. Repeat 2 more times, discarding buffer.
   a. Centrifuge additional 1000 g for 2-3 minutes if there is excess buffer on last wash.
5) Transfer column to new conical tube.
6) SLOWLY apply entire lysate to center of resin bed.
7) Centrifuge at 1000 g for 2 minutes to collect sample. Discard column.
8) Add 1:100 protease/phosphatase inhibitor cocktail to sample and place on ice.
   a. Samples may be frozen in -80° C freezer. Stopping point.

Sample Labeling with Probe: Used pre-made 1 M MgCl2 from Pierce.

Made fresh 1 M MnCl2.
1) Determine protein concentration using Bradford Assay.
2) Dilute lysate with water to 2 mg/mL (2 ug/uL) if possible.
3) Transfer 2 mg to new microcentrifuge tube.
4) Add 20 uL of 1 M MgCl2 to each sample, mix, incubate for 1 min at room temperature. Note: Final concentration is 0.02 M MgCl2.
5) Add 10 uL of 1 M MgCl2 to each sample, mix, incubate for 1 min at room temperature. Note: Final concentration is 10 mM MgCl2.
6) Equilibrate ATP/ADP reagent to room temperature with desiccant. Store remainder at -80° C.
7) For 20 uM reaction: add 10 uL of ultrapure water to reagent to make 1 mM stock solution.
8) Add 20 uL of ATP/ADP stock to each sample and incubate for 1 hour at room temperature.

Labeled Protein Reduction and Alkylation: Prepare fresh 10 M Urea/50 mM Tris-HCl pH 8.4
1) Add 1 mL of 10 M Urea/50 mM Tris-HCl to each reaction.
2) Add 100 uL of 200 mM TCEP to each sample. Incubate at 55° C for 1 hr.
3) Add 100 uL of 375 mM iodoacetamide to each sample. Incubate at room temperature for 30 minutes in the dark.

Buffer Exchange: Prepare fresh Digest Buffer: 2 M urea, 200 mM Tris-HCl pH 8.4
1) Twist off Zeba Spin Desalting Column's bottom closure and loosen cap.
2) Put in 15 mL conical tube.
3) Centrifuge column at room temperature at 1000 g for 2 minutes to remove storage solution.
4) Add 3 mL of Digest Buffer to column. Centrifuge at 1000 g for 2 minutes to remove buffer. Repeat 2 more times, discarding buffer.
   a. Centrifuge additional 1000 g for 2-3 minutes if there is excess buffer on last wash.
5) Transfer column to new conical tube.
6) SLOWLY apply entire sample to center of resin bed.
7) Centrifuge at 1000 g for 2 minutes to collect sample. Discard column.

Labeled Protein Digestion:
1) Add trypsin in 1:50 ratio (trypsin:protein).
2) Incubate at 37° C with shaking overnight.

Labeled Peptide Capture and Elution: Prepare fresh Elution Buffer (50% ACN; 0.1% formic acid)
1) Add 50 uL of slurry to each digested sample. Incubate for 1.5 hours at room temperature with constant mixing.
2) Transfer sample to Pierce Spin column. Centrifuge at 1000 g for 1 minute. Collect flow-through and save.
3) At 1000 g for 1 minute per wash:
   a. Wash resin 3X with 500 uL of 4 M urea/50 mM Tris-HCl pH 8.4
   b. Wash resin 4X with 500 uL of PBS
   c. Wash resin 4X with 500 uL of water
4) Elute peptides with 75 uL of Elution Buffer and incubate for 3 minutes. Repeat 2 more times, combining eluate fractions.
5) Lyophilize samples in vacuum concentrator.
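The "Final concentration" notes in the probe-labeling steps can be checked with simple dilution arithmetic. The following is a minimal sketch only: it assumes the 2 mg of protein is transferred at 2 mg/mL (a 1000 uL starting volume) and that volumes are additive. The function and variable names are illustrative and are not taken from the kit instructions.

```python
def final_conc_mM(stock_mM, added_uL, start_uL):
    """Concentration (mM) of a reagent after adding added_uL of stock to start_uL of sample."""
    return stock_mM * added_uL / (start_uL + added_uL)

# Assumption: 2 mg of protein at 2 mg/mL gives a 1000 uL starting volume.
volume_uL = 2.0 / 2.0 * 1000.0

# Step 4: 20 uL of a 1 M (1000 mM) salt stock.
salt_step4_mM = final_conc_mM(1000.0, 20.0, volume_uL)
volume_uL += 20.0

# Step 5: 10 uL of a 1 M (1000 mM) salt stock.
salt_step5_mM = final_conc_mM(1000.0, 10.0, volume_uL)
volume_uL += 10.0

# Step 8: 20 uL of the 1 mM ATP/ADP probe stock, reported in uM.
probe_uM = final_conc_mM(1.0, 20.0, volume_uL) * 1000.0

print(round(salt_step4_mM, 1), round(salt_step5_mM, 1), round(probe_uM, 1))
```

Under these assumptions the values come out close to the nominal 0.02 M, 10 mM and 20 uM figures quoted in the protocol; the small differences arise from the added volumes.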

Label-free, 1-D separation for LCMSMS analysis
1) Once samples are dried by lyophilizing, resuspend each sample in 25 uL of 0.1% formic acid.
2) Transfer 10 uL into vials for LCMSMS.

iTRAQ Labeling
1) The remaining 15 uL samples were dried completely.
2) Resuspend samples in 30 uL of 200 mM TEAB.
3) 15 uL of sample was labeled with 30 uL of iTRAQ reagent and incubated for 2 hours at room temperature.
   a. 6 uL per sample was pooled for the QCP.
4) After labeling, 8 uL of 5% hydroxylamine was added for quenching for 15 minutes at 4° C.
5) All MP's were pooled together, dried, desalted, and resuspended in 20 uL of 0.1% formic acid.

Eksigent/LTQ Orbitrap instrument was having problems so MP's were dried and resuspended in 18 uL of 20 mM ammonium formate.

Leftovers per sample:
- 9 uL of eluate in 200 mM TEAB in -80° C
- MP's in 20 mM ammonium formate on instrument

The phospho protein analysis was conducted according to the following protocol.

Sample prep protocol:
1. Cell lysis
   a. Lysis buffer: 5 M urea, 50 mM Tris-HCl, 0.1% SDS, 1% Protease Inhibitor Cocktail, 1% Phosphatase Inhibitor Cocktail
   b. Suspend pellet in the appropriate amount of lysis buffer.
   c. Vortex and incubate for 10 minutes on ice.
   d. Sonicate and incubate for 10 minutes on ice.
   e. Centrifuge at top speed for 15 minutes.
   f. Resonicate if lysate is still viscous/sticky.
   g. Transfer lysate to new tube.
2. Perform Bradford assay to determine protein concentration.
3. Transfer 700 ug of protein (400 ug for THLE-2) to new microtube with 45 uL of 200 mM TEAB.
4. Reduce with 200 mM TCEP at 5 uL TCEP:100 uL volume for 1 hour at 55° C.
5. Alkylate with 375 mM iodoacetamide at 5 uL iodo:100 uL volume at room temperature for 30 minutes in the dark.
6. Acetone precipitation at 7X the volume overnight in -20° C.
7. Resuspend protein in 200 mM TEAB at 50 ug/uL. Digest with trypsin at 1:40 (trypsin:protein) at 37° C overnight.

During column preparation, resuspend peptide sample in 150 uL of Buffer B.
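The reagent ratios in the sample prep protocol above can be turned into per-sample amounts as sketched below. This is an illustrative calculation, not part of the protocol: the function name, the example 100 uL sample volume, and the choices to scale both TCEP and iodoacetamide to the starting sample volume and to scale acetone to the total volume after both additions are assumptions.

```python
def prep_amounts(protein_ug, sample_uL):
    """Per-sample reagent amounts derived from the ratios stated in the protocol."""
    tcep_uL = 5.0 * sample_uL / 100.0        # 5 uL of 200 mM TCEP per 100 uL of sample
    iodo_uL = 5.0 * sample_uL / 100.0        # 5 uL of 375 mM iodoacetamide per 100 uL (assumed: of starting volume)
    total_uL = sample_uL + tcep_uL + iodo_uL
    return {
        "tcep_uL": tcep_uL,
        "iodoacetamide_uL": iodo_uL,
        "acetone_uL": 7.0 * total_uL,        # 7X the volume (assumed: volume after both additions)
        "teab_resuspension_uL": protein_ug / 50.0,  # resuspend at 50 ug/uL
        "trypsin_ug": protein_ug / 40.0,     # 1:40 trypsin:protein
    }

# Example: 700 ug of HepG2/Hep3B protein in a hypothetical 100 uL sample volume.
hepg2 = prep_amounts(700.0, 100.0)
print(hepg2)
```

For THLE-2, the same function would be called with 400 ug of protein.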

Column Preparation:
1. Place Centrifuge Column Adaptor in collection tube and insert TiO2 Spin Tip into adaptor.
2. Add 20 uL of Buffer A. Centrifuge at 3000 g for 2 minutes. Discard FT.
3. Add 20 uL of Buffer B. Centrifuge at 3000 g for 2 minutes. Discard FT.

Phosphopeptide Binding:
1. Transfer spin tip to a clean microtube.
2. Apply suspended sample to spin tip. Centrifuge at 1000 g for 10 minutes.
3. Reapply sample to spin tip and centrifuge at 1000 g for 10 minutes. Save FT.
4. Transfer spin tip to a new microtube.
5. Wash column by adding 20 uL of Buffer B. Centrifuge at 3000 g for 2 minutes.
6. Wash column by adding 20 uL of Buffer A. Centrifuge at 3000 g for 2 minutes. Repeat once more.

Elution:
1. Place spin tip in new collection tube. Add 50 uL of Elution Buffer 1. Centrifuge at 1000 g for 5 minutes.
2. Using same collection tube, add 50 uL of Elution Buffer 2 to spin tip. Centrifuge at 1000 g for 5 minutes.
3. Acidify elution fraction by adding 100 uL of 2.5% Formic Acid.

Graphite Clean-up of Phosphopeptides

Note: Replace TFA with Formic Acid since this is the final clean-up before LC/MS/MS analysis.

Column Preparation:
1. Remove top and bottom cap from graphite spin column. Place column in 1.5 mL microtube. Centrifuge at 2000 g for 1 minute to remove storage buffer.
2. Add 100 uL of 1 M NH4OH. Centrifuge at 2000 g for 1 minute. Discard FT. Repeat once more.
3. Activate graphite by adding 100 uL of acetonitrile. Centrifuge at 2000 g for 1 minute. Discard FT.
4. Add 100 uL of 1% Formic Acid. Centrifuge at 2000 g for 1 minute. Discard FT. Repeat once more.

Sample Binding and Elution: Elution = 0.1% FA + 50% ACN
1. Place column into new collection tube. Apply sample on top of resin bed. Allow binding for 10 minutes with periodic vortex mixing. Centrifuge at 1000 g for 3 minutes. Discard FT.
2. Place column into new collection tube. Wash column by adding 200 uL of 1% FA. Centrifuge at 2000 g for 1 minute. Discard FT. Repeat once more.
3. Place column into new collection tube. Add 100 uL of 0.1% FA/50% ACN to elute sample. Centrifuge at 2000 g for 1 minute. Repeat 3 more times for total elution of 400 uL.
4. Dry samples in vacuum evaporator (SpeedVac).

HepG2 and Hep3B: Start with 700 ug of protein. After TiO2 enrichment and graphite clean-up, phosphopeptides were eluted in 400 uL of 0.1% formic acid/50% ACN.

A ratio of (400/700)*400 uL aliquot was taken from eluent and dried completely.
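The aliquot volume implied by the (400/700)*400 uL ratio can be worked out as follows. This is a small sketch; the function name is illustrative, and the reading that the ratio scales the 700 ug HepG2/Hep3B eluate down to the 400 ug amount available for THLE-2 is an interpretation of the numbers rather than something stated in the protocol.

```python
def matched_aliquot_uL(reference_ug, input_ug, eluate_uL):
    """Volume of eluate that corresponds to reference_ug out of input_ug of starting protein."""
    return reference_ug / input_ug * eluate_uL

itraq_aliquot_uL = matched_aliquot_uL(400.0, 700.0, 400.0)  # about 228.6 uL
remaining_uL = 400.0 - itraq_aliquot_uL                     # about 171.4 uL left over
print(round(itraq_aliquot_uL, 1), round(remaining_uL, 1))
```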

It was resuspended in 20 uL of 200 mM TEAB for iTRAQ labeling.

After labeling, samples were desalted, dried, and resuspended in 20 uL of 0.1% formic acid.

Remaining aliquot was dried completely and resuspended in 20 uL of 0.1% formic acid. uL was transferred to vials for label-free LCMSMS analysis.

THLE-2: Only 400 ug of protein was extracted.

All of the protein was enriched with TiO2 columns and cleaned with graphite columns. The eluates were dried and resuspended in 20 uL of 200 mM TEAB for iTRAQ labeling. After labeling, samples were desalted, dried, and resuspended in 20 uL of 0.1% formic acid.

Leftover samples:
iTRAQ samples: on instrument in 20 mM ammonium formate
Label-free HepG2/Hep3B: 10 uL in 0.1% formic acid in -80° C; 10 uL in 0.1% formic acid on instrument

Results

illustrates a significant decrease in ENO1 activity but not ENO1 expression in HepG2 treated with Sorafenib. illustrates a significant decrease in PGK1 activity but not in PGK1 protein expression in HepG2 treated with Sorafenib. illustrates a significant decrease in LDHA activity in HepG2 treated with Sorafenib. In each case, ENO1 expression was measured in units relative to a QC sample and the ENO1 activity change was measured in units relative to the control, untreated sample.

The data in FIGS. 42-44 show that for ENO1, LDHA, and PGK1 in the HCC disease model, treatment of cells with Sorafenib results in upregulation of protein expression while concomitantly downregulating the protein's enzymatic activity.
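The idea that activity and expression can move in different directions can be expressed as a simple comparison of two fold-changes per protein. The sketch below is illustrative only: the data class, the 20% tolerance, and the example fold-change values are hypothetical and are not measurements from this Example.

```python
from dataclasses import dataclass

@dataclass
class Measurement:
    protein: str
    activity_fold: float    # treated / control enzyme activity (hypothetical value)
    expression_fold: float  # treated / control protein expression (hypothetical value)

def direction(fold, tol=0.2):
    """Classify a fold-change as up, down, or unchanged using an arbitrary tolerance."""
    if fold > 1.0 + tol:
        return "up"
    if fold < 1.0 - tol:
        return "down"
    return "unchanged"

def classify(m, tol=0.2):
    """Return (activity direction, expression direction, discordant flag)."""
    activity = direction(m.activity_fold, tol)
    expression = direction(m.expression_fold, tol)
    return activity, expression, activity != expression

# Hypothetical fold-changes chosen only to show a discordant case.
example = Measurement("ENO1", activity_fold=0.5, expression_fold=1.4)
print(classify(example))
```

A protein flagged as discordant in this way is one for which expression data alone would not reveal the change in enzymatic activity.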

Thus, the phospho proteome affords an additional layer of information that can be used for elucidating the complex relationship between the effect of an extracellular signal (e.g., drug molecule) on kinase activity and total cellular protein, thereby facilitating the identification of disease treatment targets, as well as diagnostic/prognostic markers associated with disease. illustrates (see left frame) a causal molecular interaction network that can be produced by analyzing a resulting dataset using the AI based REFS™ system.

The network can be used, for example, to identify networks of interest that are differentially regulated in normal and cancer cells (see middle and right frames, respectively). Such information can be used to provide HCC treatment targets, as well as diagnostic/prognostic markers associated with HCC.
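One simple way to picture a differential network is as a set comparison over directed edges. The sketch below is a minimal illustration under that assumption; it is not the REFS™ procedure, and the node names in the example are placeholders rather than proteins from the figures.

```python
def differential_edges(disease_edges, normal_edges):
    """Split directed (source, target) edges into disease-only, normal-only, and shared sets."""
    disease = set(disease_edges)
    normal = set(normal_edges)
    return {
        "disease_only": disease - normal,
        "normal_only": normal - disease,
        "shared": disease & normal,
    }

# Placeholder node names, for illustration only.
cancer_network = [("A", "B"), ("B", "C"), ("C", "D")]
normal_network = [("A", "B"), ("D", "C")]

result = differential_edges(cancer_network, normal_network)
print(sorted(result["disease_only"]))
```

Because the edges are directed, ("C", "D") and ("D", "C") are treated as different relationships.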

FIGS. 46-51 illustrate how a two dimensional chemical interrogation of oncogenic systems and multi-omics integration of signatures can reveal novel signaling pathways involved in the pathophysiology of cancer, thereby identifying therapeutic targets, relevant biomarkers, and/or therapeutics. In particular, FIGS. 46-51 illustrate the implementation of the general methodology shown in and in accordance with the various methods described herein. As shown in , the approach is powered by “two dimensional chemical interrogation” where in vitro cancer and control models were interrogated by a kinase inhibitor (Sorafenib) in a first dimension. Overall changes in kinase activity were captured by a second dimension of chemical interrogation employing activity based kinase enrichment probes. Kinases were identified by LC-MS. In addition, changes in the phospho proteome in response to exposure to the kinase inhibitor were captured using a phospho protein enrichment method followed by LC-MS for identification of proteins. Finally, quantitative changes in total protein expression were obtained. The resulting multi-omics data were integrated using AI-based informatics, leading to the generation of data-driven causal networks representing differential kinase activity driving phosphorylation of proteins that are operational in a cancer model but not in a “normal” model. Integration of these complementary analyses is shown in the inferred pathways of FIGS. 46 and 47. The technology led to the discovery of novel kinases and relationships that are mechanistically relevant to pathophysiology of cancer (e.g., FIGS. ).

illustrates how the integration of multiomics data employing Bayesian network inference algorithms can lead to improved understanding of signaling pathways in hepatocellular carcinoma. Yellow squares represent post-translational modification (Phospho) data, blue triangles represent activity based (Kinase) data, and green circles represent proteomics data.

illustrates how autoregulation and reverse feedback regulation in hepatocellular carcinoma signaling pathways can be inferred by the Platform. s represent PTM (Phospho) data (grey/dark = Kinase, yellow/light = No Kinase Activity), squares represent activity based (Kinase) + Proteomics data (grey/dark = Kinase, yellow/light = No Kinase Activity). These analyses were carried out using the three-layered multi-proteomics methodology described above and summarized in . Results of these analyses are shown in FIGS. 48-51 and discussed in further detail below.

FIGS. 48-50 illustrate examples of causal association in signaling pathways inferred by the Platform. Kinase names are indicated on representative squares and s, with causal associates indicated by connectors. identifies the CLTCL1, MAPK1, NME1, HIST1H2BA, RPS5, TMED4, and MAP4 kinase isoforms and shows an inferred relationship therebetween. identifies the , , RAB7A, RPL28, HSPA9, MAP2K2, RPS6, FBL, TCOF1, PGK1, SLTM, TUBB, PGK2, CDK1, MARCKS, HDLBP, and GSK3B kinase isoforms and shows an inferred relationship therebetween. identifies the RPS5, TNRCBA, CLTCL1, NME1, MAPK1, RPL17, CAMK2A, NME2, UBE21, CLTCL1, HMGB2, and NME2 kinase isoforms and shows an inferred relationship therebetween. These kinase isoforms present potential therapeutic targets, markers, and therapeutics. illustrates a causal association derived by the Platform. In particular, identifies the EIF4G1, MAPK1, and TOP2A kinase isoforms and shows an inferred relationship therebetween. This relationship provides validation for the model and method because it comports with the published relationship between EIF, MAPK, and TOP kinases.

In conclusion, multiomics based analysis of enzyme (e.g., kinase) activity represents a useful method for the determination of downstream causal relationships between metabolites and substrates as a function of cell behavior. Likewise, activity based proteome monitoring of changes in global enzyme activity in response to therapeutic treatment can provide critical insight into cellular signaling dynamics as compared to monitoring only the overall cellular expression of proteins (e.g., enzymes).

Furthermore, it has been shown that the Platform can robustly infer signaling pathways and reverse feedback regulation in oncogenic versus normal environments and, therefore, identify novel causal associations in oncogenic signaling pathways.

Accordingly, the technology provides identification of novel kinases and deciphering of the mechanism of action of kinase inhibitors.

EXAMPLE 6: In Vitro Model of Angiogenesis and Modulation by CoQ10

Introduction: Progression of tumor size beyond 2-5 mm requires induction of angiogenesis to supply the tumor with oxygen and nutrients. Angiogenesis occurs due to intratumoral cell release of endothelial mitogenic factors in response to hypoxia or genetic mutation, and there are currently numerous endogenous proteins in clinical development as therapeutic antiangiogenesis targets, e.g., VEGF and PlGF.

Herein, we have investigated Coenzyme Q10 (CoQ10) in vitro, which is currently under investigation in human studies of cancer progression.

Methods: Human umbilical vein endothelial cell (HUVEC) fate decisions that modulate the angiogenic phenotype were examined in the presence of 100 or 1500uM CoQ10 or excipient and compared to untreated control cells. Endothelial cell fate assays for apoptosis, proliferation, migration and 3-D tube formation within MATRIGEL® were performed.

Results: Morphological and flow cytometric analysis of annexin V/propidium iodide positive cells revealed an increase in HUVEC apoptosis in the presence of 1500uM CoQ10, compared to excipient or control cells. Concomitant with increased cell death due to CoQ10, HUVEC cell counts were significantly decreased in the presence of 1500uM CoQ10. To assess the potential effects of CoQ10 on endothelial migration, HUVEC migration was examined 5 hours post-cell clearance, in an endothelial scratch assay. Both CoQ10 and excipient significantly impaired HUVEC migration at both 100 and 1500uM concentration, demonstrating antimigratory activity of both the excipient and CoQ10. In order to determine if the CoQ10 anti-tumor activity is due to effects on endothelial sprouting angiogenesis, we examined endothelial tube formation in 3-D MATRIGEL® cultures over time. Addition of excipient in both the gel and overlying media impaired tube formation compared to control. Moreover, addition of 1500uM CoQ10 further impaired HUVEC tube formation compared to both excipient and control untreated cells. These effects were noted as early as 24 hours after seeding and up to 96 hours in culture. Taken together, these studies demonstrate that the CoQ10 effect is likely, at least in part, due to inhibition of tumor recruitment of local blood supply for neo-vessel formation.

Effect of CoQ10 on endothelial morphology: Human umbilical vein endothelial cells (HUVEC cells) were treated for 24 hours with a range of concentrations of CoQ10. Drug was applied to confluent cells that closely resemble ‘normal’ cells and also to sub-confluent cells that more closely represent the angiogenic phenotype of proliferating cells. In confluent cultures, addition of increasing concentrations of CoQ10 led to closer association, elongation and alignment of ECs. 5000uM led to a subtle increase in d cells (Figure 52A). The response of sub-confluent endothelial cells to CoQ10 diverged from the confluent cell response (Figure 52B). Endothelial cells were visibly unhealthy at 1000uM CoQ10 and above. Increased cell death was visible with increasing concentrations of CoQ10.

CoQ10 has divergent effects on endothelial cell survival: Confluent and sub-confluent cultures of HUVEC cells were treated for 24 hours with 100 or 1500uM CoQ10 and assayed for propidium iodide positive apoptotic cells. The results are shown in Figures 53A and 53B, respectively. CoQ10 was protective to ECs treated at confluence, whereas sub-confluent cells were sensitive to CoQ10 and displayed increased apoptosis at 1500uM CoQ10. Representative histograms of sub-confluent control ECs (left), 100uM CoQ10 (middle) and 1500uM CoQ10 (right) demonstrating increasing levels of apoptosis with increasing concentrations of CoQ10 are shown in Figure 53C.

CoQ10 decreases endothelial cell numbers and proliferation: Sub-confluent cultures of HUVEC cells were treated for 72 hours with 100 or 1500uM CoQ10 and assayed for both cell numbers (Figure 54A) and proliferation (Figure 54B) using a propidium iodide incorporation assay (detects G2/M phase DNA). High concentrations of CoQ10 led to a significant decrease in cell numbers and had a dose-dependent effect on EC proliferation. Representative histograms of cell proliferation gating for cells in the G2/M phase of the cell cycle, demonstrating decreased cell proliferation with increasing concentrations of CoQ10, are shown in Figure 54C [control ECs (left), 100uM CoQ10 (middle) and 1500uM CoQ10 (right)].

CoQ10 decreases endothelial cell migration: HUVEC cells were grown to confluence and tested for migration using the ‘scratch’ assay. 100 or 1500uM CoQ10 was applied at the time of scratching and closure of the cleared area was monitored over 48 hours. 100uM CoQ10 delayed endothelial closure compared to control. Representative images at 0, 12, 24, and 36 hours are provided in Figure 55. Addition of 1500uM CoQ10 prevented closure, even up to 48 hours (data not shown).
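Closure of the cleared area in a scratch assay is commonly summarized as a percentage of the initial cleared area. The text does not state how closure was quantified here, so the following is only a generic sketch; the function name and the area values are hypothetical.

```python
def percent_closure(initial_area, area_at_time):
    """Percent of the initially cleared area that has been re-covered by cells."""
    if initial_area <= 0:
        raise ValueError("initial_area must be positive")
    return 100.0 * (initial_area - area_at_time) / initial_area

# Hypothetical cleared-area measurements (arbitrary units) at 0, 12, 24 and 36 hours.
areas = [1000.0, 750.0, 400.0, 100.0]
closure = [percent_closure(areas[0], a) for a in areas]
print(closure)
```

A treatment that delays migration would show lower closure values than control at the same time points, and one that prevents closure would stay near zero.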

CoQ10 impairs endothelial tube formation: Endothelial cells growing in 3-D matrigel form tubes over time. Differential effects of 100uM and 1500 uM CoQ10 on tube formation were observed. Impaired cell to cell association and breakdown of early tube structure was significant at 1500 uM CoQ10. Interestingly, tube formation did commence in the presence of 1500uM CoQ10, however the process was impaired 48 hours into tube growth and formation. Images shown in Figure 56 were taken at 72 hours.

Results and Conclusion: We investigated the potential angiogenesis modulating effects of CoQ10.

CoQ10 is an anti-cancer agent currently under investigation in human solid tumor studies that modulates the cellular energy metabolism.

CoQ10 at low doses was protective to confluent endothelial cells, whereas addition of CoQ10 to sub-confluent cells led to increased apoptosis and decreased cell numbers, and CoQ10 was a potent inhibitor of endothelial proliferation. We demonstrate divergent effects on confluent and subconfluent cells that would protect the ‘normal’ vasculature.

Functional assessment of the endothelial ability to migrate in 2-D scratch assays revealed a potent inhibition of endothelial migration. Time-lapse photography revealed a dynamic endothelial ‘front’ that fails to close the cleared zone over a 2 day culture/treatment.

Suspension of endothelial cells in 3-D matrigel leads to formation of tubes over time. Using this well-characterized assay that recapitulates many of the processes at play in tumor angiogenesis, we examined the effect of CoQ10 on endothelial tube formation.

Addition of 100uM CoQ10 had a modest effect on tube formation, however addition of 1500uM CoQ10 led to a dramatic disruption of endothelial tube formation.

In summary, these results demonstrate that CoQ10 affects endothelial sprouting, migration and proliferation and selectively induces cell death in angiogenic endothelial cells.

EXAMPLE 7: Coenzyme Q10 Differentially Modulated Functional Responses in Confluent and Subconfluent HUVEC Cells

Having demonstrated a differential effect of CoQ10 on cell proliferation and migration in HUVEC cells grown under confluent and subconfluent conditions, the effects of CoQ10 on the biochemical pathways of HUVEC cells were investigated.

The response of HUVEC cells to normoxia and hypoxia in the presence or absence of CoQ10 was assessed. Specifically, HUVEC cells were grown in subconfluent and confluent cultures under normoxic or hypoxic conditions as described herein. The cells were also exposed to 0, 100, or 1500 uM CoQ10. Nitric oxide (NO) and reactive oxygen species (ROS) levels were determined using methods provided herein. As shown in Figure 57, the HUVEC cells demonstrated a differential dose dependent generation of nitric oxide (NO) and reactive oxygen species (ROS) in response to CoQ10 and hypoxia.

The bioenergetics of HUVEC cells were assessed in the presence of various concentrations of CoQ10. Specifically, HUVEC cells were grown in subconfluent or confluent conditions in the absence or presence of CoQ10 (10, 100, 1500 uM). Oxygen consumption rates (OCR), both total and mitochondrial, ATP production, and Extra Cellular Acidification Rate (ECAR) were assessed using Seahorse assays. HUVEC cells growing in sub-confluent cultures limit mitochondrial oxygen consumption when compared to confluent cultures as shown in Figure 58A-D ((A) Total OCR; (B) Mitochondrial OCR; (C) ATP; (D) ECAR). Addition of CoQ10 to sub-confluent cultures reverts mitochondrial OCR to confluent level OCR (Figure 58B).
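The text does not describe how mitochondrial OCR and ATP production were derived from the Seahorse measurements. A conventional approach, shown here only as a hedged sketch, subtracts the non-mitochondrial rate (measured after respiratory chain inhibitors) from the total rate, and takes ATP-linked respiration as the drop in OCR after oligomycin. The function names and the numeric values are hypothetical.

```python
def mitochondrial_ocr(total_ocr, non_mitochondrial_ocr):
    """Mitochondrial OCR as total OCR minus the inhibitor-insensitive (non-mitochondrial) rate."""
    return total_ocr - non_mitochondrial_ocr

def atp_linked_ocr(basal_ocr, post_oligomycin_ocr):
    """ATP-linked respiration as the decrease in OCR after ATP synthase inhibition."""
    return basal_ocr - post_oligomycin_ocr

# Hypothetical rates in pmol O2/min, for illustration only.
mito = mitochondrial_ocr(total_ocr=120.0, non_mitochondrial_ocr=20.0)
atp = atp_linked_ocr(basal_ocr=120.0, post_oligomycin_ocr=45.0)
print(mito, atp)
```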

EXAMPLE 8: Application of Functional Proteomics and Lipidomics to Elucidate Anti-angiogenic Mechanism of CoQ10

Angiogenesis is a key enabling feature of tumor progression that provides the oxygen and nutrients required for tumor cell growth. We have investigated the anti-angiogenic properties of CoQ10, an anti-tumor drug that is currently under investigation in human studies of cancer progression. CoQ10 impairs endothelial migration in ‘scratch’ assays and tube formation in 3-D MATRIGEL® tube formation assays. Addition of CoQ10 also impairs endothelial proliferation, as detected by G2/M phase cells and proliferating cell nuclear antigen (PCNA) protein. CoQ10 induces activation of caspase 3 and increases apoptosis of angiogenic/proliferating endothelial cells, whereas cell death of non-proliferating confluent endothelial cell cultures is decreased compared to controls.

In order to determine the intracellular proteomic profile of angiogenic proliferating endothelial cells and non-proliferating endothelial cells, we used a proteomic, lipidomic, and functional proteomic approach. Proteomic and n lipidomic analysis were performed on an LTQ-OrbiTrap-Velos and a Vantage-QqQ, respectively. The functional proteomics approach employed activity-based probes in combination with comparative proteomics. Kinases and other enzymes were specifically labeled with ATP-binding domain enrichment probes that interact with the active sites of enzymes in their native conformation. Enrichment was carried out through immunoprecipitation with streptavidin resin.

Using integrated lipidomic and proteomic platforms, and an AI-based Bayesian informatics platform that generates causal lipid/protein/functional proteomics networks, novel proteins, lipids, and enzymes that modulate angiogenesis were identified. CoQ10-treated cells and a comparison of normal and angiogenic endothelial cells were used to probe the global kinase activity. Comparative proteomics and enzyme activity data were integrated into the AI-based Bayesian informatics platform to investigate causal networks of functional protein-protein interactions in order to elucidate the complexity and dynamics of angiogenesis. A causal interactive network is shown in Figure 59A-C. Specifically, Figure 59A is a full multi-omic causal interaction network of lipids, proteins, and kinases. Figure 59B shows a hub of a protein-enriched network, and Figure 59C shows a hub of a kinase, lipidomic, and functional endpoint network. In the networks, proteins are indicated by circles, kinases are indicated by s, lipids are indicated by diamonds, and functional activity or cellular response are indicated by octagons. Some protein and kinase names are provided. The outputs from the platform confirmed known protein interactions.

In summary, using the platform technology, the anti-angiogenic mechanism of CoQ10 and the unique characteristics of proliferating endothelial cells have been investigated by applying integrated functional proteomic assays to determine global changes in enzymatic activity. The interrogative “omic”-based platform robustly infers cellular intelligence. The AI-based network engineering approach to data mining to infer causality results in actionable biological intelligence. Moreover, the discovery platform allows for enhanced understanding of the pathophysiology of endothelial cells in response to environmental challenge, alteration in metabolic status, and production of adaptive molecules to mitigate physiologic perturbations.

EXAMPLE 9: Employing Platform Technology to Build Models of Angiogenesis

In this example, the platform technology described in detail above in the detailed description is employed to integrate data obtained from a custom-built angiogenesis model, and to identify novel proteins/pathways driving angiogenesis. Relational maps resulting from this analysis provide angiogenesis biomarkers.

Angiogenesis is a result of a complex series of signaling pathways that are not fully understood. Angiogenesis plays a role in a number of pathological conditions including, but not limited to, cancer. A systems approach combining protein and lipid signatures with functional end point assays specifically looking at cellular bioenergetics and mitochondrial membrane function is provided herein. As demonstrated above, sub-confluent HUVEC cells can be used to mimic an angiogenic state, whereas confluent HUVEC cells can be used to mimic a non-angiogenic, i.e., normal, state.

In an in vitro model, HUVEC cells are grown under conditions of contact inhibition (e.g., confluent cultures) or under conditions lacking contact inhibition (e.g., sub-confluent cultures, e.g., less than about 60% confluent, less than about 70% confluent, less than about 80% confluent, less than about 90% confluent; three-dimensional cultures; or cultures in which a patch of cells is removed by “scratching” the culture), in the presence or absence of an environmental influencer, such as an angiogenesis inhibitor, e.g., CoQ10, to create signatures and elucidate potential mechanisms of angiogenesis. The proteomic and lipidomic signatures are analyzed using the platform methods provided herein. Biomarkers of angiogenesis are further confirmed using wet lab methods. This approach serves as a powerful tool to understand the mechanisms of angiogenesis, allowing for the identification of new angiogenic biomarkers and the development and testing of agents that modulate angiogenesis.

Human umbilical vein endothelial cells are subjected to conditions simulating an angiogenic environment experienced by the disease-relevant cells in vivo. Specifically, the cells are grown under conditions wherein growth is inhibited due to contact inhibition (i.e., normal cells) or under conditions wherein, in at least a portion of the culture, growth is not inhibited due to contact inhibition (i.e., angiogenic cells). For the sake of simplicity, such cells grown under conditions wherein, in at least a portion of the culture, growth is not inhibited due to contact inhibition will be referred to as non-confluent cultures.

The cell model comprising the above-mentioned cells, wherein the cells are grown in confluent or non-confluent cultures, is additionally “interrogated” by exposing the cells to an “environmental perturbation” by treating with an agent that modulates angiogenesis, e.g., an agent that inhibits angiogenesis. For example, the cells are treated with Coenzyme Q10 at various concentrations, for example, one or more of 0, 50 µM, 100 µM, 250 µM, 500 µM, 750 µM, 1000 µM, 1250 µM, or 1500 µM. As provided herein, perturbation can include mechanical disruption of the cells, e.g., by “scratching” the culture or subculturing the cells at a lower density.
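The interrogation design described in this paragraph can be laid out as a simple grid. The following is a minimal sketch, assuming every culture state is crossed with every listed Coenzyme Q10 dose (read here as 0 to 1500 µM); the text permits "one or more of" these doses, so a real design may use a subset. The names `CULTURES`, `DOSES_UM`, and `build_conditions` are hypothetical.

```python
from itertools import product

# Culture states and Coenzyme Q10 doses (micromolar) named in the text.
CULTURES = ("confluent", "non-confluent")
DOSES_UM = (0, 50, 100, 250, 500, 750, 1000, 1250, 1500)


def build_conditions(cultures=CULTURES, doses_um=DOSES_UM):
    """Return one record per culture state and dose combination."""
    return [
        {"culture": culture, "coq10_uM": dose}
        for culture, dose in product(cultures, doses_um)
    ]


conditions = build_conditions()
```

A full factorial grid of this kind makes it straightforward to attach sample identifiers and measured data sets to each condition later on.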

Cell samples from each condition with each perturbation treatment are collected at various times following treatment, for example, after 6, 12, 18, 24, 36, 48, 60, 72, 84, 96, 108, or 120 hours, or some time point therebetween, of treatment. For certain conditions, media samples are also collected and analyzed. Samples can then be analyzed for one or more of level of protein expression or activity, gene expression, and lipid levels. Profiling of changes in total cellular protein expression by quantitative proteomics is performed for cell and media samples collected for each condition and with each “environmental perturbation”, i.e., Coenzyme Q10 treatment, using the techniques described above in the detailed description. Transcriptional profiling experiments are carried out, for example, using the Biorad® CFX-384 amplification system. Following data collection (Ct), the final fold change over control is determined using, for example, the δCt method as outlined in the manufacturer’s protocol. Lipidomics experiments are carried out using mass spectrometry. Functional assays such as Oxygen Consumption Rate (OCR) are measured, for example, by employing the Seahorse analyzer essentially as recommended by the manufacturer. OCR can be recorded by the electrodes in a 7 µl chamber created with the cartridge pushing against the Seahorse culture plate.
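The fold-change step for the transcriptional profiling data can be illustrated with a short sketch. It assumes the widely used comparative Ct form (fold change = 2^-ΔΔCt, with a reference gene for normalization and roughly 100% amplification efficiency); the manufacturer's protocol referred to above may differ in detail, and the function name and Ct values below are hypothetical.

```python
def fold_change(ct_target_treated: float, ct_ref_treated: float,
                ct_target_control: float, ct_ref_control: float) -> float:
    """Fold change of a target gene in treated versus control samples.

    Assumption: comparative Ct method with one reference gene and
    a doubling of product per cycle.
    """
    # Normalize the target gene to the reference gene within each sample.
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    # Compare treated to control; a lower Ct means more template.
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target crosses threshold two cycles
# earlier in treated cells, with an unchanged reference gene.
example = fold_change(24.0, 18.0, 26.0, 18.0)
```

Under these assumptions, a value above 1 indicates higher expression in the treated sample and a value below 1 indicates lower expression.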

In summary, morphological, enzymatic, and flow cytometric analysis revealed dramatic changes in apoptosis, migration, nitric oxide and ROS generation, and bioenergetic capacity in response to CoQ10 treatment. Lipidomic analysis revealed novel changes in lipid pathways mitigated by altering mitochondrial function and cell density. Proteomic integration utilizing the platform methods revealed an uncharacterized association of intracellular adaption and signaling mediated by mitochondrial modulation.

Taken together, these studies reveal that CoQ10 alters endothelial migration, proliferation, apoptosis, nitric oxide, ROS, and protein/lipid architecture. A novel mechanism is presented herein where the anti-tumor activity of CoQ10 is due to metabolic cross-talk of angiogenic and apoptotic factors to inhibit tumor recruitment of local blood supply for neo-vessel formation. Additionally, proteomic and lipidomic adaption was associated with interactive networks which support the physiological requirements of endothelial cells in response to environmental stimuli. These data provide hallmark insight into the selective adaptation of tumor angiogenesis due to dysregulated mitochondrial metabolic control elements.

EXAMPLE 10: Employing Platform Technology to Implement Multi Proteomics Models for Elucidating Enzymatic Activity

In general, the enzymatic platform technology described in Example 5 above can be adapted to implement further methods for identifying a modulator of a biological system or disease process such as angiogenesis. The methods employ a model for angiogenesis, comprising cells associated with angiogenesis, to represent a characteristic aspect of angiogenesis. The model is used to obtain at least three levels of data, namely (i) a first data set representing global enzyme activity in the cells associated with angiogenesis, (ii) a second data set representing an effect of the global enzyme activity on the enzyme metabolites or substrates in the cells associated with angiogenesis, and (iii) a third data set representing global proteomic changes in the cells associated with angiogenesis. Additional data sets, such as lipidomic, transcriptomic, metabolomic, and SNP data, can also be obtained. The data is used to generate a consensus causal relationship network among the global enzyme activity, the effect of the global enzyme activity, and the global proteomic changes. The consensus causal relationship network is based solely on the first, second, and third data sets using a programmed computing device (i.e., not based on any other known biological relationship). The consensus causal relationship network is then used to identify a causal relationship unique to angiogenesis, where at least one gene or protein associated with the unique causal relationship is identified as a modulator of angiogenesis.
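The last two steps of this paragraph, forming a consensus network and then isolating relationships unique to angiogenesis, can be sketched in a few lines. This is only an illustration under stated assumptions: each candidate network is represented as a set of directed (cause, effect) edges, the consensus keeps edges found in at least a chosen fraction of an ensemble of such networks, and "unique" means present in the angiogenic consensus but absent from the control consensus. It is not the platform's actual algorithm, and all names and node labels below are hypothetical.

```python
from collections import Counter


def consensus_edges(ensemble, min_fraction=0.5):
    """Keep directed edges present in at least min_fraction of the networks."""
    if not ensemble:
        return set()
    counts = Counter(edge for network in ensemble for edge in set(network))
    cutoff = min_fraction * len(ensemble)
    return {edge for edge, n in counts.items() if n >= cutoff}


def unique_edges(case_consensus, control_consensus):
    """Edges in the case (angiogenic) consensus that the control lacks."""
    return set(case_consensus) - set(control_consensus)


# Hypothetical toy ensembles; node labels are placeholders.
angiogenic = [
    {("KINASE_A", "PROTEIN_B"), ("PROTEIN_B", "LIPID_C")},
    {("KINASE_A", "PROTEIN_B")},
    {("KINASE_A", "PROTEIN_B"), ("PROTEIN_D", "LIPID_C")},
]
control = [
    {("PROTEIN_B", "LIPID_C")},
    {("PROTEIN_B", "LIPID_C"), ("KINASE_A", "PROTEIN_D")},
]

delta = unique_edges(consensus_edges(angiogenic), consensus_edges(control))
```

In this toy case, the only edge that survives both steps links KINASE_A to PROTEIN_B, so the nodes on that edge would be the candidates to follow up as modulators.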

In this example, the platform technology was adapted to implement multi proteomics techniques for measuring enzyme activity related to angiogenesis and the direct effects of that activity on the proteome, and thereby to provide a system that can be used to understand causal relationships between enzymes (e.g., kinases and/or proteases) and their metabolites/substrates in the context of global changes in the cellular proteome during angiogenesis. Such techniques can provide valuable insight because enzyme activity can be orthogonal to enzyme expression (e.g., activity downregulated and expression upregulated). Relational maps resulting from such an analysis can provide disease treatment targets by modulating angiogenesis, as well as diagnostic/prognostic markers associated with angiogenesis. Such targets and markers can provide for therapeutic compositions and methods. Techniques for establishing models, obtaining data sets, generating consensus causal relationship networks, and identifying causal relationships unique to angiogenesis are discussed in the summary, detailed description, and examples above. Further techniques for establishing models and obtaining data sets representing global enzyme activity and the effect of the global enzyme activity on the enzyme metabolites or substrates are provided below.
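The point that enzyme activity and enzyme expression can move in opposite directions can be made concrete with a small sketch. It assumes that activity and expression changes are each summarized as a log2 fold change per enzyme, and it flags enzymes whose two values have opposite signs and both exceed a chosen magnitude. The function name, the threshold, and the enzyme labels are hypothetical and are not taken from the platform.

```python
def discordant_enzymes(activity_lfc, expression_lfc, min_abs_lfc=1.0):
    """Return enzymes whose activity and expression change in opposite directions.

    Both arguments map enzyme name to log2 fold change (treated vs. control).
    Only enzymes present in both maps are considered.
    """
    flagged = []
    for name in activity_lfc.keys() & expression_lfc.keys():
        activity = activity_lfc[name]
        expression = expression_lfc[name]
        large_enough = abs(activity) >= min_abs_lfc and abs(expression) >= min_abs_lfc
        if large_enough and activity * expression < 0:
            flagged.append(name)
    return sorted(flagged)


# Hypothetical values: KINASE_1 is less active despite higher expression.
activity = {"KINASE_1": -1.5, "KINASE_2": 2.0, "KINASE_3": 0.2}
expression = {"KINASE_1": 1.2, "KINASE_2": 1.8, "KINASE_3": -3.0}
flagged = discordant_enzymes(activity, expression)
```

Enzymes flagged this way are the cases where an expression-only data set would point in the wrong direction, which is the motivation given above for measuring activity directly.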

First, a model is established in accordance with the platform technology wherein, for example, cell lines are subjected to conditions simulating a disease and interrogated by exposure to an environmental perturbation (e.g., exposure to a modulator of angiogenesis, e.g., CoQ10, Avastin, a VEGF inhibitor, angiostatin, bevacizumab, or a change of confluency of HUVEC cells). A control is provided for comparison. Second, enzyme activity and its downstream effects are tracked in the context of global proteomic changes by analyzing (i) global enzymatic activity, (ii) the specific effect of the enzymatic activity on the proteome (e.g., the metabolites/substrates of the enzymatic activity), and (iii) the global effect on the cellular proteome. Third, the datasets are analyzed in accordance with the platform technology to identify modulators of interest.

For example, an angiogenic model can be interrogated by a known modulator of angiogenesis; the effects of this perturbation to the system on the global kinase activity can be analyzed, along with the resulting effects on the phospho proteome and whole proteome; and the dataset can be analyzed by the AI-based REFS™ system.

For example, HUVEC cells grown under various conditions can be used to simulate angiogenic and normal (e.g., non-angiogenic) states. As angiogenesis does not occur in adults except under specific circumstances, e.g., pregnancy, wound healing, etc., the presence of angiogenic markers identified by using this approach may be useful as markers indicative of a disease state, e.g., cancer, rheumatoid arthritis, age-related macular degeneration, or diabetic retinopathy.

This illustrative example combines the power of (i) cell biology, (ii) integrated proteomics platforms, and (iii) an informatics platform that generates causal protein networks to delineate the role of post-translational modification, e.g., phosphorylation, and enzymes that partake in such mechanisms, e.g., kinases, in angiogenesis. In particular, this approach incorporates activity-based proteomics employing ATP binding domain enrichment probes and phospho-proteome mapping of total proteins in angiogenesis models.

Comparative proteomics, phospho proteome, and enzyme activity data are integrated into the AI-based REFS™ informatics platform. Causal networks of protein interaction, specifically from a functional standpoint, namely kinase/enzyme activity and potential targets that kinases can phosphorylate, are then generated. In addition, using cellular functional read-out, enzymes/kinases that modulate phosphorylation of targets and mechanistically drive pathophysiological cellular behavior are determined. The illustrative implementation described herein facilitates global characterization of cellular responses, insights into mechanisms of angiogenesis, and potential targets/biomarkers for clinical management of angiogenesis.

As an illustrative example, cells representing normal cells and angiogenic cells are selected for comparison. As demonstrated herein, HUVEC cells when grown in sub-confluent cultures show characteristics of angiogenesis, whereas confluent HUVEC cells do not. Treatment of sub-confluent cultures of HUVEC cells with CoQ10 shifts the HUVEC cells to a non-angiogenic state as demonstrated herein. As with the proteomics methods provided above, methods for analysis of enzymatic activity can include pairwise analysis of HUVEC cells grown under any conditions, and optionally further analysis of the results from the pairwise comparison with results from a third data set.

As an exemplary embodiment, equivalent numbers of HUVEC cells cultured in confluent and non-confluent cultures are harvested and the cells are enriched for the presence of peptides of interest, e.g., phosphopeptides. A comparative analysis is performed as in Example 5 to detect changes in enzymatic activity associated with angiogenesis.

Incorporation by Reference

The contents of all cited references (including literature references, patents, patent applications, GenBank Numbers in the version available on the date of filing the instant application, and websites) that may be cited throughout this application are hereby expressly incorporated by reference in their entirety, as are the references cited therein. The practice of the present invention will employ, unless otherwise indicated, conventional techniques of protein formulation, which are well known in the art.

Equivalents

The invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The foregoing embodiments are therefore to be considered in all respects illustrative rather than limiting of the invention described herein. Scope of the invention is thus indicated by the appended claims rather than by the foregoing description, and all changes that come within the meaning and range of equivalency of the claims are therefore intended to be embraced herein.

Appendix A: Amino acid and cDNA sequences for relevant proteins 1 . "COFl: Treacher Collins—Franceschet :i syndrome 1 LOCUS WW_OOO356 AA: MAdARKR?*..PLIYH{ .RAGYVRAAR4VK4QSGQKCFLAQPV .DIY HWQQiS*-GRKRKA**DAALQAKKTQVSDPISLS*SS L 444 4A4A4iAKA AASTVSSVLGAD-PSSMK*KAKA*i4KAGKiGNSWPHPATGKTVAW--SGKSP?KS i-VS*i*44GSVPAEGAAAKPGMVSAGQADSSS.DiSSSSD4iDV4VKA54LL JQVRAASAPAKG"PGKGATPAPPGKAGAVASQiKAGKP L *DS*SSSL 45535444 JQAKASGK"SQVGAASAPAKESPRKGAAPAPPGKTGPAVAKAQAGKQ4*DSQ DS L 44APAQAKPSGKAPQVRAASAPAKESPQKGAAPAPPRKTGPAAAQVQVG KQ L L DSQSSS L 4SDSDR4ALAAWVAAQVKPLGKSPQVKPASTMGWGPJGKGAGPVPPG KVGPATPSAQVGKW L 43545554*SSDSSDGLVPLAVAPAQ4KS-GNI-QAKPTSSPA KGPPQKAGPVAVQVKAEKPMDNS455**SSDSADS**APAAW AAQAKPALKIPQTKA TTASAKVAPVQVGTQAPQKAGTATSPAGSSPAVAGG QRPA*DSSSSL 4535 4 4 4K G-AViVGQAKSVGKGLQVKAASVPVKGSnGQGTAPVnPGK GPiViQVKALKQ 435455 L 44535*4AAASPAQVKTSVKKTQAKAVPAAARAPSAKG"ISAPGKVVTAAA QAKQQSPSKVKPPVRNPQNSTVLARGPASVPSVGKAVA AAQAQLGP U U] G) U] U] 4‘s DS***A*i-AQVKPSGKTHQIRAALAPAKESPRKGAAP"PPGKTGPSAAQAGKQDDSG DS3G4APAAVLSAQVIKPPLIFVDPNRSPAGPAATPAQAQAASTPRKAQASE STA?SSSS*S*D4DVIPAiQCLiPGIRiNVViMPiAiPRIAPKASMAGASSSKESSRI SDGKKQLGPA QVSKKNPASLP-TQAA-KVLAQKASTAQPPVARTQPSSGVDSAVGi4 PATSPQSTSVQAKG"NK-?KPK-P*VQQALKAP*SSDDS*DSSDSSSGS4*DG4GPQG AKSAHinGPiPS?idiLV**iAA*SS* DDVVAPSQS. -SGYM PGL PAVSQASKA"? KLDSSPSVSS"LAAKDDP DGKQEAKPQQAAGWLSPKLGGK LAASG PQKSRKPKKGA GNPQASTLAJQSWI"QC. 
.GQPWP N'TAQVQASVVKV .14.L4Q4RKKVVJTTKESSR KGW TSRKRK-SG DQPAA QTP QSKKKKK .GAG‘GG‘ASVSP Si S KGKAKRDKASG DVKEKKGKGSLGSQGAK 34p44 dLQKGWGTVEGG DQSVPKSKKEKKKSDKQKKDKEKK EKKKKAKKASTK DSESPSQKKKKKKKKTAEQTV CDNA: gaaagaggag ccggaag:gt ggcgcgcgag gggc gcgagggaag gcgg 6; ggactaaggc ggggcgtgca ggtagccggc cggccggggg tcgcgggtat ggccgaggcc 12; aggaagcggc gggagctact tcccctgatc taccaccatc tgctgcgggc tggctatgtg 18; Cgtgcggcgc gggaagtgaa ggagcagagc ggccagaagt tggc tcagcccgta 24; acccttctgg acatctatac gcaa caaacctcag agcttggtcg gaagcggaag ; gcagaggaag atgcggcact gcaagctaag cgtg accc catcagcacc 36; agct cggaagagga ggaagaagca gaagccgaaa aagc caccccaaga 42; ctagcatcta ccaactcctc agtcctgggg gcggacttgc catcaagcat gaaagaaaaa 48; gccaaggcag agacagagaa agctggcaag actgggaatt ccatgccaca ccctgccact 54; gggaagacgg tggccaacct tctttctggg aagtctccca cagc agagccctca 60; acta cgttggtctc agaaactgag gaggagggca gcgtcccggc ctttggagct 66; gctgccaagc ctgggatggt gtcagcgggc caggccgaca gctccagcga ggacacctcc 72; agtg atgagacaga cgtggaggta aaggcctctg ttct ccaggtcaga 78; gctgcctcag cccctgccaa ggggacccct gggaaagggg ctaccccagc accccctggg 84; gggg ctgtagcctc ccagaccaag gcagggaagc cagaggagga ctcagagagc 90; agcagcgagg agtcatctga cagtgaggag gagacgccag ctgccaaggc cctgcttcag 96; gcgaaggcct caggaaaaac ctctcaggtc ggagctgcct cagcccctgc caaggagtcc L02; cccaggaaag gagctgcccc agcgccccct acag ggcctgcagt tgccaaggcc L08; caggcgggga agcgggagga ggactcgcag agcg aggaatcgga cagtgaggag L14; gaggcgcctg ctcaggcgaa gccttcaggg aaggcccccc gagc cgcctcggcc L20; cctgccaagg agtcccccag gaaaggggct gccccagcac ctcctaggaa aacagggcct L26; gcagccgccc aggtccaggt ggggaagcag gaggaggact caagaagcag cagcgaggag L32; tcagacagtg acagagaggc agcc atgaatgcag ctcaggtgaa gcccttgggg L38; aaaagccccc aggtgaaacc tgcctctacc atgggcatgg tggg gaaaggcgcc L44; ggcccagtgc ggaa ggtggggcct gcaaccccct cagcccaggt ggggaagtgg L50; gaggaggact cagagagcag tagtgaggag tcatcagaca atgg agaggtgccc L56; acagctgtgg ccccggctca ggaaaagtcc ttggggaaca tcctccaggc caaacccacc L62; tccagtcctg 
ccaaggggcc ccctcagaag cctg tagccgtcca ggtcaaggct L68; gaaaagccca tggacaactc ggagagcagc tcat cggacagtgc ggacagtgag L74; gaggcaccag cagccatgac tgcagctcag gcaaaaccag ctctgaaaat tcctcagacc L80; aaggcctgcc aaac caataccact gcatctgcca aggtcgcccc tgtgcgagtg L86; ggcacccaag ccccccggaa agcaggaact gcgacttctc cagcaggctc atccccagct L92; gtggctgggg gcacccagag accagcagag gattcttcaa gcagtgagga atcagatagt L98; gaggaagaga agacaggtct tgcagtaacc gtgggacagg caaagtctgt ggggaaaggc 204; ctccaggtga aagcagcctc agtgcctgtc aaggggtcct tggggcaagg tcca 210; gtactccctg cggg gcctacagtc acccaggtga aagctgaaaa gcaggaagac 216; tctgagagca gtgaggagga atcagacagt gaggaagcag ctgcatctcc agcacaggtg 222; aaaacctcag taaagaaaac ccaggccaaa gccaacccag ctgccgccag ttca 228; gcaaaaggga caatttcagc ccctggaaaa actg cagctgctca agccaagcag 234; aggtctccat ccaaggtgaa gccaccagtg agaaaccccc agaacagtac cgtcttggcg 240; aggggcccag catctgtgcc atctgtgggg aaggccgtgg ctacagcagc tcaggcccag 246; acagggccag aggaggactc agggagcagt gaggaggagt cagacagtga ggaggaggcg 252; gagacgctgg ctcaggtgaa gccttcaggg aagacccacc agatcagagc ggct 258; cctgccaagg agtcccccag ggct gccccaacac ctcctgggaa gcct 264; tcggctgccc aggcagggaa gcaggatgac tcagggagca gcagcgagga atcagacagt 270; gatggggagg caccggcagc tgtgacctct gcccaggtga ttaaaccccc tctgattttt 276; gtcgacccta atcgtagtcc agctggccca gctgctacac ccgcacaagc ccaggctgca 282; agcaccccga ggaaggcccg agcctcggag agcacagcca ggagctcctc ctccgagagc 288; gaggatgagg acgtgatccc cgctacacag tgcttgactc ctggcatcag aaccaatgtg 294; gtgaccatgc ccactgccca cccaagaata aaag ccagcatggc tggggccagc 300; agcagcaagg agtccagtcg agat ggcaagaaac aggagggacc agccactcag 306; gtgtcaaaga agaacccagc ttccctccca ctgacccagg ctgccctgaa ggtcctcgcc 312; cagaaagcca gtgaggctca gcctcctgtt accc agccttcaag tggggttgac 318; agtgctgtgg tccc tgcaacaagt ccccagagca cctccgtcca ggccaaaggg 324; accaacaagc tcagaaaacc taagcttcct gaggtccagc aggccaccaa agcccctgag 330; agctcagatg acagtgagga cagcagcgac agttcttcag ggagtgagga agatggtgaa 336; gggccccagg gggccaagtc cacg 
ctgggtccca ccccctccag gacc 342; ctggtggagg agaccgcagc cagc gaggatgatg tggtggcgcc atcccagtct 348; ctcctctcag gttatatgac acta accccagcca attcccaggc ctcaaaagcc 354; actcccaagc tagactccag cccctcagtt tcctctactc tggccgccaa agatgaccca 360; gatggcaagc aggaggcaaa gccccaacag gcagcaggca tgttgtcccc taaaacaggt 366; ggaaaagagg ctgcttcagg caccacacct cagaagtccc ggaagcccaa gaaaggggct 372; gggaaccccc aagcctcaac cctggcgctg caaagcaaca agtg cctcctgggc 378; caaccctggc ccctgaatga ggtg caggcctcag tggtgaaggt cctgactgag 384; ctgctggaac aggaaagaaa gaaggtggtg gacaccacca aggagagcag caggaagggc 390; tgggagagcc gcaagcggaa gctatcggga gaccagccag ctgccaggac ccccaggagc 396; aagaagaaga agaagctggg ggccggggaa gagg cctctgtttc cccagaaaag 102; acctccacga cttccaaggg gaaagcaaag agagacaaag caagtggtga tgtcaaggag 108; aagaaaggga aggggtctct tggctcccaa ggggccaagg acgagccaga agaggagctt 114; cagaagggga tggggacggt tgaaggtgga gatcaaagca acccaaagag caagaaggag 120; aagaagaaat ccgacaagag aaaaaaagac aaagaaaaaa aagaaaagaa gaagaaagca 126; aaaaaggcct caaccaaaga ttctgagtca ccgtcccaga agaaaaagaa gaaaaagaag Z32; aagacagcag ctgt atgacgagca ccagcaccag gcacagggat ttcctagccg Z38; agcagtggcc atccccatgc ctctgacctc caccgacctc tgcccaccat gggttggaac Z44; taaactgtta ccttccctcg ctccacagaa gaagacagcc agcttcaggg gtccctgtgc 150; tggccaagcc agtgagcctg ngggaggct ggtccaagga gaaagtggac cagctcccat 156; gacctcaccc cactccccca gacg atag atgtgtacag tatatgtatt 4621 tttttaagtg acctcctctc cttccacaga atgc ccaaaggcct cgggacttcc 4681 caccaccttg ctccacagat ccagctaggc ctgacctgtg cctcatcccg tgccgctcgg 4741 tctctggctg atcccgaggc tttgtcttcc tctcgtcagt tcttttggtt gtgttttttg 4801 tttttttttt aataactcaa aaaaaaaata aaagacttgg aggaagggtg caagctccca 4861 gtgcaaaaaa aaaaaaaaaa aa 2. 
TOPZA: Homo sapiens :opoisomerase AOCJS VM_001067 AA :ransla:ion="MTVSP-QPVN8NWQVNKIKKNTDAKKR-SVT?IYQKKTQLEHILL ARPDTYIGSVELVTQQMWVYDEDVGIVYRLV tVPGLYKIFD'I-VVAADWKQRDPKM‘J SCIRV IDP4WN-ISIWWNGKGIPVV*HKV4KMYVPA.1FGQ--TSSVYDDDEKKVTG GRVGYGAKLCWIESiKt1V41ASR4YKKWEKQ1WWDNWGRAG W4LKPENG‘DY1C11L FQPDASKFKWQSADKDIVA.WVRRAYDIAGS"KDVKVFLNGNKJPVKGFRSYVDMYLKL DK-D* GNS-KVIHTQVVH W‘VC-1W54KGEQQISEVNSIA SKGG?{V3YVADQIV/U TKAVDVVKKKVKGGVAVKA{QVKNiMWIbVVALILNPibDSQ K‘NMi-QPKSEGSiC C .STKFIKAAIGCGIVTSI-VWVKFKAQVQ4NKKCSAVKHWRIKGIPKADDAVDAGG? A51*C .1L1*GDSAK1-AVSG-GVVGRDKYGVFPLRGKI-NVRTAS{KQIM*VA*IV A IIKIVGLQYKKNY‘D‘DS-Ki-RYGKIWIWTDQDQDGSHIKGLAINFIH{NWPS--? A Rb-44t1 PIVKVSKWKQ‘WAEYS-P*b**WKSSiPNiKKWKVKYYKGLG SiSKLA WEYFADWKRHRIQFKYSGPEDDAAISLAFSKKQIDD?K*W.1NEW*DRRQRK--G-PT DYLYGQii EIVK*.1-bSVSDVLRSIPSMVDGLKPGQRKVAFTCFKRNDK? TVKVAQ-AGSVATWSSY{{GLMSLMW IIVJAQVFVGSVN.WL QPIGQFG"?AHGGK DSASPRYIFTW-SS-AR--FPPKDD{"-KF-YDDWQRV*P*WYIPIIPMV.1VGATGI GTGWSCKIPNFDVREIVVVIRR-MDG**P-PW-PSYKNEKG IddﬂAPNQYVISGEVAL I-WSiiI‘ISL -PV?1W1Q1YK*QV-*PM-VG1*K PP-IiDYR*Y{1D11VKbVVKW ‘A4RVG-{KVFK-QTS-"CVSWV4FD{VGC4KKYD"V-DI-RDFFT-R-KYY G-RKTW--GM-GATSAK-WNQARFI-TKIDGKIII*WKPKK*.1KV-IQRGYDSDPVK IAQQKVPD“ L V**SDN*K* *KSDSViDSGP EVYL.DWP-WYL1K*KK34LCR .?W*K*Q*-D1-K?KSPSDLWK*D-A1EI**-*AV*AK*KQD*QVG-PGKGGKAKGKK TV-PSPRGQRVIPRIil L WKA*A*KKVKKKIKV*V *GSPQ‘DGV*-*G-KQR.

R‘PGiKiKKQi14AtKPIKKGKKRWPWSDSESDRSSDLSWEDVPP?*1*PRRA TW)LDSD*DbSDtD‘KiDD‘DbVPSDASPPKTK"SPKLSWKELKPQKSVVSD .TADDVKGSVPLSSSPPAiHbPD*1*I1WPVPKKNVTVKKTAAKSQSSTS"TGAKKRA APKGTKRDPAJNSGVSQKPDPAKTKVRRKRKPSTSDDSDSNFEKIVSKAV"SKKSKGL‘J SDDF{MDFDSAVAPRAKSVRAKKPIKYLL *SD‘DD-b CDNA: 1 ga::ggctgg tctgcttcgg :aaa ggaaggttca gctc tcctaaccga 61 cgcgcgtctg tggagaagcg :cgg tctc gtggggtcct gcctgtttag 121 tcgctttcag ggttcttgag ccccttcacg accgtcacca tggaagtgtc accattgcag 181 aatg aaaatatgca agtcaacaaa ataaagaaaa atgaagatgc taagaaaaga 241 ctgtctgttg aaagaatcta tcaaaagaaa acacaattgg aacatatttt gctccgccca ; gacacctaca ttggttctgt ggaattagtg acccagcaaa tgtgggttta cgatgaagat 361 atta actataggga agtcactttt gttcctggtt tgtacaaaat ctttgatgag 421 attctagtta atgctgcgga caacaaacaa agggacccaa aaatgtcttg agtc 481 acaattgatc cggaaaacaa tttaattagt atatggaata atggaaaagg tattcctgtt 541 gttgaacaca aagttgaaaa gatgtatgtc ccagctctca tatttggaca gctcctaact 60; tctagtaact atgatgatga tgaaaagaaa gtgacaggtg gtcgaaatgg ctatggagcc 66; aaattgtgta acatattcag taccaaattt actgtggaaa cagccagtag agaatacaag 72; aaaatgttca aacagacatg gatggataat atgggaagag ctggtgagat ggaactcaag 78; cccttcaatg gagaagatta tacatgtatc acctttcagc ctgatttgtc taagtttaaa 84; atgcaaagcc tggacaaaga tattgttgca ctaatggtca gaagagcata tgatattgct 90; ggatccacca aagatgtcaa agtctttctt aatggaaata aactgccagt aaaaggattt 96; cgtagttatg tggacatgta tttgaaggac aagttggatg aaactggtaa ctccttgaaa L02; gtaatacatg aacaagtaaa ccacaggtgg gaagtgtgtt taactatgag aggc L08; tttcagcaaa ttagctttgt caacagcatt gctacatcca gcag acatgttgat L14; tatgtagctg atcagattgt gactaaactt gttgatgttg tgaagaagaa gggt L20; ggtgttgcag taaaagcaca tcaggtgaaa aatcacatgt ggatttttgt aaatgcctta L26; attgaaaacc caacctttga ctctcagaca aaagaaaaca tgactttaca acccaagagc L32; tcaa catgccaatt gagtgaaaaa aaag ctgccattgg ctgtggtatt L38; gtagaaagca tactaaactg ggtgaagttt aaggcccaag tccagttaaa caagaagtgt L44; tcagctgtaa aacataatag aatcaaggga attcccaaac tcgatgatgc caatgatgca L50; gggggccgaa actccactga gtgtacgctt 
atcctgactg agggagattc agccaaaact L56; ttggctgttt caggccttgg tggg agagacaaat atggggtttt ccctcttaga L62; atac tcaatgttcg agaagcttct cataagcaga tcatggaaaa tgctgagatt L68; aacaatatca tcaagattgt gggtcttcag tacaagaaaa aaga ttca L74; ttgaagacgc atgg gaagataatg attatgacag atcaggacca agatggttcc L80; cacatcaaag gcttgctgat taattttatc aact ggccctctct tctgcgacat L86; cgttttctgg aggaatttat cactcccatt gtaaaggtat acaa gcaagaaatg L92; gcattttaca gccttcctga atttgaagag tggaagagtt ctactccaaa tcataaaaaa L98; tggaaagtca acaa aggtttgggc accagcacat caaaggaagc taaagaatac 204; tttgcagata tgaaaagaca tcgtatccag ttcaaatatt ctggtcctga agatgatgct 210; gctatcagcc tggcctttag caaaaaacag atagatgatc aatg gttaactaat 216; ttcatggagg atagaagaca gtta cttgggcttc ctgaggatta cttgtatgga 222; caaactacca catatctgac atataatgac ttcatcaaca aggaacttat cttgttctca 228; aattctgata gatc ttct atggtggatg gtttgaaacc aggtcagaga 234; aaggttttgt ttacttgctt caaacggaat gacaagcgag aagtaaaggt tgcccaatta 240; gctggatcag aaat gtcttcttat catcatggtg agatgtcact aatgatgacc 246; attatcaatt tggctcagaa ttttgtgggt aatc taaacctctt gcagcccatt 252; ggtcagtttg gtaccaggct acatggtggc tctg ctagtccacg atacatcttt 258; acaatgctca gctctttggc tcgattgtta tttccaccaa aagatgatca cacgttgaag 264; tttttatatg atgacaacca gcgtgttgag cctgaatggt acattcctat tattcccatg 270; gtgctgataa atggtgctga aggaatcggt actgggtggt cctgcaaaat ccccaacttt 276; gatgtgcgtg aaattgtaaa taacatcagg cgtttgatgg atggagaaga gcca 282; atgcttccaa gttacaagaa cttcaagggt actattgaag aactggctcc aaatcaatat 288; agtg gtgaagtagc tattcttaat tctacaacca ttgaaatctc agagcttccc 294; acat ggacccagac atacaaagaa caagttctag aacccatgtt gaatggcacc 300; gagaagacac ctcctctcat aacagactat agggaatacc atacagatac cactgtgaaa 306; gtga agatgactga agaaaaactg gcagaggcag agagagttgg actacacaaa 312; gtcttcaaac tccaaactag tctcacatgc aactctatgg tgctttttga ccacgtaggc 318; tgtttaaaga aatatgacac ggtgttggat attctaagag acttttttga actcagactt 324; aaatattatg gattaagaaa gctc ctaggaatgc ttggtgctga atctgctaaa 330; ctgaataatc aggctcgctt 
tatcttagag gatg gcaaaataat cattgaaaat 336; aagcctaaga aagaattaat taaagttctg attcagaggg gatatgattc ggatcctgtg 342; aaggcctgga aagaagccca gcaaaaggtt ccagatgaag aagaaaatga agagagtgac 348; aacgaaaagg aaaa gagtgactcc gtaacagatt caac cttcaactat 354; cttcttgata tgcccctttg gtatttaacc aaggaaaaga aagatgaact gcta 360; agaaatgaaa aagaacaaga gctggacaca ttaaaaagaa agagtccatc agatttgtgg 366; aaagaagact tggctacatt tattgaagaa ttggaggctg ttgaagccaa ggaaaaacaa 372; gatgaacaag tcggacttcc tgggaaaggg gggaaggcca aaaa aacacaaatg 378; gctgaagttt tgccttctcc gcgtggtcaa agagtcattc cacgaataac catagaaatg 384; aaagcagagg cagaaaagaa aaag aaaattaaga atgaaaatac tgaaggaagc 390; cctcaagaag atggtgtgga actagaaggc ctaaaacaaa gattagaaaa gaaacagaaa 396; agagaaccag gtacaaagac aaagaaacaa actacattgg catttaagcc aaaa 402; ggaaagaaga gaaatccctg gtctgattca gata ggagcagtga cgaaagtaat 408; tttgatgtcc ctccacgaga aacagagcca cggagagcag caacaaaaac aaaattcaca 414; atggatttgg atga agatttctca gattttgatg aaaaaactga tgatgaagat 420; tttgtcccat cagatgctag tccacctaag accaaaactt ccccaaaact tagtaacaaa 126; aaac cacagaaaag tgtcgtgtca gaccttgaag ctgatgatgt taagggcagt Z32; ctgt cttcaagccc tcctgctaca catttcccag atgaaactga aattacaaac Z38; ccagttccta aaaagaatgt gacagtgaag aagacagcag caaaaagtca gtcttccacc Z44; tccactaccg gtgccaaaaa aagggctgcc ccaaaaggaa ctaaaaggga tccagctttg 150; aattctggtg tctctcaaaa gcctgatcct gccaaaacca agaatcgccg caaaaggaag 156; ccatccactt ctgatgattc tgactctaat tttgagaaaa ttgtttcgaa agcagtcaca Z62; agcaagaaat ccaaggggga gagtgatgac ttccatatgg actttgactc ggct Z68; cctcgggcaa aatctgtacg ggcaaagaaa aagt acctggaaga gtcagatgaa Z74; gatgatctgt tttaaaatgt gaggcgatta ttttaagtaa ttatcttacc aagcccaaga 180; ctggttttaa agttacctga agctcttaac ttcctcccct ctgaatttag tttggggaag 186; ttag tacaagacat caaagtgaag taaagcccaa gtgttcttta tata Z92; atactgtcta aatagtgacc atctcatggg cattgttttc gctt tgtctgtgtt Z98; ttgagtctgc tttcttttgt ctttaaaacc tgatttttaa gttcttctga actgtagaaa 504; tagctatctg tcag cgtaaagcag tgtgtttatt aaccatccac aaaa 510; 
ctagagcagt ttgatttaaa actc ttcctccttt tctactttca gtagatatga 516; gatagagcat aattatctgt tttatcttag acat aatttaccat cagatagaac 522; tttatggttc tagtacagat actctactac actcagcctc ttatgtgcca tctt 528; taagcaatga gaaattgctc atgttcttca tcttctcaaa tcatcagagg aaaa 534; acactttggc tgtgtc:ata acttgacaca gtcaatagaa tgaagaaaat agtt 540; atgtgattat ttcagc:ctt gacctgtccc ctctggctgc ctctgagtct gaatctccca 546; aagagagaaa ccaatt:cta agaggactgg attgcagaag actcggggac aacatttgat 552; ccaagatctt aaatgt:ata ttgataacca tgctcagcaa tgagctatta gattcatttt 558; gggaaatctc cataat :tca atttgtaaac tttgttaaga cctgtctaca ttgttatatg 564; tgtgtgactt gagtaa:gtt atcaacgttt ttgtaaatat tgtt tttctattag 570; ctaaattcca acaatt:tgt actttaataa aatgttctaa acattgcaac cca 3. CAWKZA: CAMKZA calcium/calmodulin—dependent protein kinase II alpha [ Homo sapiens ] Locus: NW_015981.3 (isoform 1) AA /translation="WAiIiCini A. *YQ .t**-GKGAtSVVRRCVKVLAGQEYAAKII NTKKLSARDHQKL‘R‘AQICR--K{PNIVRL IS‘4GHHY-IbD-VLGG4Lb431V A?*YYS*ADASHCIQQIuTAV-{CiQMGVVHQDuKPTVLLLASKLKGAAVKAADFGLA I*V*G*QQAWEGEAGLPGYLSPﬂVuRKDPYGKPVDoWACGVI-YIu-VGYPPFWDEDQA.

HQAYQQIKAGAYDEPSP‘WDLVLP‘AKDLINKMuTIVPSKRITAAEALKHPWIS{QST VASCM{QQETVDCLKKFWARRKuKGAIuTTMuATQVFSGGKSGGNKKSDGVKKRKSSS SVQ.M*SS*SLV iI *D‘DiKVRKQ *IIKV “Q J DRESYTKMCDPGMTAF PTA.GWEV‘G. DbHRbe LNLWSRNSKPVH"TI .NPHIHLMGDTSACIAYIRITQYL AGGI PRLAQS A. 4 RVWH a QDGKWQIVHFHRSGAPSVLPH CDNA: catg gggacctgga tgctgacgaa ggctcgcgag gctgtgagca gccacagtgc 6; cctgctcaga agccccgggc tcgtcagtca aaccggttct ctgtttgcac tcggcagcac 12; gggcaggcaa gtggtcccta ggttcgggag cagagcagca gcgcctcagt cctggtcccc 18; cagtcccaag cctcacctgc agcg ccaggatggc caccatcacc tgcacccgct 24; tcacggaaga gtaccagctc ttcgaggaat tgggcaaggg ctcg gtggtgcgaa ; ggtgtgtgaa ggtgctggct ggccaggagt atgctgccaa gatcatcaac acaaagaagc 36; tgtcagccag agaccatcag aagctggagc gtgaagcccg catctgccgc ctgctgaagc 42; accccaacat cgtccgacta catgacagca tctcagagga gggacaccac tacctgatct 48; tcgacctggt cactggtggg gaactgtttg aagatatcgt ggcccgggag tattacagtg 54; aggcggatgc cagtcactgt atccagcaga tcctggaggc tgtgctgcac tgccaccaga 60; tgggggtggt ggac ctgaagcctg agaatctgtt ctcc aagg 66; gtgccgcagt gaagctggca gactttggcc tggccataga ggtggagggg gagcagcagg 72; catggtttgg gtttgcaggg actcctggat atctctcccc agaagtgctg cggaaggacc 78; cgtacgggaa gcctgtggac ctgtgggctt gtggggtcat cctgtacatc ctgctggttg 84; cccc gttctgggat gaggaccagc accgcctgta ccagcagatc aaagccggcg 90; cctatgattt cccatcgccg gaatgggaca ctgtcacccc ggaagccaag gatctgatca 96; ataagatgct gaccattaac ccatccaaac gcatcacagc agcc cttaagcacc L02; cctggatctc ctcc accgtggcat cctgcatgca cagacaggag accgtggact L08; gcctgaagaa gttcaatgcc aggaggaaac tgaagggagc cattctcacc acgatgctgg L14; ggaa cttctccgga gggaagagtg acaa gaagagcgat ggtgtgaaga L20; aaagaaagtc cagttccagc gttcagttaa tggaatcctc agagagcacc aacaccacca L26; tcgaggatga agacaccaaa gtgcggaaac aggaaattat aaaagtgaca gagcagctga L32; ttgaagccat aagcaatgga gattttgagt cctacacgaa gatgtgcgac cctggcatga L38; cagccttcga acctgaggcc ctggggaacc tggttgaggg cctggacttc catcgattct L44; aaaa cctgtggtcc cggaacagca agcccgtgca caccaccatc ctgaatcccc L50; acatccacct gatgggcgac gcct 
gcatcgccta catccgcatc acgcagtacc L56; ctgg cggcatccca cgcaccgccc agtcggagga tgtc tggcaccgcc L62; gggatggcaa atggcagatc gtccacttcc acagatctgg ggcgccctcc cccc L68; actgagggac caggctgggg tcgctgcgtt ccgc agagatccac tctgtccgtg 174; gagtggagct gctggttctc ccaggtggat tttgctggaa ttctcccatg tcatcacccc 180; accaccgtca cttctgtacc tgcatcaaga aaacctgctt gttcacaaaa gtcatcgcaa 186; cttcagagcg aacggccaca tctccccacc cccc accctctccc ctgccaggct 192; ggggcttcct caggcatggg tgtccacagc actggccccc tctccccagc ctcagctgct 198; gtccgcctga tctgtcttgg gctgtaggct agaatgcccg ggctggtgcc caccaggggc 204; tggggagaag gaggggtggc atgatgagga aggcagcatc cgtccgtccc tctcccagac 210; ctctcctctt ccagtgtccc cggggaaggg cagatgacac tcccttcccc ctaagccaac 216; cgcactgaag ggag atac gccaggagcc tcctgcctca aagtgctccc 222; ctaagtcttc ctgt gctgacctca gggtggtctg acccttccct ngtgtgggg 228; gatgtggccc tctcaggtgc ccctacttgc cttc cttctggtga agtccacctc 234; caacattaac ctgcccaccc cacccccgtc atccctggag aattccagct ttgtcgtatc 240; tcagagaggg attg tttttggggg gcaaaagaaa gcaacgttta actt 246; ctacttggac cgcatgcctt tttatagcca aatttctgtg tatttcgtaa atggatttcg 252; cgttaatgga tatttatgta ataactagac ttctcagatt attgtgagaa gggtcaggtt 258; ggaaggggtg taggaagagg ggtgaggggt agtttttttc tgttctagtt tttttttttt 264; tttttgtcat ggtg gaccttgtca cctgtggtta ttggggccaa ggtggactca 270; gctccgggga gaagggcctc tctgccattt caag gtgagctgac acaggcgttc 276; cttttgggac tgtggaagca tcagatgcca gcactgactc aggaacagca ggca 282; gagaggagga gggaggctgt ggaa atacctggac ttgc ttccctcgca 288; aactggggtc ttctctaccg aacttcccag gatttcatct caccatatct gtgtgccgcc 294; cccagcaccc cccacccacc tctggggggc ccgtgagcgt gtgtcttcat tgcctctctc 300; cccttggcgt ctgatgacca cagcaaagca ctgggaattt ctactcttca atcc 306; tgcagcctcg catt ctctctttct tttcctcttt ccctctttcc ctgggattga 312; ctctgagtgg aataccttgg cacatccact aggatctact gtctgcactg ttttctttgc 318; atgactttat acgcagtaag gaaa aaaa agaagaaaac actcaacaaa 324; accaatctac atgttttgga ctaaaaaaaa aaatagaggt tgtattctca gtgtccgact 330; cggaattatg ttgctgcctc 
tctgtgcttt tggcctctgt gtggccgtgt tttgccagca 336; actg tcccctctgg aggattttag gggaggaaga gccacgtccc cagggattgg 342; aggaggctcc ggtaccctcg accctcctgg gtgttggttg gagcagaact ggtgaggatg 348; tttgatccga gattttctga gctctcccca atcaccagct gtctgctggg ttcttttctc 354; aagtcctgct gcccaggccc aggtgagaca ggcaacgcca ggtctgcagg ccaggagaga 360; ccag gcctcctggt ttccaagctg gtccatcact ggcctctgtc cttggcagag 366; accttgctgc ccaggcccag gggcaggctc ttggcctgcc ccag agggcttccc 372; agtaaggccc agtgatccca ttatcccagg ggcaaaacca cctgtcccct tttgagctgc 378; cagttcccta cagccatccc cagtcaaggg tgagggtgtg gccttcacca ggggctgctg 384; taattaccga gcaaggtctg agctcttctt cagcctcagt tccctcattg gttaaaaggg 390; ttctttgttc ccatccagcc ggag caaacgtctg tgaa gcctaattta 396; ggaa ctggcaggga ctgg ctggactcct gtttacttct agacctggtc 102; aggctccatc ccctccccca cctgcccctg attcccctcg tcggtgcctg tcaactgctt 108; ttcagcagtg aggg gaaagagcag tgatttgggg tgagtaggct tcaattccca z gctctgacca gacttgctgt gtgaccttgg gcaagttcct ttccctcttt ggagcttggt 120; ttccctgcca gaggaaactg agctggagga gcctgaggtc ctgcctttca ttggctgaca 126; cacctcctgt ccactgtgtc actctccaag agaa gtggaggcag atcgctaccc 132; caggctgaga tggcccccac ggcc acgcctgtgg agcc acctggtgcc 138; accacagggc accagggatg atcctgatgt cagg ggagactcac agaaaaatct Z44; gcccagagcc cacc agacaaactc tgtgctcctc caaaacatcc tttagatgca 150; aaataataat aataataata ataaataaat aaataaaaat ccaaacccaa gtcaaaacct 156; tggctccagc atgaaaacac gtttacagga aagtgttctc ctgggtttgt gcccaccatg Z62; gtgcgaatcc tgacccaagg cctcctgtct aaag ggagaccctt ttgggggatg Z68; agtttgccag actccccgtg ctggtttctt tgttactatt gggt tttgttttag Z74; ttcttttt:t ttttcttttc ttttttaaaa atatgtggct gtgaacttga atgaacactg Z80; ctcaaact:t ctgctattgg ggggggcggg tgggatggga agaaggggcg tttgttttat Z86; tcttggtg:t ttcagtgcaa taaatagcta caaacttctg tgcaaaaaaa JOCUS WW_171825 (isoform 2) AA / translati Qt]. A. *YQ-t** .GKGAESVVR QCVKVLAGQEYAAKII DHQK. {PNIVRL {3515* *GHHY-Ib) .ViGG 4Lb‘DIV DASHCIQQIETAV. 
{QMGVVH QD-KP4 VLLLASKLKGAAVKAADFGLA *QQAWEGEAGiPGYLSPTV .RK DPYGKPVJAWACGVI .YI .VGYPPFW DEDQ QQIKAGAY DEPSP *WD ViP *AK D-INKW .TIVPSKRI AALALKHPWISi KS" KKFVAR QK .KGAI .TTM-ATQVFSGGKSGGNKKS DGVK*SS*S u 1KVRKQ *IIKV “Q .1 *AISWGDFESYTKMCDPGWiAb 4? *ALGWLV'G.‘J HVLWS QNSKPVH I .NPHI SACIAYIRI"QY 4 DAGGIP QTAQSL‘J N DGKWQIVHF {RSGAPSVLP { CDNA: 1 gg :gcca :g gggacctgga tgctgacgaa ggctcgcgag gctg :gagca gccacagtgc 61 CC :gctcaga agccccgggc tcgtcagtca aaccggttct ctgt :tgcac tcggcagcac 12; gggcaggcaa ccta ggttcgggag cagagcagca gcgcctcagt cctggtcccc 18; cagtcccaag cctcacctgc ctgcccagcg ccaggatggc caccatcacc tgcacccgct 24; tcacggaaga gtaccagctc ttcgaggaat tgggcaaggg ctcg gtggtgcgaa ; ggtgtgtgaa ggtgctggct ggccaggagt atgctgccaa gatcatcaac acaaagaagc 36; tgtcagccag agaccatcag aagctggagc gtgaagcccg catctgccgc ctgctgaagc 42; acat cgtccgacta catgacagca tctcagagga gggacaccac tacctgatct 48; tcgacctggt tggg gaactgtttg aagatatcgt ggcccgggag tattacagtg 54; aggcggatgc cagtcactgt atccagcaga tcctggaggc tgtgctgcac caga 60; tgggggtggt ggac ctgaagcctg agaatctgtt gctggcctcc aagctcaagg 66; gtgccgcagt gaagctggca gactttggcc tggccataga ggtggagggg gagcagcagg 72; catggtttgg gtttgcaggg actcctggat atctctcccc agaagtgctg cggaaggacc 78; cgtacgggaa gcctgtggac ctgtgggctt gtggggtcat catc ctgctggttg 84; cccc gttctgggat gaggaccagc accgcctgta ccagcagatc aaagccggcg 90; cctatgattt cccatcgccg gaatgggaca ctgtcacccc ggaagccaag gatctgatca 96; ataagatgct gaccattaac ccatccaaac gcatcacagc tgccgaagcc cttaagcacc L02; tctc gcaccgctcc accgtggcat cctgcatgca cagacaggag accgtggact L08; gcctgaagaa gttcaatgcc aggaggaaac tgaagggagc cattctcacc acgatgctgg L14; ccaccaggaa cttctccgga gggaagagtg ggggaaacaa gaagagcgat ggtgtgaagg L20; aatcctcaga gagcaccaac accaccatcg aggatgaaga caccaaagtg cggaaacagg L26; aaattataaa agtgacagag cagctgattg aagccataag caatggagat tttgagtcct L32; acacgaagat gtgcgaccct ggcatgacag ccttcgaacc tgaggccctg gggaacctgg L38; ttgagggcct ggacttccat tatt ttgaaaacct gtggtcccgg aacagcaagc L44; ccgtgcacac 
caccatcctg aatccccaca tccacctgat cgag tcagcctgca L50; tcgcctacat ccgcatcacg cagtacctgg acgctggcgg acgc accgcccagt L56; cggaggagac ccgtgtctgg caccgccggg atggcaaatg gcagatcgtc cacttccaca L62; gatctggggc gccctccgtc cact gagggaccag gctggggtcg ctgcgttgct L68; gtgccgcaga gatccactct gtccgtggag tggagctgct ggttctccca ggtggatttt L74; gctggaattc tcccatgtca tcaccccacc accgtcactt ctgc atcaagaaaa L80; cctgcttgtt cacaaaagtc atcgcaactt gaac ggccacatct ctct L86; cacccccacc ctctcccctg tggg gcttcctcag gcatgggtgt ccacagcact 192; ggccccctct ccccagcctc agctgctgtc cgcctgatct gtcttgggct gtaggctaga 198; atgcccgggc tggtgcccac caggggctgg ggagaaggag gggtggcatg atgaggaagg 204; cagcatccgt ccgtccctct cccagacctc tcctcttcca ccgg ggaagggcag 210; atgacactcc cttcccccta agccaaccgc actgaaggag tggggagaag agcatacgcc 216; aggagcctcc aaag tgctccccta agtcttcttc ctcctgtgct gacctcaggg 222; tggtctgacc cttccctcgg tgtgggggat ctct caggtgcccc tacttgcttt 228; ctgcttcctt aagt ccacctccaa cattaacctg cccaccccac ccccgtcatc 234; cctggagaat tccagctttg tcgtatctca gagagggaat gttt ttggggggca 240; aaagaaagca acgtttaggt atcacttcta cttggaccgc tttt atagccaaat 246; gtat ttcgtaaatg gcgt taatggatat ttatgtaata actagacttc 252; tcagattatt gtgagaaggg tcaggttgga aggggtgtag gaagaggggt gaggggtagt 258; ttttttctgt tctagttttt tttttttttt ttgtcatctc tgaggtggac cttgtcacct 264; gtggttattg gggccaaggt ggactcagct ccggggagaa gggcctctct gccatttcgg 270; tcccaaggtg agctgacaca ggcgttcctt ttgggactgt ggaagcatca gatgccagca 276; ctgactcagg aacagcaagt cagggcagag aggg aggctgtcag gatggaaata 282; cctggacttt tctttgcttc cctcgcaaac tggggtcttc tctaccgaac ttcccaggat 288; tcac catatctgtg tgccgccccc agcacccccc acccacctct cccg 294; tgagcgtgtg tcttcattgc ctctctcccc ttggcgtctg atgaccacag caaagcactg 300; ggaatttcta ctcttcatgc ctgc agcctcgggt tcgcattctc tctttctttt 306; cctctttccc tctttccctg ggattgactc tgagtggaat accttggcac tagg 312; atctactgtc tgcactgttt tctttgcatg tacg cagtaagtat gttgaaaaca 318; aacaaaaaga agaaaacact caacaaaacc aatctacatg ttttggacta aaaaaaaaaa 324; 
tagaggttgt attctcagtg tccgactcgg aattatgttg ctgcctctct gtgcttttgg 330; cctctgtgtg gccgtgtttt gccagcatga gatactgtcc cctctggagg attttagggg 336; aggaagagcc acgtccccag ggattggagg aggctccggt accctcgacc ctcctgggtg 342; ttggttggag cagaactggt gaggatgttt gatccgagat tttctgagct ctccccaatc 348; accagctgtc tgctgggttc ttttctcaag tcctgctgcc caggcccagg tgagacaggc 354; aacgccaggt ctgcaggcca ggagagatgc tgcccaggcc tcctggtttc caagctggtc 360; catcactggc ctctgtcctt ggcagagacc ttgctgccca ggcccagggg caggctcttg 366; gcctgcccca ggcccagagg cagt aaggcccagt gatcccatta tcccaggggc 372; aaaaccacct tttt gagctgccag acag ccatccccag tcaagggtga 378; gggtgtggcc ttcaccaggg gctgctgtaa agca aggtctgagc tcttcttcag 384; cctcagttcc ctcattggtt aaaagggttc tttgttccca tccagccgat gaaggagcaa 390; acgtctggct atgtgaagcc taatttacct gcaggaactg gcagggatag tcactggctg 396; gactcctgtt tacttctaga cctggtcagg ctccatcccc tcccccacct gcccctgatt 102; cccctcgtcg gtgcctgtca actgcttttc agcagtggac tgcaggggaa agagcagtga 108; tttggggtga gtaggcttca attcccagct ctgaccagac ttgctgtgtg accttgggca 114; agttcctttc cctctttgga tttc cctgccagag gaaactgagc tggaggagcc 120; tgaggtcctg cctttcattg gctgacacac ctcctgtcca ctgtgtcact ctccaagtgc 126; agtg gaggcagatc gctaccccag atgg cccccactgt gaaggccacg 132; cctgtgggtg ggcagccacc tggtgccacc acagggcacc agggatgatc ctgatgtggc 138; aggcagggga gactcacaga aaaatctgcc cagagcctac cctcaccaga caaactctgt gctcctccaa cttt agatgcaaaa taataataat aataataata taaaaatcca aacccaagtc ttgg ctccagcatg aaaacacgtt tgttctcctg tgcc ggtg cgaatcctga cccaaggcct tccc Z62; ttcaaaggga gacccttttg ggggatgagt ttgccagact gctg gtttctttgt Z68; tactatttgt ttggggtttt gttc tttttttttt tcttttcttt tttaaaaata Z74; tgtggctgtg aatg aacactgctc aaactttctg ctattggggg gggcgggtgg 180; gatgggaaga aggggcgttt gttttattct tggtgttttc agtgcaataa atagctacaa 186; acttctgtgc aaaaaaaaaa aaaaa 4. 
CDK1: CDK1 cyclin-dependent kinase 1 [ Homo sapiens ]
LOCUS 170406 (isoform 4)
AA /translation="M4DYiKI *KIG *GiYGVVYKGRHKTTGQVVAMKKIRL45** *GV PSiAIR*ISLLK4LRHPVIVSLQ DVLMQDSRLYLIFTFLSMDLKKYLDSIPPGQYWJS SLVKVKA
CDNA: 1 agccgccctt :cctc :ttct :tcgcgctct agccacccgg gaaggcctgc ccagcgtagc 6; tgggctctga ttggctgc :t tgaaagtcta cgggctaccc gattggtgaa tccggggccc 12; tttagcgcgg atctacca:a cccattgact aactatggaa gattatacca aaatagagaa 18; aattggagaa ggtaccta:g gagttgtgta taagggtaga cacaaaacta caggtcaagt 24; ggtagccatg aaaaaaatca gactagaaag tgaagaggaa ggggttccta gtactgcaat ; tcgggaaatt tctctattaa aggaacttcg tcatccaaat atagtcagtc ttcaggatgt 36; gcttatgcag gattccaggt tcat ctttgagttt ctttccatgg atctgaagaa 42; atacttggat tctatccctc ctggtcagta catggattct tcacttgtta aggtaaaagc 48; aatt ttattaatat ttatgcactg taaa gggactatat atagaagtcc 54; ctgcattttg tgggaatatg cttggaaaaa gtgttagaat aagaaaaagt atttcatttt 60; tctccctcat ggttagttta tacaggttag agatacccat gttattacca gatagtgttt 66; ctagtaagta aaaattagtg cctgagataa catagaactg gtaggtattg ttggaagcta 72; gggtagtctg gtctttcttt ggctgtcaga tacatgtaaa acaaagtaat ctag 78; ggcagagtgg tggttgtagg tgttttattc cagttttgaa catgttttgg tcaatttatt 84; gtagacattt attatatttc atta taaaattgta tagttttaag tactgaagta 90; tataaaagtg tcttattctt gcaccagttc taccaaacca ctctgcagag gtagcgctgt 96; tagttttatt ctta cacttgtatg tatgttcact ttgtatgtat ataaagattt ;02; ttttttttac acaaggtgga cttatttgca tatgtatata tacatatttt cccttttttg ;08; tgtaaaacat tatcaagacg tagatctacc tatgtctatt tacatttttg atataattaa ;14; accacttcca tattgatgaa catttaaatt attttccaac ttggttattg ttgctcttat ;20; taacagtact gcactgaatg tccttataga tatttatctt cgtatgcaac tttataggat ;26; ttag aatg tgaa gatgtttatt tacattttga tagatattgc ;32; cggttgcccc aact tgtagcaatt tactcttaaa tactcatggt gtgtaatact ;38; tattgtttta gtacatcatt gccaaaactt ggttttatca atctgttaac tatgtgaaaa ;44; aggcatatta agattgtttt aattttatat ttcatgacaa tttaacactt catatttagc ;50; tattataaac cgcctatatt ttcgttagga tacgttcttt aacaatcttg catgactttt ;56; ggactttc:g
cttttatgtc ttgcttaagt cact caaagatcga atgtattaga ;62; ataataca:g tcagtatttt tctggtagtt ttagtaagtc ctgtcttcca cacatacttt ;68; ttttgtct:a aattctgtat taagatttat tttgacttaa aaactgggat tctg ;74; ctttatct:t ttcc
LOCUS NM_001786 (isoform 1)
AA /translation="M4DYLKIdKIGdGiYGVVYKGRHKTTGQVVAWKKIRL45*4 *GV dISL-Kd.RiPNIVSLQDVLMQDSR.YLIFTFLSMD.KKY.DSIPPGQYMDS S-VKSY-YQI.QGIVFCHSRRVLHRDLKPQNALIDDKG"IKLADFG;ARAFGIPIRVY .WYRSPTVL.GSARYSTPVDIWSIGLIbAd.A KKP-bHGDSdIDQLbRIbR VdVWPdVdS.QDYKNTFPKWKPGSLASHVKVLDTNG.DLLSKWLIYDPAKQI SGKMALWiPYFVDADVQIKKM
CDNA: 1 agcgcggtga g:ttgaaact gctcgcactt ggc:tcaaag c:ggctcttg gaaattgagc 61 ggagagcgac gcggttgttg tagctgccgc tgcggccgcc gcggaataat aagccgggat 121 ctaccatacc cattgactaa ctatggaaga ttataccaaa atagagaaaa ttggagaagg 181 tacctatgga gttgtgtata agggtagaca caaaactaca ggtcaagtgg tagccatgaa 24; caga ctagaaagtg aagg ggttcctagt actgcaattc gggaaatttc ; tctattaaag gaacttcgtc atccaaatat agtcagtctt gtgc ttatgcagga 36; ttccaggtta tatctcatct ttgagtttct ttccatggat ctgaagaaat acttggattc 42; tatccctcct ggtcagtaca tggattcttc acttgttaag agttatttat accaaatcct 48; acaggggatt gtgttttgtc actctagaag agttcttcac agagacttaa aacctcaaaa 54; gatt gatgacaaag gaacaattaa actggctgat tttggccttg ccagagcttt 60; acct atcagagtat atacacatga ggtagtaaca ctctggtaca gatctccaga 66; gctg gggtcagctc gttactcaac tgac atttggagta taggcaccat 72; atttgctgaa ctagcaacta cact tttccatggg gaaa aact 78; cttcaggatt ttcagagctt tgggcactcc caataatgaa gtgtggccag aagtggaatc 84; tttacaggac tataagaata catttcccaa atggaaacca ggaagcctag atgt 90; caaaaacttg gatgaaaatg gcttggattt gctctcgaaa atgttaatct atgatccagc 96; caaacgaatt aaaa tggcactgaa tcatccatat gatt tggacaatca L02; gattaagaag a:gtagcttt ctgacaaaaa gtttccatat gttatatcaa cagatagttg L08; tgtttttatt g:taactctt gtctattttt gtcttatata tatttctttg ttatcaaact L14; tcagctgtac ttct aatttcaaaa atataactta taaa tattctatat L20; gaatttaaat a:aattctgt aaatgtgtgt aggtctcact gtaacaacta tttgttacta L26; taataaaact a:aatattga tgtcaggaat caggaaaaaa tttgagttgg cttaaatcat L32;
ctcagtcctt a:ggcagttt tattttcctg tagttggaac tactaaaatt taggaaaatg L38; ctaagttcaa g:ttcgtaat gctttgaagt atttttatgc tctgaatgtt taaatgttct L44; catcagtttc t:gccatgtt gttaactata caacctggct gaat atttttctac L50; tggtatttta a:ttttgacc taaatgttta agcattcgga atgagaaaac tatacagatt L56; tgagaaatga tgctaaattt ataggagttt tcagtaactt aaaaagctaa catgagagca L62; tgccaaaatt tgctaagtct gatc aagggctgtc cgcaacaggg aagaacagtt L68; ttgaaaattt atgaactatc ttatttttag gtaggttttg aaagcttttt gtctaagtga L74; attcttatgc cttggtcaga gtaataactg aaggagttgc ttatcttggc tttcgagtct L80; gagtttaaaa ctacacattt tgacatagtg tttattagca gccatctaaa aaggctctaa L86; tgtatattta actaaaatta ctagctttgg gaattaaact gtttaacaaa taaaaaaaaa L92; aaa LOCUS NM_033379 (isoform 2) AA / translation="M4DYiKI *KIG dGiYGVVYKGRHKTTGQVVAMKKIRLL U] L L *GV PSiAIRdISLLKd.QHPNIVSLQDVLMQDSR.YLIFTF.SWDLKKY.DSIPPGQYMDS SLVKVVTLWYRSPTVL.GSARYSTPVDIWSIGiIbAdLALKKP.bHGDS*IDQLbRIb RALGLPVNdVWPdVdS.QDYKNTFPKWKPGSAASHVKN.DTNG.DL.SKWLIYDPAKR ISGKMAANiPYFWDADWQIKKM CDNA: l agcgcggtga g:t:gaaact actt ggc:tcaaag ctggctcttg gaaattgagc 6; ggagagcgac gcggttgttg tagctgccgc tgcggccgcc gcggaataat aagccgggat l2; ctaccatacc cattgactaa aaga ttataccaaa atagagaaaa aagg l8; tacctatgga gttgtgtata agggtagaca caaaactaca gtgg tagccatgaa 24; aaaaatcaga ctagaaagtg aagaggaagg ggttcctagt actgcaattc gggaaatttc ; tctattaaag gaacttcgtc atccaaa:at agtcagtctt caggatgtgc ttatgcagga 36; ttccaggtta atct ttgagtt:ct ttccatggat ctgaagaaat acttggattc 42; tcct ggtcagtaca tggattc:tc acttgttaag gtagtaacac tctggtacag 48; atctccagaa gtattgctgg ggtcagc:cg ttactcaact gaca tttggagtat 54; cata gaac tagcaac:aa gaaaccactt ttccatgggg attcagaaat 60; tgatcaactc ttcaggattt tcagagc:tt gggcactccc aataatgaag tgtggccaga 66; agtggaatct ttacaggact ataagaa:ac atttcccaaa tggaaaccag gaagcctagc 72; atcccatgtc aaaaacttgg atgaaaa:gg cttggatttg ctctcgaaaa tgttaatcta 78; agcc aaacgaattt ctggcaaaat ggcactgaat catccatatt attt 84; ggacaatcag attaagaaga :gtagctttc tgacaaaaag tttccatatg ttatatcaac 90; agatagttgt 
gtttttattg :taactcttg tctatttttg atat atttctttgt 96; tatcaaactt cagctgtact :cgtcttcta atttcaaaaa tataacttaa aaatgtaaat ;O2; attctatatg aatttaaata :aattctgta aatgtgtgta ggtctcactg taacaactat ;O8; ttgttactat aataaaacta :aatattgat gtcaggaatc aggaaaaaat ttgagttggc ;14; ttaaatcatc tcagtcctta tttt attttcctgt aact actaaaattt ;20; aggaaaatgc taagttcaag :ttcgtaatg ctttgaagta tttttatgct ctgaatgttt ;26; aaatgttctc atcagtttct :gccatgttg ttaactatac aacctggcta aagatgaata ;32; tttttctact ggtattttaa acct aaatgtttaa gcattcggaa tgagaaaact ;38; atacagattt gagaaatgat gctaaattta taggagtttt cagtaactta aaaagctaac ;44; atgagagcat attt gctaagtctt acaaagatca agggctgtcc gcaacaggga ;50; agaacagttt tgaaaattta tgaactatct tatttttagg taggttttga aagctttttg ;56; tctaagtgaa ttcttatgcc ttggtcagag taataactga aggagttgct tatcttggct ;62; tctg agtttaaaac tacacatttt gacatagtgt ttattagcag ccatctaaaa 1681 aggctctaat gtatatttaa ctaaaattac tagctttggg aattaaactg tttaacaaat 1741 aaaaaaaaaa aa
5. CLTCL1: CLTCL1 clathrin, heavy chain-like 1 [ Homo sapiens ]
LOCUS NM_001835 (isoform 2)
AA /translation="MAQI-PV?FQTHFQLQN-GINPAWIGb51.1M*SDKEICI?LKV .0L‘J'QAQVTIIDWSDPMAPIRRPISAESAIMNPASKVIALKAGKTLQIFNIEMKSKWKAi WA4*VIbWKWVSVNTVALVTETAVY{WSMEGDSQPWKMFDRiTSAVGCQVIHYRTDE YQKW---VGISAQQNRVVGAWQ4YSVDRKVSQPILGiAAAbA4EKW4GNAKPATAFCF AVRWPTGGK-{IITVGQPAAGWQPFVKKAVDVFFPPEAQNDFPVAWQIGAKiGVIYAI "KYGY-i-YD-TSGVCICMNRISADTIFVTAP{KP"SGIIGVVKKGQVLSVCV**DWI VNYA"VV-QWPD.GLRLAVRSV-AGA4K-EVRKEV AEAQGSYAE .

RlRL VQKhQSIPAQSGQASP--QYFGI--DQGQ-NK-*S-*-C{.VLQQGRKQ--TK WLK43K-4C544-GDLVK"TDPWLALSVY4QAVVPSKVIQCFAETGQFQKIVLYAKKV GYTPDWIF--?GVMKISPTQG-QFSRWLVQD**P-AWISQIVDIFWTNS.1QQCTSF4 LDAAKWVRPA4G.LQiW-.4WV-V{APQVADAIAGNKWFT{YDQAiIAQ-CTKAG.

QA-TiYTD-YDIKRAVV4 {--VP*W-VVEEGS-SV*DSV*C-{AW-SAVIRQV-Q-C VQVASKY44Q-G1QA-V4-b*SbKSYKG-FYF-GSIVWFSQDPDV{AKYIQAACKT IK*V*?ICR*SSCYWP*RVKVE-K*AK-1DQ-P-IIVCD?FGFVHD.VLY-YRWV YIEIYVQKVVPSRTPAVIGG-.3VDC544VIK4.1WAVRGQb51D4-VA*V*K? L.-PW-*SQIQ*GC**PA HVAAAKIYIDSVVSP4CE-R*WAYYDSSVVGRYCEK YTRGQCD-T.1KVCW4VS-bKS*ARY-VCRK3PT-WAHV-**1VPSRR QVVQiA-S41RDP4415V VKAEW AD-PN4-*KIV-DNSVFST{RV-QW..I.

TAIKADRTRVMTYISQLDNYDA.31ASIAVSSA-Y**Ah1VbHKbDWVASAIQV.1Ti IGW-DRAY L n— :D‘ L RCV4PAVWSQ-AQAQ-QKD-VKTAIWSYIRGDDPSSY-TVVQSASR SNWWTD-VKFLQWAQKKGR*SYI* AKisRV54-*DEIVGPVWA{IQQVGD a CY“GWY*AAK--YSWVSNFAQ-AST-V{.GTYQAAVDVSRKASSTQTWKEVCFACW D GQTFRFAQ-CG-{IVIiADdﬂdd-WCYYQDRGYE**-I---*AA-G-*?A4MGWFTT.

AIAYSKFKPQKW-di *LEWSQVVIPKV-RAAT Ai-WAd-Vb-YDKYddYDNAV-"W MS iPi‘AWK‘GQbKDII KVAWV‘ CYRA -QFY JYKP - -I\ID .T.V -SPRT.D-ITW"V SFFSKAGQ.PLVKPY-?SVQS{WWKSVNTA-Ni--1***DYQDAWQ{AA*SRDA*-AQ KL.QWE-**GK?*CEAAC-b1CYD.LRPDWVLT-AW?{V-VDLAWPYFIQVMRTY-SK VDKLDA-‘S-RKQ**{V *PAP-VbDbDGiL CDNA: 1 accgg:cagc ccgcgcgagg gg:cggcgtt tgcc gctgccgccg ccgccgccga 61 ggtcccgcac cagccatggc gcagatcctc cc:gttcgct ttcaggagca cttccagctc 121 caaaaccttg gaattaatcc agctaacatt ggattcagca cactgaccat ggaatctgac 181 aagttcatat gtatccgaga gaaagttggt gcac aggtcacgat cattgacatg 241 agtgacccaa tggctccgat ccgacggcct atctctgcag agagtgccat catgaatcca ; gcctctaagg tgatagctct gaaagctggg aagacacttc agatctttaa tattgagatg 361 aagagtaaaa ctca ggca gaagaagtga ttttctggaa atgggtttct 421 gtgaacactg ttgccttggt gaccgagacc gcggtctacc actggagcat ggaaggtgac 481 tcccagccca tgtt tgatagacat accagtctgg gcca ggtgattcac 541 taccggactg acca gaagtggctg ctgctcgtag cggc tcagcaaaac 60; cgtgtggttg gagcaatgca gctctactct gtggatagga aggtttcaca acccatagaa 661 ggccatgctg ttgc agagttcaag atggagggga atgccaagcc tgccaccctt 72; ttctgctttg ctgtacgtaa tcccacagga ggcaagttgc acatcattga agttggacag 78; cctgcagcgg gaaaccaacc ttttgtaaag aaagcagtag atgtgttttt agag 84; gcacagaatg attttccagt ggctatgcag attggagcta aacatggtgt tatttacttg 90; atcacaaagt atggctatct tcatctgtac gagt ctggcgtgtg catctgcatg 96; aaccgtatta gtgctgacac aatatttgtc actgctccac acaaaccaac ctctggaatt L02; attggtgtca aggg acaggtactg tcagtttgtg ttgaggaaga taacattgtg L08; aattatgcaa ccaacgtgct tcagaatcca gaccttggtc tgcgtttggc cgttcgtagt L14; aacctggctg gggcagagaa gttgtttgtg agaaaattca ataccctctt tgcacagggc L20; agctatgctg aagccgccaa agttgcagcg tctgcaccaa tcct gcgtaccaga L26; gagacggtcc agaaattcca gagtataccc gctcagtctg cttc tccattgctg L32; cagtacttcg tgct cgaccagggt cagctcaata aacttgaatc cttagaactt L38; tgccatctgg ttcttcagca ggggcgtaag caactcctag agaagtggct gaaagaagat L44; aagctggagt gctcagagga gctcggagac ttggtcaaaa ccactgaccc catgctcgct L50; ctgagtgtgt accttcgggc aaatgtgcca agcaaagtga 
tccagtgttt tgcagaaaca L56; ggccaattcc agaaaattgt gctctatgcc aaaaaggttg ggtacacccc agactggatc L62; tttctgctga ggggtgtaat gaagatcagt ccggaacagg gcctgcagtt ttctcgaatg L68; ctagtgcagg acgaggagcc gctggccaac attagccaga ttgtggacat tttcatggaa L74; aacagtttaa ttcagcagtg tacttccttc ttattggatg ccttgaagaa taatcgccca L80; ggac tcctgcagac atggctgttg gagatgaacc ttgttcatgc ggtt L86; gcagatgcca gaaa taaaatgttt actcattacg accgggccca cattgcccag L92; gaga aggcaggcct cctgcagcaa gagc actacaccga cctctatgac L98; atcaagaggg ctgtggtcca cactcacctc cccg agtggcttgt cttt 204; ttat ngtggagga ttctgtggag tgtctgcatg ccatgctgtc catc 210; agacagaacc ttcagctgtg tgtgcaggtg gcctctaagt agca gctgggcacg 216; caggccctgg tggagctctt tgaatccttc aagagttaca aaggcctctt ctacttcctg 222; ggctcaatcg tgaacttcag ccaagaccca gatgtgcatc tgaaatacat tcaggctgcc 228; tgtaagacag ggcagatcaa ggaggtggag aggatatgcc gagagagcag ctgctacaac 234; ccagagcgtg tgaagaactt cctgaaggag gccaagctca cagaccagct tcccctcatc 240; atcgtgtgtg atcgttttgg ctttgtccat gaccttgtcc tatatttata ccgcaacaac 246; ctgcagaggt acattgagat ctacgtgcag aaggtcaacc ctagccggac cccagctgtg 252; gggc tgcttgatgt ggattgttct gaggaagtga ttaaacactt aatcatggca 258; gtgagaggac agttctctac tgatgagttg gtggctgaag tagaaaaaag aaataggctc 264; aagctgctgc ggct ggagtcccag attcaggaag gctgtgagga gcctgccact 270; cacaatgcac tggctaaaat ctacatcgac agcaacaaca gccccgagtg cttcctgaga 276; gagaatgcct actatgacag cagcgtggtg ggccgctact gtgagaagcg agacccccat 282; ctggcctgtg ttgcctatga gcgggggcag tgtgaccttg agctcatcaa ggtgtgcaat 288; gagaattctc tgttcaaaag cgaggcccgc tacctggtat gcagaaagga tccggagctc 294; tgggctcacg tccttgagga gaccaaccca tccaggagac agctaattga ggta 300; cagacagcat aaac acgggatcct gaagagattt cggtcactgt caaagccttt 306; atgacagccg acctgcctaa gatt gaactgctgg agaagatagt tctggataac 312; ttca gcgagcacag gaatctacag aatctgttga ctgc ggca 318; gaccgcacac gggtcatgga gtacatcagc cgcctggaca actatgacgc actggacatc 324; gcgagcatcg ctgtcagcag cgcactgtat gaggaggcct tcaccgtttt ccacaagttt 330; aatg caat ccaggtcctg 
atcgagcaca ttggaaacct ggaccgggca 336; tatgagtttg atg caatgagcct gctgtgtgga gtcagctggc ccaagcccag 342; ctccagaaag atttggtgaa ggaagccatc aactcctata tcagagggga cgacccttcc 348; tcttacctgg aagttgttca gtcagccagc aggagcaaca actgggagga taaa 354; tttctgcaga tggccaggaa aaagggccgt gagtcctata tagagactga acttattttt 360; gcta aaaccagccg tgtttctgag ctagaagatt ttattaatgg acccaacaat 366; gcccacatcc agcaggttgg ctgt tacgaggagg gaatgtacga ggctgccaag 372; ctgctctata gcaatgtttc taactttgcc cgcctggctt ccaccttggt cggt 378; gagtatcagg cagcagtgga caacagccgc aaggccagca gcacccggac gtggaaggag 384; gtgtgctttg cctgcatgga tggacaagag ttccgcttcg cacagctgtg tggtcttcac 390; atcgtcattc atga gctggaggag ctgatgtgct attaccagga tcgtggctac 396; tttgaggagc tgatcttgct gttggaagcg gccctgggcc tggagcgggc gggc 402; atgttcactg agctggccat cctctactcc aaattcaagc cacagaagat gctggagcat 408; ctggagcttt tctggtcccg tgtcaacatc ccaaaggtgc tgagggctgc agagcaggca 414; cacctgtggg ctgagctggt gttcctctat gacaagtacg aggagtatga tgtg 420; ctcaccatga tgagccaccc cactgaggcc tggaaggagg gtcagttcaa ggacatcatt 426; accaaggttg ccaacgtcga ttac agagccctgc agttctattt ggattacaaa 432; ccactgctca tcaatgacct gctgctggtg ctttcacccc ggctggacca cacctggaca Z38; gtcagtttct tttcaaaggc aggtcagctg cccctggtga acct gcggtcagtc Z44; cagagccaca acaacaagag tgtgaatgag gcactcaacc acctgctgac agaggaggag 150; gactatcagg atgccatgca gcatgctgca cggg atgctgagct ggcccagaag 156; ttgctgcagt ggttcctgga ggaaggcaag tgct tcgcagcttg tctcttcacc Z62; tgctatgacc tgcttcgccc agacatggtg cttgagctgg cctggaggca caacctcgtg Z68; gacttggcca actt catccaggtg atgagggagt acctgagcaa ggtggacaaa Z74; ctggatgcct tggagagtct gcgcaagcaa gaggagcatg tgacagagcc tgcccctctc 180; gtgtttgatt ttgatgggca agac ccagctgatt gcactaagcc ctgccgtggg 186; ccct gccagcttcc cctatggata tgcctctgct cccaacttcg ccagcctcca Z92; atgtacaact tccgcgtgta gtgggcgttg tcaccaccca ccctacctgc agagttacta Z98; acttctccaa tgtc actccagcag cacaggggac gcaatgggag gcagggacac 504; ctggacaata tttatttttg ctgaaaccca atgacggcaa cctctgagcc atcccagagc 510; 
ctggggaggc cagggtagag gctgacggcg caagaccagc tttagccgac aacagagact 516; ggactgtggg tgct ggagccaggc cttcctcctg ggcgcctccg actggctgga 522; gctgccccct ccaggccagt ttgaagacta catgaacacg tcttgtttgg aggtaccgga 528; cctcataaaa tcag cctcttggca atcataaata ttaaagtcgg tttatccagg 534; caaaaaaaaa aaaaaaaaaa aaaaaaaaaa
LOCUS NM_007098 (isoform 1)
AA /translation="MAQI .PVQFQTHFQLQN-G INPAVIGESL. iM‘SDKhICI LKV GIQAQVTII44 DWSDPMAPIRRPISA SSAIMNPASKVIALKAGKTLQIFNIEMKSKWKA WA L *VIEWKWVSVNTVALVTETAVY {WSMEG DSQPWKMF DR {TSJVGCQVIHY QT) YQKW.. .VGISAQQNRVVGAWQ 4YSV DQKVSQPILG {AAAbAdeW *GNAKPATAF AVRVPTGGK. {IITVGQPAAGVQPFVKKAV DVFFPPEAQN DFPVAWQIGAK {GV "KYGY.{ TSGVCICMN RISADT IFVTAP {KP"SGIIGVWKKGQVLSVCV VNYA"wv D .GLRLAVRSV .AGA KEW JEAQGSYA RiR.LL VQKhQS QASP. .QYFGI--DQGQ .NK. *S.4 .C WLK L 3K. 4 4 .GDT.VK"T DPWLALSVYA RAVVPSKVIQCFA GYTP QGVMKISPT D .AW ISQIVJ LDAAKWV 4G .LQiW.. DAI JGNKWFT {YD QA-'{YT4 DIKRAVV { .SV L DSV‘C.

VQVASKY .GiQA .V 4 .GSIVVFSQDP 3V4.

IKdV 4 RIC? *SSCYVP 4 RVKVE .P .IIVCD J YIEIYVQKVWPS RTPAVIGG.. { L.-PW-*SQIQ 4 GC 4 *PA HVA H-ACVAY 4 RGQCD.4 .IKVCV -b KS QVVQiA-S‘iRDP 4 *ISV VKAEW A D .Qw.

TAIKADQTRVM TYIS QLDNYDA-DIASIAVSSA DWVASAIQV .IT IGV-DQAdeA L QCW *PAVWSQ-AQAQ-QKD DPSSY.TVVQSAS SNVWTD-VKFLQWA RKKGR45YI 4 4 .IhA-AKLS DEIVGPWVA D CY L *GWY‘AAK. .YSWVSNFA Q-AST .VH RTWKEVCFACW GQ'FRFAQ-CG.4 {IVI {ADdﬂd 4 -MCYYQ RAiMGMFTﬂ.L AIAYSKFKPQKW. 4 { VWIPKVL 4 L"S MSiPi‘AWK‘GQhK DIIiKVAWVdLCYRA .SPRLDHTW"V SFFSKAGQAPLVKPYLRSVQSHNNKSVNﬂAL {LLi L L *DYQGLRASI DAYDWF DNIS .AQQLdKiQLM *J: QCIAAYLYKGNNWWAQSVT .CKK QHAA *SRDA* .AQK .LQWb-**GKR *CEAAC ¢iCYDLLRPDWVF .AWR .VDLAMPYFIQVMRTY .SKV DKLDA-TS-RKQ* *HVi *PAPLVEDEDG {44 CDNA: accggtcagc ccgcgcgagg gg:cggcgtt cattcctgcc gctgccgccg ccga 6; ggtcccgcac cagccatggc gcagatcctc cctgttcgct ttcaggagca cttccagctc 12; caaaaccttg gaattaatcc agctaacatt ggattcagca cactgaccat ggaatctgac 18; aagttcatat gtatccgaga tggt gagcaggcac aggtcacgat catg 24; agtgacccaa tggctccgat gcct atctctgcag agagtgccat catgaatcca ; gcctctaagg tgatagctct gaaagctggg aagacacttc agatctttaa tattgagatg 36; aagagtaaaa tgaaggctca tactatggca gaagaagtga ttttctggaa atgggtttct 42; actg ttgccttggt gaccgagacc gcggtctacc actggagcat ggaaggtgac 48; tcccagccca tgaagatgtt tgatagacat accagtctgg gcca ggtgattcac 54; taccggactg atgagtacca gaagtggctg ctgctcgtag gcatctcggc tcagcaaaac 60; Cgtgtggttg gagcaatgca gctctactct gtggatagga aggtttcaca agaa 66; ggccatgctg cggcttttgc caag atggagggga atgccaagcc tgccaccctt 72; ttctgctttg ctgtacgtaa tcccacagga ggcaagttgc acatcattga agttggacag 78; cctgcagcgg gaaaccaacc ttttgtaaag aaagcagtag atgtgttttt tcctccagag 84; gcacagaatg attttccagt ggctatgcag attggagcta aacatggtgt tatttacttg 90; atcacaaagt atggctatct tcatctgtac gacctagagt ctggcgtgtg catctgcatg 96; aaccgtatta gtgctgacac aatatttgtc actgctccac acaaaccaac ctctggaatt L02; attggtgtca acaaaaaggg acaggtactg tgtg ttgaggaaga taacattgtg L08; aattatgcaa ccaacgtgct tcagaatcca gaccttggtc tgcgtttggc cgttcgtagt L14; aacctggctg gggcagagaa tgtg ttca ataccctctt gggc L20; agctatgctg aagccgccaa agttgcagcg tctgcaccaa agggaatcct gcgtaccaga L26; gagacggtcc agaaattcca gagtataccc gctcagtctg gccaggcttc 
tccattgctg L32; cagtacttcg gaatcctgct cgaccagggt cagctcaata aacttgaatc actt L38; tgccatctgg ttcttcagca ggggcgtaag caactcctag agaagtggct gaaagaagat L44; aagctggagt gctcagagga gctcggagac ttggtcaaaa accc catgctcgct L50; ctgagtgtgt accttcgggc aaatgtgcca agcaaagtga tccagtgttt tgcagaaaca L56; ggccaattcc agaaaattgt gctctatgcc aaaaaggttg ggtacacccc agactggatc L62; ctga taat gaagatcagt ccggaacagg gcctgcagtt ttctcgaatg 168; ctagtgcagg acgaggagcc gctggccaac attagccaga ttgtggacat tttcatggaa 174; aacagtttaa ttcagcagtg tacttccttc ttattggatg ccttgaagaa taatcgccca 180; gctgagggac agac atggctgttg gagatgaacc ttgttcatgc accccaggtt 186; gcagatgcca tccttggaaa gttt actcattacg accgggccca ccag 192; ctctgtgaga aggcaggcct gcaa gcactggagc actacaccga cctctatgac 198; atcaagaggg ctgtggtcca cactcacctc ctcaatcccg agtggcttgt caatttcttt 204; ggctccttat ngtggagga ttctgtggag tgtctgcatg ccatgctgtc tgctaacatc 210; agacagaacc tgtg tgtgcaggtg gcctctaagt agca gctgggcacg 216; caggccctgg tggagctctt tgaatccttc aagagttaca aaggcctctt ctacttcctg 222; atcg tgaacttcag ccaagaccca gatgtgcatc tgaaatacat tcaggctgcc 228; tgtaagacag tcaa ggaggtggag aggatatgcc gagagagcag ctgctacaac 234; ccagagcgtg tgaagaactt cctgaaggag gccaagctca agct catc 240; atcgtgtgtg atcgttttgg ctttgtccat gaccttgtcc tatatttata ccgcaacaac 246; aggt acattgagat ctacgtgcag aaggtcaacc ctagccggac cccagctgtg 252; attggagggc tgcttgatgt ggattgttct gaggaagtga ttaaacactt aatcatggca 258; gtgagaggac agttctctac tgatgagttg gtggctgaag tagaaaaaag aaataggctc 264; aagctgctgc ttccctggct ggagtcccag attcaggaag gctgtgagga gcctgccact 270; cacaatgcac tggctaaaat ctacatcgac agcaacaaca gccccgagtg cttcctgaga 276; gagaatgcct actatgacag ggtg ggccgctact gtgagaagcg agacccccat 282; ctggcctgtg atga gcgggggcag tgtgaccttg agctcatcaa ggtgtgcaat 288; gagaattctc aaag ccgc tacctggtat gcagaaagga tccggagctc 294; tgggctcacg tccttgagga gaccaaccca tccaggagac agctaattga ggta 300; cagacagcat tgtcagaaac acgggatcct gaagagattt cggtcactgt caaagccttt 306; atgacagccg acctgcctaa tgaactgatt gaactgctgg 
agaagatagt tctggataac 312; ttca gcgagcacag gaatctacag aatctgttga tcctgactgc catcaaggca 318; gaccgcacac gggtcatgga gtacatcagc cgcctggaca actatgacgc actggacatc 324; gcgagcatcg ctgtcagcag cgcactgtat gaggaggcct tcaccgtttt ccacaagttt 330; gatatgaatg cctcagcaat ccaggtcctg atcgagcaca ttggaaacct ggaccgggca 336; tatgagtttg ngagagatg caatgagcct gctgtgtgga gtcagctggc ccaagcccag 342; ctccagaaag atttggtgaa ggaagccatc aactcctata tcagagggga cgacccttcc 348; tcttacctgg aagttgttca gtcagccagc aaca actgggagga tctagttaaa 354; caga tggccaggaa aaagggccgt gagtcctata tagagactga acttattttt 360; gccttggcta aaaccagccg tgag gatt ttattaatgg acccaacaat 366; atcc agcaggttgg agaccgctgt gagg gaatgtacga ggctgccaag 372; ctgctctata gcaatgtttc taactttgcc cgcctggctt ccaccttggt tcacctcggt 378; gagtatcagg cagcagtgga caacagccgc aaggccagca gcacccggac gtggaaggag 384; gtgtgctttg cctgcatgga tggacaagag ttccgcttcg cacagctgtg tcac 390; atcgtcattc atgcagatga gctggaggag ctgatgtgct attaccagga tcgtggctac 396; gagc tgatcttgct gttggaagcg gccctgggcc tggagcgggc ccacatgggc 102; atgttcactg agctggccat cctctactcc aaattcaagc cacagaagat gctggagcat 108; ctggagcttt tctggtcccg tgtcaacatc ccaaaggtgc tgagggctgc agagcaggca 114; cacctgtggg ctgagctggt gttcctctat gacaagtacg atga caatgctgtg 120; ctcaccatga tgagccaccc cactgaggcc gagg gtcagttcaa accaaggttg ccaacgtcga gctctgttac agagccctgc agttctattt caaa 132; ccactgctca tcaatgacct gctgctggtg ctttcacccc ggctggacca cacctggaca 138; gtcagtttct tttcaaaggc aggtcagctg cccctggtga agccttacct gcggtcagtc Z44; cagagccaca acaacaagag tgtgaatgag aacc acctgctgac agaggaggag 150; cagg gcttaagggc atctatcgat gcctatgaca actttgacaa catcagcctg 156; gctcagcagc tggagaagca tcagctgatg aggt gcattgcggc ctatctgtac Z62; aagggcaata actggtgggc ccagagcgtg gagctctgca agaaggatca tctctacaag Z68; gatgccatgc agcatgctgc agagtcgcgg gatgctgagc tggcccagaa gttgctgcag Z74; tggttcctgg aggaaggcaa gtgc ttcgcagctt gtctcttcac ctgctatgac 180; ctgcttcgcc cagacatggt gcttgagctg gcctggaggc acaacctcgt ggcc 186; atgccctact tcatccaggt gatgagggag 
tacctgagca aggtggacaa actggatgcc Z92; ttggagagtc tgcgcaagca agaggagcat gtgacagagc ctgcccctct cgtgtttgat Z98; tttgatgggc atgaatgaga cccagctgat tgcactaagc cctgccgtgg gcccagcccc 504; tgccagcttc ccctatggat ctgc tcccaacttc gccagcctcc aatgtacaac 510; ttccgcgtgt agtgggcgtt gtcaccaccc accctacctg cagagttact tcca 516; atgt cactccagca gcacagggga cgcaatggga ggcagggaca cctggacaat 522; atttattttt gctgaaaccc aatgacggca acctctgagc catcccagag cctggggagg 528; ccagggtaga ggctgacggc gcaagaccag ctttagccga caacagagac tggactgtgg 5341 gccctcctgc tggagccagg ccttcctcct gggcgcctcc gactggctgg agctgccccc 5401 tccaggccag tttgaagact acatgaacac gtcttgtttg gaggtaccgg acctcataaa 5461 aggactctca gcctcttggc aatcataaat attaaagtcg gtttatccag gcaaaaaaaa 5521 aaaaaaaaaa aaaaaaaaaa aa 6. EIF4G1: EIF4G1 eukaryotic translation initiation factor 4 gamma, 1 [ Homo sapiens ] LOCUS NM_001194946 (isoform 6) AA /translation="MNKAPQSTGPPPAPSPGLPQPAEPPGQ APVVES PQAiQMNiP SQPRQGGFRSJQHFYPSRAQPPSSAASRVQSAAPARPGPAA{VYPAGSQVMMIPSQIS YPASQGAYYIPGQGRSTYVVP"QQYPVQPGAPGFYPGASPlitG PAQGVQQ FP"GVAPAPV4WNQPPQIAPKR7RKTIRIRDPWQGGKDliu L *IWSGA? AS P PPQi GGG-‘PQANG L PQVAVIVRPD u RSQGAIIADRPGLPGPL {SPS*SQPSSPSP"PSPS PVL‘PGS‘PV-AV-SIPGDiW H IQMSV U] 1PISR*1G PYR-SP4P PLA‘PI-‘VL *V -SKPVP*S*ESSSP-QAP w -AS{1V*I{*PVGWVPS*DL‘P‘V‘SSP‘-APPPA CPSESPVPIAP AQPL *--WGAPSPPAVD-SPVS*P**QAK4V1ASWAPPTIPSATPA 5PAQ L **W********G*AG*AG*A*S*KGG**--PP*SiPIPAN-SQW-TA AAATQVAVSVPKRRRKIK*-VKK*AVGD--DAbK‘AVPAVP‘V4VQPPAGSVPGP4SL GSGVPPRPL *AD41WDSK*DKIHVA‘VIQPG‘QKY‘YKSDQWKP-N-**KK?YDRTF.

AGFQFIFASWQKPTG-P{ISDVV4DKANKTP-RP.3P"?-QGIVCGPDFTPSFAV4GR TTASTRGPPRGGPGG'-P?GPAG4GPRRSQQGPRKEPRKIIA V-Mi‘DIK-VKATKA‘J WKPSSKRTAADKDRG**DADGSK"QDAFRRVRSIAVKL"PQWFQQLWKQVTQ-AIDTT TR-KGVIDLIE*KAIS‘PVESVAYAVWCRC-WA-KVP *KP K---NRCQK *t4KDKDDD*Vb*KKQK*W3*AA A**RG?-K**-**A?DIARRRSJGVIKFIGT .KWLTTAIMiDCVVK--KW{D**SL4CLCR--TTIGKD.3FTKAKPRWDQYFVQWEKI IKEKKTSSRIRFWLQDV-D-?GSNWVPRRGDQGPK"IDQI{K*A*W‘*{R*{IKVQQ4 WAKGSDKRRGGPPGPPISRG-P-VDDGGWWTVPISKGSRPID"SRATKITKPGSIDSV WQAFAPGGRASWGKGSSGGSGAKPSDAASLAARPA Si-VRESA-QQAVPTESTDVRR VVQRSS-SR L RG‘KAGDRGD?.4RS*?GGDRGD?ADKAR1PA *V**RSR*R PSQPTG-RKAAS-TT RDRGRDAVKRTAA-PPVSP-KAA-S***-*KKSKAII**Y-{ DWKTAVQCVQT-ASPS--bIbVR{GV*51L4?SAIAR'{WGQ--HQ--CAG{‘J YQG-Y*I-*-A*DM*IDIP{VWLY-A*-V1PI-Q*GGVPWG*-tR*IiKP-RP-GK S---TI-G--CKSMGPKKVGT-W?TAG-SWK*E-P*GQDIGAEVA*QKV*Y1 APGQQA-PS‘4-WRQL4KL-K*GSSNQRVEDWI*AW-S*QQIVSNT-VRA-WTAVC AIIFTTP-RVDVAV-KARAKL.QKYLCD‘QK L -QA-YA-QA-VVT-TQPPV--RWF A .YD‘DVVK‘JAEYSWLSSKDPAEQQGKGVAAKSV1AEbKW-R*A***SDiW CDNA: 1 cgca ccgg cgcggctccg ccccctgcgc cgg:cacg:g ggggcgccgg 61 ctgcgcctgc ggagaagcgg tggccgccga gcgggatctg tgcggggagc cggaaatggt 121 tgtggactac gtctgtgcgg ctgcgtgggg ctcggccgcg cggactgaag gagactgaag 181 gccctcggat aacc tgtaggccgc accgtggact tgttcttaat Cgagggggtg 241 ctggggggac cctgatgtgg caccaaatga caaa gctccacagt ccacaggccc ; cccacccgcc ccatcccccg gactcccaca gccagcgttt cccccggggc agacagcgcc 361 ggtggtgttc agtacgccac aagcgacaca aatgaacacg cagc cccgccaggg 421 aggattcagg tctctgcagc acttctaccc ggcc cagcccccga gcagtgcagc 481 agtg cagagtgcag cccctgcccg ccctggccca gctgcccatg tctaccctgc 541 tggatcccaa gtaatgatga tcccttccca gatctcctac ccagcctccc agggggccta 60; ctacatccct gggc gttccacata cgttgtcccg acacagcagt accctgtgca 66; gccaggagcc ccaggcttct atccaggtgc aagccctaca gaatttggga cctacgctgg 72; cgcctactat ccagcccaag gggtgcagca gtttcccact ggcgtggccc ccgccccagt 78; tttgatgaac cagccacccc agattgctcc caagagggag cgtaagacga tccgaattcg 84; agatccaaac caaggaggaa aggatatcac agaggagatc atgtctgggg cccgcactgc 90; accc acccctcccc 
agacgggagg cggtctggag cctcaagcta atggggagac 96; ggtt gctgtcattg tccggccaga tgaccggtca cagggagcaa tcattgctga L02; aggg ctgcctggcc cagagcatag cccttcagaa tcccagcctt cgtcgccttc L08; tccgacccca tcaccatccc cagtcttgga accggggtct gagcctaatc tcgcagtcct L14; ctctattcct ggggacacta tgacaactat acaaatgtct gtagaagaat caacccccat L20; ctcccgtgaa actggggagc catatcgcct ctctccagaa cccactcctc tcgccgaacc L26; catactggaa gtga cacttagcaa accggttcca gaatctgagt tttcttccag L32; tcctctccag gctcccaccc ctttggcatc tcacacagtg gaaattcatg agcctaatgg L38; catggtccca tctgaagatc tggaaccaga ggtggagtca agcccagagc ttgctcctcc L44; cccagcttgc ccctccgaat cccctgtgcc cattgctcca actgcccaac ctgaggaact L50; gctcaacgga gccccctcgc ctgt ggacttaagc ccagtcagtg agccagagga L56; gcaggccaag gaggtgacag catcaatggc gCCCCCC&CC atcccctctg cagc L62; tacggctcct tcagctactt ccccagctca ggaggaggaa atggaagaag aagaagaaga L68; ggaagaagga ggag aagcaggaga agctgagagt gagaaaggag gagaggaact L74; gctcccccca gagagtaccc ctattccagc caacttgtct cagaatttgg aggcagcagc L80; agccactcaa gtggcagtat ctgtgccaaa gaggagacgg aaaattaagg agctaaataa L86; gaaggaggct gttggagacc ttctggatgc cttcaaggag ccgg cagtaccaga L92; ggtggaaaat cagcctcctg caggcagcaa tccaggccca gagg gcagtggtgt L98; gcccccacgt cctgaggaag cagatgagac ctca aaggaagaca aaattcacaa 204; tgctgagaac atccagcccg gggaacagaa gtatgaatat aagtcagatc agtggaagcc 210; tctaaaccta gaggagaaaa aacgttacga ccgtgagttc ctgcttggtt ttcagttcat 216; cagt aagc cagagggatt gccacatatc agtgacgtgg tgctggacaa 222; ggccaataaa acaccactgc ggccactgga tcccactaga ctacaaggca taaattgtgg 228; cccagacttc actccatcct ttgccaacct tggccggaca acccttagca cccgtgggcc 234; cccaaggggt gggccaggtg tgcc ccgtgggccg gctggcctgg ggcg 240; gcag ggaccccgaa aagaaccacg caagatcatt gccacagtgt taatgaccga 246; agatataaaa ctgaacaaag cagagaaagc ctggaaaccc aagc ggacggcggc 252; tgataaggat cgaggggaag aagatgctga tggcagcaaa acccaggacc tattccgcag 258; ggtgcgctcc atcctgaata aactgacacc ccagatgttc cagcagctga tgaagcaagt 264; gacgcagctg gccatcgaca aacg aggg gtcattgacc tcatttttga 
270; gaaggccatt tcagagccca acttctctgt ggcctatgcc aacatgtgcc gctgcctcat 276; ggcgctgaaa gtgcccacta cggaaaagcc aacagtgact gtgaacttcc gaaagctgtt 282; gttgaatcga tgtcagaagg agtttgagaa agacaaagat gatgatgagg tttttgagaa 288; gaagcaaaaa gagatggatg aagctgctac ggcagaggaa cgaggacgcc tgaaggaaga 294; gctggaagag gctcgggaca tagcccggcg gcgctcttta gggaatatca ttgg 300; gttc aaga tgttaacaga ggcaataatg catgactgtg tggtcaaact 306; gcttaagaac catgatgaag agtcccttga gtgcctttgt cgtctgctca ccaccattgg 312; caaagacctg gactttgaaa aagccaagcc ccgaatggat cagtatttca accagatgga 318; aaaaatcatt aaagaaaaga agacgtcatc ccgcatccgc tttatgctgc aggacgtgct 324; ggatctgcga gggagcaatt cacg ccgaggggat cagggtccca agaccattga 330; ccagatccat aaggaggctg agatggaaga acatcgagag cacatcaaag tgcagcagct 336; catggccaag ggcagtgaca agcgtcgggg cggtcctcca ggccctccca tcagccgtgg 342; acttcccctt gtggatgatg gtggctggaa cacagttccc atcagcaaag gtagccgccc 348; cattgacacc tcacgactca ccaagatcac tggc tccatcgatt ctaacaacca 354; gctctttgca cctggagggc gactgagctg gggcaagggc agcagcggag gctcaggagc 360; caagccctca gacgcagcat cagaagctgc tcgcccagct actagtactt tgaatcgctt 366; ctcagccctt caacaagcgg tacccacaga aagcacagat aatagacgtg tggtgcagag 372; cttg agccgagaac gaggcgagaa agac cgaggagacc agcg 378; gagtgaacgg ggaggggacc accg gcttgatcgt gcgcggacac ctgctaccaa 384; gcggagcttc agcaaggaag tggaggagcg gagtagagaa cggccctccc agcctgaggg 390; gctgcgcaag gcagctagcc tcacggagga tcgggaccgt gggcgggatg ccgtgaagcg 396; agaagctgcc ctacccccag tgagccccct gaaggcggct ctctctgagg aggagttaga 402; gaagaaatcc atca ttgaggaata tctc aatgacatga aagaggcagt 408; cgtg caggagctgg cctcaccctc cttc atctttgtac ggcatggtgt 414; cgagtctacg ctggagcgca gtgccattgc tcgtgagcat atggggcagc tgctgcacca 420; gctgctctgt gctgggcatc tgtctactgc ctac caagggttgt atgaaatctt Z ggaattggct gaggacatgg aaattgacat cccccacgtg tggctctacc tagcggaact 132; ggtaacaccc attctgcagg aaggtggggt gcccatgggg gagctgttca gggagattac 138; aaagcctctg agaccgttgg gcaaagctgc ttccctgttg ctggagatcc tgggcctcct Z44; gtgcaaaagc atgggtccta 
aaaaggtggg gacgctgtgg cgagaagccg gctg 150; attt ctacctgaag gccaggacat tggtgcattc gtcgctgaac agaaggtgga 156; gtataccctg ggagaggagt cggaagcccc tggccagagg gcactcccct ccgaggagct Z62; gaacaggcag ctggagaagc tgctgaagga gggcagcagt aaccagcggg tgttcgactg Z68; gatagaggcc aacctgagtg agcagcagat agtatccaac acgttagttc gagccctcat Z74; gacggctgtc tgctattctg caattatttt tgagactccc gtgg acgttgcagt 180; gctgaaagcg cgagcgaagc tgctgcagaa atacctgtgt gacgagcaga aggagctaca 186; ggcgctctac gccctccagg cccttgtagt agaa cagcctccca acctgctgcg Z92; gatgttcttt gacgcactgt atgacgagga cgtggtgaag gaggatgcct tctacagttg Z98; ggagagtagc aaggaccccg ctgagcagca gggt gtggccctta aatctgtcac 504; agccttcttc aagtggctcc gtgaagcaga gtct gaccacaact gagggctggt 510; ggggccgggg acctggagcc acac acagatggcc cggctagccg cctggactgc 516; aggggggcgg cagcagcggc ggtggcagtg ggtgcctgta tgtg ctaa 522; taaagtggct gaagaggcag gatggcttgg ggctgcctgg gcccccctcc aggatgccgc 528; caggtgtccc tctcctcccc caca gagatatatt atatataaag tcttgaaatt 534; tggtgtgtct tggggtgggg aggggcacca acgcctgccc ctggggtcct tatt 540; ttctgaaaat cactctcggg actgccgtcc tcgctgctgg gggcatatgc cccagcccct 546; cccc tgcc tgggcagggg gaaggggggg cacggtgcct gtaattatta 552; aacatgaatt caattaagct caaaaaaaaa aaaaaaaaa LOCUS NM_004953 (isoform 4) AA /translation="WSGARiASiPiPPQ GGGL 4 PQAVG VIVRPDD QSQGAI IADRPGLPGP L {SP5 *SQPSSPSPTPSPSPVLdPGS 4 PV-AV-SIPG 31W iIQMSVL iPISR‘iG 4 PYR.SP‘PiPLA‘PI .4V4V .SKPVP 4 S‘hSSSP .QAP"? .ASiTV *PVGWVPS4 3L 4 P‘V‘SSP L -APPPACPSESPVP P L . .WGAPSPPAVJ 4? 4 *QAK‘V i ASWAPPLIPSA iPAiAPSAiSPAQ L 4 4W 444 *G‘AG‘AG *KGG --PP 4 SiPIPAN-SQN QVAVSVPK RR 4 .VKK‘AVGD.

TAVPAVP4V 4 NQPPAGSWPGP *S *GSGVPPRP L *AD‘iW 4 DKIHVAEWIQ WKPLV- 4 *KKRY DR4 F .GFQFIFASWQKPTG {IS DVVADKAVK DPT? .QGIWCGP DFTPSFAN4G QTTLSTRGPP RGGPGGT RGPQAGLGPRQ QKEPQKIIA V-W .VKA TKAWKPSSKRTAA DKJRG DADGSK"QDJF .WK-"PQWFQQ- -A IDi 4 *R-KGVI D-Ib 4 KAIS *PNESVAYAVWC .KVPi *KP V -N QCQK *b‘KDKDDD 4 Vb *MD‘AA A 4 4 R .K**- 4 *ARDIA QR IG 4 .bK-KWLL *AIW iDCVVK. .KN {D 4 *SLdC. 4LTTIGK3LDFEKAKP YEVQW *KIIK‘KKLSSKIREWLQDV .D-QGSNWVP? DQGPKTIDQIHK 4 {IKVQQLWAKGS 3KR RGGPPGPPISRG .P-VDDGG WVTVPISKGSRPID"SRATKITKPGSIDSWNQLFAPGGRASWGKGSSGGSGAKPS3AA SEAARPATST.VRFSA.QQAVPTESTDNRQVVQQSS.SRdRGdKAGDRGDR.4RS*QG GDRGDRADKARLPA KRSbSKdVddRSRdRPSQPdG.RKAAS.TTDRDQGRDAVKRE A-PPVSP-KAA-ded.dKKSKAllddY.{LVDWKTAVQCVQT.ASPS.LFIFVR4GV 451L4QSAIART4WGQ..HQ..CAG4.STAQYYQG.Y*I.4.AdDMdlDIPiVWLYAA 4.ViPI.Q*GGVPWG4.deliKP.RPLGKAAS...TI.G..CKSMGPKKVGT.WRTA G-SWde.PdGQDIGAbVAdQKV4Yi.G4454APGQRA.PS**.WRQLdKL.K4GSSN QRVbDWIdAV.SdQQIVSNT.VRA.WTAVCYSAIIFTTP.RVDVAV.KARAKL.QKYL CDdQKd.QA.YA.QA.VVT.TQPPV..RWbbDA.YDdDVVKdDAbYSWLSSKDPAEQQ GKGVAAKSViAbbKW.R4A444534V" CDNA: ' :c:aga:ggg gg:cc:gggc cccaggg:g: gcagccactg acttggggac tgctggtggg 6; g:agggatga gggagggagg ggcattg:ga tgtacagggc tgctctgtga gatcaagggt l2; aggg tgggagctgg ggcagggact acgagagcag gggc tgaaagtgga l8; actcaagggg tttctggcac ctacctacct gcttcccgct tggg gagttggccc 24; agagtcttaa gattggggca gggtggagag gtgggctctt cctgcttccc actcatctta ; tagctttctt tccccagatc cgaattcgag atccaaacca aaag gatatcacag 36; aggagatcat gtctggggcc cgcactgcct ccacacccac ccag acgggaggcg 42; gtctggagcc tcaagctaat ggggagacgc ttgc tgtcattgtc cggccagatg 48; accggtcaca gggagcaatc attgctgacc ggccagggct ccca gagcatagcc 54; cttcagaatc ccagccttcg tcgccttctc catc accatcccca gtcttggaac 60; cggggtctga gcctaatctc gcagtcctct ctattcctgg ggacactatg acaactatac 66; aaatgtctgt agaagaatca acccccatct aaac tggggagcca tatcgcctct 72; ctccagaacc cactcctctc gccgaaccca tactggaagt agaagtgaca aaac 78; cggttccaga atctgagttt tcttccagtc ctctccaggc tcccacccct ttggcatctc 84; acacagtgga aattcatgag ggca 
tggtcccatc tctg gaaccagagg 90; tggagtcaag cccagagctt gctcctcccc cagcttgccc ctccgaatcc cctgtgccca 96; ttgctccaac tgcccaacct gaggaactgc tcaacggagc cccctcgcca ccagctgtgg ;O2; acttaagccc agtcagtgag ccagaggagc aggccaagga ggtgacagca tcaatggcgc ;O8; cccccaccat cccctctgct actccagcta cggctccttc agctacttcc ccagctcagg ;14; aggaggaaat ggaagaagaa gaagaagagg aagaaggaga agcaggagaa gcaggagaag ;20; ctgagagtga gaaaggagga gaggaactgc tccccccaga gagtacccct attccagcca ;26; ctca gaatttggag gcagcagcag ccactcaagt ggcagtatct gtgccaaaga ;32; ggagacggaa ggag ctaaataaga aggaggctgt tggagacctt ctggatgcct ;38; tcaaggaggc gaacccggca gtaccagagg tggaaaatca gcctcctgca ggcagcaatc ;44; caggcccaga gtctgagggc agtggtgtgc ccccacgtcc tgaggaagca gatgagacct ;50; gggactcaaa ggaagacaaa attcacaatg acat ccagcccggg gaacagaagt L56; ataa gtcagatcag tggaagcctc taaacctaga ggagaaaaaa cgttacgacc L62; gtgagttcct gcttggtttt cagttcatct ttgccagtat gcagaagcca ttgc L68; cacatatcag tgacgtggtg ctggacaagg aaac accactgcgg ccactggatc L74; ccactagact acaaggcata ggcc cagacttcac tccatccttt gccaaccttg L80; gccggacaac ccttagcacc cgtgggcccc caaggggtgg gccaggtggg gagctgcccc L86; gtgggccgca ggctggcctg ggaccccggc gctctcagca gggaccccga aaagaaccac L92; gcaagatcat tgccacagtg accg aagatataaa actgaacaaa aaag L98; cctggaaacc cagcagcaag ngacggcgg ctgataagga tcgaggggaa gaagatgctg 204; gcaa aacccaggac ctattccgca gggtgcgctc catcctgaat aaactgacac 210; cccagatgtt ccagcagctg atgaagcaag tgacgcagct ggccatcgac accgaggaac 216; gcctcaaagg ggtcattgac ctcatttttg agaaggccat ttcagagccc aacttctctg 222; tggcctatgc caacatgtgc cgctgcctca tgaa agtgcccact acggaaaagc 228; tgac tgtgaacttc cgaaagctgt tgttgaatcg atgtcagaag gagtttgaga 234; aagacaaaga tgatgatgag gtttttgaga agaagcaaaa agagatggat gaagctgcta 240; cggcagagga acgaggacgc ctgaaggaag agctggaaga ggctcgggac atagcccggc 246; cttt agggaatatc aagtttattg gagagttgtt caaactgaag acag 252; aggcaataat gcatgactgt gtggtcaaac tgcttaagaa ccatgatgaa gagtcccttg 258; agtgcctttg tcgtctgctc accaccattg gcaaagacct ggactttgaa aaagccaagc 
264; cccgaatgga tcagtatttc aaccagatgg aaaaaatcat aaag aagacgtcat 270; cccgcatccg ctttatgctg caggacgtgc tggatctgcg agggagcaat tgggtgccac 276; gccgagggga tcagggtccc aagaccattg accagatcca taaggaggct gagatggaag 282; gaga gcacatcaaa gtgcagcagc tcatggccaa gggcagtgac aagcgtcggg 288; ctcc aggccctccc atcagccgtg gacttcccct tgtggatgat ggtggctgga 294; acacagttcc catcagcaaa ggtagccgcc ccattgacac ctcacgactc accaagatca 300; ccaagcctgg ctccatcgat tctaacaacc agctctttgc acctggaggg cgactgagct 306; ggggcaaggg cagcagcgga ggctcaggag cctc agacgcagca tcagaagctg 312; ctcgcccagc tactagtact ttgaatcgct tctcagccct agcg gtacccacag 318; aaagcacaga taatagacgt caga ggagtagctt gagccgagaa cgaggcgaga 324; aagctggaga ccgaggagac cgcctagagc ggagtgaacg ggac cgtggggacc 330; ggcttgatcg tgcgcggaca cctgctacca agcggagctt cagcaaggaa gtggaggagc 336; ggagtagaga acggccctcc cagcctgagg ggctgcgcaa ggcagctagc ctcacggagg 342; atcgggaccg tgggcgggat gccgtgaagc gagaagctgc cctaccccca gtgagccccc 348; tgaaggcggc tctctctgag gaggagttag agaagaaatc caaggctatc attgaggaat 354; atctccatct caatgacatg aaagaggcag gcgt gcaggagctg gcctcaccct 360; ccttgctctt catctttgta cggcatggtg tcgagtctac gctggagcgc agtgccattg 366; ctcgtgagca tatggggcag ctgctgcacc agctgctctg tgctgggcat ctgtctactg 372; ctcagtacta ccaagggttg tatgaaatct tggaattggc tgaggacatg gaaattgaca 378; tcccccacgt gtggctctac ctagcggaac tggtaacacc gcag gaaggtgggg 384; tgcccatggg ggagctgttc agggagatta caaagcctct gagaccgttg ggcaaagctg 390; cttccctgtt gctggagatc ctcc tgtgcaaaag catgggtcct aaaaaggtgg 396; tgtg agcc gggcttagct ggaaggaatt tctacctgaa ggccaggaca 102; ttggtgcatt cgtcgctgaa cagaaggtgg agtataccct gggagaggag tcggaagccc 108; ctggccagag ggcactcccc tccgaggagc tgaacaggca gaag ctgctgaagg 114; agggcagcag taaccagcgg gtgttcgact ggatagaggc caacctgagt caga 120; tagtatccaa cacgttagtt cgagccctca tgacggctgt ctgctattct gcaattattt 126; ctcc cctccgagtg gacgttgcag tgctgaaagc gcgagcgaag ctgctgcaga 132; aatacctgtg tgacgagcag ctac aggcgctcta cgccctccag gcccttgtag 138; taga acagcctccc aacctgctgc 
ggatgttctt tgacgcactg tatgacgagg Z44; acgtggtgaa tgcc ttctacagtt gggagagtag caaggacccc gctgagcagc 150; agggcaaggg tgtggccctt aaatctgtca cagccttctt caagtggctc gcag 156; aggaggagtc tgaccacaac tgagggctgg tggggccggg gacctggagc cccatggaca Z62; cacagatggc ccggctagcc gcctggactg caggggggcg gcagcagcgg ngtggcagt Z68; gggtgcctgt agtgtgatgt gtctgaacta ataaagtggc tgaagaggca ggatggcttg Z74; gggctgcctg ggcccccctc gccg gtcc ctctcctccc cctggggcac Z80; agagatatat tatatataaa gtcttgaaat ttggtgtgtc ttggggtggg gaggggcacc Z86; aacgcctgcc cctggggtcc ttttttttat tttctgaaaa tcgg gactgccgtc Z92; ctcgctgctg ggggcatatg ccccagcccc accc ctgctgttgc ctgggcaggg Z98; ggaagggggg gcacggtgcc tgtaattatt aaacatgaat tcaattaagc tcaaaaaaaa 504; aaaaaaaaaa LOCUS NM_182917 (isoform 1) AA /translation="MNKAPQSTGPPPAPSPGLPQPAFPPGQTAPVVFSTPQATQMNTP SQPRQHFYPSRAQPPSSAASRVQSAAPARPGPAAHVYPAGSQVMMIPSQISYPASQGA YYIPGQGRSTYVVP'"QQYPVQPGAPGFYPGASPT EFGTYAGAYYPAQGVQQFP'"GVAP APV 4WNQPPQIAPKR QDPWQGGKDIi L *IWSGA R ASiP PPQiGGG-4PQ LL VRPD D QSQGAIIA D?PGLPGP L {SP5 *SQPSSPSP' PSPSPVLEPGS .AV-SIPGDLW J. IQMSV L *SiP ISR‘iG PYRL .SP‘P PLA‘P I-‘V‘V -SKP *bSSSP-QAP P-ASiiV‘Ii4 PWGWVPS 4 3L *P‘V‘SSP L -APPPACPSESPV AQPL *--VGAPSPPAVD-SPVS‘P 4 *QAK‘V iASWAPPTIPSATPATAPSATS 4444444 4 G‘AG‘AG‘A *KGG**--PP *SiPIPAN-SQN.TAAAATQVA QKIK 4 -WKK4AVG -DAFKTAVPAVP4V *NQPPAGSVPGP *SdGSGVPPR DSK 4 DKI HWA 4QKY4YKS DQWKPLV. 4 *KKRYDR'F4 -GFQFIF {ISDVV P-DPT? .QGIVCGP DFTPSFANAGRTTLSTRG QGPQAGLGP QKEPRKIIA V -VKA DADGSK"QD .VK-'"PQWFQQ .KVPi *KP V 4 .44ARDIA QR J. TTIGKJ-DFTKAKP QGSNWVPQRGDQGPK"IDQI{K 4A KG. DGGWVTVPISKGSRPI y"S ASWGKGSSGGSGAKPS DAASLAA RPA Si-V RESA *KAGDQGD? .dRS *QGGD DKARLPA K RG? DAVKQTAA-PPVSP-KAA-S4 4 4 T-ASPS--b ItVRiGV‘SiL 4 RSAIART {WGQ.
4 DM*I {VWLY-A 4 -V1 PI *GGVPWG 4 .ER .CKSMGPKKVGT-WQTAG-SWK -P*GQ DIGAEVA 4 .WRQL .K‘GSSNQRVE *AV-S *QQIVSN DVAV-KARAKL.QKYLCD*QK L .QA-YA .QA .VVTE DAEYSWLSSKDPAEQQGKGVA 4KSViAbbKW .R‘A L *S CDNA: :cacttgcct gaaaccggc cc:cgacggc cgccgcccgc c:ggcct I agggcc :gac 6; cctt cctggcctac ac:cctgggc ggcggcaggc c:agcttctg gcccag :gcg 12; ccgg cggcaggcgt atcctgtgtg cccctgggcc aggcccgaac ccggtg :ccc 18; nggtggggg gtggggacgc cgaa gcagctagct ccgttcgtga tccgggagcc 24; tggtgccagc gagacctgga atttccggtc tggttggtct ggggccccgc ggagccaggt ; tgataccctc acctcccaac cccaggccct cggatgccca gaacctgtag gccgcaccgt 36; ggacttgttc ttaatcgagg gggtgctggg gggaccctga tgtggcacca atga 42; acaaagctcc acagtccaca ggccccccac ccgccccatc ccccggactc ccacagccag 48; cgtttccccc ggggcagaca gcgccggtgg tgttcagtac agcg acacaaatga 54; acacgccttc tcagccccgc cagcacttct accctagccg ggcccagccc ccgagcagtg 60; cagcctcccg agtgcagagt cctg cccgccctgg tgcc catgtctacc 66; ctgctggatc ccaagtaatg atgatccctt cccagatctc ctacccagcc tcccaggggg 72; cctactacat ccctggacag gggcgttcca catacgttgt acag cagtaccctg 78; tgcagccagg agccccaggc ttctatccag gtgcaagccc attt gggacctacg 84; ctggcgccta ctatccagcc caaggggtgc agcagtttcc cactggcgtg gcccccgccc 90; cagttttgat gcca ccccagattg ctcccaagag taag acgatccgaa 96; ttcgagatcc aaaccaagga ggaaaggata agga gatcatgtct ggggcccgca L02; ctgcctccac acccacccct ccccagacgg gaggcggtct ggagcctcaa gctaatgggg L08; agacgcccca ggttgctgtc attgtccggc cagatgaccg ggga gcaatcattg L14; ctgaccggcc agggctgcct ggcccagagc atagcccttc agaatcccag ccttcgtcgc L20; cttctccgac acca tccccagtct tggaaccggg gtctgagcct aatctcgcag L26; ctat tcctggggac acaa ctatacaaat gtctgtagaa gaatcaaccc L32; ccatctcccg tgaaactggg gagccatatc ctcc agaacccact cctctcgccg L38; aacccatact ggaagtagaa ctta gcaaaccggt tccagaatct gagttttctt L44; ccagtcctct tccc acccctttgg catctcacac agtggaaatt catgagccta L50; atggcatggt cccatctgaa gatctggaac tgga gtcaagccca gagcttgctc L56; ctcccccagc ttgcccctcc gaatcccctg tgcccattgc tccaactgcc gagg L62; aactgctcaa cggagccccc 
tcgccaccag ctgtggactt aagcccagtc agtgagccag L68; aggagcaggc caaggaggtg acagcatcaa tggcgccccc caccatcccc tctgctactc L74; cggc agct acttccccag ctcaggagga ggaaatggaa gaagaagaag L80; aagaggaaga aggagaagca ggagaagcag ctga gagtgagaaa ggaggagagg L86; aactgctccc cccagagagt acccctattc cagccaactt gtctcagaat ttggaggcag L92; cagcagccac tcaagtggca gtgc caaagaggag acggaaaatt aaggagctaa L98; ataagaagga ggctgttgga gaccttctgg atgccttcaa ggaggcgaac ccggcagtac 204; cagaggtgga aaatcagcct cctgcaggca gcaatccagg cccagagtct gagggcagtg 210; cccc acgtcctgag gaagcagatg agacctggga ctcaaaggaa gacaaaattc 216; acaatgctga gaacatccag cccggggaac agaagtatga atataagtca gatcagtgga 222; agcctctaaa cctagaggag aaaaaacgtt acgaccgtga gttcctgctt ggttttcagt 228; tcatctttgc cagtatgcag aagccagagg gattgccaca tatcagtgac gtggtgctgg 234; acaaggccaa taaaacacca ctgcggccac tggatcccac tagactacaa ggcataaatt 240; gtggcccaga cttcactcca tcctttgcca accttggccg gacaaccctt agcacccgtg 246; ggcccccaag gcca ggtggggagc tgccccgtgg gccgcaggct ggcctgggac 252; cccggcgctc tcagcaggga ccccgaaaag aaccacgcaa gatcattgcc acagtgttaa 258; tgaccgaaga tataaaactg aacaaagcag cctg gaaacccagc agcaagcgga 264; cggcggctga taaggatcga ggggaagaag atgctgatgg cagcaaaacc caggacctat 270; tccgcagggt gcgctccatc ctgaataaac tgacacccca gatgttccag cagctgatga 276; agcaagtgac gcagctggcc atcgacaccg aggaacgcct caaaggggtc attgacctca 282; tttttgagaa ggccatttca gagcccaact tggc ctatgccaac atgtgccgct 288; gcctcatggc gctgaaagtg cccactacgg aaaagccaac agtgactgtg aacttccgaa 294; agctgttgtt gaatcgatgt cagaaggagt aaga caaagatgat gatgaggttt 300; ttgagaagaa gcaaaaagag atggatgaag ctgctacggc agaggaacga ggacgcctga 306; aggaagagct ggaagaggct cgggacatag CCngngCg ctctttaggg aatatcaagt 312; ttattggaga gttgttcaaa atgt taacagaggc aataatgcat gactgtgtgg 318; tcaaactgct taagaaccat gatgaagagt agtg cctttgtcgt acca 324; ccattggcaa agacctggac tttgaaaaag ccaagccccg aatggatcag tatttcaacc 330; agatggaaaa aatcattaaa gaaaagaaga cgtcatcccg catccgcttt atgctgcagg 336; acgtgctgga tctgcgaggg agcaattggg 
tgccacgccg aggggatcag ggtcccaaga 342; ccattgacca gatccataag gaggctgaga tggaagaaca tcgagagcac atcaaagtgc 348; agcagctcat ggccaagggc agtgacaagc gcgg tcctccaggc atca 354; gccgtggact tccccttgtg gatgatggtg gctggaacac agttcccatc agcaaaggta 360; gccgccccat tgacacctca cgactcacca agatcaccaa gcctggctcc atcgattcta 366; acaaccagct ctttgcacct ggagggcgac tgagctgggg cagc agcggaggct 372; caggagccaa gccctcagac gcagcatcag aagctgctcg cccagctact agtactttga 378; atcgcttctc tcaa gtac ccacagaaag cacagataat agacgtgtgg 384; tgcagaggag tagcttgagc cgagaacgag gcgagaaagc tggagaccga ggagaccgcc 390; tagagcggag tgaacgggga ggggaccgtg ggct tgatcgtgcg cctg 396; ctaccaagcg gagcttcagc gtgg aggagcggag tagagaacgg cagc 102; ctgaggggct gcgcaaggca gctagcctca atcg ggaccgtggg nggatgccg 108; tgaagcgaga agctgcccta cccccagtga gccccctgaa ggcggctctc gagg 114; agttagagaa gaaatccaag gctatcattg aggaatatct ccatctcaat gacatgaaag 120; tcca gtgcgtgcag gcct caccctcctt gctcttcatc cggc 126; atggtgtcga gtctacgctg gagcgcagtg ccattgctcg tgagcatatg gggcagctgc 132; tgcaccagct gctctgtgct gggcatctgt ctactgctca gtactaccaa gggttgtatg 138; aaatcttgga attggctgag gacatggaaa ttgacatccc ccacgtgtgg ctctacctag Z44; cggaactggt aacacccatt ctgcaggaag gtggggtgcc catgggggag ctgttcaggg 150; agattacaaa gcctctgaga ccgttgggca aagctgcttc cctgttgctg gagatcctgg 156; gcctcctgtg caaaagcatg ggtcctaaaa aggtggggac gctgtggcga gaagccgggc Z62; ttagctggaa ggaatttcta cctgaaggcc aggacattgg tgcattcgtc gctgaacaga 468; aggtggagta taccctggga gaggagtcgg aagcccctgg ccagagggca ctcccctccg 474; aggagctgaa caggcagctg gagaagctgc tgaaggaggg cagcagtaac cagcgggtgt 480; tcgactggat agaggccaac ctgagtgagc agcagatagt atccaacacg ttagttcgag 486; ccctcatgac ggctgtctgc tattctgcaa ttga gactcccctc Cgagtggacg 492; ttgcagtgct gaaagcgcga gcgaagctgc tgcagaaata cctgtgtgac gagcagaagg 498; aggc gctctacgcc ctccaggccc ttgtagtgac cttagaacag aacc 504; tgctgcggat gttctttgac gcactgtatg acgaggacgt ggtgaaggag gatgccttct 510; acagttggga gagtagcaag gaccccgctg agcagcaggg caagggtgtg aaat 516; ctgtcacagc 
cttcttcaag tggctccgtg aagcagagga ggagtctgac cacaactgag 522; tggg gacc tggagcccca tggacacaca gatggcccgg gcct 528; ggactgcagg ggggcggcag cagcggcggt ggcagtgggt gcctgtagtg tgatgtgtct 534; gaactaataa agtggctgaa ggat ggcttggggc tgcctgggcc cccctccagg 540; atgccgccag gtgtccctct cctccccctg gggcacagag atatattata tataaagtct 546; tgaaatttgg tgtgtcttgg ggtggggagg ggcaccaacg cctgcccctg gggtcctttt 552; ttttattttc tcac tctcgggact gccgtcctcg ctgctggggg catatgcccc 558; agcccctgta ccacccctgc ctgg gcagggggaa gggggggcac ggtgcctgta 564; attattaaac atgaattcaa ttaagctcaa aaaaaaaaaa aaaaaa LOCUS NM_198241 (isoform 5) AA /translation="MNKAPQSTGPPPAPSPG;PQPAFPPGQ"APVVFSTPQATQMNTP SQPQQHFYPSRAQPPSSAASRVQSAAPARPGPAA{VYPAGSQVMMIPSQISYPASQGA YYIPGQGRSTYVVP"QQYPVQPGAPGFYPGASPTEFGTYAGAYYPAQGVQQFP"GVAP APVAWNQPPQIAPKRERKTIRIRDPVQGGKDIL L *IWSGA? AS P PPQiGGG-‘PQ w2QU PQVAVIVQPDDQSQGAIIADRPGLPGPL iSPS*SQPSSPSP"PSPSPVLEPGS TPW-AV-SIPGDLW iIQMSV U] iPISR‘iG "U K: 5U -SP*P PLA‘PI-‘V‘V -SKP VP‘S‘ESSSP-QAP P-ASiiV*Ii*PWGWVPS*DL‘P‘V‘SSP L -APPPACPSESPV PIAP AQPL *--WGAPSPPAVD-SPVS*P*dQAK*ViASWAPPTIPSATPATAPSATS W********G*AG*AG*A*S*KGG**--PP*SiPIPAN-SQW-TAAAATQVA VSVPKRQRKIKd-VKK*AVGD--DAFKTAWPAVP*V‘VQPPAGSWPGP‘S*GSGVPPR P‘4A34iWDSK4DKIHWA‘WIQPG‘QKY‘YKSDQWKP-N-**KK?YDQTF -GFQFIF ASWQKPTG-P{ISDVV;DKANKTP-RP.3P"Q-QGIVCGPDFTPSFAV-G?TT-STRG ww NG) G)w G)G) u'-PQGPAG;GPQRSQQGPRKEPRKIIA V-Mi‘DIK-WKATKAWKPSSKR TAADKJ?G**DADGSK"QD;FRQVQSI;WKL"PQWFQQLWKQVLQ-AIDi L I DT.IJ: *KAIS* PW]: VICRC 34A -KVP *KP VLVWJ: QK -N RCQK‘ J: * KDKD - - DD‘Vb‘KKQK‘WJ‘AA A**RG?-K**-**A?DIAQRQS;GWIKEIG4-bK-KWLL*A IMiDCVVK--KV{D**SL4CLC?--TTIGKD.3FTKAKPRWDQYEVQW4KIIK‘KKLS SRIQFWLQDV-D-?GSNWVPRRGDQGPK"IDQIHK*AdW4di?*{IKVQQ;WAKGSDK RRGGPPGPPISQG-P-VDDGGWWTVPISKGSQPID"SQ;TKITKPGSIDSWVQ;FAPG GRASWGKGSSGGSGAKPSDAASLAARPA Si-WRESA-QQAVPTESTDVQRVVQRSS; SR*?G*KAGDQGDRL*RS*RGGDRGDRADRARLPA KRSESK*V**?SR*QPSQP4G.

RKAAS-TT QDRGRDAVKQTAA-PPVSP-KAA-S***-*KKSKAII**Y-{-WDMKTA VQCVQT-ASPS--bIbVRiGV*SiL4RSAIAQTHWGQ-.HQL-CAGi-STAQYYQG;Y *I-*-A*DM*IDIPHVWLY-A*-ViPI-Q*GGVPWG*-bR*IiKP-?P-GKAASL..T ILG--CKSMGPKKVGTLWQTAG-SWK*E-P*GQDIGAEVA QKV‘Yi-G**S*APGQ?L ALPSA. 4LNRQLdKLLKdGSSNQRVbDWI*ANLS*QQIVSNTAVRALMTAVCYSAIIFE TPLRVDVAVLKARAKL .QKYLCD‘QKdLQALYA-QA-VVT.7QPPNLLRMFFDALYDE DVVKLL JAEYSWLSSKDPAEQQGKGVAAKSVLAEbKW-R*A***SDHN CDNA: 1 cggcggcgca gatcgcccgg cgcggctccg ccccc :gcgc cgtg ggggcgccgg 6; ctgcgcctgc ggagaagcgg ccga gcgggatctg tgcggggagc cggaaatggt 12; tgtggactac gtctgtgcgg ctgcgtgggg ctcggccgcg cggactgaag gagactgaag 18; gccctcggat gcccagaacc tgtaggccgc accgtggact tgttcttaat Cgagggggtg 24; ctggggggac cctgatgtgg caccaaatga aatgaacaaa gctccacagt gccc ; cccacccgcc ccatcccccg gactcccaca gccagcgttt gggc agacagcgcc 36; ggtggtgttc agtacgccac aagcgacaca aatgaacacg cagc cccgccagca 42; cttctaccct agccgggccc agcccccgag cagtgcagcc tcccgagtgc agagtgcagc 48; ccctgcccgc cctggcccag ctgcccatgt ctaccctgct ggatcccaag taatgatgat 54; cccttcccag atctcctacc cagcctccca ctac tacatccctg gacaggggcg 60; ttccacatac gttgtcccga cacagcagta ccctgtgcag ccaggagccc caggcttcta 66; tccaggtgca agccctacag aatttgggac ctacgctggc gcctactatc aagg 72; ggtgcagcag tttcccactg gcgtggcccc agtt ttgatgaacc agccacccca 78; gattgctccc aagagggagc gtaagacgat ccgaattcga gatccaaacc aaggaggaaa 84; ggatatcaca atca tgtctggggc ccgcactgcc tccacaccca cccctcccca 90; gacgggaggc ggtctggagc ctcaagctaa tggggagacg ccccaggttg ctgtcattgt 96; ccggccagat gaccggtcac agggagcaat cattgctgac gggc gccc L02; agagcatagc ccttcagaat cccagccttc gtcgccttct ccgaccccat caccatcccc L08; agtcttggaa ccggggtctg agcctaatct cgcagtcctc tctattcctg gggacactat L14; gacaactata caaatgtctg tagaagaatc aacccccatc tcccgtgaaa ctggggagcc L20; atatcgcctc tctccagaac ccactcctct cgccgaaccc atactggaag tagaagtgac L26; acttagcaaa ccggttccag agtt ttcttccagt cctctccagg cccc L32; tttggcatct cacacagtgg aaattcatga gcctaatggc atggtcccat ctgaagatct L38; agag gtggagtcaa gcccagagct tgctcctccc 
ccagcttgcc cctccgaatc L44; ccctgtgccc attgctccaa ctgcccaacc tgaggaactg ctcaacggag ccccctcgcc L50; accagctgtg gacttaagcc cagtcagtga ggag caggccaagg aggtgacagc L56; atcaatggcg ccccccacca ctgc tactccagct acggctcctt cagctacttc L62; cccagctcag gaggaggaaa tggaagaaga agaagaagag gaagaaggag aagcaggaga L68; agcaggagaa gctgagagtg gagg agaggaactg ctccccccag agagtacccc 174; tattccagcc aacttgtctc agaatttgga ggcagcagca gccactcaag tggcagtatc 180; tgtgccaaag aggagacgga aaattaagga gctaaataag aaggaggctg ttggagacct 186; tctggatgcc ttcaaggagg cgaacccggc agtaccagag gtggaaaatc ctgc 192; aggcagcaat ccaggcccag agtctgaggg cagtggtgtg cccccacgtc aagc 198; agatgagacc tgggactcaa aggaagacaa aattcacaat gctgagaaca tccagcccgg 204; ggaacagaag tatgaatata agtcagatca gtggaagcct ctaaacctag aggagaaaaa 210; acgttacgac ttcc tgcttggttt tcagttcatc tttgccagta tgcagaagcc 216; agagggattg ccacatatca gtgacgtggt gctggacaag gccaataaaa caccactgcg 222; ggat cccactagac tacaaggcat aaattgtggc ccagacttca ctccatcctt 228; tgccaacctt ggccggacaa cccttagcac ccgtgggccc ccaaggggtg ggccaggtgg 234; ggagctgccc Cgtgggccgg ctggcctggg accccggcgc tctcagcagg gaaa 240; acgc aagatcattg ccacagtgtt aatgaccgaa gatataaaac tgaacaaagc 246; agagaaagcc tggaaaccca gcagcaagcg gacggcggct gataaggatc gaggggaaga 252; agatgctgat ggcagcaaaa cccaggacct attccgcagg gtgcgctcca tcctgaataa 258; actgacaccc cagatgttcc agcagctgat gaagcaagtg acgcagctgg ccatcgacac 264; cgaggaacgc ctcaaagggg acct catttttgag aaggccattt cagagcccaa 270; cttctctgtg gcctatgcca acatgtgccg catg gcgctgaaag tgcccactac 276; ggaaaagcca acagtgactg tgaacttccg aaagctgttg ttgaatcgat gtcagaagga 282; gtttgagaaa gacaaagatg atgatgaggt ttttgagaag aagcaaaaag agatggatga 288; agctgctacg gcagaggaac gaggacgcct gaaggaagag gagg ctcgggacat 294; agcccggcgg cgctctttag ggaatatcaa gtttattgga gagttgttca aactgaagat 300; gttaacagag gcaataatgc atgactgtgt ggtcaaactg cttaagaacc atgatgaaga 306; gtcccttgag tgcctttgtc gtctgctcac caccattggc aaagacctgg actttgaaaa 312; agccaagccc cgaatggatc agtatttcaa ccagatggaa aaaatcatta 
aagaaaagaa 318; gacgtcatcc cgcatccgct ttatgctgca ggacgtgctg gatctgcgag ggagcaattg 324; ggtgccacgc cgaggggatc agggtcccaa gaccattgac cagatccata ctga 330; agaa catcgagagc aagt gcagcagctc atggccaagg gcagtgacaa 336; gggc ggtcctccag gccctcccat tgga cttccccttg tggatgatgg 342; tggctggaac acagttccca tcagcaaagg tagccgcccc attgacacct cacgactcac 348; caagatcacc aagcctggct ccatcgattc ccag gcac ctggagggcg 354; actgagctgg ggcaagggca gcagcggagg ctcaggagcc aagccctcag acgcagcatc 360; agaagctgct cgcccagcta ctagtacttt gaatcgcttc tcagcccttc aacaagcggt 366; acccacagaa gata atagacgtgt ggtgcagagg ttga gccgagaacg 372; aggcgagaaa gacc gaggagaccg cctagagcgg agtgaacggg gaggggaccg 378; tggggaccgg cttgatcgtg cgcggacacc tgctaccaag cggagcttca gcaaggaagt 384; ggaggagcgg agtagagaac ggccctccca gcctgagggg ctgcgcaagg cagctagcct 390; cacggaggat cgggaccgtg ggcgggatgc cgtgaagcga gaagctgccc tacccccagt 396; gagccccctg aaggcggctc agga ggagttagag aagaaatcca aggctatcat 102; tgaggaatat ctccatctca tgaa agaggcagtc cagtgcgtgc aggagctggc 108; ctcaccctcc ttgctcttca tctttgtacg tgtc gagtctacgc tggagcgcag 114; tgccattgct cgtgagcata agct gctgcaccag ctgctctgtg ctgggcatct 120; gtctactgct cagtactacc aagggttgta tgaaatcttg gctg aggacatgga 126; aattgacatc ccccacgtgt ggctctacct agcggaactg ccca ttctgcagga Z32; aggtggggtg cccatggggg tcag ggagattaca aagcctctga gaccgttggg Z38; caaagctgct tccctgttgc tggagatcct gggcctcctg agca tgggtcctaa Z44; aaaggtgggg acgctgtggc gagaagccgg ctgg aaggaatttc tacctgaagg 150; ccaggacatt ggtgcattcg tcgctgaaca gaaggtggag tataccctgg gagaggagtc 156; ggaagcccct ggccagaggg cactcccctc cgaggagctg aacaggcagc tggagaagct Z62; gctgaaggag ggcagcagta accagcgggt gttcgactgg atagaggcca acctgagtga Z68; gcagcagata gtatccaaca cgttagttcg agccctcatg acggctgtct gctattctgc Z74; aattattttt gagactcccc tccgagtgga cgttgcagtg ctgaaagcgc gagcgaagct 180; gctgcagaaa tacctgtgtg acgagcagaa ggagctacag gcgctctacg ccctccaggc 186; ccttgtagtg accttagaac agcctcccaa cctgctgcgg atgttctttg acgcactgta Z92; tgacgaggac gtggtgaagg aggatgcctt 
ctacagttgg gagagtagca aggaccccgc Z98; tgagcagcag ggcaagggtg tggcccttaa atctgtcaca ttca agtggctccg 504; tgaagcagag gaggagtctg accacaactg agggctggtg gggccgggga cctggagccc 510; catggacaca cagatggccc ggctagccgc tgca ggggggcggc agcagcggcg 516; gtgg gtgcctgtag gtgt ctgaactaat aaagtggctg aagaggcagg 522; atggcttggg gctgcctggg cccccctcca ggatgccgcc aggtgtccct ctcctccccc 528; tggggcacag atta tatataaagt cttgaaattt ggtgtgtctt ggggtgggga 534; ggggcaccaa cgcctgcccc tggggtcctt ttttttattt tctgaaaatc actctcggga 5401 ctgccgtcct cgctgctggg ggcatatgcc ccagcccctg taccacccct gctgttgcct 5461 gggcaggggg aagggggggc acggtgcctg ttaa acatgaattc aattaagctc 5521 aaaaaaaaaa aaaaaaaa LOCUS NM_198242 (isoform 3) AA /translation="MNQPPQIAPKRIQKTIRIL DPNQGGKJII L 4 a *IWSGARIASTPTP PQIGGG *PQANG 4 VRPDDRSQGAIIAD RPGLPGP L *SQPSSPSPTP SPSPVL*PGS*PW-AV-SIPGDTW"TIQWSV L 1? ISR‘IG .SP *PiPLA‘PI .4V‘V1-SKPVP4 S 4 tSSSP-QAP P-ASiiV 4? *V‘SSP L -AP ESPVPIAP AQP L *--WGAPSPPAVJ 1ASWAPPTIPSA iPAiAPSAISPAQ L 4 4W *4 4 *G‘AG‘AG *SIPIPANASQV .TAAAATQVAVSVPK R? 4 VKK‘AVGD. *VQPPAGSVPGP *S‘GSGVPPRP L *AD 4 4 DKIHVA‘VIQPG .N. 4 *KKRYD? TF--GFQFIFASWQKP' I SDVVJDKANKTP .QGIWCGP DFTPSFAV .GQTT-STRGPPQGGP QGPAGAGPQRSQQGP A V .M1 4 DIK-VKA EKAWKPSSKRTAADK D ADGSK'"QDJFRQVQS I AWKL"PQWFQQLWKQVTQ 1AI *R-KGVIDLIE *PVESVAYAVWCRC-WA *KP V 1VVE RK---NQ *KDKDDD D‘AA A 4 *RGR-K 4 QDIAQR RS IGE TAIMi {D 4 *SLdCLCQ. .TTIGKD.
3FTKAKP DQYFVQW isSRI D-RGSNWVPRRG DQGPK"IDQI {K {IKV 3K QG-P-VDDGGWWTVP ISKGSQPID"S GGSGAKPSDAAS.AA RPA Si-VRESA RGDR-*RS*RGGDRG DKARIPA RGRDAVKRTAA .PPVSP .KAA-S4 PS--bIbVR{GV*SIL 4 QSAIAQ'4WGQ4 3M iVWLY-A 4 -VIP I *GGVPWG 4 .G..CKSMGPKKVGT-WQTAG .P‘GQ DIGAEVA 4 4 .VRQL -K*GSSNQ *AV-S *QQIVSNT TIP-XV DVAV-KARAKL.QKYLCD*QK L .QA-YA-QA .VVTE *DVVK L DAEYSWLSSKDPAEQQGKGVAAKSViAbbKW .R 4A 4 *S CDNA: 1 cgca ga :cgcccgg cgcggctccg ccccc :gcgc cggtcacg :g ggggcgccgg 61 ctgcgcctgc ggagaagcgg tggccgccga gcgggatctg tgcggggagc cggaaatggt 121 ctac gtctgtgcgg ctgcgtgggg ctcggccgcg cggactgaag gagactgaag 181 gggcgttcca catacgttgt cccgacacag cagtaccctg tgcagccagg agccccaggc 241 ccag gtgcaagccc tacagaattt gggacctacg ctggcgccta ctatccagcc 301 caaggggtgc agcagtttcc cactggcgtg gccc cagttttgat gaaccagcca 361 ccccagattg ctcccaagag ggagcgtaag acgatccgaa atcc aaaccaagga 421 ggaaaggata tcacagagga gatcatgtct ggggcccgca ctgcctccac acccacccct 481 ccccagacgg tct tcaa gctaatgggg agacgcccca ggttgctgtc 541 cggc cagatgaccg gtcacaggga gcaatcattg ctgaccggcc agggctgcct 601 ggcccagagc atagcccttc agaatcccag ccttcgtcgc cttctccgac cccatcacca 661 gtct tggaaccggg gtctgagcct aatctcgcag ctat tcctggggac 721 actatgacaa ctatacaaat gtctgtagaa gaatcaaccc ccatctcccg tgaaactggg 78; tatc gcctctctcc agaacccact cctctcgccg tact ggaagtagaa 84; gtgacactta gcaaaccggt tccagaatct gagttttctt ccagtcctct ccaggctccc 90; acccctttgg catctcacac agtggaaatt catgagccta atggcatggt cccatctgaa 96; gatctggaac cagaggtgga gtcaagccca gagcttgctc ctcccccagc ttgcccctcc L02; gaatcccctg tgcccattgc tccaactgcc caacctgagg aactgctcaa cggagccccc L08; ccag ctgtggactt aagcccagtc agtgagccag aggagcaggc caaggaggtg L14; acagcatcaa tggcgccccc caccatcccc tctgctactc cagctacggc tccttcagct L20; acttccccag ctcaggagga ggaaatggaa gaagaagaag aagaggaaga agca L26; ggagaagcag gagaagctga gagtgagaaa ggaggagagg aactgctccc cccagagagt L32; acccctattc cagccaactt gtctcagaat ttggaggcag cagcagccac tcaagtggca L38; gtatctgtgc caaagaggag acggaaaatt aaggagctaa ataagaagga 
ggctgttgga L44; gaccttctgg atgccttcaa gaac ccggcagtac cagaggtgga aaatcagcct L50; cctgcaggca gcaatccagg cccagagtct gagggcagtg gtgtgccccc acgtcctgag L56; gaagcagatg agacctggga ctcaaaggaa gacaaaattc acaatgctga gaacatccag L62; cccggggaac agaagtatga atataagtca tgga taaa ggag L68; aaaaaacgtt acgaccgtga gttcctgctt ggttttcagt tcatctttgc cagtatgcag L74; aagccagagg gattgccaca tatcagtgac gtggtgctgg acaaggccaa taaaacacca L80; ctgcggccac tggatcccac tagactacaa ggcataaatt gtggcccaga cttcactcca L86; tcctttgcca accttggccg gacaaccctt agcacccgtg ggcccccaag gggtgggcca L92; ggtggggagc tgccccgtgg gccggctggc ctgggacccc ggcgctctca gcagggaccc L98; cgaaaagaac cacgcaagat cattgccaca gtgttaatga ccgaagatat aaaactgaac 204; aaagcagaga aagcctggaa cagc aagcggacgg cggctgataa ggatcgaggg 210; gaagaagatg ctgatggcag caaaacccag ttcc gcagggtgcg ctccatcctg 216; ctga caccccagat gttccagcag ctgatgaagc aagtgacgca catc 222; gacaccgagg aacgcctcaa aggggtcatt gacctcattt ttgagaaggc catttcagag 228; cccaacttct ctgtggccta tgccaacatg tgccgctgcc tcatggcgct gaaagtgccc 234; actacggaaa cagt gaac aagc tgttgttgaa tcgatgtcag 240; tttg agaaagacaa agatgatgat gaggtttttg agaagaagca aaaagagatg 246; gatgaagctg ctacggcaga agga cgcctgaagg aagagctgga agaggctcgg 252; gacatagccc ggcggcgctc tttagggaat atcaagttta ttggagagtt gttcaaactg 258; aagatgttaa cagaggcaat aatgcatgac tgtgtggtca aactgcttaa gaaccatgat 264; gaagagtccc ttgagtgcct ttgtcgtctg ctcaccacca aaga cctggacttt 270; gaaaaagcca agccccgaat ggatcagtat caga tggaaaaaat cattaaagaa 276; aagaagacgt catcccgcat ccgctttatg ctgcaggacg tgctggatct gcgagggagc 282; aattgggtgc cacgccgagg ggatcagggt cccaagacca agat ccataaggag 288; gctgagatgg aagaacatcg agagcacatc aaagtgcagc agctcatggc caagggcagt 294; gacaagcgtc ggggcggtcc tccaggccct cccatcagcc ttcc ccttgtggat 300; gatggtggct ggaacacagt tcccatcagc agcc gccccattga cacctcacga 306; ctcaccaaga tcaccaagcc catc gattctaaca accagctctt tgcacctgga 312; gggcgactga gctggggcaa gggcagcagc ggaggctcag gagccaagcc ctcagacgca 318; gcatcagaag ctgctcgccc agctactagt 
actttgaatc gcttctcagc ccttcaacaa 324; gcggtaccca cagaaagcac agataataga cgtgtggtgc agaggagtag cttgagccga 330; gaacgaggcg agaaagctgg agaccgagga ctag agcggagtga acggggaggg 336; gaccgtgggg accggcttga tcgtgcgcgg acacctgcta ccaagcggag cttcagcaag 342; gaagtggagg agcggagtag agaacggccc cctg aggggctgcg caaggcagct 348; agcctcacgg aggatcggga ccgtgggcgg gatgccgtga agcgagaagc tgccctaccc 354; ccagtgagcc ccctgaaggc ggctctctct gaggaggagt tagagaagaa atccaaggct 360; atcattgagg aatatctcca tctcaatgac atgaaagagg cagtccagtg Cgtgcaggag 366; ctggcctcac tgct cttcatcttt gtacggcatg gtgtcgagtc tacgctggag 372; cgcagtgcca ttgctcgtga gcatatgggg cagctgctgc accagctgct ctgtgctggg 378; catctgtcta ctgctcagta aggg ttgtatgaaa tcttggaatt ggctgaggac 384; atggaaattg acatccccca cgtgtggctc tacctagcgg aactggtaac acccattctg 390; caggaaggtg gggtgcccat gggggagctg ttcagggaga ttacaaagcc accg 396; ttgggcaaag ccct gttgctggag atcctgggcc tcctgtgcaa aagcatgggt 102; cctaaaaagg tggggacgct gtggcgagaa ctta agga atttctacct 108; gaaggccagg acattggtgc attcgtcgct gaacagaagg tggagtatac cctgggagag 114; gagtcggaag cccctggcca gagggcactc ccctccgagg agctgaacag gcagctggag 120; aagctgctga aggagggcag cagtaaccag nggtgttcg actggataga ggccaacctg 126; agtgagcagc agatagtatc caacacgtta gttcgagccc tcatgacggc tgtctgctat 132; atta tttttgagac tcccctccga gtggacgttg cagtgctgaa agcgcgagcg 138; ctgc agaaatacct gtgtgacgag gagc tacaggcgct ctacgccctc Z441 caggcccttg tagtgacctt agaacagcct ctgc tgcggatgtt ctttgacgca 1501 ctgtatgacg tggt gaaggaggat gccttctaca gttgggagag tagcaaggac 1561 cccgctgagc agcagggcaa gggtgtggcc tctg tcacagcctt cttcaagtgg Z621 ctccgtgaag cagaggagga gtctgaccac aactgagggc tggtggggcc ctgg Z681 agccccatgg acacacagat ggcccggcta gccgcctgga ctgcaggggg gcggcagcag Z741 tggc agtgggtgcc tgtagtgtga tgtgtctgaa ctaataaagt ggctgaagag 1801 gcaggatggc ttggggctgc ctgggccccc ctccaggatg ccgccaggtg tccctctcct 1861 ccccctgggg cacagagata tattatatat aaagtcttga aatttggtgt gtcttggggt Z921 ggggaggggc accaacgcct gcccctgggg tttt tattttctga aaatcactct Z981 
cgggactgcc gtcctcgctg ctgggggcat atgccccagc ccctgtacca cccctgctgt 5041 tgcctgggca gggggaaggg ggggcacggt gcctgtaatt attaaacatg aattcaatta 5101 agctcaaaaa aaaaaaaaaa aaa LOCUS NM_198244 (isoform 2) AA /:ransla:ion="MMIPSQISYPASQGAYYIPGQGRS"YVVPTQQYPVQPGAPGFYP GASPTEFGTYAGAYYPAQGVQQFP"GVAPAPV4WNQPPQIAPKRIRKTIRIRDPVQGGL‘J KDIi L *IWSGA? ASiP G-‘PQANG L PQVAVIVRPD u RSQGAIIADRPGL PGP L iSPS*SQPSSPSP"PSPSPVL*PGS*PW-AV-SIPGD W H IQMSV U] 1PISR *1G PYR-SP‘P PLA‘PI-‘V‘VL -SKPVP*S*ESSSP-QAP w -A541V*I{*PWG WVPS‘DL‘P‘V‘SSPL -APPPACPSESPVPIAP AQP L L --WGAPSPPAV3-SPVSTP d4QAK‘ViASWAPPiIPSAiPAiAPSAiSPAQ L *****G*AG*AG*A*S*KG G**--PP*SiPIPAN-SQW TAAAATQVAVSVPKRRQKIK4-WKK*AVGD--DAFKTA WPAVP‘V‘VQPPAGSWPGP‘S*GSGVPPRPL *AD‘iWDSK‘DKIHVA‘WIQPG‘QKY‘ YKSDQWKP-N-**KKRYDQTF--GFQFIFASWQKPTG-P{ISDVV4DKANKTP-RP.3 P"?AQGIVCGPDFTPSFAV-G?TT-STRGPPQGGPGGT-P?GPAG4GP?RSQQGPQKE LLJ'UN'U RKIIA V-Mi‘DIK-VKATKAWKPSSKRTAADKDQG**DADGSK"Q34FRRVQSIAVL"PQVIFQQLVIKQV1Q -AI 31 L * R -KGVI DT.IJ: * KAIS * PW]: SVAYAWVICRC MA -KV LKP V1VVERK---N?CQK*E*KDKDDD*VE*KKQK‘WJ‘AA A‘4RGR-Kdd-d ARDIAQRRSAGWIKEIG & W kE ; *AIMiDCVVK--KV{D**SL4CLC?--TTIG KD.3FTKAKPRWDQYEVQW*KIIK4KKiSSRIQEWLQDV.3-?GSNWVPRRGDQGPK" IDQIiK*A*W**{Rd{IKVQQAWAKGSDKRRGGPPGPPISRG-P-VDDGGWWTVPISK GSQPID"S?ATKITKPGSIDSWVQAFAPGGRASWGKGSSGGSGAKPSDAASLAAQPA ST-W?FSA-QQAVPTESTDVQRVVQRSS-SR*QG*KAGD?GD?-*RS*?GGDQGD?AD RAaiPA KRSESK*V**?SQ*QPSQP*G-?KAAS-TT AVKQTAA-PPVSP- KAA-S***-*KKSKAII**Y-i-V3WKTAVQCVQT-ASPS--bIbVRiGV*51L4?SA w QTiWGQ--HQ--CAG{-STAQYYQG-Y*I-*-A*DM*IDIP{VWLY-AT-VTPI-Q mmVPWG‘-bR*IiKP-?P-GKAAS---T SMGPKKVGT-W?TAG-SWKTF. 
mGQDIGAEVA‘QKV‘Yi-G**S*APGQQA-PS**.WRQLdKL-K‘GSSNQRVFDWIE A .STQQIVSNT-VRA-WTAVCYSAIIFTTP-?VDVAV-KARAKL.QKYLCD‘QK L -Q -VVT-TQPPW--?Wbb3A-YD*DVVK L DAEYSWLSSKDPAEQQGKGVAAKS V1AthW-R*A***SDiW CDNA: 1 cggcggcgca ga:cgcccgg cgcggctccg ccccc:gcgc cggtcacgtg ggggcgccgg 61 ctgcgcctgc ggagaagcgg tggccgccga gcgggatctg tgcggggagc cggaaatggt 121 tgtggactac gcgg ctgcgtgggg ctcggccgcg cggactgaag gagactgaag 181 cacttctacc ctagccgggc ccagcccccg gcag cctcccgagt gcagagtgca 24; gcccctgccc gccctggccc agctgcccat gtctaccctg ctggatccca agtaatgatg ; atcccttccc agatctccta ctcc cagggggcct actacatccc tggacagggg 36; cgttccacat acgttgtccc gacacagcag taccctgtgc agccaggagc cccaggcttc 42; tatccaggtg caagccctac agaatttggg acctacgctg gcgcctacta ccaa 48; ggggtgcagc agtttcccac tggcgtggcc cccgccccag ttttgatgaa ccagccaccc 54; cagattgctc ccaagaggga gacg atccgaattc gagatccaaa ccaaggagga 60; aaggatatca cagaggagat tggg gcccgcactg cctccacacc tccc 66; cagacgggag tgga gcctcaagct aatggggaga cgccccaggt tgctgtcatt 72; ccag atgaccggtc acagggagca atcattgctg accggccagg gctgcctggc 78; ccagagcata gcccttcaga atcccagcct tcgtcgcctt ctccgacccc atcaccatcc 84; ttgg aaccggggtc tgagcctaat ctcgcagtcc tctctattcc tggggacact 90; atgacaacta tacaaatgtc tgtagaagaa tcaaccccca gtga aactggggag 96; ccatatcgcc tctctccaga acccactcct ctcgccgaac ccatactgga agtagaagtg L02; acacttagca ttcc agaatctgag ttttcttcca gtcctctcca ggctcccacc L08; cctttggcat ctcacacagt ggaaattcat gagcctaatg gcatggtccc atctgaagat L14; ctggaaccag aggtggagtc aagcccagag cttgctcctc ccccagcttg cccctccgaa L20; tcccctgtgc ccattgctcc aactgcccaa cctgaggaac tgctcaacgg agccccctcg L26; ccaccagctg tggacttaag cccagtcagt gagccagagg agcaggccaa ggaggtgaca L32; gcatcaatgg cgccccccac catcccctct ccag ctacggctcc tact L38; tccccagctc aggaggagga agaa gaagaagaag aagg agga L44; ggag aagctgagag agga ggagaggaac tgctcccccc agagagtacc L50; cctattccag ccaacttgtc tcagaatttg gaggcagcag cagccactca agtggcagta L56; tctgtgccaa agaggagacg gaaaattaag gagctaaata agaaggaggc tgttggagac L62; cttctggatg 
ccttcaagga ggcgaacccg gcagtaccag aggtggaaaa tcagcctcct L68; gcaggcagca atccaggccc agagtctgag ggcagtggtg tgcccccacg tcctgaggaa L74; gcagatgaga cctgggactc aaaggaagac aaaattcaca atgctgagaa catccagccc L80; ggggaacaga agtatgaata taagtcagat cagtggaagc ctctaaacct agaggagaaa L86; tacg accgtgagtt cctgcttggt tttcagttca tctttgccag tatgcagaag L92; ccagagggat tgccacatat cagtgacgtg gtgctggaca aggccaataa aacaccactg L98; cggccactgg atcccactag actacaaggc ataaattgtg gcccagactt cactccatcc 204; aacc ttggccggac aacccttagc acccgtgggc ccccaagggg tgggccaggt 210; ggggagctgc cccgtgggcc ggctggcctg ggaccccggc gctctcagca gggaccccga 216; aaagaaccac gcaagatcat tgccacagtg ttaatgaccg aagatataaa actgaacaaa 222; gcagagaaag cctggaaacc cagcagcaag ngangng ctgataagga ggaa 228; gaagatgctg atggcagcaa aacccaggac ctattccgca gggtgcgctc catcctgaat 234; aaactgacac cccagatgtt ccagcagctg atgaagcaag tgacgcagct ggccatcgac 240; accgaggaac gcctcaaagg tgac ctcatttttg agaaggccat ttcagagccc 246; aacttctctg tggcctatgc caacatgtgc cgctgcctca tggcgctgaa agtgcccact 252; aagc caacagtgac tgtgaacttc cgaaagctgt tgttgaatcg atgtcagaag 258; gaga aagacaaaga tgatgatgag gtttttgaga agaagcaaaa agagatggat 264; gaagctgcta cggcagagga acgaggacgc ctgaaggaag agctggaaga ggac 270; atagcccggc ggcgctcttt agggaatatc attg gagagttgtt caaactgaag 276; atgttaacag aggcaataat gcatgactgt gtggtcaaac tgcttaagaa ccatgatgaa 282; gagtcccttg agtgcctttg tcgtctgctc accaccattg gcaaagacct ggactttgaa 288; aaagccaagc tgga tcagtatttc aaccagatgg aaaaaatcat taaagaaaag 294; aagacgtcat cccgcatccg ctttatgctg caggacgtgc tggatctgcg agggagcaat 300; tgggtgccac gccgagggga tccc aagaccattg accagatcca taaggaggct 306; gagatggaag gaga gcacatcaaa gtgcagcagc tcatggccaa gggcagtgac 312; aagcgtcggg gcggtcctcc aggccctccc atcagccgtg gacttcccct tgtggatgat 318; ggtggctgga acacagttcc catcagcaaa ggtagccgcc ccattgacac ctcacgactc 324; accaagatca ccaagcctgg ctccatcgat tctaacaacc agctctttgc acctggaggg 330; cgactgagct ggggcaaggg cagcagcgga ggctcaggag ccaagccctc agacgcagca 336; tcagaagctg 
ctcgcccagc tactagtact ttgaatcgct tctcagccct agcg 342; acag aaagcacaga taatagacgt gtggtgcaga gctt gagccgagaa 348; cgaggcgaga aagctggaga agac cgcctagagc ggagtgaacg ggac 354; cgtggggacc ggcttgatcg gaca cctgctacca agcggagctt cagcaaggaa 360; gtggaggagc ggagtagaga ctcc cagcctgagg ggctgcgcaa tagc 366; ctcacggagg atcgggaccg tgggcgggat gccgtgaagc gagaagctgc cctaccccca 372; gtgagccccc tgaaggcggc tctctctgag gaggagttag agaagaaatc caaggctatc 378; attgaggaat atctccatct caatgacatg aaagaggcag tccagtgcgt gcaggagctg 384; gcctcaccct ccttgctctt catctttgta cggcatggtg tcgagtctac gctggagcgc 390; agtgccattg ctcgtgagca tatggggcag ctgctgcacc tctg tgctgggcat 396; actg ctcagtacta ccaagggttg tatgaaatct tggaattggc tgaggacatg 102; gaaattgaca acgt gtggctctac ctagcggaac tggtaacacc cattctgcag 108; gaaggtgggg tgcccatggg ggagctgttc agggagatta caaagcctct gagaccgttg 114; ggcaaagctg cttccctgtt gctggagatc ctgggcctcc aaag catgggtcct 120; aaaaaggtgg ggacgctgtg gcgagaagcc gggcttagct ggaaggaatt tctacctgaa 126; ggccaggaca ttggtgcatt cgtcgctgaa cagaaggtgg agtataccct gggagaggag Z32; tcggaagccc ctggccagag ggcactcccc gagc tgaacaggca gaag Z38; ctgctgaagg agggcagcag taaccagcgg gtgttcgact ggatagaggc caacctgagt Z44; gagcagcaga tagtatccaa cacgttagtt cgagccctca tgacggctgt ctgctattct 150; gcaattattt ttgagactcc agtg gacgttgcag tgctgaaagc gcgagcgaag 156; ctgctgcaga aatacctgtg tgacgagcag aaggagctac aggcgctcta cgccctccag Z62; gcccttgtag tgaccttaga acagcctccc ctgc ggatgttctt actg Z68; gagg acgtggtgaa ggaggatgcc ttctacagtt gggagagtag caaggacccc Z74; gctgagcagc agggcaaggg tgtggccctt aaatctgtca cagccttctt caagtggctc Z80; cgtgaagcag aggaggagtc tgaccacaac ctgg tggggccggg gacctggagc Z86; cccatggaca cacagatggc ccggctagcc gcctggactg caggggggcg gcagcagcgg Z92; cggtggcagt gggtgcctgt agtgtgatgt gtctgaacta ataaagtggc tgaagaggca Z98; ggatggcttg gggctgcctg ggcccccctc caggatgccg ccaggtgtcc ctctcctccc 504; cctggggcac agagatatat tatatataaa gtcttgaaat ttggtgtgtc ttggggtggg 510; gaggggcacc aacgcctgcc gtcc ttat tttctgaaaa tcactctcgg 516; 
gactgccgtc ctcgctgctg ggggcatatg ccccagcccc tgtaccaccc ttgc 5221 ctgggcaggg ggaagggggg gcacggtgcc tgtaattatt aaacatgaat tcaattaagc 5281 tcaaaaaaaa aaaaaaaaaa
7. ENO1: ENO1 enolase 1, (alpha) [ Homo sapiens ]
LOCUS NM_001428
AA /translation="MSILKIHAREIFDSRGNPTVEVDLFTSKGLFRAAVPSGASTGIY 4A.4LRDWDK"RYMGKGVSKAVEHINKTIAPALVSKK-NVi*Q*KIDKLWI*MDGi*W WAIAGVSLAVCKAGAVEKGVPLYRHIADLAGVSTVI-PVPAFNVIVGGSHAG WK-AMQTFMILPVGAAVFREAMQIGATVYHN-KNVIKTKYGK3A GEAPNIA *NK‘G.4.LK AIGKAGYTDKVVIGMDVAASEFFRSGKYDADFKSPDDPSRYISPDQA ADAYKSFIKDYPVVSI:DPFDQDDWGAWQKFTASAGIQVVGDDL"VTNPKRIAKAVNEA.

KSCVC---KVNQIGSVTﬂS-QACK-AQANGWGVMVSH?SG*i*D EIADAVVGLCTGQA.

IKTGAPCRSTRLAKYNQ.LQI**4-GSKAKFAGRNFRWP4AK
CDNA: 1 gtggggcccc agagcgacgc tgagtgcgtg cgggactcgg agtacgtgac ggagccccga 61 gctctcatgc ccgccacgcc ggcc cgga gccccggctc cgcacacccc 121 agttcggctc accggtccta tctggggcca cgcc cgcaccacta cagggccgct 181 ggggagtcgg ggccccccag atctgcccgc ctcaagtccg cgggacgtca cccccctttc 241 cacgctactg cagccgtcgc agtcccaccc ctttccggga ggtgagggaa tgagtgacgg 301 ctctcccgac gaatggcgag gcggagctga gggggcgtgc cccggaggcg ggaagtgggt 361 ggggctcgcc ttagctaggc aggaagtcgg cgcgggcggc gcggacagta tctgtgggta 421 cccggagcac ggagatctcg ccggctttac gttcacctcg gcag caccctccgc 481 ttcctctcct aggcgacgag acccagtggc tagaagttca ccatgtctat gatc 541 catgccaggg agatctttga ctctcgcggg aatcccactg ttgaggttga tctcttcacc 601 tcaaaaggtc tcttcagagc tgctgtgccc agtggtgctt gtat ctatgaggcc 661 ctagagctcc gggacaatga taagactcgc tatatgggga agggtgtctc aaaggctgtt 721 gagcacatca ctat tgcgcctgcc ctggttagca agaaactgaa cgtcacagaa 781 caagagaaga ttgacaaact gatgatcgag atggatggaa cagaaaataa atctaagttt 841 aacg ccattctggg ggtgtccctt gccgtctgca aagctggtgc cgttgagaag 901 ggggtccccc gcca catcgctgac ttggctggca actctgaagt catcctgcca 961 gtcccggcgt tcaatgtcat caatggcggt tctcatgctg gcaacaagct ggccatgcag 1021 gagttcatga tcctcccagt cggtgcagca aacttcaggg aagccatgcg cattggagca 1081 gaggtttacc acaacctgaa gaatgtcatc aaggagaaat atgggaaaga tgccaccaat 1141 gtgggggatg aaggcgggtt tgctcccaac atcctggaga aagg cctggagctg 1201 ctgaagactg ggaa agctggctac actgataagg tggtcatcgg catggacgta 1261 gcggcctccg agttcttcag gtctgggaag tatgacctgg acttcaagtc tcccgatgac 1321 cccagcaggt acatctcgcc tgaccagctg gctgacctgt acaagtcctt ggac 1381 tacccagtgg tcga agatcccttt gaccaggatg gagc ttggcagaag 1441 ttcacagcca gtgcaggaat ccaggtagtg ggggatgatc tcacagtgac caacccaaag 1501 aggatcgcca aggccgtgaa cgagaagtcc tgcaactgcc tcctgctcaa agtcaaccag 1561 attggctccg tgaccgagtc tcttcaggcg tgcaagctgg cccaggccaa tggttggggc 1621 gtcatggtgt ctcatcgttc gggggagact gaagatacct tcatcgctga cctggttgtg 1681 gggctgtgca ctgggcagat caagactggt tgcc gatctgagcg cttggccaag
1741 tacaaccagc tcctcagaat tgaagaggag agca aggctaagtt tgccggcagg 1801 aacttcagaa accccttggc caagtaagct aggc aagcccttcg gtcacctgtt 1861 ggctacacag acccctcccc tcgtgtcagc tcaggcagct cgaggccccc gaccaacact 1921 tgcaggggtc cctgctagtt agcgccccac cgccgtggag ccgc ttccttagaa 1981 cttctacaga agccaagctc cctggagccc tgttggcagc tctagctttg cagtcgtgta 2041 attggcccaa gtcattgttt ttctcgcctc actttccacc ctag agtcatgtga 2101 gcctcgtgtc atctccgggg tggccacagg ctagatcccc ggtggttttg aaat 2161 aaaaagcctc agtgacccat gagaataaaa aaaaaaaaaa aaaa
8. FBL: FBL fibrillarin [ Homo sapiens ]
LOCUS NM_001436
AA /translation="WKPGFSP QGGGFGGRGGFGDRGGRGGRGGFGGGRGRGGGF QGRG QGGGGGGGGGGGGGRGGGGFHSGGVRGRGRGGKRGNQSGKNVMV *PHRH *GVEICRGK TDA-VTKNLVPG *SVYG S‘G DDKI *YRAWNPERSKLAAAILGGVDQIHIKPG AKV-Y-GAASGTTVSHVSDIVGPDGLVYAVEFSHRSGRDLINLAKKRTNIIPVIEDA? {PHKYRMLIAMVDVI FADVAQPDQTRIVALWAHTFLRNGGHFVISIKANCIDSTASAE AVFASEVKKMQQ *NMKPQ 4QTJJ.4PY *RDHAVVVGVYRPPPKVKN
CDNA: 1 cgca gtcg ccgcgcgcct gcgctctttt ccacgtgcga aagccccgga 61 ctcgtggagt cgcc gcggactccg gagccgcaca aaccagggct gaag 121 ccaggattca gtccccgtgg gggtggcttt ggcggccgag ggggctttgg tgaccgtggt 181 ggtcgtggag gccgaggggg ctttggcggg ggccgaggtc gaggcggagg ctttagaggt 241 cgtggacgag gaggaggtgg aggcggcggc ggcggtggag gaggaggaag tgga 301 catt ctggtggcaa ccggggtcgt ggtcggggag gaaaaagagg aaaccagtcg 361 gggaagaatg tgatggtgga gccgcatcgg catgagggtg tcttcatttg tcgaggaaag 421 gaagatgcac tggtcaccaa gaacctggtc cctggggaat cagtttatgg agagaagaga 481 gtctcgattt cggaaggaga tgacaaaatt gagtaccgag cctggaaccc cttccgctcc 541 aagctagcag cagcaatcct gggtggtgtg gaccagatcc acatcaaacc gggggctaag 601 gttctctacc tcggggctgc ctcgggcacc acggtctccc atgtctctga catcgttggt 661 ccggatggtc tagtctatgc agtcgagttc tcccaccgct ctggccgtga cctcattaac 721 ttggccaaga agaggaccaa catcattcct gtgatcgagg atgctcgaca cccacacaaa 781 taccgcatgc tcatcgcaat ggtggatgtg atctttgctg atgtggccca gccagaccag 841 acccggattg tggccctgaa tgcccacacc cgta atggaggaca ctttgtgatt 901 tccattaagg ccaactgcat tgactccaca gcctcagccg aggccgtgtt tgcctccgaa
961 gtgaaaaaga tgcaacagga gaacatgaag ccgcaggagc agttgaccct tgagccatat 1021 gaaagagacc atgccgtggt cgtgggagtg tacaggccac cccccaaggt gaagaactga 1081 agttcagcgc tgtcaggatt gcgagagatg tgtgttgata ctgttgcacg tgtgtttttc 1141 aaga ctcatccgtc tcccaaaaaa
9. GSK3B: GSK3B glycogen synthase kinase 3 beta [ Homo sapiens ]
LOCUS NM_001146156 (isoform 2)
AA /translation="MSGRP RTTSFAIESCKPVQQPSAFGSMKVSRDKDGSKVTTVVATP GQGPDRPQEVSYTDTKVIGNGSFGVVYQAK .CDSGTLVAIKKVLQDKRFKVR ELQIMR KLDHCNIVRLRYEEYSSGdKKD4VYLN .VLDYVPTTVYRVARHYSQAKQ"JPVIYVKL QSAAYIHSFGIC {RDIKPQNL. .DPD"AVJKLCDFGSAKQLVRGEPWVSYIC SRYYRAPT .IFGATDYTSSIDVWSAGCV .AT .LGQPIFPGDSGV DQLVEIIKVLGTP 1RdQIRdWVPNYTEFKFPQIKA {PW1KVERPQ PP‘AIALCSRLL *YiP AR-iPLdA CA{SFFDT-RDPNVKLPNGRDTPALFNFTTQT-SSWPPLATILIPPHARIQAAASTPT NATAASDAVTGDRGQTNNAASASASNST
CDNA: 1 cgggcttgtg ccgccgccgc cgccgccgcc gcccgggcca aaag gaaggaagga 61 agcgaggagg agccggcccc gcagccgctg acagggctct gggctggggc aaagcgcgga 121 ctga gcgggcaccg agcagagccg aggggcggga gggcggccga gctgttgccg 181 cggacggggg cccc gagggacgga agcggttgcc gggttcccat ggcg 241 aatggggaac agtcgaggag ccgctgcctg gggtctgaag ggagctgcct ccgccaccgc 301 catggccgct ggatccagcc gccgcctgca gctgctcctg gcgcaatgag agcc 361 gccgccaccg ccaccgcccg cctctgactg actcgcgact ccgccgccct ctagttcgcc 421 gggcccctgc cgtcagcccg ccggatcccg cggcttgccg gagctgcagc gtttcccgtc 481 gcatctccga gccaccccct ccctccctct ccctccctcc tacccatccc cctttctctt 541 caagcgtgag gatc cttccgccgc ttcccttctt ctcg aaat 601 ccccgaggaa aatataatat tcgaagtact cattttcaat caagtatttg cccccgtttc 661 acgtgataca tattttttta ggatttgccc tctcttttct ctcctcccag gaaagggagg 721 ggaaagaatt gtattttttc ccaagtccta aatcatctat atgttaaata tccgtgccga 781 ttga aggagaaata tatcgcttgt tttgtttttt atagtataca aaaggagtga 841 aaagccaaga agtc tttttctttt tcttctgtgg gagaacttaa tgctgcattt 901 atcgttaacc ccca acataaagac aaaaggaaga aaaggaggaa ggaaggaaaa 961 ggtgattcgc gaagagagtg tcag ggcggcccag aaccacctcc tttgcggaga 1021 gctgcaagcc ggtgcagcag ccttcagctt ttggcagcat gaaagttagc aagg 1081 acggcagcaa ggtgacaaca
gcaa ctcctgggca gggtccagac aggccacaag 114; aagtcagcta tacagacact aaagtgattg gaaatggatc atttggtgtg gtatatcaag 120; ccaaactttg tgattcagga gaactggtcg ccatcaagaa agtattgcag gacaagagat 126; ttaagaatcg agagctccag atcatgagaa agctagatca ctgtaacata gtccgattgc 132; gttatttctt ctactccagt aaga aagatgaggt ctatcttaat ctggtgctgg 138; actatgttcc ggaaacagta tacagagttg ccagacacta tagtcgagcc aaacagacgc L44; tccctgtgat ttatgtcaag ttgtatatgt atcagctgtt ccgaagttta gcctatatcc L50; attcctttgg aatctgccat cgggatatta aaccgcagaa cctcttgttg gatcctgata L56; ctgctgtatt aaaactctgt gactttggaa gtgcaaagca gctggtccga ccca L62; atgtttcgta tatctgttct cggtactata gggcaccaga gttgatcttt ggagccactg L68; attatacctc tagtatagat gtatggtctg ctggctgtgt gttggctgag ctgttactag L74; gacaaccaat atttccaggg gatagtggtg tggatcagtt ggtagaaata atcaaggtcc L80; tgggaactcc aacaagggag caaatcagag aaatgaaccc aaactacaca gaatttaaat L86; tccctcaaat taaggcacat ccttggacta tccg accccgaact ccaccggagg L92; caattgcact gtgtagccgt ctgctggagt atacaccaac tgcccgacta acaccactgg L98; aagcttgtgc acattcattt tttgatgaat tacgggaccc aaatgtcaaa ctaccaaatg 204; ggcgagacac acctgcactc ttcaacttca ccactcaaga actgtcaagt aatccacctc 210; tggctaccat ccttattcct cctcatgctc ggattcaagc agctgcttca acccccacaa 216; cagc agcgtcagat gctaatactg gagaccgtgg acagaccaat aatgctgctt 222; ctgcatcagc ttccaactcc acctgaacag tcccgagcag ccagctgcac aggaaaaacc 228; accagttact tgagtgtcac acac tggtcacgtt tggaaagaat attaaaaaga 234; gaaaaaaatc ctgttcattt tagtgttcaa tttttttatt attattgttg ttcttattta 240; accttgtaaa ataa atacaaacca atttcattgt actt tgagggagat 246; ccagggggtg ggaggggttg tggggagggg ggag cactagaaca tacaatctct 252; ctcccacgac aatctttttt tattaaaagt ctgc:gttgt atactttaaa aacaggactc 258; ctgcctcatg ccccttccac agaa tttc tgtgctgatg ggtttttttg 264; gttt tcttttaaag tctagtgtga gact :tggta acag cttgaaattg 270; gttgggagct tagcaggtat aactcaacgg ggac:taaat gtcacttgta atcc 276; atatcttcgg atag acttgccttt ggca:gttgg tggcaggtgt caaa 282; gaaatgtgta tcattcgtaa cccagggagg tcaa:aaagt ttggaactct 
acagggaaga 288; ttcttagtag atttgttaag gttttgtttt gctc:cagtt agtgctagtg atgtagaggc 294; ttgtacagga ggctgccaga ggggaagcag caagcaagac tcaggcacac atgctctaca 300; ggtggctctt tgtttgcctg accaaagttc tttgcaaatc ttagcacagt ttcaaactag 306; tgacctggga ggagatggaa ggggtgttga gcaggctgag ctagctgctg aggtcaaagg 312; ctgatgagcc aagg ggacaggtca gggatacatc tcaccactgt gttt 318; attt aaag ttacttccct tggaaagata cacttgagag gacattgtag 324; ttaaataatg tgaactgtaa cagtcatcta ctggtttatt tttcatattt tgaa 330; aattgagctt gcagaaatag ccacattcta cacatagttc taattttaaa tcta 336; gaatctgtat ttaatttgtt ttttaacctc atgcttttta ttta ttgatgcatg 342; tcagatggta gaaatattaa aaactacaca tcagaatgat acagtcactt atacctgctg 348; actttatagg gatg atataaatgt gtgtatatat gttatatata catatattca 354; atactgcctt tttttttgtc tacagtatca aaattgactg gttgaagcat gagaagaatg 360; tttcccccac acccagttaa gagtttttgt gtctgttttc tttgtgtatc agtgaacgat 366; gttaagaatc tctt tttgaagaaa aagcaatatt aaag caaggagaat 372; tgaaggacta tgtttgccgt gaggaaatag attttcatga ctagtttgtt ttatactttt 378; aaggttggca tctatgtggg ccttatatac atga actttagtca ccttggtgct 384; tatgggccat acct atgaatcttt aaggcacaat cagttgtact ttacatttaa 390; agatcacttg ggcc gcctttccct cctacccgct ccttccccac atgccttcca 396; aggttagctg gtaactgtag ggctgcagag ctgagcccat ggttgtgtgt aacttgccct 102; cctc attgccacct taggtcactt tatgggtctc gtcctccaga gggttcggaa 108; gtggagtctg gccc tcctgcaggc cctagcaccc tgtcctgctc cttaactgtg 114; tgtgtgactc tccaagagag ttgtcctgcc tgctgaagtg aaccagtacc cagaaagaca 120; actgtgagcc atcttggttt tcactcgctg tttagctgag gtcttgggcc acaaaagggg 126; tttcacaaac ctctggatat atcagagttt atgagaaagg aaacatgctc agtcaaacca 132; aatcaaacaa attgaatttt atgttttata aagtgcttct gaaagctaag atttgaaaga 138; agtctgaaat caaagtattt ggcagcataa ctccttaaag gtagtggcgt tgatagacca Z44; ttttcagaca gaatttataa tgaa aaggcaggtc agag aaatggacct 150; gcattcagat ccaactgccc agcaagcgtt tggatgcaga cactgctctg gacgtggtat 156; actccccaga gtccataaaa gctt attttaggaa tgcc ccccacaact Z62; ggggtaaaag aagagagaaa agtcacgctt 
ttctctcatt tcattgtgtg tgca:gtgtg Z68; Cgtgtgtgtg tgtgtgtgtg tgtgctgaga tgtgtgattt ttctttctca agga:catgg Z74; tgggatcaca gaactctttt gtga gatccaggtc tctgaatatc tttt :gtata 180; taataataat aaaaagctcc tcaccaaatt caagcttgta cattatattt tctt :ctgtg 186; tttttaaatt taagttttat gtat gtaaatatgt ggacccagga actg:tatta Z92; atgagcaaaa agttactgtt cagggcagtg attctgttta ataatcagac aaaa:gtaga Z98; cgagcttttt aaagccatat aact ctgtacagta ggtaccggcc tgta:tattg 504; taacaataac tctagcaatg tatagtgtat ctatatagtt tggagtgcct tcgcttccat 510; gtgttttttt ttttaatttg ttctttttta aattttaatt ggtttccttt atccatgtct 516; ccctgtccac cccctttccc tttgaaataa taactcactc ataacagtat ctttgcccct 522; tccacagtta agtttcagtg ataccatact caggagtggg aagaggaaat cgta 528; atttcatttc gttgaagccc tgcctttgtt ttggttctga atgtctttcc tcctcggtag 534; gacc ggtttcattt catacttagt ccattcaggg acttagtgta gcaccaggga 540; gccctagagc tggaggatat gatt aaattttgct cgtctcttcc acaagcccta 546; accatgggtc ttaaaaacag cagattctgg gagccttcca tgctctctct ctctcctctt 552; ttatctactt ccctcccaaa tgagagagtg acagagaatt gtttttttat aaatcgaagt 558; ttcttaatag gttt tgatacgtca taaa atgctatagt gcaattacta 564; gcagttactg cacggagtgc caccgtgcca atagaggact gttgttttaa caagggaact 570; cttagcccat ttcctccctc ccgccatctc tacccttgct caatgaaata tcattttaat 576; taaa aaaaatcagt ttaattctta ctgtgtgccc aagg ccttttttga 582; aagaaaaata gaatgttttg cctcaaagta gtccatataa aatgtcttga atagaagaaa 588; ccaa accaaaggtt actatttttg aaacatcgtg tgttcattcc gcag 594; aagactgcac cttctttcca tgct gtgtcatttt ttttaagtcc tcttaatttt 600; tagacacatt tttggtttat gttttaacaa ccta accagtcatc ttgtctgcac 606; caatgcaaag gtttctgaga ggagtattct ctatccctgt ggatatgaag gcat 612; ttcatctatt tttccctttc ctttttaaag gatttaactt tggaatcttc caaaggaagt 618; ttggccaatg ccagatcccc aggaatttgg ggggttttct ttcttttcaa ttgt 624; atctgattcc tactgttcat gttagtgatc atctaatcac agagccaaac acttttctcc 630; cctgtgtgga aaagtaggta tgctttacaa taaaatctgt cttttctggt agaaacctga 636; gccactgaaa ataaaagaga caactagaag agag tcccagactg agatctacct 642; 
ttgagaggct ttgaaagtaa tccctggggt ttggattatt ttcacaaggg cgtt 648; ttattcaagt ttgttgctcc gttttgcacc tctgcaataa aatg acaaccagta 654; cataaggggt tagcttgaca aagtagactt ccttgtgtta atttttaagt ttttttttcc 660; ttaactatat ctgtctacag gcagatacag atagttgtat gaaaatctgc ttgcctgtaa 666; aatttgcatt tataaatgtg ttgccgatgg tggg cctgtacaca taccaattag 672; cgtgaccact tccatcttaa aaacaaacct aaaaaacaaa atttattata tatatatata 678; tatatatata aaggactgtg ggttgtatac aaactattgc aaacacttgt gcaaatctgt 684; ataa aggaaaagca aaatctgtat aacattatta ctacttgaat gcctctgtga 6901 ctgatttttt ttta aatataaact tgaa gctc tttt 6961 ttcc ccattccctt gtaaatacat tttgttctat tggt ttggaaatag 7021 ttaactggta ctgtaatttg cattaaataa aaagtaggtt agcctggaaa tgaaattaaa 7081 aaaaaaaaaa aaaaa LOCUS NM_002093 (isoform 1) AA/translation="MSGRPRTTSFAESCKPVQQPSAFGSMKVSRDKDGSKVTTVVATP PQEVSYTDTKVIGNGSFGVVYQAK-CDSGTLVAIKKVLQDKRFKWRTLQIM?LL KLDHCNIVRLRYEEYSSG‘KKD‘VYLN-VLDYVPA. VYRVARHYSRAKQTAPVIYVKA YMYQLFRSAAYIHSFGIC4RDIKPQNL.-DPDTAV-K-CDFGSAKQLV?GEPWVSYIC PT-IFGATDYTSSIDVWSAGCV.ATLLLGQPIFPGDSGVDQLVEIIKVLGTP iR‘QIR‘WWPNY LthPQIKAiPWiKDSSGiGHh SGVRVERPRLPP4AIA.CSRL.

LY FLA? *ACAHShbDLLRDPNVKJPWGRDTPAAFNFTTQELSSVPPLATILIP PHARIQAAASTP"WATAASDANTGDRGQTNWAASASASNST CDNA: cgggcttgtg ccgccgccgc cgccgccgcc gcccgggcca agtgacaaag gaaggaagga 6; agcgaggagg agccggcccc gcagccgctg acagggctct gggctggggc aaagcgcgga 12; cacttcctga gcgggcaccg agcagagccg aggggcggga gggcggccga gctgttgccg 18; cggacggggg agggggcccc gagggacgga agcggttgcc gggttcccat gtccccggcg 24; aatggggaac agtcgaggag ccgctgcctg gggtctgaag ggagctgcct ccgccaccgc ; catggccgct ggatccagcc gccgcctgca gctgctcctg gcgcaatgag gagaggagcc 36; gccgccaccg ccaccgcccg cctctgactg actcgcgact ccgccgccct ctagttcgcc 42; gggcccctgc cgtcagcccg ccggatcccg cggcttgccg gagctgcagc gtttcccgtc 48; gcatctccga gccaccccct ccctccctct ccctccctcc tacccatccc cctttctctt 54; caagcgtgag gatc cttccgccgc ttcccttctt cattgactcg gaaaaaaaat 60; ccccgaggaa aatataatat tcgaagtact cattttcaat caagtatttg cccccgtttc 66; acgtgataca tattttttta ggatttgccc tctcttttct ctcctcccag gaaagggagg 72; ggaaagaatt gtattttttc ccta aatcatctat atgttaaata tccgtgccga 78; tctgtcttga aggagaaata tatcgcttgt tttt atagtataca aaaggagtga 84; aaagccaaga ggacgaagtc tttttctttt tcttctgtgg gagaacttaa tgctgcattt 90; atcgttaacc taacacccca acataaagac aaga ggaa ggaaggaaaa 96; tcgc gaagagagtg atcatgtcag ggcggcccag aaccacctcc tttgcggaga 102; gctgcaagcc ggtgcagcag ccttcagctt ttggcagcat gaaagttagc agagacaagg 108; acggcagcaa ggtgacaaca gtggtggcaa ctcctgggca gggtccagac aggccacaag 114; gcta tacagacact aaagtgattg gaaatggatc atttggtgtg gtatatcaag 120; ccaaactttg tgattcagga gaactggtcg ccatcaagaa agtattgcag gacaagagat 126; ttaagaatcg agagctccag atcatgagaa agctagatca ctgtaacata gtccgattgc L32; gttatttctt ctactccagt ggtgagaaga aagatgaggt ctatcttaat ctggtgctgg L38; ttcc ggaaacagta tacagagttg ccagacacta tagtcgagcc aaacagacgc L44; tccctgtgat caag ttgtatatgt atcagctgtt ccgaagttta atcc L50; ttgg aatctgccat cgggatatta aaccgcagaa cctcttgttg gatcctgata L56; ctgctgtatt aaaactctgt ggaa gtgcaaagca gctggtccga ggagaaccca L62; cgta tatctgttct cggtactata gggcaccaga gttgatcttt ggagccactg L68; 
attatacctc tagtatagat tctg ctggctgtgt gttggctgag ctgttactag L74; gacaaccaat aggg gatagtggtg tggatcagtt ggtagaaata atcaaggtcc L80; tgggaactcc aacaagggag caaatcagag aaatgaaccc caca gaatttaaat L86; tccctcaaat taaggcacat ccttggacta aggattcgtc aggaacagga catttcacct L92; caggagtgcg ggtcttccga actc caccggaggc actg tgtagccgtc L98; tgctggagta tacaccaact gcccgactaa caccactgga tgca cattcatttt 204; ttgatgaatt acgggaccca aatgtcaaac taccaaatgg gcgagacaca cctgcactct 210; tcaacttcac cactcaagaa ctgtcaagta atccacctct ggctaccatc cttattcctc 216; ctcatgctcg gattcaagca gctgcttcaa cccccacaaa tgccacagca gcgtcagatg 222; ctaatactgg agaccgtgga cagaccaata atgctgcttc tgcatcagct tcca 228; cctgaacagt cccgagcagc cagctgcaca ggaaaaacca ccagttactt gagtgtcact 234; cagcaacact ggtcacgttt ggaaagaata ttaaaaagag aaaaaaatcc tgttcatttt 240; agtgttcaat ttttttatta ttattgttgt tcttatttaa ccttgtaaaa tatctataaa 246; tacaaaccaa tttcattgta ttctcacttt gagggagatc cagggggtgg gaggggttgt 252; ggggaggggg aaagcggagc actagaacat acaatctctc tcccacgaca atcttttttt 258; attaaaagtc tgc:gttgta tactttaaaa acaggactcc tgcctcatgc cccttccaca 264; gaaa acc:ttttct gtgctgatgg gtttttttga actttgtttt cttttaaagt 270; ctagtgtgag act:tggtat agtgcacagc ttgaaattgg ttgggagctt agcaggtata 276; actcaacggg gac:taaatg tcacttgtaa aattaatcca tatcttcggg tatttataga 282; cttgcctttg gca:gttggt ggcaggtgtg gcagacaaag gtat cattcgtaac 288; ccagggaggt caa:aaagtt tggaactcta cagggaagat tcttagtaga tttgttaagg 294; ttttgttttg ctc:cagtta gtgctagtga tgtagaggct tgtacaggag gctgccagag 300; gggaagcagc aagcaagact caggcacaca tgctctacag cttt gtttgcctga 306; ccaaagttct atct tagcacagtt tcaaactagt gacctgggag gagatggaag 312; gggtgttgag caggctgagc tagctgctga aggc tgatgagccc agaggaaggg 318; gacaggtcag ggatacatct caccactgtg aataagtttg tccagatttt tttctaaagt 324; tacttccctt ggaaagatac acttgagagg acattgtagt taaataatgt taac 330; agtcatctac tggtttattt ttcatatttt ttaattgaaa cttg tagc 336; cacattctac acatagttct aaat ccaaatctag aatctgtatt taatttgttt 342; tttaacctca tgctttttac atttatttat 
tgatgcatgt cagatggtag aaatattaaa 348; aactacacat cagaatgata ctta tacctgctga ctttatagga aagctgatga 354; tataaatgtg tgtatatatg ttatatatac atatattcaa tactgccttt ttttttgtct 360; acagtatcaa aattgactgg ttgaagcatg agaagaatgt ttcccccaca taag 366; agtttttgtg tctgttttct ttgtgtatca gtgaacgatg ttaagaatca gtctctcttt 372; ttgaagaaaa agcaatattc cttggaaagc aaggagaatt gaaggactat gtttgccgtg 378; aggaaataga ttttcatgac tagtttgttt tatactttta aggttggcat ctatgtgggc 384; cttatatact tgaa ctttagtcac cttggtgctt atgggccatt acttgaccta 390; tgaatcttta aggcacaatc agttgtactt tacatttaaa gatcacttga gtgatggccg 396; cctc ctacccgctc cttccccaca ccaa ggttagctgg taactgtagg 102; gctgcagagc tgagcccatg gttgtgtgta acttgccctc accctcctca ttgccacctt 108; aggtcacttt atgggtctcg agag ggttcggaag tggagtctgt tggcagccct 114; cctgcaggcc ctagcaccct gtcctgctcc ttaactgtgt gtgtgactct ccaagagagt 120; tgtcctgcct gtga accagtaccc agaaagacaa ctgtgagcca tcttggtttt 126; cactcgctgt ttagctgagg tcttgggcca caaaaggggt ttcacaaacc tata 132; tcagagttta tgagaaagga aacatgctca gtcaaaccaa atcaaacaaa ttgaatttta 138; tgttttataa agtgcttctg aaagctaaga tttgaaagaa gtctgaaatc aaagtatttg Z44; gcagcataac tccttaaagg tagtggcgtt gatagaccat tttcagacag aatttataaa 150; gaatctgaaa aggcaggtct gtgatagaga aatggacctg cattcagatc caactgccca 156; gcaagcgttt ggatgcagac actgctctgg acgtggtata ctccccagag tccataaaaa Z62; tcagtgctta ttttaggaaa caggttgccc actg gggtaaaaga agagagaaaa Z68; gtcacgcttt tctctcattt cattgtgtgt gcatgtgtgc gtgtgtgtgt gtgtgtgtgt Z74; agat gtgtgatttt tctttctcaa ggatcatggt acag aactctttta 180; tacaagtgag atccaggtct ctgaatatct atat aataataata aaaagctcct 186; caccaaattc aagcttgtac attatatttt ctttctgtgt ttttaaattt aagttttatt Z92; gttttgtatg taaatatgtg gacccaggaa ttaa tgagcaaaaa gttactgttc 498; agggcagtga ttctgtttaa taatcagaca aaatgtagac gagcttttta aagccatata 504; gttttaactc tgtacagtag gtaccggcct ttgt aacaataact ctagcaatgt 510; atagtgtatc tatatagttt ggagtgcctt cgcttccatg tgtttttttt tttaatttgt 516; tcttttttaa attg gtttccttta tccatgtctc cctgtccacc 
ccctttccct 522; ttgaaataat aactcactca taacagtatc tttgcccctt ccacagttaa gtttcagtga 528; taccatactc aggagtggga agaggaaatc atattcgtaa tttcatttcg ttgaagccct 534; gcctttgttt tggttctgaa tgtctttcct cctcggtagc agtgagaccg gtttcatttc 540; atacttagtc cattcaggga cttagtgtag ggag ccctagagct ggaggatatc 546; gaatagatta aattttgctc gtctcttcca caagccctaa ccatgggtct cagc 552; agattctggg agccttccat gctctctctc tctcctcttt tatctacttc cctcccaaat 558; gagagagtga cagagaattg tttttttata aatcgaagtt tcttaatagt atcaggtttt 564; gatacgtcag tggtctaaaa tgctatagtg ctag cagttactgc acggagtgcc 570; accgtgccaa tagaggactg ttgttttaac aagggaactc ttagcccatt tcctccctcc 576; cgccatctct acccttgctc aatgaaatat cattttaatt aaaa agtt 582; taattcttac tgtgtgccca acacgaaggc cttttttgaa agaaaaatag ttgc 588; ctcaaagtag tccatataaa atgtcttgaa tagaagaaaa aactaccaaa ccaaaggtta 594; ctatttttga aacatcgtgt tcca gcaaggcaga agactgcacc ttctttccag 600; tgacatgctg tgtcattttt tttaagtcct cttaattttt agacacattt ttggtttatg 606; ttttaacaat gtatgcctaa ccagtcatct tgtctgcacc aatgcaaagg tttctgagag 612; tctc tatccctgtg gatatgaaga cactggcatt tcatctattt ttccctttcc 618; tttttaaagg atttaacttt ggaatcttcc aaaggaagtt tggccaatgc cagatcccca 624; ggaatttggg gggttttctt tcttttcaac tgaaattgta tctgattcct actgttcatg 630; ttagtgatca tctaatcaca aaca cttttctccc ctgtgtggaa aagtaggtat 636; gctttacaat aaaatctgtc ttttctggta gaaacctgag ccactgaaaa taaaagagac 642; aagc gagt cccagactga gatctacctt tgagaggctt tgaaagtaat 648; ccctggggtt tggattattt tcacaagggt tatgccgttt tattcaagtt tgttgctccg 654; ttttgcacct ctgcaataaa agcaaaatga caaccagtac ataaggggtt agcttgacaa 660; agtagacttc cttgtgttaa tttttaagtt tttttttcct taactatatc tgtctacagg 666; caga tatg aaaatctgct tgcctgtaaa atttgcattt gtgt 672; tgccgatgga gggc ctgtacacat accaattagc gtgaccactt ccatcttaaa 6781 aacaaaccta aaaaacaaaa tttattatat atatatatat atatatataa aggactgtgg 6841 gttgtataca aactattgca aacacttgtg caaatctgtc ttgatataaa ggaaaagcaa 6901 aatctgtata acattattac tacttgaatg cctctgtgac tgattttttt ttcattttaa 6961 atataaactt 
ttttgtgaaa agtatgctca atgttttttt tccctttccc cttg 7021 taaatacatt ttgttctatg tgacttggtt tggaaatagt taactggtac tgtaatttgc 7081 attaaataaa aagtaggtta gcctggaaat gaaattaaaa aaaa aaaa . HDLBP: HDLBP high density lipoprotein binding protein [ Homo sapiens ] LOCUS NM_0012439OO rm b) AA /translation="M{-A*RDRW-EVA VWMHEVSIKSGFPGACVGVRSTMSSVAVLT HRSGLVPQQIKVA1-VS U] U "U "U YKDAEPP-P‘KAAC-*SAQ*PAGAWGN KIRPIKASVIiQVbiVPL L WVQEG‘G‘QAKIC.41MQR1GAiS-AKDQG GKLDAVMKARKDIVARLQTQASATVAIPKEHiRFVIGKVGTK.QDLdLK A1 KIQIPRPDDPSNQIK11G1K4GI‘KARH4V--ISATQDKQAV L AEHPEIAGP YNRLVG‘IMQ‘1G1RIVIPPPSVW? *IVbiGdeQLAQAVARIKKIY4*KAVSE VS SVAAPSWLHRFIIGKKGQNAAKITQQWPKV414b1*G4DKIi.4GP143VVVAQ4Q14 GMVKDAINRWDYVEIWIDHKFiRioIGKSGAVINRIKJQYKVSVRIPPDSTKSN.1RI EGDPQGVQQAKR‘--*-ASRW*V*R KDLII‘QREiRiIIGQKG L RI?*IRDKEPLVI INFPDPAQKSDIVQ-?GPKV*V*KC KYMQKWVAD-VTWSYSISVPIFKQF{KNIIGK GGANIKKIR L *SW KID-PA*WSVS*11111GKRAVCLAARSRI-SIQK3-AWIA*V* VSIPAK-HNS-IG"KG?-I?SIW**CGGVHI{EPVLGSGSD VVIRGPSSDVEKAKKQ LLHLA L *KQiKSb V31QAKPEYiKFLIGKGGGKIRKVRDS GARVIEPAAdlKDQD.
IiIIGK‘DAVR*AQK L .4A-IQV.3WVVTDSWLVDPK{{R{FVIRRGQV-?*IA L *YG GVMVSFPRSG"QSDKVTAKGAKDCVEAAKKRIQ411*D.4AQV1.4CAIPQKF{RSVW GPKGSRIQQI RDESVQIKEPD?‘*VAViSi4PVVQ‘VGD‘AG‘G?*AK3CDPGSPRR MWQEHO DIIIISGRK4KC4AAK4A.4A-VPV114V4VPED-{?YVIGQKGSGIRKWWD*E*VW{VPAPELQSDIIAITG-AAV-DQAKAG.L*RVK*-QA*Q*DRA-RSFK-SVTVDPKYPKIIGRKGAVITQI?-TH3VWIQFPDK3DGVQPQDQI111GY4KV14AARDAIARIV4L4QWVS‘DVPLDiRViARIIGARGKAIRKIMDEFKVDIRFPQSGAPDPVCVTVTGA*NV44AIDHI-N.444YLADVVDSEALQVYWKPPAH L *AKAPSRGFVVRDAPWTASSEKAPDWSSS‘*EPSEGAQVAPKTLPWGPKR CDNA: 1 ggagcg:ccc ggc:tctccc gcgcgggggg cgagtaagcc agcggcagga ccagcgggcg 61 cacg acaaaagctg gcaggctgac agaggcggcc tcaggacgga ccttctggct 121 actgaccgtt ttgctgctac cacttataac cacctggtta agtcgagatt tggaggtggt 181 ttagtttggg gcctggatgc accttgcaga gagagaccgc tggctttttg tagcaactgt 241 catgatgcat tttgtaagca ttaagagtgg ttttcccgga ttgtgtgtag gtgtgagatc ; aaccatgagt tccgttgcag ccca agagagtttt cacc gaagtgggct 361 ggttccgcaa caaatcaaag ttgccactct aaattcagaa gaggagagcg accctccaac 421 ctacaaggat gccttccctc cacttcctga tgct tgcctggaaa gtgcccagga 481 acccgctgga gcctggggga tccg acccatcaag gcttctgtca tcactcaggt 541 tgta cccctggagg agagaaaata caaggatatg aaccagtttg gagaaggtga 60; acaagcaaaa atctgccttg agatcatgca gagaactggt gctcacttgg agctgtcttt 66; ggccaaagac caaggcctct ccatcatggt aaag ctggatgctg tcatgaaagc 72; tcggaaggac attgttgcta gactgcagac tcaggcctca gcaactgttg ccaa 78; agaacaccat cgctttgtta ttggcaaaaa tggagagaaa ctgcaagact tggagctaaa 84; aactgcaacc aaaatccaga tcccacgccc agatgacccc agcaatcaga tcac 90; tggcaccaaa gagggcatcg agaaagctcg ccatgaagtc ttactcatct ctgccgagca 96; ggacaaacgt gctgtggaga ggctagaagt agaaaaggca ttccacccct tcatcgctgg L02; gccgtataat agactggttg gcgagatcat gcaggagaca ggcacgcgca tcaacatccc L08; cccacccagc gtgaaccgga cagagattgt cttcactgga gagaaggaac ctca L14; ggctgtggct cgcatcaaga agatttatga ggagaaggcc aatagcttca ccgtctcctc L20; tgtcgccgcc ccttcctggc gttt catcattggc aagaaagggc agaacctggc L26; caaaatcact cagcagatgc caaaggttca catcgagttc acagagggcg aagacaagat L32; 
caccctggag ggccctacag aggatgtcaa tgtggcccag gaacagatag aaggcatggt L38; caaagatttg attaaccgga tggactatgt ggagatcaac atcgaccaca agttccacag L44; gcacctcatt gggaagagcg gtgccaacat aaacagaatc aaagaccagt acaaggtgtc L50; cgtgcgcatc cctcctgaca gtgagaagag caatttgatc cgcatcgagg caca L56; gggcgtgcag caggccaagc tgct ggagcttgca tctcgcatgg aaaatgagcg L62; taccaaggat ctaatcattg agcaaagatt tcatcgcaca atcattgggc agaagggtga L68; acggatccgt gaaattcgtg acaaattccc agaggtcatc attaactttc cagacccagc L74; aagt gacattgtcc agctcagagg acctaagaat gaggtggaaa aatgcacaaa L80; gcag aagatggtgg cagatctggt tagc tattcaattt ctgttccgat L86; cttcaaacag tttcacaaga atatcattgg gaaaggaggc gcaaacatta aaaagattcg L92; aagc aacaccaaaa tcgaccttcc agcagagaat agcaattcag agaccattat L98; catcacaggc aagcgagcca actgcgaagc gagc aggattctgt ctattcagaa 204; agacctggcc aacatagccg aggtagaggt ctccatccct gccaagctgc acaactccct 210; cattggcacc aagggccgtc tgatccgctc ggag ggcg gggtccacat 216; tcactttccc ggtt gcga caccgttgtt atcaggggcc cttcctcgga 222; tgtggagaag gccaagaagc agctcctgca tctggcggag gagaagcaaa ccaagagttt 228; cactgttgac atccgcgcca agccagaata ccacaaattc ctcatcggca aggggggcgg 234; caaaattcgc aaggtgcgcg acagcactgg agcacgtgtc cctg 240; caaggaccag gacctgatca ccatcattgg aaaggaggac gccgtccgag aggcacagaa 246; ggagctggag gccttgatcc aaaacctgga taatgtggtg gaagactcca tgctggtgga 252; ccccaagcac caccgccact tcgtcatccg cagaggccag gtcttgcggg agattgctga 258; agagtatggc ggggtgatgg tcagcttccc acgctctggc acacagagcg tcac 264; gggc gccaaggact gtgtggaggc agccaagaaa cgcattcagg ttga 270; ggacctggaa gctcaggtga cattagaatg tgctataccc cagaaattcc atcgatctgt 276; catgggcccc tcca gaatccagca gattactcgg gatttcagtg ttcaaattaa 282; agac gaga acgcagttca cagtacagag ccagttgtcc aggagaatgg 288; ggacgaagct ggggagggga ctaa agattgtgac cccggctctc caaggaggtg 294; tgacatcatc atcatctctg gccggaaaga aaagtgtgag gctgccaagg aagctctgga 300; ggcattggtt cctgtcacca ttgaagtaga ggtgcccttt gaccttcacc gttacgttat 306; tgggcagaaa ggaagtggga tccgcaagat gatggatgag gtga acatacatgt 312; 
cccggcacct gagctgcagt ctgacatcat cgccatcacg ggcctcgctg caaatttgga 318; ccgggccaag gctggactgc tggagcgtgt gaaggagcta caggccgagc aggaggaccg 324; ggctttaagg agttttaagc tgagtgtcac tgtagacccc catc ccaagattat 330; cgggagaaag ggggcagtaa ttacccaaat ccggttggag catgacgtga acatccagtt 336; tcctgataag gacgatggga accagcccca ggaccaaatt accatcacag ggtacgaaaa 342; gaacacagaa gctgccaggg atgctatact gagaattgtg ggtgaacttg agcagatggt 348; ggac gtcccgctgg gcgt tcacgcccgc atcattggtg cccgcggcaa 354; agccattcgc aaaatcatgg acgaattcaa ggtggacatt ccac agagcggagc 360; cccc aactgcgtca ctgtgacggg gctcccagag aatgtggagg aagccatcga 366; ccacatcctc aatctggagg aggaatacct agctgacgtg gtggacagtg aggcgctgca 372; ggtatacatg aaacccccag cacacgaaga ggccaaggca ccttccagag gctttgtggt 378; gcgggacgca ccctggaccg ccagcagcag tgagaaggct cctgacatga gcagctctga 384; ggaatttccc agctttgggg ctcaggtggc tcccaagacc ctcccttggg gccccaaacg 390; ataatgatca aaaagaacag aaccctctcc agcctgctga cccaaaccca accacacaat 396; ggtttgtctc aatctgaccc tgga ccctccgtaa attgttgacg ctcttccccc 402; ttcccgaggt cccgcaggga gcctagcgcc tgtg tgcggccgct cctccaggcc 408; tggccgtgcc cgctcaggac ctgctccact gtttaacact aaaccaaggt catgagcatt 414; cgtgctaaga taacagactc cagctcctgg tccacccggc gtca gcactctggc 420; cttcatcacg agagctccgc agccgtggct aggattccac ttcctgtgtc atgacctcag 426; gaaataaacg actt tataaaagcc aaacgtttgc cctcttcctt tcccacctcc 132; ctcctgccag tttcccttgg cagt cctgtttgtg gagtgcaatc agcctcctcc 138; agctgccaga gcgcctcagc acaggtgtca gggtgcaagg aagacctggc aatggacagc Z44; aggaggcagg ttcctggagc tggggggtga cctgagaggc agagggtgac gggttctcag 150; gcagtcctga ttttacctgc Cgtggggtct gaaagcacca agggtccctg cctc 156; cactgccaga ccctcagcct tggt gagtggagcc tggaggcaag gtggtaggca Z62; ccatctgggt cccctgtggc cgtcacagtg tctgctgtga ttgagatgcg ttgg Z68; tagg gccttacgct tgtcctcagt gggggcagtt tgccttagat gacagctggg Z74; ctcttcttca caccacctgc agcccctccc tgcccctgcc ctagctgctg tgtgttcagt Z80; tgccttcttt ctacctcagc ngcgtggag tggtctctgt gcagttagtg ccaccccaca Z86; cacccgtctc ttgattgaga 
tgtttctggt ggttatgggt ttcccgtgga gctgggggtg Z92; ggcgccgtgt acctaagctg gaggctggcg ctctccctca gcacaggtgg gtcagtggcc Z98; agcaggccca tctggagtgg gagtgggcac cccg cccacaggcc atccggctgt 504; gcaggccagc ccctaggagc cggg tgactggcag ttttcacggt ctagggccga 510; gacgatggca tggggcctag agcatgaggt agagcagaat gcagaccacg ccgctggatg 516; ccgagagacc ctgctctccg agggaggcat ctgtgtcatg ctgtgagggc tgaggacggg 522; gccctagtct ctggttttct ggtcttaaca tccttatctg tgtccgccac ggaggtgact 528; gagctgctag cgagttgtcc tgtcccaggt acttgagttt tggaaaagct gactcacgcc 534; catccatctc acagcccttc cctggggaca gtcgcttccg ccttgacacc tcactctcag 540; ttgaataact tggt catcttcaga ctcgaattct gacc cagacggctt 546; agcccaagtc tagttgcagc tgcctcggca agtccccatt tgctcaggca gccctgaatg 552; ggcctgttta caggaatggt aaattgggat gaat atagcttcca gcttcatagg 558; tgac cacggcttag gaaacaggga aagaaagcaa ggcccttttc ctgcctttcc 564; cgggatctgt ctactccacc tccacggggg aggccagtgg ggaagggctg tcacctcttc 570; cccatctgca tgagttctgg aactctgtcc tgttggctgc ttgcttccag ctccccccaa 576; tctccatcgc agcgggttcc tcctgtcttt tctacagtgt acat cctgccccta 582; ccctctccca aaggtcaatt ttaattctca ccaagttttg tctg tatgtcgctt 588; gatgtcttag acgcgagccc tttcctaaac tgttcagcgc ttcc tttgggtggt 594; tgttgcaagg gtgatgacat gactgtcccc gtct ccctgaagcg tctgtgctgt 600; caggacagcc ctgggcagag cagg ggtgaggcgt gcgtgtgctt ttcctccttg 606; ttggatgtct tccatatcat ctgtttccat agctacaatc catcccttgg ccttaacttt 612; ggaatttgga gattatatgc gtgt aaaggctcat gaatatggat gacactggaa 6181 aaat tctaaaataa aacccgaaac cagatgtagc atgctgggac tcattttgaa 6241 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa a LOCJS WW_005336 (isoform a) AA /:ransla:ion="MSSVAVL1Q*ShA‘HRSGLVPQQIKVAi-VS***SDPP1YKDAE PP-P*KAAC-*SAQ*PAGAWGNKIRPIKASVIiQVhHVPL L *RKYKJMWQEG‘G‘QAK IC-‘IMQR1GAi-4.5-AKDQGLSIWVSGKADAVMKARKDIVARLQTQASATVAIPKE HHRFVIGKWGTK-QD.4LK AiKIQIPRPDDPSNQIKliGiK G) H *KA?{*V--ISAT QDKRAV4?.4V4KAbiPhIAGPYV?-VG*IWQ*1G1RINIPPPSVVR14IVE1G*K* C LAQAVARIKKIY44KKKK1 IAVLVKKSQiKYVIGPKGNSLQ‘I-*?1GVSVLIPPS 
DSIS‘iVI-RG‘P‘K-GQA- *VYAKANSFTVSSVAAPSWLHRFIIGKKGQNAAKITQ QMPKVHI‘Ei‘G L DKIiL GPi‘DVVVAQ‘Q1*GMVKDLINRWDYVEIVIDHKF{R{4L IGKSGAVIWRIK RIPPDSTKSN.1R1TGDPQGVQQAKR*--*-ASRW*W*R KDLIILQRE{R*Jl—U IIGQKG L RI?‘IRDKEPLVIINEPDPAQKSDIVQ RGPKV‘V‘KC "KYMQKWVAD-V'WSYSISVPIFKQF{KNIIGKGGANIKKIR L *SW TVSWS ETIIITGKQAVCIAARSRI-SIQK3-AVIA*V*VSIPAK-HNS-IG"KGR-IRSIW**L‘J CGGVHIihPV.GSGSD VVIRGPSSDVEKAKKQLLHLALL L *KQiKSb VDIQAKPEYiK FLIGKGGGKIRKVRDS GARVIEPAA43KDQD.1111GK4DAVR*AQK*.4A-IQW-D WLVJPK{{R{FVIRRGQV-?*IA L 4YGGVMVSFPRSG"QSDKVTAKGAKDCV EAAKKRIQ41143-4AQV1.4CAIPQKF{RSVWGPKGSRIQQI IKEPDR** VAViSi*PVVQ‘VGD‘AG‘G?*AK3CDPGSPRRCDIIIISGRK4KC‘AAK‘A.4A-VP V114V4VPED-{?YVIGQKGSGIRKWWD*E*VVIiVPAPELQSDIIAITG-AAV-DKA KAG.L4RVK*-QA*Q*DRA-?SFK-SVTV3PKYHPKIIGRKGAVITQIR-TH3VWIQF PDKJDGVQPQDQIiIiGY‘KWi‘AARDAI-?IVG4L4QWVS4DVPLD{RV{ARIIGAR GKAIRKIMDEFKVDIRFPQSGAPDPVCViViG-P*NV**AIDHI-N.444YLADVVDS EALQVYWKPPAHL *AKAPSRGFVVRDAPWTASSSEKAPDWSSSL *EPSEGAQVAPKTL PWGPKR CDNA: 1 atagaggctg ggggtggggg gggaggtcaa gcg:agcc:c ttctccttta tggc 61 ggcttgtccc tgtttcgcca cagttcctac c:tatgagct cggttttctt atgcttataa 121 gagtggaaca gctg gcaggctgac agaggcggcc tcaggacgga ccttctggct 181 actgaccgtt ttgctgtggt tttcccggat tagg tgtgagatca accatgagtt 241 ccgttgcagt tttgacccaa gagagttttg ctgaacaccg aagtgggctg gttccgcaac ; aagt tgccactcta aattcagaag aggagagcga ccctccaacc tacaaggatg 361 ccttccctcc acttcctgag aaagctgctt gcctggaaag tgcccaggaa cccgctggag 421 cctgggggaa caagatccga cccatcaagg cttctgtcat cactcaggtg ttccatgtac 481 agga gagaaaatac aaggatatga accagtttgg agaaggtgaa caagcaaaaa 541 tctgccttga gatcatgcag agaactggtg ctcacttgga gctgtctttg gccaaagacc 60; aaggcctctc catcatggtg tcaggaaagc tggatgctgt catgaaagct cggaaggaca 661 ttgttgctag actgcagact caggcctcag caactgttgc cattcccaaa catc 721 ttat tggcaaaaat ggagagaaac tgcaagactt ggagctaaaa acca 781 aaatccagat cccacgccca gatgacccca agat caagatcact ggcaccaaag 841 agggcatcga gaaagctcgc catgaagtct tactcatctc tgccgagcag gacaaacgtg 90; ctgtggagag gctagaagta gaaaaggcat tccacccctt 
catcgctggg ccgtataata 96; gactggttgg cgagatcatg caggagacag gcacgcgcat caacatcccc ccacccagcg L02; tgaaccggac agagattgtc ttcactggag agaaggaaca gttggctcag gctgtggctc L08; gcatcaagaa gatttatgag gagaagaaaa agaagactac aaccattgca gtggaagtga L14; agaaatccca acacaagtat gtcattgggc ccaagggcaa ttcattgcag gagatccttg L20; agagaactgg agtttccgtt gagatcccac cctcagacag catctctgag actgtaatac L26; ttcgaggcga aaag ttaggtcagg cgttgactga agtctatgcc aaggccaata L32; gcttcaccgt tgtc gccgcccctt cctggcttca ccgtttcatc attggcaaga L38; agaa cctggccaaa atcactcagc agatgccaaa ggttcacatc gagttcacag L44; agggcgaaga caagatcacc ctggagggcc ctacagagga tgtcaatgtg gcccaggaac L50; agatagaagg catggtcaaa gatttgatta accggatgga ctatgtggag atcaacatcg L56; accacaagtt ccacaggcac ggga agagcggtgc caacataaac agaatcaaag L62; acaa ggtgtccgtg cgcatccctc ctgacagtga gaagagcaat ttgatccgca L68; tcgaggggga gggc gtgcagcagg ccaagcgaga gctgctggag cttgcatctc L74; gcatggaaaa tgagcgtacc aaggatctaa tcattgagca aagatttcat cgcacaatca L80; ttgggcagaa gggtgaacgg atccgtgaaa ttcgtgacaa agag gtcatcatta L86; actttccaga cccagcacaa aaaagtgaca ttgtccagct cagaggacct aagaatgagg L92; tggaaaaatg cacaaaatac atgcagaaga tggtggcaga ggaa tatt L98; caatttctgt tccgatcttc aaacagtttc acaagaatat gaaa ggaggcgcaa 204; acattaaaaa gattcgtgaa gaaagcaaca ccaaaatcga ccttccagca gagaatagca 210; attcagagac catc acaggcaagc actg cgaagctgcc cggagcagga 216; ttctgtctat tcagaaagac ctggccaaca tagccgaggt agaggtctcc atccctgcca 222; agctgcacaa ctccctcatt ggcaccaagg gccgtctgat ccgctccatc atggaggagt 228; gcggcggggt ccacattcac tttcccgtgg cagg aagcgacacc gttgttatca 234; ggggcccttc ctcggatgtg gagaaggcca agaagcagct cctgcatctg gcggaggaga 240; agcaaaccaa gagtttcact gttgacatcc gcgccaagcc agaataccac aaattcctca 246; tcggcaaggg gggcggcaaa attcgcaagg tgcgcgacag agca cgtgtcatct 252; tccctgcggc tgaggacaag gacc tgatcaccat cattggaaag gaggacgccg 258; tccgagaggc acagaaggag ctggaggcct tgatccaaaa taat gtggtggaag 264; tgct ggtggacccc aagcaccacc gccacttcgt catccgcaga ggccaggtct 270; tgcgggagat 
tgctgaagag tatggcgggg tgatggtcag cttcccacgc tctggcacac 276; agagcgacaa cctc gcca aggactgtgt ggaggcagcc aagaaacgca 282; ttcaggagat cattgaggac ctggaagctc aggtgacatt agaatgtgct ataccccaga 288; aattccatcg atctgtcatg ggccccaaag gttccagaat ccagcagatt actcgggatt 294; tcagtgttca aattaaattc ccagacagag aggagaacgc agttcacagt acagagccag 300; ttgtccagga ggac gaagctgggg aggggagaga ggctaaagat tgtgaccccg 306; gctctccaag gaggtgtgac atcatcatca tctctggccg gaaagaaaag tgtgaggctg 312; ccaaggaagc tctggaggca ttggttcctg tcaccattga agtagaggtg ccctttgacc 318; ttcaccgtta cgttattggg cagaaaggaa gtgggatccg caagatgatg gatgagtttg 324; aggtgaacat acatgtcccg gcacctgagc ctga catcatcgcc atcacgggcc 330; caaa tttggaccgg gccaaggctg gactgctgga gcgtgtgaag gagctacagg 336; ccgagcagga ggaccgggct ttaaggagtt ttaagctgag tgtcactgta aaat 342; ccaa gattatcggg agaaaggggg cagtaattac ccaaatccgg ttggagcatg 348; acgtgaacat ccagtttcct gataaggacg atgggaacca gccccaggac caaattacca 354; ggta cgaaaagaac acagaagctg ccagggatgc tatactgaga attgtgggtg 360; aacttgagca gatggtttct gaggacgtcc acca ccgcgttcac gcccgcatca 366; ttggtgcccg agcc attcgcaaaa tcatggacga ggtg gacattcgct 372; tcccacagag cggagcccca gaccccaact gcgtcactgt gacggggctc ccagagaatg 378; tggaggaagc catcgaccac atcctcaatc tggaggagga atacctagct gacgtggtgg 384; aggc gctgcaggta tacatgaaac ccccagcaca cgaagaggcc aaggcacctt 390; ccagaggctt tgtggtgcgg gacgcaccct ggaccgccag cagcagtgag aaggctcctg 396; gcag ctctgaggaa agct ttggggctca ggtggctccc aagaccctcc 102; cttggggccc caaacgataa tgatcaaaaa gaacagaacc ctctccagcc tgctgaccca 108; acca cacaatggtt tgtctcaatc tgacccagcg gctggaccct ccgtaaattg 114; ttgacgctct tcccccttcc cgaggtcccg cagggagcct agcgcctggc tgtgtgtgcg 120; gccgctcctc caggcctggc cgtgcccgct caggacctgc tccactgttt aacactaaac 126; caaggtcatg cgtg ctaagataac agactccagc tcctggtcca atgt Z32; cagtcagcac tctggccttc atcacgagag ctccgcagcc gtggctagga ttccacttcc Z38; tgtgtcatga cctcaggaaa taaacgtcct tgactttata aaagccaaac gtttgccctc Z44; ttcctttccc acctccctcc tgccagtttc ccttggtcca 
gacagtcctg tttgtggagt 150; gcaatcagcc tcctccagct gccagagcgc ctcagcacag gtgtcagggt gcaaggaaga 156; cctggcaatg gacagcagga ggcaggttcc tggagctggg gggtgacctg agaggcagag Z62; ggtgacgggt tctcaggcag tcctgatttt acctgccgtg gggtctgaaa gcaccaaggg Z68; tccctgcccc tacctccact ccct cagcctgagg tctggtgagt ggagcctgga Z74; ggcaaggtgg taggcaccat ctgggtcccc tgtggccgtc acagtgtctg ctgtgattga 180; caca ggttggggga ggtagggcct tacgcttgtc ctcagtgggg gcagtttgcc 186; ttagatgaca gctgggctct tcttcacacc acctgcagcc cctccctgcc cctgccctag Z92; ctgctgtgtg ttcagttgcc ttctttctac ctcagccggc gtggagtggt ctctgtgcag Z98; ttagtgccac cacc cgtctcttga ttgagatgtt tctggtggtt atgggtttcc 504; Cgtggagctg ggggtgggcg acct aagctggagg ctggcgctct ccctcagcac 510; aggtgggtca gtggccagca ggcccatctg gagt gggcacttcc accccgccca 516; caggccatcc ggctgtgcag gccagcccct aggagcaggt cccgggtgac tggcagtttt 522; cacggtctag ggccgagacg atggcatggg gcctagagca tgaggtagag cagaatgcag 528; accacgccgc tggatgccga gagaccctgc tctccgaggg aggcatctgt gtcatgctgt 534; gagggctgag gangggCCC tagtctctgg ttttctggtc ttaacatcct tatctgtgtc 540; cgccacggag gtgactgagc tgctagcgag ttgtcctgtc ccaggtactt gagttttgga 546; gact cacgcccatc catctcacag cccttccctg gggacagtcg cttccgcctt 552; gacacctcac tctcagttga caag cttggtcatc ttcagactcg aattcttgag 558; tagacccaga cggcttagcc caagtctagt tgcagctgcc agtc cccatttgct 564; caggcagccc tgaatgggcc cagg aatggtaaat tgga aggaatatag 570; cttccagctt cataggctag cacg gcttaggaaa cagggaaaga aagcaaggcc 576; cttttcctgc ctttcccggg atctgtctac tccacctcca cgggggaggc cagtggggaa 582; gggctgtcac ctcttcccca tctgcatgag ttctggaact ctgtcctgtt ttgc 588; ttccagctcc ccccaatctc catcgcagcg ggttcctcct tcta cagtgtcata 594; aaacatcctg cccctaccct aagg tcaattttaa ttctcaccaa gttttgcaca 600; tctctgtatg tcgcttgatg tcttagacgc tttc ctaaactgtt cagcgctctc 606; ttttcctttg ggtggttgtt gcaagggtga tgacatgact gtccccaggc ctgtctccct 612; gaagcgtctg tgctgtcagg acagccctgg gcagagatga ggcaggggtg aggcgtgcgt 618; gtgcttttcc tccttgttgg atgtcttcca tatcatctgt ttccatagct acaatccatc 624; 
ccttggcctt aactttggaa tttggagatt atatgcaaac atgtgtaaag gctcatgaat 630; gaca ctggaatttt ataaattcta aacc cgaaaccaga tgtagcatgc 636; tgggactcat tttgaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 6421 aaaaaaa LOCUS NM_203346 ( isoform a, :ranscript variant 2) AA /:ransla:ion="MSSVAVLIQ *StA‘HRSGLVPQQIKVAI .WS 4 4 *S DPPIYKDAE AAC *SAQ 4 PAGAWGNKI RPIKASVIIQVEHVPL L *RKYK I DMWQEG 4G *QAK IC *IMQQIGAi. 4 S-AKDQGLSIWVSGKADAVMKARKDIVA RLQTQASATVAIPK 4 I . 4 HHQFVIGKVG K-QD4 .4LK AIKIQ DPSNQIKIIGIK L GI *KA R44V -ISA 4 QDK?AV*? *V‘KAE {PEIAGPYWR *IWQ‘IGIRINIPPPSVV XI *IVhIG‘K 4 C LAQAVAQIKKIY 4 4KKKKI IAVﬁ4VKKSQiKYVIGPKGNSLQ 4 I a iGVSVLIPPS DSIS‘iVI RGdeK .GQA *VYAKANSFTVSSVAAPSWLH RFI IGKKGQN 4AKITQ QMPKVHI4bi4G L 3K IIL L GPI 4 *QI4GMVKDLINRWDYV 44 IVI DHKF IGKSGAVIWQIK QY*Jl—U KVSVRIPP .IRITGDPQGVQQAKR * 4 I .ASQW KDLIILQ?E4R II GQKG L RI? DKhPLVIINbPDPAQKSDIVQ RGPKW "KYMQKWVAD-V'WSYSISV {KNIIGKGGANIKKIR L *SV KI 3 ETIIITGKQAVCIAAL‘J QSRI AVIA‘V‘VSIPAK-HNS-IG'"KG Q CGGVHIibPV.GSGSLL D VVI DVTKAKKQLLHLA L *KQIKSE V D I QAKPEY FLIGKGGGKIQKVQD S GA *DKDQD-IIIIGK‘DAV? *AQK 4 .IQW VVVEDSWLVJPKii? {FVI *IA L VSFPRSG"QS DKVT.4KGAK3CV EAAKKRIQ 4 II 4 DH AQVI {RSVWGPKGSRIQQI IKEP D? 
4 4 VAViSi‘PVVQ VG)4 *AG RCDII IISGRK4KC *AAK *A *A-VP ViI‘V‘VPbD RYV *VWIiVPAPELQS DIIAITG .AAW DKA KAG.L 4 RVKd-QAdQ *DRA RKGAVITQI R THDVWIQF PDKDDGWQPQDQIII iGY *DVPLD { QViARIIGAR GKAIRKIMDEFKVDI QFPQSGAPDPVCVIVIG DHI-N 4 4 *YLADVVDS EALQVYWKPPAHL *AKAPS RDAPWTASSS DWSSS L *bPSbGAQVAPKTL PWGPKR CDNA: l ggagcgtccc ggcttc 1CCC gcgcgggggg cgag :aagcc agcggcagga ggcg 6; ggggcccacg acaaaagctg gcaggctgac agaggcggcc tcaggacgga ccttctggct 12; actgaccgtt ttgctgtggt tttcccggat tgtgtgtagg tgtgagatca accatgagtt 18; ccgttgcagt tttgacccaa gagagttttg ctgaacaccg aagtgggctg gttccgcaac 24; aaatcaaagt tcta aattcagaag aggagagcga ccctccaacc tacaaggatg ; ccttccctcc tgag aaagctgctt gcctggaaag tgcccaggaa cccgctggag 36; cctgggggaa caagatccga aagg cttctgtcat ggtg ttccatgtac 42; ccctggagga gagaaaatac aaggatatga accagtttgg tgaa caagcaaaaa 48; tctgccttga gatcatgcag agaactggtg ctcacttgga gctgtctttg gccaaagacc 54; aaggcctctc catcatggtg tcaggaaagc tggatgctgt catgaaagct cggaaggaca 60; ttgttgctag actgcagact caggcctcag caactgttgc cattcccaaa gaacaccatc 66; ttat tggcaaaaat ggagagaaac tgcaagactt ggagctaaaa actgcaacca 72; aaatccagat ccca gatgacccca gcaatcagat caagatcact ggcaccaaag 78; agggcatcga gaaagctcgc catgaagtct tactcatctc tgccgagcag gacaaacgtg 84; ctgtggagag gctagaagta gaaaaggcat tccacccctt tggg ccgtataata 90; gactggttgg catg caggagacag gcat caacatcccc ccacccagcg 96; tgaaccggac tgtc ttcactggag agaaggaaca gttggctcag gctgtggctc L02; gcatcaagaa gatttatgag gagaagaaaa agaagactac aaccattgca gtga L08; agaaatccca acacaagtat gtcattgggc ccaagggcaa ttcattgcag gagatccttg L14; agagaactgg agtttccgtt gagatcccac cctcagacag catctctgag actgtaatac L20; ttcgaggcga acctgaaaag ttaggtcagg cgttgactga agtctatgcc aaggccaata L26; gcttcaccgt tgtc gccgcccctt cctggcttca ccgtttcatc attggcaaga L32; aagggcagaa cctggccaaa atcactcagc agatgccaaa catc gagttcacag L38; agggcgaaga caagatcacc ctggagggcc ctacagagga tgtcaatgtg gcccaggaac L44; aagg catggtcaaa gatttgatta accggatgga ctatgtggag atcaacatcg L50; accacaagtt ccacaggcac 
ctcattggga agagcggtgc caacataaac agaatcaaag L56; accagtacaa ggtgtccgtg cgcatccctc ctgacagtga gaagagcaat ttgatccgca L62; ggga cccacagggc gtgcagcagg ccaagcgaga gctgctggag cttgcatctc L68; gcatggaaaa tgagcgtacc aaggatctaa tcattgagca aagatttcat atca L74; ttgggcagaa gggtgaacgg atccgtgaaa ttcgtgacaa attcccagag gtcatcatta L80; actttccaga cccagcacaa aaaagtgaca ttgtccagct cagaggacct aagaatgagg L86; tggaaaaatg cacaaaatac atgcagaaga tggtggcaga tctggtggaa aatagctatt L92; caatttctgt tccgatcttc aaacagtttc acaagaatat cattgggaaa ggaggcgcaa L98; acattaaaaa gattcgtgaa gaaagcaaca ccaaaatcga ccttccagca gagaatagca 204; attcagagac cattatcatc acaggcaagc gagccaactg cgaagctgcc cggagcagga 210; ttctgtctat tcagaaagac ctggccaaca tagccgaggt agaggtctcc gcca 216; agctgcacaa ctccctcatt ggcaccaagg gccgtctgat ccgctccatc atggaggagt 222; gggt ccacattcac tttcccgtgg aaggttcagg aagcgacacc gttgttatca 228; ggggcccttc ctcggatgtg gagaaggcca agaagcagct cctgcatctg gcggaggaga 234; agcaaaccaa gagtttcact gttgacatcc gcgccaagcc ccac aaattcctca 240; tcggcaaggg gggcggcaaa aagg tgcgcgacag cactggagca cgtgtcatct 246; cggc tgaggacaag gacc tgatcaccat aaag gaggacgccg 252; tccgagaggc acagaaggag ctggaggcct tgatccaaaa cctggataat gtggtggaag 258; actccatgct ggtggacccc aagcaccacc gccacttcgt catccgcaga ggccaggtct 264; tgcgggagat tgctgaagag gggg tgatggtcag cttcccacgc tctggcacac 270; agagcgacaa cctc aagggcgcca aggactgtgt agcc aagaaacgca 276; ttcaggagat cattgaggac ctggaagctc aggtgacatt agaatgtgct ataccccaga 282; aattccatcg atctgtcatg ggccccaaag gttccagaat ccagcagatt actcgggatt 288; tcagtgttca aattaaattc ccagacagag aggagaacgc agttcacagt acagagccag 294; ttgtccagga gaatggggac gaagctgggg aggggagaga agat tgtgaccccg 300; gctctccaag gaggtgtgac atcatcatca gccg gaaagaaaag gctg 306; ccaaggaagc tctggaggca ttggttcctg tcaccattga agtagaggtg gacc 312; ttcaccgtta cgttattggg cagaaaggaa gtgggatccg caagatgatg gatgagtttg 318; aggtgaacat acatgtcccg gcacctgagc tgcagtctga cgcc atcacgggcc 324; tcgctgcaaa tttggaccgg gccaaggctg gactgctgga gcgtgtgaag 
gagctacagg 330; ccgagcagga ggaccgggct agtt ttaagctgag tgtcactgta gaccccaaat 336; ccaa gattatcggg agaaaggggg ttac ccaaatccgg ttggagcatg 342; acgtgaacat ccagtttcct gataaggacg atgggaacca gccccaggac caaattacca 348; tcacagggta cgaaaagaac acagaagctg ccagggatgc tatactgaga attgtgggtg 354; aacttgagca gatggtttct gaggacgtcc cgctggacca ccgcgttcac gcccgcatca 360; ttggtgcccg cggcaaagcc attcgcaaaa tcatggacga attcaaggtg gacattcgct 366; tcccacagag cggagcccca aact gcgtcactgt gacggggctc ccagagaatg 372; tggaggaagc catcgaccac atcctcaatc tggaggagga atacctagct gacgtggtgg 378; aggc gctgcaggta tacatgaaac ccccagcaca cgaagaggcc aaggcacctt 384; ccagaggctt tgtggtgcgg gacgcaccct ggaccgccag cagcagtgag aaggctcctg 390; acatgagcag ctctgaggaa tttcccagct ttggggctca ggtggctccc aagaccctcc 396; cttggggccc caaacgataa tgatcaaaaa gaacagaacc ctctccagcc tgctgaccca 102; aacccaacca cacaatggtt tgtctcaatc tgacccagcg gctggaccct ccgtaaattg 108; ttgacgctct tcccccttcc cgaggtcccg cagggagcct agcgcctggc tgtgtgtgcg 114; gccgctcctc caggcctggc cgtgcccgct caggacctgc gttt aacactaaac 120; caaggtcatg agcattcgtg ctaagataac agactccagc tcctggtcca cccggcatgt 126; cagtcagcac tctggccttc atcacgagag agcc gtggctagga ttccacttcc Z32; tgtgtcatga cctcaggaaa taaacgtcct tgactttata aaac gtttgccctc Z38; tccc ctcc tgccagtttc ccttggtcca gacagtcctg gagt Z44; gcaatcagcc tcctccagct gcgc ctcagcacag gtgtcagggt gcaaggaaga 150; cctggcaatg gacagcagga ggcaggttcc tggagctggg cctg agaggcagag 156; ggtgacgggt tctcaggcag tcctgatttt acctgccgtg gggtctgaaa gcaccaaggg Z62; tccctgcccc tacctccact gccagaccct cagcctgagg tctggtgagt ggagcctgga Z68; ggcaaggtgg taggcaccat ctgggtcccc cgtc acagtgtctg ctgtgattga gatgcgcaca ggttggggga ggtagggcct tacgcttgtc ctcagtgggg ttagatgaca gctgggctct cacc acctgcagcc cctccctgcc cctgccctag 186; tgtg ttcagttgcc ttctttctac ctcagccggc gtggagtggt ctctgtgcag Z92; ttagtgccac cccacacacc cgtctcttga ttgagatgtt tctggtggtt atgggtttcc Z98; cgtggagctg ggggtgggcg ccgtgtacct aagctggagg ctggcgctct ccctcagcac 504; aggtgggtca gtggccagca ggcccatctg 
gagtgggagt gggcacttcc accccgccca 510; caggccatcc ggctgtgcag ccct aggagcaggt tgac tggcagtttt 516; cacggtctag ggccgagacg atggcatggg gcctagagca tgaggtagag cagaatgcag 522; accacgccgc tggatgccga gagaccctgc tctccgaggg aggcatctgt gtcatgctgt 528; gagggctgag gangggCCC tagtctctgg ggtc ttaacatcct tgtc 534; cgccacggag gtgactgagc tgctagcgag ttgtcctgtc ccaggtactt gagttttgga 540; aaagctgact cacgcccatc catctcacag cccttccctg gggacagtcg cttccgcctt 546; gacacctcac tctcagttga ataactcaag cttggtcatc ttcagactcg aattcttgag 552; tagacccaga cggcttagcc caagtctagt tgcc tcggcaagtc tgct 558; caggcagccc tgaatgggcc tgtttacagg aatggtaaat tgggattgga aggaatatag 564; cttccagctt cataggctag ggtgaccacg gcttaggaaa cagggaaaga aagcaaggcc 570; cttttcctgc ctttcccggg atctgtctac tccacctcca cgggggaggc cagtggggaa 576; tcac ctcttcccca tctgcatgag ttctggaact ctgtcctgtt ggctgcttgc 582; ttccagctcc ccccaatctc catcgcagcg ggttcctcct gtcttttcta cagtgtcata 588; aaacatcctg cccctaccct ctcccaaagg tcaattttaa ttctcaccaa caca 594; tctctgtatg gatg tcttagacgc gagccctttc ctaaactgtt cagcgctctc 600; ttttcctttg ggtggttgtt gcaagggtga tgacatgact aggc ctgtctccct 606; gaagcgtctg cagg acagccctgg gcagagatga ggcaggggtg aggcgtgcgt 612; gtgcttttcc tccttgttgg atgtcttcca tatcatctgt agct acaatccatc 618; ccttggcctt aactttggaa tttggagatt atatgcaaac aaag gctcatgaat 624; atggatgaca tttt ataaattcta aaataaaacc cgaaaccaga tgtagcatgc 630; tgggactcat tttgaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 636; aaaaaaa 11. 
2BA: HISTlHZBA histone cluster 1, H2ba [ {omo s Aocus WM_170610 AA / :ransLation="MPEVSSKGATISKKGFKKAVVKTQKK EGKKRKRTRKESYSIYIY KVLKQVHPD'"GISSKAMSIWNSEVLDIE *RIAS *ASRLAHYSK QSTISSREIQTAVRL .LPGTLAKHAVSEGTKAVTKYTSSK CDNA: atgccggagg ctaa aggtgctacc atttccaaga agggctttaa gaaagc:gtc 6; gttaagaccc agaaaaagga aaag cgcaagagga cccgtaagga gagtta :tct 12; atct tgct aaagcaggtc catccggaca ctggcatctc ttcgaaagct 18; atgagcatta tgaattcctt cgtcactgat atctttgagc gtatagcgag cgaggcatca 24; cgtttggctc actacagcaa gcgctccacc atttcttcca gagagattca gacagcagtg ; cgcttgctac tgccgggaga gctggctaaa catgctgtgt ctgagggcac caaggctgtc 36; actaagtaca ccagctccaa gtaagcctgc taagtaaacg tcatttctaa cccaaaggct 42; cttttcagag ccactta 12. HMGBZ: {MGBZ high mobili :y group box 2 [ {omo sapiens ] LOCUS 130688 (isoform 2) AA /transla :ion="MGKGDPNKPRGKWSSYAbbVQiC RddiKKK {PDSSVWFAEFSKK CSLRWKLMSAK *KSKh *DMAKSDKA KY3?3WKWYVPPKGDKKGKKKDPVAPKRPPSAF HRPKIKSEHPGLSIGDiAKKLG *WWS *QSAKDKQPYTQKAAK .K‘KY‘KDIA AY KAKGKSEAGKKGPG QPTGSKKKN 4P4 34444444 4D 4D4444D 4D4 4 CDNA: aaaccagttc acgccggagc cccg:gaggg aagcg:ctcc gt:gggtccg gccgctctgc 6; gggactctga ggaaaagctc gcaccaggca agaataccct ccaataccct nggtggacg 12; cggatctgtc aacatgggta aaggagaccc caacaagccg cggggcaaaa tgtcctcgta 18; cgccttcttc acct gccgggaaga gcacaagaag aaacacccgg actcttccgt 24; caatttcgcg gaattctcca agaagtgttc ggagagatgg aagaccatgt ctgcaaagga ; gaagtcgaag tttgaagata aaag agct cgctatgaca gggagatgaa 36; aaattacgtt cctcccaaag gtgataagaa gaaa ccca atgctcctaa 42; aaggccacca tctgccttct tcctgttttg ctctgaacat cgcccaaaga tcaaaagtga 48; acaccctggc ctatccattg gggatactgc aaagaaattg ggtgaaatgt ggtctgagca 54; gtcagccaaa gataaacaac catatgaaca gaaagcagct aagctaaagg atga 60; aaaggatatt gctgcatatc gtgccaaggg tgaa gcaggaaaga agggccctgg 66; caggccaaca ggctcaaaga agaagaacga agat gaggaggagg aggaggaaga 72; agaagatgaa gatgaggagg aagaggatga agatgaagaa taaatggcta tcctttaatg 78; atgcgtgtgg aatgtgtgtg tgtgctcagg caattatttt gctaagaatg tgaattcaag 84; tgcagctcaa tactagcttc agtataaaaa 
ctgtacagat ttttgtatag ctgataagat 90L tctctgtaga gaaaatactt ttaaaaaatg caggttgtag gatg tcat 96L acagttagat tttacagctt ctgatgttga atgttcctaa atatttaatg gtttttttaa L02; tttcttgtgt atggtagcac agcaaacttg taggaattag tatcaatagt aaattttggg L08; ttttttagga attt cgttttttta aaaaaaattt tgtaataaaa ttatgtatat L14; tatttctatt gtctttgtct taatatgcta agttaatttt cactttaaaa aagccatttg L20; aagaccagag ctatgttgat ttttttcggt gcct agtagttctt agacacagtt L26; gacctagtaa aatgtttgag aattaaaacc aaacatgctc atatttgcaa aatgttcttt L32; aaaagttaca tgttgaactc cttt attt atgcagtttt acagaacgtt L38; aagttttgta cttgacgttt ctgtttatta ttgt tcctcaggtg tgtgtatata L44; cata tatatatata tatatat LOCUS NW_001130689 (isoform 3) AA la:ion="MGKGDPNKPRGKWSSYAbbVQiCRd4iKKKiPDSSVVFAEFSKK CSLRWKLMSAKdKSKb4DMAKSDKARYDQ3WKVYVPPKGDKKGKKKDPVAPKRPPSAF FLFCSEHRPKIKSEHPGLSIGDiAKKLGdWW5*QSAKDKQPYTQKAAK.KdKYdKDIA AYQAKGKSEAGKKGPGQPTGSKKKN4P4DddddddddDdDddddDdDdd CDNA: l gga:ttgggc gggaagcgga gccccgccag cgcccgccct ggcagctgcg ggctccgcgc 6L cgaccctccg gcttcccctc tcccccctcg gccccgtcag gtggacgcgg a:ctgtcaac l2; atgggtaaag gagaccccaa gcgg ggcaaaatgt cctcgtacgc c:tcttcgtg 18L cagacctgcc gggaagagca caagaagaaa cacccggact cttccgtcaa t:tcgcggaa 24L ttctccaaga cgga gaag accatgtctg caaaggagaa g:cgaagttt 30L gaagatatgg gtga caaagctcgc tatgacaggg agatgaaaaa t:acgttcct 36L cccaaaggtg ataagaaggg gaagaaaaag gaccccaatg ctcctaaaag gccaccatct 42L gccttcttcc tgttttgctc tgaacatcgc ccaaagatca aaagtgaaca ccctggccta 48L tccattgggg atactgcaaa gaaattgggt gaaatgtggt ctgagcagtc agccaaagat 54L aaacaaccat atgaacagaa agcagctaag ctaaaggaga aatatgaaaa ggatattgct 60L gcatatcgtg ccaagggcaa aagtgaagca ggaaagaagg gccctggcag gccaacaggc 66L tcaaagaaga agaacgaacc agaagatgag gaggaggagg aggaagaaga agatgaagat 72L gaggaggaag aggatgaaga tgaagaataa atcc tttaatgatg cgtgtggaat 78L gtgt gctcaggcaa ttattttgct aagaatgtga attcaagtgc agctcaatac 84L tagcttcagt ataaaaactg tacagatttt tgtatagctg ataagattct ctgtagagaa 90L aatactttta aaaaatgcag gttgtagctt tttgatgggc taca gttagatttt 96L 
acagcttctg atgttgaatg ttcctaaata tttaatggtt tttttaattt cttgtgtatg L02; gtagcacagc aaacttgtag gaattagtat caatagtaaa ttttgggttt atgt L08; tgcatttcgt ttttttaaaa aaaattttgt aataaaatta tgtatattat ttctattgtc L14; tttgtcttaa tatgctaagt tcac tttaaaaaag ccatttgaag accagagcta L20; tgttgatttt tttcggtatt tctgcctagt agttcttaga cacagttgac ctagtaaaat L26; gtttgagaat taaaaccaaa catgctcata tttgcaaaat gttctttaaa agttacatgt L32; cagt gaactttata agaatttatg taca gaacgttaag ttttgtactt L38; gacgtttctg tttattagct aaattgttcc tcaggtgtgt gtatatatat atacatatat L44; atatatatat atat LOCUS NW_002129 rm 1) AA /transla:ion="MGKGDPNKPRGKWSSYAbbVQiC?‘diKKK {PDSSVWFAEFSKK CSLRWKLMSAKdKSKbdDMAKSDKARYDQEWKWYVPPKGDKKGKKKDPWAPKRPPSAF FLFCSEHRPKIKSEHPGLSIGDiAKKLG*WW5*QSAKDKQPY .KdKYdKDIA AYQAKGKSEAGKKGPGQPTGSKKKN*PdD********D*D*** 4D44 CDNA: l gggga:gtgg cccgtggcct agc:cg:caa gt:gccg:gg gaac tctgcaaaac 6; aagaggctga ggattgcgtt agaga :aaac cagttcacgc cggagccccg aagc 12; gtctccgttg ggtccggccg ggga ctctgaggaa aagctcgcac caggtggacg 18; cggatctgtc aacatgggta aaggagaccc caacaagccg cggggcaaaa tgtcctcgta 24; cgccttcttc gtgcagacct gccgggaaga gcacaagaag aaacacccgg actcttccgt ; caatttcgcg gaattctcca agaagtgttc ggagagatgg aagaccatgt ctgcaaagga 36; gaag tttgaagata tggcaaaaag tgacaaagct cgctatgaca gggagatgaa 42; aaattacgtt cctcccaaag gtgataagaa ggggaagaaa ccca atgctcctaa 48; aaggccacca tctgccttct tcctgttttg ctctgaacat cgcccaaaga tcaaaagtga 54; acaccctggc ctatccattg gggatactgc aaagaaattg ggtgaaatgt ggtctgagca 60; caaa gataaacaac catatgaaca gaaagcagct aagg agaaatatga 66; aaaggatatt tatc gtgccaaggg caaaagtgaa gcaggaaaga agggccctgg 72; caggccaaca ggctcaaaga acga accagaagat gaggaggagg aggaggaaga 78; agaagatgaa gatgaggagg aagaggatga agatgaagaa taaatggcta tcctttaatg 84; atgcgtgtgg aatgtgtgtg cagg caattatttt gctaagaatg tgaattcaag 90; tgcagctcaa tactagcttc agtataaaaa ctgtacagat ttttgtatag ctgataagat 96; tctctgtaga gaaaatactt ttaaaaaatg caggttgtag gatg ggctactcat 102; acagttagat tttacagctt ctgatgttga atgttcctaa atatttaatg 
gtttttttaa 108; tttcttgtgt atggtagcac agcaaacttg taggaattag tatcaatagt aaattttggg L14; ttttttagga tgttgcattt ttta aaaaaaattt tgtaataaaa ttatgtatat L20; tatt gtctttgtct taatatgcta agttaatttt cactttaaaa aagccatttg L26; aagaccagag ctatgttgat cggt atttctgcct agtagttctt agacacagtt L32; gtaa aa:gtttgag aattaaaacc aaacatgctc gcaa cttt L38; aaaagttaca :tgaac agtgaacttt ataagaattt atgcagtttt acagaacgtt L44; aagttttgta ct:gacgt ctgtttatta gctaaattgt tcctcaggtg tgtgtatata L50; tatatacata ta:atata tatatat 13. HNRNPK: HNRVPK he :erogeneous nuclear ribonucleopro :ein K {omo s ] AOCUS NM_002140 (iso form a variant 1) AA/translation="M *i‘QP L *ibPN *iNG *bGKRPAdDM L *‘QAEK RSRNLD‘MV JRILLQSKWAGAVIGKGGKVIKA.JRTDYWASVSVPDSSGPﬂRIL S ISADI *iIG‘I-K KIIP i-‘dG-Q-PSP ALSQLP. *SDAV* C .NYQHYKGSDFDC'.‘J QLLIHQSLAGGII GVKGAKIK* .R‘NiQ IK-bQ‘ CCPHS D avv JIGGKPD QVVTC IKIIL DLISTSPI KGRAQPY DPVFYDETYDYGGFTWWFDD RRG RPVGFPMRGRGGFD RWPPGRGGRPWPPS RRDYDDWSPRRGPPPPPPGQGG QGGSRA QV-P .PPPPPPRGGDJWAYDRRGRPG DRYD GMVGFSADETWDSAI D WSPS‘ WQMAY *PQGGSGY DYSYAGGRGSYGDLGGPIITTQV TIPKDAAGSIIGKGGQRIKQI?{ESGASIKID “PL *GS *DRIIiIiGiQDQIQNAQYL LQNSVKQYADVEGF CDNA: ccctagccgc cccc agctag:gag tgcgcgaacg agaaaggagg agggcgctcc 6; aggcgacagc actgcagacg ccatta:cct ctgtttctct gctgcaccga cctcgacgtc 12; ttgcctgtgt cccacttgtt cgcggcctat aggctactgc agcactgggg tgtcagt:gt 18; tggtccgacc cagaacgctt cagttctgct ctgcaaggat atataataac tgattgg:gt 24; gcccgtttaa taaaagaata tggaaactga acagccagaa gaaaccttcc c:ga ; aaccaatggt gaatttggta aacgccctgc agaagatatg gaagaggaac aagcatt:aa 36; aagatctaga aacactgatg agatggttga attacgcatt ctgcttcaga gcaagaa:gc 42; tggggcagtg aaag gaggcaagaa tattaaggct ctccgtacag actacaa:gc 48; cagtgtttca gtcccagaca gcagtggccc cgagcgcata ttgagtatca gtgctga:at 54; tgaaacaatt ggagaaattc tgaagaaaat catccctacc ttggaagagg gcctgcagtt 60; gccatcaccc actgcaacca tccc gctcgaatct gatgctgtgg aatgcttaaa 66; ttaccaacac tataaaggaa gtgactttga ctgcgagttg aggctgttga ttcatcagag 72; tctagcagga ggaattattg gggtcaaagg tgctaaaatc 
aaagaacttc gagagaacac 78; tcaaaccacc cttt aatg ctgtcctcat tccactgaca gagttgttct 84; tattggagga aaacccgata taga gtgcataaag atcatccttg tatc 90; tgagtctccc atcaaaggac gtgcacagcc ttatgatccc aatttttacg atgaaaccta 96; tgattatggt ggttttacaa tgatgtttga tgaccgtcgc ccag tgggatttcc L02; catgcgggga agaggtggtt ttgacagaat tggt ngggtgggc gtcccatgcc L08; tccatctaga agagattatg atgatatgag ccctcgtcga ggaccacctc ctcc L14; cggacgaggc ggccggggtg gtagcagagc tcggaatctt cctcttcctc caccaccacc L20; acctagaggg ggagacctca tggcctatga cagaagaggg agacctggag accgttacga L26; cggcatggtt ggtttcagtg ctgatgaaac ttgggactct gcaatagata catggagccc L32; atcagaatgg cagatggctt atgaaccaca gggtggctcc ggatatgatt attcctatgc L38; agggggtcgt ggctcatatg gtgatcttgg tggacctatt attactacac aagtaactat L44; tcccaaagat ttggctggat ctattattgg caaaggtggt cagcggatta aacaaatccg L50; tcatgagtcg ggagcttcga tcaaaattga tgagccttta gaaggatccg aagatcggat L56; cattaccatt acaggaacac aggaccagat acagaatgca cagtatttgc tgcagaacag L62; tgtgaagcag tatgcagatg ttgaaggatt caag atattttttc ttttttatag L68; tgtgaagcag tattctggaa agtttttcta agactagtga agaactgaag gagtcctgca L74; tctttttttt tttatctgct tctgtttaaa aagccaacat tcctctgctt cataggtgtt L80; ctgcatttga ggtgtagtga tgct gttcaccaga tgtaatgttt tagttcctta L86; caaacagggt tggggggggg aagggcgtgc aaaaactaac attgaaattt tgaaacagca L92; gcagagtgag tggattttat ttttcgttat tgttggtggt ttaaaaaatt ccccccatgt L98; tgtg ttgc gtca ctgtaacatt tggggggtgg gacagggagg 204; aaaagtaaca atagtccaca tgtccctggc atctgttcag agcagtgtgc agaatgtaat 210; gctcttttgt cgtt ttatgatttt taaaataaat ttagtgaacc tatttttggt 216; ggtcattttt tttttaagac agtcatttta aaatggtggc tgaatttccc cccc 222; caaactaaac actaagttta attttcagct cctctgttgg acatataagt gcatctcttg 228; ttggacatag gcaaaataac ttggcaaact tagttctggt gatttcttga tggtttggaa 234; gtctattgct gggaagaaat tccatcatac atattcatgc ttataataag attt 240; tttgtttgtt aatg ccta cttttcaaca attttctatg ttagttgtga 246; agaactaagg tggggagcag tactacaagt atgg tatgagtata taccagaatt 252; ctgattggca gcaagtttta 
ttaatcagaa taacacttgg ttatggaagt tgct 258; gaaaaaattg ttta ttagataatt tctcacctat agacttaaac tgtcaatttg 264; ctctagtgtc gtta aactttgtaa aatatatata tacttgtttt tccattgtat 270; gcaaattgaa agaaaaagat gtaccatttc tctgttgtat gttggattat gtaggaaatg 276; tttgtgtaca attcaaaaaa aaaaaagatg aaaaaagttc ctgtggatgt tttgtgtagt 2821 gcat ttgtattgat agt :aaaatt cacttccaaa taaataaaac acccatgatg 2881 ctagatttga tgtgtgcccg att :gaacaa gggttgattg acacctgtaa aatttg:tga 2941 cctc ttaaaaggaa ata:agtaat cttatg:aaa aaaaaaaaaa aaaaa 1OCUS NM 031262 (iso form 3 variant 3) AA/translation="M *1‘QP L *1EPN *bGKRPAdDM L *‘QAEK RSRNiD‘MV JRILLQSKWAGAVIGKGGKVIKA.JRTDYVASVSVPDSSGPﬂRIL S ISADI -K KIIP 1.44G-Q-PSP AisQLP. *SDAV* C .NYQHYKGSDFDC'.‘J QLLIHQSLAGGII GVKGAKIKd-RdNiQ IK-bQ‘ CCPHS D RVV D QVVTC IKIIL DLISTSPI KGRAQPYDPWFYDETYDYGGFTWWFDD 3G RPVGFPMRGRGGFD RWPPGRGGRPWPPS RRDYDDWSPRRGPPPPPPGQGG QV-P .PPPPPPRGGDJWAYDRRGRPG DRYD GMVGFSADETWDSAI D WSPS‘ *PQGGSGY DYSYAGGRGSYGDLGGPIITTQV TIPKD1AGSIIGKGGQRIKQI ESGASIKID “PL *GS *DRIIiIiGiQDQIQNAQYL LQNSVKQYSGKFF CDNA: ccctagccgc ccctcccccc agctag:gag tgcgcgaacg agaaaggagg agggcgctcc 61 aggcgacagc actgcagacg ccatta:cct ctct gctgcaccga cctcgacgtc 121 ttgcctgtgt cccacttgtt ctat aggctactgc agcactgggg tgtcagt:gt 181 tggtccgacc cagaacgctt cagttctgct ggat atataataac tgattgg:gt 241 gcccgtttaa taaaagaata tggaaactga acagccagaa gaaaccttcc ctaacac:ga 301 tggt ggta aacgccctgc agaagatatg gaagaggaac aagcatt:aa 361 aagatctaga aacactgatg agatggttga attacgcatt caga gcaagaa:gc 421 tggggcagtg attggaaaag gaggcaagaa tattaaggct ctccgtacag actacaa:gc 481 cagtgtttca gtcccagaca gcagtggccc cgagcgcata ttgagtatca gtgctga:at 541 tgaaacaatt ggagaaattc tgaagaaaat catccctacc gagg gcctgcagtt 601 gccatcaccc actgcaacca gccagctccc gctcgaatct gatgctgtgg aatgcttaaa 661 ttaccaacac tataaaggaa gtgactttga ctgcgagttg aggctgttga ttcatcagag 721 tctagcagga ggaattattg gggtcaaagg tgctaaaatc aaagaacttc gagagaacac 781 cacc atcaagcttt tccaggaatg ctgtcctcat tccactgaca gagttgttct 841 tattggagga aaacccgata 
gggttgtaga gtgcataaag cttg atcttatatc 901 tgagtctccc atcaaaggac gtgcacagcc ttatgatccc aatttttacg ccta 961 tgattatggt ggttttacaa tgatgtttga tgaccgtcgc ggacgcccag tgggatttcc 1021 catgcgggga agaggtggtt ttgacagaat gcctcctggt ngggtgggc gtcccatgcc 1081 tccatctaga agagattatg atgatatgag ccctcgtcga ggaccacctc cccctcctcc 1141 cggacgaggc ggccggggtg gagc tcggaatctt cctc caccaccacc 1201 acctagaggg ggagacctca tggcctatga cagaagaggg agacctggag accgttacga 1261 cggcatggtt ggtttcagtg ctgatgaaac ttgggactct gcaatagata catggagccc L32; atcagaatgg cagatggctt atgaaccaca gggtggctcc ggatatgatt attcctatgc L38; agggggtcgt ggctcatatg gtgatcttgg tggacctatt attactacac aagtaactat L44; tcccaaagat ggat ctattattgg caaaggtggt cagcggatta aacaaatccg L50; tcatgagtcg ggagcttcga tcaaaattga tgagccttta gaaggatccg aagatcggat L56; cattaccatt acaggaacac aggaccagat acagaatgca cagtatttgc tgcagaacag L62; tgtgaagcag tattctggaa agtttttcta agactagtga agaactgaag gagtcctgca L68; tctttttttt tttatctgct tctgtttaaa aagccaacat gctt cataggtgtt L74; ctgcatttga ggtgtagtga aatctttgct gttcaccaga tgtaatgttt tagttcctta L80; caaacagggt tggggggggg aagggcgtgc aaaaactaac attgaaattt tgaaacagca L86; gcagagtgag tggattttat ttttcgttat tgttggtggt ttaaaaaatt ccccccatgt L92; aattattgtg aacaccttgc tttgtggtca ctgtaacatt tggggggtgg gacagggagg L98; aaaagtaaca atagtccaca tgtccctggc atctgttcag gtgc agaatgtaat 204; gctcttttgt cgtt ttatgatttt taaaataaat ttagtgaacc tatttttggt 210; tttt tttttaagac agtcatttta aaatggtggc tgaatttccc aacccacccc 216; aaac ttta attttcagct cctctgttgg aagt gcatctcttg 222; ttggacatag gcaaaataac ttggcaaact tagttctggt gatttcttga tggtttggaa 228; gtctattgct gggaagaaat tccatcatac atattcatgc ttataataag ctggggattt 234; tttgtttgtt tttgcaaatg cttgccccta cttttcaaca attttctatg ttagttgtga 240; agaactaagg tggggagcag aagt tgagtaatgg tatgagtata taccagaatt 246; ctgattggca gcaagtttta ttaatcagaa taacacttgg ttatggaagt gactaatgct 252; gaaaaaattg attattttta ttagataatt tctcacctat agacttaaac tgtcaatttg 258; ctctagtgtc ttattagtta aactttgtaa aatatatata 
tacttgtttt tccattgtat 264; tgaa agat gtaccatttc tctgttgtat gttggattat gtaggaaatg 270; taca attcaaaaaa aaaaaagatg aaaaaagttc ctgtggatgt tttgtgtagt 276; atcttggcat ttgtattgat agttaaaatt cacttccaaa taaataaaac acccatgatg 282; ctagatttga tgtgtgcccg acaa gggttgattg gtaa aatttgttga 288; aacgttcctc ttaaaaggaa atatagtaat cttatgtaaa aaaaaaaaaa aaaaa LocuS NM_031263 (iso form a variant 2) AA / transla:ion="M*i*QPL 4 bPWi‘iNG *bGKRPA *DW***QAEKRSRW1D* MV LRILLQSKVAGAVIGKGGKNIKAA ?"DYVASVSVPDSSGPﬂL RI-SISADI‘iIGdILK KIIP 1L L *G-Q-PSP AiSQ.P. *SDAV *C-NYQ {YKGSDF DC'-?.LIHQSLAGGII‘J GVKGAKIK L .R‘NiQ iIKLbQLCCPHSiD RVVLIGGKPD QVVTCIKIILDLISTSPI KGRAQPYDPNFYDETYDYGGFTMWFDD RRGRPVGFPMRGRGGFDRMPPGRGGRPMPPS RRDYDDMSPRRGPPPPPPGRGGRGGSRARNLPAPPPPPPRGGDLMAYDRRGRPGDRYD GMVGFSADETWDSAIDLWSPS dPQGGSGYDYSYAGGRGSYGDLGGPIITTQV AGSIIGKGGQRIKQI RHESGASIKID‘PLdGS *DRIIiIiGiQDQIQNAQYL LQNSVKQYADVEGF CDNA: tagggcgcga cggcggggag gacgcgagaa ggcgggggag gggagcctgc gctcgttttc 6; tgtctagctc gctg aggcggcgcg gcagcggagg gacggcagtc tcgcgcggct 12; gcac tggggtgtca gt:gttggtc cgacccagaa cgcttcagtt ctgctctgca 18; aggatatata gatt gg:gtgcccg tttaataaaa gaatatggaa actgaacagc 24; cagaagaaac cttccctaac ac:gaaacca atggtgaatt tggtaaacgc cctgcagaag ; atatggaaga agca tt:aaaagat ctagaaacac tgatgagatg gttgaattac 36; gcattctgct tcagagcaag aa:gctgggg cagtgattgg aggc atta 42; aggctctccg tacagactac aa:gccagtg tttcagtccc agacagcagt ggccccgagc 48; gcatattgag tatcagtgct gaaa caattggaga aattctgaag aaaatcatcc 54; ctaccttgga agagggcctg cagttgccat cacccactgc aaccagccag ctcccgctcg 60; aatctgatgc tgtggaatgc ttaaattacc aacactataa aggaagtgac tttgactgcg 66; agttgaggct gttgattcat cagagtctag caggaggaat tattggggtc aaaggtgcta 72; aaatcaaaga acttcgagag aacactcaaa ccaccatcaa gcttttccag gaatgctgtc 78; ctcattccac tgacagagtt gttcttattg gaggaaaacc cgatagggtt gtagagtgca 84; taaagatcat ccttgatctt atatctgagt tcaa tgca cagccttatg 90; atcccaattt ttacgatgaa gatt atggtggttt tacaatgatg tttgatgacc 96; gtcgcggacg cccagtggga tttcccatgc ggggaagagg 
tggttttgac agaatgcctc L02; ctggtcgggg tgggcgtccc atgcctccat gaga ttatgatgat atgagccctc L08; gtcgaggacc acctccccct cctcccggac gaggcggccg gggtggtagc agagctcgga L14; atcttcctct tcctccacca ccaccaccta gagggggaga cctcatggcc tatgacagaa L20; gagggagacc tggagaccgt tacgacggca tggttggttt cagtgctgat gaaacttggg L26; actctgcaat agatacatgg agcccatcag aatggcagat ggcttatgaa ggtg L32; gctccggata tgattattcc tatgcagggg gctc tgat cttggtggac L38; ctattattac tacacaagta actattccca aagatttggc tggatctatt attggcaaag L44; gtggtcagcg gattaaacaa atccgtcatg agtcgggagc ttcgatcaaa attgatgagc L50; ctttagaagg atccgaagat cggatcatta ccattacagg aacacaggac cagatacaga L56; atgcacagta tttgctgcag aacagtgtga agcagtatgc agatgttgaa ggattctaat L62; gcaagatatt ttttcttttt tatagtgtga agcagtattc tggaaagttt gact 168; agtgaagaac tgaaggagtc ctgcatcttt ttttttttat ctgcttctgt ttaaaaagcc 174; aacattcctc tgcttcatag tgca tttgaggtgt agtgaaatct ttgctgttca 180; ccagatgtaa tgttttagtt ccttacaaac gggg gggggaaggg cgtgcaaaaa 186; ctaacattga aattttgaaa cagcagcaga gtgagtggat tttatttttc gttattgttg 192; taaa aaattccccc catgtaatta ttgtgaacac cttgctttgt ggtcactgta 198; acatttgggg ggtgggacag ggaggaaaag taacaatagt ccacatgtcc ctggcatctg 204; ttcagagcag tgtgcagaat gtaatgctct tttgtaagaa acgttttatg atttttaaaa 210; taaatttagt gaacctattt ttggtggtca tttttttttt aagacagtca ttttaaaatg 216; gtggctgaat ttcccaaccc acccccaaac taaacactaa gtttaatttt cagctcctct 222; gttggacata taagtgcatc tcttgttgga cataggcaaa ataacttggc aaacttagtt 228; ctggtgattt cttgatggtt tggaagtcta ttgctgggaa ccat catacatatt 234; catgcttata ataagctggg gattttttgt ttgtttttgc aaatgcttgc ccctactttt 240; caacaatttt ctatgttagt tgtgaagaac taaggtgggg acta caagttgagt 246; aatggtatga gtatatacca gaattctgat tggcagcaag ttttattaat cagaataaca 252; cttggttatg gaagtgacta atgctgaaaa aattgattat taga taatttctca 258; gact taaactgtca atttgctcta gtgtcttatt agttaaactt tgtaaaatat 264; atatatactt gtttttccat tgtatgcaaa ttgaaagaaa aagatgtacc atttctctgt 270; tgtatgttgg attatgtagg aaatgtttgt gtacaattca aaaaaaaaaa 
agatgaaaaa 276; tgtg gatgttttgt gtagtatctt ggcatttgta ttgatagtta aaattcactt 282; ccaaataaat aaaacaccca tgatgctaga tttgatgtgt gcccgatttg aacaagggtt 288; gattgacacc tgtaaaattt gttgaaacgt tcctcttaaa aggaaatata gtaatcttat 294; gtaaaaaaaa aaaaaaaaaa l4. like [ Homo sap iens ] AOCUS WM_OOL207000 (isoform B) nslation= "MTVPPR .SHVPPPLFPSAPATLASRSLS {WRPRPPRQ .AP.LPS AAPSSARQGARQAQR {V"AQQPSRLAGGAAIKGGRRRRP DLFRRiFKSSSIQ QSAAAA AATR"ARQ {PPADSSVLW4DMN4YSNIA. *bA‘GSKINASKNQQD DGKMFIGGASWDTS KKDL *Y-SREG *VV DC IKiDPViGRS RGEGEVLEKDAASVDKV. 4LK *HK-DGK-I DPKRAKA .KGKTPPKKVFVGGLSPDLS A. *QIK *YEGAEG *I 4N14 .PMDLKLVLRRGE CinYiD‘ *PVKKLLTS QYHQIGSGKCEIKVAQPKEVYRQQQQQQKGGRGAAAGGRGG GQQSTYGKASRGGGNHQNNYQPY CDNA: 1 gaggccgcgc cggg C :tcggccga tcagcccggg aggccccgcc gcgccccctt 61 cgcg cccgtggtca aaga ggcgcccgcg ctgcgctgcc cggaggagcc 12; gtcgcgcgcc cgcttcctgt tcggctggtt cctgccagct caaa acacgcgtgc 18; gcgcggcggg cgagcgcgct cgccgcctca gtcgccagcg ccgggcgcag tccgcctttt 24; tccggagcag actggccgcg gtgctagtcg gtagcagcgg ccgccgcagc ggctccgcac ; tggcgaaccg agggcagaaa aaggcggggt tgacggcttt ttggtaggag tgggctggac 36; cggacgccag aggc tcccaaggca agagggactg tggccctgcg tcggctctgc 42; tcgggactgc tgaccccagg cgcc ccttcgtttt ctga ttcttctctt 48; ctcccaagcc ccct cacgcgtggc ctctctcctt aggg ccgcgatgga 54; ggtcccgccc aggctttccc cgcc gccattgttc ccctccgctc ccgctacttt 60; agcctcccgc agcctctccc attggcggcc gngCCgCCg cggcagctag ccccgctcct 66; cccttcgctc gctcccagct ccgcccggca gggggcgcgc cgggcccagc tcac 72; cgcccagcag ccctcccgat tggcgggcgg ggcggctata aagggagggc gcaggcggcg 78; cccggatctc cgcc attttaaatc cagctccata caacgctccg ccgccgctgc 84; gacc cggactgcgc gccagcaccc ccctgccgac agctccgtca ctatggagga 90; tatgaacgag tacagcaata tagaggaatt cgcagaggga tccaagatca acgcgagcaa 96; gaatcagcag gatgacggta aaatgtttat cttg agctgggata caagcaaaaa L02; agatctgaca gagtacttgt ctcgatttgg ggaagttgta gactgcacaa ttaaaacaga L08; tccagtcact gggagatcaa gaggatttgg atttgtgctt ttcaaagatg ctgctagtgt L14; tgataaggtt ttggaactga aagaacacaa actggatggc 
atag atcccaaaag L20; ggccaaagct ttaaaaggga aagaacctcc ggtt ggtg gattgagccc L26; ggatacttct gaagaacaaa ttaaagaata ttttggagcc tttggagaga ttgaaaatat L32; tgaacttccc atggatacaa aaacaaatga aagaagagga ttta atac L38; tgatgaagag ccagtaaaaa aattgttaga aagcagatac catcaaattg gttctgggaa L44; gtgtgaaatc aaagttgcac aacccaaaga ggtatatagg cagcaacagc aacaacaaaa L50; aggtggaaga ggtgctgcag ctggtggacg aggtggtacg aggggtcgtg gccgaggcca L56; acagagcact tatggcaagg catctcgagg gggtggcaat caccaaaaca attaccagcc L62; atactaaagg agaacattgg agaaaacagg aggagatgtt aaagtaaccc atcttgcagg L68; acgacattga agattggtct tctgttgatc taagatgatt attttgtaaa agactttcta L74; gtgtacaaga caccattgtg tccaactgta tatagctgcc aattagtttt ctttgttttt L80; actttgtcct ttgctatctg tgttatgact ggat ttgtttatac acattttatt L86; tgtatcattt catgttaaac taaa tgcttcctta tgtgattgct tttctgcgtc L92; aggtactaca tagctctgta aaaaatgtaa tttaaaataa gcaataatta aggcacagtt 198; gattttgtag agtattggtc catacagaga aactgtggtc ctttataaat agccagccag 204; cgtcaccctc ttctccaatt tgtaggtgta ttttatgctc ttaaggcttc atcttctccc 210; tgtaactgag atttctacca cacctttgaa caatgttctt tcccttctgg gaag 216; actgtcctga aaggaagaca taagtgttgt gattagtaga agctttctag tagaccatat 222; ttcttctgga taaa attgttagta gctcctttta ctttgttcct gtctctggaa 228; agccattttt gaattgctga ttactttggc tttaatcagt ctag aaaaagcttt 234; gtaatcataa cacaatgagt aattcttgat aaaagttcag atacaaaagg agcactgtaa 240; aactggtagg agctatggtt catt ggaagtagtt caag gattttggta 246; gaaaggtatg agtttggtcg aaaaattaaa atagtggcaa aataagattt agttgtgttt 252; tctcagagcc gccacaagat tgaacaaaat gttttctgtt tgggcatcct gttg 258; tattagctgt taatgctctg tgagtttaga cttg atagtaaatc tagtttttga 264; cacagtgcat agta gttaaatatt tacatattca gaaaggaata gtggaaaagg 270; tatcttggtt atgacaaagt cattacaaat gtgactaagt cattacaaat gagt 276; cattacagtg gaccctctgg gtgcattgaa aagaatccgt tcca ggtttcagag 282; gacctggaat aataaaaagc tttt gcattcagtg tagttggatt ttgggacctt 288; ggcctcagtg ttatttactg ggattggcat acgtgttcac aggcagagta gttgatctca 294; cacaacgggt gatctcacaa 
aactggtaag tttcttatgc tcatgagccc tccctttttt 300; tttttaattt tgca actttcttaa caatgattct acttcctggg ctatcacatt 306; ataatgctct tggcctcttt tttgctgctg ttttgctatt cttaaactta ggccaagtac 312; caatgttggc tgttagaagg gattctgttc catg tagg gaatggaagt 318; aagttcattt tgtg ttgtcagtag gtgcggtgtc tagggtagtg aatcctgtaa 324; gttcaaattt atgattaggt gacgagttga cattgagatt gtccttttcc ctgatcaaaa 330; aatgaataaa ttaa acaaaatcca taat caagtcttga tatgtatgac 336; tgagaaaaaa tacactacat ctagagatga ttgagatgtt ttgcaaagaa ttgaaggggg 342; agtgagaatt ggtttttctt gcaggggctt tgaactctag atttgggctt tgaactctag 348; atttaattca gatttcaggg tctatcagtt caccaactga tgcaaatttg aacagatact 354; ctaaggctaa gtgtcctagg ttggatgaac tgaagctact atcaagatct cgttcccaag 360; gattaattta gaacaaagta attggacaag tttattgggg aggggataga aatgaattct 366; aaagtaccta taacaaatac tctgtgtatg ttttttacat cgtatttgcc ttttacattg 372; tttagaccaa attctgtgtg tcct cgggaagagg atagaaatta attctaaagt 3781 acctataaca aatgctcggt ttgtatatgt ttttatacat tgcc ttttgcattg 3841 ttcagaccaa attctgtgtg atgttatcct aacaaaacac aatt tctttggtta 3901 acatgttaat ctgtaatctc acttttataa gatgaggact attaaaatga gatgtctgtt 3961 gggatgctaa aaaa aaaaaa Aocus VM_O3L372 (isoform a) AA /transla:ion="MEVPPRLSHVPPPLFPSAPATLASRSLS {WRPQPPQQLAPLLPS JAPSSARQGARQAQR iV"AQQPSRLAGGAAIKGGRRRRPDLFRRiFKSSSIQQSAAAA AATRHARQJPPADSSViM‘DMN‘YSNIA. *bA‘GSKINASKNQQD DGKMFIGGASWDTS KKDL *Y-SREG‘VV DC IKiDPViGRS RGEGEVLEKDAASVDKV.4LK*HK.3GKLI DPKRAKA .KGKTPPKKVFVGGLSPDLS A. 
EGAEG41 4N14 .PMDiKiWLRRGb ChIiYiD* *PVKKLLTSRYHQIGSGKCEIKVAQPKEVYRQQQQQQKGGRGAAAGGRGG TRGRGRGQGQNWVQGFNNYYDQGYGNYVSAYGGDQWYSGYGGYDYTGYNYGNYGYGQG YADYSGQQSTYGKAS QGGGNHQNNYQPY CDNA: gaggccgcgc ggggcccggg c:tcggccga cggg aggccccgcc gcgccccctt 6; ggcccgcgcg cccgtggtca cagtggaaga ggCgCCCgCg tgcc cggaggagcc 12; gtcgcgcgcc cgcttcctgt tcggctggtt cctgccagct caaa acacgcgtgc 18; gcgcggcggg cgagcgcgct cgccgcctca gtcgccagcg gcag tccgcctttt 24; tccggagcag actggccgcg gtcg gtagcagcgg ccgccgcagc ggctccgcac ; tggcgaaccg agggcagaaa aaggcggggt tgacggcttt ttggtaggag tgggctggac 36; ccag agacaaaggc tcccaaggca agagggactg tggccctgcg tcggctctgc 42; tcgggactgc tgaccccagg aatttacgcc ccttcgtttt tctcttctga ttcttctctt 48; ctcccaagcc cgcgtcccct cacgcgtggc ctctctcctt gccgggaggg tgga 54; ggtcccgccc aggctttccc atgtgccgcc gccattgttc ccctccgctc cttt 60; agcctcccgc agcctctccc attggcggcc gcggccgccg cggcagctag ccccgctcct 66; cccttcgctc gctcccagct ccgcccggca gcgc cagc gccacgtcac 72; cgcccagcag ccctcccgat tggcgggcgg ggcggctata aagggagggc gcaggcggcg 78; cccggatctc ttccgccgcc attttaaatc cagctccata caacgctccg ccgccgctgc 84; tgccgcgacc cggactgcgc gccagcaccc ccctgccgac agctccgtca ctatggagga 90; tatgaacgag tacagcaata tagaggaatt cgcagaggga tccaagatca acgcgagcaa 96; gaatcagcag gatgacggta aaatgtttat tggaggcttg agctgggata caagcaaaaa 102; agatctgaca gagtacttgt ctcgatttgg ggaagttgta gactgcacaa ttaaaacaga 108; tccagtcact gggagatcaa gaggatttgg atttgtgctt ttcaaagatg ctgctagtgt 114; tgataaggtt ttggaactga aagaacacaa tggc aaattgatag atcccaaaag 120; ggccaaagct ttaaaaggga aagaacctcc caaaaaggtt tttgtgggtg gattgagccc L26; ggatacttct gaagaacaaa ttaaagaata ttttggagcc tttggagaga ttgaaaatat L32; tgaacttccc atggatacaa aaacaaatga aagaagagga ttttgtttta tcacatatac L38; agag ccagtaaaaa aattgttaga aagcagatac catcaaattg gttctgggaa L44; gtgtgaaatc aaagttgcac aaga ggtatatagg cagcaacagc aacaacaaaa L50; aggtggaaga ggtgctgcag ctggtggacg tacg aggggtcgtg gccgaggtca L56; gggccaaaac tggaaccaag gatttaataa ctattatgat caaggatatg gaaattacaa L62; 
tagtgcctat ggtggtgatc aaaactatag tggctatggc ggatatgatt atactgggta L68; tggg aactatggat atggacaggg atatgcagac tacagtggcc aacagagcac L74; caag gcatctcgag ggggtggcaa tcaccaaaac aattaccagc catactaaag L80; gagaacattg gagaaaacag gaggagatgt taaagtaacc catcttgcag gacgacattg L86; aagattggtc ttctgttgat ctaagatgat tattttgtaa ttct agtgtacaag L92; acaccattgt gtccaactgt atatagctgc caattagttt tctttgtttt tactttgtcc L98; atct gtgttatgac tcaatgtgga tttgtttata cacattttat ttgtatcatt 204; tcatgttaaa cctcaaataa atgcttcctt atgtgattgc ttttctgcgt caggtactac 210; atagctctgt aaaaaatgta atttaaaata agcaataatt aaggcacagt tgattttgta 216; gagtattggt ccatacagag aaactgtggt cctttataaa tagccagcca gcgtcaccct 222; caat ttgtaggtgt attttatgct cttaaggctt ctcc ctgtaactga 228; gatttctacc acacctttga acaatgttct ttcccttctg gttatctgaa gactgtcctg 234; aaaggaagac ataagtgttg gtag aagctttcta gtagaccata tttcttctgg 240; attgtaataa aattgttagt tttt actttgttcc tgtctctgga aagccatttt 246; tgaattgctg attactttgg tcag tggtcaccta gaaaaagctt tgtaatcata 252; acacaatgag taattcttga taaaagttca gatacaaaag gagcactgta aaactggtag 258; gagctatggt ttaagagcat tggaagtagt tacaactcaa tggt agaaaggtat 264; gagtttggtc gaaaaattaa aatagtggca aaataagatt tagttgtgtt ttctcagagc 270; cgccacaaga ttgaacaaaa tgttttctgt ttgggcatcc tgaggaagtt gtattagctg 276; ttaatgctct gtgagtttag aaaaagtctt aaat ctagtttttg acacagtgca 282; tgaactaagt agttaaatat ttacatattc gaat aaag gtatcttggt 288; tatgacaaag tcattacaaa tgtgactaag tcattacaaa tgtgactgag tcattacagt 294; tctg ggtgcattga aaagaatccg ttttatatcc aggtttcaga ggacctggaa 300; taataaaaag ctttggattt tgcattcagt gtagttggat tttgggacct tggcctcagt 306; tact gggattggca tacgtgttca caggcagagt agttgatctc acacaacggg 312; tgatctcaca aaactggtaa gtttcttatg ctcatgagcc tttt aatt 318; tggtgcctgc aactttctta acaatgattc tacttcctgg gctatcacat tataatgctc 324; ttggcctctt ttttgctgct gttttgctat tcttaaactt aggccaagta ccaatgttgg 330; ctgttagaag ggattctgtt acat gcaactttag ggaatggaag taagttcatt 336; tttaagttgt gttgtcagta ggtgcggtgt ctagggtagt 
gaatcctgta agttcaaatt 342; tatgattagg tgacgagttg acattgagat tgtccttttc cctgatcaaa aaatgaataa 348; agccttttta aacaaaatcc aaacttttaa cttg atatgtatga ctgagaaaaa 354; atacactaca tctagagatg attgagatgt tttgcaaaga attgaagggg gagtgagaat 360; tggtttttct tgcaggggct ttgaactcta gatttgggct ttgaactcta gatttaattc 366; agatttcagg gtctatcagt tcaccaactg atgcaaattt gaacagatac tctaaggcta 372; agtgtcctag gttggatgaa ctgaagctac tatcaagatc tcg:tcccaa ggattaattt 378; agaacaaagt acaa gtttattggg gaggggatag aaa:gaattc taaagtacct 384; ataacaaata ctctgtgtat gttttttaca tcgtatttgc ctt:tacatt acca 390; aattctgtgt atcc agag gatagaaatt aat:ctaaag tacctataac 396; aaatgctcgg tttgtatatg tttttataca tcgtatttgc ctt :tgcatt gttcagacca 402; gtgt gatgttatcc taacaaaaca ccttagtaat ttc:ttggtt aacatgttaa 408; tctgtaatct cacttttata agatgaggac tattaaaatg aga:gtctgt tgggatgcta 414; aaaa aaaaaaa . HSPA9: HSPA9 heat shock 70kDa protein 9 (mortalin) [ Homo sapiens ] LOCUS VM_OO4134 AA / :ranslation:"MISASRAAAARLVGAAASRGPTAA RHQDSWNGLSH DYASEAIKGAVVGIDAG"TNSCVAVMEGKQAKV *NA‘GAR 11PSVVAE1ADG MPAKRQAVTVPNVTFYA"KRLIGRRYDDPEVQKJIKVVPFK GDAWV'A4 YSPSQIGAFVLMKMK41A*NY-GH1AKNAVITVPAYFVDSQRQATKDAGQISG VIN4P AAA-AYGLDKSTDKVIAVYDAGGG1EDISI *IQKGVELVKS NGDib -RHIVK4bK? 4 GVD-iKDNWALQRV? AA4KAKC4LSSSVQ"DIN-PY4 DSSGPK RAQbLGIViDLIRR1IAPCQKAMQDAEVSKSDIGTVI-VGGWT PKVQQ"VQDAFGQAPSKAVWPDEAVAIGAAIQGGVLAGDVTDVL.LDV"?.5-GI4 GGVF"K. RVi IPiKKSQVtSiAADGQiQVLIKVCQGA. R‘MAGDWKLLGQFTAIG PPAPRG LV bDIDANGIViVSAKDKGTGREQQIVIQSSGG-SKDDITVWVKVA KYA 4 4 D NMA‘GIIHDi‘iKM 4 *bKDQLPAD‘CNK-K 4 *ISKW? * I -A KDS4 G RQAASSLQQASLK-FTMAYKKWAS*R*GSGSSG1G QK4DQKA. 
4 *KQ CDNA: 1 :tcctcccc ggac :ct:tc :gagc:caga gccgccgcag ccgggacagg agggcaggc 61 :tctccaacc atca:gc:gc ggagcatatt acctgtacgc cctggctccg ggagcggcag 121 :cgagtatcc tctggtcagg cggcgcgggc ggcgcctcag cggaagagcg ggcctctggg 181 ccgcagtgac caacccccgc ccctcacccc ttgg aggtttccag aagcgctgcc 24; gccaccgcat cgcgcagctc tttgccgtcg gagcgcttgt ttgctgcctc gtactcctcc ; atttatccgc catgataagt gccagccgag ctgcagcagc ccgtctcgtg ggcgccgcag 36; cctcccgggg ccctacggcc gcccgccacc aggatagctg gaatggcctt agtcatgagg 42; cttttagact tgtttcaagg tatg catcagaagc aatcaaggga gcagttgttg 48; gtattgattt gggtactacc aactcctgcg tggcagttat ggaaggtaaa caagcaaagg 54; tgctggagaa tgccgaaggt gccagaacca cagt cttt acagcagatg 60; gtgagcgact tgttggaatg ccggccaagc gacaggctgt caccaaccca aacaatacat 66; tttatgctac caagcgtctc attggccggc gatatgatga tcctgaagta cagaaagaca 72; atgt tccctttaaa attgtccgtg cctccaatgg tgatgcctgg gttgaggctc 78; atgggaaatt gtattctccg agtcagattg gagcatttgt gttgatgaag atgaaagaga 84; aaaa ttacttgggg cacacagcaa aaaatgctgt gatcacagtc ccagcttatt 90; tcaatgactc gcagagacag gccactaaag atgctggcca tgga ctgaatgtgc 96; ttcgggtgat taatgagccc acagctgctg ctcttgccta tggtctagac aaatcagaag L02; acaaagtcat tgctgtatat ggtg gtggaacttt tgatatttct gaaa L08; ttcagaaagg agtatttgag gtgaaatcca caaatgggga taccttctta ggtggggaag L14; actttgacca ggccttgcta cggcacattg tgaaggagtt caagagagag acaggggttg L20; atttgactaa agacaacatg gcacttcaga gggtacggga tgaa aaggctaaat L26; gtgaactctc ctcatctgtg cagactgaca tcaatttgcc ctatcttaca atggattctt L32; ctggacccaa gcatttgaat atgaagttga cccgtgctca atttgaaggg actg L38; atctaatcag aaggactatc gctccatgcc aaaaagctat gcaagatgca gaagtcagca L44; agagtgacat aggagaagtg attcttgtgg tgac taggatgccc aaggttcagc L50; agactgtaca tttt ggcagagccc aagc tgtcaatcct gatgaggctg L56; tggccattgg agctgccatt cagggaggtg tgttggccgg cgatgtcacg gatgtgctgc L62; tccttgatgt cactcccctg tctctgggta ttgaaactct aggaggtgtc tttaccaaac L68; ttattaatag gaataccact attccaacca agaagagcca ggtattctct actgccgctg L74; atggtcaaac gcaagtggaa 
attaaagtgt gtcagggtga aagagagatg gctggagaca L80; tcct tggacagttt actttgattg gaattccacc agcccctcgt cctc L86; agattgaagt tacatttgac attgatgcca atgggatagt acatgtttct gata L92; aaggcacagg gcag cagattgtaa tccagtcttc tggtggatta agcaaagatg L98; atattgaaaa taaa aatgcagaga aatatgctga agaagaccgg cgaaagaagg 204; aacgagttga agcagttaat atggctgaag ttca cgacacagaa accaagatgg 210; aagaattcaa ggaccaatta cctgctgatg agtgcaacaa gctgaaagaa gagatttcca 216; aaatgaggga ggct agaaaagaca gcgaaacagg agaaaatatt agacaggcag 222; catcctctct tcagcaggca tcactgaagc aaat ggcatacaaa aagatggcat 228; ctgagcgaga aggctctgga ggca ctggggaaca agat caaaaggagg 234; aaaaacagta ataatagcag aaattttgaa gccagaagga caacatatga agcttaggag 240; tgaagagact tcctgagcag aaatgggcga acttcagtct ttttactgtg tttttgcagt 246; attctatata taatttcctt taaa tttagtgacc attagctagt gatcatttaa 252; gtga ttctaacagt ataaagttca caatattcta tgtccctagc ctgtcatttt 258; tcagctgcat gtaaaaggag gtaggatgaa ttgatcatta taaagattta actattttat 264; gtga ccatattttc aaggggtgaa accatctcgc acacagcaat gaaggtagtc 270; atccatagac tgag accacatatg gggatgagat ccttctagtt agcctagtac 276; tgctgtactg atgt acatggggtc cttcaactga ggccttgcaa gtcaagctgg 282; ctgtgccatg tttgtagatg gggcagagga atctagaaca atgggaaact tagctattta 288; tattaggtac agctattaaa acaaggtagg aatgaggcta gacctttaac ttccctaagg 294; catacttttc tagctacctt tgtg tctggcacct acatccttga tgattgttct 300; cttacccatt ctggaatttt ttttttttta aataaataca gaaagcatct tgatctcttg 306; tttgtgaggg gtgatgccct gagatttagc ttcaagaata tgccatggct catgcttccc 312; ccca aagagggaaa tacaggattt gctaacactg gttaaaaatg caaattcaag 318; atttggaagg gctgttataa tgaaataatg agcagtatca gcatgtgcaa atcttgtttg 324; aaggatttta ttttctcccc ttagaccttt ggtacattta gaatcttgaa agtttctaga 330; tctctaacat gaaagtttct agatctctaa catgaaagtt tttagatctc taacatgaaa 336; accaaggtgg ctattttcag gttgctttca gctccaagta gaaataacca gaattggctt 342; acattaaaga aactgcatct agaaataagt cctaagatac tatg gctcaaaaat 348; aaaaggaacc cagatttctt tcccta 16. 
MAP2K2: MAP2K2 mitogen-activated protein kinase kinase 2 [ Homo sapiens ] LOCUS NM_030662 AA /translation="M4ARRKPV .PA-iINPiIA S *GAS *ANT.VDT.QKK .4 4T. 4LD‘QQKKR-TAFLTQKAKVGdLKD DDE‘RIS dLGAGNGGVVTKVQi QPSG .

IHLEIKPAIQWQIIR « T.QVT. +1 *CNSPYIVGEYGAEYSDG *ISICMA. {WDGGS .DQV-K *I-GKVSIAVL RG .AYT.R'TKHQIMH RDVKPSWI .VNSRGTIKLC DFGVSGQ AIDSMAWSFVGTRSYWAPTR .QGTHYSVQSDIWSMGLS .VT-AVG QYPIPPP DAK*-* AIFGRPVVDGddeP {SISP QPRPPGRPVSGHGMDSRPAWAIF 'TT. . DYIVNTPPPK E DFQTFVNKC .IKVPATRAD .KMLiNHibIKRS 4V4 *VDEAGWLCKTJRLWQP GTPTRTAV CDNA: cccctgcctc tcggactcgg gctgcggcgt cagccttctt cgggcctcgg cagcggtagc 6; ggctcgctcg cctcagcccc agcgcccctc ggctaccctc ggcccaggcc cgcagcgccg 12; cccgccctcg gccgccccga cgccggcctg ggCCgngCC gcagccccgg gctcgcgtag 18; gcgccgaccg ctcccggccc gccccctatg ggccccggct agaggcgccg ccgccgccgg 24; cccgcggagc cccgatgctg agga agccggtgct gctc accatcaacc ; ctaccatcgc cgagggccca acca gcgagggcgc ctccgaggca aacctggtgg 36; acctgcagaa gaagctggag gagctggaac ttgacgagca gcagaagaag ngctggaag 42; cctttctcac ccagaaagcc aaggtcggcg aaga cgatgacttc gaaaggatct 48; cagagctggg cgcgggcaac ggcggggtgg tcaccaaagt ccagcacaga ccctcgggcc 54; tggc caggaagctg atccaccttg agatcaagcc ccgg aaccagatca 60; tccgcgagct gcaggtcctg cacgaatgca actcgccgta catcgtgggc ttctacgggg 66; acag tgacggggag attt gcatggaaca catggacggc ggctccctgg 72; accaggtgct gaaagaggcc aagaggattc ccgaggagat cctggggaaa gtcagcatcg 78; cggttctccg gggcttggcg cgag acca gatcatgcac cgagatgtga 84; agccctccaa catcctcgtg aactctagag gggagatcaa gctgtgtgac ttcggggtga 90; gcggccagct catcgactcc atggccaact ccttcgtggg cacgcgctcc tacatggctc 96; ngagcggtt gcagggcaca cattactcgg tgcagtcgga catctggagc ctgt L02; ccctggtgga gctggccgtc ggaaggtacc ccatcccccc gcccgacgcc aaagagctgg L08; tctt tggccggccc gtggtcgacg gggaagaagg agagcctcac agcatctcgc L14; ctcggccgag gccccccggg cgccccgtca gcggtcacgg gatggatagc cggcctgcca L20; tggccatctt tgaactcctg gactatattg tgaacgagcc acctcctaag ctgcccaacg L26; gtgtgttcac ccccgacttc caggagtttg tcaataaatg cctcatcaag aacccagcgg L32; agcgggcgga cctgaagatg aacc acaccttcat caagcggtcc gaggtggaag L38; aagtggattt tgccggctgg ttgtgtaaaa ccctgcggct gaaccagccc ggcacaccca L44; cgcgcaccgc acag tggccgggct gtcc cgctggtgac ctgcccaccg L50; 
tccctgtcca tgccccgccc ttccagctga ggacaggctg gcgcctccac ccaccctcct L56; gcctcacccc tgcggagagc accgtggcgg ggcgacagcg catgcaggaa nggggtctc L62; ctctcctgcc cgtcctggcc ggggtgcctc cggg cgacgctgct gtgtgtggtc L68; tcagaggctc tgcttcctta ggttacaaaa aggg agagaaaaag caaaaaaaaa L74; aaaaaaaaaa aaaaaaaaa l7. LDHA: ;DHA lactate dehydrogenase A [ Homo sapiens ] LOCJS WM_001135239 (isoform 2) AA/:ransla:ion="WATLKDQLIYN.LKd4QLPQNKIiVVGVGAVGMACAISILMKDL AD?.ALVDVIdDKLKG4WMDLQHGS.F.R"PKIVSGKVDIATYVAWKISGFPKWRVIG SGCWLDSARFRY.MGTR.GVHPLSC{GWVLGEHGDSSVPVWSGMNVAGVSLKTAHPDL G1DKDKdQWKdViKQVVdSAYdVIK.KGY"SWAIG.SVAD.ATSIMKNLRRV4PVSTM IKGAYGIKDDVFASVPCILGQWGISDLVKVL.15444ARLKKSADTLWGIQKT.QF CDNA: cggt cggttg:ctg gctgcgcgcg ccacccgggc ctctccagtg ccccgcctgg 6; atcc acccccagcc cgactcacac gtgggttccc gcacgtccgc cggccccccc l2; cgctgacgtc agcatagctg ttccacttaa ggcccctccc gcgcccagct cagagtgctg l8; cagccgctgc cgccgattcc ggatctcatt gccacgcgcc cccgacgacc gcccgacgtg 24; cgat tccttttggt tccaagtcca atatggcaac tctaaaggat cagctgattt ; ataatcttct aaaggaagaa cagacccccc agat tacagttgtt ggggttggtg 36; gcat ggcctgtgcc atcagtatct taatgaagga cttggcagat gaacttgctc 42; ttgttgatgt catcgaagac aaattgaagg gagagatgat ggatctccaa catggcagcc 48; ttttccttag aacaccaaag attgtctctg gcaaagtgga tatcttgacc tacgtggctt 54; ggaagataag tggttttccc aaaaaccgtg ttattggaag cggttgcaat ctggattcag 60; cccgattccg ttacctaatg aggc tgggagttca cccattaagc tgtcatgggt 66; gggtccttgg tgga gattccagtg tgcctgtatg gagtggaatg aatgttgctg 72; gtgtctctct tctg cacccagatt tagggactga taaagataag gaacagtgga 78; ttca ggtg gttgagagtg aggt gatcaaactc aaaggctaca 84; catcctgggc tattggactc tctgtagcag atttggcaga gagtataatg aagaatctta 90; ggcgggtgca cccagtttcc accatgatta agggtcttta cggaataaag gatgatgtct 96; tccttagtgt tccttgcatt ttgggacaga atggaatctc agaccttgtg aaggtgactc ;O2; tgacttctga ggaagaggcc cgtttgaaga agagtgcaga ttgg gggatccaaa ;O8; aggagctgca attttaaagt atgt catatcattt cactgtctag gctacaacag ;14; gattctaggt ggaggttgtg catgttgtcc tttttatctg atctgtgatt aaagcagtaa 
;20; tattttaaga tggactggga aaaacatcaa ctcctgaagt tagaaataag aatggtttgt ;26; aaaatccaca gctatatcct gatgctggat ggtattaatc ttgtgtagtc ttcaactggt ;32; tagtgtgaaa tagttctgcc acctctgacg caccactgcc aatgctgtac gtactgcatt ;38; tgccccttga gccaggtgga tgtttaccgt gtgttatata acttcctggc actg ;44; aacatgccta gtccaacatt ttttcccagt gagtcacatc ctgggatcca gtgtataaat ;50; ccaatatcat tgca taattcttcc aaaggatctt attttgtgaa ctatatcagt 156; agtgtacatt accatataat gtaaaaagat ctacatacaa acaatgcaac caactatcca 162; agtgttatac caactaaaac ccccaataaa acag tgactacttt ggttaattca 168; ttatattaag atataaagtc ataaagctgc tagttattat attaatttgg aaatattagg 174; ctattcttgg gcaaccctgc aacgattttt tctaacaggg atattattga ctaatagcag 180; aggatgtaat agtcaactga gttgtattgg taccacttcc attgtaagtc tatt 186; atatatttga tgct aatcataatt ggaaagtaac attctatatg taaatgtaaa 192; atttatttgc caactgaata taggcaatga gtca ctatagggaa cacagatttt 198; tgagatcttg tcctctggaa gctggtaaca attaaaaaca atcttaaggc agggaaaaaa 204; aaaaaaaaaa aa Aocus NM_001165414 (isoform 3) AA/translation= "MGLPSGGYiYiQisIbAEHAKIPEGSKSVWATLKDQLIYN LK'‘J EQTPQVKITVVGVGAVGMACAISI .MKD .ADT .ALVDVI *DKLKG‘WMDLQHGS .F-R "PKIVSGKDYNVHANSKLVIITAGARQQ‘G‘S? .N .VQRVVVIFKF IIPNVVKYSPWC KLAIVSVPVDIAHYVAWKISGFPKVRVIGSGCV JDSARF RY.MGTR .GVHPLSCiGWV .GTHGDSSVPVWSGMNVAGVSLKT .HPD .Gi WK‘ViKQVV‘SAY‘VIK .KGY "SWAIG-SVAD .ATSIMKNLRRV iPVSTWIKG JYGIKDDVF.JSVPCILGQVGIS DLVK V1.15A. *‘ARLKKSADTLWGIQKﬂA. 
.QF CDNA: 1 t :gggcgggg cgtaaaagcc gggcgt10gg aggacccagc aa:tag:ctg atttccgccc 61 acctttccga gcgggaagga gagccacaaa gcgcgcatgc gcgcggatca ccgcaggctc 12; ctgtgccttg ggcttgagct ttgtggcagt cttt tctgcacgta tctctggtgt 18; ttacttgaga agcctggctg tgtccttgct gtaggagccg gagtagctca tctt 24; gtctgaggaa aggccagccc cacttggggt taataaaccg cgatgggtga accctcagga ; ggctatactt acacccaaac gtcgatattc cttttccacg ctaagattcc ttttggttcc 36; aagtccaata tggcaactct aaaggatcag ctgatttata atcttctaaa ggaagaacag 42; accccccaga ataagattac agttgttggg gttggtgctg ttggcatggc ctgtgccatc 48; agtatcttaa tgaaggactt ggcagatgaa cttg ttgatgtcat cgaagacaaa 54; ggag agatgatgga tctccaacat ggcagccttt tccttagaac gatt 60; gtctctggca ataa tgtaactgca aactccaagc tggtcattat cacggctggg 66; gcacgtcagc aagagggaga aagccgtctt aatttggtcc agcgtaacgt gaacatcttt 72; aaattcatca ttcctaatgt tgtaaaatac agcccgaact gcaagttgct ttca 78; aatccagtgg atatcttgac ctacgtggct ataa ttcc caaaaaccgt 84; gttattggaa gcaa ttca gcccgattcc gttacctaat gggggaaagg 90; ctgggagttc acccattaag ctgtcatggg tgggtccttg gggaacatgg agattccagt 96; gtgcctgtat ggagtggaat gaatgttgct ggtgtctctc tgaagactct gcacccagat 102; ttagggactg ataaagataa gtgg aaagaggttc aggt ggttgagagt 108; gcttatgagg tgatcaaact caaaggctac acatcctggg ctattggact ctctgtagca 114; gatttggcag agagtataat gaagaatctt aggcgggtgc acccagtttc caccatgatt 120; aagggtcttt taaa ggatgatgtc ttccttagtg ttccttgcat tttgggacag 126; aatggaatct cagaccttgt gaaggtgact ctgacttctg aggaagaggc gaag 132; aagagtgcag atacactttg ggggatccaa aaggagctgc aattttaaag tcttctgatg 138; tcatatcatt tcactgtcta ggctacaaca ggattctagg tggaggttgt gcatgttgtc 144; ctttttatct gatctgtgat taaagcagta atattttaag atggactggg aaaaacatca 150; actcctgaag ttagaaataa gaatggtttg taaaatccac agctatatcc tgatgctgga 156; tggtattaat cttgtgtagt cttcaactgg tgaa atagttctgc cacctctgac 162; gcaccactgc caatgctgta cgtactgcat ttgccccttg agccaggtgg atgtttaccg 168; tgtgttatat aacttcctgg ctccttcact gaacatgcct agtccaacat tttttcccag 174; tgagtcacat cctgggatcc agtgtataaa 
tccaatatca tgtcttgtgc ataattcttc 180; caaaggatct tattttgtga actatatcag tagtgtacat taccatataa tgtaaaaaga 186; tctacataca aacaatgcaa atcc aagtgttata ccaactaaaa cccccaataa 192; accttgaaca gtgactactt attc attatattaa gatataaagt cataaagctg 198; ctagttatta tttg gaaatattag cttg ggcaaccctg caacgatttt 204; ttctaacagg gatattattg actaatagca gaggatgtaa tagtcaactg agttgtattg 210; gtaccacttc cattgtaagt cccaaagtat tatatatttg atgc taatcataat 216; tggaaagtaa cattctatat gtaaatgtaa aatttatttg ccaactgaat ataggcaatg 222; atagtgtgtc actataggga acacagattt ttgagatctt gtcctctgga taac 228; aattaaaaac aatcttaagg cagggaaaaa aaaaaaaaaa aaa LOCUS 165415 (isoform 4) AA/transla:ion="WATLKDQLIYN.LK*4QiPQNKI1VVGVGAVGMACAISILMKDL ADTLALVDV14DKLKG4MMDLQHGSLF .RTPKIVSGKDYNV'"ANSKLVIITAGARQQ:A.

GTSR-N .VQRWVNIFKFIIPNVVKYSPVCK. VDIL'"YVAWKISGFPKNRVIG SGCNADSARF RYLMGERLGVHPLSCHGWVLGA. EHGDSSVPVWSGMNVAGVSLKTLHPDA GiDKDKdQWK *CRYi AAILKSSDVISFHCLGYNRILGGGCACCPFYLICD CDNA: 1 gtctgccggt tctg gctgcgcgcg ccacccgggc ctctccagtg ccccgcctgg 61 ctcggcatcc acccccagcc cgactcacac gtgggttccc gcacgtccgc cggccccccc 121 cgctgacgtc agcatagctg ttccacttaa ggcccctccc gcgcccagct cagagtgctg 181 cagccgctgc cgccgattcc ggatctcatt gccacgcgcc cccgacgacc cgtg 24; cattcccgat tccttttggt tccaagtcca atatggcaac tctaaaggat cagctgattt ; ataatcttct aaaggaagaa cagacccccc agaataagat tacagttgtt ggggttggtg 36; ctgttggcat ggcctgtgcc atcagtatct taatgaagga cttggcagat gaacttgctc 42; ttgttgatgt catcgaagac aaattgaagg gagagatgat ggatctccaa catggcagcc 48; ttttccttag aacaccaaag attgtctctg gcaaagacta taatgtaact tcca 54; agctggtcat tatcacggct ggggcacgtc agcaagaggg agaaagccgt cttaatttgg 60; tccagcgtaa cgtgaacatc tttaaattca tcattcctaa tgttgtaaaa tacagcccga 66; actgcaagtt gcttattgtt tcaaatccag tctt gacctacgtg gcttggaaga 72; taagtggttt aaac cgtgttattg gaagcggttg caatctggat tcagcccgat 78; tccgttacct aatgggggaa aggctgggag ttcacccatt aagctgtcat gggtgggtcc 84; ttggggaaca tggagattcc agtgtgcctg tatggagtgg aatgaatgtt gctggtgtct 90; ctctgaagac ccca gatttaggga ctgataaaga taaggaacag tggaaagagt 96; acac tttgggggat ccaaaaggag ctgcaatttt aaagtcttct gatgtcatat 102; catttcactg tctaggctac aacaggattc taggtggagg ttgtgcatgt tgtccttttt 108; atctgatctg tgattaaagc agtaatattt ggac tgggaaaaac atcaactcct 114; gaagttagaa atgg tttgtaaaat ccacagctat atcctgatgc tggatggtat 120; taatcttgtg tagtcttcaa ctggttagtg tgaaatagtt ctgccacctc tgacgcacca 126; ctgccaatgc tact gcatttgccc cttgagccag gtggatgttt accgtgtgtt 132; atataacttc ctggctcctt cactgaacat gcctagtcca acattttttc agtc 138; acatcctggg atccagtgta taaatccaat atcatgtctt gtgcataatt aagg 144; atcttatttt gtgaactata tcagtagtgt acattaccat ataatgtaaa aagatctaca 150; tacaaacaat gcaaccaact atccaagtgt tataccaact aaaaccccca ataaaccttg 156; aacagtgact actttggtta attcattata ttaagatata aagtcataaa 
gctgctagtt 162; attatattaa tttggaaata tatt cttgggcaac cctgcaacga ttttttctaa 168; tatt attgactaat agcagaggat gtaatagtca actgagttgt acca 174; cttccattgt aagtcccaaa tata tttgataata atgctaatca taattggaaa 180; gtaacattct atatgtaaat gtaaaattta tttgccaact gaatataggc aatgatagtg 186; tgtcactata gggaacacag atttttgaga tcttgtcctc ctgg taacaattaa 192; aaacaatctt aaggcaggga aaaaaaaaaa aaaaaaa LOCUS 165416 (isoform 5) AA/transla :ion="WATLKDQLIYN.LK*4QiPQNKIiVVGVGAVGMACAISILMKDL VDVI *DKLKG4WMDLQ4GSLF-RTPKIVSGKDYNV"ANSKLVIITAGARQQ:A.

GESRLNLVQRNVNIFKFIIPNVVKYSPNCKLLIVSNPVDILTYVAWKISGFPKNRVIGSGCNLDSARFRYLMGERLGVHPLSCHGWVLGEHGDSSVPVWSGMNVAGVSLKTLHPDLGTDKDKEQWKEVHKQVVERVFTE

CDNA: gtctgccggt cggttgtctg gctgcgcgcg ccacccgggc ctctccagtg ccccgcctgg 6; ctcggcatcc acccccagcc cgactcacac gtgggttccc gcacgtccgc cggccccccc 12; cgctgacgtc agcatagctg ttccacttaa ggcccctccc gcgcccagct cagagtgctg 18; ctgc cgccgattcc ggatctcatt gccacgcgcc cccgacgacc gcccgacgtg 24; cgat tccttttggt tccaagtcca atatggcaac tctaaaggat cagctgattt ; ataatcttct aaaggaagaa cccc agaataagat tacagttgtt ggggttggtg 36; ctgttggcat ggcctgtgcc atcagtatct taatgaagga cttggcagat gctc 42; ttgttgatgt catcgaagac aaattgaagg gagagatgat ggatctccaa catggcagcc 48; ttttccttag aacaccaaag attgtctctg gcaaagacta taatgtaact gcaaactcca 54; agctggtcat tatcacggct ggggcacgtc agcaagaggg ccgt cttaatttgg 60; gtaa cgtgaacatc tttaaattca tcattcctaa tgttgtaaaa tacagcccga 66; actgcaagtt gcttattgtt tcaaatccag tggatatctt gacctacgtg gcttggaaga 72; taagtggttt tcccaaaaac cgtgttattg gaagcggttg caatctggat tcagcccgat 78; tccgttacct aatgggggaa ggag ttcacccatt aagctgtcat gggtgggtcc 84; aaca tggagattcc cctg tatggagtgg aatgaatgtt gctggtgtct 90; ctctgaagac tctgcaccca gatttaggga ctgataaaga taaggaacag gagg 96; ttcacaagca ggtggttgag ttta cggaataaag gatgatgtct tccttagtgt L02; tccttgcatt ttgggacaga atggaatctc agaccttgtg aaggtgactc tgacttctga L08; ggaagaggcc aaga agagtgcaga tacactttgg gggatccaaa aggagctgca L14; attttaaagt cttctgatgt catatcattt cactgtctag gctacaacag gattctaggt L20; ggaggttgtg catgttgtcc tttttatctg atctgtgatt gtaa tattttaaga L26; tggactggga aaaacatcaa ctcctgaagt tagaaataag aatggtttgt aaaatccaca L32; gctatatcct gatgctggat ggtattaatc ttgtgtagtc ttcaactggt tagtgtgaaa L38; tagttctgcc acctctgacg caccactgcc aatgctgtac gtactgcatt tgccccttga L44; gccaggtgga tgtttaccgt tata acttcctggc tccttcactg aacatgccta L50; gtccaacatt ttttcccagt gagtcacatc ctgggatcca gtgtataaat ccaatatcat L56; gtcttgtgca taattcttcc aaaggatctt attttgtgaa ctatatcagt agtgtacatt L62; accatataat gtaaaaagat acaa acaatgcaac caactatcca atac 168; aaac ccccaataaa ccttgaacag tgactacttt ggttaattca ttatattaag 174; atataaagtc ataaagctgc ttat ttgg aaatattagg ctattcttgg 
180; gcaaccctgc aacgattttt tctaacaggg ttga ctaatagcag aggatgtaat 186; agtcaactga gttgtattgg taccacttcc attgtaagtc ccaaagtatt atatatttga 192; tgct aatcataatt taac attctatatg taaatgtaaa atttatttgc 198; caactgaata taggcaatga tagtgtgtca ctatagggaa cacagatttt tgagatcttg 204; ggaa gctggtaaca attaaaaaca atcttaaggc agggaaaaaa aaaaaaaaaa 210; aa LOC JS WM_005566 (isoform 1) AA/ :ransla:ion= H WATLKDQLIYN mK4 dQiPQNKIiVVGVGAVGMACAISILMKDL A34-ALVDV14DKLKG4WMDLQHGS E .R"PKIVSGKDYNV'"ANSKLVIITAGARQQ:A.

GESRLNLVQRNVNIFKFIIPNVVKYSPNCKLLIVSNPVDILTYVAWKISGFPKNRVIGSGCNLDSARFRYLMGERLGVHPLSCHGWVLGEHGDSSVPVWSGMNVAGVSLKTLHPDL

GiDK3K4QWK4ViKQVV‘SAY‘VIK .KGY"SWAIG .SVAD .A TSIMKNLRRV IKGJYGIKDDVFJSVPCILGQWGIS DLVKVi-iS** *ARLKKSADTLWGIQKT CDNA: gtctgccggt cggttg:ctg gctgcgcgcg ccacccgggc ctctccagtg ccccgcctgg 6; ctcggcatcc acccccagcc cgactcacac gtgggttccc gcacgtccgc cggccccccc 12; cgctgacgtc agcatagctg ttccacttaa ggcccctccc gcgcccagct cagagtgctg 18; cagccgctgc cgccgattcc catt cgcc cccgacgacc cgtg 24; cattcccgat tccttttggt tccaagtcca atatggcaac tctaaaggat cagctgattt ; ataatcttct aaaggaagaa cagacccccc agaataagat tacagttgtt ggggttggtg 36; ctgttggcat ggcctgtgcc atcagtatct taatgaagga cttggcagat gaacttgctc 42; ttgttgatgt catcgaagac aaattgaagg tgat ggatctccaa catggcagcc 48; ttttccttag aacaccaaag tctg gcaaagacta taatgtaact gcaaactcca 54; tcat tatcacggct ggggcacgtc agcaagaggg agaaagccgt cttaatttgg 60; tccagcgtaa cgtgaacatc tttaaattca tcattcctaa tgttgtaaaa tacagcccga 66; actgcaagtt gcttattgtt tcaaatccag tggatatctt gacctacgtg gcttggaaga 72; taagtggttt tcccaaaaac cgtgttattg gaagcggttg caatctggat tcagcccgat 78; tccgttacct aatgggggaa aggctgggag ttcacccatt aagctgtcat gggtgggtcc 84; ttggggaaca tggagattcc agtgtgcctg tatggagtgg aatgaatgtt gctggtgtct 90; ctctgaagac tctgcaccca gatttaggga ctgataaaga taaggaacag tggaaagagg 96; ttcacaagca ggtggttgag agtgcttatg aggtgatcaa aggc tacacatcct 102; gggctattgg actctctgta gcagatttgg cagagagtat aatgaagaat cttaggcggg L08; tgcacccagt ttccaccatg attaagggtc tttacggaat aaaggatgat gtcttcctta L14; gtgttccttg cattttggga ggaa tctcagacct ggtg actctgactt L20; ctgaggaaga ggcccgtttg aagaagagtg cagatacact ttgggggatc gagc L26; tgcaatttta aagtcttctg atgtcatatc ctgt ctaggctaca acaggattct L32; aggtggaggt tgtgcatgtt gtccttttta tctgatctgt gattaaagca gtaatatttt L38; aagatggact gggaaaaaca tcaactcctg aagttagaaa taagaatggt ttgtaaaatc L44; cacagctata tcctgatgct ggatggtatt aatcttgtgt agtcttcaac tggttagtgt L50; gaaatagttc tgccacctct gacgcaccac tgccaatgct actg catttgcccc L56; ttgagccagg tggatgttta ccgtgtgtta ttcc tggctccttc actgaacatg L62; cctagtccaa cattttttcc cagtgagtca catcctggga tccagtgtat 
aaatccaata L68; tcatgtcttg tgcataattc ttccaaagga tcttattttg tgaactatat cagtagtgta L74; cattaccata taatgtaaaa agatctacat acaaacaatg caaccaacta tccaagtgtt L80; acta aaacccccaa taaaccttga acagtgacta ctttggttaa ttcattatat L86; taagatataa agtcataaag ctgctagtta ttatattaat ttggaaatat taggctattc L92; ttgggcaacc ctgcaacgat tttttctaac atta ttgactaata gcagaggatg L98; taatagtcaa ctgagttgta ttggtaccac tgta agtcccaaag tattatatat 204; ttgataataa tgctaatcat aattggaaag taacattcta tatgtaaatg ttat 210; ttgccaactg aatataggca atgatagtgt gtcactatag ggaacacaga tttttgagat 216; ctct ggaagctggt aacaattaaa aacaatctta aggcagggaa aaaaaaaaaa 222; aaaaaa l8. MAP4: WAP4 microtubule—associated protein 4 [ Homo sapiens LOCJS WM_001134364 (isoform 4) AA/1ransla:ion="WAD-S-ADALi*PSPDI*G*IK?DbIAi-*A*AbDDVVGL VGKA.

T'i—K/l<}2"UU'TI—]DYIPL-DV3*KiGNS*SKKKPCS*iSQI*DiPSSKP --AWGG4GVLGSD iGSP L d*KMAYQ‘YPWSQWWPLDLWECbQPLQVVDPIQTDPFKWYHDDDLADLVFPSSA"A TSIFAGQVJPLKDSYGMSPCWTAVVPQGWSVTA-VSP{S*SEVSP*AVA*PPQPLAV *-AK*I*WAS**RPPAQA-TIMWGLKTTDMAPSK* *WA-AKDWA-A K *VA-AK *SP KLDV -AKDWQPSWTSDMALVKDM4LP *K‘VA-VKDVRWPLL NV P * *VAPAKJVL--K*i*QASPIKMDLAPSKDWGPPK‘VKK‘ *QASPIKMDLAP W EVKIVPAKJ-V--S*I‘VAQANDIISSL‘ISSA‘KVA-SS* *VALARDM .PP* VVI- K3KA-P-‘A‘VAPVKDMAQ-P‘ *IAPAKDVAPS VKLVGLLKDWSP U] * *MALGKDVLPPP‘i‘VV-IKVVCLPP*M*VA- *DQVPA-K *APLAKDGV."L WAWV"PAK3VPP-S*i*A PVPIKJWLIAQLQKGIS‘DSi-‘S-QDVGQSAAPTFWIS wD iV GiGKKCS-PA“)SV.4KLGA. QKPCVSQPS U] U] U] G) H CU QP**GRPVVSG"G Z u I PPNKdLPPSP‘KK KP-AiiQPAK SiSKAK QPiSLPKQPAPTTIGGLNKKP ZU] PAAPPKRPAVASARPSILPSKDVKPKPIADAKAPEKRASPSKPASAPASR m GSKSTQ"VAKTTTAAAVASTGPSSRSPS"LLPKKP AIK *GKPA‘VKKMLAKSVPA ULSRPKS"STSSMKKTTTLSGTAPAAGVVPSRVKATPWPSRPSTTPFIDKKPTSAKPS S ATNTSAPDLKNVRSKVGSTENIKHQPGGGRAKV*KKl‘AAAiiRKPLSN AV"KTAGPIASAQKQPAGKVQIVSKKVSYSHIQSKCGSKDNIKHVPGGGNVQIQNKKV DISKVSSKCGSKANIKHKPGGGDVKI 4SQKLNEK4KAQAKVGSLDNVGHLPAGGAVKI L YRLibRANARARTDHGADIVSRPP {FPGGPNSGSRVLGPLSRAVi CDNA: aggccccacc tggc ccggtccgcg tgtgcgccga ctctcgcact ctcctcgctc 6; cgggcgccca gaga gagagcgacg gccatcatag aacagcgaag gcagtcgatc 12; ggcctgcggt ccgcttcggc attcgcaggc cgcaggcggg aggctagagc ccccaggcgc 18; acctcgcccc aaccgcccgc ggctcgggca gctctgcaga gacgtcgtgg cggcagggcc 24; agcacccatt ggtccgccac agccctccgc cctccccctc gccacgctta ttggccggag ; cggcggcgct cgccgggtag gcggtggng Cgtccctccc ttgcgccggc gagg 36; cgacgaggaa gcggccgcct ccctgcgccc cgcccctccg gctagctcgc ccgg 42; ctcctcccga cgtctcctac ctcctcacgg ctcttcccgg cgctctcctg gctcccttct 48; gccccagctc ggcg gcggcgggca gttgcagtgg tgcagaatgg ctgacctcag 54; tcttgcagat acag aaccatctcc agacattgag ggagagataa agcgggactt 60; cattgccaca ctagaggcag aggcctttga tgatgttgtg ggagaaactg ttggaaaaac 66; tatt ctgg atgttgatga gaaaaccggg aactcagagt caaagaagaa 72; accgtgctca gaaactagcc agattgaaga tactccatct tctaaaccaa cactcctagc 78; caatggtggt catggagtag 
aagggagcga tactacaggg actg aattccttga 84; agagaaaatg gcctaccagg aatacccaaa tagccagaac tggccagaag actt 90; ttgtttccaa caag tggtcgatcc gact gatcccttta agatgtacca 96; tgatgatgac ctggcagatt tggtctttcc ctccagtgcg acagctgata tatt L02; tgcaggacaa aatgatccct tgaaagacag ttacggtatg tctccctgca ctgt L08; tgtacctcag gggtggtctg cctt aaactctcca cactcagagt cctttgtttc L14; cccagaggct gttgcagaac ctcctcagcc aacggcagtt cccttagagc tagccaagga L20; gatagaaatg gcatcagaag agaggccacc agcacaagca ttggaaataa tgatgggact L26; gaagactact gacatggcac catctaaaga aacagagatg gccctcgcca aggacatggc L32; actagctaca aaaaccgagg tggcattggc taaagatatg gaatcaccca ccaaattaga L38; tgtgacactg gccaaggaca tgcagccatc catggaatca gatatggccc tagtcaagga L44; catggaacta cccacagaaa aagaagtggc cctggttaag gatgtcagat ggcccacaga L50; aacagatgta tcttcagcca agaatgtggt actgcccaca gaaacagagg tagccccagc L56; caaggatgtg acactgttga aagaaacaga gagggcatct cctataaaaa tggacttagc L62; cccttccaag gacatgggac cacccaaaga aaacaagaaa gaaacagaga gggcatctcc 168; tataaaaatg gacttggctc cttccaagga catgggacca cccaaagaaa acaagatagt 174; cccagccaag gatttggtat tactctcaga aatagaggtg gcacaggcta atgacattat 180; atcatccaca gaaatatcct agaa ggtggctttg tcctcagaaa cagaggtagc 186; cctggccagg gacatgacac cgga aaccaacgtg atcttgacca aggataaagc 192; actaccttta gaagcagagg cagt caaggacatg gctcaactcc cagaaacaga 198; aatagccccg gccaaggatg tggctccgtc cacagtaaaa gaagtgggct tgttgaagga 204; catgtctcca ctatcagaaa cagaaatggc tctgggcaag gatgtgactc caga 210; aacagaagta atca agaacgtatg tctgcctcca gaaatggagg tggccctgac 216; tgaggatcag gtcccagccc tcaaaacaga agcacccctg gatg gggttctgac 222; cctggccaac aatgtgactc aaga tgttccacca ctctcagaaa cagaggcaac 228; tcca gaca tggaaattgc acaaacacaa aaaggaataa attc 234; ccatttagaa tctctgcagg atgtggggca gtcagctgca cctactttca tgatttcacc 240; agaaaccgtc acaggaacgg ggaaaaagtg cagcttgccg gccgaggagg attctgtgtt 246; agaaaaacta ggggaaagga aaccatgcaa cagtcaacct cttt cttcagagac 252; ctcaggaata gccaggccag aagaaggaag gcctgtggtg agtgggacag gaaatgacat 258; 
caccacccca ccgaacaagg agctcccacc aagcccagag aagaaaacaa agcctttggc 264; caccactcaa cctgcaaaga cttcaacatc gaaagccaaa acacagccca tccc 270; taagcagcca gctcccacca ccattggtgg gttgaataaa aaacccatga gccttgcttc 276; aggcttagtg ccagctgccc cacccaaacg ccctgccgtc gcctctgcca ggccttccat 282; cttaccttca aaagacgtga agccaaagcc cattgcagat gcaaaggctc ctgagaagcg 288; ggcctcacca tccaagccag cttctgcccc agcctccaga tctgggtcca agagcactca 294; gactgttgca aaaaccacaa cagctgctgc tgttgcctca actggcccaa gcagtaggag 300; cccctccacg ctcctgccca agaagcccac tgccattaag actgagggaa aacctgcaga 306; agtcaagaag gcaa agtctgtacc agctgacttg agtcgcccaa agagcacctc 312; caccagttcc atgaagaaaa ccaccactct cagtgggaca gcccccgctg tggt 318; tcccagccga gtcaaggcca cacccatgcc ctcccggccc tccacaactc ctttcataga 324; caagaagccc acctcggcca gctc caccaccccc cggctcagcc gcctggccac 330; caatacttct gctcctgatc tgaagaatgt ccgctccaag gttggctcca cggaaaacat 336; caagcatcag cctggaggag gccgggccaa agtagagaaa gagg cagctgctac 342; aacccgaaag cctgaatcta atgcagtcac agcc ggcccaattg caagtgcaca 348; gaaacaacct gcggggaaag tccagatagt ctccaaaaaa gtgagctaca gccatattca 354; gtccaagtgt ggttccaagg ttaa ccct ggaggtggta atgttcagat 360; tcagaacaag aaagtggaca tctctaaggt ctcctccaag tgtgggtcta aggctaacat 366; caagcacaag cctggtggag gagatgtcaa gattgaaagt cagaagttga acttcaagga 372; gaaggcccag gccaaggtgg gatccctcga taatgtgggc cacctacctg caggaggtgc 378; tgtgaagatt taca ggctgacgtt ccgggcaaat gccagggccc gcaccgacca 384; cggggccgac tccc gccccccaca cttccctggc ggccccaact cgggctcccg 390; ggtccttggc cccctttccc gggctgtcca ctagaccagt gagcgcttgg gcgccgtgct 396; gggcagcccg ctaggctcgc cttccctcct gctttgcgtg cccggggcag cagcagccct 102; gccccacacc tcctctcact cctg ggcccatctc cctgctttgg tcttgcccca 108; tcactgcgcc actgctccgt ggaggaggtt gggagggggt tggggtggtt gaggctaagt 114; tgggatctag gagaggagaa ccagattcta tcctcatctt tttttggttc tttggtccaa 120; acccaaaaga aactgacatg ccctcccttc tccctggatc tacctggagg gaagagtgga 126; ggtggattcc gagtggtgac ctga ccgtggagct taagccactg cctctccctc Z32; tggtcccaca 
aatgggcgcc tccc catgcaggtg gtgtcgggcc cttcttgctg Z38; ccctgcccca agttgggggt cagtgctgcc tgtccccatg cttaacatac ccgcctagct Z44; acat ttttcttgtt ttgtcctttt ttct aataacctaa aaactggcaa 150; aatagttctg caggttgaag ccatgtctac atgaaagtcc tcagtaagtg ttagagggaa 156; cagggcggag atatccttat gccacccccg ctggaggatg tgggcagctt agggccctgg aggcggtgcg gcagggaaga ggggtgcaga ggctgtggct ggtgagccgg aaggggccct tggagcgtgg actggttggt tttgccattt tgttgtgtgt atgctgcttt Z74; tcttttctaa ggct ggttttggca tctctgtccc attccctggg atctggtggt 180; cagccctagg agcc agggctggag aacaagaaag ggccaggaga tggaattcct 186; tcaggccggc acccacaccc atgt aagccctcat gtccaaggga gcctcatgca Z92; gatagtagga aatcaggtct ggaaatttaa aaataaaagg catgagacta aggctatctg Z98; cttcccttat gccctgactg gagaggggag ggaggagagg ccac agagggcatc 504; ccagctaggc cttgggatgg ctgcagtgag tccc gggaactgta ttgacacaaa 510; gattcttatt gcacttgtat tttttgtatt aaagtttgca tggtttctaa taaaggattc 516; aagt ttgtagtgaa atggcctggg agattccaag tctg gagggggatt 522; agtg tgcc tctgaggagg ctgccccaga cttggcctcc tcatgccccc 528; tcctgacctc tgcccttctc tggtcctggc atccctggag aaggtagggg tcttgaccta 534; gatt tgatctccat gtgcagggag gctgtcctgg gcctgacagg tcctccccct 540; ttctgaggta gcagtgcctt gtggaggttt gacaccatgt ccctagctcc ccaagcacac 546; aaac tgcaggggct cacggaggaa gtgctgcctg ggccaggggg tttc 552; ctccgtagag accatgtgca gaacacttct ccaa gagg gagccagtgt 558; tttgtcagca ggaagaaagg gcctgctggg atgaaagtgg gaaggaaaca cgta 564; gtcaggagac acctcagggg caacagcaca ggcccagagt acctgctgcc tccactgcgt 570; ctgtcctggg gtcatgagga tgctgaggtt gacgacaggt tcct ttcactcctt 576; tggccaaagg ttgggggtag gtggcccaag tggcgtgctc tctaggtaga caacaggagt 582; ggtcagagtt ccctcaaagg atcctccact ccagagcacc tgagaaggcc gggaccagag 588; gccctgtgtg atgtgtactc cgcagctgtt tggggtggga catttctgta cttctcgatt 594; tgcttatggc atta cctgtgtcag tccatgattc tgttgtaaca gttttaagag 600; taaataaa:a aagctgcctg atgt cccatc acgcagaaaa aaaaaa LOCJS WM_002375 (isoform 1) AA /transla :ion="MADLSLA DALL‘PSPDldelKRDhIAi-*A*AEDDVVG4L iVGK I—U)<}U"UU'TI—] .S 
IPL.DV34 K *SKKKPCS *iSQI L DiPSSKP --AWGG{GV.GSD4 iGSP 4 4 .4 .44KMAYQ *YPVSQWWPLDLVECEQPLQVVDPIQTDPFKWYHDDD4ADLVFPSSA"A IFAGQWJP4K PCVTAVVPQGWSVTA .VSPiS45bVSP4AVA4PPQPiAV .AK‘I‘WAS *RPPAQA.TIMWGLKTT DMAPSK 4 DWA-A K *VA-AK *SP KLDV SDMALVKDM 4L? 4K i 4 .4 DVSSAKNV P 4 *VAPAK QASPIKMDLAPSK 4 KASPIKMD4AP W DWGPPKE .S *VAQAN DIISS .55 4 *VALARDM .PP 4 VVI. *VAPVKDMAQ .P* VKLVGLLKDWSP U] 4 *MALGKDV .IKVVCLPP *M‘VA- *APLAKDGV."L w.AVV"PAK3VPP PVPIKJWLIAQ iQKGIS DVGQSAAPTFWIS wD 1V S DSV-‘KLG 4 QKPC\ISQPS L SGIARP4 4 GRPWSG"G A u I PPNK 4LPPSP *KK KP-AiiQPAK SiSKAK QPiS4PKQPAPTTIGGLNKKP SU] 4ASGLVPAAPPK RPAVASA QPSI4PSKD DAKAPEKRASPSKPASAPASR m GSKSiQ VAKi iAAAVASTGPSS QSPS". AIK *GKPAdVKKM AKSVPA u 4SRPKS"STSSMKKTTTLSGTAPAAGVVPSRVKATPWPSRPSTTPFIDKKPTSAKPS ST"PR-SR .ATN"SAP D 4KNV TEN IKHQPGGGRAKV4KKi4AAAi RKPLSN AV"KTAGPIASAQKQPAGKVQIVSKKVSYS iIQSKCGSKDNIKHVPGGGNVQIQNKKV DISKVSSKCGSKANIK {KPGGG QK .NbKdKAQAKVGSLDNVGHLPAGGAVKT *GGGS‘AP .CPGPPAG L *PAIS‘AAP‘AGAPiSASGLNGHPT4SGGGDQREAQTLDSQ IQETSI CDNA: aggccccacc cgctggtggc ccggtccgcg tgtgcgccga ctctcgcact ctcctcgc:c 6; cgggcgccca gactctgaga gagagcgacg gccatcatag aacagcgaag gcagtcga:c 12; cggt ccgcttcggc attcgcaggc cgcaggcggg aggctagagc ccccaggcgc 18; acctcgcccc aaccgcccgc ggct cgggca gctctgcaga gacgtcgtgg cggcagggcc 24; agcacccatt ggtccgccac agccctccgc cctccccctc gccacgctta ttggccggag ; cgct cgccgggtag gcggtggcgg cgtccctccc cggc cctcaagagg 36; cgacgaggaa gcggccgcct ccctgcgccc cgcccctccg gctagctcgc 31 8 tggctcccgg 42; ctcctcccga cgtctcctac ctcctcacgg ctcttcccgg cgctctcctg gctcccttct 48; gctc cgtctcggcg gcggcgggca gtgg tgcagaatgg ctgacctcag 54; agat gcattaacag aaccatctcc agacattgag ggagagataa agcgggactt 60; cattgccaca ctagaggcag aggcctttga tgatgttgtg ggagaaactg ttggaaaaac 66; tatt cctctcctgg atgttgatga gaaaaccggg aactcagagt caaagaagaa 72; accgtgctca gaaactagcc aaga tactccatct tctaaaccaa cactcctagc 78; caatggtggt catggagtag aagggagcga tactacaggg tctccaactg aattccttga 84; agagaaaatg gcctaccagg aatacccaaa tagccagaac 
tggccagaag ataccaactt 90; ccaa cctgagcaag tggtcgatcc tatccagact ttta agatgtacca 96; tgatgatgac ctggcagatt ttcc ctccagtgcg acagctgata cttcaatatt L02; acaa aatgatccct tgaaagacag ttacggtatg tgca acacagctgt L08; tgtacctcag gggtggtctg tggaagcctt aaactctcca cactcagagt cctttgtttc L14; cccagaggct gttgcagaac ctcctcagcc aacggcagtt cccttagagc tagccaagga L20; gatagaaatg gaag agaggccacc agca ttggaaataa tgatgggact L26; gaagactact gacatggcac catctaaaga aacagagatg gccctcgcca aggacatggc L32; actagctaca aaaaccgagg tggcattggc taaagatatg ccca ccaaattaga L38; tgtgacactg gccaaggaca tgcagccatc catggaatca gatatggccc tagtcaagga L44; catggaacta cccacagaaa aagaagtggc cctggttaag gatgtcagat ggcccacaga L50; aacagatgta tcttcagcca agaatgtggt actgcccaca gaaacagagg tagccccagc L56; caaggatgtg acactgttga aagaaacaga gagggcatct cctataaaaa tggacttagc L62; cccttccaag gacatgggac cacccaaaga aaacaagaaa gaaacagaga gggcatctcc L68; tataaaaatg gacttggctc cttccaagga catgggacca cccaaagaaa acaagatagt L74; cccagccaag gatttggtat tactctcaga ggtg gcacaggcta atgacattat L80; atcatccaca gaaatatcct ctgctgagaa ggtggctttg tcctcagaaa cagaggtagc L86; cctggccagg acac tgcccccgga aaccaacgtg atcttgacca aggataaagc L92; actaccttta gaagcagagg tggccccagt caaggacatg gctcaactcc cagaaacaga L98; aatagccccg gatg tggctccgtc cacagtaaaa gaagtgggct tgttgaagga 204; tcca ctatcagaaa cagaaatggc tctgggcaag gatgtgactc cacctccaga 210; aacagaagta atca agaacgtatg tctgcctcca gaaatggagg tggccctgac 216; tgaggatcag gtcccagccc tcaaaacaga agcacccctg gctaaggatg gggttctgac 222; cctggccaac aatgtgactc cagccaaaga tgttccacca ctctcagaaa cagaggcaac 228; accagttcca attaaagaca ttgc acaaacacaa aaaggaataa attc 234; ccatttagaa tctctgcagg atgtggggca gtcagctgca cctactttca cacc 240; agaaaccgtc acaggaacgg ggaaaaagtg gccg gagg attctgtgtt 246; agaaaaacta ggggaaagga aaccatgcaa cagtcaacct tctgagcttt cttcagagac 252; ctcaggaata gccaggccag gaag gcctgtggtg agtgggacag gaaatgacat 258; caccacccca ccgaacaagg agctcccacc aagcccagag acaa agcctttggc 264; caccactcaa cctgcaaaga cttcaacatc 
gaaagccaaa acacagccca cttctctccc 270; taagcagcca gctcccacca ccattggtgg taaa aaacccatga gccttgcttc 276; aggcttagtg ccagctgccc cacccaaacg ccctgccgtc gcctctgcca ggccttccat 282; cttaccttca aaagacgtga agccaaagcc cattgcagat gcaaaggctc ctgagaagcg 288; ggcctcacca tccaagccag cttctgcccc agcctccaga tctgggtcca agagcactca 294; gactgttgca aaaaccacaa cagctgctgc tgttgcctca actggcccaa gcagtaggag 300; cccctccacg ctcctgccca ccac tgccattaag actgagggaa aacctgcaga 306; agtcaagaag atgactgcaa agtctgtacc agctgacttg agtcgcccaa agagcacctc 312; caccagttcc atgaagaaaa ccaccactct gaca gcccccgctg caggggtggt 318; tcccagccga gtcaaggcca cacccatgcc ctcccggccc tccacaactc ctttcataga 324; caagaagccc acctcggcca aacccagctc caccaccccc cggctcagcc gcctggccac 330; caatacttct gctcctgatc tgaagaatgt ccgctccaag gttggctcca cggaaaacat 336; caagcatcag cctggaggag gccgggccaa agtagagaaa aaaacagagg cagctgctac 342; aacccgaaag cctgaatcta tcac taaaacagcc ggcccaattg caca 348; gaaacaacct gcggggaaag tccagatagt ctccaaaaaa gtgagctaca gccatattca 354; gtccaagtgt ggttccaagg ttaa gcatgtccct ggaggtggta atgttcagat 360; tcagaacaag aaagtggaca tctctaaggt ctcctccaag tgtgggtcta aggctaacat 366; caagcacaag ggag gagatgtcaa gattgaaagt cagaagttga acttcaagga 372; gaaggcccag gccaaggtgg gatccctcga taatgtgggc cacctacctg caggaggtgc 378; tgtgaagact gagggcggtg gcagcgaggc tcctctgtgt ccgggtcccc ctgctgggga 384; ggagccggcc atctctgagg cagcgcctga agctggcgcc cccacttcag ccagtggcct 390; caatggccac cccaccctgt cagggggtgg tgaccaaagg caga ccttggacag 396; ccagatccag gagacaagca tctaatgatg tggt ctcgtcttcc gtctcccccg 402; tgttcccctc ttgtctcccc tgttcccctc ccct cctcccatgt cactgcagat 108; tgagacctac aggctgacgt tccgggcaaa tgccagggcc cgcaccgacc acggggccga 114; cattgtctcc cgccccccac acttccctgg cggccccaac tcgggctccc gggtccttgg 120; ccccctttcc cgggctgtcc ccag cttg gtgc tgggcagccc 126; gctaggctcg ccttccctcc tgctttgcgt gcccggggca gcagcagccc tgccccacac Z32; ctcctctcac tccccagcct gggcccatct ccctgctttg gtcttgcccc atcactgcgc Z38; cactgctccg aggt tgggaggggg tggt tgaggctaag 
ttgggatcta Z44; ggagaggaga ttct atcctcatct ttttttggtt tcca aacccaaaag 150; aaactgacat gccctccctt ctccctggat ctacctggag ggaagagtgg aggtggattc 156; cgagtggtga caggacgctg accgtggagc ttaagccact gcctctccct ctggtcccac Z62; aaatgggcgc ccccccctcc ccatgcaggt ggtgtcgggc ccttcttgct gccctgcccc Z68; gggg tcagtgctgc ctgtccccat gcttaacata cccgcctagc tgctgtcaca Z74; ttgt tttgtccttt tatttttttc taataaccta aaaactggca aaatagttct 180; gcaggttgaa gccatgtcta catgaaagtc ctcagtaagt gttagaggga acagggcgga 186; gatatcctta tgccaccccc gctggaggat gtgggcagct tagggccctg gaggcggtgc Z92; ggcagggaag aggggtgcag aggctgtggc tggtgagccg gtcaggcaca caaggggccc Z98; ttggagcgtg gactggttgg ttttgccatt ttgttgtgtg tatgctgctt ttcttttcta 504; accaagaggc tggttttggc gtcc cattccctgg gatctggtgg tcagccctag 510; gataaaaagc tgga gaacaagaaa gggccaggag atggaattcc ttcaggccgg 516; cacc ctaggacatg taagccctca tgtccaaggg agcctcatgc agatagtagg 522; aaatcaggtc tggaaattta aaaataaaag gcatgagact aaggctatct gcttccctta 528; tgccctgact ggagagggga gggaggagag gcaaggccca cagagggcat cccagctagg 534; ccttgggatg gctgcagtga ggagaaatcc ctgt attgacacaa agattcttat 540; tgcacttgta ttttttgtat taaagtttgc atggtttcta ataaaggatt caaacataag 546; tttgtagtga aatggcctgg gagattccaa gggcttctct ggagggggat tggctgcagt 552; gtagatttgc ggag gctgccccag acttggcctc cccc ctcctgacct 558; ctgcccttct ctggtcctgg catccctgga gaaggtaggg acct aagtttagat 564; ttgatctcca tgtgcaggga ggctgtcctg ggcctgacag gtcctccccc tttctgaggt 570; agcagtgcct tgtggaggtt tgacaccatg tccctagctc cccaagcaca caccaggaaa 576; ctgcaggggc tcacggagga gcct gggccagggg gaccagcttt cctccgtaga 582; gaccatgtgc agaacacttc tgctgtgcca agaacatgag ggagccagtg ttttgtcagc 588; aggaagaaag ggcctgctgg gatgaaagtg ggaaggaaac agggttgcgt agtcaggaga 594; cacctcaggg gcaacagcac aggcccagag ctgc ctccactgcg tctgtcctgg 600; ggtcatgagg atgctgaggt tgacgacagg ttccaggtcc tttcactcct ttggccaaag 606; gttgggggta ggtggcccaa gtggcgtgct ctctaggtag acaacaggag tggtcagagt 612; tccctcaaag gatcctccac tccagagcac ctgagaaggc cgggaccaga ggccctgtgt 618; 
gatgtgtact ccgcagctgt ttggggtggg acatttctgt acttctcgat ttgcttatgg 624; ctcagccatt acctgtgtca gtccatgatt ctgttgtaac agttttaaga gtaaataaat 630; aaagctgcct gatgtcccat cacgcagaaa aaaaaaa LOCUS NM_O30885 (isoform 3) AA/transla :ion H WADLSLADALL *PSPDI‘G *IKRDEIALL 4A *AEDDVVGLLVGK TDYIPLLDVD *KiGNS *SKKKPCS *iSQI *DiPSSKPiLLANGGHGV *GSDii‘A CDNA: aggccccacc cgctggtggc ccggtccgcg ccga cact ctcctcgctc 6; cgggcgccca gactctgaga gagagcgacg gccatcatag aacagcgaag gcagtcgatc 12; ggcctgcggt ccgcttcggc attcgcaggc cgcaggcggg aggctagagc ccccaggcgc 18; cccc aaccgcccgc ggctcgggca gctctgcaga gacgtcgtgg cggcagggcc 24; agcacccatt ggtccgccac agccctccgc cctccccctc gccacgctta ttggccggag ; cgct cgccgggtag gcggtggcgg cgtccctccc cggc cctcaagagg 36; cgacgaggaa gcggccgcct ccctgcgccc tccg gctagctcgc tggctcccgg 42; ctcctcccga cgtctcctac ctcctcacgg ctcttcccgg cgctctcctg gctcccttct 48; gccccagctc ggcg ggca gttgcagtgg tgcagaatgg ctgacctcag 54; tcttgcagat gcattaacag aaccatctcc tgag ggagagataa agcgggactt 60; cattgccaca ctagaggcag aggcctttga tgatgttgtg ggagaaactg ttggaaaaac 66; agactatatt cctctcctgg atgttgatga cggg aactcagagt caaagaagaa 72; accgtgctca gaaactagcc agattgaaga tactccatct tctaaaccaa cactcctagc 78; caatggtggt catggagtag aagggagcga tactacagaa gcctagcgtg tctctcaaca 84; ctggggctgc tgcaacacca gaccagtgat ctttcctaag catcgttata cttctaaaac 90; cttcagcatt ttgcagagct ttgcttttca ttcctggaca tgatgtagaa gagg 96; gtagttcttc ggggcctatt tctgctgatg cctgagcaaa caacctgctt cctcttgtgc 102; tctgcagggt gagc ctcatttccc aaca caaagtgcaa aatgaattct 108; ttttaatttt tttt acaaaggtta tctaatgtct tttatttctt gttttcttta 114; tgattttatc ttca ttctcacatt tttttccttt aaatattttt agttgacctt L20; tttcctttgg ttttcaaatg ttcaacatga atcagaatag tgtaacacca aatgagaaca L26; ttca taaaggggtt gaggccacca gtactgcagc gaatttcctt ttcttctccc L32; tcctccttcc ttctctgagc ttgcttttag ggaaggttaa tcttacaggc tacctatgtt L38; tctctccacc ttactaaaat ctaaataatg atagatattt taagttttta aattgagtag L44; ttctgagtaa aata tttttccaaa ttaaataatc ctttattatt ttgg L50; 
gccaaatttt tttttttttg gagacggact aatc taagattgtt ggac L56; tttcttattc ccattcctaa ttttttcaaa ctaattgctt aaatctagaa ccagttgaga L62; ttagtactgt acaatggtat gctttgattg tatttataga acat aaaacatgga L68; ccatgttttg taga ggaattctgg tttaaaatct gaaatacttt aaagttttct L74; atccttttac tgattatgca gcttcttata acccccaagg tacagattat ttaa L80; aagaaataat catg ttctgagaaa gattttgaga tatacattgt tttttgtttt L86; tgagacaggg tctcactctt gcctaggctg gagtgcggtg gcgcgatctt ggcttactgc L92; accctctgcc tcccaggttc aagtggttct cctgcctcag cctcccaagt agctgggatt L98; ataggtgtgc gccaccacac cagctaattt ttgtattttt agtagagacg gggttttact 204; ttgttggcca ggctggtctc gaactcctga gtga tccacccgcc ttggccaccc 210; aaagtgctgg gattacaggt cact gtgcctggcc agatatacat tgttttagat 216; cccctgatac agaactactt ttgagatggt agaa tatcatgaaa gtttaaactg 222; aatccttgca gcgacttccg agga agcattccct cttctacagg ggtctggtca 228; caagggctgg attctctgac aaaaatttct tctatggatt ctggacagaa gacacttgag 234; atgt tttaaaggaa agaggttatg cttatcttct tagcagttga taaaatatta 240; aaactctctt gccatagaga atacaccatc aaagaaaaca tttc cctttggtgt 246; acagttattt attaagttga ttttggggtt tctttcgcca tttt caaaactggt 252; agtagttatt tttaaaaatc atgtttgaca tctttctatt gctcgtaacc agtccctgta 258; gctgtctaag ttatgggtgg agaagcctgg gtatgacttt ccgttgtgta cactcacact 264; tcatgattga catttattta ttcttttatt tccatttggt tatgcttatt tttgtttgaa 270; atttgttttt ttcaaatatg tttccttttt gaacttacag aattgttgaa attttctact 276; aacagccagc taaaatttgg tatatgttag ctctatctgt ttcacttgga cgtttcattt 282; tgaaagaaag aaattttatg tttcacatat agttttatac aaagtagcca gtcccataat 288; gaaatgctgt attgccatag tggtcacacc caagtggtcc agtatctcaa tggtgaggca 294; gccagactgg tcagggctgc tttgttgaaa tgtgatgatt ttcatatgcc ttctttctct 300; ttctctctct cttttttttt ccttttttgg cccaatgttg aagatgtaga actttgtttt 3061 taaataatgt ttttataatt tcattcgtat acctaagttt gtattttttg tgactttgga 3121 cttcaacagt attg ggacttctaa tgtgattact gtactaaata aattccacta 3181 atct aaca ccttaaaaaa aaaaaaaaaa aaaa 19. 
MAPKl: WAPKl mitogen—activated protein kinase 1 [ Homo sapiens ] LOCJS WW_002745 (isoform 1) AA/:ranslation="MAAAAAAGAGPEMVRGQVFDVGPRYTN.SYIGTGAYGMVCSAYD NVNKVRVAIKKISPbLHQLYCQRLLRdIKIL.RFRHTNIIGINDIIRAPTIEQMKDVY IVQDLMTTD.YK..KTQH.SNDHICYF.YQI.RGLKYIHSANVAHRDLKPSVLLLNTT CDLKICDFGAARVADPDHDHTGFLTEYVATRWYRAPTIM.NSKGYTKSIDIWSVGCIA ATM-SNRPIFPGKiY.DQ.NHILGI.GSPSQTDLNCIIN.KARVY.LSLPHKNKVPWW DSKA.D..DKMLTFNPHKR14V4QA.AHPY.TQYYDPSDdPIAdAPbeDM; LDD.PKdKLKdﬂlbddiARbQPGYRS CDNA: gcccctccc: ccgcccgccc gccggcccgc ccgtcagtct ggcaggcagg caggcaa:cg 61 gtgg ctgtcggctc ttcagctctc ccgctcggcg tcttccttcc tcctcccggt 121 cagcgtcggc ggctgcaccg gcggcggcgc agtccctgcg ggaggggcga caagagctga 181 gccg ccgagcgtcg agctcagcgc ggcggaggcg gcggcggccc ggcagccaac 24; atggcggcgg ngngngC gggcgcgggc ccggagatgg tccgcgggca ggtgttcgac ; gtggggccgc gctacaccaa cctctcgtac atcggcgagg gcgcctacgg catggtgtgc 361 tctgcttatg tcaa caaagttcga gtagctatca agaaaatcag cccctttgag 421 caccagacct actgccagag aaccctgagg gagataaaaa tcttactgcg cttcagacat 481 gagaacatca ttggaatcaa tgacattatt cgagcaccaa ccatcgagca aatgaaagat 541 gtatatatag tacaggacct catggaaaca gatctttaca agctcttgaa gacacaacac 60; ctcagcaatg accatatctg ctattttctc taccagatcc tcagagggtt aaaatatatc 661 cattcagcta acgttctgca ccgtgacctc aagccttcca acctgctgct cacc 721 tgtgatctca agatctgtga ctttggcctg gttg cagatccaga ccatgatcac 781 acagggttcc tgacagaata tgtggccaca taca gggctccaga aattatgttg 841 aattccaagg gctacaccaa gtccattgat atttggtctg taggctgcat tctggcagaa 90; atgctttcta acaggcccat ctttccaggg aagcattatc ttgaccagct catt 961 ttgggtattc ttggatcccc atcacaagaa gacctgaatt gtataataaa tttaaaagct 1021 aggaactatt tgctttctct tccacacaaa aataaggtgc catggaacag gctgttccca 1081 aatgctgact ccaaagctct attg atgt tgacattcaa cccacacaag 1141 aggattgaag tagaacaggc tctggcccac ccatatctgg agcagtatta cgacccgagt 1201 gacgagccca tcgccgaagc accattcaag ttcgacatgg atga cttgcctaag L26; gaaaagctca aagaactaat ttttgaagag actgctagat tccagccagg atacagatct L32; taaatttgtc aggacaaggg ctcagaggac tggacgtgct 
cagacatcgg tgttcttctt L38; tctt gacccctggt cctgtctcca gcccgtcttg ccac tttgactcct L44; ttgagccgtt tggaggggcg gtttctggta gctt ttatgctttc aaagaatttc L50; ttcagtccag agaattcctc ctggcagccc tgtgtgtgtc acccattggt gacctgcggc L56; agtatgtact tcagtgcacc tactgcttac tgttgcttta g:cactaatt gctttctggt L62; ttgaaagatg cagtggttcc tccctctcct gaatcctttt c:acatgatg ccctgctgac L68; catgcagccg caccagagag agattcttcc ccaattggct c:agtcactg gcatctcact L74; ttatgatagg gaaggctact acctagggca ctttaagtca g:gacagccc cttatttgca L80; cttcaccttt tgaccataac ccca gagc t:gtggaaat accttggctg L86; atgttgcagc ctgcagcaag tgcttccgtc tccggaatcc agca cttgtccacg L92; tcttttctca tatcatggta gtcactaaca tatataaggt a:gtgctatt ggcccagctt L98; ttagaaaatg cagtcatttt tctaaataaa aaggaagtac tgcacccagc actc 204; tgtagttact gtggtcactt gtaccatata gaggtgtaac acttgtcaag aagcgttatg 210; tgcagtactt aatgtttgta agacttacaa aaaaagattt aaagtggcag cttcactcga 216; catttggtga gagaagtaca aaggttgcag tgctgagctg tgggcggttt ctggggatgt 222; cccagggtgg aactccacat gctggtgcat atacgccctt gagctacttc aaatgtgggt 228; gtttcagtaa ccacgttcca agga tttagcagag aggaacactg cgtctttaaa 234; agta tacaattctt tttccttcta cagcatgtca gcatctcaag tttc 240; aacctacagt ataacaattt aagc ctccaggagc tcatgacgtg tgtt 246; ctgtcctcaa gtactcaaat atttctgata ctgctgagtc agactgtcag aaaaagctag 252; cactaactcg tgtttggagc tctatccata ttttactgat ctctttaagt atttgttcct 258; gccactgtgt actgtggagt tgactcggtg ttctgtccca gtgcggtgcc tcctcttgac 264; ttccccactg ctctctgtgg tgagaaattt gccttgttca ataattactg cgca 270; tgactgttac agctttctgt gcagagatga ctgtccaagt gccacatgcc tacgattgaa 276; atgaaaactc tacc tctgagttgt gttccacgga aaatgctatc cagcagatca 282; tttaggaaaa ataattctat ttttagcttt tcatttctca gctgtccttt tttcttgttt 288; tgac agcaatggag ttat ataaagactg cctgctaata tgaacagaaa 294; tgcatttgta attcatgaaa ataaatgtac atcttctatc ttcacattca tgttaagatt 300; cagtgttgct ttcctctgga tcagcgtgtc tgaatggaca gtcaggttca ggttgtgctg 306; aacacagaaa tgctcacagg cctcactttg ccgcccaggc actggcccag cacttggatt 312; 
tacataagat gagttagaaa ggtacttctg tagggtcctt tttacctctg ctcggcagag 318; aatcgatgct gtcatgttcc tttattcaca atcttaggtc tcaaatattc tgtcaaaccc 324; taacaaagaa gccccgacat ctcaggttgg attccctggt tctctctaaa gagggcctgc 330; ccttgtgccc cagaggtgct gctgggcaca gccaagagtt gggaagggcc gccccacagt 336; acgcagtcct caccacccag cccagggtgc tcacgctcac cactcctgtg gaag 342; tggc tcatcctcgg aaaacagacc cacatctcta ttcttgccct gaaatacgcg 348; cttttcactt tcag agctgccgtc tgaaggtcca attg acgggacaca 354; gaaatgtgac tgttaccgga tgat tagtcagttt tcatttataa aaaagcattg 360; acagttttat tgtt tctttttaaa tggaaagtta ctattataag gttaatttgg 366; agtcctcttc taaatagaaa accatatcct tggctactaa catctggaga ctgtgagctc 372; attc cccttcctgg tactgtggag tcagattggc atgaaaccac taacttcatt 378; ctagaatcat tgtagccata agttgtgtgc tttttattaa tcatgccaaa cataatgtaa 384; ctgggcagag aatggtccta accaaggtac ctatgaaaag cgctagctat catgtgtagt 390; agatgcatca ttttggctct tcttacattt gtaaaaatgt acagattagg tcatcttaat 396; tagt gaac tcca ctatttgtat gttcaaataa gctttcagac 102; taatagcttt tttggtgtct aaaatgtaag caaaaaattc ctgctgaaac attccagtcc 108; tttcatttag tataaaagaa atactgaaca agccagtggg atggaattga aagaactaat 114; catgaggact ctgtcctgac acaggtcctc aaagctagca gagatacgca gacattgtgg 120; catctgggta gaagaatact gtattgtgtg tgcagtgcac agtgtgtggt acac 126; tcattccttc tgctcttggg cacaggcagt agag gtaaccagta gctacatgta gctcaccagt ggttttctct aaggaatcac aaaagtaaac tacccaacca 138; catgccacgt aatatttcag ccattcagag gttt tatt tgcttatatg Z44; ttaatatggt ttttaaattg gtaactttta tatagtatgg taacagtatg ttaatacaca 150; catacatacg cacacatgct ttgggtcctt ccataatact tttatatttg taaatcaatg 156; ttttggagca agtt taagggaaat atttttgtaa atgtaatggt tttgaaaatc Z62; tgagcaatcc ttttgcttat acatttttaa tgtg ctttaaaatt gttatgctgg tgtttgaaac atgatactcc tgtggtgcag atgagaagct ataacagtga atatgtggtt Z74; tctcttacgt cctt gacatgatgg gtcagaaaca aatggaaatc cagagcaagt 180; cctccagggt tgcaccaggt ttacctaaag gcct tttcttgtgc tgtttatgcg 186; tgtagagcac tcaagaaagt tctgaaactg ctttgtatct gctttgtact gttggtgcct 
492; tcttggtatt gtaccccaaa attctgcata gattatttag tataatggta agttaaaaaa 498; agga agattttatt aagaatctga atgtttattc attatattgt ttaa 504; cattaacatt tatttgtggt atttgtgatt tggttaatct gtataaaaat tgtaagtaga 510; aaggtttata ctta ttga tgttgtaaac gtacttttta ggat 516; tatttgaatg tttatggcac ctgacttgta aaaaaaaaaa aaaa aatccttaga 522; atcattaaat tgtgtccctg tattaccaaa ataacacagc accgtgcatg tatagtttaa 528; ttgcagtttc atctgtgaaa acgtgaaatt gtctagtcct tcgttatgtt ccccagatgt 534; cttccagatt tgctctgcat gtggtaactt gtgttagggc tgtgagctgt tcctcgagtt 540; ggat gtcagtgctc ctagggttct ccaggtggtt acct gtgg 546; gggggggggt tgcc cacgcccatc tcctcatcct cctgaacttc ccca 552; ctgctgggca gacatcctgg gcaacccctt ttttcagagc aagaagtcat aaagatagga 558; tttcttggac atttggttct tatcaatatt tatg taatgactta tttacaaaac 564; aaagatactg gaaaatgttt tggatgtggt gttatggaaa gagcacaggc cttggaccca 570; tccagctggg ttcagaacta ccccctgctt ataactgcgg ctggctgtgg gccagtcatt 576; ctgcgtctct gctttcttcc tctgcttcag actgtcagct gtaaagtgga agcaatatta 582; cttgccttgt atatggtaaa gattataaaa atacatttca actgttcagc atagtacttc 588; aaagcaag:a ctcagtaaat agcaagtctt tttaaa LOC JS NW 138957 (isoform 2) AA/ :ranslation="MAAAAAAGAGP EMVRGQVFDVGP RYTN-SYI G TGAYGMVCSAYD NVNKVRVAIKKISPELHQLYCQRLL? *IKIL.RFRH TNIIGIN DIIRAPTIEQMKDVY IVQDLMTTD.YK KTQH-SNDHICYF-YQI .RGLKYIHSANVAH RDLKPSWLLLNTT 0 DLKICDFGAARVA DPDHJHTGFLT44YVATRWYRAP TIM-NSKGYTKSIDIWSVGCI 4 WTM-SNRPIFPGKiY-DQ-NHILGI GSPSQ TDLNCIIN .KARVY.LSLPHKNKVPWW WJFPNADSKA-D . 
DKMLTFNPHKRI *V‘QA .AHPY TQYYDPSD4P IA‘APEKEDM LL —c DD-PK‘KLKdLIb 4 *iAREQPGYRS CDNA: l gcccctccc ccgcccgccc ccgc gtc cagg caggcaa :cg 6; gtccgagtgg ctgtcggctc ttcagctctc ccgctcggcg tcttccttcc tcctcccggt 12; cagcgtcggc ggctgcaccg gcggcggcgc agtccctgcg ggaggggcga ctga 18; gcggcggccg ccgagcgtcg agctcagcgc ggcggaggcg gcggcggccc ggcagccaac 24; atggcggcgg ngngngC gggcgcgggc ccggagatgg ggca ggtgttcgac ; gtggggccgc gctacaccaa cctctcgtac atcggcgagg gcgcctacgg catggtgtgc 36; tctgcttatg ataatgtcaa caaagttcga gtagctatca agaaaatcag cccctttgag 42; caccagacct actgccagag aaccctgagg gagataaaaa tcttactgcg cttcagacat 48; gagaacatca ttggaatcaa tgacattatt cgagcaccaa ccatcgagca aatgaaagat 54; gtatatatag tacaggacct catggaaaca gatctttaca agctcttgaa gacacaacac 60; ctcagcaatg accatatctg ctattttctc taccagatcc tcagagggtt aaaatatatc 66; cattcagcta acgttctgca ccgtgacctc aagccttcca acctgctgct caacaccacc 72; tgtgatctca agatctgtga ctttggcctg gcccgtgttg cagatccaga ccatgatcac 78; acagggttcc tgacagaata tgtggccaca cgttggtaca gggctccaga aattatgttg 84; aattccaagg gctacaccaa gtccattgat atttggtctg taggctgcat tctggcagaa 90; tcta acaggcccat ctttccaggg aagcattatc ttgaccagct gaaccacatt 96; ttgggtattc cccc atcacaagaa gacctgaatt gtataataaa tttaaaagct L02; aggaactatt tgctttctct tccacacaaa aataaggtgc acag gctgttccca L08; aatgctgact ccaaagctct attg atgt tgacattcaa caag L14; aggattgaag tagaacaggc tctggcccac ctgg agcagtatta cgacccgagt L20; gacgagccca tcgccgaagc accattcaag ttcgacatgg aattggatga cttgcctaag L26; gaaaagctca aagaactaat ttttgaagag actgctagat tccagccagg atacagatct L32; tgtc aggtacctgg agtttaatac agtgagctct agcaagggag gcgctgcctt L38; ttgtttctag aatattatgt aggt ccattatttt tttt ccaagctcct L44; tattggaagg tattttttta aatttagaat ttat agtt acatataaa . 
MARCKS: MA RCKS myristoylated alanine—rich protein kinase C substrate [ {omo sapiens ] LOCJS WM_002356 AA/:ranslation="WGAQbSKiAAKG‘AAA‘RPG*AAVASSPSKANGQENGHVKVNGD ASPAAA‘SGAK L *LQAVGSAPAADK L *PAAAGSGAASPSAA‘KG‘PAAAAAPEAGASP V‘K‘APA‘G‘AA‘PGSPLAA‘GdAASAASSiSSPKALDGAiPSPSWLiPKKKKKRFSF KKSEKLSGEShKKNKK‘AG‘GG‘A‘APAA‘GGKD*AAGGAAAAAAEAGAASGEQAAAP G4*AAAG‘dGAAGGDPQ‘AKPQ‘AAVAP‘KPPASD‘iKAA L *PSKV WWW *AGASA AACEAPSAAGPGAPP Q4AAPAL L *PAAAAASSACAAPSQ4AQP4CSP *AA* CDNA: cttgggcgtt ggaccccgca tc:tattagc aaccagggag :ctccat tttcctcttg 6; tctacagtgc ggctacaaat tttt tttattactt ctt:tttttt cgaactacac 12; ttgggctcct ttttttgtgc tcgacttttc cacccttttt ccc:ccctcc tgtgctgctg 18; ctttttgatc tcttcgacta aaattttttt atccggagtg tat:taatcg gttctgttct 24; gtcctctcca ccacccccac ccccctccct ccggtgtgtg tgccgctgcc gctgttgccg ; ccgccgctgc tgctgctcgc cccgtcgtta caccaacccg tttg tttcccctct 36; tggatctgtt gagtttcttt gttgaagaag ccagcatggg tgcccagttc tccaagaccg 42; cagcgaaggg agaagccgcc aggc ctggggaggc ggctgtggcc tcgtcgcctt 48; ccaaagcgaa cggacaggag aatggccacg tgaaggtaaa cggcgacgct gcgg 54; ccgccgagtc gggcgccaag ctgc aggccaacgg cagcgccccg gccgccgaca 60; agcc cgcggccgcc gggagcgggg cggcgtcgcc ctccgcggcc gagaaaggtg 66; agccggccgc cgccgctgcc cccgaggccg gggccagccc ggtagagaag gaggcccccg 72; cggaaggcga ggctgccgag cccggctcgc ccacggccgc ggagggagag gccgcgtcgg 78; ccgcctcctc gacttcttcg cccaaggccg gggc cacgccctcg cccagcaacg 84; agaccccgaa aaaaaaaaag aagcgctttt ccttcaagaa gtctttcaag ctgagcggct 90; tctccttcaa gaagaacaag aaggaggctg gagaaggcgg tgaggctgag gcgcccgctg 96; ccgaaggcgg caaggacgag gccgccgggg gcgcagctgc ggccgccgcc gaggcgggcg L02; cggcctccgg ggagcaggca gcggcgccgg gcgaggaggc ggcagcgggc gaggaggggg L08; tgg cgacccgcag gaggccaagc cccaggaggc cgctgtcgcg ccagagaagc L14; cgcccgccag cgacgagacc aaggccgccg aggagcccag caaggtggag aagg L20; aggc cggggccagc gccgccgcct gcgaggcccc ctccgccgcc gggcccggcg L26; cgcccccgga gcaggaggca gcgg ccgc cgca gcctcgtcag L32; cagc cccctcacag gaggcccagc ccgagtgcag agcc cccccagcgg L38; aggcggcaga 
gtaaaagagc aagcttttgt gagataatcg aagaactttt ctcccccgtt L44; tgtttgttgg agtggtgcca ggtactggtt ttggagaact tgtctacaac cagggattga L50; ttttaaagat gtcttttttt attttacttt tttttaagca ccaaattttg ttgttttttt L56; tttttctccc ctccccacag atcccatctc aaatcattct gttaaccacc attccaacag L62; gaga gcttaaacac cttcttcctc tgccttgttt ctcttttatt ttttattttt L68; tcgcatcagt attaatgttt ttgcatactt tgcatcttta ttcaaaagtg taaactttct L74; ttgtcaatct atggacatgc ccatatatga aggagatggg tgggtcaaaa agggatatca L80; aatgaagtga taggggtcac aatggggaaa ttgaagtggt gcataacatt gccaaaatag L86; tgtgccacta gaaatggtgt aaaggctgtc tttttttttt ttttttaaag aaaagttatt L92; accatgtatt ttgtgaggca ggtttacaac actacaagtc taag aaggaaagag L98; gaaaaaagaa aaaacaccaa tacccagatt taaaaaaaaa aaaacgatca tagtcttagg 204; agttcattta aaccatagga acttttcact tatctcatgt tacc agtcagtgat 210; taagtagaac tacaagttgt ataggcttta ttgtttattg ctggtttatg accttaataa 216; agtgtaatta acca gcagggtgtt tttaactgtg actattgtat aaaaacaaat 222; cttgatatcc agaagcacat gaagtttgca actttccacc ctgcccattt ttgtaaaact 228; gcagtcatct tttt aaaacacaaa actc aaccaagctg tgataagtgg 234; aatggttact gtttatactg tggtatgttt ttgattacag cagataatgc tttcttttcc 240; agtcgtcttt gagaataaag gaaaaaaaaa tcttcagatg caatggtttt gtgtagcatc 246; ttgtctatca tgttttgtaa atactggaga agctttgacc aatttgactt agagatggaa 252; tgtaactttg cttacaaaaa ttgctattaa actcctgctt ttct aattttctgt 258; gagcacacta aaagcgaaaa ataaatgtga ataaaatgta caaatttgtt gtgttttttt 264; atgttctaat aatactgaga cttctaggtc ttaggttaat ttttaggaag atcttgcatg 270; ccatcaggag taaattttat tgtggttctt aagt tttcaagctc tgaaattcat 276; aatccgcagt gtcagattac gtagaggaag atcttacaac attccatgtc aaatctgtta 282; ccatttattg gcatttagtt ttcatttaag acat aattattttt attgtagcta 288; atgt cagattaaat catttacaac aaaaggggtg tgaacctaag actatttaaa 294; tgtcttatga gaaaatttca taaagccatt ctcttgtcat tcaggtccag aaacaaattt 300; taaactgagt gagagtctat agaatccata ctgcagatgg gtcatgaaat gtgaccaaat 306; gtgtttcaaa aattgatggt cctg ctattgtaat tgcttagtgc ttggctaatt 312; tccaaattat 
tgcataatat gttctacctt aagaaaacag gtttatgtaa aatg 318; gtgttgaatg gatgatgtca gttcatgggc cata agca tcattttttt 324; tttttttttt gaaagtgtgt tagcatcttg ttactcaaag gataagacag acaataatac 330; gaat attaataatc tttactagtt tacctcctct gctctttgcc acccgataac 336; tggatatctt ttccttcaaa ggaccctaaa ctgattgaaa tttaagatat gtatcaaaaa 342; cattatttca tttaatgcac atctgttttg ctgtttttga gcagtgtgca gtttagggtt 348; catgataaat cattgaacca catgtgtaac atgc caaatcttaa actcattaga 354; aaaataacaa attaggtttt gacacgcatt tgga ataatggatc aaaaatagtg 360; gttcatgacc ttaccaaaca cccttgctac taataaaatc cact tagaagggta 366; tgtattttta gttagggttt cttgatcttg gaggatgttt gaaagttaaa aattgaattt 372; ggtaaccaaa ggactgattt atgggtcttt ttaa tttt cttagttacc 378; tagatggcca agtacagtgc ctggtatgta gtaagactca gtaaaaaagt ggatttttaa 384; aaataactcc caaagtgaat agtcaaaaat cctgttagca tata tattgctaag 390; tttgttcttt taacagctgg aatttattaa gatgcattat tttgatttta ttcactgcct 396; aaaacacttt gggtggtatt gatggagttg gtggattttc ctccaagtga ttaaatgaaa 402; tttgacgtat cttttcatcc aaagttttgt acatcatgtt ttctaacgga tgtt 408; aatatggctt ttttgtatta ctaaaaatag ctttgagatt aaat aaataactct 4141 tgtacagttc agtattgtct attaaatctg tattggcagt atgtataatg gcatttgctg 4201 tggt:acaaa cctc tgggttataa taatcatttg atccaattcc ttgt 4261 aaaa:aaagt tttaccagtt gatataatca aaaaaaaaaa aaaa 21. 
NME1: NME1 NME/NM23 nucleoside diphosphate kinase 1 [ Homo sapiens ] LOCUS NM_000269 (isoform b) AA/translation="MANCERTFIAIKPDGVQRGLVGEIIKRFEQKGFRLVGLKFMQAS EDLLKEHYVDLKDRPFFAGLVKYMHSGPVVAMVWEGLNVVKTGRVMLGETNPADSKPG TIRGDFCIQVGRNIIHGSDSVESAEKEIGLWFHPEELVDYTSCAQNWIYE CDNA: 1 gcagaagcgt cgtg caagtgctgc gaaccacgtg ggtcccgggc gcgtttcggg 61 tgctggcggc tgcagccgga gttcaaacct aagcagctgg aaggaaccat ggccaactgt 121 acct tcattgcgat caaaccagat cagc ggggtcttgt gggagagatt 181 atcaagcgtt ttgagcagaa aggattccgc cttgttggtc tgaaattcat gcaagcttcc 241 gaagatcttc tcaaggaaca ctacgttgac ctgaaggacc gtccattctt tgccggcctg 301 taca tgcactcagg gccggtagtt gccatggtct gggaggggct ggtg 361 aagacgggcc gagtcatgct gacc aaccctgcag actccaagcc tgggaccatc 421 cgtggagact tctgcataca agttggcagg aacattatac atggcagtga ttctgtggag 481 agtgcagaga aggagatcgg cttgtggttt caccctgagg aactggtaga ttacacgagc 541 tgtgctcaga actggatcta tgaatgacag gagggcagac cacattgctt tcca 601 tttcccctcc ttcccatggg cagaggacca ggctgtagga gtta tttacaggaa 661 cttcatcata atttggaggg aagctcttgg agctgtgagt tctccctgta cagtgttacc 721 gacc atctgattaa aatgcttcct cccagcatag gattcattga gttggttact 781 tcatattgtt gcattgcttt tttttccttc t LOCUS NM_198175 (isoform 1) AA/translation="MVLLSTLGIVFQGEGPPISSCDTGTMANCERTFIAIKPDGVQRG LVGEIIKRFEQKGFRLVGLKFMQASEDLLKEHYVDLKDRPFFAGLVKYMHSGPVVAMV WEGLNVVKTGRVMLGETNPADSKPGTIRGDFCIQVGRNIIHGSDSVESAEKEIGLWFH PEELVDYTSCAQNWIYE CDNA: 1 gcagaagcgt tccgtgcgtg ctgc gaaccacgtg ggtcccgggc gcgtttcggg 61 tgctggcggc tgcagccgga gttcaaacct aagcagctgg aagggccctg tggctaggta 121 agtc tctacacagg actaagtcag cctggtgtgc aggggaggca gacacacaaa 181 cagaaaattg gactacagtg tgct gtaagaagag gttaactaaa ggacaggaag 241 atggggccaa gagatggtgc tactgtctac tttagggatc gtctttcaag ggcc 301 tcctatctca agctgtgata caggaaccat ggccaactgt gagcgtacct tcattgcgat 361 caaaccagat ggggtccagc ggggtcttgt gggagagatt atcaagcgtt ttgagcagaa 421 aggattccgc cttgttggtc tgaaattcat gcaagcttcc gaagatcttc tcaaggaaca 481 ctacgttgac ctgaaggacc gtccattctt tgccggcctg gtgaaataca tgcactcagg 541 
gccggtagtt gccatggtct gggaggggct gaatgtggtg aagacgggcc gagtcatgct 601 cggggagacc aaccctgcag actccaagcc tgggaccatc cgtggagact tctgcataca 661 agttggcagg atac atggcagtga ttctgtggag agtgcagaga aggagatcgg 721 gttt gagg aactggtaga ttacacgagc tgtgctcaga tcta 781 tgaatgacag gagggcagac cacattgctt ttcacatcca tttcccctcc ttcccatggg 841 cagaggacca agga aatctagtta tttacaggaa cttcatcata atttggaggg 901 aagctcttgg agctgtgagt tctccctgta cagtgttacc atccccgacc atctgattaa 961 aatgcttcct cccagcatag gattcattga gttggttact tcatattgtt gcattgcttt 1021 tttttccttc t 22. NME2: NME2 NME/NM23 nucleoside diphosphate kinase 2 [ Homo sapiens ] LOCUS NM_001018137 (isoform a variant 2) AA/translation="MANLERTFIAIKPDGVQRGLVGEIIKRFEQKGFRLVAMKFLRAS EEHLKQHYIDLKDRPFFPGLVKYMNSGPVVAMVWEGLNVVKTGRVMLGETNPADSKPG TIRGDFCIQVGRNIIHGSDSVKSAEKEISLWFKPEELVDYKSCAHDWVYE CDNA: 1 atctcagggc agtaccactg ctgtgcggct g :cagtcag :gcaggcg ccgagaggag 61 gggcttgtga ccgccccagg gaagctgggc atcaccaaag ggagcttgtt ggac 121 actgcaagta ggaagtgtct acaggtcgat gacaggccta atctctatga cagggtctag 181 actttcctca aggg gcgcacctca gggtgaactg gaaaactcga ccgcacttta 241 gtgccaggac catggccaac ctggagcgca ccttcatcgc catcaagccg gacggcgtgc 301 agcgcggcct ggtgggcgag atcatcaagc gcttcgagca gaagggattc cgcctcgtgg 361 agtt cctccgggcc gaac acctgaagca gcactacatt gacctgaaag 421 accgaccatt cttccctggg ctggtgaagt acatgaactc agggccggtt atgg 481 tctgggaggg gctgaacgtg gtgaagacag gccgagtgat gcttggggag accaatccag 541 caaa gccaggcacc gggg acttctgcat tcaggttggc aggaacatca 601 gcag tgattcagta aaaagtgctg aaaaagaaat cagcctatgg tttaagcctg 661 aagaactggt tgactacaag gctc atgactgggt ataa gaggtggaca 721 caacagcagt ctccttcagc acggcgtggt gtgtccctgg acacagctct tcattccatt 781 gacttagagg caacaggatt gatcattctt ttatagagca tatttgccaa taaagctttt 841 cgga aaaaaaaaaa aaaaaaa LOCUS NM_001018138 (isoform a variant 3) AA/translation="MANLERTFIAIKPDGVQRGLVGEIIKRFEQKGFRLVAMKFLRAS EEHLKQHYIDLKDRPFFPGLVKYMNSGPVVAMVWEGLNVVKTGRVMLGETNPADSKPG TIRGDFCIQVGRNIIHGSDSVKSAEKEISLWFKPEELVDYKSCAHDWVYE CDNA: 
gcggccgcgc gtgg:ggggg aggagggacc ggcggcgccc acgtggcctc cgcgggcccc 6; gccagagcct gggc cgca cctctcgccc cgcaggacca tggccaacct l2; ggagcgcacc ttcatcgcca tcaagccgga cggcgtgcag cgcggcctgg tgggcgagat 18; catcaagcgc ttcgagcaga agggattccg cctcgtggcc atgaagttcc cctc 24; tgaagaacac ctgaagcagc actacattga agac cgaccattct tccctgggct ; ggtgaagtac atgaactcag ggccggttgt ggccatggtc tgggaggggc tgaacgtggt 36; gaagacaggc cgagtgatgc ttggggagac caatccagca gattcaaagc caggcaccat 42; tcgtggggac ttctgcattc aggttggcag gaacatcatt catggcagtg attcagtaaa 48; aagtgctgaa aaagaaatca gcctatggtt taagcctgaa gaactggttg actacaagtc 54; ttgtgctcat gactgggtct atgaataaga ggtggacaca acagcagtct ccttcagcac 60; ggcgtggtgt gtccctggac acagctcttc attccattga cttagaggca ttga 66; tcattctttt atagagcata aata aagcttttgg aagccggaaa aaaaaaaaaa 72; aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaa AOCUS NM_001018139 rm a variant 4) AA/translation="MAN.TRTFIAIKPDGVQRG.VGdllKRbdQKGbRLVAMKFLRAS 44HLKQHYIDLKDRPFFPGAVKYMVSGPVVAMVWTG.NVVKTGRVMLGETNPADSKPG TIRGDFCIQVGRNIIHGSDSVKSAdK4ISLWbKPd4.VDYKSCAiDWVYE CDNA: l tggg tccc ctggcgac:c c:cccgt:cc c:cttccgct tgcgctgccg 61 catg gccaacctgg agcgcacct: ca:cgcca:c gacg gcgtgcagcg l2; cggcctggtg ggcgagatca gc:t cgagcagaag cgcc tcgtggccat 18; gaagttcctc cgggcctctg aagaacacct gaagcagcac tacattgacc tgaaagaccg 24; accattcttc cctgggctgg tgaagtacat gaactcaggg ccggttgtgg ccatggtctg ; ggaggggctg aacgtggtga agacaggccg agtgatgctt ggggagacca atccagcaga 36; ttcaaagcca ggcaccattc gtggggactt ctgcattcag gttggcagga acatcattca 42; tggcagtgat tcagtaaaaa gtgctgaaaa agaaatcagc ctatggttta agcctgaaga 48; actggttgac tacaagtctt gtgctcatga ctgggtctat gaataagagg tggacacaac 54; agcagtctcc ttcagcacgg cgtggtgtgt ccctggacac agctcttcat tccattgact 60; tagaggcaac aggattgatc ttat agagcatatt taaa gcttttggaa 66; gccggaaaaa aaaaaaaaaa aa AOCUS NM_001198682 (isoform b variant 5) AA/translation="MAN TRTFIAIKPDGVQRGLVG ‘QKGERLVAMKFLRAS **HLKQHYIDLKDRPFFPGAVKYMNSGPVVAMEHHSWQ CDNA: tgcctggcga gggcgcgccc ngctgggcg tggacac:gt 
tctccggccg cgtcgggccg 6; ggcgggtggg gcgttcctgc gggttgggcg gctgggccct ccggggtgtg gccaccccgc 12; gctccgccct gcgcccctcc tccgccgccg gctcccgggt gtcg caccagctct 18; ctgctctccc agcgcagcgc cgccgcccgg cccctccagc ttcccggacc atggccaacc 24; tggagcgcac cttcatcgcc atcaagccgg acggcgtgca cctg gaga ; tcatcaagcg cttcgagcag aagggattcc gcctcgtggc catgaagttc ctccgggcct 36; ctgaagaaca cctgaagcag cactacattg acctgaaaga ccgaccattc ttccctgggc 42; tggtgaagta catgaactca gggccggttg tggccatgga ttca tggcagtgat 48; tcagtaaaaa gtgctgaaaa agaaatcagc ctatggttta agcctgaaga actggttgac 54; tacaagtctt gtgctcatga ctgggtctat gaataagagg caac agcagtctcc 60; ttcagcacgg cgtggtgtgt ccctggacac agctcttcat tccattgact caac 66; aggattgatc attcttttat agagcatatt tgccaataaa gcttttggaa gccggaaaaa 72; aaaaaaaaaa aaa aocuS NM_002512 rm a variant 1) AA/translation="MAN TRTFIAIKPDGVQRG *IIKRE‘ RLVAMKFLRAS **HLKQHYIDLKDRPFFPGAVKYMWSGPVVAMVWT ETNPADSKPG TIRGDFCIQVGRNIIHGSDSVKSA 4K *ISLWEKPA. A. DYKSCA YE CDNA: tgcctggcga gggCgCgCCC ngctgggcg tggacac: :ctccggccg cgtcgggccg 6; ggcgggtggg gcgttcctgc gggttgggcg gctgggccct tgtg gccaccccgc 12; gctccgccct gcgcccctcc tccgccgccg gctcccgggt gtggtggtcg caccagctct 18; ctgctctccc agcgcagcgc cgccgcccgg cccctccagc ttcccggacc atggccaacc 24; tggagcgcac cttcatcgcc atcaagccgg acggcgtgca gcgcggcctg gtgggcgaga ; tcatcaagcg cttcgagcag aagggattcc gcctcgtggc catgaagttc ctccgggcct 36; ctgaagaaca cctgaagcag cactacattg aaga ccgaccattc ttccctgggc 42; tggtgaagta catgaactca gggccggttg tggccatggt gggg ctgaacgtgg 48; tgaagacagg ccgagtgatg gaga cagc agattcaaag ccaggcacca 54; ttcgtgggga cttctgcatt caggttggca ggaacatcat tcatggcagt gattcagtaa 60; aaagtgctga aatc agcctatggt ttaagcctga ggtt aagt 66; cttgtgctca tgactgggtc tatgaataag aggtggacac aacagcagtc tccttcagca 721 cggcgtggtg tgga cacagctctt cattccattg acttagaggc aacaggattg 781 atcattcttt tatagagcat atttgccaat aaagcttttg gaagccggaa aaaaaaaaaa 841 aaaaaa 23. 
PGK1: PGK1 phosphoglycerate kinase 1 [ Homo sapiens ] AOCUS NM_000291 AA/translation="MSLSWKLT.DKLDVKGKRVVMRV4 DFWVPWKNNQITVNQRIKAAV PSIKFCLDWGAKSVV-MSH-GRPDGVPMPDKYS-*PVAV L .KSL-GKDV-F-KDCVGP *V‘KACANPAAGSVI.L*N-QEHV L *‘GKGKDASGNKVKA PAKI‘AERAS-SKLGDVL GTAiRAHSSWVGVVAPQKAGGbLMKK‘ VYEAKA-‘SP*RPELAI-GGAKVA NNW-DKVNTWIIGGGMAFTF-KVLNNW*IG15LED**GAKIVK3LWSKAEKN GVKIiLPVJbViADKhDLNAK1GQA1VASGIPAGWMGLDCGPESSKKYAE VTRAKQI VWVGPVGVE‘W‘AEARGiKA-MD*VVKA15RGCI111GGG31A1CCAKWWTEDKVSHV STGGGASLL .LdGKV .PGVDALSVI CDNA: agcg gccgggaagg ggcggtgcgg gaggcggggt g:ggggcggt agtgtgggcc 6; ctgttcctgc ccgcgcggtg ttccgcattc tgcaagcctc cggagcgcac gtcggcagtc 12; ggctccctcg gaat caccgacctc tctccccagc tgtatttcca aaatgtcgct 18; ttctaacaag ctgacgctgg acaagctgga cgttaaaggg aagcgggtcg ttatgagagt 24; cgacttcaat atga agaacaacca gataacaaac aaccagagga ttaaggctgc ; tgtcccaagc atcaaattct gcttggacaa tggagccaag tcggtagtcc ttatgagcca 36; cctaggccgg cctgatggtg tgcccatgcc tgacaagtac tccttagagc cagttgctgt 42; agaactcaaa tctctgctgg gcaaggatgt cttg aaggactgtg taggcccaga 48; agtggagaaa gcctgtgcca acccagctgc tgggtctgtc atcctgctgg agaacctccg 54; ctttcatgtg gaggaagaag ggaagggaaa agatgcttct gggaacaagg ttaaagccga 60; caaa atagaagctt tccgagcttc actttccaag ctaggggatg tctatgtcaa 66; tgatgctttt ggcactgctc acagagccca cagctccatg gtaggagtca atctgccaca 72; gaaggctggt gggtttttga tgaagaagga gctgaactac tttgcaaagg ccttggagag 78; cccagagcga cccttcctgg ccatcctggg cggagctaaa gttgcagaca agatccagct 84; catcaataat atgctggaca atga gatgattatt ggtggtggaa tggcttttac 90; cttccttaag gtgctcaaca acatggagat tggcacttct ctgtttgatg aagagggagc 96; caagattgtc aaagacctaa tgtccaaagc tgagaagaat ggtgtgaaga ttaccttgcc 102; tgttgacttt gtcactgctg acaagtttga tgagaatgcc aagactggcc aagccactgt 108; tggc atacctgctg tggg cttggactgt ggtcctgaaa gcagcaagaa 114; gtatgctgag gctgtcactc agca gattgtgtgg aatggtcctg tgggggtatt 120; tgaatgggaa gcttttgccc ggggaaccaa agctctcatg gatgaggtgg tgaaagccac 126; ttctaggggc tgcatcacca tcataggtgg cact gccacttgct gtgccaaatg 132; 
gaacacggag gataaagtca gccatgtgag gggt ggtgccagtt tggagctcct 138; taaa gtccttcctg gggtggatgc tctcagcaat atttagtact ttcctgcctt 144; ttagttcctg tgcacagccc ctaagtcaac ttagcatttt ctgcatctcc acttggcatt 150; agctaaaacc ttccatgtca agattcagct agtggccaag agatgcagtg ccaggaaccc 156; ttaaacagtt catc tcagctcatc ttcactgcac cctggatttg catacattct 162; tccc atttgaattt tttagtgact aaaccattgt gcattctaga gtgcatatat 168; ttatattttg aaaa agaaagtgag cagtgttagc ttagttctct tttgatgtag 174; gttattatga ttagctttgt cactgtttca agca tggaaacaag atgaaattcc 180; atttgtaggt agtgagacaa aattgatgat ccattaagta aacaataaaa gtgtccattg 186; aaaccgtgat tttttttttt ttcctgtcat actttgttag gaagggtgag aatagaatct 192; tgaggaacgg atcagatgtc tatattgctg aatgcaagaa cagc agcagtggag 198; gaca attagataaa tgtccattct ttatcaaggg cctactttat ggcagacatt 204; gtgctagtgc ttttattcta acttttattt gtta cacatgatca taatttaaaa 210; agtcaaggct tataacaaaa aagccccagc ccattcctcc cattcaagat tcccactccc 216; cagaggtgac cactttcaac tcttgagttt ttcaggtata tacctccatg tttctaagta 222; atatgcttat attgttcact tctttttttt ttatttttta aagaaatcta tttcatacca 228; tggaggaagg ctctgttcca tttc cacttcttca ttctctcggt atagttttgt 234; cacaattata atca aaagtctaca taactaatac agctgagcta tgtagtatgc 240; tatgattaaa tttacttatg taaaaaaaaa aaaaaaaaa 24. 
PGK2: PGK2 phosphoglycerate kinase 2 [ Homo sapiens ] AOCUS NM_138733 AA/translation="MSLSKKLTADKLDVRGKRVIMRV DFWVPMKKNQITVNQRIKASI PSIKYCLDVGAKAVV-MSH-GRPDGVPMPDKYS-APVAV'‘J KSLLGKDVLF-KDCVGA *V‘KACANPAPGSVI.L*N-QEHV L *‘GKGQDPSGKKIKA PDKI‘AERAS-SKLGDVL YVVDAFGTA{RAHSSWVGVVAPHKASGELMKK‘-DYEAKA-*WPV?PELAIAGGAKVA DKIQLIKNW-DKVNTWIIGGGMAYTF-KVLNNW*IGASLED**GAKIVKDIWAKAQKN GVQ1TFPVDFVTGDKFDENAQVGKATVASGISPGWMGLDCGPESNKNHAQVVAQARLI VWVGPLGVELWDAEAKGiKA-MD*IVKAiSKGCI1VIGGG31A1CCAKWNTEDKVSHV STGGGASLL .LdGKI-PGVTALSVM CDNA: 1 aacag :ggcc ctgg agacagtgag gagaagaaag gggcgggaca agggcaaagg 61 cgttagaagt cgac ccagcccctc aacagcaagt tggttcttca gcattaagat 121 ccaggtgtca gcctatgtct tgtc tctc tttctaagaa gttgacttta 181 gacaaactgg atgttagagg gaagcgagtc atcatgagag tagacttcaa tgttcccatg 241 aagaagaacc agattacaaa caaccagagg atcaaggctt ccatcccaag catcaagtac ; tgcctggaca ccaa ggcagtagtt cttatgagtc atctaggtcg gcctgatggt 36; atgc ctgacaaata ttccttagca cctgttgctg ttgagctcaa atccttgctg 42; ggcaaggatg ttctgttcct ctgt gtaggcgcag aagtggagaa agcctgtgcc 48; gctc ctggttcagt catcctgctg gagaacctgc gctttcatgt ggaggaagaa 54; gggaagggcc aagatccctc tggaaagaag attaaagctg agccagataa aatagaagcc 60; ttccgagcat cactttccaa gctaggggac gtctatgtca atgatgcttt tggcactgca 66; caccgcgctc atagttccat agtg aatctgcccc ataaagcatc cggattcttg 72; atgaagaagg aactagatta ctttgctaaa gccttggaaa tgag accctttctg 78; gctatacttg gtggagccaa agtggcagac aagatccaac ttatcaaaaa tatgctggac 84; aaagtcaatg agatgattat tggtggtgga atggcttata ccttccttaa caac 90; aacatggaga ttggtgcttc cctgtttgat gaagagggag ccaagatcgt taaagatatc 96; atggccaaag agaa tggtgtaagg attacttttc ctgttgattt tggg L02; gacaagtttg acgagaacgc tcaggttgga aaagccactg tagcatctgg catatctcct L08; ggctggatgg gtttggactg tggtcctgag agcaacaaga atcatgctca agttgtggct L14; caagcaaggc taattgtttg gaatgggccg ttaggagtat ttgaatggga tgcctttgct L20; aagggaacca aagccctcat ggatgaaatt gtgaaagcca cttccaaggg ctgcatcact L26; gttatagggg gtggagacac tgctacttgc tgtgccaaat ggaacactga agataaagtc L32; 
agccatgtca gagg cggtgccagt ctagagcttc tggaaggtaa aatccttcct L38; ggagtagagg ccctcagcaa catgtagtta atatagtgtt acttccttct gttttctgtc L44; cctt gctt aatgctttta catctcgatg tgacttttgt taaaatctac L50; tcctagatca agacctatgt aatggacaag cagcaggcca tcaggaactc ttaatatcag L56; cacagcaatt cattttagtt tggtcacgca tttgcctgtt caagttctca tttgaacttc L62; accattgtgc tatctaggga ggacatattc ttaagttgcc tattaaagaa agtgagctga L68; agaaactgaa aaaaaaaaaa aaaaaaaaaa aaaa a . RAB7A: RAB7A RAB7A, member RAS oncogene family [ Homo sapi ens ] LOCUS NW_004637 nslation .KVIILGDSGVGKTSLMNQYVNKKFSVQYKATIGAD FLTKEVMVDDRLVTMQIWDTAGQE RFQSLGVAFYRGADCCVLVFDVTAPNTFKTL DSW RDTF .IQASPRDPE .GNKIDHTNRQVATKRAQAWCYSKNNIPYE *iSAK‘AI NVEQAFQTIARNA. *VdLYV‘bP *PIKLDKNDRAKASAESCSC CDNA: 1 acttccgctc ggggcggcgg ngtggcgga agtgggagcg gagt ca:a 61 aagcctgagg cggcggcagc ggcggagttg gcggcttgga gagctcggga gagttccc:g 12; gaaccagaac ttggaccttc tcgcttctgt cctccgttta gtctcctcct cggcgggagc 18; cctcgcgacg cgcccggccc ggagccccca gcgcagcggc cgcgtttgaa ggatgacctc 24; taggaagaaa gtgttgctga aggttatcat cctgggagat tctggagtcg ggaagacatc ; actcatgaac cagtatgtga ataagaaatt cagcaatcag tacaaagcca caataggagc 36; tgactttctg accaaggagg tgatggtgga tgacaggcta atgc agatatggga 42; cacagcagga caggaacggt tccagtctct cggtgtggcc ttctacagag actg 48; ctgcgttctg gtatttgatg cccc caacacattc aaaaccctag atagctggag 54; agatgagttt ctcatccagg ccagtccccg agatcctgaa aacttcccat ttgttgtgtt 60; caag attgacctcg aaaacagaca agtggccaca aagcgggcac ggtg 66; ctacagcaaa aacaacattc ttga gaccagtgcc aaggaggcca tcaacgtgga 72; gcaggcgttc cagacgattg cacggaatgc acttaagcag gaaacggagg tggagctgta 78; caacgaattt cctgaaccta tcaaactgga caagaatgac cgggccaagg cctcggcaga 84; aagctgcagt tgctgagggg gcagtgagag ttgagcacag agtccttcac aaaccaagaa 90; tagg ccttcaacac aattcccctc tcctcttcca aacaaaacat acattgatct 96; ctcacatcca gctgccaaaa gaaaacccca tcaaacacag ttacacccca catatctctc L02; acacacacac acacacgcac acacacacac acagatctga cgtaatcaaa ctccagccct L08; tgcccgtgat ggctccttgg ggtctgcctg caca 
tgagcccgcg agtatggcag L14; caggacaagc cagcggtgga agtcattctg atatggagtt ggcattggaa gcttattctt L20; tttgttcact ggagagagag agaactgttt acagttaatc tgtgtctaat tatctgattt L26; tttttattgg tcttgtggtc cccc ccctttcccc tccctccttg aaggctaccc L32; cttgggaagg ctggtgcccc atgccccatt acaggctcac acccagtctg atcaggctga L38; gttttgtatg tatctatctg ttaatgcttg ttacttttaa ctaatcagat acag L44; tatccattta ttatgtaatg cttcttagaa aagaatctta tagtacatgt taatatatgc L50; aaccaattaa aatgtataaa ttagtgtaag aaattcttgg attatgtgtt taagtcctgt L56; aatgcaggcc tgtaaggtgg gaac cctgtttgga ttgcagagtg ttactcagaa L62; ttgggaaatc cagctagcgg cagtattctg tacagtagac acaagaatta tgtacgcctt L68; ttatcaaaga cttaagagcc aaaaagcttt tcatctctcc agggggaaaa ctgtctagtt L74; cccttctgtg tctaaatttt ccaaaacgtt cata tggt atgtgcaatg L80; gataaattgc cgttatttca aaaattaaaa ttctcatttt ctttcttttt tttcccccct L86; gctccacact tcaaaactcc cgttagatca gcattctact gtga aaggaaaacc L92; ctaacagatc tgtcctagtg attttacctt tgttctagaa ggcgctcctt tcagggttgt 1981 ggtattctta ggttagcgga gctttttcct cttttcccca cccatctccc tgcc 2041 cattattaat taacctcttt ctttggttgg aaccctggca gttctgctcc cttcctagga 2101 tctgcccctg cattgtagct tgcttaacgg agcacttctc ctttttccaa aggtctacat 2161 tctagggtgt gggctgagtt cttctgtaaa gagatgaacg caatgccaat aaaattgaac 2221 aagaacaatg ataaaaaaaa 26. 
RP417: RPL17 ribosomal protein 417 [ Homo sapiens ] AOCUS 985 (isoform A varian: 1) AA/translation="MVRYSLDPENPTKSCKSRGSNARVHFKNTRETAQAIKGWHIRKA TKYLKDVTAQKQCVPFRRYNGGVGQCAQAKQWGWTQGRWPKKSATF-.HWLKNATSNA VDSLVITHIQVNKAPKMQRRTYRAHGRINPYWSSPCHI*MI-1*K*QIVPKP 4 4 *VAQKKKISQKKLKKQKLMARE CDNA: cctgcctcct tcg: ttcttcggct c:cg cgagaagtca agttctca:g 61 agttctccca aaatccaccg ctcttcctct ttccctaagc aggg ttgactggat 12; tggtgaggcc cgtgtggcta cttctgtgga agcagtgctg tagttactgg aagataaaag 18; ggaaagcaag cccttggtgg gggaaagtat ggctgcgatg atggcatttc ttaggacacc 24; tttggattaa taatgaaaac aactactctc tgagcagctg ttcgaatcat ctgatattta ; tactgaatga gtaa gtacgtattg acagaattac cttt cctctaggtg 36; atctgtgaaa atggttcgct attcacttga cccggagaac cccacgaaat catgcaaatc 42; aagaggttcc aatcttcgtg ttcactttaa gaacactcgt gaaactgctc aggccatcaa 48; gggtatgcat atacgaaaag ccacgaagta tctgaaagat gtcactttac agaaacagtg 54; tgtaccattc cgacgttaca atggtggagt gtgt gcgcaggcca gggg 60; ctggacacaa ggtcggtggc ccaaaaagag tgctgaattt ttgctgcaca tgcttaaaaa 66; gagt aatgctgaac ttaagggttt agatgtagat tctctggtca ttgagcatat 72; ccaagtgaac aaagcaccta agatgcgccg ccggacctac agagctcatg gtcggattaa 78; catg agctctccct gccacattga gatgatcctt acggaaaagg aacagattgt 84; tcctaaacca gaagaggagg ttgcccagaa gaaaaagata tcccagaaga aactgaagaa 90; acaaaaactt atggcacggg agtaaattca gcattaaaat aaatgtaatt aaaaggaaaa 96; gaaaaaaaaa aaaaaaaaaa aaaaa AOCUS WM_001035006 rm a varian: 2) AA/translation="MVRYSLDPENPTKSCKSRGSNLRVHFKNTRETAQAIKGWHIRKA TKYLKDVTLQKQCVPFRRYNGGVGQCAQAKQWGWTQGRWPKKSATF-.HWLKNATSNA T-KGLDVDSLVITHIQVNKAPKMQRRTYRAHGRINPYMSSPCHI*MI-1*K*QIVPKP 4 4 KISQKKLKKQKLMARE CDNA: 1 cctgcctcct cagatctcg: ttcttcggct acgaatc:cg cgagaagtca agttctcatg 6; agttctccca aaatccaccg ctcttcctct ttccctaagc agcctgaggt gatctgtgaa l2; aatggttcgc tattcacttg acccggagaa ccccacgaaa aaat caagaggttc l8; caatcttcgt ttta agaacactcg tgaaactgct caggccatca agggtatgca 24; tatacgaaaa gccacgaagt atctgaaaga tgtcacttta cagaaacagt gtgtaccatt ; ccgacgttac aatggtggag ttggcaggtg 
tgcgcaggcc aagcaatggg gctggacaca 36; aggtcggtgg cccaaaaaga gtgctgaatt tttgctgcac atgcttaaaa agag 42; taatgctgaa cttaagggtt tagatgtaga ttctctggtc attgagcata tccaagtgaa 48; caaagcacct aagatgcgcc gccggaccta cagagctcat ggtcggatta acccatacat 54; gagctctccc tgccacattg agatgatcct tacggaaaag gaacagattg ttcctaaacc 60; agaagaggag gttgcccaga agaaaaagat atcccagaag aaactgaaga aacaaaaact 66; tatggcacgg gagtaaattc aaaa taaatgtaat taaaaggaaa agaaaaaaaa 72; aaaaaaaaaa aaaaaa AOCUS 19934O (isoform a varian: 3) AA/translation="MVRYSLDPENPTKSCKSRGSNLRVHFKNTRETAQAIKGWHIRKA TKYLKDVTLQKQCVPFRRYNGGVGRCAQAKQWGWTQGRWPKKSATF..HWLKNATSNA T.KGLDVDSLVITHIQVNKAPKMRRRTYRAHGRINPYMSSPCHI*MI.ideQIVPKP *44VAQKKKISQKKLKKQKLMARE CDNA: cctgcctcct cagatctcg: ttcttcggct acgaatc:cg gtca agttctca:g 6; agttctccca aaatccaccg ctcttcctct ttccctaagc agcctgaggg ttgactggat l2; tggtgaggcc cgtgtggcta cttctgtgga agcagtgctg tagttactgg aagataaaag 18; ggaaagcaag cccttggtgg gggaaagtga tctgtgaaaa tggttcgcta ttcacttgac 24; ccggagaacc ccacgaaatc atca agaggttcca gtgt tcactttaag ; cgtg aaactgctca caag ggtatgcata tacgaaaagc cacgaagtat 36; ctgaaagatg tcactttaca gaaacagtgt gtaccattcc gacgttacaa tggtggagtt 42; ggcaggtgtg cgcaggccaa gcaatggggc tggacacaag ggcc gagt 48; gctgaatttt tgctgcacat gcttaaaaac gcagagagta atgctgaact taagggttta 54; gatgtagatt ctctggtcat tgagcatatc caagtgaaca aagcacctaa gatgcgccgc 60; cggacctaca atgg tcggattaac ccatacatga gctctccctg ccacattgag 66; atgatcctta cggaaaagga acagattgtt cctaaaccag aagaggaggt tgcccagaag 72; aaaaagatat cccagaagaa actgaagaaa caaaaactta tggcacggga gtaaattcag 78; aata atta aaaggaaaag aaaaaaaaaa aaaaaaaaaa aaaa AOCUS WM_001199341 (isoform a varian: 4) AA/translation="MVRYSLDPENPTKSCKSRGSNLRVHFKNTRETAQAIKGWHIRKA TKYLKDVTLQKQCVPFRRYNGGVGQCAQAKQWGWTQGRWPKKSATF-.HWLKNATSNA T-KGLDVDSLVITHIQVNKAPKMQRRTYRAHGRINPYMSSPCHI*MI-i*K*QIVPKP 4 4 *VAQKKKISQKKLKKQKLMARE CDNA: gcctgagg:g agtgtttc ctgcgttgct ccgagggccc aatcctcctg ccgc 6; ggct gcgc cggcctccag gcccccggga ggagaactcc tagggctact 12; aaatcctcgc 
tggaggcggt ggcttcttat gcgggaggac gtggcggagg gcctgacttt 18; gggagccggg gtcagtcggc ctctgaggtg atctgtgaaa atggttcgct attcacttga 24; cccggagaac cccacgaaat catgcaaatc aagaggttcc aatcttcgtg ttcactttaa ; gaacactcgt gaaactgctc aggccatcaa gggtatgcat atacgaaaag ccacgaagta 36; tctgaaagat gtcactttac agaaacagtg tgtaccattc cgacgttaca atggtggagt 42; gtgt gcgcaggcca agcaatgggg ctggacacaa ggtcggtggc ccaaaaagag 48; tgctgaattt ttgctgcaca tgcttaaaaa gagt aatgctgaac ttaagggttt 54; agat tctctggtca ttgagcatat ccaagtgaac aaagcaccta agatgcgccg 60; ccggacctac catg gtcggattaa cccatacatg ccct gccacattga 66; gatgatcctt acggaaaagg aacagattgt tcctaaacca gaagaggagg ttgcccagaa 72; gaaaaagata aaga aactgaagaa acaaaaactt atggcacggg agtaaattca 78; gcattaaaat aaatgtaatt aaaaggaaaa gaaaaaaaaa aaaaaaaaaa aaaaa AOCUS WM_001199342 (isoform A varian: 5) AA/translation="MVRYSLDPENPTKSCKSRGSNLRVHFKNTRETAQAIKGWHIRKA TKYLKDVTLQKQCVPFRRYNGGVGRCAQAKQWGWTQGRWPKKSATF-.HWLKNATSNA T-KGLDVDSLVITHIQVNKAPKMRQRTYRAHGRINPYMSSPCHI*MI-i*K*QIVPKP 4 4 *VAQKKKISQKKLKKQKLMARE CDNA: gg:g agtgtttc ctgcgttgct ccgagggccc aatcctcctg ccatcgccgc 6; catcctggct tcgggggcgc cggcctccag gcccccggga ggagaactcc tagggctact 12; aaatcctcgc tggaggcggt ggcttcttat ggac gtggcggagg gcctgacttt 18; gggagccggg gtcagtcggc ctctgagggt tgactggatt ggtgaggccc gtgtggctac 24; ttctgtggaa gcagtgctgt agttactgga agataaaagg gaaagcaagc ccttggtggg ; ggaaagtgat ctgtgaaaat ggttcgctat gacc cggagaaccc cacgaaatca 36; tcaa gaggttccaa tcttcgtgtt cactttaaga acactcgtga aactgctcag 42; gccatcaagg gtatgcatat acgaaaagcc tatc tgaaagatgt cactttacag 48; aaacagtgtg taccattccg acgttacaat ggtggagttg gcaggtgtgc gcaggccaag 54; caatggggct ggacacaagg tcggtggccc aaaaagagtg ctgaattttt gctgcacatg 601 cttaaaaacg cagagagtaa tgctgaactt aagggtttag attc tctggtcatt 661 gagcatatcc aagtgaacaa agcacctaag atgcgccgcc ggacctacag agctcatggt 721 cggattaacc catacatgag ctctccctgc gaga tgatccttac ggaaaaggaa 781 cagattgttc ctaaaccaga agaggaggtt gcccagaaga aaaagatatc ccagaagaaa 841 ctgaagaaac aaaaacttat 
ggcacgggag taaattcagc attaaaataa atgtaattaa 901 aaggaaaaga aaaaaaaaaa aaaaaaaaaa aaa AOCUS WM_001199343 (isoform a : 6) AA/translation="MVRYSLDPENPTKSCKSRGSNLRVHFKNTRETAQAIKGWHIRKA TKYLKDVTLQKQCVPFRRYNGGVGQCAQAKQWGWTQGRWPKKSATF-.HWLKNATSNA T.KGLDVDSLVI'I‘HIQVNKAPKMRRRTYRAHGRINPYMSSPCHI*MI.ideQIVPKP KKKISQKKLKKQKLMARE CDNA: gg:g agtgtttc:c ctgcgttgct ccgagggccc aatcctcctg ccatcgccgc 61 catcctggct tcgggggcgc cggcctccag ggga ggagaactcc tagggctact 121 aaatcctcgc tggaggcggt ggcttcttat gcgggaggac gtggcggagg gcctgacttt 181 cggg ggttgactgg attggtgagg cccgtgtggc tacttctgtg gaagcagtgc 241 tgtagttact ggaagataaa agggaaagca agcccttggt gggggaaagt gatctgtgaa ; aatggttcgc tattcacttg acccggagaa ccccacgaaa tcatgcaaat caagaggttc 361 caatcttcgt gttcacttta agaacactcg tgaaactgct caggccatca agggtatgca 421 tatacgaaaa gccacgaagt atctgaaaga tgtcacttta cagaaacagt gtgtaccatt 481 ccgacgttac aatggtggag ttggcaggtg tgcgcaggcc aagcaatggg gctggacaca 541 gtgg cccaaaaaga gtgctgaatt tttgctgcac atgcttaaaa acgcagagag 60; taatgctgaa cttaagggtt tagatgtaga ttctctggtc attgagcata tccaagtgaa 661 caaagcacct aagatgcgcc ccta cagagctcat ggtcggatta acccatacat 721 gagctctccc tgccacattg agatgatcct tacggaaaag gaacagattg ttcctaaacc 781 agaagaggag gttgcccaga agaaaaagat gaag aaactgaaga aacaaaaact 841 tatggcacgg gagtaaattc agcattaaaa taat taaaaggaaa agaaaaaaaa 90; aaaaaaaaaa aaaaaa AOCUS WM_001199344 (isoform a varian: 7) nslation="MVRYSLDPENPTKSCKSRGSNLRVHFKNTRETAQAIKGWHIRKA TKYLKDVTLQKQCVPFRRYNGGVGRCAQAKQWGWTQGRWPKKSATF..HWLKNATSNA VDSLVITHIQVNKAPKMRRRTYRAHGRINPYMSSPCHI*MI.ideQIVPKP *44VAQKKKISQKKLKKQKLMARE CDNA: 1 gcctgagg:g agtgtttc:c ctgcgttgct ccgagggccc aatcctcctg ccatcgccgc 61 catcctggct tcgggggcgc cggcctccag gcccccggga ggagaactcc tagggctact 12; tcgc tggaggcggt ggcttcttat gcgggaggac gtggcggagg gcctgacttt 18; gggagccggg tgtg aaaatggttc gctattcact tgacccggag aaccccacga 24; aatcatgcaa atcaagaggt tccaatcttc gtgttcactt taagaacact cgtgaaactg ; ctcaggccat caagggtatg catatacgaa aagccacgaa gtatctgaaa gatgtcactt 36; 
tacagaaaca gtgtgtacca ttccgacgtt acaatggtgg agttggcagg tgtgcgcagg 42; ccaagcaatg gggctggaca caaggtcggt ggcccaaaaa gagtgctgaa tttttgctgc 48; acatgcttaa aaacgcagag gctg aacttaaggg tttagatgta gattctctgg 54; tcattgagca tatccaagtg aacaaagcac ctaagatgcg gacc tacagagctc 60; atggtcggat taacccatac atgagctctc cctgccacat tgagatgatc gaaa 66; aggaacagat tgttcctaaa ccagaagagg aggttgccca aaag atatcccaga 72; agaaactgaa gaaacaaaaa cttatggcac gggagtaaat tcagcattaa aataaatgta 78; attaaaagga aaagaaaaaa aaaaaaaaaa aaaaaaaa LOCUS NW_OOll99345 (isoform b variant 8) AA/translation="WHIRKATKYLKDVTLQKQCVPFRRYNGGVGRCAQAKQWGWTQGR WPKKSATFL.HWLKNA*SNA*-KGLDVDSLVITHIQVNKAPKMRRRTYRAHGRINPYM SSPCHI4MI-i4K4QIVPKPA. KKISQKKLKKQKLMAR:A.

CDNA: cc:gcctcc: cagatc:cgt ttcttcggct acgaatctcg cgagaagtca agttctcatg 6; ag:tctccca aaatccaccg ctcttcctct ttccctaagc agcctgaggt gatctgtgaa 12; tcgc cttg acccggagaa ccccacgaaa tcgt gaaactgctc 18; aggccatcaa gggtatgcat atacgaaaag ccacgaagta tctgaaagat gtcactttac 24; agaaacagtg tgtaccattc taca atggtggagt tggcaggtgt gcgcaggcca ; agcaatgggg ctggacacaa ggtcggtggc ccaaaaagag tgctgaattt ttgctgcaca 36; tgcttaaaaa cgcagagagt aatgctgaac ttaagggttt agatgtagat tctctggtca 42; ttgagcatat ccaagtgaac aaagcaccta agatgcgccg ccggacctac agagctcatg 48; gtcggattaa cccatacatg agctctccct gccacattga gatgatcctt acggaaaagg 54; aacagattgt tcctaaacca gaagaggagg ttgcccagaa gaaaaagata aaga 60; aactgaagaa actt atggcacggg agtaaattca gcattaaaat aaatgtaatt 66; aaaaggaaaa gaaaaaaaaa aaaaaaaaaa aaaaa 27. RPL28: RPL28 ribosomal protein L28 [ {omo sapiens ] LOCUS NM_OOO991 rm 2) AA/translation="MSAHLQWMVVRNCSSFLIKRNKQTYST EPNNLKARNSFRYNGLI VEPAADGKGVVVVIKRRSGQRKPATSYVRTTINKNARATLSSIRHMIRKNKY RPDLRMAAIRRASAILRSQKPVMVKRKRTRPTKSS CDNA: ctctttccgt ctcaggtcgc cgctgcgaag ggagccgccg ccatgtctgc gcatctgcaa 6; tggatggtcg tgcggaactg ctccagtttc ctgatcaaga ggaataagca gacctacagc 12; actgagccca tgaa ggcccgcaat tccttccgct acaacggact gattcaccgc 18; aagactgtgg gcgtggagcc ggcagccgac ggcaaaggtg tggt cattaagcgg 24; agatccggcc agcc tgccacctcc tatgtgcgga ccaccatcaa caagaatgct ; cgcgccacgc tcagcagcat cagacacatg aaga acaagtaccg ccccgacctg 36; cgcatggcag ccatccgcag ggccagcgcc atcctgcgca agcc tgtgatggtg 42; aagaggaagc ggacccgccc caccaagagc tcctgagccc cctgccccca gagcaataaa 48; gtcagctggc acct gcctcgactg ggcctccctt tttgaaacgc tctggggagc 54; tctggccctg tgtgttgtca ttcaggccat gtcatcaaaa ctctgcatgt caccttgtcc 60; atctggaggt aatg gctggccatg caggaggggt ggggtagctg ccttgtccct 66; ggtgagggca agggtcactg tcttcacaga aaaagtttgc tgacttgtga ccta 72; ctgtcccatt gtgaggtggc ctgaagaatc ccagctgggg cttc cattcagaag 78; aagaaaggcc agcc cagaagggtg caggctgagg gctgggccct gggccctggt 84; gctgtagcac ggtttgggga cttggggtgt tcccaagacc tgggggacga cagacatcac 90; 
gggaggaaga tgagatgact tttgcatcca gggagtgggt gcagccacat ttggagggga 96; tgggctttac ttgatgcaac ctcatctctg agatgggcaa cttggtgggt ggtggcttat L02; aactgtaagg gagatggcag ccccagggta cagccagcag gcattgagca gccttagcat L08; tgtcccccta ctcccgtcct ccaggtgtcc ccatccctcc cctgtctctt tgagctggct L14; actt aggtctcatc tcagtggccg ctcctgggcc accctgtcac ccaagctttc L20; ctgattgccc agccctcttg tttcctttgg cctgtttgct ccctagtgtt tattacagct L26; tgtgaggcca ggagtttgag accatcctag gcaacataat gagacaccgt ctctaaaata L32; aaattagctg ggtgtggtgg tgcaccgcct gtggtcccag agag gttgagtaga L38; ggctgaggtg agcggagcac ttgagccaag agtatgaggc tgcagtgagc ccatgagccc L44; caccactaca cctg gaagacacca tgacacacag tgaggcctgg atggggaaag L50; agtcctgctg ttgatcctca catgtttcct gggcacctaa ctctgtcagc cactgccagg L56; gaccaaggat tcca tggcacccct ggttcctgcc atcctggggt ttca L62; aagaaggact ctgctccctg tctgagacca cccccggctc tgactgagag taaggggact L68; gtcagggcct cgacttgcca ttggttgggg tcgtacgggg ctgggagccc tgcgttttga L74; ggcagaccac tgcccttccg acctcagtcc tgtctgctcc agtcttgccc agctcgaagg 180; agagcagatc tgaccacttg ccagcccctg tctgctgtga attaccattt cctttgtcct 186; tcccttagtt gggtctatta gctcagattg agaggtgttg ccttaaaact gagttgggtg 192; acttggtacc tgctcaggac cccccgcact gtcccaatcc cactcaggcc cacctccagc 198; tggcctcact ccgctggtga cttcgtacct gctcaggagc ccccactgtc ccagtcccac 204; tcaggcccat ctctggctgg cctcactgcg ctgggactcc gccttcataa ggagagctca 210; ctgctcacgt tagtagatgg ctcg ctct gcac ctgcttcagt 216; tgtcctccac agcactgatt tgcagcccac gcag gtttatctgt ctcatgtttg 222; tcttgtgctg gtgggcaagg ggtttgtcta gcacaccagc atataatgag atgcttgatg 228; aatggtgcat attgaatgta taaagcccac cggtcctgag ctca ctggagactt 234; tctggagatg cgct ctgttgccca ggctggcgag tgcaatggcg cgatcttggc 240; tcactgcagc ctccacctcc tgggttcaag cgattctcct gcctcagcct cccgagtagc 246; tgggattaca ggtgggtgtc accacaccca gctcagtatt gtatttttag tggg 252; gtttcaccat tttgcccagg ctggtttgga actcctgact tcaaattacc cacctgcctc 258; agcctcccaa agtgctggca ttacaggcgc cttt ctgatgtggc tgctgctgct 264; cagaaggcct 
tgtccttaac cacctccttg cctgccctgg aggcttgtgc ctctaggccc 270; ctgt ggagtcctgc tggctttctc catccctatc tgaatcctcc ctgctgtgtg 276; gcctcccctg gtctcatccg agcc cagcttagtg ggcctctgtt cctgcgggtg 282; gccagcctgt ctgtgtggct gggctgggga ggccacgtct tgaa tgctatcggt 288; gggttggggt ggaggaacca ggagagggct ggagggaggg agatggtctc agccccacag 294; agtttggagt cctcagtgtg ctgagcaaac gtggagacac catttccctc ctctagacct 300; catcttggag agagagatgt tggatggggc catctattcc agctttattc acacaaatca 306; tgtctgttgg cctggaaatt ggaaaaccag ttaaaccaaa aacatgatat aaca 312; ggcaggctca ccatagtaaa aatgctgaaa gccaaagaca aaattgggag aacaaaagaa 318; aagcgtcttg acag aaggtccctg ttag tagctgccct aaac 324; caggcccagg cagtggggac acatccagag tgctgaaaga acctccccca ggtcatccta 330; tccccaagag tgatgcccgg cagcattccc agctcagggc ttca cggaagccag 336; gaatcaaact gcctgggttc cagtcccagc tctgccagtt atgcccagct gtggggactt 342; gggcagctcg tttagtagca ccgtgcctca gtttcccata tgtaaaaggc cattttgagt 348; gcctttcaca gccctgcata aggcaggtgt ctcagtgttc gtct ctccagctct 354; tagtccagta gctgcatggt gagtgagcgt agggcgcacc ctggaaggct gccaagccca 360; aagttgtgca gagcgctggg gactccagac tccccacagc agcagagact nggactgag 366; gcatcctctg ttcacaggac atgctggcat ctactgggtc agggctctgc gtgg 372; ctgtgcaacc ttgggcaagt acct ctctgtgtct cctc atctgtaaca 378; tgcgtgtcga ctac gggt tgatgagaag attaaatgtg caaaacctgc 384; ttgactgtgc ccacaaatcc tgattgtagg aataaattaa tgacttttta taaatatttt 390; gatcagatgg actcatgatc acagatgtct tcacatgcct atgactaatt tgtacacaaa 396; ctaatgctcg tgtttcccaa gcacctggaa gacatgccag atccatgtgc agtaatgcct 402; ggtggctcca ggtctgcccc gccgtcctgt ggggctgtga gctttcccag cctcctgccc 408; gtgtttgtga atatcattct gtcctcagct gcatttccag cccaggctgt ttggcgctgc 414; ccaggaatgg tatcaattcc cctgtttctc ttgtagccag ttactagaat aaaatcatct 420; actttaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaa LOCUS NM_001136134 (isoform 1) AA/translation="WSAHLQWWVVRNCSSFLIKRNKQTYST EPNNLKARNSFRYNGLI HRKTVGVEPAADGKGVVVVIKRRSGQRKPATSYVRTTINKNARA'"LSSIRHMIRKNKY 
RPDLRMVSWGLGIRLGL1GQCCGLGPP11GCNMGWRGMDSCFQP'"PHTQHWPRGRLV:A.

CDNA: ccgt c:caggtcgc gaag ggagccgccg ccatgtctgc gcatctgcaa 6; tggatggtcg tgcggaactg ctccagtttc ctgatcaaga ggaataagca gacctacagc 12; actgagccca ataacttgaa ggcccgcaat tccttccgct acaacggact gattcaccgc 18; aagactgtgg gcgtggagcc ggcagccgac ggtg tcgtggtggt cattaagcgg 24; agatccggcc agcggaagcc tgccacctcc tatgtgcgga ccaccatcaa caagaatgct ; cgcgccacgc tcagcagcat cagacacatg atccgcaaga acaagtaccg ccccgacctg 36; cgcatggtga gctggggttt ggggatcagg cttggggaga ctggccagtg ctgtggggaa 42; gggcctccca gttg caatatgggc tggagaggga tggattcttg ctttcagcct 48; actccccaca cccagcattg gcctaggggg cggcttgtgg agtgtatggg ctgagccttg 54; ctctgctccc ccgcccccag gcagccatcc gcagggccag cgccatcctg cgcagccaga 60; agcctgtgat ggtgaagagg aagcggaccc gccccaccaa ctga gccccctgcc 66; cccagagcaa taaagtcagc tggctttctc acctgcctcg actgggcctc cctttttgaa 72; acgctctggg gagctctggc tgtt gtcattcagg ccatgtcatc aaaactctgc 78; atgtcacctt gtccatctgg aggtgatgtc aatggctggc catgcaggag gggtggggta 84; gctgccttgt ccctggtgag ggcaagggtc actgtcttca aagt ttgctgactt 90; gtgattgaga cctactgtcc cattgtgagg tggcctgaag aatcccagct gtgg 96; cttccattca gaagaagaaa ttct agcccagaag ggtgcaggct gagggctggg L02; ccctgggccc tggtgctgta gcacggtttg gggacttggg gtgttcccaa gacctggggg L08; acgacagaca tcacgggagg aagatgagat gacttttgca tccagggagt gggtgcagcc L14; acatttggag gggatgggct ttacttgatg caacctcatc tctgagatgg gcaacttggt L20; gggtggtggc ttataactgt aagggagatg gcagccccag ggtacagcca gcaggcattg L26; agcagcctta gcattgtccc cctactcccg tcctccaggt gtccccatcc ctcccctgtc L32; agct ggctcttgtc acttaggtct catctcagtg gccgctcctg ggccaccctg L38; tcacccaagc tttcctgatt gcccagccct cttgtttcct ttggcctgtt tgctccctag L44; tgtttattac agcttgtgag gccaggagtt tgagaccatc ctaggcaaca taatgagaca L50; ccgtctctaa atta gctgggtgtg gtggtgcacc gcctgtggtc cctc L56; agaggttgag tagaggctga ggtgagcgga gcacttgagc caagagtatg aggctgcagt L62; gagcccatga gccccaccac tacactccag cctggaagac acac acagtgaggc L68; ctggatgggg aaagagtcct gctgttgatc ctcacatgtt tcctgggcac ctaactctgt L74; cagccactgc ccaa agca tccatggcac 
ccctggttcc tgccatcctg L80; gggtacccga gaag gactctgctc cctgtctgag accacccccg gctctgactg L86; agagtaaggg gactgtcagg gcctcgactt gccattggtt ggggtcgtac ggggctggga L92; cgtt ttgaggcaga ccactgccct tccgacctca gtcctgtctg ctccagtctt L98; gcccagctcg aaggagagca gatctgacca cttgccagcc cctgtctgct gtgaattacc 204; atttcctttg tccttccctt agttgggtct attagctcag attgagaggt gttgccttaa 210; aactgagttg ttgg tacctgctca ggaccccccg cactgtccca atcccactca 216; ggcccacctc cagctggcct cactccgctg gtgacttcgt tcag ccac 222; tgtcccagtc ccactcaggc ccatctctgg ctggcctcac tgcgctggga ctccgccttc 228; ataaggagag ctcactgctc acgttagtag atggcccctt ctcgtgaggc ctctcccctg 234; gcacctgctt cagttgtcct ccacagcact gatttgcagc ccacaagctg gcaggtttat 240; ctgtctcatg tttgtcttgt gggc aaggggtttg tctagcacac cagcatataa 246; tgagatgctt tggt gcatattgaa tgtataaagc ccaccggtcc tgagagtttg 252; ctcactggag actttctgga gatggagtct cgctctgttg cccaggctgg caat 258; ggcgcgatct tggctcactg cagcctccac ctcctgggtt caagcgattc tcctgcctca 264; gcctcccgag tagctgggat tggg tgtcaccaca cccagctcag tattgtattt 270; ttagcagaga tggggtttca tgcc caggctggtt tggaactcct gacttcaaat 276; tacccacctg cctc ccaaagtgct ggcattacag gcgctcgagg ctttctgatg 282; tggctgctgc tgctcagaag gccttgtcct taaccacctc cttgcctgcc ctggaggctt 288; ctag gccccacccc ctgtggagtc ctgctggctt tctccatccc tatctgaatc 294; gctg tgtggcctcc cctggtctca tccgtaacac agcccagctt agtgggcctc 300; tgttcctgcg ggtggccagc ctgtctgtgt ggctgggctg ccac gtctggtatc 306; tgaatgctat ngtgggttg agga accaggagag ggctggaggg agggagatgg 312; tctcagcccc acagagtttg gagtcctcag tgtgctgagc ggag acaccatttc 318; cctcctctag acctcatctt ggagagagag atgttggatg gggccatcta ttccagcttt 324; attcacacaa atcatgtctg ttggcctgga aattggaaaa ccagttaaac caaaaacatg 330; atattaagaa aacaggcagg ctcaccatag taaaaatgct gaaagccaaa gacaaaattg 336; ggagaacaaa agaaaagcgt cttgtcacat acagaaggtc cctgataaag ttagtagctg 342; ccctcatcag aaaccaggcc caggcagtgg ggacacatcc ctga aagaacctcc 348; cccaggtcat cctatcccca agagtgatgc ccggcagcat tcccagctca atgg 354; ttcacggaag ccaggaatca 
ctgg gttccagtcc cagctctgcc agttatgccc 360; agctgtgggg acttgggcag ctcgtttagt agcaccgtgc ctcagtttcc catatgtaaa 366; aggccatttt gagtgccttt cacagccctg cataaggcag gtgtctcagt gttcactgct 372; ccag ctcttagtcc agtagctgca tggtgagtga gcgtagggcg caccctggaa 378; ggctgccaag cccaaagttg tgcagagcgc tggggactcc agactcccca cagcagcaga 384; gactcgggac tgaggcatcc tctgttcaca ggacatgctg gcatctactg ggct 390; ctgctgctcg gtgc aaccttgggc aagttcctca ctgt gtcttcgtac 396; cctcatctgt aacatgcgtg tcgatagacc ctactactca gggttgatga taaa 102; tgtgcaaaac ctgcttgact gtgcccacaa atcctgattg taggaataaa actt 108; tttataaata ttttgatcag atggactcat gatcacagat gtcttcacat gcctatgact 114; aatttgtaca caaactaatg tttc ccaagcacct ggaagacatg ccagatccat 120; gtgcagtaat gcctggtggc tccaggtctg ccccgccgtc ctgtggggct tttc 126; ccagcctcct gcccgtgttt gtgaatatca ttctgtcctc agctgcattt cagg Z32; ctgtttggcg ctgcccagga atggtatcaa ttcccctgtt tctcttgtag ccagttacta Z38; gaataaaatc atctacttta aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaa LOCUS NM_001136135 (isoform 3) AA/translation="MSAHLQWMVVRNCSSFLIKRNKQTYST EPNNLKARNSFRYNGLI HRKTVGVEPAADGKGVVVVIKRRSGQRKPATSYVR'"TINKNARATLSSIRHMIRKNKY RPDLRMDWLASTGSGLCCSVAVQPWASSSTSLCLR"LICNMRV DRPYYSGLMRRLNVQ NLLDCAHKS CDNA: 1 ctctttccgt ctcaggtcgc cgctgcgaag ggagccgccg ctgc gcatctgcaa 6; tggatggtcg tgcggaactg ctccagtttc ctgatcaaga ggaataagca gacctacagc 12; actgagccca ataacttgaa ggcccgcaat tccttccgct acaacggact gattcaccgc 18; aagactgtgg agcc ggcagccgac ggcaaaggtg tcgtggtggt gcgg 24; agatccggcc agcggaagcc tgccacctcc tatgtgcgga ccaccatcaa caagaatgct ; cgcgccacgc tcagcagcat cagacacatg atccgcaaga acaagtaccg cctg 36; cgcatggaca tgctggcatc gtca gggctctgct gctcggtggc tgtgcaacct 42; tgggcaagtt cctc tctgtgtctt cgtaccctca tctgtaacat gcgtgtcgat 48; agaccctact actcagggtt gatgagaaga ttaaatgtgc aaaacctgct tgactgtgcc 54; cacaaatcct gattgtagga ataaattaat gactttttat aaatattttg atcagatgga 60; ctcatgatca cagatgtctt ccta tgactaattt gtacacaaac taatgctcgt 66; caag cacctggaag acatgccaga tccatgtgca gtaatgcctg gtggctccag 72; 
gtctgccccg ccgtcctgtg gggctgtgag ctttcccagc ctcctgcccg tgtttgtgaa 78; tatcattctg tcctcagctg catttccagc ccaggctgtt tggcgctgcc caggaatggt 84; atcaattccc ctgtttctct tgtagccagt tactagaata aaatcatcta ctttaaaaaa 90; aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaa LOCUS NM_001136136 (isoform 4) AA/translation="MSAHLQWMVVRNCSSFLIKRNKQTYST EPNNLKARNSFRYNGLI HRKTVGVEPAADGKGVVVVIKRRSGTFCLVWARTRPLSRVWTL CDNA: 1 ctct :tccgt ctcaggtcgc cgctgcgaag ggagccgccg ccatgtctgc gcaa 6; tggatggtcg tgcggaactg ctccagtttc ctgatcaaga ggaataagca gacctacagc 12; actgagccca ataacttgaa ggcccgcaat tccttccgct acaacggact gattcaccgc 18; aagactgtgg gcgtggagcc cgac ggtg tcgtggtggt gcgg 24; agatccggtg agttttgtct ggcc agagagcggc ccctttcccg ggtctgggag ; ctgtgatttt ttactgtcag gcaggaagag cggtaactgc ggcg ggcatccctg 36; gcgccagggt gttggtctgg gtaccggctt ccctctcggc cgacttgtca gctctgtgag 42; ccgcgcgcgt ctgagcccgt gtcctcacct gtaaagtgga gaaatgaaaa aggacctgaa 48; cttcctcggt ggttgttgag agttaaggca cggggttgat gttttcagat gaaattctca 54; aagcaagtca gggtggggat tttc atcccacagg tgggaagatt gagg LOCUS NM_001136137 (isoform 5) AA/translation="MSAHLQWMVVRNCSSFLIKRNKQTYST EPNNLKARNSFRYNGLI HRKTVGVEPAADGKGVVVVIKRRSL‘J CDNA: l c:ctttccgt ctcaggtcgc cgctgcgaag ggagccgccg ccatgtctgc gcatctgcaa 6; tggatggtcg actg ctccagtttc ctgatcaaga ggaataagca gacctacagc l2; actgagccca ataacttgaa ggcccgcaat tccttccgct acaacggact gattcaccgc 18; aagactgtgg gcgtggagcc ggcagccgac ggcaaaggtg tcgtggtggt cattaagcgg 24; agatccgagt gagtttttct caggtccttg attggaactg cctcagagcc aagggtcctt ; ttactcagtg gcagcaacaa acgcagtctg ttggctagtg atcctcctgt ctcagggaca 36; cgtagtccag ggagcagcca attgcttggc acttggggac cccgttctgg ggagtcctga 42; aagctttcac ctcttggatt gccgaataca tgggtggccc ttcctagact aagggactgg 48; cctgagtgag gctgggcctc tcagccaagc tgatgttgaa ccactgctgt ggggatgggc 54; ctggggttcc tgggaagctg ttcataccca ttgccaggag cgtgggctct ggctggacct 60; ggatcagatc ctaactgaag cttt ctggcatgag aaaggagtgt tttcatggtg 66; gacagaattg gagt gt 28. 
RPS5: RPS5 ribosomal protein S5 [ Homo sapiens ] LOCUS NM_001009 AA/translation="Mi*W4iAAPAVATTPDIK.FGKWSTDDVQINDISAQDYIAVKE YAKYLPHSAGRYAAKQFRKAQCPIVERLTNSMMMiGRNNGKKLMTVRIVKiAFTIIH.
.TGTNPLQVLVNAIIWSGPREDSTRIGRAGTVRRQAVDVSPLRRVNQAIW.LCTGART AAbRNIKLIAdCLADdLIWAAKGSSVSYAIKKKDdLdRVAKSNR CDNA: 1 ctct:cctg: c:g:accagg gcggcgcg:g gtctacgccg agag acgctcaggc 6; tgtgttctca gga:gaccga gtgggagaca gcagcaccag cggtggcaga agac 12; atcaagctct ttgggaagtg gagcaccgat gatgtgcaga tcaatgacat gcag 18; gattacattg cagtgaagga gaagtatgcc aagtacctgc ctcacagtgc agggcggtat 24; gccgccaaac gcttccgcaa agctcagtgt cccattgtgg agcgcctcac taactccatg ; atgatgcacg gccgcaacaa cggcaagaag ctcatgactg tgcgcatcgt caagcatgcc 36; ttcgagatca tacacctgct cacaggcgag aaccctctgc aggtcctggt gaacgccatc 42; agtg gtccccggga ggactccaca cgcattgggc gcgccgggac tgtgagacga 48; gtgg atgtgtcccc cctgcgccgt gtgaaccagg ggct gctgtgcaca 54; cgtg cctt catt aagaccattg ctgagtgcct ggcagatgag 60; aatg ctgccaaggg ctcctcgaac tcctatgcca ttaagaagaa gctg 66; gagcgtgtgg ccaagtccaa ccgctgattt tcccagctgc tgcccaataa acctgtctgc 72; cctttggggc agtcccagcc aaaaaaaaaa aaaaa 29. RPS6: RPS6 ribosomal protein S6 [ Homo sapiens ] LOCUS NM_001010 AA/translation="MK;NISFPA"GCQK-I*VDD*RKLRibY*KQWAi*VAADALG L L WKGYVVRISGGNDKQGFPWKQGVL"HGRVR.L-SKGiSCYRPRQTGERKRKSVRGCIV DANASVLWLVIVKKGEKDIPGL D VPRRAGPKRASRIRKLFV-SKTDDVRQYVVRK PLNKEGKKPRTKAPKIQRAVTPRVLQ{KRRQIALKKQRLKKNK L *AAdYAKLLAKRMK *AK‘KRQ‘QIAKRRR-SS-RAS SKSLSSQK CDNA: 1 ttcc g:ggcgcctc ggaggcgttc agctgcttca agatgaagct gaacatctcc 6; ttcccagcca gcca gaaactcatt gaagtggacg atgaacgcaa acttcgtact l2; ttctatgaga agcgtatggc cacagaagtt gacg ctctgggtga agaatggaag 18; ggttatgtgg tccgaatcag tggtgggaac gacaaacaag gtttccccat gaagcagggt 24; gtcttgaccc atggccgtgt ccgcctgcta ctgagtaagg cctg ttacagacca ; aggagaactg gagaaagaaa gagaaaatca ggtt gcattgtgga tgcaaatctg 36; agcgttctca acttggttat tgtaaaaaaa ggagagaagg atattcctgg actgactgat 42; actacagtgc ctcgccgcct gggccccaaa agagctagca gaatccgcaa caat 48; aaag aagatgatgt ccgccagtat gttgtaagaa agcccttaaa taaagaaggt 54; aagaaaccta ggaccaaagc acccaagatt cttg cacg tgtcctgcag 60; cacaaacggc ggcgtattgc tctgaagaag cagcgtacca agaaaaataa agaagaggct 66; gcagaatatg 
ctaaactttt ggccaagaga atgaaggagg ctaaggagaa gcgccaggaa 72; caaattgcga agagacgcag actttcctct ctgcgagctt ctacttctaa gtctgaatcc 78; agtcagaaa: aagatttttt gagtaacaaa taaataagat cagactctg . SLTM: SATM SAFE—like, transcription modula:or [ {omo sapiens ] LOCJS 013843 (isoform b) AA/:ransla:ion="WAAATGAVAASAASGQAEGKKITD-?VIDLKST-K?RW-DITGV KTV-ISR-KQAI***GGDPDNI*- VSiDiPVKKP KGKGKKH AD4-SGDASVEDDAL bIKDCdL‘WQ‘Ai‘QDGNDdLKDS‘*bG*V***WVHSK*--SA L *NKQAid-I*A*GI *DI‘K‘DI‘SQ‘I‘ WQ‘G‘DDib- AQDG****W*K*GS-A*AD{iAi**W*AiiiVK *A‘DDWISVTIQAT AIT-DFDGDD--TTGKVVKIu DSLASKPKJGQDAIAQSP‘K‘S KDYEWVAWiKDGKKIDCVKGDPV‘K‘A?‘SSKKA‘SGDK‘KDw -KKGPSSTGASGQAK SSSKESKJSKTSSK u DKGS"SSTSGSSGSSTKWIWVSG;SSW"KAAD-KV-FGKYGKV ASAKVVTWARSPGAKCYGIVTMSSSTEVSRCIAi-H?T'u i C)O -ISVTKVKGDPSKK:L WKK‘V34KSSSQSSGDKKV"SDQSSKTQASVKK L KS*KK*SKDLKKI*GK3* KVDWGASGQLS‘SIKKSL *KKRISSKSPGHMVI.DQTKGDHCQPSQQGRYEKliGRSK L *K*?AS-DKK?DK3YQRK*I-Pb*KWK4QRLR L A -V?b L RLRQAW*-QRRQ*IA*R*Q N *?*?IQIIR*?**R*R-QR*R4RL*I*?QKLL N 4 S R*RIQI*Q*RQ N IAQ*R44LR?QQQQ-RY*Q*KRWS-KQPRDV3 A N DDPYWSTNKK.SLDTDARFG{G m VREVJEJHR*QGREP‘SSAVQSSSE.U N NU/U/U REVGQSAGKKARPTARREL H QYPKNFSDSQRVdPPPPRNdﬂ *SDRR*VRG L N 3*??iVIIiDQPDI"{PRHPQE wAw U] RPiSWKSLGSMSiDKR‘i V*QP*RSGR L <SGiSVRGAPPGNQSSASGYGS? 
u QGVITD?GGGSQ4YP**RHVV*?{GRDLSGPR LWiGPPSQGPSYiD"QRWGDGQN AGMITQHSSNASPINRIVQISGWSWPRGSGSGFKPFKGGPPRQF CDNA: l gcggccgccg aggcctggg: ggaagt:ggc gctgc:gccg ccgccctgca gcccactcgc 6; tgcctcggca gcgcgctgct agat ggctgccgct accggtgcgg tggcagcctc 12; ggccgcctcg ggtcaggcgg aaggtaaaaa gatcaccgat ctgcgggtca tcgatctgaa 18; gctg aagcggcgga acttagacat caccggagtc aagaccgtgc tcatctcccg 24; actcaagcag gctattgaag aggaaggagg cgatccagat aatattgaat taactgtttc ; aactgatact ccaaacaaga aaccaactaa aggcaaaggt aaaaaacatg aagcagatga 36; gttgagtgga gatgcttctg tggaagatga tgcttttatc aaggactgtg aattggagaa 42; ggca catgagcaag atggaaatga tgaactaaag gactctgaag aatttggtga 48; aaatgaagaa gaaaatgtgc attccaagga gttactctct gaaa acaagagagc 54; tcatgaatta atagaggcag aaggaataga agatatagaa aaagaggaca tcgaaagtca 60; ggaaattgaa gctcaagaag gtgaagatga tacctttcta acagcccaag atggtgagga 66; agaagaaaat gagaaagaag ggagcctagc tgaggctgat cacacagctc atgaagagat 72; ggaagctcat acgactgtga aagaagctga ggatgacaac atctcggtca caatccaggc 78; tgcc ctgg attttgatgg tgatgacctc acag gtaaaaatgt 84; gaaaattaca gattctgaag caagtaagcc aaaagatggg caggacgcca ttgcacagag 90; cccggagaag gaaagcaagg attatgagat gaatgcgaac gatg gtaagaagga 96; agactgcgtg aagggtgacc ctgtcgagaa ggaagccaga gaaagttcta agaaagcaga L02; atctggagac aaagaaaagg atactttgaa gaaagggccc tcgtctactg gggcctctgg L08; tcaagcaaag tcaa aggaatctaa agacagcaag acatcatcta acaa L14; aggaagtaca agtagtacta gtggtagcag tggaagctca actaaaaata tctgggttag L20; tggactttca tctaatacca aagctgctga tttgaagaac ctctttggca aatatggaaa L26; ggttctgagt gcaaaagtag atgc tcgaagtcct ggggcaaaat gctatggcat L32; tgtaactatg tcttcaagca cagaggtgtc caggtgtatt cttc atcgcactga L38; gctgcatgga attt ctgttgaaaa agtaaaaggt gatccctcta agaaagaaat L44; agaa aatgatgaaa gttc aagaagttct ggagataaaa aaaatacgag L50; tgatagaagt agcaagacac aagcctctgt caaaaaagaa gagaaaagat cgtctgagaa L56; atctgaaaaa aaagaaagca ctaa gaaaatagaa ggtaaagatg agaagaatga L62; taatggagca agtggccaaa catcagaatc aaaa agtgaagaaa agaagcgaat L68; aagttccaag 
agtccaggac atatggtaat actagaccaa actaaaggag atcattgtag L74; aaga agaggaagat atgagaaaat tcatggaaga agtaaggaaa aggagagagc L80; tagtctagat aaaaaaagag ataaagacta cagaaggaaa gagatcttgc cttttgaaaa L86; gatgaaggaa caaaggttga gagaacattt agttcgtttt gaaaggctgc caat 192; ggaacttcga agacgaagag agattgcaga gagagagcgt cgag ttag 198; aataattcgt gaacgggaag aacgggaacg gaga gagagagagc gcctagaaat 204; tgaaaggcaa aaactagaga gagagagaat ggaacgcgaa gaaa gggaacgcat 210; tcgtattgaa cgtc aagc tgaacggatt gctcgagaaa gagaggaact 216; cagaaggcaa caacagcagc ttcgttatga acaagaaaaa aggaattcct tgaaacgccc 222; acgtgatgta gatcataggc gagatgatcc ttactggagc gagaataaaa agttgtctct 228; agatacagat gcacgatttg gccatggatc cgactactct cgccaacaga acagatttaa 234; tgactttgat caccgagaga ggggcaggtt tcctgagagt tcagcagtac agtcttcatc 240; ttttgaaagg cgggatcgct ttgttggtca gggg gcac gacctactgc 246; acgaagggaa gatccaagct tcgaaagata aaat ttcagtgact ccagaagaaa 252; tgagcctcca ccaccaagaa atgaacttag agaatcagac aggcgagaag tacgagggga 258; gcgagacgaa aggagaacgg tgattattca tgacaggcct gatatcactc atcctagaca 264; tcctcgagag gcagggccca atccttccag acccaccagc tggaaaagtg aaggaagcat 270; gtccactgac aaacgggaaa caagagttga aaggccagaa cgatctggga gagaagtatc 276; agggcacagt gtgagaggcg ctccccctgg gaatcgtagc agcgcttcgg ggtacgggag 282; cagagaggga gacagaggag tcatcacaga ccgaggaggt ggatcacagc actatcctga 288; ggagcgacat gtggttgaac gccatggacg ggacacaagc ggaccaagga aagagtggca 294; tggtccaccc tctcaagggc ctagctatca tgatacgagg ggtg acggccgggc 300; aggagcaggc atgataaccc aacattcaag taacgcatcc ccaattaata gaattgtaca 306; aatcagtggc aattccatgc caagaggaag tggctccgga tttaagccat ttaagggtgg 312; acctccgcga cgattctgaa aatgagctct ctgccaaggt tttaagataa tttattgaaa 318; gtaa actttacttg actacttatg aagaggacct ctgacttgct tgagagttct 324; gtcagacttt tctttttaaa aatttaacat gattgctttt ctcaattttg atgt 330; ttaaatagtt ctgttgtaac ttttaatagt tttgtgtatc attcaacttt ttttcttgca 336; gcaccgaggc acatttgaaa agatggaatt gttt tgtttaacgc tgtgtgaata 342; taaagagtag tttgcagctg tgtggtagtg 
gtttaatttg cagccttagc tctgtggtgt 348; ctggctctag agttacttct ttttaccaag cattttcagc ctccattttg aaggctgtct 354; acacttaaga agtcttagct tttt tagagaataa gattgttcat tgcatttctg 360; atgt aacctatttt tgcagaaggt actgttacat taagtgcatc tgtgtatcct 366; ggtttaaaaa aatgtaatct tttttgaaat aaaccttcat attctgtata gttgctaaag 372; tgttgagaac ctttttaatt gtaaaatgag aaccgatttt cagtttagtg tagcagcaca 378; cttgttcagg tttgcatggt ccaa atagattcat gaaaccttgg ccatgaggtt 384; tgtttcacaa ggttcttaga ccgagttgtg caggtaagtg cacttttagg taatctgcac 390; tgtttgtttg atggataaat tccatctctg ggaattgtgt gggtattaat gtttccatgt 396; tcccaactat gttgagaagt ggaaaaaaac ccaggttcta gatgggtgaa tcagttgggt 402; tttgtaaata cttgtatgtg gggaagacat cttt aaat aaaaatccac 408; acctggaagt gtaaaaaaaa aaaaaaaaaa LOC VW 024755 (isoform a) AA/ :ion= HWAAATGAVAASAASGQAEGKKITD-RVID .KSTLK QRV-DITGV KTV .KQAI*** GGDPDWId. VS DLPVKKP KGKGKKH4AD 4LSG DASVIJDA bIK “QDGND‘-KJS* *bG*W***WVHSK*--SA L *VKRAH‘ .I‘A *GI 431 4AQ‘G‘DDih. L L V‘KDIAGSGDGTQTVSKP .PS'G U] wLADHLA {i iVK‘A‘D DVISV IQALDAI LDEDGDD..TTGKNVKIT DS m KPKDGQ IAQSP4 K*SK3Y *WVAVHK DGKKE DCVKGDPV*K*A? *SSKKA *SG uN N D"AKKGPSSTGASGQAKSSSKESK DSKTSSKJ STSGSSGSSTKNIWVSG mV"KAAD-KV-FGKYGKVASAKVVTWARSPGAKCYGIVTWSSSTEVSRCIA { NH Li VTKVKGDPSKK L WKK‘VJ *KSSS QSSGDKKV"SD ASVKK W SKDLKKI‘GKldKV DWGASGQLS *SIKKSL *KK GHMVIA :CQPSRRGRYEKliG RSK*K*QAS. DKKQDKDYQQK *I-Pb‘KWK dQRLRd NU LRKAW*-RRRR*IA* RdRRdeRIRIIR‘ %«4R«a.QRd RLdId RQKLd RLdeRIRIdeRRK *AdRIARde aLRRQQQQ-RY 4Q4K RVs-K QPRDV WSTNKK-SLDTDARFGiGSDYS RQQVREV DEDHR*RGREP SSSF QSEGKKARPLAQR L DPSE‘ RYPKNESDS QRVEPPPP RN4- DR? 
*VRG‘ Riv IIiDRPDI"{PRHP PS RPLSWKSLGSMS i3KR‘i V*?P* RSGR‘ {SVR GAPPGNQSSASGYGS REGDRGVITD RGGGSQiYP“ QHVV*?{GR DiSGP PP SQGPSYiD"RRWGDG QAGAGMITQHSSNASPINRIVQISGVSWPRGSGSGFKPFKGGP PRQF CDNA: l gcggccgccg aggcctgggt ggaagttggc gctgctgccg ccgccctgca gcccactcgc 6; ggca gcgcgctgct cttctaagat ggctgccgct accggtgcgg tggcagcctc 12; ggccgcctcg ggtcaggcgg aaggtaaaaa gatcaccgat ctgcgggtca tcgatctgaa 18; gctg aagcggcgga acttagacat agtc aagaccgtgc tcatctcccg 24; actcaagcag gctattgaag aggaaggagg cgatccagat aatattgaat taactgtttc ; aactgatact aaga aaccaactaa aggcaaaggt aaaaaacatg aagcagatga 36; gttgagtgga gatgcttctg tggaagatga tgcttttatc aaggactgtg aattggagaa 42; tcaagaggca catgagcaag atggaaatga tgaactaaag gactctgaag aatttggtga 48; aaatgaagaa gaaaatgtgc attccaagga gttactctct gcagaagaaa acaagagagc 54; tcatgaatta atagaggcag aaggaataga agatatagaa aaagaggaca tcgaaagtca 60; ggaaattgaa gctcaagaag atga tacctttcta acagcccaag atggtgagga 66; agaagaaaat gagaaagata tagcaggttc tggtgatggt acacaagaag tatctaaacc 72; tcttccttca gaagggagcc tagctgaggc tgatcacaca gctcatgaag agatggaagc 78; tcatacgact gtgaaagaag ctgaggatga caacatctcg gtcacaatcc aggctgaaga 84; tgccatcact ctggattttg atggtgatga cctcctagaa acaggtaaaa atgtgaaaat 90; tacagattct agta agccaaaaga tgggcaggac gccattgcac agagcccgga 96; gaaggaaagc aaggattatg agatgaatgc taaa gatggtaaga aggaagactg L02; gggt gaccctgtcg agaaggaagc cagagaaagt tctaagaaag cagaatctgg L08; agacaaagaa aaggatactt aagg gccctcgtct actggggcct aagc L14; aaagagctct tcaaaggaat ctaaagacag caagacatca tctaaagatg acaaaggaag L20; tacaagtagt actagtggta gcagtggaag ctcaactaaa aatatctggg ttagtggact L26; ttcatctaat accaaagctg ctgatttgaa cttt ggcaaatatg ttct L32; gagtgcaaaa gtagttacaa atgctcgaag tcctggggca aaatgctatg gcattgtaac L38; tatgtcttca agcacagagg tgtccaggtg tattgcacat cttcatcgca ctgagctgca L44; tggacagctg atttctgttg aaaaagtaaa aggtgatccc tctaagaaag aaatgaagaa L50; tgat gaaaagagta gttcaagaag ttctggagat aaaaaaaata cgagtgatag L56; aagtagcaag acacaagcct ctgtcaaaaa agaagagaaa agatcgtctg 
agaaatctga L62; aaaaaaagaa agcaaggata ctaagaaaat agaaggtaaa gatgagaaga atgg L68; tggc caaacatcag aatcgattaa aaaaagtgaa gaaaagaagc gaataagttc L74; caagagtcca atgg taatactaga ccaaactaaa ggagatcatt gtagaccatc L80; aagaagagga agatatgaga aaattcatgg aagaagtaag gaaaaggaga gagctagtct L86; agataaaaaa aaag actacagaag gatc ttgccttttg aaaagatgaa L92; ggaacaaagg ttgagagaac ttcg ttttgaaagg ctgcgacgag aact L98; tcgaagacga agagagattg cagagagaga gcgtcgagag cgagaacgca ttagaataat 204; tcgtgaacgg gaagaacggg taca gagagagaga gagcgcctag aaattgaaag 210; gcaaaaacta gagagagaga gaatggaacg cgaacgcttg gaaagggaac gcattcgtat 216; tgaacaggaa cgtcgtaagg aagctgaacg gattgctcga gaaagagagg aactcagaag 222; acag cagcttcgtt atgaacaaga aaaaaggaat tccttgaaac gcccacgtga 228; tgtagatcat aggcgagatg atccttactg gagcgagaat aaaaagttgt ctctagatac 234; agatgcacga tttggccatg gatccgacta ccaa cagaacagat ttaatgactt 240; tgatcaccga gagaggggca ggtttcctga gagttcagca gtacagtctt catcttttga 246; aaggcgggat cgctttgttg gtcaaagtga ggggaaaaaa gcacgaccta ctgcacgaag 252; ggaagatcca agcttcgaaa gatatcccaa aaatttcagt gactccagaa gaaatgagcc 258; tccaccacca agaaatgaac ttagagaatc agacaggcga gaagtacgag gggagcgaga 264; cgaaaggaga acggtgatta acag gcctgatatc actcatccta gacatcctcg 270; agaggcaggg cccaatcctt ccagacccac cagctggaaa agtgaaggaa gcatgtccac 276; tgacaaacgg gaaacaagag ttgaaaggcc agaacgatct gggagagaag tatcagggca 282; gaga ggcgctcccc ctgggaatcg tagcagcgct tcggggtacg ggagcagaga 288; gggagacaga ggagtcatca cagaccgagg aggtggatca cagcactatc ctgaggagcg 294; ggtt gaacgccatg gacgggacac acca aggaaagagt ggcatggtcc 300; accctctcaa gggcctagct atcatgatac gaggcgaatg ggtgacggcc gagc 306; aggcatgata acccaacatt caagtaacgc atccccaatt aatagaattg tacaaatcag 312; tggcaattcc atgccaagag gaagtggctc cggatttaag ccatttaagg gtggacctcc 318; gcgacgattc tgag gcca aggttttaag ataatttatt gaaatctcct 324; gtaaacttta cttgactact tatgaagagg acctctgact tgcttgagag ttctgtcaga 330; cttttctttt taaaaattta acatgattgc ttttctcaat tttggagaag atgtttaaat 336; agttctgttg taacttttaa 
tagttttgtg tatcattcaa ctttttttct tgcagcaccg 342; aggcacattt gaaaagatgg aattgaagtc gttttgttta acgctgtgtg aatataaaga 348; gtagtttgca gctgtgtggt agtggtttaa tttgcagcct tagctctgtg gtgtctggct 354; ctagagttac ttctttttac caagcatttt cagcctccat ggct gtctacactt 360; aagaagtctt agctgtctaa tttttagaga ataagattgt tcattgcatt tctgagtatt 366; atgtaaccta tttttgcaga aggtactgtt acattaagtg catctgtgta ttta 372; tgta atcttttttg aaataaacct tcatattctg tatagttgct aaagtgttga 378; gaaccttttt aattgtaaaa tgagaaccga gttt gcag cacacttgtt 384; caggtttgca tggtatgaaa ccaaatagat tcatgaaacc ttggccatga ggtttgtttc 390; acaaggttct gagt ggta agtgcacttt taggtaatct tttg 396; tttgatggat aaattccatc aatt gtgtgggtat taatgtttcc atgttcccaa 402; ctatgttgag aagtggaaaa aaacccaggt tggg tgaatcagtt gggttttgta 408; aatacttgta tgtggggaag acattgttgt ctttttgtga aaataaaaat ccacacctgg 414; aagtgtaaaa aaaaaaaaaa aaaa 31. TMED4: TMED4 transmembrane emp24 protein transport domain containing 4 [ Homo sapiens ] LOCUS NM_182547 AA/translation="MAGVGAGPARAMGRQALLLAALCATGAQG .YbHIGA. I- *KRCEI A.
*IPD‘iMVIGNYR"QWWDKQKTVF .PSTPG .GMHVTVKDPDGKVV JSRQYGSL‘JGRFTF TSHTPGDHQICLHSNSTRMALFAGGKLRVH DIQVG'THANNYPEIAAKDKLT‘J .QLRA RQL .DQV *QIQK‘QDYQRYR L *RERL is dSiNQRVLWWSIAQTVILILTGIWQMRHLK SFFTAKK .V CDNA: l ggcgct :agg ggc g cagg tgtcggggct gggcctctgc gggcgatggg 6; gcggcaggcc ctgctgcttc tcgcgctgtg cgccacaggc gcccaggggc tctacttcca 12; catcggcgag accgagaagc gctgtttcat cgaggaaatc cccgacgaga ccatggtcat 18; cggcaactat cgtacccaga tgtgggataa gcagaaggag gtcttcctgc cctcgacccc 24; tggcctgggc atgcacgtgg aagtgaagga ccccgacggc aaggtggtgc tgtcccggca ; gtacggctcg gagggccgct tcacgttcac ctcccacacg cccggtgacc atcaaatctg 36; tctgcactcc acca ggatggctct cttcgctggt ggcaaactgc gtgtgcatct 42; ccag gttggggagc atgccaacaa ctaccctgag gcaa aagataagct 48; gacggagcta cagctccgcg cccgccagtt gcttgatcag gtggaacaga ttcagaagga 54; gcaggattac caaaggtatc gtgaagagcg cttccgactg acgagcgaga gcaccaacca 60; gagggtccta tggtggtcca agac tgtcatcctc actg gcatctggca 66; gatgcgtcac ctcaagagct tctttgaggc caagaagctg gtgtagtgcc ctctttgtat 72; gacccttcct ttttacctca tttatttggt actttcccca cacagtcctt tatccgcctg 78; gatttttagg gaaaaaaatg aaaaagaata agtcacattg gttccatggc ccat 84; cagc cacttgctga ccctggttct taaggacaca tgacattagt ccaatctttc 90; aaaatcttgt cttagggctt gtgaggaatc aacc caggactcag tcctgcttct 96; tcga gtgattttcc tctgtttttc actaaataag caaatgaaaa ctctctccat 102; taaaaaaaaa aaaaaaaaaa aaaaaaaa 32. TNRCBA: ADRBKl adrenergic, beta, or kinase 1 [ Homo sapiens ] LOC JS NW_OOl6l9 AA/ :ranslation="MA D-TAVLA TKSKATPAA QASKKI- TPSI RSVMQ KYL L DRG *Vib‘KIbSQK-GY--ER *ARPLV *EY L *IKKY‘KLd 44*?V ARS? EIF DSYIWKT.LACS{PESKSA .GKKQVPP .bQPYI* *ICQW-QGD VbQKhI 4 4 SDKhinCQWKVV*-WIH. {RIIG EVYGC D"GKWYA MKCA DKK QIKMKQG L -A-V*RIWLS ILD-WVG GDLiYHASQHGVESLADM RbYAA‘II {G4VRIS DLGJAC DFSKKKPHASVGTiGYWAPT ?G{SPFR QHK KDK {*IDQML- WAV4-PDSESP4. QGAQ4VK45 PFFRSL DWQMVFLQKYPPP-IPPQG DSDQ'-YQN‘J EPL IS L RWQQ‘VA‘ VbDiIVA YWS KMGVPEJTQWQRRYbY-EPVR. 
*IQSV* *RKC---KI RGGKQFI AQCDSDPTLVQWKK'.‘J TAQQLVQRVPKWKWKP QSPVVTLSKVP-VQ RGSANG 4 CDNA: 1 cgggcgcgcg ggcggcggcg gcggcggcgc cccgactgca g :cccggcgg gagcggagcg 61 cgagccgggg ccgggcccga gccggcgcca tggggcggcg ccgcctgtga gcggcggcga 12; gcggagccgc gggcgccgag cagg nggaggcgt cggcgcccga ggccgagcga 18; gccgcggccg ggccgggccg agcgccgagc gagcaggagc ggcggcggcg gcggcggcgg 24; cgggaggagg cagcgccgcc gccaagatgg tgga ggcggtgctg gccgacgtga ; gctacctgat ggccatggag aagagcaagg ccacgccggc cgcgcgcgcc agcaagaaga 36; tcctgctgcc cgagcccagc atccgcagtg tcatgcagaa gtacctggag gaccggggcg 42; aggtgacctt tgagaagatc ttttcccaga agctggggta cctgctcttc cgagacttct 48; gcctgaacca ggag gccaggccct tggtggaatt ctatgaggag atcaagaagt 54; acgagaagct ggagacggag gaggagcgtg gcag ccgggagatc ttcgactcat 60; acatcatgaa ggagctgctg gcctgctcgc atcccttctc tgcc actgagcatg 66; tccaaggcca cctggggaag aagcaggtgc ctccggatct gcca tacatcgaag 72; agatttgtca aaacctccga gtgt tccagaaatt cattgagagc gataagttca 78; cacggttttg ccagtggaag aatgtggagc tcaacatcca cctgaccatg aatgacttca 84; gcgtgcatcg catcattggg ggct ttggcgaggt gtgc ngaaggctg 90; acacaggcaa gatgtacgcc atgaagtgcc tggacaaaaa gcgcatcaag atgaagcagg 96; gggagaccct ggccctgaac atca tgctctcgct cgtcagcact ggggactgcc L02; cattcattgt gtca tacgcgttcc acacgccaga caagctcagc ttcatcctgg L08; acctcatgaa gac ctgcactacc acctctccca gcacggggtc ttctcagagg L14; ctgacatgcg cttctatgcg gccgagatca tcctgggcct ggagcacatg cacaaccgct L20; tcgtggtcta ccgggacctg aagccagcca acatccttct ggacgagcat ggccacgtgc L26; ggatctcgga cctgggcctg gcctgtgact tctccaagaa gaagccccat gccagcgtgg L32; gcacccacgg ggct ccggaggtcc tgcagaaggg cgtggcctac gacagcagtg L38; ccgactggtt ctctctgggg tgcatgctct tcaagttgct gcgggggcac agccccttcc L44; ggcagcacaa gaccaaagac aagcatgaga tcgaccgcat gacgctgacg atggccgtgg L50; ccga ctccttctcc cctgaactac gctccctgct ggaggggttg ctgcagaggg L56; atgtcaaccg gagattgggc tgcctgggcc gaggggctca ggaggtgaaa gagagcccct L62; ttttccgctc cctggactgg cagatggtct tcttgcagaa gtaccctccc ccgctgatcc L68; ccccacgagg ggaggtgaac 
gcggccgacg ccttcgacat tggctccttc gatgaggagg L74; acacaaaagg aatcaagtta ctggacagtg atcaggagct ctaccgcaac ttccccctca L80; ccatctcgga gcggtggcag caggaggtgg cagagactgt cttcgacacc atcaacgctg L86; agacagaccg gctggaggct cgcaagaaag acaa gcagctgggc catgaggaag L92; actacgccct gggcaaggac tgcatcatgc atggctacat gtccaagatg ggcaacccct 198; tcctgaccca gtggcagcgg cggtact:ct acctgttccc cctc gagtggcggg 204; gcgagggcga ggccccgcag c:ga ccatggagga gatccagtcg gtggaggaga 210; cgcagatcaa caag tgcctgc:cc tcaagatccg ngtgggaaa attt 216; tgcagtgcga ccct gagctgg:gc agtggaagaa ggagctgcgc gacgcctacc 222; gcgaggccca gcagctggtg cagcggg:gc ccaagatgaa gaacaagccg cgctcgcccg 228; tggtggagct ggtg ccgctgg:cc agcgcggcag tgccaacggc ctctgacccg 234; cccacccgcc aaac ctctaat:ta ttttgtcgaa tttttattat ttgttttccc 240; gccaagcgga aaaggtttta ttttgtaatt attgtgattt cccgtggccc cagcctggcc 246; cagctccccc gggaggggcc cgcttgcctc tgct gcaccaaccc agccgctgcc 252; cggcgccctc tgtcctgact tcaggggctg cccgctccca gtgtcttcct gtgggggaag 258; agcacagccc ccct tccccgaggg atgatgccac accaagctgt gccaccctgg 264; gctctgtggg ctgcactctg tgcccatggg cactgctggg tggcccatcc cccctcacca 270; ggggcaggca cagcacaggg atccgacttg aattttccca ctgcaccccc tgca 276; gaggggcagg ccctgcactg tcca cagtgttggc gagaggaggg gcccgttgtc 282; tccctggccc cctc ccacagtgac tcgggctcct gtgcccttat tcaggaaaag 288; cctctgtgtc actggctgcc tccactccca cttccctgac actgcggggc ttggctgaga 294; gagtggcatt ggcagcaggt gctgctaccc tccctgctgt cccctcttgc cccaaccccc 300; agcacccggg ctcagggacc acagcaaggc acctgcaggt tgggccatac tggcctcgcc 306; gagg tctcgctgat gctgggctgg gtgcgacccc atctgcccag gacggggccg 312; gccaggtggg gcac gagg ctggctgggg cctatcagtg tgccccccat 318; cctggcccat cagtgtaccc ccgcccaggc tggccagccc cacagcccac gtcctgtcag 324; tgccgccgcc tcgcccaccg catgccccct cgtgccagtc gcgctgcctg tgtggtgtcg 330; cgccttctcc cccccggggc tgggttggcg caccctcccc tcccgtctac tcattccccg 336; gggcgtttct ttgccgattt ttgaatgtga ttttaaagag tgaaaaatga gactatgcgt 342; ttttataaaa aatggtgcct gattaaaaaa aaaaaaaaaa aaaaaaaaaa 
aaaaaaaa 33. TUBB: TUBB tubulin, beta class I [ Homo sapiens ] LOCUS NM_l78014 AA/translation="MR EIVHIQAGQCGNQIGAKEW *VISD‘HGIDP iGiYHGDSDLQL DRISVYYNEATGGKYVPRAI .V MDSVRSGPFGQIFRPDVFVFGQSGAGVWWA KGHY i4GA 4 -VDSVLDVVQK *A *SCDC .QGFQLT {SLGGGTGSGWGTL .ISKIRA. *YP DRIMNTFSVVPSPKVSDTVV TPYNATLSVHQ .V W D *iYCIDNLALY DICFRTJKLT TPTYGDLN44VSA"MSGVTTCL RFPGQ .NAD-RK-AVVMVPFPRJHFFWPGFAPATSR GSQQYRALLVPLL QQVE DAKNWMAAC DPRHGRYA"VAAVFRGRWSMK *VDdQM-VVQ NKNS SYFVEWIPVNVK'"AVCDIPPRGLKWAVLEIGNSLAIQ 4LEKQIS *QbiAMbRRK AELHWYLG *GMD‘M *J: 4A *SNMNDLVSTYQQYQDALAA. 4 4 *DEG W 4 4 4A CDNA: gcacctcgct gC :ccagcct ctggggcgca t :ccaacc Z 1I ccagcctgcg acctgcggag 6; aaaaaaaatt ac:tattttc ttgccccata cataccttga ggcgagcaaa aaaattaaat 12; tttaaccatg atcg tgcacatcca ggctggtcag tgtggcaacc agatcggtgc 18; caagttctgg gaggtgatca aaca tggcatcgac cccaccggca cctaccacgg 24; cgac ctgcagctgg accgcatctc tgtgtactac aatgaagcca caggtggcaa ; atatgttcct cgtgccatcc tggtggatct agaacctggg gact ctgttcgctc 36; aggtcctttt ggccagatct ttagaccaga caactttgta tttggtcagt ctggggcagg 42; taacaactgg gccaaaggcc caga gggcgccgag ctggttgatt ctgtcctgga 48; acgg aaggaggcag agagctgtga ctgcctgcag ggcttccagc actc 54; actgggcggg ggcacaggct ctggaatggg cactctcctt aaga tccgagaaga 60; ataccctgat cgcatcatga ataccttcag tgtggtgcct tcacccaaag tgtctgacac 66; Cgtggtcgag ccctacaatg ccaccctctc cgtccatcag ttggtagaga atga 72; ttgc attgacaacg aggccctcta tgatatctgc ttccgcactc tgaagctgac 78; cacaccaacc tacggggatc tgaaccacct agcc acca:gagtg ccac 84; ctgcctccgt ttccctggcc agctcaatgc tgacctccgc aagt:ggcag tcaacatggt 90; ccccttccca cgtctccatt tctttatgcc tggctttgcc cctc:cacca gccgtggaag 96; ccagcagtat cgagctctca cagtgccgga actcacccag cagg:cttcg atgccaagaa L02; catgatggct gcctgtgacc acgg ccgatacctc accg :ggctg ctgtcttccg L08; tggtcggatg tccatgaagg aggtcgatga gcagatgctt aacg:gcaga acaagaacag L14; cagctacttt gtggaatgga tccccaacaa tgtcaagaca gccg:ctgtg acatcccacc L20; tcgtggcctc aagatggcag tcaccttcat tggcaatagc acagccatcc aggagctctt L26; catc 
tcggagcagt tcactgccat gttccgccgg aaggccttcc tccactggta L32; cacaggcgag ggcatggacg agatggagtt caccgaggct gagagcaaca tgaacgacct L38; cgtctctgag tatcagcagt accaggatgc caccgcagaa gaggaggagg atttcggtga L44; ggaggccgaa gaggaggcct aaggcagagc ccccatcacc ttct cagttccctt L50; agccgtctta ctcaactgcc cctttcctct ccctcagaat ttgtgtttgc tgcctctatc L56; ttgttttttg ttttttcttc tggggggggt ctagaacagt gcctggcaca tagtaggcgc L62; tcaataaata cttgtttgtt gaatgtctcc tctctctttc cactctggga aacctaggtt L68; tctgccattc accc tgtatttctt tctggtgccc attccatttg tccagttaat 174; acttcctctt aaaaatctcc ctgg agat cccatttaga accaaccagg 180; tgctgaaaac acatgtagat aatggccatc atcctaagcc caaagtagaa aatggtagaa 186; ggtagtgggt agaagtcact atataaggaa ggggatggga ttttccattc taaaagtttt 192; ggagagggaa atccaggcta ttaaagtcac taaatttcta agtatgtcca tttcccatct 198; cagcttcaag ggaggtgtca gcagtattat ctccactttc aatctccctc caagctctac 204; tctggaggag tctgtcccac tctgtcaagt ggaatccttc cctttccaac tctacctccc 210; tcactcagct cctttcccct gatcagagaa agggatcaag ggggttggga ggggggaaag 216; agaccagcct tggtccctaa gcctccagaa tctt aatccccacc ttttcttact 222; cccaaaaaag aatgaacacc cctgactctg gagtggtgta tactgccaca tcagtgtttg 228; agtcagtccc cagaggagag cctc ctccatcttt tttgcaacat ctcatttctt 234; ccttttgctg ttgcttcccc cctcacacac ttggttttgt tctatcctac gatt 240; tctattttat gttgaacttg ctgctttttt tcatattgaa aagatgacat cgccccaaga 246; gccaaaaata aatgggaatt gaaaaaaaaa aaaaaaaaaa aaaa 34. ‘J UBEZI ubiquitin—conjugating enzyme Lhi 21 [ {omo sapi ens ] LOCUS NM_003345 (variant 1) AA/translation="MSGIALSRLAQ KDHPFGFVAVPTKNP DGTMN-WNW ‘JCA IPGKKGTPWTGGLFK .RMLFKDDYPSSPPKCKREPPLFHPVVYPSGTVCLSILA. 
*DKD WRPAITIKQILLGIQ 4TJN *PNIQDPAQA 3AYTIYCQNRV4Y *KRVRAQAKKFAPS CDNA: l gcccgcgcca gggtcctcgg agctgctc :g gctgcgcgcg gagcgggctc gaag 6; tcccgagaca aagggaagcg ccgccgccgc cgccccgctc ggtcctccac ctgtccgcta 12; cgctcgccgg ggctgcggcc ggga ctttgaacat gtcggggatc gccctcagca 18; gactcgccca ggagaggaaa gcatggagga aagaccaccc atttggtttc gtggctgtcc 24; aaaa tcccgatggc acgatgaacc tcatgaactg ggagtgcgcc attccaggaa ; agaaagggac tccgtgggaa ggaggcttgt ttaaactacg gatgcttttc aaagatgatt 36; atccatcttc gccaccaaaa tgtaaattcg aaccaccatt atttcacccg aatgtgtacc 42; cttcggggac agtgtgcctg tccatcttag aggaggacaa ggactggagg ccagccatca 48; caatcaaaca gatcctatta ggaatacagg aacttctaaa tgaaccaaat atccaagacc 54; cagctcaagc agaggcctac acgatttact gccaaaacag agtggagtac gagaaaaggg 60; tccgagcaca gaag ccct cagc gaccttgtgg catcgtcaaa 66; aggaagggat tggtttggca agaacttgtt tacaacattt ttgcaaatct aaagttgctc 72; catacaatga ctagtcacct gggggggttg ggcgggcgcc atcttccatt gccgccgcgg 78; ggtc tcgattcgct gaattgcccg taca gggtctcttc tctt 84; tttt gattgttatg taaaactcgc ttta atgt cagtatttca 90; actgctgtaa aattataaac ttttatactt gggtaagtcc cccaggggcg agttcctcgc 96; tctgggatgc aggcatgctt gtgc agagctgcac ttggcctcag ctggctgtat L02; ggaaatgcac cctccctcct gccgctcctc tctagaacct acct gggctgtgct L08; gcttttgagc ctcagacccc aggtcagcat ctcggttctg cgccacttcc tttgtgttta L14; tatggcgttt tgtctgtgtt gctgtttaga gtaaataaac tgtttatata aaggttttgg L20; ttgcattatt atcattgaaa gtgagaggag gcggcctccc agtgcccggc cctccccacc L26; cacctgcagc cccaccgcgg gccaggacca ggctctccat ctgcttcgga tgcacgcagg L32; ctgtgaggct ctgtcttgcc ctggatcttt gtaaacaggg ctgtgtacaa agtgctgctg L38; aggtttctgt gctccccgca tctgcgggct gtagagcgct gggcagctaa gatctgcata L44; ggtcgggatt ggcatcgaga ccctggcaac tgcaccggtg gtct tgggggccac L50; aaggccaggt ccagaccagg gctgggggct ggac tcctatccgg gcagcctgct L56; ggcgggggtt cccctcttca gtggccaggt cacagggatg gagctgcgct gtgcataggg L62; tgccacctca ggtgtctgtc ccttgtgtcc tcaggaggca gccttgctac cacccgtggc L68; aaacgccagg tgctttttct gggagagccc acagccgtgg 
ccctccaggg cttccccgac L74; cgcc aggtagaggg ccctgggcag cctgtgtctg gaattcttcg tcctgaggcc L80; acctgagtgt ggtctgtcct ggggaggctg tgcgcctcag cagccgtcct gacgctgagc L86; cctctgcaaa ggttgggccg gccaggcctc ttggggctgc ctgagccact gcaggaagtg L92; gcctggctgg gaagttgggt gccggtcacc tcccagcagg aaggcacagt ggacagagat L98; ccct gggggacaca gcccggtgct cccagccctc caacctctgg accc 204; agtctcccca tcctagcgag cttggccctc ctcagtttcg tttcaagcct tgga 210; gctggccctg ctgccctggc accccccggt ggctggagct ccgt ggcccaagtg 216; cagggtccca agagggcagg gcggggctcc ccaaaggagc aaagaatgca gggagggcgg 222; tccagggccc tgggaagggg agctcggcac cctccaggtc cgtgtgggac tccagccgct 228; tggg agtt agaggtgact tccaaaggcc ccccgagccg gcagtgcccc 234; ccaccacccc tccagcgact ctgcggtgcc agtgccttgt tggcttttcc ggctacgcac 240; cctgcagtca ctgagctctc ggtctgacgt ctgatgtttg tggtttgttt ataacacggg 246; gccttacctg gggaattcag ctggtttgaa tagc ccgctcccag aatgtcttat 252; tttgtaatga taca tttagtaata gttacacatg tatatggtta atacatatgg 258; aaattcaata tattttgtag ttaacgtatt ctgaagtaac ggatgtttct cgccaatcgt 2641 agtgacttca gctaacgaaa tgttcttttg tagtaccacg gtcctcggcc taacgaagga 2701 cgtgaacctt gtaagaggag gaaa cgcggtcacc tttgtttagt ggaagggaaa 2761 gtgtgttccc ggcatgaggt gcctcggaat tagtaaagaa ttgtgggcaa tggattaacc 2821 actgtatcta agaatccacc attaaagcat ttgcacagac aaaaaaaaaa a LOCUS NM_194259 (variant 2) AA/translation="MSGIALSRLAQ ERKAWRKDHPFGFVAVPTKNPDGTMN-WNW ‘JCA IPGKKGTPWTGGLFK .RMLFKDDYPSSPPKCKREPPLFHPVVYPSGTVCLSIL 1. 
*DK WRPAITIKQILLGIQ 4TJN PAQA EAYTIYCQNRV *Y *KRVRAQAKKFAPS CDNA: gcccgcgcca tcgg agc:gctctg gctgcgcgcg gagcgggctc cggagggaag 61 gaca aagggaagcg ccgccgccgc gctc ggtcctccac ctgtccgcta 121 ccgg ggctgcggcc gcccgaggct gccctgagga tctgtgtttg gtgaaaagga 181 gccaaattca cctgcagggc aggcggctct agcagcttca gaagcctggt gccctggcga 241 cactggacct gccttggctt ctttgatccc aaccccaccc ccgatttctg ctctgctgac 301 tggggaagtc atcgtgccac cctg agtgcgggcc tctcagagct ccttcgtccg 361 tgggtctgcc ggggactggg ccttgtctcc ctaacgagtg ccagggactt tgaacatgtc 421 ggggatcgcc ctcagcagac tcgcccagga agca tggaggaaag accacccatt 481 tggtttcgtg gctgtcccaa caaaaaatcc cgatggcacg atgaacctca tgaactggga 541 gtgcgccatt ccaggaaaga aagggactcc gtgggaagga ggcttgttta aactacggat 601 gcttttcaaa tatc catcttcgcc accaaaatgt aaattcgaac caccattatt 661 tcacccgaat gtgtaccctt ngggacagt gtgcctgtcc atcttagagg aggacaagga 721 ctggaggcca gccatcacaa tcaaacagat cctattagga atacaggaac atga 781 tatc caagacccag ctcaagcaga ggcctacacg atttactgcc gagt 841 cgag aaaagggtcc gagcacaagc caagaagttt tcat aagcagcgac 901 cttgtggcat cgtcaaaagg aagggattgg tttggcaaga acttgtttac aacatttttg 961 caaatctaaa gttgctccat acaatgacta gtcacctggg ggggttgggc gggcgccatc 1021 tgcc gccgcgggtg tgcggtctcg attcgctgaa ttgcccgttt ccatacaggg 1081 tctcttcctt cggtcttttg tatttttgat tgttatgtaa aactcgcttt tattttaata 1141 ttgatgtcag aact gctgtaaaat tataaacttt tatacttggg taagtccccc 1201 aggggcgagt tcctcgctct gggatgcagg catgcttctc accgtgcaga gctgcacttg 1261 gcctcagctg gctgtatgga aatgcaccct ccctcctgcc gctcctctct agaaccttct 1321 agaacctggg ctgtgctgct tttgagcctc agaccccagg tcagcatctc ggttctgcgc 1381 cacttccttt gtgtttatat ttgt ctgtgttgct gtttagagta aataaactgt L44; ttatataaag gttttggttg cattattatc attgaaagtg agaggaggcg gcctcccagt L50; gcccggccct ccccacccac ctgcagcccc accgcgggcc aggaccaggc tctccatctg L56; cttcggatgc acgcaggctg tctg tcttgccctg gatctttgta aacagggctg L62; tgtacaaagt gctgctgagg tttctgtgct ccccgcatct gcgggctgta gagcgctggg L68; cagctaagat ctgcataggt cgggattggc atcgagaccc tggcaactgc 
accggtgcca L74; gctgtcttgg gggccacaag gccaggtcca gaccagggct tgcc tgaggactcc L80; tatccgggca gcctgctggc gggggttccc ctcttcagtg gccaggtcac agggatggag L86; ctgcgctgtg catagggtgc cacctcaggt gtctgtccct tgtgtcctca ggaggcagcc L92; ttgctaccac ccgtggcaaa cgccaggtgc tttttctggg agagcccaca gccgtggccc L98; tccagggctt ccccgaccct tagcgccagg tagagggccc tgggcagcct gtgtctggaa 204; gtcc cacc tgagtgtggt ctgtcctggg gaggctgtgc gcctcagcag 210; ccgtcctgac gctgagccct aggt tgggccggcc aggcctcttg gggctgcctg 216; tgca ggaagtggcc tggctgggaa gttgggtgcc ggtcacctcc cagcaggaag 222; gcacagtgga cagagatggg aagccctggg ggacacagcc cggtgctccc ccaa 228; cctctggctc ccaacccagt ctccccatcc tagcgagctt ggccctcctc agtttcgttt 234; caagccttgg ggctggagct ggccctgctg ccctggcacc ccccggtggc tggagctggg 240; tccccgtggc ccaagtgcag aaga gggcagggcg gggctcccca caaa 246; gaatgcaggg agggcggtcc agggccctgg gaaggggagc ccct ccgt 252; gtgggactcc agccgctgtt ggctgggaat cgaagttaga ggtgacttcc aaaggccccc 258; cgagccggca gtgcccccca ccacccctcc agcgactctg cagt gccttgttgg 264; cttttccggc tacgcaccct gcagtcactg agctctcggt ctgacgtctg atgtttgtgg 270; tata acacggggcc ttacctgggg aattcagctg gtttgaatat ttgtagcccg 276; ctcccagaat gtcttatttt actg aactacattt agtaatagtt acacatgtat 282; atggttaata catatggaaa atat tttgtagtta acgtattctg aagtaacgga 288; tgtttctcgc caatcgtagt gacttcagct aacgaaatgt tcttttgtag taccacggtc 294; ctcggcctaa cgaaggacgt gaaccttgta agaggagagc tctgaaacgc ggtcaccttt 300; gtttagtgga agggaaagtg tgttcccggc atgaggtgcc tcggaattag taaagaattg 306; tgggcaatgg attaaccact aaga atccaccatt tttg cacagacaaa 312; aaaaaaaa LOCUS NM_194260 (variant 3) AA/translation="MSGIALSRLAQ ERKAWRKDHPFGFVAVPTKNPDGTMN .WNW ‘JCA IPGKKGTPWTGGLFK .RMLFKDDYPSSPPKCKREPPLFHPVVYPSGTVCLSILA. 
*DKD WRPAITIKQILLGIQ 4TJN *PNIQDPAQA 3AYTIYCQNRV4Y QAKKFAPS CDNA: l aac :cgcggg agcgtcaccg tcctgcgacg c :tcagagga tcc: :aggCC tcagtggtct 6; ttgacccccg gccccaggac ctgaccccaa ggaaacctcc gggacc:gtg gctggagagg 12; tgaccgccag gcatccgggg agcctttgga gatctcggct tccttt:tcc cccgctgctt 18; gccggcgtgt cctcgggtgg acgcgggcag cccgaagggg agtttacaga cgctccctca 24; catcggggac gcggctcctt taagggcgga ctttgaacat gtcggggatc gccctcagca ; gactcgccca ggagaggaaa gcatggagga accc atttggtttc gtggctgtcc 36; caacaaaaaa tcccgatggc acgatgaacc tcatgaactg ggagtgcgcc attccaggaa 42; agaaagggac tccgtgggaa ttgt ttaaactacg gatgcttttc aaagatgatt 48; atccatcttc gccaccaaaa tgtaaattcg aaccaccatt atttcacccg aatgtgtacc 54; cttcggggac agtgtgcctg tccatcttag aggaggacaa ggactggagg ccagccatca 60; caatcaaaca gatcctatta ggaatacagg aacttctaaa tgaaccaaat atccaagacc 66; cagctcaagc agaggcctac acgatttact gccaaaacag agtggagtac gagaaaaggg 72; tccgagcaca gaag tttgcgccct cataagcagc gaccttgtgg catcgtcaaa 78; aggaagggat ggca agaacttgtt tacaacattt ttgcaaatct aaagttgctc 84; atga ctagtcacct gggggggttg ggcgggcgcc atcttccatt gccgccgcgg 90; gtgtgcggtc tcgattcgct gaattgcccg tttccataca gggtctcttc cttcggtctt 96; ttgtattttt gattgttatg taaaactcgc ttta atattgatgt cagtatttca L02; actgctgtaa aaac ttttatactt gggtaagtcc cccaggggcg agttcctcgc L08; tctgggatgc aggcatgctt ctcaccgtgc agagctgcac ttggcctcag ctggctgtat L14; ggaaatgcac cctccctcct cctc tctagaacct tctagaacct gggctgtgct L20; gagc ctcagacccc aggtcagcat ctcggttctg cgccacttcc tttgtgttta L26; tatggcgttt tgtctgtgtt gctgtttaga gtaaataaac tgtttatata aaggttttgg L32; ttgcattatt atcattgaaa gtgagaggag gcggcctccc agtgcccggc cctccccacc L38; cacctgcagc gcgg gccaggacca ccat cgga tgcacgcagg L44; ctgtgaggct tgcc ctggatcttt gtaaacaggg ctgtgtacaa agtgctgctg L50; aggtttctgt gctccccgca tctgcgggct cgct gggcagctaa gatctgcata L56; ggtcgggatt gaga ccctggcaac tgcaccggtg ccagctgtct tgggggccac L62; aaggccaggt ccagaccagg gctgggggct gcctgaggac tcctatccgg tgct L68; ggcgggggtt cccctcttca gtggccaggt gatg gagctgcgct aggg 
174; tgccacctca ggtgtctgtc ccttgtgtcc tcaggaggca gccttgctac cacccgtggc 180; aaacgccagg tgctttttct gggagagccc acagccgtgg ccctccaggg cgac 186; ccttagcgcc aggtagaggg gcag tctg gaattcttcg tcctgaggcc 192; acctgagtgt ggtctgtcct ggggaggctg tgcgcctcag cagccgtcct gacgctgagc 198; cctctgcaaa ggttgggccg gccaggcctc ttggggctgc ctgagccact gcaggaagtg 204; gcctggctgg gaagttgggt gccggtcacc cagg aaggcacagt ggacagagat 210; gggaagccct caca gcccggtgct cccagccctc caacctctgg ctcccaaccc 216; agtctcccca tcctagcgag cttggccctc ttcg tttcaagcct tggggctgga 222; gctggccctg ctgccctggc accccccggt ggctggagct gggtccccgt ggcccaagtg 228; cagggtccca agagggcagg ctcc ccaaaggagc aaagaatgca gggagggcgg 234; tccagggccc tgggaagggg agctcggcac cctccaggtc cgtgtgggac tccagccgct 240; gttggctggg aatcgaagtt agaggtgact tccaaaggcc gccg cccc 246; cccc tccagcgact ctgcggtgcc agtgccttgt tggcttttcc ggctacgcac 252; cctgcagtca ctgagctctc ggtctgacgt ctgatgtttg tggtttgttt ataacacggg 258; gccttacctg gggaattcag ctggtttgaa tatttgtagc ccgctcccag aatgtcttat 264; tttgtaatga ctgaactaca tttagtaata gttacacatg tatatggtta atacatatgg 270; aaattcaata tattttgtag tatt ctgaagtaac ttct cgccaatcgt 276; agtgacttca gctaacgaaa tgttcttttg tagtaccacg gtcctcggcc taacgaagga 282; cgtgaacctt gtaagaggag agctctgaaa cgcggtcacc tttgtttagt ggaagggaaa 288; gtgtgttccc ggcatgaggt gaat tagtaaagaa ttgtgggcaa tggattaacc 294; actgtatcta agaatccacc gcat ttgcacagac aaaaaaaaaa a LOCUS NM_194261 (variant 4) AA/translation="MSGIALSRLAQ 3RKAWRKDHPFGFVAVPTKNPDGTMN-WNW ‘JCA IPGKKGTPWTGGLFK .RMLFKDDYPSSPPKCKREPPLFHPVVYPSGTVCLSIL A. 
*DKD WRPAITIKQILLGIQ 4TJN *PNIQDPAQA CQNRV*Y*KRVRAQAKKFAPS CDNA: 1 aactcgcggg agcgtcaccg tcctgcgacg cttcagagga tccttaggcc tcagtggtct 61 ttgacccccg gccccaggac ctgaccccaa ggaaacctcc gggacctgtg gctggagagg 121 gactttgaac atgtcgggga tcgccctcag cagactcgcc agga aagcatggag 181 gaaagaccac ccatttggtt tcgtggctgt cccaacaaaa aatcccgatg gcacgatgaa 241 cctcatgaac tgggagtgcg ccattccagg aaagaaaggg actccgtggg aaggaggctt 301 gtttaaacta cggatgcttt tcaaagatga ttatccatct tcgccaccaa aatgtaaatt 36; acca ttatttcacc cgaatgtgta cccttcgggg acagtgtgcc tgtccatctt 42; agaggaggac aaggactgga ggccagccat cacaatcaaa cagatcctat taggaataca 48; ggaacttcta aatgaaccaa atatccaaga tcaa gcagaggcct acacgattta 54; ctgccaaaac agagtggagt acgagaaaag ggtccgagca caagccaaga agtttgcgcc 60; ctcataagca gcgaccttgt ggcatcgtca aaaggaaggg attggtttgg cttg 66; tttacaacat ttttgcaaat ctaaagttgc tccatacaat gactagtcac gggt 72; tgggcgggcg ccatcttcca ttgccgccgc gggtgtgcgg tctcgattcg ctgaattgcc 78; cata cagggtctct tccttcggtc ttttgtattt ttgattgtta actc 84; gcttttattt taatattgat gtcagtattt caactgctgt ataa acttttatac 90; ttgggtaagt cccccagggg cgagttcctc gctctgggat gcaggcatgc ttctcaccgt 96; gcagagctgc acttggcctc agctggctgt atggaaatgc accctccctc ctgccgctcc L02; tctctagaac gaac ctgggctgtg ctgcttttga gcctcagacc ccaggtcagc L08; gttc tgcgccactt cctttgtgtt tatatggcgt tttgtctgtg ttgctgttta L14; gagtaaataa actgtttata taaaggtttt ggttgcatta ttatcattga aagtgagagg L20; aggcggcctc ccagtgcccg gccctcccca cccacctgca gccccaccgc gggccaggac L26; caggctctcc atctgcttcg gatgcacgca ggctgtgagg ctctgtcttg ccctggatct L32; ttgtaaacag ggctgtgtac aaagtgctgc tgaggtttct gtgctccccg catctgcggg L38; ctgtagagcg ctgggcagct aagatctgca taggtcggga ttggcatcga gaccctggca L44; actgcaccgg tgccagctgt cttgggggcc acaaggccag gtccagacca gggctggggg L50; ctgcctgagg actcctatcc gggcagcctg ctggcggggg ttcccctctt cagtggccag L56; gtcacaggga tggagctgcg ctgtgcatag ggtgccacct caggtgtctg tcccttgtgt L62; cctcaggagg cagccttgct accacccgtg gcaaacgcca ggtgcttttt ctgggagagc L68; ccacagccgt ggccctccag 
ggcttccccg acccttagcg ccaggtagag ggccctgggc L74; tgtc tggaattctt cgtcctgagg ccacctgagt gtggtctgtc ctggggaggc L80; tgtgcgcctc agcagccgtc ctgacgctga tgca aaggttgggc cggccaggcc L86; tcttggggct gcctgagcca ctgcaggaag ggct gggaagttgg gtgccggtca L92; cctcccagca caca gtggacagag atgggaagcc ctgggggaca cagcccggtg L98; ctcccagccc tccaacctct ggctcccaac ccagtctccc catcctagcg agcttggccc 204; tcctcagttt cgtttcaagc cttggggctg gagctggccc tgctgccctg gcaccccccg 210; ggag ctgggtcccc gtggcccaag gtcc caagagggca gggcggggct 216; ccccaaagga aatg cagggagggc ggtccagggc cctgggaagg ggagctcggc 222; accctccagg tccgtgtggg actccagccg ctgttggctg ggaatcgaag ttagaggtga 228; cttccaaagg ccccccgagc cggcagtgcc ccccaccacc cctccagcga ctctgcggtg 234; ccagtgcctt gttggctttt ccggctacgc accctgcagt cactgagctc tcggtctgac 240; gtctgatgtt tgtggtttgt ttataacacg gggccttacc tggggaattc agctggtttg 246; aatatttgta gcccgctccc agaatgtctt attttgtaat gactgaacta catttagtaa 252; tagttacaca tgtatatggt taatacatat tcaa tatattttgt agttaacgta 258; ttctgaagta acggatgttt ctcgccaatc gtagtgactt cagctaacga aatgttcttt 264; tgtagtacca cggtcctcgg cctaacgaag gacgtgaacc ttgtaagagg agagctctga 270; aacgcggtca cctttgttta gtggaaggga aagtgtgttc tgag cgga 276; aaag gggc aatggattaa ccactgtatc taagaatcca ccattaaagc 282; atttgcacag acaaaaaaaa aaa We

Claims

claim:

1. A method for identifying a modulator of a biological system, the method comprising: establishing a model for the biological system using cells associated with the biological system to represent a characteristic aspect of the biological system; obtaining a first data set from the model, wherein the first data set represents global proteomic changes in the cells associated with the biological system; obtaining a second data set from the model, wherein the second data set represents one or more functional activities or cellular responses of the cells associated with the biological system, wherein said one or more functional activities or cellular responses of the cells comprises global enzymatic activity and/or an effect of the global enzyme activity on the enzyme metabolites or substrates in the cells associated with the biological system; generating a causal relationship network among the global proteomic changes and the one or more functional activities or cellular responses based solely on the first and second data sets using a programmed computing device, wherein the causal relationship network is a Bayesian network of causal relationships including quantitative probabilistic directional information regarding relationships among the global proteomic changes and the one or more functional activities or cellular responses; and identifying, from the causal relationship network, a causal relationship unique in the biological system, wherein at least one enzyme associated with the unique causal relationship is identified as a modulator of the biological system.

2. The method of claim 1, wherein the first data set further represents lipidomic data characterizing the cells associated with the biological system.

3. The method of claim 2, wherein the causal relationship network is generated among the global proteomic changes, lipidomic data, and the one or more functional activities or cellular responses of the cells, wherein said one or more functional activities or cellular responses of the cells comprises global enzymatic activity and/or the effect of the global enzymatic activity on at least one enzyme metabolite or substrate.

4. The method of claim 1, wherein the first data set further represents one or more of lipidomic, metabolomic, transcriptomic, genomic and SNP data characterizing the cells associated with the biological system.

5. The method of claim 1, wherein the first data set further represents two or more of lipidomic, metabolomic, transcriptomic, genomic and SNP data characterizing the cells associated with the biological system.

6. The method of claims 4 or 5, wherein the causal relationship network is generated among the global proteomic changes, the one or more of lipidomic, metabolomic, transcriptomic, genomic, and SNP data, and the one or more functional activities or cellular responses of the cells, wherein said one or more functional activities or cellular responses of the cells comprises global enzymatic activity and/or the effect of the global enzymatic activity on at least one enzyme metabolite or substrate.

7. The method of any one of claims 1-6, wherein the global enzyme activity comprises global kinase activity.

8. The method of any one of claims 1-7, wherein the effect of the global enzyme activity on the enzyme metabolites or substrates comprises the phospho proteome of the cells.

9. The method of any one of claims 1-8, wherein the second data set representing one or more functional activities or cellular responses of the cell further comprises one or more of bioenergetics, cell proliferation, apoptosis, organellar function, cell migration, tube formation, chemotaxis, extracellular matrix degradation, sprouting, and a genotype-phenotype associate actualized by functional models selected from ATP, ROS, OXPHOS, and Seahorse assays.

10. The method of claim 1-8, wherein the causal relationship network is generated among the global proteomic changes, the one or more of lipidomic, metabolomic, transcriptomic, genomic, and SNP data, and the one or more functional activities or cellular responses of the cells, wherein said one or more functional activities or cellular responses of the cells comprises global enzymatic activity and/or the effect of the global enzymatic activity on at least one enzyme metabolite or substrate and further comprises one or more of bioenergetics, cell proliferation, apoptosis, organellar function, cell migration, tube formation, chemotaxis, extracellular matrix degradation, sprouting, and a genotype-phenotype associate actualized by functional models selected from ATP, ROS, OXPHOS, and Seahorse assays.

11. The method of any one of claims 1-10, wherein the model of the biological system comprises an in vitro culture of cells associated with the biological system, optionally further comprising a matching in vitro culture of control cells.

12. The method of claim 11, wherein the in vitro culture of the cells is subject to an environmental perturbation, and the in vitro culture of the matching control cells is identical cells not subject to the environmental perturbation.

13. The method of claim 12, wherein the environmental perturbation comprises one or more of contact with a bioactive agent, a change in culture condition, introduction of a genetic modification / mutation, and introduction of a vehicle that causes a genetic modification / mutation.

14. The method of claim 13, wherein the environmental perturbation comprises contacting the cells with an enzymatic activity inhibitor.

15. The method of claim 14, wherein the enzymatic activity inhibitor is a kinase inhibitor.

16. The method of claim 13, wherein the environmental perturbation comprises contacting the cells with CoQ10.

17. The method of claim 14, wherein the environmental perturbation further comprises contacting the cells with CoQ10.

18. The method of claim 1, wherein the generating step is carried out by an artificial intelligence (AI)-based informatics platform.

19. The method of claim 18, wherein the AI-based informatics platform receives all data input from the first and second data sets without applying a statistical cut-off point.

20. The method of claim 1, wherein the causal relationship network established in the generating step is further refined to a simulation causal relationship network, before the identifying step, by in silico simulation based on input data, to provide a confidence level of prediction for one or more causal relationships within the causal relationship network.

21. The method of claim 11, wherein identifying a causal relationship unique in the biological system comprises: generating a differential causal relationship network from the causal relationship network model and a second causal relationship network model based on matching control cell data, and identifying a causal relationship unique in the biological system from the generated differential causal relationship network that is uniquely present in cells associated with the biological system, and absent in the matching control cells.

22. The method of claim 12, wherein identifying a causal relationship unique in the biological system comprises: generating a differential causal relationship network from the causal relationship network model and a second causal relationship network model based on matching control cell, and identifying a causal relationship unique in the biological system from the generated differential causal relationship network that is uniquely present in cells subject to the environmental perturbation, and absent in the matching control cells.

23. The method of any one of claims 1 to 22, wherein the unique causal relationship identified is a relationship between at least one pair selected from the group consisting of expression of a gene and level of a lipid; expression of a gene and level of a transcript; expression of a gene and level of a metabolite; expression of a first gene and expression of a second gene; expression of a gene and presence of a SNP; expression of a gene and a functional activity; level of a lipid and level of a transcript; level of a lipid and level of a metabolite; level of a first lipid and level of a second lipid; level of a lipid and presence of a SNP; level of a lipid and a functional activity; level of a first transcript and level of a second transcript; level of a transcript and level of a metabolite; level of a transcript and presence of a SNP; level of a first transcript and level of a functional activity; level of a first metabolite and level of a second metabolite; level of a metabolite and presence of a SNP; level of a metabolite and a functional activity; presence of a first SNP and presence of a second SNP; and presence of a SNP and a functional activity.

24. The method of any one of claims 1 to 23, wherein the unique causal relationship identified is a relationship between at least a level of a lipid, expression of a gene, and one or more functional activities wherein the functional activity is a global kinase activity.

25. The method of claim 1, wherein the biological system is a disease process; wherein a model for the disease process is established using disease related cells to represent a characteristic aspect of the disease process; wherein the first data set represents global proteomic changes in the disease related cells; wherein the second data set represents one or more functional activities or cellular responses of the disease related cells; wherein said one or more functional activities or cellular responses of the cells comprise global enzyme activity and/or an effect of the global enzyme activity on the enzyme metabolites or substrates in the disease related cells; and wherein identifying, from the causal relationship network, a causal relationship unique in the biological system comprises identifying a causal relationship unique in the disease process, wherein at least one enzyme associated with the unique causal relationship is identified as a modulator of the disease process.

26. The method of claim 25, wherein the first data set further represents lipidomic data characterizing the disease related cells.

27. The method of claim 26, wherein the causal relationship network is generated among the global proteomic changes, lipidomic data, and the one or more functional activities or cellular responses of the cells, wherein said one or more functional activities or cellular responses of the cells comprises global enzymatic activity and/or an effect of the global enzyme activity on the enzyme metabolites or substrates in the disease related cells.

28. The method of claim 25, wherein the first data set further represents one or more of lipidomic, metabolomic, transcriptomic, genomic and SNP data characterizing the disease related cells.

29. The method of claim 28, wherein the first data set further represents two or more of lipidomic, metabolomic, transcriptomic, genomic and SNP data characterizing the disease related cells.

30. The method of claim 28 or 29, wherein the causal relationship network is generated among the global proteomic changes, the one or more of lipidomic, metabolomic, transcriptomic, genomic and SNP data, and the one or more functional activities or cellular responses of the cells, wherein said one or more functional activities or cellular responses of the cells comprises global enzymatic activity and/or the effect of the global enzymatic activity on at least one enzyme metabolite or substrate in the disease related cells.

31. The method of any one of claims 25 to 30, wherein the global enzyme activity comprises global kinase activity, and wherein the effect of the global enzyme activity on the enzyme metabolites or substrates comprises the phospho proteome of the cells.

32. The method of any one of claims 25 to 31, wherein the second data set representing one or more functional activities or cellular responses of the cells further comprises one or more of bioenergetics, cell proliferation, apoptosis, organellar function, cell migration, tube formation, chemotaxis, extracellular matrix degradation, sprouting, and a genotype-phenotype associate actualized by functional models selected from ATP, ROS, OXPHOS, and Seahorse assays.

33. The method of claim 32, wherein the causal relationship network is generated among the global proteomic changes, the one or more of lipidomic, metabolomic, transcriptomic, genomic and SNP data, and the one or more functional activities or cellular responses of the cells, wherein said one or more functional activities or cellular responses of the cells comprises one or more of bioenergetics, cell proliferation, apoptosis, organellar function, cell migration, tube formation, chemotaxis, extracellular matrix degradation, sprouting, and a genotype-phenotype associate actualized by functional models selected from ATP, ROS, OXPHOS, and Seahorse assays.

34. The method of any of claims 25 to 33, wherein the disease process is cancer, diabetes, obesity, cardiovascular disease, age related macular degeneration, diabetic retinopathy, or inflammatory disease.

35. The method of any of claims 25 to 33, wherein the disease process comprises angiogenesis.

36. The method of any of claims 25 to 33, wherein the disease process comprises hepatocellular carcinoma, lung cancer, breast cancer, prostate cancer, melanoma, carcinoma, sarcoma, lymphoma, leukemia, squamous cell carcinoma, colorectal cancer, pancreatic cancer, thyroid cancer, endometrial cancer, bladder cancer, kidney cancer, a solid tumor, leukemia, non-Hodgkin lymphoma, or a drug-resistant cancer.

37. The method of any of claims 25 to 33, wherein the disease model comprises an in vitro culture of disease cells, optionally further comprising a matching in vitro culture of control or normal cells.

38. The method of claim 37, wherein the in vitro culture of the disease cells is subject to an environmental perturbation, and the in vitro culture of the matching control cells is identical disease cells not subject to the environmental perturbation.

39. The method of claim 38, wherein the environmental perturbation comprises one or more of contact with a bioactive agent, a change in culture condition, introduction of a genetic modification / mutation, and introduction of a vehicle that causes a genetic modification / mutation.

40. The method of claim 39, wherein the environmental perturbation comprises contacting the cells with an enzymatic activity inhibitor.

41. The method of claim 40, wherein the enzymatic activity inhibitor is a kinase inhibitor.

42. The method of claim 39, wherein the environmental perturbation comprises contacting the cells with CoQ10.

43. The method of claim 25, wherein the characteristic aspect of the disease process comprises a hypoxia condition, a hyperglycemic condition, a lactic acid rich culture condition, or combinations thereof.

44. The method of claim 25, wherein the generating step is carried out by an artificial intelligence (AI)-based informatics platform.

45. The method of claim 25, wherein the AI-based informatics platform receives all data input from the first and second data sets without applying a statistical cut-off point.

46. The method of claim 25, wherein the causal relationship network established in the generating step is further refined to a simulation causal relationship network, before the identifying step, by in silico simulation based on input data, to provide a confidence level of prediction for one or more causal relationships within the causal relationship network.

47. The method of claim 37, wherein the unique causal relationship is identified as part of a differential causal relationship network that is uniquely present in model of disease cells, and absent in the matching control cells.

48. The method of claim 38, wherein the unique causal relationship is identified as part of a differential causal relationship network that is uniquely present in cells subject to environmental perturbation, and absent in the matching control cells.

49. The method of claim 1, wherein the first data set further represents one or more of lipidomic, metabolomic, transcriptomic, genomic, and SNP data characterizing the cells associated with the biological system; wherein said global enzymatic activity and/or an effect of the global enzyme activity on the enzyme metabolites or substrates in the cells associated with the biological system is global kinase activity and/or an effect of the global kinase activity on the kinase metabolites or substrates in the cells associated with the biological system; wherein the causal relationship network is generated among the global proteomic changes, the one or more of lipidomic, metabolomic, transcriptomic, genomic, and SNP data, and the one or more functional activities or cellular responses based solely on the first and second data sets using a programmed computing device; and wherein at least one kinase associated with the unique causal relationship is identified as a modulator of the biological system.

Berg LLC By the Attorneys for the Applicant SPRUSON & FERGUSON
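
Illustrative note (not part of the claims): the differential causal relationship network recited in claims 21, 22, 47 and 48 can be pictured as a set difference between two edge lists, one inferred from the cells of interest and one from the matching control cells. The following is a minimal sketch under that assumption only. The function name `differential_network`, the edge representation as (cause, effect) pairs, and all node names are hypothetical and do not come from the specification; the sketch also ignores edge strength, confidence levels, and how the networks are learned.

```python
from typing import Iterable, Set, Tuple

# A directed causal edge written as (cause, effect).
# This representation is an assumption made for illustration.
Edge = Tuple[str, str]


def differential_network(system_edges: Iterable[Edge],
                         control_edges: Iterable[Edge]) -> Set[Edge]:
    """Return edges present in the system network and absent in the control network."""
    return set(system_edges) - set(control_edges)


# Hypothetical example networks (not data from the specification).
disease_network = {
    ("kinase_A_activity", "protein_X_level"),
    ("lipid_L_level", "ATP_level"),
    ("gene_G_expression", "protein_X_level"),
}
control_network = {
    ("gene_G_expression", "protein_X_level"),
    ("lipid_L_level", "ROS_level"),
}

# Edges unique to the disease model are candidate "unique causal relationships".
unique_edges = differential_network(disease_network, control_network)
for cause, effect in sorted(unique_edges):
    print(f"{cause} -> {effect}")
```

In this toy example, the edge shared by both networks is dropped, and the two edges found only in the disease network remain as candidates for further review.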