CN112289450A - Prediction system for prognosis survival period of intrahepatic cholangiocellular carcinoma patient - Google Patents

Info

Publication number
CN112289450A
Authority
CN
China
Prior art keywords
seq
prediction
prediction system
survival
years
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011555919.9A
Other languages
Chinese (zh)
Other versions
CN112289450B (en)
Inventor
孙德强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Gaomei Biotechnology Co.,Ltd.
Original Assignee
Jiangsu Gaomei Gene Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Gaomei Gene Technology Co ltd
Priority to CN202011555919.9A
Publication of CN112289450A
Application granted
Publication of CN112289450B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/118: Prognosis of disease development
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/154: Methylation markers

Landscapes

  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Medical Informatics (AREA)
  • Pathology (AREA)
  • Public Health (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • General Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Analytical Chemistry (AREA)
  • Genetics & Genomics (AREA)
  • Epidemiology (AREA)
  • Immunology (AREA)
  • Databases & Information Systems (AREA)
  • Primary Health Care (AREA)
  • Wood Science & Technology (AREA)
  • Zoology (AREA)
  • Biomedical Technology (AREA)
  • Microbiology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biochemistry (AREA)
  • Molecular Biology (AREA)
  • Biotechnology (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Oncology (AREA)
  • Hospice & Palliative Care (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The application provides a system for predicting the prognostic survival period of a patient with intrahepatic cholangiocellular carcinoma. The system comprises an acquisition module and a prediction module, and the acquisition module is connected to the prediction module wirelessly and/or by wire. The system can predict the prognostic survival period using the promoter methylation score as the sole variable, or using multiple variables that include the promoter methylation score, and offers high accuracy, strong reliability and stable prediction.

Description

Prediction system for prognosis survival period of intrahepatic cholangiocellular carcinoma patient
Technical Field
The application relates to the technical field of bioinformatics, in particular to a prediction system for prognosis survival period of intrahepatic cholangiocellular carcinoma patients.
Background
Intrahepatic cholangiocellular carcinoma (ICC) accounts for 4.4% to 12.0% of primary liver cancers and is the second most common primary liver malignancy, with a poor prognosis. Complete surgical resection is the only chance of long-term survival for ICC patients who are eligible for resection, and overall five-year survival after surgery is approximately 25.0% to 39.8% in selected patients.
Because of its highly aggressive biological behavior and the lack of specific symptoms and signs, most ICC patients present with relatively advanced disease at the initial visit, so only a few have the opportunity to undergo surgical resection and survive long term. Moreover, even potentially curative treatments (e.g., liver transplantation, liver resection, radiofrequency ablation) are followed by a high recurrence rate, which leads to poor prognosis. Even after radical resection, 57.9% to 73.4% of ICC patients relapse, and 41.3% to 42.5% die as a result.
For ICC patients who can be treated by surgical resection, accurate prediction of the prognostic survival period has important clinical, scientific and social value. It can guide doctors in designing personalized examination and treatment plans for high-risk ICC patients, help patients follow a treatment plan that avoids over-treatment, and effectively reduce the risk of relapse. It can also provide important evidence for developing novel ICC treatment regimens.
At present, ICC clinical practice has no ideal means of predicting prognostic survival, and a comprehensive, reliable and stable prediction system for the prognostic survival of ICC patients is urgently needed.
Disclosure of Invention
The application provides a system for predicting the prognostic survival period of a patient with intrahepatic cholangiocellular carcinoma (ICC), which is used to accurately predict the prognostic survival period of an ICC patient and to provide guidance for medical services.
A system for predicting prognosis of survival of a patient with intrahepatic cholangiocellular carcinoma, comprising:
the acquisition module is used for acquiring variables related to the prognosis survival period of the intrahepatic cholangiocellular carcinoma patient; and
the prediction module is used for predicting a prognosis parameter of the intrahepatic cholangiocellular carcinoma patient according to the at least one variable obtained by the acquisition module and outputting the prognosis parameter;
wherein the variables comprise promoter methylation scores that are based on methylation levels of at least 24 promoter regions; the 24 promoter regions are a set of polynucleotide sequences shown in SEQ ID No.1 to SEQ ID No.24 or a set of polynucleotide sequences complementary to SEQ ID No.1 to SEQ ID No. 24;
the acquisition module and the prediction module are connected in a wireless and/or wired mode.
In some embodiments of the present application, the acquisition module is used for obtaining a promoter methylation score, and the acquisition module comprises:
an analysis unit, used at least for analyzing the methylation levels of the 24 promoter regions in an ex vivo sample of a patient with intrahepatic cholangiocellular carcinoma; and
a scoring unit, used for calculating a promoter methylation score based on at least the methylation levels of the 24 promoter regions;
wherein the analysis unit is connected to the scoring unit in a wired and/or wireless mode.
In some embodiments of the present application, the acquisition module further comprises an output unit for transmitting the promoter methylation score to the prediction module; the output unit is connected to the scoring unit in a wired and/or wireless mode.
In some embodiments of the present application, the promoter methylation score is calculated according to the following formula (1):
promoter methylation score =
    (-1.778426) × M(SEQ ID No. 1)
  + (-0.5188023) × M(SEQ ID No. 2)
  + (-0.007917956) × M(SEQ ID No. 3)
  + (-4.853461) × M(SEQ ID No. 4)
  + (0.442986) × M(SEQ ID No. 5)
  + (-1.512141) × M(SEQ ID No. 6)
  + (-0.503913) × M(SEQ ID No. 7)
  + (-2.622882) × M(SEQ ID No. 8)
  + (-5.796018E-14) × M(SEQ ID No. 9)
  + (-2.918528) × M(SEQ ID No. 10)
  + (-1.456336) × M(SEQ ID No. 11)
  + (-0.02070397) × M(SEQ ID No. 12)
  + (-0.4516687) × M(SEQ ID No. 13)
  + (-0.5262961) × M(SEQ ID No. 14)
  + (-0.01533871) × M(SEQ ID No. 15)
  + (-0.3580827) × M(SEQ ID No. 16)
  + (…) × M(SEQ ID No. 17)
  + (-0.03428591) × M(SEQ ID No. 18)
  + (-0.183213) × M(SEQ ID No. 19)
  + (-0.6439997) × M(SEQ ID No. 20)
  + (-0.9643823) × M(SEQ ID No. 21)
  + (-0.4090321) × M(SEQ ID No. 22)
  + (-1.727257E-14) × M(SEQ ID No. 23)
  + (-0.1818805) × M(SEQ ID No. 24)        (1)
where "×" denotes multiplication, and M(SEQ ID No. n) denotes the average methylation level of all CpG dinucleotides contained in the corresponding polynucleotide sequence.
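As a minimal sketch of how formula (1) could be evaluated, the score is a weighted sum of the per-region average methylation levels. The function name `promoter_methylation_score`, the dictionary layout and the input levels are hypothetical, and only the first three coefficients of formula (1) are used to keep the example short, so the value it produces is not a real score.

```python
from typing import Mapping

# First three coefficients of formula (1), keyed by SEQ ID number.
# A real score needs all 24 coefficients; this subset is illustrative only.
COEFFICIENTS = {1: -1.778426, 2: -0.5188023, 3: -0.007917956}


def promoter_methylation_score(
    methylation: Mapping[int, float],
    coefficients: Mapping[int, float] = COEFFICIENTS,
) -> float:
    """Return the sum of coefficient * M over every region with a coefficient."""
    missing = set(coefficients) - set(methylation)
    if missing:
        raise ValueError(f"missing methylation levels for SEQ ID No. {sorted(missing)}")
    return sum(coef * methylation[seq_id] for seq_id, coef in coefficients.items())


# Hypothetical average methylation levels (fractions between 0 and 1).
example_levels = {1: 0.5, 2: 0.2, 3: 1.0}
example_score = promoter_methylation_score(example_levels)
```

Passing a full 24-entry coefficient mapping as the second argument would evaluate the complete formula in the same way.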
In some embodiments of the present application, the variables further include: ascites status, tumor size, macrovascular invasion status, lymph node metastasis status, degree of tumor differentiation, and carbohydrate antigen 19-9 (CA19-9) concentration.
In some embodiments of the present application, the prognostic parameter includes a prognostic survival risk grouping and/or a survival probability value for a specified number of years.
In some embodiments of the present application, the prognostic survival risk grouping assigns the intrahepatic cholangiocellular carcinoma patient a prognostic class based on a preset threshold. For example: if the promoter methylation score is greater than or equal to the preset threshold, the patient is assigned to the high-risk group; if the promoter methylation score is smaller than the preset threshold, the patient is assigned to the low-risk group. As another example: if the promoter methylation score is greater than the preset threshold, the patient is assigned to the high-risk group; if it is equal to the preset threshold, the patient is assigned to the middle-risk group; and if it is smaller than the preset threshold, the patient is assigned to the low-risk group.
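The two grouping rules described above can be sketched as follows. The function names and the threshold used in the usage lines are hypothetical; the source does not state a threshold value at this point.

```python
def risk_group_two_level(score: float, threshold: float) -> str:
    """Two-group rule: at or above the threshold is high risk, below is low risk."""
    return "high" if score >= threshold else "low"


def risk_group_three_level(score: float, threshold: float) -> str:
    """Three-group rule: above is high, equal is middle, below is low risk."""
    if score > threshold:
        return "high"
    if score < threshold:
        return "low"
    return "middle"


# Hypothetical threshold and scores, for illustration only.
THRESHOLD = -1.0
groups = [risk_group_two_level(s, THRESHOLD) for s in (-0.5, -1.0, -1.5)]
```

With a continuous score, exact equality with the threshold is rare, so the middle-risk group of the three-level rule would usually be empty unless scores are rounded first.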
In some embodiments of the present application, the survival probability value for the specified number of years comprises: the probability that the prognostic survival period reaches three years, and the probability that the prognostic survival period reaches five years.
In some embodiments of the present application, the prediction module predicts and outputs the prognostic parameter as follows: the acquired variables are imported into a pre-established Cox regression model, the prognostic parameter is calculated by the Cox regression model, and the prediction result is presented visually.
In some embodiments of the present application, the prediction module and the acquisition module are each a processor, a server or a host computer; or the prediction module and the acquisition module are integrated in the same processor, the same server or the same host computer.
The application provides a system for predicting the prognostic survival period of a patient with intrahepatic cholangiocellular carcinoma. The system can predict the prognostic survival period using the promoter methylation score as the sole variable, or using multiple variables that include the promoter methylation score. Compared with existing systems for predicting the prognostic survival period of intrahepatic cholangiocellular carcinoma patients, the prediction system offers high accuracy, strong reliability and stable prediction, and can be applied in clinical practice to guide treatment planning for these patients.
Drawings
Fig. 1 is a block diagram of a prediction system for prognosis survival of ICC patients according to an embodiment of the present application.
Fig. 2 is a block diagram of a prediction system for prognosis of survival of ICC patients according to another embodiment of the present application.
Fig. 3 is a block diagram of a prediction system for prognosis of survival of ICC patients according to another embodiment of the present application.
Fig. 4 is a schematic diagram illustrating a scenario of a prediction system for prognosis survival of an ICC patient according to an embodiment of the present application.
Fig. 5 shows the time-dependent ROC curve (A) and the Kaplan-Meier curve (B) of the training cohort based on the PMS prediction system, and the time-dependent ROC curve (C) and the Kaplan-Meier curve (D) of the validation cohort based on the PMS prediction system.
Fig. 6 shows the time-dependent ROC curves and Kaplan-Meier curves of five systems applied to the prediction samples in the experimental example of the present application, where A is the time-dependent ROC curves with the one-year AUC of the five systems, B is the time-dependent ROC curves with the two-year AUC of the five systems, C is the time-dependent ROC curves with the three-year AUC of the five systems, D is the Kaplan-Meier curve of the prediction samples for the PMS prediction system, E is the Kaplan-Meier curve for the WCHSU prediction system, F is the Kaplan-Meier curve for the jhesm prediction system, G is the Kaplan-Meier curve for the bsh prediction system, and H is the Kaplan-Meier curve for the AJCC TNM staging system.
Fig. 7 shows the results of the WCHSU-PMS prediction system on the prediction samples, where A is the nomogram, B is the calibration curve for three-year survival, C is the calibration curve for five-year survival, and D is the time-dependent ROC curve.
Detailed Description
The terms and technical means used in the present application are explained below:
The term "prognosis" refers to a forecast or prediction of the probable course or outcome of ICC. The term includes predicting ICC progression (such as relapse or metastatic spread), survival, resistance, partial or complete remission, or a good or poor outcome (good or poor prognosis, respectively). It also includes predicting a time frame for any of the above (e.g., more than, less than, or equal to a given number of years, such as 0.5, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more years); for example, providing a prognosis can include predicting a patient's survival as greater than, less than, or equal to a given number of years.
If survival, recurrence, metastatic spread, or another event is described herein in connection with a given time period (e.g., more than, less than, equal to, or within the given time period), the given time period is preferably a time point within one of the following ranges: 0 to 18 years, 0 to 17 years, 0 to 16 years, 0 to 15 years, 0 to 14 years, 0 to 13 years, 0 to 12 years, 0 to 11 years, 0 to 10 years, 0 to 9 years, 0 to 8 years, 0 to 7 years, 0 to 6 years, 0 to 5 years, 0 to 4 years, 0 to 3 years, 0 to 2 years, 0 to 1 year, 1 to 18 years, 1 to 16 years, 1 to 14 years, 1 to 12 years, 1 to 10 years, 1 to 9 years, 1 to 8 years, 1 to 7 years, 1 to 6 years, 1 to 5 years, 1 to 4 years, 1 to 3 years, 1 to 2 years, 2 to 18 years, 2 to 16 years, 2 to 14 years, 2 to 12 years, 2 to 10 years, 2 to 9 years, 2 to 8 years, 2 to 7 years, 2 to 6 years, 2 to 5 years, 2 to 4 years, 2 to 3 years, 3 to 16 years, 3 to 13 years, 3 to 12 years, 3 to 9 years, 3 to 8 years, 3 to 7 years, 3 to 6 years, 3 to 5 years, 3 to 4 years, 4 to 10 years, 4 to 9 years, 4 to 8 years, 4 to 7 years, 4 to 6 years, or 4 to 5 years. The foregoing ranges are inclusive: those skilled in the art will appreciate that, for example, a time point in the range of 4 to 5 years includes both endpoints of the range, so the time point may be 4 years, 5 years, or any time point between them, such as 4.5 years. Thus, in at least some embodiments, providing a prognosis can include predicting that the patient's survival is (i) greater than, less than, or equal to 3 years; or (ii) greater than, less than, or equal to 5 years.
The term "patient" includes human patients and other mammals, and also includes any individual who has or has had ICC, or who wishes to be analyzed or treated using the methods of the invention. Suitable mammals falling within the scope of the present application include, but are not limited to: primates, livestock (e.g., sheep, cattle, horses, donkeys, pigs), laboratory test animals (e.g., rabbits, mice, rats, guinea pigs, hamsters), pets (e.g., cats, dogs), and captive wild animals (e.g., foxes, deer, dingoes). Preferably, the patient is a human patient. Optionally, the patient may be receiving ICC therapy; alternatively, the patient is in stage I, II, III or IV of ICC; still alternatively, the patient may be: (a) a stage I or II patient; (b) a stage II or III patient; or (c) a stage III or IV patient.
The term "ex vivo sample" may include, for example, tumor material, tissue or bodily fluid (such as blood) obtained by surgical resection or biopsy (e.g., cells from a patient biopsy). The tissue may be removed by any suitable method, such as punch biopsy, aspiration, scraping, or surgical excision. Suitable samples contain total tumor material, i.e., tumor-infiltrating leukocytes (TILs), stroma, and tumor cells. Optionally, the ex vivo sample may be excised tumor fragments. Ex vivo samples may be obtained at one or more time points. Optionally, before the ex vivo sample is analyzed using the prediction system of the present application, it may be processed using one or more post-collection preparation or storage techniques (e.g., fixation, storage, freezing, lysis, homogenization, DNA or RNA extraction, conversion to cDNA, ultrafiltration, dilution (e.g., with saline, buffer, or a physiologically acceptable diluent), concentration, evaporation, centrifugation, isolation, filtration).
The term "promoter" refers to a nucleotide sequence that RNA polymerase recognizes and binds in order to initiate transcription. It contains the conserved sequences required for specific binding of RNA polymerase and initiation of transcription, and is not itself transcribed. In the present application, "promoter" and "promoter region" may be used interchangeably.
The term "methylation level" denotes the proportion of one or more sites in a polynucleotide sequence that are methylated. The methylation level of a region (for example, a promoter region) is the average of the methylation levels of all sites in that region. An increase or decrease in the methylation level of a region does not imply an increase or decrease at every site in the region. Procedures for converting the results of DNA methylation detection methods (e.g., reduced-representation methylation sequencing) into methylation levels are known in the art. Exemplary embodiments use MOABS software to obtain the methylation level of CpG sites; for example, the methylation level M in formula (1) is obtained using MOABS software.
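The definition of a region's methylation level as the average over its sites can be illustrated with a small sketch. It assumes that each CpG site is summarized by a pair of read counts (methylated reads, total reads), which is a common summary of bisulfite sequencing data but is not specified in the source; the function names are hypothetical, and the sketch does not reproduce how MOABS derives its values.

```python
from typing import Sequence, Tuple


def site_methylation_level(methylated_reads: int, total_reads: int) -> float:
    """Proportion of reads covering one CpG site that are methylated."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= methylated_reads <= total_reads:
        raise ValueError("methylated_reads must lie between 0 and total_reads")
    return methylated_reads / total_reads


def region_methylation_level(sites: Sequence[Tuple[int, int]]) -> float:
    """Average of the per-site methylation levels across one region."""
    if not sites:
        raise ValueError("a region needs at least one CpG site")
    levels = [site_methylation_level(m, t) for m, t in sites]
    return sum(levels) / len(levels)


# Hypothetical promoter region with three CpG sites: (methylated, total) reads.
example_region = [(8, 10), (3, 10), (10, 20)]
example_level = region_methylation_level(example_region)
```

Because the region value is an unweighted mean of site levels, a site covered by few reads counts as much as a deeply covered one; weighting by coverage would be a different design choice.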
All statistical analyses in the examples of the present application were performed with RStudio 1.1.463, SPSS 25.0, and GraphPad Prism 8.
The area under the curve (AUC) of every time-dependent ROC curve in the examples of the present application was computed with the "survivalROC" package.
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art. In addition, any methods and materials similar or equivalent to those described herein can be used in the present application. The preferred embodiments and materials described herein are exemplary only, and are not intended to limit the scope of the present application.
Unless otherwise indicated, the starting materials and reagents used in the following examples are all commercially available or may be prepared by methods known in the art.
Whole-genome extraction, construction of the WGBS (whole-genome bisulfite sequencing) library, WGBS sequencing and data processing in the examples of the present application were all performed by Beijing Knoo Genesis Science and Technology Ltd.
The Tiangen DP304 kit used in the examples of the present application was purchased from Tiangen Biochemical Technology (Beijing) Ltd; the Qubit 2.0 Fluorometer for DNA analysis was purchased from Life Technologies, Inc. (California, USA); the Agilent Bioanalyzer 2100 was purchased from Agilent.
All operations in the examples of this application are in accordance with the manufacturer's specifications.
As shown in Fig. 1, the present embodiment provides a system for predicting the prognostic survival of an intrahepatic cholangiocellular carcinoma (ICC) patient. The system mainly comprises an acquisition module 1 and a prediction module 2, and the acquisition module 1 is connected to the prediction module 2 wirelessly and/or by wire.
In particular, the acquisition module 1 is used to acquire variables related to the prognosis survival of an ICC patient, the variables comprising promoter methylation scores. A promoter methylation score is based on the methylation levels of at least 24 promoter regions; the 24 promoter regions are a set of polynucleotide sequences shown in SEQ ID No.1 to SEQ ID No.24 or a set of polynucleotide sequences complementary to SEQ ID No.1 to SEQ ID No. 24.
The prediction module 2 is used for predicting a prognostic parameter of the ICC patient according to the at least one variable obtained by the acquisition module 1 and for outputting the prognostic parameter. The prediction module 2 predicts and outputs the prognostic parameter as follows: the one or more obtained variables are imported into a pre-established Cox regression model, the prognostic parameter is calculated by the Cox regression model, and a survival probability nomogram is generated with the R package "rms" to present the prediction result visually.
The prognostic parameters include a prognostic survival risk grouping and/or a survival probability value for a specified number of years. The prognostic survival risk grouping assigns the ICC patient a prognostic class based on a preset threshold, for example: if the promoter methylation score is greater than or equal to the preset threshold, the ICC patient is assigned to the high-risk group; if the promoter methylation score is smaller than the preset threshold, the ICC patient is assigned to the low-risk group. The survival probability value for the specified number of years may be, for example, the probability that the prognostic survival period reaches three years and the probability that it reaches five years.
The promoter methylation score can be used as an independent predictor of the prognostic survival of ICC patients. Other variables relevant to the prognostic survival of ICC patients may be, for example: ascites status, tumor size, macrovascular invasion status, lymph node metastasis status, degree of tumor differentiation, and carbohydrate antigen 19-9 (CA19-9) concentration.
In one embodiment of the present application, the prediction system uses the promoter methylation score as the only predictive variable. As shown in Fig. 2, the acquisition module 1 is used only for obtaining the promoter methylation score and comprises an analysis unit 11 and a scoring unit 12, which are connected to each other in a wired and/or wireless mode. The analysis unit 11 is used at least for analyzing the methylation levels of the 24 promoter regions in an ex vivo sample of a patient with intrahepatic cholangiocellular carcinoma, and the scoring unit 12 is used for calculating a promoter methylation score according to the methylation levels of the 24 promoter regions.
In one embodiment of the present application, the acquisition module 1 further includes an output unit 13. The output unit 13 is connected to the scoring unit 12 in a wired and/or wireless manner and is configured to transmit the promoter methylation score to the prediction module 2.
In an embodiment of the present application, the prognostic parameters include the probability that the prognostic survival period reaches three years and the probability that it reaches five years. Correspondingly, the prediction module 2 predicts and outputs the prognostic parameters as follows: the obtained promoter methylation score is imported into a pre-established Cox regression model, which is used only to analyze the influence of the promoter methylation score on the prognostic survival of an ICC patient; the prognostic parameters are calculated by the Cox regression model, and a survival probability nomogram is generated with the R package "rms" to present the prediction result visually. The nomogram presents, in order, a promoter methylation score axis, a total risk score axis, a three-year survival rate axis and a five-year survival rate axis. The obtained promoter methylation score is located on the promoter methylation score axis; the value at the corresponding position on the total risk score axis is the total risk score; the value at the position corresponding to the total risk score on the three-year survival rate axis is the probability that the prognostic survival period reaches three years; and the value at the position corresponding to the total risk score on the five-year survival rate axis is the probability that the prognostic survival period reaches five years.
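For a single-variable Cox proportional hazards model, the survival probability at a fixed time follows the standard relation S(t | x) = S0(t) ^ exp(β · x), where S0(t) is the baseline survival at time t. The sketch below applies that relation; the coefficient and the baseline survival values are hypothetical placeholders, not values from the application, and the function name is an assumption.

```python
import math


def cox_survival_probability(
    baseline_survival: float, coefficient: float, score: float
) -> float:
    """Survival probability under a one-variable Cox model: S0(t) ** exp(beta * x)."""
    if not 0.0 < baseline_survival <= 1.0:
        raise ValueError("baseline_survival must be in (0, 1]")
    return baseline_survival ** math.exp(coefficient * score)


# Hypothetical fitted values, for illustration only.
BETA = 1.0                           # Cox coefficient for the methylation score
BASELINE_SURVIVAL = {3: 0.60, 5: 0.45}  # baseline survival at 3 and 5 years

# A score of 0 reproduces the baseline; a higher score lowers survival when BETA > 0.
at_reference = {
    years: cox_survival_probability(s0, BETA, 0.0)
    for years, s0 in BASELINE_SURVIVAL.items()
}
at_higher_score = {
    years: cox_survival_probability(s0, BETA, 1.0)
    for years, s0 in BASELINE_SURVIVAL.items()
}
```

A nomogram encodes the same mapping graphically, so reading values off its axes and evaluating this relation should agree up to reading precision.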
In another embodiment of the present application, the prognostic parameter is a prognostic survival risk grouping. Correspondingly, the prediction module 2 predicts and outputs the prognostic parameter as follows: the obtained promoter methylation score is imported into a pre-established Cox regression model, which is used only to analyze the influence of the promoter methylation score on the prognostic survival of ICC patients, and the optimal threshold of the promoter methylation score in the Cox regression model is determined with the survivalROC software package; if the obtained promoter methylation score is greater than the optimal threshold, the corresponding ICC patient is assigned to the high-risk group and the grouping is output; if the obtained promoter methylation score is smaller than the optimal threshold, the corresponding ICC patient is assigned to the low-risk group and the grouping is output.
In another embodiment of the present application, as shown in Fig. 3, the acquisition module 1 includes a promoter methylation score acquisition submodule 111, an ascites status acquisition submodule 112, a tumor size acquisition submodule 113, a macrovascular invasion status acquisition submodule 114, a lymph node metastasis status acquisition submodule 115, a tumor differentiation degree acquisition submodule 116, and a CA19-9 concentration acquisition submodule 117, which are independent of each other.
The prognosis parameters are a probability value that the prognosis survival time is three years and a probability value that the prognosis survival time is five years, and correspondingly, the method for predicting and outputting the prognosis parameters by the prediction module 2 is as follows: importing the obtained variables into a pre-established Cox regression model, wherein the Cox regression model is used for analyzing the influence of the variables on the prognosis survival period of the ICC patient; and calculating the prognosis parameters through the Cox regression model, and generating a survival probability nomogram with the rms (Regression Modeling Strategies) package of the R language to visually present the prediction result. The survival probability nomogram presents a score axis, one axis for each of the variables, a total risk score axis, a three-year survival rate axis and a five-year survival rate axis. Each obtained variable is mapped to its corresponding variable axis, and the score of each variable is read from the score axis. The total risk score, which is the sum of the scores of the variables, is then mapped to the total risk score axis; the value at the position corresponding to the total risk score on the three-year survival rate axis is the probability value that the prognosis survival period is three years, and the value at the position corresponding to the total risk score on the five-year survival rate axis is the probability value that the prognosis survival period is five years.
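The accumulation of per-variable scores into a total risk score can be sketched in Python as follows. The sketch assumes one common way of building nomogram points: each variable's effect (coefficient times value) is measured above its smallest possible effect and rescaled so that the variable with the widest effect range spans 0 to 100 points. The variable names, coefficients and value ranges in EXAMPLE_SPEC are hypothetical, and the scaling used by a particular nomogram package may differ in detail.

```python
# Hypothetical specification: name -> (coefficient, minimum value, maximum value).
EXAMPLE_SPEC = {
    "pms": (0.5, -20.0, 0.0),
    "lymph_node_metastasis": (0.8, 0.0, 1.0),
}


def variable_points(value, beta, lo, hi, max_effect_range):
    """Points for one variable, scaled against the widest effect range."""
    smallest_effect = min(beta * lo, beta * hi)
    return 100.0 * (beta * value - smallest_effect) / max_effect_range


def total_points(patient, spec=EXAMPLE_SPEC):
    """Sum the points of all variables to obtain the total risk score."""
    max_effect_range = max(abs(beta * (hi - lo)) for beta, lo, hi in spec.values())
    return sum(
        variable_points(patient[name], *spec[name], max_effect_range)
        for name in spec
    )
```

In this sketch the total would then be read against the three-year and five-year survival rate axes; that final lookup depends on the fitted baseline survival and is not shown.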
The wireless connection may be a Bluetooth connection, a Wireless-Fidelity (WiFi) connection, an infrared connection, a mobile data network connection, or the like, and the wired connection may be a hardwired connection, a Universal Serial Bus (USB) connection, or the like. In addition, the form of the obtaining module is not particularly limited as long as the variables related to the prognosis survival time of the ICC patient can be obtained, and the obtaining module can be selected according to actual needs.
In some embodiments of the present application, the promoter methylation score is calculated according to the following formula (1):
promoter methylation score = (-1.778426) × M (SEQ ID No. 1) + (-0.5188023) × M (SEQ ID No. 2) + (-0.007917956) × M (SEQ ID No. 3) + (-4.853461) × M (SEQ ID No. 4) + (0.442986) × M (SEQ ID No. 5) + (-1.512141) × M (SEQ ID No. 6) + (-0.503913) × M (SEQ ID No. 7) + (-2.622882) × M (SEQ ID No. 8) + (-5.796018E-14) × M (SEQ ID No. 9) + (-2.918528) × M (SEQ ID No. 10) + (-1.456336) × M (SEQ ID No. 11) + (-0.02070397) × M (SEQ ID No. 12) + (-0.4516687) × M (SEQ ID No. 13) + (-0.5262961) × M (SEQ ID No. 14) + (-0.01533871) (-3615) (-3985) × M (SEQ ID No. 3) (-0.3580827) (-3615) × M (SEQ ID No. 3) (-18) SEQ ID NO. 17) + (-0.03428591) × M (SEQ ID NO. 18) + (-0.183213) × M (SEQ ID NO. 19) + (-0.6439997) × M (SEQ ID NO. 20) + (-0.9643823) × M (SEQ ID NO. 21) + (-0.4090321) × M (SEQ ID NO. 22) + (-1.727257E-14) × M (SEQ ID NO. 23) + (-0.1818805) × M (SEQ ID NO. 24) (1)
Where "×" is the multiplication sign, M refers to the average methylation level of all CpG dinucleotides contained in the corresponding polynucleotide sequence, and M can be obtained with the MOABS software. For example, M (SEQ ID NO. 1) refers to the average methylation level of all CpG dinucleotides contained in the polynucleotide sequence shown in SEQ ID NO. 1.
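Formula (1) is a weighted sum of per-promoter average methylation levels, which can be sketched in Python as below. Only the first four coefficients of formula (1) are copied here to keep the example short, so the returned value is a partial sum and not the full promoter methylation score; the function name and the dictionary layout are assumptions made for illustration.

```python
# First four coefficients of formula (1); the remaining terms are omitted here.
PARTIAL_COEFFICIENTS = {
    "SEQ ID NO. 1": -1.778426,
    "SEQ ID NO. 2": -0.5188023,
    "SEQ ID NO. 3": -0.007917956,
    "SEQ ID NO. 4": -4.853461,
}


def weighted_methylation_sum(mean_methylation, coefficients=PARTIAL_COEFFICIENTS):
    """Return sum(coefficient * M) over the promoter regions in `coefficients`.

    `mean_methylation` maps a sequence identifier to M, the average
    methylation level of all CpG dinucleotides in that promoter region.
    """
    missing = sorted(set(coefficients) - set(mean_methylation))
    if missing:
        raise KeyError(f"missing methylation levels for: {missing}")
    return sum(coef * mean_methylation[name] for name, coef in coefficients.items())
```

Supplying all 24 coefficient and methylation pairs to the same function would give the score defined by formula (1).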
The prediction module 2 and the acquisition module 1 are respectively a processor, a server or a computer host; alternatively, the prediction module and the acquisition module are integrated in the same processor, the same server or the same computer host.
For example, the server described in this embodiment includes but is not limited to a computer, a network host, a single network server, a set of multiple network servers, or a cloud server formed by a plurality of servers. The cloud server is constituted by a large number of computers or web servers based on cloud computing.
The processor may include one or more processing cores, and the processor described in the embodiments of the present application may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor. Preferably, the processor may integrate an application processor, which mainly handles the operating system, user interfaces, application programs, etc., and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor.
In one embodiment of the present application, as shown in fig. 4, the prediction system for prognosis of survival of ICC patients comprises: a server 100 and a terminal 200, wherein the server 100 integrates an acquisition module and a prediction module. The server 100 and the terminal 200 may communicate with each other through any communication method, including but not limited to mobile communication based on 3rd Generation Partnership Project (3GPP), Long Term Evolution (LTE), Worldwide Interoperability for Microwave Access (WiMAX), or computer network communication based on the TCP/IP protocol suite, User Datagram Protocol (UDP), and so on.
The server 100 includes: a plurality of processors; a memory; and a plurality of application programs, wherein the plurality of application programs are stored in the memory and configured to be executed by the processor.
The memory may be used to store software programs and modules, and the processor may execute various functional applications and data processing by operating the software programs and modules stored in the memory. The memory may mainly include, for example, a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data created according to the use of the server, and the like. Further, the memory may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device. Accordingly, the memory may also include a memory controller to provide the processor access to the memory.
The terminal 200 may be a general-purpose computer device or a special-purpose computer device. In a specific implementation, the terminal 200 may be a desktop, a laptop, a web server, a Personal Digital Assistant (PDA), a mobile phone, a tablet computer, a wireless terminal device, a communication device, an embedded device, and the like, and the type of the terminal 200 is not limited in this embodiment.
One skilled in the art will appreciate that the server may also include one or more of a power supply, an analysis module, and an output module. The power supply is used for supplying power to each component of the server, and preferably, the power supply can be logically connected with the processor through the power management system, so that functions of charging, discharging, power consumption management and the like can be managed through the power management system. The power supply may also include one or more dc or ac power sources, recharging systems, power failure detection circuitry, power converters or inverters, power status indicators, and any other components. The analysis module is used for receiving input digital or character information and can be a keyboard, a mouse, an operating rod, a human-computer interaction laser pen and the like. The output module is used for visualizing or audio-processing the prognosis parameter of the ICC patient, and can be a display screen, a printer or an audio output device.
It should be noted that the prediction systems shown in fig. 1-4 do not constitute a specific limitation of the prediction system of the present application, and may include more or fewer components than those shown, or some components in combination, or a different arrangement of components.
In an embodiment of the present application, the processor in the prediction system loads the executable files corresponding to the processes of the plurality of applications into the memory according to the following instructions, and the processor runs the applications stored in the memory, so as to implement various functions, as follows:
obtaining variables associated with prognostic survival of an ICC patient, said variables comprising promoter methylation scores; and
importing the obtained variables into a pre-established Cox regression model, calculating a prognosis parameter through the Cox regression model, and visually presenting the prediction result.
The prediction system of the present application will be described below with reference to specific examples.
Example 1: PMS prediction system
The present embodiment provides a prediction system of ICC patient prognostic survival, which is a prediction system as shown in fig. 2 and predicts based on Promoter Methylation Score (PMS) only, and thus, the prediction system is named PMS prediction system.
PMS is based on the methylation levels of 24 promoter regions. The 24 promoter regions are the set of polynucleotide sequences shown in SEQ ID NO. 1 to SEQ ID NO. 24, and their specific information is detailed in Table 1 below. Each of these 24 promoter regions is a polynucleotide sequence of 4,000 bp in total, consisting of the 2,000 bp upstream and the 2,000 bp downstream of the transcription start site.
Table 1. Information on the 24 promoter regions
[Table 1 is provided as an image in the original publication.]
The method for predicting the prognosis survival period of the ICC patient by the PMS prediction system comprises the following steps:
S1, providing an in vitro sample of an ICC patient, and sequentially performing whole genome extraction, WGBS library construction, WGBS sequencing and WGBS sequencing result data processing operations on the in vitro sample to obtain the methylation level of each of the 24 promoter regions;
S2, calculating the PMS of the in vitro sample using formula (1);
S3, importing the PMS into a pre-established Cox regression model, and determining the prognosis survival risk group of the ICC patient based on the optimal threshold of the Cox regression model: if the PMS obtained in step S2 is larger than the optimal threshold, assigning the corresponding ICC patient to the high-risk group and outputting the group, and if the PMS obtained in step S2 is smaller than the optimal threshold, assigning the corresponding ICC patient to the low-risk group and outputting the group.
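The grouping rule of step S3 can be sketched in Python as follows. The default threshold is the optimal PMS threshold reported for the training cohort in this example (-11.20716). The text does not say how a PMS exactly equal to the threshold is handled, so returning "undetermined" in that case is an assumption of this sketch, as are the function and label names.

```python
# Optimal PMS threshold reported for the training cohort in this example.
TRAINING_COHORT_THRESHOLD = -11.20716


def risk_group(pms, threshold=TRAINING_COHORT_THRESHOLD):
    """Assign a patient to a risk group by comparing the PMS with the threshold."""
    if pms > threshold:
        return "high-risk"
    if pms < threshold:
        return "low-risk"
    # Equality is not covered by the described rule (assumption of this sketch).
    return "undetermined"
```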
The specific flow of step S1 is as follows:
S11, genomic DNA was isolated using the TIANGEN DP304 kit, and DNA degradation and contamination were monitored on agarose gels. DNA purity was measured with a spectrophotometer (IMPLEN, CA, USA), and DNA concentration was measured using the DNA assay kit on a Qubit 2.0 Fluorometer. A total of 5.2 μg of genomic DNA spiked with 26 ng of λ DNA was fragmented by sonication to 200-300 bp with a Covaris S220, followed by end repair and adenylation. Cytosine-methylated barcodes were ligated to the sonicated DNA according to the kit manufacturer's instructions. The ligated DNA fragments were treated twice with bisulfite using the EZ DNA Methylation-Gold™ Kit (Zymo Research). The bisulfite-treated single-stranded DNA fragments were PCR-amplified using the KAPA HiFi HotStart Uracil+ ReadyMix (2X) kit. Library concentrations were quantified with a Qubit 2.0 Fluorometer (Life Technologies, CA, USA) and quantitative PCR, and insert sizes were determined on the Agilent Bioanalyzer 2100 system.
S12, the prepared libraries were sequenced on the Illumina NovaSeq platform. Image analysis and base calling were performed using the Illumina CASAVA pipeline, ultimately yielding 150 bp paired-end reads. The sequencing depth of each sample was at least 30×.
S13, a quality report of the raw WGBS sequencing reads was generated using FastQC (Andrews, 2010). Quality control and low-quality read filtering were performed with fastp, a tool developed in C++ for fast quality control and filtering of high-throughput sequencing reads (Chen et al., 2018). The adapters were trimmed from the raw reads (AGATCGGAAGAGCACACGTCTGAACTCCAGTCA, AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT). Bases at both ends of each read with a quality below 3, or called as N, were also trimmed (cut_front, cut_front_window_size 1, cut_front_mean_quality 3, cut_tail, cut_tail_window_size 1, cut_tail_mean_quality 3). The mean quality was then checked with a 4-base sliding window, and bases were trimmed where the mean quality of the window fell below 15 (cut_right, cut_right_window_size 4, cut_right_mean_quality 15). Because the first 10 bp of reads from bisulfite-treated libraries show a biased cytosine composition, both reads of each pair were trimmed by 10 bp from the front (trim_front1 10, trim_front2 10). Trimmed reads shorter than 36 bp were discarded (length_required 36). Only correctly paired reads were retained for downstream analysis; these are the clean reads of the WGBS library.
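The trimming settings of step S13 can be collected into a single fastp invocation, sketched here as a Python helper that only builds the argument list. The option names are those quoted in step S13 plus fastp's input, output and adapter options. Assigning the first adapter sequence to read 1 and the second to read 2 is an assumption, as are the file names and the function name.

```python
def build_fastp_command(read1_in, read2_in, read1_out, read2_out):
    """Build a fastp argument list mirroring the settings described in step S13."""
    return [
        "fastp",
        "-i", read1_in, "-I", read2_in,      # paired-end input files
        "-o", read1_out, "-O", read2_out,    # paired-end output files
        # Adapter assignment to read 1 / read 2 is an assumption of this sketch.
        "--adapter_sequence", "AGATCGGAAGAGCACACGTCTGAACTCCAGTCA",
        "--adapter_sequence_r2", "AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT",
        # Trim bases with quality below 3 from both ends, one base at a time.
        "--cut_front", "--cut_front_window_size", "1", "--cut_front_mean_quality", "3",
        "--cut_tail", "--cut_tail_window_size", "1", "--cut_tail_mean_quality", "3",
        # 4-base sliding window with a mean quality cutoff of 15.
        "--cut_right", "--cut_right_window_size", "4", "--cut_right_mean_quality", "15",
        # Remove the first 10 bp of both reads.
        "--trim_front1", "10", "--trim_front2", "10",
        # Discard reads shorter than 36 bp after trimming.
        "--length_required", "36",
    ]
```

The returned list could be passed to subprocess.run(command, check=True) on a machine where fastp is installed.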
S14, DNA methylation analysis was carried out using the MOABS software. The BSMAP module of MOABS was used to align the clean reads to the reference genome (hg38). The mcall module of MOABS performed single-sample analysis to report the methylation level of each CpG dinucleotide across the genome, together with CpG depth of coverage and average methylation levels. To examine promoter methylation, the GENCODE transcript annotation (V33) was downloaded from the UCSC Table Browser. A promoter region was defined as the 2,000 bp upstream and 2,000 bp downstream of the transcription start site, giving a total length of 4,000 bp. The mean methylation level of the CpG sites contained in a promoter region was used to represent the methylation level of that promoter region (SEQ ID NO. 1 to SEQ ID NO. 24).
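The promoter-level summary in step S14 can be sketched in Python as follows. The sketch assumes that per-CpG methylation levels are already available as (position, level) pairs on one chromosome, that the window includes both end positions (which gives 4,001 positions, the length stated for SEQ ID NO. 1 in the sequence listing), and that the mean is unweighted by read depth. The function names and the input layout are hypothetical.

```python
def promoter_window(tss, upstream=2000, downstream=2000):
    """Return the inclusive (start, end) window around a transcription start site."""
    return tss - upstream, tss + downstream


def mean_promoter_methylation(cpg_sites, tss):
    """Average the methylation levels of the CpG sites inside the promoter window.

    `cpg_sites` is an iterable of (position, methylation_level) pairs.
    Returns None when the window contains no covered CpG site.
    """
    start, end = promoter_window(tss)
    levels = [level for position, level in cpg_sites if start <= position <= end]
    if not levels:
        return None
    return sum(levels) / len(levels)
```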
The method for establishing the Cox regression model comprises the following steps:
S21, obtaining in vitro samples of 334 ICC patients from the biobanks of West China Hospital of Sichuan University, Zhongshan Hospital of Fudan University and the Cancer Hospital of Tianjin Medical University, and storing the obtained in vitro samples frozen at -80 °C or in liquid nitrogen. The 334 ICC patients are divided into a training cohort and a validation cohort, wherein the training cohort is used for constructing the Cox regression model and the validation cohort is used for verifying the effectiveness of the Cox regression model; the specific classification is detailed in Table 2 below:
Table 2. Classification of the 334 ICC patients
[Table 2 is provided as an image in the original publication.]
Baseline characteristics of 334 ICC patients are detailed in table 3:
Table 3. Baseline characteristics of the 334 ICC patients
[Table 3 is provided as an image in the original publication.]
As shown in Table 3, some baseline characteristics are not balanced between the two cohorts. This difference can be explained by both geographic and economic factors, as the socioeconomic status of inland southwest China (training cohort) lags behind that of the eastern coastal region of China (validation cohort).
S22, sequentially carrying out whole genome extraction, WGBS library construction, WGBS sequencing and WGBS sequencing result data processing operations on the 334 in-vitro samples to obtain DNA methylation data of each in-vitro sample.
S23, analyzing DNA methylation data of 334 in vitro samples, and screening 24 promoter regions (shown in Table 1) related to ICC, wherein the specific screening process is as follows:
A total of 175,976 promoter regions were identified by WGBS in the training cohort, corresponding to 198,074 transcripts of 42,737 genes (coding and non-coding). In the WGBS data, the methylation level of each of the 175,976 promoter regions is a continuous, quantitative value. Univariate (single-predictor) Cox analysis was used to determine whether the methylation level of each promoter region was significantly correlated with overall survival (P < 0.01), and 36,938 of the 175,976 promoter regions were found to be significant;
calculating the concordance index (C-index) of each of the 175,976 promoter regions, and selecting the 5,000 promoter regions with the highest C-index;
calculating the coefficient of variation (CV) of each promoter region across the in vitro samples in the training cohort, and filtering out promoter regions with a CV below 0.05;
selecting the 3,596 promoter regions that meet all three criteria (P < 0.01 in the single-predictor Cox analysis, top 5,000 by C-index, and CV > 0.05), and further screening these 3,596 promoter regions with the LASSO Cox algorithm, finally obtaining the 24 promoter regions shown in Table 1, which correspond to the polynucleotide sequences shown in SEQ ID NO. 1 to SEQ ID NO. 24.
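The three pre-screening criteria applied before the LASSO Cox step can be sketched in Python as follows. The sketch assumes that the P value, C-index and CV of every promoter region have already been computed and stored in a dictionary; the dictionary layout, the function names and the use of the sample standard deviation for the CV are assumptions. The LASSO Cox selection itself is not shown.

```python
import statistics


def coefficient_of_variation(values):
    """Standard deviation divided by the mean (sample standard deviation assumed)."""
    mean = statistics.fmean(values)
    if mean == 0:
        return 0.0  # all-zero methylation levels: treated as no variation (assumption)
    return statistics.stdev(values) / abs(mean)


def screen_promoters(stats, p_cutoff=0.01, top_n=5000, cv_cutoff=0.05):
    """Keep regions with P < p_cutoff, C-index in the top `top_n`, and CV > cv_cutoff.

    `stats` maps a region identifier to {"p": ..., "c_index": ..., "cv": ...}.
    """
    ranked = sorted(stats, key=lambda region: stats[region]["c_index"], reverse=True)
    top_by_c_index = set(ranked[:top_n])
    return [
        region
        for region, s in stats.items()
        if s["p"] < p_cutoff and region in top_by_c_index and s["cv"] > cv_cutoff
    ]
```

The regions returned by such a filter would be the candidates passed on to the LASSO Cox algorithm.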
S24, constructing a PMS calculation formula according to the average methylation level of all CpG dinucleotides contained in any promoter region in the 24 promoter regions, wherein the PMS calculation formula is shown as formula (1).
S25, constructing a Cox regression model using the PMS of the training cohort.
S26, verifying the effectiveness of the Cox regression model using the validation cohort.
The Cox regression model has one or more optimal thresholds, based on which every ICC patient in the validation cohort is assigned to either the high PMS group or the low PMS group.
For 164 ICC patients in the training cohort, the C-index for Overall Survival (OS) was 0.772 (95% CI: 0.731-0.813) and the optimal threshold for PMS was-11.20716. For 170 patients in the validation cohort, the C-index of the OS was 0.735 (95% CI: 0.676-0.793), the PMS in the validation cohort was calculated by the same coefficients as the PMS in the training cohort, and the groupings were defined by the same thresholds.
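The C-index reported above measures how often a higher risk score goes with a shorter survival time. A minimal Python sketch of Harrell's concordance index is given below, assuming right-censored data in which a pair of patients is comparable only when the patient with the shorter time had an observed event, and counting tied scores as half concordant. The function name and the input layout are hypothetical, and the confidence intervals quoted in the text are not computed.

```python
def concordance_index(times, events, scores):
    """Harrell's C-index for right-censored survival data.

    times  : follow-up time of each patient
    events : 1 (or True) if death was observed, 0 (or False) if censored
    scores : risk score of each patient (higher means higher predicted risk)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Comparable pair: patient i had an observed event before time j.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering.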
The prognostic survival profiles of the training cohort and validation cohort are detailed in table 4 below:
Table 4. Prognostic survival of the training cohort and the validation cohort
[Table 4 is provided as an image in the original publication.]
As shown in fig. 5 and table 4, the AUC for the training cohort for 1 year, 2 years and 3 years of overall survival was 0.844, 0.830 and 0.883, respectively. There were 96 and 68 patients with low PMS and high PMS, respectively. The overall survival rate for the low PMS group was significantly higher than for the high PMS group (P <0.001, HR = 4.261, 95% CI: 2.743-6.619). Median survival in the low PMS group was 55.2 ± 3.3 months, with overall survival rates of 92.7%, 71.3% and 45.2% for one, three and five years of prognosis, respectively. Meanwhile, median survival in the high PMS group was 11.8 ± 2.5 months, with overall survival rates of 48.5%, 13.9% and 4.9% for one, three and five years of prognosis, respectively.
For the validation cohort, the AUCs for 1-year, 2-year and 3-year overall survival were 0.773, 0.770 and 0.779, respectively. There were 109 and 61 patients in the low PMS and high PMS groups, respectively. The overall survival in the low PMS group was significantly higher than in the high PMS group (P <0.001, HR = 3.999, 95% CI: 2.283-7.004). Median survival in the low PMS group was 53.0 ± 11.6 months, with overall survival rates of 92.9%, 71.3% and 42.1% for 1, 3 and 5 years of prognosis, respectively. In contrast, median survival in the high PMS group was 16.9 ± 2.0 months, while the prognostic 1-year, 3-year and 5-year overall survival rates were 66.1%, 23.7% and 23.7%, respectively.
Taken together, the C-index of PMS for overall survival in the training cohort was 0.772, and median survival was significantly longer in the low PMS group than in the high PMS group; the C-index of PMS for overall survival in the validation cohort was 0.735, and median survival in the low PMS group was likewise significantly longer than in the high PMS group. The prognostic prediction data of the training cohort and the validation cohort demonstrate the effectiveness of the constructed Cox regression model, showing that the PMS can predict the prognostic survival of ICC patients and further supporting the effectiveness of the prediction system of this embodiment.
Experimental example: comparing PMS prediction system, WCHSU prediction system, JUSM prediction system, EHBSH prediction system and AJCC TNM staging system
The WCHSU prediction system is mainly different from the PMS prediction system in that: the WCHSU prediction system adopts six variables, namely ascites state, tumor size, large vessel invasion state, lymph node metastasis state, tumor differentiation degree and CA19-9 saccharide antigen concentration.
The JUSM prediction system is mainly different from the PMS prediction system in that: the JUSM prediction system adopts six variables, namely age, tumor size, tumor number, lymph node state, large vessel invasion state and liver cirrhosis. For details of the JUSM prediction system, see:
Y Wang et al, Prognostic Nomogram for Intrahepatic Cholangiocarcinoma After Partial Hepatectomy, J. Clin. Oncol. 31 (2013) 1188-1195.
The EHBSH prediction system differs from the PMS prediction system mainly in that: the EHBSH prediction system uses six variables, namely age, tumor size, number of tumors, cirrhosis, lymph node metastasis status and macrovascular invasion status. For details of the EHBSH prediction system, see: O Hyder, H Marques et al, A Nomogram to Predict Long-term Survival After Resection for Intrahepatic Cholangiocarcinoma, JAMA Surg. 149 (2014) 432.
The AJCC TNM staging system differs from the PMS prediction system mainly in that: the T stage in the AJCC TNM staging system is based mainly on three variables, namely tumor number, vascular invasion and direct extrahepatic invasion; the N stage is based on one variable, namely whether there is metastasis to one or more regional lymph nodes; and the M stage is based on one variable, namely the presence or absence of distant metastasis. For details of the AJCC TNM staging system, see the eighth edition of the TNM staging system for ICC published by the American Joint Committee on Cancer (AJCC).
The PMS prediction system, the WCHSU prediction system, the JUSM prediction system, the EHBSH prediction system, and the AJCC TNM staging system were used to predict the prognosis of the 334 ICC patients in example 1. The 334 ICC patients were analyzed as a whole, without division into cohorts, and the variable information is detailed in Table 5 below:
Table 5. Variable information
[Table 5 is provided as an image in the original publication.]
The prediction results of the PMS prediction system, the WCHSU prediction system, the JUSM prediction system, the EHBSH prediction system, and the AJCC TNM staging system are shown in fig. 6 and Table 6 below:
Table 6. Performance of the five systems on time-dependent ROC curves
[Table 6 is provided as an image in the original publication.]
As can be seen from Table 6 and fig. 6, the Kaplan-Meier plots of the five systems each have four clearly separated Kaplan-Meier curves (overall P < 0.001), and the Kaplan-Meier plots of the five systems all show the same trend, which supports the effectiveness of the PMS prediction system. The C-index of the PMS prediction system is clearly higher than that of the WCHSU prediction system, the JUSM prediction system, the EHBSH prediction system and the AJCC TNM staging system, and the one-year, two-year and three-year AUCs of the PMS prediction system are also higher than those of the other four systems, indicating that the PMS prediction system is more accurate than the other four systems. Therefore, the PMS prediction system is superior to the other four systems.
Example 2: WCHSU-PMS prediction system
The embodiment provides a prediction system for prognosis survival period of an ICC patient, PMS is added into a WCHSU prediction system as a new variable, and the WCHSU-PMS prediction system is obtained. The WCHSU-PMS prediction system is a prediction system as shown in fig. 3, which has seven variables, respectively: PMS, ascites state, tumor size, macrovascular invasion state, lymph node metastasis state, degree of tumor differentiation, and CA19-9 saccharide antigen concentration.
The method for predicting the prognosis survival period of the ICC patient by the WCHSU-PMS prediction system comprises the following steps:
S1, providing an in vitro sample of an ICC patient, and sequentially performing whole genome extraction, WGBS library construction, WGBS sequencing and WGBS sequencing result data processing operations on the in vitro sample to obtain the methylation level of each of the 24 promoter regions;
S2, calculating the PMS of the in vitro sample using formula (1);
S3, importing the PMS obtained in step S2, together with the ascites state, tumor size, large vessel invasion state, lymph node metastasis state, tumor differentiation degree and CA19-9 carbohydrate antigen concentration of the ICC patient (for the criteria of these six variables, refer to the WCHSU prediction system), into a pre-established Cox regression model to calculate a probability value that the prognosis survival time is three years and a probability value that the prognosis survival time is five years, and generating a survival probability nomogram with the rms (Regression Modeling Strategies) package of the R language to visually present the prediction result, wherein the Cox regression model is obtained by adding the PMS variable to the Cox regression model of the WCHSU prediction system.
As shown in panel a of fig. 7, the survival probability nomogram presents axes for the score, ascites status, tumor size, large vessel invasion, lymph node metastasis status, tumor differentiation degree, CA19-9 carbohydrate antigen concentration, PMS, total risk score, three-year survival rate and five-year survival rate. Each of the seven variables obtained in step S3 is mapped to its corresponding variable axis, and the score of each variable is read from the score axis. The total risk score, which is the sum of the scores of the variables, is then mapped to the total risk score axis; the value at the position corresponding to the total risk score on the three-year survival rate axis is the probability value that the prognosis survival period is three years, and the value at the position corresponding to the total risk score on the five-year survival rate axis is the probability value that the prognosis survival period is five years.
The prognostic survival of 334 ICC patients in example 1 was predicted using the WCHSU-PMS prediction system. As shown in FIG. 7, the C-index of the Cox regression model of the WCHSU-PMS prediction system is 0.799 (95% CI: 0.769-0.829), and the 3-year and 5-year AUC of the time-dependent ROC curve are 0.874 and 0.854, respectively; for the 334 ICC patients, a satisfactory agreement was achieved between the predicted outcome of the WCHSU-PMS prediction system and the actual prognostic survival observed.
The WCHSU-PMS prediction system has better performance than a PMS prediction system, a WCHSU prediction system, a JUSM prediction system, an EHBSH prediction system and an AJCC TNM staging system because: the prediction system of the embodiment considers various variables including PMS, and has higher accuracy.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
The system for predicting the survival period of the intrahepatic cholangiocellular carcinoma patient after prognosis provided by the embodiment of the application is described in detail above. The principle and the implementation of the present application are explained by applying specific examples, and the above description of the embodiments is only used to help understanding the technical solution and the core idea of the present application; those of ordinary skill in the art will understand that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications or substitutions do not depart from the spirit and scope of the present disclosure as defined by the appended claims.
Sequence listing
<110> Jiangsu Gaomei Gene science and technology Co., Ltd
<120> prediction system for prognosis survival period of intrahepatic cholangiocellular carcinoma patient
<130> WH200271-CN
<160> 24
<170> SIPOSequenceListing 1.0
<210> 1
<211> 4001
<212> DNA
<213> Homo sapiens
<400> 1
tttctaggca ccccccaggt tacttgctga gctctgctaa tctatagtgt ttcattagta 60
tttatgagtc ttgcattatt agcatttcca gaaaacaatc aaatatctcc ttagtggcat 120
gaattaaaag tgagaacttc catggaatgc aaaagaagac acacttcggc ccctggaagt 180
acactctagc cttttgttag cttttgaata gcactgacaa tgcttgaaaa gtaattcatc 240
tatttttgta agaccaacaa taggtgccaa ccaacctcat ctgtctgatt gtggctgtgg 300
gggtggacga gaggacagag aagaggggaa aactttcttt caattttttt gaaaccaaat 360
attatttacc tccatatcct ctgtagtatc tagctaggga gacagaatca ttcattcact 420
ctttctgtcg caaacttcta tccacccatg tatgaggtga cgatgggaaa cagtgatccc 480
agagccagaa ggcctggttt tgagccttgg gtcaacctgc atgaccttgg caagttaacc 540
agacctcagc tacaaaatga agctaatgat cttccctcat aatgacattg tggggatcaa 600
aatgattcaa tatatttaat gcattctgta actgtaaaat acttacacaa atataatcca 660
taaaatagtg agttatcttg aaataacaat ggactaattc cagtgctggg cttttcttcc 720
tccttcctca cctccaagcc aatccctatt gccaaagaac catgggtccc caccgcattg 780
aaatgaaaag gtcttcagaa ataatctccc taatatggtt ggtctgtagg tctgtggctg 840
ttgctctaga ctctggcatc cagaaatcag atggagagcg gtcgcttggg gaacatctga 900
aggtgcctgg aggattgctg agttcatagg tgaggtggag gagagcagct agccatctac 960
aggccccagc ctccctccca ggctttggga cccttggaag tgttacaggg gtgggtcatt 1020
ccacagggag gtctgcttca atctctgaga gccaaaagct actgcaggct gtgtccaggg 1080
ctggccctga cactgctgtt ccttccggtg ctctccaaga agggagcagg gcgttctatt 1140
ctgcatcagc ctctcattgt tcttgttttt tagacaaaat aataaatgta aagctctaga 1200
acagtgtctg tgagacagta aatactttct ggaaaggtta ataataagga tcacaatggc 1260
acccaattag aatccctgag accaactcca gagtgacctc caaatggtac aaatcaccac 1320
ccttcttgta gggggaggtg ggaggttcag ttctctatga gaaccttaag agacccctct 1380
tctgcgactg ctgactcctg gatttgtgtg gggggccatt ttataactga cagttgggag 1440
caaagacctc cccttgggcc tttgctcaaa cagctccatt gagctttcta ccttgggtgc 1500
gctgatcact gtgcttgctt tagctcctct gcccaggatc ccaggtgaca gcaggatgga 1560
ctgttggcta cgaagagtca tttaaggagt taaaaataca gattcctgca gcctgctcct 1620
ggagactctg attcagtcgg gtctaggatg ggcccaggaa tttatgtttg tagcaagcag 1680
ccagagtgtt tctggacaat tgtcaggttt gggaataatt gaataagcaa gctgagcagc 1740
cgaccttcat ttatgagcac aggagccatg tgctaaatga ggggaagaat catgggtgag 1800
gagacggggg atgaagttta cctccacaca cagcagcaaa atggccagcc agcttgggga 1860
gcctccttcc acgttacttc ttgctttgcg ttgtcagggg atctcggcca gaactgcttt 1920
cctggaggtc acgttcaacg tagtaaacaa tcgctgaaca cccgccctgt gcattcagat 1980
cctgatcacc gcccagtcta gcaaggggga gaaataaaaa tagaagtata gtgtgatgag 2040
ttctgtggtc aagttctgtg ttgggggaca tgggagcagc atggcagaag cagccagcgg 2100
tcctgcctgg agaaggttga gggaaggcaa tgaaaagaaa cgaccatcat gactggtgtc 2160
tgagcaggca tcgtgcagtg actaattaca gagaacctac tcctgctggg aactttgtac 2220
ccaggacaag caaaccagat ccctgtcctt tctgtgctcc aagtctagtg tgggaaccaa 2280
accaaacatt catcaaagaa ccatataggc aggtgtgaaa gtgtaattat ggtgagtgcc 2340
acgaacaaga ccccacgatg ctatgaaagg ggaaacaggg aggttaggaa ggctctgatg 2400
agggatgact gagagaagat ctgaatgaca aaaacagcaa acacatggct agggtgtctg 2460
aaaagttttg aaacacaaaa taaacttatt ttaaagcatt tccaagtaat atgcttgata 2520
ggcttttctt caaccctcca agaccttttc aggtgaagtt ctctacattt aaagaatgag 2580
ttcaattgtt aatacactgt aaagtgtccc cccaaagtct ggaaacatag gcaaacaata 2640
cctgtattgt gctttttatt agatgaacaa attagaccca ttgctgtccc tgaaaaggac 2700
ctgaagggac agggtagctg acagcagcag tggtggggtg agggtggatg gcagggccca 2760
tgaatggcag ccccacacca caggaccttg taagccacgt ggagggctag agtttgctgt 2820
gagagccacg gggttttaag caaggagggg acataattga tgtggatttc ataaagaaca 2880
ctctctctga cgtgtgagag gacagatggg agagggtgta gggggaggaa atgatcacag 2940
ttgtccagaa agtggcagct gggccaaggg tgatggtggt cggaaagggc gggatgatgg 3000
cgtggctcag gtgttttcca gcagatatgg gtgggagaac tctcaggcag agggacatga 3060
gaggaggtgt ccctgggtgt gcccaacttt ccttcttcta accctagtct cctatgaggg 3120
ctctcagtct ggaatgtata tgtatattct tttttttttt cctttccaaa acaggaaaat 3180
catcagtctc tacctgactc ctcaccatct ctcttggggg gttaaaacac cttattcaac 3240
cccaggcctc tccaccctgc ctgtccaaac cttgccccag ctctcaaacc acactgcctt 3300
acaaaatccc tgtggttacg ataatatcac acacttcaga aaccaaggct ctctggcagc 3360
gagggcacag agtggaacct cagtcagtag ttattaactt ggctattctg gatatatatt 3420
ttcttgatgg ttctcacctt gctaacctgg catgctcagg tgccctctaa tagaggtaag 3480
tactgttttg tacaaaaatg gtctgatcac agtaaatgat ctgcttgtag aacattacat 3540
ttattactta gacccagctc taggcattct gaaataggta tctacatttt ccccaattta 3600
atgaattgct tcaaataact agtcctaata caaagtcgga agctctcttt tccccccaga 3660
gtatatatga tgcataacta gatattcagt attaaaggaa aatagatagt attttccccc 3720
agaattgatg ttataattat ggttgcaatt ttgaacaaaa gcacaaatag agcatcaata 3780
gaaaaatgat taaggcacca tggttattta ataaaatgat gctttatcat gtaaacattc 3840
tggattagaa gactcctgaa atgggccatc tggttgatta ccttacttta ttgatgggaa 3900
atctgggacc aaagagactg catcttgcgg gagatcaccc agctggtgac taatagagcc 3960
agggccagaa ccaatgtctt ctgaggccca ggttaccctc c 4001
<210> 2
<211> 4001
<212> DNA
<213> Homo sapiens
<400> 2
aagacccggt cttcaggttc tctaaaataa tttactgtgt caagttttga taatattcct 60
agctctctga aaatgattga atcaaataag tgtctatttt ttttttctgc aaacactacc 120
cgccagaaca atttccggtt gttaaatagt aaagtcaccg ttcctttggt aaggaatatt 180
taaaggaact cccttggaaa tgaattttgt aataggtgat ttttattctg tttttactct 240
atgtgatact gatccttgca ctggatttca gtttggcaat atgcttttta aaactgggag 300
tcttaaggtg actggatatt agcatttttc actaagtttt ttggatttga aagtaccttt 360
tgatattatt tcttggtgag gctaaaaaaa ttgataaata gctgacaaac ctctgtaacc 420
tacaggtagg ctagatcaat gttatacatt tctaatgtca tggtgttgtt aaagtagaga 480
attttgaaga gtagaaaatc gtagctgtca aaatgcacaa agatgaacat tcagaaagaa 540
ttgctgaatc atcattgctt ggcatataga gtaggcgttc tcatttcttt aaaggttaac 600
acatgattat taccctttat tatattctaa acatatatgt acattttctt ttcctcataa 660
aggttttcta acattttatt tggactgtgt gcaattcttc tgacttttct tatttatctc 720
tctaatagaa atgggaacat ttttgaaaag atgagaaaac catacaggag ataaaagatg 780
agttatatac atagaaaatg tctcataaat acctgaaata tgttatactt tcaaaagcag 840
gcatcaaaag ggtatataaa tgctatgatc taactttgtt aacaaaaaat tataaaacta 900
cagcaaataa ccatgaagat ttatagaaaa cagaattaga agtaaataca gtaatattta 960
acagggatta cctaaaatga tattataggg agatttttaa aaactttttt ctgcgttttc 1020
caaaatctcc aacaatgaat atatatttat aattagggaa atgtgttttt gaaagaaaaa 1080
atatttctca tattactgcc ccttaagtgg catatccaaa ttttacattg atgcagttgc 1140
agttaagtct tgtaaggaac atgagtatgt actgtctgta acctgatact ttgttctagc 1200
tttcctgcag caccacagtt ctttaaatag ggatcctgtg gggcggctgc aggggacggg 1260
ggttgagggg tggcagggac ctcattcagt tttagtttag ttgaagtaac taagttcaga 1320
tactaaaatt tcaaaaataa gtatttgttt gccagaacta tgaactcatg tatactagaa 1380
ataactagtt agctaaaatc atgtttaagt cctttactta ttttaaaata acgcatttga 1440
attttattat agatactccc ttaacaatgt gaaactagta cacaaacttt ctggcgtttc 1500
tgattgatag accccaggaa ttttgaagac gacataaatt acttggaaat cctaatttaa 1560
acaacctgcg ttttggtatt atatacttaa cagacaaata gcttaaaatt aaaaaatccc 1620
taaagtagaa ttgtcgttcc ttttatctgt atattaactt taacttcttt taacataggg 1680
tgtttaatga ctgtgatgat tttcacctag tatggagtac ataaatatat agttgtgaca 1740
ttctgtgttg aaatattggc atgttaaccc atttgtcctg tttaattagg aaataaagac 1800
ggtagagcta gtcggaaaat gaatagttaa tcgtggagga ggagaaaaca tagctcagta 1860
ctttgactag aggcagctgg gaaagtgacg tcctgtagca tttgctgttc tagaaagtac 1920
agagacacgt agtgaaaatg ggaggatcta gaaggaggct gtctcctgtg tagtgtatat 1980
ttatctgtaa gtgagccgtt ggggaaggat tgaatacaga gacgctgtct gcttgctgcc 2040
ttaagacagc tagctgaatt gctgattaac ttttaaaata cccagcttgg tttatttttc 2100
ttagaatctg ttgctaagac tggggacgct gttttctttt acaaagggaa atctaagtta 2160
atttcaaggc attcgaaatg gggaaagact attattgcat tttgggaatt gagaaaggag 2220
cttcagatga agatattaaa aaggcttacc gaaaacaagc cctcaaattt catccggaca 2280
agaacaaatc tcctcaggca gaggaaaaat ttaaagaggt cgcagaagct tatgaagtat 2340
tgagtgatcc taaaaagaga gaaatatatg atcagtttgg ggaggaaggt aagtattctc 2400
tgtatatctt tctgtctctt cagctctgtc tctttttttt ttttctctct ctctgccagc 2460
ctttttcccc tctctctcag cttgttaagg ttatccttaa attggattct cagtcatatc 2520
aaaagtagaa ggacaaatgt ataataacaa ggtttcatct ctgatcatat gaatgagtat 2580
tgcagagcca tgaatgaaat gtccttgttt gcctttttta ttttagcatt ttgtattata 2640
catattttat cattagactt acaatgttaa caatgtatat tttaaaataa accttgaatt 2700
cccataccta ttagtaaatt gtgtcaaaag tcttatgaaa cctgtttcgt tttctaatga 2760
attttctgcc agaaccaata aagagtcttt agtcatttaa aaccttaaga ttggcatttc 2820
agattctgag tagtttatag cttattcaca aaaacaaaaa gtagtttttt gcctttctta 2880
tagcaaggct agtttagact gaacaagaca gagttgtgac tctcagaagc agaaggcgca 2940
ggcatttagc ttctagttgt ttttggcatt ctgcctggag aggaaagaat tttcctgatg 3000
tcataaaata ccagaacaag tttttagctt aattgtaaac aataattttg tgccaagaaa 3060
atgaaatctc agaaatacaa tttttgacct aaaatggcca cctagttaga ataattactg 3120
atttactttt agataacatt atttatttag aaattttact atctaggtat agagaaaaac 3180
atgacaactt ccctttcctc tccctgccat acttttgttt tttgcattgt gagatcaaaa 3240
actagtgagc acagtaaagg aggataattt cagattatcc ttttctcatt ctgcatttta 3300
aatcaggttt gtccatcact tccatataaa cggggctatg aagtaagact attttaaatt 3360
gctaattgat aacacaaaaa actttaactt gtactgtttt ggggattttc aacttttaac 3420
atcaggacta agttatttta ttattagttt ttctcccaaa atactggctt tcaaatcact 3480
gtagctatga aaacttgtta tccctttgat tcctttatta catttaacat caaattttcc 3540
taattagaaa tatatcctta aaatatagtc aagcagtgac tgcttatttt gaatgaaaca 3600
aattaaatgt agacaagcag ggtgattgta tatagagggt tttttccttt ttgttttgtt 3660
tagctttttt ggtctatgag ttgcccaaaa ttgaagttta tgttgcaggt tttggtgtgg 3720
tgaggaccta cagtgtcagg ttcttacaaa gtaaatattg aaaaaggcaa atactgggca 3780
gttcaatctt gcatgatacc tatggccata ttacccaaaa tatcatgact tggtttatct 3840
caaaagctaa gcaggattct gagatactac caataatgta ctactatttc taaaggagta 3900
attttgtaat tttgtaaaga aattcacaaa catggaatga cactgtgaat tcctgtgaaa 3960
aagcagatat ccactgcttt gtatttaaaa tattttgcaa c 4001
<210> 3
<211> 4001
<212> DNA
<213> Homo sapiens
<400> 3
cccggtcttc aggttctcta aaataattta ctgtgtcaag ttttgataat attcctagct 60
ctctgaaaat gattgaatca aataagtgtc tatttttttt ttctgcaaac actacccgcc 120
agaacaattt ccggttgtta aatagtaaag tcaccgttcc tttggtaagg aatatttaaa 180
ggaactccct tggaaatgaa ttttgtaata ggtgattttt attctgtttt tactctatgt 240
gatactgatc cttgcactgg atttcagttt ggcaatatgc tttttaaaac tgggagtctt 300
aaggtgactg gatattagca tttttcacta agttttttgg atttgaaagt accttttgat 360
attatttctt ggtgaggcta aaaaaattga taaatagctg acaaacctct gtaacctaca 420
ggtaggctag atcaatgtta tacatttcta atgtcatggt gttgttaaag tagagaattt 480
tgaagagtag aaaatcgtag ctgtcaaaat gcacaaagat gaacattcag aaagaattgc 540
tgaatcatca ttgcttggca tatagagtag gcgttctcat ttctttaaag gttaacacat 600
gattattacc ctttattata ttctaaacat atatgtacat tttcttttcc tcataaaggt 660
tttctaacat tttatttgga ctgtgtgcaa ttcttctgac ttttcttatt tatctctcta 720
atagaaatgg gaacattttt gaaaagatga gaaaaccata caggagataa aagatgagtt 780
atatacatag aaaatgtctc ataaatacct gaaatatgtt atactttcaa aagcaggcat 840
caaaagggta tataaatgct atgatctaac tttgttaaca aaaaattata aaactacagc 900
aaataaccat gaagatttat agaaaacaga attagaagta aatacagtaa tatttaacag 960
ggattaccta aaatgatatt atagggagat ttttaaaaac ttttttctgc gttttccaaa 1020
atctccaaca atgaatatat atttataatt agggaaatgt gtttttgaaa gaaaaaatat 1080
ttctcatatt actgcccctt aagtggcata tccaaatttt acattgatgc agttgcagtt 1140
aagtcttgta aggaacatga gtatgtactg tctgtaacct gatactttgt tctagctttc 1200
ctgcagcacc acagttcttt aaatagggat cctgtggggc ggctgcaggg gacgggggtt 1260
gaggggtggc agggacctca ttcagtttta gtttagttga agtaactaag ttcagatact 1320
aaaatttcaa aaataagtat ttgtttgcca gaactatgaa ctcatgtata ctagaaataa 1380
ctagttagct aaaatcatgt ttaagtcctt tacttatttt aaaataacgc atttgaattt 1440
tattatagat actcccttaa caatgtgaaa ctagtacaca aactttctgg cgtttctgat 1500
tgatagaccc caggaatttt gaagacgaca taaattactt ggaaatccta atttaaacaa 1560
cctgcgtttt ggtattatat acttaacaga caaatagctt aaaattaaaa aatccctaaa 1620
gtagaattgt cgttcctttt atctgtatat taactttaac ttcttttaac atagggtgtt 1680
taatgactgt gatgattttc acctagtatg gagtacataa atatatagtt gtgacattct 1740
gtgttgaaat attggcatgt taacccattt gtcctgttta attaggaaat aaagacggta 1800
gagctagtcg gaaaatgaat agttaatcgt ggaggaggag aaaacatagc tcagtacttt 1860
gactagaggc agctgggaaa gtgacgtcct gtagcatttg ctgttctaga aagtacagag 1920
acacgtagtg aaaatgggag gatctagaag gaggctgtct cctgtgtagt gtatatttat 1980
ctgtaagtga gccgttgggg aaggattgaa tacagagacg ctgtctgctt gctgccttaa 2040
gacagctagc tgaattgctg attaactttt aaaataccca gcttggttta tttttcttag 2100
aatctgttgc taagactggg gacgctgttt tcttttacaa agggaaatct aagttaattt 2160
caaggcattc gaaatgggga aagactatta ttgcattttg ggaattgaga aaggagcttc 2220
agatgaagat attaaaaagg cttaccgaaa acaagccctc aaatttcatc cggacaagaa 2280
caaatctcct caggcagagg aaaaatttaa agaggtcgca gaagcttatg aagtattgag 2340
tgatcctaaa aagagagaaa tatatgatca gtttggggag gaaggtaagt attctctgta 2400
tatctttctg tctcttcagc tctgtctctt tttttttttt ctctctctct gccagccttt 2460
ttcccctctc tctcagcttg ttaaggttat ccttaaattg gattctcagt catatcaaaa 2520
gtagaaggac aaatgtataa taacaaggtt tcatctctga tcatatgaat gagtattgca 2580
gagccatgaa tgaaatgtcc ttgtttgcct tttttatttt agcattttgt attatacata 2640
ttttatcatt agacttacaa tgttaacaat gtatatttta aaataaacct tgaattccca 2700
tacctattag taaattgtgt caaaagtctt atgaaacctg tttcgttttc taatgaattt 2760
tctgccagaa ccaataaaga gtctttagtc atttaaaacc ttaagattgg catttcagat 2820
tctgagtagt ttatagctta ttcacaaaaa caaaaagtag ttttttgcct ttcttatagc 2880
aaggctagtt tagactgaac aagacagagt tgtgactctc agaagcagaa ggcgcaggca 2940
tttagcttct agttgttttt ggcattctgc ctggagagga aagaattttc ctgatgtcat 3000
aaaataccag aacaagtttt tagcttaatt gtaaacaata attttgtgcc aagaaaatga 3060
aatctcagaa atacaatttt tgacctaaaa tggccaccta gttagaataa ttactgattt 3120
acttttagat aacattattt atttagaaat tttactatct aggtatagag aaaaacatga 3180
caacttccct ttcctctccc tgccatactt ttgttttttg cattgtgaga tcaaaaacta 3240
gtgagcacag taaaggagga taatttcaga ttatcctttt ctcattctgc attttaaatc 3300
aggtttgtcc atcacttcca tataaacggg gctatgaagt aagactattt taaattgcta 3360
attgataaca caaaaaactt taacttgtac tgttttgggg attttcaact tttaacatca 3420
ggactaagtt attttattat tagtttttct cccaaaatac tggctttcaa atcactgtag 3480
ctatgaaaac ttgttatccc tttgattcct ttattacatt taacatcaaa ttttcctaat 3540
tagaaatata tccttaaaat atagtcaagc agtgactgct tattttgaat gaaacaaatt 3600
aaatgtagac aagcagggtg attgtatata gagggttttt tcctttttgt tttgtttagc 3660
ttttttggtc tatgagttgc ccaaaattga agtttatgtt gcaggttttg gtgtggtgag 3720
gacctacagt gtcaggttct tacaaagtaa atattgaaaa aggcaaatac tgggcagttc 3780
aatcttgcat gatacctatg gccatattac ccaaaatatc atgacttggt ttatctcaaa 3840
agctaagcag gattctgaga tactaccaat aatgtactac tatttctaaa ggagtaattt 3900
tgtaattttg taaagaaatt cacaaacatg gaatgacact gtgaattcct gtgaaaaagc 3960
agatatccac tgctttgtat ttaaaatatt ttgcaacaat t 4001
<210> 4
<211> 4001
<212> DNA
<213> Homo sapiens
<400> 4
acacacacac acacacatat atatatatag agagagagag agagagagag acagagacag 60
agacagagag atccttgtaa aatagtagaa aatgcacctg atttggaagc tcaaaatctg 120
tgttcaaatt tcagttcttg aagttgaata attgatctct tgagctttta gagtgtctgt 180
atgtctgcaa aagaaagata tcatgagaat tatatataag acatgccaag tatatgcata 240
tgttaaatat aatatattgt ataattacat atataattat ataatggaaa taaagatgtt 300
tagaataatt cttgtaatat tagcaggtaa ctaaagaata atgatacttt gacatttttg 360
tctttagcat aattgcaatc atttgagtct gccactatat tttacccatt ttaattctga 420
aacataataa atacttcaca ttcatcaaga ataccagaca tactagataa aaaactatgg 480
ccagcaaaaa agacctttct catatttaaa tggctatgaa tacttggagg tgtgagatct 540
gccataaatt ctaatttcag ttctgtcact tgattcataa tacatgcaat ctatatcaac 600
acataatatc tatatcaatc tactttctac ttcaatattt aaaaatccga agttaaaagt 660
tggccctaaa atgaaagtca tgtttcatat aaccataata ttacctaata gaaaaccatt 720
actaagttga ggcatagaaa tagttgaatt ctgtttcagt cattacagtg ataaattgta 780
aatgcaaacc atttattctg tttttatctg taaaacagaa aagaggacat ctacatatgc 840
tcaaggtcta aaattctgtg gttttataat cttctaattc tattaatttc ctaataaata 900
atttaaaata taaactagtt tcaaaatact tatgtttcaa ataatgcagg aagcattttc 960
agttttctac gttcatcaaa gatgaggctt acctttcttg ggctggcatc atatttcaga 1020
tatggcatgt catattcaac aacttaaaga tggctgggcg tgatggttca cacctgtaat 1080
cccagcactt tgggaagcca aggagggcag atcatgaggt caggagttcg agaccagcct 1140
gatcaacatg gcaaaacccc atcgctacta aaaatacaaa aattagccag gtgtggtggg 1200
gtgtgcctat aatcccagct actcaggagg ctgaggcagg agaatcgctt gaacccaaga 1260
ggtgtaggtt gcagtgagtc gcgatcatgc cactgcactc tagcctgggc aacagagcga 1320
gactccatct caaaaaaaca aaaaacaaac aacaacaaca acaacaaaac ttaaagacta 1380
gaggggttaa aatatggaag gaaattatat gttttaaaga aaaagacaaa ataataaaca 1440
aatgcattta atatctactt tatgagtaga agagaagact ggtctatttt ctagtgaaga 1500
aatggtcact gtgagcaaag cttctttcat tccacattgc ctttcttaag atttgtatca 1560
ttacatgttt gcaatgcact gtaatgtcac agtcattgtt atgaataatt acaaagtgag 1620
atgtgtattt tatgagactg ttacagagaa cgaatcagtg ggggaaatta aaaataaaag 1680
agcctagatg tcttcacagc agaaagtcag ttacctatgc tcctcttcag cttttattat 1740
aaaatgtttt tagtgctcag aacctcagct ttaatgatgt ttatttcttt ctccattttt 1800
cacaatgtga ccaaaaactg caccatcatt aacaaatcat cgattccaca tccccttctc 1860
tgactcactc ttttctattt ttccagcaca cttaatcaaa cttctcaatt ataacactta 1920
ccattgagaa gcttcatgca cttgcccctc tcccataatc ctcagtgtgt tttcagcttt 1980
cttcttttcc tgcttaagtt ccagagacaa tcattaaaat cattttgtta tcaatgtctt 2040
taatttctat gtccctctct ccatcccttc tcgcatgttg gtcaagaccc tactaatttc 2100
ctttgagctt acatcctgcc aattgagggt tgttagataa aatcacatag attggcatta 2160
tatattccta acagtattaa ttgagtagtc aacaatgtcc atcaatctta ctctattttc 2220
atatggtgct aaaatattta tatcaatgac tttttaatgc attttccaca ctaatcaaac 2280
ctctcactcc cttcatcttt tcatcacact ttttagctga tgacctcaat tactattttg 2340
taaaatcttg aagccatcag ataagaactc aactcatctc atcaccaaac tgtaaactta 2400
acctgaacag tcaccaatac tctctttatt cactcttgtt ataatggatg tcagagtttg 2460
gtttcccttc ttctataatc ctagcactct tattgtattt gggatcctat cttactcatt 2520
ttttagtcac atccattgac tatatcttcc ctttctacac attgttgtac gtgagtattc 2580
aaacattctc tggcatctcc catattaaag aaaaataaat ctctccttga atcccaaatt 2640
actttccata tgctataata ttttcttttt catttcaagt caaatttatt gggaagaatc 2700
tttatatatg ctactttatc tcctaagccc atgttgaatc tgaactgact ctccatctgg 2760
cttctacctt tgccattata tcaaaactgc tcctaccaaa gacaccaaaa atatgttctg 2820
cccaccccaa cctttcttga ataattggat aattcctatt caagtctcat attttgtttc 2880
ctcaatgata ccttaataat taattaaatt atgctgtcat agtactctgt atttattctt 2940
cttagcatat atctacatag attataattc aatacttgtt gttgctcaaa ggcacatcac 3000
tctatggtgg tacaattaaa taaaaattct gaccaagaag cattactgag tactgagagt 3060
aacactatca atacatagca aaacttatca aaagaaatct gcttattgca actcctcccc 3120
taaagattca ttcaacagtg tgatggctct aatccctttt tcacttttct ctgcctaaaa 3180
caatctgaat tgtacctcca cctaatcaac tatcctcttc atttcttact caattatctc 3240
gctacctctt tactatatat agaaatctag tctttttcag cagcattaaa aaaatgtctt 3300
ctaagaagga tagagacctt gtatccttct ataaatggga aagggttata aactcttcag 3360
taatgtttaa tttataacct tactacttac ttagtcccta tctttcttag atctcttacc 3420
tataattgag ggagtgaatg agtgaataaa tatttgaaaa agaaattggt aatgacatcg 3480
tgattcactg acaatatgcc aagaaagaag aatgaaaact gttagaatta aatatttgtg 3540
ttattttaag gttttgaaac tctgcccttt cttttagagg tctaagtcat tagatcccaa 3600
tatctttatt taatttttat atggtagaga tctaggaaga acataattga gacatgtcat 3660
ataattactt cctgatttga tgtgccattt taattaattt ggggttaagt tcccatacta 3720
taggataaaa tggtggggtt catgttatct ctttaatttt cccattactc tggggaggag 3780
aagggagagt acttaagctt ttcatattaa agtttctaag ctgataaaaa ttcaaggtta 3840
gcacagattt tgtgtggttc aagtaagaca cacactatct gaggaatatt taggcgatca 3900
aatatatttt tagaactgtc tcagagtttt ttttttctct gacaaaataa tacagtttct 3960
aggaaaccac tgaaagacta ttttttaagt tagcttcatg a 4001
<210> 5
<211> 4001
<212> DNA
<213> Homo sapiens
<400> 5
acaaaaaaca gcaaatcagt gttgtatagg taaacatcgg aatgaagaaa aaggtcacag 60
gaatgaatat gaagaaagaa aagaatagat aggagagcaa agaccagaga gctagtatat 120
gaataatagg ggtccccaga gaggaaagca gaacaaatga agctgaaaca acaaaatcac 180
agcagaatat tttcctatcc tgtaaaaata aattgcagtt ttcagatagt cttgccatgt 240
tttgaagcaa aaatacattt aataacactg agcacttaat atattgctag aaagttatta 300
tatatattgt ttagtacttt gataagaaag agatagcaaa atatagtcct ttccctcaag 360
gggctaaaaa tctattggaa aaagcacata aaaatataat tgtttaatat atggcaaagt 420
cagtgatgga cataatgtag gtttgttgta gcaccaaagc aaggcaaagt attggttaag 480
ggcttaatgt gtctatggga gcaggcatcc aagacaatat gagatgtaaa taatgaatta 540
tgtccaagca ctagatgaag ggagatggaa agcatagcta ttcaagacag caggacaatg 600
tggaggaata tataataata accagataat attctatggg aaaatacaca gtttcttgtt 660
tctggatcaa agaaatccaa ggtgggagtt aagaagctca gggccagaaa agtgggcaaa 720
agtgagtttc aacgactgtg tgtgccatgt tgtgaagtag acactttatt ctgaggtgat 780
agaacatcac tgaagagttt tttgtagggt aaggaaaatt ttcatgtgac atgaagcact 840
gtttacttct cagagggtgt actagagaaa aggagactag ttgaaaggct attgccaaag 900
tctgggtaag aaataacaaa gactcaggat agaagggagt agttacagca gaaaacagga 960
tggatgtgag aaatgttcag ctgtgaaaat agaattaagt aactcatgga taggctgaag 1020
ggtaaaatgg gtaagaaggg ctcagtgatg ccttctgtat ttctagcttg atgactgtgt 1080
tatgaggctg gaacatgatg aacagagaaa taaaatgttg catattaaat ttgtaatcct 1140
ccctaagagg agatgcaaag ttattgtttt attcttaatt ttgataatta cttaaatata 1200
ggtttgactg ccatttaaca aaatctgaaa ctcgtcacta ggagaattac atgtttgtca 1260
acatgccaaa atcccataga aaattattta aaaatatagc ccatgtggtc aaatgctgag 1320
ccagatgaaa gagcaaagtg gcaaaaacaa aacagaaaaa gcacacccag aaagcgtgaa 1380
acaggattac agaatttaat ataggcagag ttttgaaaat aaatgtactt taaacacatt 1440
cattaaaaaa aaaaaactct caaattgggt taacaatata caacttcttg gatattatgg 1500
cagcagcctg gtcatgcctg acttaccttt catcactttc agttaaacac tccaaagtgt 1560
ccagcccgag ataaaagctg acagcaacac agctggttga tgaaaagaga agcccatatt 1620
caaattccag tactaagccc gctggcagga cgtttgatga gatgttccca gaagcacaaa 1680
cttgaccaga gaacagtctt ctctctggaa agaaattgat tgtgaaagtc accttccacc 1740
ttccactacc tggctgtctc ccatcccaca tcctggacac gtttcaatgt actctatctc 1800
agtcaagggc acagcgatct atcccgtttt tatacctggc tcctaacact ttctgtatta 1860
gtctgttatc atgctgctaa taaagacata ccaaacactg ggtaattaat aaaggaaaga 1920
ggtttaattg actcacagtt tagcatggct gtggagacct caggaaactt accaatatgg 1980
tggaaaggga agcaaacatg tccttcttaa caggagtgca ggggaactac cctttataaa 2040
accatcagat ctcgtgagac ttactcactg tcatgagaac agcacaggaa cctcatgatt 2100
ccattacttc ccatcaggtc cctcccacaa cacgtgggga ttatgggagc tacaattcaa 2160
gatgagatct aggtgacgaa ccaaccaaat catatcactg tccctttcct catgcctgct 2220
atccaatctg tcagcagttc tgagtatttt acctccaaac tcttttacat ttgattgctt 2280
ttctttctac caccatcacc atagtccaag ctactttcac tcctgaactt cttaagcaat 2340
tcccccacaa tctatttctt gctctttcat ttataaaggt tttcctggag tgccagagtt 2400
acattacacg tgcatgtcac tctttttggc accatagaga ccaaagtact gaacaagaca 2460
agatccctgg tctgaaagag cctacgtttt agttagattg acagctagtg tctgcttgta 2520
ttacctttca ccaggtaagc tacttgcttt ggggaaaacc aaacaggata aaggctgggt 2580
ccttcgtgaa gaaaagatct tggagaaagt agagtacaga gaccatagcc gaagtcttgg 2640
tagaatatgg ttggcttgaa tggggctatc actgggagtc ctggaaaata caaagcagca 2700
aaaaagcaga aaaaaaaatc actaaaatta cacaaaattt actggtagca agcacaatgc 2760
atcgtataca tattgttttc tttgaaaatt ttgatctaac aacttttgat ctaacactct 2820
ttggaaacta aaaaagcaat actcacacat gtgtaaatat tagatagtta agaatattct 2880
ttgcagcact actgctgcaa gatatggtaa gaaaaggatt agctcagaat atatttgcag 2940
agaagagcca tggaatttga actggacgtg gaatgtgagg gaaagaggac tccttattga 3000
atgaaagaaa ggtgatacta aattggtgga gctatttgtt tgaagtgtga gacttgggga 3060
agaagtctgc tgcacaagtg gaaataattc aatggtattc ttggactttt tttgataata 3120
gagagatact aatattttgt acatatcaac ttttgtagaa agtcttaaaa tgtcttaatc 3180
cactttacac tttattaagt ggagtctaat gaagaaattg agtatttatt ttctgcctga 3240
tgctcttgcg agtcattatc gttgttctga atatcagttg tcactattca ggcctctgac 3300
atgagatatg caagatgaag gagattctag acaaatagaa gaaccacagt gcttgatttc 3360
aaaattttca aagaaaacag tatgtatacg gtgcactgtg cttgctacca gtaaattgtg 3420
taattttagt gatttttatt ctgctttttt gctactttgc attttccagg actcccaatg 3480
atagcaaaat ctgtgggtcc tcaagtccct tgtgtaaaat gatgtggtat ttgcataaac 3540
tatgtcatca tctcatatat tctaagtcat ctcaagatta cttaaaatac ctaatacaat 3600
ttaaatgctg ggtaaatagt tgttatactg tattgtttca ggagtaatga caagaaaaaa 3660
tgtctgcaca tgttcagaca gatacaacca tccatatttc ttctgaatat tttcagttca 3720
caaatcggtt gactccaaat agatgtggaa cccatggata ccaattgact attacagtgt 3780
aataattttg tgtgcttatt cagctataat atgtaaatac atttttcact ttttaagatc 3840
caggtactac ttaaatattc agaatcaaga tatgaaagtg cttcctaaga ctctccttca 3900
gatacccatg tcaggaaatt cagatgttca catttcactg tatattgttc tggggatttt 3960
ccccatgcct tcaaatatgt ctatcatgtc ttcttaagtt t 4001
<210> 6
<211> 4001
<212> DNA
<213> Homo sapiens
<400> 6
acagcaaatc agtgttgtat aggtaaacat cggaatgaag aaaaaggtca caggaatgaa 60
tatgaagaaa gaaaagaata gataggagag caaagaccag agagctagta tatgaataat 120
aggggtcccc agagaggaaa gcagaacaaa tgaagctgaa acaacaaaat cacagcagaa 180
tattttccta tcctgtaaaa ataaattgca gttttcagat agtcttgcca tgttttgaag 240
caaaaataca tttaataaca ctgagcactt aatatattgc tagaaagtta ttatatatat 300
tgtttagtac tttgataaga aagagatagc aaaatatagt cctttccctc aaggggctaa 360
aaatctattg gaaaaagcac ataaaaatat aattgtttaa tatatggcaa agtcagtgat 420
ggacataatg taggtttgtt gtagcaccaa agcaaggcaa agtattggtt aagggcttaa 480
tgtgtctatg ggagcaggca tccaagacaa tatgagatgt aaataatgaa ttatgtccaa 540
gcactagatg aagggagatg gaaagcatag ctattcaaga cagcaggaca atgtggagga 600
atatataata ataaccagat aatattctat gggaaaatac acagtttctt gtttctggat 660
caaagaaatc caaggtggga gttaagaagc tcagggccag aaaagtgggc aaaagtgagt 720
ttcaacgact gtgtgtgcca tgttgtgaag tagacacttt attctgaggt gatagaacat 780
cactgaagag ttttttgtag ggtaaggaaa attttcatgt gacatgaagc actgtttact 840
tctcagaggg tgtactagag aaaaggagac tagttgaaag gctattgcca aagtctgggt 900
aagaaataac aaagactcag gatagaaggg agtagttaca gcagaaaaca ggatggatgt 960
gagaaatgtt cagctgtgaa aatagaatta agtaactcat ggataggctg aagggtaaaa 1020
tgggtaagaa gggctcagtg atgccttctg tatttctagc ttgatgactg tgttatgagg 1080
ctggaacatg atgaacagag aaataaaatg ttgcatatta aatttgtaat cctccctaag 1140
aggagatgca aagttattgt tttattctta attttgataa ttacttaaat ataggtttga 1200
ctgccattta acaaaatctg aaactcgtca ctaggagaat tacatgtttg tcaacatgcc 1260
aaaatcccat agaaaattat ttaaaaatat agcccatgtg gtcaaatgct gagccagatg 1320
aaagagcaaa gtggcaaaaa caaaacagaa aaagcacacc cagaaagcgt gaaacaggat 1380
tacagaattt aatataggca gagttttgaa aataaatgta ctttaaacac attcattaaa 1440
aaaaaaaaac tctcaaattg ggttaacaat atacaacttc ttggatatta tggcagcagc 1500
ctggtcatgc ctgacttacc tttcatcact ttcagttaaa cactccaaag tgtccagccc 1560
gagataaaag ctgacagcaa cacagctggt tgatgaaaag agaagcccat attcaaattc 1620
cagtactaag cccgctggca ggacgtttga tgagatgttc ccagaagcac aaacttgacc 1680
agagaacagt cttctctctg gaaagaaatt gattgtgaaa gtcaccttcc accttccact 1740
acctggctgt ctcccatccc acatcctgga cacgtttcaa tgtactctat ctcagtcaag 1800
ggcacagcga tctatcccgt ttttatacct ggctcctaac actttctgta ttagtctgtt 1860
atcatgctgc taataaagac ataccaaaca ctgggtaatt aataaaggaa agaggtttaa 1920
ttgactcaca gtttagcatg gctgtggaga cctcaggaaa cttaccaata tggtggaaag 1980
ggaagcaaac atgtccttct taacaggagt gcaggggaac taccctttat aaaaccatca 2040
gatctcgtga gacttactca ctgtcatgag aacagcacag gaacctcatg attccattac 2100
ttcccatcag gtccctccca caacacgtgg ggattatggg agctacaatt caagatgaga 2160
tctaggtgac gaaccaacca aatcatatca ctgtcccttt cctcatgcct gctatccaat 2220
ctgtcagcag ttctgagtat tttacctcca aactctttta catttgattg cttttctttc 2280
taccaccatc accatagtcc aagctacttt cactcctgaa cttcttaagc aattccccca 2340
caatctattt cttgctcttt catttataaa ggttttcctg gagtgccaga gttacattac 2400
acgtgcatgt cactcttttt ggcaccatag agaccaaagt actgaacaag acaagatccc 2460
tggtctgaaa gagcctacgt tttagttaga ttgacagcta gtgtctgctt gtattacctt 2520
tcaccaggta agctacttgc tttggggaaa accaaacagg ataaaggctg ggtccttcgt 2580
gaagaaaaga tcttggagaa agtagagtac agagaccata gccgaagtct tggtagaata 2640
tggttggctt gaatggggct atcactggga gtcctggaaa atacaaagca gcaaaaaagc 2700
agaaaaaaaa atcactaaaa ttacacaaaa tttactggta gcaagcacaa tgcatcgtat 2760
acatattgtt ttctttgaaa attttgatct aacaactttt gatctaacac tctttggaaa 2820
ctaaaaaagc aatactcaca catgtgtaaa tattagatag ttaagaatat tctttgcagc 2880
actactgctg caagatatgg taagaaaagg attagctcag aatatatttg cagagaagag 2940
ccatggaatt tgaactggac gtggaatgtg agggaaagag gactccttat tgaatgaaag 3000
aaaggtgata ctaaattggt ggagctattt gtttgaagtg tgagacttgg ggaagaagtc 3060
tgctgcacaa gtggaaataa ttcaatggta ttcttggact ttttttgata atagagagat 3120
actaatattt tgtacatatc aacttttgta gaaagtctta aaatgtctta atccacttta 3180
cactttatta agtggagtct aatgaagaaa ttgagtattt attttctgcc tgatgctctt 3240
gcgagtcatt atcgttgttc tgaatatcag ttgtcactat tcaggcctct gacatgagat 3300
atgcaagatg aaggagattc tagacaaata gaagaaccac agtgcttgat ttcaaaattt 3360
tcaaagaaaa cagtatgtat acggtgcact gtgcttgcta ccagtaaatt gtgtaatttt 3420
agtgattttt attctgcttt tttgctactt tgcattttcc aggactccca atgatagcaa 3480
aatctgtggg tcctcaagtc ccttgtgtaa aatgatgtgg tatttgcata aactatgtca 3540
tcatctcata tattctaagt catctcaaga ttacttaaaa tacctaatac aatttaaatg 3600
ctgggtaaat agttgttata ctgtattgtt tcaggagtaa tgacaagaaa aaatgtctgc 3660
acatgttcag acagatacaa ccatccatat ttcttctgaa tattttcagt tcacaaatcg 3720
gttgactcca aatagatgtg gaacccatgg ataccaattg actattacag tgtaataatt 3780
ttgtgtgctt attcagctat aatatgtaaa tacatttttc actttttaag atccaggtac 3840
tacttaaata ttcagaatca agatatgaaa gtgcttccta agactctcct tcagataccc 3900
atgtcaggaa attcagatgt tcacatttca ctgtatattg ttctggggat tttccccatg 3960
ccttcaaata tgtctatcat gtcttcttaa gtttttctaa a 4001
<210> 7
<211> 4001
<212> DNA
<213> Homo sapiens
<400> 7
acatcggaat gaagaaaaag gtcacaggaa tgaatatgaa gaaagaaaag aatagatagg 60
agagcaaaga ccagagagct agtatatgaa taataggggt ccccagagag gaaagcagaa 120
caaatgaagc tgaaacaaca aaatcacagc agaatatttt cctatcctgt aaaaataaat 180
tgcagttttc agatagtctt gccatgtttt gaagcaaaaa tacatttaat aacactgagc 240
acttaatata ttgctagaaa gttattatat atattgttta gtactttgat aagaaagaga 300
tagcaaaata tagtcctttc cctcaagggg ctaaaaatct attggaaaaa gcacataaaa 360
atataattgt ttaatatatg gcaaagtcag tgatggacat aatgtaggtt tgttgtagca 420
ccaaagcaag gcaaagtatt ggttaagggc ttaatgtgtc tatgggagca ggcatccaag 480
acaatatgag atgtaaataa tgaattatgt ccaagcacta gatgaaggga gatggaaagc 540
atagctattc aagacagcag gacaatgtgg aggaatatat aataataacc agataatatt 600
ctatgggaaa atacacagtt tcttgtttct ggatcaaaga aatccaaggt gggagttaag 660
aagctcaggg ccagaaaagt gggcaaaagt gagtttcaac gactgtgtgt gccatgttgt 720
gaagtagaca ctttattctg aggtgataga acatcactga agagtttttt gtagggtaag 780
gaaaattttc atgtgacatg aagcactgtt tacttctcag agggtgtact agagaaaagg 840
agactagttg aaaggctatt gccaaagtct gggtaagaaa taacaaagac tcaggataga 900
agggagtagt tacagcagaa aacaggatgg atgtgagaaa tgttcagctg tgaaaataga 960
attaagtaac tcatggatag gctgaagggt aaaatgggta agaagggctc agtgatgcct 1020
tctgtatttc tagcttgatg actgtgttat gaggctggaa catgatgaac agagaaataa 1080
aatgttgcat attaaatttg taatcctccc taagaggaga tgcaaagtta ttgttttatt 1140
cttaattttg ataattactt aaatataggt ttgactgcca tttaacaaaa tctgaaactc 1200
gtcactagga gaattacatg tttgtcaaca tgccaaaatc ccatagaaaa ttatttaaaa 1260
atatagccca tgtggtcaaa tgctgagcca gatgaaagag caaagtggca aaaacaaaac 1320
agaaaaagca cacccagaaa gcgtgaaaca ggattacaga atttaatata ggcagagttt 1380
tgaaaataaa tgtactttaa acacattcat taaaaaaaaa aaactctcaa attgggttaa 1440
caatatacaa cttcttggat attatggcag cagcctggtc atgcctgact tacctttcat 1500
cactttcagt taaacactcc aaagtgtcca gcccgagata aaagctgaca gcaacacagc 1560
tggttgatga aaagagaagc ccatattcaa attccagtac taagcccgct ggcaggacgt 1620
ttgatgagat gttcccagaa gcacaaactt gaccagagaa cagtcttctc tctggaaaga 1680
aattgattgt gaaagtcacc ttccaccttc cactacctgg ctgtctccca tcccacatcc 1740
tggacacgtt tcaatgtact ctatctcagt caagggcaca gcgatctatc ccgtttttat 1800
acctggctcc taacactttc tgtattagtc tgttatcatg ctgctaataa agacatacca 1860
aacactgggt aattaataaa ggaaagaggt ttaattgact cacagtttag catggctgtg 1920
gagacctcag gaaacttacc aatatggtgg aaagggaagc aaacatgtcc ttcttaacag 1980
gagtgcaggg gaactaccct ttataaaacc atcagatctc gtgagactta ctcactgtca 2040
tgagaacagc acaggaacct catgattcca ttacttccca tcaggtccct cccacaacac 2100
gtggggatta tgggagctac aattcaagat gagatctagg tgacgaacca accaaatcat 2160
atcactgtcc ctttcctcat gcctgctatc caatctgtca gcagttctga gtattttacc 2220
tccaaactct tttacatttg attgcttttc tttctaccac catcaccata gtccaagcta 2280
ctttcactcc tgaacttctt aagcaattcc cccacaatct atttcttgct ctttcattta 2340
taaaggtttt cctggagtgc cagagttaca ttacacgtgc atgtcactct ttttggcacc 2400
atagagacca aagtactgaa caagacaaga tccctggtct gaaagagcct acgttttagt 2460
tagattgaca gctagtgtct gcttgtatta cctttcacca ggtaagctac ttgctttggg 2520
gaaaaccaaa caggataaag gctgggtcct tcgtgaagaa aagatcttgg agaaagtaga 2580
gtacagagac catagccgaa gtcttggtag aatatggttg gcttgaatgg ggctatcact 2640
gggagtcctg gaaaatacaa agcagcaaaa aagcagaaaa aaaaatcact aaaattacac 2700
aaaatttact ggtagcaagc acaatgcatc gtatacatat tgttttcttt gaaaattttg 2760
atctaacaac ttttgatcta acactctttg gaaactaaaa aagcaatact cacacatgtg 2820
taaatattag atagttaaga atattctttg cagcactact gctgcaagat atggtaagaa 2880
aaggattagc tcagaatata tttgcagaga agagccatgg aatttgaact ggacgtggaa 2940
tgtgagggaa agaggactcc ttattgaatg aaagaaaggt gatactaaat tggtggagct 3000
atttgtttga agtgtgagac ttggggaaga agtctgctgc acaagtggaa ataattcaat 3060
ggtattcttg gacttttttt gataatagag agatactaat attttgtaca tatcaacttt 3120
tgtagaaagt cttaaaatgt cttaatccac tttacacttt attaagtgga gtctaatgaa 3180
gaaattgagt atttattttc tgcctgatgc tcttgcgagt cattatcgtt gttctgaata 3240
tcagttgtca ctattcaggc ctctgacatg agatatgcaa gatgaaggag attctagaca 3300
aatagaagaa ccacagtgct tgatttcaaa attttcaaag aaaacagtat gtatacggtg 3360
cactgtgctt gctaccagta aattgtgtaa ttttagtgat ttttattctg cttttttgct 3420
actttgcatt ttccaggact cccaatgata gcaaaatctg tgggtcctca agtcccttgt 3480
gtaaaatgat gtggtatttg cataaactat gtcatcatct catatattct aagtcatctc 3540
aagattactt aaaataccta atacaattta aatgctgggt aaatagttgt tatactgtat 3600
tgtttcagga gtaatgacaa gaaaaaatgt ctgcacatgt tcagacagat acaaccatcc 3660
atatttcttc tgaatatttt cagttcacaa atcggttgac tccaaataga tgtggaaccc 3720
atggatacca attgactatt acagtgtaat aattttgtgt gcttattcag ctataatatg 3780
taaatacatt tttcactttt taagatccag gtactactta aatattcaga atcaagatat 3840
gaaagtgctt cctaagactc tccttcagat acccatgtca ggaaattcag atgttcacat 3900
ttcactgtat attgttctgg ggattttccc catgccttca aatatgtcta tcatgtcttc 3960
ttaagttttt ctaaatatct gtatctacac aaatatatat c 4001
<210> 8
<211> 4001
<212> DNA
<213> Homo sapiens
<400> 8
agcgagtccc aactatacaa atcacagaaa aacaagacac agaaagtgtc actaaatcaa 60
gaactttaaa ataaatgttt tttttatccc tgactataat catttaccat aggcctcaag 120
gaaaaaaaaa atgctaccat gaaacctaat gtttccgtga aagcacagga aattagttat 180
tctgtcttta aaataaaaca aaattaatgt atatttttct aatgaatttg ataaagtgaa 240
ccctaagttg agtgtggcat tcaccaaaac tataagaaaa aaccccgttt actataagat 300
ggcaaaatat gggtcttagt tcaagctaac acttttaaag tagcctttct ttcacaactt 360
aaaaaaatgt tgttaataaa aaatggccct cactttgaat ctgaagtatt ttttaatgtc 420
accaagtgct attattaaat gtgtactata tatgcatgta tacacacata aaaataagca 480
ctacagatgt gggcatgaaa attcttagca aaacccccac atatttcaaa tactgttctt 540
cttcattatt gtcctctacc ataattagca cgttacagca taacaggtgg ctttcattta 600
tagaatgcaa aattaatgaa atttagagat ggtcttgtct ccaaaaacaa attggccatt 660
ctataaaaag tacacccctt aatctacaaa acaaacagaa acaaactccc ttcatgctgg 720
ttttccaaat tataggaatt tctgaatgta gttaagtaaa ttgaatgcat ttactgacag 780
cttaaaattc atacacttca aggaaaacaa gtcattatta tggattttat aagaaagggg 840
aagagaaaca aagattgccg atttttccag cttggtaacc ccagatctgt ttaatgctct 900
ctgagtgata acaggattta ctgctataca ctgggctgct ccacaaatgc taaatcttag 960
gcaaatttgt gggcagttgt gttataggtc ctgacctagc cattctggac agacagacca 1020
ggaaagcata aatagtactg ctataaacta actatagcat ctgtccctaa caagcaagac 1080
tcacaaagta tttttttaaa atatgaggag acactgaata gactgtcaac acacatagga 1140
atgtttaata cactcaaata aataagatat gtgctttaac tagtcagaca ttgtttaaat 1200
ttagtgggtg aagaaagaga gagggtcaga caggcagaca cagcctccca attatgcaga 1260
atgatccttc agatcatgtg aacgctataa ttaaatgttg ctaccaaatc cccactaccc 1320
tttctcccac ctagaaaaag ttaatgcatg aattcagtat gagcaaattg tgatttataa 1380
aaacaaacaa acaaacaaac aaacaaaacc caccctattc actccgtagg ggaataaagc 1440
tttcttgcat taagtcacgc atcatggggg taggaaaaaa gcacagtact gaaagaaatc 1500
caacgtaaca caatagtaat tgttagacca cctccttcag ctcacctctg cttattcatt 1560
tgttttaaag ctctacacta cccacccaag aggtatacca tctctgcaag ctgtttccta 1620
ggcagcatct tttagattca ggacttaata ataaaaaaca acaacaacaa caacaaaaca 1680
tttctatcag aatatctaag gagaaatttg ttttcagaaa gaggcatgag aggctaagag 1740
ctctgtgtct atcaacagac cagaggttac ctatctgtcc ttgcacgtgg gcacacacaa 1800
atacaaacac atacacactg tggtcctcta caactaggat aaattttctc ctttaaaaaa 1860
atgctaaaag caggcaccag aaaggttaag aataagagaa caatgaaaaa ataaagttca 1920
tgtaaacagt aagcacgtac ctgccaggcg atgccatttc ctctacattg tggagcccga 1980
gtgctgggta ttagcattat tgctactata aagcttccga taaaaatgga tcgtctcttg 2040
ctgacctgag aaccagctgg ctgcctgaga agtggtttcc cccatatatg attcagtctc 2100
cttgacaaat atactgcatt gtaccggatg agtcatccga gggttagtcc gcattttata 2160
ggcctgttgt gtcattactc acattcaaca tcctgcttct caaaagtgga aactggaagc 2220
ctgacagctg acagatgcgt tttccaatta aaatattgtt ttaaaaattc ctgtaagtta 2280
tgattaccaa tcattgatga aaggatgaaa tgcccaggaa gcaaatggca cagttcaata 2340
cctgagatta cattctggag aaccacaagt gcaaattctt cacctgcccc cagccccaaa 2400
ccaaagaatt tcgtccaaag ttgagagcaa ggatccctct ctttaaagtt atcagctgaa 2460
ttcgtgtagg catggacttg ttgattcttg tgctgtgagg ttacccagct gcagaagcta 2520
tgtaaaaagt atacatcaaa attttggtcc tggattaatt ttataggtgg gatagtggca 2580
ttgtgtttat aaaatgtcta cttttttaga gaggcatgtt gaagtatgtt aggggtaaaa 2640
tgatatgata tctggggttt gttgtgaata ttttagcaaa aaataaataa gggacggggt 2700
aaagaaatgt ggcaaaatct tgataaatgg ttgcatctgt gtagtgcgta tatgcaggct 2760
gactgtattt ctcttttttg aagtatcttt gaaaattttc ataataaaat gtttttaaaa 2820
attaaaaaaa attttttttt cagtccctgc aatctaattt atacctaaac aaggcacttt 2880
tctagttagg tatagagcta ccacacatgt gagtgtggga gctgaacttg aagttgtttc 2940
tgccttccag ctgggaggta tgatacatgt atcctgcggc tgttgtgagt gttgcatcaa 3000
cctcctgaac aaacgcctag tgccttctcc tcccagggcg atctggtgat tgtttcctta 3060
tgcattctct gaggacacaa gccctggctt catggctcag tgttaacgtt atgaaggaga 3120
ggtgactata cccagcttcc ttacaaggtt ttggaaatgc tccatgagag gttatgtaag 3180
tcacatccat agtctcttaa accagcaggc cctttggtaa tgtggcacag ggtttgcttg 3240
actcacttaa tgttcaaatt gggtgtctta aaccatctgg aggaggaaga gctttgattt 3300
agtggctgac aaaaaaaaaa tccagatatg gaagctccaa tgtatgtgaa ataaaaccac 3360
tgttgaaact ggctgtcact ttaaaaacta tccatattga tgcgctgtga aaacatgaca 3420
agtcaccaca ggaaacaaaa tagaggggtc actaaagccc agagctcctg catggccccc 3480
ttagtcctga aacttaaaaa tggaagcatg tttcaggctg aggttcacac ctcagcaaac 3540
agctgactct aattcttaaa ggaccctgtc taataggact tcaaactggc tccaaatggt 3600
tgggacagtc ccatccaaca tcaaattaca acctcactaa tccagactga ttaggattgt 3660
gcatgggcct caactgaaga cttccttggc tttgacaatt ctttcctcct cacgaggact 3720
gcaggctgca gccttatcaa aatgtttcct tgaaatttta taacttttca cacacttgaa 3780
aagcataatt taccttaatg cagcaggtag aaatggggga aattaataaa cttgtagtct 3840
aatttgagtt tctatagtat tttggagaaa taagttcaca gttcattgaa gaattttata 3900
attcagattt cagtaatgaa aaatacaaaa ataatagcaa taataatgtg tgcacacatg 3960
cacaaacaat gagaataagg caaagatagt aaaatggtaa c 4001
<210> 9
<211> 4001
<212> DNA
<213> Homo sapiens
<400> 9
ggtctggttc aaagcaaaga tgacctaaaa gaaaggtctc ttttaagttc actaacaagg 60
caactgtttg ctgccatctt atttcttttg tccatggctg gaaatacctt cacctttgtg 120
ctcgttatgt aaggacgtgg atagtgctac ttaaaaaagg gaacacaaat ccctggacag 180
catccaaaga aaaggagaac agtccttccc aaatcagccc caatcagaca cacctaggga 240
actgccttat ggaggacaca gacaacctgg aactcttcca tgagtgtgac caagaggtga 300
gggaacccaa gagctggctt gtgaggaata gaagctagga ggtggggcag cgagagggct 360
ggggactcac agacacacct accagtgggt cccaattatg taaaggactg ggcagaaacc 420
agtgcattat ccacatcttt tctaaccatc aatgcagcat tcccagctaa tggcgtcatt 480
agatccatgg gatattaagg gaagttaatg tgaaaaaatg ggaaatgaag gctaaagaaa 540
attaaacacc agcctggcca acatggtgaa acatgtctct actaaaaata caaaaattag 600
ctgggcatgg tggcatgtgc ctctagtccc agctacttgg gaggctaagg ctagagaatc 660
gtttgaaccc aggagacaga ggttgcagta agccaagatt gtgccattgt actccaggct 720
ggatgacagt gagactctgt ctcaaaaaaa aaaaaaagaa aagaaaagaa aattaaacag 780
gttttaattt tacagaactt ctcagagtct ttaatatact aatgtgcgat gtgggtctcc 840
ttgatgggga acagagtatg ccaggtttcc caaacatatt tgaggcatac atcacaagag 900
aaagctttgt tgaaacatat tttgggaaaa tccagattca agatcctgat gccaggaacc 960
ctatctctcc agctctgacc cagatctcca gtgtccagga gaattcattt tacaaagcag 1020
ggaattctcc tgggcactgg agagctgggt cagaggtggg gaggtagggt tcctggtatc 1080
aggtgggtag tggggctggc caccccctac gaggctccag acaacactta caacccgtgg 1140
aattttctct ctcactttcc ccaggtacag atgcagctct atagcctctg acagctcccc 1200
tttcttgtgc ttccccagta atcagcatac aggatgctgc ctttgctgga atatctcctc 1260
actctttgct gcaatacaga caggcttggt ccatctgtgt tgcagcaaag actgactatg 1320
aggtctttag caagatgttt caactctcca gagacaacgc attagaggtc ttaagtcagc 1380
aaatgaccct acccagcccc aacccctttc tcctttcaca tgttctcctc attctgagtg 1440
gtgaggagac aaggcttcct ggtttgcctc ccatgggcca aggtggcaca ggcccactgc 1500
gtaccccttg tctcagtcca tttgtgctgc tacaacaaaa tgcctgagac tgggtaattt 1560
atgaggaaca gaaatatatt tctcaccatt ctggaggctg ggaagtccga gatcaaggca 1620
ctggcctttg gtctggtgag ggccttcttg ctgcatcctc acatggaaga agacagaagg 1680
gcaagctaac tgaatggctg catgacacct cttttattag ggccttaatc ccataaaggc 1740
cttataaaga ggagtcttaa tcccaaaaag gtccttataa agaggagcct tcatggctta 1800
atcacctcct aaaggcctca ccttttaata ccatcacact agcaacatct gaattttgga 1860
ggggacacat tcaaaccaca gcatccctcc ttccacatag tttacctccc tcagttactc 1920
taatatttcc tgtgcctgtg atgcaacctt aaccaaagga gacagatctc tctgtaaaac 1980
ccctccatcc tctcttctct caatacctcc tccttttctc tggcagaggc acctccagtc 2040
tctctggaag gcgtctcccc aaggttctgc acaatagcct cttctgtctt ggtaccctct 2100
agaccatact gtgttcccat ccttcccatc cctctattgg tcacaagcca gagcaggtaa 2160
caaatctgct tggtcatgcg gtgatttggt ttggctgtgt tcccacccaa atctcatctt 2220
gaattgtagc tcctagaatt cccacgtgtt gtgagaggga cccagtggga ggtaattgaa 2280
tcatggaggt aggtctttcc catgccgttc tcgtgatagt gaataagtct cgtggttcta 2340
taaaggggag ttcccctgca cacgctctct tgccttccac catgtaagac acgcctttcc 2400
tcctccttca ccttccgcca tgattgtgag gcctccccag gcaagtggaa ctgtgagtcc 2460
attaaacccc tttcctttac agattaccca gtcttgggta tgtcttcatt agcagtgtga 2520
gaacagactg gtacatgtgg gatcacagcc ttgccagaaa agcagataca cccgggtgct 2580
tgtctgcaaa ctgatctaaa tgagacaact cagccttcca caggtataag cccctgcatt 2640
ctaaagccat acaattaggg cctacccaca acaactggga attgccaact gacttcagat 2700
cagctggagg aacaaacctt caaaggtttg ggagggaaga gagaggggca ttatgaaaca 2760
cctatcaggt atgcctggcc ccaggcccaa gtcccacata ttctgtctcc atctaactca 2820
tccaagccaa atcctcagtc tgactggctc acaaatacct gctgatgagc tttgctattc 2880
tcttaaaggc attgagcaca agcccaggtt acctggtgct taaatacctt cctcaaagag 2940
tgtccattaa ttttgaaaca tttttctttc ttaaggtttt cttggtattt ggagaagatt 3000
gagttaccca ttgtaacaga catgtcagtc tgcaaactgg ggcagactgt gtatgggcca 3060
catcacagat taaataagaa ctttaggcca ggcgcggtgg ctcacacctg tgatcccagc 3120
actttgggag gccgatgcag gtggatccct tgaggtcagg agttcaagac cagcctggcc 3180
aacatggtaa aaccctgtct ctactaaaaa tacaaaaatt agccaggcat ggtggcaggt 3240
gcctgtaatc ccagctactt gggaggctga agcacgagaa tcgcttttaa cctgggaggc 3300
agaggttgca gtgagctgag attgtgccac tgcactccag cctgggcaac agagcaagat 3360
tccttctcaa aaaataaata aataaaatga ctactggaga gttttgtgca ttataatgat 3420
ttaagatatc aggtactccc ctactgagaa ttacacacaa tccctggtgg aagatggaca 3480
ttaggacagt gacctggcta atggtgttca tcagtgacag gtttctggcc ctgtggtcct 3540
caaggaccag aaggcctggg aacagcaaag ttgggggaga caagggcggg ggaagaaggg 3600
tagcaaaatg ccaactgtga tcctgagtta ctctaggatt ctggaaactg agtggttgat 3660
agactgaaca tcaagaatgg aagggcataa gggatgatga agggtatcat gagagttcat 3720
tctgtgtcaa ctaatcaagg acagaatcaa ggcaagtcac agtactaaat gatcctaaat 3780
ttcaatcatt catcccttct attcattaat tcagtcaata aatagttatt atatgcccac 3840
tgttccaagc actgaagata cagtcatgaa caagacaaga tccttgcctt taatggagct 3900
ttcattttac ctgaaaaagg gggcattttt caagaaaaat cttgaaaaga tgggtggaaa 3960
ggaggtctag tgtatcatag aagcttctgg gaggagaagg a 4001
<210> 10
<211> 4001
<212> DNA
<213> Homo sapiens
<400> 10
agaaggaata gaaaaagtgt ttattgaata tatttgccaa gcacataatc accaaaaaga 60
ctgatttgca agcccttgtt tggtattatc tacttgtctc aaactagggt cagcgtctgg 120
ctttagttac taggtaataa gtttgggact tttaatcaac agtgaaaaag gcaatgtcat 180
acattaggaa atatgttgta tttgagaaag caacattttg tctctagacc taaattttct 240
tatctgagaa atataggaag agaggtcatg ttttctagca tacaaatgtg cctcccatat 300
ttacatagtt ttgaacaaat agaaagaata agcagaaact cggaggtttc agaggctaag 360
cagatagggt gttccagccc tctactaata tttttctaat gtttcactgt aaatgttgag 420
cggcacttgg aaatttccca agtctcaaag ctgttgttct cttgctttct cctcccattg 480
gatggaaata tgatgtttga gcttaatgtc aagaacatgg gtcatgagaa aactaacagg 540
tggcttcaaa ttctttttac aagtcttaat tatggaaaca gatttattac aaagtagaat 600
acctaatgtt ttgggtctaa aacttcagga tattagacct ctcatccaga ggtcctattt 660
ccacaatact tgctaaactc tgagaatttt ccaaaatggt ttaaaacagc ccaagaagaa 720
ggtttgcatt acagaaaagg gaatttatgc agtagaatat aatggaaatt tacacagtca 780
tttgagcctg agtcaaagtc tttaatatca agtcaatctc agaaaaacta aagtcactcg 840
tggaatccct tttatttgct ttgttcattc cacaccttta tttaacctgg catgctcttt 900
tgatattgcc aaaattcaga actctattta aatggcattc tattcttttt ttttcctttt 960
ttttaagatg gagtctcact cttgtcaccc aggttggagt gcagtggcac aatctcagct 1020
cactgcaacc tctgcctccc agattgaagc gattcttctg cctcagcctc ctgagtagct 1080
ggtattatag gcgcccgcca ccatggggtt tcaccatgtt ggccaggctg gtctcaggtg 1140
atccacctgc atcggcctcc caaagtgctg ggattacagg agtgagccac tgcacccggc 1200
tggcattcta ttcttttcat ggatcctttc ttggcccctt ttaataaaaa atatgtatct 1260
ctgcttcact cctgttgcac tttatttaca cctcatcttc tggcatttgc cccaagttat 1320
aagaagccat cctatacatg ctcatgatgt tttcttagtc ataagcacaa gtaagacagg 1380
gaatattagc atgatgcttg agacatgcta aggtttcagt tgctgagtga gataggaaat 1440
ctgtgttgaa tctatttctc tgtatcacat cacctactta tggtcactga ggaaatgaag 1500
ctttactgaa actggctttc atatccataa gtaaaaatat atttttggtt atataatcta 1560
tacaatttat atttgcataa ctgagggctt taaaaaacac aatcattgca accactaata 1620
atttttttaa ttaaaggtat tttgtgtgta ctttataaaa atccaagcta tcgactattt 1680
tgctgtaaaa aatataggct tttatcatac aagtgacttc taaaaaataa tatttaatgc 1740
taaatatata aaagggagaa taagaaagag ataactgttt ttggaaaaag taaatagttt 1800
gcttttctcg gggtgagatt catattatca gtacacttga agataactaa cttaaaatat 1860
atagtttaaa agaaactcaa gtttgtactc accgattttc aaaacataac acgcattcat 1920
tgggataagt atttccatca gtcccacaga cagggtcata tatcttggtg catccattaa 1980
gttcattgta acatttggcc taaaaatgga attaaacaga atcatttccc attattctcc 2040
attcttgagt tcatagcaaa atctctgaaa tggttttatt tctctgggga ataactgtga 2100
ttgggagaat gatactgtgt gttgtaagag aaataaccta ctatctggag acagaaaacc 2160
ttgttttaag tcatgcctct gctatttcta agctgtgtga cattagtaag tcctatggcc 2220
cttctgaaac tcagtcagtt catctgcaaa acagtgatga tattaccatt tcccacctat 2280
ctcatagtgt tgttgtgaga tttacatgga tgggtgagat aatgttcaag tttgcaatta 2340
ctgcgggtgc tggcatctct caggttggcc acatcttggg gcacgttctt ccttgtacac 2400
tgtagtccac acacactggt cttctattag gcacttaaag gcactaggca ttgtcgggtg 2460
ttggaagtct ttgcattcta ctccagtaag agtttataat catcaaatga ttgtcattgt 2520
cctagacatt gttctagtgc tttttatata ttattctgtt taaacctcct aactatcctg 2580
tgaactaagt attatattct tcttcaggtg aagataatga ggcacaaagc aataaaatta 2640
gccaaggcca cccagctaag gaattcctaa attcagaatt caaacctagt cagagttcct 2700
gagctttagc tcttaagtac tttattctac tacatctccc caccagtttt gttttactaa 2760
cactcatttt ttttttttgg ctcaaaataa cccctcagtt acttcggagc accctccctg 2820
atttttctaa ctaggctaag ttcctaacgg cagcagaact ttttcataac acttatcaca 2880
aatgtaatat tgcattcatt catgaacttt tggtctatca gctgtctctg actagatata 2940
aatgtactga gaacagggac acttcttcct ctggtctcca aatttactga aattacctct 3000
taaacaaaaa aatatgaaaa ggttagagtt agaatatttt tctagaagga gaacctagtt 3060
tattattttc tgtcattcat tataaattag agccacacct tggtgattgg gagagctgca 3120
tgccaacatg gtcaaactgg ctgtaaggag gaagggctac ttggagtgaa gttgagaagg 3180
cttccacccc aaatacttgt ctacagctat cactagtcta tataaaaacc ccttcacagc 3240
aagcactgta ggactatatg gaacttgctt atacatcagt gaaattgtcc ttcaatatac 3300
gcctgtgggt tggaactact actaccattt gcatttctca cgttaaccct ccctagcatt 3360
catactcctc ttaacttcag gctaaactga aaggtgacag caaggctgca tttttgttgg 3420
atcaaactgt tccagtctga gaaataagaa attacaaata tctctttacc tctcttccca 3480
gggagtcagc tccagtgtta cctagaaata aatcagatat ggtaagttgg gtcctaaatg 3540
aaagaagtca gatctattct tcatcagaat tctctgcttt cattgcagac tgtgacttct 3600
ttactaggct ctttcattcc ccaccctttc tgatttctct aatcttccaa aagtatctga 3660
ctgattttcc tgtacccctt ccttggcaaa accttaaaaa gtttgagtaa aactttgatt 3720
atgaccaaag tatctcgatt tgatgtttta ggaagttgat ctatttcctt gaaaacagaa 3780
attatgcact tactgctgga aatgaatcag gcactgagca aattatcttt tgaatactca 3840
caaacattct caaaaccagt tactagatct ttaatctcac aattagtgaa gtataatttt 3900
tatctcagtt tacagatgaa agtatggaga gtttttataa ctagctgtga aacagttatc 3960
ataagcagta accaggtctt ctgatgcaaa gttagctttt t 4001
<210> 11
<211> 4001
<212> DNA
<213> Homo sapiens
<400> 11
acagacaagg caaatacatc aaaatcttaa catttatccc tggcactggg actgattttt 60
gcttttgtct ttatccctta gttatgttaa catataatag gtttgtatta ttaagtgaat 120
atttttaaaa tgtcttagtt ttaatttctc atatagcaaa catcaataga cataacccac 180
ataaacaaaa cctctttgga gttctatttg taaggatgtc tgagaccaaa aggtttaagg 240
aatgctggct tagagagttg aaatctaaac caaggccgtc aatcccagcc cctgtaaaat 300
accagacaaa aacaatccag agaaagtgaa aaaatcataa caaaatgata aaaacatatt 360
aaacattaaa caaccataaa gatggcacct ggtggtccct agcagcctct ccaggctcat 420
ggaaatgagt cacattcagg agccaagacc tccaacccaa ggtctggagg ccatttctga 480
acctcagact ccaaaagaag tttgggcaca ctggttttaa caataagttt ttatatttct 540
ttttcctgtc tgttgaaaac acattcaagt tgcctcaaga ggtgaaagga agccagaggt 600
agcctccaac catcctcaac gctgtctttc aggagaggcc gcgtccctct ctgtctccct 660
ctcagaggag tggcccaagc gcagcatgcc acagctgttc aacatggaca gcatgctttg 720
gctctttccc tgctaaaact gaagctaaga gttcacagac ctataaaata cttagacttt 780
gatagaccaa aacctgcaag tatgctcaag aggatataaa taaccaagtt tacaatcctt 840
attttttggt tcctgttatc tcccccatct acctctgaat tcagcaaact cttgacactc 900
actcataact gaggtccagg aaggaagaga aaagtcacca tcctatggta tttagggaca 960
gagcgggtgt cagcatctcc acttcccagg gctctgaccc atgatgtggc ccagggctca 1020
cgctgctgaa tgctaggtaa ggagcagtcc tgactaacag ctgttatttt attagttgaa 1080
aaacaagaag gccctgaaaa tcccccctct tcccggttac ttggttaagg gtccatgttc 1140
tttgttcatt catacttgga gtctcactct gttgcccagg ctggagtgca gtggtgtgat 1200
ctcggctcac tgcaacctcc acctcctggt tcaagcgatt ctcaggcttc agcctcacga 1260
gtagctggga ttacagatgc gtgccaccac acctggctac tttttttgta tttttagtag 1320
agatggggtt ttgccatatt ggctgggctg ggttggccag gctggtctcg aactcctgac 1380
ctcaagtgat ccgcccacct tggccgccca aagtgctggg attataggcg tgagctactg 1440
cgcgcagcct ggttcattta tacttgacat aaatccaaaa gaaggccacc cctgcacgat 1500
ctccaagagc ctctgtggaa agctctagtt taaccctagg aaggagaaca gctcacctct 1560
ggcatgatct cgatcttctc gtaccgacgt ttgaagggca gggcgagaac ctcagcccgc 1620
cgataggaca tggcgggggg ctggcagtcg atcatgaggt ccatcctgtg ggactgggct 1680
gggggcggct gagtgtactg ctcaggacac accacgtgca ggggctgagg aaccaatggg 1740
acagttgaga aaggtggtgt gaagtacagt ggcccctgtg gaaggcccat tcctgacccc 1800
gatccactcc ccgccattac aggctcacca cttccaggcc aaggtgaact ttcatcctct 1860
caagcccaat gaaccgtaga caggagaggg aacgtgaagg cactggggca gaagaaacat 1920
ctcggaaacc cctaaactgc ctcagcttaa tttatctttc ttactcttta cataagtttc 1980
acggtaactt ccatgtgtct gctcccagcg gtctctgggc ttcctcgggg cacggtcttt 2040
ctctaacatt cccacatgcc tgctattata agctcttggc tgattaccaa gagctgtggc 2100
ttttcgtgcc gggctggcgc tgtgggtttg gagtttccgc tggatctgca ggtcagctgg 2160
tattaggcag tgggcactcc tcagctcatg tgctcaccat gtgcaaccaa ggagcagagg 2220
cagagttttt tttttttgtt ttttttttta aacaaaatgt ctctttttgt tatcttgaga 2280
aaaaaagatt agtaaaccaa tttttacttt gcacaaatat cacccaaatg aaaaggaatg 2340
ctggcagccc acaggggcac tgtgttgtcc ctgcttctac ttaagagctt ggaccaagcc 2400
ctgaattccc tgctgaagac ggctttccac caggctgacc gggctgagct tccaaaagca 2460
cttgttcctt agggctaata cttttctcct ccaaactgcc ctttgctcct cctcttgacc 2520
tgtctcaaag catctcatgc cacaggtgca gtctgatggc ttctcccagg ttttcctcac 2580
ctaaagtgga gaccaggtgg cctccagttt ccaggattag cacctgacag aggggcgtca 2640
agagctggtt tagggactgg gatttgaagt taataatgaa aagcaacacg cctttcactg 2700
ctgcttctaa cgtggatgag catacattgg ggtagcgaat cactgtagac aggccccacg 2760
tttgctgtct gctctcatca tacctagaga gtgccagtat tctttcttca ttgggcaagc 2820
acattaataa tcaccagaat ccgctcttgg ggccctcctg gggttctccg gctttctctg 2880
tttcctgttc ccctgcctta gagttttgtg aaagggcctc tctcatcctg aggttgaagg 2940
tgctgtgctg ggcccctcca ccctcatctc actcctggga acaccaggga gtggaaaggg 3000
agagctcttc ctcgccggtg tcagctgtcc tccttgccaa ttccagttcc tctcccgact 3060
ctcaagagcc agaaacccag cgagcagtgg ctgagccagc acagtggttc cgtggccctg 3120
ccttcctgct acgcgcgcct cctggtctac actgcacccc agctacttct tccctttctc 3180
cagggcagaa aatgtgaagt gaaacaaatg accaaaaact gccacaaaac ccaaacccga 3240
cagcactgtt tgtggtacca gcctacacgt gcgacacggc cccagtaacg cgtttcattt 3300
aaactgtctc agcttagtga gtatgtttct agaatgactt tttttttttt tggcaggggt 3360
gagagggttt gcttttaagc ggacacagat ggttttcttt cccttccaca tctgatttct 3420
ttagaatgga aactgcggtc agggtgtaaa gaccactcct tttcagtgtg gaggggggct 3480
ggcatgtgtc ccagcattct gcagatgcgt gagaactgaa gacaagaatc aaacacagct 3540
agatctctct cccctcaccc cacacaagca tgagccacac caaggtgaca tgactacgtg 3600
aaggcacaag cagtgtcctt cattacctgc acttctttaa gcagctttga cacctggatg 3660
aagttcagcc gggtgtcgat ggggcagtag atgcatttca tggccagcgg ctggtaagga 3720
gccagggctt ccaggtagga gaagtctggt tctagggaaa gtacaagaaa gaaatcttac 3780
aatgcagact acacaggagc tgccaagcca gacagcaact ggggcgctga gggcccacag 3840
acagtcaggg ggtcaggagg gagccaggaa gcacacgtgg gggacggagg tcaaaggggc 3900
cccgtggaag gaggggcagg ctgggatgga ccctgtgggt gggctccaag cttcagcaga 3960
ggcagatgag gtaaatgagg gagaaataca tatgtggggg a 4001
<210> 12
<211> 4001
<212> DNA
<213> Homo sapiens
<400> 12
aaatgtactc cttcttataa gtctccactc tcaaatgact tgtttctgtt ccttttttct 60
cttttaaatg tgtatctggt agaatgagca tttagaagca taatacatgt atacactttg 120
tgtttaattt tctatggcat aagtaagcag tttttatgaa ttgctggtat tgcttatgag 180
caattattac aaacattgag agaagggaac ccctgtgaac ctttaacatt tctcaggagt 240
tagtggtaaa cccatgaaca tgtattttta aaccaaatta cccacctctt ggagttcaat 300
ctctgttaat tctttattaa agtagtgaag tatcagttgt tccaatgata taatgatcaa 360
gcaaccctgg aaattaaatc ccaaagcagt gcacctttag tttgttcagt gatagtagga 420
catcccacga gccatcatat attttcaagt ttttatactc aatctacttt ttcagcaatc 480
ttttgggaac tatcccaaga taatttactg cataagtgca tctatcttct aaaagacatt 540
tggaatattt cttagtctga cctctgcacc ctgagacact ctataaagga aacaatcaga 600
aaaatttaac aaagaaataa ataggttaag aagaaagcaa tctaggcgtt tgcactgagt 660
ttgcaacagt gccattgcta catttaagag ctgtagttct tcctcacatt ttaactgaga 720
gccactagtt atgttactta aaatctcccc attaaataag acagttgctg aacaactaaa 780
aagtacaaaa tatcacaaaa tcaaaaagta gcaagtcata aggggatttc cgcatcctag 840
catgtgtgtg tgtgtgtgtg tgtgtgtgtg tgaaagaaaa cgttacagtt aaccgttaca 900
attgctctca ctccactcca ccacctcatc ctgtgtagtc tgcctgcgga acccgcggga 960
atctctcctc agtgtagtaa gagcaaaggc cagcatcctg tgcaaaggtg ctctgcagcg 1020
tcgtgatcca gggattttag catctgtcgt cgcttgcaca tcctctctta cccctctgct 1080
atctggtgga gttgggcgag ggtctccagg gcttccagag agtgtcgttt acgcatgtga 1140
cttgccatgc gctcaaacta aagcgccgcc ggggacttac tgaagcccac ctcggccctc 1200
ctccactttg tcctcagtct tcaggttttc ctttctgccg ctagggccta agttgtgggt 1260
tcaccataac tcctcagcag acattggagt gaacgcatcg actgccgtca cccaagtgct 1320
aatcactgcc ttctcccact cagcgctgga gtgggagatt catccatcgg aagattcgta 1380
gccaccaggt ccagtcaagg atttcatatg cactttccct cagaaaaccc tgaaaagcaa 1440
acgacccctg gaatgtcaca cactcctaaa tatccctgga aatccgcttc tctgtgtttc 1500
gcttcatggt gagtgtcgag ggccagataa gacaaagaaa aaaatgtatg gaaggttatt 1560
cccggtcggc tcctccttcc tgtgagtctc agacaggctt gcaggcttac aggctttccg 1620
ccgctccccg ttggcagcct tcatcgaatt aggtgggtgg gggtgggaaa ttgggtaaga 1680
aaataaagtc gttgtgggcg gctggggaac ctggcgtcag tcccccgtgg ctgtgcgcag 1740
gtaccctgca acgtcgcggt ggccccgctc ctcggccaag tccacgggca gacgacccca 1800
ggcatcgcgc acgtccagcc gcgccccggc ccggtgcagc accaccagcg tgtccaggaa 1860
gccctcccgg gcagcatcat gcaccggtcg ggtgagagtg gcagggtctg cgcagttggg 1920
ctccgcgccg tggagcagca gcagctccgc cacgcgggcg ctgcccatca tcatgacctg 1980
ccagagagag cagagtggtc agagccaggg tgggggcagg tatgggagat gccggccggg 2040
gcaaggcagg tggagccatt taaagaaaca cctaattgca aagttttcac ccagtgcaga 2100
ggtgttcagg tctctgatgt ctggtgtttc ttcatttgct gatgcaatcc actttcccac 2160
cccaccttca ggttatactc agtacaaatt aaatgccatt ttattctcta aacgtgcaga 2220
gacaagaaag ttgatggtaa agtgatgatc atcattatgg aaaaacaaat cttgatttcc 2280
attggaacat gggaatctat tttgttaaat gatttagggg cagagttaaa tttattcggc 2340
ttttaaagtt ttaaattatt tgccttgctg acccctcctc cataatccag gtctacaaat 2400
atttattagg tagtcaacta ctgtttgtta gaagttggga gtaatggttt aggggagaaa 2460
ataaacaact aagttttttt ctttcttttt tttttaattt atttagttct catagcaaat 2520
cccgtgcgga aggcttttgt ttgtcatgtg tctgagctca taactggctt gtagtgtgat 2580
aatttgagcc aaagttggag attagaaggg aaaagtaaat taaatcatgc aaaatcttat 2640
aaaattataa taaatgattt ttccttaaca gttcatcatt ttaaatttag actataatat 2700
ttttaatgta atataaaact aactatattt gtatactaca ggattttata atagctaaga 2760
tttaaaaatg ttaaacagta aatattttgc tatataaaaa gagcttgggc caggcccagt 2820
ggcttatgcc tgtaatccca gcactttggg aggccatagc aggcagatca cctgaggtca 2880
ggagttcgag accagcctgg ccaacatggt gaaatcccgt ctctactaca aaaaataaaa 2940
aattagctgg gcatggtggg gcgtgcctgt agtcccagca actcaggagg ctgaggcagg 3000
agaatcgctt gcatcctgga agaggttgca gtgagctgag attgtgccac tgcactccag 3060
cctgtccgac agagtgaggc ctcatctcaa aaaaaataaa ataaataaat aaataaatag 3120
agcttggatg ttagcagtca atcaggatga tgcataactt agaaaaaatg atgttttgca 3180
ggagtaatcc tgtagcagcc atttgtgtac atgaacatta tgaaaacatt tgctgtcata 3240
tttaaagtca attataagac aatttttggt aactaaatgt ctaaaatcga gttatttgtt 3300
atgggtttat gcactttaaa atataccatt taattgaaag gcagtatact tctcatgttt 3360
tgttggcttg tttcattaat attagaaatg ttcatttcat ctcttaaaaa ccgtgataag 3420
ttatatctac tagtgatgta aggatatttt acatagaaaa aaaggggaaa atagctgctg 3480
ttagtatttc agaaaatggt gaattacata tataacttgt ttttataatt attaataaaa 3540
ttttaagttt ttaagatgca ctttacaata tttttgctcc tttaaaaatc cctcatttgt 3600
tataccatta ttctaagaat tcaaaagaaa tgaacacttc tttaaaatct atactattaa 3660
caagatccct taatattagc ttagttttag agggtgatag tgaggtgagt actgaatatg 3720
accactagcc aacggtaaag tacaaaagag ttgtccgaga ttgtaaaaga aataaaaaat 3780
atcagtacta attaaagcag gattcgtact taaacattga ataagtgtat tttaacaatg 3840
aagataaaga tgcattattt atgaagatcc tttgccattc aaaaaggacc taacagttgc 3900
tgcaggaata tttttgtaat ctgggcactg agtatgataa ttaaagaatg agaaacctat 3960
agaactatat attttttctc ttatgcatca ctcataagac a 4001
<210> 13
<211> 4001
<212> DNA
<213> Homo sapiens
<220>
<221> misc_feature
<222> (2050)..(2050)
<223> n is a, c, g, or t
<400> 13
ttctcctttg catataagtc aatctctagg gacctttagg gaaacattta ctttttcagg 60
catctagaat attattagga atagtaaaat cacaatgtat taatataaat caataggcat 120
tcctctatta acagaaaccc catgagtact ctcaaattct cattacaaga cactggcaca 180
caggaggaaa tgtctatgct gagcaacttg aggacgttgt ttctttacct cggtatacta 240
aaacaactgt caattttcac tattcaatat ttcctgagtg ttactgattc ccagtccgtg 300
cacaaggcta tgagagaata cagattctct ctctaacatc taggaattca caattcagtg 360
gggagaaagg cacacaaaca aagtaggtaa aaataaaagt agtggaaaat gtcgctttga 420
gaaatagcat atactcaaat taatacttat ccattttcta aaagccttga gttgtttcag 480
caaacacaat acatggtgaa aaaatacagt actcaaagga agaactcttg gtttcacaat 540
tcttaacata ttgtaacgtg acttcaggca cgtcattctt aaccagggtt ggccttagtc 600
acctcgtcta aaaatgtcag tattagcaac ctaaacccga atcgttttac cggcataaac 660
taaattgtct ctagtggctc ttccgggtca aagatccaca atactcacat actcatccat 720
cagattaccc cagtaatcat ggcccaggga accactacgg gtaaactaga tgaattagag 780
ttcctaaact gagataggct ttaaacttgt gaactcagaa aatgtattac agtggccatg 840
atacagcctt ataaaatgaa caagaatagc tgaacatttt gtgtatcatt ctggtacaat 900
atgcgtgtga atttcccacc aggaatccga gtttctgcac tactggaacc acgcctccca 960
gagaaatcaa ggagacacca gaaaaacctc ctcaagggac agggaaaaat cacggacaag 1020
ctttcttccc ttctcacctc cccctaaaaa agcccagtgt ttttcttccc ctccagctat 1080
gcagctgcac ccagcagaga agtactagat tagcatcatc tgcatttcat tccttttctt 1140
ttgcaatagc tactcgccta taataaacag accttgtgct caagggagaa tttacttccc 1200
cgtccagtaa aaatgaagtt ccttattttc actaacaacg ccgtctccaa tcaacaccca 1260
acctccctgg ggggtgggtg agaccagaca cgcctctccc tgccacccgt gctgcaagcc 1320
tgacgaccca caccctagga attgtcccac tttccgagcc ccttctgctc gagtggcgag 1380
gcttcggggg acagcggaag aggccaggga tttggcgtgc acaccagggg ggcgtccagg 1440
caggcccgag agcgctaggc tgcctccgct gacagggggc tcgcacaggc ggaaggggag 1500
cggcgagacc tatgccaccc ccacttaccc actttgtcct tcccgtacat gtgcccggcc 1560
acctgatgcg agaggggcac gcagccgttg aggaagcgga gtctgccgcc cgccggctgc 1620
ggggtgccct caggggtgga ctcgatcgcc ggtgaggtcc gcatttctgg ggggcccggc 1680
gcctcgaccc ggagggggga tggtggctct gttgccataa cggagagcag aagcggtaac 1740
ggcagcgaga gtaggaaaaa aaatagggcg agggaggggg cgccggagaa cccgagggtc 1800
gctcaggctc gggcgcgagg aggcccgggg gttcccgcgg ctggtgccct ctgaagcgcg 1860
gggagggggc ccatgacgcc gccggggcgc gggcgctcct ctgcccaact ctcggcggaa 1920
cgcggctccc ggctcctgtt cctctgccgc cgcagccgcc ggccccggcg ctgcccggta 1980
gacagaaccg agccgaagaa cagcagcagc ggcgccccgc gctccctggg gccctgacgg 2040
cgacggcggn cccacctcct gcgcccccgc cccctcccgc cgtcctccag tcccctcagc 2100
tgccgcgcgc gcgtcaccgc cgcccgcacc gccgctgccg ccgctactga gcatgcccag 2160
aggccgtcca acgggattcc ggccccccaa gtcagcgggg cgggcgcgcc gtgctgccag 2220
gacctggcga cggcgacagg gatagggcga gcgcacccct gtttcttctc acccccaccc 2280
cgcgggcatt tcgaagtcac gcgtgctgct gtggcctcat tcattcaaca cgcttctcag 2340
cgcgtcagct cagagaagcc agggacttct aagtgtgcgg agcagtccgg caagaattgg 2400
aaggcactag cattgccaag ggtggctcta aaggccaccc ctaactgcac tcaactccct 2460
ccaactccca ccccgcacga aaacaacaga agtaaaagca ggcagcttat tagagatgcg 2520
ccttcaaaat acccttcgag ttaatctcag cccccctccc cggattgctg tctctaaacc 2580
gctcttacca aaaagtaaaa cacataaaaa taaaataaaa ttggaaccga cacgaatttc 2640
agactgtaga agacagagaa aacttctaat ttttttaagt gaacatgttt tttaaagaaa 2700
ggagcaagtt ttttctctcc cactcgacct ttaagatagt gcaggttcca aggtcctgag 2760
tagataatta tatcaaaagc agaaagggat ggacaccggt agctacgcga ggttctggaa 2820
tgggactcag tcattgattc ctgtatgtcc gccgctggca tgttatataa aagctggacc 2880
tggccttgaa ggtaaatcgt cacaagcccc gagcaagaaa aactggggtc aaatcttagg 2940
tatctgatta aagtaagtgt cctattaaca cctgtttctg cccccagaag cccacacacc 3000
tggaagagac gtaaaatctg gcctggattc cctaatccca gggccctgcc cctcggcttc 3060
agggccccgc ccctcggctt cagggccccg cccctcggcg ttgcgcaaac tctgttgcta 3120
cctaagtctt ttcctcctcc cacctcccgt agctgtcaaa atccggagtc cagacttcct 3180
tatggccagg acagctccac gcatgctcag cacctaccag ggtcgagctc acacacgtgc 3240
ctacgacccg ccgcggcgcc tgcgcggtag catcgcggag tcggtgcttt agtacgccgc 3300
tggcaccttt actctcgccg gccgcgcgaa cccgtttgag ctcggtatcc tagtgcacac 3360
gccttgcaag cgacggcgcc atgagtctga cttccagttc cagcgtacga ggtgaagtgg 3420
ggcctgggag cgggtgaggc tggtgaggct cagcggttgc cgcgcccgca gttcgactgg 3480
gttgttttac cactgcccta cgcgcccgcc ggtcctataa ccttgagatg cagaagggca 3540
agcgctcggc ccggctgcgg gctccgggcc ggacagcgaa agggggcgag gtcagggatg 3600
gattgatggt ttggggccat gccggaccag agtccacgtt taccccagct agccaaaccc 3660
taacacctaa tgaccgcgcc gctagctttt cttgttcacc cctaagacgc tttcatgtcc 3720
ttaagtcaca tagcctggaa aagaaggaaa ggtgggagaa aacaacagca aacctgtttg 3780
cccttctcat gtttcagtca gcgtcgggcc acaacgtcga tttcctgcaa tcctgaagca 3840
ttgcgtgact aatcacatct tagttttgca gatttttttt taagggatga aacgcaatcc 3900
gaggttttca gttttcccgg ttttttatta ttcctctggc atctgacact cggtggcagt 3960
agtccaggat aaaggattga aaactgcgtt cgttgaacct t 4001
<210> 14
<211> 4001
<212> DNA
<213> Homo sapiens
<400> 14
gtccggacag ggcaggtgga gattgtggag cactttctat ccctgggcct ggaaatcaat 60
gccagagaca gggtgagtgc tagcctgtcc gctgctcacc cgccatgggt gtgtgggcag 120
cctgcgggcc cctacaggtg gtgtccttcc tcctggggcc ttcaggactt ggctcctgag 180
cagggtctcc agcccacctc ccatcaccct tgcctcctcc cactgcccac aagggccata 240
gtgagtgcca cactgactct aggaagggga tactgccctg catgacgctg tgaggctcaa 300
ccgctacaaa atcatcaaac tgctgctcct gcatggggct gacatgatga ccaagaacct 360
ggtaagctca ttccctcctt gattcagtca ctggcgtgag cactcataca gtggaggtgc 420
tgtcctggtc cctggagcag cccccgccta gggacatgta tcattggctc tggccagcac 480
catagtacat aaacttgtcc tcttccagag caccattcca actgtccgcc ccattctcta 540
tgtgtgtcta tgtccagagt gcaagtgcag agagacagca ggaggggcag caagaggact 600
tcccaaagac ccccaagtag gccagagggg gtcctcaagt actggcccaa ggaggccaga 660
atcatgaaag gccagagctg gaagggaccg agcactcagc tggcatgaca gttcctagca 720
ccccatgtca gctcctgtgg tctttgagtt ttcttaggat ctaggcagga agtccttgga 780
aggggaaggt gcccagggaa caccttggac tagggaacac ttagtccaag ttcttctttg 840
gggcccagag cattcacaaa aggaggccat gggagagtcc ctcgggctca agaaaggcta 900
aagggataat gagaaaagca ggaaaatatg gagaaggtat aaatgagggg ttttcaccct 960
cagcattatt gacatttagg gctgggtgat tgtttgttgt aggggaccat cctatgcctt 1020
gtaggatatt taagcaacat tcttggcctc tgcccactag atgccagtag tacccatccc 1080
ctagtgtgac aaccaaaagt atcttcacac aatgccagtt gcccccattt gagaagcact 1140
ggtggtgtca atgagggtac cttctgtgaa agccttcagg acagttaaca gattttctct 1200
tgccaacccc cacgccccgt cccgccccct ccaggcagga aagaccccga cggacctggt 1260
gcagctctgg caggctgata cccggcacgc cctggagcat cctgagccgg gggctgagca 1320
taacgggctg gaggggccta atgatagtgg gcgagagacc cctcagcctg tgccagccca 1380
gtgaatgcgt gccccagccc agccagctac ccagcccctc tctgtgtgca gccggagggt 1440
cctaagaatg gctcccggag ctaactgagg gcccagcctt ttttctgcat gatccaggag 1500
cacataccac aaactaccac aataaaaaag ctgtttttgc taattgcgat gttcatttcc 1560
acttgtgtct gaaatctttg ggaatgaagg aagactgggg gcaaatgctg gtttggggag 1620
gagggaaact gggagccggt tggccctctg agccagccct gcagcttctt tactgtggtc 1680
tcttttcctt tcctccctgc cactcaactt ttcctgcccc ttctcacatc tccagggcca 1740
ggcttggccc ctgcatggat tgacttggta atacactcat cctggccccc ttcctggtgc 1800
ctgctgtcct gtggggctgt gagtggctag cacaggcttg tcacgtggct gccttggcag 1860
gaggagcctg gtaagggagc ggggcagggg atggacagcc tgtcacaggc agcttcttgc 1920
ccagccttgc cacacctccc acagggcagg acctgctcag ccaatcagct ccaccaacct 1980
cagaactgcc cgtggaaacc tggctgggct tctggcttca gagctgggaa ttgtgggagg 2040
agggcccaga tcccactgga gataccgaag gagatactct ttagtattgg gatcccagaa 2100
accagtccta tatctggcca gagggacact gggttgtggc atctctctag ctgctaccca 2160
gaaggaacag ggccccctgg ggcctatagg ccttgcccct gaccctggga acacccagct 2220
caggcctgcc ccagtggcca caagtcaggg aggccgcaaa tttttaacta gaaacattga 2280
tcattaatag ggggttagaa agagttcaaa ctaagtctca ctctgggaca tagaccaatt 2340
gtgcttcagg cctcctgcag agatgctggg tccccaagtc tggtcttctg tgaggcaggg 2400
gctaagcagg agcttgtcca ggaatgtggg ggtctgggcc tcaggggagg ggaagaaggt 2460
ggacattgcg ggtatctacc cccctgtgac cacccccttc actgccactg cagaggtgga 2520
ctatgggaaa ctggaggaga atctgcacaa actgggcacc ttccccttcc gaggtaagtg 2580
gggctgtcct ctgtgggacc tggggagatg tgagtggccc tttagccaga gaccaaggga 2640
gggtacacag gctctgtcaa gctgtccagg gatagtctgg tactcagtgg gtcattttat 2700
tatgggagag gaacagggga gactggtctg gagtcaggag acaggcagga agctccacgc 2760
tggcctgtaa ccttaagcta aacctcacct gcttcagttt acccagctgt aaactgttaa 2820
agtgcctttg tagctcccag agtgattctg aggttggact ggataatggg cgtgaaagtc 2880
cttgtcaacc ttgtcacttg gtgcagatgt gaaggcatgt ggctggggtc acgtggcagc 2940
ccagccacag tcccagttcc ctccaaagtt tgtttgtaca ttctcttctg gacaaagcct 3000
ccttgccaga attagttctc ctgtgacccc ttggctcctg tgactcctgt tgtggctcct 3060
gtgaccccat ctctctttac acatgggtct ccccaactct gaggttaggg accagatgat 3120
atttgcaaca atcaacatgt aacagagtcc caggcacaat aaatgaaggt agaagaaaat 3180
gtcaaattct gcctgccctg ctccgcccat ccccattcct tactgcacaa tcccctgagg 3240
cagtgctctc tgtaggacca tcccacagat gggaatctgg gcctcagagt gacatggtca 3300
ctgcccaagg tcattcggct aacacatagc tgagctgagg tttgcacctc gtctgactgc 3360
atggcccatg ctccatttta ctccctcgct gactctcatt ataaaccctt tggagctgct 3420
aggccacact atttcaacgg agtgtctggt cagcactagc tttgttgtgg ctgctattga 3480
ggcagtatcg tgtagactgt ggactctggt ttcaaatcac agctctggag tgagcgcctg 3540
taatcccaac actttgggag gccgaggcgg agggatcatg aggtcaggag ttcaagacca 3600
gcctggccaa catagtgaaa ccccgtctct actaaaaatg caaaaattag ctgggcgtgg 3660
tggtgtgtgc ctgtaatccc agctactcgg gagactgagg caggagaatc gcttgaaccc 3720
gggaggcaga gattgcagtg agctgagatc ataccattgc actctagccc gggcgacaga 3780
gcaagactcc atctcaaaaa aaaaaaaaaa aatcaaatca catctctggc actgacttag 3840
tgtgggatga ttgctgtccc tatcggacac atgggagtga taatgctcat attatgggat 3900
ggttggacaa aaaaatctga gagaatgcaa gtgaaatcgt gctataaatt gtggagtaat 3960
atacattgga aggtactgat tccaatggta ataatgatta a 4001
<210> 15
<211> 4001
<212> DNA
<213> Homo sapiens
<400> 15
gatttgcttt ctgcagagga gccagagagg gtatccatgt tgaacttgta tatccatagg 60
aggaaaccag tcccatggcc cctgaagggt cctggcagtg ggaaggggag aggtctgcat 120
gtgcatgtac tggacagcat attccatgtg ggtcatggag tgatgactgt gggcctcttt 180
cgggcattgt agcaggcccc catttcattt catgatgacc tcggggtgct gggacaggaa 240
tgtgattgcg ggtccacgtg cttctgcaac tcctcagact gacttggttt ggaaggtctg 300
tgttcaggac cctctgtgag atgcttgtct ccattaggca tttgagctta tatcggccca 360
gggtttctac ctatcgtagg ttgtgcatac tttgcctctg aaccactcag gcacctcgat 420
gcagtcaacg tgcccgatta taagagtgtg acttcccata tcctggtgat ctctactcgg 480
aaacatggtc caggaaggtc tctgtgctta aataccttag tagctcgtga tgttaggtta 540
ttttctaatt cctcctaatt cttcccagga gtcttctttg tgaaatagtc atcttgttgt 600
aggacagggc tggtatccac aggcagtgat gcataggttt tgcactgcgc ggccatgttc 660
aaaggcaagt gccttaaaat gtatggtgtt ctcagccaca gtgccttaga agccatgtgg 720
gattcagtgt gtctggtggt cctctgagct gcacctgtga ggaagacgct gaggtagagg 780
ttaagttgtt attgaacgtt tttctcaatc tgaaaagtgt tcaagcagcc cagggtgaaa 840
agtcttcaag actgaaggaa aattgccttt tgtactggaa gagtttttcc ccaaaactct 900
cctcatttaa atagccctgt ggcttgtgtc ttctctgatg accaagagga gggcttttga 960
gtcagtgatg gcaaatgagt ggttttgaga gagttcttac acttagggct ctgtagcttg 1020
atggtgcctt caaaattaac acgtgcctgc tatcttcatt cctgggtaga cccatacacc 1080
catcccaaaa ggggagtagc ctgtgggaag cacatatggg atataccaca gtgagtcgag 1140
tggaatggcc cttttgacat gtgtacctgc ctcccgttct taagcacacc cgtttttcct 1200
ctgcacttcc caggtgttgt ggccatggaa gtaaggccag tggctacgag ggacagtctt 1260
catcttggga gatccctaga gagggcaggg agccccttga catgtggaat gtgatctctg 1320
ttgggtgctg gtggatccca caggttggta aatgtccatt ttccaaaagc ccctgatgtt 1380
ccttctttgc agggataaag actgtgcatg gatcgatgat gacctcaata catgcattcc 1440
ttggaaagct gaacaaaatg agtgaaaact ctataccgtc gtcctcgtca aactgaggtc 1500
cagcacgtgg ctccaactgg agctggagag agagagacaa cttccattgg ttgatgtggg 1560
ttgcactaag ccatccatcc aggcttagga tggggtccac tttggatcaa agggagtcac 1620
agggcagtag gtgctcctgt gcagaggaga ttccttgtgt gaaggacttc tctttgtgag 1680
gctgggtagc acgcacaatg tctgcaggac tgggtgaatg tgcaggtggc acagaagagt 1740
tagccggtaa cccgcctctg tgtggccagg tttgtccttg agtgctggct ctgggttccc 1800
ttgttcccca gaggtagata gctccttgtg gttgggaaga gagcccaagg ccacatcttt 1860
gttgttcttt ggctcatttt ttttcatggc tgaggacttt ttagtgccca tggtgttttc 1920
gtagcagcag tggaagtcag ttgtgaggag gttggtgatg cttattttct ttctggagag 1980
taccctgtag atcggctcca tttatgattt gtgtgtctcc agggagaaac tgaaatccca 2040
tggctggtga tggatactcc aggtgggaat gagaaagcgc catacatgtg ttgacttggt 2100
accctaagat ggggatggtc cgccacattt gcgcgtctgt tttgcatggt gtgtactgca 2160
catcacaaat gccatgttgg ccttagggtg ctgggaatgg aatatgatct gatgtctaca 2220
tgggagggcc tttcagcaga ttgcttttgt ttcctgacag tctttgctcc atccccagtc 2280
acacgtgttt ttctccaggg aacattttat tggagtatat caggtcgata tcagaaaaaa 2340
aaatgttgtg gttatcccaa cattaagtgt ggtgatgtaa cagcagtgtg gatggtagct 2400
caagtgaaca gtggcaaaac caaaatatat tggtttacgt aagtgaccct attctctcct 2460
atgtttgaag tctggtatgc atctctttct gcaataatct cagtgatctg gaaactttca 2520
ggaggtcttt tttagatatt ttatgtcctt gtggttgtac tcatactccc tgcagtgcta 2580
agtctgagtc attctgctat cctagtgcta ctggaatttc tattgttttg ggggcactgt 2640
ctggtcattg tggatacctc tcttggggcc ctttaataag aagtatgtta gaggtcggtt 2700
ttggggcagt gttgacctcc tgcctaaggt caaaggcagt tttcttccta aatcttattg 2760
caggattcac tgcctggggg ttgtgtgaca gggagagcac aaaattaggg tatcccatgt 2820
ggacttgtgt ccagtatcaa ggcagtgttg gataattttc tatatccaca aggctcccac 2880
tggactcttt atgtccagca gacattcaca ggccaacttt tctcttacac agggggcctc 2940
attcctgcca tgtgtagctt ccatggaacc atccttcagg ttttaagtgc aaggatttct 3000
ctcaagagtt taatttgttt ctttaagaca ccgttttctt ttctataaat tctgctttct 3060
gtggtttgta gcacagatga aattgccagg tcatattcta catttctgcc caatcttgga 3120
ttagtgagcc tttgaaatat gttccttgat cactggcaat ggttcttgga ataccagaac 3180
ttagttacaa gcattgcaga cctacagagg tttgttgtta tttttaactg catgaagcag 3240
tgggtttgtt tgtatcttgt cactcacggc tgtcagcaaa acttaggtgt tgctaaacca 3300
gaccacaccc atgaattata gtgcctctgg ttccttctgt agtagcctgt gggaagtaga 3360
tacaggatgc accacaaggg aatgaatggc atggcccttg agacctgtgt acctgcttgc 3420
ctttctcaag cacacctgtt ttgcctctaa ttcctaggtg gtctggcatg gaagaaggcc 3480
agaggctgcg atggatcatc tttgtcttgg gagattcctg gagatggtgg ggagcccctg 3540
ggcacgtgca gcatatgtgg cgtgggcttc ttcaggtgct ggtggatccc tcaggatggt 3600
aaatgtccat tttccaaaat cttctgatat tccatctttg taggtacaaa gactgtgcat 3660
ggatcgatga tgacttttat acatgcattc cttggaaagc tgaacaaaat gagtgaaaac 3720
tctataccgt catcttcgtt gaactgaggt ccagcacatt gctccaacag tgggtagata 3780
gagagggaaa acttccattt gttgacgcag gttccattga ggggtttgtc caggcttagg 3840
atggggtccc ctttgggcaa agggagtcac aggggagtgc ggcctcctgt gcacaggaga 3900
ttccttgtgt gaaggattta cctctgtgag agcagggtag cacacataac tatgcaggag 3960
caggtgaatg tgcaggtggc acaggaagag ttagatggca g 4001
<210> 16
<211> 4001
<212> DNA
<213> Homo sapiens
<400> 16
caacatgatg aaaccccgtc tctactaaaa atacaaaatt tagctgggca tggtggtggg 60
cagctgtaat cccagctact taggaggctg gggcaggaga attgcttgaa cccgggaggc 120
ggaggttgca gtgagccgag atcacaccac tgcactccag cttgggcaat agagtgagac 180
tccatcttaa aaaatccaaa caacaacaaa aaacaacgca gctgaaaaat gttcctttcc 240
agtccctaaa ctgggcagac agaactggct tcatcagtat cagctgctct gaaatgaact 300
aaaatatctt ataacaccaa ttactgaaga atttgtaaca aggtcaatag aggcacattc 360
atttgaatat aatttttaag actgctacag aaggccatac ttacattgta agcagtactt 420
aatttgtgtt tctctggcgc aagtttttat ctttgtgatt gttggtagag ttggttttcc 480
tttggaggaa aaattgcatt acatgagtta aactcagtca ataacttttc aaagtataac 540
acatggaaat acagggactc tttcacttac tgccatccct ggtgtagtga gaaattttgt 600
ggcacaatat gagtgcataa aaaaacattc cttaagtagc tacaggttaa gtatccctca 660
tctgaaatgc ttgggatcag aaatgtttca gattttgaat ttttttagat tttggaatat 720
ctgcattata cttaccagtt cagcatccct aatctgaaaa tccaaaattc aaaatgctcc 780
aatgagcatt tcctttgagc ctcatgtgag cacacaaaaa aaacttcaga ttttggagca 840
tttcagatta tggattttgg tagtaggtag actcaacctg tatatacctc attttatcct 900
ccttttgacc cccaaggcag acgagtactt tcccttcgaa tccggaacta taattacgat 960
acaggtgaga atatgtaaaa cttttctcca ctgaacacaa caccactgtc ccttcttgat 1020
aaaaggtaat cataagctgg cagtgctggg ccctctggta cttaacaaat ttgggtgctg 1080
tttgtctgct tctcactctt tagtccaact ctaggagatt ctgccgatct tagattgatt 1140
ccatacctgc agtaccctct ctcttggata cgagcaattg ttaatggggc tagacagttt 1200
atccaggcta tagttcagcc tcaaattgca ttacatgagt taaactcggt cagtaacaac 1260
ttttgaaagt ataacacatg gaaatacagg gaccctttta ctaccatccc tggtgtagtg 1320
agaaattttg tggcacaata tgagtgcata aaaaaacatt ccttaagtag ctacaggttg 1380
agtattcctc atctgaaatg cttgggatca gaaacgtttt aacattgaca aaaccgggag 1440
cctcagatat actggccatt tcccccacat tttaagtata tcaaatacac aaaatttgaa 1500
aaaaaaaaaa gttttttgca tggtattcca ctttgtatgg ccactggaaa acaagactgc 1560
ataacctatc ctgagaatca agtggttttt agtcagatct gtgtcaagct aagaaatcta 1620
cactgcaacc ttcatttcca ctcactgtct tcccctccaa tcttagctac tccatttgat 1680
taaactgcaa acacaacatc attcaatcag atactgttga atcaatgtcc aataaagtac 1740
ttacattggt ccagctggtt catcatctac atggtgttca gttttttaaa aatccatgag 1800
agataagaga agaaaggaga aaactgttag attctgcata aggattaaca gaatagcaaa 1860
ctgcaagtgg caatcatggc tactcacagc atttattaaa ggccattcca tctggaggtg 1920
caaaacactc atggagttat tatttctaga ttcaagttta aatggcaggt atgttgtttc 1980
aatggtttta gttaacactt tgaatgcttg tcatgaaaaa ctgcaactgt agcggacaga 2040
taggctgcaa cattgttata ctcatcttgt ccccaagttc cagtggttca ctcttaaata 2100
gatgagaaga caaaccaaca tttcagttgt aaaatcctgc tgtttgtttt cttgtggtga 2160
atgcatgcag ttttggctca atcctgcaat gtgaatggat gtgtaccatc tgtacattta 2220
gccacacatc ctcacttccc caaagggcac tctgtttttc ttgctgtttt gtaatcttta 2280
gaagagtctt tagtctggag aatgagattg agaagaggca atcacaatag ttcaacctaa 2340
attgggtcag ccatgctggg ctatgccagc aacaagtgta tgtataccag aggatgttct 2400
gtgacacacc gaggaaagct gaacagtggt gacatccctc tagaaaagac ttcagaaaga 2460
caggctcact cctgtgccct ggccaaaacc agaagccaaa cccaggggag ccatgagata 2520
acttttttaa aatcaaaaag ctgacattat atgagtttaa gccttcttaa attgatactt 2580
tatgattctg atttataagt ctttcgtgtt tcaactaagt gcaacttgat cccctttaag 2640
cagagataca cattcacaga gagagaatgt tttaaaaaga cccacaaggg gaagggaaca 2700
agtaagtgct cttagtttgt ttttattata caaaagtcag gtaaactaaa tgattcagaa 2760
cagtttgatt tcactgaagg cttctagaaa aattctcagc aggattccac aagggggctt 2820
ggtgcacatg aagcagttat ggagattaca gtctcagtgt aggcatctgc aggagcacat 2880
gtcgggagta atgtctgtca gagggagaca ttgctccacc tgttccttcg aggcaacgga 2940
gaggcaaagc caccgccagc acacacagtg actggagcaa aaaaggattt caaaaaaaag 3000
aactaaccac aaaccaatga ctcatcagcc aaaacaatag gctagcgaca cactagtgtc 3060
cacaggagtc cactactaac aacagtatac actatcaaac gggtacagag aacaaactaa 3120
gacttaagga cacccaagac atctgtgtgc tgtgtatgca gtggagagat acaaagacat 3180
accaggtact aagacacaaa agcccagcct aaattctccc ctcacaacat cctaatgctc 3240
agcatcccgg ccaagccagc tcagctacat gctaggaact aagcactttg gggatttaaa 3300
tgaaaagaca taggagcaaa cttggaatac ttttagtgct ctggctgtgc actcgaaaat 3360
acaaaattct aatgcaagag agccgaatca agaatacacc aatctactcc caaatttaaa 3420
ataaaaagca ggggagggag aatcgcaata gccaggctaa tcttttcaaa gcatgacact 3480
gtaggcacca acccaagcac ctcacaccac gggctgtttt tccttttgga gtgtgggtat 3540
tccctggtgt ctgattagta atgtcacttt ccagcccaac aagctggtaa catgcaaagg 3600
gttgcatgga ggtgttttag cgacagtgcc tccggcatag ctgccatgcc tgcaagtggg 3660
agcagttgag attcacaagg gggcaaaatc ccaggagccc tcttcatctg tctccccaaa 3720
tccactttca cactgacaaa caaaagagcc tcatgtctcc ccactctcct ggactgaatc 3780
actatctgca gcaccactct cccctacagg tcagacaaaa aaacgacttg acccctaaca 3840
gggaagtcag gggagaggga aagaacagaa ttacagaaaa aagatacaga aaccaaagta 3900
aatgaggatt aaacagcaaa ctgaaatcca aattagttgt gaaagtgaat gaagtaaact 3960
tgaagaaaag tgttaacatt atgaactgtg catcgtatag a 4001
<210> 17
<211> 4001
<212> DNA
<213> Homo sapiens
<400> 17
tttgtaactt tggaatttca tatctctact tatttccacc ttcttcccaa ttaggctatg 60
tgtgggagaa aaccaattgt aatgatttcc tctgtgagca ggaacttgtg atgaaactgt 120
ggttccctcc cccaaccagt taggcaacct attaacaaaa aagagaagtt tggatatgtc 180
acaccacagc ctgcttctgt gaatgatttc ttttaagaat cctcatattc taggaattgt 240
tttttaaaca cagtatccaa tgaaaaggtt gaagatttat ttcctccatg gagggaatag 300
aaagattatt cttattattt taaaaaacca tattgaggct gggcatggtg gctcatgcct 360
gtaatcccag cattatggga ggccgaggcg ggcggatcac gaggtcagga gatcgagacc 420
atcctggcta acacggtgaa acgctgtctc tactaaaaat acaaaaaatt agccgggtgt 480
ggtggcagat gcctgtagtc ccagctactc gggaggctga ggcaggagaa tggcgtgaac 540
ccgggaggcg gagcttgcag tgagccgaga tcttgccact gcactccagc ctgggtgaca 600
gagcgagact ccgtctcaaa caaacaaaaa agccatattg aaagtaggaa gagaactaat 660
tcttgacatt taaacaaagc tgtctctcta tctagtaagt aagaagtaag taaatgtctt 720
gctctgttgc ccaggctgga gtgcagtggc acaatcacag ctccctatag cctgaaagtc 780
ctgggctcaa gtgatcctcc tgcctcagcc tcccaggtgg gtgggactgc aggtgcagtg 840
caggtgccat cacgcccagc tatttttttt tcttgtgtag aaatgggggt ctcattatgt 900
tgttcatgct ggtcttgaac tcctgagctc aagtgatcct ctcaccttgg cccccaaagt 960
gctgggatta caggcatgaa ccaccacgcc cagcccaaag ctttattata ttagatttca 1020
aatttctaac atatcaaagt agcgaagtag tacagtcaac cccctgccat cacatggctt 1080
cgacccaatc ttgtttcttt tctcctccca cctactccct cactcccata ttattttaga 1140
gcaaatccca gccattttat ctgcatcttt cagaatatat ctctaaaaga caggaattct 1200
tctttaaaat aatgctaatt aaaatagtct atgaaagtaa tatatactta ttagcaaaaa 1260
acaggttatt aaaagtagaa gcaaaaagtc acccatagtc cttttgcctg aaataactcc 1320
cattaacatt ttgatatatt tcttcacaga tgagaaaaaa ttaagaactt tccccccttc 1380
cctgccctat aatccaaatt gtgtgggcaa atctgagatt actagatgta ttggaaatcc 1440
tttctctttc atagataaaa aaacagaagt agacagaatt ttaaaaatag atatactatg 1500
ggatggctat tggagtcatg gcagtttttt ttttcttaaa tgaaaagtac ttttatagaa 1560
attgttcttt gaaaatattc agcaaataat catttattgc tttctatgtg cagatcaagt 1620
ttgtatgata attaaattat tgataactta tacctacgat tttaagaata ttttcaaccc 1680
tttggtcaga ttgtctccaa ttagcataat agcaaagaga cttttcaagg ttaatatgaa 1740
actttttctc agctattttt ttattgtcta tttctcaggg gaaaaaggaa acttatcctc 1800
ttaggcagtt ttcataatgt tatacattta aaaattaatc tgaaggagct taaaatataa 1860
atgcattagg tgaaaaatag cagttgattt aaataaaatg aaggaggaga agaaaaagaa 1920
ccttaccgac acaaaataga atgagattgg tttgcagatg gcgagcgatg ttgcttctgc 1980
tcgcagtccc catttctgtg tatgccgggg cgggttgggg tgtgtatttg aaaggaaagt 2040
tggctttctt ttttcagaag ggaagaagag ctgaagagag gaacctctta tgaaggaacc 2100
ataatcagga actttttgaa attgtgcaag gatggcttca ccatttaaca ttgagatttt 2160
ggcctggagc acatttagaa atgtgcatat cttttttttt tccatggatt ttacagttgt 2220
cataggtgtt agatgtttct gggtgagatt ttgatttaca cctagcacat atttccagga 2280
aggagtattg tttgacaaat aattcgtact tggatttttc agaacagttc agtcactgtc 2340
tagggtacct gctctgggtg ccactctggg ccagtaactg ctgggaacaa agaacagcag 2400
aaggagaagt ggctttgtgt agagatgaga aaagttagta aacaacacat tctaatgcag 2460
cgaataaata aattgtctgt tattaacttt ttgtactcca gttttgaatc caaagacacc 2520
tgccttcccc ccttttttga cagggtctct atcacccagg ctggaatgca gtggcaagat 2580
tgcaggtcat tgcagcctca gcttcctgag tagctgggac tacaggcatg caccaccacc 2640
cctggctaat tttttttttt ttttgtagat tctgtgttgt ctaggctagt cctgccctca 2700
agcgatcccc ctgccttggc ctcccaaagt gttaggatta caggtgtgag ccaccatgcc 2760
tggcccaaag accccctttt aatgtgttga gaaacaaata tgtgggccct ctgtcctctg 2820
tgtcctctcc cccagcactt agtggtgatt cagtaagggt cttctttgta aagcagcaga 2880
aacagactca ggctgactta aaaggaaatt tatgcaaaca tttgaaggta gatcctagag 2940
tcactgggag ggctagtgaa tcttgtgtgg ggtccacagc caggaagaat gcctcaaatc 3000
acaccataaa caggtctgaa gaggatgcta ctgccactgc caaacactag gtgccatcag 3060
ccctgttgtc agcatcgctc aaccactgcc ccagcccctt ggtctcactg ccattgctcc 3120
acccccagag aaggagaagc tcacatcccc actactctgg gtcactggct ctccattcca 3180
gggcatgggc aagtgtgcct ggcttaggct gcgtgccagt gcctcagctg cacggcaggc 3240
caggaaagcg gctcccaggc cttgtgtctc tcccagactc tgggcaggca attccccaga 3300
catgggaaga gcgtcatctg caccactgaa aattatgaaa aatctcccca ccatgatatg 3360
atgtctgtgc ccttggcatc tgctcatgat atattttctc attaacagaa agttattctc 3420
agtctgtttt gttttttttt tttttttttt ttttgagaca gagtctcgct ctgtcgccca 3480
ggctggagtg cagtggcacg atcttggctc gctgcaagct ccgcctcccg ggttcacacc 3540
attctcctgc ctcagcctcc caagtagctg ggactacagg tgcatgttgc catgcccagc 3600
taattttttt gtattttttt tagtagagac ggggtttcac tgtgttagtc aggatggtct 3660
tgatctcctg acctcgtgat ccgcccacct cggcctccca aagtgctggg attacaggca 3720
tgtgccacca cgcccggcct caatctggtt ttaacacttc aacagatgta tgtagttttt 3780
gtatcaccat agcacctaaa aagtaccccc aaagttaaat atactgtcat ctctgtgtgt 3840
gtgtctgtag tcagtgccag gtttgtatgt gaatttgagg accttcccta gtccccttgt 3900
cccccaggca tctcctgctc ataggtttgg aaatgactga tagaggcttc atctctagtc 3960
caagaaatgc tggacggtgt atccatccta ctgttccatc t 4001
<210> 18
<211> 4001
<212> DNA
<213> Homo sapiens
<400> 18
tgtaactttg gaatttcata tctctactta tttccacctt cttcccaatt aggctatgtg 60
tgggagaaaa ccaattgtaa tgatttcctc tgtgagcagg aacttgtgat gaaactgtgg 120
ttccctcccc caaccagtta ggcaacctat taacaaaaaa gagaagtttg gatatgtcac 180
accacagcct gcttctgtga atgatttctt ttaagaatcc tcatattcta ggaattgttt 240
tttaaacaca gtatccaatg aaaaggttga agatttattt cctccatgga gggaatagaa 300
agattattct tattatttta aaaaaccata ttgaggctgg gcatggtggc tcatgcctgt 360
aatcccagca ttatgggagg ccgaggcggg cggatcacga ggtcaggaga tcgagaccat 420
cctggctaac acggtgaaac gctgtctcta ctaaaaatac aaaaaattag ccgggtgtgg 480
tggcagatgc ctgtagtccc agctactcgg gaggctgagg caggagaatg gcgtgaaccc 540
gggaggcgga gcttgcagtg agccgagatc ttgccactgc actccagcct gggtgacaga 600
gcgagactcc gtctcaaaca aacaaaaaag ccatattgaa agtaggaaga gaactaattc 660
ttgacattta aacaaagctg tctctctatc tagtaagtaa gaagtaagta aatgtcttgc 720
tctgttgccc aggctggagt gcagtggcac aatcacagct ccctatagcc tgaaagtcct 780
gggctcaagt gatcctcctg cctcagcctc ccaggtgggt gggactgcag gtgcagtgca 840
ggtgccatca cgcccagcta tttttttttc ttgtgtagaa atgggggtct cattatgttg 900
ttcatgctgg tcttgaactc ctgagctcaa gtgatcctct caccttggcc cccaaagtgc 960
tgggattaca ggcatgaacc accacgccca gcccaaagct ttattatatt agatttcaaa 1020
tttctaacat atcaaagtag cgaagtagta cagtcaaccc cctgccatca catggcttcg 1080
acccaatctt gtttcttttc tcctcccacc tactccctca ctcccatatt attttagagc 1140
aaatcccagc cattttatct gcatctttca gaatatatct ctaaaagaca ggaattcttc 1200
tttaaaataa tgctaattaa aatagtctat gaaagtaata tatacttatt agcaaaaaac 1260
aggttattaa aagtagaagc aaaaagtcac ccatagtcct tttgcctgaa ataactccca 1320
ttaacatttt gatatatttc ttcacagatg agaaaaaatt aagaactttc cccccttccc 1380
tgccctataa tccaaattgt gtgggcaaat ctgagattac tagatgtatt ggaaatcctt 1440
tctctttcat agataaaaaa acagaagtag acagaatttt aaaaatagat atactatggg 1500
atggctattg gagtcatggc agtttttttt ttcttaaatg aaaagtactt ttatagaaat 1560
tgttctttga aaatattcag caaataatca tttattgctt tctatgtgca gatcaagttt 1620
gtatgataat taaattattg ataacttata cctacgattt taagaatatt ttcaaccctt 1680
tggtcagatt gtctccaatt agcataatag caaagagact tttcaaggtt aatatgaaac 1740
tttttctcag ctattttttt attgtctatt tctcagggga aaaaggaaac ttatcctctt 1800
aggcagtttt cataatgtta tacatttaaa aattaatctg aaggagctta aaatataaat 1860
gcattaggtg aaaaatagca gttgatttaa ataaaatgaa ggaggagaag aaaaagaacc 1920
ttaccgacac aaaatagaat gagattggtt tgcagatggc gagcgatgtt gcttctgctc 1980
gcagtcccca tttctgtgta tgccggggcg ggttggggtg tgtatttgaa aggaaagttg 2040
gctttctttt ttcagaaggg aagaagagct gaagagagga acctcttatg aaggaaccat 2100
aatcaggaac tttttgaaat tgtgcaagga tggcttcacc atttaacatt gagattttgg 2160
cctggagcac atttagaaat gtgcatatct tttttttttc catggatttt acagttgtca 2220
taggtgttag atgtttctgg gtgagatttt gatttacacc tagcacatat ttccaggaag 2280
gagtattgtt tgacaaataa ttcgtacttg gatttttcag aacagttcag tcactgtcta 2340
gggtacctgc tctgggtgcc actctgggcc agtaactgct gggaacaaag aacagcagaa 2400
ggagaagtgg ctttgtgtag agatgagaaa agttagtaaa caacacattc taatgcagcg 2460
aataaataaa ttgtctgtta ttaacttttt gtactccagt tttgaatcca aagacacctg 2520
ccttcccccc ttttttgaca gggtctctat cacccaggct ggaatgcagt ggcaagattg 2580
caggtcattg cagcctcagc ttcctgagta gctgggacta caggcatgca ccaccacccc 2640
tggctaattt tttttttttt ttgtagattc tgtgttgtct aggctagtcc tgccctcaag 2700
cgatccccct gccttggcct cccaaagtgt taggattaca ggtgtgagcc accatgcctg 2760
gcccaaagac ccccttttaa tgtgttgaga aacaaatatg tgggccctct gtcctctgtg 2820
tcctctcccc cagcacttag tggtgattca gtaagggtct tctttgtaaa gcagcagaaa 2880
cagactcagg ctgacttaaa aggaaattta tgcaaacatt tgaaggtaga tcctagagtc 2940
actgggaggg ctagtgaatc ttgtgtgggg tccacagcca ggaagaatgc ctcaaatcac 3000
accataaaca ggtctgaaga ggatgctact gccactgcca aacactaggt gccatcagcc 3060
ctgttgtcag catcgctcaa ccactgcccc agccccttgg tctcactgcc attgctccac 3120
ccccagagaa ggagaagctc acatccccac tactctgggt cactggctct ccattccagg 3180
gcatgggcaa gtgtgcctgg cttaggctgc gtgccagtgc ctcagctgca cggcaggcca 3240
ggaaagcggc tcccaggcct tgtgtctctc ccagactctg ggcaggcaat tccccagaca 3300
tgggaagagc gtcatctgca ccactgaaaa ttatgaaaaa tctccccacc atgatatgat 3360
gtctgtgccc ttggcatctg ctcatgatat attttctcat taacagaaag ttattctcag 3420
tctgttttgt tttttttttt tttttttttt ttgagacaga gtctcgctct gtcgcccagg 3480
ctggagtgca gtggcacgat cttggctcgc tgcaagctcc gcctcccggg ttcacaccat 3540
tctcctgcct cagcctccca agtagctggg actacaggtg catgttgcca tgcccagcta 3600
atttttttgt atttttttta gtagagacgg ggtttcactg tgttagtcag gatggtcttg 3660
atctcctgac ctcgtgatcc gcccacctcg gcctcccaaa gtgctgggat tacaggcatg 3720
tgccaccacg cccggcctca atctggtttt aacacttcaa cagatgtatg tagtttttgt 3780
atcaccatag cacctaaaaa gtacccccaa agttaaatat actgtcatct ctgtgtgtgt 3840
gtctgtagtc agtgccaggt ttgtatgtga atttgaggac cttccctagt ccccttgtcc 3900
cccaggcatc tcctgctcat aggtttggaa atgactgata gaggcttcat ctctagtcca 3960
agaaatgctg gacggtgtat ccatcctact gttccatctc t 4001
<210> 19
<211> 4001
<212> DNA
<213> Homo sapiens
<400> 19
gcatggccca tggcggcctc ggaggaagct gaaagggtcc tgggtacagt gcaggtggga 60
aggagggggc aggaatccca gcctgcccac cccagatggg agaagccagc attccggcgc 120
ctcacgccag actcaaagcc attcatttct gccctgggat gttggataag ggcagcagac 180
gctcaaggtc agggctgcct ccccatgtct gtttgaagcc cagctccggg gcaccgttgc 240
tagtgggact ctgctaggcc ggcagccatc ctgttgaggg tgacaggacc acccaagggg 300
cttgtggtgc ccaggacggg gaccacacag ggccaccttc ctgctcagga atacagagcc 360
gctgcgagcc ccctcaggga gccaggctgg tcccgagggc ccagctttcc caagacgcac 420
ctgtggtcta ttcggagcct ctgtcccctt taaaatgagc tgtcttttta ttatcaggct 480
atgtgtgctc tttatccatt ctagatgtta gaccttgtcc gaggcctgat ctgcaagtgt 540
tccctcccgt tctgtgggtt ctttctcttt cttgatggtg ttctttgtag cgcaacagct 600
taacttctga ggaagttggg cttgtctatt ctttcttctg ttggcgtcac atccaagaga 660
ccactgcctg aggcaagtgc gtggagatgc tgtttggctc cggtactctg aagaggcgcc 720
tgctgtccca ggcccactac ccccgctcat gtcttttcct ttccttcctc cagattgttt 780
cctatcttct caccttctcc tttgcagtct ctgaacctgg aaatctgggg acaagctggc 840
agcggttggg gctgtggagg caccgccccc tccagcgagg tggtcccccc gcatctgggc 900
caacgtcagg gagggaggat gctggggcac agggctcggg gaggacacgg ggacccatct 960
ccagaaatca gccagcaggg acagctggct ggggaccagt ccaaccgctc agccataagc 1020
catggagcca gaagagggag ctaacggctc agagctggag acccagtcca cgcggcgctt 1080
acctgatgga caccgcgagg gacagcacct ccagcagcag ggctgcactg aagaccgcca 1140
gggcggcggg cgggggaggc acggtggccg cctcgtactt caggaagcag gtggtggaca 1200
gcgtcagaaa cactgtggtc gacagaaaca catccaggag ggagctgaag gtgggactag 1260
caaacgtctt cacgggggag ttctttatga cctgtgtgga gcaggagtgg agcagaatga 1320
ctgggaggcc atgggcggcc aggagcgcag ttcagatgga gaaacaggca acacgtgcgg 1380
agcactgctg ctctgggcca ggcacagggc aggaaggctc tcggagggct gtctccctct 1440
actaccccat ttaacactgg ggagtccaag gcacagagaa gttaaacttg cccagggggc 1500
tgggcacggt ggctcacgcc tgtaatcccg acactttggg aggctgacgt aggagggtcg 1560
ctggagccca ggagttcccg gccagcgtaa cacagtgaga ctccatctct acaaaaaact 1620
ttaaaattag ccaggcatgg cagtgcgtga gtgtggtccc agctactcag gaggatgacg 1680
caggaggatt gtgtgagccc aggagtttca ggccgcagtg agctataatc atgccactgc 1740
actccagcct gggcaacata gcgagacccc gtctaccaaa aaataaaata attagctggg 1800
catagtggtt tgtgcctgta gtcccaggta cttgggatct gaggtgggag gattgcttga 1860
gcccaggagt ttgaagccgc agtgaactat gatcacacca ctgcagaatt ataccgtttc 1920
cgaaaacaaa aaaacacaaa aaaaaccctg ccagggtcac agagcaactg attatgaccc 1980
agcacattct catagacaca ttatatattt agctgcaaga tgagttactg actggcgcag 2040
gtgaggtccc tgcatctgta aggcagttgg acgagagcaa tgtgcagggc ggtcctgctg 2100
catggcggag tcctaggttc agcaggtgcg aaagcggggc ttcgtggcgg aagtggccga 2160
cggggattta agtggcacat catcgcctgg gaacctacgg ggcctggctc tgtgctgggg 2220
gccagtacac tagagaggca cacactagtg ttcggcaatt gagggggaga aataacgatg 2280
ctgacacgtc caagaggtga ccaggcccag gccctgcaag gctctgggga ggtgcgatct 2340
ccattgacgc tcacggcacc ccacaggcag gaactaccac ccatctcggt gtacagattg 2400
ggaggccaag gccaggggct acccctaggg cacacagcga acaaggagcc aggccaggac 2460
ttggcatgga ccggttgact cctgagtgtg tccctaacct agaggaaggt gtaggatcca 2520
aaggcagagg ctgtggttgg gagaggaagg gatgtagcca gttcagctgc tttccatttg 2580
gccttgcaac cttctggcaa cccctgagag gccatgggtg tccactcagc atcccggcca 2640
ggctcggggc tgagggcagg agctggagac gcacggagcg gaaaggtctc ctggcagccc 2700
tgggtgcggt gtgcgttctt tgggacaaac cagctccggg tgcctcgacc actcccagga 2760
actaggtggg agggcaggag gggcacttgg acggtgggag caaggctgcc agtcagccct 2820
gatgccaggc tggaccctgc cagggcaacc tgggggttag ggtttggttc caaattctct 2880
ccaaggaagc tggaggctct gtagtgccca aaagggcaag tcaggtccca gccctggaat 2940
cccctcagaa ggtgtgggaa tttctttttt ttcatttttc ttttcttttc cttttttttt 3000
tttttttttt tgagatggga tctggctgtc gccaggctgg agtgcagtgg cgcgatcttg 3060
gctcactgca acctccgcct cccaggttca agcaattctc ctgcctcagc ctctggagta 3120
gctgggagga caggcacccg ccaccacgcc cagctaattt ttgtatgatt agtagagatg 3180
ggatttcacc aagttggcta ggatggtctc tatctcgacc tcgtgatctg cctgcctcgg 3240
cctcccaaag tgctgggatt acaggcctga gccactgtac ccggccaggt gtaggaattt 3300
caggggcgac actgcccatg cccactcacc ttggcagagg cccacgcacg tcccctttct 3360
gtggggctcc atctagtgcc cgtcaggggt tcgagtcagg gccgggtgca tgcatgtgat 3420
gtccctgaga catcactgcc ttgcccacat gtgaggccgg aaggtttcca cgtgggctgc 3480
cgtccttctg agagcccaca cctcccacct tccaagccaa gggctcgggt acagggcctt 3540
tcctgattgg ggctgccatc tcctctgtag aactgcccca tccatggacg ctgccagagg 3600
ccccgctcct ccctccacct cccttgcagt ctctcctgcc tgccctgcta cttgccagag 3660
acacctgtcc ctgccacagc ctccccctct gggccacccc tggccctgat gccagaggct 3720
cccaggccca tctgccccac ccctgtgcat cccgggagtg ccacctgtcc cagtcccact 3780
gagaacgacg ctcccccaca tcatgacact ctttcgccct ctctccccat gggcgaaacc 3840
cagggacagt gccactggct gccagggcac cgtgggccag gctccgccct gctcactacc 3900
cacatcccag ggcctccggg tgcggccgtc ccaccatcat cagagcacag gctctgcagc 3960
cgggctgtct gggctcaagg ccactccacc atgattagct a 4001
<210> 20
<211> 4001
<212> DNA
<213> Homo sapiens
<400> 20
agtgtaggat tggaatttct gtctttattc caattttata ctttgaagac aatatctgga 60
gctgataaat cctttatctt ttttcattat aaaagagaat atcaccaaac ttagtgataa 120
ttctctttta taatgcatct ttattgatat tttaagttag aggtcagcaa actttttatg 180
ttaagggcca gataataaat atttgaggcc ttgtgggcca tatggtctct gtcacaacta 240
cgcaggtctg ctcttgctgg gcaagagcag ccataggcaa tacattaaca attaagcatg 300
tgttccaata aaactttatt tgcaaaagca ggtggtaggc ctgctggccc atttgaagat 360
aaagtatagc ataatcaaat tttttcctca tgactaaaaa taatagttac aaataaacac 420
tttacaaagc agaacataga gatcatataa aaatccttcc tggtatttat aatatacttt 480
ttacatggta gacttagctt ctgattatag tttgagttcc tttttatttg tgaatacctg 540
gctgactgac atagctattt tcgagtaact ttttgttcat attgtttacc tttttggata 600
ttaggtgctc tgtttctctt tttctttttc tttctttctt tctttttttt tttttttctt 660
gagacagtgt cttcctctgt tgctcaggct ggagtgtagt ggcgcgatct ctgctcacta 720
ttgcaacctc tgcctcctgt gttcaagcag tcctcctgcc tcagcctccc aagtagctgg 780
gactacaggc gcaagctacc acgctagact aatttttgta atttttttag agatggggtt 840
tcaccatgtt attcatttga tcttaaactc ctgagctcaa gcgatccacc cacctcagcc 900
tcccaaagtg ctgggattac tggtgtgaac cacggcacct ggcctgctct gtttttctat 960
aaaacaattt catttttctc aaaaggagaa caggtcagac tctcctacag ataatgcgga 1020
actgcatagg agtattggtt ggagcaacag tagtctatga ccaaacaaaa atgttatgtg 1080
tataaaactt ggaattggta ccaactttgt aaccatctta gaaataacat tttactgtgt 1140
acagcttatt ataacagtat tcaaaattgc ataatacgtc atattaagaa aattatctct 1200
gtaatgcgtt aatgcactgg gagctgccta ttctctgacc aaactgttag tttgttaatc 1260
tcacttaaca aaactagaat atgttcatta ataatgtgtg aaagactgtt acaccatttt 1320
taactcagtt gctgcagttg tctagttgta agcagcactg cttgcatggt gcagagggat 1380
atttcaaaat attatgcagc agcagttttg ggtttaattg tctgcagatg gtagatgatc 1440
tattttcctt actcgctgcc acatttctaa tagtacccag gctgttccat gtaaattagt 1500
aatctttcat gtaagcctgt ccttaatttt ccaatatttt tttgcatcat tttccagctg 1560
tatcatgtta gaactgatgt tttaattttc tatttttttt aaatatctgt gtgtcattaa 1620
tgttaaccac ttaaacacaa ctgctgagct gtgcaggcaa actaagaagc cagcctggtc 1680
tgcttcaaga ctcttaatca tttctttgtt cacatgttaa ccatgataac tatagcaaag 1740
gcaattcgta gtttctattc ttttccaacc aatacttgga gatgcatttt ttattctctt 1800
aaagtgaaaa aataaagttc ctgaatggaa agctgcatga aatttagata tcagagaaat 1860
ttttctacat aaaaacgttt ttagtgaaag agtgtaatat tcttctttac aaaagggatt 1920
tagtaagaag agaaaacatt gtgatttcaa caatagaatg ttgaaaaacc taatacgttc 1980
taaaagccta aagtttggat caccagaatt aacaatgaat ctttcagcac cacactcaat 2040
ttcaattgga aacactgctc agtcttcacc tttatttagg atgaatgttt ttggttttgt 2100
tttccttttt ttcttcttca gttgatctat ccaagatctc agccaaacct tgcaactgtg 2160
tcatagttag aaatccaact gcctgactag agcttgttta agggaaaaga attcagaata 2220
ttcagcacct tgctatgttt tttttttctt tttttccttt tctttttgtt gttgttagaa 2280
caaagttgag acaatttgat atactgatgg ggattcttct tgctagccta gcttcagtct 2340
ggcttttttt tccctcatgc atgctgacct ggaaatcctc ctttgcctta gctgcgataa 2400
tcgtgactca gcttacaaca ccttacattc gcatcgcccc cgctgcaggc gacggccaac 2460
taacaggtca gatgttcaag tatagacgag tcatcttgcc catttcccac tcatttttct 2520
cagctctgtt tcccctctac atttcctcct ttgcttaagg cctgtttaca atgtttcaga 2580
cccttggtac agacctttgc tcgtattaaa ttaagtgaca tgcttggcct cttctgtgtg 2640
cacatttata tccatgcatt aagtacaact ccactgcatg ttttcattgt gtgtttggtc 2700
tgttggtatg tttctttcta gactaatgtt atatagaaat gaaatttgct ggacttgaaa 2760
tagcaataac atcaaacagg gtttccaaca aaatacataa agatatattc ctagttttcc 2820
aaattagtct aaaatggtat tcagatttga gggtgcaaga gatttatttt ttaccttgcc 2880
tggtctagtt tgcacataca tttagctata cacaatcttg gctgccagta gtattaaacg 2940
aagcccccca aaggtgttta gggtggcagt gttctaacag ctttgtgtgt atgtacaata 3000
gttgcagagc tgattgcatt ccttatagat ccttgtttca aagcatagtg tttatataaa 3060
tcatgcagtg gcatatctgc aagtcaaatg ctttgaataa tttaggaaag tttagaaaaa 3120
ttgaatggct tagagtcatg gagttgaact gtgcttcctt ataccaagag aagtgagctt 3180
gttcactctt ccctctaaag ccttatagac atctctatac attcatatag gaaaagagtc 3240
cagtaaatta cttcagattg aaggtgagta gttagtgaac tccaaaagac aaactttttt 3300
gcttcagcaa aggcaagtca ttgtcaagat ttctaacatg attacctggg caggactcaa 3360
aattctggtt tcagaagcac ctgcttacag tctaggcctg atgaaagcca agtaagaagc 3420
agaaactact ttctatatga ttttcactcc agcaattgtt ttttgtgtgt gaaactaggc 3480
aatgtcccct aacatttaag aaaatggtaa agcttaactc atggtagtaa attacaaatg 3540
tttaagttgt tctgtgaact tattattttg gacagtaaag aaaactaagc ctcaggaaaa 3600
ttgtagactg atgttataag gtaaagaaat gtaaaataat tctatttcct tctccactgg 3660
aattctgtac atattttcag ttttaaagct atcaatttat tgacagttct gttgcccttg 3720
cctgcaagtt tcttttaaga gtatttcttt tcctgattga agttctactt gttacagact 3780
cttcctgctc ctgaacttct gaagtgttct tcccacaagt aaaagtattt gagggacagt 3840
gcctttctta ataagtcctc ccctctaaaa ccttgacctt tcaggactgg cctgtggcac 3900
tctgtgtagg gttaggagtc catcttgaag gttattggga tccttcagca actctgacca 3960
cccattgtgt ctgagctggc cccagactca tcagtacatt g 4001
<210> 21
<211> 4001
<212> DNA
<213> Homo sapiens
<400> 21
ttctcaccag tatggatcct ttgatgcaga gtaagttctg agccacgact aaaggccctc 60
ccacattcct tacattcata gggtttctct ccagtatgag ttctcagatg ttcagtaagg 120
tgtgagccac gaataaaagc cttcccacac tgtttacatt cataaggttt ttcacctctg 180
tgaattctct gatgttcact aagttgtgag ccataaataa aggccttgtc acattcctta 240
catttgtagg gcttttcacc tgtatgaatt ctttgatgat atgtaagttg tgtagcacgt 300
acaaaggtct tcccacattc cttacattca taatgtttct caccatgaat tttctcatgt 360
tgagtaagat atgcaacacg aataaaggcc tttccacatt ccttacattc aaagggtttc 420
tcacctgtat gaattctctg atgttcacta agttgtttgc cacaaataaa ggcctttcca 480
cattccttac atttgtaggg cttctctccg gtatgaattc tttgatgttg aataagatta 540
gaattagaaa taaaggcttt cccacattct ttgcatttat aaggtctctc acctgaatga 600
actctcaggt ggtaagtaag ttgtgagcca cgaaaaaagg tcttcccgca ttctttacat 660
tcatagggtt tctctcctgt atgaattctc tgatgttcat tcagttggga ggcacataaa 720
aaggctttcc cacattcttt acatatgtaa ggcttttcac cagtatgaac tctctgatgg 780
tatgtaaggt gtgaaccaag aataaaggcc tttccacatt ccttacactc ataaggtttc 840
tcaccactat gaattctctg atggtaagta agttgagagc caagaataaa ggccttccca 900
caatccttac attcataggg tttttcacca ctatgaattc tctgatgaag agtatattgt 960
gaacaataac taaaggcttt tccacatttc ttacattcat atggtttctc tcctgtatga 1020
actctctgat gttcagtgag ctgtgaacca cgaataaaag ctttcccaca tgcgttacac 1080
tgataaggtt tttcattagt atgaattttc tgatgttgaa tgagatcaga agtacgacca 1140
aaggcctttc cacattccat acactcataa ggtttctcac cagtctgaat tctctgatgt 1200
tgaatatagc ttggcttttt gctaaaggta ttcctgtgtt tcttaacttc agagcatttt 1260
tctatattat gattttcctc atgttgaata aggcatgaca ggtagctgaa accttgtctg 1320
cattccttac atttgtacaa tttttccttg gtaggaatta tccgatgaaa agtagaagag 1380
gttgaatgac tttcagtggc cttttcttca caggtaattt tgacacacat gtaaagccct 1440
tcctgacttg cttcttgacc ctctaagttg cctttgcact ccatattgtc tccaagacca 1500
ttgtattgaa ggtgatggtt gatacgtttc cccatattct cccactggag tgattccatt 1560
tcataaattt ccttttctgg agataacttt ttggtcacac aactggattc caagtctgta 1620
gaataaaaag aaagcaaatt cctgcttttg tttacaagga gaaataaaaa ttctatggta 1680
aaaatgatag atgaaaataa ttctcattaa gaatagaatg gcttaaacag tacagaaata 1740
ttttctgcca tttttcttta aaggcacaag gagagaataa agaaaaagga gagcttatag 1800
gaaagaaggt aattagaaaa ataggttaag tcagtggttc tcaaactttg gaatggctta 1860
gagacaccca ggaggcttat aaaatagtaa taaaaacacc ctcttgaaaa caaacagatg 1920
tgtagctgga catacaatct atcaacaaat cctatcagct ctcctgtcaa cattaagaat 1980
ctgaccacct ctactaccac catcctaggc caagtcaccc ctctctcatt gtcataacct 2040
cttagctggt ttctctgctt tcatccttgc ttctatgatc tattcttcat acagcagcca 2100
gaggaattct tttaaaatgt ggtttatata acatgatttc ccctgctaaa aatcttccct 2160
attaatggtc tgccatctca ggataaaatc caaagtcctt aatatggctc acaagaccct 2220
acacaagctt gcctgtggcc acttttctta tgtcatctcc cactttggcc acttttctga 2280
tatcatcact taccactccc ctaattattc atttggctgt ggtcataatg gcctccttgg 2340
tgttaaaaaa gtgtcaagca tatgcctaca gttttgtctt tagaatgccc ttcctctaga 2400
tacctgcatg gcttgttccc tcatttcagt tttctgctca gctgtcacat tatcaaaaca 2460
tccaacacat aattgtacct atattgcatt tttcactaat ggatatacca tttctaaagt 2520
ttgactattg tctgtctgtc ccatgcttaa tcaaagcaag ggctttgttt catatactgc 2580
tttatctcta gtgcctaaca tcattactta cacattgctg gcattcaata tttgtcaaca 2640
atttctaaaa tatatatgtt ccttcattca taaatctcct tagatgcttc tcataaaaca 2700
caaagtttga gaatagagct aataatcttt caatggttta cccttgtatt tagaataaaa 2760
tccaaactcc tattcatggc ctacaagccc ctttacgatc cagttctttc ttaaatttct 2820
ccaatttctt caacgtacct tttaagcctc aaattgagcc actactatct tcagaacaat 2880
gctctaatct tattggcctg atttaagttc caagaacatg tcaaattctt tcccatttta 2940
gattttccat atgctgttcc atctgccaag aatgctctca aagtggccac ctcctttgcg 3000
tgtgaaatgt cctagcttaa atgtcagctg ctcagaaaga ctatcctgga tcactttaat 3060
taggtccatg ccaggtattt tctatacctg taatttattc ctaattcttg tttgaatttt 3120
tgcaattcta aaatttcatt tatttatctg ttgatctttc tgctcctcaa gctctacatg 3180
tgtaggggga aaatgcctgt tttattatat agctagtccc cagaacacag tatacatata 3240
atgttttatt cctttaatat agtttttcat gtcttacttt ccaatattta ttacttaagg 3300
ccaaatgttc ttatatttcc caaagaagta cctcaaaatc tcctagaaat agtgacaagt 3360
ttacacaata aagatcttat attattccga agaaagaagt tgagaaacaa gtgtactttt 3420
caaatgcaga gaaacctatc tgatgtaact ggatcagcag gcacagggtc tctgaccact 3480
gtatataggc tgtagaataa caaagactgg tcatgggaag aaaaataaat acacatcagt 3540
aagctcaaat ttaaacagtc ttttagaaat aagcctccaa aacttgagaa aacacacttt 3600
tgctgaaaag caaaaagaaa tgataaaaac aaaagcaact accccccccc acttgccata 3660
tacacaccac agctcagttt aacaccaatt catgtcagca acacatccta gaacaagata 3720
gaccctgctt ttgtttcttt catagaagaa caatatggtt tataaaacac tgaaagaaat 3780
caatgggctt atggtgatta agggaacact ataaggaatc aaataaattc agttgggacc 3840
caaaatagct acctatgtgc tagtgcccta ataaaaatgg gaagaaatat taatctaata 3900
tgattctggg gtaggaggca ctatgagaac acaggatggt gctaaacaga aatctagagg 3960
aatatagaaa aagttaagat tagaatacca aagtctcaga a 4001
<210> 22
<211> 4001
<212> DNA
<213> Homo sapiens
<400> 22
aaaacatgtc tgtctccatg tgtctctctc tccccacccc aatctatacc tcccctggcc 60
accaccccac ttctacacct ctttcacaac ccctactttc ctccccctcc tccagtcacc 120
acccctatct ctgaagttat tcgtaaaggt accattgtct cacctccttc ccccactctt 180
cccagccttc ttcccctacc accccggccc ccgccaccac aagcctgccc tctgaaaatt 240
aattcgccat cgagatatac atgcttcggt tctattttgc atttctgcac tcaatgaggt 300
ttttgcagcc agattccctt cccacagtcg aagcaagtat ccctccgtga aaaattcaca 360
gcgttacacc aagggcagtc ccagtcccct ggcctgcgat atactggagg tctttgctga 420
tgaggttcgg agtatctcct gtctcttggg tgctcctgat ggtctattct gtggggcccc 480
ccatcagacc acaggctagt atcaaaggcc tcccagtcgg aaggcagctg aggcggaact 540
ggtgctgtga ctgttgcttg tggaggggat atagtgggta tgtctgagaa ggaggaaaaa 600
gggcttgagt atgagtatga gtatgaggca gggagctgga caggaagagg ttctgatgag 660
gctctggacc aggggttcgt ggttccagag aaataagatg tggctgtttc gaccgaccgg 720
agatctccct ccttctgccc ccaataggtg tacttttcct gatcagcatc tccaaggagc 780
tgcaggagga ttctctccag cttctctgtg gtctccatgg aggcagccct ggcctccacg 840
tgggggaact gggtctccat gggggcagcc ccagcctcca cagggggagc catggcctcc 900
acaggggcag ccacaacctc cacatccctc tgctcttctt ctgctccttt tcctccagca 960
gcaccagcag cagccaccgc cgcctcttct tcctcctcag ctgctccttc tgtgcaaacc 1020
cctccggcag tggccaggcc tggccaccct gccccctgct cccactcatg gggagacacc 1080
agctcctgga gagaagtggg cagggctgag gtgtgttctg ggggcaaggc gggagcacta 1140
agcagggtag gacggagcta ccaggtggga agggcggggc agggcgaagc catggagcag 1200
ggtggagagg ggtgggaatg gggcagggca gtggtactgg gcagggacca gttgagggat 1260
cttactgcat ggagcggctt ccaactcacc ttgtctctct ctttctgcac ctccacgagg 1320
ctggtctggg ccattcttag ccgggaggcc gcctccttgc gttccgtctc ctgctgctcc 1380
ctgagcttct tcaggtctga tgccaaggcc tgtgcggctg atttgtgcag tttggcgaag 1440
ccgtgcagcc accgcaccct gtgcctttgt agctgtgcct gcctgtgggc aaagcgcact 1500
cccaaggcca ggctgcccca ggtgcaggcc tctttgacct cactgggcac ctcgctgtcc 1560
tccagtatgg ccctcagctt gtcttccacc ttctcccagg ataaggatat attctcaaga 1620
tagaactcgg ggcctttcgt gtgcctggcc attttctcgt tgatgaaggc caccacgttg 1680
ctatgccgga acccgctact ggggtcctca ggtctcaagg ccatgatcgc ctaggggttt 1740
aacggtttca ctagccctgt gtggatggag cagccaatag gttcctttcc tcccccttag 1800
cccctcccct catccatctt ctcccgccct cttgtccccg ccctacccgc tctgacaaga 1860
ccgtcctaat gaccccctga ccggcttgtc ctaccctagg acacctccca ccaggcctca 1920
cctcttcagc cagggccaga ggaagtacag gccaaccccg ctgtcttaac acgcccgaag 1980
agagaggaag atctctctcc ctcctaaggt tccagggaaa ggaagactca gggccaacgt 2040
cggcttttac tctgggactg ttccaatttc cagagatacc aggagtggaa gggagtgagt 2100
gaggaaggcc cctgccactc cattgggatt taagggaacc tgtcacatgg ttgggagccc 2160
ttaaaggtcg gaagaaacag gagtggtcag aaaagactta gaaccaagaa aattaagccc 2220
ttataatcaa gtcaaaaaag tcccatgtat ccctccctct tgacaatggt ttgtcccatg 2280
ccaacggcta gtagggaccc ccctagaggt tgaaggggga ataatgggta tgttctagaa 2340
cacgaaaaag cccccaaccc atgaaaattc tgcagttcat tttccagact cctaaatcct 2400
agtatacagg ttagtgacct ttttctgaaa cgggcaagat agcaggcatt tgaggctttg 2460
tgggcaaaag attatcccaa ctacttgtag cacaaaagca gccataaata atacctgaag 2520
aaatgaccat ggccatgttc taataaaagt tcatttataa aaccaggcag ttggcatgat 2580
ttggcctgta gatgtcaaac tcccaatcta gattcacatt tcatacatgt cctacagaca 2640
catacaatca gaatgttcat gatatgactc ccccttttcc ctgttaaaag aaaaacgtta 2700
gacaacgtta acagtttgtt tacacataaa caacaaccaa aaaaaaaaaa aaagaaaaac 2760
gagtcatgaa tttgtcagta ctcagaacca gaagagtttc agagagctat tcccagcaac 2820
atgagaagtg aatttttata ggctgagttt ggacacaaaa gagaaaaatc acctgattgg 2880
ctacagcaag gcatttgcct tatttggaca tggtcttgtc acttgactgc ctgtaattag 2940
ctggagcctg gctagttgtg tttggctgaa attaggctgt cttttataca gcttatgtta 3000
agtttcagtc tgtttacatg ctaagttagt ttgcggtttg ttagatagga actcaaagta 3060
tggaaacagc ttcaaactaa tggcctcttg cttattttaa tatatgcagg agatgattcc 3120
cctcctaaat tcctatgttg tccaatacag aaacctcaaa atcaaatttt taagttacct 3180
acccttcata ccgaccctta aaaacaagtc aactaagaga caaaatcttg ttgctgtgac 3240
ctcagaaact tctaagacat ccatgcccac ttagggaatg tccatgtttc ttctcaaata 3300
atcttcctgt ctttaatttc ttcataatcc ctaatcactt cctaatcttt cctatgatca 3360
tgtttctctt aagcttaaaa tcatttgatg gatccttgtg gcacactcct tgtcctgaca 3420
ttagaatctc accccaggat accattctaa tttaactctc tccacaatgt tatccaccat 3480
catttatgta cctaagggac taagccatac agggatacac tgattttggc gttttaagtt 3540
attttccctt ctgaaaaatc agagcattca ttttttatta tatgggcatt atgttctatt 3600
ctatgaacta catttaaaaa gacaatattt cttctctaag ggtcctatgt cagataatta 3660
gagaggaaaa agacaataaa tcaagcatta aaatattacc taagaaggta gttaaacccc 3720
agaggcataa agtcccacca tttaaatgct gtttttagat gctgttttgt gatttctttc 3780
cttctcagct tgtatatttt atttatcagc catcagtgga aagggttcat gttttggctt 3840
attctagagt gacttctaca gaaagagcat ccagaagata atgtagaaaa gagaatggtg 3900
aggaaagggg gagagactgg agtcaggaag acccattagg gagatgctaa agcaattcaa 3960
gaagtaaact aaagcagtgg aaatagggat ggagagaaga g 4001
<210> 23
<211> 4001
<212> DNA
<213> Homo sapiens
<400> 23
gtgtctctct ctccccaccc caatctatac ctcccctggc caccacccca cttctacacc 60
tctttcacaa cccctacttt cctccccctc ctccagtcac cacccctatc tctgaagtta 120
ttcgtaaagg taccattgtc tcacctcctt cccccactct tcccagcctt cttcccctac 180
caccccggcc cccgccacca caagcctgcc ctctgaaaat taattcgcca tcgagatata 240
catgcttcgg ttctattttg catttctgca ctcaatgagg tttttgcagc cagattccct 300
tcccacagtc gaagcaagta tccctccgtg aaaaattcac agcgttacac caagggcagt 360
cccagtcccc tggcctgcga tatactggag gtctttgctg atgaggttcg gagtatctcc 420
tgtctcttgg gtgctcctga tggtctattc tgtggggccc cccatcagac cacaggctag 480
tatcaaaggc ctcccagtcg gaaggcagct gaggcggaac tggtgctgtg actgttgctt 540
gtggagggga tatagtgggt atgtctgaga aggaggaaaa agggcttgag tatgagtatg 600
agtatgaggc agggagctgg acaggaagag gttctgatga ggctctggac caggggttcg 660
tggttccaga gaaataagat gtggctgttt cgaccgaccg gagatctccc tccttctgcc 720
cccaataggt gtacttttcc tgatcagcat ctccaaggag ctgcaggagg attctctcca 780
gcttctctgt ggtctccatg gaggcagccc tggcctccac gtgggggaac tgggtctcca 840
tgggggcagc cccagcctcc acagggggag ccatggcctc cacaggggca gccacaacct 900
ccacatccct ctgctcttct tctgctcctt ttcctccagc agcaccagca gcagccaccg 960
ccgcctcttc ttcctcctca gctgctcctt ctgtgcaaac ccctccggca gtggccaggc 1020
ctggccaccc tgccccctgc tcccactcat ggggagacac cagctcctgg agagaagtgg 1080
gcagggctga ggtgtgttct gggggcaagg cgggagcact aagcagggta ggacggagct 1140
accaggtggg aagggcgggg cagggcgaag ccatggagca gggtggagag gggtgggaat 1200
ggggcagggc agtggtactg ggcagggacc agttgaggga tcttactgca tggagcggct 1260
tccaactcac cttgtctctc tctttctgca cctccacgag gctggtctgg gccattctta 1320
gccgggaggc cgcctccttg cgttccgtct cctgctgctc cctgagcttc ttcaggtctg 1380
atgccaaggc ctgtgcggct gatttgtgca gtttggcgaa gccgtgcagc caccgcaccc 1440
tgtgcctttg tagctgtgcc tgcctgtggg caaagcgcac tcccaaggcc aggctgcccc 1500
aggtgcaggc ctctttgacc tcactgggca cctcgctgtc ctccagtatg gccctcagct 1560
tgtcttccac cttctcccag gataaggata tattctcaag atagaactcg gggcctttcg 1620
tgtgcctggc cattttctcg ttgatgaagg ccaccacgtt gctatgccgg aacccgctac 1680
tggggtcctc aggtctcaag gccatgatcg cctaggggtt taacggtttc actagccctg 1740
tgtggatgga gcagccaata ggttcctttc ctccccctta gcccctcccc tcatccatct 1800
tctcccgccc tcttgtcccc gccctacccg ctctgacaag accgtcctaa tgaccccctg 1860
accggcttgt cctaccctag gacacctccc accaggcctc acctcttcag ccagggccag 1920
aggaagtaca ggccaacccc gctgtcttaa cacgcccgaa gagagaggaa gatctctctc 1980
cctcctaagg ttccagggaa aggaagactc agggccaacg tcggctttta ctctgggact 2040
gttccaattt ccagagatac caggagtgga agggagtgag tgaggaaggc ccctgccact 2100
ccattgggat ttaagggaac ctgtcacatg gttgggagcc cttaaaggtc ggaagaaaca 2160
ggagtggtca gaaaagactt agaaccaaga aaattaagcc cttataatca agtcaaaaaa 2220
gtcccatgta tccctccctc ttgacaatgg tttgtcccat gccaacggct agtagggacc 2280
cccctagagg ttgaaggggg aataatgggt atgttctaga acacgaaaaa gcccccaacc 2340
catgaaaatt ctgcagttca ttttccagac tcctaaatcc tagtatacag gttagtgacc 2400
tttttctgaa acgggcaaga tagcaggcat ttgaggcttt gtgggcaaaa gattatccca 2460
actacttgta gcacaaaagc agccataaat aatacctgaa gaaatgacca tggccatgtt 2520
ctaataaaag ttcatttata aaaccaggca gttggcatga tttggcctgt agatgtcaaa 2580
ctcccaatct agattcacat ttcatacatg tcctacagac acatacaatc agaatgttca 2640
tgatatgact cccccttttc cctgttaaaa gaaaaacgtt agacaacgtt aacagtttgt 2700
ttacacataa acaacaacca aaaaaaaaaa aaaagaaaaa cgagtcatga atttgtcagt 2760
actcagaacc agaagagttt cagagagcta ttcccagcaa catgagaagt gaatttttat 2820
aggctgagtt tggacacaaa agagaaaaat cacctgattg gctacagcaa ggcatttgcc 2880
ttatttggac atggtcttgt cacttgactg cctgtaatta gctggagcct ggctagttgt 2940
gtttggctga aattaggctg tcttttatac agcttatgtt aagtttcagt ctgtttacat 3000
gctaagttag tttgcggttt gttagatagg aactcaaagt atggaaacag cttcaaacta 3060
atggcctctt gcttatttta atatatgcag gagatgattc ccctcctaaa ttcctatgtt 3120
gtccaataca gaaacctcaa aatcaaattt ttaagttacc tacccttcat accgaccctt 3180
aaaaacaagt caactaagag acaaaatctt gttgctgtga cctcagaaac ttctaagaca 3240
tccatgccca cttagggaat gtccatgttt cttctcaaat aatcttcctg tctttaattt 3300
cttcataatc cctaatcact tcctaatctt tcctatgatc atgtttctct taagcttaaa 3360
atcatttgat ggatccttgt ggcacactcc ttgtcctgac attagaatct caccccagga 3420
taccattcta atttaactct ctccacaatg ttatccacca tcatttatgt acctaaggga 3480
ctaagccata cagggataca ctgattttgg cgttttaagt tattttccct tctgaaaaat 3540
cagagcattc attttttatt atatgggcat tatgttctat tctatgaact acatttaaaa 3600
agacaatatt tcttctctaa gggtcctatg tcagataatt agagaggaaa aagacaataa 3660
atcaagcatt aaaatattac ctaagaaggt agttaaaccc cagaggcata aagtcccacc 3720
atttaaatgc tgtttttaga tgctgttttg tgatttcttt ccttctcagc ttgtatattt 3780
tatttatcag ccatcagtgg aaagggttca tgttttggct tattctagag tgacttctac 3840
agaaagagca tccagaagat aatgtagaaa agagaatggt gaggaaaggg ggagagactg 3900
gagtcaggaa gacccattag ggagatgcta aagcaattca agaagtaaac taaagcagtg 3960
gaaataggga tggagagaag agtgtagaga gccatttggt a 4001
<210> 24
<211> 4001
<212> DNA
<213> Homo sapiens
<400> 24
cattcaaatg attctccagc cagagaagtt ataggcagag gcagtgacaa aggtcaaagc 60
acataataag tgtactttct gagagggagt tttttttccc cattttaaca tttttgtctt 120
gaatttcaat tacttaaatc caaaatcact gccttttcct atgtctcaaa taatggaagc 180
tgctttgaga ttcacatcaa aagccaacat ggcactataa tggctctcca gcgcttgaat 240
aaagcccagt agtacttata acagtatcag ctcaaaaatg ggaagcgtta caatagctac 300
cacttaatga gcactatgtg gatgatacta ttcgaaagct ttacaccctt cattttattt 360
aatccccaca atcatcctga gataaactct attattcata ttttacagat gactcaacta 420
aggctcaggg aggtaaggta gcatgatcaa agtgtctatc agaaagtgat agagatggaa 480
cttgaaccca ggcaatatgg ctccagaaca catactccaa cccctaggca tattttctcc 540
cttagcttgt gagagccaat gcctttccgg tgtatattgc ttgaaatcta ttatatttga 600
acccctcagg agcacaggtg tcatcagcct cgtctctgcg tccctattgt ttcctcttcc 660
cttcacttgg ttcaaaaaga actgttctgc ttttccacta agcctcttca ccatttgact 720
gctactatct ggggaagcct ctctatctgc actctgtatc tagagtgttc cttgctgctc 780
gaaatatact ccacggtcca gcagcatcag catcaccagg gaacttgtta gaaatgaagc 840
atctcaggta tcaccccaga ctactgaatc agaatctgca ttttaacaga ctctcaggtg 900
attcgtgtgc acattgaagc tcgagagcag cgctctccat cttttgcagt acttactccc 960
ttagtaattc cttgctggag tgttccgtac acagaagtct aaaatctatg gcttgaaagg 1020
gagggaatca tttaagctgt tccctgttct ctggccttta ggagctccaa aatcagtcag 1080
agaccatgct aagagtagat cttgctgtca ttagactaac aaatatgtgg acgaaatgaa 1140
aagacagcca cgttagccat gatttccttt ccatatttca ttcatgctct ccctgtatct 1200
cacactgcca caacacagat gctttctttt atatcatggg aacattttta attaaagcat 1260
tggaaccttt gagaatagac aatcctttaa aagttaagtt tatctgtatt tacaagcctc 1320
catttatttg gcagtgacaa taggaaaaaa aaactgctgc taaaggcata tagaggaatg 1380
attaagctaa tttacaataa ttgctttaat cttgacatat tgaattgtag tgtctaactc 1440
tttgagttct ctcggtctga ggtaaaggaa aaatataaca ctgtgttcga tattgtacct 1500
ctgaagcaca atggaaattt tgccctacag tgccagcctg tcacccatgg tgtccatgtg 1560
atagaagggg agagatcacc cgcaatgtct aattgccact gtactcactc ctgtgcttta 1620
agatagaaag cccatgttca gaagcaaagc cattagcagc tattaattgc ttagctctaa 1680
taaattgtcc agaccaggac ttaattaggg aagacattta ggaacaggac acaggacaag 1740
aggacataaa ggagagccag gaatgagatg aaacctgagc agccaggaag gccaggaagg 1800
ggcccaggat ggagagaggg tagttgagat atccggatgt caactttact tactttctga 1860
tcgcgtttca agacagtccg gctggttgct tcttctaaag actcctatag ttttgcatct 1920
atacctctcc aggaagcacc tggagtagga agagccgtgg tatctaaggt gatatgaatg 1980
gtttcttttt gctcttggca gctctgctca cccctgcctc ccctgaagct gtctacttct 2040
ctgcttgatg ggactggctg ggtcaggcct ggttttgctt tctgactgac tcaggtctga 2100
gtcactgatc aagaggagga agtcctaatc atgggtgcag ccctttcaag taagaaacat 2160
ggaatttcat caaaagggac tgtgagctct ccgagacatt tggctgacag ttattgtccc 2220
attatttcac ttttagagaa actacattct tcatctttgg cccttcgtga actctgctcc 2280
accagtgatt tcaatctcaa agtttcctga agggctttgg ccagatgtga ccaacagctc 2340
atcatgagct gatatctggt atacggcctc aaacaatagg aagaagatga tcattgtatt 2400
catgcaacat ttatataggt gtctctctcc agattctggc aagaaattgc aaaaaaatga 2460
actattgctt ttgaattgga aacaaagtac ttggctgcta aaaatgtggc actgcagccc 2520
acttacaaca gttgagggca ttacaacagt tggactctct tggtgtcatt cctttaaaga 2580
aaagcattat cagcaagaag caaatatggg tcattcttat taatatttcc atactctaac 2640
ttccttctga ggggcttggt gcattctaac tcaattatct cacttatctt tgagggcttt 2700
tgattatctt gggtggggtt gtgggacagg gagggtcgac acacaagcac atatccttta 2760
agagctcaca gatgctttta catctagcaa cttgtttgag cctcgtaagt ctatgaggtc 2820
tacagagatg atcatacctt atttgtgaaa gaaccatgtt tgaaagccag ggcttccagc 2880
actacatccc aaataccccg agagctgtgg ctctcagcac cactgttgct gtagacagca 2940
ttcactgttc tttaacttcc aaggatgtca gggtagccgt gcgcttgcaa agagggggca 3000
ccagggcttt tttcttcccc tgcttcattt gctctttatt tcttacactc tgactcctta 3060
ttgaaggttt gtggccctaa atgcagtttc tttagctgtc tctttccctc caaccccagg 3120
tagtatttta gcctggatag ttcccatagg tatttcttta tacctaaatt caaaagtgtt 3180
tcttgacttc cttaaaccta tgccctttta tgatcaacat cttgcatgtc ctaccatttc 3240
tccccctccc ctgacccttg caaagacact ggaatggaca ttcatcccca gtggagagtt 3300
caggctcctg gtttccatag ccttcatctt atccctacca gcagccacat ttttacgaat 3360
ctttatttcc tgttttaaga aagcaatatg cctatttttt ctctgcgata taaggtagca 3420
taatattgat aggatctttc aaactatttc tctctggccc tggttctgaa acatatagcc 3480
ccttgagaaa catggctcta ttattaccta tcaagtttta ttatatttaa aactgtttcc 3540
acatctgttc ccctaaaccg tgacctcctt gaggatgggg atataatctg gttaatcttt 3600
gtatctgcaa atgccaagca gagtgccagg tacataaatt gttgaacgaa cgaacgaatg 3660
aatgaatgaa tgaataaata gtgccagaag aagacaatgc tactccatca tttcccccat 3720
gtcttgctct gtttttagcc aaagtacaaa ctggaagtgt ctaagatatt ttaacaccca 3780
actatgaata gtgtcaatga aaatgacaat agttcccttc tctgttcccc cattctgtct 3840
attttcccat agaatcttta ctgctcaggt tctatatgct tctgcttgac taattataca 3900
cttctttctt tcttcttgaa tgtttaaact gacagtaatt gcccagaact gaatgcatga 3960
cctctaagaa tctaatacta ctaaaaaggg actggtttca c 4001

Claims (10)

1. A system for predicting the prognostic survival period of a patient with intrahepatic cholangiocellular carcinoma, comprising:
an acquisition module for acquiring variables related to the prognostic survival period of the intrahepatic cholangiocellular carcinoma patient; and
a prediction module for predicting a prognostic parameter of the intrahepatic cholangiocellular carcinoma patient according to at least one variable acquired by the acquisition module, and for outputting the prognostic parameter;
wherein the variables comprise a promoter methylation score that is based on the methylation levels of at least 24 promoter regions; the 24 promoter regions are the set of polynucleotide sequences shown in SEQ ID No. 1 to SEQ ID No. 24, or the set of polynucleotide sequences complementary to SEQ ID No. 1 to SEQ ID No. 24; and
the acquisition module and the prediction module are connected in a wireless and/or wired manner.
2. The prediction system of claim 1, wherein the acquisition module is configured to acquire the promoter methylation score, and wherein the acquisition module comprises:
an analysis unit for at least analyzing the methylation levels of the 24 promoter regions in an ex vivo sample from the intrahepatic cholangiocellular carcinoma patient; and
a scoring unit for calculating the promoter methylation score based on at least the methylation levels of the 24 promoter regions;
wherein the analysis unit is connected with the scoring unit in a wired and/or wireless manner.
3. The prediction system of claim 2, wherein the acquisition module further comprises an output unit for transmitting the promoter methylation score to the prediction module; the output unit is connected with the scoring unit in a wired and/or wireless manner.
4. The prediction system of any one of claims 1 to 3, wherein the promoter methylation score is calculated according to formula (1):
promoter methylation score =
(-1.778426) × M (SEQ ID No. 1)
+ (-0.5188023) × M (SEQ ID No. 2)
+ (-0.007917956) × M (SEQ ID No. 3)
+ (-4.853461) × M (SEQ ID No. 4)
+ (0.442986) × M (SEQ ID No. 5)
+ (-1.512141) × M (SEQ ID No. 6)
+ (-0.503913) × M (SEQ ID No. 7)
+ (-2.622882) × M (SEQ ID No. 8)
+ (-5.796018E-14) × M (SEQ ID No. 9)
+ (-2.918528) × M (SEQ ID No. 10)
+ (-1.456336) × M (SEQ ID No. 11)
+ (-0.02070397) × M (SEQ ID No. 12)
+ (-0.4516687) × M (SEQ ID No. 13)
+ (-0.5262961) × M (SEQ ID No. 14)
+ (-0.01533871) × M (SEQ ID No. 15)
+ (-0.3580827) × M (SEQ ID No. 16)
+ [coefficient illegible in the source text] × M (SEQ ID No. 17)
+ (-0.03428591) × M (SEQ ID No. 18)
+ (-0.183213) × M (SEQ ID No. 19)
+ (-0.6439997) × M (SEQ ID No. 20)
+ (-0.9643823) × M (SEQ ID No. 21)
+ (-0.4090321) × M (SEQ ID No. 22)
+ (-1.727257E-14) × M (SEQ ID No. 23)
+ (-0.1818805) × M (SEQ ID No. 24) (1)
wherein "×" denotes multiplication, and M denotes the average methylation level of all CpG dinucleotides contained in the corresponding polynucleotide sequence.
5. The prediction system of claim 1, wherein the variables further comprise: ascites status, tumor size, macrovascular invasion status, lymph node metastasis status, degree of tumor differentiation, and carbohydrate antigen CA19-9 concentration.
6. The prediction system of claim 1, wherein the prognostic parameter comprises: a prognostic survival risk grouping, and/or a survival probability value for a specified number of years.
7. The prediction system of claim 6, wherein the prognostic survival risk grouping is a prognostic classification of the intrahepatic cholangiocellular carcinoma patient obtained by comparison against a preset threshold.
8. The prediction system of claim 6, wherein the survival probability value for the specified number of years comprises: a probability that the prognostic survival period reaches three years, and a probability that the prognostic survival period reaches five years.
9. The prediction system of claim 1, wherein the prediction module predicts and outputs the prognostic parameter by: importing the acquired variables into a pre-established Cox regression model, calculating the prognostic parameter through the Cox regression model, and visually presenting the prediction result.
10. The prediction system of claim 1, wherein the acquisition module and the prediction module are each a processor, a server, or a computer host; or the acquisition module and the prediction module are integrated in the same processor, the same server, or the same computer host.
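For illustration only, the computations recited in claims 4 and 6 to 9 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: only three of the formula (1) coefficients are reproduced, the function names are hypothetical, the direction of the threshold comparison is assumed, and the baseline survival value and linear predictor passed to the Cox step are placeholders that a fitted model would supply.

```python
import math
from typing import Mapping, Sequence

# Illustrative subset of the formula (1) coefficients, copied from claim 4.
# A full implementation would carry all 24 terms.
COEFFICIENTS = {
    "SEQ ID No. 1": -1.778426,
    "SEQ ID No. 2": -0.5188023,
    "SEQ ID No. 5": 0.442986,
}


def mean_methylation(cpg_levels: Sequence[float]) -> float:
    """M: average methylation level over all CpG dinucleotides of one region."""
    if not cpg_levels:
        raise ValueError("region has no CpG measurements")
    return sum(cpg_levels) / len(cpg_levels)


def promoter_methylation_score(
    m_values: Mapping[str, float],
    coefficients: Mapping[str, float] = COEFFICIENTS,
) -> float:
    """Weighted sum of per-region M values, in the form of formula (1)."""
    return sum(coef * m_values[region] for region, coef in coefficients.items())


def risk_group(score: float, threshold: float) -> str:
    """Claim 7: grouping by a preset threshold.

    Assumption: a score above the threshold is labelled "high risk";
    the source does not state the direction of the comparison.
    """
    return "high risk" if score > threshold else "low risk"


def survival_probability(baseline_survival: float, linear_predictor: float) -> float:
    """Standard Cox relation S(t | x) = S0(t) ** exp(linear predictor).

    baseline_survival is S0(t) at the chosen horizon (e.g. three or five
    years); both arguments are hypothetical inputs from a fitted model.
    """
    return baseline_survival ** math.exp(linear_predictor)


if __name__ == "__main__":
    m = {
        "SEQ ID No. 1": mean_methylation([0.4, 0.6]),
        "SEQ ID No. 2": mean_methylation([0.5]),
        "SEQ ID No. 5": mean_methylation([0.3, 0.7]),
    }
    score = promoter_methylation_score(m)
    print(score, risk_group(score, threshold=-1.0))
    print(survival_probability(baseline_survival=0.6, linear_predictor=0.2))
```

In a system that also uses the clinical variables of claim 5, the linear predictor would combine the promoter methylation score with those variables using coefficients from the pre-established Cox regression model; those coefficients are not given in the claims.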
CN202011555919.9A 2020-12-25 2020-12-25 Prediction system for prognosis survival period of intrahepatic cholangiocellular carcinoma patient Active CN112289450B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011555919.9A CN112289450B (en) 2020-12-25 2020-12-25 Prediction system for prognosis survival period of intrahepatic cholangiocellular carcinoma patient

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011555919.9A CN112289450B (en) 2020-12-25 2020-12-25 Prediction system for prognosis survival period of intrahepatic cholangiocellular carcinoma patient

Publications (2)

Publication Number Publication Date
CN112289450A true CN112289450A (en) 2021-01-29
CN112289450B CN112289450B (en) 2021-05-18

Family

ID=74426344

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011555919.9A Active CN112289450B (en) 2020-12-25 2020-12-25 Prediction system for prognosis survival period of intrahepatic cholangiocellular carcinoma patient

Country Status (1)

Country Link
CN (1) CN112289450B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114155956A (en) * 2021-12-02 2022-03-08 首都医科大学附属北京地坛医院 System for predicting blood vessel invasion probability of primary liver cancer patient incapable of being resected by surgery

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104854247A (en) * 2012-10-12 2015-08-19 新加坡科技研究局 Method of prognosis and stratification of ovarian cancer
WO2015169857A1 (en) * 2014-05-07 2015-11-12 Université Libre de Bruxelles Breast cancer epigenetic markers useful in anthracycline treatment prognosis
CN105063029A (en) * 2014-12-12 2015-11-18 中国人民解放军第二军医大学 Intrahepatic duct cell cancer related gene mutation targets and application thereof
CN107267625A (en) * 2017-07-06 2017-10-20 王冬国 Purposes of the lncRNA as biomarker in liver cancer diagnosis and treatment
CN110004226A (en) * 2019-02-14 2019-07-12 辽宁省肿瘤医院 A kind of method and model application based on carcinoma of the rectum transcript profile gene and methylation Conjoint Analysis prediction prognosis
US20190249258A1 (en) * 2016-05-16 2019-08-15 Dimo Dietrich Method for assessing a prognosis and predicting the response of patients with malignant diseases to immunotherapy
CN111402949A (en) * 2020-04-17 2020-07-10 北京恩瑞尼生物科技股份有限公司 Construction method of unified model for diagnosis, prognosis and recurrence of hepatocellular carcinoma patient

Also Published As

Publication number Publication date
CN112289450B (en) 2021-05-18

Similar Documents

Publication Publication Date Title
AU2013277457B2 (en) Humanized IL-7 rodents
CN101668865B (en) Genetic susceptibility variants associated with cardiovascular disease
KR102291355B1 (en) Identification of patients in need of pd-l1 inhibitor cotherapy
CN101874120B (en) Genetic variants on chr2 and chr16 as markers for use in breast cancer risk assessment, diagnosis, prognosis and treatment
AU2016376191A1 (en) Materials and methods for treatment of amyotrophic lateral sclerosis and/or frontal temporal lobular degeneration
CN107223159A (en) The detection of DNA from particular cell types and correlation technique
DK2155907T3 (en) Genetic variants useful for risk assessment of coronary artery disease and myocardial infarction
AU2016351889A1 (en) Detection of foetal chromosomal aneuploidies using DNA regions that are differentially methylated between the foetus and the pregnant female
CN113853437A (en) Use of adeno-associated viral vectors for correcting gene defects/expressing proteins in hair cells and supporting cells in the inner ear
CA2936612A1 (en) Atf6 polymorphisms associated with myocardial infarction, method of detection and uses thereof
CN101641451A (en) Cancer susceptibility variants on the chr8q24.21
KR20210138587A (en) Combination Gene Targets for Improved Immunotherapy
CN109476698B (en) Gene-based diagnosis of inflammatory bowel disease
CN112280868B (en) Intrahepatic cholangiocellular carcinoma patient prognosis detection biomarker and detection kit
CN112270992B (en) Construction method of intrahepatic cholangiocellular carcinoma patient prognosis evaluation model
KR20180049093A (en) New biomarkers and methods of treatment of cancer
AU2018360287B2 (en) Method for determining the response of a malignant disease to an immunotherapy
CN112289450B (en) Prediction system for prognosis survival period of intrahepatic cholangiocellular carcinoma patient
CA2497597A1 (en) Methods for identifying subjects at risk of melanoma and treatments
JP2001245666A (en) New polypeptide
CN107223162A (en) New RNA biomarkers label for diagnosis of prostate cancer
KR20230074214A (en) Methods of treating fatty liver disease
KR20190126812A (en) Biomarkers for Disease Diagnosis
KR20210095859A (en) Nucleic Acids for Cell Recognition and Integration
CN108770360A (en) To Cancerous disease carry out by stages, the means and method of parting and treatment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20210423

Address after: 314100 Room 501, building A2, No. 555, Chuangye Road, Dayun Town, Jiashan County, Jiaxing City, Zhejiang Province

Applicant after: Zhejiang Gaomei Biotechnology Co.,Ltd.

Address before: 210031 room 1601, block a, gene building, 211 pubin Road, Jiangbei new district, Nanjing City, Jiangsu Province

Applicant before: Jiangsu gaomei Gene Technology Co.,Ltd.

GR01 Patent grant