CN109423515B - Gene markers for liver cancer detection and application thereof - Google Patents

Publication number
CN109423515B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710710566.7A
Other languages
Chinese (zh)
Other versions
CN109423515A (en)
Inventor
刘星
韩峻松
欧莹
杨超
袁箐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SHANGHAI BIOCHIP CO Ltd
Original Assignee
SHANGHAI BIOCHIP CO Ltd

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/118 Prognosis of disease development
    • C12Q2600/158 Expression markers

Abstract

The invention discloses a group of gene markers for liver cancer detection and their application, particularly suited to detecting early liver cancer and/or AFP-negative liver cancer. The group of liver cancer gene markers comprises the EP400 gene, the MAPK1IP1L gene, the NUFIP2 gene, the PHC3 gene, the RPS6KB1 gene and the STX7 gene. The advantages of the invention are as follows: when used for screening and diagnosis of liver cancer, the gene markers offer high sensitivity and strong specificity; detection is convenient because the sample is peripheral blood, the specimen most easily collected in the clinic; and the markers are suitable for early diagnosis of liver cancer and for detecting AFP-negative liver cancer patients, so they have broad application prospects.

Description

Gene markers for liver cancer detection and application thereof
Technical Field
The invention relates to a group of gene markers for liver cancer detection and their application, in particular to the detection of early liver cancer and/or AFP-negative liver cancer. Its fields of application are biological science, biochemistry, biotechnology, medicine and medical technology.
Background
Liver cancer is one of the malignant tumors that most seriously affect public health in China. According to the 2014 annual cancer registry report published by the national cancer registry center, its incidence ranks 4th among all malignant tumors and its mortality ranks 2nd among all cancer deaths, second only to lung cancer. Epidemiological statistics show that the peak age of onset of liver cancer in the Chinese population has shifted from the previous 40-60 years to 30-60 years. Liver cancer is thus trending toward younger onset, and it seriously harms population health, social stability and economic development in every region of China.
The 2009 expert consensus on "standardized diagnosis and treatment of primary liver cancer" states that primary liver cancer is a high-incidence tumor in China and is difficult to treat, so early detection and early diagnosis must be emphasized. Treatment options and prognosis are closely tied to disease stage: patients at an earlier stage can be treated by surgical resection, liver transplantation and similar methods, with a 5-year survival rate of 50-70%; patients at a later stage can be offered hepatic arterial chemotherapy, targeted therapy and similar measures, with a 3-year survival rate of only 10-40%. Early diagnosis of liver cancer is the best way to raise the surgical resection rate, reduce mortality, prolong survival and improve quality of life.
Internationally, liver cancer is screened mainly by combining alpha-fetoprotein (AFP) testing with ultrasound and other imaging. However, ultrasound examination depends on the operator's experience and equipment, other imaging procedures are costly, and early-stage liver cancer often cannot be detected by imaging. AFP has low sensitivity, is difficult to use for early diagnosis, and readily leads to missed diagnosis or misdiagnosis. Delayed diagnosis delays treatment, so that more than two-thirds of liver cancer patients are first treated at a late stage. At present, about 387,600 new liver cancer cases occur in China every year, of which about 116,000 are AFP-negative; these patients form a high-risk group that easily misses the best window for treatment because of missed diagnosis and misdiagnosis.
AFP-negative patients account for about 30% of all liver cancer patients. Small liver cancer (tumor ≤ 3 cm) is particularly common among them, and the tumors are mostly poorly differentiated with a poor prognosis; early diagnosis and early surgical resection can improve the clinical cure rate. In AFP-negative liver cancer patients the serum AFP level is normal (< 20 μg/L) and the clinical symptoms are mild and nonspecific, so clinical diagnosis currently relies mostly on imaging and pathological examination. In particular, AFP-negative liver cancer with a tumor smaller than 3 cm lacks valuable and practical diagnostic markers and is easily misdiagnosed as benign liver disease, which delays treatment. How to diagnose AFP-negative liver cancer therefore deserves attention. Combined detection of other tumor markers is a preferred approach for diagnosing liver cancer, especially AFP-negative liver cancer.
At present, tumor biomarkers are generally obtained from tumor tissue specimens or from peripheral blood. Blood is, in a broad sense, a circulating connective tissue: it contacts all organs of the human body, carries indicators that reflect changes in the body, and participates in disease processes under pathological conditions. Compared with sampling tumor tissue, obtaining markers from the blood circulation is non-invasive and allows repeated sampling and dynamic monitoring, which makes blood an ideal tissue for replacing biopsy of the diseased site in non-invasive molecular detection. Research on novel tumor markers at the transcriptome level, including messenger RNA (mRNA) and microRNA (miRNA), and at the level of nucleic acid mutations, including circulating tumor DNA (ctDNA), is attracting increasing attention. Compared with traditional serum protein markers, nucleic acid markers have clear advantages in stability, detection sensitivity and detection throughput, and they better meet the need for combined multi-marker detection to improve the diagnosis rate. They represent the direction in which tumor marker research, development and detection are moving.
Research shows that the peripheral blood transcriptome can accurately reflect the expression characteristics of disease-specific genes and can be applied to early diagnosis and prognosis of disease. Using a peripheral-blood-specific gene expression profile combined with fluorescent quantitative PCR, British scientists measured the expression of an 8-gene signature in kidney cancer as an early diagnostic index of renal cell carcinoma; the signature was significantly associated with the overall survival and progression-free survival of patients. Compared with miRNA, the expression profiles of tumor-related mRNA have been studied more thoroughly, the detection methods are more mature, and the results are more accurate. In the United States, a gene diagnostic product that gives early warning of intestinal cancer risk based on mRNA levels has been developed and marketed.
Disclosure of Invention
The first technical problem to be solved by the invention is to provide a group of effective novel gene markers for liver cancer detection based on peripheral blood gene expression profiles.
Another technical problem to be solved by the invention is to provide uses of the above gene markers.
To achieve these aims, the invention adopts the following technical solutions:
In a first aspect of the invention, a group of liver cancer gene markers is provided, consisting of the EP400 gene, the MAPK1IP1L gene, the NUFIP2 gene, the PHC3 gene, the RPS6KB1 gene and the STX7 gene. The research underlying the invention found for the first time that these genes are differentially expressed in the peripheral blood of liver cancer patients (especially AFP-negative liver cancer patients).
The EP400 gene (Homo sapiens E1A binding protein p400) is the polynucleotide sequence shown in SEQ ID NO: 1 of the Sequence Listing (NCBI accession No. NM_015409). The MAPK1IP1L gene (Homo sapiens mitogen-activated protein kinase 1 interacting protein 1 like) is the polynucleotide sequence shown in SEQ ID NO: 2 (NCBI accession No. NM_144578). The NUFIP2 gene (Homo sapiens NUFIP2, FMR1 interacting protein 2) is the polynucleotide sequence shown in SEQ ID NO: 3 (NCBI accession No. NM_020772). The PHC3 gene (Homo sapiens polyhomeotic homolog 3) is the polynucleotide sequence shown in SEQ ID NO: 4. The RPS6KB1 gene (Homo sapiens ribosomal protein S6 kinase B1) is the polynucleotide sequence shown in SEQ ID NO: 5 (NCBI accession No. NM_003161). The STX7 gene (Homo sapiens syntaxin 7) is the polynucleotide sequence shown in SEQ ID NO: 6 (NCBI accession No. NM_003569).
The method is based on quantitative analysis of the human whole-genome expression profile. By quantifying gene expression in total RNA from peripheral blood, it compares gene expression between peripheral blood samples from liver cancer patients and from normal persons, and uses a Support Vector Machine (SVM) method to screen out 6 gene signals that are differentially expressed (up-regulated or down-regulated) in the peripheral blood of liver cancer patients. The relative expression of these 6 gene markers in the peripheral blood of a subject can then be measured quantitatively by fluorescent quantitative RT-PCR, gene expression profiling chips or RNA sequencing, in order to estimate the probability that the subject has liver cancer.
In a second aspect of the invention, the use of a reagent for detecting the liver cancer gene markers in the preparation of a product for detecting liver cancer is provided.
Products for detecting liver cancer include, but are not limited to, real-time quantitative PCR kits, RNA sequencing kits and gene chips.
The reagent for detecting the liver cancer gene markers is a primer and/or a probe for detecting the liver cancer markers.
The primers specifically amplify the gene sequences shown in SEQ ID NO: 1 to SEQ ID NO: 6. Preferably, the primers have the sequences shown in SEQ ID NO: 7 to SEQ ID NO: 18 of the Sequence Listing.
The probes bind specifically to the sequences shown in SEQ ID NO: 1 to SEQ ID NO: 6. Preferably, the probes have the sequences shown in SEQ ID NO: 19 to SEQ ID NO: 24 of the Sequence Listing.
In a third aspect of the invention, a kit for detecting liver cancer is provided, which comprises primers and/or probes specific for the sequences shown in SEQ ID NO: 1 to SEQ ID NO: 6.
Preferably, the primer sequences are as shown in SEQ ID NO: 7 to SEQ ID NO: 18, and the probe sequences are as shown in SEQ ID NO: 19 to SEQ ID NO: 24.
The kit also comprises primers and/or a probe for the internal reference gene GAPDH (NM_002046.5, Homo sapiens glyceraldehyde-3-phosphate dehydrogenase). The forward primer sequence is ACCATGAGAAGTATGACAACAGCC (SEQ ID NO: 25 of the Sequence Listing), the reverse primer sequence is CACGATACCAAAGTTGTCATGGA (SEQ ID NO: 26 of the Sequence Listing), and the probe sequence is TCAGCAATGCCTCCTGCACCACC (SEQ ID NO: 27 of the Sequence Listing). GAPDH is an enzyme of glycolysis that is widely distributed in the cells of many tissues. Its gene is a housekeeping gene expressed at a high level in almost all tissues, and it is a commonly used normalization reference in fluorescent quantitative PCR experiments.
The probe sequences carry labels: the 5'-end label is one of FAM, HEX, TET, TAMRA, Cy5, Cy3, VIC, ROX and JOE, and the 3'-end label is one of TAMRA, BHQ, MGB, DABCYL and Eclipse.
With the kit, the expression of the genes shown in SEQ ID NO: 1 to SEQ ID NO: 6 can be detected, and the probability that the subject has liver cancer can then be judged from the up-regulation or down-regulation of gene expression, thereby enabling screening and diagnosis of liver cancer.
All kits described herein can contain suitable packaging and instructions for use in the methods disclosed herein. The kit may further comprise suitable buffers and a polymerase, such as a thermostable polymerase, e.g., Taq polymerase. Such kits may also comprise control primers and/or probes.
The detection kit according to the invention is a nucleic acid detection kit comprising a peripheral blood total RNA extraction reagent and PCR reaction system reagents, where the PCR reaction system comprises dNTPs, MgCl2, Taq DNA polymerase, PCR reaction buffer, primers and probes.
In a fourth aspect of the invention, a method for detecting liver cancer is provided. Determining the presence and/or level of the panel of liver cancer gene markers can comprise determining the presence and/or level of a polynucleotide (e.g., a marker gene transcript) encoding the marker. The method comprises the following steps:
(1) determining the presence and/or level of the liver cancer gene markers in a sample from a subject; and
(2) comparing the presence and/or level of the liver cancer gene markers to a control, wherein a presence and/or level in the sample that differs from the control is indicative of liver cancer.
The sample comprises blood, plasma, serum, urine, platelets, megakaryocytes, or excreta.
In a fifth aspect of the present invention, there is provided a method for treating liver cancer, the method comprising:
(1) diagnosing or detecting liver cancer according to the methods of the invention; and
(2) administering or recommending a therapeutic agent for treating liver cancer.
The invention uses peripheral blood gene markers for liver cancer screening. Used for screening and diagnosis of liver cancer, the markers have high sensitivity and specificity, and because they are measured in peripheral blood, the specimen most easily collected in the clinic, the detection process is simple and convenient. The markers are therefore suitable for diagnosing early liver cancer, in particular AFP-negative liver cancer. The invention improves on the screening approach currently used for liver cancer diagnosis, which combines alpha-fetoprotein (AFP) with ultrasound and other imaging: it reduces missed diagnosis of AFP-negative liver cancer, reduces dependence on operator experience and equipment, and provides a new method for early diagnosis of liver cancer. Developing peripheral blood molecular diagnostic products for liver cancer matches the direction of clinical diagnosis and treatment and meets social needs; it helps to spread liver cancer screening further, raises the detection rate of liver cancer patients and the success rate of clinical treatment, and saves social capital and medical resources.
The advantages of the invention are as follows: when used for screening and diagnosis of liver cancer, the gene markers offer high sensitivity and strong specificity; detection is convenient because the sample is peripheral blood, the specimen most easily collected in the clinic; and the markers are suitable for early diagnosis of liver cancer and for detecting AFP-negative liver cancer patients, so they have broad application prospects.
The invention will be further illustrated with reference to the following specific examples. These examples are intended to illustrate the invention and are not intended to limit the scope of the invention. The experimental procedures, in which specific conditions are not noted in the following examples, are generally carried out according to conventional conditions or according to conditions recommended by the manufacturers. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art. In addition, any methods and materials similar or equivalent to those described herein can be used in the methods of the present invention. The preferred embodiments and materials described herein are intended to be exemplary only.
Drawings
FIG. 1 shows the logarithmic fluorescence intensity values after normalization processing for 100 chips in example 1.
FIG. 2 is a diagram showing the accuracy, sensitivity and specificity of the liver cancer and normal control samples randomly sampled by using the selected liver cancer gene markers in example 3 of the present invention.
Detailed Description
The term "training set" or "control set" refers to a set of samples used to establish a correlation between two variables. For example, a training set is a set of patient samples used to establish correlations between gene expression and the condition of a patient. In the context of the present invention, a training set is a set of samples from liver cancer patients, hepatitis patients, normal persons for which pathologically relevant information including clinical diagnosis, tumor size, serum AFP expression, etc. can be obtained. The training set was used to establish correlations between expression profiles and patient disease classification (including serum AFP expression).
The term "test set" refers to a set of samples used to validate correlations established using a training set. For example, a test set is a set of samples from patients whose gene expression and patient condition are known. This set is used to verify whether the correlations determined using the training set will correctly predict the patient's condition. In the context of the present invention, a test set is a set of samples from liver cancer patients and normal persons for which a patient disease classification (including serum AFP expression) is available. The test set is used to measure gene expression in the profile and test whether the actual patient disease classification matches the prediction made by measuring gene expression.
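The comparison between predicted and actual classifications on a test set is usually summarized as accuracy, sensitivity and specificity, the three quantities plotted in FIG. 2. The following is a minimal sketch of that calculation; the function name and the example labels are hypothetical and are not results from the patent.

```python
def classification_metrics(actual, predicted, positive="liver cancer"):
    """Return accuracy, sensitivity and specificity for paired label lists."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and of equal length")
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)  # true positives
    tn = sum(a != positive and p != positive for a, p in pairs)  # true negatives
    fp = sum(a != positive and p == positive for a, p in pairs)  # false positives
    fn = sum(a == positive and p != positive for a, p in pairs)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }

# Hypothetical labels for illustration only.
actual = ["liver cancer", "liver cancer", "liver cancer", "normal", "normal"]
predicted = ["liver cancer", "liver cancer", "normal", "normal", "liver cancer"]
metrics = classification_metrics(actual, predicted)
```

Sensitivity is the fraction of true liver cancer samples that the model calls positive, and specificity is the fraction of normal samples that it calls negative.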
Example 1: Screening of liver cancer characteristic gene markers
I. Materials and methods
(I) Materials
The inventors commissioned Peking University Shenzhen Hospital to collect a large number of peripheral blood samples from normal persons, liver cancer patients and patients with benign liver disease between November 2012 and 2013 (the research samples were collected, sampled, aliquoted and stored under uniform conditions during the same period). After sorting the sample data, the inventors selected for mRNA detection plasma samples from 81 patients (hepatitis B and liver cancer, the latter including primary liver cancer and intrahepatic bile duct cancer) and plasma samples from 19 normal persons collected at random in the same hospital and during the same period as the experimental group, avoiding as far as possible samples from persons with a family history of liver cancer.
Hepatocellular carcinoma (HCC) samples were included according to the following two criteria.
① Pathological diagnostic criterion: a biopsy or surgically resected tissue specimen of a space-occupying liver lesion or an extrahepatic metastasis is diagnosed as HCC by histopathological and/or cytological examination; this is the gold standard.
② Clinical diagnostic criterion: a clinical diagnosis of HCC can be established when either the two items (1) + (2)a or the three items (1) + (2)b + (3) are satisfied simultaneously.
(1) Evidence of cirrhosis and of HBV and/or HCV infection (HBV and/or HCV antigen positive).
(2) Typical HCC imaging features: contemporaneous multi-row CT and/or dynamic contrast-enhanced MRI shows rapid, heterogeneous vascular enhancement of the liver lesion in the arterial phase (arterial hypervascularity) and rapid washout in the venous or delayed phase.
a. If the lesion diameter is ≥ 2 cm, HCC can be diagnosed when either CT or MRI shows these features.
b. If the lesion diameter is 1-2 cm, both CT and MRI must show these features before HCC can be diagnosed, in order to strengthen the specificity of the diagnosis.
(3) Serum AFP ≥ 400 μg/L for 1 month or ≥ 200 μg/L for 2 months, after excluding other causes of elevated AFP, including pregnancy, germ-line embryonal tumors, active liver disease and secondary liver cancer.
Peripheral blood samples were collected after the hospital ethics committee had approved the study and the patients had signed informed consent.
(II) Methods
1. Extract and purify total RNA from peripheral blood with the PAXgene Blood RNA Kit (QIAGEN), and assess the integrity and yield of the extracted total RNA with an Agilent 2100 BioAnalyzer micro-electrophoresis analyzer;
2. Measure the peripheral blood mRNA gene expression profiles of 54 patients diagnosed with liver cancer (25 AFP-negative and 29 AFP-positive), 9 patients diagnosed with intrahepatic bile duct cancer (AFP-negative), 18 patients diagnosed with hepatitis (12 AFP-negative and 6 AFP-positive) and 19 normal persons with the Affymetrix U133 Plus 2.0 human whole-genome expression profile chip (Affymetrix, Inc., Santa Clara, CA);
3. Chip data preprocessing: first correct the background of the probe intensities with the MAS5 method, then normalize the 100 chips. FIG. 1 shows the logarithmic fluorescence intensity values of the 100 chips after normalization;
4. Remove probe set signals whose expression is too high or too low, and keep for subsequent analysis only the probe sets with moderate expression (fluorescence signal value between 100 and 10000) in all 100 training set samples;
5. Select genes differentially expressed among liver cancer patients, hepatitis patients and normal persons with the t-test, screening probe set signals under the conditions fold change (FC) > 1.2 and significance level P < 0.05; this yields 248 probe sets that meet the requirements;
6. Screen liver cancer characteristic genes from the 248 probe sets with the Support Vector Machine (SVM) method: build a binary classification model following the SFFS (sequential forward floating selection) strategy, resample the training set by LOOCV (leave-one-out cross-validation), classify and predict the resulting sample groups, select the 6 genes with the highest prediction accuracy for liver cancer as candidate liver cancer characteristic genes, and establish a screening diagnosis model for liver cancer versus normal persons.
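Steps 4 to 6 above can be sketched in a few lines of dependency-free code. This is a simplified illustration under stated assumptions, not the inventors' implementation: a nearest-centroid classifier stands in for the SVM, plain greedy forward selection stands in for SFFS (the floating backward step is omitted), the t-test P-value is left out, and all probe names and signal values are hypothetical.

```python
import statistics

def passes_filters(case, control, low=100.0, high=10000.0, min_fc=1.2):
    """Steps 4-5 in part: moderate signal in every sample and fold change > min_fc.
    The t-test (P < 0.05) is left to a statistics package and omitted here."""
    if not all(low <= v <= high for v in list(case) + list(control)):
        return False
    ratio = statistics.mean(case) / statistics.mean(control)
    return max(ratio, 1.0 / ratio) > min_fc

def loocv_accuracy(samples, labels, features):
    """Leave-one-out accuracy of a nearest-centroid classifier (stand-in for the SVM)."""
    correct = 0
    for i, held_out in enumerate(samples):
        best_label, best_dist = None, float("inf")
        for label in sorted(set(labels)):
            # Class centroid computed without the held-out sample.
            rows = [s for j, s in enumerate(samples) if j != i and labels[j] == label]
            centroid = [statistics.mean(r[f] for r in rows) for f in features]
            dist = sum((held_out[f] - c) ** 2 for f, c in zip(features, centroid))
            if dist < best_dist:
                best_label, best_dist = label, dist
        correct += best_label == labels[i]
    return correct / len(samples)

def forward_select(samples, labels, candidates, n_features):
    """Greedy forward selection: add the probe that most improves LOOCV accuracy."""
    selected = []
    while len(selected) < n_features:
        remaining = [c for c in candidates if c not in selected]
        if not remaining:
            break
        best = max(remaining, key=lambda c: loocv_accuracy(samples, labels, selected + [c]))
        selected.append(best)
    return selected

# Hypothetical signal values: three cancer and three normal samples, three probe sets.
labels = ["cancer"] * 3 + ["normal"] * 3
samples = [
    {"probe_a": 900, "probe_b": 500, "probe_c": 300},
    {"probe_a": 950, "probe_b": 700, "probe_c": 310},
    {"probe_a": 1000, "probe_b": 600, "probe_c": 305},
    {"probe_a": 500, "probe_b": 650, "probe_c": 290},
    {"probe_a": 520, "probe_b": 550, "probe_c": 300},
    {"probe_a": 480, "probe_b": 600, "probe_c": 295},
]
probes = ["probe_a", "probe_b", "probe_c"]

def column(probe, label):
    return [s[probe] for s, l in zip(samples, labels) if l == label]

kept = [p for p in probes if passes_filters(column(p, "cancer"), column(p, "normal"))]
selected = forward_select(samples, labels, probes, n_features=1)
```

In the patent's setting the candidate list would be the 248 filtered probe sets and n_features would be 6; a real analysis would use an SVM from an established machine-learning library instead of the centroid classifier.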
II. Results
By analyzing and comparing the gene expression profiles of the different sample types and applying correlation analysis, the characteristic gene expression profile that best distinguishes AFP-negative liver cancer from control samples was screened out. This yielded a group of liver cancer characteristic gene markers, that is, a panel of genes characteristically expressed in AFP-negative liver cancer (gene panel), consisting of the EP400 gene, the MAPK1IP1L gene, the NUFIP2 gene, the PHC3 gene, the RPS6KB1 gene and the STX7 gene.
The AFP-negative liver cancer characteristic markers obtained by the invention consist of the 6 genes shown in Table 1.
TABLE 1 Liver cancer characteristic gene marker sequences
Serial number    Liver cancer characteristic gene    Sequence Listing number
1                EP400 gene                          SEQ ID NO: 1
2                MAPK1IP1L gene                      SEQ ID NO: 2
3                NUFIP2 gene                         SEQ ID NO: 3
4                PHC3 gene                           SEQ ID NO: 4
5                RPS6KB1 gene                        SEQ ID NO: 5
6                STX7 gene                           SEQ ID NO: 6
Example 2: Fluorescent quantitative RT-PCR kit for detecting liver cancer
I. The kit comprises the following components:
1. Specific primers for detecting the 6 liver cancer characteristic gene markers (shown in Table 1):
Reagent 1: EP400 gene forward primer: 5' GGATCTTGTCAGTGACGTTGT 3' (SEQ ID NO: 7)
Reagent 2: EP400 gene reverse primer: 5' GTAGCGATTCCGGCACTGT 3' (SEQ ID NO: 8)
Reagent 3: MAPK1IP1L gene forward primer: 5' ACCTGAGGCAGCTCTTTTGC 3' (SEQ ID NO: 9)
Reagent 4: MAPK1IP1L gene reverse primer: 5' GGATGTGTGTGCTCCTTCTAAAGAA 3' (SEQ ID NO: 10)
Reagent 5: NUFIP2 gene forward primer: 5' TTTCTCTCAAAGGACTACGAGAT 3' (SEQ ID NO: 11)
Reagent 6: NUFIP2 gene reverse primer: 5' AGCAGCCCTCAGGTCAAA 3' (SEQ ID NO: 12)
Reagent 7: PHC3 gene forward primer: 5' CAGTTCTACCAGCGGCAGTA 3' (SEQ ID NO: 13)
Reagent 8: PHC3 gene reverse primer: 5' GGTAGCGGGTGTGAAAATCA 3' (SEQ ID NO: 14)
Reagent 9: RPS6KB1 gene forward primer: 5' CCGAACTCTGGGCCATACA 3' (SEQ ID NO: 15)
Reagent 10: RPS6KB1 gene reverse primer: 5' TTGCAGGATGCTCACACATCTC 3' (SEQ ID NO: 16)
Reagent 11: STX7 gene forward primer: 5' CTCTTTTGTGAGTGAGTGATTGGAA 3' (SEQ ID NO: 17)
Reagent 12: STX7 gene reverse primer: 5' CCCTGCATTGTCCATTCTGTT 3' (SEQ ID NO: 18)
2. Fluorescent quantitative PCR probe sequences for detecting the 6 liver cancer characteristic gene markers (shown in Table 1); the probes for the 6 target genes are labeled with FAM at the 5' end and MGB at the 3' end:
Reagent 13: EP400 gene probe sequence: 5' AACTCCTGTAGCCGAATCTACCGCTC 3' (SEQ ID NO: 19)
Reagent 14: MAPK1IP1L gene probe sequence: 5' TAGCCACCCCCACCCCACTTGC 3' (SEQ ID NO: 20)
Reagent 15: NUFIP2 gene probe sequence: 5' TCAAAATCCTCTGGCCTCTCCTACGAAC 3' (SEQ ID NO: 21)
Reagent 16: PHC3 gene probe sequence: 5' CCCCTACCCTAACGGCAAGCCA 3' (SEQ ID NO: 22)
Reagent 17: RPS6KB1 gene probe sequence: 5' CAAACGGCCAGAGCACCTGCGT 3' (SEQ ID NO: 23)
Reagent 18: STX7 gene probe sequence: 5' ACTCCTGTTGCCAGAATCAGACTGCCCTA 3' (SEQ ID NO: 24)
3. Detection reagents for the internal reference (control) gene GAPDH (SEQ ID NO: 28):
Reagent 19: GAPDH gene forward primer: 5' ACCATGAGAAGTATGACAACAGCC 3' (SEQ ID NO: 25)
Reagent 20: GAPDH gene reverse primer: 5' CACGATACCAAAGTTGTCATGGA 3' (SEQ ID NO: 26)
Reagent 21: GAPDH gene probe sequence: 5' TCAGCAATGCCTCCTGCACCACC 3' (SEQ ID NO: 27). The GAPDH probe is labeled with VIC at the 5' end and MGB at the 3' end.
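Before ordering or pooling oligonucleotides such as those listed above, it is common to run a quick sanity check on each sequence. The sketch below computes length, GC content and the reverse complement for a few of the listed primers; the helper names and the choice of checks are illustrative assumptions and are not part of the kit.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# A subset of the primers listed above, copied from the reagent list.
OLIGOS = {
    "EP400 forward (SEQ ID NO: 7)": "GGATCTTGTCAGTGACGTTGT",
    "EP400 reverse (SEQ ID NO: 8)": "GTAGCGATTCCGGCACTGT",
    "GAPDH forward (SEQ ID NO: 25)": "ACCATGAGAAGTATGACAACAGCC",
    "GAPDH reverse (SEQ ID NO: 26)": "CACGATACCAAAGTTGTCATGGA",
}

def gc_content(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def reverse_complement(seq):
    """Reverse complement, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))

def check_oligo(name, seq):
    """Reject non-ACGT characters and report length and GC fraction."""
    invalid = set(seq.upper()) - set(COMPLEMENT)
    if invalid:
        raise ValueError(f"{name}: unexpected characters {sorted(invalid)}")
    return {"name": name, "length": len(seq), "gc": round(gc_content(seq), 3)}

report = [check_oligo(name, seq) for name, seq in OLIGOS.items()]
```

A check of this kind catches transcription errors such as stray characters in a sequence, but it does not replace specificity checks against the target transcripts.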
II. Method of use
The reagents above are the core reagents of the liver cancer detection kit, and each reagent is packaged separately. Other reagents, such as a total RNA extraction reagent, PCR reaction solution, a Taq enzyme system and packaging materials, can be added on this basis; all are conventional in the field.
The primers and probes are used at a concentration of 10 μM, and the use of the kit is described in Example 3.
Example 3: Screening diagnosis of liver cancer and normal persons using the screened liver cancer characteristic gene markers
I. Method steps
1. Collect a peripheral blood sample from the subject to be tested: collect the patient's peripheral blood with a BD PAXgene RNA blood collection tube (QIAGEN). The sample sources were the same as in Example 1.
2. Extract and purify total RNA from the peripheral blood sample: extract and purify total RNA with the PAXgene Blood RNA Kit (QIAGEN), and assess the integrity and yield of the extracted total RNA with an Agilent 2100 BioAnalyzer micro-electrophoresis analyzer;
3. Reverse transcription reaction: synthesize cDNA with the High-Capacity cDNA Reverse Transcription kit (Life Technologies), using total RNA as the template and random primers as the reverse transcription primers;
Table 2: Reaction system
Reaction component                   Volume (μl)
10× RT Buffer                        2
25× dNTP mix                         0.8
10× RT Random Primers                2
MultiScribe™ Reverse Transcriptase   1
RNase Inhibitor                      1
mRNA                                 Volume containing 1 μg of mRNA
Nuclease-free H2O                    Make up to a total volume of 20 μl
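The two variable rows of Table 2 follow from the RNA stock concentration: the fixed components add up to 6.8 μl, so the RNA and water together must fill the remaining 13.2 μl. The sketch below works this out; the function name and the example concentration are hypothetical.

```python
# Fixed component volumes from Table 2, in microliters.
FIXED_COMPONENTS_UL = {
    "10x RT Buffer": 2.0,
    "25x dNTP mix": 0.8,
    "10x RT Random Primers": 2.0,
    "MultiScribe Reverse Transcriptase": 1.0,
    "RNase Inhibitor": 1.0,
}
TOTAL_VOLUME_UL = 20.0
RNA_INPUT_UG = 1.0

def rt_reaction_volumes(rna_conc_ug_per_ul):
    """Return all component volumes for one 20 ul reverse transcription reaction."""
    if rna_conc_ug_per_ul <= 0:
        raise ValueError("RNA concentration must be positive")
    rna_volume = RNA_INPUT_UG / rna_conc_ug_per_ul
    water = TOTAL_VOLUME_UL - sum(FIXED_COMPONENTS_UL.values()) - rna_volume
    if water < 0:
        raise ValueError("RNA stock is too dilute to fit 1 ug into a 20 ul reaction")
    return {**FIXED_COMPONENTS_UL, "RNA": rna_volume, "Nuclease-free H2O": water}

# Hypothetical RNA stock at 0.2 ug/ul.
volumes = rt_reaction_volumes(0.2)
```

Any stock more dilute than about 0.076 μg/μl (1 μg in 13.2 μl) cannot supply the full 1 μg within the 20 μl reaction, which is why the function raises an error in that case.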
Table 3: Reaction conditions
Setting        Step 1        Step 2         Step 3       Step 4
Temperature    25℃          37℃           85℃         4℃
Time           10 minutes    120 minutes    5 minutes
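The reverse transcription set-up in Table 2 can be sketched as a small volume calculation. This is only an illustrative sketch: the function name and the measured RNA concentration passed to it are assumptions and are not part of the original method. The fixed reagent volumes, the 1 μg input, and the 20 μl total are taken from Table 2.

```python
# Fixed reagent volumes (ul) for one reaction, as listed in Table 2.
FIXED_COMPONENTS_UL = {
    "10x RT Buffer": 2.0,
    "25x dNTP mix": 0.8,
    "10x RT Random Primers": 2.0,
    "MultiScribe Reverse Transcriptase": 1.0,
    "RNase Inhibitor": 1.0,
}
TOTAL_VOLUME_UL = 20.0  # final reaction volume from Table 2
RNA_INPUT_UG = 1.0      # RNA mass per reaction from Table 2


def rt_reaction_volumes(rna_conc_ng_per_ul):
    """Return (rna_volume_ul, water_volume_ul) for one 20 ul reaction.

    rna_conc_ng_per_ul is a hypothetical input: the measured
    concentration of the extracted RNA in ng/ul.
    """
    if rna_conc_ng_per_ul <= 0:
        raise ValueError("RNA concentration must be positive")
    # Volume of RNA solution that contains 1 ug (1000 ng) of RNA.
    rna_volume = RNA_INPUT_UG * 1000.0 / rna_conc_ng_per_ul
    # Volume left for RNA plus water after the fixed reagents.
    available = TOTAL_VOLUME_UL - sum(FIXED_COMPONENTS_UL.values())
    water_volume = available - rna_volume
    if water_volume < 0:
        raise ValueError("RNA is too dilute to fit 1 ug into the reaction")
    return round(rna_volume, 2), round(water_volume, 2)
```

For example, `rt_reaction_volumes(200)` gives the RNA and water volumes for a sample measured at 200 ng/μl; a sample that is too dilute to deliver 1 μg within the remaining volume raises an error instead of returning a negative water volume.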
4. Fluorescent quantitative RT-PCR detection:
Based on the sequences of the 6 liver cancer characteristic genes, namely the EP400, MAPK1IP1L, NUFIP2, PHC3, RPS6KB1, and STX7 genes (SEQ ID NO: 1 to SEQ ID NO: 6), and the internal reference gene GAPDH (SEQ ID NO: 28), the primers of Example 2 (SEQ ID NO: 7 to SEQ ID NO: 18, SEQ ID NO: 25, and SEQ ID NO: 26) and the specific probes (SEQ ID NO: 19 to SEQ ID NO: 24 and SEQ ID NO: 27) are used, with the cDNA obtained by reverse transcription as the amplification template, to carry out a real-time fluorescent quantitative RT-PCR reaction and obtain the relative mRNA content of the 6 gene markers in the peripheral blood sample. Table 4 shows the composition of the PCR reaction premix (Master Mix), in which the forward primer, reverse primer, and probe refer to those of one individual gene among the EP400, MAPK1IP1L, NUFIP2, PHC3, RPS6KB1, and STX7 genes. For fluorescent quantitative PCR detection, the primers and probe of a single target gene are mixed with those of the internal reference GAPDH to form a duplex PCR system, and each target gene is detected once.
Table 4: Reaction system, exemplified by the TaqMan probe method
Component                        ×1 (μl)
PCR reaction premix (2×)         10
Forward primer (10 μM)           1.8
Reverse primer (10 μM)           1.8
Probe (10 μM)                    0.5
GAPDH forward primer (10 μM)     1.8
GAPDH reverse primer (10 μM)     1.8
GAPDH probe (10 μM)              0.5
Blood sample cDNA (20 ng/μl)     1
H2O                              0.8
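The per-reaction volumes in Table 4 add up to 20 μl, and the arithmetic for preparing several duplex reactions at once can be sketched as follows. The function names and the 10% pipetting overage are assumptions made for illustration; the source gives only the single-reaction (×1) volumes and the 10 μM stock concentration.

```python
# Per-reaction volumes (ul) from Table 4, excluding the cDNA template.
MASTER_MIX_UL = {
    "PCR reaction premix (2x)": 10.0,
    "Target forward primer (10 uM)": 1.8,
    "Target reverse primer (10 uM)": 1.8,
    "Target probe (10 uM)": 0.5,
    "GAPDH forward primer (10 uM)": 1.8,
    "GAPDH reverse primer (10 uM)": 1.8,
    "GAPDH probe (10 uM)": 0.5,
    "H2O": 0.8,
}
TEMPLATE_UL = 1.0         # blood sample cDNA (20 ng/ul), added per well
REACTION_VOLUME_UL = 20.0


def scale_master_mix(n_reactions, overage=0.10):
    """Scale the premix for n reactions.

    overage is an assumed allowance for pipetting loss; it is not
    specified in the source.
    """
    if n_reactions < 1:
        raise ValueError("at least one reaction is required")
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in MASTER_MIX_UL.items()}


def final_concentration_um(volume_ul, stock_um=10.0):
    """Final concentration (uM) of a primer or probe in the 20 ul reaction."""
    return volume_ul * stock_um / REACTION_VOLUME_UL
```

With these volumes, each primer is present at `final_concentration_um(1.8)` and each probe at `final_concentration_um(0.5)` in the final reaction. The cDNA template is kept out of the scaled mix because it differs from sample to sample.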
Table 5: reaction conditions
Figure BDA0001382552300000121
5. Diagnosis of the subject who provided the sample: whether the subject has liver cancer is judged from the relative mRNA content of the 6 gene markers in the peripheral blood sample, as measured by fluorescent quantitative RT-PCR.
Second, analyzing the results
After the reaction is finished, save the detection data file. Adjust the Start value, End value, and Threshold value of the Baseline according to the analyzed image (the user may adjust these according to the actual situation: the Start value can be 1-10, the End value can be 5-20, and the Threshold value can be selected in the range of 5K-50K) so that the standard curve under the "Standard curve" window is optimized, that is, the R2 value (correlation value) is 0.97. Finally, under the Report window, record the copy number of the unknown E2A-PBX1 specimen (E2A-PBX1-Qty) calculated automatically by the instrument, and export the result. Reset the 4 positive quantitative reference standards for TBP in the corresponding order, and export the copy number of the unknown TBP specimen (TBP-Qty) by the same steps.
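The standard-curve check described above is performed by the instrument software, but the underlying calculation can be sketched as a linear regression of Ct on log10(copy number). This is an illustrative sketch only: the function names are hypothetical, no standard-curve data are given in the source, and treating 0.97 as a minimum acceptable R2 is an assumption.

```python
import math

R2_TARGET = 0.97  # R2 value quoted in the text; used here as a minimum (assumption)


def fit_standard_curve(copies, cts):
    """Least-squares fit of Ct against log10(copy number).

    Returns (slope, intercept, r_squared).
    """
    if len(copies) != len(cts) or len(copies) < 2:
        raise ValueError("need at least two matched standards")
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in cts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


def copies_from_ct(ct, slope, intercept):
    """Interpolate the copy number of an unknown specimen from its Ct."""
    return 10 ** ((ct - intercept) / slope)


def curve_is_acceptable(r_squared):
    """True when the fitted curve meets the assumed R2 criterion."""
    return r_squared >= R2_TARGET
```

In use, the four positive quantitative reference standards would supply the `copies` and `cts` lists, and `copies_from_ct` would then give the quantity (Qty) of an unknown specimen from its measured Ct.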
Third, judging the result
If the amplification curve is not S-shaped or the Ct value is blank, the expression level of the detected gene is judged to be below the detection limit.
If the amplification curve is S-shaped and the Ct value of the sample is < 37, the expression of each target gene relative to the internal reference (ΔCt) is calculated as

$$\Delta Ct = Ct_{\text{target gene}} - Ct_{\text{internal reference gene}}$$

and the logistic regression value of the sample is then calculated with the following formula:

$$\text{Logit} = 4.2715\,\Delta Ct_{\text{MAPK1IP1L}} + 1.1987\,\Delta Ct_{\text{RPS6KB1}} - 1.2091\,\Delta Ct_{\text{NUFIP2}} + 0.2748\,\Delta Ct_{\text{PHC3}} + 1.9441\,\Delta Ct_{\text{STX7}} + 0.4728\,\Delta Ct_{\text{EP400}} + 18.1326$$
When the Logit value of the sample to be tested is > 2.30873, the sample is judged to be liver cancer; when the Logit value is < -1.67951, the sample is judged to be normal; when the Logit value is not less than -1.67951 and not more than 2.30873, the method cannot be used to judge the nature of the sample.
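The judgment rule above can be sketched in a few lines. The coefficients, intercept, Ct cutoff, and thresholds are transcribed from the text; the function names are hypothetical. Note one assumption: the "=" after "Logit" is missing in the text as extracted, so the sign of the MAPK1IP1L coefficient is taken here exactly as printed (positive) and should be checked against the original publication.

```python
# Coefficients transcribed from the logistic regression formula in the text.
# The sign of the MAPK1IP1L term is taken as printed (assumption, see above).
COEFFICIENTS = {
    "MAPK1IP1L": 4.2715,
    "RPS6KB1": 1.1987,
    "NUFIP2": -1.2091,
    "PHC3": 0.2748,
    "STX7": 1.9441,
    "EP400": 0.4728,
}
INTERCEPT = 18.1326
UPPER_THRESHOLD = 2.30873   # above this: liver cancer
LOWER_THRESHOLD = -1.67951  # below this: normal
CT_CUTOFF = 37.0


def is_quantifiable(ct):
    """A Ct is usable only if it is present and below the cutoff of 37."""
    return ct is not None and ct < CT_CUTOFF


def delta_ct(ct_target, ct_reference):
    """Expression of a target gene relative to the internal reference (GAPDH)."""
    return ct_target - ct_reference


def logit_score(delta_cts):
    """Logit value from a dict mapping gene name to its delta Ct."""
    return INTERCEPT + sum(COEFFICIENTS[g] * delta_cts[g] for g in COEFFICIENTS)


def classify(logit):
    """Apply the two thresholds; values in between are indeterminate."""
    if logit > UPPER_THRESHOLD:
        return "liver cancer"
    if logit < LOWER_THRESHOLD:
        return "normal"
    return "indeterminate"
```

A caller would compute `delta_ct` for each of the six genes from the duplex reactions, pass the resulting dictionary to `logit_score`, and hand the score to `classify`. Samples with a missing or non-quantifiable Ct are outside this sketch, as the text does not define how they enter the formula.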
The relative expression levels of the 6 liver cancer gene markers in the peripheral blood of 53 patients with confirmed liver cancer (24 AFP-negative and 29 AFP-positive) and 19 normal subjects were measured by fluorescent quantitative RT-PCR. The samples (comprising a training set and a test set) were classified with the liver cancer screening diagnosis model, and the predictions were compared with the clinical diagnosis of liver cancer to obtain the sensitivity, specificity, and accuracy of the 6 liver cancer gene markers of the invention for the screening diagnosis of liver cancer (including AFP-negative liver cancer). The detailed results are shown in Figure 2 and Table 6.
Table 6: Sensitivity, specificity, and accuracy for discriminating liver cancer from normal subjects
Figure BDA0001382552300000131
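The three metrics reported in Table 6 follow the usual definitions from a comparison of predicted and clinically confirmed status. A minimal sketch is given below; the function names are hypothetical, and the counts in the usage note are invented for illustration and are not the results of this study. How indeterminate samples are counted is not stated in the text, so they are left out of this sketch.

```python
def sensitivity(tp, fn):
    """Fraction of confirmed liver cancer samples predicted as liver cancer."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of normal samples predicted as normal."""
    return tn / (tn + fp)


def accuracy(tp, tn, fp, fn):
    """Fraction of all classified samples that were predicted correctly."""
    return (tp + tn) / (tp + tn + fp + fn)
```

For instance, with illustrative counts of 8 true positives, 2 false negatives, 9 true negatives, and 1 false positive, the three functions would be called as `sensitivity(8, 2)`, `specificity(9, 1)`, and `accuracy(8, 9, 1, 2)`.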
The above embodiments merely illustrate implementations of the present invention. Their description is relatively specific and detailed, but it should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art may make several variations and modifications without departing from the inventive concept, all of which fall within the scope of protection of the present invention.
Sequence listing
<110> Shanghai biochip Co., Ltd
<120> a group of gene markers for liver cancer detection and application thereof
<130>
<160> 28
<170> PatentIn version 3.5
<210> 1
<211> 12317
<212> DNA
<213> genus, species, Homo sapiens (human)
<400> 1
gtagcagccg cgccgccgct tcctcccgcc ggggccccgg atgcactgag cggctgcggc 60
gcggcttcca tcctcccgcc ctcctgacgc ggccggagcg cagccctgag gcccagggag 120
aacgacacat tggatacaga agggaggtga tcatgcacca tggcactggc ccccagaacg 180
tccagcatca gctgcagagg tccagggcct gccctggcag cgagggtgag gagcagccgg 240
cccaccccaa cccacccccg tcccccgcag ctcccttcgc tccctcagca agcccgtcgg 300
caccccagtc tcccagttat caaatacagc agctgatgaa taggagccct gcaaccgggc 360
agaacgtgaa catcaccctg cagagcgtgg gccctgtcgt cgggggaaac cagcagatca 420
cactggcccc actgccgctc cccagcccca cctctccagg cttccagttc agcgctcagc 480
ctcggcggtt tgagcatggg tctccatcat acattcaggt cacgtccccc ttgtcccagc 540
aggtccagac ccagagtccc acgcagccca gtccggggcc ggggcaggcc ttgcagaatg 600
tgcgtgcagg tgcccctggc cctgggctgg gcctctgcag cagcagccct acagggggct 660
tcgtggatgc cagcgtgctg gtgaggcaga tcagcttgag cccctccagt ggtggacact 720
ttgtgtttca ggatgggtca gggctcaccc agatcgccca gggagcccag gttcagctcc 780
agcacccggg tacgcccatc acagtccgag agcggagacc ctcccagccc cacacacagt 840
cagggggcac catccaccac ctgggacccc agagccctgc agccgcgggt ggggccggcc 900
tgcagcccct ggccagccca agccacatca ccacggctaa cttgccaccg cagatcagca 960
gcatcatcca gggccagctg gttcagcagc agcaggtgct gcaggggccg ccgctgcccc 1020
ggcccctggg cttcgagagg acacccggcg tgctgctccc cggggctggg ggcgcagcgg 1080
ggtttgggat gacgtcccca cccccgccca ccagcccttc caggactgcc gtgcccccag 1140
gcctttccag cctcccactc acgtctgtgg ggaacacggg aatgaagaag gttcccaaga 1200
agttagagga gattccccca gcctctccgg agatggcaca gatgaggaag cagtgcctgg 1260
actatcatta ccaggagatg caggctctga aggaggtctt caaggagtat ttgattgaac 1320
tgtttttctt gcaacacttt caagggaaca tgatggattt cttagctttc aagaagaaac 1380
attatgcccc attacaagca tatcttaggc agaatgattt ggacattgaa gaagaggagg 1440
aggaggagga agaggaggaa gaaaaatctg aggttatcaa tgacgagcag caagccctcg 1500
cagggagcct ggtagcaggg gccggaagca cagtagagac ggacctgttt aagaggcagc 1560
aggcgatgcc ctccacaggt atggcagagc agtctaagag gcctcgcctt gaagtgggtc 1620
accaaggggt agttttccag cacccagggg cggacgcagg cgttcctctc cagcaactaa 1680
tgccgaccgc acaaggagga atgcccccca cgccgcaggc cgcgcagctc gctggacaga 1740
ggcagagtca gcagcagtat gacccctcca cggggcctcc cgtgcagaac gctgccagct 1800
tgcacacccc actgccgcag ctgcccggga ggctgccccc agccggtgtt cccactgcag 1860
ccctctcctc tgcgctgcag tttgcacagc agccgcaagt ggtagaggcc cagacacagc 1920
tccaaatccc ggtgaagact cagcagccca atgttcccat ccctgcaccg cccagcagcc 1980
aactccccat ccctccctcg cagcctgcac agctggccct ccacgttccc acacctggaa 2040
aggtgcaggt gcaggcctct cagctttcct ccctgccaca gatggtagca tcgacaaggc 2100
tccctgtgga ccctgccccg ccctgcccac ggcctctgcc cacctcttct acctcgtccc 2160
tcgcgcctgt gagtggctcc ggcccaggac cctcccctgc tcgatcctct ccagtaaata 2220
gaccttcctc agccaccaat aaggcactat ctccagtcac ttcccggacc ccaggggtgg 2280
tggcatctgc ccccaccaaa ccacagagtc ctgctcagaa tgccacctcg tcccaagaca 2340
gttctcagga tacgctgaca gaacaaataa ctctggagaa ccaggtgcat cagcgcattg 2400
cggagctgag gaaagcaggt ctgtggtccc agaggcgtct gccaaagctg caggaggccc 2460
cacgccccaa gtcccactgg gactatctgc tggaggagat gcagtggatg gccacagact 2520
ttgcccagga gaggaggtgg aaggtggctg ctgcgaagaa gctcgttaga actgtggtgc 2580
gccatcacga ggagaagcag ctccgtgaag aaagggggaa gaaggaagag cagagcagac 2640
tgaggcggat agccgcctcc acggcccggg agatagagtg cttttggtcg aatattgaac 2700
aggttgtgga aataaaacta cgagtagaat tagaagaaaa aaggaagaag gccttaaatt 2760
tacagaaagt ttccaggaga gggaaagaat tgagacctaa aggatttgac gcattacagg 2820
aaagttctct ggattcagga atgtctggaa gaaaaagaaa agctagcata tctttgactg 2880
atgacgaagt ggacgatgaa gaggaaacaa ttgaagagga ggaagcaaat gaaggcgttg 2940
tggaccacca aacagaactt tctaatttag ccaaggaagc tgagctgccc ctcctggacc 3000
tgatgaagct gtacgaaggc gccttcctgc cgagttctca gtggccccgg ccgaagcctg 3060
atggggagga cacaagcgga gaggaagatg cagatgactg tccaggcgac agggagagtc 3120
gcaaggactt ggttctcatc gactcgcttt tcatcatgga tcagttcaaa gctgccgaga 3180
ggatgaatat cgggaagcca aacgccaagg acattgcgga cgtcactgcg gtggctgaag 3240
ccatcctgcc gaagggcagt gctcgggtca caacctcggt caagtttaat gctccatctt 3300
tgttgtatgg ggctctcaga gattatcaga agattggcct ggactggctg gccaaacttt 3360
acaggaagaa tctcaatggc atattggcag atgaagctgg gctgggtaaa acagtgcaga 3420
tcattgcttt ttttgcccac ctagcttgta acgaaggtaa ttggggcccc catcttgttg 3480
ttgtgagaag ttgtaacata ctcaagtggg agcttgaatt gaaacgttgg tgtcccggac 3540
tcaaaatcct ctcatatatt ggcagccaca gagaactcaa agcaaagaga caggagtggg 3600
ccgaacccaa cagcttccac gtctgcatca cgtcctacac tcagttcttc cggggcctca 3660
ccgccttcac acgagtgcgc tggaagtgcc tggtcattga tgagatgcag cgcgtgaagg 3720
gcatgaccga gaggcactgg gaagcggttt tcaccctgca gagccaacaa cgtctgcttc 3780
tgatcgactc gccgctgcac aataccttcc tggagctctg gaccatggtg cacttcctgg 3840
tcccagggat ctccaggccc tacctgagct cccctctgag ggcccccagt gaagagagcc 3900
aggattacta ccataaagtg gtcataaggt tacacagggt gacacagcca tttattttga 3960
ggagaactaa gagagatgtg gaaaagcaac taacaaagaa atatgagcat gttttgaagt 4020
gtcgcctttc taaccgacaa aaagccttat acgaggacgt tatcctgcaa cctggcactc 4080
aggaggcctt gaagagcggg cactttgtca acgtcctgag catccttgtg cggctgcagc 4140
gcatctgcaa ccaccctggg ctcgtcgagc cccggcaccc aggctcttcc tacgtggcgg 4200
ggccactgga gtatccgtcc gcatctctaa tcctgaaggc actggagaga gatttctgga 4260
aggaagcaga tctttctatg tttgatctca tcggcttaga aaataaaatc actcgtcacg 4320
aggcagagtt gctgtctaag aaaaagatac cgcggaaact catggaggaa atctccactt 4380
cagcagcccc agcagcccga ccagcagcag caaagctgaa ggccagcagg ttgtttcagc 4440
ctgtgcagta tggccagaag cccgagggtc gcaccgtggc tttccccagc actcacccgc 4500
cccggacggc agcccccacc acggcctctg ctgctccaca gggcccgctt cgaggacggc 4560
cgcccatcgc cacgttctct gccaatccgg aggcaaaagc agcagcagcc ccgtttcaga 4620
cctctcaggc ttccgccagt gctccacgac accagcccgc ctcggcctcc agcacagccg 4680
ctagcccggc ccatcctgcg aaactgcggg cccagaccac agcacaggcc tccaccccag 4740
gccagccccc gccccagccc caggccccct cgcacgcggc cgggcagagc gcgctgcctc 4800
agaggctggt gctcccctcg caggcccagg cccgcttgcc cagtggagag gtagtgaaaa 4860
tagctcagct ggcatccatc acaggaccac agagccgcgt ggctcagcca gagacgccgg 4920
tgacactgca gttccagggc agcaagttca ccctgtcaca cagccagctc cggcagctca 4980
cagcgggcca gccgctgcag ctgcaaggca gcgtcctcca gatcgtgtcc gcccccgggc 5040
agccctacct tcgagcccct ggccctgtgg tgatgcagac cgtgtctcag gcgggcgctg 5100
tgcacggcgc cctgggaagc aagcccccgg ccggcggtcc cagccctgca cccttgaccc 5160
cacaagttgg cgttccgggc cgcgtggcgg tgaatgcctt ggctgtagga gaacccggaa 5220
cggcctccaa accagcttct cccattggag ggccgaccca ggaggaaaag accagactct 5280
tgaaagagcg cctggatcag atttatttag tcaacgagcg gcgctgttct caagctccag 5340
tctatggcag agacttgcta aggatttgtg ccctgcctag ccatggaagg gtacagtggc 5400
gtgggtccct ggatggccgt cgtgggaagg aggccgggcc agcgcacagt tacacttcat 5460
cctcagaaag tccaagtgag ctgatgttga cgctttgtcg gtgtggagag tctctgcagg 5520
atgttattga cagggtggcc tttgtgattc ctccggtggt ggcagcaccc ccgtccctac 5580
gggtgccgcg gccgccaccc ctgtacagcc acagaatgag gatcttgagg cagggcctga 5640
gagagcacgc tgcgccgtac ttccagcagc tgcggcagac cacggctcca cgcctgctgc 5700
agttccctga gctgaggctg gtgcagttcg actcagggaa gttggaagct ttagctatct 5760
tgcttcagaa attgaaatct gaaggacgtc gggtgctgat tttatcacag atgattctta 5820
tgttggacat tttagagatg ttcttgaact tccattacct cacctatgta agaatcgatg 5880
aaaatgccag cagtgagcaa cggcaggaac tgatgaggag tttcaacaga gacaggcgga 5940
ttttttgtgc cattctctcc actcacagcc gtaccacagg tataaacctt gtagaggcgg 6000
acaccgtcgt gttttatgac aatgacctga atccagtgat ggatgccaaa gctcaggagt 6060
ggtgcgatag gatcgggaga tgcaaagaca tccacatata caggcttgtg agtggcaatt 6120
ccattgaaga gaaattgttg aaaaatggaa ctaaagatct gatccgagaa gtggctgctc 6180
agggaaatga ctactccatg gctttcttaa ctcagcgaac catccaggag ctgtttgaag 6240
tttattctcc catggatgat gctggcttcc cggtcaaagc tgaggagttt gtggtgcttt 6300
ctcaggaacc ttctgtcacg gaaaccattg cacccaaaat tgcaagacct ttcatagagg 6360
ccctcaagag tattgagtat ctggaggagg atgcccagaa gtccgcacag gagggggtgc 6420
tgggaccaca cactgatgct ctgtcatcag actctgagaa catgccgtgt gatgaagaac 6480
catcccaatt agaggagcta gctgacttca tggagcagct tacaccaatt gaaaaatatg 6540
ctttaaatta cctggaatta ttccatactt ctattgagca agaaaaggag agaaacagtg 6600
aggacgcagt gatgactgca gtgagggcat gggagttctg gaacctgaag accctgcagg 6660
agagggaggc ccggctgcgg ctggagcagg aggaggcgga gctcctgacc tacacgcgag 6720
aggatgccta cagcatggag tatgtctacg aagatgtcga tgggcagaca gaagtcatgc 6780
cgctctggac cccacccacc ccgccgcagg acgacagcga catctacctc gactcggtca 6840
tgtgtctcat gtatgaagcc actcccatcc cagaggctaa gctgccccct gtgtacgtga 6900
ggaaggagcg gaagcgacac aaaacagacc cctcagctgc aggcaggaag aagaagcagc 6960
gtcacgggga ggcggtcgtc cctcctcggt ccctgtttga ccgcgcaaca ccaggacttc 7020
tgaaaattcg cagagagggc aaggagcaga agaagaatat tctgctgaag cagcaggtgc 7080
cattcgccaa gcccctgcca acttttgcca aacccacagc tgagcctggt caagacaacc 7140
ccgagtggct catcagtgag gactgggcgc tgctgcaggc tgtaaagcag ttactggagc 7200
tgcctttgaa cctcacaatc gtgtcacctg ctcacacacc taattgggat cttgtcagtg 7260
acgttgttaa ctcctgtagc cgaatctacc gctcttccaa acagtgccgg aatcgctacg 7320
agaatgtcat cattccacga gaggagggga agagtaaaaa caaccgtcct ctccgtacga 7380
gccagatcta tgcccaggat gagaatgcca cacacaccca gctgtacacg agccactttg 7440
acttaatgaa aatgactgct ggcaagagga gtcccccaat caaacctctg cttggcatga 7500
atccctttca gaagaacccc aagcacgcgt ctgtgttggc agaaagtgga atcaactatg 7560
acaagccgct gcctcccatc caggtggcat ctctccgtgc agagcgaatc gcaaaagaga 7620
aaaaggctct ggctgatcag cagaaggcac agcagccggc cgtggcccag ccacccccgc 7680
cccagccgca gcccccacca cccccgcagc agccaccgcc accgctgcca caaccacagg 7740
cagcgggcag ccagccgcca gcagggccac cagctgtcca gccccaaccc cagccacagc 7800
cccagaccca gccacagcct gtgcaggccc cagcgaaggc gcagcccgca atcacgacgg 7860
ggggcagtgc agccgtactg gcaggaacca ttaaaacatc agttactggg acgagcatgc 7920
ccactggtgc cgtgagtgga aatgtgatcg tgaacaccat cgcaggggtc ccagctgcca 7980
ccttccagtc catcaacaag cgcctggcgt cgccagtggc tcctggggcc ttgactacgc 8040
cgggaggctc tgctcccgcc caggtggtgc acacccagcc cccgccacgg gcagtcggct 8100
ccccagccac ggcgacccct gacctggtgt ccatggcaac gactcagggt gttcgagcgg 8160
tcacttctgt gacagcctcg gccgtggtca ctaccaacct gaccccagtg cagaccccgg 8220
cacggtcttt ggtgccccaa gtgtcccaag ccacaggagt tcagctccct ggaaaaacca 8280
tcacacctgc acatttccag cttctcaggc agcagcagca gcagcagcaa caacagcagc 8340
agcagcagca gcagcagcag cagcagcagc agcagcaaca gcagcagcag caacagacga 8400
cgacgacctc tcaggtgcaa gttccacaga tccagggcca ggcccagtcc ccagcacaga 8460
tcaaagctgt gggcaagctg acgccggaac acctcatcaa aatgcagaag cagaaactgc 8520
agatgccccc gcagccccca ccgccacagg cccagtctgc gcccccgcag ccaacagccc 8580
aagtgcaagt gcagacctcg cagccgccgc agcagcagag cccccagctc acgacggtca 8640
cggccccaag gcctggtgcc ctgctgacgg gcaccaccgt ggccaacctc caggtggccc 8700
ggctcacccg ggttcccact tctcagctgc aggcgcaagg gcagatgcag acccaggcac 8760
cccagccagc ccaggtggcc ttggcgaagc ctccggtggt gtccgtcccg gcagctgtgg 8820
tctcctcacc gggagtcacc accctgccca tgaacgtcgc ggggatcagc gtggcgatcg 8880
gtcagccaca gaaggcagca ggacagaccg tggtggccca gcccgtgcac atgcagcagc 8940
tgctgaagct gaagcagcag gccgtccagc agcagaaggc catccagccc caggctgcac 9000
agggcccggc agccgtccag cagaagatca ccgcacagca gatcaccacc cctggcgcgc 9060
agcagaaggt tgcctacgcc gcgcagccgg cccttaagac ccagtttctt accacaccca 9120
tctcccaggc ccagaaactg gccggggccc agcaagtgca gacccagatc caggttgcaa 9180
aacttcctca agttgttcaa cagcaaacac ccgtggccag catccagcaa gttgcctctg 9240
cttcccagca ggcttctcca cagactgtgg cgctcacgca ggcgacggcg gccgggcagc 9300
aggtgcagat gatccctgca gtgaccgcga ctgcccaggt ggttcagcag aaactcattc 9360
agcagcaggt ggtgaccacg gcgtcggccc cgctccagac tccaggcgct cccaacccag 9420
cccaggtgcc cgccagctcc gacagcccaa gccagcagcc caagttacag atgagggtcc 9480
ctgctgtcag gctaaagaca cctactaagc ctccgtgcca gtagtcaggg cagcagggct 9540
gcctctcatc taaagcaaaa ctaccttcct cacagaaaac gctttattag tgaaccttgg 9600
gaccatgtca cgcaagagat tcagcactgg gaaagatata attgaaacaa aatagtgtaa 9660
tcattttatt aaaatgcatc ccacactgca ggacaaatgg tccttatgga gtgccgcgtt 9720
ctctgtacta cgtggctcat ggaaaaagtg acaacatggc ttcctctaaa tcatttcacc 9780
tttcagtccc cacccgcacc cgtcccctag agccatagta ctgtgttctg aaagccattt 9840
agaatttctt tgtgagcatg tagtgctttg cacgccacag aagccgtctg ccgtgtgtga 9900
ggagcataca atggactttc taaagataag gcgtgggctt ccacagtgtc tgccagagtt 9960
tagttcttta taccttactg aaaaatgcct cgtggtcttc gcagagggga aggcctgtct 10020
aaagtcaatc atccgagatg ggttttccat tccaaagaaa ggcaatatgg ttccttcctt 10080
ccctcctaaa atatgactta acttttaaga gaaatgttct gacacccacc taaacacaca 10140
aggcacgttc ctggcctgtg ttcaagggaa atgatcagtc attgcattgt tattccaaag 10200
agcagccaac agtggcctcc cccaggccct accctgcaat gggattcgct ttcatttaat 10260
ggaaacttct gggactgatg cccaactcag tgcactcaag acgcatctcc agttttcggg 10320
ggaagctggt atttgacata gtgtgttaaa cagctcctga gaacctttgg gacactctgc 10380
catggctggc gtgaggccca gaggaccacg cagaggcaat ggtagtacag atgtcacagc 10440
tgagggtacg atgaggcctg ggctcagtga gccaggacga atgtgacaga caccccttgc 10500
tgccacagtc agccctttga cgaaggtggg ctggtgattc tggaagtatt ggctatagcg 10560
gtgggcccag tcaactcttc cttgtggact tacgacagca gattttctct aggataagct 10620
tgtgtggttc tgccagtgaa gcagagaacc acctgtgctg ttgtggaagg cgtgccgttg 10680
agggggaaaa cgaagcccag tatttgctac tgtttttcct ttttttacta tgacaggaaa 10740
ataaatgcaa ttttagtgga attgattgac agtgtctcct tactttgaag ttttcaccaa 10800
agcaaaaagg tccatatcca atagtatcct ttgtgctgtg gcttgatttt ggcctatttt 10860
acattatttg gtccaggaaa ttaggttata ttaggttttt tgtatactaa aaatcagtta 10920
tggcacaata aagattttct gtttttaaat tgtatttcat ctgcttcctc cccattctct 10980
cactttaagt gacattgagg aaggtattct gtcccacagg tttctgtgga cagcgataca 11040
gcaggagtca gtgaaatcaa ctggggagct cacttgagct cttgataaga aatgtggaga 11100
aaagtaaaaa ccaagctttg aagaaacaga agaaattaat cttttagtta gttgaacata 11160
ccaaagcaga ggactggaat ctgtttgttc taaccaaccc gttctccctg gcttggcacg 11220
tgccgtgaga gcgcagcttg ccggagggag ggccgctgtg tgcgcctcac atctggctcc 11280
cagtggaaac ttttactcct cctcatccgc agatgtgata gaactgaagt atctaggaat 11340
tctgcctttg tcatttgttt taatttgtgt gccctgttca ttttttttgt ctttcccaaa 11400
tcttggtagt ctccttatag ttgaagataa aatgttgagt gcacttattt tagaatatcc 11460
tagacataac tgtctaagta aaagcgctct attaatctaa aacactacaa gagaatttaa 11520
caccatctct caaatgcttt tttggagagc ttaatgggat tctgaatatt tgcaatgtgg 11580
agtttccgcc ccgatctcac gtcagtgagg gtctcctgtc tctcaagtgt gtttcctttg 11640
gctgttccct aatacaaaac acggacatat ttttactcgt agcactcaat ttagtaactt 11700
ctagatgcta ccgttgacct gagttaaatt catttagtcg tgtacgtaaa aactctcctt 11760
ttagtgtgtt attttcttgg ccttcccttt taaaggttaa agtttctaac ctaagaatta 11820
agtacgcgtt caggaagctg ttgtctaggc cttccccttg tgaatctggg ttcattccaa 11880
tacggcaagt aagagttgga aactttgaga acacagacta taaaggcagc agcccgaaca 11940
ctgtcagact ctaattggcg accctgggaa acagttgccc tgctattctt taaagaaaga 12000
cgtttattct gatgataaaa acagttagcc agactgtttt taaagcacct ggcgggaagc 12060
agaaggttgg atccaagccc ttgttcagat ttggtgcctg ataagacagg ggtttctctt 12120
tttgtgacct ttattattat tattttgtta actgttgtaa ccagttagct gttgtgtttt 12180
aagatagaaa ggaacaagac taaaattgta aatactttgt aaacatcagc atttgtactt 12240
gaatagtagg attttaaagg gcattgatag cataccaaac aaaaggcaaa ataaagtgac 12300
ctttttatat atttttt 12317
<210> 2
<211> 6469
<212> DNA
<213> genus, species, Homo sapiens (human)
<400> 2
ggcgcttcct gttccggcgc caggaggagc cgcgcgctgc tggtgctgtt gccgccgctg 60
ctctagctgc cgtcagtcag gctgcgcccg cgtcttcagg gcccagtccc tcggacccat 120
cgccgcttct agaccctact gcggtctcgg atattgccgg gaaaatgtct gatgaatttt 180
cgttggcaga tgcactacct gaacactccc ctgccaaaac ctctgctgtg agcaatacaa 240
aacctggcca acctcctcaa ggctggccag gctccaaccc ttggaataat ccgagtgctc 300
catcttcagt gccatctgga ctcccaccaa gtgcaacacc ctccactgtg ccttttggac 360
cagcaccaac aggaatgtat ccctccgtgc ctcccaccgg accacctcca ggacccccag 420
caccctttcc tccttccgga ccatcatgtc ccccacctgg tggtccttat ccagccccaa 480
ctgtgccggg ccctggcccc acagggccat atcctacacc aaatatgccc tttccagagc 540
tacccagacc atatggtgca cccacagatc cagctgcagc tggtccttta ggtccatggg 600
gatccatgtc ttctggacct tgggcgccag gaatgggagg gcagtatcct acccctaata 660
tgccatatcc atctccaggc ccatatcccg ctcctcctcc tccccaagcc cctggggcag 720
caccacctgt tccatggggc accgttccac caggagcctg gggaccacca gcaccatatc 780
ctgcccctac aggatcgtat cccacaccag gactctatcc tactcccagt aatcctttcc 840
aagtgccttc aggaccttct ggtgctccac caatgcctgg tggcccccat tcttaccatt 900
aagttaacaa tggacgaaga gatgacgctt tgctttttga agtacatgta tatgcacatg 960
aatgcatata taaaaattgc tggtttcact attagagggc attcatgaaa gaacaactct 1020
tgcacctctc agagaagata actgcctctt gtacttggat gcgtagtaca tcatatgtat 1080
acaatcagat aaaagcatag aagtaaatca ttcggatgtg atttttattt ggttttcatg 1140
gaaagttaaa gtgataaagt atattgaata gttctttgac agaatttgtt taaactatga 1200
aactacacac ttaaaaatct aagatgtgga ttattgttag aatctgcaac ttcattggca 1260
aattatttca agtatttttc tataatcact ttccccttct aaataaataa acttcgagaa 1320
taacccatca taatccaaac aaatgatgcc tcaacatttt gagctgctct gtcggacaaa 1380
taaacctggt cctcttgagg ttatattttg gatatacatt tttaaactgt cagtaattat 1440
tgtcagatgt ggagttcaat agccagccag tgttcatttt tatccttgag cttttagtaa 1500
aaacttcctg gttttatttt tagtcattgg gtcatacagc actaaagtct gctatttatg 1560
gaaactaact tttttgtttt taatccaggc caacatgtat gtaaattaaa tttttagata 1620
attgattatc tctttgtact acttgagatt tgattatgag atgtgcatat tgctttggga 1680
agagctcgag gaaggaaata attctctcct ttgttttgaa cctcaaacta gataaaccct 1740
aggaattgct taactgcaac aagtaatttt cattcccaca aaaacctgag gcagctcttt 1800
tgcccagagc gttccctgta gccaccccca ccccacttgc ccttggttct ttagaaggag 1860
cacacacatc ccttgattcc tccctgatgt ggtaaactgg cacactccag gggtctaaaa 1920
cataaaacag ttgtgtttag ggaaccttaa gtcatgcaga catgactgtt ctctttgtac 1980
aagtgtgaat caaaatatgt atctcttttt cagagtctgg ttaagctatg tcattgtcta 2040
ctgcatagtt tcctgagtct gtttgtaaag tgcttatggc taacagttca gttctgtatt 2100
tgttgacagg taaataagtg gagttgagtg ccatctttga aaaaattacc ctctagctct 2160
aacactgaaa ataataataa attgtagatc tctgcaacta agtttaaagc agtgtgactg 2220
tgttgcttaa atatcaagta ttgtttataa ccaccaaaaa aaaaaagccc tggtagtttt 2280
ttggcacctt atgtttaaat cagattctta gatttggagt agacctgacc ttgttattta 2340
ttagataaca ttttgaatgt atccattgga tttctaaaat gtattgtgaa tttctcagac 2400
aaacaggatt tatgctggag ctctgttttg cttagaaata aaatatttag tagtttattt 2460
ctgctctaat taaaatgtca agaatgccaa atgctgccag ttttttggtt tgatagctac 2520
ctccttctaa gaaagcaaaa tggttacctt tgagaggaac attcagtgtt taatcatccc 2580
ttatgttaac tagatgatag attcaagctt ttagaaatga gaaagtagaa actaatttgt 2640
taagatattt tcagactgcg gaatgttgtt agctttttct ttcacttctc ttcaaggaca 2700
ggtgttagct gtctacaata ctgttgaact ctgttgtcaa agtagccccc ttagtctaca 2760
aggcaggtag ccttggcttg aattatcaat atcaaaatgt cagttaacca tggagggata 2820
aagtaatgtg aaaagtgaga tggctgcaaa gatagctctc cttacagtta ttttggctgt 2880
cctacattgg gataagctga caaattagca gtatttagtt taacactgga gcaaatataa 2940
tttgagtagg aagaagagat agcaggtttg ggaatctata attatgaagt ccattgattt 3000
tgggagaaaa tctgttgcta aaggatttga agggccatga acacaatttg ggattattac 3060
tccctataag tataataatt ttgctagtga cccatactgt ccagtgtgcc ctaaatcata 3120
ctgctattgt actccctttg ttttcaagga ctttgcaact ggtatttggg ggagattttt 3180
tttttttttt gagacggagt ctcgctctgt cgcccatgct ggagtgcagt ggtgctatct 3240
tggttcactg caagctccat ctcccaggtt cacaccattc tcctgcctca gcctcccaag 3300
cagctgggac tacaggtgcc cgccaccatg cccggctaat tttttttttt tttttttttt 3360
agtagagatg gggtttcact gtgttagcca ggatggtctc gatctcctga cctcgtgatc 3420
tgcccgcctt ggcttcccaa agtgctggga ttacaggcgt gagccaccac gtccggccga 3480
tttttttttt tttttttaat gtaagaatgg agataaaagg gataatataa tttgctttta 3540
tattgttatt tttgtaaagc atcttttctt caattcttgt tggcattctg ggccaaaata 3600
tttcaggttg gttcggtgtg gagttaagaa aagcaggcgt tttagtggag aaatggggaa 3660
cagcatcaag aaaggctttt ttcctttttt cttttttttt tggagacaga gtcttgccct 3720
gtcacccagg ctggagtgca atggcgtgat cttggctcgc tgcaacctct gcctccaggt 3780
tcaaacgatt cttctgcctc agcctcccaa gtagctggga ttacaggtgc ccgccaccac 3840
acccattttt gtatttttag tagagacggg ggtttcacca tgttggccag ggtggtctga 3900
aactcctgac ctcgtgatcc gcctgcctca gcctcccaaa gtgctgagat tacaggcgtg 3960
agccaccatg cgtgaccttt ttttcttttt aaaagggaac aatgttgctt tcaaaacaag 4020
acatgctagg ctgaaactga tttatggaaa agactgcttg ttagcaagta tatttggtct 4080
tgagggggat acagattata gaatatgctg acatttgggc ttcagaggaa gaattttcaa 4140
atctaatgga aatagttgag gtgttcagga atgctgtttc ttggagttgg aagcttaggt 4200
tttgaaatgt tgaaaccaaa aagacaaaaa ttaaaacata gaccttaggt cgtcattcac 4260
acccggttct caagaatcaa gtggagcact tcaaagacct tggcttgtct gtcccatcct 4320
gccactttct catcttttca tgcttttgaa gacaccattt acagctctga ctcagcccta 4380
ttttgtgtaa agtaatatat tgattattca gaaatagaca atacattttt taattaccca 4440
aggactgact gttttgtgca ttttactgtt ggttgtcttc agtagagaat agtaataggg 4500
cagagaaaag tatatatttt gcctcagtca gtcccaccac cacaatggac tattgggata 4560
ttttctaaaa aaccaatcaa tttgcccatg attacctcac aaataattag tgctacctgg 4620
ggtactctca aatatacagc ttttgaaact gtagatgaaa aaagctctac tcagagtttt 4680
tgtcaagact gtgcctgggt tgaatatcag tcaattgcct acacttctaa acaataagtg 4740
ccaatgtctc aattttctca ccctgaatga tagaagctag ctttatcaaa tgccaaggtt 4800
agaaagcctg gaaataaaac ttaagcacag acattcaagt ttttgaaaag cataagccta 4860
aattcagata aatcacactg atatattgta ctatgcatag aaagttgtag gtggcgttca 4920
gggaagactt tgattttaat aaagcaatat ttagtattga agacaaacac tttttatttt 4980
cagatttctg ccaagtaaaa cagaaattgc caataaaata atcagtattt tgtaaatggc 5040
aggcaagctt ctggctgtcg aaaacatctg agtcatttat tcagtagaca atatgtcctt 5100
gatccaggtt ctttgccagc tataagggaa tccctgtcct tgagaggctc atagtctata 5160
agtaacatta cagaatttgt tagcataccc attcattatt agttttacct aaacgtgtta 5220
ggatcactac tggtggaaat tgtaaccagc ctttgggcat cttaaagggt gacatgtggc 5280
atgccttttt ttttttttaa gaatttaatg tttttcaaga ttgtagtgtt gatcagcgca 5340
acaattcaag tgtgcaaagt aacaggatag tttgcctctt cactttaccc ctggataaag 5400
gcactttcac tgcctgtcac tgatcagcag atactgactt gttgccatta agtgaacttg 5460
acttcttatg tgtgctctat gagtttgttg taattttctt cttgaaattg tgatttttca 5520
ctgacagtaa tgacaaattt aatgtatgta attgtctatg cattttaagt taaactgcct 5580
aaaatgtgat ttgagacata tacatatgtt tgtattataa attgtaagca atcagtttga 5640
gatactaggt tttatcacct gctgctgtat ttgtaaacaa agacaaatgt tgctttaaga 5700
agtaattata attaggaata ggctatggat gtgatacttg gtatttttta agataaactt 5760
gtttgctttt gtgtattata cctggaaact ttttttaaaa aatgtatttt catggtttca 5820
cagatttttc atgttatttt attctttagg cccaattctg ggcttctctg agcaagtcca 5880
gagcctaatt aactgtaaat ttgttgtcaa aaaggaagaa aaaagggcct gagatacctc 5940
tttgcatgtg acctgcattc actaaggata tctggaaacc acccttcctc cgcaaaccct 6000
ctcagcaaca tggtgtccat tgtggtgatt ttctcttctt ttaaggctag gctactcttg 6060
gtaaccagat tatccgtata tatgataata tgaagtcagg gaactttctc tgtctgtccc 6120
tactcccctc actcccccac tttctgttat gaaagatagt tctactttta tcattaactg 6180
ctacgcattt agtgagggtc acattattaa acttggagtt taccattttc ccacaggaga 6240
tttcgctggc attccttgga actcccaatt tcagtagggc aatgaatgaa tgaatacttt 6300
gcagtgctac ttttggaagg aatttctgct ttttgcctta tgattggaca aaatgcagct 6360
gtaaaatttt aaattgtttt tgatatgtta ttcaatatcc catgaaagta ttcacctaaa 6420
gtggagttat gaaatggatg gtgaaataat aagaccattc tggagcagg 6469
<210> 3
<211> 10897
<212> DNA
<213> genus, species, Homo sapiens (human)
<400> 3
agatatactg agtgagccct gagaagcagt ctcagatcct gacggtgcag cagcccgcag 60
cctcagccag ggagtcccag ccgctttcaa tggaggagaa gcccggccag ccacagcctc 120
agcaccatca cagccaccac catccgcacc atcaccctca gcagcagcag cagcagccgc 180
accaccacca ccattattat ttctacaacc acagccacaa ccaccaccac caccatcatc 240
accagcagcc tcaccaatac ctgcagcatg gagccgaggg cagccccaag gcccagccaa 300
agccgctgaa acatgagcag aaacacaccc tccagcagca ccaggaaacg ccgaagaaga 360
aaacaggcta tggtgaacta aacggtaatg ctggagaaag agaaatatct ttaaagaacc 420
tgagttctga tgaagccacc aaccctattt ccagggtcct caatggcaac cagcaagttg 480
tagacactag cctgaagcag actgtaaagg ccaacacctt tgggaaagca ggaattaaaa 540
ccaagaattt cattcagaaa aacagtatgg acaaaaagaa tgggaagtct tatgaaaata 600
aatctggaga gaatcagtct gtagataagt ctgatactat accaattcca aatggtgtgg 660
taacaaataa ttctggttat attactaatg gttatatggg taaaggagca gataatgatg 720
gtagtggatc tgagagcgga tatacaactc ctaaaaaaag gaaagctagg cgcaatagtg 780
ccaagggttg tgaaaacctt aatatagtgc aggacaaaat aatgcaacaa gagaccagtg 840
tcccaacctt aaaacaggga cttgaaactt tcaagcctga ctatagtgaa caaaagggaa 900
atcgagtaga tggttcgaag cccatttgga agtatgaaac tgggcctgga ggaacaagtc 960
gaggaaaacc tgctgtgggt gatatgcttc ggaaaagctc agatagtaaa cctggtgtga 1020
gcagcaaaaa gtttgatgat cggcccaaag gaaagcatgc ttcagctgtt gcctccaaag 1080
aggactcgtg gaccctattt aaaccacccc cagtttttcc agtggacaat agcagtgcta 1140
aaatagttcc taaaataagt tatgcaagca aagttaagga aaacctcaac aaaactatac 1200
agaactcttc tgtgtcacca acttcatctt catcatcttc atcatctacc ggggaaactc 1260
agacccaatc atcaagtcgc ttatcccagg tccctatgtc agcgctgaaa tctgttactt 1320
ctgccaactt ttctaatggg cctgttttag cagggactga tggaaatgtt tatcctccag 1380
ggggtcagcc actgctaact actgctgcta atactctaac acccatctct tctgggacag 1440
attcagttct ccaggacatg agtctaactt cagcagctgt tgaacaaatt aagactagcc 1500
tttttatcta tccttcaaat atgcaaacta tgctgttgag cacagcacaa gtggatctgc 1560
cctctcagac agatcagcaa aacctggggg atatcttcca gaatcagtgg ggtttatcat 1620
ttataaatga gcccagtgct ggccctgaga ctgttactgg gaagtcatca gagcataaag 1680
tgatggaggt gacatttcaa ggagaatatc ctgctacttt ggtttcacag ggtgctgaaa 1740
taattccctc aggaactgag catcctgtgt ttcccaaggc ttacgagctg gagaaacgga 1800
ctagtcctca agttctgggt agcattctaa aatctgggac tactagtgag agtggagcct 1860
tatccttgga acccagtcat ataggtgacc tgcagaaagc agacaccagt agtcaaggtg 1920
ctttagtgtt tctctcaaag gactacgaga tagaaagtca aaatcctctg gcctctccta 1980
cgaacacttt gttaggctct gccaaagaac agagatacca gagaggccta gaaaggaatg 2040
atagctgggg ttcttttgac ctgagggctg ctattgtata tcacactaaa gaaatggaat 2100
ctatttggaa tttgcagaag caagatccca aaaggataat cacttacaat gaagccatgg 2160
atagtccaga tcaatgaagg accagactgc ctattcgtaa cctttctgca gcattagagc 2220
catcgttcat gggggacaca aggcttttat gctcctagat cttcaacgca gcagaggaac 2280
cataagtaga atcacaggat aatatataca aatatatata tatacatata tatatatata 2340
gttatttaaa aaaggcaact gaaagtaatt agacttctta aggaatcaaa tttatttcaa 2400
gagactacac atggttattt aatctccggt actgaatagg ttttttttct tctgttagtt 2460
tttgttttta agtgtgaatg caagtgatta atgaatacag acttaacaag tgtggttcta 2520
aagttcctgc tgtcatcaac ttgggcaaca aatgacccac tggaaaggca aatccactta 2580
aaagatctct gtatcttgtt ctgtgactga agtgatacac taatcacggg gaacccagaa 2640
tgattcaaca ttttcccccc actcctccct tgatcttttt ggttttactt taattaagcc 2700
ctgcgagaat gctggataaa tgccttgaag ttagcagggt gtattttttt agcgaatatg 2760
atttgcatgt cttgccagga gttaagcggc ctctggggtg ttggggaaat actttatttc 2820
tttccattta ttttttgtgg ggcggggata ggggagggca ttgaagttct acaattctgg 2880
aatagttagt tgatggtaca tagttaactt ggcttcggtt acatattgga ctttaacaac 2940
tgaagaatct atgcgtgtca tttaaagaaa agttgcagaa caagcaattg gcttagatat 3000
acaatctgga aaaatattcc tgtgcccata ttttaatgta attgtataac tgggagcaaa 3060
aatatattct gcttttcaac tgtaggtgct ccagacttgc tctccgtcac taacactaaa 3120
tgtgctgttt tccttgtttt tcatcaaaca tttaagacaa acttagacct ttctgtaaat 3180
tatcttttaa tttctcagca aaatctaaaa ggggaagaaa aaagtccatg aaaactaaaa 3240
cttttcatgt ttttagccag tgagaagata ataaaccctg actgtagaag gtgtgttttc 3300
atgcaaacta tacttctgag cttgttagct tctaattata tcttaataaa tatattttat 3360
tactagagca agatgggttt ttaaggaaaa taatgtgaaa ttctggaaat tttctttggg 3420
gcagagaaga gcattagccc tgtcttatca ttacattgcc atcctgttgc actgcagctt 3480
gtgtatagca tgctaaaata aatttttgtg tgtgtgtgca gaaattaagg gtccaattga 3540
gattgggtga tgttagtaac ataataacaa gttgtctggc ctgacacagc atcacatcac 3600
acacacagaa attagtatat ccatgtatgt caaatacagg ttaaaatatc agggcattta 3660
tataaagagt tgtagtcttc tgataaaagt agactggatc ccctggggta tttggggaga 3720
aagtaactac tttggctcta cccctagaaa tgtccagttt tgagtgactg tagtatggat 3780
gggttttctt gttttgttga ttatttgagg cttttaaaac aagtagttca tgaaagaagc 3840
tgttggactc aacatagagt agagtaacta tctttttagt ctggatttct gccctgctta 3900
gattttaaaa gtataagcat ggattgccaa ttccacttga tgtaaacaaa actttttttt 3960
atacataata tatatatata tatataaaat aacttattgt atcagtccag gttcagaaac 4020
ttgtggtagg ccagttccag atagtttcat ttcacctgta aactgtatca ctttgactga 4080
tattgtaatt ttcaaatgta taatatgttt acagatgtgc cctgcattta gtctgccttg 4140
ttcctatttt gatttttgtt gagtctcctg cctgcttgcc aaaagctagg atgcttcagg 4200
cccatgtaca attgaaagca gaggcatcct tgagctttaa agcattgaac aaactggaaa 4260
atgcaacata ccacataact gaagtgaaaa aagtctgtgt ttttgtgttt ttttaaataa 4320
aaattttcaa aaagttaaaa aaaaagacat ataaggttga ttaaagggaa aaaaggctcc 4380
agtttgtttt acaggtttta aagttctgct gtgtgttcaa ttgccttgtg taaccacttg 4440
tcgccttagg gccagattcc cctctctagt cccctttttt aaatgtccat tttgcttgcc 4500
tggaatttta aagttcttcc gtctcacaac tcacaagaaa ctttctgggt ttgtgacata 4560
cagaggttga attgagtata tatttgaaaa ggaaaaaaca aaaaacaaac ccagacccca 4620
cctgaattgg gctttttaac ttagaagcaa cacttgatta aacatcttta gaaagctatt 4680
gcttttctaa tttccttcca tatccctcag gcctcagtgt tcagagaagc caaaaagaat 4740
gtatcacttc tctgtctgtc caaaggtttt tgagagtctc acttctaaat gaaacaatgc 4800
aacatttcac tttgatttct ccactgaaat ttccttgatt atatggttag aggtatgtag 4860
ttaggaatgt ctgttaactt tctgagaacc ctagtgcccc atcatattaa ctgtcagtat 4920
tttgggggca ttaggttaat agacttaatt gcctaggtac aagcaggact ttgggacaaa 4980
tctctttgtg ctgtttggta acacttaact ctatttgttg caatctttct ccttaggtcc 5040
tcacacaatt ccttacagag cacttattaa aaaaaaatct taagagttga tctgttttct 5100
gattattttg tgtaagcttc taaacaaact tcagctgtga ttaatttagc acatttaaat 5160
aacgtgttat tgtttggtat aaagaatttt ccttcaactc agagtattag tactgtagca 5220
taaaccaaat acagtctaga ggggattttt aacatccctc cattataaag actgaaaaag 5280
gggtgtgtgt gtgtgtgtgt gtttatgtat gtatgtatgt gtgtgtgagg aaaagatgga 5340
gatattaaaa attagtaaat gaatgtgtat aagacattag tattcagaga atgaacttgt 5400
atttattttg tgccatttgt tttcattaca cagaaaaaag tcaggtggtt taaatcctta 5460
aaagggtagt attgaaaaat ggcactaaga atgaaattat gacctatttt tttaatagct 5520
atgaagatac taattatggg tgaagatttc tttttaaatc tgttttgatt attgtaggct 5580
tctgtgtcac ataccactct tgtaggtgtc ctcaataatc cccttttccc acaaaataca 5640
cagggtgtat tatctttctc tttattcacc cccactttgc tgaactgaag ttaattacat 5700
agcctttctt ctaacctcct tagtaatgaa ccttcacata aagtgtattt acagcgtctg 5760
tggtagccag cccttcctcc tctactttct aggaggggat agccaataac taggaattta 5820
atgacagatt tttttttctt tgaaataaat ggccagagtt tctccatttt agaattttgt 5880
tgtcctcctt aatcatctgc ttacctagtc attactcaat ctgcagaaac ttcataaagg 5940
aaaagtgctg cattgttttt acaaataaca gtttgtaggg aaaatatgac aaacctcaac 6000
tatgggagtt gtccacaata caaaattttg aaaaaacatt acatagtgat aatatcatac 6060
ttggttgtta ggcttgttgc ttccccacat cagaggcatc taatgattta tcttttgtaa 6120
ttgctgtgaa cttttttaaa taagccattt agtgtgaaat tgtcatgtat caaatggcta 6180
ttggaaatgg actttactca attttaattc cactgtaaat aaggacggag tcattcctac 6240
aaggctctct tcagagaaat agattaaaag tccaatttcc aggtattatt agtatagtta 6300
tgccgctggg ccacatcctc aacaacagct gatccctctt gtataaatat gttaactgtg 6360
cagaacagtt atgttatggg acaaatataa tggtcattat ggtcagattg gttgatgcca 6420
caccagtcaa ggtagagtct gatagggcag tatcttaata accctaccca tgacttaact 6480
gttggatttg aaaggaaaac gtaggatttg ctcttgtccc cttacccgcc acaaaatttt 6540
gataatttgt ttaaaaggga gaggcagagg aaaagactag aagcataaat agctgcttta 6600
ggtttgccag aggcacatag cttaacatta gttcttaata tcgatgttat ttttactaat 6660
gtaattaatc aacagagcac caagattctt tcatggtgaa aagggtgggc ttctgttttg 6720
gtatcttaaa atgtttcttt taaaatatac atcacctgtg tgagaaccag gaccacctgg 6780
gagagtgatg aatcattggc tccactcaaa agcattgctt tactgagttt taaatttcac 6840
actgttttgc cgctcaagaa aggtcttaaa gtagttaaag gatgccagca atagtgcgaa 6900
tagaattttc ggttgtctgc ataataaaaa cacccattgc agcatgattg gtatgttgct 6960
cttgcattat tgggaatggt aaatcagtta tgggctagaa actatggaat ggccgtcctc 7020
atatgtgatg ggattgctga ttcagacttc cctattttcc atacaatttt gttatgtgca 7080
gagttctaaa gccattttat aatactgcag tatccccccc ccccccacct tttttttttt 7140
gagacggact gtctgttgcc caggctggag tacagtggcg caatcttggc tcactgcaac 7200
ctccacctcc ctggttcaag caattcccct gcctcagcct cccatgtagc tgggattacg 7260
ggcgcacacc accacacctg gctaatttgt attcttagta gagacggggt ttcaccatgt 7320
tgaccagact ggtctcgaac tcctgacctc aggcaatctg cccgcctcag cctcccaaag 7380
tgctggaatt acaggtgtga gccaccgtgc ccggccaagt atctcttttc tacagcctta 7440
ttaaactaac tacaaacatt tattttccaa tttagtttta ctttcagtgc atatcaaagt 7500
tgttgtactc ttcagaccaa caaattaact tgagggcaaa ttacatagct ttccatgtac 7560
ccttttttcc tcaggtgcta atcaaaggct ctgaaaatgg atactgcttt agtgatgtct 7620
gctttattct taaaatgctt atttcttttg ctagatgtaa agatttggtg ttaacaaaag 7680
tggttttaat atgtaaatat gaatgaatgc ctttagttta ccctgtttgt ctattattaa 7740
tctgttttca tttatccttc atagaggagg atcctttcat gatcttgaat acatttcatt 7800
agatattgtt gcattttaag aatgaaaata caactgtttt ctgtcttaga ttaatcctgc 7860
tgctatgaga aactgaaaat caagaatgtg atgcactttt tacattacta tataccatac 7920
atataccata ggttgctttg atacctttcc tgtagcacag ccactaacaa gagtgaatga 7980
attataaaat tctttttggg agggaatcaa tacaagtaac taattcttag ctgatattgt 8040
cctatgaagg acaataactt aggaatataa gaattctgtt aatagtacac tttttggcct 8100
taaatgtctt ctactactga aaatagttta aatcttagct ttgtttctat tattccctct 8160
ctctgcctca gaaagaggaa ttgggaagaa tggcttaaag gacgtggtgt cattgatttg 8220
ttgctgatct tttagaaaac atttgtctat gtaagctggg gacttatttt ttgtttgtat 8280
atagagggga aatagtgctg ccctgaacca atcagattta gtttaaatca aatcaatcaa 8340
aactccagct gtttctcttg tctttttact tagcaaagga aaactttagt gaatgctact 8400
tgacaagaag aaaagtcatt tctcaagcac atacccaaac ttgaaggtga ttgaacccaa 8460
aataatgggt gggaaacacc aaatgaggtg gaggaatgag aaagatgtgt gggccaaagc 8520
tatctggtta tattttgatg ttgccaatat cgcaaagcca aaattttaat ttgcttattt 8580
aatatatttg ttggccagag atctattttt atatcaatgt gccttgcatg tatattaaaa 8640
aaaaaaaatt ggaaacgcca tgtagtaatg cctgagatag tcgatggttc ttaccacctc 8700
actaattttt atgcagtatg aaatgctcat tctattgccc aactggtgct ctctgtttaa 8760
agttacagat cttgcgaaac tggaactatt ttataagctg gggaagtgat ttactttttt 8820
tgttgtatct tttttgttct tagtctgtta gtggctgtcc tgtagtggga aatagtaaaa 8880
ggattcttca ctcccttctc ccctcagcac cttcttcaag taaacatttc ttgtgtgctt 8940
tgaaaaaagt ttcagcttgc tgtctctttt agtgttttaa agaagtgtta tacaaagcat 9000
tgtttgcaaa atatagggag ataatggagt ccactttaat ttggaattct gtgtgagcta 9060
tgatccaagt tatcagctct ttccaacttt aaaaattttg ttaaaagcac cttgcttaga 9120
aaattttaaa tatttatgtc tgcaacaatt gtctcaaaat aataaactgt gcaattcttg 9180
tcattaaaaa aaaaaaagat ctgaattttc cctaatgtga cttgttagtt tctctctgta 9240
tttcctgcca gtgtaaatgt gaaagctttg cttgcattac gttttagaaa tgcattttgc 9300
acactcgaat tttgccgaag ctccgtgaaa aggttagatc taagtagatg aataaagcta 9360
tgcacatgtt ttgaaagttt aatttgtgtg tcattaccaa aagtgaccga tttgtcctta 9420
ctactttgct gttgttagct ttaccatctt tggaaacttg gctcaaagtt acatagttct 9480
gggctagctc atcagtggaa ctaggagaga ggaaaactgg cacctatttt aataaagttc 9540
aatttaaacg agagcttgac ttgtatctat taaagagctt ttcttgaaac agggcagttt 9600
tatcagcttt acaaatcatt ggatgctctt ccttagtaat attttggttt atttgatcaa 9660
atagaaatgg aaagtaattc aaactgaaag accctttttt gtcatatgga acttggtgac 9720
gattttttgt cttaaagctg gtttaaaggt aggataggct tttaccttta ttgctttagc 9780
ataaatttgg tttactgaat tgactggctt gagattagaa ttattcagtt gtttgtaaga 9840
tcaaagcact ggttgtttta aagataacgt gtatctttta aaaaattgcc caagctgatt 9900
agaacaagtt taggagttgg gtacatttgg ttcaagtgct gcaatctgta tgtactaaat 9960
agctttactt tgtgtatgtg tacttataat gtgtagatgt actactaccc aggttttgtc 10020
aaatcatctt ttttaaagtt tttttttttt taattggttc aggacctttg taggagaggc 10080
taatatgttt aagtagaaga tattactgat agcattttcc ccatgctcct acataaaaaa 10140
taaatatttc cattttatag ctttttcaat atacagaaga gggttacttc ttcatcaagt 10200
atattgttgc ctttgaggac acagcaaaac ccttctatat gtatcttcat tgatagtggc 10260
agttaaaaac taagttatcc agttaagact taaaaggtga cccatattaa ttgcatggcc 10320
ttaaaaggca gaaatgcagg agtgtagcaa gcatcatttt agatggctat ggttcctctt 10380
ccgcatctgt cagtagttca cttatgttca gtcttagaac ctactggagg agtgaagtaa 10440
tttctctgtc tcgtgcagag gcactaagga gctgagttac ctcttaatct gggggaatgg 10500
ataataagtg gagtacagtt atgttaaagg atgttccccc cgctcaaaaa aaagtttcaa 10560
tgtttgtttt gcccagtcaa aaatataggt cttttctaca tataagaaca gtcaccagaa 10620
attttccctt ttgctaaatg cttaggtatt tgctatagct gtttctgatg tcatggattc 10680
tgaggaagtg tcatttacgt gatgatcttc ctttattgat gtcttcatca tgttcagtgt 10740
tttaaaaata taaattacaa acactctaca accataccca gatttactta ttttatcaga 10800
aaaaaaactt gagaaatttg tagatcaaat tgagagacaa taagtgtaca ttgttgaata 10860
aaaaatttta aagtttctga aaaaaaaaaa aaaaaaa 10897
<210> 4
<211> 12687
<212> DNA
<213> Homo sapiens (human)
<400> 4
atgcgcagcc catgttagtg atggaggaga gaagatggcg gaagcggaat ttaaggacca 60
tagtacagct atggatactg aaccaaaccc gggaacatct tctgtgtcaa caacaaccag 120
cagtaccacc accaccacca tcaccacttc ctcctctcga atgcagcagc cacagatctc 180
tgtctacagt ggttcagacc gacatgctgt acaggtaatt caacaggcat tgcatcggcc 240
ccccagctca gctgctcagt accttcagca aatgtatgca gcccaacaac agcacttgat 300
gctgcatact gcagctcttc agcagcagca tttaagcagc tcccagcttc agagccttgc 360
tgctgttcag gcaagtttgt ccagtggaag accatctaca tctcccacag gaagtgtcac 420
acagcagtca agtatgtccc aaacgtctat caacctctcc acttctccta cacctgcaca 480
gttaataagc cgttcccagg cttccagttc taccagcggc agtattaccc aacagactat 540
gttactaggg agtacttccc ctaccctaac ggcaagccaa gctcaaatgt atctccgagc 600
tcaaatgctg attttcacac ccgctaccac tgtggctgct gtacagtctg acattcctgt 660
tgtctcgtcg tcatcgtcat cttcctgtca gtctgcagct actcaggttc agaatttaac 720
attacgcagc cagaagttgg gtgtattatc tagctcacag aatggtccac caaaaagcac 780
tagtcaaact cagtcattga caatttgtca taacaaaaca acagtgacca gttctaaaat 840
cagccaacga gatccttctc cagaaagtaa taagaaagga gagagcccaa gcctggaatc 900
acgaagcaca gctgtcaccc ggacatcaag tattcaccag ttaatagcac cagcttcata 960
ttctccaatt cagcctcatt ctctaataaa acatcagcag attcctcttc attcaccacc 1020
ttccaaagtt tcccatcatc agctgatatt acaacagcag caacagcaaa ttcagccaat 1080
cacacttcag aattcaactc aagacccacc cccatcccag cactgtatac cactccagaa 1140
ccatggcctt cctccagctc ccagtaatgc ccagtcacag cattgttcac cgattcagag 1200
tcatccctct cctttaacag tgtctcctaa tcagtcacag tcagcacagc agtctgtagt 1260
ggtgtctcct ccaccacctc attcaccaag tcagtctcct actataatta ttcatccaca 1320
agcacttatt cagccacacc ctcttgtgtc atcagctctc cagccagggc caaatttgca 1380
gcagtccact gctaatcagg tgcaagctac agcacagttg aatcttccat cccatcttcc 1440
acttccagct tcccctgttg tacacattgg cccagttcag cagtctgcct tggtatcccc 1500
aggccagcag attgtctctc catcacacca gcaatattca tccctgcagt cctctccaat 1560
cccaattgca agtcctccac agatgtcgac atctcctcca gctcagattc caccactgcc 1620
cttgcagtct atgcagtctt tacaagtgca gcctgaaatt ctgtcccagg gccaggtttt 1680
ggtgcagaat gctttggtgt cagaagagga acttccagct gcagaagctt tggtccagtt 1740
gccatttcag actcttcctc ctccacagac tgttgcggta aacctacaag tgcaaccacc 1800
agcacctgtt gatccaccag tggtttatca ggtagaagat gtgtgtgaag aagaaatgcc 1860
agaagagtca gatgaatgtg tccggatgga tagaacccca ccaccaccca ctttgtctcc 1920
agcagctata acagtgggga gaggagaaga tttgacttct gaacatcctt tgttagagca 1980
agtggaatta cctgctgtgg catcagtcag tgcttcagta attaaatctc catcagatcc 2040
ctcacatgtt tctgttccac cacctccatt gttacttcca gctgccacca caaggagtaa 2100
cagtacatct atgcacagta gcattcccag tatagagaac aaacctccac aggctattgt 2160
taaaccacag atcctaaccc atgttattga aggctttgtg attcaggagg gattggagcc 2220
atttcctgtg agtcgttcct ctttgctaat agaacagcct gtgaaaaaac ggcctctttt 2280
ggataatcag gtgataaatt cagtgtgtgt tcagccagag ctacagaata atacaaaaca 2340
tgcggataat tcatctgaca cagagatgga agacatgatt gctgaagaga cattagaaga 2400
aatggacagt gagttgctca agtgtgaatt ctgtgggaaa atgggatatg ctaatgaatt 2460
tttgcggtca aaacgattct gcactatgtc atgtgccaaa aggtacaatg ttagctgttc 2520
taaaaaattt gcacttagtc gttggaatcg taagcctgat aatcaaagtc ttgggcatcg 2580
tggccgtcgt ccaagtggcc ctgatggggc agcgagagaa catatcctta ggcagcttcc 2640
aattacttat ccatctgcag aagaagactt ggcttctcat gaagattctg tgccatctgc 2700
tatgacaact cgtctgcgca ggcagagcga gcgggaaaga gaacgtgagc ttcgggatgt 2760
gagaattcgg aaaatgcctg agaacagtga cttgctacca gttgcacaaa cagagccatc 2820
tatatggaca gttgatgatg tctgggcctt catccattct ttgcctggct gccaggatat 2880
cgcagatgaa ttcagagcac aggagattga tggacaggcc cttctcttgc tgaaagaaga 2940
ccatctcatg agtgcaatga atatcaagct aggcccagcc ctgaagatct gtgcacgcat 3000
caactctctg aaggaatctt aacaggaaca tgaagccttg ataaaacagc agttttactt 3060
ttctcacaaa aacttgtaag gtaaaggcct aacttggtct agaatatgac acttattgtg 3120
gtggatagcc aagcacattg ggatctccac atcaaatact gacatttctt ctacaggtat 3180
aataattcat catgcatttt cataattaat aaacattggt aaaattaatt ttacaggtta 3240
catgaaacat tgaaagactt gttacagagg gccatgatat ttttcaaaga aatgtgttat 3300
actagataat ttttttaaag gtgatgttta tcattaatat aaagaatcct tttaaaagta 3360
atttaatgat ttacatttct cctcttttga ttcaattttc ttatacattt tttctaccct 3420
attagttttc taaaggttgt catgagaggt atattatgga ataatttagt agtccagtga 3480
cagaatcgta tgaaatcagt gtacatttta aaaaacatgt cttttagaca tatgctttat 3540
ctataaaaaa ggaattgtgt tctagtatga acaatactga tctggaagtg agaagagtta 3600
gtttctattc caaacttgac caagaatttg gtttgactga gaacgttttc ctctcagttt 3660
ttgtacattt atttagagca gtggttctca gtggaggtca gttttgatcg ccaggggaca 3720
tctggcaatg ttgagacatt ttggttgtca cagcttgggg gtgggttcag gggagggttg 3780
ctactggtgt ctagtagtta gaagccagag atgtttctaa acatcttata atgcacagga 3840
cagcacccct ccactgtaaa gaattattgg ttcaaaaata tcggtactgc caaggttgag 3900
aaactctgat atagaaggag tgataaatat tgttttcacc caaaggaata cttttaaagg 3960
atgaagctta ctaaacatat atgatggaag tattattcag ataacattaa tattctgctg 4020
aataattttt tctagtttaa tcatactaga aaaagaaaaa aaatctacaa attgtcctat 4080
aaaataagga caaacatgca aataatttaa ctctcagaaa gtactaattc attctgatta 4140
tctttcatac ctctgtgctc ctctgcactg acgaagacat aatatgatta tacctatgaa 4200
ctagtgcaca gccttttctg gcaagaaaat agtttgtagc agatacgtgg ttgctctttg 4260
gatttttttc tattgttgaa catgctggga ctagctagaa tgcacattcc tacttccttt 4320
accaaacgtt tgcatgcttc ctgcaaagca cttaccaagt gatttctctt gaaccatcgg 4380
atataatttt gtatgtacat gtttgaggaa aaaaatgtaa agcaaaacct tttactgaac 4440
agtgttctat agaattatga cactaaaaca aaattgtttg tggaagccct gaaagcttta 4500
tagtcctgga catcaaaaat tttatttgag atgatgaatg ttttgttttc atcttttctt 4560
atattaccac aattgagata ttttagtaat tgaaggaaca tacacagata tttggcagaa 4620
gtcgagtaag gaggggaaaa aaagagtccg tgagtttcag tcattttcac tgctcttttc 4680
aaaaagattg tgttgagctg gtagaagact aaagatgtca ctgaagacat cacagatact 4740
atatttatct tttggctttg tgtacattag agaatgttga ttatttttat acaaaaatac 4800
agcgggtaat ttttttaatc tttagatgcc tcttgtttga atgtatgctt tgtggaattc 4860
tttgtgtagt aatgttttaa aaaaagatgt ttactgatag ttacatgtag gattagaata 4920
tgtaatataa tataaggctc atgttccaga cctacgatag cttgtagtct atgttacgta 4980
tttctttata tcacattttt aatcattgga ttaaagtatc aaggaaagct aggtactcta 5040
taatgagttt tcatttatta gcagttaatc atcatgacag aattgtcata tgcttgactt 5100
ttccctcttc ttggaatttc agaacacaaa tacaggctaa gcattagtaa gagatggccc 5160
acagtatgag agagagaggt gcaacggaaa atctcgcctg gaattaaaac ttttcataga 5220
ttatccacgg ttaatacaaa atttattata tggggataga ctgctccagc aataatgatt 5280
acatcctata actgtattac ctatggcctt taaggtatca attttgaact gtgttgtagg 5340
ctctcctttt atttgttctc tttcctaata gcagccattc tgtacttatt gaaagcccct 5400
gtgcctactg ctgtcttaag tattcaggag gggcttacaa gagggttttc tattggagaa 5460
taccgtataa tcttaaatct agtccagatc tctgttgtcc ccactcaaaa catacacaaa 5520
atatgcactt gcttttttca agtgagtttt tatttaaaaa tggcttgttt gctatcacat 5580
tggtgcagct gtttctttca agatgagtta atcatcttaa tttcaaagct tcagctatat 5640
ataatggata tatagacaac actgagcatc cacctctctc ctgagcttta aagcagagtt 5700
tcagtatgat ataggtgggg agagtaaatt gttttcatat cctttcatac tactactaat 5760
agttttagga ttttgactgg ggagagataa tgacaaacag aaagggaaca tggaggttct 5820
tcctactttt gctacctaag tttgcatttt ctgacttcct tgcagtgttg cactctttgt 5880
cccattggga taaaaagcat aagtttgaaa ttttgcttta agccttgtgt tcctggggaa 5940
gttaaacaac taagagagct gatttgtaaa aattattttt tatatgacat taatattcat 6000
caagccttgt gtaggcatgt gtaagacaca gctatgcagc tttgagtagt caatatagta 6060
tgagatagag tgttgtccca aatcctcctg tcacttttta agtagcatat tatttccctg 6120
atggtcctgt tactttgctg ttgaatgctc taaacagaac tttttaaaag gtgtgtttta 6180
agagcagtca cctaggagta gacaaggtgg aatgggagga gagaaatggt aatgcaaaag 6240
cttgagcatg ggaagagtca gaggaggagg ccatcatcct tgttagctta gcctacttca 6300
acactgagca catttctgca cttttgaagt gaaattcatg ttttacttag aagaaataat 6360
tttctttcat tagggatccc agttgatttt tgtttcctgg tgtatcaaaa tacttagaac 6420
tatgaaacaa gtattattgt gatcatgcct ttgaataatt tttgacgtag cttatcttca 6480
tgtatcaagt ataaaattat aatgagacat ctattcacaa atacaagtct tagattgaat 6540
tgaaatgtgt tatagtgccc tgtctcccac tgacttgttc agttaaatgt cttaaagtac 6600
attatgtaca tcttcaggct tttggtacca caatggcaca agtatggtag ggaggcaata 6660
tagtcttagg ctatatgcct atattaagtg tgtataaaca atttttgaaa gaatacacta 6720
ttatagatgt atgtgagtga tgctgacctg acagccatat ccagtggatg aaactgactg 6780
gacacactgt taaaatgttt taaagatgta ttttcagcca gaacagcctg gttatagttt 6840
gtggttttca ccttggtgga ttgcaggaac acatgcagcc tactggcatt gagcattagc 6900
taatggcatg aaagggcctc atctcactac ctctctaagg cctctagctc caagaaaacc 6960
atgaaaactt ctttcttgga gagatctttg tctcagaatc cttagagagg atttcgtatg 7020
ggggctaact ttaggaaggg aggcagctgg ggcaggactt tctgatacct gacagtcatg 7080
ttccagagca acctttgggc agtggaaact ggcgcatcta tgcaaaatga ttgctcaatc 7140
tctatcttgt gtactacata tgtaactagc tgggccctaa ggaaggtttt ctagggggaa 7200
ggatagggaa gtagaggagg agacaagtag gaggaacaaa gcattctaga cccaagagga 7260
tagaagatat ttaggataga tatggctttc atccatagtt caaaataatg cgttttgtta 7320
gatgccagtt atagcagtaa ataggttata gtttttatat gtcaagattt acctgtaatc 7380
agactcattc tttcactctc tatacccact gtctccatgc ttgggagcat ggatattaat 7440
agttccagtg atgtagaagt tagtgatttt tgatttctga aaaaggtgag aaccttttat 7500
tacagttgga gaatatttgt caaaaattca aaggttgttg taattgagtt gccagaatta 7560
cagagtttcc attttcagat atcacagttg aatcacctct gtagattgtt ataaagagag 7620
gcattttaag atagtatttt atttgctagg ttgtgtctca gtctaagaat tgggaaaaga 7680
agagctatag gtttctcttt cctagtctgg atttcagtaa acacaagcct acctctgctt 7740
ctttggttca cagcagtgtg gatcatgaaa tgaactgttt acccacattc atcaatattg 7800
gtattttaca aatctacttg gagcatttaa tttcatctca aagattgtga tccactttag 7860
ataagcacaa atacagtatt aggaaaagta aatatgcaat cttactaaaa tttcaacttg 7920
ttaagctgta tatcttaaaa gaaattattt ggggctgggc atggtggctc acacctgtaa 7980
tcccagcact ttgggaggct gaggtgggta gatcacctga ggtcaggagt tcgagaccag 8040
cctgaccaat atggtgaaac cctatctcta ctaaaaacac aaaaattagc tgggtgtggt 8100
ggcatgcacc tgtaattcca gctacttggg aggctgagac aggagaattg cttgaaccca 8160
ggtggtggag gttgcagtga gccaagatca cacccctgca ctccagcctg ggtgacagag 8220
cgagactcca tctcaaaaaa acaaaacaaa aaattatttg ggaagatacg tcctctttta 8280
ttagaagttc ataaaatgta tcatatagtt ttgttcacag tagttatata agctttcttc 8340
aaataaattt aaaattagat taccttcttt ggaaaaagaa tttcctaaat ttttaagaat 8400
tttcaaagtt ttacatatta gtttttagaa cctaatccgt tttaaaattg tactatgaga 8460
aagctttttt ttgaaagttg taaagcatta atacaaataa tacaaatata attattacca 8520
tcacattcca gagaatatgg ctttttctaa actttcaatt tagaaaacat acattaaggg 8580
agaatctctg ccctcctttt cagctctgaa gatcagcttt tctactcaga cacatgcaca 8640
caccccttcc aagtgtcatg tttatgggaa catttgggaa atgttttcca gatgttttat 8700
tttttccctt ttatagtttg ttgacattta attttactta aagatgacaa ttttaatcgg 8760
aaatgttaga ggtacaacat agtgaggttc tagctagctt tatacttttg aaaaatattt 8820
ttgtttctac tgctttttac aagtactagt cctctcagtg atactggtgg tgttcagtat 8880
gaatccatag aaagaaaaca aaatttgttg tttaaaaaaa gcagagtaat gaatgaattt 8940
cagttttgaa aacaacataa tttgaaaaca ctgttatact aacatggcaa ggtgttaatt 9000
aaatataaga gtaaggtagt aagttctttt agagcacctg tttaaattta ctccagtaat 9060
catcttaagg attgatagtc accatcactt attggcttaa aagttatatt tcatggaata 9120
ttatcagtgt taaatccaag ctttgtggag ctttaagtga tggtggtgaa aaagttggtg 9180
tttatgagag agtggtgggg tgtctagtca ttagtgaagt taaacatcaa cctgttttag 9240
aaagaatttt ttagtcttgc ctaaagtaaa ccagaagtgt ctagtgttta aatctttatt 9300
tagaatgctt ctcttaaaag tattttttgt tttgggtagt attaaataat cagtaaataa 9360
tctatttcag tagtaaataa tgaattaaga tgatgatgaa tgaggattaa cacactggtc 9420
tggagactgg ggttttattt cagtgggtta gctgtgtgtg acatgttggg caattactca 9480
gctgttttaa cagcttccag atatgcagta tggtgcctgt actactcaaa agttgatttt 9540
ggtttaattc atctttaagg tacctcccag ctctaaaact atgattctag gctgtgtaat 9600
ggggttattc ctactttatt ctctttcctt ttttaagggt tcattttata cttaataagc 9660
atccatttct tgggtcacct acagtctttg ttctcctaag gattaaaata gaaaattcat 9720
acataacaag caaatgatga cattttccta aatgctcctt attggttaac cactgaatat 9780
atgaacacat atgaatattg tcattcatgt acttaaattc atttagcaaa ctatttgaac 9840
acttacatgt gcagtgtttg gtgaacatga catgaggaac tagtagtaag taaaatcttc 9900
cccccaaaat tcattgtggc ttaaataaat atgaacataa tcattactac ttaatatact 9960
gagagggaat cttaataaac ttggaactgg gagggaatat ttgtatacat tgggtaaagg 10020
gttaggctag atgacatcta aggggtctga gtgaatcata tcataatttt tataacacat 10080
ttcacatact aaacatcagt tggccccata cctgattaag ttacaaaatt taggagactt 10140
aacattaagg acttacaggt tgagacagcc cgtatttcac aacattattt tgacacttga 10200
ctctattcca gagttgttgc tatacaaggc atgtggcaga acaaaaaaaa agctggtgtt 10260
gatataagag ctttttaccc agtattgaca gtgagcaact ttctttcttt tttttttttt 10320
ttcttttttt tttttttgag atgggttcgc tctgttgccc aggctggtgt gcagtggtgc 10380
gatctcagct cactgcaacc tccacctccc gggttgaagc gattgtcttg cctcagcctt 10440
ccaagtagct ggaattacag gtgcccgccg ccacacctgg ctaatttttg tatttttagt 10500
agagacgggg cttcaccatg ttggccaggc tagtctcgaa ctcttgacct caagtgatcc 10560
acctgccttg gcctccctaa gtgctgggat tacaggcatg agccaccaca cctgtccgac 10620
agtgtagcaa ctttctaaaa ctgaaaaatc tcaaaggaga tcattggaac tgacttgttc 10680
atttattttt tgtttttaaa ttaagaaaga ttacacaaaa taagtgttac tgtactttaa 10740
gctattacaa atatccaact tttaaagata tgtaagaatc agtaatattc tagaaagcac 10800
atatatagta aaagggcatc ctttaaatgt agaacgggta aacatgaaac agttccatgc 10860
ttgaattgtt aagtatctag ggggtaaaca ttgaatggga gaatcattta ttgggttaag 10920
gtcccttcct tgtcattctg ggatctgtga atcacattgt aattcctgtt gacaaagctt 10980
tacttgttaa catcagttga tactgacatt ctccataaag atatagaatg aaaatatcta 11040
ttaaaaatag tttatcattg ttttagcttt tttgttttgt ttgttttgag acagagtctc 11100
actgtcaccc aggcttgagt gcagcggtgt gatcttggct caatgcaacc tccacctccc 11160
aggttcaata gattctccca ccttggcctc ccaagtagct gggattactg gcatgcacca 11220
ctatgcctgg ccagtttttt gtatttttag tagagatggg gtttcaccat gttggccagg 11280
ctggactcga actcctgacc tcaagtgatc cgcccacctc agcctcccaa agtgccggga 11340
ttataggcat gagccactgc gcctagcctg ttgcagcttt ttaaagcagg aaaatatcca 11400
tataaactgt tgggttagaa tctatattag aatctttcaa actaattgaa aacaggaaga 11460
ctatcatcta agtagccaga taatctgggt ttcaaaaagt tattccatgg tactggttta 11520
aaaaatactt ttcaagtgtt ttaattttta aagtgtaact aattcttcaa atatgttatg 11580
ctgttaaaat atgtattcca taagtacttt ttgtatatgt attcttaaat tttaaaaagt 11640
caactgaatg cgcaaagatg atataatttt ggatgtagac atttaaacta gattcccagt 11700
cctctccttc aaaagcttgg tctttgtttt tcctataggg aaaaaagtca aaataagttc 11760
caaaaactat cctcaaagta gtattgtgct tgtagtaaat gaaggttgga tggatggata 11820
ctgacaatgg tggcaggcat ttcaagcctt ttaaattagt actttttgtc gtcttgctta 11880
ttaaaatttt gttaatttta gcaaagacca attgttgtga taaactggtg ttttttggat 11940
gcttcaagca cacgttaacc aattttttaa ttcccctttt ggttcctccc attgttctaa 12000
aataggactt tcatattatt aaaacctcaa aagatgatcc acccaggatg aacaaagatc 12060
accaagggga aagaaaacat tttttatctt tacagaaaac atgttaagat tatatataga 12120
tgtattcttt acattggata ttgtattaga gtcctcctta caagaaatga aatagttttt 12180
agcactctta gcattagagt tcctagattg gtgttgatag ctacagtttt aaaatgtata 12240
acctgaaaat gaaggttaat tttgcattgt aagagcacat ttgatctatg taaaaagtgt 12300
ccatttggtg tattttttta aaaaagagaa agcactttca tattaagtag catgtgtatg 12360
aatttagatt ttcatatttg ttgtgtctgt attcagtgaa gtaaattgag catttaaatg 12420
tttgttgatg gcaacattaa ctattaaatt aaagcacctt atactctgct gcttaacttg 12480
cttgtaattg cacctttgtt acctgcacat tttcatatag aatattgttg taacattgct 12540
tcatgtgggt ctggatggaa gattagtggg cctacaggat catttattta tattgtttat 12600
attacaataa tatattgtag atcagttgta agttcatttc tttacaaata aaagcctctt 12660
ccatttgact ggaaaaaaaa aaaaaaa 12687
<210> 5
<211> 5368
<212> DNA
<213> Homo sapiens (human)
<400> 5
gtttggcttc acggaaccct gtacgcatgc tcctacgctg aactttagga gccagtctaa 60
ggcctaggcg cagacgcact gagcctaagc agccggtgat ggcggcagcg gctgtggtgg 120
ctgcggcggg tccgggccca tgaggcgacg aaggaggcgg gacggctttt acccagcccc 180
ggacttccga gacagggaag ctgaggacat ggcaggagtg tttgacatag acctggacca 240
gccagaggac gcgggctctg aggatgagct ggaggagggg ggtcagttaa atgaaagcat 300
ggaccatggg ggagttggac catatgaact tggcatggaa cattgtgaga aatttgaaat 360
ctcagaaact agtgtgaaca gagggccaga aaaaatcaga ccagaatgtt ttgagctact 420
tcgggtactt ggtaaagggg gctatggaaa ggtttttcaa gtacgaaaag taacaggagc 480
aaatactggg aaaatatttg ccatgaaggt gcttaaaaag gcaatgatag taagaaatgc 540
taaagataca gctcatacaa aagcagaacg gaatattctg gaggaagtaa agcatccctt 600
catcgtggat ttaatttatg cctttcagac tggtggaaaa ctctacctca tccttgagta 660
tctcagtgga ggagaactat ttatgcagtt agaaagagag ggaatattta tggaagacac 720
tgcctgcttt tacttggcag aaatctccat ggctttgggg catttacatc aaaaggggat 780
catctacaga gacctgaagc cggagaatat catgcttaat caccaaggtc atgtgaaact 840
aacagacttt ggactatgca aagaatctat tcatgatgga acagtcacac acacattttg 900
tggaacaata gaatacatgg cccctgaaat cttgatgaga agtggccaca atcgtgctgt 960
ggattggtgg agtttgggag cattaatgta tgacatgctg actggagcac ccccattcac 1020
tggggagaat agaaagaaaa caattgacaa aatcctcaaa tgtaaactca atttgcctcc 1080
ctacctcaca caagaagcca gagatctgct taaaaagctg ctgaaaagaa atgctgcttc 1140
tcgtctggga gctggtcctg gggacgctgg agaagttcaa gctcatccat tctttagaca 1200
cattaactgg gaagaacttc tggctcgaaa ggtggagccc ccctttaaac ctctgttgca 1260
atctgaagag gatgtaagtc agtttgattc caagtttaca cgtcagacac ctgtcgacag 1320
cccagatgac tcaactctca gtgaaagtgc caatcaggtc tttctgggtt ttacatatgt 1380
ggctccatct gtacttgaaa gtgtgaaaga aaagttttcc tttgaaccaa aaatccgatc 1440
acctcgaaga tttattggca gcccacgaac acctgtcagc ccagtcaaat tttctcctgg 1500
ggatttctgg ggaagaggtg cttcggccag cacagcaaat cctcagacac ctgtggaata 1560
cccaatggaa acaagtggca tagagcagat ggatgtgaca atgagtgggg aagcatcggc 1620
accacttcca atacgacagc cgaactctgg gccatacaaa aaacaagctt ttcccatgat 1680
ctccaaacgg ccagagcacc tgcgtatgaa tctatgacag agcaatgctt ttaatgaatt 1740
taaggcaaaa aaggtggaga gggagatgtg tgagcatcct gcaaggtgaa acgactcaaa 1800
atgacagttt cagagagtca atgtcattac atagaacact tcagacacag gaaaaataaa 1860
cgtggatttt aaaaaatcaa tcaatggtgc aaaaaaaaac ttaaagcaaa atagtattgc 1920
tgaactctta ggcacatcaa ttaattgatt cctcgcgaca tcttctcaac cttatcaagg 1980
attttcatgt tgatgactcg aaactgacag tattaagggt aggatgttgc ttctgaatca 2040
ctgttgagtt ctgattgtgt tgaagaaggg ttatcctttc attaggcaaa gtacaaaatt 2100
gcctataata cttgcaacta aggacaaatt agcatgcaag cttggtcaaa ctttttccag 2160
caaaatggaa gcaaagacaa aagaaactta ccaattgatg ttttacgtgc aaacaacctg 2220
aatctttttt ttatataaat atatattttt caaatagatt tttgattcag ctcattatga 2280
aaaacatccc aaactttaaa atgcgaaatt attggttggt gtgaagaaag ccagacaact 2340
tctgtttctt ctcttggtga aataataaaa tgcaaatgaa tcattgttaa ccacagctgt 2400
ggctcgtttg agggattggg gtggacctgg ggtttatttt cagtaaccca gctgcaatac 2460
ctgtctgtaa tatgagaaaa aaaaaatgaa tctatttaat catttctact tgcagtactg 2520
ctatgtgcta agcttaactg gaagccttgg aatgggcata agttgtatgt cctacatttc 2580
atcattgtcc cgggcctgca ttgcactgga aaaaaaaatc gccacctgtt cttacaccag 2640
tatttggttc aagacaccaa atgtcttcag cccatggctg aagaacaaca gaagagagtc 2700
aggataaaaa atacatactg tggtcggcaa ggtgagggag atagggatat ccaggggaag 2760
agggtgttgc tgtggcccac tctctgtcta atctctttac agcaaattgg taagattttc 2820
agttttactt ctttctactg tttctgctgt ctaccttcct tatatttttt tcctcaacag 2880
ttttaaaaag aaaaaaaggt ctattttttt ttctcctata cttgggctac attttttgat 2940
tgtaaaaata tttgatggcc ttttgatgaa tgtcttccac agtaaagaaa acttagtggc 3000
ttaatttagg aaacatgtta acaggacact atgtttttga aattgtaaca aaatctacat 3060
aaatgattta caggttaaaa gaataaaaat aaaggtaact ttacctttct taaatatttc 3120
ctgccttaaa gagagcattt ccatgacttt agctggtgaa agggtttaat atctgcagag 3180
ctttataaaa atatatttca gtgcatactg gtataataga tgatcatgca gttgcagttg 3240
agttgtatca ccttttttgt ttgtctttta taatgtcttc agtctgagtg tgcaaagtca 3300
atttgtaata ttttgcaacc ctaggatttt tttaaataga tgctgcttgc tatgttttca 3360
aacctttttg agccatagga tccaagccat aaaattcttt atgcatgttg aattcagtca 3420
gaaaagagca aggctttgct ttttgaaatt gcaactcaaa tgagatggga tgaaatccta 3480
tgacagtaag caaaaacaga accatgaaaa atgattggac atacaccttt tcaattgtgg 3540
caataattga aagaatcgat aaaagttcat ctttggacag aaagccttta aaaaaaaaat 3600
cactccctct tccccctcct cccttattgc agcagcctac tgagaacttt gactgttgct 3660
ggtaaattag aagctacaat aataattaag ggcagaaatt atacttaaaa agtgcagatc 3720
cttgttcttt gacaatttgt gatgtctgaa aaaacagaac ccgaaaagct atggtgatat 3780
gtacaggcat tatttcagac tgtaaatggc ttgtgatact cttgatactt gttttcaaat 3840
atgtttacta actgtagtgt tgactgcctg accaaattcc agtgaaactt atacaccaaa 3900
atattcttcc taggtcctat ttgctagtaa catgagcact gtgattggct ggctataacc 3960
accccagtta aaccattttc ataattagta gtgccagcaa tagtggcaaa cactgcaact 4020
tttctgcata aaaagcatta attgcacagc taccatccac acaaatacat agtttttctg 4080
acttcacatt tattaagtga aatttatttc ccatgctgtg gaaagtttat tgagaacttg 4140
tttcataaat ggatatccct actatgactg tgaaaacatg tcaagtgtca cattagtgtc 4200
acagacagaa agcacacacc tatgcaatat ggcttatcta tatttatttg taaaaatcca 4260
agcatagttt aaaatatgat gtcgatatta ctagtcttga gtttctaaga gggttcttta 4320
tgttatacca ggtaagtgta taaaagagat taagtgcttt tttttcatca cttgattatt 4380
ttctttaaaa tcagctatta caggatattt ttttatttta tacatgctgt tttttaatta 4440
aaatataatc actgaagttt actaatttga ttttataagg tttgtagcat tacagaataa 4500
ctaaactggg atttataaac cagctgtgat taacaatgta aagtattaat tattgaactt 4560
tgaaccagat ttttaggaaa attatgttct ttttccccct ttatggtctt aactaatttg 4620
aatccttcaa gaaggatttt tccatactat tttttaagat agaagataat ttgtgggcag 4680
gggtggagga tgcatgtatg atactccata aattcaacat tctttactat aggtaatgaa 4740
tgattataaa caagatgcat cttagatagt attaatatac tgagccttgg attatatatt 4800
taatatagga cctattttga atattcagtt aatcatatgg ttcctagctt acaagggcta 4860
gatctaagat tattcccatg agaaatgttg aatttatgaa gaatagattt taaggctttg 4920
aaaatggtta atttctcaaa aacatcaatg tccaaacatc tacctttttt cataggagta 4980
gacactagca agctggacaa actatcacaa aagtatttgt cacacataac ctgtggtctg 5040
ttgctgatta atacagtact ttttcttgtg tgattcttaa cattatagca caagtattat 5100
ctcagtggat tatccggaat aacatctgaa agatgggttc atctatgttt gtgtttgctc 5160
tttaaactat tgtttctcct atcccaagtt cgctttgcat ctatcagtaa ataaaattct 5220
tcagctgcct tattaggagt gctatgaggg taacacctgt tctgcttttc atcttgtatt 5280
tagttgactg tattatttga tttcggattg aatgaatgta aatagaaatt aaatgcaaat 5340
ttgaatgaac ataaaaaaaa aaaaaaaa 5368
<210> 6
<211> 4270
<212> DNA
<213> Homo sapiens (human)
<400> 6
gagggccgct gtcactcagc cccgcgggcc aatagaaaag gggtgaaccc cgccttcttc 60
ctgagttgtg ctgcgggcat gcgcactggg cgtccccacg ccaccgccca tcagctgaga 120
attgcagctg agggctccgg ggtaggtggg tgacggcggt cggaggtgta ggagggagcc 180
gtggaggtcc aggtgactgc ttagaaaact gcacagcatc tgatgaaatt agcgaataag 240
aacatcaacc atgtcttaca ctccaggagt tggtggtgac cccgcccagt tggcccagag 300
gatctcttct aacatccaga agatcacaca gtgttctgtg gaaatacaaa gaactctgaa 360
tcaacttgga acacctcaag attcacctga attgaggcaa cagttgcaac agaagcagca 420
gtatactaac cagcttgcca aagaaacaga taagtacatt aaagagtttg gatctctgcc 480
caccaccccc agtgaacagc gtcaaaggaa aatacagaag gatcgcttag tggcagagtt 540
cacaacatca ctgacaaact tccagaaggt ccagaggcag gctgctgagc gagagaaaga 600
gtttgttgct cgagtaagag ccagttccag agtgtctggc agttttcctg aggacagctc 660
aaaagaaagg aatcttgtat cctgggaaag ccaaactcaa cctcaagtgc aggtgcagga 720
tgaagaaatt acagaggatg acctccgtct tattcatgag agagaatctt ctatcaggca 780
acttgaagct gatattatgg atattaatga aatatttaaa gatttgggaa tgatgattca 840
tgaacaagga gatgtaatag atagcataga agccaatgtg gaaaatgcag aggtgcacgt 900
tcagcaagca aatcagcagc tgtcaagggc agcagattat cagcgcaaat ccagaaaaac 960
cctgtgcatc atcattctta tccttgtcat tggagttgcg attatcagtc tcatcatatg 1020
gggattgaac cactgaagtt ataaaggagc acactgtcgc actacattgt ctaaattatg 1080
taggaagatt cctgtaatca tgttttttta attattattt taaagctatt gtataaagga 1140
tggttcccat actttgttat ttttattggg ggggttgggg tggttccttt ggattaaatc 1200
tgatattttc taatactgaa agattttcta aatgtcactg ctgacataac tcccttggtc 1260
ttcaatttaa tagttgttaa gtttttgccc acattgcata tgcctttcat ttataattta 1320
tttaccctgc ttgacttagt tttgggaatt cgtaaattta aaggtgtgtg tattctgttt 1380
gcatctccct gtcactgtga cacacctaga tgtgtgttac ttcaattaaa attctcaaat 1440
ttaattttga tttgcttcag cagggaaaat attctcaata atgtaaaata attaaggtct 1500
atacatgggt tgtatttttc tggttcacaa cagcacaaag tgtctttcat ttttttgttg 1560
gttttctttt aagatctttt ttaccctgaa gtcggtgaat acttttctag tttatttgat 1620
actctttctg tgtatatatt aagcttttgc tgtagattgc ctagtaaaat tactaaggat 1680
aggttgtttt tacatatggt ctatttaagt ctgatgttta cgggggaaag tgtagttact 1740
aaaaatgttt aacataattt ggaagaagag tatgaacaac caataccaat acctattgcg 1800
tttggattct taagacccca gtttgttatt ccactaaact agttatctta accatatcat 1860
ctggttttgt gggccattat ttaccttccc ttatgtctta tagaataatg gttaatattt 1920
ttaggtcaaa attacttttg gaaagtaact ttcccacaat taactgtttt tgagcacctg 1980
acaaaattta gtgtttacct tgcgtgccat tttgtgtcat ccttcattaa aaaagcaatt 2040
ggaggtttgc cagttatctc acttcccttt ttaaatcaat gttgttttaa tgcactaatc 2100
tgaattctgt aaagaggatt atcttagttt atactttgta ttttataatg ttcttgtata 2160
gcagctcggt actgaaggcg gtgtttaact tggcaagctc tgagacttca aatgggaaca 2220
aatagtaagt agctaagtaa accacatctt tgcaaccaaa ataaagatga gttaaaaggt 2280
atctggttag gcctatttca tgaggactat gctctggtgg gaccacaggt cacctgatac 2340
ttagtgctgt gctgcctgaa acctcagcat ggagacatca ccacatgcac tgtggccatt 2400
cagcatcttt ctggagcacc agttcactcc aggtttttat tttagggtgt catcgatttt 2460
acctactttg tcagactggt agaagttgct ttgcatatca gaaaaactcc atttttttcc 2520
acaaaaggga ttacagaaaa ctcttttgtg agtgagtgat tggaacttag agactcctgt 2580
tgccagaatc agactgccct agaacagaat ggacaatgca gggaggagaa ttcacacaaa 2640
cagcacctgt tctgaggcct gtgccagccc accaggcctg ctcaaatgtg gtctttactt 2700
caagtgcaca gaggcacatg aggtttctgg tgataaacca gcgtcttacc gctgttttaa 2760
agtcccatcc ccatggcttt cacaatcagt tccgtttttt ttgctgtact tgataaaatg 2820
tttattctca tacaggtcaa gtacatttac ttctattcac agtgagtacc caataacaac 2880
aaaagcgctt acaaatttgg ggggcgtgat tttagtacct ttatttgaag tgtaatcatt 2940
ttaaattatt attattttaa ctggggcagt tatcagtggt ttaaacagga actttagtgg 3000
cttcaatttg tttaagaaac atattaagtt tgagggaaaa atttcccatg aaatatttgg 3060
aacgtaagag tagtattgat tagagaaaat taaataagaa acatagtatg gtagccaaat 3120
tttttaaaaa atcttgaact tttctgtagg tcagttttag aactgctgtg aaaagtgaag 3180
gttgccctgt ggagattaaa attagagttg ttttcataac tgacagcatg gtgaatccat 3240
ttgagtcaaa gtgaagaatt tcctcatcaa gtgactatac atttgttttt gtgtgctcaa 3300
aagaaatact caaacacaga ctgatattaa ccagccaggt aaattgaacg acaatgtggc 3360
attaggtatt tggctgttta ttggtcgtta aatactatgg ttttgcaata tgattgatgg 3420
taaaagagtg atgtcatatt gatactagag tagcttgttt ttttagtagg tgtgggacca 3480
tctcttttac aagtgcaact cagtctagga cagccatgga tgcagtgtct ggagttggac 3540
cctctgagcc cgctggctgc cccagtagca tctgcattgg tgaccaagga cactgcactt 3600
tgaagaggtc gccactgggt tatttagtgt cttcactgct tttgttaaaa attgtaaaat 3660
tttgtacaca aaaagttgtg tttttgaata tcaattgttt agacacacct acaatgataa 3720
ataagtgcct ttaaaggccc ctctttccat gaaatacatc tgtggtttag caaggaaagt 3780
acaaaatagt ttatgtagtt ggtataattt ttatttgtgt cttcatgtag aaaaaatgaa 3840
tgtcataata aaatataaaa cttacgtaaa gaaaataaag tcattgtcca ccttaatagc 3900
taaggtccac aagggtaact tatgcagcat ttattttttt tgaaagtcaa aattgaattt 3960
atttctttca catggctggt ttgctgcaat atgaagtttc agaatgggct gaagtaagtt 4020
gattgaggga tttgagttga atgacatttt caagttcatt taaatatgat aaaaattcat 4080
tggtggtaaa taacatctgt ctttcctgga aaaaaaaaag ttgtgtattt tcatgattca 4140
gttaaaacaa aaaatgagcc tgtgaatccc aggccttttt agtcctccat aacatttgaa 4200
cagtttgact tgtcagcaaa gaaatacact tatcaaattt taaaccaatg ggagcctgaa 4260
agtgttacag 4270
<210> 7
<211> 21
<212> DNA
<213> Artificial sequence
<400> 7
ggatcttgtc agtgacgttg t 21
<210> 8
<211> 19
<212> DNA
<213> Artificial sequence
<400> 8
gtagcgattc cggcactgt 19
<210> 9
<211> 20
<212> DNA
<213> Artificial sequence
<400> 9
acctgaggca gctcttttgc 20
<210> 10
<211> 25
<212> DNA
<213> Artificial sequence
<400> 10
ggatgtgtgt gctccttcta aagaa 25
<210> 11
<211> 23
<212> DNA
<213> Artificial sequence
<400> 11
tttctctcaa aggactacga gat 23
<210> 12
<211> 18
<212> DNA
<213> Artificial sequence
<400> 12
agcagccctc aggtcaaa 18
<210> 13
<211> 20
<212> DNA
<213> Artificial sequence
<400> 13
cagttctacc agcggcagta 20
<210> 14
<211> 20
<212> DNA
<213> Artificial sequence
<400> 14
ggtagcgggt gtgaaaatca 20
<210> 15
<211> 19
<212> DNA
<213> Artificial sequence
<400> 15
ccgaactctg ggccataca 19
<210> 16
<211> 22
<212> DNA
<213> Artificial sequence
<400> 16
ttgcaggatg ctcacacatc tc 22
<210> 17
<211> 25
<212> DNA
<213> Artificial sequence
<400> 17
ctcttttgtg agtgagtgat tggaa 25
<210> 18
<211> 21
<212> DNA
<213> Artificial sequence
<400> 18
ccctgcattg tccattctgt t 21
<210> 19
<211> 26
<212> DNA
<213> Artificial sequence
<400> 19
aactcctgta gccgaatcta ccgctc 26
<210> 20
<211> 22
<212> DNA
<213> Artificial sequence
<400> 20
tagccacccc caccccactt gc 22
<210> 21
<211> 28
<212> DNA
<213> Artificial sequence
<400> 21
tcaaaatcct ctggcctctc ctacgaac 28
<210> 22
<211> 22
<212> DNA
<213> Artificial sequence
<400> 22
cccctaccct aacggcaagc ca 22
<210> 23
<211> 22
<212> DNA
<213> Artificial sequence
<400> 23
caaacggcca gagcacctgc gt 22
<210> 24
<211> 29
<212> DNA
<213> Artificial sequence
<400> 24
actcctgttg ccagaatcag actgcccta 29
<210> 25
<211> 24
<212> DNA
<213> Artificial sequence
<400> 25
accatgagaa gtatgacaac agcc 24
<210> 26
<211> 23
<212> DNA
<213> Artificial sequence
<400> 26
cacgatacca aagttgtcat gga 23
<210> 27
<211> 23
<212> DNA
<213> Artificial sequence
<400> 27
tcagcaatgc ctcctgcacc acc 23
<210> 28
<211> 1421
<212> DNA
<213> Homo sapiens
<400> 28
gcctcaagac cttgggctgg gactggctga gcctggcggg aggcggggtc cgagtcaccg 60
cctgccgccg cgcccccggt ttctataaat tgagcccgca gcctcccgct tcgctctctg 120
ctcctcctgt tcgacagtca gccgcatctt cttttgcgtc gccagccgag ccacatcgct 180
cagacaccat ggggaaggtg aaggtcggag tcaacggatt tggtcgtatt gggcgcctgg 240
tcaccagggc tgcttttaac tctggtaaag tggatattgt tgccatcaat gaccccttca 300
ttgacctcaa ctacatggtt tacatgttcc aatatgattc cacccatggc aaattccatg 360
gcaccgtcaa ggctgagaac gggaagcttg tcatcaatgg aaatcccatc accatcttcc 420
aggagcgaga tccctccaaa atcaagtggg gcgatgctgg cgctgagtac gtcgtggagt 480
ccactggcgt cttcaccacc atggagaagg ctggggctca tttgcagggg ggagccaaaa 540
gggtcatcat ctctgccccc tctgctgatg cccccatgtt cgtcatgggt gtgaaccatg 600
agaagtatga caacagcctc aagatcatca gcaatgcctc ctgcaccacc aactgcttag 660
cacccctggc caaggtcatc catgacaact ttggtatcgt ggaaggactc atgaccacag 720
tccatgccat cactgccacc cagaagactg tggatggccc ctccgggaaa ctgtggcgtg 780
atggccgcgg ggctctccag aacatcatcc ctgcctctac tggcgctgcc aaggctgtgg 840
gcaaggtcat ccctgagctg aacgggaagc tcactggcat ggccttccgt gtccccactg 900
ccaacgtgtc agtggtggac ctgacctgcc gtctagaaaa acctgccaaa tatgatgaca 960
tcaagaaggt ggtgaagcag gcgtcggagg gccccctcaa gggcatcctg ggctacactg 1020
agcaccaggt ggtctcctct gacttcaaca gcgacaccca ctcctccacc tttgacgctg 1080
gggctggcat tgccctcaac gaccactttg tcaagctcat ttcctggtat gacaacgaat 1140
ttggctacag caacagggtg gtggacctca tggcccacat ggcctccaag gagtaagacc 1200
cctggaccac cagccccagc aagagcacaa gaggaagaga gagaccctca ctgctgggga 1260
gtccctgcca cactcagtcc cccaccacac tgaatctccc ctcctcacag ttgccatgta 1320
gaccccttga agaggggagg ggcctaggga gccgcacctt gtcatgtacc atcaataaag 1380
taccctgtgc tcaaccagtt aaaaaaaaaa aaaaaaaaaa a 1421

Claims (7)

1. A group of liver cancer gene markers, characterized in that: the group consists of an EP400 gene, a MAPK1IP1L gene, a NUFIP2 gene, a PHC3 gene, an RPS6KB1 gene and an STX7 gene.
2. The group of liver cancer gene markers according to claim 1, characterized in that: the EP400 gene is the polynucleotide sequence shown in SEQ ID NO: 1 of the sequence listing, the MAPK1IP1L gene is the polynucleotide sequence shown in SEQ ID NO: 2, the NUFIP2 gene is the polynucleotide sequence shown in SEQ ID NO: 3, the PHC3 gene is the polynucleotide sequence shown in SEQ ID NO: 4, the RPS6KB1 gene is the polynucleotide sequence shown in SEQ ID NO: 5, and the STX7 gene is the polynucleotide sequence shown in SEQ ID NO: 6.
3. Use of a reagent for detecting the expression level of the liver cancer gene markers according to claim 1 or 2 in the preparation of a product for detecting liver cancer.
4. Use according to claim 3, characterized in that: the product for detecting liver cancer is a real-time quantitative PCR kit, an RNA sequencing kit or a gene chip.
5. Use according to claim 3, characterized in that: the reagent for detecting the liver cancer gene marker is a primer and/or a probe for detecting the liver cancer gene marker.
6. Use according to claim 5, characterized in that: the primer is a primer that specifically amplifies the gene sequences of SEQ ID NO: 1 to SEQ ID NO: 6; the probe is a probe capable of matching SEQ ID NO: 1 to SEQ ID NO: 6.
7. Use according to claim 6, characterized in that: the primers are the sequences shown in SEQ ID NO: 7 to SEQ ID NO: 18 of the sequence listing; the probes are the sequences shown in SEQ ID NO: 19 to SEQ ID NO: 24 of the sequence listing.
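The relationship between the primers, the probes and the marker sequences in claims 6 and 7 can be checked directly against the sequence listing. The sketch below is illustrative only and is not part of the patent: the function names and the returned dictionary layout are hypothetical. It uses positions 2521 to 2640 of SEQ ID NO: 6 (STX7) as printed in the listing, together with SEQ ID NO: 17, SEQ ID NO: 18 and SEQ ID NO: 24, which appear to form the forward primer, reverse primer and probe for that region. The pairing is inferred from the sequences themselves, not stated in this section.

```python
# Illustrative check, not the patent's method: locate a primer pair and a
# probe on a template by exact string matching. All names are hypothetical.

# Positions 2521-2640 of SEQ ID NO: 6 (STX7), copied from the sequence listing.
STX7_FRAGMENT = (
    "acaaaaggga ttacagaaaa ctcttttgtg agtgagtgat tggaacttag agactcctgt"
    "tgccagaatc agactgccct agaacagaat ggacaatgca gggaggagaa ttcacacaaa"
)
SEQ17 = "ctcttttgtg agtgagtgat tggaa"         # SEQ ID NO: 17
SEQ18 = "ccctgcattg tccattctgt t"             # SEQ ID NO: 18
SEQ24 = "actcctgttg ccagaatcag actgcccta"     # SEQ ID NO: 24


def clean(seq: str) -> str:
    """Remove the spacing used in sequence listings and lower-case the bases."""
    return "".join(seq.split()).lower()


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (a, c, g, t only)."""
    pairs = {"a": "t", "t": "a", "c": "g", "g": "c"}
    return "".join(pairs[base] for base in reversed(clean(seq)))


def locate_amplicon(template: str, forward: str, reverse: str, probe: str):
    """Find the region bounded by the two primers and test for the probe.

    Returns None when either primer has no exact match on the given strand
    or when the reverse primer site lies upstream of the forward primer.
    """
    template = clean(template)
    start = template.find(clean(forward))
    rev_site = reverse_complement(reverse)   # reverse primer binds the other strand
    rev_start = template.find(rev_site)
    if start < 0 or rev_start < start:
        return None
    end = rev_start + len(rev_site)
    amplicon = template[start:end]
    return {
        "start": start,                       # 0-based offset in the template
        "end": end,                           # exclusive end offset
        "length": end - start,
        "probe_in_amplicon": clean(probe) in amplicon,
    }


if __name__ == "__main__":
    print(locate_amplicon(STX7_FRAGMENT, SEQ17, SEQ18, SEQ24))
```

Exact matching is a deliberate simplification: it confirms that the listed oligonucleotides occur in the listed target, but says nothing about melting temperature, secondary structure or off-target binding, which a real assay design would also need to consider.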

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710710566.7A CN109423515B (en) 2017-08-18 2017-08-18 Gene markers for liver cancer detection and application thereof

Publications (2)

Publication Number Publication Date
CN109423515A CN109423515A (en) 2019-03-05
CN109423515B true CN109423515B (en) 2022-04-19

Family

ID=65497571

Country Status (1)

Country Link
CN (1) CN109423515B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110241220B (en) * 2019-07-31 2022-11-01 青岛解码医学检验有限公司 Peripheral blood transcriptional gene marker for breast cancer detection and application thereof
CN110904225B (en) * 2019-11-19 2022-04-12 中国医学科学院肿瘤医院 Combined marker for liver cancer detection and application thereof
CN111413498B (en) * 2020-04-08 2023-08-04 复旦大学附属中山医院 Autoantibody 7-AAb detection panel for liver cell liver cancer and application thereof
CN112626198A (en) * 2020-12-25 2021-04-09 杭州师范大学附属医院 Molecular marker for liver disease severe treatment and application thereof
CN113555118B (en) * 2021-07-26 2023-03-31 内蒙古自治区人民医院 Method and device for predicting disease degree, electronic equipment and storage medium
CN113999914A (en) * 2021-11-30 2022-02-01 杭州翱锐基因科技有限公司 Novel combined marker for early detection of multi-target hepatocellular carcinoma and application thereof

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140045915A1 (en) * 2010-08-31 2014-02-13 The General Hospital Corporation Cancer-related biological materials in microvesicles
CN105219844B (en) * 2015-06-08 2018-12-14 华夏京都医疗投资管理有限公司 Gene marker combination, kit and the disease risks prediction model of a kind of a kind of disease of screening ten

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant