CN112760371A

CN112760371A - Primer, kit and analysis method for detecting MUC1 gene mutation

Info

Publication number: CN112760371A
Application number: CN202110256784.4A
Authority: CN
Inventors: 赵明珠; 李红梅
Original assignee: Shanghai Jiaotong University
Current assignee: Shanghai Jiaotong University
Priority date: 2021-03-09
Filing date: 2021-03-09
Publication date: 2021-05-07

Abstract

The invention relates to a primer, a kit and an analysis method for detecting MUC1 gene mutation. The PCR primers for detecting MUC1 gene mutation are selected from one or more of the primer pairs F1/R1 to F16/R16, whose sequences are shown as SEQ ID NO. 1 to SEQ ID NO. 32, respectively. Compared with the prior art, the invention provides an experimental method and an analytical method for MUC1 gene mutation detection, solves the problem of screening for and diagnosing ADTKD caused by MUC1 mutation, and is of great significance for diagnosing patients with a clinical diagnosis of ADTKD and for mutation screening and therapeutic intervention in their family members. The invention comprises an effective PCR amplification system and an effective library construction system; used together, the two systems reduce the experimental cost of MUC1 mutation detection, and the method is applicable to a wide range of instrument models, including the Sequel and Sequel II/IIe.

Description

Primer, kit and analysis method for detecting MUC1 gene mutation

Technical Field

The invention belongs to the technical field of biological medicines, and particularly relates to a primer, a kit and an analysis method for detecting MUC1 gene mutation.

Background

Autosomal dominant tubulointerstitial kidney disease (ADTKD) is a group of hereditary interstitial kidney diseases that cluster in families. Patients eventually progress to end-stage renal disease (ESRD), which imposes a heavy burden on individuals, families, and society.

In 2015, Kidney Disease: Improving Global Outcomes (KDIGO) reached a consensus on the nomenclature, clinical manifestations, renal pathology, diagnosis and treatment of autosomal dominant tubulointerstitial kidney disease. Five causative genes of ADTKD (UMOD, MUC1, HNF1B, REN, SEC61A1) have been identified, each corresponding to a genetic subtype.

Patients with a family history of chronic kidney disease, particularly those with hyperuricemia, need vigilant early screening for ADTKD to improve prognosis. B-mode ultrasound shows no typical features in ADTKD patients, which makes clinical diagnosis very difficult. Detection of gene mutations is the only method for the definitive diagnosis of ADTKD and its subtypes. Gene mutation detection is important for confirming the diagnosis and guiding the treatment of patients with suspected ADTKD, for the preoperative evaluation of family members with normal renal function who intend to donate a kidney, for the early screening of healthy people at risk of ADTKD, and for genetic diagnosis before embryo implantation and prenatal and postnatal care.

At present, whole-exome sequencing (WES), a second-generation gene sequencing technique, is an important tool for the clinical diagnosis of ADTKD. However, this method can only detect mutations in four genes (UMOD, HNF1B, REN and SEC61A1) and cannot detect mutations in the MUC1 gene. Previous studies showed that ADTKD-UMOD, with a prevalence of 37.1%, is the most common subtype of ADTKD, and that ADTKD-MUC1, with a prevalence of 21%, is the second most common; the inability to detect MUC1 mutations therefore leads to many missed diagnoses.

The MUC1 gene maps to human chromosome 1q22 and contains 7 exons; exon 2 contains a variable number of tandem repeats (VNTRs). The encoded product, MUC1 mucin, is a typical type I transmembrane mucin consisting of a polypeptide backbone and glycosyl side chains attached by O-glycosidic bonds. Under normal conditions the MUC1 gene is expressed in various tissues; in the kidney it is widely distributed in the distal renal tubules and plays an important role in maintaining the tubular lumen. If a cytosine deoxynucleotide is inserted into the VNTR domain of MUC1, it causes a frameshift, generating a new peptide chain that cannot fold into a normally functioning protein and accumulates in the tubular epithelial cells, eventually leading to renal tubular dysfunction and necrosis. Many studies report four common types of mutation in the tandem repeat region of exon 2 of the MUC1 gene: 28dupA, 26_27insG, 23delinsAT and 27dupC, of which 27dupC (accounting for 90% of mutations) is the main type. In addition, MUC1 mucin is also present in epithelial tissues and organs such as the breast, pancreas and ovary. In pathology, it is expressed in almost all pancreatic ductal carcinomas and cholangiocarcinomas and in some ampullary carcinomas, and can be used for the differential diagnosis of pancreatic ductal carcinoma, cholangiocarcinoma, gastric adenocarcinoma and the like (positive in 20% of cases); it is positively expressed in more than 90% of breast cancers and can be used for the differential diagnosis of breast cancer and metastatic gastric cancer. It is abnormally highly expressed on the surface of cancerous epithelial cells in many types of solid tumors and in leukemia, with corresponding structural changes, so that it becomes a target of the immune response and plays an important role in the development, metastasis, prognosis and immunotherapy of breast cancer.

Earlier methods for detecting mutations in the MUC1 gene include mass spectrometry and immunohistochemistry. The current mass spectrometry method uses a MALDI-TOF mass spectrometer and identifies the 27dupC mutation by determining the relative molecular mass of oligonucleotides after enzymatic digestion. This method can only detect known, specific mutation sites, is limited by the existing reference library, has poor sensitivity and requires large amounts of starting DNA. Immunohistochemistry relies on staining tissue smears for the unique MUC1 frameshift mutant protein that is aberrantly expressed. Renal biopsy is difficult to perform in patients with poor renal function, so samples are not easy to obtain. At present, neither method can detect all mutations of the MUC1 gene or identify exactly which VNTR repeat unit carries the mutation; both are indirect detection methods for MUC1 gene mutation, with low accuracy and poor sensitivity.

The KDIGO expert consensus holds that gene mutation detection based on gene sequencing is a direct detection method for ADTKD pathogenic genes and the only method for the definitive diagnosis of ADTKD and its subtypes. Because of the particular structure of the MUC1 gene and the limitations of second-generation WES, regions with complex sequence structure, such as high-GC/AT regions and regions with many repeated sequences, suffer from low amplification efficiency, poor amplification specificity and difficulty in cloning the amplification products. Moreover, for regions containing many repetitive sequences, even if sequencing data are obtained, the genomic DNA is randomly fragmented into pieces of about 300 bp before sequencing, so the genomic position of a sequenced fragment cannot be determined when it is mapped to the reference genome; mapping fails and the region cannot be read successfully. Third-generation gene sequencing, namely single-molecule real-time sequencing, can read regions longer than 10 kb, easily spans the VNTR region of the MUC1 gene, obtains the sequence by detecting fluorescent signals, and can in theory resolve tandem repeat regions. However, third-generation sequencing has low single-base accuracy (about 85%), and the difficulty lies in correcting and analyzing the sequencing results and in making accurate mutation calls for clinical use.

Disclosure of Invention

In view of the fact that the prior art lacks an efficient and accurate method for detecting mutations in the VNTR region of the MUC1 gene, the invention provides a primer, a kit and an analysis method for detecting MUC1 gene mutation.

The invention uses single-molecule real-time sequencing to sequence the VNTR region of the MUC1 gene. The experimental problems addressed include the design of PCR primers and the substitution of magnetic beads, enzymes and compatible reaction systems, which reduces the detection cost. The analytical problems addressed include correcting, clustering and analyzing the data generated by single-molecule real-time sequencing, establishing an analysis method, and providing an accurate method for calling MUC1 mutations.

The purpose of the invention can be realized by the following technical scheme:

The invention provides PCR primers for detecting MUC1 gene mutation, selected from one or more of the following 16 primer pairs, whose sequences are as follows:

Primer F1: CACATATCAGAGTGCGGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R1: CACATATCAGAGTGCGGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F2: ACACACAGACTGTGAGGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R2: ACACACAGACTGTGAGGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F3: CACGCACACACGCGCGGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R3: CACGCACACACGCGCGGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F4: ACAGTCGAGCGCTGCGGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R4: ACAGTCGAGCGCTGCGGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F5: ACACACGCGAGACAGAGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R5: ACACACGCGAGACAGAGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F6: ACGCGCTATCTCAGAGGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R6: ACGCGCTATCTCAGAGGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F7: ACACTAGATCGCGTGTGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R7: ACACTAGATCGCGTGTGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F8: CTCACTACGCGCGCGTGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R8: CTCACTACGCGCGCGTGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F9: CGCATGACACGTGTGTGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R9: CGCATGACACGTGTGTGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F10: CATAGAGAGATAGTATGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R10: CATAGAGAGATAGTATGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F11: CACACGCGCGCTATATGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R11: CACACGCGCGCTATATGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F12: TCACGTGCTCACTGTGGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R12: TCACGTGCTCACTGTGGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F13: ACACACTCTATCAGATGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R13: ACACACTCTATCAGATGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F14: CACGACACGACGATGTGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R14: CACGACACGACGATGTGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F15: CTATACATAGTGATGTGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R15: CTATACATAGTGATGTGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F16: CACTCACGTGTGATATGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R16: CACTCACGTGTGATATGCCGTTGTGCACCAGAGTAGAAGCTGA

The sequences of the primers are shown as SEQ ID NO. 1 to SEQ ID NO. 32, respectively.

When multiple primer pairs are selected, MUC1 mutation detection can be performed on multiple samples at the same time; for example, when 16 primer pairs are used in the PCR system, 16 samples can be tested simultaneously, with 8 parallel amplifications per sample.

The invention also provides a kit for detecting MUC1 gene mutation, which comprises: the PCR primer for detecting MUC1 gene mutation according to the first aspect of the present invention.

The PCR primers carry a sample label; each primer and its sample label are custom-synthesized together, and a 16 bp sequence is used for each sample label. The splitting efficiency for subreads is generally above 66%, with an accuracy of up to 98%; the splitting efficiency and accuracy for consensus reads are comparable to those of second-generation sequencing.
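The primer layout described above can be checked with a short sketch. It assumes that the 16-base sample label is the 5' prefix of each primer and that the remainder is the gene-specific part; only the first three primer pairs from the list are included, and the helper name `split_primer` is hypothetical.

```python
# Subset of the primers listed above (F1/R1 to F3/R3), copied verbatim.
PRIMERS = {
    "F1": "CACATATCAGAGTGCGGGAGAAAAGGAGACTTCGGCTACCCAG",
    "R1": "CACATATCAGAGTGCGGCCGTTGTGCACCAGAGTAGAAGCTGA",
    "F2": "ACACACAGACTGTGAGGGAGAAAAGGAGACTTCGGCTACCCAG",
    "R2": "ACACACAGACTGTGAGGCCGTTGTGCACCAGAGTAGAAGCTGA",
    "F3": "CACGCACACACGCGCGGGAGAAAAGGAGACTTCGGCTACCCAG",
    "R3": "CACGCACACACGCGCGGCCGTTGTGCACCAGAGTAGAAGCTGA",
}
LABEL_LEN = 16  # sample label length stated in the text


def split_primer(seq, label_len=LABEL_LEN):
    """Return (sample_label, gene_specific_part).

    Assumption: the sample label is the 5' prefix of the primer.
    """
    return seq[:label_len], seq[label_len:]


parts = {name: split_primer(seq) for name, seq in PRIMERS.items()}

# Gene-specific parts shared by all forward and by all reverse primers.
forward_targets = {gene for name, (_, gene) in parts.items() if name.startswith("F")}
reverse_targets = {gene for name, (_, gene) in parts.items() if name.startswith("R")}

# Sample label of each pair, keyed by pair number ("1", "2", "3").
pair_labels = {name[1:]: label for name, (label, _) in parts.items() if name.startswith("F")}

for pair, label in sorted(pair_labels.items()):
    print(f"pair {pair}: label {label}")
```

Under this assumption, the forward primers of the subset share one gene-specific sequence, the reverse primers share another, and the two primers of a pair carry the same label, which is what allows pooled samples to be separated by label after sequencing.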

In one embodiment of the invention, the kit for detecting MUC1 gene mutation further comprises magnetic beads for purification, a DNA repair enzyme, a DNA end-blunting enzyme, a DNA ligase and DNA exonucleases.

The above-mentioned magnetic beads for purification, DNA repair enzyme, DNA end-blunting enzyme, DNA ligase, DNA exonucleases and the like are common raw materials in biotechnology and are commercially available.

In one embodiment of the present invention, the DNA polymerase is a Takara DNA polymerase;

the magnetic beads for purification are AMPure XP Beads from Beckman.

The preparation method of the magnetic beads for purification comprises the following steps:

1) Resuspend the AMPure XP beads, take 500 μl, centrifuge at 16000 rpm for 1 min, place on a magnetic rack, and transfer the supernatant to a new tube for storage.

2) Add 1 ml of water to the beads, resuspend, centrifuge at 16000 rpm for 1 min, place on a magnetic rack, then remove and discard the supernatant; repeat 4-5 times.

3) Add 1 ml of 10 mM Tris-HCl (pH 8.5) to the beads, mix and resuspend, centrifuge at 16000 rpm for 1 min, place on a magnetic rack, then remove and discard the supernatant.

4) Add the supernatant saved in step 1 back to the beads and resuspend. These are the treated beads, which can be stored at 4 °C until use.

The DNA repair enzyme is NEBNext FFPE DNA Repair Mix.

The DNA end-blunting enzyme is NEBNext End Repair Enzyme Mix.

The DNA ligase is T4 DNA ligase from Thermo Fisher, 5 U/μl.

The DNA exonucleases are Exonuclease VII (NEB) and Exonuclease III (NEB).

In one embodiment of the invention, the kit for detecting MUC1 gene mutation further comprises a negative control product and a positive control product.

The negative control is HeLa cell line genomic DNA: HeLa Genomic DNA (NEB).

The positive control is Jurkat cell line genomic DNA: Jurkat Genomic DNA (Thermo Fisher).

The invention also provides a method of using the kit for detecting MUC1 gene mutation. In one embodiment of the invention, the PCR reaction mixture used with the kit is as follows: each reaction mixture is 50 μl and contains 200 ng of genomic DNA template, 10 μl of 5× Reaction Buffer, 4 μl of dNTPs (2.5 mM each), 1 μl of DNA Polymerase, 1 μl of primer F (10 μM, HPLC grade) and 1 μl of primer R (10 μM, HPLC grade), made up to a total volume of 50 μl with ddH₂O.
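The water volume in this recipe depends on the concentration of the genomic DNA stock, which the text does not specify. The following is a minimal sketch of that arithmetic; the stock concentration of 50 ng/μl and the function names are illustrative assumptions, while the component volumes are those given above.

```python
TOTAL_UL = 50.0
TEMPLATE_NG = 200.0

# Fixed component volumes per 50 ul reaction, as given in the text.
FIXED_UL = {
    "5x Reaction Buffer": 10.0,
    "dNTPs (2.5 mM each)": 4.0,
    "DNA Polymerase": 1.0,
    "primer F (10 uM)": 1.0,
    "primer R (10 uM)": 1.0,
}


def reaction_mix(dna_ng_per_ul):
    """Return component volumes (ul) for one reaction.

    dna_ng_per_ul is the concentration of the genomic DNA stock,
    an input that is not given in the text.
    """
    template_ul = TEMPLATE_NG / dna_ng_per_ul
    water_ul = TOTAL_UL - sum(FIXED_UL.values()) - template_ul
    if water_ul < 0:
        raise ValueError("DNA stock is too dilute to fit 200 ng into 50 ul")
    mix = dict(FIXED_UL)
    mix["genomic DNA template"] = template_ul
    mix["ddH2O"] = water_ul
    return mix


def final_concentration(stock_conc, volume_ul, total_ul=TOTAL_UL):
    """Concentration after dilution into the reaction (same unit as the stock)."""
    return stock_conc * volume_ul / total_ul


mix = reaction_mix(dna_ng_per_ul=50.0)  # hypothetical stock concentration
primer_final_uM = final_concentration(10.0, FIXED_UL["primer F (10 uM)"])
dntp_final_mM = final_concentration(2.5, FIXED_UL["dNTPs (2.5 mM each)"])

for component, volume in mix.items():
    print(f"{component}: {volume:.1f} ul")
```

With the stated volumes, each primer ends up at 0.2 μM and each dNTP at 0.2 mM in the final reaction, independent of the DNA stock concentration.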

The invention also provides a method for detecting MUC1 gene mutation, which comprises establishing a library construction system by, in sequence, preparing the linker (adapter) sequence, preparing the ligation system, preparing the enzyme digestion system, pretreating the magnetic beads and establishing the purification system.

In one embodiment of the invention, the method is carried out on the Sequel and Sequel II/IIe models. The detection results for MUC1 gene mutation are consistent, with no difference between models.

The invention also provides an analysis method for judging whether the MUC1 gene is mutated.

The invention performs targeted correction and analysis according to the data output mode and error types of third-generation gene sequencing. The correction and analysis system comprises a data correction module, a data splitting module, a cluster analysis module, a base position recalibration module and a mutation discrimination module. The data correction module takes single-molecule real-time sequencing data (PacBio SMRT data) as input. The method achieves a mutation discrimination accuracy of 99.99%.

The invention relates to an analysis method for judging whether MUC1 gene is mutated or not, which comprises the following steps:

Step 1: correct the data within each single-molecule well, then select and filter the results. Preferably, the minimum predicted accuracy is set to 90%, so that raw data with an accuracy below 0.9 are filtered out.

Step 2: after filtering, split the data by sample, then select and filter the results. Preferably, the minimum label split score is set to 55, and raw data with a split accuracy below 0.75 are filtered out.

Step 3: perform data clustering and statistical analysis based on multiple sequence alignment, and select the preferred clustering results. Multiple sequence alignment is performed on the data retained after splitting, and similar data are clustered. Clustering results with an accuracy above 99.99% are preferred.

Step 4: recalibrate each base position using the raw data to obtain the accuracy at each base position of the clustering results, and select the preferred results. Clustering results with an accuracy above 99.99% are preferred.

Step 5: refine the recalibrated clustering results using the PCR experimental data to obtain an accurate mutation call. Preferably, PCR by-products are filtered out and the clustering results that match the length of the PCR product are retained.
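The thresholds named in the five steps can be read as a filter cascade. The sketch below illustrates only that cascade; it is not the analysis software itself. The record types, field names, toy values, expected product length and length tolerance are all hypothetical, and only the threshold values come from the text.

```python
from dataclasses import dataclass

# Threshold values quoted in steps 1 to 4.
MIN_READ_ACCURACY = 0.90
MIN_SPLIT_SCORE = 55
MIN_SPLIT_ACCURACY = 0.75
MIN_CLUSTER_ACCURACY = 0.9999


@dataclass
class Read:  # hypothetical record for one corrected read
    name: str
    accuracy: float        # predicted accuracy after in-well correction
    split_score: int       # label split score
    split_accuracy: float  # accuracy of the sample assignment


@dataclass
class Cluster:  # hypothetical record for one clustering result
    name: str
    accuracy: float  # accuracy after base position recalibration
    length: int      # consensus length in bp


def filter_reads(reads):
    """Steps 1 and 2: keep reads passing the accuracy and split thresholds."""
    return [
        r for r in reads
        if r.accuracy >= MIN_READ_ACCURACY
        and r.split_score >= MIN_SPLIT_SCORE
        and r.split_accuracy >= MIN_SPLIT_ACCURACY
    ]


def filter_clusters(clusters, product_length, tolerance):
    """Steps 3 to 5: keep accurate clusters whose length matches the PCR product."""
    return [
        c for c in clusters
        if c.accuracy > MIN_CLUSTER_ACCURACY
        and abs(c.length - product_length) <= tolerance
    ]


# Toy data, invented for illustration only.
reads = [
    Read("r1", 0.99, 80, 0.95),
    Read("r2", 0.85, 90, 0.99),  # fails step 1
    Read("r3", 0.95, 40, 0.90),  # fails split score
    Read("r4", 0.93, 60, 0.70),  # fails split accuracy
]
clusters = [
    Cluster("c1", 0.99995, 3000),
    Cluster("c2", 0.99995, 1200),  # length does not match: treated as by-product
    Cluster("c3", 0.99900, 3010),  # accuracy too low
]

kept_reads = [r.name for r in filter_reads(reads)]
kept_clusters = [c.name for c in filter_clusters(clusters, product_length=3000, tolerance=50)]
print(kept_reads, kept_clusters)
```

The length tolerance is a design choice that the text leaves open; in practice it would be set from the band observed on the gel.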

The method provided by the invention can solve the problems in the prior art, improves the applicability of an experimental method and an analysis method, and reduces the detection cost.

Compared with the prior art, the beneficial effects of the invention are embodied in the following aspects:

(1) The invention provides an experimental method and an analytical method for MUC1 gene mutation detection, solves the problem of screening for and diagnosing ADTKD caused by MUC1 mutation, and is of great significance for diagnosing patients with a clinical diagnosis of ADTKD and for mutation screening and therapeutic intervention in their family members.

(2) The invention comprises an effective PCR amplification system and an effective library construction system; used together, the two systems reduce the experimental cost of MUC1 mutation detection, and the method is applicable to a wide range of instrument models, including the Sequel and Sequel II/IIe.

(3) The invention comprises an optimized detection system that requires a total blood volume of about 200 μl; the average yield of extractable cellular DNA is about 4-10 μg, which is sufficient for MUC1 gene mutation detection.

(4) The invention provides an optimized correction and analysis method that gives accurate MUC1 gene mutation calls with an accuracy of 99.99%, solves the problem that the raw data of third-generation single-molecule real-time sequencing have low accuracy and lead to inaccurate mutation calls, and provides a more reliable conclusion for clinical use.

Drawings

FIG. 1 is a gel electrophoresis image of the PCR products from HeLa and Jurkat genomic DNA.

Detailed Description

In one embodiment, the invention provides a method for detecting MUC1 gene mutation, which comprises establishing a library construction system by, in sequence, preparing the linker (adapter) sequence, preparing the ligation system, preparing the enzyme digestion system, pretreating the magnetic beads and establishing the purification system.

Preparation of the linker sequence: synthesize the adapter sequence specific to third-generation sequencing (HPLC grade), prepare a 10 mM Tris-HCl (pH 7.5) solution and dissolve the primer to 170 μM; prepare a 200 mM Tris-HCl (pH 7.5) solution and a 2 M NaCl solution and mix them 1:1 to obtain a stock solution; take 10 μl of primer and 10 μl of stock solution and add 80 μl of sterile amplification-grade water to dilute the primer to 17 μM, vortex to mix and centrifuge briefly; place the diluted primer in a PCR instrument and run a program of 95 °C for 5 min; when the run is finished, centrifuge the primer briefly and let it stand at room temperature (20-25 °C) for more than 12 hours; store at -20 °C until use.
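The dilution in this step follows the usual relation C1·V1 = C2·V2. As a quick check of the stated 17 μM, the sketch below recomputes the final concentrations from the volumes given above; the helper name `dilute` is hypothetical, and the Tris and NaCl figures are derived values that the text does not state explicitly.

```python
def dilute(stock_conc, stock_ul, total_ul):
    """Concentration after diluting stock_ul of stock into total_ul (same unit)."""
    return stock_conc * stock_ul / total_ul


# Volumes given in the text.
PRIMER_UL, STOCK_UL, WATER_UL = 10.0, 10.0, 80.0
total_ul = PRIMER_UL + STOCK_UL + WATER_UL

# Adapter: 170 uM dissolved primer diluted tenfold.
adapter_uM = dilute(170.0, PRIMER_UL, total_ul)

# Stock solution: 200 mM Tris-HCl and 2 M NaCl mixed 1:1 halves both.
stock_tris_mM = dilute(200.0, 1.0, 2.0)
stock_nacl_mM = dilute(2000.0, 1.0, 2.0)

# Final buffer: contribution of the stock solution plus the 10 mM Tris-HCl
# in which the primer was dissolved.
final_nacl_mM = dilute(stock_nacl_mM, STOCK_UL, total_ul)
final_tris_mM = dilute(stock_tris_mM, STOCK_UL, total_ul) + dilute(10.0, PRIMER_UL, total_ul)

print(f"adapter: {adapter_uM:.1f} uM")
print(f"NaCl: {final_nacl_mM:.0f} mM, Tris-HCl: {final_tris_mM:.0f} mM")
```

The adapter therefore ends up at 17 μM in roughly 100 mM NaCl and 11 mM Tris-HCl before the 95 °C step.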

Preparation of the ligation system: 6 μl of T4 DNA ligase (Thermo Fisher), 4 μl of 10X T4 DNA ligase buffer (Thermo Fisher), 1 μl of the prepared linker, the PCR product and ddH₂O, in a total volume of 40 μl; the reaction is run at 22 °C for at least 10 h (heated lid at 25 °C).

Preparation of the enzyme digestion system: 0.5 μl of Exonuclease VII and 0.5 μl of Exonuclease III; the reaction is run at 37 °C for 1 h.

Pretreatment of the magnetic beads: AMPure XP magnetic beads (Beckman) are rinsed 4-5 times with an equal volume of ddH₂O and then stored in the original stock solution.

Establishing the purification system: prepare a 1.0%-1.2% agarose gel and check whether the PCR product band is correct; purify with 0.8X magnetic beads; measure the concentration with a Qubit 4.0, after which the product is ready for use.

The invention also provides an analysis method for judging whether the MUC1 gene is mutated or not, which comprises the following steps:

The invention is described in detail below with reference to the figures and specific embodiments.

Example 1

This embodiment provides a kit for detecting MUC1 gene mutation, which comprises: PCR primers for detecting MUC1 gene mutation, magnetic beads for purification, a DNA repair enzyme, a DNA end-blunting enzyme, a DNA ligase, DNA exonucleases, a negative control and a positive control.

In this example, a Takara DNA polymerase was used as the DNA amplification enzyme.

The DNA repair enzyme is NEBNext FFPE DNA Repair Mix.

The DNA end-blunting enzyme is NEBNext End Repair Enzyme Mix.

The DNA ligase is T4 DNA ligase from Thermo Fisher, 5 U/μl.

The PCR primers for detecting MUC1 gene mutation are selected from one or more of the following 16 primer pairs, whose sequences are as follows:

Primer F1: CACATATCAGAGTGCGGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R1: CACATATCAGAGTGCGGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F2: ACACACAGACTGTGAGGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R2: ACACACAGACTGTGAGGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F3: CACGCACACACGCGCGGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R3: CACGCACACACGCGCGGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F4: ACAGTCGAGCGCTGCGGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R4: ACAGTCGAGCGCTGCGGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F5: ACACACGCGAGACAGAGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R5: ACACACGCGAGACAGAGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F6: ACGCGCTATCTCAGAGGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R6: ACGCGCTATCTCAGAGGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F7: ACACTAGATCGCGTGTGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R7: ACACTAGATCGCGTGTGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F8: CTCACTACGCGCGCGTGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R8: CTCACTACGCGCGCGTGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F9: CGCATGACACGTGTGTGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R9: CGCATGACACGTGTGTGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F10: CATAGAGAGATAGTATGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R10: CATAGAGAGATAGTATGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F11: CACACGCGCGCTATATGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R11: CACACGCGCGCTATATGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F12: TCACGTGCTCACTGTGGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R12: TCACGTGCTCACTGTGGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F13: ACACACTCTATCAGATGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R13: ACACACTCTATCAGATGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F14: CACGACACGACGATGTGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R14: CACGACACGACGATGTGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F15: CTATACATAGTGATGTGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R15: CTATACATAGTGATGTGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F16: CACTCACGTGTGATATGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R16: CACTCACGTGTGATATGCCGTTGTGCACCAGAGTAGAAGCTGA

When the kit for detecting MUC1 gene mutation provided by this embodiment is used, the PCR reaction mixture is as follows: each reaction mixture is 50 μl and contains 200 ng of genomic DNA template, 10 μl of 5× Reaction Buffer, 4 μl of dNTPs (2.5 mM each), 1 μl of DNA Polymerase, 1 μl of primer F (10 μM, HPLC grade) and 1 μl of primer R (10 μM, HPLC grade), made up to a total volume of 50 μl with ddH₂O.

Example 2

This example establishes an amplification system based on the kit for detecting MUC1 gene mutation provided in Example 1.

Specifically, when the kit for detecting MUC1 gene mutation is used, the PCR reaction mixture is as follows: each reaction mixture is 50 μl and contains 200 ng of genomic DNA template, 10 μl of 5× Reaction Buffer, 4 μl of dNTPs (2.5 mM each), 1 μl of DNA Polymerase, 1 μl of primer F (10 μM, HPLC grade) and 1 μl of primer R (10 μM, HPLC grade), made up to a total volume of 50 μl with ddH₂O.

The PCR thermal cycling conditions were: heated lid preheated to 105 °C; hold at 98 °C for 5 min; then 30 cycles under two-step thermal cycling conditions: denaturation at 98 °C for 10 s (cycles 1 to 30), followed by combined annealing and extension at 74 °C for 4 min (cycles 1 to 5), 72 °C for 4 min (cycles 6 to 10), 70 °C for 4 min (cycles 11 to 15) and 68 °C for 4 min (cycles 16 to 30). The final extension was held at 68 °C for 10 min, followed by a hold at 4 °C.

Namely, the amplification reaction conditions were as follows:

98 °C, 5 min;

98 °C for 10 s, 74 °C for 4 min; 5 cycles;

98 °C for 10 s, 72 °C for 4 min; 5 cycles;

98 °C for 10 s, 70 °C for 4 min; 5 cycles;

98 °C for 10 s, 68 °C for 4 min; 5 cycles;

68 °C, 10 min;

4 °C, ∞.
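The stepped-down annealing temperatures can be written out as an explicit step list, which makes the cycle count and run time easy to check. The sketch below follows the cycle ranges given in the prose description of this example (cycles 1 to 5, 6 to 10, 11 to 15 and 16 to 30), so the 68 °C stage is modelled as 15 cycles; that choice is an assumption, as are the function and variable names. Ramp times and the final 4 °C hold are not included.

```python
# (annealing/extension temperature in degC, number of cycles), per the prose description.
STAGES = [(74, 5), (72, 5), (70, 5), (68, 15)]

DENATURE_C, DENATURE_S = 98, 10
EXTEND_S = 4 * 60


def build_program(stages=STAGES):
    """Return the program as a list of (label, temperature_C, seconds) steps."""
    steps = [("initial denaturation", 98, 5 * 60)]
    cycle = 0
    for anneal_c, n_cycles in stages:
        for _ in range(n_cycles):
            cycle += 1
            steps.append((f"cycle {cycle} denaturation", DENATURE_C, DENATURE_S))
            steps.append((f"cycle {cycle} annealing/extension", anneal_c, EXTEND_S))
    steps.append(("final extension", 68, 10 * 60))
    return steps


program = build_program()
total_cycles = sum(n for _, n in STAGES)
hold_minutes = sum(seconds for _, _, seconds in program) / 60

print(f"{total_cycles} cycles, {hold_minutes:.0f} min of programmed hold time")
```

Passing a different `stages` list to `build_program` gives the corresponding program if other cycle counts are used.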

The amplified products are analyzed by 1.0%-1.2% agarose gel electrophoresis to check the fragment size.

Establishing the purification system: prepare a 1.0%-1.2% agarose gel and check whether the PCR product band is correct; purify with 0.8X magnetic beads (the magnetic beads for purification prepared as in Example 3); measure the concentration with a Qubit 4.0, after which the product is ready for use.

Example 3

Design of magnetic beads for purification, linker sequence and enzyme reaction system

In order to reduce the library construction cost, commercially available magnetic beads and enzymes are used, the reaction systems are optimized, and library construction is then completed. In this example, the design of the magnetic beads for purification, the linker sequence and the enzyme reaction systems covers the preparation of the magnetic beads for purification, the design and preparation of the linker sequence, the repair enzyme reaction system, the ligase reaction system, the exonuclease reaction system, and the design of suitable reaction conditions.

Preparation of magnetic beads for purification:

2) Adding 1ml of water into the magnetic beads, resuspending, centrifuging at 16000rpm for 1min, placing on a magnetic frame, taking out the supernatant, and discarding; repeating for 4-5 times.

Preparation of linker sequence:

1) Dissolve the adapter sequence specific to third-generation sequencing to 170 μM in 10 mM Tris-HCl (pH 7.5); prepare 200 mM Tris-HCl (pH 7.5) and 2 M NaCl and mix them 1:1 to obtain a stock solution.

2) Take 10 μl of the primer and 10 μl of the stock solution and add 80 μl of water to dilute the primer to 17 μM; heat at 95 °C for 5 min. When the run is finished, remove the tube promptly and let it stand at room temperature (20-25 °C) for more than 12 hours; store at -20 °C until use.

Repair enzyme reaction system:

1) 53.5 μl of DNA sample, 6.5 μl of FFPE DNA repair buffer (10X) and 2 μl of NEBNext FFPE DNA repair enzyme; 20 °C for 15 min.

2) 62 μl of repaired DNA sample, 3.5 μl of FFPE DNA end-blunting repair buffer (10X), 5 μl of NEBNext FFPE end-blunting enzyme and 29.5 μl of ddH₂O; 20 °C for 30 min.

3) Purify with 0.8X magnetic beads.

Ligase reaction system:

1) 23 μl of end-blunted and repaired DNA sample, 1 μl of adapter sequence (17 μM), 2 μl of 10X T4 DNA ligase buffer and 6 μl of T4 DNA ligase (5 U/μl); ligate overnight at 22 °C.

2) Inactivate at 65 °C for 10 min.

Exonuclease reaction system:

1) 40 μl of adapter-ligated DNA sample, 0.5 μl of Exonuclease VII and 0.5 μl of Exonuclease III; 37 °C for 1 h.

2) Purify with 0.8X magnetic beads.
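The bead volumes for the two purification steps in this example can be worked out from the reaction volumes. The sketch assumes that "0.8X" means 0.8 volumes of bead suspension per volume of sample, which is the usual convention for this kind of bead cleanup but is not spelled out in the text, and that the whole reaction is purified. The function name is hypothetical.

```python
BEAD_RATIO = 0.8  # "0.8X", read as bead volume / sample volume (assumption)


def bead_volume_ul(sample_ul, ratio=BEAD_RATIO):
    """Volume of bead suspension to add to a sample of the given volume."""
    return sample_ul * ratio


# Reaction volumes summed from the component lists of this example.
end_blunting_ul = 62 + 3.5 + 5 + 29.5  # repaired DNA, buffer, enzyme, ddH2O
exonuclease_ul = 40 + 0.5 + 0.5        # ligated DNA, Exonuclease VII, Exonuclease III

beads_after_blunting = bead_volume_ul(end_blunting_ul)
beads_after_exonuclease = bead_volume_ul(exonuclease_ul)

print(f"after end blunting: {beads_after_blunting:.1f} ul of beads")
print(f"after exonuclease digestion: {beads_after_exonuclease:.1f} ul of beads")
```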

Example 4

This example provides a method for interpreting MUC1 mutation detection results.

The interpretation of the MUC1 mutation detection results described in this example is based on a self-developed data analysis method that is applicable to data generated by third-generation single-molecule real-time sequencing of long-fragment PCR amplification products (PacBio SMRT data). According to the characteristics of the data, the correction and analysis system comprises a data correction module, a data splitting module, a cluster analysis module, a base position recalibration module and a mutation discrimination module. The method achieves a mutation discrimination accuracy of 99.99%. The MUC1 mutation detection result is then interpreted from the experimental results.

The data correction module is implemented as follows: in-well subread correction is performed for each productive well, and raw data with an accuracy below 0.9 are filtered out. Through single-well data correction, each well yields a single high-accuracy read, and wells with low data quality are discarded.

The data splitting module is implemented as follows: using the primers and sample labels as the basis for splitting, the minimum split score is set to 55 and raw data with a split accuracy below 0.75 are filtered out. In this step, data from different samples sequenced in the same batch are separated with high accuracy, and data that cannot be assigned with high accuracy are discarded.

The cluster analysis module is implemented as follows: multiple sequence alignment is performed on the data retained for each sample after splitting, and data clustering and statistical analysis are performed for the different PCR products. Clustering results with an accuracy above 99.99% are preferred. The retained high-quality data are clustered exhaustively so that both dominant and non-dominant PCR amplification products are retained. The clustering results are then analyzed statistically to give the coverage and read length.

The base position recalibration module is implemented as follows: the raw-data wells corresponding to each clustering result are retrieved from the well position information. Each base position of a clustering result is recalibrated using the raw data of at least 3000 subreads to obtain the accuracy at each base position; clustering results with an accuracy above 99.99% are preferred. With strictly applied clustering criteria and base position recalibration, single-base mutations can be detected and analyzed with high accuracy, and occasional base positions of poor quality can be located precisely.
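The idea of a per-position accuracy can be illustrated with a simple column-wise majority vote over already aligned reads. This is a simplified stand-in and not the recalibration method of the invention: the function names are hypothetical, the reads are toy data, and real data would use at least 3000 subreads per clustering result rather than four.

```python
from collections import Counter


def column_consensus(aligned_reads):
    """Return [(consensus_base, agreement_fraction)] for each alignment column.

    Assumes the reads are already aligned and therefore of equal length.
    """
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to the same length")
    result = []
    for column in zip(*aligned_reads):
        base, count = Counter(column).most_common(1)[0]
        result.append((base, count / len(column)))
    return result


def low_confidence_positions(consensus, min_accuracy=0.9999):
    """0-based positions whose agreement falls below the threshold."""
    return [i for i, (_, accuracy) in enumerate(consensus) if accuracy < min_accuracy]


# Toy aligned reads; the third read disagrees at position 2.
reads = ["ACGT", "ACGT", "ACCT", "ACGT"]
consensus = column_consensus(reads)
sequence = "".join(base for base, _ in consensus)
flagged = low_confidence_positions(consensus)

print(sequence, flagged)
```

In this toy case position 2 has an agreement of 0.75 and is flagged, which mirrors how a base position of poor quality would be located.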

The mutation discrimination module is implemented as follows: the recalibrated clustering results are refined using the PCR experimental data to obtain an accurate mutation call. Preferably, PCR by-products are filtered out and the clustering results that match the length of the PCR product are retained.

Example 5: Cell line genomic DNA sample detection results

In this example, HeLa cell line genomic DNA and Jurkat cell line genomic DNA were selected for the experiments. 200 ng of each cell line genomic DNA was placed in an amplification tube, and Master Mix, primers and ddH₂O were added to a total reaction volume of 50 μl. Eight parallel amplification reactions were performed for each cell line genome. The amplification reaction conditions were as follows:

98 ℃ for 5 min;

98 ℃ for 10 s, 74 ℃ for 4 min; 5 cycles;

98 ℃ for 10 s, 72 ℃ for 4 min; 5 cycles;

98 ℃ for 10 s, 70 ℃ for 4 min; 5 cycles;

98 ℃ for 10 s, 68 ℃ for 4 min; 5 cycles;

68 ℃ for 10 min;

4 ℃, hold.
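The cycling conditions above add up as shown in the following sketch. It reads each stage as a two-step cycle (10 s at 98 ℃ followed by 4 min at the stage temperature), which is an interpretation of the listed conditions; the function names are hypothetical, and ramp times and the final 4 ℃ hold are ignored.

```python
def build_program():
    """Return the cycling program as (label, seconds_per_repeat, repeats)."""
    program = [("98 C initial denaturation", 5 * 60, 1)]
    for temp in (74, 72, 70, 68):
        # One cycle = 98 C for 10 s plus 4 min at the stage temperature.
        program.append((f"98 C / {temp} C cycle", 10 + 4 * 60, 5))
    program.append(("68 C final extension", 10 * 60, 1))
    return program


def summarize(program):
    """Total number of amplification cycles and programmed time in minutes."""
    cycles = sum(reps for label, _, reps in program if label.endswith("cycle"))
    minutes = sum(seconds * reps for _, seconds, reps in program) / 60
    return cycles, minutes
```

Under these assumptions the program comprises 20 amplification cycles, with the annealing/extension temperature stepping down by 2 ℃ every 5 cycles.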

The amplification products were run on a 1.0-1.2% agarose gel to check their fragment sizes, as shown in Fig. 1.

The PCR products were purified with 0.8X magnetic beads and their concentration was measured; the results are shown in Table 1. Library construction was then carried out by performing a repair enzyme reaction, a ligase reaction and an exonuclease reaction in sequence, giving a library ready for use.

Repair enzyme reaction system:

3) Purify with 0.8X magnetic beads and measure the concentration; the results are shown in Table 1.

Ligase reaction system:

2) Inactivate at 65 ℃ for 10 min.

Exonuclease reaction system:

2) Purify with 0.8X magnetic beads and measure the concentration; the results are shown in Table 1.

Table 1. Concentration measurements after each purification step

Reaction step	HeLa cell line (ng/μl)	Jurkat cell line (ng/μl)
Amplification product	45.0	18.9
Repair product	22.4	10.8
Enzyme digestion product	9.9	11.4

Sequencing was performed on a PacBio SMRT standard system. Once data had been obtained, it entered the data correction and analysis workflow. First, single-molecule well (hole) correction was performed and raw data with an accuracy lower than 0.9 was filtered out. The data was then split by sample, and raw data with a splitting accuracy lower than 0.75 was filtered out. Data clustering was then performed, with clustering results above 99.99% accuracy preferred. Finally, base position recalibration was carried out using the raw data, again preferring clustering results above 99.99% accuracy.

The data analysis results were combined with the amplification results to obtain an accurate interpretation of MUC1 gene mutations. The MUC1 mutation results for each cell line genome were as follows: the MUC1 gene of HeLa cell line genomic DNA carries no mutation and is used as the negative control in the kit; the VNTR sequence of the MUC1 gene of Jurkat cell line genomic DNA carries the 27dupC mutation and is used as the positive control in the kit.

The specific sequence results are as follows:

The MUC1 gene of HeLa cell line genomic DNA is shown in SEQ ID NO.33, and the MUC1 gene of Jurkat cell line genomic DNA is shown in SEQ ID NO.34.
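Comparing SEQ ID NO.33 (2238 bp) with SEQ ID NO.34 (2239 bp) in the sequence listing, the mutant sequence carries one extra cytosine in a stretch that otherwise contains seven consecutive cytosines. Assuming that this is how the 27dupC mutation appears in the consensus sequence, a simple scan for cytosine runs longer than seven can flag candidate positions. The function name and the run-length parameter are hypothetical, and the scan is a screening aid rather than a replacement for the analysis modules described earlier.

```python
import re


def find_c_insertions(sequence, wild_type_run=7):
    """Return (start, run_length) for every C-run longer than wild_type_run.

    Positions are 0-based. Whitespace and digits (as found in sequence-listing
    lines) are stripped before scanning.
    """
    cleaned = re.sub(r"[^acgtACGT]", "", sequence).lower()
    pattern = re.compile(r"c{%d,}" % (wild_type_run + 1))
    return [(m.start(), len(m.group())) for m in pattern.finditer(cleaned)]
```

Applied to a wild-type 60 bp repeat unit from the listing the function returns an empty list, whereas a unit with an eighth cytosine in the run is reported with its start position.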

Example 6: clinical sample test results

In this example, 7 clinical samples from suspected cases (samples 1 to 7) were selected; whole blood was collected from each suspected patient and leukocyte DNA was extracted for the experiments. The experimental procedure and analysis method were the same as for the cell line genomic DNA described above.

In this example, HeLa cell line genomic DNA and Jurkat cell line genomic DNA were selected for the experiments. 200 ng of each cell line genomic DNA was placed in an amplification tube, and Master Mix, primers and ddH₂O were added to a total reaction volume of 50 μl. Eight parallel amplification reactions were performed for each cell line genome. The amplification reaction conditions were as follows:

98 ℃ for 5 min;

98 ℃ for 10 s, 74 ℃ for 4 min; 5 cycles;

98 ℃ for 10 s, 72 ℃ for 4 min; 5 cycles;

98 ℃ for 10 s, 70 ℃ for 4 min; 5 cycles;

98 ℃ for 10 s, 68 ℃ for 4 min; 5 cycles;

68 ℃ for 10 min;

4 ℃, hold.

The PCR products were purified with 0.8X magnetic beads and their concentration was measured; the results are shown in Table 2. Library construction was then carried out by performing a repair enzyme reaction, a ligase reaction and an exonuclease reaction in sequence, giving a library ready for use.

Repair enzyme reaction system:

3) Purify with 0.8X magnetic beads and measure the concentration; the results are shown in Table 2.

Ligase reaction system:

2) Inactivate at 65 ℃ for 10 min.

Exonuclease reaction system:

2) Purify with 0.8X magnetic beads and measure the concentration; the results are shown in Table 2.

Table 2. Concentration measurements after each purification step

The data analysis results were combined with the amplification results to obtain an accurate interpretation of MUC1 gene mutations. The MUC1 mutation results for each suspected sample genome were as follows: the VNTR sequence of the MUC1 gene in the genomic DNA of samples 1 and 2 is suspected to carry the 27dupC mutation; the MUC1 gene in the genomic DNA of samples 3, 4, 5, 6 and 7 is suspected to carry no mutation.

The specific sequence results are as follows:

The MUC1 gene of the blood leukocyte DNA of suspected sample 1 is shown in SEQ ID NO.35, that of suspected sample 2 in SEQ ID NO.36, that of suspected sample 3 in SEQ ID NO.37, that of suspected sample 4 in SEQ ID NO.38, that of suspected sample 5 in SEQ ID NO.39, that of suspected sample 6 in SEQ ID NO.40, and that of suspected sample 7 in SEQ ID NO.41.

The embodiments above are described so that those skilled in the art can understand and use the invention. It will be readily apparent to those skilled in the art that various modifications may be made to these embodiments, and that the general principles described herein may be applied to other embodiments without inventive effort. The present invention is therefore not limited to the above embodiments, and improvements and modifications made by those skilled in the art on the basis of this disclොsure without departing from the scope of the invention fall within its scope of protection.

Sequence listing

<110> Shanghai Jiaotong University

<120> primer, kit and analysis method for detecting MUC1 gene mutation

<160> 41

<170> SIPOSequenceListing 1.0

<210> 1

<211> 43

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 1

cacatatcag agtgcgggag aaaaggagac ttcggctacc cag 43

<210> 2

<211> 43

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 2

cacatatcag agtgcggccg ttgtgcacca gagtagaagc tga 43

<210> 3

<211> 43

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 3

acacacagac tgtgagggag aaaaggagac ttcggctacc cag 43

<210> 4

<211> 43

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 4

acacacagac tgtgaggccg ttgtgcacca gagtagaagc tga 43

<210> 5

<211> 43

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 5

cacgcacaca cgcgcgggag aaaaggagac ttcggctacc cag 43

<210> 6

<211> 43

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 6

cacgcacaca cgcgcggccg ttgtgcacca gagtagaagc tga 43

<210> 7

<211> 43

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 7

acagtcgagc gctgcgggag aaaaggagac ttcggctacc cag 43

<210> 8

<211> 43

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 8

acagtcgagc gctgcggccg ttgtgcacca gagtagaagc tga 43

<210> 9

<211> 43

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 9

acacacgcga gacagaggag aaaaggagac ttcggctacc cag 43

<210> 10

<211> 43

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 10

acacacgcga gacagagccg ttgtgcacca gagtagaagc tga 43

<210> 11

<211> 43

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 11

acgcgctatc tcagagggag aaaaggagac ttcggctacc cag 43

<210> 12

<211> 43

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 12

acgcgctatc tcagaggccg ttgtgcacca gagtagaagc tga 43

<210> 13

<211> 43

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 13

acactagatc gcgtgtggag aaaaggagac ttcggctacc cag 43

<210> 14

<211> 43

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 14

acactagatc gcgtgtgccg ttgtgcacca gagtagaagc tga 43

<210> 15

<211> 43

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 15

ctcactacgc gcgcgtggag aaaaggagac ttcggctacc cag 43

<210> 16

<211> 43

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 16

ctcactacgc gcgcgtgccg ttgtgcacca gagtagaagc tga 43

<210> 17

<211> 43

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 17

cgcatgacac gtgtgtggag aaaaggagac ttcggctacc cag 43

<210> 18

<211> 43

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 18

cgcatgacac gtgtgtgccg ttgtgcacca gagtagaagc tga 43

<210> 19

<211> 43

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 19

catagagaga tagtatggag aaaaggagac ttcggctacc cag 43

<210> 20

<211> 43

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 20

catagagaga tagtatgccg ttgtgcacca gagtagaagc tga 43

<210> 21

<211> 43

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 21

cacacgcgcg ctatatggag aaaaggagac ttcggctacc cag 43

<210> 22

<211> 43

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 22

cacacgcgcg ctatatgccg ttgtgcacca gagtagaagc tga 43

<210> 23

<211> 43

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 23

tcacgtgctc actgtgggag aaaaggagac ttcggctacc cag 43

<210> 24

<211> 43

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 24

tcacgtgctc actgtggccg ttgtgcacca gagtagaagc tga 43

<210> 25

<211> 43

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 25

acacactcta tcagatggag aaaaggagac ttcggctacc cag 43

<210> 26

<211> 43

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 26

acacactcta tcagatgccg ttgtgcacca gagtagaagc tga 43

<210> 27

<211> 43

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 27

cacgacacga cgatgtggag aaaaggagac ttcggctacc cag 43

<210> 28

<211> 43

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 28

cacgacacga cgatgtgccg ttgtgcacca gagtagaagc tga 43

<210> 29

<211> 43

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 29

ctatacatag tgatgtggag aaaaggagac ttcggctacc cag 43

<210> 30

<211> 43

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 30

ctatacatag tgatgtgccg ttgtgcacca gagtagaagc tga 43

<210> 31

<211> 43

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 31

cactcacgtg tgatatggag aaaaggagac ttcggctacc cag 43

<210> 32

<211> 43

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 32

cactcacgtg tgatatgccg ttgtgcacca gagtagaagc tga 43

<210> 33

<211> 2238

<212> DNA

<213> Homo sapiens

<400> 33

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccaa 60

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 120

gcccacggtg tcacctcggc cccggacacc aggcccgccc cgggctccac cgccccccca 180

gcccacggtg tcacctcggc cccggacacc aggcccgccc cgggctccac cgcgcccgca 240

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccaa 300

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgcccccaca 360

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 420

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 480

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgcgcccgca 540

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgccccccca 600

gcccacggtg tcacctcggc cccggacacc aggcccgccc cgggctccac cgccccccca 660

gcccacggtg tcacctcggc cccggacacc aggcccgccc cgggctccac cgcgcccgca 720

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccaa 780

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 840

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 900

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 960

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgcgcccgca 1020

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgcgcccgca 1080

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgccccccca 1140

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1200

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1260

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1320

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1380

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1440

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1500

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgcgcccgca 1560

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgcgcccgca 1620

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgccccccca 1680

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1740

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1800

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1860

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1920

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1980

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 2040

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgcaccccca 2100

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cccggccccg 2160

ggctccaccg cccccccagc ccacggtgtc acctcggccc cggacaccag gccggccccg 2220

ggctccaccg ccccccca 2238

<210> 34

<211> 2239

<212> DNA

<213> Homo sapiens

<400> 34

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccaa 60

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 120

gcccacggtg tcacctcggc cccggacacc aggcccgccc cgggctccac cgccccccca 180

gcccacggtg tcacctcggc cccggacacc aggcccgccc cgggctccac cgcgcccgca 240

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccaa 300

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgcccccaca 360

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 420

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 480

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgcacccgca 540

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgccccccca 600

gcccacggtg tcacctcggc cccggacacc aggcccgccc cgggctccac cgccccccca 660

gcccacggtg tcacctcggc cccggacacc aggcccgccc cgggctccac cgcgcccgca 720

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccaa 780

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 840

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 900

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 960

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgcgcccgca 1020

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgcgcccgca 1080

ccccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgccccccca 1140

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1200

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1260

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1320

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgcccccccc 1380

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 1440

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 1500

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgcgcccgc 1560

agcccacggt gtcacctcgg ccccggagag caggccggcc ccgggctcca ccgcgcccgc 1620

agcccacggt gtcacctcgg ccccggagag caggccggcc ccgggctcca ccgccccccc 1680

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 1740

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 1800

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 1860

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 1920

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 1980

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 2040

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgcaccccc 2100

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccccggcccc 2160

gggctccacc gcccccccag cccacggtgt cacctcggcc ccggacacca ggccggcccc 2220

gggctccacc gccccccca 2239

<210> 35

<211> 2239

<212> DNA

<213> Homo sapiens

<400> 35

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccaa 60

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 120

gcccacggtg tcacctcggc cccggacacc aggcccgccc cgggctccac cgccccccca 180

gcccacggtg tcacctcggc cccggacacc aggcccgccc cgggctccac cgcgcccgca 240

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccaa 300

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgcccccaca 360

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 420

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 480

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgcgcccgca 540

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgcccccccc 600

agcccacggt gtcacctcgg ccccggacac caggcccgcc ccgggctcca ccgccccccc 660

agcccacggt gtcacctcgg ccccggacac caggcccgcc ccgggctcca ccgcgcccgc 720

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgcccccca 780

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 840

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 900

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 960

agcccacggt gtcacctcgg ccccggagag caggccggcc ccgggctcca ccgcgcccgc 1020

agcccacggt gtcacctcgg ccccggagag caggccggcc ccgggctcca ccgcgcccgc 1080

agcccacggt gtcacctcgg ccccggagag caggccggcc ccgggctcca ccgccccccc 1140

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 1200

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 1260

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 1320

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 1380

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 1440

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 1500

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgcgcccgc 1560

agcccacggt gtcacctcgg ccccggagag caggccggcc ccgggctcca ccgcgcccgc 1620

agcccacggt gtcacctcgg ccccggagag caggccggcc ccgggctcca ccgccccccc 1680

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 1740

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 1800

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 1860

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 1920

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 1980

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 2040

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgcaccccc 2100

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccccggcccc 2160

gggctccacc gcccccccag cccacggtgt cacctcggcc ccggacacca ggccggcccc 2220

gggctccacc gccccccca 2239

<210> 36

<211> 2239

<212> DNA

<213> Homo sapiens

<400> 36

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccaa 60

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 120

gcccacggtg tcacctcggc cccggacacc aggcccgccc cgggctccac cgccccccca 180

gcccacggtg tcacctcggc cccggacacc aggcccgccc cgggctccac cgcgcccgca 240

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccaa 300

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgcccccaca 360

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 420

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 480

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgcgcccgca 540

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgcccccccc 600

agcccacggt gtcacctcgg ccccggacac caggcccgcc ccgggctcca ccgccccccc 660

agcccacggt gtcacctcgg ccccggacac caggcccgcc ccgggctcca ccgcgcccgc 720

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgcccccca 780

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 840

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 900

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 960

agcccacggt gtcacctcgg ccccggagag caggccggcc ccgggctcca ccgcgcccgc 1020

agcccacggt gtcacctcgg ccccggagag caggccggcc ccgggctcca ccgcgcccgc 1080

agcccacggt gtcacctcgg ccccggagag caggccggcc ccgggctcca ccgccccccc 1140

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 1200

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 1260

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 1320

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 1380

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 1440

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 1500

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgcgcccgc 1560

agcccacggt gtcacctcgg ccccggagag caggccggcc ccgggctcca ccgcgcccgc 1620

agcccacggt gtcacctcgg ccccggagag caggccggcc ccgggctcca ccgccccccc 1680

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 1740

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 1800

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 1860

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 1920

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 1980

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgccccccc 2040

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccgcaccccc 2100

agcccacggt gtcacctcgg ccccggacac caggccggcc ccgggctcca ccccggcccc 2160

gggctccacc gcccccccag cccacggtgt cacctcggcc ccggacacca ggccggcccc 2220

gggctccacc gccccccca 2239

<210> 37

<211> 2298

<212> DNA

<213> Homo sapiens

<400> 37

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccaa 60

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 120

gcccacggtg tcacctcggc cccggacacc aggcccgccc cgggctccac cgccccccca 180

gcccacggtg tcacctcggc cccggacacc aggcccgccc cgggctccac cgcgcccgca 240

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccaa 300

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgcccccaca 360

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 420

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 480

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgcgcccgca 540

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgccccccca 600

gcccacggtg tcacctcggc cccggacacc aggcccgccc cgggctccac cgccccccca 660

gcccacggtg tcacctcggc cccggacacc aggcccgccc cgggctccac cgcgcccgca 720

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccaa 780

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 840

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 900

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 960

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgcgcccgca 1020

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgcgcccgca 1080

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgccccccca 1140

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1200

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1260

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1320

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1380

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1440

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1500

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1560

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgcgcccgca 1620

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgcgcccgca 1680

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgccccccca 1740

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1800

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1860

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1920

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1980

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 2040

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 2100

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgcaccccca 2160

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cccggccccg 2220

ggctccaccg cccccccagc ccacggtgtc acctcggccc cggacaccag gccggccccg 2280

ggctccaccg ccccccca 2298

<210> 38

<211> 2178

<212> DNA

<213> Homo sapiens

<400> 38

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccaa 60

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 120

gcccacggtg tcacctcggc cccggacacc aggcccgccc cgggctccac cgccccccca 180

gcccacggtg tcacctcggc cccggacacc aggcccgccc cgggctccac cgcgcccgca 240

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccaa 300

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 360

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 420

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 480

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgcgcccgca 540

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgccccccca 600

gcccacggtg tcacctcggc cccggacacc aggcccgccc cgggctccac cgccccccca 660

gcccacggtg tcacctcggc cccggacacc aggcccgccc cgggctccac cgcgcccgca 720

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccaa 780

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 840

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 900

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 960

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgcgcccgca 1020

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgcgcccgca 1080

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgccccccca 1140

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1200

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1260

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1320

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1380

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1440

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgcgcccgca 1500

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgcgcccgca 1560

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgccccccca 1620

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1680

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1740

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1800

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1860

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1920

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1980

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgcaccccca 2040

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cccggccccg 2100

ggctccaccg cccccccagc ccacggtgtc acctcggccc cggacaccag gccggccccg 2160

ggctccaccg ccccccca 2178

<210> 39

<211> 2238

<212> DNA

<213> Homo sapiens

<400> 39

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccaa 60

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 120

gcccacggtg tcacctcggc cccggacacc aggcccgccc cgggctccac cgccccccca 180

gcccacggtg tcacctcggc cccggacacc aggcccgccc cgggctccac cgcgcccgca 240

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccaa 300

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgcccccaca 360

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 420

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 480

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgcgcccgca 540

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgccccccca 600

gcccacggtg tcacctcggc cccggacacc aggcccgccc cgggctccac cgccccccca 660

gcccacggtg tcacctcggc cccggacacc aggcccgccc cgggctccac cgcgcccgca 720

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccaa 780

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 840

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 900

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 960

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgcgcccgca 1020

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgcgcccgca 1080

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgccccccca 1140

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1200

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1260

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1320

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1380

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1440

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1500

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgcgcccgca 1560

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgcgcccgca 1620

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgccccccca 1680

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1740

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1800

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1860

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1920

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1980

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 2040

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgcaccccca 2100

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cccggccccg 2160

ggctccaccg cccccccagc ccacggtgtc acctcggccc cggacaccag gccggccccg 2220

ggctccaccg ccccccca 2238

<210> 40

<211> 2178

<212> DNA

<213> Homo sapiens

<400> 40

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccaa 60

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 120

gcccacggtg tcacctcggc cccggacacc aggcccgccc cgggctccac cgccccccca 180

gcccacggtg tcacctcggc cccggacacc aggcccgccc cgggctccac cgcgcccgca 240

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccaa 300

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgcccccaca 360

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 420

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 480

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgcgcccgca 540

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgccccccca 600

gcccacggtg tcacctcggc cccggacacc aggcccgccc cgggctccac cgccccccca 660

gcccacggtg tcacctcggc cccggacacc aggcccgccc cgggctccac cgcgcccgca 720

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccaa 780

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 840

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 900

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 960

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgcgcccgca 1020

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgcgcccgca 1080

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgccccccca 1140

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1200

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1260

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1320

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1380

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1440

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgcgcccgca 1500

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgcgcccgca 1560

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgccccccca 1620

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1680

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1740

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1800

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1860

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1920

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1980

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgcaccccca 2040

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cccggccccg 2100

ggctccaccg cccccccagc ccacggtgtc acctcggccc cggacaccag gccggccccg 2160

ggctccaccg ccccccca 2178

<210> 41

<211> 2118

<212> DNA

<213> Homo sapiens

<400> 41

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccaa 60

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 120

gcccacggtg tcacctcggc cccggacacc aggcccgccc cgggctccac cgccccccca 180

gcccacggtg tcacctcggc cccggacacc aggcccgccc cgggctccac cgcgcccgca 240

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccaa 300

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgcccccaca 360

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 420

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 480

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgcgcccgca 540

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgccccccca 600

gcccacggtg tcacctcggc cccggacacc aggcccgccc cgggctccac cgccccccca 660

gcccacggtg tcacctcggc cccggacacc aggcccgccc cgggctccac cgcgcccgca 720

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 780

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 840

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgcgcccgca 900

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgcgcccgca 960

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgccccccca 1020

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1080

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1140

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1200

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1260

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1320

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1380

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgcgcccgca 1440

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgcgcccgca 1500

gcccacggtg tcacctcggc cccggagagc aggccggccc cgggctccac cgccccccca 1560

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1620

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1680

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1740

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1800

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1860

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgccccccca 1920

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cgcaccccca 1980

gcccacggtg tcacctcggc cccggacacc aggccggccc cgggctccac cccggccccg 2040

ggctccaccg cccccccagc ccacggtgtc acctcggccc cggacaccag gccggccccg 2100

ggctccaccg ccccccca 2118

Claims

1. PCR primers for detecting MUC1 gene mutation, wherein the primers are selected from one or more of the following 16 pairs of primers, and the sequence of each primer is as follows:

Primer F1: CACATATCAGAGTGCGGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R1: CACATATCAGAGTGCGGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F2: ACACACAGACTGTGAGGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R2: ACACACAGACTGTGAGGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F3: CACGCACACACGCGCGGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R3: CACGCACACACGCGCGGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F4: ACAGTCGAGCGCTGCGGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R4: ACAGTCGAGCGCTGCGGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F5: ACACACGCGAGACAGAGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R5: ACACACGCGAGACAGAGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F6: ACGCGCTATCTCAGAGGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R6: ACGCGCTATCTCAGAGGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F7: ACACTAGATCGCGTGTGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R7: ACACTAGATCGCGTGTGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F8: CTCACTACGCGCGCGTGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R8: CTCACTACGCGCGCGTGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F9: CGCATGACACGTGTGTGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R9: CGCATGACACGTGTGTGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F10: CATAGAGAGATAGTATGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R10: CATAGAGAGATAGTATGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F11: CACACGCGCGCTATATGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R11: CACACGCGCGCTATATGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F12: TCACGTGCTCACTGTGGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R12: TCACGTGCTCACTGTGGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F13: ACACACTCTATCAGATGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R13: ACACACTCTATCAGATGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F14: CACGACACGACGATGTGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R14: CACGACACGACGATGTGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F15: CTATACATAGTGATGTGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R15: CTATACATAGTGATGTGCCGTTGTGCACCAGAGTAGAAGCTGA

Primer F16: CACTCACGTGTGATATGGAGAAAAGGAGACTTCGGCTACCCAG

Primer R16: CACTCACGTGTGATATGCCGTTGTGCACCAGAGTAGAAGCTGA.
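
The structure of the listed primers can be illustrated with a minimal Python sketch. It assumes, from inspection of the sequences above rather than from any statement in the claim, that each primer consists of a 16-nt leading tag shared by the F and R primer of a pair, followed by a 27-nt target-specific region that is common to all forward (or all reverse) primers. The names `TAG_LEN` and `split_primer` are hypothetical, and only pairs 1 and 2 are shown.

```python
# Primer sequences copied from claim 1 (pairs 1 and 2 only).
PRIMERS = {
    "F1": "CACATATCAGAGTGCGGGAGAAAAGGAGACTTCGGCTACCCAG",
    "R1": "CACATATCAGAGTGCGGCCGTTGTGCACCAGAGTAGAAGCTGA",
    "F2": "ACACACAGACTGTGAGGGAGAAAAGGAGACTTCGGCTACCCAG",
    "R2": "ACACACAGACTGTGAGGCCGTTGTGCACCAGAGTAGAAGCTGA",
}

# Assumption: the leading tag is 16 nt long (inferred from the sequences,
# not stated in the claim).
TAG_LEN = 16


def split_primer(seq, tag_len=TAG_LEN):
    """Return (leading tag, target-specific region) of a primer."""
    return seq[:tag_len], seq[tag_len:]


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        tag, specific = split_primer(seq)
        print(name, tag, specific)
```

Under this reading, the pairs differ only in the leading tag, which is what would allow amplicons from different samples to be distinguished after pooling.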

2. A kit for detecting MUC1 gene mutation, which comprises the PCR primer for detecting MUC1 gene mutation of claim 1.

3. The kit for detecting MUC1 gene mutation according to claim 2, further comprising magnetic beads for purification, a linker, a DNA repair enzyme, a DNA end blunting enzyme, a DNA ligase, and a DNA exonuclease;

the DNA amplification enzyme is a DNA polymerase from Takara;

the magnetic beads for purification are AMPure XP Beads from Beckman;

the DNA repair enzyme is NEBNext FFPE DNA Repair Mix;

the DNA end blunting enzyme is NEBNext End Repair Enzyme Mix;

the DNA ligase is T4 DNA ligase from Thermo Fisher;

the DNA exonuclease is Exonuclease VII or Exonuclease III.

4. The kit for detecting MUC1 gene mutation of claim 2, wherein the kit for detecting MUC1 gene mutation further comprises a negative control and a positive control.

5. The kit for detecting MUC1 gene mutation according to claim 4, wherein the negative control is HeLa cell line genomic DNA, and the positive control is Jurkat cell line genomic DNA.

6. The method of using the kit for detecting MUC1 gene mutation of any one of claims 2 to 5, wherein the detection of MUC1 gene mutation is carried out using the kit for detecting MUC1 gene mutation.

7. The method of using the kit for detecting MUC1 gene mutation of claim 6, wherein, when the kit is used, the PCR reaction mixture system is as follows: each reaction mixture is 50 μl and contains 200 ng of genomic DNA template, 10 μl of 5× Reaction Buffer, 4 μl of 2.5 mM dNTP, 1 μl of DNA Polymerase, 1 μl of 10 μM HPLC-grade primer F, 1 μl of 10 μM HPLC-grade primer R, and ddH₂O to a total volume of 50 μl.
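
The make-up of the 50 μl reaction in claim 7 can be worked through with a short Python sketch. It reads the claim as specifying 1 μl each of the forward and reverse primer, and it assumes that the template volume is whatever delivers 200 ng at the measured DNA concentration. The function name `reaction_volumes` and the parameter `template_ng_per_ul` are hypothetical and do not appear in the claim.

```python
TOTAL_UL = 50.0      # total reaction volume stated in claim 7
TEMPLATE_NG = 200.0  # genomic DNA input stated in claim 7

# Fixed component volumes in microlitres, as read from claim 7.
FIXED_UL = {
    "5x Reaction Buffer": 10.0,
    "2.5 mM dNTP": 4.0,
    "DNA Polymerase": 1.0,
    "10 uM primer F": 1.0,
    "10 uM primer R": 1.0,
}


def reaction_volumes(template_ng_per_ul):
    """Return per-component volumes (ul) for one 50 ul reaction.

    template_ng_per_ul is the measured concentration of the genomic DNA
    stock (an input assumed here; the claim fixes only the 200 ng mass).
    """
    template_ul = TEMPLATE_NG / template_ng_per_ul
    water_ul = TOTAL_UL - sum(FIXED_UL.values()) - template_ul
    if water_ul < 0:
        raise ValueError("DNA stock too dilute to fit 200 ng into 50 ul")
    return {**FIXED_UL, "genomic DNA": template_ul, "ddH2O": water_ul}


if __name__ == "__main__":
    for component, volume in reaction_volumes(50.0).items():
        print(f"{component}: {volume:.1f} ul")
```

With a 50 ng/μl stock, for example, 4 μl of template is needed, and the remaining volume up to 50 μl is made up with ddH₂O.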

8. A method for detecting MUC1 gene mutation, characterized by comprising establishing a library construction system, which sequentially comprises preparing a linker sequence, preparing a ligation system, preparing an enzyme digestion system, pretreating magnetic beads, and establishing a purification system;

preparation of the linker sequence: synthesizing a linker sequence specific to third-generation sequencing, preparing a Tris-HCl solution, and dissolving the primer of claim 1 to 170 μM; preparing a Tris-HCl solution and a NaCl solution, and mixing them at a ratio of 1:1 to obtain a stock solution for later use; pipetting the primers and the stock solution, adding amplification-grade sterile water to dilute the primers, mixing by shaking, and centrifuging briefly; placing the diluted primers in a PCR instrument with the program set to 95 ℃ for 5 min; after the run is finished, centrifuging the primers briefly for later use;

preparation of the ligation system: preparing T4 DNA ligase, 10× T4 DNA ligase buffer, the prepared linker, the PCR product, and ddH₂O; setting the reaction condition of the system to 22 ℃ for at least 10 h;

preparation of an enzyme digestion system: preparing Exonuclease VII and Exonuclease III; setting the reaction condition of the system at 37 ℃ for 1 h;

pretreatment of magnetic beads: rinsing the AMPure XP magnetic beads with an equal volume of ddH₂O, and then storing them in the original storage solution for later use;

establishing a purification system: preparing a 1.0%-1.2% agarose gel, and checking whether the PCR product band is correct; purifying with 0.8X magnetic beads; measuring the concentration, and keeping the product for later use.
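
The quantities in the purification step can be illustrated with a small Python sketch. It assumes that "0.8X magnetic beads" means a bead volume equal to 0.8 times the sample volume, and that the agarose percentage is weight per volume (grams per 100 ml of gel); both readings are conventional but are not spelled out in the claim. The function names are hypothetical.

```python
def bead_volume_ul(sample_ul, ratio=0.8):
    """Bead volume for a given sample volume.

    Assumes '0.8X' denotes a bead-to-sample volume ratio of 0.8.
    """
    return sample_ul * ratio


def agarose_mass_g(gel_ml, percent_w_v):
    """Agarose mass for a gel, assuming the percentage is weight/volume."""
    return gel_ml * percent_w_v / 100.0


if __name__ == "__main__":
    print(f"beads for a 50 ul sample: {bead_volume_ul(50.0):.1f} ul")
    for pct in (1.0, 1.2):
        print(f"{pct}% gel, 100 ml: {agarose_mass_g(100.0, pct):.2f} g agarose")
```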

9. An analysis method for judging whether the MUC1 gene is mutated, characterized by comprising the following steps:

step 1: performing single-molecule pore correction on the data, and optimizing and filtering the result;

step 2: splitting the filtered data by sample, and optimizing and filtering the result;

step 3: performing data clustering and statistical analysis based on multiple sequence alignment and selecting the best clustering result, wherein multiple sequence alignment is performed on the data retained after splitting, and similar data are clustered;

step 4: performing base position recalibration against the original data to obtain the accuracy of each base position of the clustering result, and optimizing the result;

step 5: refining the base-position-recalibrated clustering result in combination with the PCR experimental data, so as to obtain an accurate mutation judgment.

10. The analysis method for judging whether the MUC1 gene is mutated according to claim 9, wherein:

in step 1, the minimum predicted accuracy is set to 90%, so that original data with an accuracy lower than 0.9 are filtered out;

in step 2, the minimum label splitting score is set to 55, and original data with a splitting accuracy lower than 0.75 are filtered out;

in step 3, a clustering result with an accuracy above 99.99% is selected;

in step 4, a clustering result with an accuracy above 99.99% is selected;

in step 5, the clustering result whose length matches that of the PCR product is selected, and PCR by-product results are filtered out.
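
The thresholds of claim 10 can be expressed as a minimal Python sketch of the filtering logic only. The record types `Read` and `Cluster`, their field names, and the length tolerance `tolerance_bp` are hypothetical; the claim does not name a data format, a software tool, or a tolerance for matching the PCR product length. Clustering, alignment, and base position recalibration are not modelled here.

```python
from dataclasses import dataclass

MIN_READ_ACCURACY = 0.90       # step 1: minimum predicted accuracy
MIN_BARCODE_SCORE = 55         # step 2: minimum label splitting score
MIN_SPLIT_ACCURACY = 0.75      # step 2: minimum splitting accuracy
MIN_CLUSTER_ACCURACY = 0.9999  # steps 3 and 4: accuracy must exceed 99.99%


@dataclass
class Read:
    # Hypothetical record for one corrected read.
    name: str
    predicted_accuracy: float
    barcode_score: int
    split_accuracy: float


@dataclass
class Cluster:
    # Hypothetical record for one clustering result.
    name: str
    accuracy: float
    length_bp: int


def passes_step1(read):
    """Keep reads whose predicted accuracy is at least 0.9."""
    return read.predicted_accuracy >= MIN_READ_ACCURACY


def passes_step2(read):
    """Keep reads that meet both splitting thresholds."""
    return (read.barcode_score >= MIN_BARCODE_SCORE
            and read.split_accuracy >= MIN_SPLIT_ACCURACY)


def keep_cluster(cluster, product_length_bp, tolerance_bp):
    """Steps 3 to 5: accuracy above 99.99% and length matching the PCR product.

    product_length_bp comes from the PCR experiment; tolerance_bp is an
    assumed parameter, since the claim gives no numeric tolerance.
    """
    return (cluster.accuracy > MIN_CLUSTER_ACCURACY
            and abs(cluster.length_bp - product_length_bp) <= tolerance_bp)


if __name__ == "__main__":
    reads = [Read("a", 0.95, 60, 0.90), Read("b", 0.85, 60, 0.90)]
    kept = [r.name for r in reads if passes_step1(r) and passes_step2(r)]
    print(kept)
```

Clusters that pass the accuracy threshold but do not match the expected product length are treated as PCR by-products and discarded, in line with step 5.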