CN105368930B - Method for determining restriction enzyme combinations for genotyping-by-sequencing - Google Patents


Publication number: CN105368930B
Authority: CN (China)
Prior art keywords: digestion, sequence, added, purifying, bar code
Legal status: Active (as listed by Google Patents; an assumption, not a legal conclusion)
Application number: CN201510660486.6A
Other languages: Chinese (zh)
Other versions: CN105368930A (en)
Inventors: 胡晓湘, 王宇哲, 曹学敏, 李宁
Current assignee: Guangzhou Tian Derivatives Technology Co ltd (as listed by Google Patents; the listed assignees may be inaccurate)
Original assignee: China Agricultural University
Application filed by: China Agricultural University
Priority: CN201510660486.6A
Application publication: CN105368930A
Granted publication: CN105368930B

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Organic Chemistry (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Microbiology (AREA)
  • Immunology (AREA)
  • Biotechnology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Analytical Chemistry (AREA)
  • Physics & Mathematics (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The present invention provides a method for determining the restriction enzyme combination used in genotyping-by-sequencing (GBS), comprising the following steps: (1) predicting restriction sites in the target genome and counting the digestion fragments obtained with each digestion mode; (2) designing, from the digestion fragments predicted for each digestion mode in step (1), the adapter sequences for both ends of each type of fragment and the PCR amplification primer sequences; (3) constructing a sequencing library with GBS technology for each digestion mode; (4) sequencing the libraries constructed in step (3); (5) obtaining SNP marker sites from the sequencing results; (6) determining the enzyme combination specific to the target genome according to the number of SNP marker sites and the digestion fragment sizes obtained with the different enzyme combinations.

Description

Method for determining restriction enzyme combinations for genotyping-by-sequencing
Technical field
The present invention relates to the field of biotechnology, and in particular to a method for determining the restriction enzyme combination used in genotyping-by-sequencing.
Background art
Genetic molecular markers (measurable heritable polymorphisms between individuals in one or more populations) occupy a central position in modern genetics and are an important research direction in population genetics, ecology, developmental biology and related disciplines. Mainstream genetic markers have now reached the third generation, namely single nucleotide polymorphism (SNP) markers. Such a marker is a single-base substitution that generally involves only two bases, making it a two-state marker. This differs fundamentally from the first-generation RFLP and second-generation STR markers, which use length differences as the genetic marker.
At present there are two mainstream approaches to genome-wide SNP typing: genotyping chips and next-generation sequencing. Genotyping chips give consistent results with high reproducibility, but the cost of typing a single sample is high, so typing a whole population is too expensive for population genetics research. Chips are also technically limited: SNP polymorphic sites differ between populations, and marker density is low (the mainstream SNP chips in the agricultural animal field currently carry about 60,000 SNPs per chip), so they cannot meet the needs of fine mapping of functional genes or genome-wide association studies. The development of next-generation sequencing has deepened research in genomics and transcriptomics, and sequencing can provide high-density marker maps at the whole-genome level; however, the cost per sample is still too high.
Reduced-representation sequencing makes it possible to identify and type the high-throughput, genome-wide molecular markers needed for population analysis. However, different reduced-representation methods differ considerably in library construction strategy, choice of single or double digestion, sequencing platform and other aspects, and these differences significantly affect the efficiency and cost of subsequent typing. For example, the library construction strategy of RAD sequencing is complex, and its many steps can interfere with subsequent experimental results. The cutting frequency and distribution of a given restriction enzyme differ considerably between the genomes of different species, so the choice of enzyme for a particular species becomes the decisive factor for the number of SNPs obtained and the cost of the experiment. The 2b-RAD technique uses type IIB restriction enzymes; although it yields digestion fragments across the whole genome, these fragments are only 25-35 bp long. Given the average frequency of variation across the genome, such short fragments rarely contain SNP sites, so a large amount of sequencing data is wasted. In addition, short sequences produce many alignment errors when aligned to repetitive regions of the genome, which strongly reduces SNP typing accuracy and in turn severely affects downstream applications. In SNP marker analysis, single digestion hinders the selection of digestion fragments in subsequent experiments, whereas some double-digestion combinations yield too many or too few fragments: too many fragments increase the experimental cost, and too few reduce the density of SNP discovery and thereby affect subsequent biological analysis. Digestion efficiency can also be affected by methylation of the genome. For all these reasons, an experiment to select the optimal enzyme combination is essential for any species to be studied.
It is therefore necessary to develop a new method, applicable to any species, for determining the enzyme combination used in SNP marker analysis, so that the most suitable combination can be obtained quickly and simply, reducing the cost of genotyping and facilitating downstream applications after genotyping.
Summary of the invention
In view of the deficiencies of the prior art, the purpose of the present invention is to provide a method for determining the enzyme combination used in SNP marker analysis based on genotyping-by-sequencing.
To achieve this purpose, the present invention provides a method for determining the restriction enzyme combination used in genotyping-by-sequencing, comprising the following steps:
(1) predicting restriction sites in the target genome and counting the digestion fragments obtained with each digestion mode;
(2) designing, from the digestion fragments predicted for each digestion mode in step (1), the adapter sequences for both ends of each type of fragment and the PCR amplification primer sequences;
(3) constructing a sequencing library with GBS technology for each digestion mode;
(4) sequencing the libraries constructed in step (3);
(5) obtaining SNP marker sites from the sequencing results;
(6) determining the enzyme combination specific to the target genome according to the number of SNP marker sites and the digestion fragment sizes obtained with the different enzyme combinations;
wherein, in step (1), the restriction site prediction for the target genome includes single-digestion prediction and double-digestion prediction.
In the present invention, restriction site prediction can be performed by a computer program. For example, in one embodiment of the invention, a Perl script, Site_predict.pl, is written; its input files are the chromosome names and sequences of the cattle genome and the recognition sequence of the enzyme to be predicted. The run command is: perl Site_predict.pl. The cattle genome sequence is downloaded from Ensembl; the version is UMD3.1, INSDC Assembly GCA_000003055.3, Nov 2009.
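The Site_predict.pl script itself is not reproduced in this text. As a minimal sketch of the kind of prediction described above, and not the inventors' script, the following Python code scans chromosome sequences for an enzyme's recognition sequence. The function names and the (chromosome, position) output layout are assumptions made for illustration; the recognition sequences are the standard ones for the enzymes named in this document.

```python
import re

# Standard recognition sequences (5'->3') of the enzymes named in this document.
# W is the IUPAC code for A or T.
RECOGNITION_SITES = {
    "EcoR I": "GAATTC",
    "Msp I": "CCGG",
    "Mse I": "TTAA",
    "Pst I": "CTGCAG",
    "Bgl II": "AGATCT",
    "HinP1 I": "GCGC",
    "ApeK I": "GCWGC",
}

# Translation of the IUPAC letters used above into regular-expression classes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "[AT]", "N": "[ACGT]"}


def find_sites(sequence, site):
    """Return 0-based start positions of every (possibly overlapping) match."""
    pattern = "".join(IUPAC[base] for base in site.upper())
    # A zero-width lookahead lets overlapping occurrences be reported.
    return [m.start() for m in re.finditer(f"(?={pattern})", sequence.upper())]


def predict(chromosomes, site):
    """Yield (chromosome name, position) pairs for one recognition sequence.

    chromosomes: dict mapping chromosome name to its sequence string.
    """
    for name, seq in chromosomes.items():
        for pos in find_sites(seq, site):
            yield name, pos
```

For a real genome the sequences would be read from a FASTA file chromosome by chromosome; the sketch keeps them in memory only for brevity.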
After the single-enzyme digestion results are obtained, the simulation results for pairs of restriction enzymes can be processed as needed to obtain the simulated result of the two enzymes acting together. Taking EcoR I and Msp I as an example, the commands are as follows:
less -S ecor1.pos | awk 'OFS="\t" {print $1, $2, "1"}' | less -S > ecor
less -S msp1.pos | awk 'OFS="\t" {print $1, $2, "2"}' | less -S > msp
cat ecor msp | sort -k1,1 -k2,2n | less -S > double_length_input
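The commands above only tag and merge the two site lists; how the merged file is then turned into a fragment count is not shown in this text. One plausible reading, sketched here under that assumption, is to count the intervals between neighbouring sites that were cut by different enzymes. The function name and the optional size window (min_len, max_len) are hypothetical.

```python
def count_double_digest_fragments(sites_a, sites_b, min_len=None, max_len=None):
    """Count fragments bounded by one site of enzyme A and one of enzyme B.

    sites_a, sites_b: iterables of (chromosome, position) pairs.
    min_len, max_len: optional size window in bp (hypothetical parameters).
    """
    # Tag each site with its enzyme, mirroring the "1"/"2" third column above.
    tagged = [(c, p, 1) for c, p in sites_a] + [(c, p, 2) for c, p in sites_b]
    # Order by chromosome, then numerically by position (analogous to the sort step).
    tagged.sort(key=lambda t: (t[0], t[1]))

    count = 0
    for (c1, p1, e1), (c2, p2, e2) in zip(tagged, tagged[1:]):
        if c1 != c2 or e1 == e2:
            continue  # different chromosome, or both ends cut by the same enzyme
        length = p2 - p1
        if min_len is not None and length < min_len:
            continue
        if max_len is not None and length > max_len:
            continue
        count += 1
    return count
```

Fragments with the same enzyme at both ends are skipped because, in a two-adapter design, only fragments carrying one adapter of each kind are expected to be amplified and sequenced.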
Optionally, the digestion modes in step (1) include the 7 single-digestion modes and 8 double-digestion combinations shown in Table 1;
Table 1
Optionally, in step (2) the adapter sequences for each type of digestion fragment include one universal adapter and one barcode adapter.
Optionally, the universal adapter is a double-stranded DNA formed by annealing two universal adapter sequences (universal adapter sequences 1 and 2), wherein universal adapter sequence 1 carries a 5' phosphorylation modification; the barcode adapter is a double-stranded DNA formed by annealing two barcode adapter sequences (barcode adapter sequences 1 and 2), wherein barcode adapter sequence 2 carries a 5' phosphorylation modification.
Barcode adapter sequences 1 and 2 each contain an arbitrary short nucleotide barcode sequence 6-9 bp in length.
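The text only requires that a barcode be an arbitrary 6-9 bp sequence. As a hedged illustration of how a candidate barcode set might be checked before ordering oligonucleotides, the sketch below verifies the length range and, as an additional assumption not stated in the patent, that any two barcodes differ at a minimum number of positions. The barcode sequences used with it would be hypothetical.

```python
from itertools import combinations


def distance(a, b):
    """Mismatches over the shared length, plus the difference in length."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


def check_barcodes(barcodes, min_len=6, max_len=9, min_distance=2):
    """Return a list of problems; an empty list means the set passes.

    min_distance is an illustrative assumption, not a requirement of the source.
    """
    problems = []
    for bc in barcodes:
        if not min_len <= len(bc) <= max_len:
            problems.append(f"{bc}: length {len(bc)} outside {min_len}-{max_len} bp")
        if set(bc) - set("ACGT"):
            problems.append(f"{bc}: contains non-ACGT characters")
    for a, b in combinations(barcodes, 2):
        if distance(a, b) < min_distance:
            problems.append(f"{a}/{b}: differ at fewer than {min_distance} positions")
    return problems
```

A call such as check_barcodes(["ACGTAC", "TGCATGA", "GATTACAGG"]) returns an empty list for this hypothetical set.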
Optionally, the PCR amplification primer sequences in step (2) are as shown in SEQ ID NO: 1-2.
Optionally, step (3) includes the following steps:
(a) digesting the genome with restriction enzymes to obtain digestion products;
(b) preparing the universal adapter and the barcode adapter;
(c) ligating the universal adapter and the barcode adapter to the digestion products to obtain ligation products;
(d) pooling the ligation products in equal proportions to obtain a pooled ligation product;
(e) adding 1.2-1.4 volumes of magnetic beads to the pooled ligation product and performing a first purification to obtain a first purified product;
(f) adding 0.8-0.9 volumes of magnetic beads to the first purified product and performing a second purification to obtain a second purified product;
(g) amplifying the second purified product by PCR to obtain a PCR product;
(h) adding 1.2-1.4 volumes of magnetic beads to the PCR product and performing a third purification to obtain a third purified product;
(i) adding 0.8-0.9 volumes of magnetic beads to the third purified product and performing a fourth purification to obtain the reduced-representation genome sequencing library.
Optionally, the first and third purifications follow the same procedure, which specifically comprises: after adding the magnetic beads, incubating on a rotator at room temperature for 18-22 min; after incubation, placing the tube on a magnetic rack and discarding the supernatant; adding 480-520 μL of 70% ethanol, letting it stand for 30-40 s and then rotating the tube slowly so that the beads move along the tube wall; once the solution clears, removing the supernatant, and repeating this wash once to obtain a precipitate; adding Low TE to the precipitate, pipetting up and down and then vortexing, centrifuging, and letting the solution stand until clear to obtain the supernatant; wherein 140-160 μL of Low TE is added per 100 μL of the precipitate.
Optionally, the second and fourth purifications follow the same procedure, which specifically comprises: after adding the magnetic beads, incubating on a rotator at room temperature for 13-16 min; after incubation, placing the tube on a magnetic rack and discarding the supernatant; adding 480-520 μL of 70% ethanol, letting it stand for 30-40 s and then rotating the tube slowly so that the beads move along the tube wall; once the solution clears, removing the supernatant, and repeating this wash once to obtain a precipitate; adding Low TE to the precipitate, pipetting up and down and then vortexing, centrifuging, and letting the solution stand until clear to obtain the supernatant; wherein 30-50 μL of Low TE is added per 100 μL of the precipitate.
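The bead volumes in steps (e) to (i) are simple multiples of the sample volume. The following sketch, with hypothetical function names, computes them and rejects ratios outside the stated ranges. With the values used in Embodiment 1 (240 μL of ligation product at 1.3 volumes, then a 150 μL eluate at 0.8 volumes) it gives 312 μL and 120 μL of beads, which matches the volumes stated there.

```python
def bead_volume(sample_ul, ratio):
    """Volume of magnetic beads (µL) for a given bead-to-sample volume ratio."""
    return round(sample_ul * ratio, 1)


def two_step_cleanup(sample_ul, first_ratio, eluate_ul, second_ratio):
    """Bead volumes for the 1.2-1.4x purification followed by the 0.8-0.9x one."""
    if not 1.2 <= first_ratio <= 1.4:
        raise ValueError("first ratio outside the 1.2-1.4x range")
    if not 0.8 <= second_ratio <= 0.9:
        raise ValueError("second ratio outside the 0.8-0.9x range")
    return bead_volume(sample_ul, first_ratio), bead_volume(eluate_ul, second_ratio)


# Values from Embodiment 1: 240 µL ligation product, 150 µL Low TE eluate.
first, second = two_step_cleanup(240, 1.3, 150, 0.8)
print(first, second)
```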
Optionally, the annealing system of the universal adapter in step (c) is: 100 μM universal adapter sequence 1, 5 μL; 100 μM universal adapter sequence 2, 5 μL; 5× Annealing Buffer, 10 μL; nuclease-free water, 30 μL. The annealing program is: heat to 95 °C, cool to 25 °C at 1 °C/min, hold at 25 °C for 30 min, then store at 4 °C.
The annealing system of the barcode adapter is: 100 μM barcode adapter sequence 1, 5 μL; 100 μM barcode adapter sequence 2, 5 μL; 5× Annealing Buffer, 10 μL; nuclease-free water, 30 μL. The reaction program is: 95 °C for 3 min, cool at 1 °C/min until 25 °C is reached, hold at 25 °C for 30 min, then store at 4 °C.
Optionally, the ligation reaction system in step (c) is: digestion product, 20 μL; 5× DNA Ligase Reaction Buffer, 8 μL; DNA ligase, 2 μL; nuclease-free water, 5 μL; adapter mixture, 5 μL. After mixing, the reaction is placed in a PCR instrument with the program: 22 °C for 1 h, 65 °C for 30 min, then cooling to 4 °C for storage.
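As a small worked check of the ligation system above, the sketch below confirms that the listed components add up to 40 μL and scales the shared components into a master mix. Treating the digestion product and adapter mixture as sample-specific, and adding a 10% pipetting excess, are assumptions made for illustration and are not stated in the source.

```python
# Per-reaction volumes (µL) of the ligation system described above.
LIGATION_MIX = {
    "digestion product": 20.0,
    "5x DNA Ligase Reaction Buffer": 8.0,
    "DNA ligase": 2.0,
    "nuclease-free water": 5.0,
    "adapter mixture": 5.0,
}


def total_volume(mix):
    """Sum of all component volumes in µL."""
    return sum(mix.values())


def master_mix(mix, n_reactions,
               sample_specific=("digestion product", "adapter mixture"),
               excess=0.1):
    """Scale the shared components for n reactions, with a pipetting excess.

    sample_specific and excess are illustrative assumptions.
    """
    factor = n_reactions * (1 + excess)
    return {name: round(vol * factor, 2)
            for name, vol in mix.items() if name not in sample_specific}


print(total_volume(LIGATION_MIX))   # total per reaction in µL
print(master_mix(LIGATION_MIX, 10))
```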
Optionally, the sequencing kit used in step (4) is the NextSeq 500 High Output Kit (75 cycles).
Optionally, in step (5) the software used to align the sequencing data to the genome is bowtie2 (version bowtie2-2.2.3, running on the Linux operating system), and the SNP identification software is TASSEL (version tassel-4.3.13). In the present invention, SNP marker sites can be discovered with the following analysis script, which takes the restriction enzyme combination EcoR I-Mse I as an example.
① tassel-4.3.13/run_pipeline.pl -Xmx32g -fork1 \
  -FastqToTagCountPlugin -i fq/ -s 300000000 \
  -k keyfile_eco_mse.txt -e EcoRI-MseI -c 3 -o tagCounts \
  -endPlugin -runfork1
② tassel-4.3.13/run_pipeline.pl -Xmx32g -fork1 \
  -MergeMultipleTagCountPlugin -i tagCounts \
  -o mergedTagCounts/myMasterGBSTags.cnt -endPlugin -runfork1
tassel-4.3.13/run_pipeline.pl -Xmx32g -fork1 \
  -TagCountToFastqPlugin \
  -i ./mergedTagCounts/myMasterGBSTags.cnt \
  -o ./mergedTagCounts/myMasterGBSTags.fq -endPlugin \
  -runfork1
bowtie2-2.2.3/bowtie2 -p 16 --very-sensitive-local \
  -x ../../ref/cow.big_chr -U myMasterGBSTags.fq \
  -S myAlignedMasterTags.sam
③ tassel-4.3.13/run_pipeline.pl -Xmx32g -fork1 \
  -SAMConverterPlugin \
  -i mergedTagCounts/myAlignedMasterTags.sam \
  -o topm/myMasterTags.topm -endPlugin -runfork1
④ tassel-4.3.13/run_pipeline.pl -Xmx32g -fork1 \
  -FastqToTBTPlugin -i fq/ -k keyfile_eco_mse.txt \
  -e EcoRI-MseI -c 3 -o tbt -y \
  -t ./mergedTagCounts/myMasterGBSTags.cnt -endPlugin \
  -runfork1
tassel-4.3.13/run_pipeline.pl -Xmx32g -fork1 \
  -MergeTagsByTaxaFilesPlugin -i tbt \
  -o mergedTBT/myStudy.tbt.byte -endPlugin -runfork1
⑤ tassel-4.3.13/run_pipeline.pl -Xmx64g -fork1 \
  -DiscoverySNPCallerPlugin -i ./mergedTBT/myStudy.tbt.byte -y \
  -m ./topm/myMasterTags.topm \
  -mUpd ./topm/myMasterTagsWithVariants.topm -vcf \
  -o ./hapmap/raw/myGBSGenos_chr+.hmp.txt -mnMAF 0.05 \
  -ref ../ref/gga4.big_chr.fa -sC 1 -eC 30 -endPlugin -runfork1
⑥ tassel-4.3.13/run_pipeline.pl -Xmx32g -fork1 \
  -MergeDuplicateSNPsPlugin \
  -hmp hapmap/raw/myGBSGenos_chr+.hmp.txt \
  -o hapmap/mergedSNPs/myGBSGenos_mergedSNPs_chr+.hmp.txt \
  -misMat 0.1 -sC 1 -eC 30 -endPlugin -runfork1
⑦ tassel-4.3.13/run_pipeline.pl -Xmx32g -fork1 \
  -GBSHapMapFiltersPlugin \
  -hmp hapmap/mergedSNPs/myGBSGenos_mergedSNPs_chr+.hmp.txt \
  -o hapmap/filt/myGBSGenos_mergedSNPsFilt_chr+.hmp.txt \
  -mnTCov 0.01 -mnSCov 0.6 -mnMAF 0.05 -sC 1 -eC 30 \
  -endPlugin -runfork1
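Because the same pipeline has to be run once per enzyme combination, the first command can be assembled programmatically. The sketch below builds the argument list for step ① using only the flags that appear in the script above; the function name and its defaults are hypothetical, and only the EcoRI-MseI enzyme string is taken from the source. Whether other combination names are accepted by the TASSEL version used is not confirmed here.

```python
import shlex


def tag_count_command(enzyme, keyfile, fastq_dir="fq/", out_dir="tagCounts",
                      max_tags=300000000, min_count=3, heap="32g",
                      runner="tassel-4.3.13/run_pipeline.pl"):
    """Build the argument list for step 1 of the pipeline for one enzyme combination."""
    return [
        runner, f"-Xmx{heap}", "-fork1",
        "-FastqToTagCountPlugin",
        "-i", fastq_dir,
        "-s", str(max_tags),
        "-k", keyfile,
        "-e", enzyme,
        "-c", str(min_count),
        "-o", out_dir,
        "-endPlugin", "-runfork1",
    ]


cmd = tag_count_command("EcoRI-MseI", "keyfile_eco_mse.txt")
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) when TASSEL is installed at the given path; keeping the arguments as a list avoids shell quoting problems.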
The present invention also provides the use of the method in genotyping research.
Optionally, the use includes genotyping pigs, cattle, sheep and chickens.
The method provided by the invention selects fragments before PCR by means of the magnetic bead purification described above, so that fragments that are too large or too small are removed before PCR, which improves sequencing data quality.
The method provided by the invention includes determining the enzyme combination specific to the target genome according to the number of SNP marker sites and the digestion fragment sizes obtained with the different enzyme combinations. Preferred criteria for an enzyme combination include: moderate digestion density, SNPs evenly distributed across the genome, moderate SNP density, digestion unaffected by DNA modification, and other enzyme combination selection criteria known to those skilled in the art.
Through bioinformatic in silico prediction and digestion selection experiments, the method developed in the present invention can identify the most effective digestion scheme for sequencing-based typing of any species and, while maintaining accuracy, can reduce current typing costs by an order of magnitude; it therefore has considerable application prospects.
Brief description of the drawings
Fig. 1 is an electrophoresis gel image of single and double digestion of the genome. Lane 1: 1 kbp marker; lane 2: 100 bp marker; lane 3: Pst I digestion product; lane 4: Bgl II digestion product; lane 5: HinP1 I digestion product; lane 6: Pst I-HF digestion product; lane 7: EcoR I+Mse I digestion product; lane 8: ApeK I digestion product; lane 9: EcoR I digestion product; lane 10: Msp I digestion product; lane 11: 100 bp marker; lane 12: 1 kbp marker; lane 13: Bgl II+ApeK I digestion product; lane 14: Pst I+ApeK I digestion product; lane 15: HinP1 I+ApeK I digestion product; lane 16: Mse I digestion product; lane 17: Pst I+Msp I digestion product; lane 18: EcoR I+Msp I digestion product; lane 19: Pst I+Mse I digestion product; lane 20: HinP1 I+Mse I digestion product.
Fig. 2 is the sequencing quality report.
Detailed description of the embodiments
The present invention is described in detail below by way of specific embodiments.
Embodiment 1
Embodiment 1 analyzes the cattle genome using the method provided by the invention.
1. Prediction of the number of digestion fragments
A Perl script, Site_predict.pl, is written; its input files are the chromosome names and sequences of the cattle genome and the recognition sequence of the enzyme to be predicted. The run command is: perl Site_predict.pl. The cattle genome sequence is downloaded from Ensembl; the version is:
After the single-enzyme digestion results are obtained, the simulation results for pairs of restriction enzymes are processed to obtain the simulated result of the two enzymes acting together. Taking EcoR I and Msp I as an example, the commands are as follows:
less -S ecor1.pos | awk 'OFS="\t" {print $1, $2, "1"}' | less -S > ecor
less -S msp1.pos | awk 'OFS="\t" {print $1, $2, "2"}' | less -S > msp
cat ecor msp | sort -k1,1 -k2,2n | less -S > double_length_input
The predicted restriction sites of each enzyme for single digestion and double digestion are shown in Table 2 below.
Table 2
Single digestion  Number of restriction sites  Double digestion combination  Number of restriction sites
Bgl II            755,854                      Bgl II+ApeK I                 1,179,520
EcoR I            851,527                      EcoR I+Mse I                  1,591,143
HinP1 I           1,086,541                    EcoR I+Msp I                  899,849
Pst I             1,504,605                    HinP1 I+ApeK I                1,552,190
Msp I             1,839,903                    HinP1 I+Mse I                 1,282,950
ApeK I            5,027,512                    Pst I+ApeK I                  2,519,953
Mse I             15,640,941                   Pst I+Mse I                   2,418,739
-                 -                            Pst I+Msp I                   1,428,045
2. Adapter and PCR primer design
Adapters are designed for 8 double-digestion combinations in this embodiment; each adapter pair comprises a universal adapter and 3 barcode adapters. The adapter and PCR primer sequences are shown in Table 3 below.
Table 3
1) Sequencing library construction:
Gastric tissue samples are collected from three Holstein cows, and genomic DNA is extracted and then digested. The reaction system is 20 μL, comprising 15 μL nuclease-free water, 2 μL 10× CutSmart Buffer, 0.5 μL enzyme 1, 0.5 μL enzyme 2 and 200 ng sample DNA. After mixing and centrifugation, the reaction is placed in a PCR instrument under the following conditions: 37 °C for 90 min, 65 °C for 30 min, then storage at 4 °C. The electrophoresis result of the genome digestion is shown in Fig. 1.
2) Adapter annealing:
Within one double-digestion combination, the adapters for the two ends are mixed in the ratio of the single-digestion fragment numbers from the in silico prediction, which ensures that the subsequent ligation reaction is effective. The annealing reaction system is 50 μL, comprising 30 μL nuclease-free water and 10 μL 5× Annealing Buffer; the adapters corresponding to enzyme 1 and enzyme 2 are added in the proportions shown in Table 4. After mixing and centrifugation, the reaction conditions are 95 °C for 3 min, cooling at 1 °C/min until 25 °C is reached, 25 °C for 30 min, then storage at 4 °C. The mixing proportions are shown in Table 4.
Table 4
Enzyme 1  Enzyme 2  Enzyme 1 volume (μL)  Enzyme 2 volume (μL)  Water (μL)
EcoR I    Mse I     0.8                   15                    84.2
EcoR I    Msp I     1.6                   3.9                   194.5
Pst I     Mse I     2.44                  15                    82.5
Pst I     Msp I     2.44                  1.95                  95.6
Pst I     ApeK I    2.44                  9.62                  87.9
Bgl II    ApeK I    1.84                  15                    83.1
HinP1 I   Mse I     1.84                  9.62                  88.5
HinP1 I   ApeK I    1                     9.62                  89.3
3) Adapter ligation:
The reaction system is 40 μL, comprising 20 μL digestion product, 5 μL nuclease-free water, 8 μL 5× DNA Ligase Reaction Buffer, 2 μL ExpressLink T4 DNA Ligase and 5 μL adapter mixture. Mix thoroughly and centrifuge; the reaction conditions are 22 °C for 1 h, 65 °C for 30 min, then storage at 4 °C.
4) Pooling:
The ligation products of each sample for each enzyme combination are mixed together, 240 μL in total, for purification in the next step.
5) Magnetic bead purification of the ligation product:
Add 312 μL AMPure XP Beads to the 240 μL ligation product, place the centrifuge tube on a rotator and incubate at room temperature for 20 min, then place it on a magnetic rack for 3 min and discard the supernatant. Add 500 μL 70% ethanol; with the tube on the magnetic rack, wait 30 s and then slowly rotate the tube two full turns so that the beads move along the tube wall; once the solution clears, remove the supernatant, then repeat this step once. Remove the tube, centrifuge briefly, return it to the magnetic rack, remove residual ethanol with a small pipette tip and air-dry for 3 min. Add 150 μL Low TE, pipette up and down several times, vortex for 10 s, centrifuge briefly and place on the magnetic rack for 3 min; once the solution clears, transfer the supernatant to a new centrifuge tube. Add 120 μL AMPure XP Beads to the 150 μL Low TE eluate, place the tube on a rotator and incubate at room temperature for 15 min, then place it on the magnetic rack for 3 min and discard the supernatant. Add 500 μL 70% ethanol; with the tube on the magnetic rack, wait 30 s and then slowly rotate the tube two full turns so that the beads move along the tube wall; once the solution clears, remove the supernatant, then repeat this step once. Remove the tube, centrifuge briefly, return it to the magnetic rack, remove residual ethanol with a small pipette tip and air-dry for 3 min. Add 50 μL Low TE, pipette up and down several times, vortex for 10 s, centrifuge briefly and place on the magnetic rack for 3 min; once the solution clears, transfer the supernatant to a new centrifuge tube, place that tube on the magnetic rack for 2 min, and transfer the supernatant to another new centrifuge tube to obtain the purified ligation product.
6) Concentration measurement and PCR amplification:
The concentration of the purified ligation product is measured with Qubit 2.0 to determine the amount of purified ligation product to use in the PCR. The amplification system is 60 μL, comprising 50 μL Platinum PCR SuperMix High Fidelity, 10 ng purified ligation product, 1.2 μL 10 μM Primer A and 1.2 μL 10 μM Primer B, made up to 60 μL with nuclease-free water. The reaction conditions are 95 °C for 5 min; 17 × (95 °C for 30 s, 62 °C for 30 s, 68 °C for 30 s); 72 °C for 5 min; then storage at 4 °C.
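The template and water volumes follow directly from the measured concentration. As a minimal sketch with a hypothetical function name, the calculation below converts a Qubit reading into the volume that contains 10 ng and the water needed to reach 60 μL. It also shows that, with 50 μL of SuperMix and 2.4 μL of primers, only 7.6 μL remain, so the product must be at least about 1.32 ng/μL.

```python
def pcr_setup(conc_ng_per_ul, template_ng=10.0, total_ul=60.0,
              supermix_ul=50.0, primer_a_ul=1.2, primer_b_ul=1.2):
    """Return (template µL, water µL) for the 60 µL amplification described above."""
    # Volume left for template plus water after SuperMix and primers.
    free_ul = total_ul - supermix_ul - primer_a_ul - primer_b_ul
    template_ul = template_ng / conc_ng_per_ul
    if template_ul > free_ul:
        raise ValueError("ligation product too dilute to fit 10 ng into the reaction")
    return round(template_ul, 2), round(free_ul - template_ul, 2)


# Example with an assumed Qubit reading of 2.0 ng/µL.
print(pcr_setup(2.0))
```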
7) Repeat the purification of step 5), eluting finally with 30 μL Low TE. Measure the library concentration with Qubit 2.0 and check the library fragment size distribution with the Agilent 2100.
8) Selection of the sequencing platform
The present invention uses the NextSeq 500 sequencing system, an Illumina next-generation sequencing platform, with a single-end 75 bp sequencing kit. Since a single NextSeq 500 run can produce 400 M sequencing reads, this platform and method minimize sequencing cost and are also faster than the HiSeq sequencing system. The sequencing quality report is shown in Fig. 2.
3. Discovery of SNP markers
SNP discovery from the sequencing data is performed with the TASSEL software, and alignment to the genome is performed with bowtie2. The analysis script is as follows, taking EcoR I-Mse I as an example.
① tassel-4.3.13/run_pipeline.pl -Xmx32g -fork1 \
  -FastqToTagCountPlugin -i fq/ -s 300000000 \
  -k keyfile_eco_mse.txt -e EcoRI-MseI -c 3 -o tagCounts \
  -endPlugin -runfork1
② tassel-4.3.13/run_pipeline.pl -Xmx32g -fork1 \
  -MergeMultipleTagCountPlugin -i tagCounts \
  -o mergedTagCounts/myMasterGBSTags.cnt -endPlugin -runfork1
tassel-4.3.13/run_pipeline.pl -Xmx32g -fork1 \
  -TagCountToFastqPlugin \
  -i ./mergedTagCounts/myMasterGBSTags.cnt \
  -o ./mergedTagCounts/myMasterGBSTags.fq -endPlugin \
  -runfork1
bowtie2-2.2.3/bowtie2 -p 16 --very-sensitive-local \
  -x ../../ref/cow.big_chr -U myMasterGBSTags.fq \
  -S myAlignedMasterTags.sam
③ tassel-4.3.13/run_pipeline.pl -Xmx32g -fork1 \
  -SAMConverterPlugin \
  -i mergedTagCounts/myAlignedMasterTags.sam \
  -o topm/myMasterTags.topm -endPlugin -runfork1
④ tassel-4.3.13/run_pipeline.pl -Xmx32g -fork1 \
  -FastqToTBTPlugin -i fq/ -k keyfile_eco_mse.txt \
  -e EcoRI-MseI -c 3 -o tbt -y \
  -t ./mergedTagCounts/myMasterGBSTags.cnt -endPlugin \
  -runfork1
tassel-4.3.13/run_pipeline.pl -Xmx32g -fork1 \
  -MergeTagsByTaxaFilesPlugin -i tbt \
  -o mergedTBT/myStudy.tbt.byte -endPlugin -runfork1
⑤ tassel-4.3.13/run_pipeline.pl -Xmx64g -fork1 \
  -DiscoverySNPCallerPlugin -i ./mergedTBT/myStudy.tbt.byte -y \
  -m ./topm/myMasterTags.topm \
  -mUpd ./topm/myMasterTagsWithVariants.topm -vcf \
  -o ./hapmap/raw/myGBSGenos_chr+.hmp.txt -mnMAF 0.05 \
  -ref ../ref/gga4.big_chr.fa -sC 1 -eC 30 -endPlugin -runfork1
⑥ tassel-4.3.13/run_pipeline.pl -Xmx32g -fork1 \
  -MergeDuplicateSNPsPlugin \
  -hmp hapmap/raw/myGBSGenos_chr+.hmp.txt \
  -o hapmap/mergedSNPs/myGBSGenos_mergedSNPs_chr+.hmp.txt \
  -misMat 0.1 -sC 1 -eC 30 -endPlugin -runfork1
⑦ tassel-4.3.13/run_pipeline.pl -Xmx32g -fork1 \
  -GBSHapMapFiltersPlugin \
  -hmp hapmap/mergedSNPs/myGBSGenos_mergedSNPs_chr+.hmp.txt \
  -o hapmap/filt/myGBSGenos_mergedSNPsFilt_chr+.hmp.txt \
  -mnTCov 0.01 -mnSCov 0.6 -mnMAF 0.05 -sC 1 -eC 30 \
  -endPlugin -runfork1
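The final filtering step keeps only sites with a minimum site coverage of 0.6 and a minimum minor allele frequency of 0.05. The sketch below illustrates what those two thresholds mean for a single site. It is a simplified illustration with hypothetical function names and a made-up genotype encoding, not TASSEL's own implementation, whose exact handling of missing and heterozygous calls may differ.

```python
from collections import Counter


def site_stats(genotypes):
    """Return (site coverage, minor allele frequency) for one SNP site.

    genotypes: list of two-letter strings such as "AA" or "AG",
    or None where a sample has no call (illustrative encoding).
    """
    called = [g for g in genotypes if g is not None]
    coverage = len(called) / len(genotypes)
    alleles = Counter(a for g in called for a in g)
    if len(alleles) < 2:
        return coverage, 0.0  # monomorphic or uncalled site
    total = sum(alleles.values())
    second_most_common = alleles.most_common(2)[1][1]
    return coverage, second_most_common / total


def passes(genotypes, min_site_cov=0.6, min_maf=0.05):
    """Apply the two thresholds used in the last pipeline step."""
    coverage, maf = site_stats(genotypes)
    return coverage >= min_site_cov and maf >= min_maf
```

With only three individuals, as in this embodiment, a site coverage of 0.6 means that at least two of the three samples must have a call at the site.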
The actual digestion results and SNP numbers obtained with each enzyme combination are shown in Table 5.
Table 5
Cattle enzyme combination  SNP number  Double digestion number  Mapping rate
Pst I-Mse I                297854      1589570                  99.59%
Pst I-ApeK I               128968      830729                   99.40%
EcoR I-Mse I               163127      998753                   99.58%
Bgl II-ApeK I              103343      623573                   99.42%
Pst I-Msp I                124446      660116                   99.12%
HinP1 I-Mse I              28603       335154                   97.05%
HinP1 I-ApeK I             21646       245639                   95.76%
EcoR I-Msp I               61259       325721                   98.96%
Analysis of the table above shows that the EcoR I-Msp I combination yields a moderate number of SNPs (this experiment used 3 individuals; the SNP number would increase in a large-scale population experiment), its digestion is not affected by modifications such as methylation, and its digestion fragments are evenly distributed. The EcoR I-Msp I combination is therefore chosen as the enzyme combination for cattle SNP marker analysis.
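One way to look at Table 5, offered only as an illustration and not as the selection criterion used by the inventors, is to relate the SNP number to the double digestion number for each combination. The sketch below computes that ratio from the Table 5 values; the metric itself and the function name are assumptions.

```python
# (SNP number, double digestion number) from Table 5, cattle.
TABLE_5 = {
    "Pst I-Mse I": (297854, 1589570),
    "Pst I-ApeK I": (128968, 830729),
    "EcoR I-Mse I": (163127, 998753),
    "Bgl II-ApeK I": (103343, 623573),
    "Pst I-Msp I": (124446, 660116),
    "HinP1 I-Mse I": (28603, 335154),
    "HinP1 I-ApeK I": (21646, 245639),
    "EcoR I-Msp I": (61259, 325721),
}


def snps_per_fragment(table):
    """SNP number divided by double digestion number for each combination."""
    return {combo: snps / fragments for combo, (snps, fragments) in table.items()}


# Rank combinations from the highest to the lowest ratio.
ranked = sorted(snps_per_fragment(TABLE_5).items(),
                key=lambda kv: kv[1], reverse=True)
for combo, ratio in ranked:
    print(f"{combo:16s} {ratio:.3f}")
```

On this ratio the combinations containing HinP1 I rank lowest, while EcoR I-Msp I is among the highest despite its smaller double digestion number; the text above additionally weighs methylation sensitivity and fragment distribution, which this ratio does not capture.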
With the method provided by the invention, 96 samples can be run on one NextSeq 500 sequencing chip, which keeps the cost low.
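The figures given in this embodiment (about 400 M reads per run, 96 samples per chip) allow a rough estimate of sequencing depth. The sketch below assumes, purely for illustration, that samples are pooled evenly, that every read is usable, and that the double digestion number in Table 5 approximates the number of sequenced fragments; none of these assumptions is stated in the source.

```python
def reads_per_sample(total_reads=400_000_000, n_samples=96):
    """Average number of reads per sample for an evenly pooled run."""
    return total_reads / n_samples


def mean_reads_per_fragment(total_reads, n_samples, n_fragments):
    """Average number of reads per fragment per sample (idealized)."""
    return total_reads / n_samples / n_fragments


# EcoR I-Msp I in cattle: double digestion number 325721 (Table 5).
print(round(reads_per_sample()))
print(round(mean_reads_per_fragment(400_000_000, 96, 325721), 1))
```

Under these assumptions each sample receives roughly 4.2 million reads, or on the order of a dozen reads per fragment for the EcoR I-Msp I combination.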
Embodiment 2
The inventors tested the method on sheep, an important agricultural economic animal. Blood samples were collected from three Dorper sheep, and the experimental procedure was the same as in Embodiment 1; the digestion of the sheep genome and the SNPs were analyzed. The actual digestion results for each enzyme combination are shown in Table 6.
Table 6

| Sheep enzyme combination | SNP count | Double-digest fragment count | Mapping rate |
| --- | --- | --- | --- |
| Pst I-Mse I | 616954 | 1943931 | 98.82% |
| Pst I-ApeK I | 138084 | 730031 | 98.20% |
| EcoR I-Mse I | 331575 | 1164181 | 99.26% |
| Bgl II-ApeK I | 206554 | 690820 | 99.07% |
| Pst I-Msp I | 237540 | 862441 | 97.34% |
| HinP1 I-Mse I | 71698 | 463752 | 95.04% |
| HinP1 I-ApeK I | 49389 | 335473 | 93.20% |
| EcoR I-Msp I | 107107 | 396973 | 98.24% |
Analysis of the table above shows that the EcoR I-Msp I enzyme combination yields a moderate number of SNPs (this experiment used 3 individuals; the SNP count would increase further in a large-scale population experiment), that its digestion is not affected by modifications such as methylation, and that its digested fragments are evenly distributed. The EcoR I-Msp I combination was therefore chosen as the enzyme combination for sheep SNP marker site analysis.
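The SNP counts reported in Tables 5 and 6 can also be compared across the two species. The sketch below is illustrative only and uses hypothetical variable names; it computes the sheep-to-cattle SNP ratio for each enzyme combination from the published counts and makes no claim about why the counts differ.

```python
# SNP counts per enzyme combination, copied from Table 5 (cattle) and Table 6 (sheep)
CATTLE_SNPS = {
    "Pst I-Mse I": 297854, "Pst I-ApeK I": 128968,
    "EcoR I-Mse I": 163127, "Bgl II-ApeK I": 103343,
    "Pst I-Msp I": 124446, "HinP1 I-Mse I": 28603,
    "HinP1 I-ApeK I": 21646, "EcoR I-Msp I": 61259,
}
SHEEP_SNPS = {
    "Pst I-Mse I": 616954, "Pst I-ApeK I": 138084,
    "EcoR I-Mse I": 331575, "Bgl II-ApeK I": 206554,
    "Pst I-Msp I": 237540, "HinP1 I-Mse I": 71698,
    "HinP1 I-ApeK I": 49389, "EcoR I-Msp I": 107107,
}

# Ratio of sheep to cattle SNP counts for the same enzyme combination
ratios = {combo: SHEEP_SNPS[combo] / CATTLE_SNPS[combo] for combo in CATTLE_SNPS}

# Position of each combination when sorted by SNP count within a species
sheep_ranking = sorted(SHEEP_SNPS, key=SHEEP_SNPS.get, reverse=True)

for combo, ratio in ratios.items():
    print(f"{combo:15s} sheep/cattle SNP ratio = {ratio:.2f}")
```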
Although the present invention has been described in detail above by way of a general description and specific embodiments, modifications or improvements can be made on the basis of the present invention, as will be apparent to those skilled in the art. Such modifications or improvements, made without departing from the spirit of the present invention, therefore fall within the scope of the claimed invention.

Claims (7)

1. A method for analyzing SNP marker sites of important agricultural economic animals based on genotyping-by-sequencing technology, characterized in that it comprises the following steps:
(1) performing restriction enzyme cleavage site prediction on the target genome and counting the number of digested fragments obtained with each digestion mode;
(2) designing, from the digested fragments predicted for each digestion mode in step (1), the adapter sequences for both ends of each digested fragment and the PCR primer sequences;
(3) constructing a sequencing library for each digestion mode using GBS technology;
(4) sequencing the sequencing libraries constructed in step (3);
(5) obtaining SNP marker sites from the sequencing results;
(6) determining, from the number of SNP marker sites and the digested fragment sizes obtained with the different enzyme combinations, the specific enzyme combination EcoR I-Msp I for the target genome;
wherein, in step (1), the cleavage site prediction for the target genome is a double-digestion prediction;
the important agricultural economic animal is cattle or sheep;
the digestion modes in step (1) comprise the 8 double-enzyme combinations shown in Table 1;
Table 1
in step (2), the adapter sequences for each digested fragment comprise a universal adapter and a barcode adapter; wherein the universal adapter is formed by annealing universal adapter sequence 1 with universal adapter sequence 2, and the barcode adapter is formed by annealing barcode adapter sequence 1 with barcode adapter sequence 2;
the universal adapter sequences, barcode adapter sequences and PCR primer sequences are shown in Table 3 below:
Table 3
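The double-digestion prediction of step (1) can be illustrated with a small in-silico digest. This is a minimal sketch under stated assumptions, not the prediction software used in the patent: the function names and the toy sequence are hypothetical, only the selected EcoR I-Msp I pair is modelled, and the recognition sites (G^AATTC for EcoR I, C^CGG for Msp I) are the standard published ones. A fragment is counted as a double-digest fragment when its two ends were cut by different enzymes.

```python
import re

# Recognition sequence and cut offset within it (top strand, 5'->3')
ENZYMES = {
    "EcoR I": ("GAATTC", 1),  # G^AATTC
    "Msp I": ("CCGG", 1),     # C^CGG
}


def cut_positions(seq, site, offset):
    """Cut coordinates for one enzyme; the lookahead also finds overlapping sites."""
    return [m.start() + offset for m in re.finditer(f"(?={site})", seq)]


def double_digest(seq, enzyme_a, enzyme_b):
    """Return (start, end, length) for fragments flanked by two different enzymes."""
    cuts = []
    for name in (enzyme_a, enzyme_b):
        site, offset = ENZYMES[name]
        cuts += [(pos, name) for pos in cut_positions(seq, site, offset)]
    cuts.sort()
    fragments = []
    # Walk adjacent cut sites; keep only fragments with unlike ends
    for (start, left), (end, right) in zip(cuts, cuts[1:]):
        if left != right:
            fragments.append((start, end, end - start))
    return fragments


# Hypothetical toy sequence with two EcoR I sites and one Msp I site
toy = "AAGAATTCTTTTCCGGAAAAGAATTCGG"
fragments = double_digest(toy, "EcoR I", "Msp I")
print(len(fragments), "double-digest fragments:", fragments)
```

On a real genome the same loop would run per chromosome, and the fragment lengths could then be binned to compare the size distributions of the eight enzyme combinations.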
2. The analysis method according to claim 1, characterized in that step (3) comprises the following steps:
(a) digesting the genome with restriction enzymes to obtain a digestion product;
(b) preparing the universal adapter and the barcode adapter;
(c) mixing the universal adapter and the barcode adapter in a set ratio to form an adapter mixture, then ligating it to the digestion product to obtain a ligation product;
(d) pooling the ligation products in equal proportions to obtain a pooled ligation product;
(e) adding 1.2-1.4 volumes of magnetic beads to the pooled ligation product and performing a first purification to obtain a first purified product;
(f) adding 0.8-0.9 volumes of magnetic beads to the first purified product and performing a second purification to obtain a second purified product;
(g) PCR-amplifying the second purified product to obtain a PCR product;
(h) adding 1.2-1.4 volumes of magnetic beads to the PCR product and performing a third purification to obtain a third purified product;
(i) adding 0.8-0.9 volumes of magnetic beads to the third purified product and performing a fourth purification to obtain a reduced-representation genome sequencing library.
3. The analysis method according to claim 2, characterized in that the first purification and the third purification use the same steps, specifically: after the magnetic beads are added, the system is incubated at room temperature on a rotator for 18-22 min; after incubation, the tube is placed on a magnetic rack and the supernatant is discarded; 480-520 μL of 70% ethanol is added, and after standing for 30-40 s the tube is slowly rotated so that the magnetic beads move along the tube wall; once the solution has cleared, the supernatant is removed, and this step is repeated once to obtain a pellet; Low TE is then added to the pellet, which is pipetted up and down and vortexed, and after centrifugation the tube is left to stand until clear to obtain the supernatant; wherein 140-160 μL of Low TE is added per 100 μL of the pellet.
4. The analysis method according to claim 2, characterized in that the second purification and the fourth purification use the same steps, specifically: after the magnetic beads are added, the system is incubated at room temperature on a rotator for 13-16 min; after incubation, the tube is placed on a magnetic rack and the supernatant is discarded; 480-520 μL of 70% ethanol is added, and after standing for 30-40 s the tube is slowly rotated so that the magnetic beads move along the tube wall; once the solution has cleared, the supernatant is removed, and this step is repeated once to obtain a pellet; Low TE is then added to the pellet, which is pipetted up and down and vortexed, and after centrifugation the tube is left to stand until clear to obtain the supernatant; wherein 30-50 μL of Low TE is added per 100 μL of the pellet.
5. The method according to claim 4, characterized in that the annealing system for the universal adapter in step (b) is: 100 μM universal adapter sequence 1, 5 μL; 100 μM universal adapter sequence 2, 5 μL; 5× Annealing Buffer, 10 μL; nuclease-free water, 30 μL; the annealing program is: heat to 95 °C, cool to 25 °C at a rate of 1 °C/min, hold at 25 °C for 30 min, then store at 4 °C;
the annealing system for the barcode adapter is: 100 μM barcode adapter sequence 1, 5 μL; 100 μM barcode adapter sequence 2, 5 μL; 5× Annealing Buffer, 10 μL; nuclease-free water, 30 μL; the reaction program is: 95 °C for 3 min, cool at a rate of 1 °C/min down to 25 °C, hold at 25 °C for 30 min, then store at 4 °C.
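The annealing programs in claim 5 imply a fixed run time, which can be worked out from the ramp rate. The sketch below is illustrative and uses hypothetical function names; the temperatures, the 1 °C/min ramp, the 3 min initial hold (stated only for the barcode adapter) and the 30 min hold at 25 °C are taken from the claim, and an initial hold of 0 min is assumed for the universal adapter because the claim does not state one.

```python
def ramp_minutes(start_c, end_c, rate_c_per_min):
    """Time needed to move between two temperatures at a constant rate."""
    return abs(start_c - end_c) / rate_c_per_min


def program_minutes(initial_hold_min, start_c, end_c, rate_c_per_min, final_hold_min):
    """Total time before the 4 degC storage step."""
    return (initial_hold_min
            + ramp_minutes(start_c, end_c, rate_c_per_min)
            + final_hold_min)


# Universal adapter: no initial hold is stated in the claim (assumed 0 min here)
universal_total = program_minutes(0, 95, 25, 1, 30)
# Barcode adapter: 95 degC for 3 min before the ramp
barcode_total = program_minutes(3, 95, 25, 1, 30)

print(f"ramp 95 -> 25 degC: {ramp_minutes(95, 25, 1):.0f} min")
print(f"universal adapter program: {universal_total:.0f} min")
print(f"barcode adapter program: {barcode_total:.0f} min")
```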
6. The method according to claim 5, characterized in that the ligation reaction system in step (c) is: digestion product, 20 μL; 5× DNA Ligase Reaction Buffer, 8 μL; DNA ligase, 2 μL; nuclease-free water, 5 μL; adapter mixture, 5 μL; after mixing, the reaction is placed on a PCR instrument with the following program: hold at 22 °C for 1 h, hold at 65 °C for 30 min, then cool to 4 °C for storage.
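The ligation system in claim 6 can be checked arithmetically: the component volumes should add up to a total in which the 5× buffer is diluted to 1×. The sketch below is illustrative, with hypothetical variable names; all volumes are those listed in the claim.

```python
# Component volumes in microlitres, as listed in claim 6
LIGATION_MIX_UL = {
    "digestion product": 20,
    "5x DNA Ligase Reaction Buffer": 8,
    "DNA ligase": 2,
    "nuclease-free water": 5,
    "adapter mixture": 5,
}

BUFFER_STOCK_X = 5  # stock concentration of the reaction buffer

total_ul = sum(LIGATION_MIX_UL.values())
# Final buffer concentration after dilution into the full reaction volume
buffer_final_x = BUFFER_STOCK_X * LIGATION_MIX_UL["5x DNA Ligase Reaction Buffer"] / total_ul

print(f"total reaction volume: {total_ul} uL")
print(f"final buffer concentration: {buffer_final_x:g}x")
```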
7. Use of the method according to any one of claims 1-6 in genotyping research.
CN201510660486.6A 2015-10-13 2015-10-13 The determination method that enzymes combinations are sequenced in genotyping technique is sequenced Active CN105368930B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510660486.6A CN105368930B (en) 2015-10-13 2015-10-13 The determination method that enzymes combinations are sequenced in genotyping technique is sequenced

Publications (2)

Publication Number Publication Date
CN105368930A CN105368930A (en) 2016-03-02
CN105368930B true CN105368930B (en) 2018-11-20

Family

ID=55371556

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510660486.6A Active CN105368930B (en) 2015-10-13 2015-10-13 The determination method that enzymes combinations are sequenced in genotyping technique is sequenced

Country Status (1)

Country Link
CN (1) CN105368930B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106811518A (en) * 2016-08-17 2017-06-09 上海易瑞生物科技有限公司 A kind of thyroid cancer tumor susceptibility gene detection and genotyping kit and its application
CN107609346B (en) * 2017-09-01 2021-03-12 广东省科学院动物研究所 Genome IIB type restriction endonuclease site prediction method and electronic equipment
CN108531545A (en) * 2017-11-20 2018-09-14 广西中医药大学 A method of screening fist rolls up marchantia SSR primers
CN109680041A (en) * 2018-12-25 2019-04-26 上海派森诺生物科技股份有限公司 A kind of processing method based on the sequencing sample for simplifying gene order-checking
CN110343741B (en) * 2019-07-23 2023-06-06 中国林业科学研究院热带林业研究所 Construction method of simplified genome sequencing library based on double digestion
CN110592078A (en) * 2019-09-03 2019-12-20 北京康普森生物技术有限公司 Primer group for bovine sexual amplicon sequencing
CN111524552B (en) * 2020-04-24 2021-05-11 深圳市儒翰基因科技有限公司 Simplified genome sequencing library construction and analysis method, detection equipment and storage medium
CN114507707B (en) * 2020-11-16 2024-05-31 上海韦翰斯生物医药科技有限公司 Method for constructing haplotype by enrichment of target region and enzyme digestion
CN114262747A (en) * 2021-12-30 2022-04-01 中国农业大学 Method for researching inheritance and spread of corn southern rust group based on SNP

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102656279A (en) * 2009-12-17 2012-09-05 凯津公司 Restriction enzyme based whole genome sequencing
CN104480217A (en) * 2014-12-26 2015-04-01 上海派森诺生物科技有限公司 Simplified genome sequencing method
CN104562214A (en) * 2014-12-26 2015-04-29 上海派森诺生物科技有限公司 Reduced-representation genome library building method based on type IIB restriction enzyme digestion
CN104694635A (en) * 2015-02-12 2015-06-10 北京百迈客生物科技有限公司 Method for constructing high-flux simplified genome sequencing library

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
A Robust, Simple Genotyping-by-Sequencing (GBS) Approach for High Diversity Species;Robert J. Elshire等;《PLOS ONE》;20110504;第6卷(第5期);e19379 *
An Efficient Genotyping Method in Chicken Based on Genome Reducing and Sequencing;Rongrong Liao等;《PLOS ONE》;20150827;第10卷(第8期);e0137010,摘要,第2页最后1段至第4页第4段 *
Double Digest RADseq: An Inexpensive Method for De Novo SNP Discovery and Genotyping in Model and Non-Model Species;Brant K. Peterson等;《PLOS ONE》;20120531;第7卷(第5期);e37135,第3页图2、左栏倒数第2段,第4页表1,Protocol_S1 *
Inexpensive Multiplexed Library Preparation for Megabase-Sized Genomes;Michael Baym等;《PLOS ONE》;20150522;第10卷(第5期);e0128036,第3页图2,第5页"Module 4",第6页第3-4段 *

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20240701

Address after: 511466 1205, Floor 12, Building 9 (Building 8), No. 6, Nanjiang Second Road, the Pearl River Street, Nansha District, Guangzhou, Guangdong

Patentee after: Guangzhou Tian Derivatives Technology Co.,Ltd.

Country or region after: China

Address before: 100193 No. 2 Old Summer Palace West Road, Beijing, Haidian District

Patentee before: CHINA AGRICULTURAL University

Country or region before: China