CN103400056A - DNA sequence pattern construction method - Google Patents


Publication number
CN103400056A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2013103582160A
Other languages
Chinese (zh)
Other versions
CN103400056B (en)
Inventor
倪莉
黄志清
郑蓉
林江宏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuzhou University
Original Assignee
Fuzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2013-08-17
Filing date
2013-08-17
Publication date
2013-11-20

Abstract

The invention relates to a method for representing specific DNA sequences within a biological community, and in particular to a method for constructing a DNA sequence pattern. For a specific DNA sequence, all of the sequences from the species or taxa involved in a study are merged into a single DNA sequence pattern according to a base-merging representation rule. The merging can be carried out at the level of species, genus, or another taxonomic unit, or for a specific biological group. The resulting DNA sequence pattern shows the sequence characteristics of the species or taxa in the studied system at a glance, which facilitates analysis and study of the characteristics of the biological community.

Description

A method for constructing a DNA sequence pattern
Technical field
The present invention relates to a method for representing specific DNA sequences within a biological community, and in particular to a method for constructing a DNA sequence pattern.
Background art
A gene is a DNA fragment with a hereditary effect, and different kinds of organisms carry different sets of genes. Some genes, however, are widespread across biological communities because their expression products perform essential life functions. The mitochondrial cytochrome oxidase gene and the rDNA gene, for example, are present in nearly all living cells. In addition, partial sequences of these two genes and their adjacent non-coding regions are highly variable, and different taxa have their own specific sequences. Sequences related to these two genes can therefore serve as markers of particular taxa and are now widely used in species identification and in the analysis of community structure: sequences related to the mitochondrial cytochrome oxidase gene are the preferred fragment for DNA barcoding of species, and sequences related to the rDNA gene are the fragment of choice for studying microbial communities. That all members of one species share an identical sequence is, however, an idealization. In practice these sequences also vary within a species (variation that cannot simply be attributed to subspecies or varieties). The extent of the variation differs from species to species, and in theory it should be the norm for most species. Furthermore, community-level studies sometimes require analysis and comparison across different taxa, and what is needed then is sequence information at the level of the taxon rather than the species. The first situation involves multiple forms of one sequence (within a species), and the second involves multiple sequences (between species or between taxa). Representing the data as many separate sequences is impractical in either case; a sequence pattern can be used instead.
Summary of the invention
The object of the present invention is to provide a method for constructing a DNA sequence pattern. The pattern built by this method shows the sequence characteristics of the studied species or taxa at a glance, and thereby facilitates analysis and study of the characteristics of the biological community.
To achieve this object, the technical solution adopted by the invention is a method for constructing a DNA sequence pattern in which, for a specific DNA sequence, all of the sequences from the species or taxa involved in the study are merged into a single DNA sequence pattern according to a base-merging representation rule. The merging can be carried out at the level of species, genus, or another taxonomic unit, or for a specific biological group.
In an embodiment of the invention, the DNA sequence pattern is constructed by the following steps:
Step 1: Determine the species or taxa covered by the DNA sequence pattern to be built;
Step 2: Select a specific DNA fragment according to the needs of the study, and collect the corresponding DNA sequences of the species or taxa determined in Step 1;
Step 3: Align the DNA sequences collected in Step 2, and discard any sequence that is clearly inconsistent with the others or erroneous;
Step 4: At the species or taxon level required by the study, merge the sequences remaining after Step 3 into a single DNA sequence according to the base-merging representation rule, and proofread the result as necessary. This sequence is the DNA sequence pattern of the species or taxon.
The species or taxa covered by the pattern to be built can be determined from literature reports and preliminary studies.
The base-merging representation rule is as follows. When the alignment at a given site shows:
1) only a, t, c or g: output the symbol given by the base-merging table;
2) a gap together with a single base a, t, c or g: output the capital letter A, T, C or G;
3) a gap together with several bases: output the corresponding merged base as a capital letter;
4) any other letter: ignore it;
5) consecutive missing positions at the start or end of some of the aligned sequences: do not count them as gaps.
A merged base is not a base that actually exists. It is a symbol, other than the four basic bases a, t, c and g, that represents the several bases that may occur at a given site.
The beneficial effect of the invention is that it provides a simple way to represent a specific DNA sequence of the taxa under study within a biological community, namely the DNA sequence pattern. From the pattern it is easy to analyze the features of that DNA sequence in the studied community, including the presence of specific conserved and variable regions and the degree of variability of the variable regions. This provides a basis for tasks such as primer evaluation and design, and helps to resolve accurately the characteristics of the biological community in the studied system.
The invention is described in further detail below with reference to the drawing and specific embodiments.
Brief description of the drawing
Fig. 1 is a flow chart of an embodiment of the invention.
Detailed description of the embodiments
In the method of the invention, for a specific DNA sequence, all of the sequences from the species or taxa involved in the studied system are merged into a single DNA sequence pattern according to a base-merging representation rule. The merging can be carried out at the level of species, genus, or another taxonomic unit, or for a specific biological group. The DNA sequence patterns of the individual species or taxa together make up the DNA sequence pattern of the studied system.
Following this method, and as shown in Fig. 1, the DNA sequence pattern is constructed by the following steps:
Step 1: Determine the species or taxa covered by the DNA sequence pattern to be built;
Step 2: Select a specific DNA fragment according to the needs of the study, and collect the corresponding DNA sequences of the species or taxa determined in Step 1;
Step 3: Align the DNA sequences collected in Step 2, and discard any sequence that is clearly inconsistent with the others or erroneous;
Step 4: At the species or taxon level required by the study, merge the sequences remaining after Step 3 into a single DNA sequence according to the base-merging representation rule, and proofread the result as necessary. This sequence is the DNA sequence pattern of the species or taxon.
Step 5: Analyze the characteristics of the constructed DNA sequence pattern according to the needs of the study, and apply its sequence features accordingly (for example, in primer evaluation and design).
In Step 4 above, the base-merging representation rule is as follows. When the alignment at a given site shows:
1) only a, t, c or g: output the symbol given by the base-merging table;
2) a gap together with a single base a, t, c or g: output the capital letter A, T, C or G;
3) a gap together with several bases: output the corresponding merged base as a capital letter;
4) any other letter: ignore it;
5) consecutive missing positions at the start or end of some of the aligned sequences: do not count them as gaps.
Rule 5 applies because the sequences used to build a pattern do not all start and end at the same position, so consecutive missing positions at the start or end of some of the aligned sequences are not counted as gaps.
A merged base is not a base that actually exists. It is a symbol, other than the four basic bases a, t, c and g, that represents the several bases that may occur at a given site. The base-merging table is as follows:
Table 1 Base-merging table
[Table 1 appears only as an image in the original publication and is not reproduced here.]
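The five merging rules can be sketched in Python as follows. This is a minimal illustration, not the inventors' implementation. Because Table 1 is not reproduced here, the sketch assumes that the merged-base symbols are the standard IUPAC nucleotide ambiguity codes. The function names (`mask_terminal_gaps`, `merge_column`, `build_pattern`), the use of `-` for gaps, and the choice to drop a column that contains no base at all are likewise assumptions made for the example.

```python
# Standard IUPAC ambiguity codes (assumed to correspond to Table 1).
IUPAC = {
    frozenset("a"): "a", frozenset("c"): "c",
    frozenset("g"): "g", frozenset("t"): "t",
    frozenset("ag"): "r", frozenset("ct"): "y",
    frozenset("gt"): "k", frozenset("ac"): "m",
    frozenset("cg"): "s", frozenset("at"): "w",
    frozenset("cgt"): "b", frozenset("agt"): "d",
    frozenset("act"): "h", frozenset("acg"): "v",
    frozenset("acgt"): "n",
}


def mask_terminal_gaps(seq, gap="-", missing="."):
    """Rule 5: leading/trailing gap runs are missing data, not gaps."""
    body = seq.strip(gap)
    lead = len(seq) - len(seq.lstrip(gap))
    trail = len(seq) - lead - len(body)
    return missing * lead + body + missing * trail


def merge_column(column, gap="-"):
    """Merge one alignment column into a single pattern symbol."""
    # Rule 4: letters other than a, c, g, t are ignored.
    bases = {ch.lower() for ch in column if ch.lower() in "acgt"}
    if not bases:
        return ""  # assumption: a column without any base is dropped
    code = IUPAC[frozenset(bases)]  # rule 1
    # Rules 2 and 3: a gap in the column turns the symbol into a capital.
    return code.upper() if gap in column else code


def build_pattern(aligned):
    """Merge equal-length aligned sequences into one sequence pattern."""
    rows = [mask_terminal_gaps(s) for s in aligned]
    return "".join(merge_column(col) for col in zip(*rows))


toy_alignment = ["--acgtagt", "ttac-tagt", "ttgcgta-c"]
toy_pattern = build_pattern(toy_alignment)
```

In the toy alignment the two leading `-` characters of the first row are treated as missing data, so the first two pattern symbols stay lowercase, while the internal gaps produce capital letters.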
Sequences in existing databases inevitably contain some errors or deviations. For example, the sequencing error rate is higher in some parts of a sequence, which is very common at the start and end, and the collected sequences may be in inconsistent formats (some are forward sequences and others are reverse complements). The resulting sequence pattern therefore needs to be corrected. One fact can serve as a basis for this screening: specific DNA sequences such as the rDNA gene and the mitochondrial cytochrome oxidase gene mentioned above remain highly conserved within a species, so a long run of consecutive mutations is unlikely.
The construction of a DNA sequence pattern is based on the currently known community information and the DNA sequences available for it. As research proceeds and deepens, the community information and its DNA sequences will be updated or supplemented, and the same method can then be used to refine a pattern that has already been built.
The invention is further described below with reference to an example.
This example takes the Monascus community, the characteristic microorganisms of the Hongqu (red yeast rice) yellow rice wine brewing system, as the research object. It builds the near-full-length sequence pattern of the Monascus 18S rDNA and analyzes its basic sequence characteristics. The specific steps are as follows:
1. Determine the community information
This example concerns Monascus (Monascus spp.) in the microbial system of Hongqu yellow rice wine brewing.
2. Collect near-full-length 18S rDNA sequences
Near-full-length 18S rDNA sequences of Monascus were retrieved from the Nucleotide database of NCBI. (The full-length fungal 18S rDNA sequence is about 1800 bp, but the database currently holds full-length 18S rDNA sequences for only a few fungal species, so this example collects near-full-length sequences of no less than 1500 bp.) The search condition was: (((18S[Title]) AND Monascus[Organism]) AND 1500:2000[Sequence Length]) NOT 28S[Title].
This search returned 19 sequences; their species and accession numbers are listed in Table 2. The 19 sequences were downloaded and aligned with Clustalx (Clustalx 1.8). The sequence GU733347.1 differed markedly from the other 18 and was therefore discarded.
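The screening in Step 3 is described only as a comparison of the Clustalx alignment; no numeric criterion is given. One possible way to make the check reproducible is sketched below in Python: each aligned sequence is scored by its median pairwise identity to the others, and sequences below a cut-off are flagged for review. The functions `identity` and `flag_outliers`, the 0.8 threshold, and the toy sequences are all assumptions made for this illustration and are not taken from the patent.

```python
from statistics import median


def identity(a, b, gap="-"):
    """Fraction of matching columns among those with a base in both rows."""
    pairs = [(x, y) for x, y in zip(a.lower(), b.lower())
             if x != gap and y != gap]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def flag_outliers(aligned, threshold=0.8):
    """Return names whose median identity to the others is below threshold.

    aligned: dict mapping a sequence name to its aligned sequence.
    threshold: hypothetical cut-off; the source gives no numeric value.
    """
    flagged = []
    for name, seq in aligned.items():
        scores = [identity(seq, other)
                  for other_name, other in aligned.items()
                  if other_name != name]
        if scores and median(scores) < threshold:
            flagged.append(name)
    return flagged


# Toy aligned sequences (not real accessions).
toy = {
    "s1": "acgtacgtac",
    "s2": "acgtacgtac",
    "s3": "acgtacgtat",
    "s4": "ttgaacctga",
}
suspects = flag_outliers(toy)
```

The median is used instead of the mean so that a single divergent sequence does not drag down the score of every other sequence. Flagged sequences would still need manual inspection before being discarded.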
Table 2 Monascus species and their near-full-length 18S rDNA sequences
[Table 2 appears only as an image in the original publication and is not reproduced here.]
Note: sequences were collected from the NCBI nucleotide database as of 1 July 2013. Alignment with Clustalx (Clustalx 1.8) showed that GU733347.1 differs markedly from the other sequences; it was therefore discarded and is marked in bold in the table.
3. Build the near-full-length 18S rDNA sequence pattern
Following the construction steps described above, the 18 verified near-full-length 18S rDNA sequences of Monascus spp. were converted into a sequence pattern (shown as SEQ ID NO.1). The pattern was then aligned with the 18 source sequences using Clustalx (Clustalx 1.8), and proofreading confirmed that it conforms to the conversion rule. However, sites 29-31 of the pattern are W, R and S respectively, and in each case this results from sequence DQ782881.1 disagreeing with the other 17 sequences. In the alignment, DQ782881.1 also has deletions relative to those 17 sequences at sites 44, 50 and 54. All six sites lie at the start of DQ782881.1. Taken together, this suggests inaccurate sequencing of DQ782881.1 rather than mutation, so the result from the other sequences was adopted: sites 29-31 are A, G and C respectively, and the three deletions at sites 44, 50 and 54 are disregarded. After this correction, the near-full-length 18S rDNA sequence pattern of Monascus spp. is as shown in SEQ ID NO.2.
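The correction at sites 29-31 can be reproduced on the first 60 symbols of the two listed patterns. The Python sketch below is only an illustration of the bookkeeping: the helper `apply_corrections` is a hypothetical name, positions are 1-based as in the text, and the symbols are written in lowercase as they appear in the sequence listing.

```python
def apply_corrections(pattern, corrections):
    """Replace pattern symbols at the given 1-based positions."""
    symbols = list(pattern)
    for position, symbol in corrections.items():
        symbols[position - 1] = symbol
    return "".join(symbols)


# First 60 symbols of SEQ ID NO.1 (uncorrected), spaces removed.
seq1_head = ("tgtagtcata" "tgcttgtctc" "aaagattawr"
             "scatgcatgt" "ctaagtgtaa" "gcaatttata")

# First 60 symbols of SEQ ID NO.2 (corrected), spaces removed.
seq2_head = ("tgtagtcata" "tgcttgtctc" "aaagattaag"
             "ccatgcatgt" "ctaagtgtaa" "gcaatttata")

# Sites 29-31 take the bases supported by the other 17 sequences.
corrected = apply_corrections(seq1_head, {29: "a", 30: "g", 31: "c"})
```

Applying the three replacements to the head of SEQ ID NO.1 yields the head of SEQ ID NO.2, which is consistent with the correction described above.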
The corrected near-full-length 18S rDNA sequence pattern of Monascus spp. is 1777 bp long and corresponds to sites 19-1796 of the 18S rDNA sequence of the Saccharomyces cerevisiae reference strain NCYC 505 (accession number Z75578).
4. Analyze the characteristics of the sequence pattern
The variable sites of the near-full-length 18S rDNA sequence pattern of Monascus spp. obtained in section 3 were counted; the results are shown in Table 3.
Table 3 Variable sites of the near-full-length 18S rDNA sequence pattern of Monascus spp.
The sequence pattern has 24 variable sites: 8 insertion or deletion sites, 14 transition sites (9 a/g sites and 5 t/c sites), and 2 transversion sites. Variable sites make up 1.35% of the total length, which shows that the 18S rDNA sequence of Monascus spp. has very low variability. Transition sites are the most numerous, at about 58% (14/24), which is related to the way DNA mutates. In fungal genomic DNA, C is readily methylated to 5-methylcytosine, which can then be deaminated to thymine. Because T is itself one of the basic building blocks of DNA, the change is not readily repaired. Interconversion of T and C (on the original strand) and of A and G (on the complementary strand) is therefore a common type of mutation.
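The counts above can be derived mechanically from a pattern. The Python sketch below classifies each symbol under three assumptions that are not spelled out in the source: capital letters mark insertion or deletion sites (rules 2 and 3), lowercase r and y mark transitions, and lowercase k, m, s and w mark transversions (standard IUPAC meanings). The function name `classify_sites` is hypothetical, and the fragment used is sites 661-720 of SEQ ID NO.2.

```python
def classify_sites(pattern):
    """Count variable-site types in a pattern (symbols only, no spaces)."""
    counts = {"indel": 0, "transition": 0, "transversion": 0, "other": 0}
    for symbol in pattern:
        if symbol.isupper():
            counts["indel"] += 1          # gap seen at this site
        elif symbol in "ry":
            counts["transition"] += 1     # a/g or c/t
        elif symbol in "kmsw":
            counts["transversion"] += 1   # purine/pyrimidine exchange
        elif symbol not in "acgt":
            counts["other"] += 1          # three- or four-base codes
    return counts


# Sites 661-720 of SEQ ID NO.2, spaces removed.
fragment = ("cggccggacc" "tttccttctg" "gggaaAAcct"
            "catggccttc" "aytggctgtg" "gggggaaccC")
fragment_counts = classify_sites(fragment)

# Percentages quoted in the text, recomputed from the stated counts.
variable_share = round(100 * 24 / 1777, 2)   # share of the total length
transition_share = round(100 * 14 / 24, 1)   # share of variable sites
```

For this fragment the sketch finds three insertion or deletion sites and one transition site. Recomputing the shares from the stated totals gives 1.35% for the variable sites and 58.3% for the transitions.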
Further analysis shows that the 220 bp fragment at sites 591-810 (only about 12.5% of the total length) contains 9 variable sites (38% of all variable sites), and the 160 bp fragment at sites 1041-1200 (only about 9.0% of the total length) contains 6 variable sites (25.0% of all variable sites). Therefore, if molecular microbial ecology techniques such as DGGE (denaturing gradient gel electrophoresis) or TGGE (temperature gradient gel electrophoresis) are used to study the Monascus community, the fragment amplified by the selected primers should as far as possible cover these two hypervariable regions.
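Locating such hypervariable regions amounts to listing the positions of the variable symbols and counting how many fall inside a window. The Python sketch below shows one way to do this; `variable_positions` and `count_in_window` are hypothetical helper names, and the `offset` parameter is an assumption added so that a fragment can be analyzed with the numbering of the full pattern. The fragment is sites 1021-1080 of SEQ ID NO.2.

```python
def variable_positions(pattern, offset=0):
    """1-based positions of symbols other than lowercase a, c, g, t.

    offset: number of sites that precede the fragment in the full pattern.
    """
    return [offset + i
            for i, symbol in enumerate(pattern, start=1)
            if symbol not in "acgt"]


def count_in_window(positions, start, end):
    """Number of variable positions with start <= position <= end."""
    return sum(start <= p <= end for p in positions)


# Sites 1021-1080 of SEQ ID NO.2, spaces removed.
fragment = ("gactagggat" "cggacgggtt" "tcyatgatga"
            "cccgttcggc" "accttacgag" "aaAtcaaagt")

positions = variable_positions(fragment, offset=1020)
hits = count_in_window(positions, 1041, 1200)

# Length shares quoted in the text, recomputed from the stated sizes.
share_591_810 = round(100 * 220 / 1777, 1)
share_1041_1200 = round(100 * 160 / 1777, 1)
```

Running the same two functions over the whole of SEQ ID NO.2 with windows 591-810 and 1041-1200 would be the natural way to check the counts of 9 and 6 given above.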
Sequence listing
> SEQ ID NO.1
tgtagtcata tgcttgtctc aaagattawr scatgcatgt ctaagtgtaa gcaatttata 60
ctgtgaaact gcgaatggct cattaaatca gttatcgttt atttgatrgt accttactac 120
atggatacct gtggtaattc tagagctaat acrtgctaaa aaccccgact tcggaagggg 180
tgtatttatt agataaaaaa ccaacgccct tcggggctcc ttggtgaatc ataataacta 240
aacgaatcgc atggccttgc gccggcgatg gttcattcaa atttctgccc tatcaacttt 300
cgatggtagg atagtggcct accatggtgg caacgggtaa cggggaatta gggttcgatt 360
ccggagaggg agcctgagaa acggctacca crtccaagga aggcagcagg cgcgcaaatt 420
acccaatccc gacacgggga ggtagtgaca ataaatactg atacggggct ctttcgggtc 480
tcgtaatcgg aatgagaacg acctaaataa cctaacgagg aacaattgga gggcaagtct 540
ggtgccagca gccgcggtaa ttccagctcc aatagcgtat attaaagttg ttgcagktaa 600
aaagctcgta gttgaacctt gggtctggct ggccggtccg cctcaycgcg agtactggtc 660
cggccggacc tttccttctg gggaaAAcct catggccttc aytggctgtg gggggaaccC 720
rggactttta ctgtgaaaaa attagagtgt tcaaagcagg cctttgctcg aatacattag 780
catggaaAta atagaatagg acgtgcggGt tctattttgt tggtttctag gaccgccgta 840
atgattaata gggatagtcg ggggcgtcag tattcagctg tcagaggtga aattcttgga 900
tttgctgaag actaactact gcgaaagcat tcgccaagga tgttttcatt aatcagggaa 960
cgaaagttag gggatcgaag acgatcagat accgtcgtag tcttaaccat aaactatgcc 1020
gactagggat cggacgggtt tcyatgatga cccgttcggc accttacgag aaAtcaaagt 1080
ttttgggttc tggggggagt rtggtcgcaa ggctgaaact taaagraatt gacggaaggg 1140
caccacaagg cgtggagcct gcggcttaat ttgactcaac acggggaarc tyaccaggtc 1200
cagacaaaat aaggattgac agattgagag ctctttcttg atcttttgga tggtggtgca 1260
tggccgttcc tagttggtgg agtgatttgt ctgcttaatt gcgataacga acgagacctc 1320
ggcccttaaa tagcccggtc cgcatttgcg gGccgctggc ttcttaaggg gactatcggc 1380
tcaagccgat ggaagtgcgc ggcaataaca ggtctgtgat gcccttagat gttctgggcc 1440
gcrcgcgcgc tacactgaca gggccCagcg agtacatcac cttggccgag aggcctgggt 1500
aatcttgtta aaccctgtcg tgctggggat agagcattgc aattattgct cttcaacgag 1560
gaatgcctag taggcacgag tcatcagctc gtgccgayta cgtccctgcc ctttgtacac 1620
accgcccgtc gctactaccg attgaatggc tcagtgaggc ctccggactg gcccagggag 1680
gttggcaacg acccccccgg gccggaaagc tggtcaaact cggtcattta gargaagtaa 1740
aagtcgtaac aaggtttmcg taggtgaacc tgcggaa 1777
> SEQ ID NO.2
tgtagtcata tgcttgtctc aaagattaag ccatgcatgt ctaagtgtaa gcaatttata 60
ctgtgaaact gcgaatggct cattaaatca gttatcgttt atttgatrgt accttactac 120
atggatacct gtggtaattc tagagctaat acrtgctaaa aaccccgact tcggaagggg 180
tgtatttatt agataaaaaa ccaacgccct tcggggctcc ttggtgaatc ataataacta 240
aacgaatcgc atggccttgc gccggcgatg gttcattcaa atttctgccc tatcaacttt 300
cgatggtagg atagtggcct accatggtgg caacgggtaa cggggaatta gggttcgatt 360
ccggagaggg agcctgagaa acggctacca crtccaagga aggcagcagg cgcgcaaatt 420
acccaatccc gacacgggga ggtagtgaca ataaatactg atacggggct ctttcgggtc 480
tcgtaatcgg aatgagaacg acctaaataa cctaacgagg aacaattgga gggcaagtct 540
ggtgccagca gccgcggtaa ttccagctcc aatagcgtat attaaagttg ttgcagktaa 600
aaagctcgta gttgaacctt gggtctggct ggccggtccg cctcaycgcg agtactggtc 660
cggccggacc tttccttctg gggaaAAcct catggccttc aytggctgtg gggggaaccC 720
rggactttta ctgtgaaaaa attagagtgt tcaaagcagg cctttgctcg aatacattag 780
catggaaAta atagaatagg acgtgcggGt tctattttgt tggtttctag gaccgccgta 840
atgattaata gggatagtcg ggggcgtcag tattcagctg tcagaggtga aattcttgga 900
tttgctgaag actaactact gcgaaagcat tcgccaagga tgttttcatt aatcagggaa 960
cgaaagttag gggatcgaag acgatcagat accgtcgtag tcttaaccat aaactatgcc 1020
gactagggat cggacgggtt tcyatgatga cccgttcggc accttacgag aaAtcaaagt 1080
ttttgggttc tggggggagt rtggtcgcaa ggctgaaact taaagraatt gacggaaggg 1140
caccacaagg cgtggagcct gcggcttaat ttgactcaac acggggaarc tyaccaggtc 1200
cagacaaaat aaggattgac agattgagag ctctttcttg atcttttgga tggtggtgca 1260
tggccgttcc tagttggtgg agtgatttgt ctgcttaatt gcgataacga acgagacctc 1320
ggcccttaaa tagcccggtc cgcatttgcg gGccgctggc ttcttaaggg gactatcggc 1380
tcaagccgat ggaagtgcgc ggcaataaca ggtctgtgat gcccttagat gttctgggcc 1440
gcrcgcgcgc tacactgaca gggccCagcg agtacatcac cttggccgag aggcctgggt 1500
aatcttgtta aaccctgtcg tgctggggat agagcattgc aattattgct cttcaacgag 1560
gaatgcctag taggcacgag tcatcagctc gtgccgayta cgtccctgcc ctttgtacac 1620
accgcccgtc gctactaccg attgaatggc tcagtgaggc ctccggactg gcccagggag 1680
gttggcaacg acccccccgg gccggaaagc tggtcaaact cggtcattta gargaagtaa 1740
aagtcgtaac aaggtttmcg taggtgaacc tgcggaa 1777

Claims (3)

1. A method for constructing a DNA sequence pattern, characterized in that: for a specific DNA sequence, all of the sequences from the species or taxa involved in the study are merged into a single DNA sequence pattern according to a base-merging representation rule; the merging can be carried out at the level of species, genus, or another taxonomic unit, or for a specific biological group.
2. The method for constructing a DNA sequence pattern according to claim 1, characterized in that the DNA sequence pattern is constructed by the following steps:
Step 1: Determine the species or taxa involved in the study;
Step 2: Select a specific DNA fragment according to the needs of the study, and collect the corresponding DNA sequences of the species or taxa determined in Step 1;
Step 3: Align the DNA sequences collected in Step 2, and discard any sequence that is clearly inconsistent with the others or erroneous;
Step 4: At the species or taxon level required by the study, merge the sequences remaining after Step 3 into a single DNA sequence according to the base-merging representation rule, and proofread the result as necessary; this sequence is the DNA sequence pattern of the species or taxon involved in the study.
3. The method for constructing a DNA sequence pattern according to claim 1 or 2, characterized in that the base-merging representation rule is as follows: when the alignment at a given site shows 1) only a, t, c or g, output the symbol given by the base-merging table; 2) a gap together with a single base a, t, c or g, output the capital letter A, T, C or G; 3) a gap together with several bases, output the corresponding merged base as a capital letter; 4) any other letter, ignore it; 5) consecutive missing positions at the start or end of some of the aligned sequences, do not count them as gaps. The notation in the base-merging table complies with the State Intellectual Property Office's rules for representing nucleotide sequences.
CN201310358216.0A 2013-08-17 2013-08-17 DNA sequence pattern construction method Active CN103400056B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310358216.0A CN103400056B (en) 2013-08-17 2013-08-17 DNA sequence pattern construction method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310358216.0A CN103400056B (en) 2013-08-17 2013-08-17 DNA sequence pattern construction method

Publications (2)

Publication Number Publication Date
CN103400056A true CN103400056A (en) 2013-11-20
CN103400056B CN103400056B (en) 2017-04-12

Family

ID=49563683

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310358216.0A Active CN103400056B (en) 2013-08-17 2013-08-17 DNA sequence pattern construction method

Country Status (1)

Country Link
CN (1) CN103400056B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1900318A (en) * 2006-06-23 2007-01-24 厦门大学 High variation zone amplication primer of rockfish mitochondrial genome and its design method
US20070287151A1 (en) * 2004-03-25 2007-12-13 Sten Linnarsson Methods and Means for Nucleic Acid Sequencing
US20100114918A1 (en) * 2007-05-31 2010-05-06 Isentio As Generation of degenerate sequences and identification of individual sequences from a degenerate sequence
CN102867134A (en) * 2012-08-16 2013-01-09 盛司潼 System and method for splicing gene sequence fragments

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
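Su Chunli et al.: "Phylogenetic relationships of cultivated Ganoderma strains in China based on rDNA ITS sequences", Acta Microbiologica Sinica *
Ma Lingbo et al.: "Phylogenetic relationships of the genus Scylla based on three mitochondrial DNA gene sequences", Journal of Tropical Oceanography *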

Also Published As

Publication number Publication date
CN103400056B (en) 2017-04-12

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant