CN109504795A - Method for determining a santal genome-specific sequence and method for identifying santal - Google Patents

Info

Publication number
CN109504795A
Authority
CN
China
Prior art keywords
santal
sequence
nucleotide sequence
sample
genome
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811583474.8A
Other languages
Chinese (zh)
Inventor
谢尚潜
栾美薇
邢剑锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hainan University
Original Assignee
Hainan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hainan University
Priority to CN201811583474.8A
Publication of CN109504795A
Legal status: Pending

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6888: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
    • C12Q1/6895: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for plants, fungi or algae

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Engineering & Computer Science (AREA)
  • Organic Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Health & Medical Sciences (AREA)
  • Biotechnology (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Immunology (AREA)
  • Mycology (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Botany (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention discloses a method for determining a santal genome-specific sequence and a method for identifying santal, and relates to the field of gene technology. The determination method comprises: obtaining santal whole-genome sequencing data from a gene database and determining santal candidate region sequences; aligning the santal candidate region sequences against the genome sequences of other species in the gene database, rejecting the candidate sequences that align to the genome sequences of other species, and retaining those that do not align to them as the santal-specific nucleotide sequence; and mapping the santal-specific nucleotide sequence to the santal genome to determine its location in the santal genome. The identification method comprises identifying santal using the nucleotide sequence shown in SEQ ID NO.1. By obtaining a santal-specific nucleotide sequence and exploiting its specificity, the invention tests the authenticity of santal at the genetic level and improves the accuracy of santal identification results.

Description

Method for determining a santal genome-specific sequence and method for identifying santal
Technical field
The present invention relates to the technical field of biological genes, and in particular to a method for determining a santal genome-specific sequence and a method for identifying santal.
Background
Santal (sandalwood, Santalum album L.) is an evergreen hemiparasitic plant of the family Santalaceae. It has the highest santal oil content in the family and is of considerable economic value. The heartwood of santal is a rare Chinese medicinal material; offcuts such as roots and trunk pieces can be refined into sandalwood essential oil, commonly known as "liquid gold"; and the branches pruned off during sprouting and growth are a raw material for high-grade incense products. The Compendium of Materia Medica, the earliest record, states that it "treats dysphagia with vomiting of food; and for black spots on the face, wash and rub the skin every night with pulp water until it reddens, then apply the ground juice". Santal is pungent in flavor, warm in nature and nontoxic; it regulates qi and harmonizes the stomach, and is used for pain in the heart and abdomen and for discomfort of the chest and diaphragm. The Chinese Pharmacopoeia (1990 edition) recorded the medicinal value of santal heartwood for the first time.
Because every part of the santal tree is valuable, it has great economic worth. In recent years it has been over-exploited, so that wild sandalwood resources have steadily declined and even become scarce; santal has now been listed for control under the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES). As market demand keeps growing, some traders seeking profit pass off structurally similar woods such as Chinese catalpa, cypress and yellow cedar as sandalwood, with seriously harmful consequences.
Accurate identification of sandalwood is therefore particularly important. Several conventional methods already exist, including judging tissue structure from microscopic characteristics, medicinal-material characters and physiological-biochemical characteristics, and analyzing composition by gas chromatography-mass spectrometry (GC-MS). However, identification by physical features cannot distinguish imitations with a similar appearance and fiber structure (such as Japanese cypress and osmanthus China wood); identification by physiological-biochemical characteristics is easily affected by low-temperature environments, which distorts the result; and GC-MS compositional analysis is a qualitative method that requires temperature control and a large amount of sample, and reagent impurities also affect the accuracy of the result. A method that can identify santal accurately is therefore urgently needed.
Summary of the invention
In view of this, embodiments of the present invention provide a method for determining a santal genome-specific sequence and a method for identifying santal, with the main purpose of solving the problem of inaccurate santal identification results.
To achieve the above purpose, the present invention mainly provides the following technical solutions:
In one aspect, an embodiment of the present invention provides a method for determining a santal genome-specific sequence, the method comprising the following steps: obtaining santal whole-genome sequencing data from a gene database and determining santal candidate region sequences;
aligning the santal candidate region sequences against the genome sequences of other species in the gene database, rejecting from the santal candidate region sequences those that align to the genome sequences of the other species, and retaining those that do not align to the genome sequences of the other species as the santal-specific nucleotide sequence;
mapping the santal-specific nucleotide sequence to the santal genome and determining the location information of the santal-specific nucleotide sequence in the santal genome.
Preferably, the santal-specific nucleotide sequence includes the nucleotide sequence shown in SEQ ID NO.1.
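The location information determined in the last step can be stored as a BED record. The following minimal Python sketch converts a region string of the form reported in the application example (NXEK01000069.1:634129-633982) into BED fields. It assumes the coordinates are 1-based and inclusive and that a start greater than the end indicates a reverse-strand hit; the function names and the record name `santal_specific` are hypothetical and not part of the original disclosure.

```python
from typing import Tuple


def region_to_bed(region: str) -> Tuple[str, int, int, str]:
    """Convert 'contig:a-b' (assumed 1-based, inclusive) to BED fields.

    BED uses a 0-based start and an exclusive end. A descending pair
    (a > b) is treated here as a reverse-strand hit; that reading is an
    assumption, not something the source states.
    """
    contig, coords = region.rsplit(":", 1)
    first, second = (int(x) for x in coords.split("-"))
    strand = "+" if first <= second else "-"
    low, high = min(first, second), max(first, second)
    return contig, low - 1, high, strand


def bed_line(region: str, name: str = "santal_specific") -> str:
    """Format the region as a tab-separated six-column BED line."""
    contig, start, end, strand = region_to_bed(region)
    return "\t".join([contig, str(start), str(end), name, "0", strand])


if __name__ == "__main__":
    print(bed_line("NXEK01000069.1:634129-633982"))
```

Under these assumptions the interval spans 148 bases of the contig; whether the original position.bed used the same convention is not stated in the source.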
In another aspect, an embodiment of the present invention provides a method for identifying santal, the method comprising the following steps:
using the above santal-specific nucleotide sequence and the location information as the standard information for identifying santal;
extracting the genome of the sample to be tested and sequencing it to obtain sample genome sequencing data;
determining the sample nucleotide sequence in the sample genome sequencing data using the location information in the standard information;
comparing the santal-specific nucleotide sequence in the standard information with the sample nucleotide sequence and calculating the base alignment identity (the proportion of aligned bases that match); when the alignment identity is greater than 95%, the sample to be tested is santal.
Preferably, the specific process of determining the sample nucleotide sequence in the sample genome sequencing data using the location information in the standard information is: using BWA-MEM software to align the sample genome sequencing data to the santal reference genome sequence, obtaining an alignment result that contains the corresponding chromosome and position information; and using samtools software and the location information in the standard information to obtain the sample nucleotide sequence from the alignment result.
Compared with the prior art, the beneficial effects of the present invention are:
The present invention determines, for the first time, a santal-specific nucleotide sequence from santal genome information and then uses that sequence to test the authenticity of santal by sequence alignment. Because the specific sequence is a biomarker of the santal species retained over its long-term evolution, it can be used to pick out genuine santal accurately from among many species. The method identifies santal at the genetic level and improves the accuracy of the identification result.
Detailed description of embodiments
To further explain the technical means adopted by the present invention to achieve its intended purpose and their effects, the specific embodiments, technical solutions, features and effects of the present invention are described in detail below with reference to preferred embodiments. Particular features, structures or characteristics of the embodiments described below may be combined in any suitable form.
The technical terms used in the present invention are explained as follows:
Specific sequence: a nucleotide fragment in a DNA molecule, or an amino acid fragment in a protein, that remains essentially unchanged during evolution. That is, a specific sequence is a sequence in a species' genome that has stayed constant over long-term evolution and is not affected by natural selection; such a sequence is a biomarker, retained through long-term evolution, that represents a particular species.
ClustalW: a progressive multiple sequence alignment method. The sequences are first aligned pairwise to build a distance matrix that reflects the pairwise relationships between them; a phylogenetic guide tree is then computed from the distance matrix, and closely related sequences are weighted; finally, starting from the two most closely related sequences, neighboring sequences are introduced step by step and the alignment is rebuilt each time, until all sequences have been added. Determining species-specific sequence regions from a species' genome with ClustalW is prior art.
BWA-MEM: software for aligning the reads obtained by sequencing to a reference genome.
Samtools: a software tool for handling the SAM and BAM formats; it supports viewing binary files, format conversion, sorting, merging and other functions.
Blastn: an analysis tool for similarity searches of a nucleotide sequence against a nucleotide (DNA) database.
ITS sequence: internal transcribed spacer; in the rDNA genes, the spacer sequence between the 5.8S rDNA and 28S rDNA genes is called ITS.
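The first stage of the progressive alignment described for ClustalW, building a pairwise distance matrix and picking the closest pair to start from, can be illustrated with a minimal Python sketch. It is a deliberate simplification: it uses the fraction of mismatched positions (p-distance) between equal-length toy sequences, whereas ClustalW itself derives its distances from pairwise alignments. The function names and the toy sequences are hypothetical.

```python
from itertools import combinations
from typing import Dict, Tuple


def p_distance(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length in this sketch")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(seqs: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """Pairwise p-distances for every unordered pair of named sequences."""
    return {(m, n): p_distance(seqs[m], seqs[n]) for m, n in combinations(seqs, 2)}


def closest_pair(seqs: Dict[str, str]) -> Tuple[str, str]:
    """The pair a progressive aligner would start from (smallest distance)."""
    dists = distance_matrix(seqs)
    return min(dists, key=dists.get)


if __name__ == "__main__":
    toy = {"A": "ACGTACGT", "B": "ACGTACGA", "C": "TTGTACCA"}
    print(distance_matrix(toy))
    print(closest_pair(toy))
```

A real guide tree would then be built from this matrix; that step is omitted here.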
Embodiment 1
Obtaining the santal DNA sequence:
(1) Obtain santal whole-genome sequencing data from the NCBI database and use ClustalW to determine candidate sequence regions from these data;
(2) Align the collected santal candidate region sequences against the genome sequences of other species in NCBI; reject from the collected candidate sequences those that align to other species in the database, and retain the candidate sequences that do not align to the genomes of other species as the santal-specific nucleotide sequence, recorded as conserved.fa;
(3) Map the santal-specific nucleotide sequence (conserved.fa) to the santal reference genome to determine its exact location and its nucleotide sequence in the santal genome. These serve as the standard information for santal identification: the gene sequence information is recorded as sample.fa and the location information as position.bed. The santal-specific nucleotide sequence obtained includes the nucleotide sequence shown in SEQ ID NO.1.
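Step (2) amounts to a set difference: candidates with a hit in another species are dropped and the rest are kept. A minimal Python sketch follows, assuming the hits come from a BLAST tabular report (`-outfmt 6`), whose first column is the query ID. The helper names, candidate IDs and report lines are hypothetical; the source does not say which alignment tool or output format was used for this step.

```python
from typing import Dict, Iterable, Set


def ids_with_hits(blast_tabular_lines: Iterable[str]) -> Set[str]:
    """Collect query IDs (column 1) from BLAST tabular output lines."""
    hits = set()
    for line in blast_tabular_lines:
        line = line.strip()
        if line and not line.startswith("#"):
            hits.add(line.split("\t")[0])
    return hits


def keep_specific(candidates: Dict[str, str], hit_ids: Set[str]) -> Dict[str, str]:
    """Retain only candidates with no hit in any other species."""
    return {name: seq for name, seq in candidates.items() if name not in hit_ids}


def to_fasta(records: Dict[str, str]) -> str:
    """Serialize records as FASTA text (for example, for conserved.fa)."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records.items())


if __name__ == "__main__":
    candidates = {"cand1": "ACGT", "cand2": "GGCC", "cand3": "TTAA"}
    report = ["cand2\tother_sp\t98.0\t4\t0\t0\t1\t4\t1\t4\t1e-5\t50"]
    print(to_fasta(keep_specific(candidates, ids_with_hits(report))))
```

In practice an identity or coverage cutoff would decide what counts as a hit; the source does not give one, so the sketch treats any reported hit as disqualifying.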
Species identification method:
(1) Select the sample of the species to be identified, extract its DNA and sequence it to obtain the DNA sequence information of the sample to be tested, recorded in the file sample.fastq;
(2) Use the BWA-MEM alignment software to align the sequencing data of the sample to be tested (sample.fastq) to the santal reference genome sequence, generating alignment results that contain the corresponding chromosome positions and sequences, recorded in a BAM file;
(3) Use samtools to extract from the BAM file the nucleotide sequence of the sample in the specific sequence region given by position.bed, recorded as the file sample_filter.fa;
(4) Use blastn to compare the sample nucleotide sequence obtained in step (3) (sample_filter.fa) with the santal-specific nucleotide sequence (conserved.fa) obtained in step (2) of the sequence determination above, and calculate the base alignment identity; an identity greater than 95% indicates genuine santal.
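The four identification steps can be chained as external tool calls. The Python sketch below only assembles the command lines and does not claim to reproduce the inventors' exact invocation. The reference file name `santal_ref.fa`, the intermediate file names, and the use of `samtools consensus` (available in recent samtools releases) to turn the extracted reads into a single FASTA sequence are all assumptions. The reference is assumed to have been indexed with `bwa index` beforehand.

```python
import subprocess
from typing import List


def build_commands(ref: str = "santal_ref.fa",
                   reads: str = "sample.fastq",
                   bed: str = "position.bed",
                   standard: str = "conserved.fa") -> List[List[str]]:
    """Return command lines for alignment, extraction and comparison."""
    return [
        # Step (2): align reads to the reference; SAM is written via -o.
        ["bwa", "mem", "-o", "sample.sam", ref, reads],
        # Sort and index the alignments so regions can be looked up.
        ["samtools", "sort", "-o", "sample.sorted.bam", "sample.sam"],
        ["samtools", "index", "sample.sorted.bam"],
        # Step (3): keep only alignments overlapping the BED interval(s).
        ["samtools", "view", "-b", "-L", bed, "-o", "region.bam",
         "sample.sorted.bam"],
        # Assumed step: collapse the extracted reads into one FASTA sequence.
        # The output may still need trimming to the exact interval.
        ["samtools", "consensus", "-f", "fasta", "-o", "sample_filter.fa",
         "region.bam"],
        # Step (4): compare sample and standard sequences, tabular output.
        ["blastn", "-query", "sample_filter.fa", "-subject", standard,
         "-outfmt", "6", "-out", "blast_result.tsv"],
    ]


def run_pipeline(commands: List[List[str]]) -> None:
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

With `-outfmt 6`, the third column of `blast_result.tsv` is the percent identity, which is the figure compared against the 95% threshold.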
Application example 1
The santal-specific nucleotide sequence obtained in Embodiment 1 (shown in SEQ ID NO.1) is used to test the authenticity of santal.
(1) Determine the standard santal-specific nucleotide sequence: using the method of Embodiment 1, the present invention determined ITS (240-399) to be a santal-specific nucleotide sequence (recorded in the file conserved.fa), shown in SEQ ID NO.1:
AACGACTCTCGGCAACGGATATCTCGGCTCTTGCATCGATGAAGAACGTAGCGAAATGCGATACTTGGTGTGAATTGCAGAATCCCGTGAACCATCGAGTCTTTGAACGCAAGTTGCGCCCGAAGCCACTAGGCCAAGGGCACGCCTGCCTGGGTGTCAC
Bases 240-399 of the ITS were aligned to the santal reference genome, and the position on the reference genome was determined to be NXEK01000069.1:634129-633982, which was recorded in position.bed.
ITS: 247-399;
Ref: 634129-633982. The alignment is shown in Table 1.
Table 1. Alignment of the ITS specific sequence with the santal reference genome
(2) Obtain the nucleotide sequence of the sample to be tested:
First obtain the DNA sequencing data of the sample to be tested. From the alignment result (BAM file) of the sample DNA sequence against the santal reference genome, extract the nucleotide sequence of the sample at the position given in position.bed and record it in the file sample_filter.fa; it is shown in SEQ ID NO.2:
AACGACTCTCGGCAACGGATATCTCGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATACTTGGTGTGAATTGCAGAATCCCGTGAATCATCGAGTCTTTGAACGCAAGTTGCGCCCGAAGCCATTAGGTTAAGGGCACGCCTGCCTGGGTGTCAC
(3) Compare the standard santal-specific nucleotide sequence with the nucleotide sequence of the sample to be tested:
The nucleotide sequence of the sample to be tested (file sample_filter.fa) was compared with the standard santal-specific nucleotide sequence (conserved.fa) using blastn. The alignment identity reached 96%, so the sample can be identified as santal. The comparison result is shown in Table 2.
Nucleotide sequence of the sample to be tested (Sample): KM521377.1:31-190;
Santal-specific nucleotide sequence (ITS): 240-399.
Table 2. Alignment of the sample nucleotide sequence with the santal-specific sequence
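The comparison in step (3) can be checked by hand. The Python sketch below counts matching positions between SEQ ID NO.1 and SEQ ID NO.2 as given in the sequence listing and applies the 95% threshold. It is an ungapped, position-by-position comparison, which suffices here because both sequences are 160 bases long; blastn, the tool named in the text, performs a local alignment and may report a slightly different figure. The function names are hypothetical.

```python
# Sequences copied from the sequence listing; spaces are removed on load.
SEQ_ID_NO_1 = (
    "aacgactctc ggcaacggat atctcggctc ttgcatcgat gaagaacgta gcgaaatgcg "
    "atacttggtg tgaattgcag aatcccgtga accatcgagt ctttgaacgc aagttgcgcc "
    "cgaagccact aggccaaggg cacgcctgcc tgggtgtcac"
).replace(" ", "")

SEQ_ID_NO_2 = (
    "aacgactctc ggcaacggat atctcggctc tcgcatcgat gaagaacgca gcgaaatgcg "
    "atacttggtg tgaattgcag aatcccgtga atcatcgagt ctttgaacgc aagttgcgcc "
    "cgaagccatt aggttaaggg cacgcctgcc tgggtgtcac"
).replace(" ", "")


def identity(standard: str, sample: str) -> float:
    """Fraction of positions that match in two equal-length sequences."""
    if len(standard) != len(sample):
        raise ValueError("this sketch compares equal-length sequences only")
    matches = sum(x == y for x, y in zip(standard.lower(), sample.lower()))
    return matches / len(standard)


def is_santal(standard: str, sample: str, threshold: float = 0.95) -> bool:
    """Apply the decision rule: identity strictly greater than 95%."""
    return identity(standard, sample) > threshold


if __name__ == "__main__":
    # 154 of 160 positions match, i.e. 96.25%, above the 95% threshold.
    print(f"{identity(SEQ_ID_NO_1, SEQ_ID_NO_2):.2%}",
          is_santal(SEQ_ID_NO_1, SEQ_ID_NO_2))
```

The result is consistent with the 96% identity reported for the sample above.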
Matters not described exhaustively in the embodiments of the present invention may be selected from the prior art by those skilled in the art.
The above are only specific embodiments of the present invention, but the scope of protection of the present invention is not limited thereto. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be determined by the scope of protection of the claims.
Sequence table
<110> Hainan University
<120> Method for determining a santal genome conserved sequence, its application, and identification method
<160> 2
<170> SIPOSequenceListing 1.0
<210> 1
<211> 160
<212> DNA
<213> Artificial Sequence
<400> 1
aacgactctc ggcaacggat atctcggctc ttgcatcgat gaagaacgta gcgaaatgcg 60
atacttggtg tgaattgcag aatcccgtga accatcgagt ctttgaacgc aagttgcgcc 120
cgaagccact aggccaaggg cacgcctgcc tgggtgtcac 160
<210> 2
<211> 160
<212> DNA
<213> Artificial Sequence
<400> 2
aacgactctc ggcaacggat atctcggctc tcgcatcgat gaagaacgca gcgaaatgcg 60
atacttggtg tgaattgcag aatcccgtga atcatcgagt ctttgaacgc aagttgcgcc 120
cgaagccatt aggttaaggg cacgcctgcc tgggtgtcac 160

Claims (4)

1. A method for determining a santal genome-specific sequence, characterized in that the method comprises the following steps:
obtaining santal whole-genome sequencing data from a gene database and determining santal candidate region sequences;
aligning the santal candidate region sequences against the genome sequences of other species in the gene database, rejecting from the santal candidate region sequences those that align to the genome sequences of the other species, and retaining those that do not align to the genome sequences of the other species as the santal-specific nucleotide sequence;
mapping the santal-specific nucleotide sequence to the santal genome and determining the location information of the santal-specific nucleotide sequence in the santal genome.
2. The method for determining a santal genome-specific sequence according to claim 1, characterized in that the santal-specific nucleotide sequence includes the nucleotide sequence shown in SEQ ID NO.1.
3. A method for identifying santal, characterized in that the method comprises the following steps:
using the santal-specific nucleotide sequence and the location information according to claim 1 or 2 as the standard information for identifying santal;
extracting the genome of the sample to be tested and sequencing it to obtain sample genome sequencing data;
determining the sample nucleotide sequence in the sample genome sequencing data using the location information in the standard information;
comparing the santal-specific nucleotide sequence in the standard information with the sample nucleotide sequence and calculating the base alignment identity; when the alignment identity is greater than 95%, the sample to be tested is santal.
4. The method for identifying santal according to claim 3, characterized in that the specific process of determining the sample nucleotide sequence in the sample genome sequencing data using the location information in the standard information is: using BWA-MEM software to align the sample genome sequencing data to the santal reference genome sequence, obtaining an alignment result that contains the corresponding chromosome and position information; and using samtools software and the location information in the standard information to obtain the sample nucleotide sequence from the alignment result.
CN201811583474.8A 2018-12-24 2018-12-24 Method for determining a santal genome-specific sequence and method for identifying santal Pending CN109504795A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811583474.8A CN109504795A (en) 2018-12-24 2018-12-24 Method for determining a santal genome-specific sequence and method for identifying santal

Publications (1)

Publication Number Publication Date
CN109504795A (en) 2019-03-22

Family

ID=65754482

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811583474.8A Pending CN109504795A (en) 2018-12-24 2018-12-24 Method for determining a santal genome-specific sequence and method for identifying santal

Country Status (1)

Country Link
CN (1) CN109504795A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication
Application publication date: 2019-03-22