CN109504795A - Method for determining a sandalwood (santal) genome-specific sequence and method for identifying sandalwood
- Publication number
- CN109504795A (application number CN201811583474.8A)
- Authority
- CN
- China
- Prior art keywords
- santal
- sequence
- nucleotide sequence
- sample
- genome
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6888—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
- C12Q1/6895—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for plants, fungi or algae
Abstract
The invention discloses a method for determining a sandalwood genome-specific sequence and a method for identifying sandalwood, and relates to the field of gene technology. The determination method comprises: obtaining sandalwood whole-genome sequencing data from a gene database and determining sandalwood candidate region sequences; aligning the candidate region sequences against the genome sequences of other species in the gene database, rejecting the candidate sequences that align to other species, and retaining those that do not align to any other species as the sandalwood nucleotide-specific sequence; and mapping the sandalwood nucleotide-specific sequence to the sandalwood genome to determine its location in that genome. The identification method identifies sandalwood using the nucleotide sequence shown in SEQ ID NO.1. By obtaining a sandalwood nucleotide-specific sequence and exploiting its specificity, the invention authenticates sandalwood at the genetic level and improves the accuracy of identification results.
Description
Technical field
The present invention relates to the field of biological gene technology, and in particular to a method for determining a sandalwood genome-specific sequence and a method for identifying sandalwood.
Background art
Sandalwood (Santalum album L.) is an evergreen hemiparasitic plant of the family Santalaceae. Its oil content is the highest within Santalaceae, and it has important economic value. The heartwood of sandalwood is a rare traditional Chinese medicine; offcuts such as roots and trunk can be refined into sandalwood essential oil, commonly called "liquid gold"; and the shoots and branches pruned during growth are raw material for high-grade incense products. The Compendium of Materia Medica was the earliest to record that it "treats dysphagia and vomiting of food; for dark moles on the face, wash and rub them with pulp water every night until red, then apply the ground juice". Sandalwood is pungent in flavor, warm in nature and nontoxic; it regulates qi and harmonizes the stomach, and can treat pain in the heart and abdomen and discomfort of the chest and diaphragm. The Chinese Pharmacopoeia (1990 edition) recorded for the first time that sandalwood heartwood has medicinal value.
Because every part of the sandalwood tree is valuable, it has enormous economic value. In recent years it has been over-exploited and over-used, so wild sandalwood resources have gradually declined, even to the point of scarcity, and sandalwood is now controlled under the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES). As market demand keeps growing, some people chasing profit pass off woods of similar structure, such as Chinese catalpa, cypress and yellow cedar, as sandalwood, with seriously adverse effects.
Accurate identification of sandalwood is therefore particularly important. Some conventional identification methods exist, including judging tissue structure from microscopic characteristics, examining medicinal-material characters and physiological-biochemical characteristics, and analysing composition by gas chromatography-mass spectrometry (GC-MS). However, identification by physical characteristics cannot distinguish imitations with similar appearance and fibre structure (such as Japanese cypress and osmanthus China wood). Physiological-biochemical identification is easily affected by low-temperature environments, which distorts the result. GC-MS composition analysis is a qualitative method that requires temperature control and a relatively large sample, and reagent impurities also affect the accuracy of the result. A method that can identify sandalwood accurately is therefore urgently needed.
Summary of the invention
In view of this, embodiments of the invention provide a method for determining a sandalwood genome-specific sequence and a method for identifying sandalwood, the main purpose of which is to solve the problem of inaccurate sandalwood identification results.
To achieve this purpose, the invention mainly provides the following technical solutions:
In one aspect, an embodiment of the invention provides a method for determining a sandalwood genome-specific sequence, the method comprising the following steps:
obtaining sandalwood whole-genome sequencing data from a gene database and determining sandalwood candidate region sequences;
aligning the sandalwood candidate region sequences against the genome sequences of other species in the gene database, rejecting from the candidate region sequences those that align to the genome sequences of other species, and retaining those that do not align to the genome sequence of any other species as the sandalwood nucleotide-specific sequence;
mapping the sandalwood nucleotide-specific sequence to the sandalwood genome and determining the location information of the sandalwood nucleotide-specific sequence in the sandalwood genome.
Preferably, the sandalwood nucleotide-specific sequence includes the nucleotide sequence shown in SEQ ID NO.1.
In another aspect, an embodiment of the invention provides a method for identifying sandalwood, the method comprising the following steps:
using the above sandalwood nucleotide-specific sequence and the location information as the standard information for identifying sandalwood;
extracting the genome of a sample to be tested and sequencing it to obtain sample genome sequencing data;
determining the sample nucleotide sequence in the sample genome sequencing data using the location information in the standard information;
aligning the sandalwood nucleotide-specific sequence in the standard information with the sample nucleotide sequence and calculating the base alignment concordance rate; when the alignment concordance rate is greater than 95%, the sample to be tested is sandalwood.
Preferably, the specific process of determining the sample nucleotide sequence in the sample genome sequencing data using the location information in the standard information is: aligning the sample genome sequencing data to the sandalwood reference genome sequence with BWA-MEM software to obtain an alignment result containing the corresponding chromosome and position information; and obtaining the sample nucleotide sequence from the alignment result using samtools software and the location information in the standard information.
Compared with the prior art, the beneficial effects of the invention are:
The invention determines, for the first time, a sandalwood nucleotide-specific sequence from sandalwood genomic information, and then uses that sequence to distinguish genuine from fake sandalwood by sequence alignment. Because the specific sequence is a biomarker representing the sandalwood species that has been retained through the species' long-term evolution, its specificity allows genuine sandalwood to be identified accurately among many species. The method identifies sandalwood at the genetic level and improves the accuracy of identification results.
Specific embodiments
To further explain the technical means adopted by the invention to achieve its intended purpose, and their effects, the specific embodiments, technical solutions, features and effects of the invention are described in detail below with reference to preferred embodiments. Particular features, structures or characteristics of the embodiments described below may be combined in any suitable form.
The technical terms used in the invention are explained as follows:
Specific sequence: a nucleotide fragment in a DNA molecule, or an amino acid fragment in a protein, that remains essentially unchanged during evolution. That is, a specific sequence is a sequence in a species' genome that has remained unchanged through long-term evolution and is not affected by natural selection; such a sequence is a biomarker, retained through long-term evolution, that represents the particular species.
ClustalW: a progressive multiple sequence alignment method. The sequences are first aligned pairwise to build a distance matrix reflecting the pairwise relationships between them; a phylogenetic guide tree is then computed from the distance matrix, and closely related sequences are weighted; then, starting from the two most closely related sequences, neighbouring sequences are introduced step by step and the alignment is rebuilt until all sequences have been added. Using ClustalW to determine a species-specific sequence region from species genomes is prior art.
BWA-MEM: software for aligning reads obtained by sequencing to a reference genome.
Samtools: a toolkit for handling the SAM and BAM formats, providing functions such as viewing binary files, format conversion, sorting and merging.
Blastn: the BLAST program for similarity comparison of a nucleotide query against a nucleotide (DNA) database; the BLAST family as a whole compares sequences against protein or DNA databases.
ITS sequence: in the rDNA genes, the spacer sequence between the 5.8S rDNA and 28S rDNA genes is called ITS (internal transcribed spacer).
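As a rough illustration of the first ClustalW stage described above (pairwise distances, then starting from the closest pair), the following Python sketch builds a distance matrix from the fraction of mismatched positions. It is a simplification under stated assumptions: the sequences are assumed to be already aligned and of equal length, the toy sequences are invented, and the function names are hypothetical. ClustalW itself derives distances from pairwise alignments and builds a guide tree, which is not reproduced here.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(seqs):
    """Pairwise distances keyed by (name_i, name_j) for every unordered pair."""
    return {(i, j): p_distance(seqs[i], seqs[j]) for i, j in combinations(seqs, 2)}


def closest_pair(matrix):
    """The pair a progressive aligner would join first (smallest distance)."""
    return min(matrix, key=matrix.get)


# Invented toy sequences, not data from the patent.
toy = {"A": "ACGTACGT", "B": "ACGTACGA", "C": "TTGTACCA"}
matrix = distance_matrix(toy)
first_pair = closest_pair(matrix)
```

In this toy case the pair ("A", "B") has the smallest distance, so a progressive aligner would start from those two sequences.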
Embodiment 1
Obtaining the sandalwood DNA sequence:
(1) Sandalwood whole-genome sequencing data were obtained from the NCBI database, and candidate sequence regions were determined from these data using ClustalW;
(2) The collected sandalwood candidate region sequences were aligned against the genome sequences of other species in NCBI; candidate sequences that aligned to other species in the database were rejected, and the candidate sequences that did not align to the genome of any other species were retained as the sandalwood nucleotide-specific sequence, recorded as conserved.fa;
(3) The sandalwood nucleotide-specific sequence (conserved.fa) was mapped to the sandalwood reference genome to determine its exact location and its nucleotide sequence in the sandalwood genome. These serve as the standard gene sequence information for sandalwood identification (recorded as sample.fa) and the location information (recorded as position.bed). The sandalwood nucleotide-specific sequence obtained includes the nucleotide sequence shown in SEQ ID NO.1.
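The rejection step in (2) can be sketched as follows. The patent does not say which tool or thresholds were used for this comparison, so this Python sketch assumes the alignment against other species was run with BLAST and saved in tabular form (`-outfmt 6`), where the first column is the query ID. The function names and the toy records are hypothetical.

```python
def parse_fasta(text):
    """Parse FASTA text into {record_id: sequence}; the ID is the first word after '>'."""
    records, current = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:].split()[0]
            records[current] = ""
        elif current is not None:
            records[current] += line
    return records


def ids_with_hits(blast_tabular):
    """Collect query IDs (column 1) that produced at least one hit."""
    return {line.split("\t")[0] for line in blast_tabular.splitlines() if line.strip()}


def retain_specific(candidates, hit_ids):
    """Keep only the candidates that did not align to any other species."""
    return {rid: seq for rid, seq in candidates.items() if rid not in hit_ids}


# Invented toy data: cand2 aligns to another species, so it is rejected.
candidates = parse_fasta(">cand1\nACGTACGT\n>cand2\nGGGGCCCC\n>cand3\nTTTTAAAA\n")
hits = "cand2\tother_species_chr1\t98.5\t8\t0\t0\t1\t8\t100\t107\t1e-5\t15.0\n"
conserved = retain_specific(candidates, ids_with_hits(hits))
```

The records left in `conserved` correspond to what this embodiment writes to conserved.fa.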
Species identification method:
(1) A sample of the species to be identified was selected, and its DNA was extracted and sequenced to obtain the DNA sequence information of the sample to be tested, recorded in the file sample.fastq;
(2) The sequencing data of the sample to be tested (sample.fastq) were aligned to the sandalwood reference genome sequence using BWA-MEM, generating alignment results containing the corresponding chromosome position and sequence, recorded in a BAM file;
(3) Using samtools, the nucleotide sequence of the sample's specific-sequence region corresponding to the position.bed file was extracted from the BAM file and recorded as the file sample_filter.fa;
(4) Using blastn, the sample nucleotide sequence obtained in step (3) (sample_filter.fa) was compared with the sandalwood nucleotide-specific sequence (conserved.fa) obtained in step (2) of the sequence-determination procedure above, and the base alignment concordance rate was calculated; a rate greater than 95% indicates genuine sandalwood.
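Steps (2) and (3) could be driven from Python roughly as below. The patent names BWA-MEM and samtools but gives no command lines, so the subcommands and options used here (`bwa mem`, `samtools sort -o`, `samtools index`, `samtools view -b -L`) are one plausible choice, and `ref.fa` and the intermediate SAM/BAM file names are hypothetical. The sketch only builds the commands; running them requires both tools to be installed and the reference to be indexed beforehand with `bwa index`. Converting the extracted alignments into sample_filter.fa is left out because the patent does not describe that step.

```python
import subprocess


def build_commands(ref="ref.fa", reads="sample.fastq", bed="position.bed"):
    """Return the alignment and region-extraction commands as argument lists."""
    return [
        # Align reads to the reference; bwa mem writes SAM to stdout.
        ["bwa", "mem", ref, reads],
        # Sort the alignments by coordinate and write BAM.
        ["samtools", "sort", "-o", "sample.sorted.bam", "sample.sam"],
        # Index the sorted BAM.
        ["samtools", "index", "sample.sorted.bam"],
        # Keep only alignments overlapping the regions listed in the BED file.
        ["samtools", "view", "-b", "-L", bed,
         "-o", "sample.region.bam", "sample.sorted.bam"],
    ]


def run_pipeline(commands):
    """Run the commands in order; the first command's stdout is saved as sample.sam."""
    with open("sample.sam", "w") as sam:
        subprocess.run(commands[0], stdout=sam, check=True)
    for cmd in commands[1:]:
        subprocess.run(cmd, check=True)


commands = build_commands()
```

Calling `run_pipeline(commands)` would execute the four steps in order and stop at the first failure, since `check=True` raises on a non-zero exit status.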
Application Example 1
The sandalwood nucleotide-specific sequence obtained in Embodiment 1 (shown in SEQ ID NO.1) was used to authenticate sandalwood.
(1) Determining the standard sandalwood nucleotide-specific sequence: using the method of Embodiment 1, ITS (240-399) was determined to be a sandalwood nucleotide-specific sequence (recorded in the file conserved.fa), as shown in SEQ ID NO.1:
AACGACTCTCGGCAACGGATATCTCGGCTCTTGCATCGATGAAGAACGTAGCGAAATGCGATACTTGG
TGTGAATTGCAGAATCCCGTGAACCATCGAGTCTTTGAACGCAAGTTGCGCCCGAAGCCACTAGGCCAAGGGCACG
CCTGCCTGGGTGTCAC
Bases 240-399 of the ITS were aligned to the sandalwood reference genome, and the position on the reference genome was determined to be NXEK01000069.1:634129-633982, which was recorded in position.bed.
ITS: 247-399;
Ref: 634129-633982; the alignment is shown in Table 1.
Table 1. Alignment of the ITS specific sequence with the sandalwood reference genome
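The patent does not show the contents of position.bed. As an assumption, the reported position NXEK01000069.1:634129-633982 could be stored as a BED record as sketched below. BED uses 0-based, half-open coordinates with start less than end, so a descending range is written in ascending order with a minus strand. The function name and the `ITS_specific` label are hypothetical.

```python
def region_to_bed(region, name="ITS_specific"):
    """Convert 'chrom:a-b' (1-based, inclusive, possibly descending) to a BED6 line."""
    chrom, coords = region.rsplit(":", 1)
    first, second = (int(x) for x in coords.split("-"))
    strand = "-" if first > second else "+"
    low, high = min(first, second), max(first, second)
    # BED start is 0-based; BED end is exclusive, so it equals the 1-based end.
    return "\t".join([chrom, str(low - 1), str(high), name, "0", strand])


bed_line = region_to_bed("NXEK01000069.1:634129-633982")
```

Splitting on the last colon keeps the version suffix of the accession (`.1`) inside the chromosome name.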
(2) Obtaining the nucleotide sequence of the sample to be tested:
DNA sequencing data of the sample to be tested were obtained first. From the result of aligning the sample DNA sequence to the sandalwood reference genome (the BAM file), the sample's nucleotide sequence at the position given in the position.bed file was extracted and recorded in the file sample_filter.fa, as shown in SEQ ID NO.2:
AACGACTCTCGGCAACGGATATCTCGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATACTTGG
TGTGAATTGCAGAATCCCGTGAATCATCGAGTCTTTGAACGCAAGTTGCGCCCGAAGCCATTAGGTTAAGGGCACG
CCTGCCTGGGTGTCAC
(3) Comparing the standard sandalwood nucleotide-specific sequence with the nucleotide sequence of the sample to be tested:
The nucleotide sequence of the sample to be tested (recorded in the file sample_filter.fa) was compared with the standard sandalwood nucleotide-specific sequence (conserved.fa) using blastn. The alignment concordance rate reached 96%, so the sample to be tested can be identified as sandalwood; the alignment result is shown in Table 2.
Nucleotide sequence of the sample to be tested (Sample): KM521377.1:31-190;
Sandalwood nucleotide sequence (ITS): 240-399.
Table 2. Alignment of the sample nucleotide sequence with the sandalwood specific sequence
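The concordance rate can be checked directly from the two sequences in the sequence listing. This Python sketch uses a simple position-by-position comparison rather than blastn, an assumption that only works here because SEQ ID NO.1 and SEQ ID NO.2 have the same length; the function names are hypothetical.

```python
# Sequences copied from the sequence listing (SEQ ID NO.1 and SEQ ID NO.2).
SEQ_ID_1 = (
    "aacgactctc ggcaacggat atctcggctc ttgcatcgat gaagaacgta gcgaaatgcg"
    "atacttggtg tgaattgcag aatcccgtga accatcgagt ctttgaacgc aagttgcgcc"
    "cgaagccact aggccaaggg cacgcctgcc tgggtgtcac"
).replace(" ", "")

SEQ_ID_2 = (
    "aacgactctc ggcaacggat atctcggctc tcgcatcgat gaagaacgca gcgaaatgcg"
    "atacttggtg tgaattgcag aatcccgtga atcatcgagt ctttgaacgc aagttgcgcc"
    "cgaagccatt aggttaaggg cacgcctgcc tgggtgtcac"
).replace(" ", "")


def concordance_rate(a, b):
    """Fraction of identical bases between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this simple comparison needs equal-length sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def is_sandalwood(rate, threshold=0.95):
    """Apply the decision rule: concordance rate greater than 95%."""
    return rate > threshold


rate = concordance_rate(SEQ_ID_1, SEQ_ID_2)
verdict = is_sandalwood(rate)
```

For these two sequences, 154 of 160 positions match (96.25%), which is consistent with the reported 96% and exceeds the 95% threshold.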
Matters not described exhaustively in the embodiments of the invention can be selected from the prior art by those skilled in the art.
The above discloses only specific embodiments of the invention, but the scope of protection of the invention is not limited to them. Any change or substitution that a person familiar with the technical field could readily conceive within the technical scope disclosed by the invention shall fall within the scope of protection of the invention. The scope of protection of the invention shall therefore be determined by the scope of protection of the claims.
Sequence table
<110> Hainan University
<120> Method for determining a sandalwood genome conserved sequence, its application, and identification method
<160> 2
<170> SIPOSequenceListing 1.0
<210> 1
<211> 160
<212> DNA
<213> Artificial Sequence
<400> 1
aacgactctc ggcaacggat atctcggctc ttgcatcgat gaagaacgta gcgaaatgcg 60
atacttggtg tgaattgcag aatcccgtga accatcgagt ctttgaacgc aagttgcgcc 120
cgaagccact aggccaaggg cacgcctgcc tgggtgtcac 160
<210> 2
<211> 160
<212> DNA
<213> Artificial Sequence
<400> 2
aacgactctc ggcaacggat atctcggctc tcgcatcgat gaagaacgca gcgaaatgcg 60
atacttggtg tgaattgcag aatcccgtga atcatcgagt ctttgaacgc aagttgcgcc 120
cgaagccatt aggttaaggg cacgcctgcc tgggtgtcac 160
Claims (4)
1. A method for determining a sandalwood genome-specific sequence, characterized in that the method comprises the following steps:
obtaining sandalwood whole-genome sequencing data from a gene database and determining sandalwood candidate region sequences;
aligning the sandalwood candidate region sequences against the genome sequences of other species in the gene database, rejecting from the candidate region sequences those that align to the genome sequences of other species, and retaining those that do not align to the genome sequence of any other species as the sandalwood nucleotide-specific sequence;
mapping the sandalwood nucleotide-specific sequence to the sandalwood genome and determining the location information of the sandalwood nucleotide-specific sequence in the sandalwood genome.
2. The method for determining a sandalwood genome-specific sequence according to claim 1, characterized in that the sandalwood nucleotide-specific sequence includes the nucleotide sequence shown in SEQ ID NO.1.
3. A method for identifying sandalwood, characterized in that the method comprises the following steps:
using the sandalwood nucleotide-specific sequence and the location information according to claim 1 or 2 as the standard information for identifying sandalwood;
extracting the genome of a sample to be tested and sequencing it to obtain sample genome sequencing data;
determining the sample nucleotide sequence in the sample genome sequencing data using the location information in the standard information;
aligning the sandalwood nucleotide-specific sequence in the standard information with the sample nucleotide sequence and calculating the base alignment concordance rate; when the alignment concordance rate is greater than 95%, the sample to be tested is sandalwood.
4. The method for identifying sandalwood according to claim 3, characterized in that the specific process of determining the sample nucleotide sequence in the sample genome sequencing data using the location information in the standard information is: aligning the sample genome sequencing data to the sandalwood reference genome sequence with BWA-MEM software to obtain an alignment result containing the corresponding chromosome and position information; and obtaining the sample nucleotide sequence from the alignment result using samtools software and the location information in the standard information.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811583474.8A CN109504795A (en) | 2018-12-24 | 2018-12-24 | A kind of determination method of santal genome specific sequence and the identification method of santal |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811583474.8A CN109504795A (en) | 2018-12-24 | 2018-12-24 | A kind of determination method of santal genome specific sequence and the identification method of santal |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109504795A true CN109504795A (en) | 2019-03-22 |
Family
ID=65754482
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811583474.8A Pending CN109504795A (en) | 2018-12-24 | 2018-12-24 | A kind of determination method of santal genome specific sequence and the identification method of santal |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109504795A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0265662A2 (en) * | 1986-09-22 | 1988-05-04 | Vladimir Dr. Badmajew | Pharmaceutic composition |
CN104404129A (en) * | 2014-05-06 | 2015-03-11 | 广州白云山和记黄埔中药有限公司 | DNA barcode identification method of Isodon serra(Maxim.)Kudo and relative species |
CN105779634A (en) * | 2016-05-17 | 2016-07-20 | 中国林业科学研究院资源昆虫研究所 | Reference gene used for molecular identification of santalum album linn and molecular identification method |
CN106434645A (en) * | 2016-11-29 | 2017-02-22 | 广东药科大学 | ITS (internal transcribed spacer) sequence of dalbergia odorifera and method for identifying dalbergia odorifera by ITS sequence |
CN106529171A (en) * | 2016-11-09 | 2017-03-22 | 上海派森诺医学检验所有限公司 | Detection analysis method for breast cancer susceptibility gene heritable variation point |
CN106929575A (en) * | 2017-02-24 | 2017-07-07 | 中国林业科学研究院木材工业研究所 | A kind of miniature DNA bar code and its discrimination method and application for differentiating red sandalwood and dyestuff red sandalwood |
Non-Patent Citations (3)
Title |
---|
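Zhang Chengfei et al.: "DNA barcode identification technology for the genus Narcissus", Journal of Fujian Agriculture and Forestry University (Natural Science Edition) * |
Li Ying et al.: "Screening of Fritillaria-specific DNA barcodes based on whole chloroplast genomes", World Science and Technology - Modernization of Traditional Chinese Medicine * |
Huang Hai et al.: "Screening of DNA barcode sequences for plants of the genus Dendrobium", Chinese Journal of Tropical Crops * |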
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20190322 |