CN112575104A - Method for quickly positioning industrial hemp character related gene - Google Patents
- Publication number
- CN112575104A (application number CN202011463353.7A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6888—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
- C12Q1/6895—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for plants, fungi or algae
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01H—NEW PLANTS OR NON-TRANSGENIC PROCESSES FOR OBTAINING THEM; PLANT REPRODUCTION BY TISSUE CULTURE TECHNIQUES
- A01H1/00—Processes for modifying genotypes ; Plants characterised by associated natural traits
- A01H1/02—Methods or apparatus for hybridisation; Artificial pollination ; Fertility
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6844—Nucleic acid amplification reactions
- C12Q1/6858—Allele-specific amplification
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/20—Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B25/00—ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
- G16B25/20—Polymerase chain reaction [PCR]; Primer or probe design; Probe optimisation
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
- G16B30/10—Sequence alignment; Homology search
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/13—Plant traits
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/156—Polymorphic or mutational markers
Abstract
A method for rapidly locating trait-related genes in industrial hemp, belonging to the technical field of industrial hemp research. Depending on the target trait, the method comprises variety selection, extreme-population construction, specific-locus amplified fragment (SLAF) library construction and high-throughput sequencing, marker development, association analysis, gene annotation and real-time fluorescent quantitative PCR. High and low mixed pools are constructed from the F1 population obtained by crossing parents that differ markedly in the target trait. The parents are resequenced, and the two pools are sequenced by reduced-representation genome deep sequencing. SLAF tags are developed and single nucleotide polymorphisms (SNPs) are detected, and marker association analysis based on the difference in genotype frequency between the pools yields candidate regions related to the target trait. In most plant populations gene mapping is possible only after propagation to at least the F2 generation. Because industrial hemp is highly heterozygous, mapping can be carried out in the F1 generation, so trait genes can be located quickly, efficiently and at low cost.
Description
Technical Field
The invention belongs to the technical field of industrial hemp research, and in particular relates to a method for rapidly locating trait-related genes in industrial hemp.
Background
Industrial hemp (Cannabis sativa L.) is an annual herbaceous plant of the genus Cannabis in the family Cannabaceae, with a tetrahydrocannabinol (THC) content below 0.3%. Its stems, flowers and leaves, and seeds all have economic value, and it is widely used in the textile, construction, papermaking, pharmaceutical and food industries. The main breeding objective for an industrial hemp variety can be set according to the needs of a specific industry. At present, industrial hemp is bred mainly by conventional methods, but breeding new varieties in this way takes a long time and rarely achieves a customized breeding objective. Molecular breeding can both improve breeding efficiency and make breeding precise, but it presupposes that trait-related genes have been located so that reliable molecular markers can be found. Because molecular biology research on industrial hemp started late, the relevant functional genes have not yet been located. Advances in sequencing technology and bioinformatics now make rapid location of trait genes possible.
Disclosure of Invention
The invention provides a method for rapidly locating trait-related genes in industrial hemp, which addresses problems such as the late start of molecular biology research on industrial hemp and the fact that the relevant functional genes have not yet been located.
In order to achieve the purpose, the technical scheme adopted by the invention is as follows:
A method for rapidly locating trait-related genes in industrial hemp comprises the following steps, carried out according to the target trait: variety selection, extreme-population construction, specific-locus amplified fragment (SLAF) library construction and high-throughput sequencing, marker development, association analysis, gene annotation and real-time fluorescent quantitative PCR.
Further, the variety selection is specifically: two varieties that differ markedly in the target trait but perform relatively consistently in the field for other traits (including agronomic traits, yield, disease resistance and the like) are selected as the male and female parents.
Further, the extreme population is constructed as follows: the two varieties are sown at suitable times according to their flowering periods. At the bud stage, the male plants of the female-parent variety are pulled out or their male flowers are removed, so that the hybrid seed obtained is the F1 generation. In the F1 population, industrial hemp plants of the same sex are selected and tagged at the bud stage, and young leaves are snap-frozen in liquid nitrogen and stored at -80 °C for later use. The industrial hemp is harvested at the technical maturity stage or the seed maturity stage, and the target-trait data are measured.
Further, the SLAF library construction and high-throughput sequencing are specifically: the samples are divided into two markedly different groups according to the statistics of the target trait, with at least 30 and up to 100 plants selected for each group (about 5% of all samples; for example, 50 plants per group from a population of 1000). DNA is extracted, and the DNA of the individual plants is mixed in equal amounts to construct a high pool and a low pool. In silico digestion of the industrial hemp reference genome is predicted with the digestion prediction software SLAF-Predict. The DNA pools are digested with endonucleases, the digested fragments are recovered, an A is added at the 3' end, and Dual-index sequencing adapters are ligated. The DNA fragments at the same position in each sample are amplified by PCR, purified, pooled and gel-cut. After the SLAF library passes quality inspection, the amplification products are sequenced on a sequencing system. The sequencing results are aligned with BWA, using rice (Oryza sativa) as a control, to evaluate whether the digestion scheme is effective.
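The grouping rule above (about 5% of the population per pool, with 30 to 100 plants in each) can be sketched as follows. This is only an illustration under stated assumptions: the function name, the parameter names and the rule of clamping the 5% figure into the 30-100 range are hypothetical and are not taken from the patent.

```python
def select_extreme_pools(phenotypes, fraction=0.05, min_size=30, max_size=100):
    """Pick the plants for the high and low pools from measured trait values.

    phenotypes: dict mapping a plant ID to its measured trait value.
    fraction, min_size, max_size: assumed reading of "about 5%, 30-100 plants".
    Returns (high_ids, low_ids).
    """
    # Rank plants from the lowest to the highest trait value.
    ranked = sorted(phenotypes.items(), key=lambda item: item[1])
    # About 5% of the population, clamped into the 30-100 range.
    n = max(min_size, min(max_size, round(len(ranked) * fraction)))
    if 2 * n > len(ranked):
        raise ValueError("population too small for two non-overlapping pools")
    low_ids = [plant_id for plant_id, _ in ranked[:n]]
    high_ids = [plant_id for plant_id, _ in ranked[-n:]]
    return high_ids, low_ids


if __name__ == "__main__":
    # Hypothetical population of 1000 plants with made-up trait values.
    population = {f"plant{i}": float(i) for i in range(1000)}
    high, low = select_extreme_pools(population)
    print(len(high), len(low))
```

With 1000 plants the sketch selects 50 plants per pool, matching the example given in the text.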
Further, the marker development is specifically: the reads obtained are clustered, the distribution of the SLAF tags on each chromosome of industrial hemp is plotted, and SNPs are detected with the GATK software package from the alignment of the clean reads to the reference genome.
Further, the association analysis is specifically: before the association analysis, the SNP loci are filtered to remove loci with a read support below 4, loci with multiple genotypes, loci where the recessive-pool genotype does not come from the recessive parent, and loci whose genotypes are identical between the pools. The association analysis uses the SNP-index method to find significant differences in genotype frequency between the pools, measured by Δ(SNP-index): the more strongly an SNP is associated with the target trait, the closer Δ(SNP-index) is to 1. The calculation formulas are: SNP-index(aa) = Maa / (Maa + Paa); SNP-index(ab) = Mab / (Mab + Pab); Δ(SNP-index) = SNP-index(aa) - SNP-index(ab).
Note: Paa is the depth in the aa pool derived from the male parent, Maa is the depth in the aa pool derived from the female parent, Pab is the depth in the ab pool derived from the male parent, and Mab is the depth in the ab pool derived from the female parent.
False-positive loci are eliminated mainly by using the positions of the markers on the genome: the Δ(SNP-index) values are fitted with the SNPNUM method, and regions above the association threshold, which is calculated from computer simulation experiments, are selected as trait-related candidate regions.
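The formulas above can be written out as a short sketch. It assumes that "read support" means the total depth at a locus within one pool; that reading, the constant name and the function names are assumptions made for illustration and do not come from the patent.

```python
MIN_READ_SUPPORT = 4  # loci with read support below 4 are filtered out


def snp_index(m_depth, p_depth):
    """SNP-index of one pool: M / (M + P).

    m_depth: depth derived from the female parent (Maa or Mab).
    p_depth: depth derived from the male parent (Paa or Pab).
    Returns None when the locus fails the read-support filter (assumed
    here to apply to the total depth M + P).
    """
    total = m_depth + p_depth
    if total < MIN_READ_SUPPORT:
        return None
    return m_depth / total


def delta_snp_index(maa, paa, mab, pab):
    """Delta(SNP-index) = SNP-index(aa) - SNP-index(ab), or None if filtered."""
    index_aa = snp_index(maa, paa)
    index_ab = snp_index(mab, pab)
    if index_aa is None or index_ab is None:
        return None
    return index_aa - index_ab


if __name__ == "__main__":
    # Hypothetical depths at a single SNP locus.
    print(delta_snp_index(maa=18, paa=2, mab=10, pab=10))
```

A locus where nearly all aa-pool reads carry the maternal allele while the ab pool is evenly split gives a value well above zero. Values near 1 indicate strong association, as stated in the text.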
Further, the gene annotation is specifically: the coding genes in the candidate regions obtained by the association analysis are annotated in depth against NR, Swiss-Prot, GO, KEGG and COG, and candidate genes are rapidly screened according to the annotation results.
Further, the real-time fluorescent quantitative PCR is specifically: 2-3 varieties that differ in the target trait are selected, sample RNA is extracted and reverse-transcribed, 1-2 primer pairs are designed for each candidate gene and tested, and the primers that pass testing are used for relative quantitative PCR analysis.
Compared with the prior art, the invention has the following beneficial effects: high and low mixed pools are constructed from the F1 population obtained by crossing parents that differ markedly in the target trait; the parents are resequenced and the pools are sequenced by reduced-representation genome deep sequencing (SLAF-seq); SLAF tags are developed and single nucleotide polymorphisms (SNPs) are detected; and marker association analysis based on the difference in genotype frequency between the pools yields candidate regions related to the target trait. In most plant populations gene mapping is possible only after propagation to at least the F2 generation, whereas the high heterozygosity of industrial hemp allows gene mapping in the F1 generation.
Drawings
FIG. 1 is a map of the distribution of SLAF tags (black lines) on the chromosomes of industrial hemp;
FIG. 2 is a map of the distribution of SNP-index association values on the chromosomes;
Note: the abscissa is the chromosome name, the black dots are the calculated SNP-index (or ΔSNP-index) values, and the black lines are the fitted SNP-index (or ΔSNP-index) values. The upper panel shows the distribution of SNP-index values in the recessive pool; the middle panel shows the distribution of SNP-index values in the dominant pool; the lower panel shows the distribution of ΔSNP-index values, where the dashed line is the 99th-percentile threshold line.
FIG. 3 is a pathway distribution map of the genes within the candidate regions;
FIG. 4 is a plot of the fluorescent quantitative PCR results for gene LOC115705530 (*P < 0.05, **P < 0.01);
FIG. 5 is a plot of the fluorescent quantitative PCR results for gene LOC115707511 (*P < 0.05, **P < 0.01);
FIG. 6 is a plot of the fluorescent quantitative PCR results for gene LOC115704794 (*P < 0.05, **P < 0.01);
FIG. 7 is a plot of the fluorescent quantitative PCR results for gene LOC115705371 (*P < 0.05, **P < 0.01);
FIG. 8 is a plot of the fluorescent quantitative PCR results for gene LOC115705688 (*P < 0.05, **P < 0.01).
Detailed Description
The technical solutions of the present invention are further described below with reference to the drawings and the embodiments, but the present invention is not limited thereto, and modifications or equivalent substitutions may be made to the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.
Example 1:
(1) Variety selection: Golden Knife-15, which has a high fiber content, is selected as the male parent and Hemp I, which has a low fiber content, as the female parent, and the two are crossed.
(2) Population construction: Golden Knife-15 and Hemp I flower at the same time, so the two varieties are sown at the same time. At the bud stage the male plants of Hemp I are pulled out, and the hybrid seed obtained is the F1 generation. In the F1 population, industrial hemp plants of the same sex are selected and tagged at the bud stage, and young leaves are snap-frozen in liquid nitrogen and stored at -80 °C for later use. The industrial hemp is harvested at the technical maturity stage; after fresh peeling, the raw stem weight and the fiber weight of each plant are measured and the fiber content is calculated. The calculation formula is: fiber content = fiber weight / raw stem weight × 100%.
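The fiber content formula in step (2) can be checked with a few lines of code. The function name and the sample weights below are hypothetical; only the formula itself comes from the text.

```python
def fiber_content_percent(fiber_weight, raw_stem_weight):
    """Fiber content (%) = fiber weight / raw stem weight x 100.

    Both weights must be in the same unit (for example grams per plant).
    """
    if raw_stem_weight <= 0:
        raise ValueError("raw stem weight must be positive")
    return fiber_weight / raw_stem_weight * 100.0


if __name__ == "__main__":
    # Hypothetical single-plant measurement: 12 g of fiber from a 48 g raw stem.
    print(fiber_content_percent(12.0, 48.0))
```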
(3) SLAF library construction and high-throughput sequencing: according to the fiber content results, the 30 plants with the highest content and the 30 plants with the lowest content are selected, DNA is extracted, and the DNA of the individual plants is mixed in equal amounts to construct the high and low pools. In silico digestion of the industrial hemp reference genome is predicted with the digestion prediction software SLAF-Predict, giving 105,823 SLAF tags that are distributed essentially uniformly over the chromosomes, as shown in FIG. 1. RsaI and HaeIII are selected to digest the DNA pools; the digested fragments are recovered, an A is added at the 3' end, and Dual-index sequencing adapters are ligated. The DNA fragments at the same position in each sample are amplified by polymerase chain reaction (PCR), purified, pooled and gel-cut. After the SLAF library passes quality inspection, the amplification products are sequenced on an Illumina HiSeq™ 2500 system. The sequencing results are aligned with BWA, using rice (Oryza sativa) as a control, to evaluate the validity of the digestion scheme.
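The idea behind the in silico digestion prediction in step (3) can be illustrated with a toy digest. This sketch is not SLAF-Predict and does not reproduce its algorithm. It only cuts a sequence at the known blunt-end recognition sites of RsaI (GT^AC) and HaeIII (GG^CC) and keeps fragments within a length window. The function names and the length window are hypothetical, since the text gives no fragment size range.

```python
import re

# Recognition motif and cut offset within the motif; both enzymes cut blunt
# in the middle of a 4-bp site: RsaI GT^AC, HaeIII GG^CC.
SITES = {"RsaI": ("GTAC", 2), "HaeIII": ("GGCC", 2)}


def cut_positions(sequence, sites=SITES):
    """Return the sorted cut coordinates of all enzymes on one strand."""
    cuts = set()
    for motif, offset in sites.values():
        # A lookahead pattern also finds overlapping occurrences of the motif.
        for match in re.finditer(f"(?={motif})", sequence):
            cuts.add(match.start() + offset)
    return sorted(cuts)


def digest(sequence, sites=SITES):
    """Split the sequence at every cut position and return the fragments."""
    bounds = [0] + cut_positions(sequence, sites) + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


def select_fragments(fragments, min_len, max_len):
    """Keep fragments inside a size window (window values are hypothetical)."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


if __name__ == "__main__":
    toy = "AAAGTACTTTGGCCAAA"  # made-up sequence with one site per enzyme
    print(digest(toy))
```

Counting the fragments that fall in the chosen size window for each candidate enzyme pair is one simple way to compare digestion schemes before any library is built.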
(4) Marker development: the reads obtained are clustered, the distribution of the SLAF tags on each chromosome of industrial hemp is plotted, and SNPs are detected with the GATK software from the alignment of the clean reads to the reference genome. A total of 389,687 SNP loci are detected between the pools, and SNP molecular markers are developed.
(5) Association analysis: before the association analysis, the SNP loci are filtered to remove loci with a read support below 4, loci with multiple genotypes, loci where the recessive-pool genotype does not come from the recessive parent, and loci whose genotypes are identical between the pools. The association analysis uses the SNP-index method, whose main aim is to find significant differences in genotype frequency between the pools, measured by Δ(SNP-index). The more strongly an SNP is associated with the target trait, the closer Δ(SNP-index) is to 1. The calculation formulas are: SNP-index(aa) = Maa / (Maa + Paa); SNP-index(ab) = Mab / (Mab + Pab); Δ(SNP-index) = SNP-index(aa) - SNP-index(ab).
Note: Paa is the depth in the aa pool derived from the male parent, Maa is the depth in the aa pool derived from the female parent, Pab is the depth in the ab pool derived from the male parent, and Mab is the depth in the ab pool derived from the female parent.
False-positive loci are eliminated mainly by using the positions of the markers on the genome: the ΔSNP-index values are fitted with the SNPNUM method, and regions above the association threshold, which is calculated from computer simulation experiments, are selected as trait-related candidate regions. At a confidence level of 0.90, no candidate region was associated. In theory, the target locus and the loci linked to it should approach the threshold, and a high peak should appear near a significantly associated region; in this experiment, however, no region exceeded the theoretical threshold, so no significant mapping result was obtained. To make full use of the data, the threshold was lowered to the 99th percentile of the fitted ΔSNP-index, namely 0.10, to find potential mapping regions, as shown in FIG. 2. In total, 4 candidate regions with a combined length of 8.72 Mb were obtained, containing 397 genes.
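The fallback used in step (5), taking the 99th percentile of the fitted ΔSNP-index as the threshold and collecting the regions above it, can be sketched as follows. The window layout, the linear-interpolation percentile and the merging rule are assumptions made for illustration. The SNPNUM fitting itself is not reproduced here, and the input values are taken as already fitted.

```python
def percentile(values, q):
    """Percentile with linear interpolation between ranks, q in [0, 100]."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values given")
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def candidate_regions(windows, threshold):
    """Merge touching windows whose fitted value reaches the threshold.

    windows: list of (chromosome, start, end, fitted_delta) tuples, sorted
             by chromosome and start (a hypothetical layout).
    Returns a list of (chromosome, start, end) candidate regions.
    """
    regions = []
    for chrom, start, end, value in windows:
        if value < threshold:
            continue
        last = regions[-1] if regions else None
        if last and last[0] == chrom and last[2] >= start:
            # Extend the previous region when the windows touch or overlap.
            regions[-1] = (chrom, last[1], max(last[2], end))
        else:
            regions.append((chrom, start, end))
    return regions


if __name__ == "__main__":
    # Made-up fitted values for five windows on two chromosomes.
    fitted = [
        ("chr1", 0, 100, 0.02), ("chr1", 100, 200, 0.12),
        ("chr1", 200, 300, 0.15), ("chr1", 300, 400, 0.01),
        ("chr2", 0, 100, 0.11),
    ]
    print(candidate_regions(fitted, threshold=0.10))
```

In practice the threshold would be computed with `percentile(all_fitted_values, 99)`. The fixed 0.10 in the usage example simply mirrors the value reported in the text.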
(6) Gene annotation: the coding genes in the candidate regions obtained by the association analysis are annotated in depth against NR, Swiss-Prot, GO, KEGG and COG, and 389 genes are annotated. The pathway distribution map of the genes in the candidate regions is shown in FIG. 3. By comparison with genes of crops such as Arabidopsis thaliana, flax and cotton, the candidate genes LOC115705530, LOC115707511, LOC115703881, LOC115704794, LOC115705010, LOC115705371, LOC115705568, LOC115705688, LOC115705891, LOC115705892 and LOC115706200 are obtained.
(7) Real-time fluorescent quantitative PCR: three varieties with different fiber contents, Hemp I (22.1%), Hemp 10 (27.1%) and Golden Knife-15 (33.1%), are selected and validated at the seedling stage and the technical maturity stage. Sample RNA is extracted following the Tiangen plant RNA extraction kit (DP432) and reverse-transcribed with the FastKing RT Kit (KR116) in a 20 μl system: a buffer mixture is prepared (2 μl FQ-RT Primer Mix, 2 μl 10× King RT Buffer, 1 μl FastKing RT Enzyme Mix, 5 μl RNase-free ddH2O), 1 μg of RNA is added to the 10 μl of buffer mixture, the volume is made up to 20 μl with ddH2O, and the reaction is run at 42 °C for 30 min and 95 °C for 3 min. The cDNA obtained by reverse transcription is diluted 10-fold and used for quantitative PCR. For each candidate gene, 1-2 primer pairs are designed and tested, and those that pass testing are used for quantitative PCR analysis. Quantification uses the reagent Power qPCR PreMix (GeneCopoeia) in a 20 μl SYBR Green system on a 96-well plate (10 μl Mix, 1 μl cDNA, 0.5 μl each of the forward and reverse primers, and 8 μl H2O). The reaction program is shown in the following table.
Standards of the target gene and the reference gene are constructed by the standard-curve method, the standard curves are built, the amplification efficiencies of the target-gene and reference-gene primers are calculated, and the fold relationship between the two is obtained by substitution. The copy number is calculated as: copy number/μl = (ng/μl) × 10^-9 × 6.02 × 10^23 / (bp × 660), where 6.02 × 10^23 is the Avogadro constant and 660 is the average molecular weight of one base pair. The quantitative results showed that the genes LOC115705530, LOC115707511, LOC115704794, LOC115705371 and LOC115705688 are related to fiber content, as shown in FIGS. 4-8.
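The copy number formula above, together with the conventional way of deriving primer amplification efficiency from a standard-curve slope, can be sketched as follows. The efficiency relation E = 10^(-1/slope) - 1 is the common qPCR convention and is an assumption here, because the text does not state which formula was used. All function and constant names are hypothetical.

```python
AVOGADRO = 6.02e23          # molecules per mole, as written in the text
MASS_PER_BASE_PAIR = 660.0  # average g/mol per base pair of double-stranded DNA


def copies_per_microliter(conc_ng_per_ul, length_bp):
    """copy number/ul = (ng/ul) x 1e-9 x 6.02e23 / (bp x 660)."""
    if length_bp <= 0:
        raise ValueError("template length must be positive")
    return conc_ng_per_ul * 1e-9 * AVOGADRO / (length_bp * MASS_PER_BASE_PAIR)


def amplification_efficiency(slope):
    """Efficiency from the slope of Ct against log10(copy number).

    Assumed convention: E = 10 ** (-1 / slope) - 1, so a slope of about
    -3.32 corresponds to 100% efficiency (doubling every cycle).
    """
    if slope >= 0:
        raise ValueError("a standard-curve slope is expected to be negative")
    return 10 ** (-1.0 / slope) - 1.0


if __name__ == "__main__":
    # Hypothetical standard: 10 ng/ul of a 1000 bp template.
    print(f"{copies_per_microliter(10.0, 1000):.3e} copies/ul")
    print(f"{amplification_efficiency(-3.32):.3f}")
```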
Claims (8)
1. A method for rapidly mapping industrial hemp trait-related genes, characterized in that the method comprises the following steps, carried out according to the target trait: variety selection, extreme population construction, specific-locus amplified fragment (SLAF) library construction and high-throughput sequencing, marker development, association analysis, gene annotation, and real-time fluorescent quantitative PCR.
2. The method for rapidly mapping industrial hemp trait-related genes according to claim 1, characterized in that the variety selection is specifically: two varieties that differ markedly in the target trait but perform relatively consistently in the field for other traits are selected as the female and male parents.
3. The method for rapidly mapping industrial hemp trait-related genes according to claim 1, characterized in that the extreme population construction is specifically: the two varieties are selected at a suitable time according to their flowering periods; male plants of the female-parent variety are pulled out, or male flowers are removed at the bud stage, to obtain hybrid seed, which is the F1 generation; at the bud stage, industrial hemp plants of the same sex are selected and tagged, and young leaves are quick-frozen in liquid nitrogen and stored in a −80 ℃ freezer for later use; the industrial hemp is harvested at the technical (process) maturity stage or the seed maturity stage, and the target trait data are measured.
4. The method for rapidly mapping industrial hemp trait-related genes according to claim 1, characterized in that the construction and high-throughput sequencing of the specific-locus amplified fragment (SLAF) library is specifically: the samples are divided into two clearly different groups according to the statistics of the target trait, with at least 30-100 plants selected in each group; DNA is extracted, and the DNA of the individual plants is mixed in equal amounts to construct a high pool and a low pool; in silico digestion prediction is performed on the industrial hemp reference genome with the digestion prediction software SLAF-Predict; the DNA pools are digested with endonuclease; the digested fragments are recovered, an A is added at the 3' end, and Dual-index sequencing adapters are ligated; the DNA fragments at the same position in each sample then undergo PCR amplification, purification, sample pooling and gel cutting; after the SLAF library passes quality inspection, the amplification products are sequenced on a sequencing system; and the sequencing results are assessed with BWA software, using rice as a control, to evaluate whether the digestion scheme is effective.
5. The method for rapidly mapping industrial hemp trait-related genes according to claim 1, characterized in that the marker development is specifically: cluster analysis is performed on the reads obtained, the distribution of SLAF tags on each chromosome of industrial hemp is plotted, and SNPs are detected with GATK software from the alignment of the clean reads to the reference genome.
6. The method for rapidly mapping industrial hemp trait-related genes according to claim 1, characterized in that the association analysis is specifically: before association analysis, the SNP loci are filtered to remove loci with read support below 4, loci with multiple genotypes, loci at which the genotype of the recessive pool does not come from the recessive parent, and loci whose genotypes are identical between the pools; association analysis is then carried out by the SNP-index method to find significant differences in genotype frequency between the pools, measured by Δ(SNP-index); the more strongly an SNP is associated with the target trait, the closer Δ(SNP-index) is to 1; the calculation formulas are: SNP-index(aa) = Maa/(Maa + Paa); SNP-index(ab) = Mab/(Mab + Pab); Δ(SNP-index) = SNP-index(aa) − SNP-index(ab);
note: Paa is the depth in the aa pool derived from the male parent, Maa is the depth in the aa pool derived from the female parent, Pab is the depth in the ab pool derived from the male parent, and Mab is the depth in the ab pool derived from the female parent;
to eliminate false-positive loci, the positions of the markers on the genome are used and the Δ(SNP-index) values are fitted by the SNPNUM method; regions above the association threshold are selected as trait-related candidate regions, the threshold being calculated from the results of computer simulation experiments.
7. The method for rapidly mapping industrial hemp trait-related genes according to claim 1, characterized in that the gene annotation is specifically: the coding genes in the candidate regions obtained by association analysis are annotated in depth against NR, Swiss-Prot, GO, KEGG and COG, and candidate genes are rapidly screened according to the annotation results.
8. The method for rapidly mapping industrial hemp trait-related genes according to claim 1, characterized in that the real-time fluorescent quantitative PCR is specifically: 2-3 varieties differing in the target trait are selected, sample RNA is extracted and reverse-transcribed, 1-2 primers are designed for each candidate gene and tested, and the primers that pass testing are used for relative quantitative PCR analysis.
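The Δ(SNP-index) statistic defined in claim 6 can be illustrated with a few lines of code. This is a minimal sketch under stated assumptions, not the implementation used in the patent: the class and function names are hypothetical, the example depths are invented for illustration, and the depth filter assumes that "read support below 4" refers to the total depth in each pool.

```python
from dataclasses import dataclass


@dataclass
class SnpDepths:
    """Read depths at one SNP locus, named after the symbols in claim 6."""
    maa: int  # depth in the aa pool derived from the female parent
    paa: int  # depth in the aa pool derived from the male parent
    mab: int  # depth in the ab pool derived from the female parent
    pab: int  # depth in the ab pool derived from the male parent


def snp_index(maternal: int, paternal: int) -> float:
    """SNP-index of one pool: maternal depth over total depth."""
    total = maternal + paternal
    if total == 0:
        raise ValueError("locus has no read support in this pool")
    return maternal / total


def delta_snp_index(d: SnpDepths) -> float:
    """Delta(SNP-index) = SNP-index(aa) - SNP-index(ab)."""
    return snp_index(d.maa, d.paa) - snp_index(d.mab, d.pab)


def passes_depth_filter(d: SnpDepths, min_depth: int = 4) -> bool:
    """Assumed reading of the filter: each pool needs at least min_depth reads."""
    return (d.maa + d.paa) >= min_depth and (d.mab + d.pab) >= min_depth


if __name__ == "__main__":
    # Hypothetical locus: the aa pool is dominated by the maternal allele.
    locus = SnpDepths(maa=18, paa=2, mab=3, pab=17)
    if passes_depth_filter(locus):
        print(f"delta SNP-index = {delta_snp_index(locus):.2f}")
```

A value close to 1 means the two pools carry almost opposite parental alleles at the locus, which is the signal the claim treats as strong association with the target trait.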
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011463353.7A CN112575104A (en) | 2020-12-11 | 2020-12-11 | Method for quickly positioning industrial hemp character related gene |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011463353.7A CN112575104A (en) | 2020-12-11 | 2020-12-11 | Method for quickly positioning industrial hemp character related gene |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112575104A true CN112575104A (en) | 2021-03-30 |
Family
ID=75131917
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011463353.7A Pending CN112575104A (en) | 2020-12-11 | 2020-12-11 | Method for quickly positioning industrial hemp character related gene |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112575104A (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103088120A (en) * | 2012-11-29 | 2013-05-08 | 北京百迈客生物科技有限公司 | Large-scale genetic typing method based on SLAF-seq (Specific-Locus Amplified Fragment Sequencing) technology |
WO2013086964A1 (en) * | 2011-12-15 | 2013-06-20 | 深圳华大基因科技有限公司 | Method for enrichment, library construction and snp analysis of gene regions in complex genome of higher plant |
CN107034302A (en) * | 2017-06-07 | 2017-08-11 | 湖南农业大学 | A method for relationship identification using SNP markers of Miscanthus plants developed with SLAF-seq technology |
CN109360606A (en) * | 2018-11-19 | 2019-02-19 | 广西壮族自治区农业科学院水稻研究所 | A kind of method of low-density SNP genome area Accurate Prediction BSA-seq candidate gene |
CN109880931A (en) * | 2019-04-11 | 2019-06-14 | 江苏省农业科学院 | A kind of SLAF-SNP molecule labelling method of sponge gourd anti cucumber mosaic virus CMV main effect QTL and application |
CN109913532A (en) * | 2019-04-11 | 2019-06-21 | 江苏省农业科学院 | A method of obtaining sponge gourd anti cucumber mosaic virus disease candidate gene |
WO2019210696A1 (en) * | 2018-05-03 | 2019-11-07 | 北京林业大学 | Prunus mume pendulous trait snp molecular markers and use thereof |
Non-Patent Citations (3)
Title |
---|
JT PAGE ET AL: "Insights into the evolution of cotton diploids and polyploids from whole-genome re-sequencing", G3, vol. 3, no. 10, 1 October 2013 (2013-10-01) * |
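Liu Yuan; Kong Youbin; Li Xihuan; Zhang Caiying: "Mining of soybean acid phosphatase candidate genes and marker development based on SLAF-BSA technology", Journal of Plant Genetic Resources (植物遗传资源学报), no. 01, 30 May 2019 (2019-05-30) * |
Chen Yong; Liu Yisong; Zeng Jianguo: "Research progress in plant genome sequencing", Life Science Research (生命科学研究), no. 01, 28 February 2014 (2014-02-28) * |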
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||