CN113718342A - Construction method of high-density genetic map of recombinant inbred line population - Google Patents

Publication number
CN113718342A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110490790.6A
Other languages
Chinese (zh)
Inventor
曾威
黎珉
石英尧
徐大伟
刘二宝
胡群文
包亚玲
许明慧
王宝琛
黄钰姮
王茜彧
黄世纪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui Agricultural University AHAU
Original Assignee
Anhui Agricultural University AHAU

Classifications

    • C: CHEMISTRY; METALLURGY
    • C40: COMBINATORIAL TECHNOLOGY
    • C40B: COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00: Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/06: Biochemical methods, e.g. using enzymes or whole viable microorganisms
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6844: Nucleic acid amplification reactions
    • C12Q1/6858: Allele-specific amplification
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Biochemistry (AREA)
  • Zoology (AREA)
  • Engineering & Computer Science (AREA)
  • Wood Science & Technology (AREA)
  • Molecular Biology (AREA)
  • Microbiology (AREA)
  • Immunology (AREA)
  • Analytical Chemistry (AREA)
  • Biotechnology (AREA)
  • Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • General Chemical & Material Sciences (AREA)
  • Medicinal Chemistry (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention discloses a method for constructing a high-density genetic map of a recombinant inbred line population, comprising the following steps: SLAF library construction and high-throughput sequencing, in three parts: first, a pre-designed SLAF scheme is selected using training data; the restriction digestion design must be determined according to marker-efficiency criteria, including random distribution across the whole genome, uniqueness in the genome, and consistent amplification efficiency among the selected markers; a large-scale throughput test is carried out on the basis of the pre-designed scheme; second, the SLAF-seq library is constructed; third, high-throughput sequencing and genotyping are performed; sequence data grouping and genotyping; and construction of the genetic linkage map. The molecular markers developed by the invention are third-generation molecular markers; third-generation markers, represented by SNP markers, capture the most genetic variation and show the richest polymorphism of all marker types.

Description

Construction method of high-density genetic map of recombinant inbred line population
Technical Field
The invention relates to a corn breeding method, in particular to a method for constructing a high-density genetic map of a recombinant inbred line population.
Background
Corn (Zea mays L.) is one of the most important food crops and has the highest production of any crop in the world. Together with rice and wheat, it provides at least 30% of the food calories for more than 4.5 billion people in 94 developing countries. In parts of Africa and Central America, corn alone accounts for more than 20% of food calories. Maize was domesticated from the wild grass teosinte in Central America and was grown throughout the continent before the arrival of Columbus. Besides its economic value, corn is an important model organism for research in plant genetics, physiology and development. Dissecting the genetic mechanisms of important agronomic traits in corn and mining important functional genes are therefore of great significance for the genetic improvement of key traits. A high-density genetic map provides the necessary conditions for accurately locating target genes and studying their function.
A genetic map, also known as a genetic linkage map, is a representation of a genome that shows the relative positions of, and distances between, molecular markers or genes along a chromosome. It does not show the physical distance between these markers or genes but the genetic distance, which is defined as a function of the crossover frequency during meiosis. The closer two genes are on a chromosome, the smaller the chance of a crossover between them. Early genetic maps were constructed mainly with molecular markers such as restriction fragment length polymorphism (RFLP), simple sequence repeat (SSR) and random amplified polymorphic DNA (RAPD) markers. Such maps have large intervals between markers and are limited by the properties and number of the markers; their density and saturation are low and their markers are not tightly linked to the target genes being mapped, which seriously hinders the use of genetic maps in gene mapping and genome assembly. Third-generation molecular markers, represented by single nucleotide polymorphism (SNP) markers, capture the most genetic variation and show the richest polymorphism of all marker types. SNP polymorphisms involve variation at a single base only, which can arise from single-base transitions or transversions, or from single-base insertions or deletions. SNP markers are numerous, highly polymorphic and stably inherited, and are ideal markers for constructing high-density genetic maps. SLAF-seq is a large-scale SNP marker development and genotyping technology based on high-throughput sequencing and has been applied successfully to studies of genetic variation in many animals and plants.
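The link between crossover frequency and genetic distance can be illustrated with a minimal Python sketch. It assumes genotypes coded `a`, `b` or `-` (missing) for each line, and it uses the standard Haldane-Waddington relation for recombinant inbred lines derived by selfing, R = 2r / (1 + 2r), to convert the observed recombinant fraction R into the per-meiosis recombination fraction r. The function names and the genotype coding are illustrative assumptions, not part of the patent.

```python
def observed_recombinant_fraction(marker_a, marker_b):
    """Share of lines whose parental allele differs between two markers.

    Genotypes are coded 'a' (parent 1), 'b' (parent 2) or '-' (missing);
    lines with a missing call at either marker are skipped.
    """
    valid = {"a", "b"}
    informative = [(x, y) for x, y in zip(marker_a, marker_b)
                   if x in valid and y in valid]
    if not informative:
        raise ValueError("no line is genotyped at both markers")
    recombinants = sum(x != y for x, y in informative)
    return recombinants / len(informative)


def per_meiosis_fraction(big_r):
    """Invert R = 2r / (1 + 2r), the relation for RILs obtained by selfing."""
    big_r = min(max(big_r, 0.0), 0.5)  # clamp sampling noise into [0, 0.5]
    return big_r / (2.0 * (1.0 - big_r))


# Ten hypothetical RILs scored at two neighbouring markers.
m1 = "aaaabbbbab"
m2 = "aaabbbbbab"
big_r = observed_recombinant_fraction(m1, m2)
small_r = per_meiosis_fraction(big_r)
```

Closely linked markers give a small recombinant fraction, which is why tightly spaced markers end up close together on the map.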
This project uses SLAF-seq to sequence a corn recombinant inbred line population: polymorphic SNP markers are first identified, the markers are then given genotype codes, and a high-density genetic map is finally constructed, laying a sound foundation for further mining and study of genes for important agronomic traits.
Disclosure of Invention
The invention aims to provide a method for constructing a high-density genetic map of a recombinant inbred line population, so as to solve the problems in the background technology.
In order to achieve the purpose, the invention provides the following technical scheme:
a method for constructing a high-density genetic map of a recombinant inbred line population comprises the following steps:
SLAF-seq (specific-locus amplified fragment sequencing) library construction and high-throughput sequencing, in three parts: first, a pre-designed SLAF scheme is selected using training data; the restriction digestion design must be determined according to marker-efficiency criteria, including random distribution across the whole genome, uniqueness in the genome, and consistent amplification efficiency among the selected markers; a large-scale throughput test is carried out on the basis of the pre-designed scheme; second, the SLAF-seq library is constructed; the genomic DNA of each individual is digested with the designed set of enzymes; dual barcodes are added in two rounds of PCR to distinguish the individuals, which makes it easy to choose the size of the sample pool and keeps fragment sizes consistent among individuals; third, high-throughput sequencing and genotyping are performed; the pooled reduced representation libraries (RRLs) are deeply sequenced using an Illumina paired-end sequencing protocol, and genotypes are defined and validated by software;
sequence data grouping and genotyping: low-quality reads are filtered out, and the raw reads are assigned to individual samples according to the dual-barcode sequences; after the barcodes and the terminal 5 bp are trimmed from each high-quality read, the clean reads of each sample are mapped to the corn genome sequence using SOAP software; sequences mapped to the same position are defined as one SLAF locus; single nucleotide polymorphism (SNP) sites of each SLAF locus are then detected between the parents, and SLAFs with more than 3 SNPs are filtered out; the alleles of each SLAF locus are then defined from reads at a sequencing depth of >21.22-fold in the parents and >8.05-fold in each offspring; for a diploid species, one SLAF locus can contain at most 4 alleles, so SLAF loci with more than 4 alleles are defined as repetitive SLAFs and discarded; only SLAFs with 2 to 4 alleles are identified as polymorphic and considered potential markers; all polymorphic SLAF loci are genotyped according to the consistency of the parental and offspring SNP loci; the marker codes of the polymorphic SLAFs are analyzed according to the segregation type of the RIL population (aa × bb); a Bayesian method is then used for genotyping to further ensure genotype quality;
construction of the genetic linkage map: the marker loci are first divided into linkage groups (LGs) according to their positions on the B73_RefGen_V4 genome; next, modified LOD (MLOD) scores between markers are calculated to further verify the robustness of the markers within each LG; markers with MLOD scores <0.05 are filtered out before ordering; to ensure a high-density, high-quality genetic map, the newly developed HighMap strategy is used to order the SLAF markers and to correct genotyping errors within the LGs; first, recombination frequencies and LOD scores are calculated by two-point analysis and used to infer linkage phases; then enhanced Gibbs sampling, spatial sampling and simulated annealing algorithms are combined in an iterative marker-ordering process; briefly, in the first stage of the ordering process, SLAF markers are selected by spatial sampling: one marker is drawn at random in test-cross priority order, and markers whose recombination frequency with it is smaller than a given sampling radius are excluded from the marker set; the optimal map order is then searched with the simulated annealing algorithm; annealing continues until newly generated map orders are rejected in a series of successive steps; after the optimal map order of the sampled markers is obtained, blocked Gibbs sampling is used to estimate the multipoint recombination frequencies of the parents; the updated recombination frequencies are used to integrate the maps of the two parents and to optimize the map order in the next simulated annealing cycle; once a stable map order is obtained after 3-4 cycles, the next round of map construction begins: a subset of the currently unmapped markers is selected and added to the previous sample with a reduced sampling radius; the mapping algorithm is repeated until all markers have been properly mapped.
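The simulated annealing search for a marker order can be sketched in a few lines of Python. This is not the HighMap implementation: it assumes a precomputed matrix `rf` of pairwise recombination fractions, uses the sum of adjacent recombination fractions (SARF) as an illustrative objective, proposes new orders by reversing a random segment, and stops after a run of consecutive rejected proposals. All function names, parameters and default values are hypothetical.

```python
import math
import random


def sarf(order, rf):
    """Sum of adjacent recombination fractions along a marker order."""
    return sum(rf[a][b] for a, b in zip(order, order[1:]))


def anneal_order(rf, start, seed=0, t0=0.5, cooling=0.995,
                 max_steps=20000, max_rejects=500):
    """Search for a low-SARF order of at least two markers."""
    rng = random.Random(seed)
    order = list(start)
    cost = sarf(order, rf)
    best, best_cost = order[:], cost
    temperature, rejects = t0, 0
    for _ in range(max_steps):
        if rejects >= max_rejects:  # a long run of rejections ends the search
            break
        i, j = sorted(rng.sample(range(len(order)), 2))
        candidate = order[:i] + order[i:j + 1][::-1] + order[j + 1:]
        cand_cost = sarf(candidate, rf)
        delta = cand_cost - cost
        # Accept improvements always, worse orders with Metropolis probability.
        if delta < 0 or (delta > 0 and
                         rng.random() < math.exp(-delta / temperature)):
            order, cost, rejects = candidate, cand_cost, 0
            if cost < best_cost:
                best, best_cost = order[:], cost
        else:
            rejects += 1
        temperature = max(temperature * cooling, 1e-9)
    return best, best_cost


# Hypothetical markers at known positions, presented in scrambled order.
positions = [0, 10, 25, 40, 60]
rf = [[abs(p - q) / 100 for q in positions] for p in positions]
best, best_cost = anneal_order(rf, start=[3, 0, 4, 1, 2])
```

Because the best order seen so far is tracked separately, the returned order is never worse than the starting order under the chosen objective. A production tool would also handle genotyping errors and missing data, which this sketch ignores.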
As a still further scheme of the invention: the specific method for deep sequencing of the pooled RRLs using the Illumina paired-end sequencing protocol is as follows: the corn B73_RefGen_V4 reference genome is used to simulate the number of markers produced by different enzymes, and a marker-discovery experiment is designed; a SLAF pilot experiment is then performed, and the SLAF library is constructed according to the pre-designed scheme; for the RIL population of 264 lines, two enzymes, HaeIII and Hpy166II, are used; the genomic DNA is digested at 37 °C; a single-nucleotide (A) overhang is then added to the digested fragments using Klenow Fragment (3′→5′ exo−) (NEB) and dATP; PAGE-purified dual-tag sequencing adapters are then ligated to the A-tailed fragments using T4 DNA ligase; polymerase chain reaction (PCR) is performed using the diluted restriction-ligation DNA sample, dNTPs, a high-fidelity DNA polymerase and PCR primers; the PCR products are then purified with Agencourt AMPure XP beads and pooled; the pooled sample is separated by 2% agarose gel electrophoresis; fragments of 414 to 464 base pairs (including indices and adapters) are excised and purified using the QIAquick gel extraction kit (Qiagen, Hilden, Germany); the gel-purified product is diluted; and paired-end sequencing is performed on an Illumina platform.
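The idea of simulating how many fragments a pair of enzymes produces from a reference sequence can be sketched as follows. The recognition sites used here (HaeIII GG^CC and Hpy166II GTN^NAC, both blunt cutters) are taken from commonly published enzyme catalogues and should be treated as assumptions; the function names and the size window in the example are hypothetical. Note that the 414 to 464 bp range in the text includes indices and adapters, so the matching insert-size window would be smaller.

```python
import re

# name -> (recognition pattern, cut offset from the start of the site)
SITES = {
    "HaeIII": ("GGCC", 2),                # GG^CC
    "Hpy166II": ("GT[ACGT][ACGT]AC", 3),  # GTN^NAC
}


def cut_positions(seq, enzymes=SITES):
    """Return sorted cut coordinates for all enzymes on an uppercase sequence."""
    cuts = set()
    for pattern, offset in enzymes.values():
        # A lookahead is used so that overlapping sites are all reported.
        for match in re.finditer(f"(?={pattern})", seq):
            cuts.add(match.start() + offset)
    return sorted(cuts)


def digest(seq, enzymes=SITES):
    """Split the sequence at every cut position."""
    bounds = [0] + cut_positions(seq, enzymes) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


def select_fragments(fragments, min_len, max_len):
    """Keep fragments inside a size window, as in gel size selection."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


toy = "AAAAGGCCTTTTGTACACGGGG"
fragments = digest(toy)
selected = select_fragments(fragments, 7, 9)
```

Running the same functions over each chromosome of a reference genome and counting the selected fragments gives a rough estimate of how many SLAF tags a digestion scheme would produce.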
As a still further scheme of the invention: the SMOOTH error-correction strategy is applied according to the parental contribution of genotypes, and missing genotypes are imputed with a k-nearest-neighbor algorithm; skewed (segregation-distorted) markers are then added to the map by a multipoint maximum-likelihood method; map distances are estimated using the Kosambi mapping function.
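For reference, the Kosambi mapping function converts a recombination fraction r into a map distance d (in Morgans) as d = 0.25 · ln((1 + 2r) / (1 − 2r)), with inverse r = 0.5 · tanh(2d). A minimal Python sketch, with hypothetical function names, is:

```python
import math


def kosambi_cm(r):
    """Map distance in centimorgans for a recombination fraction 0 <= r < 0.5."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1 + 2 * r) / (1 - 2 * r))


def kosambi_r(d_cm):
    """Recombination fraction for a map distance given in centimorgans."""
    return 0.5 * math.tanh(2 * d_cm / 100.0)


distance = kosambi_cm(0.1)  # distance in cM for 10% recombination
```

For small r the distance in centimorgans is close to 100·r; the correction grows as r approaches 0.5 because the function allows for crossover interference.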
As a still further scheme of the invention: the Bayesian genotyping method comprises the following steps: first, the posterior conditional probability is calculated from the coverage of each allele and the number of single nucleotide polymorphisms; then a genotyping quality score translated from this probability is used to select qualified markers for subsequent analysis; low-quality genotypes are counted for each marker and each individual, and the worst markers or individuals are deleted in a dynamic process; the process stops when the mean genotyping quality score of all SLAF markers reaches a critical value.
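A simplified version of the posterior calculation can be sketched in Python. The model below is an assumption made for illustration, not the patented procedure: it uses only the read coverage of the two parental alleles, a fixed per-read error rate, a binomial likelihood, and a small prior for residual heterozygosity in advanced RILs. The posterior is then translated into a Phred-like quality score. The error rate, the prior and the cap of 99 are hypothetical values.

```python
import math


def genotype_posteriors(n_a, n_b, error=0.01, het_prior=0.008):
    """Posterior probability of aa, bb and ab given allele read counts."""
    priors = {"aa": (1 - het_prior) / 2, "bb": (1 - het_prior) / 2,
              "ab": het_prior}
    # Probability that a single read shows allele a under each genotype.
    p_read_a = {"aa": 1 - error, "bb": error, "ab": 0.5}
    log_post = {
        g: math.log(priors[g])
        + n_a * math.log(p_read_a[g])
        + n_b * math.log(1 - p_read_a[g])
        for g in priors
    }
    top = max(log_post.values())  # subtract the maximum for numerical stability
    weights = {g: math.exp(v - top) for g, v in log_post.items()}
    total = sum(weights.values())
    return {g: w / total for g, w in weights.items()}


def call_genotype(n_a, n_b, **kwargs):
    """Return the most probable genotype and a Phred-like quality score."""
    post = genotype_posteriors(n_a, n_b, **kwargs)
    best = max(post, key=post.get)
    p_wrong = 1.0 - post[best]
    quality = 99.0 if p_wrong <= 0 else min(99.0, -10 * math.log10(p_wrong))
    return best, quality


genotype, quality = call_genotype(8, 0)
```

Markers or individuals whose calls have low quality scores could then be dropped iteratively until the mean score passes a chosen threshold, which is the dynamic filtering the paragraph describes.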
As a still further scheme of the invention: high-quality SLAF markers for the genetic map are screened by the following criteria: first, the average sequencing depth should be >8.05-fold in each progeny and >21.22-fold in the parents; second, markers with more than 50% missing data are filtered out; third, segregation distortion is checked with a chi-square test; during map construction, markers with significant segregation distortion (P < 0.05) are initially excluded and later added back as auxiliary markers.
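The missing-data and chi-square checks can be sketched as follows, assuming an aa × bb RIL marker that is expected to segregate 1:1 and genotypes coded `a`, `b` or `-`. With one degree of freedom the chi-square P value equals erfc(sqrt(χ²/2)), so no external library is needed. The function names and the three return labels are hypothetical.

```python
import math


def segregation_chi2(n_a, n_b):
    """Chi-square statistic and P value for a 1:1 expectation (1 df)."""
    n = n_a + n_b
    if n == 0:
        raise ValueError("no genotyped individuals")
    expected = n / 2
    chi2 = ((n_a - expected) ** 2 + (n_b - expected) ** 2) / expected
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def classify_marker(genotypes, max_missing=0.5, alpha=0.05):
    """Return 'drop_missing', 'distorted' or 'ok' for one marker."""
    missing = sum(g == "-" for g in genotypes) / len(genotypes)
    if missing > max_missing:
        return "drop_missing"
    n_a = sum(g == "a" for g in genotypes)
    n_b = sum(g == "b" for g in genotypes)
    _, p_value = segregation_chi2(n_a, n_b)
    return "distorted" if p_value < alpha else "ok"


label = classify_marker("a" * 70 + "b" * 30)
```

Markers labelled `distorted` would be held back during the first ordering pass and added later as auxiliary markers, in line with the criteria above.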
As a still further scheme of the invention: the PCR primers comprise a forward primer, 5'-AATGATACGGCGACCACCGA-3', and a reverse primer, 5'-CAAGCAGAAGACGGCATACG-3', both PAGE-purified.
Compared with the prior art, the invention has the beneficial effects that:
the molecular markers developed by the invention are third-generation molecular markers; third-generation markers, represented by single nucleotide polymorphism (SNP) markers, capture the most genetic variation and show the richest polymorphism of all marker types.
Drawings
FIG. 1 is a schematic diagram of the construction of the maize recombinant inbred line population used in the method.
FIG. 2 shows the DNA extraction results for some of the RIL lines used in the method.
FIG. 3 is a schematic diagram of the high-density genetic map of the maize recombinant inbred line population obtained by the method.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to FIGS. 1-3, in an embodiment of the present invention, a method for constructing a high-density genetic map of a recombinant inbred line population comprises the following steps:
(1) Constructing the maize recombinant inbred line population: the model maize inbred line B73 was used as the female parent and L403, a landrace specific to Menghai County, Yunnan Province, as the male parent; the two were crossed to produce the F1 generation, and a recombinant inbred line (RIL) population of 264 lines was obtained after 8 generations of single-seed descent (FIG. 1). This is a permanent population that can be used for gene mining repeatedly over many years.
(2) Extracting DNA from corn leaves: when the plants reached the small trumpet (bell-mouth) stage, leaves were sampled, tagged and numbered, and genomic DNA was extracted (FIG. 2) as follows:
1) place 0.1-0.15 g of fresh, tender corn leaf tissue and two 3 mm steel balls in a 2.0 mL centrifuge tube;
2) add 900 µL of CTAB extraction buffer preheated to 65 °C and 10 µL or 2 µL of β-mercaptoethanol, and mix by gentle shaking;
3) incubate in a 65 °C water bath for 30 minutes, inverting the tube several times every 5 minutes so that the extraction buffer fully lyses the cells;
4) cool to room temperature, add an equal volume of phenol/chloroform/isoamyl alcohol (25:24:1), and mix by gentle inversion for 10 minutes to fully denature the proteins;
5) centrifuge at 12,000 rpm for 10 min at room temperature, then transfer the supernatant to a new 2 mL centrifuge tube using a wide-bore 1 mL pipette tip (end cut off with scissors);
6) add RNase A (10 mg/mL) to a final concentration of about 15 µg/mL (about 2 µL), let stand for several minutes, add an equal volume of chloroform/isoamyl alcohol (24:1), and mix by gentle inversion for 10 minutes so that the RNA is completely digested, residual protein and RNase A are denatured, and residual phenol is removed;
7) centrifuge at 12,000 rpm for 10 min at room temperature, then transfer the supernatant to a new 1.5 mL centrifuge tube using a wide-bore 1 mL pipette tip (end cut off with scissors);
8) add 2 volumes of absolute ethanol or 0.7 volume of isopropanol, pre-cooled to -20 °C, and mix until a flocculent precipitate forms and the solution becomes clear (precipitation complete);
9) centrifuge at 12,000 rpm for 10 min at room temperature or 4 °C and discard the supernatant;
10) add 70% ethanol, rinse for more than 15 min, and centrifuge at 12,000 rpm for 10 min;
11) repeat the previous step;
12) air-dry, add 100 µL of TE1.0, and dissolve overnight at room temperature;
13) check the band pattern by agarose gel electrophoresis (a clear, single band with no smearing, no RNA band and no overly bright loading well) and measure the OD value with a spectrophotometer (an OD value between 1.7 and 2.1 is suitable);
14) store at -20 °C.
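The reagent volumes in steps 6) and 8) follow from simple proportional arithmetic, sketched below in Python. The C1·V1 = C2·V2 calculation ignores the small volume added by the stock itself, and the sample volumes in the example are hypothetical, since the protocol does not state the supernatant volume.

```python
def stock_volume_ul(stock_conc, final_conc, final_volume_ul):
    """Volume of stock needed, from C1 * V1 = C2 * V2 (same concentration units)."""
    return final_conc * final_volume_ul / stock_conc


def precipitant_volume_ul(sample_volume_ul, reagent="ethanol"):
    """Alcohol volume for precipitation: 2 volumes ethanol or 0.7 volume isopropanol."""
    ratios = {"ethanol": 2.0, "isopropanol": 0.7}
    return sample_volume_ul * ratios[reagent]


# RNase A stock at 10 mg/mL (10,000 ug/mL) diluted to 15 ug/mL
# in a hypothetical 1,000 uL of supernatant.
rnase_ul = stock_volume_ul(10_000, 15, 1_000)
# Isopropanol needed for a hypothetical 700 uL of recovered supernatant.
isopropanol_ul = precipitant_volume_ul(700, "isopropanol")
```

With these assumed volumes the RNase A addition comes out between 1 and 2 µL, which is consistent with the "about 2 µL" given in step 6).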
(3) SLAF-seq library construction and sequencing: based on information such as the size and GC content of the corn genome, the corn genome was selected as the reference genome for in silico restriction digestion prediction (reference genome download address: ftp://ftp.ensemblgenomes.org/pub/plants/release-33/fasta/zea_mays/dna/). The optimal digestion scheme was selected, and the genomic DNA of each sample that passed quality control was digested. An A was added to the 3' ends of the resulting restriction fragments (SLAF tags), dual-index sequencing adapters were ligated, and the fragments were PCR-amplified, purified, pooled and size-selected by gel excision; after the library passed quality control, paired-end 100 bp (PE100) sequencing was performed on an Illumina HiSeq.
(4) Developing molecular markers: the sequencing data of each sample were summarized, including the number of reads, the number of bases, Q30 and GC content; the reads were aligned to the reference genome with BWA, and read pairs whose two ends aligned to the same position were considered to come from the same SLAF tag. A total of 589,770 SLAF tags were developed, with an average sequencing depth of 21.22X in the parents and 8.05X in the progeny (Table 1).
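Grouping aligned read pairs into SLAF tags and computing per-sample depth can be sketched as follows. The input format, a list of `(sample, chromosome, start, end)` tuples for properly aligned read pairs, is an assumption made for illustration; in practice these coordinates would be parsed from the aligner output. The sample names and coordinates are hypothetical.

```python
from collections import defaultdict


def group_reads_into_tags(alignments):
    """Count read pairs per sample for every (chrom, start, end) tag."""
    tags = defaultdict(lambda: defaultdict(int))
    for sample, chrom, start, end in alignments:
        tags[(chrom, start, end)][sample] += 1
    return tags


def average_depth(tags, sample):
    """Mean read-pair count over the tags at which the sample has reads."""
    depths = [counts[sample] for counts in tags.values() if sample in counts]
    return sum(depths) / len(depths) if depths else 0.0


alignments = (
    [("B73", "1", 100, 300)] * 3
    + [("B73", "1", 500, 700)]
    + [("RIL001", "1", 100, 300)] * 2
)
tags = group_reads_into_tags(alignments)
parent_depth = average_depth(tags, "B73")
```

Averaging these per-sample depths over the parents and over the progeny gives summary figures of the kind reported in the paragraph above.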
TABLE 1 Sequencing data and number of molecular markers for each sample (table image not available in this text)
(5) Constructing the high-density genetic map: polymorphism analysis was performed on the basis of allele number and differences between gene sequences, giving three types of SLAF tags: polymorphic, non-polymorphic and repetitive. In total, 589,770 SLAF tags were developed across all samples, of which 108,709 were polymorphic, a polymorphism rate of 18.43%. To ensure the quality of the genetic map, the polymorphic SLAF tags were filtered according to the following rules:
1) Parental depth: tags with a parental sequencing depth below 6X were removed. Because progeny are genotyped against the parents, a high parental sequencing depth ensures accurate progeny genotyping.
2) SNP number: tags with more than 5 SNPs were removed. Because the sequenced length of a SLAF tag is 200 bp, a tag with too many SNPs is regarded as lying in a region of high-frequency sequencing variation.
3) Completeness: markers were retained only if a genotype was called in at least 50% of all progeny (this threshold is adjusted as appropriate to the actual amount of marker data); that is, for a single polymorphic marker locus, at least 50 of every 100 progeny must have a defined genotype.
4) Segregation distortion: segregation-distorted markers are common; they generally do not affect map construction but may affect QTL mapping. Following the approach used in most of the literature, polymorphic markers with severe segregation distortion (chi-square test, P < 0.001) were removed.
After filtering, 10,724 SLAF tags were available for mapping, and 18,309 SNP markers were placed on the map. The 10,724 SLAF tags were assigned to 10 linkage groups by alignment to the reference genome, the MLOD values between every pair of markers were calculated, and tags whose MLOD values with the other SLAF tags were below 5 were filtered out, leaving a total of 10,114 tags on the map.
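The tag classification and the four filtering rules can be expressed as a short Python sketch. The 2-to-4-allele rule for polymorphic tags follows the method description; the `TagStats` record and its field names are hypothetical, and the thresholds are the ones quoted in the text (parental depth of at least 6X, at most 5 SNPs, at least 50% genotyped progeny, chi-square P of at least 0.001).

```python
from dataclasses import dataclass


def classify_slaf(n_alleles):
    """Label a SLAF tag by the number of distinct alleles observed at it."""
    if n_alleles < 1:
        raise ValueError("a tag needs at least one allele")
    if n_alleles == 1:
        return "non-polymorphic"
    return "polymorphic" if n_alleles <= 4 else "repetitive"


@dataclass
class TagStats:
    min_parent_depth: float    # lower of the two parental depths
    n_snps: int                # SNPs found within the tag
    genotyped_fraction: float  # share of progeny with a called genotype
    distortion_p: float        # chi-square P value for segregation


def passes_filters(tag, min_depth=6, max_snps=5, min_complete=0.5, min_p=0.001):
    """Apply the four filtering rules to one polymorphic tag."""
    return (tag.min_parent_depth >= min_depth
            and tag.n_snps <= max_snps
            and tag.genotyped_fraction >= min_complete
            and tag.distortion_p >= min_p)


def polymorphism_rate(n_polymorphic, n_total):
    """Percentage of tags that are polymorphic."""
    return 100.0 * n_polymorphic / n_total


rate = polymorphism_rate(108_709, 589_770)
```

Applied to the counts reported above, `polymorphism_rate` reproduces the stated 18.43% once rounded to two decimals.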
Each chromosome corresponds to one linkage group. Taking each linkage group as a unit, the linear order of the markers within it was obtained with HighMap software and the genetic distances between adjacent markers were estimated, finally giving a genetic map with a total map distance of 1,657.6 cM.
TABLE 2 SNP marker information of the constructed high-density genetic map (table image not available in this text)
The working principle of the invention is as follows:
Genetic maps have been developed for many plants and animals. Such maps have proved useful in a variety of applications, including genetic mapping of important agronomic traits or quantitative trait loci (QTLs), map-based cloning, marker-assisted selection breeding, and genome assembly analysis [1]. The first requirement for constructing a genetic linkage map is a segregating population, and proper selection of the parents is a prerequisite for effective segregation. Parents are generally selected according to three principles. First, the mapping parents should differ widely genetically. Second, the parents should be pure; the higher their purity, the more accurate the map. Third, the hybrid progeny should be fertile: parents that are too distantly related may give sterile progeny, or recombination may be suppressed because chromosome pairing is impaired, which reduces mapping accuracy. Following these principles, two parents with very different genetic backgrounds, B73 and L403, were selected in this experiment and used to construct a permanent mapping population, a highly homozygous F2:8 recombinant inbred line population of 264 lines.
Molecular markers are DNA sequences that can be located and identified in the genome. The base composition at a specific position in the genome may differ from plant to plant because of genetic changes (mutations, insertions, deletions). These differences, collectively called polymorphisms, can be located and identified. Plant breeders would ideally use the gene of interest itself as the molecular marker, although this is not always possible. The alternative is to use markers that are closely linked to the gene and inherited together with it. The molecular markers developed in this experiment are third-generation molecular markers; third-generation markers, represented by single nucleotide polymorphism (SNP) markers, capture the most genetic variation and show the richest polymorphism of all marker types.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.
Furthermore, it should be understood that although this description is organized by embodiments, not every embodiment contains only a single independent technical solution; the description is written this way only for clarity. Those skilled in the art should take the description as a whole, and the technical solutions of the embodiments may be combined as appropriate to form other embodiments that those skilled in the art can understand.

Claims (6)

1. A method for constructing a high-density genetic map of a recombinant inbred line population is characterized by comprising the following steps:
SLAF library construction and high-throughput sequencing, in three parts: first, a pre-designed SLAF scheme is selected using training data; the restriction digestion design must be determined according to marker-efficiency criteria, including random distribution across the whole genome, uniqueness in the genome, and consistent amplification efficiency among the selected markers; a large-scale throughput test is carried out on the basis of the pre-designed scheme; second, the SLAF-seq library is constructed; the genomic DNA of each individual is digested with the designed set of enzymes; dual barcodes are added in two rounds of PCR to distinguish the individuals, which makes it easy to choose the size of the sample pool and keeps fragment sizes consistent among individuals; third, high-throughput sequencing and genotyping are performed; the pooled reduced representation libraries (RRLs) are deeply sequenced using an Illumina paired-end sequencing protocol, and genotypes are defined and validated by software;
sequence data grouping and genotyping: low-quality reads are filtered out, and the raw reads are assigned to individual samples according to the dual-barcode sequences; after the barcodes and the terminal 5 bp are trimmed from each high-quality read, the clean reads of each sample are mapped to the corn genome sequence using SOAP software; sequences mapped to the same position are defined as one SLAF locus; single nucleotide polymorphism (SNP) sites of each SLAF locus are then detected between the parents, and SLAFs with more than 3 SNPs are filtered out; the alleles of each SLAF locus are then defined from reads at a sequencing depth of >21.22-fold in the parents and >8.05-fold in each offspring; for a diploid species, one SLAF locus can contain at most 4 alleles, so SLAF loci with more than 4 alleles are defined as repetitive SLAFs and discarded; only SLAFs with 2 to 4 alleles are identified as polymorphic and considered potential markers; all polymorphic SLAF loci are genotyped according to the consistency of the parental and offspring SNP loci; the marker codes of the polymorphic SLAFs are analyzed according to the segregation type of the RIL population (aa × bb); a Bayesian method is then used for genotyping to further ensure genotype quality;
construction of the genetic linkage map: the marker loci are first divided into linkage groups (LGs) according to their positions on the B73_RefGen_V4 genome; next, modified LOD (MLOD) scores between markers are calculated to further verify the robustness of the markers within each LG; markers with MLOD scores <0.05 are filtered out before ordering; to ensure a high-density, high-quality genetic map, the newly developed HighMap strategy is used to order the SLAF markers and to correct genotyping errors within the LGs; first, recombination frequencies and LOD scores are calculated by two-point analysis and used to infer linkage phases; then enhanced Gibbs sampling, spatial sampling and simulated annealing algorithms are combined in an iterative marker-ordering process; in the first stage of the ordering process, SLAF markers are selected by spatial sampling: one marker is drawn at random in test-cross priority order, and markers whose recombination frequency with it is smaller than a given sampling radius are excluded from the marker set; the optimal map order is then searched with the simulated annealing algorithm; annealing continues until newly generated map orders are rejected in a series of successive steps; after the optimal map order of the sampled markers is obtained, blocked Gibbs sampling is used to estimate the multipoint recombination frequencies of the parents; the updated recombination frequencies are used to integrate the maps of the two parents and to optimize the map order in the next simulated annealing cycle; once a stable map order is obtained after 3-4 cycles, the next round of map construction begins: a subset of the currently unmapped markers is selected and added to the previous sample with a reduced sampling radius; the mapping algorithm is repeated until all markers have been properly mapped.
2. The method for constructing the high-density genetic map of the recombinant inbred line population according to claim 1, wherein the specific method for deep sequencing the combined RRLs by using the Illumina paired-end sequencing protocol is as follows: the number of markers generated by different enzymes is simulated using the maize B73_RefGen_V4 reference genome, and a marker discovery experiment is designed; a SLAF pilot experiment is then performed, and the SLAF library is constructed according to the pre-designed scheme; for the population of 264 RILs, two enzymes, HaeIII and Hpy166II, are used to digest the genomic DNA at 37 ℃; a single-nucleotide (A) overhang is subsequently added to the digested fragments using Klenow fragment (3'→5') (NEB) and dATP; PAGE-purified dual-tag-labeled sequencing adapters are then ligated to the A-tailed fragments using T4 DNA ligase; polymerase chain reaction (PCR) is performed using the diluted restriction-ligation DNA sample, dNTPs, high-fidelity DNA polymerase and the PCR primers; the PCR products are subsequently purified using Agencourt AMPure XP beads and pooled; the pooled samples are separated by 2% agarose gel electrophoresis; fragments of 414 to 464 base pairs in size (including indexes and adapters) are excised and purified using the QIAquick gel extraction kit (Qiagen, Hilden, Germany); the gel-purified products are diluted; paired-end sequencing is performed on the Illumina platform.
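The enzyme simulation mentioned in claim 2 can be pictured as an in silico double digest of the reference sequence. The sketch below is a simplified illustration: it uses the known recognition sites of HaeIII (GG^CC) and Hpy166II (GTN^NAC), both of which leave blunt ends, cuts a single strand, and counts fragments in a length window. The function names and the window arguments are hypothetical, and the actual SLAF design software is not described in the claim. Note that the 414 to 464 bp gel window in the claim includes indexes and adapters, so the matching genomic insert window would be shorter.

```python
import re

# Recognition pattern (as a lookahead, so overlapping sites are found)
# and cut offset from the start of the site.
ENZYMES = {
    "HaeIII": (r"(?=GGCC)", 2),             # GG^CC
    "Hpy166II": (r"(?=GT[ACGT]{2}AC)", 3),  # GTN^NAC
}


def cut_positions(seq, enzymes=ENZYMES):
    """Return the sorted cut coordinates of all enzymes on one sequence."""
    seq = seq.upper()
    cuts = set()
    for pattern, offset in enzymes.values():
        for match in re.finditer(pattern, seq):
            cuts.add(match.start() + offset)
    return sorted(cuts)


def fragment_lengths(seq, enzymes=ENZYMES):
    """Lengths of the fragments produced by a complete double digest."""
    bounds = [0] + cut_positions(seq, enzymes) + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:]) if end > start]


def count_in_window(seq, min_len, max_len, enzymes=ENZYMES):
    """Number of fragments whose length falls inside [min_len, max_len]."""
    return sum(min_len <= n <= max_len for n in fragment_lengths(seq, enzymes))


if __name__ == "__main__":
    toy = "AAAGGCCTTTTGTAAACGGG"
    print(cut_positions(toy), fragment_lengths(toy))
```

Running the same count for several candidate enzyme pairs and window sizes is one way to compare how many markers each design would be expected to yield.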
3. The method for constructing the high-density genetic map of the recombinant inbred line population according to claim 1, wherein the SMOOTH error correction strategy is applied according to the parental contribution of genotypes, and a k-nearest neighbor algorithm is used to impute missing genotypes; skewed markers are then added to the map by a maximum likelihood multipoint method; map distances are estimated using the Kosambi mapping function.
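For reference, the Kosambi mapping function converts a recombination frequency r into a map distance d (in centimorgans) as d = 25 ln((1 + 2r) / (1 - 2r)), with inverse r = 0.5 tanh(d / 50). A minimal Python sketch follows; the function names are illustrative.

```python
import math


def kosambi_cm(r):
    """Map distance in centimorgans for a recombination frequency r."""
    if not 0.0 <= r < 0.5:
        raise ValueError("r must satisfy 0 <= r < 0.5")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def kosambi_r(d_cm):
    """Inverse: recombination frequency for a distance in centimorgans."""
    if d_cm < 0:
        raise ValueError("distance must be non-negative")
    return 0.5 * math.tanh(d_cm / 50.0)


if __name__ == "__main__":
    for r in (0.01, 0.10, 0.25):
        print(r, round(kosambi_cm(r), 3))
```

For small r the distance is close to 100r centimorgans; the correction for interference only becomes noticeable as r grows.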
4. The method for constructing the high-density genetic map of the recombinant inbred line population according to claim 1, wherein the Bayesian method for genotyping comprises the following steps: first, the posterior conditional probability is calculated using the coverage of each allele and the number of single nucleotide polymorphisms; then, a genotyping quality score translated from this probability is used to select qualified markers for subsequent analysis; low-quality markers are counted for each marker and each individual, and the worst markers or individuals are deleted in a dynamic process; the process stops when the average genotype quality score of all SLAF markers reaches the cutoff value.
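The claim does not spell out the probability model, so the following Python sketch only illustrates the general idea under stated assumptions: a binomial read-count likelihood with a fixed sequencing error rate, equal prior weights for the three genotypes, and a Phred-style quality score derived from the posterior. The error rate, the priors, the score cap and the function names are all hypothetical, and the number of SNPs per locus, which the claim also uses, is not modeled here.

```python
import math


def genotype_posterior(n_a, n_b, error=0.01, priors=None):
    """Posterior over 'aa', 'ab', 'bb' given read counts of alleles a and b."""
    if priors is None:
        priors = {"aa": 1 / 3, "ab": 1 / 3, "bb": 1 / 3}  # assumed flat prior
    # Probability that a single read shows allele a under each genotype.
    p_a = {"aa": 1.0 - error, "ab": 0.5, "bb": error}
    log_w = {
        g: math.log(priors[g])
        + n_a * math.log(p_a[g])
        + n_b * math.log(1.0 - p_a[g])
        for g in priors
    }
    top = max(log_w.values())                 # subtract for numerical stability
    weights = {g: math.exp(v - top) for g, v in log_w.items()}
    total = sum(weights.values())
    return {g: w / total for g, w in weights.items()}


def genotype_quality(posterior, cap=99.0):
    """Best genotype and a Phred-style score, -10 log10(P(call is wrong))."""
    best = max(posterior, key=posterior.get)
    p_wrong = 1.0 - posterior[best]
    score = cap if p_wrong <= 0.0 else min(cap, -10.0 * math.log10(p_wrong))
    return best, score


if __name__ == "__main__":
    for counts in ((10, 0), (5, 5), (1, 0)):
        print(counts, genotype_quality(genotype_posterior(*counts)))
```

With scores of this kind, the dynamic process in the claim amounts to repeatedly dropping the worst markers or individuals until the mean score over all remaining SLAF markers passes the chosen cutoff.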
5. The method for constructing the high-density genetic map of the recombinant inbred line population according to claim 1, wherein high-quality SLAF markers for genetic mapping are screened according to the following criteria: first, the average sequencing depth should be >8.05-fold in each progeny and >21.22-fold in the parents; second, markers with more than 50% missing data are filtered out; third, a chi-square test is used to examine segregation distortion; markers with significant segregation distortion (P <0.05) are initially excluded from map construction and are then added later as accessory markers.
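The second and third criteria of claim 5 can be sketched in Python as below. The sketch assumes a 1:1 expected ratio of aa to bb calls, which is the usual expectation for an aa × bb RIL population but is not stated in the claim, and treats any call other than "aa" or "bb" as missing. For one degree of freedom the chi-square P value equals erfc(sqrt(chi2 / 2)), so only the standard library is needed. Function names and return labels are illustrative.

```python
import math


def segregation_test(n_aa, n_bb):
    """Chi-square statistic and P value against a 1:1 aa:bb ratio (1 df)."""
    n = n_aa + n_bb
    if n == 0:
        raise ValueError("no scored individuals")
    expected = n / 2.0
    chi2 = ((n_aa - expected) ** 2 + (n_bb - expected) ** 2) / expected
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def screen_marker(calls, max_missing=0.5, alpha=0.05):
    """Label one marker as 'drop_missing', 'distorted' or 'keep'."""
    if not calls:
        raise ValueError("empty genotype list")
    n_aa = calls.count("aa")
    n_bb = calls.count("bb")
    missing_rate = 1.0 - (n_aa + n_bb) / len(calls)
    if missing_rate > max_missing:
        return "drop_missing"
    _, p_value = segregation_test(n_aa, n_bb)
    # Distorted markers are set aside first and may be added back later.
    return "distorted" if p_value < alpha else "keep"


if __name__ == "__main__":
    print(segregation_test(132, 132))
    print(screen_marker(["aa"] * 160 + ["bb"] * 104))
```

The depth criterion is omitted here because it is applied per sample rather than per marker.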
6. The method for constructing the high-density genetic map of the recombinant inbred line population according to claim 1, wherein the PCR primers comprise a forward primer and a reverse primer, the forward primer being 5'-AATGATACGGCGACCACCGA-3' and the reverse primer being 5'-CAAGCAGAAGACGGCATACG-3', and both primers are PAGE-purified.
CN202110490790.6A 2021-05-06 2021-05-06 Construction method of high-density genetic map of recombinant inbred line population Pending CN113718342A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110490790.6A CN113718342A (en) 2021-05-06 2021-05-06 Construction method of high-density genetic map of recombinant inbred line population

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110490790.6A CN113718342A (en) 2021-05-06 2021-05-06 Construction method of high-density genetic map of recombinant inbred line population

Publications (1)

Publication Number Publication Date
CN113718342A true CN113718342A (en) 2021-11-30

Family

ID=78672689

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110490790.6A Pending CN113718342A (en) 2021-05-06 2021-05-06 Construction method of high-density genetic map of recombinant inbred line population

Country Status (1)

Country Link
CN (1) CN113718342A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103114150A (en) * 2013-03-11 2013-05-22 上海美吉生物医药科技有限公司 Single nucleotide polymorphism site identification method based on digestion library-establishing and sequencing and bayesian statistics
CN103525917A (en) * 2013-09-24 2014-01-22 北京百迈客生物科技有限公司 Construction and evaluation of parting High Map on basis of high throughput
US20190333602A1 (en) * 2018-04-24 2019-10-31 Institute Of Crop Sciences, Chinese Academy Of Agricultural Sciences SOYBEAN ANTI-POD-SHATTERING MAJOR QTLqPD05, AND MAPPING METHOD AND APPLICATION THEREOF

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
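XIAMING WU et al.: "Construction of High-Density Genetic Map and Identification of QTLs Associated with Seed Vigor after Exposure to Artificial Aging Conditions in Sweet Corn Using SLAF-seq", 《GENES》 *
乔帅 (Qiao Shuai): "QTL Mapping of Plant Architecture and Yield-Related Traits in Maize", 《China Master's Theses Full-text Database, Agricultural Science and Technology》 *
高凤云 (Gao Fengyun) et al.: "Construction of a High-Density Genetic Map of Flax Based on SLAF-seq Technology", 《Chinese Journal of Oil Crop Sciences》 *
高凤云 (Gao Fengyun) et al.: "Construction of a High-Density Genetic Map of Flax Based on SLAF-seq Technology", Chinese Journal of Oil Crop Sciences *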

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115762641A (en) * 2023-01-10 2023-03-07 天津极智基因科技有限公司 Fingerprint spectrum construction method and system
CN115762641B (en) * 2023-01-10 2023-04-07 天津极智基因科技有限公司 Fingerprint spectrum construction method and system

Similar Documents

Publication Publication Date Title
US9976191B2 (en) Rice whole genome breeding chip and application thereof
Hoshino et al. Microsatellites as tools for genetic diversity analysis
Paux et al. Sequence-based marker development in wheat: advances and applications to breeding
Xu et al. Developing high throughput genotyped chromosome segment substitution lines based on population whole-genome re-sequencing in rice (Oryza sativa L.)
CN108004340B (en) Method for developing SNP (single nucleotide polymorphism) of whole genome of peanut
Sun et al. Genome-wide characterization and linkage mapping of simple sequence repeats in mei (Prunus mume Sieb. et Zucc.)
CN107090494B (en) Molecular marker related to grain number character of millet and detection primer and application thereof
CN115807122A (en) SNP molecular marker for pineapple seed resource identification and application thereof
CN108642201B (en) SNP (Single nucleotide polymorphism) marker related to millet plant height character as well as detection primer and application thereof
CN107988424B (en) Molecular marker, interval, primer and application related to methionine content of soybean seeds
CN113718342A (en) Construction method of high-density genetic map of recombinant inbred line population
CN109797242A (en) Identify the molecular labeling and method of wheat yield correlated traits
CN112226529A (en) SNP molecular marker of wax gourd blight-resistant gene and application
WO2020082314A1 (en) Oryza sativa green gene chip and application
CN107447022B (en) SNP molecular marker for predicting corn heterosis and application thereof
CN113736891B (en) Molecular marker G2997 for rapidly identifying low-temperature tolerant variety of penaeus japonicus and application thereof
CN113584187A (en) Molecular marker A2629 for screening penaeus japonicus with low temperature resistance, amplification primer and application thereof
CN113355449A (en) Elytrigia elongata 3E chromosome specific codominant KASP molecular marker and application thereof
CN108642203B (en) SNP (Single nucleotide polymorphism) marker related to millet stem thickness character as well as detection primer and application thereof
CN113684280A (en) Apostichopus japonicus high temperature resistant breeding low-density 12K SNP chip and application
CN117248061B (en) InDel locus related to soybean seed oil content, molecular marker, primer and application thereof
CN117965787B (en) SNP (Single nucleotide polymorphism) marker and primer set for identifying authenticity of pineapple Josapine and MD2 hybrid and application of SNP marker and primer set
CN117230240B (en) InDel locus related to soybean seed oil content, molecular marker, primer and application thereof
CN108642202B (en) SNP (Single nucleotide polymorphism) marker related to millet stem node number traits as well as detection primer and application thereof
Aydin et al. DNA fingerprinting of crop plants

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination