CN115679011A - SNP molecular marker combination and application thereof in maize germplasm identification and breeding

Info

Publication number
CN115679011A
CN115679011A CN202210764001.8A CN202210764001A CN115679011A CN 115679011 A CN115679011 A CN 115679011A CN 202210764001 A CN202210764001 A CN 202210764001A CN 115679011 A CN115679011 A CN 115679011A
Authority
CN
China
Prior art keywords
snp molecular
molecular marker
snp
markers
corn
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210764001.8A
Other languages
Chinese (zh)
Inventor
王天宇
李春辉
王红武
刘蓓
李永祥
黎裕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Crop Sciences of Chinese Academy of Agricultural Sciences
Original Assignee
Institute of Crop Sciences of Chinese Academy of Agricultural Sciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Crop Sciences of Chinese Academy of Agricultural Sciences filed Critical Institute of Crop Sciences of Chinese Academy of Agricultural Sciences
Priority to CN202210764001.8A priority Critical patent/CN115679011A/en
Publication of CN115679011A publication Critical patent/CN115679011A/en
Pending legal-status Critical Current

Landscapes

  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention relates to the technical field of bioinformatics, in particular to an SNP molecular marker combination and its application in maize germplasm identification and breeding. The combination comprises the SNP molecular markers shown in Table 1, where each SNP is the nucleotide enclosed in square brackets [ ] in the sequences of Table 1. Re-sequencing data from 2004 maize germplasm resources were filtered to obtain 20M high-quality markers, from which 1050 SNP molecular markers were then selected; these markers have a low individual missing rate and a high minor allele frequency. The SNP molecular marker combination provided by the invention can accurately classify different maize germplasm resources and different varieties, has a high detection rate and high repeatability, and can be applied to analyses such as maize molecular marker-assisted breeding, maize variety identification and maize group division.

Description

SNP molecular marker combination and application thereof in maize germplasm identification and breeding
Technical Field
The invention relates to the technical field of bioinformatics, in particular to an SNP molecular marker combination and its application in maize germplasm identification and breeding.
Background
A DNA chip, also called a gene chip, is a kind of biochip. Many probe molecules are fixed on a substrate; following the principle of complementary base pairing (A-T and C-G), the probes hybridize with the target genes in the sample, and the intensity of the hybridization signal is then measured to obtain sequence and quantity information about the sample.
DNA chips are now widely applied in fields such as disease diagnosis and treatment, crop identification and screening, and food safety supervision, and they have effectively improved the efficiency of crop breeding. In the development of maize gene chips, the first chip, based on the GoldenGate platform, contained 1536 SNP markers. Subsequently, on the Illumina detection platform, a 50K chip containing 49585 SNP markers was designed from about 2 million SNP markers identified by sequencing by synthesis of 27 genetically diverse inbred lines of the American maize nested association mapping population. This chip is widely used in genetic research and breeding improvement, but most of its sites come from tropical materials, so it is poorly suited to genotyping temperate materials. Chips of other densities, such as 3K chips, have also been developed by different researchers. On the Affymetrix detection platform, a 600K chip was developed from the re-sequencing results of 30 representative European temperate inbred lines, and 55K and 6H-60K chips have also been developed; however, the polymorphic sites of these chips all represent European materials and cannot effectively resolve the genotypes of temperate germplasm in China. Recently, liquid-phase chips based on genotyping by target sequencing have come into wide use, and 10K, 5K and 1K chips, among others, have been developed on the basis of the earlier chips.
Although high-, medium- and low-density maize chips have been developed, their markers derive from re-sequencing or other sequencing results of only a small number (tens or hundreds) of American or European maize germplasm resources, and the re-sequencing depth of some materials is low and inconsistent. As a result, the selected SNP markers may show low genetic diversity and poor representativeness when used to identify current maize breeding materials. Existing chips also suffer from poorly representative polymorphic sites, few functional markers and low diversity.
Disclosure of Invention
In order to solve the technical problems in the prior art, the invention provides an SNP molecular marker combination and application thereof in maize germplasm identification and breeding.
In a first aspect, the invention provides an SNP molecular marker combination. Sequencing results from 2004 germplasm resources were used to obtain information on 20M SNP sites, which were further screened to obtain 1050 SNP sites with a low individual missing rate (Miss value) and a high minor allele frequency (MAF value). The specific information is shown in Table 1: the position of each SNP is given in the chromosome and position columns, the polymorphism of each site is likewise listed in the table, and the sequence column gives the nucleotide sequence of 50 bp around the site, with the SNP itself enclosed in square brackets [ ].
[Table 1: the 1050 SNP molecular markers (chromosome, position, and sequence with the SNP in square brackets [ ]); provided as images BSA0000276906700000021 through BSA0000276906700000601 in the original publication]
The invention further provides application of the SNP molecular marker, the probe combination or the DNA chip in corn molecular marker-assisted breeding.
The invention further provides application of the SNP molecular marker, the probe combination or the DNA chip in corn group analysis or genetic transformation.
The invention further provides the application of the SNP molecular marker, the probe combination or the DNA chip in corn variety identification.
The invention has the following beneficial effects:
Based on the re-sequencing results of 2004 global temperate maize germplasm resources, which have abundant genetic diversity and come from different breeding stages, about 20 million (20M) high-quality SNP markers were obtained, and 1050 SNP molecular markers were then obtained through two steps of screening. The 1050 SNP molecular markers comprise 330 functional markers and breeding-selection sites obtained by genome-wide association analysis of flowering-time traits, yield and yield-related traits, plant-architecture traits and the like; 272 specific sites in conserved segments of heterotic groups; and stably detected sites verified in a large population. The SNP molecular markers have a low individual missing rate and a high minor allele frequency. When a DNA chip is prepared with these markers as detection targets, the SNP molecular markers provided by the invention can be applied to maize variety identification and heterotic group analysis, with the advantages of high throughput, good repeatability, wide practicability and high accuracy, and they have important application value in the identification and evaluation of maize germplasm resources, the breeding of excellent new maize varieties and the like.
Drawings
FIG. 1 is a flowchart of screening molecular markers provided in example 1 of the present invention.
FIG. 2 is a schematic diagram of the distribution of the 1050 SNP molecular markers on the chromosomes, provided in Example 1 of the present invention.
FIG. 3 shows the statistical results of the individual missing rate (miss) and the minor allele frequency (maf) of the 1050 SNP molecular markers provided in Example 1 of the present invention; the left panel shows the individual missing rate and the right panel shows the minor allele frequency.
FIG. 4 is a schematic diagram of a group analysis result provided in Example 2 of the present invention.
FIG. 5 is a schematic diagram of a class division result provided in Example 2 of the present invention.
FIG. 6 is a schematic diagram of a STRUCTURE group analysis result provided in Example 2 of the present invention.
FIG. 7 is a schematic diagram of a further STRUCTURE group analysis result provided in Example 2 of the present invention.
FIG. 8 is a schematic diagram of a similarity analysis result provided in Example 2 of the present invention.
FIG. 9 is a schematic diagram of a fingerprint analysis result provided in Example 2 of the present invention.
Detailed Description
The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention.
Example 1
The invention takes the re-sequencing VCF data (based on the maize v4 reference genome) of about two thousand global maize germplasm resource materials as its basis, and screens molecular markers by the following procedure:
The first step is shown in FIG. 1. First, sites with a population marker polymorphism (MAF) lower than 0.05 are filtered out. Second, markers whose 101 bp upstream and downstream flanking sequence has a GC content of 30-70% are retained. At the same time, linkage disequilibrium (LD) analysis is performed on the 20M markers. Within the blocks, in segments containing more than 35 markers, representative markers are selected with the priority: non-synonymous mutation > exon variation > gene upstream/downstream and splice regions > markers with MAF greater than 0.2 (intron > intergenic region). This yields 52K markers in total, which are used for the development and design of liquid-phase capture probes.
The second step: 2700 landraces and 145 representative lines are genotyped with the designed probe kit, and variant detection analysis (BWA, GATK) is performed. Sites that perform well in the experiment are selected: important sites such as trait-associated sites and breeding-improvement sites are given first priority, the sites with the best detection rate and polymorphism are chosen, and the final selection keeps one marker per 1 Mb. The genome is then divided into 2 Mb segments, the segments not covered by first-priority markers are screened, and block-representative sites with MAF > 0.3 and Miss < 0.1 are selected wherever possible. These two steps of filtering and screening yield 1050 high-quality representative markers (shown in FIG. 2).
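The first-step rules above can be illustrated with a minimal Python sketch. It is not the inventors' pipeline: the marker dictionaries, the annotation labels and the helper names (`gc_content`, `is_candidate`, `pick_block_representative`) are hypothetical, and the reading that intronic and intergenic markers must have MAF above 0.2 is one interpretation of the stated priority order. The "more than 35 markers per segment" condition is not modelled.

```python
from typing import Dict, List, Optional

# Annotation classes ordered by the priority stated in the text
# (lower rank = preferred). The label strings are hypothetical.
ANNOTATION_RANK = {
    "nonsynonymous": 0,
    "exonic": 1,
    "gene_flank_or_splice": 2,
    "intronic": 3,
    "intergenic": 4,
}


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a flanking sequence."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq) if seq else 0.0


def is_candidate(marker: Dict) -> bool:
    """Keep markers with MAF >= 0.05 and flanking GC content of 30-70%."""
    return marker["maf"] >= 0.05 and 0.30 <= gc_content(marker["flank"]) <= 0.70


def pick_block_representative(block: List[Dict]) -> Optional[Dict]:
    """Choose one marker per LD block by annotation rank, then by higher MAF."""
    eligible = []
    for marker in block:
        if not is_candidate(marker):
            continue
        rank = ANNOTATION_RANK[marker["annotation"]]
        # Assumed reading: intronic/intergenic markers need MAF > 0.2.
        if rank >= 3 and marker["maf"] <= 0.2:
            continue
        eligible.append(marker)
    if not eligible:
        return None
    return min(eligible, key=lambda m: (ANNOTATION_RANK[m["annotation"]], -m["maf"]))
```

Ranking by annotation first and MAF second keeps functional variants ahead of purely informative ones, which matches the stated preference for non-synonymous and exonic sites.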
Statistics were computed for the 1050 SNP sites; the results for the individual missing rate (miss) and the minor allele frequency (maf) are shown in FIG. 3. The individual missing rate is below 0.2 and mostly around 0.05, and the minor allele frequency mostly falls in the interval 0.3-0.5, which shows that the 1050 SNP sites obtained by screening are of high quality and have a high detection rate.
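As an illustration of the two statistics reported here, the following sketch computes a per-site missing rate and minor allele frequency from diploid genotype calls. The call format (`"A/G"`, with `"./."` for a missing call) and the function names are assumptions made for the example; the thresholds MAF > 0.3 and Miss < 0.1 are the ones quoted in Example 1.

```python
from collections import Counter
from typing import List, Tuple

MISSING_CALLS = {"./.", ".", ""}


def site_stats(calls: List[str]) -> Tuple[float, float]:
    """Return (missing_rate, maf) for one site from calls such as 'A/G'."""
    alleles = Counter()
    n_missing = 0
    for call in calls:
        if call in MISSING_CALLS:
            n_missing += 1
            continue
        first, second = call.split("/")
        alleles[first] += 1
        alleles[second] += 1
    missing_rate = n_missing / len(calls) if calls else 1.0
    total = sum(alleles.values())
    if total == 0 or len(alleles) < 2:
        return missing_rate, 0.0  # monomorphic or entirely missing site
    # MAF is taken as the frequency of the second most common allele.
    maf = alleles.most_common(2)[1][1] / total
    return missing_rate, maf


def is_high_quality(calls: List[str], min_maf: float = 0.3,
                    max_missing: float = 0.1) -> bool:
    """Apply the second-step thresholds quoted in the text."""
    missing_rate, maf = site_stats(calls)
    return maf > min_maf and missing_rate < max_missing
```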
Example 2
The invention adopts 1050 molecular markers to carry out cluster analysis, and the process flow is as follows:
Second-generation sequencing is carried out by liquid-phase capture sequencing, and the sequencing results are processed with a standard variant detection analysis workflow.
The second-generation sequencing procedure is as follows:
1. After genomic DNA is extracted, it is fragmented by enzymatic digestion and sequencing adapters are added;
2. Biotin-labelled RNA probes are hybridized with the adapter-ligated DNA fragments;
3. Streptavidin-coated magnetic beads bind the double-stranded complexes formed by the biotin-labelled RNA probes and the DNA (the probes are in excess);
4. The captured target-region DNA is washed to remove non-specific hybridization and improve capture efficiency;
5. The eluted DNA product is amplified by PCR, and an Illumina sequencing library is constructed and sequenced.
The main steps of the variant detection analysis are as follows:
1. Quality control is performed on the raw data from the sequencer to obtain clean data;
2. The clean data are aligned to the reference genome;
3. Population SNP detection and filtering are performed on the alignment results.
GVCF files are first generated with the HaplotypeCaller module of GATK, the resulting data are filtered with the basic filtering parameters recommended on the official GATK website, and the markers are optionally filtered further with vcftools, yielding markers with high polymorphism across the 2845 samples.
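The alignment, calling and filtering steps above can be sketched as command lines assembled in Python. This is only an illustration: the file names are placeholders, the filter thresholds are commonly cited GATK hard-filter values rather than values stated in this document, and the joint-genotyping step between per-sample GVCFs and the population VCF is omitted. The functions only build argument lists; running them would require BWA, GATK and vcftools to be installed.

```python
from typing import List


def bwa_mem_cmd(ref: str, fq1: str, fq2: str, threads: int = 4) -> List[str]:
    """Align paired-end clean reads; bwa mem writes SAM to standard output."""
    return ["bwa", "mem", "-t", str(threads), ref, fq1, fq2]


def haplotypecaller_gvcf_cmd(ref: str, bam: str, out_gvcf: str) -> List[str]:
    """Per-sample variant calling in GVCF mode."""
    return ["gatk", "HaplotypeCaller", "-R", ref, "-I", bam,
            "-O", out_gvcf, "-ERC", "GVCF"]


def variant_filtration_cmd(ref: str, vcf: str, out_vcf: str) -> List[str]:
    """Flag SNPs that fail basic hard filters (threshold values are assumptions)."""
    expression = "QD < 2.0 || FS > 60.0 || MQ < 40.0"
    return ["gatk", "VariantFiltration", "-R", ref, "-V", vcf, "-O", out_vcf,
            "--filter-name", "basic_snp_filter",
            "--filter-expression", expression]


def vcftools_filter_cmd(vcf_gz: str, out_prefix: str, maf: float = 0.05,
                        max_missing: float = 0.9) -> List[str]:
    """Optional extra marker filtering on MAF and call rate."""
    return ["vcftools", "--gzvcf", vcf_gz, "--maf", str(maf),
            "--max-missing", str(max_missing),
            "--recode", "--recode-INFO-all", "--out", out_prefix]
```

Each list can be passed to `subprocess.run(cmd, check=True)`; note that vcftools' `--max-missing` is a call-rate threshold, so 0.9 corresponds to at most 10% missing calls.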
Population PCA, tree and structure analyses are then performed with the mainstream population-analysis software GCTA, TreeBeST and ADMIXTURE, respectively, and similarity analysis is performed with an R software package. The 1050 molecular markers were used for group analysis of 2845 samples. Of these, 145 samples are backbone parents and parents of star hybrids from different periods of maize breeding in China, covering the backbone lines of the main heterotic groups in maize breeding; the other 2700 are landraces, important farmer varieties historically planted in different ecological regions of China, with rich genetic diversity. The result shown in FIG. 4 shows that the groups can be accurately separated, and the results shown in FIGS. 5-7 show that the 1050 markers of the invention can successfully and accurately classify the samples into groups. FIG. 4 is the TREE result, i.e. the population evolutionary tree analysis; FIG. 5 shows the PCA result; FIGS. 6 and 7 show the structure analysis results.
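To make the PCA step concrete, here is a small pure-Python sketch that projects samples onto the first principal component of a samples-by-markers dosage matrix (0, 1 or 2 copies of one allele). It is a stand-in using power iteration, not the GCTA implementation named above, and the function name and input layout are assumptions for the example.

```python
import math
from typing import List


def first_pc_scores(dosages: List[List[float]], n_iter: int = 100) -> List[float]:
    """Scores of each sample (row) on PC1 of a samples x markers matrix."""
    n_samples, n_markers = len(dosages), len(dosages[0])
    # Centre every marker column on its mean dosage.
    means = [sum(row[j] for row in dosages) / n_samples for j in range(n_markers)]
    x = [[row[j] - means[j] for j in range(n_markers)] for row in dosages]
    # Deterministic, non-uniform starting vector for the power iteration.
    v = [float(j + 1) for j in range(n_markers)]
    for _ in range(n_iter):
        # Multiply by X^T X without forming the matrix: w = X^T (X v).
        scores = [sum(xi[j] * v[j] for j in range(n_markers)) for xi in x]
        w = [sum(x[i][j] * scores[i] for i in range(n_samples))
             for j in range(n_markers)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break  # no variance left to capture
        v = [c / norm for c in w]
    return [sum(xi[j] * v[j] for j in range(n_markers)) for xi in x]
```

Samples from the same group should receive PC1 scores of the same sign and similar size, which is the separation that a PCA scatter plot visualises. The sign of the axis itself is arbitrary.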
The result of the similarity analysis is shown in fig. 8, and the result shows that 1050 markers provided by the invention can accurately perform the difference analysis and variety identification of germplasm resources or varieties.
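The text does not say which similarity measure the R package computes. As one plausible illustration, the sketch below calculates a pairwise identity-by-state (IBS) similarity from allele dosages, skipping sites where either sample is missing; the measure, the function names and the `None`-for-missing convention are all assumptions.

```python
from typing import List, Optional

Dosages = List[Optional[int]]


def ibs_similarity(a: Dosages, b: Dosages) -> float:
    """Identity-by-state similarity in [0, 1] over sites called in both samples."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return float("nan")
    # Each site contributes 0, 1 or 2 mismatching alleles out of 2.
    mismatches = sum(abs(x - y) for x, y in shared)
    return 1.0 - mismatches / (2.0 * len(shared))


def similarity_matrix(samples: List[Dosages]) -> List[List[float]]:
    """Symmetric matrix of pairwise IBS similarities."""
    return [[ibs_similarity(a, b) for b in samples] for a in samples]
```

Two samples with a similarity of 1.0 across all shared markers cannot be told apart by the panel, so variety identification relies on this value staying below 1.0 for distinct varieties.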
The fingerprint construction results are shown in FIG. 9: the A, T, C and G genotypes and the heterozygous genotypes are displayed in different colours, so that the genotype of each individual is displayed visually and is unique.
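A fingerprint of this kind can be sketched as a string with one character per marker. In the hypothetical encoding below, homozygous calls keep their base letter, heterozygous calls use the standard IUPAC ambiguity codes, and missing calls become `N`; the figure itself uses colours, so this text encoding and the helper names are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List

# Standard IUPAC ambiguity codes for two-base heterozygotes.
IUPAC_HET = {
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}


def encode_call(call: str) -> str:
    """Encode one genotype call such as 'A/A', 'A/G' or './.' as one letter."""
    if call in ("./.", ".", ""):
        return "N"
    first, second = call.split("/")
    if first == second:
        return first
    return IUPAC_HET[frozenset((first, second))]


def fingerprint(calls: List[str]) -> str:
    """Concatenate the per-marker codes in marker order."""
    return "".join(encode_call(call) for call in calls)


def duplicated_fingerprints(samples: Dict[str, List[str]]) -> List[List[str]]:
    """Groups of sample names that share an identical fingerprint."""
    groups = defaultdict(list)
    for name, calls in samples.items():
        groups[fingerprint(calls)].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]
```

An empty result from `duplicated_fingerprints` corresponds to the uniqueness property described for FIG. 9.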
Although the invention has been described in detail hereinabove with respect to a general description and specific embodiments thereof, it will be apparent to those skilled in the art that modifications and improvements can be made thereto based on the invention. Accordingly, such modifications and improvements are intended to be within the scope of the invention as claimed.

Claims (8)

1. A SNP molecular marker combination, which is characterized by comprising the SNP molecular markers as shown in Table 1, wherein the SNP molecular markers refer to nucleotides in sequences [ ] in Table 1.
2. A probe combination for detecting the combination of molecular markers of any one of claims 1-3.
3. A DNA chip comprising the probe set according to claim 2.
4. A kit comprising the SNP molecular marker set according to claim 1.
5. The use of the SNP molecular marker set according to claim 1, or the probe set according to claim 2, or the DNA chip according to claim 3, or the kit according to claim 4 in maize molecular marker-assisted selection or whole genome selection.
6. Use of the SNP molecular marker set according to claim 1, or the probe set according to claim 2, or the DNA chip according to claim 3, or the kit according to claim 4 for the identification of maize varieties.
7. The SNP molecular marker set according to claim 1, or the probe set according to claim 2, or the DNA chip according to claim 3, or the kit according to claim 4, for use in maize genome-wide association analysis.
8. The SNP molecular marker set according to claim 1, or the probe set according to claim 2, or the DNA chip according to claim 3, or the kit according to claim 4, for use in any one of the following:
(1) Constructing a genetic map;
(2) Analyzing genetic diversity;
(3) Constructing a molecular identity card;
(4) Improving the corn variety;
(5) Analyzing germplasm resources;
(6) Identifying the corn variety.
CN202210764001.8A 2022-06-30 2022-06-30 SNP molecular marker combination and application thereof in maize germplasm identification and breeding Pending CN115679011A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210764001.8A CN115679011A (en) 2022-06-30 2022-06-30 SNP molecular marker combination and application thereof in maize germplasm identification and breeding

Publications (1)

Publication Number Publication Date
CN115679011A true CN115679011A (en) 2023-02-03

Family

ID=85060420

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210764001.8A Pending CN115679011A (en) 2022-06-30 2022-06-30 SNP molecular marker combination and application thereof in maize germplasm identification and breeding

Country Status (1)

Country Link
CN (1) CN115679011A (en)

Similar Documents

Publication Publication Date Title
CN109196123B (en) SNP molecular marker combination for rice genotyping and application thereof
CN108998550B (en) SNP molecular marker for rice genotyping and application thereof
CN109706263A (en) Chain SNP marker and application with wheat stripe rust resisting ospc gene QYr.sicau-1B-1
CN110029187A (en) A kind of application for marking the method for map based on competitive equipotential PCR building rice molecular and it being utilized to carry out breeding
CN108103235B (en) SNP molecular marker and primer for identifying cold resistance of apple rootstock and application of SNP molecular marker and primer
CN115029451B (en) Sheep liquid phase chip and application thereof
CN107760789B (en) Genotyping detection kit for parent-child identification and individual identification of yaks
CN107090495B (en) Molecular marker related to long shape of neck of millet and detection primer and application thereof
CN111088382B (en) Corn whole genome SNP chip and application thereof
CN115198023B (en) Hainan cattle liquid-phase breeding chip and application thereof
CN110846429A (en) Corn whole genome InDel chip and application thereof
CN115232881A (en) Abalone genome breeding chip and application thereof
CN117965781A (en) Peanut 40K liquid-phase SNP chip 'PeanutGBTS K' and application thereof
CN115679012B (en) Chilli whole genome SNP-Panel and application thereof
CN117051151A (en) Chilli 5K liquid-phase chip and application thereof
CN108416189B (en) Crop variety heterosis mode identification method based on molecular marker technology
CN106399495B (en) SNP marker closely linked with soybean short stalk character and application thereof
CN108823330A (en) A kind of soybean HRM-SNP molecular labeling point labeling method and its application
CN108913797A (en) The method that GBS obtains Chinese cabbage group genome SNP building finger-print
CN115679011A (en) SNP molecular marker combination and application thereof in maize germplasm identification and breeding
CN114457070A (en) Wheat-diploid elytrigia elongata 45K liquid-phase chip and application
CN110305974B (en) PCR analysis primer for distinguishing common mouse inbred lines based on detection of five SNP loci and analysis method thereof
CN111549172A (en) Watermelon leaf posterior green gene linkage site and CAPS marker
CN117587159B (en) Chilli SNP molecular marker combination, SNP chip and application thereof
CN108728570A (en) DCAPS primer pairs and its application for the first female section of auxiliary judgment pumpkin and first male section

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination