CN114395640B - Soybean high-protein content related molecular marker located on chromosome 6 and method for identifying soybean high-protein content

Info

Publication number
CN114395640B
Authority
CN
China
Prior art keywords
soybean
protein
dna
synthesis
protein content
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210043653.2A
Other languages
Chinese (zh)
Other versions
CN114395640A (en)
Inventor
齐照明
赵莹
朱荣胜
黄仕钰
刘珊珊
刘春燕
辛大伟
王锦辉
陈庆山
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northeast Agricultural University
Original Assignee
Northeast Agricultural University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northeast Agricultural University
Priority to CN202210043653.2A
Publication of CN114395640A
Application granted
Publication of CN114395640B
Legal status: Active
Anticipated expiration

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6888: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
    • C12Q1/6895: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for plants, fungi or algae
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/13: Plant traits
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/156: Polymorphic or mutational markers

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Engineering & Computer Science (AREA)
  • Organic Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Health & Medical Sciences (AREA)
  • Biotechnology (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Immunology (AREA)
  • Mycology (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Botany (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention provides a molecular marker on chromosome 6 related to high protein content in soybean and a method for identifying soybean with high protein content, and belongs to the technical field of biology. The aim is to screen high-protein, high-quality soybean varieties rapidly and accurately. The invention provides a molecular marker SNP2 related to high protein content in soybean, wherein the nucleotide locus corresponding to SNP2 is Gm06_44869874, together with the use of the marker in preparing a kit for detecting high protein content in soybean, and a screening method. Trait selection is achieved through marker selection, which greatly improves breeding efficiency and enables directed improvement of soybean varieties, so that high-protein soybean varieties can be selected.

Description

Soybean high-protein content related molecular marker located on chromosome 6 and method for identifying soybean high-protein content
This application is a divisional application of application number 202110583739.X, filed on May 27, 2021, entitled "Molecular marker related to high protein content of soybean and method for identifying soybean with high protein content".
Technical Field
The invention belongs to the technical field of biology, and particularly relates to a molecular marker related to high protein content of soybean on chromosome 6 and a method for identifying the soybean with high protein content.
Background
Soybean is rich in nutrients, with a protein content of about 40%. Soybean protein contains the 8 amino acids essential to the human body, so eating soybean supplies needed nutrients and can help prevent cardiovascular disease. Soybean is also an important oil crop that can be processed into edible oil to meet dietary needs; the oil consists mainly of five fatty acids, which can help prevent heart disease, cancer and the like. As living standards rise, more and more people pay attention to healthy eating and the nutritional value of food, so demand for soybean is high. However, China depends heavily on imported soybean, so there is an urgent need in China to improve soybean protein and to breed high-protein, high-oil soybean varieties to meet people's daily needs.
Soybean seed protein is a quality-related trait and a relatively complex quantitative trait controlled by multiple genes. Its improvement has long been limited by its genetic characteristics and by breeding methods, and traditional methods are too slow. With continuing technological progress, molecular marker-assisted selection has been proposed on the basis of traditional cross breeding: molecular markers are closely linked to the genes that determine the target trait, so trait selection is achieved through marker selection, which greatly improves breeding efficiency and enables directed improvement of soybean varieties, so that high-protein soybean varieties can be selected.
Disclosure of Invention
The invention aims to rapidly and accurately screen high-protein high-quality soybean varieties, and provides a soybean high-protein content related molecular marker, wherein the nucleotide sequence of the molecular marker is SNP1, the sequence of the SNP1 is the nucleotide sequence of 50.84Mb-50.87Mb positions on a soybean chromosome 1, and the 50861576 nucleotide site of a Gm01 chromosome is A or C.
In one embodiment, primers for amplifying SNP1 are Gm01_50861576-F and Gm01_50861576-R, and the nucleotide sequence of Gm01_50861576-F is shown as SEQ ID NO.15 or SEQ ID NO. 16; the nucleotide sequence of Gm01_50861576-R is shown in SEQ ID NO. 17.
The invention also provides a soybean high-protein content related molecular marker, the nucleotide sequence of the molecular marker is SNP2, the sequence of the SNP2 is the nucleotide sequence of 38.49Mb-47.89Mb positions on a soybean chromosome 6, and the 44869874 nucleotide site of a chromosome Gm06 is A or G.
In one embodiment, the primers for amplifying SNP2 are Gm06_44869874-F and Gm06_44869874-R; the nucleotide sequence of Gm06_44869874-F is shown in SEQ ID NO.51 or SEQ ID NO.52; the nucleotide sequence of Gm06_44869874-R is shown in SEQ ID NO.53.
The invention also provides a soybean high-protein content related molecular marker, the nucleotide sequence of the molecular marker is SNP3, the sequence of SNP3 is the nucleotide sequence at positions 16.13Mb-16.66Mb on soybean chromosome 14, and nucleotide site 16525645 of chromosome Gm14 is A or T.
In one embodiment, primers for amplifying SNP3 are Gm14_16525645-F and Gm14_16525645-R, the nucleotide sequence of Gm14_16525645-F is shown as SEQ ID NO.84 or as SEQ ID NO. 85; the nucleotide sequence of Gm14_16525645-R is shown in SEQ ID NO. 86.
The invention also provides application of SNP1, SNP2 and SNP3 molecular markers in preparing a kit for identifying soybean with high protein content, and the SNP1, SNP2 and SNP3 molecular markers are amplified by any one group of primers from (a) to (c):
(a) The nucleotide sequence of the upstream primer of the amplification SNP1 is shown as SEQ ID NO.15 or SEQ ID NO. 16; the nucleotide sequence of the downstream primer of the amplification SNP1 is shown as SEQ ID NO. 17;
(b) The nucleotide sequence of the upstream primer for amplifying SNP2 is shown in SEQ ID NO.51 or SEQ ID NO.52; the nucleotide sequence of the downstream primer for amplifying SNP2 is shown in SEQ ID NO.53;
(c) The nucleotide sequence of the upstream primer of the amplification SNP3 is shown as SEQ ID NO.84 or SEQ ID NO. 85; the nucleotide sequence of the downstream primer of the amplification SNP3 is shown as SEQ ID NO. 86.
The invention also provides a method for identifying soybean with high protein content, comprising the following steps:
(1) Extract DNA from the soybean to be tested;
(2) Carry out a PCR reaction with the primers for the SNP1 molecular marker. If the soybean variety to be tested has the CC genotype, it is a high-protein soybean; if it has the AA genotype, it is a low-protein soybean.
The invention also provides a method for identifying soybean with high protein content, comprising the following steps:
(1) Extract DNA from the soybean to be tested;
(2) Carry out a PCR reaction with the primers for the SNP2 molecular marker. If the soybean variety to be tested has the GG genotype, it is a high-protein soybean; if it has the AA genotype, it is a low-protein soybean.
The invention also provides a method for identifying soybean with high protein content, comprising the following steps:
(1) Extract DNA from the soybean to be tested;
(2) Carry out a PCR reaction with the primers for the SNP3 molecular marker. If the soybean variety to be tested has the AA genotype, it is a high-protein soybean; if it has the TT genotype, it is a low-protein soybean.
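The three identification methods above reduce to a lookup from marker and homozygous genotype to a protein class. A minimal Python sketch of that lookup follows; the function name, the dictionary layout and the "undetermined" result for any other genotype (for example a heterozygote) are illustrative assumptions, since the text only defines the two homozygous cases.

```python
# Homozygous genotypes per marker, as stated in the identification methods.
# The dictionary layout and names are illustrative, not part of the patent.
MARKER_RULES = {
    "SNP1": {"site": "Gm01_50861576", "high": "CC", "low": "AA"},
    "SNP2": {"site": "Gm06_44869874", "high": "GG", "low": "AA"},
    "SNP3": {"site": "Gm14_16525645", "high": "AA", "low": "TT"},
}


def call_protein_class(marker: str, genotype: str) -> str:
    """Return 'high', 'low' or 'undetermined' for a KASP genotype call."""
    rule = MARKER_RULES[marker]
    g = genotype.upper()
    if g == rule["high"]:
        return "high"
    if g == rule["low"]:
        return "low"
    # Assumption: genotypes not covered by the text are left uncalled.
    return "undetermined"


if __name__ == "__main__":
    for marker, genotype in [("SNP1", "CC"), ("SNP2", "AA"), ("SNP3", "AT")]:
        print(marker, genotype, call_protein_class(marker, genotype))
```

Keeping the rules in one table makes it easy to see that the favourable allele differs between markers (C, G and A respectively).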
Beneficial effects: this study used a resource population of 643 accessions subjected to whole-genome resequencing, combined with phenotype data on soybean seed storage substances from 2 years with 3 replicates, and applied a layered evaluation method to screen SNP loci very significantly associated with soybean seed protein and oil. KASP, an SNP molecular marker technology, was used for verification in 151 non-sequenced extreme-protein soybean resource materials and 162 non-sequenced extreme-oil soybean resource materials. Molecular markers related to soybean protein and oil were developed from the genotyping results and the corresponding phenotype data, providing a fast and accurate method for screening high-protein, high-oil, high-quality varieties in advance in production.
Drawings
FIG. 1 shows the distributions of protein, oil content and BLUP values of the sequenced resource materials in 2018 and 2019, wherein A is the 2018 protein content distribution histogram, B is the 2019 protein content distribution histogram, and C is the two-year protein BLUP distribution histogram; the abscissa is the group and the ordinate is the frequency;
FIG. 2 is a distribution of the number of SNP sites on 20 chromosomes, wherein the abscissa is the chromosome and the ordinate is the number of SNP sites;
FIG. 3 is a distribution of the number of SNP sites on 20 chromosomes related to a protein, wherein the abscissa indicates the chromosome and the ordinate indicates the number of SNP sites;
FIG. 4 is a graph showing differences in phenotypic effects of a protein-associated SNP site mutant genome corresponding allele and a reference genome, wherein the abscissa is the group and the ordinate is the difference in phenotypic effects;
FIG. 5 is a graph showing the average of high protein excellent haplotype and low protein haplotype phenotypes for protein-associated SNP sites, where the abscissa indicates the group and the ordinate indicates the protein content;
FIG. 6 is KASP genotyping of SNP markers in 151 soybean extreme protein resource materials.
DETAILED DESCRIPTION OF EMBODIMENT (S) OF INVENTION
The MQTL for soybean protein content are described in Qi et al. 2018, "Meta-analysis and transcriptome profiling reveal hub genes for soybean seed storage composition during seed development".
Example 1.
Experimental population: 643 sequenced soybean core germplasm resources from Northeast China were selected as the experimental population and planted in 2018 and 2019 in the test fields of the Jilin Academy of Agricultural Sciences and the Sunny Farm of Northeast Agricultural University, with 3 replicates. The row length was 1 m, 20 seeds were sown per row, and the sowing depth was 3-4 cm; field management followed routine field practice. Traits were examined after harvest, with 5 plants of uniform growth selected for measurement. During the vegetative growth stage, the youngest leaves at the top of the plants were collected for DNA extraction; at harvest, 5 plants were taken at random from each line and threshed, and the threshed seeds were used to measure protein content.
1. Measurement and processing of soybean seed protein content: the protein content of the experimental and verification materials was measured in bulk with a FOSS grain analyzer (Infratec 1241). The instrument performs a full-spectrum scan using near-infrared transmission, which yields rich spectral information, and highly precise protein content phenotype data are obtained by comparison with a calibration database. During measurement, the seeds were kept within the safe moisture range; the seeds of each single plant were measured 3 times, and the mean of the 3 measurements was taken as the final protein content phenotype. Phenotype data were processed and averaged in Microsoft Office Excel 2013, and the soybean seed protein data from 2 years with 3 replicates were analyzed in SPSS Statistics, including significance tests, frequency distribution histograms and mean calculation. Best linear unbiased prediction (BLUP) values were calculated in R.
As shown in Table 1, the quality-related trait of the population did not vary widely between the two years: the coefficient of variation was between 4% and 5%, and the standard deviation of protein content was small, 2.09 and 2.04 in the two years respectively. Kurtosis and skewness analysis showed that protein content in the population follows a moderately skewed distribution. BLUP analysis was performed on the 2018 and 2019 protein and oil phenotype data, and SPSS was used to plot distribution histograms with normal curves for the 2018 and 2019 protein means (A and B in FIG. 1) and the BLUP values (C in FIG. 1). The two-year data and the BLUP values for soybean seed protein are continuously distributed with a clear trend, and the normal curves indicate that they approximate a normal distribution. In addition, comparing the two yearly protein plots with the BLUP plot, the peak of the BLUP distribution is lower than that of either year's data, and the distribution is wider and more even; this BLUP distribution is consistent with the genetic characteristics of a quality trait, so the BLUP values are better suited to subsequent analysis.
Table 1. Descriptive analysis of protein quality traits of the sequenced soybean materials in 2018 and 2019
Figure GDA0003546893820000031
DNA extraction method: 3-4g of fresh soybean leaves were placed in a 1.5 mL centrifuge tube together with 3 sterilized steel balls 3 mm in diameter. The tube was immersed in liquid nitrogen and the frozen leaf tissue was ground to powder in a tissue mill. 650-700 μL of CTAB extraction buffer preheated in a 65 ℃ water bath was added and mixed on a vortex oscillator, then the tube was incubated in the 65 ℃ water bath for 1 h and inverted every 15 min. An equal volume of chloroform was added and mixed thoroughly, followed by centrifugation at 12,000 rpm for 20 min; the supernatant was transferred to a new centrifuge tube, 700 μL of chloroform was added again, the tube was mixed thoroughly by inversion and centrifuged at 12,000 rpm for 20 min, and the supernatant was transferred to a centrifuge tube pre-filled with isopropanol pre-chilled to -20 ℃ and kept at -20 ℃ for 20 min. After centrifugation at 8,000 rpm for 10 min, the supernatant was poured into a waste container and the DNA pellet at the bottom was washed with absolute ethanol and with 75% ethanol. The tube was left open on a clean bench to air-dry, and sterilized water was added. The quality of the extracted DNA was measured with a spectrophotometer (NanoDrop), the concentration of the DNA was checked by agarose gel electrophoresis, and the DNA was uniformly diluted to a working concentration of 20 ng/μL.
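The final dilution to 20 ng/μL follows the usual relation C1·V1 = C2·V2. The Python sketch below illustrates the arithmetic only; the stock concentration and volume in the example are hypothetical values, since the text does not report them, and the function name is an assumption.

```python
def water_to_add_ul(stock_ng_per_ul: float, stock_ul: float,
                    target_ng_per_ul: float = 20.0) -> float:
    """Volume of water (uL) needed to dilute a DNA stock to the target
    concentration, from C1 * V1 = C2 * V2."""
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is already more dilute than the target")
    final_ul = stock_ng_per_ul * stock_ul / target_ng_per_ul
    return final_ul - stock_ul


if __name__ == "__main__":
    # Hypothetical example: 10 uL of a 100 ng/uL extract.
    print(water_to_add_ul(100.0, 10.0))  # 40.0 uL water gives 50 uL at 20 ng/uL
```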
2. Layered evaluation of SNP loci in the sequenced resource materials: 643 soybean resources were selected for resequencing, and a total of 53,946 SNP loci were obtained across the 20 chromosomes, an average of 2,697 per chromosome; the SNP quality control thresholds were a MAF of 0.05 and a heterozygosity rate of 10%. Chromosome Chr18 carried the most SNPs, 4,462, and chromosome Chr11 the fewest, 862; the numbers of SNPs on the remaining chromosomes are shown in FIG. 2.
3. Important allele mining: (1) Classify the materials by phenotype. For the soybean protein trait, calculate the mean and standard deviation of all data, and use the mean plus one standard deviation and the mean minus one standard deviation as the critical values; materials above the mean plus one standard deviation are classed as high protein. (2) For each SNP locus, analyze the sequencing results by counting the alleles corresponding to the reference genome and the alleles corresponding to the mutant genome. Combine these with the phenotype data (protein and oil), classify the materials by the mean plus or minus one standard deviation, and keep the materials at the two extremes, above the upper and below the lower critical value. Enter the counts in the following four-fold table (Table 2) for the chi-square test:
Table 2. Four-fold table for chi-square analysis
Figure GDA0003546893820000041
Note: i is A, C, T or G; a11 is the number of i alleles in high-protein materials, a21 is the number of non-i alleles in high-protein materials, a12 is the number of i alleles in low-protein materials, a22 is the number of non-i alleles in low-protein materials, C1 is the total number of high-protein materials, C2 is the total number of low-protein materials, R1 is the total number of i alleles, R2 is the total number of non-i alleles, and n is the total number of materials.
(3) Let H0: protein content is independent of the i allele; HA: the two variables are associated. The χ² value is obtained by the following formula. When the obtained χ² < χ²α, H0 is accepted; when the obtained χ² ≥ χ²α, H0 is rejected and HA is accepted. Each SNP locus under study is therefore judged from its χ² value against the threshold at α=0.001.
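Steps (1) and (2) can be sketched in Python as follows: materials are split at the mean plus or minus one standard deviation, and the counts a11, a12, a21 and a22 of the four-fold table are tallied for one allele i. This is an illustrative sketch only; the function names, the one-allele-per-sample input format and the use of the sample standard deviation are assumptions, and the five-sample data set is made up.

```python
from statistics import mean, stdev


def classify_extremes(protein: dict) -> tuple:
    """Split samples into high (> mean + SD) and low (< mean - SD) sets.
    Assumption: the sample standard deviation is used."""
    values = list(protein.values())
    m, s = mean(values), stdev(values)
    high = {k for k, v in protein.items() if v > m + s}
    low = {k for k, v in protein.items() if v < m - s}
    return high, low


def fourfold_table(alleles: dict, high: set, low: set, i: str) -> tuple:
    """Return (a11, a12, a21, a22): rows are i / non-i alleles,
    columns are high / low protein, as in the note to Table 2."""
    a11 = sum(1 for k in high if alleles[k] == i)
    a21 = len(high) - a11
    a12 = sum(1 for k in low if alleles[k] == i)
    a22 = len(low) - a12
    return a11, a12, a21, a22


if __name__ == "__main__":
    # Made-up protein contents (%) and alleles at one SNP locus.
    protein = {"s1": 38.0, "s2": 40.0, "s3": 42.0, "s4": 44.0, "s5": 46.0}
    alleles = {"s1": "A", "s2": "A", "s3": "G", "s4": "G", "s5": "G"}
    high, low = classify_extremes(protein)
    print(high, low, fourfold_table(alleles, high, low, "G"))
```

Materials between the two critical values are deliberately left out of both sets, matching the text's use of only the two extremes.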
Figure GDA0003546893820000042
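For illustration, the test on the four-fold table can be sketched in Python. The patent's own formula is supplied only as an image, so the sketch assumes the standard Pearson χ² statistic for a 2x2 table without continuity correction, χ² = n(a11·a22 - a12·a21)² / (R1·R2·C1·C2), and compares it with the critical value for 1 degree of freedom at α=0.001 (10.828). The function names are illustrative.

```python
# Critical value of the chi-square distribution, 1 degree of freedom, alpha = 0.001.
CHI2_CRIT_ALPHA_0001_DF1 = 10.828


def chi_square_2x2(a11: int, a12: int, a21: int, a22: int) -> float:
    """Pearson chi-square for a four-fold table (no continuity correction).
    Assumption: this is the form of the formula shown as an image in the text."""
    r1, r2 = a11 + a12, a21 + a22   # totals of i and non-i alleles
    c1, c2 = a11 + a21, a12 + a22   # totals of high- and low-protein materials
    n = r1 + r2
    denom = r1 * r2 * c1 * c2
    if denom == 0:
        raise ValueError("a row or column total is zero")
    return n * (a11 * a22 - a12 * a21) ** 2 / denom


def is_significant(a11: int, a12: int, a21: int, a22: int,
                   crit: float = CHI2_CRIT_ALPHA_0001_DF1) -> bool:
    """True when H0 (independence) is rejected at the chosen threshold."""
    return chi_square_2x2(a11, a12, a21, a22) >= crit


if __name__ == "__main__":
    print(chi_square_2x2(30, 10, 10, 30))   # 20.0
    print(is_significant(30, 10, 10, 30))   # True
    print(is_significant(20, 20, 20, 20))   # False
```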
(4) Repeat steps (2) and (3) to test every SNP locus for the protein trait independently and determine its specific effect on the trait. This experiment takes α=0.001 as the threshold: when the P value of the result is less than 0.001, the SNP locus is judged to be a significant locus affecting the trait and is studied further; when P is greater than or equal to 0.001, the locus is dropped from further study. (5) The significant loci selected in step (4) are further verified by phenotypic effect value; the phenotypic effect of each significant locus across all materials is calculated by the following formula:
Figure GDA0003546893820000043
Note: Rate of change denotes the effect value; the A allele is the allele of the SNP locus corresponding to the reference genome, and Value A is the average protein content of the samples carrying the A allele; the B allele is the allele of the SNP locus corresponding to the mutant genome, and Value B is the average protein content of the samples carrying the B allele.
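The formula itself is available only as an image, so the following Python sketch is a guess at its form rather than the patent's definition: it assumes the rate of change is the difference between Value B and Value A expressed as a percentage of Value A. The sign convention (mutant minus reference) matches the later description of FIG. 4; the denominator and the function name are assumptions.

```python
from statistics import mean


def rate_of_change(value_a: float, value_b: float) -> float:
    """Assumed effect value in percent: (Value B - Value A) / Value A * 100.
    Value A: mean protein content of samples with the reference allele.
    Value B: mean protein content of samples with the mutant allele."""
    if value_a == 0:
        raise ValueError("Value A must be non-zero")
    return (value_b - value_a) / value_a * 100.0


if __name__ == "__main__":
    # Made-up protein contents (%) for carriers of each allele.
    reference_carriers = [39.5, 40.0, 40.5]
    mutant_carriers = [40.5, 41.0, 41.5]
    print(rate_of_change(mean(reference_carriers), mean(mutant_carriers)))
```

A positive result means the mutant allele is associated with higher protein content under this assumed definition.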
Results: protein phenotypes at the two extremes, defined by the mean plus or minus one standard deviation, were combined with the sequencing results to mine important alleles. Chi-square analysis was applied with significance α=0.001 as the layering threshold. Loci with P less than 0.001 were regarded as SNP loci very significantly associated with protein; loci with P greater than 0.001 have comparatively small effects and little value for selection. In total, 7,404 loci were very significantly associated with protein, with the largest number, 1,537, on chromosome 4; the numbers on the other chromosomes are shown in FIG. 3. To mine the loci further, the allelic variation at the SNP loci was further restricted between the high and low phenotypes, and the very significant loci were compared with the MQTL results for soybean protein. After comparison, 147 associated SNP loci were obtained, of which the largest number, 48, lie on chromosome 14.
The key SNP loci obtained through the chi-square test and the comparison with MQTL differ in their effects: some have a positive effect on protein content and some a negative effect. If the average protein content of the materials carrying an allele is higher than the average protein content of all resources, that allele can increase protein content; conversely, if the average protein content of the materials carrying the mutant allele is lower than the average of all resources, that allele reduces protein content. The SNP loci have different effect values on protein content (FIG. 4). Points in the upper part of the plot indicate that the phenotypic effect value of the allele corresponding to the mutant genome minus that of the allele corresponding to the reference genome is positive, i.e. protein content increases after mutation; points in the lower part indicate that this difference is negative, i.e. protein content decreases after mutation.
On chromosome 1, 5 SNP loci at 50.84Mb-50.87Mb account for 63.17%-66.99% of high-protein materials, with a phenotypic effect rate of 1.56%-2.44%; on chromosome 3, 34 SNP loci at 36.79Mb-38.93Mb account for 60.19%-64.52% of high-protein materials, with a phenotypic effect rate of 1.92%-2.59%; on chromosome 5, 2 SNP loci at 40081658 and 40270960 account for 60.19% and 61.17% of high-protein materials respectively, each with a phenotypic effect rate of 2.43%; on chromosome 6, 37 SNP loci at 38.49Mb-47.89Mb account for 60.22%-73.79% of high-protein materials, with a phenotypic effect rate of 1.65%-2.52%; on chromosome 8, 5 SNP loci at 9.23Mb-9.24Mb account for 60.19%-64.08% of high-protein materials, with a phenotypic effect rate of 1.30%-1.71%; on chromosome 11, one SNP locus at 14918130 accounts for 60.19% of high-protein materials, with a phenotypic effect rate of 2.29%; on chromosome 13, one SNP locus at 1389378 accounts for 62.14% of high-protein materials, with a phenotypic effect rate of 2.51%; on chromosome 14, 48 SNP loci at 16.13Mb-16.66Mb account for 60.22%-77.67% of high-protein materials, with a phenotypic effect rate of 1.76%-2.69%; on chromosome 19, 4 SNP loci at 42.65Mb-42.75Mb account for 62.14%-65.59% of high-protein materials, with a phenotypic effect rate of 1.39%-1.76%; on chromosome 20, 9 SNP loci at 33.67Mb-33.99Mb account for 60.19%-62.14% of high-protein materials, with a phenotypic effect rate of 1.56%-2.29% (results in Table 3). The soybean chromosome information above is from the website https://phytozome.jgi.doe.gov/pz/portal.html#!info?alias=Org_Gmax.
TABLE 3 SNP loci associated with proteins after screening
Figure GDA0003546893820000051
4. Haplotype analysis of protein-related SNP loci: to determine the relationship between the protein-related SNP loci and protein content, haplotypes of the 147 variant loci were analyzed. Adjacent loci were grouped for joint analysis, giving 46 groups, each producing different haplotypes. The proportion of each haplotype among the 643 sequenced materials and its phenotype mean were calculated. The analysis showed that in 14 groups of loci the mean protein phenotypes of the high-protein elite haplotype and the low-protein haplotype differ considerably, so that protein content is well separated, as shown in FIG. 5.
Chromosome 1, positions 50836411, 50838581, 50840858, 50854308 and 50861576 (numbers denote positions on the chromosome): the high-protein elite haplotype is Hap_1 (TCCCC) and the low-protein haplotype is Hap_4 (TCCCA); the elite haplotype accounts for 34.8% of materials with protein content mainly at about 41-48%, and the low-protein haplotype accounts for 1.5% with protein content mainly at 40-43%, a clear difference.
Chromosome 3, positions 36911977, 36956744, 36976313 and 37015622: the high-protein elite haplotype is Hap_1 (TACTCATATTAC) and the low-protein haplotype is Hap_35 (CAAATGAGCCGA); the elite haplotype accounts for 36.12% with protein content mainly at about 41-48%, and the low-protein haplotype accounts for 1.17% with protein content mainly at 40-42%, a clear difference.
Chromosome 3, position 38222729: the high-protein elite haplotype is Hap_4 (ACGCAATGTAGA) and the low-protein haplotype is Hap_102 (CGATGGTAACAT); the elite haplotype accounts for 42.31% with protein content mainly at about 41-47.5%, and the low-protein haplotype accounts for 1.17% with protein content mainly at 39-40%, a clear difference.
Chromosome 5, position 40081658: the high-protein elite haplotype is Hap_5 (TAGTTCCCTCTCA) and the low-protein haplotype is Hap_12 (TAATTCCCTCTCA); the elite haplotype accounts for 20.40% with protein content mainly at about 41-46%, and the low-protein haplotype accounts for 1.67% with protein content mainly at about 39-43%, a clear difference.
Chromosome 5, position 40270960: the high-protein elite haplotype is Hap_1 (CCAGGTTAGCCGA) and the low-protein haplotype is Hap_10 (CCGTGTTAGCCGA); the elite haplotype accounts for 23.91% with protein content mainly at about 41-48%, and the low-protein haplotype accounts for 1.00% with protein content mainly at about 36-40%, a clear difference.
Chromosome 6, positions 44869874, 45732460, 46313677, 46682433 and 47893908: the high-protein elite haplotype is Hap_5 (GTGGCGCCTG) and the low-protein haplotype is Hap_60 (ACTATACTCC); the elite haplotype accounts for 19.23% with protein content mainly at about 41-48%, and the low-protein haplotype accounts for 1.17% with protein content mainly at 38-42%, a clear difference.
Chromosome 11, position 14918130: the high-protein elite haplotype is Hap_3 (AAGTCAGTAGCAAATGGCA) and the low-protein haplotype is Hap_58 (TGACTCTGAAAGGGGTATG); the elite haplotype accounts for 39.46% with protein content mainly at about 41-48%, and the low-protein haplotype accounts for 1.17% with protein content mainly at about 38-42%, a clear difference.
Chromosome 13, position 13893781: the high-protein elite haplotype is Hap_1 (AATGGACAGGAGCA) and the low-protein haplotype is Hap_27 (AATGGACAGAAGCA); the elite haplotype accounts for 24.75% with protein content mainly at about 41-48%, and the low-protein haplotype accounts for 1.00% with protein content mainly at 40-41%, a clear difference.
5. Verification population: 151 non-sequenced accessions with extreme protein content from the soybean core germplasm of Northeast China (Table 4) were selected to verify the mined superior alleles. Planting, management, sampling and harvesting followed the same methods as for the experimental materials.
TABLE 4 Names and protein contents of the 151 non-sequenced soybean accessions with extreme protein content
Figure GDA0003546893820000061
Figure GDA0003546893820000071
The SNP marker screening method is as follows. The KASP reaction system consists of the primer mix, Master Mix and sample DNA. For each SNP site obtained from the layered evaluation, the 50 bp sequences upstream and downstream of the site were extracted by local BLAST, and KASP primers were designed with Primer 5.0 software. The primer set for each site consists of 2 allele-specific forward primers (F1/F2), each carrying a different fluorescent tag sequence, and 1 common reverse primer (R). The primer mix contains 46 μL ddH2O, 12 μL of each forward primer (100 μmol·L-1) and 30 μL of the reverse primer (100 μmol·L-1); Master Mix was obtained from LGC. Fluorescent tag FAM: GAAGGTGACCAAGTTCATGCT (SEQ ID NO. 1); fluorescent tag HEX: GAAGGTCGGAGTCAACGGATT (SEQ ID NO. 2). Primer sequence information is shown in Table 5.
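The primer layout described above can be illustrated with a short Python sketch. It assumes, based on inspection of SEQ ID NO. 15 and SEQ ID NO. 16 in the sequence listing, that each allele-specific forward primer is a fluorescent tag sequence followed by a shared stem and a single allele-specific 3' base. The function names and the `STEM_SNP1` constant are hypothetical, and the concentration helper is plain dilution arithmetic applied to the volumes given in the text.

```python
# Sketch: assemble KASP allele-specific forward primers and check the primer mix.
FAM_TAIL = "GAAGGTGACCAAGTTCATGCT"  # SEQ ID NO. 1
HEX_TAIL = "GAAGGTCGGAGTCAACGGATT"  # SEQ ID NO. 2

# Shared stem read off SEQ ID NO. 15/16 (assumption: everything between
# the tag sequence and the final allele-specific base).
STEM_SNP1 = "TTTCGTCCCAAAATTGGTT"


def kasp_forward_primers(stem, allele_fam, allele_hex):
    """Return (FAM-tagged primer, HEX-tagged primer) for one SNP site."""
    stem = stem.upper()
    return (
        FAM_TAIL + stem + allele_fam.upper(),
        HEX_TAIL + stem + allele_hex.upper(),
    )


def final_concentration(stock_um, volume_ul, total_ul):
    """Concentration (in the stock's unit) after dilution into total_ul."""
    return stock_um * volume_ul / total_ul


if __name__ == "__main__":
    f1, f2 = kasp_forward_primers(STEM_SNP1, "A", "C")
    print(f1)
    print(f2)
    total = 46 + 12 + 12 + 30  # ddH2O + F1 + F2 + R, in microlitres
    print(final_concentration(100, 12, total), final_concentration(100, 30, total))
```

With the stated volumes the mix totals 100 μL, so each forward primer is diluted to 12 μmol·L-1 and the reverse primer to 30 μmol·L-1.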
TABLE 5 primer sequence information
Figure GDA0003546893820000081
Figure GDA0003546893820000091
The components required for the KASP reaction were added to a 384-well plate and run on a Roche LightCycler 480 II real-time fluorescent quantitative PCR instrument; the end-point fluorescence signal was read after the reaction finished. The PCR amplification program was: step 1, 95 °C for 15 min; step 2, 95 °C for 20 s; step 3, 65 °C for 25 s, returning to step 2 for 10 cycles with the temperature lowered by 0.8 °C per cycle; step 4, 95 °C for 10 s; step 5, 57 °C for 1 min, returning to step 4 for 35 cycles; finally, hold at 4 °C.
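The touchdown stage of this program can be made concrete with a small Python sketch. It assumes that the first of the 10 cycles anneals at 65 °C and that the 0.8 °C decrement is applied before each subsequent cycle; the instrument's own convention may differ, and the function name is hypothetical.

```python
# Sketch: annealing temperature for each cycle of the touchdown stage.
def touchdown_annealing_temps(start_c=65.0, step_c=0.8, cycles=10):
    """Return the annealing temperature (in degrees C) of every touchdown cycle."""
    # Work in tenths of a degree so repeated subtraction does not drift.
    start_tenths = round(start_c * 10)
    step_tenths = round(step_c * 10)
    return [(start_tenths - step_tenths * i) / 10 for i in range(cycles)]


if __name__ == "__main__":
    for cycle, temp in enumerate(touchdown_annealing_temps(), start=1):
        print(f"touchdown cycle {cycle}: anneal at {temp} C")
```

Under this assumption the last touchdown cycle anneals at 57.8 °C, one further 0.8 °C step above the fixed 57 °C used in the following 35 cycles.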
KASP typing verification: the typing results obtained from the Roche LightCycler 480 II were exported to Excel for analysis, and the site coincidence rate was calculated. The basic idea is as follows: (1) For the non-sequenced soybean materials with extreme protein and oil contents, count, for the SNP site of each primer set, the number and distribution of the reference-genome allele and the variant allele in the high-protein and low-protein materials (or the high-oil and low-oil materials), and construct a fourfold table of coincidence rates as shown in Table 6:
TABLE 6 Fourfold table of coincidence rates
Figure GDA0003546893820000092
Note: x and y are the genotypes obtained by KASP typing with the primers designed for the SNP site; a is the number of x alleles in the typing results of the non-sequenced high-protein materials; b is the number of x alleles in the typing results of the non-sequenced low-protein materials; c is the number of y alleles in the typing results of the non-sequenced high-protein materials; d is the number of y alleles in the typing results of the non-sequenced low-protein materials; M is the total number of non-sequenced high-protein materials; N is the total number of non-sequenced low-protein materials.
(2) Null hypothesis H0: the trait content is independent of the x/y allele; alternative hypothesis HA: the 2 variables are associated. The coincidence rates P1 and P2 are obtained from the following formula. When P1 < Pα or P2 < Pα, H0 is accepted; when P1 ≥ Pα and P2 ≥ Pα, H0 is rejected and HA holds. The result for each primer set is therefore judged from the calculated P1 and P2 values against the threshold at α = 60%.
Figure GDA0003546893820000093
(3) Steps (1) and (2) are repeated so that the typing results of all primers for the protein trait are tested independently to verify their influence on the trait. All results obtained in step (3) then undergo further verification of phenotypic effect: the phenotypic effect of the superior site across all materials is calculated using the following formula:
Figure GDA0003546893820000094
Results: typing verification was carried out with 29 protein-related markers, and the 151 non-sequenced extreme-protein accessions were genotyped by KASP. In the end, 20 protein-related markers were typed successfully. FIG. 6 shows a schematic of the typing result for 1 successfully typed KASP marker, displaying the 2 homozygous genotypes (GG, AA) and the heterozygous genotype (AG). Analysis of the KASP verification results for the protein-related SNP markers shows the following. For marker Gm01_50861576, 54 high-protein materials carry the CC genotype and 42 low-protein materials carry the AA genotype, accounting for 70.13% of the high-protein materials and 56.76% of the low-protein materials, respectively; its phenotypic effect value is 3.31%. For marker Gm06_44869874, 73 high-protein materials carry the GG genotype and 57 low-protein materials carry the AA genotype, accounting for 94.81% and 77.03%, respectively; its phenotypic effect value is 8.34%. For marker Gm14_16525645, 41 high-protein materials carry the AA genotype and 51 low-protein materials carry the TT genotype, accounting for 53.25% and 68.92%, respectively; its phenotypic effect value is 2.24%. All 3 markers were typed successfully and show different genotypes in the high-protein and low-protein materials, so they were successfully developed as SNP markers.
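The reported percentages can be cross-checked with a short Python sketch. The coincidence-rate formula itself appears only as a figure, so the sketch assumes P1 = a/M and P2 = d/N, which reproduces every percentage reported above. The group sizes M = 77 and N = 74 are not stated in the text; they are inferred from the reported counts and percentages and sum to the 151 accessions. The function name is hypothetical.

```python
# Sketch: recompute the coincidence rates from the reported genotype counts.
M = 77  # inferred number of high-protein materials (assumption)
N = 74  # inferred number of low-protein materials (assumption)

# marker -> (count in high-protein group, count in low-protein group)
REPORTED_COUNTS = {
    "Gm01_50861576": (54, 42),
    "Gm06_44869874": (73, 57),
    "Gm14_16525645": (41, 51),
}


def coincidence_rates(a, d, m=M, n=N):
    """Return (P1, P2) as percentages rounded to two decimals (assumed P1=a/M, P2=d/N)."""
    return (round(100 * a / m, 2), round(100 * d / n, 2))


if __name__ == "__main__":
    for marker, (a, d) in REPORTED_COUNTS.items():
        p1, p2 = coincidence_rates(a, d)
        print(f"{marker}: P1 = {p1}%, P2 = {p2}%")
```

The sketch only recomputes the rates; it does not apply the α = 60% decision rule or the phenotypic effect formula, both of which depend on figures not reproduced in the text.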
A soybean high-protein-content related molecular marker, wherein the nucleotide sequence of the molecular marker is SNP1, the sequence of SNP1 being the nucleotide sequence at position 50.84 Mb-50.87 Mb on soybean chromosome 1, in which nucleotide site 50861576 of chromosome Gm01 is A or C;
A soybean high-protein-content related molecular marker, wherein the nucleotide sequence of the molecular marker is SNP2, the sequence of SNP2 being the nucleotide sequence at position 38.49 Mb-47.89 Mb on soybean chromosome 6, in which nucleotide site 44869874 of chromosome Gm06 is A or G;
A soybean high-protein-content related molecular marker, wherein the nucleotide sequence of the molecular marker is SNP3, the sequence of SNP3 being the nucleotide sequence at position 16.13 Mb-16.66 Mb on soybean chromosome 14, in which nucleotide site 16525645 of chromosome Gm14 is A or T.
Example 2.
1. A kit for screening high-protein soybeans, comprising:
(a) an upstream primer for amplifying SNP1, whose nucleotide sequence is shown as SEQ ID NO. 15 or SEQ ID NO. 16, and a downstream primer for amplifying SNP1, whose nucleotide sequence is shown as SEQ ID NO. 17;
(b) an upstream primer for amplifying SNP2, whose nucleotide sequence is shown as SEQ ID NO. 51 or SEQ ID NO. 52, and a downstream primer for amplifying SNP2, whose nucleotide sequence is shown as SEQ ID NO. 53;
(c) an upstream primer for amplifying SNP3, whose nucleotide sequence is shown as SEQ ID NO. 84 or SEQ ID NO. 85, and a downstream primer for amplifying SNP3, whose nucleotide sequence is shown as SEQ ID NO. 86.
The screening method is as follows: a sample of unknown soybean protein content is selected, and PCR amplification is performed with the kit for screening high-protein soybeans described in item 1, using the following program: step 1, 95 °C for 15 min; step 2, 95 °C for 20 s; step 3, 65 °C for 25 s, returning to step 2 for 10 cycles with the temperature lowered by 0.8 °C per cycle; step 4, 95 °C for 10 s; step 5, 57 °C for 1 min, returning to step 4 for 35 cycles; finally, hold at 4 °C. The steps following KASP analysis are as follows:
2. A method for identifying soybeans with high protein content, comprising the following specific steps:
(1) Extracting DNA from the soybean to be tested;
(2) Carrying out a PCR reaction with the primers for the SNP1 molecular marker: if the soybean variety to be tested is detected as the CC genotype, it is a high-protein-content soybean, and if it is the AA genotype, it is a low-protein-content soybean. Carrying out a PCR reaction with the primers for the SNP2 molecular marker: if the soybean variety to be tested is detected as the GG genotype, it is a high-protein-content soybean, and if it is the AA genotype, it is a low-protein-content soybean. Carrying out a PCR reaction with the primers for the SNP3 molecular marker: if the soybean variety to be tested is detected as the AA genotype, it is a high-protein-content soybean, and if it is the TT genotype, it is a low-protein-content soybean.
Results: in the samples of initially unknown protein content whose soybean protein content exceeded 42%, marker detection gave the CC genotype for SNP1, the GG genotype for SNP2 and the AA genotype for SNP3, so the high protein content of these soybeans is consistent with the genotypes obtained by marker detection. Likewise, the low protein content of soybeans is consistent with the genotypes detected by the markers.
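The genotype-to-phenotype rules in step (2) amount to a lookup table, which can be sketched in Python as follows. The function name and the "undetermined" label for heterozygous or unexpected genotypes are assumptions added for illustration; the text only specifies the homozygous calls.

```python
# Sketch: map a KASP genotype call for each marker to the predicted protein class.
HIGH = "high protein"
LOW = "low protein"

MARKER_CALLS = {
    "SNP1": {"CC": HIGH, "AA": LOW},  # Gm01_50861576
    "SNP2": {"GG": HIGH, "AA": LOW},  # Gm06_44869874
    "SNP3": {"AA": HIGH, "TT": LOW},  # Gm14_16525645
}


def call_protein_class(marker, genotype):
    """Return the predicted class for one marker, or 'undetermined' if no rule applies."""
    if marker not in MARKER_CALLS:
        raise KeyError(f"unknown marker: {marker}")
    return MARKER_CALLS[marker].get(genotype.upper(), "undetermined")


if __name__ == "__main__":
    sample = {"SNP1": "CC", "SNP2": "GG", "SNP3": "AA"}
    for marker, genotype in sample.items():
        print(marker, genotype, call_protein_class(marker, genotype))
```

Note that the same genotype string can mean opposite things at different markers (AA is the low-protein call at SNP1 and SNP2 but the high-protein call at SNP3), which is why the lookup is keyed by marker first.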
SEQUENCE LISTING
<110> Northeast Agricultural University
<120> A soybean high-protein-content related molecular marker located on chromosome 6 and a method for identifying high-protein-content soybean
<160> 119
<170> PatentIn version 3.5
<210> 1
<211> 21
<212> DNA
<213> Synthesis
<400> 1
gaaggtgacc aagttcatgc t 21
<210> 2
<211> 21
<212> DNA
<213> Synthesis
<400> 2
gaaggtcgga gtcaacggat t 21
<210> 3
<211> 46
<212> DNA
<213> Synthesis
<400> 3
gaaggtgacc aagttcatgc tcctgcttta gtttattgtt gacaaa 46
<210> 4
<211> 46
<212> DNA
<213> Synthesis
<400> 4
gaaggtcgga gtcaacggat tcctgcttta gtttattgtt gacaat 46
<210> 5
<211> 30
<212> DNA
<213> Synthesis
<400> 5
gaagtggaaa aagttatcag tgcttgacac 30
<210> 6
<211> 44
<212> DNA
<213> Synthesis
<400> 6
gaaggtgacc aagttcatgc ttgcagcttt aaaataccaa taat 44
<210> 7
<211> 44
<212> DNA
<213> Synthesis
<400> 7
gaaggtcgga gtcaacggat ttgcagcttt aaaataccaa taac 44
<210> 8
<211> 25
<212> DNA
<213> Synthesis
<400> 8
aaatcccatt tggactatat cagcg 25
<210> 9
<211> 41
<212> DNA
<213> Synthesis
<400> 9
gaaggtgacc aagttcatgc tttgaagaag agttttcaag t 41
<210> 10
<211> 41
<212> DNA
<213> Synthesis
<400> 10
gaaggtcgga gtcaacggat tttgaagaag agttttcaag c 41
<210> 11
<211> 23
<212> DNA
<213> Synthesis
<400> 11
tataaatacc ataccccatc acg 23
<210> 12
<211> 43
<212> DNA
<213> Synthesis
<400> 12
gaaggtgacc aagttcatgc ttcacccgag tatcttatat cat 43
<210> 13
<211> 43
<212> DNA
<213> Synthesis
<400> 13
gaaggtcgga gtcaacggat ttcacccgag tatcttatat cac 43
<210> 14
<211> 21
<212> DNA
<213> Synthesis
<400> 14
gaaacatgga gtgacttgtg g 21
<210> 15
<211> 41
<212> DNA
<213> Synthesis
<400> 15
gaaggtgacc aagttcatgc ttttcgtccc aaaattggtt a 41
<210> 16
<211> 41
<212> DNA
<213> Synthesis
<400> 16
gaaggtcgga gtcaacggat ttttcgtccc aaaattggtt c 41
<210> 17
<211> 23
<212> DNA
<213> Synthesis
<400> 17
ccttcttcac caaataccaa cca 23
<210> 18
<211> 40
<212> DNA
<213> Synthesis
<400> 18
gaaggtgacc aagttcatgc tgggttcaac atttccttgg 40
<210> 19
<211> 40
<212> DNA
<213> Synthesis
<400> 19
gaaggtcgga gtcaacggat tgggttcaac atttccttga 40
<210> 20
<211> 21
<212> DNA
<213> Synthesis
<400> 20
attggcagtc tgctgaggtc a 21
<210> 21
<211> 45
<212> DNA
<213> Synthesis
<400> 21
gaaggtgacc aagttcatgc tcaagtctgc ttaaaatgaa cacaa 45
<210> 22
<211> 45
<212> DNA
<213> Synthesis
<400> 22
gaaggtcgga gtcaacggat tcaagtctgc ttaaaatgaa cacat 45
<210> 23
<211> 23
<212> DNA
<213> Synthesis
<400> 23
agactcttgc attcaacagg gat 23
<210> 24
<211> 46
<212> DNA
<213> Synthesis
<400> 24
gaaggtgacc aagttcatgc taaacaagta aacatgccat attcat 46
<210> 25
<211> 46
<212> DNA
<213> Synthesis
<400> 25
gaaggtcgga gtcaacggat taaacaagta aacatgccat attcaa 46
<210> 26
<211> 22
<212> DNA
<213> Synthesis
<400> 26
cgaaattaat taggcatgca aa 22
<210> 27
<211> 46
<212> DNA
<213> Synthesis
<400> 27
gaaggtgacc aagttcatgc tgtcactgaa gctaggcgaa gcttgg 46
<210> 28
<211> 46
<212> DNA
<213> Synthesis
<400> 28
gaaggtcgga gtcaacggat tgtcactgaa gctaggcgaa gcttga 46
<210> 29
<211> 25
<212> DNA
<213> Synthesis
<400> 29
gtcactgaag ctaggcgaag cttgg 25
<210> 30
<211> 41
<212> DNA
<213> Synthesis
<400> 30
gaaggtgacc aagttcatgc ttcctcttct tcttcctgct c 41
<210> 31
<211> 41
<212> DNA
<213> Synthesis
<400> 31
gaaggtcgga gtcaacggat ttcctcttct tcttcctgct a 41
<210> 32
<211> 27
<212> DNA
<213> Synthesis
<400> 32
atgagacata cctggtacct ccgactc 27
<210> 33
<211> 43
<212> DNA
<213> Synthesis
<400> 33
gaaggtgacc aagttcatgc tttgaaatgg gaatcttcct ttg 43
<210> 34
<211> 43
<212> DNA
<213> Synthesis
<400> 34
gaaggtcgga gtcaacggat tttgaaatgg gaatcttcct ttc 43
<210> 35
<211> 30
<212> DNA
<213> Synthesis
<400> 35
ttatctcatt gataataatg caatcttcaa 30
<210> 36
<211> 41
<212> DNA
<213> Synthesis
<400> 36
gaaggtgacc aagttcatgc ttgttccatc aacatgacag a 41
<210> 37
<211> 41
<212> DNA
<213> Synthesis
<400> 37
gaaggtcgga gtcaacggat ttgttccatc aacatgacag c 41
<210> 38
<211> 28
<212> DNA
<213> Synthesis
<400> 38
agaaattata aaggtaaggg attgcatt 28
<210> 39
<211> 43
<212> DNA
<213> Synthesis
<400> 39
gaaggtgacc aagttcatgc taccaagaga caatgctgtc tca 43
<210> 40
<211> 43
<212> DNA
<213> Synthesis
<400> 40
gaaggtcgga gtcaacggat taccaagaga caatgctgtc tct 43
<210> 41
<211> 25
<212> DNA
<213> Synthesis
<400> 41
ttgagaggga tgaatgaaag agtgt 25
<210> 42
<211> 44
<212> DNA
<213> Synthesis
<400> 42
gaaggtgacc aagttcatgc taaaaaaaag tgattcaaga ttaa 44
<210> 43
<211> 44
<212> DNA
<213> Synthesis
<400> 43
gaaggtcgga gtcaacggat taaaaaaaag tgattcaaga ttaa 44
<210> 44
<211> 23
<212> DNA
<213> Synthesis
<400> 44
tgaggggaag aggggttaga gtt 23
<210> 45
<211> 44
<212> DNA
<213> Synthesis
<400> 45
gaaggtgacc aagttcatgc taccatgatt ttgtctgggt atat 44
<210> 46
<211> 44
<212> DNA
<213> Synthesis
<400> 46
gaaggtcgga gtcaacggat taccatgatt ttgtctgggt ataa 44
<210> 47
<211> 27
<212> DNA
<213> Synthesis
<400> 47
ggaaattgaa gcactacaaa atgataa 27
<210> 48
<211> 42
<212> DNA
<213> Synthesis
<400> 48
gaaggtgacc aagttcatgc tattcattaa aaagcctggt ct 42
<210> 49
<211> 42
<212> DNA
<213> Synthesis
<400> 49
gaaggtcgga gtcaacggat tattcattaa aaagcctggt cc 42
<210> 50
<211> 27
<212> DNA
<213> Synthesis
<400> 50
caaggactgg taaagcttga gactcta 27
<210> 51
<211> 40
<212> DNA
<213> Synthesis
<400> 51
gaaggtgacc aagttcatgc tcccgaaatt tctcttggga 40
<210> 52
<211> 40
<212> DNA
<213> Synthesis
<400> 52
gaaggtcgga gtcaacggat tcccgaaatt tctcttgggg 40
<210> 53
<211> 26
<212> DNA
<213> Synthesis
<400> 53
tgttcctatc atcgcataaa actcag 26
<210> 54
<211> 44
<212> DNA
<213> Synthesis
<400> 54
gaaggtgacc aagttcatgc tgggagataa gaaagctaat attt 44
<210> 55
<211> 44
<212> DNA
<213> Synthesis
<400> 55
gaaggtcgga gtcaacggat tgggagataa gaaagctaat attc 44
<210> 56
<211> 26
<212> DNA
<213> Synthesis
<400> 56
catatttgag acagggacag tcgaag 26
<210> 57
<211> 41
<212> DNA
<213> Synthesis
<400> 57
gaaggtgacc aagttcatgc ttcttcagtc cctcctttga c 41
<210> 58
<211> 41
<212> DNA
<213> Synthesis
<400> 58
gaaggtcgga gtcaacggat ttcttcagtc cctcctttga t 41
<210> 59
<211> 27
<212> DNA
<213> Synthesis
<400> 59
gtctctacac aatgccacaa cactaat 27
<210> 60
<211> 41
<212> DNA
<213> Synthesis
<400> 60
gaaggtgacc aagttcatgc tcaacgagag tcaaatcgct c 41
<210> 61
<211> 41
<212> DNA
<213> Synthesis
<400> 61
gaaggtcgga gtcaacggat tcaacgagag tcaaatcgct a 41
<210> 62
<211> 29
<212> DNA
<213> Synthesis
<400> 62
ggtttaatcg ttttctccga gagtagtta 29
<210> 63
<211> 43
<212> DNA
<213> Synthesis
<400> 63
gaaggtgacc aagttcatgc tcctcctagg aaaccaatgt tac 43
<210> 64
<211> 43
<212> DNA
<213> Synthesis
<400> 64
gaaggtcgga gtcaacggat tcctcctagg aaaccaatgt tag 43
<210> 65
<211> 30
<212> DNA
<213> Synthesis
<400> 65
acattaaatc atagagcaaa agagggatat 30
<210> 66
<211> 40
<212> DNA
<213> Synthesis
<400> 66
gaaggtgacc aagttcatgc tctcaccgta cgaagcttct 40
<210> 67
<211> 40
<212> DNA
<213> Synthesis
<400> 67
gaaggtcgga gtcaacggat tctcaccgta cgaagcttcc 40
<210> 68
<211> 25
<212> DNA
<213> Synthesis
<400> 68
gtacggcaag tgacaaactg acagc 25
<210> 69
<211> 40
<212> DNA
<213> Synthesis
<400> 69
gaaggtgacc aagttcatgc tcttgatgag tattttgata 40
<210> 70
<211> 40
<212> DNA
<213> Synthesis
<400> 70
gaaggtcgga gtcaacggat tcttgatgag tattttgatt 40
<210> 71
<211> 23
<212> DNA
<213> Synthesis
<400> 71
tattggggtg gtcactagca tta 23
<210> 72
<211> 42
<212> DNA
<213> Synthesis
<400> 72
gaaggtgacc aagttcatgc tatgcttaag gatagtgatg gc 42
<210> 73
<211> 42
<212> DNA
<213> Synthesis
<400> 73
gaaggtcgga gtcaacggat tatgcttaag gatagtgatg ga 42
<210> 74
<211> 28
<212> DNA
<213> Synthesis
<400> 74
aatttggtga ccatagtctc caacttta 28
<210> 75
<211> 40
<212> DNA
<213> Synthesis
<400> 75
gaaggtgacc aagttcatgc tagaacaggg gaaaggaatt 40
<210> 76
<211> 40
<212> DNA
<213> Synthesis
<400> 76
gaaggtcgga gtcaacggat tagaacaggg gaaaggaatg 40
<210> 77
<211> 27
<212> DNA
<213> Synthesis
<400> 77
actgttaaac ccttaagctc atcaatg 27
<210> 78
<211> 42
<212> DNA
<213> Synthesis
<400> 78
gaaggtgacc aagttcatgc ttctccattc tttgctactc at 42
<210> 79
<211> 42
<212> DNA
<213> Synthesis
<400> 79
gaaggtcgga gtcaacggat ttctccattc tttgctactc ac 42
<210> 80
<211> 28
<212> DNA
<213> Synthesis
<400> 80
cataatgaac aaataaaggg acaaggta 28
<210> 81
<211> 45
<212> DNA
<213> Synthesis
<400> 81
gaaggtgacc aagttcatgc tcaagtgaaa atttttttat ttaag 45
<210> 82
<211> 45
<212> DNA
<213> Synthesis
<400> 82
gaaggtcgga gtcaacggat tcaagtgaaa atttttttat ttaat 45
<210> 83
<211> 21
<212> DNA
<213> Synthesis
<400> 83
tttagtggga tcgacaggcc c 21
<210> 84
<211> 41
<212> DNA
<213> Synthesis
<400> 84
gaaggtgacc aagttcatgc tgtcaaggtc tttgaaacct a 41
<210> 85
<211> 41
<212> DNA
<213> Synthesis
<400> 85
gaaggtcgga gtcaacggat tgtcaaggtc tttgaaacct t 41
<210> 86
<211> 22
<212> DNA
<213> Synthesis
<400> 86
gcagctgatg caacctaatt ga 22
<210> 87
<211> 46
<212> DNA
<213> Synthesis
<400> 87
gaaggtgacc aagttcatgc tggtgctaag gcaatttgac catgtc 46
<210> 88
<211> 46
<212> DNA
<213> Synthesis
<400> 88
gaaggtcgga gtcaacggat tggtgctaag gcaatttgac catgtg 46
<210> 89
<211> 22
<212> DNA
<213> Synthesis
<400> 89
ataggacaag gatgttgttg gc 22
<210> 90
<211> 41
<212> DNA
<213> Synthesis
<400> 90
gaaggtgacc aagttcatgc tacgccaaaa atagtaaaat g 41
<210> 91
<211> 41
<212> DNA
<213> Synthesis
<400> 91
gaaggtcgga gtcaacggat tacgccaaaa atagtaaaat a 41
<210> 92
<211> 25
<212> DNA
<213> Synthesis
<400> 92
ggggaggaaa taaagggtgt tgtgt 25
<210> 93
<211> 41
<212> DNA
<213> Synthesis
<400> 93
gaaggtgacc aagttcatgc tggtttatgt tcaggccaat g 41
<210> 94
<211> 41
<212> DNA
<213> Synthesis
<400> 94
gaaggtcgga gtcaacggat tggtttatgt tcaggccaat a 41
<210> 95
<211> 24
<212> DNA
<213> Synthesis
<400> 95
tctccccagt caaaaggtaa cctc 24
<210> 96
<211> 43
<212> DNA
<213> Synthesis
<400> 96
gaaggtgacc aagttcatgc tgcattgttc atttgttagc ttc 43
<210> 97
<211> 43
<212> DNA
<213> Synthesis
<400> 97
gaaggtcgga gtcaacggat tgcattgttc atttgttagc ttt 43
<210> 98
<211> 22
<212> DNA
<213> Synthesis
<400> 98
gtgaaccaac aataaccaag gc 22
<210> 99
<211> 44
<212> DNA
<213> Synthesis
<400> 99
gaaggtgacc aagttcatgc tgctgtgagg aacctaacac aacc 44
<210> 100
<211> 44
<212> DNA
<213> Synthesis
<400> 100
gaaggtcgga gtcaacggat tgctgtgagg aacctaacac aact 44
<210> 101
<211> 22
<212> DNA
<213> Synthesis
<400> 101
gttgcatagt tggtccaaat cc 22
<210> 102
<211> 41
<212> DNA
<213> Synthesis
<400> 102
gaaggtgacc aagttcatgc tacaagttgc caaagaattg t 41
<210> 103
<211> 41
<212> DNA
<213> Synthesis
<400> 103
gaaggtcgga gtcaacggat tacaagttgc caaagaattg a 41
<210> 104
<211> 26
<212> DNA
<213> Synthesis
<400> 104
ggcaacgcca tgaataactt acctta 26
<210> 105
<211> 46
<212> DNA
<213> Synthesis
<400> 105
gaaggtgacc aagttcatgc tctactagag tttcaaagca ttagaa 46
<210> 106
<211> 46
<212> DNA
<213> Synthesis
<400> 106
gaaggtcgga gtcaacggat tctactagag tttcaaagca ttagag 46
<210> 107
<211> 24
<212> DNA
<213> Synthesis
<400> 107
atggagacag tgaaattgag gctc 24
<210> 108
<211> 42
<212> DNA
<213> Synthesis
<400> 108
gaaggtgacc aagttcatgc tcctagtact atgatatgga cg 42
<210> 109
<211> 42
<212> DNA
<213> Synthesis
<400> 109
gaaggtcgga gtcaacggat tcctagtact atgatatgga ca 42
<210> 110
<211> 30
<212> DNA
<213> Synthesis
<400> 110
taggtatttc attggatatg ccaaaacgtc 30
<210> 111
<211> 40
<212> DNA
<213> Synthesis
<400> 111
gaaggtgacc aagttcatgc tgagagatac aagacaagac 40
<210> 112
<211> 40
<212> DNA
<213> Synthesis
<400> 112
gaaggtcgga gtcaacggat tgagagatac aagacaagaa 40
<210> 113
<211> 30
<212> DNA
<213> Synthesis
<400> 113
gtctttatgt aatcaattgc ttctttttga 30
<210> 114
<211> 46
<212> DNA
<213> Synthesis
<400> 114
gaaggtgacc aagttcatgc tcctggattc gttagccgtt ggattg 46
<210> 115
<211> 46
<212> DNA
<213> Synthesis
<400> 115
gaaggtcgga gtcaacggat tcctggattc gttagccgtt ggatta 46
<210> 116
<211> 22
<212> DNA
<213> Synthesis
<400> 116
gcacaaatga atcttgaacc ac 22
<210> 117
<211> 40
<212> DNA
<213> Synthesis
<400> 117
gaaggtgacc aagttcatgc tcgaccacaa aaaatgaggc 40
<210> 118
<211> 40
<212> DNA
<213> Synthesis
<400> 118
gaaggtcgga gtcaacggat tcgaccacaa aaaatgagga 40
<210> 119
<211> 24
<212> DNA
<213> Synthesis
<400> 119
acactatttt tcttattttt cccg 24

Claims (2)

1. Use of a primer for amplifying a soybean high protein content related molecular marker located on chromosome 6 for preparing a kit for identifying soybean with high protein content, wherein the nucleotide sequence of the molecular marker is SNP2, the sequence of the SNP2 is the nucleotide sequence of 38.49Mb-47.89Mb position on chromosome 6 of soybean, and the 44869874 nucleotide site of chromosome Gm06 is A or G;
the nucleotide sequence of the upstream primer of the SNP2 is shown as SEQ ID NO.51, the nucleotide sequence of the downstream primer of the SNP2 is shown as SEQ ID NO. 52.
2. A method for identifying soybeans with high protein content, which is characterized by comprising the following specific steps:
(1) Extracting DNA of soybean to be detected;
(2) Carrying out a PCR reaction using the primers shown as SEQ ID NO. 51, SEQ ID NO. 52 and SEQ ID NO. 53, wherein if the soybean variety to be tested is detected as the GG genotype, it is a high-protein-content soybean, and if it is the AA genotype, it is a low-protein-content soybean.
CN202210043653.2A 2021-05-27 2021-05-27 Soybean high-protein content related molecular marker located on chromosome 6 and method for identifying soybean high-protein content Active CN114395640B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210043653.2A CN114395640B (en) 2021-05-27 2021-05-27 Soybean high-protein content related molecular marker located on chromosome 6 and method for identifying soybean high-protein content

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110583739.XA CN113322339B (en) 2021-05-27 2021-05-27 Molecular marker related to high protein content of soybean and method for identifying soybean with high protein content
CN202210043653.2A CN114395640B (en) 2021-05-27 2021-05-27 Soybean high-protein content related molecular marker located on chromosome 6 and method for identifying soybean high-protein content

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN202110583739.XA Division CN113322339B (en) 2021-05-27 2021-05-27 Molecular marker related to high protein content of soybean and method for identifying soybean with high protein content

Publications (2)

Publication Number Publication Date
CN114395640A CN114395640A (en) 2022-04-26
CN114395640B true CN114395640B (en) 2023-05-12

Family

ID=77421607

Family Applications (3)

Application Number Title Priority Date Filing Date
CN202210043653.2A Active CN114395640B (en) 2021-05-27 2021-05-27 Soybean high-protein content related molecular marker located on chromosome 6 and method for identifying soybean high-protein content
CN202110583739.XA Active CN113322339B (en) 2021-05-27 2021-05-27 Molecular marker related to high protein content of soybean and method for identifying soybean with high protein content
CN202210045059.7A Active CN114182045B (en) 2021-05-27 2021-05-27 Soybean high-protein content related molecular marker located on chromosome 14 and method for identifying soybean high-protein content

Country Status (1)

Country Link
CN (3) CN114395640B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114317798B (en) * 2021-12-21 2023-03-14 河北省农林科学院粮油作物研究所 Molecular marker related to soybean protein content and application thereof
CN114107551B (en) * 2021-12-21 2023-03-14 河北省农林科学院粮油作物研究所 Molecular marker for identifying or assisting in identifying soybean protein content and application thereof
WO2023126875A1 (en) * 2021-12-29 2023-07-06 Benson Hill, Inc. Compositions and methods for producing high-protein soybean plants
CN116287423B (en) * 2023-05-17 2023-08-04 黑龙江省农业科学院农产品质量安全研究所 SNP molecular marker related to corn kernel oil content and application thereof
CN116622888B (en) * 2023-06-02 2024-02-20 江苏省农业科学院 KASP (KASP-related protein) mark related to soybean glutamic acid and application thereof

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103045588B (en) * 2012-12-11 2014-08-20 南京农业大学 Molecular marker of major QTL (Quantitative Trait Locus) of soybean seed protein content and application thereof
CN105925722B (en) * 2016-07-11 2020-02-14 东北农业大学 QTL related to soybean protein content, method for obtaining molecular marker, molecular marker and application
CN108165659A (en) * 2018-03-13 2018-06-15 山东省农业科学院作物研究所 A kind of molecule labelling method for improving soybean protein content and its label combination
CN109486993A (en) * 2018-12-06 2019-03-19 江苏沿海地区农业科学研究所 A kind of selection of high-protein soybean germplasm

Also Published As

Publication number Publication date
CN114182045B (en) 2023-08-18
CN114395640A (en) 2022-04-26
CN114182045A (en) 2022-03-15
CN113322339B (en) 2022-02-22
CN113322339A (en) 2021-08-31

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant