CN108445035B - Method for identifying corn haploid grains based on nuclear magnetic resonance CPMG attenuation curve - Google Patents

Info

Publication number
CN108445035B
Authority
CN
China
Prior art keywords
haploid
corn
attenuation curve
nuclear magnetic
cpmg
Prior art date
Legal status
Active
Application number
CN201810377928.XA
Other languages
Chinese (zh)
Other versions
CN108445035A (en)
Inventor
陈绍江
李金龙
李伟
焦炎炎
张俊稳
陈琛
陈明
刘晨旭
田小龙
钟裕
祁晓龙
王鼎昌
Current Assignee
China Agricultural University
Original Assignee
China Agricultural University
Priority date
Filing date
Publication date
Application filed by China Agricultural University filed Critical China Agricultural University
Priority to CN201810377928.XA priority Critical patent/CN108445035B/en
Publication of CN108445035A publication Critical patent/CN108445035A/en
Application granted granted Critical
Publication of CN108445035B publication Critical patent/CN108445035B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N24/00 Investigating or analyzing materials by the use of nuclear magnetic resonance, electron paramagnetic resonance or other spin effects
    • G01N24/08 Investigating or analyzing materials by the use of nuclear magnetic resonance, electron paramagnetic resonance or other spin effects by using nuclear magnetic resonance

Landscapes

  • Physics & Mathematics (AREA)
  • High Energy & Nuclear Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Immunology (AREA)
  • Pathology (AREA)
  • Breeding Of Plants And Reproduction By Means Of Culturing (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention discloses a method for identifying corn haploid kernels based on a nuclear magnetic resonance CPMG attenuation curve. The method comprises the following steps: (1) collecting the nuclear magnetic signal of each corn kernel in a training set to obtain a mass-normalized CPMG attenuation curve for each kernel; (2) applying data processing to the 0-600 ms section of each curve, then performing principal component analysis, and then constructing a haploid identification model; (3) collecting the nuclear magnetic signal of a corn kernel to be tested and obtaining its mass-normalized CPMG attenuation curve; (4) applying data processing to the 0-600 ms section of that curve, then performing principal component analysis, and substituting the result into the haploid identification model, which outputs the prediction. The method can be used for automatic identification and is important for advancing the engineering-scale application of corn haploid breeding technology. It is simple, feasible, rapid and efficient, is broadly applicable, and has great value for application and popularization.

Description

Method for identifying corn haploid grains based on nuclear magnetic resonance CPMG attenuation curve
Technical Field
The invention relates to the field of identification of corn haploid grains, in particular to a method for identifying corn haploid grains based on a nuclear magnetic resonance CPMG attenuation curve.
Background
Corn is native to Central and South America and was introduced into China more than 400 years ago. Owing to its high yield, wide range of uses, strong adaptability and rapidly expanding cultivation area, it has become the leading crop in China. Corn is also the most highly commercialized crop, and commercial operation requires seed companies to keep pace with variety turnover and to breed market-ready varieties more quickly. Doubled haploid (DH) breeding technology can reduce the time needed for line selection, shorten the breeding cycle and improve breeding efficiency.
Corn haploid technology is a breeding technology that lends itself to engineering-scale operation. It comprises four stages: preparation of base materials, haploid production, haploid doubling, and management and application of doubled haploid (DH) lines. Haploid production comprises two key steps, haploid induction and haploid identification. When an inducer line is used as the male parent and crossed with the base material, haploids arise among the progeny at a certain frequency. At present the induction rate of haploid inducer lines is only 2%-15%, so only a small fraction of the kernels from an induction cross are haploid and most are heterozygous diploids. Rapid and accurate identification of haploids among a large number of induced kernels is therefore very important.
Many identification methods currently exist. According to the time of identification, they can be divided into the kernel development stage, the kernel stage and the post-kernel stage. Identification at the kernel development stage relies mainly on tissue culture and is carried out after pollination according to color development or the presence of fluorescence. The common method at the kernel stage is based on kernel color expression. Identification at the post-kernel stage is carried out after the induced kernels have been planted and grown into plants, and is commonly based on plant morphology: because haploids and heterozygous diploids differ in ploidy, their plants differ in form. Haploid plants are short, have long, narrow leaves and are mostly sterile.
Although the identification methods differ in accuracy, efficiency and cost are the first considerations in large-scale engineered haploid breeding. Identification at the kernel development stage requires tissue culture and therefore an industrial-scale tissue culture laboratory, and it is time-limited (it can only be carried out within a certain number of days after pollination). Identification at the post-kernel stage requires the kernels to be planted in seedling pots or in the field, which occupies a large amount of land, and the subsequent seedling transplanting (seedling pots) and roguing (field) are tedious. By comparison, identification at the kernel stage is more flexible in timing than identification at the kernel development stage, and it saves labor and materials relative to the post-kernel stage. The kernel stage is therefore a good time to identify haploids.
The R1-nj color marker system is currently the most widely used method for identifying corn haploids at the mature kernel stage. It was proposed by Nanda and Chase in 1966. Because a haploid embryo contains only maternal chromosomes, its embryo color differs from that of a diploid, so haploids can be identified at the kernel stage by color alone. The advantages of the R1-nj color marker system are that it requires little technical expertise and is simple and quick. Its disadvantages are as follows: some germplasm carries dominant inhibitor genes such as C1-I, and the clarity of R1-nj expression varies greatly among kernels; and because selection is manual, visual fatigue sets in after long periods of work and identification accuracy differs between workers.
Disclosure of Invention
The invention aims to provide a method for identifying corn haploid grains based on a nuclear magnetic resonance CPMG attenuation curve.
The first method for identifying corn haploids comprises the following steps:
(1) collecting the nuclear magnetic signal of each corn kernel in a training set to obtain a CPMG attenuation curve for each kernel, and then normalizing the data by dividing the amplitude at each time point by the kernel weight, to obtain a mass-normalized CPMG attenuation curve for each kernel; the training set consists of a plurality of corn kernels, some of which are true haploids and the rest of which are true diploids;
(2) applying data processing to the 0-600 ms section of the mass-normalized CPMG attenuation curves obtained in step (1), then performing principal component analysis, and then constructing a haploid identification model;
(3) collecting the nuclear magnetic signal of a corn kernel to be tested to obtain its CPMG attenuation curve, and normalizing the data by dividing the amplitude at each time point by the kernel weight, to obtain a mass-normalized CPMG attenuation curve;
(4) applying data processing to the 0-600 ms section of the mass-normalized CPMG attenuation curve obtained in step (3), then performing principal component analysis, and then substituting the result into the haploid identification model constructed in step (2); the model outputs whether the kernel to be tested is predicted to be a haploid or a diploid.
In steps (2) and (4), the data processing is smoothing. The smoothing may specifically be 10-point smoothing.
In steps (2) and (4), the number of principal components in the principal component analysis is 100.
In step (2), the algorithm used to construct the haploid identification model is a support vector machine algorithm.
The parameters of the support vector machine algorithm are as follows: sigma is 0.004976874 and the penalty factor C is 16.
In steps (2) and (4), the principal component analysis is carried out in the R language.
The nuclear magnetic signals are acquired using a nuclear magnetic resonance instrument and its accompanying nuclear magnetic resonance analysis software. The instrument may specifically be a MesoMR23-020H-I nuclear magnetic resonance instrument manufactured by Shanghai Niumag Electronic Technology Co., Ltd. The acquisition may specifically use the CPMG pulse sequence of the analysis software. The acquisition parameters are specifically: TW = 800 ms, TE = 0.600 ms, NS = 16.
In steps (1) and (3), the software used for mass normalization may specifically be Microsoft Excel 2016 MSO (32-bit).
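The text gives the support vector machine parameters (sigma and the penalty factor C) but does not name the kernel. A sigma parameter usually points to a Gaussian radial basis function (RBF) kernel, so the minimal sketch below assumes the form k(x, y) = exp(-sigma * ||x - y||^2), which is the parametrization used by the R package kernlab. The function name `rbf_kernel` and the example score vectors are illustrative and do not come from the source.

```python
import math

SIGMA = 0.004976874  # kernel parameter stated in the text
C = 16               # penalty factor stated in the text (used by the SVM solver, not by the kernel)


def rbf_kernel(x, y, sigma=SIGMA):
    """Gaussian RBF kernel in the assumed form exp(-sigma * ||x - y||^2)."""
    if len(x) != len(y):
        raise ValueError("both score vectors must have the same length")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sigma * squared_distance)


# Two hypothetical kernels, each described by three principal component scores.
kernel_a = [1.0, -2.0, 0.5]
kernel_b = [0.0, -1.0, 0.5]
similarity = rbf_kernel(kernel_a, kernel_b)  # squared distance is 2.0
```

Under this assumed form, a small sigma makes the similarity fall off slowly with distance in principal component space, while C sets how strongly misclassified training kernels are penalized.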
The second method for identifying corn haploids comprises the following steps:
(1) collecting the nuclear magnetic signal of each corn kernel in a training set to obtain a CPMG attenuation curve for each kernel, and then normalizing the data by dividing the amplitude at each time point by the kernel weight, to obtain a mass-normalized CPMG attenuation curve for each kernel; the training set consists of a plurality of corn kernels, some of which are true haploids and the rest of which are true diploids;
(2) applying data processing to the mass-normalized CPMG attenuation curves obtained in step (1), then performing principal component analysis, and then constructing a haploid identification model;
(3) collecting the nuclear magnetic signal of a corn kernel to be tested to obtain its CPMG attenuation curve, and normalizing the data by dividing the amplitude at each time point by the kernel weight, to obtain a mass-normalized CPMG attenuation curve;
(4) applying data processing to the mass-normalized CPMG attenuation curve obtained in step (3), then performing principal component analysis, and then substituting the result into the haploid identification model constructed in step (2); the model outputs whether the kernel to be tested is predicted to be a haploid or a diploid.
In steps (2) and (4), the data processing is smoothing. The smoothing may specifically be 10-point smoothing.
In steps (2) and (4), the number of principal components in the principal component analysis is 100.
In step (2), the algorithm used to construct the haploid identification model is a support vector machine algorithm.
The parameters of the support vector machine algorithm are as follows: sigma is 0.004976874 and the penalty factor C is 16.
In steps (2) and (4), the principal component analysis is carried out in the R language.
The nuclear magnetic signals are acquired using a nuclear magnetic resonance instrument and its accompanying nuclear magnetic resonance analysis software. The instrument may specifically be a MesoMR23-020H-I nuclear magnetic resonance instrument manufactured by Shanghai Niumag Electronic Technology Co., Ltd. The acquisition may specifically use the CPMG pulse sequence of the analysis software. The acquisition parameters are specifically: TW = 800 ms, TE = 0.600 ms, NS = 16.
In steps (1) and (3), the software used for mass normalization may specifically be Microsoft Excel 2016 MSO (32-bit).
The third method for identifying corn haploids comprises the following steps:
(1) collecting the nuclear magnetic signal of each corn kernel in a training set to obtain a CPMG attenuation curve for each kernel, and then normalizing the data by dividing the amplitude at each time point by the kernel weight, to obtain a mass-normalized CPMG attenuation curve for each kernel; the training set consists of a plurality of corn kernels, some of which are true haploids and the rest of which are true diploids;
(2) performing principal component analysis on the 0-600 ms section of the mass-normalized CPMG attenuation curves obtained in step (1), and then constructing a haploid identification model;
(3) collecting the nuclear magnetic signal of a corn kernel to be tested to obtain its CPMG attenuation curve, and normalizing the data by dividing the amplitude at each time point by the kernel weight, to obtain a mass-normalized CPMG attenuation curve;
(4) performing principal component analysis on the 0-600 ms section of the mass-normalized CPMG attenuation curve obtained in step (3), and then substituting the result into the haploid identification model constructed in step (2); the model outputs whether the kernel to be tested is predicted to be a haploid or a diploid.
In steps (2) and (4), the number of principal components in the principal component analysis is 100.
In step (2), the algorithm used to construct the haploid identification model is a support vector machine algorithm.
The parameters of the support vector machine algorithm are as follows: sigma is 0.004976874 and the penalty factor C is 16.
In steps (2) and (4), the principal component analysis is carried out in the R language.
The nuclear magnetic signals are acquired using a nuclear magnetic resonance instrument and its accompanying nuclear magnetic resonance analysis software. The instrument may specifically be a MesoMR23-020H-I nuclear magnetic resonance instrument manufactured by Shanghai Niumag Electronic Technology Co., Ltd. The acquisition may specifically use the CPMG pulse sequence of the analysis software. The acquisition parameters are specifically: TW = 800 ms, TE = 0.600 ms, NS = 16.
In steps (1) and (3), the software used for mass normalization may specifically be Microsoft Excel 2016 MSO (32-bit).
The fourth method for identifying corn haploids comprises the following steps:
(1) collecting the nuclear magnetic signal of each corn kernel in a training set to obtain a CPMG attenuation curve for each kernel, and then normalizing the data by dividing the amplitude at each time point by the kernel weight, to obtain a mass-normalized CPMG attenuation curve for each kernel; the training set consists of a plurality of corn kernels, some of which are true haploids and the rest of which are true diploids;
(2) performing principal component analysis on the mass-normalized CPMG attenuation curves obtained in step (1), and then constructing a haploid identification model;
(3) collecting the nuclear magnetic signal of a corn kernel to be tested to obtain its CPMG attenuation curve, and normalizing the data by dividing the amplitude at each time point by the kernel weight, to obtain a mass-normalized CPMG attenuation curve;
(4) performing principal component analysis on the mass-normalized CPMG attenuation curve obtained in step (3), and then substituting the result into the haploid identification model constructed in step (2); the model outputs whether the kernel to be tested is predicted to be a haploid or a diploid.
In steps (2) and (4), the number of principal components in the principal component analysis is 100.
In step (2), the algorithm used to construct the haploid identification model is a support vector machine algorithm.
The parameters of the support vector machine algorithm are as follows: sigma is 0.004976874 and the penalty factor C is 16.
In steps (2) and (4), the principal component analysis is carried out in the R language.
The nuclear magnetic signals are acquired using a nuclear magnetic resonance instrument and its accompanying nuclear magnetic resonance analysis software. The instrument may specifically be a MesoMR23-020H-I nuclear magnetic resonance instrument manufactured by Shanghai Niumag Electronic Technology Co., Ltd. The acquisition may specifically use the CPMG pulse sequence of the analysis software. The acquisition parameters are specifically: TW = 800 ms, TE = 0.600 ms, NS = 16.
In steps (1) and (3), the software used for mass normalization may specifically be Microsoft Excel 2016 MSO (32-bit).
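The four methods share the same acquisition, normalization, principal component analysis and classifier settings and differ only in two preprocessing choices: whether smoothing is applied, and whether the curve is restricted to the 0-600 ms section. The sketch below records that comparison as a small lookup table. The dictionary layout and the `describe` helper are hypothetical conveniences, not part of the source.

```python
# The four method variants differ only in two preprocessing choices.
METHOD_VARIANTS = {
    "first":  {"smoothing": True,  "window_ms": (0, 600)},
    "second": {"smoothing": True,  "window_ms": None},   # full curve
    "third":  {"smoothing": False, "window_ms": (0, 600)},
    "fourth": {"smoothing": False, "window_ms": None},   # full curve
}

# Settings shared by all four variants, as stated in the text.
SHARED_SETTINGS = {
    "n_principal_components": 100,
    "classifier": "support vector machine",
    "sigma": 0.004976874,
    "C": 16,
}


def describe(variant_name):
    """Return a one-line summary of the preprocessing used by a variant."""
    variant = METHOD_VARIANTS[variant_name]
    smoothing = "10-point smoothing" if variant["smoothing"] else "no smoothing"
    if variant["window_ms"] is None:
        window = "full curve"
    else:
        start, stop = variant["window_ms"]
        window = f"{start}-{stop} ms section"
    return f"{variant_name} method: {smoothing}, {window}"
```

For example, `describe("third")` summarizes the variant that skips smoothing but keeps the 0-600 ms restriction.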
The invention also claims the use of any one of the above methods in identifying corn haploids.
The invention also claims the use, in identifying corn haploids, of a nuclear magnetic resonance instrument and a carrier on which any one of the above methods is recorded. The instrument may specifically be a MesoMR23-020H-I nuclear magnetic resonance instrument manufactured by Shanghai Niumag Electronic Technology Co., Ltd.
The invention also provides a system for identifying corn haploids, comprising a nuclear magnetic resonance instrument and a carrier on which any one of the above methods is recorded. The instrument may specifically be a MesoMR23-020H-I nuclear magnetic resonance instrument manufactured by Shanghai Niumag Electronic Technology Co., Ltd.
Any diploid referred to above is a heterozygous diploid.
The corn kernels are mature kernels.
The true haploids are identified by field testing.
The true diploids are identified by field testing.
In any of the above methods, the corn kernels in the training set and the corn kernels to be tested belong to the same hybrid population.
In any of the above methods, the corn kernels in the training set are obtained by sampling from the hybrid population to which the corn kernels to be tested belong.
The hybrid population may specifically be the following: hybrid progeny (kernels) obtained by crossing a haploid inducer line with a hybrid corn. In the cross, the haploid inducer line serves as the male parent. The haploid inducer line is a non-high-oil inducer line.
The hybrid population may specifically be the following: n1 haploid inducer lines are each crossed with m1 hybrid corns, and the resulting hybrid progeny (kernels) are mixed. In each cross, the haploid inducer line serves as the male parent and the hybrid corn serves as the female parent. The haploid inducer line is a non-high-oil inducer line.
The hybrid population may specifically be the following: corn Zhengdan 958 as the female parent is crossed with corn haploid inducer line CAU3 as the male parent, and the hybrid progeny (kernels) form hybrid population A1; corn Zhengdan 958 as the female parent is crossed with corn haploid inducer line CAU4 as the male parent, and the hybrid progeny (kernels) form hybrid population A2; corn Zhengdan 958 as the female parent is crossed with corn haploid inducer line CAU5 as the male parent, and the hybrid progeny (kernels) form hybrid population A3; corn BM as the female parent is crossed with corn haploid inducer line CAU5 as the male parent, and the hybrid progeny (kernels) form hybrid population A4; corn Jingke 968 as the female parent is crossed with corn haploid inducer line CAU5 as the male parent, and the hybrid progeny (kernels) form hybrid population A5; the five hybrid populations are then mixed to give the hybrid population.
The method provided by the invention can be used for automatic identification and has an important effect on promoting the engineering of the corn haploid breeding technology. The method for identifying the corn haploid is simple, feasible, rapid and efficient, has universality and has great application and popularization values.
Drawings
Fig. 1 is a mass-normalized CPMG decay curve (full relaxation time) for each kernel.
Fig. 2 is a CPMG attenuation curve (0-600ms) after mass normalization of each kernel.
Fig. 3 is a graph of the CPMG decay curve normalized to the average mass of all true haploids versus the CPMG decay curve normalized to the average mass of all true diploids.
Detailed Description
The following examples are provided to aid understanding of the invention and do not limit it. Unless otherwise specified, the experimental procedures in the following examples are conventional, and the test materials were purchased from conventional biochemical reagent suppliers. All quantitative tests in the following examples were performed in triplicate and the results averaged.
Zhengdan 958, Jingke 968 and BM are hybrid corns. Corn inducer lines CAU3, CAU4 and CAU5 are corn haploid inducer lines (non-high-oil).
Corn inducer line CAU3 (also called "Nongda high inducing No. 3"): a conventional inducer line bred by the National Maize Improvement Center of China Agricultural University. Corn inducer line CAU4 (also called "Nongda high inducing No. 4"): a conventional inducer line bred by the National Maize Improvement Center of China Agricultural University. Corn inducer line CAU5 (also called "Nongda high inducing No. 5"): a conventional inducer line bred by the National Maize Improvement Center of China Agricultural University.
Zhengdan 958: a product of Beijing agriculture species Limited; implementation standard: GB 4404.1-2008.
Jingke 968: a product of Beijing Tungyu species Co., Ltd; national variety approval number: Guoshenyu 2011007.
Corn BM: F1 individuals obtained by crossing B73 as the female parent with Mo17 as the male parent.
The nuclear magnetic resonance instrument used in the examples was a MesoMR23-020H-I instrument manufactured by Shanghai Niumag Electronic Technology Co., Ltd.
The principal component analysis in the examples was carried out in the R language.
The indexes used to evaluate the model are the accuracy, the miss rate and the false selection rate. The confusion matrix used for model evaluation is shown in Table 1.
TABLE 1
                                    Haploid (true)         Heterozygous diploid (true)
Haploid (predicted)                 True Positive (TP)     False Positive (FP)
Heterozygous diploid (predicted)    False Negative (FN)    True Negative (TN)
Accuracy: the percentage of all kernels whose predicted class (haploid or heterozygous diploid) matches the true class.
$$\text{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN} \times 100\%$$
Miss rate: the percentage of all true haploids that were predicted to be heterozygous diploids.
$$\text{Miss rate} = \frac{FN}{TP + FN} \times 100\%$$
False selection rate: the percentage of heterozygous diploids among the kernels predicted to be haploid.
$$\text{False selection rate} = \frac{FP}{TP + FP} \times 100\%$$
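The three evaluation indexes defined above (overall accuracy, the share of true haploids that are missed, and the share of predicted haploids that are in fact heterozygous diploids) can be computed directly from the four cells of the confusion matrix. The sketch below is a minimal illustration; the function name `evaluate` and the counts passed to it are hypothetical and are not results from the source.

```python
def evaluate(tp, fp, fn, tn):
    """Return (accuracy, miss_rate, false_selection_rate), all in percent."""
    total = tp + fp + fn + tn
    true_haploids = tp + fn       # every kernel that really is a haploid
    predicted_haploids = tp + fp  # every kernel the model called a haploid
    if total == 0 or true_haploids == 0 or predicted_haploids == 0:
        raise ValueError("each denominator must be non-zero")
    accuracy = 100.0 * (tp + tn) / total
    miss_rate = 100.0 * fn / true_haploids
    false_selection_rate = 100.0 * fp / predicted_haploids
    return accuracy, miss_rate, false_selection_rate


# Hypothetical validation counts, used only to exercise the formulas.
accuracy, miss_rate, false_selection_rate = evaluate(tp=36, fp=4, fn=4, tn=30)
```

The miss rate and the false selection rate use different denominators (true haploids versus predicted haploids), so they need not be equal even when FN and FP are.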
Example 1 Preparation of hybrid populations
Method for identifying haploids and heterozygous diploids: after the corn ears matured, the ears obtained by crossing were harvested and air-dried in a dry environment; haploid kernels and heterozygous diploid (diploid for short) kernels were then selected according to R1-nj coloration. Kernels with purple endosperm and a colorless scutellum are haploid kernels, and kernels with purple endosperm and a purple scutellum are heterozygous diploid kernels. Haploids and heterozygous diploids were screened by this method from the hybrid progeny obtained in this example; these haploids are the true haploids and these heterozygous diploids are the true diploids.
Time: 2017. Location: Hainan Province.
Corn Zhengdan 958 as the female parent was crossed with corn haploid inducer line CAU3 as the male parent to obtain hybrid progeny (kernels); 45 haploids and 45 heterozygous diploids were randomly taken from the hybrid progeny to form hybrid population A1.
Corn Zhengdan 958 as the female parent was crossed with corn haploid inducer line CAU4 as the male parent to obtain hybrid progeny (kernels); 34 haploids and 35 heterozygous diploids were randomly taken from the hybrid progeny to form hybrid population A2.
Corn Zhengdan 958 as the female parent was crossed with corn haploid inducer line CAU5 as the male parent to obtain hybrid progeny (kernels); 20 haploids and 20 heterozygous diploids were randomly taken from the hybrid progeny to form hybrid population A3.
Corn BM as the female parent was crossed with corn haploid inducer line CAU5 as the male parent to obtain hybrid progeny (kernels); 50 haploids and 50 heterozygous diploids were randomly taken from the hybrid progeny to form hybrid population A4.
Corn Jingke 968 as the female parent was crossed with corn haploid inducer line CAU5 as the male parent to obtain hybrid progeny (kernels); 50 haploids and 20 heterozygous diploids were randomly taken from the hybrid progeny to form hybrid population A5.
The five hybrid populations were mixed to give hybrid population B (369 kernels in total: 199 haploids and 170 heterozygous diploids). The kernel counts for hybrid population B are shown in Table 2.
TABLE 2
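Hybrid population    Female parent    Male parent    Haploids    Heterozygous diploids    Total
A1                   Zhengdan 958     CAU3           45          45                       90
A2                   Zhengdan 958     CAU4           34          35                       69
A3                   Zhengdan 958     CAU5           20          20                       40
A4                   BM               CAU5           50          50                       100
A5                   Jingke 968       CAU5           50          20                       70
B (A1-A5 mixed)                                      199         170                      369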
Example 2 Nuclear magnetic signal acquisition
Each kernel in hybrid population B was processed as follows:
1. Weighing.
2. Nuclear magnetic signal acquisition using the nuclear magnetic resonance instrument and the CPMG pulse sequence of the nuclear magnetic resonance analysis software. The parameters were set as follows: TW (waiting time) = 800 ms, TE (echo time) = 0.600 ms, NS (number of scans) = 16. A CPMG attenuation curve was obtained for each kernel.
3. Mass normalization (eliminating the influence of kernel weight on signal amplitude)
The amplitude at each time point was divided by the kernel weight to normalize the data, giving a mass-normalized CPMG attenuation curve (one mass-normalized curve per kernel).
The software used for mass normalization was Microsoft Excel 2016 MSO (32-bit).
4. Spectral band selection
The mass-normalized CPMG attenuation curves (full relaxation time) of the individual kernels are shown in Fig. 1.
The mass-normalized CPMG attenuation curves (0-600 ms) of the individual kernels are shown in Fig. 2.
The average mass-normalized CPMG attenuation curve of all true haploids is compared with that of all true diploids in Fig. 3.
Inspection of the mass-normalized CPMG attenuation curves shows that the curves level off after 600 ms and vary substantially within 0-600 ms. The 0-600 ms section, which contains 1000 points in total, was therefore selected for analysis.
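The normalization and band selection steps above amount to two small operations: dividing every echo amplitude by the kernel weight, and keeping the echoes that fall within 0-600 ms. With TE = 0.600 ms, that window corresponds to 600 / 0.600 = 1000 points, which matches the point count given in the text. The sketch below assumes one echo is recorded every TE starting from the first echo; the function names and the example curve are hypothetical (the source performs the normalization in Excel).

```python
TE_MS = 0.600      # echo spacing from the acquisition parameters
WINDOW_MS = 600.0  # upper limit of the section kept for analysis


def mass_normalize(amplitudes, kernel_weight):
    """Divide every echo amplitude by the kernel weight."""
    if kernel_weight <= 0:
        raise ValueError("kernel weight must be positive")
    return [amplitude / kernel_weight for amplitude in amplitudes]


def points_in_window(te_ms=TE_MS, window_ms=WINDOW_MS):
    """Number of echoes in the window, assuming one echo every TE."""
    return round(window_ms / te_ms)


def select_window(curve, te_ms=TE_MS, window_ms=WINDOW_MS):
    """Keep only the echoes that fall inside the analysis window."""
    return curve[:points_in_window(te_ms, window_ms)]


# Hypothetical raw curve of 1500 echoes for a kernel weighing 0.25 (in the unit used for weighing).
raw_curve = [1000.0 - 0.5 * i for i in range(1500)]
normalized = mass_normalize(raw_curve, kernel_weight=0.25)
section = select_window(normalized)
```

Normalizing before truncating or after gives the same result, since the division is applied point by point.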
Example 3 Selection of the data processing method
Model building was performed 100 times and the results were averaged. In each round, 80% of the haploids and 80% of the heterozygous diploids were randomly drawn from hybrid population B to form the training set, and the remaining 20% of the haploids and 20% of the heterozygous diploids formed the validation set.
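The resampling scheme described here is a repeated stratified split: each class is split 80/20 on its own, so that the training and validation sets keep the class proportions of hybrid population B. The sketch below is one way to express it. The rounding rule for the 80% share, the fixed seed and the integer kernel IDs are assumptions, since the source does not state them.

```python
import random


def stratified_split(haploids, diploids, train_fraction=0.8, rng=None):
    """Split each class separately so both sets keep the class proportions."""
    rng = rng or random.Random()
    train, validation = [], []
    for label, group in (("haploid", haploids), ("diploid", diploids)):
        shuffled = list(group)
        rng.shuffle(shuffled)
        n_train = round(train_fraction * len(shuffled))  # rounding rule is an assumption
        train += [(kernel_id, label) for kernel_id in shuffled[:n_train]]
        validation += [(kernel_id, label) for kernel_id in shuffled[n_train:]]
    return train, validation


def repeated_splits(haploids, diploids, repeats=100, seed=0):
    """Yield one independent train/validation split per model-building round."""
    rng = random.Random(seed)
    for _ in range(repeats):
        yield stratified_split(haploids, diploids, rng=rng)


# Kernel counts of hybrid population B (199 haploids, 170 diploids); the IDs are placeholders.
haploid_ids = list(range(199))
diploid_ids = list(range(199, 369))
splits = list(repeated_splits(haploid_ids, diploid_ids))
```

With these counts and this rounding rule, each round trains on 159 haploids and 136 diploids and validates on the remaining 40 and 34.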
First, the data of the training set kernels were processed as follows
Data processing was applied to the 0-600 ms section of the mass-normalized CPMG attenuation curves obtained in Example 2, principal component analysis was then performed, and a haploid identification model was constructed with a support vector machine algorithm using the first 100 principal components as variables.
The following data processing methods were each tested: 10-point smoothing (S); first-order derivative (D); vector normalization (V); 10-point smoothing followed by first-order derivative (SD); 10-point smoothing followed by vector normalization (SV); and 10-point smoothing followed by first-order derivative and vector normalization (SDV).
Second, the validation set kernels were processed as follows
Data processing was applied to the 0-600 ms section of the mass-normalized CPMG attenuation curves obtained in Example 2 (using the same data processing method as in the first step), principal component analysis was then performed, and the result was substituted into the haploid identification model constructed in the first step to obtain the predicted values.
The model was evaluated by comparing the predicted values with the true values of the validation set kernels. The results are shown in Table 3. Modeling performed best with 10-point smoothing as the data processing method.
TABLE 3
[Table 3 image: model evaluation results for each data processing method]
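The six preprocessing schemes compared in this example (S, D, V, SD, SV, SDV) are combinations of three elementary operations applied in the order of their letters. The sketch below shows one plausible reading of each operation. The source does not define them precisely, so the moving-average window (no padding, which shortens the curve by nine points), the finite-difference derivative and the unit-length scaling are all assumptions.

```python
import math


def smooth(curve, window=10):
    """Moving average; each output point is the mean of `window` inputs."""
    if len(curve) < window:
        raise ValueError("curve is shorter than the smoothing window")
    return [sum(curve[i:i + window]) / window
            for i in range(len(curve) - window + 1)]


def first_derivative(curve):
    """First-order difference between neighboring points."""
    return [b - a for a, b in zip(curve, curve[1:])]


def vector_normalize(curve):
    """Scale the curve to unit Euclidean length."""
    norm = math.sqrt(sum(value * value for value in curve))
    if norm == 0:
        raise ValueError("cannot normalize an all-zero curve")
    return [value / norm for value in curve]


STEPS = {"S": smooth, "D": first_derivative, "V": vector_normalize}


def preprocess(curve, code):
    """Apply the steps named by a code such as 'S', 'SD' or 'SDV', in order."""
    for letter in code:
        curve = STEPS[letter](curve)
    return curve
```

For example, `preprocess(curve, "SDV")` smooths first, then differentiates, then normalizes. Whatever scheme is chosen must be applied identically to the training and validation curves.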
Example 4 selection of the amount of principal Components
Model building was performed 100 times and the results averaged. In each model building, 80% of haploids and 80% of heterozygous diploids are randomly taken from the hybridization group B to form a training set, and the remaining 20% of haploids and 20% of heterozygous diploids form a verification set.
Step one: the data of the training set grains were processed as follows.
The 0-600 ms segment of the mass-normalized CPMG attenuation curve obtained in Example 2 was subjected to data processing (10-point smoothing), followed by principal component analysis. The principal components related to the grain traits of haploids and heterozygous diploids were used as variables, and a haploid identification model was constructed with a support vector machine algorithm.
The number of principal components was set to 50, 100, 150 or 200, respectively.
Step two: the verification set grains were processed as follows.
The 0-600 ms segment of the mass-normalized CPMG attenuation curve obtained in Example 2 was subjected to data processing (10-point smoothing), followed by principal component analysis (with the same number of principal components as in step one); the result was then substituted into the haploid identification model constructed in step one to obtain the predicted value.
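One way to read step two is that the verification set curves are projected onto the principal components fitted on the training set, so that both sets are described by the same variables. The text does not spell this out, so the Python sketch below rests on that assumption. It uses plain power iteration with deflation purely for illustration; the function names `fit_pca` and `project` are hypothetical, and a real analysis would rely on a tested library routine.

```python
import math

def fit_pca(rows, n_components, n_iter=500):
    """Return (mean, components) estimated from the training rows only."""
    n, d = len(rows), len(rows[0])
    if n < 2:
        raise ValueError("need at least two training rows")
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    components = []
    for _ in range(n_components):
        v = [1.0 / math.sqrt(d)] * d  # fixed start vector keeps the sketch deterministic
        for _ in range(n_iter):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # no variance left to explain
                break
            v = [x / norm for x in w]
        eigenvalue = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
        components.append(v)
        # Deflate: remove the variance captured by this component.
        for i in range(d):
            for j in range(d):
                cov[i][j] -= eigenvalue * v[i] * v[j]
    return mean, components

def project(rows, mean, components):
    """Center rows with the training mean and project onto the training components."""
    return [[sum((r[j] - mean[j]) * comp[j] for j in range(len(mean)))
             for comp in components] for r in rows]
```

With this reading, `n_components` would take the values 50, 100, 150 or 200 in turn, and `project` would be applied to both the training and the verification curves before model building and prediction.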
The model was evaluated by comparing the predicted values with the true values of the verification set grains. The results are shown in Table 4. Modeling performance was best when the number of principal components was 100.
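TABLE 4 (provided as an image in the original publication, reference BDA0001640287350000101; the table contents are not recoverable from this text)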
Example 5 Selection of the modeling method
Model building was repeated 100 times and the results were averaged. In each round, 80% of the haploids and 80% of the heterozygous diploids were randomly drawn from hybridization group B to form the training set, and the remaining 20% of the haploids and 20% of the heterozygous diploids formed the verification set.
Step one: the data of the training set grains were processed as follows.
The 0-600 ms segment of the mass-normalized CPMG attenuation curve obtained in Example 2 was subjected to data processing (10-point smoothing), followed by principal component analysis (with 100 principal components), and a haploid identification model was then constructed.
The following algorithms were each used to build the model: support vector machine (SVM; parameters: sigma = 0.004976874, penalty coefficient C = 16); random forest (RF; parameter: number of randomly sampled variables mtry = 12); K-nearest neighbor (KNN; parameter: K = 39); decision tree (DT; parameter: number of independent decision trees trials = 35); and naive Bayes (NB; the predictor variables are assumed to be independently distributed).
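To show how a parameter such as sigma enters a support vector machine, the Python sketch below evaluates a Gaussian (radial basis function) kernel. The text does not state which kernel or software was used, so this is an assumption: the form exp(-sigma * ||x - y||^2) is the convention of the Gaussian kernel in R's kernlab package, chosen here only because the sigma and C parameter names match it. The function names are hypothetical.

```python
import math

SIGMA = 0.004976874  # sigma value quoted in the text
C = 16               # penalty coefficient quoted in the text (used by the SVM solver, not by the kernel)

def rbf_kernel(x, y, sigma=SIGMA):
    """Gaussian kernel exp(-sigma * ||x - y||^2); this convention is an assumption."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sigma * squared_distance)

def kernel_matrix(rows, sigma=SIGMA):
    """Pairwise kernel values between all rows (for example, principal component scores)."""
    return [[rbf_kernel(r, s, sigma) for s in rows] for r in rows]
```

Under this convention a small sigma makes the kernel decay slowly with distance, so grains that are far apart in principal component space still influence each other; the penalty coefficient C then controls how strongly misclassified training grains are penalized.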
Step two: the verification set grains were processed as follows.
The 0-600 ms segment of the mass-normalized CPMG attenuation curve obtained in Example 2 was subjected to data processing (10-point smoothing), followed by principal component analysis (with 100 principal components); the result was then substituted into the haploid identification model constructed in step one to obtain the predicted value.
The model was evaluated by comparing the predicted values with the true values of the verification set grains. The results are shown in Table 5. Modeling performance was best with the support vector machine algorithm.
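TABLE 5 (provided as an image in the original publication, reference BDA0001640287350000111; the table contents are not recoverable from this text)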

Claims (1)

1. A method for identifying corn haploids, comprising the following steps:
(1) collecting the nuclear magnetic signal of each corn grain in a training set to obtain a CPMG attenuation curve for each grain, and then normalizing the data by dividing the amplitude at each time point by the weight of the grain, to obtain a mass-normalized CPMG attenuation curve for each grain; the training set consists of a plurality of corn grains, of which one part are true haploids and the other part are true diploids;
(2) performing data processing on the 0-600 ms segment of the mass-normalized CPMG attenuation curve obtained in step (1), then performing principal component analysis, and then constructing a haploid identification model; the algorithm for constructing the haploid identification model is a support vector machine algorithm; the parameters of the support vector machine algorithm are: sigma = 0.004976874 and penalty coefficient C = 16;
(3) collecting the nuclear magnetic signal of a corn kernel to be tested to obtain a CPMG attenuation curve, and normalizing the data by dividing the amplitude at each time point by the kernel weight, to obtain a mass-normalized CPMG attenuation curve;
(4) performing data processing on the 0-600 ms segment of the mass-normalized CPMG attenuation curve obtained in step (3), then performing principal component analysis, and substituting the result into the haploid identification model constructed in step (2), the model outputting the result for the corn kernel to be tested, namely predicted haploid or predicted diploid;
in step (2) and step (4), the data processing is 10-point smoothing;
in step (2) and step (4), the number of principal components in the principal component analysis is 100;
in step (2) and step (4), the principal component analysis is performed in the R language.
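The normalization in steps (1) and (3) and the selection of the 0-600 ms segment in steps (2) and (4) can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, the time axis is assumed to be given in milliseconds, and treating both segment endpoints as inclusive is an assumption the claim does not settle.

```python
def mass_normalize(amplitudes, kernel_weight):
    """Divide the amplitude at each time point by the kernel weight."""
    if kernel_weight <= 0:
        raise ValueError("kernel weight must be positive")
    return [a / kernel_weight for a in amplitudes]

def select_segment(times_ms, amplitudes, start_ms=0.0, end_ms=600.0):
    """Keep the amplitudes whose time stamps fall within [start_ms, end_ms]."""
    if len(times_ms) != len(amplitudes):
        raise ValueError("times and amplitudes must have the same length")
    return [a for t, a in zip(times_ms, amplitudes) if start_ms <= t <= end_ms]
```

Dividing by the kernel weight puts heavy and light kernels on the same per-unit-mass scale before the curves are compared.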
CN201810377928.XA 2018-04-25 2018-04-25 Method for identifying corn haploid grains based on nuclear magnetic resonance CPMG attenuation curve Active CN108445035B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810377928.XA CN108445035B (en) 2018-04-25 2018-04-25 Method for identifying corn haploid grains based on nuclear magnetic resonance CPMG attenuation curve

Publications (2)

Publication Number Publication Date
CN108445035A CN108445035A (en) 2018-08-24
CN108445035B true CN108445035B (en) 2021-02-02

Family

ID=63201545

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant