CN116543837B - Genotype comparison method and device based on fluorescent signal platform - Google Patents

Genotype comparison method and device based on fluorescent signal platform

Info

Publication number
CN116543837B
Authority
CN
China
Prior art keywords
signal data
crop variety
fluorescent signal
determining
fluorescence signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310809269.3A
Other languages
Chinese (zh)
Other versions
CN116543837A (en)
Inventor
王凤格
葛建镕
赵怡锟
张云龙
王蕊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Academy of Agriculture and Forestry Sciences
Original Assignee
Beijing Academy of Agriculture and Forestry Sciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Academy of Agriculture and Forestry Sciences filed Critical Beijing Academy of Agriculture and Forestry Sciences
Priority to CN202310809269.3A priority Critical patent/CN116543837B/en
Publication of CN116543837A publication Critical patent/CN116543837A/en
Application granted granted Critical
Publication of CN116543837B publication Critical patent/CN116543837B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00: ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B30/10: Sequence alignment; Homology search
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00: ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/30: Detection of binding sites or motifs
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B25/00: ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
    • G16B25/20: Polymerase chain reaction [PCR]; Primer or probe design; Probe optimisation
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00: ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/10: Signal processing, e.g. from mass spectrometry [MS] or from PCR
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A40/00: Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
    • Y02A40/10: Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in agriculture

Abstract

The invention provides a genotype comparison method and device based on a fluorescence signal platform, and relates to the technical field of computers. The method comprises: acquiring, from a fluorescence signal platform, first fluorescence signal data of a first crop variety at a plurality of sites of a genome and second fluorescence signal data of a second crop variety at the corresponding sites of the genome; for each site, determining a first genotyping result corresponding to the first crop variety and a second genotyping result corresponding to the second crop variety; when neither the first genotyping result nor the second genotyping result is missing data, determining the included angle, the Euclidean distance and the distance difference between the first fluorescence signal data and the second fluorescence signal data; and determining, from these three quantities, the genotype comparison result of the first crop variety and the second crop variety at each site. This improves the accuracy of the determination result for the sample to be tested.

Description

Genotype comparison method and device based on fluorescent signal platform
Technical Field
The invention relates to the technical field of computers, in particular to a genotype comparison method and device based on a fluorescent signal platform.
Background
Genotype data comparison is one of the methods commonly used in modern bioinformatic analysis, and is the most basic comparison method for calculating the genetic similarity between samples. It is mostly used for the authenticity identification of samples and the screening of similar samples.
In the related art, a genotype comparison method extracts the DNA of a sample to be tested, labels it, and applies a genotyping algorithm to obtain the genotyping result of the sample at each detection marker. The genotyping algorithm is based on clustering the fluorescent signal data of multiple samples, and determines the genotyping result through a flow of data noise reduction, clustering and genotype calling. The genotype data of the sample to be tested are then compared with and screened against the genotype data of a plurality of samples in a DNA fingerprint database, and the sample to be tested is judged accordingly.
However, such a genotype comparison method depends heavily on the accuracy of the genotyping algorithm. The fluorescence signals of two samples assigned to the same cluster sometimes differ considerably, yet the comparison judges them to be the same; conversely, the fluorescent signals of two samples assigned to different clusters sometimes differ only very slightly, yet the comparison judges them to be different. The existing genotype comparison method therefore leads to a less accurate determination result for the sample to be tested.
Disclosure of Invention
The invention provides a genotype comparison method based on a fluorescence signal platform, which is used for solving the problem of lower accuracy of a judging result of a sample to be tested in the prior art.
The invention provides a genotype comparison method based on a fluorescence signal platform, which comprises the following steps:
acquiring first fluorescence signal data of a first crop variety of a fluorescence signal platform at a plurality of sites of a genome and second fluorescence signal data of a second crop variety at corresponding sites of the genome;
for each site, respectively determining a first genotyping result corresponding to the first crop variety and a second genotyping result corresponding to the second crop variety based on the first fluorescent signal data and the second fluorescent signal data;
determining an included angle, a euclidean distance and a distance difference between the first fluorescent signal data and the second fluorescent signal data based on the first fluorescent signal data and the second fluorescent signal data under the condition that neither the first genotyping result nor the second genotyping result is missing data;
and determining genotype comparison results of the first crop variety and the second crop variety at each locus based on the included angle, the Euclidean distance and the distance difference.
According to the genotype comparison method based on the fluorescent signal platform provided by the invention, determining the genotype comparison result of the first crop variety and the second crop variety at each site based on the included angle, the Euclidean distance and the distance difference comprises:
comparing the included angle with a first preset threshold;
when the included angle is greater than the first preset threshold, determining that the genotype comparison result of the first crop variety and the second crop variety at the site is different;
and when the included angle is less than or equal to the first preset threshold, determining, based on the Euclidean distance and the distance difference, whether the genotype comparison result of the first crop variety and the second crop variety at the site is the same.
According to the genotype comparison method based on the fluorescent signal platform provided by the invention, determining, based on the Euclidean distance and the distance difference, whether the genotype comparison result of the first crop variety and the second crop variety at the site is the same comprises:
comparing the Euclidean distance with a second preset threshold;
when the Euclidean distance is less than or equal to the second preset threshold, determining that the genotype comparison result of the first crop variety and the second crop variety at the site is the same;
and when the Euclidean distance is greater than the second preset threshold, determining, based on the distance difference, whether the genotype comparison result of the first crop variety and the second crop variety at the site is the same.
According to the genotype comparison method based on the fluorescent signal platform provided by the invention, determining, based on the distance difference, whether the genotype comparison result of the first crop variety and the second crop variety at the site is the same comprises:
comparing the distance difference with a third preset threshold;
and when the distance difference is less than or equal to the third preset threshold, determining that the genotype comparison result of the first crop variety and the second crop variety at the site is the same.
According to the genotype comparison method based on the fluorescence signal platform provided by the invention, the method further comprises:
and when the distance difference is greater than the third preset threshold, determining that the genotype comparison result of the first crop variety and the second crop variety at the site is different.
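The three-stage threshold cascade described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `compare_site`, the parameter names, and the numeric thresholds in the usage lines are hypothetical, since the text here does not give concrete threshold values.

```python
def compare_site(angle, euclidean_distance, distance_difference,
                 angle_threshold, distance_threshold, difference_threshold):
    """Return "same" or "different" for one site.

    All three thresholds are caller-supplied; the source text only calls
    them the first, second and third preset thresholds.
    """
    # Stage 1: an included angle above the first threshold means "different".
    if angle > angle_threshold:
        return "different"
    # Stage 2: small angle and small Euclidean distance means "same".
    if euclidean_distance <= distance_threshold:
        return "same"
    # Stage 3: otherwise the distance difference decides.
    if distance_difference <= difference_threshold:
        return "same"
    return "different"


# Hypothetical thresholds, chosen only to exercise each branch.
print(compare_site(40.0, 100.0, 0.1, 30.0, 500.0, 1.0))   # stage 1 decides
print(compare_site(10.0, 100.0, 5.0, 30.0, 500.0, 1.0))   # stage 2 decides
print(compare_site(10.0, 900.0, 0.5, 30.0, 500.0, 1.0))   # stage 3 decides
```

Ordering the checks this way means the later, finer-grained quantities are consulted only when the earlier ones cannot settle the comparison.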
According to the genotype comparison method based on the fluorescence signal platform, the included angle, the Euclidean distance and the distance difference between the first fluorescence signal data and the second fluorescence signal data are determined based on the first fluorescence signal data and the second fluorescence signal data, and the genotype comparison method comprises the following steps:
calculating the included angle between the first fluorescence signal data and the second fluorescence signal data based on a first vector corresponding to the first fluorescence signal data and a second vector corresponding to the second fluorescence signal data;
and calculating the Euclidean distance and the distance difference between the first fluorescence signal data and the second fluorescence signal data respectively based on the first fluorescence signal data and the second fluorescence signal data.
According to the genotype comparison method based on the fluorescence signal platform, the first genotyping result corresponding to the first crop variety and the second genotyping result corresponding to the second crop variety are respectively determined based on the first fluorescence signal data and the second fluorescence signal data, and the genotype comparison method comprises the following steps:
respectively judging whether the first fluorescent signal data and the second fluorescent signal data meet a first preset condition or not; the first preset condition is used for judging whether the first fluorescent signal data and the second fluorescent signal data are missing or not;
And under the condition that the first fluorescent signal data and the second fluorescent signal data meet the first preset condition, determining that a first genotyping result corresponding to the first crop variety and a second genotyping result corresponding to the second crop variety are both missing data.
According to the genotype comparison method based on the fluorescence signal platform, the first genotyping result corresponding to the first crop variety and the second genotyping result corresponding to the second crop variety are respectively determined based on the first fluorescence signal data and the second fluorescence signal data, and the genotype comparison method comprises the following steps:
respectively judging whether the first fluorescent signal data and the second fluorescent signal data meet a second preset condition or not; the second preset condition is used for judging whether genotype typing results corresponding to the first fluorescent signal data and the second fluorescent signal data are homozygous genotype typing or not;
and under the condition that the first fluorescent signal data and the second fluorescent signal data meet the second preset condition, determining that a first genotyping result corresponding to the first crop variety and a second genotyping result corresponding to the second crop variety are homozygous genotyping.
The invention also provides a genotype comparison device based on the fluorescent signal platform, which comprises:
the acquisition module is used for acquiring first fluorescent signal data of a first crop variety of the fluorescent signal platform at a plurality of sites of a genome and second fluorescent signal data of a second crop variety at corresponding sites of the genome;
the genotyping module is used for determining, for each site, a first genotyping result corresponding to the first crop variety and a second genotyping result corresponding to the second crop variety according to the first fluorescence signal data and the second fluorescence signal data;
the first determining module is used for determining an included angle, a Euclidean distance and a distance difference between the first fluorescent signal data and the second fluorescent signal data based on the first fluorescent signal data and the second fluorescent signal data under the condition that the first genotyping result and the second genotyping result are not missing data;
and the genotype comparison module is used for determining genotype comparison results of the first crop variety and the second crop variety at each locus based on the included angle, the Euclidean distance and the distance difference value.
The invention also provides an electronic device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the genotype comparison method based on the fluorescence signal platform when executing the program.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a genotype comparison method based on a fluorescence signal platform as described in any of the above.
The genotype comparison method and the genotype comparison device based on the fluorescence signal platform provided by the invention are characterized in that the first fluorescence signal data of a first crop variety of the fluorescence signal platform at a plurality of loci of a genome and the second fluorescence signal data of a second crop variety at corresponding loci of the genome are obtained; for each site, respectively determining a first genotyping result corresponding to the first crop variety and a second genotyping result corresponding to the second crop variety according to the first fluorescence signal data and the second fluorescence signal data; under the condition that the first genotyping result and the second genotyping result are homozygous genotyping, determining an included angle, a Euclidean distance and a distance difference between the first fluorescence signal data and the second fluorescence signal data according to the first fluorescence signal data and the second fluorescence signal data; and determining genotype comparison results of the first crop variety and the second crop variety at each site based on the included angle, the Euclidean distance and the distance difference. Based on the included angle, the Euclidean distance and the distance difference value determined by the first fluorescent signal data and the second fluorescent signal data, genotype comparison of each site can be accurately judged, the difference between the first crop variety and the second crop variety is judged from the source of the fluorescent signal data, and the accuracy of the judging result of the sample to be tested is improved.
Drawings
In order to more clearly illustrate the invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of a genotype comparison method based on a fluorescence signal platform;
FIG. 2 is a second flow chart of the genotype comparison method based on the fluorescence signal platform provided by the invention;
FIG. 3 is a schematic structural diagram of a genotype comparison device based on a fluorescence signal platform;
FIG. 4 is a schematic structural diagram of an electronic device provided by the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The genotype comparison method based on the fluorescence signal platform of the present invention is described below with reference to FIGS. 1-2.
FIG. 1 is a schematic flow chart of the genotype comparison method based on a fluorescence signal platform provided by the invention. As shown in FIG. 1, the method comprises steps 101-104, wherein:
Step 101, acquiring first fluorescent signal data of a first crop variety of a fluorescent signal platform at a plurality of sites of a genome and second fluorescent signal data of a second crop variety at corresponding sites of the genome; the first crop variety and the second crop variety belong to the same crop.
It should be noted that, the genotype comparison method based on the fluorescent signal platform provided by the invention is suitable for a scenario of genotype comparison of crop varieties, and the execution subject of the method can be a genotype comparison device based on the fluorescent signal platform, such as an electronic device, or a control module in the genotype comparison device based on the fluorescent signal platform, which is used for executing the genotype comparison method based on the fluorescent signal platform.
Specifically, the fluorescent signal platform can be any platform capable of detecting fluorescent signal data of crop varieties, such as competitive allele-specific PCR (Kompetitive Allele Specific PCR, KASP), a gene chip, or real-time fluorescent quantitative PCR. The first crop variety may be a control sample and the second crop variety may be a sample to be tested. The first crop variety and the second crop variety belong to the same crop; for example, both are soybean, or both are corn. The number of sites is set according to the actual situation, for example 10000 sites. The first fluorescent signal data and the second fluorescent signal data are the fluorescent signal intensity data of the first crop variety and of the second crop variety at each site, respectively; each comprises FAM (6-carboxyfluorescein) fluorescent signal intensity data and HEX (hexachlorofluorescein) fluorescent signal intensity data.
Step 102, determining a first genotyping result corresponding to the first crop variety and a second genotyping result corresponding to the second crop variety based on the first fluorescence signal data and the second fluorescence signal data for each site.
Specifically, for each site of the genome, a first genotyping result corresponding to the first crop variety and a second genotyping result corresponding to the second crop variety can be determined according to the first fluorescence signal data and the second fluorescence signal data, respectively; wherein the first genotyping result comprises at least one of: missing data; a homozygous genotype; or a genotyping result that cannot be determined.
Step 103, determining an included angle, a Euclidean distance and a distance difference between the first fluorescence signal data and the second fluorescence signal data based on the first fluorescence signal data and the second fluorescence signal data under the condition that the first genotyping result and the second genotyping result are not missing data.
Specifically, under the condition that the first genotyping result and the second genotyping result are not missing data, determining an included angle, a Euclidean distance and a distance difference between the first crop variety and the second crop variety according to the first fluorescent signal data and the second fluorescent signal data; the included angle is determined based on a vector formed by points mapped in the same two-dimensional space by the first fluorescent signal data and the second fluorescent signal data and an origin, the Euclidean distance is determined based on a first vector corresponding to the first fluorescent signal data and a second vector corresponding to the second fluorescent signal data, and the distance difference is determined based on a difference value between the fluorescent signal data after log conversion of the first fluorescent signal data and the second fluorescent signal data.
Optionally, for each site, when either the first genotyping result or the second genotyping result is missing data, the first crop variety and the second crop variety are not compared at that site.
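The rule that a site is skipped when either genotyping result is missing can be sketched as a simple filter. The label `"missing"`, the function name `comparable_sites`, and the list-based representation of per-site genotyping results are assumptions made for illustration; the text does not prescribe a data structure.

```python
MISSING = "missing"  # hypothetical label for a missing-data genotyping result


def comparable_sites(first_calls, second_calls):
    """Return the indices of sites where neither genotyping result is missing.

    first_calls and second_calls are equal-length sequences holding one
    genotyping result per site for the first and second crop variety.
    """
    if len(first_calls) != len(second_calls):
        raise ValueError("both varieties must cover the same sites")
    return [
        index
        for index, (first, second) in enumerate(zip(first_calls, second_calls))
        if first != MISSING and second != MISSING
    ]


# Only site 0 has a usable result for both varieties.
print(comparable_sites(["AA", MISSING, "BB"], ["AA", "BB", MISSING]))
```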
Step 104, determining genotype comparison results of the first crop variety and the second crop variety at each site based on the included angle, the Euclidean distance and the distance difference.
Specifically, for each site, the genotype comparison result of the first crop variety and the second crop variety at that site can be determined according to the included angle, the Euclidean distance and the distance difference, so that the genotype comparison result of every site is obtained. The genotype comparison result is either "same" or "different": "same" indicates that the first crop variety and the second crop variety have the same genotype at that site of the genome, and "different" indicates that their genotypes at that site are not the same.
The genotype comparison method based on the fluorescent signal platform provided by the invention comprises the steps of obtaining first fluorescent signal data of a first crop variety of the fluorescent signal platform at a plurality of loci of a genome and second fluorescent signal data of a second crop variety at corresponding loci of the genome; for each site, respectively determining a first genotyping result corresponding to the first crop variety and a second genotyping result corresponding to the second crop variety according to the first fluorescence signal data and the second fluorescence signal data; under the condition that the first genotyping result and the second genotyping result are homozygous genotyping, determining an included angle, a Euclidean distance and a distance difference between the first fluorescence signal data and the second fluorescence signal data according to the first fluorescence signal data and the second fluorescence signal data; and determining genotype comparison results of the first crop variety and the second crop variety at each site based on the included angle, the Euclidean distance and the distance difference. Based on the included angle, the Euclidean distance and the distance difference value determined by the first fluorescent signal data and the second fluorescent signal data, genotype comparison of each site can be accurately judged, the difference between the first crop variety and the second crop variety is judged from the source of the fluorescent signal data, and the accuracy of the judging result of the sample to be tested is improved.
Optionally, the specific implementation manner of step 102 includes:
(1) Respectively judging whether the first fluorescent signal data and the second fluorescent signal data meet a first preset condition or not; the first preset condition is used for judging whether the first fluorescent signal data and the second fluorescent signal data are missing or not.
Specifically, the first preset condition is used for judging whether the first fluorescent signal data and the second fluorescent signal data are missing. For example, the first preset condition is expressed as SignalA + SignalB <= 1000, where SignalA represents one of the fluorescent signal intensity data included in the first fluorescent signal data and SignalB represents the other; for example, SignalA is the FAM fluorescent signal intensity data and SignalB is the HEX fluorescent signal intensity data.
(2) And under the condition that the first fluorescent signal data and the second fluorescent signal data meet the first preset condition, determining that a first genotyping result corresponding to the first crop variety and a second genotyping result corresponding to the second crop variety are both missing data.
Specifically, for each site, when the first fluorescent signal data and the second fluorescent signal data each satisfy the first preset condition, it can be determined that the first genotyping result corresponding to the first crop variety and the second genotyping result corresponding to the second crop variety are both missing data. For example, for site 1, if SignalA and SignalB of the first fluorescent signal data satisfy the first preset condition (SignalA + SignalB <= 1000), the first genotyping result of the first crop variety at site 1 is determined to be missing data; likewise, if SignalA and SignalB of the second fluorescent signal data satisfy the first preset condition (SignalA + SignalB <= 1000), the second genotyping result of the second crop variety at site 1 is determined to be missing data. Because the genotyping results are determined from the first and second fluorescent signal data, the genotype comparison at each site can then be determined from those results, which improves the accuracy of the determination result for the sample to be tested.
Optionally, under the condition that the first fluorescent signal data and the second fluorescent signal data do not meet the first preset condition, determining that the first genotyping result corresponding to the first crop variety and the second genotyping result corresponding to the second crop variety are both the genotyping results which cannot be determined.
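The first preset condition can be sketched as a one-line check. The value 1000 is the example given in the text; the function name `is_missing`, the parameter names, and the sample intensities are hypothetical.

```python
MISSING_SUM_THRESHOLD = 1000  # example value from the text, not a fixed constant


def is_missing(signal_a, signal_b, threshold=MISSING_SUM_THRESHOLD):
    """Return True when the combined intensity is too low to call a genotype.

    signal_a and signal_b are the two fluorescent signal intensities of one
    variety at one site (for example FAM and HEX).
    """
    return signal_a + signal_b <= threshold


# Hypothetical intensities for one site of each variety.
first_missing = is_missing(300, 450)     # sum 750, at or below the threshold
second_missing = is_missing(2800, 600)   # sum 3400, above the threshold
print(first_missing, second_missing)
```

The check is applied to each variety separately, so one variety can be missing at a site while the other is not.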
Optionally, the specific implementation manner of step 102 includes:
(a) Respectively judging whether the first fluorescent signal data and the second fluorescent signal data meet a second preset condition or not; the second preset condition is used for judging whether the genotyping results corresponding to the first fluorescent signal data and the second fluorescent signal data are homozygous genotyping.
Specifically, the second preset condition is used for judging whether the genotyping results corresponding to the first fluorescence signal data and the second fluorescence signal data are homozygous genotyping, for example, homozygous genotyping includes AA genotyping and BB genotyping. For example, the second preset condition is expressed as: signalA > = 2SignalB or SignalB > = 2SignalA.
(b) And under the condition that the first fluorescent signal data and the second fluorescent signal data meet the second preset condition, determining that a first genotyping result corresponding to each position of the first crop variety and a second genotyping result corresponding to each position of the second crop variety are homozygous genotyping.
Specifically, under the condition that the first fluorescent signal data and the second fluorescent signal data meet the second preset condition, the first genotyping result corresponding to each position of the first crop variety and the second genotyping result corresponding to each position of the second crop variety can be determined to be homozygous genotyping. For example, the second preset condition is expressed as: determining that the first genotyping result corresponding to the first crop variety at each position is AA if the first fluorescent signal data meets the signa > =2signalB; the second preset condition is expressed as: determining that the first fluorescent signal data meet the requirement of signalB > =2signala, determining that the second genotyping result corresponding to the second crop variety at each position is BB, determining the genotyping result according to the first fluorescent signal data and the second fluorescent signal data, further determining the genotyping comparison of each position according to the genotyping result, and improving the accuracy of the determination result of the sample to be tested.
Optionally, under the condition that the first fluorescent signal data and the second fluorescent signal data do not meet the second preset condition, determining that the first genotyping result corresponding to the first crop variety at each position and the second genotyping result corresponding to the second crop variety at each position are both the genotyping results which cannot be determined.
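As an illustration only, the second preset condition can be sketched in Python as follows. The function name `call_homozygous` and the use of `None` for a genotyping result that cannot be determined are assumptions made for this sketch. It also assumes that missing data has already been screened out by the first preset condition, so a pair of signals that satisfies both inequalities (for non-negative signals, only 0 and 0) is treated as undetermined.

```python
from typing import Optional


def call_homozygous(signal_a: float, signal_b: float) -> Optional[str]:
    """Apply the second preset condition to the signals of one variety at one site.

    Returns "AA" if SignalA >= 2 * SignalB, "BB" if SignalB >= 2 * SignalA,
    and None when the genotyping result cannot be determined.
    """
    is_aa = signal_a >= 2 * signal_b
    is_bb = signal_b >= 2 * signal_a
    if is_aa and not is_bb:
        return "AA"
    if is_bb and not is_aa:
        return "BB"
    # Neither inequality holds, or both hold (assumed to be an unusable signal).
    return None


if __name__ == "__main__":
    print(call_homozygous(2000, 500))   # AA
    print(call_homozygous(400, 900))    # BB
    print(call_homozygous(1000, 800))   # None
```

The same function would be applied once to the first fluorescent signal data and once to the second fluorescent signal data at each site.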
Optionally, the specific implementation manner of step 103 includes:
1) And calculating the included angle between the first fluorescent signal data and the second fluorescent signal data based on a first vector corresponding to the first fluorescent signal data and a second vector corresponding to the second fluorescent signal data.
Specifically, the first vector is the vector formed by the first fluorescent signal data in a two-dimensional space, and the second vector is the vector formed by the second fluorescent signal data in the same two-dimensional space. From the first vector corresponding to the first fluorescent signal data and the second vector corresponding to the second fluorescent signal data, the included angle $\theta$ between the first fluorescent signal data and the second fluorescent signal data can be calculated using formula (1), where formula (1) is expressed as:
$$\theta = \arccos\left(\frac{\overrightarrow{OC}\cdot\overrightarrow{OT}}{|\overrightarrow{OC}|\,|\overrightarrow{OT}|}\right) \qquad (1)$$
where $\overrightarrow{OC}$ represents the first vector and $\overrightarrow{OT}$ represents the second vector.
For example, if the first fluorescent signal data at a certain site are SignalA1 and SignalB1 and the second fluorescent signal data are SignalA2 and SignalB2, then in the two-dimensional space formed by SignalA and SignalB, the point C (SignalA1, SignalB1) represents the first fluorescent signal data and the point T (SignalA2, SignalB2) represents the second fluorescent signal data. The vector from the origin O to the point C is the first vector $\overrightarrow{OC}$, and the vector from the origin O to the point T is the second vector $\overrightarrow{OT}$. Formula (1) then gives the included angle $\theta$ between $\overrightarrow{OC}$ and $\overrightarrow{OT}$.
2) And calculating the Euclidean distance and the distance difference between the first fluorescence signal data and the second fluorescence signal data respectively based on the first fluorescence signal data and the second fluorescence signal data.
Specifically, based on the first fluorescence signal data and the second fluorescence signal data, the Euclidean distance ED between the first fluorescence signal data and the second fluorescence signal data is calculated using formula (2), and the distance difference between the first fluorescence signal data and the second fluorescence signal data is calculated using formula (3), where formula (2) and formula (3) are expressed as:
(2)
(3)
where the formulas use the log-transformed first fluorescent signal data and the log-transformed second fluorescent signal data, both of which can be expressed by formula (4):
(4)
wherein A can be A1 or A2, and when A is A1, B is B1; when A is A2, B is B2.
The specific implementation order of the steps 1) and 2) is not specifically limited, and the step 1) may be performed first, then the step 2) may be performed, or the step 2) may be performed first, then the step 1) may be performed.
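The included angle of step 1) can be sketched in Python as below. This is a minimal illustration, assuming the angle is the ordinary angle between the two vectors drawn from the origin to the points (SignalA1, SignalB1) and (SignalA2, SignalB2), obtained from the normalized dot product and reported in degrees. The function name `included_angle_deg` is hypothetical, and the Euclidean distance and distance difference of step 2) are not included here.

```python
import math
from typing import Tuple

Point = Tuple[float, float]  # (SignalA, SignalB) of one variety at one site


def included_angle_deg(c: Point, t: Point) -> float:
    """Angle in degrees between the vector O->C and the vector O->T."""
    dot = c[0] * t[0] + c[1] * t[1]
    norm = math.hypot(c[0], c[1]) * math.hypot(t[0], t[1])
    if norm == 0:
        raise ValueError("the angle is undefined when either vector has zero length")
    # Clamp to [-1, 1] to guard against floating-point rounding before acos.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))


if __name__ == "__main__":
    # One variety with signal only on A, the other only on B.
    print(included_angle_deg((1000.0, 0.0), (0.0, 1000.0)))
    # Same allele ratio at different intensity: the angle is close to zero.
    print(included_angle_deg((1000.0, 1000.0), (2000.0, 2000.0)))
```

Because the angle depends only on the direction of each vector, two samples with the same ratio of SignalA to SignalB but different overall intensity give an angle near zero.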
Optionally, the specific implementation manner of step 104 includes:
a) And comparing the included angle with a first preset threshold value.
Specifically, under the condition that the first genotyping result and the second genotyping result are not missing data, firstly judging whether the first genotyping result and the second genotyping result are the same homozygous genotyping, and under the condition that the first genotyping result and the second genotyping result are the same homozygous genotyping, determining that the genotype comparison result of the first crop variety and the second crop variety at each locus is the same; comparing the included angle with a first preset threshold value under the condition that the first genotyping result and the second genotyping result are not the same homozygous genotyping; the first preset threshold is set according to practical situations, for example, the first preset threshold is 30 °.
b) And under the condition that the included angle is larger than the first preset threshold value, determining that genotype comparison results of the first crop variety and the second crop variety at all the sites are different.
Specifically, when the included angle is greater than the first preset threshold, the first vector and the second vector differ substantially, so the genotype comparison results of the first crop variety and the second crop variety at the site can be determined to be different. Because the genotype comparison result is determined from the included angle rather than from the genotyping result, the accuracy of the determination result for the sample to be tested can be improved.
c) And under the condition that the included angle is smaller than or equal to the first preset threshold value, determining that the genotype comparison results of the first crop variety and the second crop variety at all the sites are the same based on the Euclidean distance and the distance difference value.
Specifically, when the included angle is less than or equal to the first preset threshold, whether the genotype comparison results of the first crop variety and the second crop variety at each site are the same is further determined according to the Euclidean distance and the distance difference.
Optionally, the specific implementation manner of the step c) includes:
c-1) comparing the Euclidean distance with a second preset threshold value.
Specifically, the second preset threshold is set according to the actual situation, for example, the second preset threshold is 1000.
c-2) determining that the genotype comparison results of the first crop variety and the second crop variety at each site are the same under the condition that the Euclidean distance is smaller than or equal to the second preset threshold value.
Specifically, when the euclidean distance is less than or equal to the second preset threshold value, the first fluorescent signal data and the second fluorescent signal data are similar, and the genotype comparison results of the first crop variety and the second crop variety at each site can be determined to be the same.
c-3) determining that the genotype comparison results of the first crop variety and the second crop variety at the sites are the same based on the distance difference value under the condition that the Euclidean distance is larger than the second preset threshold value.
Specifically, if the Euclidean distance is greater than the second preset threshold, it is necessary to further determine, according to the distance difference, whether the genotype comparison results of the first crop variety and the second crop variety at each site are the same.
Optionally, the determining that the genotype comparison result of the first crop variety and the second crop variety at each of the sites is the same based on the distance difference comprises:
comparing the distance difference value with a third preset threshold value; and under the condition that the distance difference value is smaller than or equal to the third preset threshold value, determining that genotype comparison results of the first crop variety and the second crop variety at all the sites are the same.
Specifically, the third preset threshold is set according to the actual situation; for example, the third preset threshold is 1. When the distance difference is less than or equal to the third preset threshold, the distance difference between the first fluorescent signal data and the second fluorescent signal data is small, which further indicates that the first fluorescent signal data and the second fluorescent signal data are similar, so the genotype comparison results of the first crop variety and the second crop variety at the site can be determined to be the same.
Optionally, determining that the genotype comparison results of the first crop variety and the second crop variety at the sites are different when the distance difference is greater than the third preset threshold.
Specifically, when the distance difference is greater than the third preset threshold, the distance difference between the first fluorescent signal data and the second fluorescent signal data is larger, so that the fact that the first fluorescent signal data and the second fluorescent signal data are dissimilar can be indicated, and the genotype comparison results of the first crop variety and the second crop variety at all sites can be determined to be different.
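The three-stage threshold decision described in steps a) to c) can be sketched as a short Python function. This is an illustration, not the patented implementation: the function name `compare_site` and its parameter names are hypothetical, the default thresholds are the example values given in the text (30°, 1000 and 1), and the included angle, Euclidean distance and distance difference are assumed to have been computed beforehand.

```python
def compare_site(
    angle_deg: float,
    euclid_dist: float,
    dist_diff: float,
    angle_max: float = 30.0,    # example first preset threshold
    dist_max: float = 1000.0,   # example second preset threshold
    diff_max: float = 1.0,      # example third preset threshold
) -> str:
    """Return "same" or "different" for one site."""
    # Steps a) and b): a large included angle means the genotypes differ.
    if angle_deg > angle_max:
        return "different"
    # Steps c-1) and c-2): a small Euclidean distance means the genotypes match.
    if euclid_dist <= dist_max:
        return "same"
    # Step c-3): otherwise the distance difference decides.
    return "same" if dist_diff <= diff_max else "different"


if __name__ == "__main__":
    print(compare_site(45.0, 100.0, 0.1))    # different
    print(compare_site(10.0, 500.0, 5.0))    # same
    print(compare_site(10.0, 1500.0, 0.5))   # same
    print(compare_site(10.0, 1500.0, 2.0))   # different
```

Each later test is only reached when the earlier one is inconclusive, so the Euclidean distance is never consulted once the angle exceeds its threshold.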
The genotype comparison method based on the fluorescent signal platform provided by the invention compares the included angle with a first preset threshold. When the included angle is greater than the first preset threshold, the genotype comparison results of the first crop variety and the second crop variety at each site are determined to be different. When the included angle is less than or equal to the first preset threshold, the Euclidean distance is compared with a second preset threshold. When the Euclidean distance is less than or equal to the second preset threshold, the genotype comparison results of the first crop variety and the second crop variety at each site are determined to be the same. When the Euclidean distance is greater than the second preset threshold, the distance difference is compared with a third preset threshold, and when the distance difference is less than or equal to the third preset threshold, the genotype comparison results of the first crop variety and the second crop variety at each site are determined to be the same. By using the included angle, the Euclidean distance and the distance difference together, the method judges whether the genotype comparison results of the first crop variety and the second crop variety at each site are the same or different, which improves the accuracy of the determination result for the sample to be tested. The genetic similarity of the first crop variety and the second crop variety can then be judged, and the accuracy of that judgment is improved as well.
Optionally, after the genotype comparison results of the first crop variety and the second crop variety at each site are obtained, the probability that the genotype comparison result is the same is determined from those results. When this probability is greater than a target preset threshold, the genetic similarity of the first crop variety and the second crop variety is determined to be high, that is, the first crop variety and the second crop variety are the same variety. For example, the target preset threshold is 80%.
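A minimal Python sketch of this final judgment is given below. It rests on two assumptions that the text does not state: the probability is taken to be the fraction of compared sites whose result is the same, and sites whose result is missing are left out of the denominator. The names `genetic_similarity` and `same_variety` are hypothetical.

```python
from typing import Iterable


def genetic_similarity(results: Iterable[str]) -> float:
    """Fraction of compared sites whose genotype comparison result is "same".

    Entries other than "same" or "different" (for example "missing") are
    ignored; this exclusion is an assumption of the sketch.
    """
    compared = [r for r in results if r in ("same", "different")]
    if not compared:
        raise ValueError("no comparable sites")
    return sum(r == "same" for r in compared) / len(compared)


def same_variety(results: Iterable[str], threshold: float = 0.80) -> bool:
    """True when the similarity is strictly greater than the target threshold."""
    return genetic_similarity(results) > threshold


if __name__ == "__main__":
    calls = ["same"] * 9 + ["different"]
    print(genetic_similarity(calls))  # 0.9
    print(same_variety(calls))        # True
```

The comparison is strict, following the wording "greater than a target preset threshold", so a similarity of exactly 80% does not count as the same variety under the example threshold.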
FIG. 2 is a second flow chart of a genotype comparison method based on a fluorescence signal platform according to the present invention, as shown in FIG. 2, the method includes steps 201-217; wherein,
Step 201, acquiring first fluorescence signal data of a first crop variety of a fluorescence signal platform at a plurality of sites of a genome and second fluorescence signal data of a second crop variety at corresponding sites of the genome.
Step 202, for each site, respectively judging whether the first fluorescent signal data and the second fluorescent signal data meet a first preset condition; the first preset condition is used for judging whether the first fluorescent signal data and the second fluorescent signal data are missing; if the first fluorescent signal data and the second fluorescent signal data meet the first preset condition, go to step 203; if the first fluorescent signal data and the second fluorescent signal data do not meet the first preset condition, go to step 205.
Step 203, determining that the first genotyping result corresponding to each position of the first crop variety and the second genotyping result corresponding to each position of the second crop variety are missing data.
Step 204, determining that the first genotyping result corresponding to each position of the first crop variety and the second genotyping result corresponding to each position of the second crop variety are both the genotyping results which cannot be determined, for example, using None to indicate the genotyping results which cannot be determined.
Step 205, judging whether the first fluorescent signal data and the second fluorescent signal data meet a second preset condition or not respectively; the second preset condition is used for judging whether genotype parting results corresponding to the first fluorescent signal data and the second fluorescent signal data are homozygous genotype parting or not; if the first fluorescent signal data and the second fluorescent signal data meet the second preset condition, go to step 206; if the first fluorescent signal data and the second fluorescent signal data do not satisfy the second preset condition, the process goes to step 204.
Step 206, determining that a first genotyping result corresponding to each position of the first crop variety and a second genotyping result corresponding to each position of the second crop variety are homozygous genotyping; wherein, the homozygous genotyping is AA genotyping or BB genotyping. For example, a first crop variety at position 1 corresponds to a first genotyping result of AA and a second crop variety at position 1 corresponds to a second genotyping result of AA; the first genotyping result corresponding to the first crop variety at the site 2 is AA, and the second genotyping result corresponding to the second crop variety at the site 2 is None; the first genotyping result corresponding to the first crop variety at the site 3 is AA, and the second genotyping result corresponding to the second crop variety at the site 3 is BB; the first genotyping result corresponding to the first crop variety at the site 4 is BB, and the second genotyping result corresponding to the second crop variety at the site 4 is BB; the first genotyping result corresponding to the first crop variety at the site 5 is BB, and the second genotyping result corresponding to the second crop variety at the site 5 is None; the first genotyping result corresponding to the first crop variety at site 6 is None and the second genotyping result corresponding to the second crop variety at site 6 is None.
Step 207, judging whether missing data exists in the first genotyping result and the second genotyping result; if either one of the first genotyping result and the second genotyping result is missing data, proceeding to step 208; in the case that neither the first genotyping result nor the second genotyping result is missing data, go to step 209.
Step 208, determining the first genotyping result corresponding to each site of the first crop variety and the second genotyping result corresponding to each site of the second crop variety as missing data.
Step 209, determining whether the first genotyping result and the second genotyping result are the same homozygous genotype. If the first genotyping result and the second genotyping result are the same homozygous genotype, go to step 210; if the first genotyping result and the second genotyping result are not the same homozygous genotype, go to step 211.
Step 210, determining that the genotype comparison results of the first crop variety and the second crop variety at each site are the same. For example, the genotype comparison results of the first crop variety and the second crop variety at the site 1 and the site 4 are the same, and the genotype comparison results of the first crop variety and the second crop variety at the site 2, the site 3, the site 5 and the site 6 are further judged according to the included angle, the Euclidean distance and the distance difference.
Step 211, determining whether the included angle is greater than a first preset threshold. If the included angle is greater than the first preset threshold, go to step 212; if the included angle is less than or equal to the first preset threshold, the process goes to step 213.
Step 212, determining that the genotype comparison results of the first crop variety and the second crop variety at each site are different.
Step 213, determining whether the euclidean distance is greater than a second preset threshold. If the euclidean distance is less than or equal to the second preset threshold value, go to step 214; in the case that the euclidean distance is greater than the second preset threshold, the process goes to step 215.
Step 214, determining that the genotype comparison results of the first crop variety and the second crop variety at each site are the same. For example, the genotype comparison results for site 2, site 5 and site 6 are the same. And the genotype comparison results of the first crop variety and the second crop variety at the site 2, the site 3, the site 5 and the site 6 are further judged according to the distance difference.
Step 215, determining whether the distance difference is greater than a third preset threshold. If the distance difference is less than or equal to the third preset threshold, go to step 216; if the distance difference is greater than the third preset threshold, the process proceeds to step 217.
Step 216, determining that the genotype comparison results of the first crop variety and the second crop variety at each site are the same. For example, the genotype comparison results for site 2, site 5 and site 6 are the same.
Step 217, determining that the genotype comparison results of the first crop variety and the second crop variety at each site are different. For example, the genotype comparison results for site 2, site 5 and site 6 are not identical.
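The routing performed in steps 207 to 210 can be illustrated with the six example sites given under step 206. The Python sketch below is an assumption-laden illustration: the function name `triage_site`, the string `"missing"` as the marker for missing data, and the label `"compare_signals"` for sites that continue to step 211 are all hypothetical.

```python
from typing import Optional

MISSING = "missing"  # hypothetical marker for missing data


def triage_site(g1: Optional[str], g2: Optional[str]) -> str:
    """Route one site according to the two genotyping results.

    g1 and g2 are "AA", "BB", None (cannot be determined) or MISSING.
    """
    if g1 == MISSING or g2 == MISSING:
        return "missing"            # step 208
    if g1 in ("AA", "BB") and g1 == g2:
        return "same"               # step 210
    return "compare_signals"        # continue at step 211


# Example genotyping results for sites 1 to 6 from step 206.
EXAMPLE_SITES = {
    1: ("AA", "AA"),
    2: ("AA", None),
    3: ("AA", "BB"),
    4: ("BB", "BB"),
    5: ("BB", None),
    6: (None, None),
}

if __name__ == "__main__":
    for site, (g1, g2) in EXAMPLE_SITES.items():
        print(site, triage_site(g1, g2))
```

With these inputs, sites 1 and 4 are resolved as the same at step 210, while sites 2, 3, 5 and 6 are passed on to the comparison by included angle, Euclidean distance and distance difference, in line with the example under step 210.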
The genotype comparison method based on the fluorescence signal platform provided by the invention does not depend on genotyping software and thus avoids errors caused by incorrect genotype calls. It takes the raw fluorescence signal data as its underlying data and judges the differences among samples directly at the data source, which improves the accuracy of the genotype comparison results.
The genotype comparison device based on the fluorescence signal platform provided by the invention is described below, and the genotype comparison device based on the fluorescence signal platform described below and the genotype comparison method based on the fluorescence signal platform described above can be correspondingly referred to each other.
Fig. 3 is a schematic structural diagram of a genotype comparison device based on a fluorescence signal platform according to the present invention, and as shown in fig. 3, a genotype comparison device 300 based on a fluorescence signal platform includes: an acquisition module 301, a genotyping module 302, a first determination module 303, and a genotype comparison module 304; wherein,
An acquisition module 301, configured to acquire first fluorescent signal data of a first crop variety of a fluorescent signal platform at a plurality of loci of a genome and second fluorescent signal data of a second crop variety at a corresponding locus of the genome;
a genotyping module 302, configured to determine, for each location, a first genotyping result corresponding to the first crop variety and a second genotyping result corresponding to the second crop variety based on the first fluorescent signal data and the second fluorescent signal data;
a first determining module 303, configured to determine, based on the first fluorescent signal data and the second fluorescent signal data, an included angle, a euclidean distance, and a distance difference between the first fluorescent signal data and the second fluorescent signal data, when neither the first genotyping result nor the second genotyping result is missing data;
and a genotype comparison module 304, configured to determine genotype comparison results of the first crop variety and the second crop variety at each of the loci based on the included angle, the euclidean distance, and the distance difference.
The genotype comparison device based on the fluorescence signal platform provided by the invention is characterized in that first fluorescence signal data of a first crop variety of the fluorescence signal platform at a plurality of loci of a genome and second fluorescence signal data of a second crop variety at corresponding loci of the genome are obtained; for each site, respectively determining a first genotyping result corresponding to the first crop variety and a second genotyping result corresponding to the second crop variety according to the first fluorescence signal data and the second fluorescence signal data; under the condition that the first genotyping result and the second genotyping result are homozygous genotyping, determining an included angle, a Euclidean distance and a distance difference between the first fluorescence signal data and the second fluorescence signal data according to the first fluorescence signal data and the second fluorescence signal data; and determining genotype comparison results of the first crop variety and the second crop variety at each site based on the included angle, the Euclidean distance and the distance difference. Based on the included angle, the Euclidean distance and the distance difference value determined by the first fluorescent signal data and the second fluorescent signal data, genotype comparison of each site can be accurately judged, the difference between the first crop variety and the second crop variety is judged from the source of the fluorescent signal data, and the accuracy of the judging result of the sample to be tested is improved.
Optionally, the genotype comparison module 304 is specifically configured to:
comparing the included angle with a first preset threshold value;
under the condition that the included angle is larger than the first preset threshold value, determining that genotype comparison results of the first crop variety and the second crop variety at all the sites are different;
and under the condition that the included angle is smaller than or equal to the first preset threshold value, determining that the genotype comparison results of the first crop variety and the second crop variety at all the sites are the same based on the Euclidean distance and the distance difference value.
Optionally, the genotype comparison module 304 is specifically configured to:
comparing the Euclidean distance with a second preset threshold value;
under the condition that the Euclidean distance is smaller than or equal to the second preset threshold value, determining that the genotype comparison results of the first crop variety and the second crop variety at all the sites are the same;
and under the condition that the Euclidean distance is larger than the second preset threshold value, determining that the genotype comparison results of the first crop variety and the second crop variety at all the sites are the same based on the distance difference value.
Optionally, the genotype comparison module 304 is specifically configured to:
comparing the distance difference value with a third preset threshold value;
and under the condition that the distance difference value is smaller than or equal to the third preset threshold value, determining that genotype comparison results of the first crop variety and the second crop variety at all the sites are the same.
Optionally, the genotype comparison device 300 based on the fluorescence signal platform further includes:
and the second determining module is used for determining that the genotype comparison results of the first crop variety and the second crop variety at each locus are different under the condition that the distance difference value is larger than the third preset threshold value.
Optionally, the first determining module 303 is specifically configured to:
calculating the included angle between the first fluorescence signal data and the second fluorescence signal data based on a first vector corresponding to the first fluorescence signal data and a second vector corresponding to the second fluorescence signal data;
and calculating the Euclidean distance and the distance difference between the first fluorescence signal data and the second fluorescence signal data respectively based on the first fluorescence signal data and the second fluorescence signal data.
Optionally, the genotyping module 302 is specifically configured to:
respectively judging whether the first fluorescent signal data and the second fluorescent signal data meet a first preset condition or not; the first preset condition is used for judging whether the first fluorescent signal data and the second fluorescent signal data are missing or not;
and under the condition that the first fluorescent signal data and the second fluorescent signal data meet the first preset condition, determining that a first genotyping result corresponding to the first crop variety and a second genotyping result corresponding to the second crop variety are both missing data.
Optionally, the genotyping module 302 is specifically configured to:
respectively judging whether the first fluorescent signal data and the second fluorescent signal data meet a second preset condition or not; the second preset condition is used for judging whether genotype typing results corresponding to the first fluorescent signal data and the second fluorescent signal data are homozygous genotype typing or not;
and under the condition that the first fluorescent signal data and the second fluorescent signal data meet the second preset condition, determining that a first genotyping result corresponding to the first crop variety and a second genotyping result corresponding to the second crop variety are homozygous genotyping.
Fig. 4 is a schematic physical structure of an electronic device according to the present invention, as shown in fig. 4, the electronic device may include: processor 410, communication interface (Communications Interface) 420, memory 430 and communication bus 440, wherein processor 410, communication interface 420 and memory 430 communicate with each other via communication bus 440. The processor 410 may invoke logic instructions in the memory 430 to perform a fluorescence signal platform based genotype comparison method that includes: acquiring first fluorescence signal data of a first crop variety of a fluorescence signal platform at a plurality of sites of a genome and second fluorescence signal data of a second crop variety at corresponding sites of the genome; for each site, respectively determining a first genotyping result corresponding to the first crop variety and a second genotyping result corresponding to the second crop variety based on the first fluorescent signal data and the second fluorescent signal data; determining an included angle, a euclidean distance and a distance difference between the first fluorescent signal data and the second fluorescent signal data based on the first fluorescent signal data and the second fluorescent signal data under the condition that neither the first genotyping result nor the second genotyping result is missing data; and determining genotype comparison results of the first crop variety and the second crop variety at each locus based on the included angle, the Euclidean distance and the distance difference.
Further, the logic instructions in the memory 430 described above may be implemented in the form of software functional units and, when sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and comprises several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
In yet another aspect, the present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, is implemented to perform the fluorescence signal platform-based genotype comparison method provided by the above methods, the method comprising: acquiring first fluorescence signal data of a first crop variety of a fluorescence signal platform at a plurality of sites of a genome and second fluorescence signal data of a second crop variety at corresponding sites of the genome; for each site, respectively determining a first genotyping result corresponding to the first crop variety and a second genotyping result corresponding to the second crop variety based on the first fluorescent signal data and the second fluorescent signal data; determining an included angle, a euclidean distance and a distance difference between the first fluorescent signal data and the second fluorescent signal data based on the first fluorescent signal data and the second fluorescent signal data under the condition that neither the first genotyping result nor the second genotyping result is missing data; and determining genotype comparison results of the first crop variety and the second crop variety at each locus based on the included angle, the Euclidean distance and the distance difference.
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the invention without creative effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (5)

1. A genotype comparison method based on a fluorescence signal platform, characterized by comprising the following steps:
acquiring first fluorescence signal data of a first crop variety of a fluorescence signal platform at a plurality of sites of a genome and second fluorescence signal data of a second crop variety at corresponding sites of the genome;
for each site, respectively determining a first genotyping result corresponding to the first crop variety and a second genotyping result corresponding to the second crop variety based on the first fluorescent signal data and the second fluorescent signal data;
in the case that neither the first genotyping result nor the second genotyping result is missing data and both are the same homozygous genotype, determining an included angle, a Euclidean distance and a distance difference between the first fluorescent signal data and the second fluorescent signal data based on the first fluorescent signal data and the second fluorescent signal data;
determining genotype comparison results of the first crop variety and the second crop variety at each site based on the included angle, the Euclidean distance and the distance difference;
the determining, based on the first fluorescent signal data and the second fluorescent signal data, a first genotyping result corresponding to the first crop variety and a second genotyping result corresponding to the second crop variety, respectively, includes:
respectively judging whether the first fluorescent signal data and the second fluorescent signal data meet a second preset condition; the second preset condition is used for judging whether the genotyping results corresponding to the first fluorescent signal data and the second fluorescent signal data are homozygous genotypes; the second preset condition is expressed as: SignalA >= 2 × SignalB or SignalB >= 2 × SignalA; the first fluorescent signal data comprises SignalA and SignalB; the second fluorescent signal data comprises SignalA and SignalB;
in the case that the first fluorescent signal data and the second fluorescent signal data meet the second preset condition, determining that the first genotyping result corresponding to the first crop variety and the second genotyping result corresponding to the second crop variety are homozygous genotypes;
the determining the genotype comparison result of the first crop variety and the second crop variety at each site based on the included angle, the Euclidean distance and the distance difference value comprises the following steps:
comparing the included angle with a first preset threshold value;
under the condition that the included angle is larger than the first preset threshold value, determining that genotype comparison results of the first crop variety and the second crop variety at all the sites are different;
determining that the genotype comparison results of the first crop variety and the second crop variety at each locus are the same based on the Euclidean distance and the distance difference value under the condition that the included angle is smaller than or equal to the first preset threshold value;
the determining that the genotype comparison results of the first crop variety and the second crop variety at each of the sites are the same based on the euclidean distance and the distance difference value comprises:
comparing the Euclidean distance with a second preset threshold value;
under the condition that the Euclidean distance is smaller than or equal to the second preset threshold value, determining that the genotype comparison results of the first crop variety and the second crop variety at all the sites are the same;
determining that the genotype comparison results of the first crop variety and the second crop variety at each site are the same based on the distance difference value under the condition that the Euclidean distance is larger than the second preset threshold value;
the determining that the genotype comparison result of the first crop variety and the second crop variety at each of the sites is the same based on the distance difference value comprises:
comparing the distance difference value with a third preset threshold value;
under the condition that the distance difference value is smaller than or equal to the third preset threshold value, determining that genotype comparison results of the first crop variety and the second crop variety at all the sites are the same; the distance difference value is calculated as [formula not reproduced], where [symbol] denotes the data obtained by log-transforming the first fluorescent signal data and [symbol] denotes the data obtained by log-transforming the second fluorescent signal data, and both are calculated as [formula not reproduced];
the determining, based on the first fluorescence signal data and the second fluorescence signal data, an included angle, a euclidean distance, and a distance difference between the first fluorescence signal data and the second fluorescence signal data, includes:
calculating the included angle between the first fluorescence signal data and the second fluorescence signal data based on a first vector corresponding to the first fluorescence signal data and a second vector corresponding to the second fluorescence signal data;
and calculating the Euclidean distance and the distance difference between the first fluorescence signal data and the second fluorescence signal data respectively based on the first fluorescence signal data and the second fluorescence signal data.
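The second preset condition of claim 1 can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: the labels "AA" and "BB" and the function names `homozygous_call` and `same_homozygous` are hypothetical, and missing-data screening is assumed to have been done beforehand.

```python
def homozygous_call(signal_a, signal_b):
    """Apply SignalA >= 2 * SignalB or SignalB >= 2 * SignalA.

    Returns "AA" or "BB" (hypothetical labels for the two homozygous
    classes) or None when neither inequality holds.
    Assumption: missing data has already been filtered out, so the
    signals are not both zero.
    """
    if signal_a >= 2 * signal_b:
        return "AA"
    if signal_b >= 2 * signal_a:
        return "BB"
    return None


def same_homozygous(first, second):
    """True when both samples receive the same homozygous call."""
    call = homozygous_call(*first)
    return call is not None and call == homozygous_call(*second)
```

For example, `same_homozygous((1000, 200), (900, 100))` is `True`, so that pair would proceed to the angle and distance checks, whereas a pair dominated by opposite signals would not.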
2. The fluorescence signal platform-based genotype comparison method of claim 1, further comprising:
and under the condition that the distance difference value is larger than the third preset threshold value, determining that genotype comparison results of the first crop variety and the second crop variety at all the sites are different.
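Taken together, claims 1 and 2 describe a three-stage threshold cascade. The Python sketch below restates that cascade; the function name `compare_site`, the parameter names and the return strings are hypothetical, the three metrics are taken as already computed, and no threshold values are implied because the claims do not state any.

```python
def compare_site(angle, euclid, dist_diff, angle_max, euclid_max, diff_max):
    """Return "same" or "different" for one site.

    angle_max, euclid_max and diff_max stand for the first, second and
    third preset thresholds; their values are left to the caller.
    """
    # Stage 1: an angle above the first threshold means different.
    if angle > angle_max:
        return "different"
    # Stage 2: a Euclidean distance within the second threshold means same.
    if euclid <= euclid_max:
        return "same"
    # Stage 3: otherwise the distance difference decides (claims 1 and 2).
    return "same" if dist_diff <= diff_max else "different"
```

The ordering matters: the distance difference is consulted only when the angle passes but the Euclidean distance fails, so a large distance difference does not affect a site that already passed stage 2.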
3. The fluorescence signal platform-based genotype comparison method of claim 1 or 2, wherein the determining the first genotyping result corresponding to the first crop variety and the second genotyping result corresponding to the second crop variety based on the first fluorescent signal data and the second fluorescent signal data, respectively, comprises:
respectively judging whether the first fluorescent signal data and the second fluorescent signal data meet a first preset condition or not; the first preset condition is used for judging whether the first fluorescent signal data and the second fluorescent signal data are missing or not;
and under the condition that the first fluorescent signal data and the second fluorescent signal data meet the first preset condition, determining that a first genotyping result corresponding to the first crop variety and a second genotyping result corresponding to the second crop variety are both missing data.
4. A genotype comparison device based on a fluorescence signal platform, characterized by comprising:
the acquisition module is used for acquiring first fluorescent signal data of a first crop variety of the fluorescent signal platform at a plurality of sites of a genome and second fluorescent signal data of a second crop variety at corresponding sites of the genome;
the genotyping module is used for determining, for each site, a first genotyping result corresponding to the first crop variety and a second genotyping result corresponding to the second crop variety based on the first fluorescence signal data and the second fluorescence signal data, respectively;
a first determining module, configured to determine, based on the first fluorescent signal data and the second fluorescent signal data, an included angle, a Euclidean distance, and a distance difference between the first fluorescent signal data and the second fluorescent signal data, in the case that neither the first genotyping result nor the second genotyping result is missing data and both are the same homozygous genotype;
the genotype comparison module is used for determining genotype comparison results of the first crop variety and the second crop variety at each site based on the included angle, the Euclidean distance and the distance difference value;
the genotyping module is specifically used for:
respectively judging whether the first fluorescent signal data and the second fluorescent signal data meet a second preset condition; the second preset condition is used for judging whether the genotyping results corresponding to the first fluorescent signal data and the second fluorescent signal data are homozygous genotypes; the second preset condition is expressed as: SignalA >= 2 × SignalB or SignalB >= 2 × SignalA; the first fluorescent signal data comprises SignalA and SignalB; the second fluorescent signal data comprises SignalA and SignalB;
in the case that the first fluorescent signal data and the second fluorescent signal data meet the second preset condition, determining that the first genotyping result corresponding to the first crop variety and the second genotyping result corresponding to the second crop variety are homozygous genotypes;
the genotype comparison module is specifically used for:
comparing the included angle with a first preset threshold value;
under the condition that the included angle is larger than the first preset threshold value, determining that genotype comparison results of the first crop variety and the second crop variety at all the sites are different;
comparing the Euclidean distance with a second preset threshold value under the condition that the included angle is smaller than or equal to the first preset threshold value;
under the condition that the Euclidean distance is smaller than or equal to the second preset threshold value, determining that the genotype comparison results of the first crop variety and the second crop variety at all the sites are the same;
comparing the distance difference with a third preset threshold value under the condition that the Euclidean distance is larger than the second preset threshold value;
under the condition that the distance difference value is smaller than or equal to the third preset threshold value, determining that genotype comparison results of the first crop variety and the second crop variety at all the sites are the same; the distance difference value is calculated as [formula not reproduced], where [symbol] denotes the data obtained by log-transforming the first fluorescent signal data and [symbol] denotes the data obtained by log-transforming the second fluorescent signal data, and both are calculated as [formula not reproduced];
the first determining module is specifically configured to:
calculating the included angle between the first fluorescence signal data and the second fluorescence signal data based on a first vector corresponding to the first fluorescence signal data and a second vector corresponding to the second fluorescence signal data;
and calculating the Euclidean distance and the distance difference between the first fluorescence signal data and the second fluorescence signal data respectively based on the first fluorescence signal data and the second fluorescence signal data.
5. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the fluorescence signal platform-based genotype comparison method of any one of claims 1-3.
CN202310809269.3A 2023-07-04 2023-07-04 Genotype comparison method and device based on fluorescent signal platform Active CN116543837B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310809269.3A CN116543837B (en) 2023-07-04 2023-07-04 Genotype comparison method and device based on fluorescent signal platform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310809269.3A CN116543837B (en) 2023-07-04 2023-07-04 Genotype comparison method and device based on fluorescent signal platform

Publications (2)

Publication Number Publication Date
CN116543837A (en) 2023-08-04
CN116543837B (en) 2024-01-26

Family

ID=87452758

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310809269.3A Active CN116543837B (en) 2023-07-04 2023-07-04 Genotype comparison method and device based on fluorescent signal platform

Country Status (1)

Country Link
CN (1) CN116543837B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111944884A (en) * 2020-08-24 2020-11-17 北京诺赛基因组研究中心有限公司 Method for typing SNP sites of sample based on KASP technology
CN113604596A (en) * 2021-08-10 2021-11-05 河北省农林科学院经济作物研究所 KASP primer for detecting cucumber small zucchini yellow mosaic virus disease resistance gene zym and application thereof
CN114480721A (en) * 2022-03-18 2022-05-13 北京市农林科学院 Method for identifying whether melon variety to be detected is thin-skin melon or thick-skin melon and special SNP primer combination thereof
WO2023056451A1 (en) * 2021-09-30 2023-04-06 Mammoth Biosciences, Inc. Compositions and methods for assaying for and genotyping genetic variations

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20010046673A1 (en) * 1999-03-16 2001-11-29 Ljl Biosystems, Inc. Methods and apparatus for detecting nucleic acid polymorphisms

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
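Methods and applications of estimating the genomic breed composition of individual animals from SNP chip data; He Jun et al.; Hereditas (Beijing); Vol. 40, No. 4; pp. 305-312 *
Development and application of KASP molecular markers for high selenium content in synthetic wheat SHW-L1; Wei Guanghui et al.; Scientia Agricultura Sinica; Vol. 53, No. 20; pp. 4103-4110 *
Determination of SNP core primers for maize hybrid purity identification and establishment of a high-throughput detection scheme; Wang Rui; Acta Agronomica Sinica; Vol. 47, No. 4; full text *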

Also Published As

Publication number Publication date
CN116543837A (en) 2023-08-04

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant