Disclosure of Invention
In view of the shortcomings of the prior art, the invention provides a single-nucleotide variant site, rs533280354, on the FKBP5 gene that is specific to Asian populations. Experimental verification shows that this variant site is significantly associated with high myopia, so it can be used for early screening and diagnosis of high myopia and has important clinical application value.
The above object of the present invention is achieved by the following technical solutions:
the first aspect of the invention provides application of a reagent for detecting the SNP locus of the FKBP5 gene in a sample of a subject in preparing a product for diagnosing high myopia susceptibility.
Further, the SNP locus is rs533280354;
further, the SNP locus rs533280354 in a subject susceptible to high myopia is mutated from G to A.
Further, when the rs533280354 G>A variant is present, the subject is at risk of or suffering from high myopia.
Further, the reagent includes a primer pair for specifically amplifying the nucleotide sequence of the SNP site.
Further, the reagent also comprises reagents used in methods for detecting the SNP site genotype by utilizing mass spectrometry, a DNA microarray method, a sequencing method, allele-specific probe hybridization, restriction fragment analysis, oligonucleotide ligation detection, single-strand conformation polymorphism analysis and allele-specific amplification.
Further, the reagent also includes a nucleic acid affinity ligand or probe that specifically binds to the SNP site.
Further, the product comprises a kit, a chip or test paper.
Further, as an alternative embodiment, the kit may further include a genomic DNA extraction reagent, a PCR reaction system reagent, a DHPLC-related reagent, or a protein extraction reagent;
preferably, the PCR reaction system reagent comprises dNTPs, MgCl2, Taq DNA polymerase, PCR reaction buffer and deionized water;
preferably, the kit further comprises an enzyme digestion conventional component, including reaction buffer and deionized water.
Further, the kit comprises suitable containers, which typically comprise at least one vial, test tube, flask, pet bottle, syringe or other container in which one component can be placed and, preferably, suitably aliquoted. Where more than one component is present in the kit, the kit will also typically comprise a second, third or other additional container in which the additional components are separately disposed. However, different combinations of components may be contained in one vial. The kit of the invention will also typically include a container for holding the reactants, sealed for commercial sale. Such containers may include injection molded or blow molded plastic containers in which the desired vials may be retained.
Further, as an alternative embodiment, the chip includes: a solid phase carrier and oligonucleotide probes fixed in an ordered arrangement on the solid phase carrier, wherein the oligonucleotide probes specifically correspond to the sequence of rs533280354;
preferably, the solid phase carrier can be made of various materials commonly used in the field of gene chips, such as but not limited to plastic products, microparticles and membrane carriers. The plastic products can bind antibodies or protein antigens through non-covalent or physical adsorption; the most common plastic products are small test tubes, small beads and microplates made of polystyrene. The microparticles are microspheres or particles polymerized from high-molecular monomers, typically with a diameter on the micrometer scale; because they carry functional groups capable of binding proteins, they readily couple chemically with antibodies (antigens) and have a large binding capacity. The membrane carrier comprises microporous filter membranes such as nitrocellulose membranes, glass cellulose membranes and nylon membranes.
Furthermore, the oligonucleotide probe is an oligonucleotide probe aiming at the SNP site rs533280354 of the FKBP5 gene, and can be DNA, RNA, a DNA-RNA chimera, PNA or other derivatives. The length of the probe is not limited as long as specific hybridization is achieved and specific binding to the target nucleotide sequence is achieved.
Further, the high myopia is considered to be high myopia in asian population.
Furthermore, the sequencing method is a direct sequencing method. Direct sequencing is the most direct and reliable method for detecting SNP sites, with a detection rate of up to 100%; representative technologies include pyrosequencing, TaqMan technology, minisequencing (SNaPshot) and the like. The method detects SNPs by comparing the sequencing results of PCR amplification products of the same gene or gene fragment in different samples, or by re-sequencing analysis of mapped sequence-tagged sites (STS) and expressed sequence tags (EST). The PCR product can be purified and recovered and then ligated into a vector for sequencing, or it can be sequenced directly. By aligning the sequences, the mutation type and position of the SNP can be accurately detected.
Further, single-strand conformation polymorphism (SSCP) refers to differences in the spatial conformation of single-stranded DNA caused by differences in base sequence. These differences give single-stranded DNAs of the same or similar length different electrophoretic mobilities, so they can be detected effectively by non-denaturing polyacrylamide gel electrophoresis. PCR-SSCP applies SSCP to the detection of gene mutations in PCR amplification products: the PCR-amplified double-stranded DNA fragments are unwound by a denaturing agent or by high-temperature treatment and kept in the single-stranded state, and are then subjected to non-denaturing polyacrylamide gel electrophoresis. Currently, the PCR-SSCP technique is widely used in various fields of molecular biology.
Further, the principle of allele-specific amplification (allele-specific PCR, AS-PCR) is as follows: because a single-base mismatch at the 3' end of the primer cannot be repaired by Taq DNA polymerase, the amplification reaction proceeds when the base at the 3' end of the primer is complementary to the allele at the SNP locus, and does not proceed when it is not. Several improved methods based on AS-PCR have since appeared, such as tetra-primer amplification refractory mutation system PCR (tetra-primer ARMS-PCR), fragment length difference allele-specific PCR (FLDAS-PCR), and PCR amplification of multiple specific alleles (PAMSA).
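The 3'-end matching rule of AS-PCR can be illustrated with a minimal sketch. It is an idealised model only: the flank sequence and all function names are hypothetical and are not taken from the invention, and a real assay also depends on annealing temperature, primer design and polymerase choice.

```python
# Idealised AS-PCR rule: a primer is extended only when its 3'-terminal base
# matches the allele present in the template.

# Hypothetical 5' flank used only for illustration; it is NOT the real
# sequence upstream of rs533280354.
HYPOTHETICAL_FLANK = "ACGTTGCAAGGCTCCGGAC"


def allele_specific_primer(flank: str, allele: str) -> str:
    """Return a forward primer whose 3'-terminal base is the tested allele."""
    return flank + allele


def amplifies(primer: str, template_allele: str) -> bool:
    """True when the primer's 3' base matches the allele on the same strand."""
    return primer[-1] == template_allele


def as_pcr_genotype(sample_alleles: tuple) -> str:
    """Infer a genotype from which allele-specific reactions give a product."""
    g_primer = allele_specific_primer(HYPOTHETICAL_FLANK, "G")
    a_primer = allele_specific_primer(HYPOTHETICAL_FLANK, "A")
    g_product = any(amplifies(g_primer, allele) for allele in sample_alleles)
    a_product = any(amplifies(a_primer, allele) for allele in sample_alleles)
    if g_product and a_product:
        return "GA"
    if a_product:
        return "AA"
    if g_product:
        return "GG"
    return "no call"
```

In this sketch, a heterozygous sample yields a product in both the G-specific and the A-specific reaction, while a homozygous sample yields a product in only one of them.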
Further, the subject sample is selected from blood or tissue of the subject;
preferably, the subject sample is selected from the group consisting of tissue of a subject;
more preferably, the subject sample is selected from the group consisting of exfoliated cells of the oral cavity of the subject.
In a second aspect the invention provides a product for diagnosing high myopia susceptibility.
Further, the product comprises a reagent for detecting the SNP locus genotype of the FKBP5 gene;
preferably, the SNP locus of the FKBP5 gene is rs533280354;
preferably, the SNP locus rs533280354 in the high myopia-susceptible patient is mutated from G to A;
preferably, the high myopia is asian population high myopia;
preferably, the reagent for detecting the SNP site genotype of the FKBP5 gene comprises reagents used in a method for detecting the SNP site genotype by utilizing mass spectrometry, a DNA microarray method, a sequencing method, allele-specific probe hybridization, restriction fragment analysis, oligonucleotide ligation detection, single-strand conformation polymorphism analysis and allele-specific amplification.
More preferably, the reagent for detecting the genotype of the SNP site of the FKBP5 gene comprises a primer pair for specifically amplifying the nucleotide sequence of the SNP site;
preferably, the product comprises a kit, chip or strip.
The third aspect of the invention provides application of the SNP locus of the FKBP5 gene in constructing a calculation model for predicting the susceptibility to high myopia.
Preferably, the computational model has the genotype of rs533280354 as an input variable;
preferably, the SNP locus rs533280354 in the high myopia-susceptible patient is mutated from G to A;
preferably, the high myopia is considered to be high myopia in asian populations.
Further, in some embodiments, the computational model may also take as input variables other markers associated with high myopia.
Further, when the rs533280354 G>A variant is present, the subject is at risk of or suffering from high myopia.
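As a minimal sketch of how a genotype could enter such a computational model as an input variable, the genotype may be encoded as the number of A alleles and passed to a logistic function. The additive encoding, the logistic form and the intercept and coefficient parameters are all assumptions made for illustration; the invention does not specify a model form or fitted values.

```python
import math


def a_allele_dosage(genotype: str) -> int:
    """Count copies of the A allele in a two-letter genotype such as 'GA'."""
    genotype = genotype.upper()
    if len(genotype) != 2 or set(genotype) - {"G", "A"}:
        raise ValueError(f"unexpected genotype: {genotype!r}")
    return genotype.count("A")


def predicted_probability(dosage: int, intercept: float, beta: float) -> float:
    """Logistic model; intercept and beta are placeholders, not fitted values."""
    return 1.0 / (1.0 + math.exp(-(intercept + beta * dosage)))
```

Other markers associated with high myopia could be added as further terms in the linear predictor in the same way.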
The fourth aspect of the invention provides application of the SNP locus of the FKBP5 gene in constructing a biological sample screening system with high myopia susceptibility.
Further, the SNP locus is rs533280354;
preferably, the SNP locus rs533280354 in the biological sample susceptible to high myopia is mutated from G to A;
preferably, the high myopia is considered to be high myopia in asian populations.
In a fifth aspect of the invention, a biological sample screening system susceptible to high myopia is provided.
Further, the system comprises:
(1) a nucleic acid extraction device for extracting a nucleic acid sample from the biological sample;
(2) the nucleic acid sequence determining device is connected with the nucleic acid extracting device and is used for analyzing the nucleic acid sample and determining the nucleic acid sequence of the nucleic acid sample;
(3) the result judging device is connected with the nucleic acid sequence determining device, and is used for judging whether the mutation of the SNP locus rs533280354 from G to A exists in the biological sample or not based on the comparison of the nucleic acid sequence of the nucleic acid sample and the wild type FKBP5 gene sequence, so as to judge whether the biological sample is a biological sample susceptible to high myopia or not;
preferably, the biological sample is subject-derived exfoliated oral cells;
preferably, the nucleic acid sample is DNA;
preferably, if a mutation at SNP site rs533280354 from G to a is present in the biological sample derived from the subject, the biological sample is a biological sample susceptible to high myopia.
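The comparison performed by the result judging device can be sketched as follows. This is a simplified illustration under stated assumptions: the reads are taken to be already aligned to the wild-type sequence and to share its coordinates, and the function name and return fields are hypothetical. A practical system would also apply read-depth and quality thresholds before making a call.

```python
from collections import Counter


def judge_sample(wild_type: str, reads: list, snp_index: int) -> dict:
    """Compare aligned reads with the wild-type sequence at the SNP position."""
    if wild_type[snp_index] != "G":
        raise ValueError("wild-type base at the SNP position is expected to be G")
    # Tally the base observed at the SNP position in every read.
    counts = Counter(read[snp_index] for read in reads)
    g_to_a_present = counts["A"] > 0
    return {
        "base_counts": dict(counts),
        "g_to_a_present": g_to_a_present,
        # Per the text, presence of the G-to-A mutation marks the sample
        # as susceptible to high myopia.
        "susceptible": g_to_a_present,
    }
```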
In addition, the invention also provides a method for diagnosing the susceptibility of the subject to the high myopia.
Further, the method comprises the steps of:
(1) extracting nucleic acids from a sample of a subject;
(2) contacting nucleic acid from a sample from a subject with an agent that detects the rs533280354 genotype;
(3) determining the genotype of rs533280354;
(4) determining whether the subject is at risk of or suffering from high myopia based on the genotype.
Further, the subject is an asian population.
Further, the Asian population is Chinese.
Further, the sample comprises exfoliated oral cells, preferably obtained from the subject by conventional oral swab sampling.
To further explain the present invention, the terms of art referred to in the present invention are explained as follows:
the term "SNP (single nucleotide polymorphism)" refers to a single base position in DNA at which different alleles, or alternative nucleotides, exist in a population. The SNP position is typically preceded and followed by highly conserved sequences (e.g., sequences that differ in fewer than 1/100 or 1/1000 of the members of the population). Individuals may be homozygous or heterozygous for the allele at each SNP position. The SNP sites of the invention are named with the prefix "rs", and based on this rs designation a person skilled in the art can determine their exact position and nucleotide sequence from a suitable database and related information systems, such as the single nucleotide polymorphism database (dbSNP). In a specific embodiment of the invention, the SNP site is rs533280354 on the FKBP5 gene; detailed information on rs533280354 is available from NCBI at https://www.ncbi.nlm.nih.gov/SNP/?term=rs533280354.
The term "sample" refers to any sample derived from any suitable portion of a subject. As a non-limiting example, the sample may be derived from a pure tissue or organ or cell type. In other embodiments of the invention, the sample may be derived from a bodily fluid, such as from saliva, cerebrospinal fluid, blood, serum, sputum, mucosal scraping, tissue biopsy, tear secretion, semen, sweat, and the like. Particularly preferred is the use of a blood sample comprising cells containing DNA, such as immature red blood cells, red blood cell precursor cells, white blood cells and the like. The sample used in the context of the present invention should preferably be collected in a clinically acceptable manner, more preferably in a manner that retains nucleic acids or proteins. In the detection step of the present invention, the target to be detected is a nucleic acid (preferably DNA) or a protein, and in the embodiment of the present invention, the sample is preferably an exfoliated oral cell, and more preferably an exfoliated oral cell derived from a subject is collected by a conventional oral swab sampling method.
The term "allele" refers to one of a pair or series of forms of a gene or non-genic region present at a given locus on a chromosome. In a normal diploid cell, there are two alleles of any given gene (one from each parent), which occupy the same relative position (locus) on the homologous chromosomes. In a population, more than two alleles may be present for a gene. SNPs also have alleles, i.e., the two (or more) nucleotides that characterize the SNP.
The term "genotype" refers to the identity of the alleles present in an individual or sample. Typically, it refers to the genotype of the individual associated with a particular gene of interest; in a polyploid individual, it refers to what combination of alleles of a gene are carried by the individual.
The term "primer" refers to a naturally occurring oligonucleotide (e.g., a restriction fragment) or a synthetically produced oligonucleotide that is capable of serving as a point of initiation of synthesis of a primer extension product that is complementary to a nucleic acid strand (template or target sequence) when subjected to appropriate conditions (e.g., buffer, salt, temperature, and pH) and in the presence of nucleotides and an agent for nucleic acid polymerization (e.g., a DNA-dependent or RNA-dependent polymerase).
The term "nucleic acid affinity ligand for a SNP site" refers to a nucleic acid molecule capable of binding to a SNP site described herein or to sequences in its vicinity. By way of non-limiting example, it may be an RNA, DNA, PNA, CNA, HNA, LNA or ANA molecule or any other suitable form of nucleic acid known to those skilled in the art.
The term "probe" refers to a nucleic acid molecule that can have any suitable length, for example 15, 20, 30, 40, 50, 100, 150, 200, 300, 500, 1000, or more than 1000 nucleotides. Preferably, however, it is no less than 10 nucleotides and no more than about 80 nucleotides in length; in some embodiments, the probe is about 20 to 60 nucleotides in length; in other embodiments, the probe is about 20 to 40 nucleotides in length. The probe may also be suitably modified, for example by the addition of a label such as a fluorescent label, a dye or a radioactive label. The probes of the invention may be labeled by standard labeling techniques, such as radiolabels, enzymatic labels, fluorescent labels, biotin-avidin labels and chemiluminescent labels, according to methods known to those skilled in the art. After hybridization, the probe can be detected using known methods.
Compared with the prior art, the invention has the advantages and beneficial effects that:
(1) based on a 10,000-person cohort study of the Chinese high myopia population, the invention identifies for the first time a variant site for early screening and diagnosis of high myopia, namely the single-nucleotide variant site rs533280354 on the FKBP5 gene, and experimental verification of this site further confirms its significant association with high myopia;
(2) the invention firstly clarifies the correlation between the single-point variation site rs533280354 on the FKBP5 gene and the high myopia, provides a method for predicting the susceptibility of the high myopia for the field, can be used for early screening, auxiliary diagnosis and treatment of the high myopia, and can also be used for developing medicaments for preventing, delaying or treating the high myopia, thereby having important scientific research and clinical application values.
Detailed Description
The present invention is further illustrated below with reference to specific examples, which are intended to be illustrative only and are not to be construed as limiting the invention. As will be understood by those of ordinary skill in the art: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents. The experimental methods used in the following examples are all conventional methods unless otherwise specified; reagents, biomaterials, etc. used in the following examples are commercially available unless otherwise specified.
Example: Screening and experimental validation of SNP sites associated with susceptibility to high myopia
1. Study object
A total of 10,000 individuals with high myopia and 10,000 controls (individuals with normal vision) were recruited from high schools, middle schools and primary schools in the districts and counties of Wenzhou, and buccal swab samples were taken from all 20,000 participants. All subjects signed informed consent after the ethics committee approved the study. The inclusion and exclusion criteria were as follows:
inclusion criteria: primary and secondary school students aged 7-18 years, and some of their parents, with or without high myopia;
exclusion criteria: subjects whose personal information was missing or could not be matched, or who had undergone eye surgery or orthokeratology (OK lens) correction.
2. Extraction of genetic material DNA from oral swab samples
The oral swab DNA extraction kit is purchased from Fujian division of Beijing Beirui and kang biotech Co., Ltd, and has a product number of Zhejiang mechanical equipment 20150116. The extraction of the buccal swab sample DNA was carried out according to the procedures described in the kit instructions, and the specific extraction procedure was as follows:
(1) opening the outer package of the oral swab and taking out the swab;
(2) having the subject open the mouth and keep it open, and inserting the buccal swab into the subject's mouth;
(3) placing the buccal swab against the inner side of each cheek of the subject and scraping each side ten times until a sufficient amount of oral mucosal cells is collected;
(4) taking the oral swab out of the mouth of the subject, placing the oral swab in a preservation solution of a sample bottle, and screwing down the sample bottle;
(5) shaking up the preservation solution for about 10 seconds, putting the bar code back into the sample bag and sealing after confirming that the bar code is stuck;
(6) performing enzymatic lysis; after cell lysis, DNA was isolated and purified from the cell fraction using a Qiagen DNA extraction kit;
(7) after DNA extraction, the DNA samples are stored at low temperature and sent to Berry sequencing company for whole exome sequencing, and high-quality individual exome sequence information is obtained.
3. Quality control
This experiment first retained variants that passed GATK VQSR (Variant Quality Score Recalibration) and that lay outside low-complexity regions. Genotypes with genotype depth (DP) <10 or genotype quality (GQ) <20, and heterozygous genotypes with allele balance >0.8 or <0.2, were set to missing. This experiment also excluded variant sites with missing-genotype rates >1, case-control missing-genotype rates <0.995, or P values <10⁻⁶ in the Hardy-Weinberg equilibrium (HWE) test based on the combined case-control cohort. Furthermore, samples were excluded if their mean call rate was low (<0.9), their mean sequencing depth was low (<10), or their mean genotype quality was low (<65). Finally, samples that were outliers (more than 4 SD from the mean) for the transition/transversion ratio, the heterozygous/homozygous ratio or the insertion/deletion ratio were excluded from each cohort.
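The genotype-level thresholds stated above (DP, GQ and allele balance) can be sketched as a simple filter. The function name, the "0/1"-style genotype strings and the definition of allele balance as the fraction of reads supporting the alternate allele are assumptions made for illustration; this is not the pipeline actually used in the experiment.

```python
from typing import Optional


def qc_genotype(gt: str, dp: int, gq: int, alt_fraction: float) -> Optional[str]:
    """Return the genotype, or None when it should be set to missing."""
    # Low depth or low genotype quality: set to missing.
    if dp < 10 or gq < 20:
        return None
    # Heterozygous calls with a skewed allele balance: set to missing.
    is_heterozygous = gt in ("0/1", "1/0")
    if is_heterozygous and (alt_fraction > 0.8 or alt_fraction < 0.2):
        return None
    return gt
```

The allele-balance check is applied only to heterozygous calls, since homozygous calls are expected to have an alternate-read fraction near 0 or 1.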
Samples with an X-chromosome inbreeding coefficient >0.8 were classified as male, and samples with an X-chromosome inbreeding coefficient <0.4 were classified as female. Samples with a coefficient between 0.4 and 0.8 were classified as having ambiguous sex and were excluded from the dataset.
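The sex-classification rule above reduces to two thresholds on the X-chromosome coefficient, as in the following minimal sketch (the function name is hypothetical, and the coefficient itself is assumed to have been computed upstream by a genotype analysis tool):

```python
def infer_sex(x_coefficient: float) -> str:
    """Classify a sample from its X-chromosome inbreeding coefficient."""
    if x_coefficient > 0.8:
        return "male"
    if x_coefficient < 0.4:
        return "female"
    # Values from 0.4 to 0.8 are ambiguous; such samples are excluded.
    return "ambiguous"
```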
To reduce confounding caused by differential call rates, a site-based filtering strategy was used in this experiment. A site in the targeted exonic sequence was excluded from the analysis if the absolute difference between the case group and the control group in the percentage of samples covered at that site was more than 0.007. Site-based filtering resulted in 2.42% of the bases of the target exon sequence being excluded from the respective analysis to alleviate problems associated with differential call rates. In addition, this experiment removed the 2.36% of target exon sequence bases that reached the genome-wide significance threshold (P<1×10⁻⁶) in the call-rate association test (two-sided P values from Fisher's exact test).
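The site-based filter can be sketched as a comparison of per-site coverage between cases and controls. The function and parameter names are hypothetical, and it is an assumption that both coverage values are expressed in the same units as the 0.007 threshold given in the text.

```python
def keep_site(case_covered: float, control_covered: float,
              max_abs_diff: float = 0.007) -> bool:
    """Keep a site only if case and control coverage differ by <= max_abs_diff."""
    return abs(case_covered - control_covered) <= max_abs_diff
```

Applied across all targeted sites, a filter of this kind removes positions where one group is systematically better covered than the other, which could otherwise create spurious frequency differences.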
4. Annotation of variant sites
The experiment used the Ensembl Variant Effect Predictor (VEP v.99) to annotate variants against the human genome build GRCh37. Protein-coding variants were classified into the following four categories: (1) synonymous variants; (2) benign missense variants; (3) damaging missense variants; (4) protein-truncating variants (PTV).
5. Relevance analysis
Two types of single-variant association analysis were performed in this experiment: an analysis of all samples (including related samples) using the (two-sided) MLMA, MLMA-LOCO, BOLT-LMM and EMMAX tests, and an analysis of unrelated samples using the (two-sided) Firth logistic regression test. The Firth analysis included the principal components of genetic ancestry (the top 10 PCs) as covariates. The MLMA-LOCO results were used for the association P values and the Firth results were used to estimate effect sizes. Because of population differentiation caused by genetic drift, test statistics obtained by linear regression may be inflated, so a correction method (e.g., genomic control) was employed to correct the inflated statistics. The exome-wide association study (ExWAS) first tested the association of each variant with high myopia regardless of allele frequency; a significance threshold of P<4.3×10⁻⁷ was applied to coding variants and P<5×10⁻⁸ to synonymous or non-coding variants. For variants with MAF>0.05, the study had approximately 80% power to detect a variant with a genotype relative risk of 1.23 under a multiplicative model. To estimate cross-population genetic effect correlations between Chinese participants in MAGIC and European participants in the UK Biobank, the experiment used Popcorn (v1.0), a summary-statistics-based method for estimating cross-population genetic correlations. Subsequently, meta-analysis was performed on the summary-level ExWAS results using an inverse-variance-weighted fixed-effect model with the METAL software.
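The inverse-variance-weighted fixed-effect model used in the meta-analysis step combines per-cohort effect estimates by weighting each with the reciprocal of its squared standard error. The following is a minimal sketch of that standard formula with a normal-approximation two-sided P value; it is not the METAL code itself, and the function name is hypothetical.

```python
import math


def ivw_fixed_effect(betas: list, standard_errors: list) -> tuple:
    """Return (pooled beta, pooled SE, Z score, two-sided P value)."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided P value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p_value
```

Under this scheme, cohorts with smaller standard errors contribute more to the pooled estimate, and the pooled standard error is never larger than that of the most precise cohort.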
6. Motif analysis
To detect sequence motifs associated with the significantly associated variant (rs533280354), this experiment performed motif analysis using the HOMER software on the 200 bp regions upstream and downstream of the variant site.
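Preparing the input regions for such a motif analysis amounts to extracting a window of 200 bp on each side of the variant and generating its reference and alternate versions. The sketch below illustrates this with hypothetical function names and 0-based indexing; it does not reproduce how HOMER was actually invoked.

```python
def flanking_window(sequence: str, snp_index: int, flank: int = 200) -> str:
    """Return the SNP base plus up to `flank` bases on each side."""
    start = max(0, snp_index - flank)
    end = min(len(sequence), snp_index + flank + 1)
    return sequence[start:end]


def with_alt_allele(sequence: str, snp_index: int, ref: str, alt: str) -> str:
    """Return a copy of the sequence carrying the alternate allele."""
    if sequence[snp_index] != ref:
        raise ValueError("reference allele does not match the sequence")
    return sequence[:snp_index] + alt + sequence[snp_index + 1:]
```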
7. The single G-A variation at the rs533280354 site disrupts the KLF15 binding motif of the FKBP5 promoter region, thereby reducing its transcriptional activity
The wild-type (WT) and mutant (MUT) promoter and 5' UTR sequences were cloned into the pGL3-basic luciferase vector, and luciferase activity was detected in HEK293 cells (human embryonic kidney 293 cells). This study then used small interfering RNA (siRNA) to knock down KLF15 and tested the luciferase activity of the WT and MUT FKBP5 promoters, as follows:
the wild-type and mutant FKBP5 promoters were fully synthesized by BGI (Beijing Genomics Institute) and inserted into the pGL3-basic backbone using HindIII restriction sites. The siRNA was synthesized by Ribobio. siRNA duplexes targeting KLF15 mRNA, or the negative control siNC, were transfected into HEK293 cells using Lipofectamine RNAiMAX transfection reagent (Invitrogen) according to the manufacturer's instructions.
After 48 hours of incubation, pGL3-WT or pGL3-MUT was co-transfected with pRL-TK into cells using Lipofectamine 2000 (Invitrogen).
After an additional 24 hours of incubation, Dual luciferase activity was measured using the Dual-Glo luciferase assay kit (Promega).
A t-test was used to compare luciferase activity between experimental groups.
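In a dual-luciferase assay of this kind, the firefly signal from the pGL3 reporter is usually normalized to the Renilla signal from the co-transfected pRL-TK control before groups are compared. The sketch below shows that normalization under stated assumptions: the function names are hypothetical and no values from the experiment are used.

```python
from statistics import mean


def normalized_activity(firefly: list, renilla: list) -> list:
    """Per-well firefly/Renilla ratio, correcting for transfection efficiency."""
    return [f / r for f, r in zip(firefly, renilla)]


def relative_activity(sample_ratios: list, reference_ratios: list) -> float:
    """Mean normalized activity of a sample group relative to a reference group."""
    return mean(sample_ratios) / mean(reference_ratios)
```

A relative activity of 0.5, for instance, would correspond to a 50% reduction of the sample group compared with the reference group.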
8. Results of the experiment
After quality control, alignment and variant calling of the sequencing data, 3386821 variant sites were detected (excluding the HLA region, which has high genetic heterozygosity). Association analysis of the cohort was performed to identify variant sites whose frequencies differ between the high myopia population and the control population. The results, shown in fig. 1, indicate that these include variant sites in the FKBP5, ripo 2 and DOCK9 genes. When this cohort was combined with the high myopia population of the European UK Biobank cohort, SNP rs533280354 in FKBP5 was found to be specific to the Asian population and to be the site with the largest contribution (frequency of 1.68% in this cohort and 0 in the European cohort) (see fig. 2).
After removing sites in strong linkage disequilibrium, the most significant sites in the association analysis of the discovery cohort were selected, and the genomic and chromatin features of the 5' UTR of FKBP5 were studied further. The most significant sites include SNP rs533280354 on the FKBP5 gene, which is mutated from G to A in subjects susceptible to high myopia (detailed information on rs533280354 is available from NCBI at https://www.ncbi.nlm.nih.gov/SNP/?term=rs533280354). The variant site is located in the 5' UTR region of the FKBP5 gene. The results are shown in Table 1.
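A cohort allele frequency such as the one reported above is obtained from genotype counts by counting A alleles over all chromosomes sampled. The following minimal sketch shows the calculation; the function name is hypothetical and no genotype counts from the study are assumed.

```python
def a_allele_frequency(n_gg: int, n_ga: int, n_aa: int) -> float:
    """Frequency of the A allele given counts of GG, GA and AA individuals."""
    total_alleles = 2 * (n_gg + n_ga + n_aa)
    if total_alleles == 0:
        raise ValueError("no genotyped individuals")
    # Each GA carrier contributes one A allele, each AA carrier two.
    return (n_ga + 2 * n_aa) / total_alleles
```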
TABLE 1 detailed information of SNP rs533280354 on FKBP5 Gene
Motif analysis of the FKBP5 gene revealed that the SNP rs533280354 variant is located in the binding site of the transcription factor KLF15 ("CCCGCCC") in the 5' UTR of the FKBP5 gene, as shown in detail in FIG. 3.
Chromatin accessibility and histone modification analysis showed that the 5' UTR region is strongly open, with strong signals in both ATAC-seq and H3K27ac ChIP-seq. In addition, KLF15 has a significant binding peak in the rs533280354 region of the FKBP5 gene, which is highly expressed in retina and macula, as shown in fig. 4.
The wild-type (WT) and mutant (MUT) promoter and 5' UTR sequences were cloned into the pGL3-basic luciferase vector. The results showed that the inserted WT promoter sequence gave strong luciferase activity in HEK293 cells compared with pGL3-basic, whereas the MUT promoter carrying the G-A variation gave significantly reduced luciferase transcriptional activation, indicating that rs533280354 can directly affect the binding of KLF15 to the FKBP5 promoter region. To further verify these results, the inventors knocked down KLF15 using siRNA and then examined the luciferase activity of the WT and MUT FKBP5 promoters. After KLF15 knockdown, the activity of the WT promoter decreased by 40.9% while that of the MUT promoter was unchanged (see fig. 5A and 5B). These results strongly indicate that the single G-A variation at the rs533280354 site on FKBP5 destroys the KLF15 binding motif of the FKBP5 promoter region, thereby reducing the transcriptional activity and hence the expression of FKBP5; this increases the sensitivity of the glucocorticoid receptor, and excessive uptake of glucocorticoid induces the occurrence of high myopia. The above cell experiments thus further support that the single G-A variation at the SNP site rs533280354 on the FKBP5 gene is a pathogenic mutation of high myopia that can induce its occurrence, and that it can therefore be used in the screening and diagnosis of high myopia.
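The effect of a G-to-A change on the "CCCGCCC" core can be illustrated by scanning a reference and an alternate sequence for the motif on both strands. This is a sketch only: the example sequence is hypothetical and is not the real FKBP5 5' UTR, it is an assumption that the variant base is the G of the motif, and an exact string match is a simplification of the position weight matrix scoring used by motif tools.

```python
KLF15_CORE = "CCCGCCC"  # motif string quoted in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return sequence.translate(_COMPLEMENT)[::-1]


def has_motif(sequence: str, motif: str = KLF15_CORE) -> bool:
    """True if the motif occurs on either strand of the sequence."""
    return motif in sequence or reverse_complement(motif) in sequence


# Hypothetical sequences for illustration only.
HYPOTHETICAL_REF = "TTACCCGCCCTAGG"
HYPOTHETICAL_ALT = HYPOTHETICAL_REF[:6] + "A" + HYPOTHETICAL_REF[7:]
```

With these hypothetical sequences, the motif is found in the reference sequence but not in the alternate sequence, which is the kind of disruption the motif analysis reports.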
The above description of the embodiments is only intended to illustrate the method of the invention and its core idea. It should be noted that, for those skilled in the art, without departing from the principle of the present invention, several improvements and modifications can be made to the present invention, and these improvements and modifications will also fall into the protection scope of the claims of the present invention.