CN115206420B - Construction method and application of schizophrenia abnormal gene-metabolism regulation network - Google Patents

Info

Publication number
CN115206420B
Authority
CN
China
Prior art keywords
schizophrenia
gene
metabolite
network
metabolites
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210739157.0A
Other languages
Chinese (zh)
Other versions
CN115206420A (en
Inventor
杨新平
杨小雪
高玥
余文君
劳健培
王静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southern Hospital Southern Medical University
Original Assignee
Southern Hospital Southern Medical University
Application filed by Southern Hospital Southern Medical University
Priority to CN202210739157.0A
Publication of CN115206420A
Application granted
Publication of CN115206420B
Legal status: Active

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B15/00 ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
    • G16B15/30 Drug targeting using structural data; Docking or binding prediction
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869 Methods for sequencing
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/20 Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Abstract

The invention belongs to the technical field of computational biology and discloses a method for constructing a schizophrenia abnormal gene-metabolism regulation network and its application. The invention integrates exome and metabolomics data to analyze the metabolic regulation network of schizophrenia. The extracted network is more systematic and comprehensive than that obtained by traditional analysis: it presents the regulatory nodes associated with genes or metabolites intuitively, the size of a node reflects the importance of the corresponding gene or metabolite in the network, and the network can provide data for other studies. Combined with subsequent experimental verification, the results of the integrated gene and metabolite analysis can extend applications in the diagnosis and treatment of schizophrenia.

Description

Construction method and application of schizophrenia abnormal gene-metabolism regulation network
Technical Field
The invention relates to the technical field of computational biology, in particular to a method for constructing a schizophrenia abnormal gene-metabolism regulation network and its application.
Background
Existing research on the pathogenic factors of schizophrenia shows that its onset is influenced by many factors and is generally considered to result from the combined action of environmental and genetic factors. On the genetic side, current studies mainly support three pathogenesis hypotheses, involving dopamine, glutamate and gamma-aminobutyric acid. Large-scale high-throughput screens have identified many genes and genetic factors associated with schizophrenia onset; for example, the PGC schizophrenia research team identified 108 loci associated with schizophrenia, and a study by the Shiyong team found 113 such loci. On the environmental side, schizophrenia occurs more frequently in men than in women. In addition, factors such as paternal age, abnormal embryonic development, obstetric complications, drug addiction, family history and environmental pollution are considered likely to be associated with the onset of schizophrenia. These factors may directly or indirectly affect processes such as gene function and regulation. Metabolic studies of postmortem brain tissue from schizophrenic patients have found significantly reduced levels of phosphatidylcholine, phosphatidylethanolamine and linoleic acid, a precursor of arachidonic acid (AA). Studies also indicate that AA and docosahexaenoic acid levels in erythrocytes and plasma are significantly reduced in schizophrenic patients, while membrane peroxidized lipids are significantly elevated compared with normal controls following treatment with antipsychotic drugs.
Changes in metabolic levels in the human body can be directly or indirectly influenced by environmental factors, so metabolome research on patients with schizophrenia helps reveal the role of environmental factors in its pathogenesis.
However, the functions and pathways identified by current exome studies of schizophrenia cannot fully explain its pathogenesis, while metabolome studies focus on screening molecular markers and do not examine the related pathways and changes in gene function in depth. Studies of cell signaling pathways are low-throughput and address only the relationships between individual genes and metabolites. The genetic factors found so far that can explain the pathogenesis of schizophrenia remain limited, and it is not known how environmental factors lead to the disease.
Therefore, there is a need to study a systematic metabolic regulation network that integrates exome and metabolomics data, so as to systematically integrate the gene and metabolite changes in schizophrenia and the correlations between them, providing a basis for the clinical prevention and treatment of schizophrenia.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide a method for constructing a schizophrenia abnormal gene-metabolism regulation network and application thereof.
In order to achieve the above purpose, the technical scheme adopted by the invention is as follows:
In a first aspect, the present invention provides a method for constructing a schizophrenia abnormal gene-metabolism regulation network, comprising the following steps:
(1) Collecting peripheral blood samples from schizophrenia patients, and separating serum and cells;
(2) Extracting DNA from the cells and performing high-throughput exome sequencing; subjecting the serum to untargeted liquid chromatography-tandem mass spectrometry (LC-MS/MS) analysis;
(3) Using the high-throughput exome sequencing data, performing copy number variation (CNV) screening to obtain schizophrenia CNV-related genes, and performing single-base variant (SNV) screening to obtain schizophrenia SNV-related genes;
(4) Using the untargeted LC-MS/MS data, analyzing the positive ion mode and the negative ion mode to obtain schizophrenia-related differential metabolites;
(5) Collecting gene-metabolite, metabolite-metabolite and protein-protein interaction information from public databases, and integrating it into a gene-metabolism regulation network;
(6) Using the obtained schizophrenia CNV-related genes, schizophrenia SNV-related genes and schizophrenia-related differential metabolites as seed nodes, filtering the edges in the gene-metabolism regulation network to obtain the schizophrenia abnormal gene-metabolism regulation network.
As a preferred embodiment of the method for constructing the schizophrenia abnormal gene-metabolism regulation network according to the present invention, in step (3), the copy number variation screening comprises:
screening CNV regions from the high-throughput exome sequencing results using XHMM software; comparing them with the CNV regions in the 1000 Genomes Project data and in the public database SZDB, and retaining only CNV regions that overlap reported schizophrenia CNVs and are not common in the general population; removing outlier samples to obtain the screened CNV regions; calculating the odds ratio (OR) of each screened CNV region from its number of occurrences in patients and in normal controls using a two-tailed Fisher's exact test, to obtain the CNV regions with OR > 1; and processing the genes in the CNV regions with OR > 1 using the NCBI batch gene ID conversion tool and retaining the genes that have an Entrez ID, to obtain the schizophrenia CNV-related genes.
As a preferred embodiment of the method for constructing the schizophrenia abnormal gene-metabolism regulation network, the single-base variant screening comprises:
selecting, with the GATK tool, the SNV sites in the high-throughput exome sequencing results that pass GATK variant quality evaluation; removing the results of three outlier samples and calculating the OR to obtain the SNV sites with OR > 1; evaluating the effect of the SNV sites with OR > 1 on gene function using Ensembl VEP software, retaining all HIGH-impact variants and, for MODERATE-impact variants, retaining the sites that meet the filtering criteria; and, for the genes whose mutation sites pass filtering, obtaining their Entrez IDs using Ensembl BioMart and comparing them with known schizophrenia network genes, to obtain the schizophrenia SNV-related genes.
Preferably, meeting the filtering criteria means meeting any three or more of the following conditions:
the scaled CADD score is not lower than 32;
the SIFT prediction is classified as 'deleterious(0)';
the PolyPhen prediction is classified as 'probably_damaging(1)';
the MutationTaster prediction is classified as 'A' or 'D' and does not contain 'N'.
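The "three or more of four conditions" rule can be illustrated with a minimal Python sketch. The dictionary keys (`impact`, `cadd_scaled`, `sift`, `polyphen`, `mutationtaster`) are hypothetical field names chosen for illustration, the label strings are assumed to follow VEP-style output, and the second impact class is assumed to be VEP's MODERATE; none of this is taken from the inventors' actual scripts.

```python
def passes_filter_criteria(variant, min_hits=3):
    """Return True when at least `min_hits` of the four conditions hold."""
    cadd = variant.get("cadd_scaled")          # hypothetical field name
    taster = variant.get("mutationtaster", "")  # e.g. "D", "A", "N" or a mix
    checks = [
        cadd is not None and cadd >= 32,
        variant.get("sift") == "deleterious(0)",
        variant.get("polyphen") == "probably_damaging(1)",
        # 'A' or 'D' must be present and 'N' must be absent
        any(code in taster for code in "AD") and "N" not in taster,
    ]
    return sum(checks) >= min_hits


def keep_variant(variant):
    """Keep every HIGH-impact variant; filter the MODERATE-impact ones."""
    impact = variant.get("impact")
    if impact == "HIGH":
        return True
    if impact == "MODERATE":  # assumption: the class that is filtered
        return passes_filter_criteria(variant)
    return False
```

A variant that satisfies the CADD, SIFT and PolyPhen conditions but carries a MutationTaster 'N' call is still kept under this reading, because three of the four conditions are met.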
As a preferred embodiment of the method for constructing the schizophrenia abnormal gene-metabolism regulation network according to the present invention, in step (4), the analysis of the positive ion mode and the negative ion mode comprises:
performing peak extraction, peak alignment and peak area calculation on the peak retention times and mass-to-charge ratios (m/z) in the mass spectrometry results using the R package XCMS, and then extracting the positive ion mode metabolites and the negative ion mode metabolites separately; then performing metabolite annotation and identification using metaX software in combination with the HMDB, KEGG and LipidBlast databases and a reference metabolite annotation library, to obtain metabolites annotated by secondary (MS/MS) mass spectra;
using the R package metaX, removing low-quality metabolite peaks, imputing missing values, and correcting and normalizing the metabolite intensities; calculating the coefficient of variation (CV) from the corrected metabolite intensities, and removing the MS/MS-annotated metabolites whose intensity fluctuation is CV > 30%; statistically analyzing the differences in metabolite levels between the serum samples of schizophrenic patients and normal controls using the t-test with BH correction, calculating the between-group VIP values of the metabolites by partial least squares discriminant analysis (PLS-DA), and retaining the metabolites with corrected q value < 0.05 and VIP ≥ 1, to obtain the differential metabolites annotated by secondary mass spectra;
preferably, the conditions for the peak extraction are: the CentWaveParam function parameters are snthresh=6, ppm=30, peakwidth=c(5, 25), mzdiff=0.01;
the conditions for the peak alignment are: non-subset-based mode, the adjustRtime function parameter is param=ObiwarpParam(binSize=0.1), and the chromatogram function parameters are aggregationFun='max', include='none';
the conditions for the peak area calculation are: the PeakDensityParam function parameters are minFraction=0.5, binSize=0.015, bw=5;
the conditions for removing the low-quality metabolite peaks are: the filterQCPeaks function parameter is ratio=0.5, and the filterPeaks function parameter is ratio=0.8;
the conditions for missing value imputation are: the missingValueImpute function parameter is method='knn', and the preProcess function parameter is scale='pareto';
the conditions for the metabolite intensity correction and normalization are: the normalize function parameter is method='pqn';
the statistical analysis uses the peakStat function.
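The 'pqn' option above presumably refers to probabilistic quotient normalization. The sketch below shows the core idea in plain Python under that assumption: each sample is divided by the median of its feature-wise quotients against a reference profile (here the per-feature median across samples). It is an illustration only; the metaX implementation may differ, for example in how the reference is chosen or in preliminary scaling, and the function name is hypothetical.

```python
from statistics import median


def pqn_normalize(samples):
    """Probabilistic quotient normalization (simplified sketch).

    samples: list of equal-length lists of positive intensities,
             one inner list per sample, one position per metabolite peak.
    """
    n_features = len(samples[0])
    # Reference profile: per-feature median across all samples (assumption).
    reference = [median(s[j] for s in samples) for j in range(n_features)]
    normalized = []
    for s in samples:
        quotients = [s[j] / reference[j] for j in range(n_features) if reference[j] > 0]
        dilution = median(quotients)  # most probable dilution factor
        normalized.append([value / dilution for value in s])
    return normalized
```

Samples that differ from each other only by a constant dilution factor are mapped to the same profile, which is the behavior the method is designed to provide.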
In step (5), the gene-metabolite and metabolite-metabolite interactions are collected from KEGG, BioGRID, CTD, GEM, Reactome and BRENDA; the protein-protein interactions are collected from InBioMap, BioPlex, HuRI, PROPER, BioGRID, IntAct, MENTHA and iRefIndex; gene IDs are uniformly converted to Entrez IDs and metabolite IDs to KEGG IDs; and the data are then integrated into a human gene-metabolism regulation network using Cytoscape.
As a preferred embodiment of the method for constructing the schizophrenia abnormal gene-metabolism regulation network according to the present invention, in step (6), the filtering is performed by a three-point filtering method: one node is selected as the central point, it is judged whether the interaction relationships between two adjacent nodes and the central point, and the node types, meet the requirements, and the edges that meet the conditions are retained.
In a second aspect, the invention provides the use of the method for constructing the schizophrenia abnormal gene-metabolism regulation network in noninvasive screening of potential markers of schizophrenia.
In a third aspect, the invention provides the use of the method for constructing the schizophrenia abnormal gene-metabolism regulation network in preparing a kit for identifying diagnostic markers of schizophrenia.
Compared with the prior art, the invention has the beneficial effects that:
The invention integrates exome and metabolomics data to analyze the metabolic regulation network of schizophrenia. The extracted network is more systematic and comprehensive than that obtained by traditional analysis: it presents the regulatory nodes associated with genes or metabolites intuitively, the size of a node reflects the importance of the corresponding gene or metabolite in the network, and the network can provide data for other studies. Combined with subsequent experimental verification, the results of the integrated gene and metabolite analysis can extend applications in the diagnosis and treatment of schizophrenia.
Drawings
FIG. 1 is a schematic diagram of the gene-metabolite regulatory relationships that satisfy the conditions of the "three-point filtering" screen.
FIG. 2 is the technical flowchart for constructing the schizophrenia abnormal gene-metabolism regulation network.
FIG. 3 shows the enrichment of gene nodes of the schizophrenia abnormal gene-metabolism regulation network in the immune candidate genes (A) and the synaptic transmission candidate genes (B);
in the figure, the P values are calculated using the two-tailed Fisher's exact test in R; the abscissa shows, from left to right, all genes in the protein-protein interaction network, the genes in the constructed schizophrenia abnormal gene-metabolism regulation network, and the screened schizophrenia-related CNV and SNV genes; the ordinate is the proportion of candidate genes in the corresponding gene set.
FIG. 4 compares the sizes of the major (largest connected) components of the "real network" and the "random networks";
in the figure, the abscissa is the size of the major component of a network, i.e., the number of nodes in the major component; the ordinate is the number of networks whose major component has the corresponding size.
Detailed Description
For a better description of the objects, technical solutions and advantages of the present invention, the present invention will be further described with reference to the following specific examples. It will be appreciated by persons skilled in the art that the specific embodiments described herein are for purposes of illustration only and are not intended to be limiting.
The test methods used in the examples are conventional methods unless otherwise specified; the materials, reagents and the like used, unless otherwise specified, are all commercially available.
Example 1: Construction of the schizophrenia abnormal gene-metabolism regulation network
1. Collecting peripheral blood samples
Peripheral blood samples were collected from 31 schizophrenic patients (age range 15-35 years) and 20 normal controls (age range 15-20 years), 5 ml x 2 tubes per person, and stored in anticoagulant and procoagulant blood collection tubes, respectively. The blood in the anticoagulant tubes was treated with red blood cell lysis buffer to remove red blood cells, and DNA and RNA were then extracted using a Magen blood DNA kit according to the kit manual. The blood in the procoagulant tubes was centrifuged and the supernatant was transferred to 2 ml cryopreservation tubes. Informed consent for sample collection was signed for all samples; samples were processed in accordance with laboratory biosafety operating standards and, after the corresponding records were made, stored in a -80 °C freezer.
2. High throughput sequencing
The DNA extracted from peripheral blood was subjected to high-throughput exome sequencing. The capture chip was the AIwhole exome (V4 version) exome capture chip independently developed and designed by the company, with a capture region of 62 Mb; paired-end sequencing was performed on an Illumina NovaSeq, with an average coverage depth of the target region of not less than 100X.
The serum separated from peripheral blood was subjected to untargeted liquid chromatography-tandem mass spectrometry (LC-MS/MS), using a Thermo Q Exactive mass spectrometer.
3. Sequencing data analysis
(1) The analysis of the exome sequencing data is divided into two parts: copy number variation (CNV) screening and single-base variant (SNV) screening.
In CNV screening, XHMM software was used to screen CNV regions in the sequencing results, which were then compared with the CNV regions in the 1000 Genomes Project data and in the public database SZDB (v2.0). Only CNV regions that overlapped reported schizophrenia CNVs and were not common in the general population were retained, giving 631 CNV regions in total after outlier samples were removed. The odds ratio (OR) of each CNV region was then calculated from its number of occurrences in patients and in normal controls using a two-tailed Fisher's exact test, yielding 352 CNV regions with OR > 1. Finally, the genes in these 352 CNV regions were processed with NCBI's batch gene ID conversion tool (https://www.ncbi.nlm.nih.gov/data/tables/genes/), leaving 259 genes with an Entrez ID for subsequent network construction (as shown in Table 1).
TABLE 1 Schizophrenia CNV-related genes
[Table 1 is provided as images in the original: Figures GDA0004090956100000061, GDA0004090956100000071 and GDA0004090956100000081]
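The per-region odds ratio and two-tailed Fisher's exact test used in the CNV and SNV screens can be sketched with the Python standard library alone. This is an illustration of the statistics, not the inventors' code (the source reports using R); the function names and the 2x2 table layout are assumptions made for the example.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a: patients carrying the variant      b: patients not carrying it
    c: controls carrying the variant      d: controls not carrying it
    """
    if b * c == 0:
        return float("inf") if a * d > 0 else float("nan")
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-tailed Fisher's exact test p value for the same 2x2 table."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x carriers among the patients.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    p_value = sum(
        prob(x) for x in range(low, high + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
    return min(1.0, p_value)
```

With the cohort sizes of this example (31 patients, 20 controls), a hypothetical region seen in 6 patients and 1 control would be evaluated as `odds_ratio(6, 25, 1, 19)` and `fisher_exact_two_sided(6, 25, 1, 19)`; regions with an odds ratio above 1 are the ones carried forward.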
In SNV screening, the GATK tool was first used to select the SNV sites in the sequencing results that passed GATK variant quality evaluation; after the results of three outlier samples were removed and the OR was calculated, 95445 SNV sites with OR > 1 were obtained. For these sites, the effect of each SNV on gene function was evaluated using Ensembl VEP software. All HIGH-impact variants were retained, and MODERATE-impact variants were retained if they met at least three of the following four filtering criteria: (i) a scaled CADD score of not less than 32; (ii) SIFT prediction classified as 'deleterious(0)'; (iii) PolyPhen prediction classified as 'probably_damaging(1)'; (iv) MutationTaster prediction classified as 'A' or 'D' and not containing 'N'. For the genes whose mutation sites passed screening, Entrez IDs were obtained using Ensembl BioMart and compared with the schizophrenia network genes previously published by the team (Gao et al., 2021, iScience, PMID 34746692), and 243 SNV genes were finally retained for subsequent network construction (as shown in Table 2).
TABLE 2 Schizophrenia SNV-related genes
[Table 2 is provided as images in the original: Figures GDA0004090956100000082, GDA0004090956100000091 and GDA0004090956100000101]
(2) In the serum metabolomics analysis, the results contain both positive ion mode and negative ion mode data.
Peak extraction (CentWaveParam function parameters snthresh=6, ppm=30, peakwidth=c(5, 25), mzdiff=0.01), peak alignment (non-subset-based mode; adjustRtime function parameter param=ObiwarpParam(binSize=0.1); chromatogram function parameters aggregationFun='max', include='none') and peak area calculation (PeakDensityParam function parameters minFraction=0.5, binSize=0.015, bw=5) were performed on the mass spectrometry results using the R package XCMS, after which 22065 positive ion mode metabolites and 13355 negative ion mode metabolites were extracted. Metabolite annotation and identification were then performed using metaX software with four metabolite annotation libraries: HMDB, KEGG, LipidBlast and the in-house standards library of Shenzhen micro-nano fei biotechnology limited company (spectra acquired by the company from a selected set of metabolite standards and used as reference annotations). This gave 859 and 471 metabolites with reliable secondary (MS/MS) mass spectrum annotation in the positive and negative ion modes, respectively.
The R package metaX was used to remove low-quality metabolite peaks (filterQCPeaks function parameter ratio=0.5, filterPeaks function parameter ratio=0.8), impute missing values (missingValueImpute function parameter method='knn', preProcess function parameter scale='pareto') and correct and normalize the metabolite intensities (normalize function parameter method='pqn'). The coefficient of variation (CV) was calculated from the corrected metabolite intensities, and metabolites with excessive intensity fluctuation (CV > 30%) were removed. Finally, the differences in metabolite levels between the serum samples of schizophrenic patients and normal controls were statistically analyzed using the t-test combined with BH correction (peakStat function, with normal control serum as the control), and the between-group VIP (variable importance in the projection) values of the metabolites were calculated using partial least squares discriminant analysis (PLS-DA; runPLSDA function). Metabolites with corrected q value < 0.05 and VIP > 1 were retained, giving 40 differential metabolites with reliable secondary mass spectrum annotation in the positive ion mode and 39 in the negative ion mode (as shown in Table 3) for subsequent network construction.
TABLE 3 Differential metabolites associated with schizophrenia
[Table 3 is provided as images in the original: Figures GDA0004090956100000111 and GDA0004090956100000121]
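The selection logic for differential metabolites (CV filter, BH-corrected q value, VIP threshold) can be sketched as follows. This is a plain-Python illustration under the thresholds quoted in the example; it assumes the p values and VIP scores have already been computed elsewhere (the source uses the metaX functions for this), and the function names are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(intensities):
    """CV = sample standard deviation / mean of one metabolite's intensities."""
    return stdev(intensities) / mean(intensities)


def benjamini_hochberg(pvalues):
    """Return BH-adjusted q values in the same order as the input p values."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotone q values.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        qvalues[index] = running_min
    return qvalues


def select_differential(names, pvalues, vips, q_cut=0.05, vip_cut=1.0):
    """Keep metabolites with q < q_cut and VIP > vip_cut (strict, as in the example)."""
    qvalues = benjamini_hochberg(pvalues)
    return [
        name for name, q, vip in zip(names, qvalues, vips)
        if q < q_cut and vip > vip_cut
    ]
```

A metabolite would first be dropped if `coefficient_of_variation(...)` exceeds 0.30, and the surviving metabolites would then be passed to `select_differential`.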
4. Schizophrenia abnormal gene-metabolism regulation network construction
Gene-metabolite and metabolite-metabolite interactions were collected from six public data sets (KEGG, BioGRID, CTD, GEM, Reactome and BRENDA), and protein-protein interactions were collected from eight public data sets (InBioMap, BioPlex, HuRI, PROPER, BioGRID, IntAct, MENTHA and iRefIndex). Gene IDs were uniformly converted to Entrez IDs and metabolite IDs to KEGG IDs, and the data were then integrated into a human gene-metabolism regulation network using Cytoscape.
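The ID unification and merging step can be pictured with a small Python sketch. The source performs the integration in Cytoscape; the code below only illustrates the idea of mapping every node to one identifier scheme and collapsing duplicate edges across sources. The function name, the `id_map` structure and the example identifiers are hypothetical.

```python
def integrate_edges(sources, id_map):
    """Merge edge lists from several databases into one undirected edge set.

    sources: dict mapping a source name to a list of
             (node_a, node_b, edge_type) tuples in source-specific IDs.
    id_map:  dict mapping source-specific IDs to unified IDs
             (Entrez IDs for genes, KEGG IDs for metabolites).
    Returns a dict: (sorted node pair, edge_type) -> set of supporting sources.
    """
    merged = {}
    for source_name, edges in sources.items():
        for node_a, node_b, edge_type in edges:
            unified_a, unified_b = id_map.get(node_a), id_map.get(node_b)
            # Skip edges whose nodes cannot be mapped, and self-loops.
            if unified_a is None or unified_b is None or unified_a == unified_b:
                continue
            key = (tuple(sorted((unified_a, unified_b))), edge_type)
            merged.setdefault(key, set()).add(source_name)
    return merged
```

Recording which sources support each edge is a design choice of this sketch; it makes it easy to see how many databases agree on an interaction after merging.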
On this basis, the 256 schizophrenia CNV-related genes, the 243 schizophrenia SNV-related genes (Psczs) and the 79 schizophrenia differential metabolites (DEMS) screened in the positive and negative ion modes were used as "seed" nodes to filter the edges in the human gene-metabolism regulation network by a "three-point filtering method": one node is selected as the central point, it is judged whether the interaction relationships between two adjacent nodes and the central point, and the node types, meet the requirements, and the edges that meet the conditions are retained. Specifically, as shown in FIG. 1, when the central node is a metabolite (whether or not it is a differential metabolite), one of its directly adjacent nodes must be a schizophrenia-related gene, and the other directly adjacent node may be a differential metabolite, as in part (1) of FIG. 1, or a schizophrenia-related gene, as in part (4) of FIG. 1. When the central node is a gene (whether or not it is a schizophrenia-related gene), one of its directly adjacent nodes must be a differential metabolite, and the other directly adjacent node may be a schizophrenia-related gene, as in part (2) of FIG. 1, or a differential metabolite, as in part (3) of FIG. 1. The overall flow of the construction of the schizophrenia-related gene-metabolism regulation network is shown in FIG. 2.
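One possible reading of the three-point filtering rule is sketched below in Python. It treats the network as undirected, ignores the interaction type on each edge, and keeps the edges from a center to its seed neighbors whenever the center has at least two seed neighbors including the required kind (a schizophrenia-related gene for a metabolite center, a differential metabolite for a gene center). The function and argument names are hypothetical, and the actual procedure may apply additional checks on the interaction relationship.

```python
from collections import defaultdict


def three_point_filter(edges, node_type, scz_genes, diff_metabolites):
    """Return the set of retained edges as frozensets of two node IDs.

    edges:            iterable of (node_a, node_b) pairs
    node_type:        dict node -> "gene" or "metabolite"
    scz_genes:        set of seed genes (CNV- and SNV-related)
    diff_metabolites: set of seed (differential) metabolites
    """
    adjacency = defaultdict(set)
    for node_a, node_b in edges:
        if node_a != node_b:
            adjacency[node_a].add(node_b)
            adjacency[node_b].add(node_a)

    kept = set()
    for center, neighbors in adjacency.items():
        gene_seeds = neighbors & scz_genes
        metabolite_seeds = neighbors & diff_metabolites
        # A triple needs two seed neighbors around the center.
        if len(gene_seeds) + len(metabolite_seeds) < 2:
            continue
        # Metabolite center: at least one seed gene is required.
        # Gene center: at least one differential metabolite is required.
        required = gene_seeds if node_type[center] == "metabolite" else metabolite_seeds
        if not required:
            continue
        for seed in gene_seeds | metabolite_seeds:
            kept.add(frozenset((center, seed)))
    return kept
```

Under this reading, a metabolite flanked only by differential metabolites, or a gene flanked only by schizophrenia-related genes, contributes no edges, which matches the four admissible patterns described for FIG. 1.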
Example 2: Validating the schizophrenia abnormal gene-metabolism regulation network
To verify whether the schizophrenia abnormal gene-metabolism regulation network constructed in Example 1 is indeed associated with the onset of schizophrenia, it was validated in the following respects: functional enrichment of the schizophrenia-related mutant genes in the network, clustering of the genes, and comparison of the metabolites in the network with known biomarkers of schizophrenia.
1. A total of 3904 immune-related genes were collected from the literature, and 772 synaptic transmission-related genes were collected from the Gene Ontology website (by searching Gene Ontology with 'synaptic transmission' as the keyword and keeping the genes annotated for the human species). These two gene sets were chosen as candidate gene sets for validating gene function in the schizophrenia gene-metabolism regulation network because they correspond to two major classes of pathogenesis hypotheses for schizophrenia: the immune abnormality hypothesis and the neurotransmitter abnormality hypothesis.
P values were calculated using the two-tailed Fisher's exact test. After filtering through the gene-metabolism network, the proportions of both the immune candidate genes (as shown in FIG. 3A) and the synaptic transmission candidate genes (as shown in FIG. 3B) were higher in the gene-metabolism network than in the full protein-protein interaction (PPI) network. For the immune candidate genes, the difference relative to the PPI network background was statistically significant (p < 0.05), whereas the proportion of synaptic transmission genes showed only an upward trend without statistical significance. There are two reasons for this result. First, peripheral blood samples were analyzed, and one of the constraints of the network screen was metabolites with abnormal changes in patient serum; the immune system is readily affected by changes in environmental factors during disease onset, and metabolite changes are one result of such environmental changes, so the genes screened under this constraint are expected to be related to immune function. Second, since most synaptic transmission genes are expressed specifically in brain tissue, peripheral blood samples cannot capture this part of the genes, so the number of synaptic transmission genes in the network is too small (< 10) for a statistical test to be effective.
2. The clustering of the protein interaction relationships among the schizophrenia-related genes was compared with that of a random background control network.
The protein-protein interaction network collected from public databases is used as the background set. The schizophrenia-related genes are used to extract a schizophrenia protein interaction sub-network in which only the interaction edges whose two end nodes are both schizophrenia-related genes are retained; the resulting sub-network is called the "real network". Random sampling is then carried out 10000 times to construct networks: each time, the same number of genes as the schizophrenia-related genes is drawn from the nodes of the protein-protein interaction network and the same network extraction strategy is applied. The resulting set of sub-networks is called the "random network".
For both the "real network" and the "random network", the size of the major component of the network (major component size) was calculated using the R package igraph, and bar graphs were then plotted for comparison. The rationale for this comparison is that protein interactions reflect, to some extent, the degree of functional relatedness between genes: in general, the more important and functionally similar a group of genes is, the tighter their protein interactions should be, the more aggregated the network they form, and the larger the major component of that network.
Compared with the "random network", the major component of the "real network" is larger than that of the vast majority of random networks (as shown in fig. 4), which shows that the schizophrenia-related genes screened in the schizophrenia abnormal gene-metabolism regulation network are indeed functionally related to a certain degree and are not a random collection.
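The comparison between the real network and the random networks can be sketched as follows. This is a minimal Python illustration using only the standard library, not the igraph-based procedure of the embodiment. The function names, the toy edge list and the empirical p-value are assumptions added for illustration; the embodiment itself compares the sizes graphically.

```python
import random
from collections import defaultdict


def largest_component_size(edges, nodes):
    """Size of the largest connected component of the sub-network in which
    only edges with both end nodes in `nodes` are kept. Nodes without any
    retained edge are not part of the sub-network, so the result may be 0.
    """
    keep = set(nodes)
    adjacency = defaultdict(set)
    for u, v in edges:
        if u in keep and v in keep and u != v:
            adjacency[u].add(v)
            adjacency[v].add(u)
    seen, best = set(), 0
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            size += 1
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        best = max(best, size)
    return best


def compare_with_random(edges, disease_genes, n_iter=10000, seed=0):
    """Return the real major component size and an empirical p-value
    (assumed here: share of random gene sets that do at least as well)."""
    rng = random.Random(seed)
    universe = sorted({node for edge in edges for node in edge})
    real_genes = set(disease_genes) & set(universe)
    real_size = largest_component_size(edges, real_genes)
    hits = 0
    for _ in range(n_iter):
        sample = rng.sample(universe, len(real_genes))
        if largest_component_size(edges, sample) >= real_size:
            hits += 1
    return real_size, (hits + 1) / (n_iter + 1)


# Toy background network with hypothetical node names.
toy_edges = [("A", "B"), ("B", "C"), ("C", "D"), ("E", "F"), ("G", "H")]
print(compare_with_random(toy_edges, ["A", "B", "C", "D"], n_iter=1000))
```

Sampling the same number of genes from the same background keeps the comparison fair, because a larger gene set would yield a larger major component by chance alone.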
3. In the constructed schizophrenia abnormal gene-metabolism regulation network, metabolites such as L-tyrosine, choline, L-glutamine, linoleic acid, L-proline and docosahexaenoic acid (DHA) have all been reported as abnormal metabolites in schizophrenia.
In summary, the invention starts from high-throughput sequencing data of a small number of clinical samples, combines them with large-scale experimental and sequencing results in public databases, screens out mutant genes and differential metabolites that may be related to the onset of schizophrenia, integrates the known human protein-protein, protein-metabolite and metabolite-metabolite interaction/regulation relationships, and constructs the schizophrenia abnormal gene-metabolism regulation network. The exome and metabolome data are both derived from the collected peripheral blood samples, so the consistency of the data is ensured at the source and the data are highly reliable. The network can reveal the relationship between gene changes and metabolite changes more systematically, help screen out the important factors among them for subsequent experimental verification, and provide strong support for research on the pathogenesis of schizophrenia. Moreover, previous studies have demonstrated the feasibility of using the blood metabolome and exome to reflect changes in brain tissue, so the network can provide theoretical support for establishing noninvasive tests to screen potential markers of schizophrenia.
Previous network analyses of schizophrenia have mainly focused on protein-protein interactions, while research on metabolites has mainly been limited to detecting and comparing metabolite levels in patients and control groups and screening differential metabolites. Studies of metabolic pathways have often considered only individual differential metabolites in low-throughput experiments, so that only a few of the regulatory relationships involving those metabolites could be revealed.
The integrated network analysis approach adopted for the schizophrenia abnormal gene-metabolism regulation network constructed by the invention can provide systematic support for research on gene-metabolism regulation in schizophrenia. Because a biological function requires the participation of many genes and metabolites, the method is better suited to revealing the functional pathways related to the pathogenesis of schizophrenia. It also provides a method for identifying diagnostic markers of schizophrenia and can be extended to the field of noninvasive diagnosis.
In short, the invention integrates exome and metabolomics data to analyse the metabolic regulation network of schizophrenia. The extracted network is more systematic and comprehensive than traditional analyses: it intuitively presents the regulatory nodes related to genes or metabolites, the node size reflects the importance of each gene/metabolite in the network, and the network can provide data for other studies. The results of the integrated analysis of genes and metabolites can also be combined with subsequent experimental verification to extend applications in the diagnosis and treatment of schizophrenia.
Finally, it should be noted that the above embodiments are only for illustrating the technical solution of the present invention and not for limiting the scope of the present invention, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that the technical solution of the present invention may be modified or substituted equally without departing from the spirit and scope of the technical solution of the present invention.

Claims (5)

1. A method for constructing a schizophrenia abnormal gene-metabolism regulation network, characterized by comprising the following steps:
(1) Collecting peripheral blood samples of schizophrenia patients, and separating serum and cells;
(2) Extracting DNA from the cells and performing high-throughput exome sequencing; subjecting the serum to non-targeted liquid chromatography-tandem mass spectrometry analysis;
(3) Using the high-throughput exome sequencing data, performing copy number variation (CNV) screening to obtain schizophrenia CNV-related genes, and performing single base site mutation (SNV) screening to obtain schizophrenia SNV-related genes;
the copy number variation screening method comprises the following steps:
screening out CNV regions in the high-throughput exome sequencing results using XHMM software, comparing them with the CNV regions in the 1000 Genomes Project data and in the public database SZDB, retaining only CNV regions that intersect with reported schizophrenia CNVs and are not common in the general population, and removing outlier samples to obtain screened CNV regions; according to the number of occurrences of each screened CNV region in patients and in normal controls, calculating the odds ratio (OR) using the two-tailed Fisher's exact test to obtain CNV regions with OR > 1; processing the genes in the CNV regions with OR > 1 using the NCBI batch gene ID conversion tool, and retaining the genes that have an Entrez ID to obtain the schizophrenia CNV-related genes;
the single base site mutation screening method comprises the following steps:
selecting, with the GATK tool, the SNV sites in the high-throughput exome sequencing results that pass the GATK variant quality evaluation, removing the results of three outlier samples, and calculating the OR to obtain SNV sites with OR > 1; evaluating the effect of the SNV sites with OR > 1 on gene function using Ensembl VEP software, retaining all HIGH-impact variants and, for MODERATE-impact variants, retaining the sites that meet the filtering standard; obtaining the Entrez IDs of the genes containing the filtered mutation sites using Ensembl BioMart, and comparing the obtained genes with known schizophrenia network genes to obtain the schizophrenia SNV-related genes;
meeting the filtering standard means that any three or more of the following conditions are satisfied:
the corrected CADD score is not lower than 32;
the SIFT prediction is classified as 'deleterious(0)';
the PolyPhen prediction is classified as 'probably_damaging(1)';
the MutationTaster prediction is classified as 'A' or 'D' and does not contain 'N';
(4) Analysing the non-targeted liquid chromatography-tandem mass spectrometry data in positive ion mode and negative ion mode to obtain schizophrenia-related differential metabolites;
the positive ion mode and negative ion mode analysis method comprises the following steps:
using the R package XCMS, performing peak extraction, peak alignment and peak area calculation on the peak retention times and mass-to-charge ratios (m/z) obtained from the mass spectrometry results, and then extracting the positive ion mode metabolites and the negative ion mode metabolites separately; then using metaX software, in combination with HMDB, KEGG and LipidBlast and with reference to the metabolite annotation map of the annotation data, performing metabolite annotation and identification to obtain metabolites annotated by secondary mass spectrometry;
using the R package metaX, removing low-quality metabolite peaks, imputing missing values, and correcting and normalizing the metabolite intensities; calculating the coefficient of variation (CV) from the corrected metabolite intensities, and removing the secondary mass spectrum-annotated metabolites whose intensity fluctuation has CV > 30%; statistically analysing the differences in metabolite levels between serum samples of schizophrenia patients and normal controls using the t-test with Benjamini-Hochberg (BH) correction, calculating the VIP value of each metabolite between groups by partial least squares regression analysis, and retaining the metabolites with corrected q value < 0.05 and VIP ≥ 1 to obtain the secondary mass spectrum-annotated differential metabolites; wherein VIP is the variable importance in the projection;
(5) Collecting information on gene-metabolite, metabolite-metabolite and protein-protein interaction relationships from public databases, and integrating it into a gene-metabolism regulation network;
(6) Using the obtained schizophrenia CNV-related genes, schizophrenia SNV-related genes and schizophrenia-related differential metabolites as seed nodes, filtering the edges in the gene-metabolism regulation network to obtain the schizophrenia abnormal gene-metabolism regulation network;
the filtering adopts a three-point filtering method, in which one node is selected as the central point, it is judged whether the interaction relationships between the two adjacent nodes and the central point, and the node types, meet the requirements, and the edges that meet the conditions are retained.
2. The method for constructing a schizophrenia abnormal gene-metabolism regulation network according to claim 1, wherein the conditions for the peak extraction are: the CentWaveParam function parameters are snthresh = 6, ppm = 30, peakwidth = c(5, 25), mzdiff = 0.01;
the conditions for the peak alignment are: non-subset-based mode, the adjustRtime function parameter is param = ObiwarpParam(binSize = 0.1), and the chromatogram function parameters are aggregationFun = 'max', include = 'none';
the conditions for the peak area calculation are: the PeakDensityParam function parameters are minFraction = 0.5, binSize = 0.015, bw = 5;
the conditions for removing the low-quality metabolite peaks are: the filterQCPeaks function parameter is ratio = 0.5, and the filterPeaks function parameter is ratio = 0.8;
the conditions for imputing the missing values are: the missingValueImpute function parameter is method = 'knn', and the preProcess function parameter is scale = 'pareto';
the conditions for the metabolite intensity correction and normalization are: the normalize function parameter is method = 'pqn';
the statistical analysis uses the peakStat function.
3. The method for constructing a schizophrenia abnormal gene-metabolism regulation network according to claim 1, wherein in step (5), the gene-metabolite and metabolite-metabolite interaction relationships are collected from KEGG, BioGRID, CTD, GEM, Reactome and BRENDA, the protein-protein interaction relationships are collected from InBioMap, BioPlex, HuRI, PROPER, BioGRID, IntAct, mentha and iRefIndex, the gene IDs are converted into Entrez IDs and the metabolite IDs are unified into KEGG IDs, and Cytoscape is then used to integrate them into the human gene-metabolism regulation network.
4. Use of the method for constructing a network for regulation of the abnormal gene-metabolism of schizophrenia according to any one of claims 1 to 3 for noninvasive test screening of potential markers for schizophrenia.
5. Use of the method for constructing a network for regulation of the abnormal gene-metabolism of schizophrenia according to any one of claims 1 to 3 for preparing a diagnostic marker recognition kit for schizophrenia.
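For illustration only, and not forming part of the claims, the differential metabolite selection rule recited in claim 1 (CV not above 30%, BH-corrected q value < 0.05, VIP ≥ 1) can be sketched in Python as follows. The sketch assumes that the t-test p-values and the VIP values have already been computed elsewhere; the function names, the record layout and the choice to apply the BH correction after the CV filter are assumptions, and the claimed method uses the R package metaX rather than this code.

```python
import statistics


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (q values), in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q values stay monotone.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        q_values[index] = running_min
    return q_values


def coefficient_of_variation(intensities):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(intensities) / statistics.fmean(intensities)


def select_differential(metabolites, cv_max=0.30, q_max=0.05, vip_min=1.0):
    """Return names of metabolites passing the CV, q value and VIP filters.

    Each record is a dict with the hypothetical keys 'name',
    'intensities' (corrected intensities used for the CV),
    'p_value' (t-test result) and 'vip' (from a PLS model).
    """
    stable = [
        m for m in metabolites
        if coefficient_of_variation(m["intensities"]) <= cv_max
    ]
    q_values = benjamini_hochberg([m["p_value"] for m in stable])
    return [
        m["name"]
        for m, q in zip(stable, q_values)
        if q < q_max and m["vip"] >= vip_min
    ]


# Hypothetical records, for illustration only.
records = [
    {"name": "A", "intensities": [100, 105, 95, 102], "p_value": 0.001, "vip": 1.5},
    {"name": "B", "intensities": [100, 200, 50, 150], "p_value": 0.001, "vip": 2.0},
    {"name": "C", "intensities": [50, 52, 49, 51], "p_value": 0.2, "vip": 1.2},
    {"name": "D", "intensities": [10, 10.5, 9.8, 10.1], "p_value": 0.002, "vip": 0.5},
]
print(select_differential(records))
```

Requiring both a small q value and a VIP of at least 1 combines a univariate test with a multivariate importance measure, so a metabolite must stand out on both criteria to be retained.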

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210739157.0A CN115206420B (en) 2022-06-27 2022-06-27 Construction method and application of schizophrenia abnormal gene-metabolism regulation network

Publications (2)

Publication Number Publication Date
CN115206420A CN115206420A (en) 2022-10-18
CN115206420B true CN115206420B (en) 2023-05-23

Family

ID=83578083

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210739157.0A Active CN115206420B (en) 2022-06-27 2022-06-27 Construction method and application of schizophrenia abnormal gene-metabolism regulation network

Country Status (1)

Country Link
CN (1) CN115206420B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107341365A (en) * 2017-07-24 2017-11-10 李文杰 The screening method and kit of a kind of hereditary disease
CN113637735A (en) * 2021-08-09 2021-11-12 优葆优保健康科技(宁波)有限公司 Kit for detecting children nutrition genome and application

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110111419A1 (en) * 2008-07-04 2011-05-12 deCODE Genetics ehf. Copy Number Variations Predictive of Risk of Schizophrenia
CA2927477C (en) * 2013-10-18 2022-12-13 The Hospital For Sick Children Method of determining disease causality of genome mutations
CN103834730A (en) * 2014-02-20 2014-06-04 中国医学科学院基础医学研究所 Purpose of ZFP28 variation point in preparation of diagnosis schizo kit
US20190228836A1 (en) * 2018-01-15 2019-07-25 SensOmics, Inc. Systems and methods for predicting genetic diseases
US10468141B1 (en) * 2018-11-28 2019-11-05 Asia Genomics Pte. Ltd. Ancestry-specific genetic risk scores
US20220328192A1 (en) * 2019-08-13 2022-10-13 Tata Consultancy Services Limited System and method for assessing the risk of schizophrenia
CN110827916B (en) * 2019-10-24 2021-12-14 南方医科大学南方医院 Method for constructing schizophrenia gene-gene interaction network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant