CN111020021A - Intestinal flora-based small-scale schizophrenia biomarker combination, application thereof and mOTU screening method - Google Patents

Info

Publication number
CN111020021A
Authority
CN
China
Prior art keywords
schizophrenia
biomarker
motu
relative abundance
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910605146.1A
Other languages
Chinese (zh)
Inventor
王奇
马现仓
鞠艳梅
朱峰
郭锐进
王崴
贾慧珏
范雅娟
马青艳
郭丽阳
高成阁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BGI Shenzhen Co Ltd
First Affiliated Hospital of Medical College of Xian Jiaotong University
Original Assignee
BGI Shenzhen Co Ltd
First Affiliated Hospital of Medical College of Xian Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BGI Shenzhen Co Ltd and First Affiliated Hospital of Medical College of Xian Jiaotong University
Priority to CN201910605146.1A
Publication of CN111020021A
Legal status: Pending

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6809 Methods for determination or identification of nucleic acids involving differential detection
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6888 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
    • C12Q1/689 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for bacteria
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/106 Pharmacogenomics, i.e. genetic variability in individual responses to drugs and drug metabolism
    • C12Q2600/136 Screening for pharmacological compounds
    • C12Q2600/158 Expression markers

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Analytical Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Zoology (AREA)
  • Engineering & Computer Science (AREA)
  • Wood Science & Technology (AREA)
  • Genetics & Genomics (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Immunology (AREA)
  • Biotechnology (AREA)
  • Biophysics (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Pathology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The invention discloses a small-scale schizophrenia biomarker combination based on intestinal flora, its application, and an mOTU screening method. The effect of a candidate drug on the biomarkers before and after use can be used to determine whether the candidate drug can treat or prevent schizophrenia. The invention overcomes the defects of existing schizophrenia diagnostic methods, which cannot provide early warning or predict the onset and development trend of schizophrenia. The biomarker combination can therefore be applied to predicting the onset and development trend of schizophrenia and to preparing kits for pathological typing of the disease.

Description

Intestinal flora-based small-scale schizophrenia biomarker combination, application thereof and mOTU screening method
Technical Field
The invention belongs to the technical field of biological medicines, and relates to a biomarker of schizophrenia based on intestinal flora, a kit for diagnosing or predicting schizophrenia risk and application.
Background
Schizophrenia is a group of severe psychoses of unknown etiology. Onset is usually slow or subacute and occurs in young to middle adulthood. Clinically it presents as syndromes with varying symptoms, involving disturbances of sensation and perception, thinking, emotion and behavior, together with uncoordinated mental activity. Patients are generally conscious and of basically normal intelligence, but some develop cognitive impairment in the course of the disease. The course is protracted, with repeated relapse, aggravation or deterioration; most patients eventually decline into mental disability, and only a few recover or basically recover after treatment.
The global prevalence of schizophrenia is about 0.3-0.7%. As of 2016, an estimated 21 million or more people worldwide had schizophrenia, and their average life expectancy is 10 to 25 years shorter than that of the general population.
Although previous studies show that the onset of schizophrenia is caused by the combined action of genetic factors and environmental factors, and partial abnormal changes of serum and brain tissues of patients exist, the diagnosis of schizophrenia still depends on symptomatic evaluation at present, and no reliable biological marker is identified. In addition, the existing diagnostic criteria cannot predict the onset, efficacy and prognosis of schizophrenia at an early stage.
Disclosure of Invention
The invention aims to provide a small-scale schizophrenia biomarker combination based on intestinal flora and an application thereof. Using at most 5 markers, it overcomes the defects of existing schizophrenia diagnosis, which cannot provide early warning or predict the onset and development trend of schizophrenia, and it can assist pathological typing of the disease, research on drug targets, precise medication, research on pathogenesis and the like.
The invention is realized by the following technical scheme:
an intestinal flora-based small scale schizophrenia biomarker combination for providing relative abundance information comprising one or more selected from the group consisting of:
biomarker 1: Bacteroides plebeius;
biomarker 2: Akkermansia muciniphila;
biomarker 3: Enterococcus faecium;
biomarker 4: Eubacterium siraeum;
biomarker 5: Bacteroides intestinalis.
The relative abundance information provided by the biomarker combinations is used to compare to a reference value.
The relative abundance information of the biomarkers 1-5 is provided based on gene sequences for which abundance calculation can be performed.
Use of the intestinal flora-based small-scale schizophrenia biomarker combination as a detection target or detection object in the preparation of a detection kit.
The method for screening the small-scale schizophrenia biomarker combination based on the intestinal flora comprises the following steps:
1) collecting samples: collecting stool samples, freezing and transporting them, rapidly transferring them to −80 °C for storage, and performing DNA extraction to obtain extracted DNA samples, wherein the sample subjects comprise schizophrenia patients and healthy people;
2) metagenomic sequencing and assembly;
3) conserved single-copy gene alignment and abundance calculation: inputting the high-quality sequencing fragments (reads) into the mOTU software to calculate the relative abundance of species:
3.1) aligning the high-quality sequencing fragments to the reference single-copy genes;
3.2) counting the number of inserts according to the alignment results;
3.3) normalizing the number of inserts by the length of each single-copy gene to obtain the corresponding abundance.
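Steps 3.2) and 3.3) can be illustrated with a minimal sketch. It is not the mOTU software itself: the function name, the gene names, the counts and the lengths are hypothetical, and the final scaling to a sum of 1 is an assumption made here to turn length-normalized counts into relative abundances.

```python
def length_normalized_abundance(insert_counts, gene_lengths):
    """Divide insert counts by gene length, then scale the values to sum to 1."""
    # Step 3.3: normalize each insert count by the single-copy gene length (bp).
    normalized = {
        gene: insert_counts[gene] / gene_lengths[gene]
        for gene in insert_counts
    }
    total = sum(normalized.values())
    if total == 0:
        return {gene: 0.0 for gene in normalized}
    # Assumption: relative abundance is the share of the length-normalized total.
    return {gene: value / total for gene, value in normalized.items()}


# Hypothetical insert counts (step 3.2) and gene lengths for three marker genes.
counts = {"gene_a": 300, "gene_b": 100, "gene_c": 0}
lengths = {"gene_a": 1500, "gene_b": 1000, "gene_c": 1200}
abundance = length_normalized_abundance(counts, lengths)
```

Dividing by gene length keeps a long gene from appearing more abundant merely because it collects more inserts.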
4) randomly selecting schizophrenia patients and healthy people from the sample set as a training set and using the remaining samples as a validation set; calculating the relative abundance of the mOTUs in each sample of the training set and inputting them into a random forest (RF) classifier; performing 5 repeats of 10-fold cross-validation on the classifier; calculating the schizophrenia risk of each individual from the relative abundance of the mOTUs selected by the RF model; drawing the ROC curve and calculating the AUC as the parameter for evaluating the discrimination performance of the model; selecting the combination that has fewer than 30 markers and the best discrimination performance; and outputting the importance index of each mOTU in the model, wherein a higher importance index indicates that the marker is more important for distinguishing schizophrenia from non-schizophrenia.
The sample set comprises 90 schizophrenia patients and 81 healthy persons, and the validation set comprises 10 schizophrenia patients and 10 healthy persons.
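The cross-validation scheme in step 4) can be sketched as follows. This is only an illustration of how 5 repeats of 10-fold splitting might be generated; keeping the class ratio in each fold (stratification), the function name and the random seed are assumptions, and the classifier itself is omitted.

```python
import random


def repeated_stratified_folds(labels, n_folds=10, n_repeats=5, seed=0):
    """Yield (repeat, fold, train_idx, test_idx), keeping class balance per fold."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    for repeat in range(n_repeats):
        folds = [[] for _ in range(n_folds)]
        for indices in by_class.values():
            shuffled = indices[:]
            rng.shuffle(shuffled)
            # Deal the samples of each class round-robin across the folds.
            for position, idx in enumerate(shuffled):
                folds[position % n_folds].append(idx)
        for fold_id, test_idx in enumerate(folds):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(labels)) if i not in held_out]
            yield repeat, fold_id, train_idx, sorted(test_idx)


# Labels sized after the sample set stated above: 90 patients (1), 81 controls (0).
labels = [1] * 90 + [0] * 81
splits = list(repeated_stratified_folds(labels))
```

Each of the 50 splits holds out 17 or 18 samples; within one repeat, every sample is held out exactly once.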
The method of using the above marker, i.e., the method of diagnosing whether a subject has schizophrenia or predicting the risk of whether a subject has schizophrenia, comprises:
1) collecting a sample from a subject;
2) determining relative abundance information of biomarkers in the sample obtained in step 1);
3) comparing the relative abundance information obtained in step 2) to a reference data set or reference value. The method can be used for disease diagnosis within the meaning of patent law, and also for non-diagnostic purposes such as scientific research, enriching personal genetic information, or enriching genetic databases. The relative abundance information of each biomarker in the subject is compared to a reference data set or reference value to determine whether the subject has schizophrenia or to predict the subject's risk of schizophrenia.
The reference data set includes relative abundance information of biomarkers in samples from a plurality of schizophrenia patients and a plurality of healthy controls.
The reference data set refers to the relative abundance information of each biomarker obtained from samples of individuals diagnosed with the disease and of healthy individuals, and serves as the reference for the relative abundance of each biomarker. In particular, the reference data set may be a training data set. According to the present invention, the terms training set and validation set have the meanings well known in the art. In one embodiment of the present invention, the training set refers to a data set containing a given number of test samples from schizophrenic subjects and non-schizophrenic subjects, including the content of each biomarker in those samples. The validation set is an independent data set used to test the performance of the model built on the training set.
The reference value in the present invention refers to a reference value or normal value of healthy controls. It is known to those skilled in the art that, when the number of samples is sufficiently large, a range of normal values (absolute values) for each biomarker can be obtained using detection and calculation methods well known in the art. When the level of a biomarker is measured with an assay, the absolute value of the biomarker level in the sample can be compared directly with the reference value to assess the risk of developing the disease and to diagnose schizophrenia, including early diagnosis; statistical methods may optionally be included.
The step of comparing the relative abundance information with the reference data set in step 3) further comprises executing a multivariate statistical model to obtain the disease probability. Using the multivariate statistical model enables rapid and efficient detection. Specifically, the multivariate statistical model is a random forest model.
A disease probability greater than a threshold indicates that the subject has, or is at risk of having, schizophrenia or a related disease. Specifically, the threshold is 0.5.
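The threshold rule can be shown with a short sketch. It assumes, as one common convention that the text does not specify, that the random forest probability is the fraction of trees voting for the schizophrenia class; the function names and the example votes are hypothetical.

```python
def forest_probability(tree_votes):
    """Fraction of trees voting for the schizophrenia class (1) among all trees."""
    if not tree_votes:
        raise ValueError("at least one tree vote is required")
    return sum(tree_votes) / len(tree_votes)


def exceeds_threshold(probability, threshold=0.5):
    """Return True when the probability is strictly greater than the threshold."""
    return probability > threshold


votes = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]  # hypothetical votes from 10 trees
probability = forest_probability(votes)
at_risk = exceeds_threshold(probability)
```

With 7 of 10 trees voting for the schizophrenia class, the probability is 0.7 and the 0.5 threshold is exceeded.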
The relative abundance information of the biomarkers in step 2) is obtained by a sequencing method, which further comprises: isolating a nucleic acid sample from the sample of the subject; constructing a DNA library from the nucleic acid sample; sequencing the DNA library to obtain a sequencing result; and aligning the sequencing result to a reference gene set to determine the relative abundance information of the biomarkers.
According to an embodiment of the invention, at least one of SOAP2 and MAQ may be used to align the sequencing result to the reference gene set, which improves alignment efficiency and thereby the efficiency of schizophrenia detection. According to the embodiment of the invention, multiple (at least two) biomarkers can be detected simultaneously, which further improves the efficiency of schizophrenia detection.
Obtaining the reference gene set comprises performing metagenomic sequencing on samples from a plurality of schizophrenia patients and a plurality of healthy controls to obtain a non-redundant gene set, and then combining the non-redundant gene set with an intestinal microorganism gene set. The reference gene set in the present invention may be an existing gene set, such as a published reference gene set of gut microorganisms. Alternatively, it may be built as described above from the samples of schizophrenia patients and healthy controls, in which case the resulting reference gene set carries more comprehensive information and gives more reliable detection results.
The set of non-redundant genes is explained as generally understood by those skilled in the art, and is simply the set of genes remaining after removal of the redundant genes. Redundant genes generally refer to multiple copies of a gene that appear on a chromosome.
In particular, the sample is a stool sample. The sequencing method is carried out by a second generation sequencing method or a third generation sequencing method. The sequencing method is not particularly limited, and rapid and efficient sequencing can be realized by sequencing by a second-generation or third-generation sequencing method.
The sequencing is performed with at least one device selected from the group consisting of HiSeq 2000, SOLiD, 454, and single-molecule sequencing devices. The high-throughput, deep-sequencing characteristics of these devices benefit the analysis of the subsequent sequencing data, in particular the precision and accuracy of the statistical tests.
The invention provides use of the intestinal flora-based small-scale schizophrenia biomarker combination as a detection target or detection object in preparing a detection kit, wherein the kit is used for diagnosing whether a subject has schizophrenia or a related disease, or for predicting the subject's risk of schizophrenia or a related disease.
That is, the present invention provides a kit comprising reagents for detecting biomarkers, with which the relative abundance of these markers in the gut flora can be determined, and thus, the resulting relative abundance values can be used to determine whether a subject is suffering from or susceptible to schizophrenia, and to monitor the efficacy of treatment in patients with schizophrenia.
The invention provides application of a small-scale schizophrenia biomarker combination based on intestinal flora as a target spot in screening medicines for treating and/or preventing schizophrenia. The biomarkers are the biomarkers provided by the invention, and the influence of the candidate drug on the biomarkers before and after use can be utilized to determine whether the candidate drug can be used for treating or preventing schizophrenia.
The change in the relative abundance of the biomarker panel provides a basis for determining whether the drug candidate is effective.
Compared with the prior art, the invention has the following beneficial technical effects:
the present invention is based on the discovery and recognition of the following facts and problems: intestinal microorganisms are microbial communities present in the human intestinal tract, and are the "second genome" of the human body. The human intestinal flora and the host form a coherent whole. Gut microorganisms are capable of producing most of the neurotransmitters found in the human brain. There is increasing evidence supporting the view that gut microbiota influence central neurochemistry and behaviour, irritable bowel syndrome being considered as a typical case of disturbances in the regulation of the brain-gut microbiota axis. Transformation studies have shown that certain specific flora may have an effect on stress response and cognitive function. The probiotics or antibiotics are used for changing the intestinal microbiota, and a new method is provided for improving the brain function and treating intestinal-cerebral axis diseases such as depression and autism. Therefore, the invention screens out the biomarker with high association with schizophrenia by analyzing the intestinal flora and gene sequences of patients with schizophrenia and healthy people, and can accurately diagnose schizophrenia or predict the disease risk by using the biomarker and be used for monitoring the treatment effect.
Feces are metabolic products of the human body; they contain not only human metabolites but also intestinal microorganisms closely related to changes in the body's metabolism, immunity and brain function. Study of feces shows that the composition of the intestinal flora differs significantly between schizophrenia patients and healthy people, so that risk assessment and early diagnosis of schizophrenia can be carried out accurately. The invention obtains a number of related intestinal microorganisms by comparing and analyzing the intestinal flora of schizophrenia patients and healthy people and, using high-quality mOTUs of schizophrenia patients and healthy people as the training set, can accurately assess disease risk and make an early diagnosis of schizophrenia. Compared with conventional diagnostic methods, the method is convenient and rapid.
The schizophrenia-related biomarkers proposed by the present invention are valuable for early diagnosis. First, the markers of the present invention have high specificity and sensitivity. Second, analysis of stool ensures accuracy, safety, affordability and patient compliance, and stool samples are transportable. Polymerase chain reaction (PCR)-based assays are comfortable and non-invasive, so people are more likely to take part in a given screening procedure. Third, the markers of the invention can also be used as a tool for therapy monitoring in schizophrenic patients, to detect the response to therapy. With regard to abundance measurement, the combination of five markers in the present invention is particularly suitable for measuring abundance with the mOTU method.
Drawings
FIG. 1 is a graph of the β-diversity of schizophrenic patients and healthy controls at the genus level according to one embodiment of the present invention.
FIG. 2 is the error rate distribution of 5 repeats of 10-fold cross-validation in a random forest classifier according to an embodiment of the invention.
Fig. 3 is a Receiver Operating Characteristic (ROC) Curve and Area under the Curve (AUC) of a training set consisting of schizophrenic patients and healthy controls based on a random forest model (5 gut markers) according to an embodiment of the present invention.
Fig. 4 is a ROC curve and AUC for a validation set consisting of schizophrenic patients and healthy controls based on a random forest model (5 gut markers) according to an embodiment of the present invention.
Detailed Description
The terms used herein have meanings commonly understood by those of ordinary skill in the relevant art. However, for a better understanding of the present invention, some definitions and related terms are explained as follows:
"Schizophrenia" is a group of severe psychoses of unknown etiology, with a slow or subacute onset that usually occurs in young to middle adulthood. Clinically it presents as syndromes with varying symptoms, involving disturbances of sensation and perception, thinking, emotion and behavior, together with uncoordinated mental activity.
"biomarker," also referred to as "biological marker," refers to a measurable indicator of a biological state of an individual. Such biomarkers can be any substance in an individual as long as they are related to a particular biological state (e.g., disease) of the subject, e.g., nucleic acid markers (also referred to as gene markers, e.g., DNA), protein markers, cytokine markers, chemokine markers, carbohydrate markers, antigen markers, antibody markers, species markers (species/genus markers) and functional markers (KO/OG markers), and the like. The meaning of the nucleic acid marker is not limited to the existing gene that can be expressed as a protein having biological activity, and includes any nucleic acid fragment, which may be DNA, RNA, modified DNA or RNA, unmodified DNA or RNA, and a collection of these. Nucleic acid markers may also sometimes be referred to herein as signature fragments. In the present invention, biomarkers can also be denoted as "intestinal markers" because the biomarkers found to be associated with schizophrenia are all present in the intestinal tract of the subject. Biomarkers are measured and evaluated, often to examine normal biological processes, pathogenic processes, or therapeutic intervention pharmacological responses, and are useful in many scientific fields.
The biomarkers can be obtained by analyzing fecal samples from healthy people and schizophrenic patients in batches using high-throughput sequencing. Based on the high-throughput sequencing data, the healthy population is compared with the schizophrenic patient population to determine specific nucleic acid sequences associated with the schizophrenic patient population. Briefly, the procedure is as follows:
collecting and processing samples: collecting excrement samples of healthy people and schizophrenia patient groups, and performing DNA extraction by using the kit to obtain nucleic acid samples;
library construction and sequencing: constructing and sequencing a DNA library by using high-throughput sequencing so as to obtain a nucleic acid sequence of the intestinal microorganisms contained in the fecal sample;
the specific intestinal microorganism nucleic acid sequences related to schizophrenia patients are determined by bioinformatics analysis. First, the sequenced sequences (reads) are aligned with a reference gene set, which may be a newly constructed gene set or a database of any known sequences, e.g., a known non-redundant gene set of the human intestinal microflora. Next, based on the alignment results, the relative abundance of each gene is determined in the nucleic acid samples from the stool of the healthy population and of the schizophrenic patient population, respectively. Aligning the sequencing sequences with the reference gene set establishes the correspondence between the sequencing sequences and the genes in the reference gene set, so that, for a specific gene in the nucleic acid sample, the number of corresponding sequencing sequences effectively reflects the relative abundance of that gene. The relative abundance of a gene in a nucleic acid sample can therefore be determined from the alignment results by conventional statistical analysis. Finally, once the relative abundance of each gene has been determined, the relative abundances in the samples from the healthy population and the schizophrenic patient population are tested statistically to judge whether any gene differs significantly in relative abundance between the two populations; a gene that differs significantly is regarded as a biomarker of the abnormal state, i.e., a nucleic acid marker.
In addition, a known or newly constructed reference gene set usually includes species information and functional annotations for its genes. Once the relative abundance of the genes has been determined, the genes can therefore be further grouped by species information and functional annotation, which gives the species relative abundance and the functional relative abundance of each microorganism in the intestinal flora and, in turn, the species markers and functional markers of the abnormal state. Briefly, the method of determining species markers and functional markers further comprises: aligning the sequencing sequences of the healthy population and the schizophrenia patient population with a reference gene set; determining, based on the alignment results, the species relative abundance and the functional relative abundance in the nucleic acid samples of the healthy population and the schizophrenia patient population, respectively; performing statistical tests on the species relative abundance and the functional relative abundance in the nucleic acid samples from the two populations; and determining the species markers and functional markers whose relative abundance differs significantly between the nucleic acid samples of the healthy and schizophrenic patient populations. According to embodiments of the present invention, statistical operations such as the sum, mean or median of the relative abundances of genes from the same species, or of genes with the same functional annotation, can be used to determine the species relative abundance and the functional relative abundance.
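The aggregation from gene level to species level can be sketched as below. The function name, the gene identifiers, the species labels and the abundance values are hypothetical; the same grouping would apply to functional annotations by swapping the annotation table.

```python
from collections import defaultdict
from statistics import median


def species_abundance(gene_abundance, gene_to_species, how="sum"):
    """Aggregate gene-level relative abundances per annotated species."""
    grouped = defaultdict(list)
    for gene, value in gene_abundance.items():
        species = gene_to_species.get(gene)
        if species is not None:  # genes without a species annotation are skipped
            grouped[species].append(value)
    reducers = {
        "sum": sum,
        "mean": lambda values: sum(values) / len(values),
        "median": median,
    }
    reducer = reducers[how]
    return {species: reducer(values) for species, values in grouped.items()}


# Hypothetical gene abundances; g4 has no species annotation.
gene_abundance = {"g1": 0.02, "g2": 0.04, "g3": 0.01, "g4": 0.03}
gene_to_species = {"g1": "species_A", "g2": "species_A", "g3": "species_B"}
by_sum = species_abundance(gene_abundance, gene_to_species, how="sum")
by_median = species_abundance(gene_abundance, gene_to_species, how="median")
```

The median is less sensitive than the sum to a single gene with an unusually high read count, which is one reason a pipeline might prefer it.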
Unless otherwise indicated, the techniques used in the examples are conventional and well known to those skilled in the art, and may be performed according to Molecular Cloning: A Laboratory Manual (third edition) or the instructions of the related products; the reagents and products used are commercially available. Procedures and methods not described in detail are conventional methods well known in the art. The source, trade name and components of each reagent are indicated where it first appears, and the same reagent used thereafter is as first indicated unless otherwise specified.
The invention adopts a Metagenome-Wide Association Study (MWAS) analysis method, and analyzes the flora composition and functional difference of the excrement sample through sequencing; and distinguishing the schizophrenia groups and the non-schizophrenia groups by using a random forest distinguishing model to obtain the disease probability, and using the disease probability for evaluating, diagnosing and early diagnosing the disease risk of the schizophrenia or searching potential drug targets.
According to the present invention, the term "individual" refers to an animal, in particular a mammal, such as a primate, preferably a human.
According to the present invention, terms such as "a," "an," and "the" do not refer only to a singular entity, but also include the general class that may be used to describe a particular embodiment.
In the present invention, sequencing (next-generation sequencing) and MWAS are well known in the art and can be adjusted by those skilled in the art according to the specific situation. According to embodiments of the present invention, they can be performed according to the method described in the literature (Jun Wang and Huijue Jia. Metagenome-wide association studies: fine-mining the microbiome. Nature Reviews Microbiology 14.8 (2016): 508-).
According to the invention, the term "mOTU" refers to metagenomic operational taxonomic units (Sunagawa S, Mende D R, Zeller G, et al. Metagenomic species profiling using universal phylogenetic marker genes [J]. Nature Methods, 2013, 10(12): 1196), that is, uniform labels artificially assigned to a taxonomic unit (strain, species, genus, group, etc.) to facilitate analysis in phylogenetic or population genetic studies. Sequences are typically divided into different mOTUs according to a similarity threshold, and each mOTU is typically regarded as one microbial species.
In the present invention, the methods of using the random forest model and the ROC curve are well known in the art, and those skilled in the art can set and adjust the parameters according to the specific situation. According to embodiments of the present invention, the methods described in the literature can be used (Drogan D, Dunn WB, Lin W, Buijsse B, Schulze MB, Langenberg C, Brown M, Floegel A, Dietrich S, Rolandsson O, Wedge DC, Goodacre R, Forouhi NG, Sharp SJ, Spranger J, Wareham NJ, Boeing H: Untargeted metabolic profiling identifies altered serum metabolites of type 2 diabetes mellitus in a prospective, nested case control study. Clin Chem 2015, 61:487-497; Mihalik SJ, Michaliszyn SF, de las Heras J, Bacha F, Lee S, Chace DH, DeJesus VR, Vockley J, Arslanian SA: Metabolomic profiling of fatty acid and amino acid metabolism in youth with obesity and type 2 diabetes: evidence for enhanced mitochondrial oxidation. Diabetes Care 2012, 35:605-611).
In the invention, a training set of biomarkers of schizophrenic subjects and non-schizophrenic subjects is constructed, and the biomarker content value of a sample to be tested is evaluated by taking the training set as a reference.
Those skilled in the art know that, when the sample size is further expanded, the normal content value interval (absolute value) of each biomarker in a sample can be derived using sample detection and calculation methods well known in the art. The detected absolute value of the biomarker content can then be compared with the normal content value, optionally in combination with statistical methods, in order to assess the risk of schizophrenia, to diagnose schizophrenia, and to monitor the efficacy of treatment in patients with schizophrenia.
Without wishing to be bound by any theory, the inventors indicate that these biomarkers are the intestinal flora present in humans. The method of the invention is used for carrying out correlation analysis on intestinal flora of a subject to obtain a content range value of the biomarker of the schizophrenia population in flora detection.
The present invention will now be described in further detail with reference to specific examples, which are intended to be illustrative, but not limiting, of the invention.
Example 1
1.1 Sample collection
Stool samples were collected with reference to the method described in the document "A metagenome-wide association study of gut microbiota in type 2 diabetes" (Qin J et al. Nature. 2012, 490, 55-60), then frozen, transported, and rapidly transferred to -80 °C for storage; DNA extraction was then performed to obtain extracted DNA samples. The stool samples of the schizophrenic and non-schizophrenic subjects of the invention were from China. There were 171 samples in total: 81 from healthy subjects and 90 from schizophrenia patients.
1.2 Metagenomic sequencing and assembly
The extracted DNA samples were used to construct sequencing libraries for paired-end metagenomic sequencing (insert size 350 bp, read length 100 bp) on the Illumina HiSeq 2000 sequencing platform. The sequencing data were filtered (quality-controlled) to remove adapter contamination, low-quality sequences, and host genome contamination, yielding high-quality sequencing fragments (reads).
1.3 Conserved single-copy gene alignment and abundance calculation
The relative abundance of the species can be calculated by inputting the high-quality sequencing fragments (reads) obtained from the metagenomic sequencing and filtering in 1.2 into the mOTU software (http://www.bork.embl.de/software/mOTU/download.html), with reference to the method described in the document "Metagenomic species profiling using universal phylogenetic marker genes" (Sunagawa S et al., Nature Methods. 2013, 10(12), 1196-9). The abundance is calculated as follows: 1) aligning the high-quality sequencing fragments to the reference single-copy genes; 2) counting the number of inserts according to the alignment results; 3) normalizing the number of inserts by the length of the single-copy gene (that is, normalizing by the average gene length and rounding down) to obtain the abundance of the corresponding mOTU.
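As a non-limiting illustration of steps 2) and 3), length normalization of insert counts may be sketched as follows. This is a simplified sketch under stated assumptions and is not the algorithm implemented by the mOTU software: it assumes one insert count and one (average) marker gene length per mOTU, omits the rounding step, and rescales the length-normalized values so that they sum to 1. The function name `relative_abundance` and all values are hypothetical.

```python
def relative_abundance(insert_counts, gene_lengths):
    """Length-normalize insert counts and rescale them to sum to 1.

    insert_counts: dict mapping mOTU id -> number of aligned inserts.
    gene_lengths: dict mapping mOTU id -> (average) marker gene length in bp.
    """
    # Inserts per base pair, so that long genes are not favored.
    coverage = {m: insert_counts[m] / gene_lengths[m] for m in insert_counts}
    total = sum(coverage.values())
    if total == 0:
        return {m: 0.0 for m in coverage}
    return {m: value / total for m, value in coverage.items()}


# Hypothetical toy data for two mOTUs.
insert_counts = {"mOTU_1": 300, "mOTU_2": 100}
gene_lengths = {"mOTU_1": 1500, "mOTU_2": 1000}
abundance = relative_abundance(insert_counts, gene_lengths)
```

In this toy example mOTU_1 has three times the inserts of mOTU_2 but only twice its length-normalized coverage, which illustrates why the counts are divided by gene length before comparison.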
1.4 Screening of potential schizophrenia biomarkers by random forest (ROC/AUC)
In order to further screen potential intestinal biomarkers of the disease, a training set of biomarkers of schizophrenic subjects and non-schizophrenic subjects is constructed, and the biomarker content value of a sample to be tested is evaluated with the training set as a reference. In the present invention, the training set and the validation set have the meanings well known in the art. In an embodiment of the present invention, the training set refers to a data set containing the content of each biomarker in a certain number of test samples from schizophrenic subjects and non-schizophrenic subjects. The validation set is an independent data set used to test the performance of the model built on the training set. The non-schizophrenic subjects are subjects in a good mental state. The subjects can be human beings or model animals; in this embodiment, the experiments were performed with human subjects.
The method specifically comprises the following steps:
In the present invention, 80 schizophrenia patients and 71 healthy persons were randomly selected from the 171 samples (90 schizophrenia patients and 81 healthy persons) as a training set (Table 1), and the remaining samples (10 schizophrenia patients and 10 healthy persons) were used as a validation set.
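As a non-limiting illustration, the random division of the 171 samples into a training set and a validation set may be sketched as follows. The sample identifiers, the function name `split_samples`, and the random seed are hypothetical; only the group sizes are taken from the description above.

```python
import random


def split_samples(sz_ids, healthy_ids, n_val_sz=10, n_val_healthy=10, seed=0):
    """Randomly hold out a fixed number of samples per group for validation."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    val_sz = set(rng.sample(sz_ids, n_val_sz))
    val_healthy = set(rng.sample(healthy_ids, n_val_healthy))
    held_out = val_sz | val_healthy
    all_ids = list(sz_ids) + list(healthy_ids)
    training = [s for s in all_ids if s not in held_out]
    validation = [s for s in all_ids if s in held_out]
    return training, validation


# Hypothetical identifiers: 90 schizophrenia patients and 81 healthy persons.
sz_ids = [f"SZ{i}" for i in range(1, 91)]
healthy_ids = [f"H{i}" for i in range(1, 82)]
training_set, validation_set = split_samples(sz_ids, healthy_ids)
```

With these group sizes the training set contains 80 schizophrenia patients and 71 healthy persons, and the validation set contains 10 of each.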
1.4.1 biomarkers screened using training set data
First, the relative abundance of the mOTUs in each sample of the training set was calculated as described in 1.3. The mOTU relative abundances of the training set were then input into a random forest (RF) classifier (randomForest 4.6-12 in R 3.2.5). Five rounds of 10-fold cross-validation were performed on the classifier, the schizophrenia risk of each individual was calculated from the relative abundances of the mOTUs selected by the RF model, an ROC curve was drawn, and the AUC was calculated as the parameter for evaluating the performance of the discrimination model. The combination with fewer than 30 markers and the best discrimination performance was selected as the combination of the invention. The model outputs an importance index for each mOTU; the higher the importance index, the more important the marker is for discriminating schizophrenia from non-schizophrenia.
Finally, the RF classifier obtained in the present invention contains 5 mOTUs (i.e., 5 biomarkers). The relative abundances of these 5 biomarkers are shown in Table 1, and their detailed information is shown in Table 2. Table 3 shows the disease probabilities predicted for the training set by the combination of the 5 biomarkers, where a disease probability ≥ 0.5 indicates that the individual is at risk of schizophrenia or has schizophrenia.
Fig. 2 shows the error rate distribution of the 5 rounds of 10-fold cross-validation in the random forest classifier. The model was trained using the relative abundances of the mOTUs obtained from the MWAS processing that met the criteria. The thick black solid curve represents the average of the 5 rounds (the thin black curves represent the 5 individual rounds), and the black vertical line represents the number of mOTUs in the selected optimal combination.
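As a non-limiting illustration, the partitioning used for 5 rounds of 10-fold cross-validation may be sketched as follows. The sketch only generates the fold assignments and does not train a random forest; keeping the two groups balanced across folds (stratification) is an assumption of this sketch and is not stated in the description. The function name `repeated_stratified_folds` and the seed are hypothetical.

```python
import random


def repeated_stratified_folds(labels, n_folds=10, n_repeats=5, seed=0):
    """Return n_repeats partitions of the sample indices into n_folds folds.

    Within each repeat, the indices of every class are shuffled and dealt
    round-robin so that each fold holds a similar share of each class.
    """
    rng = random.Random(seed)
    repeats = []
    for _ in range(n_repeats):
        folds = [[] for _ in range(n_folds)]
        next_fold = 0
        for cls in sorted(set(labels)):
            indices = [i for i, y in enumerate(labels) if y == cls]
            rng.shuffle(indices)
            for i in indices:
                folds[next_fold % n_folds].append(i)
                next_fold += 1
        repeats.append(folds)
    return repeats


# Training set of the example: 80 schizophrenia (1) and 71 healthy (0) samples.
labels = [1] * 80 + [0] * 71
cv_partitions = repeated_stratified_folds(labels)

# In each round, every fold serves once as the held-out test fold while the
# classifier is fitted on the remaining nine folds.
for folds in cv_partitions:
    for held_out in folds:
        train_idx = [i for fold in folds if fold is not held_out for i in fold]
```

The cross-validated error rate for a given number of markers would then be averaged over the 50 held-out folds, which corresponds to the averaged curve described for Fig. 2.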
Figure 3 shows the ROC curve and AUC for the training set, consisting of schizophrenic patients and healthy controls, based on the random forest model (5 biomarkers), where specificity is the probability that a non-diseased individual is correctly classified as non-diseased and sensitivity is the probability that a diseased individual is correctly classified as diseased. The discriminatory performance on the training set samples is: AUC 80.88%, 95% confidence interval (CI) 74-87.76%. The results indicate that the mOTU combination obtained by this model can serve as potential biomarkers for distinguishing schizophrenia from non-schizophrenia.
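As a non-limiting illustration, the AUC of an ROC curve can be computed directly from predicted disease probabilities as the fraction of (diseased, non-diseased) pairs in which the diseased individual receives the higher probability, with ties counted as one half. The function name `roc_auc` and the toy values are hypothetical and are not data of the invention.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (rank statistic).

    labels: 1 for diseased, 0 for non-diseased.
    scores: predicted disease probabilities, in the same order as labels.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present to compute an AUC")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: two diseased and two non-diseased individuals.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)  # 3 of 4 pairs ranked correctly
```

An AUC of 0.5 corresponds to random ranking and an AUC of 1.0 to perfect separation of the two groups.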
1.4.2 validation of the biomarkers screened Using the validation set data
The invention then validated the model using an independent population; a disease probability ≥ 0.5 predicts that the individual is at risk of schizophrenia or has schizophrenia. First, the relative abundance of each biomarker in each sample of the validation set was calculated as described in 1.3. The validation set data were then assessed with the random forest model according to the method of 1.4.1.
Based on the model, the relative abundances of the 5 biomarkers in the validation set are shown in Table 4, and Table 5 shows the disease probabilities predicted for the validation set from the 5 biomarkers.
Figure 4 shows the ROC curve and AUC for the independent validation set, consisting of schizophrenic patients and healthy controls, based on the random forest model (5 biomarkers), with a discriminatory performance of AUC 71% (95% CI 45.88%-96.12%).
Random forest classification and regression were performed in R version 3.2.5 using the randomForest package (version 4.6-12). The inputs included the training set data (i.e., the relative abundances of the selected mOTU markers in the training samples, see Table 1), the sample disease status (the disease status of the training samples is a vector, with '1' for schizophrenia and '0' for healthy controls), and the validation set (the relative abundances of the selected mOTU markers in the validation set, see Table 4). The inventors then used the randomForest function of the randomForest package in R to build the classification model and to predict the validation set data; the output is the prediction result (the disease probability). The threshold is 0.5: if the disease probability is 0.5 or more, the individual is considered to be at risk of schizophrenia or to have schizophrenia.
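As a non-limiting illustration, applying the 0.5 threshold to predicted disease probabilities, and deriving sensitivity and specificity from the resulting calls, may be sketched as follows. The function names and the toy probabilities are hypothetical and are not taken from Tables 3 or 5.

```python
def classify(probabilities, threshold=0.5):
    """Call an individual positive when the probability is >= threshold."""
    return [1 if p >= threshold else 0 for p in probabilities]


def sensitivity_specificity(labels, predictions):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: '1' for schizophrenia, '0' for healthy controls.
true_status = [1, 1, 1, 0, 0]
disease_probability = [0.8, 0.5, 0.3, 0.6, 0.1]
calls = classify(disease_probability)
sensitivity, specificity = sensitivity_specificity(true_status, calls)
```

Note that a probability of exactly 0.5 is called positive, consistent with the "0.5 or more" criterion above.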
TABLE 1 Relative abundance data of the intestinal markers (mOTUs) in the training set of the random forest model
Figure RE-GDA0002400517510000091
Figure RE-GDA0002400517510000101
Figure RE-GDA0002400517510000111
Figure RE-GDA0002400517510000121
Figure RE-GDA0002400517510000131
SZ: patients with schizophrenia; h: healthy controls
TABLE 2 Details of the 5 biomarkers
Figure RE-GDA0002400517510000132
#: the AUC of the validation set represents the discriminatory power, on the validation set data, of the model obtained from the training set data.
&: txid indicates the identification number (taxonomy ID) of this biomarker in the NCBI database.
TABLE 3 Disease probabilities of the training set based on the 5-marker combination
Figure RE-GDA0002400517510000133
Figure RE-GDA0002400517510000141
Figure RE-GDA0002400517510000151
TABLE 4 Relative abundance data of the intestinal markers (mOTUs) in the validation set of the random forest model
Figure RE-GDA0002400517510000152
TABLE 5 Disease probabilities of the validation set based on the 5-marker combination
Figure RE-GDA0002400517510000153
Figure RE-GDA0002400517510000161
The results show that the biomarkers disclosed by the invention have high accuracy and specificity and good prospects for development into a diagnostic method, thereby providing a basis for risk assessment, diagnosis, and early diagnosis of schizophrenia and for searching for potential drug targets.
The invention therefore proposes the following applications:
Use of the intestinal flora-based small-scale schizophrenia biomarker combination as a detection target, or use of the detection target in the preparation of a detection kit.
Use of the intestinal flora-based small-scale schizophrenia biomarker combination as a target in screening drugs for treating and/or preventing schizophrenia.
The change in the relative abundance of the biomarker panel provides a basis for determining whether the drug candidate is effective.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims (8)

1. An intestinal flora based biomarker combination for small scale schizophrenia, comprising one or more of the following selected from the group consisting of:
biomarker 1: Bacteroides plebeius;
biomarker 2: Akkermansia muciniphila;
biomarker 3: Enterococcus faecalium;
biomarker 4: Eubacterium siraeum;
biomarker 5: bacteroides intestinalis.
2. The gut flora based small scale schizophrenia biomarker panel of claim 1, wherein said biomarker panel provides relative abundance information for comparison to a reference value.
3. The gut flora-based small scale schizophrenia biomarker panel of claim 1, wherein the relative abundance information of biomarkers 1-5 is provided based on gene sequences for which abundance calculations can be performed.
4. Use of the intestinal flora-based small-scale schizophrenia biomarker combination as defined in claim 1 as a detection target, or use of the detection target in the preparation of a detection kit.
5. Use of the gut flora based small scale schizophrenia biomarker combination of claim 1 as a target for screening for a medicament for the treatment and/or prevention of schizophrenia.
6. The use of claim 5, wherein the change in the relative abundance of the biomarker panel provides a basis for determining whether the drug candidate is effective.
7. The method for screening a small-scale schizophrenia biomarker combination based on the intestinal flora of claim 1, characterized by the steps of:
1) Sample collection: collecting a stool sample, freezing, transporting, rapidly transferring to -80 °C for storage, and performing DNA extraction to obtain an extracted DNA sample, wherein the sample subjects comprise schizophrenia patients and healthy people;
2) Metagenomic sequencing and assembly;
3) Conserved single-copy gene alignment and abundance calculation: inputting high-quality sequencing fragments into the mOTU software to calculate the relative abundance of the species:
3.1) aligning the high quality sequencing fragments to a reference single copy gene;
3.2) counting the number of the inserted fragments according to the comparison result;
3.3) normalizing the number of inserts to the length of a single copy gene to obtain the corresponding abundance.
4) Randomly selecting schizophrenia patients and healthy people from the sample set as a training set and using the remaining samples as a validation set; calculating the relative abundance of the mOTUs in each sample of the training set; inputting the mOTU relative abundances of the training set into a random forest classifier; performing 5 rounds of 10-fold cross-validation on the classifier; calculating the schizophrenia risk of each individual from the relative abundances of the mOTUs selected by the RF model; drawing an ROC curve and calculating the AUC as the parameter for evaluating the performance of the discrimination model; selecting the combination with fewer than 30 markers and the best discrimination performance; and outputting an importance index for each mOTU in the model, wherein a higher importance index indicates that the marker is more important for discriminating schizophrenia from non-schizophrenia.
8. The screening method according to claim 7, wherein the specimen set includes 90 schizophrenia patients and 81 healthy persons, and the validation set includes 10 schizophrenia patients and 10 healthy persons.
CN201910605146.1A 2019-07-05 2019-07-05 Intestinal flora-based small-scale schizophrenia biomarker combination, application thereof and mOTU screening method Pending CN111020021A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910605146.1A CN111020021A (en) 2019-07-05 2019-07-05 Intestinal flora-based small-scale schizophrenia biomarker combination, application thereof and mOTU screening method

Publications (1)

Publication Number Publication Date
CN111020021A true CN111020021A (en) 2020-04-17

Family

ID=70200088

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910605146.1A Pending CN111020021A (en) 2019-07-05 2019-07-05 Intestinal flora-based small-scale schizophrenia biomarker combination, application thereof and mOTU screening method

Country Status (1)

Country Link
CN (1) CN111020021A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114703270A (en) * 2021-12-31 2022-07-05 杭州拓宏生物科技有限公司 Schizophrenia marker gene and application thereof

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120207726A1 (en) * 2009-06-16 2012-08-16 The Trustees Of Columbia University In The City Of New York Autism-associated biomarkers and uses thereof
CN105603066A (en) * 2016-01-13 2016-05-25 金锋 Mental disorder related intestinal tract microbial marker and application thereof
CN107746874A (en) * 2017-11-06 2018-03-02 张猛 Schizophrenia mark
CN108345768A (en) * 2017-01-20 2018-07-31 深圳华大生命科学研究院 A kind of method and marker combination of determining infant's intestinal flora maturity
CN108508211A (en) * 2018-04-04 2018-09-07 中央民族大学 Starting non-medication schizophrenic patients blood serum designated object FGF9 and its application

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
PENG ZHENG ET AL.: "The gut microbiome from patients with schizophrenia modulates the glutamate-glutamine-GABA cycle and schizophrenia-relevant behaviors in mice" *
YANG SHEN ET AL.: "Analysis of gut microbiota diversity and auxiliary diagnosis as a biomarker in patients with schizophrenia: A cross-sectional study" *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200417
