CN114093411B - Method and equipment for analyzing evolutionary relationship and abundance information of microbial population based on sample - Google Patents

Publication number
CN114093411B
Authority
CN
China
Prior art keywords
sample
population
microbial
microorganisms
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111430273.6A
Other languages
Chinese (zh)
Other versions
CN114093411A (en
Inventor
何昆仑
韩洋
贾志龙
宋欣雨
于康
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chinese PLA General Hospital
Original Assignee
Chinese PLA General Hospital
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chinese PLA General Hospital
Priority to CN202111430273.6A
Publication of CN114093411A
Application granted
Publication of CN114093411B
Legal status: Active
Anticipated expiration

Classifications

    • G16B10/00 ICT specially adapted for evolutionary bioinformatics, e.g. phylogenetic tree construction or analysis
    • C12Q1/6869 Methods for sequencing
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C12Q1/689 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for bacteria
    • C12Q1/6895 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for plants, fungi or algae
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G06F18/2414 Smoothing the distance, e.g. radial basis function networks [RBFN]
    • G06F18/24323 Tree-organised classifiers
    • G06F18/29 Graphical models, e.g. Bayesian networks
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/045 Combinations of networks
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G16B20/20 Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • C12Q2600/106 Pharmacogenomics, i.e. genetic variability in individual responses to drugs and drug metabolism
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Theoretical Computer Science (AREA)
  • Organic Chemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Analytical Chemistry (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Wood Science & Technology (AREA)
  • Zoology (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Biotechnology (AREA)
  • Genetics & Genomics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Immunology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biochemistry (AREA)
  • Biomedical Technology (AREA)
  • Microbiology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Pathology (AREA)
  • Medical Informatics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Public Health (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Databases & Information Systems (AREA)
  • Hospice & Palliative Care (AREA)
  • Epidemiology (AREA)

Abstract

The invention relates to a method and equipment for analyzing evolutionary relationship and abundance information of microbial populations based on samples. The method comprises the following steps: acquiring genetic information of the microbial populations of a sample, constructing a phylogenetic tree of the microbial populations of the sample, and extracting features of the evolutionary relationships among the microbial populations from the phylogenetic tree; acquiring abundance information of the microbial populations of the sample, and extracting features of the abundance information; fusing the features of the evolutionary relationships among the microbial populations with the features of the abundance information to obtain a feature set; and inputting the feature set into a classifier to obtain a classification result for the sample. The method integrates the evolutionary information provided by the microbial phylogenetic tree with microbial abundance information and mines the biological patterns underlying the microbial data in depth, and therefore has important application value.

Description

Method and equipment for analyzing evolutionary relationship and abundance information of microbial populations based on samples
Technical Field
The invention relates to the field of microbial data analysis, and in particular to a method, equipment, a system and a computer-readable storage medium for analyzing evolutionary relationship and abundance information of microbial populations based on samples, and to applications thereof.
Background
The microbiome plays an important role in human health and in the development of disease. In recent years, a large body of research has shown that the composition and structure of the intestinal flora are closely related to many chronic systemic metabolic diseases, such as diabetes and obesity, and even to cancer. Microorganisms also affect the development and maturation of the body's immune system, and a growing number of studies suggest that human health is closely related to the microorganisms in the body. Human-associated microorganisms are highly diverse, and their distribution and abundance in the human body vary greatly; because microbiome data are large and complex, researchers face considerable challenges in analyzing them. AI provides researchers with a new tool for analyzing microbiome data and can help uncover more associations between the microbiome and host health. At present, however, machine learning and AI studies of microorganisms focus mainly on microbial abundance and DNA sequences, examining the relationships between microorganisms and disease or population characteristics, and do not consider the relationships among the microorganisms themselves.
In recent years, graph neural networks, an emerging technique for learning from graph data, have attracted extensive attention in the field of deep learning; they combine graph data with deep learning. Graph data is a general representation for describing relationships and has a wide range of application scenarios. Analyzing a microbial relationship graph with a graph neural network, and further combining it with the data, is innovative in the life sciences and can benefit research in that field.
Disclosure of Invention
The method integrates the evolutionary information provided by the microbial phylogenetic tree with microbial abundance information, and mines the biological patterns underlying the microbial data in depth in order to address related life science problems.
The application discloses a method for analyzing evolutionary relationship and abundance information of microbial populations based on a sample, comprising:
acquiring genetic information of microbial communities in a sample, constructing a phylogenetic tree of the microbial communities in the sample, and extracting characteristics of the evolutionary relationship among the microbial communities according to the phylogenetic tree;
acquiring abundance information of microbial populations of a sample, and extracting characteristics of the abundance information of the microbial populations;
fusing features of the evolutionary relationships among the populations of microorganisms with features of the abundance information of the populations of microorganisms to obtain a set of features;
and inputting the feature set into a classifier to obtain a classification result of the sample.
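The fusion and classification steps above can be sketched as follows. This is a minimal illustration, not the patented implementation: fusion is assumed to be simple concatenation, the classifier is a nearest-centroid stand-in, and the feature values, labels and function names (`fuse`, `centroid`, `classify`) are all hypothetical.

```python
import math

def fuse(evo_features, abundance_features):
    """Assumed fusion: concatenate the two feature vectors."""
    return list(evo_features) + list(abundance_features)

def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def classify(x, centroids):
    """Return the label whose centroid is closest to x (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(x, centroids[label]))

# Hypothetical training samples: (evolutionary features, abundance features).
TRAIN = {
    "healthy": [fuse([0.2, 0.2], [0.5, 0.5]), fuse([0.3, 0.1], [0.6, 0.4])],
    "disease": [fuse([0.8, 0.9], [0.1, 0.9]), fuse([0.7, 0.8], [0.2, 0.8])],
}
CENTROIDS = {label: centroid(rows) for label, rows in TRAIN.items()}

# Classify a new, hypothetical sample from its fused feature set.
PRED = classify(fuse([0.25, 0.15], [0.55, 0.45]), CENTROIDS)
```

Any classifier that accepts a fixed-length feature vector could take the place of the nearest-centroid rule used here.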
Further, the analysis method also comprises: extracting features of the abundance information of the microbial population, fusing features of the feature set with the features of the abundance information of the microbial population to obtain a fused multi-dimensional feature set, and inputting the multi-dimensional feature set into a classifier to obtain a classification result of the sample;
optionally, the abundance information of the microbial population is convolved to obtain features of the abundance information, the feature set is convolved to obtain features of the feature set, and the two groups of convolved features are fused to obtain the fused multi-dimensional feature set.
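A minimal sketch of this second, convolution-based fusion is given below. It assumes that both fusions are concatenations and that the convolution is a one-dimensional filter with a fixed, hand-picked kernel; in practice the kernel would be learned. The input values and the names `conv1d` and `fuse` are hypothetical.

```python
import numpy as np

def conv1d(x, kernel):
    """1-D 'valid' cross-correlation, as used in CNN layers.

    np.convolve flips its second argument, so the kernel is reversed first.
    """
    return np.convolve(x, kernel[::-1], mode="valid")

def fuse(*parts):
    """Assumed fusion operator: concatenation."""
    return np.concatenate(parts)

# Hypothetical inputs for one sample with four taxa.
ABUNDANCE = np.array([0.3, 0.1, 0.6, 0.0])   # normalized abundance
EVO_FEATS = np.array([0.2, 0.2, 0.6, 0.0])   # evolutionary-relationship features

FEATURE_SET = fuse(EVO_FEATS, ABUNDANCE)     # first fusion
KERNEL = np.array([1.0, -1.0])               # illustrative fixed kernel

# Second fusion: convolve each input, then concatenate the results.
MULTI = fuse(conv1d(ABUNDANCE, KERNEL), conv1d(FEATURE_SET, KERNEL))
```

The resulting vector `MULTI` plays the role of the multi-dimensional feature set that is passed to the classifier.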
Further, the genetic information of the microbial population of the sample is obtained using a method comprising high-throughput sequencing; optionally, the high-throughput sequencing is of two types: amplicon sequencing based on the 16S rDNA, 18S rDNA and ITS regions, and metagenomic sequencing;
optionally, the genetic information of the microbial population is the sequence and/or structure of the DNA or protein of the microbial population, and the phylogenetic tree of the microbial population is constructed from that sequence and/or structure;
optionally, the phylogenetic tree of the microbial population of the sample is constructed using methods comprising a distance method, a maximum parsimony method, a maximum likelihood method and a Bayesian method;
optionally, a relationship matrix among the microbial populations is obtained according to the phylogenetic tree, and the evolutionary relationship among the microbial populations is extracted.
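One way to derive a relationship matrix from a phylogenetic tree is to compute the patristic distance (the sum of branch lengths along the connecting path) between every pair of taxa. The sketch below assumes this definition; the text does not specify how the matrix is computed, and the toy tree, taxon names and function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy tree: child -> (parent, branch length). "root" has no parent.
TREE = {
    "n1": ("root", 0.5),
    "OTU_A": ("n1", 0.1),
    "OTU_B": ("n1", 0.2),
    "OTU_C": ("root", 0.9),
}

def path_to_root(node, tree):
    """Map the node and each of its ancestors to its distance from the node."""
    dist, out = 0.0, {node: 0.0}
    while node in tree:
        parent, length = tree[node]
        dist += length
        out[parent] = dist
        node = parent
    return out

def patristic_distance(a, b, tree):
    """Sum of branch lengths on the path between two nodes."""
    up_a, up_b = path_to_root(a, tree), path_to_root(b, tree)
    # With non-negative branch lengths, the lowest common ancestor
    # gives the smallest combined distance among all shared ancestors.
    return min(up_a[n] + up_b[n] for n in up_a if n in up_b)

def relationship_matrix(leaves, tree):
    """Symmetric matrix of pairwise patristic distances between leaves."""
    n = len(leaves)
    m = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        m[i][j] = m[j][i] = patristic_distance(leaves[i], leaves[j], tree)
    return m

LEAVES = ["OTU_A", "OTU_B", "OTU_C"]
MATRIX = relationship_matrix(LEAVES, TREE)
```

For this toy tree the pairwise distances are 0.3 (A to B), 1.5 (A to C) and 1.6 (B to C), up to floating-point rounding.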
Further, the abundance information of the microbial populations of the sample is obtained based on an OTU analysis;
optionally, after obtaining information on the abundance of the population of microorganisms in the sample, a pre-treatment is performed, including normalization.
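As an illustration of the normalization step, the sketch below converts raw OTU counts to relative abundances so that each sample sums to 1. Total-sum scaling is an assumption here; the text only states that pre-treatment includes normalization. The OTU table and names are hypothetical.

```python
def relative_abundance(counts):
    """Scale raw OTU counts so that they sum to 1 (total-sum scaling)."""
    total = sum(counts)
    if total == 0:
        # A sample with no reads: return zeros rather than dividing by zero.
        return [0.0 for _ in counts]
    return [c / total for c in counts]

# Hypothetical OTU table: sample -> read counts for three OTUs.
OTU_TABLE = {
    "sample_1": [30, 10, 60],
    "sample_2": [0, 5, 15],
}
NORMALIZED = {name: relative_abundance(c) for name, c in OTU_TABLE.items()}
```

Other schemes, such as log transformation or rarefaction, could be substituted without changing the rest of the pipeline.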
Further, the sample is a host of a microbial population or a growth environment of a microbial population; optionally, the host comprises a human, an animal or a plant, and the growing environment comprises soil, water; optionally, the sample is from the mouth, intestine, skin, stomach, esophagus, stool, urethra, blood, eye, nasopharynx, external auditory canal, vagina, or lung of the host.
An analysis apparatus based on evolutionary relationship and abundance information of a population of microorganisms of a sample, the apparatus comprising: a memory and a processor;
the memory is to store program instructions;
the processor is configured to invoke program instructions that, when executed, perform the above-described method of analyzing the evolutionary relationship and abundance information of a population of microorganisms based on a sample.
An analysis system based on evolutionary relationship and abundance information of a population of microorganisms of a sample, comprising:
a first acquisition unit configured to acquire genetic information of microbial populations of a sample, construct a phylogenetic tree of the microbial populations of the sample, and extract characteristics of an evolutionary relationship among the microbial populations based on the phylogenetic tree;
a second acquisition unit configured to acquire abundance information of a microbial population of a sample and extract a feature of the abundance information of the microbial population;
a first fusion unit for fusing the characteristics of the evolutionary relationship among the microbial populations with the characteristics of the abundance information of the microbial populations to obtain a set of characteristics;
and the classification unit is used for inputting the feature set into a classifier to obtain a classification result of the sample.
An analysis system based on evolutionary relationship and abundance information of a population of microorganisms of a sample, comprising:
a first acquisition unit configured to acquire genetic information of microbial populations of a sample, construct a phylogenetic tree of the microbial populations of the sample, and extract a feature of an evolutionary relationship among the microbial populations based on the phylogenetic tree;
a second acquisition unit configured to acquire abundance information of a microbial population of a sample and extract a feature of the abundance information of the microbial population;
a first fusion unit for fusing the characteristics of the evolutionary relationship among the microbial populations with the characteristics of the abundance information of the microbial populations to obtain a set of characteristics;
the second fusion unit is used for fusing the characteristic set with the characteristic of the abundance information of the microbial population in the second acquisition unit to obtain a fused multi-dimensional characteristic set;
and the classification unit is used for inputting the multi-dimensional feature set into a classifier to obtain a classification result of the sample.
A computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the above-described method for analyzing evolutionary relationship and abundance information of a population of microorganisms based on a sample.
Any of the following applications:
the application of the device in diagnosis of the occurrence and development of diseases; optionally, the occurrence and development of the disease are associated with changes in the type and number of microorganisms; optionally, the disease includes hypertension, obesity, tumors, food allergy, cholelithiasis, urinary incontinence, acne, osteoarthritis, inflammatory bowel disease, type 2 diabetes, constipation, recurrent urinary tract infection (UTI), celiac disease, colitis, kidney disease, neurological disease, and the like;
the use of the apparatus described above to assist in the selection of a disease treatment regimen; optionally, the treatment regimen includes the selection of therapeutic drugs, whether to administer immunotherapy, and the like, which are affected by changes in the type and number of microorganisms;
The application of the equipment in ecosystem monitoring;
the application of the device in sample classification or prediction of the properties of the sample;
the application of the system in diagnosis of the occurrence and development of diseases; optionally the development of said disease is associated with changes in the type and number of microorganisms;
the use of the system described above in sample classification or predicting attributes of a sample;
use of the system described above to assist in the selection of a disease treatment regimen;
the application of the system in ecosystem monitoring.
The application has the advantages that:
1. the biological patterns underlying the microbial data are mined in depth, and the accuracy and depth of data analysis are greatly improved through in-depth analysis across multiple dimensions, such as the evolutionary relationships and abundance information of the microbial populations;
2. features of the evolutionary relationships among the microbial populations and features of the abundance information of the microbial populations are innovatively fused to obtain a feature set, and the feature set is then fused again with the features of the abundance information to obtain a multi-dimensional feature set, making full use of the microbial population data;
3. the application discloses an analysis device and system based on the evolutionary relationship and abundance information of the microbial population of a sample; through in-depth interpretation of the microbial population data of the sample, they can be applied more accurately to fields such as auxiliary diagnosis of the occurrence and development of diseases related to changes in the types and numbers of microorganisms, auxiliary selection of treatment regimens, ecosystem monitoring, and prediction of sample attributes.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flow chart of a method for analyzing evolutionary relationship and abundance information of microbial populations based on a sample, provided by an embodiment of the present invention;
FIG. 2 is a schematic diagram of an analysis apparatus for analyzing information on the evolutionary relationships and abundances of microbial populations based on a sample, according to an embodiment of the present invention;
FIG. 3 is a schematic flow chart of an analysis system for sample-based information on the evolutionary relationships and abundances of microbial populations provided by embodiments of the present invention;
FIG. 4 is a diagram of the classification results of multiple deep learning algorithms on the Chinese Tibetan data set, provided by an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention.
Some of the flows described in this specification, the claims and the above figures include a number of operations that appear in a particular order. It should be clearly understood, however, that these operations may be performed out of the order in which they appear herein, or in parallel. Operation numbers such as 101 and 102 merely distinguish the operations from one another and do not themselves represent any order of execution. In addition, the flows may include more or fewer operations, and the operations may be performed sequentially or in parallel. It should be noted that terms such as "first" and "second" herein are used to distinguish different messages, devices, modules, and the like; they do not represent a sequential order, nor do they require that the "first" and "second" items be of different types.
It is obvious that the described embodiments are only some, and not all, of the embodiments of the present invention. All other embodiments obtained by a person skilled in the art without creative effort, based on the embodiments of the present invention, fall within the protection scope of the present invention.
Fig. 1 is a schematic flow chart of a method for analyzing evolutionary relationship and abundance information of microbial populations based on a sample, which includes the following steps:
101: acquiring genetic information of microbial communities in a sample, constructing a phylogenetic tree of the microbial communities in the sample, and extracting characteristics of the evolutionary relationship among the microbial communities according to the phylogenetic tree;
In one embodiment, the genetic information of the microbial population of the sample is obtained using a method comprising high-throughput sequencing; optionally, the high-throughput sequencing is of two types: amplicon sequencing based on the 16S rDNA, 18S rDNA and ITS regions, and metagenomic sequencing.
In one embodiment, the genetic information of the microbial population is the sequence and/or structure of the DNA or protein of the microbial population, from which the phylogenetic tree of the microbial population is constructed.
In one embodiment, the phylogenetic tree of the microbial population of the sample is constructed using methods comprising a distance method, a maximum parsimony method, a maximum likelihood method and a Bayesian method; optionally, a relationship matrix among the microbial populations is obtained according to the phylogenetic tree, and the evolutionary relationship among the microbial populations is extracted.
In one embodiment, a relationship matrix between populations of microorganisms is obtained from the phylogenetic tree, and a graph of relationships between populations of microorganisms is constructed using a graph-related software package (e.g., NetworkX) to extract evolutionary relationships between the populations of microorganisms.
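The graph construction described above might look like the following NetworkX sketch. The distance matrix values, the taxon names, the distance threshold `max_dist` and the helper `build_relation_graph` are hypothetical; the text does not say how edges are selected, so connecting taxa whose distance falls below a cutoff is only one possible choice.

```python
import networkx as nx

# Hypothetical pairwise evolutionary distances between three taxa.
TAXA = ["OTU_A", "OTU_B", "OTU_C"]
DIST = [
    [0.0, 0.3, 1.5],
    [0.3, 0.0, 1.6],
    [1.5, 1.6, 0.0],
]

def build_relation_graph(taxa, dist, max_dist):
    """Connect taxa whose distance is at most max_dist (assumed rule)."""
    graph = nx.Graph()
    graph.add_nodes_from(taxa)
    for i in range(len(taxa)):
        for j in range(i + 1, len(taxa)):
            if dist[i][j] <= max_dist:
                # Keep the distance as an edge attribute for later use.
                graph.add_edge(taxa[i], taxa[j], weight=dist[i][j])
    return graph

GRAPH = build_relation_graph(TAXA, DIST, max_dist=1.0)
```

With the cutoff set to 1.0, only OTU_A and OTU_B are connected, while OTU_C remains an isolated node.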
In one embodiment, genetic information of the microbial population of the sample is obtained, a phylogenetic tree of the microbial population of the sample is constructed, a graph of the relationships among the microbial populations is obtained, and a graph neural network is used to extract the features of the evolutionary relationships among the microbial populations. Optionally, the graph neural network may be a graph convolutional network (GCN), a graph attention network (GAT), Graph LSTM, etc.
16S rDNA sequencing: 16S rDNA is the DNA sequence encoding the rRNA of the prokaryotic ribosomal small subunit. A hypervariable region of the 16S rDNA is sequenced on a HiSeq sequencer or the newer MiSeq sequencer in order to analyze the diversity of bacteria or archaea among environmental microorganisms.
18S rDNA sequencing: 18S rDNA is the DNA sequence encoding the rRNA of the eukaryotic ribosomal small subunit. Structurally, it is divided into conserved regions and hypervariable regions; the conserved regions reflect the genetic relationships among species, and the hypervariable regions reflect the differences among species.
ITS sequencing: the ITS comprises two regions, ITS1 and ITS2; ITS1 is located between the 18S and 5.8S rDNA sequences of the eukaryotic ribosome, and ITS2 is located between the 5.8S and 28S rDNA sequences. ITS1 or ITS2 is sequenced in order to analyze fungal diversity among environmental microorganisms.
Metagenomics refers to the simultaneous study of the DNA of an entire microbial community. Metagenomic sequencing is the sequencing of the genomes of the microbial population in a sample.
In one embodiment, the sample is a host of a population of microorganisms or a growing environment of a population of microorganisms; optionally, the host comprises a human, an animal or a plant, and the growing environment comprises soil, water; optionally, the sample is from the mouth, intestine, skin, stomach, esophagus, stool, urethra, blood, eye, nasopharynx, external auditory canal, vagina, or lung of the host.
Sequencing can currently be performed on a variety of sequencers, including the Roche 454, the Illumina NovaSeq, MiSeq, and HiSeq, the Life PGM, and third-generation sequencers from PacBio and Nanopore.
The phylogenetic tree of the microbial population of the sample may be constructed by the distance method, the maximum parsimony method, the maximum likelihood method, or the Bayesian method.
The distance method is represented by the Neighbor-Joining (NJ) method, which is suitable for constructing phylogenetic trees from short sequences with small evolutionary distances and few informative sites.
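As a rough illustration of how a distance method proceeds, the sketch below performs the first joining step of Neighbor-Joining on a small distance matrix: it computes the standard Q criterion for every pair, picks the pair with the smallest value, and derives the two branch lengths to the new internal node. The five taxa and their distances are made-up example data, and the function name `nj_first_join` is hypothetical; a full NJ run would repeat this step on the reduced matrix.

```python
def nj_first_join(names, dist):
    """Return the first pair joined by Neighbor-Joining and its branch lengths.

    dist is a symmetric matrix of pairwise distances with a zero diagonal;
    at least three taxa are required.
    """
    n = len(names)
    row_sums = [sum(row) for row in dist]
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            # Q criterion: small values mark pairs that are close to each
            # other and far from everything else.
            q = (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
            if best is None or q < best[0]:
                best = (q, i, j)
    q, i, j = best
    # Branch lengths from the joined taxa to the new internal node.
    limb_i = dist[i][j] / 2 + (row_sums[i] - row_sums[j]) / (2 * (n - 2))
    limb_j = dist[i][j] - limb_i
    return names[i], names[j], q, limb_i, limb_j


# Made-up distances between five hypothetical taxa.
NAMES = ["a", "b", "c", "d", "e"]
DIST = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
FIRST_JOIN = nj_first_join(NAMES, DIST)
```

For this matrix the pair (a, b) has the smallest Q value and is joined first, even though (d, e) has the smallest raw distance, which is the behavior that distinguishes NJ from simple nearest-pair clustering.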
The maximum parsimony method is based on the assumption that evolution proceeds with the fewest changes: the minimum number of nucleotide (or amino acid) substitutions required is calculated for every candidate topology, and the topology requiring the fewest substitutions is selected as the optimal phylogenetic tree. In other words, all candidate trees are compared and the tree with the smallest length is taken as the final phylogenetic tree, i.e., the maximum parsimony tree.
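The idea of counting substitutions per candidate tree and keeping the shortest tree can be sketched with Fitch's small-parsimony algorithm. The sketch assumes rooted binary trees written as nested tuples, an already aligned set of sequences, and unit cost for every substitution; the alignment and taxon names are invented for the example, and only the three topologies of four taxa are compared.

```python
def fitch(tree, states):
    """Return (possible ancestral states, substitutions) for one site."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no substitution needed here.
        return common, left_cost + right_cost
    # Children disagree: one substitution is required at this node.
    return left_set | right_set, left_cost + right_cost + 1


def tree_length(tree, alignment):
    """Total number of substitutions over all alignment columns."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for site in range(n_sites):
        states = {taxon: seq[site] for taxon, seq in alignment.items()}
        total += fitch(tree, states)[1]
    return total


# Invented alignment of four hypothetical taxa.
ALIGNMENT = {"t1": "AAG", "t2": "AAG", "t3": "CAT", "t4": "CAT"}

# The three possible topologies for four taxa.
TREES = {
    "((t1,t2),(t3,t4))": (("t1", "t2"), ("t3", "t4")),
    "((t1,t3),(t2,t4))": (("t1", "t3"), ("t2", "t4")),
    "((t1,t4),(t2,t3))": (("t1", "t4"), ("t2", "t3")),
}
LENGTHS = {name: tree_length(tree, ALIGNMENT) for name, tree in TREES.items()}
BEST_TREE = min(LENGTHS, key=LENGTHS.get)
```

Exhaustive comparison as done here is only feasible for very few taxa; practical software searches the tree space heuristically.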
Maximum likelihood was first applied to phylogenetic analysis on gene frequency data. For each position, the method accumulates the probabilities of all possible residue substitutions, taking into account the likelihood of each residue occurring at that position, to produce a likelihood for that position. The maximum likelihood (ML) method computes the likelihood function for every candidate phylogenetic tree, and the tree with the largest likelihood value is taken as the most probable phylogenetic tree.
Bayesian methods use Markov chain Monte Carlo (MCMC) to estimate, under various molecular evolution models, the posterior probability of all parameters, including the topology, the branch lengths, and the parameters of the substitution model. The method can quantify the model parameters directly and can analyze large data sets; the credibility of each branch is expressed by its posterior probability, so no bootstrap test is required.
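The principle of multiplying per-position likelihoods and keeping the parameter value with the largest total can be shown on the smallest possible case: two aligned sequences separated by one branch. The sketch assumes the Jukes-Cantor (JC69) substitution model, which the text does not name, and searches a fixed grid of branch lengths instead of using a numerical optimizer. The two sequences are invented. Full ML tree inference repeats this kind of calculation over whole trees and many parameters.

```python
import math


def jc69_log_likelihood(seq_a, seq_b, d):
    """Log-likelihood of two aligned sequences at JC69 distance d."""
    decay = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * decay   # probability a site keeps its base
    p_diff = 0.25 - 0.25 * decay   # probability of one specific other base
    total = 0.0
    for x, y in zip(seq_a, seq_b):
        # 0.25 is the equilibrium frequency of the starting base.
        total += math.log(0.25 * (p_same if x == y else p_diff))
    return total


def ml_distance(seq_a, seq_b):
    """Grid search for the distance with the highest likelihood."""
    candidates = [k / 1000 for k in range(1, 2001)]
    return max(candidates, key=lambda d: jc69_log_likelihood(seq_a, seq_b, d))


# Invented sequences: 20 sites, 2 of which differ.
SEQ_A = "ACGTACGTACGTACGTACGT"
SEQ_B = "GCGTACGTACATACGTACGT"
ML_ESTIMATE = ml_distance(SEQ_A, SEQ_B)

# Closed-form JC69 distance for comparison: d = -3/4 * ln(1 - 4p/3).
P_OBSERVED = sum(1 for x, y in zip(SEQ_A, SEQ_B) if x != y) / len(SEQ_A)
CLOSED_FORM = -0.75 * math.log(1.0 - 4.0 * P_OBSERVED / 3.0)
```

The grid estimate agrees with the closed-form JC69 distance to within the grid spacing, which is a useful sanity check on the likelihood function.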
Steps for constructing and displaying the phylogenetic tree:
(1) Preparing data: data commonly used to construct phylogenetic trees include morphological data and molecular data. Morphological data are obtained mainly by coding morphological characters; molecular data are obtained mainly by sequencing or by download from the public database GenBank.
(2) Processing data: including sequence assembly, sequence alignment, and correction of ambiguous sites;
(3) Selecting a model: the best-fitting model is typically evaluated before the phylogenetic tree is constructed, using software such as ModelTest, MrModelTest, or jModelTest.
(4) Constructing the tree: commonly used methods for constructing phylogenetic trees can be used, including the distance method, maximum parsimony (MP), maximum likelihood (ML), and Bayesian inference (BI) (Hall, 2008).
(5) Displaying: common software for editing and displaying phylogenetic trees includes TreeView, FigTree, MEGA, iTOL (http://itol.embl.de/), R packages (ggtree, APE), etc.
102: acquiring abundance information of microbial populations of a sample, and extracting characteristics of the abundance information of the microbial populations;
In one embodiment, the abundance information of the microbial population of the sample is obtained by OTU-based analysis.
OTUs (operational taxonomic units) are uniform labels assigned artificially to a taxonomic unit (strain, species, genus, group, etc.) in phylogenetic or population genetics studies for convenience of analysis. Sequences are typically clustered into different OTUs at a similarity threshold of 97%, and each OTU is generally regarded as one microbial species. Sequences with less than 97% similarity can be considered to belong to different species, and sequences with less than 93%-95% similarity can be considered to belong to different genera.
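Clustering at a 97% threshold and counting reads per OTU can be sketched as a greedy centroid procedure. The sketch assumes reads of equal length that are already aligned, so identity is simply the fraction of matching positions; real pipelines compute identity from pairwise alignments. The reads, the `mutate` helper, and the function names are invented for the illustration.

```python
def identity(seq_a, seq_b):
    """Fraction of matching positions between two equal-length sequences."""
    matches = sum(1 for x, y in zip(seq_a, seq_b) if x == y)
    return matches / len(seq_a)


def greedy_otu_clustering(reads, threshold=0.97):
    """Assign each read to the first centroid it matches, else open a new OTU."""
    centroids = []
    counts = []
    for read in reads:
        for index, centroid in enumerate(centroids):
            if identity(read, centroid) >= threshold:
                counts[index] += 1
                break
        else:
            # No existing OTU is similar enough: this read seeds a new one.
            centroids.append(read)
            counts.append(1)
    return centroids, counts


def mutate(seq, positions):
    """Hypothetical helper: change the base at each listed position."""
    chars = list(seq)
    for pos in positions:
        chars[pos] = "A" if chars[pos] != "A" else "C"
    return "".join(chars)


BASE = "ACGT" * 25                                # 100 bases
READS = [
    BASE,
    BASE,
    mutate(BASE, [3, 50]),                        # 98% identical to BASE
    mutate(BASE, [1, 20, 41, 62, 83]),            # 95% identical to BASE
]
CENTROIDS, COUNTS = greedy_otu_clustering(READS)
```

The resulting counts per OTU are the raw abundance values; the read at 98% identity joins the first OTU, while the read at 95% identity starts a second one.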
In one embodiment, after the abundance information of the microbial population of the sample is obtained, preprocessing including normalization is performed.
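The text does not specify which normalization is used. One common choice, shown here purely as an assumption, is to convert the raw OTU counts of each sample to relative abundances so that samples with different sequencing depths become comparable. The OTU names and counts are invented.

```python
def relative_abundance(counts):
    """Scale the OTU counts of one sample so that they sum to 1."""
    total = sum(counts.values())
    if total == 0:
        # An empty sample is returned as all zeros instead of dividing by zero.
        return {otu: 0.0 for otu in counts}
    return {otu: value / total for otu, value in counts.items()}


# Invented raw counts for one sample.
RAW_COUNTS = {"OTU_1": 30, "OTU_2": 10, "OTU_3": 60}
NORMALIZED = relative_abundance(RAW_COUNTS)
```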
103: fusing features of the evolutionary relationships among the populations of microorganisms with features of the abundance information of the populations of microorganisms to obtain a set of features;
In one embodiment, genetic information of the microbial population of the sample is obtained, a phylogenetic tree of the microbial population of the sample is constructed, a graph of the relationships among the microbial populations is obtained, and the features of the evolutionary relationships among the microbial populations are extracted using a graph neural network; abundance information of the microbial population of the sample is obtained, and the features of the abundance information are extracted; and the features of the evolutionary relationships and the features of the abundance information are fused using a graph neural network to obtain a feature set. Optionally, the graph neural network may be a graph convolutional network (GCN), a graph attention network (GAT), Graph LSTM, etc.
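One way to picture the fusion is a single graph convolution layer in which the relationship graph supplies the adjacency matrix and the abundance values serve as node features. The sketch below uses the widely known propagation rule H = ReLU(D^(-1/2) (A + I) D^(-1/2) X W) in plain Python; whether the embodiment uses exactly this rule is an assumption, and the adjacency matrix, abundances, and weights are invented (in practice the weights are learned).

```python
import math


def normalized_adjacency(adjacency):
    """Add self-loops and apply symmetric degree normalization."""
    n = len(adjacency)
    a_hat = [
        [adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    degree = [sum(row) for row in a_hat]
    return [
        [a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
        for i in range(n)
    ]


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_layer(adjacency, features, weights):
    """Mix each node's features with its neighbors', project, apply ReLU."""
    propagated = matmul(normalized_adjacency(adjacency), features)
    projected = matmul(propagated, weights)
    return [[max(0.0, value) for value in row] for row in projected]


# Invented example: taxa 0 and 1 are related, taxon 2 is isolated.
ADJACENCY = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
ABUNDANCE = [[0.2], [0.6], [0.2]]      # one abundance value per taxon
WEIGHTS = [[1.0, -1.0]]                # hypothetical fixed weights
FUSED = gcn_layer(ADJACENCY, ABUNDANCE, WEIGHTS)
```

In the output, the two related taxa end up with the same averaged abundance feature, while the isolated taxon keeps its own value, which is how the evolutionary structure shapes the abundance signal.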
104: and inputting the feature set into a classifier to obtain a classification result of the sample.
In one embodiment, the feature set is input into a pre-trained classifier to obtain a classification result of the sample.
In one embodiment, the classifier may be any one of a random forest model, a decision tree model, a logistic regression model, a Support Vector Machine (SVM) model, and a neural network model, which is not limited herein.
In one embodiment, the training of the classifier may be: obtaining different types of labeled samples (such as diseased samples and normal samples, samples in different disease stages, soil samples in different periods and the like), repeating the characteristic extraction process and the fusion process to obtain a characteristic set or a multi-dimensional characteristic set, inputting the characteristic set or the multi-dimensional characteristic set into a classifier to obtain a classification result of the samples, calculating the loss between the classification result and a real value by using a loss function, then performing back propagation, and updating parameters by using an optimizer to obtain the trained classifier.
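The training loop described above (forward pass, loss against the true labels, gradient computation, parameter update) can be sketched with the simplest of the listed classifiers, logistic regression. The feature vectors and labels are invented stand-ins for fused feature sets of labeled samples, the loss is binary cross-entropy, and plain full-batch gradient descent plays the role of the optimizer; none of these specific choices is stated in the text.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """Probability that feature vector x belongs to class 1."""
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)


def bce_loss(weights, bias, xs, ys):
    """Mean binary cross-entropy between predictions and true labels."""
    eps = 1e-12
    total = 0.0
    for x, y in zip(xs, ys):
        p = min(max(predict_proba(weights, bias, x), eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(xs)


def train(xs, ys, learning_rate=0.5, epochs=200):
    """Full-batch gradient descent on the cross-entropy loss."""
    weights = [0.0] * len(xs[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(xs, ys):
            error = predict_proba(weights, bias, x) - y   # dLoss/dz
            for k, v in enumerate(x):
                grad_w[k] += error * v / len(xs)
            grad_b += error / len(xs)
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias


# Invented two-dimensional feature sets for four labeled samples.
FEATURES = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1], [0.8, 0.2]]
LABELS = [0, 0, 1, 1]
INITIAL_LOSS = bce_loss([0.0, 0.0], 0.0, FEATURES, LABELS)
WEIGHTS, BIAS = train(FEATURES, LABELS)
FINAL_LOSS = bce_loss(WEIGHTS, BIAS, FEATURES, LABELS)
PREDICTIONS = [1 if predict_proba(WEIGHTS, BIAS, x) >= 0.5 else 0 for x in FEATURES]
```

For a neural network classifier the same loop applies, with the gradients obtained by back propagation through all layers.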
In one embodiment, the analysis method further comprises extracting the characteristics of the abundance information of the microbial population, fusing the characteristics of the characteristic set and the characteristics of the abundance information of the microbial population to obtain a fused multi-dimensional characteristic set, and inputting the multi-dimensional characteristic set into a classifier to obtain the classification result of the sample.
In one embodiment, the abundance information of the microbial population is convolved to obtain the features of the abundance information; the feature set is convolved to obtain the features of the feature set; and the two sets of convolved features are fused to obtain a fused multi-dimensional feature set.
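The two-branch arrangement can be sketched as two one-dimensional convolutions whose outputs are stacked as channels and then passed to a fully connected layer. The sketch treats both inputs as plain vectors over the same ordered list of taxa and uses an ordinary 1D convolution for both branches, which is a simplification; the kernels, dense weights, and input values are invented, and all names are hypothetical.

```python
def conv1d(signal, kernel):
    """Valid-mode sliding dot product of a kernel over a signal."""
    width = len(kernel)
    return [
        sum(signal[i + k] * kernel[k] for k in range(width))
        for i in range(len(signal) - width + 1)
    ]


def fuse_and_score(abundance, graph_features, kernel_a, kernel_g,
                   dense_weights, dense_bias):
    """Convolve both inputs, stack them as channels, apply one dense unit."""
    channel_a = conv1d(abundance, kernel_a)
    channel_g = conv1d(graph_features, kernel_g)
    # Channel dimension first: shape (2 channels, output length).
    stacked = [channel_a, channel_g]
    flat = [value for channel in stacked for value in channel]
    score = sum(w * v for w, v in zip(dense_weights, flat)) + dense_bias
    return stacked, score


# Invented inputs over four taxa; kernels share a width so the channels align.
ABUNDANCE = [1.0, 2.0, 3.0, 4.0]
GRAPH_FEATURES = [0.5, 0.5, 1.0, 1.0]
STACKED, SCORE = fuse_and_score(
    ABUNDANCE,
    GRAPH_FEATURES,
    kernel_a=[1.0, 0.0, -1.0],
    kernel_g=[1.0, 1.0, 1.0],
    dense_weights=[0.5, 0.5, 1.0, 1.0],
    dense_bias=0.0,
)
```

A real model would use several kernels per branch, a nonlinearity, and one dense output per class, with all weights learned during training.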
In one embodiment, the sample is a host for a microbial population or a growing environment for a microbial population; optionally, the host comprises a human, an animal or a plant, and the growing environment comprises soil, water; optionally, the sample is from the mouth, intestine, skin, stomach, esophagus, stool, urethra, blood, eye, nasopharynx, external auditory canal, vagina, or lung of the host.
In one example, the above method is applied to the analysis of Han and Tibetan populations, using metagenomic data of the intestinal microorganisms of 363 samples (102 from Tibetans living on the plateau, 92 from Han living on the plain, 81 from Han who had returned from the plateau to live on the plain, 67 from Han who had moved from the plain to the plateau 1 week earlier, and 21 from Han living on the plateau). For the DNA sequences of the microbial population obtained from each person, a phylogenetic tree is constructed by the maximum likelihood method, a graph of the relationships among the microorganisms is constructed using a graph-related software package (e.g., NetworkX), and the features of the evolutionary relationships among the microbial populations are extracted. Laplacian regularization is performed, and the result is integrated by GCN with the features of the abundance information of the microbial population to obtain a feature set that fuses the evolutionary relationships among the microbial populations with their abundance information. Graph convolution is performed on the feature set while, in parallel, convolution is performed on the features of the abundance information; the two convolution results are concatenated along the channel dimension and passed through a fully connected layer to predict the population type of each sample. The prediction accuracy of the population type is used as the metric for comparing different models in this case (the model of the present application, the AutoGenome model, and the SVM model). Under ten-fold cross-validation, the prediction accuracy of the model of the present application (Phylo-GDL) was 0.802, higher than the 0.769 of AutoGenome and the 0.647 of SVM; see Fig. 4.
In one embodiment, the method is used for diagnostic analysis of disease onset and progression: metagenomic data of the intestinal microorganisms of the samples (patients at different tumor stages) are obtained, a phylogenetic tree of the microbial populations of the samples is constructed, and the features of the evolutionary relationships among the microbial populations are extracted from the phylogenetic tree; abundance information of the microbial populations of the samples is obtained, and the features of the abundance information are extracted; the features of the evolutionary relationships are fused with the features of the abundance information to obtain a feature set; and the feature set is input into a classifier to obtain a classification result of the sample, the classification result being the tumor stage of the patient.
In one example, the above method is used to assist in selecting a disease treatment regimen: the sample may be from a patient who needs immunotherapy, and a classification result indicating whether immunotherapy is appropriate is given based on the genetic information and abundance information of the microbial population of the sample.
In one example, the above method is used for ecosystem monitoring: the samples may be soil containing microorganisms collected at different periods, and a classification result for the change in the ecological environment is given based on the genetic information and abundance information of the microbial population of the samples.
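The ten-fold cross-validation used to report accuracy can be sketched as follows. The text does not say how the folds were drawn, so the shuffling, the fixed seed, and the absence of stratification are assumptions, and the majority-class "model" in the demonstration is only a placeholder for a real classifier. The demonstration data are invented; only the sample count of 363 is taken from the example above.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def cross_val_accuracy(features, labels, fit, predict, k=10, seed=0):
    """Mean test accuracy over k folds for a given fit/predict pair."""
    accuracies = []
    for train, test in k_fold_indices(len(labels), k, seed):
        model = fit([features[i] for i in train], [labels[i] for i in train])
        correct = sum(1 for i in test if predict(model, features[i]) == labels[i])
        accuracies.append(correct / len(test))
    return sum(accuracies) / len(accuracies)


def fit_majority(train_features, train_labels):
    """Placeholder model: remember the most frequent training label."""
    return Counter(train_labels).most_common(1)[0][0]


def predict_majority(model, feature):
    return model


# Fold sizes for 363 samples, as in the example above.
FOLD_SIZES = [len(test) for _, test in k_fold_indices(363, 10)]

# Invented toy data: 30 samples of one class, 10 of another.
TOY_LABELS = ["x"] * 30 + ["y"] * 10
TOY_FEATURES = list(range(40))
TOY_ACCURACY = cross_val_accuracy(TOY_FEATURES, TOY_LABELS, fit_majority, predict_majority)
```

Each sample appears in exactly one test fold, so every prediction that contributes to the reported accuracy is made by a model that did not see that sample during training.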
Fig. 2 is an analysis apparatus for analyzing the evolutionary relationship and abundance information of microbial populations based on a sample, according to an embodiment of the present invention, the apparatus including: a memory and a processor;
the memory is to store program instructions;
the processor is configured to invoke program instructions that, when executed, perform the above-described method of analyzing the evolutionary relationship and abundance information of a population of microorganisms based on a sample.
Fig. 3 is a system for analyzing evolutionary relationship and abundance information of a microbial population based on a sample, according to an embodiment of the present invention, including:
a first acquisition unit 301 configured to acquire genetic information of microbial populations of a sample, construct a phylogenetic tree of the microbial populations of the sample, and extract characteristics of an evolutionary relationship between the microbial populations based on the phylogenetic tree;
a second acquisition unit 302 for acquiring abundance information of a microbial population of a sample, and extracting a feature of the abundance information of the microbial population;
a first fusion unit 303 configured to fuse the characteristics of the evolutionary relationship between the microbial populations and the characteristics of the abundance information of the microbial populations to obtain a characteristic set;
and the classification unit 304 is configured to input the feature set into a classifier, so as to obtain a classification result of the sample.
The embodiment of the invention provides an analysis system for the evolutionary relationship and abundance information of microbial populations based on samples, which comprises:
a first acquisition unit configured to acquire genetic information of microbial populations of a sample, construct a phylogenetic tree of the microbial populations of the sample, and extract characteristics of an evolutionary relationship among the microbial populations based on the phylogenetic tree;
a second acquisition unit configured to acquire abundance information of a microbial population of a sample and extract a feature of the abundance information of the microbial population;
a first fusion unit for fusing the characteristics of the evolutionary relationship among the microbial populations with the characteristics of the abundance information of the microbial populations to obtain a set of characteristics;
the second fusion unit is used for fusing the characteristic set with the characteristic of the abundance information of the microbial population in the second acquisition unit to obtain a fused multi-dimensional characteristic set;
and the classification unit is used for inputting the multi-dimensional feature set into a classifier to obtain a classification result of the sample.
A computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the above-described method for analyzing evolutionary relationship and abundance information of a population of microorganisms based on a sample.
The validation results of this validation example show that assigning an intrinsic weight to an indication can moderately improve the performance of the method relative to the default settings.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by associated hardware instructed by a program, which may be stored in a computer-readable storage medium, and the storage medium may include: Read-Only Memory (ROM), Random Access Memory (RAM), magnetic or optical disks, and the like.
It will be understood by those skilled in the art that all or part of the steps in the method for implementing the above embodiments may be implemented by hardware that is instructed to implement by a program, and the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
While the invention has been described in detail with reference to specific embodiments thereof, it will be apparent to one skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention.

Claims (21)

1. A method for analyzing evolutionary relationship and abundance information of a population of microorganisms based on a sample, comprising:
acquiring genetic information of microbial populations of a sample, constructing a phylogenetic tree of the microbial populations of the sample to obtain a graph of the relationships among the microbial populations, and extracting the features of the evolutionary relationships among the microbial populations by using a graph neural network;
acquiring abundance information of microbial populations of a sample, and extracting characteristics of the abundance information of the microbial populations;
fusing features of the evolutionary relationships among the populations of microorganisms with features of the abundance information of the populations of microorganisms to obtain a set of features;
and inputting the feature set into a classifier to obtain a classification result of the sample.
2. The method of claim 1, further comprising extracting the features of the abundance information of the microbial population, fusing the features of the abundance information of the microbial population with the features of the feature set to obtain a fused multi-dimensional feature set, and inputting the multi-dimensional feature set into a classifier to obtain the classification result of the sample.
3. The method of claim 1, wherein the method comprises convolving the abundance information of the population of microorganisms to obtain the feature of the abundance information of the population of microorganisms, convolving the feature set to obtain the feature of the feature set, and fusing the feature of the abundance information of the population of microorganisms obtained by the convolution and the feature of the feature set obtained by the convolution to obtain the fused multi-dimensional feature set.
4. The method of claim 1, wherein the genetic information of the population of microorganisms in the sample is obtained using a method comprising high-throughput sequencing.
5. The method of claim 4, wherein the high-throughput sequencing comprises two categories: sequencing based on the 16S rDNA, 18S rDNA, or ITS region; and metagenomic sequencing.
6. The method according to claim 1, wherein the genetic information of the microbial population is a DNA or protein sequence and/or a DNA or protein structure of the microbial population, and the DNA or protein sequence and/or DNA or protein structure of the microbial population is used to construct a phylogenetic tree of the microbial population.
7. The method of claim 1, wherein the phylogenetic tree of the microbial populations of the sample is constructed by methods including the distance method, the maximum parsimony method, the maximum likelihood method, and the Bayesian method.
8. The method of claim 1, wherein the phylogenetic tree is used to obtain a relationship matrix between microbial populations and the evolutionary relationships between the microbial populations are extracted.
9. The method of claim 1, wherein the abundance information of the microbial population of the sample is obtained by OTU-based analysis.
10. The method of claim 1, wherein after the abundance information of the microbial population of the sample is obtained, preprocessing is performed, the preprocessing comprising normalization.
11. The method of claim 1, wherein the sample is a host of a population of microorganisms or a growing environment of a population of microorganisms.
12. The method of claim 11, wherein the host comprises a human, animal or plant and the growing environment comprises soil, water.
13. The method of claim 1, wherein the sample is from the oral cavity, intestinal tract, skin, stomach, esophagus, stool, urethra, blood, eye, nasopharyngeal cavity, external auditory canal, vagina, or lung of the host.
14. An analysis apparatus based on evolutionary relationship and abundance information of a population of microorganisms of a sample, the apparatus comprising: a memory and a processor;
the memory is to store program instructions;
the processor is configured to invoke program instructions that, when executed, perform the method of analyzing the evolutionary relationship and abundance information of a population of sample-based microorganisms of any one of claims 1-13.
15. An analysis system based on evolutionary relationship and abundance information of a population of microorganisms of a sample, comprising:
a first acquisition unit configured to acquire genetic information of microbial populations of a sample, construct a phylogenetic tree of the microbial populations of the sample, obtain a graph of the relationships among the microbial populations, and extract features of the evolutionary relationships among the microbial populations using a graph neural network;
a second acquisition unit configured to acquire abundance information of a microbial population of a sample and extract a feature of the abundance information of the microbial population;
a first fusion unit for fusing the characteristics of the evolutionary relationship among the microbial populations with the characteristics of the abundance information of the microbial populations to obtain a set of characteristics;
and the classification unit is used for inputting the feature set into a classifier to obtain a classification result of the sample.
16. An analysis system based on evolutionary relationship and abundance information of a population of microorganisms of a sample, comprising:
a first acquisition unit configured to acquire genetic information of microbial populations of a sample, construct a phylogenetic tree of the microbial populations of the sample, obtain a graph of the relationships among the microbial populations, and extract features of the evolutionary relationships among the microbial populations using a graph neural network;
a second acquisition unit configured to acquire abundance information of a microbial population of a sample and extract a feature of the abundance information of the microbial population;
a first fusion unit for fusing the characteristics of the evolutionary relationship among the microbial populations with the characteristics of the abundance information of the microbial populations to obtain a set of characteristics;
the second fusion unit is used for fusing the characteristic set with the characteristic of the abundance information of the microbial population in the second acquisition unit to obtain a fused multi-dimensional characteristic set;
and the classification unit is used for inputting the multi-dimensional feature set into a classifier to obtain a classification result of the sample.
17. A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method of analyzing the evolutionary relationship and abundance information of a population of sample-based microorganisms of any one of claims 1-13 above.
18. Use of the device of claim 14 for ecosystem monitoring.
19. Use of the apparatus of claim 14 in sample classification or predicting properties of a sample.
20. Use of the system of any of claims 15-16 for sample classification or predicting properties of a sample.
21. Use of the system of any one of claims 15-16 for ecosystem monitoring.
CN202111430273.6A 2021-11-29 2021-11-29 Method and equipment for analyzing evolutionary relationship and abundance information of microbial population based on sample Active CN114093411B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111430273.6A CN114093411B (en) 2021-11-29 2021-11-29 Method and equipment for analyzing evolutionary relationship and abundance information of microbial population based on sample

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111430273.6A CN114093411B (en) 2021-11-29 2021-11-29 Method and equipment for analyzing evolutionary relationship and abundance information of microbial population based on sample

Publications (2)

Publication Number Publication Date
CN114093411A CN114093411A (en) 2022-02-25
CN114093411B true CN114093411B (en) 2022-08-09

Family

ID=80305359

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111430273.6A Active CN114093411B (en) 2021-11-29 2021-11-29 Method and equipment for analyzing evolutionary relationship and abundance information of microbial population based on sample

Country Status (1)

Country Link
CN (1) CN114093411B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114736970B (en) * 2022-03-09 2023-06-30 中国人民解放军总医院 Method for identifying different crowds
CN116936074A (en) * 2022-07-22 2023-10-24 浙江省肿瘤医院 Application of tumor prediction system based on tongue fur microorganism
CN115881229B (en) * 2022-12-16 2024-01-09 迪辅乐生物(上海)有限公司 Allergy prediction model construction method based on intestinal microbial information

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109593865A (en) * 2018-10-25 2019-04-09 华中科技大学鄂州工业技术研究院 The analysis of marine coral Bacterial community, gene excavating method and equipment
CN111710364A (en) * 2020-05-08 2020-09-25 中国科学院深圳先进技术研究院 Method, device, terminal and storage medium for acquiring flora marker

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7627437B2 (en) * 2005-01-14 2009-12-01 Idaho Research Foundation Categorization of microbial communities
WO2015103165A1 (en) * 2013-12-31 2015-07-09 Biota Technology, Inc. Microbiome based systems, apparatus and methods for monitoring and controlling industrial processes and systems
JP6514549B2 (en) * 2015-04-03 2019-05-15 住友化学株式会社 Microbiota analysis system, determination system, microbiota analysis method and determination method
CN108804875B (en) * 2018-06-21 2020-11-17 中国科学院北京基因组研究所 Method for analyzing microbial population function by using metagenome data
CN110444254B (en) * 2019-07-08 2021-10-19 深圳先进技术研究院 Detection method, detection system and terminal for flora marker
CN111933218B (en) * 2020-07-01 2022-03-29 广州基迪奥生物科技有限公司 Optimized metagenome binding method for analyzing microbial community

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
艾比湖湿地盐节木根际土壤氨氧化微生物多样性和丰度及其与环境因子的相关性分析;何园等;《环境科学学报》;20161207;第37卷(第05期);第1967-1975页 *

Also Published As

Publication number Publication date
CN114093411A (en) 2022-02-25

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant