CN116312796B - Metagenome abundance estimation method and system based on expectation maximization algorithm - Google Patents

Metagenome abundance estimation method and system based on expectation maximization algorithm

Info

Publication number
CN116312796B
Authority
CN
China
Prior art keywords
species
metagenome
abundance
reference genome
sequencing data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310103910.1A
Other languages
Chinese (zh)
Other versions
CN116312796A (en)
Inventor
马昀然
刘俊锋
郭昊
李诗濛
任用
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xiansheng Medical Examination Laboratory Co ltd
Jiangsu Xiansheng Diagnostic Technology Co ltd
Jiangsu Xiansheng Medical Diagnosis Co ltd
Original Assignee
Beijing Xiansheng Medical Examination Laboratory Co ltd
Jiangsu Xiansheng Diagnostic Technology Co ltd
Jiangsu Xiansheng Medical Diagnosis Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xiansheng Medical Examination Laboratory Co ltd, Jiangsu Xiansheng Diagnostic Technology Co ltd, Jiangsu Xiansheng Medical Diagnosis Co ltd
Publication of CN116312796A
Application granted
Publication of CN116312796B
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B30/10 Sequence alignment; Homology search
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Abstract

The application belongs to the technical field of bioinformatics analysis, and particularly relates to a metagenome abundance estimation method based on the expectation maximization (EM) algorithm. Based on the alignment information of metagenomic sequencing data, the method introduces the proportion of unique alignment positions on each reference genome to quantify the influence of similarity between species reference genomes, and constructs a statistical model that characterizes the probability of the observed unique and multiple alignments; the EM algorithm is then used to solve the constructed statistical model and estimate species abundance in the metagenome. Because the application quantifies the influence of similarity between species reference genomes, it can improve the accuracy of metagenome abundance estimation at the species level and improve the sensitivity and specificity of metagenomic species identification.

Description

Metagenome abundance estimation method and system based on expectation maximization algorithm
Technical Field
The application belongs to the technical field of bioinformatics, and particularly relates to a metagenome abundance estimation method and system based on an expectation maximization algorithm.
Background Art
Metagenomic next-generation sequencing (mNGS) technology has broad application prospects in clinical microbiology, and a variety of computational methods based on metagenomic sequencing data have been developed for rapid identification of pathogens in clinical samples. Among them, the metagenomic classification algorithm Centrifuge achieves rapid classification of metagenomic sequencing data based on the BWT (Burrows-Wheeler transform) and the FM (Ferragina-Manzini) index, and uses a smaller index space. Centrifuge also uses the EM (Expectation-Maximization) algorithm to estimate species abundance in metagenomic sequencing data. In addition, Centrifuge has a wide range of application scenarios and can analyze both short-read and long-read sequences. Although Centrifuge performs well in species identification, it is less effective at identifying low-abundance species. Compared with mapping-based methods, Centrifuge's read-matching precision is lower, which affects the abundance estimates and leads to a higher false positive rate.
Based on the alignment results of metagenomic sequencing data, metagenomic classifiers mainly adopt two strategies to assign sequencing reads that have multiple alignment results. The first strategy assigns reads directly according to the number of reads uniquely aligned to each species: when read i is multiply aligned to species A and species B, and species A has more uniquely aligned reads than species B, read i is assigned to species A with a greater probability, or is assigned directly to species A. The second strategy constructs a probability model to characterize the alignment results of the metagenomic sequencing data and estimates the sequence abundance or species abundance of each species by solving that model; its representative algorithms are Centrifuge and Bracken. Compared with the first strategy, the second strategy can give a probabilistic interpretation of the assignment results. However, Centrifuge and Bracken, the representative algorithms of the second strategy, both have shortcomings in how they construct the probability model. The shortcomings of the probability model constructed by the Centrifuge algorithm are: (1) unique and multiple alignments of sequencing data are not treated differently for different species; (2) the effect of genome similarity on uniquely aligned sequencing data is not characterized. The shortcomings of the probability model constructed by the Bracken algorithm are: (1) it only redistributes the alignment results of Kraken; (2) the model is not solved entirely on the basis of the observed samples; (3) the effect of genome similarity on uniquely aligned sequencing data is not directly characterized.
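The first strategy can be illustrated with a minimal Python sketch. It assumes the proportional variant, in which a multiply aligned read is split among its candidate species in proportion to their unique read counts; the function name, the input layout and the even-split fallback are illustrative assumptions and are not taken from any particular classifier.

```python
from collections import Counter

def assign_by_unique_counts(unique_hits, multi_hits):
    """Distribute multiply aligned reads in proportion to unique read counts.

    unique_hits: list of species names, one entry per uniquely aligned read.
    multi_hits:  list of tuples of candidate species, one per multi-aligned read.
    Returns a dict of (possibly fractional) read counts per species.
    """
    unique = Counter(unique_hits)
    totals = {sp: float(n) for sp, n in unique.items()}
    for candidates in multi_hits:
        weights = [unique.get(sp, 0) for sp in candidates]
        total = sum(weights)
        for sp, w in zip(candidates, weights):
            # Assumed fallback: split evenly when no candidate has unique reads.
            share = w / total if total else 1.0 / len(candidates)
            totals[sp] = totals.get(sp, 0.0) + share
    return totals

# Species A has 3 unique reads, species B has 1; one read hits both.
counts = assign_by_unique_counts(["A", "A", "A", "B"], [("A", "B")])
```

In this toy input the shared read contributes three quarters of a read to species A and one quarter to species B, which is the behaviour the text describes as assigning the read to species A "with a greater probability".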
In view of this, the present application has been proposed.
Disclosure of Invention
To solve the above technical problems, the application, based on the alignment information of metagenomic sequencing data, introduces the proportion of unique alignment positions on each reference genome to quantify the influence of similarity between species reference genomes, and constructs a statistical model that characterizes the probability of the observed unique and multiple alignments. The proportion of unique alignment positions on each reference genome is calculated from the alignment information of the metagenomic sequencing data; the constructed statistical model is solved with an Expectation-Maximization (EM) algorithm to estimate species abundance in the metagenome, and the reference genome length is introduced in the expectation step (E-step).
Therefore, the core objective of the application is to provide a metagenome abundance estimation method and system.
To achieve the above objective, the application proposes the following technical solution:
The application first provides a metagenome abundance estimation method, which comprises the following steps:
1) Obtaining the proportion of unique alignment positions on each reference genome;
2) Obtaining the probability of occurrence of unique and multiple alignments of the sequencing data;
3) Estimating metagenomic species abundance using an expectation maximization algorithm.
Further, step 1) is carried out as follows: based on the alignment information of the metagenomic sequencing data, the number of uniquely aligned sequences and the number of multiply aligned sequences on the reference genome are counted, and the ratio of the number of sequences uniquely aligned to the reference genome to the number of all sequences aligned to the reference genome is calculated.
Further, step 2) is carried out as follows: based on the proportion of unique alignment positions on each reference genome, the influence of similarity between species reference genomes is quantified, and a statistical model is constructed to characterize the probability of occurrence of the observed unique and multiple alignments.
Further, the statistical model is:
wherein,
R is the number of metagenomic sequencing reads,
S is the number of species in the metagenome,
α_j and α_k are the abundances of species j and k, respectively, which are the parameters to be estimated,
l_j and l_k are the average genome lengths of species j and k, respectively,
C_ij is the probability that sequencing read i aligns to species j: when read i aligns only to species j, the probability equals P_j, where P_j is the proportion of unique alignment positions on reference genome j; when read i is multiply aligned to species j, the probability equals 1 - P_j; when read i is not aligned to species j, the probability equals 0.
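For orientation, a likelihood consistent with the quantities defined here can be written out in LaTeX. This is a sketch under two assumptions: that the abundance of species j is written α_j, and that the model keeps the mixture form used by Centrifuge, on which the method builds, with C_ij redefined through P_j. It is not a transcription of the patent's own formula.

```latex
% Assumed form: product over the R reads of a mixture over the S species,
% each species weighted by abundance times genome length.
L(\alpha) \;=\; \prod_{i=1}^{R} \sum_{j=1}^{S}
  \frac{\alpha_j \, l_j}{\sum_{k=1}^{S} \alpha_k \, l_k} \; C_{ij},
\qquad
C_{ij} =
\begin{cases}
P_j     & \text{read } i \text{ aligns only to species } j,\\
1 - P_j & \text{read } i \text{ aligns to species } j \text{ and to others},\\
0       & \text{read } i \text{ does not align to species } j.
\end{cases}
```

Under this form, the factor α_j l_j / Σ_k α_k l_k is the expected fraction of reads that originate from species j, which is why genome length appears alongside abundance.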
Further, step 3) specifically includes:
solving the statistical model constructed in step 2) with an expectation maximization algorithm, and estimating species abundance in the metagenome.
Further, the solving specifically includes: introducing the reference genome length in the expectation step to calculate the number n_j of sequencing reads from species j, the formula being:
then, based on n_j, updating the abundance of species j, the formula being:
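The two update formulas can be sketched in LaTeX as follows. This assumes the symbol α_j for the abundance of species j and the standard EM derivation for a length-weighted mixture over reads i = 1..R and species j = 1..S; the primed symbol for the updated abundance is an illustrative notation, and the patent's own formulas may differ in detail.

```latex
% E-step (assumed form): expected number of reads from species j,
% with the reference genome length l_j inside the responsibility.
n_j \;=\; \sum_{i=1}^{R}
  \frac{\alpha_j \, l_j \, C_{ij}}{\sum_{k=1}^{S} \alpha_k \, l_k \, C_{ik}}

% M-step (assumed form): length-normalised counts rescaled to sum to one.
\alpha_j' \;=\; \frac{n_j / l_j}{\sum_{k=1}^{S} n_k / l_k}
```

The M-step follows from equating the expected read fraction n_j / R with α_j l_j / Σ_k α_k l_k and solving for α_j.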
Further, the metagenome in the above method is an infectious metagenome.
The application also provides a metagenome abundance estimation system, which comprises the following components:
Component 1): a component for calculating the proportion of unique alignment positions on each reference genome;
Component 2): a component for computing the probability of occurrence of unique and multiple alignments of the sequencing data;
Component 3): a component for estimating metagenomic species abundance.
Further, component 1) operates as follows: based on the alignment information of the metagenomic sequencing data, the number of uniquely aligned sequences and the number of multiply aligned sequences on the reference genome are counted, and the ratio of the number of sequences uniquely aligned to the reference genome to the number of all sequences aligned to the reference genome is calculated.
Further, component 2) operates as follows: based on the proportion of unique alignment positions on each reference genome, the influence of similarity between species reference genomes is quantified, and a statistical model is constructed to characterize the probability of occurrence of the observed unique and multiple alignments.
Further, the statistical model is:
wherein,
R is the number of metagenomic sequencing reads,
S is the number of species in the metagenome,
α_j and α_k are the abundances of species j and k, respectively, which are the parameters to be estimated,
l_j and l_k are the average genome lengths of species j and k, respectively,
C_ij is the probability that sequencing read i aligns to species j: when read i aligns only to species j, the probability equals P_j, where P_j is the proportion of unique alignment positions on reference genome j; when read i is multiply aligned to species j, the probability equals 1 - P_j; when read i is not aligned to species j, the probability equals 0.
Further, component 3) specifically includes:
solving the statistical model constructed by component 2) with an expectation maximization algorithm, and estimating species abundance in the metagenome.
Further, the solving specifically includes: introducing the reference genome length in the expectation step to calculate the number n_j of sequencing reads from species j, the formula being:
then, based on n_j, updating the abundance of species j, the formula being:
Further, the metagenome in the above system is an infectious metagenome.
The present application also provides an electronic device including: a processor and a memory; the processor is connected to a memory, wherein the memory is configured to store a computer program, and the processor is configured to invoke the computer program to perform the method according to any of the preceding claims.
The present application also provides a computer storage medium storing a computer program comprising program instructions which, when executed by a processor, perform a method as claimed in any one of the preceding claims.
The beneficial technical effects of the application are as follows:
The application quantifies the influence of similarity between species reference genomes on the alignment information, and treats uniquely aligned and multiply aligned sequencing data differently for different species, thereby improving the accuracy of metagenome abundance estimation at the species level and providing the necessary technical support for improving the sensitivity and specificity of metagenomic species identification.
Drawings
FIG. 1, flow chart of the metagenome abundance estimation based on the expectation maximization algorithm of the present application;
FIG. 2, correlation analysis of abundance estimates and true values for different methods;
FIG. 3, correlation analysis of abundance estimates and true values for different methods (5% outlier removal).
Detailed Description
Embodiments of the present application will be described in detail below with reference to examples, but those skilled in the art will understand that the following examples only illustrate the present application and should not be construed as limiting its scope. Where specific conditions are not noted in the examples, conventional conditions or conditions recommended by the manufacturer were followed. Reagents or instruments for which no manufacturer is indicated are conventional, commercially available products.
Definitions of some terms: unless otherwise defined below, all technical and scientific terms used in the detailed description of the application are intended to have the same meaning as commonly understood by one of ordinary skill in the art. While the following terms are believed to be well understood by those skilled in the art, the following definitions are set forth to better explain the present application.
The term "about" in the present application means a range of accuracy that one skilled in the art can understand while still guaranteeing the technical effect of the features in question. The term generally means a deviation of + -10%, preferably + -5%, from the indicated value.
As used herein, the terms "comprising," "including," "having," "containing," or "involving" are inclusive or open-ended and do not exclude additional unrecited elements or method steps. The term "consisting of …" is considered to be a preferred embodiment of the term "comprising". If a certain group is defined below to contain at least a certain number of embodiments, this should also be understood to disclose a group that preferably consists of only these embodiments.
Furthermore, the terms first, second, third, (a), (b), (c), and the like in the description and in the claims, are used for distinguishing between similar elements and not necessarily for describing a sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances and that the embodiments of the application described herein are capable of operation in other sequences than described or illustrated herein.
The application is illustrated below in connection with specific embodiments.
Example 1 Establishment of the estimation method
As shown in FIG. 1, the metagenome abundance estimation method based on the expectation maximization algorithm provided by the application comprises the following steps:
S1, alignment of sequencing data: select a suitable alignment tool and alignment database, align the metagenomic sequencing data, and output the alignment information of each sequencing read. The alignment information must at least contain the identifiers of the reference genomes that the read aligned to, and must distinguish unique alignments from multiple alignments. For multiply aligned reads, the identifiers of all reference genomes aligned to should be included.
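The minimum alignment information described in S1 can be pictured with a short Python sketch. The record layout, one (read identifier, reference genome identifier) pair per reported alignment, and the function names are assumptions made for illustration; real aligner output would first have to be parsed into this shape.

```python
from collections import defaultdict

def group_alignments(records):
    """Collect, for every read, the set of reference genomes it aligned to.

    records: iterable of (read_id, genome_id) pairs, one per reported alignment
             (a hypothetical, already-parsed layout).
    """
    hits = defaultdict(set)
    for read_id, genome_id in records:
        hits[read_id].add(genome_id)
    return dict(hits)

def is_unique(genomes):
    """A read is uniquely aligned when it hits exactly one reference genome."""
    return len(genomes) == 1

records = [("r1", "gA"), ("r2", "gA"), ("r2", "gB"), ("r3", "gB")]
hits = group_alignments(records)
unique_reads = sorted(r for r, g in hits.items() if is_unique(g))
```

Here reads r1 and r3 are uniquely aligned, while r2 is multiply aligned and keeps the identifiers of both reference genomes, as S1 requires.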
S2, calculating the proportion of unique alignment positions on each reference genome: according to the alignment information of the metagenomic sequencing data, the number of uniquely aligned sequences and the number of multiply aligned sequences on reference genome j (j = 1, 2, ..., n, where n is the number of reference genomes) are counted, and the proportion of unique alignment positions on reference genome j is then calculated, namely the ratio of the number of sequences uniquely aligned to reference genome j to the number of all sequences aligned to reference genome j.
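Step S2 amounts to a per-genome ratio of two read counts, which the following Python sketch computes. The input layout (a mapping from read identifier to the set of reference genomes it aligned to) and the function name are illustrative assumptions.

```python
def unique_position_ratio(hits):
    """Return P_j for every reference genome j seen in the alignments.

    hits: dict mapping read_id -> set of genome ids the read aligned to.
    P_j = (reads aligned only to j) / (all reads aligned to j).
    """
    unique = {}
    total = {}
    for genomes in hits.values():
        for g in genomes:
            total[g] = total.get(g, 0) + 1
            if len(genomes) == 1:
                unique[g] = unique.get(g, 0) + 1
    return {g: unique.get(g, 0) / total[g] for g in total}

hits = {"r1": {"gA"}, "r2": {"gA", "gB"}, "r3": {"gB"}, "r4": {"gA"}}
ratios = unique_position_ratio(hits)
```

In this toy input genome gA receives three reads, two of them unique, and genome gB receives two reads, one of them unique, so a genome that shares more reads with its neighbours ends up with a smaller P_j.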
S3, calculating the probabilities of unique and multiple alignments of the sequencing data:
Based on the proportion of unique alignment positions on each reference genome, the influence of similarity between species reference genomes is quantified, and a statistical model is constructed to characterize the probability of occurrence of the observed unique and multiple alignments.
The statistical model is as follows:
R is the number of metagenomic sequencing reads,
S is the number of species in the metagenome,
α_j and α_k are the abundances of species j and k, respectively, which are the parameters to be estimated,
l_j and l_k are the average genome lengths of species j and k, respectively,
C_ij is the probability that sequencing read i aligns to species j: if read i aligns only to reference genome j, the probability is P_j; if it is multiply aligned to reference genome j, the probability is 1 - P_j. P_j is the proportion of unique alignment positions on reference genome j (see step S2 for its calculation).
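The three cases for C_ij translate directly into a small Python function. The function name and the argument layout are illustrative assumptions; the case logic follows the definition given in S3, with probability 0 when the read does not align to the genome.

```python
def alignment_probability(genomes_of_read, genome, ratios):
    """Return C_ij for one read and one reference genome.

    genomes_of_read: set of genome ids the read aligned to.
    genome:          the genome id j being scored.
    ratios:          dict of P_j values from step S2.
    """
    if genome not in genomes_of_read:
        return 0.0                      # read i is not aligned to j
    if len(genomes_of_read) == 1:
        return ratios[genome]           # unique alignment: P_j
    return 1.0 - ratios[genome]         # multiple alignment: 1 - P_j

ratios = {"gA": 0.75, "gB": 0.5}
c_unique = alignment_probability({"gA"}, "gA", ratios)
c_multi = alignment_probability({"gA", "gB"}, "gA", ratios)
c_none = alignment_probability({"gB"}, "gA", ratios)
```

Evaluating this function for every read and every genome yields the R × S matrix of C_ij values that the EM step consumes.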
S4, estimating metagenomic species abundance with the EM algorithm: based on the alignment information of the metagenomic sequencing data, species abundance is estimated as follows:
1) Initialization step (I-step): the initial value of the abundance of species j is:
S: the number of species in the metagenome;
2) Expectation step (E-step):
n_j: the number of sequences from species j;
C_ij: the probability that sequence i is from species j; if sequence i is not aligned to species j, the probability is 0; if it is uniquely or multiply aligned to species j, the probability is as defined in step S3;
3) Maximization step (M-step): update the abundance of species j:
updated abundance of species j.
The EM algorithm stops iterating when the difference between the species abundance estimates before and after the update satisfies the following condition:
After the species abundance has been estimated, the estimated abundance of species j can be converted into the sequence abundance of that species by the following formula:
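The I-step, E-step, M-step, stopping rule and conversion to sequence abundance can be put together in a minimal Python sketch. Several details are assumptions rather than values taken from the text: the uniform initial abundance 1/S, the E-step and M-step forms (the usual EM updates for a length-weighted mixture), the summed absolute difference with threshold `tol` as the stopping rule, and the length-weighted normalisation used for sequence abundance. All function and parameter names are hypothetical.

```python
def em_abundance(C, lengths, tol=1e-10, max_iter=1000):
    """Estimate species abundance from alignment probabilities.

    C:       list of rows, C[i][j] = probability that read i aligns to species j.
    lengths: list of reference genome lengths l_j.
    tol, max_iter: assumed stopping parameters (not from the source text).
    """
    S = len(lengths)
    alpha = [1.0 / S] * S                      # assumed uniform initial value
    for _ in range(max_iter):
        # E-step: expected read count per species, weighted by genome length.
        n = [0.0] * S
        for row in C:
            w = [alpha[j] * lengths[j] * row[j] for j in range(S)]
            total = sum(w)
            if total == 0.0:
                continue                       # read carries no information
            for j in range(S):
                n[j] += w[j] / total
        # M-step: length-normalised counts rescaled to sum to one.
        per_len = [n[j] / lengths[j] for j in range(S)]
        norm = sum(per_len)
        if norm == 0.0:
            return alpha                       # no usable reads at all
        new_alpha = [x / norm for x in per_len]
        delta = sum(abs(a - b) for a, b in zip(alpha, new_alpha))
        alpha = new_alpha
        if delta < tol:
            break
    return alpha

def sequence_abundance(alpha, lengths):
    """Convert species abundance to sequence abundance (assumed length weighting)."""
    weighted = [a * l for a, l in zip(alpha, lengths)]
    total = sum(weighted)
    return [w / total for w in weighted]

# Two species; species 0 has a genome twice as long and twice as many reads.
C = [[0.8, 0.0], [0.8, 0.0], [0.0, 0.6]]
lengths = [2.0, 1.0]
alpha = em_abundance(C, lengths)
seq = sequence_abundance(alpha, lengths)
```

In this toy input every read is uniquely aligned, so the two species come out equally abundant once read counts are divided by genome length, while the sequence abundance of the longer genome is twice that of the shorter one.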
Example 2 Evaluation of effect
This example compares the method of the present application with a conventional method in terms of metagenome abundance estimation.
1) Generating simulated metagenomic data: based on the reference genomes of 4078 bacteria and 200 archaea, the simulator Mason was used to generate 20 million sequences of 100 bp in length, from which 10000 sequences were randomly extracted for abundance estimation.
2) Analyzing the simulated metagenomic data: abundance was estimated on the above simulated data using the widely used Centrifuge and the metagenome abundance estimation method of the present application.
3) Correlation analysis of the abundance estimates of the two methods against the true abundance values (FIG. 2) shows that the Pearson correlation coefficient of Centrifuge is only 0.26, whereas that of the metagenome abundance estimation method of the present application is 0.6. After removing 5% of the estimates as outliers for both methods, the Pearson correlation coefficient of Centrifuge increased only to 0.46, whereas that of the metagenome abundance estimation method of the present application increased to 0.95 (FIG. 3).
As can be seen from Example 2, compared with the widely used Centrifuge, the metagenome abundance estimation method provided by the application markedly improves the accuracy of metagenome abundance estimation, which helps improve the sensitivity and specificity of metagenomic species identification.
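The comparison metric can be sketched in Python as a plain Pearson correlation between true and estimated abundances. The text does not say how the 5% of outliers were chosen, so the trimming rule below (dropping the points with the largest absolute estimation error) is purely a hypothetical stand-in, as are the function names and the toy data.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def trim_outliers(truth, estimate, fraction=0.05):
    """Hypothetical rule: drop the `fraction` of points with largest |estimate - truth|."""
    order = sorted(range(len(truth)), key=lambda i: abs(estimate[i] - truth[i]))
    keep = sorted(order[: len(order) - int(len(order) * fraction)])
    return [truth[i] for i in keep], [estimate[i] for i in keep]

# Toy data: 20 species, one badly over-estimated.
truth = [float(i) for i in range(20)]
estimate = list(truth)
estimate[7] = 100.0
r_all = pearson(truth, estimate)
t_trim, e_trim = trim_outliers(truth, estimate)
r_trim = pearson(t_trim, e_trim)
```

A single badly estimated species is enough to pull the coefficient down sharply in this toy case, which illustrates why a correlation is reported both with and without outliers.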
Finally, it should be noted that the above embodiments are merely illustrative of the technical solution of the present application, and not limiting thereof; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the application.

Claims (5)

1. A method for estimating metagenome abundance, comprising the following steps:
1) Obtaining the proportion of unique alignment positions on each reference genome;
2) Obtaining the probability of occurrence of unique and multiple alignments of the sequencing data;
3) Estimating metagenomic species abundance using an expectation maximization algorithm;
wherein step 2) is carried out as follows: based on the proportion of unique alignment positions on each reference genome, quantifying the influence of similarity between species reference genomes, and constructing a statistical model to characterize the probability of occurrence of the observed unique and multiple alignments;
the statistical model is as follows:
wherein,
R is the number of metagenomic sequencing reads,
S is the number of species in the metagenome,
α_j and α_k are the abundances of species j and k, respectively, which are the parameters to be estimated,
l_j and l_k are the average genome lengths of species j and k, respectively,
C_ij is the probability that sequencing read i aligns to species j: when read i aligns only to species j, the probability equals P_j, where P_j is the proportion of unique alignment positions on reference genome j; when read i is multiply aligned to species j, the probability equals 1 - P_j; when read i is not aligned to species j, the probability equals 0;
the estimation of step 3) is carried out as follows:
solving the statistical model constructed in step 2) with an expectation maximization algorithm, and estimating species abundance in the metagenome;
the solving is as follows: introducing the reference genome length in the expectation step to calculate the number n_j of sequencing reads from species j, the formula being:
then, based on n_j, updating the abundance of species j, the formula being:
2. The estimation method according to claim 1, wherein
step 1) is carried out as follows: based on the alignment information of the metagenomic sequencing data, the ratio of the number of sequences uniquely aligned to the reference genome to the number of all sequences aligned to the reference genome is calculated.
3. The estimation method according to any one of claims 1-2, wherein the metagenome is an infectious metagenome.
4. An electronic device, comprising: a processor and a memory; the processor is connected to the memory, wherein the memory is adapted to store a computer program, and the processor is adapted to invoke the computer program to perform the method of any one of claims 1-3.
5. A computer storage medium, characterized in that the computer storage medium stores a computer program comprising program instructions which, when executed by a processor, perform the method of any one of claims 1-3.
CN202310103910.1A 2022-12-27 2023-02-07 Metagenome abundance estimation method and system based on expectation maximization algorithm Active CN116312796B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211690228 2022-12-27
CN2022116902289 2022-12-27

Publications (2)

Publication Number Publication Date
CN116312796A (en) 2023-06-23
CN116312796B (en) 2023-11-14

Family

ID=86800458

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310103910.1A Active CN116312796B (en) 2022-12-27 2023-02-07 Metagenome abundance estimation method and system based on expectation maximization algorithm

Country Status (1)

Country Link
CN (1) CN116312796B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104911194A (en) * 2015-06-04 2015-09-16 山东农业大学 Wheat male sterility genes WMS and application of anther specific promoter thereof
CN105408909A (en) * 2013-07-09 2016-03-16 莱克斯奥根有限公司 Transcript determination method
CN113186311A (en) * 2021-04-27 2021-07-30 中国医学科学院北京协和医院 Application of vaginal microorganism in differential diagnosis of chronic pelvic pain syndrome
CN113337590A (en) * 2021-06-03 2021-09-03 深圳华大基因股份有限公司 Second-generation sequencing method and library construction method
CN113337589A (en) * 2021-05-24 2021-09-03 华南理工大学 Method for screening genes related to synthesis of target compound and application
CN114402084A (en) * 2019-06-27 2022-04-26 赛福医药公司 Developing classifiers for stratifying patients
CN115331737A (en) * 2022-08-11 2022-11-11 黄琨 Method for analyzing pathogenic bacteria in intestinal flora and quantifying regional characteristics of flora

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105408909A (en) * 2013-07-09 2016-03-16 莱克斯奥根有限公司 Transcript determination method
CN104911194A (en) * 2015-06-04 2015-09-16 山东农业大学 Wheat male sterility genes WMS and application of anther specific promoter thereof
CN114402084A (en) * 2019-06-27 2022-04-26 赛福医药公司 Developing classifiers for stratifying patients
CN113186311A (en) * 2021-04-27 2021-07-30 中国医学科学院北京协和医院 Application of vaginal microorganism in differential diagnosis of chronic pelvic pain syndrome
CN113337589A (en) * 2021-05-24 2021-09-03 华南理工大学 Method for screening genes related to synthesis of target compound and application
CN113337590A (en) * 2021-06-03 2021-09-03 深圳华大基因股份有限公司 Second-generation sequencing method and library construction method
CN115331737A (en) * 2022-08-11 2022-11-11 黄琨 Method for analyzing pathogenic bacteria in intestinal flora and quantifying regional characteristics of flora

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome;Bo Li et al.;《BMC Bioinformatics》;第1-16页 *
应用基因表达系列分析( SAGE) 技术研究高温处理前后家蚕的基因表达差异;鲍忠赞等;《蚕业科学》;第456-467页 *

Also Published As

Publication number Publication date
CN116312796A (en) 2023-06-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant