CN117524312A - Analysis method and device for pathogen metagenome sequencing data and application thereof - Google Patents


Info

Publication number
CN117524312A
CN117524312A CN202311484607.7A CN202311484607A CN117524312A CN 117524312 A CN117524312 A CN 117524312A CN 202311484607 A CN202311484607 A CN 202311484607A CN 117524312 A CN117524312 A CN 117524312A
Authority
CN
China
Prior art keywords
positive
pathogen
data
result
reporting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311484607.7A
Other languages
Chinese (zh)
Inventor
杨丽
刘佳
朱鸿坤
戴立忠
李赛
邓小龙
陈姮玉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shengwei Intelligence Chengdu Gene Technology Co ltd
Original Assignee
Shengwei Intelligence Chengdu Gene Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shengwei Intelligence Chengdu Gene Technology Co ltd
Priority to CN202311484607.7A
Publication of CN117524312A
Legal status: Pending

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B30/10 Sequence alignment; Homology search
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00 ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/40 ICT specially adapted for the handling or processing of patient-related medical or healthcare data for data related to laboratory analysis, e.g. patient specimen analysis
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H15/00 ICT specially adapted for medical reports, e.g. generation or transmission thereof

Landscapes

  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Medical Informatics (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Epidemiology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Public Health (AREA)
  • Evolutionary Biology (AREA)
  • Biophysics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biotechnology (AREA)
  • Primary Health Care (AREA)
  • Bioethics (AREA)
  • Databases & Information Systems (AREA)
  • Analytical Chemistry (AREA)
  • Data Mining & Analysis (AREA)
  • Chemical & Material Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention belongs to the field of pathogen infection detection. It relates to a method and a device for analyzing pathogen metagenomic sequencing data and to applications thereof, and more particularly to a method and a device for downgrade analysis of positive results in pathogen metagenomic sequencing data and to applications thereof. The invention provides a method for analyzing pathogen metagenomic sequencing data, comprising the following steps: S1, acquiring positive-reporting data of pathogen metagenomic sequencing; S2, downgrading the positive results of specific pathogens in the positive-reporting data; and S3, outputting a final positive reporting result.

Description

Analysis method and device for pathogen metagenome sequencing data and application thereof
Technical Field
The invention belongs to the field of pathogen infection detection. It relates to a method and a device for analyzing pathogen metagenomic sequencing data and to applications thereof, and more particularly to a method and a device for downgrade analysis of positive results in pathogen metagenomic sequencing data and to applications thereof.
Background
Accurate etiological diagnosis is of great importance for the diagnosis and treatment of infectious diseases. Traditional etiological diagnosis depends heavily on the clinician's experience: a differential diagnosis is generally made from the patient's clinical manifestations, and suspected pathogens are then tested for one by one. Because of these limitations, traditional detection methods often fail to cover cases such as rare pathogens and mixed infections, whereas metagenomic next-generation sequencing (mNGS) can detect multiple pathogens rapidly, simultaneously and without bias. A typical mNGS bioinformatics workflow consists of a series of analysis steps starting from the raw input fastq file, including quality and low-complexity filtering, adapter filtering, human host removal, microbial identification by alignment against a reference database, optional sequence assembly, and classification of individual reads and/or contiguous sequences (contigs) at the family, genus and species level.
At present, pathogen metagenomic sequencing requires considerable time and professional interpreters to review reports and output the positive reporting result. However, the output positive result is affected by detected positive microorganisms whose read counts or abundance are too low. The field therefore needs an analysis method that can downgrade positive results and output a high-precision positive result.
Disclosure of Invention
In view of this, in a first aspect, the present invention provides a method for analysis of pathogen metagenomic sequencing data, comprising the steps of:
S1: acquiring positive-reporting data of pathogen metagenomic sequencing, wherein the positive-reporting data comprise the positive data and the raw data of the pathogen metagenomic sequencing;
S2: downgrading the positive results of specific pathogens in the positive-reporting data; and
S3: outputting a final positive reporting result.
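The three steps can be pictured as a small pipeline. The following is a minimal sketch only: the `Finding` record, the `status` labels and the pluggable `should_downgrade` rule are hypothetical names introduced for illustration and do not come from the patent text.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Finding:
    """One detected organism in a sample (hypothetical record layout)."""
    species: str
    reads: int
    status: str = "positive"  # first-pass call: "positive" or "suspected"


def run_analysis(findings: Iterable[Finding],
                 should_downgrade: Callable[[Finding], bool]) -> Dict[str, List[str]]:
    # S1: take the first-pass positive-reporting data.
    positives = [f for f in findings if f.status == "positive"]
    # S2: downgrade every positive that the supplied rule flags.
    reviewed = [replace(f, status="suspected") if should_downgrade(f) else f
                for f in positives]
    # S3: output the final report; downgraded organisms stay visible as suspected.
    return {
        "positive": [f.species for f in reviewed if f.status == "positive"],
        "suspected": [f.species for f in reviewed if f.status == "suspected"],
    }
```

Passing the rule in as a function keeps the pipeline independent of which downgrade condition is applied in step S2.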
In some specific embodiments, the analysis method further comprises a step of constructing an interpretation library.
Further, the interpretation library may comprise a positive library; still further, construction of the positive library includes the following step:
obtaining the report results of known positive samples and the sequencing data of the known positive samples corresponding to those report results, and labeling the report results and the sequencing data in one-to-one correspondence, thereby constructing the interpretation library.
The step of constructing the interpretation library may be performed before step S1, after step S1, or before step S2.
Further, the interpretation library may comprise a negative library; still further, construction of the negative library includes the following step:
obtaining the report results of known negative samples and the sequencing data of the known negative samples corresponding to those report results, and labeling the report results and the sequencing data in one-to-one correspondence, thereby constructing the negative library.
Further, the report results comprise positive pathogens, suspected pathogens and detected drug resistance genes; the sequencing data comprise the data that sequencing data conventionally contain, such as the number of specific reads, sample number, corresponding Latin name, genome coverage, relative abundance, category, corresponding genus name, RPM and pathogenicity information.
Further, the number of positive samples is not less than 200, preferably not less than 1000, more preferably not less than 3000.
Further, the number of negative samples is not less than 50, preferably not less than 100, more preferably not less than 300.
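One way to picture the one-to-one labeling is to join each sequencing record to the report result of the same sample and species. This is a sketch under assumptions: the dictionary layout and the keys `sample_id`, `species` and `reads` are hypothetical, and only records reported as positive are kept here for the positive library.

```python
from collections import defaultdict


def build_positive_library(report_results, sequencing_rows):
    """Return {species: sorted read counts} for records reported as positive.

    report_results: {sample_id: {species: "positive" | "suspected"}}
    sequencing_rows: iterable of dicts with keys sample_id, species, reads
    (both layouts are assumptions made for this sketch)
    """
    library = defaultdict(list)
    for row in sequencing_rows:
        # Label the sequencing record with the reported status of the same organism.
        status = report_results.get(row["sample_id"], {}).get(row["species"])
        if status == "positive":
            library[row["species"]].append(row["reads"])
    # Sorting up front makes minimum and rank lookups straightforward later.
    return {species: sorted(reads) for species, reads in library.items()}
```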
In some specific embodiments, the step S1 further comprises obtaining at least one of the following data: species name, genus name corresponding to species, specific short nucleotide sequence number in genus, sequencing data amount, ratio of human data amount to total data amount, total microbial data amount, pathogenic information, short nucleotide sequence number, negative control short nucleotide sequence number, specific short nucleotide sequence number, negative control specific short nucleotide sequence number, unit short nucleotide sequence number, negative control unit short nucleotide sequence number, relative abundance, coverage.
In a specific embodiment, step S2 further comprises the following positive result downgrade condition:
interpreting the positive results and, when the number of positive pathogens to be interpreted is greater than or equal to 2, comparing the read count of each positive pathogen to be interpreted with the minimum read count of the same positive pathogen in the interpretation library; if the read count of the positive pathogen is lower, the positive reporting result is downgraded and the pathogen is reported as a suspected pathogen.
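This condition can be sketched as follows. The sketch assumes that the "greater than or equal to 2" precondition counts the positive pathogens in the sample, as in the worked examples, and that a pathogen with no history in the library is left unchanged; the function name and data layout are hypothetical.

```python
def downgrade_by_library_minimum(positives, library):
    """Apply the minimum-read-count condition.

    positives: {species: reads} for the first-pass positives of one sample
    library:   {species: [historical positive read counts]}
    Returns {species: "positive" | "suspected"}.
    """
    # Assumed precondition: the rule only applies when 2 or more positives were called.
    if len(positives) < 2:
        return {species: "positive" for species in positives}
    calls = {}
    for species, reads in positives.items():
        history = library.get(species, [])
        # Downgrade only when the read count is below the historical minimum.
        below_minimum = bool(history) and reads < min(history)
        calls[species] = "suspected" if below_minimum else "positive"
    return calls
```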
In a specific embodiment, step S2 further comprises the following positive result downgrade condition:
interpreting the positive results and, when the number of positive pathogens to be interpreted is greater than or equal to 2, sorting the records of the same positive pathogen in the interpretation library by read count from low to high; if the read count of the positive pathogen to be interpreted is lower than the read count at the lowest 5% rank for that pathogen in the interpretation library, the positive reporting result is downgraded and the pathogen is reported as a suspected pathogen.
For example, if the interpretation library contains 100 positive records for Escherichia coli, the records are sorted by read count from low to high, and the pathogen to be interpreted is reported as a suspected pathogen when its read count is lower than that of the 5th-ranked Escherichia coli record in the library.
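A minimal sketch of the 5% rank condition follows. For a library of 100 records the threshold is the 5th smallest read count, matching the example above; rounding the rank up for other library sizes, and leaving pathogens without library history unchanged, are assumptions of this sketch, as are the function names.

```python
def fifth_percentile_threshold(history):
    """Read count of the record ranked at the lowest 5% of the sorted history."""
    ordered = sorted(history)
    # ceil(5% of n) in integer arithmetic, never below rank 1 (rounding is an assumption).
    rank = max(1, (5 * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def downgrade_by_fifth_percentile(positives, library):
    """positives: {species: reads}; library: {species: [historical read counts]}."""
    # Assumed precondition: the rule only applies when 2 or more positives were called.
    if len(positives) < 2:
        return {species: "positive" for species in positives}
    calls = {}
    for species, reads in positives.items():
        history = library.get(species, [])
        if history and reads < fifth_percentile_threshold(history):
            calls[species] = "suspected"
        else:
            calls[species] = "positive"
    return calls
```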
In a second aspect, the present invention provides an apparatus for pathogen metagenomic sequencing data analysis, comprising:
S1: a module for acquiring positive-reporting data of pathogen metagenomic sequencing, wherein the positive-reporting data comprise the positive data and the raw data of the pathogen metagenomic sequencing;
S2: a positive result downgrade module for downgrading the positive results of specific pathogens in the positive-reporting data; and
S3: a module for outputting a final positive reporting result.
In some specific embodiments, the apparatus further comprises a module for constructing a clinical interpretation library.
Further, the interpretation library may comprise a positive library; still further, construction of the positive library includes the following step:
obtaining the clinical report results of clinically known positive samples and the sequencing data of the known positive samples corresponding to those clinical report results, and labeling the clinical report results and the sequencing data in one-to-one correspondence, thereby constructing the interpretation library.
Further, the interpretation library may comprise a negative library; still further, construction of the negative library includes the following step:
obtaining the clinical report results of clinically known negative samples and the sequencing data of the known negative samples corresponding to those clinical report results, and labeling the clinical report results and the sequencing data in one-to-one correspondence, thereby constructing the negative library.
Further, the clinical report results comprise positive pathogens, suspected pathogens and detected drug resistance genes; the sequencing data comprise the data that sequencing data conventionally contain, such as the number of specific reads, sample number, corresponding Latin name, genome coverage, relative abundance, category, corresponding genus name, RPM and pathogenicity information.
Further, the number of positive samples is not less than 200, preferably not less than 1000, more preferably not less than 3000.
Further, the number of negative samples is not less than 50, preferably not less than 100, more preferably not less than 300.
In some specific embodiments, the S3 module further applies the following positive result downgrade condition:
interpreting the positive results and, when the number of positive pathogens is greater than or equal to 2, comparing the read count of each positive pathogen with the lowest read count of the same positive pathogen in the interpretation library; if the read count of the positive pathogen is lower, the positive result is downgraded and the pathogen is reported as a suspected pathogen.
In some specific embodiments, the S3 module further applies the following positive result downgrade condition:
interpreting the positive results and, when the number of positive pathogens is greater than or equal to 2, sorting the records of the same positive pathogen in the interpretation library by read count from low to high; if the read count of the positive pathogen is lower than the read count at the lowest 5% rank for that pathogen in the interpretation library, the positive reporting result is downgraded and the pathogen is reported as a suspected pathogen.
Further, the device also includes a nucleic acid extraction module for extracting nucleic acids of the sample.
In a third aspect, the invention provides the use of the analysis method or device described above in the preparation of a kit or device for pathogen metagenomic sequencing data.
In a fourth aspect, the present invention provides an apparatus comprising:
at least one processor; and
a memory communicatively coupled to at least one of the processors; wherein,
the memory stores instructions that, when executed by the processor, implement any of the methods for analyzing pathogen metagenomic sequencing data described above.
In some embodiments, the device further comprises at least one input device and at least one output device; in the device, the processor, the memory, the input device and the output device are connected through buses.
In a fifth aspect, there is provided a storage medium storing computer instructions that, when executed by a computer, implement any of the methods for analyzing pathogen metagenomic sequencing data described above.
In some embodiments, the storage medium is a computer-readable storage medium.
In a sixth aspect, the invention provides a kit comprising:
sample nucleic acid extraction reagents and metagenomic sequencing reagents; and
an apparatus or device or storage medium as described above.
By using the method for analyzing pathogen metagenomic sequencing data, positive results in the positive-reporting data can be downgraded. This makes the positive reporting result more accurate, supplements manual interpretation, avoids clinical over-interpretation of pathogens, brings the report closer to the clinical findings, and better assists physicians in choosing medication.
Drawings
FIG. 1 is a schematic illustration of a data analysis method according to the present invention.
Detailed Description
The advantages and various effects of the present invention will be more clearly apparent from the following detailed description and examples. It will be understood by those skilled in the art that these specific embodiments and examples are intended to illustrate the invention, not to limit the invention.
A schematic diagram of the data analysis method according to the present invention is shown in fig. 1.
The terms referred to in this invention are:
fastq: a text format, also called the fq format, used to store biological sequences (typically nucleic acid sequences) and their corresponding quality values.
Number of short nucleotide sequences (Reads): the number of sequences that align specifically to the pathogen. Sequencing is performed after the nucleic acid of the microorganism has been broken into fragments, so the number of sequences is the number of detected nucleic acid fragments that belong to the microorganism, and it is generally positively correlated with the pathogen load.
Relative abundance: the proportion of the pathogen's sequences among the detected microorganisms of the same class. Because bacteria, fungi, viruses and parasites differ in microecological characteristics and clinical significance, relative abundance is calculated separately for each class; for example, the relative abundance of a bacterium is its percentage among all bacteria detected in the sample. The higher the relative abundance, the higher the proportion of the pathogen in the specimen, but the relative abundances of microorganisms from different major classes cannot be compared with each other.
Genome coverage: the ratio of the detected nucleic acid sequence of the microorganism to its whole genome sequence. Genome coverage is related to the number of sequences: the more sequences are detected, the higher the coverage, and higher coverage more strongly indicates that the pathogen is truly present in the specimen.
RPM (Reads Per Million reads): the number of reads of a microorganism detected per million sequenced reads, that is, the read count normalized by the total data amount so that detection levels can be compared between samples sequenced to different depths.
Negative control (NTC): its purpose is to exclude false positives. A negative control is a sample known to be negative; if it tests positive, the experiment is flawed. Negative controls are used to monitor some of the variables in the experiment.
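The three quantitative terms above can be written as small functions. This is a sketch under assumptions: RPM uses the conventional formula reads × 10^6 / total reads, coverage is computed from 0-based half-open alignment intervals, and all function names and input layouts are hypothetical.

```python
def relative_abundance(detections):
    """detections: list of (species, category, reads) tuples for one sample.

    Returns {species: percent of reads within its own category}; categories
    (bacteria, fungi, viruses, parasites) are never mixed.
    """
    totals = {}
    for _, category, reads in detections:
        totals[category] = totals.get(category, 0) + reads
    return {species: 100.0 * reads / totals[category]
            for species, category, reads in detections}


def genome_coverage(intervals, genome_length):
    """Fraction of the genome covered by at least one aligned read.

    intervals: (start, end) alignment positions, 0-based and half-open.
    """
    covered = 0
    current_end = 0
    for start, end in sorted(intervals):
        start = max(start, current_end)  # skip the part already counted
        if end > start:
            covered += end - start
            current_end = end
    return covered / genome_length


def reads_per_million(reads, total_reads):
    """Read count scaled to one million sequenced reads."""
    return reads * 1_000_000 / total_reads
```

Overlapping alignments are merged in `genome_coverage` so that a region hit by several reads is counted once.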
Example 1 construction of interpretation library
A total of 4623 clinically known positive samples and 102 clinically known negative samples were obtained, together with the sequencing data corresponding to the known positive and known negative samples.
The clinical report results include the detection results of positive pathogens, suspected pathogens and detected drug resistance genes.
Sequencing data included:
Source of sequencing data: the raw off-machine data from the sequencer are subjected to data quality control (removal of duplicate sequences and of sequences whose length is below standard), removal of human-derived and background microbial sequences, low-complexity filtering, alignment against a pathogen database, classification of the alignment results, and output of a pathogen data table.
Content of sequencing data: sequencing date, number of specific reads, sample number, corresponding Latin name, genome coverage, relative abundance, category, corresponding genus name, RPM, pathogenicity information, etc.; part of the format is shown in Table 1 below.
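The quality-control part of the data source described above can be sketched as a simple filter. This assumes that quality control means dropping exact duplicate reads and reads shorter than a minimum length; the `min_length` default is a hypothetical value, not one given in the source.

```python
def quality_control(reads, min_length=50):
    """Keep the first copy of each read that meets the minimum length.

    reads: iterable of nucleotide strings; min_length is an assumed threshold.
    """
    seen = set()
    kept = []
    for sequence in reads:
        # Drop reads that are too short or that repeat an earlier read exactly.
        if len(sequence) < min_length or sequence in seen:
            continue
        seen.add(sequence)
        kept.append(sequence)
    return kept
```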
The sequencing data tables of the 4623 clinically known positive samples were merged into a single table, and the positive pathogens and suspected pathogens in the sequencing data tables were labeled in one-to-one correspondence with the clinical report results to form the interpretation libraries (data from 3149 historical clinical respiratory tract samples were used to construct the respiratory tract interpretation library; data from 1113 historical clinical whole blood samples were used to construct the whole blood interpretation library; data from 361 historical clinical cerebrospinal fluid samples were used to construct the cerebrospinal fluid interpretation library).
The sequencing data of the 102 clinically known negative samples were compiled into a separate table to construct the negative library.
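Because a separate interpretation library is kept for each sample type, a sample should be compared only against the library of its own type. The sketch below groups labeled records by sample type and checks that the stated per-type counts add up to the 4623 positive samples; the `sample_type` key and the function names are hypothetical.

```python
from collections import defaultdict

# Sample counts per interpretation library, as stated in Example 1.
LIBRARY_SAMPLE_COUNTS = {
    "respiratory tract": 3149,
    "whole blood": 1113,
    "cerebrospinal fluid": 361,
}


def split_by_sample_type(rows):
    """Group labeled records (dicts with a 'sample_type' key) into one library per type."""
    libraries = defaultdict(list)
    for row in rows:
        libraries[row["sample_type"]].append(row)
    return dict(libraries)


def select_library(libraries, sample_type):
    """Return the library matching the sample type, or fail loudly if none exists."""
    if sample_type not in libraries:
        raise ValueError(f"no interpretation library for sample type {sample_type!r}")
    return libraries[sample_type]
```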
TABLE 1
Example 2 analysis of clinical sample 1 by the analytical method of the present invention
In this example, the sample to be tested is a respiratory tract sample (alveolar lavage fluid). A total of 11 microbial species were detected in sample BA21120703, of which 2 were positive results according to the preset positive analysis method. In view of the limited accuracy of the current positive analysis method, the invention further performs downgrade analysis on the sample detection data.
Table 2.1 Raw metagenomic detection data and first positive-reporting data for the alveolar lavage fluid sample
First, the original positive reporting result is obtained from the raw sample data and the positive-reporting data.
Table 2.2 First positive-reporting data from metagenomic detection of the alveolar lavage fluid sample
Second, based on the positive-reporting data, detected positive microorganisms with too few reads are downgraded, so that an excess of positive microorganisms does not interfere with the clinician's identification of the true pathogen.
Downgrade condition 1: interpret the positive results and, when the number of positive pathogens is greater than or equal to 2, compare the read count of each positive pathogen with the lowest read count of the same positive pathogen in the interpretation library; if the read count of the positive pathogen is lower, the positive result is downgraded and the pathogen is reported as a suspected pathogen.
Downgrade condition 2: interpret the positive results and, when the number of positive pathogens is greater than or equal to 2, sort the records of the same positive pathogen in the interpretation library by read count from low to high; if the read count of the positive pathogen is lower than the read count at the lowest 5% rank for that pathogen in the interpretation library, the positive reporting result is downgraded and the pathogen is reported as a suspected pathogen.
In this example, Streptococcus pneumoniae in sample BA21120703 meets condition 1: its read count in the original positive-reporting data is 1, which falls below the lowest read count for this organism in the historical positive interpretation library, so the positive reporting result is downgraded and it is reported as a suspected pathogen.
Streptococcus pneumoniae in sample BA21120703 also meets condition 2: the number of positive organisms is greater than or equal to 2, and its read count of 1 is lower than the read count at the lowest 5% rank in the interpretation library, so the positive result is downgraded and it is reported as a suspected pathogen.
Table 2.3 Positive-reporting downgrade processing data from metagenomic detection of the alveolar lavage fluid sample
Third, the downgraded positive reporting result is output.
Table 2.4 Final positive-reporting data after downgrading for metagenomic detection of the alveolar lavage fluid sample
The output was compared with the clinical result for the alveolar lavage fluid sample, as shown in Table 2.5. The comparison shows that the individual from whom the sample was taken was clinically indicated to be infected with Chlamydia psittaci, so the result of the analysis method of the invention fully agrees with the clinical result, which indicates that the analysis of the invention is more accurate.
Table 2.5 Comparison of the final positive-reporting data with the clinical test data
Example 3 analysis of clinical sample 2 by the analytical method of the present invention
In this example, the sample to be tested is whole blood. A total of 13 microbial species were detected in sample BL22010607, of which 2 were positive results according to the preset positive analysis method. In view of the limited accuracy of the current positive analysis method, the invention further performs downgrade analysis on the sample detection data.
Table 3.1 Raw metagenomic detection data and first positive-reporting data for the whole blood sample
First, the original positive reporting result is obtained from the raw sample data and the positive-reporting data.
Table 3.2 First positive-reporting data from metagenomic detection of the whole blood sample
Second, based on the positive-reporting data, detected positive microorganisms with too few reads are downgraded, so that an excess of positive microorganisms does not interfere with the clinician's identification of the true pathogen.
Downgrade condition 1: interpret the positive results and, when the number of positive pathogens is greater than or equal to 2, compare the read count of each positive pathogen with the lowest read count of the same positive pathogen in the interpretation library; if the read count of the positive pathogen is lower, the positive result is downgraded and the pathogen is reported as a suspected pathogen.
Downgrade condition 2: interpret the positive results and, when the number of positive pathogens is greater than or equal to 2, sort the records of the same positive pathogen in the interpretation library by read count from low to high; if the read count of the positive pathogen is lower than the read count at the lowest 5% rank for that pathogen in the interpretation library, the positive reporting result is downgraded and the pathogen is reported as a suspected pathogen.
In this example, Enterococcus faecium in sample BL22010607 meets condition 1: its read count in the original positive-reporting data is 1, which falls below the lowest read count for this organism in the historical positive interpretation library, so it is reported as a suspected pathogen.
Table 3.3 Positive-reporting downgrade processing data from metagenomic detection of the whole blood sample
Third, the downgraded positive reporting result is output.
Table 3.4 Final positive-reporting data after downgrading for metagenomic detection of the whole blood sample
The output was compared with the clinical result for the whole blood sample, as shown in Table 3.5. The comparison shows that the individual from whom the sample was taken was clinically indicated to be infected with Mycoplasma pneumoniae, so the result of the analysis method of the invention fully agrees with the clinical result, which indicates that the analysis of the invention is more accurate.
Table 3.5 Comparison of the final positive-reporting data with the clinical test data
Example 4 precision enhancement of the analytical method of the present invention
Further, to verify that the analysis method of the invention improves positive reporting precision, positive result downgrade analysis was performed on 173 respiratory tract samples, 39 whole blood samples and 30 cerebrospinal fluid samples. The analysis gave more accurate causative pathogens for 50 of the 173 respiratory tract samples and for 6 of the 39 whole blood samples, and comparison showed these results to be consistent with the clinical diagnoses. As shown in Table 4, precision improved by 28.90% and 15.38%, respectively.
Table 4
Sample                            Respiratory tract samples   Whole blood samples
Total samples                     173                         39
Samples with improved precision   50                          6
Precision improvement rate        28.90%                      15.38%
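The improvement rates in Table 4 follow directly from the sample counts, as the short check below shows. The function name is hypothetical; the counts are those given in Example 4.

```python
def improvement_rate(improved, total):
    """Percentage of samples whose result became more accurate after downgrading."""
    return 100.0 * improved / total


# Counts from Table 4: (sample type, samples with improved precision, total samples).
for name, improved, total in [("Respiratory tract", 50, 173), ("Whole blood", 6, 39)]:
    print(f"{name}: {improvement_rate(improved, total):.2f}%")
```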

Claims (11)

1. A method of analyzing pathogen metagenomic sequencing data, comprising the steps of:
S1: acquiring positive-reporting data of pathogen metagenomic sequencing, wherein the positive-reporting data comprise the positive data and the raw data of the pathogen metagenomic sequencing;
S2: downgrading the positive results of specific pathogens in the positive-reporting data; and
S3: outputting a final positive reporting result.
2. The method of claim 1, wherein the analysis method further comprises the step of constructing an interpretation library.
3. The method of claim 2, wherein the interpretation library comprises a positive library, and construction of the positive library comprises the following step:
obtaining the report results of known positive samples and the sequencing data of the known positive samples corresponding to those report results, and labeling the report results and the sequencing data in one-to-one correspondence, thereby constructing the interpretation library.
4. The method of claim 2, wherein the interpretation library comprises a negative library, and construction of the negative library comprises the following step:
obtaining the report results of known negative samples and the sequencing data of the known negative samples corresponding to those report results, and labeling the report results and the sequencing data in one-to-one correspondence, thereby constructing the negative library.
5. The method of claim 1, wherein step S2 further comprises the following positive result downgrade condition:
interpreting the positive results and, when the number of positive pathogens to be interpreted is greater than or equal to 2, comparing the read count of each positive pathogen to be interpreted with the minimum read count of the same positive pathogen in the interpretation library; if the read count of the positive pathogen is lower, the positive reporting result is downgraded and the pathogen is reported as a suspected pathogen.
6. The method of claim 1, wherein step S2 further comprises the following positive result downgrade condition:
interpreting the positive results and, when the number of positive pathogens to be interpreted is greater than or equal to 2, sorting the records of the same positive pathogen in the interpretation library by read count from low to high; if the read count of the positive pathogen to be interpreted is lower than the read count at the lowest 5% rank for that pathogen in the interpretation library, the positive reporting result is downgraded and the pathogen is reported as a suspected pathogen.
7. Use of the method of analyzing pathogen metagenomic sequencing data according to any one of claims 1-6 in the preparation of a device for pathogen metagenomic sequencing data.
8. An apparatus for pathogen metagenomic sequencing data analysis, comprising:
S1, a module for acquiring positive reporting data of pathogen metagenomic sequencing, wherein the positive reporting data comprises positive data and raw data of the pathogen metagenomic sequencing;
S2, a module for performing pathogen-specific positive result downgrading on the positive reporting data; and
S3, a module for outputting a final positive reporting result.
9. The apparatus of claim 8, further comprising a build interpretation library module.
10. An apparatus for pathogen metagenomic sequencing data analysis, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, implement the method of analyzing pathogen metagenomic sequencing data according to any one of claims 1-6.
11. A storage medium storing computer instructions which, when executed by a computer, implement the method of analyzing pathogen metagenomic sequencing data according to any one of claims 1-6.
CN202311484607.7A 2023-11-07 2023-11-07 Analysis method and device for pathogen metagenome sequencing data and application thereof Pending CN117524312A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311484607.7A CN117524312A (en) 2023-11-07 2023-11-07 Analysis method and device for pathogen metagenome sequencing data and application thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311484607.7A CN117524312A (en) 2023-11-07 2023-11-07 Analysis method and device for pathogen metagenome sequencing data and application thereof

Publications (1)

Publication Number Publication Date
CN117524312A (en) 2024-02-06

Family

ID=89765693

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311484607.7A Pending CN117524312A (en) 2023-11-07 2023-11-07 Analysis method and device for pathogen metagenome sequencing data and application thereof

Country Status (1)

Country Link
CN (1) CN117524312A (en)

Similar Documents

Publication Publication Date Title
CN111951895B (en) Pathogen analysis method based on metagenomics analysis device, apparatus, and storage medium
CN111009286B (en) Method and apparatus for microbiological analysis of a host sample
CN113160882B (en) Pathogenic microorganism metagenome detection method based on third generation sequencing
EP3590058A1 (en) Systems and methods for metagenomic analysis
EP3409789A1 (en) Method for qualitative and quantitative detection of microorganism in human body
JP2016518822A (en) Characterization of biological materials using unassembled sequence information, probabilistic methods, and trait-specific database catalogs
CN115719616B (en) Screening method and system for pathogen species specific sequences
WO2014136106A1 (en) Method and system for analyzing the taxonomic composition of a metagenome in a sample
KR20210094783A (en) Method and apparatus for screening gene related with disease in next generation sequence analysis
WO2019242445A1 (en) Detection method, device, computer equipment and storage medium of pathogen operation group
US20230357834A1 (en) Hybrid protocols and barcoding schemes for multiple sequencing technologies
CN112331268B (en) Method for obtaining specific sequence of target species and method for detecting target species
AU2020382701A1 (en) Identification of host RNA biomarkers of infection
CN117524312A (en) Analysis method and device for pathogen metagenome sequencing data and application thereof
CN110970093B (en) Method and device for screening primer design template and application
CN114317725B (en) Crohn disease biomarker, kit and screening method of biomarker
CN113793647A (en) Metagenome data analysis device and method based on next generation sequencing
CN116469462A (en) Ultra-low frequency DNA mutation identification method and device based on double sequencing
CN117524313A (en) Analysis method and device for pathogen metagenome sequencing data and application thereof
Kebschull et al. Differential expression and functional analysis of high-throughput-omics data using open source tools
CN113470752A (en) Bacterial sequencing data identification method based on nanopore sequencer
US20210214774A1 (en) Method for the identification of organisms from sequencing data from microbial genome comparisons
CN117524311A (en) Analysis method and device for pathogen metagenome sequencing data and application thereof
Uprety et al. The current state of metagenomics in infectious disease
CN211578386U (en) Metagenome analysis device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination