CN114107454A - Respiratory tract infection pathogen detection method based on metagenome/metatranscriptome sequencing

Info

Publication number
CN114107454A
Authority
CN
China
Prior art keywords
sample
microorganism
sequencing
sequence
macro
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010883055.7A
Other languages
Chinese (zh)
Inventor
莫茜
陶悦
杜白露
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Childrens Medical Center Affiliated to Shanghai Jiaotong University School of Medicine
Original Assignee
Shanghai Childrens Medical Center Affiliated to Shanghai Jiaotong University School of Medicine
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Childrens Medical Center Affiliated to Shanghai Jiaotong University School of Medicine
Priority to CN202010883055.7A
Publication of CN114107454A

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing

Abstract

The invention discloses a method for detecting the pathogens of respiratory infectious diseases in children based on metagenome/metatranscriptome sequencing, which specifically comprises the following steps: after the sample to be tested is processed, DNA and RNA are extracted using a DNA extraction kit and an RNA extraction kit, respectively; a sample DNA sequencing library and a sample RNA sequencing library are constructed; the sequencing libraries are sequenced at high throughput, and the negative controls for DNA and RNA extraction and library construction are sequenced on the same lane at the same time; the sample sequence data are aligned against a reference database to obtain sequences with an alignment ratio greater than 70% and without multiple alignments, pathogenic microorganisms are screened, and the respiratory infectious disease pathogens in the sample are identified from the screening result. The invention allows the etiology of respiratory infectious diseases in children to be determined quickly and colonization to be distinguished from infection, so that a targeted treatment strategy can be formulated and the length of hospital stay, medical costs, days of antibiotic use and mortality of pediatric patients can be reduced.

Description

Respiratory tract infection pathogen detection method based on metagenome/metatranscriptome sequencing
Technical Field
The invention belongs to the technical field of microbial detection, and particularly relates to a respiratory tract infection pathogen detection method based on metagenome/metatranscriptome sequencing.
Background
Acute respiratory tract infection (ARTI) is the leading cause of outpatient visits and hospitalization in children of all ages, particularly in winter and spring; ARTI is also the second leading cause of death in children under the age of 5 and is a major public health threat to children. Taking pneumonia as an example of a single disease, WHO surveys show that more than 2 million children aged 0-5 die of pneumonia every year, accounting for almost 1/5 of all deaths of children under the age of 5 worldwide; in China, 14.8% of deaths in children under 5 years of age are caused by pneumonia. Improving the diagnosis and treatment of respiratory tract infection in children is therefore an important measure for promoting child health.
Infection by pathogenic microorganisms such as bacteria, viruses and mycoplasma is the main cause of pneumonia in children, but the clinical symptoms of the different types of childhood pneumonia are extremely similar. Even with the help of hematological and imaging examinations, clinicians often cannot distinguish them accurately, so empirical, broad-spectrum treatment is difficult to avoid and infected children may receive unnecessary antibacterial treatment. An early and definite etiological diagnosis therefore helps to start targeted treatment in time, reduce the risk of antibiotic abuse and reduce the consumption of medical resources.
Traditional methods for detecting pathogenic microorganisms mainly include morphological examination, microbial culture, smear microscopy, antigen-antibody detection and nucleic acid detection. These methods are still widely used clinically, but each has limitations inherent to its methodology. For example, microbial culture is the gold standard for pathogen detection and has good specificity, but it has a long turnaround time, a low positive rate and complicated operation, and apart from bacterial culture it is not suitable for wide clinical application. Immunological detection mainly detects pathogen antibodies in the serum of the child and is simple and fast, but because the production and clearance of antibodies lag behind changes in the pathogen, and because cross-reactions occur, it is of little use for early diagnosis and is prone to false negatives or false positives. With the development of molecular biology, molecular diagnosis of pathogens has gradually entered clinical use, for example assays for 7 respiratory pathogens based on real-time fluorescence quantitative PCR or loop-mediated isothermal amplification (LAMP), and point-of-care pathogen tests such as FilmArray and GeneXpert. However, all of these methods target a preset list of pathogens, cover only a narrow pathogen spectrum, and are helpless when faced with rare or novel pathogens. The wide variety of respiratory pathogens, their similar clinical presentation, the complex mechanisms of acquired bacterial drug resistance and the limitations of the detection methods available in clinical laboratories often leave anti-infective treatment at an impasse, so a more accurate and standardized molecular diagnostic method for pathogens is urgently needed to improve the diagnosis and treatment of respiratory tract infection.
Pathogen metagenomic sequencing (mNGS) is a high-throughput sequencing technology that extracts nucleic acid directly from a specimen taken at the patient's site of infection and identifies pathogen species without relying on culture or on a preset list of pathogens. Compared with traditional clinical laboratory methods, pathogen mNGS detects the nucleic acid sequences of pathogenic microorganisms, can overcome the limitations imposed by different pathogen types, covers thousands or even tens of thousands of pathogens without bias, and simultaneously identifies many kinds of pathogenic microorganisms such as bacteria, fungi, viruses, Mycobacterium tuberculosis and parasites. mNGS has gradually become an important tool in clinical microbiology, in particular for responding quickly to the threat of infectious diseases caused by new pathogens, and it is also expected to strengthen our ability to detect and track infectious disease pathogens.
Although pathogen metagenomics is a powerful tool for identifying pathogen species and can detect the genetic information of pathogens, it cannot measure the gene expression level of microorganisms, analyze their activity or identify RNA viruses; pathogen metatranscriptomics makes up for these shortcomings of mNGS well. Metatranscriptomics takes all the RNA in a specimen as its object of study. It can detect RNA viruses in the specimen and can study, at the whole-community level, the transcription of the genomes of living organisms (pathogens in an active state) at the patient's site of infection and the rules governing their transcriptional regulation; it avoids the difficulty of isolating and culturing microorganisms and allows changes in complex microbial communities to be studied at the transcriptional level. Because the pathogens that cause respiratory tract infection in children include many RNA viruses, detecting these pathogens by combining pathogen metagenome sequencing with pathogen metatranscriptome sequencing not only achieves full coverage of the pathogen spectrum but also allows drug resistance genes to be identified. In addition, pathogen metatranscriptome sequencing, and the comparison of metatranscriptome sequencing results before and after treatment, can help to judge whether a pathogen is in an active state and whether treatment is effective, thereby improving the detection rate and accuracy for pathogens of respiratory tract infection in children and promoting precision medicine for these infections.
Disclosure of Invention
The invention aims to solve the technical problems of the prior-art methods for detecting respiratory tract infection in children, namely a low positive rate, a long turnaround time, a tendency to produce false negatives or false positives, and a narrow pathogen spectrum, and to provide a novel respiratory tract infection pathogen detection method based on metagenome/metatranscriptome sequencing.
The method for detecting respiratory infectious disease pathogens is based on metagenome/metatranscriptome sequencing and specifically comprises the following steps:
step S1: breaking the cell walls in one collected respiratory tract sample to be tested and extracting DNA using a DNA extraction kit; extracting RNA directly from another collected respiratory tract sample to be tested using an RNA extraction kit;
step S2: constructing a sample DNA sequencing library from the extracted DNA using a DNA library construction kit; constructing a sample RNA sequencing library from the extracted RNA using an RNA library construction kit;
step S3: performing high-throughput sequencing on the sample DNA sequencing library and the sample RNA sequencing library, and at the same time sequencing the negative controls for DNA and RNA extraction and library construction on the same lane; preliminarily screening the sequence data obtained by sequencing to filter out interfering sequences, obtaining processed sample sequence data;
step S4: aligning the sample sequence data against a reference database to obtain sample sequences and negative control sequences with an alignment ratio greater than 70% and without multiple alignments, then screening pathogenic microorganisms, and detecting the respiratory infectious disease pathogens in the sample to be tested according to the screening result.
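The alignment filter in step S4 can be illustrated with a minimal Python sketch. It is not the inventors' implementation: it assumes that the "alignment ratio" is the aligned length divided by the read length and that "without multiple alignment" means the classifier reported exactly one assignment for the read. The `Hit` record and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One classified read (hypothetical record; field names are assumptions)."""
    read_id: str
    tax_id: int
    hit_length: int    # number of read bases covered by the alignment
    query_length: int  # length of the read
    num_matches: int   # number of assignments reported for this read


def keep_hit(hit, min_ratio=0.70):
    """Keep reads aligned over more than min_ratio of their length, with a unique assignment."""
    if hit.query_length <= 0:
        return False
    ratio = hit.hit_length / hit.query_length
    return ratio > min_ratio and hit.num_matches == 1


def filter_hits(hits, min_ratio=0.70):
    """Return only the hits that pass the step S4 criteria as interpreted above."""
    return [h for h in hits if keep_hit(h, min_ratio)]
```

The same filter would be applied to the sample reads and to the negative control reads before any per-microorganism counting.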
The respiratory tract infection pathogen detection method based on metagenome/metatranscriptome sequencing establishes efficient nucleic acid extraction and the corresponding library construction for respiratory tract samples from children, together with a strategy for screening pathogenic microorganisms out of background microorganisms and colonizing microorganisms, thereby providing a method for detecting and identifying the pathogens of respiratory tract infection in children based on metagenome/metatranscriptome sequencing.
In a preferred embodiment of the present invention, the screening of pathogenic microorganisms further comprises:
substep S4A: removal of background microorganisms, specifically,
1) calculating an RPM value for each microorganism in the sample sequence and the negative control sequence, wherein the RPM (reads per million) value of a microorganism is the number of sequencing reads belonging to that microorganism per million sequencing reads;
2) if a microorganism detected in the sample sequence is also present in the negative control sequence, and the ratio of its RPM value in the sample sequence to its RPM value in the negative control sequence is greater than 10, defining the microorganism as a pending positive microorganism; if a bacterium, fungus or parasite detected in the sample sequence is not present in the negative control sequence and its RPM value in the sample sequence is greater than 50, defining the bacterium, fungus or parasite as a pending positive microorganism; if a virus detected in the sample sequence is not present in the negative control sequence and the number of non-overlapping sequencing reads of the virus in the sample sequence is greater than 3, defining the virus as a pending positive microorganism;
3) comparing all pending positive microorganisms with the background microorganism list 1 and removing those that appear in the background microorganism list 1, to obtain a quasi-positive microorganism list;
substep S4B: screening of colonizing microorganisms, specifically,
labeling the microorganisms in the quasi-positive microorganism list that appear in the respiratory tract colonization flora list 2 as P3, those that appear in the common pathogenic microorganism list 3 as P1, and the remaining quasi-positive microorganisms as P2; the probability that the sample contains a pathogenic microorganism decreases in the order: sample containing a microorganism labeled P1 > sample containing a microorganism labeled P2 > sample containing a microorganism labeled P3.
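The three rules of substep S4A can be written down compactly. The following Python sketch is illustrative only: the function names are hypothetical, "present in the negative control" is assumed to mean a control RPM greater than zero, and the taxids in the usage note are placeholders rather than entries from list 1.

```python
def rpm(read_count, total_reads):
    """Reads per million: reads assigned to a microorganism per million sequencing reads."""
    return read_count * 1_000_000 / total_reads


def is_pending_positive(kind, sample_rpm, control_rpm, nonoverlapping_reads=0):
    """Apply the S4A thresholds to one microorganism.

    kind is "bacterium", "fungus", "parasite" or "virus".
    Assumption: control_rpm > 0 means the microorganism is present in the negative control.
    """
    if control_rpm > 0:
        # Present in both sample and control: require a sample/control RPM ratio above 10.
        return sample_rpm / control_rpm > 10
    if kind == "virus":
        # Absent from control: viruses need more than 3 non-overlapping reads.
        return nonoverlapping_reads > 3
    # Absent from control: bacteria, fungi and parasites need an RPM above 50.
    return sample_rpm > 50


def remove_background(pending_taxids, background_taxids):
    """Drop pending positives that appear in the background microorganism list."""
    return [t for t in pending_taxids if t not in background_taxids]
```

For example, `remove_background([111, 222], {222})` keeps only the placeholder taxid 111 as quasi-positive.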
In a preferred embodiment of the present invention, the background microorganism list is shown in table 1:
Figure RE-GDA0002776304140000041
Figure RE-GDA0002776304140000051
Note: taxid is the number of the microorganism in the NCBI database.
In a preferred embodiment of the present invention, the respiratory tract colonization flora list is shown in table 2:
Figure RE-GDA0002776304140000052
Figure RE-GDA0002776304140000061
Note: taxid is the number of the microorganism in the National Center for Biotechnology Information (NCBI) database.
In a preferred embodiment of the present invention, the list of common pathogenic microorganisms is shown in table 3:
Figure RE-GDA0002776304140000062
Figure RE-GDA0002776304140000071
Figure RE-GDA0002776304140000081
note: taxid is the number of the microorganism in the NCBI database.
In a preferred embodiment of the present invention, in step S1, the respiratory tract sample to be tested is sputum, alveolar lavage fluid, a swab sample with preservation solution or a swab sample without preservation solution; the DNA and RNA extracted in step S1 are also subjected to concentration measurement and quality control.
In a preferred embodiment of the present invention, in step S2: the steps of constructing the sample DNA sequencing library comprise DNA fragmentation, end repair, adapter ligation, fragment selection, purification, amplification and purification; the steps of constructing the sample RNA sequencing library comprise ribosomal RNA removal, RNA purification and fragmentation, end repair, adapter ligation, fragment selection, purification, amplification and purification.
In a preferred embodiment of the present invention, in step S3: the high-throughput sequencing is PE150 sequencing on an Illumina NextSeq 550 platform, and the data volume of the sequencing library of each sample is 10 G.
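To give a sense of scale, the stated data volume can be converted into an approximate read count. The short Python sketch below assumes that "10 G" means 10 gigabases of sequence per library, which the text does not state explicitly; if it instead refers to file size, the figures do not apply.

```python
def reads_for_yield(total_bases, read_length=150):
    """Number of whole reads of the given length that fit in the stated yield."""
    return total_bases // read_length


# Assumption: "10 G" is read as 10 gigabases (1e10 bases) per library.
TOTAL_BASES = 10_000_000_000
n_reads = reads_for_yield(TOTAL_BASES)   # individual 150 bp reads
n_pairs = n_reads // 2                   # PE150 read pairs
print(n_reads, n_pairs)
```

Under that assumption each library corresponds to roughly 67 million reads, or about 33 million read pairs, which is the denominator used when read counts are normalized to RPM.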
In a preferred embodiment of the present invention, in step S3: the preliminary screening to filter out interfering sequences consists of, in order, removing adapter sequences from the sequencing data, filtering low-quality sequences, deleting sequences shorter than 40 bp, and removing sequences that align to the human reference genome GRCh38; adapter removal and low-quality filtering are performed with the software Cutadapt v1.18 with the parameters -e 0.2 -O 10 -m 40 -q 15,15 --max-n=0.1, and the removal of sequences aligned to the human reference genome GRCh38 is performed with the software Bowtie v2.3.4.
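The preprocessing described above can be sketched as two command lines assembled in Python. This is an illustration under several assumptions: paired-end input, adapter sequences supplied by the caller (the text does not give them), "Bowtie v2.3.4" taken to mean the `bowtie2` executable, and the `--un-conc` option chosen here as one way to keep read pairs that do not align to the human genome. All file names and the index name are hypothetical; only the Cutadapt parameter values come from the text.

```python
def cutadapt_args(r1, r2, out1, out2, adapter_fwd, adapter_rev):
    """Argument list for adapter removal and quality filtering with the stated parameters."""
    return [
        "cutadapt",
        "-e", "0.2",        # maximum error rate in adapter matching
        "-O", "10",         # minimum adapter overlap
        "-m", "40",         # discard reads shorter than 40 bp
        "-q", "15,15",      # quality-trim both read ends at Q15
        "--max-n", "0.1",   # discard reads with more than 10% N bases
        "-a", adapter_fwd,  # adapter on read 1 (placeholder value from caller)
        "-A", adapter_rev,  # adapter on read 2 (placeholder value from caller)
        "-o", out1, "-p", out2,
        r1, r2,
    ]


def host_removal_args(index, r1, r2, unaligned_prefix):
    """Argument list for aligning to a GRCh38 index and keeping non-human read pairs."""
    return [
        "bowtie2",
        "-x", index,
        "-1", r1, "-2", r2,
        "--un-conc", unaligned_prefix,  # pairs that did not align concordantly
        "-S", "/dev/null",              # the alignments themselves are not needed
    ]
```

Either list could be passed to `subprocess.run(args, check=True)` once the corresponding tool is installed; the sketch deliberately stops at building the arguments.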
In a preferred embodiment of the present invention, in step S4: the sample sequence data are aligned to the reference database using the software Centrifuge v1.0.3; the reference database consists of microbial reference genome sequences downloaded from the NCBI genome database and contains more than 20000 reference genome sequences, including more than 12000 bacteria, 7312 viruses, 515 fungi and 168 parasites.
In step S4: the reference database is constructed as follows: sequences at the RefSeq level are downloaded first; if there are fewer than 200 sequences at the RefSeq level, sequences at the Complete level are downloaded as a supplement; if there are no sequences at the RefSeq or Complete levels, sequences at the chromosome, scaffold and contig levels are downloaded; low-complexity segments in the sequences are marked with the software dustmasker, sequences shorter than 150 bp are deleted, and the reference database is then built with Centrifuge v1.0.3.
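The tiered download rule for the reference database can be expressed as a small selection function. The Python sketch below reflects one reading of the text, applied per microbial species: RefSeq entries are taken first, Complete entries are added when fewer than 200 RefSeq entries exist, and chromosome, scaffold and contig entries are used only when neither of the first two levels is available. The record layout and function names are hypothetical.

```python
def select_for_species(records, refseq_min=200):
    """Choose which genome records to download for one species.

    records: list of dicts with a "level" key, one of
    "refseq", "complete", "chromosome", "scaffold", "contig".
    """
    refseq = [r for r in records if r["level"] == "refseq"]
    complete = [r for r in records if r["level"] == "complete"]
    chosen = list(refseq)
    if len(refseq) < refseq_min:
        # Too few RefSeq sequences: supplement with Complete-level sequences.
        chosen += complete
    if not chosen:
        # Nothing at RefSeq or Complete level: fall back to lower assembly levels.
        fallback = ("chromosome", "scaffold", "contig")
        chosen = [r for r in records if r["level"] in fallback]
    return chosen


def drop_short(sequences, min_len=150):
    """Delete sequences shorter than min_len bases (sequences: name -> string)."""
    return {name: seq for name, seq in sequences.items() if len(seq) >= min_len}
```

Low-complexity masking and index building are left to dustmasker and Centrifuge themselves and are not reproduced here.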
The respiratory tract infection pathogen detection method of the invention performs detection based on metagenome/metatranscriptome sequencing. It is of great significance for quickly determining the etiology of respiratory infectious diseases in children and for distinguishing colonization from infection, so that a targeted treatment strategy can be formulated and the length of hospital stay, medical costs, days of antibiotic use and mortality of pediatric patients can be reduced. The invention also adds common drug resistance genes to the common microorganism list, which provides a basis for drug selection.
The conception, the specific structure, and the technical effects produced by the present invention will be further described below to fully understand the objects, the features, and the effects of the present invention.
Detailed Description
The present invention is further described in detail below with reference to specific examples, which are provided for illustration only and are not intended to limit the scope of the present invention. The test methods used in the following examples are all conventional methods unless otherwise specified; the materials and reagents used are commercially available reagents and materials, unless otherwise specified.
Example 1 Detection of pathogens of respiratory infectious diseases in children using sputum samples
Step S1: processing the collected sputum from children. For metagenome sequencing, 600 μL of the sputum sample to be tested is transferred into a 2 mL grinding tube containing 500 μL of glass beads, and the tube cap is screwed on tightly so that no liquid leaks during wall breaking. The 2 mL grinding tube is then placed on the 2 mL grinding tube holder and balanced, the grinding homogenizer is switched on, and wall breaking is performed with the program shown in Table 4.
TABLE 4 wall breaking treatment parameters
Figure RE-GDA0002776304140000091
After wall breaking, the grinding tube is taken out and briefly centrifuged for 5 s in an AQBD cube mini centrifuge. The grinding homogenizer is a Tiangen Bioprep-24 tissue grinding homogenizer. DNA is then extracted using a DNA extraction kit manufactured by Guangzhou micro-distance medical instruments Ltd according to the instruction manual.
For metatranscriptome sequencing, 200 μL of the sputum sample to be tested is transferred into a corresponding 2 mL sterile centrifuge tube, and the tube cap is screwed on tightly. RNA is extracted using an RNA extraction kit manufactured by Guangzhou micro-distance medical instruments Ltd according to the instruction manual.
The DNA and RNA concentrations are then measured with a Qubit fluorometer, and the quality of the obtained DNA and RNA is checked by electrophoresis.
Step S2: a sample DNA sequencing library is constructed from the extracted DNA using a DNA library construction kit manufactured by Guangzhou micro-distance medical instruments Ltd according to the instruction manual; a sample RNA sequencing library is constructed from the extracted RNA using an RNA library construction kit manufactured by Guangzhou micro-distance medical instruments Ltd according to the instruction manual. To construct the sample DNA sequencing library, DNA fragmentation, end repair, adapter ligation, fragment selection, purification, amplification and purification are performed according to the instruction manual; to construct the sample RNA sequencing library, ribosomal RNA removal, RNA purification and fragmentation, end repair, adapter ligation, fragment selection, purification, amplification and purification are performed according to the instruction manual.
Step S3: PE150 sequencing is performed on an Illumina NextSeq 550 platform with a sequencing library data volume of 10 G per sample. Each batch of samples must include a negative control (healthy human plasma supplied in the library construction kit) that undergoes DNA and RNA extraction and library construction at the same time as the samples and is sequenced on the same lane. Adapter sequences are then removed from the sequencing data, low-quality sequences are filtered, sequences shorter than 40 bp are deleted, and sequences that align to the human reference genome GRCh38 are removed; adapter removal and low-quality filtering are performed with the software Cutadapt v1.18 with the parameters set to -e 0.2 -O 10 -m 40 -q 15,15 --max-n=0.1, and sequences aligned to the reference genome GRCh38 are removed with the software Bowtie v2.3.4.
Step S4: the processed sample sequence data are aligned to the reference database using the software Centrifuge v1.0.3; the reference database consists of microbial reference genome sequences downloaded from the NCBI genome database and contains more than 20000 reference genome sequences, including more than 12000 bacteria, 7312 viruses, 515 fungi and 168 parasites. Sample sequences and negative control sequences with an alignment ratio greater than 70% and without multiple alignments are obtained by conventional methods. The reference database is constructed as follows: for each microbial species, sequences at the RefSeq level are downloaded first; if there are fewer than 200 sequences at the RefSeq level, sequences at the Complete level are downloaded as a supplement; if there are no sequences at the RefSeq or Complete levels, sequences at the chromosome, scaffold and contig levels are downloaded; low-complexity segments in the sequences are marked with the software dustmasker, sequences shorter than 150 bp are deleted, and the database is then built with Centrifuge v1.0.3.
Pathogenic microorganisms are then screened, which includes removal of background microorganisms and screening of colonizing microorganisms.
Background microorganisms are removed as follows: (A1) first, the RPM value (i.e., the number of sequencing reads belonging to the microorganism per million sequencing reads) is calculated for each microorganism detected in the sample sequence and the negative control sequence; (A2) a microorganism detected in the sample sequence that is also present in the negative control sequence is defined as a pending positive microorganism if the ratio of its RPM value in the sample sequence to its RPM value in the negative control sequence is greater than 10; a bacterium, fungus or parasite detected in the sample sequence that is not present in the negative control sequence is defined as a pending positive microorganism if its RPM value in the sample sequence is greater than 50; a virus detected in the sample sequence that is not present in the negative control sequence is defined as a pending positive microorganism if it has more than 3 non-overlapping sequencing reads in the sample sequence; (A3) the pending positive microorganisms are compared with the background microorganism list 1, and those that appear in the background microorganism list are removed to obtain a quasi-positive microorganism list.
Colonizing microorganisms are screened as follows: a quasi-positive microorganism detected in the respiratory tract sample sequence is labeled P3 if it appears in the respiratory tract colonization flora list 2, P1 if it appears in the common pathogenic microorganism list 3, and P2 otherwise. The probability that the sample contains a pathogenic microorganism decreases in the order P1 > P2 > P3.
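The P1/P2/P3 labeling can be sketched in a few lines of Python. The text does not say what happens when a microorganism appears in both list 2 and list 3; this sketch assumes that the common pathogenic microorganism list takes precedence. The function names are hypothetical and the taxids in the usage note are placeholders, not entries from the lists.

```python
TIER_ORDER = {"P1": 0, "P2": 1, "P3": 2}


def tier(taxid, colonizing_taxids, pathogen_taxids):
    """Label one quasi-positive microorganism as P1, P2 or P3."""
    if taxid in pathogen_taxids:
        # Assumption: list 3 (common pathogens) wins if a taxid is in both lists.
        return "P1"
    if taxid in colonizing_taxids:
        return "P3"
    return "P2"


def rank_quasi_positives(taxids, colonizing_taxids, pathogen_taxids):
    """Return (taxid, label) pairs ordered from most to least likely pathogen."""
    labeled = [(t, tier(t, colonizing_taxids, pathogen_taxids)) for t in taxids]
    return sorted(labeled, key=lambda pair: TIER_ORDER[pair[1]])
```

With placeholder taxids, `rank_quasi_positives([30, 10, 20], {30}, {10})` places taxid 10 (P1) first, then 20 (P2), then 30 (P3), mirroring the order P1 > P2 > P3.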
Background microorganism list 1
Figure RE-GDA0002776304140000111
Figure RE-GDA0002776304140000121
Note: taxid is the number of the microorganism in the NCBI database.
Respiratory tract colonization flora list 2
Figure RE-GDA0002776304140000122
Figure RE-GDA0002776304140000131
Note: taxid is the number of the microorganism in the NCBI database.
Common pathogenic microorganism list 3
Figure RE-GDA0002776304140000132
Figure RE-GDA0002776304140000141
Note: taxid is the number of the microorganism in the NCBI database.
Example 2 Detection of pathogens of respiratory infectious diseases in children using alveolar lavage fluid samples
This example uses alveolar lavage fluid samples; the other steps and parameters are the same as in Example 1.
Example 3 Detection of pathogens of respiratory infectious diseases in children using swab samples with preservation solution
This example uses swab samples with preservation solution; the other steps and parameters are the same as in Example 1.
Example 4 Detection of pathogens of respiratory infectious diseases in children using swab samples without preservation solution
This example uses swab samples without preservation solution; the other steps and parameters are the same as in Example 1.
Comparative Example Verification of the reliability of the method of the invention for detecting pathogens of respiratory infectious diseases in children
To verify the reliability of the respiratory tract infection pathogen detection method based on metagenome/metatranscriptome sequencing of the present invention, parallel samples of the sputum samples of Example 1, of the alveolar lavage fluid samples of Example 2, and of the oral swab samples of Examples 3 and 4 were each tested. The same children were also tested with the traditional culture, throat swab and specific antigen detection methods. The results are shown in Tables 5 to 7.
TABLE 5 Comparison of the results of the detection method of the present invention with those of the conventional culture method
Figure RE-GDA0002776304140000151
TABLE 6 Comparison of the results of the detection method of the present invention with those of the conventional pharyngeal swab method
Figure RE-GDA0002776304140000152
Figure RE-GDA0002776304140000161
TABLE 7 Comparison of the results of the detection method of the present invention with those of the conventional pharyngeal swab specific antigen detection method
Figure RE-GDA0002776304140000162
Tables 5 to 7 show that the respiratory infectious disease pathogen detection method based on metagenome/metatranscriptome sequencing has a higher detection efficiency for alveolar lavage fluid, oral swab and sputum samples, with a pathogen detection positive rate of 80.41%, whereas the detection rates of the traditional culture method, the pharyngeal swab method and the pharyngeal swab specific antigen detection are only 19.58%, 15.58% and 20.24%, respectively. The method can therefore quickly determine the etiology of respiratory infectious diseases in children and distinguish colonization from infection, so that a targeted treatment strategy can be formulated.
The above description covers only specific embodiments of the present invention, and the scope of the present invention is not limited thereto; any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within its scope. Accordingly, the scope of the invention is to be determined by the claims that follow.

Claims (10)

1. A respiratory tract infection pathogen detection method based on metagenome/metatranscriptome sequencing, characterized in that the method performs detection based on metagenome/metatranscriptome sequencing and specifically comprises the following steps:
step S1: breaking the cell walls in one collected respiratory tract sample to be tested and extracting DNA using a DNA extraction kit; extracting RNA directly from another collected respiratory tract sample to be tested using an RNA extraction kit;
step S2: constructing a sample DNA sequencing library from the extracted DNA using a DNA library construction kit; constructing a sample RNA sequencing library from the extracted RNA using an RNA library construction kit;
step S3: performing high-throughput sequencing on the sample DNA sequencing library and the sample RNA sequencing library, and at the same time sequencing the negative controls for DNA and RNA extraction and library construction on the same lane; preliminarily screening the sequence data obtained by sequencing to filter out interfering sequences, obtaining processed sample sequence data;
step S4: aligning the sample sequence data against a reference database to obtain sample sequences and negative control sequences with an alignment ratio greater than 70% and without multiple alignments, then screening pathogenic microorganisms, and detecting the respiratory infectious disease pathogens in the sample to be tested according to the screening result.
2. The method for detecting respiratory tract infection pathogens based on metagenomic/metatranscriptomic sequencing according to claim 1, wherein the screening for pathogenic microorganisms further comprises:
substep S4A: removing background microorganisms, specifically:
1) calculating the RPM value of each microorganism detected in the sample sequences and in the negative control sequences, wherein the RPM value of a microorganism is the number of sequencing reads assigned to that microorganism per million sequencing reads;
2) if a microorganism detected in the sample sequences is also present in the negative control sequences, and the ratio of its RPM value in the sample sequences to its RPM value in the negative control sequences is greater than 10, defining the microorganism as a pending positive microorganism; if a bacterium, fungus or parasite detected in the sample sequences is not present in the negative control sequences and its RPM value in the sample sequences is greater than 50, defining the bacterium, fungus or parasite as a pending positive microorganism; if a virus detected in the sample sequences is not present in the negative control sequences and the number of non-overlapping sequencing reads of the virus in the sample sequences is greater than 3, defining the virus as a pending positive microorganism;
3) comparing all pending positive microorganisms with a background microorganism list, and removing those that appear in the background microorganism list to obtain a quasi-positive microorganism list;
substep S4B: screening for colonizing microorganisms, specifically:
marking each microorganism in the quasi-positive microorganism list that appears in the respiratory tract colonizing flora list as P3, each that appears in the common pathogenic microorganism list as P1, and the remaining quasi-positive microorganisms as P2; the probability that the sample contains a pathogenic microorganism decreases in the order: sample containing a microorganism marked P1 > sample containing a microorganism marked P2 > sample containing a microorganism marked P3.
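Substeps S4A and S4B can be sketched in Python as follows. This is only an illustration under stated assumptions: the three lists are given as figures and are not reproduced here, so the sets passed in are placeholders; the function and field names are hypothetical; and a microorganism that appears in both the pathogen list and the colonizing flora list is marked P1, a tie-break the claim does not specify.

```python
CELLULAR = {"bacterium", "fungus", "parasite"}


def rpm(read_count, total_reads):
    """Reads assigned to one microorganism per million sequencing reads."""
    return read_count * 1_000_000 / total_reads


def pending_positive(kind, sample_rpm, control_rpm=0.0, nonoverlap_reads=0):
    """Apply the three rules of substep S4A, item 2)."""
    if control_rpm > 0:              # also detected in the negative control
        return sample_rpm / control_rpm > 10
    if kind in CELLULAR:             # absent from the negative control
        return sample_rpm > 50
    if kind == "virus":              # absent from the negative control
        return nonoverlap_reads > 3
    return False


def tier(name, pathogens, colonizers):
    """Substep S4B labels; the pathogen list is checked first (assumption)."""
    if name in pathogens:
        return "P1"
    if name in colonizers:
        return "P3"
    return "P2"


def screen(hits, background, pathogens, colonizers):
    """hits maps a name to the keyword arguments of pending_positive()."""
    result = {}
    for name, hit in hits.items():
        if not pending_positive(**hit):
            continue                 # fails the S4A thresholds
        if name in background:
            continue                 # S4A item 3): background microorganism
        result[name] = tier(name, pathogens, colonizers)
    return result
```

For example, a bacterium with an RPM of 600 in the sample and 20 in the negative control has a ratio of 30, so it is kept as pending positive, whereas a bacterium absent from the control with an RPM of 40 is discarded.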
3. The method for detecting respiratory tract infection pathogens based on metagenomic/metatranscriptomic sequencing according to claim 2, wherein the background microorganism list is as follows:
Figure FDA0002654684130000021
Figure FDA0002654684130000031
note: taxid is the identification number of the microorganism in the National Center for Biotechnology Information (NCBI) database.
4. The method for detecting respiratory tract infection pathogens based on metagenomic/metatranscriptomic sequencing according to claim 2, wherein the respiratory tract colonizing flora list is as follows:
Figure FDA0002654684130000032
Figure FDA0002654684130000041
note: taxid is the identification number of the microorganism in the National Center for Biotechnology Information (NCBI) database.
5. The method for detecting respiratory tract infection pathogens based on metagenomic/metatranscriptomic sequencing according to claim 2, wherein the common pathogenic microorganism list is as follows:
Figure FDA0002654684130000042
Figure FDA0002654684130000051
Figure FDA0002654684130000061
note: taxid is the identification number of the microorganism in the National Center for Biotechnology Information (NCBI) database.
6. The method for detecting respiratory tract infection pathogens based on metagenomic/metatranscriptomic sequencing according to claim 1, wherein the respiratory tract sample to be tested in step S1 is sputum, alveolar lavage fluid, a swab sample with preservation solution, or a swab sample without preservation solution; and the DNA and RNA extracted in step S1 are also subjected to concentration measurement and quality control.
7. The method for detecting respiratory tract infection pathogens based on metagenomic/metatranscriptomic sequencing according to claim 1, wherein in step S2: constructing the sample DNA sequencing library comprises DNA fragmentation, end repair, adapter ligation, fragment selection, purification, amplification and purification; constructing the sample RNA sequencing library comprises ribosomal RNA removal, RNA purification and fragmentation, end repair, adapter ligation, fragment selection, purification, amplification and purification.
8. The method for detecting respiratory tract infection pathogens based on metagenomic/metatranscriptomic sequencing according to claim 1, wherein in step S3: the high-throughput sequencing is PE150 (paired-end, 150 bp) sequencing performed on an Illumina NextSeq 550 platform, and the data volume of the sequencing library of each sample is 10G.
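As a rough orientation, the data volume in claim 8 can be converted into a read-pair count. The short Python sketch below assumes that "10G" means 10 gigabases of sequence and that every read has the full 150 bp length; neither is stated in the claim, and the function name is hypothetical.

```python
def read_pairs_for_yield(total_bases, read_length=150, paired=True):
    """Number of sequenced fragments needed to reach a given base yield."""
    bases_per_fragment = read_length * (2 if paired else 1)
    return total_bases // bases_per_fragment


# 10 gigabases of PE150 data corresponds to about 33.3 million read pairs.
pairs = read_pairs_for_yield(10 * 10**9)
```

Under these assumptions, a microorganism with an RPM of 50 would account for roughly 3,300 of the 66.7 million individual reads.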
9. The method for detecting respiratory tract infection pathogens based on metagenomic/metatranscriptomic sequencing according to claim 1, wherein in step S3: the preliminary screening to filter out interfering sequences consists of, in order, removing adapter sequences, filtering out low-quality sequences, deleting sequences shorter than 40 bp, and removing sequences that align to the human reference genome GRCh38 from the sequencing data; adapter removal and low-quality filtering are performed with the software Cutadapt v1.18, and sequences that align to the human reference genome GRCh38 are removed with the software Bowtie v2.3.4.
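The order of the four filtering operations in claim 9 can be sketched in pure Python. This is a simplified stand-in and not the Cutadapt or Bowtie implementation: adapter trimming is reduced to an exact substring match, the mean-quality threshold of 20 is an assumed value that the claim does not give, and host removal is represented by a hypothetical `is_host` callback rather than a real alignment to GRCh38.

```python
def trim_adapter(seq, qual, adapter):
    """Cut the read at the first exact occurrence of the adapter (simplified)."""
    pos = seq.find(adapter)
    if pos == -1:
        return seq, qual
    return seq[:pos], qual[:pos]


def mean_phred(qual, offset=33):
    """Mean Phred score of a quality string in Phred+33 encoding."""
    if not qual:
        return 0.0
    return sum(ord(c) - offset for c in qual) / len(qual)


def preliminary_filter(reads, adapter, is_host, min_len=40, min_mean_q=20):
    """reads: iterable of (name, sequence, quality) tuples."""
    kept = []
    for name, seq, qual in reads:
        seq, qual = trim_adapter(seq, qual, adapter)  # 1) remove adapter
        if mean_phred(qual) < min_mean_q:             # 2) drop low quality
            continue
        if len(seq) < min_len:                        # 3) drop reads < 40 bp
            continue
        if is_host(seq):                              # 4) drop human reads
            continue
        kept.append((name, seq, qual))
    return kept
```

Running the length filter after trimming matters: a read that is long when sequenced may fall below 40 bp once its adapter is removed.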
10. The method for detecting respiratory tract infection pathogens based on metagenomic/metatranscriptomic sequencing according to claim 1, wherein in step S4: the sample sequence data are aligned to the reference database using the software Centrifuge v1.0.3; the reference database consists of microbial reference genome sequences downloaded from the NCBI genome database and comprises more than 20000 reference genome sequences, including more than 12000 bacteria, 7312 viruses, 515 fungi and 168 parasites; the reference database is constructed as follows: sequences at the RefSeq level are downloaded first; if there are fewer than 200 sequences at the RefSeq level, sequences at the Complete level are downloaded as a supplement; if there are no sequences at the RefSeq or Complete level, sequences at the chromosome, scaffold and contig levels are downloaded; sequences shorter than 150 bp are deleted; low-complexity fragments in the sequences are masked using the software dustmasker; and the reference database is then built using Centrifuge v1.0.3.
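The selection rule for reference sequences in claim 10 can be expressed as a short Python sketch. It assumes that the rule is applied separately to each taxon and that each candidate sequence is described by a record with a `level` and a `length` field; both the per-taxon scope and the record layout are assumptions for illustration, and masking and database building are not shown.

```python
FALLBACK_LEVELS = {"chromosome", "scaffold", "contig"}


def select_references(records, refseq_min=200, min_len=150):
    """Choose reference sequences for one taxon following the tiered rule."""
    refseq = [r for r in records if r["level"] == "refseq"]
    complete = [r for r in records if r["level"] == "complete"]
    if len(refseq) >= refseq_min:
        chosen = refseq                       # enough RefSeq sequences
    elif refseq or complete:
        chosen = refseq + complete            # supplement with Complete
    else:
        chosen = [r for r in records if r["level"] in FALLBACK_LEVELS]
    # Sequences shorter than 150 bp are deleted.
    return [r for r in chosen if r["length"] >= min_len]
```

For a taxon with 3 RefSeq and 2 Complete sequences, all 5 are kept and any scaffold-level sequences are ignored; the fallback levels are used only when the first two levels are empty.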
CN202010883055.7A 2020-08-28 2020-08-28 Respiratory tract infection pathogen detection method based on macrogene/macrotranscriptome sequencing Pending CN114107454A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010883055.7A CN114107454A (en) 2020-08-28 2020-08-28 Respiratory tract infection pathogen detection method based on macrogene/macrotranscriptome sequencing

Publications (1)

Publication Number Publication Date
CN114107454A true CN114107454A (en) 2022-03-01

Family

ID=80374608

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010883055.7A Pending CN114107454A (en) 2020-08-28 2020-08-28 Respiratory tract infection pathogen detection method based on macrogene/macrotranscriptome sequencing

Country Status (1)

Country Link
CN (1) CN114107454A (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109610008A (en) * 2018-11-08 2019-04-12 广州华大基因医学检验所有限公司 Cental system pathogenic infection detection library constructing method, detection method and kit based on high-flux sequence
CN110093455A (en) * 2019-04-27 2019-08-06 中国医学科学院病原生物学研究所 A kind of detection method of Respirovirus
CN110317864A (en) * 2019-07-18 2019-10-11 江苏宏微特斯医药科技有限公司 A method of it is sequenced by macro transcript profile to detect pathogen
CN110349630A (en) * 2019-06-21 2019-10-18 天津华大医学检验所有限公司 Analysis method, device and its application of the macro gene order-checking data of blood
CN110349629A (en) * 2019-06-20 2019-10-18 广州赛哲生物科技股份有限公司 A kind of analysis method detecting microorganism using macro genome or macro transcript profile
CN111188094A (en) * 2020-02-24 2020-05-22 南京诺唯赞生物科技有限公司 Sequencing library construction method and kit for pathogenic microorganism detection
CN111187813A (en) * 2020-02-20 2020-05-22 予果生物科技(北京)有限公司 Full-process quality control pathogenic microorganism high-throughput sequencing detection method
CN111394486A (en) * 2020-04-09 2020-07-10 复旦大学附属儿科医院 Child infectious disease pathogen detection and identification method based on metagenome sequencing
CN111471676A (en) * 2020-03-13 2020-07-31 广州市达瑞生物技术股份有限公司 Preparation method of database building sample for metagenome next generation sequencing

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116598005A (en) * 2023-07-17 2023-08-15 中日友好医院(中日友好临床医学研究所) Lower respiratory tract infection probability prediction system and device based on host sequence information
CN116598005B (en) * 2023-07-17 2023-10-03 中日友好医院(中日友好临床医学研究所) Lower respiratory tract infection probability prediction system and device based on host sequence information

Similar Documents

Publication Publication Date Title
CN110349630B (en) Analysis method and device for blood metagenome sequencing data and application thereof
CN111662958B (en) Construction method of library based on nanopore sequencing platform, method for identifying microorganisms and application
CN111394486A (en) Child infectious disease pathogen detection and identification method based on metagenome sequencing
US20200294628A1 (en) Creation or use of anchor-based data structures for sample-derived characteristic determination
CN112831604A (en) Pathogenic microorganism detection primer group, kit and method based on targeted sequencing
WO2019223502A1 (en) Method for detecting pathogens based on cfdna high-throughput sequencing
CN110875082B (en) Microorganism detection method and device based on targeted amplification sequencing
Buszewski et al. A new approach to identifying pathogens, with particular regard to viruses, based on capillary electrophoresis and other analytical techniques
CN113160882A (en) Pathogenic microorganism metagenome detection method based on third generation sequencing
CN113025761A (en) Multi-amplification matched high-throughput sequencing method and kit for pathogenic microorganism identification
CN110438199A (en) A kind of method of novel the pathogenic microorganism examination
WO2021179469A1 (en) Composition for detecting pathogens, and kit and method therefor
US20190048393A1 (en) Method for qualitative and quantitative detection of microorganism in human body
CN113066533A (en) mNGS pathogen data analysis method
CN111304285A (en) Urinary metagenome sample library building and detecting method based on nanopore sequencing platform
CN107475449A (en) A kind of transcript profile sequence measurement spliced suitable for dwarf virus section and geminivirus infection coe virus genome
CN113637668A (en) Kit for simultaneously extracting pathogenic bacteria DNA of blood plasma and blood cells and application thereof
CN113265452A (en) Bioinformatics pathogen detection method based on Nanopore metagenome RNA-seq
CN114107454A (en) Respiratory tract infection pathogen detection method based on macrogene/macrotranscriptome sequencing
CN112863601A (en) Pathogenic microorganism drug-resistant gene attribution model and establishing method and application thereof
CN115651990B (en) Characteristic gene combination, kit and sequencing method for predicting antibiotic drug sensitivity phenotype of escherichia coli
CN112410465A (en) Novel coronavirus SARS-CoV-2ORF1ab and N gene constant temperature amplification primer group and kit
WO2022174117A1 (en) Metagenomic next-generation sequencing of microbial cell-free nucleic acids in subjects with lyme disease
CN114196743A (en) Rapid detection method for pathogenic microorganisms and kit thereof
CN110511995B (en) Tuberculosis markers and application thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination