CN114107454A - Respiratory tract infection pathogen detection method based on metagenome/metatranscriptome sequencing - Google Patents
- Publication number
- CN114107454A (application number CN202010883055.7A)
- Authority
- CN
- China
- Prior art keywords
- sample
- microorganism
- sequencing
- sequence
- macro
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6869—Methods for sequencing
Abstract
The invention discloses a method for detecting pathogens of respiratory infectious diseases in children based on metagenome/metatranscriptome sequencing, comprising the following steps: after the sample to be tested is processed, DNA and RNA are extracted using a DNA extraction kit and an RNA extraction kit, respectively; a sample DNA sequencing library and a sample RNA sequencing library are constructed; the sequencing libraries are subjected to high-throughput sequencing, and the negative controls for DNA and RNA extraction and library construction are sequenced on the same lane; the sample sequence data are aligned against a reference database to obtain sequences with an alignment ratio of more than 70% and without multiple alignments, pathogenic microorganisms are screened, and respiratory infectious disease pathogens in the sample are detected according to the screening result. The invention allows the etiology of respiratory infectious diseases in children to be determined quickly and colonization to be distinguished from infection, so that a targeted treatment strategy can be formulated and the hospitalization time, medical cost, days of antibiotic use and mortality of pediatric patients can be reduced.
Description
Technical Field
The invention belongs to the technical field of microbial detection, and particularly relates to a respiratory tract infection pathogen detection method based on metagenome/metatranscriptome sequencing.
Background
Acute respiratory tract infection (ARTI) is the leading cause of outpatient visits and hospitalizations in children of all ages, particularly in winter and spring; ARTI is also the second leading cause of death in children under the age of 5, and is a major public health threat to children. Taking pneumonia as a single-disease example, WHO surveys show that more than 2 million children aged 0-5 die of pneumonia every year, accounting for nearly one fifth of all deaths of children under the age of 5 worldwide; in China, 14.8% of deaths in children under 5 years of age are caused by pneumonia. Improving the diagnosis and treatment of respiratory tract infection in children is therefore an important measure for promoting children's health.
Infection by pathogenic microorganisms such as bacteria, viruses and mycoplasma is the main cause of pneumonia in children, but the clinical symptoms of different types of childhood pneumonia are extremely similar. Even with the aid of hematological, imaging and other clinical tests, clinicians often cannot distinguish them accurately, so empirical and broad-spectrum treatment is difficult to avoid and infected children may receive unnecessary antibacterial treatment. Early and definite etiological diagnosis therefore helps to start targeted treatment in time, reduces the risk of antibiotic abuse and reduces the consumption of medical resources.
Traditional methods for detecting pathogenic microorganisms mainly include morphological examination, microbial culture, smear microscopy, antigen-antibody detection and nucleic acid detection. These methods are still widely used clinically, but each has limitations inherent in its methodology. For example, microbial culture is the gold standard for pathogen detection and has good specificity, but its detection period is long, its positive rate is low and its operation is complex, so apart from bacterial culture it is not suitable for wide clinical application. Immunological detection mainly detects pathogen-specific antibodies in the child's serum and is simple and fast, but because the production and clearance of antibodies lag behind changes in the pathogen, and because cross-reactions occur, it is of little use for early diagnosis and is prone to false negatives or false positives. With the development of molecular biology, molecular diagnosis of pathogens has gradually entered clinical use, for example seven-pathogen respiratory panels based on real-time fluorescent quantitative PCR or isothermal amplification (LAMP), and point-of-care pathogen tests such as FilmArray and GeneXpert. However, these methods all target preset pathogens, cover only a narrow pathogen spectrum, and are powerless to detect rare or novel pathogens. The wide variety of respiratory tract infection pathogens, their similar clinical presentations, the complex mechanisms of acquired bacterial drug resistance and the limitations of the detection methods available in clinical laboratories leave much anti-infective treatment in a difficult position, so more accurate and standardized molecular diagnostic methods for pathogens are urgently needed to improve the diagnosis and treatment of respiratory tract infections.
Pathogen metagenomic sequencing (mNGS) is a high-throughput sequencing technology that extracts nucleic acid directly from a specimen taken from the patient's site of infection and identifies pathogen species without relying on culture or preset targets. Compared with traditional clinical laboratory methods, pathogen mNGS detects the nucleic acid sequences of pathogenic microorganisms, is not limited to particular pathogen types, covers thousands or even tens of thousands of pathogens without bias, and can simultaneously identify bacteria, fungi, viruses, Mycobacterium tuberculosis, parasites and other pathogenic microorganisms. mNGS has gradually become an important tool in clinical microbiology, in particular for responding rapidly to infectious disease threats from novel pathogens, and is expected to strengthen our ability to detect and track infectious disease pathogens.
Although pathogen metagenomics is a powerful tool for identifying pathogen species and can detect the genetic information of pathogens, it cannot measure microbial gene expression levels, analyze microbial activity or identify RNA viruses; pathogen metatranscriptomics can make up for these shortcomings of mNGS. Metatranscriptomics is a research approach that takes all RNA in a specimen as its object. It can detect RNA viruses in the specimen and can study, at the whole-community level, the transcription of all genomes of living microorganisms (pathogens in an active state) at the patient's site of infection and the rules governing its regulation. It avoids the difficulty of isolating and culturing microorganisms and makes it possible to study changes in complex microbial communities at the transcriptional level. Because the pathogens that cause respiratory tract infection in children include many RNA viruses, combining pathogen metagenome sequencing with pathogen metatranscriptome sequencing not only achieves full coverage of the pathogen spectrum but also allows drug-resistance genes to be identified. In addition, comparing pathogen metatranscriptome sequencing results before and after treatment can help judge whether a pathogen is in an active state and whether the treatment is effective, thereby improving the detection rate and accuracy for pathogens of respiratory tract infection in children and promoting precision medicine for these infections.
Disclosure of Invention
The invention aims to solve the technical problems that prior-art detection methods for respiratory tract infection in children have a low positive rate, a long turnaround time, a tendency to produce false negatives or false positives, or a narrow pathogen spectrum, and provides a novel respiratory tract infection pathogen detection method based on metagenome/metatranscriptome sequencing.
The method for detecting respiratory infectious disease pathogens is based on metagenome/metatranscriptome sequencing, and specifically comprises the following steps:
step S1: subjecting one portion of a collected respiratory tract sample to be tested to cell wall disruption, and extracting DNA using a DNA extraction kit; extracting RNA directly from another portion of the collected respiratory tract sample using an RNA extraction kit;
step S2: constructing a sample DNA sequencing library from the extracted DNA using a DNA library construction kit; constructing a sample RNA sequencing library from the extracted RNA using an RNA library construction kit;
step S3: performing high-throughput sequencing on the sample DNA sequencing library and the sample RNA sequencing library, and sequencing the negative controls for DNA and RNA extraction and library construction on the same lane; preliminarily screening the sequence data and filtering out interference sequences to obtain processed sample sequence data;
step S4: aligning the sample sequence data against a reference database to obtain sample sequences and negative control sequences with an alignment ratio of more than 70% and without multiple alignments, then screening for pathogenic microorganisms, and detecting the respiratory infectious disease pathogens in the sample to be tested according to the screening result.
The respiratory tract infection pathogen detection method based on metagenome/metatranscriptome sequencing establishes efficient nucleic acid extraction and corresponding library construction for respiratory tract samples from children, together with a strategy for screening pathogenic microorganisms out of background microorganisms and colonizing microorganisms, thereby providing a metagenome/metatranscriptome sequencing-based method for detecting and identifying pathogens of respiratory tract infection in children.
In a preferred embodiment of the present invention, the screening of pathogenic microorganisms further comprises:
substep S4A: removal of background microorganisms, specifically,
1) calculating an RPM value for each microorganism in the sample sequence and the negative control sequence; wherein the RPM (Reads Per Million) value of a microorganism is the number of sequencing reads belonging to that microorganism per million sequencing reads;
2) if a microorganism detected in the sample sequence is also present in the negative control sequence, and the ratio of its RPM value in the sample sequence to its RPM value in the negative control sequence is greater than 10, defining the microorganism as a pending positive microorganism; if a bacterium, fungus or parasite detected in the sample sequence is not present in the negative control sequence and its RPM value in the sample sequence is greater than 50, defining the bacterium, fungus or parasite as a pending positive microorganism; if a virus detected in the sample sequence is not present in the negative control sequence and the number of non-overlapping sequencing reads of the virus in the sample sequence is greater than 3, defining the virus as a pending positive microorganism;
3) comparing all pending positive microorganisms with background microorganism list 1, and removing those present in background microorganism list 1 to obtain a quasi-positive microorganism list;
substep S4B: screening of colonizing microorganisms, specifically,
marking microorganisms in the quasi-positive microorganism list that appear in respiratory tract colonization flora list 2 as P3, those that appear in common pathogenic microorganism list 3 as P1, and the remaining quasi-positive microorganisms as P2; the probability that the sample contains a pathogenic microorganism decreases in the order: contains a microorganism marked P1 > contains a microorganism marked P2 > contains a microorganism marked P3.
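The RPM calculation and the three pending-positive rules of substep S4A can be illustrated with a minimal Python sketch. The function names, the input dictionaries and the treatment of "present in the negative control" as a control RPM above zero are assumptions made for illustration; the source does not specify an implementation.

```python
from typing import Dict, Set


def rpm(read_counts: Dict[str, int], total_reads: int) -> Dict[str, float]:
    """Reads per million: reads assigned to each microorganism per million sequencing reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return {taxon: count * 1_000_000 / total_reads for taxon, count in read_counts.items()}


def pending_positive(
    sample_rpm: Dict[str, float],
    control_rpm: Dict[str, float],
    is_virus: Dict[str, bool],
    nonoverlapping_reads: Dict[str, int],
) -> Set[str]:
    """Apply the three rules of step 2) and return the pending positive microorganisms."""
    pending = set()
    for taxon, value in sample_rpm.items():
        ctrl = control_rpm.get(taxon, 0.0)
        if ctrl > 0:
            # Also present in the negative control: sample/control RPM ratio must exceed 10.
            if value / ctrl > 10:
                pending.add(taxon)
        elif is_virus.get(taxon, False):
            # Virus absent from the control: more than 3 non-overlapping reads are required.
            if nonoverlapping_reads.get(taxon, 0) > 3:
                pending.add(taxon)
        elif value > 50:
            # Bacterium, fungus or parasite absent from the control: RPM must exceed 50.
            pending.add(taxon)
    return pending
```

For example, a bacterium with a sample RPM of 500 and a control RPM of 20 has a ratio of 25 and is kept, whereas one with a sample RPM of 100 and the same control RPM has a ratio of 5 and is discarded.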
In a preferred embodiment of the present invention, the background microorganism list is shown in Table 1
Note: taxid is the number of the microorganism in the NCBI database.
In a preferred embodiment of the present invention, the respiratory tract colonization flora list is shown in table 2:
Note: taxid is the number of the microorganism in the National Center for Biotechnology Information (NCBI) database.
In a preferred embodiment of the present invention, the list of common pathogenic microorganisms is shown in table 3:
note: taxid is the number of the microorganism in the NCBI database.
In a preferred embodiment of the present invention, in step S1, the respiratory tract sample to be tested is sputum, alveolar lavage fluid, a swab sample with preservation fluid or a swab sample without preservation fluid; the DNA and RNA extracted in step S1 are subjected to concentration measurement and quality control at the same time.
In a preferred embodiment of the present invention, in step S2: the steps of constructing the sample DNA sequencing library comprise DNA fragmentation, end repair, adapter ligation, fragment selection, purification, amplification and purification; the steps of constructing the sample RNA sequencing library comprise ribosomal RNA removal, RNA purification and fragmentation, end repair, adapter ligation, fragment selection, purification, amplification and purification.
In a preferred embodiment of the present invention, in step S3: the high-throughput sequencing is PE150 sequencing performed on an Illumina NextSeq 550 platform, and the data volume of the sequencing library of each sample is 10 G.
In a preferred embodiment of the present invention, in step S3: the preliminary screening and filtering of interference sequences consists of, in order, removing adapter sequences, filtering low-quality sequences, deleting sequences shorter than 40 bp, and removing sequences that align to the human reference genome GRCh38 from the sequencing data; adapter removal and low-quality filtering are performed using the software Cutadapt v1.18 with the parameters -e 0.2 -O 10 -m 40 -q 15,15 --max-n=0.1, and removal of sequences aligned to the human reference genome GRCh38 is performed using the software Bowtie2 v2.3.4.
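As a rough illustration of this preprocessing, the Python sketch below only assembles the command lines and does not run them. The Cutadapt filtering parameters are those stated above; the adapter options (-a, -A), the output options, all file names, the index name and the use of the bowtie2 executable with --un-conc to keep read pairs that do not align to the human genome are assumptions, since the source gives neither adapter sequences nor the host-removal options.

```python
from typing import List


def cutadapt_command(r1: str, r2: str, out1: str, out2: str,
                     adapter_fwd: str, adapter_rev: str) -> List[str]:
    """Build a paired-end Cutadapt call with the filtering parameters stated in the text."""
    return [
        "cutadapt",
        "-e", "0.2",         # maximum error rate allowed in adapter matches
        "-O", "10",          # minimum overlap between read and adapter
        "-m", "40",          # discard reads shorter than 40 bp after trimming
        "-q", "15,15",       # quality trimming cutoffs for the 5' and 3' ends
        "--max-n", "0.1",    # discard reads with more than 10% N bases
        "-a", adapter_fwd,   # assumed: 3' adapter of read 1 (sequence not given in the source)
        "-A", adapter_rev,   # assumed: 3' adapter of read 2 (sequence not given in the source)
        "-o", out1, "-p", out2,
        r1, r2,
    ]


def host_removal_command(index: str, r1: str, r2: str, unaligned_prefix: str) -> List[str]:
    """Build a Bowtie 2 call that keeps read pairs not aligning concordantly to the host index."""
    return [
        "bowtie2",
        "-x", index,                    # assumed name of a prebuilt GRCh38 index
        "-1", r1, "-2", r2,
        "--un-conc", unaligned_prefix,  # read pairs that did not align concordantly
        "-S", "/dev/null",              # the alignments themselves are not needed
    ]
```

The returned lists can be passed to subprocess.run once the placeholder adapter sequences, paths and index name have been replaced with real values.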
In a preferred embodiment of the present invention, in step S4: the sample sequence data are aligned to the reference database using the software Centrifuge v1.0.3; the reference database consists of microbial reference genome sequences downloaded from the NCBI genome database and contains more than 20000 reference genome sequences, including more than 12000 bacteria, 7312 viruses, 515 fungi and 168 parasites.
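The selection criteria of step S4 (alignment ratio above 70% and no multiple alignment) can be sketched as a filter over the classifier output. The sketch assumes a tab-separated table with the columns readID, seqID, taxID, score, 2ndBestScore, hitLength, queryLength and numMatches, interprets the "alignment ratio" as hitLength divided by queryLength, and interprets "no multiple alignment" as numMatches equal to 1; the source states only the thresholds, so both interpretations are assumptions.

```python
import csv
import io
from collections import Counter


def count_confident_hits(tsv_text: str, min_ratio: float = 0.70) -> Counter:
    """Count reads per taxID that align over more than min_ratio of their length and map uniquely."""
    counts: Counter = Counter()
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        query_len = int(row["queryLength"])
        if query_len == 0:
            continue  # skip malformed rows rather than divide by zero
        ratio = int(row["hitLength"]) / query_len
        # Keep a read only if it is well covered and was assigned to a single target.
        if ratio > min_ratio and int(row["numMatches"]) == 1:
            counts[row["taxID"]] += 1
    return counts
```

The per-taxon counts produced this way could then serve as the read counts from which RPM values are computed for the sample and the negative control.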
In step S4: the reference database is constructed as follows: sequences at the RefSeq level are downloaded first; if there are fewer than 200 sequences at the RefSeq level, sequences at the Complete level are downloaded as a supplement; if there are no sequences at the RefSeq or Complete level, sequences at the chromosome, scaffold and contig levels are downloaded; low-complexity fragments in the sequences are marked using the software dustmasker, sequences shorter than 150 bp are deleted, and the reference database is then built using Centrifuge v1.0.3.
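The tiered selection of reference sequences can be sketched in Python as follows. The data layout (a dictionary mapping an assembly level name to a list of accession and length pairs), the level keys and the function name are hypothetical; the low-complexity masking step is not reproduced, and only the fallback order and the 150 bp length cut-off are illustrated.

```python
from typing import Dict, List, Tuple

Record = Tuple[str, int]  # (accession, sequence length in bp); layout assumed for illustration


def select_references(by_level: Dict[str, List[Record]],
                      min_refseq: int = 200,
                      min_len: int = 150) -> List[Record]:
    """Pick reference sequences for one microorganism following the tiered rule in the text."""
    chosen = list(by_level.get("refseq", []))
    if len(chosen) < min_refseq:
        # Fewer than 200 RefSeq sequences: supplement with Complete-level sequences.
        chosen += by_level.get("complete", [])
    if not chosen:
        # Nothing at the RefSeq or Complete level: fall back to lower assembly levels.
        for level in ("chromosome", "scaffold", "contig"):
            chosen += by_level.get(level, [])
    # Delete sequences shorter than 150 bp.
    return [rec for rec in chosen if rec[1] >= min_len]
```

Applying the function once per microorganism and concatenating the results would give the sequence set from which the index is built.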
The respiratory tract infection pathogen detection method of the invention is based on metagenome/metatranscriptome sequencing and is of great significance for quickly determining the pathogens of respiratory infectious diseases in children and for distinguishing colonization from infection, so that a targeted treatment strategy can be formulated and the hospitalization time, medical cost, days of antibiotic use and mortality of children can be reduced. The invention also adds common drug-resistance genes to the common microorganism list, which provides a basis for drug selection.
The conception, the specific structure, and the technical effects produced by the present invention will be further described below to fully understand the objects, the features, and the effects of the present invention.
Detailed Description
The present invention is further described in detail below with reference to specific examples, which are provided for illustration only and are not intended to limit the scope of the present invention. The test methods used in the following examples are all conventional methods unless otherwise specified; the materials and reagents used are commercially available reagents and materials, unless otherwise specified.
EXAMPLE 1 detection of pathogens of respiratory infectious diseases in Children by sputum samples
Step S1: treating the collected sputum of the children: for metagenome sequencing, 600 μL of the sputum sample to be tested is placed in a 2 mL grinding tube filled with 500 μL of glass beads, and the tube cap is screwed on tightly so that no liquid leaks during cell wall disruption. The 2 mL grinding tube is then placed on the 2 mL grinding tube holder and balanced, the grinding homogenizer is switched on, and cell wall disruption is carried out using the program shown in Table 4.
TABLE 4 wall breaking treatment parameters
After cell wall disruption, the grinding tube is taken out and briefly centrifuged for 5 s in an AQBD cube mini centrifuge. The grinding homogenizer is a Tiangen Bioprep-24 tissue grinding homogenizer. DNA is then extracted using a DNA extraction kit manufactured by Guangzhou micro-distance medical instruments Ltd. according to the instruction manual.
For metatranscriptome sequencing, 200 μL of the sputum sample to be tested is transferred to a corresponding 2 mL sterile centrifuge tube, and the tube cap is screwed on tightly. RNA is extracted using an RNA extraction kit manufactured by Guangzhou micro-distance medical instruments Ltd. according to the instruction manual.
The concentrations of the DNA and RNA are measured with a Qubit fluorometer, and the quality of the obtained DNA and RNA is checked by electrophoresis.
Step S2: a sample DNA sequencing library is constructed from the extracted DNA using a DNA library construction kit manufactured by Guangzhou micro-distance medical instruments Ltd. according to the instruction manual; a sample RNA sequencing library is constructed from the extracted RNA using an RNA library construction kit manufactured by Guangzhou micro-distance medical instruments Ltd. according to the instruction manual. Construction of the sample DNA sequencing library comprises, following the instruction manual, DNA fragmentation, end repair, adapter ligation, fragment selection, purification, amplification and purification; construction of the sample RNA sequencing library comprises, following the instruction manual, ribosomal RNA removal, RNA purification and fragmentation, end repair, adapter ligation, fragment selection, purification, amplification and purification.
Step S3: PE150 sequencing is performed on an Illumina NextSeq 550 platform with a sequencing library data volume of 10 G per sample. Each batch of samples must include a negative control (healthy human plasma provided in the library construction kit) that undergoes DNA and RNA extraction and library construction together with the samples and is sequenced on the same lane. Adapter sequences are then removed, low-quality sequences are filtered, sequences shorter than 40 bp are deleted, and sequences aligned to the human reference genome GRCh38 are removed from the sequencing data; adapter removal and low-quality filtering are performed using the software Cutadapt v1.18 with the parameters set to -e 0.2 -O 10 -m 40 -q 15,15 --max-n=0.1, and sequences aligned to the human reference genome GRCh38 are removed using the software Bowtie2 v2.3.4.
Step S4: the processed sample sequence data are aligned to the reference database using the software Centrifuge v1.0.3; the reference database consists of microbial reference genome sequences downloaded from the NCBI genome database and contains more than 20000 reference genome sequences, including more than 12000 bacteria, 7312 viruses, 515 fungi and 168 parasites. Sample sequences and negative control sequences with an alignment ratio of more than 70% and without multiple alignments are obtained by conventional methods. The reference database is constructed as follows: for each microbial species, sequences at the RefSeq level are downloaded first; if there are fewer than 200 sequences at the RefSeq level, sequences at the Complete level are downloaded as a supplement; if there are no sequences at the RefSeq or Complete level, sequences at the chromosome, scaffold and contig levels are downloaded; low-complexity fragments in the sequences are marked with the software dustmasker, sequences shorter than 150 bp are deleted, and the database is then built with Centrifuge v1.0.3.
Then, the pathogenic microorganism screening including background microorganism removal and colonization microorganism screening is carried out.
Background microbial removal was: (A1) first, the RPM value of each microorganism detected in the sample sequence and the negative control sequence (i.e., the number of sequencing reads belonging to the microorganism per million sequencing reads) is calculated; (A2) when the detected microorganisms in the sample sequence exist in the negative control sequence at the same time, and the ratio of the RPM value in the sample sequence to the RPM value in the negative control sequence is more than 10, the microorganisms are defined as undetermined positive microorganisms; defining the bacteria, fungi or parasites to be positive microorganisms if the bacteria, fungi or parasites detected in the sample sequence are not present in the negative control sequence and the RPM value in the sample sequence is greater than 50; defining a virus as a microorganism to be determined positive if the virus detected in the sample sequence is not present in the negative control sequence and if there are more than 3 non-overlapping sequencing reads in the sample sequence; (A3) and comparing the undetermined positive microorganisms with a background microorganism list 1, and removing the undetermined positive microorganisms existing in the background microorganism list to obtain a quasi-positive microorganism list.
Colonizing microorganism screening is performed as follows: a quasi-positive microorganism in the quasi-positive microorganism list detected in the respiratory tract sample sequence is labeled P3 if it appears in respiratory tract colonizing flora list 2, P1 if it appears in common pathogenic microorganism list 3, and P2 otherwise. The probability that the microorganism is a pathogen of the sample is P1 > P2 > P3.
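The P1/P2/P3 labeling can be sketched as a simple lookup followed by a sort. This is an illustrative Python sketch: the function names and the integer taxids in the usage note are hypothetical placeholders, not entries from lists 2 and 3. The text does not say how a microorganism found in both lists is handled, so checking the colonizing list first is an assumption that follows the order of the sentence above.

```python
from typing import Iterable, List, Set, Tuple

# Lower number means higher probability of being the pathogen (P1 > P2 > P3).
PRIORITY = {"P1": 0, "P2": 1, "P3": 2}


def label(taxid: int, colonizing: Set[int], common_pathogens: Set[int]) -> str:
    """Assign P1, P2 or P3 to one quasi-positive microorganism."""
    if taxid in colonizing:          # assumption: colonizing list is checked first
        return "P3"
    if taxid in common_pathogens:
        return "P1"
    return "P2"


def rank(taxids: Iterable[int], colonizing: Set[int],
         common_pathogens: Set[int]) -> List[Tuple[int, str]]:
    """Label every quasi-positive microorganism and order them P1, P2, P3."""
    labelled = [(t, label(t, colonizing, common_pathogens)) for t in taxids]
    return sorted(labelled, key=lambda item: PRIORITY[item[1]])
```

With placeholder taxids, `rank([1, 2, 3], colonizing={1}, common_pathogens={3})` places taxid 3 (P1) first, taxid 2 (P2) second and taxid 1 (P3) last.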
Background microorganism list 1
Note: taxid is the number of the microorganism in the NCBI database.
Respiratory tract colonizing flora list 2
Note: taxid is the number of the microorganism in the NCBI database.
Common pathogenic microorganism list 3
Note: taxid is the number of the microorganism in the NCBI database.
Example 2: Detection of respiratory infectious disease pathogens in children using alveolar lavage fluid samples
This example uses an alveolar lavage fluid sample; the other steps and parameters are the same as in Example 1.
Example 3: Detection of respiratory infectious disease pathogens in children using swab samples with preservation solution
This example uses a swab sample with preservation solution; the other steps and parameters are the same as in Example 1.
Example 4: Detection of respiratory infectious disease pathogens in children using swab samples without preservation solution
This example uses a swab sample without preservation solution; the other steps and parameters are the same as in Example 1.
Comparative example: Verifying the reliability of the method of the invention for detecting respiratory infectious disease pathogens in children
To verify the reliability of the respiratory tract infection pathogen detection method based on metagenomic/metatranscriptomic sequencing of the present invention, parallel samples of the sputum samples of Example 1, of the alveolar lavage fluid samples of Example 2, and of the oral swab samples of Examples 3 and 4 were each tested. In addition, the same children were tested with the conventional culture, pharyngeal swab and specific antigen detection methods, respectively. The results are shown in Tables 5 to 7.
TABLE 5 comparison of the results of the detection method of the present invention with the conventional culture detection method
TABLE 6 comparison of the results of the test method of the present invention and the conventional pharyngeal swab test method
TABLE 7 comparison of the results of the detection method of the present invention with the conventional dedicated pharyngeal swab detection method
Tables 5 to 7 show that the respiratory infectious disease pathogen detection method based on metagenomic/metatranscriptomic sequencing has a higher detection efficiency on alveolar lavage fluid, oral swab and sputum samples: its pathogen detection positive rate reaches 80.41%, whereas the detection rates of the conventional culture, pharyngeal swab and pharyngeal swab specific antigen detection methods are only 19.58%, 15.58% and 20.24%, respectively. The method can therefore quickly determine the pathogen of a respiratory infectious disease in children and distinguish colonization from infection, so that a targeted treatment strategy can be formulated.
The above description covers only specific embodiments of the present invention, and the scope of protection of the present invention is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. Accordingly, the scope of protection of the invention is determined by the claims that follow.
Claims (10)
1. A respiratory tract infection pathogen detection method based on metagenomic/metatranscriptomic sequencing, characterized in that detection is performed by metagenomic/metatranscriptomic sequencing and specifically comprises the following steps:
step S1: breaking the cell walls in one collected respiratory tract sample to be detected and extracting DNA with a DNA extraction kit; directly extracting RNA from another collected respiratory tract sample to be detected with an RNA extraction kit;
step S2: constructing a sample DNA sequencing library from the extracted DNA with a DNA library construction kit; constructing a sample RNA sequencing library from the extracted RNA with an RNA library construction kit;
step S3: performing high-throughput sequencing on the sample DNA sequencing library and the sample RNA sequencing library, and sequencing the negative controls for DNA and RNA extraction and library construction on the same lane at the same time; preliminarily screening the resulting sequence data to filter out interference sequences, thereby obtaining processed sample sequence data;
step S4: aligning the sample sequence data against a reference database to obtain sample sequences and negative control sequences having an alignment ratio of more than 70% and no multiple alignments, then screening for pathogenic microorganisms, and detecting the respiratory infectious disease pathogen in the sample to be detected according to the screening result.
2. The method for detecting respiratory tract infection pathogens based on metagenomic/metatranscriptomic sequencing according to claim 1, wherein the screening for pathogenic microorganisms further comprises:
substep S4A: removal of background microorganisms, specifically,
1) calculating the RPM value of each microorganism detected in the sample sequence and the negative control sequence, wherein the RPM value of a microorganism is the number of sequencing reads belonging to that microorganism per million sequencing reads;
2) if a microorganism detected in the sample sequence is also present in the negative control sequence, and the ratio of the RPM value of the microorganism in the sample sequence to its RPM value in the negative control sequence is greater than 10, defining the microorganism as a pending positive microorganism; if a bacterium, fungus or parasite detected in the sample sequence is not present in the negative control sequence and its RPM value in the sample sequence is greater than 50, defining the bacterium, fungus or parasite as a pending positive microorganism; if a virus detected in the sample sequence is not present in the negative control sequence and the number of non-overlapping sequencing reads of the virus in the sample sequence is greater than 3, defining the virus as a pending positive microorganism;
3) comparing all the pending positive microorganisms with a background microorganism list, and removing the pending positive microorganisms present in the background microorganism list to obtain a quasi-positive microorganism list;
substep S4B: screening of colonizing microorganisms, specifically,
labeling the microorganisms in the quasi-positive microorganism list that appear in the respiratory tract colonizing flora list as P3, those that appear in the common pathogenic microorganism list as P1, and the remaining quasi-positive microorganisms as P2; the probability that a microorganism is a pathogen of the sample is, in descending order, microorganisms labeled P1 > microorganisms labeled P2 > microorganisms labeled P3.
3. The method for detecting respiratory tract infection pathogens based on metagenomic/metatranscriptomic sequencing according to claim 2, wherein the background microorganism list is as follows:
Note: taxid is the number of the microorganism in the National Center for Biotechnology Information (NCBI) database.
4. The method for detecting respiratory tract infection pathogens based on metagenomic/metatranscriptomic sequencing according to claim 2, wherein the respiratory tract colonizing flora list is as follows:
Note: taxid is the number of the microorganism in the National Center for Biotechnology Information (NCBI) database.
5. The method for detecting respiratory tract infection pathogens based on metagenomic/metatranscriptomic sequencing according to claim 2, wherein the common pathogenic microorganism list is as follows:
Note: taxid is the number of the microorganism in the National Center for Biotechnology Information (NCBI) database.
6. The method for detecting respiratory tract infection pathogens based on metagenomic/metatranscriptomic sequencing according to claim 1, wherein the respiratory tract sample to be detected in step S1 is sputum, alveolar lavage fluid, a swab sample with preservation solution or a swab sample without preservation solution; and the DNA and RNA extracted in step S1 are each subjected to concentration measurement and quality control.
7. The method for detecting respiratory tract infection pathogens based on metagenomic/metatranscriptomic sequencing according to claim 1, wherein in step S2: the step of constructing the sample DNA sequencing library comprises DNA fragmentation, end repair, adapter ligation, fragment selection, purification, amplification and purification; and the step of constructing the sample RNA sequencing library comprises ribosomal RNA removal, RNA purification and fragmentation, end repair, adapter ligation, fragment selection, purification, amplification and purification.
8. The method for detecting respiratory tract infection pathogens based on metagenomic/metatranscriptomic sequencing according to claim 1, wherein in step S3: the high-throughput sequencing is PE150 sequencing on an Illumina NextSeq 550 platform, and the data volume of the sequencing library of each sample is 10 G.
9. The method for detecting respiratory tract infection pathogens based on metagenomic/metatranscriptomic sequencing according to claim 1, wherein in step S3: the preliminary screening to filter out interference sequences consists of, in order, removing adapter sequences, filtering low-quality sequences, deleting sequences shorter than 40 bp, and removing sequences aligned to the human reference genome GRCh38 from the sequencing data; adapter sequences are removed and low-quality sequences are filtered using the software Cutadapt v1.18, and sequences aligned to the human reference genome GRCh38 are removed using the software Bowtie v2.3.4.
10. The method for detecting respiratory tract infection pathogens based on metagenomic/metatranscriptomic sequencing according to claim 1, wherein in step S4: the sample sequence data are aligned against the reference database using the software Centrifuge v1.0.3; the reference database consists of microbial reference genome sequences downloaded from the NCBI genome database and comprises more than 20000 reference genome sequences, including more than 12000 bacteria, 7312 viruses, 515 fungi and 168 parasites; the reference database is constructed as follows: sequences at the RefSeq level are downloaded first; if there are fewer than 200 sequences at the RefSeq level, sequences at the Complete level are downloaded as a supplement; if there are no sequences at the RefSeq or Complete level, sequences at the chromosome, scaffold and contig levels are downloaded; low-complexity fragments in the sequences are marked using the software dustmasker, sequences shorter than 150 bp are deleted, and the reference database is then built using Centrifuge v1.0.3.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010883055.7A CN114107454A (en) | 2020-08-28 | 2020-08-28 | Respiratory tract infection pathogen detection method based on macrogene/macrotranscriptome sequencing |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114107454A true CN114107454A (en) | 2022-03-01 |
Family
ID=80374608
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010883055.7A Pending CN114107454A (en) | 2020-08-28 | 2020-08-28 | Respiratory tract infection pathogen detection method based on macrogene/macrotranscriptome sequencing |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114107454A (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109610008A (en) * | 2018-11-08 | 2019-04-12 | 广州华大基因医学检验所有限公司 | Cental system pathogenic infection detection library constructing method, detection method and kit based on high-flux sequence |
CN110093455A (en) * | 2019-04-27 | 2019-08-06 | 中国医学科学院病原生物学研究所 | A kind of detection method of Respirovirus |
CN110317864A (en) * | 2019-07-18 | 2019-10-11 | 江苏宏微特斯医药科技有限公司 | A method of it is sequenced by macro transcript profile to detect pathogen |
CN110349630A (en) * | 2019-06-21 | 2019-10-18 | 天津华大医学检验所有限公司 | Analysis method, device and its application of the macro gene order-checking data of blood |
CN110349629A (en) * | 2019-06-20 | 2019-10-18 | 广州赛哲生物科技股份有限公司 | A kind of analysis method detecting microorganism using macro genome or macro transcript profile |
CN111188094A (en) * | 2020-02-24 | 2020-05-22 | 南京诺唯赞生物科技有限公司 | Sequencing library construction method and kit for pathogenic microorganism detection |
CN111187813A (en) * | 2020-02-20 | 2020-05-22 | 予果生物科技(北京)有限公司 | Full-process quality control pathogenic microorganism high-throughput sequencing detection method |
CN111394486A (en) * | 2020-04-09 | 2020-07-10 | 复旦大学附属儿科医院 | Child infectious disease pathogen detection and identification method based on metagenome sequencing |
CN111471676A (en) * | 2020-03-13 | 2020-07-31 | 广州市达瑞生物技术股份有限公司 | Preparation method of database building sample for metagenome next generation sequencing |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116598005A (en) * | 2023-07-17 | 2023-08-15 | 中日友好医院(中日友好临床医学研究所) | Lower respiratory tract infection probability prediction system and device based on host sequence information |
CN116598005B (en) * | 2023-07-17 | 2023-10-03 | 中日友好医院(中日友好临床医学研究所) | Lower respiratory tract infection probability prediction system and device based on host sequence information |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||