CN117821561A - Library based on second-generation sequencing, construction method and reagent thereof - Google Patents
- Publication number
- CN117821561A (application number CN202211197088.1A)
- Authority
- CN
- China
- Prior art keywords
- primer
- sequence
- sequencing
- library
- read3
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6806—Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6869—Methods for sequencing
-
- C—CHEMISTRY; METALLURGY
- C40—COMBINATORIAL TECHNOLOGY
- C40B—COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
- C40B40/00—Libraries per se, e.g. arrays, mixtures
- C40B40/04—Libraries containing only organic compounds
- C40B40/06—Libraries containing nucleotides or polynucleotides, or derivatives thereof
-
- C—CHEMISTRY; METALLURGY
- C40—COMBINATORIAL TECHNOLOGY
- C40B—COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
- C40B50/00—Methods of creating libraries, e.g. combinatorial synthesis
- C40B50/06—Biochemical methods, e.g. using enzymes or whole viable microorganisms
Abstract
The invention relates to a library based on second-generation sequencing, and to a construction method and a reagent therefor, in the technical field of gene detection. From the 5' end to the 3' end, the sequence fragments of the library are, in order: a first universal adapter sequence, a Read3 sequencing primer sequence, an index2 sequence, a Read1 sequencing primer sequence, a target insert, a Read2 sequencing primer sequence, an index1 sequence and a second universal adapter sequence, where the target insert is a fragment of the genome to be tested. By adjusting the library structure, the invention omits the template-strand turnaround that index2 sequencing otherwise requires, which takes 1 hour of resynthesis, so sequencing data are demultiplexed 1 hour earlier. This shortens the experimental turnaround time, enables rapid second-generation sequencing, is particularly suited to mNGS analysis, and meets clinicians' urgent need for rapid and accurate pathogen identification.
Description
Technical Field
The invention relates to the technical field of gene detection, in particular to a library based on second-generation sequencing, a construction method and a reagent thereof.
Background
Metagenomic second-generation sequencing (mNGS) has great potential for unbiased detection of pathogens in samples and is an important means of identifying newly emerging pathogenic microorganisms, but its experimental turnaround time (TAT) is relatively long (typically more than 15 hours).
Although the Oxford Nanopore Technologies (ONT) sequencing platform offers rapid sequencing with real-time detection and analysis, with TAT controllable to within 6 hours, it still has shortcomings compared with Illumina second-generation sequencing, such as lower sequencing depth, lower read yield and a higher sequencing error rate.
When faced with suspected infections of many kinds, clinicians urgently need rapid and accurate pathogen identification to reduce morbidity and mortality and to avoid the antibiotic resistance that empirical treatment can cause. Improving the timeliness of mNGS detection has therefore become a key issue in clinical pathogen detection.
Current metagenomic sequencing is mainly carried out on the Illumina NextSeq 550, usually in SE50 or SE75 mode (single-end sequencing, 50 or 75 bp) with dual 8 bp indexes. The two library construction methods commonly used for Illumina metagenomic sequencing are TruSeq and Nextera; under both conventional methods, index1 and index2 lie between P5/P7 and Read1/Read2. On the NextSeq 550, sequencing proceeds as follows: Read1 first reads the target sequence inserted in the library, and reverse sequencing primers then read the index1/index2 tag sequences of the library. In the paired-end strategy, the inserted target sequence is read first with Read1, then Index1 (i7) is read, then Index2 (i5) is read during the bridge-amplification turnaround, and finally Read2 is read. In other words, to sequence Index2 the library must be turned around after Index1 sequencing to synthesize the other template strand, and this turnaround synthesis takes 1 h.
Therefore, how to further shorten TAT is a problem to be solved in the art.
Disclosure of Invention
In view of the above, it is necessary to provide a library based on second-generation sequencing whose adjusted structure means that the template strand does not need to be turned around for index2 sequencing, so that the sequencing run finishes 1 hour earlier and the cycle of second-generation sequencing (in particular metagenomic detection) is shortened.
The invention discloses a library based on second-generation sequencing whose sequence fragments are, in order from the 5' end to the 3' end: a first universal adapter sequence, a Read3 sequencing primer sequence, an index2 sequence, a Read1 sequencing primer sequence, a target insert, a Read2 sequencing primer sequence, an index1 sequence and a second universal adapter sequence, where the target insert is a fragment of the genome to be tested.
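The element order described above can be written down as a small data structure. The following is a minimal sketch, not part of the patent: the element labels, the `assemble` and `upstream_of` helpers and the placeholder strings are all hypothetical names chosen for illustration, and no real sequences are used.

```python
from typing import Optional

# Element order of the proposed library, 5' -> 3', as listed in the text.
# The labels are illustrative names, not identifiers from the patent.
PROPOSED_LAYOUT = [
    "P5_adapter",    # first universal adapter sequence
    "Read3_primer",  # sequencing primer site used to read index2
    "index2",
    "Read1_primer",  # sequencing primer site used to read the insert
    "insert",        # fragment of the genome to be tested
    "Read2_primer",  # sequencing primer site used to read index1
    "index1",
    "P7_adapter",    # second universal adapter sequence
]


def assemble(parts: dict, layout: list = PROPOSED_LAYOUT) -> str:
    """Concatenate the named parts in layout order (5' -> 3')."""
    missing = [name for name in layout if name not in parts]
    if missing:
        raise KeyError(f"missing library elements: {missing}")
    return "".join(parts[name] for name in layout)


def upstream_of(element: str, layout: list = PROPOSED_LAYOUT) -> Optional[str]:
    """Return the element immediately 5' of the given one, or None."""
    position = layout.index(element)
    return layout[position - 1] if position > 0 else None


# Placeholder strings stand in for real sequences.
toy = {name: f"<{name}>" for name in PROPOSED_LAYOUT}
molecule = assemble(toy)
```

In this sketch, `upstream_of("index2")` returns `"Read3_primer"`, which mirrors the key structural point of the text: index2 has its own sequencing primer site directly upstream of it.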
Faced with the problem of how to shorten TAT further, the inventors considered changing the sequencing workflow so that index2 is sequenced directly, for example by omitting the library turnaround, i.e. the library does not undergo turnaround after index1 sequencing ends. For sequencing based on the TruSeq library strategy, a sample-specific index2 tag sequence can be inserted after the Read1 sequencing primer binding region, so that the index2 tag is read at the start of Read1 sequencing; the sequencing primer is then switched to Read2 to complete index1 sequencing. With this TruSeq library structure, index2 can be sequenced without resynthesizing the template strand, sequencing data are demultiplexed 1 hour earlier, and rapid mNGS analysis is achieved. However, this method currently applies only to libraries with the TruSeq structure. Because of the way Nextera transposase builds libraries, index2 cannot be added at the 3' end of the Read1 sequencing primer as in a TruSeq library, so the strategy does not apply to libraries with the Nextera structure.
On this basis, the invention adjusts the library structure by inserting a Read3 sequencing primer sequence between the first universal adapter sequence and the index2 sequence to serve as the sequencing primer for index2. During on-instrument sequencing, the target insert is read with the Read1 sequencing primer, the index1 sequence is then read with the Read2 sequencing primer, and the index2 sequence is read directly with the Read3 sequencing primer, so the whole library is sequenced without bridge turnaround. The template-strand turnaround that index2 sequencing otherwise requires, which takes 1 hour of resynthesis, is thus omitted, and sequencing data are demultiplexed 1 hour earlier. This shortens the experimental turnaround time, enables rapid second-generation sequencing, is particularly suited to mNGS analysis, and meets clinicians' urgent need for rapid and accurate pathogen identification.
The index1 and index2 sequences are molecular identification tag sequences, i.e. tag sequences used to distinguish different nucleic acid molecules; they can be designed and selected in the manner conventional in the art.
In one embodiment, the first universal adapter sequence is a P5 adapter and the second universal adapter sequence is a P7 adapter. It will be understood that the universal adapters can be adjusted for a specific sequencing platform as long as they meet the requirements of the sequencing chip; the adapters above are, however, convenient to use.
In one embodiment, the Read3 primer sequence is designed to the following requirements: a length of 20-40 nt, an annealing temperature (Tm) of 65-90 ℃, and a GC content comparable to that of Read1 and Read2.
"Comparable" GC content is to be understood as is conventional in the art, for example a GC content that differs from that of Read1 and Read2 by no more than 15%, 10% or 5%.
In one embodiment, the Read3 primer sequence is designed to the following requirements: a length of 30-35 nt, an annealing temperature (Tm) of 75-80 ℃ and a GC content of 58-65%; and/or the 3 bases nearest the 3' end and/or the 3 bases nearest the 5' end of the Read3 primer carry locked nucleic acid (LNA) modifications.
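The length and GC-content requirements above lend themselves to a simple screening function. The sketch below is illustrative only: the function names are hypothetical, the GC difference is interpreted as percentage points (an assumption, since the text does not say), and Tm is deliberately not computed because the text does not state which Tm model is used.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 to 1.0)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def screen_read3_candidate(candidate: str, read1: str, read2: str,
                           min_len: int = 20, max_len: int = 40,
                           max_gc_diff: float = 0.15) -> dict:
    """Check a candidate against the length and GC criteria in the text.

    max_gc_diff is in absolute GC fraction (0.15 = 15 percentage points),
    which is an assumed reading of "does not differ by more than 15%".
    Tm is not evaluated here.
    """
    gc = gc_fraction(candidate)
    return {
        "length_ok": min_len <= len(candidate) <= max_len,
        "gc": gc,
        "gc_comparable": all(
            abs(gc - gc_fraction(ref)) <= max_gc_diff for ref in (read1, read2)
        ),
    }


# Toy sequences, not the primers of the patent.
report = screen_read3_candidate("ATGC" * 6, read1="ATGC" * 4, read2="AATTGGCC")
```

The stricter embodiment (30-35 nt, tighter GC window) can be screened by passing different bounds; a Tm check would need an explicitly chosen thermodynamic model.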
In one embodiment, the Read3 primer sequence is the sequence shown in SEQ ID NO. 3. Selecting this sequence gives a better sequencing result.
In one embodiment, the Read1 primer sequence is the sequence shown in SEQ ID NO. 1 and the Read2 primer sequence is the sequence shown in SEQ ID NO. 2.
The invention also discloses a reagent for constructing the above second-generation sequencing-based library, comprising PCR primers and sequencing primers. The PCR primers comprise a P5-end primer and a P7-end primer: the P5-end primer comprises a P5 amplification primer and a P5 universal primer, and the P7-end primer comprises a P7 amplification primer and a P7 universal primer. The P5 amplification primer consists of a Read3 sequencing primer sequence, an index2 sequence and a Read1 sequencing primer sequence joined in that order; the P7 amplification primer consists of a Read2 sequencing primer sequence, an index1 sequence and a second universal adapter sequence joined in that order. The P5 amplification primer and the P5 universal primer are either joined or not joined, as are the P7 amplification primer and the P7 universal primer. The sequencing primers comprise a Read1 sequencing primer, a Read2 sequencing primer and a Read3 sequencing primer.
The reagent is used to construct the above library: the PCR primers designed in the invention amplify a library with the adjusted structure, including the specific region containing the Read3 sequencing primer, and the custom sequencing primers are used in a custom sequencing workflow that omits the 1 h template-strand turnaround otherwise needed to sequence index2, so sequencing data are demultiplexed 1 h earlier and the experimental turnaround time is shortened.
It will be understood that "joined or not joined" means that the P5 amplification primer and the P5 universal primer may be linked into a single primer, for example by a phosphodiester bond, or may simply be present together in the reaction system without being linked; the same applies to the P7 amplification primer and the P7 universal primer.
In one embodiment, the P5 universal primer sequence is the sequence shown in SEQ ID NO. 4 and the P7 universal primer sequence is the sequence shown in SEQ ID NO. 5. The universal primers are primers that carry no tag during library construction but, working together with the tagged primers, amplify a complete, sequenceable library structure.
The invention also discloses use of the above second-generation sequencing-based library on an Illumina sequencing platform.
The invention also discloses a method for constructing the above second-generation sequencing-based library, comprising the following steps:
Fragmentation: take the extracted DNA to be tested, add transposase for fragmentation, and terminate the reaction after a preset time;
PCR amplification: add the above primer combination and carry out the PCR amplification reaction;
Library purification: size-select and purify the library obtained in the preceding step to obtain the final product.
It will be understood that the transposase can be a conventional fragmentation reagent such as Tn5 transposase pre-loaded with adapter sequences suitable for Illumina sequencing. In the PCR amplification reaction, the 4 primers are preferably amplified in a single step, so that detection can be carried out efficiently.
In one embodiment, in the PCR amplification step, the amount ratio of the P5 amplification primer to the P5 universal primer is 1:15-1:3, and the amount ratio of the P7 amplification primer to the P7 universal primer is 1:15-1:3;
the library purification step further comprises a fragment analysis step, in which the library is judged qualified when its main peak is at 300 ± 50 bp.
In one embodiment, the ratio of the amount of the P5 amplification primer to the amount of the P5 universal primer is 1:12-1:6, and the ratio of the amount of the P7 amplification primer to the amount of the P7 universal primer is 1:12-1:6.
In one embodiment, the ratio of the amount of the P5 amplification primer to the amount of the P5 universal primer is 1:10-1:7, and the ratio of the amount of the P7 amplification primer to the amount of the P7 universal primer is 1:10-1:7.
The above ratios of amplification primer to universal primer reflect the fact that the product amplified by the index-containing P5 and P7 amplification primers alone is not a complete, sequenceable library and must be further amplified with the P5 and P7 universal primers; both kinds of primer are therefore added and amplify in succession. At these primer amounts the amplification works best, and sequenceable library molecules account for more than 99% of the final product. Controlling the library size at about 300 ± 50 bp gives the highest cluster generation efficiency and improves sequencing quality.
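The three nested ratio windows given in the embodiments (1:15-1:3, 1:12-1:6 and 1:10-1:7) can be checked programmatically when planning a reaction. The following is a minimal sketch under the assumption that both primers are expressed in the same unit; the `ratio_tier` helper and its return labels are hypothetical.

```python
# Ratio windows from the text, narrowest first, expressed as the fold excess
# of universal primer over amplification primer (1:7 -> 7-fold, and so on).
RATIO_TIERS = [
    ("1:10-1:7", 7.0, 10.0),
    ("1:12-1:6", 6.0, 12.0),
    ("1:15-1:3", 3.0, 15.0),
]


def ratio_tier(amp_amount: float, universal_amount: float):
    """Return the narrowest window containing the amp:universal ratio, or None.

    Both amounts must be in the same unit (for example pmol).
    """
    if amp_amount <= 0 or universal_amount <= 0:
        raise ValueError("primer amounts must be positive")
    fold = universal_amount / amp_amount
    for label, low, high in RATIO_TIERS:
        if low <= fold <= high:
            return label
    return None


tier = ratio_tier(amp_amount=1.0, universal_amount=8.0)
```

The same check applies separately to the P5 pair and the P7 pair, since the text states the windows for each pair independently.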
Besides the transposase method, the library structure can also be obtained by conventional adapter-ligation library construction. The workflow is as follows: after the DNA is fragmented enzymatically or by sonication, end repair and A-tailing are performed, followed by adapter ligation; the adapter may be an untagged universal adapter or a tagged non-universal adapter. Finally the library is amplified by PCR with primers chosen according to the adapter structure: universal primers if the adapter is tagged, tagged primers if it is not. Whichever form of adapter ligation is used, a library with the same structure as that of the invention is ultimately obtained.
The invention also discloses a gene detection method based on second generation sequencing, which comprises the following steps:
Sample processing: take the sample to be tested and extract the genome to be tested;
Library construction: obtain the library to be tested by the above construction method;
Sequencing: load the library to be tested onto the sequencer and sequence it;
Data analysis: take the sequencing output data and analyze them by bioinformatics methods to obtain the gene detection result.
In one embodiment, in the step of sequencing, the target insert sequence is detected by using a Read1 sequencing primer, then the index1 sequence is detected by using a Read2 sequencing primer, and the index2 sequence is directly detected by using a Read3 sequencing primer;
in the data analysis step, genome sequence information to be detected is obtained through integration of an index1 sequence, an index2 sequence and a target insert sequence by a bioinformatics analysis method.
In the sequencing step, the synthesized template strand does not need to be turned around: the Read3 sequencing primer is placed directly in a custom primer well of the sequencing instrument and used to read the index2 sequence. The turnaround procedure in the sequencing step can therefore be omitted, sequencing data are demultiplexed 1 hour earlier, and the experimental turnaround time is shortened.
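The read order and the integration of index1, index2 and insert in data analysis can be illustrated with a toy demultiplexer. This is a sketch only: the `demultiplex` function, the sample names and the index strings are hypothetical, matching is exact (real demultiplexing tools usually tolerate mismatches), and the patent does not specify the software actually used.

```python
from collections import defaultdict

# Read order of the proposed run as described in the text:
# (sequencing primer, region read).
PROPOSED_READ_ORDER = [
    ("Read1", "insert"),
    ("Read2", "index1"),
    ("Read3", "index2"),  # read directly, with no template-strand turnaround
]


def demultiplex(reads, sample_sheet):
    """Group insert reads by sample using exact (index1, index2) matches.

    reads: iterable of (insert, index1, index2) tuples.
    sample_sheet: dict mapping (index1, index2) -> sample name.
    Reads with an unknown index pair go to "undetermined".
    """
    by_sample = defaultdict(list)
    for insert, index1, index2 in reads:
        sample = sample_sheet.get((index1, index2), "undetermined")
        by_sample[sample].append(insert)
    return dict(by_sample)


# Toy data, not real index sequences.
sheet = {("AAAACCCC", "GGGGTTTT"): "sample_A", ("ACGTACGT", "TGCATGCA"): "sample_B"}
reads = [
    ("ACGTTTGA", "AAAACCCC", "GGGGTTTT"),
    ("TTGACCAA", "ACGTACGT", "TGCATGCA"),
    ("GGCATTAC", "AAAACCCC", "TGCATGCA"),  # index pair not on the sheet
]
groups = demultiplex(reads, sheet)
```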
Compared with the prior art, the invention has the following beneficial effects:
In the second-generation sequencing-based library of the invention, the library structure is adjusted by inserting a Read3 sequencing primer sequence between the first universal adapter sequence and the index2 sequence to serve as the sequencing primer for index2. During on-instrument sequencing, the target insert is read with the Read1 sequencing primer, the index1 sequence is then read with the Read2 sequencing primer, and the index2 sequence is read directly with the Read3 sequencing primer, so the whole library is sequenced without bridge turnaround. The template-strand turnaround that index2 sequencing otherwise requires, which takes 1 hour of resynthesis, is thus omitted, and sequencing data are demultiplexed 1 hour earlier. This shortens the experimental turnaround time, enables rapid second-generation sequencing, is particularly suited to mNGS analysis, and meets clinicians' urgent need for rapid and accurate pathogen identification.
Drawings
FIG. 1 is a schematic diagram of the second generation sequencing-based library structure of the present invention in example 1.
FIG. 2 is a schematic diagram of a conventional library structure in example 2.
FIG. 3 is a library peak pattern constructed by the method of example 1 in example 2.
FIG. 4 is a graph of library peaks constructed by the conventional method of example 2.
FIG. 5 is a schematic representation of the types of positive pathogen specimens in example 2.
FIG. 6 is a diagram showing the alignment of the number of sequences of positive pathogen detection in example 2.
FIG. 7 is a schematic diagram of the detection of the positive pathogen in example 2.
Detailed Description
In order that the invention may be readily understood, a more complete description of the invention will be rendered by reference to the appended drawings. Preferred embodiments of the present invention are shown in the drawings. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used herein in the description of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
The reagents used in the following examples, unless otherwise specified, are all commercially available; the methods used in the examples below, unless otherwise specified, are all conventional.
The library preparation kit used in the following examples was from Nanjing Vazyme Biotech Co., Ltd., catalog number TD503. The NGS DNA purification magnetic beads were from Nanjing Vazyme Biotech Co., Ltd., catalog number N411. PCR amplification primers and sequencing primers were synthesized by Bio (Shanghai) Inc. Other reagents are commercially available unless otherwise specified.
Example 1
A rapid gene detection method based on second generation sequencing is established.
1. Library and primer design
1. Library design
After thorough study of conventional sequencing technology and repeated small-scale tests, the inventors propose shortening the experimental turnaround time by adjusting the library structure, as follows:
As shown in FIG. 1, the sequence fragments of the library are designed, in order from the 5' end to the 3' end, as: a first universal adapter sequence (P5), a Read3 sequencing primer sequence (Rd3 SP), an index2 sequence, a Read1 sequencing primer sequence (Rd1 SP), a target insert (DNA Insert), a Read2 sequencing primer sequence (Rd2' SP), an index1 sequence and a second universal adapter sequence (P7), where the target insert is a fragment of the genome to be tested.
A Read3 sequencing primer sequence is inserted between the first universal adapter sequence and the index2 sequence to serve as the sequencing primer for index2. During on-instrument sequencing, the target insert is read with the Read1 sequencing primer, the index1 sequence is then read with the Read2 sequencing primer, and the index2 sequence is read directly with the Read3 sequencing primer, so the whole library is sequenced without bridge turnaround.
2. Primer design
In order to obtain a library having the above structure, the following PCR primers were used for PCR amplification:
P5 amplification primer: Read3 sequencing primer sequence-index2 sequence-Read1 sequencing primer sequence, for example: 5'-CTCAGAACGACATGGCTACGACCGACTG-index2-TCGTCGGCAGCGTC-3'.
P7 amplification primer: Read2 sequencing primer sequence-index1 sequence-second universal adapter sequence, for example: 5'-CAAGCAGAAGAGACGGCATACGAGAT-index1-GTCTCGTGGGCTCGG-3'.
P5 universal primer: 5'-AATGATACGGCGACCACCGAGATCTACACCTCACAGAACGACATGGCTACG-3' (SEQ ID NO. 4).
P7 universal primer: 5'-CAAGCAGAAGACGGCATACGAGAT-3' (SEQ ID NO. 5).
According to the sequencing step of the library, the sequencing primer comprises: the Read1 sequencing primer, the Read2 sequencing primer and the Read3 sequencing primer are as follows:
Read1 sequencing primer: 5'-TCGTCGGCAGCGTC-3' (SEQ ID NO. 1)
Read2 sequencing primer: 5'-CAAGCAGAAGACGGCATACGAGAT-3' (SEQ ID NO. 2)
Read3 sequencing primer: 5'-CTCACAGAACGACATGGCTACGATCCGACTG-3' (SEQ ID NO. 3)
The last 3 bases (CTG) at the 3' end of the Read3 sequencing primer carry locked nucleic acid (LNA) modifications.
2. Detection flow
1. Sample processing.
Take the sample to be tested and extract the genome to be tested by a conventional method. Depending on whether the genome is DNA or RNA, convert it to dsDNA (for example by reverse transcription) for later use.
2. Library construction.
2.1 Fragmentation: take the extracted dsDNA to be tested and add transposase (Vazyme, TD503) for fragmentation. The dsDNA fragmentation reaction system is as follows:
TABLE 1 fragmentation reaction System
After a predetermined time, the reaction was terminated with the following system.
TABLE 2 termination reaction System
2.2 PCR amplification: add the primer combination designed above (P5 amplification primer, P7 amplification primer, P5 universal primer and P7 universal primer) and carry out PCR amplification.
TABLE 3 PCR reaction System and conditions
2.3 Library purification
2.3.1 Vortex the DNA Clean Beads to mix, add 35 μl of beads to the 50 μl PCR product, mix thoroughly by vortexing or pipetting 10 times, and incubate at room temperature for 5 min.
2.3.2 Briefly centrifuge the reaction tube and place it on a magnetic rack to separate the beads from the liquid. Once the solution is clear (about 5 min), carefully transfer the supernatant to a new sterile PCR tube and discard the beads.
2.3.3 Vortex the DNA Clean Beads to mix, add 15 μl of beads to the supernatant, mix thoroughly by vortexing or pipetting 10 times, and incubate at room temperature for 5 min.
2.3.4 Briefly centrifuge the reaction tube and place it on a magnetic rack to separate the beads from the liquid. Once the solution is clear (about 5 min), carefully remove the supernatant.
2.3.5 Keeping the reaction tube on the magnetic rack, rinse the beads with 200 μl of freshly prepared 80% ethanol. Incubate at room temperature for 30 sec and carefully remove the supernatant. Repeat this step for a total of two rinses.
2.3.6 Keeping the reaction tube on the magnetic rack, open the cap and air-dry the beads for about 5 min.
2.3.7 Remove the reaction tube from the magnetic rack and add 22 μl of sterilized ultrapure water for elution.
2.3.8 Mix thoroughly by vortexing or pipetting 10 times, and incubate at room temperature for 5 min.
2.3.9 Briefly centrifuge the reaction tube and place it on a magnetic rack to separate the beads from the liquid. Once the solution is clear (about 5 min), carefully transfer 20 μl of the supernatant to a new sterile PCR tube and store at -20 ℃.
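The two bead additions in the purification steps above form a two-step size selection: the beads from the first addition are discarded, and the beads from the second addition are kept. A minimal Python sketch of the corresponding bead-to-sample ratios follows. It assumes that the buffer from the first bead addition stays in the transferred supernatant, so the second ratio is cumulative; the function name is hypothetical.

```python
def bead_ratios(sample_ul: float, first_beads_ul: float, second_beads_ul: float):
    """Return (first ratio, cumulative second ratio) of bead volume to sample volume.

    Assumption: the supernatant carried into the second step still contains the
    buffer from the first bead addition, so the bead volumes are summed.
    """
    if sample_ul <= 0:
        raise ValueError("sample volume must be positive")
    first = first_beads_ul / sample_ul
    cumulative = (first_beads_ul + second_beads_ul) / sample_ul
    return first, cumulative


if __name__ == "__main__":
    # Volumes taken from steps 2.3.1 and 2.3.3: 50 ul PCR product, 35 ul then 15 ul beads.
    first, cumulative = bead_ratios(50, 35, 15)
    print(f"first cut: {first:.2f}x, second cut (cumulative): {cumulative:.2f}x")
```

Expressing the volumes as ratios makes it easier to scale the protocol if the PCR product volume differs from 50 μl.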
2.4 Library analysis
2.4.1 Library quantification: libraries are quantified using the Qubit 4.0 dsDNA high-sensitivity kit; a library concentration greater than 1 ng/μl is considered qualified.
2.4.2 Fragment analysis: library fragments are analyzed on a Qsep1 instrument; a library whose main peak lies at 300 ± 50 bp is considered qualified.
Qualified libraries are pooled in the conventional manner, denatured with NaOH, and diluted to 1.5 pM for later use.
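Diluting a library to 1.5 pM requires converting the Qubit mass concentration (ng/μl) to a molar concentration. The sketch below uses the widely used approximation of 660 g/mol per base pair of dsDNA; that constant, the function names, and the example values are assumptions for illustration and are not stated in the text.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair of dsDNA; a common approximation (assumed)


def ng_per_ul_to_nM(conc_ng_per_ul: float, mean_len_bp: float) -> float:
    """Convert a dsDNA mass concentration (ng/ul) to a molar concentration (nM)."""
    if mean_len_bp <= 0:
        raise ValueError("fragment length must be positive")
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * mean_len_bp)


def dilution_factor(stock_nM: float, target_pM: float) -> float:
    """Fold dilution needed to bring a stock in nM down to a target in pM."""
    if target_pM <= 0:
        raise ValueError("target concentration must be positive")
    return stock_nM * 1000.0 / target_pM


if __name__ == "__main__":
    # Example: a library at the 1 ng/ul acceptance threshold with a 300 bp main peak.
    stock = ng_per_ul_to_nM(1.0, 300)
    print(f"library concentration: {stock:.2f} nM")
    print(f"fold dilution to reach 1.5 pM: {dilution_factor(stock, 1.5):.0f}")
```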
3. Sequencing.
Sequencing is performed on the NextSeq 550 platform with a NextSeq 550 400M chip. The Read3 primer is added to the custom primer well at a working concentration of 0.3 μM in a volume of 1.5 ml. During the run, the target insert sequence is read with the Read1 sequencing primer, the index1 sequence is read with the Read2 sequencing primer, and the index2 sequence is read directly with the Read3 sequencing primer.
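Preparing the Read3 primer at its working concentration is a simple C1·V1 = C2·V2 dilution. The following Python sketch illustrates the arithmetic; the 100 μM stock concentration is a hypothetical value chosen for the example, and the diluent is not specified in the text.

```python
def stock_volume_ul(stock_uM: float, working_uM: float, final_volume_ul: float) -> float:
    """Volume of stock primer (ul) needed for a given working concentration and final volume."""
    if stock_uM <= 0 or working_uM > stock_uM:
        raise ValueError("stock must be positive and more concentrated than the working solution")
    return working_uM * final_volume_ul / stock_uM


if __name__ == "__main__":
    final_ul = 1500.0  # 1.5 ml, as stated for the custom primer well
    primer_ul = stock_volume_ul(stock_uM=100.0, working_uM=0.3, final_volume_ul=final_ul)
    print(f"primer stock: {primer_ul:.1f} ul, diluent: {final_ul - primer_ul:.1f} ul")
```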
4. Data analysis.
The off-machine sequencing data are analyzed by bioinformatics methods; the index1 sequence, the index2 sequence and the target insert sequence are integrated to obtain the sequence information of the genome to be tested, i.e., the detection result.
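The integration of index1, index2 and insert sequences described above can be pictured as dual-index demultiplexing: each read is assigned to a sample by its (index1, index2) pair. The sketch below is a minimal illustration under that assumption; the index sequences, sample names and function name are hypothetical, and a real pipeline would also handle sequencing errors in the indexes.

```python
from collections import defaultdict


def demultiplex(reads, sample_sheet):
    """Group insert sequences by sample using exact (index1, index2) matches.

    reads: iterable of (index1, index2, insert) tuples.
    sample_sheet: dict mapping (index1, index2) to a sample name.
    Reads with an unknown index pair are collected under "undetermined".
    """
    by_sample = defaultdict(list)
    for index1, index2, insert in reads:
        sample = sample_sheet.get((index1, index2), "undetermined")
        by_sample[sample].append(insert)
    return dict(by_sample)


if __name__ == "__main__":
    # Hypothetical 8 nt indexes and sample names, for illustration only.
    sheet = {("ACGTACGT", "TTGCAAGC"): "sample_A", ("GGATCCAA", "CATGCATG"): "sample_B"}
    reads = [
        ("ACGTACGT", "TTGCAAGC", "ATGGCGT"),
        ("GGATCCAA", "CATGCATG", "CCGTTAA"),
        ("ACGTACGT", "CATGCATG", "GGGTTTC"),  # mismatched pair, left unassigned
    ]
    for sample, inserts in demultiplex(reads, sheet).items():
        print(sample, inserts)
```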
Example 2
This example verifies the application of the detection method established in Example 1 to metagenomic detection.
1. Method.
A randomized parallel experiment was performed on clinical samples submitted for testing, using the detection method of Example 1 and a conventional second-generation sequencing method. The library structure used in the conventional second-generation sequencing method is shown in FIG. 2, i.e., it has no Rd3 SP (Read3 sequencing primer) structure.
The library construction and sequencing steps of the conventional sequencing method are as follows:
1.1 Library construction.
1.1.1 Fragmentation: the extracted dsDNA to be tested is fragmented by adding transposase (Vazyme, TD503). The dsDNA fragmentation reaction system is as follows:
Table 4. Fragmentation reaction system
After a predetermined time, the reaction is terminated with the following system.
Table 5. Termination reaction system
1.1.2 PCR amplification: the primer combination is added to perform the PCR amplification reaction.
Table 6. PCR reaction system and conditions
Note: the amplification primer containing index1 consists of a Read2 sequencing primer sequence, an index1 sequence and a P7 universal adaptor; the amplification primer containing index2 consists of a P5 universal adaptor, an index2 sequence and a Read1 sequencing primer sequence.
1.1.3 Library purification
1.1.3.1 Vortex the DNA Clean Beads to mix, add 35 μl of beads to the 50 μl PCR product, mix thoroughly by vortexing or pipetting 10 times, and incubate at room temperature for 5 min.
1.1.3.2 Briefly centrifuge the reaction tube and place it on a magnetic rack to separate the beads from the liquid. Once the solution is clear (about 5 min), carefully transfer the supernatant to a new sterile PCR tube and discard the beads.
1.1.3.3 Vortex the DNA Clean Beads to mix, add 15 μl of beads to the supernatant, mix thoroughly by vortexing or pipetting 10 times, and incubate at room temperature for 5 min.
1.1.3.4 Briefly centrifuge the reaction tube and place it on a magnetic rack to separate the beads from the liquid. Once the solution is clear (about 5 min), carefully remove the supernatant.
1.1.3.5 Keeping the reaction tube on the magnetic rack, rinse the beads with 200 μl of freshly prepared 80% ethanol. Incubate at room temperature for 30 sec and carefully remove the supernatant. Repeat this step for a total of two rinses.
1.1.3.6 Keeping the reaction tube on the magnetic rack, open the cap and air-dry the beads for about 5 min.
1.1.3.7 Remove the reaction tube from the magnetic rack and add 22 μl of sterilized ultrapure water for elution.
1.1.3.8 Mix thoroughly by vortexing or pipetting 10 times, and incubate at room temperature for 5 min.
1.1.3.9 Briefly centrifuge the reaction tube and place it on a magnetic rack to separate the beads from the liquid. Once the solution is clear (about 5 min), carefully transfer 20 μl of the supernatant to a new sterile PCR tube and store at -20 ℃.
1.1.4 Library analysis
1.1.4.1 Library quantification: libraries are quantified using the Qubit 4.0 dsDNA high-sensitivity kit; a library concentration greater than 1 ng/μl is considered qualified.
1.1.4.2 Fragment analysis: library fragments are analyzed on a Qsep1 instrument; a library whose main peak lies at 300 ± 50 bp is considered qualified.
Qualified libraries are pooled in the conventional manner, denatured with NaOH, and diluted to 1.5 pM for later use.
1.2 Sequencing.
Sequencing is performed on the NextSeq 550 platform with a NextSeq 550 400M chip. Specifically, the target insert sequence is read with the Read1 sequencing primer, and the index1 sequence is then read with the Read2 sequencing primer. After index1 sequencing is complete, the library must be turned around to synthesize a new template strand, after which index2 sequencing is completed using Read1.
1.3 Data analysis.
The off-machine sequencing data are analyzed by bioinformatics methods; the index1 sequence, the index2 sequence and the target insert sequence are integrated to obtain the sequence information of the genome to be tested, i.e., the detection result.
2. Results.
2.1 Library peak profiles
The peak profile of an exemplary library constructed using the method of Example 1 is shown in FIG. 3, and the peak profile of a library constructed using the conventional sequencing method of this example is shown in FIG. 4.
As can be seen from FIGS. 3 and 4, the main peaks of the library constructed by the method of Example 1 (the accelerated method) and of the library constructed by the conventional sequencing method both lie between 250 and 350 bp, so both are qualified libraries.
2.2 Sequencing quality
In this example, 100 clinical samples were tested. Analysis showed that the sequencing quality (Q30) of index2 read with the Read3 sequencing primer was above 90%, which meets the quality control requirement.
2.3 Sensitivity comparison
The 100 clinical samples were tested in a randomized parallel experiment using the detection method of Example 1 (accelerated sequencing) and a conventional second-generation sequencing method (conventional sequencing); the results are shown in the table below and in FIGS. 5 to 7.
FIG. 5 shows the types of positive pathogen samples and the corresponding number of cases. FIG. 6 shows, for the same batch of samples under conventional sequencing and accelerated sequencing respectively, the RPM values of the suspected pathogens reported after interpretation by a professional with a clinical medical background, where RPM is the number of detected sequences (reads) per 1M (one million) sequencing reads. For example, in SE75 dual-index sequencing mode, a Pseudomonas aeruginosa RPM of 100 for a sample means that 100 sequences of 75 bp assigned to Pseudomonas aeruginosa were detected per 1M reads (1M × 75 bp). FIG. 7 shows the RPM ratio for the samples of FIG. 6, i.e., the ratio of the number of pathogen sequences detected in accelerated sequencing mode to that detected in conventional sequencing mode for the same sample (samples from the same case).
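The RPM and RPM ratio described for FIGS. 6 and 7 can be written out as a short calculation. The Python sketch below assumes that RPM is the pathogen read count normalized to one million total reads, as in the example above; the function names and read counts are hypothetical and are not data from the experiment.

```python
def rpm(pathogen_reads: int, total_reads: int) -> float:
    """Reads per million: pathogen reads normalized to 1M total sequencing reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return pathogen_reads * 1_000_000 / total_reads


def rpm_ratio(accelerated_rpm: float, conventional_rpm: float) -> float:
    """Ratio of the RPM from accelerated sequencing to that from conventional sequencing."""
    if conventional_rpm == 0:
        raise ValueError("conventional RPM must be non-zero")
    return accelerated_rpm / conventional_rpm


if __name__ == "__main__":
    # Hypothetical counts for one sample sequenced in both modes.
    accelerated = rpm(pathogen_reads=250, total_reads=5_000_000)
    conventional = rpm(pathogen_reads=160, total_reads=4_000_000)
    print(f"accelerated RPM: {accelerated}, conventional RPM: {conventional}")
    print(f"RPM ratio: {rpm_ratio(accelerated, conventional)}")
```

A ratio close to 1 would indicate that the two modes detect a similar number of pathogen reads per million for that sample.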
Table 7. Comparison of the number of sequences detected for positive pathogens (RPM)
The results show that, compared with the conventional second-generation sequencing method, the accelerated sequencing method, which sequences a library with an adjusted structure, shows no significant difference in pathogen detection sensitivity. That is, the gene detection method based on second-generation sequencing (the accelerated sequencing method) can shorten the turnaround time (TAT) without loss of detection performance, meeting clinicians' urgent need to obtain detection results quickly and accurately.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations are described; however, any combination of these technical features that involves no contradiction should be considered within the scope of this description.
The above examples represent only a few embodiments of the invention; although described in specific detail, they are not to be construed as limiting the scope of the invention. It should be noted that those skilled in the art can make several variations and modifications without departing from the spirit of the invention, all of which fall within its scope of protection. Accordingly, the scope of protection of the invention is determined by the appended claims.
Claims (13)
1. A library based on second-generation sequencing, wherein the sequence fragments of the library are, in order from the 5' end to the 3' end: a first universal adaptor sequence, a Read3 sequencing primer sequence, an index2 sequence, a Read1 sequencing primer sequence, a target insert, a Read2 sequencing primer sequence, an index1 sequence and a second universal adaptor sequence, wherein the target insert is a sequence to be tested obtained by fragmenting the genome to be tested.
2. The second-generation sequencing-based library of claim 1, wherein the first universal adaptor sequence is a P5 adaptor and the second universal adaptor sequence is a P7 adaptor.
3. The second-generation sequencing-based library of claim 1, wherein the Read3 primer sequence is designed according to the following requirements: the Read3 primer sequence is 20-40 nt in length, its annealing temperature Tm is 65-90 ℃, and/or its GC content is comparable to that of Read1 and Read2.
4. The second-generation sequencing-based library of claim 3, wherein the Read3 primer sequence is designed according to the following requirements: the Read3 primer sequence is 30-35 nt in length, its annealing temperature Tm is 75-80 ℃, its GC content is 58-65%, and/or the 3 bases near the 3' end and/or the 3 bases near the 5' end of the Read3 primer are modified with locked nucleic acid.
5. The second generation sequencing-based library of claim 3, wherein the Read3 primer sequence is selected from the group consisting of the sequences set forth in SEQ ID No. 3.
6. The second generation sequencing-based library of claim 1, wherein the Read1 primer sequence is selected from the group consisting of the sequences set forth in SEQ ID No.1 and the Read2 primer sequence is selected from the group consisting of the sequences set forth in SEQ ID No. 2.
7. A reagent for constructing the second-generation sequencing-based library of any one of claims 1-6, comprising PCR primers and sequencing primers, wherein the PCR primers comprise a P5-end primer and a P7-end primer; the P5-end primer comprises a P5 amplification primer and a P5 universal primer, and the P7-end primer comprises a P7 amplification primer and a P7 universal primer; the P5 amplification primer consists of a Read3 sequencing primer sequence, an index2 sequence and a Read1 sequencing primer sequence connected in sequence; the P7 amplification primer consists of a Read2 sequencing primer sequence, an index1 sequence and a second universal adaptor sequence connected in sequence; the P5 amplification primer and the P5 universal primer are linked or not linked, and the P7 amplification primer and the P7 universal primer are linked or not linked; and the sequencing primers comprise a Read1 sequencing primer, a Read2 sequencing primer and a Read3 sequencing primer.
8. The reagent of the second generation sequencing-based library according to claim 7, wherein the P5 universal primer sequence is selected from the group consisting of the sequences shown in SEQ ID No.4, and the P7 universal primer sequence is selected from the group consisting of the sequences shown in SEQ ID No. 5.
9. Use of the second-generation sequencing-based library of any one of claims 1-6 on an Illumina sequencing platform.
10. A method for constructing the second-generation sequencing-based library of any one of claims 1-6, comprising the following steps:
fragmentation: taking the extracted DNA to be tested, adding transposase for fragmentation, and terminating the reaction after a predetermined time;
PCR amplification: adding the primer combination of claim 7 or 8 to perform a PCR amplification reaction;
library purification: size-selecting and purifying the library obtained in the preceding step to obtain the final library.
11. The method of claim 9, wherein the ratio of the amount of the P5 amplification primer to the amount of the P5 universal primer in the PCR amplification step is 1:9, 1:7, or 1:3, and the ratio of the amount of the P7 amplification primer to the amount of the P7 universal primer is 1:9, 1:7, or 1:3;
the library purification step further comprises a fragment analysis step, wherein the library is judged to be qualified when its main peak fragment is 300 ± 50 bp.
12. A gene detection method based on second-generation sequencing, comprising the following steps:
sample processing: taking a sample to be tested, and extracting the genome to be tested;
library construction: obtaining a library to be tested by the construction method of claim 10 or 11;
sequencing: subjecting the library to be tested to on-machine sequencing;
data analysis: analyzing the off-machine sequencing data by a bioinformatics analysis method to obtain a gene detection result.
13. The method according to claim 12, wherein in the step of sequencing, the target insert sequence is detected by using a Read1 sequencing primer, then the index1 sequence is detected by using a Read2 sequencing primer, and the index2 sequence is directly detected by using a Read3 sequencing primer;
in the data analysis step, genome sequence information to be detected is obtained through integration of an index1 sequence, an index2 sequence and a target insert sequence by a bioinformatics analysis method.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211197088.1A CN117821561A (en) | 2022-09-29 | 2022-09-29 | Library based on second-generation sequencing, construction method and reagent thereof |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117821561A true CN117821561A (en) | 2024-04-05 |
Family
ID=90515937
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211197088.1A Pending CN117821561A (en) | 2022-09-29 | 2022-09-29 | Library based on second-generation sequencing, construction method and reagent thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117821561A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||