CN112941147A - High-fidelity target gene library building method and kit thereof - Google Patents

High-fidelity target gene library building method and kit thereof

Info

Publication number
CN112941147A
Authority
CN
China
Prior art keywords
sequencing
primer
linker
adaptor
joint
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110230543.2A
Other languages
Chinese (zh)
Inventor
崔品
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Ruifa Biotechnology Co ltd
Original Assignee
Shenzhen Ruifa Biotechnology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Ruifa Biotechnology Co ltd filed Critical Shenzhen Ruifa Biotechnology Co ltd
Priority to CN202110230543.2A priority Critical patent/CN112941147A/en
Publication of CN112941147A publication Critical patent/CN112941147A/en
Pending legal-status Critical Current

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806: Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • C: CHEMISTRY; METALLURGY
    • C40: COMBINATORIAL TECHNOLOGY
    • C40B: COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B40/00: Libraries per se, e.g. arrays, mixtures
    • C40B40/04: Libraries containing only organic compounds
    • C40B40/06: Libraries containing nucleotides or polynucleotides, or derivatives thereof
    • C: CHEMISTRY; METALLURGY
    • C40: COMBINATORIAL TECHNOLOGY
    • C40B: COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00: Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/06: Biochemical methods, e.g. using enzymes or whole viable microorganisms

Abstract

A high-fidelity target gene library construction method and a kit thereof. The method comprises: a phosphorylation step of modifying the 5' end of a template molecule with a phosphate group; an annealing and extension step of annealing and extending a first primer to obtain a target nucleotide molecule; a first sequencing adapter ligation step of ligating a first sequencing adapter to the target nucleotide molecule to obtain a first sequencing adapter ligation product; an enzyme digestion step of cutting the first primer to obtain a first sequencing adapter ligation product from which part of the first primer has been removed; a double-stranded end blunting step of excising the single-stranded overhang to obtain a blunt-ended first sequencing adapter ligation product; a second sequencing adapter ligation step of ligating a second sequencing adapter to the blunt-ended first sequencing adapter ligation product to obtain a second sequencing adapter ligation product; and an amplification step of amplifying the second sequencing adapter ligation product with a second primer and a third primer to obtain a library. The invention integrates the conventional library construction and capture steps into a single workflow, which simplifies the procedure and reduces DNA loss during handling.

Description

High-fidelity target gene library building method and kit thereof
Technical Field
The invention relates to the technical field of gene sequencing, and in particular to a high-fidelity target gene library construction method and a kit thereof.
Background
Preparation of NGS targeted libraries (including single-stranded library construction, a newer approach) generally follows one of two workflows. The mainstream capture-based method requires four steps: library construction, pre-capture amplification, hybridization capture, and post-capture amplification; the whole process generally takes 2 to 3 days. The other common method is amplicon library construction, in which multiplex PCR is performed first and a library is then built from the PCR products; some commercial kits add adapter sequences for the NGS platform to the 5' end of the primers, which merges the two steps into one.
The first technical route strictly separates library construction from hybridization capture. It involves many lengthy steps and relies on magnetic-bead capture based on streptavidin-biotin binding, for which the reagents are expensive and largely imported. The second technical route is simpler, but because it is based on multiplex PCR it has the following problems: 1. the starting input required for library construction is high; 2. the number of plexes in one reaction cannot be too large, so a large gene panel is difficult to cover in a single tube and must be split across several single-tube reactions whose products are then combined, which greatly increases cost, lengthens operation time, limits the throughput of a single-tube reaction, and hinders wider adoption; 3. PCR requires primers that pair at both ends, so structural variations such as fusion genes (novel fusions) and virus insertion sites cannot be detected; 4. the exponential amplification of PCR prevents detection of gene copy number variations; 5. the unavoidable amplification bias of multiplex PCR gives low uniformity, so that some regions of the panel are poorly covered while others are covered in excess.
Single-stranded library construction can build libraries from severely degraded samples, but single-stranded adapter ligation or hairpin adapter ligation has a high material cost, and its ligation efficiency is lower than that of conventional double-stranded adapter ligation.
Disclosure of Invention
According to a first aspect, one embodiment provides a high-fidelity target gene library construction method comprising:
a phosphorylation step, comprising modifying the 5' end of a template molecule with a phosphate group;
an annealing and extension step, comprising annealing a first primer to a target nucleotide region of the template molecule and extending it to obtain a double-stranded molecule containing the target nucleotide extension strand, namely the target nucleotide molecule, wherein the first primer contains an enzyme cleavage site;
a first sequencing adapter ligation step, comprising ligating a first sequencing adapter to the target nucleotide molecule to obtain a first sequencing adapter ligation product;
an enzyme digestion step, comprising using an enzyme to cut the first primer at its enzyme cleavage site in the first sequencing adapter ligation product and removing the nucleotide sequence 5' of the cleavage site in the first primer, to obtain a first sequencing adapter ligation product from which part of the first primer has been removed;
a double-stranded end blunting step, comprising excising the single-stranded sequence protruding from the 3' end of the original template strand in the first sequencing adapter ligation product, to obtain a blunt-ended first sequencing adapter ligation product;
a second sequencing adapter ligation step, comprising ligating a second sequencing adapter to the blunt-ended first sequencing adapter ligation product to obtain a second sequencing adapter ligation product;
and an amplification step, comprising amplifying the second sequencing adapter ligation product with a second primer and a third primer to obtain a sequencing library.
According to a second aspect, one embodiment provides a library constructed by the method of the first aspect.
According to a third aspect, one embodiment provides a kit comprising a first primer, a second primer, a third primer, a first sequencing adapter and a second sequencing adapter, wherein the first primer contains an enzyme cleavage site. Because the first primer contains an enzyme cleavage site, part of its sequence can be cut off by the corresponding enzyme. The kit can be used for library construction.
The high-fidelity target gene library construction method and kit of the embodiments integrate the conventional library construction and capture steps into a single workflow. This simplifies and shortens the procedure and reduces DNA loss during handling. The library is built directly on the original template strand, and the adapters at both ends are ligated as double strands, so ligation efficiency is high. The method is therefore a cost-effective, high-fidelity library construction method.
Drawings
FIG. 1 is a schematic diagram of a process for preparing a molecular tagged sequencing adapter according to an embodiment;
FIG. 2 shows a flow chart of library construction according to an embodiment.
Detailed Description
The present invention will be described in further detail with reference to the following detailed description and accompanying drawings. Wherein like elements in different embodiments are numbered with like associated elements. In the following description, numerous details are set forth in order to provide a better understanding of the present application. However, those skilled in the art will readily recognize that some of the features may be omitted or replaced with other elements, materials, methods in different instances. In some instances, certain operations related to the present application have not been shown or described in detail in order to avoid obscuring the core of the present application from excessive description, and it is not necessary for those skilled in the art to describe these operations in detail, so that they may be fully understood from the description in the specification and the general knowledge in the art.
Furthermore, the features, operations, or characteristics described in the specification may be combined in any suitable manner to form various embodiments. Also, the various steps or actions in the method descriptions may be reordered or interchanged, as will be apparent to one of ordinary skill in the art. Thus, the various sequences in the specification and drawings are for the purpose of describing certain embodiments only and are not intended to imply a required sequence, unless it is otherwise indicated that a particular sequence must be followed.
The numbering of the components as such, e.g., "first", "second", etc., is used herein only to distinguish the objects as described, and does not have any sequential or technical meaning. The term "connected" and "coupled" when used in this application, unless otherwise indicated, includes both direct and indirect connections (couplings).
Herein, MAF (minor allele frequency) generally refers to the frequency of the less common allele in a given population. For example, for the three genotypes TT, TC and CC, if the frequency of C in the population is 0.36 and the frequency of T is 0.64, then C is the minor allele and the MAF is 0.36.
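The MAF definition above can be illustrated with a short calculation. This is a minimal sketch: the genotype counts are hypothetical values chosen only so that the allele frequencies match the 0.36 and 0.64 of the example, and the function names are illustrative.

```python
from collections import Counter


def allele_frequencies(genotype_counts):
    """Return {allele: frequency} from counts of diploid genotypes, e.g. {"TT": 41}."""
    alleles = Counter()
    for genotype, n in genotype_counts.items():
        for allele in genotype:  # each genotype contributes two alleles per individual
            alleles[allele] += n
    total = sum(alleles.values())
    return {allele: count / total for allele, count in alleles.items()}


def minor_allele_frequency(genotype_counts):
    """Frequency of the least common allele in the population."""
    return min(allele_frequencies(genotype_counts).values())


# Hypothetical population of 100 individuals (assumption, not from the patent).
counts = {"TT": 41, "TC": 46, "CC": 13}
freqs = allele_frequencies(counts)
maf = minor_allele_frequency(counts)
```

With these counts there are 128 T alleles and 72 C alleles out of 200, so C is the minor allele.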
According to a first aspect, one embodiment provides a high-fidelity target gene library construction method comprising:
a phosphorylation step, comprising modifying the 5' end of a template molecule with a phosphate group;
an annealing and extension step, comprising annealing a first primer to a target nucleotide region of the template molecule and extending it to obtain a double-stranded molecule containing the target nucleotide extension strand, namely the target nucleotide molecule, wherein the first primer contains an enzyme cleavage site;
a first sequencing adapter ligation step, comprising ligating a first sequencing adapter to the target nucleotide molecule to obtain a first sequencing adapter ligation product;
an enzyme digestion step, comprising using an enzyme to cut the first primer at its enzyme cleavage site in the first sequencing adapter ligation product and removing the nucleotide sequence 5' of the cleavage site in the first primer, to obtain a first sequencing adapter ligation product from which part of the first primer has been removed;
a double-stranded end blunting step, comprising excising the single-stranded sequence protruding from the 3' end of the original template strand in the first sequencing adapter ligation product, to obtain a blunt-ended first sequencing adapter ligation product;
a second sequencing adapter ligation step, comprising ligating a second sequencing adapter to the blunt-ended first sequencing adapter ligation product to obtain a second sequencing adapter ligation product;
and an amplification step, comprising amplifying the second sequencing adapter ligation product with a second primer and a third primer to obtain a sequencing library. The library is a complete library of target nucleotide sequences with sequencing adapters attached at both ends.
In one embodiment, the template molecule is single-stranded DNA and/or double-stranded DNA. The template molecule may be single-stranded DNA, double-stranded DNA, or an irregular form (a mixture of single-stranded and double-stranded regions), so the method is suitable for severely degraded samples and trace samples, including bisulfite-treated DNA. The template molecule may also be cDNA reverse-transcribed from an RNA sample.
In one embodiment, in the phosphorylation step, the template molecule includes, but is not limited to, at least one of the following DNA molecules:
a) DNA molecules with the length less than or equal to 500 bp;
b) bisulfite-treated DNA molecules;
c) extracellular free DNA;
d) single-stranded or double-stranded cDNA reverse-transcribed from an RNA sample.
Extracellular free DNA, also called cfDNA (circulating free DNA), generally refers to partially degraded endogenous DNA that is found outside cells in body fluids (e.g., blood).
In one embodiment, the starting template molecule can be any type of DNA (less than 500 bp in length), any type of bisulfite-treated DNA, or first-strand or double-stranded cDNA reverse-transcribed from RNA.
In one embodiment, after the 5' end of the template molecule is modified with a phosphate group, the template molecule is dissociated into single-stranded DNA molecules by heat denaturation, which at the same time denatures the enzyme in the system, and the next reaction is then carried out. The kinase can phosphorylate either single-stranded or double-stranded DNA molecules.
In one embodiment, after the 5' end of the template molecule is modified with a phosphate group, the template molecule is heated to 80-98 ℃ and held for 1-10 min, which dissociates the template molecule into single strands and also denatures and inactivates the kinase. After the reaction, the container holding the template molecule is placed on ice for 2-10 min to prevent the template molecule from reannealing into double strands, and the next reaction step is then carried out. In some embodiments, the heating temperature includes, but is not limited to, 80 ℃, 81 ℃, 82 ℃, 83 ℃, 84 ℃, 85 ℃, 86 ℃, 87 ℃, 88 ℃, 89 ℃, 91 ℃, 92 ℃, 93 ℃, 94 ℃, 95 ℃, 96 ℃, 97 ℃, 98 ℃ and the like, and the holding time includes, but is not limited to, 1 min, 2 min, 3 min, 4 min, 5 min, 6 min, 7 min, 8 min, 9 min, 10 min and the like. The time for which the container holding the template molecule is kept on ice includes, but is not limited to, 2 min, 3 min, 4 min, 5 min, 6 min, 7 min, 8 min, 9 min, 10 min and the like.
In one embodiment, after the 5' end of the template molecule is modified with a phosphate group, the template molecule is heated to 90-98 ℃ for 1-10 min.
In one embodiment, the enzyme used in the phosphorylation step includes, but is not limited to, T4 polynucleotide kinase (T4 PNK).
In one embodiment, the first primer contains an enzyme cleavage site, a first sequence that is joined to the 3' end of the enzyme cleavage site and is complementary to a target nucleotide sequence on the template molecule, and a second sequence that is joined to the 5' end of the enzyme cleavage site and is not complementary to any nucleotide sequence on the template molecule. The second sequence acts as a spacer between the labeling molecule (e.g., biotin) and the annealing region of the first primer, increasing the spatial separation between them so that the first primer can fully unfold, which facilitates annealing.
In one embodiment, the second sequence may be 3-30 nt long, does not contain a run of three or more identical consecutive nucleotides, has no more than 70% similarity to the sequences of the second primer, the third primer, the first sequencing adapter or the second sequencing adapter, and does not readily self-anneal to form secondary structure; any second sequence designed according to these principles is suitable for the present invention. The length of the second sequence includes, but is not limited to, 3nt, 4nt, 5nt, 6nt, 7nt, 8nt, 9nt, 10nt, 11nt, 12nt, 13nt, 14nt, 15nt, 16nt, 17nt, 18nt, 19nt, 20nt, 21nt, 22nt, 23nt, 24nt, 25nt, 26nt, 27nt, 28nt, 29nt, 30nt, and the like.
In one embodiment, the second sequence includes, but is not limited to, the following base sequence: CAAGGACATCCG.
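The design rules for the second sequence can be checked mechanically. The following is a minimal sketch, not the applicant's procedure: the function names are illustrative, `difflib.SequenceMatcher.ratio()` is used only as a stand-in similarity measure (the text does not say how the 70% similarity is computed), and the secondary-structure rule is not checked at all.

```python
from difflib import SequenceMatcher
from itertools import groupby


def longest_homopolymer(seq):
    """Length of the longest run of identical consecutive nucleotides."""
    return max(len(list(run)) for _, run in groupby(seq))


def check_second_sequence(seq, other_oligos=(), max_similarity=0.70):
    """Return a list of violated design rules (an empty list means none found).

    other_oligos is an iterable of (name, sequence) pairs, e.g. the second
    primer, third primer and both adapters; their sequences must be supplied
    by the user, since none are given in the text.
    """
    if not seq:
        return ["empty sequence"]
    problems = []
    if not 3 <= len(seq) <= 30:
        problems.append("length outside 3-30 nt")
    if longest_homopolymer(seq) >= 3:
        problems.append("run of three or more identical nucleotides")
    for name, oligo in other_oligos:
        # Stand-in similarity measure (assumption, see lead-in).
        if SequenceMatcher(None, seq, oligo).ratio() > max_similarity:
            problems.append(f"more than {max_similarity:.0%} similar to {name}")
    return problems


# The example second sequence given in the text passes the length and run checks.
example_problems = check_second_sequence("CAAGGACATCCG")
```

A real design check would also need a secondary-structure prediction step, which is outside the scope of this sketch.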
In one embodiment, the cleavage site of the first primer comprises uracil.
In one embodiment, the 5' end of the first primer is modified with a labeling molecule. The 5' end of the first primer contained in the first sequencing adapter ligation product carries a labeling molecule such as biotin, which binds with high affinity (non-covalently) to streptavidin on the surface of streptavidin-coated magnetic beads; when the magnetic beads are collected on a magnetic rack, the first sequencing adapter ligation product is collected with them.
In one embodiment, the labeling molecule includes, but is not limited to, biotin.
In one embodiment, the first sequencing adapter comprises a first molecular tag.
In one embodiment, the 5' end of the inner side of the first sequencing adapter is modified with a phosphate group, and the inner side of the first sequencing adapter refers to the side of the first sequencing adapter that can be connected to the target nucleotide molecule in series.
In one embodiment, the 3' end of the outer side of the first sequencing adapter is modified with a phosphate group; the outer side of the first sequencing adapter is the side that cannot be ligated to the target nucleotide molecule. This phosphate modification is a blocking modification. Two DNA strands are joined end to end when the phosphate group at the 5' end of one strand and the hydroxyl group at the 3' end of the other form a covalent (phosphodiester) bond. If the 5' end carries no phosphate group, the reaction cannot start, and if the group at the 3' end is not a hydroxyl, the reaction cannot be completed. Replacing the 3' hydroxyl with a phosphate group therefore blocks the ligation reaction. Accordingly, once the 3' end of the outer side of the first sequencing adapter carries a phosphate group, the 5' phosphate on the inner side of the second sequencing adapter cannot be ligated to it, which prevents the second sequencing adapter from being joined to the outer side of the first sequencing adapter and forming by-products with a non-standard library structure in the subsequent second adapter ligation reaction.
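The blocking logic described above reduces to a simple rule on the two end groups. The sketch below is a toy model with illustrative names; it encodes only the rule stated in the text and ignores everything else about ligation (end geometry, enzyme, reaction conditions).

```python
def can_ligate(donor_five_prime_group, acceptor_three_prime_group):
    """True only when a 5' phosphate meets a 3' hydroxyl, as stated in the text."""
    return (
        donor_five_prime_group == "phosphate"
        and acceptor_three_prime_group == "hydroxyl"
    )


# Inner side of the second adapter (5' phosphate) meeting an unmodified 3' end.
unblocked = can_ligate("phosphate", "hydroxyl")

# The same 5' phosphate meeting the 3'-phosphate-blocked outer side of the
# first adapter: no ligation, so no adapter-on-adapter by-product.
blocked = can_ligate("phosphate", "phosphate")
```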
In one embodiment, the first sequencing adaptor comprises a forward strand and a reverse strand which can be complementarily paired, wherein a first molecular tag is connected in series at the 5' end of the forward strand, a nucleotide sequence which is complementarily paired with the first molecular tag is connected in series at the 3 ' end of the reverse strand, and the 5' end of the first molecular tag is modified with a phosphate group.
In one embodiment, the 3' end of the forward strand of the first sequencing adaptor is also modified with a phosphate group, which serves as a blocking modification.
In one embodiment, in the first sequencing adapter ligation step, streptavidin-coated magnetic beads are used to collect the first sequencing adapter ligation product, which then enters the enzyme digestion step. Because the first sequencing adapter ligation product contains the first primer, which carries the labeling molecule, the streptavidin-coated magnetic beads bind the labeling molecule. The bead complexes are collected on the inner wall of the PCR tube with a magnetic rack, the supernatant in the PCR tube is aspirated and discarded, and the material retained in the PCR tube is the first sequencing adapter ligation product.
In one embodiment, in the enzyme digestion step, streptavidin-coated magnetic beads retain the second sequence cleaved from the first primer (the part joined to the 5' end of the cleavage site), while the remaining first sequencing adapter ligation product stays in the supernatant; the supernatant is transferred to another container for the subsequent double-stranded end blunting step. The cleavage site is digested by UDG enzyme.
In one embodiment, the enzyme used in the enzyme digestion step may be UDG enzyme (uracil-DNA glycosylase). UDG enzyme is commercially available.
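The layout of the first primer and the effect of the enzyme digestion step can be pictured with a simplified string model. This sketch treats the single uracil as the cut point and drops it; it does not model the enzymology. The first sequence used here is a hypothetical placeholder, while the second sequence is the example given in the text.

```python
def build_first_primer(second_sequence, first_sequence):
    """Primer written 5'->3': second sequence, uracil cleavage site, first sequence."""
    return second_sequence + "U" + first_sequence


def cleave_at_uracil(primer):
    """Split the primer at its uracil; returns (5' fragment, 3' fragment)."""
    five_prime_part, site, three_prime_part = primer.partition("U")
    if not site:
        raise ValueError("primer has no uracil cleavage site")
    return five_prime_part, three_prime_part


second_sequence = "CAAGGACATCCG"        # example from the text
first_sequence = "ACGTTGCAGGTCATCGAAGC"  # hypothetical target-specific sequence
primer = build_first_primer(second_sequence, first_sequence)

# The 5' fragment carries the biotin label and stays on the beads; the
# 3' fragment remains in the ligation product released to the supernatant.
bead_bound, released = cleave_at_uracil(primer)
```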
In one embodiment, in the double-stranded end blunting step, T4 DNA polymerase is used to excise the single-stranded sequence (overhang) protruding from the 3' end of the original template strand in the first sequencing adapter ligation product. The protruding single-stranded sequence is the nucleotide sequence at the 3' end of the original template strand that is not complementary to the extension strand of the first primer and therefore does not form a double strand with it.
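Which part of the template strand counts as the 3' overhang follows from where the first primer anneals. The sketch below is a simplified string model with illustrative names and hypothetical sequences: everything on the template strand 3' of the primer's annealing site is unpaired after extension and is trimmed.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def blunt_three_prime_overhang(template, first_sequence):
    """Return (kept, trimmed) parts of the template strand, both written 5'->3'.

    The primer's first sequence pairs with its reverse complement on the
    template; extension copies the template toward its 5' end, so the bases
    3' of the annealing site stay single-stranded and are removed.
    """
    target_site = reverse_complement(first_sequence)
    start = template.find(target_site)
    if start == -1:
        raise ValueError("primer does not anneal to this template")
    end = start + len(target_site)
    return template[:end], template[end:]


# Hypothetical sequences for illustration only.
first_sequence = "CATGGACT"
template = "GGATCCTTAGCA" + reverse_complement(first_sequence) + "ACCT"
kept, trimmed = blunt_three_prime_overhang(template, first_sequence)
```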
In one embodiment, the inner side of the second sequencing adapter is modified with a phosphate group at the 5' end, and the inner side of the second sequencing adapter refers to the side of the second sequencing adapter that can be connected to the ligation product of the first sequencing adapter in series.
In one embodiment, the second sequencing adapter contains or does not contain a second molecular tag.
In one embodiment, the second sequencing adapter comprises a forward strand and a reverse strand that are complementary to each other.
In one embodiment, when the second sequencing adaptor does not contain the second molecular tag, the 5' end of the reverse strand of the second sequencing adaptor is modified with a phosphate group.
In one embodiment, when the second sequencing adaptor contains a second molecular tag, the 5' end of the reverse strand of the second sequencing adaptor is connected with the second molecular tag in series, the 5' end of the second molecular tag is modified with a phosphate group, and the 3 ' end of the forward strand of the second sequencing adaptor is connected with a nucleotide sequence which can be complementarily paired with the second molecular tag in series.
In one embodiment, the first molecular tag and the second molecular tag are independently random nucleotide sequences.
In one embodiment, the lengths of the first molecular tag and the second molecular tag may be independently 4-19nt, including but not limited to 4nt, 5nt, 6nt, 7nt, 8nt, 9nt, 10nt, 11nt, 12nt, 13nt, 14nt, 15nt, 16nt, 17nt, 18nt, 19nt, and the like.
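A random molecular tag of the stated length can be generated as follows. This is a minimal sketch with an illustrative function name; in practice such tags are usually produced as degenerate bases during oligo synthesis rather than in software.

```python
import random


def random_molecular_tag(length, rng=random):
    """Return a random A/C/G/T sequence of 4-19 nt, the range given in the text."""
    if not 4 <= length <= 19:
        raise ValueError("molecular tag length must be 4-19 nt")
    return "".join(rng.choice("ACGT") for _ in range(length))


# A seeded generator makes the example repeatable.
tag = random_molecular_tag(8, random.Random(0))

# Number of distinct tags available at this length.
diversity = 4 ** len(tag)
```

An 8 nt tag gives 4^8 = 65,536 possible sequences, which is what lets individual template molecules be told apart after amplification.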
In one embodiment, the second primer contains a first sample tag.
In one embodiment, the second primer comprises an inner adaptor, a first sample tag, and an outer adaptor in tandem from 3 'to 5' and the inner adaptor is complementary to the reverse strand of the first sequencing adaptor.
In one embodiment, the third primer contains or does not contain a second sample tag.
In one embodiment, when the third primer does not contain the second sample tag, the third primer comprises an inner adaptor and an outer adaptor connected in series from 3 'end to 5' end, wherein the inner adaptor can be complementarily paired with the reverse strand of the second sequencing adaptor.
In one embodiment, when the third primer comprises a second sample tag, the third primer comprises an inner adaptor, a second sample tag, and an outer adaptor connected in series from 3 'to 5' in sequence, wherein the inner adaptor can be complementarily paired with the reverse strand of the second sequencing adaptor.
In one embodiment, the lengths of the first sample label and the second sample label can be independently 4-19nt, including but not limited to 4nt, 5nt, 6nt, 7nt, 8nt, 9nt, 10nt, 11nt, 12nt, 13nt, 14nt, 15nt, 16nt, 17nt, 18nt, 19nt, and the like.
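The layout of the second and third primers can be sketched as simple concatenation. The text lists the parts from 3' to 5' (inner adaptor, sample tag, outer adaptor), so written in the usual 5'->3' direction the order is outer adaptor, sample tag, inner adaptor. All sequences below are hypothetical placeholders, not real platform sequences.

```python
def assemble_amplification_primer(outer_adaptor, inner_adaptor, sample_tag=""):
    """Return the primer written 5'->3': outer adaptor, optional sample tag, inner adaptor."""
    if sample_tag and not 4 <= len(sample_tag) <= 19:
        raise ValueError("sample tag length must be 4-19 nt")
    return outer_adaptor + sample_tag + inner_adaptor


# Hypothetical placeholder sequences (assumptions for illustration only).
OUTER = "AAGCTTGGCACTGG"
INNER = "CCGTAGCATTCAGG"

# Second primer: always carries a first sample tag.
second_primer = assemble_amplification_primer(OUTER, INNER, sample_tag="ACGTACGT")

# Third primer: the second sample tag is optional.
third_primer_untagged = assemble_amplification_primer(OUTER, INNER)
```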
In one embodiment, the first sequencing adapter is a right-side sequencing adapter of a high-throughput sequencing platform.
In one embodiment, the first sequencing adapter includes, but is not limited to, any of a P7-end sequencing adapter of the Illumina sequencing platform, a P1-end sequencing adapter of the MGI sequencing platform, or a right-side sequencing adapter of another high-throughput sequencing platform.
In one embodiment, the second sequencing adapter is a left-side sequencing adapter of a high-throughput sequencing platform.
In one embodiment, the second sequencing adapter includes, but is not limited to, any of a P5-end sequencing adapter of the Illumina sequencing platform, a P2-end sequencing adapter of the MGI sequencing platform, or a left-side sequencing adapter of another high-throughput sequencing platform.
According to a second aspect, one embodiment provides a library constructed by the high-fidelity target gene library construction method of the first aspect.
According to a third aspect, one embodiment provides a kit comprising a first primer, a second primer, a third primer, a first sequencing adapter and a second sequencing adapter, wherein the first primer contains an enzyme cleavage site. Because the first primer contains an enzyme cleavage site, part of its sequence can be cut off by the corresponding enzyme. The kit can be used for sequencing library construction.
In one embodiment, the first primer further comprises a first sequence concatenated to the 3 'end of the cleavage site and complementarily pairable with a target nucleotide sequence on a template molecule, and a second sequence concatenated to the 5' end of the cleavage site and not complementarily pairable with a nucleotide sequence on a template molecule.
In one embodiment, the second sequence may be 3-30 nt long, does not contain a run of three or more identical consecutive nucleotides, has no more than 70% similarity to the sequences of the second primer, the third primer, the first sequencing adapter or the second sequencing adapter, and does not readily self-anneal to form secondary structure; any second sequence designed according to these principles is suitable for the present invention. The length of the second sequence includes, but is not limited to, 3nt, 4nt, 5nt, 6nt, 7nt, 8nt, 9nt, 10nt, 11nt, 12nt, 13nt, 14nt, 15nt, 16nt, 17nt, 18nt, 19nt, 20nt, 21nt, 22nt, 23nt, 24nt, 25nt, 26nt, 27nt, 28nt, 29nt, 30nt, and the like.
In one embodiment, the second sequence includes, but is not limited to, the following base sequence: CAAGGACATCCG.
In one embodiment, the cleavage site of the first primer comprises uracil.
In one embodiment, the 5' end of the first primer is modified with a labeling molecule.
In one embodiment, the labeling molecule includes, but is not limited to, biotin.
In one embodiment, the second primer contains a first sample tag.
In one embodiment, the second primer comprises an inner adaptor, a first sample tag, and an outer adaptor in tandem from 3 'to 5' and the inner adaptor is complementary to the reverse strand of the first sequencing adaptor.
In one embodiment, the third primer contains or does not contain a second sample tag.
In one embodiment, when the third primer does not contain the second sample tag, the third primer comprises an inner adaptor and an outer adaptor connected in series from 3 'end to 5' end, wherein the inner adaptor can be complementarily paired with the reverse strand of the second sequencing adaptor.
In one embodiment, when the third primer comprises a second sample tag, the third primer comprises an inner adaptor, a second sample tag, and an outer adaptor connected in series from 3 'to 5' in sequence, wherein the inner adaptor can be complementarily paired with the reverse strand of the second sequencing adaptor.
In one embodiment, the lengths of the first sample label and the second sample label are independently 4-19nt, including but not limited to 4nt, 5nt, 6nt, 7nt, 8nt, 9nt, 10nt, 11nt, 12nt, 13nt, 14nt, 15nt, 16nt, 17nt, 18nt, 19nt, and the like.
In one embodiment, the first sequencing adapter comprises a first molecular tag.
In one embodiment, the 5' end of the inner side of the first sequencing adapter is modified with a phosphate group, and the inner side of the first sequencing adapter refers to the side of the first sequencing adapter that can be connected to the target nucleotide molecule in series.
In one embodiment, the 3' end of the outer side of the first sequencing adapter is modified with a phosphate group; the outer side of the first sequencing adapter is the side that cannot be ligated to the target nucleotide molecule.
In one embodiment, the first sequencing adapter comprises a complementary pair of a forward strand and a reverse strand, the forward strand has a first molecular tag connected in series at the 5' end, the reverse strand has a nucleotide sequence complementary to the first molecular tag connected in series at the 3 ' end, and the first molecular tag is modified with a phosphate group at the 5' end.
In one embodiment, the inner side of the second sequencing adapter is modified with a phosphate group at the 5' end, and the inner side of the second sequencing adapter refers to the side of the second sequencing adapter that can be connected to the ligation product of the first sequencing adapter in series.
In one embodiment, the second sequencing adapter contains or does not contain a second molecular tag.
In one embodiment, the second sequencing adapter contains complementary pairs of a forward strand and a reverse strand.
In one embodiment, when the second sequencing adaptor does not contain the second molecular tag, the 5' end of the reverse strand of the second sequencing adaptor is modified with a phosphate group.
In one embodiment, when the second sequencing adaptor contains a second molecular tag, the 5' end of the reverse strand of the second sequencing adaptor is connected with the second molecular tag in series, the 5' end of the second molecular tag is modified with a phosphate group, and the 3 ' end of the forward strand of the second sequencing adaptor is connected with a nucleotide sequence which can be complementarily paired with the second molecular tag in series.
In one embodiment, the first molecular tag and the second molecular tag are independently random nucleotide sequences.
In one embodiment, the lengths of the first molecular tag and the second molecular tag may be independently 4-19nt, including but not limited to 4nt, 5nt, 6nt, 7nt, 8nt, 9nt, 10nt, 11nt, 12nt, 13nt, 14nt, 15nt, 16nt, 17nt, 18nt, 19nt, and the like.
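Because the molecular tags are random nucleotide sequences, a tag of length n can take 4^n distinct values. The following minimal sketch (function names are illustrative assumptions) shows how the number of possible tags grows over the stated 4-19nt range and how one random tag could be drawn.

```python
import random


def tag_diversity(length: int) -> int:
    """Number of distinct random tags of the given length (4 bases per position)."""
    if not 4 <= length <= 19:
        raise ValueError("tag length outside the 4-19 nt range given in the text")
    return 4 ** length


def random_tag(length: int, rng: random.Random) -> str:
    """Draw one random tag; each position is A, C, G or T with equal probability."""
    return "".join(rng.choice("ACGT") for _ in range(length))


for n in (4, 8, 12, 19):
    print(n, tag_diversity(n))

print(random_tag(8, random.Random(0)))
```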
In one embodiment, the first sequencing adapter comprises a high throughput sequencing platform right side sequencing adapter.
In one embodiment, the first sequencing linker includes, but is not limited to, any of a P7-terminal sequencing linker of the Illumina sequencing platform, a P1-terminal sequencing linker of the MGI sequencing platform, or other high throughput sequencing platform right-side sequencing linker.
In one embodiment, the second sequencing adapter comprises a high throughput sequencing platform left side sequencing adapter.
In one embodiment, the second sequencing linker includes, but is not limited to, any of a P5-terminal sequencing linker of the Illumina sequencing platform, a P2-terminal sequencing linker of the MGI sequencing platform, or other high throughput sequencing platform left-side sequencing linker.
In the following examples, Illumina platform sequencing library preparation is taken as an example, and other NGS platforms are also applicable to the present invention, except that the sequencing linker sequence needs to be changed accordingly.
In one embodiment, the library building method of the present invention mainly comprises the following steps: 1. 5' phosphorylation of the template DNA; 2. annealing and extension of the target primer; 3. ligation of the first adapter; 4. magnetic bead binding; 5. UDG digestion; 6. blunt-end repair; 7. ligation of the second adapter; 8. amplification of the sample-tagged library.
In one embodiment, FIG. 1 is a flow chart of the preparation of a sequencing adaptor with a molecular tag, and FIG. 2 is a main flow chart of a library building method, which mainly comprises the following steps:
1. The DNA is subjected to 5' phosphorylation using T4 polynucleotide kinase, and the following reactions are carried out in a PCR instrument: 37 ℃ for 20 minutes; 95 ℃ for 1-10 minutes. The product is then placed on ice for 2 minutes to 1 hour to obtain single-stranded template molecules.
2. Primer annealing and extension are carried out on the target gene molecule using a first primer that carries biotin at the 5' end and a nucleotide sequence complementary to the target gene at the 3' end (the primer contains a uracil site for subsequent cleavage; a sequence that is not complementary to the target nucleotide sequence is connected in series at the 5' end of the uracil site, and a nucleotide sequence complementary to the target nucleotide sequence is connected in series at its 3' end).
3. The P7 adapter (i.e., the first sequencing adapter) is ligated in a PCR instrument using T4 DNA ligase. The P7 adapter carries a first molecular tag at the end near the template molecule, and the 5' end of its inner side (the side that can be connected to the template molecule in series) is modified with a phosphate group.
4. After ligation, the biotin-bearing first primer extension product is captured with streptavidin-coated magnetic beads and the supernatant is removed, leaving the double-stranded library molecules in the PCR tube. UDG enzyme is added to the tube to cleave off the uracil site of the first primer together with the nucleotide sequence at its 5' end. The product is then blunted with T4 DNA polymerase, and a hairpin P5 adapter (i.e., the second sequencing adapter) carrying a phosphate group at its inner 5' end is added so that the blunted product is ligated to the P5 hairpin adapter. The reaction is incubated at 25-37 ℃ for 30-120 minutes and then heat-denatured at 95 ℃ for 2 minutes, which inactivates the T4 DNA ligase and completely terminates the adapter ligation reaction.
5. DNA polymerase and its matched buffer, a third primer at the P5 end with or without a sample tag, a second primer at the P7 end with a sample tag, and dNTPs are added directly to the ligation reaction product to prepare a reaction system, and sample-tagging PCR (polymerase chain reaction) is carried out under the following conditions: 95 ℃ for 3 minutes; 6-30 cycles of 95-98 ℃ for 30 seconds, 60 ℃ for 10-30 seconds, and 72 ℃ for 10-30 seconds; after cycling, 72 ℃ for 1-10 minutes.
6. After the reaction, the product is purified with Vazyme magnetic beads to obtain an Illumina library with a P7 single-side molecular tag and a sample tag, or an Illumina library with a P7 single-side molecular tag and double-ended sample tags, or an Illumina library with double-side molecular tags and double-ended sample tags.
In FIG. 2, "without purification between steps" means that no purification is required between the digestion step and the double-stranded end blunting step, between the blunting step and the second sequencing adapter ligation step, or between the second sequencing adapter ligation step and the amplification step. Omitting purification between steps saves time and reagents and also conserves the original DNA sample, since every purification step loses DNA; the absence of purification between steps is therefore a major advantage of the present invention.
In one embodiment, the conventional steps of library construction and capture are integrated into one process, so that the operation process is effectively shortened, about 9 hours are needed from the phosphorylation step to the completion of library construction, and the operation is simple and easy.
In one embodiment, the present invention can be completely based on double-stranded linker ligation (the first sequencing linker and the second sequencing linker are both double-stranded ligation when they are ligated to the template molecule), the ligation time is only 15-30min, and the linker ligation efficiency is higher than that of single-stranded library construction.
In one embodiment, the starting template for library construction may be single-stranded, double-stranded, or irregular (i.e., mixed single- and double-stranded), so the invention can be applied to severely degraded samples and trace samples, including bisulfite-treated DNA.
In one embodiment, genomic structural variation detection (including but not limited to gene copy number variation, detection of fusion genes and viral insertion sequences, etc.) can be achieved based on one-way primer extension and direct capture of the original DNA strand.
In one embodiment, the invention can directly use the original DNA strand for library construction, which is beneficial to more accurate capture and identification of mutation.
In one embodiment, a mutation can also be identified when it is present on only one strand of the original double-stranded DNA molecule, because the present technique separates the two strands of the original DNA sample into separate sequencing libraries.
In one embodiment, an RNA sample becomes directly compatible with the present invention once it has been reverse transcribed into first-strand cDNA; no second-strand synthesis is required, which saves material and time and avoids some of the errors and biases associated with random primers in conventional second-strand synthesis.
In one embodiment, the present invention is also applicable to library construction from samples such as bisulfite-treated DNA, cfDNA, and fragmented genomic DNA.
In one embodiment, the adapters at both ends of the template molecule are connected by conventional double-stranded connection, which is cheaper and provides higher utilization of the original sample molecules.
In one embodiment, the present invention requires a DNA input of at least 20ng, and the genomic DNA extracted from the tissue sample is disrupted to between 200bp and 600 bp.
In one embodiment, the invention relies on an imported enzyme, Thermolabile [Figure BDA0002957657160000051] II Enzyme, a UDG enzyme (also known as uracil-DNA glycosylase).
In one embodiment, the library construction method and kit of the present invention can be used for detecting ultra-low-frequency gene mutations; in one embodiment, they can be used for detecting samples with mutation frequencies as low as 0.03%.
In the prior art, library construction is generally carried out on amplification products of the original DNA molecules. In one embodiment, the library is built directly on the original template strand, and the adapters at both ends are ligated as double strands with high ligation efficiency, so that the original template molecules are built into the sequencing library with high fidelity.
Example 1
The library building method of the present embodiment is performed with reference to fig. 1 and 2.
This example first prepares a mutant cell-free nucleic acid (cfDNA) standard with a mutation frequency of three in ten thousand (0.03%). Three equal portions of the standard, 60ng each, are then used for three independent library construction experiments, prepared respectively by the method of this example, an existing hybridization capture library construction method (comparative example 1), and an existing amplicon library construction method (comparative example 2). The target gene regions designed for the three experiments are essentially the same. The libraries are sequenced on the same high-throughput sequencing platform to the same data volume, and the same data analysis pipeline is used to examine the same 8 target gene mutation sites (distributed in exon regions of 4 genes: NRAS, KRAS, PIK3CA and EGFR), in order to evaluate the performance differences among the three high-throughput sequencing target gene library construction technologies.
In this embodiment, a library for the widely used Illumina sequencing platform is taken as an example; other high-throughput sequencing platforms are also suitable for the invention, except that the sequencing adapter sequences need to be replaced accordingly.
Experimental materials and equipment:
The standard used in this example was purchased from Jingliang Gene Technology (Shenzhen) Co., Ltd.: lung cancer ctDNA standard set GW-OCTM009, which contains a wild-type DNA standard and a ctDNA standard with a mutation frequency of 0.1%. The two were mixed at a ratio of 7:3 (wild-type standard : 0.1% ctDNA standard) to obtain a diluted standard with a mutation frequency of 0.03%.
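The dilution arithmetic can be written out as a short sketch: 3 parts of the 0.1% standard in 10 parts total gives 0.03%. The second function is an illustrative estimate only; it assumes roughly 3.3 pg of DNA per haploid human genome copy, a commonly used approximation that is not stated in this document, to show how few mutant copies a 60ng input contains at this frequency.

```python
def mixed_mutation_frequency(wild_type_parts: float, mutant_parts: float,
                             mutant_stock_frequency: float) -> float:
    """Mutation frequency after mixing a wild-type standard with a mutant standard."""
    total_parts = wild_type_parts + mutant_parts
    return mutant_parts * mutant_stock_frequency / total_parts


def expected_mutant_copies(input_ng: float, frequency: float,
                           pg_per_haploid_genome: float = 3.3) -> float:
    """Rough number of mutant genome copies in the input.

    pg_per_haploid_genome = 3.3 is an assumption (approximate mass of one
    haploid human genome), not a value taken from the patent text.
    """
    genome_copies = input_ng * 1000.0 / pg_per_haploid_genome
    return genome_copies * frequency


freq = mixed_mutation_frequency(7, 3, 0.001)   # 7:3 mix of wild type and 0.1% standard
print(f"{freq:.4%}")
print(round(expected_mutant_copies(60, freq), 1))
```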
The target detection sites are shown in table 1 below.
TABLE 1
Figure BDA0002957657160000061
The desired oligomers (oligos) are shown in tables 2 and 3 below (synthesized by tsingtaury biotechnology limited, tokyo, HPLC purification).
TABLE 2
Figure BDA0002957657160000062
Figure BDA0002957657160000071
Table 3: target gene probe (first primer) with biotin modification at 5' end
Figure BDA0002957657160000072
The symbols in tables 2 and 3 are described below: (1) "Biotin-" represents a Biotin label.
(2) IS2-RC-N and IS2 anneal into double stranded DNA molecules, i.e., a first sequencing linker; IS1-RC-N (or IS1-RC) and IS1 anneal into a double stranded DNA molecule, the second sequencing adaptor.
(3) "N" represents a random base in a nucleotide, and the random base may be any one of A, T, C, G bases.
(4) In the first primer (Biotin-U-GSP), which contains the sequence complementary to the target gene, X represents a 20-nucleotide sequence that pairs complementarily with the target gene region; the primers are tiled along the target gene region in the forward direction at 10-nucleotide intervals, i.e., 2X tiling coverage.
(5) "Pho" represents a phosphate group.
(6) "U" represents a deoxynucleotide with uracil, i.e., dUTP.
(7) In sample-tagged P7-end primer 1, the underlined sequence, "tgaag", is the sample tag.
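The tiling rule in note (4), 20-nucleotide target-binding sequences placed every 10 nucleotides, can be sketched as follows. The region sequence is a hypothetical placeholder and is taken to be already written in the primer's own orientation; strand handling and the biotin, uracil and non-complementary 5' parts of the real first primer are not modeled.

```python
from typing import List


def tile_primers(region: str, length: int = 20, step: int = 10) -> List[str]:
    """Return target-binding sequences of `length` nt starting every `step` nt.

    With length=20 and step=10, every interior base of the region is covered
    by two primers (2X tiling).
    """
    if len(region) < length:
        return []
    return [region[i:i + length]
            for i in range(0, len(region) - length + 1, step)]


# Hypothetical 30-nt target window: two tiled 20-mers, overlapping by 10 nt.
window = "ACGTTGCAAGGCTTAACCGGATCGATTACG"
for primer in tile_primers(window):
    print(primer)
```

A 30-nucleotide window yields two overlapping primers under these settings, which is consistent with the two primers per site used in this example.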
The reagents and instrumentation are described below:
1) For 5' phosphorylation of each DNA template, T4 Polynucleotide Kinase (10 U/μL) was used, purchased from Invitrogen (Shanghai) Trading Co., Ltd., product No. EK0031.
2) The first sequencing adapter preparation reaction used DNA Polymerase I Klenow Fragment (5 U/μL), purchased from Nanjing Vazyme Biotech Co., Ltd., cat #: N104-01.
3) The first primer extension reaction used [Figure BDA0002957657160000082] AmpSeq Multi-PCR Module V2, purchased from Nanjing Vazyme Biotech Co., Ltd., cat #: NA205-01.
4) For each adapter ligation, T4 DNA Ligase (Rapid) was used, purchased from Nanjing Vazyme Biotech Co., Ltd., cat #: N103-01.
5) For cleaving the dUTP-containing first primer extension product, Thermolabile [Figure BDA0002957657160000083] II Enzyme was used, purchased from New England Biolabs (Beijing) Ltd., cat #: M5508S.
6) For excising the single-stranded nucleotide sequence (overhang) protruding from the 3' end of the original template strand paired with the first primer extension product, T4 DNA polymerase was used, purchased from Nanjing Vazyme Biotech Co., Ltd., cat #: N101-01.
7) The library amplification reaction used VAHTS HiFi Amplification Mix, purchased from Nanjing Vazyme Biotech Co., Ltd., cat #: N616-01.
8) PCR products were purified with VAHTS DNA Clean Beads, purchased from Nanjing Vazyme Biotech Co., Ltd., cat #: N411-01.
9) The streptavidin magnetic beads used for binding single-stranded ligation products were Dynabeads™ MyOne™ Streptavidin C1, purchased from Invitrogen (Shanghai) Trading Co., Ltd., cat #: 65001.
10) The ultrapure water used in each step of the experiment was UltraPure™ DNase/RNase-Free Distilled Water, purchased from Invitrogen (Shanghai) Trading Co., Ltd., cat #: 10977023.
11) Instruments: ABI Veriti 96 PCR instrument (Invitrogen (Shanghai) Trading Co., Ltd.), constant temperature mixer (Hangzhou Yongning instruments Co., Ltd., Cat. HC-100), four-dimensional rotating mixer (Haimen's Linbei instruments manufacturing Co., Ltd., BE-1100), magnetic rack (Wuxi Baige Biotech Co., Ltd., Cat. BMB16-1.5-2), Qubit™ 4 Fluorometer (with WiFi, Invitrogen (Shanghai) Trading Co., Ltd., Cat No. Q33238), BiOptic fully automatic multiplex nucleic acid detection system (Hangzhou Kagazei Biotech Limited, Cat No. Qsep-100), and Eppendorf pipettes with 1000 μL, 100 μL and 10 μL ranges, all from Eppendorf, Germany.
12) The TE buffer composition of this example was as follows: 10mmol/L Tris-HCl, 1mmol/L EDTA, pH 8.0.
As shown in fig. 1 and fig. 2, the experimental steps of this example are as follows:
1. Take the Jingliang Gene lung cancer ctDNA standard set GW-OCTM009 (20 ng/μL), which contains a wild-type DNA standard and a ctDNA standard with a mutation frequency of 0.1%, and mix them at wild-type DNA standard : 0.1% ctDNA standard = 7:3 to form 60ng of cfDNA sample with a mutation frequency of 0.03%.
2. Phosphorylation of the DNA sample:
Prepare the following reaction system in the 200 microliter PCR tube containing the product of the previous step:
TABLE 4
Components Volume
cfDNA sample with mutation frequency of 0.03% (20 ng/μL, 60ng in total) 3μL
UltraPure DNase/RNase-Free Distilled Water 12μL
10× Reaction Buffer A for T4 PNK 2μL
ATP (10mM) 2μL
T4 Polynucleotide Kinase 1μL
Total volume 20μL
The PCR tube was then placed in a PCR instrument and the following reactions were performed: 20 minutes at 37 ℃; 95 ℃ for 3 minutes. The reaction at 95 ℃ for 3 minutes can dissociate the DNA sample into single strands and denature and inactivate enzymes in the reaction system. Then, the sample was placed on ice for 5 minutes to avoid renaturation into double strands, and 20. mu.L of a single-stranded cfDNA sample phosphorylated at the 5' -end was obtained.
3. Preparation of the first sequencing adapter (with molecular tag)
3.1 the following reaction system was placed in a 200. mu.L PCR tube:
TABLE 5 first sequencing linker preparation System
Figure BDA0002957657160000081
Figure BDA0002957657160000091
3.2 placing the mixture in a PCR instrument to perform the following reactions: 95 ℃ for 10 s; the temperature was slowly reduced to 14 ℃ at a RAMP 4% (0.1 ℃/s) rate.
3.3 formation of the first sequencing linker precursor at a concentration of 200 pmol/. mu.L.
3.4 prepare the following reaction system in a 200. mu.L PCR tube:
TABLE 6
Components Sample addition amount (μ L)
Blue buffer(10X) 5
dNTP(10mM each) 7.5
ULtraPure DNase/RNase-Free Distilled Water 7.5
Klenow Fragment(5U/μL) 2.5
Tween_20(1%) 2.5
First sequencing linker precursor (200 pmol/. mu.L) 25
Total volume 50
3.5 placing the mixture in a PCR instrument for carrying out the following reactions: 15min at 37 ℃; 95 ℃ for 3min (reaction at 95 ℃ for 3min in order to denature the Klenow Fragment enzyme); a first sequencing linker was formed at a final concentration of 100 pmol/. mu.L. After the reaction is finished, the temperature is naturally reduced to room temperature (room temperature means 23 ℃ C. + -. 2 ℃ C., and the definitions of the subsequent room temperature are the same) for later use.
The prepared product can be stored for a long time at the temperature of minus 20 ℃ or stored for 8 hours at the temperature of 4 ℃.
4. Preparation of a second sequencing adapter (with molecular tag)
4.1 the following reaction system was prepared in a 200. mu.L PCR tube:
TABLE 7 second sequencing Joint annealing System
Components Sample addition amount (μ L)
IS1(500μM) 20
IS1-RC-N(500μM) 20
TE buffer 9.5
NaCl(5M) 0.5
Total volume 50
4.2 annealing reaction conditions: 95 ℃ for 10 seconds; the temperature was slowly reduced to 14 ℃ at a rate of RAMP 4% (0.1 ℃/s).
4.3 formation of a second sequencing linker (with molecular tag) precursor at a concentration of 200 pmol/. mu.L.
4.4 prepare the following reaction system in a 200. mu.L PCR tube:
TABLE 8
Components Sample addition amount (μ L)
Blue buffer(10×) 5
dNTP(10mM each) 7.5
ULtraPure DNase/RNase-Free Distilled Water 7.5
Klenow Fragment(5U/μL) 2.5
Tween_20(1%) 2.5
Second sequencing linker (with molecular tag) precursor (200 pmol/. mu.L) 25
Total volume 50
4.5 placing in a PCR instrument to perform the following reactions: 15min at 37 ℃; 95 ℃ for 3min (reaction at 95 ℃ for 3min was aimed at denaturing the Klenow Fragment enzyme). A second sequencing linker (with molecular tag) was formed at a final concentration of 100 pmol/. mu.L. And after the reaction is finished, naturally cooling to room temperature for later use.
The prepared product can be stored in the environment of minus 20 ℃ for a long time or stored in the environment of 4 ℃ for 8 hours.
5. Preparation of a second sequencing adapter (without molecular tag)
5.1 the following reaction system was prepared in a 200. mu.L PCR tube:
TABLE 9 second sequencing (without molecular tag) linker annealing System
Components Sample addition amount (μ L)
IS1(500μM) 20
IS1-RC(500μM) 20
TE buffer 9.5
NaCl(5M) 0.5
Total volume 50
5.2 annealing reaction conditions: 95 ℃ for 10 seconds; the temperature was slowly reduced to 14 ℃ at a rate of RAMP 4% (0.1 ℃/s).
5.3 Add 50. mu.L of TE buffer to the above reaction product (50. mu.L) to obtain a second sequencing linker at a final concentration of 100 pmol/. mu.L, i.e., 100. mu.M.
The prepared product can be stored in the environment of minus 20 ℃ for a long time or stored in the environment of 4 ℃ for 8 hours.
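The adapter concentrations quoted in the annealing steps follow from simple dilution arithmetic (note that 1 μM equals 1 pmol/μL). The sketch below reproduces the numbers from the annealing tables and from step 5.3; the function names are illustrative assumptions.

```python
def concentration_after_mixing(stock_um: float, stock_ul: float,
                               total_ul: float) -> float:
    """Concentration (in μM, i.e. pmol/μL) of a stock after dilution into total_ul."""
    return stock_um * stock_ul / total_ul


def concentration_after_adding(conc_um: float, volume_ul: float,
                               added_ul: float) -> float:
    """Concentration after adding diluent (e.g. TE buffer) to an existing volume."""
    return conc_um * volume_ul / (volume_ul + added_ul)


# Annealing: 20 μL of each 500 μM oligo in a 50 μL reaction -> 200 μM duplex.
annealed = concentration_after_mixing(500, 20, 50)
# Step 5.3: 50 μL of annealed adapter plus 50 μL of TE buffer -> 100 μM.
final = concentration_after_adding(annealed, 50, 50)
# Fill-in reactions: 25 μL of 200 μM precursor in a 50 μL reaction -> 100 μM.
filled_in = concentration_after_mixing(annealed, 25, 50)
print(annealed, final, filled_in)
```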
6. Each of the biotin-modified target gene probes (first primers) at the 5' -end in Table 3 was mixed in equimolar amounts to a final concentration of 200 pmol/. mu.L.
7. Annealing and extension of the first primer
For each single-tube reaction, the number of target gene sites to be detected can range from 1 to 10,000, and each site corresponds to a first primer with a specific target gene binding region, so up to 10,000 probes can be mixed in a single-tube reaction. The number of target gene detection sites in this example is 8 (see Table 1); the first primers for these 8 sites are shown in Table 3, with 2 primers per site and 16 primers in total. These 16 first primers were mixed in equimolar amounts to obtain the 200 μM first primer mixture required in this example.
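As a rough illustration of what equimolar pooling implies, the following sketch assumes that 200 μM is the total primer concentration of the mixture (the text does not say whether the figure is total or per primer) and that 4 μL of the mixture is used per reaction, as listed in Table 10.

```python
def per_primer_concentration(pool_total_um: float, n_primers: int) -> float:
    """Concentration of each primer in an equimolar pool, given the pool total."""
    return pool_total_um / n_primers


def picomoles(conc_um: float, volume_ul: float) -> float:
    """Amount in pmol, since 1 μM equals 1 pmol/μL."""
    return conc_um * volume_ul


each = per_primer_concentration(200, 16)   # assumption: 200 μM is the pool total
print(each)                                # μM of each of the 16 primers
print(picomoles(200, 4))                   # total pmol of primer in 4 μL
print(picomoles(each, 4))                  # pmol of each primer in 4 μL
```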
The following reaction system was prepared in a 200 microliter PCR tube (the reaction uses the multiplex PCR reagent [Figure BDA0002957657160000102] AmpSeq Multi-PCR Module V2, purchased from Nanjing Vazyme Biotech Co., Ltd.):
TABLE 10
Components Volume of
First primer mixture 4μL
5' phosphorylated single-stranded cfDNA 20μL
5×VAHTS Multi-PCR Mix 6μL
Total volume 30μL
Vortex to mix, centrifuge briefly, and place in the PCR instrument for the following reaction:
The multiplex first primer mixture anneals and extends on the target regions of the cfDNA sample under the following conditions in the PCR instrument: 95 ℃ for 3 minutes; 55 ℃ for 60 seconds; 72 ℃ for 5 minutes. After the reaction, the tube is allowed to cool naturally to room temperature for later use.
8. First sequencing linker ligation reaction
The following reaction system was prepared directly in a 200. mu.l PCR tube containing the product from the previous step:
TABLE 11
Components Volume of
First primer extension product 30μL
2×Rapid Ligation Buffer 32μL
First sequencing adapter (100 μM) 1μL
T4 DNA Ligase(Rapid)(600U/μL) 1μL
Total volume 64μL
The following reactions were performed in a PCR instrument: the reaction was carried out at 37 ℃ for half an hour, which is a ligation reaction, and then at 95 ℃ for 1 to 10 minutes (2 minutes in this example), to inactivate T4DNA ligase. And after the reaction is finished, naturally cooling to room temperature for later use.
9. Streptavidin magnetic bead binding and purification of first sequencing linker ligation products
Take 3 to 20 microliters (5 microliters in this example) of Dynabeads™ MyOne™ Streptavidin C1 magnetic beads (stored in the refrigerated compartment of a refrigerator and equilibrated at room temperature for half an hour before use) in a PCR tube, place the tube on a magnetic rack for 1 to 10 minutes (5 minutes in this example), and, once all the beads have been collected by the rack, aspirate and discard the supernatant. Add all of the first sequencing adapter ligation product from the previous step to the PCR tube containing the beads, vortex thoroughly to resuspend the beads evenly, fix the tube on a four-dimensional rotary mixer, and mix by rotation at room temperature for 5 to 60 minutes (20 minutes in this example). Then place the tube on the magnetic rack for 1 to 10 minutes (5 minutes in this example). Once all the beads have been collected by the rack (the biotin at the 5' end of the first primer extension product binds tightly, though non-covalently, to the streptavidin coating the beads, so the first primer extension product is collected with them), aspirate and discard the supernatant, leaving the PCR tube on the magnetic rack.
10. Cleavage of the uracil-containing primer
Directly adding the following reaction system into a PCR tube containing magnetic beads:
TABLE 12
Figure BDA0002957657160000101
Fully vortex the magnetic beads for uniform resuspension, centrifuge briefly, place in the PCR instrument for the following reaction: 15min at 25 ℃; 65 ℃ for 10 min. After the reaction, the PCR tube was left on the magnetic stand and left to stand for 1 to 10 minutes (5 minutes in this example), and after all the magnetic beads to which the cleavage products were bound were collected by the magnetic stand, the supernatant was aspirated and transferred to a new PCR tube.
11. Blunting of the first primer extension product (excision of the 3' overhanging single-stranded sequence, i.e., the 3' overhang):
The following reagents were added directly to the PCR tube from the previous step:
TABLE 13
Components Volume of
T4 DNA polymerase(3U/μL) 1μL
10×Blue Buffer 6μL
ULtraPure DNase/RNase-Free Distilled Water 3μL
First primer extension product (after uracil cleavage) 50μL
Total volume 60μL
The total volume of the reaction system is 60 microliters, the reaction system is vortexed, mixed and centrifuged for a short time, and then the mixture is placed in a PCR instrument for the following reaction: at 25 ℃ for 10 min; 75 ℃ for 10 min. And after the reaction is finished, naturally cooling to room temperature for later use.
12. Second sequencing linker (without molecular tag) ligation reaction
The following reaction system was prepared directly in the 200 μ l PCR tube of the previous step:
TABLE 14
Components Volume of
First primer extension product (after blunt end) 60μL
10×Ligation Buffer 8μL
Second sequencing adapter (without molecular tag) (100. mu.M) 2μL
ULtraPure DNase/RNase-Free Distilled Water 8μL
T4DNA Ligase(Rapid)(600U/μL) 2μL
Total volume 80μL
The total reaction volume was 80. mu.l, and the reaction was carried out in a PCR apparatus at 37 ℃ for half an hour (linker ligation) and at 95 ℃ for 2 minutes (T4 DNA Ligase denaturation inactivation). And after the reaction is finished, naturally cooling to room temperature for later use.
13. The second sequencing adapter ligation product was divided into two equal portions of 40 microliters each (each portion was subjected to PCR and purification separately, and the purified products were finally combined). PCR was carried out by adding a second primer containing a sequence complementary to the reverse strand of the first sequencing adapter and a third primer containing a sequence complementary to the reverse strand of the second sequencing adapter, to obtain the complete library (Illumina indexing PCR). Specifically, the reaction system is as follows:
TABLE 15
Components Volume of
Second sequencing adapter ligation product (one of the two portions) 40μL
UltraPure™ DNase/RNase-Free Distilled Water 8μL
P7-index-1 0.1μL
P7-index-2 0.1μL
P7-index-3 0.1μL
P7-index-4 0.1μL
P7-index-5 0.1μL
P7-index-6 0.1μL
P7-index-7 0.1μL
P7-index-8 0.1μL
P7-index-9 0.1μL
P7-index-10 0.1μL
IS4_indPCR.P5 1μL
VAHTS HiFi Amplification Mix 50μL
Total volume 100μL
A100 microliter reaction system is prepared according to the above table to carry out PCR, and the reaction conditions are as follows:
TABLE 16
Figure BDA0002957657160000111
After the reaction, the product was purified with VAHTS DNA Clean Beads following the standard procedure for bead purification of PCR products, with the final product eluted in 22.5 microliters of ultrapure water. The two eluates from the same sample were combined to form the Illumina target gene library with P7-end sample tags.
Comparative example 1
This comparative example provides a hybridization capture control experiment.
The Jingliang Gene lung cancer ctDNA standard set GW-OCTM009, which contains a wild-type DNA standard and a ctDNA standard with a mutation frequency of 0.1%, was mixed at a wild-type standard : ctDNA standard ratio of 7:3 to form 60ng of DNA sample with a mutation frequency of 0.03%. The list of captured genes is as follows: NRAS, KRAS, PIK3CA, EGFR. The hybridization capture probes were custom-synthesized by Nanjing GenScript Biotech Co., Ltd. according to this gene list (following the general design approach for hybridization capture probes, they cover all exon regions of the listed genes; this is a custom product with no catalog number), and the capture region of these probes completely covers the genomic regions covered by all the probes in example 1. Library construction was carried out with the library construction and hybridization capture kit provided by GenScript according to its standard operating procedure, including pre-capture amplification, hybridization capture and post-capture amplification, and the library was submitted for sequencing.
Comparative example 2
This comparative example provides an amplicon banking control experiment based on multiplex PCR technology.
The Jingliang Gene lung cancer ctDNA standard set GW-OCTM009, which contains a wild-type DNA standard and a ctDNA standard with a mutation frequency of 0.1%, was mixed at a wild-type standard : ctDNA standard ratio of 7:3 to form 60ng of DNA sample with a mutation frequency of 0.03%. The list of target genes is as follows: NRAS, KRAS, PIK3CA, EGFR. The multiplex PCR probe set was custom-synthesized by Nanjing GenScript Biotech Co., Ltd. according to this target gene list (following the general design approach for amplicon library construction, it covers all exon regions of the listed genes; this is a custom product with no catalog number), and the target region of the probe set completely covers the genomic regions covered by all the probes in example 1. Amplicon library construction was carried out with the amplicon library construction kit provided by GenScript according to its standard operating procedure, and the library was submitted for sequencing.
Sequencing on machine
The concentrations of the products of example 1, comparative example 1 and comparative example 2 were measured with a Qubit 4 fluorometer, and 20ng of each was submitted for sequencing. Instrument model: Illumina HiSeq 4000; strategy: PE150; data volume: 1Gb per sample.
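The data volume translates into a read budget by simple division. The sketch below assumes that 1Gb means 10^9 bases and that every read is exactly 150 bases long; under those assumptions the budget is close to the total read count reported for example 1 in Table 17.

```python
def reads_for_yield(yield_bases: int, read_length: int = 150) -> int:
    """Number of reads of a fixed length needed to reach a given base yield."""
    return yield_bases // read_length


target_bases = 1_000_000_000          # assumption: 1Gb = 1e9 bases
reads = reads_for_yield(target_bases)
print(reads)                          # individual reads
print(reads // 2)                     # read pairs for paired-end sequencing

# Total reads reported in Table 17, converted back to bases for comparison.
reported_total_reads = 6664256
print(reported_total_reads * 150)
```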
Sequencing data quality control and analysis process
Raw data were processed with fastp; reads were aligned to the genome with BWA (Burrows-Wheeler Aligner, BWA-MEM algorithm) against the GRCh38 reference genome (also known as hg38, the internationally used human reference genome sequence); and duplicates were marked with sambamba (markdup).
Analysis results
The sequencing reads of the library constructed in example 1 were split by the 10 indexes; the resulting read counts (reads) are shown in the following table:
TABLE 17
Index No. Number of reads Proportion
1 671004 10.07%
2 659283 9.89%
3 669175 10.04%
4 660547 9.91%
5 668647 10.03%
6 660289 9.91%
7 666729 10.00%
8 663625 9.96%
9 670091 10.06%
10 666824 10.01%
Number of reads that cannot be listed in index 8042 0.12%
Total reads number 6664256 100.00%
As can be seen from the above table, the distribution bias of the read counts among the indexes is low (the numbers of reads demultiplexed for each index are similar), and the reads that cannot be assigned to any index account for only 0.12% of the total reads. This indicates that the index amplification system with the sample tag at the P7 end used in Example 1 can accurately perform pooled target gene library construction and sequencing for multiple samples.
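The proportions in Table 17 can be recomputed from the read counts. The following sketch uses the counts reported in the table; the function name `demux_summary` is a hypothetical helper introduced only for illustration.

```python
from typing import Dict, Tuple

# Read counts per index as reported in Table 17
index_reads: Dict[int, int] = {
    1: 671004, 2: 659283, 3: 669175, 4: 660547, 5: 668647,
    6: 660289, 7: 666729, 8: 663625, 9: 670091, 10: 666824,
}
unassigned_reads = 8042


def demux_summary(reads: Dict[int, int], unassigned: int) -> Tuple[int, Dict[int, float], float]:
    """Return total reads, the share of each index, and the unassigned share."""
    total = sum(reads.values()) + unassigned
    shares = {idx: n / total for idx, n in reads.items()}
    return total, shares, unassigned / total


total, shares, unassigned_share = demux_summary(index_reads, unassigned_reads)
for idx, share in shares.items():
    print(f"index {idx}: {share:.2%}")
print(f"unassigned: {unassigned_share:.2%} of {total} reads")
```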
The mutation detection results were as follows:
TABLE 18
Figure BDA0002957657160000121
Figure BDA0002957657160000131
In the above table, raw base refers to the amount of raw data.
GC content is the proportion of guanine (G) and cytosine (C) bases among all bases.
Q30 represents the proportion of reads with an accuracy of 99.9% among the total reads.
depth in target refers to the sequencing depth at the target site.
ref_reads represents the number of reads matching the human reference genome at the site.
alt_reads represents the number of reads carrying the mutation (variant).
MAF (mutant allele frequency) is the mutation frequency, specifically the ratio of alt_reads to ref_reads.
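The MAF definition given above can be written as a one-line calculation. The sketch below follows the definition as stated in this document (alt_reads divided by ref_reads); the read counts in the usage line are hypothetical and are not taken from Table 18. Many analysis pipelines divide by total depth (alt_reads + ref_reads) instead, which is a different convention from the one stated here.

```python
def maf(alt_reads: int, ref_reads: int) -> float:
    """Mutation frequency as defined in the text: alt_reads / ref_reads."""
    if ref_reads <= 0:
        raise ValueError("ref_reads must be positive")
    return alt_reads / ref_reads


# Hypothetical counts, for illustration only
example_maf = maf(alt_reads=3, ref_reads=10000)
print(f"MAF = {example_maf:.4%}")
```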
As can be seen from the above table, the sequencing data quality of the library constructed in Example 1 is higher than that of the libraries constructed by the other two existing technologies; specifically, the Q30 proportion is higher. In addition, the target gene mutation frequency detected with the library constructed in Example 1 is closer to the true value, i.e., closer to the preset value of 0.03%. Therefore, the library construction method of Example 1 performs better and takes less time when sequencing and detecting specific target genes of complex genomes such as the human genome: hybrid capture library construction takes 72-80 hours and amplicon library construction takes 24-32 hours, whereas Example 1 takes only 9 hours. Example 1 also requires fewer steps and fewer reagents and consumables, and has a low cost. In summary, the library construction method of Example 1 has broader application in clinical testing, molecular medicine research and genome science research.
In Example 1, a second sequencing adapter with a molecular tag and a third primer without a sample tag are used, and the resulting target gene library is a sequencing library with double-ended molecular tags and a single-ended sample tag. In practice, a second sequencing adapter without a molecular tag and a third primer with a sample tag can be selected as needed, giving a sequencing library with a single-ended molecular tag and double-ended sample tags. Double-ended molecular tags can improve the accuracy of identifying ultra-low-frequency mutations. Double-ended sample tags can, in theory, better reduce molecular tag crosstalk between samples when the libraries are demultiplexed in multi-sample pooled sequencing, improving the accuracy of data demultiplexing and the effective data utilization of each sample.
Example 2
In this example, DNA extracted from formalin-fixed, paraffin-embedded (FFPE) tissue standards (purchased from qian xian gene science and technology (shenzhen) limited, comprising a tumor wild-type FFPE standard and a tumor SNV 5% FFPE standard) is used to prepare a tumor mutation standard with a mutation frequency of five ten-thousandths (0.05%). Three equal portions of 300 ng of the DNA standard are used in three independent library construction experiments, in which the sequencing libraries are prepared by the method of this example, the existing hybrid capture library construction method and the existing amplicon library construction method, respectively. The target gene regions designed for the three experiments are essentially identical. The libraries are then sequenced on the same high-throughput sequencing platform to the same data volume. Finally, the same 7 target gene mutation sites (distributed in the exon regions of 4 genes: NRAS, KRAS, PIK3CA and EGFR) are examined with the same data analysis pipeline to evaluate the performance differences among the three high-throughput sequencing target gene library construction methods.
In this embodiment, a library of an international common Illumina sequencing platform is taken as an example, and other high-throughput sequencing platform processes are also applicable to the invention, except that a sequencing linker sequence needs to be correspondingly replaced.
Experimental materials and equipment:
The DNA standards were the tumor wild-type FFPE standard (mutation frequency 0, cat. no. GW-OPSM005) and the tumor SNV 5% FFPE standard (cat. no. GW-OPSM003) from Jinglian GeneImmunol technology (Shenzhen).
The FFPE standard substance is extracted by adopting a magnetic bead method paraffin-embedded tissue DNA extraction kit (product number: D6323-02B) of Guangzhou Meiji biological science and technology Limited.
The FFPE total DNA is fragmented (i.e., long total DNA fragments of 10 kb or more are cut into short fragments of 200-500 bp) using the KAPA Frag Kit for Enzymatic Fragmentation (cat. no. KK8600) from Roche Diagnostics, Inc.
The target detection sites are as follows:
TABLE 19
Figure BDA0002957657160000141
The desired oligomers (oligos) are as follows:
TABLE 20
Figure BDA0002957657160000142
Figure BDA0002957657160000151
Table 21: target gene probe (first primer) with biotin modification at 5' end
Figure BDA0002957657160000152
Notes: (1) "Biotin-" represents a biotin label.
(2) IS2-RC-N and IS2 anneal into a double-stranded DNA molecule, i.e., the first sequencing adapter; IS1-RC-N (or IS1-RC) and IS1 anneal into a double-stranded DNA molecule, i.e., the second sequencing adapter.
(3) "N" represents a random base.
(4) "X" represents a 20-nucleotide sequence complementary to the target gene region; one such sequence is placed every 10 nucleotides along the target gene region, i.e., 2× tiled coverage.
(5) "Pho" represents a phosphate group.
(6) "U" represents a deoxynucleotide with uracil, i.e., dUTP.
(7) In the P7-end primer with sample tag 1, the underlined sequence "tgaag" is the sample tag.
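Note (4) describes a tiling design: 20-nucleotide target-binding sequences placed every 10 nucleotides. The following sketch shows why this gives 2× coverage in the interior of a region. The region length of 60 nt is a hypothetical value chosen for illustration, and the function names are not from the original.

```python
from typing import List


def tile_starts(region_len: int, probe_len: int = 20, step: int = 10) -> List[int]:
    """0-based start positions of probes tiled every `step` nt within the region."""
    return list(range(0, region_len - probe_len + 1, step))


def tiling_depth(region_len: int, probe_len: int = 20, step: int = 10) -> List[int]:
    """Number of probes covering each position of the region."""
    depth = [0] * region_len
    for start in tile_starts(region_len, probe_len, step):
        for pos in range(start, start + probe_len):
            depth[pos] += 1
    return depth


starts = tile_starts(60)
depth = tiling_depth(60)
print("probe starts:", starts)
print("per-position depth:", depth)
```

Only the first and last 10 positions of the region are covered once; every interior position is covered by two overlapping probes.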
For 5'-terminal phosphorylation of each DNA template, T4 Polynucleotide Kinase (10 U/μL) was used (purchased from England Weiji (Shanghai) trade Co., Ltd., cat. no. EK0031).
The first sequencing adapter was prepared using DNA polymerase I Klenow Fragment (5 U/μL), cat. no. N104-01, available from Biotech, Inc. of Nanjing Novozam.
When the first primer is extended, the method uses
Figure BDA0002957657160000153
AmpSeq Multi-PCR Module V2, cat #: NA205-01, available from Nanjing Novozam Biotech, Inc.
T4 DNA Ligase (Rapid) is used in each adapter ligation reaction, cat. no. N103-01, available from Biotech, Inc. of Nanjing Novozam.
The dUTP-containing first primer extension product is cleaved using Thermolabile
Figure BDA0002957657160000154
II Enzyme, cat #: M5508S, available from Biotechnology, N.Y. (Beijing) Inc.
The single-stranded portion protruding from the 3' end of the original template strand paired with the first primer extension product is excised using T4 DNA polymerase, cat. no. N101-01, available from Biotech, Inc. of Nanjing Novozam.
Library amplification reactions were performed using VAHTS HiFi Amplification Mix, cat. no. N616-01, available from Biotech, Inc. of Nanjing Novozam.
The magnetic beads for PCR product purification were VAHTS DNA Clean Beads, cat. no. N411-01, available from Biotech, Inc. of Nanjing Novozam.
The streptavidin magnetic beads used for binding the first primer extension product are Dynabeads™ MyOne™ Streptavidin C1, purchased from England Weiji (Shanghai) trade, Inc., cat. no. 65001.
The ultrapure water used in each step of the experiment is UltraPure™ DNase/RNase-Free Distilled Water, purchased from England Weiji (Shanghai) trade, Inc., cat. no. 10977023.
Instruments: ABI Veriti 96 PCR instrument (Yinxie Jie based (Shanghai) trade Co., Ltd.), constant-temperature mixer (Hangzhou Yongning, cat. no. HC-100), four-dimensional rotating mixer (Kangmen's Lin Beier Instrument manufacturing Co., Ltd., BE-1100), magnetic rack (Wuxi Baige Biotech Co., Ltd., cat. no. BMB16-1.5-2), Qubit™ 4 Fluorometer with WiFi (Yinxi Weijie trading Limited, cat. no. Q33238), Bioptic fully automatic multiplex nucleic acid detection system (Hangzhou Kzegaku Biotech Limited, cat. no. Qsep-100), and Eppendorf pipettes (1000 μL, 100 μL, 10 μL) purchased from Eppendorf, Germany.
The steps of this example are as follows:
1. Total DNA is extracted from the tumor wild-type FFPE standard (mutation frequency 0, cat. no. GW-OPSM005) and the tumor SNV 5% FFPE standard (cat. no. GW-OPSM003), both purchased from Jingzhen Genesis Limited, using the magnetic bead method paraffin-embedded tissue DNA extraction kit (cat. no. D6323-02B) purchased from Guangzhou Meiji Biotech Limited. The extraction follows the standard operating procedure of the kit, and the DNA extract is finally eluted in a volume of 50 μL.
2. The concentrations were measured with the Qubit 4.0: the wild-type FFPE DNA and the 5% SNV FFPE DNA were 15.54 ng/μL and 14.78 ng/μL, respectively, giving total amounts of 777 ng and 739 ng. 297 ng of tumor wild-type FFPE standard DNA and 3 ng of tumor SNV 5% FFPE standard DNA were taken and mixed (i.e., at a mass ratio of 99:1) to form a 300 ng FFPE DNA sample with a mutation frequency of 0.05%, then vortexed to mix well.
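The mixing arithmetic in step 2 can be checked with a short calculation. The sketch assumes that mutation frequency scales linearly with the mass fraction of each standard (that is, both standards contribute the same number of genome copies per nanogram); the function names are hypothetical.

```python
from typing import List, Tuple


def blend_mutation_frequency(components: List[Tuple[float, float]]) -> float:
    """Mass-weighted mutation frequency of a blend of (mass_ng, frequency) pairs."""
    total_mass = sum(mass for mass, _ in components)
    return sum(mass * freq for mass, freq in components) / total_mass


def volume_needed_ul(mass_ng: float, conc_ng_per_ul: float) -> float:
    """Volume of stock required to deliver the given DNA mass."""
    return mass_ng / conc_ng_per_ul


# Values from step 2: 297 ng wild type (0%) plus 3 ng SNV 5% standard
blend = blend_mutation_frequency([(297.0, 0.0), (3.0, 0.05)])
wt_volume = volume_needed_ul(297.0, 15.54)
snv_volume = volume_needed_ul(3.0, 14.78)
print(f"blend frequency: {blend:.4%}")
print(f"wild-type stock: {wt_volume:.2f} uL, SNV 5% stock: {snv_volume:.3f} uL")
```

The same function reproduces the 7:3 mix of Example 1's comparative samples: `blend_mutation_frequency([(7, 0.0), (3, 0.001)])` gives 0.03%.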
3. The product of the previous step is placed in a 200 μL PCR tube and enzymatically fragmented with the KAPA Frag Kit for Enzymatic Fragmentation; after fragmentation, the fragment lengths range from 200 to 600 bp, with a main peak at about 300 bp. The product is purified with 0.9 volumes of VAHTS DNA Clean Beads (cat. no. N411-01) according to the standard operating procedure and finally eluted with 23.5 μL of pure water. 1 μL of the purified product is taken and its concentration is measured with the Qubit 4.0; the result is 7.28 ng/μL, i.e., 163.8 ng of fragmented and purified FFPE sample DNA.
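The 163.8 ng figure in step 3 follows from the measured concentration and the volume remaining after the 1 μL used for quantification. A minimal sketch of that calculation (the function name is hypothetical):

```python
def recovered_mass_ng(conc_ng_per_ul: float, elution_ul: float, qc_ul: float) -> float:
    """DNA mass remaining after removing the aliquot used for quantification."""
    return conc_ng_per_ul * (elution_ul - qc_ul)


# Values from step 3: 7.28 ng/uL, 23.5 uL eluate, 1 uL used for Qubit
mass = recovered_mass_ng(7.28, 23.5, 1.0)
recovery = mass / 300.0  # relative to the 300 ng input from step 2
print(f"remaining DNA: {mass:.1f} ng ({recovery:.1%} of input)")
```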
4. Carrying out phosphorylation reaction on the DNA sample:
The following reaction system is prepared in the same 200 μL PCR tube that holds the product of the previous step:
TABLE 22
| Component | Volume |
| --- | --- |
| Purified fragmentation product | 22.5 μL |
| 10× reaction buffer A (T4 PNK) | 3 μL |
| ATP (10 mM) | 3 μL |
| T4 Polynucleotide Kinase | 1.5 μL |
| Total volume | 30 μL |
The tube is placed in a PCR instrument for the following reactions: 37 ℃ for 20 min; 95 ℃ for 3 min. The 3 minutes at 95 ℃ denature and inactivate the enzyme in the reaction system and at the same time dissociate the DNA molecules into single strands. The tube is then placed on ice for 5 minutes to prevent renaturation into double strands, giving 30 μL of 5'-phosphorylated DNA sample. 1 μL is taken and its concentration is measured with the Qubit 4.0; the result is 5.42 ng/μL.
5. Preparation of the first sequencing adapter (with molecular tag)
5.1 the following reaction system was prepared in a 200. mu.L PCR tube:
TABLE 23 First sequencing adapter preparation system
| Component | Volume added (μL) |
| --- | --- |
| IS2 (500 μM) | 20 |
| IS2-RC-N (500 μM) | 20 |
| TE buffer | 9.5 |
| NaCl (5 M) | 0.5 |
| Total volume | 50 |
5.2 placing the mixture in a PCR instrument for carrying out the following reactions: 95 ℃ for 10 s; the temperature was slowly reduced to 14 ℃ at a RAMP 4% (0.1 ℃/s) rate.
5.3 formation of the first sequencing linker precursor at a concentration of 200 pmol/. mu.L.
5.4 prepare the following reaction system in a 200. mu.L PCR tube:
TABLE 24
| Component | Volume added (μL) |
| --- | --- |
| Blue buffer (10×) | 5 |
| dNTP (10 mM each) | 7.5 |
| UltraPure™ DNase/RNase-Free Distilled Water | 7.5 |
| Klenow Fragment (5 U/μL) | 2.5 |
| Tween 20 (1%) | 2.5 |
| First sequencing linker precursor (200 pmol/μL) | 25 |
| Total volume | 50 |
5.5 The tube is placed in a PCR instrument for the following reactions: 37 ℃ for 15 min; 95 ℃ for 3 min (denaturation of the Klenow Fragment enzyme). This forms the first sequencing linker at a final concentration of 100 pmol/μL. After the reaction, the tube is allowed to cool naturally to room temperature for later use.
The prepared product can be stored long-term at -20 ℃ or at 4 ℃ for 8 hours.
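The adapter concentrations quoted in step 5 follow from simple dilution (C1·V1 = C2·V2), using the volumes in Tables 23 and 24 and the identity 1 μM = 1 pmol/μL. A minimal sketch, with a hypothetical function name:

```python
def diluted_conc(stock_conc: float, stock_volume_ul: float, total_volume_ul: float) -> float:
    """Concentration after diluting a stock into a larger total volume (C1*V1 = C2*V2)."""
    return stock_conc * stock_volume_ul / total_volume_ul


# Annealing: 20 uL of each 500 uM oligo in a 50 uL system
precursor_um = diluted_conc(500.0, 20.0, 50.0)
# Klenow fill-in: 25 uL of precursor in a 50 uL system
final_um = diluted_conc(precursor_um, 25.0, 50.0)
print(f"precursor: {precursor_um:.0f} pmol/uL, final adapter: {final_um:.0f} pmol/uL")
```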
6. Preparation of a second sequencing adapter (with molecular tag)
6.1 the following reaction system was prepared in a 200. mu.L PCR tube:
TABLE 25
Figure BDA0002957657160000171
6.2 annealing reaction, conditions are as follows: 95 ℃ for 10 seconds; the temperature was slowly reduced to 14 ℃ at a RAMP 4% (0.1 ℃/s) rate.
6.3 formation of a second sequencing linker (with molecular tag) precursor at a concentration of 200 pmol/. mu.L.
6.4 the following reaction system was prepared in a 200. mu.L PCR tube:
TABLE 26
| Component | Volume added (μL) |
| --- | --- |
| Blue buffer (10×) | 5 |
| dNTP (10 mM each) | 7.5 |
| UltraPure DNase/RNase-Free Distilled Water | 7.5 |
| Klenow Fragment (5 U/μL) | 2.5 |
| Tween 20 (1%) | 2.5 |
| Second sequencing linker (with molecular tag) precursor (200 pmol/μL) | 25 |
| Total volume | 50 |
6.5 The tube is placed in a PCR instrument for the following reactions: 37 ℃ for 15 min; 95 ℃ for 3 min (denaturation of the Klenow Fragment enzyme). This forms the second sequencing linker (with molecular tag) at a final concentration of 100 pmol/μL. After the reaction, the tube is allowed to cool naturally to room temperature for later use.
The prepared product can be stored long-term at -20 ℃ or at 4 ℃ for 8 hours.
7. Preparation of a second sequencing adapter (without molecular tag)
7.1 the following reaction system was prepared in a 200. mu.L PCR tube:
TABLE 27 Second sequencing linker (without molecular tag) annealing system
| Component | Volume added (μL) |
| --- | --- |
| IS1 (500 μM) | 20 |
| IS1-RC (500 μM) | 20 |
| TE buffer | 9.5 |
| NaCl (5 M) | 0.5 |
| Total volume | 50 |
7.2 annealing reaction conditions: 95 ℃ for 10 seconds; the temperature was slowly reduced to 14 ℃ at a RAMP 4% (0.1 ℃/s) rate.
7.3 Add 50. mu.L of TE buffer to the above reaction product (50. mu.L) to obtain a second sequencing linker at a final concentration of 100 pmol/. mu.L, i.e., 100. mu.M.
The prepared product can be stored long-term at -20 ℃ or at 4 ℃ for 8 hours.
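Annealing two oligonucleotides into a double-stranded adapter requires them to be reverse complements of each other. The sketch below checks this for SEQ ID NO: 3 and SEQ ID NO: 5 of the sequence listing. Mapping these entries to IS1 and IS1-RC is an assumption based on their sequences, since the oligo table (Table 20) is only available as an image; SEQ ID NO: 4 is likewise assumed to be the molecular-tag variant (IS1-RC-N).

```python
# Complement table for lowercase/uppercase DNA, keeping 'n' as a wildcard
_COMPLEMENT = str.maketrans("acgtnACGTN", "tgcanTGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# Sequences copied from the sequence listing (spaces removed)
SEQ_ID_3 = "acactctttccctacacgacgctcttccgatct"           # assumed to be IS1
SEQ_ID_5 = "agatcggaagagcgtcgtgtagggaaagagtgt"           # assumed to be IS1-RC
SEQ_ID_4 = "nnnnnnnnnagatcggaagagcgtcgtgtagggaaagagtgt"  # assumed to be IS1-RC-N

is_pair = reverse_complement(SEQ_ID_3) == SEQ_ID_5
tag_length = len(SEQ_ID_4) - len(SEQ_ID_5)
print("SEQ 3 and SEQ 5 are reverse complements:", is_pair)
print("random-base prefix length in SEQ 4:", tag_length)
```

The 9-nt run of `n` at the start of SEQ ID NO: 4 is consistent with a random molecular tag placed ahead of the adapter-complementary part.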
8. Each of the biotin-modified target gene probes (first primers) at the 5' -end in Table 21 was mixed in equimolar amounts to a final concentration of 200 pmol/. mu.L.
9. Annealing and extension of the first primer
For each single-tube reaction, the number of target gene sites to be detected can range from 1 to 10,000, and each site corresponds to a first primer with a specific target gene binding region, so up to 10,000 probes can be mixed in each single-tube reaction. The number of target gene detection sites in this example is 7, as shown in Table 19; the first primers for these 7 target gene sites are shown in Table 21, with 2 primers per site, for a total of 14. These 14 first primers were mixed in equimolar amounts to obtain the first primer mixture at the final concentration of 200 μM required in this example.
The following reaction system is prepared in a 200 μL PCR tube (the reaction uses the multiplex PCR reagent purchased from Nanjing NuoZan Biotechnology GmbH,
Figure BDA0002957657160000172
AmpSeq Multi-PCR Module V2):
TABLE 28
| Component | Volume |
| --- | --- |
| First primer mixture | 4 μL |
| Fragmented and phosphorylated FFPE DNA sample (5.42 ng/μL) | 28 μL |
| 5× VAHTS Multi-PCR Mix | 8 μL |
| Total volume | 40 μL |
Vortex to mix, centrifuge briefly, and place the tube in a PCR instrument for the following reaction:
The first primer mixture for the multiple target gene sites is annealed and extended on the target regions of the genome under the following conditions: 95 ℃ for 3 minutes; 55 ℃ for 60 seconds; 72 ℃ for 5 minutes. After the reaction, the tube is allowed to cool naturally to room temperature for later use.
10. First sequencing linker ligation reaction
The following reaction system was prepared directly in a 200. mu.l PCR tube containing the product from the previous step:
TABLE 29
| Component | Volume |
| --- | --- |
| First primer extension product | 40 μL |
| 2× Rapid Ligation Buffer | 50 μL |
| First sequencing linker (100 μM) | 5 μL |
| T4 DNA Ligase (Rapid) (600 U/μL) | 5 μL |
| Total volume | 100 μL |
The total reaction volume is 100 μL. The following reactions are carried out in a PCR instrument: 37 ℃ for 30 minutes (ligation), then 95 ℃ for 1 to 10 minutes (2 minutes in this example) to denature and inactivate the ligase. After the reaction, the tube is allowed to cool naturally to room temperature for later use.
11. Streptavidin magnetic bead binding and purification of first sequencing linker ligation products
Take 3 to 20 μL (5 μL in this example) of Dynabeads™ MyOne™ Streptavidin C1 magnetic beads (stored refrigerated and equilibrated at room temperature for half an hour before use), place them on a magnetic rack and let them stand for 1 to 10 minutes (5 minutes in this example); once all the magnetic beads have been collected by the magnetic rack, aspirate and discard the supernatant. Add the entire first sequencing linker ligation product from the previous step to the PCR tube containing the magnetic beads, vortex thoroughly to resuspend the beads evenly, fix the PCR tube on a four-dimensional rotating mixer, and mix by rotation at room temperature for 5 to 60 minutes (20 minutes in this example). Then place the tube on the magnetic rack and let it stand for 1 to 10 minutes (5 minutes in this example). Once all the magnetic beads have been collected by the magnetic rack (the biotin at the 5' end of the first primer extension product binds the streptavidin coated on the beads with high affinity, though not covalently, so the first primer extension product is collected as well), aspirate and discard the supernatant, leaving the PCR tube on the magnetic rack.
12. Cleavage of the primer at the uracil
Directly adding the following reaction system into a PCR tube containing magnetic beads:
TABLE 30
Figure BDA0002957657160000181
Vortex thoroughly to resuspend the magnetic beads evenly, centrifuge briefly, and place the tube in a PCR instrument for the following reactions: 25 ℃ for 15 min; 65 ℃ for 10 min (the 65 ℃ step denatures and inactivates the Thermolabile
Figure BDA0002957657160000182
II Enzyme). After the reaction, place the PCR tube on a magnetic rack and let it stand for 1 to 10 minutes (5 minutes in this example); once all the magnetic beads have been collected by the magnetic rack, aspirate the supernatant and transfer it to a new PCR tube.
13. Blunt-ending of the first primer extension product (the 3' overhang sequence is excised)
The following reagents are added directly to the PCR tube from the previous step:
TABLE 31
| Component | Volume |
| --- | --- |
| T4 DNA polymerase (3 U/μL) | 1 μL |
| 10× Blue Buffer | 6 μL |
| UltraPure DNase/RNase-Free Distilled Water | 3 μL |
| First primer extension product (after uracil cleavage) | 50 μL |
| Total volume | 60 μL |
The total reaction volume is 60 μL. Vortex to mix, centrifuge briefly, and place the tube in a PCR instrument for the following reactions: 25 ℃ for 10 min; 75 ℃ for 10 min (denaturation and inactivation of T4 DNA polymerase). After the reaction, the tube is allowed to cool naturally to room temperature for later use.
14. Second sequencing linker (without molecular tag) ligation reaction
The following reaction system was prepared directly in the 200 μ l PCR tube of the previous step:
TABLE 32
| Component | Volume |
| --- | --- |
| First primer extension product (after blunt-ending) | 60 μL |
| 10× Ligation Buffer | 8 μL |
| Second sequencing adapter (without molecular tag) (100 μM) | 2 μL |
| UltraPure DNase/RNase-Free Distilled Water | 8 μL |
| T4 DNA Ligase (Rapid) (600 U/μL) | 2 μL |
| Total volume | 80 μL |
The total reaction volume was 80. mu.l, and the following reactions were carried out in a PCR apparatus: reaction at 37 ℃ for 30min (linker attachment); reaction at 95 ℃ for 2 minutes (T4 DNA Ligase denaturation inactivation). And after the reaction is finished, naturally cooling to room temperature for later use.
15. Divide the second sequencing adapter ligation product into two aliquots of 40 μL each. To each aliquot, add a second primer containing a sequence complementary to the reverse strand of the first sequencing adapter and a third primer containing a sequence complementary to the reverse strand of the second sequencing adapter, and perform PCR to obtain the complete library (Illumina indexing PCR). The reaction system is as follows:
TABLE 33
| Component | Volume |
| --- | --- |
| Second sequencing adapter ligation product (one of the two aliquots) | 40 μL |
| UltraPure™ DNase/RNase-Free Distilled Water | 8 μL |
| P7-index-1 | 0.1 μL |
| P7-index-2 | 0.1 μL |
| P7-index-3 | 0.1 μL |
| P7-index-4 | 0.1 μL |
| P7-index-5 | 0.1 μL |
| P7-index-6 | 0.1 μL |
| P7-index-7 | 0.1 μL |
| P7-index-8 | 0.1 μL |
| P7-index-9 | 0.1 μL |
| P7-index-10 | 0.1 μL |
| IS4_indPCR.P5 | 1 μL |
| VAHTS HiFi Amplification Mix | 50 μL |
| Total volume | 100 μL |
A 100 μL reaction system is prepared according to the above table and PCR is performed under the following reaction conditions:
TABLE 34
Figure BDA0002957657160000191
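The component volumes of the indexing PCR system in Table 33 can be checked against the stated 100 μL total. The sketch below uses `Decimal` so that the ten 0.1 μL index primers add up exactly; the dictionary keys are shortened labels introduced for illustration.

```python
from decimal import Decimal

# Ten P7 index primers at 0.1 uL each, as listed in Table 33
index_primers = {f"P7-index-{i}": Decimal("0.1") for i in range(1, 11)}

components = {
    "second adapter ligation product (one aliquot)": Decimal("40"),
    "UltraPure water": Decimal("8"),
    **index_primers,
    "IS4_indPCR.P5": Decimal("1"),
    "VAHTS HiFi Amplification Mix": Decimal("50"),
}

total_ul = sum(components.values())
index_total_ul = sum(index_primers.values())
print(f"index primers: {index_total_ul} uL, reaction total: {total_ul} uL")
```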
After the reaction, the products are purified with VAHTS DNA Clean Beads following the standard bead purification procedure for PCR products, and eluted with 20 μL of ultrapure water in the final step. The two purified eluates obtained from the PCRs of each sample are combined to give 40 μL of purified product, completing the Illumina target gene library with the sample tag at the P7 end.
In this example, a second sequencing adapter without a molecular tag and a third primer without a sample tag are used, and the resulting target gene library is a sequencing library with a single-ended molecular tag and a single-ended sample tag. In some embodiments, a second sequencing adapter with a molecular tag and a third primer with a sample tag can be selected as needed, giving a sequencing library with double-ended molecular tags and double-ended sample tags; double-ended tags can improve the accuracy of identifying ultra-low-frequency mutations.
Comparative example 3
This comparative example provides a hybridization capture control experiment.
The tumor wild-type FFPE standard and the tumor SNV 5% FFPE standard purchased from qianqin gene technology (shenzhen) limited were mixed at 99:1 to form a 300 ng DNA sample with a mutation frequency of 0.05%, and fragmentation screening and magnetic bead purification were performed in the same manner as in Example 2. According to the target gene list (NRAS, KRAS, PIK3CA and EGFR), hybrid capture probes were custom-synthesized by Nanjing King Shirui biological technology GmbH (following the general design concept of hybrid capture probes, they cover all exon regions of the listed genes; they are a customized product without a catalog number). The library was constructed according to the standard operating procedure with the library construction and hybrid capture kit provided by King Shirui biological technology GmbH, including pre-capture amplification, hybrid capture and post-capture amplification, and was then sent for sequencing.
Comparative example 4
This comparative example provides an amplicon banking control experiment based on multiplex PCR technology.
The tumor wild-type FFPE standard and the tumor SNV 5% FFPE standard purchased from qianqin gene technology (shenzhen) limited were mixed at 99:1 to form a 300 ng DNA sample with a mutation frequency of 0.05%, and fragmentation screening and magnetic bead purification were performed in the same manner as in Example 2. According to the target gene list (NRAS, KRAS, PIK3CA and EGFR), a multiplex PCR probe set was custom-synthesized by Nanjing Kingshi biological science and technology Co., Ltd (following the general design concept of amplicon library construction, it covers all exon regions of the listed genes; it is a customized product without a catalog number). The amplicon library was constructed according to the standard operating procedure with the amplicon library construction kit provided by Kingshi biological science and technology Co., Ltd, and was then sent for sequencing.
Sequencing on machine
The concentrations of the library products prepared in Example 2, Comparative Example 3 and Comparative Example 4 were measured with the Qubit 4.0, and 20 ng of each was submitted for sequencing. The instrument was an Illumina HiSeq 4000, the sequencing strategy was PE150, and the data volume was 1 Gb per sample.
Sequencing data quality control and analysis process
Raw data were processed with fastp; reads were aligned to the genome with BWA (Burrows-Wheeler Alignment Tool, BWA-MEM algorithm) against the GRCh38/hg38 reference genome (the internationally used human reference genome sequence); and duplicate reads were marked with sambamba (markdup).
Analysis results
The sequencing results of the library constructed in Example 2, summarized as the number of reads demultiplexed for each of the 10 indexes, are shown in the following table:
TABLE 35
| Index No. | Number of reads | Proportion |
| --- | --- | --- |
| 1 | 664921 | 9.97% |
| 2 | 670013 | 10.04% |
| 3 | 669184 | 10.03% |
| 4 | 664555 | 9.96% |
| 5 | 668647 | 10.02% |
| 6 | 663892 | 9.95% |
| 7 | 669534 | 10.04% |
| 8 | 665389 | 9.97% |
| 9 | 668364 | 10.02% |
| 10 | 659987 | 9.89% |
| Reads not assigned to any index | 6832 | 0.10% |
| Total number of reads | 6671318 | 100.00% |
As can be seen from the above table, the distribution bias of the read counts among the indexes is low (the numbers of reads demultiplexed for each index are similar), and the reads that cannot be assigned to any index account for only 0.10% of the total reads. This indicates that the index amplification system with the sample tag at the P7 end used in Example 2 can accurately perform pooled target gene library construction and sequencing for multiple samples.
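The statement that the read counts are similar across indexes can be quantified. The sketch below computes two simple uniformity measures from the Table 35 counts: the coefficient of variation and the ratio of the largest to the smallest index. Neither metric is used in the original text; they are offered only as one way to express "low distribution bias" numerically.

```python
from statistics import mean, pstdev

# Read counts per index as reported in Table 35 (indexes 1 to 10)
table35_reads = [
    664921, 670013, 669184, 664555, 668647,
    663892, 669534, 665389, 668364, 659987,
]

# Coefficient of variation: population standard deviation relative to the mean
cv = pstdev(table35_reads) / mean(table35_reads)
# Fold difference between the most and least represented index
max_min_ratio = max(table35_reads) / min(table35_reads)

print(f"coefficient of variation: {cv:.4f}")
print(f"max/min ratio: {max_min_ratio:.4f}")
```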
The mutation detection results were as follows:
TABLE 36
Figure BDA0002957657160000201
Figure BDA0002957657160000211
As can be seen from the above table, the sequencing data quality of the library constructed in Example 2 is higher than that of the libraries constructed by the other two technologies; specifically, the Q30 proportion is higher (Q30 represents the proportion of reads with an accuracy of 99.9% among the total reads). In addition, the target gene mutation frequency (MAF) detected with the library constructed in Example 2 is closer to the true value, i.e., closer to the preset value of 0.05%. Therefore, the method performs better and takes less time when sequencing and detecting specific target genes of complex genomes such as the human genome (72-80 hours for hybrid capture library construction and 24-32 hours for amplicon library construction, versus only 9 hours for Example 2). The method also requires fewer steps and fewer reagents and consumables, and has a low cost. No purification is needed between the main experimental steps, which saves time and materials, avoids DNA loss during purification at each step, and retains the original template DNA to the greatest extent. In conclusion, the library construction method of Example 2 has broader application value in clinical testing, molecular medicine research and genome science research.
The present invention has been described in terms of specific examples, which are provided to aid understanding of the invention and are not intended to be limiting. For a person skilled in the art to which the invention pertains, several simple deductions, modifications or substitutions may be made according to the idea of the invention.
SEQUENCE LISTING
<110> Shenzhen Ruifa Biotechnology Co., Ltd.
<120> high-fidelity target gene library building method and kit thereof
<130> 20I30445
<160> 32
<170> PatentIn version 3.3
<210> 1
<211> 34
<212> DNA
<213> Artificial sequence
<400> 1
gtgactggag ttcagacgtg tgctcttccg atct 34
<210> 2
<211> 46
<212> DNA
<213> Artificial sequence
<220>
<221> misc_feature
<222> (1)..(12)
<223> n is a, c, g, or t
<400> 2
nnnnnnnnnn nnagatcgga agagcacacg tctgaactcc agtcac 46
<210> 3
<211> 33
<212> DNA
<213> Artificial sequence
<400> 3
acactctttc cctacacgac gctcttccga tct 33
<210> 4
<211> 42
<212> DNA
<213> Artificial sequence
<220>
<221> misc_feature
<222> (1)..(9)
<223> n is a, c, g, or t
<400> 4
nnnnnnnnna gatcggaaga gcgtcgtgta gggaaagagt gt 42
<210> 5
<211> 33
<212> DNA
<213> Artificial sequence
<400> 5
agatcggaag agcgtcgtgt agggaaagag tgt 33
<210> 6
<211> 60
<212> DNA
<213> Artificial sequence
<400> 6
caagcagaag acggcatacg agattgatag gtgactggag ttcagacgtg tgctcttccg 60
<210> 7
<211> 60
<212> DNA
<213> Artificial sequence
<400> 7
caagcagaag acggcatacg agattatacg gtgactggag ttcagacgtg tgctcttccg 60
<210> 8
<211> 60
<212> DNA
<213> Artificial sequence
<400> 8
caagcagaag acggcatacg agatcgatca gtgactggag ttcagacgtg tgctcttccg 60
<210> 9
<211> 60
<212> DNA
<213> Artificial sequence
<400> 9
caagcagaag acggcatacg agatatacac gtgactggag ttcagacgtg tgctcttccg 60
<210> 10
<211> 60
<212> DNA
<213> Artificial sequence
<400> 10
caagcagaag acggcatacg agatatagcg gtgactggag ttcagacgtg tgctcttccg 60
<210> 11
<211> 60
<212> DNA
<213> Artificial sequence
<400> 11
caagcagaag acggcatacg agattgttca gtgactggag ttcagacgtg tgctcttccg 60
<210> 12
<211> 60
<212> DNA
<213> Artificial sequence
<400> 12
caagcagaag acggcatacg agatagatac gtgactggag ttcagacgtg tgctcttccg 60
<210> 13
<211> 60
<212> DNA
<213> Artificial sequence
<400> 13
caagcagaag acggcatacg agattagctg gtgactggag ttcagacgtg tgctcttccg 60
<210> 14
<211> 60
<212> DNA
<213> Artificial sequence
<400> 14
caagcagaag acggcatacg agatgtatgt gtgactggag ttcagacgtg tgctcttccg 60
<210> 15
<211> 60
<212> DNA
<213> Artificial sequence
<400> 15
caagcagaag acggcatacg agatggctca gtgactggag ttcagacgtg tgctcttccg 60
<210> 16
<211> 51
<212> DNA
<213> Artificial sequence
<400> 16
aatgatacgg cgaccaccga gatctacact ctttccctac acgacgctct t 51
<210> 17
<211> 33
<212> DNA
<213> Artificial sequence
<220>
<221> misc_feature
<222> (13)..(13)
<223> n is u
<400> 17
caaggacatc cgntgatttg tagtggagaa gga 33
<210> 18
<211> 33
<212> DNA
<213> Artificial sequence
<220>
<221> misc_feature
<222> (13)..(13)
<223> n is u
<400> 18
caaggacatc cgntggcctg gcttgcttac ctt 33
<210> 19
<211> 33
<212> DNA
<213> Artificial sequence
<220>
<221> misc_feature
<222> (13)..(13)
<223> n is u
<400> 19
caaggacatc cgngcatctg cctcacctcc acc 33
<210> 20
<211> 33
<212> DNA
<213> Artificial sequence
<220>
<221> misc_feature
<222> (13)..(13)
<223> n is u
<400> 20
caaggacatc cgntccagga ggcagccgaa ggg 33
<210> 21
<211> 33
<212> DNA
<213> Artificial sequence
<220>
<221> misc_feature
<222> (13)..(13)
<223> n is u
<400> 21
caaggacatc cgnggaaact gaattcaaaa aga 33
<210> 22
<211> 33
<212> DNA
<213> Artificial sequence
<220>
<221> misc_feature
<222> (13)..(13)
<223> n is u
<400> 22
caaggacatc cgngacctta ccttatacac cgt 33
<210> 23
<211> 33
<212> DNA
<213> Artificial sequence
<220>
<221> misc_feature
<222> (13)..(13)
<223> n is u
<400> 23
caaggacatc cgngaaataa atacagatct gtt 33
<210> 24
<211> 33
<212> DNA
<213> Artificial sequence
<220>
<221> misc_feature
<222> (13)..(13)
<223> n is u
<400> 24
caaggacatc cgnaaaagga attccataac ttc 33
<210> 25
<211> 33
<212> DNA
<213> Artificial sequence
<220>
<221> misc_feature
<222> (13)..(13)
<223> n is u
<400> 25
caaggacatc cgngacgata cagctaattc aga 33
<210> 26
<211> 33
<212> DNA
<213> Artificial sequence
<220>
<221> misc_feature
<222> (13)..(13)
<223> n is u
<400> 26
caaggacatc cgnacaagtt tatattcagt cat 33
<210> 27
<211> 33
<212> DNA
<213> Artificial sequence
<220>
<221> misc_feature
<222> (13)..(13)
<223> n is u
<400> 27
caaggacatc cgntgagaga ccaatacatg agg 33
<210> 28
<211> 33
<212> DNA
<213> Artificial sequence
<220>
<221> misc_feature
<222> (13)..(13)
<223> n is u
<400> 28
caaggacatc cgntatgtcc aacaaacagg ttt 33
<210> 29
<211> 33
<212> DNA
<213> Artificial sequence
<220>
<221> misc_feature
<222> (13)..(13)
<223> n is u
<400> 29
caaggacatc cgnagaaggt gagaaagtta aaa 33
<210> 30
<211> 33
<212> DNA
<213> Artificial sequence
<220>
<221> misc_feature
<222> (13)..(13)
<223> n is u
<400> 30
caaggacatc cgntcacatc gaggatttcc ttg 33
<210> 31
<211> 33
<212> DNA
<213> Artificial sequence
<220>
<221> misc_feature
<222> (13)..(13)
<223> n is u
<400> 31
caaggacatc cgnccctccc tccaggaagc cta 33
<210> 32
<211> 33
<212> DNA
<213> Artificial sequence
<220>
<221> misc_feature
<222> (13)..(13)
<223> n is u
<400> 32
caaggacatc cgnaggcaga tgcccagcag gcg 33
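
The 33-nt entries above (SEQ ID NO: 17 to 32) all begin with the same 12 bases, followed by an `n` at position 13 that the `<223>` annotation identifies as uracil. That matches the first-primer layout described in the claims: a non-complementary second sequence, a uracil cleavage site, and a target-complementary first sequence. A minimal Python sketch of that decomposition follows. It assumes these entries are first primers; the function name `split_first_primer` and the dictionary keys are hypothetical and used only for illustration.

```python
def split_first_primer(listing_line: str) -> dict:
    """Split one <400> line of the listing into the three primer parts.

    Assumption: the single 'n' in the line marks the uracil cleavage site,
    as the <223> annotation ("n is u") states for these entries.
    """
    tokens = listing_line.split()
    # Drop the trailing length count and the 10-base spacing of the listing.
    seq = "".join(t for t in tokens if not t.isdigit()).lower()
    site = seq.index("n")
    return {
        "second_sequence": seq[:site],      # 5' of the cleavage site
        "cleavage_site": "u",               # uracil, written as 'n' in the listing
        "first_sequence": seq[site + 1:],   # 3' of the cleavage site
    }


# SEQ ID NO: 17 exactly as it appears in the listing.
parts = split_first_primer("caaggacatc cgntgatttg tagtggagaa gga 33")
print(parts)
```

For SEQ ID NO: 17 the part on the 5' side of the uracil is `caaggacatccg`, the same base sequence that claim 4 gives for the second sequence.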

Claims (10)

1. A high fidelity target gene library construction method, comprising:
a phosphorylation step, comprising modifying the 5' end of a template molecule with a phosphate group;
an annealing and extension step, comprising annealing a first primer to a target nucleotide region of the template molecule and extending it to obtain a double-stranded molecule containing a target nucleotide extension strand, namely a target nucleotide molecule, wherein the first primer contains an enzyme cleavage site;
a first sequencing adapter ligation step, comprising ligating a first sequencing adapter to the target nucleotide molecule to obtain a first sequencing adapter ligation product;
an enzyme digestion step, comprising using an enzyme to cleave the first primer at the enzyme cleavage site in the first sequencing adapter ligation product and removing the nucleotide sequence on the 5' side of the enzyme cleavage site in the first primer, to obtain a first sequencing adapter ligation product from which part of the first primer sequence has been removed;
a double-stranded end flattening step, comprising excising the single-stranded sequence protruding from the 3' end of the original template strand in the first sequencing adapter ligation product to obtain a blunt-ended first sequencing adapter ligation product;
a second sequencing adapter ligation step, comprising ligating a second sequencing adapter to the blunt-ended first sequencing adapter ligation product to obtain a second sequencing adapter ligation product;
and an amplification step, comprising amplifying the second sequencing adapter ligation product with a second primer and a third primer to obtain a sequencing library.
2. The high fidelity target gene library construction method of claim 1, wherein the template molecule is single-stranded DNA and/or double-stranded DNA;
and/or, in the phosphorylation step, the template molecule is selected from at least one of the following DNA molecules:
a) DNA molecules with the length less than or equal to 500 bp;
b) bisulfite-treated DNA molecules;
c) extracellular free DNA;
d) single-stranded and/or double-stranded cDNA reverse-transcribed from an RNA sample;
and/or, after the 5' ends of the template molecules are modified with phosphate groups, the template molecules are heated to 80-98 ℃ and held for 1-10 min; after the reaction is finished, the container holding the template molecules is placed on ice for 2-10 min before proceeding to the next reaction;
and/or, the enzyme used in the phosphorylation step comprises T4 polynucleotide kinase.
3. The high fidelity target gene library construction method of claim 1, wherein the first primer comprises an enzyme cleavage site, a first sequence that is connected in series to the 3' end of the enzyme cleavage site and that is complementary to the target nucleotide sequence on the template molecule, and a second sequence that is connected in series to the 5' end of the enzyme cleavage site and that is not complementary to the nucleotide sequence on the template molecule.
4. The high fidelity target gene library construction method of claim 3, wherein the second sequence is 3 to 30 nt in length;
and/or, the second sequence comprises the following base sequence: CAAGGACATCCG.
5. The high fidelity target gene library construction method of claim 1, wherein the cleavage site of the first primer comprises uracil;
and/or, the 5' end of the first primer is modified with a marker molecule;
and/or, the marker molecule comprises biotin.
6. The high fidelity target gene library construction method of claim 1, wherein the first sequencing adapter comprises a first molecular tag;
and/or, the 5' end of the inner side of the first sequencing joint is modified with a phosphate group, and the inner side of the first sequencing joint refers to one side of the first sequencing joint, which can be connected to a target nucleotide molecule in series;
and/or, the outer 3' end of the first sequencing joint is modified with a phosphate group, and the outer side of the first sequencing joint refers to one side of the first sequencing joint, which can not be connected to a target nucleotide molecule in series;
and/or, the first sequencing joint comprises a forward strand and a reverse strand that are complementarily paired, wherein a first molecular tag is connected in series at the 5' end of the forward strand, a nucleotide sequence complementary to the first molecular tag is connected in series at the 3' end of the reverse strand, and the 5' end of the first molecular tag is modified with a phosphate group;
and/or the 3' end of the forward strand of the first sequencing linker is modified with a phosphate group;
and/or, in the first sequencing linker ligation step, the first sequencing linker ligation product is collected with streptavidin-coated magnetic beads before enzyme digestion is performed;
and/or, in the enzyme digestion step, the enzyme used comprises UDG enzyme;
and/or, in the enzyme digestion step, the second sequence cleaved from the first primer (the sequence connected in series to the 5' end of the enzyme cleavage site) is collected with streptavidin-coated magnetic beads, the remaining first sequencing linker ligation product is retained in the supernatant, and the supernatant is transferred to another container for the subsequent double-stranded end flattening step;
and/or, in the double-stranded end flattening step, T4 DNA polymerase is used to excise the single-stranded sequence protruding from the 3' end of the original template strand in the first sequencing linker ligation product.
7. The high fidelity target gene library construction method of claim 1, wherein the 5' end of the inner side of the second sequencing adapter is modified with a phosphate group, and the inner side of the second sequencing adapter is the side of the second sequencing adapter that can be connected in series to the ligation product of the first sequencing adapter;
and/or, the second sequencing adapter may or may not contain a second molecular tag;
and/or, the second sequencing adapter comprises complementary paired forward strand, reverse strand;
and/or when the second sequencing joint does not contain the second molecular tag, the 5' end of the reverse strand of the second sequencing joint is modified with a phosphate group;
and/or, when the second sequencing joint contains a second molecular tag, the 5' end of the reverse strand of the second sequencing joint is connected with the second molecular tag in series, the 5' end of the second molecular tag is modified with a phosphate group, and the 3' end of the forward strand of the second sequencing joint is connected in series with a nucleotide sequence which can be complementarily paired with the second molecular tag;
and/or, the first molecular tag, the second molecular tag are independently a random nucleotide sequence;
and/or the length of the first molecular label and the second molecular label is 4-19nt independently;
and/or, the second primer comprises a first sample tag;
and/or, the second primer comprises an inner adaptor, a first sample label and an outer adaptor which are connected in series from the 3' end to the 5' end, wherein the inner adaptor can be complementarily paired with the reverse strand of the first sequencing adaptor;
and/or, the third primer contains or does not contain a second sample tag;
and/or, when the third primer does not contain the second sample tag, the third primer contains an inner adaptor and an outer adaptor which are connected in series from the 3' end to the 5' end, and the inner adaptor can be complementarily paired with the reverse strand of the second sequencing adaptor;
and/or, when the third primer comprises a second sample tag, the third primer comprises an inner adaptor, a second sample tag and an outer adaptor which are connected in series from the 3' end to the 5' end, wherein the inner adaptor can be complementarily paired with the reverse strand of the second sequencing adaptor;
and/or the length of the first sample label and the second sample label is 4-19nt independently;
and/or the first sequencing linker is a high-throughput sequencing platform right-side sequencing linker;
and/or the first sequencing linker comprises any one of a P7-terminal sequencing linker of an Illumina sequencing platform, a P1-terminal sequencing linker of an MGI sequencing platform, or a right-side sequencing linker of other high-throughput sequencing platforms;
and/or the second sequencing linker is a high-throughput sequencing platform left sequencing linker;
and/or the second sequencing linker comprises any one of a P5-terminal sequencing linker of an Illumina sequencing platform, a P2-terminal sequencing linker of an MGI sequencing platform, or a left-side sequencing linker of other high-throughput sequencing platforms.
8. A library constructed by the high fidelity target gene library construction method according to any one of claims 1 to 7.
9. A kit, comprising: a first primer, a second primer, a third primer, a first sequencing joint and a second sequencing joint, wherein the first primer contains an enzyme cutting site.
10. The kit of claim 9, wherein the first primer further comprises a first sequence that is connected in series to the 3' end of the cleavage site and that is complementary to a target nucleotide sequence on the template molecule, and a second sequence that is connected in series to the 5' end of the cleavage site and that is not complementary to a nucleotide sequence on the template molecule;
and/or, the length of the second sequence is 3-30 nt;
and/or, the second sequence comprises the following base sequence: CAAGGACATCCG;
and/or the enzyme cutting site of the first primer contains uracil;
and/or, the 5' end of the first primer is modified with a marker molecule;
and/or, the marker molecule comprises biotin;
and/or, the second primer comprises a first sample tag;
and/or, the second primer comprises an inner adaptor, a first sample label and an outer adaptor which are connected in series from the 3' end to the 5' end, wherein the inner adaptor can be complementarily paired with the reverse strand of the first sequencing adaptor;
and/or, the third primer contains or does not contain a second sample tag;
and/or, when the third primer does not contain the second sample tag, the third primer contains an inner adaptor and an outer adaptor which are connected in series from the 3' end to the 5' end, and the inner adaptor can be complementarily paired with the reverse strand of the second sequencing adaptor;
and/or, when the third primer comprises a second sample tag, the third primer comprises an inner adaptor, a second sample tag and an outer adaptor which are connected in series from the 3' end to the 5' end, wherein the inner adaptor can be complementarily paired with the reverse strand of the second sequencing adaptor;
and/or the length of the first sample label and the second sample label is 4-19nt independently;
and/or, the first sequencing adaptor comprises a first molecular tag;
and/or, the 5' end of the inner side of the first sequencing joint is modified with a phosphate group, and the inner side of the first sequencing joint refers to one side of the first sequencing joint, which can be connected to a target nucleotide molecule in series;
and/or, the outer 3' end of the first sequencing joint is modified with a phosphate group, and the outer side of the first sequencing joint refers to one side of the first sequencing joint, which can not be connected to a target nucleotide molecule in series;
and/or, the first sequencing joint comprises a forward strand and a reverse strand that are complementarily paired, wherein a first molecular tag is connected in series at the 5' end of the forward strand, a nucleotide sequence complementary to the first molecular tag is connected in series at the 3' end of the reverse strand, and the 5' end of the first molecular tag is modified with a phosphate group;
and/or the 5' end of the inner side of the second sequencing joint is modified with a phosphate group, and the inner side of the second sequencing joint refers to one side of the second sequencing joint, which can be connected to the connection product of the first sequencing joint in series;
and/or, the second sequencing adapter may or may not contain a second molecular tag;
and/or, the second sequencing adaptor contains a forward strand and a reverse strand which can be complementarily paired;
and/or when the second sequencing joint does not contain the second molecular tag, the 5' end of the reverse strand of the second sequencing joint is modified with a phosphate group;
and/or, when the second sequencing joint contains a second molecular tag, the 5' end of the reverse strand of the second sequencing joint is connected with the second molecular tag in series, the 5' end of the second molecular tag is modified with a phosphate group, and the 3' end of the forward strand of the second sequencing joint is connected in series with a nucleotide sequence which can be complementarily paired with the second molecular tag;
and/or, the first molecular tag, the second molecular tag are independently a random nucleotide sequence;
and/or the length of the first molecular label and the second molecular label is 4-19nt independently;
and/or, the first sequencing linker comprises a high throughput sequencing platform right side sequencing linker;
and/or, the first sequencing linker includes, but is not limited to, any one of a P7-terminal sequencing linker of Illumina sequencing platform, a P1-terminal sequencing linker of MGI sequencing platform, or other high throughput sequencing platform right-side sequencing linker;
and/or, the second sequencing linker comprises a high-throughput sequencing platform left-side sequencing linker;
and/or, the second sequencing linker includes, but is not limited to, any one of a P5-terminal sequencing linker of an Illumina sequencing platform, a P2-terminal sequencing linker of an MGI sequencing platform, or a left-side sequencing linker of other high-throughput sequencing platforms.
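
The claims describe the second primer as an inner adaptor, a first sample label and an outer adaptor connected in series from the 3' end to the 5' end, with a sample label 4-19 nt long. As an illustration only, the Python sketch below takes the ten 60-nt entries SEQ ID NO: 6 to 15 from the sequence listing and locates the stretch where they differ. Treating those entries as second primers, and the differing stretch as the sample label, is an assumption made here; this part of the document does not state it. The helper name `variable_window` is hypothetical.

```python
# <400> lines of SEQ ID NO: 6-15, copied from the sequence listing.
LISTING = """\
caagcagaag acggcatacg agattgatag gtgactggag ttcagacgtg tgctcttccg 60
caagcagaag acggcatacg agattatacg gtgactggag ttcagacgtg tgctcttccg 60
caagcagaag acggcatacg agatcgatca gtgactggag ttcagacgtg tgctcttccg 60
caagcagaag acggcatacg agatatacac gtgactggag ttcagacgtg tgctcttccg 60
caagcagaag acggcatacg agatatagcg gtgactggag ttcagacgtg tgctcttccg 60
caagcagaag acggcatacg agattgttca gtgactggag ttcagacgtg tgctcttccg 60
caagcagaag acggcatacg agatagatac gtgactggag ttcagacgtg tgctcttccg 60
caagcagaag acggcatacg agattagctg gtgactggag ttcagacgtg tgctcttccg 60
caagcagaag acggcatacg agatgtatgt gtgactggag ttcagacgtg tgctcttccg 60
caagcagaag acggcatacg agatggctca gtgactggag ttcagacgtg tgctcttccg 60
"""

# Remove the 10-base spacing and the trailing length count from each line.
primers = ["".join(line.split()[:-1]) for line in LISTING.splitlines()]


def variable_window(seqs):
    """Return (start, end) of the span in which equal-length sequences differ."""
    cols = [i for i in range(len(seqs[0])) if len({s[i] for s in seqs}) > 1]
    return cols[0], cols[-1] + 1


start, end = variable_window(primers)
outer_adaptor = primers[0][:start]             # shared 5' portion
inner_adaptor = primers[0][end:]               # shared 3' portion
sample_tags = [p[start:end] for p in primers]  # one candidate label per primer

print(outer_adaptor, sample_tags, inner_adaptor)
```

Under that assumption each primer carries a distinct 6-nt label between two shared flanks, which falls inside the 4-19 nt range given in the claims.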
CN202110230543.2A 2021-03-02 2021-03-02 High-fidelity target gene library building method and kit thereof Pending CN112941147A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110230543.2A CN112941147A (en) 2021-03-02 2021-03-02 High-fidelity target gene library building method and kit thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110230543.2A CN112941147A (en) 2021-03-02 2021-03-02 High-fidelity target gene library building method and kit thereof

Publications (1)

Publication Number Publication Date
CN112941147A true CN112941147A (en) 2021-06-11

Family

ID=76247149

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110230543.2A Pending CN112941147A (en) 2021-03-02 2021-03-02 High-fidelity target gene library building method and kit thereof

Country Status (1)

Country Link
CN (1) CN112941147A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115820380A (en) * 2023-01-04 2023-03-21 深圳赛陆医疗科技有限公司 Microfluidic chip and preparation method and application thereof

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1191575A (en) * 1995-05-22 1998-08-26 斯里国际 Oligonucleotide sizing using cleavable primers
US20100021915A1 (en) * 2006-12-11 2010-01-28 Thomas Jefferson University High throughput dna sequencing method and apparatus
CN106987585A (en) * 2017-03-15 2017-07-28 深圳市海普洛斯生物科技有限公司 A kind of single stranded DNA two generations sequencing library construction method for cfDNA
CN107236729A (en) * 2017-07-04 2017-10-10 上海阅尔基因技术有限公司 The method and kit of a kind of rapid build target nucleic acid sequencing library that enrichment is captured based on probe
CN109234356A (en) * 2018-09-18 2019-01-18 南京迪康金诺生物技术有限公司 A kind of method and application constructing hybrid capture sequencing library
WO2019114146A1 (en) * 2017-12-15 2019-06-20 格诺思博生物科技南通有限公司 Method for enriching gene target regions and library construction kit
CN110904512A (en) * 2018-09-14 2020-03-24 广州华大基因医学检验所有限公司 High-throughput sequencing library construction method suitable for single-stranded DNA
CN112266948A (en) * 2020-11-06 2021-01-26 中山大学孙逸仙纪念医院 High-throughput targeting library building method and application

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115820380A (en) * 2023-01-04 2023-03-21 深圳赛陆医疗科技有限公司 Microfluidic chip and preparation method and application thereof
CN115820380B (en) * 2023-01-04 2024-01-30 深圳赛陆医疗科技有限公司 Microfluidic chip and preparation method and application thereof

Similar Documents

Publication Publication Date Title
EP3464634B1 (en) Molecular tagging methods and sequencing libraries
US10988802B2 (en) Methods for next generation genome walking and related compositions and kits
CN110699426B (en) Gene target region enrichment method and kit
JP5140425B2 (en) Method for simultaneously amplifying specific nucleic acids
EP3559269B1 (en) Single stranded circular dna libraries for circular consensus sequencing
CN109844137B (en) Barcoded circular library construction for identification of chimeric products
WO2015131107A1 (en) Reduced representation bisulfite sequencing with diversity adaptors
CN109593757B (en) Probe and method for enriching target region by using same and applicable to high-throughput sequencing
US20230056763A1 (en) Methods of targeted sequencing
CN112410331A (en) Linker with molecular label and sample label and single-chain library building method thereof
CN111936635A (en) Generation of single stranded circular DNA templates for single molecule sequencing
US20140336058A1 (en) Method and kit for characterizing rna in a composition
CN112680796A (en) Target gene enrichment and library construction method
CN110699425A (en) Method and system for enriching gene target region
CN112941147A (en) High-fidelity target gene library building method and kit thereof
US20180100180A1 (en) Methods of single dna/rna molecule counting
CN112639127A (en) Method for detecting and quantifying genetic alterations
CN107083427B (en) DNA ligase mediated DNA amplification technology
WO2018081666A1 (en) Methods of single dna/rna molecule counting
WO2020183188A1 (en) Nucleic acid amplification methods
CN112912514A (en) Barcoding of nucleic acids
CA3110679A1 (en) Size selection of rna using poly(a) polymerase
EP4048812B1 (en) Methods for 3' overhang repair
CN117305466B (en) Detection method capable of identifying single base methylation state
CN112575388A (en) Single-molecule target gene library building method and kit thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination