WO2021227129A1 - Universal high-throughput sequencing adapter and application thereof - Google Patents

Universal high-throughput sequencing adapter and application thereof

Info

Publication number
WO2021227129A1
Authority
WO
WIPO (PCT)
Prior art keywords
throughput sequencing
sequence
stranded
sequencing adapter
strand
Prior art date
Application number
PCT/CN2020/092418
Other languages
English (en)
Chinese (zh)
Inventor
曹彦东
周洋
扶媛媛
杨颖�
张丽婷
Original Assignee
北京安智因生物技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京安智因生物技术有限公司
Publication of WO2021227129A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869 Methods for sequencing
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806 Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6844 Nucleic acid amplification reactions
    • C12Q1/6858 Allele-specific amplification
    • C CHEMISTRY; METALLURGY
    • C40 COMBINATORIAL TECHNOLOGY
    • C40B COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00 Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/06 Biochemical methods, e.g. using enzymes or whole viable microorganisms

Definitions

  • the invention relates to the field of gene sequencing, in particular to a universal high-throughput sequencing adapter and its application in sequencing library construction.
  • High-Throughput Sequencing: high-throughput sequencing technology
  • NGS: next-generation sequencing technology
  • Sanger sequencing uses DNA polymerase to extend a short synthetic oligonucleotide primer hybridized to a single-stranded DNA template, synthesizing new DNA fragments that are separated by polyacrylamide gel electrophoresis to read the DNA sequence.
  • NGS sequencing usually uses massively parallel sequencing (MPS), which can realize simultaneous sequencing of multiple samples and multiple sites, greatly improving sequencing throughput.
  • MPS: massively parallel sequencing
  • high-throughput sequencing technology has been proven to have high accuracy and sensitivity in clinical genetic testing.
  • however, it is affected by noise and errors introduced during library construction and sequencing, which makes it difficult to tell whether low-frequency mutations in the sequencing results are genuine.
  • the proportion of index hopping can be as high as 2% (Illumina, Effects of index misassignment on multiplexing and downstream analysis [Z]).
  • NGS sequencing platforms are Ion Torrent and Illumina.
  • different platforms use different technical procedures for library construction, so libraries are not interchangeable: a library suitable for the Ion Torrent platform cannot generate sequencing data on the Illumina platform, and vice versa. This greatly restricts clinical application, so a universal library that can be applied to different sequencing platforms is needed.
  • sequencing adapter research is the focus of library research.
  • the design of sequencing adapters mainly follows two directions. One is to improve the shape of the adapter, for example with Y-shaped or U-shaped adapters, in order to reduce or avoid adapter dimers and increase the amount of usable sequencing data; the other is to add specific molecular tags to the adapter structure to identify errors introduced during library construction.
  • sequencing libraries prepared through the above two research directions can still only be used for fixed sequencing platforms, and cannot be used in mainstream sequencing platforms such as Ion Torrent and Illumina at the same time.
  • the Ion Torrent platform and the Illumina platform have different sequencing principles and different methods for constructing sequencing templates.
  • the Ion Torrent platform uses emulsion PCR to construct the sequencing template; the Illumina platform uses bridge amplification or exclusion amplification to construct the sequencing template.
  • Ion Torrent generally uses linear adapters for library construction; Illumina generally uses Y-shaped adapters.
  • the technical problem to be solved by the present invention is to overcome the disadvantages that the high-throughput sequencing adapter in the prior art cannot realize the compatibility of different sequencing platforms, and the applicability is not strong.
  • the first objective of the present invention is to seek a universal high-throughput sequencing adapter suitable for multiple sequencing platforms
  • the second objective of the present invention is to seek a method for preparing a universal high-throughput sequencing adapter suitable for multiple sequencing platforms
  • the third objective of the present invention is to seek the application of a universal high-throughput sequencing adapter
  • the fourth objective of the present invention is to seek a method for detecting low-frequency gene mutations.
  • the present invention provides the following technical solutions:
  • the present invention provides a single-stranded adapter, characterized in that the single-stranded adapter comprises, connected in sequence, a free arm and a double-stranded complementary region
  • the free arm includes a library amplification primer binding region and a carrier binding region;
  • the double-stranded complementary region contains two or more sequencing primer binding regions of the sequencing platform.
  • the double-stranded complementary region further includes a tag sequence.
  • the tag sequence is located at one end of the double-stranded complementary region away from the free arm.
  • the tag sequence consists of 6-12 random bases.
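As an illustration of the tag described above, the following minimal sketch draws a random tag of 6 to 12 bases and counts how many distinct tags each length allows. The function names `random_tag` and `tag_space` are hypothetical, and uniform sampling over A, C, G, T is an assumption; the patent does not specify how the random bases are chosen.

```python
import random

BASES = "ACGT"

def random_tag(length, rng=None):
    """Return a random tag of `length` bases (the text allows 6 to 12)."""
    if not 6 <= length <= 12:
        raise ValueError("tag length must be between 6 and 12 bases")
    rng = rng or random.Random()
    # Assumption: each position is drawn uniformly from the four bases.
    return "".join(rng.choice(BASES) for _ in range(length))

def tag_space(length):
    """Number of distinct tags of a given length (4 choices per position)."""
    return len(BASES) ** length

if __name__ == "__main__":
    print(random_tag(8), tag_space(6), tag_space(12))
```

Even the shortest permitted tag (6 bases) gives 4^6 = 4096 possible sequences, which is why one tag per sample is enough to separate typical batches.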
  • the length of the free arm of the single-stranded adapter is 30-56 bp, and the length of the double-stranded complementary region is 40-58 bp.
  • the free arm can be composed of the following sequence:
  • the double-stranded complementary region may be composed of the following sequence:
  • "XXXXXX" represents a tag sequence composed of 6-12 random bases
  • "N" represents any of the bases A, T, C, or G, or NA (no base).
  • the free arm further includes a tag sequence, and the tag sequence is consistent with the double-stranded complementary region tag sequence.
  • the free arm can be composed of the following sequence:
  • "XXXXXX" represents a tag sequence composed of 6-12 random bases
  • "N" represents any of the bases A, T, C, or G, or NA (no base).
  • the present invention also provides a Y-type high-throughput sequencing adapter, characterized in that the sequencing adapter includes a first single strand and a second single strand;
  • the first single strand and the second single strand respectively include:
  • the free arm includes a library amplification primer binding region and a carrier binding region;
  • the double-stranded complementary region contains two or more sequencing primer binding regions of the sequencing platform.
  • the free arm sequences of the first single strand and the second single strand are not complementary, and the first single strand and the second single strand may be annealed to form a Y-shaped structure double strand.
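The Y-shape condition above can be sketched as a simple check: the two double-stranded regions must be reverse complements of each other, while the two free arms must not be. This is a minimal sketch under stated assumptions: the function names are hypothetical, the first strand is read 5'→3' as free arm then double-stranded region, the second strand as double-stranded region then free arm, and only full complementarity of the arms is tested (partial pairing between arms is not scored).

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]

def forms_y_structure(first_arm, first_ds, second_ds, second_arm):
    """Check the Y-adapter condition for two strands given 5'->3'.

    first strand  = first_arm + first_ds
    second strand = second_ds + second_arm
    """
    ds_regions_pair = first_ds == reverse_complement(second_ds)
    # Assumption: arms count as "complementary" only if they pair end to end.
    arms_pair = first_arm == reverse_complement(second_arm)
    return ds_regions_pair and not arms_pair
```

For example, `forms_y_structure("AAAA", "GATTACA", "TGTAATC", "CCCC")` holds, whereas replacing the second arm with `"TTTT"` makes the arms pair and yields a fully double-stranded (linear) adapter instead.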
  • the double-stranded complementary region includes a tag sequence, and the tag sequence is located at one end of the double-stranded complementary region away from the free arm.
  • the sequencing platform includes, but is not limited to, Illumina, Ion Torrent, PacBio, Roche, Helicos, and ABI platforms; preferably, the sequencing platform is Ion Torrent and Illumina platforms.
  • the second single-stranded free arm further includes a tag sequence.
  • the tag sequence in the free arm is the same as the tag sequence in the double-stranded complementary region; more preferably, the tag sequence in the free arm is close to the end of the double-stranded complementary region.
  • the length of the double-stranded complementary region of the first single strand and the second single strand is 40-58 bp; the length of the free arm of the first single strand is 30-45 bp, and the length of the free arm of the second single strand is 35-56 bp; the tag sequence is composed of 6-12 random bases.
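The length limits stated above lend themselves to a small validation helper. The sketch below assumes that the 30-45 bp range applies to the free arm of the first single strand and the 35-56 bp range to the free arm of the second single strand; the dictionary keys and the function name `length_violations` are hypothetical.

```python
# Length ranges in bases, taken from the text (inclusive bounds assumed).
RANGES = {
    "first_free_arm": (30, 45),
    "second_free_arm": (35, 56),
    "double_stranded_region": (40, 58),
    "tag": (6, 12),
}

def length_violations(lengths):
    """Return a list of messages for every part whose length is out of range."""
    problems = []
    for name, (low, high) in RANGES.items():
        n = lengths[name]
        if not low <= n <= high:
            problems.append(f"{name}: {n} not in {low}-{high}")
    return problems
```

An empty list means the design satisfies all four ranges; each message names the offending part so a failing design can be corrected directly.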
  • the 3' end of the free arm of the first or second single strand is modified for stability
  • thio modification is carried out
  • the phosphodiester bond between the last 3 bases at the 3' end is replaced by phosphorothioate.
  • the first single-stranded sequence is as follows:
  • Double-stranded complementary region sequence
  • "XXXXXX" represents a tag sequence composed of 6-12 random bases
  • "N" represents any of the bases A, T, C, or G, or NA (no base).
  • the first single strand is connected by a free arm and a double-strand complementary region in a 5'-3' direction sequentially.
  • the second single-stranded sequence is as follows:
  • Double-stranded complementary region sequence
  • "XXXXXX" represents a tag sequence composed of 6-12 random bases
  • "N" represents any of the bases A, T, C, or G, or NA (no base).
  • the second single-stranded double-stranded complementary region and the free arm are connected in a 5'-3' direction sequentially.
  • the present invention also provides a high-throughput sequencing adapter set, characterized in that the sequencing adapter set includes the above-mentioned high-throughput sequencing adapter.
  • the high-throughput sequencing adapter set further includes another Y-type high-throughput sequencing adapter as follows: the Y-type high-throughput sequencing adapter includes third and fourth single strands;
  • the sequence of the other Y-type high-throughput sequencing adaptor is similar to the sequence of the above-mentioned Y-type high-throughput sequencing adaptor, except that the sequence of the double-stranded complementary region is different;
  • sequence of the double-strand complementary region of the third single-strand is as follows:
  • sequence of the double-strand complementary region of the fourth single-stranded is complementary to the sequence of the third single-stranded double-strand complementary region;
  • "XXXXXX" represents a tag sequence composed of 6-12 random bases
  • "N" represents any of the bases A, T, C, or G, or NA (no base).
  • connection sequence between the free arms of the third and fourth single strands and the double-strand complementary region sequence is the same as that of the first and second single strands.
  • the single-stranded sequences of the Y-type high-throughput sequencing adapter are as follows:
  • the first single-stranded sequence (SEQ ID NO.1):
  • the second single-stranded sequence (SEQ ID NO.2):
  • the third single-stranded sequence (SEQ ID NO.3):
  • the fourth single-stranded sequence (SEQ ID NO.4):
  • the single-stranded sequences of the Y-type high-throughput sequencing adapter are as follows:
  • the first single-stranded sequence (SEQ ID NO.5):
  • the second single-stranded sequence (SEQ ID NO.6):
  • the third single-stranded sequence (SEQ ID NO.7):
  • the fourth single-stranded sequence (SEQ ID NO. 8):
  • SEQ ID NO.1-8 in the sequence listing do not contain "XXXXXX".
  • the present invention also provides a composition, characterized in that the composition comprises the above-mentioned high-throughput sequencing adapter or adapter set.
  • the present invention also provides a complex, characterized in that the complex is connected to the above-mentioned high-throughput sequencing adapter or adapter set.
  • the present invention also provides a kit, characterized in that the kit comprises the above-mentioned high-throughput sequencing adapter or adapter set.
  • the kit is a high-throughput sequencing library building kit or a gene sequence enrichment kit.
  • the present invention also provides a method for preparing the above-mentioned high-throughput sequencing adapter, which is characterized in that it comprises the following steps:
  • S1: synthesize the first and second single-stranded sequences separately
  • S2: specifically anneal the two single-stranded sequences from S1 to obtain the high-throughput sequencing adapter.
  • the present invention also provides a method for constructing a sequencing library, which is characterized in that:
  • S1: prepare the target fragment of the sample to be tested
  • S2: connect the aforementioned high-throughput sequencing adapter or adapter set to the target fragment from S1 to obtain a ligation product
  • S3: amplify the ligation product from S2 and, after purification, obtain the sequencing library of the sample to be tested.
  • the present invention also provides a method for detecting low-frequency gene mutations, and is characterized in that it comprises the following steps:
  • S1: prepare the above-mentioned high-throughput sequencing adapters or adapter sets; for the same sample, the tag sequences are the same;
  • S2: perform target fragment amplification on the sample to be tested and digest the primers
  • S3: connect the digested product from S2 to the high-throughput sequencing adapter or adapter set from S1 to obtain a ligation product, amplify the ligation product, and obtain a sequencing library after purification;
  • S4: sequence the sequencing library from S3, correct the sequencing data according to the tag sequence of the high-throughput sequencing adapter, and perform mutation analysis based on the corrected sequencing data.
  • the mutation analysis in step S4 is: a specific mutation is determined to be a true low-frequency mutation only if it appears in both the sense strand and the antisense strand of the same read.
  • the sample to be tested is genomic DNA.
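The strand-concordance rule in step S4 can be illustrated with a short sketch. It assumes the reads have already been grouped by original fragment and annotated with strand and variant; the input layout `(fragment_id, strand, variant)` and the function name `concordant_mutations` are hypothetical and are not part of the patent's method description.

```python
from collections import defaultdict

def concordant_mutations(observations):
    """Keep only mutations seen on both strands of the same fragment.

    observations: iterable of (fragment_id, strand, variant),
    where strand is "+" (sense) or "-" (antisense).
    Returns a sorted list of (fragment_id, variant) pairs.
    """
    strands_seen = defaultdict(set)
    for fragment_id, strand, variant in observations:
        strands_seen[(fragment_id, variant)].add(strand)
    # A mutation supported by only one strand is treated as a library
    # construction or sequencing error and is dropped.
    return sorted(key for key, seen in strands_seen.items() if seen == {"+", "-"})
```

A mutation reported only on the sense strand of a fragment is discarded, which is the mechanism the text relies on to avoid false positives.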
  • the present invention also provides the following applications of the above-mentioned high-throughput sequencing adapter, adapter set, composition, complex or kit:
  • the universal high-throughput sequencing adapter provided by the present invention includes paired double-stranded complementary regions and unpaired single-stranded free arms.
  • the distal end of the paired double-stranded part contains the tag sequence, and the non-free ends of the two free arms contain the tag sequence.
  • the base composition of the tag sequence carried by the same sample is consistent. According to the consistency of the tag sequence, it can be judged whether there is cross-contamination during the library construction process. After using some models of the Illumina sequencing platform for sequencing, the analysis of sequencing data can determine whether there is index hopping based on whether the base composition of the tag sequence of the same read is consistent.
  • the universal high-throughput sequencing adapter provided by the present invention includes paired double-stranded complementary regions and unpaired single-stranded free arms.
  • the distal end of the paired double-stranded part contains a tag sequence, and the base composition of the tag sequence should be the same for the sense strand and antisense strand of the same read.
  • a specific mutation can be judged to be a true mutation only if it is present in both the sense strand and the antisense strand of the same read. If a read carries the mutation in only the sense strand or only the antisense strand, it is judged to be an error from library construction or sequencing, and the mutation is excluded from subsequent analysis to avoid false positives.
  • the universal high-throughput sequencing adapter provided by the present invention uses only a single tag sequence, combined with the requirement that a specific mutation must be present in both the sense strand and the antisense strand; the base composition of the tag sequence assigned to a sample should be the same as the base composition of the tag sequence in the read.
  • if the tag sequence of a sample is not the same as the tag sequence in the read, this indicates that the read does not belong to this sample, that is, index hopping has occurred.
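The tag comparison described above can be sketched as follows. The sketch assumes each read has been parsed into the tag copy found in the free arm and the tag copy found in the double-stranded region; the labels and the function names `classify_read` and `flagged_fraction` are hypothetical.

```python
def classify_read(arm_tag, ds_tag, sample_tag):
    """Label one read from its two tag copies and the tag expected for the sample."""
    if arm_tag != ds_tag:
        return "inconsistent"  # the two tag copies within the read disagree
    if ds_tag != sample_tag:
        return "foreign"       # consistent tag, but not the one assigned to this sample
    return "ok"

def flagged_fraction(reads, sample_tag):
    """Fraction of reads whose tags do not match the sample (0.0 for no reads)."""
    labels = [classify_read(arm, ds, sample_tag) for arm, ds in reads]
    if not labels:
        return 0.0
    return sum(label != "ok" for label in labels) / len(labels)
```

Reads labelled "inconsistent" or "foreign" would be excluded before mutation analysis; the flagged fraction gives a rough per-sample estimate of tag hopping or cross-contamination.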
  • the design of the present invention can effectively overcome the index hopping problem inherent to some sequencing platforms, and allows the authenticity of low-frequency mutations to be assessed.
  • the universal high-throughput sequencing adapter provided by the present invention includes paired double-stranded complementary regions and unpaired single-stranded free arms.
  • the universal high-throughput sequencing adapter includes a PN adapter and an AN adapter; the PN adapter double-strand complementary region consists of 40 to 58 bases; the AN adapter double-strand complementary region consists of 40 to 58 bases; the 5' free arm of the PN adapter or the AN adapter is composed of 30 to 45 bases; the 3' free arm of the PN adapter or the AN adapter is composed of 35 to 56 bases; the tag sequence is composed of 6 to 12 bases, so as to achieve at least 114048 bases.
  • FIG. 1 Schematic diagram of the structure of the universal high-throughput sequencing adapter shown in Example 1;
  • FIG. 1 The quality control map of the universal high-throughput library 2100 in Example 2, including the quality control map of the universal high-throughput sequencing adapter library 2100 (sample R19054232);
  • FIG. 1 The quality control map of the universal high-throughput library 2100 in Example 2, including the quality control map of the universal high-throughput sequencing adapter library 2100 (sample R20005128);
  • the terms “including”, “including”, “having”, “containing” or “involving” are inclusive or open-ended, and do not exclude other unlisted elements or method steps .
  • the term “consisting of” is considered a preferred embodiment of the term “comprising”. If in the following a certain group is defined as comprising at least a certain number of embodiments, this should also be understood as revealing a group preferably consisting of only these embodiments.
  • nucleic acid refers to any molecule comprising ribonucleic acid, deoxyribonucleic acid or its analogue unit, preferably a polymeric molecule.
  • the nucleic acid can be single-stranded or double-stranded.
  • the single-stranded nucleic acid may be a nucleic acid of one strand of denatured double-stranded DNA.
  • the single-stranded nucleic acid may be a single-stranded nucleic acid that is not derived from any double-stranded DNA.
  • complementary refers to hydrogen-bond base pairing between the nucleotide bases G, A, T, C, and U, such that when two given polynucleotides or polynucleotide sequences anneal to each other, A pairs with T and G pairs with C in DNA, and G pairs with C and A pairs with U in RNA.
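The pairing rules in this definition can be written out directly. The sketch below is illustrative only; the function names `complement` and `can_anneal` are hypothetical, and both strands are assumed to be written 5'→3', so annealing is tested against the reverse complement.

```python
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}

def complement(seq, rna=False):
    """Base-by-base complement using DNA (A-T, G-C) or RNA (A-U, G-C) pairing."""
    table = RNA_PAIRS if rna else DNA_PAIRS
    return "".join(table[base] for base in seq)

def can_anneal(strand_a, strand_b, rna=False):
    """True if two strands, both written 5'->3', pair fully in antiparallel orientation."""
    return strand_a == complement(strand_b, rna)[::-1]
```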
  • the “sequencing adapter” in the present invention refers to a double-stranded nucleotide sequence connected to the two ends of the target fragment to be sequenced.
  • the double-stranded oligonucleotide sequence can be fully complementary or only partially complementary, such as a "Y-type" linker formed because part of the terminal sequence is not complementary.
  • the sequencing linker of the present invention is preferably such a "Y-type" linker.
  • the composition of the nucleotide sequence of the sequencing adapter is related to the applicable sequencing platform.
  • the composition can include library amplification primer sequence, sample tag sequence, sequencing primer sequence, etc.; and the sequence length of the sequencing adapter is also related to the sequencing platform.
  • the length of the linker can be specifically: the 3' free arm sequence is 35-56 bp, the 5' free arm sequence is 30-45 bp, and the double-stranded complementary region sequence is 40-58 bp.
  • Fig. 1 is a preferred universal high-throughput sequencing "Y-type" linker of the present invention, which includes a PN linker and an AN linker, which can be respectively located at either end of the target sequence.
  • Both the PN linker and the AN linker comprise a double-stranded complementary region, a single-stranded 5'free arm and a single-stranded 3'free arm.
  • the PN linker of the universal high-throughput sequencing linker and the double-stranded complementary region of the AN linker both include a tag sequence, and the tag sequence is composed of 6 to 12 bases.
  • the non-free end of the single-stranded 3' free arm of the AN adapter of the high-throughput sequencing adapter also contains the same base composition as the tag sequence
  • the non-free end of the single-stranded 3' free arm of the PN adapter of the above-mentioned universal high-throughput sequencing adapter contains the same base composition as the tag sequence.
  • the 3' end of the 3' free arm of the AN linker and the PN linker of the universal high-throughput sequencing linker is thio-modified; preferably, the phosphodiester bond between the last 3 bases is replaced by phosphorothioate.
  • the double-stranded complementary ends of the universal high-throughput sequencing linker AN linker and PN linker can be connected to the original gene fragment through a ligation reaction by a ligase.
  • the 5'free arm of the AN adaptor and the 3'free arm of the PN adaptor are non-complementary paired single strands and cannot be connected to the original gene fragments, thereby ensuring the efficiency of connecting the universal high-throughput sequencing adaptor to the DNA fragments.
  • PN linker and AN linker each refer to a partially double-stranded fragment (Y-type structure) containing a double-stranded complementary region and single-stranded 3'/5' free arms; during library construction, each is connected to one end of the target sequence, and the nucleotide sequences of the two are preferably different.
  • the "free arm" in the present invention refers to the region where the bases in the linker sequence are not complementary paired, such as the unpaired region of the PN linker or AN linker of the present invention. Therefore, even where it is not explicitly stated that the sequences of the free arms are not complementary, those skilled in the art should understand that the two sequences are not complementary and can form a Y-shaped structure in some cases.
  • the free arm of the present invention includes a library amplification primer region; in other embodiments, the 3'free arm of the present invention also includes a tag sequence.
  • the "double-stranded complementary region" in the present invention refers to the double-stranded complementary region contained in the sequencing adapter. This region usually contains sequencing primer sequences.
  • the double-stranded complementary region of the present invention contains sequencing primer sequences for at least two sequencing platforms.
  • the "tag sequence” in the present invention refers to a nucleotide sequence with a base length of 6 to 12 bp, which is used to identify different library samples.
  • non-free end in the present invention refers to the end where the double-stranded complementary region of the PN linker or the AN linker is connected to the 3'or 5'free arm of the single strand.
  • the "free end” in the present invention refers to the 3'end of the single-stranded 3'free arm or the 5'end of the single-stranded 5'free arm of the PN linker or the AN linker.
  • the "high-throughput sequencing platform” in the present invention refers to sequencing platforms such as Ion Torrent, Illumina, Roche454, and ABI. Although the sequencing platforms are preferably Ion Torrent and Illumina in the present invention, they are not limited. It is clear in the art that, based on the inventive concept of the present invention, primer sequences can be selected for any two or more platforms, and they can be constructed in the linker sequence to prepare the compatible high-throughput sequencing linker of the present invention. In addition, for sequencers under different sequencing platforms, considering that the sequencing principles of sequencers under the same type of sequencing platform are basically the same, the method of the present invention is applicable to all models under the same platform, for example, in the Ion Torrent sequencing platform.
  • the "low-frequency mutation" in the present invention refers to mutations where the gene mutation frequency is less than 5%, including mutations with frequencies of less than 5%, less than 4%, less than 3%, less than 2%, less than 1%, etc.
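The 5% threshold in this definition can be expressed as a one-line test on read counts. This is a minimal sketch: computing mutation frequency as mutant reads divided by total reads at a position is an assumption about how the frequency is measured, and the function names are hypothetical.

```python
def mutation_frequency(mutant_reads, total_reads):
    """Fraction of reads at a position that carry the mutation."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return mutant_reads / total_reads

def is_low_frequency(mutant_reads, total_reads, threshold=0.05):
    """True for a detected mutation whose frequency is strictly below the threshold."""
    frequency = mutation_frequency(mutant_reads, total_reads)
    return 0 < frequency < threshold
```

With 1000 reads covering a position, 10 mutant reads (1%) count as a low-frequency mutation, while 50 mutant reads (exactly 5%) do not, because the definition requires a frequency below 5%.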
  • AN1/PN1 is a set of universal high-throughput sequencing adapters with shorter sequences
  • AN2/PN2 is another set of universal high-throughput sequencing adapters with longer sequences.
  • the specific preparation method is as follows:
  • the end of the double-strand complementary region of the AN linker contains a tag sequence, where the tag sequence is 6-12 random bases "X".
  • the 3' free arm of the AN linker also contains a tag sequence of 6-12 random bases "X", and is connected to the non-free end of the 3' free arm of the AN linker.
  • the end of the double-stranded complementary region of the PN linker contains a tag sequence, where the tag sequence is 6-12 random bases "X".
  • the 3' free arm of the PN linker also contains a tag sequence of 6-12 random bases "X", and is connected to the non-free end of the 3' free arm of the PN linker. "*" represents the thio modification site.
  • the phosphodiester bond between the last 3 bases at the 3' end of the 3' free arm of the AN linker and the PN linker of the universal high-throughput sequencing linker is replaced by phosphorothioate.
  • the universal high-throughput sequencing adapters AN1/PN1 and AN2/PN2 prepared in Example 1 were used in the experiment, respectively.
  • the number of universal high-throughput sequencing adapters corresponds to the number of samples to be tested. For example, if the number of samples to be tested is 10, 10 sets of universal high-throughput sequencing adapter sets are prepared correspondingly, and each set of universal high-throughput sequencing adapter sets includes PN1 adapters and AN1 adapters.
  • the base sequence composition of the tag sequence in the same group of PN1 adaptors and AN1 adaptors is the same, and the base sequence composition of the tag sequence in different adaptor groups is different.
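The per-sample tag assignment described above (same tag within a PN1/AN1 group, different tags between groups) can be sketched as follows. The function name `assign_tags`, the default tag length of 8, and the use of random sampling with rejection of duplicates are assumptions for illustration.

```python
import random

def assign_tags(sample_ids, tag_length=8, seed=None):
    """Give every sample one unique tag shared by its PN1 and AN1 adapters."""
    if not 6 <= tag_length <= 12:
        raise ValueError("tag length must be between 6 and 12 bases")
    if len(sample_ids) > 4 ** tag_length:
        raise ValueError("more samples than distinct tags of this length")
    rng = random.Random(seed)
    used = set()
    assignment = {}
    for sample_id in sample_ids:
        # Redraw until the tag has not been used by an earlier sample.
        while True:
            tag = "".join(rng.choice("ACGT") for _ in range(tag_length))
            if tag not in used:
                break
        used.add(tag)
        assignment[sample_id] = {"PN1": tag, "AN1": tag}
    return assignment
```

For 10 samples this yields 10 adapter groups, matching the example in the text: within a group the PN1 and AN1 tags are identical, and no two groups share a tag.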
  • Sample genomic DNA extraction: peripheral blood samples 1 and 2 (corresponding to R190542432 and R20005128, respectively) were taken for genomic DNA extraction.
  • the sample DNA was extracted in accordance with the operating instructions of the nucleic acid extraction reagent (DR181003-48) produced by Beijing Anzhiyin Biotechnology Co., Ltd.
  • the target regions to be examined are the entire coding region and the variable splicing region of the ACTA2, COL3A1, FBN1, MYH11, MYLK, SMAD3, TGFBR1, and TGFBR2 genes (20bp from exons to introns).
  • the multiplex PCR primer pool for the target detection region was designed with Ion Ampliseq Designer and synthesized and provided by Thermo Fisher.
  • Target fragment amplification the specific implementation is as follows:
  • the ligase is Fast T4 DNA Ligase produced by Shanghai Yisheng Biotechnology Co., Ltd.
  • the ligation buffer is 5× Fast Ligation Buffer produced by Shanghai Yisheng Biotechnology Co., Ltd.
  • the specific implementation is as follows:
  • the Ion Torrent platform and the Illumina platform are used to perform sequencing verification on the above-mentioned high-throughput library, as follows:
  • After purification and quality inspection, the library is diluted and processed with the Ion 520 TM & Ion 530 TM Kit–OT according to the kit operating procedures. After template preparation on the Ion OneTouch 2 instrument, the Ion GeneStudio TM S5 Plus gene sequencer is used for sequencing and data analysis.
  • the library was diluted, and Miseq DX Reagent Kit v3 was used to proceed in accordance with the kit operating procedures. Sequencing and data analysis were performed on the Miseq DX gene sequencer.
  • the average read length of the above two samples is ⁇ 200bp, indicating that all samples in the sample are read through, that is, the bases between the beginning and the end of the target fragment to be tested can be identified;
  • Mean depth average sequencing depth
  • On Target target rate
  • Uniformity is ⁇ 90 %, indicating that the amplification efficiency of each read in the target area to be tested and the efficiency of connecting the universal high-throughput adapter are similar.
  • the above parameters all indicate that the two ends of the target segment to be tested are successfully connected to the universal high-throughput sequencing adapter, and the sequencing is successful; it indicates that the library connected to the universal high-throughput sequencing adapter can be sequenced on the Ion GeneStudio TM S5 Plus gene sequencer.
  • The data output of both samples is ≥0.5 G, the read count of both is ≥3 M, and the proportion of Q30 is ≥75%, indicating that both samples were sequenced successfully and that both ends of the target fragment to be tested were successfully ligated to the universal high-throughput sequencing adapter.
  • The library ligated to the universal high-throughput sequencing adapter can therefore be sequenced on the Miseq DX gene sequencer.
  • Libraries built with the sequencing adapters prepared by the present invention meet the sequencing requirements of both the Ion GeneStudio™ S5 Plus platform and the Miseq DX platform, that is, of the two mainstream sequencing platforms, Ion Torrent and Illumina.
  • the sequencing adapter of the present invention has the properties of a universal library-building adapter.
  • A library applicable to the Ion GeneStudio™ S5 Plus sequencer can be applied to other Ion Torrent platform sequencers, such as PGM and Proton.
  • A library applicable to the Miseq DX gene sequencer can be applied to other sequencers on the Illumina platform, such as MiniSeq and NextSeq. It follows that a library ligated to the universal sequencing adapter of the present invention can be applied to all types of sequencers on the Ion Torrent platform and the Illumina platform.
  • This embodiment further verifies the application of the sequencing adapter of the present invention in low-frequency detection, and specifically provides a detection method for judging the authenticity of low-frequency mutations, which can correct sequencing errors introduced by index hopping.
  • The technical workflow is shown in Figure 4 and specifically includes the following steps:
  • The sample is a commercial tumor SNV 5% gDNA standard (GW-OGTM005), which is serially diluted with commercial human genomic DNA (G304A) to mutation frequencies of 2.5%, 1.25%, and 0.5%; the samples are named sample 1, sample 2, sample 3, and sample 4.
  • The target region to be tested comprises the designated hotspot regions of the EGFR (L858R/T790M/ΔE746_ΔA750), PIK3CA (E545K), KRAS (G12D/G13D/A146T), and NRAS (Q61K) genes.
  • The multiplex PCR primer pool for the target region is Thermo Fisher's Ampliseq colon&lung panel. Three replicates are run for each sample.
  • Target fragment amplification: the specific implementation is as follows:
  • The high-throughput sequencing adapter sets are the PN1/AN1 and PN2/AN2 sets described in Example 1. Taking the PN2/AN2 test data as an example, four adapter sets are prepared, with the sample sequence tags ATCACG, CGATGT, TTAGGC, and TGACCA. For the specific preparation method, refer to Example 1.
  • the amplified library was purified using Ampure magnetic beads, and the purified library was quantified using QUBIT 4.0.
  • the library concentration is calculated according to the dilution factor.
  • A library with a concentration higher than 1 ng/µL can be used in the subsequent experimental steps; a concentration lower than 1 ng/µL indicates that library construction has failed.
  • the library was diluted, and Miseq DX Reagent Kit v3 was used to proceed in accordance with the kit operating procedures. Sequencing and data analysis were performed on the Miseq DX gene sequencer.
  • Sequencing data analysis mainly includes the following contents:
  • For the sequencing data classified into the same sample source, the consistency between the sequencing tag sequence and the tag sequence at the double-stranded end of the universal high-throughput sequencing adapter (the AN adapter) is further used to identify sequencing errors introduced by sample cross-contamination and tag hopping (index hopping).
  • A universal high-throughput sequencing adapter is used. After sequencing is completed, the obtained sequencing data are analyzed. First, the sample tag sequence is used to group the data by sample source, dividing the data into the four mutation-frequency samples: sample 1, sample 2, sample 3, and sample 4. Next, each read is checked for whether the tag sequence in the double-stranded part of its adapter is identical to the sample tag sequence, which eliminates the index hopping problem. The authenticity of a mutation site is then confirmed by whether the positive-strand and negative-strand reads carrying the same tag sequence show the same mutation site. Mutations present only in positive-strand reads or only in negative-strand reads, and mutations in reads whose tag sequence is inconsistent with the sample tag, are excluded, so that low-frequency mutations are correctly identified.
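
The three filtering steps described above (grouping by sample tag, discarding reads whose adapter tag disagrees with the sample tag, and keeping only mutations seen on both strands) can be sketched in Python. This is a minimal illustration under stated assumptions, not the analysis pipeline actually used: the `Read` record, the `confirmed_mutations` function, the mutation labels, and the assignment of tags to sample names are hypothetical. Only the four tag sequences are taken from the text.

```python
from collections import defaultdict
from dataclasses import dataclass

# The four sample tags listed in the text. Which tag belongs to which
# sample is an assumption made only for this illustration.
SAMPLE_TAGS = {
    "ATCACG": "sample 1",
    "CGATGT": "sample 2",
    "TTAGGC": "sample 3",
    "TGACCA": "sample 4",
}


@dataclass(frozen=True)
class Read:
    """Hypothetical, already-parsed sequencing read."""
    index_tag: str        # sample tag used to assign the read to a sample
    adapter_tag: str      # tag found in the double-stranded adapter part of the read
    strand: str           # "+" (positive) or "-" (negative)
    mutations: frozenset  # mutation labels called on this read


def confirmed_mutations(reads):
    """Return {sample name: mutations supported by both strands}."""
    seen = defaultdict(lambda: {"+": set(), "-": set()})
    for read in reads:
        sample = SAMPLE_TAGS.get(read.index_tag)
        if sample is None:
            continue  # tag does not belong to any known sample
        if read.adapter_tag != read.index_tag:
            continue  # inconsistent tags: index hopping or cross-contamination
        if read.strand not in ("+", "-"):
            continue  # strand unknown, cannot be used for confirmation
        seen[sample][read.strand].update(read.mutations)
    # A mutation is kept only if positive and negative reads both show it.
    return {sample: s["+"] & s["-"] for sample, s in seen.items()}


example_reads = [
    Read("ATCACG", "ATCACG", "+", frozenset({"EGFR:L858R"})),
    Read("ATCACG", "ATCACG", "-", frozenset({"EGFR:L858R"})),
    Read("ATCACG", "ATCACG", "+", frozenset({"KRAS:G12D"})),  # positive strand only
    Read("ATCACG", "CGATGT", "-", frozenset({"KRAS:G12D"})),  # tags disagree
]
result = confirmed_mutations(example_reads)
```

In this toy input, the EGFR call is supported by tag-consistent reads on both strands and is kept, while the KRAS call is dropped because its only negative-strand support comes from a read whose adapter tag does not match its sample tag.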

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Zoology (AREA)
  • Engineering & Computer Science (AREA)
  • Wood Science & Technology (AREA)
  • Biochemistry (AREA)
  • Analytical Chemistry (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Immunology (AREA)
  • Biotechnology (AREA)
  • Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • General Chemical & Material Sciences (AREA)
  • Medicinal Chemistry (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The present invention relates to a universal high-throughput sequencing adapter and an application thereof. The universal high-throughput sequencing adapter comprises a double-stranded complementary region and a single-stranded free arm. The sequencing adapter is compatible with multiple sequencing platforms, such as the Ion Torrent and Illumina platforms; it is suitable for clinical testing, is economical, and can further be applied to judging the authenticity of low-frequency mutations.
PCT/CN2020/092418 2020-05-14 2020-05-26 Universal high-throughput sequencing adapter and application thereof WO2021227129A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010407833.5 2020-05-14
CN202010407833.5A CN111471754B (zh) 2020-05-14 2020-05-14 A universal high-throughput sequencing adapter and application thereof

Publications (1)

Publication Number Publication Date
WO2021227129A1 (fr) 2021-11-18

Family

ID=71759877

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/092418 WO2021227129A1 (fr) 2020-05-14 2020-05-26 Universal high-throughput sequencing adapter and application thereof

Country Status (2)

Country Link
CN (1) CN111471754B (fr)
WO (1) WO2021227129A1 (fr)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115831233A (zh) * 2023-02-07 2023-03-21 杭州联川基因诊断技术有限公司 Method, device, and medium for mTag-based preprocessing of targeted sequencing data
WO2023092601A1 (fr) * 2021-11-29 2023-06-01 京东方科技集团股份有限公司 UMI molecular tag and application thereof, adapter, adapter ligation reagent and kit thereof, and library construction method
WO2023092872A1 (fr) * 2021-11-26 2023-06-01 广州达安基因股份有限公司 High-throughput sequencing method based on an internal reference of a known tag
CN117286231A (zh) * 2023-09-28 2023-12-26 广州精检生物技术有限公司 A detection method based on the Ion Torrent sequencing platform

Families Citing this family (2)

Publication number Priority date Publication date Assignee Title
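CN112301432B (zh) * 2020-12-29 2021-04-06 北京贝瑞和康生物技术有限公司 Method and kit for constructing a library for whole-genome high-throughput sequencing
CN115029425B (zh) * 2022-05-26 2023-04-18 北京爱普益生物科技有限公司 High-throughput sequencing STR detection kit compatible with multiple sequencing platforms and application thereof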

Citations (2)

Publication number Priority date Publication date Assignee Title
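CN107858414A (zh) * 2017-10-18 2018-03-30 广州漫瑞生物信息技术有限公司 A high-throughput sequencing adapter, preparation method thereof, and application thereof in ultra-low-frequency mutation detection
CN111118001A (zh) * 2019-12-31 2020-05-08 苏州贝康医疗器械有限公司 A universal adapter for multiple sequencing platforms, and a library construction method and kit suitable for multiple sequencing platforms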

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
GB201615486D0 (en) * 2016-09-13 2016-10-26 Inivata Ltd Methods for labelling nucleic acids
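CN108893466B (zh) * 2018-06-04 2021-04-13 上海奥根诊断技术有限公司 Sequencing adapter, sequencing adapter set, and method for detecting ultra-low-frequency mutations
CN110827920B (zh) * 2018-08-14 2022-11-22 武汉华大医学检验所有限公司 Sequencing data analysis method and device, and high-throughput sequencing method
CN110257480A (zh) * 2019-07-04 2019-09-20 北京京诺玛特科技有限公司 Nucleic acid sequence sequencing adapter and method for constructing a sequencing library therewith
CN110734908B (zh) * 2019-11-15 2021-06-08 福州福瑞医学检验实验室有限公司 Method for constructing a high-throughput sequencing library and kit for library construction
CN111073961A (zh) * 2019-12-20 2020-04-28 苏州赛美科基因科技有限公司 A high-throughput method for detecting rare gene mutations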

Also Published As

Publication number Publication date
CN111471754B (zh) 2021-01-29
CN111471754A (zh) 2020-07-31

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application
Ref document number: 20935873; Country of ref document: EP; Kind code of ref document: A1

NENP Non-entry into the national phase
Ref country code: DE

32PN Ep: public notification in the EP bulletin as address of the addressee cannot be established
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 15/03/2023)

122 Ep: PCT application non-entry in European phase
Ref document number: 20935873; Country of ref document: EP; Kind code of ref document: A1