CN114854825A - Library building joint and method for simplified genome sequencing suitable for DNBSEQ technology - Google Patents

Publication number
CN114854825A
CN114854825A CN202110159102.8A CN202110159102A CN114854825A CN 114854825 A CN114854825 A CN 114854825A CN 202110159102 A CN202110159102 A CN 202110159102A CN 114854825 A CN114854825 A CN 114854825A
Authority
CN
China
Prior art keywords
fragment
sequence
barcode
upstream
downstream
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110159102.8A
Other languages
Chinese (zh)
Inventor
胡晓湘
边成
谈成
李宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Agricultural University
Original Assignee
China Agricultural University
Application filed by China Agricultural University
Priority to CN202110159102.8A
Publication of CN114854825A

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806: Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/10: Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1034: Isolating an individual clone by screening libraries
    • C12N15/1093: General methods of preparing gene libraries, not provided for in other subgroups
    • C: CHEMISTRY; METALLURGY
    • C40: COMBINATORIAL TECHNOLOGY
    • C40B: COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00: Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/06: Biochemical methods, e.g. using enzymes or whole viable microorganisms
    • C: CHEMISTRY; METALLURGY
    • C40: COMBINATORIAL TECHNOLOGY
    • C40B: COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B80/00: Linkers or spacers specially adapted for combinatorial chemistry or libraries, e.g. traceless linkers or safety-catch linkers

Landscapes

  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Biochemistry (AREA)
  • Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Biotechnology (AREA)
  • Microbiology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Medicinal Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Analytical Chemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • General Chemical & Material Sciences (AREA)
  • Biomedical Technology (AREA)
  • Immunology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Plant Pathology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention relates to the technical field of bioinformatics, and in particular to a library-construction adaptor and a method for simplified genome sequencing suitable for DNBSEQ technology. The adaptor comprises an upstream adaptor sequence and a downstream adaptor sequence. The upstream adaptor sequence comprises an upstream Barcode fragment, a Q1-1 fragment, an Index fragment and a Q1-2 fragment connected in sequence; the downstream adaptor sequence comprises a downstream Barcode fragment and a Q2 fragment. The two strands of the upstream adaptor sequence are reverse complementary, whereas in the downstream adaptor sequence the Q2 fragment is only partially reverse complementary; the upstream Barcode fragment and the downstream Barcode fragment are of unequal length. Based on this adaptor, the invention provides a simplified genome sequencing method suitable for DNBSEQ technology that allows genome sequencing to be carried out accurately, quickly and effectively.

Description

Library building joint and method for simplified genome sequencing suitable for DNBSEQ technology
Technical Field
The invention relates to the technical field of bioinformatics, and in particular to a library-construction adaptor and a method for simplified genome sequencing suitable for DNBSEQ technology.
Background
DNA sequencing technology determines the nucleotide sequence of a DNA molecule. Library construction refers to the process of repairing the DNA present in a sample (such as blood, body fluid, tissue or stool) and ligating it to specific DNA fragments, namely adaptor sequences, so that it can be sequenced on a given technology platform. Library construction is an essential step before sequencing; its technical steps vary with the application, and the adaptors accordingly differ in structure and sequence.
Sequencing technology is an important foundation of many fields of biological research today, such as genetics and molecular biology. It is important for analyzing the molecular mechanisms of genetic phenomena and is widely applied in disease treatment, genetic breeding of agricultural animals and plants, studies of biological evolution, biodiversity conservation, and inspection and quarantine. With the development of sequencing technologies and the rapid growth of research and application needs based on sequencing, developing library construction techniques suited to new sequencing technologies and to industrial requirements has become key to advancing genetic breeding research and its industrial application.
The most widely used sequencing technology at present is second-generation (next-generation) sequencing, which is characterized by high throughput and high accuracy. Besides whole genome sequencing, it can also be applied to simplified genome sequencing, transcriptome sequencing, epigenetic analysis and other uses. Second-generation sequencing is currently represented by the Solexa technology of Illumina and the DNBSEQ technology of BGI (Huada Gene), which differ considerably in their technical details. DNBSEQ technology performs well in applications such as whole genome sequencing and transcriptome sequencing.
Simplified genome sequencing (also called reduced-representation genome sequencing) is a technique in which a small proportion of the fragments of a whole genome is obtained and enriched, for example by restriction enzyme digestion or ultrasonic fragmentation, so that only part of the genome is sequenced. Compared with ordinary whole genome sequencing, it yields genetic information spanning the whole genome at lower cost, although its library construction is more complicated. For most studies the full whole-genome result is redundant and simplified genome sequencing is sufficient for downstream analysis. At the same cost, simplified genome sequencing therefore allows more samples to be sequenced and genotyped quickly, which has greatly accelerated its adoption in many fields and is particularly important for genetic breeding and its industrial applications, where sequencing cost is a major concern.
Disclosure of Invention
To solve the problems in the prior art, the invention provides a library-construction adaptor and a method for simplified genome sequencing suitable for DNBSEQ technology. The adaptor allows genome fragments produced by restriction enzyme digestion to be circularized and formed into DNA nanoballs, so that a simplified genome library suitable for DNBSEQ technology can be constructed and simplified genome sequencing can be performed on a DNBSEQ platform.
In a first aspect, the invention provides a library-construction adaptor for simplified genome sequencing, comprising an upstream adaptor sequence and a downstream adaptor sequence;
the upstream adaptor sequence comprises two reverse-complementary long DNA strands, one of which comprises an upstream Barcode fragment, a Q1-1 fragment, an Index fragment and a Q1-2 fragment connected in sequence;
the downstream adaptor sequence comprises two long DNA strands, which comprise a reverse-complementary downstream Barcode fragment and a Q2 fragment that is not completely reverse complementary;
the Q1-1 fragment comprises the nucleotide sequence shown in SEQ ID NO.1, the Q1-2 fragment comprises the nucleotide sequence shown in SEQ ID NO.2, and the two strands of the Q2 fragment comprise the sequences shown in SEQ ID NO.3 and SEQ ID NO.4, respectively.
In the prior art, Barcode sequences and Index sequences are both used to distinguish sequencing fragments from different sources, and generally only one of them is used. DNBSEQ uses the Index sequence to distinguish sequencing reads; the Index sequence is also required by the DNBSEQ sequencing platform, and its absence causes sequencing to fail. In the research leading to the invention, however, it was found that simplified genome sequencing on a DNBSEQ platform cannot be completed with the Index sequence alone: in a simplified genome library every insert begins with the same bases (those of the restriction site), and the resulting base imbalance at the start of sequencing causes the run to fail. The invention therefore uses a group of Barcode sequences of different lengths. In practice the Barcode sequence in the adaptor plays two roles. First, it helps distinguish reads, and Barcode and Index can be combined for this purpose. Second, because the Barcodes differ in length, the start positions of the inserts that follow them are staggered, which makes simplified genome sequencing possible and also makes it easy to adjust the Barcode bases so that the base composition at each position is balanced. The Barcode length is therefore not a fixed value. In practice its range is constrained by two factors: the Barcode must be long enough to be distinguishable, and it must not be so long that it wastes sequencing capacity. The Barcode length is accordingly limited to 6-20 bp.
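The base-balance argument can be illustrated with a small sketch. It assumes, purely for illustration, that each read starts with the Barcode followed by the bases left at the insert start by EcoRI digestion (AATTC); the Barcode sequences below are hypothetical and are not taken from the tables of the invention. For each sequencing cycle the function reports the fraction of reads that show the most common base, so a value of 1.0 means that every read shows the same base in that cycle.

```python
from collections import Counter

# Bases at the start of every insert after EcoRI digestion (G^AATTC).
SITE_REMNANT = "AATTC"

# Hypothetical Barcodes, for illustration only.
EQUAL_LENGTH = ["ACGTAC", "CGTACG", "GTACGT", "TACGTA"]
STAGGERED = ["ACGTAC", "CGTACGT", "GTACGTAC", "TACGTACGT"]


def build_reads(barcodes):
    """Model the start of each read as Barcode followed by the site remnant."""
    return [barcode + SITE_REMNANT for barcode in barcodes]


def per_cycle_max_fraction(reads, n_cycles):
    """For each cycle, return the share of reads showing the most common base."""
    fractions = []
    for cycle in range(n_cycles):
        counts = Counter(read[cycle] for read in reads if len(read) > cycle)
        fractions.append(max(counts.values()) / sum(counts.values()))
    return fractions


equal_profile = per_cycle_max_fraction(build_reads(EQUAL_LENGTH), 11)
staggered_profile = per_cycle_max_fraction(build_reads(STAGGERED), 11)

print("equal-length Barcodes:", equal_profile)
print("staggered Barcodes:   ", staggered_profile)
```

With Barcodes of equal length the site remnant falls in the same cycles for every read, so those cycles are fully unbalanced; with Barcodes of different lengths the remnant is spread over different cycles in different reads.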
In addition, the Q2 fragment in the downstream adaptor provided by the invention is not completely reverse complementary, so the adaptor is Y-shaped after ligation, which facilitates circularization of the fragment to be sequenced.
Further, the upstream adaptor sequence and the downstream adaptor sequence each further comprise a restriction-site end sequence.
Further, the Index fragment is 10 bp long, and the upstream Barcode fragment and the downstream Barcode fragment are each 6-20 bp long.
The invention further provides a primer pair for use with the library-construction adaptor, comprising:
P1: 5'-TGTGAGCCAAGGAGTTG-3',
P2: 5'-GAACGACATGGCTACGATCC-3'.
The invention further provides the use of the library-construction adaptor or the primer pair in simplified genome sequencing based on DNBSEQ technology.
In a second aspect, the invention provides a simplified genome sequencing method suitable for DNBSEQ technology, which uses the library-construction adaptor described above.
Further, the method comprises the steps of:
obtaining the genomic DNA of a sample to be sequenced, digesting it with restriction endonucleases, ligating the digested DNA fragments to the library-construction adaptor, and sequencing after circularization and digestion.
Further, samples to be sequenced from different individuals are assigned different library-construction adaptors;
each library-construction adaptor comprises a different Index fragment, a different upstream Barcode fragment and a different downstream Barcode fragment.
Further, the restriction-site end sequence of the library-construction adaptor is determined by the restriction enzyme used for digestion.
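Because each individual receives its own Index, upstream Barcode and downstream Barcode, a read can in principle be traced back to its sample by the combination of the three. The following is a minimal sketch of that lookup. The sample names, the sequences and the assumption that the upstream Barcode opens one read and the downstream Barcode opens the other are all hypothetical; the invention does not prescribe a demultiplexing procedure.

```python
# Hypothetical sample sheet: name -> (Index, upstream Barcode, downstream Barcode).
SAMPLES = {
    "sample_A": ("ACGTACGTAC", "GATTACAG", "CTAGCT"),
    "sample_B": ("TGCATGCATG", "CCATGGTA", "AGGTCA"),
}


def assign_sample(index_seq, read1, read2, samples):
    """Return the sample whose Index and both Barcodes match, else None.

    read1 is assumed to start with the upstream Barcode and read2 with the
    downstream Barcode (an assumption made for this illustration).
    """
    matches = [
        name
        for name, (index, barcode1, barcode2) in samples.items()
        if index_seq == index
        and read1.startswith(barcode1)
        and read2.startswith(barcode2)
    ]
    # Ambiguous or missing matches are left unassigned.
    return matches[0] if len(matches) == 1 else None


hit = assign_sample("ACGTACGTAC", "GATTACAGAATTCGGG", "CTAGCTCGGTTT", SAMPLES)
miss = assign_sample("ACGTACGTAC", "CCATGGTAAATTCGGG", "CTAGCTCGGTTT", SAMPLES)
print(hit, miss)
```

Requiring all three elements to agree means that a read carrying the Index of one sample and the Barcode of another is rejected rather than misassigned.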
The invention has the following beneficial effects:
The invention provides a library-construction adaptor comprising a restriction-site end sequence, an Index sequence and a Barcode fragment for distinguishing the source of the sample, together with other fragments used for circularization. Based on this adaptor, the invention provides a simplified genome sequencing method for DNBSEQ technology: genome fragments produced by restriction enzyme digestion are circularized and formed into DNA nanoballs by means of the adaptor, so that simplified genome sequencing is achieved on a DNBSEQ platform with high sequencing quality.
Drawings
FIG. 1 is a schematic structural diagram of the upstream adaptor F provided in Example 1 of the invention;
FIG. 2 is a schematic structural diagram of the downstream adaptor R provided in Example 1 of the invention;
FIG. 3 is a schematic structural diagram of a restriction fragment after ligation to the adaptors;
FIG. 4 shows the fragment distribution of the circularization product provided in Example 2 of the invention;
FIG. 5 shows the quality distribution of the sequencing data provided in Example 2 of the invention.
Detailed Description
The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention.
Example 1
This example designs a library-construction adaptor composed of one or more pairs of double-stranded DNA fragments; each pair consists of an upstream adaptor sequence (F) and a downstream adaptor sequence (R).
As shown in FIG. 1, the upstream adaptor F comprises the following parts:
(1) An end portion (E1), used for complementary pairing with the cohesive end generated by the restriction enzyme; the E1 sequence can be replaced according to the restriction enzyme selected. E1 is a cohesive end composed of two single DNA strands, the forward strand E1-A and the reverse-complementary strand E1-B. E1-A is the sequence, read 5' to 3', of the double-stranded part of the cohesive end left after the DNA is digested by the corresponding restriction enzyme; E1-B is the reverse-strand sequence, read 3' to 5', left after digestion by this restriction enzyme, including both the double-stranded part and the protruding single-stranded cohesive end. The E1-B fragment must be phosphorylated (P) at its 5' end.
(2) Barcode-1 sequence (B1): this part is used to distinguish DNA fragments from different individuals. Its length and sequence are variable, and different sequences correspond to different individuals.
(3) Index sequence (I): this part is also used to distinguish DNA fragments from different individuals. Its length is fixed, and different sequences correspond to different individuals.
(4) Other sequences (Q1), divided into Q1-1 and Q1-2. The 5' to 3' sequence of the Q1-1 fragment is TGTGAGCCAAGGAGTTG; the 5' to 3' sequence of the Q1-2 fragment is TTGTCTTCCTAAGACCGCTTGGCCTCCGACTT. The length of this part is fixed; it is used for DNA circularization, amplification and other processing, and for sequencing on the DNBSEQ platform.
The F adaptor is formed by annealing two reverse-complementary single DNA strands, F1 and F2; the E1-B part leaves a cohesive end exposed, and the remainder is a complementary DNA duplex. From 5' to 3', the F1 strand consists of the Q1-1 sequence, the Index sequence, the Q1-2 sequence, the Barcode-1 sequence and the restriction-site sequence E1-A; its general formula from 5' to 3' is TGTGAGCCAAGGAGTTG (I) TTGTCTTCCTAAGACCGCTTGGCCTCCGACTT (B1) (E1-A). From 5' to 3', the F2 strand consists of the reverse-strand sequence E1-B (which includes the protruding cohesive end), the reverse complement of Barcode-1 (B1'), the reverse complement of Q1-2 (Q1-2'), the reverse complement of the Index (I') and the reverse complement of Q1-1 (Q1-1'); its general formula from 5' to 3' is (P) (E1-B) (B1') AAGTCGGAGGCCAAGCGGTCTTAGGAAGACAA (I') CAACTCCTTGGCTCACA.
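The general formulas of F1 and F2 can be checked against the sequence listing with a short sketch. It assumes that SEQ ID NO.9 and SEQ ID NO.14 are the two strands of one F adaptor, which is an inference from their complementarity and not a pairing stated in the text. Because the table that separates Barcode-1 from E1-A is only available as an image, the 8 bases following Q1-2 are treated here as one unit (`barcode_and_site`, a hypothetical parameter name), and the 5'-CG overhang is the one left by MspI (C^CGG).

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


Q1_1 = "TGTGAGCCAAGGAGTTG"                  # SEQ ID NO.1
Q1_2 = "TTGTCTTCCTAAGACCGCTTGGCCTCCGACTT"   # SEQ ID NO.2


def build_f_adaptor(index, barcode_and_site, overhang):
    """Assemble F1 and F2 (both written 5' to 3') from the general formulas.

    barcode_and_site: Barcode-1 followed by E1-A (not split in this sketch).
    overhang: protruding single-stranded bases at the 5' end of F2.
    """
    f1 = Q1_1 + index + Q1_2 + barcode_and_site
    f2 = (overhang + revcomp(barcode_and_site) + revcomp(Q1_2)
          + revcomp(index) + revcomp(Q1_1))
    return f1, f2


SEQ_ID_9 = ("tgtgagccaaggagttgaatagacacattgtcttcctaag"
            "accgcttggcctccgacttgattagta").upper()
SEQ_ID_14 = ("cgtactaatcaagtcggaggccaagcggtcttaggaagac"
             "aatgtgtctattcaactccttggctcaca").upper()

f1, f2 = build_f_adaptor("AATAGACACA", "GATTAGTA", "CG")
print(f1 == SEQ_ID_9, f2 == SEQ_ID_14)
```

Apart from the overhang, F2 is simply the reverse complement of F1, which is why the rest of the F adaptor anneals into a fully paired duplex.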
As shown in FIG. 2, the downstream adaptor R comprises the following parts:
(1) An end portion (E2), used for complementary pairing with the cohesive end generated by the restriction enzyme; the E2 sequence can be replaced according to the restriction enzyme selected. E2 is a cohesive end composed of two single DNA strands, the forward strand E2-A and the reverse-complementary strand E2-B. E2-A is the sequence, read 5' to 3', of the double-stranded part of the cohesive end left after the DNA is digested by the corresponding restriction enzyme; E2-B is the reverse-strand sequence, read 3' to 5', left after digestion by this restriction enzyme, including both the double-stranded part and the protruding single-stranded cohesive end. The E2-B fragment must be phosphorylated (P) at its 5' end.
(2) Barcode-2 sequence (B2): this part is used to distinguish DNA fragments from different individuals. Its length and sequence are variable, and different sequences correspond to different individuals.
(3) Other sequences (Q2). The 5' to 3' sequence of the Q2 forward-strand fragment is GAACGACATGGCTACGATCCGACTT; the 5' to 3' sequence of the partially reverse-complementary fragment of Q2 (Q2') is AAGTCGGATCGTGTAAGCTCATCCA. The first 13 bases at the 5' end of Q2 are not complementary to Q2' (they face the last 13 bases at the 3' end of Q2'), while the remaining 12 bases pair. The length of this part is fixed; it is used for DNA circularization, amplification and other processing, and for sequencing on the DNBSEQ platform.
The R adaptor is formed by annealing two partially reverse-complementary single DNA strands, R1 and R2; the E2-B part leaves a cohesive end exposed, the non-complementary part of Q2 remains as two single strands, and the remainder is a DNA duplex. From 5' to 3', the R1 strand consists of the Q2 sequence, the Barcode-2 sequence and the restriction-site sequence E2-A; its general formula from 5' to 3' is GAACGACATGGCTACGATCCGACTT (B2) (E2-A). From 5' to 3', the R2 strand consists of the reverse-strand sequence E2-B of the restriction site (which includes the protruding cohesive end), the reverse complement of Barcode-2 (B2') and the partially reverse-complementary sequence of Q2 (Q2'); its general formula from 5' to 3' is (P) (E2-B) (B2') AAGTCGGATCGTGTAAGCTCATCCA.
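The Y shape of the R adaptor follows directly from the two Q2 sequences. The sketch below counts how many bases at the 5' end of Q2' pair with the 3' end of Q2; whatever does not pair forms the open arms of the Y. The function name `stem_length` is introduced here for illustration only.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


Q2 = "GAACGACATGGCTACGATCCGACTT"         # SEQ ID NO.3
Q2_PRIME = "AAGTCGGATCGTGTAAGCTCATCCA"   # SEQ ID NO.4


def stem_length(strand, partner):
    """Count the bases at the 5' end of `partner` that pair, without
    interruption, with the 3' end of `strand`."""
    fully_paired = revcomp(strand)
    paired = 0
    for base, expected in zip(partner, fully_paired):
        if base != expected:
            break
        paired += 1
    return paired


stem = stem_length(Q2, Q2_PRIME)
arm = len(Q2) - stem
print(f"paired stem: {stem} bp, unpaired arm: {arm} nt per strand")
```

The 12 paired bases hold the two strands together next to the Barcode, while the 13 unpaired bases on each strand stay single-stranded.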
The restriction-site end fragments (E1 or E2) in the upstream and downstream adaptors can be chosen according to the restriction enzymes actually used, so that the design can be adapted to different species or to different numbers of DNA fragments. The number of distinct Barcode fragments (B1 or B2) and Index fragments can be increased with the number of individuals to be sequenced together, so as to accommodate different sample sizes and sequencing throughputs. Replacing the E/B/I sequences yields many adaptor combinations; each combination contains at least one upstream adaptor and at least one downstream adaptor. That is, at least one F adaptor and at least one R adaptor must be used together so that adaptors are ligated to both the upstream and downstream ends of the restriction-digested genomic fragments.
This example further designs a primer pair for use with the library-construction adaptor described above. The pair consists of two single DNA strands (P1 and P2), which contain sequences complementary to the adaptors and sequences specific to the DNBSEQ platform, and is used to amplify DNA fragments to which the adaptors have been ligated.
The 5' to 3' sequence of the P1 primer is TGTGAGCCAAGGAGTTG, which is identical to the Q1-1 part of Q1 in the F adaptor;
the P2 primer must be phosphorylated (P) at its 5' end; its 5' to 3' sequence is (P) GAACGACATGGCTACGATCC, which is identical to the first 20 bases of Q2 in the R adaptor.
The primers P1 and P2 must be used as a pair and specifically amplify only DNA fragments carrying both an F adaptor and an R adaptor. Amplification is not affected by the Index, Barcode or restriction-site sequences in the F/R adaptors.
Example 2
In this example, a simplified genome sequencing method suitable for DNBSEQ technology is designed using the library-construction adaptor of Example 1. The specific procedure is as follows:
1. Adapting to different restriction enzymes and Barcode/Index sequences
Following the adaptor structure and general formulas designed in the invention, adaptors can be designed for sequencing requirements such as the following:
(1) Genomic DNA is double-digested with EcoRI-MspI and libraries are constructed for 3 individuals. This requires 3 Index, 3 Barcode-1 and 3 Barcode-2 sequences, and the E1 and E2 fragments are replaced with the cohesive ends produced by EcoRI and MspI, respectively, as shown in Table 2 (SEQ ID NO. 7-18):
table 1: when EcoRI-MspI was used to double-cleave genomic DNA, 3 individuals of Barcode, Index and cleavage site
Figure BDA0002934947440000071
Figure BDA0002934947440000081
Table 2: when EcoRI-MspI is used for double digestion of genomic DNA, 3 individual linkers complete the sequence
Figure BDA0002934947440000082
(2) Genomic DNA is double-digested with EcoRI-MseI and libraries are constructed for 4 individuals. This requires 1 Index, 4 Barcode-1 and 3 Barcode-2 sequences, and the E1 and E2 fragments are replaced with the cohesive ends produced by EcoRI and MseI, respectively, as shown in Table 4 (SEQ ID NO. 19-34):
table 3: when EcoRI-MseI was used for double digestion of genomic DNA, 4 individuals of Barcode, Index and cleavage site
Figure BDA0002934947440000083
Figure BDA0002934947440000091
Table 4: when EcoRI-MseI double digestion of genomic DNA is used, the complete sequence of the 4 individual linkers
Figure BDA0002934947440000092
Figure BDA0002934947440000101
By adjusting the Index, Barcode and restriction-site end sequences in the F and R adaptors according to the restriction enzymes used, the number of individuals and similar requirements, libraries can be constructed specifically for different species, different numbers of individuals and different digestion schemes.
2. Circularization performance
2.1 Experimental materials
The standard circularization primer "Splint oligo" was used; its 5' to 3' sequence is GCCATGTCGTTCTGTGAGCCAAGG.
The adaptor sequences described in Example 1 were used:
Table 5: Adaptor sequences used for the circularization test
[Table 5 is provided as an image in the original publication: BDA0002934947440000102]
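The relationship between the Splint oligo and the primers of Example 1 can be checked with a short sketch. It assumes that the strand to be circularized is the amplified strand that begins with the 5'-phosphorylated primer P2 and ends with the reverse complement of P1; the text does not state which strand is circularized, so this is an interpretation. Under that assumption, joining the 3' end of the strand to its own 5' end creates a junction, and the Splint oligo should be the reverse complement of that junction.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


P1 = "TGTGAGCCAAGGAGTTG"
P2 = "GAACGACATGGCTACGATCC"
SPLINT = "GCCATGTCGTTCTGTGAGCCAAGG"


def circular_junction(five_prime_end, three_prime_end, flank=12):
    """Sequence (5' to 3') across the ligation point formed when the 3' end
    of a linear strand is joined to its own 5' end."""
    return three_prime_end[-flank:] + five_prime_end[:flank]


# Assumed strand: 5'-(P)P2 ... insert ... revcomp(P1)-3'
junction = circular_junction(P2, revcomp(P1))
splint_bridges_junction = revcomp(junction) == SPLINT
print(junction, splint_bridges_junction)
```

Each half of the 24-base Splint oligo pairs with 12 bases on one side of the junction, which holds the two ends of the strand next to each other for ligation.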
2.2 Genomic DNA extraction
Genomic DNA was extracted with a QIAGEN DNA Mini kit according to the kit instructions. The extracted genomic DNA was diluted to 40 ng/μL with nuclease-free water.
2.3 Adaptor ligation
The forward and reverse strands of S1-R in Table 5 were annealed to form adaptor S1-R, and the forward and reverse strands of S1-F were annealed to form adaptor S1-F. Using T4 ligase, S1-R and S1-F were ligated to the two ends of the fragments produced by EcoRI and MspI digestion to obtain the ligation product (structure shown in FIG. 3). Its concentration was 7.41 ng/μL.
2.4 Circularization and digestion steps
(1) Splint oligo (volume missing in the source) was added to 23 μL of the ligation product; the mixture was denatured at 95 °C for 3 minutes on a PCR instrument and then immediately transferred to an ice bath.
(2) 12 μL of Splint Buffer, 1.2 μL of excitation Enhancer, 36.4 μL of nuclease-free water and 0.4 μL of excitation Enzyme were added to the reaction.
(3) The reaction was incubated at 37 °C for 60 minutes on a PCR instrument and then held at 4 °C.
(4) 0.8 μL of Digestion Buffer, 2 μL of nuclease-free water, 3.9 μL of Digestion Enzyme I and 1.3 μL of Digestion Enzyme III were added to the reaction.
(5) After brief centrifugation, the reaction was incubated at 37 °C for 30 minutes on a PCR instrument and held at 4 °C; 15 μL of Digestion Stop Buffer was then added to obtain the circularized product.
2.5 Circularization result
The input was 170.43 ng of DNA and 20.6 ng of circularized DNA was obtained, a circularization efficiency of 12.09%, which meets the circularization efficiency criterion of the BGI sequencing platform (the fragment distribution of the circularized product is shown in FIG. 4). The resulting DNB concentration was 15.7 ng/μL, which meets the loading requirement of the BGI sequencing platform.
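The reported figures can be reproduced with a few lines of arithmetic. The sketch assumes that the input mass is the ligation product concentration from section 2.3 multiplied by the 23 μL used in section 2.4, and that efficiency is the mass of circularized DNA divided by the input mass; the function name is introduced here for illustration.

```python
def circularization_efficiency(input_ng, circular_ng):
    """Circularized DNA as a percentage of the input DNA mass."""
    return 100.0 * circular_ng / input_ng


ligation_conc_ng_per_ul = 7.41   # concentration of the ligation product
ligation_volume_ul = 23.0        # volume taken into the circularization
circular_ng = 20.6               # circularized DNA recovered

input_ng = ligation_conc_ng_per_ul * ligation_volume_ul
efficiency = circularization_efficiency(input_ng, circular_ng)
print(f"input: {input_ng:.2f} ng, efficiency: {efficiency:.2f}%")
```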
3. Library sequencing results
In this example, the library constructed with the adaptors was sequenced on a BGI-500 platform, yielding 12 Gb of sequencing data. All metrics of the data were normal, showing that the adaptor structure can be sequenced successfully on the BGI sequencing platform. The quality distribution of the sequencing data is shown in FIG. 5.
Table 6: Adaptor sequences used for the sequenced library
[Table 6 is provided as an image in the original publication: BDA0002934947440000111, BDA0002934947440000121]
Although the invention has been described in detail hereinabove with respect to a general description and specific embodiments thereof, it will be apparent to those skilled in the art that modifications or improvements may be made thereto based on the invention. Accordingly, such modifications and improvements are intended to be within the scope of the invention as claimed.
Sequence listing
<110> China Agricultural University
<120> library building joint and method for simplified genome sequencing suitable for DNBSEQ technology
<130> KHP211110335.7
<160> 35
<170> SIPOSequenceListing 1.0
<210> 1
<211> 17
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 1
tgtgagccaa ggagttg 17
<210> 2
<211> 32
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 2
ttgtcttcct aagaccgctt ggcctccgac tt 32
<210> 3
<211> 25
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 3
gaacgacatg gctacgatcc gactt 25
<210> 4
<211> 25
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 4
aagtcggatc gtgtaagctc atcca 25
<210> 5
<211> 17
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 5
tgtgagccaa ggagttg 17
<210> 6
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 6
gaacgacatg gctacgatcc 20
<210> 7
<211> 33
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 7
gaacgacatg gctacgatcc gacttatgac gca 33
<210> 8
<211> 37
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 8
aatttgcgtc ataagtcgga tcgtgtaagc tcatcca 37
<210> 9
<211> 67
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 9
tgtgagccaa ggagttgaat agacacattg tcttcctaag accgcttggc ctccgacttg 60
attagta 67
<210> 10
<211> 69
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 10
cgttagttga aagtcggagg ccaagcggtc ttaggaagac aatatacatg tccaactcct 60
tggctcaca 69
<210> 11
<211> 33
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 11
gaacgacatg gctacgatcc gacttagcat caa 33
<210> 12
<211> 37
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 12
aattttgatg ctaagtcgga tcgtgtaagc tcatcca 37
<210> 13
<211> 67
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 13
tgtgagccaa ggagttggac atgtatattg tcttcctaag accgcttggc ctccgacttt 60
caactaa 67
<210> 14
<211> 69
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 14
cgtactaatc aagtcggagg ccaagcggtc ttaggaagac aatgtgtcta ttcaactcct 60
tggctcaca 69
<210> 15
<211> 69
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 15
cgtactaatc aagtcggagg ccaagcggtc ttaggaagac aatgtgtcta ttcaactcct 60
tggctcaca 69
<210> 16
<211> 37
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 16
aatttgagag agaagtcgga tcgtgtaagc tcatcca 37
<210> 17
<211> 67
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 17
tgtgagccaa ggagttgatg ctgaattttg tcttcctaag accgcttggc ctccgacttt 60
gcgcgta 67
<210> 18
<211> 69
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 18
cgtacgcgca aagtcggagg ccaagcggtc ttaggaagac aaaattcagc atcaactcct 60
tggctcaca 69
<210> 19
<211> 74
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 19
tgtgagccaa ggagttgtta ccgacgtttg tcttcctaag accgcttggc ctccgactta 60
ggagttagtt cttc 74
<210> 20
<211> 76
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 20
tagaagaact aactcctaag tcggaggcca agcggtctta ggaagacaaa cgtcggtaac 60
aactccttgg ctcaca 76
<210> 21
<211> 33
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 21
gaacgacatg gctacgatcc gacttagcat caa 33
<210> 22
<211> 37
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 22
aattttgatg ctaagtcgga tcgtgtaagc tcatcca 37
<210> 23
<211> 74
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 23
tgtgagccaa ggagttgagt gacctcattg tcttcctaag accgcttggc ctccgactta 60
gtaggtatgg cgcc 74
<210> 24
<211> 74
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 24
tgtgagccaa ggagttgagt gacctcattg tcttcctaag accgcttggc ctccgactta 60
gtaggtatgg cgcc 74
<210> 25
<211> 76
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 25
taggcgccat acctactaag tcggaggcca agcggtctta ggaagacaat gaggtcactc 60
aactccttgg ctcaca 76
<210> 26
<211> 33
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 26
gaacgacatg gctacgatcc gacttatgac gca 33
<210> 27
<211> 37
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 27
aatttgcgtc ataagtcgga tcgtgtaagc tcatcca 37
<210> 28
<211> 74
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 28
tgtgagccaa ggagttgtcg gattcccttg tcttcctaag accgcttggc ctccgactta 60
gtcaattggc gctg 74
<210> 29
<211> 76
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 29
tacagcgcca attgactaag tcggaggcca agcggtctta ggaagacaag ggaatccgac 60
aactccttgg ctcaca 76
<210> 30
<211> 33
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 30
gaacgacatg gctacgatcc gacttctctc tca 33
<210> 31
<211> 37
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 31
aatttgagag agaagtcgga tcgtgtaagc tcatcca 37
<210> 32
<211> 74
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 32
tgtgagccaa ggagttgcaa ggtacggttg tcttcctaag accgcttggc ctccgacttc 60
aactatccgt gtgg 74
<210> 33
<211> 76
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 33
taccacacgg atagttgaag tcggaggcca agcggtctta ggaagacaac cgtaccttgc 60
aactccttgg ctcaca 76
<210> 34
<211> 33
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 34
gaacgacatg gctacgatcc gacttgagtt cca 33
<210> 35
<211> 37
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 35
aatttggaac tcaagtcgga tcgtgtaagc tcatcca 37
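As a reading aid, the pairing between listed strands can be checked computationally. The Python sketch below copies SEQ ID NO.13, NO.10, NO.11 and NO.12 exactly as listed above and tests how far each longer strand is the reverse complement of the shorter one. Treating the extra leading bases of the longer strand as a restriction-site overhang is an assumption made for this sketch, not a statement from the listing, and all helper names are hypothetical.

```python
# Sequences copied from the listing above, kept in the same groups of ten.
SEQ13 = ("tgtgagccaa" "ggagttggac" "atgtatattg" "tcttcctaag"
         "accgcttggc" "ctccgacttt" "caactaa")
SEQ10 = ("cgttagttga" "aagtcggagg" "ccaagcggtc" "ttaggaagac"
         "aatatacatg" "tccaactcct" "tggctcaca")
SEQ11 = "gaacgacatg" "gctacgatcc" "gacttagcat" "caa"
SEQ12 = "aattttgatg" "ctaagtcgga" "tcgtgtaagc" "tcatcca"

_COMPLEMENT = str.maketrans("acgt", "tgca")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_overhang(shorter: str, longer: str) -> tuple[str, str]:
    """Split `longer` into (leading extra bases, part as long as `shorter`).

    Assumption: the extra leading bases are a single-stranded overhang.
    """
    extra = len(longer) - len(shorter)
    return longer[:extra], longer[extra:]


def common_prefix_len(a: str, b: str) -> int:
    """Count identical leading positions of two strings."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


up_overhang, up_paired = split_overhang(SEQ13, SEQ10)
down_overhang, down_paired = split_overhang(SEQ11, SEQ12)

# Is the paired part of NO.10 exactly the reverse complement of NO.13?
upstream_fully_complementary = up_paired == revcomp(SEQ13)
# How many leading bases of NO.12's paired part match revcomp(NO.11)?
downstream_matched_prefix = common_prefix_len(down_paired, revcomp(SEQ11))

print(up_overhang, upstream_fully_complementary)
print(down_overhang, downstream_matched_prefix, len(SEQ11))
```

Under these assumptions, SEQ ID NO.10 is the full reverse complement of SEQ ID NO.13 preceded by two extra bases, whereas SEQ ID NO.12 follows the reverse complement of SEQ ID NO.11 for 20 bases and then diverges.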

Claims (9)

1. A library construction adapter for simplified genome sequencing, comprising an upstream adapter sequence and a downstream adapter sequence;
the upstream adapter sequence comprises two reverse-complementary long DNA strands, one of which comprises an upstream Barcode fragment, a Q1-1 fragment, an Index fragment and a Q1-2 fragment connected in that order;
the downstream adapter sequence comprises two long DNA strands, which comprise reverse-complementary downstream Barcode fragments and Q2 fragments that are not completely reverse-complementary;
the Q1-1 fragment comprises a nucleotide sequence shown as SEQ ID NO.1, the Q1-2 fragment comprises a nucleotide sequence shown as SEQ ID NO.2, and the Q2 fragment comprises sequences shown as SEQ ID NO.3 and SEQ ID NO.4 respectively;
the upstream Barcode fragment and the downstream Barcode fragment are of unequal length.
2. The library construction adapter of claim 1, wherein the upstream adapter sequence and the downstream adapter sequence each further comprise a restriction enzyme cleavage end sequence.
3. The library construction adapter of claim 1 or 2, wherein the length of the Index fragment is 10 bp; and/or
the lengths of the upstream Barcode fragment and the downstream Barcode fragment are 6-20 bp.
4. A primer pair for use with the library construction adapter of any one of claims 1 to 3, comprising:
P1:5’-TGTGAGCCAAGGAGTTG-3’,
P2:5’-GAACGACATGGCTACGATCC-3’.
5. Use of the library construction adapter of any one of claims 1 to 3 or the primer pair of claim 4 for simplified genome sequencing based on DNBSEQ technology.
6. A simplified genome sequencing method suitable for DNBSEQ technology, comprising using the library construction adapter of any one of claims 1-3.
7. The method of claim 6, comprising:
obtaining genomic DNA of a sample to be sequenced, digesting it with a restriction endonuclease, ligating the digested DNA fragments to the library construction adapter, and sequencing after circularization and digestion.
8. The method of claim 7, wherein samples to be sequenced from different individuals correspond to different library construction adapters;
each library construction adapter comprises a different Index fragment, a different upstream Barcode fragment, and a different downstream Barcode fragment.
9. The method of claim 6 or 7, wherein the cleavage end sequence of the library construction adapter is determined by the type of restriction endonuclease used.
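The length and uniqueness conditions recited in claims 1, 3 and 8 can be expressed as a small design check. The Python sketch below is illustrative only: the `AdapterSet` record, the `check_adapter_sets` function and every example sequence are hypothetical and are not taken from the sequence listing. Claim 3 states its two length conditions as "and/or"; for simplicity the sketch applies both.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdapterSet:
    """Hypothetical record of the variable fragments assigned to one individual."""
    individual: str
    index: str
    upstream_barcode: str
    downstream_barcode: str


def check_adapter_sets(sets: list[AdapterSet]) -> list[str]:
    """Return a list of human-readable problems; an empty list means all checks pass."""
    problems = []
    for s in sets:
        # Claim 3: the Index fragment is 10 bp long.
        if len(s.index) != 10:
            problems.append(f"{s.individual}: Index is {len(s.index)} bp, expected 10 bp")
        # Claim 3: each Barcode fragment is 6-20 bp long.
        for name, bc in (("upstream", s.upstream_barcode),
                         ("downstream", s.downstream_barcode)):
            if not 6 <= len(bc) <= 20:
                problems.append(f"{s.individual}: {name} Barcode is {len(bc)} bp, expected 6-20 bp")
        # Claim 1: upstream and downstream Barcode fragments differ in length.
        if len(s.upstream_barcode) == len(s.downstream_barcode):
            problems.append(f"{s.individual}: upstream and downstream Barcodes have equal length")
    # Claim 8: each individual gets a different Index and different Barcodes.
    for field in ("index", "upstream_barcode", "downstream_barcode"):
        values = [getattr(s, field) for s in sets]
        if len(set(values)) != len(values):
            problems.append(f"{field} values are not unique across individuals")
    return problems


# Placeholder sequences, invented for this example.
good = [
    AdapterSet("sample_A", "ACGTACGTAC", "ACGTTGCAACGT", "TTGACCAG"),
    AdapterSet("sample_B", "TGCATGCATG", "GGATCCTTAGCA", "CAGTTCGA"),
]
bad = [AdapterSet("sample_C", "ACGTACGT", "TTGACCAG", "CAGTTCGA")]

print(check_adapter_sets(good))
for problem in check_adapter_sets(bad):
    print(problem)
```

Returning a list of messages rather than raising on the first failure lets a whole plate of adapter assignments be reviewed in one pass.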
CN202110159102.8A 2021-02-04 2021-02-04 Library building joint and method for simplified genome sequencing suitable for DNBSEQ technology Pending CN114854825A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110159102.8A CN114854825A (en) 2021-02-04 2021-02-04 Library building joint and method for simplified genome sequencing suitable for DNBSEQ technology

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110159102.8A CN114854825A (en) 2021-02-04 2021-02-04 Library building joint and method for simplified genome sequencing suitable for DNBSEQ technology

Publications (1)

Publication Number Publication Date
CN114854825A true CN114854825A (en) 2022-08-05

Family

ID=82622973

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110159102.8A Pending CN114854825A (en) 2021-02-04 2021-02-04 Library building joint and method for simplified genome sequencing suitable for DNBSEQ technology

Country Status (1)

Country Link
CN (1) CN114854825A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115948621A (en) * 2023-01-18 2023-04-11 珠海舒桐医疗科技有限公司 HPV screening method based on menstrual blood DNA

Similar Documents

Publication Publication Date Title
TWI742059B (en) DNA amplification method
CN106906211B (en) Molecular joint and application thereof
CN108138175B (en) Reagents, kits and methods for molecular barcode encoding
CN111808854B (en) Balanced joint with molecular bar code and method for quickly constructing transcriptome library
CN111321202A (en) Gene fusion variation library construction method, detection method, device, equipment and storage medium
CN112359093B (en) Method and kit for preparing and expressing and quantifying free miRNA library in blood
WO2012037881A1 (en) Nucleic acid tags and use thereof
CN112251422B (en) Transposase complex containing unique molecular tag sequence and application thereof
CN114574557B (en) General type preclinical biodistribution detection kit for NK cell therapy products
CN112322700B (en) Construction method, kit and application of short RNA fragment library
CN113046835A (en) Sequencing library construction method for detecting lentivirus insertion site and lentivirus insertion site detection method
WO2019212138A1 (en) Internal control substance for discovering cross-contamination between samples for next generation sequencing
CN114854825A (en) Library building joint and method for simplified genome sequencing suitable for DNBSEQ technology
CN110777154A (en) Mutant gene for drug resistance detection of mycobacterium tuberculosis, and detection method and kit thereof
CN116287124A (en) Single-stranded joint pre-connection method, library construction method of high-throughput sequencing library and kit
CN107002150B (en) High-throughput detection method for DNA synthesis product
Magbanua et al. Innovations in double digest restriction-site associated DNA sequencing (ddRAD-Seq) method for more efficient SNP identification
CN115927540A (en) Construction method of small RNA high-throughput sequencing library based on splint connection
CN112813200B (en) Method for extremely short PCR amplification of nucleic acid, detection method and application
Bhattacharya et al. Experimental toolkit to study RNA level regulation
CN109609681B (en) Identification method of loblolly pine individual based on chloroplast genome sequence
CN112176422A (en) Construction method of RNA library
CN111793623A (en) Typing genetic marker composition, kit, identification system and typing method of 62 multi-allelic SNP-NGS
WO2024093961A1 (en) Method for reduced-representation genome sequencing and related use
WO2022199242A1 (en) Set of barcode linkers and medium-flux multi-single-cell representative dna methylation library construction and sequencing method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination