CN113046835A - Sequencing library construction method for detecting lentivirus insertion site and lentivirus insertion site detection method - Google Patents

Publication number
CN113046835A
Authority
CN
China
Prior art keywords
sequence
lentivirus
sequencing
pcr amplification
round
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911376706.7A
Other languages
Chinese (zh)
Inventor
李静
孙海汐
刘超
周子恒
杨林
欧阳文杰
刘田宾
董国艺
顾颖
侯勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Hemu gene Biotechnology Co.,Ltd.
Original Assignee
BGI Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BGI Shenzhen Co Ltd
Priority to CN201911376706.7A
Publication of CN113046835A
Legal status: Pending

Classifications

    • C: CHEMISTRY; METALLURGY
    • C40: COMBINATORIAL TECHNOLOGY
    • C40B: COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00: Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/06: Biochemical methods, e.g. using enzymes or whole viable microorganisms
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing

Abstract

A sequencing library construction method for detecting lentivirus insertion sites and a lentivirus insertion site detection method are provided. The sequencing library construction method comprises the following steps: extracting genomic DNA from lentivirus-infected cells; fragmenting the genomic DNA and processing it into a form suitable for adaptor ligation; ligating asymmetric double-stranded adaptors, each comprising a long-strand sequence and a short-strand sequence, to both ends of the fragmented genomic DNA; performing a first round of PCR amplification on the adaptor ligation product; performing a second round of PCR amplification on the product of the first round; and circularizing the product of the second round of PCR amplification to obtain a circularized sequencing library suitable for on-machine sequencing. The invention is simple to operate, has a short experimental time, low cost, a small starting input and high throughput, supports multiple kinds of analysis, and enables accurate analysis and evaluation of viral insertion sites in gene therapy.

Description

Sequencing library construction method for detecting lentivirus insertion site and lentivirus insertion site detection method
Technical Field
The invention relates to the technical field of gene analysis and detection, in particular to a sequencing library construction method for detecting a lentivirus insertion site and a lentivirus insertion site detection method.
Background
Gene therapy is an effective means of treating genetic diseases and other malignant diseases, and has been applied clinically in the treatment of severe thalassemia and hematological diseases. Lentiviral vectors are commonly used as delivery vectors in gene therapy, but because lentiviruses integrate randomly into the host genome, an insertion may inactivate a tumor suppressor gene or activate proto-oncogene expression, creating a risk of carcinogenesis. Assessing lentiviral insertion sites and the associated safety is therefore required work in preclinical studies and clinical applications. It is necessary to study carefully the regions of viral insertion and their distribution within individual clones. The FDA in the United States requires the detection of all viral insertion sites in gene therapy.
Methods for assessing random lentiviral insertion have evolved along with the use of gene therapy. Early studies used Southern hybridization and proviral PCR to identify the number of viral insertions, but this approach lacked sensitivity and specificity. In particular, insertional mutations carry the potential risk of activating proto-oncogenes or inhibiting the expression of tumor suppressor genes. In the 1990s, inverse PCR and splinkerette PCR, both of which rely on restriction enzymes, were developed and made it possible for the first time to detect the genomic sequences flanking an inserted viral genome. However, inverse PCR has low sensitivity, detects at most about 40 insertion sites, and does not allow detailed analysis of more complex in vivo samples. The development of linear amplification-mediated PCR (LAM-PCR) was a major advance in viral insertion site analysis, and most existing viral insertion site (VIS) analysis methods are based on it. LAM-PCR can detect rare insertion sites in complex samples such as those derived from peripheral blood, but it is biased by the frequency of restriction endonuclease recognition sites, which introduces technical error, and it has a long experimental cycle and requires extensive sequencing of individual clones. Accordingly, various techniques have been developed to reduce the intrinsic errors caused by restriction enzymes, and with the development of sequencing technology, LAM-PCR has been combined with high-throughput sequencing to become an effective method for identifying insertion sites.
Most existing viral insertion site detection methods build on LAM-PCR combined with sequencing, and mainly enrich insertion sites by labeling a known region with biotin. However, because of the low binding efficiency of biotin affinity capture, the required starting amount of template is relatively high, which makes the method difficult to apply to preclinical or clinical samples. Moreover, using a biotin-labeled primer against the target region increases synthesis cost and working time.
Disclosure of Invention
The invention provides a sequencing library construction method for detecting a lentivirus insertion site and a lentivirus insertion site detection method, which have the characteristics of rapidness, simplicity, low cost, small initial amount, multiple analyzable aspects and the like.
According to a first aspect of the present invention, the present invention provides a sequencing library construction method for detecting a lentivirus insertion site, comprising:
providing genomic DNA extracted from a lentivirus-infected cell, the genomic DNA including at least one lentivirus insertion site to be detected;
fragmenting said genomic DNA and processing said fragmented genomic DNA into a form suitable for adaptor ligation;
ligating asymmetric double-stranded adaptors to both ends of the fragmented genomic DNA suitable for adaptor ligation to obtain an adaptor ligation product, wherein each asymmetric double-stranded adaptor comprises a long-strand sequence and a short-strand sequence, the long-strand sequence comprises, in order from the 5' end to the 3' end, a fixed sequence, a random UMI sequence and an amplification primer binding sequence, and the short-strand sequence comprises a sequence complementary to the fixed sequence;
performing a first round of PCR amplification on said adaptor ligation product, wherein the primers of said first round of PCR amplification comprise a forward primer and a reverse primer, said forward primer comprising a segment of the lentiviral sequence inserted into the genome of said cell or a sequence complementary thereto, and said reverse primer comprising a sequence complementary to the amplification primer binding sequence of the long-strand sequence;
performing a second round of PCR amplification on the product of the first round of PCR amplification, wherein the primers of the second round of PCR amplification comprise a forward primer and a reverse primer, the forward primer comprises, in order from the 5' end to the 3' end, a 5' phosphate group, a sequencing platform forward fixed sequence, and a lentiviral sequence or a sequence complementary to the lentiviral sequence, the binding site of the forward primer is located downstream of the binding site of the forward primer of the first round of PCR amplification, and the reverse primer comprises, in order from the 5' end to the 3' end, a sequencing platform reverse fixed sequence and a sequence complementary to the amplification primer binding sequence of the long-strand sequence;
optionally, circularizing the products of the second round of PCR amplification to obtain a circularized sequencing library suitable for on-machine sequencing.
In a preferred embodiment, processing the fragmented genomic DNA into a form suitable for adaptor ligation comprises subjecting the fragmented genomic DNA to end repair and A-tailing (addition of an A base overhang).
In a preferred embodiment, the random UMI sequence is a 10-base sequence NNNNNNNNNN, wherein each N represents any one of the four bases A, G, C and T.
In a preferred embodiment, the lentiviral sequence is a sequence of an LTR region of a lentivirus.
In a preferred embodiment, the binding sites of the forward primer of the first round of PCR amplification and the forward primer of the second round of PCR amplification are both located within the LTR region of the lentivirus, close to its junction with the genome.
In a preferred embodiment, the starting amount of the genomic DNA is 50 ng or more, preferably 50 ng to 1 μg, more preferably 100 ng to 500 ng.
In a preferred embodiment, the reverse primer for the second round of PCR amplification further comprises: a barcode sequence for distinguishing a sample origin, the barcode sequence being located upstream of a sequence complementary to an amplification primer binding sequence of the long chain sequence.
According to a second aspect of the present invention, there is provided a method of detecting a lentiviral insertion site on genomic DNA, comprising:
providing a sequencing library obtained by the method of the first aspect;
performing on-machine sequencing on the sequencing library to obtain sequencing data;
and analyzing the sequencing data to obtain the lentivirus insertion site profile on the genomic DNA.
In a preferred embodiment, the lentivirus insertion site profile comprises one or more of the distribution of insertion sites across chromosomes, the distribution of insertion sites in different regions of the genome, and the distribution of insertion sites in tumor-related genes.
In a preferred embodiment, the on-machine sequencing is performed on a BGI-Seq sequencing platform.
With the sequencing library construction method of the invention, only 50 ng of starting template is needed. No biotin labeling is required, which reduces synthesis and purification costs and saves the time otherwise spent on biotin hybridization capture. No restriction endonuclease digestion is used, which avoids the bias caused by the frequency of restriction sites. Introducing the random sequence allows sequencing data to be balanced and duplicate sequences to be removed and merged, so the method can be used to track sites in complex cell populations after in vivo differentiation, and the data are more reliable and accurate. The invention is simple to operate, has a short experimental time, low cost, a small starting input and high throughput, supports multiple kinds of analysis, and enables accurate analysis and evaluation of viral insertion sites in gene therapy.
Drawings
FIG. 1 is a technical flowchart of a sequencing library construction method for detecting lentiviral insertion sites in the embodiment of the invention;
FIG. 2 shows electrophoretograms of genomic fragments of the 293T and 293L (lentivirus-treated) samples after ultrasonic fragmentation (A) and after fragment selection (B) in an embodiment of the present invention, in which M denotes the 50 bp DNA ladder marker;
FIG. 3 is an Agilent 2100 quality control plot of the product after the second round of PCR amplification in the example of the present invention;
FIG. 4 is a graph showing the statistics of sequencing reads after removal of genomic data in the example of the present invention;
FIG. 5 is a graph showing the results of IGV analysis of the distribution in the 293L and 293T genomes in an embodiment of the present invention;
FIG. 6 is a graph showing the electrophoretic verification of insertion sites in 293L, wherein M denotes Marker 2000 and S1-S6 denote insertion site 1 to insertion site 6;
FIG. 7 shows the sequence of insertion site 1 (Site1) after Sanger sequencing verification of the breakpoint in the example of the present invention, wherein the boxed sequence is the P2 binding site and the underlined sequence is the breakpoint and the predicted insertion sequence following it; the predicted sequence is consistent with the verified sequence;
FIG. 8 shows a proportion analysis, using UMIs, of the top ten (TOP10) genes containing insertion sites in the present example.
Detailed Description
The present invention will be described in further detail with reference to the following detailed description and accompanying drawings. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, those skilled in the art will readily recognize that some of the features may be omitted in different instances or may be replaced by other materials, methods.
Furthermore, the features, operations, or characteristics described in the specification may be combined in any suitable manner to form various embodiments. Also, the steps or actions in the method descriptions may be reordered or rearranged, as will be apparent to one of ordinary skill in the art. Thus, the various orders in the specification and drawings are for the purpose of describing certain embodiments only and do not imply a required order unless it is otherwise stated that a certain order must be followed.
In view of the problems and current state of lentivirus insertion site analysis methods, the invention provides a method that combines restriction-enzyme-free LAM-PCR with anchored multiplex PCR, and establishes a viral insertion site detection method based on a sequencing platform (such as the BGI (Huada) BGI-Seq sequencing platform). The method requires a low starting input and is accurate, efficient, fast and low in cost.
The invention establishes a targeted nested PCR sequencing method and primer sequences for insertion site identification. The method uses an LTR (long terminal repeat) specific sequence and the fixed sequence of the adaptor as amplification primers to perform semi-nested PCR amplification of the insertion site. This avoids the low binding efficiency, high cost and high template requirement associated with biotin capture on magnetic beads, allows libraries to be constructed from low-input DNA, and makes it possible to analyze rare samples.
The invention does not depend on restriction enzymes and therefore has no restriction enzyme bias. Because it uses LAM-PCR without restriction digestion, it avoids errors caused by the recognition of restriction sites and has higher sensitivity.
The invention features a barcode (tag) sequence and random bases and has high throughput. It can accurately detect viral insertion sites in complex templates and low-input templates. Combined with bioinformatic analysis, it can detect random insertion sites after lentiviral infection quickly, efficiently, stably and specifically, at low cost and high throughput, and can be applied to the safety evaluation of clinical and preclinical samples. In particular, because a 10-base random sequence is introduced, errors caused by preferential amplification can be eliminated during bioinformatic analysis and the clonal origin of cells in vivo can be analyzed, which widens the range of application.
As shown in FIG. 1, the invention provides a sequencing library construction method for detecting lentivirus insertion sites, which comprises the following steps:
S101: providing genomic DNA extracted from a lentivirus-infected cell, the genomic DNA including at least one lentivirus insertion site to be detected.
FIG. 1 shows a lentiviral LTR sequence inserted into the genomic DNA (gDNA) of a host cell. The lentivirus-infected cells (i.e., host cells) can be of various types, and the invention is not limited in this respect. In some embodiments of the invention, the lentivirus-infected cells are the 293T and K562 cell lines.
S102: the genomic DNA is fragmented and the fragmented genomic DNA is processed into a form suitable for adaptor ligation.
In the present invention, genomic DNA can be fragmented by many methods, for example physical fragmentation or enzymatic fragmentation; physical fragmentation is preferred. The fragmented genomic DNA is processed into a form suitable for adaptor ligation, such as blunt ends or sticky ends. In one embodiment of the present invention, the fragmented genomic DNA is subjected to end repair and A-tailing to obtain a structure with an A base overhang at each end, which can be ligated to an adaptor having a T base overhang at its end.
S103: ligating asymmetric double-stranded adaptors to both ends of the fragmented genomic DNA suitable for adaptor ligation to obtain an adaptor ligation product, wherein each asymmetric double-stranded adaptor comprises a long-strand sequence and a short-strand sequence, the long-strand sequence comprises, in order from the 5' end to the 3' end, a fixed sequence, a random UMI sequence and an amplification primer binding sequence, and the short-strand sequence comprises a sequence complementary to the fixed sequence.
As shown in FIG. 1, the asymmetric double-stranded adaptor is labeled VIS-Adapter in the figure. It contains a 10-base sequence NNNNNNNNNN (each N represents any one of the four bases A, G, C and T) as the random UMI sequence. The UMI is preceded by a fixed sequence, which is the part ligated to the fragmented genomic DNA, and followed by an amplification primer binding sequence, i.e., the sequence to which the primers of the subsequent PCR amplifications (e.g., AMPR1 and BCR2 in the figure) bind. In other embodiments, the length and base composition of the random UMI sequence can be chosen according to specific needs; for example, the random UMI sequence can be 6-14 bases long.
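To give a sense of scale for the UMI lengths mentioned above, the following minimal sketch computes how many distinct random sequences are possible for a given length, assuming each position is drawn independently from the four bases. The function name `umi_diversity` is illustrative only and does not come from the patent.

```python
def umi_diversity(length: int) -> int:
    """Number of distinct UMIs of the given length over the alphabet A, C, G, T."""
    if length < 0:
        raise ValueError("length must be non-negative")
    return 4 ** length


# Lengths spanning the 6-14 base range given in the text, plus the 10-base default.
for n in (6, 10, 14):
    print(f"{n}-base UMI: {umi_diversity(n):,} possible sequences")
```

Under this assumption a 10-base UMI offers 4^10 = 1,048,576 distinct tags, so two independently tagged molecules share a tag by chance with probability of about one in a million.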
In one embodiment of the invention, two asymmetric adaptor sequences, VIS-AdaptorF and VIS-AdaptorR, constitute the asymmetric double-stranded adaptor. The two sequences are as follows:
VIS-AdaptorF: GATCGGAAGAGCNNNNNNNNNN[boxed reverse primer binding region; shown only as an image in the original] (SEQ ID NO: 1);
VIS-AdaptorR: GCTCTTCCGATCT (SEQ ID NO: 2).
The boxed portion is the reverse primer binding region.
The two synthesized asymmetric adaptor sequences are diluted in a buffer (e.g., TE buffer), mixed, denatured and annealed to give the asymmetric double-stranded adaptor structure. For example, in one embodiment of the invention, the long and short adaptor strands are diluted to 20 μM with TE buffer and mixed in equal volumes, denatured at 95 °C for 5 min, cooled to 16 °C at 1 °C/min, and stored at -20 °C.
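The pairing between the two adaptor strands can be checked directly from the sequences given above. The sketch below is an illustration rather than part of the patent: it takes the 12-base fixed sequence at the 5' end of VIS-AdaptorF, computes its reverse complement, and compares it with VIS-AdaptorR. It also works out the duration of the 95 °C to 16 °C cooling ramp at 1 °C/min. The helper `reverse_complement` is a generic utility written for this example.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T/N only)."""
    return seq.translate(COMPLEMENT)[::-1]


FIXED = "GATCGGAAGAGC"   # 5' fixed sequence of VIS-AdaptorF (SEQ ID NO: 1)
SHORT = "GCTCTTCCGATCT"  # VIS-AdaptorR (SEQ ID NO: 2)

duplex = reverse_complement(FIXED)   # strand that pairs with the fixed sequence
overhang = SHORT[len(duplex):]       # bases of the short strand left unpaired

print("pairs with fixed sequence:", SHORT.startswith(duplex))
print("3' overhang of short strand:", overhang)

# Cooling ramp from the denaturation temperature down to 16 °C at 1 °C per minute.
ramp_minutes = (95 - 16) / 1
print("ramp time (min):", ramp_minutes)
```

The short strand is the exact reverse complement of the fixed sequence plus one extra 3' T, which is consistent with ligation to A-tailed fragments as described for step S102. The ramp alone takes 79 min.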
S104: performing a first round of PCR amplification on the adaptor ligation product, wherein the primers of the first round of PCR amplification comprise a forward primer comprising a segment of the lentiviral sequence inserted into the genome of the cell or a sequence complementary thereto, and a reverse primer comprising a sequence complementary to the amplification primer binding sequence of the long-strand sequence.
As shown in FIG. 1, the first round of PCR amplification uses the P1 primer and the AMPR1 primer as forward and reverse primers, respectively. In one example, as shown, the forward primer P1 binds to a lentiviral sequence (shown as an LTR sequence) that has been inserted into the genomic DNA of the cell. The reverse primer AMPR1 is partially complementary to the amplification primer binding sequence of the long-strand sequence of the asymmetric double-stranded adaptor.
In one embodiment of the present invention, the forward primer and the reverse primer of the first round of PCR amplification are respectively as follows:
LTRP1: TGTGACTCTGGTAACTAGAGATCCCTC (SEQ ID NO: 3);
AMPR1: [sequence shown only as an image in the original] (SEQ ID NO: 4).
The boxed portion is the region complementary to VIS-Adaptor.
S105: performing a second round of PCR amplification on the product of the first round of PCR amplification, wherein the primers of the second round of PCR amplification comprise a forward primer and a reverse primer, the forward primer comprises, in order from the 5' end to the 3' end, a 5' phosphate group, a sequencing platform forward fixed sequence, and a lentiviral sequence or a sequence complementary to the lentiviral sequence, the binding site of the forward primer is located downstream of the binding site of the forward primer of the first round of PCR amplification, and the reverse primer comprises, in order from the 5' end to the 3' end, a sequencing platform reverse fixed sequence and a sequence complementary to the amplification primer binding sequence of the long-strand sequence.
As shown in FIG. 1, the second round of PCR amplification uses the P2 primer and the BCR2 primer as forward and reverse primers, respectively. The 5' phosphate group of the forward primer P2 is used for circularization ligation to form a circular molecule. The sequencing platform forward fixed sequence in P2 depends on the sequencing platform, and different platforms use different sequences. The lentiviral sequence (or its complement) in P2 also binds within the LTR region of the lentivirus, downstream of the binding site of the first-round forward primer, so that the two rounds of PCR form a nested PCR amplification that enriches the region of interest and improves amplification specificity. The sequencing platform reverse fixed sequence in the reverse primer BCR2 likewise depends on the sequencing platform, and different platforms use different sequences.
In one embodiment of the present invention, the forward primer and the reverse primer for the second round of PCR amplification are as follows:
LTRP2: /5Phos/GAACGACATGGCTACGATCCGACTTGATCCCTCAGACCCTTTTAGTCA (SEQ ID NO: 5);
BC-R2: TGTGAGCCAAGGAGTTG-Barcode-TTGTCTTCCTAAG[boxed region complementary to the adaptor sequence; shown only as images in the original] (SEQ ID NO: 6).
The underlined part of the LTRP2 sequence is the fixed sequence of the BGI (Huada) sequencing platform, and the primer is 5' phosphorylated. The boxed part of the BC-R2 sequence is the region complementary to the adaptor sequence, and the rest is the fixed sequence of the BGI sequencing platform. The Barcode portion of the BC-R2 sequence represents a barcode sequence used to distinguish sample sources; the barcode sequence depends on the sequencing platform, and different platforms use different barcode sequences. In the present example, the barcode sequence of the BGI sequencing platform is a random sequence of 10 bases (which can be written as NNNNNNNNNN). In other embodiments, the barcode sequence is determined according to the sequencing platform.
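The nested arrangement of the two forward primers can be illustrated with the LTRP1 and LTRP2 sequences listed above. The sketch below finds the longest 3' suffix of LTRP1 that also occurs in LTRP2, and treats everything in LTRP2 from that match onward as its LTR-specific part. This boundary is an inference from the sequence overlap made for this example; the original marks the platform portion by underlining, which is not preserved in the text.

```python
LTRP1 = "TGTGACTCTGGTAACTAGAGATCCCTC"                            # SEQ ID NO: 3
LTRP2 = "GAACGACATGGCTACGATCCGACTTGATCCCTCAGACCCTTTTAGTCA"       # SEQ ID NO: 5


def longest_suffix_hit(first: str, second: str) -> tuple[int, int]:
    """Return (length, index) of the longest suffix of `first` found in `second`.

    Returns (0, -1) when the two sequences share no such suffix.
    """
    for k in range(len(first), 0, -1):
        idx = second.find(first[-k:])
        if idx != -1:
            return k, idx
    return 0, -1


overlap_len, start = longest_suffix_hit(LTRP1, LTRP2)
ltr_part = LTRP2[start:]            # inferred LTR-specific portion of LTRP2
extension = ltr_part[overlap_len:]  # bases reaching beyond the 3' end of LTRP1

print("overlap:", LTRP1[-overlap_len:], "length", overlap_len)
print("inferred LTR-specific part of LTRP2:", ltr_part)
print("bases 3' of LTRP1:", extension)
```

On these sequences the last 8 bases of LTRP1 (GATCCCTC) match LTRP2 starting at position 25 (0-based), and the LTRP2 match then continues for a further 15 bases. That is consistent with the second-round primer binding downstream of the first-round primer, shifted toward the LTR-genome junction.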
S106: circularizing the product of the second round of PCR amplification to obtain a circularized sequencing library suitable for on-machine sequencing.
The sequencing library is subjected to quality control, and libraries that pass are sequenced. After the sequencing data are obtained, the insertion sites are identified by bioinformatic analysis and their accuracy is verified.
The bioinformatic analysis first performs quality control on the sequencing data and uses the LTR and fixed sequences to determine the genomic breakpoint. In one embodiment of the invention, the sequences at both ends of each sequencing read are trimmed, sequences longer than 10 bp are retained, and the retained sequences are aligned by BLAST against the genome and the lentiviral vector to ensure the reliability of the data. Insertion sites are determined from the BLAST alignment to the genome; once the sites are determined, their distribution relative to transcription start sites (TSS) and whether they lie inside or outside genes are analyzed. In addition, the 10N random sequence can be used for in vivo analysis of monoclonal proliferation and for correcting abnormal amplification introduced during PCR.
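The use of the random sequence to correct for PCR amplification can be sketched as follows. This is a simplified illustration, not the patent's analysis pipeline: it assumes each aligned read has already been reduced to a hypothetical tuple of chromosome, breakpoint position, strand and UMI, and it counts the distinct UMIs per insertion site so that PCR duplicates (same site, same UMI) are collapsed. Sequencing errors within UMIs are ignored here.

```python
from collections import defaultdict


def collapse_by_umi(reads):
    """Count distinct UMIs per insertion site.

    `reads` is an iterable of (chrom, breakpoint, strand, umi) tuples.
    Returns a dict mapping (chrom, breakpoint, strand) to the number of unique UMIs.
    """
    umis_per_site = defaultdict(set)
    for chrom, breakpoint, strand, umi in reads:
        umis_per_site[(chrom, breakpoint, strand)].add(umi)
    return {site: len(umis) for site, umis in umis_per_site.items()}


# Made-up example reads, for illustration only.
example_reads = [
    ("chr1", 1000, "+", "ACGTACGTAC"),
    ("chr1", 1000, "+", "ACGTACGTAC"),  # same site and UMI: counted once
    ("chr1", 1000, "+", "TTGCAAGGCC"),  # same site, different UMI
    ("chr7", 5500, "-", "GGATCCAATT"),
]

counts = collapse_by_umi(example_reads)
for site, n in sorted(counts.items()):
    print(site, n)
```

Counting unique UMIs rather than raw reads gives an estimate of how many original DNA fragments supported each site, which is the quantity relevant to comparing clone abundance.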
The working principle of the method is as follows. The first round of PCR amplification uses a known sequence on the lentiviral vector together with the added asymmetric adaptor sequence, which enriches the viral region. The method does not use a biotin-labeled known sequence for linear amplification, because biotin labeling was found to have low capture efficiency and to yield far less data than the present method. Because the genomic sequence flanking the insertion is unknown, an asymmetric adaptor carrying a 10 bp random sequence and a fixed sequence is designed so that amplification can be performed by nested PCR and unnecessary amplification is avoided; during bioinformatic analysis, duplicate sites can be removed using the 10 bp random sequence and the fixed sequence, giving more reliable data. The second round of PCR amplifies the target region and introduces the sequences required by the BGI-Seq sequencing platform.
Most existing viral insertion site identification methods are high-throughput sequencing methods based on LAM-PCR. They require biotin labeling and capture, which is costly to synthesize and inefficient, and because of the low capture efficiency the starting amount of template must be at least 100 ng. With the nested-PCR-based high-throughput sequencing method of the invention, only 50 ng of starting template is needed. No biotin labeling is required, which reduces synthesis and purification costs and saves the time otherwise spent on biotin hybridization capture. No restriction endonuclease digestion is used, which avoids the bias caused by the frequency of restriction sites. Introducing the random sequence allows sequencing data to be balanced and duplicate sequences to be removed and merged, so the method can be used to track sites in complex cell populations after in vivo differentiation, and the data are more reliable and accurate. Because the primers carry their own barcode sequence, throughput is up to 128 samples.
The invention has the characteristics of simple operation, short experimental time, low cost, small initial amount, high flux, more analyzable aspects and the like, and can better perform accurate analysis and evaluation of the virus insertion site in gene therapy.
The technical solutions and effects of the present invention are described in detail below by examples, and it should be understood that the examples are only illustrative and should not be construed as limiting the present invention.
Example 1
Analysis of the insertion site of a lentiviral vector in the genome of a 293T cell line, comprising the following steps:
1. extraction of genomic DNA
1.1 Collect the lentivirus-infected cells, wash twice with PBS, digest with 0.05% trypsin for 3 min, collect the cells, centrifuge, and discard the supernatant.
1.2 Extract the genome using the Tiangen blood/cell/tissue genomic DNA extraction kit.
1.3 Determine the DNA concentration using Qubit.
2. Random fragmentation of DNA
2.1 Take 100 ng to 1 μg of DNA for fragmentation. If RNA contamination is present, add 1-2 μl of RNase for digestion before fragmentation, and bring the volume to 80-100 μl for later use. Sonicator: Covaris LE210; duty factor (%) 20, cycles per burst 200, water level 6, intensity 5, 105 s.
2.2 Purify the fragmented DNA with magnetic beads: mix the beads with the fragmented product at ratios of 0.6×/0.2×, leave at room temperature for 5 min, place the sample on a magnetic stand for 2 min, remove the supernatant, wash twice with 70% ethanol, and redissolve the sample in 1× TE solution. The electrophoretogram of the ultrasonically fragmented genomic DNA is shown in FIG. 2-A. The electrophoretogram of the size-selected genomic fragments is shown in FIG. 2-B; a DNA sample with a fragment size of about 400 bp is obtained.
2.3 Check the fragment size on an Agilent 2100 analyzer using the High Sensitivity DNA assay.
2.4 Determine the concentration using Qubit.
3. End repair and A-tailing
For each sample, 50 ng of the size-selected genomic DNA was subjected to end repair and dA addition using the BGI end repair kit and Klenow fragment.
The reaction was set up according to the system shown in Table 1 below:
TABLE 1
[Table 1 is shown only as an image in the original.]
The reaction was run on a PCR instrument: 37 °C for 30 min; 65 °C for 15 min; hold at 4 °C.
4. Adaptor ligation
4.1 Adaptor annealing to form the asymmetric double-stranded adaptor
Two asymmetric adaptor sequences, VIS-AdaptorF and VIS-AdaptorR, were synthesized:
VIS-AdaptorF: GATCGGAAGAGCNNNNNNNNNN[boxed reverse primer binding region; shown only as an image in the original] (SEQ ID NO: 1);
VIS-AdaptorR: GCTCTTCCGATCT (SEQ ID NO: 2).
The long and short adaptor strands were diluted to 20 μM with TE buffer and mixed in equal volumes, then held at 90 °C for 5 min, cooled at 1 °C/min, and held at 16 °C.
4.2 Add 5 μl of the asymmetric adaptor, 23.4 μl of ligation buffer and 1.6 μl of DNA Rapid Ligase to the end-repaired, A-tailed product, mix well, incubate at 23 °C for 30 min, and hold at 4 °C.
4.3 Purification of the adaptor ligation product
Take out the DNA Clean Beads 30 min in advance and bring them to room temperature, and shake thoroughly to mix before use. Mix the adaptor ligation product and the magnetic beads at a ratio of 1:0.5 (0.5×), leave at room temperature for 5 min, place the sample on a magnetic stand for 2 min, remove the supernatant, wash twice with 70% ethanol, and dissolve the sample in 1× TE solution.
5. Capture of viral sequences and their flanking sequences
5.1 first round PCR
The forward and reverse primers for the first round of PCR amplification were as follows:
LTRP1:TGTGACTCTGGTAACTAGAGATCCCTC(SEQ ID NO:3);
AMPR1: ACCGCTTGGCCTCCGACTT (SEQ ID NO:4).
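The relationships between the linker strands and the reverse primer can be checked directly from the sequences in the sequence listing (SEQ ID NO:1, 2 and 4). The following is a minimal sketch; the variable names are illustrative, and reading the extra 3' T on the short strand as the overhang that pairs with the added dA is an interpretation rather than something the text states.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (N stays N)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Sequences taken from the sequence listing
LONG_STRAND = "GATCGGAAGAGCNNNNNNNNNNAAGTCGGAGGCCAAGCGGT"  # SEQ ID NO:1
SHORT_STRAND = "GCTCTTCCGATCT"                             # SEQ ID NO:2
AMPR1 = "ACCGCTTGGCCTCCGACTT"                              # SEQ ID NO:4

# Long strand layout, 5' to 3': fixed sequence, UMI, primer binding site
FIXED = LONG_STRAND[:12]
UMI = LONG_STRAND[12:22]
PRIMER_SITE = LONG_STRAND[22:]

# Short strand = complement of the fixed sequence plus one extra 3' T
short_pairs_with_fixed = SHORT_STRAND[:-1] == reverse_complement(FIXED)
short_ends_with_t = SHORT_STRAND.endswith("T")

# AMPR1 anneals to the primer binding site of the long strand
ampr1_matches_site = AMPR1 == reverse_complement(PRIMER_SITE)

if __name__ == "__main__":
    print(short_pairs_with_fixed, short_ends_with_t, ampr1_matches_site)
```

All three checks hold for the listed sequences: the short strand pairs with the 12-base fixed sequence, and AMPR1 is the reverse complement of the last 19 bases of the long strand.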
5.1.1 Take out the required reagents and prepare the reaction mixture shown in Table 2 below on ice:
TABLE 2
Figure BDA0002341167970000102
5.1.2 PCR program: 98 ℃ for 2 min, 1 cycle; 98 ℃ for 10 s, 65 ℃ for 2 min, 72 ℃ for 10 s, 15 cycles; 72 ℃ for 5 min; hold at 4 ℃.
5.1.3 PCR product purification
Take the DNA Clean Beads out 30 min in advance and leave them at room temperature; shake thoroughly to mix before use. Mix the PCR product with the magnetic beads at a ratio of 1:0.5 (0.5×), let stand at room temperature for 5 min, place the sample on a magnetic stand for 2 min, remove the supernatant, wash twice with 70% ethanol, and dissolve the sample in 1% TE solution.
5.2 second round PCR
The forward and reverse primers for the second round of PCR amplification were as follows:
LTRP2:/5Phos/GAACGACATGGCTACGATCCGACTTGATCCCTCAGACCCTTTTAGTCA(SEQ ID NO:5);
BC-R2: TGTGAGCCAAGGAGTTG-Barcode-TTGTCTTCCTAAGACCGCTTGGCCTCCGACTT (SEQ ID NO:6).
5.2.1 Take out the required reagents and prepare the reaction mixture shown in Table 3 below on ice:
TABLE 3
Figure BDA0002341167970000111
5.2.2 PCR program: 98 ℃ for 2 min, 1 cycle; 98 ℃ for 10 s, 62 ℃ for 2 min, 72 ℃ for 30 s, 15 cycles; 72 ℃ for 5 min; hold at 4 ℃.
5.2.3 PCR product purification: purify with 1.0× magnetic beads and redissolve in 15 μl TE, following the procedure of step 4.3.
The electrophoresis quality control graph of the product after PCR amplification is shown in FIG. 3.
5.2.4 Determine the concentration with the Qubit HS assay kit.
6. Circularization and pooling
6.1 Circularization: pool the samples to a total of 330 ng in 48 μl, and heat-denature at 95 ℃ for 5 min. Prepare the reaction with Split oligo and ligase according to Table 4 below, and incubate at 37 ℃ for 30 min.
TABLE 4
Figure BDA0002341167970000112
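The pooling arithmetic of step 6.1 can be sketched as follows. The text only fixes the totals (330 ng in 48 μl); equal mass per sample, the example concentrations, and the function name are assumptions made for illustration.

```python
def pooling_volumes(conc_ng_per_ul, total_ng=330.0, max_volume_ul=48.0):
    """Volume of each library to pool so that the pool holds total_ng.

    conc_ng_per_ul maps sample name to measured concentration (ng/ul).
    Assumes every sample contributes the same mass (not stated in the text).
    Returns (per-sample volumes in ul, volume left to reach max_volume_ul).
    """
    if not conc_ng_per_ul:
        raise ValueError("no samples given")
    per_sample_ng = total_ng / len(conc_ng_per_ul)
    volumes = {name: per_sample_ng / conc
               for name, conc in conc_ng_per_ul.items()}
    used = sum(volumes.values())
    if used > max_volume_ul:
        raise ValueError("libraries too dilute to fit the pool volume")
    return volumes, max_volume_ul - used


if __name__ == "__main__":
    # Hypothetical Qubit readings for the two samples of this example
    vols, remainder = pooling_volumes({"293T": 20.0, "293L": 10.0})
    print(vols, remainder)
```

If the summed volume exceeds 48 μl, the function raises an error instead of silently scaling, since the text does not say how such a case is handled.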
6.2 Linear digestion
The reaction system is shown in table 5 below:
TABLE 5
Figure BDA0002341167970000121
6.3 Quantify the library with the Qubit ssDNA assay kit, and check the fragment size on an Agilent 2100 analyzer.
7. Library quality control and on-machine sequencing
The library was sequenced on the BGI-SEQ sequencing platform, and the resulting data were split by barcode sequence.
8. Bioinformatic analysis of viral insertion sites
The bioinformatic analysis mainly comprises the following steps: 1) filtering the data using the LTR sequence and the UMI sequence; 2) trimming off the LTR and UMI sequences and discarding remaining sequences shorter than 10 bp; 3) aligning the remaining sequences to the genomic DNA with BLAST and analyzing the insertion sites, including the distribution of insertion sites across chromosomes, the distribution of insertion sites in different regions of the genome, the distribution of insertion sites in tumor genes, and so on. Statistics of the sequencing reads remaining after filtering of the genomic data are shown in FIG. 4.
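Steps 1) and 2) can be sketched as a per-read function. This is only an illustration under stated assumptions: a single read is assumed to run from the LTR end through the genomic flank into the linker, the LTR anchor is a caller-supplied parameter (the text does not give the exact LTR end sequence), and the function name is hypothetical. The fixed linker sequence and the 10-base UMI length are taken from SEQ ID NO:1.

```python
from typing import Optional, Tuple

LINKER_FIXED = "GATCGGAAGAGC"  # fixed sequence of VIS-AdaptorF (SEQ ID NO:1)
UMI_LEN = 10                   # random UMI length from SEQ ID NO:1
MIN_FLANK = 10                 # flanks shorter than 10 bp are discarded


def extract_flank(read: str, ltr_anchor: str) -> Optional[Tuple[str, str]]:
    """Return (umi, genomic_flank) for a read, or None if it is filtered out.

    Assumed read layout: ...LTR anchor | genomic flank | fixed linker | UMI...
    """
    read = read.upper()
    ltr_pos = read.find(ltr_anchor.upper())
    if ltr_pos == -1:
        return None                      # no LTR sequence: filter
    flank_start = ltr_pos + len(ltr_anchor)
    linker_pos = read.find(LINKER_FIXED, flank_start)
    if linker_pos == -1:
        return None                      # no linker, so no UMI: filter
    umi_start = linker_pos + len(LINKER_FIXED)
    umi = read[umi_start:umi_start + UMI_LEN]
    flank = read[flank_start:linker_pos]
    if len(umi) < UMI_LEN or len(flank) < MIN_FLANK:
        return None                      # truncated UMI or flank too short
    return umi, flank


if __name__ == "__main__":
    anchor = "GATCCCTCAGACCCTTTTAGTCA"   # hypothetical anchor for the demo
    demo = anchor + "TTAGGCATCGATTACGGA" + LINKER_FIXED + "ACGTACGTAC" + "AAGTCGG"
    print(extract_flank(demo, anchor))
```

The flanks returned by such a function would then be passed to the BLAST alignment in step 3); exact string matching is used here for brevity, whereas a real pipeline would likely tolerate sequencing errors.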
The sample information is given in Table 6 below:
TABLE 6
Sample name | Number of cells collected
293T | 1×10^6
Lentivirus-infected 293T (293L) | 1×10^6
The distribution of viral insertion sites (VIS) obtained in this example is shown in FIG. 4 and FIG. 5. The 293L sites predicted by the method of the present invention were verified by PCR identification and Sanger analysis, and the results were consistent with the predictions (FIG. 6 and FIG. 7). UMI-based ratio analysis of the top ten (TOP10) gene insertion sites (FIG. 8) clearly displays the gene distribution and the insertion sites across samples, which reflects the value of the UMI design and the successful application of the present invention to VIS detection.
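One way such a UMI-based ratio could be computed is to count distinct UMIs per insertion site, so that PCR duplicates of the same original fragment are counted once. The text does not describe the exact calculation; the sketch below, including the function name and the site labels, is an assumption for illustration.

```python
from collections import defaultdict


def top_sites_by_umi(records, top_n=10):
    """Rank insertion sites by their number of distinct UMIs.

    records: iterable of (site, umi) pairs, one per aligned read.
    Returns a list of (site, distinct_umi_count, fraction_of_all_umis).
    """
    umis_per_site = defaultdict(set)
    for site, umi in records:
        umis_per_site[site].add(umi)   # duplicate reads collapse here
    total = sum(len(u) for u in umis_per_site.values())
    # Sort by descending UMI count, then by site name for stable output
    ranked = sorted(umis_per_site.items(),
                    key=lambda item: (-len(item[1]), item[0]))
    return [(site, len(u), len(u) / total) for site, u in ranked[:top_n]]


if __name__ == "__main__":
    # Hypothetical reads: two PCR duplicates at the first site
    reads = [("chr1:100", "AAA"), ("chr1:100", "AAA"),
             ("chr1:100", "CCC"), ("chr2:50", "GGG")]
    for row in top_sites_by_umi(reads):
        print(row)
```

Counting distinct UMIs instead of raw reads keeps amplification bias from the two PCR rounds out of the per-site proportions.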
The present invention has been described in terms of specific examples, which are provided to aid understanding of the invention and are not intended to be limiting. For a person skilled in the art to which the invention pertains, several simple deductions, modifications or substitutions may be made according to the idea of the invention.
SEQUENCE LISTING
<110> Shenzhen Huashengshengsciences institute
<120> method for constructing sequencing library for detecting lentivirus insertion site and method for detecting lentivirus insertion site
<130> 19I29016
<160> 6
<170> PatentIn version 3.3
<210> 1
<211> 41
<212> DNA
<213> Artificial sequence
<220>
<221> misc_feature
<222> (13)..(22)
<223> n is a, c, g, or t
<400> 1
gatcggaaga gcnnnnnnnn nnaagtcgga ggccaagcgg t 41
<210> 2
<211> 13
<212> DNA
<213> Artificial sequence
<400> 2
gctcttccga tct 13
<210> 3
<211> 27
<212> DNA
<213> Artificial sequence
<400> 3
tgtgactctg gtaactagag atccctc 27
<210> 4
<211> 19
<212> DNA
<213> Artificial sequence
<400> 4
accgcttggc ctccgactt 19
<210> 5
<211> 48
<212> DNA
<213> Artificial sequence
<400> 5
gaacgacatg gctacgatcc gacttgatcc ctcagaccct tttagtca 48
<210> 6
<211> 49
<212> DNA
<213> Artificial sequence
<400> 6
tgtgagccaa ggagttgttg tcttcctaag accgcttggc ctccgactt 49

Claims (10)

1. A method of constructing a sequencing library for detecting lentiviral insertion sites, the method comprising:
providing genomic DNA extracted from a lentivirus-infected cell, the genomic DNA including at least one lentivirus insertion site to be detected;
fragmenting the genomic DNA and processing the fragmented genomic DNA into a form suitable for adaptor ligation;
ligating asymmetric double-stranded linkers to both ends of the fragmented genomic DNA suitable for linker ligation to obtain a linker ligation product, wherein the asymmetric double-stranded linker comprises a long-chain sequence and a short-chain sequence, the long-chain sequence sequentially comprises, from the 5' end to the 3' end, a fixed sequence, a random UMI sequence and an amplification primer binding sequence, and the short-chain sequence comprises a sequence complementary to the fixed sequence;
performing a first round of PCR amplification on the adaptor-ligated product, wherein the primers of the first round of PCR amplification comprise a forward primer comprising a stretch of lentiviral sequence inserted into the genome of the cell or a sequence complementary thereto, and a reverse primer comprising a sequence complementary to the amplification primer binding sequence of the long-chain sequence;
performing a second round of PCR amplification on the product of the first round of PCR amplification, wherein the primers of the second round of PCR amplification comprise a forward primer and a reverse primer, the forward primer sequentially comprises, from the 5' end to the 3' end, a 5'-end phosphate group, a sequencing platform forward fixed sequence, and a lentivirus sequence or a sequence complementary to the lentivirus sequence, the binding site of the forward primer is located downstream of the binding site of the forward primer of the first round of PCR amplification, and the reverse primer sequentially comprises, from the 5' end to the 3' end, a sequencing platform reverse fixed sequence and a sequence complementary to the amplification primer binding sequence of the long-chain sequence;
optionally, circularizing the products of the second round of PCR amplification to obtain a circularized sequencing library suitable for on-machine sequencing.
2. The sequencing library construction method of claim 1, wherein the processing of the fragmented genomic DNA into a form suitable for adaptor ligation comprises: performing end repair and adding A tail base to the fragmented genomic DNA.
3. The method for constructing a sequencing library of claim 1, wherein said random UMI sequence is a 10-base sequence NNNNNNNNNN, wherein each N represents any one of the four bases A, G, C and T.
4. The sequencing library construction method of claim 1, wherein the lentiviral sequence is a sequence of an LTR region of a lentivirus.
5. The sequencing library construction method of claim 1, wherein the binding sites of the forward primer of the first round of PCR amplification and the forward primer of the second round of PCR amplification are both located within the LTR region of the lentivirus and in close proximity to the genome.
6. The method of claim 1, wherein the initial amount of genomic DNA is 50 ng or more, preferably 50 ng to 1 μg, more preferably 100 ng to 500 ng.
7. The sequencing library construction method of claim 1, wherein the reverse primers for the second round of PCR amplification further comprise: a barcode sequence for distinguishing between sample sources, the barcode sequence being upstream of a sequence complementary to an amplification primer binding sequence of the long chain sequence.
8. A method for detecting a lentiviral insertion site on genomic DNA, the method comprising:
providing a sequencing library obtained by the method of any one of claims 1 to 7;
performing on-machine sequencing on the sequencing library to obtain sequencing data;
and analyzing the sequencing data to obtain the lentivirus insertion site profile on the genomic DNA.
9. The method of claim 8, wherein the lentivirus insertion site profile comprises one or more of a distribution of insertion sites in chromosomes, a distribution of insertion sites in different regions of the genome, and a distribution of insertion sites in oncogenes.
10. The method of claim 8, wherein the on-machine sequencing is performed on the BGI-SEQ sequencing platform.
CN201911376706.7A 2019-12-27 2019-12-27 Sequencing library construction method for detecting lentivirus insertion site and lentivirus insertion site detection method Pending CN113046835A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911376706.7A CN113046835A (en) 2019-12-27 2019-12-27 Sequencing library construction method for detecting lentivirus insertion site and lentivirus insertion site detection method

Publications (1)

Publication Number Publication Date
CN113046835A true CN113046835A (en) 2021-06-29

Family

ID=76506318

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911376706.7A Pending CN113046835A (en) 2019-12-27 2019-12-27 Sequencing library construction method for detecting lentivirus insertion site and lentivirus insertion site detection method

Country Status (1)

Country Link
CN (1) CN113046835A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106591293A (en) * 2016-12-28 2017-04-26 贵州省草业研究所 Method for separating known-sequence flanking sequences from unknown genomes based on enzyme cutting and connection
WO2023109887A1 (en) * 2021-12-15 2023-06-22 南京金斯瑞生物科技有限公司 Method for detecting integration site
WO2023179766A1 (en) * 2022-03-24 2023-09-28 南京传奇生物科技有限公司 Method for preparing dna library and detecting retroviral integration site

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103725773A (en) * 2012-10-10 2014-04-16 杭州普望生物技术有限公司 Technology for identifying HBV (hepatitis B virus) gene integration sites and recurrently targeted genes in host genome
WO2018112806A1 (en) * 2016-12-21 2018-06-28 深圳华大智造科技有限公司 Method for converting linear sequencing library to circular sequencing library
CN108517567A (en) * 2018-04-20 2018-09-11 江苏康为世纪生物科技有限公司 Connector, primer sets, kit and the banking process in library are built for cfDNA
CN109554447A (en) * 2018-12-19 2019-04-02 武汉波睿达生物科技有限公司 Integration site analysis method and primer of the slow virus carrier in CAR-T cell
WO2019084055A1 (en) * 2017-10-23 2019-05-02 Massachusetts Institute Of Technology Calling genetic variation from single-cell transcriptomes

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20220224

Address after: 518083 Room 606, 6 / F, building 11, Beishan Industrial Zone, 146 Beishan Road, Yangang community, Yantian street, Yantian District, Shenzhen, Guangdong

Applicant after: Shenzhen Hemu gene Biotechnology Co.,Ltd.

Address before: 518083 complex building, Beishan Industrial Zone, Yantian District, Shenzhen City, Guangdong Province

Applicant before: BGI SHENZHEN