CN111363795A

CN111363795A - Single cell whole genome sequencing method

Info

Publication number: CN111363795A
Application number: CN201811598380.8A
Authority: CN
Inventors: 白净卫; 刘册
Original assignee: Tsinghua University
Current assignee: Tsinghua University
Priority date: 2018-12-26
Filing date: 2018-12-26
Publication date: 2020-07-03

Abstract

The invention provides a single cell whole genome sequencing method, and the application of an amplification method in constructing a gene library and in single molecule sequencing, wherein the amplification method comprises the step of contacting nucleic acid with a nucleic acid polymerase and random primers to carry out multiple strand displacement amplification of the nucleic acid.

Description

Single cell whole genome sequencing method

Technical Field

The invention relates to the field of molecular biology, in particular to a single cell whole genome sequencing method and application of an amplification method in single molecule sequencing.

Background

Single cell whole genome sequencing is a new technology for amplifying and sequencing the whole genome at the single cell level. The principle is that the trace amount of whole genome DNA isolated from a single cell is amplified to obtain a complete genome with high coverage, which is then subjected to exon capture and high-throughput sequencing to reveal differences within cell populations and the evolutionary relationships between cells.

Single cell whole genome amplification (WGA) refers to a technology for amplifying a whole genome at the single cell level. The principle is that the trace amount of whole genome DNA isolated from a single cell is amplified, and high-throughput sequencing is performed after a complete genome with high coverage is obtained, so as to reveal cell heterogeneity. Currently, WGA methods mainly include primer extension pre-amplification PCR (PEP-PCR), degenerate oligonucleotide primed PCR (DOP-PCR), multiple displacement amplification (MDA), multiple annealing and looping-based amplification cycles (MALBAC), and the like.

Patent CN101213311A discloses the use of rolling circle amplification to amplify and clone a single DNA molecule, the amplified DNA being sufficiently background-free to be used directly for sequencing without further purification.

The patent CN104630202A discloses an amplification method capable of reducing bias generated during whole amplification of trace nucleic acid substances, which comprises the step of randomly dispersing the nucleic acid substances into a plurality of independent reaction systems which are not communicated with each other, wherein the sequence of a primer used for amplification comprises 3-20 random bases, and the random bases are randomly selected from two, three or four of A, G, C, T. Compared with the amplification method of trace nucleic acid substances in a large system, the method has the advantages that the amplification bias is reduced by more than 1 order of magnitude, the amplification result can accurately reflect the quantitative information in the original nucleic acid substances, more amplification cycles can be allowed, and more products can be obtained.

Patent CN104560950A discloses a method of MDA-based whole genome amplification, and an amplification kit, and the obtained purified DNA is used for next generation sequencing.

At present, single molecule sequencing has not been applied to single cell whole genome sequencing.

Disclosure of Invention

The invention provides a single cell whole genome sequencing method, which comprises the following steps:

obtaining single cells, and extracting nucleic acid in the single cells;

amplifying the nucleic acid in the extracted single cell to obtain a long nucleic acid sequence of 1-100 kb;

performing single molecule sequencing on the long nucleic acid sequence.

Preferably, the single cell is obtained by a method selected from the group consisting of limiting dilution, micromanipulation, fluorescence flow sorting, microdroplet technology, laser microdissection technology, and direct capillary picking.

In one embodiment of the present invention, the method for obtaining single cells is a micro droplet technology.

In another embodiment of the present invention, the method for obtaining single cells is to directly use a capillary to pick up single cells.

Preferably, the amplification method comprises contacting the nucleic acid with a nucleic acid polymerase and random primers to perform multiple strand displacement amplification of the nucleic acid.

More preferably, the amplification method comprises contacting the nucleic acid with a nucleic acid polymerase and random primers under a constant temperature condition to perform multiple strand displacement amplification of the nucleic acid.

In one embodiment of the present invention, the amplification method includes encapsulating the genomic DNA of a single cell in a microdroplet, and then contacting the nucleic acid encapsulated in the microdroplet with a nucleic acid polymerase and a random primer under isothermal conditions to perform multiple strand displacement amplification of the nucleic acid.

In another embodiment of the invention, the amplification method comprises the steps of binding a hexamer random primer (N6) to the genomic DNA, binding DNA polymerase to the 3' end of the hexamer random primer to perform a strand displacement amplification reaction, and continuing to bind the displaced single strand to the primer to perform the next strand displacement amplification reaction; obtaining a long nucleic acid product with a multi-branched structure; digesting the multi-branched amplification product by using S1 nuclease to obtain a linear double-stranded blunt-ended nucleic acid product with the length of 1-100kb, wherein the linear double-stranded blunt-ended nucleic acid product is a long nucleic acid sequence.

The single cell of the present invention may be a prokaryotic cell or a eukaryotic cell. The prokaryotic cell can be a bacterium, actinomycete, cyanobacterium, mycoplasma or rickettsia, and the eukaryotic cell can be a plant cell, an animal cell or a microbial cell. The animal cell is specifically selected from any one of cells obtained by tissue digestion, cultured cells, cells in early embryonic development, cancer cells, microbial cells that have not undergone enrichment culture, cells obtained by flow sorting, cells obtained by limiting dilution, cells obtained by laser capture, and the like, wherein the cancer cell is a cell in an early stage of cancer.

The random primer is a hexamer random primer (N6) with the sequence 5'-NpNpNpsNpsN-3'.
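By way of illustration only, the size of a random hexamer primer pool can be sketched in Python as follows. The function name `all_hexamers` is hypothetical, and the sketch counts base combinations only; it ignores any backbone modifications indicated in the primer notation.

```python
from itertools import product

BASES = "ACGT"

def all_hexamers():
    """Return every possible 6-base sequence over A, C, G and T."""
    return ["".join(p) for p in product(BASES, repeat=6)]

hexamers = all_hexamers()
print(len(hexamers))  # 4**6 = 4096
```

A pool of this kind contains every six-base sequence, which is why random hexamers can anneal at many positions along a genome.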

Preferably, the nucleic acid polymerase is a DNA polymerase selected from the group consisting of Phi29 DNA polymerase, Tts DNA polymerase, M2 DNA polymerase, VENT DNA polymerase, T5 DNA polymerase, PRD1 DNA polymerase, Bst DNA polymerase, and REPLI-g sc DNA polymerase.

Preferably, the DNA polymerase is Phi29 DNA polymerase. The Phi29 DNA polymerase has 3'→5' exonuclease activity and an error rate of only 5×10⁻⁶, which ensures high-fidelity amplification.
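As a rough worked example, the stated error rate can be converted into an expected number of misincorporated bases per copied fragment. The sketch below assumes that errors are independent and that the rate applies to a single synthesis pass; the function name `expected_errors` is hypothetical.

```python
ERROR_RATE = 5e-6  # per-base error rate stated in the text for Phi29 DNA polymerase

def expected_errors(length_bp, error_rate=ERROR_RATE):
    """Expected number of misincorporated bases in one copy of length_bp bases."""
    return length_bp * error_rate

# Fragment lengths spanning the 1-100 kb range described in the text
for length_bp in (1_000, 10_000, 100_000):
    print(f"{length_bp} bp: {expected_errors(length_bp):.3f} expected errors per copy")
```

Under these assumptions even a 100 kb product carries, on average, less than one erroneous base per copy.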

Preferably, the long nucleic acid sequence is 1-100kb in length.

Preferably, the long nucleic acid sequence is selected from the group consisting of single-stranded DNA, double-stranded DNA, single-stranded RNA, double-stranded RNA/DNA hybrids, partially hybridized DNA and RNA, or enzymatically, chemically, biologically treated DNA or RNA.

The DNA or RNA treated by the enzymatic, chemical and biological methods of the invention is selected from DNA extracted after immunoprecipitation, DNA treated by sulfite, DNA treated by methyltransferase, cDNA after reverse transcription of RNA, and the like.

Preferably, the single molecule sequencing is selected from single molecule sequencing based on optical signals or single molecule sequencing based on electrical signals.

More preferably, the single molecule sequencing based on optical signals is single-molecule real-time (SMRT) sequencing. The principle of single molecule sequencing based on optical signals is that deoxynucleotides are fluorescently labeled and a microscope records the change in fluorescence intensity in real time. When a fluorescently labeled deoxynucleotide is incorporated into a DNA strand, its fluorescence is detected on the DNA strand at the same time. When it forms a chemical bond with the DNA strand, its fluorophore is cleaved off by the DNA polymerase and the fluorescence disappears. Such fluorescently labeled deoxynucleotides do not affect the activity of the DNA polymerase, and after the fluorophore is cleaved off the synthesized DNA strand is identical to a natural DNA strand; this is a sequencing-by-synthesis method.

More preferably, the single molecule sequencing based on electrical signals may be nanopore sequencing. The principle of single molecule sequencing based on electrical signals is that single molecules are driven through a nanopore one by one by electrophoresis. Because the diameter of the nanopore is very small, only a single nucleic acid polymer can pass through at a time; and because the single bases A, T, C and G differ in size, they produce different blockade currents, so the type of base passing through can be identified from the difference in the electrical signal, thereby realizing sequencing.
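The idea of identifying a base from its current level can be sketched as a nearest-level lookup. This is a deliberately simplified toy: the current values in `LEVELS_PA` are hypothetical, the function names are illustrative, and a real nanopore signal depends on several neighbouring bases in the pore at once rather than on one base per measurement.

```python
# Hypothetical mean current levels (pA) for each base; illustrative values only.
LEVELS_PA = {"A": 50.0, "C": 40.0, "G": 60.0, "T": 30.0}

def call_base(current_pa):
    """Return the base whose hypothetical level is closest to the measured current."""
    return min(LEVELS_PA, key=lambda base: abs(LEVELS_PA[base] - current_pa))

def call_sequence(currents_pa):
    """Call one base per current measurement."""
    return "".join(call_base(c) for c in currents_pa)

print(call_sequence([49.2, 31.0, 41.5, 58.8]))  # ATCG
```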

Preferably, the single molecule sequencing comprises constructing a long fragment sequencing library, and performing nanopore sequencing or SMRT sequencing on the constructed long fragment sequencing library to obtain sequence characteristics. The sequence characteristics of the invention are selected from the source, length, identity, sequence and secondary structure of the DNA, or whether the DNA is modified.

In one embodiment of the present invention, the method for constructing the library comprises performing end repair on the long nucleic acid sequence, adding an A at the 3' end, and ligating adapters to complete the library preparation.

In one embodiment of the invention, the method of constructing the library comprises providing a transposase and transposon composition, contacting and reacting the composition with the product obtained by amplification, and adding the adapters required for sequencing to the amplified long nucleic acid sequences.

Preferably, the nanopore sequencing step comprises contacting DNA in the constructed long fragment sequencing library with a pore and a helicase, such that the helicase controls movement of the DNA in the library through the pore; one or more characteristics of the DNA when interacting with the pore are obtained by electrical and/or optical measurements.

Preferably, the step of SMRT sequencing comprises immobilizing DNA polymerase and single strands of DNA in a long fragment sequencing library in a small well, adding fluorescently labeled free deoxynucleotides, and detecting a fluorescent signal under a microscope when the fluorescently labeled deoxynucleotides are incorporated into the DNA strands in the library.

In one embodiment of the present invention, the method comprises:

obtaining single cells, and extracting nucleic acid in the single cells; the method for obtaining the single cell is selected from limiting dilution, micromanipulation, fluorescence flow sorting, microdroplet technology, laser microdissection, or direct picking of single cells with a capillary, and the step of extracting the nucleic acid in the single cell comprises lysing the single cell and extracting genomic DNA;

amplifying the nucleic acid in the single cell to obtain a long nucleic acid sequence of 1-100 kb; the amplification comprises binding a hexamer random primer (N6) to the genomic DNA, then binding polymerase to the 3' end of the hexamer random primer to perform a strand displacement amplification reaction, with the displaced single strand in turn binding primer to undergo the next round of strand displacement amplification, so that a long nucleic acid product with a multi-branched structure is obtained; the multi-branched amplification product is digested using the single-stranded DNA cleavage activity of S1 nuclease to obtain a linear double-stranded blunt-ended nucleic acid product with a length of 1-100 kb, which is the long nucleic acid sequence;

performing single molecule sequencing of the long nucleic acid sequence; the single molecule sequencing comprises constructing a long fragment sequencing library from the obtained 1-100 kb long nucleic acid sequence; unwinding double-stranded DNA in the long fragment sequencing library into single-stranded DNA under the action of helicase, the single-stranded DNA passing through a nanopore embedded in a membrane under the driving of voltage as it is unwound; the four different bases on the single-stranded DNA produce different blockade currents when passing through the nanopore, and the base sequence of the DNA is determined by recording the magnitude of the blockade current.

Preferably, the DNA polymerase in the amplification step is Phi29 DNA polymerase.

Preferably, the strand displacement amplification is performed under a constant temperature condition, and the constant temperature is 20-40 ℃.

In one embodiment of the present invention, the strand displacement amplification is performed under a constant temperature condition, wherein the constant temperature is 30 ℃.

Preferably, the cell lysis method comprises the steps of adding cell lysis solution into the single cell suspension, centrifuging, incubating, adding stop solution, and centrifuging.

More preferably, the cell lysis solution is a mixture of Reconstituted Buffer DLB and dithiothreitol (DTT).

More preferably, the incubation temperature is 55-75 ℃.

In one embodiment of the invention, the incubation temperature is 65 ℃.
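The temperature conditions described above can be captured in a small configuration object that checks a candidate protocol against the preferred ranges. This is only a sketch: the class name `Conditions` and its field names are hypothetical, with defaults taken from the embodiment values in the text (65 ℃ lysis incubation, 30 ℃ amplification).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Conditions:
    lysis_temp_c: float = 65.0          # embodiment value; preferred range 55-75
    amplification_temp_c: float = 30.0  # embodiment value; preferred range 20-40

    def within_preferred_ranges(self):
        """True if both temperatures fall inside the preferred ranges in the text."""
        return (55 <= self.lysis_temp_c <= 75
                and 20 <= self.amplification_temp_c <= 40)

print(Conditions().within_preferred_ranges())                   # True
print(Conditions(lysis_temp_c=95.0).within_preferred_ranges())  # False
```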

The helicase of the invention is of the Hel308 family.

The nanopore is a transmembrane pore, and the transmembrane pore is a biological pore, a solid-state pore, or a hybrid biological and solid-state pore.

Preferably, the biological pore is selected from the group consisting of hemolysin, leukocidin, Mycobacterium smegmatis porin A (MspA), CsgG, Mycobacterium smegmatis porin B, Mycobacterium smegmatis porin C, Mycobacterium smegmatis porin D, lysenin, MZA, outer membrane protein F (OmpF), outer membrane protein G (OmpG), outer membrane phospholipase A, or Neisseria autotransporter lipoprotein (NalP).

In another embodiment of the present invention, the method for single cell whole genome sequencing comprises:

obtaining single cells, and extracting nucleic acid in the single cells; the method for obtaining the single cell is selected from limiting dilution, micromanipulation, fluorescence flow sorting, microdroplet technology, laser microdissection, or direct picking of single cells with a capillary, and the step of extracting the nucleic acid in the single cell comprises lysing the single cell and extracting genomic DNA;

amplifying the nucleic acid in the single cell to obtain a long nucleic acid sequence of 1-100 kb; the amplification comprises binding a hexamer random primer (N6) to the genomic DNA, then binding DNA polymerase to the 3' end of the hexamer random primer to perform a strand displacement amplification reaction, with the displaced single strand in turn binding primer to undergo the next round of strand displacement amplification, so that a long nucleic acid product with a multi-branched structure is obtained; the multi-branched amplification product is digested using the single-stranded DNA cleavage activity of S1 nuclease to obtain a linear double-stranded blunt-ended nucleic acid product with a length of 1-100 kb, which is the long nucleic acid sequence;

performing single molecule sequencing of the long nucleic acid sequence; the single molecule sequencing comprises constructing a long fragment sequencing library from the obtained 1-100 kb long nucleic acid sequence; immobilizing DNA polymerase and single DNA strands from the long fragment sequencing library in a small well, adding fluorescently labeled free deoxynucleotides, and detecting a fluorescence signal with a microscope when the fluorescently labeled deoxynucleotides are incorporated into the DNA strands in the library.

Preferably, the long fragment in the long fragment sequencing library has a length of 1-100 kb.

More preferably, the incubation temperature is 55-75 ℃.

In one embodiment of the invention, the incubation temperature is 65 ℃.

The invention also provides the application of the amplification method in single molecule sequencing.

Preferably, the amplification method comprises contacting the nucleic acid with a nucleic acid polymerase and random primers to perform multiple strand displacement amplification of the nucleic acid.

Further preferably, the amplification method comprises contacting the nucleic acid with a nucleic acid polymerase and random primers under a constant temperature condition to perform multiple strand displacement amplification of the nucleic acid.

In one embodiment of the invention, the amplification method comprises the steps of binding a hexamer random primer (N6) to genomic DNA, then binding phi29 DNA polymerase to the 3' end of the hexamer random primer to perform a strand displacement amplification reaction, with the displaced single strand in turn binding primer to undergo the next round of strand displacement amplification. Long nucleic acid products with multi-branched structures are finally obtained. The multi-branched amplification product is digested using the single-stranded DNA cleavage activity of S1 nuclease to obtain a linear double-stranded blunt-ended nucleic acid product with a length of 1-100 kb.

Preferably, the single-molecule sequencing is single-molecule sequencing based on an optical signal or single-molecule sequencing based on an electric signal.

More preferably, the single molecule sequencing based on optical signals is SMRT sequencing.

In one embodiment of the invention, the step of SMRT sequencing comprises immobilizing DNA polymerase and single strands of DNA in a long fragment sequencing library in small wells, adding fluorescently labeled free deoxynucleotides, and detecting fluorescent signals microscopically when the fluorescently labeled deoxynucleotides are incorporated into the DNA strands in the library.

More preferably, the single molecule sequencing based on electrical signals may be nanopore sequencing.

In one embodiment of the present invention, the nanopore sequencing step comprises unwinding double-stranded DNA in the long fragment sequencing library into single-stranded DNA by helicase, the single-stranded DNA passing through the membrane-embedded nanopore under the driving of a voltage as it is unwound; the four different bases on the single-stranded DNA produce different blockade currents when passing through the nanopore, and the base sequence of the DNA is determined by recording the magnitude of the blockade current.

The invention also provides the application of the amplification method in constructing the gene library.

In one embodiment of the invention, the amplification method comprises the steps of binding a hexamer random primer (N6) to genomic DNA, binding phi29 DNA polymerase to the 3' end of the hexamer random primer to perform a strand displacement amplification reaction, with the displaced single strand in turn binding primer to undergo the next round of strand displacement amplification. Long nucleic acid products with multi-branched structures are finally obtained. The multi-branched amplification product is digested using the single-stranded DNA cleavage activity of S1 nuclease to obtain a linear double-stranded blunt-ended nucleic acid product with a length of 1-100 kb.

Preferably, the constructed gene library is a library constructed for single molecule sequencing.

More preferably, the library is a long fragment sequencing library for single molecule sequencing based on optical signals or single molecule sequencing based on electrical signals.

In a specific embodiment of the present invention, the single molecule sequencing is nanopore sequencing or SMRT sequencing.

The term "and/or" as used herein covers each listed item individually as well as any combination of the listed items.

The terms "comprises" and "comprising" as used herein are open-ended terms that specify the presence of the stated elements or steps without excluding the presence of other elements or steps.

The amplification method of the invention is used for amplifying the single cell whole genome. The amount of sample required for amplification is low (as low as 10⁻¹⁵ g), and single molecule sequencing of the amplified sample can improve the genome coverage of the single cell whole genome amplification product and of the DNA library used for single cell whole genome sequencing. The strand displacement activity of the phi29 DNA polymerase used for amplification allows continuous amplification under constant temperature conditions, which avoids the uneven amplification of PCR reactions, and the genome template is distributed into different closed spaces by means of the primer, so that the amplification reaction is more uniform, a higher genome coverage is obtained at a lower sequencing depth, and information such as copy number variation (CNV) is more easily detected. The amplification method yields long-fragment DNA products; the branched structure is digested using the single-strand cleavage activity of S1 nuclease to obtain double-stranded blunt-ended products, which facilitates single molecule sequencing and the later assembly of the long-fragment reads, preserves more structural information about genes, allows analysis of structural variation of the genome, and avoids the loss of information such as the relative positions of genes that occurs with conventional second-generation sequencing.

Drawings

Embodiments of the invention are described in detail below with reference to the attached drawing figures, wherein:

FIG. 1: Experimental flow chart of single cell genome amplification by the amplification method of the invention and sequencing analysis.

FIG. 2: Gel electrophoresis result of the amplification product of the amplification method of the invention.

FIG. 3: Gel electrophoresis result of the S1 nuclease digested amplification product of the amplification method of the invention.

FIG. 4: Read length distribution from sequencing of the single cell genome amplified by the amplification method of the invention.

FIG. 5: Genome coverage distribution from sequencing of the single cell genome amplified by the amplification method of the invention.

FIG. 6: Experimental flow chart of single cell genome amplification by the microdroplet-based amplification method of the invention and sequencing analysis.

FIG. 7: Generation of microdroplets encapsulating the genomic DNA of a single cell.

FIG. 8: Gel electrophoresis result of the S1 nuclease digested amplification product of the microdroplet-based amplification method of the invention.

FIG. 9: Gel electrophoresis result of the digestion product.

FIG. 10: Gel electrophoresis result of the extracted bulk genomic DNA.

FIG. 11: Results of the structural variation (SV) analysis of the sequencing, where panel A shows the data before filtering and panel B shows the data after filtering; INVDUP denotes inverted duplications, DEL deletions, INS insertions, TRA translocations, INV inversions, and DUP duplications.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only some embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Example 1 amplification of Single cell genome and sequencing analysis according to the amplification method of the invention

The experimental flow is shown in figure 1, and the specific experimental steps and results are as follows:

1. obtaining single cells

(1) K562 cells (a human chronic myelogenous leukemia cell line) were thawed and cultured to 10⁶ cells/mL.

(2) Single cells were picked up into PCR tubes using a capillary.

2. Single cell genome extraction and amplification

The amplification system was formulated using a single cell amplification kit (REPLI-g single cell kit (Qiagen, Cat. No. 150343)).

(1) μL of Reconstituted Buffer DLB was added to 3 μL of dithiothreitol (DTT, 1 M) to prepare Buffer D2.

(2) Add 3 μL of Buffer D2 to 4 μL of single cell suspension, mix gently, and centrifuge.

(3) Incubate at 65 ℃ for 10 min.

(4) Add 3 μL of stop solution, mix gently, and centrifuge to obtain the single cell lysate.

(5) Prepare the reaction buffer using the kit.

(6) Add 40 μL of the buffer to 10 μL of single cell lysate, mix, and centrifuge.

(7) Incubate at 30 ℃ for 8 h to obtain the amplification product.

(8) The results of detection using 1% agarose gel electrophoresis are shown in FIG. 2.
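The volumes in steps (2), (4) and (6) can be checked with simple arithmetic. The sketch below assumes that no volume is lost during mixing and centrifugation and that the whole lysate goes into the amplification reaction; the variable names are illustrative only.

```python
# Volumes in microlitres, taken from steps (2), (4) and (6) above.
BUFFER_D2_UL = 3
CELL_SUSPENSION_UL = 4
STOP_SOLUTION_UL = 3
REACTION_BUFFER_UL = 40

lysate_ul = BUFFER_D2_UL + CELL_SUSPENSION_UL + STOP_SOLUTION_UL
reaction_ul = lysate_ul + REACTION_BUFFER_UL

print(lysate_ul)    # 10, matching the lysate volume used in step (6)
print(reaction_ul)  # 50
```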

3. The amplification product from step 2 was digested with S1 nuclease (Promega, M5761).

(1) The S1 nuclease digestion system was prepared as shown in Table 1.

TABLE 1 S1 nuclease digestion system components and volumes

(2) Incubate at 37 ℃ for 30 min.

(3) The digestion products were detected by 1% agarose gel electrophoresis. The results are shown in FIG. 3.

4. The digestion product was recovered by column purification and sent to Wuhan Michelia for Oxford Nanopore GridION sequencing.

5. Analysis of results

The amplification reaction is uniform, the obtained long fragment is beneficial to nanopore sequencing, and higher genome coverage rate is obtained under lower sequencing depth, and the specific sequencing result is as follows:

Sequencing yielded 4.57 Gb of data at a sequencing depth of 1.52×, with a median read length of 4296 bp; the read length distribution is shown in FIG. 4. The average genome coverage was 50%; the genome coverage is shown in FIG. 5.
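The relationship between data yield and sequencing depth can be illustrated with a short calculation. The genome size of 3.0 × 10⁹ bp used below is an assumption (an approximate haploid human genome size) and is not stated in the text; the function name `sequencing_depth` is hypothetical.

```python
HUMAN_GENOME_BP = 3.0e9  # assumption: approximate haploid human genome size

def sequencing_depth(total_bases, genome_bp=HUMAN_GENOME_BP):
    """Average sequencing depth = sequenced bases / genome size."""
    return total_bases / genome_bp

print(round(sequencing_depth(4.57e9), 2))  # 1.52
```

Under this assumed genome size, the yield of 4.57 Gb corresponds to the stated depth of about 1.52×.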

Example 2 amplification of Single cell genome and sequencing analysis according to the microdroplet-based amplification method of the invention

The experimental flow is shown in fig. 6, and the specific experimental steps and results are as follows:

1. obtaining single cells

(2) Single cells were picked up into PCR tubes using a capillary.

2. Single cell genome extraction and amplification

The amplification system was formulated using a single cell genome amplification kit (REPLI-g Single Cell Kit, Qiagen, Cat. No. 150343).

(1) μL of Reconstituted Buffer DLB was added to 3 μL of dithiothreitol (DTT, 1 M) to prepare Buffer D2.

(3) Incubate at 65 ℃ for 10 min.

(5) Droplet preparation (refer to the QX200 ddPCR hydroprene protocol)

μL of the amplification reaction and 50 μL of HFE-7500 fluorinated oil were added to each well of the DG8 cartridge.

The droplet generator (QX200 Droplet Generator, Bio-Rad, 1864002) was turned on, and droplets were generated in the third well of the DG8 cartridge. Droplet generation is shown in FIG. 7.

The droplets were pipetted out and transferred to a PCR tube.

(6) Incubate at 30 ℃ for 8 h.

3. DNA extraction assay

Take out the amplification product, add an equal volume of PFO (1H,1H,2H,2H-perfluoro-1-octanol; Sigma-Aldrich, cat. no. 370533), shake to mix, and centrifuge at 10000 g. Pipette the upper aqueous phase into a new PCR tube (for details of this step, refer to "Single-cell analysis and sorting using droplet-based microfluidics", DOI: 10.1038/nprot.2013.046). The results of detection by 1% agarose gel electrophoresis are shown in FIG. 8.

4. The DNA amplification product was digested with S1 nuclease (Promega, M5761); the S1 nuclease digestion system components and volumes are shown in Table 1.

Incubate at 37 ℃ for 30 min.

The digestion products were detected by 1% agarose gel electrophoresis, and the results are shown in FIG. 9.

5. The digestion product was purified by column purification and sequenced on an Oxford Nanopore GridION.

6. Analysis of results

The average read length was about 5 kb (see Table 2 for specific read lengths), and approximately 50% of the human genome was covered at a depth of 1.3×. The structural variation (SV) analysis is shown in FIG. 11 and Table 3.

TABLE 2 nanopore sequencing read length analysis

Total bases	4568672850	Average read length	4908.28
Total number of reads	1103562	N50 read length	583362
Bases	3995439568	Median read length	4296
Number of reads	814021	Longest read length	392452
Average score	9.78
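The read length statistics in Table 2 follow standard definitions, which can be sketched as below. The mean is computed from the 3995439568 bases and 814021 reads listed in the table; the N50 function is shown on a hypothetical toy list of read lengths, since the individual read lengths are not given in the text. Both function names are illustrative.

```python
def mean_read_length(total_bases, read_count):
    """Mean read length = total bases / number of reads."""
    return total_bases / read_count

def n50(lengths):
    """Largest length L such that reads of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("lengths must not be empty")

print(round(mean_read_length(3995439568, 814021), 2))  # 4908.28
print(n50([2, 3, 4, 5, 6]))  # 5, for these hypothetical toy read lengths
```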

TABLE 3 data before and after structural variation Filtering

SV type	Before filtering (rough)	After filtering (accurate)	Total structural variation
Deletion	47	2	49
Insertion	55	2	57
Duplication	141	22	163
Inversion	25	4	29
Translocation	30	3	33
Inverted duplication	4	0	4
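The internal consistency of Table 3 can be checked by confirming that, in each row, the first two counts add up to the total. In the sketch below the row names are rendered as standard structural variation terms, which is an interpretation of the translated row labels; the names `SV_COUNTS` and `totals_consistent` are hypothetical.

```python
# (before filtering, after filtering, total) per structural variation type, from Table 3
SV_COUNTS = {
    "deletion": (47, 2, 49),
    "insertion": (55, 2, 57),
    "duplication": (141, 22, 163),
    "inversion": (25, 4, 29),
    "translocation": (30, 3, 33),
    "inverted duplication": (4, 0, 4),
}

def totals_consistent(counts):
    """True if the first two columns sum to the third column in every row."""
    return all(before + after == total for before, after, total in counts.values())

print(totals_consistent(SV_COUNTS))                      # True
print(sum(total for _, _, total in SV_COUNTS.values()))  # 335
```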

Example 3 comparative experiment

1. Extraction of bulk genomic DNA.

1.1 K562 cells (a human chronic myelogenous leukemia cell line) were thawed and cultured to 10⁶ cells/mL.

1.2 Take 2 × 10⁶ cells and extract genomic DNA by the spin column method. The total mass of DNA was 20 μg, as determined by absorbance (NanoDrop). The results of 1% agarose gel electrophoresis are shown in FIG. 10.
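As a rough worked example, the bulk yield can be converted into an average amount of DNA per cell. The sketch assumes complete recovery of the genomic DNA from all cells, which the text does not state; the variable names are illustrative.

```python
TOTAL_DNA_G = 20e-6  # 20 micrograms of extracted DNA
CELL_COUNT = 2e6     # 2 x 10^6 cells

dna_per_cell_pg = TOTAL_DNA_G / CELL_COUNT * 1e12  # grams -> picograms
print(round(dna_per_cell_pg, 1))  # 10.0
```

Under this assumption each cell contributes on the order of 10 pg of DNA, which indicates how little starting material a single cell provides.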

2. Bulk genomic DNA was sent to Wuhan future group companies for sequencing.

3. The bulk genomic DNA sequencing result was taken as the original sequence before amplification and used as the genomic reference sequence. By comparing this original sequence with the sequencing results of the single cell amplification method and of the microdroplet-based amplification method, the probability of base mutations and chromosome structural variations arising during amplification can be evaluated for the two single cell genome amplification methods, and the quality of the two amplification methods can be judged.
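One simple way to express such a comparison is a per-base mismatch rate between the bulk reference and an amplified sequence. The sketch below assumes the two sequences have already been aligned to equal length and contain substitutions only; the sequences and the function name `mismatch_rate` are hypothetical and do not represent the analysis actually performed.

```python
def mismatch_rate(reference, sample):
    """Fraction of aligned positions at which sample differs from reference."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not reference:
        raise ValueError("sequences must not be empty")
    mismatches = sum(r != s for r, s in zip(reference, sample))
    return mismatches / len(reference)

bulk = "ACGTACGTAC"         # hypothetical bulk reference fragment
single_cell = "ACGTACCTAC"  # hypothetical single cell read with one substitution
print(mismatch_rate(bulk, single_cell))  # 0.1
```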

The preferred embodiments of the present invention have been described in detail, however, the present invention is not limited to the specific details of the above embodiments, and various simple modifications may be made to the technical solution of the present invention within the technical idea of the present invention, and these simple modifications are within the protective scope of the present invention.

It should be noted that the technical features described in the above embodiments can be combined in any suitable manner provided there is no contradiction; to avoid unnecessary repetition, the possible combinations are not described separately.

Claims

1. A method for single cell whole genome sequencing, which is characterized by comprising the following steps:

obtaining single cells, and extracting nucleic acid in the single cells;

performing single molecule sequencing of the long nucleic acid sequence;

wherein the amplification method comprises contacting the nucleic acid with a nucleic acid polymerase and random primers to carry out multiple strand displacement amplification of the nucleic acid.

2. The method of claim 1, wherein the amplification method comprises contacting the nucleic acid with a nucleic acid polymerase and random primers under isothermal conditions to carry out multiple strand displacement amplification of the nucleic acid.

3. The method according to claim 1 or 2, wherein the long nucleic acid sequence is selected from the group consisting of single-stranded DNA, double-stranded DNA, single-stranded RNA, double-stranded RNA/DNA hybrids, partially hybridized DNA and RNA, and enzymatically, chemically or biologically treated DNA or RNA.

4. The method of any one of claims 1 to 3, wherein the single molecule sequencing is selected from single molecule sequencing based on an optical signal or single molecule sequencing based on an electrical signal; preferably, the single molecule sequencing is nanopore sequencing or single molecule fluorescence real-time sequencing.

5. The method of any one of claims 1 to 4, wherein the single cell is obtained by a method selected from the group consisting of limiting dilution, micromanipulation, fluorescence flow sorting, microdroplet techniques, laser microdissection techniques and direct capillary picking of single cells.

6. The method of any one of claims 1 to 5, wherein the single molecule sequencing comprises constructing a long fragment sequencing library, and subjecting the constructed long fragment sequencing library to nanopore sequencing or single molecule fluorescence real-time sequencing.

7. The method according to any one of claims 1 to 6, wherein the method comprises:

amplifying the nucleic acid in the single cell to obtain a long nucleic acid sequence of 1-100 kb; wherein the amplification comprises: binding hexamer random primers (N6) to the genomic DNA; binding a polymerase to the 3' end of the hexamer random primers to carry out a strand displacement amplification reaction, the displaced single strands continuing to bind primers and undergo the next round of strand displacement amplification, so as to obtain a long nucleic acid product with a multi-branched structure; and digesting the multi-branched amplification product with S1 nuclease to obtain a linear, double-stranded, blunt-ended nucleic acid product with a length of 1-100 kb, the linear double-stranded blunt-ended nucleic acid product being the long nucleic acid sequence.

8. The method according to any one of claims 1 to 6, wherein the method comprises:

amplifying the nucleic acid in the single cell to obtain a long nucleic acid sequence of 1-100 kb; wherein the amplification comprises: binding hexamer random primers (N6) to the genomic DNA; binding a DNA polymerase to the 3' end of the hexamer random primers to carry out a strand displacement amplification reaction, the displaced single strands continuing to bind primers and undergo the next round of strand displacement amplification, so as to obtain a long nucleic acid product with a multi-branched structure; and digesting the multi-branched amplification product with S1 nuclease to obtain a linear, double-stranded, blunt-ended nucleic acid product with a length of 1-100 kb, the linear double-stranded blunt-ended nucleic acid product being the long nucleic acid sequence.

9. The method of claim 7 or 8, wherein the DNA polymerase in the amplifying step is Phi29 DNA polymerase.

10. The method according to any one of claims 7 to 9, wherein the strand displacement amplification is carried out at a constant temperature of 20 ℃ to 40 ℃.

11. The method according to any one of claims 7 to 10, wherein the strand displacement amplification is carried out at a constant temperature of 30 ℃.

12. The method of any one of claims 7 to 11, wherein the long fragments in the long fragment sequencing library are 1-100 kb in length.

13. Use of an amplification method in single molecule sequencing, wherein the amplification method comprises contacting nucleic acid with a nucleic acid polymerase and random primers to carry out multiple strand displacement amplification of the nucleic acid.

14. Use of an amplification method according to claim 13 in single molecule sequencing, wherein the amplification method comprises: binding hexamer random primers (N6) to genomic DNA; binding phi29 DNA polymerase to the 3' end of the hexamer random primers to carry out a strand displacement amplification reaction, the displaced single strands continuing to bind primers and undergo the next round of strand displacement amplification, so as to obtain a long nucleic acid product with a multi-branched structure; and digesting the multi-branched amplification product with S1 nuclease to obtain a linear, double-stranded, blunt-ended nucleic acid product with a length of 1-100 kb.

15. Use of an amplification method according to claim 13 or 14 in single molecule sequencing, wherein the single molecule sequencing is selected from single molecule sequencing based on an optical signal or single molecule sequencing based on an electrical signal.

16. Use of an amplification method according to claim 15 in single molecule sequencing, wherein said single molecule sequencing comprises nanopore sequencing or single molecule fluorescence real-time sequencing.

17. Use of an amplification method in the construction of a long fragment sequencing library, wherein the amplification method comprises contacting nucleic acid with a nucleic acid polymerase and random primers to carry out multiple strand displacement amplification of the nucleic acid.

18. Use of an amplification method according to claim 17 in the construction of a long fragment sequencing library, wherein the amplification method comprises: binding hexamer random primers (N6) to genomic DNA; binding phi29 DNA polymerase to the 3' end of the hexamer random primers to carry out a strand displacement amplification reaction, the displaced single strands continuing to bind primers and undergo the next round of strand displacement amplification, so as to obtain a long nucleic acid product with a multi-branched structure; and digesting the multi-branched amplification product with S1 nuclease to obtain a linear, double-stranded, blunt-ended nucleic acid product with a length of 1-100 kb.

19. Use of an amplification method according to claim 17 or 18 in the construction of a long fragment sequencing library, wherein the single molecule sequencing is selected from single molecule sequencing based on an optical signal or single molecule sequencing based on an electrical signal.

20. Use of an amplification method according to any one of claims 17 to 19 in the construction of a long fragment sequencing library, wherein the single molecule sequencing comprises nanopore sequencing or single molecule fluorescence real-time sequencing.