Method for detecting chromosome abnormality and recombination site DNA sequence
Technical Field
The invention relates to a method for detecting chromosomal abnormalities, and in particular to a low-cost method for rapidly identifying a chromosomal abnormality, locating its recombination sites, and determining the DNA sequence of the recombination region.
Background
Birth defects seriously affect population quality and cause great losses to the nation, society and families. Structural and numerical chromosomal abnormalities are the most important cause of birth defects; structural abnormalities include chromosomal translocations, chromosomal inversions, and the like.
A chromosomal translocation is a change in the location of a chromosomal fragment. A translocation within a single chromosome is referred to as an intrachromosomal translocation; a translocation between two homologous or non-homologous chromosomes is referred to as an interchromosomal translocation.
A chromosomal inversion arises when two breaks occur on the same chromosome and the DNA fragment between them is rotated by 180 degrees and rejoined. If the inversion lies within one arm of the chromosome, it is called an intra-arm (paracentric) inversion; if the inverted fragment contains the centromere, it is called an inter-arm (pericentric) inversion (Anthony J.F. Griffiths et al., An Introduction to Genetic Analysis, eighth edition).
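The definition of an inversion given above can be illustrated with a minimal Python sketch. It assumes the chromosome is represented as a plain string of A, C, G and T, and that the centromere is given as a coordinate interval; the function names are hypothetical and serve only as illustration.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def invert_segment(chrom: str, start: int, end: int) -> str:
    """Model two breaks at start and end (0-based, end-exclusive):
    the fragment between them is rotated by 180 degrees and rejoined."""
    return chrom[:start] + reverse_complement(chrom[start:end]) + chrom[end:]


def classify_inversion(start: int, end: int, cen_start: int, cen_end: int) -> str:
    """Classify an inversion by whether the inverted fragment contains
    the centromere interval [cen_start, cen_end)."""
    if start <= cen_start and cen_end <= end:
        return "inter-arm"   # also called pericentric
    if end <= cen_start or start >= cen_end:
        return "intra-arm"   # also called paracentric
    return "breakpoint inside centromere"
```

Rotating a double-stranded fragment by 180 degrees is equivalent to replacing it with its reverse complement on the strand being read, which is why applying `invert_segment` twice to the same interval restores the original sequence.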
For many years, karyotyping and fluorescence in situ hybridization (FISH) have been the main techniques for analyzing chromosomal abnormalities. They offer a cost advantage in identifying the approximate region of chromosomal recombination, but their sensitivity for prenatal diagnosis of chromosomal abnormalities is only 71.9% (Yang Ling et al., J. eugenics and genetics, 2009, 17 (9): 1-4). In recent years, with the rapid development of next-generation high-throughput sequencing (NGS), NGS has been widely used to analyze chromosomal abnormalities. However, because higher eukaryotic chromosomes contain a large number of repetitive sequences and NGS reads are short, NGS is of limited use in complex situations such as chromosomal translocation and inversion. BioNano and OpGen technologies can produce a genome-wide framework map, which helps to identify the location of a chromosomal abnormality. Nevertheless, existing schemes still have difficulty determining the DNA sequence at the site of a chromosomal abnormality rapidly and inexpensively.
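The limitation of short reads in repetitive sequence can be illustrated with a minimal Python sketch. It assumes exact string matching in place of a real aligner, and the toy reference and read sequences are invented for illustration only.

```python
def mapping_positions(read: str, ref: str) -> list:
    """Return every 0-based position where read occurs exactly in ref.
    Overlapping occurrences are included."""
    positions = []
    i = ref.find(read)
    while i != -1:
        positions.append(i)
        i = ref.find(read, i + 1)
    return positions


# Toy reference: two blocks of a tandem repeat separated by a unique spacer.
reference = "ACGT" + "TTAGGG" * 3 + "CATG" + "TTAGGG" * 3 + "GGCC"

short_read = "TTAGGG"            # lies entirely inside the repeat
long_read = "TTAGGGCATGTTAGGG"   # spans the unique spacer

print(mapping_positions(short_read, reference))
print(mapping_positions(long_read, reference))
```

The short read matches at six positions and therefore cannot be placed, whereas the read that reaches unique sequence matches once. A read whose start position is already known, as with sequencing that starts from a defined nick, avoids this ambiguity.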
Disclosure of Invention
To address the problem that existing technical schemes cannot sequence the DNA of chromosomal recombination sites (translocations, inversions and the like) rapidly and cheaply, the invention provides a novel method for detecting the DNA sequence of a recombination site.
The first aspect of the invention provides a method for detecting chromosomal variation, comprising:
cleaving chromosomal DNA, according to CRISPR technology, using a Cas9 enzyme with single-strand cleavage (nickase) activity,
denaturing the double-stranded DNA into single-stranded DNA,
and performing directional single-stranded DNA sequencing on an NGS platform, comparing the result with the normal chromosomal DNA sequence, and judging whether chromosomal variation exists.
In a second aspect, the present invention provides a method for detecting chromosomal variation, comprising:
cleaving chromosomal DNA, according to CRISPR technology, using a Cas9 enzyme with single-strand cleavage (nickase) activity,
denaturing the double-stranded DNA into single-stranded DNA,
performing directional single-stranded DNA sequencing on an NGS platform, comparing the result with the normal chromosomal DNA sequence, and judging whether chromosomal variation exists;
and, if variation is judged to be present, determining the precise region of the recombination site and determining the DNA sequence of the recombination site within that precise region.
In a third aspect, the present invention provides a method for detecting the DNA sequence of a chromosomal recombination site, comprising:
preliminarily locating the approximate region of the chromosomal recombination site,
designing and preparing an ordered sgRNA sequence based on the reference sequence of that region of the chromosome,
cleaving chromosomal DNA, according to CRISPR technology, using a Cas9 enzyme with single-strand cleavage (nickase) activity,
denaturing the double-stranded DNA into single-stranded DNA,
and performing directional single-stranded DNA sequencing on an NGS platform, determining the precise region of the recombination site, and determining the DNA sequence of the recombination site within that precise region.
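The step of designing an ordered sgRNA sequence from a reference sequence can be sketched as follows. This is a minimal illustration under stated assumptions, not the design procedure of the invention: it assumes the NGG PAM of the commonly used Streptococcus pyogenes Cas9, scans only the forward strand, and does not check guide uniqueness or off-target matches. The function names and the `spacing` parameter are hypothetical.

```python
import re


def find_protospacers(ref: str, guide_len: int = 20):
    """Yield (position, protospacer) for each forward-strand site that is
    immediately followed by an NGG PAM. A lookahead is used so that
    overlapping candidate sites are all reported."""
    pattern = re.compile(r"(?=([ACGT]{%d})[ACGT]GG)" % guide_len)
    for match in pattern.finditer(ref):
        yield match.start(), match.group(1)


def ordered_guides(ref: str, spacing: int, guide_len: int = 20) -> list:
    """Greedily select guides in reference order so that consecutive
    guides start at least `spacing` bases apart.
    Returns a list of (order, position, protospacer)."""
    chosen = []
    next_allowed = 0
    for pos, spacer in find_protospacers(ref, guide_len):
        if pos >= next_allowed:
            chosen.append((len(chosen) + 1, pos, spacer))
            next_allowed = pos + spacing
    return chosen
```

Because each selected guide keeps its order number and reference position, a read that starts at a given nick can later be related back to a known coordinate in the approximate region.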
The fourth aspect of the invention provides a method for detecting the DNA sequence of an interchromosomal translocation recombination site, comprising the following steps:
preliminarily locating the approximate region of chromosomal recombination,
designing and preparing a first set of ordered sgRNA sequences based on the reference sequence of the approximate recombination region of the first chromosome,
cleaving chromosomal DNA, according to CRISPR technology, using a Cas9 enzyme with single-strand cleavage (nickase) activity,
denaturing the double-stranded DNA into single-stranded DNA,
performing directional single-stranded DNA sequencing on an NGS platform, determining the precise region of the recombination site of the first chromosome, and determining the DNA sequence of the recombination site within that precise region;
designing and preparing a second set of ordered sgRNA sequences based on the reference sequence of the approximate recombination region of the second chromosome,
cleaving chromosomal DNA, according to CRISPR technology, using a Cas9 enzyme with single-strand cleavage (nickase) activity,
denaturing the double-stranded DNA into single-stranded DNA,
and performing directional single-stranded DNA sequencing on an NGS platform, determining the precise region of the recombination site of the second chromosome, and determining the DNA sequence of the recombination site within that precise region.
In the above context of the present invention, the meaning of the "approximate region" is clear to the skilled person and is generally a larger region containing the site of chromosomal recombination, such as a region of 1-10 Mb, more preferably a region of 1-8 Mb, and still more preferably a region of 1-5 Mb.
In the context of the present invention, the meaning of the "precise region" is clear to the skilled person and is generally a smaller region containing the recombination site of the chromosome, such as ≤ 20 kb, more preferably ≤ 15 kb, more preferably ≤ 10 kb, more preferably ≤ 5 kb, more preferably ≤ 1 kb.
In the above-mentioned aspect of the present invention, the method for preliminarily mapping the chromosomal recombination approximate region may be selected from techniques such as karyotyping, FISH, and BioNano.
The step of preliminarily positioning the approximate region of the chromosome recombination can also comprise the judgment of the chromosome abnormality type.
In a preferred embodiment of the method of the present invention, the method may further comprise a step of incorporating base analogs at the nicks by the DNA nick translation technique after the chromosomal DNA is cleaved, and purifying the single-stranded DNA by means of the properties of the base analogs after the double-stranded DNA is denatured.
In the above-mentioned context of the present invention, the meaning of the "base analog" is clear to those skilled in the art, and refers to a compound having a chemical structure similar to that of a base and capable of being incorporated into a DNA molecule in place of a normal base, such as any one or more of 8-azaguanine, 6-mercaptopurine, 2-aminopurine, 5-bromouracil, 5-fluorouracil, 5-bromodeoxyuridine, maleic hydrazide, acrylamide-modified dNTPs, azide-modified dNTPs, digoxigenin-modified dNTPs, biotin-modified dNTPs, and 5-hydroxymethylcytosine, and more preferably biotin-modified dGTP.
In a preferred embodiment of the method of the present invention, after single-stranded DNA sequencing is performed on the NGS platform, the method may further comprise a step of accurately determining the target sequence using existing molecular biology techniques. The existing molecular biology techniques may be PCR or other known sequencing techniques.
Drawing on mature prior-art techniques, the invention can identify the approximate region of chromosomal recombination at low cost, and by using Cas9 in vitro multi-site single-strand cleavage it effectively overcomes shortcomings of NGS such as short read length, so that the NGS platform can be used directly for sequencing and the sequencing cost is greatly reduced.
The method of the invention can not only determine the information of inversion, translocation and the like of the chromosome, but also accurately determine the DNA sequence of the target region. Compared with the traditional chromosome abnormality detection means, the method can obtain more DNA sequence information.
Drawings
FIG. 1 is a schematic flow chart showing a method for determining a DNA sequence of a chromosomal inversion region in example 1 of the present invention;
FIG. 2 is a schematic flow chart showing the method for determining a DNA sequence of a chromosomal translocation region in example 2 of the present invention.
Detailed Description
The method for detecting the DNA sequence of the recombination sites according to the present invention will be described in detail with reference to the accompanying drawings.
Example 1:
(1) The status of the chromosomal abnormality (the type of abnormality, the chromosome involved, and its approximate location) is determined by classical karyotyping, FISH, BioNano, etc. As shown in FIG. 1, the chromosomal abnormality is a chromosomal inversion, and the inversion site is located to an approximate region (e.g., a 1-5 Mb region).
(2) An ordered sgRNA sequence is designed for the region, and the sgRNAs are prepared in vitro.
(3) Chromosomal DNA is extracted and cleaved in vitro, according to CRISPR technology, using Cas9 with single-strand cleavage (nickase) activity (e.g., D10A or H840A).
(4) The Cas9 enzyme is inactivated, and base analogs (e.g., biotin-modified dGTP) are incorporated at the nicks using the DNA nick translation technique.
(5) The DNA is randomly fragmented; the double strands are denatured into single-stranded DNA, and the single-stranded DNA is purified by means of the properties of the base analogs.
(6) Directional single-stranded DNA sequencing is performed on the NGS platform.
(7) Sequence analysis is performed to lock the precise region of the recombination site (e.g., <10 kb).
(8) If the exact recombination site sequence has not been found in (7), the target sequence can be determined precisely, on the basis of (7), using existing mature molecular biology techniques (PCR, sequencing, etc.).
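The sequence analysis of step (7) can be sketched in Python as follows. This is a minimal illustration under simplifying assumptions that are not part of the source: each read is taken to start exactly at a known nick position and to run along the forward strand, sequencing errors are ignored, and exact base-by-base comparison stands in for a real aligner. The function names and the toy sequences are hypothetical.

```python
def matched_prefix_len(read: str, ref: str, start: int) -> int:
    """Count how many leading bases of read agree with ref from start."""
    n = 0
    for read_base, ref_base in zip(read, ref[start:]):
        if read_base != ref_base:
            break
        n += 1
    return n


def locate_breakpoint(ref: str, reads_by_nick: dict):
    """Walk the nick positions in reference order and find the first read
    that stops agreeing with the reference.

    Returns (last_concordant_nick, estimated_breakpoint), or None if every
    read is concordant. Bases just past the true breakpoint may match the
    reference by chance, so the true breakpoint lies at or shortly before
    the estimate."""
    last_concordant = None
    for nick in sorted(reads_by_nick):
        read = reads_by_nick[nick]
        n = matched_prefix_len(read, ref, nick)
        if n < len(read):
            return last_concordant, nick + n
        last_concordant = nick
    return None


# Toy data: the sample carries an inversion of reference positions 10-17.
reference = "AAAACCCCGGGGTTTTACGTACGT"
sample = "AAAACCCCGGGTAAAACCGTACGT"
reads = {0: sample[0:6], 6: sample[6:12]}
result = locate_breakpoint(reference, reads)
```

In this toy case the read from the nick at position 0 is concordant and the read from the nick at position 6 diverges, so the recombination site is bracketed between the last concordant nick and the estimated breakpoint. That interval is the kind of narrowed window that step (8) would then resolve by PCR or further sequencing.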
Example 2:
(1) The status of the chromosomal abnormality (the type of abnormality, the chromosomes involved, and its approximate location) is determined by classical karyotyping, FISH, BioNano, etc. As shown in FIG. 2, the chromosomal abnormality is an interchromosomal translocation, and the translocation site is located to an approximate region (e.g., a 1-5 Mb region).
(2) An ordered sgRNA sequence is designed for the c1 region of the first chromosome, and the sgRNAs are prepared in vitro.
(3) Chromosomal DNA is extracted and cleaved in vitro, according to CRISPR technology, using Cas9 with single-strand cleavage (nickase) activity (e.g., D10A or H840A).
(4) The Cas9 enzyme is inactivated, and base analogs (e.g., biotin-modified dGTP) are incorporated at the nicks using the DNA nick translation technique.
(5) The DNA is randomly fragmented; the double strands are denatured into single-stranded DNA, and the single-stranded DNA is purified by means of the properties of the base analogs.
(6) Directional single-stranded DNA sequencing is performed on the NGS platform.
(7) Sequence analysis is performed to lock the precise region of the recombination site (e.g., <10 kb).
(8) An ordered sgRNA sequence is then designed for the c2 region of the second chromosome, the sgRNAs are prepared in vitro, and steps (2) to (7) above are repeated for that region.
(9) If the precise recombination site sequence has not been found in (7) or (8), the target sequence can be determined precisely, on the basis of (7) or (8), using existing mature molecular biology techniques (PCR, sequencing, etc.).
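For the translocation case, the analysis of a read that starts on the first chromosome and crosses into the second can be sketched as follows. This is a minimal illustration and not the analysis procedure of the invention: it assumes error-free reads, forward-strand orientation on both chromosomes, and exact string search in place of a real aligner. The function name and the `min_anchor` parameter are hypothetical.

```python
def split_junction_read(read: str, chr1: str, nick: int, chr2: str,
                        min_anchor: int = 20):
    """Split a read that starts at a known nick on chr1 into a chr1 part
    and a chr2 part.

    Returns (chr1_breakpoint, chr2_position), or None if the read is fully
    concordant with chr1 or its tail cannot be placed on chr2."""
    # Extend the match along chr1 from the nick position.
    n = 0
    while n < len(read) and nick + n < len(chr1) and read[n] == chr1[nick + n]:
        n += 1
    if n == len(read):
        return None  # no junction in this read

    # Try to place the remaining tail on chr2. If the tail is shorter than
    # min_anchor, move the split point back so the tail becomes longer.
    for split in range(n, -1, -1):
        tail = read[split:]
        if len(tail) >= min_anchor:
            pos = chr2.find(tail)
            if pos != -1:
                return nick + split, pos
    return None
```

When the bases on either side of the junction are identical on both chromosomes, the split point is ambiguous within that shared stretch. This sketch reports the rightmost consistent split, and confirmation by PCR as in step (9) would still be needed.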
CRISPR/Cas9 is an adaptive immune defense that bacteria and archaea developed during long-term evolution to combat invading viruses and foreign DNA (Horvath, P., and Barrangou, R., 2010). The CRISPR/Cas9 system provides immunity by integrating fragments of invading phage and plasmid DNA into the CRISPR locus and using the corresponding CRISPR RNAs (crRNAs) to direct degradation of homologous sequences. The system works as follows: the crRNA (CRISPR-derived RNA) binds to the tracrRNA (trans-activating crRNA) by base pairing to form a tracrRNA/crRNA complex, which directs the Cas9 nuclease to cleave double-stranded DNA at the target site that pairs with the crRNA. By engineering these two RNAs into a single guide RNA (sgRNA), one RNA molecule is sufficient to guide Cas9 to cut DNA at a defined site (Jinek, M., Chylinski, K., Fonfara, I., Hauer, M., Doudna, J.A., and Charpentier, E., 2012).
Cas9 contains two distinct nuclease domains, RuvC at the amino terminus and HNH in the middle of the protein, which play a role in crRNA maturation and double-stranded DNA cleavage. The HNH active site cleaves the DNA strand complementary to the crRNA, and the RuvC active site cleaves the non-complementary strand, together introducing a DNA double-strand break (DSB). When one of the domains is mutated (e.g., D10A in the RuvC1 domain or H840A in the HNH domain), the mutated Cas9 loses the corresponding activity and cuts only one strand (Jinek, M., Chylinski, K., Fonfara, I., Hauer, M., Doudna, J.A., and Charpentier, E. 2012. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science 337, 816-821). After a great deal of work, the in vitro Cas9 cleavage system has become very mature (e.g., sgRNAs are convenient to prepare in vitro, and companies such as NEB have begun to sell Cas9 enzymes).
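The relationship between the two domains, the two mutations, and the resulting cut can be summarized in a small Python sketch. The lookup tables restate only what the paragraph above says; the names `ACTIVE_DOMAINS`, `DOMAIN_TO_STRAND` and `cleavage_outcome` are hypothetical.

```python
# Which nuclease domains remain active in each Cas9 variant.
ACTIVE_DOMAINS = {
    "WT": {"RuvC", "HNH"},
    "D10A": {"HNH"},    # RuvC inactivated
    "H840A": {"RuvC"},  # HNH inactivated
}

# Which DNA strand each domain cleaves, relative to the crRNA.
DOMAIN_TO_STRAND = {
    "HNH": "complementary",
    "RuvC": "non-complementary",
}


def cleavage_outcome(variant: str):
    """Return (outcome, strands_cut) for a Cas9 variant."""
    strands = sorted(DOMAIN_TO_STRAND[d] for d in ACTIVE_DOMAINS[variant])
    if len(strands) == 2:
        return "double-strand break", strands
    return "nick", strands
```

Either single mutant therefore yields a nick rather than a double-strand break, which is the property the examples rely on when base analogs are incorporated at the nicks by nick translation.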
As can be seen from the above embodiments, the present invention has the following advantages:
1. Because existing techniques such as karyotyping and FISH are mature, the cost is low when only the approximate region of chromosomal recombination needs to be identified. NGS sequencing technology is also mature, so a detection process that can use the NGS platform directly is both quick and cheap. In the invention, Cas9 in vitro multi-site single-strand cleavage, nick translation and related techniques are used to label single-stranded DNA, which is then sequenced by NGS; this greatly reduces the sequencing cost and effectively overcomes shortcomings of NGS such as short read length.
2. By combining with NGS sequencing technology, the method can not only determine information such as chromosomal inversions and translocations, but also accurately determine the DNA sequence of the target region. Compared with traditional means of detecting chromosomal abnormalities, the method obtains more DNA sequence information.
The embodiments of the present invention have been described in detail above, but they are merely examples, and the present invention is not limited to them. Any equivalent modifications and substitutions apparent to those skilled in the art are also within the scope of the present invention. Accordingly, equivalent changes and modifications made without departing from the spirit and scope of the present invention shall be covered by the present invention.