CN113061626A

CN113061626A - Method for tissue-specific knockout of zebrafish genes and application thereof

Info

Publication number: CN113061626A
Application number: CN201911295922.9A
Authority: CN
Inventors: 李佳; 杜久林; 李红羽
Original assignee: Center for Excellence in Brain Science and Intelligence Technology Chinese Academy of Sciences
Current assignee: Center for Excellence in Brain Science and Intelligence Technology Chinese Academy of Sciences
Priority date: 2019-12-16
Filing date: 2019-12-16
Publication date: 2021-07-02
Anticipated expiration: 2039-12-16
Also published as: CN113061626B

Abstract

The invention provides a method for tissue-specific knockout of zebrafish genes and applications thereof. Specifically, the invention provides a highly efficient, tissue-specific zebrafish gene knockout method and a nucleic acid construct for use in the method. The invention also provides vectors, reagents, and host cells for expressing the nucleic acid construct. The invention makes it simple to obtain multiple transgenic effects, greatly improves transgenic efficiency, simplifies the screening step, and makes it simple to obtain the target transgenic zebrafish.

Description

Method for tissue-specific knockout of zebrafish genes and application thereof

Technical Field

The invention relates to the field of transgenic animals, and in particular to a method for tissue-specific knockout of zebrafish genes and applications thereof.

Background

Zebrafish are an important model animal for studying development and disease. The juvenile fish are small and transparent; their tissue systems are representative and conserved yet reasonably complex, and the development of tissues and organs and various physiological activities can be observed conveniently and dynamically with existing laboratory techniques. To make full use of these advantages in research, a tissue-specific gene knockout method for zebrafish needs to be established. At present, methods for generating complete loss of gene function in zebrafish are mature and widely used. However, techniques for knocking out genes specifically in particular tissues and cell types remain immature, which is one of the bottlenecks currently restricting zebrafish research.

Therefore, there is a need in the art for an efficient and tissue-specific zebrafish gene knockout method to achieve knock-out of gene function in specific tissues and cell types.

Disclosure of Invention

The invention aims to provide a highly efficient, tissue-specific zebrafish gene knockout method, so as to knock out gene function in specific tissues and cell types.

In a first aspect of the invention, there is provided a nucleic acid construct I having a structure of formula I from 5' to 3':

LA-X-RA (I)

wherein,

LA, X, and RA are each an element used to build the construct;

each "-" is independently a bond or a nucleotide linking sequence;

LA is the modified left homology arm sequence;

X is a first exogenous gene expression cassette;

RA is the right homology arm sequence;

the LA and RA sequences allow site-directed non-homologous recombination of the construct with a target segment of a zebrafish chromosome, wherein the target segment comprises an intron, an exon, a terminator, and a 3' UTR segment at the 3' end of a zebrafish target gene, and a single guide RNA (sgRNA) target sequence is contained in the intron sequence of the target segment;

the site of LA site-directed recombination is located in the sequence from the intron and exon at the 5' end of the target segment up to the terminator, and the LA sequence (5' → 3') comprises the single guide RNA (sgRNA) target sequence and an operably linked nucleic acid construct II.

In another preferred embodiment, the sequence length of the target segment is 1000-6000 bp.

In another preferred embodiment, the length of the homology arm sequence is 200-5000 bp, preferably 500-3000 bp, and more preferably 1000-2000 bp.

In another preferred embodiment, the site of site-directed recombination of the target segment of the nucleic acid construct I is located in the hey2 gene region on chromosome 20 of zebrafish, specifically from position 39589569 to position 39591030 of NC_007131.

In another preferred embodiment, the target segment of nucleic acid construct I is exon 5 (E5) of the zebrafish hey2 gene and an intron sequence containing the sgRNA target sequence.

In another preferred example, the sgRNA target sequence is 18-24 nt in length, preferably 19-22 nt.

In another preferred example, the sgRNA target sequence targets an intron sequence of hey2 gene.

In another preferred example, the intron sequence, i.e., the hey2 sgRNA target sequence, is GGAAGGATAATGGTTGGGT (SEQ ID No.:2), wherein PAM is AGG.

In another preferred embodiment, the nucleic acid construct I has the nucleic acid sequence shown in SEQ ID NO.: 1, wherein the LA left homology arm sequence corresponds to positions 1-3309, the first exogenous gene expression cassette X to positions 3310-4098, and the RA right homology arm to positions 4099-5205 of SEQ ID NO.: 1. The sequence of SEQ ID NO.: 1 is as follows:

atgatcttattttgactaagcgtggctatgaagcagaaaggaaggataatgagttgggtaggttaggtaagactttcttagacatgagtcaggtcaaagacaacagataattccataaaacatatgtattttttgtattctatagaattgtcattaatattcatgcaaaagatttcttaagtgacatttgcaaatcactccagttgtgtctttctttacattgttttccagttaaatttaggattgatttctgttatttatttagaataataattgtatattaataatgatattgggcagtatattgtactgtaccctgctgctggaagggtatccatatggggtggttcgcccagggtgttgttacacaagtgaaagagcatcgacagggtaatttgctagcttaccggtttacgcgtataacttcgtatagcatacattatacgaagttatccaagcttcaccatcgacccgaattgccaagcatcaccatcgacccataacttcgtatagtacacattatacgaagttatttcgaacgtaatacgactcactatagggcgaattggagctccaccggtggcggccgctctagaactagtggatcctggttctttccgcctcagaagccatagagcccaccgcatccccagcatgcctgctattgtcttcccaatcctcccccttgctgtcctgccccaccccaccccccagaatagaatgacacctactcagacaatgcgatgcaatttcctcattttattaggaaaggacagtgggagtggcaccttccagggtcaaggaaggcacgggggaggggcaaacaacagatggctggcaactagaaggcacagtcgaggctgatcagcggtttctggttctttccgcctcagaagccatagagcccaccgcatccccagcatgcctgctattgtcttcccaatcctcccccttgctgtcctgccccaccccaccccccagaatagaatgacacctactcagacaatgcgatgcaatttcctcattttattaggaaaggacagtgggagtggcaccttccagggtcaaggaaggcacgggggaggggcaaacaacagatggctggcaactagaaggcacagtcgaggctgatcagcgagctccaccgcggtcaattaagtttgtgccccagtttgctagggaggtcgcagtatctggccacagccacctcgtgctgctcgacgtaggtctctttgtcggcctccttgattctttccagtctgtggtccacatagtagacgccgggcatcttgaggttcttagcgggtttcttggatctgtatgtggtcttgaagttgcagatcaggtggcccccgcccacgagcttcagggccatgtcgcttctgccttccaggccgccgtcagcggggtacagcatctcggtgttggcctcccagccgagtgttttcttctgcatcacagggccgttggatgggaagttcacccctctgatcttgacgttgtagatgaggcagccgtcctggaggctggtgtcctgggtagcggtcagcacgcccccgtcttcgtatgtggtgactctctcccatgtgaagccctcagggaaggactgcttaaagaagtcggggatgccctgggtgtggttgatgaaggttctgctgccgtacatgaagctggtagccaggatgtcgaaggcgaaggggagagggccgccctcgaccaccttgattctcatggtctgggtgccctcgtagggcttgccttcgccctcggatgtgcacttgaagtggtggttgttcacggtgccctccatgtacagcttcatgtgcatgttctccttaatcagctcttcgcccttagacacgacgtcaggtccagggttctcctccacgtctccagcctgcttcagcaggctgaagttagtagctcttctcttcttccgaccgcgaagagtttgtcgatcgactgaaaaaaaaaagggaagagagagacacgtcagaaac
acacacacactccggattagtgagatctgaataggaacttcataacttcgtataatgtatgctatacgaagttatccaagcatcaccatcgaccctctagtccagaactcaccatcgacccataacttcgtataatgtgtactatacgaagttatactagtattatgtacctgactgatcgatttgcctttgatttctggcatttgtcgggaatttctcaaaacctgttgtcgagtcaaaatctgggctaaaatcatacagtctgaactcggctttaggggttaataatattgaccttaaaatggttttaaaagaattaaaaactgcttttattctagctgacataaaacaaataagactttctccagaagaaaaaaatattttaggaattacagtaaaaaatgtcttgctctgttaaacatcatttgggaaatatttgaacaaaggtatcaaaattcacaggaggtgtgtgtatttaaagattcactagtatgctcatttgaataattctcaatattttttgtcaggatatttcgacgctcattctctggccatggacttcttgagcatcggcttccgggagtgtctgactgaagtggccaggtatttgagctctgtggaaggcctggactccagcgaccctctccgtgtccgtctggtttctcacctcagcagctgtgcctcgcagagggaagcagccgccatgaccacatccatagcccatcaccagcaggcccttcacccgcaccactgggctgccgctttgcatcccattcctgctgcgttcctgcagcagagcggacttccctcctcagagagctcctccggcaggctgtctgaggctcctcaaagaggtgcagcccttttctcccatagtgactcggcactcagagcgccctctactggaagtgtggctccttgcgtgccaccgctgtccacttctctgctttcgttatcagcgaccgttcatgcagcagctgctgcagctgcagctcaaaccttccctctatcatttcccgctggattcccactcttcagccccagcgttacagcatcttcagtggcttcttccaccgtgagctcttccgtttccacatccaccacatcccaacagagcagcgggagcaacagtaaaccataccgaccgtggggaactgaagtgggagcgttttcgggaggtggatccggagctactaatttctccttgcttaagcaagctggtgatgttgaagaaaatcctggtcctatggtgagcaagggcgaggagctgttcaccggggtggtgcccatcctggtcgagctggacggcgacgtaaacggccacaagttcagcgtgtccggcgagggcgagggcgatgccacctacggcaagctgaccctgaagttcatctgcaccaccggcaagctgcccgtgccctggcccaccctcgtgaccaccctgacctacggcgtgcagtgcttcagccgctaccccgaccacatgaagcagcacgacttcttcaagtccgccatgcccgaaggctacgtccaggagcgcaccatcttcttcaaggacgacggcaactacaagacccgcgccgaggtgaagttcgagggcgacaccctggtgaaccgcatcgagctgaagggcatcgacttcaaggaggacggcaacatcctggggcacaagctggagtacaactacaacagccacaacgtctatatcatggccgacaagcagaagaacggcatcaaggtgaacttcaagatccgccacaacatcgaggacggcagcgtgcagctcgccgaccactaccagcagaacacccccatcggcgacggccccgtgctgctgcccgacaaccactacctgagcacccagtccgccctgagcaaagaccccaacgagaagcgcgatcacatggtcctgctggagttcgtgaccgccgccgggatcactctcggcatggacgagctgtacaagaccggttaaatgttggatttaaatgttggacgtcttc
catgctttgtacataaaggaaagcagcggctattgtgcctgcttcggtcagcagcatgggcttttgtcttcctctacacttgtgcacatatgcagcgtcaaacttaagccaacattctgggaagaaaagaaagagtttttacacgtcgcactgtgttggaaaccgtaaaggaagtttgtttctgttttaacagtgcctgcataaacactgctaacatgctgcatttgagatgtatgctttgatatcatctgacttccacaaacacccaacagcagctttagagtgaacagcttgttctgaaacaaaccaaagttttgcagataatcactaaagtgaggtgtttgtttttttatctctgatttaacaatccagtttgtaaatctgtacatgtgtaagattgtaactagagtttatattgaaattagttcattggtatgatgcacttcaatcactactgtttgtttggggggagacaggatcttctccgatttatacaataggcctactgaagttgtttttttaaaataacattcactaatactcatgtgagatttttctactactgtaactgtgttaataaccaccctctgtaagatgtaaccttttcctatgcaaaaaaacaaatgtccctcaagaacgaactgagtgtgttttgttttcattctgacacacgctaataaaaccatccttccactagccttcaccacaacacatcgtggaatgttatgagagaaagtaattgttttcccaaagcattatttgagttcttgaaatcgtatggtagggaacaaatgtttgtgctctttaatgtgtttttctaataatgcaaaatatgcagatgaagtcaaacaaacagctgcaattgtaaccgccacttcaacagttataaatctgtcgacaaactttaaagaaagctacaaacacatttaatgaataaaaggtcatcattcttacatgatcagcagcaaatcggtttactttcattgaaaaaagtcaataatttcttctaaagctaaaataactttttagctgtgtgtgaagagctgtactgtgtgacggtgcctgctaaaacccta。
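The position ranges quoted for SEQ ID NO.: 1 can be turned into slices when working with the sequence programmatically. The following is a minimal sketch, assuming the ranges are 1-based and inclusive (the usual convention in sequence listings) and that the full sequence is 5205 nt long, as implied by the end of the RA range. The names `FEATURES`, `to_slice`, `feature_length`, and `tiles_without_gaps` are illustrative and do not come from the patent.

```python
# Feature ranges quoted for SEQ ID NO.: 1 (assumed 1-based, inclusive).
FEATURES = {
    "LA": (1, 3309),     # left homology arm
    "X": (3310, 4098),   # first exogenous gene expression cassette
    "RA": (4099, 5205),  # right homology arm
}


def to_slice(start, end):
    """Convert a 1-based inclusive range into a 0-based Python slice."""
    return slice(start - 1, end)


def feature_length(start, end):
    """Number of nucleotides covered by a 1-based inclusive range."""
    return end - start + 1


def tiles_without_gaps(features, total_length):
    """Return True if the ranges cover 1..total_length with no gap or overlap."""
    expected_start = 1
    for start, end in sorted(features.values()):
        if start != expected_start:
            return False
        expected_start = end + 1
    return expected_start == total_length + 1


lengths = {name: feature_length(*rng) for name, rng in FEATURES.items()}
print(lengths)                             # {'LA': 3309, 'X': 789, 'RA': 1107}
print(tiles_without_gaps(FEATURES, 5205))  # True
```

With the full sequence held in a string `seq`, a feature would then be extracted as `seq[to_slice(*FEATURES["X"])]`.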

In another preferred embodiment, the first exogenous gene expression cassette X comprises a GSG-P2A self-cleaving sequence and an EGFP sequence.

In another preferred embodiment, the EGFP sequence is as follows:

atggtgagcaagggcgaggagctgttcaccggggtggtgcccatcctggtcgagctggacggcgacgtaaacggccacaagttcagcgtgtccggcgagggcgagggcgatgccacctacggcaagctgaccctgaagttcatctgcaccaccggcaagctgcccgtgccctggcccaccctcgtgaccaccctgacctacggcgtgcagtgcttcagccgctaccccgaccacatgaagcagcacgacttcttcaagtccgccatgcccgaaggctacgtccaggagcgcaccatcttcttcaaggacgacggcaactacaagacccgcgccgaggtgaagttcgagggcgacaccctggtgaaccgcatcgagctgaagggcatcgacttcaaggaggacggcaacatcctggggcacaagctggagtacaactacaacagccacaacgtctatatcatggccgacaagcagaagaacggcatcaaggtgaacttcaagatccgccacaacatcgaggacggcagcgtgcagctcgccgaccactaccagcagaacacccccatcggcgacggccccgtgctgctgcccgacaaccactacctgagcacccagtccgccctgagcaaagaccccaacgagaagcgcgatcacatggtcctgctggagttcgtgaccgccgccgggatcactctcggcatggacgagctgtacaag(SEQ ID NO.:3)。

In another preferred embodiment, the RA right homology arm sequence starts at the stop codon of the target gene and encompasses the entire 3' UTR sequence of the target gene together with the 3' intergenic region.

In another preferred embodiment, the LA left homology arm sequence comprises the hey2 sgRNA target sequence and an operably linked nucleic acid construct II.

In another preferred embodiment, the operably linked nucleic acid construct II has a structure of formula II from 5' to 3':

L5-L5'-Y-L3-L3' (II)

wherein,

L5, L5', Y, L3, and L3' are each an element used to build the construct;

each "-" is independently a bond or a nucleotide linking sequence;

L5 is the 5' first site-specific recombination sequence;

L5' is the 5' second site-specific recombination sequence;

Y is an inverted second exogenous gene expression cassette;

L3 is the 3' first site-specific recombination sequence;

L3' is the 3' second site-specific recombination sequence;

and the sites of RA site-directed recombination are located at the 3' terminator and 3' UTR sequences of the target segment.

In another preferred embodiment, the nucleic acid construct II has the nucleic acid sequence shown at positions 468-2271 of SEQ ID NO.: 1.

In another preferred embodiment, the inverted second exogenous gene expression cassette Y comprises a BGHpA signal sequence, an inverted TagRFP sequence, and a splice acceptor; preferably, its sequence is shown at positions 1280-2117 of SEQ ID NO.: 1.

In another preferred embodiment, the BGHpA signal sequence is as follows:

ccatagagcccaccgcatccccagcatgcctgctattgtcttcccaatcctcccccttgctgtcctgccccaccccaccccccagaatagaatgacacctactcagacaatgcgatgcaatttcctcattttattaggaaaggacagtgggagtggcaccttccagggtcaaggaaggcacgggggaggggcaaacaacagatggctggcaactagaaggcacag (SEQ ID NO.: 7).

In another preferred embodiment, the amino acid sequence of TagRFP is as follows:

VSKGEELIKENMHMKLYMEGTVNNHHFKCTSEGEGKPYEGTQTMRIKVVEGGPLPFAFDILATSFMYGSRTFINHTQGIPDFFKQSFPEGFTWERVTTYEDGGVLTATQDTSLQDGCLIYNVKIRGVNFPSNGPVMQKKTLGWEANTEMLYPADGGLEGRSDMALKLVGGGHLICNFKTTYRSKKPAKNLKMPGVYYVDHRLERIKEADKETYVEQHEVAVARYCDLPSKLGHKLN (SEQ ID NO.: 8).

In another preferred embodiment, the site-specific recombination sequence is selected from the group consisting of: wild-type loxP and mutant loxP.

In another preferred embodiment, the first site-specific recombination sequence is a wild-type loxP site.

In another preferred embodiment, the second site-specific recombination sequence is a mutant lox5171 site.

In another preferred embodiment, the nucleic acid construct I has the structure of formula III from 5' to 3':

Z0-Z1-Z2-Z3-Z4-Z5-Z6-Z7-Z8-Z9-Z10-Z11 (III)

wherein,

Z0 is absent or is an additional sequence located at the 5' end (e.g., a vector-derived sequence, an optional endonuclease site sequence, or a combination thereof);

Z1 is a first left homology arm sequence comprising a single guide RNA (sgRNA) target sequence;

Z2 is the 5' first site-specific recombination sequence (identical or equivalent to L5);

Z3 is the 5' second site-specific recombination sequence (identical or equivalent to L5');

Z4 is an inverted second exogenous gene expression cassette (identical or equivalent to Y);

Z5 is the 3' first site-specific recombination sequence (identical or equivalent to L3);

Z6 is the 3' second site-specific recombination sequence (identical or equivalent to L3');

Z7 is a second left homology arm sequence, wherein the first left homology arm sequence and the second left homology arm sequence are adjacent or contiguous on the genome to be edited and together constitute the left homology arm, which corresponds to the left arm (region) of the genome;

Z8 is a coding sequence for a P2A peptide or a functionally analogous peptide (e.g., GSG-P2A);

Z9 is a first exogenous gene expression cassette;

Z10 is the right homology arm sequence, which corresponds to the right arm (region) of the genome (preferably, the left and right homology arms are bounded by (a) the end of the last exon of the gene to be knocked out, or (b) a non-functional genomic sequence (e.g., the 3' UTR));

Z11 is absent or is an additional sequence located at the 3' end (e.g., a vector-derived sequence, an optional endonuclease site sequence, or a combination thereof);

each "-" is independently a bond or a nucleotide linking sequence.

In another preferred embodiment, the first exogenous gene is a gene encoding a fluorescent protein.

In another preferred embodiment, the second exogenous gene is a gene encoding a fluorescent protein.

In another preferred embodiment, the wild-type loxP site is as follows:

ataacttcgtataatgtatgctatacgaagttat (SEQ ID NO.: 4).

In another preferred example, the mutant lox5171 sequence is as follows:

ataacttcgtataatgtgtactatacgaagttat (SEQ ID NO.: 5).
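The two recombination sites listed above can be compared directly. The sketch below assumes the standard lox layout of a 13-bp arm, an 8-bp spacer, and a 13-bp arm, and reports where the lox5171 spacer differs from wild-type loxP. It also labels the orientation of any 34-mer by its asymmetric spacer; the four example 34-mers are copied from SEQ ID NO.: 1 as printed in this document, and the labels "forward" and "reverse" are only relative to the way SEQ ID NO.: 4 and 5 are written. All function names are illustrative.

```python
LOXP = "ataacttcgtataatgtatgctatacgaagttat"     # SEQ ID NO.: 4
LOX5171 = "ataacttcgtataatgtgtactatacgaagttat"  # SEQ ID NO.: 5

COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Reverse complement of a lowercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def split_site(site):
    """Split a 34-bp lox site into (left arm, spacer, right arm)."""
    if len(site) != 34:
        raise ValueError("a lox site is expected to be 34 bp long")
    return site[:13], site[13:21], site[21:]


def spacer_mismatches(site_a, site_b):
    """1-based spacer positions at which two lox sites differ."""
    spacer_a = split_site(site_a)[1]
    spacer_b = split_site(site_b)[1]
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(spacer_a, spacer_b), start=1)
        if a != b
    ]


def classify(site):
    """Return (site name, orientation) for a 34-mer, or None if unknown."""
    for name, reference in (("loxP", LOXP), ("lox5171", LOX5171)):
        if site == reference:
            return name, "forward"
        if site == reverse_complement(reference):
            return name, "reverse"
    return None


left, spacer, right = split_site(LOXP)
print(left == reverse_complement(right))   # True: the arms are inverted repeats
print(spacer_mismatches(LOXP, LOX5171))    # [(5, 'a', 'g'), (7, 'g', 'a')]

# 34-mers copied from SEQ ID NO.: 1, in 5' to 3' order of appearance.
sites_in_construct = [
    "ataacttcgtatagcatacattatacgaagttat",
    "ataacttcgtatagtacacattatacgaagttat",
    "ataacttcgtataatgtatgctatacgaagttat",
    "ataacttcgtataatgtgtactatacgaagttat",
]
for site in sites_in_construct:
    print(classify(site))
```

Because both sites share identical arms, only the spacer distinguishes loxP from lox5171, which is why the classification step compares the full 34-mer against each reference in both orientations.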

In another preferred embodiment, the exogenous gene expression cassette comprises the polynucleotide sequence of an exogenous gene and the sequence elements required for its expression.

In another preferred embodiment, the exogenous gene expression cassette further comprises a protein cleavage sequence.

In another preferred embodiment, the protein cleavage sequence is selected from the group consisting of: a P2A sequence, and/or a GSG-P2A sequence.

In another preferred embodiment, the nucleic acid sequence of P2A is as follows:

aggtccagggttctcctccacgtctccagcctgcttcagcaggctgaagttagtagc (SEQ ID NO.: 6).

In another preferred example, the nucleic acid sequence of GSG-P2A encodes a glycine-serine-glycine-P2A peptide.
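The P2A and GSG-P2A sequences can be checked by translation. The sketch below uses the standard genetic code and two inputs taken from this document: SEQ ID NO.: 6 as listed, and the 66-nt stretch that sits immediately upstream of the EGFP start codon in SEQ ID NO.: 1. Read as written, SEQ ID NO.: 6 does not translate to P2A, but its reverse complement does, which suggests (as an observation, not a statement from the patent) that it is listed in the orientation used inside the inverted cassette. The helper names are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("acgt", "tgca")

# SEQ ID NO.: 6 exactly as listed.
P2A_LISTED = "aggtccagggttctcctccacgtctccagcctgcttcagcaggctgaagttagtagc"

# 66 nt preceding the EGFP start codon in SEQ ID NO.: 1.
GSG_P2A_IN_CONSTRUCT = (
    "ggatccggagctactaatttctccttgcttaagcaagctggtgatgttgaagaaaatcctggtcct"
)


def reverse_complement(seq):
    """Reverse complement of a lowercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate a lowercase DNA string in frame 1, ignoring trailing bases."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


print(translate(reverse_complement(P2A_LISTED)))  # ATNFSLLKQAGDVEENPGP
print(translate(GSG_P2A_IN_CONSTRUCT))            # GSGATNFSLLKQAGDVEENPGP
```

The second result shows the Gly-Ser-Gly linker followed by the same 19-residue P2A peptide, encoded with different codons than those in SEQ ID NO.: 6.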

In another preferred embodiment, the exogenous gene is selected from the group consisting of: a reporter gene, a structural gene, a functional gene, or a combination thereof.

In another preferred embodiment, the reporter gene is selected from the group consisting of: fluorescent proteins, and variants or derivatives thereof (e.g., optically highlighted fluorescent proteins).

In another preferred embodiment, the fluorescent protein is selected from the group consisting of: green fluorescent protein (GFP), yellow fluorescent protein (YFP), red fluorescent protein (RFP) (e.g., mCherry), blue fluorescent protein (BFP), cyan fluorescent protein (CFP), orange fluorescent protein, photoactivated protein (FHP), photoactivated green cherry (GPAC), tomato hemoglobin, photoproteins (e.g., aequorin, firefly luciferin), or combinations thereof.

In another preferred embodiment, the construct is linear or non-linear (e.g., circular).

In a second aspect of the invention, there is provided a vector comprising a construct according to the first aspect.

In another preferred embodiment, the vector is the hey2^zCKOIS donor plasmid, the sequence of which is shown as SEQ ID NO.: 9:

tgacacatgcagctcccggagacggtcacagcttgtctgtaagcggatgccgggagcagacaagcccgtcagggcgcgtcagcgggtgttggcgggtgtcggggctggcttaactatgcggcatcagagcagattgtactgagagtgcaccatatgcggtgtgaaataccgcacagatgcgtaaggagaaaataccgcatcaggcgccattcgccattcaggctgcgcaactgttgggaagggcgatcggtgcgggcctcttcgctattacgccagctggcgaaagggggatgtgctgcaaggcgattaagttgggtaacgccagggttttcccagtcacgacgttgtaaaacgacggccagtgaattcgagctcggtacccactcgtcgacaaaactagggtttatatgatgatgatcttattttgactaagcgtggctatgaagcagaaaggaaggataatgagttgggtaggttaggtaagactttcttagacatgagtcaggtcaaagacaacagataattccataaaacatatgtattttttgtattctatagaattgtcattaatattcatgcaaaagatttcttaagtgacatttgcaaatcactccagttgtgtctttctttacattgttttccagttaaatttaggattgatttctgttatttatttagaataataattgtatattaataatgatattgggcagtatattgtactgtaccctgctgctggaagggtatccatatggggtggttcgcccagggtgccatttaaactagaaccatcactgcttcagatggcgtctttaatagttcacattgtatgattgttacacaagtgaaagagcatcgacagggtaatttgctagcttaccggtttacgcgtataacttcgtatagcatacattatacgaagttatccaagcttcaccatcgacccgaattgccaagcatcaccatcgacccataacttcgtatagtacacattatacgaagttatttcgaacgtaatacgactcactatagggcgaattggagctccaccggtggcggccgctctagaactagtggatcctggttctttccgcctcagaagccatagagcccaccgcatccccagcatgcctgctattgtcttcccaatcctcccccttgctgtcctgccccaccccaccccccagaatagaatgacacctactcagacaatgcgatgcaatttcctcattttattaggaaaggacagtgggagtggcaccttccagggtcaaggaaggcacgggggaggggcaaacaacagatggctggcaactagaaggcacagtcgaggctgatcagcggtttctggttctttccgcctcagaagccatagagcccaccgcatccccagcatgcctgctattgtcttcccaatcctcccccttgctgtcctgccccaccccaccccccagaatagaatgacacctactcagacaatgcgatgcaatttcctcattttattaggaaaggacagtgggagtggcaccttccagggtcaaggaaggcacgggggaggggcaaacaacagatggctggcaactagaaggcacagtcgaggctgatcagcgagctccaccgcggtcaattaagtttgtgccccagtttgctagggaggtcgcagtatctggccacagccacctcgtgctgctcgacgtaggtctctttgtcggcctccttgattctttccagtctgtggtccacatagtagacgccgggcatcttgaggttcttagcgggtttcttggatctgtatgtggtcttgaagttgcagatcaggtggcccccgcccacgagcttcagggccatgtcgcttctgccttccaggccgccgtcagcggggtacagcatctcggtgttggcctcccagccgagtgttttcttctgcatcacagggccgttggatgggaagttcacccctctgatcttgacgttgtagatgaggcagccgtcctggaggc
tggtgtcctgggtagcggtcagcacgcccccgtcttcgtatgtggtgactctctcccatgtgaagccctcagggaaggactgcttaaagaagtcggggatgccctgggtgtggttgatgaaggttctgctgccgtacatgaagctggtagccaggatgtcgaaggcgaaggggagagggccgccctcgaccaccttgattctcatggtctgggtgccctcgtagggcttgccttcgccctcggatgtgcacttgaagtggtggttgttcacggtgccctccatgtacagcttcatgtgcatgttctccttaatcagctcttcgcccttagacacgacgtcaggtccagggttctcctccacgtctccagcctgcttcagcaggctgaagttagtagctcttctcttcttccgaccgcgaagagtttgtcgatcgactgaaaaaaaaaagggaagagagagacacgtcagaaacacacacacactccggattagtgagatctgaataggaacttcataacttcgtataatgtatgctatacgaagttatccaagcatcaccatcgaccctctagtccagaactcaccatcgacccataacttcgtataatgtgtactatacgaagttatactagtattatgtacctgactgatcgatttgcctttgatttctggcatttgtcgggaatttctcaaaacctgttgtcgagtcaaaatctgggctaaaatcatacagtctgaactcggctttaggggttaataatattgaccttaaaatggttttaaaagaattaaaaactgcttttattctagctgacataaaacaaataagactttctccagaagaaaaaaatattttaggaattacagtaaaaaatgtcttgctctgttaaacatcatttgggaaatatttgaacaaaggtatcaaaattcacaggaggtgtgtgtatttaaagattcactagtatgctcatttgaataattctcaatattttttgtcaggatatttcgacgctcattctctggccatggacttcttgagcatcggcttccgggagtgtctgactgaagtggccaggtatttgagctctgtggaaggcctggactccagcgaccctctccgtgtccgtctggtttctcacctcagcagctgtgcctcgcagagggaagcagccgccatgaccacatccatagcccatcaccagcaggcccttcacccgcaccactgggctgccgctttgcatcccattcctgctgcgttcctgcagcagagcggacttccctcctcagagagctcctccggcaggctgtctgaggctcctcaaagaggtgcagcccttttctcccatagtgactcggcactcagagcgccctctactggaagtgtggctccttgcgtgccaccgctgtccacttctctgctttcgttatcagcgaccgttcatgcagcagctgctgcagctgcagctcaaaccttccctctatcatttcccgctggattcccactcttcagccccagcgttacagcatcttcagtggcttcttccaccgtgagctcttccgtttccacatccaccacatcccaacagagcagcgggagcaacagtaaaccataccgaccgtggggaactgaagtgggagcgttttcgggaggtggatccggagctactaatttctccttgcttaagcaagctggtgatgttgaagaaaatcctggtcctatggtgagcaagggcgaggagctgttcaccggggtggtgcccatcctggtcgagctggacggcgacgtaaacggccacaagttcagcgtgtccggcgagggcgagggcgatgccacctacggcaagctgaccctgaagttcatctgcaccaccggcaagctgcccgtgccctggcccaccctcgtgaccaccctgacctacggcgtgcagtgcttcagccgctaccccgaccacatgaagcagcacgacttcttcaagtccgccatgcccgaaggctacgt
ccaggagcgcaccatcttcttcaaggacgacggcaactacaagacccgcgccgaggtgaagttcgagggcgacaccctggtgaaccgcatcgagctgaagggcatcgacttcaaggaggacggcaacatcctggggcacaagctggagtacaactacaacagccacaacgtctatatcatggccgacaagcagaagaacggcatcaaggtgaacttcaagatccgccacaacatcgaggacggcagcgtgcagctcgccgaccactaccagcagaacacccccatcggcgacggccccgtgctgctgcccgacaaccactacctgagcacccagtccgccctgagcaaagaccccaacgagaagcgcgatcacatggtcctgctggagttcgtgaccgccgccgggatcactctcggcatggacgagctgtacaagaccggttaaatgttggatttaaatgttggacgtcttccatgctttgtacataaaggaaagcagcggctattgtgcctgcttcggtcagcagcatgggcttttgtcttcctctacacttgtgcacatatgcagcgtcaaacttaagccaacattctgggaagaaaagaaagagtttttacacgtcgcactgtgttggaaaccgtaaaggaagtttgtttctgttttaacagtgcctgcataaacactgctaacatgctgcatttgagatgtatgctttgatatcatctgacttccacaaacacccaacagcagctttagagtgaacagcttgttctgaaacaaaccaaagttttgcagataatcactaaagtgaggtgtttgtttttttatctctgatttaacaatccagtttgtaaatctgtacatgtgtaagattgtaactagagtttatattgaaattagttcattggtatgatgcacttcaatcactactgtttgtttggggggagacaggatcttctccgatttatacaataggcctactgaagttgtttttttaaaataacattcactaatactcatgtgagatttttctactactgtaactgtgttaataaccaccctctgtaagatgtaaccttttcctatgcaaaaaaacaaatgtccctcaagaacgaactgagtgtgttttgttttcattctgacacacgctaataaaaccatccttccactagccttcaccacaacacatcgtggaatgttatgagagaaagtaattgttttcccaaagcattatttgagttcttgaaatcgtatggtagggaacaaatgtttgtgctctttaatgtgtttttctaataatgcaaaatatgcagatgaagtcaaacaaacagctgcaattgtaaccgccacttcaacagttataaatctgtcgacaaactttaaagaaagctacaaacacatttaatgaataaaaggtcatcattcttacatgatcagcagcaaatcggtttactttcattgaaaaaagtcaataatttcttctaaagctaaaataactttttagctgtgtgtgaagagctgtactgtgtgacggtgcctgctaaaaccctactgcaggcatgcaagcttggcgtaatcatggtcatagctgtttcctgtgtgaaattgttatccgctcacaattccacacaacatacgagccggaagcataaagtgtaaagcctggggtgcctaatgagtgagctaactcacattaattgcgttgcgctcactgcccgctttccagtcgggaaacctgtcgtgccagctgcattaatgaatcggccaacgcgcggggagaggcggtttgcgtattgggcgctcttccgcttcctcgctcactgactcgctgcgctcggtcgttcggctgcggcgagcggtatcagctcactcaaaggcggtaatacggttatccacagaatcaggggataacgcaggaaagaacatgtgagcaaaaggccagcaaaaggccaggaaccgtaaaaaggccgcgttgctggcgtttttccataggctccgcccc
cctgacgagcatcacaaaaatcgacgctcaagtcagaggtggcgaaacccgacaggactataaagataccaggcgtttccccctggaagctccctcgtgcgctctcctgttccgaccctgccgcttaccggatacctgtccgcctttctcccttcgggaagcgtggcgctttctcatagctcacgctgtaggtatctcagttcggtgtaggtcgttcgctccaagctgggctgtgtgcacgaaccccccgttcagcccgaccgctgcgccttatccggtaactatcgtcttgagtccaacccggtaagacacgacttatcgccactggcagcagccactggtaacaggattagcagagcgaggtatgtaggcggtgctacagagttcttgaagtggtggcctaactacggctacactagaagaacagtatttggtatctgcgctctgctgaagccagttaccttcggaaaaagagttggtagctcttgatccggcaaacaaaccaccgctggtagcggtggtttttttgtttgcaagcagcagattacgcgcagaaaaaaaggatctcaagaagatcctttgatcttttctacggggtctgacgctcagtggaacgaaaactcacgttaagggattttggtcatgagattatcaaaaaggatcttcacctagatccttttaaattaaaaatgaagttttaaatcaatctaaagtatatatgagtaaacttggtctgacagttaccaatgcttaatcagtgaggcacctatctcagcgatctgtctatttcgttcatccatagttgcctgactccccgtcgtgtagataactacgatacgggagggcttaccatctggccccagtgctgcaatgataccgcgagacccacgctcaccggctccagatttatcagcaataaaccagccagccggaagggccgagcgcagaagtggtcctgcaactttatccgcctccatccagtctattaattgttgccgggaagctagagtaagtagttcgccagttaatagtttgcgcaacgttgttgccattgctacaggcatcgtggtgtcacgctcgtcgtttggtatggcttcattcagctccggttcccaacgatcaaggcgagttacatgatcccccatgttgtgcaaaaaagcggttagctccttcggtcctccgatcgttgtcagaagtaagttggccgcagtgttatcactcatggttatggcagcactgcataattctcttactgtcatgccatccgtaagatgcttttctgtgactggtgagtactcaaccaagtcattctgagaatagtgtatgcggcgaccgagttgctcttgcccggcgtcaatacgggataataccgcgccacatagcagaactttaaaagtgctcatcattggaaaacgttcttcggggcgaaaactctcaaggatcttaccgctgttgagatccagttcgatgtaacccactcgtgcacccaactgatcttcagcatcttttactttcaccagcgtttctgggtgagcaaaaacaggaaggcaaaatgccgcaaaaaagggaataagggcgacacggaaatgttgaatactcatactcttcctttttcaatattattgaagcatttatcagggttattgtctcatgagcggatacatatttgaatgtatttagaaaaataaacaaataggggttccgcgcacatttccccgaaaagtgccacctgacgtctaagaaaccattattatcatgacattaacctataaaaataggcgtatcacgaggccctttcgtc(SEQ ID NO.:9)。

In a third aspect of the invention, there is provided a reagent comprising (a) Cas9 mRNA, (b) a target gene sgRNA, and (c) the nucleic acid construct of the first aspect and/or the vector of the second aspect, wherein the single guide RNA target sequence contained in the nucleic acid construct and/or the vector corresponds to the target gene sgRNA sequence of (b).

In another preferred example, the Cas9 mRNA sequence is optimized according to zebrafish codon preference.

In another preferred example, the Cas9 mRNA is zCas9 mRNA, and its nucleic acid sequence is shown in SEQ ID NO.: 10:

ggatcacgacatcgactacaaagacgacgatgataagatggcccctaagaaaaagagaaaggtcggaattcacggagttcccgctgcagataaaaagtacagcattggactggacatcggaacaaatagcgtgggctgggctgtgattactgacgaatataaggtgcctagcaaaaagtttaaagtgctgggaaacaccgacagacacagcatcaaaaaaaacctgatcggcgctctgctgtttgatagcggtgaaactgccgaggctactagactgaagagaactgctagaagaagatataccagaagaaagaatagaatttgttacctgcaagaaatctttagcaatgagatggcaaaggttgacgatagcttctttcatagactggaggagagcttcctggtcgaggaggacaagaagcacgagagacaccccatcttcggaaatatcgtggacgaggtggcataccatgaaaagtatcctaccatttaccacctgagaaaaaagctggtggacagcacagacaaggccgatctgagactgatctacctggcactggcccacatgatcaaatttagaggccatttcctgattgaaggagacctgaaccccgataacagcgatgttgataaactgttcatccaactggttcagacctataaccaactgtttgaggagaaccctattaacgccagcggagtggatgcaaaggccatcctgagcgctagactgagcaaaagcagaagactggaaaatctgatcgcccagctgcccggcgaaaaaaagaatggactgttcggcaatctgattgcactgagcctgggactgacacctaacttcaagagcaatttcgatctggctgaggacgccaaactgcagctgagcaaagacacatatgatgacgacctggataacctgctggcacaaattggtgaccaatacgctgacctgttcctggctgctaagaatctgagcgatgccattctgctgagcgacatcctgagagtgaacacagagattaccaaggcacccctgagcgcaagcatgattaagagatacgacgagcaccaccaagatctgaccctgctgaaggccctggtcagacaacaactgccagagaagtataaagaaattttctttgaccaaagcaagaacggttacgctggctacattgacggcggtgcaagccaagaggagttctataagttcattaagccaatcctggagaaaatggatggaactgaggagctgctggttaagctgaatagagaggatctgctgagaaaacaaagaacattcgacaacggtagcatcccacaccagattcatctgggtgagctgcacgcaattctgagaagacaggaagacttttatccattcctgaaggacaacagagaaaagatcgagaagattctgacatttagaatcccctactacgtgggacctctggctagaggcaatagcagattcgcatggatgactagaaagagcgaggagacaattaccccttggaactttgaagaagtggtggataagggagcaagcgcccaaagcttcattgagagaatgacaaacttcgataagaacctgcctaacgagaaggttctgcccaagcatagcctgctgtatgaatatttcacagtgtacaacgagctgacaaaggtcaagtacgtcacagagggcatgagaaagcccgcctttctgagcggagaacaaaagaaggctattgttgacctgctgttcaagaccaacagaaaagttacagttaaacagctgaaagaggactacttcaaaaagattgaatgttttgacagcgtggaaatcagcggcgttgaggacagatttaacgctagcctgggcacctaccacgatctgctgaaaatcatcaaagataaggactttctggacaacgaagaaaacgaggacattctggaagacattgtgctgacactgactctgttcgaagatagagaaatgatcgaggaaagactgaaaactt
atgcacatctgttcgacgacaaagtgatgaagcaactgaagagaagaagatacactggatggggcagactgagcagaaagctgatcaacggaatcagagacaagcaaagcggaaaaactattctggattttctgaaaagcgacggtttcgccaatagaaacttcatgcaactgattcacgatgacagcctgactttcaaggaggatattcaaaaggcacaggtgagcggccagggcgatagcctgcacgaacacatcgcaaatctggccggtagccctgccattaagaagggcatcctgcagacagtgaaggttgttgatgaactggtcaaggtgatgggtagacacaagcccgagaatattgtgatcgagatggctagagagaaccaaacaacacaaaagggacagaagaatagcagagaaagaatgaaaagaattgaggagggaatcaaggagctgggtagccagatcctgaaagaacaccctgtcgagaatacacaactgcaaaacgaaaagctgtacctgtactacctgcaaaatggcagagacatgtacgtggaccaagagctggatattaacagactgagcgactacgatgtcgaccacatcgtgcctcaaagcttcctgaaggatgacagcatcgacaataaagtgctgactagaagcgacaagaacagaggaaaaagcgacaacgtgcccagcgaggaagtggttaaaaagatgaagaactactggagacagctgctgaatgccaagctgatcacacaaagaaaattcgacaacctgaccaaagccgagagaggaggtctgagcgaactggacaaggctggattcattaagagacaactggttgaaaccagacagattacaaagcacgtggctcaaatcctggacagcagaatgaataccaaatatgacgagaacgacaaactgattagagaggtgaaggttattactctgaagagcaaactggtcagcgacttcagaaaggacttccaattctacaaggtgagagagatcaacaattaccaccacgcacacgacgcttacctgaacgctgtggtgggcacagctctgatcaaaaagtatccaaaactggaaagcgagtttgtgtacggtgactataaagtttatgatgtgagaaaaatgatcgctaagagcgagcaggagatcggaaaggctacagccaagtatttcttttacagcaacattatgaactttttcaagactgaaatcaccctggcaaacggtgagatcagaaaaagaccactgatcgaaacaaatggcgagacaggcgagatcgtgtgggataagggaagagacttcgctaccgttagaaaggttctgagcatgccacaggttaacattgtgaagaaaactgaggtgcagacaggaggtttcagcaaggagagcatcctgcctaagagaaacagcgataagctgattgcaagaaaaaaggattgggaccctaagaagtacggcggttttgacagccctactgtggcttacagcgtgctggtggtggctaaagtggagaagggcaaaagcaagaagctgaaaagcgtgaaggaactgctgggaattacaatcatggagagaagcagcttcgagaagaacccaatcgacttcctggaggctaagggatacaaggaagttaagaaggacctgatcatcaagctgcccaagtacagcctgttcgagctggaaaatggtagaaagagaatgctggctagcgctggtgagctgcagaagggaaatgaactggcactgcctagcaagtacgttaactttctgtatctggcaagccattacgagaaactgaaaggaagccccgaggacaatgagcagaaacaactgttcgtggaacagcacaaacactatctggacgagattatcgagcagatcagcgaatttagcaaaagagtgatcctggctgatgctaacctggataaagtcctgagcgcttacaacaaacatagagataagcctatcagagagcag
gccgaaaacatcatccacctgttcacactgacaaacctgggcgctcctgccgctttcaagtactttgataccactattgatagaaagagatatactagcaccaaagaggtgctggacgccaccctgattcaccagagcattaccggactgtacgaaactagaatcgacctgagccaactgggaggagacaagagacccgctgcaactaaaaaggcaggtcaggccaaaaagaagaaa。

In another preferred embodiment, the reagent comprises zCas9 mRNA, hey2 sgRNA, and the hey2^zCKOIS donor plasmid.

In another preferred example, the nucleic acid sequence of hey2 sgRNA is as follows:

GGAAGGATAATGGTTGGGTgttttagagctagaaatagcaagttaaaataaggctagtccgttatcaacttgaaaaagtggcaccgagtcggtgctttttttaaagct (SEQ ID NO.: 11).
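The relationship between the sgRNA sequence (SEQ ID NO.: 11) and the target sequence (SEQ ID NO.: 2) can be made explicit in code. This is a minimal sketch that splits the listed sgRNA into its target-matching portion and the remaining scaffold, then checks the target length against the 18-24 nt range and the preferred 19-22 nt range stated earlier. The function names and the "preferred" / "allowed" labels are illustrative, not terms from the patent.

```python
TARGET = "GGAAGGATAATGGTTGGGT"  # SEQ ID NO.: 2 (hey2 sgRNA target sequence)
SGRNA = (
    "GGAAGGATAATGGTTGGGT"
    "gttttagagctagaaatagcaagttaaaataaggctagtccgttatcaacttgaaaaagtggcaccgagtcggtgc"
    "tttttttaaagct"
)  # SEQ ID NO.: 11


def split_sgrna(sgrna, target):
    """Split an sgRNA into (target-matching spacer, scaffold)."""
    if not sgrna.upper().startswith(target.upper()):
        raise ValueError("sgRNA does not begin with the target sequence")
    return sgrna[:len(target)], sgrna[len(target):]


def length_class(length):
    """Classify a target length against the ranges given in the text."""
    if 19 <= length <= 22:
        return "preferred"
    if 18 <= length <= 24:
        return "allowed"
    return "out of range"


spacer, scaffold = split_sgrna(SGRNA, TARGET)
print(len(spacer), length_class(len(spacer)))  # 19 preferred
print(scaffold[:12])                           # gttttagagcta
```

The same check could be run against any candidate target before ordering oligonucleotides, so that a length outside the stated range is caught early.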

In a fourth aspect of the invention, there is provided a host cell comprising a construct according to the first aspect, or having integrated into its genome one or more constructs according to the first aspect.

In another preferred embodiment, the host cell comprises a zebrafish adult somatic cell, a zebrafish fetal somatic cell, or a zebrafish embryonic stem cell.

In another preferred example, the host cell is a zebrafish fertilized egg at the one-cell stage.

In another preferred embodiment, the host cell is a cell into which the construct according to the first aspect and/or the vector according to the second aspect has been introduced by a method selected from the group consisting of: homologous recombination, microinjection, electroporation, lipofection, calcium phosphate precipitation, viral infection, or sperm-mediated transfer.

In a fifth aspect of the present invention, there is provided a method for preparing a transgenic cell in vitro, comprising the steps of:

(i) transfecting a cell with a construct according to the first aspect and/or a vector according to the second aspect such that site-directed non-homologous recombination of the construct with a chromosome in the cell occurs, thereby producing a transgenic cell.

In another preferred example, in step (i), the method further comprises transfecting the cell with a construct for site-directed cleavage, thereby site-directed cleaving the chromosome of the cell, wherein the site of site-directed cleavage is located in the target segment of the chromosome of the zebrafish.

In another preferred embodiment, the site of site-directed cleavage is located on chromosome 20 of zebrafish, specifically from position 39589569 to position 39591030 of NC_007131.

In another preferred embodiment, the method further comprises step (ii): screening for transgenic cells in which the nucleic acid construct is positively inserted and in which random insertions and deletions do not affect the expression of endogenous genes, including the target gene.

In another preferred embodiment, the method produces a transgenic cell that expresses both the endogenous gene and the first exogenous gene.

In another preferred example, the HEY2 protein is normally expressed and EGFP is expressed in the transgenic cells prepared by the method.

In another preferred example, the transgenic cells prepared by the method express the Hey2^zCKOIS protein, which is a fusion product of the Hey2 protein, comprising an N-terminal bHLH domain, an Orange domain and a protein-protein interaction YRPW ("Y") motif near the C terminus, with GSG-P2A-EGFP.

In another preferred embodiment, the transgene comprises a knock-in, a knock-out, or a combination thereof.

In a sixth aspect of the present invention, there is provided a method for preparing a transgenic cell in vitro, comprising the steps of:

(i) transfecting a cell with a construct according to the first aspect and/or a vector according to the second aspect in the presence of Cre recombinase to cause site-directed nonhomologous recombination of the construct with chromosomes in the cell, thereby producing a transgenic cell.

In another preferred example, in step (i), the method further comprises transfecting the cell with a construct for site-directed cleavage, thereby cleaving the chromosome of the cell at a defined site, wherein the site of site-directed cleavage is located in the target segment of the zebrafish chromosome.

In another preferred embodiment, the method produces transgenic cells that express both the first exogenous gene and the second exogenous gene.

In another preferred embodiment, the transgenic cells produced by the method have a deletion of the 3' last exon of the endogenous gene.

In another preferred embodiment, the transgenic cells produced by the method have truncated endogenous gene transcripts.

In another preferred embodiment, the EGFP and TagRFP are expressed simultaneously in the transgenic cell prepared by the method.

In another preferred embodiment, the transgenic cell prepared by the method expresses the Hey2^zCKOIS-inv protein, which is a fusion product of the bHLH domain of the Hey2 protein and P2A-TagRFP.

In a seventh aspect of the present invention, there is provided a method for producing a transgenic animal, comprising the steps of:

(i) transfecting a cell with a construct according to the first aspect and/or a vector according to the second aspect such that site-directed recombination of the construct with a chromosome in the cell occurs, thereby producing a transgenic cell, and wherein the site of site-directed cleavage is located in a chromosome target segment of zebrafish; and

(ii) regenerating the obtained transgenic cell into an animal body, thereby obtaining a transgenic animal.

In another preferred embodiment, step (ii) includes the steps of:

(ii-1) somatic cloning is performed using the obtained transgenic cell as a nuclear donor, thereby obtaining a transgenic animal.

In another preferred example, the cell is a zebrafish fertilized egg at the single-cell stage.

In an eighth aspect of the present invention, there is provided a method for preparing a tissue-specific transgenic animal, comprising the steps of:

(a) preparing, according to the method of the seventh aspect, a transgenic animal F1 having the construct of the first aspect stably inserted into its genome;

(b) crossing the transgenic animal F1 obtained in step (a) with an animal F2 that expresses Cre recombinase tissue-specifically; and

(c) screening to obtain a transgenic animal that expresses the first exogenous gene and, tissue-specifically, also expresses the second exogenous gene.

It is to be understood that within the scope of the present invention, the above-described features of the present invention and those specifically described below (e.g., in the examples) may be combined with each other to form new or preferred embodiments. For reasons of space, these combinations are not described one by one here.

Drawings

FIG. 1 shows the method for constructing the hey2^zCKOIS zebrafish line.

(A) Schematic of the intron-targeting strategy for constructing hey2^zCKOIS zebrafish with the CRISPR/Cas9 system. The zebrafish hey2 gene has 5 exons; E4 and E5 denote the 4th and 5th exons, respectively. The sgRNA target sequence is shown in red and the protospacer adjacent motif (PAM) sequence in green. The left and right arm sequences of the donor plasmid are indicated by brown double-headed lines. The donor plasmid, the sgRNA and zCas9 mRNA are co-injected into zebrafish embryos so that the hey2^zCKOIS donor plasmid is integrated at the hey2 locus. The left arm of the donor plasmid is 3300 bp long and comprises the original left arm from the genome and the inverted TagRFP cassette sequence. The right arm is 1107 bp long. GSG-P2A is a glycine-serine-glycine-P2A sequence.

(B) PCR analysis of genomic DNA from the hey2^zCKOIS knock-in F1 generation. A 3.2 kb DNA band was amplified with primers F1 and R1, and a 5.3 kb DNA band was amplified with primers F1 and R2. Both bands were present only in hey2^zCKOIS embryos and were absent from the wild-type (WT) group. The positions of primers F1, R1, F2 and R2 are shown in (A).

(C) Left panel: RT-PCR analysis of cDNA synthesized from hey2^zCKOIS F1 embryos. The 1.5 kb band amplified with the F1 and R1 primers appeared only in hey2^zCKOIS embryos. Right panel: RT-PCR with primers that bind the hey2 coding sequence, showing a 0.3 kb band consistent with the expected size for the hey2 coding sequence.

(D) Confocal projection images (lateral view) of a 1.5-day hey2^zCKOIS;Ki(GFAP-TagBFP) embryo, showing EGFP expression in TagBFP-labeled glial cells of the brain. Ki(GFAP-TagBFP) is a knock-in line in which the TagBFP protein is expressed specifically in glial cells. White arrow, midbrain; arrow, forebrain. Scale bar: 100 μm.

(E) Confocal projection images (lateral view) of a hey2^zCKOIS;Ki(GFAP-TagBFP) embryo at day 3.5, showing EGFP expression in glial cells of the spinal cord (white arrows). Cyan arrow, non-specific signal on the skin. Scale bar: 100 μm.

(F and G) Confocal projection images of 3.5-day hey2^zCKOIS;Tg(flk1:Ras-mCherry) embryos. (F) EGFP is expressed in the basal communicating artery (BCA) and the posterior communicating segments (PCS) of the brain (dorsal view). Top left: EGFP signal; top right: merged signal (EGFP/mCherry). The outlined regions a and b in the top right image are shown enlarged in the bottom panels. White arrows, BCA and PCS; white arrow, choroidal vascular plexus (CVP). (G) In the trunk of the zebrafish embryo, EGFP is expressed in the dorsal aorta (DA) but not in the posterior cardinal vein (PCV) (lateral view). Arterial intersegmental vessels (aISVs) extending from the DA show stronger EGFP signal than venous intersegmental vessels (vISVs) extending from the PCV. Top left: EGFP signal; top right: merged signal (EGFP/mCherry). The outlined regions a and b in the top right image are shown enlarged in the bottom panels. White arrows, DA and aISVs; white arrows, PCV and vISVs. Cyan arrow, non-specific signal on the skin. Scale bar: 100 μm.

FIG. 2 demonstrates that hey2^zCKOIS-inv is a loss-of-function allele.

(A) Schematic of the hey2^zCKOIS allele and the translated Hey2^zCKOIS protein. The endogenous Hey2 protein comprises an N-terminal bHLH domain, an Orange domain, and a protein-protein interaction YRPW ("Y") motif near the C terminus. The Hey2^zCKOIS protein is a fusion product of the wild-type (WT) Hey2 protein and GSG-P2A-EGFP.

(B) Schematic of the hey2^zCKOIS-inv allele produced by Cre-induced inversion of the cassette in the hey2^zCKOIS allele. The Hey2^zCKOIS-inv protein is a fusion of the bHLH domain of the Hey2 protein with P2A-TagRFP.

(C) PCR detection of the inversion in the hey2^zCKOIS-inv genome. The 2.8 kb band is present only in hey2^zCKOIS-inv embryos and is absent from WT and hey2^zCKOIS embryos.

(D) Left panel: RT-PCR detection of hey2^zCKOIS-inv transcription. A 0.9 kb band was amplified from hey2^zCKOIS-inv embryos but not from WT or hey2^zCKOIS embryos. Right panel: RT-PCR control with primers that bind the hey2 coding sequence; a 0.3 kb band was obtained in every group.

(E) Confocal projection images (lateral view) of a 1.5-day hey2^zCKOIS-inv;Ki(GFAP-TagBFP) embryo, showing that TagRFP is expressed in glial cells. An enlarged view of the boxed area is shown on the right. White arrow, midbrain; arrow, forebrain. Scale bar: 100 μm.

(F) Confocal projection images (lateral view) of the trunk of a 2.5-day hey2^{zCKOIS/zCKOIS-inv} embryo. Red fluorescence comes from translation of Hey2(truncated)-P2A-TagRFP encoded by the hey2^zCKOIS-inv allele, and green fluorescence comes from translation of WT Hey2-GSG-P2A-EGFP. Cyan arrow, non-specific signal on the skin. Scale bar: 100 μm.

(G) Confocal projection image of the DA in the trunk of a 2.5-day hey2^{zCKOIS/zCKOIS-inv} embryo, showing two HSCs budding from the DA. Scale bar: 50 μm.

(H) Top panels: bright-field images showing severe pericardial edema in homozygous hey2^{zCKOIS-inv/zCKOIS-inv} embryos. Bottom panels: heterozygous hey2^zCKOIS-inv embryos showed no obvious edema at 3.5 days. Left panels: merged bright-field (BF) and TagRFP images. Right panels: TagRFP. Scale bar: 500 μm.

(I) Proportion of embryos with normal tail circulation in different genetic backgrounds. The numbers in the figure are the total numbers of 3.5-day embryos in each group.

FIG. 3 shows the specific knockout of the hey2 gene in endothelial cells.

(A) Schematic of the specific knockout of hey2 in endothelial cells (ECs). Homozygous Ki(flk1-P2A-Cre) fish are crossed with the hey2^zCKOIS line, and the Hey2 protein is specifically disrupted in the ECs of the resulting progeny. In non-ECs, which do not express Cre, the hey2^zCKOIS cassette is not inverted, so the Hey2 protein is expressed normally.

(B) RT-PCR analysis of cDNA to detect transcription of hey2^zCKOIS and hey2^zCKOIS-inv. Left panel: a 1.5 kb target band was amplified with primers F1 and R2 from embryos of the hey2^zCKOIS;Ki(flk1-P2A-Cre) and hey2^zCKOIS groups, but not from embryos of the WT group. Middle panel: a 0.9 kb band amplified with primers F1 and F2 is present only in hey2^zCKOIS;Ki(flk1-P2A-Cre) embryos. Right panel: RT-PCR control with primers that bind the hey2 coding sequence; a 0.3 kb band was obtained in every group.

(C) Confocal projection images (lateral view) of trunk vessels in a 3.5-day hey2^{zCKOIS/zCKOIS};Ki(flk1-P2A-Cre) embryo. In the DA and aISVs, Cre-induced inversion of the hey2^zCKOIS cassette initiated TagRFP expression, with little decrease in EGFP expression (indicated by arrows and dashed lines). Top left: EGFP signal; top right: merged signal (EGFP/TagRFP). The outlined region is shown enlarged. Cyan arrow, non-specific signal on the skin. Scale bar: 100 μm.

(D) Proportion of embryos with normal tail circulation in different genetic backgrounds. The numbers in the figure are the total numbers of 3.5-day embryos in each group.

FIG. 4 shows genomic and transcript sequencing analysis of the hey2^zCKOIS fish line. (A) Sequencing of the 5' junction of the donor plasmid integration site in the hey2^zCKOIS knock-in F1 generation. The PAM and sgRNA target sequences are shown in green and red, respectively. An 894 bp deletion is present in the intron near the sgRNA target region where the hey2^zCKOIS donor plasmid integrated. (B) The cDNA sequence of the hey2^zCKOIS transcript shows that EGFP is linked directly and in frame to exon 5 of hey2.

FIG. 5 shows genomic and transcript sequencing analysis of the hey2^zCKOIS-inv fish line. (A) Sequencing of the 5' junction of the donor plasmid integration site in the hey2^zCKOIS-inv knock-in F1 generation. The PAM and sgRNA target sequences are shown in green and red, respectively. The same 894 bp deletion as in the hey2^zCKOIS line is present in the intron near the sgRNA target region. (B) The cDNA sequence of the hey2^zCKOIS-inv transcript shows that TagRFP is linked to exon 4 of hey2.

FIG. 6 shows confocal projection images of hey2^{zCKOIS/zCKOIS-inv} embryos. (A) Co-localization image. Scale bar: 100 μm. (B) In a 2.5-day hey2^{zCKOIS/zCKOIS-inv} embryo, the fluorescence signals of EGFP (encoded by hey2^zCKOIS) and TagRFP (encoded by hey2^zCKOIS-inv) co-localize in the opercular artery (ORA). Scale bar: 50 μm.

FIG. 7 shows genotyping of the hey2^zCKOIS;Ki(flk1-P2A-Cre) fish line. Left panel: PCR analysis of genomic DNA showed a 3.2 kb band in the hey2^zCKOIS and hey2^zCKOIS;Ki(flk1-P2A-Cre) groups but not in the WT group. Right panel: a 2.8 kb band is present only in the hey2^zCKOIS;Ki(flk1-P2A-Cre) group and the positive-control hey2^zCKOIS-inv group, and is absent from the WT and hey2^zCKOIS groups.

FIG. 8 shows a confocal projection image of the trunk of a hey2^{zCKOIS/zCKOIS} embryo. In the absence of Cre, EGFP was expressed in the DA (indicated by an arrow), and no red fluorescent signal from TagRFP was detected. Asterisk, non-specific signal on the yolk sac. Cyan arrow, non-specific signal on the skin. Scale bar: 100 μm.

FIG. 9 shows the validation of Ki (flk1-P2A-Cre) fish lines.

Confocal projection images of the trunk of a 3.5-day Ki(flk1-P2A-Cre);Tg(bactin2:loxP-STOP-loxP-DsRedEx);Tg(flk1:EGFP) embryo. Red denotes bactin2:DsRedEx. Green denotes flk1:EGFP. Scale bar: 100 μm.

Detailed Description

Through extensive and intensive research, the inventors have established a highly efficient, tissue-specific zebrafish gene knockout method. A first exogenous gene expression cassette is inserted between the left and right arms of a donor plasmid, and a DNA element flanked by two LoxP sites, namely an inverted second exogenous gene expression cassette, is placed in reverse orientation in the intron sequence of the left arm. In the absence of Cre protein, the structurally and functionally intact endogenous gene and the first exogenous gene are expressed normally. In the presence of Cre protein, the DNA element between the two LoxP sites is inverted by Cre and is joined to the preceding exon of the gene through its splice acceptor, which disrupts the structure of the gene and causes the mutated gene to express the second exogenous gene as well. The present invention has been completed on this basis.
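The Cre-dependent switch described above can be sketched, purely for illustration, as a toy model in which an allele is an ordered list of oriented elements. The element names, the single pair of lox sites and the rule that a transcript ends at the first sense-strand element ending in "pA" are simplifying assumptions made for this sketch; they do not reproduce the actual construct.

```python
def invert_segment(allele, start, end):
    """Invert allele[start..end]: reverse element order and flip each strand."""
    flipped = [(name, -strand) for name, strand in reversed(allele[start:end + 1])]
    return allele[:start] + flipped + allele[end + 1:]


def transcript(allele):
    """Sense-strand elements read in order, stopping at the first polyA (pA)."""
    parts = []
    for name, strand in allele:
        if name.startswith("lox") or strand != +1:
            continue  # lox sites and antisense elements are skipped in this toy model
        parts.append(name)
        if name.endswith("pA"):
            break
    return parts


# Hypothetical, simplified layout: +1 = sense orientation, -1 = inverted.
allele = [
    ("exon4", +1),
    ("lox_a", +1),
    ("SA-P2A-TagRFP-pA", -1),   # inverted second cassette, silent without Cre
    ("lox_b", -1),
    ("exon5-GSG-P2A-EGFP", +1),
]

print("without Cre:", transcript(allele))
inverted = invert_segment(allele, 1, 3)  # Cre inverts the lox-flanked element
print("with Cre:   ", transcript(inverted))
```

In this model the uninverted allele yields the intact gene plus the first exogenous gene, while the inverted allele is read from exon 4 into the TagRFP cassette and terminates there.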

Terms

In order that the disclosure may be more readily understood, certain terms are first defined. As used in this application, each of the following terms shall have the meaning given below, unless explicitly specified otherwise herein. Other definitions are set forth throughout the application.

The term "about" can refer to a value or composition that is within an acceptable error range for the particular value or composition as determined by one of ordinary skill in the art, which will depend in part on how the value or composition is measured or determined. For example, as used herein, the expression "about 100" includes 99 and 101 and all values in between (e.g., 99.1, 99.2, 99.3, 99.4, etc.).

As used herein, the terms "comprising" and "including" can be open, semi-closed or closed. In other words, the terms also cover "consisting essentially of" and "consisting of".

Sequence identity is determined by comparing two aligned sequences along a predetermined comparison window (which may be 50%, 60%, 70%, 80%, 90%, 95% or 100% of the length of the reference nucleotide sequence or protein) and determining the number of positions at which identical residues occur. Typically, this is expressed as a percentage. The measurement of sequence identity of nucleotide sequences is a method well known to those skilled in the art.
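As a minimal sketch of the calculation just described, the following function counts identical positions between two pre-aligned sequences over a comparison window and reports the result as a percentage. The function name, the gap character "-" and the example sequences are illustrative assumptions only.

```python
def percent_identity(aligned_a, aligned_b, window=None):
    """Percentage of identical positions over the first `window` aligned columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    n = len(aligned_a) if window is None else window
    if n <= 0 or n > len(aligned_a):
        raise ValueError("window must be between 1 and the alignment length")
    matches = sum(
        1
        for x, y in zip(aligned_a[:n], aligned_b[:n])
        if x == y and x != "-"  # a gap paired with a gap is not counted as identity
    )
    return 100.0 * matches / n


# Hypothetical aligned sequences differing at one of ten positions.
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
```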

As used herein, the terms "gene inactivation", "gene knockout", and the like, are used interchangeably and refer to genetic manipulation such as disruption, knockout, etc. of a certain gene of interest, such that the expression and/or activity of the gene of interest is substantially reduced or even completely lost.

As used herein, the terms "hey2^zCKOIS donor plasmid", "Donor plasmid" and "donor plasmid" are used interchangeably.

CRISPR/Cas system

The CRISPR/Cas system (clustered regularly interspaced short palindromic repeats/CRISPR-associated proteins) is an acquired immune defense mechanism of prokaryotes against invasion by foreign genes. It evolved in bacteria and archaea as a defense against invading viruses and bacteriophages. The system integrates DNA fragments from the invader into the CRISPR locus and then, through the corresponding CRISPR RNAs (crRNAs), guides a Cas endonuclease to cut the foreign DNA sequence, thereby resisting invasion by viruses or phages. The CRISPR/Cas gene cluster consists of a series of genes encoding Cas proteins (Cas1, Cas2, Cas4 and effector proteins such as Cas9 and Cpf1) and a CRISPR sequence.

A CRISPR sequence consists of a leader, a number of short, conserved repeat regions (repeats), and spacers. The repeat region contains a palindromic sequence and can form a hairpin structure. The spacers are foreign DNA sequences captured by the host and are equivalent to a "blacklist" for the immune system. When the foreign genetic material invades the host again, the bacterium transcribes the CRISPR sequence into a primary transcript, the pre-crRNA, which is cleaved within the repeat sites by a ribonuclease or a Cas protein to form mature crRNA. The mature crRNA forms a ribonucleoprotein complex with a specific CRISPR effector protein, which recognizes and cleaves foreign DNA capable of complementary pairing with the crRNA, causing a double-strand break and initiating self-repair in the host cell.

CRISPR/Cas systems are divided into 2 classes and 5 types, with 16 subtypes in total, according to the composition of the Cas genes and the number of effector proteins. Class 1 systems interfere with the target gene using complexes of multiple effector proteins and include types I, III and IV; class 2 systems interfere with the target gene using a single effector protein and include types II and V. The most widely studied and used is the class 2 type II system, i.e., the CRISPR/Cas9 system, which achieved gene editing in mammalian cells in 2013. A type II system uses a single Cas9 nuclease, guided by crRNA, to cleave a DNA target site precisely and efficiently. The system is simple to operate, has a short experimental cycle, is highly efficient, and is applicable to many species. It requires a specially designed guide RNA, the sgRNA (single guide RNA), whose targeting sequence is a nucleotide sequence of about 20 nt adjacent to a PAM (NGG) in the genome. Guided by the sgRNA, the Cas9 protein cleaves the genome at the defined site and causes a DNA double-strand break, which activates one of two cellular repair mechanisms, non-homologous end joining (NHEJ) or homologous recombination (HR). This results in gene knockout or random deletion or insertion of fragments, or, when a specific repair template is used, in permanent modification of the genome.
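The sgRNA design rule described above (a targeting sequence of about 20 nt immediately followed by an NGG PAM) can be illustrated by a simple scan of one DNA strand. This is only a sketch: the function name and the demonstration sequence are hypothetical, only the given strand is searched, and no off-target or efficiency scoring is attempted.

```python
import re


def find_ngg_targets(seq, length=20):
    """List (position, protospacer, PAM) for every `length`-nt site followed by NGG."""
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % length)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Hypothetical sequence: a 20-nt stretch followed by the PAM "TGG".
demo = "ACGTACGTACGTACGTACGT" + "TGG" + "A"
for position, protospacer, pam in find_ngg_targets(demo):
    print(position, protospacer, pam)
```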

As used herein, the terms "single guide RNA", "sgRNA" are interchangeable.

In a preferred embodiment, the Cas9 mRNA is zCas9 mRNA, codon-optimized according to zebrafish codon preference, the nucleic acid sequence of which is shown in SEQ ID NO. 10.

Cre-LoxP system

The coding region of the Cre recombinase gene is 1029 bp long (EMBL database accession number X03453) and encodes a 38 kDa protein. Cre recombinase is a monomeric protein of 343 amino acids and belongs to the λ integrase (Int) superfamily. It not only has catalytic activity but, like a restriction enzyme, also recognizes a specific DNA sequence, the loxP site, so that the gene sequence between loxP sites is deleted or recombined.

The loxP (locus of X-over P1) sequence is derived from the P1 phage and consists of two 13 bp inverted repeats separated by an 8 bp spacer; the 8 bp spacer also determines the orientation of the loxP site. The Cre enzyme binds covalently to DNA while catalyzing DNA strand exchange, and the 13 bp inverted repeats are the binding domains of the Cre enzyme.
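The 13-8-13 layout described above can be checked on the commonly cited wild-type loxP sequence. Note that the sequence below is taken from general knowledge of loxP and is assumed here for illustration; it has not been compared with SEQ ID NO. 4 of this application.

```python
# Commonly cited wild-type loxP site (assumption: not checked against SEQ ID NO. 4).
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


left, spacer, right = LOXP[:13], LOXP[13:21], LOXP[21:]

# The two 13 bp arms are inverted repeats of each other.
print("arms are inverted repeats:", right == revcomp(left))
# The 8 bp spacer is not its own reverse complement, which gives the site a direction.
print("spacer is asymmetric:", spacer != revcomp(spacer))
```

Because the spacer is asymmetric, two sites can be placed in the same or in opposite orientation, which is what distinguishes excision from inversion.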

The loxP sites used in the present invention include the wild-type loxP site sequence and mutant loxP site sequences; common mutant loxP sites include lox511, lox5171, lox2272, loxm2, loxm3, loxm7 and loxm11.

In a preferred embodiment, the two loxP sites used in the present invention are a wild-type loxP site and a mutant lox5171 site, wherein the wild-type loxP site is shown in SEQ ID No. 4, and the mutant lox5171 site is shown in SEQ ID No. 5.

Nucleic acid constructs

The invention also provides a construct as described in the first aspect of the invention.

The various elements used in the constructs of the invention are known in the art, and thus the corresponding elements can be obtained by conventional methods, such as PCR, total artificial chemical synthesis, enzymatic digestion, and then ligated together by well-known DNA ligation techniques to form the constructs of the invention.

The vector of the present invention is formed by inserting the construct of the present invention into a foreign vector, particularly a vector suitable for the manipulation of transgenic animals.

The vector of the invention is transformed into a host cell so as to mediate the vector of the invention to integrate the chromosome of the host cell, thus obtaining the transgenic cell.

As used herein, "exogenous gene" refers to an exogenous DNA molecule whose action is a stepwise action. The foreign gene that can be used in the present application is not particularly limited, and includes various foreign genes commonly used in the field of transgenic animals. Representative examples include (but are not limited to): red fluorescent protein gene, green fluorescent protein gene, lysozyme gene, salmon calcitonin gene, lactoferrin, serum albumin gene, or the like.

In a preferred embodiment, two "foreign genes," TagRFP and EGFP, are used herein. The amino acid sequence of the TagRFP sequence is shown in SEQ ID No. 8, and the EGFP sequence is shown in SEQ ID No. 3.

As used herein, "selectable marker gene" refers to a gene used in a transgenic process to select a transgenic cell or a transgenic animal, and the selectable marker gene that can be used in the present application is not particularly limited, and includes various selectable marker genes commonly used in the transgenic field, representative examples including (but not limited to): neomycin gene, or puromycin resistance gene.

The term "expression cassette" as used herein refers to a polynucleotide sequence comprising the sequence of the gene to be expressed and the elements required for its expression. For example, in the present invention, the term "selectable marker expression cassette" refers to a polynucleotide sequence comprising a sequence encoding a selectable marker and the sequence elements required for its expression. Elements required for expression include a promoter and a polyadenylation signal sequence. In addition, the selectable marker expression cassette may or may not contain other sequences, including (but not limited to) enhancers, secretory signal peptide sequences, and the like.

In a preferred embodiment, the invention uses two self-cleaving elements, a P2A sequence and a GSG-P2A sequence. The nucleic acid sequence of the P2A sequence is shown in SEQ ID NO. 6, and GSG-P2A is a glycine-serine-glycine-P2A sequence.

In the present invention, the promoter suitable for the exogenous gene expression cassette and the selectable marker gene expression cassette may be any common promoter, and may be a constitutive promoter or an inducible promoter. Preferably, the promoter is a strong constitutive promoter, such as the bovine β-lactoglobulin promoter or another promoter suitable for eukaryotic expression. The BGHpA signal sequence is shown in SEQ ID NO. 7.

Hey2 gene

The Hey2 gene, whose full name is hes-related family bHLH transcription factor with YRPW motif 2, is involved in circulatory system development. The zebrafish hey2 gene is located on chromosome 20, has 5 exons, and has a full-length nucleic acid sequence of 7524 bp (GenBank accession number NC_007131); the expressed protein is 324 aa long (accession number NP_571697.2).

The hey2^zCKOIS zebrafish line is constructed using the CRISPR/Cas9 system. The invention provides an sgRNA targeting the intron between the 4th and 5th exons of the zebrafish hey2 gene (the target sequence is shown in SEQ ID NO: 2; the PAM is AGG); the nucleic acid sequence of the hey2 sgRNA is shown in SEQ ID NO. 11.

The invention also provides the hey2^zCKOIS donor plasmid (Donor plasmid), the left and right arm sequences of which are indicated in FIG. 1A. The left arm of the donor plasmid is 3300 bp long and comprises the original left arm from the genome and an inverted DNA element. The inverted DNA element contains the TagRFP cassette sequence, a splice acceptor and 2 sets of lox sites (loxP, lox5171). In a preferred embodiment, the inverted DNA element is joined to the hey2 left arm sequence through a ClaI restriction site. The original left arm comprises the 5th exon (E5) of hey2 and a stretch of intron sequence containing the sgRNA target sequence. The right arm of the donor plasmid is identical to the original right arm, is 1107 bp long, and contains the entire 3' UTR sequence of the target gene and 3' intergenic sequence. A GSG-P2A self-cleaving sequence and an EGFP tag fragment are also inserted between the left and right arms of the hey2^zCKOIS donor plasmid; GSG-P2A is a glycine-serine-glycine-P2A sequence. In a preferred embodiment, the left and right arms are ligated to the 5' and 3' ends, respectively, of the GSG-P2A-EGFP fragment in the T-GSG-P2A-EGFP vector to obtain the hey2^zCKOIS donor plasmid, the sequence of which is shown in SEQ ID NO. 9.

Applications

In one example of the invention, a reagent is provided comprising zCas9 mRNA, hey2 sgRNA and the hey2^CKOIS donor plasmid. In a preferred example, the reagent contains 800 ng/μl zCas9 mRNA, 80 ng/μl sgRNA and 15 ng/μl donor plasmid. In a preferred embodiment, the Cas9 mRNA is zCas9 mRNA, codon-optimized according to zebrafish codon preference, the nucleic acid sequence of which is shown in SEQ ID NO. 10.
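For illustration, the volumes needed to reach the final concentrations given above can be worked out from stock solutions with the usual dilution relation (final concentration × total volume = stock concentration × stock volume). The stock concentrations and the 10 μl total volume below are hypothetical values chosen for the example and are not taken from the specification.

```python
def mix_volumes(final_ng_per_ul, stock_ng_per_ul, total_ul):
    """Volume (in ul) of each stock, plus water, needed for the final concentrations."""
    volumes = {
        name: final_ng_per_ul[name] * total_ul / stock_ng_per_ul[name]
        for name in final_ng_per_ul
    }
    water = total_ul - sum(volumes.values())
    if water < 0:
        raise ValueError("stocks are too dilute for the requested final volume")
    volumes["water"] = water
    return volumes


# Final concentrations from the preferred example above (ng/ul).
FINAL = {"zCas9 mRNA": 800, "hey2 sgRNA": 80, "donor plasmid": 15}
# Hypothetical stock concentrations (ng/ul), assumed only for this sketch.
STOCK = {"zCas9 mRNA": 2000, "hey2 sgRNA": 400, "donor plasmid": 100}

for name, volume in mix_volumes(FINAL, STOCK, total_ul=10).items():
    print(f"{name}: {volume:.2f} ul")
```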

In one embodiment of the present invention, there is also provided a Hey2^zCKOIS protein, which is a fusion product of the wild-type (WT) Hey2 protein and GSG-P2A-EGFP.

In one embodiment of the present invention, there is also provided a method of obtaining a hey2^zCKOIS fish line, comprising the steps of: (a) providing a fertilized egg at the single-cell stage, injecting into it a reagent containing zCas9 mRNA, hey2 sgRNA and the hey2^CKOIS donor plasmid, and culturing for several days to obtain a transgenic adult fish; (b) crossing the transgenic adult fish obtained in step (a) with a wild-type zebrafish, and identifying the genomic DNA of the progeny to obtain the hey2^zCKOIS fish line by screening. In a preferred embodiment, the hey2^zCKOIS fish line expresses the Hey2^zCKOIS protein. In a preferred embodiment, the cDNA sequence of the transcript of the hey2^zCKOIS fish line shows that EGFP is linked directly and in frame to exon 5 of hey2. In a preferred embodiment, EGFP and the Hey2^zCKOIS protein are expressed simultaneously in cells of the hey2^zCKOIS fish line.

In one embodiment of the present invention, there is also provided a Hey2^zCKOIS-inv protein, which is a fusion product of the bHLH domain of the Hey2 protein and P2A-TagRFP.

In one embodiment of the present invention, there is also provided a method of obtaining a hey2^zCKOIS-inv fish line, comprising the step of expressing Cre recombinase in cells of the hey2^zCKOIS fish line. In a preferred embodiment, the hey2^zCKOIS-inv fish line expresses the Hey2^zCKOIS-inv protein. In a preferred embodiment, the cDNA sequence of the transcript of the hey2^zCKOIS-inv fish line shows that TagRFP is linked to exon 4 of hey2. In a preferred embodiment, TagRFP and the Hey2^zCKOIS-inv protein are expressed simultaneously in cells of the hey2^zCKOIS-inv fish line. In a preferred embodiment, the method of obtaining the hey2^zCKOIS-inv fish line comprises crossing the hey2^zCKOIS fish line with a knock-in line that expresses Cre specifically. In a preferred embodiment, the knock-in lines include the Ki(flk1-P2A-Cre) line and the Ki(GFAP-TagBFP) line. The nucleic acid sequence of the Cre recombinase mRNA is shown in SEQ ID NO. 12.

The main advantages of the invention include:

1. The invention provides a tissue-specific zebrafish gene knockout method that uses non-homologous recombination, with markedly higher knockout efficiency than homologous recombination.

2. A salient feature of the method is that an exogenous gene can be knocked in systemically without affecting the function of the target gene in non-target tissues, while in the target tissue the target gene is knocked out tissue-specifically and another exogenous gene is knocked in site-specifically. Multiple transgenic effects are thus obtained in one step, which greatly improves transgenic efficiency.

3. The invention further simplifies screening: because the exogenous genes are designed to express fluorescent proteins, the target transgenic zebrafish can be obtained simply by visually screening embryos for the specific fluorescent marker.

The invention will be further illustrated with reference to the following specific examples. It should be understood that these examples are for illustrative purposes only and are not intended to limit the scope of the present invention. Experimental procedures for which no specific conditions are noted in the following examples generally follow conventional conditions, such as those described in Sambrook et al., Molecular Cloning: A Laboratory Manual (New York: Cold Spring Harbor Laboratory Press, 1989), or the manufacturer's recommendations. Unless otherwise indicated, percentages and parts are by weight.

Materials and methods

The flk1 gene and the Ki (flk1-P2A-Cre) line

The flk1 gene, whose full name is kinase insert domain receptor like, is located on chromosome 14 and contains 30 exons. Its nucleic acid sequence has a full length of 61184 bp (GenBank accession number NC_007125); the expressed protein has a total length of 1300 aa (accession number XP_009289417).

The Ki (flk1-P2A-Cre) fish used in the present invention were produced by the CRISPR/Cas9-mediated knock-in (KI) method described in the prior patent (patent No. 201510075155.6). Briefly, the EGFP sequence in the flk1-P2A-EGFP donor plasmid was replaced by Cre, resulting in a flk1-P2A-Cre plasmid.

Tg(bactin2:loxP-stop-loxP-dsred-express)^sd5 fish line

This transgenic fish line was a gift from Professor David Traver, Department of Biology, University of California, San Diego. It was mainly used to verify the recombination efficiency of the Cre protein expressed in vascular endothelial cells of the Ki (flk1-P2A-Cre) fish line. The transgenic fish line carries a DNA sequence containing loxP-stop-loxP-dsred-express under the control of the bactin2 promoter, where loxP is the loxP sequence, stop is a stop codon sequence, and dsred-express is a red fluorescent protein sequence. In the absence of Cre protein, the dsred-express red fluorescent protein is not expressed in this line; when Cre protein is present, the stop sequence is excised and the dsred-express red fluorescent protein is expressed.
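The reporter logic described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that Cre acts on two loxP sites in the same orientation by deleting the intervening sequence together with one copy of the site. The function name `cre_excise` and the placeholder promoter, stop and reporter strings are hypothetical and are not taken from the reporter line itself; only the loxP sequence corresponds to SEQ ID No. 4.

```python
# Wild-type loxP site (34 bp, SEQ ID No. 4), written in upper case.
LOXP = "ATAACTTCGTATAATGTATGCTATACGAAGTTAT"


def cre_excise(seq: str, site: str = LOXP) -> str:
    """Delete the segment between two same-orientation lox sites.

    One copy of the site is kept. The sequence is returned unchanged
    if fewer than two sites are found.
    """
    first = seq.find(site)
    if first == -1:
        return seq
    second = seq.find(site, first + len(site))
    if second == -1:
        return seq
    # Keep everything up to and including the first site, then resume
    # directly after the second site.
    return seq[: first + len(site)] + seq[second + len(site):]


# Toy cassette with placeholder letters (hypothetical, not real sequence):
# promoter - loxP - stop - loxP - reporter
promoter, stop, reporter = "GGGG", "TAATAGTGA", "CCCC"
cassette = promoter + LOXP + stop + LOXP + reporter

recombined = cre_excise(cassette)
print(stop in cassette)    # the stop cassette is present before recombination
print(stop in recombined)  # and absent afterwards
```

In this toy model the reporter becomes contiguous with the promoter and a single loxP site only after `cre_excise` is applied, which mirrors the behaviour expected when Cre is present.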

Ki (GFAP-TagBFP) fish line

Ki (GFAP-TagBFP) fish lines were made as described in the prior patent (patent No. 201510075155.6). Briefly, the GFAP-TagBFP donor plasmid was obtained by replacing the EGFP sequence in the GFAP-EGFP donor with TagBFP.

Zebra fish breeding

Adult zebrafish were kept in an aquatic animal culture system (Beijing Aisheng) at a water temperature of 28 °C, pH 7-8, with a photoperiod of 14 hours light/10 hours dark. Embryos were raised in 10% Hank's solution, with the following composition (in mM): 140 NaCl, 5.4 KCl, 0.25 Na₂HPO₄, 0.44 KH₂PO₄, 1.3 CaCl₂, 1.0 MgSO₄ and 4.2 NaHCO₃ (pH 7.2).

Example 1 Synthesis of sgRNA, zCas9 mRNA and Cre mRNA

The plasmid pGH-T7-zCas9, which expresses zCas9, was obtained from the College of Life Sciences, Peking University. The plasmid was linearized with XbaI endonuclease, and zCas9 mRNA was obtained by in vitro transcription with the mMESSAGE mMACHINE T7 Ultra kit (Ambion) followed by purification. The Cre coding sequence was amplified from a Cre plasmid using primers containing the T7 promoter, and Cre mRNA was synthesized with the same T7 Ultra kit.

The sequence of the zCas9 mRNA is shown in SEQ ID No. 10, and the sequence of the Cre mRNA is shown in SEQ ID No. 12.

The targeting DNA sequences used to prepare the different sgRNAs are shown below:

hey2	GGAAGGATAATGGTTGGGT	In the sense strand	SEQ ID No.:13
gfap	GTGCGCAACACATAGCACCA	In the antisense strand	SEQ ID No.:14
flk1	TCTGGTTTGGAAGGACACAG	In the sense strand	SEQ ID No.:15

The above sgRNA target sequences were each cloned into the BbsI site of the pT7-sgRNA plasmid (obtained from the College of Life Sciences, Peking University), transcribed in vitro with the MAXIscript T7 kit (Ambion), and purified with the mirVana™ miRNA Isolation Kit (Ambion) to obtain the sgRNAs. Taking the hey2 sgRNA as an example, the nucleic acid sequence of the resulting hey2 sgRNA is shown in SEQ ID No. 11.
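As an illustration of how a targeting sequence such as those listed above might be located in a genomic sequence, the following minimal sketch searches both strands for the target followed by a protospacer-adjacent motif. It assumes an NGG motif, which the text does not state explicitly, and the helper names `revcomp` and `find_target`, the flanking bases and the TGG motif in the toy sequence are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_target(genome: str, target: str):
    """List (strand, start) hits of `target` followed by an NGG motif.

    "+" hits are indexed on the given sequence; "-" hits are indexed
    on its reverse complement.
    """
    genome, target = genome.upper(), target.upper()
    hits = []
    for strand, template in (("+", genome), ("-", revcomp(genome))):
        start = template.find(target)
        while start != -1:
            motif = template[start + len(target): start + len(target) + 3]
            if len(motif) == 3 and motif[1:] == "GG":
                hits.append((strand, start))
            start = template.find(target, start + 1)
    return hits


# flk1 targeting sequence from the table above; flanks and motif are made up.
FLK1_TARGET = "TCTGGTTTGGAAGGACACAG"
toy_genome = "ACGT" + FLK1_TARGET + "TGG" + "ACGT"

print(find_target(toy_genome, FLK1_TARGET))           # hit on the "+" strand
print(find_target(revcomp(toy_genome), FLK1_TARGET))  # hit on the "-" strand
```

A target reported on the "-" strand corresponds to a targeting sequence described in the table as lying in the antisense strand.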

Example 2 Construction and verification of the hey2^CKOIS zebrafish line

2.1 The GSG-P2A-EGFP fragment was ligated into the pMD19-T vector (purchased from Takara) by T-A cloning to form the T-GSG-P2A-EGFP vector. The left and right arms of hey2 were amplified from wild-type zebrafish genomic DNA using KOD-Plus-Neo DNA polymerase and then ligated to the 5′ and 3′ ends, respectively, of the GSG-P2A-EGFP fragment in the T-GSG-P2A-EGFP vector. An inverted DNA module containing loxP sites, a splice acceptor, TagRFP and BGHpA was obtained by total gene synthesis. This inverted module, 1.8 kb in size, was inserted into the hey2 left arm sequence through the ClaI restriction site (AT^▼CGAT), giving the hey2^CKOIS donor plasmid. Finally, zCas9 mRNA, hey2 sgRNA and the hey2^CKOIS donor plasmid were co-injected into fertilized eggs at the one-cell stage. Each embryo was injected with 1 nl of a solution containing 800 ng/μl zCas9 mRNA, 80 ng/μl sgRNA and 15 ng/μl donor plasmid. The left and right arm sequences of the donor plasmid were amplified from adult wild-type AB zebrafish genomic DNA using the following primers:

1) Left arm amplification primers:

F(l)	5′-CGAGGTACCCACTCGTCGACAAAACTAGGG-3′	SEQ ID No.:16
R(l)	5′-CGAGGATCCAAACGCTCCCACTTCAGTTC-3′	SEQ ID No.:17

2) Right arm amplification primers:

F(r)	5′-CGAACCGGTTAAATGTTGGATTTAAATGT-3′	SEQ ID No.:18
R(r)	5′-CGACTGCAGTAGGGTTTTAGCAGGCACCG-3′	SEQ ID No.:19
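Each of the four arm primers above begins with a short 5′ tail before the genome-matching part. The following minimal sketch scans those tails for common restriction enzyme recognition sites and reports primer length and GC fraction. The text does not name the enzymes used for cloning the arms, so the enzyme panel in `SITES`, the 12-base tail window and the helper names are assumptions made for illustration only.

```python
# Recognition sequences of four common restriction enzymes (all palindromic).
# Which enzymes were actually used is not stated in the text; this panel is
# an assumption chosen for illustration.
SITES = {
    "KpnI": "GGTACC",
    "BamHI": "GGATCC",
    "AgeI": "ACCGGT",
    "PstI": "CTGCAG",
}

# Arm amplification primers as listed above (SEQ ID No. 16 to 19).
PRIMERS = {
    "F(l)": "CGAGGTACCCACTCGTCGACAAAACTAGGG",
    "R(l)": "CGAGGATCCAAACGCTCCCACTTCAGTTC",
    "F(r)": "CGAACCGGTTAAATGTTGGATTTAAATGT",
    "R(r)": "CGACTGCAGTAGGGTTTTAGCAGGCACCG",
}


def tail_sites(primer: str, window: int = 12) -> list:
    """Names of panel enzymes whose site lies in the first `window` bases."""
    tail = primer.upper()[:window]
    return [name for name, site in SITES.items() if site in tail]


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


for name, seq in PRIMERS.items():
    print(name, len(seq), tail_sites(seq), round(gc_fraction(seq), 2))
```

Under this panel, each primer tail carries a different site, which would be consistent with directional ligation of the left and right arms on either side of the GSG-P2A-EGFP fragment.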

2.2 To screen for hey2^CKOIS fish lines, adult fish were crossed with wild-type (WT) zebrafish, and genomic DNA was extracted from the embryos at 3 days post-fertilization (dpf) and identified by PCR.

Total RNA was extracted from zebrafish embryos at 1.5 and 3.5 dpf using TRIzol reagent (Invitrogen, 15596018) according to the manufacturer's instructions. First-strand cDNA was generated from the extracted total RNA using PrimeScript™ RT Master Mix (Takara, RR036A).

Primers for genotyping and for RT-PCR analysis of the transcripts of hey2^zCKOIS and hey2^zCKOIS-inv, with F1/R1 used for PCR of genomic DNA and F2/R2 used for RT-PCR analysis of cDNA:

F1	5′-GATCTGCCAAGTTGGAGAAAGC-3′	SEQ ID No.:20
F2	5′-TCAATTAAGTTTGTGCCCCAGT-3′	SEQ ID No.:21
R1	5′-CACCGTGAACAACCACCACT-3′	SEQ ID No.:22
R2	5′-CTTGTACAGCTCGTCCATGCC-3′	SEQ ID No.:23

Primers F3/R3 for RT-PCR analysis of hey2 cDNA:

F3	5′-ATGAAGCGGCCCTGTGAGGA-3′	SEQ ID No.:24
R3	5′-CTTTTCCTCCTGTGGCCTGAA-3′	SEQ ID No.:25
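To show how a primer pair such as those above defines a PCR product, the following minimal sketch performs a simple in-silico PCR: it finds the forward primer on the template and the reverse complement of the reverse primer downstream of it, and returns the sequence in between, primers included. The function name `amplicon` and the toy template are hypothetical; the spacer between the two primer sites is a placeholder and not hey2 sequence, and exact matching without mismatches is a simplifying assumption.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def amplicon(template: str, fwd: str, rev: str):
    """Return the product primed by `fwd` and `rev`, or None if none.

    Only exact matches are considered, and only the first forward site
    and the first reverse site downstream of it are used.
    """
    template = template.upper()
    start = template.find(fwd.upper())
    if start == -1:
        return None
    rev_site = revcomp(rev)
    end = template.find(rev_site, start)
    if end == -1:
        return None
    return template[start: end + len(rev_site)]


# F3/R3 from the table above; the 20-base spacer is an invented placeholder.
F3 = "ATGAAGCGGCCCTGTGAGGA"
R3 = "CTTTTCCTCCTGTGGCCTGAA"
toy_template = "TT" + F3 + "ACGT" * 5 + revcomp(R3) + "AA"

product = amplicon(toy_template, F3, R3)
print(len(product))  # primer lengths plus the placeholder spacer
```

The expected band size for a real genotyping reaction would be obtained in the same way by passing the actual genomic or cDNA sequence as the template.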

2.3 To screen for hey2^CKOIS fish lines, adult fish were crossed with wild-type (WT) zebrafish, genomic DNA was extracted from the embryos at 3 dpf, and the lines were further identified by imaging.

For confocal imaging, serial fluorescence images were acquired as optical sections along the Z axis using an FN1 confocal microscope (Nikon, Japan) with a 25× (N.A. 1.1) or 10× (N.A. 0.3) water-immersion objective. The resolution of all images was 1024 × 1024 or 512 × 512 pixels. Structural morphology was subsequently reconstructed with ImageJ software (NIH).
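The confocal projection images referred to in the results are reconstructed from Z-axis series such as those described above. The text does not state which projection method was applied in ImageJ; the following minimal sketch assumes a maximum-intensity projection, in which each output pixel takes the brightest value found at that position across all optical sections. The function name `max_projection` and the tiny 2 × 2 stack are hypothetical.

```python
def max_projection(stack):
    """Maximum-intensity projection of a Z stack.

    `stack` is a list of optical sections; each section is a list of
    rows, and each row is a list of pixel intensities. All sections
    must have the same dimensions.
    """
    if not stack:
        raise ValueError("the stack must contain at least one section")
    rows, cols = len(stack[0]), len(stack[0][0])
    return [
        [max(section[r][c] for section in stack) for c in range(cols)]
        for r in range(rows)
    ]


# Three toy optical sections of 2 x 2 pixels (made-up intensities).
toy_stack = [
    [[0, 5], [2, 1]],
    [[3, 1], [0, 9]],
    [[1, 1], [7, 2]],
]

print(max_projection(toy_stack))  # → [[3, 5], [7, 9]]
```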

Results:

The proportion of screened F0 fish that were founders, i.e., that stably transmitted the insertion to the F1 generation with the insert in the forward orientation, was 2/21: 21 microinjected F0 zebrafish were screened and two founders were identified. In contrast, when the donor plasmid sequence was inserted into the zebrafish genome by homologous recombination, no founder was identified; even when 50 microinjected F0 zebrafish were screened, no founder capable of stable transmission to the next generation was obtained. This shows that the scheme provided by the invention can edit the zebrafish genome site-specifically with high efficiency and stability.
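The founder proportions reported above can be expressed as a short worked calculation. The counts (2 of 21 and 0 of 50) are taken from the text; the function name `founder_rate` and the variable names are hypothetical.

```python
from fractions import Fraction


def founder_rate(founders: int, screened: int) -> Fraction:
    """Exact proportion of screened F0 fish that were founders."""
    if screened <= 0:
        raise ValueError("at least one fish must be screened")
    return Fraction(founders, screened)


knock_in = founder_rate(2, 21)     # scheme of the invention
homologous = founder_rate(0, 50)   # homologous recombination comparison

print(f"{float(knock_in):.1%}")    # → 9.5%
print(f"{float(homologous):.1%}")  # → 0.0%
```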

As shown in FIGS. 1B-C, the PCR electrophoresis bands demonstrated the successful construction of the hey2^zCKOIS zebrafish model, i.e., the left arm of the hey2^zCKOIS donor plasmid and the GSG-P2A-EGFP structure were inserted in hey2^zCKOIS cells, which express the hey2^zCKOIS protein. As shown in FIGS. 2C-D, the PCR and RT-PCR bands demonstrated the successful construction of the hey2^zCKOIS-inv zebrafish model, i.e., P2A-TagRFP was inserted in hey2^zCKOIS-inv cells, which express the hey2^zCKOIS-inv protein. As shown in FIG. 3B, the RT-PCR electrophoresis bands demonstrated that EGFP-containing fragments were inserted in all embryonic cells, and that Cre-induced inversion of the P2A-TagRFP fragment, i.e., the effect of Cre recombinase, occurred only in embryonic cells of the hey2^zCKOIS;Ki (flk1-P2A-Cre) line.

As shown in FIG. 4 and FIG. 5, the fish lines hey2^zCKOIS and hey2^zCKOIS-inv were each sequenced for genomic and transcript validation. The results demonstrated that integration of the hey2^zCKOIS donor plasmid was accompanied by a deletion of 894 bp in the intron near the sgRNA target region. This deletion is caused by the randomness of genomic DNA repair after Cas9/sgRNA cleavage; such sequence insertions and deletions are random and generally range from a few base pairs to hundreds of base pairs.

The above conclusions, i.e., the successful construction of the hey2^zCKOIS zebrafish model and that hey2^zCKOIS-inv is a loss-of-function allele, were also demonstrated by immunofluorescence imaging, as shown in FIGS. 1D-G, FIGS. 2E-I and FIGS. 3C-D. As shown in FIG. 6, confocal projection images of hey2^{zCKOIS/zCKOIS-inv} embryos at 2.5 dpf show that the fluorescence signals of EGFP (encoded by hey2^zCKOIS) and TagRFP (encoded by hey2^zCKOIS-inv) co-localize at the opercular artery (ORA). As shown in FIG. 8, confocal projection images of the trunk of hey2^{zCKOIS/zCKOIS-inv} embryos at 2.5 dpf revealed that, in the absence of Cre, EGFP was expressed in the DA (arrows), whereas no red TagRFP fluorescence signal was detected (asterisk, nonspecific signal on the yolk sac; cyan arrow, nonspecific signal on the skin), which also demonstrates that hey2^zCKOIS-inv is a loss-of-function allele.

Example 3 Construction of the Cre-specific expression knock-in line Ki (flk1-P2A-Cre)

Ki (flk1-P2A-Cre) fish were produced by the CRISPR/Cas9-mediated KI method described in the prior patent (patent No. 201510075155.6). Briefly, the EGFP sequence in the flk1-P2A-EGFP donor plasmid was replaced by Cre, resulting in a flk1-P2A-Cre plasmid. A 1 nl solution containing 800 pg zCas9 mRNA, 80 pg flk1 sgRNA and 15 pg flk1 donor plasmid was injected into zebrafish embryos at the one-cell stage. These embryos were raised to adulthood and subjected to primary screening. To screen for Ki (flk1-P2A-Cre) fish lines, adult fish were crossed with AB wild-type zebrafish, and genomic DNA was extracted from the embryos at 1 dpf and identified by PCR. The fish line was then crossed with the Tg(bactin2:loxP-stop-loxP-dsred-express)^sd5 fish line to determine the effect of Cre-mediated excision between loxP sites in blood vessels.
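The per-embryo amounts given above (800 pg, 80 pg and 15 pg in 1 nl) correspond numerically to the concentrations given in Example 2 (800, 80 and 15 ng/μl with 1 nl injected per embryo), because 1 ng/μl equals 1 pg/nl. The following minimal sketch makes that unit conversion explicit. The function name `injected_mass_pg` is hypothetical, and treating the two examples as using the same mix is an assumption made for illustration.

```python
def injected_mass_pg(conc_ng_per_ul: float, volume_nl: float) -> float:
    """Mass delivered per embryo, in picograms.

    1 ng/ul = 1000 pg / 1000 nl = 1 pg/nl, so the mass in pg is simply
    the concentration in ng/ul multiplied by the volume in nl.
    """
    return conc_ng_per_ul * volume_nl


# Concentrations in ng/ul, as given in Example 2.
mix = {"zCas9 mRNA": 800, "sgRNA": 80, "donor plasmid": 15}

for component, conc in mix.items():
    print(component, injected_mass_pg(conc, volume_nl=1), "pg")
```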

Results: The genotype of the hey2^zCKOIS;Ki (flk1-P2A-Cre) fish line was identified by PCR analysis, as shown in FIG. 7. Confocal projection images of the trunk of Ki (flk1-P2A-Cre);Tg(bactin2:loxP-STOP-loxP-DsRedEx);Tg(flk1:EGFP) embryos at 3.5 dpf are shown in FIG. 9. These results demonstrate Cre-mediated excision between loxP sites in blood vessels in the Ki (flk1-P2A-Cre) fish line.

Discussion

At present, the application of gene manipulation in zebrafish is still at an early stage. The limiting factors are that techniques for in vitro culture and manipulation of zebrafish embryonic stem cells are not available, and that surrogate pregnancy and the generation of chimeric zebrafish cannot be achieved at this stage. Therefore, zebrafish knock-in strains cannot be established by the in vitro homologous recombination targeting method commonly used in mice.

Meanwhile, because the background efficiency of homologous recombination in cells is very low, it is difficult to achieve gene knock-in in vivo by injecting only a targeting vector comprising left and right homology arms into a fertilized egg. However, if double-stranded DNA breaks (DSBs) are generated in the genome, the efficiency of homologous recombination can be increased by several orders of magnitude. The ability of the Cas9 system to cleave DNA efficiently and generate DSBs provides a simpler approach to gene knock-in. The Cas9 system was first applied in mice by the laboratory of Professor Jaenisch in the USA to achieve site-directed insertion of large DNA fragments. Cas9 mRNA, sgRNA and a template plasmid were co-injected into mouse zygotes at the one-cell stage, and large reporter genes were inserted at the 3′ end of the Nanog and Oct4 genes through a homologous recombination mechanism, successfully achieving protein labeling of the targeted genes expressed in embryonic stem cells (Yang, H. et al. Cell 154, 1370-1379).

However, this strategy cannot be applied to zebrafish to knock in two LoxP sites. One important reason is that zebrafish fertilized eggs remain at the one-cell stage only very briefly, about ten minutes (compared with hours in mice); efficient induction of homologous recombination is difficult within such a short time window, so the integration efficiency is low.

Another important reason is that the genomic background of zebrafish is relatively complex and the genome sequences of inbred strains are not sufficiently pure, making it difficult to clone long homology arms whose sequences are completely identical to the genome (homologous recombination requires the sequences of the left and right arms to be completely identical to the genome). These two points limit the establishment of site-specific gene knock-in in zebrafish by homologous recombination.

At present, two methods are available for preparing zebrafish with a tissue-specific gene knockout. The first is to insert LoxP sites at the two ends of an exon of a gene by homologous recombination. However, the efficiency of homologous recombination in early zebrafish development is extremely low, so the success rate of this method is low. Moreover, because the two LoxP sites must be inserted sequentially, the method is time-consuming and requires more than 6 months.

The other approach is to achieve gene knockout in specific tissues and cell types by driving expression of the Cas9 protein and the corresponding sgRNA with tissue-specific promoters. However, because cleavage by Cas9/sgRNA produces unpredictable sequence changes and the target site cannot be guaranteed to be cleaved in all cells, the gene is disrupted in only some of the cells, so the overall knockout phenotype is difficult to observe, leading to false-negative results.

In view of the above problems, the present inventors constructed a DNA sequence element containing two LoxP sites in opposite orientation and placed it in the intron sequence of the left arm of the knock-in plasmid. In the absence of Cre protein expression, the endogenous gene, which remains structurally and functionally intact, and the fluorescent tag are expressed normally. When Cre protein is expressed, the DNA sequence element between the two LoxP sites is inverted by Cre and is joined to the preceding exon of the gene via the splice acceptor, which disrupts the structure of the gene and also tags the mutated gene with a fluorescent label of a different color. Researchers can therefore identify directly by color under a fluorescence microscope whether the gene has been knocked out, avoiding the tedious procedure in which the genotype can be determined only by sacrificing the living sample and performing PCR amplification and sequencing.
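The inversion step described above can be illustrated with a minimal sequence-level sketch. In SEQ ID No. 1 the upstream loxP copy appears as the reverse complement of SEQ ID No. 4 and the downstream copy as SEQ ID No. 4 itself, so the sketch looks for that arrangement and replaces the segment between the two sites with its reverse complement. It is a simplification: only a single pair of wild-type loxP sites is modelled, the additional lox5171 pair present in the construct is ignored, and the function name `cre_invert` and the toy payload are hypothetical.

```python
# Wild-type loxP site (34 bp, SEQ ID No. 4), written in upper case.
LOXP = "ATAACTTCGTATAATGTATGCTATACGAAGTTAT"

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def cre_invert(seq: str, site: str = LOXP) -> str:
    """Invert the segment between two lox sites in opposite orientation.

    Expects an upstream site in reverse-complement orientation and a
    downstream site in forward orientation. The sequence is returned
    unchanged if that arrangement is not found.
    """
    left = seq.find(revcomp(site))
    if left == -1:
        return seq
    inner_start = left + len(site)
    right = seq.find(site, inner_start)
    if right == -1:
        return seq
    # Both sites are kept; only the segment between them is flipped.
    return seq[:inner_start] + revcomp(seq[inner_start:right]) + seq[right:]


# Toy construct with a made-up 7-base payload between the two sites.
construct = "AAAA" + revcomp(LOXP) + "GATTACA" + LOXP + "CCCC"
flipped = cre_invert(construct)

print(flipped[38:45])                    # → TGTAATC
print(cre_invert(flipped) == construct)  # → True
```

Because a second inversion restores the original sequence in this single-pair model, the sketch also shows why a design that relies on one pair of sites alone would remain reversible in the presence of Cre.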

In one embodiment of the present invention, the proportion of screened F0 fish that were founders, i.e., that stably transmitted the insertion to the F1 generation with the insert in the forward orientation, was about 10% (2/21): 21 microinjected F0 zebrafish were screened and two founders were identified. In contrast, when the donor plasmid sequence was inserted into the zebrafish genome by homologous recombination, no founder was identified; even when 50 microinjected F0 zebrafish were screened, no founder capable of stable transmission to the next generation was obtained. This indicates that the methods of the invention are unexpectedly efficient and stable for site-directed editing (including knock-in and/or knock-out) of the zebrafish genome.

Because genomic DNA repair after Cas9/sgRNA cleavage is random, the resulting sequence insertions and deletions are random and generally range from a few base pairs to hundreds of base pairs. Because the technical scheme provided by the invention operates within an intron of the gene, these insertions and deletions do not affect the knocked-in target gene or the tissue-specific knockout. Moreover, given the markedly improved efficiency of the scheme, a line with a smaller deletion can be selected as the final experimental subject by screening and sequencing several different founder zebrafish.

All documents referred to herein are incorporated by reference into this application as if each were individually incorporated by reference. Furthermore, it should be understood that various changes and modifications of the present invention can be made by those skilled in the art after reading the above teachings of the present invention, and these equivalents also fall within the scope of the present invention as defined by the appended claims.

Sequence listing

<110> Shanghai Life science research institute of Chinese academy of sciences

<120> method for knocking out zebra fish gene by tissue specificity and application

<130> P2019-2050

<160> 12

<170> PatentIn version 3.5

<210> 1

<211> 5136

<212> DNA

<213> synthetic sequence (Artificial sequence)

<220>

<223> nucleic acid construct I

<400> 1

atgatcttat tttgactaag cgtggctatg aagcagaaag gaaggataat gagttgggta 60

ggttaggtaa gactttctta gacatgagtc aggtcaaaga caacagataa ttccataaaa 120

catatgtatt ttttgtattc tatagaattg tcattaatat tcatgcaaaa gatttcttaa 180

gtgacatttg caaatcactc cagttgtgtc tttctttaca ttgttttcca gttaaattta 240

ggattgattt ctgttattta tttagaataa taattgtata ttaataatga tattgggcag 300

tatattgtac tgtaccctgc tgctggaagg gtatccatat ggggtggttc gcccagggtg 360

ccatttaaac tagaaccatc actgcttcag atggcgtctt taatagttca cattgtatga 420

ttgttacaca agtgaaagag catcgacagg gtaatttgct agcttaccgg tttacgcgta 480

taacttcgta tagcatacat tatacgaagt tatccaagct tcaccatcga cccgaattgc 540

caagcatcac catcgaccca taacttcgta tagtacacat tatacgaagt tatttcgaac 600

gtaatacgac tcactatagg gcgaattgga gctccaccgg tggcggccgc tctagaacta 660

gtggatcctg gttctttccg cctcagaagc catagagccc accgcatccc cagcatgcct 720

gctattgtct tcccaatcct cccccttgct gtcctgcccc accccacccc ccagaataga 780

atgacaccta ctcagacaat gcgatgcaat ttcctcattt tattaggaaa ggacagtggg 840

agtggcacct tccagggtca aggaaggcac gggggagggg caaacaacag atggctggca 900

actagaaggc acagtcgagg ctgatcagcg gtttctggtt ctttccgcct cagaagccat 960

agagcccacc gcatccccag catgcctgct attgtcttcc caatcctccc ccttgctgtc 1020

ctgccccacc ccacccccca gaatagaatg acacctactc agacaatgcg atgcaatttc 1080

ctcattttat taggaaagga cagtgggagt ggcaccttcc agggtcaagg aaggcacggg 1140

ggaggggcaa acaacagatg gctggcaact agaaggcaca gtcgaggctg atcagcgagc 1200

tccaccgcgg tcaattaagt ttgtgcccca gtttgctagg gaggtcgcag tatctggcca 1260

cagccacctc gtgctgctcg acgtaggtct ctttgtcggc ctccttgatt ctttccagtc 1320

tgtggtccac atagtagacg ccgggcatct tgaggttctt agcgggtttc ttggatctgt 1380

atgtggtctt gaagttgcag atcaggtggc ccccgcccac gagcttcagg gccatgtcgc 1440

ttctgccttc caggccgccg tcagcggggt acagcatctc ggtgttggcc tcccagccga 1500

gtgttttctt ctgcatcaca gggccgttgg atgggaagtt cacccctctg atcttgacgt 1560

tgtagatgag gcagccgtcc tggaggctgg tgtcctgggt agcggtcagc acgcccccgt 1620

cttcgtatgt ggtgactctc tcccatgtga agccctcagg gaaggactgc ttaaagaagt 1680

cggggatgcc ctgggtgtgg ttgatgaagg ttctgctgcc gtacatgaag ctggtagcca 1740

ggatgtcgaa ggcgaagggg agagggccgc cctcgaccac cttgattctc atggtctggg 1800

tgccctcgta gggcttgcct tcgccctcgg atgtgcactt gaagtggtgg ttgttcacgg 1860

tgccctccat gtacagcttc atgtgcatgt tctccttaat cagctcttcg cccttagaca 1920

cgacgtcagg tccagggttc tcctccacgt ctccagcctg cttcagcagg ctgaagttag 1980

tagctcttct cttcttccga ccgcgaagag tttgtcgatc gactgaaaaa aaaaagggaa 2040

gagagagaca cgtcagaaac acacacacac tccggattag tgagatctga ataggaactt 2100

cataacttcg tataatgtat gctatacgaa gttatccaag catcaccatc gaccctctag 2160

tccagaactc accatcgacc cataacttcg tataatgtgt actatacgaa gttatactag 2220

tattatgtac ctgactgatc gatttgcctt tgatttctgg catttgtcgg gaatttctca 2280

aaacctgttg tcgagtcaaa atctgggcta aaatcataca gtctgaactc ggctttaggg 2340

gttaataata ttgaccttaa aatggtttta aaagaattaa aaactgcttt tattctagct 2400

gacataaaac aaataagact ttctccagaa gaaaaaaata ttttaggaat tacagtaaaa 2460

aatgtcttgc tctgttaaac atcatttggg aaatatttga acaaaggtat caaaattcac 2520

aggaggtgtg tgtatttaaa gattcactag tatgctcatt tgaataattc tcaatatttt 2580

ttgtcaggat atttcgacgc tcattctctg gccatggact tcttgagcat cggcttccgg 2640

gagtgtctga ctgaagtggc caggtatttg agctctgtgg aaggcctgga ctccagcgac 2700

cctctccgtg tccgtctggt ttctcacctc agcagctgtg cctcgcagag ggaagcagcc 2760

gccatgacca catccatagc ccatcaccag caggcccttc acccgcacca ctgggctgcc 2820

gctttgcatc ccattcctgc tgcgttcctg cagcagagcg gacttccctc ctcagagagc 2880

tcctccggca ggctgtctga ggctcctcaa agaggtgcag cccttttctc ccatagtgac 2940

tcggcactca gagcgccctc tactggaagt gtggctcctt gcgtgccacc gctgtccact 3000

tctctgcttt cgttatcagc gaccgttcat gcagcagctg ctgcagctgc agctcaaacc 3060

ttccctctat catttcccgc tggattccca ctcttcagcc ccagcgttac agcatcttca 3120

gtggcttctt ccaccgtgag ctcttccgtt tccacatcca ccacatccca acagagcagc 3180

gggagcaaca gtaaaccata ccgaccgtgg ggaactgaag tgggagcgtt ttcgggaggt 3240

ggatccggag ctactaattt ctccttgctt aagcaagctg gtgatgttga agaaaatcct 3300

ggtcctatgg tgagcaaggg cgaggagctg ttcaccgggg tggtgcccat cctggtcgag 3360

ctggacggcg acgtaaacgg ccacaagttc agcgtgtccg gcgagggcga gggcgatgcc 3420

acctacggca agctgaccct gaagttcatc tgcaccaccg gcaagctgcc cgtgccctgg 3480

cccaccctcg tgaccaccct gacctacggc gtgcagtgct tcagccgcta ccccgaccac 3540

atgaagcagc acgacttctt caagtccgcc atgcccgaag gctacgtcca ggagcgcacc 3600

atcttcttca aggacgacgg caactacaag acccgcgccg aggtgaagtt cgagggcgac 3660

accctggtga accgcatcga gctgaagggc atcgacttca aggaggacgg caacatcctg 3720

gggcacaagc tggagtacaa ctacaacagc cacaacgtct atatcatggc cgacaagcag 3780

aagaacggca tcaaggtgaa cttcaagatc cgccacaaca tcgaggacgg cagcgtgcag 3840

ctcgccgacc actaccagca gaacaccccc atcggcgacg gccccgtgct gctgcccgac 3900

aaccactacc tgagcaccca gtccgccctg agcaaagacc ccaacgagaa gcgcgatcac 3960

atggtcctgc tggagttcgt gaccgccgcc gggatcactc tcggcatgga cgagctgtac 4020

aagaccggtt aaatgttgga tttaaatgtt ggacgtcttc catgctttgt acataaagga 4080

aagcagcggc tattgtgcct gcttcggtca gcagcatggg cttttgtctt cctctacact 4140

tgtgcacata tgcagcgtca aacttaagcc aacattctgg gaagaaaaga aagagttttt 4200

acacgtcgca ctgtgttgga aaccgtaaag gaagtttgtt tctgttttaa cagtgcctgc 4260

ataaacactg ctaacatgct gcatttgaga tgtatgcttt gatatcatct gacttccaca 4320

aacacccaac agcagcttta gagtgaacag cttgttctga aacaaaccaa agttttgcag 4380

ataatcacta aagtgaggtg tttgtttttt tatctctgat ttaacaatcc agtttgtaaa 4440

tctgtacatg tgtaagattg taactagagt ttatattgaa attagttcat tggtatgatg 4500

cacttcaatc actactgttt gtttgggggg agacaggatc ttctccgatt tatacaatag 4560

gcctactgaa gttgtttttt taaaataaca ttcactaata ctcatgtgag atttttctac 4620

tactgtaact gtgttaataa ccaccctctg taagatgtaa ccttttccta tgcaaaaaaa 4680

caaatgtccc tcaagaacga actgagtgtg ttttgttttc attctgacac acgctaataa 4740

aaccatcctt ccactagcct tcaccacaac acatcgtgga atgttatgag agaaagtaat 4800

tgttttccca aagcattatt tgagttcttg aaatcgtatg gtagggaaca aatgtttgtg 4860

ctctttaatg tgtttttcta ataatgcaaa atatgcagat gaagtcaaac aaacagctgc 4920

aattgtaacc gccacttcaa cagttataaa tctgtcgaca aactttaaag aaagctacaa 4980

acacatttaa tgaataaaag gtcatcattc ttacatgatc agcagcaaat cggtttactt 5040

tcattgaaaa aagtcaataa tttcttctaa agctaaaata actttttagc tgtgtgtgaa 5100

gagctgtact gtgtgacggt gcctgctaaa acccta 5136

<210> 2

<211> 19

<212> DNA

<213> Zebra fish (Danio rerio)

<400> 2

ggaaggataa tggttgggt 19

<210> 3

<211> 717

<212> DNA

<213> synthetic sequence (Artificial sequence)

<220>

<223> EGFP sequence

<400> 3

atggtgagca agggcgagga gctgttcacc ggggtggtgc ccatcctggt cgagctggac 60

ggcgacgtaa acggccacaa gttcagcgtg tccggcgagg gcgagggcga tgccacctac 120

ggcaagctga ccctgaagtt catctgcacc accggcaagc tgcccgtgcc ctggcccacc 180

ctcgtgacca ccctgaccta cggcgtgcag tgcttcagcc gctaccccga ccacatgaag 240

cagcacgact tcttcaagtc cgccatgccc gaaggctacg tccaggagcg caccatcttc 300

ttcaaggacg acggcaacta caagacccgc gccgaggtga agttcgaggg cgacaccctg 360

gtgaaccgca tcgagctgaa gggcatcgac ttcaaggagg acggcaacat cctggggcac 420

aagctggagt acaactacaa cagccacaac gtctatatca tggccgacaa gcagaagaac 480

ggcatcaagg tgaacttcaa gatccgccac aacatcgagg acggcagcgt gcagctcgcc 540

gaccactacc agcagaacac ccccatcggc gacggccccg tgctgctgcc cgacaaccac 600

tacctgagca cccagtccgc cctgagcaaa gaccccaacg agaagcgcga tcacatggtc 660

ctgctggagt tcgtgaccgc cgccgggatc actctcggca tggacgagct gtacaag 717

<210> 4

<211> 34

<212> DNA

<213> synthetic sequence (Artificial sequence)

<220>

<223> wild-type loxP site

<400> 4

ataacttcgt ataatgtatg ctatacgaag ttat 34

<210> 5

<211> 34

<212> DNA

<213> synthetic sequence (Artificial sequence)

<220>

<223> mutant lox5171 sequence

<400> 5

ataacttcgt ataatgtgta ctatacgaag ttat 34

<210> 6

<211> 57

<212> DNA

<213> synthetic sequence (Artificial sequence)

<220>

<223> P2A sequence

<400> 6

aggtccaggg ttctcctcca cgtctccagc ctgcttcagc aggctgaagt tagtagc 57

<210> 7

<211> 225

<212> DNA

<213> synthetic sequence (Artificial sequence)

<220>

<223> BGHpA signal sequence

<400> 7

ccatagagcc caccgcatcc ccagcatgcc tgctattgtc ttcccaatcc tcccccttgc 60

tgtcctgccc caccccaccc cccagaatag aatgacacct actcagacaa tgcgatgcaa 120

tttcctcatt ttattaggaa aggacagtgg gagtggcacc ttccagggtc aaggaaggca 180

cgggggaggg gcaaacaaca gatggctggc aactagaagg cacag 225

<210> 8

<211> 236

<212> PRT

<213> synthetic sequence (Artificial sequence)

<220>

<223> TagRFP sequence

<400> 8

Val Ser Lys Gly Glu Glu Leu Ile Lys Glu Asn Met His Met Lys Leu

1 5 10 15

Tyr Met Glu Gly Thr Val Asn Asn His His Phe Lys Cys Thr Ser Glu

20 25 30

Gly Glu Gly Lys Pro Tyr Glu Gly Thr Gln Thr Met Arg Ile Lys Val

35 40 45

Val Glu Gly Gly Pro Leu Pro Phe Ala Phe Asp Ile Leu Ala Thr Ser

50 55 60

Phe Met Tyr Gly Ser Arg Thr Phe Ile Asn His Thr Gln Gly Ile Pro

65 70 75 80

Asp Phe Phe Lys Gln Ser Phe Pro Glu Gly Phe Thr Trp Glu Arg Val

85 90 95

Thr Thr Tyr Glu Asp Gly Gly Val Leu Thr Ala Thr Gln Asp Thr Ser

100 105 110

Leu Gln Asp Gly Cys Leu Ile Tyr Asn Val Lys Ile Arg Gly Val Asn

115 120 125

Phe Pro Ser Asn Gly Pro Val Met Gln Lys Lys Thr Leu Gly Trp Glu

130 135 140

Ala Asn Thr Glu Met Leu Tyr Pro Ala Asp Gly Gly Leu Glu Gly Arg

145 150 155 160

Ser Asp Met Ala Leu Lys Leu Val Gly Gly Gly His Leu Ile Cys Asn

165 170 175

Phe Lys Thr Thr Tyr Arg Ser Lys Lys Pro Ala Lys Asn Leu Lys Met

180 185 190

Pro Gly Val Tyr Tyr Val Asp His Arg Leu Glu Arg Ile Lys Glu Ala

195 200 205

Asp Lys Glu Thr Tyr Val Glu Gln His Glu Val Ala Val Ala Arg Tyr

210 215 220

Cys Asp Leu Pro Ser Lys Leu Gly His Lys Leu Asn

225 230 235

<210> 9

<211> 7801

<212> DNA

<213> synthetic sequence (Artificial sequence)

<220>

<223> hey2 # 2zCKOIS donor plasmid

<400> 9

tgacacatgc agctcccgga gacggtcaca gcttgtctgt aagcggatgc cgggagcaga 60

caagcccgtc agggcgcgtc agcgggtgtt ggcgggtgtc ggggctggct taactatgcg 120

gcatcagagc agattgtact gagagtgcac catatgcggt gtgaaatacc gcacagatgc 180

gtaaggagaa aataccgcat caggcgccat tcgccattca ggctgcgcaa ctgttgggaa 240

gggcgatcgg tgcgggcctc ttcgctatta cgccagctgg cgaaaggggg atgtgctgca 300

aggcgattaa gttgggtaac gccagggttt tcccagtcac gacgttgtaa aacgacggcc 360

agtgaattcg agctcggtac ccactcgtcg acaaaactag ggtttatatg atgatgatct 420

tattttgact aagcgtggct atgaagcaga aaggaaggat aatgagttgg gtaggttagg 480

taagactttc ttagacatga gtcaggtcaa agacaacaga taattccata aaacatatgt 540

attttttgta ttctatagaa ttgtcattaa tattcatgca aaagatttct taagtgacat 600

ttgcaaatca ctccagttgt gtctttcttt acattgtttt ccagttaaat ttaggattga 660

tttctgttat ttatttagaa taataattgt atattaataa tgatattggg cagtatattg 720

tactgtaccc tgctgctgga agggtatcca tatggggtgg ttcgcccagg gtgccattta 780

aactagaacc atcactgctt cagatggcgt ctttaatagt tcacattgta tgattgttac 840

acaagtgaaa gagcatcgac agggtaattt gctagcttac cggtttacgc gtataacttc 900

gtatagcata cattatacga agttatccaa gcttcaccat cgacccgaat tgccaagcat 960

caccatcgac ccataacttc gtatagtaca cattatacga agttatttcg aacgtaatac 1020

gactcactat agggcgaatt ggagctccac cggtggcggc cgctctagaa ctagtggatc 1080

ctggttcttt ccgcctcaga agccatagag cccaccgcat ccccagcatg cctgctattg 1140

tcttcccaat cctccccctt gctgtcctgc cccaccccac cccccagaat agaatgacac 1200

ctactcagac aatgcgatgc aatttcctca ttttattagg aaaggacagt gggagtggca 1260

ccttccaggg tcaaggaagg cacgggggag gggcaaacaa cagatggctg gcaactagaa 1320

ggcacagtcg aggctgatca gcggtttctg gttctttccg cctcagaagc catagagccc 1380

accgcatccc cagcatgcct gctattgtct tcccaatcct cccccttgct gtcctgcccc 1440

accccacccc ccagaataga atgacaccta ctcagacaat gcgatgcaat ttcctcattt 1500

tattaggaaa ggacagtggg agtggcacct tccagggtca aggaaggcac gggggagggg 1560

caaacaacag atggctggca actagaaggc acagtcgagg ctgatcagcg agctccaccg 1620

cggtcaatta agtttgtgcc ccagtttgct agggaggtcg cagtatctgg ccacagccac 1680

ctcgtgctgc tcgacgtagg tctctttgtc ggcctccttg attctttcca gtctgtggtc 1740

cacatagtag acgccgggca tcttgaggtt cttagcgggt ttcttggatc tgtatgtggt 1800

cttgaagttg cagatcaggt ggcccccgcc cacgagcttc agggccatgt cgcttctgcc 1860

ttccaggccg ccgtcagcgg ggtacagcat ctcggtgttg gcctcccagc cgagtgtttt 1920

cttctgcatc acagggccgt tggatgggaa gttcacccct ctgatcttga cgttgtagat 1980

gaggcagccg tcctggaggc tggtgtcctg ggtagcggtc agcacgcccc cgtcttcgta 2040

tgtggtgact ctctcccatg tgaagccctc agggaaggac tgcttaaaga agtcggggat 2100

gccctgggtg tggttgatga aggttctgct gccgtacatg aagctggtag ccaggatgtc 2160

gaaggcgaag gggagagggc cgccctcgac caccttgatt ctcatggtct gggtgccctc 2220

gtagggcttg ccttcgccct cggatgtgca cttgaagtgg tggttgttca cggtgccctc 2280

catgtacagc ttcatgtgca tgttctcctt aatcagctct tcgcccttag acacgacgtc 2340

aggtccaggg ttctcctcca cgtctccagc ctgcttcagc aggctgaagt tagtagctct 2400

tctcttcttc cgaccgcgaa gagtttgtcg atcgactgaa aaaaaaaagg gaagagagag 2460

acacgtcaga aacacacaca cactccggat tagtgagatc tgaataggaa cttcataact 2520

tcgtataatg tatgctatac gaagttatcc aagcatcacc atcgaccctc tagtccagaa 2580

ctcaccatcg acccataact tcgtataatg tgtactatac gaagttatac tagtattatg 2640

tacctgactg atcgatttgc ctttgatttc tggcatttgt cgggaatttc tcaaaacctg 2700

ttgtcgagtc aaaatctggg ctaaaatcat acagtctgaa ctcggcttta ggggttaata 2760

atattgacct taaaatggtt ttaaaagaat taaaaactgc ttttattcta gctgacataa 2820

aacaaataag actttctcca gaagaaaaaa atattttagg aattacagta aaaaatgtct 2880

tgctctgtta aacatcattt gggaaatatt tgaacaaagg tatcaaaatt cacaggaggt 2940

gtgtgtattt aaagattcac tagtatgctc atttgaataa ttctcaatat tttttgtcag 3000

gatatttcga cgctcattct ctggccatgg acttcttgag catcggcttc cgggagtgtc 3060

tgactgaagt ggccaggtat ttgagctctg tggaaggcct ggactccagc gaccctctcc 3120

gtgtccgtct ggtttctcac ctcagcagct gtgcctcgca gagggaagca gccgccatga 3180

ccacatccat agcccatcac cagcaggccc ttcacccgca ccactgggct gccgctttgc 3240

atcccattcc tgctgcgttc ctgcagcaga gcggacttcc ctcctcagag agctcctccg 3300

gcaggctgtc tgaggctcct caaagaggtg cagccctttt ctcccatagt gactcggcac 3360

tcagagcgcc ctctactgga agtgtggctc cttgcgtgcc accgctgtcc acttctctgc 3420

tttcgttatc agcgaccgtt catgcagcag ctgctgcagc tgcagctcaa accttccctc 3480

tatcatttcc cgctggattc ccactcttca gccccagcgt tacagcatct tcagtggctt 3540

cttccaccgt gagctcttcc gtttccacat ccaccacatc ccaacagagc agcgggagca 3600

acagtaaacc ataccgaccg tggggaactg aagtgggagc gttttcggga ggtggatccg 3660

gagctactaa tttctccttg cttaagcaag ctggtgatgt tgaagaaaat cctggtccta 3720

tggtgagcaa gggcgaggag ctgttcaccg gggtggtgcc catcctggtc gagctggacg 3780

gcgacgtaaa cggccacaag ttcagcgtgt ccggcgaggg cgagggcgat gccacctacg 3840

gcaagctgac cctgaagttc atctgcacca ccggcaagct gcccgtgccc tggcccaccc 3900

tcgtgaccac cctgacctac ggcgtgcagt gcttcagccg ctaccccgac cacatgaagc 3960

agcacgactt cttcaagtcc gccatgcccg aaggctacgt ccaggagcgc accatcttct 4020

tcaaggacga cggcaactac aagacccgcg ccgaggtgaa gttcgagggc gacaccctgg 4080

tgaaccgcat cgagctgaag ggcatcgact tcaaggagga cggcaacatc ctggggcaca 4140

agctggagta caactacaac agccacaacg tctatatcat ggccgacaag cagaagaacg 4200

gcatcaaggt gaacttcaag atccgccaca acatcgagga cggcagcgtg cagctcgccg 4260

accactacca gcagaacacc cccatcggcg acggccccgt gctgctgccc gacaaccact 4320

acctgagcac ccagtccgcc ctgagcaaag accccaacga gaagcgcgat cacatggtcc 4380

tgctggagtt cgtgaccgcc gccgggatca ctctcggcat ggacgagctg tacaagaccg 4440

gttaaatgtt ggatttaaat gttggacgtc ttccatgctt tgtacataaa ggaaagcagc 4500

ggctattgtg cctgcttcgg tcagcagcat gggcttttgt cttcctctac acttgtgcac 4560

atatgcagcg tcaaacttaa gccaacattc tgggaagaaa agaaagagtt tttacacgtc 4620

gcactgtgtt ggaaaccgta aaggaagttt gtttctgttt taacagtgcc tgcataaaca 4680

ctgctaacat gctgcatttg agatgtatgc tttgatatca tctgacttcc acaaacaccc 4740

aacagcagct ttagagtgaa cagcttgttc tgaaacaaac caaagttttg cagataatca 4800

ctaaagtgag gtgtttgttt ttttatctct gatttaacaa tccagtttgt aaatctgtac 4860

atgtgtaaga ttgtaactag agtttatatt gaaattagtt cattggtatg atgcacttca 4920

atcactactg tttgtttggg gggagacagg atcttctccg atttatacaa taggcctact 4980

gaagttgttt ttttaaaata acattcacta atactcatgt gagatttttc tactactgta 5040

actgtgttaa taaccaccct ctgtaagatg taaccttttc ctatgcaaaa aaacaaatgt 5100

ccctcaagaa cgaactgagt gtgttttgtt ttcattctga cacacgctaa taaaaccatc 5160

cttccactag ccttcaccac aacacatcgt ggaatgttat gagagaaagt aattgttttc 5220

ccaaagcatt atttgagttc ttgaaatcgt atggtaggga acaaatgttt gtgctcttta 5280

atgtgttttt ctaataatgc aaaatatgca gatgaagtca aacaaacagc tgcaattgta 5340

accgccactt caacagttat aaatctgtcg acaaacttta aagaaagcta caaacacatt 5400

taatgaataa aaggtcatca ttcttacatg atcagcagca aatcggttta ctttcattga 5460

aaaaagtcaa taatttcttc taaagctaaa ataacttttt agctgtgtgt gaagagctgt 5520

actgtgtgac ggtgcctgct aaaaccctac tgcaggcatg caagcttggc gtaatcatgg 5580

tcatagctgt ttcctgtgtg aaattgttat ccgctcacaa ttccacacaa catacgagcc 5640

ggaagcataa agtgtaaagc ctggggtgcc taatgagtga gctaactcac attaattgcg 5700

ttgcgctcac tgcccgcttt ccagtcggga aacctgtcgt gccagctgca ttaatgaatc 5760

ggccaacgcg cggggagagg cggtttgcgt attgggcgct cttccgcttc ctcgctcact 5820

gactcgctgc gctcggtcgt tcggctgcgg cgagcggtat cagctcactc aaaggcggta 5880

atacggttat ccacagaatc aggggataac gcaggaaaga acatgtgagc aaaaggccag 5940

caaaaggcca ggaaccgtaa aaaggccgcg ttgctggcgt ttttccatag gctccgcccc 6000

cctgacgagc atcacaaaaa tcgacgctca agtcagaggt ggcgaaaccc gacaggacta 6060

taaagatacc aggcgtttcc ccctggaagc tccctcgtgc gctctcctgt tccgaccctg 6120

ccgcttaccg gatacctgtc cgcctttctc ccttcgggaa gcgtggcgct ttctcatagc 6180

tcacgctgta ggtatctcag ttcggtgtag gtcgttcgct ccaagctggg ctgtgtgcac 6240

gaaccccccg ttcagcccga ccgctgcgcc ttatccggta actatcgtct tgagtccaac 6300

ccggtaagac acgacttatc gccactggca gcagccactg gtaacaggat tagcagagcg 6360

aggtatgtag gcggtgctac agagttcttg aagtggtggc ctaactacgg ctacactaga 6420

agaacagtat ttggtatctg cgctctgctg aagccagtta ccttcggaaa aagagttggt 6480

agctcttgat ccggcaaaca aaccaccgct ggtagcggtg gtttttttgt ttgcaagcag 6540

cagattacgc gcagaaaaaa aggatctcaa gaagatcctt tgatcttttc tacggggtct 6600

gacgctcagt ggaacgaaaa ctcacgttaa gggattttgg tcatgagatt atcaaaaagg 6660

atcttcacct agatcctttt aaattaaaaa tgaagtttta aatcaatcta aagtatatat 6720

gagtaaactt ggtctgacag ttaccaatgc ttaatcagtg aggcacctat ctcagcgatc 6780

tgtctatttc gttcatccat agttgcctga ctccccgtcg tgtagataac tacgatacgg 6840

gagggcttac catctggccc cagtgctgca atgataccgc gagacccacg ctcaccggct 6900

ccagatttat cagcaataaa ccagccagcc ggaagggccg agcgcagaag tggtcctgca 6960

actttatccg cctccatcca gtctattaat tgttgccggg aagctagagt aagtagttcg 7020

ccagttaata gtttgcgcaa cgttgttgcc attgctacag gcatcgtggt gtcacgctcg 7080

tcgtttggta tggcttcatt cagctccggt tcccaacgat caaggcgagt tacatgatcc 7140

cccatgttgt gcaaaaaagc ggttagctcc ttcggtcctc cgatcgttgt cagaagtaag 7200

ttggccgcag tgttatcact catggttatg gcagcactgc ataattctct tactgtcatg 7260

ccatccgtaa gatgcttttc tgtgactggt gagtactcaa ccaagtcatt ctgagaatag 7320

tgtatgcggc gaccgagttg ctcttgcccg gcgtcaatac gggataatac cgcgccacat 7380

agcagaactt taaaagtgct catcattgga aaacgttctt cggggcgaaa actctcaagg 7440

atcttaccgc tgttgagatc cagttcgatg taacccactc gtgcacccaa ctgatcttca 7500

gcatctttta ctttcaccag cgtttctggg tgagcaaaaa caggaaggca aaatgccgca 7560

aaaaagggaa taagggcgac acggaaatgt tgaatactca tactcttcct ttttcaatat 7620

tattgaagca tttatcaggg ttattgtctc atgagcggat acatatttga atgtatttag 7680

aaaaataaac aaataggggt tccgcgcaca tttccccgaa aagtgccacc tgacgtctaa 7740

gaaaccatta ttatcatgac attaacctat aaaaataggc gtatcacgag gccctttcgt 7800

c 7801

<210> 10

<211> 4237

<212> DNA

<213> synthetic sequence (Artificial sequence)

<220>

<223> zCas9 mRNA

<400> 10

ggatcacgac atcgactaca aagacgacga tgataagatg gcccctaaga aaaagagaaa 60

ggtcggaatt cacggagttc ccgctgcaga taaaaagtac agcattggac tggacatcgg 120

aacaaatagc gtgggctggg ctgtgattac tgacgaatat aaggtgccta gcaaaaagtt 180

taaagtgctg ggaaacaccg acagacacag catcaaaaaa aacctgatcg gcgctctgct 240

gtttgatagc ggtgaaactg ccgaggctac tagactgaag agaactgcta gaagaagata 300

taccagaaga aagaatagaa tttgttacct gcaagaaatc tttagcaatg agatggcaaa 360

ggttgacgat agcttctttc atagactgga ggagagcttc ctggtcgagg aggacaagaa 420

gcacgagaga caccccatct tcggaaatat cgtggacgag gtggcatacc atgaaaagta 480

tcctaccatt taccacctga gaaaaaagct ggtggacagc acagacaagg ccgatctgag 540

actgatctac ctggcactgg cccacatgat caaatttaga ggccatttcc tgattgaagg 600

agacctgaac cccgataaca gcgatgttga taaactgttc atccaactgg ttcagaccta 660

taaccaactg tttgaggaga accctattaa cgccagcgga gtggatgcaa aggccatcct 720

gagcgctaga ctgagcaaaa gcagaagact ggaaaatctg atcgcccagc tgcccggcga 780

aaaaaagaat ggactgttcg gcaatctgat tgcactgagc ctgggactga cacctaactt 840

caagagcaat ttcgatctgg ctgaggacgc caaactgcag ctgagcaaag acacatatga 900

tgacgacctg gataacctgc tggcacaaat tggtgaccaa tacgctgacc tgttcctggc 960

tgctaagaat ctgagcgatg ccattctgct gagcgacatc ctgagagtga acacagagat 1020

taccaaggca cccctgagcg caagcatgat taagagatac gacgagcacc accaagatct 1080

gaccctgctg aaggccctgg tcagacaaca actgccagag aagtataaag aaattttctt 1140

tgaccaaagc aagaacggtt acgctggcta cattgacggc ggtgcaagcc aagaggagtt 1200

ctataagttc attaagccaa tcctggagaa aatggatgga actgaggagc tgctggttaa 1260

gctgaataga gaggatctgc tgagaaaaca aagaacattc gacaacggta gcatcccaca 1320

ccagattcat ctgggtgagc tgcacgcaat tctgagaaga caggaagact tttatccatt 1380

cctgaaggac aacagagaaa agatcgagaa gattctgaca tttagaatcc cctactacgt 1440

gggacctctg gctagaggca atagcagatt cgcatggatg actagaaaga gcgaggagac 1500

aattacccct tggaactttg aagaagtggt ggataaggga gcaagcgccc aaagcttcat 1560

tgagagaatg acaaacttcg ataagaacct gcctaacgag aaggttctgc ccaagcatag 1620

cctgctgtat gaatatttca cagtgtacaa cgagctgaca aaggtcaagt acgtcacaga 1680

gggcatgaga aagcccgcct ttctgagcgg agaacaaaag aaggctattg ttgacctgct 1740

gttcaagacc aacagaaaag ttacagttaa acagctgaaa gaggactact tcaaaaagat 1800

tgaatgtttt gacagcgtgg aaatcagcgg cgttgaggac agatttaacg ctagcctggg 1860

cacctaccac gatctgctga aaatcatcaa agataaggac tttctggaca acgaagaaaa 1920

cgaggacatt ctggaagaca ttgtgctgac actgactctg ttcgaagata gagaaatgat 1980

cgaggaaaga ctgaaaactt atgcacatct gttcgacgac aaagtgatga agcaactgaa 2040

gagaagaaga tacactggat ggggcagact gagcagaaag ctgatcaacg gaatcagaga 2100

caagcaaagc ggaaaaacta ttctggattt tctgaaaagc gacggtttcg ccaatagaaa 2160

cttcatgcaa ctgattcacg atgacagcct gactttcaag gaggatattc aaaaggcaca 2220

ggtgagcggc cagggcgata gcctgcacga acacatcgca aatctggccg gtagccctgc 2280

cattaagaag ggcatcctgc agacagtgaa ggttgttgat gaactggtca aggtgatggg 2340

tagacacaag cccgagaata ttgtgatcga gatggctaga gagaaccaaa caacacaaaa 2400

gggacagaag aatagcagag aaagaatgaa aagaattgag gagggaatca aggagctggg 2460

tagccagatc ctgaaagaac accctgtcga gaatacacaa ctgcaaaacg aaaagctgta 2520

cctgtactac ctgcaaaatg gcagagacat gtacgtggac caagagctgg atattaacag 2580

actgagcgac tacgatgtcg accacatcgt gcctcaaagc ttcctgaagg atgacagcat 2640

cgacaataaa gtgctgacta gaagcgacaa gaacagagga aaaagcgaca acgtgcccag 2700

cgaggaagtg gttaaaaaga tgaagaacta ctggagacag ctgctgaatg ccaagctgat 2760

cacacaaaga aaattcgaca acctgaccaa agccgagaga ggaggtctga gcgaactgga 2820

caaggctgga ttcattaaga gacaactggt tgaaaccaga cagattacaa agcacgtggc 2880

tcaaatcctg gacagcagaa tgaataccaa atatgacgag aacgacaaac tgattagaga 2940

ggtgaaggtt attactctga agagcaaact ggtcagcgac ttcagaaagg acttccaatt 3000

ctacaaggtg agagagatca acaattacca ccacgcacac gacgcttacc tgaacgctgt 3060

ggtgggcaca gctctgatca aaaagtatcc aaaactggaa agcgagtttg tgtacggtga 3120

ctataaagtt tatgatgtga gaaaaatgat cgctaagagc gagcaggaga tcggaaaggc 3180

tacagccaag tatttctttt acagcaacat tatgaacttt ttcaagactg aaatcaccct 3240

ggcaaacggt gagatcagaa aaagaccact gatcgaaaca aatggcgaga caggcgagat 3300

cgtgtgggat aagggaagag acttcgctac cgttagaaag gttctgagca tgccacaggt 3360

taacattgtg aagaaaactg aggtgcagac aggaggtttc agcaaggaga gcatcctgcc 3420

taagagaaac agcgataagc tgattgcaag aaaaaaggat tgggacccta agaagtacgg 3480

cggttttgac agccctactg tggcttacag cgtgctggtg gtggctaaag tggagaaggg 3540

caaaagcaag aagctgaaaa gcgtgaagga actgctggga attacaatca tggagagaag 3600

cagcttcgag aagaacccaa tcgacttcct ggaggctaag ggatacaagg aagttaagaa 3660

ggacctgatc atcaagctgc ccaagtacag cctgttcgag ctggaaaatg gtagaaagag 3720

aatgctggct agcgctggtg agctgcagaa gggaaatgaa ctggcactgc ctagcaagta 3780

cgttaacttt ctgtatctgg caagccatta cgagaaactg aaaggaagcc ccgaggacaa 3840

tgagcagaaa caactgttcg tggaacagca caaacactat ctggacgaga ttatcgagca 3900

gatcagcgaa tttagcaaaa gagtgatcct ggctgatgct aacctggata aagtcctgag 3960

cgcttacaac aaacatagag ataagcctat cagagagcag gccgaaaaca tcatccacct 4020

gttcacactg acaaacctgg gcgctcctgc cgctttcaag tactttgata ccactattga 4080

tagaaagaga tatactagca ccaaagaggt gctggacgcc accctgattc accagagcat 4140

taccggactg tacgaaacta gaatcgacct gagccaactg ggaggagaca agagacccgc 4200

tgcaactaaa aaggcaggtc aggccaaaaa gaagaaa 4237

<210> 11

<211> 108

<212> DNA

<213> synthetic sequence (Artificial sequence)

<220>

<223> hey2 sgRNA

<400> 11

ggaaggataa tggttgggtg ttttagagct agaaatagca agttaaaata aggctagtcc 60

gttatcaact tgaaaaagtg gcaccgagtc ggtgcttttt ttaaagct 108
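The entries in this listing follow a fixed layout: ten-base groups, up to six groups per line, and a trailing running base count. The declared `<211>` length can therefore be checked mechanically. The following is a minimal Python sketch and is not part of the original filing; the helper name `parse_listing_block` is hypothetical. It joins the groups of the SEQ ID NO: 11 entry above and compares the result with the declared length of 108.

```python
import re


def parse_listing_block(lines):
    """Join the base groups of one listing entry, dropping the running count.

    Hypothetical helper for illustration: each non-empty line is assumed to
    hold whitespace-separated base groups followed by a cumulative base count.
    """
    groups = []
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # blank separator line
        if tokens[-1].isdigit():
            tokens = tokens[:-1]  # last token is the running count, not sequence
        groups.extend(tokens)
    seq = "".join(groups)
    if not re.fullmatch(r"[acgtn]*", seq):
        raise ValueError("unexpected character in sequence")
    return seq


# Lines copied from the SEQ ID NO: 11 (hey2 sgRNA) entry.
SEQ11_LINES = [
    "ggaaggataa tggttgggtg ttttagagct agaaatagca agttaaaata aggctagtcc 60",
    "",
    "gttatcaact tgaaaaagtg gcaccgagtc ggtgcttttt ttaaagct 108",
]

seq11 = parse_listing_block(SEQ11_LINES)
declared_length = 108  # value of <211> for this entry
length_matches = len(seq11) == declared_length
print(length_matches)
```

The same check can be applied to any other entry by passing its lines and its `<211>` value.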

<210> 12

<211> 997

<212> DNA

<213> synthetic sequence (Artificial sequence)

<220>

<223> Cre recombinase

<400> 12

gcctgcatta ccggtcgatg caacgagtga tgaggttcgc aagaacctga tggacatgtt 60

cagggatcgc caggcgtttt ctgagcatac ctggaaaatg cttctgtccg tttgccggtc 120

gtgggcggca tggtgcaagt tgaataaccg gaaatggttt cccgcagaac ctgaagatgt 180

tcgcgattat cttctatatc ttcaggcgcg cggtctggca gtaaaaacta tccagcaaca 240

tttgggccag ctaaacatgc ttcatcgtcg gtccgggctg ccacgaccaa gtgacagcaa 300

tgctgtttca ctggttatgc ggcggatccg aaaagaaaac gttgatgccg gtgaacgtgc 360

aaaacaggct ctagcgttcg aacgcactga tttcgaccag gttcgttcac tcatggaaaa 420

tagcgatcgc tgccaggata tacgtaatct ggcatttctg gggattgctt ataacaccct 480

gttacgtata gccgaaattg ccaggatcag ggttaaagat atctcacgta ctgacggtgg 540

gagaatgtta atccatattg gcagaacgaa aacgctggtt agcaccgcag gtgtagagaa 600

ggcacttagc ctgggggtaa ctaaactggt cgagcgatgg atttccgtct ctggtgtagc 660

tgatgatccg aataactacc tgttttgccg ggtcagaaaa aatggtgttg ccgcgccatc 720

tgccaccagc cagctatcaa ctcgcgccct ggaagggatt tttgaagcaa ctcatcgatt 780

gatttacggc gctaaggatg actctggtca gagatacctg gcctggtctg gacacagtgc 840

ccgtgtcgga gccgcgcgag atatggcccg cgctggagtt tcaataccgg agatcatgca 900

agctggtggc tggaccaatg taaatattgt catgaactat atccgtaacc tggatagtga 960

aacaggggca atggtgcgcc tgctggaaga tggcgat 997

Claims

1. A nucleic acid construct I having, from 5' to 3', a structure of formula I:

LA-X-RA (I)

wherein,

LA, X, RA are each an element used to construct the construct;

each "-" is independently a bond or a nucleotide linking sequence;

LA is the modified left homology arm sequence;

X is a first exogenous gene expression cassette;

RA is the right homology arm sequence;

the LA and RA sequences allow site-directed non-homologous recombination of the construct with a target segment of a zebrafish chromosome, wherein the target segment comprises an intron, an exon, a terminator and a 3'UTR segment at the 3' end of a zebrafish target gene, and a single guide RNA (sgRNA) target sequence is contained in the intron sequence of the target segment; and is

2. The nucleic acid construct I of claim 1, wherein the nucleic acid sequence of the nucleic acid construct I is shown as SEQ ID No. 1, wherein the left homology arm sequence LA corresponds to positions 1-3309 of SEQ ID No. 1, the first exogenous gene expression cassette X corresponds to positions 3310-4098 of SEQ ID No. 1, and the right homology arm sequence RA corresponds to positions 4099-5205 of SEQ ID No. 1.
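The position ranges recited in claim 2 are 1-based and inclusive, so they can be turned into segment lengths and string slices directly. The following is a minimal Python sketch offered for illustration only; the `Segment` type and the helper `check_contiguous` are hypothetical names, and the sequence used for slicing is a placeholder rather than SEQ ID No. 1 itself.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    """One element of formula I, with 1-based inclusive coordinates."""
    name: str
    start: int
    end: int

    @property
    def length(self) -> int:
        return self.end - self.start + 1

    def extract(self, seq: str) -> str:
        # Convert 1-based inclusive coordinates to a 0-based Python slice.
        return seq[self.start - 1:self.end]


# Coordinates as recited in claim 2.
SEGMENTS = [
    Segment("LA", 1, 3309),
    Segment("X", 3310, 4098),
    Segment("RA", 4099, 5205),
]


def check_contiguous(segments) -> bool:
    """Return True if each segment starts right after the previous one ends."""
    return all(nxt.start == prev.end + 1
               for prev, nxt in zip(segments, segments[1:]))


lengths = {s.name: s.length for s in SEGMENTS}
total = sum(lengths.values())

# Placeholder sequence of the recited total length (not the real SEQ ID No. 1).
placeholder = "n" * 5205
cassette = SEGMENTS[1].extract(placeholder)

print(lengths, total, check_contiguous(SEGMENTS))
```

Under these coordinates the three elements are contiguous and together span all 5205 positions, with no linker bases between them.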

3. The nucleic acid construct I of claim 1, wherein said operably linked nucleic acid construct II has, from 5' to 3', a structure of formula II:

L5-L5'-Y-L3-L3' (II)

wherein,

each "-" is independently a bond or a nucleotide linking sequence;

L5 is a 5' first site-specific recombination sequence;

L5' is a 5' second site-specific recombination sequence;

Y is an inverted second exogenous gene expression cassette;

L3 is a 3' first site-specific recombination sequence;

L3' is a 3' second site-specific recombination sequence;

4. A vector comprising the construct of claim 1.

5. A reagent, wherein the reagent comprises: (a) Cas9 mRNA, (b) a target gene sgRNA, and (c) the nucleic acid construct of claim 1 and/or the vector of claim 4, wherein the single guide RNA target sequence contained in the nucleic acid construct and/or the vector corresponds to the target gene sgRNA of (b).

6. A host cell comprising the construct of claim 1, or having integrated into its genome one or more constructs of claim 1.

7. A method for producing a transgenic cell in vitro comprising the steps of:

(i) transfecting a cell with the construct of claim 1 and/or the vector of claim 4 such that site-directed non-homologous recombination of the construct with a chromosome in the cell produces a transgenic cell.

8. A method for producing a transgenic cell in vitro comprising the steps of:

(i) transfecting a cell with the construct of claim 1 and/or the vector of claim 4 in the presence of Cre recombinase to cause site-directed non-homologous recombination of the construct with a chromosome in the cell, thereby producing a transgenic cell.

9. A method for producing a transgenic animal, comprising the steps of:

(i) transfecting a cell with the construct of claim 1 and/or the vector of claim 4 such that site-directed recombination of said construct with a chromosome in said cell produces a transgenic cell, and wherein the site of site-directed cleavage is located in a zebrafish chromosomal target segment; and

10. A method of producing a tissue-specific transgenic animal comprising the steps of:

(a) preparing, according to the method of claim 9, a transgenic animal F1 having the construct of claim 1 stably inserted into its genome;